Saturday 11 December 2010

The Iterative Reference Model: A new approach to OO

Object Oriented Models, even when very loosely coupled and very highly cohesive, can become quite complex over time. Also, when developing applications from scratch, it can be difficult to encapsulate all the requirements of the customer into one coherent model. In this article, I will share an interesting proposal for the architecture of OO-models that makes developing large systems highly iterative: you can completely develop one piece of the application, without having to take future extensions into account. This is ideal when you're using an Agile development method, such as Scrum, where you have to show the customer a working application at the end of each iteration.

This approach is not so much something I invented after a long train of thought, though. Actually, it is something that I discovered while developing applications and that proved itself very useful even before I thought of it as a generally applicable pattern. Now, I try to use it as much as I can in every new model I design and it really makes developing applications so much faster and cleaner. I want to warn all the OO-purists out there: you may think that this approach is extremely wrong and that I am out of my mind, but bear with me! I will discuss the consequences of the approach later on.

If you want to improve, be content to be thought foolish and stupid

- Epictetus -

Before we look at the Iterative Reference Model, let's first start with the traditional approach of OO-models.

The Traditional Approach
Traditionally, OO-designs are based around "has-a" and "is-a" relationships. They try to model the different entities (classes) as you would do in real life. For example, take the class "Car" and the class "Wheel". Normally, a Car would have four Wheel-objects inside it (and maybe the Wheel-objects have a reference to the Car too). This is typically an example of "has-a", a Car has-a Wheel (or 4). Let's apply this to a more complex model, that of an (extremely) simple CRM application:


  • Company: A company that you do business with. A Company has Contacts and has ContactMoments.
  • Contact: A representative of a Company, i.e. a person, an employee of that Company, who is your contact inside that Company. There could be several Contacts per Company. A Contact has ContactMoments too.
  • ContactMoment: A moment of contact, i.e. a phone call or an email, that you have with a Company. Could be specific to one or more Contacts of that Company.
The arrows represent the references that a class has to another class. We forget about the entire UML-spec, as it is not necessary for this article and we keep it uni-directional, to keep it simple. The multiplicities however, are specified. The picture shows the traditional "has-a" relationships: A Company "has" Contacts and "has" ContactMoments. A Contact also "has" ContactMoments.
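
As a minimal sketch of what these "has-a" references could look like in Java: the class names come from the model above, while the fields (name, description) and the wrapper class TraditionalModel are assumptions made purely for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the traditional "has-a" model. Field names are illustrative assumptions.
class TraditionalModel {

    static class ContactMoment {
        final String description;
        ContactMoment(String description) { this.description = description; }
    }

    // A Contact "has" ContactMoments.
    static class Contact {
        final String name;
        final List<ContactMoment> contactMoments = new ArrayList<>();
        Contact(String name) { this.name = name; }
    }

    // A Company "has" Contacts and "has" ContactMoments,
    // so it cannot compile before those two classes exist.
    static class Company {
        final String name;
        final List<Contact> contacts = new ArrayList<>();
        final List<ContactMoment> contactMoments = new ArrayList<>();
        Company(String name) { this.name = name; }
    }

    public static void main(String[] args) {
        Company acme = new Company("Acme");
        Contact alice = new Contact("Alice");
        ContactMoment call = new ContactMoment("Phone call");

        // The same ContactMoment has to be registered in two lists.
        acme.contacts.add(alice);
        acme.contactMoments.add(call);
        alice.contactMoments.add(call);

        System.out.println(acme.contacts.size() + " contact(s), "
                + acme.contactMoments.size() + " contact moment(s)");
    }
}
```

Note that Company depends on both other classes from the start, which is the coupling the drawbacks listed further on refer to.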

Let's also look at the underlying relational model. We don't want to "pollute" our domain tables with list indices, so we create separate tables for the one-to-many and many-to-many relations (a list_index denotes the position of, for example, a Contact within the list of Contacts of a Company). I use MySQL syntax, but that's irrelevant.
CREATE TABLE Company (
 id                  BIGINT NOT NULL AUTO_INCREMENT PRIMARY KEY,
 ... properties of Company ...
)

CREATE TABLE Contact (
 id                  BIGINT NOT NULL AUTO_INCREMENT PRIMARY KEY,
 ... properties of Contact ...
)

CREATE TABLE ContactMoment (
 id                  BIGINT NOT NULL AUTO_INCREMENT PRIMARY KEY,
 ... properties of ContactMoment ...
)

CREATE TABLE Company_Contact (
 companyId           BIGINT NOT NULL,
 contactId           BIGINT NOT NULL,
 list_index          INTEGER NOT NULL,
 
 CONSTRAINT FOREIGN KEY (companyId) REFERENCES Company (id) ON DELETE CASCADE,
 CONSTRAINT FOREIGN KEY (contactId) REFERENCES Contact (id) ON DELETE CASCADE
)

CREATE TABLE Company_ContactMoment (
 companyId           BIGINT NOT NULL,
 contactMomentId     BIGINT NOT NULL,
 list_index          INTEGER NOT NULL,
 
 CONSTRAINT FOREIGN KEY (companyId) REFERENCES Company (id) ON DELETE CASCADE,
 CONSTRAINT FOREIGN KEY (contactMomentId) REFERENCES ContactMoment (id) ON DELETE CASCADE
)

CREATE TABLE Contact_ContactMoment (
 contactId           BIGINT NOT NULL,
 contactMomentId     BIGINT NOT NULL,
 list_index          INTEGER NOT NULL,
 
 CONSTRAINT FOREIGN KEY (contactId) REFERENCES Contact (id) ON DELETE CASCADE,
 CONSTRAINT FOREIGN KEY (contactMomentId) REFERENCES ContactMoment (id) ON DELETE CASCADE
)
This approach has the following drawbacks:

  • It is not iterative: you have to develop the entire domain model before you can start with other tiers, such as the database or the user-interface. Ideally, you want to focus on one class at a time. For example, you want to create the entire Company class first, top-to-bottom, so that you can show that to the customer at the end of the week, without having to worry about Contacts and ContactMoments just yet. That's for next week! In this model, a Company has references to Contacts and ContactMoments, so you have to create them (or at least mock them), before you can continue. Of course, this is a very simple example, but the same applies to more complicated models.

  • You need no less than 6 database tables to represent this OO-model in a clean way (that is, putting the one-to-many and many-to-many relationships in separate tables).

  • Suppose you want to delete a Contact. Now, you have to modify the list indices of all other Contacts of that Company that have a higher index, which means (if you use Hibernate or another ORM-library), deleting the Contact from the list of Contacts of the corresponding Company and then saving that Company again. This is a big hassle.

  • Suppose a Contact moves to another Company or you have mistakenly added a Contact to the wrong Company. Same hassle, you have to modify the list of Contacts of the Company that previously held the Contact and, separately, add the Contact to the correct Company.

  • Suppose that you want to delete a Company and its Contacts. In this model, when you delete a Company, the entries in the one-to-many tables will be deleted (because of the ON DELETE CASCADE), but the Contacts in the Contact table will remain. You have to delete them manually or do some tricky stuff with your ORM-library to take care of it.

  • Suppose that, for whatever reason, you cannot use the lazy-loading techniques of your ORM-library (such as Hibernate). Now, all Contacts and ContactMoments are loaded into memory if you want to adjust just one little field of a Company.
The Iterative Approach
So let's now introduce the Iterative approach. This approach is centered around the principle that you can add one domain object at a time and that the existing domain objects know nothing about the objects that are added later. As we will see, besides the iterative nature, this approach provides some additional benefits, too.


As we can see in the picture, basically all arrows are flipped the other way. The multiplicities however, remain where they were and are therefore on the other side of the arrow now! Let's go through the model:

  • Company: A company that you do business with. Holds its own properties, but knows nothing about the other classes.
  • Contact: A representative of a Company. Each Contact has a reference to the Company it belongs to. Multiple Contacts could refer to the same Company, but one Contact refers to one Company. A Contact knows nothing of ContactMoments.
  • ContactMoment: A moment of contact. Has a reference to the Company it belongs to and to the Contacts it applies to.
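
The flipped references could be sketched in Java as follows. Again, only the class names come from the model; the fields and the wrapper class IterativeModel are assumptions for illustration, and the "iteration" comments simply mirror the order in which the classes are introduced above.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the iterative model: each class only refers to classes that already existed.
class IterativeModel {

    // Iteration 1: Company knows nothing about Contact or ContactMoment.
    static class Company {
        final String name;
        Company(String name) { this.name = name; }
    }

    // Iteration 2: a Contact refers to exactly one Company.
    static class Contact {
        final String name;
        Company company; // not final, so a Contact can be moved to another Company
        Contact(String name, Company company) {
            this.name = name;
            this.company = company;
        }
    }

    // Iteration 3: a ContactMoment refers to its Company and to the Contacts it applies to.
    static class ContactMoment {
        final String description;
        final Company company;
        final List<Contact> contacts = new ArrayList<>();
        ContactMoment(String description, Company company) {
            this.description = description;
            this.company = company;
        }
    }

    public static void main(String[] args) {
        Company acme = new Company("Acme");
        Company globex = new Company("Globex");
        Contact alice = new Contact("Alice", acme);

        ContactMoment call = new ContactMoment("Phone call", acme);
        call.contacts.add(alice);

        // Moving a Contact is a single assignment; neither Company is touched.
        alice.company = globex;
        System.out.println(alice.name + " now belongs to " + alice.company.name);
    }
}
```

In this sketch, Company compiles and can be demonstrated on its own, and each later class only adds references pointing back at what is already there.
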
Let's look at the relational model:
CREATE TABLE Company (
 id                  BIGINT NOT NULL AUTO_INCREMENT PRIMARY KEY,
 ... properties of Company ...
)

CREATE TABLE Contact (
 id                  BIGINT NOT NULL AUTO_INCREMENT PRIMARY KEY,
 companyId           BIGINT NOT NULL,
 ... properties of Contact ...
 
 CONSTRAINT FOREIGN KEY (companyId) REFERENCES Company (id) ON DELETE CASCADE
)

CREATE TABLE ContactMoment (
 id                  BIGINT NOT NULL AUTO_INCREMENT PRIMARY KEY,
 companyId           BIGINT NOT NULL,
 ... properties of ContactMoment ...
 
 CONSTRAINT FOREIGN KEY (companyId) REFERENCES Company (id) ON DELETE CASCADE
)

CREATE TABLE ContactMoment_Contact (
 contactMomentId     BIGINT NOT NULL,
 contactId           BIGINT NOT NULL,
 list_index          INTEGER NOT NULL,
 
 CONSTRAINT FOREIGN KEY (contactMomentId) REFERENCES ContactMoment (id) ON DELETE CASCADE,
 CONSTRAINT FOREIGN KEY (contactId) REFERENCES Contact (id) ON DELETE CASCADE
)
And here are the benefits of this approach:

  • It is highly iterative: you can completely develop the Company class top-to-bottom, show it to the customer, then develop the Contact class, show it to the customer and finally the ContactMoments. You never have to take future expansions into account.

  • You are down to only 4 database tables. We still need one many-to-many table, but we have nevertheless achieved a 33% decrease in the number of tables.

  • Suppose you want to delete a Contact. All you have to do is delete the Contact! The Company will not be affected and if a ContactMoment references the Contact, it will remain, which is probably what you want, since it still represents a moment of contact with the Company and it might reference other Contacts. The entry in the many-to-many table will be deleted, though, which is, again, what you want.

  • Suppose a Contact moves to another Company or you have mistakenly added a Contact to the wrong Company. Just update the Contact with the correct Company-reference and you are done!

  • In fact, there are no more list-indices associated with Contacts and ContactMoments in a Company. They are constructed at runtime, based on the order-by clause of the SQL (or HQL) query. Therefore, you can easily change the ordering of the Contacts and ContactMoments should you need to.

  • If you delete a Company now, all Contacts and ContactMoments will automatically be cleaned up (because of the ON DELETE CASCADE). No more manual deletions.
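
To illustrate how the list of Contacts of a Company is constructed at runtime instead of being stored with list indices, here is a small in-memory sketch. The ContactRepository class, its method names and the name field are hypothetical; in a real application the lookup would be an SQL or HQL query with a where-clause on the company and an order-by clause of your choice.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

// In-memory stand-in for querying the Contact table; all names are illustrative assumptions.
class ContactLookup {

    static class Company {
        final String name;
        Company(String name) { this.name = name; }
    }

    static class Contact {
        final String name;
        final Company company;
        Contact(String name, Company company) {
            this.name = name;
            this.company = company;
        }
    }

    static class ContactRepository {
        private final List<Contact> rows = new ArrayList<>();

        void save(Contact contact) { rows.add(contact); }

        // Plays the role of a query that filters on the company and sorts with an order-by clause.
        // The ordering is chosen per call, so no list index has to be stored anywhere.
        List<Contact> findByCompany(Company company, Comparator<Contact> ordering) {
            return rows.stream()
                    .filter(c -> c.company == company)
                    .sorted(ordering)
                    .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) {
        Company acme = new Company("Acme");
        Company globex = new Company("Globex");
        ContactRepository repository = new ContactRepository();
        repository.save(new Contact("Bob", acme));
        repository.save(new Contact("Alice", acme));
        repository.save(new Contact("Carol", globex));

        Comparator<Contact> byName = Comparator.comparing(c -> c.name);
        for (Contact c : repository.findByCompany(acme, byName)) {
            System.out.println(c.name);
        }
    }
}
```

Changing the ordering is a matter of passing a different comparator (or, in the database, a different order-by clause); nothing stored has to be rewritten.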
Some considerations

  • You might argue that you do want your Contacts referenced in your Company, because you will want to display them on the company page anyway. In the Iterative model, you have to perform a separate query to retrieve the Contacts of a Company, whereas they were readily available to you in the Traditional model. This is certainly true from an OO-perspective, but when it comes to performance and the number of queries to the database, it makes no difference. Your ORM-framework will make the same query to retrieve all the Contacts for a Company in the Traditional model too. You don't have to do it yourself anymore though, that is certainly true.

  • You might also argue that this goes against the object oriented principle that you should have proper "has-a" constructs. This might be true, but on the other hand, it is just a matter of how you look at it. Let's go back to our earlier example of the Car that "has" Wheels. This might seem logical, but does a Car really "care" whether it has wheels? The axles will turn when the engine is on, with or without wheels. You could argue that a Wheel "has-a" Car that provides it with torque. This may seem far-fetched, but I want to show you that traditional "has-a" relationships are tightly linked to the human interpretation of how things are and that the traditional OO principle may not be the only way to model things.

    Most people have principles in order to avoid the effort of thinking

    - Fliegende Blätter -

  • You might say: "Gert-Jan! You should never ever let your OO model influence your relational model and vice versa! They should be independent and the mismatch should be solved at the ORM layer!" I then would say: "You are absolutely right!" After all, my main motivation for this approach is the iterative nature of it: the ability to complete one part of an application without having to think of the next part. Trust me, it makes development so much easier! The positive consequences on the DB-side are just an additional benefit. Who says there's no such thing as a free lunch?

  • And finally: I'm definitely not suggesting that this is a Golden Hammer or a Holy Grail! There's no such thing... All I'm saying is that this approach could lead to faster and easier development in use-cases that lend themselves to it. And you can easily mix this approach with other Design Patterns, so that you can take full benefit of all the good things that OO has to offer!
Update: I got a lot of questions about why I would use join-tables to model the one-to-many relations in the Traditional approach. Basically, the reason is twofold:

  • I don't want to "pollute" domain tables with list indices, as I said before.

  • Suppose you have a one-to-many relationship Car -> Wheel. Many people suggested that a Wheel should have a foreign key to Car. But now we are going to add the one-to-many relationship Bike -> Wheel. Now what? You're stuck. When you use join-tables, you simply add a table for the new relationship. In the Iterative model, the one-to-many becomes many-to-one, so in that case, you can do it with a foreign key. But as I said, you should only apply this approach when applicable. In the Car, Wheel, Bike story, it is clearly not the best way to go, but there are a lot of cases where it is.
And please remember, the topic is about the Iterative approach in OO, the DB-things are just a sidestep. End of update.

Wednesday 17 November 2010

A talk with Brian Goetz on the future of Java

Two days ago, at a one-day mini-conference in The Netherlands, I had the great honor of speaking with Brian Goetz. He talked about the future of Java, Project Coin, Concurrency, Fork-Join and Lambda-Expressions and I asked him about the Oracle takeover, Java on mobile and why Oracle is getting so much bad press lately. Here's an abstract of this very interesting day!

I must start by emphasizing that the entire content of this blogpost is my interpretation of everything Brian Goetz has said and may not be what was literally expressed by Brian Goetz.

For those of you that don't know, Brian Goetz is a Senior Java Language Architect at Oracle, working on the development of the Java language and platform. This development hasn't had much progress in recent years, because of Sun almost going out of business, but now that Oracle has taken over Sun, there seems to be much more perspective for the development of Java, with a clearly laid-out roadmap and well-specified plans.

To start with, as many of you probably already know, the next version of Java will be split among 2 releases, Java 7 and Java 8. Java 7 will incorporate many things that are "already done for the most part" and Java 8 will have the "bigger things that need more time". The roadmap is now as follows:

Java 7:

  • InvokeDynamic
  • Project Coin
  • Upgrade to ClassLoader Architecture
  • Method to close a URLClassLoader
  • Concurrency & Collections updates (including Fork-Join)
  • Unicode 6.0
  • Locale Enhancements
  • NIO 2
  • TLS 1.2
  • ECC
  • JDBC 4.1
  • Translucent & Shaped Windows
  • Heavyweight / Lightweight Component Mixing
  • Swing: Nimbus & JLayer
  • Update to the XML-Stack (JAXP, JAXB & JAX-WS)
Java 8:

  • Modularization of the JVM
  • Project Lambda
  • Annotations on Java types
  • Project Coin part 2
  • Swing JDatePicker
Because of this split-up, Java 7 can now be released somewhere mid-2011, whereas Java 8 will be released somewhere in 2012.

One important thing is missing from these lists: Java on mobile. I asked Brian why it is still so difficult to develop for smartphones. They're all different, there is no single unified platform and it is not that easy to start a Java Applet (such as a JavaFX application) on a smartphone. Ideally, you don't want a distinction between Java SE and Java ME, they should be one platform and it should be just as easy to start a Java Application on a smartphone as it is on a desktop (i.e. Java Web Start). Brian responded by saying that despite the advancements in technology, the cheapest phones don't really get any faster, so it is difficult to create a unified Java platform for both desktop and mobile. Also, Oracle doesn't have any concrete plans yet, so what he could tell me was that "they are working on it". This is not as negative as it sounds though. Brian explained to me that the way Oracle communicates with the rest of the world is actually quite different from how Sun used to do it: Sun was very fast with promises about anything and mostly they were promises they couldn't keep. With Oracle on the other hand, you can count on them to deliver on their word. That also means that they can't say anything about stuff that hasn't been officially put on the roadmap. They have many very large clients, who base their decisions on what Oracle says in the media, so they really have to be careful about that.

Update: Brian sent me an email, explaining that mobile stuff is on an entirely different roadmap. This is the Java SE roadmap, whereas mobile stuff is on the Java ME roadmap. This does mean, however, that Oracle, at least for now, is not planning to merge the two, because in that case, there would be only one roadmap. I think this is a shame, firstly because Sun used to say that the two platforms were indeed going to merge (but maybe this was one of the "false promises" after all) and secondly because I think that unification of the platforms is very important, now that mobile and desktop are moving more and more towards each other. If nothing happens in this area, the "write once, run everywhere"-paradigm, is in serious danger. So let's hope that Oracle is planning some progress here, even though they are not talking about it publicly. End of update.

Brian was also joking about where in movies, you sometimes see people with a little angel on one shoulder and a little devil on the other. At Oracle, he said, you have a little angel on one shoulder and a little lawyer on the other...

But kidding aside, Brian told me that he was very impressed with the quality of management at Oracle. He thought that they are very smart people, who really know how to run a company. And really, there isn't as big a difference between cultures as you might think. They actually get along really well and for the first time in years, Brian actually has a budget to hire people again.

I also asked Brian about the bad press that Oracle has been getting lately. He explained that this also has to do with the fact that Oracle doesn't just say things in the media. They think about it really well before issuing a press release. That is why, in a lot of cases (such as the Apple JVM deprecation), only one side of the story is told in the media, with Oracle not responding to it. People start assuming the worst and when you combine that with the commercial reputation of Oracle, it is very easy to get bad press. Brian assured me, however, that Oracle is really committed to maintaining Java, because Oracle doesn't think about making money in the short term, but really focuses on the future and what is better for Oracle than Java being the number one programming language in the world? For the first time in years, there's really some progress going on.

During the day, Brian gave some very interesting presentations about Concurrency, The Fork-Join Framework and, the most interesting one, Project Lambda. This last one will be absolutely brilliant. Lambda expressions will be "translated" to SAM-types. A SAM-type is a "Single Abstract Method"-type, being an interface with a single method or an abstract class with a single abstract method. A Lambda expression will then have the same type as this SAM-type. Therefore, Lambda expressions can be used in any place where SAM-types are used today, for example ActionListener, Comparable, Runnable, etc., making them instantly compatible with a large part of the existing API. I'm not going into this very far now, because it will take too much time and I will never be able to explain it as well as Brian can, but I can't withhold from you this piece of code:
//Old-style code:
Collections.sort(people, new Comparator<Person>() {
 public int compare(Person x, Person y) {
  return x.getLastName().compareTo(y.getLastName());
 }
});

//Lambda expressions & SAM-conversion:
Collections.sort(people, #{ Person x, Person y -> x.getLastName().compareTo(y.getLastName()) });

//Better Libraries:
Collections.sortBy(people, #{ Person p -> p.getLastName() });

//Type Inference:
Collections.sortBy(people, #{ p -> p.getLastName() });

//Method References:
Collections.sortBy(people, #Person.getLastName);

//Extension Methods:
people.sortBy(#Person.getLastName);
So, as you can see, besides just Lambda expressions, there will be many other optimizations that will allow programmers to optimally take advantage of Lambda expressions. Lambda expressions will also make it easier to write code that is optimized to run on multi-core computers. This is an area where the Fork-Join framework will also provide huge improvements. I've seen some details and the implementation is really smart.
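
For comparison: the #{ ... } notation above is the syntax as it was presented in 2010. The syntax that was eventually released with Java 8 looks different, although the ideas map over closely. The sketch below shows the same sort in released Java 8 form, using a lambda, a method reference with Comparator.comparing, and List.sort (a default method on the List interface). The Person class and the sample names are assumptions for illustration only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

// Requires Java 8 or later. Person and the sample data are illustrative assumptions.
class LambdaSortDemo {

    static class Person {
        private final String lastName;
        Person(String lastName) { this.lastName = lastName; }
        String getLastName() { return lastName; }
        @Override public String toString() { return lastName; }
    }

    static List<Person> samplePeople() {
        return new ArrayList<>(Arrays.asList(
                new Person("Goetz"), new Person("Bloch"), new Person("Gafter")));
    }

    // A lambda expression converted to the SAM-type Comparator<Person>.
    static List<Person> sortWithLambda(List<Person> people) {
        Collections.sort(people, (x, y) -> x.getLastName().compareTo(y.getLastName()));
        return people;
    }

    // A method reference combined with a key-extracting library method.
    static List<Person> sortWithMethodReference(List<Person> people) {
        Collections.sort(people, Comparator.comparing(Person::getLastName));
        return people;
    }

    // List.sort is a default method, added to the List interface itself.
    static List<Person> sortWithDefaultMethod(List<Person> people) {
        people.sort(Comparator.comparing(Person::getLastName));
        return people;
    }

    public static void main(String[] args) {
        System.out.println(sortWithLambda(samplePeople()));
        System.out.println(sortWithMethodReference(samplePeople()));
        System.out.println(sortWithDefaultMethod(samplePeople()));
    }
}
```

All three variants sort the list in place by last name; they differ only in how the comparator is expressed.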

When I told Brian that I thought that the SAM-conversion idea was really brilliant, he politely gave the credit to Joshua Bloch, who apparently came up with the name, and Neal Gafter, for the idea...

Finally, I asked Brian about Java on Apple mobile devices such as the iPhone and the iPad. His answer was both mysterious and hopeful. He said that Larry and Steve are really good friends...

So there you have it. One day with Brian Goetz and I have to tell you: at first, when I heard about the Oracle takeover, I was terrified, just as everyone else. The big bad giant that only cared about money was going to destroy Java. But gradually, as the Oracle roadmaps started to trickle out, I became more positive. Still cautious, but more open minded. Then, after reading many articles about JavaOne, I was starting to think that the takeover could actually be beneficial for Java. And now, after this day, I'm positive about that. It is going to be beneficial for Java. I finally had the chance of talking to someone who really knows what's going on and I'd like to tell all of you that I now have full confidence that Oracle is going to keep Java the greatest platform in the software-world!

Sunday 3 October 2010

How to build a Java buildserver

At Zenbi, we develop software systems using the Java language and technology. In this article, I will give you an insight into the architecture of our buildserver and development infrastructure.

For those of you who haven't ever used or even heard about buildservers, let me explain what they are and what they offer. Without a buildserver, you probably only use two tools in your development infrastructure: your IDE, such as Eclipse, IntelliJ or NetBeans and a version control system (VCS), such as CVS, SVN or GIT. Everyone checks their code into the VCS at the end of each day and when the application is done, the lead developer builds the deliverables with his/her IDE and emails them to the server admins who install them on the servers. This could work very well in a small environment with only a few developers sitting in the same room, but when things get bigger and more complicated, the following problems arise:

  • There is no uniform way to build the final deliverables, because the local environment of the lead developer may change
  • Every developer must copy dependencies to his/her PC, leading to problems when different versions of dependencies are used
  • There is no uniform way to test the application. Unit-tests are run locally, making them dependent on the local PC environment
  • There is no way to automatically deploy the deliverables to a Development, Test or Production environment
  • You have no way of customizing the build-process of the deliverables. You could write some Ant-scripts, but they would probably only work on the PC on which the deliverables are built
  • You don't have statistics on the code, such as test coverage, bug-reports (such as PMD, Checkstyle or Findbugs) or metrics (such as JDepend)
A buildserver offers a centralized component in the infrastructure that can build and analyze the deliverables, manage the dependencies and deploy the deliverables to the servers.

Components
So, what components make up a buildserver? First of all, I would like to tell you that all software that we use at Zenbi is Open Source, we use Linux, OpenOffice, Eclipse, etc. Our buildserver too, is completely built with Open Source components, so if you want to copy our approach, you don't have to be afraid of costs.

The first component is a component that everyone, even the people without a buildserver, has. It's the VCS, the version control system. This is an absolutely necessary component, used to keep track of changes in source-code and providing a central storage point for all source-code. At Zenbi, we use SVN for this. I also have experience with CVS, but that one is clearly outdated and inferior to SVN. The new kid on the block here is GIT. We haven't tried it yet and the main reason why I have my doubts about it, is because it doesn't have version control on directories, but on files only. This was one of the major improvements of SVN compared to CVS, so I don't understand why GIT doesn't have it. Anyway, use whatever VCS you like, they're all Open Source.

The second component is used to actually build the deliverables from the source-code. As I said earlier, without a buildserver, you would do this with your IDE (for example Eclipse), but since this is a "client" application, you cannot use it to build in batch-mode (from the command-line). So therefore, you will need another tool, let's call it a Batch Builder. For this, we use Maven 2. Here too, there are many alternatives, Ant being the most famous one, but we chose Maven because it offers much more than just building your deliverables. Maven also provides a lot of plugins with which you can extend and modify the build-process, such as reporting plugins, that generate useful HTML-reports about your source-code (possible bugs in your code, test coverage by unit-tests, which are also automatically run by Maven during each build, and dependency metrics). You can write your own plugins too, if you want, to customize the build-process even further. Finally, Maven offers a central repository that contains all dependencies (jar-files) that you need. You can access this repository from your IDE (such as Eclipse), so that you won't have to copy all dependencies to your workstation anymore, making sure that there is only one place that contains dependencies, preventing version-hell. Maven also copies the actual deliverables from the build-process to this repository, so that you can re-use them as dependencies in other projects. Maven does add some configuration files to each project, but it's definitely worth it.

Maven however, is a command-line tool that you can start. It runs, and when it is finished, the process ends. There is no background process that is running continuously (I just said that you can access the Maven repository from your workstation, but this is filesystem-access, a.k.a. a "share", it does not require a separate "server"-process). So, who starts Maven? Does the buildserver administrator log in each day to start Maven for each project? Of course not. For this, we have the next component, the Continuous Integration Tool (CIT) and we at Zenbi use Luntbuild for this. Hudson is also a very popular and good choice, but CruiseControl and Apache Continuum are outdated, don't use them. The CIT runs as a background-process with a web-interface (Luntbuild has an embedded Jetty webserver). You can start builds from that web-interface, but you can also configure the CIT to schedule automatic builds (for example each night). This way, all source-code that is checked in by the end of the day is automatically built during the night. Luntbuild keeps its own "local working copy" of each project, which is basically a checkout from the VCS. Before each build, it updates the LWC and then it runs Maven on it. Maven builds, deploys to the Maven repository and Luntbuild tags the current version in VCS with the current buildnumber, so that you know which build belonged to which version of the source-code. It's that easy!
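
The build cycle described above (update the local working copy, run the build, tag the version with the buildnumber) can be sketched as plain Java. This is not how Luntbuild is implemented: the VersionControl and Builder interfaces, the tag format and the choice to tag only successful builds are all assumptions, made up to show the order of the steps.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of one Continuous Integration cycle; none of these types belong to Luntbuild or Maven.
class NightlyBuildSketch {

    interface VersionControl {
        void updateWorkingCopy(String project);
        void tag(String project, String tagName);
    }

    interface Builder {
        boolean build(String project); // true when the build succeeded
    }

    static class ContinuousIntegration {
        private final VersionControl vcs;
        private final Builder builder;
        private int buildNumber = 0;

        ContinuousIntegration(VersionControl vcs, Builder builder) {
            this.vcs = vcs;
            this.builder = builder;
        }

        // Runs one build; returns false for a failed build (the "big fat red dot").
        boolean runBuild(String project) {
            buildNumber++;
            vcs.updateWorkingCopy(project);      // 1. refresh the local working copy
            if (!builder.build(project)) {       // 2. run the batch builder on it
                return false;
            }
            // 3. tag the sources with the buildnumber (assumption: successful builds only)
            vcs.tag(project, project + "-build-" + buildNumber);
            return true;
        }
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        VersionControl vcs = new VersionControl() {
            @Override public void updateWorkingCopy(String project) { log.add("update " + project); }
            @Override public void tag(String project, String tagName) { log.add("tag " + tagName); }
        };
        Builder builder = project -> { log.add("build " + project); return true; };

        ContinuousIntegration ci = new ContinuousIntegration(vcs, builder);
        boolean green = ci.runBuild("crm");
        System.out.println((green ? "green" : "red") + " " + log);
    }
}
```

A scheduler would simply call runBuild for every project each night; the web-interface only has to show the returned status per project.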

Automatic nightly builds have another advantage that is also very funny and brings joy to the department! Imagine for example, that someone checks in code that does not compile. Then, at night, the Maven-build will fail and Luntbuild will display a big fat red dot next to the corresponding project. Thanks to the VCS, it is easy to find out which developer checked in the faulty code and he/she has to buy cake for the entire department the next day! This way, everyone will make sure that their code works, before checking in.

Beware, though, of people who avoid checking in at all, because they know that their code won't compile... This is dangerous! The longer someone's code isn't integrated with the other people's code (hence the name "Continuous Integration"), the bigger the incompatibilities when it finally is. Also, workstations usually aren't backed up regularly, unlike the buildserver, so if a workstation harddisk crashes, it's goodbye to all code that wasn't checked in...

I told you that Maven generates useful HTML-reports about your source-code. These are just HTML-files that can be placed anywhere by Maven, but they will have to be published somehow. Therefore, we have also installed a Webserver on our buildserver. We use Apache Tomcat for this, but feel free to use any other.

We have also installed a Wiki on the Webserver, for general information sharing in the team. This doesn't really have anything to do with the build-duties, but the buildserver provides a centralized location for the development team and we have a Webserver running here anyway. Our choice here is JSPWiki, but there are many alternatives.

Next up is the database. Since almost every application uses a database, you need one for testing. Most companies use a database in the Development environment for this, but I'm not a very big fan of this approach. The Development environment is supposed to be identical to the Production environment (in architectural terms, not content), so that developers can test their applications before release. When you use the database in this environment for other purposes (local testing, local unit-testing or buildserver unit-testing), you "pollute" the database. The Development environment is then no longer identical to the Production environment. Therefore, we have installed a separate database on the buildserver, used for said local testing, local unit-testing and unit-tests run by the buildserver. Our choice here is MySQL, because we use that in Production too. Be sure to use the same database as in the other environments, like Test and Production!

And finally, the last component. It's the SystemManager, used to deploy the deliverables (we call them Systems, hence the name SystemManager) to the different environments, such as Development, Test and Production. I have to take back what I said earlier, this one is not Open Source, we built it ourselves. Deploying to an environment is usually very company-specific, so tools like this will almost always have to be developed inhouse. What it does is very simple though, it takes the most recent deliverables from the Maven repository and deploys them to the specified environment. The SystemManager is a command-line tool, without a web-interface, so an admin will have to log in to the buildserver in order to do it. We chose this option for security reasons, because you don't want anyone to deploy just like that. If you do want a web-interface, well it's your tool, so go ahead and develop it!

To make things a little more insightful, below is a picture with the different components of our development infrastructure. Between parentheses is the actual name of the tool that we use, but as I said, you can choose others.


Extensions
There are some possible extensions to this setup that are sometimes used by other companies. One of these is a repository manager like Archiva or Artifactory. It acts as an intermediary between the Maven repository and the components that want to use the Maven repository and offers functionality such as version control on dependencies and identification of the person who submitted a dependency to the repository. We don't need such a component, because we see no harm in accessing the Maven repository directly and the buildserver administrator is the only authorized party to add new dependencies to the repository. Also, the above picture is complex enough for the average developer. Remember, we are not all buildserver admins. If you keep the usage of the buildserver simple, developers are much more likely to contribute to a clean environment. Make it complex and they see the buildserver as a difficult piece of overhead that is best left alone.

Simple goals and rules lead to intelligent behaviour,
complex goals and rules lead to stupid behaviour

- Albert Einstein -

Another thing to take care of is release management. You want to have control over the different versions that are released. This is also an issue where a repository manager can help. We, however, keep this very simple. Basically, we stick to the Maven standard of keeping the head revision of the application a SNAPSHOT revision. Every iteration, we create a branch for the release. Changes in that branch will also have to be merged into the trunk, of course. Because the branches and the trunk have different Maven versions, Maven keeps them separated in the Maven repository. That way, you don't need a repository manager to keep track of the different versions that you have.
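To illustrate why different Maven versions never collide, here is a minimal Python sketch of the standard Maven repository layout, in which every version of an artifact gets its own directory. The coordinates `com.example`, `crm`, `1.2` and `1.3-SNAPSHOT` are made up for the example; they are not the names we actually use.

```python
from pathlib import PurePosixPath


def artifact_directory(group_id: str, artifact_id: str, version: str) -> PurePosixPath:
    """Return the directory of one artifact version, relative to the repository root.

    Standard Maven layout: the dots in the groupId become directory
    separators, followed by the artifactId and then the version.
    """
    return PurePosixPath(group_id.replace(".", "/")) / artifact_id / version


def is_snapshot(version: str) -> bool:
    """Maven treats any version that ends in -SNAPSHOT as a development version."""
    return version.endswith("-SNAPSHOT")


# Hypothetical coordinates: the trunk is a SNAPSHOT, the release branch is not.
trunk = artifact_directory("com.example", "crm", "1.3-SNAPSHOT")
release = artifact_directory("com.example", "crm", "1.2")

print(trunk)    # com/example/crm/1.3-SNAPSHOT
print(release)  # com/example/crm/1.2
```

Since the version is part of the path, the trunk build and the release-branch build end up in different directories without any extra tooling.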

Apart from that, each build gets its own version number from Luntbuild. We don't really use those version numbers, but because Luntbuild tags each build in the Subversion repository, we can always get the source-code of a certain build back. Luntbuild even supports re-building. That way, you can re-build a previous build, with the source-code as it was at that time. We have also developed a Maven plugin that writes the buildnumber to the actual application (in an xml-file), so that you can identify the buildnumber of a running application.
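As an impression of how a running application could report its build number, here is a small Python sketch that reads it from an xml-file. The file format shown (a `build` element with a `number` child) is purely an assumption for this example; the format that our own plugin writes is not shown in this article.

```python
import xml.etree.ElementTree as ET

# Hypothetical content of the xml-file that the build writes into the application.
SAMPLE_BUILD_INFO = """
<build>
    <number>142</number>
    <version>1.3-SNAPSHOT</version>
</build>
"""


def read_build_number(xml_text: str) -> int:
    """Extract the build number from the (assumed) build-info document."""
    root = ET.fromstring(xml_text.strip())
    number = root.findtext("number")
    if number is None:
        raise ValueError("build info has no <number> element")
    return int(number)


print(read_build_number(SAMPLE_BUILD_INFO))  # 142
```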

So, my advice is, if you want to extend your buildserver, keep one thing in mind: keep it simple! At my previous job, the buildserver was far more complex, with both Tomcat and Websphere running, with a CI tool (CruiseControl) that consisted of two separate parts, with a custom-made management application, with a separate release versioning tool (Harvest), with four different version-numbers per application (Maven version, CruiseControl buildnumber, Harvest version and a so-called "application-version" that I'm not even getting into), etc., etc. I have ample experience with developers avoiding as much as they could of that buildserver, because it was just too complicated. And still, some people wanted to install a repository manager on top of all that! Luckily, I was in the position to prevent that...

It takes an intelligent person to build something complex;
it takes a genius to build something simple

- Albert Einstein -

Be the genius!

Friday 23 July 2010

Why you should talk to tech people

Lately, an old and regularly discussed subject has resurfaced on the Internet once again: at a software development company, should technical people be allowed to talk to customers, or should that be done by sales people and accountmanagers? In this article, I will give you my take on the subject.

A while ago, I had such a conversation with a customer. For those who don’t know, I am a technical person, a really technical person. Despite that, I do have some conversational skills and from time to time, you might even call me social. Anyway, how did that conversation go? Did it go smoothly? Did we understand each other? Did we agree on various subjects? And the most important thing: was the client happy in the end?

To start off with the first question: no, it did not go very smoothly, especially in the beginning. Being a real technician, I started to ask why he wanted what he was asking for, what the underlying purpose was and finally, I told him that he shouldn’t want what he wanted at all and that he should go in a different direction altogether.

The client was startled, to say the least. He expected someone who didn’t give him a hard time, who understood what he asked and just delivered. Instead, all of a sudden, the client was forced to rethink his entire plan.

I am on the client side myself too, every now and then, when it comes to technology. On one such occasion, when I was buying some hardware, I was talking to a real salesman. He knew a thing or two about the things he was selling, but mainly his job was to get potential customers to buy what they needed from his company. At first, I felt comfortable talking to him. He was nothing like the notoriously aggressive and commercial sales sharks, no, this man was really trying to help me solve my problems. But as time went on, the number of questions that he couldn’t answer kept growing. At a certain point, all he was telling me was that he would "consult the tech department" with my questions. He really became a redundant link in the chain. He also never suggested anything other than what I asked for, while, in retrospect, some things I asked for were not exactly the best solution.

I think that this is one of the biggest drawbacks of non-technical middlemen: they can’t answer all of the client’s questions and, maybe more importantly, they don’t know which questions to ask the client. The client knows a lot about his or her core-business, the technical person knows a lot about technology, but what does a middleman know? An area where a middleman is useful is in the role of "translator". Sometimes the client isn't able to tell the technical person what he or she wants and in this case, a middleman can be very useful. I do, however, want to make a clear distinction between two situations:

  • The client does not know what he/she wants: In this case, middlemen are not always necessary, since technical people are very capable of suggesting different solutions to the customer's problems.
  • The client can't explain what he/she wants: This is a very different case! In this case, middlemen who "speak both languages" can be extremely useful.
A mistake that is sometimes made by non-technical salespeople is that they promise clients whatever they want and tell them whatever they want to hear, only to have to retract all these promises at a later stage, when the technical people tell them that it cannot be done.

Nothing is impossible for the person who doesn't have to do it

- A.H. Weiler -

When you talk to a technical person, communication might not go as smoothly, you might need more time to get to understand each other, but when you finally do, you are absolutely certain that the things you agreed upon, the things he or she promised you, can and will be done. I think this answers the second question. It took us longer to understand each other, yes, but the understanding that we finally achieved, was worth so much more.

This brings us to the third question, did we agree on various subjects? I am not talking about price and contractual obligations here. I am talking about the actual project subjects. This seems strange, doesn’t it? Agree upon things? What’s that all about? Shouldn’t the supplier just deliver what the client wants? Isn’t the customer always right? No, he is not. Sometimes, the client doesn’t know what he or she wants, or sometimes they want the wrong things and I see it as my duty to help them shape the outcome of the project and to correct them where needed. After all, that’s why they come to me, right? I am supposed to be the expert on the subject.

Technical people in particular are able to show different approaches and solutions to a customer. Where a salesman always takes the demands and questions of a customer as a starting point, a technical person is able to take the final goals, purposes and targets of the customer as a starting point and then to come up with a modified or completely different solution than what the client asked for. Technical people are just more creative in this area and this is very important, because a project that has a flawed design, will never satisfy a customer.

Projects don’t fail in the end; they fail at conception

- Tony Collins -

After all this, you might want to ask: "If communicating with technical people has so many advantages, then why don't most software development companies let the technical people in on the customer conversations?" I can counter this easily, by asking: "Why do so many IT-projects fail?"

Of course, the client must be willing to invest some extra time and effort. If he just wants to hear that everything is going to be all right and the end result is going to be great, then he shouldn’t complain when in the end, things don’t look as rosy as they did in the beginning. And yes, some customers really are like that.

I must make three reservations, though. Letting technical people do all the talking can also have its drawbacks.

First of all, they exist. Those socially impaired geeks who are brilliant at programming, but just can’t communicate. Of course, you should never let them talk to the customers. But not all technical people are like that. In fact, those nerds often have lots of trouble fitting into a team, so although they do exist, you usually don’t find them in development teams where people have to do lots of communication. Mostly, they reside in the basement in the systems administration department (IT Crowd, anyone?)

Secondly, the decision to let the technical people do all the communication with the customer can drive the programmers crazy when the customer calls every single day with all kinds of trivial questions. Programmers are at their most productive when they can work uninterrupted for long periods of time. So, during the design and development stage, communication should include the technical people, but once the system runs, the technical people should be somewhat shielded from continuous interruption. This does not mean, however, that you need accountmanagers or salespeople for this after all. You could, for example, require your customers to only communicate via email if they have a non-urgent question, so that the technical people can respond whenever they see fit, albeit within a reasonable timeframe, of course, for example two days. You could also employ a separate support department, that handles all trivial questions and forwards the rest to the technical people. In that case, the accountmanagers and salespeople can do what they are best at: perform the role of translator when it comes to new requirements.

And thirdly, I make a clear distinction between sales people and accountmanagers that form the link between the customer and the technical people on one hand, and marketing and sales professionals who do acquisition on the other. Marketing and acquisition is in a league of its own, a profession of which technical people generally do not know a lot. But once the potential customer is reasonably convinced that he can be helped, it is better to let a technical person join the conversation than let all the talk be done by sales people, because in that stage, there are definitely some technical issues that have to be taken care of.

So my conclusion is that in the end, letting technical people join the communication with the client is definitely the way to go. The extra initial investment will definitely pay for itself later on and both developer and client will look back on the project with satisfaction.

Oh, and as for the final question: In the end, the client was one of the happiest clients I’ve seen in a long time.

Tuesday 18 May 2010

A New World

We all know them, the many problems and challenges that we have to face when it comes to IT. In this article, I will not only give you advice about how to deal with those issues, but I will also explain how a good IT-system can influence the entire way a company is organized and the effects it can have on employees and company strategy.

Let's start this article with the simple but astonishing fact that the computer is the only device in our lives that we consider normal when it does not work correctly. Personally, I don't find this normal at all, but then again, I have a degree in Computer Science. But should it really be normal for people without such expertise? Let's face it, the vast majority of companies do not have IT as their core business. Should they be satisfied with systems that periodically fail, because "that's just what computers do"? Absolutely not. Every person and every organization should have systems that do whatever they want, whenever they want it. And by "they", I mean the people, of course, not the systems...

So, what are those common challenges that almost every organization faces? Well, the list is not exactly small:

  • The presence of old systems, that are very difficult to develop and manage.
  • Different systems for different tasks, that do not integrate well. Data has to be manually transferred from one system to another.
  • Systems that don't do exactly what you want.
  • Systems that are too complex and provide way more options than you will ever use.
  • Systems that only work inside your company building, causing you to be trapped inside that building. Moving to another (larger) building will be very expensive in this case.
  • Computers that get slower and slower as time passes.
  • Complex network and server infrastructure, that takes a lot of effort to maintain.
  • The fact that nobody within your organization exactly knows what systems and software packages there are and how they work.
  • And, of course, the single biggest problem that follows from all the others: it just costs too much money.
I worked at a big financial institution for three years. The IT-infrastructure there had many problems. There were literally thousands of different systems, all connecting to one another in some way. If new functionality had to be developed, not one, but dozens of systems had to be modified and a lot of new systems had to be built in order to transfer information between systems correctly.

Nobody in the company knew the global architecture of the infrastructure, and the rate at which systems were added, making things even more complex, was only accelerating. There were a lot of ancient systems from the '70s, systems sometimes took more than a minute to respond to a mouse-click, the balance between functionality and complexity was nowhere to be found and IT-costs were in the hundreds of millions per year.

This example is only applicable to very large companies, of course, but I'm sure everyone recognizes at least a few of the problems listed. And if not, my compliments! You're doing a great job. You still might want to read on, though, because solving problems is not the only subject of this article!

The Problem
So what causes all of these problems? There are different causes, of course, but they all come down to one single thing: complexity. When systems become too complex, problems are never far away. This applies to every aspect of IT: From personal computers filled with a lot of different programs and tools to server and network infrastructures that look more like a spider web than a hierarchical tree. Things shouldn't be complex, they should be simple.

Simplicity is prerequisite for reliability

- Edsger W. Dijkstra -

The next obvious question then is: What causes this complexity? Here, I am going to make a distinction between small and large companies.

For large (multinational) companies, the answer is basically twofold. First, they have been doing IT for a long time, probably ever since IT became available for businesses (between 1960 and 1970). Therefore, they just have a lot of "old stuff" running. Investing in new functionality is always more interesting than upgrading old systems, after all, why would you pay for something that you already have and that already runs? To avoid getting stuck in a maze of old stuff, of course, but tell that to the managers who have to explain to their superiors what they have done with their budgets.

The other answer lies in the fact that big companies often have to deal with takeovers. They acquire another company, or get acquired themselves. Every time two companies merge, their IT infrastructures have to merge too, something that never happens in an optimal and complete way and always adds to the complexity of the whole, without offering new functionality.

At the big financial institution I have seen these two causes happen over and over again. It's no one's fault, it's just difficult to avoid and almost impossible to get out of.

Smaller companies fortunately do not have these major causes of concern. They are flexible and dynamic and can easily adapt to new situations. Then why is it that they also have to face so many difficulties? I think this is caused by the fear of change. Let me explain. When a company starts, it usually has nothing more than some Word-documents, Excel-sheets and maybe an accounting program. This is fine. But when the company grows and starts to hire people, their IT needs also grow. And herein lies the problem. Those growing needs are often fulfilled by adding components (software / hardware) to the existing infrastructure. New components are added, but the old ones remain. After all, they work, so let's not touch them, IT is difficult enough as it is!

Let's compare this to a company building. If a company has a building that becomes too small, then what do they do? Keep building extensions to the building every time they need more space? Of course not. They move to a larger building that is optimally suited to handle a larger workforce. The same is true for cars: when a car becomes too small, you buy a bigger one. Then why do we stick to our old systems, when there is something better available? Why keep adding and adding to an increasingly complex infrastructure, without ever taking the time to evaluate the entire IT-environment and move to an entirely new system that is perfectly suited to your company as it is today? So my advice is: keep the number of different systems as low as possible. Try to find systems that provide a solution to all of your needs, not just your new needs, and replace your old systems with the new ones before you get stuck. This may be a rather big investment in both money and effort, but trust me, it will pay for itself sooner rather than later.

A few years ago, I was involved in an IT innovation project for a furniture store. It was a rather small company, with only a few employees, so you would expect a simple IT-infrastructure, but no, they too had a large number of systems, one for every part of their business process. A system to store customers, a system to manage their suppliers, a system for making a sale to a customer, a system for placing an order at a supplier, etc. The employees were spending a large portion of their day transferring data from one system to the other. They even reserved hours for "keeping the systems up-to-date".

At the end of the project, we had created a single system that integrated everything. You could manage your customers and suppliers and if you made a sale to a customer, then an order was automatically placed at the corresponding supplier if the item was out of stock. The different phases of a customer order ("new", "ordered at the supplier", "delivered by the supplier", "sent to the customer" and "paid by the customer") were taken care of by the system. You could always see the status of an order and the only thing that the employees had to do was click a button to move the order to the next phase. The complexity of the company's infrastructure went down tremendously, the level of control that the store had over its processes increased and the company owner was very happy. The only complaint I ever heard came from the employees: they were now bored, because they had nothing to do anymore...
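To give an impression of how little is needed for such an order flow, here is a minimal Python sketch. It is not the code of the actual system; the names `Phase`, `Order` and `make_sale` are invented for this example, and the stock is modelled as a simple dictionary.

```python
from dataclasses import dataclass
from enum import Enum


class Phase(Enum):
    """The five phases of a customer order, in the order they occur."""
    NEW = "new"
    ORDERED_AT_SUPPLIER = "ordered at the supplier"
    DELIVERED_BY_SUPPLIER = "delivered by the supplier"
    SENT_TO_CUSTOMER = "sent to the customer"
    PAID_BY_CUSTOMER = "paid by the customer"


PHASES = list(Phase)  # Enum members iterate in definition order


@dataclass
class Order:
    item: str
    phase: Phase = Phase.NEW

    def advance(self) -> Phase:
        """The 'button click': move the order to the next phase."""
        index = PHASES.index(self.phase)
        if index == len(PHASES) - 1:
            raise ValueError("order is already in its final phase")
        self.phase = PHASES[index + 1]
        return self.phase


def make_sale(item: str, stock: dict, supplier_orders: list) -> Order:
    """Register a sale; place a supplier order automatically when out of stock."""
    order = Order(item)
    if stock.get(item, 0) > 0:
        # Assumption: an item that is in stock is simply taken from stock and
        # the order stays "new"; the article does not describe this path.
        stock[item] -= 1
    else:
        supplier_orders.append(item)
        order.advance()  # "new" -> "ordered at the supplier"
    return order
```

A sale of an item that is out of stock then immediately shows up as "ordered at the supplier", and every following call to `advance()` corresponds to one click on the button.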

Simplicity is the ultimate sophistication

- Leonardo Da Vinci -

A final cause of problems that applies to both large and small organizations, is the fact that they often choose off-the-shelf software packages, instead of custom-made software. Off-the-shelf software might be easier and even cheaper at first, but it has a number of drawbacks:

  • It never does exactly what you need it to do. After all, the manufacturer didn't know your organization when they created it.
  • Because of this, many manufacturers put a lot of options and functionality in these packages, so that every customer will find something that they need. But nobody ever uses all functionality offered by the software, making it unnecessarily complex.
  • You probably need more than one off-the-shelf software package, because none of them provide everything you need. This means more complexity and more manual labour if the different packages do not integrate and you have to transfer data from one system to the other by hand.
  • And the most important drawback: You are stuck with what the manufacturer gives you. If your needs grow, the software cannot adapt. Of course, you can request changes from the manufacturer, but will they develop them? And when will they be ready?
Of course, opting for custom-made software also has its risks. You will have to find a software development company that works closely with you, analyzes your organization and its processes carefully and provides regular feedback during the development process. Basically, a development company that not only works for you, but also with you. This might seem a bit scary, the idea of inventing your "own" software. But don't worry. Good software development companies will help you make the right choices and decisions. And the rewards will be well worth it: a system that optimally suits your business, integrates all of your processes and provides everything you need.

The Solution
In this section, I could give you advice on how to better manage your IT, but first ask yourself: do you really want to manage IT? Is that part of your core business? Don't you want a system that just works, without you having to do anything for it?

"Is that possible?", I hear you ask. And the answer is: "Yes, it is!" With the "Software-as-a-Service"-method, your system is hosted in a professional datacenter and is accessed through the Internet via Cloud Computing. You and your employees can access it from anywhere in the world, such as your company building, at home, on business-trips, etc. This not only provides a lot of flexibility, it also takes all the technical issues away from your organization. No longer will you have to manage any servers, infrastructure, software-packages, updates, etc. Backups of your data are automatically taken care of and most of your IT-related risks will go away. Your organization will rely on proven technology and a professional infrastructure, rather than on a self-composed environment. Your IT-costs will also be lower, because you will no longer have to invest in your own infrastructure and other IT-related resources. The only things that you need are computers with an Internet connection. I do recommend buying some of those USB-UMTS dongles if you have a small company, or a backup Internet connection if you have a large company, to avoid the risk of losing your Internet connection.

Currently, a lot of companies provide Software-as-a-Service (SaaS) packages. You pay a fee and your system will be available to you. One of the most well-known examples is Exact Online, an online accounting package, which is a lot easier to use than its predecessor, which had to be installed manually on every computer on which you wanted to use it. But still, they are all off-the-shelf packages, that come with the same disadvantages as the old-fashioned packages that were not SaaS-based. We discussed these disadvantages in the previous section.

But... If we combine SaaS with custom-made software, then things get really interesting. Imagine, one system that integrates and automates your entire organization. A system that does everything that you want and only what you want. A system that requires no maintenance and is available always and everywhere. A system that takes care of your data with regular backups and a secure environment. A system that combines the strengths of both SaaS and custom-made software. Now we are approaching perfection.

Company organization
What I am about to discuss with you now, is my vision on organizations altogether. I must warn you, this vision is quite progressive. If you think that it is a little too far away for your organization, then don't worry. I think there is plenty of room for advancement in your organization based on the previous sections. Once you have reached that point, you can always decide to take it one step further, or not.

When asked the question "What is a company?", people used to think primarily of a company building with the company logo on it, maybe the director and, of course, its people. I don't share this traditional vision of a building with people. In my vision, a company is a system with people, and those people are a little different. The system, of course, is the one described in the previous section: hosted in a datacenter, available through the Internet and automating and integrating all of the company's processes.

But what does this mean for the people? Do they still have the same job? Well, some do. The point is, since the system takes care of everything that can be automated, including business processes, the employees only have to carry out the steps in the business processes that cannot be automated. Things that need human input, human interaction, human creativity, even human emotion. Not only does this take away a lot of "boring" work, such as manual data processing and process management, the employees will probably also like their work better, because they can add more of their own creativity and personality to the company's business. A company may need fewer employees, because it no longer needs IT-personnel, process managers and people who do work that can be automated, but the employees that remain will be true craftsmen (and craftswomen) with a passion for their jobs and the ability to express that passion.

When art critics get together, they talk about Form and Structure and Meaning
When artists get together, they talk about where you can buy cheap turpentine

- Pablo Picasso -

Another thing that will change for the employees, is that they are no longer bound to time, place or building. The system can be accessed from anywhere in the world, at all times, so the employees will not necessarily have to be at the same location. For some companies, it just works better when the employees are all together in the same room, so maybe those companies will take less advantage of this aspect, but it still adds flexibility when it is needed. Also, when the company decides to move to another building, the transition will be very smooth, because the system will never go down. The employees will simply get a new desk with a computer with Internet access.

This entire approach also fits the trend that an increasing number of companies and organizations do not even have a building, for example Business Networking Companies, that provide networking gatherings for business owners. They often just rent a restaurant or a cafe for the event and have no building of their own. Their employees work from home or from rented flexoffices. For international companies, the advantages are obvious: a system that can be accessed from every company location in the world.

You could say that the system "holds" the status, the processes, the data and therefore the essence of the company. It takes up a central place in the company and the employees just do what the system tells them to do. They are therefore completely expendable. This is so not true! Yes, the system holds all of those things and takes up a central place in the company, but because the employees are now true craftsmen with passion and experience, they become even more important! No system can replace human vision and ingenuity. It is the task of the system to make sure that the employees can do their job, by taking away bureaucracy and overhead and creating an optimally efficient organization in which the employees only have productive tasks that are directly related to the core business of the company.

An obvious question to ask is: "What if the company processes change?" A human process manager can easily adapt, a system cannot. This is a very valid question, but let me ask another one: "Do process managers adapt to new company processes? Or do company processes change because of a new process manager?" More often than not, processes change when leadership changes. A new process manager will have to prove to the person who hired him that he or she has something to add and what easier way to do this than to change some processes and claim improvement?

Ask yourself: if reorganizations solve problems, then why do we have to reorganize again in a few years? When the company processes are automated by the system, it brings stability and continuity to the organization.

Movement is often confused with progress

- Richard Donovan -

Yet another question that I would like to ask is: "Does the process really change? Or does the product change?"

Remember the big financial institution? When I was an employee there, every now and then, a new insurance product was created and every time, new systems and new processes were invented to realize the new product. A new system to buy the new insurance, a new process that specified what would happen when a customer would claim damage, a new system to implement that new process and on top of that, a whole bunch of systems that had nothing to do with the new product, but were only there to make sure that the new systems could communicate with the old systems.

In this example, the process was tightly connected to the product. Every time the product changed, the process had to change. This is a very bad thing in a business where you know that products change often and keep changing. So, when designing the system that automates processes, you must carefully consider the architecture of your processes. They should always leave room for future product changes. This is a difficult task, but a good software development company will help you with this. In my opinion, creating systems is more than just programming. Finally, it should be clear that a good system that takes the aforementioned issues into consideration, can never be realized with off-the-shelf software packages. You really have to know the company, before you can define its processes.
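One way to leave that room is to keep the process generic and to describe each product as data that the process works with. The Python sketch below illustrates the idea under my own assumptions: the names `InsuranceProduct` and `handle_claim` and the two parameters (a deductible and a maximum payout) are hypothetical and much simpler than any real insurance product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InsuranceProduct:
    """A product is described by data only; it carries no process logic."""
    name: str
    deductible: float   # amount the customer always pays themselves
    max_payout: float   # upper limit of what the insurer pays per claim


def handle_claim(product: InsuranceProduct, damage: float) -> float:
    """One claim process for every product: return the amount paid out."""
    if damage < 0:
        raise ValueError("damage cannot be negative")
    payout = max(0.0, damage - product.deductible)
    return min(payout, product.max_payout)


# Introducing a new product means adding data, not a new process or system.
car = InsuranceProduct("car", deductible=250.0, max_payout=10000.0)
bicycle = InsuranceProduct("bicycle", deductible=50.0, max_payout=500.0)

print(handle_claim(car, 1000.0))      # 750.0
print(handle_claim(bicycle, 2000.0))  # 500.0
```

In real life, products differ in more than two numbers, of course, but the principle stays the same: the more of a product you can express as configuration, the less often the process itself has to change.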

But yes, in some cases, even when taking the above issues into account, company processes change and the system is not prepared. But don't worry! It is a custom-made system, remember? It can always be adapted to your wishes, unlike off-the-shelf packages. It might take some time (probably some weeks), but the system is probably not the only element of the organization that has to be adapted to the new process, so, there is time. Also, changing processes are often the result of a changing company strategy and you should always take time for those things. An important guideline is: The system must facilitate the company, but may never restrict it in any way.

Conclusion
I hope that, after reading this article, you have an idea of what IT can do for you and your organization. I hope that you understand that IT can definitely work for you, instead of against you. IT shouldn't be difficult and shouldn't cause problems. It should take problems and risks away and let you focus on your core business, whatever that may be.

If I had to give you one single piece of advice, it would be: "Don't be afraid of change." The IT-strategy suggested in this article is probably very different from what you know and it might take some time to get used to, but remember that it is focused on removing difficulties, not adding them.

When it comes to the future, there are three kinds of people:
those who make it happen, those who want it to happen and those who wonder what has happened

- X-Bit Labs -

If you have a small or medium-sized company and you think that this is way too big and expensive for you, it is probably not. If you can reduce your IT-workforce by just one person, you're already saving money! And if you have a large company and you think that your company is way too complex to transfer to such a new system, again, it is probably not. Besides, if you can save money, then it is worth checking out, isn't it? Also, spending less money is not the only financial consequence of this IT-strategy. The other one is predictability. You will never have unexpected IT-costs again, because the technology is maintained by your system provider.

Just imagine what your company could look like. No more overhead, no more technological issues, skilled employees that work with passion and a system that just works. Always and everywhere.

The future is here. It's just not widely distributed yet.

- William Gibson -

And if you think that this article is a way of generating more business for my company, it is actually just the other way around. I founded my company because of all the issues I mentioned, because I saw that IT could be done so much better. Call me an idealist, but I still believe in a perfect world.