State-of-the-art architecture for web content management systems

When you start to implement a large software package you are usually (or should be) driven by your own needs or by what your customers ask for. That is, you are driven by features. On the other hand, when you decide to re-write a software package that has been in production for a while, features are often less in focus. Instead, you would probably do the re-write in order to get the architecture "right", i.e. to adapt the architecture to everything you have learned so far. Of course, as a developer you always strive for a fitting architecture. But the difference between the first implementation and the re-write is that in the latter case you know all the features that need to be supported and you know what your users tend to do with the software.

Let's look at two examples of established products in the WCMS world that are currently being re-written: Typo3 (as Typo3 5.0) and CQ (as CQ5).

At OpenExpo I was able to attend a talk by Typo3 5.0 lead developer Robert Lemke and found that T3 5.0 and CQ5 independently came to the same conclusion about what a modern web content management system's architecture should look like. As I blogged about previously, it is a 3-layered stack:

The foundation is a Content Repository with APIs that allow access to the content in its raw form (untainted by business logic). The top layer is the content management system itself, which should consist of nothing but business logic (like workflows and the user interface for editors). In between the two is an infrastructure layer that provides basic plumbing for web applications, like security, connection handling to the repository, script execution etc. One important aspect that qualifies this part as a layer is that it also exposes its own API.

The CMS is only one possible application that can run on the infrastructure layer, i.e. custom applications do not run "on top" of the CMS, but "next to" it. This is enabled by the fact that the infrastructure layer is independent of the CMS.

In our case the name of this layer is Apache Sling; in T3's case it is Flow3.
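To make the layering concrete, here is a minimal sketch in Python. It is not Sling, Flow3, or JCR code: every class and method name below (`ContentRepository`, `Infrastructure`, `Cms`, `WordCounter`, `load`, `store`) is made up for illustration, and the infrastructure layer is reduced to a single permission check. The point is only that the CMS and a custom application both talk to the infrastructure API, and neither depends on the other.

```python
class ContentRepository:
    """Bottom layer: stores raw content by path, with no business logic."""

    def __init__(self):
        self._nodes = {}

    def read(self, path):
        return dict(self._nodes[path])

    def write(self, path, properties):
        self._nodes[path] = dict(properties)


class Infrastructure:
    """Middle layer: plumbing (here only a write permission check), own API."""

    def __init__(self, repository, writers):
        self._repository = repository
        self._writers = set(writers)

    def load(self, path):
        return self._repository.read(path)

    def store(self, user, path, properties):
        if user not in self._writers:
            raise PermissionError(f"{user} may not write {path}")
        self._repository.write(path, properties)


class Cms:
    """Top layer: business logic only, e.g. a trivial draft/publish workflow."""

    def __init__(self, infra):
        self._infra = infra

    def save_draft(self, user, path, text):
        self._infra.store(user, path, {"text": text, "status": "draft"})

    def publish(self, user, path):
        page = self._infra.load(path)
        page["status"] = "published"
        self._infra.store(user, path, page)


class WordCounter:
    """Hypothetical custom application: runs next to the CMS, not on top of it."""

    def __init__(self, infra):
        self._infra = infra

    def word_count(self, path):
        return len(self._infra.load(path)["text"].split())


# Both applications are wired to the same infrastructure layer.
repo = ContentRepository()
infra = Infrastructure(repo, writers={"alice"})
cms = Cms(infra)
counter = WordCounter(infra)

cms.save_draft("alice", "/content/home", "Hello content world")
cms.publish("alice", "/content/home")
```

Note that `WordCounter` never imports or calls `Cms`; removing the CMS class would leave the custom application working, which is what running "next to" rather than "on top of" the CMS means in this sketch.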



 

I am convinced that this striking similarity is not a coincidence, but rather a confirmation that this kind of architecture constitutes the state of the art of WCMS architecture, especially as these two systems are situated at very different ends of the WCMS ecosystem. Moreover, I know that we as well as the T3 devs have spent a significant amount of time thinking about architecture and evaluating approaches (which is usually another difference from first-time implementations, where urgent requirements need to be satisfied immediately).

I should note that there are plenty of differences between Flow3 and Sling. For example, Sling maps URLs directly to content (which I personally find utterly fitting for a content-centric framework), whereas in Flow3 URLs address controllers.
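The difference between the two addressing styles can be sketched in a few lines of Python. This is a deliberately simplified illustration and not the actual resolution logic of Sling or Flow3: the content tree, the renderer table, the route table, and all function names are assumptions invented for this example.

```python
# A tiny content tree keyed by path (made-up data).
CONTENT = {
    "/products/chair": {"type": "product", "title": "Chair"},
    "/news/launch": {"type": "article", "title": "Launch"},
}

# Content-centric style: the node's type selects how it is rendered.
RENDERERS = {
    "product": lambda node: f"Product: {node['title']}",
    "article": lambda node: f"Article: {node['title']}",
}


def resolve_content_first(url_path):
    """The URL is the content path; the node found there picks the renderer."""
    node = CONTENT[url_path]
    return RENDERERS[node["type"]](node)


def product_controller(slug):
    """A controller decides by itself which content to fetch."""
    node = CONTENT[f"/products/{slug}"]
    return f"Product: {node['title']}"


# Controller-centric style: the first URL segment names a controller.
ROUTES = {"shop": product_controller}


def resolve_controller_first(url_path):
    """The URL addresses a controller; the remainder is passed as its argument."""
    name, _, argument = url_path.strip("/").partition("/")
    return ROUTES[name](argument)


content_result = resolve_content_first("/products/chair")
controller_result = resolve_controller_first("/shop/chair")
```

In the first style the URL space and the content tree are the same thing, so adding a node adds a URL. In the second, the URL space is defined by the set of controllers, and each controller decides where its content lives.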

PS: in case you are wondering: yes, that IS a JCR-compliant repository written in PHP in Flow3.