Entries by Lars Trieloff

    Posted by Lars Trieloff JAN 25, 2008

    Since content-centric applications are content-driven, modeling the content structure is the most crucial part of documenting the architecture of your application. A big part of the general architecture is usually determined by the framework you choose: if you are using Sling, it is Content-Behavior-Appearance; if you are using Apache Cocoon, it is content pipelining; and so on. What makes your application special is the content structure, or content model. Because understanding the content structure is crucial for communicating the architecture of your application, you should spend a considerable amount of time designing, documenting and communicating the content structure to other developers. In JCR, content has two general properties that deserve documentation. The first is the location of nodes in the content tree. The most straightforward way of documenting this is to express the tree structure in a diagram such as the one below, or to use a JCR repository browser like the CRX explorer that comes with Day's CRX repository or the open source tool JCR Explorer.

    There are multiple downsides to this approach. First, these autogenerated tree models communicate the importance and relation of portions of the content tree poorly, as they can only express parent-child relationships and, to a certain degree, node types. Second, as the tree grows, it becomes increasingly complex and confusing to the observer. If you really care about communicating your content structure, then drive structure documentation; do not let it happen.
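
    To illustrate the first downside, here is a minimal Python sketch of the kind of dump a repository browser produces. The `Node` class, the `render` function and the `wiki:theme` node type are hypothetical names made up for this example; the tree loosely follows the /apps/wiki/themes path used later in this post. Note that the output carries only nesting and node type, nothing about importance or relation.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One node of a content tree: a name, a node type, and child nodes."""
    name: str
    node_type: str = "nt:folder"
    children: list["Node"] = field(default_factory=list)


def render(node: Node, depth: int = 0) -> list[str]:
    """Return one indented line per node, depth-first, like a repository browser."""
    lines = [f"{'  ' * depth}{node.name} [{node.node_type}]"]
    for child in node.children:
        lines.extend(render(child, depth + 1))
    return lines


# Hypothetical tree; "wiki:theme" is an invented node type, not a standard one.
tree = Node("apps", children=[
    Node("wiki", children=[
        Node("themes", children=[
            Node("default", "wiki:theme"),
            Node("extra", "wiki:theme"),
        ]),
    ]),
])

print("\n".join(render(tree)))
```

    Every node gets exactly one line with equal visual weight, which is why such listings become hard to read as the tree grows.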

    The second aspect of content modeling for JCR is the node type. JCR has a complex node typing system that allows multiple inheritance, mixins, child-nodes and references. For real-world application documentation three approaches can be found:

    • using standard CND notation - this is the most obvious approach, as you have to write the CND files anyway, and it provides a very compact notation that is able to express every aspect of the node type. Unfortunately, CND notation is optimized for writing, not for readability or comprehensibility. To make node types easier to understand, the following two approaches are also used.
    • automatically generated HTML node type documentation, using a tool like Jackrabbit-NTdoc, which takes the node type definitions and translates them into a number of HTML pages that are browsable in a way similar to Javadoc and that document every aspect found in the node type definition.
    • ad-hoc graphical notations. These notations are often inspired by UML or entity relationship diagrams, but are seldom reused or documented. While they are more readable than CND notation or browsable HTML documentation, the lack of standardization and meta-documentation makes them hard to carry over to other projects.

    A main advantage of these graphical notations, however, is that you as the architect can decide what is important, what is related, and what is obvious and does not need to be documented at a high level. This again shows that you should drive your content model documentation and not let it happen.
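
    As a rough illustration of what a node type definition has to express, the following Python sketch models node types with multiple supertypes and a mixin flag, and resolves the inherited supertypes of a type. It is not the JCR API: `NodeType`, `all_supertypes` and the `wiki:*` types are assumptions made up for this example; only `nt:base` and `mix:referenceable` are standard JCR node types.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeType:
    """A simplified node type: a name, its declared supertypes, and a mixin flag."""
    name: str
    supertypes: tuple[str, ...] = ()
    is_mixin: bool = False


def all_supertypes(name: str, registry: dict[str, NodeType]) -> set[str]:
    """Collect every direct and inherited supertype of the named node type."""
    found: set[str] = set()
    stack = list(registry[name].supertypes)
    while stack:
        current = stack.pop()
        if current not in found:
            found.add(current)
            stack.extend(registry[current].supertypes)
    return found


# The wiki:* types are hypothetical and exist only in this sketch.
registry = {
    t.name: t
    for t in [
        NodeType("nt:base"),
        NodeType("mix:referenceable", is_mixin=True),
        NodeType("wiki:content", ("nt:base",)),
        NodeType("wiki:page", ("wiki:content", "mix:referenceable")),
    ]
}

print(sorted(all_supertypes("wiki:page", registry)))
```

    Even in this reduced form, the full picture of a type only emerges after following several definitions, which is part of what a hand-drawn diagram can show at a glance.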

    The notation proposed below uses a combination of a graphical treemap notation for describing the content tree and a notation inspired by UML class diagrams for documenting node types, node type inheritance and node references. Besides reusing existing notations like UML and Fundamental Modeling Concepts (FMC), a main advantage of this notation is that it offers a connection between tree structure and node type.

    The upper part of the chart features an example content tree in treemap notation. Speaking in FMC terms, this content tree is a set of nested places, and this nesting can be driven by the architect in order to express relation (places are next to each other), containment (one place inside another) and importance (a place is bigger). You can even "zoom in" on parts of the chart to explain the content structure in more depth. A good example of variable content can be found in /apps/wiki/themes, where any number of themes can be stored, but two, "default" and "extra", are shown as examples.

    This treemap structure is both visually compelling and compact, so it can be combined with the UML-inspired node type notation at the bottom of the chart. This notation uses UML class diagrams to express node types (bold font, shaded background) and mixins (italic font, white background). Node types can have three types of relations: inheritance, containment and reference. For inheritance, the default solid line with a hollow triangle arrowhead at the supertype is used. For child nodes and associations, a basic "association" line without arrowheads is used. As for the cardinality of relationships: since there is only one parent node or referencing node, only the cardinality indicator at the child or referenced node type is used. Here we use a syntax inspired by simple regular expressions, where * means any number of nodes, + means at least one node, n means exactly n nodes, and so on.
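
    The cardinality indicators can be read as a small rule set. The sketch below, in Python, checks a node count against an indicator; the function name `satisfies` is hypothetical, and only the three forms named above (`*`, `+` and a fixed number n) are handled, since the remaining forms are not spelled out here.

```python
def satisfies(cardinality: str, count: int) -> bool:
    """Check a node count against a cardinality indicator.

    "*" accepts any number of nodes, "+" requires at least one,
    and a decimal number n requires exactly n nodes.
    """
    if count < 0:
        raise ValueError("count must not be negative")
    if cardinality == "*":
        return True
    if cardinality == "+":
        return count >= 1
    if cardinality.isascii() and cardinality.isdigit():
        return count == int(cardinality)
    # Any other indicator is outside what this sketch covers.
    raise ValueError(f"unsupported cardinality indicator: {cardinality!r}")


# Example: a place that may hold any number of themes, currently holding two.
print(satisfies("*", 2))
```

    Raising an error for unknown indicators, rather than guessing their meaning, keeps the sketch honest about the parts of the syntax it does not know.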

    Using a dotted line you can map node types to places in the treemap where this node type can be used.

    To sum it up, the proposed notation is a tool that helps in understanding and communicating content-centric software systems. It is not intended to be used to generate code automatically or to be generated automatically from code; instead, it is a second description of your software system that lives beside the code of your system (the primary description) and is suited for technical communication with humans.

    Posted by Lars Trieloff JAN 09, 2008

    In the last two days we have seen two exciting pieces of news: Google and Facebook joining and Google, IBM and Verisign agreeing to support OpenID. Together with Apache's Shindig, an open source implementation of Google's OpenSocial engine (see this YouTube video for an interview with Brian McCallister, who explains what Shindig and OpenSocial are), we are witnessing what will evolve into true social network portability.

    • With OpenID you are able to transfer your identity from one network to another. No need to enter the same information about yourself over and over again. You are even free to create multiple identities if you would like to separate some aspects of your digital life.
    • With you are able to transfer the social graph from one network to another. This means you do not have to find your friends on each new network again and again.
    • And with OpenSocial you have application portability. If there is a nice widget in one network that you would like to embed into another network - no problem with OpenSocial.

    As a consequence, the costs of joining yet another social network will shrink dramatically. As you have portability of identity, social graph and applications, you can start cherry-picking by joining many specialized networks, selecting the parts of the application that are most useful and aggregating them in your main OpenSocial container. But this also changes the rules of the game for social network vendors. You do not have to build up your user community from scratch, and you do not have to convince your users to jump the high sign-up-and-give-away-your-private-information barrier anymore. Now you can create a highly specialized niche social network that serves only a small fraction of the population or that has only a few, highly specific use cases. This will lead to the creation of thousands of niche social networks, some standalone, some embedded into larger websites, but all able to exchange users, social graphs and widgets with each other.

    As for the big players Google and Facebook: they will have to compete to offer the best platform for running these widgets. Facebook can benefit from the large number of existing Facebook applications and the really neat integration into the site from a usability point of view, but Google's hold on the desktop with Google Toolbar and the ability to display desktop widgets on the web and vice versa could lead to a completely new way to see the web.

    Posted by Lars Trieloff NOV 28, 2007

    Tim Berners-Lee recently coined the term "The Graph". The idea here is that computer networks undergo a conceptual evolution. The starting point, the Net (internet), connects computers, abstracting from the cables between computers and allowing the creation of networked applications. The most successful networked application became known as the Web (world wide web) and provided another abstraction: it is not the computers we care about, it is the documents (resources) and the links between them (hyperlinks). This web is what most content management systems and content repositories care about - they manage documents, after all.

    The Graph is nothing other than the realization that we actually do not care about documents (or even the computers where these documents are stored), but about the things that are described by the documents. A document can describe a person, an idea, a place. And in the same way as people are connected to each other, ideas to people and people to places, documents can be used to describe these connections by means of hyperlinks or more advanced technologies like RDF.

    To rephrase this thought - the graph is not about the documents and their connections, but about the things described in the documents and their relations, about the documents' contents. The graph is about content and content relations.
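
    One way to picture this is the subject-predicate-object triple that RDF uses for statements. The Python sketch below stores a few such triples in a plain list and looks up the relations of one thing; all names in it (`triples`, `relations_of`, the people and places) are invented for illustration and no RDF library is involved.

```python
from collections import defaultdict

# Each triple is (subject, predicate, object). All values are made up for illustration.
triples = [
    ("alice", "knows", "bob"),
    ("alice", "lives_in", "berlin"),
    ("bob", "had_idea", "content_graph"),
]


def relations_of(thing: str, graph: list[tuple[str, str, str]]) -> dict[str, list[str]]:
    """Group the outgoing relations of one thing by predicate."""
    grouped: dict[str, list[str]] = defaultdict(list)
    for subject, predicate, obj in graph:
        if subject == thing:
            grouped[predicate].append(obj)
    return dict(grouped)


print(relations_of("alice", triples))
```

    The lookup never mentions a document: the statements are about a person, a place and an idea, which is the shift in perspective described above.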

    If you are a user of a content management system or a content repository, if you are developing applications that deal with content, or if you are a vendor of a content repository or content management system, there are some relevant implications for you:
    Your content repository should be about content, not about data or documents. It should allow you to deal with content in all representations, from highly structured to unstructured. It should allow you to unify access over all content stores. It should allow you to expose your content to the outside, allowing you to build a stronger graph.

    If you are looking for this kind of content repository, you should have a look at CRX and its open source companion Apache Jackrabbit.