A rich debate about the respective places of data and structure in data models has been taking place on the web for several years. It can be summarized as follows: should data be driven by structure, or should structure be driven by data?
In the real world, both cases seem to exist. Some problems benefit from being driven by a structure; others clearly cannot fit into predefined structures.
Consider an analogy. Houses are rarely built from scratch without blueprints. Cities, however, generally have no blueprint that plans their final state. What lessons can we learn from this simple example? Are big and complex problems driven by data instead of by structure? Not necessarily.
In the example of the house and the city, the problem can be seen as follows. For houses, because the available budget and resources are generally known in advance, the most effective way to proceed is to define a structure before construction. For cities, because the available resources and budgets are generally not known in advance and keep evolving, the most effective way to proceed is to let their structure emerge. If necessary, guidelines can be defined to control their growth.
Because information systems involve an evolving community of stakeholders, and because providers do not know exactly what will be done with their applications, similar questions arise:
- Are the users known or not?
- Is the behavior of the users known or not?
- Is the final usage of the application known or not?
The level of uncertainty in the answers to these questions is probably one of the best indicators for choosing one approach over the other.
The JCR model clearly advocates a structure driven by data. By creating content, items, nodes and properties, users build the structure. Database administrators and application programmers only guide this structure by defining rules and constraints.
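As a minimal sketch of this idea, the following Python snippet models a tree whose shape emerges as users add nodes and properties, while one administrator-defined rule acts as a guideline. It does not use the JCR API: `Node`, `add_node`, `set_property`, `paths` and the `title_must_be_text` rule are hypothetical names chosen only for illustration.

```python
class Node:
    """A tree node with free-form properties and children (illustrative, not JCR)."""

    def __init__(self, name, rules=None):
        self.name = name
        self.properties = {}
        self.children = {}
        # Rules are callables that raise ValueError when a constraint is violated.
        self.rules = rules or []

    def add_node(self, name):
        # No schema lookup: a child exists simply because a user created it.
        child = Node(name, self.rules)
        self.children[name] = child
        return child

    def set_property(self, key, value):
        # Rules guide the content; they do not predefine the structure.
        for rule in self.rules:
            rule(self, key, value)
        self.properties[key] = value
        return self

    def paths(self, prefix=""):
        """Return the structure that has emerged so far as a list of paths."""
        here = f"{prefix}/{self.name}"
        result = [here]
        for child in self.children.values():
            result.extend(child.paths(here))
        return result


def title_must_be_text(node, key, value):
    # Hypothetical guideline defined by an administrator.
    if key == "title" and (not isinstance(value, str) or not value.strip()):
        raise ValueError("title must be a non-empty string")


root = Node("content", rules=[title_must_be_text])
post = root.add_node("blog").add_node("first-post")
post.set_property("title", "Hello").set_property("mood", "curious")
print(root.paths())
```

The `mood` property was never declared anywhere; it becomes part of the structure because a user stored it, and only the `title` rule constrains what is accepted.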
In implementations that follow a relational approach, the database administrator and the application programmer first define a structure. Users can then register content items that fit this structure.
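A structure-first workflow can be sketched with Python's standard `sqlite3` module. The `article` table and its columns are assumptions made for illustration: the schema is declared before any content exists, and content that does not fit it is rejected.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Step 1: the administrator and the programmer define the structure up front.
conn.execute(
    "CREATE TABLE article (id INTEGER PRIMARY KEY, title TEXT NOT NULL, body TEXT)"
)

# Step 2: users register content that fits this structure.
conn.execute(
    "INSERT INTO article (title, body) VALUES (?, ?)", ("Hello", "First post")
)

# Content carrying an attribute unknown to the schema is refused.
try:
    conn.execute(
        "INSERT INTO article (title, mood) VALUES (?, ?)", ("Hi", "curious")
    )
    rejected = False
except sqlite3.OperationalError:
    rejected = True

rows = conn.execute("SELECT title, body FROM article").fetchall()
print(rows, rejected)
```

Here, storing a `mood` value would first require a schema change by whoever owns the table definition, which is the essence of structure driving data.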
Depending on the use case, both data models make sense. However, the right questions should be asked before each implementation.