Here's the next part of my journey to retrieve some of the hidden JCR community knowledge. Encouraged by the interesting insights of the Mindquarry team, I approached Cédric Damioli, who is a Product Manager at Anyware Technologies based in Toulouse. Cédric has built the open source CMS Ametys on top of Apache Jackrabbit and Cocoon. It leverages OSWorkflow as a workflow engine and uses DocBook internally (nice architecture!).
Here's what Cédric had to say about building Ametys:
Q: Cédric, when you were designing your persistence architecture for Ametys what potential choices were you considering and what influenced your decision towards Jackrabbit? Did you consider XML databases like the Mindquarry team?
Back in 2004, when we made the Jackrabbit choice, we were designing the architecture of version 2.0 of our CMS (the name Ametys was born only last year). The 1.x versions persisted their documents in a CVS repository. While this may sound somewhat odd, it turned out to be really nice: things like versioning or document hierarchy were handled natively, and thanks to a Java bridge between Cocoon and the CVS repository, the source code was quite light and easy to understand.
But there were also some important drawbacks: a CVS repository can't handle custom metadata, the time to access a single document grows with the overall size of the repository, and the installation of the CMS was very intrusive on the target server.
So we decided to switch to a new persistence architecture, with the same benefits as the CVS server, but without the same limitations.
We considered three technologies:
- Subversion, as the natural successor of CVS
The JCR spec was not even final, and Jackrabbit was still in incubation with no public release, but three facts made the choice obvious for me:
- In 2004 Stefano Mazzocchi wrote a paper about a new technology called JCR, and I remembered this article a few months later.
- Sylvain Wallez, former Cocoon PMC chairman and our R&D director, is an Apache member. As such he was part of the JCR 1.0 expert group and an early Jackrabbit committer, which seemed quite a good guarantee to me.
- And last, but certainly not least, the JCR spec is very, very good! It contains all concepts I wanted to have in a content repository.
Q: Let us know how the reality check worked out. Did your expectations regarding the JCR come true or did you have to overcome some difficulties you did not expect? Were there any pleasant or not so pleasant surprises after working with Jackrabbit for a while?
For this question I have to clearly distinguish the spec and its implementation. While it was great to learn to work with JCR, the use of Jackrabbit in a production environment was not that great one or two years ago.
JCR, I would say, has the pros and cons of a young technology. JCR 1.0 handles nodetypes, hierarchies and versioning very well, but its search capabilities are limited in a real-world application.
About Jackrabbit: the spec is well implemented, but it lacks administrative tools and APIs. For example, it is impossible to programmatically inspect the contents of a PersistenceManager in order to detect or repair inconsistencies. It is also impossible to programmatically reindex a workspace or the whole repository. Moreover, there's no real backup/restore solution or monitoring application.
So yes, the JCR choice was good, and all content-related tasks are well designed and implemented, but I did not anticipate that the needs of my customers would go far beyond that.
Q: I am still trying to find my favourite JCR tool. What tools did you use for your JCR-related development work?
In early 2005, we used the Jackrabbit XML PersistenceManager and I used to crawl the repository with... vi or notepad. Now that the community has grown, some cool tools exist. We mainly use the web-based JCR-explorer and JCR Controller, which is an Eclipse app.
Q: You surely must have learned some important lessons about JCR from building Ametys. Would you like to share some of them with us?
JCR is very powerful by itself. At the beginning, we made the choice to hide the JCR API behind our own Repository API, in order to give CMS developers more flexibility. Two years later, no other implementation of this proprietary API exists and our developers had to learn yet another API. So we were wrong: the content model is well thought through and the API is easily learnt. There is no need to map the API onto another one, like we used to do with JDBC.
Q: If you had one wish regarding JCR and the JCR community, what would it be?
The Jackrabbit community is healthy, and I hope it stays that way. One could expect to have more visibility into the JCR expert group's work, but I know the JCP rules are quite strict.
Regarding functionality: better query features and more admin tools and APIs. That would definitely make JCR and Jackrabbit rock!