Archive for October 2008

    Posted by Bertil Chapuis OCT 30, 2008

    Posted in data first Comment 1

    A rich debate about the respective roles of data and structure in data models has been going on for several years on the web. It can be summarized as follows: should data be driven by the structure, or should the structure be driven by the data?

    In the real world both cases seem to exist. Some problems benefit from being driven by a structure; others clearly cannot fit into predefined structures.

    Consider an analogy. Houses are rarely built from scratch without blueprints. Cities, however, generally have no blueprint that plans their final state. What can we learn from this simple example? Are big and complex problems driven by data instead of by structure? Not necessarily.

    In the example of the house and the city, the problem can be seen as follows. For houses, because the available budget and resources are generally known in advance, the most effective way to proceed is to define a structure before construction. For cities, because the available resources and budgets are generally not known in advance and keep evolving, the most effective way to proceed is to let their structure emerge. If necessary, guidelines can be defined to control their growth.

    Since information system problems involve an evolving community of stakeholders, and since providers don't know exactly what will be done with their applications, similar questions arise:

    • Are the users known or not?
    • Is the behavior of the users known or not?
    • Is the final usage of the application known or not?

    The level of uncertainty in the answers to these questions is probably one of the best indicators for choosing one approach over the other.

    The JCR model clearly advocates a structure driven by data. By creating content, items, nodes and properties, users build the structure. Database administrators and application programmers merely guide this structure by defining rules and constraints.

    In implementations based on a relational approach, the database administrator and the application programmer first define a structure. Users can then register content items that fit this structure.

    Depending on the use case, both data models make sense. However, the right questions should be asked before each implementation.
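
    To make the contrast concrete, here is a minimal Ruby sketch. It is only an illustration: the class names SchemaFirstStore and DataFirstStore are hypothetical and do not come from JCR or from any relational library. The first store fixes its structure up front; the second accepts whatever users create and only guides it with an optional constraint.

```ruby
# Structure first: the columns are fixed up front and content must fit them.
# (Hypothetical class, for illustration only.)
class SchemaFirstStore
  def initialize(columns)
    @columns = columns
    @rows = []
  end

  def add(item)
    unknown = item.keys - @columns
    raise ArgumentError, "unknown columns: #{unknown.inspect}" unless unknown.empty?
    @rows << item
  end
end

# Data first: any properties are accepted; the structure is whatever the
# stored items turn out to contain, optionally guided by a constraint block.
# (Hypothetical class, for illustration only.)
class DataFirstStore
  def initialize(&constraint)
    @constraint = constraint
    @nodes = []
  end

  def add(item)
    if @constraint && !@constraint.call(item)
      raise ArgumentError, 'constraint violated'
    end
    @nodes << item
  end

  # The structure that has emerged so far: every property name seen.
  def emergent_structure
    @nodes.map(&:keys).flatten.uniq
  end
end

store = DataFirstStore.new { |item| item.key?(:title) }
store.add(:title => 'Hello')
store.add(:title => 'World', :tags => ['jcr'])
p store.emergent_structure  # prints [:title, :tags]
```

    The constraint block plays the role of the "rules and constraints" mentioned above: it guides the structure without defining it in advance.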

    Posted by Ngoc Dao OCT 27, 2008

    Posted in dynamic languages and jcr Comment 1

    This post was originally published on Ngoc's blog in Vietnamese.


    ActiveRecord is convenient because of its ease of use and its wealth of plugins. It is usually backed by a relational DB, but it can also be used with JCR. Let's try that for a simple blog application based on Rails.

    Calling the JCR API everywhere is a bit painful, so let's extend ActiveRecord::Base to write a simple object-content mapper. If you already have a Rails application, want to replace the relational DB with JCR, and have been following the "fat model, thin controller, stupid view" best practice, this approach limits the code changes to a few parts of the model. The only disadvantage is that you can no longer use any ActiveRecord plugin that talks to a relational DB.

    Structure

    Jackrabbit is used as the transient JCR server (actually any JCR server will do), and Rails connects to it through RMI. Generally there are two ways to combine Java and Ruby: bring Ruby to Java or bring Java to Ruby. There's already an article about using "JRuby". Let's use "Rjb" for a change.

    We do not need a relational DB. To prevent Rails from connecting to a DB, edit config/environment.rb to remove ActiveRecord from config.frameworks (we still require ActiveRecord in the object-node mapper).
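
    As a rough sketch, assuming a Rails 2.x style config/environment.rb (your own file will contain other settings as well), the change could look like this:

```ruby
# config/environment.rb (fragment; Rails 2.x style, adapt to your own file)
Rails::Initializer.run do |config|
  # Skip ActiveRecord at boot so that Rails does not try to open a
  # database connection. The mapper in lib/jcr.rb still has to load
  # ActiveRecord itself.
  config.frameworks -= [:active_record]
end
```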

    Our models will extend class Jcr, defined in jcr.rb, instead of ActiveRecord::Base.

    /blog
     |
     +- app
     |  |
     |  +- controllers
     |  |  |
     |  |  +- articles_controller.rb
     |  |
     |  +- models
     |     |
     |     +- article.rb [<- extends JCR]
     |
     +- lib
     |  |
     |  +- java
     |  |  |
     |  |  +- [Jackrabbit JAR files]
     |  |
     |  +- jcr.rb [<- class Jcr, the object-content mapper]
     |
     +- script
     |  |
     |  +- jcr [<- starts Jackrabbit]
     |
     +- [other things]
    

    script/jcr

    This script starts the Jackrabbit server on port 1100.

    require 'rubygems'
    require 'rjb'
    
    LIB_DIR = File.dirname(__FILE__) + '/../lib'
    # The Jackrabbit JAR files live in lib/java (see the directory layout above).
    CLASS_PATH = Dir.glob("#{LIB_DIR}/java/*.jar").join(File::PATH_SEPARATOR)
    
    Rjb::load(CLASS_PATH, [])
    
    JString              = Rjb::import('java.lang.String')
    LocateRegistry       = Rjb::import('java.rmi.registry.LocateRegistry')
    
    Repository           = Rjb::import('javax.jcr.Repository')
    SimpleCredentials    = Rjb::import('javax.jcr.SimpleCredentials')
    TransientRepository  = Rjb::import('org.apache.jackrabbit.core.TransientRepository')
    ServerAdapterFactory = Rjb::import('org.apache.jackrabbit.rmi.server.ServerAdapterFactory')
    
    repository = TransientRepository.new
    factory = ServerAdapterFactory.new
    remote = factory.getRemoteRepository(repository)
    reg = LocateRegistry.createRegistry(1100)
    reg.rebind('jackrabbit', remote)
    
    puts 'Jackrabbit started at rmi://localhost:1100/jackrabbit'
    
    trap('INT') do
      puts 'Shutting down...'
      repository.shutdown
      exit  # leave the blocking sleep below once the repository is closed
    end
    sleep
    

    jcr.rb

    class ActiveRecord::BaseWithoutTable < ActiveRecord::Base
    ...
    end
    
    # Named Jcr so that it matches lib/jcr.rb and the subclass in article.rb.
    class Jcr < ActiveRecord::BaseWithoutTable
      JCR_WORKSPACE = 'default'
    
      JString                  = Rjb::import('java.lang.String')
      JRepository              = Rjb::import('javax.jcr.Repository')
      JSimpleCredentials       = Rjb::import('javax.jcr.SimpleCredentials')
      JClientRepositoryFactory = Rjb::import('org.apache.jackrabbit.rmi.client.ClientRepositoryFactory')
    
      factory = JClientRepositoryFactory.new
      repository = factory.getRepository('rmi://localhost:1100/jackrabbit')
      begin
        @@session = repository.login(JSimpleCredentials.new('admin', JString.new('admin').toCharArray), JCR_WORKSPACE)
      rescue
        puts 'Could not connect to JCR server'
        exit(-1)
      end
    
      def self.session
        @@session
      end
    
      def session
        @@session
      end
    
      #-----------------------------------------------------------------------------
    
      # Avoid error ActiveRecord::ConnectionNotEstablished.
      def self.table_exists?
        false
      end
    
      def self.all
        node = session.getRootNode.getNode(table_name)
        it = node.getNodes
    
        ret = []
        while it.hasNext do ret << node_to_object(it.next) end
        ret
      end
    
      def self.find(attributes_values)
        work_space = session.getWorkspace
        query_manager = work_space.getQueryManager
    
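        # Join multiple conditions with "and"; a single quote inside a value
        # is escaped by doubling it, as XPath string literals require.
        conditions = attributes_values.map do |key, value|
          "@#{key} = '#{value.to_s.gsub("'", "''")}'"
        end
        query_str = "//#{table_name}/node[#{conditions.join(' and ')}]"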
    
        query = query_manager.createQuery(query_str, 'xpath')
        result = query.execute
        it = result.getNodes
    
        ret = []
        while it.hasNext do ret << node_to_object(it.next) end
        ret
      end
    
      def self.first(attributes_values)
        find(attributes_values).first
      end
    
      #-----------------------------------------------------------------------------
    
      def self.node_to_object(node)
        ret = new
        columns.each do |c|
          property = node.getProperty(c.name)
          value = property.send("get#{c.sql_type.to_s.capitalize}")
          ret.send("#{c.name}=", value)
        end
        ret
      end
    
      def save
        if valid?
          node = session.getRootNode.getNode(self.class.table_name)
          node = node.addNode('node')
          self.class.columns.each do |c|
            value = send(c.name)
            node.setProperty(c.name, value)
          end
          session.save
          true
        else
          false
        end
      end
    
      def update_attributes(attributes_values)
        attributes_values.each { |key, value| send("#{key.to_s}=", value) }
        save
      end
    end
    

    This file defines class Jcr which extends "BaseWithoutTable". If you want to get the session to the Jackrabbit server, call Jcr.session.
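
    The find method turns a hash of attribute values into an XPath query. A query on several attributes needs its conditions joined with "and", and values that contain a single quote need escaping. The standalone sketch below shows only that string-building step, so it runs without Jackrabbit or Rjb; build_xpath is a hypothetical helper name, not part of the mapper above.

```ruby
# Hypothetical helper: builds the XPath string for a set of attribute
# conditions, in the same "//<table>/node[...]" shape that find uses.
def build_xpath(table_name, conditions)
  predicates = conditions.map do |key, value|
    # A single quote inside an XPath string literal is escaped by doubling it.
    escaped = value.to_s.gsub("'", "''")
    "@#{key} = '#{escaped}'"
  end
  "//#{table_name}/node[#{predicates.join(' and ')}]"
end

puts build_xpath('articles', :user_name => 'ngoc', :title => "It's JCR")
# prints //articles/node[@user_name = 'ngoc' and @title = 'It''s JCR']
```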

    article.rb

    class Article < Jcr
      column :user_name, :string
      column :title,     :string
      column :body,      :string
    
      validates_presence_of :user_name, :title, :body
    
      def to_param
        title
      end
    end
    

    With this simple definition we get functionality such as finding and saving. For more complex functionality, the mapper must be improved, or we can use the JCR session directly.

    I hope this gives you an idea of how to use JCR in your Rails applications.

    Source code

    Posted by David Nuescheler OCT 21, 2008

    Posted in announcements Add comment

    With great pleasure I would like to announce that Kevin Cochrane has joined Day as Chief Marketing Officer. Kevin, who previously was with Alfresco and Interwoven, has an enormous amount of experience and an excellent standing in the content management industry. I am very excited that Kevin has decided to join us and look forward to working with him.

    (Official press release is here)

    Posted by Michael Marth OCT 21, 2008

    Posted in ria, sling and tutorial Add comment

    While Adobe Flex has some drawbacks, like a broken implementation of HTTP, it also has its virtues. One of them is that it makes it easier for taste-challenged developers like me to come up with decent user interfaces. So here's a little post on building Flex UIs for Sling (that are confined to reading content from the repository). The example app I would like to discuss is a slideshow where the images get retrieved from a Sling-powered JCR repository (where they are stored as regular files so that new images can be added through WebDAV). The images are displayed using a (simplified) Ken Burns effect.

    This is the Flex app in action:

    (does anyone else think that zooming into the snail is somewhat scary?)

    In order to retrieve the images from Sling I looked at two integration strategies: client-side and server-side.

    Client-side

    There is Sling's JS-based client library at /system/sling.js. Amongst other things this library allows read-access to the content (the library is for example used in the JSTs of the CRX Quickstart sample Firststeps). Flex has facilities to interface with JS so one can call the Sling lib's method for retrieving the content (as JSON), pass the JSON object back into the Flex app, parse it and retrieve the images from there.

    The content is assumed to reside within one folder within the repository. The folder's path is passed to the Flex app as a FlashVar which can be read from the Flex app like:

    var contentPath : String = Application.application.parameters.contentPath;

    In the Actionscript code I have (thinly) encapsulated some of Sling client lib's methods like this:

    public function getContent(path : String, maxLevel : int = 0, filter : Boolean = false) : Object {
      return ExternalInterface.call("Sling.getContent", path, maxLevel, filter);
    }

    Passing the content root folder to the Sling client lib will give us the JSON response which we can parse like this:

    var sling : Sling = new Sling();
    var c : Object = sling.getContent(contentPath, 3);
    for (var a : String in c) {
      if (a.indexOf(":") == -1 && a.charAt(0) != "." && a != "desktop.ini" && a != "Thumbs.db") { // this is a slide
        ...
        var slide : Slide = new Slide();
        slide.imageUrl = contentPath + "/" + a;
        ...
      }
    }

    After that the rest is pure Actionscript code that knows nothing about Sling or any repositories.

    It is possible to remove all traces of Sling from the Actionscript code (this might be desirable if e.g. a Flex coder is not aware of Sling). This is achieved by not using the Sling client library but rather retrieving the content in JSON format from within Flex directly (through the content's URL with a .json extension, e.g. http://somehost/path/to/content.3.json for retrieving the content 3 levels deep). Since Flex has no built-in JSON capabilities, something like the as3 core library can be used. One disadvantage of this approach is that the existing functionality from the Sling client lib has to be re-implemented.
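
    The URL convention and the filtering are independent of Flex. Here is a minimal sketch in Ruby (chosen only so that it can be run standalone; it is not the Actionscript from the sample project). The helper names json_url and slide_urls, and the sample JSON including the ".DS_Store" entry, are made up for illustration.

```ruby
require 'json'

# Files that operating systems drop into WebDAV-mounted folders.
IGNORED_FILES = ['desktop.ini', 'Thumbs.db']

# Made-up sample of what a folder might return as JSON.
SAMPLE_JSON = '{"jcr:primaryType":"nt:folder","snail.jpg":{},".DS_Store":{},"Thumbs.db":{}}'

# Builds the URL of the JSON rendering: <path>.<depth>.json
# (hypothetical helper, for illustration only).
def json_url(host, path, depth)
  "http://#{host}#{path}.#{depth}.json"
end

# Keeps only the child names that look like slides: no namespaced
# properties (names containing ":"), no dot files, no ignored files.
def slide_urls(content_path, json_text)
  names = JSON.parse(json_text).keys.select do |name|
    !name.include?(':') && !name.start_with?('.') && !IGNORED_FILES.include?(name)
  end
  names.map { |name| "#{content_path}/#{name}" }
end

puts json_url('somehost', '/path/to/content', 3)
# prints http://somehost/path/to/content.3.json
p slide_urls('/path/to/content', SAMPLE_JSON)
# prints ["/path/to/content/snail.jpg"]
```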

    Server-side

    Another possibility is to pass all image links on the server-side as FlashVars. This method passes all required URLs to the Flex app when the app is loaded so any additional requests for content and subsequent JSON parsing are not necessary. However, the FlashVars have to be parsed, of course:

    private function onCreationComplete() : void {
      for (var i:String in Application.application.parameters) {
        prepareData(Application.application.parameters[i]);
      }
    }

    private function prepareData(imageUrl : String) : void {
      var slide : Slide = new Slide();
      slide.imageUrl = imageUrl;
      ...
    }

    The FlashVars are set on the server-side in an .esp script:

    var flashvars = {
      <%
      i = 0;
      var children = currentNode.getNodes();
      for (child in children) {
        i++;
        if (child.charAt(0) != "." && child != "desktop.ini" && child != "Thumbs.db") {
          %>url<%=i%>:"<%=currentNode.getPath()%>/<%=child%>",<%
        }
      }
      %>
    };

    The server-side solution appears to require less code, mainly because there is less parsing to be done. However, once the content structure becomes "richer" (e.g. by including image captions and links) or gains some sort of hierarchy, the JSON-based approach looks more intuitive to me.

    The example code

    To run the sample: there is a CRX content package attached that contains the compiled code and some content. After importing the package, hit http://localhost:7402/content/sexyflexy/albums/album1.clientside.html for the client-side example and http://localhost:7402/content/sexyflexy/albums/album1.serverside.html for the server-side example.

    There is also a Flex Builder project file that contains the Flex sources.

    In case you wonder about the weird if-clauses that look at the image file names: when you drop images into a WebDAV-mounted repository folder, various operating systems additionally create miscellaneous files that need to be ignored (thanks Steve and Bill).

    I do not know too much about other RIA technologies like Silverlight and JavaFX, but I suspect that similar approaches should be possible. If you know some more, please leave a comment.

    The example images are taken from: http://openphoto.net/download/index.html?image_id=16690, http://openphoto.net/download/index.html?image_id=18186, and http://openphoto.net/download/index.html?image_id=10035.

    Posted by Michael Marth OCT 15, 2008

    Posted in osgi and tutorial Comments 3

    Currently, I am moving this blog onto the latest version of Sling. Part of this effort is the migration of the comment spam checker into an OSGi bundle (mostly, that means wrangling with Maven). Here are two little bits of information I encountered along the way. Maybe they can be useful to someone.

    The actual backend services that are used for comment verification are Akismet and Typepad's Anti Spam. I took David Czarnecki's Akismet-Java library that wraps the respective REST APIs of these services (both service providers actually use the same API).

    The trouble with David's code is that it uses commons-httpclient which depends on commons-logging. That clashes with Sling's use of log4j (note to the Java community: how could we get into this logging mess?). I found the solution for this annoying problem in Sling's parent pom.xml. Here's the relevant bit:

    <dependency>
      <groupId>commons-httpclient</groupId>
      <artifactId>commons-httpclient</artifactId>
      <version>3.1</version>
      <scope>provided</scope>
      <exclusions>
        <exclusion>
          <groupId>commons-logging</groupId>
          <artifactId>commons-logging</artifactId>
        </exclusion>
      </exclusions>
    </dependency>
    

    The second thing I would like to point out: it is quite simple to make OSGi bundles configurable through the web console (at http://localhost:7402/system/console).

    This is useful e.g. for configuring the API key of the above-mentioned service providers. In order to expose a property in the console use the annotation @scr.property:

    /** @scr.property */
    public static final String PARAM_API_KEY = "akismet.service.api.key"; 
    

    Other types like integer or boolean can also be used:

    /** @scr.property value="0" type="Integer" options 0="Akismet" 1="Typepad" */
    public static final String PARAM_SERVICE_PROVIDER = "akismet.service.provider";  
    

    The values are read in the service's setup method:

    Object key = configuration.get(PARAM_API_KEY);
    if (key != null) {
        this.apiKey = key.toString();
    }
    

    As usual, the full sources are attached.

    Posted by Michael Marth OCT 14, 2008

    Posted in link of the day Add comment

    If you would like to use a (read-only) RESTful API on top of your Java Content Repository, but cannot use Sling for whatever reason, have a look at Shane Johnson's JC-Rest. JC-Rest is the generalization of a couple of specialized REST interfaces Shane had built before Sling came to life. One of its nice features is still missing in Sling (although discussed on the list): out-of-the-box support for Atom feeds. The code is available on Google Code under an Apache license.

    Apart from the REST API JC-Rest additionally includes two different repository browsers. Both are remarkably lightweight in terms of LOC: it seems that apart from the Spring-wiring they consist of one regular-sized source file each.

    Posted by Lars Trieloff OCT 07, 2008

    Posted in announcements Add comment

    Calling all bloggers that cover content-centric applications or content-related topics: if you would like to get added to the blog aggregation contentcentric.org(*) please drop me a line at ltrieloff(at)day(dot)com. I would love to hear from you.

    (*) contentcentric.org is an aggregation of blogs of the content centric community. This includes Java Content Repository, Sling, and content centric applications like Content Management Systems, Blogs, Wikis.

    Posted by Ulrike Fox OCT 07, 2008

    Posted in announcements Comment 1

    If you have not signed up to attend Day's 2008 Global Customer Summit…it is not too late!

    Be part of the worldwide launch of Day's CQ5, at the 2008 Day Global Customer Summit, October 22-24, 2008 in Basel, Switzerland. Hear case study presentations from leading companies such as Audi, McDonald's, City of Zurich and University of Phoenix, among others.

    Attendees will also have the opportunity to interact with senior Day executives, and hear a feature presentation from respected industry analyst Mick MacComascaigh from Gartner. David Nuescheler, Day CTO, specification lead and founder of the JCR standard, will launch Day's CQ5, and CEO Erik Hansen will deliver the keynote address.

    To reserve your place at the summit, please email the completed registration form to summit@day.com, or fax to +41 61 226 9897 (invitation PDF and registration form in Word format).

    Posted by Michael Marth OCT 02, 2008

    Posted in cms and communique Add comment

    When you start to implement a large software package you are usually (or should be) driven by your own needs or what your customers ask for. That is, you are driven by features. On the other hand, when you decide to re-write a software package that has been in production for a while features are often less in focus. Instead, you would probably decide to do a re-write in order to get the architecture "right", i.e. to adapt the architecture to everything you learned so far. Of course, as a developer you always strive for a fitting architecture. But the difference between the first implementation and the re-write is that in the latter case you know all the features that need to be supported and you know what your users tend to do with the software.

    Let's look at two examples of established products that are currently being re-written in the WCMS world:

     

     

    At OpenExpo I attended a talk by Typo3 5.0 lead developer Robert Lemke and found that T3 5.0 and CQ5 independently reached the same conclusion about what a modern web content management system's architecture should look like. As I blogged about previously, it is a 3-layered stack:

    The foundation is a Content Repository with APIs that allow access to the content in its raw form (untainted with business logic). The top layer is the content management system itself, which should consist of nothing but business logic (like workflows and the user interface for editors). In between the two is an infrastructure layer that provides basic plumbing for web applications like security, connection handling to the repository, script execution etc. One important aspect that qualifies this part as a layer is that it also exposes its own API.

    The CMS is only one possible application that can run on the infrastructure layer, i.e. custom applications do not run "on top" of the CMS, but "next to" it. This is enabled by the fact that the infrastructure layer is independent of the CMS.

    In our case the name of this layer is Apache Sling, in T3's case it is Flow3.



     

    I am convinced that this striking similarity is not a coincidence, but rather a confirmation that this kind of architecture constitutes the state of the art of WCMS architecture, especially as these two systems sit at very different ends of the WCMS ecosystem. Moreover, I know that we as well as the T3 devs have spent a significant amount of time thinking about architecture and evaluating approaches (which is usually another difference from first-time implementations, where urgent requirements need to be satisfied immediately).

    I should note that there are plenty of differences between Flow3 and Sling. For example, Sling directly maps content and URLs (which I personally find utterly fitting for a content-centric framework), whereas in Flow3 URLs address controllers.

    PS: in case you wonder: yes, that IS a JCR-compliant repository written in PHP in Flow3.

    Posted by Michael Marth OCT 02, 2008

    Posted in jcr and link of the day Comments 3

    Alexandru Popescu discusses the InfoQ.com site architecture which features a home-made Jackrabbit-based CMS. The persistence is split between JCR and Hibernate/RDBMS:

    And then it goes down to the persistence layer which as I said is split between Hibernate and the JCR. So at the end we have two different storages. You can probably ask at this moment why we picked using two solutions for storing something that might have lived in the same storage. The problem was that when designing this application we weren't sure how the model will look and how our content will evolve over time, and dealing with these changes inside the relational schema is pretty difficult, complex to migrate data and maintain between different versions and things like that. The JCR API provides exactly this support for unstructured content and a couple of more features like versioning, full-text indexing and we are taking advantage of all of these features.

    Sounds like Data First in action to me.

    Posted by Michael Marth OCT 01, 2008

    Posted in jcr Add comment

    There is an aspect in the adoption of JCR that is well-known, but not made explicit very often: While JCR is sometimes perceived as a technology for (web) content management systems the range of applications that are actually built on top of JCR is quite a bit broader. This is understandable as the feature set defined in JCR is useful for many use cases (like object-level security, versioning, hierarchical data structure, etc).

    Recently, I stumbled across Boni Gopalan's blog which reminded me of exactly this fact. Boni's company BioImagene sells a software package for analysis and management of pathology images. He provides some reasons why they chose to base their software on top of JCR:

    • Pluggable security model that can maintain ACL at object level
    • Versionability for objects
    • Ability to choose the right storage device for binary content vs. ASCII data
    • Efficient pluggable search integration
    • Ability to integrate with JEE environments and lightweight Spring-like frameworks
    • Import/export support to and from XML
    • Efficient transaction management and the ability to seamlessly take part in JTA

    This inspired me to compile an incomplete list of examples of some real-world applications that are built on top of JCR (which are not content management systems, document management systems, knowledge management systems, blogs or wikis) - more to be found in the Jackrabbit Wiki:

    This list illustrates why I regard JCR as a "horizontal infrastructure technology", not just as a "web content management vertical".