Archive for January 2009

    Posted by Greg Klebus JAN 30, 2009

    Posted in announcements and crx Comments 3

    I'm excited to share the good news about General Availability of CRX 1.4.1. This is the content repository, which powers our award-winning CQ 5.1 release of last November, and is immediately available for downloading. Do you honestly like trial software? We don't, so we decided to make it free for developers! And, if you're impatient like me, just grab the software here and give it a try even before reading on...

    There are many changes in this release, and not all of them in the software itself. What we haven't changed is the Web 2.0 ready application development package with great productivity potential:

    • The best commercially packaged and supported Java Content Repository implementation, based on the Apache Jackrabbit open-source reference implementation of the JCR / JSR-170 standard. The core repository is fully source-code compatible with Apache Jackrabbit sources.
    • Extra features on top of Jackrabbit, like virtual repository for federating content from other sources, administration console, content explorer, security module with user/group management and LDAP integration, great persistent storage technology in TarPM.
    • Easy to install and use, prepackaged, preconfigured, fully tested package.
    • Integrated Apache Sling, a RESTful web application framework and content delivery platform. With all this, you can get productive and have your first dynamic web application running in 15 minutes flat! Check it out here.

    Now for the changes. First and foremost, we decided to make the software more available, and at the same time more "financially scalable" with our customers' growth. Why not share the good stuff, right?

    There are three software editions available, tailored to your needs:

    • CRX Developer: Free for all non-production use. Get productive immediately, at zero cost.
    • CRX One: Low-cost, entry-level annual subscription. Get one application into production just by changing the license, in a pay-as-you-go manner. You don't need to count your CPUs, either: the price is per application instance!
    • CRX Enterprise: Grow into enterprise deployments with this edition. An all-you-can-eat buffet for your enterprise needs.

    I'm particularly excited about the free CRX Developer (for obvious reasons), but also about CRX One, which lets you turn quickly implemented good ideas into something you or your company can make money on by deploying the application in production!

    What's New

    On the more traditional laundry list of the release enhancements, see some of the highlights of this release below. For more information, please see the CRX 1.4.1 Release Notes.

    • Ready for Content Applications, tried & tested. This release underpins our innovative CQ 5 Web Content Management product, and has been tested and deployed in production well before the GA date! With CQ 5, the whole WCM platform and all applications are in essence Content Packages, fully stored and managed within the repository, including content, metadata, designs, configuration, and executable code (libraries, scripts). Well, with this, I dare say we're reaching the "everything is content" nirvana.
    • CRX Quickstart, a new Day CRX distribution mechanism. Installation, startup and stopping of the instance are now managed by one central module. We've significantly simplified installation and startup: it suffices to double-click the CRX Jar file in your graphical environment to get it running. We released a preview version of Quickstart already in April 2008 as a part of the Day JCR Cup 2008 initiative (by the way, we've just announced the winners, congrats!)
    • Improved overall operational efficiency via a few new features like Quickstart installation, easy one-button-click setup of clusters, and easy one-button-click backup of the whole instance, including application and configuration.
    • Easy clustering. To scale your instance, you can add a cluster node with a Join Cluster button. That's it - and it even works for content applications set up on another node: everything gets automatically synchronized to the joining cluster node. You can, e.g., create a CQ 5 cluster by joining a fresh, empty installation of CRX to the CQ 5 instance. The clustering infrastructure is now turned on by default, even for a one-node installation. Of course, you only pay for clustering when using a cluster of more than one node in production.
    • Easy backup. We've added the long-requested backup button, which creates a backup image of the whole instance, including software, content, configuration, even the license. You are ready for disaster recovery - just unzip the backup Zip and start the CRX Quickstart again.
    • TarPM & Clustering Enhancements. TarPM can now automatically optimize index files, and the clustering can automatically clean up journal entries to improve performance.
    • Package Manager enhancements. Package Manager has been extended so that it can be the deployment tool for enterprise content applications like CQ 5.
    • Last but not least, we added new development tools for applications developed on the CRX platform. FileVault allows for Subversion-like, full-roundtrip synchronization between the repository content and the filesystem via checkin and checkout commands. The filesystem checkout of the content repository can then be synchronized to a source code version control tool. This is geared towards projects where strict version control of the content application and the content is crucial. There is also a revamped CQDE 5, a developer environment based on the Eclipse framework. It is currently in Beta and available to CQ and CRX One & Enterprise customers.

    Famous last words: we're cloud-ready, too! It's not a marketing statement: we actually run some of our CQ 5 web properties on Amazon EC2 servers! We will post a blog entry about the whole thing; in short, it was quick and easy. The fact that CRX is self-contained, has no dependencies apart from Java 5+ and a filesystem, and has a very small memory & CPU footprint makes it ideal for such deployments.

    Please let us know your feedback!

    Posted by Greg Klebus JAN 30, 2009

    Posted in announcements Comment 1

    We have just announced the winners of the JCR Cup competition. Read all about it here.

    Congratulations to the winner Russell Toris and the honorable mention winners Alexander Mokrushin and Renaud Richardet! And a big thank you to everyone who took part.

    Posted by Carsten Ziegeler JAN 29, 2009

    Posted in jcr, osgi, portal and portlets Add comment

    OSGi is today used in many places, for desktop applications, complex application servers and even for web applications. But what about portlets?

    In the last weeks I developed some portlets for our product offerings. The portlets allow you to display, edit and manage content inside a JSR 286 (portlet API 2.0) compliant portal server. As our whole product offering is based on OSGi for the obvious reasons, I thought about using the same technology for implementing the portlets. By using the great Apache Felix implementation, together with some of the nice things we developed in the Apache Sling project, I could easily start the OSGi framework inside the portlet web application. A simple generic portlet implementation running outside the OSGi world receives the portlet calls, like events, actions and render invocations. This generic portlet just forwards them to a receiving portlet running as an OSGi service. Once you are inside the OSGi world you can use all the benefits of OSGi for your implementation, and it was quite simple to develop the "real" portlets.

    Besides using the same base for all of our products and being able to reuse stuff through well-defined bundles and services, one of the main advantages during development has been the easy update of the portlet code. While the whole portal application has been deployed to some portal container, updating the portlets did not require an update/redeployment of the portal web application. Just updating a bundle through OSGi is enough and your new code is deployed. This makes development and testing way faster. I really don't want to miss OSGi in everyday development: it creates clear boundaries, enables real modularization and reuse, and makes development much easier. Start your server once and just update the parts that have changed. No need to restart, no need to update the whole application. Priceless!

    Posted by Michael Marth JAN 29, 2009

    Posted in acl and jackrabbit Comments 4

    Remember Zed Shaw (origin of the "400 reboots a day" meme, author of Mongrel and inspiration to counter-projects)? Zed gave an interesting talk at the CUSEC conference where he discussed how dated the concept of static ACLs is and that static ACLs could not satisfy the requirements of his customer. He goes on to describe his implementation of a document management system where static ACLs have been replaced with Ruby code. (He also discusses steaks and strippers)

     

    I had never thought about it before, but I really liked his idea of ACLs that are implemented in code rather than declared as static rules. So I set out to implement a Jackrabbit-based document management system where the ACLs are dynamic. Since Jackrabbit comes with a WebDAV server I considered the DMS part done for the purpose of this exercise.

    Jackrabbit allows access to nodes to be controlled by custom implementations of the AccessManager. While this sounds simple enough, there is a plethora of names and concepts related to authorization. This ticket has a good overview.

    In order to hack my own access manager I looked at the Jackrabbit 1.5.2 branch and copied the org.apache.jackrabbit.core.security.simple.AccessManager into RAccessManager. Really, the only method that needed to be changed was internalIsGranted(Path absPath, int permissions) (the whole file is attached to this post):

    private boolean internalIsGranted(Path absPath, int permissions)
            throws RepositoryException {
        if (!absPath.isAbsolute()) {
            throw new RepositoryException("Absolute path expected");
        }
        checkInitialized();
        if (system) {
            // system has always all permissions
            return true;
        } else if (anonymous) {
            // anonymous gets no permissions in this version
            return false;
        }

        // prepare decent path string
        String decentPath = "";
        for (int i = 1; i < absPath.getElements().length; i++) {
            String e = ((Element) absPath.getElements()[i]).getString();
            decentPath += ("/" + e.substring(e.indexOf('}') + 1));
        }

        // prepare user: take the name of the first principal
        String user = "";
        Iterator<Principal> pi = subject.getPrincipals().iterator();
        if (pi.hasNext()) {
            user = pi.next().getName();
        }

        ScriptEngineManager engineMgr = new ScriptEngineManager();
        ScriptEngine engine = engineMgr.getEngineByName("ECMAScript");
        try {
            // permissions.js is re-read on every call
            Reader reader = new InputStreamReader(
                    new BufferedInputStream(new FileInputStream("permissions.js")));
            try {
                engine.eval(reader);
            } finally {
                reader.close();
            }
            Invocable invocableEngine = (Invocable) engine;
            return (Boolean) invocableEngine.invokeFunction(
                    "check", decentPath, user, permissions);
        } catch (Exception ex) {
            ex.printStackTrace();
        }
        // fallback: deny access if the script could not be evaluated
        return false;
    }

    This implementation prepares the user name and the requested node path and passes them to an ECMAScript function check(path, user, permissions) in the file permissions.js. The boolean this function returns will be used to determine access rights. That's all we need: access control done in a script.

    One type of access control that would be impossible to implement in the static approach is shown below: all access outside of the office hours is forbidden. Other than that, only resources that have the user's name in their path can be accessed (which has the weird implication that users can access paths where they cannot access the parent path):

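    function check(path, user, permissions) {
        var now = new Date();
        // deny all access outside office hours
        if (now.getHours() < 7 || now.getHours() > 18) {
            return false;
        }
        // allow only paths that contain the user's name as a segment
        var exp = new RegExp("/" + user + "/");
        return exp.test(path);
    }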
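
    For readers who want to try the rule without a script engine, here is a minimal sketch of the same policy in plain Java. It assumes the office-hours condition means "before 7 or after 18"; the class name OfficeHoursRule is hypothetical and not part of Jackrabbit, and the hour is passed in as a parameter (instead of reading the clock) only to make the rule easy to test.

```java
// Hypothetical helper, not part of Jackrabbit: the office-hours rule in plain Java.
class OfficeHoursRule {

    // hour is the local hour of day (0-23); it is a parameter so the rule can be tested.
    static boolean check(String path, String user, int hour) {
        // Deny everything outside office hours (assumed: before 7 or after 18).
        if (hour < 7 || hour > 18) {
            return false;
        }
        // Allow only paths that contain "/<user>/" somewhere.
        return path.contains("/" + user + "/");
    }

    public static void main(String[] args) {
        System.out.println(check("/docs/alice/report.txt", "alice", 10));
        System.out.println(check("/docs/alice/report.txt", "alice", 22));
        System.out.println(check("/docs/bob/report.txt", "alice", 10));
    }
}
```

    Unlike the script, this version treats the user name as a literal string rather than as part of a regular expression, so special characters in a user name cannot change the match.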

    In order to get my implementation of AccessManager to compile into the jackrabbit-standalone server I recompiled the core module and the standalone server. However, since the scripting capabilities utilized above are only available since Java 1.6, I needed to change the JDK settings in the parent pom.xml accordingly. After recompiling the standalone server I configured my new access manager in the repository.xml file:

     

    <AccessManager class="org.apache.jackrabbit.core.security.simple.RAccessManager"></AccessManager>

    Since the script permissions.js is loaded on each request, edits to this file take effect with the next request.

    Posted by Michael Marth JAN 29, 2009

    Posted in atom, atompub, cmis and rest Add comment

    David Nuescheler has published the slide deck he presented at the CMIS F2F in Redmond. Interesting bits about the relation of AtomPub and CMIS.

     

    Posted by Cedric Huesler JAN 23, 2009

    Posted in jcr Comments 3

    You can guess - we hear this question popping up quite often.

    Last year we had Bertil Chapuis in the house and we let him dig deep into the topic. He summarized his findings in a paper he recently published for his diploma at the Université de Lausanne.

    While reading - I found these quotes nicely expressing differences:

    Houses are rarely built from scratch without blueprints. Cities, however, usually evolve organically, without detailed blueprints....The JCR model clearly promotes a data-driven structure, without extraneous blueprints.

    and..

    The choice of the best approach should be made with regard to the responsibility given to the DBA and to the application programmer....on one hand, a prison guardian must control all the prisoner's movements...a tourist guide, on the other hand, has to ensure that the travelers have a good trip by directing them and giving them the right information. Do users have to be guarded or guided?

    without further ado:

    We at Day Software thank Bertil for allowing us to publish his work and wish him all the best for the future.

    Update: Here's a link for downloading the report without having to login at Scribd

    Posted by Cedric Huesler JAN 22, 2009

    Posted in announcements Comments 3

    How about getting instant notification when we are about to pull the next trick?

    You can now follow us on Twitter - to be in-the-know what's happening in the Day world.

    Don't know what Twitter is? Hmm.. check this out.

     

    Day employees across the company are sharing their stories. Feel free to join the conversation.

    And yes.. a new name: My name is Cédric - I'm a new troublemaker at Day.

    Currently doing: uploading day.com into the clouds.

    Posted by Michael Marth JAN 20, 2009

    Posted in jackrabbit, jcr, tools and tutorial Comments 4

    There is an interesting new piece in the Jackrabbit sandbox: Jukka Zitting has committed a JDBC to JCR bridge. This bridge acts as a JDBC driver and thus allows users to connect to a JCR repository through JDBC. I was very happy to see this because I have one or two use cases where this comes in very handy: for example, I would like to use a standard reporting tool to produce reports for JCR content. These tools work best with relational data and therefore need JDBC connections (*).

    Jukka has outlined how to use the driver and how it works on the Jackrabbit mailing list. Apart from Jukka's instructions it is useful to know that the driver internally bundles the Apache Derby DB so that the SQL queries are restricted to what Derby can deal with.

    Want to get your feet wet?

    To get started check out Jackrabbit 1.5.2, compile it with mvn clean install and start the fabulous new standalone server in jackrabbit-standalone\target with

    java -jar jackrabbit-standalone-1.5.2.jar

    Afterwards hit http://localhost:8080 and populate the repository with some documents.

    You also need to check out the driver from the Jackrabbit sandbox and build it with mvn package. Beware, here's a gotcha: you need to use Java 5 (yes, I did not know it still existed either). On Windows I additionally encountered a problem where a temp file cannot be deleted when the JDBC connection gets closed. This can be remedied by commenting out the line FileUtils.deleteDirectory(tmp) in JCRConnection.java's close() method and later deleting the temp files manually.

    Once you got it compiled put the driver on your classpath and you are ready to hit Jackrabbit with some old-school JDBC program. Here's an example:

    package com.day.samples;

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.ResultSetMetaData;
    import java.sql.Statement;

    public class JCRDriverTest {

        public static void main(String[] args) throws Exception {
            Class.forName("org.apache.jackrabbit.jdbc.JCRDriver");
            Connection connection = DriverManager
                    .getConnection("jdbc:jcr:http://localhost:8080/rmi");
            try {
                Statement statement = connection.createStatement();
                try {
                    ResultSet resultSet = statement.executeQuery(
                            "SELECT a.jcr_path as path, a.jcr_primaryType as type FROM NT_FILE as a");
                    try {
                        ResultSetMetaData rsMetaData = resultSet.getMetaData();
                        int numberOfColumns = rsMetaData.getColumnCount();
                        System.out.println("Number of Columns=" + numberOfColumns);
                        // print the column names and types; column indexes start from 1(!)
                        for (int i = 1; i <= numberOfColumns; i++) {
                            String columnName = rsMetaData.getColumnName(i);
                            String columnType = rsMetaData.getColumnTypeName(i);
                            System.out.println("column name=" + columnName
                                    + " type=" + columnType);
                        }
                        // print path and primary type of every row
                        while (resultSet.next()) {
                            System.out.println(resultSet.getString(1) + " "
                                    + resultSet.getString(2));
                        }
                    } finally {
                        resultSet.close();
                    }
                } finally {
                    statement.close();
                }
            } finally {
                connection.close();
            }
        }
    }

    The connection string is "jdbc:jcr:http://localhost:8080/rmi". All nodes are arranged in views where the table name corresponds to the node type (i.e. all nodes of type nt:file are accessible in the view nt_file). ResultSetMetaData is available as well. It is possible to use quite complex queries, for example have a look at Jukka's test code:

    SELECT a.jcr_path as path, a.jcr_primaryType as type,  COUNT(*) as children
    FROM nt_base AS a JOIN nt_base AS b  
    ON (a.jcr_path || '/' = SUBSTR(b.jcr_path, 1, LENGTH(a.jcr_path) + 1))
    GROUP BY a.jcr_path, a.jcr_primaryType
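
    To make the join condition easier to read: it matches every row b whose path starts with a's path plus a slash, so the "children" column counts all nodes below a, not only the direct children. The following is a small in-memory sketch of that logic in Java. It does not touch the driver; the class name DescendantCount and the sample paths are made up for illustration.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustration only: mimics the prefix join of the query on a plain list of paths.
class DescendantCount {

    // For every path a, count the paths b that start with a + "/".
    // Like the inner JOIN in the query, paths without any match are left out.
    // Note that the root path "/" yields the prefix "//" and so matches nothing.
    static Map<String, Integer> count(List<String> paths) {
        Map<String, Integer> result = new LinkedHashMap<String, Integer>();
        for (String a : paths) {
            String prefix = a + "/";
            int n = 0;
            for (String b : paths) {
                if (b.startsWith(prefix)) {
                    n++;
                }
            }
            if (n > 0) {
                result.put(a, n);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> paths = Arrays.asList("/a", "/a/b", "/a/b/c", "/d");
        System.out.println(count(paths));
    }
}
```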

    Reporting

    Now that the JCR repository looks like an RDBMS it is possible to throw a reporting tool at it. I took iReport 3.1.3 (a GUI for JasperReports), installed the JDBC driver and created a report that produces a pie chart of the top-level domains from which the documents were downloaded (the path of the documents corresponds to their URL, so the JCR path can be evaluated for this report).

     

    It should be noted that the driver is "in the sandbox" which means nowhere near production. If you are interested in using it and run into problems the Jackrabbit list is a good place to turn to.

    (*) Actually, for completeness, some reporting tools also work with XML data so one could also use JCR's XML export or Sling's XML rendering if available.

    Posted by Michael Marth JAN 12, 2009

    Posted in jcr and link of the day Add comment

    New to CND (the notation to describe JCR content structure)? Jochen Toppe has put together a readable primer (along with a good overview of JCR covering not only JSR-170, but also JSR-283).

    Posted by Michael Marth JAN 12, 2009

    Posted in ab testing, crx quickstart, data first, javascript, sling and tracking Comments 4

    John Resig of jQuery fame has written an interesting article about a JavaScript library called Genetify by Greg Dingle, which is for A/B testing web sites. Wikipedia explains A/B testing as:

    A/B testing, or split testing, is a method of advertising testing by which a baseline control sample is compared to a variety of single-variable test samples in order to improve response rates. A classic direct mail tactic, this method has been recently adopted within the interactive space to test tactics such as banner ads, emails and landing pages.

    Significant improvements can be seen through testing elements like copy text, layouts, images and colors. However, not all elements produce the same improvements, and by looking at the results from different tests, it is possible to identify those elements that consistently tend to produce the greatest improvements.

    In the context of a web page one might for example change the colors or the texts, display each variation to a subset of the site's visitors and determine the most successful variant by the number of page views or sold items.

    There are two things to note about Genetify: first, it takes this process to the client, i.e. the served HTML page already contains all possible variants and a particular variant is chosen on the client side. Second, over time the optimal variation will be shown more often than suboptimal versions. This is the "genetic" part (as in Genetic Algorithms).
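
    The idea that better variants get shown more often can be sketched as a weighted random choice. The following Java snippet is only an illustration of that general technique and is not Genetify's actual algorithm; the class name WeightedVariantPicker and the example weights are assumptions.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Random;

// Illustration only: picks a variant with probability proportional to its weight.
class WeightedVariantPicker {

    // r must be in [0, 1); weights must be non-negative with a positive sum.
    static String pick(Map<String, Double> weights, double r) {
        double total = 0;
        for (double w : weights.values()) {
            total += w;
        }
        if (total <= 0) {
            throw new IllegalArgumentException("weights must have a positive sum");
        }
        double threshold = r * total;
        double running = 0;
        String last = null;
        for (Map.Entry<String, Double> e : weights.entrySet()) {
            running += e.getValue();
            last = e.getKey();
            if (threshold < running) {
                return last;
            }
        }
        return last; // only reached through floating-point rounding
    }

    public static void main(String[] args) {
        // Hypothetical success scores: "scissors" performed best so far.
        Map<String, Double> scores = new LinkedHashMap<String, Double>();
        scores.put("rock", 1.0);
        scores.put("paper", 1.0);
        scores.put("scissors", 8.0);
        Random random = new Random();
        for (int i = 0; i < 5; i++) {
            System.out.println(pick(scores, random.nextDouble()));
        }
    }
}
```

    With these weights "scissors" is chosen in roughly eight out of ten draws, while the weaker variants still get occasional exposure.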

    John provides a good overview of the library and also points to Genetify's instructive demo. After John's post Genetify's author Greg Dingle has open-sourced Genetify on GitHub including a PHP/RDBMS-based backend which is announced and discussed in the comments of John's post. In another comment of that thread Rob Howell says:

    Also, would be very cool to see it integrated server-side into a decent CMS.

    Hmm, I happen to know a decent CMS so I had a look how Genetify could be ported (to Apache Sling actually, which makes it suitable for CQ5 or any other Sling-based web application):

    Originally, I planned to simply re-implement the PHP backend and leave the JS untouched. But I realized that the style of interaction between the JS script and the PHP backend was so much out of tune with how one would design the interaction in a RESTful framework like Sling that I decided to tweak the JS script as well. As such, this exercise became more interesting in the sense that some differences between PHP/RDBMS backends (I should rather say: the way PHP-based backends are usually designed) and Sling/JCR backends became visible.

    The first difference was for recording "variants" and "goals". The variants are the permutations of the genes that are shown to a specific user. The goals are the desired outcomes that shall be measured, like buying something. Both need to be persisted, obviously. In the original version both are recorded by sending a GET request to the backend. I changed this to the (arguably more "correct") POST method. The original version sends a random number parameter with each request. As far as I understand the code, this is needed to get around caching issues. Using POST makes it possible to drop this parameter. What's more, Sling requires no backend code at all for writing a new node when the request is sent using the POST method.

    The second change involves the layout of the stored data. In an RDBMS-based system one (obviously) puts the different entities into different tables (which need to be defined beforehand). In a JCR-based system one possible, if not even the natural, approach is to utilize the hierarchy - and potentially not define any node types at all, like I did. Since I store all variants and goals in nodes of type nt:unstructured, there is no need to define a data schema or the like beforehand. One can start writing into the empty repository.

    For example, the variants are stored in one node of type nt:unstructured that stores all the properties, like on which domain the variant was shown. The actual genes are stored in a child node below. A similar approach is taken for the goals, where there is a node for each goal (named like the goal) and child nodes for the achieved goals.

    It is actually possible to create a node hierarchy like this in one POST request by simply setting parameter names accordingly:

    ./param1=value1&./childnode/childparam2=value2

    (this approach is also used in the blog sample application where a blog post can have an attachment which is stored as a child node of the blog post's node).
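
    To illustrate how such parameter names spell out a hierarchy, here is a small Java sketch that turns slash-separated parameter names into a nested map. This is not Sling's code (with Sling no such code has to be written at all); the class name ParamTree is hypothetical and the snippet only mimics the naming convention described above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustration only: maps "./a/b=value" style parameters to a nested structure.
class ParamTree {

    @SuppressWarnings("unchecked")
    static Map<String, Object> toTree(Map<String, String> params) {
        Map<String, Object> root = new LinkedHashMap<String, Object>();
        for (Map.Entry<String, String> p : params.entrySet()) {
            String name = p.getKey();
            if (name.startsWith("./")) {
                name = name.substring(2); // "./" refers to the node being posted to
            }
            String[] segments = name.split("/");
            Map<String, Object> node = root;
            // All segments but the last name child nodes; create them on demand.
            for (int i = 0; i < segments.length - 1; i++) {
                Object child = node.get(segments[i]);
                if (!(child instanceof Map)) {
                    child = new LinkedHashMap<String, Object>();
                    node.put(segments[i], child);
                }
                node = (Map<String, Object>) child;
            }
            // The last segment is the property name.
            node.put(segments[segments.length - 1], p.getValue());
        }
        return root;
    }

    public static void main(String[] args) {
        Map<String, String> params = new LinkedHashMap<String, String>();
        params.put("./param1", "value1");
        params.put("./childnode/childparam2", "value2");
        System.out.println(toTree(params));
    }
}
```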

    As said above, this part did not involve server-side scripting. However, the Genetify JS script not only writes the goals, but also retrieves information about the previous performance of the genes when it starts (in order to lean towards more successful genes in the long run). I have (hopefully correctly) reverse-engineered the PHP scripts that generate this response and written an ESP script (server-side JS) that should do the same. It should be noted that the original Genetify server-side scripts do a lot more error checking which is not implemented in the ESP.

    If you want to check out the Sling'ed Genetify version grab the attached zip file, unzip it into your CRX repository at /apps/gen and point your browser to http://localhost:7402/apps/gen/index.html. The upper part of the page displays the values of two genes (the first one is "rock", "paper", or "scissors"). If you click the "vary" link below, the genes will change (because keeping always the same state on one particular browser with a cookie is switched off for development). Clicking one of the two links further down, "want it!" or "badly!", will be counted as an achieved goal for the genes that are currently displayed. If you click one of them and reload the page afterward, the stats table will have changed. The stats table represents the success of particular genes on a particular page. For restarting just delete the results stored in /content/gen.

    While it's fun to look at how to do things in Sling and how little code is needed to get things running, it needs to be said that the approach presented above will not scale very well. For one, all variations are stored flat, i.e. without a hierarchy. Since each page view creates a variation, the number of child nodes will quickly become much too large to be handled efficiently. The second scaling problem is the calculation of the previous results, which will take much too long as well. Both problems could be remedied by another JCR-typical approach: observation. A listener for /content/gen could be registered that moves old variations into a properly structured archive and pre-calculates the previous results table.
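
    As a rough idea of what "properly structured" could mean, here is a sketch in Java that computes a date-based archive path for a variation node, so that no single parent ends up with too many children. The layout /content/gen/archive/YYYY/MM/DD and the class name ArchivePath are assumptions made for this example and are not part of the attached code; the observation listener itself is left out.

```java
import java.time.LocalDate;
import java.util.Locale;

// Illustration only: a hypothetical date-bucketed layout for archived variations.
class ArchivePath {

    // Builds /content/gen/archive/YYYY/MM/DD/<nodeName>.
    static String forVariation(String nodeName, LocalDate created) {
        return String.format(Locale.ROOT, "/content/gen/archive/%04d/%02d/%02d/%s",
                created.getYear(), created.getMonthValue(), created.getDayOfMonth(),
                nodeName);
    }

    public static void main(String[] args) {
        System.out.println(forVariation("variant-123", LocalDate.of(2009, 1, 12)));
    }
}
```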

    Posted by Michael Marth JAN 05, 2009

    Posted in cms, cq, cq5, dynamic languages, ecm, jcr, link of the day, rest, sling and wcm Add comment

    Happy new year... And here's the bunch of links that have piled up in my inbox during the holidays: David Nüscheler has uploaded his presentation about CQ5 to Slideshare (embedded below)... There is a new tutorial on the Sling website on how to use Groovy in Sling... David Dossot has been interviewed about Mule's JCR Transport (he also was on dev.day.com)... I stumbled across a new blog concerned with Day Communique... Hippo CMS has published a tutorial how to access content through JCR... Shane Johnson looked at JBoss DNA's JCR implementation... and finally, Roy's post about the hypertext constraint in REST led to an article on InfoQ

    Introducing CQ 5.1
