Archive for August 2008

    Posted by Michael Marth AUG 25, 2008

    Posted in jackrabbit, jcr and link of the day

    Storing data in the cloud (formerly known as SaaS; the elderly might even remember ASP) is the latest craze. If you want to jump on that bandwagon, or simply have a lot of files, check out Scott Dietrich's Jackrabbit DataStore for Amazon S3. Scott has posted a link to the sources on the Jackrabbit user list.

    Posted by Michael Marth AUG 18, 2008

    Posted in cms

    Communique 5 is currently being built on top of Apache Sling. I believe this is a step forward in the overall architectural evolution of web content management systems because it exposes a new type of interface to developers of content centric applications: the web infrastructure layer. Let me explain this idea in the context of CMS history:

    1. The Big Blob

    Seen from a developer's perspective, early CMSs came as one big piece of software. If you wanted to develop custom applications with them, you usually had to develop on top of the CMS API. For a WCMS, this API would contain abstractions like "paragraph", "user" or "workflow" and expose actions like "publish a page", i.e. the abstraction level was the level of the WCMS business logic (I use the term "business logic" as in "the business of managing content"). For the purpose of this argument, let's call these APIs the "business level API".

    2. Splitting off the repository

    It was obvious that the content also had to be accessible to developers outside the context of the business-level API. Thus, the repository layer was abstracted: a number of CMSs created an API that allowed direct access to the content, either through JCR/JSR-170 or some proprietary API. This gave developers two ways to access the content: either in the context of the business-level API or through "raw" access via, say, JCR.

    Introducing the repository-level API can be seen as a clean architectural cut for a CMS in order to improve separation of concerns.

    3. Splitting off the web infrastructure

    In this next evolutionary step we take what is left of the WCMS after the repository is split off and separate off another layer: the web infrastructure (Sling). This layer contains the basic plumbing required to build web apps, e.g. request handling, selecting content, script execution, authentication, filters etc. This leaves the business layer with the business logic only.

    One effect of splitting off the web infrastructure is that developers can write web applications that sit "next to" the WCMS application rather than running on top of the WCMS. This means that these developers do not need to be concerned with (and learn about) the particular CMS business logic because they have access to the web application infrastructure provided by the CMS.

    This is a second architectural cut and separation of concerns that acknowledges that there are different types of content centric applications and WCMS is only one of them (consider forums, blogs, media asset management, wikis, etc). These different content-centric apps share requirements for repositories and web infrastructure, though.

    It should be noted that compared to the second step above this additional separation of concerns does not really allow you to do anything that was not possible before. However, it is such a big advancement in terms of developer productivity that I consider it to be a qualitative change. (If you disagree: Try building a web app starting with just a connection to your repository and another one on top of Sling. You will understand.)

    Some clarifications

    There are of course myriad WCMSs already that let developers write extensions, modules or similar. That approach is different, though, because these applications run "within" the CMS context, i.e. the developer is still interfacing with the business layer. On top of Sling, your app is a first-class citizen just like CQ5.

    This CMS-oriented view of the world and its history might seem strange if you look from an application server angle: in the app server world the persistence layer, the application framework (Struts, PHP, ...) and the application were always neatly separated - so Sling would be nothing new. However, this is not quite the same situation as step 3 in the CMS world because in the classical web infrastructure world the application framework knows nothing about the underlying repository - usually it just handles database connection pooling or similar. Sling does know about the repository and thus it can provide additional functionality. Therefore, if you want to stay in the classical web infrastructure picture you could regard Sling as a very specialized application framework (specialized for content-centric applications). AFAIK this is a new piece of infrastructure that did not exist before.

    Posted by Michael Marth AUG 14, 2008

    Posted in everything is content

    If you happen to live in Germany, Switzerland or Austria (and speak German) make sure you do not miss this month's Javamagazin. The title story is "Was ist Content?" ("what is content?") and there is also an introductory article about JCR and Apache Jackrabbit.

    The abstracts:

    Was ist Content?

    Images, videos, text: you recognize content when you see it. But does content have characteristic properties that distinguish it from plain data? In conversations about content and content management, one occasionally finds content being equated with relational databases or file systems. Although content is ultimately often stored in a file system or a database, reducing it to this low-level view does not do the subject justice. Content is a special form of data with a quality of its own. In that sense, content management is also more than just a database entry form.
    David Nüscheler, Michael Marth

    JCR und Apache Jackrabbit

    The JCR standard (Content Repository API for Java) describes a clear interface between the application and the content store. The content repository it defines provides the properties and functionality typical of content in a standardized way. This considerably eases the use of content across systems and thereby unlocks potential that until now could only be tapped with great effort.
    Michael Marth, Gerd Handke, David Nüscheler, Carsten Ziegeler

    Posted by Michael Marth AUG 12, 2008

    Posted in open, rad, sling and tutorial

    One of the productivity-boosting features of JCR is the included search engine (which is Apache Lucene in the case of Jackrabbit and CRX). This feature can be used to very quickly develop an OpenSearch interface for a Sling-based application.

    I have recently provided a Sling example application called Notes which I want to use to demonstrate the implementation of OpenSearch. Import the Notes application into your CRX Quickstart (download Quickstart if you do not have it yet). To import it, navigate to http://localhost:7402/crx/index.jsp and start the Package Manager.

    In OpenSearch, search results can be returned in different formats. Let's start with producing results in XHTML. Add a new file at /apps/notes/opensearch.jsp. It shall contain:

    <?xml version="1.0" encoding="UTF-8"?>
    <%@page import="javax.jcr.query.*, javax.jcr.*"%>
    <%@taglib prefix="sling" uri="http://sling.apache.org/taglibs/sling/1.0"%>
    <sling:defineObjects/>
    <%@ include file="/apps/notes/util.jsp" %>
    <%
    // build an XPath query over all notes from the "qt" request parameter
    String q = "/jcr:root/content/notes//*[" + SearchUtils.parameterToQuery(request.getParameter("qt")) + "]";
    Query query = currentNode.getSession().getWorkspace().getQueryManager().createQuery(q, "xpath");
    NodeIterator result = query.execute().getNodes();
    %>
    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
    <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
      <head profile="http://a9.com/-/spec/opensearch/1.1/">
        <title>Notes app search results</title>
        <link rel="search" type="application/opensearchdescription+xml" href="http://localhost:7402/apps/notes/opensearch" title="Notes Search" />
        <meta name="totalResults" content="<%=result.getSize()%>"/>
        <link rel="stylesheet" href="/apps/notes/static/blue.css" />
      </head>
      <body>
        <div id="Header"><a href="/content/notes.html">&lt; back to thread overview</a></div>
        <div id="Content">
          <h1>search results for query <%=request.getParameter("qt") %>, hits: <%=result.getSize()%></h1>
          <%@ include file="/apps/notes/searchBox.jsp" %>
          <ul>
    <%
    // one list item per hit: a truncated title link plus the full note body
    while (result.hasNext()) {
      Node n = result.nextNode();
      String t = n.getProperty("body").getValue().getString();
    %>
            <li>
              <a href="http://localhost:7402<%=n.getPath()%>.thread.html"><%=t.length() > 50 ? t.substring(0,50) + "..." : t%></a>
              <div><p><%=t%></p></div>
            </li>
    <%
    }
    %>
          </ul>
        </div>
      </body>
    </html>

    There are only a few OpenSearch-related lines, like the link to the OpenSearch descriptor file (explained further below)

    <link rel="search" type="application/opensearchdescription+xml" href="http://localhost:7402/apps/notes/opensearch" title="Notes Search" />

    and the META tag that describes the result set:

    <meta name="totalResults" content="<%=result.getSize()%>"/>

    The rest of this JSP just uses the standard JCR-provided query functionality to produce standard XHTML. That's it. This is your shiny new OpenSearch interface (I need to stress this once more because it's really cool: no search engine crawling or other setup was needed. Neither was there any JCR-JSP wiring or other web app configuration). Point your browser to http://localhost:7402/content/notes.opensearch.html?qt=JCR to have a look (the request parameter qt denotes the query term).
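
    The helper SearchUtils.parameterToQuery lives in util.jsp, which is not shown in this post. As a rough idea of what such a helper could do, here is a minimal sketch in plain Java. It assumes the helper simply wraps the term in a jcr:contains full-text predicate and doubles single quotes so the term cannot break out of the XPath string literal; the class name NotesQuerySketch and this behaviour are illustrative assumptions, not the actual Notes code.

```java
class NotesQuerySketch {

    // Hypothetical stand-in for SearchUtils.parameterToQuery (the real util.jsp is not shown).
    static String parameterToQuery(String term) {
        if (term == null || term.trim().isEmpty()) {
            throw new IllegalArgumentException("query term must not be empty");
        }
        // Double single quotes so the term cannot terminate the XPath string literal.
        String escaped = term.trim().replace("'", "''");
        return "jcr:contains(., '" + escaped + "')";
    }

    // Builds a statement of the same shape that opensearch.jsp hands to createQuery(q, "xpath").
    static String buildStatement(String term) {
        return "/jcr:root/content/notes//*[" + parameterToQuery(term) + "]";
    }

    public static void main(String[] args) {
        // prints /jcr:root/content/notes//*[jcr:contains(., 'JCR')]
        System.out.println(buildStatement("JCR"));
    }
}
```

    The full-text expression inside jcr:contains has its own syntax (quotes, minus signs and so on), so a production helper would probably need to sanitize more than just single quotes.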

     

    The descriptor file mentioned above describes the app's search capabilities to external parties. Given the link element from above, it needs to be located at /apps/notes/opensearch and shall contain:

    <?xml version="1.0" encoding="UTF-8"?>
    <OpenSearchDescription xmlns="http://a9.com/-/spec/opensearch/1.1/" xmlns:moz="http://www.mozilla.org/2006/browser/search/">
      <ShortName>Notes app</ShortName>
      <Description>Notes app search interface</Description>
      <Url type="text/html" template="http://localhost:7402/content/notes.opensearch.html?qt={searchTerms}"/>
      <Url type="application/rss+xml" template="http://localhost:7402/content/notes.opensearchrss.xml?qt={searchTerms}"/>
      <Url type="application/x-suggestions+json" template="http://localhost:7402/content/notes.opensearchsuggestions.json?qt={searchTerms}"/>
    </OpenSearchDescription>
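
    A client such as a browser fills in the {searchTerms} placeholder of a Url template with the URL-encoded query before requesting it. A minimal sketch of that substitution in plain Java follows; the class and method names are made up for illustration and do not belong to any OpenSearch library.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

class OpenSearchTemplateSketch {

    // Replaces the {searchTerms} placeholder with the URL-encoded query.
    static String expand(String template, String searchTerms) {
        try {
            String encoded = URLEncoder.encode(searchTerms, "UTF-8");
            return template.replace("{searchTerms}", encoded);
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is available on every Java platform, so this should not happen.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String template = "http://localhost:7402/content/notes.opensearch.html?qt={searchTerms}";
        // prints http://localhost:7402/content/notes.opensearch.html?qt=jcr+repository
        System.out.println(expand(template, "jcr repository"));
    }
}
```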

    The first Url element describes the XHTML-based output discussed above. OpenSearch also allows RSS-based output, which is described in the second Url element. The implementation of RSS-based OpenSearch results is even more succinct because there is less boilerplate code. Create the renderer file /apps/notes/opensearchrss.xml.jsp and let it contain:

    <?xml version="1.0" encoding="UTF-8"?>
    <rss version="2.0" xmlns:opensearch="http://a9.com/-/spec/opensearch/1.1/">
    <%@page import="javax.jcr.query.*, javax.jcr.*"%>
    <%@taglib prefix="sling" uri="http://sling.apache.org/taglibs/sling/1.0"%>
    <sling:defineObjects/>
    <%@ include file="/apps/notes/util.jsp" %>
    <%
    // same query as in the XHTML renderer
    String q = "/jcr:root/content/notes//*[" + SearchUtils.parameterToQuery(request.getParameter("qt")) + "]";
    Query query = currentNode.getSession().getWorkspace().getQueryManager().createQuery(q, "xpath");
    NodeIterator result = query.execute().getNodes();
    %>
    <channel>
      <title>Local Notes app</title>
      <link>http://localhost:7402/content/notes.html</link>
      <description>Search results for "<%=request.getParameter("qt") %>" at the local Notes app</description>
      <opensearch:totalResults><%=result.getSize()%></opensearch:totalResults>
    <%
    // one RSS item per hit
    while (result.hasNext()) {
      Node n = result.nextNode();
    %>
      <item>
        <% String t = n.getProperty("body").getValue().getString(); %>
        <title><%=t.length() > 50 ? t.substring(0,50) + "..." : t%></title>
        <link>http://localhost:7402<%=n.getPath()%>.thread.html</link>
        <description><%=t%></description>
      </item>
    <%
    }
    %>
    </channel>
    </rss>
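
    Note that the JSP writes the note body into the title and description elements as-is, so a note containing < or & would yield malformed RSS. One possible guard, sketched in plain Java: the helpers escapeXml, excerpt and item are hypothetical and not part of the Notes app, and the excerpt rule simply mirrors the 50-character truncation used in the JSP.

```java
class RssTextSketch {

    // Escapes the five XML special characters so the text is safe as element content.
    static String escapeXml(String s) {
        StringBuilder out = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '&': out.append("&amp;"); break;
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                case '"': out.append("&quot;"); break;
                case '\'': out.append("&apos;"); break;
                default: out.append(c);
            }
        }
        return out.toString();
    }

    // Same truncation rule as the JSP: at most 50 characters, then "...".
    static String excerpt(String body) {
        return body.length() > 50 ? body.substring(0, 50) + "..." : body;
    }

    // Renders one RSS item for a note at the given repository path.
    static String item(String path, String body) {
        return "<item><title>" + escapeXml(excerpt(body)) + "</title>"
             + "<link>http://localhost:7402" + escapeXml(path) + ".thread.html</link>"
             + "<description>" + escapeXml(body) + "</description></item>";
    }

    public static void main(String[] args) {
        System.out.println(item("/content/notes/first", "JCR & Sling: a < b"));
    }
}
```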

    Hit http://localhost:7402/content/notes.opensearchrss.xml?qt=JCR with your browser or any other RSS viewer to retrieve the search results in RSS format.
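
    Why does a request for notes.opensearchrss.xml end up in opensearchrss.xml.jsp? Sling splits the URL into a resource path, selectors and an extension, and uses the latter two to pick the script. The following plain-Java sketch illustrates the idea under simplifying assumptions: it treats everything before the first dot of the last path segment as the resource path (real Sling matches the path against the repository), and its scriptName rule only mirrors the two file names used in this post. The class SlingUrlSketch is made up for illustration and is not Sling API.

```java
import java.util.Arrays;

class SlingUrlSketch {
    final String resourcePath;
    final String[] selectors;
    final String extension;

    SlingUrlSketch(String resourcePath, String[] selectors, String extension) {
        this.resourcePath = resourcePath;
        this.selectors = selectors;
        this.extension = extension;
    }

    // Splits a URL path (without query string) into resource path, selectors and extension.
    static SlingUrlSketch parse(String urlPath) {
        int lastSlash = urlPath.lastIndexOf('/');
        int firstDot = urlPath.indexOf('.', lastSlash + 1);
        if (firstDot < 0) {
            return new SlingUrlSketch(urlPath, new String[0], "");
        }
        String[] parts = urlPath.substring(firstDot + 1).split("\\.");
        // The last dot-separated part is the extension, everything before it is a selector.
        String[] selectors = Arrays.copyOf(parts, parts.length - 1);
        return new SlingUrlSketch(urlPath.substring(0, firstDot), selectors, parts[parts.length - 1]);
    }

    // Script file name following the convention seen in this post (assumption, simplified).
    String scriptName() {
        if (selectors.length == 0) {
            throw new IllegalStateException("sketch only covers URLs with at least one selector");
        }
        String base = String.join(".", selectors);
        return "html".equals(extension) ? base + ".jsp" : base + "." + extension + ".jsp";
    }

    public static void main(String[] args) {
        SlingUrlSketch rss = parse("/content/notes.opensearchrss.xml");
        // prints /content/notes -> opensearchrss.xml.jsp
        System.out.println(rss.resourcePath + " -> " + rss.scriptName());
    }
}
```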


     

    OK, this is simple to implement, but what is it good for? For example, recent browsers support autodiscovery of OpenSearch engines and allow users to add them to the browser's upper right search box. Simply add a link in the HTML page's header:

    <link rel="search" type="application/opensearchdescription+xml" title="Notes" href="http://localhost:7402/apps/notes/opensearch">

    I have tested this in IE7 and FF2.

     

    Search Suggestions

    There is an enhancement suggested for upcoming versions of OpenSearch that would standardize search suggestions. In Firefox, however, a similar feature is already implemented. The third URL in the descriptor file describes the interface for search suggestions in FF (i.e. this will not work in IE). The corresponding implementation at /apps/notes/opensearchsuggestions.json.jsp looks like this:

    <%@page import="javax.jcr.query.*, javax.jcr.*, java.util.*"%>
    <%@taglib prefix="sling" uri="http://sling.apache.org/taglibs/sling/1.0"%>
    <sling:defineObjects/>
    <%@ include file="/apps/notes/util.jsp" %>
    <%
    // sorted set that collects the full words containing the query term
    TreeSet<String> collector = new TreeSet<String>();
    // only accept query strings of at least 2 chars that do not end with a blank
    if ((request.getParameter("qt").length() > 1) && !(request.getParameter("qt").endsWith(" "))) {
      // the actual search
      String q = "/jcr:root/content/notes//*[" + SearchUtils.parameterToSuggestionQuery(request.getParameter("qt")) + "]";
      Query query = currentNode.getSession().getWorkspace().getQueryManager().createQuery(q, "xpath");
      NodeIterator result = query.execute().getNodes();
      // the constant part of the suggestions (all words but the last one)
      String constantPart = "";
      String[] requestTokens = request.getParameter("qt").split("\\s");
      for (int x = 0; x < requestTokens.length - 1; x++) constantPart += requestTokens[x] + " ";
      // we need to see what the actual word was where the search result occurred
      while (result.hasNext()) {
        Node n = result.nextNode();
        String[] tokens = n.getProperty("body").getValue().getString().split("\\s");
        for (int i = 0; i < tokens.length; i++)
          if (tokens[i].toLowerCase().contains(requestTokens[requestTokens.length-1].toLowerCase())) collector.add(tokens[i]);
      }
      // print result
      // format is e.g. ["fi", ["firefox", "first", "fist"]]
    %>["<%=request.getParameter("qt")%>", [<%
      boolean first = true;
      Iterator<String> results = collector.iterator();
      while (results.hasNext()) { %>
      <%=!first?", ":""%>"<%=constantPart + results.next()%>"
      <% first = false;
      }
    %> ]]<%
    } else { // the query term is shorter than 2 chars or ends with a blank
    %>["<%=request.getParameter("qt")%>",[]]<%
    }
    %>

    As the file name suggests, this JSP returns JSON-formatted data. The returned data is a collection of suggestions based on the actual content (i.e. suggesting words that would produce hits). The response for a query "fi" must be of the form ["fi", ["firefox", "first", "fist"]]. Most of the code deals with String manipulation, which is a bit verbose in Java. Also, I wanted multi-word queries like "jcr re" to return suggestions for the last word only (i.e. ["jcr re", ["jcr repository", "jcr renderer"]]), which requires a couple of additional lines of code.
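
    Stripped of the JSP and JCR plumbing, the suggestion logic can be sketched in plain Java as follows. This is an illustration under assumptions rather than the Notes code: the class SuggestionSketch is made up, the list of note bodies stands in for the query result, and unlike the JSP it JSON-escapes the strings it emits.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

class SuggestionSketch {

    // Quotes a string for JSON, escaping quotes, backslashes and control characters.
    static String jsonString(String s) {
        StringBuilder out = new StringBuilder("\"");
        for (char c : s.toCharArray()) {
            if (c == '"' || c == '\\') out.append('\\').append(c);
            else if (c < 0x20) out.append(String.format("\\u%04x", (int) c));
            else out.append(c);
        }
        return out.append('"').toString();
    }

    // Returns a response of the form ["query", ["suggestion 1", "suggestion 2"]].
    static String suggest(String query, List<String> bodies) {
        String q = query == null ? "" : query;
        TreeSet<String> suggestions = new TreeSet<>();
        // Only queries with at least 2 characters that do not end in whitespace are searched.
        boolean usable = q.trim().length() > 1 && !Character.isWhitespace(q.charAt(q.length() - 1));
        if (usable) {
            String[] queryTokens = q.split("\\s");
            String last = queryTokens[queryTokens.length - 1].toLowerCase();
            // All words except the last one are repeated unchanged in every suggestion.
            StringBuilder constantPart = new StringBuilder();
            for (int i = 0; i < queryTokens.length - 1; i++) {
                constantPart.append(queryTokens[i]).append(' ');
            }
            for (String body : bodies) {
                for (String token : body.split("\\s")) {
                    if (token.toLowerCase().contains(last)) suggestions.add(constantPart + token);
                }
            }
        }
        List<String> quoted = new ArrayList<>();
        for (String s : suggestions) quoted.add(jsonString(s));
        return "[" + jsonString(q) + ", [" + String.join(", ", quoted) + "]]";
    }

    public static void main(String[] args) {
        List<String> bodies = Arrays.asList("the jcr repository", "a renderer script");
        // prints ["jcr re", ["jcr renderer", "jcr repository"]]
        System.out.println(suggest("jcr re", bodies));
    }
}
```

    Because a TreeSet is used, duplicates are dropped and the suggestions come back in alphabetical order.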

    As usual, the full application including the complete sources is attached to this post in CRX Package Manager format.

    Posted by Michael Marth AUG 05, 2008

    Posted in open

    Day's Chief Scientist Roy Fielding gave a talk about Open Architecture at this year's OSCON. Roy specifically looks at Peyman Oreizy's thesis titled "Open Architecture Software: A Flexible Approach to Decentralized Software Evolution". Find the slides below: