Entries filed under 'davids model'

    Posted by Michael Marth MAY 04, 2010

    Posted in agile, data first, davids model, jcr and modelling Comments 2

    Recently, I read up on quite a number of NoSQL protagonists. Of course, one dominant theme in NoSQL land is "schemaless" as opposed to the full-schema nature of relational databases. As usual, both approaches have their specific pros and cons. A common criticism of schemaless data stores is that the entropy of the data creates problems in the long run, once too much unstructured data has been amassed. On the other hand, full-schema databases are much less flexible, or downright the wrong tool, for unstructured data.

    In this post I would like to point out that you do not necessarily have to choose between those extremes: JCR-based data stores allow you to store unstructured data, fully structured data and anything in between. For lack of a better term I would like to call this a "schema-optional" data store with "semi-structured" data.

    • The JCR node type nt:unstructured is designed to accept any properties, so you can dump strings, dates or even binaries into such a node at will. This node type is very useful for getting started with coding an application when you do not yet know what the end result should look like. It allows for a development approach dubbed "data first, structure later", where structure emerges from data rather than being defined a priori.
    • On the other end of the spectrum you can have rigidly defined node types. JCR allows you to specify, e.g., mandatory properties, default values or the allowed child node types in a node hierarchy. The Apache Jackrabbit site has a good overview of the Compact Namespace and Node Type Definition (CND), the notation used to define such structure.

    In between these two extreme cases any middle ground is possible in JCR repositories:

    • First, a rigid node type definition for a specific node can define "residual" properties. Such an approach allows the application to set not only the properties that were defined a priori in the node type definition, but also anything else. This is particularly useful for scenarios where only a part of the requirements is known beforehand or where the requirements are known to evolve over time. You can define the known parts, but an application can still freely write anything into the node as if it were unstructured.
    • Second, it should also be noted that these structured, unstructured and semi-structured nodes can happily live next to each other in the same repository tree. So different parts of your application can make use of different levels of structure not only through different node types, but also through different parts in the node hierarchy.
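
    To make the three levels of structure a bit more tangible, here is a small sketch in plain JavaScript. It is only a mental model, not the JCR API: the nodeTypes table, the type names strictPost and flexiblePost, and the validate function are all made up for illustration. The "residual" flag plays the role of a residual property definition.

```javascript
// Illustrative model only: these objects mimic the idea of JCR node types.
// They are not the JCR API, and all names are hypothetical.
var nodeTypes = {
  // Accepts anything, in the spirit of nt:unstructured.
  unstructured: { declared: {}, residual: true },
  // Rigid: only the declared properties are allowed.
  strictPost: {
    declared: { title: { mandatory: true }, body: { mandatory: false } },
    residual: false
  },
  // Middle ground: known parts are declared, anything else is allowed too.
  flexiblePost: {
    declared: { title: { mandatory: true } },
    residual: true
  }
};

function has(obj, key) {
  return Object.prototype.hasOwnProperty.call(obj, key);
}

// Returns a list of problems; an empty list means the properties are accepted.
function validate(typeName, properties) {
  var type = nodeTypes[typeName];
  var problems = [];
  var name;
  for (name in type.declared) {
    if (type.declared[name].mandatory && !has(properties, name)) {
      problems.push("missing mandatory property: " + name);
    }
  }
  for (name in properties) {
    if (!has(type.declared, name) && !type.residual) {
      problems.push("property not allowed: " + name);
    }
  }
  return problems;
}

console.log(validate("unstructured", { anything: 1 }));
console.log(validate("strictPost", { title: "Hi", mood: "ok" }));
console.log(validate("flexiblePost", { title: "Hi", mood: "ok" }));
```

    In this model the rigid type rejects the extra "mood" property, while the type with the residual flag accepts it and still insists on the title.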

    With JCR 2.0 it has become quite a bit easier to evolve the structure (after all, the mantra is "data first, structure later", not "structure never"): one can now change the node types of existing nodes. That facilitates a migration from, say, nt:unstructured nodes to more structured types.
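
    One way to picture "structure later" is to look at the data that has accumulated and see which properties the nodes have in common. The sketch below does this for plain JavaScript objects standing in for nt:unstructured nodes. The inferStructure function and the sample data are made up for illustration and are not part of JCR or Jackrabbit.

```javascript
// Illustration only: plain objects stand in for nt:unstructured nodes.
// inferStructure is a hypothetical helper, not a JCR or Jackrabbit feature.
function inferStructure(nodes) {
  var counts = Object.create(null);
  nodes.forEach(function (node) {
    Object.keys(node).forEach(function (name) {
      counts[name] = (counts[name] || 0) + 1;
    });
  });
  var result = { mandatory: [], optional: [] };
  Object.keys(counts).sort().forEach(function (name) {
    // A property present on every node is a candidate for "mandatory".
    if (counts[name] === nodes.length) {
      result.mandatory.push(name);
    } else {
      result.optional.push(name);
    }
  });
  return result;
}

// Made-up sample data.
var people = [
  { fullname: "Jane Doe", gender: "FEMALE", phone: "555-0100" },
  { fullname: "George Doe", gender: "MALE" },
  { fullname: "Maija M" }
];

console.log(inferStructure(people));
// { mandatory: [ 'fullname' ], optional: [ 'gender', 'phone' ] }
```

    Such a report could serve as a starting point for a node type definition, with the properties found everywhere declared explicitly and the rest left to residual properties.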

    Posted by Michael Marth APR 18, 2008

    Posted in crx quickstart, data first, davids model, everything is content, graph, jcr, open, rest, sling, social and tutorial Comments 4

    If you follow emerging Internet standards you will have come across OpenSocial, the Google-led spec for social network applications. Major supporters are MySpace, LinkedIn, XING, Google's own Orkut, Hi5 and others. The Apache Software Foundation's implementation of this spec is called Apache Shindig. It is a container (runtime) for OpenSocial applications (which are called gadgets).

    In my opinion OpenSocial and Apache Sling are a good technical fit for at least two reasons:

    1. On a raw technology level both use the same technology building blocks, e.g. JavaScript: in Sling JS is used on the server-side for .esp templates and on the client-side in the case of JST templates. OpenSocial gadgets are coded in JS as well. Moreover, associated technologies like JSON, feeds and REST are supported by both.
    2. On a more conceptual level: as a spec that must work across a number of different social networks, OpenSocial makes the majority of the information accessible through its API optional, i.e. it is up to the container whether data is returned or not. This situation is a good fit for the unstructured, "Data First" approach that is enabled by Sling (or rather by the underlying JCR).

    I would like to show Apache Shindig (Apache's OpenSocial container implementation) and CRX Quickstart (a bundle of Apache Sling and Day's JCR-compliant repository) working together in this blog post.

    Installation

    In this screencast I have shown how to install CRX Quickstart: double-click on its icon (CRX Quickstart is not available yet, but it will be very soon). Strictly speaking, you do not need CRX Quickstart for the examples below. It all works with "plain" Sling as well.

    Installing Shindig is a tad more complicated and is described on Shindig's web site. You need to check out the source, run a Maven build (I used revision 648157 for this example) and start Shindig's Jetty server on port 8080 with:

    mvn jetty:run-war

    Once you have started Shindig, open /gadgets/files/samplecontainer/samplecontainer.html on http://localhost:8080. You should get a kind of gadget console that looks like this:

     

    Shindig comes with an example implementation of a social network. By default it runs the "Hello World" example gadget located at /gadgets/files/samplecontainer/examples/SocialHelloWorld.xml on http://localhost:8080. (By the way, Shindig comes with some example data, so don't worry if you have no friends: Shindig has some imaginary ones for you.)

    Friends are Content

    What I would like to do is grab the gadget viewer's friends and all the available data about them, and store this data in the repository. For this purpose I have written a little gadget (see below) and saved it in my JCR repository at /apps/friends/friendsaver.html. By default the repository is running on port 7402, so when I point the gadget console to http://localhost:7402/apps/friends/friendsaver.html I get:

     

    The gadget retrieves the viewer's friends and displays them in HTML. Moreover, in the background the viewer's data and the available friend data is posted to my repository. In the Content Explorer this looks like (click to enlarge):

     

    Hey, remember the "Everything is Content" mantra? Well, your imaginary friends are content, too.

    Please note that this works without setting up any schema or any other configuration of the repository. I ran it on an out-of-the-box CRX Quickstart (see also this screencast and this post about Data First). Node properties are created only for the fields that are actually sent.

    The Gadget Code

    The gadget is completely standard OpenSocial code, no surprises here. In onLoadFriends() the viewer's friends (variable viewerFriends) are iterated and displayed in HTML. For each opensocial.Person object the function createFriendNode() is called. In this function an HTTP POST request is sent to the repository that persists the person. Available opensocial.Person.Field data is sent as POST parameters (in the code only gender and first phone number are implemented) and thus persisted as node properties. I want to leverage the repository's hierarchy and store the friends as child nodes below the viewer (see David's model, rule 2). Here's the relevant snippet:

    /**
     * Request for friend information.
     */
    function getData() {
      var req = opensocial.newDataRequest();
      req.add(req.newFetchPersonRequest(
          opensocial.DataRequest.PersonId.VIEWER), 'viewer');
      req.add(req.newFetchPeopleRequest(
          opensocial.DataRequest.Group.VIEWER_FRIENDS), 'viewerFriends');
      req.send(onLoadFriends);
    }

    /**
     * Parses the response to the friend request.
     * @param {Object} dataResponse Friend information that was requested.
     */
    function onLoadFriends(dataResponse) {
      var viewer = dataResponse.get('viewer').getData();
      var html = 'Friends of ' + viewer.getDisplayName();
      html += ':<br><ul>';
      createFriendNode(viewer);
      var viewerFriends = dataResponse.get('viewerFriends').getData();
      viewerFriends.each(function(person) {
        html += '<li>' + person.getDisplayName() + '</li>';
        createFriendNode(person, viewer);
      });
      html += '</ul>';
      document.getElementById('message').innerHTML = html;
    }

    /**
     * Persists a person as a node: below the parent if one is given,
     * directly below /content/friends otherwise.
     */
    function createFriendNode(person, parent) {
      var url = "http://localhost:7402/content/friends/";
      if (parent) {
        url += sanitizeId(parent.getId()) + "/*";
      } else {
        url += "*";
      }
      var params = {};
      params[gadgets.io.RequestParameters.CONTENT_TYPE] =
          gadgets.io.ContentType.TEXT;
      params[gadgets.io.RequestParameters.METHOD] =
          gadgets.io.MethodType.POST;
      // Values are URL-encoded so that spaces or '&' cannot break the body.
      var postParams = 'name=' + encodeURIComponent(sanitizeId(person.getId())) +
          '&fullname=' + encodeURIComponent(person.getDisplayName());
      if (person.getField(opensocial.Person.Field.PHONE_NUMBERS)) {
        postParams += '&phone=' + encodeURIComponent(
            person.getField(opensocial.Person.Field.PHONE_NUMBERS)[0]
                .getField(opensocial.Phone.Field.NUMBER));
      }
      if (person.getField(opensocial.Person.Field.GENDER)) {
        postParams += '&gender=' + encodeURIComponent(
            person.getField(opensocial.Person.Field.GENDER).getKey());
      }
      // I could add more fields here...
      params[gadgets.io.RequestParameters.POST_DATA] = postParams;
      gadgets.io.makeRequest(url, null, params);
    }

    function sanitizeId(id) {
      // Replace every dot, not only the first one.
      return id.replace(/\./g, "_");
    }

    gadgets.util.registerOnLoadHandler(getData);
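
    Because most OpenSocial fields are optional, the POST body can also be assembled generically from whatever data happens to be present. Here is a minimal sketch that runs outside a gadget container. buildPostBody and targetUrl are hypothetical helpers (they are not part of the OpenSocial or gadgets API), sanitizeId mirrors the gadget's function of the same name, and the sample values are made up.

```javascript
// sanitizeId mirrors the gadget's function of the same name.
function sanitizeId(id) {
  // Replace every dot so the id can be used as a node name.
  return id.replace(/\./g, "_");
}

// Hypothetical helper: builds a URL-encoded body from the fields that are
// actually present; undefined or null values are skipped.
function buildPostBody(fields) {
  var parts = [];
  Object.keys(fields).forEach(function (name) {
    var value = fields[name];
    if (value !== undefined && value !== null) {
      parts.push(encodeURIComponent(name) + "=" + encodeURIComponent(value));
    }
  });
  return parts.join("&");
}

// Hypothetical helper: friends are posted below the viewer's node,
// the viewer itself directly below the base URL.
function targetUrl(base, parentId) {
  return base + (parentId ? sanitizeId(parentId) + "/*" : "*");
}

var body = buildPostBody({
  name: sanitizeId("jane.doe"),
  fullname: "Jane Doe & Co",
  phone: undefined, // the container did not return a phone number
  gender: "FEMALE"
});

console.log(body);
// name=jane_doe&fullname=Jane%20Doe%20%26%20Co&gender=FEMALE
console.log(targetUrl("http://localhost:7402/content/friends/", "john.doe"));
// http://localhost:7402/content/friends/john_doe/*
```

    Only the fields that made it into the body end up as node properties, which is exactly the "Data First" behaviour described above.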

    Round-Tripping

    Now that the friends are stored in the repository, each one has a URL. Displaying a friend in a simple HTML page can be done with, e.g., server-side JavaScript. Storing this file in the repository at /apps/friends/html.esp

    <html>
      <body>
        <h1><%= currentNode["fullname"] %></h1>
        <ul>
          <li>gender: <%= currentNode["gender"] %></li>
          <li>phone: <%= currentNode["phone"] %></li>
        </ul>
      </body>
    </html>

    will render the friend's full name, gender and phone number for the URL http://localhost:7402/content/friends/john_doe/jane_doe.html

    But this is only half the fun. It is much more interesting to retrieve the friends data in another OpenSocial gadget. This can easily be done without any repository-side code, as Sling natively supports the JSON format. For example, the URL http://localhost:7402/content/friends/john_doe/jane_doe.json will return this node in JSON format. That way, we can easily access the friend nodes through a gadget containing this snippet:

    function makeCRXRequest() {
      var params = {};
      params[gadgets.io.RequestParameters.CONTENT_TYPE] =
          gadgets.io.ContentType.JSON;
      var url = "http://localhost:7402/content/friends/john_doe/" +
          document.getElementById("person_name").value + ".json";
      gadgets.io.makeRequest(url, response, params);
    }

    function response(person) {
      var html = "";
      html += "name: " + person.data.fullname + "<br/>";
      html += "phone: " + person.data.phone + "<br/>";
      html += "gender: " + person.data.gender + "<br/>";
      document.getElementById('content_div').innerHTML = html;
    }
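
    Since the container decides which fields it returns, a friend node may lack some of these properties, and the handler above would then print "undefined" for them. A more defensive renderer could look roughly like the sketch below. renderPerson and escapeHtml are hypothetical helpers, and the JSON shape is an assumption based on the properties that were posted earlier.

```javascript
// Hypothetical helper: escapes the characters that are significant in HTML.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Hypothetical helper: renders only the properties present on the node.
// The property names match the ones the friend-saving gadget posts.
function renderPerson(data) {
  var labels = { fullname: "name", phone: "phone", gender: "gender" };
  var html = "";
  Object.keys(labels).forEach(function (prop) {
    if (data[prop] !== undefined) {
      html += labels[prop] + ": " + escapeHtml(data[prop]) + "<br/>";
    }
  });
  return html;
}

// Assumed shape of the JSON for a friend node without a phone number.
var node = JSON.parse('{"fullname": "Jane <Doe>", "gender": "FEMALE"}');

console.log(renderPerson(node));
// name: Jane &lt;Doe&gt;<br/>gender: FEMALE<br/>
```

    In the gadget, the callback would pass person.data to such a function before assigning the result to innerHTML.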

    The gadget in action looks like this (click to enlarge)

     

    This little hack could be the starting point for a cross-social network phone book application.

    Final remarks

    I hope I could show that Sling and Shindig go really well together. In particular, being able to use the JCR repository as a backend without any coding on the repository side looks tempting to me. Maybe at some point Sling will even be able to run OpenSocial gadgets natively.

    In this post I concentrated on frontend integration technologies. But OpenSocial will soon add a REST API next to its JS API. For Shindig the implementation of this REST API is likely to be Apache Abdera, which can use JCR as an optional persistence store. So there will be additional points of contact.

    Posted by Michael Marth DEC 14, 2007

    Posted in crx, davids model, microsling, sling and tutorial Comments 2

    If you followed the previous parts of this little microsling tutorial (here, here and here) you should now have microsling up and running. It’s time to move on and create a real web application with microsling. In this post I will explain how this blog system that you are currently reading has been built.

    Content model

    There is not too much information about structuring JCR content available (I hope to address this issue in the future). But one thing that is available is "David’s Model". I took it as the basis for structuring the blog.

    Let’s see, there are blog posts, a blog entity to hold the posts, comments posted by readers and file attachments to blog posts. Now for the rules:

    Rule #1: Data First, Structure Later. Maybe.

    OK, that’s easy. I will store everything as nt:unstructured. Sounds good, did not want to think about node types anyway.

    Rule #2: Drive the content hierarchy, don't let it happen.

    This is a really good rule IMO, although it can be a bit hard to get into, especially if one has done some relational DB modeling. But in our case it is easy and natural enough: blog posts will be children of the blog they belong to. Comments and attachments will be children of the post they belong to.

    Rule #3: Workspaces are for clone(), merge() and update().

    Actually, workspaces would be really useful for a staging area. However, I will not use them here in order to keep things simple.

    Rule #4: Beware of Same Name Siblings.

    This is no problem for the posts (I can just name them any way I like and there should not be too many), but I will come back to this regarding the user generated comments.

    Rule #5: References considered harmful.

    OK, I do not think I need them here.

    Rule #6: Files are Files are Files.

    I will use nt:file for attachments. Attachments are files after all.

    Rule #7: ID's are evil.

    I do not want to be evil. Hence, I do not use IDs (and I have not needed them so far).

    Right, so the model looks something like this:

    blog [nt:unstructured]
    |  +sling:resourceType[string]
    |--post [nt:unstructured]
    |    +title[string]
    |    +body[string]
    |    +sling:resourceType[string]
    |----comment [nt:unstructured]
    |      +body[string]
    |      +sling:resourceType[string]
    |----attachment [nt:file]

    Most of the properties should be self-explanatory (well, a blog post needs a title), but not the "sling:resourceType" properties. Since all nodes are unstructured (apart from the files) this additional property is needed for script resolution (see the post about microsling’s request processing). Microsling determines the resource type through this property (and indirectly the script to execute).
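
    As a rough mental model of this lookup, here is a sketch in plain JavaScript. It is a simplification and not microsling's actual code. It assumes the conventions used in the rest of this tutorial: scripts live under /sling/scripts/<resource type>/ and are named after the request extension, and the resource types are "blog" and "blogPost". The function name scriptPathFor and the sample values are made up.

```javascript
// Simplified sketch of the lookup, not microsling's real implementation.
// The "/sling/scripts" root and the ".esp" suffix follow this tutorial's setup.
function scriptPathFor(node, extension) {
  var resourceType = node["sling:resourceType"];
  if (!resourceType) {
    return null; // nothing to resolve against in this simplified model
  }
  return "/sling/scripts/" + resourceType + "/" + extension + ".esp";
}

// The content model as plain objects (comments and attachments omitted).
var blog = {
  "sling:resourceType": "blog",
  firstpost: {
    "sling:resourceType": "blogPost",
    title: "Hello",
    body: "First post"
  }
};

console.log(scriptPathFor(blog, "html"));
// /sling/scripts/blog/html.esp
console.log(scriptPathFor(blog.firstpost, "html"));
// /sling/scripts/blogPost/html.esp
```

    The point of the sketch is only that the property value, not the JCR node type, selects the script.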

    Display a blog post

    Let’s now display a blog post. First, you need to create a post using CRX’s Content Explorer (see the last part of this tutorial if you are not sure how to do this). Create a blog node named "myblog". It shall contain a post node named "firstpost". Add a title and a body property.

    The property "sling:resourceType" of the post shall have the value "blogPost". For the blog node the value shall be "blog".

    You should get something like this:

    Next, you need to put a script to display this node into the repository. The script must be placed in the directory "/sling/scripts/blogPost" and called "html.esp" (the server-side JavaScript processor will be used). The file shall contain:

    <html>
      <body>
        <h1><%=resource.node.title%></h1>
        <%=resource.node.body%>
      </body>
    </html>

    To see the output of the script point your browser to http://localhost:7402/microsling/myblog/firstpost.html. As you already know, the bit between the <%= %> brackets is executed on the server and the result is inserted into the output. The resource object has a property node which represents the JCR node that was requested. This node has the two properties "title" and "body" which are accessed using JavaScript’s dot notation.

    Display the whole blog

    For displaying the whole blog you need to put a script called "html.esp" into "/sling/scripts/blog". The file shall contain:

    <html>
      <body>
      <%
      for (var prop in resource.node) {
        if (resource.node[prop]["sling:resourceType"] == "blogPost") {
      %>
          <h1><%= resource.node[prop].title%></h1>
          <p><%= resource.node[prop].body%>
          </p>
      <%
        }
      }
      %>
      </body>
    </html>

    The parts between <% %> (without the equals sign, as opposed to above) denote JavaScript that is executed on the server without its results being written into the output stream (just like in JSPs again).

    In the line

    for (var prop in resource.node) {

    the script iterates over the children of the blog node and looks for children of (Sling) resource type "blogPost" in this line:

    if (resource.node[prop]["sling:resourceType"] == "blogPost") {

    The square-bracket syntax for accessing a property (in this case ["sling:resourceType"]) is an alternative to the dot notation used above for title and body.
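
    The same loop can be tried outside the repository with plain JavaScript objects standing in for the JCR nodes. This is only an approximation of what happens in an .esp script. The function postTitles, the "settings" child and its resource type "blogSettings" are made up for illustration.

```javascript
// Plain objects stand in for JCR nodes; in an .esp script the same loop
// would run over resource.node.
var blogNode = {
  "sling:resourceType": "blog",
  firstpost: { "sling:resourceType": "blogPost", title: "First", body: "Hello" },
  secondpost: { "sling:resourceType": "blogPost", title: "Second", body: "World" },
  settings: { "sling:resourceType": "blogSettings" } // hypothetical non-post child
};

function postTitles(node) {
  var titles = [];
  for (var prop in node) {
    var child = node[prop];
    // Bracket notation is needed here: "sling:resourceType" contains a colon
    // and therefore cannot be written with dot notation.
    if (child && child["sling:resourceType"] == "blogPost") {
      titles.push(child.title); // dot notation works for plain names
    }
  }
  return titles;
}

console.log(postTitles(blogNode)); // [ 'First', 'Second' ]
```

    The child with a different resource type is skipped, just as the blog script skips anything that is not a blog post.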

    To see the output of the script point your browser to http://localhost:7402/microsling/myblog.html

    An RSS feed and includes

    Now that you have the list of posts it is really easy to add an additional feature: an RSS feed. For this, create a new script called xml.esp (i.e. it shall respond to requests ending in ".xml") and place it in "/sling/scripts/blog". The logic is almost the same as above, just the markup is different. Simply iterate over the blog posts and produce xml instead of html. So the file should contain:

    <?xml version="1.0"?>
    <rss version="2.0">
    <channel>
    <title>My blog </title>
    <description>My blog</description>
    <link>http://mydomain.com/microsling<%= resource %>.html</link>
    <%
      for (var prop in resource.node) {
        if (resource.node[prop]["sling:resourceType"] == "blogPost") {
          %><%=sling.include("/microsling"+\\
          resource.node[prop]+".rssitem.xml")%><%
        }
      }
    %>
    </channel>
    </rss>

    (The \\ characters shall denote that the line continues. Delete them from your code.)

    The first new bit in this code is the expression <%= resource %>. This evaluates to the node name including the full path. It is useful e.g. for generating links as above.

    The second new bit is the line

    %><%=sling.include("/microsling"+\\
    resource.node[prop]+".rssitem.xml")%><%

    The include method generates a new request on the server side. The result of this request is included in the output stream. Here it includes an alternative "view" of a blog post. "resource.node[prop]" is the node that contains the post. "rssitem" is a selector on this node. With a selector you can render a certain resource type in alternative ways. In this case you need to render a blog post as an RSS feed item. Thus, create a script at "/sling/scripts/blogPost/rssitem/xml.esp" that contains:

    <item>
    <title><%= resource.node.title %></title>
    <description><%= resource.node.body%></description>
    <link>http://mydomain.com/microsling<%= resource%>.html</link>
    </item>

    Now point your browser at http://localhost:7402/microsling/myblog.xml to see the resulting RSS feed.
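
    To summarize how a URL with a selector ends up at a script, here is a simplified sketch. It only covers the URL shapes used in this tutorial (real Sling request processing handles more cases, for example dots inside node names), and parseRequest and scriptPath are hypothetical names, not microsling functions.

```javascript
// Hypothetical helper: splits a request URL into node path, selectors
// and extension. Assumes the "/microsling" prefix used in this tutorial.
function parseRequest(url) {
  var path = url.replace(/^\/microsling/, "");
  var lastSlash = path.lastIndexOf("/");
  var parts = path.substring(lastSlash + 1).split(".");
  return {
    nodePath: path.substring(0, lastSlash + 1) + parts[0],
    selectors: parts.slice(1, -1),
    extension: parts.length > 1 ? parts[parts.length - 1] : null
  };
}

// Hypothetical helper: selectors become directories below the resource
// type, and the extension names the script.
function scriptPath(resourceType, request) {
  if (request.extension === null) {
    return null; // this simplified model needs an extension
  }
  return ["/sling/scripts", resourceType]
    .concat(request.selectors)
    .concat([request.extension + ".esp"])
    .join("/");
}

var feedItem = parseRequest("/microsling/myblog/firstpost.rssitem.xml");
console.log(feedItem.nodePath);                // /myblog/firstpost
console.log(scriptPath("blogPost", feedItem)); // /sling/scripts/blogPost/rssitem/xml.esp

var feed = parseRequest("/microsling/myblog.xml");
console.log(scriptPath("blog", feed));         // /sling/scripts/blog/xml.esp
```

    The resource type itself still comes from the node's "sling:resourceType" property. It is passed in by hand here to keep the sketch short.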

    OK, that’s all for today. In the next part, we will look at user-generated comments and attachments.

    (For your convenience all the scripts have been added as an attachment to this post.)