Archive for 'July 2010'

    Posted by Kas Thomas JUL 28, 2010

    Comments 3

    I promised last time to show a simple way to render CRX content as PDF. The technique in question involves using a PDF form as the readymade container, into which form data is imported using XFDF. The latter is the XML version of Adobe's Forms Data Format, which in turn is a file format specifically designed to allow import and export of data to and from PDF forms.

    The way it works is simple: Suppose you have a PDF form that you want to populate with data. You merely need to create a small data file (in XFDF format) and put it on the server. When a user requests the data file (which has a mimetype of "application/vnd.adobe.xfdf"), Acrobat Reader (or the Reader browser plug-in) detects the fact that form data will need to be imported into a form. The XFDF file itself contains a pointer to the actual form to be used. Reader fetches the form, then imports the form data into it, and renders the result as a PDF file containing the data. It all happens transparently to the user, and the user need only have Acrobat Reader (not a full copy of Acrobat Professional).

    In the example I'm going to show below, we generate the XFDF file dynamically on the server, via a script called (what else?) xfdf.esp. We'll get to that in a minute.

    The example we're going to talk about assumes that there is content in CRX (under a path of /content/films) that looks something like this:

    This particular content node is named terminator_2. It lives under /content/films/ in my CRX repository.

    Notice, in the above list, that there is a property (at the bottom) called sling:resourceType, set to a value of "films." This tells CRX to look under /apps/films for any scripts that might be necessary to render the content.

    In previous blogs, I've shown how to write scripts that render this content as HTML, SVG, or CSV. Right now, what we need is an XFDF renderer. That turns out to be pretty easy to set up.

    First, we need to create a PDF form to hold our data. In the Acrobat Professional forms editor, such a form looks like this:

    [Image: FilmSummary.pdf in the Acrobat Professional forms editor]

    Notice that there are text fields with names like Director, Subject, Year, Title, and so on. (These fields can be read-only, or editable; the example below will work fine either way.)

    I've given this form a name of FilmSummary.pdf and placed it in CRX under a path of /apps/films/.

    In order to populate the form, we need to be able to generate an XML file that conforms to Adobe's XFDF schema. The script that does this, xfdf.esp, is very straightforward:

    <?xml version="1.0" encoding="UTF-8"?>
    <xfdf xmlns="http://ns.adobe.com/xfdf/" xml:space="preserve">
    <f href="/apps/films/FilmSummary.pdf"/>
    <% response.setContentType("application/vnd.adobe.xfdf" ); %>
    <fields>
       
        <field name="Director">
            <value><%= currentNode.Director %></value>
        </field>
        <field name="Subject">
            <value><%= currentNode.Subject %></value>
        </field>
        <field name="Year">
            <value><%= currentNode.Year %></value>  
        </field>
        <field name="Title">
            <value><%= currentNode.Title %></value>
        </field>
        <field name="Length">
            <value><%= currentNode.Length %></value>
        </field>
        <field name="Actor">
            <value><%= currentNode.Actor %></value>
        </field>
        <field name="Actress">
            <value><%= currentNode.Actress %></value>
        </field>
        <field name="Popularity">
            <value><%= currentNode.Popularity %></value>
        </field>

    </fields>

    </xfdf>
     
     

    Note that we use response.setContentType( ) to explicitly tell the browser that this is an XFDF file. Note also the <f> element near the top, which contains a pointer to our PDF form. The rest of the file is pretty much self-documenting.
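
    One caveat: the script writes property values into the XML unescaped, so a value containing & or < would presumably yield malformed XFDF. Here is a minimal sketch of an escaping helper. The name escapeXml is hypothetical (it is not part of Sling or CRX); each expression could be wrapped with it, as in <%= escapeXml( currentNode.Title ) %>. The sample string is made up.

```javascript
// Hypothetical helper: escape the five XML special characters so that
// a property value can be embedded safely inside a <value> element.
function escapeXml( data ) {
    return String( data )
        .replace( /&/g, "&amp;" )   // must run first, before other entities are added
        .replace( /</g, "&lt;" )
        .replace( />/g, "&gt;" )
        .replace( /"/g, "&quot;" )
        .replace( /'/g, "&apos;" );
}

// Made-up sample value containing several special characters
var sample = escapeXml( "Fast & Furious <Director's Cut>" );
```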

    With xfdf.esp in the repository under /apps/films, we're now able to issue a browser request for http://localhost:7402/content/films/terminator_2.xfdf, and automatically the browser (with help from the Reader plug-in) will fetch the PDF form (FilmSummary.pdf) and merge our data into it, to produce a rendered view of:

    [Image: FilmSummary.pdf populated with the terminator_2 data]

    About the only thing this PDF form lacks is interactivity. It would be nice to give the end-user a way to select a film from a list of titles, then have the form automatically populate with data for the requested film. And that's exactly what we'll tackle next time.

    Posted by Kas Thomas JUL 26, 2010

    Comment 1

    I've shown how easy it is to push spreadsheet data into CRX (in such a way that there is one content node per row of data, where properties on that node correspond to column data). The reverse is also possible: It's easy to write a script that converts sibling nodes to row data formatted as CSV (comma-separated values per RFC 4180). Such a script, csv.esp, looks something like this:

    <%
    // Given a list of sibling nodes (presumably
    // similar in structure), and an array of
    // property names, convert each node
    // to one "row" of CSV data, where
    // columns correspond to properties.
    // We will encode all property data as
    // comma-separated values per RFC 4180.
    function nodesToCSV( nodes, propertyNames ) {

            var records = new Array( );

            for ( var i = 0; i < nodes.length; i++ ) {

                    var aRecord = new Array( );

                    // suck in the data for each property:
                    for ( var k = 0; k < propertyNames.length; k++ ) {
                            var data = nodes[ i ][ propertyNames[ k ] ];
                            var escaped = escapeData( data );
                            aRecord.push( escaped );
                    }
                    records.push( aRecord.join( "," ) );
            }

            var CRLF = String.fromCharCode(13) +
            String.fromCharCode(10);

            return records.join( CRLF );
    }

    // Return an array of property names for this node
    function getOrderedProperties( node ) {

            var array = new Array();
            for ( var i in node )
                    array.push( i );

            return array;
    }

    // Escape field data per RFC 4180
    function escapeData( data ) {

            // replace " with ""
            data = String(data).replace( /"/g, "\"\"" );

            // if data contains comma, CRLF, or "
            // we need to wrap the entire thing in double quotes
            var escapables = /,|(\r\n)|"/;
            if ( data.match( escapables ) )
                    return "\"" + data + "\"";

            return data;
    }
    %>
    <% nodes = currentNode.getNodes( );
    // get a list of property names
    propertyNames =
    getOrderedProperties( nodes[0] );%>
    <%= nodesToCSV( nodes, propertyNames ) %>
     

    The rules for escaping data for CSV are extremely simple. First, any data string that contains the double-quote (") character needs to have each such character converted to two double-quotes (""). Secondly, if the data contains a comma, the entire data string needs to be wrapped in quotation marks. The same is true for any data that contains double-quotes or line breaks (which RFC 4180 defines as CRLF -- carriage return followed by linefeed). The following very simple function enforces these escaping rules:

    // Escape field data per RFC 4180
    function escapeData( data ) {

       // replace " with ""
       data = String(data).replace( /"/g, "\"\"" );
     
       // if data contains comma, CRLF, or "
       // we need to wrap the entire thing in double quotes  
       var escapables = /,|(\r\n)|"/;
       if ( data.match( escapables ) )
          return "\"" + data + "\"";
          
       return data;
    }
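
    A few sample inputs show the rules in action. The function is repeated so that the snippet can run on its own in any JavaScript engine; the sample strings are chosen purely for illustration.

```javascript
// Escape field data per RFC 4180 (same logic as the function above)
function escapeData( data ) {
    // replace " with ""
    data = String( data ).replace( /"/g, "\"\"" );
    // wrap in double quotes if the data contains a comma, CRLF, or "
    var escapables = /,|(\r\n)|"/;
    if ( data.match( escapables ) )
        return "\"" + data + "\"";
    return data;
}

var plain  = escapeData( "Terminator 2" );     // no special characters: unchanged
var comma  = escapeData( "Hamilton, Linda" );  // comma: wrapped in quotes
var quoted = escapeData( "The \"Burbs\"" );    // quotes: doubled, then wrapped
var number = escapeData( 1991 );               // non-strings are converted first
```

    Note that the pattern looks only for the CRLF pair. A value containing a bare linefeed would pass through unquoted, so a wider pattern such as /[,\r\n"]/ may be worth considering if your data can contain one.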

    The function that actually converts nodes to records is very straightforward as well:

    function nodesToCSV( nodes, propertyNames ) {

       var records = new Array( );

       for ( var i = 0; i < nodes.length; i++ ) {

          var aRecord = new Array( );

          // suck in the data for each property:
          for ( var k = 0; k < propertyNames.length; k++ ) {
             var data = nodes[ i ][ propertyNames[ k ] ];
             var escaped = escapeData( data );
             aRecord.push( escaped );
         }
          records.push( aRecord.join( "," ) );
       }

       var CRLF = String.fromCharCode(13) +
                        String.fromCharCode(10);

       return records.join( CRLF );
    }

    Note that we need to explicitly provide the function a list of property names, rather than (say) let the function iterate through property names on an introspective basis. The reason for this is that if we simply try gathering property names with a for/in loop, we will get back property names in no particular order. And the order will, in fact, vary from content node to content node even if all of the content nodes have properties with exactly the same names. The unorderedness of the properties (as obtained through simple iteration) would scramble the column data in our CSV file. We don't want that. Hence, we pass in an array of property names, and march through the array in orderly fashion when pulling property data from each node.
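
    As a sketch of why the fixed list matters, the following runs nodesToCSV over plain JavaScript objects standing in for content nodes (an assumption, made so that the snippet runs outside Sling). The two objects define their properties in different orders, yet the columns stay aligned because the same propertyNames array drives every row. The sample rows are made up.

```javascript
// Escape field data per RFC 4180
function escapeData( data ) {
    data = String( data ).replace( /"/g, "\"\"" );
    if ( data.match( /,|(\r\n)|"/ ) )
        return "\"" + data + "\"";
    return data;
}

// Convert each node to one CSV row, with columns in propertyNames order
function nodesToCSV( nodes, propertyNames ) {
    var records = [];
    for ( var i = 0; i < nodes.length; i++ ) {
        var aRecord = [];
        for ( var k = 0; k < propertyNames.length; k++ )
            aRecord.push( escapeData( nodes[ i ][ propertyNames[ k ] ] ) );
        records.push( aRecord.join( "," ) );
    }
    return records.join( "\r\n" );
}

// Plain objects standing in for JCR nodes (hypothetical sample data);
// note the differing property order in the two objects
var nodes = [
    { Title: "Terminator 2", Year: 1991, Actor: "Schwarzenegger, A." },
    { Actor: "Ford, Harrison", Title: "Blade Runner", Year: 1982 }
];

var csv = nodesToCSV( nodes, [ "Title", "Year", "Actor" ] );
```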

    When I placed csv.esp in my repository under /apps/films and navigated to http://localhost:7402/content/films.csv, CRX dutifully fired my script and produced a CSV file containing all of the data from my /films content nodes. My browser, in turn, informed me that I was downloading a file of type "csv" and asked which program I wanted to use to open it. I specified scalc.exe, and OpenOffice loaded the file as a spreadsheet.

    So far, I've shown how to render /films data as HTML, SVG, and CSV. Next time, I want to show a simple trick for rendering the data as PDF. It's easier than you think!

    Posted by Michael Marth JUL 26, 2010

    It's always good to get a glimpse into the approaches taken by non-OSS JCR implementations. In a recent technical article on the developerWorks website, Malarvizhi Kandasamy describes how IBM goes about JCR full-text search. The actual engine is Juru, which is a Java library developed by the IBM Haifa research lab.

    According to the article, Juru is capable of some natural language processing, such as stemming and finding similar spellings.

    IBM uses a JCR compliant repository in a number of their products, e.g. Lotus Web Content Management or WebSphere Portal.

    Posted by Kas Thomas JUL 22, 2010

    A few days ago, I talked about how to "shred and store" a spreadsheet -- i.e., how to push rows of a spreadsheet into individual nodes in CRX (one node per row, with column data stored as properties). I also gave JavaScript code for doing this in an OpenOffice macro. For testing purposes, I used the CSV file a1-film.csv, representing 1741 movies catalogued by Georgia Tech's College of Computing.

    After running my OpenOffice macro on the Georgia Tech CSV file, my CRX repository now contains movie data (Title, Director, Year, etc.) for 1741 films, each film with its own nt:unstructured node under the path /content/films/. In the CRX Content Explorer, a given node (in this case, the node at http://localhost:7402/content/films/terminator_2) looks something like this:

    [Image: the terminator_2 node in the CRX Content Explorer]

    Notice that the spreadsheet's column data now show up as properties (Actor, Actress, Director, etc.) with values like "Schwarzenegger, A.," "Hamilton, Linda," and so forth.

    Notice also that I've included a property of sling:resourceType, with a value of "films," for every movie node. This is important, because it tells Sling to look under /apps/films/ for any runtime scripts that may need to be applied in order to render a particular node (such as http://localhost:7402/content/films/terminator_2).
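
    As a rough mental model of that lookup, the sketch below maps a resource type and a request extension to a candidate script path. This is a deliberate simplification: real Sling script resolution also weighs selectors, the request method, and other search paths, and candidateScript is a hypothetical helper rather than a Sling API.

```javascript
// Simplified model of script lookup: resource type + extension -> script path.
// Hypothetical helper; it ignores selectors, request methods, and fallbacks.
function candidateScript( resourceType, extension ) {
    return "/apps/" + resourceType + "/" + extension + ".esp";
}

var htmlScript = candidateScript( "films", "html" );  // for a request ending in .html
var csvScript  = candidateScript( "films", "csv" );   // for a request ending in .csv
```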

    Let's see how this works in practice. Suppose I want to render a movie node as HTML. I could put a file called html.esp under /apps/films/, containing the following markup:

    <html>
    <head>
    <link rel="stylesheet" type="text/css" href="/apps/films/films.css" />
    </head>
    <body>

    <img src="/apps/films/Film.png" width="95" height="92" />
    <br/>

    <span class="head"> <%= currentNode.Title %> </span><br/>

    <span class="normal">Director:&nbsp;&nbsp;&nbsp;</span>
    <span class="tdata"><%= currentNode.Director %></span><br/>

    <span class="normal">Year:&nbsp;&nbsp;&nbsp;</span>
    <span class="tdata"><%= currentNode.Year %></span><br/>

    <span class="normal">Genre:&nbsp;&nbsp;&nbsp;</span>
    <span class="tdata"><%= currentNode.Subject %></span><br/>

    <span class="normal">Actor:&nbsp;&nbsp;&nbsp;</span>
    <span class="tdata"><%= currentNode.Actor %></span><br/>

    <span class="normal">Actress:&nbsp;&nbsp;&nbsp;</span>
    <span class="tdata"><%= currentNode.Actress %></span><br/>

    <span class="normal">Runtime:&nbsp;&nbsp;&nbsp;</span>
    <span class="tdata"><%= currentNode.Length %> minutes</span><br/>

    <span class="normal">Popularity:&nbsp;&nbsp;&nbsp;</span>
    <span class="tdata"><%= currentNode.Popularity %></span><br/>

    </body>
    </html>

    As it turns out, I've also got a small PNG graphic, Film.png, located under /apps/films/, as well as a CSS file called (what else?) films.css, which looks like:

    .head {
    font-family:Tahoma;
    font-size:32pt;
    fill:#990000;
    color:#990000;
    }

    .normal {
    font-family:Tahoma;
    font-size:15pt;
    fill:#444444;
    }

    .tdata {
    font-family:Verdana;
    font-size:15pt;
    font-weight: bold;
    fill:#992222;
    color:#992222;
    }

    With these files in place, I can now direct my browser to go to http://localhost:7402/content/films/terminator_2.html, and Sling will automatically detect the need to use the html.esp script to render the node as HTML. The resulting rendition looks something like this:

    [Image: HTML rendition of the terminator_2 node]

    But suppose I want to be able to provide a Scalable Vector Graphics rendition, for browsers (like Firefox) that can render SVG. Not a problem: All I need to do is create a script called svg.esp and place it under /apps/films/. The svg.esp script might look something like this:

    <?xml version="1.0" standalone="no"?>
    <?xml-stylesheet type="text/css" href="/apps/films/films.css" ?>
    <!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN"
    "http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd">

    <svg width="100%" height="100%" version="1.1"
    xmlns="http://www.w3.org/2000/svg"
    xmlns:xlink="http://www.w3.org/1999/xlink">

    <!--  Add a custom filter effect  -->
     <defs>
        <filter id="MyFilter" filterUnits="userSpaceOnUse" x="0" y="0" width="600" height="400">
          <feGaussianBlur in="SourceAlpha" stdDeviation="4" result="blur"/>
          <feOffset in="blur" dx="4" dy="4" result="offsetBlur"/>

          <feSpecularLighting in="blur" surfaceScale="5" specularConstant=".75"
       specularExponent="20" lighting-color="#992222"  
       result="specOut">
            <fePointLight x="-5000" y="-10000" z="20000"/>
          </feSpecularLighting>
          <feComposite in="specOut" in2="SourceAlpha" operator="in" result="specOut"/>
          <feComposite in="SourceGraphic" in2="specOut" operator="arithmetic"
       k1="0" k2="1" k3="1" k4="0" result="litPaint"/>
          <feMerge>
            <feMergeNode in="offsetBlur"/>
            <feMergeNode in="litPaint"/>
          </feMerge>

        </filter>
      </defs>

    <image x="20" y="20" width="95" height="92" xlink:href="/apps/films/Film.png"/>

    <!-- Apply the filter to Title -->
     <g filter="url(#MyFilter)" >
        <g  transform="matrix(1.15 0 0 1 0 0)"  >
          <text class="head" x="18" y="160" >
          <%= currentNode.Title %></text>      
        </g>
      </g>

    <text class="normal" x="20" y="200" >Director</text>
    <text class="tdata" x="200" y="200"><%= currentNode.Director %></text>

    <text class="normal" x="20" y="230" >Genre</text>
    <text x="200" y="230" class="tdata"><%= currentNode.Subject %></text>

    <text class="normal" x="20" y="260" >Year</text>
    <text x="200" y="260"  class="tdata"><%= currentNode.Year %></text>

    <text class="normal" x="20" y="290" >Actor</text>
    <text x="200" y="290" class="tdata"><%= currentNode.Actor %></text>

    <text class="normal" x="20" y="320" >Actress</text>
    <text x="200" y="320" class="tdata"><%= currentNode.Actress %></text>

    <text class="normal" x="20" y="350" >Runtime</text>
    <text x="200" y="350" class="tdata"><%= currentNode.Length %></text>

    <text class="normal" x="20" y="380" >Popularity</text>
    <text x="200" y="380" class="tdata"><%= currentNode.Popularity %></text>

    </svg>

    The big <defs> section near the beginning is an (optional) SVG filter effect, designed to provide a little extra visual appeal to the movie title by giving it a drop-shadow. The result (as rendered by Firefox) looks like this:

    [Image: SVG rendition of the terminator_2 node, as rendered by Firefox]

    Of course, in a real-world component or application, you would have logic somewhere (whether on the server side or in client-side code) that detects the type of browser the user has and fetches the SVG rendition only if the user's browser is SVG-capable. 
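
    Here is a minimal server-side sketch of that decision, assuming the browser advertises SVG support in its Accept header. Not every SVG-capable browser does, so treat this as one heuristic among several. The name chooseExtension is hypothetical; in an ESP script the header value would presumably come from request.getHeader("Accept").

```javascript
// Hypothetical helper: pick a rendition extension from an Accept header value.
function chooseExtension( acceptHeader ) {
    // A missing header is treated as an empty string
    var accept = String( acceptHeader || "" ).toLowerCase();

    // image/svg+xml is the registered media type for SVG
    if ( accept.indexOf( "image/svg+xml" ) != -1 )
        return "svg";

    // fall back to the HTML rendition
    return "html";
}

var a = chooseExtension( "text/html,image/svg+xml,*/*;q=0.8" );
var b = chooseExtension( "text/html,application/xhtml+xml" );
var c = chooseExtension( null );
```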


    Posted by Alexander Saar JUL 22, 2010

    Comments 2

    Since version 2.0, CRX has shipped with CRXDE Lite (CRX Development Environment - Lite), a web-based tool that eases the development of CRX-based applications. CRXDE Lite is implemented using the ExtJS JavaScript library and aims to replace the CRX 1.x Content Explorer with a modern AJAX-based repository editor and browser. It also provides improved search and code editing, along with integrations for code version management and the handling of non-scripted code. As a tool primarily for developers, it also offers server-side development features such as compilation of Java code, OSGi bundle creation and autodeployment, and a project wizard.

    You can access CRXDE Lite by clicking the "Develop" button on the CRX Welcome screen, or by entering the http://localhost:7402/crx/de/ URL in your browser (assuming the default installation).

    This article gives an overview of the features that are provided by CRXDE Lite.

    Editing

    CRXDE Lite provides all the basic features for node and property editing that were already available in CRX Explorer. In addition, it provides in-place editing for files stored in the repository, such as CSS, JavaScript, HTML, Java, and JSP files. For file editing we integrated Christophe Dolivet's EditArea, a fully web-based code editor that comes with a pluggable model for syntax highlighting. Future versions of CRXDE Lite may also support other (HTML5) editors, since it comes with a plugin model that allows other functionality, such as editors, to be hooked in.

    Hint: A hidden feature that comes in handy when editing large files is the maximize edit area function. As you may have noticed, the panels that contain the property list and repository tree can be collapsed to get more display space for the editor. You can also collapse or expand both at once by double-clicking the title of the editor tab.

    Searching

    CRXDE Lite provides many ways to search and find content.

    Path search. The most obvious is the path search field at the top. Some may already know this from CQ. It provides selection of a node based on its path as well as autocompletion. You can also just enter the name of a node (e.g., html.jsp) and it will show you a list of all nodes with this name. Upon selection the path of this node is inserted into the field so you can go directly to that node by hitting Enter.

    While name and path search is very valuable if you know the structure of your content, sometimes you are looking for things that could be located anywhere in the repository. Examples are uses of a class in code files, or places where a certain Apache Sling resource type is used. Two features are provided for such cases.

    Full-text search on Home screen. It not only provides plain full text search but also has full support for the Jackrabbit GQL (Google Query Language) implementation which allows you to restrict searches to a path or node type.

    Using GQL on the CRXDE Lite Home screen, it is trivial to find, for example, all content that has a given Sling resource type (and will therefore be rendered by the specified component). Example:

    "sling:resourceType":bookstore/components/product

    will find all pieces of content that will be rendered using the bookstore/components/product component. You might want to try this query out after you have installed our Bookstore sample application.

    Query Editor. Another way to do advanced content search is the query editor. This is similar to the query feature in CRX Explorer and can be used to find content or to test the function and performance of queries that you plan to use in your application. In contrast to CRX Explorer, you can open multiple query editors and compare or re-run queries for optimization. If you just want to test the performance of a query with large content sets, you can uncheck the "Display Results" checkbox, which prevents the hits from being rendered and displays only the size of the result set. If you choose to display all hits, a double click on a hit will select the related node in the content tree.

    Tree Filter. Last but not least, a tree filter lets you filter the content tree that comes with CRXDE Lite. This is useful not only for finding content when you know the name of a node, but also for reducing the number of nodes that are displayed in the tree.

    Note that the tree filter will only show content that has already been loaded (visited) in the tree browser in the current user session, so it's a quick convenience feature for filtering out certain JCR nodes you've been working with recently.

    Code and Packages

    To make CRXDE Lite a full CRX IDE, integrations for compilation and bundle generation were released with the latest version of CRX. With these you can create and build OSGi bundles, or just compile Java classes that are stored directly in your repository. All classes that are exported by the bundles installed into the CRX OSGi runtime are available for your code and scripts.

    In addition, CRXDE Lite provides full integration for the new CRX SVN integration feature, which allows you to check out content directly from SVN into your CRX instance, modify it, and commit the changes.

    The current version of CRX ships with a new and improved package manager. One interesting thing to mention is that if you install or preview package content, the list of content in the Activity Log contains links to directly open that content in CRXDE Lite. You can navigate to the given piece of content by clicking the link.

    What's next?

    In one of the next articles in the CRX Gems series I'll write about the internal architecture of CRXDE Lite.

    Posted by Kas Thomas JUL 19, 2010

    Comments 2

    The first version of this post originally was published here.

    Lately I've been doing a fair amount of server-side scripting using ESP (ECMAScript Pages) in Sling. At first blush, such pages tend to look a lot like Java Server Pages, since they usually contain a lot of scriptlet markup, like:

    <%  // script code here  %>

    and

    <%=  // stuff to be evaluated here  %>

    So it's tempting to think ESP pages are simply some different flavor of JSP. But they're not. From what I can tell, ESP pages are just server pages that get handed to an EspReader before being served out. The EspReader, in turn, handles the interpretation of scriptlet tags and expression tags (but doesn't compile anything into a servlet). Bottom line, ESP is not JSP, and despite the availability of scriptlet tags, things work quite a bit differently in each case.

    Suppose you want to detect, from an ESP page or a JSP page, what kind of browser a given page request came from. In a Sling JSP page you could do:

    <%@taglib prefix="sling" uri="http://sling.apache.org/taglibs/sling/1.0" %>

    <sling:defineObjects/>
    <html><body>

    <%
    java.util.Enumeration c = request.getHeaders("User-Agent");

    String s = "";

    while ( c.hasMoreElements() )
        s += c.nextElement();
    %>

    <%= s %>
    </body></html>

    But what do you do in ESP? Remember, <sling:defineObjects/> is not available in ESP.

    It turns out that Sling automatically (without the need for any directives) exposes certain globals to the JavaScript Context at runtime, and one of them is a request object. Thus, in ESP you'd simply do:

    <%

    var c = request.getHeaders("User-Agent");

    var s = "";

    while ( c.hasMoreElements() )
        s += c.nextElement();

    %>

    <%= s %>

    Very similar to the JSP version.

    So the next question I had was, what are the other globals that are exported into the JavaScript runtime scope by Sling? From what I can determine, the Sling globals available in ESP are:

    currentNode
    currentSession
    log
    out
    reader
    request
    resource
    response
    sling

    currentNode is the JCR node underlying the current resource; currentSession is what it sounds like, a reference to the current Session object; log refers to the org.slf4j.Logger; out is the java.io.PrintWriter for writing response content; reader returns request.getReader(), which allows for reading the request body; request is a reference to the SlingHttpServletRequest; resource is the current Resource; response is, of course, a reference to the SlingHttpServletResponse; and sling is a SlingScriptHelper. All of these are available all the time, throughout the life of any ESP script in Sling.
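
    To illustrate how code might be written against these globals, here is a sketch that wraps the earlier header loop in a function. So that it can run outside Sling, the request object below is a stub that mimics only getHeaders( ) and the two java.util.Enumeration methods the loop needs; inside an ESP script you would pass the real request global instead. The names headerValues and stubRequest are hypothetical.

```javascript
// Collect all values of a header into one array.
// `req` is expected to offer getHeaders( name ), returning an object with
// hasMoreElements( ) and nextElement( ), as java.util.Enumeration does.
function headerValues( req, name ) {
    var values = [];
    var e = req.getHeaders( name );
    while ( e.hasMoreElements() )
        values.push( String( e.nextElement() ) );
    return values;
}

// Stub standing in for SlingHttpServletRequest (for illustration only)
function stubRequest( headers ) {
    return {
        getHeaders: function( name ) {
            var list = headers[ name ] || [];
            var i = 0;
            return {
                hasMoreElements: function() { return i < list.length; },
                nextElement: function() { return list[ i++ ]; }
            };
        }
    };
}

var req = stubRequest( { "User-Agent": [ "Mozilla/5.0" ] } );
var agents = headerValues( req, "User-Agent" );
var missing = headerValues( req, "X-Not-Sent" );
```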

    The nice part about server-side scripting in Sling (one of many nice parts), incidentally, is that you don't have to choose to do just ESP pages or just JSP; you can write an ESP handler for one situation and a JSP for another, and use ESP/JSP in any combination. You're not locked into one technology or the other.

    For more information, try the Sling Javadocs here or Day's page of resources here (note, in particular, the list of References on the right).

    Posted by Michael Marth JUL 16, 2010

    The current board of directors of the Apache Software Foundation has just been elected - congratulations to:

    • Shane Curcuru
    • Doug Cutting
    • Bertrand Delacretaz
    • Roy T. Fielding
    • Jim Jagielski
    • Sam Ruby
    • Noirin Shirley
    • Greg Stein
    • Henri Yandell

    Roy and Bertrand are colleagues of mine at Day Software.

    To find out more about what the board actually does, have a look at "How the ASF works".

    Posted by Kas Thomas JUL 16, 2010

    Comment 1

    In a recent blog, I talked about how easy it is to store snippets of text from OpenOffice in a CRX repository using a little bit of JavaScript and the Sling REST API. While being able to store arbitrary bits of text this way is certainly useful, it would be even more useful to be able to store spreadsheet data. Of course, storing a spreadsheet in CRX, per se, is not much of a challenge: with WebDAV, it's a matter of drag and drop. But storing an entire spreadsheet as a single monolithic content item doesn't necessarily give you the greatest content-management bang for the buck. Often, what you really want to do is granularize the spreadsheet into records (or row data), and store individual rows as content items. (You could take it further and store individual cells as content items, but that would probably be overkill for most situations, although there's certainly nothing preventing you from doing it.)

    In the database world, where decisions often have to be made as to how best to decompose an XML document when mapping it to tables in a database, this general process (of decomposing a large document along the lines of its natural internal fine-structure) is known as shredding. What would be handy is to have an OpenOffice macro that could shred a spreadsheet into rows, and push the rows into nodes in CRX. That's what I propose to show you right now.

    It turns out to be pretty easy to parse a spreadsheet in an OpenOffice macro. Using JavaScript:

       // First, get the document object
       // from the scripting context
       oDoc = XSCRIPTCONTEXT.getDocument();

       // Next, get the XSpreadsheetDocument
       // interface from the document
       xSDoc = UnoRuntime.queryInterface(XSpreadsheetDocument, oDoc);

       // Then get a reference to the sheets for this doc
       var sheets = xSDoc.getSheets();

       // get Sheet1
       var sheet1 = sheets.getByName("Sheet1");

    Once you've gotten the sheet reference, you can use it to obtain a cell reference:

    var cell = sheet1.getObject().getCellByPosition( column, row );

    The cell, in turn, contains data, which (depending on whether you're dealing with a native OpenOffice spreadsheet or a freshly imported CSV file) can be a floating-point value, a string, or something else. For purposes of this discussion I'm going to assume that you've just imported a CSV or tab-delimited file into OpenOffice, in which case all cells will automatically contain string data. To get the string data from a cell in a freshly imported CSV file, you have to do:

    var content = cell.getFormula();

    At least, that's what works in OpenOffice 3.2.

    The general plan of attack, then, is to come up with a function that can parse a row's worth of data out of a spreadsheet; and have another function that can persist a row of data as a content item in CRX. Then it should be possible to create a macro that simply loops over all rows in a spreadsheet and pushes them out to the repository.

    The row-parsing function is pretty straightforward:

    function getRow( sheet, rownumber, startColumn, endColumn )  {

        var obj = sheet.getObject();
        var record = [];

        for (var k = startColumn; k < endColumn ; k++) {
             var cell = obj.getCellByPosition( k, rownumber );
             var content = cell.getFormula();
             record.push( content );
        }

        return record;
    }

    Given a reference to a Sheet, along with a row number and the starting and ending column numbers, this function loops through cells and pushes cell values into an array. The returned array represents a row's worth of data.
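
    To see the function's behavior without OpenOffice, here is a sketch that runs it against a stub sheet. The stub is hypothetical and implements only the three calls getRow makes (getObject, getCellByPosition, getFormula); the cell values are made up. Note that endColumn is exclusive: the loop stops before reaching it.

```javascript
// Same logic as getRow above
function getRow( sheet, rownumber, startColumn, endColumn ) {
    var obj = sheet.getObject();
    var record = [];
    for ( var k = startColumn; k < endColumn; k++ ) {
        var cell = obj.getCellByPosition( k, rownumber );
        record.push( cell.getFormula() );
    }
    return record;
}

// Stub sheet backed by a two-dimensional array: grid[ row ][ column ]
function stubSheet( grid ) {
    var cells = {
        getCellByPosition: function( column, row ) {
            return { getFormula: function() { return grid[ row ][ column ]; } };
        }
    };
    return { getObject: function() { return cells; } };
}

var sheet = stubSheet( [
    [ "Year", "Length", "Title" ],
    [ "1991", "137", "Terminator 2" ]
] );

var header   = getRow( sheet, 0, 0, 3 );  // all three columns of row 0
var firstTwo = getRow( sheet, 1, 0, 2 );  // columns 0 and 1 only
```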

    To persist a row to CRX, we have a function that looks like this:

    function persistRow( sheet, rownumber, startColumn, endColumn ) {

       // get first row of data (column names)
       var columnNames = getRow( sheet, 0, startColumn, endColumn );

       // get specified record
       var row = getRow( sheet, rownumber, startColumn, endColumn );

       // build the request
       var request = {};
       request[":nameHint"] = row[2]; // Title
       request["sling:resourceType"] = "films";
       for ( var i = 0; i < columnNames.length; i++) {
           request[ columnNames[ i ] ] = row[ i ];
       }   
       var data = createRequest( request );

       // where to store it
       var url = "http://localhost:7402/content/films/";

       // finally, hit the repository
       var response = doJavaPOST( url, data );

       return response;
    }

    Notice that the code assumes that the first row of "data" in the spreadsheet contains the column names. This was in fact the case with the spreadsheet I used for testing this macro, namely a1-film.csv, representing 1741 movies catalogued by Georgia Tech's College of Computing. Each row in the spreadsheet has information for a particular film, such as the film's title, the year the film was made, its genre, the name of the director, major actors and actresses, etc.
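
    As a sketch of what ends up in the POST body, the following pairs a header row with a data row and form-encodes the result. The helper names buildProperties and toFormBody are hypothetical, and the sample values are made up. This version passes names and values through encodeURIComponent, on the assumption that cell text may contain characters such as & or = that would otherwise corrupt the form data.

```javascript
// Pair column names with cell values (hypothetical helper)
function buildProperties( columnNames, row ) {
    var props = {};
    for ( var i = 0; i < columnNames.length; i++ )
        props[ columnNames[ i ] ] = row[ i ];
    return props;
}

// Encode an object as application/x-www-form-urlencoded data
function toFormBody( object ) {
    var data = [];
    for ( var name in object )
        data.push( encodeURIComponent( name ) + "=" +
                   encodeURIComponent( String( object[ name ] ) ) );
    return data.join( "&" );
}

var props = buildProperties( [ "Year", "Title" ], [ "1989", "Tango & Cash" ] );
var body = toFormBody( props );
```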

    Without further ado, here is the complete code for the OpenOffice macro:



    // Spreadsheet2CRX Macro
    // Kas Thomas, 15 July 2010
    // Public domain. Use at your own risk.
    // Tested with v3.2 of OpenOffice.org

    importClass(Packages.com.sun.star.uno.UnoRuntime);
    importClass(Packages.com.sun.star.sheet.XSpreadsheetDocument);

    // Do a POST
    function doJavaPOST( url, content ) {
            var reply = "";
            var responseCode = "";
            try {
                    var URL = new java.net.URL( url );
                    var urlConn =
                       URL.openConnection( );
                    urlConn.setDoOutput ( true );
                    urlConn.setRequestMethod( "POST" );
                    urlConn.setUseCaches( false );
                    urlConn.setRequestProperty ("Content-Type",
                    "application/x-www-form-urlencoded" );
                    var printout =
                    new java.io.DataOutputStream ( urlConn.getOutputStream ( ) );
                    printout.writeBytes ( content );
                    printout.flush ( );
                    printout.close ( );
                    responseCode = urlConn.getResponseCode();
            }
            catch(exception) {
                    java.lang.System.out.println( exception.toString() );
            }

            return responseCode;
    }

    // munge together the form data
    // into "name1=value1&name2=value2" etc
    function createRequest( object ){

            var data = [];
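            for ( var i in object )
                    // percent-encode so "&", "=" and spaces in cell values don't corrupt the form body
                    data.push( encodeURIComponent( i ) + "=" + encodeURIComponent( String( object[ i ] ) ) );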

            var dataString = data.join( "&" );
            return dataString;
    }

    // Modal dialog with OK/cancel and a text field
    function prompt( msg ) {
            var swing = Packages.javax.swing;
            var text = swing.JOptionPane.showInputDialog(
            new java.awt.Frame(), msg );
            return ( null == text ) ? "" : text; // always return a string
    }

    // a Swing UI for displaying console info
    function EditorPane( ) {

            var Swing = Packages.javax.swing;
            this.pane = new Swing.JEditorPane("text/html","" );
            this.jframe = new Swing.JFrame( );
            this.jframe.setBounds( 100,100,500,400 );
            var editorScrollPane = new Swing.JScrollPane(this.pane);
            editorScrollPane.setVerticalScrollBarPolicy(
            Swing.JScrollPane.VERTICAL_SCROLLBAR_ALWAYS);
            editorScrollPane.setPreferredSize(new java.awt.Dimension(250, 250));
            editorScrollPane.setMinimumSize(new java.awt.Dimension(10, 10));
            // add the scroll pane before showing the frame so it is laid out on first display
            this.jframe.getContentPane().add( editorScrollPane );
            this.jframe.setVisible( true );

            // public methods
            this.getPane = function( ) { return this.pane; }
            this.getJFrame = function( ) { return this.jframe; }
    }

    function getRow( sheet, rownumber, startColumn, endColumn )  {

            var obj = sheet.getObject();
            var record = [];

            for (var k = startColumn; k < endColumn ; k++) {
                    var cell = obj.getCellByPosition( k, rownumber );
                    var content = cell.getFormula();
                    record.push( content );
            }

            return record;
    }

    function persistRow( sheet, rownumber, startColumn, endColumn ) {

            // get first row of data (column names)
            var columnNames = getRow( sheet, 0, startColumn, endColumn );

            // get specified record
            var row = getRow( sheet, rownumber, startColumn, endColumn );

            // build the request
            var request = {};
            request[":nameHint"] = row[2]; // Title
            request["sling:resourceType"] = "films";
            for ( var i = 0; i < columnNames.length; i++) {
                    request[ columnNames[ i ] ] = row[ i ];
            }
            var data = createRequest( request );

            // where to store it
            var url = "http://localhost:7402/content/films/";

            // finally, hit the repository
            var response = doJavaPOST( url, data );

            return response;
    }

    ( function main( ) {

            //get the document object from the scripting context
            oDoc = XSCRIPTCONTEXT.getDocument();

            //get the XSpreadsheetDocument interface from the document
            xSDoc = UnoRuntime.queryInterface(XSpreadsheetDocument, oDoc);

            // get a reference to the sheets for this doc
            var sheets = xSDoc.getSheets();

            // get Sheet1
            var sheet1 = sheets.getByName("Sheet1");

            // construct a new EditorPane
            var editor = new EditorPane( );
            var pane = editor.getPane( );

            var size = prompt("Enter total rows and total columns, separated by a comma (e.g., '100,8')");
            if ( !size )
                    return "No row/column info supplied.";

            var rows = Number( size.substring(0,size.indexOf(",")) );
            var cols = Number( size.substring( size.indexOf(",")+1) );

            var errors = 0;
            for ( var i = 1; i <= rows; i++) {
                    var response = persistRow( sheet1, i, 0, cols );
                    var text = pane.getText();
                    pane.setText( text + "\nProcessing: " + i );
                    if ( response.toString().indexOf("5")==0 )
                            errors++;
                    // provide a little bit of throttling:
                    java.lang.Thread.sleep( 200 );
            }
            pane.setText( pane.getText() + "\n" + errors + " errors" );
    })();




    You'll notice that the code creates a JEditorPane window to act as an error console. When you run the macro, a JOptionPane dialog appears, asking you to supply the number of rows and columns in the spreadsheet. (For the Georgia Tech spreadsheet, you can enter "1741,8", minus quotes.) Once you dismiss the dialog, the code goes to work looping over all the rows in the spreadsheet, posting each row to CRX at a path of http://localhost:7402/content/films/.
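
    The macro splits the dialog input at the first comma and does not check that a comma is actually there. If you want something a little more defensive, a sketch along the following lines could replace the two Number(...) lines. The parseSize name and the choice to return null on bad input are my own assumptions, not part of the macro.

```javascript
// Parse "rows,cols" (for example "1741,8") into two positive integers.
// Returns null when the input is not in that shape.
function parseSize(input) {
    var match = /^\s*(\d+)\s*,\s*(\d+)\s*$/.exec(String(input));
    if (!match) {
        return null;
    }
    var rows = parseInt(match[1], 10);
    var cols = parseInt(match[2], 10);
    if (rows < 1 || cols < 1) {
        return null;
    }
    return { rows: rows, cols: cols };
}

var ok = parseSize("1741,8");  // { rows: 1741, cols: 8 }
var bad = parseSize("1741");   // null: no comma, so no column count
```

    The caller can then bail out with a message when the result is null instead of looping over a nonsensical range.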

    Each new node is named according to a :nameHint parameter based on the Title of the film.

    Notice also that we designate a sling:resourceType of "films" for each node. (This happens in the persistRow() function.) This fact will be important in a later blog when I show how to write server-side scripts that handle various types of requests for film data.

    And that's about it: Now you know how to shred a spreadsheet (say that 3 times in a row fast...) and store the results in CRX, using OpenOffice.

    Posted by Kas Thomas JUL 13, 2010

    This post originally appeared here.

    Over the past couple of days, I've been blogging a fair amount about Mozilla Rhino. One of the surprising (to me) things about Rhino is how much faster it is on its own (i.e., when you include js.jar in your classpath) than when you use the Rhino-based scripting engine that comes embedded in the JRE. (See previous blog.)

    In the past few days I've also been spending a lot of time with Apache Sling. Imagine my relief to find that Apache Sling uses Rhino proper (js.jar version 1.6R6) rather than relying on the JRE's onboard scripting engine. This means server-side EcmaScript runs much faster in Sling than it otherwise would. But Sling's use of Rhino 1.6R6 is a big win in another way as well. It turns out 1.6R6 is the first Rhino build to feature onboard support for E4X (the EcmaScript extensions for XML, otherwise known as ECMA-357).

    Rhino has had E4X support for some time, but until recently it's been a patched-on kind of support relying on the external xbean.jar (which was originally created by BEA, in pre-Oracle days). Prior to Rhino 1.6R6, you had to have xbean.jar in your classpath in order to have E4X support. Now it's built-in. No more need for xbean.jar.

    So it turns out you can use E4X grammar in your server-side scripts for Sling, which, I gotta say, is a huge turn-on (if you're as big a geek as I am).

    I'll be blogging more about E4X in Sling in coming days at dev.day.com. In the meantime, I thought I'd leave you with a quick example of what you can do with E4X on the server side.

    Recently, I ran into a situation where I had the following bit of markup in an .esp (server-side EcmaScript) file:

    <fields>
        <field name="Director">
            <value><%= currentNode["Director"] %></value>
        </field>
        <field name="Genre">
            <value><%= currentNode.Genre %></value>
        </field>
        <field name="Language">
            <value><%= currentNode.Language %></value>
        </field>
        <field name="Movie">
            <value><%= currentNode.Movie %></value>
        </field>
        <field name="Released">
            <value><%= currentNode.Released %></value>
        </field>
        <field name="Runtime">
            <value><%= currentNode.Runtime %></value>
        </field>
        <field name="Starring">
            <value><%= currentNode.Starring %></value>
        </field>
        <field name="Writers">
            <value><%= currentNode.Writers %></value>
        </field>
    </fields>
     

    Now mind you, there's absolutely nothing wrong with having markup that looks like this in an .esp file; it's fine as-is. But if you're an XML scripting geek, you see a situation like this and you inevitably start looking at ways to "roll up" all this verbosity into 2 or 3 lines of E4X. And sure enough, this is what I came up with:

    <%
    fields = <fields/>;

    names = ["Movie","Director","Genre",
    "Language","Released","Runtime",
    "Starring","Writers"];

    for (var i = 0; i < names.length;i++) {
        field = <field>
                    <value>{currentNode[names[i]]}</value>
                </field>;
        field.@name = names[i];
        fields.* += field;
    }
    %>
    <%= fields.toXMLString() %>

    Concise to a fault. Arguably, it's not as readable as the fully unrolled markup (particularly if you're not a scriptomaniac), but if you're well-versed in E4X, it's perfectly clear what's going on, and it shortens the .esp file to where all code fits on one screen without scrolling. (Always a good thing, in my book.)
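
    For comparison, here is roughly the same roll-up in plain JavaScript, for engines that have no E4X support. It is a sketch, not what the .esp file uses: fieldsAsXml and escapeXml are illustrative names, and "node" is a plain object standing in for Sling's currentNode. Without E4X the XML escaping has to be done by hand.

```javascript
// Escape the characters that are special in XML text and attribute values.
function escapeXml(value) {
    return String(value)
        .replace(/&/g, "&amp;")
        .replace(/</g, "&lt;")
        .replace(/>/g, "&gt;")
        .replace(/"/g, "&quot;");
}

// Build the <fields> block from a list of property names.
// "node" is a plain object standing in for Sling's currentNode.
function fieldsAsXml(node, names) {
    var lines = ["<fields>"];
    for (var i = 0; i < names.length; i++) {
        lines.push('  <field name="' + escapeXml(names[i]) + '">');
        lines.push("    <value>" + escapeXml(node[names[i]]) + "</value>");
        lines.push("  </field>");
    }
    lines.push("</fields>");
    return lines.join("\n");
}

var node = { Movie: "Terminator 2", Genre: "Sci-Fi & Action" }; // sample values
var xml = fieldsAsXml(node, ["Movie", "Genre"]);
```

    It is noticeably longer than the E4X version, which is rather the point of using E4X where it is available.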

    As I say, I'll be writing more about this sort of thing on dev.day.com soon.

    In the meantime, if you're new to E4X, I recommend taking a look at this article on IBM's Developerworks site. It'll get you up-to-speed quickly.

    Posted by Kas Thomas JUL 12, 2010

    Not long ago, I wrote about possible ways to get interaction to happen between Adobe Acrobat and Day CRX, and I gave an example of how to use a PDF form to push content into CRX. That's the simplest way to get content into the repository using Acrobat, but it's certainly not the only way. As it turns out, Acrobat's JavaScript API supports more sophisticated AJAX-style asynchronous communication back and forth between Acrobat and a host. That's what I'd like to talk about now.

    There are some important differences between what I'll call Acrobat AJAX and ordinary (browser) AJAX. The most important difference is that with Acrobat (and here, I'm talking about Acrobat Professional, not Acrobat Reader; unlike my last blog, everything we're going to talk about today requires a full copy of Acrobat), your AJAX scripts are scoped to the application (that is, Acrobat itself) rather than to the document, and in fact your script(s) can only run outside of document scope: the relevant API methods are prevented (by security restrictions) from executing as part of a document. So you can't just attach scripts to a PDF document's form fields, say, and expect to do AJAX. Instead, you have to put scripts in a /Javascripts folder on your local drive, under your /Acrobat install path. Acrobat registers the scripts on program startup, and they remain in scope for the duration of an Acrobat session (regardless of how many documents you open). In this sense, you can think of an Acrobat AJAX script as being similar to, say, a Jetpack script in Firefox.

    You may be wondering what, then, is the user gesture for getting a so-called folder-level script to fire? In Acrobat, the standard pattern here is to expose a folder-level script as a new menu item. The Acrobat JavaScript API has a method, app.addMenuItem, that looks like this:

    app.addMenuItem({
       cName: "Save Annotations to CRX",
       cParent: "File",
       nPos: 0,
       cExec: 'myMethod();',
       cEnable: "event.rc = (event.target != null);"
    });

    The method takes a parameter block that can have several (mostly optional) properties. The cName property is the name of the new menu item. The cParent property designates the Acrobat menu in which the new menu item should live (in this case, the File menu), while nPos indicates the desired position of the new menu command in the list of commands on the menu in question. The cExec property points to the custom code you want to execute when the menu item is selected by the user.

    The optional cEnable property lets you specify whether the new menu item is enabled when the user sees it, based on certain conditions. In the example shown above, we've got code that essentially tests whether a document is already open in Acrobat. If no document is open, the menu command is greyed out.

    In ordinary (browser) AJAX, you're no doubt accustomed to using the XMLHttpRequest object to do the heavy lifting. Acrobat has its own XHR construct, called Net.HTTP.request. Like the addMenuItem() method above, the request() method of Net.HTTP takes a parameter block as an argument. There are many possible properties you can supply on this parameter block object (and they're all documented in the JavaScript for Acrobat API Reference). For our purposes, the most important are cVerb (which can be "GET", "POST", or any number of other HTTP and/or WebDAV verbs), cURL (the URL to which the request should be sent), aHeaders (a place to specify the HTTP request headers for this transmission), oRequest (the data stream for the POST), and the all-important oHandler. The latter needs to point to an object (any object) that has a response() method. The response() method of the object is called when the server is done handling your request. In other words, it's your callback method, analogous to onreadystatechange in conventional AJAX.

    When the response method of your handler is called, it gets called with four arguments. The first argument is a reference to the response body (a stream object). The second argument is just the URL to which the request was sent. The third argument points to an exception object (if the request was not successful). The fourth argument is an array of response headers returned from the server. Again, all of this is documented in Adobe's JavaScript for Acrobat API Reference and I won't belabor any of it further here.
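
    As a rough illustration of the four-argument callback, the sketch below builds a handler object whose response() method checks the exception argument before touching the body. It assumes the third argument is empty (undefined or null) on success, which is how I read the description above. makeHandler, readStream, and report are hypothetical names; the stream reader and the reporting function are passed in so the logic can be exercised outside Acrobat with fake arguments.

```javascript
// Build an oHandler-style object. readStream and report are injected so
// the logic can be exercised outside Acrobat; both names are hypothetical.
function makeHandler(readStream, report) {
    return {
        response: function (msg, uri, e, headers) {
            if (e) {
                // Third argument is assumed to be set only on failure.
                report("Request to " + uri + " failed: " + e);
                return;
            }
            report(readStream(msg));
        }
    };
}

// Stand-alone exercise with fake arguments.
var log = [];
var handler = makeHandler(
    function (stream) { return stream.text; },  // fake stream reader
    function (text) { log.push(text); }         // fake reporter
);
var target = "http://localhost:7402/content/acrobat/annots/";
handler.response({ text: "created" }, target, undefined, []);
handler.response(null, target, "timeout", []);
```

    Inside Acrobat, the two injected functions would be thin wrappers around SOAP.stringFromStream() and app.alert().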

    The code below shows an example of what you can do using Acrobat AJAX. In this example, we harvest all of the annotations (if any) in the currently open PDF document (the frontmost document, if there are multiple docs open), convert those annotations to XML, and POST the XML to CRX at a location of http://localhost:7402/content/acrobat/annots/, under a node name of "Annots for [PDFname]" (where PDFname is the file name of the PDF document from which annotations were taken).

    AjaxRequest = function(cURL) {
            this.params =
            {
                    cVerb: "POST", // default
                    cURL: cURL,
                    aHeaders: [
                    { name: "Content-Type",
                            value: "application/x-www-form-urlencoded"
                    }
                    ],

                    oRequest: null,

                    oHandler:
                    {
                            response: function(msg, uri, e,h){
                                    var stream = msg;
                                    var string = "";
                                    string = SOAP.stringFromStream( stream );
                                    app.alert( string );
                            }
                    }
            };

            this.invoke = function( ) {

                    Net.HTTP.request(this.params);
            }
    }

    // Prepare an AjaxRequest
    function createRequest( annots, fileName, url ) {

            var ajax = new AjaxRequest( );

            // where to store data in CRX:
            ajax.params.cURL = url;

            ajax.params.cVerb = "POST";

            var data = {};

            // optional redirect:
            data[":redirect"] =
            "http://localhost:7402/content/acrobat/thankyou.txt";

            // this is the name of the new node
            data[":nameHint"] = "Annots for " + fileName;

            // this is our annotation data as XML
            data.data = getAnnotationsAsXML( annots );

            var dataString = createDataString( data );

            // convert our data to a stream object
            ajax.params.oRequest =
            Net.streamFromString( dataString );

            return ajax;
    }

    // create a string of the form
    // name1=value1&name2=value2 [etc]
    function createDataString( object ) {

            var data = [];
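            for ( var i in object )
                    // percent-encode: the XML payload can contain "&" and "=", which would break the form body
                    data.push( encodeURIComponent( i ) + "=" + encodeURIComponent( object[ i ].toString( ) ) );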

            return data.join( "&" );
    }

    // converts annots to XML using E4X
    function getAnnotationsAsXML( annots ) {

            var xmlOutput = <annots></annots>;
            for ( var i = 0; i < annots.length; i++ )
            {
                    var props = annots[i].getProps();
                    xmlOutput.* += <annot/>;
                    var parent = xmlOutput.annot[i];
                    parent.* = <author>{props.author}</author>;
                    parent.* += <contents>{props.contents}</contents>;
                    parent.* += <page>{props.page}</page>;
                    parent.* += <creationDate>{props.creationDate}</creationDate>;
                    parent.* += <type>{props.type}</type>;
            }

            return xmlOutput.toXMLString();
    }


    var theURL = "http://localhost:7402/content/acrobat/annots/";

    // Add a new menu item under File
    app.addMenuItem({
            cName: "Save Annotations to CRX",
            cParent: "File",
            cExec: 'request = createRequest( this.getAnnots( ), this.documentFileName, theURL ); request.invoke();',
            cEnable: "event.rc = (event.target != null);",
            nPos: 0
    });
     

    If you have a copy of Acrobat Professional, copy and paste the above code to a text file (with an extension of .js) in your /Javascripts folder under your /Acrobat path, then restart Acrobat and you should see a new menu command appear under the File menu, called "Save Annotations to CRX."

    Note that to prevent security errors, you may have to go into Preferences (Control-K) and turn off Enhanced Security, or else add the currently open PDF document to the list of trusted docs. (In the preferences dialog, choose Security (Enhanced) in the list on the left.)

    Acrobat's JavaScript API makes it trivially easy to harvest all annotations from a PDF document with a single line of code:

    this.getAnnots()

    What you get back from this call is an array of Annotation objects (see Adobe's documentation), each of which has numerous properties that can be parsed out. We convert the annotations and properties to XML in the following method:

    // converts annots to XML using E4X

    function getAnnotationsAsXML( annots ) {

            var xmlOutput = <annots></annots>;
            for ( var i = 0; i < annots.length; i++ )
            {
                    var props =     annots[i].getProps();
                    xmlOutput.* += <annot/>;
                    var parent = xmlOutput.annot[i];
                    parent.* = <author>{props.author}</author>;
                    parent.* += <contents>{props.contents}</contents>;
                    parent.* += <page>{props.page}</page>;
                    parent.* += <creationDate>{props.creationDate}</creationDate>;
                    parent.* += <type>{props.type}</type>;
            }

            return xmlOutput.toXMLString();
    }

    You'll notice we use E4X syntax here for building the XML. If you're not familiar with it, E4X (ECMAScript extensions for XML, otherwise known as ECMA-357) constitutes a powerful -- and quite handy -- syntax for manipulating XML in ECMAScript. It is supported not only in Acrobat JavaScript but (on the server side) in Sling as well.

    When I ran this script on an annotated PDF document of my own, I got XML that looked like this:

    <annots>
      <annot>
        <author>Admin</author>
        <contents>Need more discussion of "privileges"</contents>
        <page>31</page>
        <creationDate>Mon Jul 12 2010 08:03:17 GMT-0400 (Eastern Daylight Time)</creationDate>
        <type>Underline</type>
      </annot>
      <annot>
        <author>Admin</author>
        <contents>Not sure we need to have this sentence.</contents>
        <page>29</page>
        <creationDate>Mon Jul 12 2010 08:02:39 GMT-0400 (Eastern Daylight Time)</creationDate>
        <type>Highlight</type>
      </annot>
      <annot>
        <author>Admin</author>
        <contents>Is this the correct copyright date?</contents>
        <page>1</page>
        <creationDate>Mon Jul 12 2010 08:01:44 GMT-0400 (Eastern Daylight Time)</creationDate>
        <type>Highlight</type>
      </annot>
      <annot>
        <author>Admin</author>
        <contents>We need to strike this.</contents>
        <page>729</page>
        <creationDate>Tue Jul 06 2010 14:43:57 GMT-0400 (Eastern Daylight Time)</creationDate>
        <type>Highlight</type>
      </annot>
      <annot>
        <author>Admin</author>
        <contents>I am underlining this.</contents>
        <page>729</page>
        <creationDate>Tue Jul 06 2010 14:44:12 GMT-0400 (Eastern Daylight Time)</creationDate>
        <type>Underline</type>
      </annot>
      <annot>
        <author>Admin</author>
        <contents>I liked this.</contents>
        <page>57</page>
        <creationDate>Tue Jul 06 2010 15:04:21 GMT-0400 (Eastern Daylight Time)</creationDate>
        <type>Highlight</type>
      </annot>
      <annot>
        <author>Admin</author>
        <contents>This does not seem right.</contents>
        <page>57</page>
        <creationDate>Tue Jul 06 2010 15:04:32 GMT-0400 (Eastern Daylight Time)</creationDate>
        <type>Text</type>
      </annot>
    </annots>

     

    This is what gets stored in CRX.

    A while ago, I said that AJAX scripts in Acrobat are scoped to the application and may not (for security reasons) run in the context of a given document. Given that this is so, you may be wondering, at this point, how it is that we can harvest annotations from a document programmatically in an AJAX script. The key is that our script doesn't fire until there's actually a document open in Acrobat (remember, the menu command is dimmed out if there's no PDF open). When the script does finally run, it's safe to call this.getAnnots() -- "this" will be a reference to the currently open document (and using it has no security side-effects). The only real restriction is that you can't call Net.HTTP.request() from a document-level script. Doing so will cause Acrobat to complain.

    Those are the basics of doing AJAX against a CRX repository from Acrobat. In a future blog, I'll show how to populate a PDF form with values slurped from CRX (and do it in a way that allows you to use Acrobat Reader rather than Acrobat Professional). Stay tuned!

    Posted by Kas Thomas JUL 09, 2010

    In prior blogs, I've given code for interacting via HTTP with Day's CRX repository from a Chrome/Greasemonkey script environment as well as from an OpenOffice macro. Such interactions are, as you know, made easy by CRX's (or should I say, Apache Sling's) REST API, which exposes a huge amount of functionality through plain old HTTP, obviating the need for such cumbersome things as RMI-over-IIOP, SOAP-RPC, complex content mappings on the back end, etc.

    It turns out that precisely because of the ease with which you can interact with a Sling repository over HTTP, it's quite straightforward to set up interactions between Adobe Acrobat (and/or Portable Document Format files) and CRX.

    There are two styles of what I'll call Acrobat-CRX integration, in terms of RESTful interactions. One style involves PDF forms. The other style involves the Adobe Acrobat runtime environment itself. In the former case, you'll typically write JavaScript snippets that attach to form fields and are triggered by form events. In the latter case, you write scripts for Acrobat itself, and from within those scripts you can hit the server in a variety of ways and do AJAX-like things behind the scenes. In this blog, we'll talk about the first case (which is easy); in a subsequent blog, we'll talk more about the second case (which is only slightly harder).

    The easiest way to read or write the repository from PDF is to use a PDF form. Acrobat Professional comes with decent form-creation tools, allowing you to create all the standard sorts of controls (text fields, radio buttons, etc.) and attach scripts to them. The default behavior of the Submit button, in a PDF form, is to trigger a POST in which all field data gets sent to the server, either as Adobe FDF (Forms Data Format), XFDF, or HTML (depending on what you specified in the Acrobat UI when you created the Submit button). So if all you want to do is hit the server with a form POST and (in so doing) push content into a new node in CRX, you simply need to create a PDF form with a Submit button and specify a Sling-legal target URL per the Sling cheat sheet. No JavaScript required.

    Most of the time, though, you'll probably want to exercise finer control over the form's behavior. You might want to issue a ":redirect" directive to the Sling servlet, for example, so that your user is taken to a particular page after the POST. Or you might want to use the ":nameHint" directive to force the repository to create a node with a specific name. Perhaps there are fields whose values you want to include or exclude at POST time. To handle these and other special situations, you'll want to write a bit of JavaScript.

    First, though, create the Submit button using Acrobat Professional's form-design tools. In the configuration dialog for the button, select the Actions tab, then use the Selection Actions picker to select "Run a JavaScript." (Normally, you'd choose "Submit a Form." Instead, we're going to do the Submit programmatically.) See the screenshot below.

    Use the "Add..." button to add a script to the button, then select "Run a JavaScript" in the Actions pane and you'll see an Edit button become enabled.

    Click the Edit button. In the dialog that appears, paste the following code:

    var params = {};
    params.cURL = "http://localhost:7402/content/myapp/";
    params.cSubmitAs = "HTML";
    params.cCharset = "utf-8";
    params.bGet = false;
    this.submitForm( params );
     

    What's going on here is that we're simply creating a parameter block that tells Acrobat what we want done at POST time, then we're passing that parameter block as an argument to the Acrobat JavaScript API method submitForm().

    As you can see, we've specified in cURL the CRX target location (the node under which to create a new item). Note that the path ends with a forward-slash.

    In the cSubmitAs parameter, we tell Acrobat to submit the form as HTML, while in the cCharset parameter we specify UTF-8 encoding. (We could also specify UTF-16, Shift-JIS, BigFive, GBK, or UHC.) We set bGet to false to force a POST rather than GET. Acrobat's JavaScript API allows you to adjust a variety of other settings, as well, but this is all we really need to do in this example.

    A sample form is available via the download link at the end of this blog. It has two visible text fields (one for an e-mail address and another for a comment) and a Submit button with this script already attached. The form actually contains four text fields, two of which are hidden. We've created a hidden text field named ":redirect" with a value of "http://localhost:7402/content/acrobat/thankyou.pdf", which tells Sling to serve the specified file (thankyou.pdf, a small file containing just the words "Thank you for your feedback") after the POST finishes. If we don't do this (if we don't include ":redirect"), Acrobat will serve up a PDF representation of the Sling servlet's default response (i.e., a wire dump), which is probably not what you want your user to see.

    There is also a hidden field named ":nameHint", set to a default value of "myform". This tells Sling (or CRX) to create a new node named myform at the path given earlier. If myform already exists, a new node called myform_0 will be created. If that exists, myform_1 will be created. And so forth.
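
    The naming sequence just described is easy to model. The following sketch mimics it for a list of names that already exist under the parent node. It is only an illustration of the myform, myform_0, myform_1 pattern described here, not Sling's actual name-generation code, and nextNodeName is a made-up helper.

```javascript
// Return the first free name in the sequence hint, hint_0, hint_1, ...
// "existing" is an array of names already present under the parent node.
function nextNodeName(hint, existing) {
    if (existing.indexOf(hint) === -1) {
        return hint;
    }
    var n = 0;
    while (existing.indexOf(hint + "_" + n) !== -1) {
        n++;
    }
    return hint + "_" + n;
}

var first = nextNodeName("myform", []);
var second = nextNodeName("myform", ["myform"]);
var third = nextNodeName("myform", ["myform", "myform_0"]);
```

    Three submissions of the sample form in a row would thus leave three sibling nodes behind rather than overwriting one another.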

    That's really about all there is to creating new content in CRX with a PDF form POST.

    Next time, we'll look at what it takes to push and pull content to and from the repository asynchronously, which is to say in more of an AJAX-flavored manner, using Acrobat JavaScript. So stay tuned. It gets pretty interesting!

     

    * FormExample.pdf
    Example of a PDF form that will submit content to CRX.

    Posted by Greg Klebus JUL 07, 2010

    I'm glad to announce that Ruben Reusser and Kyle Watson from Headwire.com, Inc. have contributed a code example package to the public CRX Developer Community area of Package Share. Thank you, Kyle and Ruben.

    cms-templatehandler package on Package Share

    We had a look at it and liked it very much. It's an example of a simple CMS with basic editing capabilities that also demonstrates templating and the separation between content, design, and code. The sample templates are based on Joomla templates (converted to JSP). The application leverages the Apache Sling framework and the JCR repository, and can be run, inspected, experimented with, or further developed on CRX 2.1.

    To check it out, start your CRX 2.1 Developer Edition (or get one), head to Share on the welcome screen, and log in with your day.com account (or register). You will find the package, called cms-templatehandler, in the Public » CRX Developer Community » Headwire.com section. Download it to your local CRX, install it, and open /site.html.

    Here's what you should see: a sample website in which you can easily change templates and edit content. Go to CRXDE Lite (Develop button on the welcome screen), and you will see the application under /apps/templatehandler and sample content under /site.

    If you are interested in sharing your package(s) with the CRX Developer community, drop us a line at packageshare (at) day (dot) com.

    Posted by Kas Thomas JUL 06, 2010

    Not long ago, I blogged about how, with a little bit of client-side JavaScript, it's pretty easy to save browser selections (in Chrome) to Day CRX. It turns out the same sort of thing is not all that hard to implement in OpenOffice.

    The scenario: You're reading a long document in OpenOffice (say, the JCR spec) and you come across a particular page or paragraph (or code snippet, etc.) that you'd really like to save for later, because you know you'll want to come back to it. If you want, you can Cut and Paste such snippets into new documents and save those docs on your local drive. But that's a bit awkward and leaves you with a file-management mess. (How easy or hard will it be to search for a given text string in order to find it again?) As an alternative approach, I offer the following macro, which lets you save any selected (highlighted) span of text from an OpenOffice doc straight to the CRX repository, under whatever pathname you choose.

    The following JavaScript code will run as a macro in any version of OpenOffice from 2.0 on (it was tested in 3.2). If you've never created a JavaScript macro in OpenOffice before, it's pretty easy: Under the Tools menu, go to Macros > Organize Macros > JavaScript. In the dialog that appears, click into the folder tree on the left and select the folder in which you want to create a macro. Doing so will enable a Create button on the right. Click it. Then select the newly created macro file (in the navtree), which will, in turn, enable an Edit button. Click Edit and you'll be brought to a JavaScript editor. Copy and paste the following code into the editor and save it.

    // OOo Javascript macro (OOo 2.0 or higher required)
    // POST selected text to CRX repository

    importClass(Packages.com.sun.star.uno.UnoRuntime);
    importClass(Packages.com.sun.star.text.XTextDocument);
    importClass(Packages.com.sun.star.text.XText);
    importClass(Packages.com.sun.star.text.XTextRange );
    importClass(Packages.com.sun.star.text.XTextViewCursorSupplier);

    // utility method: get selected text
    function getSelectedText(controller) {
            var xViewCursorSupplier =
            UnoRuntime.queryInterface(XTextViewCursorSupplier, controller);
            var xViewCursor = xViewCursorSupplier.getViewCursor();
            var selectedText = xViewCursor.getString();
            return selectedText;
    }

    // Do a POST
    function doJavaPOST( url, content ) {
            var reply = "";
            try {
                    var URL = new java.net.URL( url );
                    var urlConn = URL.openConnection( );
                    urlConn.setDoInput ( true );
                    urlConn.setDoOutput ( true );
                    urlConn.setUseCaches( false );
                    urlConn.setRequestProperty ("Content-Type",
                      "application/x-www-form-urlencoded" );
                    var printout =
                      new java.io.DataOutputStream ( urlConn.getOutputStream ( ) );
                    printout.writeBytes ( content );
                    printout.flush ( );
                    printout.close ( );
                    var input =
                      new java.io.DataInputStream ( urlConn.getInputStream ( ) );
                    var str = "";
                    while( null != ( str=input.readLine( ) ) ) { reply += str; }
            }
            catch(exception) {
                    java.lang.System.out.println( exception.toString() );
            }
            finally {
                    if ( input != null ) input.close();
            }

            return reply;
    }

    // munge together the form data
    function createRequest( object ){

            var data = [];
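            // URL-encode each name and value so that characters such as "&", "=",
            // and spaces in the selected text do not corrupt the form data
            for ( var i in object )
                    data.push( encodeURIComponent( i ) + "=" +
                      encodeURIComponent( String( object[ i ] ) ) );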

            return data.join( "&" );
    }

    // Modal dialog with OK/cancel and a text field
    function prompt( msg ) {
            var swing = Packages.javax.swing;
            var text = swing.JOptionPane.showInputDialog(
            new java.awt.Frame(), msg );
            return ( null == text ) ? "" : text; // always return a string
    }

    ( function main ( ) {

            var CRX_BASE_URL = "http://127.0.0.1:7402";
            var CONTENT_URL = CRX_BASE_URL + "/content/";

            var oDoc = XSCRIPTCONTEXT.getDocument();
            var oController = oDoc.getCurrentController();

            if ( !getSelectedText( oController) ) return;

            var reply =
              prompt(  "Where do you want to store this item? (e.g. a/b/itemName)" );
            if ( reply != "") {

                    var parts = reply.split("/");
                    var name = parts.pop();
                    var url = CONTENT_URL + reply.substring(0,reply.lastIndexOf("/")+1);

                    var request = {};
                    request[":nameHint"] = name;
                    request.content = getSelectedText(oController);
                    request.source = "OpenOffice.org";
                    request.timestamp = new Date();
                    request.docURL = oDoc.getURL();
                    var postData = createRequest( request );

                    var str =  doJavaPOST( url, postData );
            }

    } ) ( );  // end main( )

     

    We're able to do what amounts to AJAX programming from OpenOffice because OpenOffice's JavaScript engine (Rhino) is implemented in Java and allows "reachthrough" calls from JavaScript to Java. Basically, any JRE method available to a Java program is available to your script. Thus it is easy to call openConnection on a java.net.URL, set request headers, and do a POST programmatically, with java.io.* methods, all from JavaScript.
     
    The 90-some lines of code are fairly self-explanatory, save (perhaps) for the half dozen or so arcane OpenOffice-API calls. When the user fires the macro, a Swing dialog appears, asking for a (repository) pathname under which to store the snippet. The path you give doesn't have to already exist: CRX (or Sling, under the covers) will automatically create a node at the path you specify if one does not exist there already.

    As with my Chrome script of a few days ago, we set the :nameHint parameter in our data stream so as to tell CRX to canonicalize the node name (replace non-alphanumeric characters with underscores and do other fixups), guaranteeing a Sling-legal name for the new node, but without that name simply ending up being some randomly generated number.
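
    When a form body like this is assembled by hand, each name and value needs to be URL-encoded, or an "&" or "=" inside the selected text will be read as a field separator. Here is a minimal sketch in plain JavaScript; the encodeForm name and the sample field values are made up for illustration and are not part of the macro.

```javascript
// Sketch: build an application/x-www-form-urlencoded body from a plain object.
// encodeForm and the sample values below are hypothetical, for illustration only.
function encodeForm(fields) {
    var pairs = [];
    for (var key in fields) {
        if (Object.prototype.hasOwnProperty.call(fields, key)) {
            // Encode both sides so ":", "&", "=", and spaces survive the trip
            pairs.push(encodeURIComponent(key) + "=" +
                       encodeURIComponent(String(fields[key])));
        }
    }
    return pairs.join("&");
}

var body = encodeForm({
    ":nameHint": "Chapter 3 notes",
    content: "a = b & c",
    source: "OpenOffice.org"
});

console.log(body);
// prints: %3AnameHint=Chapter%203%20notes&content=a%20%3D%20b%20%26%20c&source=OpenOffice.org
```

    The server decodes the names and values again on receipt, so the repository ends up with the original text, ampersands and all.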

    The code expects your repository to be available at http://127.0.0.1:7402, and the macro is hard-coded to use a root of /content/. The relevant lines of code are easy to change if you need to use a different repository location.

    After running the macro with text selected in OpenOffice, check your repository with CRX's Content Manager and you will see a new node (of type nt:unstructured) at the path you specified. If you entered "a/b/c," the new content will be a 'c' node under /content/a/b.
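
    To make the path handling concrete, here is a small sketch of how an answer such as "a/b/c" breaks down into the URL that receives the POST and the name hint for the new node. The resolveTarget function is a hypothetical helper written for this illustration; it mirrors the substring/lastIndexOf logic in main() but is not part of the macro itself.

```javascript
// Sketch: split the user's answer into a parent URL and a node-name hint.
// resolveTarget is a hypothetical helper, not part of the macro above.
function resolveTarget(contentUrl, reply) {
    // Everything up to and including the last "/" is the parent path
    var cut = reply.lastIndexOf("/") + 1;
    return {
        url: contentUrl + reply.substring(0, cut),
        nameHint: reply.substring(cut)
    };
}

var nested = resolveTarget("http://127.0.0.1:7402/content/", "a/b/c");
console.log(nested.url);       // prints: http://127.0.0.1:7402/content/a/b/
console.log(nested.nameHint);  // prints: c

// With no slash in the answer, the node goes directly under /content/
var flat = resolveTarget("http://127.0.0.1:7402/content/", "c");
console.log(flat.url);         // prints: http://127.0.0.1:7402/content/
console.log(flat.nameHint);    // prints: c
```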

    And there you have it: OpenOffice-CRX integration in less than 100 lines of code!

    Posted by Michael Marth JUL 02, 2010

    Comment 1

    One particular strength of Java Content Repositories is that they provide so much infrastructure for developing content-centric applications. Today, I discovered another hidden gem in JCR 2.0 (JSR-283) that can come in very handy for app development:

    In JCR 1.0, the ObservationManager interface was used to register EventListeners, which get triggered when an event like a property change occurs. Starting with JCR 2.0, the ObservationManager also provides an EventJournal, which can be filtered to a given path and retrieved without having to register a listener first. The EventJournal is a list of events, e.g. additions, moves, or removals(!) of child nodes, complete with user ID and timestamp.

    Attached to this post is a little CRX package with a servlet that renders the events for a given node. The relevant lines are:

    ObservationManager om = session.getWorkspace().getObservationManager();
    EventJournal eventJournal = om.getEventJournal(
        Event.NODE_ADDED | Event.NODE_MOVED | Event.NODE_REMOVED | Event.PROPERTY_CHANGED
        | Event.PROPERTY_REMOVED | Event.PROPERTY_ADDED, path, false, null, null);

    Install the package and point your browser to http://localhost:7402/apps/eventy.html?path=/content (the path parameter specifies the node you are interested in). You should see a list of entries like:

    Event: Path: /content/n3, NodeAdded: , UserId: admin, Timestamp: 1272975189833, UserData: null, Info: {}
    Event: Path: /content/n3, NodeRemoved: , UserId: admin, Timestamp: 1272975212725, UserData: null, Info: {}
    ...
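
    The timestamps in these entries are milliseconds since the Unix epoch, which is not very readable. If you want to post-process the listing, something like the following sketch would do. It is plain JavaScript; parseEventLine is a hypothetical helper, and the regular expression assumes the exact line layout shown above.

```javascript
// Sketch: turn one rendered journal line into a small object.
// parseEventLine is hypothetical; the pattern assumes the layout
// "Event: Path: <path>, <Type>: , UserId: <user>, Timestamp: <millis>, ...".
function parseEventLine(line) {
    var pattern =
        /^\s*Event: Path: (.*?), (\w+): , UserId: (.*?), Timestamp: (\d+),/;
    var m = pattern.exec(line);
    if (!m) return null;  // not a journal line
    return {
        path: m[1],
        type: m[2],
        userId: m[3],
        when: new Date(Number(m[4]))  // milliseconds since the epoch
    };
}

var sample = "Event: Path: /content/n3, NodeAdded: , UserId: admin, " +
             "Timestamp: 1272975189833, UserData: null, Info: {}";
var ev = parseEventLine(sample);

console.log(ev.type + " " + ev.path + " by " + ev.userId);
// prints: NodeAdded /content/n3 by admin
console.log(ev.when.toISOString());
// prints: 2010-05-04T12:13:09.833Z
```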

    And from the Javadoc:

    Events returned in the EventJournal instance will be filtered according to [...] the current session's access restrictions

    I am delighted.

    * eventjournal-1.0.zip
    CRX package: sample for EventJournal