Entries by Kas Thomas

    Posted by Kas Thomas SEP 09, 2011

    Last time, I reviewed the basics of the powerful Clickstream Cloud feature of Adobe WEM (formerly CQ5): type Ctrl-Alt-C and you get a popup summary of contextual information about the user, the user's browser, and the page the user is currently visiting (see illustration further below).

    As with almost everything else in WEM, the Clickstream Cloud can be customized relatively easily, because the code for the Cloud is easily accessible (and modifiable) in the repository.

    If you go in the repository under /libs/cq/personalization/clientlib/source/shared (best done in CRXDE Lite: just aim your browser at http://localhost:4502/crx/de/index.jsp#/crx.default/jcr%3aroot/libs/cq/personalization/clientlib/source/shared), you'll see a half dozen *.js files that govern the Clickstream Cloud's basic behavior, and if you look under /libs/cq/personalization/clientlib/source/clickstreamcloud, you'll find the *.js files that contain code for the various session stores that manage the information fields displayed in the cloud dialog. There's also a js.txt file at /libs/cq/personalization/clientlib/js.txt that governs how all these *.js files are loaded.

    As a very simple example of customization of the Clickstream Cloud, let us suppose that you wanted to add a timestamp to the cloud dialog under "Surfer information" as shown below:

    Clickstream Cloud Dialog

    Notice the part, under Surfer Information, where it says "Thu Sep 08 2011 17:09:45," etc. This information was added as a result of custom code.

    There are a couple of ways to do this. One way would be to alter the setSurferInfoInitialData() function in config.json.jsp (which is located in a somewhat obscure place, namely /libs/cq/personalization/components/clickstreamcloud/command/config/). You might be very tempted to do this, since that's the function where the user's IP address (which appears under Surfer Information), for example, is set. But making a change in this function would be a bad idea: you're dealing with a core WEM file, and you're making hard-coded changes to it. There's no guarantee that this file will stay unmodified (or even continue to exist) in future versions of WEM, and by putting custom code in it, you've created a maintenance nightmare.

    A better alternative is to create your own separate file, perhaps called custom.js, and place it under /libs/cq/personalization/clientlib/source/clickstreamcloud/. The content of custom.js is simply:
     
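    // Once the cloud's configuration has loaded, add a "timestamp" property
    // to the Surfer Information store.
    CQ_Analytics.CCM.addListener("configloaded", function() {
        CQ_Analytics.SurferInfoMgr.setProperty("timestamp", new Date());
    }, CQ_Analytics.SurferInfoMgr);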
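
    To try the listener outside a running WEM instance, the two calls it relies on can be stubbed. The sketch below is only a stand-in: the CQ_Analytics object here is a hypothetical mock written for this example (including its fireEvent and getProperty helpers), not the product's implementation. It simply shows that the callback runs when the "configloaded" event fires, with the scope passed as the third argument.

```javascript
// Hypothetical mock of the two WEM objects used by custom.js (illustration only).
var CQ_Analytics = {
    CCM: {
        listeners: {},
        // Register a callback for a named event, with an optional scope.
        addListener: function (eventName, fn, scope) {
            if (!this.listeners[eventName]) {
                this.listeners[eventName] = [];
            }
            this.listeners[eventName].push({ fn: fn, scope: scope });
        },
        // Mock-only helper: invoke every callback registered for the event.
        fireEvent: function (eventName) {
            var registered = this.listeners[eventName] || [];
            for (var i = 0; i < registered.length; i++) {
                registered[i].fn.call(registered[i].scope);
            }
        }
    },
    SurferInfoMgr: {
        data: {},
        setProperty: function (name, value) { this.data[name] = value; },
        getProperty: function (name) { return this.data[name]; }
    }
};

// Same listener as in custom.js.
CQ_Analytics.CCM.addListener("configloaded", function () {
    CQ_Analytics.SurferInfoMgr.setProperty("timestamp", new Date());
}, CQ_Analytics.SurferInfoMgr);

// Simulate the cloud finishing its configuration load.
CQ_Analytics.CCM.fireEvent("configloaded");
console.log(String(CQ_Analytics.SurferInfoMgr.getProperty("timestamp")));
```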

    To ensure that custom.js loads at runtime, you do need to make a change to the aforementioned js.txt file (namely, the one at /libs/cq/personalization/clientlib/js.txt). Just add the line "clickstreamcloud/custom.js" to the end of the file.

    Now you should be able to go to a new page (or reload the current page) in WEM, type Ctrl-Alt-C, and see the timestamp information in the Surfer Information portion of the Clickstream Cloud dialog.

    What's neat is that if you now click the Edit link in the upper right portion of the Cloud dialog, then click the Surfer Information tab of the dialog that pops up, you'll see timestamp info among the editable fields of the dialog.

    For more information on the Clickstream Cloud API (including how you can create your own custom session store), see the documentation here.

    Posted by Kas Thomas AUG 25, 2011

    Customer Experience Management (CEM) aims to empower and delight customers by (among other means) giving web visitors exactly the information they need, in the right form, at the right time. Doing this reliably, in real time, can be a challenge. It requires Web Content Management software that can aggregate relevant user information from a variety of sources so as to drive intelligent provisioning of content on a page according to predetermined strategies.

    Adobe's Web Experience Management offering (part of the new Adobe Digital Enterprise Platform) rises to this challenge with a patent-pending technology called the Clickstream Cloud.  

    The Clickstream Cloud represents a dynamically assembled collection of user data that can be used to determine exactly what content should be shown on a given web page, in a given situation.

    To see the Clickstream Cloud in operation, simply type Ctrl-Alt-C from within any web page. The Cloud summary toggles into view.

    A variety of types of information are shown for the given user:

    • Profile data typically shows the user's registration data (assuming the user has registered with the website): Name, e-mail address, occupation, age, mailing address, phone number, and any other information the user has previously submitted in a registration process. This could also include social network information, data pulled from a CRM system, etc.

    • Tag cloud data shows the tags that were of interest to this user. Mousing over any of the tag names causes a tooltip to appear, showing the number of times the user has accessed a particular tag.

    • Page data shows the number of times the page has been visited, the page's title, and the Random factor as used in the random strategy of a campaign.

    • Surfer info shows the IP address, keywords used for search engine referral, the browser being used, the OS (operating system) being used, the screen resolution of the device the visitor is using to view the web page, and the mouse X-Y position on the page.

    • Events shows activity-stream information from SiteCatalyst.

    • Resolved Segments shows which (if any) defined segments of a marketing campaign have been matched.

    Several things make Adobe's implementation of the Clickstream Cloud unique:

    1. Much of the information (such as info about the user's viewing environment) is derived on-the-fly, in real time.
    2. Marketers can experiment with different user-data values to see changes to a page in real time (for purposes of trying different campaign strategies). The Edit button in the upper right portion of the panel allows for manual editing (overriding) of user-data values.
    3. The Clickstream Cloud is extensible. You can add a new (custom) session-store object whose contents can be displayed in the panel.
    4. Non-volatile information shown in the Clickstream Cloud viewer is persisted on the client side (in a cookie), relieving the server of having to maintain (and then transport back and forth) large amounts of user data.

    Because user info is persisted on the client, concerns over privacy and control of potentially sensitive user data are easily allayed: The user has ultimate control over the data.

    The Clickstream Cloud is a container for different data stores (also called session stores), which extend either CQ_Analytics.SessionStore (for values computed on page load) or CQ_Analytics.PersistedSessionStore (for values persisted from one page to another).

    Each data store is built up of property pairs (names and corresponding values) and represents a logical collection of properties (for example, profile properties).

    The default session stores available in Adobe WEM can be obtained from a call to CQ_Analytics.ClickstreamcloudMgr.get(), which returns an object with properties of:

    profile (corresponding to CQ_Analytics.ProfileDataMgr)

    eventdata (corresponding to CQ_Analytics.EventDataMgr)

    tagcloud (corresponding to CQ_Analytics.TagCloudMgr)

    pagedata (corresponding to CQ_Analytics.PageDataMgr)

    surferinfo (corresponding to CQ_Analytics.SurferInfoMgr)

    mouseposition (corresponding to CQ_Analytics.MousePositionMgr)
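
    As a rough mental model of the description above, a session store can be thought of as a named bag of property pairs, and the cloud as an object keyed by store name. The sketch below is a plain-JavaScript illustration under that assumption; makeStore, getPropertyNames, and cloud are hypothetical names invented here, and none of this is the WEM implementation.

```javascript
// Hypothetical model: a store is a collection of name/value property pairs.
function makeStore() {
    var props = {};
    return {
        setProperty: function (name, value) { props[name] = value; },
        getProperty: function (name) { return props[name]; },
        getPropertyNames: function () { return Object.keys(props); }
    };
}

// Hypothetical container keyed by the six default store names listed above.
var storeNames = ["profile", "eventdata", "tagcloud", "pagedata", "surferinfo", "mouseposition"];
var cloud = {};
storeNames.forEach(function (name) { cloud[name] = makeStore(); });

// Each store holds its own logical group of properties.
cloud.surferinfo.setProperty("browser", "Firefox");
cloud.surferinfo.setProperty("OS", "Windows");

Object.keys(cloud).forEach(function (name) {
    console.log(name + ": " + cloud[name].getPropertyNames().join(", "));
});
```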

    There's much more goodness tucked away in the Clickstream Cloud's APIs, which we'll visit in another (followup) blog post soon. But for now, if you want to get into the details yourself, there's no easier way to get started than to consult our documentation here and (for even more detail) here.

    Posted by Kas Thomas JUN 20, 2011

    It's official: Adobe Systems today announced its new Adobe Digital Enterprise Platform for Customer Experience Management (CEM).

    A synthesis of the former LiveCycle and CRX products (with a lot of new functionality rolled in as well), the Adobe Digital Enterprise Platform is designed to enable enterprises to realize immersive, innovative multi-channel web apps that allow customers to get exactly the information they need, in exactly the desired format(s), any time, any place.

    Adobe SVP Rob Tarkoff talks more about the new Platform in this short video:

    Adobe will highlight the capabilities of the Digital Enterprise Platform at the first Adobe Digital Enterprise Summit, taking place in Los Angeles on October 3-4 in conjunction with Adobe MAX 2011. The Adobe Digital Enterprise Summit will bring together industry thought leaders, along with Adobe experts and solution partners, to explore the various ways in which customer-centric development can transform the digital enterprise. If you haven't yet made plans to attend the Summit, we hope you'll do so now. Go here for details on how to attend. We look forward to seeing you there!

    Posted by Kas Thomas MAY 02, 2011

    Adobe CRX is an extremely versatile content store that handles a wide range of content types (structured and unstructured) and can reliably store many millions of objects. In fact, the system's ultimate storage limits are set not by CRX itself but by the underlying persistence manager. You can choose from a number of different types of persistence (DB2, MySQL, Oracle, TarPM; see documentation here), each with its own particular limitations.

    In general, the default TarPM persistence manager gives better performance than most RDBMS alternatives for the typical CRX use cases (involving web content and user management). But in certain situations, with certain use cases, performance with TarPM can take a hit. The most common problem? Big Flat Lists.

    Although read performance remains good, write performance can suffer in the case where you need to store, say, thousands of sibling nodes under one parent node. This has to do with the fact that TarPM is an append-only store in which objects are immutable and never overwritten, only rewritten. What it means is that the cost of adding (or updating) Node No. N-thousand-plus-one can be quite high.

    Of course, the answer is to divide and conquer: Break the nodes up into smaller groups, preferably hierarchical groups.

    Suppose you have a large number of users whose user-data you want to store in CRX, and you'd like to be able to store users by name. The naive way (we'll keep the example simple and assume no name collisions) would be to store Joe Smith under a node named users/joe_smith, Lee Jones under users/lee_jones, etc. But after a thousand names or so, performance will start to suffer noticeably as new entries are written to the repository. Far better performance will result if container nodes (buckets) are created for each letter of the alphabet, and for each Last Name, so that you can add Joe Smith as /users/S/Smith/Joe, for example.
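
    The bucketing scheme just described can be sketched as a small path-building function. This is only an illustration: bucketPath is a hypothetical helper name, and it assumes non-empty names that contain no characters that are illegal in node names.

```javascript
// Build a bucketed path such as /users/S/Smith/Joe from a first and last name.
// Assumes non-empty names with no characters that are illegal in node names.
function bucketPath(firstName, lastName) {
    var initial = lastName.charAt(0).toUpperCase();
    return "/users/" + initial + "/" + lastName + "/" + firstName;
}

console.log(bucketPath("Joe", "Smith"));  // /users/S/Smith/Joe
console.log(bucketPath("Lee", "Jones"));  // /users/J/Jones/Lee
```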

    A more sophisticated approach would be to hash user IDs and chunk the hash to form an ad-hoc hierarchy. For example, "Joe Smith" might give a hash of ab12cd34. The user data for Joe Smith can be stored at users/ab/12/cd/34. When the time comes to look up data for Joe Smith, you would first hash the name (to obtain ab12cd34), then create the necessary path from the hash, and look up the data.
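
    Here is a minimal sketch of the hash-and-chunk idea. The hash function is an arbitrary choice made for this example (an FNV-1a-style 32-bit hash over the string's UTF-16 code units, which conveniently yields eight hex digits); hash32Hex and hashedPath are hypothetical names, and any stable hash would serve the same purpose.

```javascript
// FNV-1a-style 32-bit hash over the string's UTF-16 code units (an arbitrary
// choice for this sketch), returned as 8 lowercase hex digits.
function hash32Hex(text) {
    var hash = 0x811c9dc5;
    for (var i = 0; i < text.length; i++) {
        hash ^= text.charCodeAt(i);
        hash = Math.imul(hash, 0x01000193);
    }
    var hex = (hash >>> 0).toString(16);
    return ("00000000" + hex).slice(-8);
}

// Split the hex digest into two-character segments,
// e.g. a digest of ab12cd34 becomes users/ab/12/cd/34.
function hashedPath(userId) {
    var segments = hash32Hex(userId).match(/.{2}/g);
    return "users/" + segments.join("/");
}

// Storing and looking up both use the same function, so the path is reproducible.
console.log(hashedPath("Joe Smith"));
```

    With two hex characters per level, no node in this layout can have more than 256 children at any of the four levels.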

    As it turns out, the Jackrabbit API (which of course is built into CRX) offers yet another alternative for efficient hierarchical storage of arbitrary data, in the form of the BTreeManager. This class provides B+ tree-like behavior in allocating subtrees of nodes that are always balanced, with a fixed limit on how many siblings any given node can have. (You provide the limit as an argument in the constructor.)

    I wrote a very short test script (in ECMAScript) to show how the BTreeManager operates, as shown below:

    <html>
    <body>
    <%
    /* Create a new BTreeManager rooted at the current node.
       A node is split when its number of children exceeds 40, and the split
       is done such that each new parent node has >= 10 child nodes. Keys are
       ordered according to the natural order of java.lang.String.
       The final argument (true) turns on auto-save. */
    var treeManager = new Packages.org.apache.jackrabbit.commons.flat.BTreeManager(
        this.currentNode, 10, 40,
        Packages.org.apache.jackrabbit.commons.flat.Rank.comparableComparator(),
        true);

    // Create a new NodeSequence with that tree manager
    var nodes = Packages.org.apache.jackrabbit.commons.flat.ItemSequence.createNodeSequence(treeManager);

    var totalNodes = 100;

    // Do some profiling: record the start time in milliseconds
    var start = new Date().getTime();

    // Add totalNodes nodes through the sequence
    for (var i = 0; i < totalNodes; i++) {
        nodes.addNode("MyNode" + i,
            Packages.javax.jcr.nodetype.NodeType.NT_UNSTRUCTURED);
    }

    var end = new Date().getTime();
    %>
     
    <%= "Total time: " + (end - start) + " millisecs" %>
    </body>
    </html>

    I called this script tree.esp and placed it under /apps/tree in CRX, then created a dummy node under /content and gave the dummy node a sling:resourceType of "tree" (to trigger the script when navigating to content/dummyNode.tree).

    The performance benefits of BTreeManager are notable. On my (decrepit Dell) laptop, adding 100 nodes as a flat list took 1.6 seconds (which includes about 200 milliseconds for servlet compilation). Adding 1000 nodes as a flat list (no B-tree) took 22 seconds. Adding 5000 nodes took 289 seconds. Note that adding five times as many entries took about 13 times as long.

    By contrast, using BTreeManager (set to a maximum sibling breadth of 40), adding 1000 nodes took 14 seconds and adding 5000 took 86 seconds. (Five times the data takes roughly six times as long, which is much closer to linear scaling.)
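
    For anyone who wants to check the scaling arithmetic, the ratios can be worked out directly from the timings quoted above. The snippet uses only those four numbers; the variable names are made up for this example.

```javascript
// Timings in seconds, as reported above.
var flat = { n1000: 22, n5000: 289 };
var btree = { n1000: 14, n5000: 86 };

// How much longer 5000 nodes took than 1000 nodes (5x the data).
var flatRatio = flat.n5000 / flat.n1000;
var btreeRatio = btree.n5000 / btree.n1000;

console.log("flat list: " + flatRatio.toFixed(1) + "x");      // 13.1x
console.log("BTreeManager: " + btreeRatio.toFixed(1) + "x");  // 6.1x
```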

    The real lesson here is: If your content is hierarchical (or can be made to look hierarchical), by all means capitalize on that fact! Don't try to treat your content as a Big Flat List, especially if you'll be doing a lot of updates. (If you're doing mostly reads and few writes, on the other hand, it doesn't much matter.) Introducing a bit of hierarchy to your content organization scheme will go a long way toward promoting fast update performance.

    (Many thanks to Felix Meschberger and Marcel Reutegger for input into this blog.)

    Posted by Kas Thomas APR 21, 2011

    With tablet and smart-phone shipments eclipsing PC and laptop shipments -- and with new mobile broadband connections far exceeding the number of new fixed broadband connections -- it's clear that a tipping point in the history of the internet has been reached. Which means that now may be as good a time as any to ask yourself: Are you well-positioned to leverage the Mobile Internet?

    Former Morgan Stanley research analyst Mary Meeker (now a partner at Kleiner Perkins Caufield & Byers) recently gave a state of the Mobile Internet presentation at Google. Slides from that presentation can be seen here:

    The slideshow is full of mind-blowing numbers, charts, and observations. Some of the leading takeaways:

    • Mobile Internet data traffic is expected to grow by 26X over the next 5 years
    • Mobile Internet devices are expected to reach 10X the billion-or-so devices on today’s Desktop Internet
    • The Mobile Internet is growing much faster than the Desktop Internet did
    • The Mobile Internet will be a much bigger phenomenon than most people think
    • One of the big growth drivers is the Tablet

    The last point is worth emphasizing. Did you know that in its first 3 months, Apple's iPad outsold the iPod and iPhone (combined) by a factor of 3.5? Or that Gartner is predicting that by 2013, there will be as many tablets in use in the enterprise as there are PCs?

    The fact is, we have entered a world in which realtime 24/7 broadband connectivity in the palm of one's hand is the new baseline.

    This means that the majority of software interactions will (from this point forward) be occurring on handheld devices. If you're not poised to take advantage of this, you will soon be in a small (and diminishing) minority.

    That's why Adobe Dreamweaver, Flash Builder, and CQ5 offer cross-platform mobile development and deployment options. And it's why you'll want to take an especially close look at the new release of the Adobe Enterprise Platform when it arrives soon. To take full advantage of Mobile, you need world-class design tools and runtime infrastructure that were created with Mobile in mind. Anything less is (or should be) unacceptable -- unless you're planning to live in the past.