[day-communique] Integrate CQ5 with data from another system
We are building a public website using CQ 5. Some of the data, such as
the product catalogue, comes from another system. The requirement is
to fetch that data and embed it in pages hosted by CQ. This
is a challenging job for us, as we cannot find any example out there
doing this. The challenges we have identified are:
1. How to get/load the data
2. How the data is updated and how updates are notified
3. How the data is rendered
4. How to embed it as part of the page
Before sending this mail to ask for help, we consulted CQ sales
and did some homework, such as reading the Sling source
code. Here are our planned approaches. We would like advice
on whether these approaches follow Sling best practices
or are simply anti-patterns.
1. How to get/load the data
Some background: we will deploy three CQ instances, one author
instance and two publish instances. The data will be loaded on the
author instance using JDBC and transformed in memory.
The result will be persisted to an XML file and saved in the JCR
repository. We rely on replication to publish the XML
file to the publish instances. On each publish instance we then just
need to read the XML back into memory.
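A minimal sketch of the XML round trip described in this step, using only the JDK. The class name `CatalogXml` and the element and attribute names (`catalog`, `handset`, `name`) are assumptions made for illustration; the JDBC load and the storing of the file in the repository are left out.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical codec for the catalogue snapshot; all names are assumptions.
final class CatalogXml {

    // Author side: serialize the transformed in-memory data to one XML string.
    static String write(List<String> handsetNames) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
            Element root = doc.createElement("catalog");
            doc.appendChild(root);
            for (String name : handsetNames) {
                Element handset = doc.createElement("handset");
                handset.setAttribute("name", name);
                root.appendChild(handset);
            }
            StringWriter out = new StringWriter();
            TransformerFactory.newInstance().newTransformer()
                    .transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("could not write catalogue XML", e);
        }
    }

    // Publish side: read the replicated XML back into memory.
    static List<String> read(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            NodeList nodes = doc.getElementsByTagName("handset");
            List<String> names = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                names.add(((Element) nodes.item(i)).getAttribute("name"));
            }
            return names;
        } catch (Exception e) {
            throw new IllegalStateException("could not read catalogue XML", e);
        }
    }
}
```

The author instance would call `write` after the JDBC load, and each publish instance would call `read` on the replicated file.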
2. How the data is updated and how updates are notified
We will set up a scheduler on the author instance to poll the data every
day. Each update will be announced as an event. If the replication process
generates an event itself, we will rely on it. Otherwise, we need
to set up a new topic and publish an event to it whenever the serialized XML
file is updated. We assume CQ will distribute the event to the publish
instances so that they know they need to refresh
the in-memory data from the updated XML file.
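The refresh side of this step could look roughly like the sketch below. It uses a plain JDK executor only to stay self-contained; inside CQ the trigger would presumably come from the platform's own scheduler and event handling instead. `CatalogCache` and its method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

// Hypothetical in-memory holder for the catalogue; names are assumptions.
final class CatalogCache {
    private final AtomicReference<List<String>> current =
            new AtomicReference<>(Collections.<String>emptyList());
    private final Supplier<List<String>> loader;

    // The loader stands in for "read the updated XML file back into memory".
    CatalogCache(Supplier<List<String>> loader) {
        this.loader = loader;
    }

    // Called when an update event arrives (or when the scheduler fires).
    void refresh() {
        // Build the new snapshot completely, then swap it in atomically so
        // that readers never observe a half-loaded catalogue.
        List<String> fresh = new ArrayList<>(loader.get());
        current.set(Collections.unmodifiableList(fresh));
    }

    // Readers always get the latest complete snapshot.
    List<String> snapshot() {
        return current.get();
    }

    // Daily polling as described for the author instance.
    // The caller is responsible for shutting the executor down.
    ScheduledExecutorService scheduleDaily() {
        ScheduledExecutorService executor =
                Executors.newSingleThreadScheduledExecutor();
        executor.scheduleAtFixedRate(this::refresh, 0, 1, TimeUnit.DAYS);
        return executor;
    }
}
```

Swapping a whole immutable snapshot keeps the rendering code free of locks, at the cost of holding two copies of the data briefly during a refresh.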
3. How the data is rendered
Because the data is stored in the JCR repository as a plain file rather
than as nodes, paths like /mobile/handset/NokiaN75 cannot be understood by
CQ. To make those paths resolvable, we will write our own resource provider
and map it to /mobile/handset. We will parse the URL ourselves to get
the parameters. The resource finally provided to the script engine will
contain the necessary data. In the page, we can reference the handset
by referencing the current node (i.e. the resource provided). The
resource provider is similar to a traditional controller in an MVC
architecture, so there might be 20+ controllers in a system. Would
that be an issue?
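The URL parsing part of such a resource provider might be sketched as follows. This covers only the path handling; a real provider would additionally implement Sling's resource provider interface, which is not shown here. The class `HandsetPath` and the decision to drop everything after the first dot are assumptions.

```java
import java.util.Optional;

// Hypothetical helper for the path handling inside a resource provider.
final class HandsetPath {
    // Mount point taken from the post.
    static final String ROOT = "/mobile/handset";

    // Returns the handset name for paths such as /mobile/handset/NokiaN75
    // or /mobile/handset/NokiaN75.html, and empty for anything else.
    static Optional<String> handsetName(String path) {
        if (path == null || !path.startsWith(ROOT + "/")) {
            return Optional.empty();
        }
        String rest = path.substring(ROOT.length() + 1);
        // Assumption: anything after the first dot is an extension or selector.
        int dot = rest.indexOf('.');
        if (dot >= 0) {
            rest = rest.substring(0, dot);
        }
        // Reject an empty name and deeper paths that this sketch does not handle.
        if (rest.isEmpty() || rest.contains("/")) {
            return Optional.empty();
        }
        return Optional.of(rest);
    }
}
```

Returning an empty result for unknown paths lets the caller fall through to normal resolution instead of failing.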
4. How to embed it as part of the page
We leverage a CQ component here. The component will be configured
with a URL from which it can fetch its HTML. So the component does not
render itself; instead, it relies on another URL to do it (which is
backed by a controller). To do this, we need either to implement
some sort of server-side include or to use a client-side AJAX call. The
concern with a server-side include is how to do it and how to pass the user
session. The concern with a client-side AJAX call is that the user
experience might suffer when the network is slow. We were
told by CQ sales that, for page caching to be effective,
we had better use AJAX calls to retrieve the dynamic content.
Any comments?
--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups "Day Communique" group.
To post to this group, send email to day-communique@googlegroups.com
To unsubscribe from this group, send email to day-communique+unsubscribe@googlegroups.com
For more options, visit this group at http://groups.google.com/group/day-communique?hl=en
-~----------~----~----~----~------~----~------~--~---
On Mon, Mar 16, 2009 at 07:32:20AM -0700, taowen wrote:
>
> 2. How the data is updated and how updates are notified
>
> We will set up a scheduler on the author instance to poll the data every
> day. Each update will be announced as an event. If the replication process
> generates an event itself, we will rely on it. Otherwise, we need
> to set up a new topic and publish an event to it whenever the serialized XML
> file is updated. We assume CQ will distribute the event to the publish
> instances so that they know they need to refresh
> the in-memory data from the updated XML file.
If CQ 5 also supports "replication on modification" as CQ 4.x does (I
strongly assume that Day hasn't removed this feature), I would suggest
using it. It automatically replicates changed handles (in a configurable
subtree) whenever they change.
Joerg
--
What did you do to the cat? It looks half-dead. -Schroedinger's wife
The primary comment I would have is to wonder why you are storing the content as an XML file. Perhaps I am missing a requirement, but it would seem that you would be better off storing the content you import as nodes in the CRX repository.
As an example, think of how the campaign content is managed in the Geometrixx example. You have a tree in the repository that is separate from the site tree but is still standard content in the repository.
If you stored the content as nodes, it would simplify your replication issues and your design for how you embed the content in your component. You would be able to leverage the standard Sling resource resolver, so you wouldn't have to build your own. Depending on exactly what your requirements are, you could look at how the teaser or reference components render the data they display.
You mention that you are transforming some of this data after you fetch it via JDBC, so it may not be a good candidate for this, but this seems like a natural use for a CRX connector and a virtual repository. Again, your requirements may mean that it's not a good approach, but it's worth considering.
Paul McMahon
Acquity Group
Hi,
the reason I did not choose to store the data in JCR is that some
transformation needs to be done before the data is presented. For example,
the list of price plans can change depending on the handset you have chosen.
If things are stored in JCR, I think I need to use OCM to load them into memory
and then filter the list by applying business rules. Also, in my case, I only
need to view the data; there is no need to update or delete it.
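To make the kind of transformation concrete, here is a minimal sketch of filtering price plans by the chosen handset. The `PricePlan` class, its fields, and the eligibility rule are invented for illustration; the real business rules are not described in this thread.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical domain object: a plan and the handsets it may be sold with.
final class PricePlan {
    final String name;
    final Set<String> eligibleHandsets;

    PricePlan(String name, String... eligibleHandsets) {
        this.name = name;
        this.eligibleHandsets = new HashSet<>(Arrays.asList(eligibleHandsets));
    }
}

// Hypothetical business rule applied to the in-memory data before rendering.
final class PricePlanRules {
    // Returns the names of the plans available for the chosen handset,
    // in the order in which the plans were given.
    static List<String> plansFor(String handset, List<PricePlan> allPlans) {
        List<String> result = new ArrayList<>();
        for (PricePlan plan : allPlans) {
            if (plan.eligibleHandsets.contains(handset)) {
                result.add(plan.name);
            }
        }
        return result;
    }
}
```

The filter is independent of where the plans are loaded from, so the same rule would work whether the data comes from an XML file or from repository nodes.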
hi tao,
thanks a lot for your post. i think you raise a lot of interesting
questions, that are very relevant for people new to jcr, hence this
is a great conversation to have in a public and archived way.
> the reason I did not choose to store the data in JCR is that some
> transformation needs to be done before the data is presented. For example,
> the list of price plans can change depending on the handset you have chosen.
> If things are stored in JCR, I think I need to use OCM to load them into memory
> and then filter the list by applying business rules. Also, in my case, I only
> need to view the data; there is no need to update or delete it.
personally, i would probably never use ocm in this scenario.
jcr has a built-in mechanism to serialize and deserialize to and from xml.
as a matter of fact there is an argument to be made that jcr can produce
a stream of sax events directly without actually parsing an xml document,
which definitely can be an advantage in terms of performance. [1]
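As a rough illustration of consuming such a stream of SAX events, the sketch below defines a handler that collects a `price` attribute from every element it sees. With JCR, a handler like this would be passed to the `exportDocumentView` method referenced in [1]; here it is driven by the JDK's own SAX parser purely so that the example runs without a repository. The class name, the `price` property, and the sample XML layout are assumptions.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: maps element name (node name) to its "price" value.
final class HandsetPriceHandler extends DefaultHandler {
    final Map<String, String> prices = new LinkedHashMap<>();

    @Override
    public void startElement(String uri, String localName, String qName,
                             Attributes atts) {
        // In a document view, properties of a node appear as attributes.
        String price = atts.getValue("price");
        if (price != null) {
            prices.put(qName, price);
        }
    }

    // Stand-in driver for this sketch only: feeds the handler from an XML
    // string using the JDK parser instead of a repository export.
    static Map<String, String> collect(String xml) {
        try {
            HandsetPriceHandler handler = new HandsetPriceHandler();
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
            return handler.prices;
        } catch (Exception e) {
            throw new IllegalStateException("could not process SAX events", e);
        }
    }
}
```

The handler itself never sees an XML document, only events, which is what makes the direct export path attractive.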
i would definitely suggest to store the product information as separate nodes
so they can be managed at a fine level of granularity. i think both joerg and
paul are spot on with their comments.
having individual nodes lets you use things like search, versioning, access
control etc. at a fine grained level without compromising on transformation
or script-ability. there are certain cases where it makes sense to store
information as an xml blob mostly if we are talking about huge, massively
complex documents that are not worth the additional effort deserializing.
in my experience, jcr is a very different persistence paradigm and i would
like to offer up the recommendation that, if this happens to be your first project
with jcr, you have your "content model" looked at either
by a day services engineer or architect or by an experienced day partner.
usually, a "content modeling workshop" of a couple of hours can save
you a lot of time later on in the project.
of course you are more than welcome to post your content model as a
cnd to this list in a less formal fashion and get everybody's input.
regards,
david
[1]
http://www.day.com/maven/jsr170/javadocs/jcr-1.0/javax/jcr/Session.html#exportDocumentView(java.lang.String,%20org.xml.sax.ContentHandler,%20boolean,%20boolean)
--
Visit: http://dev.day.com/
Hi Michael,
thanks a lot for the clarifications.
I understand that product information usually has so many
many-to-many relationships that it definitely makes sense to have
some (based on personal taste even most or all) of those objects
as POJOs.
In terms of many-to-many relationships there are a number of ways
to achieve relationships in JCR, and depending on the characteristics
of those relationships one path or the other is preferable. Traditionally,
people initially think of (hard) references protected by referential integrity
as we know them from relational databases.
JCR offers a variety of different ways of modeling relationships between
nodes, which led me to mention the overuse of references in one of
the guidelines that I personally use when modeling content. [1]
We recently have gone through a number of modeling workshops for
customers with actual product information. Personally, I think that
product information is a great example of semi-structured information.
There is a lot of unstructured information that is primarily used for display
but bears no relevance for the respective e-commerce transactions.
Of course there is also structured information like the price, sku or the
in-stock information that is very relevant for the transactional information.
Traditionally, a lot of the unstructured (non-transaction-related) information
is also stored in a structured way in the RDBMS since one does not have
the luxury of leveraging a "data first" [2] approach for the elements of a
product catalog that do not really benefit from the "structure first" approach.
I understand that this sometimes may be a big leap and it is my experience
that an actual example with the real data model that you are looking at is
most helpful to see what a JCR approach could be.
If you are interested in sharing this on the list here, we could definitely work
through our data model recommendations for your specific data model in
public on this list so everybody can pitch in.
Of course we are more than happy to organize a
"content-modeling-workshop" to quickly walk you through our suggestions
for the product catalog that you are looking at.
Please let me know, if I can be of any further assistance or if I should have
misunderstood your explanations.
regards,
david
[1] http://wiki.apache.org/jackrabbit/DavidsModel
[2] http://www.betaversion.org/~stefano/linotype/news/93/
On Wed, Mar 18, 2009 at 8:39 AM, Michael Robinson
<michael8robinson@gmail.com> wrote:
>
> Just to clarify, the object graph will reside in heap during normal
> processing, and will only be updated from the persistence mechanism
> when it changes or when the service restarts.
>
> On Mar 18, 1:06 pm, Michael Robinson <michael8robin...@gmail.com>
> wrote:
>> David and Paul,
>>
>> Thank you for your responses below. I'm working with Tao Wen on the
>> same project, and I may be able to clarify a number of points for you.
>>
>> Our fundamental concern is best practice for data/content integration
>> in the CQ/CRX application development environment.
>>
>> We have a legacy back-end system that manages a complex product and
>> service catalog with several levels of many<->many relationships and
>> associated business rules. Changing the processes and systems for
>> managing this catalog is out of scope. We have created a POJO domain
>> model which we populate through an ETL process from the legacy system,
>> and we have custom CQ components to expose the catalog structure
>> through dynamic user interaction following the recommended AJAX client-
>> side application composition approach.
>>
>> We have two main questions. First, what is the recommended approach
>> to persist the catalog object graph, and synchronize it across all
>> publisher instances. One possibility is to persist it to an XML
>> document, and put it in a node for synchronization and persistence.
>> Because it is a graph of objects (i.e. cyclically-associated instances
>> of classes with essential business logic methods), not a tree of
>> content, mapping directly to JCR semantics would be a huge loss in
>> terms of effort, performance, and functionality.
>>
>> Our second question is what is the recommended CQ approach for
>> integrating data-driven transactional user applications (exemplified
>> by, say, expedia.com) with a content-centric website (exemplified by,
>> say, cnn.com). Particularly, what's best practice for handing off
>> user session and transactional state across the boundary between the
>> two.
>>
>> Any guidance you may be able to provide will be greatly appreciated.
>>
>> -Michael Robinson
>> > >> /mobile/handset. We will do URL parsing ourself to get
>> > >> the parameters. The final resource provided to script engine will
>> > >> contain the necessary data. In the page, we can reference the handset
>> > >> by reference the current node (a.k.a the resource provided). The
>>
>> > >> resource provider is similar to tranditional controller in MVC
>> > >> architecture. So there might be 20+ controllers in a system, would
>> > >> that be a issue?
>>
>> > >> 4. How to embed them as part of the page
>>
>> > >> We leverage the CQ component here. The component will be configured
>>
>> > >> with a URL, where it can fetch its html. So, the comonent can not
>> > >> render itself, instead, it relies on another URL to do it (which is
>> > >> backened by a controller). To do this, we either need to implement
>> > >> some sort of server side include, or using client side ajax call. The
>>
>> > >> concern of server side include is how to do that, and how to pass user
>> > >> session. The concern of using client side ajax call is
>> > >> the user
>> > >> experience might be impacted as the network could be slow. We were
>> > >> told from the sales of CQ that to be effective about page caching,
>> > >> we’d better to use ajax call to retrieve the dynamic content.
>>
>> > >> Any comments?
>>
>> > --
>> > Visit:http://dev.day.com/
> >
>
--
Visit: http://dev.day.com/
--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups "Day Communique" group.
To post to this group, send email to day-communique@googlegroups.com
To unsubscribe from this group, send email to day-communique+unsubscribe@googlegroups.com
For more options, visit this group at http://groups.google.com/group/day-communique?hl=en
-~----------~----~----~----~------~----~------~--~---
Hi Michael,
I just realized that I did not respond to your second question.
> Our second question is: what is the recommended CQ approach for
> integrating data-driven transactional user applications (exemplified
> by, say, expedia.com) with a content-centric website (exemplified by,
> say, cnn.com)? In particular, what's the best practice for handing off
> user session and transactional state across the boundary between the
> two?
This largely depends on the situation. Some customers have existing
application logic in an existing web framework, say Struts or Spring.
For those customers we would recommend running the CQ5
(well, Apache Sling) web application in the same servlet container
in case you want to use session information across both applications.
It is important to note, though, that thanks to our REST-oriented architecture
we don't actually need a session, either for the authoring user interface or
for the actual presentation layer in production.
Of course we leave it to our customers whether they want to use sessions for
their applications, but we don't have a session dependency per se.
For customers that do not have an existing application we would recommend
that they build their transactional applications as part of CQ5 / Sling
to leverage all the features of the WCM throughout their applications.
This means that information like message or string bundles can be managed
through the WCM UI by a translator or business user, and a simple change
of the text on a button does not require an application deployment any more.
It also means that a WCM author can place their application (say a shopping
cart detail view) on a page, add header and footer and modify the design,
preview it and for example send it through workflow for approval.
Since CQ5 comes with and runs in an OSGi container there is really
no limitation to the functionality that you can embed.
So I would probably recommend implementing the more hard-structured
transactional business logic in OSGi bundles, deploying those via the content
repository and then tying everything into CQ5 components available to the
content author using JSP (or any presentation layer scripting language of
your choosing).
Does that make sense?
regards,
david
Hi, David,
Thanks for your feedback. Unfortunately, the catalog data is managed
through a chain of legacy systems. While there is clearly room for
improvement there, making such improvements is completely and definitively
out of scope for our work.
So we have to deal with what we've got, which is an ETL process into our
domain model in a controller deployed as an OSGi service, integrating with
CQ.
While the details are client confidential, the attached UML is
representative of the structure and relationships found in the catalog
elements (i.e. this is not the structure of an instance of "package", but
rather the "meta" structure of package composition). The rules which govern
creation of valid package selections are numerous, complex, and subject to
frequent change (and, again, managed through legacy systems). We are
building dynamic package builder UI elements to assist in the package
selection and validation process.
This is the area where we are looking for guidance: how do we best integrate
management of the custom client-side data-driven dynamic UI elements with
the WCM interfaces, and how do we best integrate our dynamic client-side
elements with the production CQ page composition mechanism.
Thank you for any further assistance you may be able to provide.
-Michael Robinson
On Wed, Mar 18, 2009 at 10:20 PM, David Nuescheler <
david.nuescheler@gmail.com> wrote:
>
> Hi Michael,
>
> thanks a lot for the clarifications.
>
> I understand that in product information there are usually a lot of
> many-to-many relationships, so it definitely makes sense to have
> some (based on personal taste even most or all) of those objects
> as POJOs.
>
> In terms of many-to-many relationships there are a number of ways
> to achieve relationships in JCR, and depending on the characteristics
> of those relationships one or the other path is desirable.
> Traditionally people initially think of (hard) references protected by
> referential integrity as we know them from relational databases.
>
> JCR offers a variety of different ways of modeling relationships between
> nodes, which led me to mention the overuse of references in
> the guidelines that I personally use when modeling content. [1]
>
> We recently have gone through a number of modeling workshops for
> customers with actual product information. Personally, I think that
> product information is a great example of semi-structured information.
> There is a lot of unstructured information that is primarily used for
> display but bears no relevance for the respective e-commerce transactions.
> Of course there is also structured information like the price, SKU or the
> in-stock information that is very relevant for the transactions.
>
> Traditionally, a lot of the unstructured (non-transaction-related)
> information is also stored in a structured way in the RDBMS since one does
> not have the luxury of leveraging a "data first" [2] approach for the
> elements of a product catalog that do not really benefit from the
> "structure first" approach.
>
> I understand that this sometimes may be a big leap and it is my experience
> that an actual example with the real data model that you are looking at is
> most helpful to see what a JCR approach could be.
>
> If you are interested in sharing this on the list here, we could definitely
> work through our data model recommendations for your specific data model in
> public on this list so everybody can pitch in.
>
hi michael,
thanks a lot for the additional UML diagram.
here are my two cents. you will find that i am more of a believer in
"data-first", hence my personal data modeling will go more "by example"
than by uml diagram, but please consider this a matter of personal
preference, not a limitation of the content repository or cq5.
since i will work in examples i will assume a telco operator based on the
above generic uml catalog model.
i think the entities at our disposal are clearly described by your
uml diagram so i think we can stick to those. since i am not a big fan
of describing a hard-structured data model in nodetypes,
i would rather use the feature of residual properties and child nodes
and probably go with an nt:unstructured (or similar) approach and
use the sling:resourceType property as the means of identifying the
entity type for starters.
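To make the sling:resourceType idea concrete, here is a minimal sketch in Python rather than against the real JCR or Sling API. Each node is a free-form bag of properties, and the sling:resourceType value alone decides how it is rendered. The node contents, the type names (catalog/device, catalog/service) and the render functions are all hypothetical.

```python
# Hypothetical in-memory stand-in for nt:unstructured nodes: each node is a
# free-form dict of properties; only sling:resourceType identifies the entity.
NODES = {
    "/content/devices/apple/iphone": {
        "sling:resourceType": "catalog/device",   # assumed type name
        "title": "iPhone",
        "colors": ["black", "white"],             # residual property, no schema
    },
    "/content/services/mobile": {
        "sling:resourceType": "catalog/service",  # assumed type name
        "title": "Mobile",
    },
}


def render_device(props):
    return "Device: %s (%s)" % (props["title"], ", ".join(props.get("colors", [])))


def render_service(props):
    return "Service: %s" % props["title"]


# Registry keyed by resource type. This only illustrates the principle of
# choosing a renderer per type; it is not Sling's actual resolution logic.
RENDERERS = {
    "catalog/device": render_device,
    "catalog/service": render_service,
}


def render(path):
    props = NODES[path]
    renderer = RENDERERS.get(props.get("sling:resourceType"))
    if renderer is None:
        return "Unknown entity at %s" % path
    return renderer(props)
```

Adding a new entity type then means adding one type name and one renderer, with no schema change for the nodes themselves.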
the simplest way to go about this in a cq5 way would then be to say,
ok we got packages, services, devices, ...
so let's just place all of those into /content/packages,
/content/services, /content/devices...
(this may be oversimplified since one would probably still toss
countries and languages into the hierarchy)
usually in cq one would make those entities individual templates. now given
that in your case "content authors" cannot create services or packages it
is not really necessary to expose them as templates, but i would still
expose them as individual sling:resourceTypes.
so that would yield something like:
/content/packages/christmaspackage
/content/packages/studentspackage
/content/services/mobile
/content/services/voice
/content/services/broadband
/content/devices/nokia/5800
/content/devices/apple/iphone
/content/accessories/apple/bluetooth_headset
/content/accessories/nokia/minispeakers_md8
as you see i already used the 1-to-many relationships of the vendor to the
devices and accessories as a mechanism to make the hierarchy more user friendly.
it is important to mention that one of the main drivers of the hierarchy should
be access control. this means that objects that "belong together"
because one group of people is responsible for managing or seeing them really
should be stored in one location in the repository. i think this is fairly
intuitive given that we have all worked with file systems and learned to
understand the benefits of a hierarchy when it comes to access control.
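As a rough illustration of how records coming out of an ETL step could be placed into such a hierarchy, the sketch below derives /content/<type>/<vendor>/<name> paths from plain records. The record fields (type, vendor, name) and the slug rule are assumptions made for this example, not something CQ5 prescribes.

```python
import re


def slug(text):
    """Lower-case a label and collapse anything non-alphanumeric into '_'."""
    return re.sub(r"[^a-z0-9]+", "_", text.lower()).strip("_")


def content_path(record):
    """Build /content/<type>/<vendor>/<name>; the vendor level is optional."""
    parts = ["content", slug(record["type"])]
    if record.get("vendor"):
        parts.append(slug(record["vendor"]))
    parts.append(slug(record["name"]))
    return "/" + "/".join(parts)


# Hypothetical rows as they might come out of the ETL step.
records = [
    {"type": "devices", "vendor": "Nokia", "name": "5800"},
    {"type": "accessories", "vendor": "Apple", "name": "Bluetooth Headset"},
    {"type": "packages", "name": "ChristmasPackage"},
]
paths = [content_path(r) for r in records]
```

Because the vendor becomes its own level, access rights for one vendor's devices could be granted on a single folder.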
you may find yourself in a situation where the number of sibling nodes
in the hierarchy becomes too large to handle. this is more of a gut feeling
thing than any hard rule, just as all of us would try to avoid putting 100k
files into a single folder, or having empty folders with just a single
file/folder in them... (except for us java developers with our mostly
empty looking java package folders ;))
as you may see above we would also drive the path of the hierarchy to
be human readable and seo friendly when repurposed in the url.
so the iphone device description would be reachable at
http://localhost:4502/content/devices/apple/iphone.html
now one may argue that it is possibly easier to use an id in the path and create
something like /content/devices/1834620042374723 but i think it is
fairly evident that while this is definitely a mapping that works from
a repository perspective it is not particularly user friendly. keep in mind
that there is a uuid at your disposal for every node if you (should
ever ;)) need one.
the next step that i would recommend is to expose our "devices",
"accessories", etc. through the "content finder" in cq5. this means that
on the left-hand side of the wcm view of a content page the user would have a
"devices" tab, or an "accessories" tab, that can be used throughout the
content pages to associate, let's say, products with a link or
a teaser box in content.
when "associating", let's say, a "device" with its accessories i would
use the normal wcm mechanism of having a multi-value property called
"recommendedAccessories" on my node "iphone" and i would personally just
put a path to my "Bluetooth Headset" in there... (.../apple/bluetooth_headset)
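A small sketch of what reading such a path-valued, multi-value property could look like, again in plain Python rather than the JCR API. The assumption that relative values hang off /content/accessories, and the sample nodes, are invented for the example. Since plain paths carry no referential integrity, the reader simply skips values that no longer resolve.

```python
ACCESSORIES_ROOT = "/content/accessories"   # assumed base for relative values

# Hypothetical repository content, keyed by path.
NODES = {
    "/content/devices/apple/iphone": {
        "recommendedAccessories": [
            "apple/bluetooth_headset",
            "nokia/discontinued_item",      # dangling on purpose
        ],
    },
    "/content/accessories/apple/bluetooth_headset": {"title": "Bluetooth Headset"},
}


def recommended_accessories(device_path):
    """Return (path, properties) for every reference that still resolves."""
    refs = NODES[device_path].get("recommendedAccessories", [])
    resolved = []
    for ref in refs:
        # Absolute values are used as-is; relative ones hang off the root.
        path = ref if ref.startswith("/") else ACCESSORIES_ROOT + "/" + ref
        if path in NODES:          # no referential integrity: skip broken paths
            resolved.append((path, NODES[path]))
    return resolved
```

The cost of the readable model is visible here: a moved or deleted accessory silently drops out of the list instead of raising an error.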
now a lot of people may argue that they want either "referential
integrity" or "stability across move operations" in there. if you do need
them, feel free to do that. jcr offers mechanisms for both. personally, i
believe that both are unnecessary and the benefits don't justify the
cost of losing an intuitive data model.
... compare what is probably the best case:
/content/devices/apple/iphone/jcr:content/recommendedAccessories=apple/bluetooth_headset
...where i would argue everybody who looks at this automatically understands
what that means.
... to somewhat of a worst case from a content modeling perspective:
/content/devices/473653453/jcr:content/recommendedAccessories=132332425
(well i have seen worse, but i don't dare to bring it up ;))
now given that this pretty much outlines how i would approach the situation
from a green field perspective, of course this may impose complexity that
has not been planned for in the etl layer, and of course in real life you would
need to compromise and cut corners depending on the focus of your project.
as mentioned before i am more than happy to have a look at the specifics and
give you my recommendations on where to cut corners without losing too much
of cq5's facilities (ui-integration, replication, versioning, ...).
regards,
david
On Thu, Mar 19, 2009 at 3:31 PM, Michael Robinson
<michael8robinson@gmail.com> wrote:
> [...]
Thanks, David. I think I understand what you're saying now. All the
data should be placed in well-positioned nodes in the JCR hierarchy
with a user-friendly repurposeable URI.
So, expanding on your examples, we would have a node in the content
repository like:
/content/packages/service/mobile/minutes/500/sms/200/device/nokia/5800/red/vas/pushtotalk/vas/mobiledata/200Mb/service/voice/device/CPE/M21120/vas/IDD/200minutes
And in that node we would attach attributes like the list price for
that particular package, the price after applicable promotional
discounts, eligibility rules, and so on.
And then our package selector would execute a JCR query across the
/content/packages hierarchy to collect the nodes that matched the
client-side UI settings as the user adjusted, say, the number of
mobile minutes in the plan, or the color of the phone.
Is that right?
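For illustration only, the selection step described above might look like the following if the selectable attributes were stored as properties on each package node instead of being encoded in the path. The property names and sample packages are hypothetical, and the plain Python filter merely stands in for a JCR query.

```python
# Hypothetical package nodes keyed by path, attributes held as properties.
PACKAGES = {
    "/content/packages/studentspackage": {
        "mobileMinutes": 500, "deviceColor": "red", "listPrice": 2900,
    },
    "/content/packages/christmaspackage": {
        "mobileMinutes": 500, "deviceColor": "black", "listPrice": 3900,
    },
    "/content/packages/basic": {
        "mobileMinutes": 200, "deviceColor": "red", "listPrice": 1900,
    },
}


def find_packages(settings):
    """Return sorted paths of packages whose properties equal every UI setting."""
    return sorted(
        path
        for path, props in PACKAGES.items()
        if all(props.get(key) == value for key, value in settings.items())
    )
```

Each change in the client-side UI would then translate into a new settings dict, for example find_packages({"mobileMinutes": 500, "deviceColor": "red"}).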
And if not, then where do you store and how do you query the business
rules around package validity (e.g. data VAS plans only for
smartphones), promotion applicability (e.g. 10% discount on 3 or more
VAS), availability (e.g. service not available in rural postal codes),
promotional pricing calculation, and so on?
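To pin down what the three example rules mean, here is one possible shape for them if they lived in code (for instance inside a service) rather than in content. This is a sketch under that assumption, not an answer given in this thread. The selection fields and the use of integer cents for prices are invented for the example.

```python
def is_valid(selection):
    """Validity rule: data VAS plans are only allowed for smartphones."""
    has_data_vas = any(v["kind"] == "data" for v in selection["vas"])
    return selection["device"]["smartphone"] or not has_data_vas


def is_available(selection, rural_postal_codes):
    """Availability rule: service is not offered in rural postal codes."""
    return selection["postal_code"] not in rural_postal_codes


def promotional_price(selection):
    """Promotion rule: 10% discount when 3 or more VAS are selected."""
    price = selection["list_price"]          # integer cents, assumed
    if len(selection["vas"]) >= 3:
        price = price * 90 // 100
    return price


# Hypothetical selection as the package builder UI might submit it.
selection = {
    "device": {"name": "5800", "smartphone": True},
    "vas": [{"kind": "data"}, {"kind": "voice"}, {"kind": "idd"}],
    "postal_code": "100001",
    "list_price": 10000,
}
```

Keeping each rule as its own small function would let frequently changing rules be replaced one at a time.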
Also, given that all the catalog data is coming out of legacy systems,
how is the mapping to user-friendly URLs accomplished when new records
are created in the back end?
On Mar 20, 5:08 pm, David Nuescheler <david.nuesche...@gmail.com>
wrote:
> [...]
David and Paul,
Thank you for your responses below. I'm working with Tao Wen on the
same project, and I may be able to clarify a number of points for you.
Our fundamental concern is best practice for data/content integration
in the CQ/CRX application development environment.
We have a legacy back-end system that manages a complex product and
service catalog with several levels of many<->many relationships and
associated business rules. Changing the processes and systems for
managing this catalog is out of scope. We have created a POJO domain
model which we populate through an ETL process from the legacy system,
and we have custom CQ components to expose the catalog structure
through dynamic user interaction following the recommended AJAX
client-side application composition approach.
We have two main questions. First, what is the recommended approach
to persist the catalog object graph and synchronize it across all
publisher instances? One possibility is to persist it to an XML
document, and put it in a node for synchronization and persistence.
Because it is a graph of objects (i.e. cyclically-associated instances
of classes with essential business logic methods), not a tree of
content, mapping directly to JCR semantics would be a huge loss in
terms of effort, performance, and functionality.
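The XML option mentioned above has to cope with the cycles in the object graph. A minimal sketch of one common way to do that, writing each object once and storing associations as id references, is shown below. The Item class, the element names and the ids are hypothetical, and Python's ElementTree is used purely for illustration.

```python
import xml.etree.ElementTree as ET


class Item:
    """Hypothetical catalog object with many-to-many links to other items."""

    def __init__(self, item_id, name):
        self.item_id = item_id
        self.name = name
        self.related = []      # may point back at items that point here


def to_xml(items):
    root = ET.Element("catalog")
    for item in items:
        el = ET.SubElement(root, "item", id=item.item_id, name=item.name)
        for other in item.related:
            # Store only the id of the target, so cycles never recurse.
            ET.SubElement(el, "ref", target=other.item_id)
    return ET.tostring(root, encoding="unicode")


def from_xml(text):
    root = ET.fromstring(text)
    # First pass: create every object. Second pass: re-link by id.
    items = {
        el.get("id"): Item(el.get("id"), el.get("name"))
        for el in root.findall("item")
    }
    for el in root.findall("item"):
        for ref in el.findall("ref"):
            items[el.get("id")].related.append(items[ref.get("target")])
    return items


phone = Item("dev-1", "phone")
headset = Item("acc-1", "headset")
phone.related.append(headset)
headset.related.append(phone)     # cycle: phone <-> headset

restored = from_xml(to_xml([phone, headset]))
```

The same flatten-and-reference idea is what a path-valued property does in a content tree, so the two options differ less in how cycles are handled than in where the business logic methods live.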
Our second question is: what is the recommended CQ approach for
integrating data-driven transactional user applications (exemplified
by, say, expedia.com) with a content-centric website (exemplified by,
say, cnn.com)? In particular, what's the best practice for handing off
user session and transactional state across the boundary between the
two?
Any guidance you may be able to provide will be greatly appreciated.
-Michael Robinson
On Mar 17, 3:08 pm, David Nuescheler <david.nuesche...@gmail.com>
wrote:
> hi tao,
>
> thanks a lot for your post. i think you raise a lot of interesting
> questions, that are very relevant for people new to jcr, hence this
> is a great conversation to have in a public and archived way.
>
> > The reason I did not choose to store it in JCR is that some
> > transformation needs to be done before the data is presented. Things like
> > the list of price plans could change depending on the handset you have chosen.
> > If things are stored in JCR, I think I need to do OCM and load it in memory,
> > then filter the list by applying business rules. Also, in my case, I just
> > need to view it, no need to update/delete.
>
> personally, i would probably never use ocm in this scenario.
>
> jcr has a built-in mechanism to serialize and deserialize to and from xml.
> as a matter of fact there is an argument to be made that jcr can produce
> a stream of sax events directly without actually parsing an xml document,
> which definitely can be an advantage in terms of performance. [1]
>
> i would definitely suggest to store the product information as separate nodes
> so they can be managed at a fine level of granularity. i think both joerg and
> paul are spot on with their comments.
>
> having individual nodes lets you use things like search, versioning, access
> control etc. at a fine-grained level without compromising on transformation
> or script-ability. there are certain cases where it makes sense to store
> information as an xml blob, mostly if we are talking about huge, massively
> complex documents that are not worth the additional effort of deserializing.
>
> in my experience, jcr is a very different persistence paradigm and i would
> like to offer up the recommendation that if this happens to be your first
> project with jcr you have your "content model" looked at either
> by a day services engineer or architect or by an experienced day partner.
> usually, a "content modeling workshop" of a couple of hours can save
> you a lot of time later on in the project.
> of course you are more than welcome to post your content model as a
> cnd to this list in a less formal fashion and get everybody's input.
>
> regards,
> david
>
> [1] http://www.day.com/maven/jsr170/javadocs/jcr-1.0/javax/jcr/Session.ht...)
>
>
>
> > 2009/3/17 Paul McMahon <oro...@yahoo.com>
>
> >> The primary comment I would have is to wonder why you are storing the
> >> content as an XML file? Perhaps I am missing a requirement, but it would
> >> seem to me that you would be better off storing the content you import as
> >> nodes in the CRX repository.
>
> >> As an example think of how the campaign content is managed in the
> >> Geometrixx example. You have a tree in the repository that is separate from
> >> the site tree, but is still standard content in the repository.
>
> >> If you stored the content as a node it would simplify your replication
> >> issues and your design for how you embed the content in your component. You
> >> would be able to leverage the standard Sling resource resolver so you
> >> wouldn't have to build your own. Depending on exactly what your requirements
> >> are you could look at how the teaser or reference components render the data
> >> they display.
>
> >> You mention that you are transforming some of this data after you fetch it
> >> via JDBC so it may not be a good candidate for this, but this seems like a
> >> natural use for a CRX connector and a virtual repository. Again your
> >> requirements may mean that it's not a good approach, but it's worth
> >> considering.
>
> >> Paul McMahon
> >> Acquity Group
>
> --
> Visit:http://dev.day.com/
--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups "Day Communique" group.
To post to this group, send email to day-communique@googlegroups.com
To unsubscribe from this group, send email to day-communique+unsubscribe@googlegroups.com
For more options, visit this group at http://groups.google.com/group/day-communique?hl=en
-~----------~----~----~----~------~----~------~--~---
Just to clarify, the object graph will reside in heap during normal
processing, and will only be updated from the persistence mechanism
when it changes or when the service restarts.
On Mar 18, 1:06 pm, Michael Robinson <michael8robin...@gmail.com>
wrote:
> David and Paul,
>
> Thank you for your responses below. I'm working with Tao Wen on the
> same project, and I may be able to clarify a number of points for you.
>
> Our fundamental concern is best practice for data/content integration
> in the CQ/CRX application development environment.
>
> We have a legacy back-end system that manages a complex product and
> service catalog with several levels of many<->many relationships and
> associated business rules. Changing the processes and systems for
> managing this catalog is out of scope. We have created a POJO domain
> model which we populate through an ETL process from the legacy system,
> and we have custom CQ components to expose the catalog structure
> through dynamic user interaction following the recommended AJAX client-
> side application composition approach.
>
> We have two main questions. First, what is the recommended approach
> to persist the catalog object graph, and synchronize it across all
> publisher instances. One possibility is to persist it to an XML
> document, and put it in a node for synchronization and persistence.
> Because it is a graph of objects (i.e. cyclically-associated instances
> of classes with essential business logic methods), not a tree of
> content, mapping directly to JCR semantics would be a huge loss in
> terms of effort, performance, and functionality.
>
> Our second question is what is the recommended CQ approach for
> integrating data-driven transactional user applications (exemplified
> by, say, expedia.com) with a content-centric website (exemplified by,
> say, cnn.com). Particularly, what's best practice for handing off
> user session and transactional state across the boundary between the
> two.
>
> Any guidance you may be able to provide will be greatly appreciated.
>
> -Michael Robinson
hi michael,
thanks again for the further details.
> Thanks, David. I think I understand what you're saying now. All the
> data should be placed in well-positioned nodes in the JCR hierarchy
> with a user-friendly repurposeable URI.
that's correct as far as the path in the content repositories goes.
this should ideally relate to my rule #2 in my personal guidelines on
how to model content. [1]
> So, expanding on your examples, we would have a node in the content
> repository like:
> /content/packages/service/mobile/minutes/500/sms/200/device/nokia/5800/
> red/vas/pushtotalk/vas/mobiledata/200Mb/service/voice/device/CPE/
> M21120/vas/IDD/200minutes
while i am not sure of the underlying data model, this looks like a scary
path from a depth perspective. i usually consider that a well balanced
tree has something like 5-15 siblings on average per level of the hierarchy.
if i would take an average of 10 this would mean in the above example
that under /content/packages we would have 23 levels, which would be
something like 10^23 nodes for packages. which definitely sounds like
a lot. almost any number ^23 sounds like a lot ;)
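a quick way to check the arithmetic above is a minimal sketch like the following. the helper names are made up for illustration, and the branching factor of 10 is the assumed average from the paragraph above, not a measured value.

```python
# The example path quoted above, joined back into a single string.
EXAMPLE_PATH = (
    "/content/packages/service/mobile/minutes/500/sms/200/device/nokia/5800/"
    "red/vas/pushtotalk/vas/mobiledata/200Mb/service/voice/device/CPE/"
    "M21120/vas/IDD/200minutes"
)


def levels_below(path: str, root: str) -> int:
    """Count the path segments that follow `root` (hypothetical helper)."""
    if not path.startswith(root + "/"):
        raise ValueError(f"{path!r} is not below {root!r}")
    return len(path[len(root):].strip("/").split("/"))


def nodes_at_depth(siblings: int, depth: int) -> int:
    """Nodes on the deepest level of a full tree with `siblings` children per node."""
    return siblings ** depth


depth = levels_below(EXAMPLE_PATH, "/content/packages")
print(depth)                      # number of levels under /content/packages
print(nodes_at_depth(10, depth))  # assumed average of 10 siblings per level
```

even at the low end of the 5-15 range the count stays far beyond anything one would want to store, which is the point of the remark.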
i am not exactly sure how many individually priced "packages/variations"
you would have, but i would probably aim for a reasonable number of
packages with various price-unrelated options. i think one would try to
keep the tree from being sparse.
> And in that node we would attach attributes like the list price for
> that particular package, the price after applicable promotional
> discounts, eligibility rules, and so on.
i think if there is a matrix of many different options and the matrix
is sparse, i would probably have all the different plans and devices, etc.
in a content hierarchy and would probably store the aggregates of all
the different combinations that need to be priced out separately in a way
that would not yield many more nodes than the total number
of priced items.
do you have an estimate of how many individually priced items (packages &
variations) one would be looking at in your model? just order of magnitude,
1k, 10k, 100k, 1m, 10m?
> And then our package selector would execute a JCR query across the /
> content/packages hierarchy to collect the nodes that matched the
> client-side UI settings as the user adjusted, say, the number of
> mobile minutes in the plan, or the color of the phone.
> Is that right?
generally, that could be one solution... if we are talking about the
client-side ui, are we talking about a web-ui? depending on the
number of combinations it may even make sense to just load
the entire graph in one go (json) for the "configurator"-ui.
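as a rough illustration of the "load the entire graph in one go" idea, the sketch below serializes a tiny catalog tree to json. the tree contents (colors, minute and sms bundles) are invented for the example, and `catalog_as_json` is a hypothetical helper, not part of cq5 or sling.

```python
import json

# Hypothetical, heavily reduced catalog. Node names follow the examples in
# this thread; the property values are made up for illustration.
catalog = {
    "devices": {
        "nokia": {"5800": {"colors": ["red", "black"]}},
        "apple": {"iphone": {"colors": ["black", "white"]}},
    },
    "services": {
        "mobile": {"minutes": [200, 500], "sms": [100, 200]},
    },
}


def catalog_as_json(tree: dict) -> str:
    """Serialize the whole tree so the configurator ui can fetch it in one request."""
    return json.dumps(tree, sort_keys=True)


payload = catalog_as_json(catalog)
print(len(payload), "bytes for the whole graph")
```

whether this is viable depends on the order-of-magnitude question above: a few thousand combinations fit comfortably in one response, millions do not.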
> And if not, then where do you store and how do you query the business
> rules around package validity (e.g. data VAS plans only for
> smartphones), promotion applicability (e.g. 10% discount on 3 or more
> VAS), availability (e.g. service not available in rural postal codes),
> promotional pricing calculation, and so on?
i think i would store the individual aspects of pricing in specific promotion or
discount nodes, with something like a pricing or discount type that
refers in one way or another to a java class that can then work
out the effect of a discount on a specific selection.
overly simplified i would probably have a node at:
/content/specialoffers/students-200-free-sms
- applicableServices: /content/services/...
- limitations: student
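to make the "discount type refers to a class" idea concrete, here is a minimal sketch in python rather than java. only the node path and the property names `applicableServices` and `limitations` come from the example above. the `discountType` and `freeSms` properties, the value chosen for `applicableServices`, and all function names are assumptions for illustration.

```python
from typing import Callable, Dict

# The offer node modelled as a plain dict of properties.
OFFER = {
    "path": "/content/specialoffers/students-200-free-sms",
    "applicableServices": ["/content/services/mobile"],  # assumed value
    "limitations": ["student"],
    "discountType": "free-sms",  # assumed property naming the rule implementation
    "freeSms": 200,              # assumed property
}


def free_sms(offer: dict, selection: dict) -> dict:
    """Rule implementation: add the offer's free sms to the selection's allowance."""
    result = dict(selection)
    result["sms"] = selection.get("sms", 0) + offer["freeSms"]
    return result


# Registry from discount type to implementation (stand-in for a java class lookup).
DISCOUNT_TYPES: Dict[str, Callable[[dict, dict], dict]] = {"free-sms": free_sms}


def applies(offer: dict, selection: dict) -> bool:
    """The offer applies if the service matches and every limitation is met."""
    groups = set(selection.get("customerGroups", []))
    return (selection["service"] in offer["applicableServices"]
            and set(offer["limitations"]) <= groups)


def apply_offer(offer: dict, selection: dict) -> dict:
    """Return the selection after the offer, or an unchanged copy if it does not apply."""
    if not applies(offer, selection):
        return dict(selection)
    return DISCOUNT_TYPES[offer["discountType"]](offer, selection)
```

the content stays declarative (paths and property values that can be replicated and versioned), while the business rule itself lives in code that is looked up by type.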
> Also, given that all the catalog data is coming out of legacy systems,
> how is the mapping to user-friendly URLs accomplished when new records
> are created in the back end?
since the paths in the repository would be generated automatically by
the etl process, i think the addition of new records should not introduce
additional complexity.
i think my recommendation would be to use human readable names in
the content hierarchy rather than, for example, numbers (unless the "numbers"
are meaningful to the user). there may definitely be some additional work
needed in case the back-end system changes its data model.
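one way the etl step could derive readable paths from back-end labels is sketched below. `slug` and `node_path` are hypothetical helpers, and the rule of lower-casing and replacing anything non-alphanumeric with an underscore is just one possible convention.

```python
import re
import unicodedata


def slug(label: str) -> str:
    """Turn a back-end label into a lower-case, url-friendly node name."""
    ascii_label = (unicodedata.normalize("NFKD", label)
                   .encode("ascii", "ignore")
                   .decode("ascii"))
    name = re.sub(r"[^a-z0-9]+", "_", ascii_label.lower()).strip("_")
    if not name:
        raise ValueError(f"cannot derive a node name from {label!r}")
    return name


def node_path(root: str, *labels: str) -> str:
    """Build a repository path below `root` from human readable labels."""
    return "/".join([root.rstrip("/")] + [slug(label) for label in labels])


print(node_path("/content/devices", "Apple", "iPhone"))
print(node_path("/content/accessories", "Apple", "Bluetooth Headset"))
```

because the mapping is deterministic, re-running the import for the same record lands on the same node. two labels that collapse to the same name would collide, so a real import would need a collision rule on top of this.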
regards,
david
[1] http://wiki.apache.org/jackrabbit/DavidsModel
> On Mar 20, 5:08 pm, David Nuescheler <david.nuesche...@gmail.com>
> wrote:
>> hi michael,
>>
>> thanks a lot for the additional UML diagram.
>>
>> here are my two cents, you will find that i am more of a believer in
>> "data-first", hence my personal data modeling will go more "by example"
>> than by uml diagram, but please consider this a matter of personal
>> preference, not a limitation of the content repository or cq5.
>>
>> since i will work in examples i will assume a telco operator based on the above
>> generic uml catalog model.
>>
>> i think the entities at our disposition are clearly described by your
>> uml diagram so i think we can stick to those. since i am not a big fan
>> of actually going to describe a hard-structured datamodel in nodetypes
>> i would rather use the feature of residual properties and childnodes
>> and probably go with an nt:unstructured (or similar) approach and
>> use the sling:resourceType property as the means of identifying the
>> entity type for starters.
>>
>> the simplest way to go about this in a cq5 way would then be to say,
>> ok we got packages, services, devices, ...
>> so let's just place all of those into /content/packages,
>> /content/services, /content/devices...
>> (this may be oversimplified since one would probably still toss
>> countries and languages into the hierarchy)
>>
>> usually in cq one would make those entities individual templates. now given
>> that in your case "content authors" cannot create services or packages, it
>> is not really necessary to expose them as templates, but i would still
>> expose them as individual sling:resourceTypes.
>>
>> so that would yield something like:
>> /content/packages/christmaspackage
>> /content/packages/studentspackage
>> /content/services/mobile
>> /content/services/voice
>> /content/services/broadband
>> /content/devices/nokia/5800
>> /content/devices/apple/iphone
>> /content/accessories/apple/bluetooth_headset
>> /content/accessories/nokia/minispeakers_md8
>>
>> as you see i already used the 1-to-many relationships of the vendor to the
>> devices and accessories as a mechanism to make the hierarchy more user friendly.
>> it is important to mention that one of the main drivers of the hierarchy should
>> be the access control. this means that objects that "belong together",
>> since a group of people is responsible to manage or see them, really should
>> be stored in one location in the repository. i think this is fairly intuitive
>> given that we all worked with file systems, and learned to understand the
>> benefits of a hierarchy when it comes to access control.
>>
>> you may find yourself in a situation where the number of sibling nodes
>> in the hierarchy becomes too large to handle. this is more of a gut feeling
>> thing than any hard rule, as much as all of us would try to avoid putting
>> 100k files into a single folder or having empty folders with just a single
>> file/folder in them... (except for us java developers with our mostly empty
>> looking java packages folders ;))
>>
>> as you may see above we would also drive the path of the hierarchy to
>> be usable for human readable and seo friendly repurposing in the url.
>> so the iphone device description would be reachable at
>> http://localhost:4502/content/devices/apple/iphone.html
>> now one may argue that it is possibly easier to use an id in the path and create
>> something like /content/devices/1834620042374723 but i think it is
>> fairly evident that while this is definitely a mapping that works from
>> a repository perspective it is not particularly user friendly. keep in mind
>> that there is a uuid available at your disposition for every node if you
>> (should ever ;)) need one.
>>
>> the next step that i would recommend is to expose our "devices",
>> "accessories", etc. through the "content finder" in cq5. this means that
>> on the left hand side of the wcm of a content page the user would have a
>> "devices" tab, or an "accessories" tab, that can be used throughout the
>> content pages to associate let's say products for a link or
>> a teaser box in content.
>>
>> when "associating" let's say a "device" with its accessories i would
>> use the normal wcm mechanism of having a multi value property called
>> "recommendedAccessories" on my node "iphone" and i would personally just
>> put a path to my "Blue Tooth Headset" in there... (.../apple/bluetooth_headset)
>>
>> now a lot of people may argue that they want either "referential
>> integrity" or "stability across move operations" in there. if you do need
>> them, feel free to do that. jcr offers mechanisms for both. personally, i
>> believe that both are unnecessary and the benefits don't justify the
>> cost of losing an intuitive data model.
>>
>> ... compare probably the best case:
>>
>> /content/devices/apple/iphone/jcr:content/recommendedAccessories=apple/bluetooth_headset
>>
>> ...where i would argue everybody who looks at this automatically understands
>> what that means.
>>
>> ... to somewhat of a worst case from a content modeling perspective.
>> /content/devices/473653453/jcr:content/recommendedAccessories=132332425
>> (well i have seen worse, but i don't dare to bring it up ;))
>>
>> now given that this pretty much outlines, how i would approach the situation
>> from a green field perspective, of course this may impose complexity that
>> has not been planned for in the etl layer, and of course in real-life you would
>> need to compromise and cut corners depending on the focus of your project.
>>
>> as mentioned before i am more than happy to have a look at the specifics
>> and give you my recommendations on where to cut corners without losing too much
>> of cq5's facilities (ui-integration, replication, versioning, ...).
>>
>> regards,
>> david
Hi David,
It seems that we are converging on an understanding.
Do you agree:
1. Just as it is possible to force content repository semantics into
a relational database implementation, it is also possible to force
relational database semantics into a content repository. Just because
it's possible doesn't necessarily mean it's a good idea.
2. In a complex application domain with both content aspects and
relational aspects, one can go with a pure relational approach, a pure
content repository approach, or a hybrid of the two.
3. In a hybrid approach, the application design needs to separate
concerns between the relational part and the content part.
4. In a hybrid approach, the application design needs to ensure the
interfaces and contracts between the relational part and content part
are simple, clean, flexible and well-leveraged. This has implications
for the design of the client-server interfaces as well.
If we can agree on that, then a number of questions arise:
For 2, what are the criteria for deciding whether to adopt a
relational, content, or hybrid application design?
For 3, what are the criteria and design principles to use for
separating concerns?
For 4, what are the criteria and design principles (particularly with
respect to the details of the CQ architecture) for designing and
implementing the interfaces and contracts between the relational and
content components?
hi michael,
thanks for your thoughtful comments.
let me quickly try to answer some aspects of those.
> 1. Just as it is possible to force content repository semantics into
> a relational database implementation, it is also possible to force
> relational database semantics into a content repository. Just because
> it's possible doesn't necessarily mean it's a good idea.
generally, i see the content repository as a system that exposes both
the relational model and a hierarchical model.
see here [1].
so the content repository exposes a hierarchical view of the content
and also a relational view of the content as tables.
the query engine exposes xpath for people more interested in
the hierarchical aspects and exposes sql for people more interested
in the relational aspects.
so you can definitely use sql in a content repository to query the
repository in its relational model where the nodetypes are viewed
as tables.
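as a rough illustration of the two query styles, the sketch below builds one query string of each kind for the same node type and subtree. the syntax is written from memory of the jcr 1.0 (jsr-170) query languages and is worth checking against the spec. the two builder functions are hypothetical helpers, and no repository is contacted.

```python
def sql_query(node_type: str, path: str) -> str:
    """Relational view: the node type plays the role of the table."""
    escaped = path.rstrip("/").replace("'", "''")
    return f"SELECT * FROM {node_type} WHERE jcr:path LIKE '{escaped}/%'"


def xpath_query(node_type: str, path: str) -> str:
    """Hierarchical view: descend from a path and filter by node type."""
    return f"/jcr:root{path.rstrip('/')}//element(*, {node_type})"


print(sql_query("nt:unstructured", "/content/devices"))
print(xpath_query("nt:unstructured", "/content/devices"))
```

both strings select the same nodes; which one reads more naturally depends on whether one thinks of the content as rows of a type or as a subtree.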
the hierarchy in a content repository is an optional feature that
can (and in my personal opinion "should") be leveraged wherever
it makes sense.
in a web environment i find a hierarchy particularly valuable since
it gives you a very intuitive means to map urls (or even a navigation)
to content (or data if you prefer) that is very simple and similar
to a static webserver.
i also see a lot of value for a hierarchy in environments where
access control (or other features that benefit from inheritance)
is of value.
granted, there may be applications that do not benefit
from a hierarchy as explained above, nor from things like
versioning or full text search, nor from a
semi-structured approach. personally, i have
not run into an application that does not benefit from any of
the aspects exposed by a content repository.
however, of course this is a very theoretical argument.
in reality one will contrast particular implementations,
tooling, performance and most importantly personal
experience of content repositories and relational databases
and put them into perspective with a scope and timeline
of a project.
> 2. In a complex application domain with both content aspects and
> relational aspects, one can go with a pure relational approach, a pure
> content repository approach, or a hybrid of the two.
i would disagree with that on a theoretical level.
if you have aspects of "content" in your application (see featureset [1])
(and of course i would argue any application does), you are bound
to build a proprietary "content repository" on top of a relational database.
which of course is possible and people unfortunately do that in our
industry on a daily basis.
given my personal experience, i really have not run into a situation
where i would personally prefer to use a bare rdbms as my backing
store.
on a practical level (timeline, personal experience, legacy systems)
i completely agree that there may be many reasons to use a relational
database.
> 3. In a hybrid approach, the application design needs to separate
> concerns between the relational part and the content part.
agreed. (except that i would probably call it the content repository
part and the relational database part, since the content repository
also features the "relational" aspect)
> 4. In a hybrid approach, the application design needs to ensure the
> interfaces and contracts between the relational part and content part
> are simple, clean, flexible and well-leveraged. This has implications
> for the design of the client-server interfaces as well.
sure.
> If we can agree on that, then a number of questions arise:
>
> For 2, what are the criteria for deciding whether to adopt a
> relational, content, or hybrid application design?
from a theoretical standpoint i would use [1] and see how
the application's requirements match the featureset of a content
repository.
then combine that with practical issues like my personal experience,
performance or legacy infrastructure.
> For 3, what are the criteria and design principles to use for
> separating concerns?
assuming that a hybrid solution would be driven by practicalities
i would assume that the separation of concerns would also
be driven by practicalities... right?
maybe i misunderstand the question.
> For 4, what are the criteria and design principles (particularly with
> respect to the details of the CQ architecture) for designing and
> implementing the interfaces and contracts between the relational and
> content components?
on a very practical level, if someone went with a hybrid model, i would
probably recommend a regular java api domain model
that is used from the cq display component, probably through
a tag library wrapped around the java api.
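a minimal sketch of what such a domain model could look like, in plain java with no cq or sling dependencies. all names here (Product, ProductCatalog, InMemoryProductCatalog, CatalogDemo) are hypothetical and only illustrate the idea; a real implementation would sit in front of the relational database, and the display component or tag library would depend only on the interface.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

/** Plain value object for one catalog entry (fields are hypothetical). */
final class Product {
    private final String sku;
    private final String name;

    Product(String sku, String name) {
        this.sku = sku;
        this.name = name;
    }

    String getSku() { return sku; }
    String getName() { return name; }
}

/** The contract a display component or tag library would depend on. */
interface ProductCatalog {
    Optional<Product> findBySku(String sku);
}

/** Stand-in implementation; a real one would query the relational database. */
final class InMemoryProductCatalog implements ProductCatalog {
    private final Map<String, Product> bySku = new LinkedHashMap<>();

    void add(Product product) {
        bySku.put(product.getSku(), product);
    }

    @Override
    public Optional<Product> findBySku(String sku) {
        return Optional.ofNullable(bySku.get(sku));
    }
}

final class CatalogDemo {
    public static void main(String[] args) {
        InMemoryProductCatalog catalog = new InMemoryProductCatalog();
        catalog.add(new Product("N75", "Nokia N75"));

        // A rendering layer would do a lookup like this and print the result.
        String label = catalog.findBySku("N75")
                .map(Product::getName)
                .orElse("unknown product");
        System.out.println(label);
    }
}
```

the point of the interface is that the rendering side never sees jdbc or sql, so the backing store can change without touching the components.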
in the specific case that we discussed around product information i would
probably also expose the relevant product catalog information through a
content finder plugin so the author of the cms can use drag and drop
references to the product information, which would also mean that the
items that are referenced are also exposed as a sling resource.
regards,
david
[1] http://www.day.com/o.file/cr.png?get=4dc9fc18275e91e7f03a8be8ac35a6ce
--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups "Day Communique" group.
To post to this group, send email to day-communique@googlegroups.com
To unsubscribe from this group, send email to day-communique+unsubscribe@googlegroups.com
For more options, visit this group at http://groups.google.com/group/day-communique?hl=en
-~----------~----~----~----~------~----~------~--~---
Hi, David,
Thanks for your comments so far. There are a number of points I'd
like to clarify further:
On Mar 26, 4:19 pm, David Nuescheler <david.nuesche...@gmail.com>
wrote:
> generally, i see the content repository as a system that exposes both
> the relational model and a hierarchical model.
> see here [1].
On reviewing section 8.5 of JSR-170, it appears that there is no
support for aggregation, no support for updates, insertions or
deletions, no support for outer joins, and support for inner joins is
optional (except where it's required to support mixin semantics, where
it's not really a join at all).
This is very far removed from what I understand as the relational
model.
Also how do you reconcile your "data first" (start with everything as
nt:unstructured) design advice with the view that the repository
exposes a "relational model"?
> granted that there may be applications that do not benefit
> from a hierarchy as explained above, nor from things like
> versioning, full text search or don't have any benefit from a
> semi-structured approach. personally, i have
> not run into an application that does not benefit from any of
> the aspects exposed by a content repository.
[...]
> given my personal experience, i really have not run into a situation
> where i would personally prefer to use a bare rdbms as my backing
> store.
Is it your position that the aspects of the relational model not
exposed by JSR-170 are unnecessary for a content repository, or just
simply unnecessary as a general matter of application design?
> > For 3, what are the criteria and design principles to use for
> > separating concerns?
>
> assuming that a hybrid solution would be driven by practicalities,
> i would assume that the separation of concerns would also
> be driven by practicalities... right?
> maybe i misunderstand the question.
Given your clear design guidelines for a hierarchical content-centric
application design[1], I was hoping you might have similar insights or
guidance on how to integrate hierarchical content-centric application
design with relational data-centric application design.
[1] http://wiki.apache.org/jackrabbit/DavidsModel
--~--~---------~--~----~------------~-------~--~----~
hi michael,
thanks a lot for your post.
>> generally, i see the content repository as a system that exposes both
>> the relational model and a hierarchical model.
>> see here [1].
> On reviewing section 8.5 of JSR-170, it appears that there is no
> support for aggregation, no support for updates, insertions or
> deletions, no support for outer joins, and support for inner joins is
> optional (except where it's required to support mixin semantics, where
> it's not really a join at all).
> This is very far removed from what I understand as the relational
> model.
i completely agree that jcr 1.0 (aka jsr-170) does not mandate a broad
set of functionality in the query section, mostly because the vendors
involved in the expert group were not comfortable with making a broader
set of functionality mandatory.
given the fact that the industry has evolved quite a bit in the meantime
i would probably take jsr-283 as a reference to look at what can be mandated
these days from a specification perspective. on a side note jsr-283 will hit
proposed final draft in a couple of days, but you will already find an updated
query section that includes joins in the public review draft [1].
i think it is also important to note that the functionality exposed by a
vendor or a product (in your case crx) is not limited by the specification
at all, and that the specification provides a very clear path for vendors
on how to expose functionality beyond what it requires.
i think one really has to separate the "relational model" from the
expressiveness and features in an implementation. i definitely agree
that one needs proper joining capabilities to make practical use of
the relational aspects of a content repository, but i would argue that
inserts and updates would not really be needed given that there are
other means to easily update a result set or insert new data.
> Also how do you reconcile your "data first" (start with everything as
> nt:unstructured) design advice with the view that the repository
> exposes a "relational model"?
i would say that my interpretation of the "data first" approach primarily
leverages residual properties and child nodes. nt:unstructured is just the
ootb example and manifestation of that, which comes in very handy when
starting and gets people started the right way.
if one wants to have different "tables" to use something like
a join, then having a "marker" nodetype to facilitate the handling
is definitely an option.
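as an illustration of the "marker" nodetype idea, here is a small sketch that builds a join statement over two marker node types. it assumes the JCR-SQL2 syntax of jcr 2.0 (jsr-283); the node type names acme:product and acme:review, the property sku, and the helper class MarkerTypeJoin are all hypothetical.

```java
/** Builds a JCR-SQL2 inner join between two marker node types (names are hypothetical). */
final class MarkerTypeJoin {

    /**
     * Joins nodes of leftType with nodes of rightType where both carry
     * the same value in the given property (an equi-join condition).
     */
    static String equiJoin(String leftType, String rightType, String property) {
        return "SELECT l.* FROM [" + leftType + "] AS l"
             + " INNER JOIN [" + rightType + "] AS r"
             + " ON l.[" + property + "] = r.[" + property + "]";
    }

    public static void main(String[] args) {
        // Each marker node type plays the role of a "table" in the join.
        System.out.println(equiJoin("acme:product", "acme:review", "sku"));
    }
}
```

in a jcr 2.0 repository a statement like this would typically be handed to QueryManager.createQuery together with Query.JCR_SQL2; whether a given repository version supports joins is something to check against that product.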
>> given my personal experience, i really have not run into a situation
>> where i would personally prefer to use a bare rdbms as my backing
>> store.
> Is it your position that the aspects of the relational model not
> exposed by JSR-170 are unnecessary for a content repository, or just
> simply unnecessary as a general matter of application design?
i think particularly joins are very necessary for applications.
it was just impractical in the time frame of jsr-170 to mandate it
in the specification. so consider jsr-170 as a baseline that a broad
expert group was able to agree upon but not what i personally think
the functionality of a content repository should be.
of course our implementations generally exceed what jsr-170
mandates in various different respects, without violating any of
the contracts imposed by the specification.
>> > For 3, what are the criteria and design principles to use for
>> > separating concerns?
>> assuming that a hybrid solution would be driven by practicalities,
>> i would assume that the separation of concerns would also
>> be driven by practicalities... right?
>> maybe i misunderstand the question.
> Given your clear design guidelines for a hierarchical content-centric
> application design[1], I was hoping you might have similar insights or
> guidance on how to integrate hierarchical content-centric application
> design with relational data-centric application design.
i think it is a very good idea to determine certain very clear decision criteria
on how to mix/separate a traditional java database backed application
with cq5. i don't think that this exists in the form of a ready-to-use "recipe",
but i would definitely be interested in using something as specific as
your particular use case to derive some guiding principles from.
a lot of our integration partners and customers have mixed relational
database applications with cq, of which most were driven by an existing
legacy application. obviously an existing legacy app makes the decision
process a lot simpler...
i am definitely interested in continuing this conversation in whatever
form or shape and would like to thank you for all the valuable contributions.
regards,
david
[1] http://jcp.org/aboutJava/communityprocess/pr/jsr283/
--~--~---------~--~----~------------~-------~--~----~
Copyright © 2010 Day Software, Inc.