Here's a use case that I think summarizes what FISE could do, medium-term, to help CMS vendors manage their content in more semantic ways. I won't scare you with RDF, ontologies and the like: at this level we're just looking at providing valuable features to our users, without requiring them to learn anything new. There might well be RDF, ontologies and SPARQL queries under the hood, but at our level we don't care: this is just about the user story.
Here's a picture that I took on a trip to Iceland a few years ago. Typical Icelandic house with typical big Icelandic four-wheel drive vehicle (unlike in many places, you actually need those there, believe me) parked in front, with a canoe on top. Kinda makes you want to live there if you like wide open spaces.
Now, here's a drawing by young Bertrand which has much the same content, at the semantic level: a big car in front of a house, with a boat on top of the car. Not too stylish, but the same basic information is in there. Smiling sun of course, which you might get in Iceland every ten minutes in between showers...
For our eyes and brain it is trivial to see that both images describe a similar scene. However, I doubt your CMS or digital asset management system would consider them as similar. You need a good semantic understanding of them to find out that they pretty much tell the same story - it's not just about the raw bits.
That's where FISE comes into play. We don't have all the required semantic analysis algorithms in FISE for this use case today, but the current infrastructure would (mostly) enable it if we had them.
FISE allows you to plug in such algorithms, using a simple Java EnhancementEngine interface. Based on OSGi, FISE makes it possible to mix and match a wide range of Java libraries without conflicts, allowing pre-existing or new analysis modules to collaborate. Analyzers written in other languages can be integrated using either native language integration or remote access, ideally over HTTP.
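To give an idea of what plugging in an engine could look like, here's a minimal sketch. It is not the actual FISE API: the ContentItem class, the two-method shape of the interface, the runAll helper and the toy TitleKeywordEngine are all assumptions made up for illustration.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical sketch: none of these types are the actual FISE API.
class EngineSketch {

    /** A piece of content plus the metadata that engines have attached to it. */
    static final class ContentItem {
        final String mimeType;
        final String title;
        final Set<String> metadata = new LinkedHashSet<>();

        ContentItem(String mimeType, String title) {
            this.mimeType = mimeType;
            this.title = title;
        }
    }

    /** Simplified stand-in for the EnhancementEngine plug-in interface (assumed shape). */
    interface EnhancementEngine {
        boolean canEnhance(ContentItem item);
        void enhance(ContentItem item);
    }

    /** Toy text-based engine: tags the item when its title mentions a known word. */
    static final class TitleKeywordEngine implements EnhancementEngine {
        private final List<String> knownObjects = List.of("House", "Car", "Boat");

        @Override
        public boolean canEnhance(ContentItem item) {
            return item.title != null && !item.title.isEmpty();
        }

        @Override
        public void enhance(ContentItem item) {
            String lower = item.title.toLowerCase(Locale.ROOT);
            for (String object : knownObjects) {
                if (lower.contains(object.toLowerCase(Locale.ROOT))) {
                    item.metadata.add("Contains=" + object);
                }
            }
        }
    }

    /** Runs every engine that accepts the item, in the configured order. */
    static void runAll(List<EnhancementEngine> engines, ContentItem item) {
        for (EnhancementEngine engine : engines) {
            if (engine.canEnhance(item)) {
                engine.enhance(item);
            }
        }
    }

    public static void main(String[] args) {
        ContentItem drawing =
                new ContentItem("image/jpeg", "The Big Car in front of Dad's House");
        runAll(List.of(new TitleKeywordEngine()), drawing);
        System.out.println(drawing.metadata);
    }
}
```

The point of the sketch is only that an engine needs to answer two questions: can I do something with this content, and if so, what metadata do I add to it.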
Image analysis scenario
Here's how FISE would help find out that our images are similar:
- A JPEG engine extracts the EXIF metadata from the images if present.
- A text-based entity extraction engine looks at that metadata and, if the images have a good title or description, connects them with some well-known entities. For example, if the image titles are "House in Iceland" and "The Big Car in front of Dad's House", it adds Country=Iceland and Contains=House for the first one, and Contains=House and Contains=Car for the second one.
- A shape-based entity recognition engine adds metadata like Contains=Car and Contains=House for both images.
- A graphical analysis engine adds metadata like Style=Photo for the first and Style=Drawing and Style=Childish for the second image, due to its strong primary colors and ragged lines.
- A similarity search engine integrated in FISE can then find out that both images contain similar objects, so they can be considered similar even though the style of image is very different. You could also search for childish drawings of houses, and then get a link to the nicer photo besides young Bertrand's drawing.
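The last step of the scenario above can be sketched in a few lines, assuming the engines have stored their results as simple Key=Value tags. The tag format, the SimilaritySketch class and the choice of Jaccard overlap as the similarity measure are my assumptions for illustration; FISE does not prescribe any of them.

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch: the tag format and the Jaccard measure are assumptions.
class SimilaritySketch {

    /** Keeps only the values of tags with the given key, e.g. "Contains". */
    static Set<String> valuesOf(Set<String> tags, String key) {
        Set<String> values = new TreeSet<>();
        String prefix = key + "=";
        for (String tag : tags) {
            if (tag.startsWith(prefix)) {
                values.add(tag.substring(prefix.length()));
            }
        }
        return values;
    }

    /** Jaccard similarity of the objects two images contain; style tags are ignored. */
    static double contentSimilarity(Set<String> tagsA, Set<String> tagsB) {
        Set<String> a = valuesOf(tagsA, "Contains");
        Set<String> b = valuesOf(tagsB, "Contains");
        if (a.isEmpty() && b.isEmpty()) {
            return 0.0; // nothing known about either image
        }
        Set<String> intersection = new TreeSet<>(a);
        intersection.retainAll(b);
        Set<String> union = new TreeSet<>(a);
        union.addAll(b);
        return (double) intersection.size() / union.size();
    }

    public static void main(String[] args) {
        Set<String> photo = Set.of(
                "Country=Iceland", "Contains=House", "Contains=Car", "Style=Photo");
        Set<String> drawing = Set.of(
                "Contains=House", "Contains=Car", "Style=Drawing", "Style=Childish");
        System.out.println(contentSimilarity(photo, drawing));
    }
}
```

Because only the Contains tags are compared, the photo and the drawing come out as similar even though their Style tags have nothing in common. A search for childish drawings of houses would filter on Style first and then rank by the same measure.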
The role of FISE is to coordinate the various analyzers, combine and store their results, make them searchable and provide a RESTful interface to all this.
FISE is the integration engine that makes such scenarios possible once analyzers of sufficiently good quality are available. As usual, the whole is greater than the sum of its parts, so being able to combine various such analyzers should lead to very valuable results, even with imperfect analyzers.
Orchestration and intents
What's currently missing in FISE is a way of orchestrating the enhancement engines: for now they only run in a configurable sequence, without real interactions between them.
We'll have to discuss this on the FISE mailing list, but right now I'm thinking that something similar to the Android Intents mechanism, where an engine broadcasts information about what it has found so that other engines can build upon that information, might be well suited to that problem. The orchestrator would start by broadcasting an "analyze incoming content" intent, to which a few engines would respond. The engines in turn broadcast intents like "enhance title and description", "analyze image content" etc. and the orchestrator keeps going, iteratively, until there are no outstanding intents left.
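Here's a rough sketch of that loop, just to show the shape of the idea. Nothing like this exists in FISE today: the IntentEngine interface, the orchestrate method, the use of plain strings as intents and the "link entities" intent are all hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical sketch of the intent idea; nothing here exists in FISE today.
class IntentSketch {

    /** An engine reacts to an intent and may broadcast follow-up intents. */
    interface IntentEngine {
        List<String> handle(String intent);
    }

    /** Processes intents until none are outstanding; returns them in handling order. */
    static List<String> orchestrate(List<IntentEngine> engines, String initialIntent) {
        Deque<String> outstanding = new ArrayDeque<>();
        List<String> processed = new ArrayList<>();
        outstanding.add(initialIntent);
        while (!outstanding.isEmpty()) {
            String intent = outstanding.poll();
            if (processed.contains(intent)) {
                continue; // each distinct intent is handled once, so re-broadcasts cannot loop
            }
            processed.add(intent);
            for (IntentEngine engine : engines) {
                outstanding.addAll(engine.handle(intent));
            }
        }
        return processed;
    }

    public static void main(String[] args) {
        // Responds to the initial intent by asking for two more analyses.
        IntentEngine exif = intent -> intent.equals("analyze incoming content")
                ? List.of("enhance title and description", "analyze image content")
                : List.of();
        // Responds to the text intent with a made-up follow-up intent.
        IntentEngine text = intent -> intent.equals("enhance title and description")
                ? List.of("link entities")
                : List.of();
        System.out.println(orchestrate(List.of(exif, text), "analyze incoming content"));
    }
}
```

Handling each distinct intent only once is a simplification that guarantees the loop terminates. A real orchestrator would probably need richer intents that carry the data an engine has found, not just a name.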
That analysis might take some time, depending on which analyzers are used, but the FISE design allows for asynchronous computing of metadata as well. In some cases, involving humans in parts of the analysis (à la Mechanical Turk) might be the best way to get meaningful results, at least until Kurzweil's Singularity hits us. Asynchronous analysis would then be required, and FISE would have to be able to say "I have some metadata for your content already, and more is supposed to come at some point". This is foreseen in the current FISE design but not yet fully specified nor implemented.
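Since this part is not specified yet, the following is only one possible way to model "some metadata now, more later", using the standard CompletableFuture class. The PendingMetadata class and its method names are hypothetical.

```java
import java.util.Set;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentSkipListSet;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: this aspect of FISE is not specified yet.
class AsyncSketch {

    /** Metadata collected so far, plus a count of analyses still running. */
    static final class PendingMetadata {
        private final Set<String> tags = new ConcurrentSkipListSet<>();
        private final AtomicInteger pending = new AtomicInteger();

        /** Registers an asynchronous analysis; its tags are merged in when it finishes. */
        CompletableFuture<Void> track(CompletableFuture<Set<String>> analysis) {
            pending.incrementAndGet();
            return analysis.handle((result, error) -> {
                if (error == null) {
                    tags.addAll(result);
                }
                pending.decrementAndGet(); // a failed analysis no longer counts as pending
                return null;
            });
        }

        /** Snapshot of what is known right now. */
        Set<String> tagsSoFar() {
            return Set.copyOf(tags);
        }

        /** True while at least one tracked analysis has not finished. */
        boolean moreToCome() {
            return pending.get() > 0;
        }
    }

    public static void main(String[] args) {
        PendingMetadata metadata = new PendingMetadata();
        // A fast automatic engine and a slow human review, both simulated.
        metadata.track(CompletableFuture.completedFuture(Set.of("Contains=House")));
        CompletableFuture<Set<String>> humanReview = new CompletableFuture<>();
        CompletableFuture<Void> merged = metadata.track(humanReview);

        System.out.println(metadata.tagsSoFar() + ", more to come: " + metadata.moreToCome());
        humanReview.complete(Set.of("Style=Childish"));
        merged.join();
        System.out.println(metadata.tagsSoFar() + ", more to come: " + metadata.moreToCome());
    }
}
```

A client could poll something like moreToCome() through the RESTful interface, or be notified when it flips to false. Which of the two fits better is exactly the kind of question that still needs to be settled.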
Coda
I think this image similarity use case is a good way to explain what FISE is about, and will help validate the FISE design.
FISE is on a very good track to making such things possible, while keeping things simple from the CMS integrator's point of view, thanks to its RESTful interface. The design needs some refinements, and we'll most certainly get some good input about that next week at the workshop - looking forward to it!