Access and Feeds

Universal Access to Data Repositories

By Dick Weisinger

Universal access to any structured repository is a compelling concept, one that truly warrants the label ‘disruptive technology’. It is still a dream, but it has become the goal behind many new technologies now operating under the name of Enterprise Information Integration (EII). Most EII approaches share one aim: a single unified interface that provides virtual access to many different data stores and repositories, whose implementations may span vastly different data structures.

The same trend towards creating a virtual interface spanning heterogeneous systems has also been happening in the world of enterprise content management (ECM). Over the last twenty years many companies have taken a stab at the content management space, each creating products with a different subset of the ECM feature list and each with competing architectural and data models. This is to be expected.

Formtek is one of these many vendors. We’ve concentrated on our customers and created quality solutions for the niche industries they work in. Like many vendors, we feel that the unique solutions we’ve created address very specialized customer requirements.

But ECM concepts have evolved over time, and many have matured. Some areas continue to evolve today, especially search and compliance. Because of this ongoing evolution, there never was an industry-wide shared vision for ECM with a detailed blueprint for how to achieve it. This has been healthy: it lets products and ideas compete, leading to solutions that address real-world problems.

The result of this proliferation of systems is that large enterprises are finding that their data is being managed by multiple systems.  Creating and providing access to enterprise-wide Business Intelligence is hampered by the disparate systems.  Collaborative and Convergence technologies are attempting more and more to create consolidated views of all information across the enterprise.

In the world of Java, one standard has emerged that specifies a common method for accessing content repositories across multiple systems. It is JSR-170, also known as the Content Repository API for Java, or JCR. Day Software has been the main proponent behind hammering out this standard. Day’s visionaries Roy Fielding and David Nuescheler (spec lead) have been authors and promoters of the new approach towards standardizing access across repositories. They have made a great effort to be inclusive by bringing more than 60 vendors into the standardization process, including IBM, Oracle, Stellent and FileNet. Day clearly outlines the benefits of a standard approach towards accessing content, and with broad support, this could be very disruptive.
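
As a rough illustration of what a common access method buys a client application, the sketch below defines one small interface and two connectors that hide very different storage layouts behind it. Everything in it (`ContentStore`, `MapBackedStore`, `IdIndexedStore`, `UnifiedReader`, the paths and the sample content) is hypothetical and invented for this example. It is not the `javax.jcr` API that JSR-170 defines, only a minimal sketch of the connector idea under those assumptions.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical common contract (NOT the javax.jcr API): the one operation
// every connector must support, whatever the backing system looks like.
interface ContentStore {
    Optional<String> read(String path);
}

// Hypothetical connector for a system that stores documents keyed by path.
class MapBackedStore implements ContentStore {
    private final Map<String, String> docsByPath;

    MapBackedStore(Map<String, String> docsByPath) {
        this.docsByPath = new HashMap<>(docsByPath);
    }

    @Override
    public Optional<String> read(String path) {
        return Optional.ofNullable(docsByPath.get(path));
    }
}

// Hypothetical connector for a system that stores documents by numeric ID
// and keeps a separate path-to-ID index.
class IdIndexedStore implements ContentStore {
    private final Map<String, Integer> idsByPath;
    private final Map<Integer, String> docsById;

    IdIndexedStore(Map<String, Integer> idsByPath, Map<Integer, String> docsById) {
        this.idsByPath = new HashMap<>(idsByPath);
        this.docsById = new HashMap<>(docsById);
    }

    @Override
    public Optional<String> read(String path) {
        Integer id = idsByPath.get(path);
        if (id == null) {
            return Optional.empty();
        }
        return Optional.ofNullable(docsById.get(id));
    }
}

// Client code depends only on ContentStore, never on a concrete system.
class UnifiedReader {
    private final List<ContentStore> stores;

    UnifiedReader(List<ContentStore> stores) {
        this.stores = new ArrayList<>(stores);
    }

    // Returns the first match, asking each store in registration order.
    Optional<String> find(String path) {
        for (ContentStore store : stores) {
            Optional<String> hit = store.read(path);
            if (hit.isPresent()) {
                return hit;
            }
        }
        return Optional.empty();
    }
}

class ConnectorDemo {
    // Builds a reader over two stores with made-up sample content.
    static UnifiedReader sampleReader() {
        Map<String, String> drawings = new HashMap<>();
        drawings.put("/drawings/pump.dwg", "pump drawing");

        Map<String, Integer> ids = new HashMap<>();
        ids.put("/contracts/acme.doc", 42);
        Map<Integer, String> bodies = new HashMap<>();
        bodies.put(42, "acme contract");

        List<ContentStore> stores = new ArrayList<>();
        stores.add(new MapBackedStore(drawings));
        stores.add(new IdIndexedStore(ids, bodies));
        return new UnifiedReader(stores);
    }

    public static void main(String[] args) {
        UnifiedReader reader = sampleReader();
        System.out.println(reader.find("/drawings/pump.dwg").orElse("not found"));
        System.out.println(reader.find("/contracts/acme.doc").orElse("not found"));
        System.out.println(reader.find("/missing").orElse("not found"));
    }
}
```

The caller in `main` never learns which system answered. That is the property a standard such as JSR-170 aims for, at the cost of each vendor having to write and maintain a connector.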

Day has also been leading a reference implementation of JSR-170 called the Jackrabbit project, which is hosted as an Apache incubator project.

The Jackrabbit project, perhaps for portability reasons, is a non-database implementation of the JSR-170 spec. It serves its purpose well as a reference implementation, but some question whether it can be ‘industrial strength’ enough for enterprise applications, given that database vendors have spent years wrestling with problems related to high-volume transactions and scalability.

Other factors that have impeded acceptance of JSR-170 to date include the lack of major commercial vendors offering JSR-170 connectors for their repositories, the fact that the specification was framed within Java, the difficulty of creating a JSR-170 interface for an existing product, and the implications of being a lowest-common-denominator technology.

Undoubtedly many ECM vendors view JSR-170 as a threat to their installed base and have been reluctant to move forward with the technology. Day has helped to accelerate the process by creating connectors for two major vendors, Documentum/EMC and FileNet. Day has also pledged to create JSR-170 connectors for OpenText LiveLink, Microsoft SharePoint, IBM Domino.doc, Software AG Tamino, and Interwoven.

The availability of connectors to existing ECM systems will be a major factor in determining whether the technology can be useful for customers. Maybe a decision by Day (and others) to open-source a collection of JSR-170 connectors for popular ECM vendor systems would truly accelerate the disruptive power of this technology, but this does not yet seem to be their business strategy.

Then there is the choice of Java. The ‘J’ of JSR-170 is ‘Java’. The overall concept of JSR-170 need not be limited to Java, but that is how it was framed, and this may limit any kind of cross-over acceptance. Over the last decade Java has had many successes and can clearly be called an enterprise-level technology. Most of the major ECM vendors are heavily, if not predominantly, Java based in their implementations.

But there are competitors to Java. Java is most commonly compared against the .Net equivalent language, C#. Java has in fact reached a phase where it is being questioned. Some well-known Java proponents have even recently defected to newer technologies, like Ruby. Java is criticized for poor performance, for being too complex and for being just too ‘heavy’. Especially in the area of web development, Java-based solutions have not been very nimble, leading to an explosion of Java-based ‘frameworks’ that try to fill the shortcomings.

Because of the wide range of web technologies available for thin-client development, JSR-170, a technology framed within the Java community, may not seem to be very compelling for Web Content Management systems that aren’t Java-based.  One open-source attempt to create a PHP equivalent implementation of JSR-170 does not seem to have gotten much traction.

Then there is the difficulty of implementing the technology. Implementation may not be ‘difficult’, but it is certainly not trivial. And as with all specifications, it often takes a lot of time to fully understand the specification document and to ensure true compliance with it. More template connector code and more examples might help seed the vendor community and motivate vendors to create and contribute connectors.

Finally, there is the issue that the technology is lowest-common-denominator.  If you want to build an application on top of XYZ vendor’s technology, you’d be able to access a much richer level of functionality by using that vendor’s API rather than using the somewhat limited capabilities of JSR-170.
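
The trade-off can be sketched in a few lines of Java. In this hypothetical example, `PortableRepository` stands for whatever a common standard exposes and `VendorRepository` for a vendor’s own richer API. Both names, their methods and the sample data are assumptions made for illustration. Version lookup is used only as a stand-in for any vendor-specific feature and is not a claim about what JSR-170 does or does not cover.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical portable contract: only what every repository can offer.
interface PortableRepository {
    Optional<String> latest(String path);
}

// Hypothetical vendor API: implements the portable contract and adds
// a feature that the portable contract does not expose.
class VendorRepository implements PortableRepository {
    private final Map<String, List<String>> history = new HashMap<>();

    // Appends a new version of the document stored at the given path.
    void save(String path, String content) {
        history.computeIfAbsent(path, key -> new ArrayList<>()).add(content);
    }

    @Override
    public Optional<String> latest(String path) {
        List<String> versions = history.get(path);
        if (versions == null || versions.isEmpty()) {
            return Optional.empty();
        }
        return Optional.of(versions.get(versions.size() - 1));
    }

    // Vendor-only feature: fetch a specific version, numbered from 1.
    Optional<String> version(String path, int number) {
        List<String> versions = history.get(path);
        if (versions == null || number < 1 || number > versions.size()) {
            return Optional.empty();
        }
        return Optional.of(versions.get(number - 1));
    }
}

class LowestCommonDenominatorDemo {
    public static void main(String[] args) {
        VendorRepository vendor = new VendorRepository();
        vendor.save("/contracts/acme.doc", "draft");
        vendor.save("/contracts/acme.doc", "final");

        // Code written against the portable contract sees only latest().
        PortableRepository portable = vendor;
        System.out.println(portable.latest("/contracts/acme.doc").orElse("none"));

        // portable.version(...) would not compile; the richer call is
        // reachable only through the vendor's own type.
        System.out.println(vendor.version("/contracts/acme.doc", 1).orElse("none"));
    }
}
```

An application that sticks to `PortableRepository` stays portable across vendors but gives up `version`. An application that calls `version` gets the richer feature and is tied to that vendor.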

Formtek has great interest in JSR-170.  The capabilities of JSR-170 could lead to some very interesting applications.  It has great potential, but it still seems to lack momentum, and needs to become more widespread to become very useful.  Day’s continuing contributions to the list of available connectors may change that.

The jury is out on that, but JSR-170 and Jackrabbit have certainly generated a lot of interest.

The approach that Formtek has taken towards EII has been SOA and Web Services. We believe the momentum behind SOA technology is there today and that, at least in the short term, this is the best approach.
