Archive for January, 2008

JCR and AtomPub

Tuesday, January 15th, 2008

Now that Galaxy is announced, I can unleash my fury of blogging again. Or, as we all know will probably happen, I can blog just slightly less sporadically again.

One of the recent posts I’ve wanted to comment on is Atom is the New JCR (prompted by Sam Ruby). Adrian Sutton writes:

 

When the Java Content Repository (JCR) standard first came out it was supposed to bring in a new era of compatibility between content repositories and put an end to the content silo. There was, and still is, a lot of talk about it and just about everyone added JCR compliance to their marketing materials. Unfortunately, that’s mostly where things stopped – the implementation work that followed was generally done was buggy or incomplete and the only viable JCR implementations that I’ve seen have come out of Day Software, who lead the JCR spec effort.

We use JCR inside Galaxy for all our storage needs at the moment. Using it has been an interesting experience, to say the least. One of my favorite “features” of JCR has been the fact that you cannot have two separate threads create a child node with the same name at the same time. This means that in our activity log, we can’t add two nodes called “/activity” to “/activities” at the same time.

The main reason we decided to go with JCR is that it supports both simple database-type data and file-type data transparently via the same interface. I’m not 100% sure it was the right choice though. I had to write an ORM-like framework, as the one in Jackrabbit seemed pretty immature and I didn’t have time to delve into the rabbit hole of the jcr-mapping module in Jackrabbit. Mine is limited, but at least I could figure out how to use it. We’ll be reevaluating this for the future though, now that Jackrabbit 1.4 is nearly out.

Adrian continues:

Then along came Atom which is all about remote access and manipulation of data and missing probably 90% of the functionality that JCR offers. It really isn’t a competitor to JCR at all and yet it’s doing more to break down content silos than JCR ever has. Atom support isn’t just being added to the marketing materials, it’s actually shipping and is usable in a lot of places – IBM’s Lotus Connections has Atom APIs to everything and, as best I can tell, only Atom APIs to it’s repository.

I completely agree that JCR isn’t very worthwhile as something that will break down content silos. It does have value as an API to work with data though. Atom is quite limited in the granularity at which it can work with data (which coincidentally is one of the reasons Web3S exists as well). And you still need to store your data somewhere.

Atom has not replaced JCR; it has supplemented it. We use JCR as a content store for Galaxy. I’ve also written a generic JCR content store for AtomPub inside of Abdera. Sure, the vendor promises about content silos are probably wrong, but JCR sure simplifies some things when writing applications – which is what really counts!

Update: David Nuescheler has written a great follow-up to this whole discussion.

Also, Jackrabbit 1.4 is out.

Don’t drink while reading

Tuesday, January 15th, 2008

Don Box writes:

And as for D being so much better than C or C++, I would expect D to overtake C adoption right after WS-Transfer replaces HTTP on the public internet :-)

I couldn’t swallow the water I was drinking and had to leave before I ruined my keyboard (again) as I was laughing quite hard.

Mule Galaxy Governance Platform

Monday, January 14th, 2008

Today we’re announcing, among other things, the first release of Mule Galaxy, which is our “SOA governance platform with integrated registry/repository.” (whew – that’s a mouthful)

We’re exploring what it means to bring open source tools and governance together. Governance has typically been a game for those with lots of cash at hand, with registries typically costing hundreds of thousands to a million or more dollars. Open source breaks down that barrier and makes it much more approachable to everyone. And there are a lot of reasons it should be approachable, as governance can be very valuable. For instance, governance tools can help you:

  • Gain visibility into services across your organization and collaborate on them
  • Manage an artifact or service’s lifecycle more effectively
  • Gain visibility into and control over who is consuming your services, schemas, etc.
  • Enforce best practice policies across your services like you would across your build with Checkstyle or PMD.
  • Manage the security of your services more effectively through policies
  • Easily manage configurations, schemas, WSDLs, etc. for deployed clusters of machines

Yeah, but what is it?

At Galaxy’s core is an artifact and metadata repository which can store whatever you want in it. Mule configurations, WSDLs, schemas, WS-Policy documents, etc. are all recognized by Galaxy out of the box, and it can store unrecognized artifacts like zips or jars too (although soon these will be recognized). Galaxy allows you to manage your artifacts by versioning them and organizing them into workspaces.

The magic starts to happen with the layers on top though.

Indexing: Galaxy includes the ability to index artifacts and extract metadata from them for easy searching. For instance, I can search for Mule configurations with a mule service, for a WSDL with a particular portType, for policies which should be applied globally, or for an XML schema which has element Foo inside it.

You can also build your own indexes with XPath or XQuery expressions which is quite handy.
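
To give a feel for the kind of value an XPath-based index pulls out of an artifact, here is a minimal Python sketch. It is not Galaxy's index configuration or API: the sample WSDL, the index_values helper and the wsdl prefix are all made up for illustration, and Python's ElementTree only understands a subset of XPath.

```python
import xml.etree.ElementTree as ET

# A tiny, made-up WSDL 1.1 document standing in for a stored artifact.
WSDL = """<definitions xmlns="http://schemas.xmlsoap.org/wsdl/" name="HelloWorld">
  <portType name="HelloWorldPortType"/>
</definitions>"""

# Prefix-to-namespace mapping used when evaluating the expression.
NS = {"wsdl": "http://schemas.xmlsoap.org/wsdl/"}


def index_values(document, xpath, attribute):
    """Hypothetical helper: collect one attribute from every element the XPath matches."""
    root = ET.fromstring(document)
    matches = root.findall(xpath, NS)
    return [el.get(attribute) for el in matches if el.get(attribute) is not None]


# "Index" the portType names so they could later be searched on.
port_types = index_values(WSDL, "./wsdl:portType", "name")
print(port_types)
```

The extracted names are the sort of value a query such as wsdl.portType = 'HelloWorldPortType' would then match against.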

Metadata: You can associate any type of metadata that you want with an artifact and search on it.

Galaxy query language: Galaxy has an SQL-like syntax for querying artifacts based on their metadata and content. For instance you could do:

select artifact where mule.descriptor = 'HelloWorldUMO'


select artifact where wsdl.portType = 'HelloWorldPortType'


select artifact from '/Policies/Global' where documentType = {http://www.w3.org/2006/07/ws-policy}Policy

Policy Enforcement: The ability to enforce policies across artifacts in the repository. For instance, included is support for policies which:

  • Enforce WS-I BasicProfile compliance on your WSDLs
  • Require that Mule configurations only use SSL encrypted endpoints
  • Enforce WSDL backward compatibility
  • And of course the ability to extend it and write your own.

Dependency management: See who is consuming your artifacts so that you may make informed decisions about the future development of them.

Lifecycle management: Galaxy provides some workflow-like features so that you can take appropriate actions at various points in an artifact/service’s lifecycle.

Integration: Included is integration with a number of open source products including Mule, Apache CXF and Spring. With Mule you can discover and find your necessary configurations – all you need is a URL which queries Galaxy to boot Mule. With CXF you can build a set of runtime policies and apply them to your services. We’re also working on building out a number of other plugins including one for Microsoft’s WCF framework which will be released soon.

AtomPub API: So if you were all wondering what got me started on AtomPub, this is it. We represent workspaces, artifact version histories and comments as AtomPub collections. Artifacts can be queried with AtomPub using the above query language (http://host/api/registry?q=select artifact ….).
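
Since a query contains spaces, quotes and an equals sign, it has to be URL-encoded before it goes into the q parameter. A minimal Python sketch of that step follows; the registry_query_url helper is hypothetical, "host" is the same placeholder as in the URL above, and anything else the API might require (authentication, extra parameters) is not covered here.

```python
from urllib.parse import parse_qs, quote, urlencode, urlsplit


def registry_query_url(base_url, query):
    """Hypothetical helper: attach a Galaxy query as the percent-encoded 'q' parameter."""
    return base_url + "?" + urlencode({"q": query}, quote_via=quote)


query = "select artifact where wsdl.portType = 'HelloWorldPortType'"
url = registry_query_url("http://host/api/registry", query)
print(url)

# Decoding the query string again should give back the original query text.
decoded = parse_qs(urlsplit(url).query)["q"][0]
print(decoded)
```

The response to a GET on such a URL would then be an Atom feed of matching artifacts, which any AtomPub client can consume.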

To read more about how we represent artifacts and their metadata using AtomPub, check out this page.

And of course we use Apache Abdera :-) More on this topic later as I have a lot of thoughts about the strengths and limitations of AtomPub in this scenario.

Feedback

We have a number of other cool features planned for the near future as well. But just as importantly we need your feedback. Please check it out and let us know what you think. What would convince you to deploy this inside your organization? What governance features are you looking for (if any)?

For more information see: