A quick guide to the XML/object mismatch

June 7th, 2005

Complain all you want, but the fact of my life is that I work in both the object and XML worlds. Eventually XML needs to turn into objects and vice versa. This guide explores what the mismatch entails and methods to handle it.

What is the mismatch? I see three fundamental parts:

  1. Transfer: Some XML concepts don’t map to OO concepts (xsi:nil, xs:any, Faults != exceptions, etc.).
  2. Versioning: OO software has no versioning concept. This makes our lives really hard sometimes, for example when we depend on different versions of the same library. XML is much more powerful in this respect: different versions can happily coexist in the same document.
  3. Independence: Our service contract most likely needs to evolve independently of our internal model, even though the two are closely related.

DTOs
The most prominent method of handling XML/object conversions is the Data Transfer Object (DTO) pattern. The idea is that you have a set of objects which are either created from your contract or used to generate your contract. You then write glue code which transfers data from one object domain (the web service) to the other (your internal objects).
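As a rough illustration of that glue code, here is a minimal sketch. All names (CustomerDto, Customer, CustomerMapper) are hypothetical and not taken from any toolkit, and the DTO is written by hand here where a real project would usually generate it from the contract.

```java
// Hypothetical DTO mirroring the service contract: one "fullName" element.
class CustomerDto {
    String fullName;
    String emailAddress;
}

// Hypothetical internal model, free to be shaped differently from the contract.
class Customer {
    private final String firstName;
    private final String lastName;
    private final String email;

    Customer(String firstName, String lastName, String email) {
        this.firstName = firstName;
        this.lastName = lastName;
        this.email = email;
    }

    String getFirstName() { return firstName; }
    String getLastName() { return lastName; }
    String getEmail() { return email; }
}

// The glue code: the only place that knows about both object domains.
class CustomerMapper {
    static Customer toModel(CustomerDto dto) {
        String name = dto.fullName == null ? "" : dto.fullName.trim();
        int space = name.indexOf(' ');
        // Split on the first space; a single word is treated as the first name.
        String first = space < 0 ? name : name.substring(0, space);
        String last = space < 0 ? "" : name.substring(space + 1).trim();
        return new Customer(first, last, dto.emailAddress);
    }

    static CustomerDto toDto(Customer customer) {
        CustomerDto dto = new CustomerDto();
        dto.fullName = (customer.getFirstName() + " " + customer.getLastName()).trim();
        dto.emailAddress = customer.getEmail();
        return dto;
    }
}
```

Because the mapper is the only class that touches both sides, the contract and the internal model can change shape without leaking into each other.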

DTOs have the advantage that they’re easy to work with and are often more familiar to the developer than XML/XSD/WSDL.

I see two types of DTOs in use – unified and versioned.

Unified DTO: With a unified DTO pattern, only one set of DTOs is ever used for all versions of your service. The toolkit is responsible for handling the different XML versions, more or less magically.

Versioned DTO: The versioned DTO pattern takes this a step further and creates a new set of DTO objects for each schema/WSDL version. This makes it easier on the toolkit when handling schema evolution, at the cost of more code to maintain.
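A minimal sketch of the versioned pattern, assuming a hypothetical order contract whose second version adds an optional currency field. The class names and the "USD" default are invented for illustration.

```java
// Hypothetical DTO for version 1 of the contract.
class OrderDtoV1 {
    String customerName;
    int quantity;
}

// Hypothetical DTO for version 2: same fields plus an optional currency.
class OrderDtoV2 {
    String customerName;
    int quantity;
    String currency; // may be null when the element is absent
}

// Single internal model shared by all contract versions.
class Order {
    final String customerName;
    final int quantity;
    final String currency;

    Order(String customerName, int quantity, String currency) {
        this.customerName = customerName;
        this.quantity = quantity;
        this.currency = currency;
    }
}

// One piece of glue code per contract version: this is the extra
// maintenance cost that the versioned pattern brings.
class OrderMappers {
    static final String DEFAULT_CURRENCY = "USD"; // assumed default for v1 clients

    static Order fromV1(OrderDtoV1 dto) {
        return new Order(dto.customerName, dto.quantity, DEFAULT_CURRENCY);
    }

    static Order fromV2(OrderDtoV2 dto) {
        String currency = dto.currency != null ? dto.currency : DEFAULT_CURRENCY;
        return new Order(dto.customerName, dto.quantity, currency);
    }
}
```

With a unified DTO there would be only one DTO class, and the decision about the missing currency would have to live inside the toolkit or its configuration instead.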

DTOs suffer from problem #1 (transfer), although the concept mismatch can be minimal depending on how much of XML Schema you’re using (e.g. don’t allow nillable schema types on Java/C# primitives, because an int can’t be null!). Toolkits such as XMLBeans, which handle 99.99% of XML Schema, can be helpful in this area.
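To make the nillable example concrete, here is a small sketch using only the JDK’s DOM parser. The NillableReader class and the age element are hypothetical; the point is only that the Java side needs the Integer wrapper type, since a primitive int has no way to represent xsi:nil.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical helper: reads a document whose root element holds a nillable integer.
class NillableReader {
    static final String XSI = "http://www.w3.org/2001/XMLSchema-instance";

    // Returns null when the root carries xsi:nil="true" (or "1"),
    // otherwise the parsed integer. The return type must be Integer, not int.
    static Integer readNillableInt(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // needed to resolve the xsi prefix
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            Element root = doc.getDocumentElement();
            String nil = root.getAttributeNS(XSI, "nil"); // "" when absent
            if ("true".equals(nil) || "1".equals(nil)) {
                return null;
            }
            return Integer.valueOf(root.getTextContent().trim());
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not read nillable int", e);
        }
    }
}
```

For example, passing `<age xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:nil="true"/>` yields null, while `<age>42</age>` yields 42.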

Contract First Development
Some proponents argue that writing our contract first (WSDL and XSD) is necessary for XML nirvana. Usually in this case DTOs are generated from the schema; however, you may also just work with the documents directly (see below).

When combined with DTOs this is a double-edged sword with respect to XML/object transfer. It helps by ensuring that you use the XML concepts which are best suited to the problem – not the concepts your toolkit happens to support. On the other hand, it may create issues for people consuming your service because you may inadvertently use concepts which don’t work well with other toolkits. This is still an issue today, although it should become less of a problem as toolkits such as JAXB 2.0 and .NET 2.0 increase their schema support.

Contract first also ensures that your internal model develops independently and gives you a lot of freedom over schema versioning.

Code First
Because of the issues associated with matching XML concepts to OO concepts, some people have called for code first development. The toolkit is then responsible for making sure the code maps to XML concepts nicely. It also has the added convenience of being much easier for the OO developer who doesn’t know XML Schema very well.

Sidetrack: DTOs and Annotations
When using annotations to control your object/XML conversion: beware. You are limited to the versioned DTO pattern only. Toolkits such as JAXB 2.0 and .NET will really only allow you to map one object to one XML schema.

Working with Documents
Working with documents can be another supposed nirvana of XML. DOM, XQuery, XPath, XSLT, and other XML technologies can be very helpful in processing your documents. I find that this is mostly helpful when your desired output is also XML (e.g. for an XML database). They certainly help with versioning and independence because you have very fine-grained control over what happens.

However, this doesn’t help you all that much with the transfer problem. In the end you may just be reduced to writing the glue code by hand.
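As a sketch of that fine-grained control, the following uses the JDK’s built-in XPath API to read the same value from two versions of a document. The order document and its two layouts are hypothetical: version 1 is assumed to carry the customer as an attribute, version 2 as a nested element.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

// Hypothetical hand-written glue code for a document-centric service.
class OrderDocuments {
    // Union of both locations: whichever node exists supplies the value.
    // Evaluating a node-set as a string yields the first node's text.
    private static final String CUSTOMER_NAME =
            "/order/@customer | /order/customer/name";

    static String customerName(String xml) {
        XPath xpath = XPathFactory.newInstance().newXPath();
        try {
            return xpath.evaluate(CUSTOMER_NAME, new InputSource(new StringReader(xml)));
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Could not read customer name", e);
        }
    }
}
```

Both versions are handled by one expression, which is the versioning benefit. The result is still just a string that has to be copied into an object by hand, which is the transfer problem.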

Other solutions

  • Don’t use objects: If you believe this to be true, this guide isn’t for you. Sometimes this is right, but a lot of the world is still trapped in OO programming.
  • Don’t use XML: Once again, if you believe this, get out of here. XML isn’t right for all cases, but this guide is for the cases where it is right.
  • XPath/Object mapping: You could use XPath expressions to match up XML elements to objects and their properties. I have yet to see a really robust implementation of this.
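To show what the XPath/Object mapping idea might look like, here is a deliberately naive sketch built on the JDK’s XPath API and reflection. XPathMapper and Person are hypothetical names, only String fields are supported, and this is nowhere near the robust implementation mentioned above.

```java
import java.io.StringReader;
import java.lang.reflect.Constructor;
import java.lang.reflect.Field;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical target object with String properties only.
class Person {
    String name;
    String city;
}

// Hypothetical mapper: each field name is paired with an XPath expression.
class XPathMapper {
    private final Map<String, String> fieldToXPath = new LinkedHashMap<>();

    XPathMapper map(String fieldName, String xpathExpression) {
        fieldToXPath.put(fieldName, xpathExpression);
        return this;
    }

    // Parses the XML, creates a new instance of the type, and fills each
    // mapped field with the string value of its XPath expression.
    <T> T read(String xml, Class<T> type) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            Constructor<T> constructor = type.getDeclaredConstructor();
            constructor.setAccessible(true);
            T target = constructor.newInstance();
            for (Map.Entry<String, String> entry : fieldToXPath.entrySet()) {
                Field field = type.getDeclaredField(entry.getKey());
                field.setAccessible(true);
                field.set(target, xpath.evaluate(entry.getValue(), doc));
            }
            return target;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not map document", e);
        }
    }
}
```

Usage would look something like `new XPathMapper().map("name", "/person/fullName").map("city", "/person/address/city").read(xml, Person.class)`. Supporting a new document version then means supplying a different set of expressions, without touching the Person class.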

2 Responses to “A quick guide to the XML/object mismatch”

  1. David Zhao Says:

    I could try liquid xml.
    It is good for generating classes from XML schema (Java, C++, C#, VB)

  2. ag Says:

    I too live in between the XML and OO worlds. The JiBX XML-OO mapping library makes doing so bearable. XBIS is pretty handy too.

    http://jibx.sourceforge.net/
    http://xbis.sourceforge.net/