XML API

XML processing

As we all know, the standard Java DOM implementation by Sun, which has been bundled with the JDK since its early versions, has certain disadvantages:

  • org.w3c.dom.NodeList does not implement any of the common Java collections interfaces, so it cannot be iterated in the usual way.
  • There is no support for traversal algorithms; the only two lookup methods are org.w3c.dom.Document#getElementById() and org.w3c.dom.Document#getElementsByTagName().

These limitations do not stem from bad model design; they are the result of strictly following the W3C specifications. The way out is to use alternative libraries, which compensate for these limitations and provide some extra bonuses.
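To illustrate the two limitations above with the JDK alone, here is a minimal sketch of the glue code one typically ends up writing by hand. The class name `DomHelpers` and its methods `asList`, `collectElementNames` and `parse` are made up for this example and are not part of any library mentioned on this page.

```java
import java.io.StringReader;
import java.util.AbstractList;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, for illustration only.
public class DomHelpers {

    /** Read-only List view over a NodeList, so it can be used in for-each loops. */
    public static List<Node> asList(final NodeList nodes) {
        return new AbstractList<Node>() {
            @Override
            public Node get(int index) {
                Node node = nodes.item(index); // item() returns null when out of range
                if (node == null) {
                    throw new IndexOutOfBoundsException("Index: " + index);
                }
                return node;
            }

            @Override
            public int size() {
                return nodes.getLength();
            }
        };
    }

    /** Hand-written depth-first walk that collects the names of all elements. */
    public static void collectElementNames(Node node, List<String> out) {
        if (node.getNodeType() == Node.ELEMENT_NODE) {
            out.add(node.getNodeName());
        }
        for (Node child : asList(node.getChildNodes())) {
            collectElementNames(child, out);
        }
    }

    /** Parses an XML string into a W3C DOM document using the JDK parser. */
    public static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse("<a><b/><c><d/></c></a>");
        List<String> names = new ArrayList<>();
        collectElementNames(doc.getDocumentElement(), names);
        System.out.println(names); // prints [a, b, c, d]
    }
}
```

The list view is deliberately read-only and backed by the live NodeList, so it reflects later changes to the tree; copy it into an ArrayList first if the tree is modified while iterating.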

^ Framework ^ Xerces / W3C DOM ^ Electric XML (jar) ^ AXIOM 1) (jar) ^ dom4j v2.x ^ dom4j v1.6.1 (jar) ^ jDOM (jar) ^ XOM (jar) ^
| Implements org.w3c.dom interfaces? | :YES: | :YES: | :YES: | :NO: | :NO: | :NO: | :NO: |
| Are Java5 collections enabled? | :NO: | :NO: | :NO: | :YES: | :NO: | :NO: | :NO: |
| Provides interfaces for all model elements? | :YES: | :YES: 2) | :YES: | :YES: | :YES: | :NO: | :NO: |
| Are model elements java.io.Serializable? | :NO: | :YES: | :NO: | :YES: | :YES: | :YES: | :NO: |
| Are model elements java.lang.Cloneable? | :NO: 3) | :YES: | :YES: | :YES: | :YES: | :YES: | :NO: |
| Fluent API? | :NO: | :YES: | :NO: | :YES: | :YES: | :YES: | :NO: |
| Quick serialization to XML | :NO: | :YES: | :YES: | :YES: | :YES: | :NO: | :YES: |
| Quick Element.getFirstChildElement(String name) | :NO: | :YES: | :YES: | :YES: | :YES: | :NO: | :YES: |
| Quick Element.setAttribute(String name, String value) | Element#setAttribute(String, String) | :YES: | :YES: | :YES: | :YES: | :YES: | :YES: |
| Quick Element.setText(String) | Node#setTextContent(String) | :YES: | :YES: | :YES: | :YES: | :YES: | :YES: |
| Memory-efficient processing of big XML files | :NO: | :NO: | :YES: | :YES: 4) | :YES: | :NO: | :YES: 5) |
| Visitor pattern support (DOM tree traversal) | :NO: | :NO: | :NO: 6) | :YES: | :YES: | :NO: | :NO: |
| Building from/to DOM | :DEL: | :DEL: | :NO: / :NO: | DOMReader#read(Document) / DOMWriter#write(Document) | DOMReader#read(Document) / DOMWriter#write(Document) | DOMBuilder#build(Document) / DOMOutputter#output(Document) | DOMConverter#convert(Document) / :NO: |
| Building from/to SAX | :NO: | :NO: | StAXOMBuilder#getDocument() / OMXMLReader | SAXContentHandler#createDocument() / :NO: | SAXContentHandler#createDocument() / :NO: | SAXHandler#getDocument() / SAXOutputter#output(Document) | :NO: 7) / SAXConverter#convert(Document) |
| Building from/to StAX | XMLDOMWriterImpl via XMLOutputFactory#createXMLStreamWriter(Result) / :NO: | :YES: / :NO: 8) | StAXBuilder#StAXBuilder(XMLStreamReader) / OMSerializable#serialize(XMLStreamWriter) | :NO: / STAXEventWriter(XMLEventConsumer) | :NO: / STAXEventWriter(XMLEventConsumer) | :NO: / :NO: | :NO: / :NO: |
| XSLT transformation (TrAX) | DOMSource, DOMResult | :YES: 9) | OMSource, OMResult | DocumentSource, DocumentResult | DocumentSource, DocumentResult | JDOMSource, JDOMResult, XSLTransformer#transform(Document) | XSLTransform#transform(Document) 10) |
| XPath search | :NO: | :YES: 11) | :YES: 12) | :YES: 13) | :YES: 14) | :YES: 15) | :YES: 16) |
| XPath for given node | :NO: | :NO: | :NO: | :YES: | :YES: | :NO: | :NO: |
| XML Schema Data Type support 17) | :NO: | :NO: | :NO: | :YES: | :YES: | :NO: | :NO: |
| XInclude support | :NO: | :NO: | :NO: | :NO: | :NO: | :NO: | :YES: |
| Canonical XML support | :NO: | :NO: | :NO: | :NO: | :NO: | :NO: | :YES: |
| License | GPL | EXML license (OpenSource, copyright) | Apache license | BSD license | BSD license | Apache license | LGPL |
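As a rough illustration of why the table does not count serialization as "quick" for plain W3C DOM, the sketch below writes a DOM tree to a string with the JDK's TrAX classes (DOMSource in, StreamResult out, identity transformation). The class name `DomSerializer` and its methods `toXml` and `sample` are invented for this example; only the javax.xml and org.w3c.dom calls are real JDK API.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

// Hypothetical helper class, for illustration only.
public class DomSerializer {

    /** Serializes a DOM node to a string via an identity TrAX transformation. */
    public static String toXml(Node node) throws Exception {
        // newTransformer() without a stylesheet yields the identity transformation
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(node), new StreamResult(out));
        return out.toString();
    }

    /** Builds a tiny sample document using the two setters listed in the table. */
    public static Document sample() throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .newDocument();
        Element book = doc.createElement("book");
        book.setAttribute("id", "42");          // Element#setAttribute(String, String)
        Element title = doc.createElement("title");
        title.setTextContent("XML in Java");    // Node#setTextContent(String)
        book.appendChild(title);
        doc.appendChild(book);
        return doc;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(toXml(sample()));
    }
}
```

Libraries marked :YES: in the "Quick serialization to XML" row wrap this kind of boilerplate behind a single call on the model element.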


Personal notes:

  • XMLTool:
    :ADD: Provides a true fluent API on top of the JDK's W3C DOM implementation. It gives you a mechanism to create and navigate a DOM tree in a stateful manner.
  • Digester:
    :ADD: A nice lightweight solution that lets you define an XML-to-Java object mapping based on XML pattern rules, which are triggered when a given XML path is recognized.
    :ADD: Designed to target divorced big XML files.
  • Electric XML:
    :DEL: Serialization and parsing of the XML tree are built into the tree model elements. This leads to the following limitations:
    • At the moment one can read from or write to only a String, File, OutputStream or Writer.
    • One can probably write an adapter for pull parsers (e.g. XMLEventReader), but not for push parsers (e.g. ContentHandler).
    • So there is no way to plug in another parser that, for example, builds the XML tree from elements extending the basic ones (e.g. to replace the standard Attribute implementation with a custom one that extends it).
  • AXIOM:
    :ADD: By design, AXIOM does not build the complete XML tree model up front: elements are pulled from the builder only when you request the corresponding information from the model.
  • jDOM:
    :DEL: If choosing between jDOM and XOM – take XOM as it is jDOM's successor.
    :DEL: jDOM has no natural class hierarchy for DOM elements (i.e. one in which Document, Attribute and Element are all a Node).
    :DEL: Have a look at the XOM vs. dom4j notes by Elliotte Rusty Harold (the author of XOM and also a contributor to JDOM).
  • XOM:
    :DEL: The community is so small that nobody takes care of releasing new versions of XOM to Maven.
    :DEL: XOM aims to enforce correctness better than JDOM/dom4j, but the API variety is lower, also because:

Related links:

XML serialization

1) The description of the AXIOM model in this column is incomplete
2) Only W3C DOM interfaces
3) Supports node cloning via Node#cloneNode()
4) Consumer's ElementHandler should detach the node from the tree after processing
5) Custom NodeFactory implementation should return empty node list
6) Traversing can be done by using OMNavigator
7) XOM has a handler, nu.xom.XOMHandler, but it is a non-public class
8) , 9) As Electric XML implements W3C DOM interfaces, any existing adapters can be used, including the ones bundled with the JDK
10) nu.xom.xslt.XOMSource and nu.xom.xslt.XOMResult are non-public classes
11) , 12) , 13) , 14) , 15) , 16) The implementation is built on top of the Jaxen project
17) Framework support for creating model elements that check XML Schema datatypes
programming/java/xml_api.txt · Last modified: 2010/12/02 12:23 by dmitry
 
 