karniv0re's Journal: My Journey to Find a Java-based XML Diff Util

Journal by karniv0re

I hope to save some headaches for other people by going through my own
here.

I currently have the need to determine the difference between two XML
documents. Not a line-by-line unix-style diff utility, but a nice
DOM/Tree/XML document as a result.

So let's start, shall we?

First up, VMTools seemed promising. It looked to do what I needed it to
do. But after importing it into my Maven repository and running a test
case with their sample code, it became all too clear that it was built
against a very outdated JDOM:

java.lang.NoSuchMethodError: org.jdom.Element.addContent(Lorg/jdom/Element;)Lorg/jdom/Element;
        at org.vmguys.vmtools.ota.OtaUpdate.generateUpdate(OtaUpdate.java:319)
        at org.vmguys.vmtools.ota.OtaUpdate.generateDiffs(OtaUpdate.java:226)
        at org.vmguys.vmtools.ota.OtaUpdate.generateDiffs(OtaUpdate.java:181)
        at com.uprr.streamline.shipment.mgmt.core.notificationMessage.carrier.XmlDiffTest.testDiff(XmlDiffTest.java:92) ...
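
If you hit the same kind of NoSuchMethodError, one quick sanity check is to ask the JVM which jar a class was actually loaded from. Below is a minimal sketch using only the JDK; the WhichJar class and its locate method are hypothetical names made up for illustration, and the message strings are arbitrary. Running `mvn dependency:tree` should show the same information from the build side.

```java
import java.security.CodeSource;

// Hypothetical helper: reports where a class on the classpath came from.
class WhichJar {
    static String locate(String className) {
        try {
            Class<?> c = Class.forName(className);
            CodeSource src = c.getProtectionDomain().getCodeSource();
            // Classes loaded by the bootstrap loader have no code source.
            return src == null
                    ? "(bootstrap class, no code source)"
                    : src.getLocation().toString();
        } catch (ClassNotFoundException e) {
            return "(not on classpath)";
        }
    }

    public static void main(String[] args) {
        // org.jdom.Element is the class named in the stack trace above.
        System.out.println(locate("org.jdom.Element"));
    }
}
```

If the location printed is a newer JDOM jar than the one the library was compiled against, that would explain the missing method.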

Next!

Next up is X-Diff, which, judging by the website, looks like it was
written for an early-2000s-era master's thesis.
Oh, nvm, it was.

Let's hope it works better than it looks.

Opening up the zip file, we run into... a makefile?! What is this,
amateur hour? The entire package consists of 5 files: XDiff.java,
XHash.java, XLut.java, XParser.java, and XTree.java.

After dicking with this one for way too long, I'm giving up. It's clear
this dude used Vim or Notepad++ at the very most to develop this,
because it is very amateurish, though credit where it's due, I'm sure
the implementation is entirely scientifically correct. Just looks like
shit and probably won't integrate well with my app. Saving it as a last
resort.

Next, I tried XMLUnit. This looked promising, and I was able to get it
from the Maven Central Repo, which was nice. Upon trying the diff
option, I got this as a result:

  differences: [Expected number of child nodes '11' but was '15' - comparing at /root[1] to at /root[1], Expected sequence of child nodes '3' but was '5' - comparing at /root[1]/whatsInTheBucketMan[1] to at /root[1]/whatsInTheBucketMan[1], Expected text value '

        ' but was '
        ' - comparing

          at /root[1]/text()[3] to
          at /root[1]/text()[3], Expected sequence of child nodes '5' but was '7' - comparing at /root[1]/whatsInTheBucketMan[2] to at /root[1]/whatsInTheBucketMan[2], Expected text value '

        ' but was '

        ' - comparing

          at /root[1]/text()[4] to

          at /root[1]/text()[4], Expected sequence of child nodes '7' but was '9' - comparing at /root[1]/whatsInTheBucketMan[3] to at /root[1]/whatsInTheBucketMan[3], Expected number of child nodes '7' but was '3' - comparing at /root[1]/drink[1] to at /root[1]/drink[1], Expected attribute value '1' but was '4' - comparing at /root[1]/drink[1]/bucket[1]/@value to at /root[1]/drink[1]/bucket[1]/@value, Expected text value '
                ' but was '
        ' - comparing
                  at /root[1]/drink[1]/text()[2] to
          at /root[1]/drink[1]/text()[2], Expected attribute value '2' but was '4' - comparing at /root[1]/drink[1]/bucket[2]/@value to at /root[1]/drink[1]/bucket[1]/@value, Expected sequence of child nodes '3' but was '1' - comparing at /root[1]/drink[1]/bucket[2] to at /root[1]/drink[1]/bucket[1], Expected text value '
                ' but was '
        ' - comparing
                  at /root[1]/drink[1]/text()[3] to
          at /root[1]/drink[1]/text()[2], Expected sequence of child nodes '4' but was '2' - comparing
                  at /root[1]/drink[1]/text()[3] to
          at /root[1]/drink[1]/text()[2], Expected attribute value '3' but was '4' - comparing at /root[1]/drink[1]/bucket[3]/@value to at /root[1]/drink[1]/bucket[1]/@value, Expected sequence of child nodes '5' but was '1' - comparing at /root[1]/drink[1]/bucket[3] to at /root[1]/drink[1]/bucket[1], Expected sequence of child nodes '6' but was '2' - comparing
          at /root[1]/drink[1]/text()[4] to
          at /root[1]/drink[1]/text()[2], Expected sequence of child nodes '9' but was '13' - comparing at /root[1]/drink[1] to at /root[1]/drink[1], Expected text value '
' but was '

        ' - comparing
  at /root[1]/text()[6] to ...

Umm, cool story, bro. Unfortunately, that is not the format I need it
in. Because, at this point, I'm back to parsing. I just want a list of
nodes, or even a tree would work. Nodes added, nodes removed, and nodes
changed.

Still, this one has potential, but would require me to take the code
into my own hands, much like X-Diff.
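
To make "nodes added, nodes removed, and nodes changed" concrete, here is a rough sketch of that output shape using nothing but the JDK's DOM parser. It is not how XMLUnit, X-Diff, or any other tool mentioned here works: it is a naive positional comparison that keys each element by a path like /root[1]/drink[1]/bucket[2], ignores whitespace-only text, and has no notion of moved nodes. The NaiveXmlDiff class and everything in it are hypothetical names invented for this example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of any library discussed in this post.
class NaiveXmlDiff {
    final List<String> added = new ArrayList<>();
    final List<String> removed = new ArrayList<>();
    final List<String> changed = new ArrayList<>();

    static NaiveXmlDiff diff(String oldXml, String newXml) {
        Map<String, String> before = index(oldXml);
        Map<String, String> after = index(newXml);
        NaiveXmlDiff result = new NaiveXmlDiff();
        for (Map.Entry<String, String> e : before.entrySet()) {
            String other = after.get(e.getKey());
            if (other == null) {
                result.removed.add(e.getKey());   // path only exists in the old doc
            } else if (!other.equals(e.getValue())) {
                result.changed.add(e.getKey());   // same path, different attrs/text
            }
        }
        for (String path : after.keySet()) {
            if (!before.containsKey(path)) result.added.add(path);
        }
        return result;
    }

    // Maps each element's positional path to a signature of its attributes and text.
    static Map<String, String> index(String xml) {
        try {
            Element root = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml))).getDocumentElement();
            Map<String, String> out = new LinkedHashMap<>();
            walk(root, "/" + root.getNodeName() + "[1]", out);
            return out;
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    static void walk(Element el, String path, Map<String, String> out) {
        Map<String, String> attrs = new TreeMap<>();   // sorted, so attribute order is irrelevant
        NamedNodeMap raw = el.getAttributes();
        for (int i = 0; i < raw.getLength(); i++) {
            attrs.put(raw.item(i).getNodeName(), raw.item(i).getNodeValue());
        }
        StringBuilder text = new StringBuilder();
        Map<String, Integer> seen = new HashMap<>();
        Map<String, Element> children = new LinkedHashMap<>();
        for (Node c = el.getFirstChild(); c != null; c = c.getNextSibling()) {
            if (c.getNodeType() == Node.TEXT_NODE || c.getNodeType() == Node.CDATA_SECTION_NODE) {
                text.append(c.getNodeValue().trim());  // drops indentation-only text nodes
            } else if (c.getNodeType() == Node.ELEMENT_NODE) {
                int n = seen.merge(c.getNodeName(), 1, Integer::sum);
                children.put(path + "/" + c.getNodeName() + "[" + n + "]", (Element) c);
            }
        }
        out.put(path, attrs + "|" + text);
        for (Map.Entry<String, Element> child : children.entrySet()) {
            walk(child.getValue(), child.getKey(), out);
        }
    }

    public static void main(String[] args) {
        String oldXml = "<root><drink><bucket value=\"1\"/><bucket value=\"2\"/></drink></root>";
        String newXml = "<root>\n  <drink>\n    <bucket value=\"4\"/>\n  </drink>\n  <snack/>\n</root>";
        NaiveXmlDiff d = diff(oldXml, newXml);
        System.out.println("added:   " + d.added);
        System.out.println("removed: " + d.removed);
        System.out.println("changed: " + d.changed);
    }
}
```

Because matching is purely positional, inserting one element at the front of a list shows up as a string of "changed" entries plus one "added" at the end, which is exactly the kind of noise a real tree-diff algorithm is meant to avoid. It is only meant to show the three-list shape of the result.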

3DM is a 3-way XML merging and
differencing tool. I'm not sure why one would need to merge things 3
ways. Maybe in the case of revision control and patching. But ok,
anyway, that's not what I'm after. Does it do a diff? It does do a tree
diff, much like I would want, but it seems to be pretty custom. They do
note that there is another diff tool to try: fc-xmldiff.

fc-xmldiff - man, developers are
really not good at marketing. I'm just going to call it "FUCXML" (read:
"fuck xml"). After just briefly looking at this, it is also going to
take work. ARGH, I JUST WANT SOMETHING OUT OF THE BOX.

Next, I will look at DiffX. This
seems to be a fairly robust project, but I haven't dug enough into it to
see for myself. I am putting this on hold for now, but will probably
post an update when I come up with something. More than likely, I will
end up rolling my own using one of these projects.

Le sigh.
