
Tim Bray on the Birth of XML, 10 Years Later 260

Posted by CmdrTaco on Monday February 18, 2008 @12:34PM from the all-bloatetd-and-grown-up dept.

lazyguyuk writes "Tim Bray posts a lengthy blog on the birth of XML, formalized as 1.0 in Feb 1998. 'XML is ten years old today. It feels like yesterday, or a lifetime. I wrote this that year (1998). It's really long. The title was originally Good Luck and Internet Plumbing but the filename was "XML-People" and I decided I liked that better. I never got around to publishing it, so why not now?'"

This discussion has been archived. No new comments can be posted.


Classic (Score:5, Funny)

by Gothmolly ( 148874 ) writes: on Monday February 18, 2008 @12:45PM (#22464414)

Young Buck: Hey, we have a data exchange problem between two systems, let's use XML!
Greybeard: Ok, but now you have 2 problems.

- Re:Classic (Score:5, Insightful)
  
  by smittyoneeach ( 243267 ) * writes: on Monday February 18, 2008 @01:07PM (#22464668) Homepage Journal
  
  In defense of XML, the parsing problem is handled.
  Best wishes on solving the semantic snarls.
  XML, like all good approaches, handles mechanism, not policy.
  
  - Re:Classic (Score:4, Interesting)
    
    by fireboy1919 ( 257783 ) writes: <.gro.llehseerf. .ta. .pytsur.> on Monday February 18, 2008 @01:29PM (#22464930) Homepage Journal
    
    In defense of XML, the parsing problem is handled.
    
    To me that says that XML handles a problem that wasn't there. Parsing problem for pretty much everything is almost universally solved by regex...
    
    I don't really care about the XML format. Personally, I'd be happier if it were stored in binary. The thing I like is the DOM tree as a data construct, XPath as a means of addressing, and XQuery as a means of getting parts out of it. (XSLT is okay, but from my experience, it's a lot clearer to represent a transformation as a series of productions than it is to use XSLT...perhaps a production-oriented approach that used XPath addressing?)
    
    With those, you've got a good mechanism for serializing, reading, and deserializing objects, classes, and all manner of other things.
    
    There are only a few problems with this:
    1) Non-ancestor relationships and references (i.e., having the same node as multiple locations in the XML document) are not covered by XML, but are possible with objects.
    
    2) Attributes in XML have no obvious mapping to objects...so what do you do with them?
    
    I wish we could use something like XML (in that it could use DTDs as schemas, and had support for DOM methods along with XQuery and XPath), but with a more efficient format (binary), and with the ability to encode references.
    
    That would be just about perfect.
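
As a rough illustration of the tree-plus-path-addressing model described above, here is a minimal Python sketch using the standard library's `xml.etree.ElementTree`. That module supports only a limited subset of XPath (not full XPath, and no XQuery), and the catalog document with its element and attribute names is invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical document; the element and attribute names are invented.
DOC = """
<catalog>
  <book lang="en"><title>Plumbing</title><year>1998</year></book>
  <book lang="fr"><title>Tuyauterie</title><year>2008</year></book>
  <book lang="en"><title>Markup</title><year>2008</year></book>
</catalog>
"""

root = ET.fromstring(DOC)

# Address nodes by path instead of walking the tree by hand.
english_titles = [t.text for t in root.findall("./book[@lang='en']/title")]

# Predicates on a child's text are also part of the supported subset.
from_2008 = [b.findtext("title") for b in root.findall("./book[year='2008']")]

print(english_titles)
print(from_2008)
```

The same queries would work on any serialization that produced the same tree, which is roughly the point being made: the tree and the addressing scheme matter more than the angle brackets.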
    
    - Re:Classic (Score:5, Insightful)
      
      by oyenstikker ( 536040 ) writes: <slashdot@ s b y r n e . o rg> on Monday February 18, 2008 @01:45PM (#22465136) Homepage Journal
      
      There are only a few problems with this:
      1) Non-ancestor relationships and references (i.e., having the same node as multiple locations in the XML document) are not covered by XML, but are possible with objects.
      You can with refids and keys.
      but with a more efficient format (binary)
      It is wonderful to be able to easily read and edit the data in a text editor. If you want it more compact for storage and transmission, compress it. I understand that a binary format could lead to more efficient processing and parsing, but I think the benefits of readable text outweigh the efficiency.
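
One way to read the "refids and keys" reply: define a node once, give it an identifier, and point at it from everywhere else. The sketch below is only an assumption about how that might look. The attribute names `id` and `ref` are hypothetical, and `xml.etree.ElementTree` does not process DTD ID/IDREF declarations, so the references are resolved by hand.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: one <person>, referenced from two projects.
DOC = """
<team>
  <person id="p1"><name>Ada</name></person>
  <project name="parser"><member ref="p1"/></project>
  <project name="schema"><member ref="p1"/></project>
</team>
"""

root = ET.fromstring(DOC)

# Index every element that carries an id attribute.
by_id = {el.get("id"): el for el in root.iter() if el.get("id") is not None}

def members(project):
    """Resolve each <member ref="..."/> to the element it points at."""
    return [by_id[m.get("ref")] for m in project.findall("member")]

resolved = {p.get("name"): members(p) for p in root.findall("project")}

# Both projects resolve to the very same node object.
same_node = resolved["parser"][0] is resolved["schema"][0]
print(same_node, resolved["parser"][0].findtext("name"))
```

The document itself is still a strict tree; the shared node only becomes a graph edge after the lookup pass.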
      
    - Re:Classic (Score:4, Insightful)
      
      by Flambergius ( 55153 ) writes: on Monday February 18, 2008 @02:08PM (#22465398)
      
      To me that says that XML handles a problem that wasn't there. Parsing problem for pretty much everything is almost universally solved by regex...
      
      XML doesn't handle parsing. XML makes parsing easier; in fact so easy that parsing XML isn't a problem anymore.
      
      For an expert, I think XML and regex are complementary techniques. For anyone other than an expert regex are way too brittle. Ordinary people need to be able to operate on their data, it can't require voodoo. (Not that XML in all its arcane application is anything close to plain English, but it's much better than custom data formats and regex.)
      
    - Re: (Score:3, Informative)
      
      by shutdown -p now ( 807394 ) writes:
      
      To me that says that XML handles a problem that wasn't there. Parsing problem for pretty much everything is almost universally solved by regex...
      God, no... another Perl hacker...
      Regex are not a solution to everything, and most certainly not to writing fast parsers!
      (Not that XML is easy to parse fast, but that's another story. You still don't write a JSON parser using regex.)
    - - Re: (Score:2)
        
        by fireboy1919 ( 257783 ) writes:
        
        But not all languages are regular languages to be described by regular expressions. Is there a standardized form of confrex or consenex
        
        This is a red herring.
        
        No, not all natural languages are regular, and even most computer languages are not regular.
        
        But I'm pretty sure that all languages (or to go more primitive, algebras) that can be expressed as XML can be parsed by a regular expression.
        
        Can you disprove this?
        
        That's easy to disprove... (Score:2)
        
        by warrax_666 ( 144623 ) writes:
        
        Try writing a regex for parsing documents consisting of arbitrarily deeply nested elements. Say, documents of the form
        
        <x><x><x><x>...</x></x></x></x>
        
        See?
        
        Re: (Score:2)
        
        by account_deleted ( 4530225 ) writes:
        
        Comment removed based on user account deletion
        
        Sure (Score:2)
        
        by warrax_666 ( 144623 ) writes:
        
        That was just the standard trivial example -- it stands to reason that some people have hacked around it since it's such a common practical limitation. There are also other examples, say, anything requiring arbitrary amounts of (token) lookahead to resolve ambiguities.
        
        Re: (Score:2)
        
        by msuarezalvarez ( 667058 ) writes:
        
        You cannot use regular expressions to decide whether a string is an instance of the following DTD or not: <!ELEMENT a (a | b) > <!ELEMENT b EMPTY > This is quite basic language theory.
        
        Re: (Score:2)
        
        by account_deleted ( 4530225 ) writes:
        
        Comment removed based on user account deletion
        
        Re: (Score:3, Informative)
        
        by WilliamSChips ( 793741 ) writes:
        
        No, you cannot with a regex. If you can, it's not really a regex, it's something different.
        
        Re:Regex (Score:5, Informative)
        
        by TheRaven64 ( 641858 ) writes: on Monday February 18, 2008 @03:02PM (#22466122) Journal
        
        You fail Computer Science 101. Regular expressions are exactly as expressive as finite automata. A finite automaton is incapable of solving the matching brackets problem, since that requires a potentially infinite number of states in order to keep track of the number of open brackets in an input stream. Because of this, a regular expression cannot be used to parse any XML schema that allows an arbitrary depth of nesting, since parsing such a form would require counting the open and close tags to make sure they match, which is not possible with a regular expression.
        This is why regular expressions are typically used for lexical analysis (tokenisation) not syntactic analysis (parsing).
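
The split between tokenisation and parsing can be sketched in a few lines of Python. This is an illustration under simplifying assumptions, not a real XML parser: it handles bare tags only (no attributes, text, or self-closing forms), and the helper names are invented. The regex does the lexing, a stack does the nesting check, and a second function builds a pattern for nesting up to a fixed depth to show where a pattern without recursion runs out.

```python
import re

# Lexical level: a regex is perfectly adequate for recognising one tag.
TAG = re.compile(r"<(/?)([A-Za-z][A-Za-z0-9]*)>")

def is_balanced(text):
    """Syntactic level: check nesting with an explicit stack."""
    stack = []
    pos = 0
    for match in TAG.finditer(text):
        if match.start() != pos:      # something other than a tag in between
            return False
        pos = match.end()
        closing, name = match.groups()
        if closing:
            if not stack or stack.pop() != name:
                return False
        else:
            stack.append(name)
    return pos == len(text) and not stack

def nested_regex(max_depth):
    """Build a pattern matching <x>...</x> nested at most max_depth deep."""
    pattern = ""
    for _ in range(max_depth):
        pattern = "(?:<x>" + pattern + "</x>)?"
    return re.compile(pattern)

def nest(depth):
    return "<x>" * depth + "</x>" * depth

print(is_balanced(nest(500)))                      # stack: any depth
print(bool(nested_regex(3).fullmatch(nest(3))))    # within the built-in limit
print(bool(nested_regex(3).fullmatch(nest(4))))    # one level too deep
```

Any fixed pattern of this kind has some maximum depth baked in, while the stack version only needs memory proportional to the depth actually seen.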
        
        
        Re: (Score:3, Insightful)
        
        by account_deleted ( 4530225 ) writes:
        
        Comment removed based on user account deletion
XML and Interfaces (Score:3, Insightful)

by PIPBoy3000 ( 619296 ) writes: on Monday February 18, 2008 @12:46PM (#22464422)

I realize XML is used for a lot of things, but whenever my fellow developers learn that the vendor is shipping us some interface in XML, the groans are audible. About half the time, their XML format isn't quite standard, and we've got to dig around for utilities to try and work with it (or write something custom). I'd say the vast majority of our interfaces are good ol' delimited text files.

For other purposes, XML is great and very readable, but I'm not sure it makes sense to use it everywhere.

- Re:XML and Interfaces (Score:5, Informative)
  
  by MBCook ( 132727 ) writes: <foobarsoft@foobarsoft.com> on Monday February 18, 2008 @01:02PM (#22464622) Homepage
  Here are some of the "fun" things I have run across in other people's (almost certainly custom) XML interpreters/producers:
  
  Tags must be upper case
  
  Tags can't be upper case
  
  You must put line breaks between elements
  
  There can't be any whitespace between elements
  
  It's important to URL encode the XML before it gets sent from them to me
  
  You don't need CDATA blocks, just put the ampersands and >s right in there, it'll be OK
  
  Your XML should all be inside a CDATA block in container XML
  
  No tags can self-close
  
  Self closed tags need a space between the slash and bracket
  
  Self closed tags can't have a space between the slash and bracket
  
  That's just what I can think of off the top of my head. We've seen quite a bit of crazy stuff. If everyone would just use one of the already written XML producers or parsers (the big ones, the ones that work) life would be much easier around here from time to time.
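
To make the list above concrete, here is a small Python sketch showing how a stock parser (the standard library's `xml.etree.ElementTree`, chosen only as an example) treats several of those variations. The `order`, `item`, and `note` element names and the `shape` helper are invented for the illustration.

```python
import xml.etree.ElementTree as ET

def shape(el):
    """Reduce an element to a comparable structure, ignoring surrounding whitespace."""
    return (el.tag, dict(el.attrib), (el.text or "").strip(), [shape(c) for c in el])

variants = [
    "<order><item/></order>",          # self-closed, no space
    "<order><item /></order>",         # self-closed, with space
    "<order><item></item></order>",    # explicit open and close
    "<order>\n  <item/>\n</order>",    # line breaks between elements
]
shapes = [shape(ET.fromstring(v)) for v in variants]
all_same = all(s == shapes[0] for s in shapes)

# Escaped text and a CDATA block carry the same character data.
escaped = ET.fromstring("<note>AT&amp;T &gt; all</note>").text
cdata = ET.fromstring("<note><![CDATA[AT&T > all]]></note>").text

# Tag names are case-sensitive: these are two different element types,
# but neither case is required or forbidden.
upper, lower = ET.fromstring("<ITEM/>").tag, ET.fromstring("<item/>").tag

print(all_same, escaped == cdata, upper != lower)
```

A consumer built on a parser like this has no reason to care which of the equivalent spellings the producer chose.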
  - Here, let me fix that for you ... (Score:5, Insightful)
    
    by trolltalk.com ( 1108067 ) writes: on Monday February 18, 2008 @01:33PM (#22464988) Homepage Journal
    
    If everyone would just use one of the already written XML producers or parsers (the big ones, the ones that work) life would be much easier around here from time to time.
    
    If everyone would just go back to using simple delimited ascii text life would be much easier around here.
    
    - Re: (Score:2)
      
      by msuarezalvarez ( 667058 ) writes:
      
      That will work until you need to use non-ascii content, to include the delimiter in the data itself (so that you need an escaping mechanism), you want to represent hierarchical data, you want to be able to compose data from different sources without semantic collisions, and what not.
      
      So you solve these issues, turning your `simple delimited ascii text' format into something a bit more complicated, and next you will be wanting to interchange instances with others, so you will have to start coming to an agreem
      - Re: (Score:2)
        
        by trolltalk.com ( 1108067 ) writes:
        
        "That will work until you need to use non-ascii content, to include the delimiter in the data itself (so that you need an escaping mechanism), you want to represent hierarchical data, you want to be able to compose data from different sources without semantic collisions, and what not."
        Using a single null (0x00) to delimit each field, and two nulls to delimit each record, is UTF-friendly. As for non-ascii contents, just encode them in base64 (which you would probably be doing anyway in a cdata section).
        
        Re: (Score:2)
        
        by msuarezalvarez ( 667058 ) writes:
        
        So you do not see the use of empty fields?
        
        Re: (Score:2)
        
        by msuarezalvarez ( 667058 ) writes:
        
        And what if you are not dealing with a table with record and fields but with a tree?
        
        Re: (Score:2)
        
        by trolltalk.com ( 1108067 ) writes:
        
        "And what if you are not dealing with a table with record and fields but with a tree?"
        Trees and graphs are not a problem - just like they aren't in regular table design (though it IS ugly). One field holds the parent node record id, or is blank if it's a top-level node. A second field can hold the node type. Extend your schema as required.
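
The parent-id scheme described above can be sketched like this in Python. The record layout (record id, parent id, node type, value) and the sample rows are assumptions made up for the example; the point is only that the reader has to rebuild the hierarchy in a separate pass.

```python
from collections import defaultdict

# Hypothetical flat records: (record id, parent id or None, node type, value).
ROWS = [
    (1, None, "doc", "report"),
    (2, 1, "section", "intro"),
    (3, 1, "section", "results"),
    (4, 3, "para", "table 1"),
]

def build_tree(rows):
    """Group records by parent id, then attach children recursively."""
    children = defaultdict(list)
    for rec_id, parent_id, node_type, value in rows:
        children[parent_id].append((rec_id, node_type, value))

    def attach(rec_id, node_type, value):
        return {
            "type": node_type,
            "value": value,
            "children": [attach(*child) for child in children[rec_id]],
        }

    return [attach(*top) for top in children[None]]

def depth(node):
    return 1 + max((depth(c) for c in node["children"]), default=0)

forest = build_tree(ROWS)
print(forest[0]["value"], depth(forest[0]))
```

The flat file stays trivially easy to split into fields, at the cost of pushing the nesting into application code and an implicit schema.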
        
        Re: (Score:3, Informative)
        
        by trolltalk.com ( 1108067 ) writes:
        
        If you know how many fields there are in each record, then why did you need a special record delimiter to begin with? Sounds like a design mistake, which isn't surprising since it was ad-hoc...
        Wrong - the special null delimiter is needed only for variable-length (and zero-length) fields and records. For fixed-length fields and records, no delimiter is needed.
        For example: First Name\0x00Last Name\0x00Age\0x00\0x00
        Joe\0x00Blow\0x0042\0x00\0x00
        Mary\0x00Doe\0x0024\0x00\0x00
        \0x00Cowboyneal\0x00\0x00\0x
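
The null-delimited layout argued over in this sub-thread can be sketched as follows. The function names are invented, and the thread does not say which decoding strategy the poster had in mind, so both a naive "split on two nulls" reader and a reader that counts fields are shown as assumptions.

```python
FIELD_END = b"\x00"

def encode(records):
    """Each field is followed by one null; each record by one extra null."""
    out = []
    for rec in records:
        out.append(b"".join(f.encode("utf-8") + FIELD_END for f in rec))
        out.append(FIELD_END)
    return b"".join(out)

def naive_decode(data):
    """Split records on a double null, then fields on a single null."""
    return [
        [f.decode("utf-8") for f in chunk.split(FIELD_END)]
        for chunk in data.split(FIELD_END * 2)
        if chunk
    ]

def counted_decode(data, fields_per_record):
    """Ignore the record marker and rely on a known field count instead."""
    parts = data.split(FIELD_END)[:-1]     # drop the empty piece after the last null
    stride = fields_per_record + 1         # the fields plus the record terminator
    return [
        [p.decode("utf-8") for p in parts[i:i + fields_per_record]]
        for i in range(0, len(parts), stride)
    ]

simple = [["Joe", "Blow", "42"], ["Mary", "Doe", "24"]]
tricky = [["", "Cowboyneal", ""]]          # empty first and last fields

print(naive_decode(encode(simple)) == simple)
print(naive_decode(encode(tricky)) == tricky)
print(counted_decode(encode(tricky), 3) == tricky)
```

With an empty trailing field, the field terminator and the record terminator run together, so the double-null split cuts the record in the wrong place; the counting reader copes, but only because the field count is agreed on out of band.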
    - Re:Here, let me fix that for you ... (Score:4, Insightful)
      
      by CaptainPinko ( 753849 ) writes: on Monday February 18, 2008 @03:22PM (#22466350)
      
      ASCII doesn't even support the letters needed by the majority of the world's languages.
      
      - Re: (Score:2)
        
        by trolltalk.com ( 1108067 ) writes:
        
        "ASCII doesn't even support the letters needed by the majority of the world's languages."
        UTF does - so just use UTF - no big deal, and a lot easier to parse out than xml, which is butt-ugly to parse, terrible to index, doesn't support random access read/write, etc.
    - - Re:Here, let me fix that for you ... (Score:5, Insightful)
        
        by kyz ( 225372 ) writes: on Monday February 18, 2008 @02:28PM (#22465636) Homepage
        
        I have, and I can tell you that it's a waste of time.
        
        It amazes me how something that looks so simple can have so many corner cases, and how they can be solved so differently by different implementations.
        
        CSV is fine if you want to store data that has no quote marks, commas, carriage returns or linefeeds. For everything else, please use a better specified format, preferably one that has a formal definition. Like XML, for example.
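
The corner cases mentioned above are easy to reproduce. This Python sketch uses the standard library's `csv` module with its default dialect (one convention among several in the wild) and compares it with a naive split on commas and line breaks; the sample rows are made up.

```python
import csv
import io

rows = [
    ["name", "note"],
    ["Doe, Jane", 'said "hi"'],            # comma and quote marks in the data
    ["multi", "line one\nline two"],       # linefeed inside a field
]

# Write with a CSV-aware writer, which quotes and escapes as needed.
buf = io.StringIO(newline="")
csv.writer(buf).writerows(rows)
text = buf.getvalue()

# Read back with a CSV-aware reader.
parsed = list(csv.reader(io.StringIO(text, newline="")))

# Read back by splitting on line breaks and commas.
naive = [line.split(",") for line in text.splitlines()]

print(parsed == rows)
print(len(naive), [len(r) for r in naive])
```

The aware reader round-trips the data only because it shares the writer's quoting rules; a reader written against a different informal idea of "CSV" would land somewhere between the two results.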
        
        
        Re: (Score:2)
        
        by trolltalk.com ( 1108067 ) writes:
        
        Plain ascii text != csv.
        You can delimit your data with nulls, for example (1 null byte per field, 2 null bytes per record). Even javascript can parse that out, and it's unicode-compatible.
        Or you can use fixed-width fields and records, in which case reading a record is as simple as an lseek (header_size + record_len * (recno - 1)). Generating indexes on the data is also much quicker, as is editing the data. With a fixed-width field and record, there's no need to rewrite the rest of the file if your new v
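
The seek formula quoted above translates fairly directly into code. The following Python sketch is an assumption-laden illustration: the header size, the field widths, ASCII-only content, and space padding are all choices made for the example, and an in-memory `io.BytesIO` stands in for a real file.

```python
import io

HEADER_SIZE = 16                 # assumed size of a file header, in bytes
FIELD_WIDTHS = (10, 10, 3)       # assumed widths: first name, last name, age
RECORD_LEN = sum(FIELD_WIDTHS)

def pack_record(fields):
    """Pad each field with spaces to its fixed width."""
    packed = []
    for value, width in zip(fields, FIELD_WIDTHS):
        raw = value.encode("ascii")
        if len(raw) > width:
            raise ValueError(f"{value!r} does not fit in {width} bytes")
        packed.append(raw.ljust(width))
    return b"".join(packed)

def offset(recno):
    """Byte offset of record number recno (1-based)."""
    return HEADER_SIZE + RECORD_LEN * (recno - 1)

def read_record(f, recno):
    f.seek(offset(recno))
    raw = f.read(RECORD_LEN)
    fields, pos = [], 0
    for width in FIELD_WIDTHS:
        fields.append(raw[pos:pos + width].decode("ascii").rstrip())
        pos += width
    return fields

def write_record(f, recno, fields):
    """Overwrite one record in place; nothing after it has to move."""
    f.seek(offset(recno))
    f.write(pack_record(fields))

f = io.BytesIO()
f.write(b"DEMO".ljust(HEADER_SIZE))
for rec in (["Joe", "Blow", "42"], ["Mary", "Doe", "24"]):
    f.write(pack_record(rec))

write_record(f, 1, ["Joe", "Blow", "43"])
print(read_record(f, 1), read_record(f, 2))
```

The speed comes from the rigidity: values longer than their field are rejected outright here, and trailing spaces in a value cannot be told apart from padding.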
      - Re: (Score:3, Informative)
        
        by trolltalk.com ( 1108067 ) writes:
        
        "Ever tried parsing CSV?"
        All the time. It's not that hard. Also, if you're worried about such things as quoting, etc., you can always use fixed-width fields - makes indexing, looking up, and modifying values REAL FAST. Compare that to the mess of xml.
        
        Re:Here, let me fix that for you ... (Score:4, Interesting)
        
        by thrillseeker ( 518224 ) writes: on Monday February 18, 2008 @03:10PM (#22466190)
        
        I knew we would (d)evolve to punch cards eventually.
        
  - Re: (Score:2)
    
    by Frans Faase ( 648933 ) writes:
    
    My experience with some generic XML parsers has not been very good. (Technically speaking we are dealing here with lexical scanners, not parsers.) Especially the SAX interface is not a pretty one. It is a typical interface that was designed from the inside (the parser) point of view, but not from the user point of view. Also, because XML is very rich, and you hardly ever use all of this richness, there is always a performance penalty. If you have to parse megabyte size of XML files that you know only make use
  - - Re: (Score:2)
      
      by MBCook ( 132727 ) writes:
      
      You'd think. Proper XML isn't a problem. What we run into is people whose XML parsers (which we usually suspect to be custom, often in the form of simple string extraction and not even real parsing) have these weird little desires that are contrary to the XML spec. Some of it seems sane, some of it is way off (I've seen XML, that gets URL encoded, put in a CDATA block, in XML... and that was how they sent everything).
      XML, done right is just fine. But some people just get it very very WRONG.
  - - Re: (Score:2)
      
      by MBCook ( 132727 ) writes:
      
      That's my understanding too. I listed that to make the distinction between it and "you must have a space between the slash and bracket" set. Both upper and lowercase tags are allowed, but some people choose a side (like a bad parser that requires the space) and then require it like it's the law.
- Re: (Score:2)
  
  by Aladrin ( 926209 ) writes:
  
  It never makes sense to use any 1 thing 'everywhere', but if people would actually stick to the standard and use it intelligently, XML could be very beneficial.
  
  Unfortunately, as you point out, very few do. I'm sick of not-quite-standard crap as well. It's a nightmare to work with... Even moreso than no standard would have been.
XML was formalized? (Score:2)

by damn_registrars ( 1103043 ) writes:

Considering all the (internet, and elsewhere) crapola that gets passed around as XML, with pretty much anything-you-want included, I don't really understand how we can call it "formalized".

Add to that the fact that the ability to "display" XML comes down to a whatever-you-want-to-write manner, and I think there are plenty of people who would be hard to convince that there really is a "formal standard" for XML.

Perhaps Duke Nukem Forever will be written with this fantastic standard?
- Re:XML was formalized? (Score:5, Insightful)
  
  by Jerf ( 17166 ) writes: on Monday February 18, 2008 @01:43PM (#22465116) Journal
  
  Yes. XML was formalized. It is strictly defined and easy to check for compliance (with the right tools). Only a little bit of the definition has passed out of common usage, mostly focused around DTDs.
  
  If you encounter a file that claims to be XML, but does not meet the XML standard, then it is not the XML standard that is to blame. The claim is wrong and the file is not XML.
  
  XML is not a fuzzy-wuzzy adjective that can be applied willy-nilly to anything and magically turn it into "XML". It is not a marketing term or English Professor term. It is a rigidly specified engineer term for a document format, and a given document is XML if and only if it meets that format.
  
  If someone wants to hack together a half-assed parser or emitter of any language, they will. I've seen half-assed XML parsers, I've seen half-assed JSON parsers, I've seen half-assed HTML parsers, I've seen half-assed YAML parsers, I've seen ... you get the idea. If a standard can't solve the problem, you can't count the lack of solution against it.
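
As a small illustration of "easy to check for compliance (with the right tools)", here is a Python sketch that asks a standard parser whether a string is well-formed and, if not, where it first goes wrong. It uses `xml.etree.ElementTree`, which checks well-formedness only and does not validate against a DTD or schema; the sample documents and the `check` helper are invented for the example.

```python
import xml.etree.ElementTree as ET

def check(text):
    """Return (True, None) if text is well-formed XML, else (False, (line, column))."""
    try:
        ET.fromstring(text)
    except ET.ParseError as err:
        return False, err.position
    return True, None

samples = {
    "good": "<a><b>1 &lt; 2</b></a>",
    "raw ampersand": "<a>AT&T</a>",
    "mismatched case": "<a></A>",
    "unclosed": "<a><b></a>",
}

results = {name: check(doc) for name, doc in samples.items()}
for name, (ok, where) in results.items():
    print(name, ok, where)
```

Three of the four samples might be handed over as "XML" by a sloppy producer, and a conforming parser rejects each of them rather than guessing.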
  
  - Re: (Score:2)
    
    by damn_registrars ( 1103043 ) writes:
    
    If a standard can't solve the problem, you can't count the lack of solution against it.
    
    Forgive my lack of knowledge on XML - I primarily just see bad implementations of it.
    
    But what problem was XML supposed to solve? Exactly who/what/where was in need of an extensible markup language, anyways?
    - Re: (Score:2)
      
      by msuarezalvarez ( 667058 ) writes:
      
      Anyone wanting to combine different markups, like html+mathml+svg+rdf+etc.
Java and XML, bad tastes that are worse together (Score:4, Insightful)

by Omnifarious ( 11933 ) writes: <eric-slash AT omnifarious DOT org> on Monday February 18, 2008 @12:58PM (#22464572) Homepage Journal

I've recently taken a job at a primarily Java shop. After seeing XML used and abused for ant, maven and various other things I've grown even more disenchanted with it. And now I've also gotten the chance to see that not only does Java represent a poor trade off between the annoyances of a strongly typed language and the speed of a dynamic interpreted one, it has a horrible mess of dependency issues that nobody really solves besides.

I'm much more hopeful about technologies like Thrift [facebook.com] and/or D-Bus [freedesktop.org] than I ever was about such abysmal abominations as SOAP, or the only slightly better XML-RPC.

The Java XML world seems like this little closed ecology of mutual masturbators who all come up with more Java and XML 'solutions' to problems that never existed before they started using Java and XML.

I see the value of XML for long-lived documents that don't spend a lot of their life on the wire. And possibly for config files, though IMHO it is too ugly and unreadable for those. But as a general tool for Internet plumbing it's awful.

- Java and XML - Addendum (Score:4, Insightful)
  
  by Omnifarious ( 11933 ) writes: <eric-slash AT omnifarious DOT org> on Monday February 18, 2008 @01:03PM (#22464632) Homepage Journal
  
  And, of course, my post is incomplete without reference to my little rant on why CORBA and other forms of RPC are bad [omnifarious.org]. Both Thrift and D-BUS are pretty close to the ideal solution I describe later. They focus on message content over semantics and are extremely easy to parse. SOAP and XML-RPC fail on both of those counts. They are about semantics (you are making a remote function call that does some specific thing, not sending a hunk of data that has some particular content) over content and they are a huge pain to parse.
  
  - Re: (Score:3, Interesting)
    
    by cjonslashdot ( 904508 ) writes:
    
    CORBA uses IDL for interface definition. Therefore, you don't even have to write code to parse it: the parsing code is generated automatically. So the arguments about parsing are non issues. With regard to content, one can define content in IDL very easily. I have not used the APIs you refer to (e.g., Thrift), so I cannot comment on those. I will say this though: when I used to write apps 10 years ago using CORBA, it took me so little time to throw a system-to-system interface together that I almost didn't
    - Re:Java and XML - Addendum (Score:5, Insightful)
      
      by Omnifarious ( 11933 ) writes: <eric-slash AT omnifarious DOT org> on Monday February 18, 2008 @01:31PM (#22464960) Homepage Journal
      
      CORBA is a minor pain to parse. From what I could tell you could just sit down with a spec and code up your own parser for ye-old random language in a day or two. But that's not my major issue with it.
      
      My major issue with it was that it promotes designing distributed systems that focus on the semantic roles of the participants instead of the data moving around. In fact it discourages programmers using it from even thinking of what they're doing as sending messages to some system many milliseconds away. Among other evils this leads to all kinds of interesting issues with threading and concurrency that didn't even have to exist.
      
- Re:Java and XML, bad tastes that are worse togethe (Score:3, Interesting)
  
  by MBCook ( 132727 ) writes:
  
  I do a lot of Java and XML. I don't know what you're using for a library, but I'd suggest JDOM.
  As for the abuses for Maven and Ant... yeah. I'll agree. There are a lot of things that seem to use XML just because they can. I know there is some theory behind why they use them (machine readable, blah blah blah) but for most things it's just a giant pain for the complexity you get. Maybe if you were trying to build Windows with Ant.
  - Re:Java and XML, bad tastes that are worse togethe (Score:5, Insightful)
    
    by bckrispi ( 725257 ) writes: on Monday February 18, 2008 @01:36PM (#22465024)
    
    I'll take an Ant XML build file over an "is that a tab or a space" Makefile any day...
    
    - Re:Java and XML, bad tastes that are worse togethe (Score:4, Informative)
      
      by CoughDropAddict ( 40792 ) * writes: on Monday February 18, 2008 @04:14PM (#22466978) Homepage
      
      So you're the guy who shits tabs in random places in source files, because you haven't figured out how to set up your editor to show you the difference. Please stop doing that. Tabs and spaces are different characters, even if the language you're using today treats them the same. If you're a VIM user, please learn to use "list" and "listchars."
      
  - Re: (Score:2, Informative)
    
    by fartrader ( 323244 ) writes:
    
    Java is clearly moving away from the massive over-use of XML in everything from configuration to messaging. From Java 5 onwards, annotations are rapidly becoming the configuration mechanism of choice, where infrastructure configuration is placed in the source code directly, in a way that's significantly less obtrusive than writing code to manage things like persistence and transactions yourself, and significantly easier to follow than placing it in many XML files. Anyone who has migrated from EJB 2.1 to 3.
- Re:Java and XML, bad tastes that are worse togethe (Score:5, Interesting)
  
  by GodfatherofSoul ( 174979 ) writes: on Monday February 18, 2008 @01:24PM (#22464864)
  Yay! Nothing like the combination of XML and Java to bring out the haters. Incompetent use of a language/API doesn't equate to a bad language/API. I can show you plenty of crappy C/C++ code freely browsable in some open source libraries. Does that mean C++ sucks? Hell no.
  My experience with Java+XML you ask? OFX servers for financial institutions. Without name dropping, check out the list of banks, brokerages, tax services, and credit card providers [microsoft.com] (Quicken [microsoft.com]) out there successfully serving up client data. I guess we're all circle jerking while you're downloading your account information into Quicken or Money.
  Some good uses for XML:
  
  Ephemeral representations of atomic, structured data; usually for transport.
  
  Config files. More verbose and the syntax is far better at keeping you from fat fingering a setting and blowing up your app. If you can't clearly read XML, you need glasses.
  
  Some bad uses for XML:
  
  High volume, rapid response data streams; like say an on-line multiplayer game (though I've never benchmarked this)
  
  Unbounded data streams; e.g. streaming media
  
  Databases
  
  I have to admit, I'm clueless about your Java dependency issues. The only way I can see that ever happening is if you're dumping all of your classes into the default top-level package; and that's major user error if you are.
  - Re: (Score:2)
    
    by nguy ( 1207026 ) writes:
    
    Incompetent use of a language/API doesn't equate to a bad language/API
    
    No, but incompetent design of a language/API does, in fact, equate to a bad language/API.
  - Re: (Score:2)
    
    by Omnifarious ( 11933 ) writes:
    
    OFX servers for financial institutions. Without name dropping, check out the list of banks, brokerages, tax services, and credit card providers (Quicken) out there successfully serving up client data.
    I'm aware of OFX, and it is something I consider a non-evil use of XML. It is all about the data, and the data is high-volume, structured and text-like, so something like XML makes sense for representing it.
    OTOH, name dropping gets nowhere with me. Large institutions routinely adopt very stupid technologies for the most ridiculous of reasons. I'm much more interested in what a small, nimble high-tech company like Automated Trading Desk [atdesk.com] is doing than what Chase-Manhattan is doing. Of course, ATD appe
    - Re: (Score:2)
      
      by MBCook ( 132727 ) writes:
      
      That's Maven's job. It's supposed to get all the JARs you tell it to.
      It's not Maven's job to figure out if you actually use a JAR (which gets complicated when code depends on JAR A, which depends on JAR B, which....).
      The usual way to handle something like this is to use Maven to keep things up to date on your machine. You can deploy all those JARs with your program (as you seem to be doing) or you can keep them somewhere else on the server and update them manually. Maven makes sure you have the requisite
  - Re: (Score:2)
    
    by Black-Man ( 198831 ) writes:
    
    That's because OFX IS A DEFINED STANDARD - a standard driven by Intuit. I guess you're too young to remember NPC - a competing standard? Or having to support BOTH? Oh yeah... that was great fun.
    
    You tell me what is a standard in Ant? Nice taking his comments out of context.
  - Re: (Score:2)
    
    by shutdown -p now ( 807394 ) writes:
    
    Some bad uses for XML: Unbounded data streams; e.g. streaming media
    The success of XMPP, which is entirely centered around the concept of an XML stream, seems to disprove this.
    Databases
    Normally true, but for small catalogs of a few hundred records at most, one may consider XML for its ease of handling and recoverability.
    Also, for tree-like structures, XML/XQuery databases can often beat relational (once you start getting into 10+ joins in the latter, that is). Of course good XML databases don't really st
  - Re: (Score:2)
    
    by farnsworth ( 558449 ) writes:
    
    I have to admit, I'm clueless about your Java dependency issues.
    Usually this is because the container you are using depends upon version X of XML Library A (usually to read its own config files, or other boring stuff) while some of your own code or some third-party API you use depends upon version X+1 of that same Library A. It's not an impossible problem to get around, but it's a problem that exists in almost every non-trivial app I've ever worked on.
- Your comments seem tainted with inexperience. (Score:3, Insightful)
  
  by sidragon.net ( 1238654 ) writes:
  
  In general, if you have data to be structured and serialized, XML is one way to do it. If you think XML a poor choice, then could you suggest an alternative? Incidentally, that suggestion should not imply that everyone reinvent their own formats (again).
  
  [N]ot only does Java represent a poor trade off between the annoyances of a strongly typed language and the speed of a dynamic interpreted one ...
  Would you provide evidence aside from personal anecdotes, and possibly consider evidence to the contrary [wikipedia.org]?
  - Re: (Score:2)
    
    by hoggoth ( 414195 ) writes:
    
    > If you think XML a poor choice, then could you suggest an alternative?
    
    YAML for the win!
    YAML is concise, easy to read, easy to write, easy to parse, easy to edit.
    It has high signal-to-noise ratio, and is effortless for the human eye.
    It can represent any data structure I can imagine.
    It has libraries for any popular language I can think of.
  - Re: (Score:3, Informative)
    
    by argent ( 18001 ) writes:
    
    If you think XML a poor choice, then could you suggest an alternative?
    
    Depends on the problem you're trying to solve.
    
    A hell of a lot of the stuff I'm seeing in XML these days would be better off as token-separated self-describing tables (tables where the column names are the first row), or a modestly extended token-separated format like CSV.
    
    For binary data something derived from Electronic Arts semi-self-describing interchange file format is good, examples in current use are MIDI File Format and Portable Net
- Re:Java and XML, bad tastes that are worse togethe (Score:2)
  
  by bcrowell ( 177657 ) writes:
  
  Java and XML are similar in that both of them got over-hyped. They're also similar in that sometimes they really are the right solution -- just not as often as PHBs seem to think. I've had exactly one application where I started designing the file format, and realized, "Oh heck, I'm reinventing XML," so I went with XML and it was the right choice. For config files, the advantage I can see is that although XML may not be optimal for every type of config file, it does provide an alternative to the traditional
- Re:Java and XML, bad tastes that are worse togethe (Score:2)
  
  by AlXtreme ( 223728 ) writes:
  
  I don't know about Thrift being a real contender in the web/internet-based services area. Really, code generation? How 80's. Haven't we learned enough from Sun RPC that this is a PITA, give me a proper library dammit! And AFAIK D-Bus is for local IPC, good luck sending messages over a network without a couple of hoops to jump through.
  
  I can see your viewpoint, if you want to squeeze as much performance out of your application you might want to investigate Thrift, D-Bus or simply write your own TCP protocol.
Oblig (Score:5, Funny)

by mariuszbi ( 1113049 ) writes: on Monday February 18, 2008 @01:00PM (#22464598)

XML is like violence.. when it doesn't work, use some more!

Re: (Score:2)

by account_deleted ( 4530225 ) writes:

Comment removed based on user account deletion
- Re: (Score:2)
  
  by EMN13 ( 11493 ) writes:
  
  Semantics are difficult. XML does not solve semantic issues like what tags mean. Be happy if your RSS provider provides syntactically valid XML - at least you can unambiguously interpret the structure of the document now!
  
  As to semantics, if you're trying to interpret such a home-grown format as atom/rss, without reference implementation (and most specifically without a good test), your problems lie not with XML, but with that spec.
  
  And indeed, RSS and atom aren't very good in that sense. It may be hard to
Is XML just SGML redux? (Score:2)

by OrangeTide ( 124937 ) writes:

XML doesn't seem like a big deal. SGML was around since the mid-80s, making it over 20 years old. XML is stricter in many ways, and layers some useful concepts on top of SGML. But otherwise it seems to have a lot of the same uses and syntax as SGML itself.

As a side note, I dislike it when people use XML inappropriately, like using XML-RPC when something based on ASN.1 might be more appropriate. (How many wannabe MMORPG projects have I read that are "XML-RPC" based? too many). I'm sure there are good uses fo
XML lite (Score:2)

by hey ( 83763 ) writes:

There needs to be some description of an XML lite.
For config files and such.

- No doctype needed
- tags are case insensitive
- Can do comments with # character instead of
- Etc
Inferior to S-expressions (Score:2)

by 5pp000 ( 873881 ) writes:

TFA is a fun read. Too bad XML sucks. As Jerome and Philip Wadler write [ed.ac.uk], "[T]he essence of XML is this: the problem it solves is not hard, and it does not solve the problem well."
Lisp had the same problem solved 40 years earlier. While a lot of people find S-expressions verbose, XML is quite a bit more verbose. Slava Akhmechet has a nice essay [defmacro.org] on the relationship between the two notations.
- Re: (Score:2)
  
  by 5pp000 ( 873881 ) writes:
  
  Whoops -- the authors of the linked paper should have been given as Jérôme Siméon and Philip Wadler. Sorry for the error.
- Re: (Score:2)
  
  by Shados ( 741919 ) writes:
  
  Really, XML does solve the problem. The only issue with it is that it's designed to solve ALL problems, instead of using the usual 80/20 rule... instead of being optimized for most problems, it uses the lowest common denominator to try and catch the other 20%... and everything that makes XML suck comes from that extra 20%.
  
  If it was easier to handle dates in JSON without schemas, we'd have one heck of a winner there though.
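The point about dates in JSON can be illustrated with a minimal sketch, assuming Python's standard `json` module and a made-up `posted` field. JSON has no date type, so the usual workaround is to agree out of band that a given string is an ISO 8601 timestamp, which is exactly the kind of agreement a schema would otherwise record.

```python
import json
from datetime import datetime, timezone

# Hypothetical timestamp and field name, chosen only for illustration.
stamp = datetime(2008, 2, 18, 12, 34, tzinfo=timezone.utc)

# The standard encoder has no date type, so dumping a datetime raises TypeError.
try:
    json.dumps({"posted": stamp})
    plain_dump_failed = False
except TypeError:
    plain_dump_failed = True

# Common workaround: send the date as an ISO 8601 string.
encoded = json.dumps({"posted": stamp.isoformat()})
decoded = json.loads(encoded)

# The reader gets back a plain str and has to know, without any schema,
# that this particular field should be converted back into a date.
restored = datetime.fromisoformat(decoded["posted"])
```

Nothing in the decoded document marks `posted` as a date; the conversion on the last line works only because both sides already agree on the convention.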
What's happened with XML reminds me of... (Score:2)

by AmazingRuss ( 555076 ) writes:

what happened with DBASE and its kin. It's easy enough to use that any idiot can...and you end up with schema that reflect that idiocy.

XML isn't the problem. Idiots writing XML is. I'm beginning to think that a certain level of difficulty is necessary as a screening device.
- Comment removed (Score:5, Informative)
  
  by account_deleted ( 4530225 ) writes: on Monday February 18, 2008 @12:50PM (#22464480)
  
  Comment removed based on user account deletion
  
  - Re: (Score:2)
    
    by TechyImmigrant ( 175943 ) * writes:
    
    >By keeping your data in an XML format, you can use simple XSL stylesheets to generate multiple types of output.
    
    Just like LaTeX! Reinvention is a wonderful thing.
    - Comment removed (Score:4, Informative)
      
      by account_deleted ( 4530225 ) writes: on Monday February 18, 2008 @01:22PM (#22464846)
      
      Comment removed based on user account deletion
      
      - Re:10 Years and still waiting (Score:5, Informative)
        
        by TheRaven64 ( 641858 ) writes: on Monday February 18, 2008 @02:44PM (#22465862) Journal
        
        Does anyone still use latex2html? All of the TeX users I know who care about HTML output switched to tex4ht years ago. It produces a variety of XML formats, including XHTML (with MathML) and OpenDocument.
        
  - Re: (Score:2, Interesting)
    
    by Coelacanth ( 323321 ) writes:
    
    Excellent point, and I'll take it one step further. When coupled with XSLT and other WS-* standards, you have an extremely flexible way to connect otherwise absurdly different applications (See Sun's OpenESB and JBI standard).
    
    The hatred for XML, I think, stems from frequent, ugly misuse. Here's one basic, freakin' obvious rule: if a human, at any time at all, has to read or manually edit an XML document, you're doing it wrong. Just because it's ASCII doesn't mean it's human-compatible.
    - Re:10 Years and still waiting (Score:4, Insightful)
      
      by iamacat ( 583406 ) writes: on Monday February 18, 2008 @02:30PM (#22465666)
      
      Here is another obvious rule: If a computer, at any time at all, has to parse or generate XML in large amounts, you are doing it wrong. There is really no need to resend the same string 100000 times, encode multi-megabyte binary data as BASE64 or lose floating point precision by encoding to or from strings. If need be, an efficient binary format can represent the data with an arbitrary schema. Communicating parties can exchange their schemas at runtime and avoid sending attributes that the other end is not going to use.
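Two of the costs named above can be put in numbers with a short sketch, assuming Python's standard library and a randomly generated 3 MB payload (both are choices made here, not anything from the thread). It measures the size overhead of Base64 and checks when a float survives a trip through text.

```python
import base64
import os

# Hypothetical 3 MB binary payload standing in for "multi-megabyte binary data".
payload = os.urandom(3_000_000)
encoded = base64.b64encode(payload)

# Base64 emits 4 output characters for every 3 input bytes.
overhead = len(encoded) / len(payload)

# Floats survive text encoding only if enough digits are written out.
x = 0.1 + 0.2
roundtrip_full = float(repr(x))       # shortest repr that round-trips exactly
roundtrip_short = float("%.6f" % x)   # fixed six decimals loses the low bits
```

The Base64 overhead is a fixed one third regardless of the data. The float case is less clear-cut than the comment suggests: precision is lost only when the writer truncates the digits, which depends on the serializer and not on XML itself.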
      
      - Re: (Score:2)
        
        by Planesdragon ( 210349 ) writes:
        
        Here is another obvious rule: If a computer, at any time at all, has to parse or generate XML in large amounts, you are doing it wrong
        Depends on what you're doing.
        
        One computer storing temporary data? XML is worthless. A computer storing data for use on said same computer? XML brings little to the table.
        
        One computer program writing something that a different computer program will read from a file system at a later date? Look at XML. If you save a non-trivial amount of processor or developer time, go with it.
        
        And let's ignore the fact that AJAX really doesn't work without XML, will we? Because that kind of defeats the original whiney
      - Re: (Score:3, Insightful)
        
        by David Gerard ( 12369 ) writes:
        
        It Depends. We have systems that are arranged in a long content chain. One machine sends data to the next machine, maybe by pull, maybe by push. Next machine does ... something ... with it, passes it to next machines. Maybe the developers talk to each other, or remember why their predecessor made the system do that, or maybe they don't. XML is really Just The Thing for the job. And the fact that it can be tweaked by a human (e.g. the sysadmin who has to fix a broken thing) is fantastically useful.
    - Re: (Score:2)
      
      by jmorris42 ( 1458 ) * writes:
      
      > Here's one basic, freakin' obvious rule: if a human, at any time at all, has to read or manually edit an XML document, you're doing it wrong.
      
      Amen! Which is why I absolutely HATE HATE HATE XML config files. Because they aren't human readable and editing one is an invitation to disaster. There are no editors so XML is only useful for apps to communicate with each other. And there are equally useful ways for that to be implemented.
      
      Seriously, there is no editor. I'm told you can buy them for Windows i
    - - Re: (Score:3, Insightful)
        
        by colmore ( 56499 ) writes:
        
        xhtml is one very small dialect of xml.
        
        when you are entering html style markup tags, you are using xml. but xml is a much much larger subject than that. hand editing a website is fine. (if the documents are getting huge, it should be split into smaller files and automated somehow, anyway) hand editing, say, Open Office's xml format or any of the fairly arcane XML formats used for interprocess communication is not.
        
        XML is sort of designed to be the second best data format for any application. There are a lot of
- YAML and JSON (Score:3, Insightful)
  
  by goombah99 ( 560566 ) writes:
  
  I'm perpetually surprised every time I see a new implementation of XML. For example, macintosh plists, many of which replace older ad hoc Unix configs, are in XML. Why oh why do people use XML for data centric, quasi-human readable configuration files when YAML [wikipedia.org] is the ideal solution for this. And for web usage, where perl, python, and ruby abound, why would people not use YAML since it's so easy to parse with just regular expressions, and because you don't have to instantiate the multi-megabyte str
  - Re: (Score:2, Insightful)
    
    by ral8158 ( 947954 ) writes:
    
    Actually, OS X uses plists because XML, which is more widely known than YAML and much easier to learn, is built directly into the Cocoa API.
    
    Using an XML file basically consists of the following code:
    NSError *xmlError = nil;
    NSXMLDocument *doc = [[NSXMLDocument alloc] initWithContentsOfURL:[NSURL URLWithString:@"Put your URL here"] options:NSXMLDocumentTidyXML error:&xmlError]; // Handle errors around here
    
    Then you can basically do anything with your doc object. You can insert a child at a certain index, you can
  - Re: (Score:3, Informative)
    
    by tjansen ( 2845 ) writes:
    
    As you say, YAML is a specialized markup-language (data-centric, almost human-readable) and not a good choice for many use-cases (document-centric languages like XHTML and DocBook, combining languages with XML namespaces). In other words, it can not replace XML, it's just another syntax to learn. It needs a completely new infrastructure: new parsers, new editors, new schema description language, new translation languages and so on. Is that really worth it, only to make editing files with a simple text edito
    - Re: (Score:2)
      
      by goombah99 ( 560566 ) writes:
      
      Is that really worth it, only to make editing files with a simple text editor easier?
      I think you answered the question. Yes. for many uses being able to use a simple text editor is great.
      
      Have a look sometime at the YAML documentation's quick reference card: it's written in YAML and fits on one page. That's how compact yet human-readable it is; try that in XML. [yaml.org]
      
      These days everything is one library away from being immediately available for use. Since YAML can do everything XML can do, and because it's trivial to insert XML into YAML (but not the reverse), you really could replace XML with
      - Re: (Score:2)
        
        by tjansen ( 2845 ) writes:
        
        Actually, after looking at that reference card, YAML is much more complex than I thought it was.. compared to that, XML is simple (provided you ignore all that outdated crap like DTD/Doctype, processing instructions) and just use elements, attributes and built-in entities.
        
        Re: (Score:3, Interesting)
        
        by goombah99 ( 560566 ) writes:
        
        Actually, after looking at that reference card, YAML is much more complex than I thought it was.. compared to that, XML is simple (provided you ignore all that outdated crap like DTD/Doctype, processing instructions) and just use elements, attributes and built-in entities.
        
        Well good for you, for actually looking. But as you say about XML, most of the time you only use the base elements in YAML too. In YAML those are "-" for arrays, ":" for hashes, and "|" for block quotes. YAML streamlines things even further by getting rid of close-tags and it mostly dispenses with attributes being special data and having to live in tag, and just merges them all into the payload area, putting all data and attributes on equal footing.
        
        Here's another document to look at that's a great 1-pag [yaml.org]
    - Re: (Score:3, Insightful)
      
      by fireboy1919 ( 257783 ) writes:
      
      In other words, it can not replace XML
      
      That's pretty much completely wrong. YAML's functionality is a superset of XML's while being easier to read & understand (because the *basic* usage of it is exactly the same as XML's, but with a simpler syntax). It just hasn't been adopted anywhere except configuration because that's the easiest niche to move into.
      
      it's just another syntax to learn.
      
      That's a stupid thing to say. Anybody that can't learn the syntax of either XML or YAML in less than five minutes sho
  - Re:YAML and JSON (Score:5, Funny)
    
    by cliveholloway ( 132299 ) writes: on Monday February 18, 2008 @01:57PM (#22465282) Homepage Journal
    
    <reply xmlns="Slashdot:Comment"> <paragraph> <sentence>What?</sentence> <sentence>Are you telling me that this isn't the preferred way of presenting data?</sentence> <sentence>Honestly, this & SOAP are two technologies that have made my life so much more "interesting" as a developer.</sentence> <sentence>Fucking XML...</sentence> </paragraph> </reply>
    
    - Re: (Score:2)
      
      by goombah99 ( 560566 ) writes:
      
      Beautifully said.
    - Re: (Score:3, Insightful)
      
      by msuarezalvarez ( 667058 ) writes:
      
      I've never understood why people complain about XML as you do.
      
      Are you generating XML by hand in your applications? Are you not parsing it using some standard library into an abstract tree or using a standard library to transform XML documents into sequences of events, in exactly the same way lex tokenizes a string of characters? Are you generating it by concatenating strings?
      SOAP is complicated, but that has nothing to do with XML.
      XML does exactly one thing: it allows you to pretend that data is provide
      - Re: (Score:3, Interesting)
        
        by bytesex ( 112972 ) writes:
        
        It's bad because people ARE generating XML by hand, which, according to the spec, they should be able to do, making a lot of syntactical mistakes in the process (to which it is prone). Plus; it's terrible to read. It's also bad because on the machine side, it takes a lot of effort (CPU cycles, parser-programmer effort) to decipher. In other words, it's the worst of both worlds. It's the Visual Basic of formats: you can really only use it with GUI tools, but you can't really do what you really want to do w
  - Re: (Score:2)
    
    by TheRaven64 ( 641858 ) writes:
    
    For example, macintosh plists, many of which replace older ad hoc Unix configs, are in XML.
    Property lists are part of the OpenStep specification (circa 1993). They have one canonical representation, which is similar to JSON (which postdates it by some years). OS X also supports two other representations, one is XML, the other binary. The XML form is commonly used on OS X, presumably so that they can be modified using XSLT or XPath type things.
    I agree, it doesn't make a huge amount of sense. Both the old format and the binary format are easier to parse than the XML format, and the old form
  - Devolution (Score:2)
    
    by Ilan Volow ( 539597 ) writes:
    
    Because the only thing that's more scary and complex than the overly-complicated RDF we have today is the under-planned, overly-extended JSON and YAML that we'll have five years from now, whose original form is twisted and contorted beyond recognition in an attempt to make it do things in the future that XML was designed to do from the get-go.
  - Re: (Score:2)
    
    by Just Some Guy ( 3352 ) writes:
    
    Why oh why do people use XML for data centric, quasi-human readable configuration files when YAML is the ideal solution for this.
    I'm looking at the YAML 1.1 specs [yaml.org] and don't see anything about schemas or data validation. Am I overlooking something?
    because you don't have to instantiate the multi-megabyte structured data entire file just to grep out one record.
    You don't have to do that [wikipedia.org] with XML, either. You can, but you don't have to.
    But really folks, do yourself and the rest of us a favor and read up on JSON and YAML.
    (Un)?fortunately, "the rest of us" seems to make up about 5% of the programming population. The rest of the rest of us are using XML.
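The claim that XML does not force you to load a whole file to find one record can be sketched with event-based parsing. This is a minimal illustration assuming Python's standard `xml.etree.ElementTree.iterparse`; the `records`/`record`/`name` layout and the helper `find_record` are made up for the example.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical record layout, invented for this sketch.
data = b"<records>" + b"".join(
    b'<record id="%d"><name>user%d</name></record>' % (i, i) for i in range(1000)
) + b"</records>"

def find_record(stream, wanted_id):
    """Return the <name> text of one record, reading the stream incrementally."""
    for _event, elem in ET.iterparse(stream, events=("end",)):
        if elem.tag == "record":
            if elem.get("id") == wanted_id:
                return elem.findtext("name")
            elem.clear()  # drop the children of records already passed
    return None

name = find_record(io.BytesIO(data), "742")
```

The function returns as soon as the wanted record closes, so the rest of the stream is never read. A SAX handler would give the same effect with callbacks instead of a loop.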
- Re: (Score:2)
  
  by Klaus_1250 ( 987230 ) writes:
  
  If everyone had jumped on the boat 10 years ago, it might have. But that didn't happen.
  
  XML is too difficult, and allows abuse/over-use too easily. Personally, I love it, but I'm a minority. The other key-factor is that there is simply no short term need for it in many places. Or better, the need for it isn't recognized by the majority. Pragmatic solutions have a tendency to win over new revolutionary ones.
- Re:10 Years and still waiting (Score:4, Informative)
  
  by EMN13 ( 11493 ) writes: on Monday February 18, 2008 @01:36PM (#22465030) Homepage
  
  I use it in web development constantly, and have for about 8 years. It's great for documents mostly since it's much easier to process than a home-grown set up.
  
  You want to transform the document, you can use any of a number of techniques, and trivially guarantee that the resulting document is at least syntactically valid. If you use a home-grown format (or HTML), you'll need to resort to regular expressions, or a custom parser - which works fine up to a point. Regexes are error prone (it's quite difficult, for instance, to make an untrusted HTML document safe with regexes), and parsing is difficult, and doesn't solve the transformation step very elegantly - whereas XPath and others are absolutely brilliant for quickly distilling the stuff you need from a document.
  
  But on the parsing side... take a look at ANTLR, it's just great :-).
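The two claims above, that path expressions distill parts of a document and that a transformed tree serializes to something syntactically valid, can be sketched as follows. This assumes Python's standard `xml.etree.ElementTree`, which supports only a limited XPath subset, and an invented `article`/`section`/`para` document.

```python
import xml.etree.ElementTree as ET

# Hypothetical document, invented for this sketch.
doc = ET.fromstring(
    "<article>"
    "<section title='Intro'><para>Hello</para></section>"
    "<section title='Details'><para>First</para><para>Second</para></section>"
    "</article>"
)

# Pull out the paragraphs of one section with a path expression.
details = [p.text for p in doc.findall("./section[@title='Details']/para")]

# Transform the tree, inserting text that would break naive string concatenation.
doc.find("./section[@title='Intro']/para").text = "Tom & Jerry <3"

# The serializer escapes the special characters, so the output stays well-formed.
serialized = ET.tostring(doc, encoding="unicode")
reparsed = ET.fromstring(serialized)
```

Because the edit happens on the tree and not on the text, the `&` and `<` are escaped on output and the result parses again without any hand-written quoting.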
  
- - Re: (Score:2)
    
    by somersault ( 912633 ) writes:
    
    I thought that was caused by people adding comments boxes to webpages? You don't need XML to do web 2.0 type stuff :o
- Re:IVE BEEN WAITING SINCE 1998 (Score:4, Funny)
  
  by halivar ( 535827 ) writes: <bfelger@gmail.cGIRAFFEom minus herbivore> on Monday February 18, 2008 @01:00PM (#22464596)
  
  Looks like you're going to have to wait a little longer. Try holding your breath, this time.
  
- Re: (Score:2)
  
  by bckrispi ( 725257 ) writes:
  
  You *do* realize that most systems that read/write xml also read/write gzipped xml, dont'cha?
  - Re: (Score:2)
    
    by smcdow ( 114828 ) writes:
    
    This doesn't address the requirement that binary data be encoded/decoded to/from base64.
- Re: (Score:2)
  
  by tjansen ( 2845 ) writes:
  
  Actually XOP has W3C technical recommendation status since October 2005: http://www.w3.org/TR/xop10/ [w3.org]
  - Re: (Score:2)
    
    by smcdow ( 114828 ) writes:
    
    Unless I've missed something, the xop:Include element would be base64 encoded.
    - Re: (Score:2)
      
      by tjansen ( 2845 ) writes:
      
      In the data-model: yes. When being transmitted: no.
      
      XOP optimizes only the transport of base64 encoded binary objects. When you parse the file with an XOP-capable parser, the element would look to you like a base64 string.
