XML Support In Office 2003 Isn't For Everyone 213

Posted by timothy
from the market-will-bear-a-lot dept.
0x0d0a writes "Unfortunately, it seems that Microsoft's recent campaign to promote Office 2003 based on its XML support may be a bit misleading. Only the Enterprise and Professional releases will have this support -- not Standard. Microsoft will still be leveraging file format compatibility for at least another Office release."
  • by numbski (515011) <numbskiNO@SPAMhksilver.net> on Sunday April 13, 2003 @05:00PM (#5722996) Homepage Journal
    "But analysts contend that WordML's compliance with industry standards is a misnomer. Because the schema isn't fully documented, people who want to edit files created in Office 2003 will only be able to do that with Office itself, as before. Text in Office 2003 files stored in XML format might be viewable in other desktop programs, but all document formatting would be lost and most other files would be unreadable."

    Love thy neighbor. Embrace and extend my brothers.

    Amen.
    • by torpor (458) <jayv@NoSpam.synth.net> on Sunday April 13, 2003 @05:04PM (#5723009) Homepage Journal
      Right, like we couldn't have seen this coming from a looong way off.

      I've given up on Office completely. I even try to reject .DOC files outright - thanks to .PDF, that's been mostly successful.

      "Compatibility" is still a bitch's game.

      • Right, like we couldn't have seen this coming from a looong way off.

        They surprised me - I expected:
        <?xml version='1.0'?>
        <ms_word_doc>

        insert base64(word97format) here

        </ms_word_doc>

        The DTD would be the ms_word_doc tag defined as a CDATA field. Perfectly valid XML.
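
        A minimal sketch of that joke format in Python, assuming a made-up byte string stands in for a real Word 97 file. It only demonstrates that a base64 blob inside a single element is well-formed XML; it does not validate against any DTD, and the helper names are hypothetical.

```python
import base64
import xml.etree.ElementTree as ET

# Stand-in bytes (an assumption); a real .doc would be read from disk instead.
fake_doc_bytes = b"\xd0\xcf\x11\xe0 not a real Word file"

def wrap_blob(blob: bytes) -> str:
    """Wrap arbitrary binary data in a single <ms_word_doc> element."""
    root = ET.Element("ms_word_doc")
    root.text = base64.b64encode(blob).decode("ascii")
    return ET.tostring(root, encoding="unicode")

def unwrap_blob(xml_text: str) -> bytes:
    """Parse the wrapper and recover the original bytes."""
    root = ET.fromstring(xml_text)
    return base64.b64decode(root.text)

wrapped = wrap_blob(fake_doc_bytes)
# Any XML parser accepts the wrapper, yet the payload stays opaque.
assert unwrap_blob(wrapped) == fake_doc_bytes
```

        The parser is perfectly happy, and a reader without the binary format's documentation learns nothing new.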
    • by Anonymous Coward on Sunday April 13, 2003 @05:24PM (#5723095)

      Text in Office 2003 files stored in XML format might be viewable in other desktop programs, but all document formatting would be lost

      Actually, this is entirely the point of XML. XML is not Yet Another Word Processor Format. It's intended to store "content" as opposed to "presentation", leaving "presentation" up to the app, much as was the original intent of HTML. Rather than an evil Microsoft plot, they are in fact conforming to the spec when they produce such a file.

      The semi-trailer truck sized hole in the notion is, of course, that "presentation" isn't really entirely separable from "content", especially in a modern document. All that graphic-artist stuff like layout and font choice and formatting actually affects the value and usefulness of the document. That's why we put it in in the first place. And that's why everyone always whines when Word strips out all the "presentation" they've spent all that effort putting into the document and just leaves them with the raw XML "content" -- a bunch of text.

      The flaw here is in the attempt to erect too high of a wall between presentation and content, not in Word.

      By the time you get fine-grained enough control over the presentation to create documents that actually look the way you want, the "content" usually becomes illegible. Alternatively, you have only coarse control over the presentation, in which case the content most often looks like crap. This problem is easily seen in any number of web pages that feel obliged to include some little rant at the top about bloated HTML and how they concentrate on "pure content", which usually means a sea of unreadable and undifferentiated Times Roman.

      The flip side is if you actually do break up the content enough to get control over the presentation. The last time someone tried to create a human-readable ASCII-text format for documents, they wound up with Postscript. A typical document actually looks something like:

      /Euro.Helvetica
      [556 0 24 -19 541 703 ]

      AddEuroGlyph /Euro /Helvetica /Helvetica-Copy BuildNewFont
      } if
      F /F4 0 /256 T /Helvetica mF /F4S53 F4 [83 0 0 -83 0 0 ] mFS
      F4S53 Ji
      688 1320 M ( )S
      F2S53 Ji
      800 1518 M (802.3z Gigabit Eth)[42 42 42 21 42 36 21 60 23 41 37 42 23 23 21 51 23 0]xS
      1431 1518 M (ernet local)[37 28 41 37 23 21 23 42 37 37 0]xS
      1781 1518 M (-)S
      1809 1518 M (side interface)[32 23 42 37 21 23 41 23 37 29 27 37 37 0]xS
      2255 1518 M ( )S
      F3S53 Ji
      650 1620 M S
      F4S53 Ji
      688 1620 M ( )S
      F2S53 Ji
      800 1620 M (Supports f)[46 41 42 42 42 28 23 32 21 0]xS
      1145 1620 M (ull Gigabit line rate)[41 23 23 21 60 24 41 37 42 23 23 21 23 24 41 37 21 28 37 23 0]xS
      1795 1620 M ( )S
      F3S53 Ji
      650 1722 M S
      F4S53 Ji
      688 1722 M ( )S
      F2S53 Ji
      800 1722 M (Operates in either media convert)[60 42 37 28 37 23 37 32 21 23 41 21 37 23 24 41 37 28 22 63 37 42 23 37 21 37 42 42 42 37 28 0]xS
      1888 1722 M (er)[37 0]xS
      1953 1722 M ( or line)[21 42 28 21 23 23 41 0]xS
      2189 1722 M (-)S
      2216 1722 M (card )[37 37 28 42 0]xS
      800 1817 M (mode)[63 42 42 0]xS

      Here's a hint. The "content" is clearly delimited by parentheses (instead of, oh, "") Easily readable by humans, right? A cinch to import into other applications, right? Guess what: a real XML word processing document that kept the presentation information isn't going to be any more readable. You're not just going to whip out vi and fix it up any more than you can do that to your Postscript documents now.

      XML is not magic application pixie dust that makes all features transparently interoperable when you sprinkle it on.

      • by Erik Hollensbe (808) on Sunday April 13, 2003 @05:55PM (#5723243) Homepage
        It's great to see someone else gets it. Postscript is actually a language which describes layout -- really, nothing is stopping you from doing all your work in it. Same with TeX. Of course, both languages (and they are true languages) are extremely complex and generally benefit from a middle-ground tool to do the real work (LyX, TeXinfo, Acrobat, Dia (?) etc).

        Treat XML like a database. It has rules of operation, but what you contain and how you describe the data are completely arbitrary.

        That said, if Office were really aiming for interoperability, they would publish the XML schema and layout rules. However, as most of us already know, it's just yet another business with the desire to put "XML" on their "Corporate Resume" to make them look more "open".

        Sorry for all the double-quoted words. :)
        • > Sorry for all the double-quoted words. :)

          "layout" *tick*
          "middle-ground" *tick*
          "interoperability" *tick*
          "XML" *tick*
          "Corporate Resume" *tick*

          BINGO! [everything2.org]

        • by torpor (458) <jayv@NoSpam.synth.net> on Sunday April 13, 2003 @06:12PM (#5723309) Homepage Journal
          Treat XML like a database. It has rules of operation, but what you contain and how you describe the data are completely arbitrary.

          Anyone who has used XML knows perfectly well that it's entirely possible to describe the complete dataset for content, layout, and presentation, within an XML document, in a form which can be easily parsed by humans and software alike. Completely. Using open standards, even.

          Consequently, it's also possible to wrap it all up in 'parseable', yet 'unhandleable-unless-you're-on-the-inside' data blobs which mean nothing to no-one, yet still use 'XML' as a wrapper.

          It's a liability of having such an open design, and Microsoft are exploiting this fact, in the context of *CLEAR* market-division tactics.

          *They* created the artificial 'Professional/Enterprise/Standard' labels. Not the Users.

          MS' use of XML here is perverted. It serves no purpose other than to give MS an opportunity to blag press release points about how their software uses 'the latest open standards' to people who have *NO CLUE* what they're talking about ...
      • Rubbish! (Score:2, Insightful)

        by torpor (458)
        It's intended to store "content" as opposed to "presentation", leaving "presentation" up to the app, much as was the original intent of HTML. Rather than an evil Microsoft plot, they are in fact conforming to the spec when they produce such a file.

        This is just the sort of disinfo that MS themselves love to seed. Classic post, nice try.

        It's just not true. XML is *NOT* 'just' a presentation format, a la HTML (nice smear), nor is it 'inevitable' that the fileformat ends up like Postscript.

        XML is a text
        • Re:Rubbish! (Score:5, Insightful)

          by MrWa (144753) on Sunday April 13, 2003 @06:36PM (#5723433) Homepage
          It's just not true. XML is *NOT* 'just' a presentation format, a la HTML (nice smear), nor is it 'inevitable' that the fileformat ends up like Postscript.

          So wait a second - the original post stated that XML is ALL about the content and specifically NOT the presentation. Now you are saying that XML is apparently *self documenting* and the USER decides how the content should be displayed.

          So, according to your post, Microsoft is correct when their XML file output includes the *content* and the *user* can display it however they want.

          • Re:Rubbish! (Score:3, Insightful)

            by torpor (458)
            No, what I'm saying is that there's no reason for Microsoft to not have used the parseability features of XML to make their document formats more open - to do so would have fulfilled the actual purpose of XML.

            These other posts are out of field. XML can be used to store content, as well as all significant details about how that content should be displayed/portrayed to the user in various scenarios, in a way in which the details can be easily parsed - both by software and by human.

            XML is an attempt to prol
            • Re:Rubbish! (Score:2, Interesting)

              by MrWa (144753)
              That makes more sense.

              From a layman's perspective it appears that the biggest problem is that people just don't understand XML and what it is used for (I admit that I don't.) So, while a document may be XML-compliant, from what I have seen, it isn't necessarily readable by any other program.

              If the data is not readable by any other program then, yes, that is pretty useless as far as data lifetime is concerned. Does this data, though, differentiate between content and presentation? Is it important to know exa

              • Re:Rubbish! (Score:2, Insightful)

                Well, XML is, at the end of the day, JUST a file markup language. No company in the world (be they Microsoft, Sun, or IBM) is going to write an XML document that can just instantly be loaded in any application -- there always has to be code to parse that XML file and do something with it (like display it on-screen).

                I'm not sure where these people who thought Office being XML would instantly make it compatible with other word processors are coming from -- if the other word processors don't implement suppo

        • Re:Rubbish! (Score:5, Informative)

          by tinrib (632120) <davidNO@SPAMstain.org> on Sunday April 13, 2003 @07:43PM (#5723813)

          XML is a text-based system for data storage and retrieval, intended to be *self documenting*. In other words, the details on what fonts are used, what settings The User has set for individual parts of the documents, the parameters for those setting, etc. ARE ALL SUPPOSED TO BE STORED IN READABLE FORMAT WITHIN XML TAGS, CONFORMING TO A KNOWN, PUBLISHED DOCUMENT DESCRIBING THE CONTENT.

          No it's not. XML is not supposed to store information such as 'font' and other presentational features. This is the job of the XSL stylesheets or CSS etc. XML is designed to store data in a structured way. So for instance you may have a <chapter> tag, but what font to use for chapter tags is only supposed to be specified in the XSLT. If I did an XML export of my word document, I would expect (hope for) an XML document, and either an XSLT stylesheet transforming the XML to HTML, or an XSL:FO stylesheet so that I can turn the XML into a pdf or postscript file. However, the stylesheets would be the 'icing on the cake'. The essential item is the XML formatted data, not the presentational information.
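
          A rough sketch of that separation in Python. The standard library has no XSLT processor, so a plain dictionary (`STYLE`, a hypothetical stand-in) plays the stylesheet's role here; the element names and font choices are invented for illustration.

```python
import xml.etree.ElementTree as ET
from html import escape

# Structured content only: no fonts, sizes, or colours anywhere.
CONTENT = """<book>
  <chapter><title>Formats</title><para>XML stores structure.</para></chapter>
</book>"""

# Hypothetical presentation rules, kept apart from the content.
# In a real pipeline an XSLT or CSS stylesheet would do this job.
STYLE = {
    "title": ("h1", "font-family: Helvetica"),
    "para": ("p", "font-family: Times"),
}

def render(content_xml: str, style: dict) -> str:
    """Walk the content tree and emit HTML using the separate style table."""
    root = ET.fromstring(content_xml)
    out = []
    for elem in root.iter():
        if elem.tag in style:
            html_tag, css = style[elem.tag]
            text = escape(elem.text or "")
            out.append(f'<{html_tag} style="{css}">{text}</{html_tag}>')
    return "\n".join(out)

html_out = render(CONTENT, STYLE)
print(html_out)
```

          Swapping in a different `STYLE` table changes the look without touching `CONTENT`, which is the split being described.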

          • You keep talking about how it is "supposed" to work but they are just tools. XSLT does not have to be used just for presentation. XSLT can be used to transform data from one format to another (for instance when converting from one vendor-specific format to another). On the flip side, an XML document can contain all the formatting information, a la the Apple Keynote software and its open DTD.

            Certainly if one WANTED to erect a high wall between content and formatting one could use XML and XSLT to do so.
          • 1. XML is not supposed to store information such as 'font' and other presentational features.

            2. This is the job of the XSL stylesheets

            Bzzzt. XSL is XML. XSL stores fonts.

            XML stores data. [text location='1.25 from left margin, 8in from top']This is [italicized angle='23.t degrees']text[/italic].[/text] (with appropriate bracket substitutions) is a perfectly valid XML document; your computer doesn't crash nor does the universe blow up if you submitted this to an XML processor.

            Now this document format
          • Re:Rubbish! (Score:3, Insightful)

            by TummyX (84871)

            No it's not. XML is not supposed to store information such as 'font' and other presentational features. This is the job of the XSL stylesheets or CSS etc.


            Um. XML is for storing any kind of information -- including font styles. It's just a better idea to separate those two concerns into separate schemas.

            PS. XSL uses XML!
      • I disagree.

        Of course, opening the XML file in another application would ignore formatting...just like opening a .DOC file in a hex or text editor would ignore formatting.

        The issue here is the documentation of the formatting-related tags. Take the two following XML 1.0 fragments:

        1:

        <content>Hello! This text is <style add="italic">different</style>.</content>

        2:

        <content>Hello! This text is <msft secretFormattingCode="0x3B">different</msft>.</content>

        Both
      • Actually, this is entirely the point of XML. XML is not Yet Another Word Processor Format. It's intended to store "content" as opposed to "presentation", leaving "presentation" up to the app, much as was the original intent of HTML. Rather than an evil Microsoft plot, they are in fact conforming to the spec when they produce such a file.

        Bullshit, XML is designed to describe absolutely anything and everything.

        Sure, in the case of XHTML there is a desire to eliminate presentation aspects through the use of
  • Propaganda (Score:4, Funny)

    by ahkbarr (259594) on Sunday April 13, 2003 @05:02PM (#5723000)
    This is not reliable source! This is US led propaganda campaign!

    Seriously, though, who here could not have predicted this?
    • Any fans of Mohammed Saeed al-Sahaf, the Iraqi Information Minister (or "Comical Ali" as he is also now known) should be sure to visit this site. [welovethei...nister.com]
    • by superyooser (100462) on Sunday April 13, 2003 @06:03PM (#5723279) Homepage Journal
      Mohammed Saeed al-Sahaf, Iraqi Minister of Information:
      "I triple guarantee you, there are no American soldiers in Baghdad."

      Mohsen Khalil, Iraqi Ambassador to the Arab League:
      "Iraq will not be defeated. Iraq has now already achieved victory - apart from some technicalities."

      Jean Paoli, Microsoft's XML architect:
      "I'm out of the business of creating formats. Our focus on Office is on data exchange. There is no more difference between documents and data."

      ... apart from some.. err umm "technicalities"?

  • by MrLint (519792) on Sunday April 13, 2003 @05:02PM (#5723003) Journal
    Of course sniping is way too easy here, so I will.

    For the love of B0B, how hard is it to deploy the feature across the entire suite? What can we conclude here?

    1) It's not really ready and the high-end versions will ship later.
    OR
    2) It's a cheap ploy to rake in more money later on.

    *sigh*
  • by Anonymous Coward on Sunday April 13, 2003 @05:03PM (#5723008)
    but XML support in OpenOffice [openoffice.org] is.

    ------------
    This is guaranteed to not be the first post.
  • by ralphart (70342) on Sunday April 13, 2003 @05:05PM (#5723013)
    The sun rose this morning; sunset predicted for later today!
  • importing (Score:3, Interesting)

    by Snuffub (173401) on Sunday April 13, 2003 @05:06PM (#5723016) Homepage
    But doesn't this mean that the Standard edition will at least be able to import the Office XML files? Otherwise who would use it? And if that's the case, it means that at the very least the Standard edition would be able to import files flawlessly from any office app that supports the format.
    • The point you miss is that apart from the embrace, extend, exterminate action on XML, this is an exercise by Microsoft to get all business users upgrading to Professional from Standard. Their argument will be that XML is probably overkill for home users who only type in Outlook Express anyway.

      So if they do get above their station and think they need Office then Standard without XML and no Access is OK for them to do their typing and checkbook balancing on.

      But if you are in business you need the Profess

  • Schools? (Score:5, Insightful)

    by shibbydude (622591) on Sunday April 13, 2003 @05:07PM (#5723018) Homepage Journal
    I really like the idea of human-readable code, but who really wants to lose backward compatibility with all the rest of the versions? At my high school, there are about five versions of Office, all of which save in different formats. Most people save as rich text just for compatibility, because even the small releases or updates do not save in a compatible format for the older releases. If we introduce a format which is absolutely not readable by older versions, it will not only baffle our techies for months, but I know productivity will take a hit when students *accidentally* save in the wrong format and then cannot open it for the life of them.

    This is one reason I use OpenOffice (openoffice.org [openoffice.org]) at home, as it supports most Word versions flawlessly, without prompting me to "insert office cd 2" to install the feature.

    • At my high school, there are about five versions of Office, all of which save in different formats. Most people save as rich text just for compatibility, because even the small releases or updates do not save in a compatible format for the older releases

      This just isn't true. I'm not even going to take the effort to refute it.
      • True, true... (Score:4, Informative)

        by Chordonblue (585047) on Sunday April 13, 2003 @07:55PM (#5723903) Journal
        "This just isn't true."

        The hell it isn't. Ever try to open an older Works document in 'X' version of MS Office?

        How about support of international versions? Can a Japanese student use their version of Office 97 to write an English document, printable in our labs? Dunno. Sometimes.

        How about opening say, a Word 97/2000/XP doc in Office 95? Oh, right, that doesn't work either.

        Schools aren't like your average corporation. We can't always afford to go out and get the latest and greatest. I also have to question WHY we'd even bother doing so and I wish our public schools would seriously consider this question as well - our tax dollars can be better spent. To be honest, Office 97 was all we ever really needed functionality-wise.

        Then there's what happens when a student goes home and works on a paper. Who knows WHAT format it'll come back in. The biggest problem for us has been when an upgrade cycle comes around and some of my students (or parents) end up with it (came with their brand new PC).

        Last year I posed a question to the teachers: Why not use Open/StarOffice? This has (for the most part) solved our compatibility issues. As I work for an international school, we have students with every version of MS Office, Works, WordPerfect, hell, even NOTEPAD!

        Standardizing everyone (teachers, parents, students) on OpenOffice.org was the smartest thing we've ever done. Document compatibility was a major factor in that decision.

        • Re:True, true... (Score:3, Insightful)

          by cyril3 (522783)
          Ever try to open an older works document in 'X' version of MS Office?

          How about opening say, a Word 97/2000/XP doc in Office 95

          The way I read the post in question the poster referred to the difficulty of opening in say Office 97 a document saved in the format of a later version say XP. As far as I know apart from one stupid upgrade a while back which they fixed, any MS office prog can save in the format appropriate to an earlier version. Please let me know where I'm wrong. (International version differen

          • Except OpenOffice.org is feature-rich, free, and cross-platform, making it a better than average choice for standardization.
          • "any MS office prog can save in the format appropriate to an earlier version..." ...as long as you HAVE it. That's my point! Only endless upgrades will ensure this kind of compatibility unless you can standardize on a more open standard.

            Also, you'd be surprised how many people have a difficult time understanding why their .doc here doesn't work with their .doc at home. Doesn't make much sense to me either - how the hell can you tell? Why should it matter?

            And as I believe I did mention international compat
  • On file formats (Score:5, Insightful)

    by bogie (31020) on Sunday April 13, 2003 @05:08PM (#5723021) Journal
    The entire business world is still being held hostage or pushed around by a proprietary file format. How sad, annoying, and wasteful.

    I always said during the DOJ trial all I wanted was to have the Office file formats opened. That would have really led to some change.

    Btw in case you're new here, try OpenOffice, you might like it.

    www.openoffice.org
    • Btw in case you're new here, try OpenOffice, you might like it.

      Picky point, but due to some copyright or trademark or something I believe you are referring to the OpenOffice.org office suite.

      Although in person I use the shorter, technically incorrect name too.
  • by bamberg29 (240460) on Sunday April 13, 2003 @05:10PM (#5723035)
    I've been using Office 2003 Beta 2 for about a month now and the XML support seems fairly poor. I've saved some of my Word documents in XML format and tried opening them in some other XML-capable programs, but had a hard time opening them. I guess MS needs to work some more on the XML support in Office.

    David
    • by sciwhiz007 (665637) on Sunday April 13, 2003 @05:16PM (#5723055) Journal

      We have to remember that this is Microsoft we are talking about here. Any time they say "we are going to switch to an open format", there's always a catch to it.

      Is Microsoft ever going to switch to an open format? No, why would they? They will only lose money. As for the people complaining about competition, why should a company with 90 - 95% of the desktop Office suite market care?

      People with little or no knowledge about what Microsoft has done in the past might think that Microsoft is taking a great step forward. But remember, this isn't going to be complete XML, it is "Microsoft XML"

      All this about Microsoft doing a great thing by switching to an "Open XML base" is all hype, nothing more.

    • by Anonymous Coward
      Wasn't the Office tag format supposed to be

      <data>
      afl3iuao3fa#FA(U#F#(UFWLIJFwlkfjaw3f
      </data>

      ?
  • by DogIsMyCoprocessor (642655) <dogismycoprocessor&yahoo,com> on Sunday April 13, 2003 @05:14PM (#5723043) Homepage
    Microsoft announced that only the Enterprise and Professional versions of Office 2003 would support the feature of saving files to industry-standard media such as IDE and SCSI hard disks. The Standard version of Office 2003 will allow the user to save document files only to Microsoft Zippo (TM), a new proprietary USB-based external removable media device. "We believe this is an innovative way to provide extra value to our customers." said Microsoft spokesman Hugh Jass.
  • And when... (Score:5, Insightful)

    by Black Parrot (19622) on Sunday April 13, 2003 @05:15PM (#5723050)


    If they ever do make it general they'll encumber the components with so many patents and copyrights that it will be a proprietary format in spite of being XML based.

    The people running Microsoft might not be "nice", but they certainly aren't stupid. Moving to an open file format would immediately saw one of the legs out from under their monopoly. Expect them instead to vaporize the file format issue and drag it out as long as possible, so that people and companies tempted to switch to a WP with an open format will think they can get the open formats without switching, if only they wait a little longer and pay for a few more upgrades.

  • Oh Come On (Score:3, Funny)

    by Snowspinner (627098) <philsand@@@ufl...edu> on Sunday April 13, 2003 @05:20PM (#5723071) Homepage
    The reason you don't have all versions of Office be identical is that then you wouldn't need different versions. The Standard versions of programs contain fewer features than the Professional and other shiny versions. This is to help justify charging more for the professional versions. This is not unreasonable. As with much of capitalism, paying more gets you more. Jesus, some days I think MS could liquidate and give all their money to the EFF and still get flamed by you people.

    • As with much of capitalism, paying more gets you more.

      This is ok most of the time, because providing more service actually costs more money. That isn't the case here---the code exists and cost the same to develop whether 10 people use it or 100 people.
      • Re:Oh Come On (Score:2, Insightful)

        by Snowspinner (627098)
        The cost of something in a capitalist system is ultimately not based on its production cost, but on its value to the end user.

        The fact that MS could put the XML in the home version at no cost is irrelevant. The important thing is that there exist people who will pay more money to get the functionality offered in the Professional version of Office over the Home version.

        Therefore the Professional version costs more.
        • No, you're wrong. Go read some economics. Capitalist systems are supposed to maximise the number of transactions; the number of transactions increases as the price decreases towards marginal cost. Draw your own conclusions.
    • The article does say that this is the first time that different Office versions will have different capabilities of the same program. Previously, the more expensive versions just got you more programs, like Outlook, Access, etc.
      • Re:Oh Come On (Score:2, Informative)

        by Snowspinner (627098)
        Regardless, other companies (Adobe springs most quickly to mind) have been packaging Light and Full versions of software for ages. Depending on how much functionality you want/need, you can pay more or less.

        Hell, Microsoft basically did this with Windows XP Home and Professional, with Home having a cap on its network size. Though I think that particular move was fucking absurd (My home is not a small office or business, but has too many computers to network on XP Home).
  • by Trailer Trash (60756) on Sunday April 13, 2003 @05:25PM (#5723101) Homepage

    Microsoft will still be leveraging file format compatibility for at least another Office release.

    Here we go again. "If Microsoft would just use an open format like XML then anyone could read the documents with any program and the world would be a better place."

    XML is a format for creating data formats. It is not a data format. The fact that a particular format is XML compliant says nothing for its readability, it simply means that it can be parsed into a document tree by an XML parser. That doesn't mean that anybody can determine what the tree represents, only that it can be created. My favorite analogy: "If Microsoft would just start using 8-bit bytes, then anybody could read their file formats."
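
    To make that concrete, here is a small Python sketch using two fragments quoted elsewhere in this thread (one with a self-explanatory attribute, one with an invented `secretFormattingCode`). The `describe` helper is hypothetical; the point is only that a parser builds a tree from both equally well.

```python
import xml.etree.ElementTree as ET

documented = '<content>Hello! This text is <style add="italic">different</style>.</content>'
opaque = '<content>Hello! This text is <msft secretFormattingCode="0x3B">different</msft>.</content>'

def describe(xml_text):
    """Return the first child's tag, its attributes, and the plain text."""
    root = ET.fromstring(xml_text)
    child = root[0]
    return child.tag, dict(child.attrib), "".join(root.itertext())

tag1, attrs1, text1 = describe(documented)
tag2, attrs2, text2 = describe(opaque)

# Both parse, and both yield the same bare text once markup is ignored.
print(tag1, attrs1, text1)
print(tag2, attrs2, text2)
```

    The tree for the second fragment is just as valid as the first; what the value 0x3B means is something only the format's owner can say.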

    Microsoft has made it clear that the dollar value of secret file formats isn't lost on them. They will continue to use secret file formats, even if they're XML-based, until someone makes them stop. At the same time, they'll be able to harvest the stupidity of PHB's who will claim that Microsoft file formats are open because they're XML. It's surprising how many people on Slashdot foolishly believe the same.

    Michael

    • Wait until everyone's using it.
      Then GPL it.

      That'll learn 'em.
    • That doesn't affect the validity of my post. I simply said that they'd be leveraging file format compatibility for at least another release.

      You're certainly correct that using XML doesn't mean that the document will be parseable and renderable, but I'd say that the move *is* a prerequisite -- no one is going to be able to implement all the quirks and legacy crap in the current parser. (This is the same parser that accidentally left chunks of uninitialized data from the disk in saved files on the Mac a few
    • by jkarlin (171967) on Sunday April 13, 2003 @07:35PM (#5723779) Homepage
      Excellent points. I'm using the Beta of Office 2003 Pro and I just saved 'Hello World' as an Office XML file. Thought it would be nice to actually see what we're talking about.

      <?xml version="1.0" encoding="UTF-8" standalone="yes"?>
      <?mso-application progid="Word.Document"?>
      <w:wordDocument xmlns:w="http://schemas.microsoft.com/office/word/2003/2/wordml" xmlns:v="urn:schemas-microsoft-com:vml" xmlns:w10="urn:schemas-microsoft-com:office:word" xmlns:SL="http://schemas.microsoft.com/schemaLibrary/2003/2/core" xmlns:aml="http://schemas.microsoft.com/aml/2001/core" xmlns:wx="http://schemas.microsoft.com/office/word/2003/2/auxHint" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:dt="uuid:C2F41010-65B3-11d1-A29F-00AA00C14882" xml:space="preserve"><o:DocumentProperties><o:Title>Hello World</o:Title><o:Author>Jason Karlin</o:Author><o:LastAuthor>Jason Karlin</o:LastAuthor><o:Revision>1</o:Revision><o:TotalTime>0</o:TotalTime><o:Created>2003-04-13T23:29:00Z</o:Created><o:LastSaved>2003-04-13T23:29:00Z</o:LastSaved><o:Pages>1</o:Pages><o:Words>1</o:Words><o:Characters>11</o:Characters><o:Lines>1</o:Lines><o:Paragraphs>1</o:Paragraphs><o:CharactersWithSpaces>11</o:CharactersWithSpaces><o:Version>11.4920</o:Version></o:DocumentProperties><w:fonts><w:defaultFonts w:ascii="Times New Roman" w:fareast="Times New Roman" w:h-ansi="Times New Roman" w:cs="Times New Roman"/><w:font w:name="Tahoma"><w:panose-1 w:val="020B0604030504040204"/><w:charset w:val="00"/><w:family w:val="Swiss"/><w:pitch w:val="variable"/><w:sig w:usb-0="21007A87" w:usb-1="80000000" w:usb-2="00000008" w:usb-3="00000000" w:csb-0="000101FF" w:csb-1="00000000"/></w:font></w:fonts><w:styles><w:versionOfBuiltInStylenames w:val="3"/><w:latentStyles w:defLockedState="off" w:latentStyleCount="156"/><w:style w:type="paragraph" w:default="on" w:styleId="Normal"><w:name w:val="Normal"/><w:rPr><wx:font wx:val="Times New Roman"/><w:sz w:val="24"/><w:sz-cs w:val="24"/><w:lang w:val="EN-US" w:fareast="EN-US" w:bidi="AR-SA"/></w:rPr></w:style><w:style w:type="character" w:default="on" w:styleId="DefaultParagraphFont"><w:name w:val="Default Paragraph Font"/><w:semiHidden/></w:style><w:style w:type="table" w:default="on" w:styleId="TableNormal"><w:name w:val="Normal Table"/><wx:uiName wx:val="Table Normal"/><w:semiHidden/><w:rPr><wx:font wx:val="Times New Roman"/></w:rPr><w:tblPr><w:tblInd w:w="0" w:type="dxa"/><w:tblCellMar><w:top w:w="0" w:type="dxa"/><w:left w:w="108" w:type="dxa"/><w:bottom w:w="0" w:type="dxa"/><w:right w:w="108" w:type="dxa"/></w:tblCellMar></w:tblPr></w:style><w:style w:type="list" w:default="on" w:styleId="NoList"><w:name w:val="No List"/><w:semiHidden/></w:style><w:style w:type="paragraph" w:styleId="BalloonText"><w:name w:val="Balloon Text"/><w:basedOn w:val="Normal"/><w:semiHidden/><w:rsid w:val="4E5A63"/><w:pPr><w:pStyle w:val="BalloonText"/></w:pPr><w:rPr><w:rFonts w:ascii="Tahoma" w:h-ansi="Tahoma" w:cs="Tahoma"/><wx:font wx:val="Tahoma"/><w:sz w:val="16"/><w:sz-cs w:val="16"/></w:rPr></w:style></w:styles><w:docPr><w:view w:val="print"/><w:zoom w:percent="100"/><w:doNotEm
      • And on my computer it looks something like this.

        Hel lo World

        Ha ha, that's funny. I do a copy and paste on the text section and it's got an extra space in the middle of the word Hello.

        It looks like the spellchecker in 2003 Pro doesn't work yet either.

      • I notice there are http URLs for the schemas. If they contained the specifications for the parse tree, it could be quite informative (even if it just listed them -- complete context is helpful). Sadly, they simply get 404s (though the schemas.microsoft.com server exists). I wonder if that violates some sort of standard....
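        On the 404s: as far as the Namespaces in XML recommendation goes, a namespace name is only an identifier, and nothing requires it to resolve to a schema or to anything at all. The sketch below lists the namespace declarations of a document so they can be inspected. `SAMPLE` is a cut-down stand-in for the paste above (root element only, Slashdot's inserted spaces removed), and `list_namespaces` is a hypothetical helper name.

```python
import io
import xml.etree.ElementTree as ET

# A cut-down stand-in for the pasted file: only the root element and a few of
# its namespace declarations (the Slashdot-inserted spaces are removed).
SAMPLE = b"""<?xml version="1.0"?>
<w:wordDocument
    xmlns:w="http://schemas.microsoft.com/office/word/2003/2/wordml"
    xmlns:v="urn:schemas-microsoft-com:vml"
    xmlns:o="urn:schemas-microsoft-com:office:office"/>
"""


def list_namespaces(xml_bytes):
    """Return {prefix: uri} for every namespace declared in the document."""
    namespaces = {}
    # The "start-ns" event fires once per xmlns declaration and yields
    # a (prefix, uri) tuple.
    events = ET.iterparse(io.BytesIO(xml_bytes), events=("start-ns",))
    for _event, (prefix, uri) in events:
        namespaces[prefix] = uri
    return namespaces


namespaces = list_namespaces(SAMPLE)
for prefix, uri in sorted(namespaces.items()):
    kind = "http URL" if uri.startswith("http://") else "URN"
    print(f"{prefix}: {uri} ({kind})")
```

        Nothing here fetches the URIs; the parser treats them as opaque strings, which is why a 404 does not stop the file from parsing.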
      • Fucking hell - 3571 characters to produce "hello world?"

        Pity there isn't an obfuscated XML contest - we'd have a winner here.
      • OO output (Score:5, Interesting)

        by IamTheRealMike (537420) <mike@plan99.net> on Monday April 14, 2003 @04:11AM (#5726341) Homepage
        For comparison, here is the equivalent (empty) document in OpenOffice.

        content.xml:

        <?xml version="1.0" encoding="UTF-8"?>
        <!DOCTYPE office:document-content PUBLIC "-//OpenOffice.org//DTD OfficeDocument 1.0//EN" "office.dtd">
        <office:document-content xmlns:office="http://openoffice.org/2000/office" xmlns:style="http://openoffice.org/2000/style" xmlns:text="http://openoffice.org/2000/text" xmlns:table="http://openoffice.org/2000/table" xmlns:draw="http://openoffice.org/2000/drawing" xmlns:fo="http://www.w3.org/1999/XSL/Format" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:number="http://openoffice.org/2000/datastyle" xmlns:svg="http://www.w3.org/2000/svg" xmlns:chart="http://openoffice.org/2000/chart" xmlns:dr3d="http://openoffice.org/2000/dr3d" xmlns:math="http://www.w3.org/1998/Math/MathML" xmlns:form="http://openoffice.org/2000/form" xmlns:script="http://openoffice.org/2000/script" office:class="text" office:version="1.0">
        <office:script/>
        <office:font-decls>
        <style:font-decl style:name="Arial Unicode MS" fo:font-family="'Arial Unicode MS'" style:font-pitch="variable"/>
        <style:font-decl style:name="HG Mincho Light J" fo:font-family="'HG Mincho Light J'" style:font-pitch="variable"/>
        <style:font-decl style:name="Nimbus Roman No9 L" fo:font-family="'Nimbus Roman No9 L'" style:font-family-generic="roman" style:font-pitch="variable"/>
        </office:font-decls>
        <office:automatic-styles/>
        <office:body>
        <text:sequence-decls>
        <text:sequence-decl text:display-outline-level="0" text:name="Illustration"/>
        <text:sequence-decl text:display-outline-level="0" text:name="Table"/>
        <text:sequence-decl text:display-outline-level="0" text:name="Text"/>
        <text:sequence-decl text:display-outline-level="0" text:name="Drawing"/>
        </text:sequence-decls>
        <text:p text:style-name="Standard"/>
        </office:body>
        </office:document-content>

        meta.xml:
        <?xml version="1.0" encoding="UTF-8"?>
        <!DOCTYPE office:document-meta PUBLIC "-//OpenOffice.org//DTD OfficeDocument 1.0//EN" "office.dtd"><office:document-meta xmlns:office="http://openoffice.org/2000/office" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:meta="http://openoffice.org/2000/meta" office:version="1.0"><office:meta><meta:generator> OpenOffice.org 1.0.1 (Linux)</meta:generator><!--SRC641_[7663]_LINUX_INTEL__stripples.devel.redhat.com_at_9/10/02_8:50:05 --><meta:creation-date>2003-04-14T09:09:00</meta:creation-date><dc:language>en-GB</dc:language><meta:editing-cycles>1</meta:editing-cycles><meta:editing-duration>PT0S</meta:editing-duration><meta:user-defined meta:name="Info 1"/><meta:user-defined meta:name="Info 2"/><meta:user-defined meta:name="Info 3"/><meta:user-defined meta:name="Info 4"/><meta:document-statistic meta:table-count="0" meta:image-count="0" meta:object-count="0" meta:page-count="1" meta:paragraph-count="1" meta:word-count="0" meta:character-count="0"/></office:meta></office:document-meta>

        That is only 2 out of the 4 or 5 files OpenOffice saves. Oh, and for all those who made sucky Base64 jokes about MS WordML, take a look at this:

        <config:config-item config:name="PrinterSetup" config:type="base64Binary">ugL+/0dlbmVyaWMgUHJpbnR lcgAAAA
        AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAA
        AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAU0 dFTlBSVAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAWAAMAAAIAAAAA
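        For the curious, the start of that blob is easy enough to peek into. A minimal sketch, assuming the space in the first pasted line is a Slashdot line-breaking artifact and only decoding that first line (the rest of the value is truncated above and is left alone):

```python
import base64

# First line of the PrinterSetup value as pasted above. The space is assumed
# to be a Slashdot line-breaking artifact, so it is stripped before decoding.
fragment = "ugL+/0dlbmVyaWMgUHJpbnR lcgAAAA".replace(" ", "")

# Base64 decodes in groups of four characters; drop the incomplete tail group.
usable = fragment[: len(fragment) // 4 * 4]
decoded = base64.b64decode(usable)

print(decoded)
```

        The decoded bytes contain the ASCII string "Generic Printer", so this looks like a binary printer settings structure rather than document content.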
  • by stj (607714)
    I just wonder what kind of model exactly they have with this. If someone wants portability, they would prefer being able to import/export data to as many users as possible. So, let's look at the big picture: your company buys the Pro edition, and you get a PC at home with the home edition. In short, you're grounded at work... So, either you buy Pro for home (which will be yet another license in addition to your OEM home version) or you end up doing acrobatic conversions five times a day. I'm really hoping to see some
  • by ignatz (10191) on Sunday April 13, 2003 @05:27PM (#5723112) Homepage
    This is the user schema support we're talking about here - which allows users to build Word templates from XML schema and then use them to save schema-compliant XML from documents. This is only being included in the professional and enterprise SKUs, not the home and SME SKUs.

    User schema aren't really suitable for home and SME users - it's the sort of thing you need if you're dumping XML output into enterprise applications, and want your data entry folk to use their usual Office applications.

    For XML transfer, WordML is still supported in all SKUs. It is defined by a schema at a specific URI, so it will validate in most parsers.

    What will be much more interesting will be understanding the pricing for InfoPath...
  • Some alternatives (Score:5, Informative)

    by stonebeat.org (562495) on Sunday April 13, 2003 @05:28PM (#5723121) Homepage
    I'm not counting on MS Office Suite to provide me with an XML editor. Here are some alternatives:
    DocSoft's W2XML Version 2 [docsoft.com]
    Authentic by Altova [altova.com]
    i4i Tagless Editor [i4i.com]
    XMLWriter by Wattle Software [xmlwriter.net]
    Opensource Extensible XML Modeling Application [xerlin.org]

    If you know of any other GUI based XML modeling/editing apps, please feel free to add them to this list.
    • I don't think anyone was going to use it as an XML editor, more like using the XML to export documents to other (free) office packages to get out of MS lock-in.
  • by littleRedFriend (456491) on Sunday April 13, 2003 @05:29PM (#5723124)


    Develop once, sell many times...
  • Open Office (Score:5, Informative)

    by Anonymous Coward on Sunday April 13, 2003 @05:31PM (#5723136)
    This is as good a time as any to migrate away from Microsoft Office. Open Office 1.1 is about to come out and it looks brilliant!! (the beta is currently available at http://www.openoffice.org/ ) It supports open standards (e.g. XML), Microsoft documents (Word/Excel/PowerPoint) and exports to PDF (both text and graphics) at the press of a button! It also manages to count page numbers correctly when printing (* cough - Word, cough *).

    On the other hand, my wife prefers Word and I prefer Open Office. The only time she likes Open Office is when she asks me to convert a document from one Word format to another - because Word won't do it at all, or Word converts it very badly.

    Also, I save several hundred dollars every few years :)

    AC
  • by 2TecTom (311314) on Sunday April 13, 2003 @05:33PM (#5723144) Homepage Journal
    lemme see ...

    there's MS Java, then there's the other version
    there's MS HTML, then there's the other version
    there's MS VC++, then there's the other version
    there's MS OS's then there's the other OS ...

    same ol same ol ... see Bill, see Bill emulate.

    Nope, nuthin new here folks, move along ... ;)
  • by wouterke (653865) on Sunday April 13, 2003 @05:46PM (#5723210) Homepage

    Even if XML was supported in all versions of Office, would that mean that Office would suddenly have an open file format? I don't think so. It's perfectly possible for me to write anything in XML in a way that you will not be able to read it.

    Which is normal. XML is a way to describe data. If you have the Document Type Definition (DTD) of an XML file, the only thing you know is whether that XML file is structured correctly, and how you would create another XML file that would look like the same thing to an XML parser. Nothing more.

    In the long run, XML is nothing more than a standard you can use to base other standards on. XML can be put in the same row as ASCII, bytes, the file concept, or even SGML: it's a standard intended for the creation of other standards.

    Nothing more, nothing less.

    Therefore, I think the argument that Microsoft Office will 'support XML' is just a marketing joke. It won't do anything out of the ordinary...
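    To make the point concrete, here is a minimal sketch. The `OPAQUE` document is entirely made up: it is well-formed, any parser will walk it, and the structure can be listed, yet nothing in it says what `q7` or `k2` mean. `outline` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: well-formed XML whose element names and values
# carry no meaning for anyone without the author's private documentation.
OPAQUE = """<a1>
  <q7 z="3"><k2>0x4E5A63</k2></q7>
  <q7 z="9"><k2>0x020B06</k2></q7>
</a1>"""


def outline(element, depth=0):
    """Return the element tree as indented lines of tag names."""
    lines = ["  " * depth + element.tag]
    for child in element:
        lines.extend(outline(child, depth + 1))
    return lines


root = ET.fromstring(OPAQUE)  # parses without complaint
structure = outline(root)
print("\n".join(structure))
```

    The parser recovers the tree shape and nothing else; the semantics live in documentation that the format's owner may or may not publish.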

  • by cyberformer (257332) on Sunday April 13, 2003 @05:48PM (#5723221)
    Microsoft will still be leveraging file format compatibility for at least another Office release.

    They'll do this as long as they have a monopoly (or near-monopoly). The XML support isn't about making file formats compatible with competitors, or even about pretending to. It's just one more feature that MS has added to Office, in an attempt to persuade existing users to upgrade. It means that Office can be used to edit XML documents. It doesn't mean that Office's proprietary file formats are disappearing.

    XML editing is a useful feature for some people, and from what I've heard it works better than the horrible HTML support in previous versions of Office, but it's still a niche. (True, it can be used to help with cross-platform compatibility, but so can RTF and other existing "save as" options.) Most users just want to write a letter or design a presentation, and aren't concerned with markup languages.
  • by Eric Smith (4379) <eric.brouhaha@com> on Sunday April 13, 2003 @05:52PM (#5723233) Homepage Journal
    I fully expected that they would store the Office documents as a big blob of binary data, base64 or base95 encoded, in an XML wrapper. Technically that would be a fully standards-compliant XML file, but in practice it would be completely useless.

    So it's not surprising that they haven't made their XML format completely transparent and uniform, but rather it is surprising that they haven't made it completely opaque.
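    A rough sketch of the wrapper that was expected, to show how little it takes. The `ms_word_doc` element name is borrowed from the joke earlier in the thread, the payload bytes are made up, and `wrap`/`unwrap` are hypothetical names; none of this is Microsoft's actual format.

```python
import base64
import xml.etree.ElementTree as ET

# Stand-in for a proprietary binary file; the bytes are made up.
binary_doc = b"\x00\x01\x02\x03 opaque binary payload"


def wrap(blob):
    """Put base64-encoded binary data inside a single XML element."""
    root = ET.Element("ms_word_doc")  # hypothetical element name
    root.text = base64.b64encode(blob).decode("ascii")
    return ET.tostring(root, encoding="unicode")


def unwrap(xml_text):
    """Recover the original bytes from the XML wrapper."""
    return base64.b64decode(ET.fromstring(xml_text).text)


wrapped = wrap(binary_doc)
print(wrapped)
restored = unwrap(wrapped)
```

    The result is well-formed XML, and a reader is no closer to the document than with the raw binary.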

  • by Bob9113 (14996) on Sunday April 13, 2003 @05:57PM (#5723253) Homepage
    Microsoft's Leach emphasized that this change in positioning doesn't negate that "customer-defined XML schema support is a feature of Pro." On the other hand...

    Cool, they've actually appointed a corporate leach. Perhaps that explains why MS Office came out with XML support after it was released in OpenOffice.
  • by di0s (582680)
    Two other features also are similarly restricted: the document protection technology Windows Rights Management Services (RMS), and Excel List, a feature for improving analysis of data lists. Microsoft plans to deliver the three features only in the Enterprise and Professional versions of Office 2003, the company confirmed late Thursday.

    No DRM in the Standard version means no DRM'd documents for the Office version that 99% of people use (and the version that comes with most OEM PCs). So at least Rights Res
  • WordML isn't stripped of the formatting, it is simply very obfuscated XML -- but it will be translatable / transformable as soon as we get our hands on it. That said, however -- I'm still waiting for Microsoft's other shoe to drop -- namely, they'll patent some part of WordML or go after people who reverse engineer it using the DMCA. *Sigh*
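    As a sketch of what "transformable" could look like in practice: the namespace URI is the one from the sample pasted earlier in the thread, but the body layout used here (paragraphs `w:p` holding runs `w:r` holding text `w:t`) is an assumption, since the pasted sample is cut off before the body. `to_plain_text` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Namespace URI taken from the sample pasted earlier in the thread.
W = "http://schemas.microsoft.com/office/word/2003/2/wordml"

# Assumed body layout (not visible in the truncated paste): paragraphs (w:p)
# containing runs (w:r) containing text (w:t), with run properties in w:rPr.
SAMPLE = f"""<w:wordDocument xmlns:w="{W}">
  <w:body>
    <w:p>
      <w:r><w:t>Hello </w:t></w:r>
      <w:r><w:rPr><w:b/></w:rPr><w:t>World</w:t></w:r>
    </w:p>
    <w:p><w:r><w:t>Second paragraph</w:t></w:r></w:p>
  </w:body>
</w:wordDocument>"""


def to_plain_text(xml_text):
    """Return one string per paragraph, dropping all formatting markup."""
    root = ET.fromstring(xml_text)
    paragraphs = []
    for p in root.iter(f"{{{W}}}p"):
        # Join the text of every w:t element inside this paragraph.
        paragraphs.append("".join(t.text or "" for t in p.iter(f"{{{W}}}t")))
    return paragraphs


paragraphs = to_plain_text(SAMPLE)
print(paragraphs)
```

    Getting the text out is the easy half. Mapping the formatting elements onto another format is where the missing documentation would hurt.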
  • There was that glimmer of hope, but I can't say I'm too surprised.

    It's Sunday, meaning it's "Microsoft is evil, and Apple is being good... for now" day. XML itself isn't the holy grail. Without proper documentation, it can be just as nasty to figure out as the binary Word file (depending on how competent the designers of the format were). But a properly documented format, with schema [apple.com], XML or not, can be a really nice thing.
  • by overshoot (39700) on Sunday April 13, 2003 @06:19PM (#5723333)
    I ask because of persistent rumors that "MSXML" is effectively an XML wrapper around a binary blob that only MSOffice components can read.

    Before someone corks off that that wouldn't be legal XML, please note that XML can carry encrypted content. As an existence proof, please note that MS could encrypt parts of the file such that decryption requires an MS key. The result would be perfectly legal XML, and perfectly useless without the MS key.

  • OMG....! (Score:3, Funny)

    by IWannaBeAnAC (653701) on Sunday April 13, 2003 @07:00PM (#5723590)
    What is RMS going to do when he finds out MS has named "Rights Management Services" after him?

    Is it possible this was deliberate?

  • by 26199 (577806) on Sunday April 13, 2003 @07:15PM (#5723682) Homepage

    "By only having this in the Pro version, customers who don't want this aren't paying for it."

    I wonder how much more Microsoft would be forced to charge for Office with XML support? It's truly good of them to try and save us money this way...

  • Maybe next year... (Score:3, Interesting)

    by b3h (663941) on Sunday April 13, 2003 @07:39PM (#5723792)
    This is what happens when you have complete market control. Why innovate when you can duplicate and still rake in hundreds of dollars per copy of the same suite you released last year?
    OpenOffice, the world needs you!
  • Does anyone know for certain if the XML output is going to be published?

    Is the file output in a readable format?

    It would be interesting to see if they allow for reading the file like OpenOffice [openoffice.org] and allow text processing.
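    For reference, this is roughly what text processing on an OpenOffice.org 1.x file looks like. The saved file is a ZIP archive with content.xml, meta.xml and a few other members, so the standard library is enough. This is a sketch only: the archive is built in memory with a trimmed-down content.xml so the example is self-contained, and `make_package` and `extract_paragraphs` are hypothetical names.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Namespace URIs as they appear in OpenOffice.org 1.0 content.xml files.
OFFICE_NS = "http://openoffice.org/2000/office"
TEXT_NS = "http://openoffice.org/2000/text"

# Trimmed-down content.xml; a real one carries font and style declarations too.
CONTENT_XML = f"""<?xml version="1.0" encoding="UTF-8"?>
<office:document-content xmlns:office="{OFFICE_NS}" xmlns:text="{TEXT_NS}">
  <office:body>
    <text:p text:style-name="Standard">Hello World</text:p>
  </office:body>
</office:document-content>"""


def make_package():
    """Build a minimal stand-in for a saved document, in memory."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("content.xml", CONTENT_XML)
        archive.writestr("meta.xml", "<placeholder/>")
    return buffer.getvalue()


def extract_paragraphs(package_bytes):
    """Return the text of every text:p element in content.xml."""
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as archive:
        root = ET.fromstring(archive.read("content.xml"))
    return ["".join(p.itertext()) for p in root.iter(f"{{{TEXT_NS}}}p")]


paragraphs = extract_paragraphs(make_package())
print(paragraphs)
```

    For a real file, passing its bytes to `extract_paragraphs` should work the same way, assuming the document uses the 1.0 namespaces shown above.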

  • now c'mon (Score:2, Insightful)

    by standsolid (619377)
    who else read this and thought "and..."?

    in other news: gravity keeps you on the ground! more at eleven!
  • by unfortunateson (527551) on Sunday April 13, 2003 @10:43PM (#5724869) Journal
    I haven't looked at the XML generated by saving 'normal' docs to XML, but I'm rather impressed by Word's ability to edit XML.

    You need a schema, which is a bit of a pain, but it's at least as friendly as most of the XML editors out there. Plus you can embed all the 'normal' Word formatting content where any CDATA would go.

    I'd like to see a better UI for entering attributes rather than having to right-click the tag -- there's this handy-dandy task pane on the right, why not default to attribute entry there?

    The live validation is pretty good, and the pick-and-choose entities feature is just fine. The best part is that the XML is accessible from VBA, .NET, and anything else that can talk COM/OLE.

    I'm starting to look into their "SmartDocs" SDK, where you can have behaviors appear in that task pane (probably can do the attribute editing there), based on the XML tags. It's an extension of their SmartTag interface, and not the most straightforward interface I've ever seen, because the tag is just a parameter to a generic call, but I think I can make it work.

    I'm less impressed with their XML form editor Infowhatever -- it appears to be limited to usability with certain kinds of schemas (and never DTDs, it seems), more database-like, less document-like. If its forms could be embedded into Word, it would be even nicer.

    FYI, the DTD I'm working with is the International Conference on Harmonisation's [ich.org] Electronic Common Technical Document, which is not a document, but the table of contents for submissions of data to the Food and Drug Administration and regulatory agencies worldwide (OK, only Europe and Japan, with Canada and Australia and others riding the coattails).
  • Sure (Score:3, Informative)

    by Peer (137534) <rene@ n o t f o u n d.nl> on Monday April 14, 2003 @04:17AM (#5726357) Homepage
    "By only having this in the Pro version, customers who don't want this aren't paying for it."

    WOW, I only pay for what I get? What about Xbox, Hotmail etc.? AFAIK they are being paid for by unsuspecting/ignorant Office users.

    Yes, I did purchase an Xbox for the very same reason ;)

