Office 2003 and XML
zachlipton writes "Internet World is reporting that initial reports from Office 2003 beta testers don't look good for those hoping to share documents with non-MS systems using the XML file format. Gary Edwards, the OpenOffice.org representative for the OASIS XML file-format group, is quoted as saying "although it's still early in the review process, it does look as though XP XML has been so seriously crippled as to be useless to anyone but the big content management and collaboration system providers." Apparently, all formatting and presentation information is removed from the XML. Furthermore, Office's new collaboration features will only work with users who are also running Office 2003 (requiring Windows 2000 or 2003) and who are connecting over XP servers." So Microsoft will continue its efforts to lock in users with proprietary formats, and hopefully the rest of the world will produce an XML standard document format without them.
Duh. (Score:3, Insightful)
With the US antitrust suits off now, the EU is our only hope to curb their anticompetitive practices.
Re:Duh. (Score:2, Informative)
You could have the same XML content file output to PDF, RTF, PostScript, and any number of other formats. Separating data from format is one of the strengths of XML. This is much better than a straight binary format or (ugh) RTF. Separating data and format is a good thing.
You don't keep xml
Re:Duh. (Score:5, Funny)
Also, of the comma-delimited file.
Re:Duh. (Score:3, Insightful)
Re:Duh. (Score:5, Insightful)
Have you ever played a game like Civilization or Alpha Centauri? You would be amazed at how much those games make you understand politics. Once you are in the lead, you do anything you can to protect that lead. And why would you expect the real world to be any different?
But this isn't a game, this is business. And since businesses are SUPPOSED to make money, they need to make sure people continue to buy MS Office. And making an office suite that shares documents with all the various third-tier office suites just doesn't do that. Why should my company buy MS Office if the documents it produces are exactly the same as those of FreeBeerOffice? Now, if FBO cannot do things MSO can do, then there is an incentive...
Real World Vs. Game (Score:5, Funny)
Or spy on other people from a God perspective. Damn you! Now I'll have to spend the rest of my day realizing how pathetically small my scope is...
Re:Duh. (Score:3, Insightful)
In the real world, once you are in the lead (say a civilization with advanced sciences and arts, bounty for all, etc) why would you work to keep other civilizations/countries down? You'd work to improve their science, their arts, their h
Duh. (Score:5, Insightful)
This is simply a company who has the dominant product protecting their lead.
For a monopolist, nothing is simply any more. In the absence of market forces to correct misbehavior, exactly how they attempt to protect their lead does matter.
And quite honestly, I don't see anything wrong with that, as long as they confine their practices to their product (i.e. they aren't making Office the only suite that can run on Windows) [emphasis added]
As long as nothing in the Office Suite promotes the Desktop OS monopoly.
As long as nothing in the Desktop OS monopoly promotes their own Office Suite.
But this isn't a game, this is business.
And screwing your customers is bad business.
And screwing your suppliers is bad business.
And screwing your investors is bad business.
And screwing your employees is bad business.
Even screwing your competitors is bad business.
And since businesses are SUPPOSED to make money, they need to make sure people continue to buy MS Office.
And General Motors needs to make sure people continue to buy Chevrolets.
And making an office suite that shares documents with all the various third-tier office suites just doesn't do that.
It just makes incomprehensible gibberish unless the recipient happens to have the exact same sooper-dooper magic decoder ring. Unless I can read my stuff, under circumstances of my own choosing, I have a problem. Unless I can send stuff to my correspondents and they can read it under circumstances of their own choosing, I have a problem. If my documents are hostage to the whims of a supplier, I have a problem.
Why should my company buy MS Office if the documents it produces are exactly the same as those of FreeBeerOffice?
New twist on Clippy?
No reason they should. That's Microsoft's problem, not yours or your company's (unless you work for Microsoft;)
Re:Duh. (Score:3, Insightful)
How is that any different? They do open up their Windows API so people can write software for it, but they don't open up the document format so people can write documents for it. How is closing up Windows so it can run only Office any different from closing up Word so it can open only Office documents?
At some point..... (Score:5, Insightful)
They still don't get that their attempts to "embrace and extend" the whole damn internet aren't going to work.
The rest of the world WILL produce an XML standard document format without them, thank heavens.
Re:At some point..... (Score:5, Insightful)
Re:At some point..... (Score:5, Insightful)
Re:At some point..... (Score:5, Interesting)
Re:At some point..... (Score:4, Interesting)
For those not running Windows, the Word viewer comes "free" with a $199.- (list price) version of Windows, a good sized chunk of your system disk (not that it really matters much given today's HD prices and capacities) and the usual installation hassles, like drivers for equipment which isn't included on the CD etc. Even if you got Windows "free" with your PC from the manufacturer, you just paid the Microsoft tax up front, and will continue to pay if you want to keep your system up to date.
That's like saying the Grappa I got offered after shelling out $150.- for dinner with a date last Saturday was "free". Sure, I didn't pay for it, but you can't get it without buying dinner first.
Yes, I know there are solutions for reading MS Office documents on Linux. But I always cringe when people tell me to use the "free" readers - they're not free in any sense of the word in my book.
Re:At some point..... (Score:3, Informative)
Re:At some point..... (Score:5, Insightful)
Strangely, there doesn't seem to be a Linux version. Or a Mac version, either. It's not so free when I'd have to buy a copy of Windows and spend 2 hours installing it, is it?
Re:At some point..... (Score:5, Interesting)
Re:At some point..... (Score:3, Interesting)
You're supposed to bend over backwards to help and assist your clients, not make them do that for you. Of course if you do business with a holier-than-thou Free Software ethos then yeah I guess you wouldn't see a problem with acting like that. And I'm not saying it would put you out of business either. You'll simply be regarded as a jerk.
Re:At some point..... (Score:3, Insightful)
Not necessarily. What if bending over backwards forces you to spend thousands of dollars more on software, just because one or two clients are unwilling to use the "Save As..." option in their word processor? Would you hire a consultant that charged an extra $10/hour because some of his other clients are too stupid/lazy/arrogant to cooperate with the consultant to get the job done at the lowest cost and l
Re:At some point..... (Score:3, Insightful)
It would be like McDonalds asking all their customers to remove their shoes and socks wh
Re:At some point..... (Score:3, Insightful)
Actually, once Microsoft succeeds in transitioning to the subscription model, then buying Microsoft software will be a regular, on-going cost.
Re:At some point..... (Score:3, Insightful)
And what do you mean a cultural thing? Some folks make too much out of being dignified and prideful. They'll turn any insignificant issue into a matter of principles.
If you want to be principled then choose a REAL issue to make your stand on. Something like human rights, homelessness activism or education reform. But the use of MS Software? Puhlease.
Re:At some point..... (Score:5, Insightful)
I remember problems with AutoCAD back 7 years ago or so, going from release 12 to release 13. 13 was a dog. It had an incompatible file format, forcing upgrades for everyone that shared the same document. Since 13 didn't offer enough incentive for them to reach critical mass, it died with most people sticking with 12 until the next release came out... which solved a lot of problems. Autodesk got a humility pill and realized that forcing the upgrades is bad policy, although you can do things to encourage it (default format save).
The trouble with MSFT's approach is that it breaks too many things at once; you have to get critical mass not only on the office application, but also the operating system and servers. A company that is not poised for this migration will not do it. If a single client requires it, then they will hire a secretary to do a Save As down to a more manageable format. If half the clients require it, it is difficult to avoid the upgrade.
Re:At some point..... (Score:3, Interesting)
Understand that I solely use OpenOffice for my documents when I'm not using vi. I prefer vi over all other types of documenting - it's fast and easy for me. Anyway, the point is - your personal preference doesn't matter. Communicating quickly and efficiently does.
Yes, Microsoft sucks, closed formats suck, etc. The truth of the matter is, you either adapt to what (in this case) your clients use or they don't
Re:At some point..... (Score:5, Insightful)
If your clients tell you to bend over, you bend over? You seem to have a very sad life. Grow some spine, explain things to them, and you'll be surprised about how many of them get it.
And, in case you wonder,
I'm not a student.
I own a business.
And yes, I'm doing rather well even with principles.
Cheers,
Re:At some point..... (Score:3)
Hmm. If you set your company's email server to automatically reject any ".doc" attachments with an appropriate autoresponse, the "clue" process can actually be quite quick and painless.
As a side note, I was quite surprised when my realtor forwarded house information in
Re:At some point..... (Score:3, Interesting)
Hmmmm... Not to be a jinx, but so far the rest of the world has not come up with a replacement for .doc which has superseded the Word format (and of course, that is widely used).
I always thought that you could have a single computer running Word, and whenever someone sent you a Word attachment, it would go through this computer, which would run Windows+Word+Some clever script that would translate the offending file to (say)
The US Navy (Score:3, Informative)
Computerworld [computerworld.com]
snips (with added emphasis)
After studying the issue for a year, the Navy issued its formal usage policy, which Jacobs said provides comprehensive guidance on how to use existing specifications and calls for component reuse whenever possible. It also prohibits the use of proprietary extensions to industry specifications.
"The new policy urges commands to use the specifications from the W3C [World Wide Web Consortium] and other consortiums such as OASIS," said Jacob
Re:At some point..... (Score:5, Informative)
While Microsoft could, theoretically, completely redefine RTF whenever they felt like it, it's so widely used (at least within Microsoft) that it wouldn't be worthwhile.
Microsoft can (and does) add new control words whenever they feel like it, but new control words won't affect existing (non-broken) RTF readers.
Re:Grrrrrr (Score:5, Interesting)
Most people should vote, keep up with current events, be knowledgeable about history, be able to program their VCR, and floss their teeth, too.
But don't expect that they will.
Re:Grrrrrr (Score:3, Funny)
I think this will be the greatest benefit of Microsoft actually achieving Total World Domination. The signal/noise ratio will drop as Msofties will be constrained to their litter box, leaving the rest of the world to clueful people.
Re:At some point..... (Score:3, Interesting)
Separating Content from Presentation a Good Thing (Score:5, Insightful)
Re:Separating Content from Presentation a Good Thi (Score:5, Insightful)
Goldfarb's conjecture (Score:5, Informative)
I think the root of the confusion goes back to Goldfarb's original theory for SGML-- that the styles in a document are secondary to the structures, and should be kept separate.
This has been a religious conviction ever since, despite the fact that most authors are messy and intuitive, and SGML-etc are very, very rigid and unintuitive. The rationalisation is that messy authors can just represent their styles using 'fake' (ad hoc) XML, but if this turns out to be 90% of the real users of MS Office, then I think MS could indeed save valid XML, but it won't be portable in any useful sense.
Re:Separating Content from Presentation a Good Thi (Score:5, Insightful)
There may be other *valid* criticisms of what Microsoft is doing but this isn't one of them.
Re:Separating Content from Presentation a Good Thi (Score:5, Insightful)
Re:Separating Content from Presentation a Good Thi (Score:5, Insightful)
Please, next time try to avoid the condescending tone, people might respond more constructively.
Re:Separating Content from Presentation a Good Thi (Score:3, Interesting)
Yes, good HTML is valid XML.
Unlike your example, which is not even valid XML. But that's beside the point.
Re:Separating Content from Presentation a Good Thi (Score:4, Insightful)
Unfortunately, Manny Manager and Sarah Secretary are now very used to depending on the formatting and presentation information. To be honest, not too many people these days subscribe to the whole minimalist document theory (unless your idea of starting your editor is typing 'vi').
The main point here is to encourage the
Re:Separating Content from Presentation a Good Thi (Score:4, Insightful)
I don't think this means that there is no stylistic information in the document, rather that the style information is contained within the proprietary code segment of the document.
If Word documents all utilised the same style for various elements, it'd all be hunky-dory. However, users like their choice of a 50pt purple serif font for a title to stand, so the formatting information MUST be included with the document.
Perhaps a better format would be a zipped file that contains separate XML and XSL documents...
Re:Separating Content from Presentation a Good Thi (Score:5, Informative)
If Microsoft just puts the raw text data into a
As an example of a good way to do this (IMHO), take a look at how OpenOffice.org builds their files. When you make a
After unzipping this file, the following directory structure was exposed:
content.xml
META-INF/manifest.xml
meta.xml
m
settings.xml
styles.xml
With this type of design, you can get the best of both worlds. Technically, there is a separation between your presentation and content which allows simple programmatic access to the data when necessary. At the same time, this design allows for full collaboration between people who also consider the styling of the data to be part of the content because the style rules for the content are included with the document.
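To make the layout concrete, here is a minimal Python sketch of a zipped package that keeps content and styles in separate members. Everything in it is an assumption for illustration: the XML payloads are heavily simplified stand-ins rather than OpenOffice.org's real schema, and `build_package` and `read_part` are made-up helper names.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified payloads; the real OpenOffice.org files
# use a much richer, namespaced vocabulary.
CONTENT_XML = (
    '<document>'
    '<p style="Title">Quarterly report</p>'
    '<p style="Body">Sales rose.</p>'
    '</document>'
)
STYLES_XML = (
    '<styles>'
    '<style name="Title" font-size="50pt" color="purple"/>'
    '<style name="Body" font-size="12pt" color="black"/>'
    '</styles>'
)


def build_package() -> bytes:
    """Write content and styles as separate members of one zip archive."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as package:
        package.writestr("content.xml", CONTENT_XML)
        package.writestr("styles.xml", STYLES_XML)
    return buffer.getvalue()


def read_part(package_bytes: bytes, member: str) -> ET.Element:
    """Parse a single XML member without touching the others."""
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as package:
        return ET.fromstring(package.read(member))


package_bytes = build_package()

# A data-only consumer reads the text and ignores styling entirely.
content = read_part(package_bytes, "content.xml")
paragraphs = [p.text for p in content.iter("p")]

# A consumer that cares about layout joins in the style rules by name.
styles = {
    s.get("name"): s.attrib
    for s in read_part(package_bytes, "styles.xml").iter("style")
}
title_size = styles[content.find("p").get("style")]["font-size"]
print(paragraphs, title_size)
```

The point of the split is that a script after the raw text never has to parse the style rules, while an editor that wants the purple 50pt title can still find it in the same file.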
With xml-saved Office documents containing only data and no style, collaboration between non-office users (and apparently Win9x users as well) will be no better off than before. Perhaps worse, assuming the binary
If this article is true and Microsoft has decided to remove the styling of their xml-saved office documents, I see two possible reasons for this:
The first is obvious. You're not using Office? Ok, second class citizen, here's the data but in a format that is next to useless for you to use.
The second possibility involves Microsoft just not being where they want to be with the Office XML sharing. Keep in mind that it took OpenOffice.org something like a year and a half or so to define their XML interchange format. Microsoft may be going there, but due to overwhelming inertia, it just might not be going there very quickly.
Personally, I think the first option is the most likely. However, with OpenOffice.org working with OASIS and others on a common XML interchange format, I'm hoping Microsoft will be forced by the marketplace into option 2.
Best regards,
David
Re:Separating Content from Presentation a Good Thi (Score:3, Insightful)
You seem to forget that, in the context of office programs such as Word, the 'content' is the sum of 'text' + 'formatting' + 'presentation'. You need all 3, or you do not have a workable document. Having 'text' only is not enough. We are not talking about being able to read a .doc file on your scrollable cellphone screen here. We are talking about interoperability between all major office suite producers.
Re:Separating Content from Presentation a Good Thi (Score:3, Insightful)
Taking the presentation out of data would be like making PSDs XML but putting the colour in some hidden-away place. You'd have only the useless basics and nothing else.
At least XLink the "presentation layer" you are imagining in a separate resource file... à la XSL or SOMETHING.
What did you expect? (Score:2, Interesting)
Pigs will fly if that day ever comes!
Style Sheets (Score:5, Insightful)
Re:Style Sheets (Score:3, Interesting)
It's unclear from the article whether that leaves the style information intact, and obviously Gary Edwards has an ax to grind, but in the systems I implement, sometimes I can't get users to adopt the use of style sheets, but I can extract the semantic information from stylistic patterns. It's not all that difficult to look at the formatting for a screenplay, for example, and pull out the meta information about what actors appear in what scenes based on the bold outdented bits.
If I can get to t
Re:Style Sheets (Score:3, Informative)
Anyway, Office has a ridiculously complicated format. Any XML that it generates will most likely be a nig
Re:Style Sheets (Score:5, Interesting)
The problem is that they don't include it elsewhere. So in order to share documents in the style intended by the user, it must be saved as the proprietary format.
IMHO, this ensures the user will opt-out of the XML format, and stay with the proprietary format. As I posted above, if Microsoft are going to do this, then they should bundle an XSL document with each XML document.
Save As XML = WordML (Score:5, Informative)
InternetNews is authored by morons.
-malakai
Re:Save As XML = WordML (Score:5, Informative)
<Style ss:ID="s26" ss:Parent="s16">
<Borders>
<Border ss:Position="Bottom" ss:LineStyle="Continuous" ss:Weight="1"/>
<Border ss:Position="Top" ss:LineStyle="Continuous" ss:Weight="1"/>
</Borders>
<Font ss:FontName="Times New Roman" x:Family="Roman" ss:Size="12" ss:Bold="1"/>
<NumberFormat ss:Format="_(* #,##0_);_(* \(#,##0\);_(* &quot;-&quot;??_);_(@_)"/>
</Style>
<Style ss:ID="s27">
<Alignment ss:Vertical="Bottom"/>
<Borders/>
<Font ss:FontName="Geneva"/>
<Interior/>
<NumberFormat/>
<Protection/>
</Style>
<Style ss:ID="s28">
<Font ss:FontName="Geneva" ss:Size="12"/>
<NumberFormat ss:Format="0.0"/>
</Style>
<Stuff in between here to get around Lameness filter>
<Style ss:ID="s27">
<Alignment ss:Vertical="Bottom"/>
<Borders/>
<Font ss:FontName="Geneva"/>
<Interior/>
<NumberFormat/>
<Protection/>
</Style>
<Style ss:ID="s28">
<Font ss:FontName="Geneva" ss:Size="12"/>
<NumberFormat ss:Format="0.0"/>
</Style>
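As a rough illustration of how readable this markup is to off-the-shelf tools, here is a minimal Python sketch that pulls the font settings out of a trimmed copy of the fragment above. The namespace URIs and the wrapping `Styles` element are assumptions (the pasted fragment does not show its declarations), and `font_table` is a made-up helper name.

```python
import xml.etree.ElementTree as ET

# Namespace URIs are assumed from the SpreadsheetML vocabulary; the ss:/x:
# prefixes in the fragment only mean something once they are declared.
SS = "urn:schemas-microsoft-com:office:spreadsheet"
X = "urn:schemas-microsoft-com:office:excel"

FRAGMENT = f"""
<Styles xmlns="{SS}" xmlns:ss="{SS}" xmlns:x="{X}">
  <Style ss:ID="s26" ss:Parent="s16">
    <Font ss:FontName="Times New Roman" x:Family="Roman" ss:Size="12" ss:Bold="1"/>
  </Style>
  <Style ss:ID="s27">
    <Font ss:FontName="Geneva"/>
  </Style>
  <Style ss:ID="s28">
    <Font ss:FontName="Geneva" ss:Size="12"/>
    <NumberFormat ss:Format="0.0"/>
  </Style>
</Styles>
"""


def font_table(xml_text: str) -> dict:
    """Map each style ID to the font attributes declared directly on it."""
    root = ET.fromstring(xml_text)
    table = {}
    for style in root.iter(f"{{{SS}}}Style"):
        font = style.find(f"{{{SS}}}Font")
        table[style.get(f"{{{SS}}}ID")] = {
            "name": font.get(f"{{{SS}}}FontName") if font is not None else None,
            "size": font.get(f"{{{SS}}}Size") if font is not None else None,
            "bold": font is not None and font.get(f"{{{SS}}}Bold") == "1",
        }
    return table


fonts = font_table(FRAGMENT)
print(fonts["s26"])
```

The sketch deliberately ignores `ss:Parent`, so values inherited from a parent style such as `s16` are not resolved here.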
This is a surprise why? (Score:4, Interesting)
IMNSHO, I think that this will backfire eventually. Slowly but surely the world is moving more and more towards open, interoperable standards.
I use Office 2000 and OpenOffice, and I won't be moving to Office XP or later versions anytime soon, if ever. The environment I work in still uses Office 97 (mostly due to budgetary constraints, though they ARE considering a move to XP sadly).
Microsoft is at the point where they will do anything to lock in their current market share, and are trying to make it increasingly hard to move away to anything different. Once you can't share your files with any other application suite, the sheer cost of file conversion alone will keep most people from switching to other alternatives.
Re:This is a surprise why? (Score:2)
Yes. Since when was XML supposed to be used for presentation data? I agree that MS should open up their APIs and document formats, but I'm glad that the Office XML doesn't contain presentation data, because that IS supporting the standard properly.
You're Wrong As Is The Article (Score:5, Informative)
Office 11 supports a significant number of W3C XML standards including SOAP, XML, XPath, XSLT, WSDL, DOM and XSD. Don't take my word for it; read Jon Udell's columns on InfoWorld such as 10 Things You Need To Know About XDocs [infoworld.com] or Exploring Office 2003 [infoworld.com]. I personally was quite stunned and very pleased when I found out that the Office folks were moving from binary formats to XML, which opens the doorway for producing and processing Office documents using off-the-shelf XML tools and technologies.
The only real complaint I saw in the entire article was that some tags related to presentation are stripped out when saving as XML. Jon Udell explained the differences in his blog entry [infoworld.com]. Basically, it looks like the authors of the article want to have their cake and eat it too. They somehow want to preserve all the formatting information in their documents in the XML output yet not end up with a lot of Office-specific content in their documents.
Secondly, one of the primary goals of XML is the separation of presentation from content, meaning that how an XML document is displayed to the user is unimportant (that's what stylesheets are for; just look at the direction XHTML 2.0 [w3.org] is going in) and what is important instead is the data & metadata within the document. In my opinion, this actually allows people to innovate because they are not limited to a single look and feel for their documents but instead can present them in different ways for different audiences and different platforms. This was the major failing of HTML and it is sad to see people try to bring that mentality to the XML world.
Re:You're Wrong As Is The Article (Score:5, Interesting)
The choices then appear to be "data only XML" or "RTF marked up XML". Is this correct?
If so, then I think the critics are correct. The critics wish that the document can be read and manipulated by some non-Microsoft editor. I doubt this is feasible with the WordML format, aka "RTF marked up XML". I'll explain why.
If, as you point out, a WordML document is a collection of data marked up by XML tags that provide only low-level (RTF-like) presentation information, then it is of no use to an alternate editor implementation. An analogy would be to attempt to edit a Word
It sounds to me like it is the same with WordML. You can read and edit WordML because it's valid XML. However, the higher level data model of Word is simply lost. No means is provided for a processor to understand the original structure of the document.
For example, if I should create a "style" in Word and apply that style to a paragraph, the WordML output will tell me what font to use. However, the WordML file tells me nothing about the "style". So I can't tell what other paragraphs are supposed to change in sync if I change the style. I can't know the inheritance of style parameters. In short, I can't programmatically edit the Word data model.
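The difference between a named-style model and flattened formatting can be sketched in a few lines of Python. This is only an illustration of the worry described above, not a claim about what WordML actually emits; the `Style` class, the `resolve` helper, and the style names are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Style:
    name: str
    parent: Optional[str] = None
    font: Optional[str] = None
    size: Optional[int] = None


def resolve(styles: Dict[str, Style], name: str) -> Dict[str, object]:
    """Walk up the parent chain, letting the nearest definition win."""
    resolved: Dict[str, object] = {"font": None, "size": None}
    current: Optional[str] = name
    while current is not None:
        style = styles[current]
        for key in resolved:
            if resolved[key] is None:
                resolved[key] = getattr(style, key)
        current = style.parent
    return resolved


styles = {
    "Normal": Style("Normal", font="Times New Roman", size=12),
    "Heading": Style("Heading", parent="Normal", size=18),
}

# Structured form: paragraphs point at a style by name.
structured = [("Heading", "Overview"), ("Normal", "Body text."), ("Heading", "Details")]

# Flattened form: every paragraph carries only its resolved formatting.
flattened = [(resolve(styles, s), text) for s, text in structured]

# Editing the named style restyles every paragraph that references it...
styles["Heading"].size = 24
restyled = [resolve(styles, s)["size"] for s, _ in structured]
# ...while the flattened copy keeps the old values and has no link back.
stale = [fmt["size"] for fmt, _ in flattened]
print(restyled, stale)
```

With only the flattened form, an alternate editor could still render the document, but it could not tell that the two headings were meant to change together.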
The hope/expectation was that the XML output would provide this information. Thus, it would be possible to essentially re-implement the Word data model and correctly manipulate Word documents. With this hope/expectation in mind, it's clear why what they have actually found is considered crippled. WordML can't be used to recreate any part of the original data model in an alternate editor. It's just data mixed up with low level markup.
I have always thought that this expectation is naive. Microsoft protects the tools they sell by making it infeasible to create alternate implementations. Just because the tool can output XML doesn't mean that you can do without Word.
BTW, I am by no means an expert at any of this. I'm no Office beta tester and I haven't looked at OpenOffice in months.
Re:You're Wrong As Is The Article (Score:3, Interesting)
It would be more reassuring to hear if Word can successfully read back in one of these files and the user can then continue to edit it as before, i.e. nothing is lost. Unless this is true, Microsoft can easily be shown to be forcing everybody to use their proprietary format.
This can be done with embedded proprietary blocks if necessary but in that case it should be possible to easily strip out the proprietary blocks and Word must still read in the file correctly and it still ha
LOL (Score:5, Insightful)
When I first heard that MS Office was moving to an XML based file format, I didn't think "ooh yippy do, we'll be able to share information".
I thought:
<msoffice type="word">
6647AB84B348W837G86438H5D345W34
6647AB84B348W837G86438H5D345W34
6647AB84B348W837G86438H5D345W34
6647AB84B348W837G86438H5D345W34
6647AB84B348W837G86438H5D345W34
</msoffice>
I was right.
Re:LOL (Score:3, Informative)
The XML file format will contain only the actual data of the document. The rest - structure and format - will be contained in a separate proprietary file format.
Now what MS should do, in reality, is use XML + XSL for their xml file format but they aren't. Of course I have no idea how this is supposed to be more useful than saving the file as
Sorry, you are wrong (Score:5, Informative)
Saving as WordML gives OpenOffice the ability to modify XML data and thus "modify" a legit Word document. Word has no problems opening a WordML document I created by hand.
What these moron authors are talking about in this article is when you check off the "Save As Data" checkbox in the Save as XML file dialog. Word then strips formatting and tries to fit the data, according to the chosen XSD (which should have been mapped to the Word elements), into an XML format. You can optionally choose to then have Word run an XSL transform against the resulting data, and save _that_ to the file system.
If these guys can't figure out how to make OpenOffice work with this WIDE FUCKING OPEN FORMAT then I certainly don't want to use their product.
Hell, I'll write them a WordML XSL transform in a day.
-malakai
Re:LOL (Score:5, Funny)
Can we stop and think for a second (Score:3, Insightful)
This is a beta product.
Until it comes this way in a final product, there's no reason to get all excited about it.
Wow. (Score:5, Funny)
Do Better? (Score:3, Interesting)
This is hardly surprising news.
My question, though, is whether it is possible for other vendors and OpenOffice to create a better, more pleasing formatting and presentation of the content in the XML than Office 2003 does?
Missing the point (Score:5, Insightful)
I'm not trying to start a flame war here, but it seems that they're missing the point! We don't want it to be MS with one format and the rest of the world with another. That really wouldn't make it much different from how it is now. At least the way it is now, non-MS office software can read the MS formats. If it comes down to the choice between using the MS format or the "rest of the world" format, MS is going to win every time.
I thought we already had an XML standard for docs? (Score:2)
Personally, I only use a word processor to re-markup things I've written in HTML. That includes my dissertation. HTML isn't super printer friendly, but come on, we're all trying to go paperless anyway, right?
No good to me... (Score:2, Insightful)
Microsoft used to be able to force everyone to upgrade because if you didn't, you wouldn't be able to read documents sent to you by others. I don't think that is going to be so successful now, there's too much resistance
bollocks (Score:5, Insightful)
Instead, create an XML format that is specific to your needs and write a DTD or XML Schema that describes it. If you need to translate it to someone else's XML document format, a quick XSLT stylesheet will transform the document with a minimum of effort.
Just my 2 cents.
Re:bollocks (Score:3, Informative)
An average office worker should NEVER have to deal with XSLT, and probably shouldn't even be messing with XML outside a visual editor that conforms to your DTD, ala programs like XML spy [xmlspy.com].
The point is, if you have to translate to another format, you hire a developer to do it once, and the XSLT stylesheet that he/she develops can be reused again and again to transform documents. Maybe make a drag & drop script to do the transformat
Feature? (Score:2)
Biased? (Score:2)
"The idea is for XML not to specify how the information should be processed, but rather leave that task to XSL (define) templates and other post-XML processing steps," he said. "XML is supposed to be a presentation-neutral format."
There's nothing stopping anyone from making their own collaborative product that works with XML files. MS isn't going to do it, but that doesn't stop an open source solution.
MS .doc / Adobe PostScript & PDF (Score:5, Interesting)
Re:MS .doc / Adobe PostScript & PDF (Score:3, Insightful)
Basically, I'm not forced to use the Adobe product.
I'm sure that Microsoft realises this and would hate to let the users have a choice of what they can use. Why let them choose when they can almost be forced to use the MS product?
I dunno...maybe I've been hanging out on
Wouldn't it make sense? (Score:2)
Haven't looked at Word's automation API recently, but I suspect you could get a lot of the data necessary that way and export accordingly. Perhaps I'm totally off the mark...
Looking at it now, it would probably be pretty cumbersome, but probably manageable. I'm just thinking, it would be nice if Microsoft would do it for us, but I suspect that the applications at some
Part of the concept (Score:5, Insightful)
This article makes it sound as if MS is doing something completely improper with XML (i.e. changing its "standard"). But it seems to me that MS is simply separating content from presentation and relying on ????(something proprietary, xsl, more xml) to provide presentation. Just because they don't use the standard the same way you want them to doesn't mean that they are breaking the standard. I'm sure if you look at the XML that they output it's all standard XML. It also sounds as if they are not using any of the "tricks" that others have complained about (i.e. storing binary data in an xml tag).
Instead of bitching about the problem maybe we should
1) provide feedback if we are a beta tester
2) wait for it to be released
3) ready some tools to provide interoperability
4) work harder on creating tools better than MS
Re:Part of the concept (Score:4, Insightful)
The article is blatantly wrong... (Score:5, Informative)
First off, by default, if you save the Word document as XML, it gets saved as WordML, which preserves Word's styles and formatting in an XML namespace that's separate from the one bound to the schema-controlled data.
If you check off the checkbox "Data Only" then you will lose all formatting and your own XSD will be used to map this document into XML data.
WordML looks like an XML'ified RTF language. It would be trivial to create an XSL stylesheet that transforms WordML into HTML/CSS with all formatting (that HTML is capable of) which directly mimics MS Word. OpenOffice could also eat WordML quite easily and have all the formatting/style of Word.
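To give a feel for what such a transform involves, here is a minimal Python sketch that turns a tiny WordML-like document into HTML. It uses ElementTree rather than XSL because the Python standard library has no XSLT processor, so treat it as a stand-in for the stylesheet described above. The namespace URI and the element names (`w:p`, `w:r`, `w:rPr`, `w:b`, `w:i`, `w:t`) are assumptions to check against the published schema, and the sample document is hand-written.

```python
import xml.etree.ElementTree as ET
from html import escape

# Assumed namespace and element names; verify against the published
# schema before relying on any of this.
W = "http://schemas.microsoft.com/office/word/2003/wordml"

SAMPLE = f"""
<w:wordDocument xmlns:w="{W}">
  <w:body>
    <w:p>
      <w:r><w:rPr><w:b/></w:rPr><w:t>Bold lead-in.</w:t></w:r>
      <w:r><w:t> Plain text &amp; more.</w:t></w:r>
    </w:p>
    <w:p>
      <w:r><w:rPr><w:i/></w:rPr><w:t>Italic note</w:t></w:r>
    </w:p>
  </w:body>
</w:wordDocument>
"""


def run_to_html(run: ET.Element) -> str:
    """Convert one run of text, wrapping it according to its properties."""
    text = escape("".join(t.text or "" for t in run.findall(f"{{{W}}}t")))
    props = run.find(f"{{{W}}}rPr")
    if props is not None:
        if props.find(f"{{{W}}}i") is not None:
            text = f"<i>{text}</i>"
        if props.find(f"{{{W}}}b") is not None:
            text = f"<b>{text}</b>"
    return text


def wordml_to_html(xml_text: str) -> str:
    """Emit one <p> per paragraph element, in document order."""
    root = ET.fromstring(xml_text)
    paragraphs = []
    for para in root.iter(f"{{{W}}}p"):
        runs = "".join(run_to_html(r) for r in para.findall(f"{{{W}}}r"))
        paragraphs.append(f"<p>{runs}</p>")
    return "\n".join(paragraphs)


html_output = wordml_to_html(SAMPLE)
print(html_output)
```

Only bold and italic runs are handled; a real converter would also need fonts, sizes, named styles, tables and lists, which is where most of the work would be.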
What the authors of this article are REALLY bitching at is the fact that MS didn't buy into the OpenOffice Document Specification from OASIS. MS prolly sees OASIS as the US sees the UN. Defunct, not needed.
If you describe your data using XML semantics, and all it takes to convert from semantic style A to B is some XSL, then who cares about forcing everyone to use one specific format.
-malakai
Why should XML be compatible (Score:2)
Although an XML-based MS Word format would make compatibility easier, it doesn't mean that it will be compatible. MS and Mr. Bill Gates make use of many theories developed through the centuries to overcome not only enemies but everybody else. From the Romans to George Orwell.
There's no reason for MS to use a standard format when they own the standard today (MS Office files). And even if they didn't have the standard they would destroy it, just like they tried with Java and J++.
MS still have a lot to learn (and to suffer) until it's ca
Vague article doesn't have the details I want (Score:2, Insightful)
The article is on the other hand very vague (probably because the information still isn't available) about what information is left in. My interest is not so much in being able to read Off
sometimes.. (Score:5, Interesting)
1) Take MS, make a report that says they did something bad, watch how many people flock to bash them DESPITE THE FACTS PRESENTED IN THE ARTICLE, which leads me to:
2) How many people read the article? And of those people who DID, :
3) How many of them know that XML is supposed to be a divorce of data from presentation? Why this comes as a shock to people is obvious - they didn't know that.
The poster above who said "style sheets" - bravo. You couldn't have made a better point with two words.
Re:sometimes.. (Score:3, Insightful)
(1) Post Inflammatory (or sometimes Blatantly Unfactual) Story on Issue X
(2) Get lots of hits from pro and anti-Issue X people
(3) Get lots of hits from people who waste time informing everyone how ignorant the Slashdot editors are
(4) Profit!
Michael and CmdrTaco specialize in these stories. See CmdrTaco's recent post about SuSE "back away from UnitedLinux" to see an excellent example of this.
It really comes down to all they want are page hits. T
This is great news (Score:3, Funny)
Two ways to look at it... (Score:3, Interesting)
On one hand, there are a lot of folks who have very strong opinions about the fact that the data should be separated from presentation... If Office 2003 were to strip the MS-apps-specific formatting (which is probably NOT very standards-friendly), but leave the style markings (heading, paragraph, footnote, etc...) then really, they would be providing a semi-structured document that conformed to XML standards.
As a web application developer/web author, there have been many times when I have been given MS Word docs and Excel spreadsheets as content for our web site... In the past, I have resorted to copying the whole page directly into a text editor (thereby scrubbing all formatting information) and then using HTML markup to make the document look much like the Word original, without having to deal with the rather poor HTML output that Word's and Excel's Save as HTML features produced. If I could have had a semi-structured document, it would have been easier to write some macros to parse the XML structure and automate some of the rough formatting (hooks for stylesheets or somesuch).
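The kind of macro described here might look something like the sketch below. It assumes a purely hypothetical semi-structured format (a `<doc>` of `<para style="...">` elements) and a made-up mapping from style names to HTML tags; nothing in it reflects what Office 2003 actually emits.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical mapping from style names to HTML tags; unknown styles fall back to <p>.
STYLE_TO_TAG = {"Heading 1": "h1", "Heading 2": "h2", "Normal": "p"}

# Hypothetical semi-structured export: style markings kept, app-specific formatting gone.
SAMPLE = """<doc>
  <para style="Heading 1">Quarterly Report</para>
  <para style="Normal">Sales were up.</para>
  <para style="Footnote">Figures are unaudited.</para>
</doc>"""


def to_html(xml_text):
    """Turn each styled paragraph into an HTML element with a CSS class hook."""
    lines = []
    for para in ET.fromstring(xml_text).iter("para"):
        style = para.get("style", "Normal")
        tag = STYLE_TO_TAG.get(style, "p")
        css_class = style.lower().replace(" ", "-")  # e.g. "Heading 1" -> "heading-1"
        text = escape(para.text or "")
        lines.append(f'<{tag} class="{css_class}">{text}</{tag}>')
    return "\n".join(lines)


print(to_html(SAMPLE))
```

The class attribute is the "hook for stylesheets": the site's own CSS decides what a footnote or heading looks like, so none of Word's presentation has to survive the trip.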
On the other hand, it seems to me that it might be in Microsoft's best business interest (the selfish ones) to make darn sure that it's not possible for OpenOffice to fully interoperate with MS Office documents. I don't think they would be very smart (current business model-wise) if their new products (which will rapidly become de facto business standards) helped to enable Open Office standards to take away their market share.
In the final analysis, I probably wouldn't worry too much until there's a critical mass of people using it. By then, a bunch of folks will have figured out what CAN be done with whatever format MS ended up with. At that point, Office 2003's XML format will probably make it possible for people to do something they couldn't do before or at least, to do something easily that once was more trouble than it was worth.
That's worth something...
WordML (Score:5, Informative)
It's only when you Save as XML with the "Data Only" checkbox that you get into stripping formatting (and rightly so). Word WARNS you about this. In addition, you can specify your own XSD to save to. And Word will VALIDATE this for you. Not to mention, you can use a Word tool to map elements of Word documents to elements of your schema. DAMN COOL.
In addition (as if that isn't enough), when you save, in either way, you have the option of specifying an XSL style sheet. It'll go ahead and transform the output for you as part of the save.
The only thing the OpenOffice people are upset about is that MS didn't buy into the OASIS/OpenOffice Document Specification. Tough shit. I'll write them an XSL that'll work against WordML to solve that for them. Lazy bastards.
-malakai
I have Office 2003 and this article is BS (Score:5, Informative)
1. Opened a heavily formatted
2. Saved the document as XML.
3. Opened up the XML document in Word and it looks EXACTLY like the original
I also opened the XML file in a text editor and sure enough it contains complete formatting information.
Re:I have Office 2003 and this article is BS (Score:4, Informative)
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<?mso-application progid="Word.Document"?>
<w:wordDocument xmlns:w="http://schemas.microsoft.com/office/word/2003/2/wordml" xmlns:v="urn:schemas-microsoft-com:vml" xmlns:w10="urn:schemas-microsoft-com:office:word" xmlns:SL="http://schemas.microsoft.com/schemaLibrary/2003/2/core" xmlns:aml="http://schemas.microsoft.com/aml/2001/core" xmlns:wx="http://schemas.microsoft.com/office/word
The authors of the article didn't bother to RTFM.. (Score:5, Informative)
Think of a resume. you could define an XSD for a resume, and be able to save resumes against this XSD, as validated pure XML.
Now, if you want to produce a document, using an XML syntax but want to combine both data and presentation, then you want WordML.
WordML uses Word's own tags to mark up the Word document. I was going to show you an example of WordML but I don't feel like escaping all the greater-than/less-than signs. Anyhow, WordML contains all the formatting and everything necessary to display a Word document as it is supposed to look.
I think this Open Office guy is looking for a devil in Office 11 that isn't there. That or he didn't read the friggin manual.
-Malakai
Wait a minute... (Score:5, Insightful)
That indicates to me that the problem is really that the document format is so complicated that it takes tremendous resources to understand and implement compatibility with it, as this implies that larger companies like say a Xerox will have no problem producing tools to work with it.
So from a business consumer perspective this is still a tremendous win.
This sounds like more whining from the open source crowd.
Proprietary Document Formats (Score:5, Insightful)
Once an easy to use, open document format is created, and the ability to read and write those documents is built into many programs, I think we will see an end of
While there are currently some "open" formats like PDF and PS, the problem is that they are not easy to create for the average user, nor are they easy to edit. While PDF may be a good format, we need something better.
XML is a logical choice as a base for an open format because it is a well defined standard, it is text based, and is quite easy to parse.
But I ramble.
REPEAT AFTER ME: XML IS NOT A FILE FORMAT (Score:5, Interesting)
Internet World is reporting that initial reports from Office 2003 beta testers don't look good for those hoping to share documents with non-MS systems using the XML file format...
That's because XML is not a file format, it is instead a format for file formats. To quote the O'Reilly "Learning XML" book, page 2:
Note that despite its name, XML is not itself a markup language: it's a set of rules for building markup languages.
I've said this many times on /. (look at my history), but the fact that a particular format is XML-based says nothing of your ability to read it. I'm even going beyond the fact that Microsoft could simply stick their traditional file formats into a CDATA and claim XML compliancy.
The statement "If Microsoft used a standard XML format for their documents then anyone could read them" makes as much sense as an equally stupid statement like "If Microsoft just used 8-bit bytes in their file formats then anyone could read them".
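A small sketch of the point, with two made-up documents (both vocabularies are hypothetical). Any XML parser accepts both, but a reader that knows the first vocabulary gets nothing useful from the second, whose real content is an opaque blob tucked into a CDATA section.

```python
import base64
import xml.etree.ElementTree as ET

# A made-up "open" vocabulary: the meaning is visible in the markup.
OPEN_DOC = "<memo><title>Budget</title><body>Spend less.</body></memo>"

# A made-up "XML-compliant" wrapper around opaque bytes (stand-in for a binary format).
blob = base64.b64encode(b"\x00\x01proprietary bytes\xff").decode("ascii")
OPAQUE_DOC = f"<file><payload><![CDATA[{blob}]]></payload></file>"


def readable_title(xml_text):
    """Return the <title> text if this vocabulary has one, else None."""
    root = ET.fromstring(xml_text)  # succeeds for any well-formed XML
    return root.findtext("title")


print(readable_title(OPEN_DOC))    # the memo vocabulary exposes a title
print(readable_title(OPAQUE_DOC))  # well-formed, but no title to be found
```

Both calls parse without error; only the first tells the reader anything. Being XML guarantees the parse, not the understanding.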
Sorry to rant, but the level of cluelessness around XML is astounding. Please read up, there's a ton of useful information on XML around the internet.
MDC
drm ? (Score:3, Informative)
From a pure business standpoint (not technical), a closed proprietary file format is essential if you want consumer lock-in to keep prices sky high. If a competitor can write software that can read your files and format them properly, then you lose your file format monopoly and would have to compete with everyone else.
And we would 'upgrade'.... WHY??? (Score:3, Insightful)
On XML file formats.. (Score:5, Informative)
Uh ... what? (Score:4, Interesting)
Excuse me if I don't take this article seriously, but the author apparently knows nothing about Windows. Office 2003 will only work on Windows 2000 or 2003? Not Windows XP? Maybe he meant that the collaboration servers require Windows 2000 servers or Windows Server 2003 servers, since there is no XP Server. And speaking of XP, what exactly does he mean by "connecting over XP servers"? That's simply impossible -- there is no server version of XP, only Home and Pro.
As for Microsoft not supporting Office on the obsolete Win9x platforms, good for them. It's past time for Win9x to be killed off once and for all. Not supporting it in Office is a good step forward.
Jesus TapDancing Christ (Score:3, Insightful)
Religious zeal about XML content being separated doesn't MEAN SHIT to users. And it doesn't get me anywhere when the fact remains that when I send in my business proposals to the government, they want it in Word-97
wankers. However you want to make an open format - be our (the Joe Salesdepartment) guest... until there is something which is universal (.doc and
Re:Windows 2003? Where's that? (Score:2, Informative)
Windows 2003 [microsoft.com]
Re:More goofy formats (Score:2)
Re:This is a question (Score:5, Informative)
http://www.usdoj.gov/atr/cases/f200400/200457.htm [usdoj.gov]
J. No provision of this Final Judgment shall: Require Microsoft to document, disclose or license to third parties: (a) portions of APIs or Documentation or portions or layers of Communications Protocols the disclosure of which would compromise the security of a particular installation or group of installations of anti-piracy, anti-virus, software licensing, digital rights management, encryption or authentication systems, including without limitation, keys, authorization tokens or enforcement criteria; or (b) any API, interface or other information related to any Microsoft product if lawfully directed not to do so by a governmental agency of competent jurisdiction.
Microsoft is not locking up your information; they are simply stripping out the formatting and layout in the XML file format.
Re:is it possible... (Score:3, Interesting)
If there is no documentation then any reverse engineering of the file format would be at least a violation of the EULA.
In the worst case, since reverse engineering the format would allow a person access to a copy protected data set, this would be a violation of the DMCA.
Did any of us really think that B.G. hadn't thought this whole thing out years
Re:Forget XML, doc, and other crap (Score:3, Informative)
Wow. Someone on /. suggesting we use an MS file format.
For those of you that aren't aware, RTF is an 'Open' format created by MS. All native Word files I've looked at ('97 and earlier) used an RTF-derived format. The RTF spec is available from Microsoft, and is the most obfuscated document I have ever had the misfortune of having to read (in the end I gave up and derived the format from the output of WordPad; it was easier).
Re:Microsoft's new file format is: (Score:5, Funny)
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<!DOCTYPE document [
<!ELEMENT document (document_properties, document_section)>
<!ELEMENT document_properties (title, author, organization, department, job, generalsummary)>
<!ELEMENT title (#PCDATA)>
<!ELEMENT author (#PCDATA)>
<!ELEMENT organization (#PCDATA)>
<!ELEMENT department (#PCDATA)>
<!ELEMENT job (#PCDATA)>
<!ELEMENT generalsummary (#PCDATA)>
<!ELEMENT document_section (sectionsummary, proprietarybinary, unenhancedcrappytext)>
<!ELEMENT sectionsummary (#PCDATA)>
<!ELEMENT proprietarybinary (#PCDATA)>
<!ELEMENT unenhancedcrappytext (#PCDATA)>
]>
<document>
<document_properties>
<title>Crappydoc</title>
<author>William H. Gates III</author>
<organization>BORG</organization>
<department>Unimatrix 0</department>
<job>Secondary information processing adjunct</job>
<generalsummary>Doc about crappy M$ things.</generalsummary>
</document_properties>
<document_section>
<sectionsummary>Haha, you cant parse this and make it look perty, it's BINARY! You're still screwed!</sectionsummary>
<proprietarybinary>firoiorfioeiojvonvonviniooiwnc
<unenhancedcrappytext>Hehe, doesnt this text just look ugly? I bet it does, if you arent using M$ WORD!</unenhancedcrappytext>
</document_section>
</document>