Tim Bray on Microsoft Office 589
jgeelan writes "The co-inventor of XML, Tim Bray, has been talking about the newly XML-enabled version of Microsoft Office, code-named 'Office 11' and tells XML-Journal that 'when the huge universe of MS Office documents becomes available for processing by any programmer with a Perl script and a bit of intelligence, all sorts of wonderful new things can be invented that you and I can't imagine.'"
Yay Evil Monopoly Of Doom! (Score:3, Interesting)
StarOffice has used XML for their native file formats for some time now; I wonder if this means we'll see an even better-quality translator between the two formats?
Historical turning point? (Score:5, Interesting)
One such small point was when IBM gave out the specs to its PC hardware, allowing everyone to clone it, while Apple did not.
This could be such a point. Maybe in 10 years we'll look back at this and ask ourselves "Why the heck did MS XML-enable their Office app, releasing the hold that they had?"
Only time will tell I guess.
I Play Hattrick [hattrick.org]
XML takes away Microsoft's main advantage (Score:5, Interesting)
However, if Microsoft Office documents become "built around an open, internationalized standard", i.e. XML, would this not enable the people behind OpenOffice, StarOffice etc. to achieve total 100% file compatibility and thus negate Microsoft's largest advantage with Office?
Of course, this could be yet another Microsoft "embrace and extend" tactic, à la Kerberos. Incorporate the standard in a bastardised form, claim standards compatibility, then pollute it so you must be using Microsoft technology to properly interact with it.
HTML from Word (Score:5, Interesting)
To me it looks like classic MS embrace and extend! (Score:2, Interesting)
Seems to me MS is just doing what it always does: they add the possibility to manually export/import a new format.
Geez...What else is not new?
Re:read through "EULA" in the XML? (Score:2, Interesting)
No way. What happens when I receive an MS 11 XML Word document on my Linux system via email? I haven't accepted any sort of EULA, and I can start hacking out the DTD straight away, without which, I must point out, a complex XML document is close to worthless.
They may prevent MS users from reverse engineering the documents on their MS OS's and I suppose they could even forbid users emailing their documents to other OS's (EULA's are great eh?) - but I doubt they will do this, it would cripple Microsoft Office.
Andrew McCall.
Re:Too good to be true (Score:5, Interesting)
Too good to be true? (Score:2, Interesting)
"The important thing," he explains, "is that Word and Excel (and of course the new XDocs thing) can export their data as XML without information loss..."
Does this mean that MSO will have the same support for XML as currently for RTF? In that case I'm not that excited. If the default will be to save as MS-word format, and not XML (or MS-XML as the case may be), then we are no better off. Only Microsoft is, as they are now able to import OpenOffice/StarOffice documents.
It's sort of like when Word could read WordPerfect documents in the old days.
What I heard.... (Score:3, Interesting)
I think maybe it was the CEO of Microsoft Denmark. I'm NOT sure though
Re:Typical XML-proponent mistake (Score:5, Interesting)
As long as you don't get a DTD with extensive comments on how to interpret the elements, along with some promise/guarantee that the DTD won't change every minor release, there is no real improvement at all.
Have you ever tried to reverse engineer a binary file format? And have you ever tried to do the same thing with an XML file format? I learned huge chunks of SVG yesterday _without_ opening an SVG book, just by mucking around in an existing SVG file and with an SVG viewer. Of course, Microsoft could do something clearly in violation of the spirit of XML, by making the whole thing one tag full of base64ed text or something. But as long as they use tags in a semi-sane way (which is the whole point, for integration with corporate systems), XML will be a big step forward.
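As a rough illustration of the "mucking around" approach described above, here is a small Python sketch (standard library only) that tallies which elements an unfamiliar XML file uses. The sample document is made up for illustration; a real file would be loaded with ET.parse(path) instead.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Made-up sample document, used only so the sketch runs on its own.
SAMPLE = """<svg xmlns="http://www.w3.org/2000/svg" width="100" height="100">
  <g>
    <rect x="10" y="10" width="30" height="30"/>
    <rect x="50" y="10" width="30" height="30"/>
    <circle cx="50" cy="70" r="20"/>
  </g>
</svg>"""


def local_name(tag):
    """Strip the '{namespace}' prefix ElementTree puts on qualified tags."""
    return tag.rsplit("}", 1)[-1]


def tag_inventory(xml_text):
    """Count how often each element name occurs in the document."""
    root = ET.fromstring(xml_text)
    return Counter(local_name(el.tag) for el in root.iter())


if __name__ == "__main__":
    for name, count in tag_inventory(SAMPLE).most_common():
        print(name, count)
```

An inventory like this says nothing about what the tags mean, which is the parent's point about needing a documented DTD, but it is a starting point that a hex dump does not give you.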
They already did this for two other products (Score:5, Interesting)
ASP.NET uses XML for all the human-readable files, and the IIS in Windows .NET Server finally uses Apache-style configuration files which are also XML.
Hype! Hype! Hype! (Score:5, Interesting)
The thread a couple of weeks ago about the death of META headers will apply 1000 times worse for semantic tags-- if the semantic web is going to work at all it needs to start from headers describing the webpage as a whole.
(Also, what's with XML-Journal's claim the article has three pages when it only has two?)
What we need is a ISO standard (Score:5, Interesting)
Then governments and corporations will adopt it for official documents so they can read their own documents in ten years.
Re:I doubt it. (Score:5, Interesting)
There are 2 problems with the current format of Microsoft Office files:
This is mostly solved (thanks to years of trial and error).
This is definitely more difficult, as nobody knows Office internals and how they expect such additional data to be. The StarOffice guys managed to do an acceptable job, at the price of years of trial and error. It's like looking at a dump of your computer's memory, guessing what's code, what's data, what's padding and the meaning of every byte...
Now, does an XML format simplify things? Well, yes, just as an RTF text is easier to manage than a pure binary format. But nothing prevents putting extra cruft in an XML document, so instead of having to use a hex editor you may now use a text editor; giving a correct interpretation of tags and attributes is still something that only Microsoft can do, unless it publishes the full specifications (present and future: after all, XML is eXtensible, right?)
Personally, I think that:
Re:I doubt it. (Score:2, Interesting)
http://support.microsoft.com/default.aspx?scid=
(furthermore I'm impressed that a reply like "They'll probably do something evil..." would be rated as "Insightful")
Re:What will be the default save format? (Score:5, Interesting)
Re:Yay Evil Monopoly Of Doom! (Score:3, Interesting)
Re:What we need is a ISO standard (Score:3, Interesting)
This may interest you:
http://www.1dok.org/eng/index.html
Comment removed (Score:5, Interesting)
Re:What we need is a ISO standard (Score:2, Interesting)
It would allow for competition in Linux word processors, without having to worry about file format compatibility problems.
Then someone just needs to create a script which converts MS Office docs en masse (every one inside a directory structure) to this wonderful new format (should be possible thanks to Open Office), and it would be much easier to then switch to OSS.
I personally have no problems with the current open office format, but if they made it human readable, so it can be created from plain text editors if necessary...
Quick somebody suggest it to them
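The batch-conversion script suggested in the comment above could be sketched roughly like this in Python. The directory walk is real; the convert callable is a placeholder assumption standing in for whatever tool (an OpenOffice filter, for instance) actually does the conversion, and the suffix list is just a guess at what counts as an Office file.

```python
from pathlib import Path

# Assumed set of extensions to treat as MS Office documents.
OFFICE_SUFFIXES = {".doc", ".xls", ".ppt"}


def find_office_files(root):
    """Yield every Office-looking file under root, recursing into subdirectories."""
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and path.suffix.lower() in OFFICE_SUFFIXES:
            yield path


def convert_tree(root, convert):
    """Apply convert(path) to each file found.

    convert is a hypothetical callable supplied by the caller; failures are
    collected rather than aborting the whole run.
    """
    converted, failed = [], []
    for path in find_office_files(root):
        try:
            convert(path)
            converted.append(path)
        except Exception as exc:  # keep going and report at the end
            failed.append((path, exc))
    return converted, failed
```

Collecting failures instead of stopping at the first one matters for a bulk run: a handful of odd documents should not block the other few thousand.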
the most wonderful thing... but it's not happening (Score:3, Interesting)
Unfortunately, Microsoft won't let it happen. The data may be "in XML", but that doesn't mean you can read it or generate it well. Instead, Microsoft will give you just enough to serve their business interests and nobody else's.
How? Office will probably stick undocumented base64 encoded binary stuff into the output, containing formatting information. You can use the document content, for example, with a database, but you can't load it into another word processor and preserve all the formatting. And in the other direction, sure, you can generate simple documents that Office will import, but you can't generate arbitrary Word documents--they will, again, have weird, undocumented tags and binary stuff.
In short: don't hold your breath. Microsoft isn't stupid.
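If Office output does turn out to contain opaque encoded payloads as the comment above predicts, spotting them is at least easy. The Python sketch below flags elements whose text looks like a long base64 blob; the element names in the sample are invented and are not taken from any real Office schema, and the check is only a heuristic (the 64-character minimum is an arbitrary threshold).

```python
import base64
import binascii
import xml.etree.ElementTree as ET

# Made-up document: <doc>, <para> and <blob> are hypothetical element names.
SAMPLE = """<doc>
  <para>Quarterly report</para>
  <blob>{payload}</blob>
</doc>""".format(payload=base64.b64encode(bytes(range(256))).decode("ascii"))


def looks_like_base64(text, min_length=64):
    """Heuristic: long text that, with whitespace removed, decodes strictly as base64."""
    compact = "".join(text.split())
    if len(compact) < min_length:
        return False
    try:
        base64.b64decode(compact, validate=True)
    except (binascii.Error, ValueError):
        return False
    return True


def opaque_elements(xml_text):
    """Return the tags of elements whose text content looks like a base64 blob."""
    root = ET.fromstring(xml_text)
    return [el.tag for el in root.iter() if el.text and looks_like_base64(el.text)]


if __name__ == "__main__":
    print(opaque_elements(SAMPLE))
```

Finding the blob is the easy half; knowing what the decoded bytes mean is exactly the undocumented part the comment is worried about.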
Re:What are you all complaining about? (Score:2, Interesting)
"about MONO, we'll see" - go and see then - you only had to click the fscking link that I put there for you. Even a Windows user should be able to manage that.
"all kinds of IP rights" - and you reckon Sun doesn't have those for Java?
Re:There is some documentation of Office XML alrea (Score:2, Interesting)
No, it doesn't (Score:3, Interesting)
XML helps only if the creator of the document wants the information to be easily accessible by programs other than their own.
Bet (Score:2, Interesting)
I like this bet, because either I win money or we all win an open Word file format. The only way I could lose is if we start to argue about the definition of open, which is probably what would happen. I hate that.
Insightful my ass! (Score:1, Interesting)
Your post is just a bunch of paranoid, slashbot FUD. No wonder you got modded up!
Re:MOD PARENT UP (Score:3, Interesting)
Did you ever think that maybe all the things MS has done in the last 24+ months have at their root the exact same motivations as everything MS has done in the past 24+ years? MS has a long and well documented history of showing "increasingly high level of support for interoperability" while at the same time subverting those same open standards so that they will only work with MS Operating Systems. Kerberos? SMB? ASCII text files?! The list goes on...
Did it ever occur to you that, despite what your stockbroker keeps telling you, past performance just might be indicative of future performance?
What part of "Embrace, Extend, Exterminate" do you not understand?
Re:MOD PARENT UP (Score:3, Interesting)
The folks at Microsoft haven't concluded that the Halloween documents were garbage, they are simply under increased pressure from their customers to provide features that are actually useful. Microsoft's biggest problem isn't Linux or StarOffice or any other non-Microsoft product. Microsoft's biggest problem is that people are increasingly happy with the Microsoft software they already own. There are a lot of companies that are perfectly content to keep on using MS Office 2000, and these guys hurt Microsoft's business model just as much as the Linux converts do. So Microsoft has to do something to entice these users to the new versions.
What Microsoft would like to do is simply switch formats like they did between Office 95 and Office 97. That would force everyone to become current. However, that move was viewed very negatively by most of Microsoft's larger customers. A new XML format, that won't be readable by older clients, as a secondary format is as close as Microsoft is likely to get. Throw in the fact that for the first time ever businesses will be able to use the information in these common formats easily and you have an idea that might tempt even some of the more stalwart holdouts that an upgrade is in order.
And Microsoft has to play fair in this case too. The fact of the matter is that StarOffice has XML formats now. If Microsoft gets too heavy handed then corporations will simply jump ship.
In other words, you are absolutely right. The XML formats are going to be great. They have to be, otherwise people will simply continue to use the old format and Microsoft will fail in their attempts to get everyone to upgrade. What the new formats won't have are open schemas (or DTDs). Sun and Corel get to reverse engineer another document filter.
Government Contracts Might be The Reason (Score:4, Interesting)
This, tied to the fact that US sales are going to slow down, or already have, due to the complete inundation of PCs, means they need new markets, and unless they use an open format they won't be able to get them. I'd be panicked: Linux and Java are eroding their server market, and governments are eroding their Office market. The only way they can grow is to add value.
What really is going to happen (Score:3, Interesting)
Some starry-eyed graduate student there is going to stay up all night for a few weeks and try to do it right, and may even be 3733t enough to try non-MicroSoft tools to read the XML to see if they really did it right. Probably all the problems with the format will come from this person being inept. In fact I'm sure that amateur or inept programmers are far more responsible for all the standards breaking from MicroSoft than some evil plan by Bill Gates.
The problem is that this is not going to be the default save-as format. Most likely the ability to change to this format will be buried pretty deep, and once you do it will pop up error boxes that say "some features of your document may be lost". Again this probably won't really be an order from evil overlords to discourage XML. It will be the inept programmer, realizing that they can't figure out how to translate an obscure feature and thinking they better warn the poor user, and too stupid to figure out how to delay the warning until they detect if the document is using the untranslatable feature.
The result is that "Word" files will still be the same as they are now. If you don't believe me, MicroSoft long ago tried to standardize on RTF, with exactly the same fanfare and claims that this would solve the incompatibility problems. Nobody uses RTF now. And try sending an RTF saved by Word to one of the places that insists you send them a Word document. They will not take it.
Word also saves as HTML and plain text and can make a pdf, and despite claims here that they are ugly they are still parsable and adhere enough to standards that you can write code to read them. All of this is totally irrelevant, these are not "Word documents". And this new XML is not going to be a "word document" any more than those are.
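The "delay the warning until the feature is detected" idea from the comment above is simple to express. A minimal Python sketch, with entirely hypothetical feature names, since nothing here is based on how Office actually tracks document features:

```python
# Hypothetical names for features the XML export cannot represent.
UNTRANSLATABLE = {"embedded_macro", "tracked_changes"}


def export_warnings(features_used):
    """Return warnings only for untranslatable features the document actually uses."""
    lost = sorted(set(features_used) & UNTRANSLATABLE)
    return ["Feature %r will be lost in the XML export" % name for name in lost]


if __name__ == "__main__":
    # A plain document triggers no dialog at all.
    print(export_warnings(["bold", "tables"]))
    # Only the one feature that cannot be translated is reported.
    print(export_warnings(["bold", "embedded_macro"]))
```

The design choice is just to check the document before warning, rather than warning unconditionally on every save.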
Genuine XML? (Score:4, Interesting)