Microsoft Claims OpenDocument is Too Slow 553
SirClicksalot writes "Microsoft claims that the OpenDocument Format (ODF) is too slow for easy use. They cite a study carried out by ZDNet.com that compared OpenOffice.org 2.0 with the XML formats in Microsoft Office 2003. This comes after the international standards body ISO approved ODF earlier this month." From the ZDNet article: "'The use of OpenDocument documents is slower to the point of not really being satisfactory,' Alan Yates, the general manager of Microsoft's information worker strategy, told ZDNet UK on Wednesday. 'The Open XML format is designed for performance. XML is fundamentally slower than binary formats so we have made sure that customers won't notice a big difference in performance.'"
I don't know about the rest of you... (Score:3, Insightful)
INCITS (Score:5, Insightful)
There was much speculation that Microsoft had joined INCITS with the intent of slowing down or stopping the spread of ODF and inserting their own standard. Sounded like another Microsoft power trip to me.
I predict that Microsoft will bitch and bitch about ODF and then release study after study suggesting some other patent laden format (probably Open XML) over ODF. This is just the first complaint against ODF--too slow. Perhaps next they'll complain that it's not documented well enough, some of their apps just can't support it, it gives their developers arthritis, it looks too ugly, etc.
If I was an MS shill. (Score:5, Insightful)
(read the 'study')
But I am sure the shills will pipe up with "easier to use", "people are used to it", "no one forces people to use MS" and other such irrelevance.
It's a fucking WORD PROCESSOR (Score:5, Insightful)
It's not a game loading complex 3D worlds and sound effects, it's a load of text being displayed on screen. What difference does a few milliseconds here or there make? OpenDocument could be ten times slower and the benefits of an open document format would still vastly outweigh the effects of loading time.
Haven't they defended that one themselves? (Score:4, Insightful)
"Any performance limitations now will be resolved as Moore's Law continues"
Not that I like the argument.
Re:I don't know about the rest of you... (Score:4, Insightful)
Re:I don't know about the rest of you... (Score:5, Insightful)
In fact, the study cited doesn't even refer to "the speed of ODF". It's about OO.o's speed only.
Something important to remember (Score:5, Insightful)
I see this as an attempt by Microsoft to slander this format and try to further their own semi-OpenXML format.
--
Jason Faulkner
Eastern US Press Contact
OpenDocument Fellowship
Format Slow? (Score:4, Insightful)
False dichotomies (Score:1, Insightful)
ODT XML files are binary files. So are old Word 2003 .doc files. So are Microsoft's new XML files. So it's pointless to claim that a "binary" file format is faster than an XML file format. Perhaps that MS guy meant to say, "XML-based file formats are slower than non-XML-based file formats." At least this is a coherent claim, even if it's not necessarily correct.
The other big mistake: file formats aren't fast or slow. The algorithms for reading and writing them are (or aren't) slow. Marino Marcich of the ODF Alliance implicitly made this point when he said that different ODF-capable applications perform differently. Perhaps you could, in a fit of brilliant computer science analysis, prove that no reader for a particular file format could parse it as fast as Word 2000 can parse a .doc file, but no one has made that claim.
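To get a feel for why parsing a fixed-layout binary record usually does less work than parsing the equivalent XML, here is a minimal Python sketch. The record layout and tag names are entirely invented for illustration and don't correspond to any real file format:

```python
import struct
import xml.etree.ElementTree as ET

# The same hypothetical formatting record in two encodings:
# a packed binary layout and an XML element.
binary_record = struct.pack("<I16sH", 42, b"Helvetica", 12)
xml_record = b'<run id="42" font="Helvetica" size="12"/>'

# Reading the binary layout is fixed-offset unpacking...
rec_id, raw_font, size = struct.unpack("<I16sH", binary_record)
font = raw_font.rstrip(b"\x00").decode()

# ...while XML needs a tokenizer, a tree builder, and
# string-to-number conversions to recover the same values.
elem = ET.fromstring(xml_record)
assert (rec_id, font, size) == (
    int(elem.get("id")), elem.get("font"), int(elem.get("size")))
```

Both paths recover identical data; the difference is purely in how much work the reader does, which is the commenter's point about algorithms versus formats.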
Re:If I was an MS shill. (Score:5, Insightful)
Although it is completely true that the distinction between application and document format is key, it is quite possible to design a document format with performance in mind versus merely counting on Moore's law to handle performance issues. My observation is that Microsoft has thought through some performance and reliability details to an impressive degree in OpenXML. The files are sorted in the zip file in the order that they are needed for incremental loading. The zip file is stream decompressed so that a lost bit halfway through the file does not prevent decompression of the beginning. Textual data is earlier in the file than bitmap data both because it is needed sooner and also because a truncated file will still have its text and basic formatting intact.
Obviously this Microsoft dude is not making any kind of fine distinctions. But I would love to see a careful analysis of the performance and reliability choices made in OpenDocument versus OpenXML if only so that OpenDocument can copy the best (unpatented) ideas from OpenXML. Microsoft has a lot of experience optimizing the performance of office suites and their file formats. I know from experience that those considerations tend to get lost in the standardization process.
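The entry-ordering idea described above can be sketched with Python's `zipfile` module. The part names below are invented stand-ins, not real OpenXML or ODF package entries; the point is only that a writer controls the on-disk order of zip entries, so a streaming reader sees text parts before bulky media:

```python
import io
import zipfile

# Hypothetical package: write the text/formatting parts before the
# image parts, so a streaming or truncated reader gets them first.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as z:
    z.writestr("content.xml", "<doc>hello</doc>")
    z.writestr("styles.xml", "<styles/>")
    z.writestr("media/image1.png", b"\x89PNG fake image bytes")

# infolist() yields entries in the order they were written, which is
# also the order a sequential reader encounters them on disk.
with zipfile.ZipFile(buf) as z:
    order = [info.filename for info in z.infolist()]

assert order == ["content.xml", "styles.xml", "media/image1.png"]
```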
Re:It's a fucking WORD PROCESSOR (Score:5, Insightful)
Oh wait...it has been done. By Microsoft too, in fact. IE, Mozilla, and Opera are all capable of much more than ODF and at ridiculously high speeds.
If you add to that the fact that the MS version actually has more useless features in it (which add to the parse time), I'd say the claim is an outright lie.
in RAM? (Score:3, Insightful)
It seems you're trying to load an XML document... (Score:5, Insightful)
If a Windows-capable PC has enough oomph to render clippy in 3-D translucent splendor for Vista, then it's certainly fast enough to load an XML document.
Translation (Score:2, Insightful)
Re:I don't know about the rest of you... (Score:5, Insightful)
In other words, if you don't have an ODF application, all you have to do is unzip it (a feature found in most OSes these days) and extract the data by hand.
If you don't have MSFT Word of version x, you can never open MSFT's formats. Patents will prevent third parties from implementing it, defeating the entire point of having a standard.
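The "unzip it and extract the data by hand" claim can be demonstrated in a few lines of Python. For a self-contained sketch, a stand-in for an `.odt` is built in memory first; a real ODF `content.xml` uses ODF namespaces rather than the bare tags shown here:

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Build a minimal stand-in for an .odt package (simplified markup;
# real ODF content.xml is namespaced, but the structure is the same idea).
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as z:
    z.writestr("content.xml", "<body><p>Hello, world.</p></body>")

# "Extracting the data by hand": unzip, then read the XML as text.
with zipfile.ZipFile(buf) as z:
    root = ET.fromstring(z.read("content.xml"))
text = "".join(p.text for p in root.iter("p"))
assert text == "Hello, world."
```

No word processor involved at any step, which is exactly the fallback the comment describes.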
Study = 1 Blogger running one test (Score:3, Insightful)
He had a humongous spreadsheet (a couple hundred megabytes) and was tracking the load time.
He whined about the memory OO takes, and didn't mention that MS Office pre-loads its stuff on startup, so you are loading MS Office stuff whether you need it or not.
Uhmm... (Score:5, Insightful)
OPTIMIZE YOUR CODE!
I know that there are many variables here, but seriously... how slow can it be? I use OpenOffice 2.0 on an Athlon64 3200+ and I have no issues. In fact, I find it much quicker than M$ Office.
Re:It's a fucking WORD PROCESSOR (Score:3, Insightful)
Oh noes! (Score:3, Insightful)
Re:If I was an MS shill. (Score:5, Insightful)
I'm not condoning or defending this particular study (although I have to admit, to me it smacks of "Company rubbishes competitor, talks up own product - film at 11"), I'm just getting a little weary of seeing all the calls of troll, shill and astroturfer levelled at anyone with an opinion that differs from that of the collective.
(And before anyone says it, yes, that goes for both sides, Linux zealots and MS weenies alike)
ODF is faster because... (Score:1, Insightful)
So regardless of the speed of any particular implementation the freedom ODF gives us ensures it always will be able to be opened.
Re:For those two people not in the know... (Score:3, Insightful)
Re:It's a fucking WORD PROCESSOR (Score:2, Insightful)
It's XML. So how can it be a non-bloated format? <gd&r>
Re:If I was an MS shill. (Score:5, Insightful)
A prime example: look at compressed archives, say RAR. Compare a normal RAR archive with a Solid RAR archive. The solid archive treats all the files like one tarball, so the compressor can match data across files and get better compression. Creating it doesn't take much longer than a normal archive, which compresses each file on its own. But when extracting one file from a solid archive, you have to read through every file before the one you want; and when updating, you have to decompress and recompress every file after the changed one, instead of just shifting them over as you would in a normal archive.
Both methods use the same compression algorithms on the same data. One gives you better compression and faster full extraction, but is horrid at random access and updates; a normal archive avoids those side effects but gives up the higher compression ratio and the faster full extraction.
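The solid-versus-per-file trade-off described above is easy to demonstrate with Python's `zlib` (the sample data is invented; any pair of similar files shows the same effect):

```python
import zlib

# Two similar "files", e.g. two revisions of the same document.
file_a = b"The quick brown fox jumps over the lazy dog. " * 50
file_b = file_a.replace(b"lazy", b"sleepy")

# Per-file compression: each file compressed independently,
# like a normal (non-solid) archive.
per_file = len(zlib.compress(file_a)) + len(zlib.compress(file_b))

# "Solid" compression: concatenate first, so the second file can be
# encoded largely as back-references into the first.
solid = len(zlib.compress(file_a + file_b))

assert solid < per_file  # solid wins when the files share data
```

The flip side, as the comment notes, is that pulling `file_b` back out of the solid stream requires decompressing everything before it.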
One thing MS has always been very good at is making MS Word fast. The load times are impressive, and the save times likewise. Forget about the stability for the moment and give them credit for being fast. Now, I know TFA is FUD and stupid, but there might be a legit argument. MS knows how to make .doc files fast; they designed them to be. If ODF wasn't as thoroughly designed for speed, I could see it being an issue for anyone trying to implement it, and with some implementations there really is no way to make it faster.
It is just something to think about. While the article is dumb, the argument could very well be legit. People shouldn't bash it just because it has MS written all over it.
Re:I don't know about the rest of you... (Score:5, Insightful)
How about if someone with a Windows PC at hand compared the speed of opening and saving OpenDocument vs. the usual
I'm sure Microsoft would very much like to shift the debate from OpenDocument vs. Open XML to OpenOffice vs. MS Office. Let's not fool ourselves: MS Office has many advantages.
Re:I don't know about the rest of you... (Score:3, Insightful)
Well, for one thing, if one stored the formatting and type face information on an as-needed basis, while the other stored it on a per-character basis, which would you expect to be quicker to parse?
(Yes, it's a facetious example, but you get the idea)
Oh Right... (Score:5, Insightful)
In fact, until this very day I didn't even realize that performance was even in Microsoft's dictionary, and like so many other words Microsoft uses I don't think it means entirely what they think it means. Newsflash, Microsoft, "innovation" does not mean "steal other people's ideas." "Security" does not mean "It'll be taken over before you can download the first update for it." And "performance" doesn't mean "the entire fucking system stops for 30 seconds when some application decides to stop handling its windows controls." Now STFU [stfu.se] and go back to pushing your poison kool-aid on unsuspecting consumers before Apple eats your lunch.
Re:Uhmm... (Score:3, Insightful)
I moved my office admin machine to OOo during version 1.x, and that thing is a 633MHz P3. These guys are fools throwing poo. Nothing more...
Re:If I was an MS shill. (Score:5, Insightful)
Here are three:
Because we find the slashbots' misinformed, knee jerk, MS bashing tedious?
Because we find that often, their tools are a good solution for our problems?
Because we aren't interested in fighting the Linux Jihad?
Re:I don't know about the rest of you... (Score:3, Insightful)
Bottom line is that ODF is a better format -- it's a cleaner format and superior for archival purposes.
File format performances (Score:5, Insightful)
Personally, I have already seen these kinds of numbers, even though I've never bothered to measure them.
Why? Simply put, because it matters very little.
Compared to Windows 3.11, Windows XP needs 100 times more disk space, 10 times more RAM and 10 times more time to boot.
Compared to MS Word 5.5, MS Word 2003 is slower and bigger.
Today I wouldn't revert back to Windows 3.11 and would not choose Word 5.5. What'd be the most important features expected in a document file format? In my opinion:
1. compactness
2. openness
3. flexibility
No "access performance", though.
Because the time needed to load a document, when you do real office work, weighs by far less than the time you spend on it while working.
And when someone sends you a file written with a different version of the software or even with a different software, how much time do you spend to make that file readable and printable?
Re:I don't know about the rest of you... (Score:5, Insightful)
* they use single-letter tag names, for the most part, to reduce parsing time
* they remove all strings and put them in a look-up table
Thing is, XML was designed to be readable and easy to parse. If you start doing hacks like embedding tons of binary data (OpenXML has images embedded in the XML), using one-letter tags and look-up tables, you've essentially got a bloated binary format.
You can call it XML, and it's technically XML, but it really isn't.
It would be better if Microsoft offered an open binary format, but truly open and patent-free. XML is really heavy compared to efficient binary formats. Compressing the resulting XML makes XML formats on par with binary as to size, but that's just faking it: the program will have to decompress it and parse an XML, which is tons harder than directly parsing binary offsets and bits (for a machine).
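The two claims above (single-letter tags shrink the raw stream, but compression erases much of the size argument either way) can be sketched with invented markup and `zlib`:

```python
import zlib

# The same 200 text runs in verbose markup vs single-letter tags
# plus a font look-up index. Tag names are invented for illustration.
verbose = b'<paragraph><textrun font="Helvetica">Hello</textrun></paragraph>' * 200
terse = b'<p><r f="0">Hello</r></p>' * 200  # "0" indexes a font table

# Terse markup is much smaller before compression...
assert len(terse) < len(verbose)

# ...but once the package is zip-compressed, the repeated verbose tag
# names deflate away too, so the raw-size advantage mostly evaporates.
verbose_zipped = len(zlib.compress(verbose))
terse_zipped = len(zlib.compress(terse))
```

Either way the reader still pays the decompress-then-tokenize cost, which is the parent's point about compressed XML "faking" binary-format efficiency.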
Re:I don't know about the rest of you... (Score:5, Insightful)
This is arguably analogous to Microsoft saying (about a format they can't control, which has been approved by the ISO as their Open XML hasn't yet), "We'd support it but it's too slow."
Well... is it $200 slower? (Score:5, Insightful)
Because "free" still means more to me than an additional 1.7 seconds.
Re:I don't know about the rest of you... (Score:3, Insightful)
The idea with XML is to have a portable format that can be used by various applications/services (web, editors, and XML backend parsers).
The power with XML is that not only does it describe a document - but that it can also be parsed by search engines and meta data can be embedded which all taken together - allow your documentation to also serve as a data source for various applications - (tied to RSS feed perhaps, part of taxonomy based search engine etc..) some of which are only now being developed - and many that are just ideas.
Embedding a proprietary binary format into XML defeats the purpose of this and is, in fact, not 'open' at all.
More FUD from Redmond - why is anyone surprised by this?
Eh? (Score:3, Insightful)
Why the hell does a text editor need to block the UI while writing to disk?
Re:I don't know about the rest of you... (Score:4, Insightful)
XML is a miserable failure on both counts. It may technically be readable, but it is excruciating. Easy to parse, it most certainly is not. About the only thing it has going for it, is that it is an extensible standard.
Re:I don't know about the rest of you... (Score:5, Insightful)
...and it can also be written with any program that can read and write text. Right now, today, I can generate valid OpenDocument files with standard Unix command line tools and simple "print" commands in common scripting languages. While that isn't valuable to the average user, it's extremely handy for those of us who want to generate documents dynamically with as little overhead as possible (example: sending quotes based on form input on a website).
Beyond that, XML is human readable (even if not terribly convenient). I can read well-designed XML documents with any text editor. 100 years from now, I'll still be able to glean the content of OpenDocument files with any program that understands by-then legacy encodings like ASCII. If a binary spec is lost, though, so are the documents written with it.
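The "generate documents dynamically" point can be sketched in Python. This is a deliberately bare-bones package: a fully conformant `.odt` also needs a `META-INF/manifest.xml`, which strict readers may require, and the file name `quote.odt` is just an example:

```python
import zipfile

# Minimal sketch of generating an OpenDocument text file from a script,
# e.g. a quote built from web-form input. Omits the manifest for brevity.
content = """<?xml version="1.0" encoding="UTF-8"?>
<office:document-content
    xmlns:office="urn:oasis:names:tc:opendocument:xmlns:office:1.0"
    xmlns:text="urn:oasis:names:tc:opendocument:xmlns:text:1.0"
    office:version="1.0">
  <office:body><office:text>
    <text:p>Quote generated from web-form input.</text:p>
  </office:text></office:body>
</office:document-content>"""

with zipfile.ZipFile("quote.odt", "w", zipfile.ZIP_DEFLATED) as z:
    # The mimetype entry should come first and be stored uncompressed,
    # so readers can sniff the type without inflating anything.
    z.writestr("mimetype", "application/vnd.oasis.opendocument.text",
               compress_type=zipfile.ZIP_STORED)
    z.writestr("content.xml", content)
```

No office suite, no COM automation, no heavyweight library: just a zip writer and a print-style string, which is exactly the low-overhead generation path the comment describes.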
Re:Oh Right... (Score:3, Insightful)
As some other posters have mentioned, I've seen Windows and Office run quickly on systems where, say, KDE under Linux and OpenOffice wouldn't be nearly that fast. On a P3 in the 400MHz range with plenty of RAM, Office feels just as "snappy" as GVIM.
In fact, some of the things you complain about in your post may at one point have helped performance, even though on modern hardware they hurt it. That doesn't make them things that shouldn't be complained about, but I hardly think Microsoft is a company that can be said not to care about performance. They make their stuff perform very well, in general, on the computers they think it's important to run well on.
Re:If I was an MS shill. (Score:5, Insightful)
Performance optimization should be extremely limited before the product is feature complete and in the hands of at least expert customers, and preferably the real customers. Performance optimization is in tension with programmer friendliness. ODF is zipped ASCII XML with binary embeds (eg: raster graphics) stored in a separate part of the zip - it is really easy to generate documents (I have written a few apps that do it). MS XML is not going to be so easy - inline binary and lookup tables for content. Do you want nicely encapsulated code that can meet the customer's evolving needs without developing bugs (eg: Office's security holes), or do you want a document format that can run on a Pentium 60?
I work for a very large company that has had a number of teams developing code on several different ideologies for the five years that I have been here. I have been able to see up close the long term cost/benefit of teams that write heavily optimized code versus those that write code that is heavy on OO theory at the expense of performance (and versus those that write code that is neither clean nor fast, which is kind of funny/painful to watch). In the long run, there is no competition - the maintainable code wins hands down for anything that has evolving customer needs (which, except for those that have been cancelled, is every project I have seen).
The zip file is stream decompressed so that a lost bit halfway through the file does not prevent decompression of the beginning. Textual data is earlier in the file than bitmap data both because it is needed sooner and also because a truncated file will still have its text and basic formatting intact.
That is the very epitome of inappropriate technical magic put in place by the, "Shouldn't our code handle hypothetical situation X?" people. It makes the code harder to write, understand, and maintain, and it solves a problem that doesn't happen in normal operating conditions. If there's a problem with software or hardware failures during write, do what OOo and MSO already do - keep a backup while the file is open. Once the file is on disk, it is very unlikely to be truncated or bit-flipped unless the drive goes bad (in which case you are going to have a hard time recovering it anyway). If you need your data to withstand drive failures, use an off-disk, off-site, or off-line repository as appropriate.
Re:I don't know about the rest of you... (Score:5, Insightful)
No, the idea that XML-based documents AREN'T "inherently" slow is silly. Of course an XML-based document will be slower than a binary document. XML gives a number of niceties, in the form of maintainability and platform-independence, but it can never be made faster than a well designed binary document. That's just the trade-off.
Deliberate Confusion Between File and App (Score:3, Insightful)
They deliberately confuse the application with the file format.
Psychologically reinforcing the perception that everything in a computer is vertically oriented and "incompatible" unless it comes from our application.
They understand the immense threat that a viable alternative (a file format, in this case) presents. PHB gets the idea: "If this is interoperable, gee, I wonder what else is?"
Beautiful.
Re:If I was an MS shill. (Score:2, Insightful)
Uh... you *did* read my comment, didn't you?
>> I remain morally certain that some people hold legitimate pro-Microsoft opinions, with better or worse justification.
This is in spite of the fact that the author of the article (Bruce Ediger) has amassed a very damning laundry list of documented astroturfing/shilling incidents on the part of Micro$oft. He's pointing out that it's still possible for perfectly sincere people to express positive opinions about Micro$oft.
That's not the problem.
The third time the little shepherd boy cried "wolf!" *he was telling the truth*. Yet no one believed him. His word was no longer trustworthy.
That's the problem.
>> Microsoft, or its public relations company(s), have so muddied the water with all the shilling and astroturfing that a neutral observer cannot determine whether a paid shill produced an arbitrary pro-Microsoft opinion as propaganda, or a random person produced it as his or her own opinion.
The question isn't simply whether or not honest pro-Micro$oft testimony still exists -- it's *whether or not anyone can tell the difference anymore*. Thanks to Micro$oft's long and sordid history of "disinformation" (which includes lying to a judge in court AND GETTING CAUGHT IN THE ACT) many people no longer accept pro-Micro$oft testimony at face value.
> Oh, wait it doesn't. There's nothing better than watching someone post something they think is on topic and insightful, then pointing out how it's neither.
>
> Thanks for the opportunity to do that to you.
A great many people in the tech community view Micro$oft with disgust and suspicion (including yours truly) -- so yes, Your Honor, we plead guilty as charged! Now I pose a rhetorical question to *you*:
Whose fault is it? Is it Micro$oft's, for behaving in such an underhanded, dishonest manner? Or is it our fault, the Micro$oft skeptics, for not disregarding the corporation's history of behavior?