Microsoft Claims OpenDocument is Too Slow
SirClicksalot writes "Microsoft claims that the OpenDocument Format (ODF) is too slow for easy use. They cite a study carried out by ZDNet.com that compared OpenOffice.org 2.0 with the XML formats in Microsoft Office 2003. This comes after the international standards body ISO approved ODF earlier this month." From the ZDNet article: "'The use of OpenDocument documents is slower to the point of not really being satisfactory,' Alan Yates, the general manager of Microsoft's information worker strategy, told ZDNet UK on Wednesday. 'The Open XML format is designed for performance. XML is fundamentally slower than binary formats so we have made sure that customers won't notice a big difference in performance.'"
Re:I don't know about the rest of you... (Score:2, Informative)
*cough*
Re:If I was an MS shill. (Score:3, Informative)
When I used OOo I didn't think it was fast but it was nowhere near as slow or as much of a memory hog as this test found.
Re:I don't know about the rest of you... (Score:5, Informative)
* they use single-letter tag names, for the most part, to reduce parsing time
* they remove all strings and put them in a look-up table
I'm not sure how much difference these things actually make in practice, but there's probably a little speed there.
What's not fair is to compare OOo to Microsoft Office, and determine the speed of OpenDocument versus OXML based on that...
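The string look-up table idea from the list above can be sketched roughly like this. It's only an illustration in Python: the function names are made up, and this is not the actual OXML layout, just the general trick of storing each distinct string once and referring to it by index.

```python
def build_shared_strings(cells):
    """Replace each string cell with an index into a de-duplicated table."""
    table = []        # unique strings, in first-seen order
    index_of = {}     # string -> position in table
    encoded = []      # one table index per original cell
    for value in cells:
        if value not in index_of:
            index_of[value] = len(table)
            table.append(value)
        encoded.append(index_of[value])
    return table, encoded


def resolve(table, encoded):
    """Rebuild the original cell values from the table and the indices."""
    return [table[i] for i in encoded]


cells = ["North", "South", "North", "North", "South"]
table, encoded = build_shared_strings(cells)
print(table)    # ['North', 'South']
print(encoded)  # [0, 1, 0, 0, 1]
```

Whether this buys much speed depends on the data: it helps when the same strings repeat a lot, and does little for documents where every string is unique.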
For those two people not in the know... (Score:3, Informative)
All in all - OOo's file formats are a nice and simple solution for exchanging reasonably sized documents (if you don't mind the usual XML-namespace-hell structure), but for editing/working on larger documents/spreadsheets you may find yourself using MSOffice document formats (from within OOo). Pity they don't provide their own "scratch-pad/database-in-a-file" formats.
So - for once, Microsoft is kinda right here.
Re:False dichotomies (Score:4, Informative)
With old-style formats, you knew that the header was 512 bytes, followed by 600 bytes of metadata, followed by the document sections, which all indicate their size (or have some way of calculating it based on the block type).
With XML, you get an opening tag and have to parse until the closing tag, which adds a lot to the complexity of reading.
Writing is slightly different, and should in fact be simpler with XML even though it may be more verbose: you don't need to buffer the entire block or rewrite the section header to indicate the length, you just happily do a sequential write.
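To make the reading side concrete, here is a rough Python sketch of a length-prefixed block layout. The 1-byte type plus 4-byte length header is invented for the example and does not correspond to any real office format; the point is only that a reader can skip a block it doesn't care about without looking at its contents.

```python
import io
import struct

HEADER = struct.Struct("<BI")  # hypothetical header: 1-byte type, 4-byte length


def write_block(out, block_type, payload):
    """Write one block: header first, then the raw payload bytes."""
    out.write(HEADER.pack(block_type, len(payload)))
    out.write(payload)


def find_block(stream, wanted_type):
    """Return the payload of the first block of wanted_type, or None."""
    while True:
        header = stream.read(HEADER.size)
        if len(header) < HEADER.size:
            return None                       # ran out of blocks
        block_type, length = HEADER.unpack(header)
        if block_type == wanted_type:
            return stream.read(length)
        stream.seek(length, io.SEEK_CUR)      # skip the payload unread


buf = io.BytesIO()
write_block(buf, 1, b"meta")
write_block(buf, 2, b"body text")
buf.seek(0)
print(find_block(buf, 2))  # b'body text'
```

Note that the writer has to know each payload's length before it writes the header, which is the buffering cost mentioned above; an XML writer can just emit the closing tag when it gets there.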
Re:I don't know about the rest of you... (Score:2, Informative)
Re:False dichotomies (Score:5, Informative)
ODT XML files are binary files. So are old Word 2003
When people say "binary files" they mean this as opposed to "text files", a separation that stems from the ability to open a file in "binary" or "text" mode in several APIs. It has to do with, among other things, the interpretation of control codes such as ^Z.
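Python's open() is one API that has this split, so a small sketch may help. It shows newline translation rather than ^Z handling (which is specific to some C runtimes), and the file name is just a temp file made up for the demo.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "demo.txt")

# Text mode: each "\n" written is translated to the newline string we ask for.
with open(path, "w", encoding="ascii", newline="\r\n") as f:
    f.write("one\ntwo\n")

# Binary mode: the bytes come back exactly as they are stored on disk.
with open(path, "rb") as f:
    raw = f.read()

# Text mode with default newline handling turns "\r\n" back into "\n".
with open(path, "r", encoding="ascii") as f:
    text = f.read()

print(raw)         # b'one\r\ntwo\r\n'
print(repr(text))  # 'one\ntwo\n'
```

Same file, same bytes on disk; only the interpretation applied by the API differs.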
The other big mistake: file formats aren't fast or slow. The algorithms for reading and writing them are (or aren't) slow.
*slaps cheek* NO WAI!
You fail to see the point of what they're saying. They're saying a binary file, with a header and fixed data structures, is a lot easier to read and parse than an XML file, which consists of structures of variable length, needs to be interpreted, and so on. This is a problem with XML.
Re:MS App Tweaks (Score:3, Informative)
Re:I don't know about the rest of you... (Score:5, Informative)
DOC files don't so much stream as open for random access. They're structured in such a way that the information is stored as an object hierarchy scattered across the file. This makes saving faster because only the changes are saved to the file. It also makes opening faster, because Office only needs to pull up the information that's on the screen at the moment. (Even if it's at the end of the document.) PDFs work in a similar, but more structured, fashion.
The unfortunate fact about ODF is that it requires a complete decoding of the file when loading, and a complete reencoding of the file when saving. However, I don't see any reason why Microsoft can't just add ODF support and make it an optional format. Computers are fast these days, and it should be up to the user to decide whether he needs the performance provided by the MS DOC *cough* "standard".
Or in other words, Microsoft is grasping at straws, trying to find a reason why they shouldn't support opening and saving of ODF files. I feel so sorry for them. (Not.)
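For anyone wondering what "only the changes are saved" looks like in practice, here is a toy Python sketch. It is not the real DOC layout; it just assumes fixed-size 16-byte records (an invented scheme) so that any one record can be read or overwritten by seeking straight to it.

```python
import io
import struct

RECORD = struct.Struct("<16s")  # assumption: every record is exactly 16 bytes


def write_all(stream, values):
    """Write every record once, back to back."""
    for value in values:
        stream.write(RECORD.pack(value))   # short values are zero-padded


def read_record(stream, index):
    """Jump straight to one record without touching the others."""
    stream.seek(index * RECORD.size)
    (raw,) = RECORD.unpack(stream.read(RECORD.size))
    return raw.rstrip(b"\x00")


def update_record(stream, index, value):
    """Overwrite one record in place; the rest of the file is untouched."""
    stream.seek(index * RECORD.size)
    stream.write(RECORD.pack(value))


doc = io.BytesIO()
write_all(doc, [b"intro", b"chapter one", b"appendix"])
update_record(doc, 1, b"chapter 1")
print(read_record(doc, 2))  # b'appendix'
print(read_record(doc, 1))  # b'chapter 1'
```

With a single compressed XML stream there is no equivalent of update_record: changing one paragraph means re-serializing and re-compressing the whole thing.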
The OS's API is unreliable for cross platform... (Score:2, Informative)
OpenOffice has its own fonts and font engine, though it can utilize others. Office uses the OS's font engine but adds fonts to the OS during installation.
OpenOffice has its own engine to place, draw, clip... windows/forms. Office uses the OS's.
OpenOffice has its own database engine, though it can use several others. Office uses Jet, which is part of the OS.
The list goes on...
If the file format was supposed to be tested for performance, then they should have used the two different formats with the same application.
Re:If I was an MS shill. (Score:3, Informative)
MS has a known history of paying people for grassroots positive PR online. That is where we got the term astroturfer from.
So go ahead and defend MS on the merits. Convince me a
Reminds me of 'performance speak' from psych paper (Score:5, Informative)
This issue is about Microsoft defending their turf rather than not wanting to learn something new. But it's basically the same motive at work: find ways to undermine the new to benefit the old.
It goes on, "This model of learning also explains other surprising behavior that I frequently observe. I have seen novices in software development with knowledge of a single programming language explain to experienced expert developers why their choice of programming language was a particularly bad one. In one case, I talked to a student of computer science who told me why a particular programming language was bad. In fact he told me it was so bad that he had moved to a different university in order to avoid courses that used that particular language. When asked, he admitted he had never written a single program in that language. He simply did not know what he was talking about. And he was willing to fight for it. With respect to programming languages, negative opinions about a language that a person does not know, are usually based on very superficial aspects of it. To people obsessed with performance lack of such in a programming language is a favorite reason to advocate its eradication (even though performance is not a quality of a language, but of a particular implementation)."
The positive lesson to take away from this is that MS is undoing itself. It's turning to cheap, nasty, suit-driven mentalities to defend its turf rather than the old days when it would just go out and write something new and nasty. It's become an unwieldy beast. I read about the Vista delays yesterday and briefly thought "Will anyone notice - who uses Windows these days". To an extent it shows what a bubble I live in. But it's true - *all* of my regular contacts use Linux, FreeBSD or Mac OS X. As they should. After all - friends don't let friends use Windows.
Find a good reason for some bad design (Score:2, Informative)
http://www.groklaw.net/article.php?story=20051125
For those who are too lazy to read, here's a brief summary of the main differences between the two formats:
- m$ tags are 2-3 letters long and not readable
- m$ format looks more like a dump of the binary structure, and makes no attempt to separate content and style
The author was already anticipating the size argument for the m$ format, which is nonsense because both formats are compressed anyway and an XML file should be readable. But somehow, he was not expecting the "speed" issue.
Come on. If you want something "efficient", use a binary format. If you start with textual XML + compression, then obviously speed is not your concern. What's your concern then? Readability and processing by third-party tools. In that case separation of content and style is more important. Who cares that "stuff" is written in Helvetica 12 black? I personally prefer to know it's a "title". And so on.
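The "textual XML + compression" combination is easy to try for yourself. A rough Python sketch, assuming a toy document (the content.xml name mirrors what ODF packages use, but the XML here is made up and is not valid ODF):

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# A deliberately repetitive toy document; real documents compress less evenly.
xml_text = "<doc>" + "<p>hello world</p>" * 1000 + "</doc>"

buf = io.BytesIO()
with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_DEFLATED) as zf:
    zf.writestr("content.xml", xml_text)

raw_size = len(xml_text.encode("utf-8"))
packed_size = len(buf.getvalue())

# Reading it back means inflating the member, then parsing all of the XML.
with zipfile.ZipFile(io.BytesIO(buf.getvalue())) as zf:
    root = ET.fromstring(zf.read("content.xml"))

print(raw_size, packed_size, len(root))
```

Verbose tag names mostly vanish in the compressed size, which is why the size argument is weak; the cost that remains is the inflate-and-parse step on every load.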
As for the speed, on today's computers, which are virtually 1000x faster than required for typesetting documents, this is laughable. In addition, for large documents, I know many "word" addicts who split documents into 100-page portions or so, because they become impossible to handle...
What I think about m$ XML is that, well, it's not that bad. Even though not really "open", it's still better than before. But come on. This was done in a rush, to fight back against the OpenDocument initiative. And in that case, simply dumping the "internal binary structure" into an XML document made more sense for them. There's nearly no development cost involved (no research whatsoever) and it could be implemented very quickly.
Then Yates comes and talks about "customer experience" (cf. the ZDNet article). This is laughable.
Regarding "customer experience": when will Word support a real vector image format (no WMF crap please), like, let's say, EPS/PS/PDF? I personally hate having to rasterize my images and make the Word document explode in size (when I'm FORCED to use Word).
2030?
Re:MS App Tweaks (Score:3, Informative)
Hence, me having to buy books like "Undocumented DOS" [amazon.com]
Re:MS App Tweaks (Score:3, Informative)
http://www.newsfactor.com/story.xhtml?story_id=28
Link to a news item about a lawsuit Novell filed in 2004 alleging OS-level sabotage. It does point out that WordPerfect's main problem was lack of a Windows version, but it also alleges Microsoft indulged in some software sabotage.
http://www3.gripe2ed.com/scoop/comments/2005/10/2
An anonymous posting to Ed Foster's Gripelog by someone who claims his wife was a WP beta tester. Mentions the undocumented API issue but does point out it has never been proven sufficiently to allow companies to sue MS for damages. Blames a lot of the troubles with both MS Word and WordPerfect on memory management issues, which is a valid shot.
But the most interesting is this analysis of the MS anti-trust trial written by Ralph Nader (admittedly no friend of any monopolist, but a guy who does his homework): http://www.cptech.org/ms/harm.html [cptech.org]. When you get far enough down in the article, you'll find this quote:
But, as Judge Jackson points out, and as most computer experts know, not all of the quality problems are innocent. In its internal emails and by countless examples, Microsoft has demonstrated that it believes it benefits when consumers cannot make competitor's products work correctly. Microsoft has a range of methods to undermine its competitor's products. When it does not use deliberate sabotage, it can withhold important technical information or refuse to license technology to its competitors, such as when it refused to permit Netscape to distribute a utility to log-on to Internet Service Providers, or when it withholds or unexpectedly changes applications programming interfaces and data file formats.
The reason Novell included intentional sabotage in their suit was because of evidence submitted from the anti-trust trial. Again, there are only indirect references to the practice in the trial evidence, not explicit evidence from the OS code itself, but when has anyone who hasn't signed a non-disclosure agreement really gotten a good look at what's under Windows' hood?
Does it pass the test for "beyond reasonable doubt"? Probably not. However, "preponderance of evidence" only requires 51% certainty. There are quite a few people who will look at the trial evidence and Microsoft's behavior in other areas and pass that 51% mark.
Re:For those two people not in the know... (Score:5, Informative)
TeX consists of long streams of ASCII bytes and offers no random-access abilities whatsoever except those implemented by a text editor and the underlying filesystem. And yet, LyX, which can easily handle thousand-page documents, loads and saves nearly instantaneously.
Your complaint is really over the relative brokenness of two major office suites, not the inherent advantages of their document formats.
Re:Uhmm... (Score:2, Informative)
Re:I seem to remember... (Score:3, Informative)
I seem to remember a rather depressing benchmark with respect to how fast OOo was able to save and re-open a large spreadsheet, and how much memory was required to do so. The results were not pretty, and would have definitely qualified as something that goes into the "must improve asap" category. I use primarily open source apps, but I have to admit that this performance benchmark was a little disappointing. Here's a link to a related ZDNet article: http://blogs.zdnet.com/Ou/?p=119 [zdnet.com]
Re:I don't know about the rest of you... (Score:3, Informative)
Dude, I hate to break it to you, but this is 2006. We've had multi-threaded applications for how many years now? Spin off another thread for the auto-save process. Word already does this.
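A bare-bones version of "spin off another thread for auto-save" might look like the Python sketch below. The AutoSaver class and its callbacks are invented for the example; it says nothing about how Word or OOo actually implement this.

```python
import threading


class AutoSaver:
    """Periodically saves a snapshot of the document on a background thread."""

    def __init__(self, get_snapshot, save, interval_seconds):
        self._get_snapshot = get_snapshot
        self._save = save
        self._interval = interval_seconds
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def _run(self):
        # Event.wait() returns False on timeout and True once stop() is called.
        while not self._stop.wait(self._interval):
            self._save(self._get_snapshot())

    def start(self):
        self._thread.start()

    def stop(self):
        self._stop.set()
        self._thread.join()


document = {"text": "draft"}
saved = []
first_save = threading.Event()


def save(snapshot):
    saved.append(snapshot)   # stand-in for writing the file to disk
    first_save.set()


saver = AutoSaver(lambda: dict(document), save, interval_seconds=0.01)
saver.start()
first_save.wait(timeout=2)   # a real editor would keep handling input here
saver.stop()
print(saved[0])  # {'text': 'draft'}
```

The snapshot is copied before it is handed to the saver, so the user can keep editing while the (possibly slow) serialization runs; a real editor would also need to think about locking while the copy is taken.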
Re:If I was an MS shill. (Score:3, Informative)
Uh... why did the villagers assume that the little shepherd boy was lying the third time he cried "wolf"?
Because by then he had lost credibility with them. It took only two times for that to happen. How many times has Micro$oft deceived the public? Let us count the ways....
http://www.inlumineconsulting.com:8080/website/ms
More to the point, the author closes his article with an answer to your (rhetorical?) question:
> CONCLUSION
> The alert reader cannot believe any pro-Microsoft opinion presented in any forum.
>
> I remain morally certain that some people hold legitimate pro-Microsoft opinions, with better or worse justification. Microsoft, or its public relations company(s), have so muddied the water with all the shilling and astroturfing that a neutral observer cannot determine whether a paid shill produced an arbitrary pro-Microsoft opinion as propaganda, or a random person produced it as his or her own opinion.
The little shepherd boy lost his flock because of his dishonesty. The Micro$oft corporation lost its credibility with the tech-savvy because of its dishonesty. I have sympathy for neither.
Re:I don't know about the rest of you... (Score:3, Informative)
Re:Microsoft's Message is Loud and Clear (Score:5, Informative)
OpenOffice uses ODF. Office uses binary formats. The performance analysis quoted doesn't compare ODF and OpenXML. It states right in the article:
Here is a comparison with the standard 16-sheet SXC and XML sample file I've been using. The sample is in compressed XML format because it is smaller and easier for you to download. You'll have to convert the XML file to XLS and the SXC file to ODS to run the following test yourself.
XLS is a binary format. This study is irrelevant to the statements made. And it's the only data given to substantiate the claims made. So there is no data given at all.
All you can conclude from this is that OpenOffice 2.0, retrofitted recently for ODF, is much slower in a Windows environment than Office 2003 using binary file formats. A far cry from any statements made either by Yates or by the summary.
What a pile of crap journalism.