
A National Archive Moves to ODF

Posted by ScuttleMonkey
from the real-time-case-studies dept.
Andy Updegrove writes "The National Archives of Australia (NAA) has announced that it will move its digital archives program to OpenOffice 2.0, an open source implementation of ODF. Unlike Massachusetts or the City of Bristol (which announced it would convert to save on total cost of ownership), the NAA will deal almost exclusively with documents created elsewhere in multiple formats. As a result, it provides a "worst possible case" for testing the practicality of using ODF in a still largely non-ODF world. If successful, the NAA example would therefore demonstrate that the use of ODF is reasonable and feasible in more normal situations, where the percentage of documentation that is created and used internally is much larger."
This discussion has been archived. No new comments can be posted.

  • I'm wondering if this will be the start of the use of Open Source in more business applications. Most companies use M$ Office, since it is mainstream, even with its large cost. Maybe the Government's example will be the beginning of the revolution.
    • I'm wondering how long before the FUD starts vomiting forth from Redmond, and how much longer after that that we start to see mysterious political pressure to try to snuff these plans. Napoleon Gates and Stinky Ballmer will poison the well in every way they can to make sure their monopoly isn't harmed by a competing standard.
      • The problem is that it's turning into whack-a-mole. Eventually, you just can't keep up with the speed that the moles are appearing.

        Even though they have a stack of cash, the change is happening quite quickly, and sending people out to talk to governments, businesses etc around the world costs time and money. For small businesses, it's only worth it so that they don't become poster boys for others. But all efforts so far are not stopping the interest in it.

        I know some non-geeky business guys using OOo. O

    • Re your sig: super as a prefix doesn't mean "very", it means something along the lines of "above" or "beyond", as in "superscript".

      For a more on-topic note, I'm not sure why an office format would be the best thing to use for archives of final documents; why not use something like pdf? Readers are widely available, it will always produce the same results when printed, and it's been around for a while. Plus it's very straightforward to produce a pdf from absolutely any document that can be printed on at le
      • I'm not sure why an office format would be the best thing to use for archives of final documents;

        ODF is an electronic document format, not an "office" format, whatever that means. Its advantage in this context is that any document in ODF can be disassembled into its component parts easily. Text, images and formatting can all be extracted and used separately if needed. PDFs are hard to convert back to the raw data.
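To make the "component parts" point concrete: an ODF file is a ZIP package whose main text lives in a member called content.xml, with images usually stored alongside it. The following is a minimal sketch, not anything the NAA has described. It builds a tiny stand-in package in memory (deliberately incomplete, so it is not a valid .odt) and then pulls the paragraphs and image names back out. The helper names build_sample_odt and extract_parts are made up for illustration.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Namespaces defined by the OpenDocument specification.
OFFICE_NS = "urn:oasis:names:tc:opendocument:xmlns:office:1.0"
TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"


def build_sample_odt() -> bytes:
    """Build a tiny ODF-like package in memory (illustrative, not a complete .odt)."""
    content = (
        f'<office:document-content xmlns:office="{OFFICE_NS}" xmlns:text="{TEXT_NS}">'
        "<office:body><office:text>"
        "<text:p>First paragraph.</text:p>"
        "<text:p>Second <text:span>styled</text:span> paragraph.</text:p>"
        "</office:text></office:body>"
        "</office:document-content>"
    )
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("mimetype", "application/vnd.oasis.opendocument.text")
        zf.writestr("content.xml", content)
        # Placeholder bytes standing in for an embedded image.
        zf.writestr("Pictures/logo.png", b"\x89PNG placeholder bytes")
    return buf.getvalue()


def extract_parts(odf_bytes: bytes):
    """Return (paragraph texts, image member names) from an ODF package."""
    with zipfile.ZipFile(io.BytesIO(odf_bytes)) as zf:
        root = ET.fromstring(zf.read("content.xml"))
        # itertext() flattens nested spans, so formatting is dropped but text is kept.
        paragraphs = ["".join(p.itertext()) for p in root.iter(f"{{{TEXT_NS}}}p")]
        images = [name for name in zf.namelist() if name.startswith("Pictures/")]
    return paragraphs, images


paragraphs, images = extract_parts(build_sample_odt())
print(paragraphs)
print(images)
```

Only the standard library is needed, which is roughly the argument being made: the text survives as plain XML inside an ordinary archive.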

        • What I mean by an "office" format is probably more correctly termed an "intermediate" format: one that's designed to be edited again. Intermediate formats usually aren't guaranteed to look and print exactly the same on all systems. For example, one system might have a different version of fonts installed that would be slightly different in size and mess up the formatting. Or a different program (in ODF's case, say, koffice vs. openoffice) might render a table differently. If the file was well-constructe
          • Of course, you can extract text and images from a pdf

            Yes, but not as straight-forwardly as from a word-processing document. Sometimes the font subsetting makes copying text problematic (uncommon characters come out as a blank when copied). And there is no distinction between line wraps and deliberate line breaks, "real" or soft hyphens, and similar classes of information are obfuscated simply because they're not important to just viewing or printing.

            I'm sure the Archive is looking to allowing useful sea

      • A couple of years ago I went on a tour of the West Australian archives & they said that computer generated documents were their bane.
        They had 150 year old documents going back to wherever but they had trouble reading 25 year old floppy disks in weird formats and converting them to the raw text-only format they used back then.
        If they standardize on an XML based format like the ODF ones and convert all of their old stuff to this it will make archiving the current documents much easier. It may even in a
  • First (Score:1, Interesting)

    by Doytch (950946)
    Is this the first time a national government has switched to odf?
  • As a result, it provides a "worst possible case" for testing the practicality of using ODF in a still largely non-ODF world.

    Wouldn't this sort of test be a more or less good test case for switching to ODF and dealing with non-ODF outside documents? Maybe I just misunderstood the comment.
    • You misunderstood. It is a good test of the "worst possible transition case".
    • by qwijibo (101731)
      That's the point. The real world plethora of formats is the worst case. If ODF can handle the worst case, it would be a testament to the robustness of the format. The worst case test for interchangeable file formats would demonstrate that ODF is viable.
    • by MrPower (687654) on Monday April 03, 2006 @05:08PM (#15053782)

      What I think they meant to convey is that this will be a worst-case scenario they can use for testing the practicality of using ODF in a non-ODF world.

      But I don't actually think so...

      Whereas I think this will be great for ODF, as the NAA will have to produce heaps of conversion software to convert many formats to ODF, but because they are an archiving operation, they won't ever have to convert back. Instead, I imagine that the common document format for outgoing files from the archive will most likely be PDF...

      This scenario won't test the ability for ODF in collaborative work among entities, something that I would see as the worst case scenario needed to test the practicality of using this format.

      Having said all of that - to hell with everyone else - I have been using non-Microsoft formats (first Star Office formats and now ODF) for five years now and rarely come across a problem. Then again, I am a simple user so I wouldn't expect too much grief. From my experience advising other people I can see that the true hurdle is not the file format but the application. Word and Excel are automated from so much business and scientific software that people just expect the results of their query or analysis to be dumped directly into their spreadsheet or word processor. So until Quicken or MYOB support something other than MS software, or until alternative software is produced that does, business will largely use MS.

      On the other hand I strongly recommend to people to use OOo at home, and with the ever increasing compatibility that OOo has with MS formats, this is not a bad option.

      • So until Quicken or MYOB support something other than MS software

        Actually Quicken is available for OS X [intuit.com] and has been for some time. And since OS X is basically BSD, it's a much smaller move to port to Linux when Intuit decides that the time is right to do so.

        So there's nothing in that regard keeping small businesses on Windows, unless they happen to like the extra maintenance.

        On the other hand I strongly recommend to people to use OOo at home and with the ever increasing compatibility that OOo ha

      • I agree that OOo can do most of what the average MS Word/Excel user needs... but the problem that I ran into when I was trying to import a Word document, then export it to PDF through OOo, was that the formatting was not preserved. Tabs, margins, etc. ended up at different locations - which was a real dealbreaker for the document I was working on at the time. Everything had to be in exactly the place I had set it.

        Unfortunately, in that case, OOo didn't cut it. Does anybody know whether this is something

    • It's definitely a good test case. But it's also a good indicator of how ODF will function in a worst-case scenario. This scenario being a bunch of documents in a bunch of different formats all being converted to the target platform (the target being ODF).
  • Bristol, UK? If so, I missed that.
  • by Sir_Jordan (819187) on Monday April 03, 2006 @04:38PM (#15053578)
    Years ago when Novell switched over to Linux operating systems, one of their largest fears was the trouble of integrating their documents in a Microsoft standard-based world. It turns out that Open Office was more than adequate at reading and writing various document standards.
  • OOo is slow because it's still largely implemented using a Java VM-based architecture with bytecode and all that entails. I really think these guys should reconsider. MS is moving toward an XML-based file format which should be open enough for anyone. And MS Office is a client app written completely in optimized Windows assembler code. That should help with performance hemi-dramatically.
    • You must be kidding. OpenOffice.org is almost all C++ code; it's slow to start because it calls too many files on startup. Certain parts of it use Java, like Base and some templates.
    • by LWATCDR (28044) on Monday April 03, 2006 @05:00PM (#15053726) Homepage Journal
      "OOo is slow because it's still largely implemented using a Java VM-based architecture with bytecode and all that entails."
      No it isn't. I just ran OpenOffice Writer V2.0 and checked my task list. No Java was running at all!
      OOo uses Java for some functions but it is not "largely implemented using a Java VM-based" anything.
      http://en.wikipedia.org/wiki/OpenOffice.org#Java_controversy [wikipedia.org] lists OpenOffice's uses of Java.
      OpenOffice is mostly a C++ or C program.
      I have not run a profiler on OOo so I cannot tell you 100% what makes OO slower than Office, but I would guess that part of it is the XML format that OO uses.
      Just from my own experience I have found that you can write a fast XML parser and you can write a "safe" XML parser. But a fast, safe XML parser is very hard.
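For anyone curious what the parsing cost looks like in practice, here is a rough sketch in Python of one common way to keep it down: streaming through the document and discarding elements as you go, instead of building the whole tree first. This says nothing about how OOo actually parses its files. The element name "p", the sample document and the function count_paragraphs are all invented for the example.

```python
import io
import xml.etree.ElementTree as ET


def count_paragraphs(stream, tag="p"):
    """Count elements named `tag` while streaming through the XML."""
    count = 0
    # iterparse yields each element once its closing tag has been read.
    for _event, elem in ET.iterparse(stream, events=("end",)):
        if elem.tag == tag:
            count += 1
            # Drop the element's text and children so memory use stays low.
            elem.clear()
    return count


# A made-up document with 1000 paragraphs, held in memory for the demo.
sample = io.BytesIO(b"<doc>" + b"<p>text</p>" * 1000 + b"</doc>")
total = count_paragraphs(sample)
print(total)
```

Note that this only addresses the "fast" half. The Python documentation itself warns that its XML modules are not hardened against maliciously constructed input, which is the "safe" half of the trade-off.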

      • Yep. Speed is the main advantage of binary formats (since we now compress the textual ones). But I'd like to add that I've been using OOo for a few months now without ever needing to enable the Java functionality (you can disable Java in the configuration window). Almost all of it is written in other languages.

      • The slowness of the XML parser doesn't matter much if you don't save and reload the document with every modification.

        Actually, OOo is so slow because they don't use a widget set. The display is hand-drawn by a bunch of monks in Germany, because the project started before Qt or Gtk.
        • Well, Qt wasn't an option because of the closed nature of Qt for Windows. The next version of Qt should solve that issue. GTK for Windows wasn't mature until very recently. What gets me is when non-programmers start spouting off about how Java is slow, or how this or that feature or bug would be so easy so the programmer is an idiot.
          I have never noticed OO.org was slow except when saving or loading a file.
          I have worked with XML parsers and find them really slow when dealing with a file with a few hundred th
      • Turning off java does speed up Openoffice considerably.

        So does increasing the memory settings.

        However it still takes about 3 or 4 seconds to start up on my desktop. As far as I remember from when I still used Windows this is not all that different from MS Office on XP on similar hardware. Does any one else who has done the same tweaks differ?

        However Abiword or Lyx starts instantly. I mostly use Lyx (which I find more productive) and Gnumeric (faster, with some nice features) rather than OO.
      • OOo spends time converting from .doc format to its own internal format. It's optimised to read odf instead.

        Try saving as OOo format and then reopening. It's just as fast as MSO.

        J.
    • I don't think so. My understanding is that it uses java to enable macros.

      If you turn off java, the entire program is tons faster and all you lose are macros.

    • Like the others have said turn off the java stuff if you don't need it.

      My question for OpenOffice guys, why have this turned on by default?

      Those who would need it can turn it on but I always thought OpenOffice was a dog because of this.

      Maybe a pop-up window or something...
    • Ah. That'd be the...er..."Open Standard" XML format? The one MS lets you write but won't tell you how to read?

      Good idea! Sounds much more open than this silly ODF format....

      Apologies for sarcasm, but even if you're not into the political and social reasons for Open Standards, a closed, pervasive document format is A Bad Thing(TM). And when you get past the poor PR attempts, Office XML is still a closed, soon to be pervasive format. Hence it's A Bad Thing(TM).

    • I was going to join in modding you funny, but I thought I'd try and give you a clue in case you are wondering why mods think your post is funny:

      Funny is what people mod when they would like to mod something "So wrong it's not even... funny".

      You're either a really crap troll/shill or substantially misinformed about OOo.

      Justin.
      PS What does hemi-dramatically mean?
  • Small experience (Score:5, Interesting)

    by Anne Honime (828246) on Monday April 03, 2006 @05:20PM (#15053849)
    Back in the Uni, I was in charge of merging some 20+ articles from various authors into a single document. The target was to give the publisher a uniform document which he would then transform into a book.

    All documents were made with a flavour of Word or another, from word for MacOS 6.0 to the latest (at the time) word XP for windows. As you'd have already guessed, the only word processor able to make sense of all the documents at once was Openoffice.org. Of course, I faced issues (bulleting appearing "funny", for instance), but as I was applying a style I created, that was not a problem as long as the text was there.

    No single version of Word in my possession was able to open all the documents, some documents even crashing Word XP with thunder and lightning.

    • I switched to star office 5, which emachines had bundled with my computer when my copy of office 95 and publisher 97 refused to open the new word 2000/XP files and rather than warez it i gave star office a shot, this was back before OOo even existed so i was very happy with it back then and have used SO / OOo ever since
  • Well Since I just do not know this AT ALL. I work at places where at times (I am a commercial artist) where I have to use MS Office. Most of the time these places have all kinds of macros set up to do given tasks. MANY MANY MANY Macros because of the freelance pool they use they just want the macro's to take care of all heavy lifting so that people don't have to try and figure out how to input data for a week before you get it right. Anyway the question is. How can these be implemented into Open Office
    • Taken (Score:2, Informative)

      by tepples (727027)

      OK why is the little o included in the name? Its just Open Office. OOo is a website that Has OO. I don't get it.

      If this Wikipedia article [wikipedia.org] is to be believed, then the name of the web site, project, and product is "OpenOffice.org" because "OpenOffice" was taken.

    • Re: templates (Score:3, Informative)

      by michaelbuddy (751237)
      I almost thought you were joking about the templates, because what you described is pretty exactly what some people have done. It's called OOextras.

      I don't think they match up to the beauty of (some) MS or Corel templates , but StarOffice has some templates you could steal from I bet. Would those be freely distributable under their license?

      Anyway, http://ooextras.sourceforge.net/ [sourceforge.net]

      that's the
    • Document creation is not the place for data entry.

      I've seen people do it, and often they collect the data, which gets pasted to the word doc, printed and saved.

      Which means that the data can't be analysed or transformed easily, and it's all over the place.

      What you really need is a simple application, which has the functionality to produce a print.

      That said, Macros can be done in OpenOffice.org too. But need some manual conversion.

  • Questions here (Score:5, Informative)

    by countach (534280) on Monday April 03, 2006 @05:49PM (#15054016)
    I wrote the original version of the National Archives software that does the conversion. The current version of the software is available here: http://sourceforge.net/projects/xena [sourceforge.net]

    If anybody wants to ask any questions here I'll try and answer.
    • Every version and variant of OOo I've tried to use to read Word for Mac documents prior to 6.0 fails miserably. This would be trivial except for the fact that Word 6 was received so poorly by the Mac community that most Mac users never switched until the OS X version came out.

      The current versions of Office for OS X can correctly read 5.x files but no open source app I've found so far can. Its file format is different from the Windows version.

      12 years' worth backsupport sounds good until you realize the appl
      • Every version and variant of OOo I've tried to use to read Word for Mac documents prior to 6.0 fails miserably. This would be trivial except for the fact that Word 6 was received so poorly by the Mac community that most Mac users never switched until the OS X version came out.

        I remember that. A lot of computer labs with Macs back then had a site license for MS Word 5.x and tried to force students to use 6 when it came out. But MS Word 6 for Macintosh blew chunks so bad that students were constantly fin

    • > Are you going to do what OOo won't?

      I very much doubt the NAA will do anything that OOo won't. They don't have enough resources.
  • by digipres (877201) on Monday April 03, 2006 @08:48PM (#15054925)

    Our use of the OpenDocument format will be quite important, but it's only one facet of what we do. The Xena software has been developed with a plugin architecture that lets us use various external helpers to 'normalise' or convert to open formats any data objects in our care. For each data object, we use Xena to create a base64 encoded copy so that we can embed some metadata with it, and, separately, to convert it to an open format. Much of the data ends up as XML, while images for example are png or jpg. We're currently investigating open audio formats. Xena is also used to 'present' data objects that it normalises.

    Until now, Xena has made use of OOo 1.1.x for the normalising of office documents into flat XML. Other development priorities have kept the move to OOo2 in the background. I must stress that we have not yet released Xena with OOo2 support, there is more testing to be done and we feel that the release must be accompanied by good user and developer documentation.

    The 'current' binary of Xena available at sourceforge is waaaaay out of date and will shortly be replaced by a much sleeker and more intuitive version. For the curious, anonymous cvs is pretty up to date. If you have a java 1.5 sdk and apache ant, check out a pile of modules and go nuts. Anyone who wishes to become involved in the development effort is more than welcome.

    For anyone else, keep an eye on http://xena.sourceforge.net/ for the upcoming binary release.
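The "base64 encoded copy with embedded metadata" idea can be sketched in a few lines of Python. This is only an illustration of the general technique, not Xena's actual output format: the element names (wrapped-object, metadata, content), the metadata fields and the functions wrap_original and unwrap_original are all assumptions made for the example.

```python
import base64
import xml.etree.ElementTree as ET


def wrap_original(data: bytes, filename: str, mime_type: str) -> str:
    """Wrap raw bytes in a small XML envelope next to some metadata."""
    root = ET.Element("wrapped-object")  # hypothetical element name
    meta = ET.SubElement(root, "metadata")
    ET.SubElement(meta, "filename").text = filename
    ET.SubElement(meta, "mime-type").text = mime_type
    # base64 makes arbitrary binary data safe to carry as XML text.
    payload = ET.SubElement(root, "content", encoding="base64")
    payload.text = base64.b64encode(data).decode("ascii")
    return ET.tostring(root, encoding="unicode")


def unwrap_original(xml_text: str) -> bytes:
    """Recover the exact original bytes from the envelope."""
    root = ET.fromstring(xml_text)
    return base64.b64decode(root.findtext("content"))


original = b"\x00\x01binary payload\xff"
wrapped = wrap_original(original, "report.doc", "application/msword")
restored = unwrap_original(wrapped)
print(restored == original)
```

The point of keeping such a copy alongside the normalised version is that the original bit stream can always be recovered byte for byte, whatever happens to the converted form.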

  • As much as I like the principle of using standards-based formats, I am not 100% sure that ODF is well suited to the archiving business. Even PDF itself is not well suited, hence the existence of the PDF/A standard. PDF/A defines a subset of PDF, leaving out features that put long-term readability of the document at risk: for instance, audio or video content, non-embedded fonts, JavaScript, etc. I would not be surprised if a format as rich as ODF also included such features.

    But at
    • Excellent point. Whatever conversion system is designed, and by the looks of it this has been a design consideration for the NAA, it must be extensible as the present array of standards are probably not in their final form nor the sum total of required formats needed to capture all the stuff that's being generated out there.

      nonetheless, awesome job NAA!!!

         
