The Internet

Pickling Australia's Online Past, Present, Future

stylewagon writes "The Australian has an article about a project undertaken by the National Library of Australia's Electronic Unit. The project is called PANDORA [Preserving and Accessing Networked DOcumentary Resources of Australia]. In a nutshell, they've been archiving important local Web sites at regular intervals since 1997, with the aim of preserving Australia's online history. Everything from old political campaign sites to online journals long gone is there for public viewing." This is a cool project; seems like a handy case of Doing Stuff rather than giving in to Grander Schemes, which must be tempting indeed with the rash of "archive the Web" projects lately. I just wonder how posterity will view the selection process that determines which sites are considered "important" enough to archive. Remind anyone of Foundation?
This discussion has been archived. No new comments can be posted.

  • petrol prices have jumped 20c/l (because of govt excise and taxes)

    Try coming to the UK sometime. We're paying about six times that (again due to taxation -- something like 80% of the price at the pump goes directly to the government). People here can't help but watch in amusement when others around the world complain about petrol prices. Americans are apparently up in arms at the moment over their $1/gallon prices. I can't remember when it last cost that little over here...

  • When a recent high-profile crashed, I spent a week listening to CNN & NYT reporters et al. trying to pronounce "schadenfreude".

    Woomera is easy, in comparison.

  • I knew Australia was sparsely populated, but I always assumed there was more than one Aussie!
  • I wrote an article [] for on this topic over a year ago (24 August 1999).

    A couple of my concerns were their handling of the growing number of dynamic sites on the Internet, and the criteria for inclusion (i.e., musician fan sites and children's entertainment groups were included in the database).

    My primary suggestion was that the Australian government's money could be better spent on providing subsidised hosting for approved and culturally significant Web projects, instead of merely duplicating (at least in the short term) a lot of content.

  • For example, recently Australia had a site along the lines of '' set up by an ex-police officer who was offering a database service of people's criminal records.
    The stink that this raised marks this site as a 'milestone' in Australian net history, but somehow I don't think it will be archived.
  • by kyrky ( 207650 )
    the question "why bother?" comes to mind. what will anyone get out of this?
  • From their guidelines:
    4.3.1 High priority is given to authoritative(12) publications with long term research value(13). The National Library will preserve scholarly online publications (indicated in the case of e journals by peer review) without physical format equivalents, whether or not they are preserved elsewhere.

    It seems that the kind of things they are placing in the collection are things that get archived by default. Most publications that go through any kind of peer review are probably backed up elsewhere.
    But this sounds like a good idea... Library of Congress for the internet. And once we have a cheap, stable, high-density storage medium it might even work.
  • Why Pandora ? As a Greek compilation of Spam sites and MLM scams, then it's a good name, but it's hardly culturally relevant to Australia.

    Shame they didn't choose an Aboriginal name. They tend to make for great project names anyway; no cultural baggage from other languages, plenty to choose from so there's usually a pretty relevant one to be had, and easily pronounced across other languages.

    Are the didgerati an Australian folk band ?

  • I'm starting to wonder where they stand on copyright here. Don't the sites, images, text and design belong to someone? Someone most likely paid for each of them along the way, isn't this, in essence, just ripping them off directly?

    I guess you can just ask for them to be taken off. I wonder if they keep a secret mirror of the mirror.

  • The front page of an IT section of today's Australian newspaper brought the above report of one Aussie e-tailer...

    I wonder what future Australians will think when they discover that Australian e-tailer Harvey-Norman STOPPED TAKING CREDIT CARDS as a payment instrument... at a time when e-commerce was quickly taking market share from traditional points of sale.

    It's an all too common pattern here...

    Why trust technology? Why try to solve a business problem (like misuse of others' credit card details) with technology? "Too hard" (etc.)

    Just limit your would-be clientele to whatever works for -you- ... and make believe they don't -really- want to -use- credit cards!

    In short, send your customers to [overseas] competitors who go the extra kilometer... even when there is some risk involved for the parties.

    It's happening now... as the world begins to turn its eyes to the big Island continent down under, i.e. with the coming of Olympics 2K, too.

    ...which reminds me: the official Olympics 2K web site is said (in the same newspaper) to be in violation of Equal Opportunities Legislation because it is -inaccessible- to people with visual impairments.

  • This is an Australian government project; some of the most clued-up people on the Net regarding metadata, and positively amazing by government standards. I hope they do something neat with metadata mark-up, or RDF.

  • They do have copyrighted material on the site, but they also provide links to the copyright statement for those sites.

    As they do bother about the copyright, I guess they likely only archive those sites that give permission to archive.

  • So at best this is only going to get a small number of pages, and which pages they get will be related to their choices of spidering techniques. It's kind of hard to get a truly random sample of pages from the entire selection of content - you're either forced to use what the search engines have indexed or just attempt to randomly follow links. You'll always miss sites.
    Sounds kinda like life. Sooner or later you have to choose. Whatever you choose, you're going to miss whatever you don't choose. This is no reason to borrow your neighbour's shotgun.

    "Better a good life than a bad marriage"
    - My grandfather on moving in with his old mistress.
    (don't ask me why, but it seems appropriate here.)

  • Is that it will soon be buried in lawsuits. Which is a shame, because it's a brilliant initiative.

    My name is Sue,
    How do you do?
    Now you gonna die!
  • Dr. Louis Leaky MMDLLXIIV will unearth a fossilized monitor with a web page [] burned into its phosphors and people will draw conclusions about the late 20th century. History can be so cruel.

    Vote [] Naked 2000
  • So is the system that coordinates PANDORA called "PANDORA's Box?"
  • I spent a week listening to CNN & NYT reporters et al. trying to pronounce "schadenfreude"

    mwahaha, you win :)
  • Too bad you didn't put the uptime stats in there.

    If you did then I could say "Pandora's box love you long time."

  • Why are people saying that it has been "done before"? What one country does is completely different from what another does. If Sweden has done it and Australia wants to do it now, all the power to them. All the sites out there that list the archived sites have everything, not just Australia. The Internet is a global community without many walls, and most seem to have this mindset when it comes to the sites contained within. Australia, on the other hand, wants to catalogue these sites. I don't want to get into the nitty-gritty of the fine print (as many have), but from a heritage point of view, if they want to do it, then do it. Good for them. At least they can have a piece of their Internet history, as Sweden does now.

    Even the samurai
    have teddy bears,
    and even the teddy bears

  • I think it's important to have archives of web content because, as more of the authoritative versions of so much information become electronic (like phone directories and legal documents), it will become ever more tempting for publishers to "up-date" information to suit present circumstances, as in 1984, the famous George Orwell novel.

    Those who control the past, control the future.

    Alex Pollard

  • The negative side of the story is that such an archive can contain information about individuals which was online at some time, but has since been removed for privacy reasons. This might not be desirable.
  • Pandora has been running for more than a year now already.
  • Hey man, even individuals may need their own archive, for each has his/her own selection criteria.
    Do you know of any web archiving, retrieving & browsing tool set or methodology?

    Don't say that wget is enough! It fails for this purpose:
    1) Many pages change frequently. Archiving means recording both when and what.
    2) Some identical pages change URL.

    Any suggestions?
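
One way to picture the two requirements above, as a rough sketch and not a description of any existing tool: keep each page body under a hash of its content, and keep a per-URL history of (timestamp, hash) pairs. The class name `SnapshotStore` and its methods are hypothetical, and fetching is left out on purpose; the bytes could come from wget or anything else.

```python
import hashlib
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class SnapshotStore:
    """Hypothetical in-memory archive: page bodies keyed by content hash."""
    bodies: dict = field(default_factory=dict)   # sha256 hex -> page bytes
    history: dict = field(default_factory=dict)  # url -> [(timestamp, sha256 hex)]

    def record(self, url, body, when=None):
        """Store one fetch of `url`; return True if the content changed."""
        when = when or datetime.now(timezone.utc)
        digest = hashlib.sha256(body).hexdigest()
        self.bodies.setdefault(digest, body)      # identical pages are stored once
        versions = self.history.setdefault(url, [])
        changed = not versions or versions[-1][1] != digest
        if changed:
            versions.append((when, digest))       # the "when and what"
        return changed

    def aliases(self, url):
        """Other URLs that have served the same content as `url`'s latest version."""
        if url not in self.history:
            return set()
        latest = self.history[url][-1][1]
        return {other for other, versions in self.history.items()
                if other != url and any(d == latest for _, d in versions)}
```

Calling `record()` on every fetch only grows the history when the page actually changed, and `aliases()` shows where the same bytes turned up under a different URL. A real tool would also need persistent storage and some tolerance for trivial differences such as embedded dates, which this sketch ignores.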
  • How much does petrol in the UK cost? At the moment, it's about $1.05 (AU) per litre, which is a big jump up from about 85 cents before the government introduced the new tax system (GST) on July 1st.
  • Then Pandora is their Encyclopedia Galactica, which is just a cover for their plans of World Domination(tm)... Clearly they must be stopped!

    If I recall my psychohistory correctly, then the Galactic Empire^H^H^H^H^H^H^H^H^H^H^H^H^H^H^H US Government will send some big ship with a cocky general over to subjugate them, after they take over New Zealand and all those other tiny islands around them, only to be foiled by a member of the 2nd Foundation, armed with Mentalics... But that's where the similarity would end, because we all know that Australia doesn't have telepaths...

    [Ominous Music]Or do they?[/Music]

    Is anybody else a bit unsettled by this revelation?
  • by Spurious George ( 225993 ) on Monday August 28, 2000 @11:24PM (#820445)
    This seems to be one of the few projects funded by any government that does not merit the name "Pandora"...

    (Hehe... did Pandora's box run Linux?)

    while ( !universe->perfect() ) {
        hack (reality);
    }

  • by w00ly_mammoth ( 205173 ) on Monday August 28, 2000 @11:47PM (#820446)
    The Internet Archive has been doing this for several years now. There was a scientific amer. article on this.

    This is a very good idea, but frankly, I've found the Useless pages to be the best chronicle of the web. Everything, from the ate my balls pages, to the first spam sites, to the first annoying business pages, is listed in its raw, earnest early form.

    As for the aussie site, it suffers from the same disease as any govt. funded site - official seriousness. The most interesting and popular stuff on the net is not the crap put on the web by govt. commissions, but the output of real people. But this is all explained right on Pandora's site:

    "At the beginning of 1996, before the PANDORA Project was formally set up, the Selection Committee on Online Australian Publications (SCOAP) developed selection guidelines"

    The incredibly long and boring selection guidelines reveal that the SCOAP is out of touch with what the net's all about.

    "4.1.1 To be selected for national preservation, a significant proportion of a work should be written by an Australian(11) of recognised authority and constitute a contribution to international knowledge"

    Yeah, that takes a realistic snapshot of what the web is like.

    That says a lot - 4 yrs of committee work, and not much to show for it. Just goes to prove the govt. should stay out of anything to do with the net, including archiving it for historic reasons.


  • I can't believe how many bloody trolls there are in this story :(

    Anyway, what would be nice to see is an archive of the various BBS's that were floating around in the 80s/90s in .au, before the internet killed them off. I guess that they wouldn't take up too much HDD if the old programs were dropped, just have fidonet(etc), local messages and the designs of the BBS (welcome page etc). It would be great to go back and read the older stuff. Similar to the story about early usenet posts from a few days ago.

    How many SysOps trashed their BBSes when they shut down? Would it be feasable (sp?) to mirror them on a free web provider? Just wrap them up with "pre" tags and "href" links instead of the old key-combos that you had to memorize under 2400 baud :)

    How much work do you think would be involved in a project like this? I think that I'd like to try it, just for larks :)
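
For a feel of how little code the "pre" and "href" idea might need, here is a minimal sketch. It assumes a made-up menu convention where a line like `[M] Message areas` offers key M; real BBS captures (ANSI colour codes, other menu styles) would need more work. The function name `screen_to_html` and the `targets` mapping are hypothetical.

```python
import html
import re

# Assumed menu convention: a line such as "[M] Message areas" offers key M.
MENU_ITEM = re.compile(r"^\[(\w)\] (.*)$")


def screen_to_html(screen_text, targets):
    """Wrap one captured BBS screen in <pre>, turning known menu keys into links.

    `targets` maps a menu key (e.g. "M") to the file name of the mirrored screen.
    """
    lines = []
    for line in screen_text.splitlines():
        match = MENU_ITEM.match(line)
        if match and match.group(1) in targets:
            key, label = match.groups()
            href = html.escape(targets[key], quote=True)
            lines.append(f'[<a href="{href}">{key}</a>] {html.escape(label)}')
        else:
            lines.append(html.escape(line))   # keep the layout, neutralise < > &
    return "<pre>\n" + "\n".join(lines) + "\n</pre>"
```

Each captured screen becomes one static page, so a free web host with no scripting would be enough. Menu keys without a mirrored target are left as plain text.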

  • by iktos ( 166530 ) on Tuesday August 29, 2000 @12:00AM (#820448)
    The corresponding Swedish project gathers everything they can, and not only Swedish sites but foreign sites which are about Sweden or by Swedes -- provided they can be found. (This is the same institution which keeps a copy of everything which is printed in more than 20 copies in Sweden.)
    Throwing stuff away because "it's not interesting" is a bad strategy, as we've found out 100 years later. Is there an Australian institution which keeps everything which is printed in Australia for posterity? If not, that might explain why the strategy is like it is.
  • and easily pronounced across other languages

    you gotta be joking? ever heard an American try to speak English properly, let alone Abo?

    yo, yankees... record yourself saying "Wagga Wagga" and send it to Geeks In Space for us to have a laugh at

    I rest my case.
  • These "archive the web" projects are great in principle, and would constitute a valuable historical resource for future researchers, but they do have a major drawback - they are only ever going to cover a small portion of the available online content.

    Even search engines like Google and Lycos, with all of their resources dedicated to searching and indexing new sites, cover less than 10% of available web pages, and the net is growing faster than these sites can keep up. And if search engines cannot keep up with the Internet's growth, how the hell is an archiving project going to? It's just not possible.

    So at best this is only going to get a small number of pages, and which pages they get will be related to their choices of spidering techniques. It's kind of hard to get a truly random sample of pages from the entire selection of content - you're either forced to use what the search engines have indexed or just attempt to randomly follow links. You'll always miss sites.

    Anyway, my point is: nice idea, but not really worth as much as they promise.
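
The "randomly follow links" limitation can be shown on a toy example. This is only an illustrative sketch: the link graph is made up, and `random_walk_sample` is a hypothetical helper, not how any real spider works. A page that nothing links to can never be reached, however long the walk runs.

```python
import random


def random_walk_sample(links, start, steps, seed=0):
    """Follow randomly chosen links from `start`; return the set of pages seen."""
    rng = random.Random(seed)
    page, seen = start, {start}
    for _ in range(steps):
        outgoing = links.get(page, [])
        # Dead end: jump back to the start page, as a simple restart rule.
        page = rng.choice(outgoing) if outgoing else start
        seen.add(page)
    return seen


# Toy link graph (made up): "orphan" links out, but nothing links to it.
web = {
    "home": ["news", "about"],
    "news": ["home", "about"],
    "about": [],
    "orphan": ["home"],
}
sample = random_walk_sample(web, "home", steps=1000)
```

Whatever the seed or the number of steps, `sample` can only contain pages reachable from the start page, so "orphan" is always missed. That is the sampling bias described above, in miniature.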

  • How many SysOps trashed their BBSes when they shut down?

    mine's still going, thank you :)

    although the filebase has been archived to CD and wiped, with the exception of BBS support files, since anything worth d/l'ing can be had off the net (and up to date) without tying up my phone lines. Fidonet and online games are what it's all about anyway

    pity Telescum/Craptus can't offer me a cable/asl service cheap enough to run a telnet board.
  • Now my book reviews [] are in .com instead of .au and hosted in Pennsylvania (with, I'm probably not eligible to be archived by them.

    And the Australian National Library wouldn't issue me an ISSN, because I didn't have formal issue numbers :-( (lots of numbers [])

    Danny []

  • The Swedish Royal Library (which, similarly to the Library of Congress, archives everything published in its country) started a project like this in 1996. They're trying to archive stuff in Swedish, even if it's outside of .se. Swedish policy on registering domain names used to be stupidly restrictive until recently (only stock-holding companies were allowed to register under .se), so many people registered under .com or .nu instead.

    The contents downloaded will be saved as a "cultural heritage", and will most likely be held private by the researchers until copyright has expired. Yes, in the future, people will look at your crummy homepage and see what people thought at that time!

    You can find out more at ml [].

  • The National Archives of Canada attempts to keep an archive of anything that was distributed in more than three copies, including websites, pamphlets, books and music recordings.
  • I assume the National Library of Australia is a copyright library like most other national libraries in the world, so it should receive a copy of every publication of any size produced in Australia.
  • So you think the "eat my balls page" should be preserved for future generations using taxpayers' money, right?

    There's a whole lot of research that doesn't make it into readily available printed versions and is put online so that it is available to anyone interested (and connected).

    Making sure people's hard (intellectual, creative, etc.) work doesn't vanish does indeed create an interesting snapshot of a culture in a given age.

    In fact, one of the more interesting uses of the Internet itself is the publication of academic research. Those annoying gold-rush and vanity sites will always be there, and I cannot imagine them holding much unique thought that stays interesting enough to society at large to be worth archiving.
  • Except that they ask for permission. Read it before making dumb comments like that

An elephant is a mouse with an operating system.