Announcements The Internet

British Library to Archive Electronic Resources

An anonymous reader writes "The British Library is a government-owned library that legally has to hold a copy of every book, pamphlet, map, journal, newspaper and piece of sheet music published in the UK. Today, that law changed and now the Library will be able to collect non-paper resources, such as websites, electronic journals, CD-ROMs and microfilms. Obviously, the library won't be archiving everything in these categories (for a start, the Wayback Machine already does a pretty good job of the websites), but will be keeping resources of national, historical or academic interest. There's more specific information in The British Library's press release. BBC News (which will now be archived by the Library) has an article on the changes."

  • by TedCheshireAcad ( 311748 ) <ted&fc,rit,edu> on Saturday November 01, 2003 @11:51AM (#7366766) Homepage
    so.... dmca vs british govt?

    i got 20 bucks on the brits.
    • Maybe it's the EU against the DMCA. Brits are EU members, so I guess that their laws follow EU guidelines and laws.

      This comment Copyright 2003 The Queen of Britain. If you steal this shit, she'll fucking cut you.

      • Re:funny face off (Score:2, Interesting)

        by TomV ( 138637 )
        A very similar requirement benefits the Library of Congress in the USA, under the name "Mandatory Deposit" (here [copyright.gov] are the rules).
        • I wonder how many times the mandatory deposit requirement has been enforced? Supposedly every item copyrighted in the US since 1978 should be on deposit at the LOC, and available to member libraries on request. How much money is allocated to storing and preserving these works?
          • Well, it's not 100%, certainly. Though given that providing a Mandatory Deposit copy if requested is a condition of Copyright Registration, I'd be surprised if a publisher were to refuse a request.

            From the 2001 Annual Report [loc.gov], the Copyright Office had a budget of $40,896,000 in fiscal Year 2001, though I'm not clear on whether this includes funding for the storage of earlier MD items. Total LOC funding for 2001 was $572M, plus $119M in gifts.

            With the volume of published material increasing exponentially
      • Perhaps you mean the Queen of England.

    • The archive will comprise selective "harvesting" from the 2.9 million sites that have "co.uk" suffixes.

      If a site is using a .co.uk address (or another .uk address, .uk being the TLD for the UK) then it's a reasonable assumption that its content is both British in origin and intended primarily for a British audience.

      The potential for overlap with content covered by the DMCA seems negligible but even if there was such an overlap I fail to see how keeping a copy of a web page (and not the files that it may link to) would
  • by k98sven ( 324383 ) on Saturday November 01, 2003 @11:55AM (#7366782) Journal
    The Swedish Royal Library [www.kb.se], which also stores everything published in Sweden (since 1640), has been archiving all Swedish web pages (since 1996, I think).

    There was a small flap about this recently, due to new data privacy legislation. The workaround is that the material is not available on the web, but can be accessed at the library.

    Which is, of course, a bit silly given things like the Wayback Machine, which are located in foreign countries where EU privacy directives don't matter.
    • From the Swedish Royal Library (http://www.kb.se/ENG/kbstart.htm): "Opening hours: Friday October 31 the library will close at 16 PM. Saturday November 1 the library is closed." Now, that's an interesting time.
    • I always find it a bit difficult to get worked up about the British Library. Yes, it's great that it has a copy of every letter printed since God-knows-when, but who actually gets to see them?!?

      The way I understand that access to the Swedish Royal Library works -- and most other libraries in Sweden for that matter as well as every library I have been in contact with in Denmark and I don't suppose the situation is very different in Norway or Finland -- is that everyone has access to it. That is, it doesn't matt
  • but will be keeping resources of national, historical or academic interest.

    So will my personal "watch my little baby pictures" blog be archived, then?
  • Storage (Score:5, Insightful)

    by rf0 ( 159958 ) <rghf@fsck.me.uk> on Saturday November 01, 2003 @12:03PM (#7366814) Homepage
    Well, I have to wonder how all this will be stored and made secure for the next 100 years. It's going to take some large-scale hardware, with a fast recall mechanism. Whatever company gets/has the contract must be rubbing their hands with glee.

    Rus
    • Re:Storage (Score:4, Interesting)

      by TomV ( 138637 ) on Saturday November 01, 2003 @12:30PM (#7366900)
      Considering the cost of the existing 340km of basement shelving, mostly mobile, in a tightly controlled microenvironment, with fire and flood protection, I certainly wouldn't expect them to skimp on the storage. But I'd expect the competitive tendering process to keep some sort of a lid on the spend.
    • Re:Storage (Score:5, Insightful)

      by bug-eyed monster ( 89534 ) <bem03&canada,com> on Saturday November 01, 2003 @12:32PM (#7366907)
      Yeah it'll be interesting to see how the info will be stored. Looks like they're also collecting CD-ROMs and other "non-print publications." I don't think they absolutely need to store it somewhere that'll last for 100 years. They could store it in redundant media and just replicate them over time as the media's lifespans expire.

      As far as fast recall, the articles don't say if the info will be available on the net. If it's just for archival purposes, they don't need to put it anywhere that's quickly accessible. After all it's a government-run library, so nobody will expect to take less than a day or two to retrieve anything.
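      The "redundant media, replicated over time" idea can be sketched as a periodic integrity check: keep a known-good checksum for each item, compare every copy against it, and rewrite any copy that has decayed from one that is still intact. This is only an illustration under assumed names (`repair_replicas`, the in-memory `replicas` mapping); nothing in the articles says how the library will actually do it.

```python
import hashlib
from typing import Dict, List


def sha256_of(data: bytes) -> str:
    """Return the SHA-256 hex digest of a blob."""
    return hashlib.sha256(data).hexdigest()


def repair_replicas(replicas: Dict[str, bytes], expected_digest: str) -> List[str]:
    """Overwrite every replica whose digest no longer matches.

    `replicas` maps a (hypothetical) medium name to the bytes read back
    from it. Returns the names of the replicas that were repaired.
    """
    intact = [name for name, data in replicas.items()
              if sha256_of(data) == expected_digest]
    if not intact:
        raise ValueError("no intact replica left to copy from")

    source = replicas[intact[0]]
    repaired = []
    for name in replicas:
        if name not in intact:
            replicas[name] = source  # re-replicate from a good copy
            repaired.append(name)
    return repaired


if __name__ == "__main__":
    original = b"archived page"
    copies = {"tape": original, "disk": b"archived pagf", "optical": original}
    print(repair_replicas(copies, sha256_of(original)))
```

      Running a check like this on a schedule shorter than the expected lifespan of the media is what would let the archive outlive any single disk or tape.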
  • Censorship? (Score:5, Insightful)

    by samjam ( 256347 ) on Saturday November 01, 2003 @12:04PM (#7366819) Homepage Journal
    So when all the news web-sites have to pull a story because it relates to a trial... will it be pulled from the archive?

    Will it be put back after the trial?

    Or will it be a highly biased archive where anything that ever went to trial is strangely absent apart from the verdict.

    I used to manage the ananova search engine and it was a royal pain to have to yank spidered stories out of the result set, yet the way some websites work (different urls for same story) it would be back in again after a while. Judges don't care for such technical excuses.
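    One way around the "different URLs for the same story" problem is to suppress by content rather than by URL, so a re-spidered copy is still filtered out. The sketch below is an assumption-laden illustration, not how the search engine mentioned above worked: `SuppressionList` and `fingerprint` are made-up names, and matching is exact after whitespace and case normalisation, so pages that differ in surrounding template text would need proper near-duplicate detection.

```python
import hashlib
import re
from typing import List, Tuple


def fingerprint(text: str) -> str:
    """Hash the story text with case and whitespace normalised."""
    normalised = re.sub(r"\s+", " ", text).strip().lower()
    return hashlib.sha256(normalised.encode("utf-8")).hexdigest()


class SuppressionList:
    """Tracks stories that must not appear in results, keyed by content."""

    def __init__(self) -> None:
        self._blocked = set()

    def suppress(self, story_text: str) -> None:
        self._blocked.add(fingerprint(story_text))

    def release(self, story_text: str) -> None:
        # Called once the restriction no longer applies.
        self._blocked.discard(fingerprint(story_text))

    def filter_results(self, results: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
        """Drop (url, text) pairs whose text matches a suppressed story."""
        return [(url, text) for url, text in results
                if fingerprint(text) not in self._blocked]
```

    Because the key is the text, the same story reappearing under a new URL stays out of the result set until `release` is called.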
    • by Idou ( 572394 ) * on Saturday November 01, 2003 @01:01PM (#7367032) Journal
      I really believe there is too little discussion about issues like this. What you are hitting on is the matter of accountability. It is an extremely important tool for our society. Unfortunately, it usually takes a serious disaster (like the Great Depression) before people realize that accountability is essential to our civilization and something gets implemented. And the situation is even worse with relatively new technology.

      People tend to see technology as a separate "thing" that does not require the kind of scrutiny that other issues get. People only get excited when the technology stops working.

      For instance, the majority of users have no problem with using a closed source OS like Windows. There are some really important issues about accountability that get neglected but as long as it works, people don't care. The only time people start to care is when insecure code allows their files to be erased and reality bursts their bubble. But what is the complaint? "MS, you need to get it together!" Unfortunately, the majority of people do not associate "accountability" as the main factor behind insecure code. They blame MS for being lazy (which is absurd, for so many reasons).

      It seems that accountability is always an afterthought. If the system appears to be working, no one complains. However, without accountability, it is very easy for the system to be completely upside-down, yet appear to be working fine on the surface (most accounting scams appear flawlessly normal on the surface, even when BILLIONS of dollars are being stolen or misrepresented).

      This is not purely academic, and we /.ers are not immune. Why do we invest so much time into this site without demanding a certain degree of accountability? Is it not possible for our experience with this site to be pretty normal, yet what actually is going on in the background is quite contrary to our very reason for coming here? Without accountability, how will we ever know?
    • I doubt it'll be pulled from the archive as that's not necessary. In order to comply with the law, all that's needed is for the material to be inaccessible until a verdict is reached.
      • Yeah, but guess what's easiest?

        To remove something
        or
        to remove it and put it back again later
        or
        to remove it and remember to put it back later and put it back later
        or
        to remove something and be bothered to remember to put it back later and to be bothered to put it back later and to put it back later

        Guess whether or not any stories we yanked from the search set were restored, or if we left the cron job running which kept pulling them.
        *cough*
        I don't know, but I suspect the cron job may have been stopped when an
        • Re:Censorship? (Score:3, Insightful)

          It's "easiest" just not to archive anything whatsoever - regardless of whether the content's legal or not. However, doing so would be against both the spirit and word of the law.

          The law puts an onus on the British Library to archive everything. It also puts an onus on the British Library not to publish material that might prejudice a court proceeding. The only way to obey both laws is to archive everything and provide conditional access.
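          "Archive everything and provide conditional access" amounts to keeping storage and visibility as two separate things: restricting an item hides it from readers without deleting it, and lifting the restriction needs no re-crawl. A minimal sketch, with a hypothetical `Archive` class and in-memory storage standing in for whatever the library would really use:

```python
from typing import Dict, Optional, Set


class Archive:
    """Stores every item; access restrictions are tracked separately."""

    def __init__(self) -> None:
        self._items: Dict[str, str] = {}
        self._restricted: Set[str] = set()

    def store(self, url: str, content: str) -> None:
        self._items[url] = content

    def restrict(self, url: str) -> None:
        # e.g. while court proceedings are active
        self._restricted.add(url)

    def lift(self, url: str) -> None:
        # e.g. once a verdict has been reached
        self._restricted.discard(url)

    def holds(self, url: str) -> bool:
        """True if the item is archived, whether or not it is readable."""
        return url in self._items

    def read(self, url: str) -> Optional[str]:
        """Return the content, or None if it is restricted or unknown."""
        if url in self._restricted:
            return None
        return self._items.get(url)
```

          The point of the split is that the "put it back later" step in the parent comment becomes a single call to `lift` rather than something that depends on the material still existing elsewhere.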

  • by Anonymous Coward
    do they plan to download and archive mp3s? i could help them out there.
  • by Ed Avis ( 5917 ) <ed@membled.com> on Saturday November 01, 2003 @12:28PM (#7366894) Homepage
    What the articles don't make clear is why legislation was needed. If all that will happen is for the British Library to crawl .uk sites, they could do that already.

    For print publications it is mandatory to send a copy to the BL. Obviously that would never be workable for websites. But does the law now say that the BL has the right to take copies of what you publish whether you like it or not, as already happens for dead-tree publications?

    For example the library might spider even sites with a robots.txt that forbids it, and be protected (in the UK at least) from legal harassment for doing so.

    What new powers does this Act give the library that it didn't have before?
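    For context on the robots.txt point: honouring it is a voluntary check made by the crawler, which is why the legal position matters. The sketch below shows what a compliant crawler does, using Python's standard `urllib.robotparser`; the user-agent string and URLs are invented for the example, and nothing here says whether the library's crawler will or will not perform this check.

```python
from urllib.robotparser import RobotFileParser

# Example robots.txt content (made up for illustration).
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def polite_crawler_may_fetch(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if a robots.txt-respecting crawler would fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    agent = "hypothetical-archive-bot"
    for url in ("http://example.co.uk/index.html",
                "http://example.co.uk/private/page.html"):
        print(url, polite_crawler_may_fetch(ROBOTS_TXT, agent, url))
```

    A crawler that simply skips this call sees exactly what a browser sees, so the file only protects a site from crawlers that choose to ask.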
    • Why is this important? Unless you have "sensitive" data on your web page, storing the contents of your index.[html|shtml|php] is no big deal now, is it? If you do have this "sensitive" data on your web page in the first place, don't you wish it to be archived somewhere? The age-old question of privacy appalls me sometimes. Not everything is government control and Big Brother watching over us. Lighten up!
    • I believe the legislation was needed to increase the library's remit to cover electronic documents.

      The library is currently government funded to cover just paper documents.
    • I suspect there are two aspects to this:

      1. New media, such as CD-ROMs, etc. were not previously covered by the mandatory deposit rights the library has, so they may simply be taking this opportunity to make an announcement that covers both CD-ROMs and the web since that's simpler for the general public to understand.

      2. It may also be that without the law being changed, they would not have received the necessary government funding to create and maintain their web archive.
  • Well, how many years till it becomes the world's largest archive of porn? After all... that's interesting to a lot of people...
  • They better invest in a bigger pipe - the site is terribly slow already :(
  • by Anonymous Coward
    goatse.cx so I can look back on my misspent childhood in 40 years' time.
  • That's nice but. (Score:1, Flamebait)

    by CGP314 ( 672613 )
    Ever try to get a library card there? Well, you can't. Not without a letter from your university saying there is no place else in the world you can find the material. I would kill for one big, central, public library in London like the New York Public Library. But no, there are 10,000 crappy little ones all over the place.
    • Re:That's nice but. (Score:2, Informative)

      by TomV ( 138637 )
      There's nothing in the BL that you can't get within a fortnight by Inter Library Loan from the crappy little library of your choice.
      • Often it's faster than that and if necessary the copy can come from the BL - I've received a book before now that was held in the BL before being delivered to my local library for my attention.
        • Actually, if I dust off my old Librarian degree for a second, the British Library is in fact not a building full of books at all, but rather a collection of organised knowledge, which can be accessed from any of thousands of crappy little libraries, and at several huge ones too ;-)
    • Are you kidding me? (Score:4, Informative)

      by WIAKywbfatw ( 307557 ) on Saturday November 01, 2003 @01:38PM (#7367168) Journal
      The British Library isn't a public lending library, it's an academic library. It houses one of the most extensive literary collections in the world and it would seem patently obvious to me why it is that you can't just walk in, fill in a form and just take out whatever you like.

      Some of its treasures are so delicate that they can't be touched by human hands - is that the kind of item you think should be easily accessed on a whim?

      Is getting hold of relevant material at your own university's libraries really that difficult? Or is obtaining a letter of approval from your faculty impossible? I have to doubt that the answer to both these questions is a "yes".

      On a parting note, perhaps you should try comparing the British Library to its one true American counterpart, the Library of Congress. The LoC is a fantastic archive, but despite being publicly funded and supposedly open to the public, you can't access it unless you're actually part of the political machine, as Michael Moore once illustrated.

    • I talked my way into a "British Library card" once. I was stationed in the UK as a USAF medic and was spending the day in London. I told them I was doing a research project on historical military medical techniques, showed them my two IDs (the regular military ID card and the "don't shoot this guy" Geneva Convention card that medics get) and was issued a three-day BL card that gave me access to, well, pretty much the whole damn thing.

      They were watching me pretty closely, though, so, in a profound fit of
    • ...particularly if you attend a university next to one. For example, the University of Wales, Aberystwyth even allows undergrads access; tutors are generally very willing to sign papers. I've still got a card valid into 2005 which I rarely used because the university's own facilities are more than sufficient for many disciplines.

      I understand national libraries in Scotland and elsewhere are a lot less friendly with access, but lots of people visit Wales specifically for NL access. Also, there are only, what

  • hmmm (Score:2, Funny)

    by Anonymous Coward
    i wonder if they'll archive the wayback machine.
    i wonder if the wayback machine will archive them.
  • ...no, not obvious copyright ones (the web being a publishing medium no different to any other in this respect; content is copyright but is said to have been published publicly unless password-protected. I don't think robots.txt would stand up in court if other agents such as browsers have access.)

    A while back it was posited that sites should actually be responsible for providing snapshots of sites, though. Fortunately, I believe this was shot down; the cost implications would be mind-boggling.

    I'm glad
