The Internet

How Google Saved USENET 280

Masem writes: "Salon has a well-written article on the recent revival of much of the USENET archives from '81 to '90 by Google. It mentions that much of the recovery was thanks to years of work in transferring data off 140-some 10" magnetic tapes (~120megs of data) to a more conventional format in order to recover much of the early posts. Even a reference to the previous Slashdot story is made." Update: 01/07 23:52 GMT by T : btempleton adds: "O'Reilly Network asked me to do an article on similar themes and remembrances of USENET history." Thanks, Brad.
This discussion has been archived. No new comments can be posted.

  • by Chagatai ( 524580 ) on Monday January 07, 2002 @05:27PM (#2800956) Homepage
    ... they must have recovered the earliest copies of the script to Monty Python and the Holy Grail and Douglas Adams Jokes ever!

    --Chag

    • It's been posted here before, but a list of "first mentions" are here [google.com]. Notably absent is the first mention of Kibo... just an early post BY him. :)
      • by TheGreenLantern ( 537864 ) <thegreenlntrn@yahoo.com> on Monday January 07, 2002 @05:49PM (#2801111) Homepage Journal
        From the "First Mention of Star Wars Episode 6" entry:

        I can't really imagine waiting until 1997 to see all nine parts of the Star Wars series.

        I don't know what this "nine parts" jazz is, but that little 1997 blurb is about the funniest thing I've seen all day.
        • I don't know what this "nine parts" jazz is, but that little 1997 blurb is about the funniest thing I've seen all day.

          According to Lucas, SW was supposed to be a trilogy of trilogies (Lucas has since recanted and said that E3 will be the last). E5 was out 3 yr. after E4, E6 three years after that. You do the math. No one expected the long hiatus between E6 and E1. After Jar Jar, they wondered if Lucas had waited long enough...

          • And with Attack of the Clowns on the way, with (rumor has it) NSync in it, and more Jarhead Bites, you certainly have to wonder if Lucas is having "Sellout" tattooed on his forehead....

            Now, LOTR may not have been perfect, but at least it was reasonably true to the book (hence a decent story) and showed what you can do with a good story. In this instance, we have Lucas busily destroying the mystique and the depth built up in the first SW trilogy (well, first in terms of release date).

            I waited outside for a few hours to get tickets to EP1. I'll wait till a while after the premiere to see this next film. If it is as disappointing, I'll wait for EP3 maybe longer than that. George, this is not the way to go about prying Imperial Credits from my wallet....
      • Re:Just think... (Score:2, Interesting)

        by Cowculator ( 513725 )

        They did leave out this first mention [google.com] in 1991 of a certain kernel, though, which Linus obviously remembered just a few months later in his own first. [google.com]

        To quote another /. poster via the article about how embarrassing things like this are, "It's like having naked baby pictures of yourself stapled to your forehead when you walk around"...

        • Re:Just think... (Score:5, Informative)

          by ideut ( 240078 ) on Monday January 07, 2002 @06:17PM (#2801254)
          Reading your first link, it's amusing to see that even ten years ago there were a lot of ridiculous IP shenanigans. Such as

          "Ashton-Tate is once again pushing its case for a copyright on the programming language used in DBase. ".

          And the numerous silly patents, such as

          'Emacs is threatened by IBM patent number 4,674,040 which covers "cut and paste between files" in a text editor. Many Emacs features are threatened by patent number 4,458,311, which covers "text and numeric processing on same screen." Patent 4,398,249 covering the general spreadsheet technique known as "natural order recalc" stops us from using it in GNU '

  • Wow, similar story (Score:3, Informative)

    by Tairan ( 167707 ) on Monday January 07, 2002 @05:27PM (#2800960) Homepage
    released today in the San Francisco Chronicle. Read it over at sfgate.com [sfgate.com]. I'm surprised two independent media organizations would review the same company about the same thing and release it in the same general time frame! Amazing~

    • released today in the San Francisco Chronicle. Read it over at sfgate.com [sfgate.com]. I'm surprised two independent media organizations would review the same company about the same thing and release it in the same general time frame! Amazing~


      Gee, the print media has a hierarchy: All editors read the NY Times, the LA Times, and the Wash Post to see what the consensus important stuff is. The editors of the LA Times and the Wash Post read the NY Times to see what the important stuff is. The editors of the NY Times decide what's important stuff to print. This is why all the newspapers look the same.

  • by ImaLamer ( 260199 ) <john.lamar@NospaM.gmail.com> on Monday January 07, 2002 @05:27PM (#2800966) Homepage Journal
    I think the title of the story should be How Google Saved USENET.

    Yes, Google saved the historical record of the USENET, but it did not need to save the USENET from anything else. USENET is alive and well.

    • Ah, but Google didn't save the "historical record of the USENET", it restored them (well not really them - go read the article).
    • Having 30-odd thousand Mac vs. Windows flame wars might not seem like a great thing to save now (especially if you were involved with them), and hell, no one might care in 100 years, but a lot of history is based on reading the common writings of everyday people. How great is it to be able to read the diary entries of a Frenchman during the middle of the revolution, or to look at the account book of a Middle Ages merchant? Most of it may seem mundane, but history is made up of many mundane moments. That's why we have grad students: to sift through all the "Me too" posts and "M$ Sux" posts to find the really meaningful stuff. You must admit, it's great to look at the Google archives of the birth of Linux or the first mention of AIDS.
  • by TheLocustNMI ( 159898 ) on Monday January 07, 2002 @05:29PM (#2800975) Homepage
    Having had to work with those bastards, I'd have to give extra kudos to Google! There are few places in the United States that can actually read them, and get you the data from them anymore, and they must've been lovingly cared for, with some of them being 20 years old!

    I think I speak for everyone when I say "Thank you Google for arming me with the information contained in old USENet posts to bring up embarrassing teenage posts to my friends!"

  • by reaper20 ( 23396 ) on Monday January 07, 2002 @05:31PM (#2800987) Homepage
    Google Groups is awesome, especially when searching for some obscure piece of hardware advice or settings.

    I don't have to worry about getting and setting up a news client, and it's just one tab over from my default search engine.

    Google did save USENET for me - though I never post, searching through all the linux and comp newsgroups is usually faster than looking up a HOWTO.
    • yes, you sum it up (Score:3, Insightful)

      by markj02 ( 544487 )
      So, you are saying that USENET has changed from an informal discussion group to a searchable perpetual repository of technical support Q&As, plus a repository of background information on people who were foolish enough in the 1980s to post under their own names. I agree. The part I don't understand is how you think that constitutes "saving" USENET. USENET didn't use to be much of an on-line community compared to some of the others, but it was a community. Once it became archival, anonymous, and searchable, that went away. Who, after all, wants their every word recorded and replayed into perpetuity?
    • Google did save USENET for me - though I never post, searching through all the linux and comp newsgroups is usually faster than looking up a HOWTO.

      As a regular USENET poster, I'm gratified that you've found our posts useful, but please, please do consider participating yourself!

      "But I don't know anything worth posting!" , I hear you cry. Well, for a start, since when has that stopped anyone on USENET, myself included! Besides, I'm sure everyone knows something about something, even if it's "only" mexican cooking (alt.food.mexican-cooking) and Italian manga (alt.italian.anime-manga).

      Take the trouble to subscribe to a few groups and get involved. Keep them as lively discussion fora, not dusty historical archives and a spam collection!

      I discovered USENET in 1992, and I've rarely gone away. It's definitely the most consistently interesting and useful part of the Internet, IMHO.

      --

  • by ThomasMis ( 316423 ) on Monday January 07, 2002 @05:31PM (#2800988) Homepage
    As a software developer, no matter what problem I run into, somebody else has already run into that problem and has asked my question and received an answer on groups.google.com. Whenever I get stuck on anything at all, it's the first place I run to. groups.google.com is the single most useful site you can point your browser (konqueror!!!) towards. I'm not sure how they make money over there at google, but what a great service they are providing!
    • by aussersterne ( 212916 ) on Monday January 07, 2002 @06:17PM (#2801252) Homepage
      This is absolutely true. I am often asked "What book(s) can I buy, to learn what you've just told me? How do I gain the knowledge in [subject X] that you have? I don't care if it takes me a decade, I just want to learn it, but I can't seem to find out where. Is it written down?"

      I tell them: it is a decade's worth of learning, and then some, but not from books. It is all from USENET. I became a competent C programmer who writes more efficient code and makes fewer fundamental mistakes thanks to usenet. I learned to use BSD and then to use Linux as fast and furious as I can type and to get myself out of any system problem, save my data from nearly any corruption thanks to usenet. I am able to network these odd things, build these robots, and have this "cool stuff" that you like so much that works so well thanks to usenet. I can make nearly any computer go, no matter how old or weird or what media or operating system it uses (a feat which makes you a legend in your own department) thanks to usenet.

      It's not my knowledge... I humbly picked it up in the mid and late '80s and early '90s and still constantly refer to it, first through Deja and now through Google. It is our knowledge, collective and stretching backward in time. To ever lose the news archive would be a tragedy -- the amount of searchable data on everything from chemistry and biology to computing and electronics to literature and politics is truly stunning. With the news archive, you can learn to hotwire together any two things so long as they have *wires* to do something useful; you can learn to brew just about anything including some of the best beer ever; you can learn just what the HELL James Joyce is talking about at times in Ulysses. Every question has been answered before you even asked.

      The only sad thing has been the degree to which the groups have been turned into a boulevard of endlessly flashing neon porn signs in the last few years, almost to the degree that anything else is drowned out by the brightness.

      Study USENET. Use USENET. Live and learn. Amen.
  • by SmittyTheBold ( 14066 ) <[deth_bunny] [at] [yahoo.com]> on Monday January 07, 2002 @05:31PM (#2800989) Homepage Journal
    ...how Google will make money off of this. They supposedly make money off licensing their technology (and presumably their collected data, as well.) No ads whatsoever. I applaud their dedication to that goal so far.

    Groups.google.com seems like the kind of thing they're doing just because they can, though. I can't imagine there is much money to be made off the technology, because it's all text - the same search tech applies. So, as far as I can tell, there is no business reason to be doing this. It's a drain on resources with little to no return, except for (geek) community goodwill.

    The conclusion I draw, then, is Google is in this just for the fun, challenge, or doing something for the community - maybe all three. Philanthropy at its best. =)
    • by frank_adrian314159 ( 469671 ) on Monday January 07, 2002 @05:57PM (#2801164) Homepage
      I can't imagine there is much money to be made off the technology, because it's all text - the same search tech applies. So, as far as I can tell, there is no business reason to be doing this.

      If you build it, they will come...

      The old USENET posts are an information archaeologist's garbage heap. If information has any intrinsic value at all, this is the place to find treasures. Just because some folks see dirt doesn't mean there isn't gold to be mined.

      • If you build it, they will come...

        Oh please, the "dollars follow eyeballs" fantasy hasn't been mouthed by anyone worth their weight in salt in over two years. 99% of the posts Google is archiving have absolutely zilch, nada value, to anyone, including the original posters.

        My guess is that Google will realize that 95% of the searches pertain to posts from the last twelve months and will send the rest back to the tape locker.

    • by krogoth ( 134320 ) <slashdotNO@SPAMgarandnet.net> on Monday January 07, 2002 @06:09PM (#2801217) Homepage
      I think google should be paid just for being so damn cool. They deserve spontaneous income for things like the groups (with the history they now have), having a '1337-h4x0r' language you can use (http://www.google.com/intl/xx-hacker/), changing their banner for special days (anyone else see the christmas thing?)...

      There's a lot of companies right now that should be punished for doing stupid things, but Google is the complete opposite; I'd like to see Microsoft, the RIAA, and the MPAA have to donate 20% of their money to google :)
    • No ads whatsoever.

      Google DOES have ads, just not the obtrusive, annoying kind. I.e., look up "car tires" and the first thing you see is a "sponsored link" by Tire Rack.
    • I would think that market share would play a large role in this. If you are licensing your technology and you become the de facto standard, there's a lot of bucks to be made in that.


      If google can capture geek market share, guess who usually makes the IT decisions at a company? By having some goodwill out there today, they will try to bank on it in the future. (Hey, remember us? We saved your flame war from '89!) Buy our stuff!


      Besides, since it IS all text, it probably doesn't take up THAT much space. There are probably pr0n sites that create more memory usage in one day than USENET did in one year.

    • There IS money to be made with this. Google's text based ad technology is VERY powerful, and has some of the best targeting potential in the industry.

      While I'm not sure of the legalities, Google will probably add the same text based ads located on its web search to its newsgroup search. This will mean when you search for "tivo upgrade", you could see a text based ad offering hard drive upgrade kits next to the news posts. Unobtrusive, yet effective.

      Not really bait and switch, but they're getting everyone hooked on the system now, and'll work on ads later. (just like they did for the web search)

      Again, I don't blame them. Everyone has to make a buck, and Google's doing it in the best possible way.
  • Archaic Technology (Score:5, Interesting)

    by irregular_hero ( 444800 ) on Monday January 07, 2002 @05:31PM (#2800991)
    The article isn't kidding about the difficulty of finding a reader for your typical nine-track tape these days. I spent lots of bucks on a SCSI nine-track a few years ago for archiving system and application software on nine-track from old computer systems. And although the purchase helped, there are still occasions when I have to fire up some very old Big Iron to read one tape or another.

    An interesting thing about these tapes: They stretch over time and can sometimes become unreadable because of that. There are times when, to extract the information on the tape, I would put a number of them in my freezer for an hour or so, then try again. Nine times out of ten that would actually work.

    Another note about the article: I can still remember discussions with others who had modems about 1200 baud being just "too fast". The reasoning was that the average person couldn't read much faster than 300 baud. :)
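
    For anyone who wants to check the "300 baud is about reading speed" reasoning, here is a rough sketch of the arithmetic. It rests on assumptions that are not in the post: 10 bits on the wire per character (start bit, 8 data bits, stop bit), one bit per symbol so that baud equals bits per second, and about 6 characters per word including the space. The function names are made up for illustration.

```python
def chars_per_second(baud: int, bits_per_char: int = 10) -> float:
    """Characters delivered per second, assuming one bit per symbol
    and 10 bits on the wire per character (assumed 8N1 framing)."""
    return baud / bits_per_char


def words_per_minute(baud: int, bits_per_char: int = 10,
                     chars_per_word: int = 6) -> float:
    """Approximate text rate in words per minute.
    chars_per_word=6 (five letters plus a space) is an assumption."""
    return chars_per_second(baud, bits_per_char) * 60 / chars_per_word


if __name__ == "__main__":
    for baud in (300, 1200):
        cps = chars_per_second(baud)
        wpm = words_per_minute(baud)
        print(f"{baud} baud: {cps:.0f} chars/s, about {wpm:.0f} words/min")
```

    Under those assumptions 300 baud works out to roughly 300 words per minute, which is in the range of ordinary reading speed, while 1200 is about four times that.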

    • I think that Google's data transfer woes are a convincing example of the problem of reading archived information after the medium becomes obsolete. Suppose Google or somebody else hadn't decided to do this for another couple of years; they might not even have found equipment to read the old tapes. There are a lot of companies that spend beaucoup cash keeping their 20-year-old equipment around just so they can keep a record of business transactions from 20 years ago.

      • Clever businesses transferred their 9-track archives to Exabyte about a decade ago. The problem is people with only a few tapes, not clueful people with lots of them.

        As an example, the VLA (Very Large Array, a radio telescope in New Mexico) had its entire archive on 9 track. When Exabytes finally became cheap, they just copied their entire data archive (everything observed since it started taking data in 1978, thousands of tapes) to Exabyte tapes. The expense wasn't that large compared to their overall operations expense.
        • by wiredog ( 43288 )
          NASA has an ongoing program to transfer all their data to new formats. Last I heard they were (still) moving it all from tape to 12" optical disks. They have lots of data.
    • The reasoning was that the average person couldn't read much faster than 300 baud. :)
      Also note the implicit assumption that (nearly) everything coming across was worth reading. Unlike now, when usenet and web content make Sturgeon look like an optimist...
  • by slashdot.org ( 321932 ) on Monday January 07, 2002 @05:31PM (#2800992) Homepage Journal
    years of work in transferring data off 140-some 10" magnetic tapes

    That means at least one person spent several DAYS PER TAPE???

    Even punch tape 'd faster than that. ;)
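
    A back-of-the-envelope version of that estimate, for the curious. The story only says "years" and "140-some" tapes, so the figures of 2 years and 140 tapes below are assumptions taken as the minimum reading of those words, and the function name is hypothetical.

```python
def days_per_tape(years: float, tapes: int,
                  days_per_year: float = 365.0) -> float:
    """Average calendar days spent per tape if the whole effort
    took `years` years and covered `tapes` tapes."""
    return years * days_per_year / tapes


if __name__ == "__main__":
    # Assumed inputs: "years" read as at least 2, "140-some" read as 140.
    print(f"{days_per_tape(2, 140):.1f} days per tape")
```

    This counts calendar time, not working time, so it says nothing about how many of those days anyone actually spent at the tape drive.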
  • by msolnik ( 536110 ) on Monday January 07, 2002 @05:33PM (#2801010)
    I was wondering what kind of backup Google uses now for all its info. What happens if one day a script kiddie breaks in and runs rm -rf / on all the boxes? Do they have tape backups? How many, etc.? I also wonder how much they spend on it.
    • Easy. They will just post a message on the site asking for someone who might have it on cdrom. That's happened before, you know :-)

      Wasn't it Linus who said something like "real men don't use backups, they post their stuff on a public ftp server and let the rest of the world make copies"?
  • It's like going back to your kindergarten and seeing what you used to play on. The stuff you remember and the stuff that seems much smaller now.

    I ran a search under my name...found some old posts...also found some wild stuff, like an old Slashdot quote I had that someone had pulled out, snipped, and posted along with a bunch of other quotes on an alt.atheist posting.

    Kind of like a community Napster for the brain...Hope they never go away.
  • by prof187 ( 235849 ) on Monday January 07, 2002 @05:34PM (#2801015) Homepage
    Google seems to be getting involved with a lot of things. It's nice to see that a group is not only trying to push the Internet forward, but also trying to preserve the past.
  • ...is just to prove that porn-solicitations and X10 ads aren't the true purpose of "distributed communications." Remember when you could actually carry on an on-topic thread?

    I recently tried to track down the milestone changes for Mozilla, and got a link to the original newsgroup posting. I thought I'd dig around through the responses and see what everyone else thought -- 550 message headers later, I realized that even the Mozilla servers were utterly spammed with "me and my friends, naked & FREE!" For some reason (probably just bad memory) it didn't seem like we had these types of problems back at Berkeley...
  • by C. Mattix ( 32747 ) <cmattix@@@gmail...com> on Monday January 07, 2002 @05:42PM (#2801073) Homepage
    I know this is a repeat but this is a great read. Dr. Gene Spafford's farewell posting. If you don't know who that is, look it up.

    ===
    From: spaf@cs.purdue.edu
    Newsgroups: news.announce.newusers,news.misc,news.admin.misc,news.groups,soc.net-people
    Subject: That's all, folks
    Followup-To: poster
    Date: 29 Apr 1993 19:01:12 -0500
    Message-ID:

    [ I originally was going to post nothing on this topic. I'm burned
    out, and I don't want my fatigue to appear like I'm posting
    self-indulgent garbage. However, several people have argued with
    me, and convinced me that maybe I should make a statement to "end an
    era," and as a piece of net "history." At the least, even if it is
    perceived as self-indulgent garbage, it will fit right in with the
    rest of the net. ]

    There is a Zen adage about how anything one cannot bear to give up is
    not owned, but is in fact the owner. What follows relates how I am
    owned by one less thing....

    About a dozen years ago, when I was still a grad student at Georgia
    Tech, we got our first Usenet connection (to allegra, then being run
    by Peter Honeyman, I believe). I'd been using a few dial-in BBS
    systems for a while, so it wasn't a huge transition for me. I quickly
    got "hooked": I can claim to be someone who once read every newsgroup
    on Usenet for weeks at a time!

    After several months, I realized that it was difficult for a newcomer
    to tell what newsgroups were available and what they covered. I made
    a pass at putting together some information, combined it with a
    similar list compiled by another netter, and began posting it for
    others to use. Eventually, the list was joined by other documents
    describing net history and information.

    In April of 1982 (I believe it was -- I saved no record of the year,
    but I know it was April), I began posting those lists regularly,
    sometimes weekly, sometimes monthly; the longest break was for 4
    months a few years ago when I was recovering from pneumonia and poor
    personal time management. (Tellingly, only a few people noticed the
    lack of postings, and almost all the mail was "When will they come
    out?" rather than "Did something happen?") As time went on, people
    began to attach far more significance to the posts than I really
    intended. It was flattering for a very short time, and a burden for
    most of the rest; there is no telling how much time I have devoted
    over the last decade to answering questions, editing the postings, and
    debating the role of newsgroup naming, to cite a few topics. I really
    tired of being a "semi-definitive" voice.

    Starting several years ago, at about the time people started pushing
    for group names designed to offend or annoy others, or with a lack of
    concern about the possible effects it might have on the net as a whole
    (e.g., rec.drugs and comp.protocols.tcp-ip.eniac) I began to question
    why I was doing the postings. I have had a growing sense of futility:
    people on the net can't possibly find the postings useful, because
    most of the advice in them is completely ignored. People don't seem
    to think before posting, they are purposely rude, they blatantly
    violate copyrights, they crosspost everywhere, use 20 line signature
    files, and do basically every other thing the postings (and common
    sense and common courtesy) advise not to. Regularly, there are postings
    of questions that can be answered by the newusers articles, clearly
    indicating that they aren't being read. "Sendsys" bombs and forgeries
    abound. People rail about their "rights" without understanding that
    every right carries responsibilities that need to be observed too, not
    least of which is to respect others' rights as you would have them
    respect your own. Reason, etiquette, accountability, and compromise
    are strangers in far too many newsgroups these days.

    I have finally concluded that my view of how things should be is too
    far out-of-step with the users of the Usenet, and that my efforts are
    not valued by enough people for me to invest any more of my energy in
    the process. I am tired of the effort involved, and the meager --
    nay, nonexistent -- return on my volunteer efforts.

    This hasn't happened all at once, but it has happened. Rather than
    bemoan it, I am acting on it: the set of "periodic postings" posted
    earlier this week was my last. After 11 years, I'm hanging it up.
    David Lawrence and Mark Moraes have generously (naively?) agreed to
    take over the postings, for whatever good they may still do. David
    will do the checkgroups, and lists of newsgroups and moderators
    (news.lists), and Mark will handle the other informational postings
    (news.announce.newusers).

    I'm not predicting the death of the Usenet -- it will continue without
    me, with nary a hiccup, and six months from now most users will have
    forgotten that I did the postings...those few who even know now, that
    is. That is as it should be, I suspect. Nor am I leaving the
    Usenet entirely. There are still a half-dozen groups that I read
    sometimes (a few moderated and comp.* groups), and I will continue to
    read them. That's about it, though. I've gone from reading all the
    groups to reading less than ten. Funny, though, the total volume of
    what I read has stayed almost constant over the years. :-)

    My sincere thanks to everyone who has ever said a "thank you" or
    contributed a suggestion for the postings. You few kept me going at
    this longer than most sane people would consider wise. Please lend
    your support to Mark and David if you believe their efforts are
    valuable. Eventually they too will burn out, just as the Usenet has
    consumed nearly everyone who has made significant contributions to its
    history, but you can help make their burden seem worthwhile in
    between.

    In closing, I'd like to repost my 3 axioms of Usenet. I originally
    posted these in 1987 and 1988. In my opinion as a semi-pro
    curmudgeon, I think they've aged well:

    Axiom #1:
    "The Usenet is not the real world. The Usenet usually does not even
    resemble the real world."
    Corollary #1:
    "Attempts to change the real world by altering the structure
    of the Usenet is an attempt to work sympathetic magic -- electronic
    voodoo."
    Corollary #2:
    "Arguing about the significance of newsgroup names and their
    relation to the way people really think is equivalent to arguing
    whether it is better to read tea leaves or chicken entrails to
    divine the future."

    Axiom #2:
    "Ability to type on a computer terminal is no guarantee of sanity,
    intelligence, or common sense."
    Corollary #3:
    "An infinite number of monkeys at an infinite number of keyboards
    could produce something like Usenet."
    Corollary #4:
    "They could do a better job of it."

    Axiom #3:
    "Sturgeon's Law (90% of everything is crap) applies to Usenet."
    Corollary #5:
    "In an unmoderated newsgroup, no one can agree on what constitutes
    the 10%."
    Corollary #6:
    "Nothing guarantees that the 10% isn't crap, too."

    Which of course ties in to the recent:

    "Usenet is like a herd of performing elephants with diarrhea --
    massive, difficult to redirect, awe-inspiring, entertaining, and a
    source of mind-boggling amounts of excrement when you least expect
    it." --spaf (1992)

    "Don't sweat it -- it's not real life. It's only ones and zeroes."
    -- spaf (1988?)

    --
    Gene Spafford, COAST Project Director
    Software Engineering Research Center & Dept. of Computer Sciences
    Purdue University, W. Lafayette IN 47907-1398
    Internet: spaf@cs.purdue.edu phone: (317) 494-7825
    ===
  • by Henry V .009 ( 518000 ) on Monday January 07, 2002 @05:42PM (#2801074) Journal

    Ye Gods!

    The modern slashdot nerd trembles in the presence of those ancient USENET nerds of old

    A 300 pound slashdot weakling is easily flung aside by the 500 pound USENET god. Who at slashdot keeps taped archives of every post for the nerds of future generations? Truly those were nerds.

  • by Anonymous Coward on Monday January 07, 2002 @05:45PM (#2801083)
    i'm a tad concerned about the posts i made in the early 90's when i was an asshole know it all teenager coming back to haunt me... i wish google never uncovered those... i cringe when i read them now...
  • History of USENET [vrx.net]

    Archive for the History of Usenet Mailing List [ucsd.edu]

    Usenet Readers and Clients [wicip.org]

    History of Usenet - Development, people involved [about.com]

    (Yeah sure, anyone could look these up but isn't it easier to just point and click? There is more to USENET history than Google. Also, if you think I'm a karma whore, that's fine. I've got karma to burn.)
  • by sinserve ( 455889 ) on Monday January 07, 2002 @05:49PM (#2801117)
    In a major university, and I decided to honor his
    soul and follow his foot steps.
    And now, thanks to google, I find myself battling
    the flame wars he started.

    Better go back and do him and VI an honor .. alt.emacs, here I come.
  • Save the posts (Score:5, Insightful)

    by Kefaa ( 76147 ) on Monday January 07, 2002 @05:50PM (#2801121)
    I am sorry they will allow requestors to delete their own postings. While we might wish it otherwise, 10, 20, 50 years later, this may be the real historical value. To purge seems the equivalent of having a letter to the editor removed from newspaper archives.

    To those who feel like "they are walking around with their baby picture stapled to their forehead", we all mature. What I thought at 20, 30, and 40 shows how I grew. What other archive in human history can provide the transitional opinions, discussions, and outright imbecilic flame wars?

    While we would hate to have someone pull out our post in support of the flat earth theory, to act as though we all believed the earth was round is rewriting history. Convenient for us, but misleading to the future.

    The question now becomes, what happens after Google and Slashdot, when the archive is terabytes large? Will it take 100 years for the next conversion?
    • Wow, I didn't know we could delete our own posts, I guess I can go back and remove all my "In 10 years, we will be vindicated, we'll all be running OS/2 and Microsoft will not exist!" posts. Heh.
    • It reminds me of a Larry Wall quote:

      "Usenet is essentially Letters to the Editor without the editor. Editors don't appreciate this, for some reason."
  • There are no posts in rec.humour by Minas Spetzakis (ca ~1992.)

    Since he's immortalized in the Net Legends [uncommon-sense.net] FAQ, it's a shame there are few examples of his jokes, other than in our memories.

    And now, the Minas'ized version of this post:

    Friend says to me, "See Google because they have many funny posts." I search for my name and find out I am being a kook. Friend says "Legendary!"

  • Me, too!!! (Score:5, Funny)

    by ideut ( 240078 ) on Monday January 07, 2002 @05:54PM (#2801146)
    The first "me too" post isn't until two years into the archive. I suppose that says something about the intelligence of the usenet demographic back then.
  • that it's that hard to get or use the equipment.

    http://www.unisys.com sells 10" SCSI readers for their A-series system. You can buy it separately without an A-series service contract, and it works like any other /dev/rmt device.

    I worked for a company that distributed bank software on them as late as... well... now. And yes, it is COBOL software. ;)

    Major kudos to Google for bringing back old Usenet posts. Besides the knowledge base provided, they are fun to read! Lots of tasteful geek humor. I recommend checking it out.

  • by Saint Aardvark ( 159009 ) on Monday January 07, 2002 @05:58PM (#2801165) Homepage Journal
    Three tapes for rec.singles desperate
    Seven for alt.swedish.chef.bork.bork.bork
    Nine for comp.sci compiling late
    One for Google's engine dark
    In their Linux cluster where the shadows lie.
    One engine to search them all, one engine to bind them
    One engine to index them all and in the darkness find them
    In Google's cluster where the shadows lie.
  • by ortholattice ( 175065 ) on Monday January 07, 2002 @06:03PM (#2801188)
    Since Salon's revenue is based on page hits, the next story will be:

    How Slashdot Saved Salon


    • Salon's main source of revenue is now subscriptions.
      The page hits just get you to the line that says
      "Want to read more? Subscribe now", ideally just
      when it starts to get interesting.

  • by kisrael ( 134664 ) on Monday January 07, 2002 @06:04PM (#2801194) Homepage
    Google has a history of doing a lot of things right, but I have my doubts about their new service: catalogs.google.com [google.com]. It's a search engine for graphically scanned in versions of mail order catalogs! You type in sewing machine [google.com], say, and you get 3 views for each match: a scan of the catalog cover, a scan of the page, and a close up of the page, with the search terms highlighted in yellow.

    It's so retrofuture weird! Like what someone on a C=64 in the 1980s might think a future of online shopping would look like...
  • by GGardner ( 97375 ) on Monday January 07, 2002 @06:14PM (#2801240)
    I find it very interesting that in the last 10 years of USENET, its traffic (and presumably use) has grown dramatically. However, the number of servers has, I believe, dropped equally dramatically. USENET was one of the most distributed systems I remember using, with its shared-nothing, "flood-fill" algorithm.

    Yet, as it scales up to more and more messages, it actually is becoming less distributed. A good lesson for all the futurists forecasting the rise of distributed systems...
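
For anyone who never ran a news server, here is a rough sketch of what "flood-fill" means in the comment above. It is a toy model, not real news software: the four server names, the peering table, and the message ID are all made up for illustration. Each server offers an article to every peer, and a peer accepts it only if the message ID is not already in its history.

```python
from collections import deque

def flood(peers, origin, message_id):
    """Toy flood-fill: spread one article from `origin` to every reachable server.

    Returns each server's history of message IDs and the total number of offers made.
    """
    history = {name: set() for name in peers}   # per-server "already seen" list
    history[origin].add(message_id)
    queue = deque([origin])
    offers = 0
    while queue:
        server = queue.popleft()
        for neighbor in peers[server]:
            offers += 1                          # every peer gets offered the article
            if message_id not in history[neighbor]:
                history[neighbor].add(message_id)  # accepted once, then passed on
                queue.append(neighbor)
    return history, offers

# Hypothetical peering arrangement; no server shares state with any other.
PEERS = {
    "alpha": ["beta", "gamma"],
    "beta": ["alpha", "gamma", "delta"],
    "gamma": ["alpha", "beta", "delta"],
    "delta": ["beta", "gamma"],
}

history, offers = flood(PEERS, "alpha", "<1@alpha.example>")
```

The point of the design is that no server needs a global view: duplicates are refused locally by message ID, so adding or losing a peer changes nothing for anyone else. The cost is that every article is offered over every link, which is part of why full feeds got expensive as traffic grew.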
  • FIRST POST [google.com]!!! :)
    March 11th 1981 - It'll be 21 years old soon, wild!
  • google really should put the Oh How I envy American Students [google.com] usenet posts in the timeline [google.com].
  • by btempleton ( 149110 ) on Monday January 07, 2002 @06:36PM (#2801324) Homepage
    This is a popular theme this month, with no surprises. O'Reilly Network also asked me to do an article on the history of USENET and things discovered in the archives. At the same time I also did an article on the history of some popular net terms like spam and net surfing.

    You can read the article I wrote on the O'Reilly site [oreillynet.com]

  • USENET used to be an informal discussion forum, like something where you might talk with others like you would around the water cooler. Google, AOL, and similar services have greatly expanded the user base for USENET, which means that it isn't much of a community anymore. And by archiving and republishing in perpetuity, thinking people have to watch carefully what they say, or they just post anonymously (or don't participate at all anymore).

    This was probably an unavoidable turn of events. Nevertheless, whether it is Google or some other company, I consider it wrong for them to republish this stuff, in particular as part of a commercial venture. It's the equivalent of digging out old security surveillance tapes and broadcasting them for the amusement of the masses. It's wrong, and the fact that people find some sort of voyeuristic delight in it doesn't change that. The backup tapes that Google used should have been destroyed.

    • by BlacKat ( 114545 ) on Monday January 07, 2002 @06:39PM (#2801339)
      Erm... Usenet is a PUBLIC system, any and all posts you make there are in the public domain.

      The information is preserved for posterity, not for making money or other commercial exploits.

      I can't really believe you think we'd be better off destroying information instead of preserving it!
      • Erm... things you publish aren't automatically in the public domain. Besides, I wasn't making a legal point, I was making a point about the social impact of archiving and republishing informal discussion groups into perpetuity.

        The information is preserved for posterity, not for making money or other commercial exploits.

        Oh? When did Google become a non-profit foundation for the preservation of historical electronic information? And what does historical preservation have to do with publishing a searchable database to the web?

        I can't really believe you think we'd be better off destroying information instead of preserving it!

        Well, we can preserve a lot more information. For example, we can install video recorders all around your house. I'm sure people around the world would find it amusing, and 200 years from now, historians will love those sorts of documents.

        • Preserving newsgroup posts is hardly the same as posting private information. When you post onto a newsgroup, you post to whoever wants to read it. If you don't want it to be read, don't post it. It's that simple. It's like walking in front of a camera in a TV studio and hoping nobody sees you on TV.
    • Try the "x-deja-noarchive=1" command at the end of USENET posts. That used to work...not sure if it does any more.

      Google's indexing system might be different than Deja's indexing system.
      • We had this in 1981: expiration headers. The expectation was that articles would expire within a few weeks, and some articles had explicit expiration dates. But Deja/Google just decided that those didn't apply to them and if you wanted your article to expire, you really had to add their new header to your messages. What guarantee is there that in another 20 years, some other company isn't going to decide that "x-no-archive" really doesn't apply to them?

        The appearance of Deja/Google archives killed USENET because it has shown that there are no guarantees: only a fool would now engage in any kind of controversial discussion on USENET under their own name.
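
To make the two headers in this sub-thread concrete, here is a small sketch that builds an article carrying both an expiration date and an opt-out request. Some assumptions, stated plainly: the address, newsgroup, and 14-day lifetime are invented, and the header spelling used is "X-No-Archive: yes", which is the convention as I understand it (the "x-deja-noarchive=1" form mentioned above is not something I can confirm). Nothing here forces an archive to honor either header, which is exactly the complaint in the parent.

```python
from datetime import datetime, timedelta, timezone
from email.message import EmailMessage
from email.utils import format_datetime

def build_article(body, posted, keep_days=14):
    """Build a news-style article that asks to expire and to stay out of archives."""
    msg = EmailMessage()
    msg["From"] = "poster@example.invalid"       # hypothetical poster
    msg["Newsgroups"] = "misc.test"              # hypothetical group
    msg["Subject"] = "Just thinking out loud"
    msg["Date"] = format_datetime(posted)
    # Expiration header: a request, not a guarantee.
    msg["Expires"] = format_datetime(posted + timedelta(days=keep_days))
    # Opt-out header: also only a request to whoever runs the archive.
    msg["X-No-Archive"] = "yes"
    msg.set_content(body)
    return msg

article = build_article(
    "Informal remark, not meant for posterity.\n",
    datetime(2002, 1, 7, 18, 0, tzinfo=timezone.utc),
)
expires = str(article["Expires"])
no_archive = str(article["X-No-Archive"])
```

Both headers are advisory text in the article itself, so they travel with every copy, and any site that keeps a copy can simply ignore them.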

  • Henry Spencer... (Score:4, Interesting)

    by Jacco de Leeuw ( 4646 ) on Monday January 07, 2002 @06:38PM (#2801336) Homepage
    Ironically, Henry Spencer is also the lead programmer for the Linux IPSEC stack FreeS/WAN [freeswan.org] (encrypted and secret communication).

    While also saving the Usenet archives (public and widely dispersed information)..!

  • ... I was a member of Team OS/2!!! Where is that URL to get postings deleted?
  • That little? (Score:4, Interesting)

    by man_ls ( 248470 ) on Monday January 07, 2002 @06:45PM (#2801365)
    So let me get it straight. 9 years of USENET posts occupy only 16.8GB of hard disk space?

    You sure those 10-inch magnetic tapes weren't 1200MB or 120GB or something? Hell, a converted VCR using VHS as a backup medium can store like 100GB (saw one somewhere, I forget the link.)
    • Re:That little? (Score:3, Informative)

      by snake_dad ( 311844 )
      From the google groups faq:
      * Can I access binary content on Google Groups?

      No. Google Groups does not archive any binary content.
      So maybe binaries were not archived in the early days either, or maybe there were no binaries yet, I can't remember. Anyway, nowadays binaries account for most of the enormous amounts of data pushed over Usenet. So filtering that out makes the data a bit more usable.
      • Re:That little? (Score:4, Informative)

        by btempleton ( 149110 ) on Monday January 07, 2002 @07:43PM (#2801655) Homepage
        There were no binaries in the early days. First there were the net.sources groups where you would find new Unix programs, notably the latest updates of USENET software.

        Binaries groups showed up a bit later, mostly after the great renaming, mostly for IBM PC Shareware and freeware binaries. No Warez or photos, not until a lot later.
    • Actually, I've done this. Back on my Amiga 500, I had no way to back up my *massive* (chuckle) 120 MB hard drive, so I used a VCR backup kit. It was a bugger to use, but it was the poor man's ass saver for sure!

      Ah the memories today...
    • Hell, a converted VCR using VHS as a backup medium can store like 100GB (saw one somewhere, I forget the link.)

      Assuming 9 Mbps of raw data (half the data rate of HDTV, because garden-variety VHS is nowhere near broadcast-quality), and assuming some heavy-duty error correction reducing effective data rate to 6 Mbps, VHS's SP mode records for 7200 seconds, giving 5 gigabytes on a tape at a bare minimum. (For comparison, a single-layer DVD holds about 4 1/2 GB.) If we go to EP mode, increase the bandwidth to S-VHS levels, and apply 3:1 text compression (common with deflation [gzip.org] of large Latin-alphabet texts, especially containing quoted material), we may be able to store even more data per tape.
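
The arithmetic in the estimate above can be checked in a few lines. The 6 Mbps effective rate, the 7200-second SP tape, and the 3:1 compression ratio are the parent's assumptions, and the 16.8 GB figure is the archive size quoted at the top of this thread; the function and variable names are just for illustration.

```python
def tape_bytes(effective_bits_per_second, seconds):
    """Bytes that fit on one tape at a given post-error-correction bit rate."""
    return effective_bits_per_second * seconds / 8

# 6 Mbps effective for a 7200-second SP recording.
sp_bytes = tape_bytes(6_000_000, 7200)

# With 3:1 text compression, the amount of original text per tape triples.
text_bytes = sp_bytes * 3

# How many such tapes the 16.8 GB archive would need (decimal gigabytes assumed).
tapes_for_archive = 16.8e9 / text_bytes
```

Under those assumptions one SP tape holds 5.4 GB raw, which matches the "5 gigabytes at a bare minimum" figure, and the whole archive would take only a little more than one compressed tape. Nowhere near 100 GB, but plenty for text.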

  • by BrookHarty ( 9119 ) on Monday January 07, 2002 @06:55PM (#2801417) Journal
    A lot of the good gurus are moving over to Slash-run message forums. Talking to a guy who is a Perl guru, he has moved most of his Perl help requests from Usenet to Perl Mongers. I've been seeing this trend in the last few years, as independent subjects are moving over to website-based forums. I even spend more time reading 5 mailing lists and a dozen message forums, and don't touch Usenet anymore.

    With these message forums and mailing lists not linked to a Usenet group, there is a lot of wasted knowledge that is not shared. I would love to see a slash-mod or some type of mailing list enhancement that posts an overview or some kind of daily message post to Usenet.

    The whole idea of Usenet was knowledge sharing, not binaries and spam ads. Glad Google has saved Usenet, but some effort is needed to start using it again.

    Hmm, maybe Slashdot should enhance a Usenet forum? Though 5-20,000 postings a day on a Usenet group might be a little much. Maybe only posts scored 2+ would make it into a moderated Usenet group.
    • Sorry, should have called the message topic
      Message forums (Slash) are killing off Usenet.
    • by geekotourist ( 80163 ) on Monday January 07, 2002 @09:27PM (#2802067) Journal
      I argue that there isn't a better way to do discussions:
      • Usenet is blazingly fast: text-only has that advantage. Works well for a 14.4 and a T1- good for more towns, not just the DSL'ed ones.
      • simultaneously local and international
      • It isn't dependent on one company's bandwidth or financial health.
      • less susceptible to censorship
      • a group doesn't have to be online at the same time (unlike chat), and threads can contain many layers of discussions without getting confusing (unlike mailing lists). The discussions can be complex, with room for step-by-step instructions or line-by-line critiques. There is time to stop and think about answers, and the discussion threads persist.

      Looking at the history, [google.com] the first big Usenet spams came at exactly the wrong time- and it badly twisted the subsequent development of the Web.

      Spam hurt Usenet by ruining it as a tourist destination right as mass tourism to the Web began. Long-time Usenet users couldn't recommend it to new Internet users ("Really, it's a great place, just ignore the trash and the noise and don't give your name because you'll get a zillion ugly mails afterward" doesn't work as tourist advice). And for existing users, reading Usenet meant wading through muck, and then, with address harvesting starting, a muck-filled mailbox. Between this and the constant interruption of irrelevant ads, people were driven out, the extra traffic made Usenet a burden to ISPs, old users went elsewhere, new users never came. While the rest of the web exploded, Usenet started its long fade.

      Arguing alternate history here, but if mass spam had hit much earlier or later, the damage wouldn't have been as bad, both to Usenet and to the Web overall. Had it been much earlier, perhaps the cancelbots and other technology responses to spam would've been well developed by the time the mass tourism started. "Let's ignore the problem and go somewhere else" isn't a solution when there is no 'else' to go to. Had it been much later, higher adoption rates for Usenet (as a % of all Web demand) would mean companies would need to take the Usenet model into account: people might've expected/demanded better spam solutions, more cross-website communications, and fewer walled gardens. People would've been less likely to accept 'the only protection you'll get is to stop posting and come to our walled-garden web discussion group' as a solution. Ditto with the loss of shell accounts and open relays.

  • by Kingfox ( 149377 ) on Monday January 07, 2002 @06:57PM (#2801425) Homepage Journal
    This is downright scary.
    Nothing like looking through the archive to see an old post from a skilled sysadmin friend asking a basic question in the wrong group years ago.
    Nothing like seeing delusional inane posts you wrote while in high school making you look like an utter twit.
    Nothing like seeing old usenet posts from friends who have died years ago. This is just too creepy for words.
  • by Large Green Mallard ( 31462 ) <lgm@theducks.org> on Monday January 07, 2002 @07:00PM (#2801436) Homepage
    Aside from his good works in the terms of Usenet, David is the reason I am where I am today. 4 years ago, I was stuck in Perth, Australia and very bored. I was reading the student newspaper one day and saw an article about student exchanges. To cut a long story short, 6 months later I was at The University of Western Ontario.

    I had looked over the courses they ran in Computer Science there, and saw one called "Unix and C". Being a bit of a geek and having used Unix a *tiny* bit in my high school days, I thought it would be a cool one to take. David was the lecturer for this course. He had a lot of knowledge and passion for the subject, which is unsurprising considering his experience with all manner of Unices. His classes for CS175a taught me a lot about Unix (and a little about C). I got 92% overall for the unit, an A+ and the highest mark I've ever got for any unit. The next semester I was at Western, I taught myself Perl, using an account on the CS Department servers and on the Reznet Linux box a friend had :)

    It was a unit for non comp-sci majors. CS Majors were expected to learn this stuff in a bunch of different classes.
    Sadly, Western no longer offers CS175a - Unix and C. I feel it is a loss to the community as a whole, but at the same time, I understand that a one-semester course in Unix and C probably isn't seen as too academic by many. Which I think is a shame. Too many universities turn out gimps fluent in one language, and one language only - Windows *shudder*. I think it sad that units to teach people how to click mice and use Word can get you academic credit, but Unix and C courses don't seem worthy enough to run.

    When my time was up in Canada, I came back to Australia and while I finished my degree, I made money on the side doing CGI scripts in Perl. Then, when my degree was finished, I applied for a job as a System Admin at a department at The University of Western.. Australia. It was the first job I applied for and I got a callback the morning after I had a 70 minute panel interview. Due, in large part, to the stuff I had learnt in David's class, I passed the interview quite well.

    Today, I am 22, earn over AU$40k, I get to play with lots of cool computing and network hardware, and I think it would be safe to say that if I hadn't taken that course with David, I wouldn't be where I am today. I suspect I would have been working as a security guard, making minimum wage, since my degree wasn't actually in Computer Science, but Security Studies. Thinking back, I'm pretty damn glad I did take it ;)

    David's homepage is here [csd.uwo.ca]
  • by pipeb0mb ( 60758 ) <pipeb0mb&pipebomb,net> on Monday January 07, 2002 @07:05PM (#2801468) Homepage
    Can you still download the archives? If so, where?
    All that info would be incredibly useful!

    What format do you think it would be in? Threaded text or database format or what? How would you read it or search it?

    Also, what do they do with the attachments? Imagine THAT archive. Heh heh heh.
  • ... would the USENET archives qualify? As compared with say MS development network which is the equivalent knowledgebase? Hate, love or indifferent, you cannot deny that MS has had a major influence on the growth of the PC sector and a large part of this success is their fanatical devotion to their developers (please no jokes about if you got them by the balls, their wallets will follow). USENET is a nice snapshot but is it something purposeful?

    I was just musing the other day about what would be the 7 wonders of the digital world ... personally I would consider the first choice to be the Gutenberg Project, which is low-key but represents the unflinching efforts of many, many experts and volunteers. Given the dissipation of the social contract w.r.t. modern copyright laws, the scope and vision of the originators can only be admired.

    Sure, GNU/Linux could be nominated but I'm a little ambivalent about it as the impact is mainly social (due to GPL and the contributors' belief in libre software). As a technical piece of work, is it on the same relative scale as the ancient wonders were in their heyday? We are talking global uniqueness, recognised by a wide population segment, and something difficult to duplicate here.

    LL
  • Anyone else think it would be a great idea to dump the whole thing to, say, 10-20 DVDs and sell it on the net?

    I'd buy it. There's a ton of knowledge in Usenet that I would love to grep.
  • by peterdaly ( 123554 ) <petedaly@ix[ ]tcom.com ['.ne' in gap]> on Monday January 07, 2002 @07:50PM (#2801670)
    I found some 7-8 year old posts I made when I was a teenager. I can't believe how cocky I was, and how poorly I wrote. Very few people ever replied to my posts, and I now understand why. I even found a "me too" (well, almost) post from myself. Wow, that's scary.

    I apologize to the whole Slashdot community for my teen cockiness in the mid 90's. I didn't mean what I said the way I said it...at least looking back.

    One good way to find your old posts is to search for your (old?) email address.

    -Pete
    • by Da VinMan ( 7669 )
      Heh... I'll second that. I'm not ashamed at anything I said, just amused. And I had no idea how many posts I put out there. :)
  • I'm tired of moderators moderating posts down because they disagree with the content, so I'm reposting this. Go ahead, moderate it down again, I have lots of points. But I suggest rather than having some gut reaction to this, you think about it and, if you disagree with it, post some reasoned response.

    I have been using USENET for 20 years, so I am affected by this, and I have seen USENET slowly fall apart. USENET was always a bit rough and had a lot of noise, but people did get to know each other personally and professionally. Today, USENET is nearly completely useless for any kind of social functions, and the huge expansion of people posting, anonymous/pseudonymous postings, and the need to post anonymously because of searchable archives is largely responsible. There is no forum like USENET was 20 years ago anymore.

    USENET used to be an informal discussion forum, like something where you might talk with others like you would around the water cooler. Google, AOL, and similar services have greatly expanded the user base for USENET, which means that it isn't much of a community anymore. And by archiving and republishing in perpetuity, thinking people have to watch carefully what they say, or they just post anonymously (or don't participate at all anymore).

    This was probably an unavoidable turn of events. Nevertheless, whether it is Google or some other company, I consider it wrong for them to republish this stuff, in particular as part of a commercial venture. It's the equivalent of digging out old security surveillance tapes and broadcasting them for the amusement of the masses. It's wrong, and the fact that people find some sort of voyeuristic delight in it doesn't change that. The backup tapes that Google used should have been destroyed.

  • by Quixote ( 154172 ) on Monday January 07, 2002 @08:48PM (#2801908) Homepage Journal

    OK: how long before a presidential candidate's Usenet postings will be dragged out for the whole world (US) to see ? :-)
    • My prediction: Never.

      Back when usenet was where the action was, (before http), all the future politicians were in law school. And the law school students were way off on the other side of the campus, and thought the compsci /engr students were dorks.

      And now, the only people that still post on Usenet are...

      Personally, I gave up on Usenet in the early 90's, after following the Clipper Chip debate on comp.org.eff.talk all summer.
  • From a Phil Karn comment in November, 1988...

    5. Making the source code generally available is perhaps *the* best way to prod the vendors into fixing *lots* of holes in their systems, not just the ones exploited by the worm.

    Face it, we all know how vendors behave -- everyone does the least work possible, subject to the vocalness of their customers' demands. Several people have already stated that they knew of the hole in sendmail for many years and they just chalked it up to the net being composed of benign people. Since it wasn't generally known (I didn't know about it, for example) there was no general cry to fix it, and it lay open long enough for Morris to come along and exploit it.

    6. I found it ironic to read that the elder Morris recently submitted a paper on UNIX security for publication, but his employer squelched it. Who knows what was in that paper? Perhaps, just perhaps, maybe it contained a description of the hole in sendmail, among other things. Perhaps, just perhaps, Robert Jr., learned of this hole from his dad. Perhaps if that paper had been published, people would have taken steps to protect themselves before the younger Morris had unleashed his worm.

    In sum: SECURITY THROUGH OBSCURITY JUST DOESN'T WORK!
  • OK look, I'm not suggesting anybody use this method to go stalking anybody or anything - but has anyone ever searched Google for those "send a dollar to the people on this list" spams where the message contains an address local to your area?

    Because I have. The search I did was something like:
    "this is totally legal" dollar [my home town]
    I found 20 or so people's names and addresses and looked up their names in the phone book. Of those 20, I only found 2 people whose names and street addresses matched what was in the spam.

    So... I called them. I asked if anybody had sent them money and if there had been any consequences. Neither one of them had any idea what I was talking about. They denied ever posting the spam. I even got the impression that they didn't know what usenet was.

    So, what do you make of that?
  • SAIL recovery (Score:3, Interesting)

    by Animats ( 122034 ) on Tuesday January 08, 2002 @01:30AM (#2802668) Homepage
    A few years ago, several Stanford CS alumni, including myself, did something like this for the archives of SAIL, the Stanford AI Lab system dating back to about 1970. Old backup tapes still existed, having been recopied around 1990 to 6250 BPI 2400' 0.5" open reel tape. We read in several hundred reels, using an old Sun 3 server. The data was transmitted to Bruce Baumgard at IBM Almaden Research (another Stanford CS alum), who converted it to Unicode (SAIL had a nonstandard character set with extra symbols) and sorted out the files.

    The original SAIL users were contacted, one by one, and offered CD-ROM copies of their files. Where the original users permit, their files will be made publicly available. The permission process is still going on, but the result will be an archive of the early days of AI.

  • by osswid ( 451334 ) on Tuesday January 08, 2002 @02:57AM (#2802819)
    Google is a private startup. They might still go out of business, or be bought by someone. Even if they have a successful IPO, these could still happen later.

    What happens to the archive when they're bought by someone else, or end up in bankruptcy court? Will it go the way of the online digital photo storing sites, vanishing one day without a trace, taking irreplaceable data -- data of immense academic historical interest -- with it?

    Google should promise to donate the archive to the Library of Congress, do the transfer now, and make a social contract with the net community to turn over the reins on this project if they're acquired or go out of business.

"Why should we subsidize intellectual curiosity?" -Ronald Reagan
