Education

Project Gutenberg's 32nd Birthday

David Moynihan writes "July 4th marks the 32nd anniversary of that day in 1971 when Michael Hart first sped an all-caps version of the Declaration of Independence to anyone and everyone then on what later became the web, thus founding Project Gutenberg. Thanks to an army of volunteers and the Distributed Proofreaders, this is the last year PG will have fewer than 10,000 titles. Strangely, Microsoft picked this dual anniversary of literacy and freedom to re-launch their Reader product, with three free bestsellers a week, if you activate the new version with Passport, sign a EULA, etc. Real reason for the upgrade might be that the DRM on MS's old Reader was cracked. If you're not into giving away data, or are running a system other than Windows, maybe you could take the time to tell a friend about free books online, or even help out by visiting the Distributed Proofers and editing one page per day."
This discussion has been archived. No new comments can be posted.

  • by mikeophile ( 647318 ) on Friday July 04, 2003 @01:05PM (#6368257)
    Seriously, awesome work people.
  • by ryants ( 310088 ) on Friday July 04, 2003 @01:08PM (#6368271)
    even help out by visiting the Distributed Proofers and editing one page per day.
    You can't seriously be asking Slashdotters to volunteer as proofreaders.
  • by Blaine Hilton ( 626259 ) * on Friday July 04, 2003 @01:08PM (#6368272) Homepage
    Now all we need is more people promoting this in schools and printing the books. Much like the IA Bookmobile [archive.org]. It seems like the people who could use this the most, don't even know it exists.
    • Yes, I can agree with this. We people here won't benefit from it half as much as needy school districts who could use the texts. Methinks what they really need to do is work on some awareness program, distributing the books to teachers... or even letting them know that such a resource exists. With more technology in the classroom, Gutenberg shouldn't be out of reach to many teachers.
  • doh (Score:1, Funny)

    by Anonymous Coward
    Download Error

    You'll need to install and activate the current version of Microsoft Reader before you can download these Owner-Exclusive titles.

    Click here to get started now.


    No Linux version!? Gah.
  • very timely for me (Score:5, Interesting)

    by b17bmbr ( 608864 ) on Friday July 04, 2003 @01:16PM (#6368315)
    i am going to be teaching modern civ next year in high school (i have been at the junior high for 7 years), and have already gone to the site and gotten works from aristotle, plato, locke, montesquieu, et al. thanks guys. there is still something to be said for a classical education. glad somebody is doing all they can to preserve the classics, especially with all the assaults on it from the social reconstructionists.
    • The Gutenberg people are doing all they can to preserve every book they can legally get their hands on. Personally, I'd like it if they could get their hands on some newer books.
  • by Tablizer ( 95088 ) on Friday July 04, 2003 @01:16PM (#6368317) Journal
    ...first sped an all-caps version of the Declaration of Independence to anyone and everyone then on what later became the web

    I knew it! This country was founded by COBOLers.
    • by Anonymous Coward
      ADD 1 TO POST-POINTS.
      MOVE "Funny" TO POST-STATUS.

      (That's Cobol, for those who don't know)
  • by jaemark ( 601833 ) on Friday July 04, 2003 @01:17PM (#6368322) Homepage
    There's really a problem though about getting the word out to people, in pretty much the same way the popularity of libraries today has been dropping. A good idea would be a separate advocacy site to come up with lists of texts in the project (i.e. What's New?, Most Popular, etc.) to help people wade in immediately.
    • by Cthefuture ( 665326 ) on Friday July 04, 2003 @01:45PM (#6368447)
      Yes, they need something like that badly.

      I remember poking around on PG not long ago but soon forgot about it.

      If you're not looking for something specific then the site is kinda, meh. As you suggested, they need a news site, ratings, and other stats so you can see what's available.

      And sections. "Technical", "Poetry", etc. Otherwise it's not very useful to the casual browser.
    • by Anonymous Coward

      Want to know what's new, etc? The Project Gutenberg website admittedly sucks, and their ASCII adherence admittedly verges on dogma, but there is a good substitute:

      The Online Books Page
      http://digital.library.upenn.edu/books/ [upenn.edu]

      It currently has 20,000 FREE titles listed, from hundreds (at least!) of sources, in all subjects, beautifully categorized by title, author and subject--and topped off by an up-to-date what's new listing and a fine search engine. Much props to John Mark Ockerbloom and the University

  • More free books (Score:5, Informative)

    by Cruciform ( 42896 ) on Friday July 04, 2003 @01:18PM (#6368324) Homepage
    The Baen Free library [baen.com] has a number of titles available in several formats.

    It's a great way to introduce readers to a series or a talented new author.
    • Talented? I checked that out and all I saw were hacky sci-fi authors whose titles clutter the sci-fi section of my local bookstores. I wouldn't pay money for those books.
  • by Chmarr ( 18662 ) on Friday July 04, 2003 @01:19PM (#6368332)
    Just on a whim, I decided to see how much cheaper titles in Microsoft Reader format were than physical books.

    I went to the MS Reader site and followed the links to the on-line publishers sites (such as B&N and amazon). In most cases, the reader format is only $1 cheaper, and sometimes $2 more expensive, than the corresponding paper book (soft or hardcover).

    So... why in the world would anyone want to use a format that ties them to the computer?? With a paperback, I can read it anywhere, read for as long as I want without having to change batteries, and even pass the book onto a friend.

    If they want to make the electronic formats more attractive, they need to make them a LOT cheaper than the corresponding paper version.
    • So... why in the world would anyone want to use a format that ties them to the computer?? With a paperback, I can read it anywhere, read for as long as I want without having to change batteries, and even pass the book onto a friend

      Well, I don't use MS-Reader myself (For commercial e-books I like the cross-platform Mobipocket), but a major reason I like e-books is I like to read them on my PDA -- not to save money. I carry my PDA around anyway, and having e-books means less to carry. I would purchase all m
    • Someone else mentioned the fact that he's got a reader with him all the time anyway, which makes it pretty convenient to have a book or three in there. I'm not going to bring a book around with me everywhere I go just on the off chance that I might get stuck in a long line, or waiting for someone. But when such an event happens, having good reading material right at hand is very nice. Also nice is being able to have a selection of books in there at any one time, just in case I finish one book while waiting s
    • I have never really used the reader; however, an advantage to having the book electronically is being able to search.
    • > and even pass the book onto a friend.

      Ahh but you are forgetting, in the USA, you cant do that.
      Well, you can, but then you are violating copyright and thus a criminal.

      The law specifically says you can not distribute a work that is copyrighted without the copyright holders permission.

      The only reason its not _illegal_ is because of fair use laws, but the DMCA removed most of those, and the next version of law change will no doubt remove most or all of the rest.

      Its only a matter of time if things dont
      • First sale doctrine (Score:3, Informative)

        by yerricde ( 125198 )

        The law specifically says you can not distribute a work that is copyrighted without the copyright holders permission.

        True, 17 USC 106 [cornell.edu] says that, but it limits itself "Subject to sections 107 through 121", such as 17 USC 109 [cornell.edu]:

        Notwithstanding the provisions of section 106(3), the owner of a particular copy or phonorecord lawfully made under this title, or any person authorized by such owner, is entitled, without the authority of the copyright owner, to sell or otherwise dispose of the possession of that

    • I went to the MS Reader site and followed the links to the on-line publishers sites (such as B&N and amazon). In most cases, the reader format is only $1 cheaper, and sometimes $2 more expensive, than the corresponding paper book (soft or hardcover).

      These facts being plainly obvious, the logical conclusion is either that A: The cost of setting up the Reader infrastructure is so high that these high prices must be charged to recoup them, or B: They want them to fail.

      I don't know which it is. But there
    • I agree that the eBook prices are too high. I've settled for reading the classics on my Handspring Visor.

      Check out Plucker Books [pluckerbooks.com]. These are Gutenberg books formatted for the Plucker reader.

      I still prefer a real book, but these come in handy when I'm feeding my infant son...bottle in one hand and Visor in the other.
    • Size (Score:3, Insightful)

      My entire CD collection fits in my pocket with my iPod. If I could fit my entire book collection in my pocket, that would be a dream and a delight.
  • Huh??? (Score:2, Insightful)

    by lilricky ( 632829 )
    "...to anyone and everyone then on what later became the web..." What?? In 1971 the HTTP protocol was around? Or is the author trying to suggest that the internet became the web? I thought the web was part of the internet, not a replacement for it. Perhaps I'm misreading the article.
    • Re:Huh??? (Score:3, Insightful)

      by dissy ( 172727 )
      > "...to anyone and everyone then on what later became the web..." What??

      I think they are saying in 1971 it was distributed to anyone and everyone...
      Then, on what later became the web, they distributed it there too.

      Keeping in mind the web ripped most of its ideas from gopher, and FTP before that, so the web wasn't a breakthrough idea out of nothingness.
      But I don't think they meant it as 'distributed on one medium, which later turned into the web'.

      That's at least how I believe it was supposed to
  • XML please (Score:4, Insightful)

    by DrXym ( 126579 ) on Friday July 04, 2003 @01:23PM (#6368352)
    Gutenberg is great and all, but it really needs to dump the text format. So much information is lost that it makes reading some texts extremely difficult. Some format that preserved chapter headings, footnotes, illustrations etc. would be a massive step forward.
    • Re:XML please (Score:5, Informative)

      by starseeker ( 141897 ) on Friday July 04, 2003 @01:37PM (#6368410) Homepage
      I think they discuss this somewhere. The whole point of ASCII is that it can be accessed simply, by almost any machine. It is as stable a format as you will find for data storage, anywhere. They are committed to these books being widely readable, and ASCII is the best way to assure this.

      However, I agree that some books (most actually) lose something in ASCII. What I would like to see is a project which works off the basic Gutenberg texts and formats them in a readable way, preserves illustrations, etc. But it should be an add on to the project, not the main project. Also, remember that that level of preservation is much harder than just typing in and proofreading - you have to consider formatting and scanning images as well.

      As a temporary measure, it would be nice to see someone do an XML markup that can be easily translated into LaTeX, so people can have pdfs with nice fonts, table of contents, title page, etc. That would be a step up. But to do it properly would take a separate effort, and a very large scale one even by Gutenberg standards. Worthwhile, yes. But involved.
      • a very large scale one even by Gutenberg standards

        I wonder. Does Gutenberg keep their sources in ASCII or something else that they run off to produce the ASCII final version? It might be that they already have formatting information that a smarter runoff process could use. (Heh, I can dream, right?)

        • Re:XML please (Score:4, Informative)

          by belbo ( 11799 ) on Friday July 04, 2003 @02:01PM (#6368529)
          The final ASCII version is also produced by hand. After two rounds of proofing, the text gets into a queue. From that queue, a 'post-processor' checks it out and reformats it according to the Gutenberg guidelines, along with any error corrections that might still be necessary. Then she or he uploads the final version to Project Gutenberg, where the 'whitewashers' check the text yet again before posting it to the archive.

          About the XML: You are in fact welcome to produce an XML version, I believe some fellows at DP indeed do that already. However, the main version is the simple text version, since you can read that with everything. But nothing keeps you from also posting an XML or PDF or TeX or whatever version.

          belbo, post-processor at DP

          (Boy I do hope there are no spelling errors in this *g*)
          • It seems a lot of work. (From here at a distance.) I know that even when they started there were tools for that sort of work. (I just found my DTSS RUNOFF*** manual, whee!)

            Ah well, if they have a standard way of formatting ASCII text then producing an XML version from it shouldn't be too daunting. (Me, once again from a distance.)

            But an automated translation to Klingon, priceless! (I'm joking, that would be daunting!)

            • Ah well, if they have a standard way of formatting ASCII text then producing an XML version from it shouldn't be too daunting. (Me, once again from a distance.)

              I would suggest reStructuredText [sourceforge.net], which doesn't look like markup but is.

      • Re:XML please (Score:5, Insightful)

        by fm6 ( 162816 ) on Friday July 04, 2003 @02:02PM (#6368533) Homepage Journal
        The whole point of ASCII is that it can be accessed simply, by almost any machine.
        Just because you store something in XML doesn't mean people have to use XML to read it. The whole point of XML is to have a format that you can easily transform. Transforming into ASCII is particularly easy.
        XML markup that can be easily translated into LaTeX
        If it's a good content-oriented XML app, it's easily transformed into LaTeX, or anything else. If it isn't a good content-oriented XML app (the StarOffice native format comes to mind) then it shouldn't be used for an online document repository.

        I think the basic problem with the Gutenberg/DP people is that they've been doing things a certain way for so long, and they don't want to retool. And I can see their point -- changing over to XML is a lot of work. And the core DP team already seems pretty busy keeping the web site going.

        On the other hand, I do wish they'd make it a priority. Right now I'm a volunteer proofreader, concentrating on getting out the famous Britannica 11th edition [wikipedia.org]. The amount of information that gets lost in scanning in Greek and other text with weird phonological conventions is just appalling. And the conventions for math and science formulas and equations produce a complex linear format I can't believe anyone would actually want to read.

        Then again, it wouldn't be that hard to go back and insert proper markup. For 90% of the text there's a simple transform between the Gutenberg conventions and a reasonable XML format. The other 10% probably need another look anyway, and wouldn't be hard to do if they've saved the scan images. I haven't had the heart to ask if they do.
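The "simple transform" idea above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `_underscore_` emphasis convention, the `<emph>` and `<foreign>` element names, and the function name `gutenberg_to_xml` are all hypothetical choices made for this sketch, not a format that Project Gutenberg or DP has defined. Only the `[Greek: ...]` notation appears elsewhere in this thread.

```python
import re
from xml.sax.saxutils import escape


def gutenberg_to_xml(paragraph: str) -> str:
    """Wrap one plain-text paragraph in hypothetical XML markup."""
    # Escape &, < and > first so the result stays well-formed XML.
    text = escape(paragraph)
    # Assumed convention: _underscored_ spans mark emphasis.
    text = re.sub(r"_([^_]+)_", r"<emph>\1</emph>", text)
    # [Greek: ...] transliteration notes become a tagged foreign phrase.
    text = re.sub(r"\[Greek: ([^\]]+)\]", r'<foreign lang="el">\1</foreign>', text)
    return f"<p>{text}</p>"


print(gutenberg_to_xml("He wrote _nous_ as [Greek: nous] & moved on."))
```

The remaining 10% (tables, formulas, ambiguous capitalisation) is exactly what a pattern-based pass like this cannot recover, which is why the scan images would still matter.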

        • Re:XML please (Score:3, Informative)

          by dvdeug ( 5033 )
          And the conventions for math and science formulas and equations produce a complex linear format I can't believe anyone would actually want to read.

          It's basically TeX, the one true math typesetting system. Most mathematicians and many scientists know it quite well. It beats the heck out of MathML (one example in a MathML tutorial was 8 characters in TeX, and about 50 in MathML.)
      • http://www.conglomerate.org/

        Lovely bit of kit.

      • They scan anyways - the proofreaders compare the ASCII version to the scanned image of a page to make sure they match.
    • Re:XML please (Score:3, Informative)

      by DarkOx ( 621550 )
      The entire point of the project is to preserve the content in a format that is both human and machine readable. If I don't have any software from the present fifteen years from now, and XML is long dead, I will still be able to read standard ASCII text, even if I am just cat-ing it through less or printing it as is. I can't reasonably read a book that is filled with XML tags, and if there is no longer software to parse them then it's not too useful. I am not saying that it would be hard to write such softw
      • Re:XML please (Score:4, Insightful)

        by Eloquence ( 144160 ) on Friday July 04, 2003 @01:50PM (#6368480)
        I can't reasonably read a book that is filled with XML tags, and if there is no longer software to parse them then it's not too useful.

        This is complete bullshit. With a proper setup you would convert the source into multiple output formats, including TXT, but you would keep the source in a format that maintains meta information such as formatting, chapters and pages. XML is used in the entire industry exactly with the expectation that it will be around for decades. Even if it won't, the open source code that we have to parse it will not magically disappear -- PG would keep using it to generate output texts from the XML source through all these years. You might as well argue that ASCII will go away.

        • Re:XML please (Score:2, Insightful)

          by GigsVT ( 208848 )
          With a proper setup you could read MS Word 2000 docs 100 years from now too. The whole point is to not make it reliant on any particular software, or any particular fad.

          XML hasn't been around long enough to say whether it is a fad or not. ASCII has been around longer than most of us have existed.
          • This is just wrong (Score:2, Interesting)

            by Anonymous Coward
            XML is not a character encoding. XML does not require the use of non-ASCII characters. What can be represented by an XML document is a superset of what can be represented by a plain ASCII document. XML is a human-readable markup.

            MS Word 2000 .doc is a binary format.

            I suspect that you have very little idea what you are talking about.

            PG already uses XML-like markup to indicate an emphasized portion of a passage, among other things. If we were to accept your argument, then even this alone should be seen as
        • Re:XML please (Score:3, Insightful)

          by Mr. Piddle ( 567882 )
          You might as well argue that ASCII will go away.

          ASCII is simply 127 or 255 characters or so. Writing software to translate it is trivial, and it can even be decoded by hand, if necessary.

          XML adds a lot of complexity beyond this, which hampers a person's ability to read a file with practically no software tools.

          Also, XML is not as ubiquitous as you think, and huge numbers of people don't know how to use the tools to work with it.
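To make the "decoded by hand" point concrete, here is a minimal Python sketch of an ASCII decoder built from nothing but a lookup table. For reference, standard 7-bit ASCII defines 128 code points: 33 control codes (0-31 and 127) and 95 printable characters (32-126, counting the space). The names `PRINTABLE` and `decode_ascii` are made up for this example.

```python
# Lookup table for the 95 printable ASCII characters (codes 32..126).
PRINTABLE = {code: chr(code) for code in range(32, 127)}

# Control codes that commonly appear in plain-text files.
WHITESPACE = {9: "\t", 10: "\n", 13: "\r"}


def decode_ascii(raw: bytes) -> str:
    """Decode bytes as 7-bit ASCII, replacing anything else with '?'."""
    out = []
    for byte in raw:
        if byte in PRINTABLE:
            out.append(PRINTABLE[byte])
        elif byte in WHITESPACE:
            out.append(WHITESPACE[byte])
        else:
            # Other control codes and any 8-bit value are not plain ASCII text.
            out.append("?")
    return "".join(out)


print(decode_ascii(bytes([72, 101, 108, 108, 111])))
```

Note that any byte above 127 falls through to the `?` branch: such bytes are not ASCII at all, and which character they stand for depends on the code page in use.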
      • Heh, XML: The BSD of Markup Languages! :)
      • Re:XML please (Score:3, Insightful)

        by Vann_v2 ( 213760 )
        With some works the layout itself is an important part in comprehending them. To blindly remove the formatting so that everyone can read it is an injustice to the original author.
      • Re:XML please (Score:4, Insightful)

        by DrXym ( 126579 ) on Friday July 04, 2003 @02:10PM (#6368568)
        Yeah but the entire point of XML is that it defines structure not presentation. If you want to go off and produce something which is readable in some other format (e.g. text), feed the document through some XSL transformation or perl script and it pops out the other end in any way you desire. Someone else can feed it through something that produces a PDF, someone else a Palm e-Book, someone else braille. And this can all be automated on the server. Everyone is happy.


        As for XML being long dead, this is highly unlikely. XML is just structured data and is itself just text. It would be trivial 5, 10, or even 100 years from now to pull out the data from the xml format in any way you please. Unless the grammar is horribly mangled (MS Office), it would even be possible to infer it without even knowing the grammar. I would trust Gutenberg to collectively come up with a format which would be simple for proof readers and parsers alike.
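A minimal sketch of the "one source, many outputs" workflow described above, using only Python's standard library. The element names (`book`, `title`, `chapter`, `heading`, `p`, `emph`) are invented for this illustration and do not correspond to any actual Gutenberg schema. The plain-text renderer upper-cases emphasis, mirroring the ALL CAPS convention mentioned elsewhere in this thread.

```python
import xml.etree.ElementTree as ET

# Hypothetical structural markup; the sample text has no characters
# that would need escaping in HTML.
SOURCE = """<book>
  <title>A Sample Book</title>
  <chapter>
    <heading>Chapter I</heading>
    <p>It was a <emph>very</emph> cold day.</p>
  </chapter>
</book>"""


def render_inline(elem, emphasize):
    """Flatten an element's mixed content, applying `emphasize` to <emph>."""
    parts = [elem.text or ""]
    for child in elem:
        inner = render_inline(child, emphasize)
        parts.append(emphasize(inner) if child.tag == "emph" else inner)
        parts.append(child.tail or "")
    return "".join(parts)


def to_text(book):
    """Plain-text output: emphasis becomes ALL CAPS."""
    lines = [book.findtext("title").upper(), ""]
    for chapter in book.iter("chapter"):
        lines += [chapter.findtext("heading"), ""]
        for p in chapter.iter("p"):
            lines += [render_inline(p, str.upper), ""]
    return "\n".join(lines)


def to_html(book):
    """HTML output: emphasis becomes <i>...</i>."""
    out = [f"<h1>{book.findtext('title')}</h1>"]
    for chapter in book.iter("chapter"):
        out.append(f"<h2>{chapter.findtext('heading')}</h2>")
        for p in chapter.iter("p"):
            out.append("<p>" + render_inline(p, lambda s: f"<i>{s}</i>") + "</p>")
    return "\n".join(out)


book = ET.fromstring(SOURCE)
print(to_text(book))
print(to_html(book))
```

Both outputs come from the same source file, so the structural information is kept once and discarded only at rendering time.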

    • Re:XML please (Score:5, Informative)

      by Teancum ( 67324 ) <robert_horning@netz e r o . n et> on Friday July 04, 2003 @01:56PM (#6368511) Homepage Journal
      Michael Hart has repeatedly made mention that he does not want to get caught up into the fad of the moment with text formatting issues, and that plain old ASCII is one constant that hasn't needed changing. Indeed, you can open up the original Declaration of Independence document with your standard web browser, and you can still read it just fine. I dare you to try and find any other data format that was commonly used 32 years ago that you can still read with current equipment.

      With that said, I believe that XML is perhaps going to have the staying power that ASCII text has had for the past many years. And there are many volunteer projects that you can get involved with that do this including:

      The HTML Writers Guild [hwg.org] - Originally they were trying to convert all of the Gutenberg texts to HTML, which has admittedly been a reasonable standard for a good number of years. Currently they are going to a version of XML with some standard headings for titles, copyright info (or lack thereof), chapter headings and so forth. More is on this website.

      Project Gutenberg XML [pgxml.org] - This is a group more dedicated to the XML, but has a very similar purpose.

      The point here is that once the data is put into ASCII text format, projects like this can and are being done. If you really feel that you want to help with the effort, please join one of these. Also, at any time you can also take the Project Gutenberg files yourself and do this, but at least this gives you a forum to share your work once you are done.
      • The thing is, XML is just plain ascii too (assuming you mandate not to use Unicode or some weird charset), so therefore you're not reducing the ability of people to read the text. At worst they'd be inconvenienced by extra tags if they tried to read it raw, but then again they wouldn't have to.

        The reason for this is XML is easily translatable into just about anything else that the grammar allows for. So I don't see it would make any difference to the project goals if the 'master copy' for every document w

      • One of the advantages of XML is that it's very easily transformable. If Project Gutenberg were to produce XML texts, it'd be trivial for them to automatically convert them to plain ASCII and make that version available as well.
      • Re:XML please (Score:5, Insightful)

        by fm6 ( 162816 ) on Friday July 04, 2003 @03:59PM (#6369068) Homepage Journal

        ... that plain old ASCII is one constant that hasn't needed changing.

        I think you're a little unclear as to what ASCII is. As the "A" in "ASCII" indicates, it's oriented towards American applications. And it consists of a mere 128 characters, which include 33 control characters that you don't use in text.

        In point of fact, Project Gutenberg has long outgrown the 95 printable characters in ASCII, though I think they themselves are ignorant of the fact. They seem to have experimented with characters until they found a set that displays the same on "normal" Windows, Macs and Unix/Linux. The result is something they call "extended ASCII" but that's actually a subset of both ISO's Latin1 character set [czyborra.com] and Microsoft's Latin1 code page [microsoft.com].

        When is this an issue? Well, I'm a DP volunteer, and I'm concentrating on the Britannica 11th edition. Lots of geographic entries, all of which contain degree symbols. This symbol is not in ASCII! If you follow the DP instructions, you end up entering byte 186 (decimal). If you're using the ISO or Microsoft Latin1 set (and if your computer is localized for the U.S., Canada, or Western Europe, you probably are) then 186 does in fact display as a degree symbol. But if your system is localized for Eastern Europe, you're probably using Latin2, and this byte stands for an S with a cedilla accent!

        In short, "ASCII" is actually less universal than well-formed HTML. In which you represent the degree symbol with a character entity (&deg;) that's the same everywhere.
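The code-page dependence described above is easy to check with Python's built-in codecs. This is a small sketch; the variable names are arbitrary. One detail worth noting: in ISO Latin-1, byte 186 is strictly the masculine ordinal indicator (º), which merely looks like a degree sign, while the degree sign proper (°) is byte 176. The underlying point still holds, since the same byte decodes to a different character under Latin-2.

```python
import html

raw = bytes([186])

# The same single byte, interpreted under two different 8-bit code pages.
as_latin1 = raw.decode("iso8859_1")  # U+00BA, masculine ordinal indicator
as_latin2 = raw.decode("iso8859_2")  # U+015F, small s with cedilla

# An HTML character entity names the character itself, independent of
# whatever code page the reader's system happens to use.
from_entity = html.unescape("&deg;")  # U+00B0, degree sign

for label, ch in [("Latin-1", as_latin1), ("Latin-2", as_latin2), ("&deg;", from_entity)]:
    print(f"{label}: {ch!r} (U+{ord(ch):04X})")
```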

        Indeed, you can open up the original Declaration of Independence document with your standard web browser, and you can still read it just fine.

        Hardly a representative example. The Declaration of Independence [archives.gov] was hand-written, and thus doesn't include a lot of fancy fonts or formatting. A better example is a contemporary novel, such as 1984.

        As it happens I just finished re-reading this one. I read a Plucker [plkr.org] file that somebody had transformed from an HTML version [adelaide.edu.au], which in turn came from the Project Gutenberg "ASCII" version. Readable enough. But all the typographic niceties -- italics, boldface, etc. -- were reduced to ALL CAPS in the text version, and that was retained in the HTML version. Pretty distracting -- made me feel like somebody was shouting at me. Double Plus Ungood! Thoughtcrime!

        ...once the data is put into ASCII text format, projects like this [XML] can and are being done.

        You make it sound easy. A lot of information is lost when your primary version is "ASCII". It all has to be put back by hand. There's no avoiding this for the large body of existing Gutenberg texts. And of course as recently as 5 years ago, there wasn't a real choice anyway. Even HTML had issues, and serious XML tools didn't exist.

        But now XML technology is pretty mature. It makes sense to store new Gutenberg texts in XML. If people still want "ASCII" copies, the XML is easily transformed into that. Though I think a lot more people will want the HTML version -- a format which is actually accessible to more people than "ASCII".

        There are two reasons this won't happen soon.

        The first is that somebody will have to design and implement the necessary XML apps for inputting and proofreading the texts. (Which would also eliminate a lot of the errors proofreaders make, like entering [Greek: Tau] when they mean [Greek: T].) A huge project. As it stands, the people who maintain the DP web site have their work cut out just to keep the existing software working. That's a vali

        • In point of fact, Project Gutenberg has long outgrown the 95 printable characters in ASCII, though I think they themselves are ignorant of the fact.

          Then I invite you to actually take a look at some of the texts. The Gutenberg people know quite well when they're using ASCII and when they're using Latin-1. If you'll look at the books that are posted, some of the books are posted just in ASCII, and some in 7foo.txt and 8foo.txt files, where 7foo is ASCII and 8foo is Latin-1, and a few just in Latin-1, and the
      • Re:XML please (Score:3, Insightful)

        by jeremyp ( 130771 )

        Using ASCII presupposes that all the important texts you want to preserve are in American English. Since a fair amount of the important pieces of literature come from mainland Europe (actually even the British £ sign isn't in ASCII), it is clearly not up to the job and should be replaced.

        Further, authors often use devices like italics or bold to add emphasis to their work and nowadays even completely different fonts and typefaces. Translating these works to ASCII with no markup actually destroys so

        • A sterling mistake (Score:3, Insightful)

          by fm6 ( 162816 )

          Since a fair amount of the important pieces of literature come from mainland Europe (actually even the British £ sign isn't in ASCII), it is clearly not up to the job and should be replaced.

          As a matter of fact, the DP web interface allows you to enter the pound sterling symbol even if you don't have it on your keyboard. It also has a lot of accented characters that aren't in English. The fact is that the Gutenberg people think they're using ASCII, but are actually using Latin1. So Gutenberg texts wi

          • by dvdeug ( 5033 )
            The fact is that the Gutenberg people think they're using ASCII, but are actually using Latin1. So Gutenberg texts will display correctly on any system that's localized for the U.S., Canada, or Western Europe. But not elsewhere.

            Excuse me? The Gutenberg people know quite well when they're using ASCII and when they're using Latin-1. If you'll look at the books that are posted, some of the books posted from DP are posted just in ASCII, and some in 7foo.txt and 8foo.txt files, where 7foo is ASCII and 8foo is
            • I hadn't noticed that. But that convention isn't followed consistently. Of the last 10 files posted from DP, only 7 follow this convention. And I haven't seen it documented anywhere.

              I shouldn't have spoken categorically about the Gutenberg people. Somebody is aware of this issue, because recent posts from DP say "Character set encoding: ISO-Latin-1", which I guess is some help. My assumption of ignorance was based on the DP Proofing Guidelines [pgdp.net], which refers to 8-bit characters as "Upper ASCII". But I gues

          • Well I did preview it and it looked OK so that was good enough for me although technically a mistake since I was using HTML mode.

            However, even latin-1 does not have the complete range of characters in use by all writing systems based on the Latin alphabet and you're totally screwed if you want to preserve the Iliad or the Bible (to pick two random texts) in the original. Also, to do bold and italics etc you need some sort of markup - so it might as well be XML or HTML.
            • Actually, you didn't make any mistakes with your input, and I shouldn't have implied that you did.

              This all comes down to a simple misunderstanding: people use "ASCII" and "text" interchangeably. Nine times out of 10, when you hear somebody talking about ASCII, they're really talking about Latin1. Usually, this mistake doesn't really matter. But this time it did: The guy who was defending Gutenberg's use of "ASCII" managed to imply that Gutenberg uses an American character set. Which was why you flamed him

      • by gotem ( 678274 )
        I dare you to try and find any other data format that was commonly used 32 years ago that you can still read with current equipment.

        punchcards.. what? you mean you don't have your punchcard reader connected?
        • I actually took the time to sit down and learn how to read punchcards from just their hole patterns (which isn't too difficult compared to reading data files directly from a hex editor if you have to dig into why a program isn't reading a certain file correctly).

          I saw some punchcard machines come into the local thrift store a couple of years ago, but I think it would be hard to find one now.

          The nice advantage that punch cards have over just about every other data storage medium is that as long as
    • Gutenberg is great and all, but it really needs to dump the text format. So much information is lost that it makes reading some texts extremely difficult.

      The guy that runs the scheme is a bigot on this issue. He has some weird issue with the Web as a competitor to what he sees as his domain.

      Use of a lightweight markup of any sort would improve the value of the texts. Even if they invented their own markup it would be an improvement.

      Archeologists have managed to decipher the Mayan hieroglyphs, even li

  • by Faust7 ( 314817 ) on Friday July 04, 2003 @01:23PM (#6368354) Homepage
    I absorb all information directly through a USB link from my laptop to my head. Pretty nice, except for the typographical migraines. I always have ibuprofen in hand when visiting Slashdot.
    • Man, that's going to be a nasty upgrade to USB 2...
    • I get all the information I need (and more) from "reading" lamb livers (all the Universe is reflected in even its tiniest fragment, you only have to look hard enough). On most days though, I have to resort to using tea leaves (as there aren't too many sheep left in a 20 mile radius) but tea leaves have lower bandwidth and they generate more errors (mostly typos, but when reading Slashdot, I occasionally experience a kind of deja vu). I post to Slashdot by using complicated black magic (it includes drawing sev
  • by shadowbearer ( 554144 ) on Friday July 04, 2003 @01:24PM (#6368363) Homepage Journal
    I like what happens when you run across a title which isn't on the site. [ibiblio.org]

    Example: "It's not there, eh? -- Canadian"

    Heh.

    SB
  • Too bad... (Score:5, Interesting)

    by Insurgent2 ( 615836 ) on Friday July 04, 2003 @01:26PM (#6368370)
    Unfortunately, with the copyright periods being extended so long, the material will only be of (ancient) historical interest. The 98 percent of copyrighted works that are unpublished and should be on there, unfortunately, get to sit collecting dust instead of benefitting mankind.
  • Business Model (Score:1, Interesting)

    by AndroidCat ( 229562 )
    1. Gather great PD books.
    2. Hard work to put them in computer form.
    3. ????
    4. Profit! (For all humanity.)

    Hip-Hip-Hooray for a job well done!

  • MS Reader is crapola (Score:3, Interesting)

    by blair1q ( 305137 ) on Friday July 04, 2003 @01:41PM (#6368437) Journal
    "cannot open this title on a Terminal Services session"

    What bollocks. Free software and free books but you can't read them over a network link to your own compute server? Microsoft, as usual, screws the pooch.

    Now. How do I uninstall this without removing my adenoids?
  • This is why copyrights shouldn't be more than 25 years.

    I say, make 'em 10 years renewable up to 50 (and non-transferable).

    If only there were more works there like, er, hmm, Roald "Charlie & the Chocolate Factory"/"Matilda"/"The Witches" Dahl. :}

    Meh, well, better than nothing. Too bad though they don't have the Tomson New Testament of 1576 [tripod.com].

    -uso.
    • I say, make 'em 10 years renewable up to 50

      Even better is the suggestion that anything out of print becomes public domain.

      Copyright holders shouldn't be able to use their copyright to make something inaccessible to the rest of us.

  • Greenstone (Score:5, Interesting)

    by gmaestro ( 316742 ) <{moc.liamg} {ta} {yrdiug.nosaj}> on Friday July 04, 2003 @01:45PM (#6368451)

    Great to see a project like this run on Free software. Read more at Greenstone's website [greenstone.org].

  • From the disclaimer/header on Project Gutenberg files:

    If you have an FTP program (or emulator), please
    FTP directly to the Project Gutenberg archives:
    [Mac users, do NOT point and click. . .type]


    Given that a) Macs, being Unix-based, have command-line FTP like everybody else and b) the idea of a point-and-click interface has now passed so far from being a bizarre and contemptible innovation that lots of people are trying hard to develop nice-looking Linux GUIs... ... isn't this snarky instruction now more
  • by tie_guy_matt ( 176397 ) on Friday July 04, 2003 @02:33PM (#6368692)
    Putting a flag on your front porch is a great way to celebrate the 4th of July. An even better way to celebrate the United States' birthday would be to go to this site and actually read the documents that define us as a country.

    In this day and age when it seems everyone is a suspected terrorist and our liberties are stripped one by one in the name of homeland security, and in the name of the rights of large companies, I wish some of our elected officials would actually read these documents sometime.

    A red white and blue flag isn't what makes this country great, nor does an extremely high gross domestic product -- it is the set of ideas that were written over 200 years ago that makes the USA great.

    So everyone go to this site and read those documents. Even if you aren't American you should still read those documents because everyone has the right to the freedoms that our founding fathers wrote about.
  • by gbnewby ( 74175 ) on Friday July 04, 2003 @02:58PM (#6368789) Homepage
    Thanks to everyone who has helped contribute eBooks and other support to Project Gutenberg! If you haven't already, please visit Distributed Proofreaders [pgdp.net] and proof a page today!

    Lots of plans for the future:

    • Post-#10000 formatting changes. We'll be rearranging our directories to make it easier to find things. Likely we'll go with something OAI (OpenArchives.org) compliant
    • Conversion on the fly to many formats. We'll be putting eBooks into XML format (mostly using teixlite.dtd, we think) for conversion on the fly to many other formats.
    • New ways to donate. "Sponsor a book"
    • More contemporary content. We receive donations nearly every week from currently published authors who want to make their stuff available to a wider audience (i.e., our Doctorow's Down and Out [ibiblio.org])
    • Your ideas! Visit gutenberg.net [gutenberg.net] to sign up for newsletters, find out how to get started producing an eBook, and find eBooks


    Thanks especially to our main and backup distribution sites, iBiblio [ibiblio.org] and The Internet Archive [archive.org]. And thanks to the THOUSANDS of volunteers who have brought us nearly to our 10,000th eBook.



    Dr. Gregory B. Newby

    Chief Executive and Director

    Project Gutenberg Literary Archive Foundation
    http://gutenberg.net

    A 501(c)(3) not-for-profit organization with EIN 64-6221541

    gbnewby@pglaf.org

    • > We'll be putting eBooks into XML format

      This one has my vote. Good move! Thanks for running PG.

      I know it is complicated, but is it worth also publishing a style sheet for each work, which can be used to replicate the 'look and feel' of the original? It shouldn't interfere with the aims of readability, as one is free to ignore the style sheet and just read the raw XML or text file.

      (from a Distributed Proofreader)

  • I just looked over the links in earlier replies (PGXML and HTML-Writers) and was surprised: HTML-Writers has only converted 20-odd etexts, from Jan to Feb 2000; and PGXML hasn't even the ability to do valid HTML curled quotes.

    Both look like amateur do-gooders, and we need more of those; but these efforts should be folded back into the organisation of PG, where they may find a permanent home. The alternative is to go adrift, due to too few people being involved (only _two_ people do PGXML) to rou

  • by evilviper ( 135110 ) on Friday July 04, 2003 @10:27PM (#6370651) Journal
    Here's what I did...

    A while back, I used wget to mirror the entire Project Gutenberg works. (I did it off-hours, and contacted them to see if it was a problem, or if there was some other more efficient way to do things)

    Anyhow, with my GBs of text, I used bzip2 -9 to compress each text file. In the end, the entire collection of PG was able to fit on one CD. Since most people don't have bzip2 support I also included the free archiver, Ultimate Zip [ultimatezip.com] on the CD as well. I also put a read-me on the CD (that would appear as the first file) with basic instructions what to do.
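
    For anyone wanting to repeat the per-file compression step without the bzip2 command line, here is a rough sketch using Python's standard bz2 module at the same level 9. The function name and directory layout are made up for illustration; this is not the script that was actually used.

```python
import bz2
from pathlib import Path


def compress_texts(src_dir: str, dst_dir: str) -> int:
    """Compress every .txt file under src_dir into dst_dir as <name>.txt.bz2.

    Hypothetical helper: mirrors the source directory layout and
    returns the number of files written.
    """
    src = Path(src_dir)
    dst = Path(dst_dir)
    count = 0
    for path in sorted(src.rglob("*.txt")):
        # Keep the relative path so files with the same name in
        # different subdirectories do not overwrite each other.
        rel = path.relative_to(src)
        out = dst / rel.with_name(rel.name + ".bz2")
        out.parent.mkdir(parents=True, exist_ok=True)
        # compresslevel=9 corresponds to `bzip2 -9` (largest block size).
        out.write_bytes(bz2.compress(path.read_bytes(), compresslevel=9))
        count += 1
    return count
```

    The resulting .bz2 files should open with the ordinary bzip2 tool or any archiver that understands the format.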

    One of the great things about CDs is how easy they are to transfer... One stamp, and a 5cent CD envelope, and you can send 2 CDs anywhere in the country (this predated Netflix AFAIK).

    Anyhow, I sent these CDs to two different people, and the next time I talked with them, I found out they'd made several copies of it. Basically, they heard someone mention some subject that related to one of the files on the CD, brought up the CD, and offered to make a copy for them. This happened a few times that I know of, and quite possibly many times that I don't know of. Quite an easy way to spread the word.

    Of course, with that said, I don't read the PG texts myself... There are two reasons. The first is that I have yet to come across decent software designed for long-term reading. Something that saves your place (automatically?), something with a legible font, and something with light colored text on a dark background, which brings me to my next point...

    The second reason is that monitors are all backlit... That means, reading on a computer screen is like reading text on a fluorescent lightbulb. It's possible for a while, but your eyes are quickly fatigued. The only screen I have that doesn't do that is my 640x240 B&W LCD screen on my Psion handheld. As good as that is, it's just too small for effective reading. Someone needs to create a non-backlit LCD screen, approx 6" (about the size of a book page) that is small, light, silent, compatible with everything, and most importantly, it needs to have good software that makes reading less work than it normally is on a computer... Until then, electronic reading isn't going to really be feasible. Screw electronic paper, just give me a screen that doesn't hurt my eyes, and I'm set to go.
  • Does this mean that the Declaration of Independence is the first spam?

