Most of you have probably heard of Project Gutenberg, which preserves texts that have finally entered the public domain as part of our common heritage.
A spin-off project is PGDP: Project Gutenberg Distributed Proofreaders (http://www.pgdp.net/c).
It allows you to use a convenient web interface to proofread one page at a time.
At the moment, the queue for new, freshly OCR-ed books (the "P1" queue) contains the following 1342-page English book:
A Dictionary of Arts, Manufactures and Mines
containing a clear exposition of their principles and practice
ANDREW URE, MD.
F.R.S., M.G.S., M.A.S. Lond., M. Acad. N.S. Philad., S. Ph. Soc. Germ., Hanov., Mulii. etc. etc.
Illustrated with twelve hundred and forty engravings on wood
Exactly as the title describes: A 1440 page compendium of the arts and sciences of 1840. Reflecting the major industries of the day, articles concerned with the manufacture of cloth are many and lengthy, while articles on any applications of electricity are absent: even the voltaic pile had little application at this time in any field except telegraphy (also not mentioned).
Proofing: Relatively straightforward. The text is entirely in English apart from the foreign names listed for materials and minerals, and has generally OCRed quite well, considering the age and condition of the pages.
Numbers will need special attention:
# The decimal point has frequently OCRed as the hyphen character
# Zeroes are frequently OCRed as capital Os.
A number of somewhat archaic spellings are used; don't update or correct these in any way. There are a few (very few) simple equations.
I can add to this that the tables are absolutely horrid and are best done off-line with vi or sed or something like that (something Slashdotters might be more adept with than the average proofreader?).
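For the number problems listed above (decimal points read as hyphens, zeroes read as capital Os), an off-line pass could look something like the Python sketch below. It is only a rough idea, not anything PGDP provides: the function names `suggest_number_fixes` and `review` are made up for illustration, and the rules are deliberately naive. A genuine range such as "1830-40" or a formula like "O2" would be changed wrongly, so treat the output as suggestions to check against the page scan, never as automatic corrections.

```python
import re

# A capital O directly next to a digit is probably a misread zero.
O_NEXT_TO_DIGIT = re.compile(r"(?<=\d)O|O(?=\d)")
# A hyphen squeezed between two digits is possibly a misread decimal point.
HYPHEN_BETWEEN_DIGITS = re.compile(r"(?<=\d)-(?=\d)")


def suggest_number_fixes(line):
    """Return a suggested correction for one line of OCR text."""
    fixed = line
    previous = None
    # Repeat the O -> 0 pass until nothing changes, so that runs
    # such as "1OO" are converted completely.
    while fixed != previous:
        previous = fixed
        fixed = O_NEXT_TO_DIGIT.sub("0", fixed)
    # Only then look at hyphens, so "3-O5" is seen as "3-05" first.
    return HYPHEN_BETWEEN_DIGITS.sub(".", fixed)


def review(lines):
    """Yield (line number, original, suggestion) for lines that would change."""
    for number, line in enumerate(lines, start=1):
        suggestion = suggest_number_fixes(line)
        if suggestion != line:
            yield number, line, suggestion
```

Running `review` over the lines of a page gives a short list of candidates to compare with the scan by eye; words such as "Ore" are left alone because the O is not next to a digit.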
If you want to help preserve this large work for the world, and if you have a good eye for detail and don't get bored quickly, why not sign up as a volunteer, study the proofreading summary carefully (it's 2 pages), do a few dozen pages of the books marked "beginner", and then help proofread this technology dictionary!
This is why I am posting this information here on Slashdot:
If you read Slashdot, you must have too much time on your hands anyway, so you might as well do something that is instructive for you as well as useful for other world citizens.
A lot of the basic technology from the 1840s has *not* changed much in the past 160 years.
But where can you find, nowadays, a concise summary of how to drill your own artesian well, test which mosses are chemically useful for making "archil" clothes dye, etc. etc.? OK, the difference between an atom and a molecule wasn't clear yet (e.g. HO instead of H2O), but believe me, it's a fascinating read. And I'm only at the letter A so far.
When this book is finally finished in PGDP (which could take a few years), it goes to Project Gutenberg, which is widely popular. Millions of people could access it if they download the Gutenberg DVD.
The two points I'm getting at are the following:
- If any part of our world hasn't reached the technology level of 1840 yet, this can be a valuable free reference work (yes, I know it's in English, contains a lot of jargon, and the science is outdated). Think: bzip2 compressed text file on your OLPC computer?
- Should any part of our world be thrown back to pre-1840 living conditions, this can be a valuable free reference work. Think: technology bootstrap? (yes, I read too much science fiction).
What do you think?