Comment Re:diybookscanner.org forum (Score 1) 122
I built a primitive single-camera scanner using a cardboard box, a piece of glass, and a point-and-shoot camera and tripod I had handy, after reading that site. It was a pain lifting the glass to turn the page, and I spent a lot of time trying different lights (and locations of lights), but it worked well enough that I decided to pursue it seriously.
I started looking at building one of the better scanners using plans on that site. But after a lot of time thinking about it, and reading about the many decisions that went into the Archivist scanner kits, I finally decided to just buy one of their kits. Yes, it was $1200, but in the end I decided that just getting good quality wood, glass, and lights wouldn't be all that cheap, and I don't have a lot of free time, so it was worth it to me.
I got a couple of refurbished cameras from Canon, and a Raspberry Pi to drive the cameras. I use the SpreadPi software and connect to it via a web browser on my laptop (I don't have room for a monitor/keyboard near the scanner, but can lay my laptop on a nearby bed). I trigger the cameras with a super-cheap foot pedal.
I can scan about 1300 pages/hour (about twice as fast as when I started several months ago). SpreadPi then lets me download a tar file containing the JPG images through the web browser. I then spend about 2 minutes per book opening a few pages in Gimp (the front cover, back cover, one even page, and one odd page) to determine cropping regions, then use a Perl script that calls ImageMagick to crop and rescale everything and stitch it into a PDF. I reduce the image size a bit to cut file size without compromising readability much. I also convert to a grayscale colormap for B&W books to further reduce file size.
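For anyone curious what that post-processing step might look like, here is a rough sketch in Python rather than Perl. It is not my actual script. The crop numbers, the function names, and the assumption that the sorted JPG filenames alternate odd/even pages between the covers are all made up for illustration. It also assumes the ImageMagick `convert` command is on the PATH (newer installs may call it `magick` instead).

```python
import subprocess
from pathlib import Path

# Hypothetical crop regions as (width, height, x_offset, y_offset) in pixels,
# the kind of numbers you would read off a selection in Gimp.
CROPS = {
    "front": (2400, 3200, 300, 200),
    "back": (2400, 3200, 250, 200),
    "even": (2300, 3100, 350, 250),
    "odd": (2300, 3100, 150, 250),
}

def region_for(index, total):
    """Name the crop region for the image at 0-based position `index`.

    Assumes the first image is the front cover, the last is the back
    cover, and the images in between alternate odd/even.
    """
    if index == 0:
        return "front"
    if index == total - 1:
        return "back"
    return "even" if index % 2 == 0 else "odd"

def crop_command(src, dst, region, scale_percent=50, grayscale=False):
    """Build an ImageMagick argument list that crops, rescales, and
    optionally converts one image to grayscale."""
    w, h, x, y = region
    cmd = ["convert", str(src),
           "-crop", f"{w}x{h}+{x}+{y}", "+repage",  # crop, then reset canvas
           "-resize", f"{scale_percent}%"]
    if grayscale:
        cmd += ["-colorspace", "Gray"]
    cmd.append(str(dst))
    return cmd

def pdf_command(pages, pdf_path):
    """Build an ImageMagick argument list that stitches images into one PDF."""
    return ["convert", *[str(p) for p in pages], str(pdf_path)]

def process_book(jpg_dir, out_dir, pdf_path, grayscale=False):
    """Crop and rescale every JPG in jpg_dir, then stitch them into pdf_path."""
    images = sorted(Path(jpg_dir).glob("*.jpg"))
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    pages = []
    for i, src in enumerate(images):
        dst = out_dir / src.name
        region = CROPS[region_for(i, len(images))]
        subprocess.run(crop_command(src, dst, region, grayscale=grayscale),
                       check=True)
        pages.append(dst)
    subprocess.run(pdf_command(pages, pdf_path), check=True)
```

Something like `process_book("scans/book1", "cropped/book1", "book1.pdf", grayscale=True)` would then handle a B&W book. The originals in `scans/book1` are left untouched, so the whole thing can be re-run later with different settings.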
I save all the original JPGs because I expect I may later redo the post-processing in a better way. For example, I am not doing OCR for now. I'm mostly scanning children's books and math books, in preparation for an extended international trip. I didn't want to haul a bunch of my kids' books with me, nor the part of my library I need for my work. For now, the 2 minutes or so of attention per book I need to devote (after physical scanning) makes this not feel like a chore.