Comment Re:Duplication of effort (Score 1) 48
I was hoping someone would bring up OCR.
Can anyone recommend good OCR software (preferably open source) for converting archival material to plaintext?
The article author mentions Dropbox's photo tools, but as far as I can tell, those would still produce a PDF containing an image, not text, so the most you could do with the result is add PDF annotations. Scrivener was mentioned too, but a quick look doesn't show me any image-to-text conversion capabilities. If the historian's time in the archive is shifting towards photographing original documents, it seems to me that a substantial portion of the work of writing a new book would be converting all of those images into searchable, copy-pasteable digital text.
I ask because I have a digital copy of a book whose original physical copy is no longer available. The digital copy is a set of JPEG images of the pages, and I'd like to convert it into computer-readable text.
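To make the workflow concrete, here is a rough sketch of what I'm after. It assumes the open source Tesseract engine is installed and on the PATH, which is only one candidate and not something I've settled on. The function names, the `*.jpg` naming, and the directory layout are all made up for illustration.

```python
import re
import subprocess
from pathlib import Path


def page_sort_key(path):
    """Order page_2.jpg before page_10.jpg by comparing digit runs as numbers."""
    # re.split with a capture group alternates text, digits, text, ...
    parts = re.split(r"(\d+)", Path(path).name.lower())
    return [int(p) if i % 2 else p for i, p in enumerate(parts)]


def tesseract_command(image, out_base, lang="eng"):
    """Build the CLI call: tesseract IMAGE OUTPUTBASE -l LANG.

    Tesseract writes its plain-text result to OUTPUTBASE.txt.
    """
    return ["tesseract", str(image), str(out_base), "-l", lang]


def ocr_book(image_dir, out_file, lang="eng"):
    """OCR every JPEG page in image_dir and join the text into out_file."""
    pages = sorted(Path(image_dir).glob("*.jpg"), key=page_sort_key)
    texts = []
    for page in pages:
        out_base = page.with_suffix("")  # e.g. scans/page_1
        subprocess.run(tesseract_command(page, out_base, lang), check=True)
        texts.append(Path(str(out_base) + ".txt").read_text(encoding="utf-8"))
    Path(out_file).write_text("\n".join(texts), encoding="utf-8")
```

Something like `ocr_book("scans", "book.txt")` would then give one plaintext file for the whole book, with the pages in numeric order. Accuracy on old typefaces or poor scans is a separate question, which is why I'm still asking for recommendations.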