Submission: Using CAPTCHAs to improve OCR (recaptcha.net)
An anonymous reader writes: reCAPTCHA is a CMU project that builds CAPTCHAs from words in printed text that stumped the optical character recognition (OCR) program used by the Internet Archive. Each CAPTCHA pairs a printed word the OCR could not recognize with one it could; the recognized word is used for verification. Both words are distorted. When several users give the same response to an unrecognized word, it gets "tagged", presumably for use by a supervised learning algorithm.
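For the curious, the agreement step described above could be sketched roughly as follows. This is only an illustration and not reCAPTCHA's actual code: the function names, the case-insensitive comparison, and the `min_votes` threshold of 3 are all assumptions made for the example. The idea is that a guess at the unknown word only counts if the same user got the known (control) word right, and the unknown word is tagged once enough users agree.

```python
from collections import Counter


def normalize(text):
    """Compare answers case-insensitively and ignore stray whitespace (an assumption)."""
    return text.strip().lower()


def tag_unknown_word(submissions, control_truth, min_votes=3):
    """Return the agreed reading of the unknown word, or None if there is no consensus.

    submissions   -- list of (control_answer, unknown_answer) pairs, one per user
    control_truth -- the known word, already read correctly by OCR
    min_votes     -- hypothetical threshold: how many matching answers count as "several"
    """
    votes = Counter()
    for control_answer, unknown_answer in submissions:
        # Only trust the guess if the user passed the verification word.
        if normalize(control_answer) != normalize(control_truth):
            continue
        guess = normalize(unknown_answer)
        if guess:
            votes[guess] += 1

    if not votes:
        return None
    word, count = votes.most_common(1)[0]
    return word if count >= min_votes else None


if __name__ == "__main__":
    submissions = [
        ("upon", "morning"),
        ("Upon", "Morning"),
        ("upon", "mourning"),
        ("upun", "morning"),   # failed the control word, so this vote is ignored
        ("upon", "morning"),
    ]
    print(tag_unknown_word(submissions, control_truth="upon"))
```

The resulting (word image, agreed text) pairs are what one would presumably feed to a supervised learner as labeled training data.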