I've always thought that going with higher-level thinking would be harder to break. Instead of copying letters from an image, you have to identify a set of images, a task that is easy for a person but more difficult for a computer. Think children's picture book type deal. Can a computer reliably tell a dog from a cat from a cow?
I think that's a pretty good thought. I'd extend it with perhaps one of those "which of these things doesn't belong" type of setups (which may have been what you meant). It could show pictures of a banana, an apple, an orange, some grapes, and a baseball hat. I don't know, perhaps there is a way to solve these easily by computer. But I know the stupid text CAPTCHAs I had to go through yesterday to sign up for one site were so "obfuscated" that I couldn't read them either, and I had to click the "show another" button about 6 times before I got one I could actually answer correctly. I'm pretty sure that if we were asked to do something higher level like you mention, we would be able to answer it without having to ask for "show another" over and over, hoping to get one that is legible.
"The chain which can be yanked is not the eternal chain." -- G. Fitch