Web Users Angered by Anti-Spam 'Captcha'
Carl Bialik from WSJ writes "Captchas -- the jumbles of letters that users must type to gain access to some websites -- are a growing irritation, the Wall Street Journal reports. But programmers hope to make new variations that are both easier to decipher and harder to crack. From the article: 'Some captchas have been solved with more than 90% accuracy by scientists specializing in computer vision research at the University of California, Berkeley, and elsewhere. Hobbyists also regularly write code to solve captchas on commercial sites with a high degree of accuracy. ... Henry Baird, a professor of computer science at Lehigh University who studies PC users' responses to the codes, has been working with colleagues to develop new generations of captchas that are designed to be easier on humans but baffling for computers.'"
What? (Score:5, Funny)
Re:What? (Score:4, Informative)
To read this comment enter the text (Score:5, Funny)
I prefer kitten auth [kittenauth.com].
Re:To read this comment enter the text (Score:2)
Basically, it suffers from the same problem as any non-dynamically generated captcha (and if you add distortions, etc. to the images, you're just going to make them harder to identify and defeat the point of it).
Re:To read this comment enter the text (Score:2, Funny)
Re:To read this comment enter the text (Score:4, Interesting)
That still leaves things like manually capturing every possible unique base kitten image, then doing a pixel-by-pixel comparison and marking everything mostly matching as a kitten. It can be slowed down by changing the brightness or tint of the overall image slightly, but too much would make the image unrecognizable.
It would be more interesting to combine several ideas. Rather than "click on the kitten" have each picture marked with a random letter, and "enter the letters of the pictures with kittens". Or maybe change it up, pick brown kittens or black kittens or white kittens, kittens playing with a ball, etc.
Fourier to the rescue? (Score:2, Interesting)
Basic image comparison techniques are pretty easy to fool. Change one pixel and the entire image hashes to something else.
Change one pixel and the peaks of the Fourier transform of the image remain mostly the same. It's the same reason one can hear a tone above white noise.
Some "dupe detectors" reduce the image to a grid of n*m, take the average color of each square, and hash that.
Which is the same as using only the low-pass parts of the Fourier fingerprint.
This can be defeated by changing the
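The grid-averaging "dupe detector" described above can be sketched roughly as follows. This is only an illustration: the image is assumed to be a plain list of rows of 0-255 grayscale values, the function names are made up for this example, and the extra step of quantizing each cell into a few brightness levels is an assumption added so that a small change does not alter the hash.

```python
import hashlib

def grid_fingerprint(pixels, n, m):
    """Reduce a grayscale image (rows of 0-255 ints) to n*m cell averages.

    Assumes the image is at least n pixels tall and m pixels wide.
    """
    height, width = len(pixels), len(pixels[0])
    cells = []
    for gy in range(n):
        y0, y1 = gy * height // n, (gy + 1) * height // n
        for gx in range(m):
            x0, x1 = gx * width // m, (gx + 1) * width // m
            block = [pixels[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            cells.append(sum(block) // len(block))
    return cells

def coarse_hash(pixels, n=4, m=4, levels=4):
    """Hash the quantized cell averages instead of the raw pixels."""
    cells = grid_fingerprint(pixels, n, m)
    quantized = bytes(c * levels // 256 for c in cells)
    return hashlib.sha256(quantized).hexdigest()

def exact_hash(pixels):
    """Plain hash of every pixel: one changed pixel changes the result."""
    return hashlib.sha256(bytes(v for row in pixels for v in row)).hexdigest()
```

With a scheme like this, nudging a single pixel changes exact_hash but usually leaves coarse_hash alone, unless the change pushes a cell average across a quantization boundary.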
Re:To read this comment enter the text (Score:3, Interesting)
I think it's a step in the right direction, though. It's an interesting insight into what human memes can be considered universal.
Re:To read this comment enter the text (Score:2)
Then, you are basically at the point where you need to step away from the keyboard and go outside for awhile. I'm aware that maybe not everyone is aware of the difference between a llama and an alpaca, or other exotic things, but really, kittens? I work in international economic development and have worked in Southeast Asia, Latin America, and Africa, and EVERYBODY knows what a kitten is.
Let's just assume your a
Re:To read this comment enter the text (Score:2, Funny)
Thank you for the link to Kitten Auth -- I hadn't heard of it, and it looks interesting.
However, as others have pointed out, even image classification is something that (presumably) algorithms will eventually be able to simulate.
Therefore, I propose that authentication take advantage of the area where we know (through science fiction, of course) computers will never be able to mimic humans: lust and desire.
Introduce: Hottie Auth: Click on the picture of the hottest person in the following collage of pict
Re:To read this comment enter the text (Score:2)
Of course, this captcha theory is prone to lots of misses. The person has to know the word and what the animal looks like -- all versions of the animal -- and not get it confused with similar animals. Even the test phase requires that people testing the auth don't confuse a wombat with a squirrel. If most people can't tell the difference, but I can, I lose, because LCD determines whether I'm right or n
Blind people using captcha (Score:2)
Image Key Sets & Dynamic Captchas (Score:5, Informative)
In order to use the p0rn site he ran, you had to either pay money or spend time identifying captchas. He would then store them in a database and match it up with a checksum of the image. When he had completed a site's captcha key set, he would sell these lookup tables to anyone with money.
All they then had to do was write their program to do a checksum of the image (or the image itself if he had stored it) and then plug the word from the database into the page for verification.
With the introduction of splashers that spatter the statically stored images with lines or dots, the image is stored and something like an edit distance is applied to it to find the closest match. Once that is accomplished, it references the keyword out of the database. You turn up the splasher and you risk the user not being able to figure out the word.
It seems that evil always finds a way. This is why captchas should always be dynamically generated on the fly from a very large dictionary! Check out Securimage for PHP [hotscripts.com].
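To make the lookup-table attack on static image sets concrete, here is a minimal sketch. It assumes images are handled as raw byte strings, uses a byte-wise Hamming distance as a stand-in for the "something like an edit distance" mentioned above, and all names (StaticCaptchaTable, learn, lookup) are hypothetical.

```python
import hashlib

def checksum(image_bytes):
    """Exact-match key for a stored captcha image."""
    return hashlib.sha256(image_bytes).hexdigest()

def hamming(a, b):
    """Count differing bytes between two equal-length byte strings."""
    return sum(x != y for x, y in zip(a, b))

class StaticCaptchaTable:
    """Lookup table of human-solved captchas (illustrative only)."""

    def __init__(self):
        self.by_checksum = {}   # checksum -> solved word
        self.samples = []       # (image bytes, solved word) for fuzzy matching

    def learn(self, image_bytes, word):
        self.by_checksum[checksum(image_bytes)] = word
        self.samples.append((bytes(image_bytes), word))

    def lookup(self, image_bytes, max_distance=None):
        # Unmodified image: the checksum alone finds the answer.
        word = self.by_checksum.get(checksum(image_bytes))
        if word is not None:
            return word
        # "Splashed" image: fall back to the closest stored image.
        candidates = [(hamming(image_bytes, stored), w)
                      for stored, w in self.samples
                      if len(stored) == len(image_bytes)]
        if not candidates:
            return None
        distance, word = min(candidates)
        if max_distance is not None and distance > max_distance:
            return None
        return word
```

The sketch also shows why dynamic generation helps: if every served image is new, neither the checksum nor the nearest stored sample has anything useful to match against.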
Re:Image Key Sets & Dynamic Captchas (Score:2)
Re:Image Key Sets & Dynamic Captchas (Score:3, Interesting)
I spent some time working on an alternative to captcha that I call AOMIS: http://aomis.net./ [aomis.net.] I haven't had a chance to work on it for a while, but the basic idea was to provide a piece of media whose content the user must identify.
In most cases, it would be an image. So, I might show you a picture of an elephant, and to submit the form, the user would have to enter 'elephant' into the box. Each image would have a number of correct answers to account for common spelling mistakes, and the most common correct r
Re:Image Key Sets & Dynamic Captchas (Score:2)
- Use different images: Doesn't matter what it shows or whether it describes an abstract concept. The time you use to collect and describe images == the time used to add to DB. Add new pictures every now and then? So the hostile script is alerting the user when a new picture is shown.
- You change a few pixels: The picture is analyzed on the fly instead of using checksums. Code ready to be taken out of ShowImg [jalix.org].
- Audiofiles? Time to manually create them =
Re:Image Key Sets & Dynamic Captchas (Score:5, Interesting)
The second approach was simply to set up captcha solving sweatshops somewhere in Asia with cheap labor, with people paid a few cents an hour to sit and solve captchas all day. This brought the cost of a new email address up to something like 1/3 cent, which for many spammers is still a viable price. The cost does limit this approach, though, so the captcha still helps.
The interesting thing about both of these strategies is that they use humans to solve a problem that is difficult for computers, which is von Ahn's research area - he's also one of those behind The ESP Game [espgame.org] (caution - this can be shockingly addictive). There's essentially nothing that can be done to defeat either approach without also making a system a huge pain in the ass for legitimate users. From this point of view, spending time trying to come up with more advanced captchas is kind of pointless.
Re:Image Key Sets & Dynamic Captchas (Score:2)
Works for non-static sets just as well (Score:3, Interesting)
Hell, let's use Slashdot as an example, since everyone has seen the captchas here.
It works like this: I'll set up a porn site all right. Gets people's interest easier than anything else. I promise some free porn, or heck, even some links to othe
How about "shootcha"? (Score:2, Funny)
I have a patent on it, of course...
90% accuracy? Not bad. (Score:5, Funny)
Hell, that's better than my average. They are getting so cryptic, it seems I get them wrong about 25% of the time these days.
-josh
Re:90% accuracy? Not bad. (Score:5, Funny)
Re:90% accuracy? Not bad. (Score:2)
A naive reader could misunderstand you and think that it's a program written by those scientists that gets 90%, but this is obviously not the case. I'm not an idiot (I hope), and I keep getting captchas wrong like half of the time.
Take advantage of colorblindness? (Score:2)
Re:Take advantage of colorblindness? (Score:3, Funny)
I've never heard of a colorblind computer.
I often fail those Turing tests (Score:4, Funny)
Re:I often fail those Turing tests (Score:2)
Perfect! How about, to access content or whatever the captchas are guarding, you have to pass a conversational Turing test first? So you'd spend some time chatting with a dude in India, and if he thinks you're human, you're in!
Of course, most of the people I've talked to for overseas tech support would have failed if I had administered a Turing test, so maybe it's not such a great idea...
Different method entirely (Score:5, Interesting)
Which of these is a number: A 2 R P?
Seems that regardless of what they come up with there's going to be some part of the population that won't figure it out anyway, and if the whole point is to confuse auto-registerers, then I'd think it'd be harder for those to account for every possible question and answer set.
(Yea, it's in TFA, but mentioned like an aside...)
Re:Different method entirely (Score:2)
Something non-subjective like your suggestion would work, as long as it is not done in actual text where the algos can identify keywords.
Re:Different method entirely (Score:5, Funny)
Or, even better, put it to music - and add a time limit!
"One of these things is not like the others,
one of these things just doesn't belong.
Can you tell me which thing is not like the others,
before I finish this song?"
Re:Different method entirely (Score:3, Insightful)
Re:Different method entirely (Score:2)
Re:Different method entirely (Score:2)
Re: (Score:2)
Re:Different method entirely (Score:2)
Re:Different method entirely (Score:2)
The fact that it TELLS YOU THE ANSWER in PLAIN TEXT means that scripting it becomes trivial.
captchas discriminate against the blind (Score:5, Interesting)
Re:captchas discriminate against the blind (Score:5, Funny)
Re:captchas discriminate against the blind (Score:2)
The bots would never figure it out.
RTFA (Score:2)
Re:captchas discriminate against the blind (Score:2)
Re:captchas discriminate against the blind (Score:2)
Audio recognition is actually harder for computers than visual recognition, and plenty of sites do audio captchas as well as visual ones. Blogger, for one. I was actually impressed when I first saw the little wheelcha
Re:captchas discriminate against the blind (Score:2, Insightful)
Re:captchas discriminate against the blind (Score:3, Insightful)
hell, go have a look over at Trolltalk... (Score:2)
captcha isn't that bad.... (Score:5, Insightful)
And even if you aren't blind, I've run into many a captcha that I couldn't decipher. Poorly designed sites may delete the entire content of your post if you fail the captcha, but I guess that's a design issue for another topic.
Re:captcha isn't that bad.... (Score:3, Interesting)
Sites should have alternate means, but even the ones that claim to have alternate means never really follow up on anyone.
Re:captcha isn't that bad.... (Score:2)
If it makes you feel any better, most of those women on Yahoo Personals are either Russians looking for American husbands or Bots. So the message you lost wasn't going to that hot, rich, and single girl you thought it was anyway.
But thanks to recent advances in Captcha defeating technologies, that Bot will soon be sending you a link to a "Live" Cam-Show. So not all is lost.
Re:captcha isn't that bad.... (Score:2)
I suspect that all captchas that are harder to break will also be much more difficult for humans to solve. At least that holds for the field I know relatively well, audio.
For visual captchas I guess the same applies, the better yahoo and microsoft's visual captchas are sometimes unsolvable by (non-alien
How ironic... (Score:2)
Something got me thinking about captchas ... what was it? ... oh yes it was that article on automated Spamcop submissions the other day.
No wonder they're a growing irritation. But websites need to know at least something about you. This site is letting me post now because: 1) I'm not going through a proxy 2) I've enabled cookies 3) I have a login. Now most sites I visit, I can't tick any of those boxes. And yes I'll venture over to bugmenot occasionally as well.
So sites need them. Especially for those f
Re:How ironic... (Score:2)
Many sites use them although they don't need them. In particular, forums and blogs wouldn't need them if they would simply discard any post containing an offsite hyperlink; allow plaintext URLs, but ban hyperlinks, and the problem disappears. Forum/blog spams always represent an effort to boost the pagerank of some other page, and thus always contain hyperlinks.
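The "allow plaintext URLs, but ban hyperlinks" rule could be sketched along these lines. The pattern below is an assumption about what counts as a hyperlink (an HTML anchor with an href, or a BBCode-style [url] tag), it does not check whether the link is offsite, and accept_post is a hypothetical name.

```python
import re

# Matches an HTML anchor with an href attribute, or a BBCode [url] tag.
HYPERLINK = re.compile(r"<\s*a\b[^>]*\bhref\s*=|\[url\b", re.IGNORECASE)

def accept_post(body):
    """Return True if the post contains no hyperlink markup."""
    return HYPERLINK.search(body) is None
```

A post that merely mentions http://example.com/ in plain text passes, while one wrapping the same address in an anchor tag is discarded.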
Re:How ironic... (Score:2)
News for Nerds? (Score:4, Informative)
Re:News for Nerds? (Score:5, Interesting)
What's wrong with an article being a spark for more in-depth discussion? How else are things rarely discussed in the media and never in depth (like most tech topics) going to be discussed on slashdot?
Sure, I know this post (and the parent) are off-topic, but it bugs me when people think that the purpose of slashdot is just to accumulate articles... that's what RSS feeds are for.
The discussion is what keeps me coming back, and typically, no matter how moronic the article is, there are several posts that give the kind of information that I wish was included in the article (but isn't). At the very least, people provide links to more comprehensive information and/or discussion of the issues concerned.
spammer bounties (Score:2, Insightful)
WSJ examples (Score:2)
Not the point (Score:3, Interesting)
Re:Not the point (Score:5, Insightful)
The paradox is, if a site has one that works really well for them, other sites will want to use it as well. As other sites use similar or identical systems, it becomes exponentially more beneficial for crackers to crack. So, as soon as something's good enough to use, it becomes good enough to crack.
The human factor (Score:5, Funny)
If I wanted to be really sadistic, I could instead present site readers with a sentence, in which they have to fill in either "their," "there," or "they're."
Re:The human factor (Score:5, Funny)
Your a looser for even sugesting such a thing!
Re:The human factor (Score:2)
If that was the only thing you did, with rotating sentences, a computer would probably beat most internet users, defeating the purpose.
That's a terrific idea! (Score:2)
Maybe like the one they give as an entrance exam for the Marines:
The door is:
A) Open
B) Closed
C) Not enough information
Hey, as an ex-Army guy, I'm allowed to give those gyrenes a hard time
Re:The human factor (Score:2)
If only forums did that... keep out spammers and peeple taht like 2 post in tahrd-speek.
http://images.slashdot.org/hc/59/0b4e0bc0ee0a.jpg [slashdot.org] voucher, spammers, voucher (do I get my free pr0n now?)! At least these stupid things on /. a) only bug me when I'm not at home and b) are generally easy enough to read that you don't do it six times.
Re:The human factor (Score:2)
Re:The human factor (Score:2)
Re:The human factor (Score:2)
I heard that's the scheme Microsoft originally used for the installation key for the Vista beta but had to abandon it after the third week of nobody being able to install the thing.
Re:The human factor (Score:2)
Damn right they irritate me (Score:2)
Re:Damn right they irritate me (Score:2)
But you want to know horrible anti-spam measures? Look no further than Slashdot itself. The numerous ways of obfuscating email addresses require so much effort to decipher that I don't want to bother mailing them. C'mon, backwards text? If it takes longer to decipher the address than to email a quick question/reply, forget it.
20% error rate (Score:3, Informative)
That's amazingly high. 1 in 5 CAPTCHAs is incorrectly entered by humans doing their best to do the right thing.
No wonder people get mad at them.
John.
Easy: Real Life Objects or Critters (Score:2)
- There are tons of pictures of these things floating around
- they're easy to modify (blur, detour, cell-shade, rotate, mirror,
- Getting computers to guess the difference between a dog and cat, while feasible (don't care to fish the link to the pro
Re:Easy: Real Life Objects or Critters (Score:2)
Link to actual sample: http://gs264.sp.cs.cmu.edu/cgi-bin/esp-pix [cmu.edu]
Exists... KittenAuth (Score:2)
Server in the Middle (Score:5, Interesting)
This is v1.0 of the Matrix, where human brains are harnessed to solve problems by a more powerful and wise, though less "intelligent" computer network.
Re:Server in the Middle (Score:2)
Re:Server in the Middle (Score:2)
Re:Server in the Middle (Score:2)
Re:Server in the Middle (Score:2)
I know capacity on "real botnets" is resold to spammers (and other no-goodniks) this way; I don't see why people wouldn't be reselling captcha-cracking resources too. It's a buyer's market.
Would DHTML work? (Score:2)
Let's not forget the porn. (Score:2)
Now, come up with a better way of preventing spam than simply proving that someone is human.
Not just OCR (Score:2)
also, it's no wonder that people are annoyed by CAPTCHAs - half the time they don't explain why the user has to enter the text, and almost all CAPTCHAs are developed around making the text hard to read. At the moment, it's only a few geeks who have managed to bulk-OCR
Captcha is a nice idea but... (Score:5, Insightful)
HOWEVER. A short and simple multiple-choice or true-false quiz might determine with some level of accuracy if the poster is a person or not. Simple stuff like a random image of a sheep, a lion, a bear or a whale with a radio button selection below it. It's easy to run through, it shouldn't require much skill from the user and has the potential to confuse interpreting software a lot more.
This approach could even be ENTERTAINING to the user in that funny pictures could be used in the image interpretation drill. Such questions could be "Is this person having a good day?" and you can put all manner of interesting images in there for a true-false scenario. Being an entertaining method will definitely win fans. Being tedious, stressful and mistakable will lose fans.
Re:Captcha is a nice idea but... (Score:2)
Spiro Agnew is
a. a form of social disease.
b. a jazz-fusion rock band.
c. a former Vice President.
d. the first woman in Congress.
Making a "Hole in One" is
a. every golfer's dream.
b. too dirty to discuss here.
c. something carpenters do.
d. best done with scissors.
My boss is
a. a jerk.
b. a total jerk.
c. an absolute total jerk.
d. responsible for my paycheck.
Whips, chains and handcuffs a
Deal with it (Score:2)
Sorry, but the CAPTCHA plug-ins I've used with WordPress etc. are *highly* effective. Where people typically screw up in their implementation is to use the default dictionary word list which ships with them. The majority of CAPTCHA-defeating scripts out there today use a dictionary attack rather than successfully deciphering the CAPTCHA image. If one sets the CAPTCHA to generate a string of random letters rather than a word from the stock word list, the amount of comment spam posted drops dramatically.
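As a rough sketch of the "random letters instead of a stock word list" setting: the code below is not any particular plug-in's implementation. The alphabet, the decision to drop look-alike characters (O, 0, I, 1), and the function names are all assumptions made for illustration.

```python
import secrets
import string

# Uppercase letters and digits, minus characters that are easy to confuse.
ALPHABET = "".join(c for c in string.ascii_uppercase + string.digits
                   if c not in "O0I1")

def random_challenge(length=6):
    """Pick each character independently, so no word list can predict it."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))

def keyspace(length=6):
    """Number of possible challenges of the given length."""
    return len(ALPHABET) ** length
```

With 32 usable characters, a six-character challenge has 32**6 (about a billion) possibilities, far more than a shipped dictionary of a few thousand words, so guessing from a word list stops working and the attacker has to actually read the image.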
I don't turn it on (Score:2)
I use the Akismet spam filter instead; it has blocked 780 so far, with 4 false positives and about 4 misses.
Captchas are a bandaid solution (Score:2)
In the end, captchas are obnoxious for legitimate end users, while only providing temporary relief from spammers. The spammers can and will find ways around the captchas, which may include more sophisticated OCR algorithms, but also other solutions such as the manually created lookup tables that were mentioned earlier.
Other ways need to be found to distinguish humans from spammers' bots.
Other ideas (Score:2)
1. Text based passwords
Pro: People are used to them, quick-n-easy
Con: Subject to brute force attacks, trivial to automate a login once you have the password
2. Graphical passwords
Pro: Can use a larger set of images than characters, easy to remember
Con: time consuming, can only present a small set o
word puzzle (Score:2)
animated gifs? (Score:3, Interesting)
Re:animated gifs? (Score:3, Interesting)
I have something like that. In fact, it's a part of a three tier security measure I came up with last year. Having spent a lot of time programming A.I. and automation routines in the past, I realized there was a class of processes that could be guaranteed to work against automated spammers. One tier involves recognizing patterns of movement between fields on a form and data entry patterns. There is usually a very unique pattern to the way a human
Re:animated gifs? (Score:2)
I was thinking more along the lines "move the mouse in a circle", or "complete the following mouse gesture, L-L-R-U-D" (could even use randomly stylized arrows as a layer of obfuscation)
Re:animated gifs? (Score:2)
That's certainly the general idea. Keeping track of the time taken to fill out a form is one angle, but for registration forms that are generic enough for browser auto-completion, that defeats a useful time saver for users. I prefer having the computer determine through studying the user's input patterns (and yes even browser auto-completed forms will pass the test) whether the user is human or not, rather than instruct the user to do crazy things.
Re:animated gifs? (Score:2)
Put spammers to work (Score:2)
My solution (Score:2)
My solution is simple. It also defeats the "porn server in the middle" attack. Assuming the page is in English, just ask a random English language question about the banner ad at the top of the page. You "kill two birds with one stone" by getting people to prove they are human and read the ads at the same time.
This should work fine for all users that don't block banner ... uh ... never mind.
Just Had To Consider This (Score:3, Interesting)
The first problem with captchas is the barrier they put up, however small, between you and the users of your site. Apologies for the corny analogy, but captchas are a speedbump on the information superhighway. People hate running into them.
The impediment to visually disabled users is also a big one to consider. It's not just fully blind people. People can be shortsighted, colour blind, dyslexic, or simply reliant on specialist software to read your website. You're letting these people down by adopting this practice and that's something I would really feel bad about doing.
But the biggest reason not to use captchas is spammers' increasing ability to interpret them. Even at a five percent success rate in interpreting captchas, a spammer can bombard your site with requests and still get something through. They're just using the same model as they did with email, and it will work.
Instead I chose some other plugins available for WordPress to help with the spam. Akismet [akismet.com] sounds like it could work as a kind of distributed spam check/blacklist, though I am wary of the fact that a private company is running the service. I also installed Bad Behaviour [homelandstupidity.us], though it's clear that eventually some spammers will adapt their behaviour to it.
Ideally what I'd like is a true Bayesian comment spam filter plugin for WordPress, but so far I haven't been able to find one. Such filters have done wonders for me in Thunderbird for my email spam, with something like a 99.99% success rate and no false positives. Clearly the situation is quite different with comment spam, but all the same it would be nice to have one.
I envisage that the comment spam situation will get a lot worse as time goes by, regardless of any pagerank type algorithm changes. Comment spam will no doubt become as ubiquitous as regular spam and I can foresee dozens of "splog" posts per day in the not too distant future. My opinion is that blog software should come with robust, adaptable and self-updating anti-spam software on by default before this problem escalates out of control.
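The five percent figure in the post above is easy to put numbers on. The sketch below assumes every attempt succeeds independently with the same probability; the attempt counts are made-up inputs, not measurements.

```python
def expected_successes(attempts, success_rate):
    """Average number of spam posts that get through."""
    return attempts * success_rate

def chance_of_at_least_one(attempts, success_rate):
    """Probability that at least one attempt beats the captcha."""
    return 1 - (1 - success_rate) ** attempts
```

Under those assumptions, 10,000 automated attempts at a five percent success rate yield about 500 posted spams, and even 100 attempts give better than a 99% chance that at least one gets through.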
Captcha Faux Pas (Score:3, Funny)
Re:Language independence (Score:2)
HTTP 404: Objekt nicht gefunden (Score:2)
Yeah, I can see how that would stop all the spam.
Re:Ball in a Hole (Score:2)
Flash is even worse than Captchas.