
How to Prevent Form Spam Without Captchas

Posted by ScuttleMonkey
from the can't-beat-the-cute-power-of-kittenauth dept.
UnderAttack writes "Spam submitted to web contact forms and forums continues to be a huge problem. The standard way out is the use of captchas. However, captchas can be hard to read even for humans. And if implemented wrong, they will be read by the bots. The SANS Internet Storm Center covers a nice set of alternatives to captchas. For example, the use of style sheets to hide certain form fields from humans, but make them 'attractive' to bots. The idea of these methods is to increase the work a spammer has to do to spam the form without inconveniencing regular users."
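For anyone who wants to try the hidden-field idea from the summary, here is a minimal sketch, assuming a server that receives the posted form as a plain dict. The decoy name `website_url` and both function names are made up for illustration; they are not taken from the SANS write-up.

```python
from html import escape

# Hypothetical decoy name: something a bot is likely to fill in.
HONEYPOT_FIELD = "website_url"


def render_contact_form(action: str) -> str:
    """Return a contact form with one decoy field hidden by CSS."""
    return (
        f'<form method="post" action="{escape(action, quote=True)}">\n'
        '  <label>Message <textarea name="message"></textarea></label>\n'
        '  <div style="display:none">\n'
        '    <label>Leave this field empty\n'
        f'      <input type="text" name="{HONEYPOT_FIELD}" value="">\n'
        '    </label>\n'
        '  </div>\n'
        '  <button type="submit">Send</button>\n'
        '</form>'
    )


def looks_like_bot(form: dict) -> bool:
    """A human never sees the decoy, so any value in it is suspicious."""
    return bool(form.get(HONEYPOT_FIELD, "").strip())
```

The "Leave this field empty" label is there for browsers and screen readers that ignore the style rule, which is the accessibility concern raised further down the thread.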

  • And how... (Score:5, Interesting)

    by Creepy Crawler (680178) on Wednesday November 08, 2006 @01:36PM (#16770691)
    Ok, so captchas and other email obfuscation mechanisms are used a lot. Fine, a web designer can choose to do this.

    Now, let's enter US law: the Americans with Disabilities Act. Target [arstechnica.com] is currently being sued for NOT complying with this federal law. I can understand why businesses would be required to comply, but where will the net-boundaries stop?

    For example, I have a US corp. I hire an offshore datacenter to handle web processing. Does my website have the ADA's requirements imposed upon it, or do they not apply due to international boundaries? Yipe.
    • by vertinox (846076)
      Perhaps the vision impaired could get audio captchas?

      Click this button, listen to the sound, and then select what the sound was.

      Like birds chirping, babies crying, piano playing and maybe other familiar sound effects that you would choose from a multiple choice list.

      Of course if the user is deaf and blind, I'm not sure how they are using a computer to begin with.
      • Of course if the user is deaf and blind, I'm not sure how they are using a computer to begin with.

        braille display [deafblind.com]

        • by Firehed (942385)
          How well do they render all of these fancy new Web 2.0 sites? Instinct tells me that the rounded corners and glossy icons might be lost...
      • Re: (Score:2, Flamebait)

        by Lord Apathy (584315)

        Perhaps the vision impaired should just learn to live within their disabilities and accept the fact that not everything is going to be available to them. Harsh, yes, but it's life. Making reasonable requests to accommodate them is one thing, but making people liable under law for not doing so is something else.

        • by GigsVT (208848)
          The ADA wasn't passed by disabled people, it was passed by able bodied legislators who, on the left, wanted some bullshit feelgood legislation, and on the right, wanted to play up how supportive they were of disabled veterans.

          Most disabled people accept their limitations and aren't imposing about it.
          • Re: (Score:2, Interesting)

            by fprintf (82740)
            Just try taking their reserved parking spaces closest to the mall entrance and you will see just how "imposing" disabled people can be about it.
      • by rk (6314) *

        "Of course if the user is deaf and blind, I'm not sure how they are using a computer to begin with."

        Pinball [geocities.com] interface.

    • So the government will get you when you don't completely comply with a regulation that affects a small group of people, but at the same time doesn't do anything to rid the world of the constant barrage of spam that annoy *everyone*, including the disabled?

      Well, that's nice then...

      • by Thansal (999464)
        I just realized how confusing spam must be to someone using a screen reader.

        Sally zimbabwe google mark ford fish tot bing gong down
        *Insert GIF telling you to BUY BUY BUY xyz corp stock*


        Or actually listening to the horribly mangled English that is a 419 email.
    • by johneee (626549)
      While I can't comment on your specific question, I do know that with the (now defunct?) COPPA you would indeed have to comply. In fact, even if you had a non-US company with an offshore datacentre, you would have to comply.

      I did some research on COPPA at the time because I worked on a kids' web site, and I called the agency that administered it. They told me that any time I was collecting information from people within the US, no matter where I, the website, or my company was set up, the law affected me.
    • Re: (Score:3, Insightful)

      by clambake (37702)
      Now, let's enter US law: the Americans with Disabilities Act.

      So? Just put a phone number on the site with a "If you are disabled and can't use our captcha, please call our tech support and we'll set up an account."
  • by Thansal (999464) on Wednesday November 08, 2006 @01:36PM (#16770695)
    Why is it so hard to make a captcha that a bot can't read but a human can?

    The slashdot captchas are among the easiest I have ever seen to read, however I still haven't seen any spam on slashdot. Is there something else going on here? It can't be anything like IP banning or flood controls, as those don't stop botnets. Is it that spammers just don't target slashdot? Or is it that captcha-reading bots are not nearly that good at breaking them and we could tone down the level of those horrible twisted-dotted-lined captchas?
    • by ari_j (90255)
      Try running the Slashdot front page through crm114 sometime and see if it really is better than a human (specifically, better than you) at distinguishing spam from legitimate content. ;)
    • by Anonymous Coward on Wednesday November 08, 2006 @01:42PM (#16770829)
      Men's and Ladies Prestige Watches For all occasions! Perfect Christmas gifts!

      These replicas have all the presence and poise of the originals after whome they were designed at a fraction of the cost. The attention to detail is paramount and they are comparable to the originals in every way.

      To view our huge inventory visit our website now at:

      http://pwned31337.ku/ [pwned31337.ku]

      : Replicated to the smallest detail
      : 98% A+ Accuracy
      : Includes all Proper Markings
      : Wide selection and fast worldwide shipping
      : Authentic Weight
      : True-to-original self winding and quartz mechanisms
      : Guaranteed worldwide Christmas delivery
    • by everphilski (877346) on Wednesday November 08, 2006 @01:43PM (#16770851) Journal
      Think about it ... the slashdot crowd is technical and informed and "knows better" ... why would someone spambot slashdot? It surely would not be effective...
    • Re: (Score:3, Insightful)

      by Agent00Wang (146185)
      I've always wondered why designers don't use something simpler such as showing a picture of an easily identifiable object and requiring the user to identify it. This would work in 99.9% of cases. Alternatively, for the screen reader crowd, the check could be something like, "What is the fifth word in this sentence?" There's probably some obvious flaw with this technique that I'm not thinking of, or I imagine it would have been done already.
      • Re: (Score:3, Insightful)

        by Lanoitarus (732808)
        The obvious flaw is that you need to create each one, and they therefore are inherently more limited in number. Text-based captchas are generated by a computer; pictures of pandas and their associated word would have to be done by hand.
      • by Thansal (999464) on Wednesday November 08, 2006 @01:54PM (#16771063)
        I actually like the ones like that.

        instead of obfuscated images, just put in plain text questions.
        What is 2+2?
        What is the 3rd word in this sentance?
        What is the name of my blog?

        All of these can be answered by someone using a screen reader, and take less time than figuring out a captcha. Sure, it does not stop manual spamming, but what does?
        • Re: (Score:3, Informative)

          by JesseMcDonald (536341) *

          instead of obfuscated images, just put in plain text questions.

          That's been considered before. The problem with that approach is that, unlike image-based CAPTCHAs, there are a limited number of templates available for natural-language questions. The spammer just has to compile a list of the various patterns of questions and answers, a much easier task than designing an OCR program capable of extracting random, disconnected letters and numbers from a randomly distorted image. The problem is essentially on

        • by Trailer Trash (60756) on Wednesday November 08, 2006 @03:05PM (#16772517) Homepage
          What is the 3rd word in this sentance?

          How about:

          Which word is spelled incorrectly in my sentance?

      • Re: (Score:3, Informative)

        by nine-times (778537)

        These questions or pictures again need to be either automatically generated or generated by humans. If automatically generated, they would need to follow a pattern, and so the challenge would then be on the spammers to identify the pattern and train their bots to read the pattern and respond appropriately.

        If, on the other hand, they're generated by humans, it would be expensive to generate each one, and so they'd be limited in number. Therefore the spammers simply go about collecting each one, identifyin

      • I think simplicity is key, the best possible prevention is one that doesn't hinder the user but effectively stops machines. Most of the proposed methods require human interaction requiring some kind of task that a human can do but a machine can't (like identifying a kitten). Wouldn't we be better off going the other way around? blocking access based on things that machines can do but humans can't. Another problem I have with most anti-spam measures is that they seem terribly unprofessional

        On an older blo
      • Re: (Score:3, Insightful)

        by Goaway (82658)
        Except that there is no such thing as a picture of an easily identifiable object, especially not if you don't want to block non-English speakers. People will come up with many different words for the same thing, people will misspell it, people will not know the English word for it, and people will just not know what it is.
    • Re: (Score:2, Informative)

      by junglee_iitk (651040)

      Why is it so hard to make a captcha that a bot can't read but a human can?

      Numerous times there is confusion between I and L. Since every site uses its own set of images and its own 'set of rules to obfuscate', the user has all the reasons to be confused. Then there is 3 coupled with something that makes it look like B etc.

      Of course, you will fail one time only, as on next reload you will get a new image to read, but as the article says, user response drops. People want to help you and you are making it, kind

      • by Thansal (999464)
        Zero and "o" are my 2 problems generally.

        However, as I said, I have never failed a slashdot captcha, probably because they are all words....
      • How effective can captchas be anyway? Consider a nice "man in the middle" style attack. You want to hack some web forum, so you put up a porn site with a "read this captcha to get your porn" link on it. As your bot encounters captchas, it posts them out to your porn "clients" to solve for you with the correct brain power.

        I've wondered why the big spam services haven't set up this kind of scheme. I fear that I am just ahead of the times on this particular vulnerability

        • by nuzak (959558)
          This is already happening, and indeed it works exactly as you described, porn site and all. It scales terribly, however, since you effectively need a botnet to get proxies that won't be banned or otherwise regulated, and botnets are currently dedicating their resources mostly to email spam.
    • In a word: accessibility. Blind readers can't see graphic-based captchas and screen readers won't read them. Audio-based captchas have been used, but they can be difficult for some people with disabilities as well, are often difficult even for abled people and may be easier to process by bots in many cases.

    • Re: (Score:3, Informative)

      by sugapablo (600023)
      What's worked surprisingly well for me is simple arithmetic. Adding a random math problem such as 2 + 5 = [ ] or 3 + 4 = [ ] has DRAMATICALLY decreased the amount of form spam two of my websites have received.
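A rough sketch of how such an arithmetic check could be wired up without server-side session state, assuming the expected answer travels in a hidden field as an HMAC. `SECRET`, `make_challenge`, and `check_answer` are hypothetical names, not the poster's actual code.

```python
import hashlib
import hmac
import random

# Assumption: a server-side secret that never leaves the server.
SECRET = b"replace-with-a-long-random-value"


def _sign(answer: str) -> str:
    return hmac.new(SECRET, answer.encode(), hashlib.sha256).hexdigest()


def make_challenge(rng=random):
    """Return (question text, token to put in a hidden form field)."""
    a, b = rng.randint(1, 9), rng.randint(1, 9)
    return f"{a} + {b} = ", _sign(str(a + b))


def check_answer(answer: str, token: str) -> bool:
    """True if the submitted answer matches the signed expected answer."""
    expected = _sign(answer.strip())
    return hmac.compare_digest(expected.encode(), token.encode())
```

A bot that targets the site specifically can still replay a known question and token pair or try all 17 possible sums, so this only raises the cost for generic bots, which matches what the poster reports.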
    • by Pichu0102 (916292) <pichu0102@gmail.com> on Wednesday November 08, 2006 @01:52PM (#16771029) Homepage Journal
      The slashdot captchas are among the easiest I have ever seen to read, however I still haven't seen any spam on slashdot.

      You obviously don't browse the comments at -1.
    • by jfengel (409917)
      Slashdot has a couple of extra things going for it:

      * A "lameness filter" which excludes certain posts (ill-defined and probably continually changing to keep up)

      * A 20-second rule which prevents you from blasting the board

      * Moderation, which puts anonymous posts in a place most people don't read anyway. They may be there and you don't see them.

      That's still not sufficient for some jackass not to at least try, especially since the audience is so large. It may not be worth the trouble, since Slashdotters are r
      • Re: (Score:3, Insightful)

        by Thansal (999464)
        Actually, I browse at 0, as a lot of ACs have some rather good posts. (In fact I browse at 0 Nested, so I see even more of these posts.)

        I still have yet to see anything that was an ad. I have seen plenty of trolls, but those are not bots. I forgot about the lameness filter, and I admit to being curious whether that is catching things....
      • Not only that, but slashdot tries to find the http://slashdot/ok.txt [slashdot] file through your IP address to see if the IP posting is an open proxy. Who knows what else they do behind the scenes.
      • Not to mention that more than a few of us believe the proper response to spam involves the use of hired goons and blunt objects.....

      • I personally hate the Slashdot lameness filter. It punishes fast typists who want to get their point across without being verbose. Not all replies have to be several paragraphs long. I wish the user's karma/posting history would lessen the grip of the lameness filter. I assure you I'm not abusing the comment system. Don't tell me to slow down, and I'm not a cowboy.
    • May I ask a really dumb question?

      What SlashDot captchas are these? Are they subscribers only?

    • by nuzak (959558) on Wednesday November 08, 2006 @02:59PM (#16772359) Journal
      > Why is it so hard to make a captcha that a bot can't read but a human can?

      Because anything difficult to OCR can be a real pain for humans too. Still, it's not that spammers are mass-OCR'ing images, it's that they actually get humans to enter the captchas, sometimes providing porn as a reward, but it's sometimes also a paid operation with goldfarming-style sweatshops. In a way, this is fine, because it scales far worse than full scale automation, but it does keep captchas from being a panacea.

      It's the combination of the captcha, rate controls, and moderation that keeps spam out of here. All links here have rel="nofollow" as well, making them useless for google spamming, and the spammers know it. Basically it's a poor return on investment when you can spam a bunch of blogs that are wide open.
  • Javascript (Score:5, Interesting)

    by Aladrin (926209) on Wednesday November 08, 2006 @01:39PM (#16770757)
    I hadn't read the article yet, and just the summary, and as soon as they said 'hidden fields' that are attractive to spambots, I thought "Why not hide the fields from the spambot instead?"

    It's easy, you just have the javascript create all or part of the form. Or modify the form in some way. It would happen before the user even sees the form, and the spambot would have to implement a javascript parser to get it. (Or a parser, that's unique to your site.)

    I would think AJAX would be a huge hamper to them as well.
    • I guess you'd better hope that braille terminals have a javascript parser.
    • by _xeno_ (155264)

      It would amaze me if the bot writers weren't already using JavaScript-capable bots. Internet Explorer is an ActiveX control that bots can use. Firefox offers plenty of ways to access its browser programmatically. (Imagine a SpamBot extension.) Firefox's JavaScript engine is open source, and I think Internet Explorer exports theirs via the Windows Scripting... thingy. (You'll have to forgive me for being more knowledgeable of how Firefox works than Internet Explorer.) In any case, the JS engines can als

      • I'm guessing an end user would notice when IE opened up and started filling out forms on websites. Or they could use a hacked firefox, but then the worm payload would be gigantic. Compare that to the current bots which evade detection by running in the background.
        • by _xeno_ (155264)

          There's absolutely no reason why a bot would have to actually display the browser. The browser engines themselves are designed to allow embedding into other applications. There's no reason why a bot application would ever have to bother actually displaying the window created to house the browser's control. It would still run in the background, never displaying any UI, simply hosting the browser's control in an invisible window. (Windows has a class of windows that do not appear in the task bar, so the b

    • by Reziac (43301) *
      I did RTFA, and it mentioned problems with javascript and why they discarded that notion.

      TFA page has an example of the "hidden form", and it is indeed invisible -- so one less thing to confuse the user. Confused users were part of the issue they wished to resolve, so...

      I suppose spambots will evolve to check for how a form is set up, but meanwhile, I like this idea much better than the alternatives.

    • Re: (Score:3, Informative)

      by Aladrin (926209)
      Please, read before you respond.

      "I hadn't read the article yet," is NOT the same as "I haven't read the article yet,"

      I've read it. You can stop posting the same 'rtfa' over and over. Jeez.
  • Blind users (Score:4, Insightful)

    by awtbfb (586638) on Wednesday November 08, 2006 @01:42PM (#16770817)
    This is still somewhat problematic for blind users. If decoy field names are picked up when CSS is turned off, then there will be a lot of users exposed to the bogus fields.
    • by Reziac (43301) *
      Actually, no. TFA has an example, and it is indeed invisible to my preferred braindead browser that don't know no CSS. It's just blank space on the page, that I'd never know was anything unusual if TFA hadn't pointed it out. Rather like commented code.

      Unless readers for the blind start scraping HTML source instead of visible text, it shouldn't be an issue.

    • Re: (Score:3, Insightful)

      by Ahnteis (746045)
      It's fairly trivial to also hide a comment telling non-CSS-browser-users to leave a field blank.

      Blind people can't see. They aren't stupid. :P (Well, any more than anyone else.)
    • Use CSS' media types.

      Aural, braille, and embossed are all media types that would hide the fields for blind users if done correctly (i.e. used and the reader supports it, which you'd think they would want to). This technique is not the only reason why blind users' tools need to work differently based on media type in CSS.
  • Just shoot 'em on sight.

    KFG
    • by Thansal (999464)
      Just like everyone else, this is highly discriminatory against the visually impaired.

      You are a horrible horrible person!
  • field name encrypt (Score:2, Interesting)

    by Inmatarian (814090)
    Private Key encrypt the randomized field names and have a hidden Public Key field. That way, the fields foo, bar, and abacab have no sense of meaning to the bots, but will decrypt to subject, body, and spammer catcher.
    • Re: (Score:2, Interesting)

      by thejrwr (1024073)
      Mixing the form order up would help too, as otherwise the bot maker could just look at the order of the fields.
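One way to sketch the randomized-field-name idea together with the shuffled order. Note this swaps in a keyed hash (HMAC) with a per-form nonce in place of the public/private key pair described above, since the server only needs to recognize its own names again. `SECRET`, `REAL_FIELDS`, and all function names are illustrative assumptions.

```python
import hashlib
import hmac
import random
import secrets

# Assumption: a server-side secret; the nonce goes out in a hidden field.
SECRET = b"replace-with-a-long-random-value"
REAL_FIELDS = ("subject", "body", "trap")  # "trap" is the spammer catcher


def alias(nonce: str, field: str) -> str:
    """Derive a meaningless, per-form name for a real field."""
    digest = hmac.new(SECRET, f"{nonce}:{field}".encode(), hashlib.sha256)
    return "f" + digest.hexdigest()[:12]


def make_form_fields(nonce=None):
    """Return (nonce, aliased field names in shuffled order)."""
    nonce = nonce or secrets.token_hex(8)
    names = [alias(nonce, field) for field in REAL_FIELDS]
    random.shuffle(names)
    return nonce, names


def decode_submission(nonce: str, posted: dict) -> dict:
    """Map posted aliases back to real names, dropping unknown keys."""
    reverse = {alias(nonce, field): field for field in REAL_FIELDS}
    return {reverse[k]: v for k, v in posted.items() if k in reverse}
```

A bot that parses the visible labels instead of the field names still gets through, so this is best combined with a decoy field such as `trap`.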
  • by gorckat (960852) on Wednesday November 08, 2006 @01:54PM (#16771057)
    ...can it be clearly labeled as bogus? Something like:

    Subject: _______{-enter your spam topic here if you want me to disregard your email

    Can the label/tag telling someone to leave a field blank be hidden from a bot but clearly visible to a live person?
  • Hiding things seems like a good way to get search engines to not like you.

    • by lexarius (560925)
      Search engines do not need to index or analyze forms, only content and links. These techniques are not for hiding content or links, just making it more difficult for spambots to figure out how to use submission forms like the one I'm typing in right now.
  • My Method (Score:3, Interesting)

    by CastrTroy (595695) on Wednesday November 08, 2006 @02:01PM (#16771211) Homepage
    My method is to just disallow posting of html. I have a simple blog, and if they try to do anything like post too many HREFs or something, then I just deny the post. That seemed to work for the most part. The bots usually tried to post URLs on my site, so if they posted something like <a href=.... then I would just display an error message, since html doesn't show up properly anyway, because I encode the < and > with &lt; and &gt;. They also try posting [link]...[/link], which also doesn't work on my blog, so I just display an error message and let the user fix it. You can still post straight URLs, but that's not too good for spammers, because they usually want a link. I also stop people from trying to post more than 5 URLs in a single post, since I noticed the bots like to do that. I recently upgraded my blog to use AJAX to submit the comments. That adds an extra layer of protection against the bots, but I really haven't needed any since I added in the filters mentioned above.
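The filters described above could look roughly like this. It is a sketch under my own assumptions about the patterns and error messages; the poster's actual code is not shown in the thread, and `check_comment` and `render_comment` are hypothetical names.

```python
import re
from html import escape

MAX_URLS = 5  # the poster rejects comments with more than 5 URLs

HTML_LINK = re.compile(r"<\s*a\s+href", re.IGNORECASE)
BB_LINK = re.compile(r"\[(?:link|url)\b", re.IGNORECASE)
URL = re.compile(r"https?://\S+", re.IGNORECASE)


def check_comment(text: str):
    """Return an error message for the user, or None if the post is OK."""
    if HTML_LINK.search(text):
        return "HTML links are not supported; paste the plain URL instead."
    if BB_LINK.search(text):
        return "[link] tags are not supported; paste the plain URL instead."
    if len(URL.findall(text)) > MAX_URLS:
        return f"Please use at most {MAX_URLS} URLs per comment."
    return None


def render_comment(text: str) -> str:
    """Encode < and > (and &) so posted markup shows up as plain text."""
    return escape(text)
```

Returning a message instead of silently dropping the post keeps the behaviour the poster describes: a real user sees the error and can fix the comment.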
  • Warn them before they post that they can't post spam.

    Make it a contract to post there.

    If someone posts spam, then make them pay 1 or 2 bucks. Money$$

    Or even organize other blogs and websites to sue them.
  • Related Story (Score:4, Informative)

    by Amazing Quantum Man (458715) on Wednesday November 08, 2006 @02:13PM (#16771411) Homepage
    Since the editors didn't see fit to put this in related links:

    What Ways Can Sites Handle Spambot Attacks? [slashdot.org]

  • I maintain a small site that uses the Gossamer Threads Links 2.x package (any decent, free PHP/database packages to replace this cruft with?). It's one of those apps that allows related sites to submit links to be added to our 'partner links' page.

    I quickly eyeball the 100+ bot submissions daily for the few *real* submissions. The rest are for "Laboratory Equipment", Viagra, mail-order brides, porn, and other crap.

    And before anyone asks, I *have* looked into modding the scripts to add a simple barrier

  • I have 2 blogs set up on Blogger [blogger.com], one with a customized stylesheet and another using one of the standard CSS templates. I am not sure how good a job Blogger 1.0 does of preventing bot spam on blogs that allow anonymous posting, but there seems to be a lot of it around.

    However, the one with the customized style sheet receives no bot spam! The 'Comment' link is actually called 'Talk about this', and the whole section of the Blogger posting is set up differently (i.e. left to right rather than top to bottom). The one t
  • Unregistered users have to wait 15 seconds between previewing their comment and posting it. This should make it slow enough to spam that spammers will go elsewhere. Registered users that spam should be subject to moderation. If more than n of their posts get modded 'spam', they get booted. Permanently. Sure, they could create another account. But more likely, they'd just move on to easier targets.

    -b.
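A minimal sketch of the preview-then-wait rule, assuming the preview page embeds a signed timestamp in a hidden field so the server needs no per-user state. `SECRET`, `issue_preview_token`, and `may_post` are hypothetical names.

```python
import hashlib
import hmac
import time

# Assumption: a server-side secret that never leaves the server.
SECRET = b"replace-with-a-long-random-value"
MIN_DELAY = 15  # seconds between preview and post, as suggested above


def _sign(ts: str) -> str:
    return hmac.new(SECRET, ts.encode(), hashlib.sha256).hexdigest()


def issue_preview_token(now=None) -> str:
    """Called when the preview is rendered; embed the result in the form."""
    ts = str(int(time.time() if now is None else now))
    return f"{ts}.{_sign(ts)}"


def may_post(token: str, now=None) -> bool:
    """True only for an untampered token that is at least MIN_DELAY old."""
    ts, _, sig = token.partition(".")
    if not hmac.compare_digest(_sign(ts).encode(), sig.encode()):
        return False
    current = time.time() if now is None else now
    return current - int(ts) >= MIN_DELAY
```

In practice you would probably also cap the maximum age of a token so that one preview cannot be reused indefinitely.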

  • It seems like people rediscover the same techniques over and over and over without even bothering to do a simple Google search to find out if things have been done before. I block about 90% of submitted spam using Bad Behavior. I'm working on the other 10%...
  • I have a small-ish website that allows people to submit sites that they want listed in my directory (think old Yahoo). I review the sites submitted before adding them so I can make sure the sites are relevant. Robo-spam submission was getting pretty horrible so I switched to a simple captcha script and it stopped all the robo-spam. Problem is, spam is still getting through because humans are still submitting things by hand. Somebody in India, for example, is getting paid to manually submit irrelevant si
  • One of these things is not like the others:
    Cat dog fish *car*

    Black *stapler* white red

    car truck *J-lo* SUV

    *Madonna* J-lo K-fed Ja-rule

  • Sorry for the ignorance, but where are the /. captchas? I don't run into any when submitting comments...are they somewhere else?
  • by liangzai (837960) on Wednesday November 08, 2006 @02:41PM (#16771917) Homepage
    This will prevent 100% of the bots from even entering your page... ... plus a few IE users.
  • What actually all but eliminated spam sent through my web forms is disallowing newlines in fields where they shouldn't be (like the subject and from address fields).
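The newline check is tiny. A sketch, assuming the form arrives as a dict; the field names are placeholders for whatever single-line fields your form has.

```python
# Hypothetical names for fields that end up in mail headers.
SINGLE_LINE_FIELDS = ("subject", "from_address")


def has_line_break(value: str) -> bool:
    """CR or LF in a header field is how extra headers get injected."""
    return "\r" in value or "\n" in value


def rejected_fields(form: dict) -> list:
    """Return the single-line fields that contain a line break."""
    return [
        name for name in SINGLE_LINE_FIELDS
        if has_line_break(form.get(name, ""))
    ]
```

Multi-line fields such as the message body are deliberately left out of the check.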
  • Instead of introducing images and their associated problems and disadvantages, I would give users a little puzzle to solve; something that requires them to understand human language. Something like "Enter the first letter of every word in this sentence", "What color is a banana?", etc.
  • Vbulletin forums? (Score:3, Informative)

    by Shoeler (180797) * on Wednesday November 08, 2006 @03:11PM (#16772657)
    I run two largish Vbulletin forums - and we get at least 1-2 spammers a day. I haven't found a way to prevent them yet, but I have found a way to stop em from getting any traffic or money for the unsuspecting idiot that clicks on them.

    I use an anti-spam e-mail technique: blacklist.

    Vbulletin has a censoring system where words you choose can be replaced with your choice of characters - by default it's an *. www.clickmeforspam.com, where I would use the "clickmeforspam.com" as the censored word, shows up as www.****************** .

    It's quite hilarious to see the humans behind the spam, who have registered, gotten through a human image trap, clicked on a link e-mailed to them, logged in and posted their spam re-post it like 2-3 times only to realize they got owned by my filter. They get all pissed off, and by that time a user has reported the post or we've seen it and banned them. It's very fun to make fun of them in their spam posts filled with ***s. :)
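For forums without a built-in censor, the same effect can be approximated in a few lines. This is only an illustration of the idea, not how vBulletin implements it; `censor` is a made-up name.

```python
import re


def censor(text: str, banned_words) -> str:
    """Replace each banned word with asterisks of the same length."""
    for word in banned_words:
        text = re.sub(
            re.escape(word),
            lambda match: "*" * len(match.group()),
            text,
            flags=re.IGNORECASE,
        )
    return text
```

With `"clickmeforspam.com"` on the list, `www.clickmeforspam.com` comes out as `www.` followed by 18 asterisks, so the link is useless even if the post stays up for a while.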
  • So far, the approaches I've heard that I like the best are simple human questions (what is X times/plus/minus X, what is the second word in this sentence, etc). Field obfuscation and embedded public/private keys are pretty useful techniques. I don't like making a form only work when javascript is enabled, but there was a pretty clever little script that didn't apply the "action" of the form until it is submitted, which would probably confuse a lot of spam bots as well.

    However, I really haven't heard much
  • by sparkz (146432) on Wednesday November 08, 2006 @05:07PM (#16775151) Homepage
    I've been doing a variation of this for quite a while now on my phpBB forum [steve-parker.org]. There are bots which identify a phpBB forum and simply POST a user-account creation to the relevant page. This then adds their URL to the forum's memberlist page, improving their Google ranking.

    I won't stand for that, so the simple fix is to remove the "WEBSITE" input from the form. If "WEBSITE" gets POSTed along with the other data, I know it's a robot and post a message to kindly go away. Genuine users can edit their profile once the account is activated, if they want to plug their website.
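In code, the check amounts to something like the following. This is a Python sketch rather than phpBB's PHP, and `FORBIDDEN_FIELDS` and `is_bot_registration` are names I made up.

```python
# Fields that the real registration form no longer contains.
FORBIDDEN_FIELDS = {"website"}


def is_bot_registration(posted: dict) -> bool:
    """A field the form never offered can only come from a canned POST."""
    return any(name.lower() in FORBIDDEN_FIELDS for name in posted)
```

Unlike a CSS-hidden decoy, nothing extra is rendered at all, so there is nothing for a screen reader or a CSS-less browser to stumble over.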

  • use dnsbls (Score:4, Interesting)

    by joost (87285) on Wednesday November 08, 2006 @06:30PM (#16776645) Homepage
    Shameless plug! I developed a plugin for Ruby on Rails that uses DNSBLs to combat form spam. (begin shameless self promotion)

    dnsbl_check rails plugin [spacebabies.nl]

    Basically what the plugin does is check clients against one or more DNSBLs. You might know them from mail servers. You see, it turns out that the forms are almost always abused by bots. These bots are quite well known. sbl-xbl from spamhaus catches 80% in my setup, spamcop catches the rest. You enable the plugin for key controllers and it really does work.

    (/end shameless self promotion) mod me down if you wish
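For those not on Rails, the lookup itself is simple enough to sketch in Python. This is not the plugin's code; the function names are made up, the zone names in the usage note are assumptions based on the lists mentioned above, and only IPv4 is handled.

```python
import ipaddress
import socket


def dnsbl_query_name(ip: str, zone: str) -> str:
    """Build the lookup name: reversed IPv4 octets followed by the zone."""
    octets = str(ipaddress.IPv4Address(ip)).split(".")
    return ".".join(reversed(octets)) + "." + zone


def is_listed(ip: str, zones, resolve=socket.gethostbyname) -> bool:
    """True if any zone returns an address record for the client IP."""
    for zone in zones:
        try:
            resolve(dnsbl_query_name(ip, zone))
            return True
        except OSError:
            # Lookup failed (typically NXDOMAIN): not listed in this zone.
            continue
    return False
```

You would call it as `is_listed(client_ip, ["sbl-xbl.spamhaus.org", "bl.spamcop.net"])` before accepting a form post. The `resolve` parameter exists so the lookup can be stubbed out in tests or replaced with a caching resolver.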

