What Ways Can Sites Handle Spambot Attacks?

Posted by Cliff
from the barbarians-at-the-gates dept.
Amazing Quantum Man asks: "I'm a member of a site devoted to nitpicking TV shows and movies. It has always had an open posting policy — no registration required, and you could use any name you wanted. This policy was instituted way back in 1998, and led to some quite fun, freewheeling threads on various boards. Recently, we have come under spambot attack, with spambots posting links to gambling and porn sites on every single discussion board on the site. The admins have been trying to block IPs, but it's useless against a botnet. As a defense, it looks like the site is going to require registration, and disable anonymous posting. Many regulars, while they understand the need, are concerned that the freewheeling character of the site will be lost. Let me continue by saying that I'm not a site admin, merely a member there. Also, if it helps, the site in question is running Discus. Has anyone here been in a similar situation? How did you handle it, and what did it do to the 'culture' of your site?"
This discussion has been archived. No new comments can be posted.

    • Some sites use CAPTCHA... but I don't like it. I'll bet you I make a mistake in the CAPTCHA at least 30% of the time, which is just frustrating.
    • Verified email accounts - this is what I tend to use. User signs up, email with password gets sent. Some people don't like giving out their email for fear of SPAM and such.
    • Heavy user moderation - seems to work overall, look at /.
    • by thue (121682)
      CAPTCHAs have to be relatively hard to solve if they are widely used. If a CAPTCHA is not widely used then it can be quite simple, but still work.

      A forum I administer has a CAPTCHA which asks "what is six plus one" in plain text. Since the spammers do not have the time to manually solve the CAPTCHA for a small site such as mine, the bots fail to get through. So if you inserted a small customized CAPTCHA on your site then it might do the trick.
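A minimal sketch of the plain-text question idea above, in Python. The question table and the function name `check_answer` are illustrative assumptions, not code from the forum being described; a real site would pick its own site-specific questions.

```python
# Hypothetical question/answer table; each site should choose its own.
QUESTIONS = {
    "What is six plus one?": {"7", "seven"},
}


def check_answer(question: str, answer: str) -> bool:
    """Return True if the visitor's answer is one of the accepted answers."""
    accepted = QUESTIONS.get(question, set())
    # Ignore surrounding whitespace and letter case in the submitted answer.
    return answer.strip().lower() in accepted
```

The check is deliberately trivial: it relies on the site being too small for a spammer to bother writing a custom solver, not on the question being hard.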
      • by Smidge204 (605297)
        I made a quick anti-bot hack for a forum along these lines.

        When registering, you can fill out your *entire* profile (It was a phpBB forum - poor design to begin with but I was in no position to change that) - so the spam bots would fill in the homepage URL and e-mail fields. Even if their accounts were never activated/verified, their profile would still be viewable and be enough of an advert.

        So I modified the registration code a bit and added a message telling anyone who registered to leave the homepage fiel
        • You can also do something like have a robots.txt file that tells bots to stay away from a certain file, such as post.php (stay with me here). Then have that file linked all over the HTML, but hidden from actual browsing users (in hidden divs, etc). The bots will find the URL, follow it, and thus label themselves as BAD since nice bots and users will never visit that file. Also, have all IPs that grab robots.txt labeled as bots that cannot post content. This way they can't act like nice bots and still pos
          • by Smidge204 (605297)
            Wrong type: These were registration/spam bots not crawling/harvesting bots - They would never even look for a robots.txt file let alone follow it. All it did was dump POST information to register.php (or whatever the file is) to register an account with bogus info.

            =Smidge=
            • Granted they are different, but how would they know which file to post information to unless they parsed all the HTML on the site like a regular bot at least once? Granted, if you're using something common like phpBB, it could guess the filenames, but I was assuming a custom job.
              • by Smidge204 (605297)
                Easy: It's phpBB. The file name and post data are public knowledge to anyone that bothers to look at the source. Once the index.php file is found the location of the register.php file is also known because of the standard file hierarchy.

                That, or the spammer manually targeted the forum.

                Either way, by making the post data NON-standard, the bot was effectively defeated.
                =Smidge=

                • by hesiod (111176)
                  I have a phpBB forum, and I changed the forms in the registration to be a bit different and still have been bitten by spammers every day, so it's not foolproof. Of course, I don't have any control data on that -- it was changed from the beginning, so I don't know if it really helps or not. I just know that it's far from perfect.
        • by Ucklak (755284) on Friday November 03, 2006 @09:50PM (#16712239)
          I have about 100 sites (really) and I've evolved through different methods. This is what worked for me.

          First, once I identified what the spambots read, I figured out how to fool them.
          They read the form data: what the form posts to and what the form field names are.
          They populate the form fields and post to the action.

          I removed all JavaScript validation. It's useless. Do 100% server-side validation: verify that email addresses are valid, links are valid, dates are valid, check the word count for submissions, check for duplicate data across multiple form elements, etc...

          I added session ID checks, which cut spam by about 75%: the session ID goes in a hidden field, and if the request doesn't match the session ID, it doesn't post.

          I then separated the form from the page by using iframes.
          On the initial load of the form, the proper HTTP REFERER is committed to the session. If the form doesn't have the allowed referer, the form doesn't load and that form is blocked for the session with the IP address noted.
          99% of the IP addresses are from China, Latin America, Russia, The Netherlands, and Africa.

          For the 25% of spam still coming through, I had to figure out the next step to stop it without compromising usability the way CAPTCHAs do. There was no way I was going to use those, nor a `click the kitten` method either.

          I rewrote the form code to change the form elements names for every load.
          It was pretty much a hack but it worked.
          I had a random 6 character word generated every load.
          I dismantled that word every 2 characters and put 2 characters in every other character for the form element names that had been base64 encoded.
          I had an empty hidden element that had to remain empty as well.
          Bots tend to take every element and give it a value.

          That seemed to get rid of the other 20%. After a while, the spam continued at nowhere near the level it once was, but we noticed that the timing was 5 minutes between replies instead of seconds, meaning that the elements had to be filled out semi-mechanically instead of automatically.

          After copying that format for a number of forms, the spams that were coming through were from the same pool of networks.

          After data crunching and some time, I realized that obfuscating the element names really didn't deter nearly as much as the session ID and allowed referring pages did.

          I started to actually have a single form for all like forms and use that one form for multiple sites so that updates can happen across all sites at the same time instead of updating 80 or so forms across sites.
          I also am in the practice of banning IP address blocks for form access. If they really have something to say to us, they can contact us via email.

          Email, you say, is probably the bane of existence for those of us who receive spam.
          There are tons of JavaScript mail obfuscators; as long as you have a single email address for contact, obfuscate it and use it only for mailto links.

          I can seriously attest that for the past 13 months, I've never received a penis enlargement mail or any stock tip at that address.

          My forms are hosted at a single location and have strict referer checking. Any attempt to `figure it out` by looking at the iframe source is banned.

          If I get a form with non-relevant data, that IP is banned and all my sites and forms benefit.

          I've gone from 300-400 form requests a day to the legitimate 10 valid responses a day.
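Two of the checks described in the comment above, the per-session hidden token and the hidden field that must stay empty, can be sketched as follows. This is a hedged Python illustration, not the commenter's code: the `session` and `form` dictionaries, the field names `token` and `website`, and both function names are assumptions made for the example.

```python
import hmac
import secrets


def new_form_token(session: dict) -> str:
    """Create a random token, remember it in the session, and return it
    so it can be written into a hidden field when the form is rendered."""
    token = secrets.token_hex(16)
    session["form_token"] = token
    return token


def is_probably_bot(session: dict, form: dict) -> bool:
    """Apply the two checks to a submitted form (field names are hypothetical)."""
    expected = session.get("form_token")
    submitted = form.get("token", "")
    # Check 1: the hidden token must match the one stored for this session.
    if not expected or not hmac.compare_digest(
        expected.encode("utf-8"), submitted.encode("utf-8")
    ):
        return True
    # Check 2: the hidden "website" field must remain empty; bots that give
    # every element a value will fill it in.
    if form.get("website", "") != "":
        return True
    return False
```

A submission that was never preceded by a form load has no session token to match, which is the case the commenter credits with most of the reduction.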

          • Sorry, I don't follow your method. Could you try explaining the "Any attempt to `figure it out` by looking at the iframe source is banned." part? What's in the iframe and how is it involved in blocking?

            Thanks --

            Stephan
      • by nakke (143673) *
        Something like this has helped on some of my sites as well, you can use something on-topic to the site, for example a site about bananas might have:

        What is this site about? B _ _ _ _ _ _ (fill in the missing letters)

        The spam bots have already beaten phpBB CAPTCHAs, plus they sometimes even use a real email address, so email validation is no use either!
    • I loathe CAPTCHA, although I may end up implementing it on my system. I could also be convinced that a "which of these N pictures are kittens?" test might work.

      I run a small old-school weblog on my own content management system. Middling PageRank (6 or so), a couple of hundred readers. I just had the spambots discover my Wiki, but in the process of cleaning up that mess I was shocked and amazed by the emergent behavior I'm seeing in spambots. Every form on my site that could get random info plugged into it,
      • Simple solution: Bin/hide any post with more than three URLs unless it's from a verified registered account. If you have a non-phpBB syntax for URLs, bin/mark anything with a link that doesn't follow the regular syntax. And also make generous use of rel="nofollow".
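A rough Python sketch of the "more than three URLs" rule suggested above. The regular expression, the threshold constant, and the function name are assumptions for illustration; it only counts plain `http://` and `https://` links and would not catch obfuscated ones.

```python
import re

# Matches the start of a plain http:// or https:// link.
URL_RE = re.compile(r"https?://", re.IGNORECASE)
MAX_LINKS = 3  # threshold taken from the suggestion above


def should_hide(body: str, verified: bool) -> bool:
    """Hide a post with more than MAX_LINKS links unless the author
    is a verified registered account."""
    if verified:
        return False
    return len(URL_RE.findall(body)) > MAX_LINKS
```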
    • by M.Hare (1022269)
      CAPTCHA sucks, plain and simple. I can't count the number of attempts it took just to get a /. account.
      • It's also extraordinarily difficult to use for low-vision people [w3.org]. Making your website inaccessible to people is not usually a good idea.

        It's also easily breakable [zoy.org] - there are scripts out there that can decode the common CAPTCHAs with >90% success.

        This doesn't even include social engineering CAPTCHA breakers which are also known to exist e.g. spammer sets up a (say) fake porn site, hotlinks the CAPTCHA image of the site they want to spam. When someone comes along to fake porn site, it tells the user to en
    • by fossa (212602)

      My ideal forum: Anonymous, semi-anonymous, or "open" posts like the summary describes are run through some bayes or otherwise learning filter. If it's clean, it gets posted, if not, then to the trash. Through some mechanism not requiring a unique website login (e.g. a browser extension that made PGP signatures as easy as a click; why doesn't this exist?), a "verified" post could be made by the semi-anonymous PGP ID. Once enough trust is gained by refraining from making posts flagged as spam, or simply b

    • Heavy user moderation - seems to work overall, look at /.

      It never ceases to amaze me how in over 4 years of reading Slashdot, I've never seen a spam message. Well, excluding Slashvertisements and Beatles-Beatles type stuff. This is one of the top sites on the internet, with full anonymous posting supported. By all rights, this site should be inundated with all kinds of spambot messages.

      Perhaps the spammers fear the retaliation of the collective Slashmind?

    • ``- Some sites use CAPTCHA... but I don't like it. I'll bet you I make a mistake in the CAPTCHA at least 30% of the time, which is just frustrating.

      - Verified email accounts - this is what I tend to use. User signs up, email with password gets sent. Some people don't like giving out their email for fear of SPAM and such.''

      You could give your users choice. Either enter your email address (which will be used to send you a verification code), or solve this riddle. The riddle could be a captcha, but I prefer al
  • What do you think about requiring registration while still keeping the old open way of posting?
    Just log in, then post with whatever nick you want. Just don't track it or anything. You can even prepare some kind of statistics for users (how many posts they have made). And of course implement some CAPTCHA.
    • by Thansal (999464)
      Nice idea; personally I would combine it with the way /. works.

      Register if you want; if you don't, then deal with the CAPTCHA. And still let them use any nick they see fit for each post (possibly autofill the nick field with a name they pick, but allow changes).
      • by tepples (727027)
        Register if you want, if you don't then deal with the captcha.

        But should the process of registering necessarily require paying a sighted person to deal with the captcha for you?

  • See if you can set up a CAPTCHA that must be completed before the post can be put up. Multiple missed attempts could even ban an IP. Just be sure you have some alternate means for people that have issues with their vision.
  • It has always had an open posting policy -- no registration required, and you could use any name you wanted.

    There's no reason why that should change. Just add CAPTCHAs of some sort or another to the posting system. No more bots posting crap (although the CAPTCHA system might need to be changed every now and then depending on the strength of those chosen).
    • by mabu (178417)
      Captchas do not work as well anymore. There are legions of Indians and other people who are now paid slave wages to cross-post crap all over the Internet.
  • Amavisd-new has had p0f support for detecting the OS of the sending mail server for quite some time. It detects the OS type of the incoming mail connection and adds a header indicating the results. You can then use SpamAssassin to detect the OS and add an appropriate point total. Since few "real" organizations use desktop OSs for mail relays, you can usually assume a high probability of spamminess from such.
  • Akismet [miphp.net] is a very good antispam. It blocks 99% of spam on my forum.

    CAPTCHA doesn't work, many spambots can solve CAPTCHAs.

  • I would suggest maybe putting in Captchas [wikipedia.org] for every spot you might submit a post, etc. This way, bots cannot or have more difficulty making posts. Here are more links I had on these, but I haven't looked at them in a while...

    • by Fozzyuw (950608)

      Also, there's some note somewhere about CAPTCHAs containing questions instead of codes. So, your CAPTCHA might show 2 + 2 = and the answer to validate would be 4. This way, it's a double-edged effect. The computer would have to recognize the images and the 'question' so to speak. But your users would also have to be smart enough. =P

      Of course, you can extend this to be multi-level. Place multiple Captcha images for the Q/A. Such as...

      [ image 1 - code 1]
      [ image 2 - code 2]
      [ image 3 - code 3]
      ...

      And t

  • by deepb (981634)
    Don't let anonymous users post links to other websites.
    • by ColinPL (1001084)

      It blocks about 95% of spam, but some spam messages don't have any links or the links are obfuscated (h* *p :/ / www . example . com).

      • by deepb (981634)
        95% is pretty good, and I suspect that the remaining 5% would quickly taper off, because people can't "click" on obfuscated links, making them next to useless.
        • On top of that, I suspect this is a googlebomb attempt, as well. So removing links would help.

          But as the story poster, I have to say that part of the problem is the site software. It's running Discus, which doesn't have a "log in once" thing. I don't even know if it's possible to add captchas, though that would probably be the best solution. I also like the "Banana" solution below.

          The best solution, which would retain the site flavor, would be to require registration but allow pseudonymous posting.

          For exam
  • On my guestbook, I just say "posts must begin with the word Banana, which will be automatically stripped." It works. Some spambots are actually human so it doesn't stop them, but it's super-simple.
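The code-word rule from the guestbook comment above might look like this in Python. It is only a sketch: the function name is hypothetical, and matching the word case-insensitively is an assumption the comment does not state.

```python
CODE_WORD = "banana"  # the word posts must begin with


def strip_code_word(post: str):
    """Return the post with the leading code word removed,
    or None if the post does not begin with it."""
    text = post.lstrip()
    if not text.lower().startswith(CODE_WORD):
        return None  # reject: the code word is missing
    # Drop the code word and any whitespace that follows it.
    return text[len(CODE_WORD):].lstrip()
```

A caller would reject the submission when `None` comes back and store the returned text otherwise.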
    • by Fozzyuw (950608)
      That's an interesting concept. How's the usability? Do people read that little note and remember to add 'banana'? Or do you send them a notice should they fail and they can easily add your code word to the beginning?
    • A friend of mine who administered a forum was having spam troubles and didn't want to require registration for posting. His solution was to mess with the default boxes/buttons on the submission forms. Adding a few more of them on the pages used by anonymous posters and changing their default states ('uncheck this box if you are not a bot') stopped the spam quickly.
  • I used to get a ton of spam on my guestbook. I tried doing lots of little things in the code and it turns out the spam was being submitted without them filling out the HTML form. To force this to happen, I found a neat idea on some German website (it's down, so I mirrored it [princeton.edu]). The code will not accept a post if there is no number/checksum pair.
    That cut out a lot of the spam. The rest has been gone since I added another, required field "What is my first name?" It is like a captcha but much easier. No one wi
    • by panda (10044)
      I find that just checking the HTTP_REFERER header is sufficient. If it doesn't exactly match the expected URL for the form that the program is processing input from, then my server sends a 403 and slaps their IP address into .htaccess with a 'Deny from'.

      Since the spammers can't seem to figure out how to send a proper http_referer, it works rather well. They seem to always use the http://host.domain.tld/ [domain.tld].

      Of course, I'm not doing this on forum posting, but it is a form that can be used to send me an email. Th
      • You know that there are a number of products that automatically strip HTTP Referers, right? Norton AntiVirus is one - it thinks that this adds some privacy value for the user. Or what happens if I bookmark your comment page to come write an insightful comment later?

        Whilst I agree that checking for referers is a good thing (90% of the spam I've seen doesn't have them - they find the comment page, work out the form structure then hammer it using some app.), automatically DENY'ing users without referers may
        • by schon (31600)
          what happens if I bookmark your comment page to come write an insightful comment later?

          How does that affect the referer header of the submitted page? Are you suggesting that your web browser will allow you to click "submit", but not actually submit the page (maybe presenting you to a placeholder, so you can bookmark it?), and then somehow submit it later with all of the text you entered?

          If you bookmark a form, then come back to it, enter text, and submit, then the referer header will be correct.
          • Oops, yes - of course. Not sure what I was thinking. However, my point that referers are not reliable still stands.
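The referer exchange above suggests separating a missing header from a wrong one, since privacy tools strip the header for some legitimate users. A hedged Python sketch; the function name and the three labels are assumptions, and what to do with each result (403, ban, or fall back to another check) is left to the site.

```python
def classify_referer(referer, expected_url: str) -> str:
    """Classify the Referer header sent with a form submission.

    "ok"       - header exactly matches the page that serves the form
    "missing"  - header absent or empty (possibly stripped by a privacy tool)
    "mismatch" - header present but pointing somewhere else
    """
    if not referer:
        return "missing"
    if referer == expected_url:
        return "ok"
    return "mismatch"
```

Treating "mismatch" as hostile while giving "missing" a softer response is one way to keep the strict check without locking out users whose software removes the header.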
  • Use Bad-Behaviour [homelandstupidity.us] and mod-security [modsecurity.org].

    These two work perfectly for me.

  • I've been battling this for years now. Ironically, the best way to stop spambot attacks is to homebrew your own CGI stuff. If you can't do that, rename all the scripts to non-standard names so that the common URLs are not found.

    I've been using keyword blacklists. They have proven to be very effective. If you don't allow people to input names of common drugs or strings like ".php?" or ".asp?" you can knock out a lot of the affiliate/redirect spam.

    The biggest problems have been with the popular messageboar
    • Unless, of course, your commenters really do want to talk about Viagra or Cialis... :)
      • by mabu (178417)
        Fortunately I do not run any forums dedicated to Humvee vehicles or the NRA, so it's not an issue.
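A minimal Python sketch of the keyword blacklist discussed in this sub-thread. The term list is illustrative only (the drug names come from the reply, the `.php?` and `.asp?` strings from the original comment), and the function name is an assumption. As the reply points out, a plain substring match will also block legitimate posts that mention a listed term.

```python
# Illustrative blocked terms; a real list would be tuned per site.
BLOCKED_TERMS = ("viagra", "cialis", ".php?", ".asp?")


def contains_blocked_term(text: str) -> bool:
    """Return True if any blocked term appears anywhere in the text,
    ignoring letter case."""
    lowered = text.lower()
    return any(term in lowered for term in BLOCKED_TERMS)
```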
  • I've had to deal with spam attacks on both my personal site and a forum I use. In both cases, we tried to ban IP addresses, then tried invisible methods of stopping spam (e.g. hidden required fields populated by JavaScript), and nothing worked.
    In the end in both cases, we've just had to use a CAPTCHA system. Spammers tend to use multiple IP addresses (and I do mean in the hundreds, a lot of them proxies or botnet-controlled boxes) so banning simply doesn't work.
    I've tried doing things like only requiring a CA
    • by user24 (854467)
      On the forum, we set the CAPTCHA up so that once entered, you wouldn't have to re-enter it for 24 hours. This way it annoys users less.
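The 24-hour CAPTCHA exemption described above could be tracked in the visitor's session. A sketch in Python, assuming a dictionary-like `session` and a hypothetical key name `captcha_passed_at`; neither comes from the forum in question.

```python
import time

CAPTCHA_TTL = 24 * 60 * 60  # seconds a solved CAPTCHA stays valid


def mark_captcha_passed(session: dict, now=None) -> None:
    """Record the time at which the visitor solved the CAPTCHA."""
    session["captcha_passed_at"] = time.time() if now is None else now


def needs_captcha(session: dict, now=None) -> bool:
    """Return True if the visitor has not solved a CAPTCHA in the last 24 hours."""
    now = time.time() if now is None else now
    passed_at = session.get("captcha_passed_at")
    return passed_at is None or now - passed_at >= CAPTCHA_TTL
```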
    • by Tripster (23407)
      I have been tinkering with the methods in this PDF ..

      http://www.ngssoftware.com/papers/StoppingAutomatedAttackTools.pdf [ngssoftware.com]

      Specifically I have been testing out the Token Appending method, it looks like it might be a good method to try.

      I've set it so that the token is a SHA1 hash of today's date, the client browser string and a text string of my choosing, so basically this token will change daily for recurring visitors and be fairly unique to each visitor (you can just throw more unique qualifiers at it if needed).

      I
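The daily token described above (a SHA1 hash of the date, the browser string, and a private text string) can be sketched in Python as below. The separator, the placeholder secret, and the function name are assumptions; the referenced paper's exact construction may differ.

```python
import hashlib
from datetime import date

SECRET = "change-me"  # hypothetical site-specific string; keep it private


def daily_token(user_agent: str, day: date, secret: str = SECRET) -> str:
    """Return a SHA1 hex digest of the date, browser string, and secret.

    The value changes every day and differs between browser strings,
    so a bot replaying an old or foreign token fails the comparison.
    """
    material = f"{day.isoformat()}|{user_agent}|{secret}"
    return hashlib.sha1(material.encode("utf-8")).hexdigest()
```

The server would embed `daily_token(...)` in the form or URL when rendering and recompute it on submission to compare.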
  • On Wordpress [wordpress.org] you have the option of requiring moderation only the first time an individual posts. Once you have approved one post by them they no longer are moderated.

    Sure, it still involves trawling through moderated spam to find the genuine posts, but if you don't have massive traffic it works fine.

  • After seeing this presentation on OpenBSD's spamd [ualberta.ca], which profiles and greylists SMTP connections coming from botnets, I'm convinced of the need for HTTP POST greylisting.

    Point is twofold: slow the bots down (or stop the dumb ones altogether) and block obvious botnets completely.

    SMTP has the handy retry message. For HTTP, we would need to store the original POST request, and return a response with a 10-20 second meta-refresh to a confirmation url. Anonymous posters won't mind the wait, and the time window gi
  • registration required to post a URL or email address?

  • Check your users against DNSBLs [wikipedia.org]. Originally intended to block out malicious mailservers via their IP addresses, they are applicable on webservers as well. Via sorbs [sorbs.net] you can check for open HTTP and SOCKS proxies (interesting for you), open SMTP servers (not very interesting for you), webservers with unpatched vulnerabilities, hijacked IP netblocks and malicious (in bed with spammers) network service providers. Other lists include the here recently mentioned Spamhaus [spamhaus.org.uk] list, and various DULs (dial up user list
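A DNSBL lookup works by reversing the octets of the client's IPv4 address, appending the list's zone, and resolving the resulting name; an answer means the address is listed. The Python sketch below shows that mechanism. The zone `dnsbl.example.org` in the usage note is a placeholder, not a real list, and the function names are assumptions.

```python
import ipaddress
import socket


def dnsbl_query_name(ip: str, zone: str) -> str:
    """Build the DNS name to query: reversed IPv4 octets followed by the zone."""
    octets = str(ipaddress.IPv4Address(ip)).split(".")
    return ".".join(reversed(octets)) + "." + zone


def is_listed(ip: str, zone: str) -> bool:
    """Return True if the DNSBL zone resolves a record for this address."""
    try:
        socket.gethostbyname(dnsbl_query_name(ip, zone))
        return True
    except socket.gaierror:
        # No record (or the lookup failed): treat the address as not listed.
        return False
```

For example, `dnsbl_query_name("192.0.2.1", "dnsbl.example.org")` builds the name `1.2.0.192.dnsbl.example.org`. Each list documents its own zone name and usage terms, so check those before querying it from a web server.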
  • I have successfully blocked comment spam by rejecting messages with http:// in them. Most of the spam contains links, so this can be extremely effective. Maybe on the site in question, reject anonymous posts that have http links in them, and if you have a site you need to post, you have to get an account.

    Sean
  • I've had good success with grep(1), using a file filled with various words culled from spam.
    I also recycle known spam through the search software, so it automagically updates itself. Seems to work well, and the
    best part is that as your anti-spam technology improves, the people behind the spam robots tend to give up on your site.
  • I dealt with this same issue on a message board. For years it did not require registration to post and with a small cadre of level-headed moderators we had a lot of fun. It was good for everybody, from regulars to one time guests who just wanted to ask a question.

    Then, about two years ago (I think), the message board spammers began to get exponentially worse. Poker spammers were most of it, but I also saw a number of porn site spammers and some guerilla marketing campaigns that were awful. The evening t
  • I've been active for quite some time on a site dedicated to DIY tube guitar amps (ax84.com). We have a lounge area where anything goes, but the posting policy is quite loose, with all sorts of fun stuff occurring within [otherwise] on-topic threads as well.

    After getting hit with several posts by auto-spammers, the maintainer instituted new rules.

    You can register, which requires nothing more than a valid email address, handle and password (AFAIK, I registered when he was first testing logins). But we also h
