Preventing Forum Spam-bots?

Posted by Cliff
from the evolved-beyond-captcha dept.
A concerned reader asks: "Recently it seems that forums have become the new target for spam bots advertising everything from porn to casinos. The forums that I admin are constantly harassed by these bots even though you must enter the visual confirmation code (the picture with letters/numbers) as well as reply to an e-mail in order to register. This only started a few months ago so I'm suspecting that some new spam program was released that somehow gets around these anti-bot measures. How can I get rid of these annoying bots?"
This discussion has been archived. No new comments can be posted.

  • One word: (Score:5, Informative)

    by MadDog Bob-2 (139526) on Friday April 07, 2006 @05:50PM (#15088131)
    kittens [thepcspy.com]
    • I like the idea (even with the cute overload), though I'm not sure it really improves much over a single captcha image. Aside from the obvious anticipation of OCR improvements.

      It seems a bit process-intensive, though, judging by the load time I'm getting. The success message on the demo seems rather appropriate, given last weekend's Slashdot layout...
    • This also creates the problem of how many different kitten images you have to work with. If you have 10 kittens to choose from, it would take a few minutes to see them all and write a program to recognise each (and this would work on all the forums with lazy admins who won't change the default images). If you have a few thousand then you're using a few MB for the sole purpose of authentication.
    • If you used this to prevent automatic access to a porn site, would you be clicking on the kittens that are about to die?

      http://en.wikipedia.org/wiki/Every_time_you_masturbate..._God_kills_a_kitten [wikipedia.org]
    • Forum spammers want to submit very specific content: hyperlinks (to boost their Google page rank). Our forum gets hammered by spambots hundreds of times per day, yet nothing comes through - we simply filter away any message containing a hyperlink (plain, non-clickable URLs are allowed). Works like a charm - no user registration, no fancy and annoying CAPTCHAs.
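A minimal sketch of the link filter described above, in Python. The post does not say which markup its forum accepts, so the two patterns here (HTML anchors and BBCode `[url]` tags) are assumptions, as are the function names.

```python
import re

# Assumed markup for clickable links: HTML anchors and BBCode [url] tags.
CLICKABLE_LINK = re.compile(r"<\s*a\b[^>]*\bhref\s*=|\[url\b", re.IGNORECASE)


def contains_hyperlink(message: str) -> bool:
    """Return True if the message contains clickable-link markup."""
    return CLICKABLE_LINK.search(message) is not None


def accept_post(message: str) -> bool:
    """Reject posts with clickable links; plain-text URLs are allowed."""
    return not contains_hyperlink(message)
```

A post such as "see http://example.com for details" passes, while the same address wrapped in an anchor or `[url]` tag is dropped.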
  • by Raul654 (453029) on Friday April 07, 2006 @05:52PM (#15088135) Homepage
    For the record, those blurred/skewed letters and numbers are called a "Completely Automated Public Turing [wikipedia.org] test to tell Computers and Humans Apart" - Captcha [wikipedia.org].
    • Also... (Score:4, Informative)

      by Raul654 (453029) on Friday April 07, 2006 @05:57PM (#15088171) Homepage
      ...it's patented. [uspto.gov] (and Turing is spinning in his grave...)
      • That patent speaks of riddles and the user guessing the answer, how does that translate into the CAPTCHAs we recognize these days?
        • The claim section (the only part of the patent that has any legal weight) covers "modifying at least one perceptual attribute of the string of random characters to form a riddle configured to be easily answered by a human being with no advance knowledge of the riddle while being substantially difficult to answer by an automated agent unaided by human being, the string being a correct answer to the riddle; " -- the perceptible attribute that is modified is the readability, and the riddle that the human must
    • by croddy (659025) on Friday April 07, 2006 @06:00PM (#15088182)
      Before you implement a captcha, please consider the effect this will have on visually impaired users. Obviously, any system relying on an image will not be accessible to blind people; systems making use of colored images may not work for colorblind people. Providing audio captchas would help, but this will be a problem for people who are deaf -- and one cannot simply assume that users are not both deaf and blind.

      I have seen some captchas that ask users in plain text to solve a simple arithmetic or logic problem. This is going to be far more accessible than anything relying on embedded media.

      If you're sure that none of your users are blind or colorblind (which would be plausible only for an extremely small user base), then I suppose something like KittenAuth [arstechnica.com] might be appropriate.

      • by Xibby (232218) <zibby+slashdot@ringworld.org> on Friday April 07, 2006 @06:15PM (#15088267) Homepage Journal
        The forums that I run have a "If you are visually impaired or cannot otherwise read this code please contact the Administrator for help." with a mailto link.

        This has yet to be a problem as the forums that I run are oriented around shooters or MMPOGs. :)
      • Though not a bad idea, even plain text arithmetic is far from foolproof [google.com]. You could go more complex, but then you run the risk of excluding those who have trouble solving those problems, either in translating the word problem into a solvable mathematical format, or whatever. It would seem that a simple logic problem might be better at differentiating human from bot, but I can imagine that it would have an even higher false negative detection rate.

        Visual tests with an audio alternative for sight impaired

        • by stevey (64018) on Friday April 07, 2006 @06:31PM (#15088360) Homepage

          You could also go for the cuteness approach:

          Click on the three images which are OMG Kittens and you're identified as human.

          • This is even worse than logic puzzles. How many unique kitten pictures is that thing going to have? Ten? Twenty? Maybe fifty? All you have to do is get 60% of the kitten pictures programmed into the spambot. Then you just have to compute a CRC of each image served, and bang, you have cracked it. And it's not like it's any better than the scrambled text authentication. If you wanted to reduce server load, you could generate and cache a couple of thousand unique text strings. This approach make
            • If you read the article introducing the kittens concept, you'll see that the author intends it to be customized to each site, thus preventing spambots from simply memorizing the pictures. And randomly picking three out of 9 images only gives a possibility of success of 1/84, better than many word captchas are achieving these days.

              Anyone who wants to custom-program a bot for a single site would just be better off manually posting their spam.
              • No need to custom program anything. The program can grab 20 or 30 different captchas, figure out which images you are using, and simply have a human mark the kitten ones. This function will be implemented in all the spam software if this technique ever becomes widely used.

                Also, I fail to see how a word captcha could be guessable. A 5-letter sequence composed of alphanumeric characters would yield a 1/60466176 chance of guessing it right. That's one in 60 million. You'd be better-off playing the lottery
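The two blind-guess odds quoted in this sub-thread can be checked with a few lines of Python. This sketch assumes the kitten test asks for exactly 3 of 9 images in any order, and that the text captcha is case-insensitive over 26 letters plus 10 digits.

```python
from math import comb

# Number of ways to choose 3 images out of 9, ignoring order.
kitten_combinations = comb(9, 3)

# Number of 5-character strings over 36 symbols (a-z plus 0-9).
text_combinations = 36 ** 5

print(f"kitten guess: 1 in {kitten_combinations}")
print(f"text guess:   1 in {text_combinations}")
```

These are odds for a blind guess only; they say nothing about OCR or memorised image sets, which is what the rest of the thread argues about.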
                • Oh that's easy. Write a script that takes your kitten pictures and adds different borders or saves them at different qualities, ... It's easy to get >30 versions of every picture, and the CRC would always be different. Now your bot writer has to analyze the pictures to find the ones that are similar.
                  • Sure, but now you've defeated the whole point by increasing the server load. Not to mention that it's trivial to write a program to analyze the images. Let's not forget, most spammers are commercial entities. They are ready to spend some time breaking security systems.
                    • The alternatives can be pregenerated, so no server load. And as long as there are many sites with weaker CAPTCHAs no one is interested in cracking it. If someone has enough time and money to spend they'd just decode by hand.
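A rough Python sketch of the CRC argument in this exchange. The byte strings are stand-ins for image files (an assumption, to keep the example self-contained); a real bot would hash the downloaded image data.

```python
import zlib


def crc(data: bytes) -> int:
    """CRC-32 checksum of the raw file bytes."""
    return zlib.crc32(data)


# Stand-in bytes for one kitten picture and a re-saved variant of it.
kitten_original = b"kitten-picture-bytes-v1"
kitten_variant = b"kitten-picture-bytes-v2"

# The bot's lookup table: checksums a human has marked as "kitten".
known_kitten_crcs = {crc(kitten_original)}


def bot_recognises(image_bytes: bytes) -> bool:
    """True only if the exact bytes were seen and labelled before."""
    return crc(image_bytes) in known_kitten_crcs
```

The lookup matches `kitten_original` but not `kitten_variant`, which is the point made above: any change to the file bytes forces the bot to compare image content rather than checksums.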
      • by Jester998 (156179) on Friday April 07, 2006 @06:39PM (#15088398) Homepage
        I have seen some captchas that ask users in plain text to solve a simple arithmetic or logic problem.

        While not illegal, some may consider it amoral to discriminate against stupid people.
      • I have seen some captchas that ask users in plain text to solve a simple arithmetic or logic problem.

        This is not a good captcha. If someone wants to flood the forums, it takes about 3 minutes to write a regexp to crack these. You aren't going to implement more than 20 or so different logic puzzles, and it's rather trivial to automatically parse these. Also remember that you only need a 5-10% success rate to completely shitflood the forums. I don't think it's possible to create a captcha that is usable
        • "I don't think it's possible to create a captcha that is usable by vision-impaired users, except maybe a sound recording" - someone else in this thread has already described just such a thing. Any visually impaired reader could use the voice->sound function to pass that captcha, or one of those electronic braille monitoring things.
      • I have seen some captchas that ask users in plain text to solve a simple arithmetic or logic problem

        I actually implemented this on my blog a little while back as a quick deterrent (because I didn't have the resources to implement it). The system was quite simple - it basically was scientific notation like so:

        seven times one hundred plus eight times ten plus six

        Answer: 786

        Simple enough to check and because it's text it takes a little more effort to write something to crack it. I didn't get a comment
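A minimal Python sketch of a spelled-out arithmetic challenge in the format shown above. The function names and the restriction to three-digit numbers are assumptions; the post does not describe its implementation.

```python
import random

DIGIT_WORDS = ["zero", "one", "two", "three", "four",
               "five", "six", "seven", "eight", "nine"]


def spell_out(number: int) -> str:
    """Spell a three-digit number as hundreds, tens and ones in words."""
    hundreds, rest = divmod(number, 100)
    tens, ones = divmod(rest, 10)
    return (f"{DIGIT_WORDS[hundreds]} times one hundred plus "
            f"{DIGIT_WORDS[tens]} times ten plus {DIGIT_WORDS[ones]}")


def make_challenge(rng: random.Random) -> tuple[str, int]:
    """Return a (question text, expected answer) pair."""
    number = rng.randint(100, 999)
    return spell_out(number), number


def check_answer(submitted: str, expected: int) -> bool:
    """Compare the visitor's input with the expected number."""
    return submitted.strip() == str(expected)
```

Here `spell_out(786)` produces the exact sentence quoted in the post. The expected answer would be kept server-side (for example in the session), not in the form.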

      • I've seen some that use linguistically-based tests. Things like "What color is an orange?" or "Please type Bob's first name."

        Of course, if you're really getting hammered, you'll need to vary the structure of the questions (and the keywords) a lot, and probably move into the realm of general knowledge questions -- and then you need to make sure you're not relying on vocabulary or knowledge that would exclude more people than you intend.

        And the simple ones only work because it's not worth the spammers' time
      • My bank's [kiwibank.co.nz] system allows you to listen to a computer generated .wav instead.
      • You can always introduce audio into it, too:

        "This is an apple. It is smooth, shiny, and red."
        "This is fluffy. It is shiny with orange and white stripes."

        If set up properly, it should be easy enough for a human to guess which is the kitten and which is not, but difficult for a bot (without semantic reasoning) to tell the difference. You may have to avoid words that the bot can clue in on ("fur" is probably bad, for example).

        This technique has been around long before the "Kitten Rank" site, however, and by fi
    • by mnemonic_ (164550)
      I'll probably get downmodded for this but some GNAA members (a couple of them are MIT students) developed OCR tools that defeat captchas, very long ago.
      • There's also a proof-of-concept called PWNtcha (http://sam.zoy.org/pwntcha/ [zoy.org]) which can automatically work out a large number of common CAPTCHAs (including phpBB's and vBulletin's standard ones) with well over 90% accuracy.

        CAPTCHAs are NOT the best solution - they're just a band-aid, and they make your site harder to use ( especially for low vision people ). Personally I prefer web-server level blocking of dodgy UA's, IP ranges, POST payloads with something like the wonderful mod_security [modsecurity.org] for Apache, coupled wi
  • Maybe have a grace period between the time one registers and the time they are allowed to post or post replies?
    • Re:Grace period? (Score:4, Informative)

      by Donniedarkness (895066) <Donniedarkness@NoSpam.gmail.com> on Friday April 07, 2006 @06:16PM (#15088272) Homepage
      While this will keep some of the bots away, it will also cause the site to lose members. When I sign up on a forum, it is usually because I want to post RIGHT THEN. Of course, I'll probably continue to post on it.

      If a site makes me wait three days, though, I'm likely to forget about it in that time.

      Or were you talking about smaller grace periods? Perhaps 10 minutes? That might work well.

      • Maybe a system where a moderator has to allow your first 2 or 3 posts or something like that. Not sure how to do it, but depending on traffic and the amount of administrators/moderators you have, there might be a system that when you register, the first three posts have to be read by a moderator and allowed in. Maybe set a system that forwards them to all moderators, and then as soon as one of them clicks on "allow", it shows up on the forum.
        • Re:Grace period? (Score:3, Insightful)

          by FLEB (312391)
          It would work reasonably as well in reverse: Allow the person's posts, but forward them to a moderator. If the moderator determines them to be spam, that poster gets the boot (along with all their posts). Add in some intelligent "Find Similar" logic, and you'd have y'erself a good start at a forum anti-spam system.
        • A forum I'm on implemented a minimum post count before users can post links.. I guess the one or two spammers we got per month was too much. The only effect that I've seen is legitimate users have to jump through hoops to post links (even lurkers that have been registered for months still can't post links). It did get rid of most of the spammers though, it seems.

          But, it didn't completely stop them.. Two nights ago we had a guy spam us and told us to Google for his company's name and click the first link t
  • Easy (Score:5, Funny)

    by Kj0n (245572) on Friday April 07, 2006 @05:55PM (#15088155)
    Just display a confirmation page with the goatse.cx picture.

    Anyone who can still click on the confirm button is not human.
  • What's to stop a spammer/script kiddie from making a script that does all the registering except for the visual code, giving an average reg. time of maybe 5 seconds per site?
  • by etymxris (121288) on Friday April 07, 2006 @06:00PM (#15088183)
    Add hidden variables to submission forms that change everyday. This will force the bot software to do pagescraping for your specific webforum, which probably isn't worth their time. They will go to the easier targets first.

    But if they are defeating captcha, there is probably someone who just sits there manually spamming forums through anonymous proxies. The amount of money that can be made by doing this spamming is probably enough to pay people with lower standards of living to just do it manually. And if that's so, there's just no way to get around it. I started logging how many bots the captcha and hidden variables were catching, and it was tons. Still, I get spammers. Just not nearly as many.
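One way the daily-changing hidden variable could look, sketched in Python. The HMAC construction, the `SECRET` value and the one-day grace window are assumptions added for illustration; the post only says the hidden values change every day.

```python
import hashlib
import hmac
from datetime import date, timedelta

SECRET = b"replace-with-a-per-site-secret"  # hypothetical site secret


def daily_token(day: date) -> str:
    """Value for a hidden form field, different for every calendar day."""
    return hmac.new(SECRET, day.isoformat().encode(), hashlib.sha256).hexdigest()


def token_is_valid(submitted: str, today: date) -> bool:
    """Accept today's token, or yesterday's for forms loaded before midnight."""
    candidates = (daily_token(today), daily_token(today - timedelta(days=1)))
    return any(hmac.compare_digest(submitted, c) for c in candidates)
```

A bot replaying a form it scraped two or more days ago fails the check, so it has to re-scrape this specific forum every day.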
    • But if they are defeating captcha, there is probably someone who just sits there manually spamming forums through anonymous proxies.

      Nope.
      Well maybe, but not necessarily.

      There is at least one public [sentinel.deny.de] and many 'private' tools that can brute force captcha while rotating proxies between attempts.

      Plenty of freely available OCR components can be incorporated into your own program. It'd make much more sense to pay one programmer (or DIY) to whip up a quality OCR proggie than to pay monkeys to sit around typing in c

  • by aiken_d (127097) <brooks @ t a n g e n t ry.com> on Friday April 07, 2006 @06:08PM (#15088228) Homepage
    Good: CAPTCHA

    Better: dynamically change the names of form fields ("subject", "message", etc) based on the current time. MD5 hash the current hour with the field name, and have the software only check the current and previous values. Spam bots generally have to be told what field names to look for.

    Best: have good moderators who kill spam and block IP's more or less instantly. Not practical for smaller sites, of course.

    -b
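A sketch of the "Better" option above in Python: field names derived from an MD5 hash of the current hour and the base name, with the server accepting the current and previous hour. The exact string that gets hashed and the helper names are assumptions.

```python
import hashlib
from datetime import datetime, timedelta


def field_name(base: str, when: datetime) -> str:
    """Hourly field name: MD5 of the hour stamp plus the base name."""
    hour_stamp = when.strftime("%Y%m%d%H")
    digest = hashlib.md5(f"{hour_stamp}:{base}".encode()).hexdigest()
    return "f" + digest  # leading letter keeps the name a valid identifier


def read_field(form: dict, base: str, now: datetime):
    """Look up a field under the current or previous hour's name."""
    for when in (now, now - timedelta(hours=1)):
        name = field_name(base, when)
        if name in form:
            return form[name]
    return None  # field missing, or sent under a stale or plain name
```

As described in the post, nothing secret goes into the hash, so a bot author who learns the scheme could compute the names; mixing in a per-site secret would be a departure from the post but might be worth considering.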

    • Better: dynamically change the names of form fields ("subject", "message", etc) based on the current time. MD5 hash the current hour with the field name, and have the software only check the current and previous values. Spam bots generally have to be told what field names to look for.


      Unless you're also willing to change the order of fields on your post-submit page, as well as the form factor, that doesn't do much good.
  • by savala (874118) on Friday April 07, 2006 @06:09PM (#15088232)

    Don't use phpbb, vbulletin or whichever other forum software everyone uses. Don't name your registration page "register.php" or something similarly easy to guess. Don't give your username and password fields name and id attributes of "username" and "password". Etc, etc. There is no security in obscurity, but there sure as hell is lots of convenience and freedom from automated harassment.

    The rewards for writing scripts that can handle the subscription process for all the big software packages are simply too large. Yes, these software packages will now start up the arms race, same as has happened with weblogs and email and referer spammers (does anyone else have the feeling we've won that last one, btw?). You can try and follow along and update your forum software every other day. But it's much more convenient to simply duck under the radar. Chances are no spammer is going to bother figuring out how to register at your custom-built/modified forum.

    • Don't use phpbb, vbulletin or whichever other forum software everyone uses

      Much as I hate to agree with that, he speaks the truth -- the bots are written to target specific forum packages, and they almost always go after the popular ones. phpBB has taken a lot of stick for one or two security problems that came up, but in truth it's as good, if not better than its competition; the reason it gets hit so badly is simply because it's so popular.

      So if you can use a less-well-known package, that will keep you awa
    • While I definitely understand the logic of this, I have to dispute the practicality of it. I've attempted to use many of the lesser used forum softwares, and they're lesser used for a reason. They have considerably less functionality, aren't as user friendly, and tend to be riddled with bugs. I currently use phpbb, and spend a few minutes each week having to weed out the spam from my forum, banning ranges of ip addresses, and deleting bogus members, that kind of thing. This is a drop in the bucket to th
  • by oni (41625) on Friday April 07, 2006 @06:11PM (#15088244) Homepage
    If they are using something like hotmail, then maybe just disallow hotmail. Nobody with a brain uses it anymore anyway.

    If they are using gmail, then maybe google would be nice enough to start a service where you could report addresses that bots are using. The great thing about google requiring invites is that google now has this neat chain of responsibility. If they see a pattern where all of the addresses created by invites from a certain person's account have been used as bots, then they could delete all those accounts and all the accounts they invited. That would seriously screw the spammers.
  • I'm guessing you're using phpBB. I've actually been hit by these guys on my boards; it wasn't a problem for me until they started to post. It appears to be actual people and not robots. I should also note I didn't have this problem until I added Google AdSense to my boards. After I did that, I started to get two or three of these spammers each week. Another phpBB board I administer hasn't gotten a spam user yet.

    What worked for me was checking the registration e-mail addresses of these people and putting in

    • My forum had people registering accounts every day with adult/gambling/etc links - the registration message would fail, but they didn't care; they just wanted those URLs in the db.

      I did a search on phpBB's site about this and found I wasn't the only one with the idea of removing the URL field from the user name information. The phpBB people were not interested in creating a mod to do that, and they instead suggested I try the mod to block requests from proxies.

      The proxy mod worked for a while, and I kept i
    • If you are using phpBB, the first suggestion I have is to change the VC code to something else. It doesn't have to be hard to break, it just has to be different.

      There's also a huge topic on phpBB.com http://www.phpbb.com/phpBB/viewtopic.php?p=1404100 [phpbb.com] which details a few things you can do to stop them. The main suggestion is the Instant Ban mod (http://www.phpbb.com/phpBB/viewtopic.php?t=186683 [phpbb.com]), which will modify the registration page in such a way that automated attempts get banned. It is done in such a wa
    • What'd be really cool is a stealth ban where you can see your posts, but nobody else can.
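The stealth ban idea comes down to a visibility filter. A minimal Python sketch, with a hypothetical `Post` type and function name:

```python
from dataclasses import dataclass


@dataclass
class Post:
    author: str
    text: str


def visible_posts(posts: list, viewer: str, stealth_banned: set) -> list:
    """Hide posts by stealth-banned authors from everyone except themselves."""
    return [
        post for post in posts
        if post.author not in stealth_banned or post.author == viewer
    ]
```

The banned account still sees its own posts, so an automated spammer gets no signal that anything has changed.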

  • Be proactive! (Score:4, Insightful)

    by BertieBaggio (944287) * <bob&manics,eu> on Friday April 07, 2006 @06:19PM (#15088289) Homepage

    There are a number of options you have, depending on how aggressive you want to be. You may have implemented some of these suggestions already, but they may help other forum admins in a similar quandary.

    Firstly, disable anonymous posting. What works for slashdot does not necessarily work for phpbb. This may sound obvious, but a forum I check on now and again is slowly haemorrhaging members due to guest bot spam.

    Secondly, find yourself a list of public proxy servers. Ban them. Find some more. Ban them too. Also, take note of the IPs the spambots were using to post. Ban them as well (unless they are AOL IPs -- be smart and do an nslookup). Keep this list of banned IPs, and share it with the blacklist groups, or other forum admins you know. You help them, they help you.

    Thirdly, augment your signup process. You say you are using CAPTCHAs, but if the bots are getting around or through them, you have to do more. Write a few hundred straightforward questions; you can get your community to help you for this one. Have one or two of those questions displayed at registration time, along with the CAPTCHA. For example:

    Which of these is not one of the seven dwarves?

    • Doc
    • Sleepy
    • Bashful
    • Horsey

    Or would you like another question ?

    Keep this as simple as possible. "What color is the sky?" is about the level you are looking for. A bot won't be able to answer these unless it is specifically programmed to. Need I say you should serve a random question?
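A small Python sketch of serving a random question from a bank and checking the reply. The two questions come from the post; the data layout and function names are assumptions, and a real bank would hold the "few hundred" entries suggested above.

```python
import random

# Each entry: (question text, set of accepted answers in lower case).
QUESTIONS = [
    ("What color is the sky?", {"blue"}),
    ("Which of these is not one of the seven dwarves: "
     "Doc, Sleepy, Bashful, Horsey?", {"horsey"}),
]


def pick_question(rng: random.Random) -> tuple[int, str]:
    """Return (question index, question text) chosen at random."""
    index = rng.randrange(len(QUESTIONS))
    return index, QUESTIONS[index][0]


def is_correct(index: int, submitted: str) -> bool:
    """Case-insensitive check against the accepted answers."""
    return submitted.strip().lower() in QUESTIONS[index][1]
```

The index would be stored server-side with the registration session so that the visitor cannot choose which question is being checked.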

    For bonus points on this one, make the questions something to do with the topic of the forums. If the forums were about widgets, you could ask something (really basic) like "What is the most common color of widget?". Or make some of the questions about the TOS. You know, the thing everyone checks the box saying "I agree to abide by the TOS". This may alienate some people, though, which you may or may not want. Also remember to consider non-native English speakers.

    If you are still getting those darned bots, consider manually approving all registrations. This will obviously depend on how many new signups you get, and what kind of manpower you have (think moderators and "trusted community members"). On the other hand, you should be able to spot and stop bots right off the bat.

    But why stop there? Be even more proactive! Set up a honeypot. Disallow a certain directory with robots.txt, and ban all IPs that find their way there. Include an invisible link to the disallowed location and see what falls in the trap. Remember that blacklist you started earlier? Add (and share) these IPs!
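The honeypot can be sketched in Python as follows. The `/trap/` directory name, the class and its methods are hypothetical, and the ban list lives in memory only to keep the example short.

```python
class Honeypot:
    """Bans any client IP that requests a path under the trap directory."""

    def __init__(self, trap_prefix: str = "/trap/") -> None:
        self.trap_prefix = trap_prefix
        self.banned_ips = set()

    def robots_txt(self) -> str:
        """robots.txt body telling well-behaved crawlers to stay out."""
        return f"User-agent: *\nDisallow: {self.trap_prefix}\n"

    def status_for(self, ip: str, path: str) -> int:
        """Return the HTTP status to send, banning IPs that hit the trap."""
        if path.startswith(self.trap_prefix):
            self.banned_ips.add(ip)
        return 403 if ip in self.banned_ips else 200
```

Crawlers that honour robots.txt never request the trap path, so only clients that ignore it (or follow the invisible link) end up on the ban list.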

    Finally, let your community know what you are doing. They will appreciate the effort (If you have noticed the spam, so have they). Set clear guidelines, and encourage community vigilance.

    In the end, remember: spam is beatable.

    • In the end, remember: spam is beatable.

      Ahhhh! The optimism of youth!
    • That sounds like an awful lot of trouble.

      I can't remember which forum(s) allow you to do this, but at least one of 'em allows to you set a user up so that they can keep posting, but only they see their own posts.

      I think it makes a lot more sense to relegate the trolls and spammers to their own personal little playpen. Automated spammers aren't likely going to check if everyone else gets to see their posts... and as always, better an enemy you know than an enemy you don't.

      If they think their accounts are wor
    • Which of these is not one of the seven dwarves? * Doc * Sleepy * Bashful * Horsey

      Um. Dwarves? Dwarves are, y'know, heavily bearded guys with massive axes who go around hiring halfling burglars to help them plunder a dragon's hoard, have inherent resistance to the major deleterious effects of Rings of Power, and do a nice line in erotic mithril underwear.

      What you've got hold of there, on the other hand, are dwarfs.

  • by c0d3h4x0r (604141) on Friday April 07, 2006 @06:20PM (#15088295) Homepage Journal
    "Captcha" techniques aren't bulletproof. If someone can automate all but the "captcha test" part of the posting process, then someone can sit and repeatedly answer the captcha test and still post spam pretty efficiently.

    The only truly effective way to stop this crap is to require a certain amount of time to elapse before being able to post another post, like the way Slashdot does it, and to implement some kind of moderation+filtering system so the crap can all be modded down by vigilant users. Combine that with a couple other requirements (you must have a user account to post, and new users can't post for the first 48 hours), and you'll easily squash the spam problem.
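The two timing rules above can be sketched in Python. The 48-hour delay is from the post; the two-minute gap between posts is an assumed value, since the post gives none, and the function name is hypothetical.

```python
from datetime import datetime, timedelta
from typing import Optional

NEW_ACCOUNT_DELAY = timedelta(hours=48)   # from the post
MIN_POST_INTERVAL = timedelta(minutes=2)  # assumed; the post gives no figure


def may_post(registered_at: datetime,
             last_post_at: Optional[datetime],
             now: datetime) -> bool:
    """Apply the new-account delay, then the minimum gap between posts."""
    if now - registered_at < NEW_ACCOUNT_DELAY:
        return False
    if last_post_at is not None and now - last_post_at < MIN_POST_INTERVAL:
        return False
    return True
```

Moderation and filtering, the other half of the suggestion, are not covered by this sketch.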

    • No matter how fancy you make your captcha, human labor is cheap. This is especially true when you consider the lengths people are willing to go to get free internet porn. The most genius way I've heard of to beat CAPTCHAs:

      1. Find links to a handful of free thumbnail galleries
      2. Set up a webpage with links to said galleries
      3. Make every outgoing link require filling in a CAPTCHA

      When your page gets a hit, you pull down the CAPTCHA image (or whatever) from the target site, and serve it up to the mast
      • I've wondered what would happen if you distorted the CAPTCHA using a site's name or URL instead of a random background. Do you think at least some people would hesitate a moment if you went to some random porn site and had to type a CAPTCHA with slashdot.org watermarked in the background?
  • by McCarrum (446375) <<mark.limburg> <at> <gmail.com>> on Friday April 07, 2006 @06:24PM (#15088321)
    i wont echo the above (kittens and altering html templates to make a more unique code process - both well worth it) but i say that on one site i used to run, we allowed anyone with 1000 posts, all members of a screening club .. and every new user had to have their posts screened before being posted .. once an account got to 10 non-spam posts, their group changed to allow normal postings.

    i do recommend you use your community to help your community .. and odds are, they'll help as well
  • attack your site (Score:4, Interesting)

    by kebes (861706) on Friday April 07, 2006 @06:25PM (#15088326) Journal
    I'm certainly no expert in such things, but here are some suggestions. The idea, of course, is to make life difficult for the spam-bot (or the spam-bot writer I suppose) without making life hell for your users. You seem to already be using a CAPTCHA [wikipedia.org], but you could switch to a different one. Everytime you switch, the bot-writer has to update his code. This is annoying for him but is no big deal for your users, since they are humans and can pass whatever simple visual test you give them. You might also consider making small changes to the HTML of those "make new account" pages. It's likely that that bot is making many assumptions about how your page is organized. Changing the names of forms (or having random names), or changing subtle things about the layout (things that a human wouldn't even notice, but which would break an HTML parsing program that was expecting your page to be organized in a certain way) are also good ways to slow down the bots. Make the HTML obfuscated. Include bogus hidden forms, for instance.

    Perhaps the best way to fix your site is to attack it yourself. Try to write a simple bot that automates the login process, and see what happens. You may suddenly notice a subtle hole in your security (maybe the filename for the captcha gives away what it is... or maybe after a successful verification, the same cookie can be used to create another account... or something). In the process of attacking your own site you may uncover something you've missed before.
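One reading of the "bogus hidden forms" suggestion is a decoy field that humans never see and so never fill in. A Python sketch under that assumption; the field name and the CSS hiding are hypothetical.

```python
DECOY_FIELD = "website_url"  # hypothetical name chosen to tempt form-fillers

# Markup for the decoy: present in the HTML but hidden from human visitors.
DECOY_HTML = (
    f'<input type="text" name="{DECOY_FIELD}" value="" '
    'style="display:none" autocomplete="off">'
)


def looks_like_bot(form: dict) -> bool:
    """A non-empty decoy field suggests the form was filled by a script."""
    return bool(form.get(DECOY_FIELD, "").strip())
```

Bots that fill every field they find trip the check; bots that parse CSS or are written for this one site do not, so this only raises the cost of a generic attack.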
    • You seem to already be using a CAPTCHA [wikipedia.org], but you could switch to a different one. Everytime you switch, the bot-writer has to update his code. This is annoying for him...

      I'd doubt it. Newer OCR engines are quite flexible.

      At worst, they might have to make up a new profile to process your captcha. Though, I'll admit, some are really tough, even for humans to decode.

      Some people don't realize that a simple "type in the black letters on white background" isn't going to cut it anymore.

  • I host a phpbb2 bulletinboard to help coordinate a team of amateur game developers. It's not linked anywhere, nor is it installed in the default directory. Still, one of these spam bots managed to find it and within a week had 50+ registrations of people with bogus web addresses.

    My solution was to implement the visual check that everyone's talking about. I still get some registrations, but much fewer. What's crazy is that by default, these users can't do hardly anything. Unfortunately creating spam
  • ...but those moderators burn out pretty damned quickly under the load that a concentrated attack can bring - every damned day.

    The most recent batch to hit the site where I'm one of the mods, often use a *@mail.ru e-mail address and eight to ten character random character strings as the registered name.

    Most of those we are getting link to sites like the following:

    http://www.drugsn.com/ [drugsn.com]
    http://phentermine.snow-send.com/ [snow-send.com]
    http://internet-casino-gambling-online.snow-send.com/ [snow-send.com]
    http://xanax.crasn.com/ [crasn.com]
    http://www.drug [drugname.net]
    • Why don't you just implement an auto-ban filter? Attempt to post a URL with 'xanax' or 'casino gambling' in it, and you get your IP permanently banned.
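A Python sketch of the keyword check behind such an auto-ban filter. The keyword list uses the two examples from the post; the URL pattern and the normalisation of hyphens and dots to spaces are assumptions.

```python
import re

URL_PATTERN = re.compile(r"https?://[^\s\[\]<>\"']+", re.IGNORECASE)
BANNED_URL_KEYWORDS = ("xanax", "casino gambling")


def url_is_spammy(url: str) -> bool:
    """Check a URL for banned keywords, treating punctuation as spaces."""
    normalized = re.sub(r"[^a-z0-9]+", " ", url.lower())
    return any(keyword in normalized for keyword in BANNED_URL_KEYWORDS)


def should_ban(post_text: str) -> bool:
    """True if any URL in the post contains a banned keyword."""
    return any(url_is_spammy(url) for url in URL_PATTERN.findall(post_text))
```

Only URLs are inspected, so a user who merely complains about casino gambling spam in plain text is not banned.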

      • A part of the problem is that each linked-to URL is DIFFERENT and each posting IP is DIFFERENT.

        We ARE banning IDs and IPs, which MAY explain why there are no repeat posts from them, but there seems to be a virtually unlimited number of IPs, from around the world (UK, US, Poland, Japan, Germany, France, etc.), that these turkeys hit from.

        • Yeah, they are probably using hijacked PCs. I'm just saying that you could ban giveaway addresses. For example, I really doubt any legitimate user will be posting URLs with 'xanax' in them. Of course, the spammers could also get smarter.

          Maybe you should set up SpamAssassin to filter forum posts. After all, it does a pretty good job of detecting spammy keywords and such. Sort of like Slashdot's filters.

          Another possibility is to put in a probation period. Let's say, if you have been registered for less
    • Problem there is that most of the domains used are only used for a few days, a week or two at most. After that, the malicious user moves on to the next throwaway domain name. Blog spam is all about getting one's pagerank high, so that someone looking for terms like xanax, or texas hold-em, will see the spammer's site above more legitimate sites. If you have mod_security installed, you may want to try the comment spam blacklist [gotroot.com] as a starting point. I recommend only using entries that are a couple months old,
      • Blog spam is just about page ranks. There are bots scouring the net for anything that looks like a blog or bulletin board and posting tons of crap. I started getting a lot of blog spam on my site a couple of months ago. Thing is, links couldn't even be posted. There were a bunch of URLs, but none of them had links in them, because I don't allow any kind of HTML in the posts specifically for this reason. The thing that annoyed me the most was how ugly these things make the site look. Anyway, I implemented
        • Google, Yahoo and MSN have already done this. Simply insert 'rel="nofollow"' into all the <a> tags that people post in the comments, and although the links still show up, it makes them pointless for spammers trying to increase their PageRank.
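As a rough illustration of the rel="nofollow" idea, here is a sketch that rewrites anchor tags in a submitted comment. The function name `add_nofollow` is hypothetical, and the regex approach is an assumption made for brevity: it leaves tags that already carry a rel attribute untouched and can be confused by a '>' inside an attribute value, so a real forum would more likely do this in its HTML sanitizer.

```python
import re

# Matches an opening <a ...> tag and captures its attribute text.
ANCHOR_OPEN_TAG = re.compile(r"<a\b([^>]*)>", re.IGNORECASE)
REL_ATTRIBUTE = re.compile(r"\brel\s*=", re.IGNORECASE)


def add_nofollow(comment_html):
    """Add rel="nofollow" to every <a> tag that has no rel attribute yet."""

    def rewrite(match):
        attributes = match.group(1)
        if REL_ATTRIBUTE.search(attributes):
            return match.group(0)  # already has a rel attribute; leave as-is
        return '<a{} rel="nofollow">'.format(attributes)

    return ANCHOR_OPEN_TAG.sub(rewrite, comment_html)
```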

          I know this won't help with the unsightly comments on your website, but since this is the slashdot crowd just flag all the comments with URLs in them as 'hidden' and on a daily/whenever basis go through them deleting spam and unhiding legitimate comments. Stick this all in a
          • Stick this all in a central control panel and it's unlikely to take up more than 10 minutes of your time.

            I basically gave up on blogging because I had to sort through 500 spam comments a day. I know another blogger who had to clean 7,000 (yes, thousand) spams out of his blog every day.

            It took both of us longer than 10 minutes.
          • Looks pretty good, but that hidden link could cause problems for legitimate users. There are some browsers that prefetch all the links on a page, in order to be able to show the content faster when you get around to clicking the link. Banning any user who loads a link would probably block out all the users who use such tools. Although, maybe that's a good thing, because they are using way more bandwidth than necessary, by loading your entire site, even though they aren't going to look at it.
  • First of all, check the user agents of the users/bots doing it. This is fairly obvious for spammers to check for and change, but it's worth a look anyway. Another idea is to prevent all new users from posting links for a week or so, or even anything that looks like a link, such as anything that contains "http://", "www", "w w w", and the like: anything that you can block that won't restrict normal conversation on the forums too much. Although, I suppose it's possible that they may then turn to using gibberi
  • radical measure (Score:3, Interesting)

    by dario_moreno (263767) on Friday April 07, 2006 @06:48PM (#15088457) Homepage Journal
    I saw a forum which required that you post a (non-'shopped) picture of yourself holding a 45 rpm record of the artist the forum was about before getting an account...best signal/noise ratio I ever saw with rec.guns, which seems to be moderated by gods because of the very high flame and spam potential!
  • Some of these are glitchy, and the code can be obtained from hidden form values or the image URL.
  • Block recurring IP addresses used by spammers, non-browser programs (yes, bots do tend to identify themselves in access logs), and those who seem to have accessed the post page directly (bookmarked, perhaps) from nowhere.
    • IP addresses: The big boys use open proxies all over the world. You'll often get spam which is clearly from the same source but comes from IP addresses all over the place.

      User agent strings: Again, the big boys use proper user agents so that they look like regular browsers.

      Referrers: Those are unreliable even with human visitors, as proxies (as e.g. used by companies) often filter those out. By relying on referrers you'll block a good portion of your regular visitors.

      Having said that, there are tools

  • Cheep medz (Score:5, Funny)

    by fm6 (162816) on Friday April 07, 2006 @08:00PM (#15088790) Homepage Journal
    www.cheapmeds.com
  • Go ask the porn webmasters which CAPTCHAS work and which don't.

    A better idea is to ask the people who spend their time brute forcing porn sites. They'll know what is undefeatable and what isn't, where the webmaster may only be worried about limiting the damage instead of preventing it outright.
  • You know, it's possible the spam-bots are using human-based systems to bypass your "computer can't recognise it" authentication method. Here's two ways:

    1. Spammer farms out registration to third world sweatshops - for US$1 per day, a person just sits there and fills in registrations then passes them on to the bot system to use.

    2. Spammer's system redirects your challenge to a "Free Porn Sign Up" page - now nudie-hungry humans are filling it in so they can see free naughties.

    Either way is not impossible t
  • by WoTG (610710) on Friday April 07, 2006 @10:17PM (#15089258) Homepage Journal
    I run a quiet phpBB for forum support of some websites of mine. For the last few months SPAM has outnumbered real posts by a large margin. I tried a CAPTCHA module (I think it was the built-in one) and it did next to nothing - they aren't programs, the posts are from humans who have (low-paying) jobs to post links on message boards.

    I had reasonable success by limiting posts to people who have verified their email address -- I think that that was also a feature of a recent phpBB update.

    But the spam still outnumbered posts, so in the last two weeks I've added these two phpBB mods:
    http://www.phpbbhacks.com/download/4878 [phpbbhacks.com] - this mod checks each registration IP address against the DNS blacklists. I think that it improved the situation, but it didn't stop the problem outright, and I still had to clean up the board once in a while.

    http://www.phpbbhacks.com/download/6208 [phpbbhacks.com] - this mod gives a really easy way to delete a user and all of their posts at once. It's not a fix, but it's turned out to be the best solution. It only takes a few seconds to undo the damage from any one individual, no matter how many spam posts they have made. A person could spend 20 minutes registering and posting 20 messages and I have to spend 20 seconds nuking the account and all its posts. It's a fair trade, and I get some small satisfaction in that!
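For readers curious how a DNS blacklist check on a registration IP generally works, here is a sketch of the usual DNSBL lookup convention. It is not the code of the phpBB mod mentioned above. The zone name `dnsbl.example.org` used below is a placeholder, and the function names are hypothetical. The IPv4 octets are reversed, the blacklist zone is appended, and the name is resolved: a listed address resolves (conventionally to a 127.0.0.x answer), an unlisted one does not.

```python
import ipaddress
import socket


def dnsbl_query_name(ip, zone):
    """Build the DNSBL lookup name: reversed IPv4 octets followed by the zone."""
    address = ipaddress.IPv4Address(ip)  # raises ValueError on malformed input
    reversed_octets = ".".join(reversed(str(address).split(".")))
    return "{}.{}".format(reversed_octets, zone)


def is_listed(ip, zone, resolver=socket.gethostbyname):
    """Return True if the blacklist zone has an entry for this IP.

    The resolver is injectable so the logic can be exercised without network
    access. Any lookup failure, including a DNS outage, counts as "not listed".
    """
    try:
        resolver(dnsbl_query_name(ip, zone))
    except socket.gaierror:
        return False
    return True
```

A registration handler could call something like `is_listed(client_ip, "dnsbl.example.org")` and refuse or flag the sign-up on a hit. Which blacklist zones to trust is a policy choice this sketch does not make.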
  • mod_security (Score:2, Informative)

    by fthiess (669981)
    I've had quite good luck by using Apache mod_security (modsecurity.org) to filter web activity. Yes, all the suggestions people have been giving about CAPTCHAs, blocking people with addresses in high spam domains, etc., are all good and useful, but mod_security lets you cover a base those approaches are missing: it lets you block spammers from posting spam, even if they somehow manage to get through your registration defenses. I use a mod_security ruleset based on one published at http://gotroot.com/tiki- [gotroot.com]
  • Spam in forums should be dealt with like e-mail spam: delete it with filters.

    Add spam to text filter sets to reduce all future spam posts to blanks. Sure, it's hard and time-consuming, plus it takes its share of CPU power, but it's the most user-friendly approach. No CAPTCHAs, just text filtering.

    All forms of spam can be catalogued and their strings added to blocklists. E.g., if you post something containing the string "Am?z?ng op?or?un?ty" (question marks indicate any letter), you get banned for a week.
    Or if you post "ch?ap Vi?gra substitut?",It get te
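The wildcard blocklist described in this comment could be sketched roughly as follows, taking "question marks indicate any letter" literally. The two patterns are the ones quoted in the comment; the function names and the case-insensitive matching are assumptions.

```python
import re

# The two example patterns quoted in the comment; '?' stands for any letter.
BLOCKLIST = ("Am?z?ng op?or?un?ty", "ch?ap Vi?gra substitut?")


def wildcard_to_regex(pattern):
    """Compile a blocklist pattern where '?' matches exactly one letter."""
    parts = []
    for char in pattern:
        if char == "?":
            parts.append("[a-z]")
        elif char.isspace():
            parts.append(r"\s+")  # tolerate extra spaces or line breaks
        else:
            parts.append(re.escape(char))
    return re.compile("".join(parts), re.IGNORECASE)


def matches_blocklist(post_body, patterns=BLOCKLIST):
    """Return True if any blocklist pattern occurs anywhere in the post."""
    return any(wildcard_to_regex(p).search(post_body) for p in patterns)
```

Because '?' only matches letters here, a spammer who substitutes digits or symbols ("Am4zing") would slip through. Widening the wildcard is a one-line change, at the cost of more false positives.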
  • Use something like: reply e-mail activation and plain text only for n00bs. Then moderator review to get past n00b. On one forum I joined, briefly, I as a n00b couldn't post in HTML, upload an avatar or use smilies (like I cared about that).
  • If you don't want to use phpBB, PunBB is great. It's much easier to make themes for since it's XHTML 1.0 Strict compliant, so most of the changes you can make are done with just the CSS.

    A good idea I saw on a forum once was that new users can't make a new topic until they have made at least 2 replies. Most bots are set up to make new topics and not replies, although I guess they could change that. I've even seen one forum that makes you wait 48 hours before you can post at all.
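The two probation rules mentioned here (at least 2 replies before a first topic, and a 48-hour wait before any post) could be expressed as simple checks like the sketch below. The `Member` class and function names are hypothetical, and the two rules come from two different forums in the comment, so they are kept as independent checks.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

MIN_REPLIES_BEFORE_TOPIC = 2       # threshold taken from the comment
POSTING_DELAY = timedelta(hours=48)  # waiting period taken from the comment


@dataclass
class Member:
    """Hypothetical minimal view of a forum account."""
    registered_at: datetime
    reply_count: int = 0


def can_post(member, now):
    """48-hour rule: no posting at all until the waiting period has passed."""
    return now - member.registered_at >= POSTING_DELAY


def can_start_topic(member):
    """Reply-count rule: new topics only after enough replies."""
    return member.reply_count >= MIN_REPLIES_BEFORE_TOPIC
```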

    Another idea is
  • First off, Google it [google.com]. Look and see what everyone else has done, and see what works and what doesn't. THEN come here to /. and ask your question.

    Here are a couple places to start your search:

    I'm just putting the final touches on my own hashcash implementation that doesn't require a server-side database, I'll post a lin
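The commenter's implementation is not shown, so the following is only one possible way to do hashcash-style proof of work without a server-side database. The server signs a timestamp with a secret key, the client must find a nonce whose SHA-256 hash has a number of leading zero bits, and the server re-checks the signature, age and hash. The secret, difficulty and expiry values are placeholders, and all function names are hypothetical.

```python
import hashlib
import hmac
import time

SERVER_SECRET = b"change-me"  # placeholder; keep the real key private
DIFFICULTY_BITS = 12          # placeholder; higher means more client work
MAX_AGE_SECONDS = 600         # placeholder; how long a challenge stays valid


def _sign(timestamp):
    return hmac.new(SERVER_SECRET, timestamp.encode(), hashlib.sha256).hexdigest()


def issue_challenge(now=None):
    """Server side: hand out 'timestamp:signature'; nothing is stored."""
    timestamp = str(int(time.time() if now is None else now))
    return "{}:{}".format(timestamp, _sign(timestamp))


def has_leading_zero_bits(digest, bits):
    return int.from_bytes(digest, "big") >> (len(digest) * 8 - bits) == 0


def _work_digest(challenge, nonce):
    return hashlib.sha256("{}:{}".format(challenge, nonce).encode()).digest()


def solve(challenge, bits=DIFFICULTY_BITS):
    """Client side: brute-force a nonce that satisfies the difficulty."""
    nonce = 0
    while not has_leading_zero_bits(_work_digest(challenge, nonce), bits):
        nonce += 1
    return nonce


def verify(challenge, nonce, now=None, bits=DIFFICULTY_BITS):
    """Server side: check signature, age, and proof of work."""
    try:
        timestamp, signature = challenge.split(":")
        issued_at = int(timestamp)
    except ValueError:
        return False
    if not hmac.compare_digest(signature.encode(), _sign(timestamp).encode()):
        return False
    current = time.time() if now is None else now
    if not 0 <= current - issued_at <= MAX_AGE_SECONDS:
        return False
    return has_leading_zero_bits(_work_digest(challenge, nonce), bits)
```

One trade-off of keeping no state: a solved challenge can be replayed until it expires, so the expiry window has to be short, or the challenge has to be bound to something like the post content.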
