Bot Nets Behind Recent Spam Surge
gsslay writes "Everyone must have noticed a surge in spam recently, particularly for stock pump 'n' dump scams. The Register reports that anti-spam companies have seen a 30% increase in the last two months and, more worryingly, more of this spam is getting through to mailboxes due to the spammers' change in tactics. Rather than use unsecured mail relays, spammers are using bot nets, making spam harder to identify and eliminate. Bounced spam is also on the up, and some experts reckon it's past time to start worrying."
AI to Stop the Spam (Score:5, Interesting)
But this Bayesian strategy has been overcome by the spammers. They use hilariously strange word ordering to trick the spam filter and bring their spam score (see Graham's Lisp code) down to an acceptable range. Here's a piece of text from some spam that made it into my mailbox this morning: And it goes on for about 7 paragraphs with absolutely nothing to do with its pitch. It's because of this nonsense that it makes it into my mailbox in the first place.
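To make the trick concrete, here is a rough Python sketch of a Graham-style combining rule: take the tokens whose spam probability is farthest from a neutral 0.5 and combine them. The cutoff of 15 tokens follows Graham's essay as I understand it; the per-token probabilities below are made-up numbers for illustration, not measurements from any real filter.

```python
from math import prod

def combined_spam_probability(token_probs, n_interesting=15):
    """Combine per-token spam probabilities into one message score."""
    # Keep only the tokens farthest from the neutral value 0.5.
    interesting = sorted(token_probs, key=lambda p: abs(p - 0.5), reverse=True)
    interesting = interesting[:n_interesting]
    spam = prod(interesting)
    ham = prod(1.0 - p for p in interesting)
    return spam / (spam + ham)

# Hypothetical scores: four words from the stock pitch, forty innocent-looking filler words.
pitch = [0.99, 0.98, 0.97, 0.95]
padding = [0.02] * 40

pitch_only = combined_spam_probability(pitch)
padded = combined_spam_probability(pitch + padding)
```

With these invented numbers the bare pitch scores close to 1, while the padded message scores close to 0, because the filler words crowd most of the pitch words out of the "interesting" set.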
How do we eradicate this problem? What strategies do we use next?
Well, I would suggest that we stick to the Bayesian approach but instead of tokenizing via Paul Graham's proposed algorithm, we could investigate tokenizing the text based on letter groups (divide 'words' into 2-3 letter groups and test for those frequencies) or even natural language parsing. Yes, I know it sounds absurd but I really think that an engine could be written in Prolog using WordNet or another dictionary with some basic English rules in an attempt to parse and analyze incoming text.
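As a rough illustration of the letter-group idea, here is a minimal Python sketch of a naive Bayes scorer that tokenizes into overlapping 3-letter groups instead of whole words. The class name, the add-one smoothing, and the tiny training set are all my own assumptions for the example; this is not Graham's algorithm and not a tuned filter.

```python
import math
from collections import Counter

def letter_groups(text, n=3):
    """Split each word into overlapping n-letter groups (short words stay whole)."""
    groups = []
    for word in text.lower().split():
        letters = "".join(ch for ch in word if ch.isalnum())
        if not letters:
            continue
        if len(letters) <= n:
            groups.append(letters)
        else:
            groups.extend(letters[i:i + n] for i in range(len(letters) - n + 1))
    return groups

class LetterGroupBayes:
    """Hypothetical naive Bayes scorer over letter groups, with add-one smoothing."""

    def __init__(self, n=3):
        self.n = n
        self.counts = {"spam": Counter(), "ham": Counter()}
        self.docs = {"spam": 0, "ham": 0}

    def train(self, text, label):
        self.counts[label].update(letter_groups(text, self.n))
        self.docs[label] += 1

    def spam_log_odds(self, text):
        """Positive means 'looks like spam', negative means 'looks like ham'."""
        vocab = len(set(self.counts["spam"]) | set(self.counts["ham"]))
        totals = {label: sum(c.values()) for label, c in self.counts.items()}
        score = math.log(self.docs["spam"] / self.docs["ham"])
        for group in letter_groups(text, self.n):
            p_spam = (self.counts["spam"][group] + 1) / (totals["spam"] + vocab)
            p_ham = (self.counts["ham"][group] + 1) / (totals["ham"] + vocab)
            score += math.log(p_spam / p_ham)
        return score

# Toy training data, invented for the example.
clf = LetterGroupBayes()
clf.train("buy cheap stock now", "spam")
clf.train("hot stock pick buy today", "spam")
clf.train("meeting notes attached for review", "ham")
clf.train("lunch tomorrow at noon", "ham")
```

The point of the design is that a variant like "stocks" still shares the groups "sto", "toc" and "ock" with "stock", whereas a whole-word tokenizer would treat it as a word it has never seen.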
Who knows? Perhaps our need for a spam filtering engine could breed innovation in the AI community?
Smarter Spammers (Score:4, Interesting)
Are your mailbox counts filtered or unfiltered? If filtered, what strategy is used?
Current Problems (Score:3, Interesting)
Use IM Techniques + Captcha (Score:2, Interesting)
1- As in IM, no one can email you if you have not emailed before.
2- For first time email, the receiving server could send back a CAPTCHA (http://en.wikipedia.org/wiki/Captcha) or a product of two large primes to factorize.
The captcha would be solved by the human sender, or the factorization problem by her MUA. Nowadays email is almost instantaneous, so this would not add a noticeable delay. The whole protocol could be implemented over current email protocols with little modification to existing software.
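The factorization half of this could look roughly like the Python sketch below: the receiving side issues a product of two primes to unknown senders, the sender's MUA factors it, and the receiver checks the answer with one multiplication. All function names are hypothetical, and the 20-bit primes are deliberately tiny so the example runs instantly; a real deployment would have to pick sizes that cost a sender measurable work.

```python
import random

def is_prime(n):
    """Trial-division primality test; adequate for the small demo sizes used here."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    i = 3
    while i * i <= n:
        if n % i == 0:
            return False
        i += 2
    return True

def random_prime(bits, rng):
    """Draw random odd numbers with the top bit set until one is prime."""
    while True:
        candidate = rng.getrandbits(bits) | (1 << (bits - 1)) | 1
        if is_prime(candidate):
            return candidate

def needs_challenge(sender, correspondents):
    """Step 1: only senders we have never emailed get a challenge."""
    return sender not in correspondents

def make_challenge(bits=20, rng=None):
    """Receiving server: return a product of two primes for the sender to factor."""
    rng = rng or random.SystemRandom()
    return random_prime(bits, rng) * random_prime(bits, rng)

def solve_challenge(n):
    """Sender's MUA: find a factor by trial division (the expensive part)."""
    if n % 2 == 0:
        return 2, n // 2
    f = 3
    while f * f <= n:
        if n % f == 0:
            return f, n // f
        f += 2
    raise ValueError("challenge is prime, nothing to factor")

def verify(n, p, q):
    """Receiving server: checking an answer costs a single multiplication."""
    return 1 < p < n and 1 < q < n and p * q == n
```

The asymmetry is the whole idea: the challenge is cheap to create and check but costs the sender some CPU time per new recipient, which is negligible for one message and painful for millions.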
SPAM processing - server meltdown (Score:4, Interesting)
Image to text (Score:3, Interesting)
I think law enforcement should be working harder at catching spammers (internationally, if necessary) than they are at tracking down copyright infringers. Not because of any moral posture, but because I suspect the total economic impact of spam is greater than infringing use of content. I also think the prohibition against cruel and unusual punishment should be lifted.
Hey, now that I come to think of it, maybe spam is a bigger issue than oil. I say we start invading countries with spammers!
bot wars (Score:5, Interesting)
Maybe we need bots to fight the bots. Bot Wars. In a galaxy far, far, away...
Not so hard to catch (Score:2, Interesting)
Oh, and Slashdot? If you keep hitting me with animated advertisements that cannot be closed, I will be moving to Digg.
Email is a broken protocol (Score:4, Interesting)
Bayesian Has Failed (Score:5, Interesting)
No. Bayesian filtering has failed, just like every other filtering method before it. Modifying it will not work. Adding OCR for image text will not work. Creating a new filtering mechanism will not work. The spamming will continue, and more and more of it will get in.
Frankly, given that processing power, disc space, bandwidth, etc. are all increasing, I for one foresee the current spam/anti-spam arms race continuing indefinitely, with the amount of spam sent slowly increasing, and the amount caught by the filters being just enough to keep the amount of spam that reaches your inbox at or around a constant level. It's an endless cycle.
I say, turn it all off. All of it. The filters, the blacklists, the whitelists, Spamhaus, the lot. Let every single spam sent reach its destination, if just for one day. Let Joe Six-Pack finally realise the scale of the problem and just how much strain is being placed on mail servers. It will be both terrible and beautiful at the same time.
Then take off and nuke the site from orbit. It's the only way to be sure.
Re:AI to Stop the Spam (Score:2, Interesting)
So why aren't they used? The answer is two-fold. First of all Bayesian filters are very fast to train and very fast to use. Neural nets are computationally expensive to train and fast to use while support vector machines are expensive to both train and use.
The other reason is that apparently the people writing the mail clients have little or no knowledge of the more advanced methods while the people in the "AI" community seem to have limited interest in spam filtering.
Also, in the long term, server-side filtering is the only acceptable solution. Even with an adequate client-side spam filter, you have the problem that you are downloading the mail from the mail server. This not only puts unnecessary strain on the server but can be quite expensive if you for instance are synching your mail on your cellphone. And server-side anti-spam software is developed at an excruciatingly slow pace.
Finally, the second front must be legal. Wouldn't it be nice if the law enforcement agencies focused on getting the spammers rather than chasing file sharers? Unfortunately, there seems to be little interest in that in the US (the primary source of spam). In the EU it is illegal to send spam to somebody if you haven't gotten explicit permission from the person you are sending it to. In the US it isn't illegal unless the person you are sending it to has explicitly forbidden you to do so. A change of the US system to the one they have in Europe would be preferable.
Email Weaknesses and Compromises (Score:3, Interesting)
Re:Not so hard to catch (Score:2, Interesting)
Last week's press about Ameritrade and E*trade taking huge losses (22 million+ in writeoffs) points out that pump/dumpers can now actually just 'steal' access to a bunch of legit accounts (HAXDOOR ID/password capture via a keystroke stealer), wait a couple of weeks... then issue a bunch of BUY orders across the stolen accounts, use their pre-setup fake accounts to either SELL or SHORT the issue, ACH-OUT, and $$PROFIT$$, all in a matter of hours. In fact, they don't even have to SPAM people (typically SPAM email doesn't work, but SPAMMING newsgroups and chatrooms does).
The press last week noted that it is _hard_ to catch these villains, as they typically launder their money through several layers of classic identity-theft accounts (online brokerages, then banks, maybe Ebay (buy/sell between 2 stolen identities), then PayPal, then foreign accounts). Once you're able to cross international jurisdictions and are not dealing with $millions, the effort to catch a scammer is not worth it to the Feds, Interpol etc. Most scams like this net a couple hundred thousand USD per event, enough to make it worth setting up the one-time network: let's say $10K of expenses in stealing accounts [fake ids, birth certs, SSNs, drivers licenses] and setting up the seed cash for sales.