Fighting Spam with DNA Sequencing Algorithms
Christopher Cashell writes "According to this article from NewScientist, IBM's Anti-Spam Filtering Research Project has started testing a new spam filtering algorithm, an algorithm originally designed for DNA sequence analysis. The algorithm has been named Chung-Kwei (after a feng-shui talisman that protects the home against evil spirits). Justin Mason, of SpamAssassin, is quoted as saying that it looks promising. A paper is available on the algorithm, too (PDF)."
Feng Shui hardware (Score:5, Funny)
Re:Feng Shui hardware (Score:1, Funny)
Re:Feng Shui hardware (Score:4, Funny)
Re:Feng Shui hardware (Score:2)
Re:Feng Shui hardware (Score:5, Informative)
I mean, come on - don't anti-spam programs have the coolest names? SpamAssassin, Vipul's Razor...
Re:Feng Shui hardware (Score:2, Insightful)
Penn and Computing (Score:2)
Get the Feng Shui Motherboard (Score:3, Funny)
Wordfilter (Score:3, Insightful)
Even with training, isn't this just some regexp and searching for particular strings?
And what about short messages that don't use as many words: is the spam score relative or absolute? The article is a little low on details; can anybody point to some more informative articles?
Re:Wordfilter (Score:3, Interesting)
personally I'd prefer a much better set of filter tools e.g. being able to say "I only speak English, I NEVER use this account for commerce, and the people I email are professionals so score spelling mistakes much higher as probable spam".
can someone point me in the direction of such a filter?
Re:Wordfilter (Score:4, Informative)
personally I'd prefer a much better set of filter tools e.g. being able to say "I only speak English, I NEVER use this account for commerce, and the people I email are professionals so score spelling mistakes much higher as probable spam".
can someone point me in the direction of such a filter?
How about spamassassin? Just add the following to /etc/mail/spamassassin/local.cf:
ok_languages en
And increase the score for BIZ_TLD and other tests you find more important than others. Scoring per test is fully configurable, complete list of tests here [apache.org].
Re:Wordfilter (Score:2)
But really- have a new algorithm that's not perfect? Work on it. More algorithms to choose from cannot mean anything but better antispam solutions.
Mozilla Firefox (Score:2, Insightful)
With the nature of new spam messages that look like real emails, the only person who can really tell if something is spam is the recipient.
Thunderbird (Score:2, Informative)
Re:Thunderbird (Score:2)
Re:Mozilla Firefox (Score:3, Insightful)
the newest version has been doing better so far.
I think my problem is my rate of email is quite low so it's difficult to train. I'd like it if there could be a database where if a subject header is reported as spam by one user it affects other users' scoring.
Re:Mozilla Firefox^WThunderbird (Score:2)
Re:Mozilla Firefox (Score:3, Informative)
There are a few databases out there that take hashes of spam e-mails (either sent to spam traps or reported) and use them for spam tagging. SpamAssassin can use their client programs to help tag messages also - I don't know if there's an extension or anything for Thunderbird, I don't use it.
The three that come to mind are DCC [rhyolite.com], Razor [sourceforge.net] and Pyzor [sourceforge.net].
All have their advantages
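For anyone wondering what the shared-database idea looks like in code, here is a minimal Python sketch. It is not how DCC, Razor or Pyzor actually work (those use their own fuzzy checksums and network protocols); the exact SHA-256 hash, the `SharedSpamDB` class and the `min_reports` threshold are all assumptions made for illustration.

```python
import hashlib
import re


def body_fingerprint(body):
    """Hash a case- and whitespace-normalized message body (illustrative only)."""
    normalized = re.sub(r"\s+", " ", body).strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


class SharedSpamDB:
    """Hypothetical in-memory stand-in for a networked hash database."""

    def __init__(self):
        self.reports = {}  # fingerprint -> number of spam reports

    def report(self, body):
        """Record one user's report that this body is spam."""
        fp = body_fingerprint(body)
        self.reports[fp] = self.reports.get(fp, 0) + 1

    def is_spam(self, body, min_reports=2):
        """Tag a message once enough independent reports match its hash."""
        return self.reports.get(body_fingerprint(body), 0) >= min_reports
```

An exact hash like this is trivially defeated by changing one character per message, which is presumably why the real services use fuzzier signatures.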
A pitfall of relying on others' classifications (Score:2)
One of my accounts is a catch all for a domain which has gotten addresses misentered into both legitimate mailing lists and as the erroneous e-mail address of people who are copied and sometimes even directly addressed by genuine personal e-mails. But to me they are all equivalent to spam, so if I was reporting spam to some authoritative list there would likely be an outbreak
Re:Mozilla Firefox (Score:2, Interesting)
My experience with it has been rather disappointing. Why I need to tag as spam two messages from the same sender or with the exact same subject is a mystery to me. After the 10th "Make $\d+ in XX days" type message one has to wonder just how effective this thing is.
This method is promising because it uses spell-checking and a better way to identify spammy string sequences, something neither of the two main camps of spam filters has seemed keen to do until now.
Re:Mozilla Firefox (Score:3, Interesting)
This shouldn't be all that surprising - Bayesian filtering is all based on probabilities. The reason "Outlook message rules" is so bad is because a friend of mine might send me a joke about Viagra, which I don't want to have deleted indiscri
Re:Mozilla Thunderbird (Score:1)
Re:Mozilla Firefox (Score:4, Funny)
Re:Mozilla Firefox (Score:3, Interesting)
Maintaining an enterprise mail system based upon user-controlled spam filtering software is not practical. That small percentage of users with consistent ID 10T errors adds up fast. Try correcting false positives for a user-configured filter. It's time-consuming.
The better approach from an administrative standpoint is controlling spam at the MTA- and MDA- levels of the mail server. I use postfix checks with
Re:Mozilla Firefox (Score:2)
I mean, how are these twats going to get even the most floppy, lazy, frustrated 99 year old to buy their product by telling him "rankin decisionmake portraiture approval slothful clamber teutonic activism alcoa tofu wakeful polonaise burt afghan lad sedimentary pennyroyal aristotelea
High tech for what ? (Score:3, Interesting)
More correct than you know (Score:2, Interesting)
This is just like your own immune system, which uses such things as "V-D-J" recombination (and other tricks) to create billions of somewhat random, different epitopes to attack potential unknown pathogens. Cells that it must then educate further not to attack "self" in your own body.
If only computer geeks took some lessons from biologists, perhaps they could get
Re:More correct than you know (Score:2)
Doesn't Bayesian filtering work somewhat like the immune system? After being exposed to the "environment" it learns what is "self" and what is "pathogen" and starts distinguishing one from the other pretty reliably. I currently use a server-side Bayes filter on my email and I get 99.5% accuracy with very little manual intervention. And it gets more and more accurate the longer you use it.. unlik
Misnomer, it's not "fighting spam"... (Score:1, Insightful)
Re:Misnomer, it's not "fighting spam"... (Score:2)
Re:Misnomer, it's not "fighting spam"... (Score:5, Insightful)
People have been improving filtering, and the spammers just pump up the volume. As filtering improves, the delivery rate goes down, but so does the complaint rate so they end up being able to pump more spam before they're detected.
I've been watching this arms race for almost a decade, and the advantage is still on the spammers' side. At the moment I'm blocking between 10,000 and 20,000 connections a day just on the basis of their IP address (including blocks against entire countries), another 3-5,000 using a greylist/honeypot app I'm working on, and I'm still getting one or two hundred messages per day hitting my procmailrc. A few years back, when I was getting a few hundred spams a day without all those RBLs and personal blacklists, people were all excited about how Bayesian filters were gonna make spam uneconomical... and I made the same comment back then. Now I'm filtering a couple of hundred times more efficiently and effectively and I'm still getting almost the same volume.
I don't see anything different this time. You can't fight spam with filters, all you can do is adapt to it.
Re:Misnomer, it's not "fighting spam"... (Score:2)
The effectiveness of the spam that's blocked decreases, but the potency of the spam that gets through skyrockets since it stands alone. This alone is motivation to triple the efforts of spammers. I'm sure the more talented spammers out there nearly jizz themselves as they run their latest crafted email through their local "test servers", see it pass through all the filters with ease, and hit the SEND button.
Until there are new methodologies to prevent the "ability" to spam, period, everything else
Re:Misnomer, it's not "fighting spam"... (Score:1)
This middle-market-merchandising-madness has to stop. Bill Gates and attendant remora-ware are getting richer and richer each and every day.
I guess if politicians can't figure out that their own computers aren't safe, or how to tax internet transactions, then we can't bloody rely on them to stop consumer gouging either can we?
Re:Misnomer, it's not "fighting spam"... (Score:1)
I'd also like to see email addys be treated exactly the same as a snail mail street address addy or a telephone number, ie, make them cost to get, so they are treated correctly. We register domains, why not email addys? If it cost 10$ a year (something like that) to register an email addy, there would be no incentive for th
Registering eMail addresses (Score:1)
I had not heard that angle before. That rocks! You'd think it would be the sort of thing a politician could wield in court too.
It's strange to me that there are a whole slew of laws concerning other modes of communication, but the internet is slow to be regulated. I
Way too complex (Score:2)
Yes, spammers do successfully guess whitelisted addresses, by stealing people's address books and mailboxes through viruses and guessing that if you're in someone's address book or they've got mail from you then you're whitelisted from them.
So, it's an effective filtering mechanism for now, but eventually you'll have to require something be
Bayesian Still Works (Score:4, Funny)
Besides, you have to ask yourself some questions...
"What happens if you try to filter spam with RNA?"
"Just how good can ACT and G manage spam?"
and, most important of all...
"Are you sure this spam filter uses no portion of Keanu Reeves' genetic code?"
Love SA... (Score:5, Informative)
I love this mostly because it means that SA is a moving target. Spammers can figure out how to defeat pieces of it, but it deploys a wide range of static, dynamic, network-based and user-driven tests that change so much that spammers simply can't afford to keep up.
The biggest problem I see, at the moment.... (Score:4, Interesting)
Wrong title, I guess (Score:5, Interesting)
I think we will see more and more applications like this with the growing cross-pollination between Biology and CS.
What could we do... (Score:2, Funny)
Re:What could we do... (Score:2)
Works until the Spammers get a copy of it (Score:5, Insightful)
For example, the article mentions the software accepts a message that is long but has a few "spammy" sequences. This suggests an immediate countermeasure of adding bulk to spam -- appending a copy of some news article to the spammy payload (some already do this).
Personally, I've always thought that a simple spell check would do a good job as another layer of filtering. It would place spammers in a no-win situation -- either the keyword filter or the spell check filter would get them.
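To make the no-win situation concrete, here is a rough Python sketch of the two layers working together. The word list, the keyword set and the 30% misspelling threshold are made-up values for illustration; a real filter would load a full dictionary and tune the threshold on actual mail.

```python
def tokenize(text):
    """Lowercase, split on whitespace and strip surrounding punctuation."""
    tokens = [t.strip(".,!?;:\"'()") for t in text.lower().split()]
    return [t for t in tokens if t]


def misspelled_fraction(text, dictionary):
    """Fraction of tokens that are not in the dictionary."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    unknown = sum(1 for t in tokens if t not in dictionary)
    return unknown / len(tokens)


def is_spam(text, dictionary, keywords, max_misspelled=0.3):
    """Layer 1: keyword match. Layer 2: too many unknown spellings."""
    if any(t in keywords for t in tokenize(text)):
        return True
    return misspelled_fraction(text, dictionary) > max_misspelled
```

Spelling the keyword correctly trips the first layer, and obfuscating it ("v1agra") pushes the misspelling rate up and trips the second. The obvious weakness is legitimate mail full of names, jargon or genuinely bad spelling.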
Re:Works until the Spammers get a copy of it (Score:2, Interesting)
Spell checker as anti-spam filter - that would create huge problems for most Americans
Otherwise it's a good idea.
Re:Works until the Spammers get a copy of it (Score:3, Insightful)
How so?
1) install software
2) treat as black box
3) spam spam spam
4) see what gets through
5) study, enhance
6) goto 3)
Just because you can't see how it works, doesn't mean you can't teach yourself how to get around it.
Or... (Score:3, Funny)
2) Decompile
3) Study code
4) Develop countermeasure
5) spam spam spam
It's not like spammers care about the EULA that says they can't look at the code. Oh, and before I forget...
6) ???
7) Profit!
Sean
Totally offtopic... (Score:1)
1) Collect underpants
2) Goto 1
3) Profit
See, it does make sense!
It is difficult to beat statistical spam filters (Score:3, Informative)
John Graham-Cumming presented a talk Beating Bayesian Filters at the 2004 Spam Conference [spamconference.org] detailing these results. A video recording is available; alas, no paper.
In conducting a recent spam filter evaluation [uwaterloo.ca] I observed (but did not report) that the statistical filter attacks were not
Re:It is difficult to beat statistical spam filter (Score:2)
Some statistical algorithms only pick a small number of tokens according to some rationale or other (e.g. most extreme scores). For such algorithms, the padding attack is a very good idea, as with enough random words, one or more of these should have a sufficiently extreme score (so that it replaces a more legitimate token in the list of considered tokens), although whether an extreme score can be synthesised randomly would depend on the c
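A small Python sketch of the kind of algorithm described here, assuming a Graham-style filter that keeps only the few tokens whose spam probability is farthest from 0.5 and combines them with the usual product formula. The per-token probabilities and the cutoff of three tokens are invented for the example, and the padding words are given known hammy scores rather than being truly random.

```python
from math import prod


def most_extreme(tokens, spam_prob, n=3, unknown=0.4):
    """Pick the n distinct tokens whose spam probability is farthest from 0.5."""
    unique = sorted(set(tokens))
    return sorted(
        unique,
        key=lambda t: abs(spam_prob.get(t, unknown) - 0.5),
        reverse=True,
    )[:n]


def combined_score(tokens, spam_prob, n=3, unknown=0.4):
    """Combine the selected token probabilities into one spam score."""
    probs = [spam_prob.get(t, unknown) for t in most_extreme(tokens, spam_prob, n, unknown)]
    spam = prod(probs)
    ham = prod(1.0 - p for p in probs)
    return spam / (spam + ham)


# Hypothetical per-token spam probabilities learned from one user's mail.
SPAM_PROB = {"viagra": 0.99, "free": 0.9, "click": 0.9,
             "meeting": 0.01, "thesis": 0.01, "regards": 0.02}

spam_only = ["viagra", "free", "click"]
padded = spam_only + ["meeting", "thesis", "regards"]
```

With these numbers the unpadded message scores close to 1, while the padded one drops to about 0.01 because two hammy tokens push "free" and "click" out of the considered list. Whether random padding can find such tokens in practice depends on the victim's own training data, which the spammer does not see.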
Re:Works until the Spammers get a copy of it (Score:2, Insightful)
Then 3/4 of slashdotters wouldn't be able to get their messages through to anybody
....feng-shui... and WAKE up ppl. (Score:2, Troll)
and btw, WAKE up ppl. 'Filtering' won't make SPAM *ever* go away. As long as you keep on filtering, I guess, it'll act as a cure/remedy that 'relieves pain', but it isn't a cure/remedy that'll kill 'cancer' for good.
And from a different sidenote, 'Filter
Re:....feng-shui... and WAKE up ppl. (Score:2)
Let me get this straight... You are claiming that feng-shui is fake because science doesn't back it up, then you disclaim that claim by saying you don't really know if science backs it up or not. Ok.
SPAM eat's like *what was it* 60-80% of the total broadband (world wide) now?!
This recent article [msn.com] says that about 80% of the e-mail in the US is SPAM... but e-mail is just a small portion of all internet traffic, less than 5% in many locations such as u
Re:....feng-shui... and WAKE up ppl. (Score:2)
no, I think you misinterpreted me. I claim (from what I've read (scientific or otherwise )) that feng-shui is a fake thing. Hence the "AFAIK".
I don't claim I know more then I know, and if you know you know more then I know, then by all means, let me know. I sure would like to know as much as you know,
Re:....feng-shui... and WAKE up ppl. (Score:2)
I concur with you on the fine point you make "Reduce them (profit margins) enough and they'll stop doing it". But I have a hard time even hypothetically conceiving that "Filtering techniques" will ever bog spammers down enough to make them stop reverse-engineering "Filtering techniques".
I've used a couple of the (at the time) best "filtering techniques". At this pr
This is all bull -- Change the law (Score:2)
Re:This is all bull -- Change the law (Score:1)
No offense, but there are plenty of examples of (at least partial) technological solutions to social problems. For instance, the ignition lock on my car prevents people from casually stealing it.
This might not solve the social problem of people wanting to steal cars, but is a decent try at solving the technological problem of people being able to easily do it.
Re:This is all bull -- Change the law (Score:3, Insightful)
You'll be buying all your doors without locks from now on, I take it, since burglary is a social/legal problem and the government has passed laws against it. Let us know how that goes.
Re:This is all bull -- Change the law (Score:2)
The law alone will of course not make the spam magically go completely away, but it will make sure that sending spam becomes a pretty risky business, instead of a completely risk free one, so people might think twice before sending out a million spam mails. Sure this won't stop people from other countries, however reducing spam from the USA would be a pretty good start.
Change Economics, not laws (Score:2)
Interesting... Electronic evolution... (Score:5, Insightful)
First, there's a constant tuning of both predator and prey (Anti-spam tools and spam).
Second, there seems to be some sort of equilibrium which is inevitably achieved, and
Third, there are occasional discrete major developments which change the game. This would be an example. Now, spam is going to be forced to majorly adapt.
I could see the 'Quality' of spam improving a lot as a result of tools like this. No more letters from my long lost benefactors in Nigeria, and no one liners about 'Gushing like a firehose' (My coworkers and I got a good chuckle out of that one), but, as the story said, if you have keywords in a long email, it gets far less penalized. OK. Attach verses from Dante's Inferno, or Joyce's Dubliners to the email. Problem solved. You can't block words like viagra altogether or Pfizer researchers are going to have a hell of a time getting anything through.
Another concern is that if this forces spammers to make up new and compelling spam, people will be more likely to check it out. While my parents are probably pretty confident they didn't win a secret lottery 3 or 4 times last week, they might possibly believe new and creative stories.
Perhaps evolution of email readers is just plain going to be a necessary part of the solution...
Re:Interesting... Electronic evolution... (Score:3, Insightful)
Absolutely. Unfortunately, as most predator-prey models will tell you, neither population ever goes to zero unless something catastrophic happens. And in this case, catastrophe is precisely what we want to happen to the prey.
(If they'd simply implement my proposed scheme of a bullet to the head of every spammer, no mercy, no appeal, it'd be easy. But noooo, "spammers are human beings no matter how useless and harmful they are," waaaaah.)
Re:Interesting... Electronic evolution... (Score:2)
In the animal world analogy, if the economic solution is implemented the users who employ it become species without natural enemies in the habitat... like some large animals. In respect to spam, that is.
Re:Interesting... Electronic evolution... (Score:1)
Corrections... (Score:3, Insightful)
Uh oh - there goes the patent now.... (Score:2, Interesting)
By now, all the patent-trollster-lurkers who passively phish in the
Can anyone who works in the IP (intellectual property NOT Internet Protocol) post a list of known trollster companies that are full of lawyers who acquire patents (by any means) and make patent litigation their primary business model?
Nice tool but greylisting does more right now! (Score:2, Interesting)
Seriously, greylisting implemented on all the ISPs' MTAs would overnight block 99% of the spam being sent. Most spam at the moment is being sent from armies of bots run on unsuspecting users' systems connected to cable and DSL service. The programs used are unsophisticated: they churn through a list of addresses spewing messages out by the thous
Re:Nice tool but greylisting does more right now! (Score:1)
Seriously, greylisting implemented on all the ISPs MTAs would overnight block 99% of the spam being sent. Most spam at the moment is being sent from armies of bots run on unsuspecting users systems connected to cable and DSL service. The programs used are unsophisticated, they churn through a list of addresses spewing messages out by the thousands. They do not queue messages or retry them if they get an error. Greylisting uses this to great effect and blocks
Re:Nice tool but greylisting does more right now! (Score:1)
In over a year the spammers have not done anything different but dump and spew. You s
If it were that easy, most ISPs would be using it (Score:2)
That doesn't mean it's not a h
Re:If it were that easy, most ISPs would be using (Score:2)
For those who don't want to RTFA (Score:2, Funny)
1) Make your PC face the North, whenever you are checking Email.
2) Hang a metal windchime above your workstation.
It is important for the rods of the windchime to be hollow, so that the auspicious Chi can rise up the chimes.
3) Add a user account for the Dragon Turtle & make him the admin.
Giving birth to Artificial Intelligence... (Score:4, Interesting)
Think about it - we now have software that "learns" what you like. [nuclearelephant.com]
Sorry, but anything that "learns" fits a definition of intelligence - using past results to predict future outcomes. Note that I'm not saying "self aware" or "conscious", simply "intelligence".
As we move forward, we'll see more and more intelligence on the part of the spammers, and the warring factions of intelligence will likely provide massive financial and political impetus to build ever more intelligent solutions - thus AI is born.
The problem with other vehicles for developing AI is simply the budget. With SPAM, everybody has a direct, financial incentive to develop it, so development will definitely happen!
Re:Giving birth to Artificial Intelligence... (Score:1, Interesting)
Biology is information technology (Score:2)
Each cell in your body contains approximately 20 GB of data. Consider the redundancy and sheer massive size of information storage capacity your body consists of! Compare THAT to an Oracle cluster...
So, given the incredible need to process information in order to understand life itself (which could be considered a form of self-rep
Best Spam Software (Score:1)
I have tried just about every single anti-spam software out there, so I have some experience. After being fed up with getting false positives and having to deal with tons of spam getting past the spam filters I tried out Cloudmark's Spamnet - a community based approach to fighting spam. So far it has been 95-99% effective with 0 false positives which is the most important factor for me.
In the past couple of months it has blocked 19,221 spam messages. I don't even bother to send spam to a Spam folder a
Everybody's doing this now (Score:2)
The approach used to be:
1. Find features (usually well-delimited words) in the message.
2. Look up the
Nothing new here, move along... (Score:5, Informative)
As someone who's done some research on machine learning for spam filtering, this sure looks to me from their 8-page paper like yet another simplistic ML algorithm advocated by folks who don't know the field and tested using techniques of questionable sensitivity. Their "novel" method sounds an awful lot like feature set construction by clustering, a method that is widely used in the spam filtering literature, but with a somewhat novel clustering technique from biology.
Message filtering starts by throwing away line breaks for no obvious reason, then optionally removing the known ham from the training set for no obvious reason. Message headers are then thrown away, for no obvious reason.
No general method is given for corpus allocation. In the experiment reported later, the original corpus appears to have been split roughly in half. (For unreported reasons, none of these splits are exact. No rationale is given for the various corpus allocations.) The training corpus is then split into ham and spam, and the ham portion is split in half. The spam training corpus is used for "positive training": determining a complex feature set as described below. One half of the ham training corpus is then used for "negative training": filtering out complex features that are common in ham. The remainder of the ham corpus is used as a validation set to select thresholds described below. No justification is given as to the failure of the validation set to include spam messages, and the procedure is vague on this point.
The description of the key "positive training" phase is difficult to follow: it seems to assume the pre-existence of the "SPAM vocabulary" [sic] being constructed. The key idea seems to be to use positional index of words within the body as base features, and construct complex features by using a pattern recognition algorithm to find correspondences between sets of base features across spam messages. Patterns that appear across many spam messages are treated as indicating spam.
The final training step is to set thresholds for (1) minimum number of complex features in the spam message and (2) fraction of the message text covered by the complex features. One would expect these two criteria to be highly correlated: no effort appears to have been made to enforce or explore their orthogonality.
The classification phase proceeds by simply counting the number of patterns in a given test message and the percent coverage of the message by the patterns. If the result exceeds both thresholds, the message is classified as spam.
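As I read the paper, the classification step boils down to something like the Python sketch below. This is only my reading, not the paper's code: plain substrings stand in for the patterns their pattern-discovery step would produce, and the two threshold values are placeholders for the ones selected on the validation set.

```python
def classify(message, patterns, min_patterns=2, min_coverage=0.5):
    """Return True (spam) only if both thresholds are met.

    min_patterns: how many distinct spam patterns must occur in the message.
    min_coverage: fraction of the message's characters covered by patterns.
    """
    covered = [False] * len(message)
    matched = 0
    for pattern in patterns:
        if not pattern:
            continue  # an empty pattern would match everywhere
        start = message.find(pattern)
        if start == -1:
            continue
        matched += 1
        # Mark every occurrence of this pattern as covered text.
        while start != -1:
            for i in range(start, start + len(pattern)):
                covered[i] = True
            start = message.find(pattern, start + 1)
    coverage = sum(covered) / len(message) if message else 0.0
    return matched >= min_patterns and coverage >= min_coverage
```

The coverage test is what lets a long message with a few spammy sequences through, and it is easy to see from the sketch why pattern count and coverage will tend to move together.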
For the empirical evaluation, the corpus used seems to have consisted of approximately 130,000 messages, roughly 1/4 ham and 3/4 spam. No details of the construction or acquisition of this large corpus were given. Because of its volume, one would suspect a synthetic corpus from high volume sources. The details of this corpus construction are critical to the evaluation of the method, so no useful conclusions can really be drawn from the empirical evaluation other than that, like most machine learning methods, this method works well on some problem set.
The claimed accuracies from the technique are at a level that is highly suspect from previous experience: there are fundamental bounds on how well any ML algorithm can do in real situations that don't appear to be met here. Indeed, messages found to be misclassified as spam in the test corpus were manually reclassified, but no effort seems to have been made to identify messages that were "correctly" classified by the algorithm but misclassified in the corpus. The accuracy before manual manipulation of the results (!) appears to be about 97%, which is well within the normal expected range. Computational efficiency appears to be good.
The vocabulary used in the paper is not particularly consistent with the vocabulary normally used in the spam filtering or machine learning literature. A few spam filtering and machine learning papers are cited, but not many: citations are primarily from the
Re:Nothing new here, move along... (Score:2)
why? *just curious, as from the post you seem like a bright person ...*
Re:Nothing new here, move along... (Score:1)
On the topic of knowing, just to let you know: I know more (about grammar) than you do, please check your signature and find the two occurrences of "then" that should have been replaced with "than".
Re:Nothing new here, move along... (Score:2)
ah. so other than a bad PR judgement, they are ok.
>about grammar
all right'y then. think I found what you were saying.
Here's what I'm wondering... (Score:2)
Re:Here's what I'm wondering... (Score:1)
ActiveState PureMessage has been doing this for years.
Also now available for free via SURBL [surbl.com]
Just when you thought you had a new idea, it turns out to be older than the hills...
Re:Here's what I'm wondering... (Score:1)
Also available in Vipul's Razor:
NAME Changes - razor-agents [sourceforge.net] 2.61 (July 06, 2004) * Introduced the Whiplash signature scheme. Whiplash signatures are based on canonical domain names present in URLs embedded in spam messages. A Whiplash signature is also a function of the length of the spam message. It's important to note that not all whiplashes are used as classifiers. The Whiplash engine is augmented by sophisticated logic on the Razor2 backend to select the Whiplashes that are used to filter
Virus and worm detection! (Score:3, Interesting)
Even moreso, since viruses are much more a compilation of a set of previous constructions with a few mods than a new composition not necessarily based on the wording of old scams.
And Viruses and worms (especially worms) are more constrained by their environment, requiring an exploit of a vulnerability and the installation of work-doing code. Though gene-shuffling techniques might be able to bury much of the code, the basic exploit must continue to be some sort of match to the vulnerability's "receptor".
Why not just change? (Score:2)
Given that, why can't there just be a proposal, adopted (like a DVD format, etc) by some huge players (Microsoft, OpenSou
e-postage (Score:2)
I solved the spam problem. Seriously. Interested? (Score:2)
Complete details here. [slashdot.org]
Bryan Taylor
iamcf13@hotpop.com
SpamByte code: 7
(see http://www.cf13.com/game-over-spammers.htm )
http://www.cf13.com/press-release.htm
All email containing unwanted content will be summarily deleted or reported as spam.
Greylisting works for me (Score:1)
I just installed greylistd [debian.org] by Tor Slettnes about 24 hours ago, and haven't received a single spam yet (down from 20-30 per day before). I only have a 5 minute greylist delay, meaning there's almost no downside to this method. Assuming my correspondents don't use broken mail servers (and that's their problem if they do) there are no false positives and no maintenance with this system. I use no other spam filters of any kind. I guess they just aren't patient enough to wait 5 minutes :)
And if they start
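For those who haven't seen greylisting before, the core logic is tiny. This Python sketch is not greylistd's actual code; the `Greylist` class and the explicit `now` argument are assumptions for illustration, and the 300 seconds matches the 5 minute delay mentioned above. A real MTA would answer "defer" with a temporary 4xx SMTP failure so that a proper mail server queues the message and retries.

```python
class Greylist:
    """Defer mail from unseen (client IP, sender, recipient) triplets."""

    def __init__(self, delay_seconds=300):
        self.delay = delay_seconds
        self.first_seen = {}  # triplet -> time of first delivery attempt

    def check(self, client_ip, sender, recipient, now):
        """Return 'defer' until the triplet has aged past the delay, then 'accept'."""
        triplet = (client_ip, sender, recipient)
        first = self.first_seen.setdefault(triplet, now)
        if now - first >= self.delay:
            return "accept"
        return "defer"
```

A bot that fires once and never retries is only ever deferred, while a legitimate server that retries after the delay gets through. A production version would also expire old triplets.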
Abandon AI crap. Need another approach (Score:1)
The solution is same one that reduces paper junk mail: postage fees. Charge 5 cents or so per message, and spam will greatly shrink.
Parse Carefully (Score:2)
(((Anti-Spam) Filtering) Research) Project
This is not the same as the
((Anti-(Spam Filtering)) Research) Project
Nor is it the
(Anti-((Spam Filtering) Research)) Project
I'm not sure, but I think the last two are run by AT&T [slashdot.org].
Serious methodological flaws (Score:4, Insightful)
The first of these calls their sensitivity result into question. If they classify their training data perfectly, then the 4.4% false negative rate they quote needs to be doubled to 8.8% -- almost one false negative in every eleven messages scanned.
The second of these calls their false positive rate into question: training with an unrealistically thorough set leads to better categorization, ceteris paribus. They need to show the trend with a variety of different training set sizes to support any claims about performance.
This sounds like a fully buzzword compliant non-result to me.
Re:hm (Score:5, Insightful)
Of course. Spam is a moving target. Given that it is cheaper to create spam than to block spam, it will always be an uphill battle.
Lately, much of the spam I have been getting in my Inbox (squirrelmail/spamassassin) has been email that has no typos, no random text, no blatant "click here" lines and looks like normal mail. Except they are trying to sell me something.
Re:hm (Score:2, Informative)
You lucky g*t!
Re:hm (Score:3, Informative)
They'll.. (Score:3, Interesting)
Re:hm (Score:3, Interesting)
Re:What I want to know is... (Score:2)
I just don't want to read it - and now I don't have to.
Re:What I want to know is... (Score:2)
To block spam at the transport level is one thing; an algorithm for identifying spam without human intervention is another entirely.
I suggest you RTFA. Their method is actually pretty interesting. Lackluster is not the appropriate word for the novel idea they have come up with.
Re:What I want to know is... (Score:2)
You are confused.
Rather more confused are the slashbots who tout client-side content filtering as the end-all be-all "solution" to spam.
To block spam at the transport level is one thing; an algorithm for identifying spam without human intervention is another entirely.
The only catch: it's not possible to identify spam (unsolicited bulk e-mail) based on the content alone. Why? Because of the two words in the definition, 'unsolicited' and 'bulk'. How can the existence of the word 'viagra' possibly tell me
Re:Stop This B\/llsh!t Filtering Crap (Score:3, Insightful)
If after reading the E-mail, you still don't know what product the spam is advertising, then the spammers are losing, since those E-mails will not lead to a sale, and the spammers are simply wasting their own bandwidth.