Spam Detection Using an Artificial Immune System 114

Posted by timothy
from the lymp0cty3z-narf-poit!-claire-said-the-laundry-wheel dept.
rangeva writes "As anti-spam solutions evolve to limit junk email, the senders quickly adapt to make sure their messages are seen. An interesting article describes the application of an artificial immune system model to effectively protect email users from unwanted messages. In particular, it tests a spam immune system against the publicly available SpamAssassin corpus of spam and non-spam. It does so by classifying email messages with the detectors produced by the immune system. The resulting system classifies the messages with accuracy similar to that of other spam filters, but it does so with fewer detectors."
This discussion has been archived. No new comments can be posted.

  • by CRCulver (715279) <crculver@christopherculver.com> on Monday July 10, 2006 @05:08PM (#15693846) Homepage
    I have to admit, I don't see the need for these recent whizbang additions to the spam-fighting repertoire. Sure, they might be ingenious, but on a practical level they don't do anything more than a properly-configured SpamAssassin system. I used to get a lot of spam coming through a default installation of SpamAssassin, but after spending some time with O'Reilly's book [amazon.com] (the free docs may already be up to this level of reader-friendliness, it's been a couple of years) and tweaking my installation, I get spam once in a blue moon. There's just no need for anything more.
    • by crotherm (160925) on Monday July 10, 2006 @05:36PM (#15694009) Journal
      I have to admit, I don't see the need for these recent wizbang horseless carriages. Sure, they might be ingenious, but on a practical level, they don't do anything more than a fine team of horses. yada yada

      But seriously, your attitude is one that would stop all progress. This new method does the job more efficiently.

      From TFA, The lightweight nature of this solution -- requiring significantly smaller number of detectors when compared to SpamAssassin -- will doubtlessly prove attractive to those looking to implement a server-based solution where processing overhead may well be an issue. A server-based solution would be a one-size-fits-all mold since the filter is not personalized and does not learn for each particular user, but the reduced processing and storage time makes such a solution attractive.

      That sounds like a good reason for this research.

      • Hmm.. I agree with the need for research and progress, of course. However, I also agree with the parent poster (relative to your post) in the sense that, as far as fighting spam itself goes... if something isn't broke, it's silly to fix it. Most technological progress does cost something in terms of society and the happiness of simplicity, and sometimes that price isn't worth paying.

        I guess for me the question is... "what KIND of efficiency are we talking about here? Simplicity for the CPU? Simplicity f
        • The real issue is efficiency at the server level. If your email server were running something like this, you'd have protection that is just as good but doesn't bog down under the thousands of emails going through it.

          Of course servers are getting faster all the time, but the whole point of computer science is to make things work more efficiently regardless of the actual hardware it runs on.
          • Yes, good point on servers. On the point of computer science though... hmm. I think the point of all science is understanding, which may lead to efficient use, abandonment, or almost any other course of action.
    • Wow, I wished spamassassin worked that well for me. Mind you, it does get rid of most of the junk, but I still get a fair amount of spam that slips past SA. Every few days, I'll even get spam that has a score of exactly 0!
      • Good spammers run their spam through SpamAssassin to make sure it gets a score of 0 and so gets through. Most sysadmins use the standard settings, and thus the spam gets through.

        It's not very smart to send spam that gets caught by SpamAssassin.
    • The real precision of current good Bayesian filtering is close to the precision of a human filter - from 80 to 90 percent. There are newer advances in natural language processing (word sequence processing) and neural and functional text classification areas (support vector machines with nonlinear kernels) that can get spam classification precision up to 99 percent. It might not be too much for spam, BUT when you transfer the same knowledge to other areas of text classification 99 percent of binary classfic
      • But the research in the article is pretty lame indeed - I have seen expiring Bayesian classifiers before, the only thing that I find interesting there is the use of word sequencing to reduce the feature vectors, but the paper is short of the details of automation of sequence selection which is a major reason why that process is quite underused currently.
    • Yeah, it might work for today, but Spam is only going to get worse and there's a point where traditional models won't scale. If you can get something that does the same job in fewer cycles, that implies you can scale up higher using fewer resources. Also, the methodology they're talking about here is growing organically. This probably means that it'll evolve organically, making it better with each generation. Spam fighters can't stop innovating because the spammers aren't going to.
  • Finally (Score:5, Funny)

    by nizo (81281) * on Monday July 10, 2006 @05:09PM (#15693848) Homepage Journal
    So now we can look forward to a spam filtering solution that actively searches for spammers and kills them?
    • So now we can look forward to a spam filtering solution that actively searches for spammers and kills them?

      Hooah! First one to hook this up with an MLRS gets a cookie!
    • Any good programmer worth their salt would have programmed this to cut out their tongue, cut off their fingers one by one, slice off their eyelids and force them to watch "Biodome" 5 times in succession.

      I want those fuckers to live painfully damnit, just like the rest of us do when we have too much spam.
      • Damnit that goes too far! You're a cruel human being. I wouldn't subject a dog to that level of torture, much less a human.

        In the name of human rights, they should not be forced to watch Biodome any more than twice! :-P
    • I know where most of them live, just kidding. The problem isn't that we have spammers; the problem is that we have kids pretending to be spammers who just hack into legitimate spam networks to send out scams.

    • They've already got that. It's called the Robospamassassin!
      http://mirror12.escomposlinux.org/comic/ecol-205-e.png [escomposlinux.org]
  • I think this is a very useful new anti-spam tool, but as usual, it will have the possibility of false positives, which can be very damaging. And Spammers will adapt to this technology as well, reducing its effectiveness.
    • And Spammers will adapt to this technology as well, reducing its effectiveness.

      One wonders what sort of people have so little moral fiber that they study spam-blockers and create new methods for getting around it. Really, it would be great if Slashdot could profile one of these twisted people and show just who does it, what country they are from, what kind of upbringing they had, etc. But maybe anyone is susceptible to the temptation. Recently, while making a comment on a blog, I was thinking about just

      • Really, it would be great if Slashdot could profile one of these twisted people and show just who does it, what country they are from, what kind of upbringing they had, etc.
        You forgot...let's get their /. username also
      • Slashdot would NEVER post a story [slashdot.org] about the sorts of sick, twisted individuals that perpetrate such sleazy tactics for profit.

        (N.B: Okay, yeah, there's a difference between spyware and spam... I'd think that spyware is the worse of the two evils, though.)
      • One wonders what sort of people have so little moral fiber that they study spam-blockers and create new methods for getting around it.

        Simple: people who see the profit in it and don't care what people think of them. Who cares if there's a .001% reply rate when you send out tens of millions of spam per day? As long as there's a way to get money out of people with spam, there will be spam, and there will be people looking for ways to get around any filtering program or algorithm designed.

      • Er, I think it's just people who don't think spam is a big deal and are amused by the several million dollars a year of revenue it generates. You act like they're organ-leggers.
  • The difference? (Score:3, Insightful)

    by MoeMoe (659154) on Monday July 10, 2006 @05:14PM (#15693876)
    Not that I'm arguing that it's the same, rather I'd like to know:

    What separates this from a Bayesian filter?
    • Re:The difference? (Score:5, Insightful)

      by DragonWriter (970822) on Monday July 10, 2006 @05:17PM (#15693896)
      What separates this from a Bayesian filter?
      If nothing else, it has new, improved buzzwords. "Artificial immune system" is so much more evocative than "Bayesian filter".
    • Not much (Score:5, Informative)

      by jfengel (409917) on Monday July 10, 2006 @05:28PM (#15693970) Homepage Journal
      Ultimately, very little. At core, they're probably identical techniques, and if I were reviewing this as a scientific paper I'd ding them for not answering exactly that question. There are such strong parallels between the two (train them on known data, add up probabilities, cut stuff on a threshold) that I strongly suspect that they're identical.

      There are useful things to be gained from a change of metaphor. For example, one difference between this and most Bayesian spam filter implementations is that this explicitly incorporates a decay function. That could be useful, if a word that used to be common in spam no longer is (e.g. if I actually decided to buy a Rolex, it's no longer a strong spam indicator, whereas right now any email mentioning "Rolex" is 99.9999% certain to be spam).

      You could easily modify a Bayesian filter to have time-decaying weights, but if the change in metaphor leads somebody to come up with a good insight, then perhaps this is useful. Mathematically, though, the equations look very similar.
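A rough sketch of the time-decaying weights mentioned above, in Python. This is not the article's method or any real filter's code: the class name `DecayingTokenStats`, the 30-day half-life, and the smoothing constants are all assumptions made for illustration, and it assumes messages arrive with non-decreasing day numbers.

```python
import math
from collections import defaultdict


class DecayingTokenStats:
    """Per-token spam/ham counts that fade exponentially with age.

    Hypothetical illustration only; the half-life is an assumed parameter.
    """

    def __init__(self, half_life_days=30.0):
        self.decay_rate = math.log(2) / half_life_days
        # token -> [spam_weight, ham_weight, day_of_last_update]
        self.stats = defaultdict(lambda: [0.0, 0.0, 0.0])

    def _age(self, token, today):
        """Shrink the stored counts to account for the days elapsed."""
        entry = self.stats[token]
        factor = math.exp(-self.decay_rate * (today - entry[2]))
        entry[0] *= factor
        entry[1] *= factor
        entry[2] = today
        return entry

    def train(self, tokens, is_spam, today):
        """Count each distinct token once per message."""
        for token in set(tokens):
            entry = self._age(token, today)
            entry[0 if is_spam else 1] += 1.0

    def spam_probability(self, token, today):
        """Smoothed estimate; tokens with no evidence score exactly 0.5."""
        spam, ham, _ = self._age(token, today)
        return (spam + 1.0) / (spam + ham + 2.0)


stats = DecayingTokenStats(half_life_days=30.0)
for _ in range(50):
    stats.train(["rolex"], is_spam=True, today=0)
print(stats.spam_probability("rolex", today=0))    # strong spam indicator
print(stats.spam_probability("rolex", today=300))  # old evidence has faded
```

With this shape, a token like "rolex" drifts back toward neutral when no fresh spam mentions it, which is the behaviour the Rolex example describes.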
      • Look up "bayes_expiry_max_db_size". If your database gets larger than the limit you set then the lesser used tokens are deleted.
      • Re:Not much (Score:5, Interesting)

        by adrianbaugh (696007) on Monday July 10, 2006 @05:43PM (#15694048) Homepage Journal
        Perhaps a neat way to extend this idea would be to have the filter scan your outgoing mail, too; not to search for spam as such, but to look for changes in behaviour. Then, supposing you emailed sales@igottagetmearolex.com enquiring the price of a Rolex, the filter could modify the spam and ham probabilities of rolex. I suppose it would have to be clever enough to ignore emails sent to abuse@ addresses reporting spam and attaching the spam message, among other things I can't be bothered to think of now, but it's an idea that comes more readily from the immune system metaphor than the pure probability metaphor.
      • The reason you're not working for a scientific paper is that you guess about techniques being identical and pooh pooh them based on said guesses. A bayesian filter is a very specific mathematical technique. This isn't actually very similar at all, other than that it's being used towards the same end.

        Perhaps in the future you could know something about two algorithms before declaring them identical. Just a thought.
      • Re:Not much (Score:1, Informative)

        by Anonymous Coward
        Ah, commenting on things we don't know like we are an expert.... just like telling the doctor that a cold and flu must be the same because they have similar symptoms....

        Here, let me clarify the differences for you. The primary difference is in the nature of the tokens used to classify a message. The Bayesian system has words/tokens that are either predefined by a human or taken from messages verbatim. The artificial immune system has tokens that are randomly and automatically generated by the system using s
    • They claim to be as accurate as a Bayesian process, but with fewer check items.

      But from their paper, it seems that they're "tuning" their check items to the corpus of spam that they're testing against.

      So of course they will use fewer check items. There are a finite number of characteristics of that corpus.

      I did not see where they were using their system in a Real World environment (I may have missed it, the article was pretty painful to read). Now, if they can do as good as a fully tuned SpamAssassin system
  • Great.... (Score:4, Funny)

    by (pvb)charon (685001) on Monday July 10, 2006 @05:17PM (#15693895) Homepage
    Ever heard of hay fever? Allergies? Think, people, think! charon
    • by Dannon (142147) on Monday July 10, 2006 @05:51PM (#15694109) Journal
      Thanks, now I have the mental image of a spam filter with sinus problems. Ewwww...
    • Good point, what the authors are doing is probably trying to score some NSF funds or something.

      Arthritis, AIDS, tuberculosis, Leukemia, lupus, endometriosis, etc. Deadlier cousins of the failures of the immune system you mentioned.

      What they should be modelling the next-gen spam filters on are intracellular def. mechanisms, RNAi, si/shRNA, nuclear translocation tags, etc. Which is what blacklists, senderid, etc. are copying anyways.
    • Ever heard of hay fever? Allergies?

      Greeeat I can see it now...
      Doctor: Do you have any allergies to medication?
      you: No, but my computer has developed an allergy to Viagra, Cialis, and is also sensitive to weight loss pills. Not to mention the keyboard seems to have grown several inches in length.
  • Fancy (Score:5, Insightful)

    by roman_mir (125474) on Monday July 10, 2006 @05:23PM (#15693932) Homepage Journal
    It looks fancy but when you get down to it, all it means is that there are a number of heuristics that are combined into filters (this happens by user training.) The filters are 'weighted' and filters that are not used often enough are 'culled' (killed off.) I don't think this will be significantly better than any other Bayesian-type spam systems.
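For readers who want to see the shape of the idea summarized above (weighted detectors, with idle ones culled), here is a minimal Python sketch. It is an assumption-laden toy, not the paper's algorithm: `Detector`, `DetectorPool`, the `max_idle` limit, and the plus or minus 1.0 weight updates are invented for illustration.

```python
import re
from dataclasses import dataclass


@dataclass
class Detector:
    pattern: str         # regular expression searched for in a message
    weight: float = 1.0  # raised on spam matches, lowered on ham matches
    idle: int = 0        # training messages seen since the last match


class DetectorPool:
    """Hypothetical pool of detectors; unused detectors are culled."""

    def __init__(self, patterns, max_idle=1000):
        self.detectors = [Detector(p) for p in patterns]
        self.max_idle = max_idle

    def score(self, message):
        """Sum the weights of all detectors that match the message."""
        return sum(
            d.weight
            for d in self.detectors
            if re.search(d.pattern, message, re.IGNORECASE)
        )

    def train(self, message, is_spam):
        for d in self.detectors:
            if re.search(d.pattern, message, re.IGNORECASE):
                d.idle = 0
                if is_spam:
                    d.weight += 1.0
                else:
                    d.weight = max(0.0, d.weight - 1.0)
            else:
                d.idle += 1
        # Cull detectors that have not matched anything for too long.
        self.detectors = [d for d in self.detectors if d.idle <= self.max_idle]


pool = DetectorPool(["viagra", "mortgage"], max_idle=2)
for text in ["cheap viagra now", "viagra deals", "buy viagra"]:
    pool.train(text, is_spam=True)
print(len(pool.detectors), pool.score("Viagra!!"))
```

A message would then be flagged when `score` crosses some threshold, which is where the comparison to Bayesian-style filters comes from.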
  • Real spam solution (Score:3, Interesting)

    by Dryanta (978861) on Monday July 10, 2006 @05:30PM (#15693978) Journal
    Spam and content filtering will always be a struggle for anybody who actually utilizes email. Simply adding more logic will not solve the problem. Reporting spammers to every rbl list you can think of, and alerting forums and newsgroups of abusive ip blocks on the other hand is already doing quite nicely.
    • Reporting spammers to every rbl list you can think of...

      Sure, for those of us with the time, knowledge and inclination to do it. Expecting Aunt Minnie to do it is unreasonable. All she cares about is keeping spam out of her inbox, and if running something like this, or SpamAssassin, at the server gets rid of most of it, isn't that all she can reasonably ask for?

    • token              spamprob       #ham    #spam
      'utilizes'         0.992422       1       6140
  • I gave up (Score:5, Interesting)

    by Scratch-O-Matic (245992) on Monday July 10, 2006 @05:30PM (#15693982)
    I recently gave up on tweaking filters for myself and a few dozen people whose accounts I administer. I wrote a little script that asks for confirmation from the sender...if the sender confirms, they are added to a whitelist and will go straight through after that. I can also add addresses manually to the whitelist, and will soon be able to have wildcard (domain-wide) approved addresses. I've gotten exactly two spam in 6 weeks...both were confirmed by either a person or an autoresponder. Five years ago I never would have wanted such a blunt system...nowadays it's just the ticket.
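The whitelist lookup described above, with exact addresses plus domain-wide wildcard entries, might look roughly like the following Python sketch. The poster's actual script is not shown in the thread, so the `Whitelist` class and the `*@domain` entry syntax are assumptions; sending the confirmation request is deliberately left out.

```python
class Whitelist:
    """Hypothetical sender whitelist with exact and domain-wide entries."""

    def __init__(self):
        self.addresses = set()  # e.g. "alice@example.com"
        self.domains = set()    # e.g. "example.org", added as "*@example.org"

    def approve(self, entry):
        """Add an address, or a whole domain when written as '*@domain'."""
        entry = entry.strip().lower()
        if entry.startswith("*@"):
            self.domains.add(entry[2:])
        else:
            self.addresses.add(entry)

    def allows(self, sender):
        """True if the sender or the sender's domain has been approved."""
        sender = sender.strip().lower()
        _, at, domain = sender.rpartition("@")
        return sender in self.addresses or (at == "@" and domain in self.domains)


wl = Whitelist()
wl.approve("Alice@Example.com")
wl.approve("*@example.org")
print(wl.allows("alice@example.com"), wl.allows("bob@example.org"))
```

Mail from senders that `allows` rejects would be held pending confirmation in the setup the poster describes.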

    • Re:I gave up (Score:5, Interesting)

      by babaloo (259815) on Monday July 10, 2006 @06:09PM (#15694220)
      I understand your frustration but I was the victim of a Joe Job attack and systems like you describe just add to the pain of the victim. I feel that these types of responses are just as unwelcome as spam and I report them as such. Have you had any issues like this?
    • Re:I gave up (Score:4, Interesting)

      by CFrankBernard (605994) <cfrankb AT gmail DOT com> on Monday July 10, 2006 @06:28PM (#15694337)
      I recommend joining the SPAM-L mailing list of 900+ email admins and ask for opinions on "challenge response" (C/R) spam fighting systems. Sending a confirmation message to the alleged/purported sending address *is* spam when it is spoofed/forged (quite common). The only way to ensure sending info back to the connecting email server is to do so /during/ the SMTP conversation.
    • Re:I gave up (Score:5, Insightful)

      by rudedog (7339) <daveNO@SPAMrudedog.org> on Monday July 10, 2006 @08:01PM (#15694810) Homepage
      So it appears that you decided that the responsibility for fighting your spam should be moved onto the backs of everybody else on the Internet? Spam almost always comes from a forged sender. By doing this, you're just sending tons of spam to the forgery victims. Please do us and you a favor and google "challenge response harmful", and then turn off your C/R system.
  • by mrheckman (939480) on Monday July 10, 2006 @05:33PM (#15693998)
    The "immune system" solution is just another way to detect spam, but it is unlikely to be much more successful than existing methods. As someone else pointed out, SpamAssassin is pretty good already. So what if this new type of filter eventually improves the spam filtering accuracy from 98% to 99%? A more highly-polished rock is still a rock.

    The real problem is the sending of spam itself, and that problem arises from an inability to correctly attribute the spam to the spammers. If we can do that, we can block it, or at least better convict the spammers who violate the law. Things that solve this problem, like Yahoo!'s "DomainKeys", are the future of anti-spam, not more highly-polished rocks.
    • Things that solve this problem, like Yahoo!'s "DomainKeys", are the future of anti-spam, not more highly-polished rocks.

      Domain Keys, at least to this point is utter crap in my experience. I get these small floods of spam into my Yahoo! mailbox. What most of them have in common is they are certified by Domain Keys. A couple months ago, I was getting the exact same spam every day for some mortgage coming from different addresses. All were DK certified.

      For what it's worth, I do send off those specific emai

    • So what if this new type of filter eventually improves the spam filtering accuracy from 98% to 99%?
      Halving the number of errors? Sounds like a good deal.
      • Halving the number of errors is good, but that wouldn't stop my problems with spam. My chief objection to spam now is that there are still too many false positives -- things that show up in the spam box that should not -- so I still have to look through all of the hundreds of spam messages that arrive every day to find the few that are misclassified. Even cutting the number of false positives in half won't solve that problem. If, however, we could eliminate most of the spam, then I would have many fewer fal
    • In fact, such keys are currently strong signs that the ad is, in fact, spam. They're far too easy to buy or steal from other people's machines, often by installing spam zombie software on the machines of unsuspecting and innocent people.
  • Now your spam filter can catch AIDS too. But don't ask how.
  • I'm waiting for the day when we see our first email 'virus'. Something not unlike what happens with real viruses. Then we'd need antibodies similar to this.
  • I have two major objections to this idea, and to the article that presents it.

    1. The ONLY problem this solves is performance -- i.e., processing throughput. And that's not what's wrong with anti-spam systems today. They live and die on the precision/accuracy tradeoff, or maybe on UI.

    2. The authors seem to assume that Bayesian systems work really, really well. While technically most or all current spam-filtering products are Bayesian in some sense, that still speaks of considerable naivete about real-w
  • I just had a thought while reading about the spam filters about spelling. So I went and looked in my spam folder and found that every piece of spam has many, many words that are not in a dictionary, ie not spelled correctly.

    Why not run a script that filters messages based on spelling? If there are more than 'xx' many words that do not exist in the dictionary you choose to use, then the message gets sent to the spam folder. This would catch the odd e-mail from friends who don't know how to spell or what a
    • Generally techniques like that are not used because false positives are much more disastrous than false negatives. Accidentally allowing a couple of spam messages to creep into the regular mail is not so big of a deal; deleting a reply asking for a job interview because it was miscategorized is. Most spam detection systems have to walk a fine line between doing their job and not hosing somebody's mail. That said the systems could be set up so that misspellings add weight towards the decision to categorize
    • To avoid false positives, I recommend using a regex generator for spamvertized variations of common spam terms.
      See http://public.kvalley.com/regex/regex.asp [kvalley.com]
      For example, to allow viagra but detect most of its spamvertized variations:
      (?!viagra)(([v])|(\\\W{0,2}\/))[i1l\|\\\/!îíìï:;](([a@àáâãäå^æ])|(\/\W{0,2}\\))[gqp96][r](([a@àáâãäå^æ])|(\/\W{0,2}\\))
    • An interesting idea... but you would need to allow for multiple dictionaries. I commonly get e-mail in english, american, french and japanese every day. And before anyone flames me I -do- make a distinction between english and american. They are spelled and pronounced differently so when discussing dictionaries they ARE different.

      As another responder pointed out... perhaps this could be used in some form of "weight" calculation. I would think counting special characters and individual characters ( barring I
  • Modelling Nature (Score:4, Interesting)

    by A Dafa Disciple (876967) * on Monday July 10, 2006 @05:39PM (#15694022) Homepage
    Your post advocates a

    (x) technical ( ) legislative ( ) market-based ( ) vigilante

    approach to fighting spam. Your idea will not work. Here is why it won't work. (One or more of the following may apply to your particular idea, and it may have other flaws which used to vary from state to state before a bad federal law was passed.)

    ( ) Spammers can easily use it to harvest email addresses
    ( ) Mailing lists and other legitimate email uses would be affected
    ( ) No one will be able to find the guy or collect the money
    ( ) It is defenseless against brute force attacks
    ( ) It will stop spam for two weeks and then we'll be stuck with it
    (x) An enormous amount of spam will initially go undetected before your idea is effective
    ( ) Users of email will not put up with it
    ( ) Microsoft will not put up with it
    ( ) The police will not put up with it
    (x) Your idea proposes a solution that only large corporations could deploy
    ( ) Requires too much cooperation from spammers
    ( ) Requires immediate total cooperation from everybody at once
    ( ) Many email users cannot afford to lose business or alienate potential employers
    ( ) Spammers don't care about invalid addresses in their lists
    ( ) Anyone could anonymously destroy anyone else's career or business

    Specifically, your plan fails to account for

    ( ) Laws expressly prohibiting it
    ( ) Lack of centrally controlling authority for email
    ( ) Open relays in foreign countries
    ( ) Ease of searching tiny alphanumeric address space of all email addresses
    ( ) Asshats
    ( ) Jurisdictional problems
    ( ) Unpopularity of weird new taxes
    ( ) Public reluctance to accept weird new forms of money
    ( ) Huge existing software investment in SMTP
    ( ) Susceptibility of protocols other than SMTP to attack
    ( ) Willingness of users to install OS patches received by email
    ( ) Armies of worm riddled broadband-connected Windows boxes
    ( ) Eternal arms race involved in all filtering approaches
    ( ) Extreme profitability of spam
    ( ) Joe jobs and/or identity theft
    ( ) Technically illiterate politicians
    ( ) Extreme stupidity on the part of people who do business with spammers
    ( ) Dishonesty on the part of spammers themselves
    ( ) Bandwidth costs that are unaffected by client filtering
    (x) The large amount of resources needed for implementation of your idea that small companies don't have
    ( ) Outlook

    and the following philosophical objections may also apply:

    ( ) Ideas similar to yours are easy to come up with, yet none have ever been shown practical
    ( ) Any scheme based on opt-out is unacceptable
    ( ) SMTP headers should not be the subject of legislation
    ( ) Blacklists suck
    ( ) Whitelists suck
    ( ) We should be able to talk about Viagra without being censored
    (x) Your solution is nothing more than a conceptual remanifestation of a solution that already exists
    ( ) Countermeasures should not involve wire fraud or credit card fraud
    ( ) Countermeasures should not involve sabotage of public networks
    ( ) Countermeasures must work if phased in gradually
    ( ) Sending email should be free
    ( ) Why should we have to trust you and your servers?
    ( ) Incompatibility with open source or open source licenses
    ( ) Feel-good measures do nothing to solve the problem
    ( ) Temporary/one-time email addresses are cumbersome
    ( ) I don't want the government reading my email
    ( ) Killing them that way is not slow and painful enough

    Furthermore, this is what I think about you:

    (x) I think it is a creative concept, but there is no need to reinvent the wheel.
    ( ) Sorry dude, but I don't think it would work.
    ( ) This is a stupid idea, and you're a stupid person for suggesting it.
    ( ) Nice try, assh0le! I'm going to find out where you live and burn your house down!
    • Furthermore, this is what I think about you:

      (x) Brilliant!
      ( ) I think it is a creative concept, but there is no need to reinvent the wheel.
      ( ) Sorry dude, but I don't think it would work.
      ( ) This is a stupid idea, and you're a stupid person for suggesting it.
      ( ) Nice try, assh0le! I'm going to find out where you live and burn your house down!
    • Mod parent up. That was an awesome post.

      And kind of ironic that the author slipped in some unsolicited politically motivated PR on the Falun Gong as part of his/her message.
  • Inflict heavy fine on people buying spamvertised products and execute spammers. Only then can spam be stopped for good.
    • Take off and nuke them from orbit. It's the only way to be sure.
      • That's what finally knocked Cyberpromo off the air: not the lawsuits from other abused companies, not the out-of-court settlements they made with AOL and other victims of their spam, not the incensed public, but a bunch of irritated script kiddies who knocked down the router connection sold to them by Agis and kept it off the air.

        Eventually the peasants will revolt.
  • Abysmal results (Score:5, Interesting)

    by gvc (167165) on Monday July 10, 2006 @05:40PM (#15694033)
    More specifically, it correctly classifies 84% of spam and 98% of non-spam.

    The authors used the SpamAssassin corpus. Holden shows that, on the SpamAssassin corpus, Bogofilter correctly classifies 90.3% of spam and 99.88% of non-spam. See http://sam.holden.id.au/writings/spam2/ [holden.id.au]

    This approach is nowhere near state of the art.
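To make the gap concrete, the four percentages quoted above can be turned into error rates. The short Python sketch below only does that arithmetic; the function name `error_rates` is made up for illustration, and the inputs are the figures from this comment, not independently verified.

```python
def error_rates(spam_caught, ham_kept):
    """Convert per-class accuracy into the two kinds of error."""
    return {
        "missed_spam": 1.0 - spam_caught,   # spam that reaches the inbox
        "false_positive": 1.0 - ham_kept,   # legitimate mail marked as spam
    }


# Figures as quoted in the comment above.
ais = error_rates(spam_caught=0.84, ham_kept=0.98)
bogofilter = error_rates(spam_caught=0.903, ham_kept=0.9988)

fp_ratio = ais["false_positive"] / bogofilter["false_positive"]
print(f"AIS false positives: {ais['false_positive']:.2%}")
print(f"Bogofilter false positives: {bogofilter['false_positive']:.2%}")
print(f"Ratio of false-positive rates: {fp_ratio:.1f}x")
```

On those numbers the immune-system filter misflags about 2% of legitimate mail against about 0.12% for Bogofilter, which is the more costly of the two kinds of error.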
  • Has anybody stopped to think that the human immune system is a little less than perfect? It doesn't stop all diseases, not by a long shot. And sometimes it creates illness, as anybody with Hay Fever — or Multiple Sclerosis — will testify.
  • I'm seriously sick of people abusing biological methodologies. People seem very attracted to ideas simply because they are grounded in "how nature works" and ignore the mathematical benefits or weaknesses. Now this idea pretty much just sounds like statistical rules based on a corpus - pretty much how every successful solution out there now works. This solution simply prunes rules that aren't being used, but there are better ways to get a smaller spam detection database. Have you seen the stuff the CRM114 people are doing? [sourceforge.net] This is nothing new.

    Read your Russell and Norvig, people. Airplane research didn't get off the ground (ugh) until we stopped trying to mimic birds and study physical principles of flight.
    • Read your Holland and Koza. Evolutionary computing (and others: Neural nets, Cellular Automata, ...) have a wide array of successful applications. Dismissing this work just because it's biologically inspired is inappropriate and counter-productive to science.

      And just so you know, the AIS community is absolutely not ignoring fundamental questions of complexity and mathematical weaknesses. I met one of the authors at ICARIS 05, and her presentation of this work was cautious, qualified, and thorough.
      • There are a million other biological ideas we could borrow, and other biological ideas we could borrow in radically different ways, but we don't because they don't work. Those ideas that do work may have been inspired by biological phenomena, but other than that they do little better than provide a good analogy. In this case, they aren't doing anything different and it is only considered interesting because they thought of a good analogy for it. Nothing works because it is based on biological phenomena. I r
    • Normally I would agree with you. A great deal of crappy research gets hyped up because of an inappropriate analogy to biology... but this isn't one of them.

      Stopping "spam" is almost exactly the problem that our immune system has to deal with. It has to go through reams of data (i.e. every cell in your body) and figure out what is junk and what isn't, and it does this by learning through exposure positive and negative examples. It's not perfect either, sometimes it goes berserk, producing false positives (
      • Excellent analogy, but that's all there is. It might be inspiring, but this time the idea wasn't originally inspired by biology. These methods of filtering spam have been around for a long time.

        In any case, the basic idea is simple: use a corpus of examples separated into classes to create an algorithm to decide if a new example is in a certain category. There are million AI techniques to do this. What differs in each case are the details of what each part means.

        The immune system analogy is flawed in its de
    • By nature of things and how our mind relates to them by symbolic computation it is natural for people to use meta languages (and higher levels of reflexion) based in the affluent nature of things that are. This comment was provoked by your silly (do not take it as personal offense, please) sentence with an inflammatory derivation mentioning the process of development of machines capable of controlled flight which are possessing a feature to execute that capability without the need to decrease their density an
  • Has anyone come across the newer spam ideas, where the spam message looks so much like a real message that I sometimes have to spend a good few minutes looking at it to see if it's genuine? They use your nickname, e.g. "Dear Bob", and end with the name of someone you know. They are usually about mundane things (e.g. "do you want to come to a party on Saturday?"), and the emails make good sense and have a suitable subject line. The only giveaway is that they all have a tinyURL link to the actual spam site
    • No I haven't. Unless you think I can't tell what's below from correspondence from somebody I know.


      Hello .

      I think we had correspondence a long time ago if it was not you I am sorry.
      If it was I could not answer you because my Mozilla mail manager was down for a
      long time and I could not fix it only with my friend's help I got the emails
      address out for me ..:)
      I hope it was whom we were corresponded with you are still interested, as I am,
      though I realize much time has passed since then...
      I really don't know w
    • Many emails like that do not actually contain an ad or commercial message: they're email address probes, being sent by the million to gather email addresses, and often with a webbug (a one-pixel GIF in a URL) to track exactly which email address's HTML-reading client received the message.

      Those valid email addresses are themselves highly saleable to spam companies, whether the company is even vaguely legitimate or not.
  • from the lymp0cty3z-narf-poit!-claire-said-the-laundry-wheel dept.

    Pinky, if I could reach you I would hurt you.
  • Come on, guys.
  • by Anonymous Coward
    Are we still on the message-filtering bandwagon? I know it was all the rage when we talked about it in 2000, but now it's 2006, and we've all had experience with it. Pattern-matching has been defeated, and it was an embarrassing defeat. This is usually a sign to those who proposed it that they should consider a career change. With the exception of those patterns that correspond to firewall rules blocking domains run by companies with names like "Megaultra Webcram Holdings, Inc", it's a dead issue.

    The real is
    • My Gmail account has a failure rate of about 2/1000, i.e. a 99.8% success rate. My Thunderbird email has a similar success rate. Speak for yourself, buddy, statistical filtering works.
  • .45 caliber penicillin, applied directly to the spammer's kneecaps.
  • The idea of applying immune system models to spam and computer virus detection is old. Nobody has so far demonstrated that it is any better than a sound statistical approach, and this paper fails to do so as well. It's junk science.
  • by cyberscan (676092) * on Monday July 10, 2006 @09:46PM (#15695310) Homepage
    Here is a better idea: Blue Security was attacked and shut down because the Internet is septic. The germs (spammers) have taken over. The best way to win this is to take the profit out of spamming. This can be done in much the same way that the body's T cells alert the rest of the immune system on how to attack a pathogen. A cryptographically signed spammer complaint (attack) file should be distributed via a peer to peer network protocol. This file is sent amongst complaining programs that complain to a spammer's website each time a spam advertising said website is received.

    Like an immune system, this network of spam attack programs will have T-cells. The "T-cells" will be a small group of people who draw up the complaint instruction file. Whenever the pathogen (spammer) releases enough toxins (spam) into the body (Internet), the T-cells (people who write the complaint instruction file) alert the immune cells (spam complaint program) of the presence of the pathogen and how to attack (complain to website advertised) it. The pathogen is overwhelmed with a quick immune response (high bandwidth usage resulting from many, many complaints).

    When the cost of running a website surpasses the revenue earned from said website, the website is shut down. When the costs of spamming or advertising via spam exceed the income, spam stops. Blue Security was beginning to become successful. Too bad they bowed out.
  • How about a REAL IMMUNE SYSTEM anti-spam filter? I had a dream...

    Here's how it works. I catch me a SPAMMER, and have it tested. IFF it is allergic to a common item (ragweed, peanuts, shellfish, etc.), I keep it in the sub-basement. Otherwise, I release it back to the wild and catch me another.

    Once SPAMMER is acquired, I put it in a chair, and provide food and water. SPAMMER is given a computer, internet access, and is also attached to an allergen device that delivers the substance SPAMMER is allergic to, in contr
  • First of all, you can't stop spam. Filtering will always be an imperfect arms race--we build a better filter, the spammers come up with a better way of circumventing it. It's a never-ending battle.

    Secondly, you can't end spam. Too many companies rely on its existence for their business model to work.

    The only way to stop spam is to stop the spammers from SENDING the stuff. However if this happened, you would see a huge number of companies suffer and possibly go bankrupt. Sure, the organised crime groups behi
  • A real solution would be end to end authentication [pki-page.org] and encryption [openpgp.org]. I wonder why none of the supreme innovators have thought of this yet. But then again, the NoSuchAgency [hermetic.ch] wouldn't be able to monitor our inboxes, or product vendors spam [computerworld.com] our inboxes.
  • I have two major filtering layers (perimeter & inbox). If the recipient is not known, it's spam, and gets temp-failed. If the sender is not known, it is likely spam, and can only send 1 message per second, or get temp-failed (otherwise, I allow several messages per second). I allow only 2 recipients per envelope (temp-fail overage). Whatever makes it through my perimeter filters gets to the second major layer (inbox). At this layer, if the sender is known, it stays in the inbox, otherwise, it goes
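
    The two-layer policy described above can be sketched roughly as follows. This is only an illustration under stated assumptions: "several messages per second" is taken to mean 5, the rate limit is modelled as a one-second sliding window over accepted messages, and the class and function names (`PerimeterFilter`, `inbox_layer`) are hypothetical. The post is cut off before saying where mail from unknown senders goes, so the sketch returns a `not_inbox` placeholder rather than guessing.

```python
from collections import defaultdict, deque

KNOWN_SENDER_RATE = 5    # assumption: "several" messages per second
UNKNOWN_SENDER_RATE = 1  # from the post: 1 message per second
MAX_RECIPIENTS = 2       # from the post: 2 recipients per envelope


class PerimeterFilter:
    """First layer: per-envelope accept / temp-fail decisions."""

    def __init__(self, known_recipients, known_senders):
        self.known_recipients = set(known_recipients)
        self.known_senders = set(known_senders)
        # sender -> timestamps of messages accepted in the last second
        self.recent = defaultdict(deque)

    def check(self, sender, recipients, now):
        """Return a list of (recipient, 'accept' | 'tempfail') pairs."""
        if sender in self.known_senders:
            limit = KNOWN_SENDER_RATE
        else:
            limit = UNKNOWN_SENDER_RATE
        window = self.recent[sender]
        # Drop timestamps that have aged out of the one-second window.
        while window and now - window[0] >= 1.0:
            window.popleft()
        if len(window) >= limit:
            return [(r, "tempfail") for r in recipients]
        verdicts = []
        accepted = 0
        for r in recipients:
            # Unknown recipients and overage recipients are temp-failed.
            if r not in self.known_recipients or accepted >= MAX_RECIPIENTS:
                verdicts.append((r, "tempfail"))
            else:
                verdicts.append((r, "accept"))
                accepted += 1
        if accepted:
            window.append(now)
        return verdicts


def inbox_layer(sender, known_senders):
    """Second layer: known senders stay in the inbox.

    The destination for unknown senders is not stated in the post,
    so a neutral placeholder is returned instead.
    """
    return "inbox" if sender in known_senders else "not_inbox"
```

    A real MTA would apply these rules in its SMTP policy hooks; the sketch only shows the decision logic.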
