
eno2001's Journal: SPAM: The Troubles

Where I work, we've been using a Barracuda Spam firewall for a few years with great success. Sometime in the spring I got a call from a rival company that has a slightly different approach. They informed me that they are getting a lot of Barracuda users to come to them instead. I won't mention the name of the company because I'm going to be making a pretty nasty accusation in a bit. So I told the person that I was perfectly satisfied with the Barracuda and that I had no intention of going to anything else. The salesperson was quite persistent and then asked if she could send me some info about their services. I said sure, send me some brochures via snail mail. She got a little snippy and said it's easier these days to send via e-mail. I should have told her to stuff it at that point. Instead I told her that I didn't want to receive any kind of constant marketing material in my Inbox beyond what she intended to send, and that if I did I would blacklist all of their mail for all 35 domains that I host. She seemed stunned by this and just took my e-mail address. I never got anything from her.

However, the very next week I noticed a lot more spam hitting our users. We were getting more complaints from users that they were seeing a lot of spam. At first I thought maybe we weren't up to date on the signatures and blacklists we get from Barracuda, but that wasn't the case. So I contacted Barracuda and they said to upgrade the firmware since we were a few revs behind. I did this because it looked like there was a lot more to the new firmware than the previous. But even after doing that and rebuilding our Bayesian DB, I still noticed a steady stream of spam. So I called Barracuda again. They said that they would recommend using RegEx filters at this point. (Gah! Back to where I was a few years ago!) So I went ahead and did that. (Which is really what this JE is about, I need a bit of RegEx assistance) The new filters I've crafted seem to be working but have the occasional false positive. That's to be expected.

Getting back to this other company. After talking to Barracuda Networks about the problem of more spam making it through the Barracuda, I also asked them how I could increase the number of days messages are held in the message log. It used to hold up to two weeks' worth, which was useful. But now it's only holding two days' worth. The Barracuda tech stopped me and said that the message log doesn't hold a specific number of days; it holds 250,000 messages. This indicates that the volume of mail we get has increased tremendously. And the increase happened a week after the call from their competitor. Looking at the stats, the Barracuda is STILL holding back about 85% of the mail we receive because it's spam. So the Barracuda's performance hasn't changed at all. The volume of messages we are getting has. Coincidental? I'm not sure... read on.

The salesperson who called me said that she wanted to know when we would be re-evaluating our position on the Barracuda, and I told her that we make those decisions in the late Fall just before the year is up. So she said she'd contact me then. But I got a call from her after only two months. The first thing she said was, "How is the Barracuda doing right now?" I thought to myself, "You assholes... it really MIGHT have been you after all". I said, "It's doing just fine". She then told me how their system wouldn't require any interaction on my part, since our mail domains would go through their systems before being delivered to us, and implied that I must be REALLY BUSY keeping up with the Barracuda these days. Makes me VERY suspicious. What I will say is that I won't be buying ANY product from a company that you have to route your mail through. That should prevent anyone reading here from dealing with this company without me having to mention the name.

So now to my reason for posting. I've been able to craft some pretty effective RegEx filters to keep out specific words and phrases including the semi-common intentional misspellings. Here's an example of what I use on the Barracuda to keep out most variants of "Hoodia" for instance:

(.*(H|h)+\b*(O|o|0)+\b*(O|o|0)+\b*(D|d)+\b*(I|i|1)+\b*(A|a|4)+.*)
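A quick way to sanity-check a pattern like this away from the Barracuda is to run it over a few sample strings with grep. This is only a sketch under some assumptions: it needs a grep that supports -E, it uses character classes in place of the (H|h) style groups, and it leaves out the \b* pieces because not every RegEx engine accepts a quantifier on \b. The helper name matches_hoodia and the sample strings are made up for illustration.

```bash
#!/bin/bash

# Simplified form of the Hoodia filter.
# Assumption: \b* is dropped and bracket classes replace the (H|h) groups.
HOODIA_RE='[Hh]+[Oo0]+[Oo0]+[Dd]+[Ii1]+[Aa4]+'

# matches_hoodia is a hypothetical helper: it succeeds (exit status 0)
# when the text passed as $1 contains a match for the pattern.
matches_hoodia() {
    printf '%s\n' "$1" | grep -Eq "$HOODIA_RE"
}

# Made-up sample strings: two spam-like variants and one ordinary phrase.
for SAMPLE in "Buy H00dia now" "cheap hoodi4" "hooded jacket"
do
    if matches_hoodia "$SAMPLE"
    then
        echo "BLOCK: $SAMPLE"
    else
        echo "PASS:  $SAMPLE"
    fi
done
```

The first two samples should be flagged and the third let through, which gives a rough feel for false positives before a rule goes live.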

But what I'm having more trouble with is situations where we get messages with things like this:

ROdhLshaEX WATjdaCiwHEioqS PATeyEK PHIewqLLeqhrIePPhwE CaARhTIjdER

Since it's all alpha with only varying case, I really don't want to make a rule that will have more chances of false positives. I considered something where I would specify R O L E X and then in between each character include any number of only lower case characters a-z. But I have to see if the Barracuda can actually do that in a RegEx.
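That idea can at least be tried from a shell before finding out what the Barracuda accepts. A minimal sketch, assuming a grep with -E support; the helper name is_padded_rolex is hypothetical, and LC_ALL=C is set because in some locales [a-z] can match more than plain lower-case ASCII letters.

```bash
#!/bin/bash

# Capitals R O L E X with any run of lower-case letters allowed between them.
ROLEX_RE='R[a-z]*O[a-z]*L[a-z]*E[a-z]*X'

# is_padded_rolex is a hypothetical helper: succeeds if $1 contains a match.
# LC_ALL=C keeps [a-z] restricted to ASCII lower-case letters.
is_padded_rolex() {
    printf '%s\n' "$1" | LC_ALL=C grep -Eq "$ROLEX_RE"
}

# The spam sample from above should match.
is_padded_rolex "ROdhLshaEX WATjdaCiwHEioqS" && echo "spam sample matches"

# Ordinary mixed-case text (made up here) should not.
is_padded_rolex "Relax, the report is done" || echo "ordinary text does not match"
```

Because the capitals have to appear in order with nothing but lower-case letters between them, ordinary prose is unlikely to trip the rule, although a plain "ROLEX" will match as well.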

And in case anyone is interested, I wrote a short Bash script to allow me to feed in a word and get the central portion of a RegEx for most variations of that word including upper/lower case. Don't laugh:

----

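#!/bin/bash
# Print the central portion of a RegEx matching upper/lower case
# variants of the word given as the first argument.

WORD=$(echo "$1" | tr A-Z a-z)
WORDPOS=0
CHAR_SEQ=""

# Walk the word one character at a time.
while test "${WORD:$WORDPOS:1}" != ""
do
    L_WORDATOM="${WORD:$WORDPOS:1}"
    U_WORDATOM=$(echo "$L_WORDATOM" | tr a-z A-Z)
    # Each letter becomes an (UPPER|lower) pair followed by +\b*
    ATOM_PAIR="($U_WORDATOM|$L_WORDATOM)"
    TERMINATOR='+\b*'
    CHAR_SEQ="$CHAR_SEQ$ATOM_PAIR$TERMINATOR"
    let WORDPOS=$WORDPOS+1
done

echo "$CHAR_SEQ"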

----

Yeah yeah... I know I should be doing this in Perl, but it just rolls off the fingers faster for me in Bash.
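The same loop could be adapted to the ROLEX-style case described earlier, emitting each capital letter with [a-z]* in between instead of an upper/lower pair. This is a hedged sketch, not something tested on the Barracuda, and the function name padded_regex is made up here.

```bash
#!/bin/bash

# padded_regex (hypothetical name): print the word in capitals with [a-z]*
# between the letters, e.g. for matching strings like "ROdhLshaEX".
padded_regex() {
    local word pos=0 seq=""
    word=$(printf '%s' "$1" | tr a-z A-Z)
    while [ -n "${word:$pos:1}" ]
    do
        # Add the lower-case filler only between letters, not before the first.
        if [ -n "$seq" ]
        then
            seq="${seq}[a-z]*"
        fi
        seq="${seq}${word:$pos:1}"
        pos=$((pos + 1))
    done
    printf '%s\n' "$seq"
}

# Use the first argument if one is given, otherwise fall back to "rolex".
padded_regex "${1:-rolex}"
```

Run with rolex as the argument, this prints R[a-z]*O[a-z]*L[a-z]*E[a-z]*X, which could then be wrapped in whatever leading and trailing pieces the filter needs.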

  • ... but I think that we could probably guess it with a few minutes of googling anyway (iirc, there was someone making a push for doing this "we'll filter your email and forward it to you" stuff).

    Was it them? Of course it was. What you CAN do to really fux0r them is lie to them - tell them you just went out and bought their biggest competitor's filter-and-forward solution, that they made you an offer you couldn't refuse - they found the volume of spam you get "interesting enough" that they're PAYING YOU $1

    • I hereby dub you Evil Genius.
      • Hey - I resemble that remark! I mean, I resent ... no, actually, I don't resent it :-)

        On a side note, I'm running a little experiment. Take 500 valid email addresses, send them some useful info daily, with options to change and unsubscribe, and see what happens. So far, in the last 2 weeks 15 people have subscribed, 80 have unsubscribed. Most of the rest seem to just like the defaults (it's all in the picking of the message, I guess). I'm interested in seeing what the final results will be in another 2 we

    • Hehehe... nice one. The only reason I didn't post their name here is concern about "slander" or some kind of legal repercussions should my JE make it back to them. But there are only a few companies who offer this sort of thing at this point and this company does show up on the first page of actual Google results (not the ads) when you search for "spam filter". They aren't way up on the page but they're not dead last either. Considering that Google results can change, that should still keep me fairly safe
      • I wouldn't worry about it. Heck, look at the fun I've been having with ultramatic beds. Just google for them and look what comes up right after their home page. That article gets at least 5 reads a day - 357 in the last 7 weeks, and it just keeps on.

        A lawsuit would only get more publicity, people relating similar stories, and me asking for a lot of stuff in discovery (fortunately, I know how to do all this myself, so it's not like I'm going to run up huge legal expenses, but you can be sure the meter will

  • Since it's all alpha with only varying case, I really don't want to make a rule that will have more chances of false positives. I considered something where I would specify R O L E X and then in between each character include any number of only lower case characters a-z. But I have to see if the Barracuda can actually do that in a RegEx.

    As a regexp, what you're looking for is basically:

    [a-z]*R[a-z]*O[a-z]*L[a-z]*E[a-z]*X[a-z]*

    or, as a script:

    #!/bin/bash

    WORD=`echo "$1" | tr a-z A-Z`
    WORDPOS=0

    CHAR_
