eno2001's Journal: SPAM: The Troubles
However, the very next week I noticed a lot more spam hitting our users, and we were getting more complaints about it. At first I thought maybe we weren't up to date on the signatures and blacklists we get from Barracuda, but that wasn't the case. So I contacted Barracuda and they said to upgrade the firmware since we were a few revs behind. I did, because it looked like the new firmware had a lot more to it than the previous one. But even after doing that and rebuilding our Bayesian DB, I still noticed a steady stream of spam. So I called Barracuda again. They said that they would recommend using RegEx filters at this point. (Gah! Back to where I was a few years ago!) So I went ahead and did that. (Which is really what this JE is about: I need a bit of RegEx assistance.) The new filters I've crafted seem to be working but have the occasional false positive. That's to be expected.
Getting back to this other company. After talking to Barracuda Networks about the problem of more spam making it through the Barracuda, I also asked them how to increase the number of days messages are held in the message log. It used to hold up to two weeks' worth, which was useful. But now it's only holding two days' worth. The Barracuda tech stopped me and said that the message log doesn't hold a specific number of days; it holds 250,000 messages. If the same 250,000 messages used to span about two weeks and now span two days, the volume of mail we get has increased roughly sevenfold. And the increase happened a week after the call from their competitor. Looking at the stats, the Barracuda is STILL holding back about 85% of the mail we receive because it's spam. So the Barracuda's performance hasn't changed at all. The volume of messages we are getting has. Coincidence? I'm not sure... read on.
The salesperson who called me said that she wanted to know when we would be re-evaluating our position on the Barracuda, and I told her that we make those decisions in the late Fall just before the year is up. So she said she'd contact me then. But I got a call from her only two months later. The first thing she said was, "How is the Barracuda doing right now?" I thought to myself, "You assholes... it really MIGHT have been you after all." I said, "It's doing just fine." She then told me how their system wouldn't require any interaction on my part since our mail domains would go through their systems before being delivered to us, and implied that I must be REALLY BUSY keeping up with the Barracuda these days. Makes me VERY suspicious. What I will say is that I won't be buying ANY company's product that you have to route your mail through. That should prevent anyone reading here from dealing with this company without me having to mention the name.
So now to my reason for posting. I've been able to craft some pretty effective RegEx filters to keep out specific words and phrases including the semi-common intentional misspellings. Here's an example of what I use on the Barracuda to keep out most variants of "Hoodia" for instance:
(.*(H|h)+\b*(O|o|0)+\b*(O|o|0)+\b*(D|d)+\b*(I|i|1)+\b*(A|a|4)+.*)
But what I'm having more trouble with is situations where we get messages with things like this:
ROdhLshaEX WATjdaCiwHEioqS PATeyEK PHIewqLLeqhrIePPhwE CaARhTIjdER
Since it's all alpha with only varying case, I really don't want to make a rule that will have more chances of false positives. I considered something where I would specify R O L E X and then in between each character include any number of only lower case characters a-z. But I have to see if the Barracuda can actually do that in a RegEx.
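A minimal sketch of that idea, for trying the pattern locally with grep -E before touching the appliance. The function name filler_regex is made up for illustration, and whether the Barracuda's RegEx engine accepts the same bracket syntax is still the open question. LC_ALL=C is set so that [a-z] really means only ASCII lower case.

```shell
#!/bin/bash
# Hypothetical helper: put "any run of lower-case letters" between the
# upper-case letters of a word, e.g. rolex -> R[a-z]*O[a-z]*L[a-z]*E[a-z]*X
filler_regex() {
    local rest seq first
    rest=$(printf '%s' "$1" | tr 'a-z' 'A-Z')    # work on an upper-case copy
    seq=""
    while [ -n "$rest" ]; do
        first=${rest%"${rest#?}"}                # first remaining character
        rest=${rest#?}                           # drop it from the remainder
        if [ -n "$rest" ]; then
            seq="${seq}${first}[a-z]*"           # letter plus optional filler
        else
            seq="${seq}${first}"                 # last letter, no filler after it
        fi
    done
    printf '%s\n' "$seq"
}

pattern=$(filler_regex rolex)

# Local test only: the first sample line should be printed, the second should not.
printf '%s\n' 'ROdhLshaEX WATjdaCiwHEioqS' 'Role explanation' | LC_ALL=C grep -E "$pattern"
```

Note that [a-z]* also matches nothing, so a plain upper-case ROLEX is caught as well.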
And in case anyone is interested, I wrote a short Bash script to allow me to feed in a word and get the central portion of a RegEx for most variations of that word including upper/lower case. Don't laugh:
----
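#!/bin/bash
# Takes one word as $1 and prints the central portion of a RegEx that
# matches it letter by letter in either case, e.g. (H|h)+\b*(O|o)+\b*...
WORD=$(echo "$1" | tr 'A-Z' 'a-z')    # normalize the input to lower case
TERMINATOR='+\b*'                     # same separator as in the hand-written filter
CHAR_SEQ=""
WORDPOS=0
# Walk the word one character at a time until we run off the end.
while [ -n "${WORD:$WORDPOS:1}" ]
do
    L_WORDATOM="${WORD:$WORDPOS:1}"                     # lower-case letter
    U_WORDATOM=$(echo "$L_WORDATOM" | tr 'a-z' 'A-Z')   # upper-case letter
    ATOM_PAIR="($U_WORDATOM|$L_WORDATOM)"               # e.g. (H|h)
    CHAR_SEQ="$CHAR_SEQ$ATOM_PAIR$TERMINATOR"
    WORDPOS=$((WORDPOS + 1))
done
echo "$CHAR_SEQ"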
----
Yeah yeah... I know I should be doing this in Perl, but it just rolls off the fingers faster for me in Bash.
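One gap worth noting: the script only covers upper/lower case, while the hand-written Hoodia filter also lists the digit look-alikes 0, 1 and 4. Here is a rough sketch of how those could be folded in. The function name leet_regex and the three-entry look-alike table are assumptions for illustration, not anything the Barracuda provides, and the +\b* separator is simply carried over from the filter above.

```shell
#!/bin/bash
# Hypothetical helper: like the script above, but also adds the digit
# look-alikes used in the hand-written Hoodia filter (o -> 0, i -> 1, a -> 4).
leet_regex() {
    local rest seq lower upper extra
    rest=$(printf '%s' "$1" | tr 'A-Z' 'a-z')    # work on a lower-case copy
    seq=""
    while [ -n "$rest" ]; do
        lower=${rest%"${rest#?}"}                # first remaining character
        rest=${rest#?}                           # drop it from the remainder
        upper=$(printf '%s' "$lower" | tr 'a-z' 'A-Z')
        case "$lower" in                         # assumed look-alike table
            o) extra='|0' ;;
            i) extra='|1' ;;
            a) extra='|4' ;;
            *) extra='' ;;
        esac
        seq="${seq}(${upper}|${lower}${extra})"'+\b*'
    done
    printf '%s\n' "$seq"
}

leet_regex "Hoodia"
```

For "Hoodia" this prints (H|h)+\b*(O|o|0)+\b*(O|o|0)+\b*(D|d)+\b*(I|i|1)+\b*(A|a|4)+\b*, which is the central portion of the filter shown earlier plus the trailing separator the script always appends.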
You really should name the company ... (Score:2)
Was it them? Of course it was. What you CAN do to really fux0r them is lie to them - tell them you just went out and bought their biggest competitor's filter-and-forward solution, that they made you an offer you couldn't refuse - they found the volume of spam you get "interesting enough" that they're PAYING YOU $1
Re:You really should name the company ... (Score:1)
Re:You really should name the company ... (Score:2)
Hey - I resemble that remark! I mean, I resent ... no, actually, I don't resent it :-)
On a side note, I'm running a little experiment. Take 500 valid email addresses, send them some useful info daily, with options to change and unsubscribe, and see what happens. So far, in the last 2 weeks 15 people have subscribed, 80 have unsubscribed. Most of the rest seem to just like the defaults (it's all in the picking of the message, I guess). I'm interested in seeing what the final results will be in another 2 we
Re:You really should name the company ... (Score:1)
Re:You really should name the company ... (Score:2)
I wouldn't worry about it. Heck, look at the fun I've been having with ultramatic beds. Just google for them and look what comes up right after their home page. That article gets at least 5 reads a day - 357 in the last 7 weeks, and it just keeps on.
A lawsuit would only get more publicity, people relating similar stories, and me asking for a lot of stuff in discovery (fortunately, I know how to do all this myself, so it's not like I'm going to run up huge legal expenses, but you can be sure the meter will
Regexps (Score:2)
As a regexp, what you're looking for is basically:
[a-z]*R[a-z]*O[a-z]*L[a-z]*E[a-z]*X[a-z]*
or, as a script: