Saint Aardvark's Journal: Mail Server + 4am Pages == Fun

So update time on the not-so-new-anymore mail server.

SpamAssassin has been working out just ducky. I had the threshold set to 14, then 10, and I just lowered it to 9 yesterday. I'm keeping an eye on it as I go, because there are legitimate messages (mainly newsletters from Real Companies[tm]) that piss off SA -- "click here to unsubscribe", "you're getting this because", etc. -- and we need to whitelist 'em as we find 'em. Only, w/tens of thousands of messages being caught every day, that's a lot to look through...so it's taking a while.
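For readers unfamiliar with how the threshold and whitelisting interact, here is a minimal sketch of the idea, not SpamAssassin's actual code. It assumes a message is tagged when its summed rule scores reach the threshold, and that a whitelisted sender gets a large negative score (the -100 here mirrors SA's stock USER_IN_WHITELIST score; the exact-address match, the function name, and the example addresses and rule scores are all hypothetical simplifications).

```python
REQUIRED_SCORE = 9.0      # the threshold currently in use, per the post
WHITELIST_BONUS = -100.0  # assumption: modelled on SA's default whitelist score


def is_spam(rule_scores, sender, whitelist, required=REQUIRED_SCORE):
    """Return True if the summed rule scores reach the threshold.

    Hypothetical helper: real SpamAssassin matches whitelist entries
    as address globs, not exact strings.
    """
    total = sum(rule_scores)
    if sender in whitelist:
        total += WHITELIST_BONUS
    return total >= required


# A newsletter that trips a few rules (made-up scores) and totals 9.6.
newsletter_scores = [4.5, 3.0, 2.1]
sender = "news@example.com"  # placeholder address

print(is_spam(newsletter_scores, sender, whitelist=set()))     # caught
print(is_spam(newsletter_scores, sender, whitelist={sender}))  # let through
```

The point is only that a whitelist entry swamps whatever the content rules add up to, which is why each legitimate newsletter has to be found and listed by hand.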

As far as stats go, at threshold 10 we caught ~ 28k messages in 24 hours. In the 14 hours since I lowered it to 9, we've caught ~ 35k. Fuck me...
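To compare the two figures on an equal footing, the counts can be turned into hourly rates. This is just arithmetic on the approximate numbers quoted above; the function name is made up for illustration, and the 14-hour window may not be representative of a full day, since mail volume varies by time of day.

```python
def hourly_rate(caught, hours):
    """Average messages caught per hour over the given window."""
    return caught / hours


rate_at_10 = hourly_rate(28_000, 24)  # ~28k in 24 hours at threshold 10
rate_at_9 = hourly_rate(35_000, 14)   # ~35k in 14 hours at threshold 9
ratio = rate_at_9 / rate_at_10

print(f"threshold 10: ~{rate_at_10:.0f} messages/hour")
print(f"threshold 9:  ~{rate_at_9:.0f} messages/hour")
print(f"ratio:        ~{ratio:.1f}x")
```

On those numbers the catch rate roughly doubles, from about 1,167 to 2,500 messages an hour.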

We've had one weird hardware problem. At 4am on Saturday morning I got a page (ugh) saying that the server was down. Tried pinging it, and yup, no response. I put our backup mail server on the front end and went back to sleep.

In the morning I went to check it out, and it seemed to be just frozen. Last log message sez:

xl0: watchdog timeout

WTF? Rebooted, saw a lot of "Stray IRQ" messages, and it seemed happy. Put it back on the front end, but let the backup server stay there too.

Dave the SysAdmin found this message on the FreeBSD mailing lists. It suggested that the problem might be because of a couple PCI slots sharing an IRQ; when the guy moved his network card to a slot that didn't share an IRQ, the problem went away. I checked the manual for the mobo (Gigabyte VR7XP), and it looks like the slot the card was in didn't share an IRQ. However, I took a few minutes, shut down the machine, and moved the card (3Com fill-in-the-blank-here) over a slot anyway.

While I was there, I checked out the BIOS and found something moderately interesting: APM was turned off, but in the options it had different IRQs it could wake upon. One of the four that were turned on was IRQ 7, which was the stray one that the box had been complaining about. I turned 'em all off. Bad me for not turning off all that in the first place.

It's held up fine after that last reboot, and now it's the only one on the front-end again. (Good thing, too; the backup mail server doesn't have SA installed, as it's also a webmail + web server.)

Comments:
  • Is it so important to get woken up at 4 in the morning because your mail server went down?

    Seriously. I am having a hard time determining why this event necessitates your getting up and going into work at 4 in the morning. Not trolling, mind you, just really curious. I once worked in a place where similar things had occurred...it was because the boss was extremely nit-picky and controlling....aka micro-manager from hell. All I know is it sucks to have to get up and try to fix stuff that early in the morning.
    • Short version: the sysadmin was away (actually he'd got back that day but hadn't turned his pager back on yet), and I was one of two people getting the page; the other guy knows web servers, and I know the mail servers. It turned out it had actually been down since 1am -- don't know why I only got paged at 4 -- so during that time, and until I put the other server in place, we had no incoming or outgoing email.

      Long version: I signed up for it, if you can believe that...

      This is my first job in computers; before this it was all sandwich shop jobs. I've been basically stuffing my brain as quickly as I can in hopes of getting a sysadmin job someday, here or somewhere else. To do that, and to do the Fun Stuff (as opposed to my helpdesk duties), I'm willing to make certain trades.

      For example: early last month our sysadmin went on vacation for three weeks. The only other guy who was getting pages was off with the flu for a week shortly after. I brought this up w/my boss and pointed out that, at the moment, if something went down no one was getting paged, and volunteered to carry a pager. As it happened, he wanted a pager number to give certain clients of ours, so I got the pager.

      I don't particularly like being paged at 4am; in fact, that particular night I was sleeping off the best part of a big bottle o' wine. (It had been a Friday, after all.) But I'm doing it for very calculated reasons:

      • I'm getting very valuable experience
      • I'm increasing the chance of me getting a SysAdmin job in the near future, either here or somewhere else
      • Between this and other things I do that aren't help desk related, my boss has agreed that time away from the help desk (which is always a good thing) is a Good Idea; with luck I'll be off the help desk for some significant chunk of time soon.

      So basically, I'm doing this because then I can do the fun stuff (and get paid for it...sweet).

      (Thanks for the reply, BTW; I'd been wondering if anyone was actually reading this stuff. :-)

      • yeah, lots of people lurk I imagine....I'd love to be able to see the number of hits a journal gets in one day.

        As for the pager stuff, kudos to you, since it was you who wanted to do it. If someone else forces it on you, let me just say it rather sucks. Especially when they don't have the first clue about what is and is not important in the networking world.
  • My first tech job was dowco.com phone support probably 4 years ago. You guys still in the building behind the drafting company in Burnaby? Keeping your heads above water?

    Nice to see you kids still around. :)

    And as scary as it might seem, phone support people do grow up to be sysadmins!

    • Yep, still behind the creek, still heads above water (for now). And I'm glad to hear phone support people do grow up to be sysadmins. :)

      If you don't mind my asking, what are you doing these days?

      • for the last 2+ years i was a junior and then co-sysadmin for a downtown webhost company (mostly porn). about 2 dozen machines up here in van and the same amount down in san jose. really interesting work.

        unfortunately they've hit hard times and let most of their staff go, including me, so im riding the self employment wave right now. hopefully the market picks up soon and i can go back to full time somewhere. im thinking of going for an Oracle cert since i have so much free time.

        if the market doesn't pick up by after xmas i may have to sell out and get my mcse. no one is hiring right now.

        anyways, good luck with the machines and the mail server.

        (and the customers of course. i still have trouble answering the phone sometimes. i have permanent ring-shock.) :)
