CmdrTaco's Journal: Oops, Server Slowdowns, Moderation Issues

So a few years ago, robo used to maintain the slashdot at slashdot dot org email address. It really isn't used for anything, except as the reply-to for the system emails. Because of this, the account is virtually useless, since it consists almost entirely of bounced messages and people asking for help unsubscribing (despite the fact that instructions for unsubscribing have pretty much always been included at the end of the email).

Anyway, robo left us shortly after we moved our offices across the state. Now he works with the forces of accountancy and wields a pencil.

Of course, this has left the mailbox unattended in the meantime, because I, being an idiot, failed to reassign the duties of this box to anyone else. Fast forward to today, and I get an email from the mail administrator telling me that the box has now exceeded 2 gigs.

Now I don't know what our ISP is using for a mail server, but apparently neither its webmail interface nor its POP interface is capable of parsing a 2 gig mail file. So I'll have to simply have them blow it away. Since I read this email address for years, I know that it contains probably a hundred messages that actually matter, and they are all out of date by now anyway. Anyone who has a clue would have emailed an actual person and not the generic catchall box that isn't used anywhere on the site.

Anyway, Oops. Sorry to anyone who tried to contact us through that channel. Heh.

Also, to those who've been asking, yes, we have apparently been having some pretty substantial problems serving pages to some users under some circumstances. We're really at a loss, but we've been looking for the problem. We've changed a lot of stuff under the hood in the last couple of weeks... we have a new load balancer and new webheads. There are a LOT of variables, not the least of which is a huge growth in the number of pages we've been serving (yesterday we served 3.3M pages). The side effects have been numerous, but we're working on it.

One of the more obnoxious side effects of a 20% growth in traffic is that our accesslog has swollen noticeably, and our pathetic little DB box that handles all logging activity has buckled under the pressure. This box handles a few tasks, including the divvying up of mod points. Since it is behind schedule, this has resulted in a substantial cut in the number of points being put into the system. This ought to be fixed in the next day or two. We know about it. Please don't submit SF bug reports!

The new webheads are beefy boxes, and we estimate that they are capable of serving as much as double the traffic that we usually serve during peak hours (although the databases would probably be a bottleneck at that point). But there clearly are some issues for people. We've had reports of 500 errors, which we think are a problem with the load balancer, but we can't duplicate them. We've had reports of slowness too, but I'm getting pages fast as hell from all server types. Some people apparently are having problems frequently, but it's been cool when we try to hit it. And since traffic is up, I assume it's working for most people most of the time.

