
Building a Scalable Mail System?

clusteredMail asks: "I work for a small ISP that up until now has survived with single servers for most critical roles, including the mail server. We are planning to introduce multiple mail servers (primarily for email collection via POP3 and IMAP) and want to put in place the most scalable, resistant to failure system that we can manage. Everything is currently running on one or another flavour of Linux. In my mind, the ultimate scenario would be to have some sort of distributed/clustered file system between the multiple machines, so that any user could log onto any server, and the loss of a single server would not cause downtime for any group of users. Has anyone in the Slashdot community had to put together a system like this using Linux and Open Source Software? If so, how did you fare and what were the major stumbling blocks?"
"So far, the plan is to split up the mail accounts between multiple servers and use some sort of connection proxy to sort out which account logs into which server but this seems like a rough approach. The disadvantage to this setup: if one server fails all the users who have accounts on that machine will be in the dark, email-wise."
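
A minimal sketch of the "split accounts between servers" plan, to make its failure mode concrete. The backend hostnames and the hash-based assignment are assumptions for illustration; the question does not say how accounts would be assigned, and a lookup table in a directory or database would serve the proxy equally well.

```python
import hashlib

# Hypothetical backend names; the original post does not name its servers.
BACKENDS = ["mail1.example.net", "mail2.example.net", "mail3.example.net"]


def backend_for(account, backends=BACKENDS):
    """Deterministically map an account name to exactly one backend."""
    normalized = account.strip().lower()
    digest = hashlib.sha256(normalized.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(backends)
    return backends[index]


def can_log_in(account, down, backends=BACKENDS):
    """With one home server per account, a down server means no mail."""
    return backend_for(account, backends) not in down
```

Because every account has exactly one home server, `can_log_in` returns False for every user whose backend is in `down`, which is the weakness the question describes. Note also that a plain modulo hash reshuffles accounts when the server list changes, so a stored mapping is usually the safer choice.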

  • MIT (Score:2, Interesting)

    by Anonymous Coward on Thursday April 20, 2006 @06:24PM (#15169033)
    MIT, an organization which you'd think would have a handle on this sort of thing, simply has a bunch of independent servers and assigns accounts to a specific one. One user might use po10.mit.edu for POP/IMAP, another might use po2.

    Of course, that might just be because the IT department at MIT does not take advice from the faculty and students, and just generally sucks.
  • Foundation. (Score:3, Interesting)

    by deep44 ( 891922 ) on Thursday April 20, 2006 @06:30PM (#15169083)
    I would recommend deploying an LDAP-enabled directory server as the foundation for everything else. Almost every other service can leverage the directory for pulling various information about each user. You can really help yourself down the road by making your directory server _the_ single authoritative source of email-related customer information.
  • by charlesnw ( 843045 ) <charles@knownelement.com> on Thursday April 20, 2006 @06:39PM (#15169140) Homepage Journal
    I run an open source project that is building an Exchange replacement. http://www.thewybles.com/~charles/oser [thewybles.com] is the project homepage. It will be highly available, supporting both hardware (Cisco/WebMux load balancers) and software-based load balancing, along with a whole host of other groupware functionality. I have done high availability e-mail solution deployments. I am in the SoCal area but am willing to travel if necessary. There are others who can help you as well. Your choice. My blog [livejournal.com] covers a lot of the progress of the project and details. I would be happy to work with you to complete this task. Just e-mail me and we can work out an arrangement.
  • by p0rkmaster ( 198870 ) on Thursday April 20, 2006 @07:04PM (#15169280)
    I have been running CommuniGatePro mail servers for years. They have all the features you're looking for and more. The main thing I love about CommuniGate is the fact that I have one application that is the MTA, IMAP server, POP server, and WebMail all in one. No dependencies. No futzing around with PHP or config files for half-a-dozen different applications. And it also supports clustering. A low-end cluster would consist of 2 machines with an NFS or CIFS/SMB backend for the storage. They also support so many operating systems it's not even funny. Solaris, Windows, Linux... even OS/2! You can grab a fully-functional trial off their site and have it up and running in minutes. Check it out, you won't be sorry. Their stuff scales from small systems to really huge ones with millions of accounts. For example, UC Berkeley's mail system is CGP. You can check 'em out at http://stalker.com/ [stalker.com].
  • by Etcetera ( 14711 ) on Thursday April 20, 2006 @08:17PM (#15169657) Homepage
    A properly configured, customized qmail/vpopmail cluster is a beauty to behold. Unfortunately, it takes the better part of a month to get up to speed on how the system works, and it will be many months overall before you really feel "comfortable" with how it works (longer if you're coming solely from a sendmail background).

    That being said, it's also rock-solid, extremely fast when properly configured, and more flexible than you can imagine.

    We currently use a single RAID-10 NFS and MySQL DB system handling the backend, with 5 cluster servers in front of it, each of them able to perform any number of roles. (We had a load balancer in front of them at one point, but it got in the way more than anything else.) A sixth box handles all DNS requests for the servers, and we'll be bringing a 7th up soon to offload some of the spam processing from the three that currently run our asynchronous processing code. The cluster boxes are cheap MicroATX Athlon XP 3000+ machines with 2 GB of RAM. I've seen each box take well over 100 simultaneous SMTP connections without CPU being noticeably affected. Currently 1 does webmail, 1 does incoming MX, 1 does POP3/IMAP, 1 is for development and serves IMAP to the webmail box, and 1 is running SMTP, 587, and SMTP-SSL.

    When properly administered, I think it beats anything out there. However, if you can't afford the time and 3am-bang-your-head-against-your-monitor agony, I'd suggest one of the other solutions people have mentioned here.

    My $.02
  • by Bronster ( 13157 ) <slashdot@brong.net> on Thursday April 20, 2006 @09:18PM (#15169924) Homepage
    Alternatively, check out nginx [sysoev.ru]. Sure you have to wade through the Russian, but the configuration syntax is pretty simple and it's easy enough to build.

    It uses epoll. We replaced a perdition proxy that was seriously loading two servers with a single 8-process nginx instance that's not even breaking a sweat. It's amazing what the change from 32000 processes down to 8 processes can do on a busy site! The two frontend machines are now configured with heartbeat to get full failover of IP addresses. Downtime appears to be on the order of 1-2 seconds with an orderly cutover and probably about 10 seconds for a total host failure.

    Cyrus supports replication now, which is a good way to handle the backends. I'd say more about it, but I haven't actually finished configuring the full failover system yet for this - lots of gating logic required to make sure two machines don't both believe they're master for a bit!

    Er, but why would I help you anyway, you're the competition ;)

    (I work for FastMail.FM [fastmail.fm] btw)
  • by kingradar ( 643534 ) on Thursday April 20, 2006 @10:04PM (#15170112) Homepage
    I started a free email service to compete with Gmail about 2 months after Gmail launched. For those interested, the name is Nerdshack.com.

    At first I used Postfix and Cyrus, but I found it to be a nightmare when you're talking about more than 50k accounts.

    What I wanted was an email platform that integrated with ClamAV, DSPAM, supported SPF, Greylisting/Blacklisting/Whitelisting, and was all controlled from a MySQL database. I also wanted it to support SSL, and clustering.

    Frankly I didn't find anything. So I wrote my own. This may not be your cup of tea, so if not I recommend looking at DBMail (www.dbmail.org) and Cyrus (asg.web.cmu.edu/cyrus). Both are competent mail servers and can be built to support a large user base. The problem I had was expanding their feature sets.

    As has been mentioned numerous times above, getting stock open source software to support a large user base is a huge pain. Combine that with trying to add in things like DSPAM, SPF and ClamAV, and you're going to be faced with a nightmare. The system you end up with will be a kluge of hacks, custom scripts, and chewing gum. To me that seemed too much like a house of cards. On top of which, most open source systems do not handle large quota accounts very well. Run benchmarks against your favorite mail server using a 10 to 20 gig mail store for a single user. You will quickly find that even maildir struggles with that many files. (Hint: make sure you use ReiserFS at least.)

    So I went the route of Gmail, Yahoo, and Hotmail. I wrote my own, and after a few early bumps in the road, it's pretty solid. I had 100% uptime for over 300 days, until last month when I moved datacenters.

    Basically how I set things up is I have an Alteon AD4 load balancer that balances traffic amongst my application cluster. This app cluster runs my custom code which speaks SMTP and POP (IMAP is about 75% done), and interfaces with DSPAM, ClamAV, libspf, OpenSSL, LZW (for compression), and supports a host of other features. I even support using public key encryption (ECC) to store messages on disk with your public key, and then encrypt your private key with AES256 using your password as the key. It's seamless to the end user, but guarantees privacy while the mail is on my servers. I even created a point system to allow me to automatically block IP addresses that attempt dictionary attacks, etc. (though it's disabled at the moment). Each server caches everything it can to reduce database load, and uses a connection pool for retrieving messages and running queries. I wrote my server in C, so it's very, very fast.
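
A minimal sketch of what a point system for blocking dictionary attacks might look like. This is not the poster's code (which is written in C and unpublished); the class name, threshold, and time window are all hypothetical values chosen for illustration.

```python
import time
from collections import defaultdict


class FailurePoints:
    """Accumulate penalty points per IP and block once a threshold is hit."""

    def __init__(self, threshold=10, window=600.0):
        self.threshold = threshold  # points needed to block (assumed value)
        self.window = window        # seconds a point stays on the record
        self._events = defaultdict(list)  # ip -> [(timestamp, points), ...]

    def record(self, ip, points=1, now=None):
        """Charge points to an IP, e.g. after a failed login."""
        now = time.time() if now is None else now
        self._events[ip].append((now, points))

    def score(self, ip, now=None):
        """Sum the points still inside the window, dropping expired ones."""
        now = time.time() if now is None else now
        cutoff = now - self.window
        recent = [(t, p) for t, p in self._events[ip] if t > cutoff]
        self._events[ip] = recent
        return sum(p for _, p in recent)

    def is_blocked(self, ip, now=None):
        return self.score(ip, now) >= self.threshold
```

Letting points expire means a mistyped password a few times a day never adds up to a block, while a burst of failures from one address does.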

    These app servers store user information, preferences, etc., in a MySQL database. The actual messages are stored on message storage servers using a custom algorithm and protocol for speed. Every message is stored on two servers, with both locations stored in the database. Needless to say my system rocks. Each Dell 1650 server can support close to 1000 simultaneous connections while using less than 10% of the CPU.
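
A minimal sketch of the "every message on two servers" idea. The poster's placement algorithm is custom and not described, so the rendezvous-style hashing below is a swapped-in technique, and the function names are hypothetical. The returned list stands in for the two locations the post says are recorded in the database.

```python
import hashlib


def pick_replicas(message_id, servers, copies=2):
    """Choose `copies` distinct storage servers for one message."""
    distinct = set(servers)
    if len(distinct) < copies:
        raise ValueError("not enough distinct storage servers")

    def weight(server):
        # Rank servers by a hash of (server, message) so placement is
        # deterministic and spreads messages across the pool.
        key = f"{server}|{message_id}".encode("utf-8")
        return hashlib.sha256(key).hexdigest()

    return sorted(distinct, key=weight)[:copies]


def first_available(locations, down):
    """Return the first recorded location that is still up, else None."""
    for server in locations:
        if server not in down:
            return server
    return None
```

With two recorded locations per message, a read only fails when both servers holding that message are down at the same time.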

    What I'm working on now is the IMAP server, integration with Memcached, and moving configuration settings into an XML file. Right now config settings are DEFINE parameters, which means changing anything requires a recompile. I've also found that my database is the bottleneck, so I want to offload as much as I can to Memcached. Checking whether an email address exists, or using my custom point system with the database is too inefficient, so I hope Memcached will help.
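
A minimal sketch of the cache-aside pattern the poster hopes Memcached will provide for the "does this address exist?" check. To stay self-contained it uses an in-process dictionary with expiry instead of a real Memcached client; `TTLCache`, `address_exists`, the key prefix, and the 300-second lifetime are all assumptions for illustration.

```python
import time


class TTLCache:
    """Tiny in-process stand-in for a cache server such as Memcached."""

    def __init__(self):
        self._data = {}  # key -> (value, expiry timestamp)

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires = entry
        if time.time() >= expires:
            del self._data[key]
            return None
        return value

    def set(self, key, value, ttl):
        self._data[key] = (value, time.time() + ttl)


def address_exists(address, cache, query_db, ttl=300.0):
    """Check the cache first; only fall through to the database on a miss."""
    key = "addr:" + address.lower()
    cached = cache.get(key)
    if cached is not None:
        return cached == "1"
    exists = bool(query_db(address.lower()))
    # Cache negative answers too, since lookups for nonexistent
    # addresses (spam, dictionary attacks) are the common case.
    cache.set(key, "1" if exists else "0", ttl)
    return exists
```

The trade-off is staleness: a newly created or deleted account may be answered from the cache for up to `ttl` seconds unless the entry is explicitly invalidated.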

    I've thought about releasing the source under the GPL, but I don't think it's quite ready. I want to at least get config settings into an XML file first. I'd also like to find a company to sponsor my development, but that hasn't happened yet. (I still have a day job.)

    Executive summary: it's always more fun to write your own, and then post to /. about it.
