Best Way To Archive Emails For Later Searching? 385

An anonymous reader writes "I have kept every email I have ever sent or received since 1990, with the exception of junk mail (though I kept a lot of that as well). I have migrated my emails faithfully from Unix mail, to Eudora, to Outlook, to Thunderbird and Entourage, though I have left much of the older stuff in Outlook PST files. To make my life easier I would now like to merge all the emails back into a single searchable archive, just because I can. But there are a few problems: a) Moving them between email systems is SLOW; while the data is only a few GB, it is hundreds of thousands of emails and all of the email systems I have tried take forever to process the data. b) Some email systems (e.g. Outlook) become very sluggish when their database goes over a certain size. c) I don't want to leave them in a proprietary database, as within a few years the format becomes unsupported by the current generation of the software. d) I would like to be able to search the full text, keep the attachments, view HTML emails correctly and follow email chains. e) Because I use multiple operating systems, I would prefer platform independence. f) Since I hope to maintain and add emails for the foreseeable future, I would like to use some form of open standard. So, what would you recommend?"
This discussion has been archived. No new comments can be posted.

  • by Mikkeles ( 698461 ) on Monday September 06, 2010 @10:44AM (#33488876)

    Alphabetically!

  • IMAP (Score:5, Informative)

    by klingens ( 147173 ) on Monday September 06, 2010 @10:44AM (#33488884)

    An IMAP server (dovecot, cyrus, courier) of your choice for Linux. If you don't have a Linux server you can always run it inside a small VM.

    • Re: (Score:2, Informative)

      by hedwards ( 940851 )
      Yeah, IMAP is the way to go, personally, I use IMAP on my email account and mailstorehome [mailstore.com] to do the actual download and backup. The OP will probably end up having to set up a personal server to get the program to download the older mail, but that can be done easily enough via a virtual machine.
    • Kmail has an excellent .pst converter that will pull out your old Outlook mail. Once you have it in Kmail, you can drag and drop it into any of the supported formats: mbox, maildir, etc. If you have already established filters, you can let them sort things out. If not, you can use a manual search for to, from, mail list, subject, etc. From there you can run your imap. I carry everything around on my laptop and use kmail instead of using imap. With full drive encryption and xscreensaver, I don't have any worr
      • Re: (Score:3, Informative)

        by AndGodSed ( 968378 )

        ++ the above, or Evolution - it also imports PST's and from there you can move it to Thunderbird for Windows. If you want uber searchability you could then upload the whole shebang to a gmail account that you sync offline via gears.

        I personally would balk at having all that stuff online with google but hey that would be the best searchable option I know. You can also sync with your Gmail account via imap protocol if gears and the web interface is not for you. Problem with that is that you will lose the grea

    • Perhaps the best route would be to use MySQL or some other FOSS database and build a web front end for browsing, searching, etc
    • by arivanov ( 12034 )

      The guy mentioned entourage. If he is running MacOS he can run any of these on MacOS.

      This solves the "storage" problem. However, this does not solve the search/index/etc problem. I have a 9G+ and growing IMAP store going back to 1999 with several hundred folders in it, so I am facing a similar problem. Using Thunderbird search and even grepping it on the server just does not cut it any more.

      • Re: (Score:3, Informative)

        by wealthychef ( 584778 )
        Well, on OS X the searching problem *should* be solved by Spotlight, as it indexes "all files on your hard drive" (not) into constant-time searches automagically. The trouble with Spotlight is that Apple does not search all folders and I do not know of a way to enable it to search all folders. If you import it into Mail.app, you do get the indexed behavior, and my situation is similar to yours, and I do exactly that. But all those billions of old messages, I keep in an archive that I never look at.
        Any
    • For storage, IMAP is definitely the way to go.

      I'm using Cyrus myself for this exact purpose (e-mail from the last 7 years about; estimate 20 GB worth of mails; I have many mails that come with attachments). No specific reason to use that one; seemed to be the easiest to set up at the time; it works fine for me.

      Main reason for me to use an imap server is that it is client-independent, and as it's open source it's not some weird proprietary format. So great to store mails, easy to retrieve mail remotely, ea

    • by tenco ( 773732 )
      My first thought was "Maildir". AFAIK there are IMAP servers with a maildir backend. So, yeah, someone should get an IMAP server with Maildir backend for this job.
    • Re:IMAP (Score:4, Informative)

      by 19thNervousBreakdown ( 768619 ) <davec-slashdotNO@SPAMlepertheory.net> on Monday September 06, 2010 @12:20PM (#33489688) Homepage

      Seconding this. I've been using Dovecot with Maildir on EXT3 for the last few years--my mailbox is about 25k messages, which I keep all in a single folder and use IMAP tags to organize into different virtual folders, much like Gmail's system but without the privacy concerns.

      Dovecot's supplementary indexes make everything extremely fast (tags, dates, etc), and anything they don't catch Thunderbird does; I can search my entire mailbox for a single word in less than a second. I lose my Thunderbird indexes whenever I move to a new computer, but that's just a matter of leaving the client up for a few hours.

      • Re: (Score:3, Informative)

        by vanyel ( 28049 ) *

        I am using dovecot and thunderbird, and have about 60 "live" folders, some with tens of thousands of messages (a couple with 150+k messages). It is a constant battle with thunderbird, which often goes away for long periods of time, even when not doing anything one would expect to be dealing with the larger folders.

        I'm working on some scripts to archive messages into 30-90-180 day archive folders to keep the live folders down to a manageable size, but it would be nice to find something that already exists..
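
A minimal Python sketch of the age-based bucketing described above, for anyone writing similar scripts. The folder names `archive-30`, `archive-90` and `archive-180` are hypothetical; only the 30/90/180-day thresholds come from the comment. It just decides which bucket a message belongs in from its Date header; actually moving the message is left to whatever IMAP or Maildir tooling you use.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

# Hypothetical folder names; the thresholds are the 30/90/180 days mentioned above.
BUCKETS = [(180, "archive-180"), (90, "archive-90"), (30, "archive-30")]


def archive_folder(date_header, now):
    """Return the archive folder for a message, or None if it should stay live."""
    sent = parsedate_to_datetime(date_header)
    if sent.tzinfo is None:
        # A "-0000" zone yields a naive datetime; treat it as UTC.
        sent = sent.replace(tzinfo=timezone.utc)
    age_days = (now - sent).days
    for threshold, folder in BUCKETS:  # oldest bucket is checked first
        if age_days >= threshold:
            return folder
    return None
```

Passing `now` in explicitly (for example `datetime.now(timezone.utc)`) keeps the function easy to test and makes a whole archiving run use one consistent cutoff.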

    • Re: (Score:3, Informative)

      by flyingfsck ( 986395 )
      Totally. All my email since about 1998 is in a Citadel mail server. It uses the BerkeleyDB, which can handle 256 Terabytes of mail. That should be enough for any semi-sane person...
  • Delete (Score:2, Insightful)

    by Anonymous Coward

    Time to delete them all

    • DO NOT DELETE. (Score:5, Insightful)

      by GuyFawkes ( 729054 ) on Monday September 06, 2010 @11:47AM (#33489440) Homepage Journal

      I can't tell you the number of times I nearly deleted my archived data, going back to 1997 in my case, not just e-mail either.

      Then I got falsely accused of everything except 9-11 as part of a separation / child custody battle that started with a nuclear attack out of the blue.

      It is amazing how much of that old data is relevant in such cases, "He did x on 1st June 2000 at our house!" and you have data showing you were 200 miles away doing something you had completely forgotten, with someone you haven't spoken to or seen for 7 years, at the time...

      DO NOT DELETE YOUR ARCHIVES, EVER!***

      *** unless of course you are a bad person and they incriminate you, in which case you'd better avoid everyone else who archives data.

      • Re:DO NOT DELETE. (Score:4, Insightful)

        by cervo ( 626632 ) on Monday September 06, 2010 @12:35PM (#33489848) Journal
        This can also work against you. Most big companies have record retention policies that include when to delete e-mails, because those same archives that saved you can bite you in the butt. Also, in reality you should be innocent until proven guilty anyway, although I know civil court works differently. But if there is anything you did, maybe an e-mail to another woman that can be spun as evidence you had another girlfriend (even if it was a harmless e-mail just saying hi), then it could bite you.

        Plus no one is 100% squeaky clean. Maybe you admitted you were speeding to someone. Maybe you bought porn website memberships (which could be spun as the reason for a break up, or that you are an unfit parent). Maybe you admitted you were a little too drunk to drive but did it anyway. Maybe you ordered a set of army knives and have the receipt and that gets spun as you have weapons all over the place that could endanger the kids....

        Anyway, just saying that too many records could bite you too, especially if someone from court gets an order for all of them. Then they can be pulled out of context and could be very damaging. Even medical issues could be in the e-mail archives from correspondence with doctors, confirmations of appointments, etc... If that data ever got out it could be damaging to buying insurance as well.
      • Re: (Score:3, Insightful)

        by afabbro ( 33948 )
        Alternatively, spend more time on your personal relationships and home life than maintaining your email archives.
  • by perpenso ( 1613749 ) on Monday September 06, 2010 @10:48AM (#33488916)

    I have kept every email I have ever sent or received since 1990 with the exception of junk mail (though I kept a lot of that as well) ...

    You are a hostile lawyer's fantasy come true. ;-)

    • by Jawnn ( 445279 )
      Bravo, sir. You win my "most insightful comment of the week" prize, hands down.

      The OP should give some very serious thought to the wisdom of keeping all that email. It may be relatively harmless (I'll wager he's not in a position where his correspondence is likely to be of interest to potential litigants), but dude, hoarding is a disease. Seek treatment.
      Meanwhile, look for something that uses IMAP style storage and a database for indexing purposes. Be prepared for a laborious process of importing and i
    • Re: (Score:3, Insightful)

      by ShakaUVM ( 157947 )

      >>You are a hostile lawyer's fantasy come true. ;-)

      We've won a couple lawsuits because I save all of my email.

      We had a contract to do a workshop with Maricopa County - the same people whose Sheriff is under investigation by the FBI right now, and of Immigration Law fame. And who have a lot of other shady things going on right now, but I digress.

      I'd traded a series of emails with them planning the workshop. Everything was all set. Then, about a week before the workshop, they say they don't need me to c

  • Google Mail. (Score:3, Insightful)

    by sidragon.net ( 1238654 ) on Monday September 06, 2010 @10:50AM (#33488934)
    See subject.
  • Not Much (Score:2, Informative)

    by maxume ( 22995 )

    It isn't particularly platform independent (because no one is paying much attention to Windows), but Not Much offers threads and full text search:

    http://notmuchmail.org/ [notmuchmail.org]

    • Re:Not Much (Score:4, Informative)

      by koiransuklaa ( 1502579 ) on Monday September 06, 2010 @11:29AM (#33489296)

      +1

      Notmuch can manage absolutely insane amounts of email without any artificial 'archiving'. Of course, if you are looking for a program that does something other than tagging and searching (like sending, composing or receiving email), you need to look elsewhere.

  • Print (Score:5, Funny)

    by JustOK ( 667959 ) on Monday September 06, 2010 @10:52AM (#33488952) Journal

    Print then scan

  • Gmail? (Score:5, Informative)

    by spiffydudex ( 1458363 ) on Monday September 06, 2010 @10:52AM (#33488956)

    While not open source, Gmail has a good search engine that isn't sluggish. Plus it has roughly 7.5 gigs of space to store data. Use IMAP to push all of your emails to the server and then use that Gmail account for archive email only.

    • Re: (Score:3, Insightful)

      by siliconbits ( 943161 )
      I second that. Invest in Google Apps to benefit from additional services as well.
      • Re: (Score:3, Insightful)

        by pvera ( 250260 )

        Yes! The thing that appeals to me the most about using Gmail is that searching through 5+GB of old emails won't make everything in my machine slow to a crawl. Even with the free Gmail account, you can up the storage to 20GB for $5/year, and that extra space is available from other Google services connected to the same account.

        If you want to have more flexibility, sign up for a Backupify account, which can backup Gmail pretty well. As a bonus, when Backupify stores your backups they are kept in plain text fo

    • by perpenso ( 1613749 ) on Monday September 06, 2010 @11:03AM (#33489056)
      And now the poster becomes an advertiser's dream come true in addition to being a hostile lawyer's dream come true. ;-)

      Remember that from Google's perspective gmail is a tool to better profile you for targeted advertising. Make sure you are OK with that before giving them access to all your emails.
      • Re: (Score:2, Interesting)

        by Nemilar ( 173603 )

        OK, so I hear this a lot and I never really understand the problem.

        The "unwritten gmail contract" (and it actually applies to most Google products) is this: We will give you a service for free (in this case Gmail), and in return we are going to profile your use of that service to select ads for you. In the case of gmail, they give you however many GB of storage, always-on cloud email, and the best searchable email system I've ever seen. There are other Google examples, from gtalk to Google Docs. The bas

        • By storing personal data on gmail, you are one hack away from identity theft. I prefer to keep as few personal details on the net as possible.
        • There is nothing unwritten about it. Google is quite up front in their agreement that they data mine your emails for targeted advertising purposes. I agree that there is nothing wrong with this, but I disagree that most people are aware of this.
        • Until they start selling that information about you to third parties. Google having a profile about me that's used in house to target ads to me, is OKish acceptable. Them selling this info to third parties is a definite no-go. And there is nothing that I am aware of preventing them doing just that, other than their own ethics.

      • by jridley ( 9305 )

        Yup, I'm really highly concerned that an advertiser might learn that I like electronics and am a huge computer geek. Because there's no other way they could know that.

        Seriously, this is what I did; I pushed everything to GMail, like the OP, tens of thousands of emails, going back to the 90s.

        Email is not and has never been a secure medium, so if you've been putting sensitive data in emails, you're not being really bright anyway.

  • OK, My Favorite (Score:3, Interesting)

    by BoRegardless ( 721219 ) on Monday September 06, 2010 @10:53AM (#33488962)

    MailSteward on the Mac.

    SQL database. Good, Inexpensive, works w/many tens of thousands of emails & more.

    http://mailsteward.com/ [mailsteward.com]

    • Forgot to note a key factor and that is ultimately format independence, since email clients come and go over time & then many key output formats, so you are not restricted on that avenue.

      The search function is certainly a key for me, as sometimes I know only one key word in attempting to find a note about material, object or company from 15 years back.

  • Mbox or SQLite (Score:2, Insightful)

    by Anonymous Coward

    If you want an "email format" why not mbox? Many things currently support that as an import option.

    If you want a database, why not SQLite? It's about as open as can be, backwards compatibility is almost a religion and should have no problem with hundreds of thousands of entries.
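
A minimal Python sketch of the SQLite idea, using only the standard library. The table layout and function names here are made up for illustration, not any established schema. The full original message is stored untouched in one column, so nothing is lost if the header columns turn out to be wrong.

```python
import sqlite3
from email import message_from_string


def open_archive(path=":memory:"):
    """Create (if needed) a one-table archive; the schema is illustrative only."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS mail ("
        " id INTEGER PRIMARY KEY,"
        " sender TEXT, subject TEXT, date TEXT,"
        " raw TEXT NOT NULL)"  # the full original message, kept unmodified
    )
    return db


def add_message(db, raw):
    """Store one message, copying a few headers into their own columns."""
    msg = message_from_string(raw)
    db.execute(
        "INSERT INTO mail (sender, subject, date, raw) VALUES (?, ?, ?, ?)",
        (msg["From"], msg["Subject"], msg["Date"], raw),
    )
    db.commit()


def search(db, text):
    """Return (sender, subject) for every message whose raw text contains `text`."""
    pattern = "%" + text + "%"  # note: % and _ inside `text` act as LIKE wildcards
    rows = db.execute(
        "SELECT sender, subject FROM mail WHERE raw LIKE ? ORDER BY id", (pattern,)
    )
    return rows.fetchall()
```

`LIKE` scans every row, so this is a starting point rather than a fast full-text index. If your SQLite build includes the FTS5 extension, a virtual table would probably be the better fit for hundreds of thousands of messages, but that depends on how your copy was compiled.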

  • mbox + grep (Score:5, Funny)

    by Anonymous Coward on Monday September 06, 2010 @10:54AM (#33488972)

    I use mbox format [wikipedia.org] files and grep [gnu.org].

    IMO, one can't get much more portable than that.
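
For platforms without grep, roughly the same thing can be sketched with Python's standard `mailbox` module. The function name is hypothetical, and unlike a raw grep this reports which message matched rather than which line.

```python
import mailbox


def grep_mbox(path, needle):
    """Yield (From, Subject) for each message in an mbox containing `needle`."""
    needle = needle.lower()
    box = mailbox.mbox(path, create=False)  # fail instead of creating a new file
    try:
        for msg in box:
            # as_string() covers headers and body, like grepping the raw file.
            if needle in msg.as_string().lower():
                yield msg["From"], msg["Subject"]
    finally:
        box.close()
```

Like grep, this matches the message as stored, so text inside base64 or quoted-printable parts will not be found unless you decode the parts first.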

  • Maildir (Score:5, Informative)

    by alexhs ( 877055 ) on Monday September 06, 2010 @10:56AM (#33489006) Homepage Journal

    Maildir [wikipedia.org].

    And if you have an e-mail client that doesn't support it, use an IMAP server to feed your client. /thread

    • Re: (Score:2, Informative)

      by Anonymous Coward

      Maildir [wikipedia.org].

      And if you have an e-mail client that doesn't support it, use an IMAP server to feed your client. /thread

      With the proviso that you probably want to break up your archives in something akin to the following format:
      . 2009
      . . Q1
      . . . Sent
      . . . Received
      . . Q2
      . . . Sent
      . . . Received
      [...]
      . 2010
      . . Q2
      . . . Sent
      . . . Received

      Lots and lots of messages in a directory can cause problems with many file systems. If you have more than say ~8K or so messages in a folder, I'd recommend breaking it up. This is what I do at work (CY/Qx/Sent-Received), which also allows me to move entire quarters into PST files when I
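
The year/quarter layout above can be sketched in a few lines of Python. The function name and the slash-separated path format are assumptions for illustration; how the path maps onto real folders depends on your IMAP server or mail client.

```python
from email.utils import parsedate_to_datetime


def quarter_folder(date_header, sent):
    """Map a Date header to a path such as '2009/Q1/Sent'."""
    when = parsedate_to_datetime(date_header)
    quarter = (when.month - 1) // 3 + 1  # Jan-Mar -> 1, Apr-Jun -> 2, ...
    direction = "Sent" if sent else "Received"
    return f"{when.year}/Q{quarter}/{direction}"
```

The quarter is taken from the date as written in the header, in the sender's own time zone, which is usually good enough for filing purposes.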

  • Good IMAP Server (Score:5, Informative)

    by caffeinejolt ( 584827 ) on Monday September 06, 2010 @10:58AM (#33489022)
    If this is really important to you, and you want it all to work across multiple workstations/OSes, your best bet will be to store it all in IMAP [wikipedia.org]. If you have the means and motivation to run this yourself, I would recommend Dovecot [dovecot.org]. If you don't have the means and motivation, then you can use a service like Gmail to run your IMAP although you give up certain freedoms in doing so. For example, I use Dovecot coupled with Maildir++ [wikipedia.org] as the physical storage format - as a result I can (if I wanted to) change to any email client I wish very quickly, use different email clients at the same time, etc.
  • Maildir (Score:5, Interesting)

    by roderickm ( 6912 ) on Monday September 06, 2010 @10:59AM (#33489028)

    Maildir storage format is resistant to bit-rot because it stores each message in a separate file, and uses filesystem directories for mail folders. It's widely supported by user agents (mail readers) and IMAP/POP3/SMTP servers, so you'll never be stranded by the actions of a single software vendor. Finally, it's easily searched using everyday unix tools - find, grep, sed, awk, etc., and you can use the full-text search engine of your choice for speedy searches.

    • Re:Maildir (Score:4, Informative)

      by El_Muerte_TDS ( 592157 ) on Monday September 06, 2010 @11:12AM (#33489152) Homepage

      mairix is a useful addition to a maildir setup: http://www.rpcurnow.force9.co.uk/mairix/ [force9.co.uk]

      • Re: (Score:3, Informative)

        by Sancho ( 17056 ) *

        mairix is good, but it has some warts and it is not under development anymore. Among other things, it can run out of memory, has problems with parsing certain multipart messages, and can't search for an IP address (or any other string with dot-separated tokens.)

        It's about the best I've found, but I wish someone would pick up development and fix some of the issues. As time goes on, bit-rot is going to set in and mairix will get less and less useful.

    • Comment removed based on user account deletion
    • Re: (Score:3, Insightful)

      by jgrahn ( 181062 )

      Maildir storage format is resistant to bit-rot because it stores each message in a separate file, and uses filesystem directories for mail folders. It's widely supported by user agents (mail readers) and IMAP/POP3/SMTP servers, so you'll never be stranded by the actions of a single software vendor. Finally, it's easily searched using everyday unix tools - find, grep, sed, awk, etc., and you can use the full-text search engine of your choice for speedy searches.

      The only sane alternatives are, as far as I'm c

  • citadel (Score:4, Informative)

    by samjam ( 256347 ) on Monday September 06, 2010 @10:59AM (#33489034) Homepage Journal

    citadel at www.citadel.org is a full pop3/imap server with full-text indexing.

    Thunderbird can use server-side searches to find messages, and I find that works pretty well.

  • Have you looked at Archiveopteryx [archiveopteryx.org]? That is one potential solution to the storage side of the problem. It stores the messages into a PostgreSQL database with minimal tinkering, so you can always get the original plain text stuff back out again. Consider it a database of mbox files that exposes an IMAP interface. You can't get any less proprietary than Postgres, and you can scale up many of its operations using standard database approaches in that area.

    What I would do here is store messages there as my pe

  • I'll chime in with my own solution. My archive is not as extensive as yours but I have most everything from 2005 or so (excepting mailing lists, other junk, etc.). My solution is sort of silly: I just use Apple's Mail.app. I use it because Mail.app lets you store and organize everything in separate folders, and because Spotlight is blazingly fast and does a great job of searching. I try to keep the number of messages in a folder on the order of a few thousand, for my e-mail
  • In your will donate your archive to science. I'm sure it would make an interesting thesis project for some PhD candidates out there. I'm serious, consider this.
  • There's one method I've used fairly often in the past for getting mail out of an older client, provided the older client supports imap (lookout and lookout express do).

    First, set up a new account on your imap server just for archival purposes (you can set up an imap server on any UNIX/Linux distro and even Windows with Cygwin fairly easily; dovecot is a good place to start). Make sure it's using either mbox or maildir (preferred).

    Second, setup said account on all the mail clients you'd like to archive.
  • You should put all that stuff on an IMAP server on your home network (preferably a box you can reach from the outside using DDNS or a static entry if you have your own domain).

    In that way your client OS'es can be whatever platform you choose, and they will all be able to access your mail storage.

    Put older mails in separate folders.

    If you can work with Linux there are plenty of choices. If not, consider Windows Home Server and get a mailserver product for Windows - there are plenty!

    Many advanced email client

  • While this answer will almost certainly not suit the OP, it may be of interest to other folk looking to archive their email. Using python and a combination of imaplib [python.org] and some basic file I/O you can save the original text of messages. My rationale for this was firstly that it's probably less problematic than converting between various email client formats; and secondly that it's a decent way to learn some python! ;)

    My rather basic implementation just dumps every email from an (IMAP) folder sequentially. I r
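
For anyone wanting to try the same approach, here is a minimal sketch of such a dump with `imaplib`. It is not the poster's script; the function names and the `.eml` file naming are assumptions, and the host and credentials are yours to supply.

```python
import imaplib
import os


def connect(host, user, password):
    """Open a logged-in IMAP-over-SSL connection."""
    conn = imaplib.IMAP4_SSL(host)
    conn.login(user, password)
    return conn


def dump_folder(conn, folder, out_dir):
    """Save every message in `folder` as out_dir/<sequence number>.eml; return the count."""
    os.makedirs(out_dir, exist_ok=True)
    status, _ = conn.select(folder, readonly=True)  # read-only: flags stay untouched
    if status != "OK":
        raise RuntimeError(f"cannot select {folder!r}")
    status, data = conn.search(None, "ALL")
    if status != "OK":
        raise RuntimeError("search failed")
    count = 0
    for num in data[0].split():
        status, parts = conn.fetch(num, "(RFC822)")
        if status != "OK":
            continue  # skip messages the server refuses to return
        # For a single-message fetch, parts[0] is (response header, raw message bytes).
        raw = parts[0][1]
        with open(os.path.join(out_dir, num.decode() + ".eml"), "wb") as fh:
            fh.write(raw)
        count += 1
    return count
```

Typical use would be `dump_folder(connect("imap.example.org", "user", "password"), "INBOX", "inbox-dump")`, with a placeholder host. Sequence numbers change when messages are deleted, so for repeated incremental runs the UID-based commands are probably a better key for the file names.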

  • Scary thought, but you might just want to pick up one of the tools that the lawyers use for electronic discovery. They cover multiple mail formats (including older generations of said formats) and are set up so that it's easy for an intern to search for keywords and the like, so someone that understands tech should be able to use it. I've had to use the Clearwell appliance and it did what it was supposed to do, including finding attachments and indexing them for ease of search. (No, I don't work for Clea

  • by Anonymous Coward

    I recommend mbox (MBX) format.

    1. The format is text based and not likely to become unreadable anytime in the foreseeable future.

    2. There is no shortage of tools for manipulating mbox.

    3. It's easily indexed by full text search applications (MS Search included with Windows).

    The Outlook tools' save dialogue has an Apple export option, which is actually the mbox format.

    In terms of archival access I recommend an IMAP server with a folder hierarchy based on month/year. Your mail client should be configured to leave t

    • Re: (Score:3, Insightful)

      by Sancho ( 17056 ) *

      I find that Maildir works better than mbox for my purposes. Roughly all of the same pros, plus:
      4) Doesn't require locking your entire mailbox to modify one message.
      5) Resistant to file/inode corruption (will likely only corrupt one message instead of several.)
      6) Can essentially use shell tools to copy individual messages.

      One thing that's neat to do with maildir mailboxes is to search using grep+xargs and copy the messages you find into a new maildir mailbox (named, perhaps, searchresults). Then you have a
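
The grep+xargs trick can also be sketched with Python's `mailbox` module, which is handy on systems without the shell tools. The function name is hypothetical, and the match is a plain case-insensitive substring test on the stored message, not a real index.

```python
import mailbox


def copy_matches(src_path, dst_path, needle):
    """Copy every message in the src Maildir containing `needle` into the dst Maildir."""
    needle = needle.lower()
    src = mailbox.Maildir(src_path, factory=None, create=False)
    dst = mailbox.Maildir(dst_path, factory=None, create=True)
    copied = 0
    for msg in src:
        if needle in msg.as_string().lower():
            dst.add(msg)  # writes a new file; the original is left in place
            copied += 1
    dst.flush()
    src.close()
    dst.close()
    return copied
```

Calling `copy_matches("archive", "searchresults", "invoice")` with your own paths leaves a results mailbox that any Maildir-aware client or IMAP server can open. Running it twice adds the hits again, so clear the results mailbox between searches.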

  • by juanca ( 49302 ) <jchacon.gmail@com> on Monday September 06, 2010 @11:20AM (#33489210) Homepage

    At work, we needed to archive (for compliance purposes) all the inbound/outbound email messages of our users (approximately 1K). We set up an Ubuntu server with postfix and dovecot IMAP over SSL, using Maildir.

    Our users generate about 20K email messages daily, and we store each day in its own directory, something like this:

    INBOX
            |- YYYY
                          |- MM
                                    |- DD

    The auditors use Evolution to connect to the archive server and search the emails, even though it takes a little while to load a day of emails for the first time, once it's properly loaded searching is really fast. The server is not that powerful, it's a VM with 2 CPUs and 2GB of RAM. You do need a lot of storage though.

    Hope this helps.

  • I still use Eudora... 7.1.09 paid mode from years ago... I use XP for my wife's computer and have different Eudora folders based on who is logged in. Works like a champ. The nice thing is I can sort the old emails by sender (for listservs and such) to be put into folders, and then use the find email function to search things. I hardly ever have problems finding an email as long as I know WHO/WHAT I'm looking for and where - Body, from, subject, etc.. Sadly, No meta tags.. :( BTW, Mine goes back to.. e
  • The many comments here about using just imap with maildir or mbox storage backends forget to mention that these are all very slow to search when you have thousands of messages. They don't store the files in any kind of disk-seek friendly format. So..

    I suggest either putting a dovecot with maildir++ system on fast SSD to overcome the poorly organized (on disk) files
    -and/or-
    using a mysql/postgresql backend on dovecot or courier or your favorite imap that supports *sql. The mail would be stored with each deta

  • by Fat Cow ( 13247 ) on Monday September 06, 2010 @11:27AM (#33489288)
    I migrated all my old personal emails to gmail using IMAP. You can use this to migrate between different on-disk formats like maildir, mbox and pst. I had all my email in yahoo and pulled it down using POP to a maildir, then used an IMAP mail client to copy it across to gmail. Then I regularly back them up from gmail to an on-disk maildir format using mbsync [sf.net]. I picked maildir because it's open and seemed better designed than the alternative, mbox. It's not completely standardized though. I've seen PSTs become corrupt so I try and stay away.
  • There's a commercial, but low cost, package that I've used to do exactly what you are describing: http://www.aid4mail.com/

    Aid4Mail converts email to and from a variety of mail formats. The feature that you might find useful is that it will create a zip archive that contains standard .msg format email messages. Use that in combination with an indexing programme. I use X1 (http://x1.com/), but there are lots of indexing programmes that will index zip archives for easy searching.

  • by WetCat ( 558132 )

    http://xena.sourceforge.net/ [sourceforge.net]
    A great piece of free Java software for automatic normalization and archiving of mail (and other documents), developed by the Australian Government

  • Google Apps for your domain offers a bulk-import feature from Outlook and other clients.

    Gmail offers all that you wish for. Take the free premium trial for GApps, bulk import, then cancel. Problem solved? :)
  • As many above have mentioned part of this, I just wanted to put some of it together:

    - setup a small server with a file system with checksums - ohh, that probably just leaves zfs
    - setup dovecot on the server with maildirs
    - setup clients to use imap to put messages on the server, if you have any existing imap-accounts, use mbsync directly on the server
    - setup thunderbird as a client to index it all in thunderbirds own index-files, so you can search it directly from thunderbird
    - use xapian or something similai

  • by dskoll ( 99328 )

    I have mail going back to 1991 archived as mbox files. Some of it is pretty disorganized, but since 2000 I've organized mail into Sent-Archived and Received-Archived directories with the mbox files named YYYY-MM.

    It's a pain to search. But on the other hand, I hardly ever need to search the really old stuff, so grep and friends are good enough.

    I may eventually split it out into maildir format and use a full-text indexing engine such as Xapian to make searching easier. But I'll probably keep the master

  • Echo chamber... (Score:5, Informative)

    by MrNemesis ( 587188 ) on Monday September 06, 2010 @12:00PM (#33489522) Homepage Journal

    ...has me doing a "me too!" to everyone telling you to use IMAP + maildir; I use dovecot myself, complete with self-signed SSL cert (curse you firefox!).

    El_Muerte_TDS [slashdot.org] has just pointed me towards mairix [debian.org], a dedicated maildir + friends indexing system which I've just tried out, and seems to be ideal for my use - fast email search has always been a good thing for me, but I've rarely found a nice lightweight indexing solution that was catered only to mail; "desktop" search engines tend to take the opinion that if I want one thing indexed then I automatically want everything indexed, and also insist on running around the clock. Much nicer for my needs to just have one little lightweight indexing program that only runs when I want it to.

    Best thing about mairix IMHO is the way it creates a virtual maildir on the fly using symlinks, so not only is it easily viewable on the command line, it's also automatically compatible with all of those IMAP + maildir clients out there... which, last time I looked, was all of them. Useful hack for KMail users here [netmanagers.com.ar].

    Disclaimer: my IMAP server has all its databases on an SSD, so even full text searches from the client are pretty speedy (seriously - the lack of access times on small chunks of random data cuts down search times by at least an order of magnitude), but obviously mairix has the advantage of being able to scale to multiple users with >X GB mailboxes much easier than spending a fortune on fast storage.

  • Although it would involve keeping an index, you could add a strange keyword to the body of each email. For example, all emails from Donna in 2009 could be tagged with donna09. Running a search should then yield all emails from Donna in 2009. You could also add the month, january09donna for example. You can even ask people to include a tag in every email they send to you.

  • Domino (Score:5, Funny)

    by Belial6 ( 794905 ) on Monday September 06, 2010 @12:05PM (#33489554)
    Yes, it is not free, and yes, this suggestion will bring out the trolls, but you might want to consider Lotus Notes/Domino. It is ~$140 for the system, and ~$40 a year maintenance (Includes all upgrades) cost per user, but IBM isn't going anywhere any time soon.

    It has good full text indexing, you can keep your mail on a client, and on the server, with incredibly flexible replication rules for what is stored where.
    It supports IMAP, so it talks well to most clients.

    The iPhone syncs seamlessly with it via ActiveSync, and an Android client is in beta as we speak.

    It includes an http client, and the http client even offers offline access. That's right. You can use the http client, and still read your mail and write emails that will be sent the next time you make a connection.

    It also has folders, but you can put any email into as many folders as you want, so you have the best of both Outlook folders and Gmail tags.

    It supports auto-processing rules for automatic filing of data, as well as being a full development environment if you want to get really fancy.

    It is brain dead easy to set up and maintain.

    The server runs on Linux and Windows, and the client runs on Linux, Windows and Mac.
  • by socsoc ( 1116769 ) on Monday September 06, 2010 @12:13PM (#33489632)

    just because I can.

    That's a big assumption. You are asking slashdot, so I'm thinking you can't. Especially because imap never occurred to you.
