Comment Re:You vs everyone (Score 1) 265
But GMail does, to my understanding, use a personalized filter in addition to the global filters. I get some legitimate email in a foreign language (not Chinese, but one with a non-Latin alphabet), and some spam in that language as well. GMail gets them 100% right. The alphabet is just another feature that you perform Bayesian analysis on.
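To make the "alphabet is just another feature" point concrete, here is a rough sketch of a per-user naive Bayes filter that adds the dominant script of a message as one more token. This is not how GMail actually does it (I have no idea about their internals); the names `features` and `NaiveBayes` are made up for illustration, and "script" is approximated by the first word of each letter's Unicode name.

```python
import math
import unicodedata
from collections import Counter, defaultdict


def features(text):
    """Lowercased words plus one pseudo-token naming the dominant script."""
    # Approximation: the first word of a letter's Unicode name
    # ("LATIN", "CYRILLIC", "GREEK", ...) stands in for its script.
    scripts = Counter(
        unicodedata.name(ch, "UNKNOWN").split()[0]
        for ch in text
        if ch.isalpha()
    )
    tokens = text.lower().split()
    if scripts:
        tokens.append("script:" + scripts.most_common(1)[0][0])
    return tokens


class NaiveBayes:
    """Hypothetical per-user filter; labels are arbitrary strings."""

    def __init__(self):
        self.token_counts = defaultdict(Counter)  # label -> token -> count
        self.doc_counts = Counter()               # label -> messages seen

    def train(self, text, label):
        self.doc_counts[label] += 1
        self.token_counts[label].update(features(text))

    def classify(self, text):
        vocab = set()
        for counts in self.token_counts.values():
            vocab.update(counts)
        total_docs = sum(self.doc_counts.values())
        scores = {}
        for label, docs in self.doc_counts.items():
            score = math.log(docs / total_docs)  # log prior
            total = sum(self.token_counts[label].values())
            for tok in features(text):
                # Add-one smoothing; the extra +1 leaves room for unseen tokens.
                count = self.token_counts[label][tok]
                score += math.log((count + 1) / (total + len(vocab) + 1))
            scores[label] = score
        return max(scores, key=scores.get)


nb = NaiveBayes()
nb.train("привет как дела", "ham")
nb.train("встреча завтра утром", "ham")
nb.train("купи дешевые часы", "spam")
nb.train("buy cheap watches now", "spam")
print(nb.classify("как дела завтра"))   # same script as the spam, still ham
print(nb.classify("купи часы сейчас"))
```

The script token gets no special treatment: it is counted and smoothed like any word, so a user who receives both ham and spam in the same alphabet ends up relying on the other tokens to tell them apart.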
What any big message-processing service has that a single user doesn't is access to the content of messages across users, and to the collective actions of its users. So, for example, if a new spam campaign starts up, once the 10th (or so) user has clicked "this is spam", the rest of the recipients' copies of that same message get automatically reclassified. I used to be responsible for fighting spam at a mid-sized social networking site (that no longer exists, unfortunately), and believe me, simply looking for multiple copies of a given message is a strong tool for fighting spam. The back-end service operators get access to that; the users don't.
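A minimal sketch of that collective reclassification, assuming "the same message" means an identical body after whitespace and case normalization. `CollectiveFilter`, `fingerprint`, and the threshold constant are hypothetical names for illustration; a real service would presumably need fuzzier matching, since spammers vary their text.

```python
import hashlib
import re
from collections import defaultdict

REPORT_THRESHOLD = 10  # the "10th (or so)" report; an assumed, tunable value


def fingerprint(body):
    """Hash of the body with whitespace collapsed and case folded."""
    normalized = re.sub(r"\s+", " ", body).strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


class CollectiveFilter:
    """Hypothetical back-end view: sees every user's copy of a message."""

    def __init__(self, threshold=REPORT_THRESHOLD):
        self.threshold = threshold
        self.recipients = defaultdict(set)  # fingerprint -> users who got it
        self.reporters = defaultdict(set)   # fingerprint -> users who reported it

    def is_spam(self, body):
        reported_by = self.reporters.get(fingerprint(body), set())
        return len(reported_by) >= self.threshold

    def deliver(self, user, body):
        """Record a delivery; return True if this message is already known spam."""
        self.recipients[fingerprint(body)].add(user)
        return self.is_spam(body)

    def report_spam(self, user, body):
        """Record a report; return the users whose copies should be reclassified."""
        fp = fingerprint(body)
        self.reporters[fp].add(user)
        if len(self.reporters[fp]) >= self.threshold:
            # Everyone who received it but did not report it themselves.
            return self.recipients[fp] - self.reporters[fp]
        return set()
```

For example, with `threshold=3` and five recipients of one message, the first two reports return an empty set and the third returns the two users who have not reported it. A single user's mail client cannot do this, because it only ever sees one copy.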