expectlabs writes: Twitter data can now reveal both raw numbers and insights into how we use language. The map above shows regional language variations based on how we tweet about our beloved soft drinks. Edwin Chen, a data scientist at Twitter, used the site’s geo-tagging feature to search for tweets in which users mentioned “coke,” “soda,” or “pop” while talking about their drinks. Chen applied NLP techniques to confirm that the tweets were in fact about soft drinks, and removed the tweets that referred to the Coke brand. According to Chen’s blog, he then grouped the tweets that fell within a 0.333 latitude/longitude radius of one another, calculated the term distribution for each group, and colored each group by the soft drink term that was furthest from the mean. Each point is sized according to the number of tweets in its group.
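The grouping and coloring steps described above can be sketched roughly as follows. This is an illustrative guess, not Chen's actual code: it assumes the tweets have already been filtered down to `(lat, lon, term)` tuples, it approximates the 0.333 radius with a simple square grid, and it reads "furthest from the mean" as the term whose local share most exceeds its overall share. The names `cell_of` and `summarize` are hypothetical.

```python
from collections import Counter, defaultdict
import math

TERMS = ("coke", "soda", "pop")
CELL = 0.333  # grid size in latitude/longitude units, taken from the summary


def cell_of(lat, lon, size=CELL):
    """Snap a coordinate to a square grid cell (assumed stand-in for the radius)."""
    return (math.floor(lat / size), math.floor(lon / size))


def summarize(tweets):
    """Map each grid cell to (dominant_term, tweet_count).

    tweets: iterable of (lat, lon, term) tuples with term in TERMS.
    """
    groups = defaultdict(Counter)
    overall = Counter()
    for lat, lon, term in tweets:
        groups[cell_of(lat, lon)][term] += 1
        overall[term] += 1

    total = sum(overall.values())
    if total == 0:
        return {}

    # Overall share of each term across every tweet (the "mean" distribution).
    mean = {t: overall[t] / total for t in TERMS}

    result = {}
    for cell, counts in groups.items():
        n = sum(counts.values())
        # Assumption: pick the term whose local share most exceeds the mean.
        term = max(TERMS, key=lambda t: counts[t] / n - mean[t])
        result[cell] = (term, n)  # n would drive the point size on the map
    return result
```

Calling `summarize` on a list of tuples returns one entry per occupied cell; the term would pick the point's color and the count its size. A true radius-based clustering would group nearby tweets differently near cell borders.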