
Google Books vs. real corpora (Score 4, Informative)

http://corpus.byu.edu/coha
Corpus of Historical American English.

-- 400 million words, 1810s-2000s.
-- Allows for many types of searches that Google Books can't:
* accurate frequency of words and phrases by decade and year
* changes in word forms (via wildcard searches)
* grammatical changes (because the corpus is "tagged" for part of speech)
* changes in meaning (via collocates, i.e. "nearby words")
* show all words that are more common in one set of decades than others
* integrate synonyms and customized word lists into queries
* etc etc etc
-- Funded by the National Endowment for the Humanities (NEH), 2009-2011.
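
To make two of the search types above concrete (frequency by decade, and collocates as a rough signal of meaning change), here is a minimal Python sketch. It is not how COHA is implemented and does not use any COHA interface. The four-sentence corpus, the function names, and the whitespace tokenizer are all made up for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical toy corpus of (year, text) pairs. Not real COHA data.
TOY_CORPUS = [
    (1855, "the gay crowd sang a merry song"),
    (1858, "a gay and merry party"),
    (1995, "the gay rights movement grew"),
    (1998, "gay rights and civil rights"),
]


def decade(year):
    """Map a year to the first year of its decade, e.g. 1855 -> 1850."""
    return year // 10 * 10


def per_million_by_decade(corpus, word):
    """Occurrences of `word` per million tokens, grouped by decade."""
    hits = Counter()
    totals = Counter()
    for year, text in corpus:
        tokens = text.lower().split()  # naive whitespace tokenizer (assumption)
        d = decade(year)
        totals[d] += len(tokens)
        hits[d] += tokens.count(word)
    # Normalizing by decade size keeps decades with more text from dominating.
    return {d: hits[d] * 1_000_000 / totals[d] for d in sorted(totals)}


def collocates_by_decade(corpus, word, window=2):
    """Count the words found within `window` tokens of `word`, per decade."""
    result = defaultdict(Counter)
    for year, text in corpus:
        tokens = text.lower().split()
        for i, tok in enumerate(tokens):
            if tok != word:
                continue
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    result[decade(year)][tokens[j]] += 1
    return dict(result)


if __name__ == "__main__":
    print(per_million_by_decade(TOY_CORPUS, "gay"))
    for d, counts in sorted(collocates_by_decade(TOY_CORPUS, "gay").items()):
        print(d, counts.most_common(3))
```

The point of the per-million normalization is that raw hit counts are not comparable across decades of different sizes, and comparing the collocate lists for one word across decades is one simple way a shift in meaning shows up.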

Take a look at the "Compare to Google/Archives" link off the first page.
