Why Anonymized Data Isn't
Ars Technica has a review of recent research, and a summary of the history, in the field of reidentification: identifying people from anonymized data. Paul Ohm's recent paper elaborates on what Ohm terms a central reality of data collection: "Data can either be useful or perfectly anonymous but never both." "...in 2000, [researcher Latanya Sweeney] showed that 87 percent of all Americans could be uniquely identified using only three bits of information: ZIP code, birthdate, and sex. ... For almost every person on earth, there is at least one fact about them stored in a computer database that an adversary could use to blackmail, discriminate against, harass, or steal the identity of him or her. I mean more than mere embarrassment or inconvenience; I mean legally cognizable harm. ... Reidentification science disrupts the privacy policy landscape by undermining the faith that we have placed in anonymization."
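To make the ZIP code, birthdate, and sex observation concrete, here is a minimal Python sketch of how one might measure what share of records in a dataset is pinned down by that combination alone. The toy records and the function name `unique_fraction` are hypothetical and invented for illustration; this is not Sweeney's method or data, only a simple count of how many rows have a combination that no other row shares.

```python
from collections import Counter

# Hypothetical toy records: (zip_code, birthdate, sex). Invented for illustration only.
records = [
    ("02138", "1945-07-31", "M"),
    ("02138", "1945-07-31", "M"),  # shares its combination with the row above
    ("02138", "1962-03-14", "F"),
    ("02139", "1962-03-14", "F"),
    ("02139", "1980-11-02", "M"),
]


def unique_fraction(rows):
    """Return the fraction of rows whose field combination appears exactly once."""
    if not rows:
        return 0.0
    counts = Counter(rows)
    unique = sum(1 for row in rows if counts[row] == 1)
    return unique / len(rows)


if __name__ == "__main__":
    print(f"{unique_fraction(records):.0%} of these toy records are unique")
```

A row counted here is one that an adversary holding a second, named dataset with the same three fields could match to a single person. Stripping names from the records does nothing to change that count.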