A lot of our problems today are the result of people in power fundamentally misunderstanding what Big Data is good for.
We used to assume it was impractical for the Government to keep records of everything we do in the public sphere. Keeping such records has gone from possible to practical to inevitable, mostly due to Moore's Law.
Just because everything is recorded doesn't mean the records are useful, though. Technologists who should know better talk about searching these records to find the "needle in the haystack", selling the vision of complete records + powerful search tools = Total Awareness.
What they conveniently skip over is:
* All records have inaccuracies
* If the inaccuracy rate is higher than the occurrence rate of what you're searching for, the search is not useful
Consider medical screening tests. If you have a test with a false positive rate of 1 in 1000, it is useless to use such a test to search for a condition that happens to 1 in 1000000 - even if the test never misses a real case, roughly 999 out of every thousand positive results will be people who are fine.
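For anyone who wants to check the arithmetic, here is a rough sketch in Python. The function name and the assumption that the test never misses a real case (sensitivity of 1) are mine, added only for illustration; the two rates are the ones from the example above.

```python
from fractions import Fraction


def positive_predictive_value(prevalence, false_positive_rate, sensitivity=Fraction(1)):
    """Fraction of positive results that are real cases (Bayes' rule).

    sensitivity defaults to 1, i.e. the assumption that the test
    catches every real case. That is the most generous case for the test.
    """
    true_positives = prevalence * sensitivity
    false_positives = (1 - prevalence) * false_positive_rate
    return true_positives / (true_positives + false_positives)


# Numbers from the screening example: condition in 1 of 1,000,000 people,
# test wrongly flags 1 of every 1,000 healthy people.
ppv = positive_predictive_value(Fraction(1, 1_000_000), Fraction(1, 1000))

print(f"Share of positives that are real: {float(ppv):.6f}")
print(f"Share of positives that are false alarms: {float(1 - ppv):.6f}")
```

With these inputs only about 0.1% of the people flagged actually have the condition. A less-than-perfect sensitivity only makes that worse.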
Now, consider:
* The error rate of address OCR
versus
* The rate of secrets being exchanged via US Mail
Anyone in the Government who can't produce an estimate of those two numbers shouldn't be allowed anywhere near those records - it would be like giving a child a loaded gun, or a politician a Twitter account.