Comment Re:Its all in the gmail terms of use ... (Score 5, Informative) 790
That means only the most incompetent pedos aren't already randomly tweaking their jpgs - the smart ones are doing it in the EXIF section so it won't even change the picture.
The smart implementations probably hash the image payload excluding EXIF, for exactly that reason - maybe downsampling and reducing the colorspace first too, so that trivial tweaks no longer change the hash.
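To make that concrete, here is a rough sketch of both ideas in Python. It is only an illustration under stated assumptions, not the implementation I'm working with and not a claim about what Google does. The function names hash_jpeg_without_metadata and coarse_fingerprint are made up for this example. The first walks the JPEG marker segments and leaves the APPn segments (EXIF lives in APP1) and comment segments out of the hash. The second assumes the image has already been decoded elsewhere into rows of 0-255 grayscale values, averages it down to a small grid and quantizes each cell to a few levels before hashing.

```python
import hashlib
import struct


def hash_jpeg_without_metadata(data: bytes) -> str:
    """SHA-256 of a JPEG byte stream, skipping APPn and COM segments.

    Sketch only: assumes a well-formed file with no fill bytes
    between segments before the Start of Scan marker.
    """
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG: missing SOI marker")
    digest = hashlib.sha256()
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError(f"expected a marker at offset {pos}")
        marker = data[pos + 1]
        if marker == 0xDA:
            # Start of Scan: the rest of the file is compressed image data.
            digest.update(data[pos:])
            return digest.hexdigest()
        # The 2-byte big-endian length counts itself but not the marker.
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        if length < 2:
            raise ValueError(f"bad segment length at offset {pos}")
        end = pos + 2 + length
        is_metadata = 0xE0 <= marker <= 0xEF or marker == 0xFE  # APPn, COM
        if not is_metadata:
            digest.update(data[pos:end])
        pos = end
    raise ValueError("no Start of Scan marker found")


def coarse_fingerprint(pixels, grid=8, levels=4):
    """Hash a downsampled, colour-reduced version of a grayscale image.

    `pixels` is a list of equal-length rows of ints in 0..255
    (decoding the image into this form is out of scope here).
    """
    height, width = len(pixels), len(pixels[0])
    if height < grid or width < grid:
        raise ValueError("image is smaller than the grid")
    cells = bytearray()
    for gy in range(grid):
        y0, y1 = gy * height // grid, (gy + 1) * height // grid
        for gx in range(grid):
            x0, x1 = gx * width // grid, (gx + 1) * width // grid
            block = [pixels[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            mean = sum(block) // len(block)
            cells.append(mean * levels // 256)  # reduce to `levels` shades
    return hashlib.sha256(bytes(cells)).hexdigest()
```

The first function only defeats metadata edits, and anything appended after the end-of-image marker would still change its output. The second tolerates small pixel changes, but a cell whose average sits right on a quantization boundary can still flip, so a real system would presumably compare fingerprints by distance rather than exact match.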
(In fact, the implementation I'm working with right now for exactly this purpose just hashes HTTP payloads for the moment, although refining this is on the drawing board for later. It's part of a small research project I have underway with the police in Scotland as part of their Offender Management work.)
I do find this very disturbing in principle though. Is absolutely everything in your mailbox entirely innocent? I have, for example, a list of various Microsoft product keys in mine. As it happens, those are legitimate: all were issued to me by Microsoft via an MSDN subscription, and I stuck them in a spreadsheet to keep track of which key was in use for what. But would Google or the police know that just from looking at the list? They might turn up with a warrant looking for the piracy ring I'm obviously running, just because Google got nosy and went vigilante!
This isn't the first time, though; I recall a malware researcher getting rather upset after Google started eating samples from his inbox - even when they were inside password-protected ZIP files. I can see that they mean well, but to me that crosses a line.