Re:Canary trap (Score 5, Informative)
Intelligence agencies have been doing this sort of thing for decades: they give slightly different versions of a sensitive document, with subtle changes in the wording, to suspected spies or to places where possible spies might have access to it, and then see which version gets leaked or appears elsewhere. Tom Clancy coined the term "canary trap" for the technique in his novel Patriot Games, published in 1987, but its real-world use for exposing information leaks most likely predates the novel.
But the classic canary trap requires someone to modify the document manually, which is hard to do on a large scale. Here it is being done automatically by an algorithm.
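To make "done automatically" a bit more concrete, here is a minimal Python sketch of one way an algorithm could produce per-recipient variants by swapping synonyms. This is only my illustration, not the method from the Amazon patent or from any published paper; the synonym pairs, the template sentence, and the function names make_copy and trace_leak are all made up for the example.

```python
# Hypothetical synonym pairs; each pair carries one bit of the recipient ID.
SYNONYM_PAIRS = [
    ("begin", "start"),
    ("large", "big"),
    ("quickly", "rapidly"),
    ("shows", "reveals"),
]

# Hypothetical document with one numbered slot per synonym pair.
TEMPLATE = "We will {0} the audit of the {1} accounts and {2} publish what it {3}."


def make_copy(template: str, recipient_id: int) -> str:
    """Fill slot i with the word chosen by bit i of recipient_id."""
    words = [pair[(recipient_id >> i) & 1] for i, pair in enumerate(SYNONYM_PAIRS)]
    return template.format(*words)


def trace_leak(leaked: str, template: str, n_recipients: int):
    """Return the recipient ID whose copy matches the leaked text, or None."""
    for recipient_id in range(n_recipients):
        if make_copy(template, recipient_id) == leaked:
            return recipient_id
    return None


if __name__ == "__main__":
    leaked = make_copy(TEMPLATE, 5)
    print(leaked)
    print(trace_leak(leaked, TEMPLATE, 16))
```

With four pairs this distinguishes at most 16 recipients, and it only matches a leak that is copied verbatim; anything that survives paraphrasing would need a much more robust scheme.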
However, I am aware of published methods for this problem dating back to 2001 by Mikhail Atallah at Purdue. In fact, Atallah received a patent for follow-up work in 2007, a year after the Amazon patent was filed.
Here are a few hundred papers on the subject, via Google Scholar. Some adjust whitespace, some modify images of the text, and some attempt fairly sophisticated syntactic analysis and restructuring of selected sentences.
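As a rough idea of what the whitespace-adjusting variety might look like, here is a toy Python sketch that hides a recipient ID in the spacing between words (one space for a 0 bit, two spaces for a 1 bit). It is my own simplification and not taken from any of those papers; the names embed and extract and the 8-bit default are assumptions, and it assumes the input text uses single spaces only.

```python
import re


def embed(text: str, recipient_id: int, bits: int = 8) -> str:
    """Encode recipient_id in the first `bits` word gaps (1 space = 0, 2 spaces = 1)."""
    words = text.split(" ")
    if len(words) - 1 < bits:
        raise ValueError("text has too few word gaps to hold the ID")
    if not 0 <= recipient_id < 2 ** bits:
        raise ValueError("recipient_id does not fit in the given number of bits")
    out = [words[0]]
    for i, word in enumerate(words[1:]):
        # Gap i carries bit i of the ID; gaps past `bits` stay single-spaced.
        gap = "  " if i < bits and (recipient_id >> i) & 1 else " "
        out.append(gap + word)
    return "".join(out)


def extract(marked: str, bits: int = 8) -> int:
    """Recover the ID by measuring the first `bits` runs of spaces."""
    gaps = re.findall(r" +", marked)
    recipient_id = 0
    for i, gap in enumerate(gaps[:bits]):
        if len(gap) == 2:
            recipient_id |= 1 << i
    return recipient_id
```

The obvious weakness is that anything which normalizes whitespace (retyping, many web forms, HTML rendering) wipes the mark, which is presumably part of why the more sophisticated syntactic approaches exist.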
I apologize that I haven't read the Amazon patent, or read the prior literature carefully, or gone to law school, so I can't comment on whether the patent seems valid or not.