Waiting to fail... (Score 1)
Personally, I've always thought of these plagiarism detection systems as ticking time bombs. The more data they acquire, the less unique each new work entered into the system looks. Eventually the false-positive rate on original submissions will approach 100%, because they will fail for being worded too similarly to works already stored in the database.
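To put rough numbers on that worry, here's a back-of-the-envelope sketch in Python. It assumes (and this is purely my simplification, not how any real detector is documented to behave) that every stored work has the same small, independent chance of tripping the similarity threshold against an original submission. The function name and the per-work probability are hypothetical.

```python
def chance_of_false_flag(p_single, stored_works):
    """Chance that an original work matches at least one stored work.

    Assumption: each stored work independently triggers a false match
    with probability p_single. Real texts are not independent, so treat
    this as an illustration of the trend only.
    """
    return 1.0 - (1.0 - p_single) ** stored_works


# Hypothetical one-in-a-million chance per stored work.
for stored in (1_000, 100_000, 10_000_000):
    print(stored, chance_of_false_flag(1e-6, stored))
```

Under those assumptions the chance of a false flag climbs toward 1 as the database grows, even though the per-work chance never changes.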
For example:
"With a program called Pl@giarism, Vickers detected 200 strings of three or more words in 'Edward III' that matched phrases in Shakespeare's other works. Usually, works by two different authors will only have about 20 matching strings."
Okay... so, is the system keeping track of the time periods in which these works were written? There's a good chance those numbers vary greatly with how literate the authors are and how much formal education they have. A small number of matched strings between two authors is likely when each is practiced enough at writing to vary their wording, for instance by reaching for synonyms.
But what about authors who aren't as educated and who fall back on the speech and writing patterns common among their peers? You could see significantly higher matched-string counts between them.
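For anyone wondering what a "matched string" count even looks like, here's a minimal sketch in Python that counts the distinct three-word sequences two texts share. I have no idea how Pl@giarism actually does it internally, so the tokenizing, the function names, and the sample sentences are all my own assumptions.

```python
import re


def word_ngrams(text, n=3):
    """Return the set of distinct n-word sequences in text.

    Assumption: words are runs of letters or apostrophes, compared
    case-insensitively, with all other punctuation ignored.
    """
    words = re.findall(r"[a-z']+", text.lower())
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def shared_ngrams(text_a, text_b, n=3):
    """Return the n-word sequences that appear in both texts."""
    return word_ngrams(text_a, n) & word_ngrams(text_b, n)


# Made-up sample sentences, just to show the counting.
a = "To be or not to be, that is the question."
b = "Whether to be or not, that is the real question here."
for match in sorted(shared_ngrams(a, b)):
    print(" ".join(match))
```

Two writers leaning on the same stock phrases will rack up matches under a count like this without either one copying the other, which is exactly the problem.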
It gets even worse when you bring the internet-savvy into the equation, people whose contact with the outside world happens mostly online. People with similar interests who spend hours talking with each other in public chat channels are likely to pick up huge similarities in their writing patterns, much like how close-knit communities come to speak with similar accents and phrases over time. Our social networks directly influence how we communicate with one another.
Considering that this is now a global phenomenon, it is inevitable that our individual written works will become so normalized that automated means will find it almost impossible to tell who wrote what with any real certainty, especially in the generations to come!