Can Poisoning Peer to Peer Networks Work?
andrewchen writes "Can poisoning peer to peer networks really work? Business 2.0 picked up my research paper from Slashdot and wrote an article about it. In my paper, I argue that P2P networks may have an inherent "tipping point" that can be triggered without stopping 100% of the nodes on the network, using a model borrowed from biological systems. For those who think they have a technical solution to the problem, I outlined a few problems with the obvious solutions (moderation, etc.)."
The easiest solution to fix poisoning... (Score:3, Insightful)
This would be moderation; however, it would be the smartest way, as each user would have their say in who is and is not allowed on the network.
Re:The easiest solution to fix poisoning... (Score:2, Insightful)
Re:The easiest solution to fix poisoning... (Score:3, Insightful)
If I see a list of servers, and a rating, I'm instinctively going to select one of the top rated servers. Most people's ratings of such servers would be a function of two distinct factors:
- Does the server have what I'm looking for?
- How quickly can I get this file from this server?
If both factors are very favorable to me, I'm going to give this server a good rating. If I can't connect, or the server doesn't have what I'm looking for, I'm going to give the server a poor rating.
If a server wants to become highly rated in this type of a system, the operators must provide
- Lots of bandwidth
- Lots of files
Not many people can afford to do both. As a result, a 'cartel' of sorts would be formed, where the top few servers serve a majority of the users, and the rest of the servers, of which there may be twenty times as many or more, all serve the minority.
If the 'hunter' wants to kill this group, what does he do? He wouldn't want to poison each one systematically -- he'd want to go after the big targets that everyone feeds from. This rating system would only help him expedite this process.
Re:The easiest solution to fix poisoning... (Score:3, Insightful)
So to be useful, votes would require authentication in order to avoid ballot box stuffing. But authentication goes hand in glove with identification, and that's something the users of the P2P networks seem to be trying to avoid.
Bottom line: voting is subject to the same poisoning that the files are subject to. It adds a layer of complexity that simply delays poisoning, but probably not for long. Hell, with the inevitable bugs (that end up denying users unpoisoned files) and long-term ineffectiveness, voting would probably be smiled upon by the RIAA.
Re:The easiest solution to fix poisoning... (Score:2, Insightful)
Re:The easiest solution to fix poisoning... (Score:2)
If I were the RIAA, I'd tell my employees to stop acting like a bunch of two-bit hackers and start giving the customers what they want.
Really, this whole thing -- from poisoning P2P networks to authorizing legal hacks on 14-year-old users -- is absurd.
Hilary and Jack "Maddog
It's a long process, but I'll tell you one thing: the more the RIAA and MPAA keep employing the shock-trooper tactics, the less goodwill and grace (if such goodwill and grace ever existed, but I think it did -- at least in part) they're gonna get from Joe and Joe-elle Consumer.
Re:The easiest solution to fix poisoning... (Score:2)
The RIAA and MPAA want money. Lots of money. The kind of money they're used to. The P2P sharers want music. Lots of music. For free, just like they're used to.
Everybody keeps ranting "why don't they find a business model that works?" Here's your answer: There isn't one; there won't be one; there can never be one. First, it's an argument of corporations vs. the marketplace. Can you speak for every P2P user? Can anyone even claim to? Of course not, no one can. So it's already a one-sided discussion. The industries have no incentive to "talk" to the marketplace, since their only feedback comes in the form of "no revenue, no sales" in any case.
Jack and Hillary aren't stupid -- they've already figured that much out, so I think they've come up with a simple plan. They've decided to squeeze every last nickel from every last legitimate consumer until the whole production system implodes from lack of revenue. Their business plan is to get to be so rich now that they won't care when it implodes.
Under this plan, Jack and Hillary have no need to talk to anybody except to placate their respective industries. "Studios, crank out those movies. Recording companies, press those discs. We're taking good care of the whole Internet for you. We promise we'll have this piracy thing licked about the same time we reach $1,000,000,000 net worth (each.) So keep your stock prices up, please."
There is a business model that works (Score:3, Interesting)
Re:The easiest solution to fix poisoning... (Score:2, Interesting)
A blocking system can't work fast enough.
Re:The easiest solution to fix poisoning... (Score:2)
A voting system can be abused by creating a large group of malicious users giving each other positive feedback. Andrew already mentioned this on his webpage. Routing on a P2P network may not be direct, so you may not be able to give a site bad feedback anyhow.
Re:The easiest solution to fix poisoning... (Score:2)
What I want to have in the future of P2P is system level protocols which require no user interaction.
Re:The easiest solution to fix poisoning... (Score:2, Funny)
I was reminded of one of the AI Koans [everything2.com]
Re:The easiest solution to fix poisoning... (Score:2)
Re:The easiest solution to fix poisoning... (Score:2, Funny)
Blatantly wrong posts often make it up to +5 Informative, while an accurate reply to them will only get a +2 Insightful.
I've thought about this a little and was wondering: what would happen if Slashdot started selling higher-rated posts? Say for $5.00 I could buy 20 posts. I would tend to use them more judiciously but would have my posts start out at +2. Just a thought.
Easy to fix, really. (Score:2)
Just give moderators private keys, and distribute the public keys. Bingo! Authenticated moderation...
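A minimal sketch of what that could look like, in Python with the third-party "cryptography" package (the vote format and key names here are invented, not part of any real client):

    # A moderator signs a rating with their private key; any peer holding the
    # corresponding public key can verify that the vote is authentic.
    from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey
    from cryptography.exceptions import InvalidSignature

    moderator_key = Ed25519PrivateKey.generate()   # kept secret by the moderator
    moderator_pub = moderator_key.public_key()     # distributed with the client

    vote = b"file:2f9a...;rating:+1"               # hypothetical vote format
    signature = moderator_key.sign(vote)

    try:
        moderator_pub.verify(signature, vote)
        print("vote accepted")
    except InvalidSignature:
        print("vote rejected: not signed by a known moderator")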
Re:The easiest solution to fix poisoning... (Score:3, Insightful)
This statement is obviously false. Nobody will move to the labels' own online sites, because the label sites don't provide what they are looking for: lots of music in vanilla MP3 files with no sharing restrictions, license headaches, or some kind of goofy-assed "copy once" encryption scheme.
Users who become frustrated with crapflooders on their favorite P2P network will simply move on to whatever the next emerging P2P network is, and those who want to use poisoning tactics will play a losing game of whack-a-mole indefinitely.
One big problem: Lazy users (Score:3, Insightful)
it's already poisonned by users (Score:3, Insightful)
Really annoying, especially with large files you've downloaded at 1 kbps.
Re:it's already poisonned by users (Score:2)
In addition, anyone using ATTBI should be forewarned that you should remove ANY and ALL movies from your shared folder on any P2P network. The MPAA is reporting violations to ATTBI's legal demands center and ATTBI *is* disabling users who violate rules.
I suggest the removal of all shared movies if you are on ANY ISP, but especially large cable modem networks.
Re:it's already poisonned by users (Score:2)
You've never run a BBS right? :^) The number of junk files uploaded even when they didn't need it for download ratios was amazing. Or uploading renamed copies of software already uploaded (with fscking BBS ads inserted into the zip to make size checks impossible.)
Re:it's already poisonned by users (Score:2)
"available files which are not usable."
Are you talking about the people who post their playlists on a website, which is what you find when searching for a song title, but has the files themselves elsewhere?
Re:it's already poisonned by users (Score:2)
No biggie, I pop up a web server on the port. KaZaA is close enough to HTTP to confuse them into going away as well as logging all their user/download info. :^)
Re:it's already poisonned by users (Score:2)
Obvious technical solution take 2 (Score:3, Interesting)
So, um... how about this... If it's a standard file, such as, say, the Deviance rip of Neverwinter Nights, or the new MPEG of The Two Towers, then it should always have the same checksum.
Somebody somewhere needs to maintain a website with these checksums on it. Then there's no dependence on the person you're pulling the file from.
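As a rough illustration, a client could hash what it downloaded and compare it against the value the hypothetical checksum website publishes (Python sketch; the file name and published value are placeholders):

    import hashlib

    def file_sha256(path, chunk_size=1 << 20):
        # Hash the file in chunks so large downloads need not fit in memory.
        h = hashlib.sha256()
        with open(path, "rb") as f:
            for chunk in iter(lambda: f.read(chunk_size), b""):
                h.update(chunk)
        return h.hexdigest()

    published = "..."   # value copied from the checksum website (placeholder)
    if file_sha256("downloaded_release.iso") == published:
        print("checksum matches the published value")
    else:
        print("mismatch: likely a decoy or corrupted copy")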
Obviously doesn't work for random porn videos (although it would for more popular ones... which might also tell you whether they're any good).
And there's nothing illegal about it.
Problems?
Re:Obvious technical solution take 2 (Score:2)
Re:Obvious technical solution take 2 (Score:2)
We can GPG sign each megabyte of the files to be downloaded. If the P2P clients downloading from the infected server raise enough red flags, the server can be voted off the island.
Re:Obvious technical solution take 2 (Score:2, Funny)
GPG signed by whom?
Well, who do you think?
Attention, the triple X movie you're about to download has not passed Microsoft Digital Signing and could endanger your personal stability...
Re:Obvious technical solution take 2 (Score:2, Informative)
The files are all downloaded in segments from multiple sources, and you sometimes get bad segments, but they are only a fraction of the total file size so you don't really care.
You just plain can't poison eDonkey / Overnet - it won't work. It is also the only network that I would be tempted to use to distribute real content since it is guaranteed that the user will get what you want them to.
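A toy sketch of the per-segment idea (this is not the real eDonkey/Overnet protocol or its chunk size, just the general principle that only mismatched segments need to be re-fetched):

    import hashlib

    SEGMENT = 512 * 1024   # invented segment size, for illustration only

    def bad_segments(data, expected_hashes):
        # Return indices of segments that fail verification, so only those
        # need to be re-downloaded from a different source.
        bad = []
        for i, expected in enumerate(expected_hashes):
            chunk = data[i * SEGMENT:(i + 1) * SEGMENT]
            if hashlib.sha1(chunk).hexdigest() != expected:
                bad.append(i)
        return bad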
Doesn't Sharezilla do this too? (Score:2)
Re:Obvious technical solution take 2 (Score:2)
Re:Obvious technical solution take 2 (Score:2)
Then, the different clients can interface to the content p2p network, so that users that are considering downloading a file can have a better guess at the authenticity and quality of the file - given that they build in support for passing hashes along with the general search results.
I would actually like to see a system where the content database is so well maintained that all systems can use it as a central QA tool, enhancing the file sharing experience.
And folks - just because the technology can be abused, does not mean that it is inherently evil. I just would like to have the quality raised in some way.
The downside is that it will probably become a way to censor information to some extent. We just need to minimize the risks, and maximize the benefits.
Re:Don't talk lame trash... (Score:2)
However, in any case, it is way easier to spread checksums by various means - internet boards, email lists, usenet, IRC - than to spread the actual file. If the situation arises, and the P2P net is "poisoned" with invalid files (and invalid checksums), I'm sure it won't be hard to acquire the valid checksums and download the correct files. Of course, "poisoned" clients sending out fake files with wrong checksums will still be a problem. Why would there be different rips? Typically, each movie is released only once (by groups specialising in it); all other releases are "dupes" and are not to be distributed. The same is true for virtually any sub-category of the scene, such as game ISOs/RIPs, utils and audio.
Re:You have to be kidding... (Score:2)
ShareReactor [sharereactor.com]
FileNexus [filenexus.com]
Asia Movies [asia-movies.tk]
Jigle [jigle.com]
Various sites specialised in files of certain languages (French, German), such as Spieleplanet [spieleplanet.org]
etc etc etc - just search for eDonkey links.
There are also IRC channels and uncounted web boards (similar to Asia Movies) dedicated to sharing ED checksums.
No, I am saying, in fact, I said, that it is completely irrelevant. We're not talking about sharing files as in Napster or Audiogalaxy (where you seem to draw your experience from). There's only ONE valid version of each single/album (single MP3s aren't usually spread): the first high-quality, complete release by a scene group. All later releases are dupes, and not distributed. You get the checksum for that release, and you're set.
Re:Obvious technical solution take 2 (Score:2)
Re:Obvious technical solution take 2 (Score:2)
I'm not saying this is perfect, but it would help.
Incidentally, in answer to another point raised somewhere round here, it's true that the p2p system is the one providing you with the checksum, but there are still two Good Things:
- After you download, you don't have to run it blindly... you can do the checksum test yourself
- If it was built into the p2p system then it would be indisputable... unless the server was lying, in which case you know not to trust that server.
Checksumming can work (Score:3, Informative)
Create a website with logins for the users. Users of this web site can create lists of checksums for the files they have created or have downloaded and verified as valid.
Other users can check any given user's list, and perhaps even post comments about the user's list, a form of moderation, if you will.
The validity of any single file on any random user's list would certainly be questionable, but some lists would become "trusted" by the community through trial and error. Others would be recognized as bogus and ignored.
Just a thought. Give me more than a few minutes and I might be able to come up with a better one.
Re:Checksumming can work (Score:2)
Re:Checksumming can work (Score:2)
Yeah, and it's run e-bay into the ground. Oh wait, no it hasn't.
Re:Checksumming can work (Score:2)
Re:Checksumming can work (Score:2)
As for recovering lost bandwidth, you can't, but you can use checksumming along with moderation to improve reliability.
Re:Checksumming can work (Score:2)
Using cryptographic hashes makes it computationally infeasible to generate a bogus datastream that produces the same hash. (CRC32 and the like are not as suited, since collisions are easy to construct.)
Always a way (Score:5, Insightful)
Granted, poisoning it can start to drive away the gimmie-gimmie crowd or the newbies... but the hardcore and the old-timers will stay and simply find a way around it. Hell, a group of about 100 of us now have our own private OpenNap network going and we have only high-quality, known-good files. Any clients connecting that aren't sharing, or are sharing crap, are instantly banned/blackballed... so we do the moderation thing... with a side requirement that you must be asked to join and prove your worthiness to us. Maybe that is the direction P2P will go... back to the roots of IRC, where you had to prove your worthiness, ratios were enforced, and real people made decisions to keep out the troublemakers... (RIAA). Granted, you don't get 30 bajillion users that way, but then you don't have to spend a night and 10 gigs trying to find that song or file you want.
Re:Always a way (Score:3, Insightful)
You hit upon a good theme here. To counteract the problems - the signal-to-noise ratio, poisoning, etc. - users will have to PUT MORE EFFORT into downloading warz and MP3s. The P2P networks will thrive, but you will not have as much of the global swap fests and free warz that you can get now. The most the people poisoning the P2P world can hope for is to increase the level of effort required to use P2P effectively. And along the way they will create some stronger social ties between the users. Ultimately they will end up strengthening the whole P2P movement...
Re:Always a way (Score:2, Insightful)
To be fair, though, that's pretty much the point, isn't it?
I agree and always have, but.... (Score:3, Insightful)
In fact, I would further argue, against the conventional wisdom on Slashdot, that the RIAA has basically won the war against P2P and other forms of mass piracy - at least once they shut down networks such as FastTrack and let it be known that there will be no financial return for those that fund the development of piracy networks. Certainly the average schmoe can download that super-popular song via Gnutella with some effort, but getting much more than that - say, the entire album at decent quality from the same artist - is like trying to extract blood from a rock. That is not to say that they will retire their guns, but rather that it will just be an ongoing series of small battles, more like maintenance, to hammer down any network, system, or device that pops up and starts to hemorrhage their intellectual property.
Re:I agree and always have, but.... (Score:2)
I (sadly) only started using Napster about a year before it got shut down, but I never found it a particularly good source for downloading an entire album, especially one in the same bitrate and overall quality. I thought that was nearly impossible.
I'd say overall that only about 75% of the stuff was worth keeping (eg, 128kbps+, no skips/cutoffs/distortion) and I searched for mostly mainstream stuff (rock n roll). I got a fair amount of cutoff tunes, tunes with skips in the middle or just bad overall audio quality.
I'd agree, though, that the RIAA has effectively killed off P2P, except for people that make a serious effort at maintaining their own networks or at putting real resources towards mining gnutella-type networks.
Re:I agree and always have, but.... (Score:2)
Re:Always a way (Score:2)
And that's what the *AA want. As long as the networks split and isolate, they can monitor them and pick them off as they become big enough. Also, since being a member of a closed "pirating ring" is as good as an admission of conspiracy, they can start to use RICO laws, too. Yummy...
In reality, the only safety in P2P for illegal sharing was its ubiquity. Once that's gone, you become an easy target. It's a lot easier to control five people than a mob of thousands.
IP address block banning (Score:2)
Re:IP address block banning (Score:3, Funny)
Or you could not live in the US and have no problem
Re:IP address block banning (Score:2)
That would be nice to see, RIAA sat on by AOL.. cos ultimately that would be a breach of AOL's terms of usage.
Some comments on the conclusions... (Score:3, Insightful)
In particular, our analysis of the model leads to four potential strategies, which can be used in conjunction:
1. Randomly selecting and litigating against users engaging in piracy
This seems to be the option which involves the least technological action. However, purely random selection wouldn't work, if only because the P2P users don't all live in the same country, and hence different laws apply. So some sort of not-so-random selection process has to be implemented.
2. Creating fake users that carry (incorrectly named or damaged files)
Modern P2P programs support downloading files from multiple sources. If someone downloads such a fake file and discovers it, the file will almost always be deleted. So these files will not propagate through the network, or at least not as fast and as far as the correct files. A search where one file can be downloaded from many sources is therefore preferable to one where only a few nodes serve the same file.
3. Broadcasting fake queries in order to degrade network performance
Now this is an interesting thing. The makers of the P2P programs being targeted by fake queries could ban such users, or could build in a feature where the user of a P2P program can ban a host themselves, so that it will be excluded from further searches.
4. Selectively targeting litigation against the small percentage of users that carry the majority of the files
Some users carry gigs and gigs of files, but that doesn't mean they're very popular. If I set up a server where I host my 20-CD collection of Mozart works, I probably won't get as much traffic as when I publish the Billboard 100. It's not the quantity, but the content of the files served that counts. Search for Britney and you'll receive thousands of hits. Search for Planisphere and a lot fewer results will show up.
Nevertheless it's a good paper.
I disagree (Score:2)
Again, I disagree. It has been my experience that many users do not delete damaged files; they simply leave them. The so-called swarmed downloads only further expose the downloads to corruption, since all it really takes is one corrupt segment to either crash the program or at least play really unbearable sound (or whatever the media is). To further compound the problem, the industry could use its cash and its legitimacy to run the most available and desirable servers (so that your swarmed downloads are almost certain to select its servers).
This is impossible in any current decentralized P2P scheme, don't you get it? How is any routing servent to know that the other servent it is connected to is not passing legitimate requests from the hosts it is purporting to represent? It can't. It might attempt to throttle the traffic from any given node, but then that would necessarily mean throttling the ENTIRE network, which would be self-defeating.
While it is almost certainly true that only 1% of the content accounts for 99% of the traffic, it is also true that only 10% of the hosts account for almost all of the serving. Of those 10%, roughly half (those that HAVE the popular files, are SHARING, are on a truly HIGH-speed network, and are NOT FIREWALLED) account for the majority of it. If you take the biggest servers out first, you will have a big impact. What's more, once it becomes established that there are likely consequences for being an effective server of files, the industry need not literally attack every last one of them. They need only use fear to their advantage and allow the servers' own self-interest to take over.
GPG signatures and web of trust (Score:5, Insightful)
GPG signatures (which BTW include a checksum) of content, with said signatures referring to an online alias rather than a real person (thereby maintaining anonymity).
A web of trust is formed, in which HollywoodDude is known and trusted, and has signed RipperGod's key, who in turn has signed FairUser's key, and so forth.
Provide a separate way of obtaining the keys (e.g. multiple independent websites, multiple independent keyservers, and so forth), and people can simply filter out anything submitted by untrusted users. If something is submitted by someone outside of the trust ring, and someone who is trusted sees the item and determines that it is worthwhile/good/whatever and not a decoy, they could sign the item themselves.
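A toy model of that filter, with the GPG machinery reduced to plain key names (all of which are invented); it accepts an item only if its signer is trusted directly or has been vouched for by someone already in the ring:

    trusted = {"HollywoodDude"}                   # keys I trust directly
    vouched = {"HollywoodDude": {"RipperGod"},    # who has signed whose key
               "RipperGod": {"FairUser"}}

    def is_trusted(signer, trusted, vouched, max_hops=3):
        frontier, seen = set(trusted), set(trusted)
        for _ in range(max_hops):
            frontier = {v for k in frontier for v in vouched.get(k, ())} - seen
            if signer in frontier:
                return True
            seen |= frontier
        return signer in trusted

    print(is_trusted("FairUser", trusted, vouched))      # True, via two hops
    print(is_trusted("RandomDecoy", trusted, vouched))   # False: filtered out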
Gaining trust would of course take time, probably requiring many worthwhile submissions, but that is true in real life anyway, so why should it be any different online?
If someone violates their trusted status (or their private key is stolen, which BTW would be a violation of the law), others in the ring of trust could revoke their trusted access and blacklist their signature.
It isn't as convenient as just being able to share something with little or no thought, but it is eminently doable, and there really is no straightforward way to undermine such an approach.
Re:GPG signatures and web of trust (Score:2, Insightful)
That's exactly what the paper's authors said, pointing out that the decrease in convenience is in itself a real danger, and they were right.
Re:GPG signatures and web of trust (Score:2)
Now web sites can present reviews that tie into this new protocol with a URL (something like "gsig://sigs.mediahype.net/ab3827d9827eab39f2c-1").
Sure, you can still put out a 10-second clip with empty noise after, but the download will stop at that 10-second mark. What's more: a smart client can keep the section that DID match the signatures and look for an intact copy to CONTINUE from. Thus, truncated versions will now be ignored immediately.
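A rough sketch of that "keep the verified prefix" behaviour, with the per-segment signatures simplified to plain hashes and an invented segment size:

    import hashlib

    SEGMENT = 256 * 1024   # invented segment size, for illustration only

    def verified_prefix(data, segment_hashes):
        # Bytes at the start of `data` that match the published per-segment
        # hashes; a client can resume from this offset with another source.
        good = 0
        for i, expected in enumerate(segment_hashes):
            chunk = data[i * SEGMENT:(i + 1) * SEGMENT]
            if not chunk or hashlib.sha1(chunk).hexdigest() != expected:
                break
            good += len(chunk)
        return good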
This introduces a centralized client-server model for trust purposes, but reviewers are not providing content, just reviewing it. The MPAA and RIAA could even put up servers that review valid promotional content, and warn users of copyright violations in other files! *This* is the way to solve everyone's problems at once (unless of course your problem happens to be a failing business model).
this already exists (Score:2)
Re:GPG signatures and web of trust (Score:2)
So when the MPAA downloads Star Wars Attack of The Clones they know that I'm the one who ripped it!
I'm not going to put my GPG (PGP) signature on a document with plans to hijack planes either.
Re:GPG signatures and web of trust (Score:2)
So when the MPAA downloads Star Wars Attack of The Clones they know that I'm the one who ripped it!
Go back and read my comment. The comment, not the title. To wit:
There is absolutely nothing about GPG that requires the key to refer to an actual, human identity. If everyone knows that TrustedDude is a trustworthy person, that is sufficient. No one needs to know that TrustedDude is in fact a 15 year old kid in New Jersey who spends his free time violating copyright (or perhaps not, there are all kinds of legitimate uses for P2P networks, not least among them improved accessibility to popular legal content, like free software whose primary ftp servers are often overloaded).
Re:GPG signatures and web of trust (Score:2)
What is needed is source obfuscation. Instead of connecting directly to a node with the files, we lay the network out like a system of routers. Each node can only communicate with its neighbor, which means you can only know the next hop, never the source. Without a doubt this drastically decreases download speeds (each node downloads and uploads the file before it gets to you), but with swarming, and dynamic metrics for each node, it could work out. Of course caching would be the obvious next step.
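A toy illustration of that next-hop-only routing (invented class and file names; real anonymizing networks are far more involved):

    class Node:
        def __init__(self, name, neighbor=None, files=None):
            self.name, self.neighbor, self.files = name, neighbor, files or {}

        def request(self, filename):
            if filename in self.files:
                return self.files[filename]       # serve it ourselves
            if self.neighbor is None:
                return None
            # Forward on our own behalf: the neighbor sees us, never the origin.
            return self.neighbor.request(filename)

    source = Node("source", files={"song.ogg": b"...data..."})
    relay2 = Node("relay2", neighbor=source)
    relay1 = Node("relay1", neighbor=relay2)
    print(relay1.request("song.ogg") is not None)  # True; the origin stays hidden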
As for overloading the network with crap, go right ahead. I'm more than happy to waste RIAA/MPAA bandwidth time and time again. But that's just me.
For a solution, let's be democratic about it. When we search for a file, we find that 10 people have it. If 8 have marked it as being a fake, I probably won't download it.
So that takes care of everything except 3. Broadcasting fake queries in order to degrade network performance. The downside of p2p networks is that they have a lot of overhead. Privacy demands overhead. However, if one is sending excessive requests, they may be blocked temporarily. Of course the lower the TTL, the higher the tolerance for a large number of searches.
Any questions?
You can do better than that :) (Score:3, Insightful)
Obviously the user would get to select the appropriate action if one of the files is just better than the other, with a rating mechanism as well.
Other advantages to this method are:
*Checksums can't practically be faked; doing so would take an infeasible amount of computation. (Use a random block size to thwart a supercomputer precalculating bad blocks that MD5 to the right hash; use multiple hashes.)
*A multiple-host download is guaranteed to be the same file (even when being poisoned).
*A computer need not have the entire file to share a block of the file, therefore files propagate through the network in a more exponential manner. (Host A gets block 1 from B. Host C gets block 2 from B. Host C and A trade blocks 1 and 2. Host D comes along and wants the same file, and can download from A and C instead of bogging down B. It works even better because all connections that I've seen are duplex, even if they have a slower upstream. Conserve network bandwidth by referring downloaders to other people who have downloaded before... search for the GPG signature of the hosts on the network.)
Overall, I see this kind of thing being implemented very soon because it's not that difficult, and it's pretty obvious. Maybe the next edition of Gnutella will support this.
Of course there are loopholes where the RIAA/MPAA could buy half a million IP addresses or have a lot of computers on the network, but you don't have to have an unbreakable system, just a system that costs more to break than they think they will see in profits from breaking it.
faked hashes (Score:3, Interesting)
The problem of faked hashes can be addressed using trees of checksums rather than just a simple checksum, although a workable implementation would require embedding into the P2P protocol.
The idea is you break the file up into smallish blocks (100k or so) and generate a hash for each one of these. For every 8 first-level hashes, you feed them into a crypto hash function to generate a second-level hash. For every 8 second-level hashes... you generate a third-level hash. This allows a continuous (per 100k block) proof that the content is valid. The size of the proof grows with the log of the content, so it is not much of a problem.
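A sketch of that tree construction (general idea only, not the THEX specification mentioned below; block size and fan-out follow the numbers above):

    import hashlib

    BLOCK = 100 * 1024   # "smallish" blocks, per the description above
    FANOUT = 8           # 8 child hashes per parent hash

    def hash_tree_root(data):
        level = [hashlib.sha1(data[i:i + BLOCK]).digest()
                 for i in range(0, len(data), BLOCK)] or [hashlib.sha1(b"").digest()]
        while len(level) > 1:
            level = [hashlib.sha1(b"".join(level[i:i + FANOUT])).digest()
                     for i in range(0, len(level), FANOUT)]
        return level[0].hex()

    print(hash_tree_root(b"x" * (1024 * 1024)))  # root hash of a 1 MB test blob

Intermediate levels can be exchanged alongside the data, so a downloader can verify each 100k block as it arrives instead of waiting for the whole file.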
Tree Hash EXchange (THEX) (Score:2)
Re:faked hashes (Score:2)
They Don't Need to Poison P2P (Score:5, Insightful)
Really. Most users, given the choice, will pick the "honest" legal way to get their music and videos. Will there still be pirates? Of course, but you can never stop them and, heck, you're not losing money on them anyway. They wouldn't spend the money on the music.
Treat honest customers as honest, embrace new distribution methods. The problems go away. Think of the cost savings: they wouldn't have to buy any more senators.
Re:They Don't Need to Poison P2P (Score:4, Insightful)
While lowering the price of the media would make *some* difference, it wouldn't make enough of a difference to be worthwhile.
Re:They Don't Need to Poison P2P (Score:2)
Re:They Don't Need to Poison P2P (Score:2)
Remember, speeding on an open road where there are no other cars is a victimless crime too... but it's still illegal.
Furthermore, piracy isn't victimless. If it were, changing the prices wouldn't make *ANY* difference, but it does. Think about it.
Re:They Don't Need to Poison P2P (Score:3, Insightful)
"Copyright Infringement" is *NOT* piracy.
Thanks.
Re:They Don't Need to Poison P2P (Score:2)
Re:They Don't Need to Poison P2P (Score:2)
But with internet distribution there is only one "right" to sell: once the content is on the net, anyone can get it, anywhere, anytime. While a tremendous boon for consumers, this completely destroys the old, picture-perfect system of nice little independent packages of "rights." And that is why traditional media companies are keeping their heads buried in the sand, horrified at the collapse of their nice neat rights packages, hoping that this whole internet distribution thing will finally blow over. They are praying for ubiquitous DRM systems to re-create all those nice little borders...
Two Problems (Score:2, Interesting)
I see two problems with this idea.
-or-
So if I try to download the latest.. (Score:4, Funny)
And, pray tell, how am I supposed to know the difference?
Simple! (Score:5, Funny)
Re:Simple! (Score:4, Interesting)
Or, maybe, a "licence":
Having distributed content together with such licenses (or hired someone to do so), the labels might find it a bit harder to defend copyright claims for individual songs.
Trust webs (Score:2)
Poisoning such a web could prove difficult. I trust personal friends highly; they aren't a poisoning group.
People I or they don't know well won't get a high trust rating, and would be suspected if they were poisoning the group.
I think Slashdot-type moderation works well too; combined with a decent-sized web of trust, it should be a pretty stable system.
From the article... (Score:2)
The problem is the labels don't have their own online sites. Sooner or later (it's bound to happen) the labels are gonna hire some college grads who grew up on sharing and understand the problem. Maybe then a compromise will be reached.
Not really working... (Score:3, Insightful)
Webs of trust - hardly. Imagine a network of antis giving each other good reviews; they'd certainly be better off than someone without any reviews at all. It's very *unlikely* that the one you're P2P'ing with has a trust chain you accept.
"Database" of who are good traders and not - Fake databases would screw that, you wouldn't know which ones to trust as you have no central server. The problem is that if there's to be any real P2P exchange happening, it's usually *strangers* meeting.
My friends could do a web of trust or a database, but then we'd be much more likely to set up some mutual leech FTP servers instead and skip the P2P networks entirely.
Kjella
Poison chain (Score:2)
Trust should work both ways.
Several unrelated "I got a good file" ratings could give you a cloud of trust. I think it oculd work.
Use Limewire (Score:4, Informative)
Cripes -- did anyone proof this paper? (Score:3, Insightful)
Anyway, is it:
"Or perhaps the carrying capacity of a well-designed P2P network is huge, and *NO* amount of flooding can overwhelm the network."
Or:
"Or perhaps the carrying capacity of a well-designed P2P network is huge, and *ANY* amount of flooding can overwhelm the network."
Which is it: "no" or "any?"
Distributed trust and peer review (Score:5, Insightful)
The author of this paper seems to suffer from the common habit of those in a hurry to finish their term papers: the belief that if they somehow ignore the elephant in the room that disproves their point, they might end up getting partial credit for impressing people with how well they can tap-dance around the elephant. In this case the elephant is the well-established practice of using a secure hash function as a self-verifying mechanism to prevent DoS attacks that try to flood a network with garbage files.
In his FAQ regarding the paper, Mr. Chen correctly addresses the problem of a lack of centralized authority when using hash functions in a distributed/P2P setting, but apparently did not make more than a cursory examination of the subject, or else he would have seen the various methods available for solving such a problem. I can only assume this is the case because reputation systems beyond simple moderation are not addressed and flow-constrained trust networks are never mentioned in this section.
As someone who seeks to pass off a "bad" file (this report) as a "good" file, perhaps sooner rather than later Mr. Chen will learn how the distributed moderation and trust system known as peer reputation works. Surely I am not the only one who finds it more than a little ironic that a paper by an author who claims that distributed moderation doesn't work is being submitted to a peer-reviewed journal in an attempt by the author to bootstrap his own reputation?
Overkill (Score:3, Informative)
Look at the warez scene to see how it goes. A handful of release groups whose names are known to everybody who is even vaguely interested is sufficient to ensure supply. If these groups are attacked by fake releases (rarely happens) they can use hash keys as you suggest (some already do).
Websites like www.sharereactor.com also safeguard against fakes - another mechanism which is strong enough to defeat the entire problem by itself.
What I am saying is that distributed moderating à la slashdot will not evolve. Instead, we will have a handful of "authorities" - Web sites or public keys - that everyone trusts.
Note that authority - when not combined with power - is a Good Thing (TM).
Re:Distributed trust and peer review (Score:2)
It's also funny to see you present as a solved problem something that's actually a very active area of research and pretty much still in its infancy. If you ask ten people who've been working on peer reputation how it works, you'll probably get five saying "it doesn't...yet" and the other five giving you five (or more) different algorithms. You're probably correct that there's a solution in there somewhere, but please don't make people think all the interesting stuff in that area has already been done.
In other words, watch out for that elephant. ;-)
Re:Distributed trust and peer review (Score:3, Interesting)
You can send out a bad copy once, but if well-known and trusted copies already exist on the network you are not going to be able to replace them with bad copies. The self-verification does not prevent the single-point attack you describe; it prevents the propagation of this attack throughout the network. If an attacker serves up bad files (ones that do not match the SHA1 hash advertised), then the downloader should treat the host as malfunctioning and query a more reliable source. The downloading agent does not need to unpack the file and see what is inside; it just checks the SHA1 hash, and then can simply assume that there was a transmission error and try another source. Eventually the malicious node will be trimmed from everyone else's peer list, a new node identity will have to be generated, and the game starts again.
This single attack costs the attacker as much as it does the downloader (and you can bet the RIAA is paying more per MB of data sent than someone downloading the data via a DSL or cable modem line), and a few simple changes to the system would raise the cost further: favor trusted peers (ones who have not given you mismatched hash/payload data) as the first nodes to query, and only move down the local reputation food chain if you need to expand your query or search for alternate sources. Unless an attacker can pretend to be a vast majority of the nodes in the system, it is not going to be able to make this attack scale up in the manner you suggest.
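The retry logic described above might look roughly like this (the peers list and fetch_from function are invented stand-ins for real network code):

    import hashlib

    def download(advertised_sha1, peers, fetch_from):
        # Try peers in order of local reputation; treat a hash mismatch like a
        # transmission error, demote the peer, and move on to the next source.
        for peer in sorted(peers, key=lambda p: -p["reputation"]):
            data = fetch_from(peer)
            if data is not None and hashlib.sha1(data).hexdigest() == advertised_sha1:
                peer["reputation"] += 1      # favor this peer next time
                return data
            peer["reputation"] -= 1          # eventually trimmed from the list
        return None                          # no source delivered a valid copy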
There is a difference between an attack that works on a single download and an attack that would be viable for a network-wide assault. The case you and Mr. Chen bring up here is clearly in the first category, an inconvenience for individual users but not something that will be a significant problem for the network as a whole.
Moderation and peer reputation require some method of recording "ratings" of users on the network - something not present in the current Gnutella network. But if implemented, it would have to be distributed as well. This means that there must, at some point, be blind trust between clients to complete these "ratings". That blind trust will lead to poisoning of the ratings system and make it worthless.
"Ring of trust" simply does not work in a distributed environment that is truly open to anyone. Closed distributed environments, or virtually closed environments within an open environment would be the only way. However new users would not be able to enter them and that is how Gnutella keeps itself alive.
Which is why I think that things like Raph Levien's work on reputation systems (and actually coding up working examples of such a system, see refs below) are rather attractive: they solve this specific problem in a rather elegant fashion and make such simplistic attacks much more difficult and expensive to pull off. [Here's a quick hint: have you ever noticed that most people seem to care about Roger Ebert's opinion rather than yours when it comes to what movies to go see? This is because a distributed trust system can deal with voter-flooding attacks by limiting how much influence comes from untrusted sources.]
You seem to think, Mr. McCoy, that there are obvious solutions. Yet you really don't present any nor do you present any existing real-world examples.
One of the problems I addressed in the original paper was the fact that it was poorly researched in certain aspects. It seems that everyone is too lazy to actually do any research these days, but since spending five minutes doing google searches on various terms related to reputation systems seems to be too much work for either you or Mr. Chen, here is a quick summary of a few minutes work (although I selected papers that I am familiar with after google returned a hit).
1) For starters, look at Google itself. Google is the single biggest distributed reputation system on the internet. That is what PageRank is: the "reputation" of a particular link for a particular subject, using link count as the voting mechanism. It can be attacked and subverted on a small scale, as various Google-juicing experiments prove, but it is also very effective at filtering out these attacks (see some of the Scientology google-juicing wars to see how hard it is to really influence a massively distributed reputation system implemented by people who know how to pick the best ideas from current research and invent a few of their own).
2) EBay seller rankings. These can also be attacked and tweaked, but even when money is involved (making the incentive for dishonest behavior very high, much more so than any p2p system will ever have to deal with) EBay manages to keep fraud to a manageable level and recent research into seller/buyer identity-blinding and reputation cluster filtering can make the seller ranking system even more attack-resistant.
3) Amazon buyer ratings and recommendations. Yet another example of a real-world distributed trust management system.
4) Advogato [advogato.org] is a community forum site that implements some of Raph's Ph.D. work on reputations and distributed trust management to create a flow-constrained reputation system that has some very good attack-resistance characteristics. Raph has been running Advogato using his distributed trust metric for several years now.
5) Pattie Maes' agents group at MIT [mit.edu], specifically the Yenta reputation clustering system but just about everything to come out of this group is a source of good ideas and practical research in this area.
6) Check out some of the available research bibliographies (like this [umich.edu]) and places like citeseer for other research in the subject.
One thing you will notice about these real-world examples is that none of the systems tries to be "perfect", just good enough to get the job done.
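To tie the flow-constrained idea back to the Google example above (item 1), here is a toy power-iteration sketch (not Google's or Advogato's actual algorithm; all names are invented) in which reputation is injected only at nodes I already trust, so a clique of fake voters that no trusted node links to ends up with essentially zero influence:

    links = {                            # who vouches for whom
        "me":     ["alice"],
        "alice":  ["bob", "carol"],
        "bob":    ["carol"],
        "carol":  ["alice"],
        "sybil1": ["sybil2"],            # fake accounts only vouch for each other
        "sybil2": ["sybil1"],
    }
    nodes = list(links)
    seeds = {"me"}                       # trust originates only here
    damping = 0.85
    rank = {n: (1.0 if n in seeds else 0.0) for n in nodes}

    for _ in range(50):                  # power iteration
        new = {n: ((1 - damping) if n in seeds else 0.0) for n in nodes}
        for src, outs in links.items():
            for dst in outs:
                new[dst] += damping * rank[src] / len(outs)
        rank = new

    for name, score in sorted(rank.items(), key=lambda kv: -kv[1]):
        print(f"{name:7s} {score:.3f}")  # sybil1/sybil2 stay at zero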
Actually checksums should work. (Score:3, Interesting)
Although this idea works for newsgroups and some other centralized services, it does not with P2P. Basically, it comes down to the fact that you must trust whomever is actually doing the checksumming, or else they can just lie and publish false checksums. In the case of P2P networks, the checksumming is done by the same person you want to figure out if you can trust! As far as I know, this is an unresolvable problem.
Actually, the checksums should still work I believe, in much the same way that file sizes work now. Consider the reason the files that are being injected are set to the same size as the real file; the purpose is to mask these files to the naked eye. Checksums could be used for the same purpose.
The reason for this is that as people find good files they will tend to keep them while deleting the bad files. Sure, if we only get 1 result back then we don't know one way or the other, but if we get 10 results back and 8 of the 10 have the same checksum, we can assume those 8 are the good files.
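A sketch of that heuristic: group the search results by advertised checksum and prefer the value most sources agree on (the hosts and checksums below are made up):

    from collections import Counter

    def pick_likely_good(results):
        # results: list of (host, advertised_checksum) for one search query.
        counts = Counter(checksum for _, checksum in results)
        checksum, votes = counts.most_common(1)[0]
        if votes < 2:
            return None      # a single result gives no corroboration either way
        return checksum      # download only from hosts offering this checksum

    results = ([("host%d" % i, "aaa111") for i in range(8)]
               + [("decoy1", "bad999"), ("decoy2", "bad998")])
    print(pick_likely_good(results))   # -> 'aaa111'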
Of course the problem with this is that a great many people don't bother to delete bad files after downloading, but should the poisoning become too much of a problem we can entice more people to clean up their shared files by way of the client interface.
All in all, I think this would combat poisoning very well.
Fake Checksums (Score:3, Informative)
Bobs_Song.mp3 5 M Hash -XXXXXXX
You don't know that I gave you the wrong hash till you're done.
It can only tell you that you have the wrong file, after you have it
What a bunch of hypocrits (Score:2)
Sure it can if and only if you can buy online MP3s (Score:2)
I know one of my chief frustrations is to search for a song and either have it incomplete, or be of poor quality (e.g. pops or other defects) or to simply have it not be the same song that I downloaded. If I could search for a song, pay $SOME_SMALL_AMOUNT (e.g. $1US) for it and download a 'known perfect' copy at my choice of bitrates (e.g. 128, 160, etc.) then sure as heck I'd do it.
Distributing these poisoned files would take an enormous amount of bandwidth, so they'd have to have some sort of agreement worked out with ISPs and a mass-content provider, say Akamai. Akamai has tens of thousands of servers located in hundreds (if not more) of ISPs throughout the nation. I think on peak usage they're pushing out 100 GB/sec. in the US (if not more). Simply say "Ok Akamai, can we buy 10GB on each of your servers and push all these MP3s out?". Then you write a gnutella client for each box which offers all the MP3s up for distribution.
I can't remember how the gnutella protocol works but I think it broadcasts search requests to the nodes that store a cache of what they have and what their neighbors offer and then can pass the request off. Have your client log all the requests (so you can tell the record companies which songs were requested more) and of course offer up your files when requested. If you do this with 10,000 boxes full of identical content chances are you're going to drown out any signal out there.
If you're really tricky, you can even have the client 'fake' files so you don't actually need to have the file on the box; you could send a pre-existing obfuscated file, or even dynamically build and stream the poisoned MP3.
Of course, all of this is moot if you still don't have a very easy, cheap method of offering MP3s online for the mass public. You could pitch it like this "Yeah, so you won't make much money off of offering $SOME_SMALL_AMOUNT for each MP3. But you're a fool if you think simply shutting Morpheus off will result in even 10% of the Morpheus users buying the actual CD or using a painful, userUNfriendly pay-per-MP3 system. However, what if we have a method to net you 20 or 30% of users who wouldn't pay you anyway?" So the pitch would be "We can't get you all of them, but our method would give you more than you're getting now!". Frankly the people who post on SlashDot (from the very negative response to the Subscription model) are not a good cross-section of the vast majority of internet users out there
So in your obfuscated file you have it play maybe 20 seconds of the file and then say "Sorry, this is a copyrighted file. Pirating files costs artists money. If you want to buy this MP3 for $SOME_SMALL_AMOUNT, please visit http://www.somestore.com. 80% of $SOME_SMALL_AMOUNT earned will go directly to the artist."
It gives them a reason to buy it - not only do you have SomeStore.com very easily accepting payment, but you ACTUALLY PAY THE ARTISTS A MAJORITY OF THE MONIES EARNED! So it can quell the naysayers who say "Well the artist wouldn't receive anything anyway!" (rant: but who are you hurting more, the billion dollar-industry or the Artist who NEEDS even the small cut they receive from each CD sold?).
Some drawbacks could be of course that someone writes a 'detector' to find and ignore the invalid MP3s, or they block the IP addresses of the servers, etc., but that is easily fixed. Most non-power users (e.g. the great and huddled masses of the internet) don't want to update their Morpheus client every time a new version is released. Heck, even programs which offer hassle-free updating (e.g. antivirus, windowsupdate.com) are very rarely used by the majority of internet users. Also, you'd work out the server IP settings with the ISP so that they would rotate to a random IP in their pool - since the servers are located in most ISPs, you couldn't ban a single IP, but perhaps a subnet. But since the IPs are in the ISP, you have now banned a large chunk of users. If they are in every ISP, you will have to ban every ISP (see the problem in banning IPs?).
So, to boil it down to a sentence:
Have a very easy-to-use, hassle-free, cheap, reliable method for users to buy MP3s and they WILL.
Sharereactor and edonkey (Score:3, Informative)
In order to download a file, you can use a URI such as (ed2k://|file|The_Adventrues_Of_Pluto_Nash(2002)...). The URI contains the "local filename", size and SHA-1 hash. A companion web site [sharereactor.com] acts as a directory of URIs for popular content. The content is screened by the folks running the site. It has now reached the point where the "pirate" teams have accounts and post SHA-1 encoded URIs before releasing the content into the wild. Most edonkey users don't use the embedded search and instead use directories such as sharereactor.
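A toy parser for links of that general shape, assuming a form like ed2k://|file|<name>|<size>|<hash>|/ (the real-world format may differ in details; the example link below is made up):

    def parse_ed2k(uri):
        if not uri.startswith("ed2k://|file|"):
            raise ValueError("not an ed2k file link")
        parts = uri.split("|")
        # parts: ['ed2k://', 'file', name, size, hash, '/']
        name, size, file_hash = parts[2], int(parts[3]), parts[4].lower()
        return name, size, file_hash

    print(parse_ed2k(
        "ed2k://|file|Some_Release_Name.avi|734003200|0123456789ABCDEF0123456789ABCDEF|/"))
    # -> ('Some_Release_Name.avi', 734003200, '0123456789abcdef0123456789abcdef')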
Checksums and signatures work (Score:3, Informative)
The author writes
This is not an unresolvable problem at all; this is where the web of trust comes in. The basic idea is for the publisher to sign the checksum using his or her private key. Others can then verify the signature using the publisher's public key. This allows me to verify, using only a few bytes of information, that a publisher named SecretAgent did indeed publish a file. If I know that SecretAgent has previously published a lot of "good" files, then the file is probably good. If I don't have any experience with SecretAgent, but I do know that PrivateBenji is trustworthy, and PrivateBenji vouches for SecretAgent, then the file is probably good.
The author fundamentally misunderstands webs of trust:
A web of trust is not a "trust rating" ala eBay. A web of trust is a specific group of people who vouch for each other. Creating a malicious group of people who trust each other does not cause problems. (In fact, it can actually help.) If I trust A, based on experience, and if A trusts B, based on experience, then I can probably trust B. The fact that C, D, and E are malicious doesn't cause problems, because neither A nor B trusts them.
So, sharing is OK now right? (Score:2)
I wonder what would happen if some ordinary user did the same things? Right or wrong?
Dealing with the problem this way is far better than using the law because it is hard to define the law in a way that makes good sense for everyone long term particularly when we don't yet know how P2P could benefit us all.
Besides, they can place any amount of promotional information into their files just as easily as they can garbage, and they should. Why not? They might even be able to write off more of the expense.
What the media companies need is good marketing. They are the content source. (for now) All they need to do is add value in ways that leverage the network effect that P2P offers and they *will* make money.
Anyway, the result of this is likely not all bad because file sharing will get somewhat marginalized, we all preview before we download large files and everyone is reasonably happy and free to use the net in creative ways.
P2P sharing should leverage popularity (Score:2)
Popular files are more likely to be valid. Poison is less likely to be popular. Poison sinks to obscurity.
block checksum (Score:3, Interesting)
Too bad I am so late in posting this...
What we need, to legitimize P2P (Score:2)
(Bye bye, karma) I may sound like a troll, but at least I'm being honest.
Peer-to-peer filesharing has a great deal of potential, but if its only popular use is piracy, well, we already get enough bad press, don't we? It'll only get worse.
(Sorry about the soapbox I'm standing on...)
Misapplications... (Score:2)
I'll leave the relevant ethical issues as a matter of discussion -- but I would suggest that this is a far more serious reason to be concerned about corporate research into network interruption.
Meet the demand, kill the network (Score:2, Insightful)
Enough people will defect to the faster, more direct, legitimate servers. Where they can get the whole album and a movie in 2 hours instead of 2 weeks. The price should be good enough to encourage this.
The P2P networks rely on enough users mirroring enough copies of enough products. Reduce the user base and the number of nodes drops until it just doesn't work anymore.
You can see this on the unpopular P2P networks now.
So either you will end up with:
1. a few users sharing lots of files (which can be picked off with civil copyright laws).
2. a few users sharing few files (which means they can't find the files they want on the network, so are less likely to be running a P2P just to support other users, so the number of people spirals down).
The one thing I don't think you will end up with is many people legitimately downloading and then sharing the files. Quite simply, sharing over P2P would eat up the bandwidth you need to do the downloading.
Another factor is the charging: many ISPs are moving to a download limit, e.g. T-Online is moving to a 5GB limit per month, then pay 1.5 cents per MB.
So a movie would cost $7 to download after you've used up the first 5GB. Or for that matter to upload to another user!
So you could pull maybe 7 movies a month on the flat fee.
A lot of users on P2P systems will disappear as this becomes the norm.
So P2P is really just a temporary problem for copyright holders, just as long as they get their legitimate sales systems in place and don't go pissing off the consumers with DRM, funny licenses etc.
Comparisons to the War on Drugs (Score:4, Interesting)
1. Randomly selecting and litigating against users engaging in piracy
2. Creating fake users that carry (incorrectly named or damaged files)
3. Broadcasting fake queries in order to degrade network performance
4. Selectively targeting litigation against the small percentage of users that carry the majority of the files
This mostly summarizes the war on drugs and the government's strategy for enforcing alcohol prohibition in the 1920s. Neither worked, and the countermeasures are simple and straightforward.
A "directed" web of trust, objective quality measurement, and knowledge compartimentalization defeat the above strategy. The countermeasure of creating large numbers of mutally trusting attackers doesn't work when trust "flow" is taken into account. The keys to such a system are:
1) trust is assymetric
2) nodes define and change who they trust based on their own assessments
3) Nodes protect their knowledge of the web of trust
To see how this works, consider the cops and the drug dealers. The fact that the cops all trust each other does not result in the drug dealers trusting them. When a dealer is compromised, no matter how high up the chain it goes, trust shifts to rivals. Even when a kingpin falls, lines of trust will still exist that aren't compromised.
Drug dealing is not as popular as file sharing, is substantially more damaging to people's lives and society, and has motivated levels of funding that are not matchable by publicly traded firms (who must demonstrate at least mid-range ROI). Despite all of these advantages, the war on drugs has been a dismal failure. The bottom line is that the internet makes distribution of content a commodity, where it was formerly a task of enormous complexity and value-add. Economics will determine the rest, unless the US adopts and maintains a totalitarian government.
Re:Shameless plug... (Score:2)
This strategy fails to take into account the fact that an RIAA mole could easily share desirable content. For example, mp3.com [mp3.com] has 7 free, legal tracks for download from Linkin Park (not my choice in music, but they are quite popular currently). There are quite a few other well-known bands with free tracks on there. Sharing all that content, which the various record labels have decided to share anyway, will only serve to get the sharing user voted up.
Once the mole is voted up for carrying lots of valid files that people are interested in, the mole begins to distribute poison. Sure, this will cause the mole's ranking to fall somewhat, but damage will be done in the process. Furthermore, the legit files will continue to somewhat offset the attempts to vote the user down. Multiply this whole situation by a number of different automated users, and you've got an effective poisoning attack.
In short: The mole has spread files that the RIAA already wants distributed (win for the RIAA, win for users), and the mole has spread poison for files that the RIAA doesn't want distributed (win for the RIAA, loss for users).
Re:Directories like Bitzi can stop fraudulent file (Score:2)