Slashdot
Gnutella Not Scaling?
Posted by
Hemos
on Fri Sep 22, 2000 11:59 AM
from the peer-to-peer-salvation dept.
cbull writes "ZDNet Music has an article that makes an argument that "Gnutella is Going Down in Flames". Basically, the argument is that Gnutella isn't as scalable as Napster."
This discussion has been archived.
No new comments can be posted.
137 comments
The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
Improving GNUtella (Score:3)
If you've got a big pipe, and you're going to be connected to Gnutella for a while, this would improve the performance of your client and those closest to you.
Of course, if you really want improvement, you'd have to build this capability into the protocol. Allow clients to register as either low or high bandwidth. Then low bandwidth clients could do anything, but traffic could only go through them for a level or two. Ideally, you'd want every client to be able to reach a high-bandwidth node within 3-5 hops. A connected client would then note and rely upon these distribution nodes to do the work. Perhaps even reconnect to distributors directly...
Just a thought. Isn't this the kind of thing that Freenet already does?
Xentax
Re:Of course it doesn't scale (Score:3)
I'm surprised that this is an issue at all. I haven't looked at the Gnutella infrastructure, but these are issues that I would have thought were tackled during the initial design.
Re:Yeah no shit. (Score:4)
--
Scalable vs. Distributed (Score:4)
Freenet (Score:3)
Freenet isn't searchable (Score:3)
This is hardly news (Score:3)
Re:Gnutella IS going down in flames. (Score:3)
On the plus side, he eventually did manage to find every single
Is there any reasonable way to determine usage stats for Gnutella?
Kierthos
Gnutella is open source... (Score:3)
There are still people like that in the world today. What a shame! It seems that ZDNet likes to cater to this crowd. So now they are bitching to an entire community in which they were, by default, invited to participate.
It NEEDS a central index and reference. (Score:3)
If you use search engines that don't check the accuracy of the data they scrounge, or run your own with Archie/Veronica types of searches, or worse, become your own search engine, snooping on everybody's hard drives, you're going to take longer and longer to retrieve indexes to content of more and more dubious quality.
The world NEEDS MP3.com types of businesses that rate & index as well as store content.
The world NEEDS engines that can demand micro-payment from the recipient before sending a file.
The world NEEDS micro payment services like X3.com to catch the pennies and send the content producers their due.
And SCREW the RIAA, MPAA and other Luddites and SCREW the culture vultures who rip off the content creators (artists and writers etc.) and rip off the consumers by overcharging simply because they put themselves in everybody's faces.
Part of a solution (Score:5)
The gPulp project is currently working on all of these issues. Check proposals and ideas at: http://gnutellang.wego.com/go/wego.pages.page?groupId=133015&view=pag
There is also a server oriented gnutella application which aims to start resolving some of these issues in the near term. Features such as:
1) Provide a server for broadband / dedicated network users to provide content with a true server oriented gnutella node. This will be similar to a modified apache for singular installations, or a federated distributed server architecture for routing and caching fun.
2) Remove broadcast push requests (in all future clients)
3) Proxy and cache support for slow users. This will allow beefy servers to take over some of the load which dialup / slower clients experience. This will be somewhat à la Freenet, as popular data will propagate through caches in various nodes. Also, this can provide a level of anonymity which is not present today.
4) Adaptive servers which configure their network connections for optimal efficiency: not too busy, not too slow, and with the widest distance topologically from their peers (if linked), plus fuzzy / reactive propagation algorithms so that TTLs and routes can be dynamically modified as load increases or other factors require.
There is nothing fundamentally flawed with the Gnutella architecture, and it is far from a 'dead horse'. However, there are significant inefficiencies and complications which are causing problems right now. Rest assured these will be fixed.
Yes, it doesn't scale; we know that. (Score:4)
The basic problem is that small sites either take a lot of search hits to which they will answer "no find", or their index has to be mirrored elsewhere, which introduces centralization. There's an economy of scale to searching.
So automatic, distributed, redundant, partial centralization is necessary. This is hard. It also has to be reasonably secure against hacking; look at the problems IRC has. It probably needs a reputation service, so people who spam the indexing system lose.
On the other hand, music interest, being a popularity thing, follows a power law; the music most likely to be searched for will be found easily. A simple hack on Gnutella so that it queries servers slowly, in order, starting at the one with the best response time, stopping with the first find, will keep the thing from collapsing until somebody cracks the hard problems. It's not necessary to crack the general distributed search-engine problem to fix this.
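The "query slowly, in order, stop at the first find" hack could look roughly like the sketch below. This is not Gnutella code: `Peer`, `has_file`, and `response_time_ms` are made-up names for illustration, and real servents would be queried over the network rather than in memory.

```python
from dataclasses import dataclass, field


@dataclass
class Peer:
    """Hypothetical stand-in for a remote servent."""
    name: str
    response_time_ms: float
    files: set = field(default_factory=set)

    def has_file(self, query: str) -> bool:
        return query in self.files


def sequential_search(peers, query):
    """Query peers one at a time, fastest first, and stop at the first hit.

    Returns (matching_peer_or_None, number_of_peers_queried).
    """
    queried = 0
    for peer in sorted(peers, key=lambda p: p.response_time_ms):
        queried += 1
        if peer.has_file(query):
            return peer, queried
    return None, queried


peers = [
    Peer("slow-modem", 900.0, {"popular.mp3"}),
    Peer("cable", 120.0, {"popular.mp3", "rare.mp3"}),
    Peer("t1", 40.0, {"popular.mp3"}),
]
hit, cost = sequential_search(peers, "popular.mp3")
print(hit.name, cost)
```

Because popular files sit on many hosts, the loop usually ends after a query or two, so the load per search stays small instead of growing with the size of the network. Rare files still cost a full sweep.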
Distributed server (Score:3)
Math... (Score:5)
The Math behind it is simple:
- Every user adds Cu amount of capacity to the network (on average).
- Every user also adds Tu amount of traffic (also on average). However, because of the broadcast nature that traffic is sent to all users, so with N users, each user generates Tu*N amount of traffic.
This means that the total capacity of the network is:
C = Cu*N
(Capacity per user times the number of users). The total traffic on the other hand is:
T = Tu * N * N = Tu * N^2.
For the network to work, C needs to be greater than T. But C grows with N while T grows with N^2, so T > C as soon as N exceeds Cu/Tu. You simply cannot win using a broadcast model.
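To put numbers on C = Cu*N and T = Tu*N^2, here is a small Python sketch. The values of `CU` and `TU` are assumptions picked only for illustration (a 56 kbit/s modem and 100 bit/s of query traffic per user); they are not measurements of the real network.

```python
def total_capacity(cu, n):
    """C = Cu * N: capacity grows linearly with the number of users."""
    return cu * n


def total_traffic(tu, n):
    """T = Tu * N^2: every user's traffic is broadcast to all N users."""
    return tu * n * n


def break_even_users(cu, tu):
    """Setting Cu*N = Tu*N^2 gives N = Cu/Tu; beyond that, T exceeds C."""
    return cu / tu


CU = 56_000  # assumed capacity per user, bits/s
TU = 100     # assumed query traffic generated per user, bits/s

print("break-even N:", break_even_users(CU, TU))
for n in (100, 560, 1000):
    c, t = total_capacity(CU, n), total_traffic(TU, n)
    print(n, c, t, "ok" if c >= t else "overloaded")
```

Whatever values you plug in, the break-even point Cu/Tu is a fixed number of users, so adding users past it only makes things worse.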
On the Freenet-dev list we have a standing rule that two words are indecent and offensive: "centralize" and "broadcast". We think we can pull it off without them, but it makes everything 1000% more difficult, which is the simple answer to why Freenet is developing more slowly than the one hundred million Napster and Gnutella variants out there. That, and the fact that you are not helping us...
Re:Freenet isn't searchable (Score:5)
The underlying Freenet architecture should actually be quite a good fuzzy-searching system, it is just that we have not got around to enabling that functionality yet as we have been concentrating on getting the underlying architecture right.
--
Gnutella may not scale, but it is still useful (Score:4)
So I don't think Gnutella is going down in flames. Since it is open source, we may take that as a lesson learnt and perhaps rip out the offending non-scalable part and build a better file sharing device that actually works this time.
Re:Math... (Score:4)
1) Every bit of information is NOT sent to every other client. Many requests are dropped, ignored, or simply do not reach their destination when the TTL expires.
2) The nature of the clients ensures that slow connections have fewer peers, propagate fewer requests, and receive fewer requests than faster ones.
These two attributes greatly reduce the theoretical maximums encountered when doing math.
The real world implementation does not even remotely follow the absolute mathematical predictions.
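The effect of the TTL and of the number of peers per node can be estimated with a quick upper bound. The sketch below assumes an idealized tree-shaped flood with no loops and no dropped requests, which is the best case for reach; `max_reach` is a made-up helper, and the values 4 and 7 are just example inputs.

```python
def max_reach(connections: int, ttl: int) -> int:
    """Upper bound on nodes reached by a flooded query.

    Assumes every node has `connections` peers and that no two paths
    lead to the same node. The first hop reaches `connections` nodes,
    and each later hop forwards to (connections - 1) new peers per node.
    """
    reached = 0
    frontier = connections
    for _ in range(ttl):
        reached += frontier
        frontier *= connections - 1
    return reached


print(max_reach(4, 7))  # well-connected node
print(max_reach(2, 7))  # slow node with only two peers
```

Even in this best case a query touches a bounded horizon rather than all N users, and a node with few peers sees and forwards far less traffic, which is why measured load can sit well below the N^2 figure.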
Gnutella IS going down in flames. (Score:3)
Gnutella was a good idea; it was just taken the wrong way by the moronic serverops who can't avoid sticking a ruler between their legs. Personally, I'd prefer having separate servers for content (mp3 specific network, DivX specific network, binary specific network, etc.).
demonstration (Score:4)
My Home: Apartment6 [apartment6.org]
Death of Gnutella a little premature. (Score:5)
They further mention that proposals for a redesigned version have already been made.
link from article [wego.com]
Not only that, it says support and resources for this project are being sought out - it's active, it's open source, what more do we want?
Given the interest in Gnutella, I don't see any problem finding people to fix known bugs.
Rather than seeing this as the death of Gnutella, I saw it more as a positive article pointing out known bugs that are being fixed, and announcing the planning of a new and even more powerful version.
Re:Yeah no shit. (Score:5)
Only if you insist on reaching all the nodes all the time. If you can afford to reach only a subset of the nodes for any given request, then the problem becomes one of proper clustering.
Note that Napster also implements a kind of clustering: you see the files of people in your "cluster", not of all Napster users on Earth.
Kaa
a 2^N problem: Metcalfe's law (Score:3)
is gnutella practical anymore? (Score:3)
It's been mentioned before, but some ways of fixing the situation may include making the searches bandwidth-related to filter out the modems. Perhaps a better idea would be an auto-peer mode where high-bandwidth connections become servers for a cluster of machines near them (gaining mojo points, to take the Mojo example). Then clients can just search the (relatively) finite collection of high-bandwidth, high-speed servers, much as in Napster, but with the client/server roles a bit more fluid.
check out Mojo Nation (Score:5)
It uses centralized content tracking servers, but anyone can run one by just clicking a switch in their client. The content trackers store XML metadata describing the file, so you can search on different fields in different file type categories (easily definable).
The files themselves are broken into small redundant pieces and spread over the network. You only need half of the available pieces to reconstruct the original file. This way the system is resistant to servers disappearing. It also means you distribute your load over many hosts, and clients with slower connections can still provide block services.
The coolest thing is that Mojo Nation has a built-in digital cash called "Mojo" and a microcredit system that effectively turns it into a barter system for disk space, bandwidth, and CPU. Whenever you upload, download, search, or otherwise consume another system's resources, you must compensate them with Mojo. The Mojo represents the disk space, CPU, and bandwidth you are using. You can get Mojo by contributing your resources to the network through the client software (it's automagic). This way nobody can consume more resources than they are contributing to the system. Each person that uses it helps to make it stronger. Of course, being a real digital cash system, nothing stops people from sending Mojo to each other in e-mail and settling the transaction with something like PayPal.
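To illustrate the barter idea only, here is a toy ledger in Python. It is not Mojo Nation's actual protocol or API: `MojoLedger`, `credit`, and `pay` are made-up names, and a real system would use digital cash tokens between peers rather than one shared table of balances.

```python
class MojoLedger:
    """Toy accounting of who has earned and spent Mojo (illustration only)."""

    def __init__(self):
        self.balances = {}

    def credit(self, peer, amount):
        """Peer earns Mojo, e.g. by serving blocks to somebody else."""
        self.balances[peer] = self.balances.get(peer, 0) + amount

    def pay(self, consumer, provider, amount):
        """Consumer pays provider for a resource; refuse if it can't cover it."""
        if self.balances.get(consumer, 0) < amount:
            return False
        self.balances[consumer] -= amount
        self.credit(provider, amount)
        return True


ledger = MojoLedger()
ledger.credit("alice", 10)            # alice contributed some disk space
print(ledger.pay("alice", "bob", 4))  # alice downloads a block from bob
print(ledger.pay("carol", "bob", 1))  # carol has contributed nothing yet
print(ledger.balances)
```

The refusal in `pay` is the whole point: a client that has earned nothing cannot draw on anyone else's bandwidth or disk.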
It's really cool, check it out.
Burris
Optimization... (Score:4)
I think there needs to be a way to tell what the network load on an individual node is, and attempt to negotiate connections with machines of similar connection speeds or ping times up to a maximum load cut-off.
Of course, there will still be people with hacked clients that report a bandwidth of 0 and a load of 10, but suspiciously have low pings. Those leeches should be killed, or at least swamped with connections...
Also, it would be nice if the network could re-organize over time, as in, promote people in your segment who give you back successful searches, and cut off branches that don't yield search results. Then everyone who wants free books would eventually find each other, and be separate from everyone who wants free porn (the other 99%, it seems)
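The "promote peers that answer, cut branches that don't" idea could be prototyped along these lines. This is a rough Python sketch with made-up names (`PeerScores`, `record`, `split`) and arbitrary example thresholds; nothing here is part of any existing client.

```python
class PeerScores:
    """Track how often each directly connected peer returns search hits."""

    def __init__(self):
        self.queries = {}
        self.hits = {}

    def record(self, peer, found):
        """Note one search routed through `peer` and whether it yielded a hit."""
        self.queries[peer] = self.queries.get(peer, 0) + 1
        if found:
            self.hits[peer] = self.hits.get(peer, 0) + 1

    def hit_rate(self, peer):
        asked = self.queries.get(peer, 0)
        return self.hits.get(peer, 0) / asked if asked else 0.0

    def split(self, min_rate, min_queries):
        """Return (keep, drop). Peers with too few queries get the benefit of the doubt."""
        keep, drop = [], []
        for peer, asked in self.queries.items():
            if asked >= min_queries and self.hit_rate(peer) < min_rate:
                drop.append(peer)
            else:
                keep.append(peer)
        return keep, drop


scores = PeerScores()
for found in (True, True, False, True):
    scores.record("book-fan", found)
for found in (False, False, False, False):
    scores.record("dead-branch", found)
scores.record("newcomer", False)
print(scores.split(min_rate=0.25, min_queries=4))
```

Run over time, each node keeps the neighbors that are useful for its own kind of searches, so people with similar interests should drift into the same segment.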
---
pb Reply or e-mail; don't vaguely moderate [ncsu.edu].