Interview Responses From BitTorrent's Bram Cohen
1) Bit-Torrent browsing... by CashCarSTAR
Has any effort/thought been put towards bit torrent page distribution?
Specifically, a way that one can use BT to mirror webpages. A way to get around the /. effect, which would also work wonders in the next emergency that comes up (see 9/11).
Bram:
Images in web pages are very small and require very low latency. BitTorrent is designed for much larger files, which download on the order of minutes or hours rather than seconds. BitTorrent uses the significant amount of time those downloads take to try out and compare different connections. This process has inherent latencies which make it unsuitable for images on web pages.
Certainly it would be possible on paper to dramatically reduce the cost of hosting an ordinary web site using peer transfers, but the logistical problems of handling many small files at low latency have yet to be solved, and will probably require a protocol which looks significantly different from BitTorrent.
2) Forward successful download stats to originators... by gsfprez
Many freeware/shareware folks like to keep download stats for marketing purposes, so P2P software and mirrors really irk them....
In order to foster more love from freeware/shareware distributors, could BitTorrent be made to inform the end user (me) that BitTorrent was going to send a "notice of download" (not including any personal information, such as an IP, etc.) upon successful download (that I could preview before sending, of course)?
If *I* were Warner Bros, and everyone offered to distribute and pay for all the bandwidth for the next version of the Animatrix, while I still got to see download statistics, I'm not sure I'd even need to provide a direct link to the 150 meg QuickTime files.
With this kind of feedback mechanism, the software/media providers get all the love - download stats, far far far less bandwidth used - and we get all the goodness - their free movies, software, freeware, data, etc. It's the ultimate mirror.
Or am I missing something?
Bram:
I'm happy to report that you are, in fact, missing something. Clients report very detailed statistics to the BitTorrent tracker, including the number of complete downloads and the total amount each peer uploaded and downloaded. If you host a file using your own tracker, all of this data is readily accessible, the same as if you hosted it via http.
By the way, many people find out about tracker statistics reporting and falsely think that hacking their client to exaggerate their upload rate will increase download speeds. Clients actually decide who to upload to based strictly on the transfer rates they experience directly; tracker statistics are never even sent to them.
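For readers curious what that reporting looks like on the wire, here is a minimal Python sketch of the announce request a client sends to the tracker. The query parameter names (info_hash, peer_id, port, uploaded, downloaded, left, event) follow the published BitTorrent protocol description; the helper name build_announce_url, the tracker URL, and all sample values are assumptions made up for illustration.

```python
from urllib.parse import urlencode


def build_announce_url(tracker_url, info_hash, peer_id, port,
                       uploaded, downloaded, left, event=None):
    """Build the GET URL a client uses to report its state to the tracker.

    build_announce_url is a hypothetical helper, not part of any client.
    info_hash and peer_id are 20-byte values; urlencode percent-escapes them.
    """
    params = {
        "info_hash": info_hash,
        "peer_id": peer_id,
        "port": port,
        "uploaded": uploaded,      # total bytes uploaded so far
        "downloaded": downloaded,  # total bytes downloaded so far
        "left": left,              # bytes still needed to finish
    }
    if event is not None:
        # "started", "completed", or "stopped"
        params["event"] = event
    return tracker_url + "?" + urlencode(params)


url = build_announce_url(
    "http://tracker.example.com:6969/announce",  # placeholder tracker
    info_hash=b"\x12" * 20,                      # placeholder hash
    peer_id=b"-XX0001-123456789012",             # placeholder peer id
    port=6881,
    uploaded=1048576,
    downloaded=2097152,
    left=0,
    event="completed",
)
print(url)
```

Note that every number here is self-reported by the client, which fits the point above: a hacked client can send the tracker whatever it likes, but that does not change what other peers measure for themselves.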
3) Comparison to other P2P... by jfmiller
As far as I can tell the genius of BitTorrent is allowing peers who themselves do not yet have a complete file to share the parts they do. With all due respect to the effort taken, the rest is just functional glue that allows the system to work as it should.
The eDonkey protocol used the same basic premise. How is BitTorrent different to it and other P2P protocols and why did you make that choice?
Bram:
That 'functional glue' is extraordinarily difficult to get to work well. Ever-changing network conditions and very high rates of peers disconnecting produce a very thorny logistical problem. Most existing swarming implementations don't even manage to fully utilize all the upload capacity available to them.
That said, there are other decent swarming implementations. For example, the one in eDonkey is quite serviceable, and Furthurnet's works okay as well. BitTorrent handles the little details of file transfer better than all of the others, but if that were the only difference its advantage would be relatively minor and subtle.
What sets BitTorrent apart is its very robust technique for rewarding specifically the peers which upload the most, known as leech resistance. On the highest level, this prevents a long-term meltdown of the system from being caused by people running leeching clients. It also causes upload and download rates to be somewhat correlated, so peers on good pipes get decent download rates, which increases general good feeling about how the system behaves. Overnet, the follow-on to eDonkey, may start using BitTorrent's peer protocol in the future specifically for the leech resistance properties.
By the way, people sometimes run clients hacked to not upload at all and still experience good download rates. Usually this is because they're downloading a file which has been available for a while and there are many clients which have finished downloading but been left running, so there's plenty of excess bandwidth to go around. Not uploading in a swarm which is still ramping up is generally ruinous for download rates.
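A rough Python sketch of the idea behind leech resistance: each round, a client gives its upload slots to the peers that have recently sent it data at the best rates. This is a simplification under stated assumptions. The function name, the slot count of 4, and the rate numbers are all illustrative, and the real client does more (for example, it also spends time trying out and comparing other connections).

```python
def pick_peers_to_upload_to(download_rates, slots=4):
    """Return the peers to upload to this round.

    download_rates maps a peer name to the rate (bytes/sec) at which that
    peer has recently been sending data to us. Peers that give us the most
    get our upload slots. Hypothetical helper; slots=4 is an assumption.
    """
    # Peers that send us nothing are never candidates in this sketch.
    contributors = [p for p, rate in download_rates.items() if rate > 0]
    ranked = sorted(contributors, key=lambda p: download_rates[p],
                    reverse=True)
    return ranked[:slots]


rates = {
    "peer_a": 52000,
    "peer_b": 0,       # a leech: takes data, sends nothing back
    "peer_c": 18000,
    "peer_d": 91000,
    "peer_e": 7000,
    "peer_f": 30000,
}
chosen_peers = pick_peers_to_upload_to(rates)
print(chosen_peers)
```

Because the ranking uses only rates the client measured itself, a peer cannot talk its way into a slot; it has to actually send data.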
4) Improvements... by BJH
Bram,
Do you have any plans for improvements to BitTorrent to improve some of its (few) weaknesses, such as searching for torrent files, bandwidth usage by trackers and inability to download if the tracker goes off the air?
Bram:
I have no plans to add search functionality, since that can be handled at a higher layer, such as google, and finding content via links is considerably more versatile and widespread than keyword searching anyway.
Bandwidth used by the tracker is currently around 1/1000 the total amount of bandwidth used. With some tweaking, I can get that down to around 1/10,000. Going lower than that would require sacrificing the tracker's ability to collect statistics, since those get significant at that scale.
Relying on a single tracker is really no different than relying on a single web site. Any well-colocated machine is plenty reliable enough, and if you really need failover you can do it at the DNS level.
4a) Re: Improvements... by ichimunki
I would like to refine this question because I have some specific nits that I'd like to pick: why doesn't the client/server open a single port and listen on that instead of opening a new port for each file? Second, why don't the peers maintain and share information about other peers once the download has started? Going through the central tracker provides a central point of failure. Wouldn't decentralizing allow for a .torrent file to have a list of seeds, and then each of the seeds would be able to share information about peers, eliminating the need for a tracker altogether?
Bram:
Single port has been high on my list of things to do for a while now but keeps getting put off as more immediate concerns pop up. It mostly hasn't been done yet for a highly technical reason. The way BitTorrent currently shuts down is with a hack where the entire event loop is terminated. To support multiple downloads, a cleaner technique would be necessary, one which stops only the events and sockets related to a particular download when that download terminates. This is reasonably straightforward to implement, but requires a lot of surgery.
By the way, my mail load has made getting actual development done rather difficult as of late. I'm hoping to offset this with contributions from other developers. While there's been plenty of interest in contributing, and a significant amount of contribution to the tracker, to date no one other than me has made any significant changes to the core download functionality.
If anyone really wants to make a significant development contribution to BitTorrent, you should read over the codebase enough to understand it all (the irc channel can be helpful with this) then ask me what's on the to do list. I suggest you do not start implementing your own BitTorrent client. There are already several of those being worked on, and they're all very far from being as mature as the mainline client. What's really needed is more development on the main branch.
5) Impending doom... by damu
Are you taking any precautions for your clash with the RIAA/MPAA?
Bram:
I don't expect to run into any legal trouble. BitTorrent can be used for any kind of content, and several web sites have used it for their own files. Also, all the etree usage (live show recordings of bands which permit it) is completely legal. BitTorrent's total bandwidth usage would be quite substantial even if the etree distributions were all it was used for.
6) Future Considerations... by pgrote
Do you feel that BitTorrent's core functionality can one day be integrated in the operating system as a file system? The ability to share files among disparate systems in remote locations can be seen as an extension of what was started with HTML, et al.
Bram:
No. BitTorrent's API is one of starting a download and later being notified that the whole download is complete. File system APIs very specifically involve open(), seek(), read() and write(), which are completely different and wholly incompatible with the way BitTorrent works.
The same is true of http, by the way. Attempting to make certain protocols act like local file system access is kludgy at best, both as a literal concept and as a metaphor.
7) Panhandling for internet dollars... by Matey-O
You've got a paypal donation button to help compensate you for your non-trivial expenditure of time...how well is that working? Is it an adequate revenue stream, or just enough for a pizza or two?
Bram:
So far, more than a pizza, but less than a living. The donations definitely help though.
8) Re: most obvious question... by Noksagt
...what do you think of what people have done with what you have created? I'm sure you might be sick of people asking you how to obtain a torrent for the latest movie, but are you troubled that it is being used for copyright infringement? Pleased? Apathetic?
Do you wish that it was used more for distributing legal ISOs and other files? If so, do you believe you should promote it more for this purpose or promote development of tools to push it in this direction (perhaps automatic creation of torrents on a successful build, etc.).
Bram:
I'm amused mostly. I find humans highly entertaining.
My attempts to promote BitTorrent for any specific purpose basically failed. It's grown almost entirely through guerilla marketing. That said, I'm hoping that in the future BitTorrent starts being used directly by content producers to distribute their own works.
9) Success... by pgrote
BitTorrent has seen a wide array of usage since it debuted. Many of the uses have been surprising, and it has caught the fire that makes software a success. How do you personally measure the success of BitTorrent? Has it achieved the goals you first set?
Bram:
I generally measure software success by how many machines it's deployed on. In that sense BitTorrent has done very well, but it will probably become much more widespread as publishers make their content available using it. My current hope is that BitTorrent will one day be installed on almost all end user machines.
10) Commercial Interest... by Noksagt
I think that bittorrent can be of significant commercial interest. It might be used for software updates for instance. Have you pursued this path or have companies approached you? I certainly hope you'd keep a free version available, but a more feature-rich version would surely land you a great deal of money with the right pitch.
Bram:
So far there hasn't been much commercial interest, but I expect that to change now that large deployments have proven the technology so dramatically.
Starting a business is very tempting. BitTorrent has the potential to create such incredible amounts of value that if I manage to make even a tiny fraction of that I could do very well.
-----
Why Python? (Score:5, Interesting)
If bittorrent ever gets modified to serve much smaller objects, like html pages and gifs and jpegs, then the ton of trackers needed would see a big improvement if written in a compiled language or even java (though I hear a java version is in the works). It would have been interesting to hear from his point of view though.
Advice to Bram on making money (Score:5, Interesting)
My advice would be to license the source code under the GPL for OSS projects, and additionally under a commercial license for businesses.
Provide BT technology for incorporation into random commercial products. Resell your consulting skills at a good rate. Train others to be able to do the same. With licensing and consulting fees, you will do nicely.
A good project. (Score:5, Interesting)
this way the first few people on the thing would be getting it from the corporate client, then after that from other peers, but then when the file becomes unpopular, people would then basically be getting it from the corporate client again.
This would be a little improvement. Though this may just show my ignorance of how bittorrent works as well. Currently I download some files using bittorrent (wolfenstein enemy territories) but when all the seeds go away it can cause issues.
So basically make it so that there is a relatively permanent seed, and he is always requested from LAST. That way if the file is popular the site doesn't have to worry about losing bandwidth.
also, stats tracking should be "ramped up" a little, to where someone would have to register to use the torrents on a specific site, so that per-user tracking could be used. Now this wouldn't interfere with anyone's right to privacy, but could be used as a "bonus" system, to provide incentive to keep the torrent open. I.e. the more you upload the more "credit" you are given. If you think of it in slashdot subscriber terms, perhaps people that have a high "credit" (i.e. they leave their client open after being finished) would get earlier access to files. maybe have a 3 tier file access. top tier (high uploaders) would get the file as soon as it was served. second tier would get at it 20 minutes later, and 3rd tier would get it 45 minutes to an hour later.
this would allow sites to reward those that are high quality users, and maybe allow them to track site benefits based on participation.
maybe call it "sitetorrent" or such.
and this is actually an original idea i thought of trying to get some friends of mine and myself to code 2 years ago, but I had neither the experience nor the time to work on it. Then someone showed me bittorrent about 2 months ago and I was like "holy shit that's exactly what my product was going to be sans user participation"
Oh, and you can't steal my idea, i provide it free to the public today 6/2/03, as a business application given freely and documented.
Buzz OUT!
Re:My question... (Score:5, Interesting)
Anyway... do any of you torrent gurus know how to change a tracker? For example, say you have 80
Viewing the plain text of the
Quote:
d8:announce37:http://f.scarywater.net:8080/anno
End Quote
After this prelude of text, the rest of the
Can this plain text be edited so all the tracker files do not have to be rebuilt?
Davak
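A hedged Python sketch of one way to approach this: the metadata is a bencoded dictionary, and each string in it is prefixed with its byte length (the number before the colon in the quote above). Hand-editing the URL in a text editor is therefore risky, since the length prefix has to match and the rest of the file holds binary hashes. The announce key sits outside the info dictionary, so rewriting it leaves the piece hashes alone. The decoder and encoder below are minimal, and change_tracker, the sample metadata, and both example.com URLs are made up for illustration.

```python
def bdecode(data, i=0):
    """Decode one bencoded value from bytes; return (value, next_index)."""
    c = data[i:i + 1]
    if c == b"i":                          # integer: i<digits>e
        end = data.index(b"e", i)
        return int(data[i + 1:end]), end + 1
    if c == b"l":                          # list: l<items>e
        i += 1
        items = []
        while data[i:i + 1] != b"e":
            item, i = bdecode(data, i)
            items.append(item)
        return items, i + 1
    if c == b"d":                          # dictionary: d<key><value>...e
        i += 1
        result = {}
        while data[i:i + 1] != b"e":
            key, i = bdecode(data, i)
            value, i = bdecode(data, i)
            result[key] = value
        return result, i + 1
    colon = data.index(b":", i)            # string: <length>:<bytes>
    length = int(data[i:colon])
    start = colon + 1
    return data[start:start + length], start + length


def bencode(value):
    """Encode ints, byte strings, lists and dicts back into bencoding."""
    if isinstance(value, int):
        return b"i%de" % value
    if isinstance(value, bytes):
        return b"%d:%s" % (len(value), value)
    if isinstance(value, list):
        return b"l" + b"".join(bencode(v) for v in value) + b"e"
    if isinstance(value, dict):
        # Dictionary keys are byte strings and are written in sorted order.
        return b"d" + b"".join(
            bencode(k) + bencode(value[k]) for k in sorted(value)) + b"e"
    raise TypeError("cannot bencode %r" % type(value))


def change_tracker(torrent_bytes, new_announce):
    """Return new metadata bytes with only the announce URL replaced."""
    meta, _ = bdecode(torrent_bytes)
    meta[b"announce"] = new_announce.encode("ascii")
    return bencode(meta)


# Tiny made-up metadata standing in for a real file's contents.
sample = bencode({
    b"announce": b"http://old.example.com/announce",
    b"info": {b"length": 10, b"name": b"a.txt"},
})
changed = change_tracker(sample, "http://new.example.com:8080/announce")
print(changed)
```

On a real file you would read the bytes in binary mode, call change_tracker, and write the result to a new file, keeping the original as a backup.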
Re:*just* functional glue (Score:5, Interesting)
apt-get (Score:4, Interesting)
my sources in the community tell me that the apt-get guys are busy incorporating P2P into the latest version of apt-get in order to extend the availability of rare debian packages and to lessen the load on the central debian servers, which are frequently crashing under their present heavy load.
Re:My question... (Score:5, Interesting)
Jasin Natael
Re:apt-get (Score:4, Interesting)
if you apt-get the latest apt-get beta (assuming you have apt-get in the first place :) and libBitTorrent, apt-get will check for other peers that are downloading the files, and share from them.
BTW - the central server is frequently crashing due to kernel panics. Ingo is looking into the problem with the token buffer allocation scheme, but it may also be hardware problems with the eMachines we use.
Re:Queue the whiners (Score:5, Interesting)
I have to agree with you. Good luck to Bram.
I've only used it once now. When I dl'd the release of enemy territory I had corruption in the file from some regular dl site. While reading slashdot someone mentioned having the same problem, and someone else said to point BT at the corrupted file. Lo and behold, it fixed the file for me.
I was impressed. I think I'll be trying BT more often now though.
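The repair works because of piece checking: the torrent metadata lists a SHA-1 hash for every fixed-size piece of the file, so a client can tell which pieces of an existing file are bad and fetch only those. A small Python sketch of that check, with stand-in data and a tiny piece size chosen for illustration (piece_hashes and bad_pieces are hypothetical helpers, not the client's own code):

```python
import hashlib


def piece_hashes(data, piece_length):
    """SHA-1 digest of each piece_length-sized piece of data."""
    return [hashlib.sha1(data[i:i + piece_length]).digest()
            for i in range(0, len(data), piece_length)]


def bad_pieces(data, expected_hashes, piece_length):
    """Indexes of pieces whose SHA-1 does not match the expected hash."""
    actual = piece_hashes(data, piece_length)
    return [i for i, (got, want) in enumerate(zip(actual, expected_hashes))
            if got != want]


original = bytes(range(256)) * 4            # 1024 bytes of stand-in data
expected = piece_hashes(original, 256)      # what the metadata would list

corrupted = bytearray(original)
corrupted[600] ^= 0xFF                      # flip one byte inside piece 2
damaged = bad_pieces(bytes(corrupted), expected, 256)
print(damaged)
```

Only the pieces reported as bad need to be downloaded again, which is why a mostly intact file can be fixed quickly.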
Commercial uses (Score:3, Interesting)
You can do exactly that. ;)
./btdownloadheadless (Score:4, Interesting)
What I do is put the source file onto the server, create the
Why not Python? (Score:5, Interesting)
- Security. This is a server, so buffer overflows and memory allocation errors are not acceptable.
- Readability. Bram expressed a strong interest in getting more developers involved, making readability essential.
- Platform neutrality.
Other languages cover some of these requirements too, of course. But Python is a great choice.
As for reducing the slashdot effect using a distributed mechanism, I'd like to see something like this: Slashdot runs a BitTorrent server and provides a "package" for every story. Users run a small local HTTP server that fetches web pages from Slashdot story packages, downloaded via BitTorrent. Slashdot lets users set a preference that converts all front page URLs to fetch from the local HTTP server instead of the real site.
The net effect is Slashdot provides a "cache" without actually using up bandwidth. We wouldn't even have to change the BitTorrent protocol. Slashdotters unite!
Re:A good project. (Score:4, Interesting)
The site is having processing power issues, but seems to be holding up "ok". It's a great place to get some good shows from, though.
It's the implementation not the protocol. (Score:4, Interesting)
Redundancy (Score:5, Interesting)
sigh..
I don't think he gets it. First, we've already discussed the virtues/sins of DNS round-robin. But basically, when DNS round-robin doesn't solve your problem, you have to go to Big-IP. Which means 'free' tracker sites will need complex setup for failover/redundancy.
If the tracker itself had this built in, I propose it could do it more efficiently, and with less setup hassle. Imagine being able to set up a mirror by simply having the admin place your new "cluster-able" tracker IP:Port on an approved mirror list. The main tracker could refer clients to a mirror after behind-the-scenes communication to determine which mirror has the least load.
A step below this, but better than DNS round-robin, would be to give the client an array of tracker addresses. This is better than DNS because you don't get the combination of a stalled server and a cached DNS record causing inaccessibility. The clients could try connections randomly to the servers in the array, which would prevent cached DNS records from altering the distribution.
-Malakai
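The client-side tracker array proposed above could be sketched in Python roughly as follows. This is only an illustration of the proposal, not something the protocol or any client is stated to support here; announce_with_failover, the try_announce callback, and the tracker URLs are all hypothetical.

```python
import random


def announce_with_failover(trackers, try_announce, rng=random):
    """Try trackers in random order; return the first one that answers.

    trackers is a list of tracker URLs. try_announce(url) is a
    caller-supplied function that returns True on success and False (or
    raises OSError) when the tracker is unreachable. Both are hypothetical.
    """
    order = list(trackers)
    rng.shuffle(order)          # spread load without relying on DNS
    for url in order:
        try:
            if try_announce(url):
                return url
        except OSError:
            pass                # treat connection errors as "tracker down"
    return None                 # every tracker in the list failed


# Simulated outage: tracker1 is down, the other two are up.
down = {"http://tracker1.example.com/announce"}
trackers = [
    "http://tracker1.example.com/announce",
    "http://tracker2.example.com/announce",
    "http://tracker3.example.com/announce",
]
chosen = announce_with_failover(trackers, lambda url: url not in down)
print(chosen)
```

A client using a list like this skips a dead tracker on the spot instead of waiting for a cached DNS record to expire.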
Re:Why Python? (Score:1, Interesting)
this program is bound by the rate of i/o to the net.
a python program is plenty fast enough to handle some buffer copying
rewriting this in java would just be crazy and a waste of time. So, I assume someone is hard at work on it right now
Re:My question... (Score:3, Interesting)
Re:sarting a business (Score:4, Interesting)
I think a "startup" nowadays needs to go ahead and have a sellable software product in hand before expecting to go anywhere, much as a startup free software product needs to have something that does usable work before it will attract a developer community.
The only thing that would concern me about this business model is that bandwidth prices are kind of artificially inflated right now because of really crappy leadership by our Federal government. If any FCC administration ever figured out what they were doing, or suddenly had an attack of ethics and remembered that they're supposed to serve the people rather than corporate interests, the bandwidth situation could significantly improve, which would lower (albeit not eliminate) the need for BitTorrent technology at the corporate level. There may be a relatively narrow window where this sort of thing is economically viable (as opposed to useful; they are not the same thing at all!). Still, said "relatively narrow window" in all likelihood is at least three or four years (I can't imagine the bandwidth situation being sorted out on a large scale in any lesser time period) and you can still make a respectable amount of money in that time, plus you have that time to refine the product into something that may be able to continue to be usable even after market conditions change.
Swarm a Media Stream (Score:4, Interesting)
Radio on the net, video on the net...the problem is the multiplying lag factor. You need to organize the swarm into tiers, by lag. Tough but doable. Add support for IP broadcast, where available...
Re:Why not Python? (Score:3, Interesting)
As for CPU usage, I was referring to the server side trackers, which have already seen slashdottings. While that's probably also a network issue, I was thinking about when you are running dozens and dozens of trackers. A snowflake weighs next to nothing but snow can collapse a house.
Despite its advantages (I like Python a lot) I'm not sure it will ever be a mainstream high volume server language. Yes for client side it's fine, and the benefits far outweigh the disadvantages, but if BitTorrent becomes as popular as Bram wants it to be, a server side interpreted language seems dubious at best.
Right now it's fine because there is still a lot of development to do, but eventually performance will take a big hit and it will have to be ported.
I even think the oncoming java version will be significantly more scalable than python but we'll see.
Also, another thing about the Python client is that it requires Python and third party GUI libraries (at least it did on my linux build). Not everyone has Python, and the GUI libraries caused some conflicts on my machine. Python marketshare would be the only reason in my mind to move the client to java or c.
Re:Slight lack of vision (Score:3, Interesting)
Typically, you hear that they are afraid it takes too much bandwidth, but I think the major factor in the majority of cases is straight up addiction to control that goes beyond logic.
Logically, it doesn't make much sense to try to control caching of your web site. I mean why publish something on the web if you don't want people to see it and why should you care if they are connected to your server when they do so. But many sites make a big deal about it despite the fact that there's no logical reason why it shouldn't be done.
Many web sites also think they can make money selling archive access. Those sites would never go for it which means you'd have to have permissions. That doesn't make it impossible. It's still doable, but it would be tricky and not for technical reasons as much as the desire to pursue the illusion of control.
Re:My question... (Score:3, Interesting)
So the question really is "Why doesn't someone else create a torrent site for all that crap?"
Oh, wait, they did [google.com].
Bittorrent should get ALOT better.... sharaza (Score:3, Interesting)
Re:Redundancy (Score:3, Interesting)
Still, the point remains... RRDNS is a truly bad solution to distributed/redundant servers. When one of the servers dies, 1/n of the clients or 1/n of the time still try the down server, on average for half of the cache timeout.
Re:A good project. (Score:3, Interesting)
You have an asymmetric (unequal upload and download) connection. Unless there are tonnes of seeds, your download rate and upload rate will be relatively similar. Because you can only upload 128Kb, you'll almost never get 3 Mb download. This is the inherent nature of bittorrent - it simply works better on connections where you get equal upload and download.
For bittorrent purposes, it's MUCH better to have a 1Mb/768Kb connection as opposed to a 3Mb/128Kb connection.
Re:My question... (Score:5, Interesting)
I'm rather convinced that either (1) the