The Birth of DHT, May 2005
When BitTorrent started in 2002, decentralization was one of its main innovations. The central structure of services like Napster ultimately led to their downfall, and while decentralized systems such as eDonkey/eMule and Gnutella existed, they were often cumbersome and filled with fakes and spam.
BitTorrent was also somewhat individualized. Clients only dealt with clients on the swarms they were interested in, and all conducted business through a tracker.
This caused problems when trackers went down, though, because the tracker was the only way for peers to learn about others in the swarm. There was no fallback except adding more trackers and hoping everyone else added the same ones. With the launch of Distributed Hash Tables (DHT), however, these problems were all but over.
That two similar but incompatible DHT systems were launched within weeks of each other is quite surprising, given the history behind both. To this day the two systems (one Vuze, one Mainline) remain incompatible, although there are plug-ins that let a client use both and so act as a bridge between the two.
When you factor in that both were released just months after eXeem had tried and failed to do a similar thing (earning significant criticism while doing so) the success and longevity of both look even more impressive. But how did they come about?
The Vuze DHT debuted first, with version 2.3.0 of the Azureus client on May 2, 2005. In their announcements at the time, the developers were keen to stress the difference from eXeem, stating that it was a decentralized layer on top of BitTorrent, rather than a decentralized BitTorrent system itself. Within 24 hours there were more than 200,000 peers, and there are currently around 1.1 million peers on the network.
According to Paul Gardener, the main developer of the Azureus DHT system, tracker redundancy wasn’t the main reason behind its development. Instead, decentralization for search was driving it.
“That was one of my pet aims when I joined the Azureus development team,” Gardener told TF earlier this month. “But the others in the team weren’t sure if search was a priority, so I found a way of working on some decentralization that perhaps one day could evolve into/be adapted for search. Of course decentralized tracking was a good aim in itself.”
“I started from scratch,” Gardener recalls, “there weren’t any libraries out there I could use, so had to figure out which kind of DHT to use (Kademlia) etc. [It took] a few months I guess.”
Three weeks later, BitTorrent Inc. released its own DHT in version 4.1 of its client. BitComet, a popular client at the time, adopted it in early June, and other clients followed soon after.
While the timing may suggest otherwise, BitTorrent’s DHT wasn’t a response to Vuze’s release at all, as the person responsible – Drue Loewenstern – had been working on it since 2002.
“I started working on the DHT in the summer of 2002 after making the first Mac BitTorrent clients, a year before Azureus was established on Sourceforge. Finishing it off and integration into BitTorrent started in 05 when BT became a company. I was in testing and about to release it when Azureus launched theirs,” Loewenstern says.
The inspiration for the BitTorrent mainline DHT came from an unlikely and famous source: Aaron Swartz.
“Distributed hash tables were an inspiring area of research. I was really into P2P, having just worked on MojoNation and BitTorrent, and wanted to do all sorts of cool decentralized things like trust metrics. Aaron Swartz, 15 at the time, circulated a one page implementation of the Chord algorithm and I was struck by its simplicity,” Loewenstern notes.
“I started looking into DHTs specifically and Kademlia was the first DHT paper that really clicked with me and seemed like it might work in the real world. So I decided to start implementing it without really knowing what I was going to do with it.”
In contrast to the Vuze DHT, redundancy was one of the main motivations driving the development of the mainline DHT.
“In the case of BitTorrent, the goal of the DHT has always been to make BT more robust, to improve performance by finding more peers, and to simplify publishing by making a tracker optional,” Loewenstern says.
Of course, not everyone was thrilled to see the introduction of DHT. Private trackers were opposed to DHT as it enabled people to use the site’s torrents without being under the strict control of the tracker admins.
The solution to this was a form of access control called the private flag, which disabled DHT along with Peer Exchange (PEX) and restricted peer discovery to the torrent’s trackers, effectively freezing things the way they worked before DHT arrived in 2005.
The flag sits inside the data used to generate the torrent’s hash, so switching it on or off changes the overall hash. A torrent with the flag enabled is therefore a completely separate swarm from an otherwise identical one with the flag disabled. It also gave these sites a new way to market themselves, by taking the term “private flagged torrent trackers” and condensing it to “private trackers,” implying some form of privacy.
This move, though, was not by choice.
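To see why the flag splits swarms, here is a minimal Python sketch rather than any client's actual code. It assumes the usual layout in which the torrent hash is the SHA-1 of the bencoded "info" dictionary and the private flag is stored there as the integer key `private`. The helper names (`bencode`, `info_hash`) and the example file details are made up for illustration, and the all-zero piece hashes are placeholders.

```python
import hashlib


def bencode(value):
    """Minimal bencoder for ints, strings/bytes, lists and dicts (illustrative only)."""
    if isinstance(value, int):
        return b"i" + str(value).encode("ascii") + b"e"
    if isinstance(value, str):
        value = value.encode("utf-8")
    if isinstance(value, bytes):
        return str(len(value)).encode("ascii") + b":" + value
    if isinstance(value, list):
        return b"l" + b"".join(bencode(item) for item in value) + b"e"
    if isinstance(value, dict):
        # Dictionary keys are byte strings and must be written in sorted order.
        pairs = [(k.encode("utf-8") if isinstance(k, str) else k, v)
                 for k, v in value.items()]
        pairs.sort(key=lambda pair: pair[0])
        return b"d" + b"".join(bencode(k) + bencode(v) for k, v in pairs) + b"e"
    raise TypeError(f"cannot bencode {type(value).__name__}")


def info_hash(info):
    """SHA-1 of the bencoded info dictionary, as a hex string."""
    return hashlib.sha1(bencode(info)).hexdigest()


# Hypothetical single-file torrent: 1 MiB split into four 256 KiB pieces.
public_info = {
    "name": "example.iso",
    "length": 1048576,
    "piece length": 262144,
    "pieces": b"\x00" * 80,  # placeholder: 4 pieces x 20-byte SHA-1 each
}

# The same torrent with the private flag set inside the info dictionary.
private_info = dict(public_info, private=1)

print("public :", info_hash(public_info))
print("private:", info_hash(private_info))
```

The two printed hashes differ even though the files described are identical, which is why peers holding the flagged and unflagged versions never end up in the same swarm.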
“There’s always been tensions between clients and private trackers,” Vuze’s Gardener says. “In particular they like to ban Vuze because it is ‘open source and people have hacked it to report incorrect stats’ or other such ‘reasons’. I’ve never been a fan of [the private flag] as a solution.”
“It came to be because some index site operators enforce upload/download ratios in an effort to keep seeders around for torrents that nobody wants to be left holding the bag for by seeding. They thought DHT (and PEX) might let users bypass the ratio system so they made a lot of noise about banning clients that implemented DHT,” Loewenstern says.
“Azureus didn’t want to get banned so they came up with the private flag and added it to their client. It wasn’t my decision to add it to BitTorrent. Without PEX, torrents take longer to ramp up so it annoys me when people upload private torrents to public index sites.”