docs/paper/paper.tex

   1 \documentclass[conference]{IEEEtran}
   2
   3 %% INFOCOM addition:
   4 \makeatletter
   5 \def\ps@headings{%
   6 \def\@oddhead{\mbox{}\scriptsize\rightmark \hfil \thepage}%
   7 \def\@evenhead{\scriptsize\thepage \hfil\leftmark\mbox{}}%
   8 \def\@oddfoot{}%
   9 \def\@evenfoot{}}
  10 \makeatother
  11 \pagestyle{headings}
  12
  13 \usepackage[noadjust]{cite}
  14
  15 \usepackage[dvips]{graphicx}
  16
  17 \usepackage{url}
  18
  19 \begin{document}
  20
  21 \title{Peer-to-Peer Distribution of Free Software Packages}
  22 \author{\IEEEauthorblockN{Cameron Dale}
  23 \IEEEauthorblockA{School of Computing Science\\
  24 Simon Fraser University\\
  25 Burnaby, British Columbia, Canada\\
  26 Email: camerond@cs.sfu.ca}
  27 \and
  28 \IEEEauthorblockN{Jiangchuan Liu}
  29 \IEEEauthorblockA{School of Computing Science\\
  30 Simon Fraser University\\
  31 Burnaby, British Columbia, Canada\\
  32 Email: jcliu@cs.sfu.ca}}
  33
  34 \maketitle
  35
  36 \begin{abstract}
  37 A large amount of free software packages are available over the
  38 Internet from many different distributors. Most of these
  39 distributors use the traditional client-server model to handle
  40 requests from users. However, there is an excellent opportunity to
  41 use peer-to-peer techniques to reduce the cost of much of this
  42 distribution, especially due to the altruistic nature of many of
  43 these users. There are no existing solution suitable for this
  44 situation, so we present a new technique for satisfying the needs of
  45 this P2P distribution, which is generally applicable to many of
  46 these distributors' systems. Our method makes use of a DHT for
  47 storing the location of peers, using the cryptographic hash of the
  48 package as a key. To show the simplicity and functionality, we
  49 implement a solution for the distribution of Debian software
  50 packages, including the many DHT customizations needed. Finally, we
  51 analyze our system to determine how it is performing and what effect
  52 it is having.
  53 \end{abstract}
  54
  55 %%%%%%%%%%%%%%%%%%%%%%%%%%%  Section  %%%%%%%%%%%%%%%%%%%%%%%%%%%%
  56
  57 \section{Introduction}
  58 \label{intro}
  59
  60 There are a large number of free software package distributors using
  61 package distribution systems over the Internet to distribute
  62 software to their users. These distributors have developed many
  63 different methods for this distribution, but they almost exclusively
  64 use a client-server model to satisfy user requests. The popularity
  65 and number of users results in a large number of requests, which
  66 usually requires a network of mirrors to handle. Due to the free
  67 nature of this software, many users are willing and able to
  68 contribute upload bandwidth to this distribution, but have no
  69 current way to do this.
  70
  71 We present a new peer-to-peer distribution model to meet these
  72 demands. It is based on many previous implementations of successful
  73 peer-to-peer protocols, especially distributed hash tables (DHT) and
  74 BitTorrent. The model relies on the pre-existence of cryptographic
  75 hashes of the packages, which should uniquely identify it for a
  76 request from other peers. If the peer-to-peer download fails, then
  77 the original request to the server is used as a fallback to prevent
  78 any dissatisfaction from users. The peer can then share this new
  79 package with others through the P2P system.
  80
  81 First, we examine the opportunity that is available for many of
  82 these free software package distributors. We present an overview of
  83 a system that would efficiently satisfy the demands of a large
  84 number of users, and should significantly reduce the currently
  85 substantial bandwidth requirements of hosting this software. We then
  86 present an example implementation based on the Debian package
  87 distribution system. This implementation will be used by a large
  88 number of users, and serves as an example for other free software
  89 distributors of the opportunity that can be met with such a system.
  90
  91 This paper is organized as follows. The current situation and its
  92 problems are presented in section \ref{situation}. We analyze some
  93 related solutions in section \ref{related}, and then present our
  94 solution in section \ref{opportunity}. We then detail our sample
  95 implementation for Debian-based distributions in section
  96 \ref{implementation}, including an in-depth look at our DHT
  97 customizations in section \ref{custom_dht}. We analyze our deployed
  98 sample implementation in section \ref{analysis}, suggest some future
  99 enhancements in section \ref{future}, and conclude the paper in
 100 section \ref{conclusions}.
 101
 102 %%%%%%%%%%%%%%%%%%%%%%%%%%%  Section  %%%%%%%%%%%%%%%%%%%%%%%%%%%%
 103
 104 \section{Problem/Situation}
 105 \label{situation}
 106
 107 There are a large number of groups using the Internet to distribute
 108 their free software packages. These packages are usually available
 109 from a free download site, which usually requires a network of
 110 mirrors to support the large number of requests. These distribution
 111 systems almost always support some kind of file verification,
 112 usually cryptographic hashes, to verify the completed downloads
 113 accuracy or authenticity.
 114
 115 \subsection{Examples}
 116 \label{examples}
 117
 118 Most Linux distributions use a software package management system
 119 that fetches packages to be installed from an archive of packages
 120 hosted on a network of mirrors. The Debian project, and other
 121 Debian-based distributions such as Ubuntu and Knoppix, use the
 122 \texttt{apt} (Advanced Package Tool) program, which downloads Debian
 123 packages in the \texttt{.deb} format from one of many HTTP mirrors.
 124 The program will first download index files that contain a listing
 125 of which packages are available, as well as important information
 126 such as their size, location, and a hash of their content. The user
 127 can then select which packages to install or upgrade, and
 128 \texttt{apt} will download and verify them before installing them.
 129
 130 There are also several similar frontends for the RPM-based
 131 distributions. Red Hat's Fedora project uses the \texttt{yum}
 132 program, SUSE uses \texttt{YAST}, while Mandriva has
 133 \texttt{Rpmdrake}, all of which are used to obtain RPMs from
 134 mirrors. Other distributions use tarballs (\texttt{.tar.gz} or
 135 \texttt{.tar.bz2}) to contain their packages. Gentoo's package
 136 manager is called \texttt{portage}, SlackWare Linux uses
 137 \texttt{pkgtools}, and FreeBSD has a suite of command-line tools,
 138 all of which download these tarballs from web servers.
 139
 140 Other software distributors also use a similar system. CPAN
 141 distributes software packages for the PERL
 142 programming language, using SOAP RPC requests to find and download
 143 files. Cygwin provides many of the
 144 standard Unix/Linux tools in a Windows environment, using a
 145 package management tool that requests packages from websites. There
 146 are two software distribution systems for Mac OSX, fink and
 147 MacPorts, that also retrieve packages in this way.
 148
 149 Also, some systems use direct web downloads, but with a hash
 150 verification file also available for download next to the desired
 151 file. These hash files usually have the same file name, but with an
 152 added extension identifying the hash used (e.g. \texttt{.md5} for
 153 the MD5 hash). This type of file downloading and verification is
 154 typical of free software hosting facilities that are open to anyone
 155 to use, such as SourceForge.
 156
 157 \subsection{Similarities}
 158 \label{similarities}
 159
 160 The important things to note for each of the systems mentioned in
 161 section \ref{examples}, is that they all have the following in
 162 common:
 163 \begin{itemize}
 164  \item The content is divided into distinct units (packages), each
 165        of which contains a small independent part of all the
 166        content.
 167  \item The packages are avaliable for anyone to freely download.
 168  \item Users are typically not interested in downloading all of the
 169        packages available.
 170  \item Hashes of the packages are available before the download is
 171        attempted.
 172  \item Requests to download the packages are sent by a tool, not
 173        directly by the user (though the tool is responding to
 174        requests from the user).
 175 \end{itemize}
 176
 177 We also expect that there are a number of users of these systems
 178 that are motivated by altruism to want to help out with this
 179 distribution. This is common in these systems due to the free nature
 180 of the software being delivered, which encourages some to want to
 181 help out in some way. A number of the systems are also used by
 182 groups that are staffed mostly, or sometimes completely, by
 183 volunteers. This also encourages users to want to give back
 184 to the volunteer community that has created the software they are
 185 using.
 186
 187 \subsection{Differences}
 188 \label{differences}
 189
 190 Although at first it might seem that a single all-reaching solution
 191 is possible for this situation, there are some differences in each
 192 system that require independent solutions. The systems all use
 193 different tools for their distribution, so any implementation would
 194 have to be specific to the tool it is to integrate with. In
 195 particular, how to obtain requests from the tool or user, and how to
 196 find a hash that corresponds to the file being requested, is very
 197 specific to each system.
 198
 199 Also, there may be no benefit in creating a single large solution to
 200 integrate all these problems. Although they sometimes distribute
 201 nearly identical packages (e.g. the same software available in
 202 multiple Linux distributions), it is not exactly identical and has
 203 usually been tailored to the system by the distributor. The small
 204 differences will change the hash of the files, and so will make it
 205 impossible to distribute similar packages across systems. And,
 206 although peer-to-peer systems can scale very well with the number of
 207 peers in the system, there is some overhead involved, so having a
 208 much larger system of peers would mean that requests could take
 209 longer to complete.
 210
 211 \subsection{Problems}
 212 \label{problems}
 213
 214 There are some attributes of software package distribution that make
 215 it difficult to use an existing peer-to-peer solution with, or to
 216 design a new solution for this need.
 217
 218 \subsubsection{Archive Dimensions}
 219
 220 The dimensions of the software archive makes it difficult to
 221 distribute efficiently. While most of the packages are very small in
 222 size, there are some that are quite large. There are too many
 223 packages to distribute each individually, but the archive is also
 224 too large to distribute in its entirety. In some archives there are
 225 also divisions of the archive into sections, e.g. by the computer
 226 architecture that the package is intended for.
 227
 228 \begin{figure}
 229 \centering
 230 \includegraphics[width=\columnwidth]{apt_p2p_simulation-size_CDF.eps}
 231 \caption{The CDF of the size of packages in a Debian system, both
 232 for the actual size and adjusted size based on the popularity of
 233 the package.}
 234 \label{size_CDF}
 235 \end{figure}
 236
 237 For example, Figure~\ref{size_CDF} shows the size of packages in the
 238 Debian distribution. While 80\% of the packages are less than
 239 512~KB, some of the packages are hundreds of megabytes. The entire
 240 archive consists of 22,298 packages and is approximately 119,000 MB
 241 in size. Some packages are divided by architecture, but there are a
 242 large number of packages that are not architecture specific and so
 243 can be installed on any computer architecture.
 244
 245 \subsubsection{Package Updates}
 246
 247 The software packages being distributed are being constantly
 248 updated. These updates could be the result of the software creators
 249 releasing a new version of the software with improved functionality,
 250 or the result of the distributor updating their packaging of the
 251 software to meet new requirements. Even if the distributor
 252 periodically makes \emph{stable} releases, which are snapshots of
 253 all the packages in the archive at a certain time, updates are still
 254 released for security issues or serious bugs.
 255
 256 \begin{figure}
 257 \centering
 258 \includegraphics[width=\columnwidth]{size-quarter.eps}
 259 \caption{The amount of data in the Debian archive that is updated
 260 each day, broken down by architecture.}
 261 \label{update_size}
 262 \end{figure}
 263
 264 For example, Figure~\ref{update_size} shows the amount of data in
 265 the Debian archive that was updated each day over a period of 3
 266 months. Each day approximately 1.5\% of the 119,000 MB archive is
 267 updated with new versions of packages.
 268
 269 \subsubsection{Limited Interest}
 270
 271 Though there are a large number of packages, and a large number of
 272 users, interest in a given version of a package could be very
 273 limited. If the package is necessary for the system (e.g. the
 274 package distribution software), then it is likely that everyone will
 275 have it installed, but there are very few of these packages. Most
 276 packages fall in the category of optional or extra, and so are
 277 interesting to only a limited number of people.
 278
 279 \begin{figure}
 280 \centering
 281 \includegraphics[width=\columnwidth]{apt_p2p_popularity-cdf.eps}
 282 \caption{The CDF of the popularity of packages in a Debian system.}
 283 \label{popularity_CDF}
 284 \end{figure}
 285
 286 For example, the Debian distribution tracks the popularity of its
 287 packages using popcon \cite{popcon}. Figure~\ref{popularity_CDF}
 288 shows the cumulative distribution function of the percentage of all
 289 users who install each package. Though some packages are installed
 290 by everyone, 80\% of the packages are installed by less than 1\% of
 291 users.
 292
 293 \subsubsection{Interactive Users}
 294
 295 The package management software that downloads packages displays
 296 some kind of indication of speed and completeness for users to
 297 watch. Since previous client-server downloads occur in a sequential
 298 fashion, the package management software also measures the speed
 299 based on sequential downloading. This requires the peer-to-peer
 300 solution to be reasonably responsive at retrieving packages
 301 sequentially.
 302
 303 %%%%%%%%%%%%%%%%%%%%%%%%%%%  Section  %%%%%%%%%%%%%%%%%%%%%%%%%%%%
 304
 305 \section{Related Work/Existing Solutions}
 306 \label{related}
 307
 308 There are many peer-to-peer implementations available today, but
 309 none is very well suited to this specific problem.
 310
 311 \subsection{BitTorrent}
 312 \label{bittorrent}
 313
 314 Many distributors make their software available in some form using
 315 BitTorrent \cite{COHEN03}, e.g. for the distribution of CD
 316 images. This is not an ideal situation though, as it requires the
 317 peers to download large numbers of packages that they are not
 318 interested in, and prevents them from updating to new packages
 319 without downloading another image containing a lot of the same
 320 packages they already have. Unfortunately, BitTorrent is not ideally
 321 suited to an application such as this one.
 322
 323 First, there is no obvious way to divide the packages into torrents.
 324 Most of the packages are too small, and there are too many packages
 325 in the entire archive, to create individual torrents for each one.
 326 However, all the packages together are too large to track
 327 efficiently as a single torrent. Some division of the archive's
 328 packages into torrents is obviously necessary, but wherever that
 329 split occurs it will cause either some duplication of connections,
 330 or prevent some peers from connecting to others who do have the same
 331 content. Also, a small number of the packages can be updated every
 332 day which would add new files to the torrent, thereby changing its
 333 \emph{infohash} identifier and making it a new torrent. This will
 334 severely fracture the download population, even though peers in the
 335 new torrent share 99\% of the packages in common with peers in the
 336 old torrent.
 337
 338 Other issues also prevent BitTorrent from being a good solution to
 339 this problem. BitTorrent's fixed piece sizes that disregard file
 340 boundaries are bigger than many of the packages in the archive. This
 341 will waste peers' downloading bandwidth as they will end up
 342 downloading parts of other packages just to get the piece that
 343 contains the package they do want. Incentives to share (upload) are
 344 no longer needed, as the software is freely available for anyone to
 345 download without sharing. Seeds are also not needed as the mirrors
 346 can serve in that capacity. Finally, BitTorrent downloads files
 347 randomly, which does not work well with the interactive package
 348 management tools expectation of sequential downloads.
 349
 350 \subsection{Other Solutions}
 351 \label{others}
 352
 353 There have also been other attempted implementations, usually based
 354 on some form of BitTorrent, to address this problem. Some other
 355 implementations have used BitTorrent almost unmodified, while others
 356 have only looked at using DHTs to replace the tracker in a
 357 BitTorrent system. apt-torrent \cite{apttorrent} creates torrents
 358 for some of the larger packages available, but it ignores the
 359 smaller packages, which are often the most popular. DebTorrent
 360 \cite{debtorrent} makes widespread modifications to a traditional
 361 BitTorrent client, to try and fix the drawbacks mentioned in section
 362 \ref{bittorrent}. However, these changes also require some
 363 modifications to the distribution system to support it.
 364
 365 Others have also used DHTs to support this type of functionality.
 366 Kenosis \cite{kenosis} is a P2P Remote Procedure Call
 367 client also based on the Kademlia DHT, but it is meant as a P2P
 368 primitive system on which other tools can be built, and so it has no
 369 file sharing functionality at all. Many have used a DHT as a drop-in
 370 replacement for the tracker in a BitTorrent system
 371 \cite{bittorrent-dht, azureus-dht}, but such systems only use the
 372 DHT to find peers for the same torrent, so the file sharing uses
 373 traditional BitTorrent and so is not ideal for the reasons listed
 374 above.
 375
 376 %%%%%%%%%%%%%%%%%%%%%%%%%%%  Section  %%%%%%%%%%%%%%%%%%%%%%%%%%%%
 377
 378 \section{Opportunity/Our Proposal}
 379 \label{opportunity}
 380
 381 The situation described in section \ref{situation} presents a clear
 382 opportunity to use some form of peer-to-peer file-sharing protocol
 383 to allow willing users to contribute upload bandwidth. This sparse
 384 interest in a large number of packages undergoing constant updating
 385 is well suited to the functionality provided by a Distributed Hash
 386 Table (DHT). DHTs require unique keys to store and retrieve strings
 387 of data, for which the cryptographic hashes used by these package
 388 management systems are perfect for. The stored and retrieved strings
 389 can then be pointers to the peers that have the package that hashes
 390 to that key. A downloading peer can lookup the package hash in the
 391 DHT and, if it is found, download the file from those peers and
 392 verify the package with the hash. Once the download is complete, the
 393 peer will add its entry to the DHT indicating that it now has the
 394 package.
 395
 396 The fact that this package is also available to download for free
 397 from a server is very important to our proposal. If the package hash
 398 can not be found in the DHT, the peer can then fallback to
 399 downloading from the original location (i.e. the network of
 400 mirrors). The mirrors thus, with no modification to their
 401 functionality, serve as seeds for the packages in the peer-to-peer
 402 system. Any packages that have just been updated, or that are very
 403 rare, and so don't have any peers available can always be found on
 404 the mirror. Once the peer has completed the download from the mirror
 405 and verified the package, it can then add itself to the DHT as the
 406 first peer for the new package, so that future requests from peers
 407 will not need the mirror.
 408
 409 The trust of the package is also always guaranteed through the use
 410 of the cryptographic hashes. Nothing can be downloaded from a peer
 411 until the hash is looked up in the DHT, so a hash must first come
 412 from a trusted source (i.e. a mirror). Most distributors use index
 413 files that contain hashes for a large number of the packages in
 414 their archive, and which are also hashed. After retrieving the
 415 index's hash from the mirror, the index file can be downloaded from
 416 peers and verified. Then the program has access to all the hashes of
 417 the packages it will be downloading, all of which can be verified
 418 with a \emph{chain of trust} that stretches back to the original
 419 distributor's server.
 420
 421 \subsection{Implementation Options}
 422 \label{imp_options}
 423
 424 There are several ways to implement the desired P2P functionality
 425 into the existing package management software. The functionality can
 426 be directly integrated through modifications to the software, though
 427 this could be difficult as the P2P functionality should be running
 428 at all times. This is needed both for efficient lookups and to
 429 support uploading of already downloaded packages. Unfortunately, the
 430 package management tools typically only run until the download and
 431 install request is complete.
 432
 433 Many of the package management software implementations use HTTP
 434 requests from web servers to download the packages, which makes it
 435 possible to implement the P2P aspect as an almost standard HTTP
 436 caching proxy. This proxy will run as a daemon in the background,
 437 listening for requests from the package management tool for package
 438 files. It will get uncached requests first from the P2P system, or
 439 falling back to the normal HTTP request from a server should it not
 440 be found. For methods that don't use HTTP requests, other types of
 441 proxies may also be possible.
 442
 443 \subsection{Downloading From Peers}
 444 \label{downloading}
 445
 446 Although not necessary, we recommend implementing a download
 447 protocol that is similar to the protocol used to fetch packages from
 448 the distributor's servers. This simplifies the P2P program, as it
 449 can then treat peers and mirrors almost identically when requesting
 450 packages. In fact, the mirrors can be used when there are only a few
 451 slow peers available for a file to help speed up the download
 452 process.
 453
 454 Downloading a file efficiently from a number of peers is where
 455 BitTorrent shines as a peer-to-peer application. Its method of
 456 breaking up larger files into pieces, each with its own hash,
 457 makes it very easy to parallelize the downloading process and
 458 maximize the download speed. For very small packages (i.e. less than
 459 the piece size), this parallel downloading is not necessary, or
 460 even desirable. However, this method should still be used, in
 461 conjunction with the DHT, for the larger packages that are
 462 available.
 463
 464 Since the package management system only stores a hash of the entire
 465 package, and not of pieces of that package, we will need to be able
 466 to store and retrieve these piece hashes using the P2P protocol. In
 467 addition to storing the file download location in the DHT (which
 468 would still be used for small files), a peer will store a
 469 \emph{torrent string} containing the peer's hashes of the pieces of
 470 the larger files. These piece hashes could be compared ahead of time
 471 to determine which peers have the same piece hashes (they all
 472 should), and then used during the download to verify the pieces of
 473 the downloaded package.
 474
 475 %%%%%%%%%%%%%%%%%%%%%%%%%%%  Section  %%%%%%%%%%%%%%%%%%%%%%%%%%%%
 476
 477 \section{Sample Implementation/Implementation}
 478 \label{implementation}
 479
 480 We have created a sample implementation that functions as described
 481 in section \ref{opportunity}, and is freely available for other
 482 distributors to download and modify \cite{apt-p2p}. This software,
 483 called \texttt{apt-p2p}, interacts with the \texttt{apt} tool, which
 484 is found in most Debian-based Linux distributions. \texttt{apt} uses
 485 SHA1 hashes to verify most downloaded files, including the large
 486 index files that contain the hashes of the individual packages. We
 487 chose this distribution system as it is familiar to us, it allows
 488 software contributions, and there are interesting statistics
 489 available for analyzing the popularity of the software packages
 490 \cite{popcon}.
 491
 492 Since all requests from apt are in the form of HTTP downloads from a
 493 server, the implementation takes the form of a caching HTTP proxy.
 494 Making a standard \texttt{apt} implementation use the proxy is then
 495 as simple as prepending the proxy location and port to the front of
 496 the mirror name in \texttt{apt}'s configuration file (i.e.
 497 ``localhost:9977/mirrorname.debian.org/\ldots'').
 498
 499 We created a customized DHT based on Khashmir \cite{khashmir}, which
 500 is an implementation of Kademlia \cite{kademlia} using methods
 501 familiar to BitTorrent developers. Khashmir is also the same DHT
 502 implementation used by most of the existing BitTorrent clients to
 503 implement trackerless operation. The communication is all handled by
 504 UDP messages, and RPC (remote procedure call) requests and responses
 505 are all \emph{bencoded} in the same way as BitTorrent's
 506 \texttt{.torrent} files. Khashmir uses the high-level Twisted
 507 event-driven networking engine \cite{twisted}, so we also use
 508 Twisted in our sample implementation for all other networking needs.
 509 More details of this customized DHT can be found below in section
 510 \ref{custom_dht}.
 511
 512 Downloading is accomplished by sending simple HTTP requests to the
 513 peers identified by lookups in the DHT to have the desired file.
 514 Requests for a package are made using the package's hash, properly
 515 encoded, as the URL to request from the peer. The HTTP server used
 516 for the proxy also doubles as the server listening for requests for
 517 downloads from other peers. All peers support HTTP/1.1, both in the
 518 server and the client, which allows for pipelining of multiple
 519 requests to a peer, and the requesting of smaller pieces of a large
 520 file using the Range request header.
 521
 522 %%%%%%%%%%%%%%%%%%%%%%%%%%%  Section  %%%%%%%%%%%%%%%%%%%%%%%%%%%%
 523
 524 \section{Customized DHT}
 525 \label{custom_dht}
 526
 527 A large contribution of our work is in the customization and use of
 528 a Distributed Hash Table (DHT). Although our DHT is based on
 529 Kademlia, we have made many improvements to it to make it suitable
 530 for this application. In addition to a novel storage technique to
 531 support piece hashes, we have improved the response time of looking
 532 up queries, allowed the storage of multiple values for each key, and
 533 incorporated some improvements from BitTorrent's tracker-less DHT
 534 implementation.
 535
 536 \subsection{Kademlia Background}
 537 \label{kademlia}
 538
 539 The Kademlia DHT, like most other DHTs, assigns IDs to peers from
 540 the same space that is used for keys. The peers with IDs closest to
 541 the desired key will then store the values for that key. Lookups are
 542 recursive, as nodes are queried in each step that are exponentially
 543 closer to the key than in the previous step.
 544
 545 Nodes in a Kademlia system support four primitive requests.
 546 \texttt{ping} will cause a peer to return nothing, and is only used
 547 to determine if a node is still alive, while \texttt{store} tells a
 548 node to store a value associated with a given key. The most
 549 important primitives are \texttt{find\_node} and
 550 \texttt{find\_value}, which both function recursively to find nodes
 551 close to a key. The queried nodes will return a list of the nodes
 552 they know about that are closest to the key, allowing the querying
 553 node to quickly traverse the DHT to find the nodes close to the
 554 desired key. The only difference between them is that the
 555 \texttt{find\_value} query will cause a node to return a value, if
 556 it has one for that key, instead of a list of nodes.
 557
 558 \subsection{Piece Hash Storage}
 559 \label{pieces}
 560
 561 Hashes of pieces of the larger package files are needed to support
 562 their efficient downloading from multiple peers.
 563 For large files (5 or more pieces), the torrent strings described in
 564 section \ref{downloading}
 565 are too long to store with the peer's download info in the DHT. This
 566 is due to the limitation that a single UDP packet should be less
 567 than 1472 bytes to avoid fragmentation.
 568 % For example, for a file of 4 pieces, the peer would need to store a
 569 % value 120 bytes in size (IP address, port, four 20-byte pieces and
 570 % some overhead), which would limit a return value from including more
 571 % than 10 values to a request.
 572
 573 Instead, the peers will store the torrent string for large files
 574 separately in the DHT, and only contain a reference to it in their
 575 stored value for the hash of the file. The reference is an SHA1 hash
 576 of the entire concatenated length of the torrent string. If the
 577 torrent string is short enough to store in the DHT (i.e. less than
 578 1472 bytes, or about 70 pieces for the SHA1 hash), then a lookup of
 579 that hash in the DHT will return the torrent string. Otherwise, a
 580 request to the peer for the hash (using the same method as file
 581 downloads, i.e. HTTP), will cause the peer to return the torrent
 582 string.
 583
 584 Figure \ref{size_CDF} shows the package size of the 22,298 packages
 585 available in Debian in January 2008. We can see that most of the
 586 packages are quite small, and so most will therefore not require
 587 piece hash information to download. We have chosen a piece
 588 size of 512 kB, which means that 17,515 (78\%) of the packages will
 589 not require this information. There are 3054 packages that will
 590 require 2 to 4 pieces, for which the torrent string can be stored
 591 directly with the package hash in the DHT. There are 1667 packages
 592 that will require a separate lookup in the DHT for the longer
 593 torrent string as they require 5 to 70 pieces. Finally, there are
 594 only 62 packages that require more than 70 pieces, and so will
 595 require a separate request to a peer for the torrent string.
 596
 597 \subsection{Response Time}
 598 \label{response_time}
 599
 600 Most of our customizations to the DHT have been to try and improve
 601 the time of the recursive \texttt{find\_value} requests, as this can
 602 cause long delays for the user waiting for a package download to
 603 begin. The one problem that slows down such requests is waiting for
 604 timeouts to occur before marking the node as failed and moving on.
 605
 606 Our first improvement is to retransmit a request multiple times
 607 before a timeout occurs, in case the original request or its
 608 response was lost by the unreliable UDP protocol. If it does not
 609 receive a response, the requesting node will retransmit the request
 610 after a short delay. This delay will increase exponentially for
 611 later retransmissions, should the request again fail. Our current
 612 implementation will retransmit the request after 2 seconds and 6
 613 seconds (4 seconds after the first retransmission), and then timeout
 614 after 9 seconds.
 615
 616 We have also added some improvements to the recursive
 617 \texttt{find\_node} and \texttt{find\_value} queries to speed up the
 618 process when nodes fail. If enough nodes have responded to the
 619 current query such that there are many new nodes to query that are
 620 closer to the desired key, then a stalled request to a node further
 621 away will be dropped in favor of a new request to a closer node.
 622 This has the effect of leap-frogging unresponsive nodes and
 623 focussing attention on the nodes that do respond. We will also
 624 prematurely abort a query while there are still oustanding requests,
 625 if enough of the closest nodes have responded and there are no
 626 closer nodes found. This prevents a far away unresponsive node from
 627 making the query's completion wait for it to timeout.
 628
 629 Finally, we made all attempts possible to prevent firewalled and
 630 NATted nodes from being added to the routing table for future
 631 requests. Only a node that has responded to a request from us will
 632 be added to the table. If a node has only sent us a request, we
 633 attempt to send a \texttt{ping} to the node to determine if it is
 634 NATted or not. Unfortunately, due to the delays used by NATs in
 635 allowing UDP packets for a short time if one was recently sent by
 636 the NATted host, the ping is likely to succeed even if the node is
 637 NATted. We therefore also schedule a future ping to the node to make
 638 sure it is still reachable after the NATs delay has hopefully
 639 elapsed. We also schedule future pings of nodes that fail once to
 640 respond to a request, as it takes multiple failures (currently 3)
 641 before a node is removed from the routing table.
 642
 643 \subsection{Multiple Values}
 644 \label{multiple_values}
 645
 646 The original design of Kademlia specified that each keywould have
 647 only a single value associated with it. The RPC to find this value
 648 was called \texttt{find\_value} and worked similarly to
 649 \texttt{find\_node}, iteratively finding nodes with ID's closer to
 650 the desired key. However, if a node had a value stored associated
 651 with the searched for key, it would respond to the request with that
 652 value instead of the list of nodes it knows about that are closer.
 653
 654 While this works well for single values, it can cause a problem when
 655 there are multiple values. If the responding node is no longer one
 656 of the closest to the key being searched for, then the values it is
 657 returning will probably be the staler ones in the system, and it
 658 will not have the latest stored values. However, the search for
 659 closer nodes will stop here, as the queried node only returned the
 660 values and not a list of nodes to recursively query. We could have
 661 the request return both the values and the list of nodes, but that
 662 would severely limit the size and number of the values that could be
 663 returned.
 664
 665 Instead, we have broken up the original \texttt{find\_value}
 666 operation into two parts. The new \texttt{find\_value} request
 667 always returns a list of nodes that the node believes are closest to
 668 the key, as well as a number indicating the number of values that
 669 this node has for the key. Once a querying node has finished its
 670 search for nodes and found the closest ones to the key, it can issue
 671 \texttt{get\_value} requests to some nodes to actually retrieve the
 672 values they have. This allows for much more control of when and how
 673 many nodes to query for values. For example, a querying node could
 674 abort the search once it has found enough values in some nodes, or
 675 it could choose to only request values from the nodes that are
 676 closest to the key being searched for.
 677
 678 \subsection{BitTorrent's Improvements}
 679 \label{bittorrent_dht}
 680
 681 In the many years that some BitTorrent clients have been using a
 682 Kademlia based DHT for tracker-less operation, the developers have
 683 made many enhancements which we can take advantage of. One of the
 684 most important is a security feature added to stop malicious nodes
 685 from subscribing other nodes as downloaders. When a node issues a
 686 request to another node to store its download info for that key, it
 687 must include a \emph{token} returned by the storing node in a recent
 688 \texttt{find\_node} request. The token is created by hashing the IP
 689 address of the requesting peer with a temporary secret that expires
 690 after several minutes. This prevents the requesting peer from faking
 691 its IP address in the store request, since it must first receive a
 692 response from a \texttt{find\_node} on that IP.
 693
 694 We also made some BitTorrent-inspired changes to the parameters of
 695 the DHT originally specified by the authors of Kademlia. Since, as
 696 we will show later, peers stay online in our system for much longer
 697 periods of time, we reduced Kademlia's \emph{k} value from 20 to 8.
 698 The value is supposed to be large enough such that any given
 699 \emph{k} nodes are unlikely to fail within an hour of each other,
 700 which is very unlikely in our system given the long uptimes of
 701 nodes. We also increased the number of concurrent outstanding
 702 requests allowed from 3 to 6 to speed up the recursive key finding
 703 processes.
 704
 705 \subsection{Other Changes}
 706 \label{other_changes}
 707
 708 We added one other new RPC request that nodes can make:
 709 \texttt{join}. This request is only sent on first loading the DHT,
 710 and is usually only sent to the bootstrap nodes that are listed for
 711 the DHT. These bootstrap nodes will respond to the request with the
 712 requesting peer's IP and port, so that the peer can determine what
 713 its oustide IP address is and whether port translation is being
 714 used. In the future, we hope to add functionality similar to STUN
 715 \cite{STUN}, so that nodes can detect whether they are NATted and
 716 take appropriate steps to circumvent it.
 717
 718 In addition, we have allowed peers to store values in the DHT, even
 719 if the hash they are using is not the correct length. Most of the
 720 keys used in the DHT are based on the SHA1 hash, and so they are 20
 721 bytes in length. However, some files in the targetted Debian package
 722 system are only hashed using the MD5 algorithm, and so the hashes
 723 retrieved from the server are 16 bytes in length. The DHT will still
 724 store values using these keys by padding the end of them with zeroes
 725 to the proper length, as the loss of uniqueness from the 4 less
 726 bytes is not an issue. Also, the DHT will happily store hashes that
 727 are longer than 20 bytes, such as those from the 32 byte SHA256
 728 algorithm, by truncating them to the correct length. Since the
 729 requesting peer will have the full length of the hash, this will not
 730 affect its attempt to verify the downloaded file.
 731
 732 %%%%%%%%%%%%%%%%%%%%%%%%%%%  Section  %%%%%%%%%%%%%%%%%%%%%%%%%%%%
 733
 734 \section{Analysis}
 735 \label{analysis}
 736
 737 Our \texttt{apt-p2p} implementation supporting the Debian package
 738 distribution system has been available to all Debian users since May
 739 3rd, 2008 \cite{apt-p2p-debian}, and will also be available in the
 740 next release of Ubuntu \cite{apt-p2p-ubuntu}. We have since created
 741 a \emph{walker} that will navigate the DHT and find all the peers
 742 currently connected to it. This allows us to analyze many aspects of
 743 our implementation.
 744
 745 \subsection{Peer Lifetimes}
 746 \label{peer_life}
 747
 748 \begin{figure}
 749 \centering
 750 \includegraphics[width=\columnwidth]{AptP2PWalker-peers.eps}
 751 \caption{The number of peers found in the system, and how many are
 752 behind a firewall or NAT.}
 753 \label{walker_peers}
 754 \end{figure}
 755
 756 We first began analyzing the DHT on June 24th, and continue to this
 757 day, giving us 2 months of data so far. Figure~\ref{walker_peers}
 758 shows the number of peers we have seen in the DHT during this time.
 759 The peer population is very steady, with just over 50 regular users
 760 participating in the DHT at any time. We also note that we find 100
 761 users who connect regularly (weekly), and we have found 186 unique
 762 users in the 2 months of our analysis. We determined which users are
 763 behind a firewall or NAT, which is one of the main problems of
 764 implementing a peer-to-peer network. These peers will be
 765 unresponsive to DHT requests from peers they have not contacted
 766 recently, which will cause the peer to wait for a timeout to occur
 767 (currently set at 9 seconds) before moving on. They will also be
 768 unable to contribute any upload bandwidth to other peers, as all
 769 requests for packages from them will also timeout. From
 770 Figure~\ref{walker_peers}, we see that approximately half of all
 771 peers suffered from this restriction.
 772
 773 \begin{figure}
 774 \centering
 775 \includegraphics[width=\columnwidth]{AptP2PDuration-peers.eps}
 776 \caption{The CDF of how long an average session will last.}
 777 \label{duration_peers}
 778 \end{figure}
 779
 780 Figure~\ref{duration_peers} shows the cumulative distribution of how
 781 long a connection from a peer can be expected to last. Due to our
 782 software being installed as a daemon that is started by default
 783 every time their computer boots up, peers are expected to stay for a
 784 long period in the system. 50\% of connections last longer than 5
 785 hours, and 20\% last longer than 10 hours. These connections are
 786 much longer than those reported by Saroiu et. al. \cite{saroiu2001}
 787 for other P2P systems, which had 50\% of Napster and Gnutella
 788 sessions lasting only 1 hour.
 789
 790 \begin{figure}
 791 \centering
 792 \includegraphics[width=\columnwidth]{AptP2PDuration-ind_peers.eps}
 793 \caption{The CDF of the average time individual peers stay in the
 794 system.}
 795 \label{duration_ind_peers}
 796 \end{figure}
 797
 798 We also examined the average time each individual peer spends in the
 799 system. Figure~\ref{duration_peers} shows the cumulative
 800 distribution of how long each individual peer remains in the system.
 801 Here we see that 50\% of peers have average stays in the system
 802 longer than 10 hours.
 803
 804 \begin{figure}
 805 \centering
 806 \includegraphics[width=\columnwidth]{AptP2PDuration-online_1.eps}
 807 \caption{The fraction of peers that, given their current duration in
 808 the system, will stay online for another hour.}
 809 \label{duration_online_1}
 810 \end{figure}
 811
 812 Since our DHT is based on Kademlia, which was designed based on the
 813 probability that a node will remain up another hour, we also
 814 analyzed our system for this parameter.
 815 Figure~\ref{duration_online_1} shows the fraction of peers will
 816 remain online for another hour, as a function of how long they have
 817 been online so far. Maymounkov and Mazieres found that the longer a
 818 node has been online, the higher the probability that it will stay
 819 online \cite{kademlia}. Our results also show this behavior. In
 820 addition, similar to the Gnutella peers, over 90\% of our peers that
 821 have been online for 10 hours, will remain online for another hour.
 822 Our results also show that, for our system, over 80\% of all peers
 823 will remain online another hour, compared with around 50\% for
 824 Gnutella.
 825
 826 \begin{figure}
 827 \centering
 828 \includegraphics[width=\columnwidth]{AptP2PDuration-online_6.eps}
 829 \caption{The fraction of peers that, given their current duration in
 830 the system, will stay online for another 6 hours.}
 831 \label{duration_online_6}
 832 \end{figure}
 833
 834 Since our peers are much longer-lived than other P2P systems, we
 835 also looked at the fraction of peers that stay online for another 6
 836 hours. Figure~\ref{duration_online_6} shows that over 60\% of peers
 837 that are online for 10 hours will stay online for another 6.
 838 However, we see an interesting decrease in this fraction between 8
 839 and 12 hours, which can also be seen in
 840 Figure~\ref{duration_online_1}. We believe this to be due to desktop
 841 users, who regularly turn off their computers at night.
 842
 843 \subsection{Peer Statistics}
 844 \label{peer_stats}
 845
 846 On July 31st we enhanced our walker to retrieve additional
 847 information from each contacted peer. The peers are configured, by
 848 default, to publish some statistics on how much they are downloading
 849 and uploading, and their measured response times for DHT queries.
 850 Our walker can extract this information if the peer is not
 851 firewalled or NATted, it has not disabled this functionality, and if
 852 it uses the same port for both its DHT (UDP) requests and download
 853 (TCP) requests (which is also a configuration parameter).
 854
 855 \begin{figure}
 856 \centering
 857 \includegraphics[width=\columnwidth]{AptP2PDownloaded-peers.eps}
 858 \caption{The number of peers that were contacted to determine their
 859 bandwidth, and the total number of peers in the system.}
 860 \label{down_peers}
 861 \end{figure}
 862
 863 Figure~\ref{down_peers} shows the total number of peers we have been
 864 able to contact since starting to gather this additional
 865 information, as well as how many total peers were found. We were
 866 only able to contact 30\% of all the peers that connected to the
 867 system during this time.
 868
 869 \begin{figure}
 870 \centering
 871 \includegraphics[width=\columnwidth]{AptP2PDownloaded-bw.eps}
 872 \caption{The bandwidth of data that the contacted peers have
 873 downloaded and uploaded.}
 874 \label{down_bw}
 875 \end{figure}
 876
 877 Figure~\ref{down_bw} shows the amount of data the peers we were able
 878 to contact have downloaded. Peers measure their downloads from other
 879 peers and mirrors separately, so we are able to get an idea of how
 880 much savings our system is generating for the mirrors. We see that
 881 the peers are downloading approximately 20\% of their package data
 882 from other peers, which is saving the mirrors from supplying that
 883 bandwidth. The actual numbers are only a lower bound, since we have
 884 only contacted 30\% of the peers in the system, but we can estimate
 885 that \texttt{apt-p2p} has already saved the mirrors 15 GB of
 886 bandwidth, or 1 GB per day.
 887
 888 We also collected the statistics on the measured response time peers
 889 were experiencing when sending requests to the DHT. We found that
 890 the recursive \texttt{find\_value} query, which is necessary before
 891 a download can occur, is taking 17 seconds on average. This
 892 indicates that, on average, requests are experiencing almost 2
 893 stalls while waiting for the 9 second timeouts to occur on
 894 unresponsive peers. This is longer than our target of 10 seconds,
 895 although it will only lead to a slight average delay in downloading
 896 of 1.7 seconds when the default 10 concurrent downloads are
 897 occurring.This increased response time is due to the number of peers
 898 that were behind firewalls or NATs, which was much higher than we
 899 anticipated. We do have plans to improve this through better
 900 informing of users of their NATted status, the use of STUN
 901 \cite{STUN} to circumvent the NATs, and by better exclusion of
 902 NATted peers from the DHT (which will not prevent them from using
 903 the system).
 904
 905 We were also concerned that the constant DHT requests and responses,
 906 even while not downloading, would overwhelm some peers' network
 907 connections. However, we found that peers are using 200 to 300 bytes
 908 per second of bandwidth in servicing the DHT. These numbers are
 909 small enough to not affect any other network services the peer would
 910 be running.
 911
 912 %%%%%%%%%%%%%%%%%%%%%%%%%%%  Section  %%%%%%%%%%%%%%%%%%%%%%%%%%%%
 913
 914 \section{Future Work}
 915 \label{future}
 916
 917 We feel that our P2P software package distribution model is
 918 well-suited to be used by many free software distributors. We hope
 919 to convince them to adopt such a model in their distribution, or we
 920 may port our existing system to some of the other groups for them to
 921 try.
 922
 923 One aspect missing from our model is the removal of old packages
 924 from the cache. Since our implementation is still relatively young,
 925 we have not had to deal with the problems of a growing cache of
 926 obsolete packages consuming all of a user's hard drive. We plan to
 927 implement some form of least recently used (LRU) cache removal
 928 technique, in which packages that are no longer available on the
 929 server, no longer requested by peers, or simply are the oldest in
 930 the cache, will be removed.
 931
 932 The most significant area of improvement still needed in our sample
 933 implementation is to further speed up some of the slower recursive
 934 DHT requests. We hope to accomplish this by further tuning the
 935 parameters of our current system, better exclusion of NATted peers
 936 from the routing tables, and through the use of STUN \cite{STUN} to
 937 circumvent the NATs of the 50\% of the peers that have not
 938 configured port forwarding.
 939
 940 %%%%%%%%%%%%%%%%%%%%%%%%%%%  Section  %%%%%%%%%%%%%%%%%%%%%%%%%%%%
 941
 942 \section{Conclusions}
 943 \label{conclusions}
 944
 945 We have designed a generally applicable peer-to-peer content
 946 distribution model to be used by many of the free software
 947 distributors operating today. It makes use of already existing
 948 features of most package management systems to create a
 949 peer-to-peer distribution that should substantially reduce the costs
 950 of hosting the software packages.
 951
 952 We have also implemented our design in freely available software to
 953 be used in conjuction with Debian-based distribution of Linux
 954 software packages. It is currently in use by some users of the
 955 Debian project's distribution, and so serves as an example of the
 956 possibilities that exist.
 957
 958 \bibliographystyle{IEEEtran}
 959 \bibliography{./IEEEabrv,./all}
 960
 961 \end{document}