From 8a9e9644183186c3c88da0b995bb49059408cc75 Mon Sep 17 00:00:00 2001 From: Cameron Dale Date: Fri, 18 Jan 2008 17:45:34 -0800 Subject: [PATCH] Updates after more thought on a couple of the TODO items. --- TODO | 56 ++++++++++++++++++++++++++------------------------------ 1 file changed, 26 insertions(+), 30 deletions(-) diff --git a/TODO b/TODO index 2be63a6..2ff301a 100644 --- a/TODO +++ b/TODO @@ -15,21 +15,14 @@ distributions. They need to either be ignored, or dealt with properly by adding them to the tracking done by the AptPackages module. -Hashes need to be sent with requests for some files. +Change file identifier from path to hash. -Some files can change without changing the file name, since the file was +Some files can change without changing the path, since the file was added to the DHT by the peer. Examples are Release, Packages.gz, and -Sources.bz2. For files like this (and only for files like this), the -request to download from the peer should include the downloader's -expected hash for the file as a new HTTP header. If the file is found, -the cached hash for the file will be used to determine whether the -request is for the same file as is currently available, and a special -HTTP response can be sent if it is not (i.e. not a 404). - -Alternatively, consider sharing the files by hash instead of by -directory. Then the request would be for -http://127.3.45.9:9977/, and it would always work. This -would require a database lookup for every request. +Sources.bz2. This would cause problems when requesting these files by +path. Instead, share the files by hash, then the request would be for +http://127.3.45.9:9977/~, and it would always work. This +will require a database lookup for every request. PeerManager needs to download large files from multiple peers. @@ -59,26 +52,29 @@ first piece, in which case it is downloaded from a 3rd peer, with consensus revealing the misbehaving peer. -Consider storing torrent-like strings in the DHT. 
+Store and share torrent-like strings for large files. -Instead of only storing the file download location (which would still be +In addition to storing the file download location (which would still be used for small files), a bencoded dictionary containing the peer's hashes of the individual pieces could be stored for the larger files -(20% of all the files are larger than 512 KB ). This dictionary would -have the download location, a list of the piece sizes, and a list of the -piece hashes (bittorrent uses a single string of length 20*#pieces, but -for general non-sha1 case a list is needed). - -These piece hashes could be compared ahead of time to determine which -peers have the same piece hashes (they all should), and then used during -the download to verify the downloaded pieces. - -Alternatively, the peers could store the torrent-like string for large -files separately, and only contain a reference to it in their stored -value for the hash of the file. The reference would be a hash of the -bencoded dictionary, and a lookup of that hash in the DHT would give the -torrent-like string. (A 100 MB file would result in 200 hashes, which -would create a bencoded dictionary larger than 6000 bytes.) +(20% of all the files are larger than 512 KB). This dictionary would +have the normal piece size, the hash length, and a string containing the +piece hashes of length *<#pieces>. These piece hashes could +be compared ahead of time to determine which peers have the same piece +hashes (they all should), and then used during the download to verify +the downloaded pieces. + +For very large files (5 or more pieces), the torrent strings are too +long to store in the DHT and retrieve (a single UDP packet should be +less than 1472 bytes to avoid fragmentation). Instead, the peers should +store the torrent-like string for large files separately, and only +contain a reference to it in their stored value for the hash of the +file. 
The reference would be a hash of the bencoded dictionary. If the +torrent-like string is short enough to store in the DHT (i.e. less than +1472 bytes, or about 70 pieces for the SHA1 hash), then a +lookup of that hash in the DHT would give the torrent-like string. +Otherwise, a request to the peer for the hash (just like files are +downloaded), should return the bencoded torrent-like string. PeerManager needs to track peers' properties. -- 2.30.2
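The torrent-like string described in the added TODO text can be sketched as follows. This is only an illustration under stated assumptions, not code from the project: the dictionary key names, the 512 KiB default piece size, the helper names, and the use of SHA1 for the reference hash are all hypothetical. Only the 1472-byte limit and the idea of a bencoded dictionary holding the piece size, the hash length, and one concatenated string of piece hashes come from the text above.

```python
import hashlib

# From the TODO text: a single UDP packet should stay under 1472 bytes.
UDP_LIMIT = 1472


def bencode(value):
    """Minimal bencoder covering only ints, bytes and dicts with bytes keys."""
    if isinstance(value, int):
        return b"i%de" % value
    if isinstance(value, bytes):
        return b"%d:%s" % (len(value), value)
    if isinstance(value, dict):
        out = b"d"
        for key in sorted(value):  # bencoded dictionaries keep keys sorted
            out += bencode(key) + bencode(value[key])
        return out + b"e"
    raise TypeError("unsupported type: %r" % type(value))


def torrent_string(data, piece_size=512 * 1024, hash_name="sha1"):
    """Build the bencoded dictionary for one file.

    The key names and the default piece size are assumptions made for
    this sketch; the TODO text does not specify them.
    """
    pieces = [data[i:i + piece_size] for i in range(0, len(data), piece_size)]
    digests = b"".join(hashlib.new(hash_name, p).digest() for p in pieces)
    return bencode({
        b"piece size": piece_size,
        b"hash length": hashlib.new(hash_name).digest_size,
        b"pieces": digests,  # one string, hash length times number of pieces
    })


def where_to_store(torrent):
    """Return (location, reference): "dht" if the string fits in one UDP
    packet, otherwise "peer" (fetched from the peer by hash, like a file).
    The reference is a hash of the bencoded dictionary; SHA1 is assumed."""
    reference = hashlib.sha1(torrent).digest()
    location = "dht" if len(torrent) < UDP_LIMIT else "peer"
    return location, reference
```

With 20-byte SHA1 hashes, 70 pieces contribute 1400 bytes of hashes, which leaves a little room for the rest of the dictionary under the 1472-byte limit; that is consistent with the "about 70 pieces" estimate in the text.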