X-Git-Url: https://git.mxchange.org/?p=quix0rs-apt-p2p.git;a=blobdiff_plain;f=TODO;h=787520369ac76e9eb250affe7a4e712d7082f4d3;hp=2be63a6c76e56cfd02ac2ccb64d6c4d9c01a1331;hb=61a264de46913269780852560f45316b36e7e6fb;hpb=957c001a1e1a0476f9885ff851de9ba1bc78fcef

diff --git a/TODO b/TODO
index 2be63a6..7875203 100644
--- a/TODO
+++ b/TODO
@@ -1,11 +1,3 @@
-Files for which a hash cannot be found should not be added to the DHT.
-
-If the hash can't found, it stands to reason that other peers will not
-be able to find the hash either. So adding those files to the DHT will
-just clutter it with useless information. Examples include Release.gpg,
-Release, Translation-de.bz2, and Contents.gz.
-
-
 Packages.diff files need to be considered.
 
 The Packages.diff/Index files contain hashes of Packages.diff/rred.gz
@@ -15,23 +7,6 @@
 distributions. They need to either be ignored, or dealt with properly by
 adding them to the tracking done by the AptPackages module.
 
-Hashes need to be sent with requests for some files.
-
-Some files can change without changing the file name, since the file was
-added to the DHT by the peer. Examples are Release, Packages.gz, and
-Sources.bz2. For files like this (and only for files like this), the
-request to download from the peer should include the downloader's
-expected hash for the file as a new HTTP header. If the file is found,
-the cached hash for the file will be used to determine whether the
-request is for the same file as is currently available, and a special
-HTTP response can be sent if it is not (i.e. not a 404).
-
-Alternatively, consider sharing the files by hash instead of by
-directory. Then the request would be for
-http://127.3.45.9:9977/, and it would always work. This
-would require a database lookup for every request.
-
-
 PeerManager needs to download large files from multiple peers.
 
 The PeerManager currently chooses a peer at random from the list of
@@ -59,26 +34,29 @@
 first piece, in which case it is downloaded from a 3rd peer, with
 consensus revealing the misbehaving peer.
 
-Consider storing torrent-like strings in the DHT.
+Store and share torrent-like strings for large files.
 
-Instead of only storing the file download location (which would still be
+In addition to storing the file download location (which would still be
 used for small files), a bencoded dictionary containing the peer's
 hashes of the individual pieces could be stored for the larger files
-(20% of all the files are larger than 512 KB ). This dictionary would
-have the download location, a list of the piece sizes, and a list of the
-piece hashes (bittorrent uses a single string of length 20*#pieces, but
-for general non-sha1 case a list is needed).
-
-These piece hashes could be compared ahead of time to determine which
-peers have the same piece hashes (they all should), and then used during
-the download to verify the downloaded pieces.
-
-Alternatively, the peers could store the torrent-like string for large
-files separately, and only contain a reference to it in their stored
-value for the hash of the file. The reference would be a hash of the
-bencoded dictionary, and a lookup of that hash in the DHT would give the
-torrent-like string. (A 100 MB file would result in 200 hashes, which
-would create a bencoded dictionary larger than 6000 bytes.)
+(20% of all the files are larger than 512 KB). This dictionary would
+have the normal piece size, the hash length, and a string containing the
+piece hashes of length *<#pieces>. These piece hashes could
+be compared ahead of time to determine which peers have the same piece
+hashes (they all should), and then used during the download to verify
+the downloaded pieces.
+
+For very large files (5 or more pieces), the torrent strings are too
+long to store in the DHT and retrieve (a single UDP packet should be
+less than 1472 bytes to avoid fragmentation). Instead, the peers should
+store the torrent-like string for large files separately, and only
+contain a reference to it in their stored value for the hash of the
+file. The reference would be a hash of the bencoded dictionary. If the
+torrent-like string is short enough to store in the DHT (i.e. less than
+1472 bytes, or about 70 pieces for the SHA1 hash), then a
+lookup of that hash in the DHT would give the torrent-like string.
+Otherwise, a request to the peer for the hash (just like files are
+downloaded), should return the bencoded torrent-like string.
 
 
 PeerManager needs to track peers' properties.
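
The storage rules in the new "torrent-like strings" entry can be sketched
in Python. This is only a rough illustration, not code from apt-p2p. The
dictionary key names, the function names, and the 512 KB piece size
(inferred from the "512 KB" and "100 MB ... 200 hashes" figures in the
text) are all assumptions. The 1472-byte limit and the "5 or more pieces"
threshold are taken from the TODO entry itself.

```python
import hashlib

PIECE_SIZE = 512 * 1024   # assumed "normal piece size" (inferred, not confirmed)
HASH_LENGTH = 20          # SHA1 digest length in bytes
MAX_DHT_VALUE = 1472      # UDP payload limit quoted in the TODO entry
INLINE_MAX_PIECES = 4     # "5 or more pieces" are stored by reference


def bencode(obj):
    """Minimal bencoder for ints, bytes/str, lists and dicts."""
    if isinstance(obj, int):
        return b"i%de" % obj
    if isinstance(obj, str):
        obj = obj.encode("utf-8")
    if isinstance(obj, bytes):
        return b"%d:%s" % (len(obj), obj)
    if isinstance(obj, list):
        return b"l" + b"".join(bencode(x) for x in obj) + b"e"
    if isinstance(obj, dict):
        # Bencoded dictionaries list their keys in sorted byte order.
        items = sorted(
            ((k.encode("utf-8") if isinstance(k, str) else k, v)
             for k, v in obj.items()),
            key=lambda kv: kv[0],
        )
        return b"d" + b"".join(bencode(k) + bencode(v) for k, v in items) + b"e"
    raise TypeError("cannot bencode %r" % type(obj))


def piece_hashes(data, piece_size=PIECE_SIZE):
    """SHA1 hash of each piece_size chunk of data."""
    return [hashlib.sha1(data[i:i + piece_size]).digest()
            for i in range(0, len(data), piece_size)]


def torrent_string(hashes, piece_size=PIECE_SIZE):
    """Bencoded dictionary: piece size, hash length, concatenated hashes."""
    return bencode({
        "piece size": piece_size,     # hypothetical key name
        "hash length": HASH_LENGTH,   # hypothetical key name
        "pieces": b"".join(hashes),   # hypothetical key name
    })


def placement(hashes, piece_size=PIECE_SIZE):
    """Decide where the piece hashes of one file would be kept."""
    if len(hashes) <= 1:
        return ("location-only", None)      # small file: no piece hashes
    encoded = torrent_string(hashes, piece_size)
    if len(hashes) <= INLINE_MAX_PIECES:
        return ("inline", encoded)          # kept with the stored value
    reference = hashlib.sha1(encoded).digest()
    if len(encoded) < MAX_DHT_VALUE:
        return ("dht-reference", reference)   # looked up in the DHT
    return ("peer-reference", reference)      # requested from the peer
```

With these assumed key names, a 70-piece file encodes to 1454 bytes and
still fits under the 1472-byte limit, while 71 pieces do not, which is in
line with the "about 70 pieces" figure in the entry.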