Check packet lengths before sending get_value responses.

The length of the created UDP packet needs to be checked before sending to make sure it is not so long that it will get fragmented. Overlong packets are only possible in responses to the get_value RPC request.


Clean up the khashmir actions.

The khashmir actions are a mess, and some cleanup is necessary. A lot of the actions have most of their processing in common, so this code should be put in functions that all can call. Perhaps creating base "RecurringAction" and "StaticAction" classes would be a good idea, as then find_node and find_value could use the first, while get_value and store_value could use the second. Perhaps ping and join actions should also be created for consistency, and maybe inherit from a "SingleNodeAction" base class.


Packages.diff files need to be considered.

The Packages.diff/Index files contain hashes of Packages.diff/rred.gz files, which themselves contain diffs to the Packages files previously downloaded. Apt will request these files for the testing/unstable distributions. They need to either be ignored, or dealt with properly by adding them to the tracking done by the AptPackages module.


PeerManager needs to download large files from multiple peers.

The PeerManager currently chooses a peer at random from the list of possible peers, and downloads the entire file from there. This needs to change if both a) the file is large (more than 512 KB), and b) there are multiple peers with the file. The PeerManager should then break up the large file into multiple pieces of size < 512 KB, and then send requests to multiple peers for these pieces.

This can cause a problem with hash checking the returned data, as hashes for the pieces are not known. Any file that fails a hash check should be downloaded again, with each piece being downloaded from a different peer than before.
The peers are shifted by 1, so that if a peer previously downloaded piece i, it now downloads piece i+1, and the first piece is downloaded by the previous downloader of the last piece, or preferably a previously unused peer. As each piece is downloaded, the running hash of the file should be checked to determine the place at which the file differs from the previous download. If the hash check then passes, the peer who originally provided the bad piece can be blamed for the error. Otherwise, the peer who originally provided the piece is probably at fault, since it is now providing a later piece. This doesn't work if the differing piece is the first piece, in which case it is downloaded from a 3rd peer, with consensus revealing the misbehaving peer.


Store and share torrent-like strings for large files.

In addition to storing the file download location (which would still be used for small files), a bencoded dictionary containing the peer's hashes of the individual pieces could be stored for the larger files (20% of all the files are larger than 512 KB). This dictionary would have the normal piece size, the hash length, and a string containing the piece hashes of length <hash length>*<#pieces>. These piece hashes could be compared ahead of time to determine which peers have the same piece hashes (they all should), and then used during the download to verify the downloaded pieces.

For very large files (5 or more pieces), the torrent strings are too long to store in and retrieve from the DHT as part of the value stored for the file's hash (a single UDP packet should be less than 1472 bytes to avoid fragmentation). Instead, the peers should store the torrent-like string for large files separately, and only include a reference to it in their stored value for the hash of the file. The reference would be a hash of the bencoded dictionary. If the torrent-like string is short enough to store in the DHT (i.e.
less than 1472 bytes, or about 70 pieces for the SHA1 hash), then a lookup of that hash in the DHT would give the torrent-like string. Otherwise, a request to the peer for the hash (just like files are downloaded) should return the bencoded torrent-like string.


PeerManager needs to track peers' properties.

The PeerManager needs to keep track of the observed properties of seen peers, to help determine selection criteria for choosing peers to download from. Each property will give a value from 0 to 1. The relevant properties are listed below, each followed by the observations that map to a value of 1 and a value of 0:

 - hash errors in last day (1 = 0, 0 = 3+)
 - recent download speed (1 = fastest, 0 = 0)
 - lag time from request to download (1 = 0, 0 = 15s+)
 - number of pending requests (1 = 0, 0 = max (10))
 - whether a connection is open (1 = yes, 0.9 = no)

These should be combined (multiplied) to provide a sort order for peers available to download from, which can then be used to assign new downloads to peers. Pieces should be downloaded from the best peers first (i.e. piece 0 from the absolute best peer).
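

The scoring and ordering described above could be sketched roughly as follows. This is only an illustration and not existing PeerManager code: the PeerStats record, the function names, and the linear interpolation between the stated end points are all assumptions, since the notes above only fix the values at the extremes.

```python
from dataclasses import dataclass

# End points taken from the property list above.
MAX_HASH_ERRORS = 3      # 3 or more hash errors in the last day scores 0
MAX_LAG_SECONDS = 15.0   # 15 s or more of lag scores 0
MAX_PENDING = 10         # the maximum number of pending requests scores 0


@dataclass
class PeerStats:
    """Hypothetical record of what has been observed about one peer."""
    hash_errors_last_day: int = 0
    download_speed: float = 0.0   # recent speed, e.g. bytes per second
    lag: float = 0.0              # seconds from request to first data
    pending_requests: int = 0
    connected: bool = False


def _falling(value, worst):
    """Score 1.0 at value 0, falling linearly (an assumption) to 0.0 at worst."""
    return max(0.0, 1.0 - value / worst)


def peer_score(stats, fastest_speed):
    """Multiply the five per-property values into one score in [0, 1]."""
    speed = stats.download_speed / fastest_speed if fastest_speed > 0 else 0.0
    return (
        _falling(stats.hash_errors_last_day, MAX_HASH_ERRORS)
        * min(1.0, speed)
        * _falling(stats.lag, MAX_LAG_SECONDS)
        * _falling(stats.pending_requests, MAX_PENDING)
        * (1.0 if stats.connected else 0.9)
    )


def rank_peers(peers):
    """Return peer ids (keys of a dict of PeerStats) ordered best first."""
    fastest = max((s.download_speed for s in peers.values()), default=0.0)
    return sorted(peers, key=lambda p: peer_score(peers[p], fastest),
                  reverse=True)


def assign_pieces(num_pieces, ranked):
    """Give piece 0 to the best peer, piece 1 to the next, wrapping around."""
    return {i: ranked[i % len(ranked)] for i in range(num_pieces)}
```

Because the values are multiplied, a single 0 (for example 3 or more hash errors, or no recorded download speed) pushes a peer to the bottom of the order regardless of its other properties. Under this sketch a never-before-used peer would therefore need some nonzero default speed to be selected at all, which is a design choice the notes above leave open.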