X-Git-Url: https://git.mxchange.org/?p=quix0rs-apt-p2p.git;a=blobdiff_plain;f=TODO;h=7a586fe9a2c4b15861764d4b103f9bcd22a2f5f0;hp=9b4f7483cf5030e6bfcb9a2087d3a010a4762024;hb=fcfb936185ed7dfa126e443b3b281215eefc4a67;hpb=358d175e2ba5638ad2ff31f233b8fa23db5235dd

diff --git a/TODO b/TODO
index 9b4f748..7a586fe 100644
--- a/TODO
+++ b/TODO
@@ -1,34 +1,72 @@
-Upgrade the security in khashmir by using longer TIDs.
+Add all cache files to the database.
 
-The TIDs sent with every request that must be echoed on response are
-very short (one char). They should be lengthened to 20 using the NewID()
-function from khash.
+All files in the cache should be added to the database, so that they can
+be checked to make sure nothing has happened to them. The database would
+then need a flag to indicate files that are hashed and available, but
+that shouldn't be added to the DHT.
 
 
-Hashes of files need to be stored permanently.
+Packages.diff files need to be considered.
 
-A new database of files and their hashes is needed. It should store the
-location and hash of the file as well as the modtime and other details
-so we can check if a file needs to be rehashed on startup. The DB can
-also be used to store info needed to manage the values stored in the DHT.
+The Packages.diff/Index files contain hashes of Packages.diff/rred.gz
+files, which themselves contain diffs to the Packages files previously
+downloaded. Apt will request these files for the testing/unstable
+distributions. They need to be dealt with properly by
+adding them to the tracking done by the AptPackages module.
 
 
-Add ability to search and hash and DHT-store other directories.
+Improve the downloaded and uploaded data measurements.
 
-The user should be able to specify a list of directories that will be
-searched for files to hash and add to the DHT.
+There are 2 places that this data is measured: for statistics, and for
+limiting the upload bandwidth. They both have deficiencies as they
+sometimes miss the headers or the requests sent out. The upload
+bandwidth calculation only considers the stream in the upload and not
+the headers sent, and it also doesn't consider the upload bandwidth
+from requesting downloads from peers (though that may be a good thing).
+The statistics calculations for downloads include the headers of
+downloaded files, but not the requests received from peers for upload
+files. The statistics for uploaded data only includes the files sent
+and not the headers, and also misses the requests for downloads sent to
+other peers.
 
 
-Missing Kademlia implementation details are needed.
+Consider storing deltas of packages.
 
-The current implementation is missing some important features, mostly
-focussed on storing values:
-  - values need to be republished (every hour?)
-  - original publishers need to republish values (every 24 hours)
-  - when a new node is found that is closer to some values, replicate the
-    values there without deleting them
-  - when a value lookup succeeds, store the value in the closest node
-    found that didn't have it
-  - make the expiration time of a value exponentially inversely
-    proportional to the number of nodes between the current node and the
-    node closest to the value
+Instead of downloading full package files when a previous version of
+the same package is available, peers could request a delta of the
+package to the previous version. This would only be done if the delta
+is significantly (>50%) smaller than the full package, and is not too
+large (absolutely). A peer that has a new package and an old one would
+add a list of deltas for the package to the value stored in the DHT.
+The delta information would specify the old version (by hash), the
+size of the delta, and the hash of the delta. A peer that has the same
+old package could then download the delta from the peer by requesting
+the hash of the delta. Alternatively, very small deltas could be
+stored directly in the DHT.
+
+
+Consider tracking security issues with packages.
+
+Since sharing information with others about what packages you have
+downloaded (and probably installed) is a possible security
+vulnerability, it would be advantageous to not share that information
+for packages that have known security vulnerabilities. This would
+require some way of obtaining a list of which packages (and versions)
+are vulnerable, which is not currently available.
+
+
+Consider adding peer characteristics to the DHT.
+
+Bad peers could be indicated in the DHT by adding a new value that is
+the NOT of their ID (so they are guaranteed not to store it) indicating
+information about the peer. This could be bad votes on the peer, as
+otherwise a peer could add good info about itself.
+
+
+Consider adding pieces to the DHT instead of files.
+
+Instead of adding file hashes to the DHT, only piece hashes could be
+added. This would allow a peer to upload to other peers while it is
+still downloading the rest of the file. It is not clear that this is
+needed, since peers will not be uploading and downloading very much of
+the time.
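
The last TODO item (piece hashes instead of file hashes) could be sketched
roughly as follows. This is only an illustration, not code from apt-p2p: the
piece size, the choice of SHA-1, and the function name piece_hashes() are all
assumptions made for the example.

```python
import hashlib

# Hypothetical piece size; the TODO does not specify one.
PIECE_SIZE = 512 * 1024


def piece_hashes(data, piece_size=PIECE_SIZE):
    """Split data into fixed-size pieces and return one SHA-1 digest per piece.

    The final piece may be shorter than piece_size. Each digest could serve
    as a separate DHT key, so a peer holding only some pieces of a file
    could already announce (and upload) those pieces.
    """
    if piece_size <= 0:
        raise ValueError("piece_size must be positive")
    return [
        hashlib.sha1(data[offset:offset + piece_size]).digest()
        for offset in range(0, len(data), piece_size)
    ]
```

A downloader that has verified a piece against its digest could start serving
that piece immediately, without waiting for the rest of the file to arrive.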