X-Git-Url: https://git.mxchange.org/?p=quix0rs-apt-p2p.git;a=blobdiff_plain;f=TODO;h=bd8283c904d6ed372c9a84136ce204422bb6e183;hp=c6c74bd8de2fe762962f4d701af0a1ad073bda4b;hb=21f3067f2e5f694c696835f5eceab0eba5c3d479;hpb=dbcf7d0324c5ddee23bf170695e173fdff5a2c0e

diff --git a/TODO b/TODO
index c6c74bd..bd8283c 100644
--- a/TODO
+++ b/TODO
@@ -1,13 +1,44 @@
-Some last few things to do before release.
+Rotate DNS entries for mirrors more reliably.
+
+Currently the mirrors are accessed by DNS name, which can cause some
+issues when there are mirror differences and the DNS gets rotated.
+Instead, the HTTP Downloader should handle DNS lookups itself, store
+the resulting addresses, and send requests to IP addresses. If there
+is an error from the mirror (hash check or 404 response), the next IP
+address in the rotation should be used.
+
+
+Use GPG signatures as a hash for files.
+
+A detached GPG signature, such as is found in Release.gpg, can be used
+as a hash for the file. This hash can be used to verify the file when
+it is downloaded, and a shortened version can be added to the DHT to
+look up peers for the file. To get the hash into a binary form from
+the ascii-armored detached file, use the command
+'gpg --no-options --no-default-keyring --output - --dearmor -'. The
+hash should be stored as the reverse of the resulting binary string,
+as the bytes at the beginning are headers that are the same for most
+signatures. That way the shortened hash stored in the DHT will have a
+better chance of being unique and being stored on different peers. To
+verify a file, first the binary hash must be re-reversed, armored, and
+written to a temporary file with the command
+'gpg --no-options --no-default-keyring --output $tempfile --enarmor -'.
+Then the incoming file can be verified with the command
+'gpg --no-options --no-default-keyring --keyring /etc/apt/trusted.gpg
+--verify $tempfile -'.
+
+All communication with the command-line gpg should be done using pipes
+and the python module python-gnupginterface. There needs to be a new
+module for GPG verification and hashing, which will make this easier.
+In particular, it would need to support hashlib-like functionality
+such as new(), update(), and digest(). Note that the verification
+would not involve signing the file again and comparing the signatures,
+as this is not possible. Instead, the verify() function would have to
+behave differently for GPG hashes, and check that the verification
+resulted in a VALIDSIG. CAUTION: the detached signature can have a
+variable length, though it seems to be usually 65 bytes, 64 bytes has
+also been observed.
 
-- Handle/investigate the HTTP client pipeline errors
-- DB should not always restat files (especially for expired hashes)
-- remove missing files at startup (in DB's removeUntracked)
-- when files modtime but not size changes, rehash them to be sure
-- remove files from the peer's download cache
-- update the modtime of files downloaded from peers
-  - also set the Last-Modified header for the return to Apt
-- refresh expired DHT hashes concurrently instead of sequentially
 
 
 Consider what happens when multiple requests for a file are received.
@@ -26,6 +57,19 @@ distributions. They need to be dealt with properly by adding them to
 the tracking done by the AptPackages module.
 
 
+Improve the estimation of the total number of nodes
+
+The current total nodes estimation is based on the number of buckets.
+A better way is to look at the average inter-node spacing for the K
+closest nodes after a find_node/value completes. Be sure to measure
+the inter-node spacing in log2 space to dampen any ill effects. This
+can be used in the formula:
+    nodes = 2^160 / 2^(average of log2 spacing)
+The average should also be saved using an exponentially weighted
+moving average (of the log2 distance) over separate find_node/value
+actions to get a better calculation over time.
+
+
 Improve the downloaded and uploaded data measurements.
 
 There are 2 places that this data is measured: for statistics, and for
@@ -41,6 +85,14 @@ and not the headers, and also misses the requests for downloads sent to
 other peers.
 
 
+Rehash changed files instead of removing them.
+
+When the modification time of a file changes but the size does not,
+the file could be rehashed to verify it is the same instead of
+automatically removing it. The DB would have to be modified to return
+deferreds for a lot of its functions.
+
+
 Consider storing deltas of packages.
 
 Instead of downloading full package files when a previous version of