Roland Häder [Fri, 20 Oct 2023 04:58:47 +0000 (06:58 +0200)]
Continued:
- software doesn't need to be aliased each round as the variable isn't assigned
inside the loop
- added check in software_helper.alias() if parameter 'software' is an empty
string
- don't attempt to alias empty software string
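A minimal sketch of the guard described above, not the actual software_helper code. The alias table and the helper tag_domains() are hypothetical and only illustrate aliasing once outside the loop and skipping empty strings:

```python
# Hypothetical alias table; the real mapping in software_helper is not part of this log.
ALIASES = {"example-fork": "example"}


def alias(software: str) -> str:
    """Return the canonical name for a software string."""
    if not isinstance(software, str):
        raise TypeError(f"software has type {type(software).__name__}, expected str")
    if software == "":
        # The check added in this commit: an empty string is rejected.
        raise ValueError("parameter 'software' is an empty string")
    return ALIASES.get(software, software)


def tag_domains(domains, software):
    """Pair every domain with the (aliased) software name."""
    # Alias once, before the loop, and only when there is something to alias.
    if software != "":
        software = alias(software)
    return [(domain, software) for domain in domains]
```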
Roland Häder [Thu, 12 Oct 2023 23:17:07 +0000 (01:17 +0200)]
Continued:
- added parameter --no-detection for rechecking instances with no detection mode
being set which might happen when the server is down or the used software was
nowhere stated
Roland Häder [Thu, 5 Oct 2023 00:29:07 +0000 (02:29 +0200)]
Continued:
- don't attempt to fetch peers from fetch_instances() function when the software
is a relay, command fetch_relay() should be used instead to fetch relay's
peer list
Roland Häder [Wed, 4 Oct 2023 08:18:54 +0000 (10:18 +0200)]
Continued:
- some people made a website which redirected to other redirect-only domains,
e.g. start with 'social.golangengine.de' and you see what I mean
- if fetch_instances() is used here, it will cause an endless redirect loop and
later a stack overflow
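One way to avoid recursing through such redirect chains is to follow them iteratively and remember what was already visited. This is only a sketch under assumptions: resolve_redirects() and the redirects mapping are hypothetical stand-ins, the real code queries the hosts instead of a dict.

```python
from typing import Dict, Optional


def resolve_redirects(start: str, redirects: Dict[str, str], max_hops: int = 10) -> Optional[str]:
    """Follow domain redirects iteratively; return None on a loop or an overlong chain."""
    seen = set()
    current = start
    while current in redirects:
        if current in seen or len(seen) >= max_hops:
            # Redirect loop or too many hops: give up instead of recursing forever.
            return None
        seen.add(current)
        current = redirects[current]
    return current
```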
Roland Häder [Mon, 2 Oct 2023 07:56:43 +0000 (09:56 +0200)]
Continued:
- ordered SELECT statement for update_nodeinfo() command
- added --no-auto as another "filter" parameter
- check redirect targets before following them and skip those that turn out to
be e.g. an IP address
Roland Häder [Sat, 30 Sep 2023 11:11:41 +0000 (13:11 +0200)]
Continued:
- Paula has finally seen the wrong outcome of publishing a #FediBlock list
publicly:
"Too many people use blocklists as-is and don't use their own brain. Blindly
blocking instances because someone else says so is not good."
Roland Häder [Tue, 12 Sep 2023 10:00:19 +0000 (12:00 +0200)]
Continued:
- added command 'fetch_csv' which fetches CSV files and processes them for
further instance discovery and blocklist expansion
- introduced function processing.csv_block() which does the above processing
- return non-zero exit code when source was queried too recently
Roland Häder [Tue, 12 Sep 2023 01:40:31 +0000 (03:40 +0200)]
Continued:
- please execute SQL command:
"UPDATE instances SET command = 'redirect_target' WHERE command = 'fetch_generator';"
- yes, it is done during detection mode 'generator' but it was discovered as
redirection target
Roland Häder [Mon, 11 Sep 2023 20:27:46 +0000 (22:27 +0200)]
Continued:
- some people think that CamelCase.Domain is something to be proud of
- truth is that those upper-case characters are lower-cased before a DNS
server is queried
Roland Häder [Mon, 11 Sep 2023 19:53:11 +0000 (21:53 +0200)]
Continued:
- instances.has_pending() raises an exception if domain is not registered yet
- instances.update() checks is_registered() first, then has_pending()
Roland Häder [Wed, 6 Sep 2023 08:58:23 +0000 (10:58 +0200)]
Continued:
- added last_response_time which is a float that stores the last response time
- you have to run the following SQL statement on your blocks.db:
ALTER TABLE instances ADD last_response_time FLOAT NULL DEFAULT NULL;
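If you prefer to apply the statement from Python, a sketch like the following could work. The helper name add_last_response_time() is hypothetical; it only wraps the ALTER TABLE statement above and skips it when the column already exists.

```python
import sqlite3


def add_last_response_time(conn: sqlite3.Connection) -> bool:
    """Add the column if it is missing; return True when the table was altered."""
    # PRAGMA table_info returns one row per column, the name is at index 1.
    columns = [row[1] for row in conn.execute("PRAGMA table_info(instances)")]
    if "last_response_time" in columns:
        return False
    conn.execute("ALTER TABLE instances ADD last_response_time FLOAT NULL DEFAULT NULL")
    conn.commit()
    return True
```

Usage, assuming the database file is in the current directory: add_last_response_time(sqlite3.connect("blocks.db")).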
Roland Häder [Wed, 6 Sep 2023 02:02:26 +0000 (04:02 +0200)]
Continued:
- introduced software_helper.is_relay() function to check if given software name
is a supported relay software
- federation.fetch_instances() will throw an exception if invoked with relay
software
- also command fetch_instances avoids them
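Roughly, the guard looks like this. It is a sketch only: the tuple lists just 'pub-relay' because that is the one relay name mentioned in this log, the real list is longer, and the exception type and the empty return value are assumptions.

```python
# Incomplete, illustrative list; only 'pub-relay' is named in this log.
RELAY_SOFTWARE = ("pub-relay",)


def is_relay(software: str) -> bool:
    """Check whether the given software name is a supported relay software."""
    return software in RELAY_SOFTWARE


def fetch_instances(domain: str, software: str) -> list:
    """Refuse to fetch peers from relays; the actual fetching is omitted here."""
    if is_relay(software):
        raise ValueError(f"domain='{domain}' runs relay software='{software}', use fetch_relays instead")
    return []  # placeholder for the fetched peer list
```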
Roland Häder [Wed, 6 Sep 2023 01:37:48 +0000 (03:37 +0200)]
Continued:
- fetch_relays now supports --software=foo parameter
- added support for 'pub-relay' relays, they provide their peers over their
nodeinfo URL (see element metadata -> peers)
Roland Häder [Mon, 4 Sep 2023 07:54:14 +0000 (09:54 +0200)]
Continued:
- functions in module fba.helpers.tidyup are relatively "expensive", meaning they
need a lot of CPU cycles
- let's avoid invoking them on an empty string
Roland Häder [Tue, 29 Aug 2023 05:52:42 +0000 (07:52 +0200)]
Another attempt to rewrite:
- don't update nodeinfo URL and detection mode to STATIC_CHECK while fetching
blocks for Pleroma
- Pleroma has their block list exposed in that nodeinfo and not in separate API
Roland Häder [Mon, 28 Aug 2023 14:30:09 +0000 (16:30 +0200)]
Continued:
- some Misskey instances may have no nodeinfo URL, e.g. they only got detected
through APP_NAME method
- still they may provide a blocklist
- this is now rewritten so that a generic "/api/v1/instance/domain_blocks" is
fetched first and, if that fails, a software-specific attempt is made
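The fallback order can be sketched like this. The function fetch_blocks() and its two callable parameters are hypothetical; they stand in for the real network code so that only the "generic first, software-specific second" logic is shown.

```python
GENERIC_PATH = "/api/v1/instance/domain_blocks"


def fetch_blocks(domain, software, fetch_generic, fetch_specific):
    """Try the generic endpoint first, then a software-specific fetcher.

    fetch_generic(domain, path) returns a list or None and may raise on errors.
    fetch_specific maps a software name to a callable taking the domain.
    """
    try:
        blocks = fetch_generic(domain, GENERIC_PATH)
    except (OSError, ValueError):
        # Network failure or unparsable response counts as "failed".
        blocks = None
    if blocks is not None:
        return blocks
    handler = fetch_specific.get(software)
    return handler(domain) if handler is not None else []
```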
Roland Häder [Fri, 25 Aug 2023 23:31:52 +0000 (01:31 +0200)]
Continued:
- I hope this isn't too strict, some hosts return a "298 None" which the HTTP
library doesn't see as failed (response.ok = False) but which still doesn't
return JSON
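What such a stricter check might look like, as a sketch only: json_or_none() is a hypothetical helper, and it assumes a response object with the attributes ok and headers and a method json(), as the requests library provides. The actual check in this commit may differ.

```python
def json_or_none(response):
    """Return the decoded JSON body, or None if the response is not usable JSON."""
    if not response.ok:
        return None
    # A "successful" status is not enough, the body must also claim to be JSON ...
    content_type = response.headers.get("Content-Type", "")
    if "json" not in content_type.lower():
        return None
    # ... and must actually parse.
    try:
        return response.json()
    except ValueError:
        return None
```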
Roland Häder [Thu, 24 Aug 2023 19:12:14 +0000 (21:12 +0200)]
Continued:
- added parameter --software3 which searches for a file 'software.txt'
- you can generate this by running e.g.
sqlite3 ./blocks.db "SELECT software FROM instances WHAT_EVER_PARAMETER;" > software.txt
- reset nodeinfo_url if software is now None
- always set complete URL, including domain
Roland Häder [Thu, 24 Aug 2023 18:35:55 +0000 (20:35 +0200)]
Continued:
- renamed fetch_nodeinfo() to fetch() as it is already part of module nodeinfo
- added 3rd optional parameter to it, fetching of /.well-known/* isn't then
required anymore and saves another request
- also the wanted URL can be directly used
Roland Häder [Tue, 22 Aug 2023 18:34:24 +0000 (20:34 +0200)]
Continued:
- added parameter --no-software which fetches instances with no software
detected and not recently checked
- the parameter --force does not re-check recently checked entries. If you want
this you need to use ./nodeinfo.sh --software --force but be kind to other
webmasters!
Roland Häder [Wed, 16 Aug 2023 22:56:19 +0000 (00:56 +0200)]
Continued:
- no, nope: validators.hostname() was a bad idea, it also lets IP addresses and
local host names in
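A stricter filter could be sketched with the standard library alone. The helper names are hypothetical and this is not the check used in the code base; it only shows how IP addresses and dot-less local host names can be rejected.

```python
import ipaddress


def is_ip(value: str) -> bool:
    """True if the string parses as an IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(value)
        return True
    except ValueError:
        return False


def is_wanted_domain(domain: str) -> bool:
    """Reject IP addresses and local host names without a dot."""
    if is_ip(domain):
        return False
    if "." not in domain:
        # Local host names such as 'localhost' have no dot.
        return False
    return True
```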
- added command remove_invalid to remove those from database
- renamed recheck.sh -> nodeinfo.sh
Roland Häder [Wed, 16 Aug 2023 13:27:44 +0000 (15:27 +0200)]
Continued:
- chaos.social isn't part of oliphant, so it still requires being handled
separately
- fetch software/origin from local database instead of software from remote
nodeinfo (saves some requests to their servers)