Roland Häder [Mon, 10 Jul 2023 19:12:11 +0000 (21:12 +0200)]
Continued:
- renamed utils.deobfuscate_domain() to deobfuscate()
- oliphant blocklists may contain obfuscated domains, need to deobfuscate them
first to get actual domain names
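A minimal sketch of the idea, not the actual utils.deobfuscate(): it assumes
each '*' hides exactly one character and that a list of already known domains
is available to match against. The name deobfuscate_sketch() and the sample
domains are made up for illustration.

```python
import re


def deobfuscate_sketch(pattern, known_domains):
    """Resolve a '*'-obfuscated domain against a list of known domains.

    Returns the pattern itself when it is not obfuscated, the single
    matching known domain when exactly one fits, and None otherwise.
    """
    if "*" not in pattern:
        return pattern

    # Assumption: each '*' hides exactly one non-dot character.
    regex = re.compile(
        "".join("[^.]" if char == "*" else re.escape(char) for char in pattern)
    )
    matches = [domain for domain in known_domains if regex.fullmatch(domain)]

    # An ambiguous pattern is left unresolved rather than guessed.
    return matches[0] if len(matches) == 1 else None


known = ["example.com", "examine.org", "sample.net"]
print(deobfuscate_sketch("ex***le.com", known))
```

Returning None for ambiguous patterns keeps a wrong guess out of the block
list; such entries could be retried later once more domains are known.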
Roland Häder [Mon, 10 Jul 2023 17:35:39 +0000 (19:35 +0200)]
Continued:
- cannot get len() (number of rows) from reader
- instances.set_total_blocks() expects the domain list as its 2nd parameter,
not a direct count, so let's hand over the domain list
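Why len() fails here, as a small sketch: a csv reader is a one-shot iterator,
so the rows have to be collected into a list before they can be counted or
handed on. The function name read_blocked_domains() and the 'domain' column
are assumptions for illustration only.

```python
import csv
import io


def read_blocked_domains(handle):
    """Return the non-empty 'domain' column of a CSV blocklist as a list."""
    reader = csv.DictReader(handle)
    # The reader has no len(), so collect the rows first.
    return [row["domain"] for row in reader if row.get("domain")]


sample = io.StringIO(
    "domain,severity\nbad.example,suspend\nworse.example,silence\n"
)
domains = read_blocked_domains(sample)
print(len(domains))
```

The resulting list can be counted with len() and also passed along as a
whole, which is what a function expecting the domain list needs.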
Roland Häder [Sun, 9 Jul 2023 17:47:11 +0000 (19:47 +0200)]
Fixed:
- oops, too many renames, renamed 'domains' back to 'blocklist'
- also need to check combined arrays, or else always 2 will be found
- need to invoke commit() in sources.update() function
Roland Häder [Sat, 8 Jul 2023 21:22:23 +0000 (23:22 +0200)]
Continued:
- some instances or honeypots may return empty (None in Python) link[href]
entries
- you can run a honeypot and pay monthly domain fees for it, not my business,
but at least format your /.well-known/nodeinfo properly!
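A defensive sketch of how such entries can be skipped, assuming the usual
/.well-known/nodeinfo shape of {"links": [{"rel": ..., "href": ...}]}. The
function name usable_nodeinfo_links() is hypothetical and not part of the
code base.

```python
def usable_nodeinfo_links(document):
    """Return all non-empty string hrefs from a well-known nodeinfo document."""
    links = document.get("links") if isinstance(document, dict) else None
    if not isinstance(links, list):
        return []

    hrefs = []
    for link in links:
        if not isinstance(link, dict):
            continue
        href = link.get("href")
        # Skip missing, None or blank hrefs instead of crashing on them.
        if isinstance(href, str) and href.strip():
            hrefs.append(href.strip())
    return hrefs


document = {
    "links": [
        {"rel": "broken", "href": None},
        {"rel": "ok", "href": "https://a.example/nodeinfo/2.0"},
    ]
}
print(usable_nodeinfo_links(document))
```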
Roland Häder [Sat, 8 Jul 2023 20:23:54 +0000 (22:23 +0200)]
Continued:
- instances.social is a non-federating website, `origin` should always refer
to a federating instance
- please run SQL `DELETE FROM instances WHERE origin='instances.social'` and
afterwards ./fba.py fetch_instances --domain=<some-large-instance>
- then you can run this command (fetch_instances_social) again
Roland Häder [Thu, 6 Jul 2023 07:59:05 +0000 (09:59 +0200)]
Fixed:
- PeerTube's JSON response always includes mode2=following or mode2=follower
depending on whether mode=followers or mode=following is set
- this caused PeerTube instances to be reported with duplicated peer counts
Roland Häder [Wed, 5 Jul 2023 21:25:25 +0000 (23:25 +0200)]
Continued:
- added "official" name 'nextcloudpi', others like 'crowncloud', 'darkcloud' are
just aliases created by their owners, I don't provide them a stage in my code
- provided template variable 'domain' might be None
Roland Häder [Wed, 5 Jul 2023 20:15:19 +0000 (22:15 +0200)]
Continued:
- added view /list which lists domains by some criteria (mode/value)
- renamed blocks.is_valid_level() to valid(), which now requires a 2nd
parameter with the column to check
Roland Häder [Tue, 4 Jul 2023 18:28:01 +0000 (20:28 +0200)]
Renaming season:
- renamed table/model file 'apis' to 'sources' as wikis are not APIs but
all are (instance) sources
- renamed api_domain to source_domain
Roland Häder [Tue, 4 Jul 2023 12:57:02 +0000 (14:57 +0200)]
Continued:
- added domain.is_in_url() to check whether the domain matches the netloc or
hostname part of the URL. This function encodes the domain into punycode
before comparing it
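Roughly what such a check can look like, as a sketch under assumptions: the
name is_in_url_sketch() and the argument order are made up, and Python's
built-in "idna" codec stands in for whatever punycode encoding the real
function uses.

```python
from urllib.parse import urlparse


def is_in_url_sketch(domain, url):
    """Check whether the punycode form of domain is the URL's netloc or hostname."""
    # Encode e.g. 'münchen.example' to its ASCII (punycode) form first.
    punycode = domain.encode("idna").decode("utf-8")
    components = urlparse(url)
    # netloc may carry a port or credentials, hostname is the bare host.
    return punycode in (components.netloc, components.hostname)


print(is_in_url_sketch("münchen.example", "https://xn--mnchen-3ya.example/feed"))
```

Comparing against both parts means a URL with an explicit port still matches
through its hostname.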
Roland Häder [Tue, 4 Jul 2023 10:03:00 +0000 (12:03 +0200)]
Continued:
- first (if needed) acquire lock, then check (if needed) api_domain
- also check host name (components.netloc) for feed in fetch_fba_rss command
Roland Häder [Mon, 3 Jul 2023 22:37:31 +0000 (00:37 +0200)]
Continued:
- added command fetch_instances_social to fetch new instances from
instances.social
- you need to get an API key from them, please don't lower api_last_access too
much, your API key/IP address might get banned!
- added table `apis` which keeps track of "APIs" accessed, including github
and wikis, this is to lower traffic on these sites, again: please DO NOT
overdo these requests! Your IP/API key might get blocked!
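The throttling idea can be sketched with sqlite3 like this. Only the table
name `apis` comes from this commit; the columns (hostname, last_accessed),
the helper names and the 3600 second interval are assumptions for
illustration, not the real schema or config value.

```python
import sqlite3
import time


def create_table(connection):
    """Create the (hypothetical) access-tracking table."""
    connection.execute(
        "CREATE TABLE IF NOT EXISTS apis ("
        "hostname TEXT PRIMARY KEY, last_accessed REAL NOT NULL)"
    )
    connection.commit()


def is_recent(connection, hostname, min_interval, now=None):
    """Return True if hostname was accessed less than min_interval seconds ago."""
    now = time.time() if now is None else now
    row = connection.execute(
        "SELECT last_accessed FROM apis WHERE hostname = ?", (hostname,)
    ).fetchone()
    return row is not None and (now - row[0]) < min_interval


def mark_accessed(connection, hostname, now=None):
    """Record the access time for hostname and commit it."""
    now = time.time() if now is None else now
    connection.execute(
        "INSERT OR REPLACE INTO apis (hostname, last_accessed) VALUES (?, ?)",
        (hostname, now),
    )
    connection.commit()


connection = sqlite3.connect(":memory:")
create_table(connection)
if not is_recent(connection, "instances.social", min_interval=3600):
    # The actual request to the remote site would go here.
    mark_accessed(connection, "instances.social")
```

A command built this way skips the remote request while the last access is
still within the interval, which keeps traffic on those sites low.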
Roland Häder [Mon, 3 Jul 2023 12:19:39 +0000 (14:19 +0200)]
Continued:
- reset software, detection mode and nodeinfo URL to None when a redirect to
another domain is followed
- yes, some people have moved their instance to a sub domain and now redirect
their traffic there
- still, this had caused another instance to be registered under a wrong
domain name
- this fix solves that, please run ./fba.py update_nodeinfo
- added config key recheck_nodeinfo
Roland Häder [Sun, 2 Jul 2023 18:16:01 +0000 (20:16 +0200)]
Continued:
- command update_nodeinfo() now supports --domain and --software parameters
- without them only out-dated instances are being checked
- the determined software is also being updated
Roland Häder [Sun, 2 Jul 2023 07:09:30 +0000 (09:09 +0200)]
Continued:
- added command fetch_fedilist()
- please don't overuse these commands: fetching from instances is limited to
those not recently fetched, but fetching from static websites like
fediverse.observer is NOT limited!
- flush pending data here, too
Roland Häder [Sat, 1 Jul 2023 06:23:02 +0000 (08:23 +0200)]
Continued:
- added command check_nodeinfo which checks if domain is part of nodeinfo_url
- loop through nodeinfo IDs first so that the newest nodeinfo is found first
Roland Häder [Sat, 1 Jul 2023 00:06:20 +0000 (02:06 +0200)]
Continued:
- blacklisted `activitypub-proxy.cf` as this fakes instances and is currently
offline
- blacklisted `netlify.app` as this is a mass-hoster with many fake sub domains
Roland Häder [Fri, 30 Jun 2023 11:38:43 +0000 (13:38 +0200)]
Continued:
- lhr.life and localhost.run provide some tunnel service with tons of sub
domains which floods the 'instances' table
- command 'fetch_fbabot_atom' originates from ryona.agency
Roland Häder [Thu, 29 Jun 2023 04:11:00 +0000 (06:11 +0200)]
Continued:
- command recheck_obfuscation now accepts parameters --domain and --software
- encapsulated aliasing of unwanted block_level values into function
utils.alias_block_level()
Roland Häder [Mon, 26 Jun 2023 17:16:48 +0000 (19:16 +0200)]
Continued:
- added command recheck_obfuscation() to recheck if returned instance's
obfuscated blocked peers can be de-obfuscated
- I use array["foo"] outside strings and array['foo'] inside them
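A tiny illustration of that quoting convention; the variable names are made
up. Before Python 3.12 an f-string could not reuse its own enclosing quote
inside a replacement field, so single quotes inside double-quoted strings
also keep the code portable.

```python
instance = {"domain": "example.com"}

# Outside strings: double quotes for the key.
domain = instance["domain"]

# Inside a (double-quoted) f-string: single quotes for the key.
message = f"domain='{instance['domain']}'"
print(message)
```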