Roland Häder [Tue, 15 Aug 2023 17:12:09 +0000 (19:12 +0200)]
Continued:
- typical for oliphant members: hide their own blocklist and then hand it
over to the Codeberg repository of oliphant
- this helper now has a simple function to check if the provided domain should
be excluded
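
A minimal sketch of such an exclusion check. The function name `is_excluded`
and the list entries are hypothetical; the real helper's name and data are not
shown in this log:

```python
# Hypothetical exclusion list; the real entries are not part of this log.
EXCLUDED_DOMAINS = ["example.org", "example.net"]

def is_excluded(domain: str) -> bool:
    """Return True if domain equals, or is a sub-domain of, an excluded entry."""
    if not isinstance(domain, str) or domain == "":
        raise ValueError("Parameter 'domain' must be a non-empty string")

    domain = domain.lower()
    return any(
        domain == entry or domain.endswith("." + entry)
        for entry in EXCLUDED_DOMAINS
    )
```
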
Roland Häder [Mon, 14 Aug 2023 04:03:15 +0000 (06:03 +0200)]
Continued:
- added detection mode 'APP_NAME' which reflects the HTML meta tag
name="application-name"
- allow checking the generator type if status code 410 (Gone) is given, e.g.
wordpress.com still returns a full HTML page to check
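
How the application-name meta tag could be read from a fetched page, as a
rough sketch using only the standard library. The names `AppNameParser` and
`detect_app_name` are assumptions for illustration, not FBA's actual code:

```python
from html.parser import HTMLParser

class AppNameParser(HTMLParser):
    """Remembers the content of the first <meta name="application-name">."""

    def __init__(self):
        super().__init__()
        self.app_name = None

    def handle_starttag(self, tag, attrs):
        # Only <meta> tags are of interest, and only the first match is kept
        if tag != "meta" or self.app_name is not None:
            return

        attributes = dict(attrs)
        if (attributes.get("name") or "").lower() == "application-name":
            self.app_name = attributes.get("content")

def detect_app_name(html: str):
    """Return the application name found in html, or None if there is none."""
    parser = AppNameParser()
    parser.feed(html)
    return parser.app_name
```

The sketch only looks at the HTML body, so it works the same way whether the
page came with status code 200 or 410.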
Roland Häder [Sun, 13 Aug 2023 17:13:36 +0000 (19:13 +0200)]
Continued:
- added --software2 for re-checking instances with `software` given and no
`detection_mode` given
- also added og:platform to HTML base template
Roland Häder [Tue, 8 Aug 2023 18:14:54 +0000 (20:14 +0200)]
Continued:
- with --feed=https://some-fba/feed.atom you can now specify another Atom feed
from an FBA/Pleroma bot
- parser.set_defaults() is now called first, then the additional parameters
are added
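
A small argparse sketch of that ordering: defaults first, then the extra
parameter. The sub-command name `fetch_feed` and its handler are hypothetical;
only the --feed parameter comes from this commit:

```python
import argparse

def fetch_feed(args):
    # Hypothetical handler, it only reports what it would do
    return f"would fetch {args.feed}"

parser = argparse.ArgumentParser(description="argparse ordering sketch")
subparsers = parser.add_subparsers(dest="command_name")

sub = subparsers.add_parser("fetch_feed")
# Defaults are set first ...
sub.set_defaults(command=fetch_feed)
# ... then the additional parameter is added
sub.add_argument("--feed", default=None, help="URL of another Atom feed")

args = parser.parse_args(["fetch_feed", "--feed=https://some-fba/feed.atom"])
result = args.command(args)
```
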
Roland Häder [Sat, 5 Aug 2023 21:54:19 +0000 (23:54 +0200)]
Continued:
- added network 'mammuthus[ experimental]' for retrieving peers
- moved software-related (not version number) functions to software.py
- strip off " experimental", so you can enter e.g. 'mammuthus' easier
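
Stripping the suffix could look like the following sketch; the function name
`strip_experimental` is an assumption and not necessarily what software.py
uses:

```python
def strip_experimental(software: str) -> str:
    """Remove a trailing ' experimental' from a software name, if present."""
    suffix = " experimental"
    if software.endswith(suffix):
        software = software[:-len(suffix)]

    return software.strip()
```
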
Roland Häder [Sat, 5 Aug 2023 13:52:13 +0000 (15:52 +0200)]
Continued:
- throw exceptions again; if they happen, they won't be fixed within a
split second
- also make sure that the home directory of FBA is properly set; of course you
can choose a different directory or take the default /home/fba/
- added recheck.sh, a small wrapper script I wrote for myself and that you
should try. For example, the exceptions above (and certainly timeouts) might
cause the used software not to be detected; then you can run
./recheck.sh --software to re-test those instances
Roland Häder [Thu, 27 Jul 2023 10:59:53 +0000 (12:59 +0200)]
Continued:
- move nodeinfo handling to new module 'nodeinfo'
- also had to rename the variable nodeinfo to other names
- try the newest version at /.well-known/x-nodeinfo2 first
Roland Häder [Wed, 26 Jul 2023 14:22:34 +0000 (16:22 +0200)]
Partly reverted cdcd2b0109e126bca887d0712a7ddf602e5d6e62:
- "Accept" is not being accepted by misskey (luckily only these instances)
- it must be "Content-Type: application/json" or otherwise it is blocked
Roland Häder [Mon, 24 Jul 2023 22:51:40 +0000 (00:51 +0200)]
Continued:
- instances.is_recent() now checks recheck_block if 'last_blocked' is provided
- command fetch_blocks() now supports --force parameter
- blacklisted fnaf.stream as this domain has super-long sub-domains (troll)
Roland Häder [Mon, 24 Jul 2023 21:58:15 +0000 (23:58 +0200)]
Continued:
- added column `obfuscated_blocks` to save count of (still) obfuscated blocks
- also exposed it in infos.html view
- blacklisted gitpod.io as this domain floods `instances` table
Roland Häder [Mon, 24 Jul 2023 14:35:53 +0000 (16:35 +0200)]
Continued:
- added command fetch_relay() for fetching instances from ActivityPub relays
which show their peers in index page (/)
- added grid.tf as it caused a flood of "testing/developing" sub-domains
Roland Häder [Mon, 24 Jul 2023 09:48:58 +0000 (11:48 +0200)]
Continued:
- let's not iterate directly on the CSV reader object (always possible, of
course), but generate a list from its rows
- this also allows us to check if 'reader' is not NoneType
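
The idea can be sketched as follows. The helper name `rows_from_reader` and
the sample CSV content are made up for illustration:

```python
import csv
import io

def rows_from_reader(reader):
    """Turn a CSV reader into a list of rows, rejecting a missing reader."""
    if reader is None:
        raise ValueError("Parameter 'reader' is None")

    # A list can be counted, indexed and iterated more than once
    return list(reader)

# Made-up sample data in the style of a blocklist CSV file
content = "#domain,#severity\nexample.org,suspend\n"
reader = csv.DictReader(io.StringIO(content))
rows = rows_from_reader(reader)
```
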
Roland Häder [Mon, 24 Jul 2023 06:16:17 +0000 (08:16 +0200)]
Continued:
- added support for x-nodeinfo2 which can be directly fetched from /.well-known/
"directory"
- also rewrote fetching well-known nodeinfo URLs in a more flexible way
Roland Häder [Mon, 24 Jul 2023 05:04:51 +0000 (07:04 +0200)]
Optimized:
- do simple checks first, then invoke methods
- recheck_obfuscation() is about block lists, not instances, therefore we need
to check 'last_blocked' timestamp
Roland Häder [Fri, 21 Jul 2023 07:05:39 +0000 (09:05 +0200)]
Continued:
- added mitra network supporting fetch_instances (not domain_blocks,
unfortunately)
- if I fetch domain blocks from chaos.social, it is being reset to zero, so
let's better bypass it here
Roland Häder [Fri, 21 Jul 2023 05:08:57 +0000 (07:08 +0200)]
Continued:
- prepared for reverse-proxy, e.g. Apache/nginx
- configuration keys "scheme" (newly added) and "hostname" describe how your
FBA instance is reached from outside; I was not able to find any other way, as
url_for() was returning a http:// URL and not a https:// ... :-(
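
A sketch of how an external URL could be assembled from these two keys. The
function name `external_url`, the fallback to "https" and the host name
fba.example are assumptions for illustration only:

```python
def external_url(config: dict, path: str) -> str:
    """Build the URL under which the instance is reachable from outside."""
    scheme = config.get("scheme", "https")  # assumed fallback
    hostname = config["hostname"]

    return f"{scheme}://{hostname}/{path.lstrip('/')}"

# Hypothetical configuration values
config = {"scheme": "https", "hostname": "fba.example"}
url = external_url(config, "/api/v1/instance/peers")
```
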
Roland Häder [Thu, 20 Jul 2023 15:22:02 +0000 (17:22 +0200)]
Continued:
- only attempt to fetch peers when software was detected
- added API /api/v1/instance/domain_blocks
- for this the blacklist needs to be rewritten for having "block" reasons
included
Roland Häder [Thu, 20 Jul 2023 13:29:39 +0000 (15:29 +0200)]
Continued:
- FBA is now a Fediverse "instance"
- outbound "rss" is supported as feeds are provided
- peer list is available at `/api/v1/instance/peers`, but only instances with
valid nodeinfo
Roland Häder [Wed, 12 Jul 2023 09:05:03 +0000 (11:05 +0200)]
Continued:
- max "crawl" depth and min peerlist size to go deeper is now configurable
- for example for low-memory systems, keep max_crawl_depth small and
min_peers_length big
- the default values may cause python3 to consume ~550 MB RAM
- so you can practically say each depth level adds about another MB of RAM
usage
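
One possible reading of how the two settings interact, as a sketch only. The
function `crawl` and the `fetch_peers` callback are hypothetical; the names
max_crawl_depth and min_peers_length come from this commit, but the exact
semantics in FBA may differ:

```python
from collections import deque

def crawl(start, fetch_peers, max_crawl_depth, min_peers_length):
    """Collect domains reachable from start, limited by depth and list size."""
    seen = {start}
    queue = deque([(start, 0)])

    while queue:
        domain, depth = queue.popleft()
        peers = fetch_peers(domain)

        # Only go deeper below the max depth and for big enough peer lists
        go_deeper = depth < max_crawl_depth and len(peers) >= min_peers_length

        for peer in peers:
            if peer in seen:
                continue

            seen.add(peer)
            if go_deeper:
                queue.append((peer, depth + 1))

    return seen
```

A small max_crawl_depth together with a big min_peers_length keeps the queue,
and with it the memory usage, short.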
Roland Häder [Wed, 12 Jul 2023 08:30:27 +0000 (10:30 +0200)]
Continued:
- roadhouse is an alias for hubzilla, it is currently unsupported as it doesn't
provide needed APIs for fetching peers and blocklists but just in case they
add it
- same with nextcloud and others
- shumihub is an alias for misskey
Roland Häder [Wed, 12 Jul 2023 05:28:43 +0000 (07:28 +0200)]
Continued:
- a recursive (aka. "crawl") depth of 500 is REALLY deep, practically the
whole Fediverse
- minimum peer count to deepen the "crawl" to max depth is 100 peers
- flush any pending data of current domain before continuing