Roland Häder [Tue, 19 Aug 2025 19:25:46 +0000 (21:25 +0200)]
Continued:
- renamed parameter `domain` to `pattern`
- initialized `domain` with value from `pattern`
- some better error messages
- only invoke `utils.obfuscate()` when an asterisk or question mark is included
in pattern
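A minimal sketch of the wildcard guard described in the last bullet above. The helper names `contains_wildcard()` and `resolve_domain()` and the `handler` callable are hypothetical and only stand in for the real call to `utils.obfuscate()`, whose signature is not shown in this log.

```python
def contains_wildcard(pattern: str) -> bool:
    """Return True when the pattern holds an asterisk or a question mark."""
    return "*" in pattern or "?" in pattern


def resolve_domain(pattern: str, handler) -> str:
    """Hypothetical wrapper: `handler` stands in for the expensive lookup."""
    # `domain` is initialized with the value from `pattern`
    domain = pattern

    # Only pay for the lookup when the pattern really contains a wildcard
    if contains_wildcard(pattern):
        domain = handler(pattern)

    return domain
```

Patterns without a wildcard pass through unchanged, so the handler is never invoked for plain domain names.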
Roland Häder [Fri, 8 Aug 2025 16:46:22 +0000 (18:46 +0200)]
Continued:
- misskey needs some random sleep, too, so let's externalize those hardcoded
values first and then apply them, too
- if any software can be such a mess, then it is misskey and its derivatives
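A rough sketch of externalized sleep values. The `SLEEP_RANGE` mapping, its numbers and both function names are assumptions made for illustration; the commit does not show the real configuration keys or values.

```python
import random
import time

# Hypothetical settings: (minimum, maximum) delay in seconds per software
SLEEP_RANGE = {
    "misskey": (0.1, 0.3),
}


def pick_delay(software: str) -> float:
    """Return a random delay within the configured range, 0.0 if none is set."""
    low, high = SLEEP_RANGE.get(software, (0.0, 0.0))
    return random.uniform(low, high)


def polite_sleep(software: str) -> None:
    """Pause for a random, software-specific amount of time."""
    time.sleep(pick_delay(software))
```

Keeping the range in one place means a new software type only needs a new entry in the mapping rather than another hardcoded value.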
Roland Häder [Sat, 12 Jul 2025 21:06:09 +0000 (23:06 +0200)]
Continued:
- added a few debug lines
- no need for `elif` when that block isn't exiting the function
- let's use our own functions instead of checking `foo in bar`
Roland Häder [Tue, 3 Jun 2025 00:07:58 +0000 (02:07 +0200)]
Continued:
- `pleroma.envs.net` now publishes a more detailed blocklist than its CSV file
- but it can still be used as a peer source (key `only_peers` set to `True`)
- exception messages improved
Roland Häder [Sat, 31 May 2025 18:40:23 +0000 (20:40 +0200)]
Continued:
- also need to set +pubSub for fetch_instances() command
- remove duplicate code, no need to fetch same for POST field "blocked" again
- --force-all is generically the norm
Roland Häder [Fri, 30 May 2025 13:29:47 +0000 (15:29 +0200)]
Continued:
- let's not include original software in queries anymore when --software=xxx is provided
- set software to None when --domain=xxx is provided (or --force-all/update-none)
Roland Häder [Thu, 1 May 2025 11:55:23 +0000 (13:55 +0200)]
Continued:
- skip wordpress.com instances as the public API is always different to the
"instance"
- skip empty doc (BeautifulSoup4) result (HTML parser failed)
- typo fixed
Roland Häder [Tue, 29 Apr 2025 19:36:46 +0000 (21:36 +0200)]
Continued:
- oops, `value` is not a parameter of the daemon's function
- introduced --force-recrawl (to include recently crawled instances) parameter
to 2 commands
- updated --force-all help text
Roland Häder [Mon, 21 Apr 2025 03:03:19 +0000 (05:03 +0200)]
Continued:
- added `srv.us` and `linodeusercontent.com` as mass-hosters/tunnel service,
no real instance can be expected here
- if table `instances` doesn't contain a record then `is_recent()` should return
`False`
- removed parameter `--single` from command `fetch_instances` and moved SQL
statement into `else` block
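A minimal sketch of the `is_recent()` behaviour for a missing record, using `sqlite3`. The column name `last_checked`, the `max_age` and `now` parameters and the signature as a whole are assumptions; the project's real function may look different.

```python
import sqlite3
import time


def is_recent(connection, domain, max_age=3600, now=None):
    """Return True only when a record exists and is younger than max_age."""
    row = connection.execute(
        "SELECT last_checked FROM instances WHERE domain = ? LIMIT 1",
        (domain,),
    ).fetchone()

    # No record (or no timestamp yet) means the domain was never checked
    if row is None or row[0] is None:
        return False

    now = time.time() if now is None else now
    return (now - row[0]) < max_age
```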
Roland Häder [Mon, 21 Apr 2025 02:21:03 +0000 (04:21 +0200)]
Continued:
- fixed some typos (TypError -> TypeError, row[block] -> block[reason], ...)
- RuntimeException is RuntimeError (confusing, as errors are hard for me to understand)
- used f-string instead (pylint is now a bit happier)
- still it is confusing with import errors?!
Roland Häder [Mon, 21 Apr 2025 01:06:54 +0000 (03:06 +0200)]
Continued:
- shorthand "e.g." replaced by "for example"
- removed if() block as a loop over an empty list does nothing anyway
and the else block only contained a debug line
Roland Häder [Sun, 20 Apr 2025 23:55:31 +0000 (01:55 +0200)]
Continued:
- let's not shorten so much, else local functions may be confused with imported
libraries
- renamed variable `domain` to `hostname`, not a domain only
- skip unwanted domains before invoking encode_idna()
Roland Häder [Tue, 21 Jan 2025 11:43:44 +0000 (12:43 +0100)]
Continued:
- added more exceptions to catch and handle
- split long two-statement lines into single lines for better error handling
and debugging (if fedilist is one day back -> https://fedilist.com )
Roland Häder [Wed, 15 Jan 2025 02:12:11 +0000 (03:12 +0100)]
Continued:
- need to skip invalid table headers, they should be introduced with <thead>
and then each column <th> but some websites may use <tr> instead of <thead>
- strip (trim) strings
Roland Häder [Mon, 13 Jan 2025 23:40:06 +0000 (00:40 +0100)]
Continued:
- avoided dangerous (=mutable) argument to functions/methods (thanks to pylint)
- reduced invocation count of find_all("foo") by using local variable
- added more checks in "quarantined" branch
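The mutable-argument warning from pylint (`dangerous-default-value`) can be illustrated with a short sketch. The function `add_peer()` is hypothetical and not taken from the code base.

```python
def add_peer(peer, peers=None):
    """Append peer to the given list, creating a fresh list when none is given."""
    # A default of `peers=[]` would be created once and shared between
    # all calls; using None and building the list here avoids that.
    if peers is None:
        peers = []

    peers.append(peer)
    return peers
```

With a `peers=[]` default, the second call without a list would return both peers, because the same list object is reused across calls.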
Roland Häder [Mon, 13 Jan 2025 22:33:37 +0000 (23:33 +0100)]
Continued:
- check if required key 'url' is in dict 'row'
- avoided superfluous else + indenting
- simpler check before complex checks (`if domain in domains` is less
intensive than invoking "expensive" is_wanted())
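A small sketch of the "cheap check first" ordering from the last bullet above. The function `filter_new()` and the `is_wanted` callable passed into it are placeholders; the real `is_wanted()` and its signature are not shown in this log.

```python
def filter_new(candidates, domains, is_wanted):
    """Return candidates that are neither already known nor unwanted."""
    seen = set(domains)
    result = []

    for domain in candidates:
        # Cheap membership test first ...
        if domain in seen:
            continue

        # ... and only then the expensive check
        if not is_wanted(domain):
            continue

        seen.add(domain)
        result.append(domain)

    return result
```

Already-known domains never reach the expensive check, which matters most when the candidate list contains many duplicates.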
Roland Häder [Sun, 12 Jan 2025 03:03:52 +0000 (04:03 +0100)]
Continued:
- logging whole tag isn't a good idea
- yes, the HTML is sometimes broken, e.g. <meta> and not <meta /> (self-closing)
- the open tag causes the warning