git.mxchange.org Git - fba.git/log
8 months ago Continued:
Roland Häder [Mon, 11 Mar 2024 21:28:00 +0000 (22:28 +0100)]
Continued:
- added aliases for misskey, mastodon and pleroma (each one)
- moved mastodon aliases to module ("private") variable, please don't access it
  outside the module!

8 months ago Continued:
Roland Häder [Mon, 11 Mar 2024 09:18:12 +0000 (10:18 +0100)]
Continued:
- fixed SQL string
- partly reverted because simple code like 'tuple(row)' doesn't work

8 months ago Continued:
Roland Häder [Sun, 10 Mar 2024 10:57:12 +0000 (11:57 +0100)]
Continued:
- need to convert type Row to a true tuple

8 months ago Continued:
Roland Häder [Sun, 10 Mar 2024 10:53:17 +0000 (11:53 +0100)]
Continued:
- row[] already contains named keys, e.g. 'total_websites', so let's alias
  the COUNT() expression to that key name, which reduces some code
- renamed 'blocks_recorded' to 'total_blocks'
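
The aliasing idea in this commit might look roughly like the sketch below. It assumes the database is SQLite accessed through Python's `sqlite3` module with `sqlite3.Row` as the row factory, which this log does not state; the `domain` column and the sample rows are hypothetical.

```python
import sqlite3

# Assumption: SQLite with sqlite3.Row as row factory; the schema is hypothetical.
conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row
conn.execute("CREATE TABLE instances (domain TEXT PRIMARY KEY)")
conn.executemany(
    "INSERT INTO instances (domain) VALUES (?)",
    [("example.org",), ("example.net",), ("example.com",)],
)

# Aliasing COUNT() means the result can be read by its key name directly,
# without copying it into a separately named variable or dict first.
row = conn.execute(
    "SELECT COUNT(domain) AS total_websites FROM instances"
).fetchone()
total_websites = row["total_websites"]
print(total_websites)
```

Without the `AS total_websites` alias the column would have to be addressed by position or by the literal expression text, which is the extra code the commit seems to be removing.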

8 months ago Continued:
Roland Häder [Sun, 10 Mar 2024 10:35:19 +0000 (11:35 +0100)]
Continued:
- renamed known_instances to total_websites because this is clearer
- you cannot distinguish between a regular website and a former Fediverse
  instance (e.g. before: Mastodon was installed, now IBM Connections)

8 months ago Continued:
Roland Häder [Sat, 9 Mar 2024 20:16:10 +0000 (21:16 +0100)]
Continued:
- check 'json' key

8 months ago Continued:
Roland Häder [Fri, 8 Mar 2024 15:43:10 +0000 (16:43 +0100)]
Continued:
- added missing key 'json' (oops!)

8 months ago Continued:
Roland Häder [Sun, 3 Mar 2024 03:37:26 +0000 (04:37 +0100)]
Continued:
- added 7988276.xyz as a testing/development hoster

8 months ago Continued:
Roland Häder [Sun, 3 Mar 2024 03:25:11 +0000 (04:25 +0100)]
Continued:
- added another alias "hijikey" for "misskey"

8 months ago Continued:
Roland Häder [Fri, 1 Mar 2024 06:36:44 +0000 (07:36 +0100)]
Continued:
- moved --same from nodeinfo.sh to command update_nodeinfo()

8 months ago Continued:
Roland Häder [Wed, 28 Feb 2024 17:12:22 +0000 (18:12 +0100)]
Continued:
- also log parameter 'column'
- blocks.add() should not be invoked when 'blocker' is already blocking
  'blocked' at 'block_level'

8 months ago Continued:
Roland Häder [Sun, 25 Feb 2024 02:43:29 +0000 (03:43 +0100)]
Continued:
- added another gardenfence blocklist for bootstrapping FBA
- just run ./fba.py fetch_txt and you can build up an initial list of peers

8 months ago Continued:
Roland Häder [Thu, 22 Feb 2024 23:38:57 +0000 (00:38 +0100)]
Continued:
- more pre-checks to avoid exceptions
- don't abuse catching them here as a control statement (if)

9 months ago Continued:
Roland Häder [Thu, 15 Feb 2024 17:58:51 +0000 (18:58 +0100)]
Continued:
- added network type 'vebinet' (Mozilla's Fediverse software) for peer retrieval
- domain blocks aren't supported (yet)

9 months ago Continued:
Roland Häder [Thu, 15 Feb 2024 13:30:07 +0000 (14:30 +0100)]
Continued:
- first simple variable checks
- next "lighter" functions being invoked
- last "heavy" function, which means possible database queries
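
The check ordering listed above (simple variable checks, then lighter functions, then the heavy one) can be sketched as follows. All function names here are hypothetical and the "heavy" check only stands in for a database query; this is not the project's actual code.

```python
calls = []  # records which checks actually ran, for illustration only

def looks_like_domain(domain: str) -> bool:
    """'Lighter' check: pure string inspection (hypothetical helper)."""
    calls.append("light")
    return "." in domain and " " not in domain

def is_registered(domain: str, known: set) -> bool:
    """'Heavy' check: stands in for a possible database query (hypothetical)."""
    calls.append("heavy")
    return domain in known

def should_crawl(domain, known: set) -> bool:
    # 1. simple variable checks first: they cost almost nothing
    if not isinstance(domain, str) or domain == "":
        return False
    # 2. next the lighter function
    if not looks_like_domain(domain):
        return False
    # 3. the heavy function last, reached only when everything else passed
    return not is_registered(domain, known)
```

With this ordering, `should_crawl(None, set())` returns before any helper is invoked, so invalid input never reaches the expensive lookup.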

9 months ago Continued:
Roland Häder [Tue, 13 Feb 2024 18:31:31 +0000 (19:31 +0100)]
Continued:
- added some more plain-text block lists (thanks to @Kromonos)

9 months ago Continued:
Roland Häder [Mon, 12 Feb 2024 15:45:46 +0000 (16:45 +0100)]
Continued:
- check parameters first, then "expensive" function invocation's returned value
- command fetch_instances should always fetch instances from a provided domain
  name
- only when --software=bar is used and no --force parameter was given

9 months ago Continued:
Roland Häder [Thu, 8 Feb 2024 17:00:57 +0000 (18:00 +0100)]
Continued:
- make --domain parameter for command fetch_instances optional so --software
  can be handled, too
- skip recently crawled domains in same loop
- args.<domain|software> can both be None, too

9 months ago Continued:
Roland Häder [Wed, 7 Feb 2024 19:26:24 +0000 (20:26 +0100)]
Continued:
- github isn't a fediverse instance
- https://github.io redirects to a bad URL

9 months ago Continued:
Roland Häder [Wed, 7 Feb 2024 07:55:13 +0000 (08:55 +0100)]
Continued:
- it was imported from previous code and seems not to work everywhere
- so let's remove the non-working part

9 months ago Continued:
Roland Häder [Sun, 4 Feb 2024 08:29:38 +0000 (09:29 +0100)]
Continued:
- added alias 'rosekey' for 'misskey'
- also 'smithereen' is a federating (at least peer list is provided) software

9 months ago Continued:
Roland Häder [Sat, 3 Feb 2024 19:39:39 +0000 (20:39 +0100)]
Continued:
- maybe there is no unsorted list or table at all

9 months ago Continued:
Roland Häder [Wed, 24 Jan 2024 17:14:14 +0000 (18:14 +0100)]
Continued:
- 'continue' if an exception is thrown because that instance can be "ignored"

9 months ago Continued:
Roland Häder [Wed, 24 Jan 2024 05:18:53 +0000 (06:18 +0100)]
Continued:
- handled unregistered instances

10 months ago Continued/WIP:
Roland Häder [Tue, 16 Jan 2024 19:18:16 +0000 (20:18 +0100)]
Continued/WIP:
- in commands.fetch_instances() added initialization of variables 'rows'
- also moved fetching rows into if() block
- commented some more code

10 months ago Continued:
Roland Häder [Tue, 16 Jan 2024 02:48:35 +0000 (03:48 +0100)]
Continued:
- added alias "kmyblue" for mastodon

10 months ago WIP:
Roland Häder [Sun, 14 Jan 2024 01:25:10 +0000 (02:25 +0100)]
WIP:
- rewrote fetch_instances towards also allowing --software=foo as an
  alternative parameter

10 months ago Continued:
Roland Häder [Fri, 12 Jan 2024 09:00:40 +0000 (10:00 +0100)]
Continued:
- changed string "true/false" to boolean
- please update your configuration file (all occurrences!)

10 months ago Continued:
Roland Häder [Fri, 12 Jan 2024 03:56:07 +0000 (04:56 +0100)]
Continued:
- need to negate this state: only skip/reject .i2p domains when they are not
  allowed by configuration (which is default)

10 months ago Continued:
Roland Häder [Wed, 10 Jan 2024 21:52:50 +0000 (22:52 +0100)]
Continued:
- moved utils.fetch_url() to module network as this is network-related
- added some debug lines

10 months ago Continued:
Roland Häder [Tue, 2 Jan 2024 19:45:51 +0000 (20:45 +0100)]
Continued:
- need to remove prefix "re:" before cleaning up software, else all software is called "re"
- added alias "lovers" for "misskey"

10 months ago Continued:
Roland Häder [Tue, 2 Jan 2024 19:30:01 +0000 (20:30 +0100)]
Continued:
- added alias "miraiskey" for "misskey"
- moved all those aliases to "private" variable _misskey_aliases

10 months ago Continued:
Roland Häder [Sat, 23 Dec 2023 16:10:59 +0000 (17:10 +0100)]
Continued:
- fediverse.observer has changed their API to Graph (JSON POST)
- domain singleuser.club blacklisted, this domain started flooding with
  sub-domains which have wwXX as another sub-domain

11 months ago Continued:
Roland Häder [Fri, 22 Dec 2023 07:44:52 +0000 (08:44 +0100)]
Continued:
- need to cut off everything after hash symbol because that is for JavaScript
  click-event loaded content anyway
- prevented a few empty/None strings from being passed to tidyup.domain()
- improved a few log messages
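
Cutting off everything after the hash symbol can be done with the standard library, as in this minimal sketch. Whether the project uses `urllib.parse.urldefrag` or plain string splitting is not stated in the log; `strip_fragment` is a hypothetical helper name.

```python
from urllib.parse import urldefrag

def strip_fragment(href: str) -> str:
    """Return href without its '#fragment' part (hypothetical helper name)."""
    # urldefrag() splits the URL into (url, fragment); only the url part is kept.
    return urldefrag(href).url

print(strip_fragment("https://example.org/about#rules"))
```

A URL without a fragment is returned unchanged, so the helper is safe to apply to every href.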

11 months ago Continued:
Roland Häder [Sat, 16 Dec 2023 13:19:57 +0000 (14:19 +0100)]
Continued:
- don't allow empty peer list in federation.add_peers()

11 months ago Continued:
Roland Häder [Sat, 16 Dec 2023 08:51:12 +0000 (09:51 +0100)]
Continued:
- added missing 'continue'
- added some debug messages
- skip empty/NoneType domain names

11 months ago Continued:
Roland Häder [Tue, 12 Dec 2023 06:07:23 +0000 (07:07 +0100)]
Continued:
- maybe error[json][error] is not a dict

11 months ago Continued:
Roland Häder [Tue, 12 Dec 2023 02:34:16 +0000 (03:34 +0100)]
Continued:
- added "error_message" JSON error message

11 months ago Continued:
Roland Häder [Mon, 11 Dec 2023 06:57:39 +0000 (07:57 +0100)]
Continued:
- also check blacklist first, before invoking instances.is_recent()

11 months ago Continued:
Roland Häder [Sun, 10 Dec 2023 23:54:01 +0000 (00:54 +0100)]
Continued:
- also check blacklist in instances.is_recent() function
- first check against blacklist, then if it is recent (in commands.py)

11 months ago Continued:
Roland Häder [Sun, 10 Dec 2023 22:44:35 +0000 (23:44 +0100)]
Continued:
- added alias 'goblin' for 'misskey'
- added debug line

11 months ago Continued:
Roland Häder [Sun, 10 Dec 2023 22:08:50 +0000 (23:08 +0100)]
Continued:
- Python still needs a backslash here

11 months ago Continued:
Roland Häder [Sun, 10 Dec 2023 08:04:23 +0000 (09:04 +0100)]
Continued:
- check validity of href URL before parsing it (controlled skip instead of
  uncontrolled raised exception)

11 months ago Continued:
Roland Häder [Sun, 10 Dec 2023 07:20:07 +0000 (08:20 +0100)]
Continued:
- reformatted

11 months ago Continued:
Roland Häder [Sun, 10 Dec 2023 05:15:09 +0000 (06:15 +0100)]
Continued:
- some relays may have removed/never set nodeinfo URL, these need to be skipped
  to avoid exception in fetch_url() function

11 months ago Continued:
Roland Häder [Sun, 10 Dec 2023 04:50:06 +0000 (05:50 +0100)]
Continued:
- command 'fetch_txt' now supports --force parameter
- check against blacklist (wasn't in is_registered())
- some debug messages added

11 months ago Continued:
Roland Häder [Sun, 10 Dec 2023 04:06:26 +0000 (05:06 +0100)]
Continued:
- formatted SQL statements

11 months ago Continued:
Roland Häder [Sun, 10 Dec 2023 02:42:22 +0000 (03:42 +0100)]
Continued:
- forgot to vacuum-clean it

11 months ago Continued:
Roland Häder [Sun, 10 Dec 2023 01:06:59 +0000 (02:06 +0100)]
Continued:
- simplified code, no need for this extra indenting

11 months ago Continued:
Roland Häder [Sat, 9 Dec 2023 07:15:42 +0000 (08:15 +0100)]
Continued:
- combined None and "" (empty string) for less code
- added/improved debug messages
- got rid of one local variable, added another one ;-)

11 months ago Continued:
Roland Häder [Sat, 9 Dec 2023 07:03:01 +0000 (08:03 +0100)]
Continued:
- improved logger messages

11 months ago Continued:
Roland Häder [Sat, 9 Dec 2023 05:14:48 +0000 (06:14 +0100)]
Continued:
- logged domain name
- better check variable software against None and empty string

11 months ago Continued:
Roland Häder [Sat, 9 Dec 2023 05:04:20 +0000 (06:04 +0100)]
Continued:
- initially set original_software, maybe it needs fixing later

11 months ago Continued:
Roland Häder [Sat, 9 Dec 2023 05:01:38 +0000 (06:01 +0100)]
Continued:
- need to catch any network-related exceptions to reduce _DEPTH counter

11 months ago Continued:
Roland Häder [Sat, 9 Dec 2023 04:10:38 +0000 (05:10 +0100)]
Continued:
- moved original_software next to 'software' column
- replaced own _cache dict with @lru_cache

11 months ago Continued:
Roland Häder [Sat, 9 Dec 2023 03:03:46 +0000 (04:03 +0100)]
Continued:
- check if domain is valid before checking if it is blacklisted, else an
  exception will be raised

11 months ago Continued:
Roland Häder [Sat, 9 Dec 2023 02:46:25 +0000 (03:46 +0100)]
Continued:
- check all URLs against validator

11 months ago Continued:
Roland Häder [Fri, 8 Dec 2023 22:28:48 +0000 (23:28 +0100)]
Continued:
- also search for instances with detected software

11 months ago Continued:
Roland Häder [Fri, 8 Dec 2023 04:44:15 +0000 (05:44 +0100)]
Continued:
- more checks against blacklist

11 months ago Continued:
Roland Häder [Fri, 8 Dec 2023 04:02:57 +0000 (05:02 +0100)]
Continued:
- oops, column and title confused

11 months ago Continued:
Roland Häder [Fri, 8 Dec 2023 03:56:26 +0000 (04:56 +0100)]
Continued:
- added list/scoreboard mode for column 'original_software'

11 months ago Continued:
Roland Häder [Fri, 8 Dec 2023 03:40:46 +0000 (04:40 +0100)]
Continued:
- added column original_software, it contains the original software name, but
  without any version number or " powered by " in it

11 months ago Continued:
Roland Häder [Tue, 5 Dec 2023 01:38:15 +0000 (02:38 +0100)]
Continued:
- more checks against blacklist

11 months ago Continued:
Roland Häder [Tue, 5 Dec 2023 01:01:06 +0000 (02:01 +0100)]
Continued:
- added another alias 'lycheebridge' for 'misskey'

11 months ago Continued:
Roland Häder [Sun, 3 Dec 2023 17:50:00 +0000 (18:50 +0100)]
Continued:
- added alias 'catodon' for 'misskey'

11 months ago Continued:
Roland Häder [Sat, 2 Dec 2023 14:48:22 +0000 (15:48 +0100)]
Continued:
- check blacklist and skip below code
- also handle "error" (int) and "msg" (string) JSON response

11 months ago Continued:
Roland Häder [Fri, 1 Dec 2023 18:52:53 +0000 (19:52 +0100)]
Continued:
- check row[software] if not None, then if it is a relay

11 months ago Continued:
Roland Häder [Thu, 30 Nov 2023 17:54:05 +0000 (18:54 +0100)]
Continued:
- software.is_relay() doesn't allow None/empty strings, let's handle them, too

11 months ago Continued:
Roland Häder [Thu, 30 Nov 2023 04:51:19 +0000 (05:51 +0100)]
Continued:
- daemon.py is already executable

11 months ago Continued:
Roland Häder [Thu, 30 Nov 2023 04:21:36 +0000 (05:21 +0100)]
Continued:
- skip blacklisted domains (aka. instances)

11 months ago Continued:
Roland Häder [Thu, 30 Nov 2023 03:03:02 +0000 (04:03 +0100)]
Continued:
- `foo is None or foo == ""` can be simplified to `foo in [None, ""]`
- added more blacklist check
- improved/added some debug messages
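
The simplification named in the first bullet can be checked with a short sketch; both function names are hypothetical and exist only to compare the two spellings.

```python
def is_empty_long(foo) -> bool:
    """Original spelling from the commit message."""
    return foo is None or foo == ""

def is_empty_short(foo) -> bool:
    """Simplified spelling: membership test against both 'empty' values."""
    return foo in [None, ""]

# Both forms agree for typical inputs such as None, strings and numbers.
samples = [None, "", "example.org", 0, " "]
print([is_empty_short(value) for value in samples])
```

The membership test compares by identity first and then by equality, so for plain strings and `None` it behaves the same as the longer form.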

11 months ago Continued:
Roland Häder [Thu, 30 Nov 2023 00:18:08 +0000 (01:18 +0100)]
Continued:
- fixed copy-paste mistakes in debug messages

11 months ago Continued:
Roland Häder [Wed, 29 Nov 2023 22:35:20 +0000 (23:35 +0100)]
Continued:
- oops, one parameter 'key' too much ...

11 months ago Continued:
Roland Häder [Wed, 29 Nov 2023 22:08:45 +0000 (23:08 +0100)]
Continued:
- check domain more against blacklist to avoid bad function invocations
- improved some debug logging messages

11 months ago Continued:
Roland Häder [Wed, 29 Nov 2023 15:16:57 +0000 (16:16 +0100)]
Continued:
- also log 'path' here

11 months ago Continued:
Roland Häder [Wed, 29 Nov 2023 15:02:46 +0000 (16:02 +0100)]
Continued:
- nodeinfo_url is not always a path only, better not hand it over
- check if parameter 'path' starts with a slash if not 'None'

11 months ago Continued:
Roland Häder [Wed, 29 Nov 2023 00:02:18 +0000 (01:02 +0100)]
Continued:
- should be always aliased ;-)

11 months ago Continued:
Roland Häder [Tue, 28 Nov 2023 23:17:55 +0000 (00:17 +0100)]
Continued:
- renamed $DOMAINS to $DOMAIN_LIST (Bash script)
- need to exclude None as possible value for parameter 'path'
- added another parked domain

11 months ago Blacklist updated:
Roland Häder [Tue, 28 Nov 2023 22:37:12 +0000 (23:37 +0100)]
Blacklist updated:
- added mass sub-domain flooder
- added fb.me (Facebook websites)

11 months ago Continued:
Roland Häder [Tue, 28 Nov 2023 22:33:33 +0000 (23:33 +0100)]
Continued:
- these parked domains are the plague of the Internet :-( They just don't contain
  any real content and flood crawl indexes and tables such as 'instances' in
  this software

11 months ago Continued:
Roland Häder [Tue, 28 Nov 2023 22:04:46 +0000 (23:04 +0100)]
Continued:
- parameter 'path' needs to start with a slash
- checked more against blacklist
- logger message after API returned JSON made similar

11 months ago Continued:
Roland Häder [Tue, 28 Nov 2023 16:12:38 +0000 (17:12 +0100)]
Continued:
- cached function software.alias()
- moved cached initialization to file header

11 months ago Continued:
Roland Häder [Tue, 28 Nov 2023 15:03:52 +0000 (16:03 +0100)]
Continued:
- also cache invocations of raise_on(), which is invoked very often

11 months ago Continued:
Roland Häder [Tue, 28 Nov 2023 14:50:35 +0000 (15:50 +0100)]
Continued:
- first check parameter (better performance)
- cache is_wanted() and is_in_url() invocations

11 months ago Continued:
Roland Häder [Tue, 28 Nov 2023 07:37:28 +0000 (08:37 +0100)]
Continued:
- blocked more parked domains

11 months ago Continued:
Roland Häder [Tue, 28 Nov 2023 05:54:55 +0000 (06:54 +0100)]
Continued:
- parked domains blocked

11 months ago Continued:
Roland Häder [Tue, 28 Nov 2023 00:40:23 +0000 (01:40 +0100)]
Continued:
- also skip empty 'href' values
- include 'infos' array, too

11 months ago Continued:
Roland Häder [Mon, 27 Nov 2023 21:47:55 +0000 (22:47 +0100)]
Continued:
- rewrote so all parameters for command fetch_blocks() can have parameter --force
- added parameter --only-none

11 months ago Continued:
Roland Häder [Sun, 26 Nov 2023 13:30:14 +0000 (14:30 +0100)]
Continued:
- added 'sutty.nl' for flooding fediverse with useless "instances"
- it is a mass website hoster

11 months ago Continued:
Roland Häder [Sat, 25 Nov 2023 23:45:04 +0000 (00:45 +0100)]
Continued:
- improved/added debug lines

11 months ago Continued:
Roland Häder [Sat, 25 Nov 2023 13:47:42 +0000 (14:47 +0100)]
Continued:
- added another array dimension where an error message might be set

11 months ago Continued:
Roland Häder [Thu, 23 Nov 2023 01:27:30 +0000 (02:27 +0100)]
Continued:
- check config key against "true"
- improved logger messages
- added some

11 months ago Fixed:
Roland Häder [Thu, 23 Nov 2023 00:36:11 +0000 (01:36 +0100)]
Fixed:
- indenting must always be aligned

11 months ago Continued:
Roland Häder [Wed, 22 Nov 2023 22:30:42 +0000 (23:30 +0100)]
Continued:
- you can now optionally allow I2P domains being crawled (default: forbidden =
  clear-net)

11 months ago Continued:
Roland Häder [Wed, 22 Nov 2023 22:04:51 +0000 (23:04 +0100)]
Continued:
- improved/added some debug lines

11 months ago Continued:
Roland Häder [Wed, 22 Nov 2023 22:04:25 +0000 (23:04 +0100)]
Continued:
- added dynamic IP address and hostname provider to blacklist

12 months ago Continued:
Roland Häder [Tue, 21 Nov 2023 23:25:07 +0000 (00:25 +0100)]
Continued:
- oops, fixed syntax error

12 months ago Continued:
Roland Häder [Tue, 21 Nov 2023 21:15:08 +0000 (22:15 +0100)]
Continued:
- added parked domain
- added another list for seirdy.one

12 months ago Continued:
Roland Häder [Tue, 21 Nov 2023 18:52:38 +0000 (19:52 +0100)]
Continued:
- also track response time during raised exceptions

12 months ago Continued:
Roland Häder [Tue, 21 Nov 2023 16:23:41 +0000 (17:23 +0100)]
Continued:
- added last_response_time to templates
- moved cookie clearing to proper place (?)