2021-11-26 Why I hate bots

Just look at these logs:

Nov 26 13:49:27 [info] Looking at gemini://communitywiki.org/page/FracturedDemocracyProblem
Nov 26 13:49:27 [info] IP is blocked, extending by 2419200
Nov 26 13:49:33 [info] Looking at gemini://communitywiki.org/page/JimMcCarthy
Nov 26 13:49:33 [info] IP is blocked, extending by 2419200
Nov 26 13:49:33 [info] Looking at gemini://vault.transjovian.org/text/en/Jevons_paradox
Nov 26 13:49:33 [info] Wikipedia getting en/Jevons_paradox
Nov 26 13:49:34 [info] Looking at gemini://communitywiki.org/raw/Page_Aliases
Nov 26 13:49:34 [info] IP is blocked, extending by 2419200
Nov 26 13:49:34 [info] Looking at gemini://communitywiki.org/WikiDe/page/TopicNode
Nov 26 13:49:34 [info] IP is blocked, extending by 2419200
Nov 26 13:49:36 [info] Looking at gemini://communitywiki.org/raw/Comments_on_OnceAndOnlyOnce
Nov 26 13:49:36 [info] IP is blocked, extending by 2419200
Nov 26 13:49:37 [info] Looking at gemini://communitywiki.org/html/TomM
Nov 26 13:49:37 [info] IP is blocked, extending by 2419200
Nov 26 13:49:38 [info] Looking at gemini://communitywiki.org/MeatBall/page/HasAutomatedEditingPrivilages
Nov 26 13:49:38 [info] IP is blocked, extending by 2419200
Nov 26 13:49:38 [info] Looking at gemini://communitywiki.org/page/SpacedRepitition
Nov 26 13:49:38 [info] IP is blocked, extending by 2419200
Nov 26 13:49:39 [info] Looking at gemini://communitywiki.org/page/TheArtOfPlainTalk
Nov 26 13:49:39 [info] IP is blocked, extending by 2419200
Nov 26 13:49:42 [info] Looking at gemini://communitywiki.org/raw/CategoryInformationVisualization
Nov 26 13:49:42 [info] IP is blocked, extending by 2419200
Nov 26 13:49:42 [info] Looking at gemini://communitywiki.org/WikiDe/page/ManyToMany
Nov 26 13:49:42 [info] IP is blocked, extending by 2419200
Nov 26 13:49:46 [info] Looking at gemini://communitywiki.org/page/WysiwygEditor
Nov 26 13:49:46 [info] IP is blocked, extending by 2419200

In roughly 20s I’m getting 13 requests, and 12 of them are being blocked.
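The numbers can be checked against the log with a short script. This is only a sketch: the `summarize` helper and its `year` parameter are made up for illustration (the log lines carry no year, so 2021 is assumed from the date of this post), and the sample holds just the first six log lines. Paste in the full excerpt to get the totals for the whole window. It also converts the extension of 2419200, assuming the unit is seconds.

```python
from datetime import datetime, timedelta

# The first six lines of the log above; paste the full excerpt for the full totals.
SAMPLE = """\
Nov 26 13:49:27 [info] Looking at gemini://communitywiki.org/page/FracturedDemocracyProblem
Nov 26 13:49:27 [info] IP is blocked, extending by 2419200
Nov 26 13:49:33 [info] Looking at gemini://communitywiki.org/page/JimMcCarthy
Nov 26 13:49:33 [info] IP is blocked, extending by 2419200
Nov 26 13:49:33 [info] Looking at gemini://vault.transjovian.org/text/en/Jevons_paradox
Nov 26 13:49:33 [info] Wikipedia getting en/Jevons_paradox
"""


def summarize(log_text, year=2021):
    """Return (requests, blocked, seconds between first and last line)."""
    requests = blocked = 0
    stamps = []
    for line in log_text.splitlines():
        if " [info] " not in line:
            continue
        stamp, message = line.split(" [info] ", 1)
        # The log has no year, so one is assumed (hypothetical parameter).
        stamps.append(datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S"))
        if message.startswith("Looking at "):
            requests += 1
        elif message.startswith("IP is blocked"):
            blocked += 1
    span = (max(stamps) - min(stamps)).total_seconds() if stamps else 0.0
    return requests, blocked, span


print(*summarize(SAMPLE))                # 3 2 6.0
print(timedelta(seconds=2419200).days)   # 28
```

If the unit really is seconds, every blocked request pushes the block out by another 28 days.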

​#Gemini ​#Butlerian Jihad

Comments

(Please contact me if you want to remove your comment.)

They could be hits from <https://portal.mozz.us/>, which isn’t exactly a bot. Or are they from somewhere else?

https://portal.mozz.us/

– Sean Conner 2021-11-26 20:35 UTC

Sean Conner

---

It could be a web spider hitting a Gemini proxy, of course. Perhaps the proxy author does not respect the 44 “slow down” response.

I know I didn’t when I wrote Soweli Lukin, but I have since retired it, because I did not want to be complicit in all of this.

Soweli Lukin

– Alex 2021-11-26 23:40 UTC
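For proxy and crawler authors, respecting the 44 response could look roughly like the sketch below. In Gemini, a 44 response header carries the number of seconds the client must wait before asking again. Everything named here is an assumption for illustration: `fetch` stands for whatever function your client uses to make one request and return `(header, body)`, and the retry count and the 60-second fallback are arbitrary choices.

```python
import time


def slow_down_delay(header, default=60):
    """Return the seconds to wait if header is a 44 response, else None."""
    status, _, meta = header.strip().partition(" ")
    if status != "44":
        return None
    try:
        return max(0, int(meta))
    except ValueError:
        # Malformed meta: fall back to an assumed, conservative wait.
        return default


def polite_fetch(fetch, url, max_retries=3, sleep=time.sleep):
    """Call the hypothetical fetch(url) -> (header, body), waiting on 44."""
    for attempt in range(max_retries + 1):
        header, body = fetch(url)
        delay = slow_down_delay(header)
        if delay is None:
            return header, body
        if attempt < max_retries:
            sleep(delay)  # wait as long as the server asked
    raise RuntimeError(f"still rate limited after {max_retries} retries: {url}")
```

Passing `sleep` as a parameter is only there so that the waiting can be replaced in tests.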

---

An average computer can do around 25,000 small HTTP requests per second, so I don’t see 13 per second as that much of a problem, tbh.

– Krixano 2021-12-06 14:28 UTC

---

No problem; I’ll simply ban the bot. 😀

– Alex 2021-12-06 15:06 UTC

---

The situation remains ridiculous, and the bot authors remain as clueless as script kiddies.

I’m noticing two bots that are attempting to crawl a now nonexistent redirection test, probably with a backlog of *thousands* of links because hey, who tests their bots anyway? – Schadenfreude at some badly written bots

Schadenfreude at some badly written bots

– Alex 2021-12-23 13:36 UTC

---

Sean Conner on his blog:

I finally got fed up with Gemini crawlers not bothering to limit their following of redirects … inability to deal with relative links … empty requests … requests for domains I’m not running a Gemini server on … crawler attempting to canonicalize links to lower case … – My common Gemini crawler pitfalls

My common Gemini crawler pitfalls

– Alex 2022-04-17 16:30 UTC
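Three of these pitfalls (unbounded redirects, relative links, and lower-casing) can be avoided with very little code. The following is a sketch under stated assumptions, not anybody's actual crawler: `fetch` is a hypothetical function returning `(status, meta)` for a URL, and the cap of 5 redirects is a choice made here. Note that Python's `urljoin` does not resolve relative references for schemes it does not know, so "gemini" has to be registered first.

```python
import urllib.parse

# urljoin only resolves relative references for schemes listed here,
# so register "gemini" before joining anything.
for known in (urllib.parse.uses_relative, urllib.parse.uses_netloc):
    if "gemini" not in known:
        known.append("gemini")

MAX_REDIRECTS = 5  # assumed cap, chosen for this sketch


def follow_redirects(fetch, url, max_redirects=MAX_REDIRECTS):
    """Follow 3x responses from the hypothetical fetch(url) -> (status, meta).

    Returns (final_url, status, meta); raises RuntimeError on a loop or
    when the cap is exceeded.
    """
    seen = {url}
    for _ in range(max_redirects + 1):
        status, meta = fetch(url)
        if not status.startswith("3"):
            return url, status, meta
        # Resolve relative targets against the current URL; the case of
        # the path is left exactly as the server sent it.
        target = urllib.parse.urljoin(url, meta)
        if target in seen:
            raise RuntimeError(f"redirect loop at {target}")
        seen.add(target)
        url = target
    raise RuntimeError(f"more than {max_redirects} redirects")
```

A crawler built this way gives up on a redirection test after a handful of requests instead of queueing thousands of links.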