I haven't run the crawler for my search engine in a long while, so I've decided to run it now, and I'm keeping track of a new stat (how many times it encounters slow down responses) as per the discussion from Solderpunk's latest news post over at geminiprotocol.net
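For context, a slow down in Gemini is status code 44, with the meta field carrying the number of seconds the server wants you to wait. A rough sketch of how a crawler might tally these (the names here are illustrative, not AuraGem's actual code):

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// slowDowns tallies status-44 responses per domain.
var slowDowns = map[string]int{}

// recordResponse inspects a raw Gemini header line like "44 30\r\n",
// where 44 means "slow down" and the meta field is the number of
// seconds the server wants the client to wait before retrying.
func recordResponse(domain, header string) {
	status, meta, _ := strings.Cut(strings.TrimRight(header, "\r\n"), " ")
	if status != "44" {
		return
	}
	slowDowns[domain]++
	if secs, err := strconv.Atoi(meta); err == nil {
		fmt.Printf("%s asked us to back off for %d seconds\n", domain, secs)
	}
}
```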
9 months ago · 👍 maxheadroom
@clseibold yeah makes sense · 8 months ago
@satch I also found out that both smolver (the server that runs ainent.xyz) and my own server software rate limit by requiring a minimum number of ms/seconds between requests per IP, and my implementation doesn't use a leaky bucket at all. I did it this way because it was simpler to implement myself, it was quick, and it uses less memory (I only store one timestamp, and an expected redirect string, for each IP).

I don't know that I like the leaky bucket concept in general; it's harder to reason about. Wanting 225 ms to pass between each request is easier to think about than a number of requests per second or whatever, imo. · 9 months ago
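Roughly like this, as a Go sketch (the names and numbers are mine for illustration, not the actual smolver or AuraGem code):

```go
package main

import (
	"sync"
	"time"
)

// minInterval is the required gap between requests from one IP
// (225 ms here, matching the example above).
const minInterval = 225 * time.Millisecond

var (
	mu       sync.Mutex
	nextOKAt = map[string]time.Time{} // one timestamp per IP
)

// allow reports whether this IP may make a request now. If it
// returns false, the server would answer with a 44 slow down.
func allow(ip string) bool {
	mu.Lock()
	defer mu.Unlock()
	now := time.Now()
	if t, seen := nextOKAt[ip]; seen && now.Before(t) {
		return false
	}
	nextOKAt[ip] = now.Add(minInterval)
	return true
}
```

The nice property is that the state per IP is a single time.Time, and stale entries can just be swept out periodically, versus a leaky bucket where you track a fill level and drain rate per IP.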
@satch Will do :) So far the crawler has gotten 3 slow down responses from 3 different domains, two of which are running software I wrote, lol. That'd be AuraGem, your misfin-server gemini interface, and ainent.xyz.
I know for a fact that transjovian and the related websites also have slow downs, because they ban you for like a month when you make just a couple of requests too fast (which I always found ridiculous, but anyways). I might have stopped the crawler from crawling those pages at all; I'm not sure. · 9 months ago
keep us posted! :) · 9 months ago