Identifying robots (was Re: Open Source Proxy)

On Fri Jul 24, 2020 at 2:59 AM CEST, Sean Conner wrote:

> That's a decent idea, but that still doesn't help when I want to block a
> particular bot for "misbehaving" (in some nebulous way).

This is true, but at the end of the day, even if we had a user-agent
header, a badly written bot can always ignore robots.txt, or request
robots.txt and parse/respect it incorrectly, or just regularly change
its user-agent to evade restrictions.  There will *always* be scenarios
where admins simply have to resort to IP bans.  Gemini just bumps into
those scenarios slightly sooner than the web.

Some kind of official documentation on how to write good bots would
probably not go astray...
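
As a rough illustration of what the "respect robots.txt" part of such
documentation might cover, here is a minimal sketch in Python.  It
assumes the bot has already fetched the robots.txt body with its own
Gemini client (not shown).  The bot name "examplebot", the sample rules
and the helper allowed_paths() are all hypothetical.  It only shows the
standard library's robots.txt parser being applied to a list of paths,
not a definitive way to write a crawler.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt body, as a bot might receive it after fetching
# gemini://example.org/robots.txt with its own Gemini client.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

# Hypothetical bot name, used only to match User-agent lines in robots.txt.
BOT_NAME = "examplebot"


def allowed_paths(robots_txt, bot_name, paths):
    """Return only those paths that robots_txt permits bot_name to fetch."""
    parser = RobotFileParser()
    # parse() takes the file as a list of lines, so no HTTP fetch is needed.
    parser.parse(robots_txt.splitlines())
    return [p for p in paths if parser.can_fetch(bot_name, p)]


if __name__ == "__main__":
    candidates = ["/index.gmi", "/private/notes.gmi"]
    for path in allowed_paths(ROBOTS_TXT, BOT_NAME, candidates):
        print(path)
```

A bot that filters its crawl queue this way before every request at
least avoids the "parse/respect it incorrectly" failure for simple
prefix rules.  It does nothing about the other cases, of course.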


