On Sat, Mar 21, 2020 at 09:39:46PM -0400, Sean Conner wrote:

> I don't mind the crawling, but I am concerned about the references to
> robots.txt. In the web world, robots.txt lives at the top level and *only*
> at the top level. I don't think there's been an official response from
> solderpunk about robots.txt, but I would expect it to be very similar to how
> it works on the web---the top level only.
>
> But a clarification would be nice (either way). In my opinion, it should
> only live at the top level, but I can adapt to every "directory" as well.

This is nicely timed, actually, as things like robots.txt are now
looming larger on my personal radar than they have previously - with
CAPCOM I am writing for the first time a program which automatically
makes Gemini requests, and I'm very keen on making sure that it's a
"good citizen".

There hasn't been too much overt discussion of good Gemini citizenship
yet, but now that non-human clients are becoming more common, there
should be. Robots.txt is obviously part of that package. (It's *not*
super relevant to feed aggregation, because nobody publishes a feed
without the expectation that it is read entirely by bots, but other
issues, especially rate limiting, are.)

It's been many years since I read any robots.txt specs from the web. I
will refresh my memory and start thinking about this, and asking
questions, in the hopes that we can finalise some stuff soon.

Cheers,
Solderpunk
---
Previous in thread (1 of 3): 🗣️ Sean Conner (sean (a) conman.org)
Next in thread (3 of 3): 🗣️ Natalie Pendragon (natpen (a) natpen.net)