💾 Archived View for gemi.dev › gemini-mailing-list › 001048.gmi captured on 2024-08-19 at 03:04:17. Gemini links have been rewritten to link to archived content
-=-=-=-=-=-=-
I'm going to lead in with a question prompted by Sean's experiences.

Do we even need a robots.txt?

-- -----
http://singletona082.flounder.online
gemini://singletona082.flounder.online
My online presence
Why wouldn't we? We certainly have a lot of bots so it seems reasonable to have robots.txt.

I learned the value of robots.txt soon after setting up Remini, my Gemini proxy for Reddit. Many Reddit pages tend to link to a lot of other Reddit pages, so crawlers that visited Remini were sent down a rabbit hole which ultimately led to them trying to index all of Reddit (which is huge) via the proxy. That's obviously not a usual case but I don't think it's *that* unusual either, in Geminispace.

More generally, it seems obvious to me that there should be a (mostly) agreed-upon way to direct the behaviour of bots that visit one's capsule, so if there are good arguments against robots.txt I'd be interested in hearing them.

I don't think this is strictly speaking a Gemini question though, as the robots exclusion standard is something quite separate to Gemini (or HTTP).

On 21/10/2021 13:41, Andrew Singleton wrote:
>
> I'm going to lead in with a question prompted by Sean's experiences.
>
> Do we even need a robots.txt?
>
> -- -----
> http://singletona082.flounder.online
> gemini://singletona082.flounder.online
> My online presence
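For readers who want to see the proxy case concretely, here is a minimal sketch of how a well-behaved crawler might consult a robots.txt before following links. It uses Python's standard `urllib.robotparser`. The rules, the `/r/` path prefix, the `indexer` agent name and the `may_fetch` helper are all assumptions made up for illustration; they are not taken from Remini or from any particular crawler.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for a proxy capsule: keep every bot out of the
# proxied tree (assumed here to live under /r/), leave the rest open.
EXAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /r/
"""


def may_fetch(robots_txt: str, agent: str, path: str) -> bool:
    """Return True if `agent` is allowed to request `path` under the given rules."""
    parser = RobotFileParser()
    # parse() takes the file's lines; nothing is fetched over the network.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, path)


if __name__ == "__main__":
    for path in ("/r/python", "/about.gmi"):
        allowed = may_fetch(EXAMPLE_ROBOTS_TXT, "indexer", path)
        print(path, "allowed" if allowed else "disallowed")
```

With a rule like this, a crawler that honours robots.txt stops at the edge of the proxied tree instead of walking every link it finds there. Fetching the robots.txt file over Gemini is left out of the sketch, since the standard library has no Gemini client.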
---