On Tue, 2020-11-24 at 13:31 +0000, James Tomasino wrote:
>
> As much as I'd love to wave a magic wand and say, "it's all opt-in
> here" we don't really have any legal footing to do so.
>

James and I talked a bit more about this one on IRC. Key to this argument, AIUI, is how robots.txt (or the lack of it) is treated for FTP, which lacks any mention of it in the spec but has apparently been given weight in DMCA-related rulings involving it.

I'm not sure I agree with the reasoning, which goes something like "the robots.txt Internet-Draft is already de jure part of Gemini, and we can't change that", but IANAL ^^. In particular, I've been thinking about this almost entirely in GDPR terms so far, and have a bunch of DMCA-related reading to do now.

In the event that it *is* accurate, we talked about an alternative way to implement the functionality. Rather than having the gemini robots.txt spec say "if the client doesn't receive a robots.txt, it must assume this one", the *server* could be made to return a defined robots.txt response body if it would otherwise issue a 51 response to `/robots.txt` (51 may be too specific, it could be 5x, but I don't *think* it would be appropriate in response to 4x responses, which crawlers would be expected to retry).

Of course, any server could do that already today, so the ask is to put a recommendation about it into "server best practice", perhaps incorporating the `--permit-indexing` and `--permit-archiving` flags I talked about in another post.

Another advantage of this approach is that it becomes opaque to crawler authors whether the user has explicitly selected a preference or not. I'm also inclined to trust server implementors over crawler implementors.

/Nick

p.s. there was also some question as to whether someone hosting gemini content was a "gemini user", in the way we use that term on the project homepage. To me, it seems like a reasonable extrapolation, but perhaps it's a topic that deserves more debate or clarification.
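
For illustration only, the server-side fallback described above might look roughly like the sketch below. Nothing here comes from an existing server: `default_robots_txt` and `finalize_response` are hypothetical names, the `indexer` and `archiver` user-agent names are assumed to be what crawlers would honour, and the two keyword arguments stand in for the `--permit-indexing` and `--permit-archiving` flags, whose exact semantics are not spelled out in this message. It substitutes on any 5x status; narrowing that to 51 only is a one-line change.

```python
def default_robots_txt(permit_indexing=False, permit_archiving=False):
    """Build the fallback robots.txt body (hypothetical default policy)."""
    lines = []
    # Assumption: "indexer" and "archiver" are the user-agent names in use.
    agents = (("indexer", permit_indexing), ("archiver", permit_archiving))
    for agent, permitted in agents:
        if permitted:
            # An empty Disallow line means "nothing is disallowed".
            lines += [f"User-agent: {agent}", "Disallow:", ""]
    # Everything not explicitly permitted above is denied.
    lines += ["User-agent: *", "Disallow: /", ""]
    return "\n".join(lines)


def finalize_response(path, status, meta, body, **flags):
    """Replace a 5x response for /robots.txt with a 20 response.

    4x responses are passed through untouched, since crawlers are
    expected to retry those. All other paths are left alone.
    """
    if path == "/robots.txt" and 50 <= status <= 59:
        return 20, "text/plain", default_robots_txt(**flags).encode("utf-8")
    return status, meta, body
```

A server would call `finalize_response` just before writing the response header, so a crawler sees an ordinary 20 response and cannot tell whether the operator wrote the file by hand or left the default in place.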