(I could be a lot better at using mailing lists. I think this message was sent privately in error.)

On Tue, 2020-11-24 at 08:15 -0500, A. E. Spencer-Reed wrote:
> Why do you dislike archival?

Thanks for weighing in! In short, because the purposes to which the archive can be put, and the motives of the archiver, are not clear at the time of robots.txt-mediated archival. For myself, I'm happy with some types of archival and not happy with others. Some people would be happy to be included in every archive going; others, in none of them. Given this variability, we must take a stance on what to assume if robots.txt isn't present. I also don't think this variability is amenable to capture with more fine-grained virtual agents.

The current internet-draft for robots.txt says, in 2.2.1:

> If no group satisfies either condition, or no groups are present at
> all, no rules apply.

( https://tools.ietf.org/html/draft-koster-rep-00 )

This is pretty standard on the Web and, entirely coincidentally, a huge boon to Google et al. Importing robots.txt the way we do in the companion specification also imports this line. However, unlike the Web, Gemini "takes user privacy very seriously". Archives *can* be injurious to user privacy - if you need convincing on this point, there is a range of cases and examples around the GDPR "right to be forgotten". From my perspective, Gemini is importing a line from the internet-draft that is directly contrary to its mission.

Combining Gemini's mission with that realisation means that if no statement has been made about whether the given user (the server operator, in this specific case) is OK with their content being archived, the presumption should be that they are not. We should value user privacy above archiver convenience. In effect, we add a second exception to the protocol that amends 2.2.1 to end "if no rules are specified, this robots.txt file MUST be assumed".
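To make the difference concrete, here is a minimal sketch (in Python, with illustrative names like `may_archive` that come from me, not from any spec) of how a crawler's decision flips under the proposed default when no group in robots.txt matches it:

```python
def group_applies(robots_txt: str, agent: str) -> bool:
    """Return True if any User-agent group in robots_txt matches
    `agent` or the wildcard '*'."""
    agent = agent.lower()
    for line in robots_txt.splitlines():
        # Strip comments and whitespace, per the internet-draft.
        line = line.split('#', 1)[0].strip()
        if line.lower().startswith('user-agent:'):
            value = line.split(':', 1)[1].strip().lower()
            if value == '*' or value in agent:
                return True
    return False

def may_archive(robots_txt: str, agent: str = 'archiver') -> bool:
    # Web default (draft-koster-rep-00, 2.2.1): no matching group
    # means no rules apply, i.e. crawling/archival is permitted.
    # Proposed Gemini default: no matching group means the operator
    # has made no statement, so assume they are NOT OK with it.
    if not group_applies(robots_txt, agent):
        return False  # opt-in: silence means "no"
    # ...otherwise evaluate the matching group's Allow/Disallow
    # rules as usual (omitted in this sketch).
    return True
```

Under this reading, an absent or empty robots.txt denies archival, while an explicit `User-agent: *` group opts the operator back in through the normal rules.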
On a practical level, being excluded from search engines by default drives the discoverability of robots.txt, and server software could easily include flags like --permit-indexing or --permit-archival to streamline that discoverability. I don't think that opt-in rates would be similar to current opt-out rates on the Web. /Nick
---
Previous in thread (29 of 70): 🗣️ marc (marcx2 (a) welz.org.za)
Next in thread (31 of 70): 🗣️ Johann Galle (johann (a) qwertqwefsday.eu)