Assuming disallow-all, and some research on robots.txt in Geminispace (Was: Re: robots.txt for Gemini formalised)

🗣️ From: Krixano (krixano (a) protonmail.com)
📅 Sent: 2020-11-26 10:22
📧 Message 66 of 70
My arguments weren't just about privacy. They were also about copyright.
Sharing on the internet is fine, but copyright still applies.

Secondly, you can share something for free online for a short period of time, and
then remove it after that time limit. This was done with a lot of books during a
portion of the Covid pandemic we are in. To say that archives should be able to
permanently cache this without explicit permission makes no logical sense.

Anyway, back to my original argument: caching should be opt-in. It makes the most sense.

Granting permission to use, modify, or distribute something should be opt-in, not opt-out.
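
As a rough illustration of the opt-in versus opt-out distinction, here is a minimal sketch using Python's standard urllib.robotparser. The function name may_archive, the opt_in flag, and the "archiver" user-agent string are assumptions made for this example, not anything agreed in this thread. The only difference between the two policies in this sketch is what a crawler does when a capsule has no robots.txt at all.

```python
from typing import Optional
from urllib.robotparser import RobotFileParser


def may_archive(robots_txt: Optional[str], path: str,
                agent: str = "archiver", opt_in: bool = True) -> bool:
    """Decide whether a (hypothetical) archiver may fetch `path`.

    robots_txt is the body of the capsule's robots.txt,
    or None if the capsule does not serve one.
    """
    if robots_txt is None:
        # No robots.txt: an opt-in policy treats silence as "no",
        # an opt-out policy treats silence as "yes".
        return not opt_in

    # A robots.txt exists, so follow its rules under either policy.
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, path)
```

With a file containing "User-agent: *" and "Disallow: /private/", both policies refuse /private/notes.gmi and allow /index.gmi. With no file at all, opt_in=True refuses everything and opt_in=False allows everything, which is exactly the case being argued about.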


Christian Seibold


------- Original Message -------

On Thursday, November 26th, 2020 at 4:15 AM, Luke Emmet <luke at marmaladefoo.com> wrote:

> On 25-Nov-2020 00:18, Nick Thomas wrote:
>
> > You're presuming consent here. We don't actually know that said 90%
> > of hosts are happy to be archived; we only know that 90% of hosts
> > haven't included a robots.txt file, which could be for any one of a
> > multitude of reasons.
> >
> > If a not-insignificant proportion of those hosts without robots.txt
> > files would actually prefer not to be included in archives when asked,
> > the current situation is not serving their privacy well, and Gemini is
> > supposed to be protective of user privacy. If an overwhelming majority
> > of them simply don't care, then sure, the argument for it starts to
> > look a bit niche. Talking in IRC earlier today, I hand-waved a 5%
> > threshold for the first condition and 1% for the second.
> >
> > A personal example: I didn't have a robots.txt file on my capsule
> > until today, but I don't want to be included in archives for various
> > reasons. Presuming consent from the lack of a robots.txt file would
> > have incorrectly guessed my preference, and harmed my privacy. Who else
> > in that 90% is like me? We don't know.
>
> Hello all
>
> Personally, I'm not really that interested in the legal arguments back
> and forth about archiving and access. Yes, there are some legal case
> precedents in this area in some jurisdictions, but I would say that by
> and large that ship has sailed. Sorry about that, folks. The web is the
> de facto baseline reference in this respect, whether we like it or not.
>
> If you publish information on the internet, there will be actors who
> will re-purpose it. Gemini is no different to the web in this.
>
> If any of us have information that is to be preserved as private, I
> cannot see how you can expect that to be achieved if you publish on the
> public internet (i.e. servers that do not require authentication). If
> you want to hide something, use authentication or a private channel.
>
> Yes, there is robots.txt, which is an opt-out mechanism from general
> robot access to a server's content. It is established practice and good
> actors will respect it. But it cannot be a mechanism to preserve privacy.
>
> My take on the whole "Gemini preserves privacy better" is really about
> clients. We don't have extended headers, cookies or agent names in
> requests. So to that extent, client privacy is maintained better than
> on the web, where the expectation is of long-term, cross-session tracking.
> Thankfully, we don't have that.
>
> I don't see it as Gemini's role to attempt to set a cultural/legal
> privacy framework for servers who are choosing to publish on Gemini. We
> cannot imagine we can break new ground in this respect. We can however
> make an effort to have this as a side effect of technical design in the
> protocol itself, and within the Gemini community we can look out for
> risks in exposing such personal information via the protocol.
>
> If Gemini ever becomes interesting enough to the outside world that some
> case goes to court (what a publicity success that would be!), surely the
> existing infrastructure of public server hypertext systems, namely the
> web, will be the established precedent.
>
> So I support use of robots.txt, but if none exists, the presumption -
> like the web - is that access and usage is allowed. If some actor
> doesn't follow a server's robots.txt, I'm sad about it, but we should
> ultimately expect it.
>
> -   Luke