
[SPEC] Encouraging HTTP Proxies to support Gemini hosts self-blacklisting

1. Mansfield (mansfield (a) ondollo.com)

# Overview

I like the idea of supporting the creating-user's freedom to choose to have
their content only accessible through the gemini protocol. (Of course this
support only goes so far - once content is available, it's available.
Stopping someone from taking what you've made available and doing what they
will with it... short of a legal license and litigation... ugh.)

So... I have an HTTP Proxy and, while I provide a robots.txt, I'd like to
explore going a step further. I'd like to provide a URL path prefix like:
/internal/blacklist

The process I'm thinking of is explained in more detail below, but the
result would be that the HTTP Proxy refuses to forward requests to
self-blacklisted Gemini hosts.


# An Example

Before the process, going to https://gem.ondollo.com/external/ondollo.com
would return the text/gemini content available at gemini://ondollo.com to
the requesting web browser.

After the process, going to https://gem.ondollo.com/external/ondollo.com
would *not* return the text/gemini content available at gemini://ondollo.com.

Maybe the proxy could instead return a page that says... "The owner of this
content has requested that it only be accessible through the gemini
protocol and we respect that. Please use a Gemini client to access content
at gemini://ondollo.com. Here are some recommended clients: <list of one or
more clients>"

... or... the HTTP Proxy could just 404? 400? 204 Success and no content?
301 Moved permanently with a gemini URI? 403 Forbidden (not a *user* authz
issue... but a *proxy* authz issue...)? 410 Gone? 451 Legal? (As an aside:
there is a part of me that loves discussions/debates around what the right
HTTP status code is for a given situation... there's another part of me
that dies every time...)

I think I'll go for a 200 HTTP status and content that explains what
happened and encourages the user to access the content through gemini.


# A Possible Request Response Sequence

Here's a sequence I'm thinking of (given a creating-user and their gemini
server at gemhost, and a consuming-user and their web browser... and the
proxy at proxyhost... and 'somepath' - a sufficiently unique and random URL
path):

 1. The creating-user makes a new document on their server available at
<gemhost>/<somepath> with the content <proxyhost>
 2. The creating-user makes an HTTP GET request to
<proxyhost>/internal/blacklist/<gemhost>/<somepath>
 3. The proxy makes a Gemini request to <gemhost>/<somepath> and gets back
content matching its own proxyhost value
 4. The proxyhost adds the gemhost to its blacklist
 5. The proxyhost refuses to proxy requests to the gemhost (a rough sketch of
steps 3 and 4 follows below)
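
Here's a minimal sketch of steps 3 and 4 in Python, using only the standard
library. The fetch_gemini() helper, the in-memory BLACKLIST set, and the
relaxed TLS handling are assumptions for illustration only - a real proxy
would wire this into its own framework and storage:

```
import socket
import ssl

BLACKLIST = set()  # gemhosts that have asked not to be proxied

def fetch_gemini(host, path, port=1965, timeout=10):
    """Fetch gemini://<host><path> and return (status, body)."""
    ctx = ssl.create_default_context()
    # Many Gemini servers use self-signed certificates (TOFU), so skip CA checks here.
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with ctx.wrap_socket(sock, server_hostname=host) as tls:
            tls.sendall(f"gemini://{host}{path}\r\n".encode("utf-8"))
            raw = b""
            while chunk := tls.recv(4096):
                raw += chunk
    header, _, body = raw.partition(b"\r\n")
    status = int(header.split(None, 1)[0])
    return status, body.decode("utf-8", errors="replace")

def handle_blacklist_request(gemhost, somepath, proxyhost):
    """Steps 3 and 4: verify <gemhost>/<somepath> names this proxy, then blacklist it."""
    status, body = fetch_gemini(gemhost, "/" + somepath)
    if status == 20 and body.strip() == proxyhost:
        BLACKLIST.add(gemhost)
        return True
    return False
```

Step 5 is then just a membership check against BLACKLIST before the proxy
forwards anything.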

Thoughts? How would you tweak this to get the desired outcome?


# An Alternative

An additional thought I had... the above feels like it might be too process
heavy (but... it's also super simple...). What if proxy server
implementations were encouraged to check gemini://<gemhost>/noproxy.txt
before a request? (Every request? Feels too much like the favicon. The
first request? What if the creating-user changes their mind?) If my proxy
checks that URL and gets a 20 status back, then it refuses to proxy the
connection. If it gets a 51 back, then it continues with proxying the
request.
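
A rough sketch of that check, reusing the hypothetical fetch_gemini() helper
from the sketch above; treating anything other than a 20 or a 51 as an
opt-out is just one possible choice, not part of the proposal:

```
def may_proxy(gemhost):
    status, _ = fetch_gemini(gemhost, "/noproxy.txt")
    if status == 20:
        return False  # the host has opted out of being proxied
    if status == 51:
        return True   # no noproxy.txt, so proxying is fine
    return False      # anything else (error, redirect, ...): assume opt-out
```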

So... who does the work? Proxy implementers would be encouraged to make
modifications under either approach... but the first approach requires no
work from gemini server implementers, just creating-users. The second
approach could require some work from gemini server implementers... but
little from creating-users... maybe they toggle a setting in the server to
respond with a 20 to /noproxy.txt requests?

Thoughts?

2. Oliver Simmons (oliversimmo (a) gmail.com)

I'm abstaining from commenting on the rest, but for the status code:

On Sun, 21 Feb 2021 at 18:54, Mansfield <mansfield at ondollo.com> wrote:
> ... or... the HTTP Proxy could just 404? 400? 204 Success and no content?
> 301 Moved permanently with a gemini URI? 403 Forbidden (not a *user* authz
> issue... but a *proxy* authz issue...)? 410 Gone? 451 Legal? (As an aside:
> there is a part of me that loves discussions/debates around what the right
> HTTP status code is for a given situation... there's another part of me
> that dies every time...)

404, 204 and 410 I think are definitely the wrong codes.
A 3xx redirect would be bad as it may cause confusion for non-gemini
folks and browsers.
451 & 403 would ~kinda make sense.

I'm going to add these two to the list; I think they may somewhat fit
when someone is trying to access gemini-only content via HTTP:

406 Not Acceptable:
> The requested resource is capable of generating only content not
> acceptable according to the Accept headers sent in the request.

405 Method Not Allowed:
> A request method is not supported for the requested resource; for
> example, a GET request on a form that requires data to be presented via
> POST, or a PUT request on a read-only resource.

I'm no expert on HTTP though :p so I'm likely to be wrong.

- Oliver Simmons (GoodClover)


3. easrng (easrng (a) gmail.com)

On February 21, 2021 6:53:36 PM UTC, Mansfield <mansfield at ondollo.com>
wrote:
>... or... the HTTP Proxy could just 404? 400? 204 Success and no content?
>301 Moved permanently with a gemini URI? 403 Forbidden (not a *user* authz
>issue... but a *proxy* authz issue...)? 410 Gone? 451 Legal?

I'd go with 502 Bad Gateway and an explanation.


-- 
? <https://www.google.com/teapot>

4. Sean Conner (sean (a) conman.org)

It was thus said that the Great Mansfield once stated:
> 
> Maybe the proxy could instead return a page that says... "The owner of this
> content has requested that it only be accessible through the gemini
> protocol and we respect that. Please use a Gemini client to access content
> at gemini://ondollo.com. Here are some recommended clients: <list of one or
> more clients>"

  After reading up on HTTP response codes, I think the most appropriate one
is 409 Conflict.  The HTTP spec (RFC-2616, section 10.4.10, with added
commentary from me):

	The request could not be completed due to a conflict with the
	current state of the resource. 

The owner of the Gemini server does not want data proxied to HTTP.

	This code is only allowed in situations where it is expected that
	the user might be able to resolve the conflict and resubmit the
	request.

To resolve the situation, use an actual Gemini client.

	The response body SHOULD include enough information for the user to
	recognize the source of the conflict.  Ideally, the response entity
	would include enough information for the user or user agent to fix
	the problem; however, that might not be possible and is not
	required.

Which can be included in the body of the 409 response (most, if not all, web
servers allow custom error pages to be sent).
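
  As an illustration only, here's a minimal sketch of a proxy answering
requests for a blacklisted host with a 409 and an explanatory body. It uses
the stdlib http.server purely for brevity, and the BLACKLIST contents and the
/external/<gemhost>/... URL layout are assumptions borrowed from the original
post:

```
from http.server import BaseHTTPRequestHandler, HTTPServer

BLACKLIST = {"ondollo.com"}  # hosts that asked not to be proxied (example entry)

BODY = (b"<html><body><h1>409 Conflict</h1>"
        b"<p>The owner of this content has asked that it be reached over "
        b"Gemini only.  Please use a Gemini client to visit "
        b"gemini://ondollo.com/.</p></body></html>")

class ProxyHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Assume URLs look like /external/<gemhost>/<path...>
        parts = self.path.split("/", 3)
        gemhost = parts[2] if len(parts) > 2 else ""
        if gemhost in BLACKLIST:
            self.send_response(409)
            self.send_header("Content-Type", "text/html")
            self.send_header("Content-Length", str(len(BODY)))
            self.end_headers()
            self.wfile.write(BODY)
            return
        # ... otherwise fetch the Gemini resource and translate it as usual ...
        self.send_response(501)  # placeholder for the real proxying logic
        self.end_headers()

if __name__ == "__main__":
    HTTPServer(("127.0.0.1", 8080), ProxyHandler).serve_forever()
```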

  Short of that, then 407 Proxy Authentication Required is the next best
one, kind of.  The semantics aren't perfect, but it would seem to apply
(RFC-2616, section 10.4.8):

	This code is similar to 401 (Unauthorized), but indicates that the
	client must first authenticate itself with the proxy. The proxy MUST
	return a Proxy-Authenticate header field (section 14.33) containing
	a challenge applicable to the proxy for the requested resource. The
	client MAY repeat the request with a suitable Proxy-Authorization
	header field (section 14.34). HTTP access authentication is
	explained in "HTTP Authentication: Basic and Digest Access
	Authentication"

> # An Alternative
> 
> An additional thought I had... the above feels like it might be too process
> heavy (but... it's also super simple...). What if proxy server
> implementations were encouraged to check gemini://<gemhost>/noproxy.txt
> before a request? (Every request? Feels too much like the favicon. The
> first request? What if the creating-user changes their mind?) If my proxy
> checks that URL and gets a 20 status back, then it refuses to proxy the
> connection. If it gets a 51 back, then it continues with proxying the
> request.

  If only there was a file that automated agents already use ... like
robots.txt, where one could do something like ...

	User-agent: proxy
	Disallow: /

But alas, not-so-benevolent dictator wanna-be Drew DeVault said thou shalt
not do that:

	gemini://gemi.dev/gemini-mailing-list/messages/003506.gmi

  -spc (I don't agree with Drew, and think robots.txt is fine, and already
	in place in most cases ... )


5. Johann Galle (johann (a) qwertqwefsday.eu)

Hi,

why is robots.txt not the obvious answer here? The companion 
specification[1] has a "User-agent: webproxy" for this specific case:

 > ### Web proxies
 > Gemini bots which fetch content in order to translate said content into
 > HTML and publicly serve the result over HTTP(S) (in order to make
 > Geminispace accessible from within a standard web browser) should respect
 > robots.txt directives aimed at a User-agent of "webproxy".

So this should suffice:

```
User-agent: webproxy
Disallow: /
```

Regards,
Johann

-- 
You can verify the digital signature on this email with the public key
available through web key discovery. Try e.g. `gpg --locate-keys`...
or go to
<https://qwertqwefsday.eu/.well-known/openpgpkey/hu/spd3xecxhotzgyu1p3eqdqdp31ba6rif>.


6. Mansfield (mansfield (a) ondollo.com)

On Sun, Feb 21, 2021 at 1:48 PM Johann Galle <johann at qwertqwefsday.eu>
wrote:

> Hi,
>
> why is robots.txt not the obvious answer here? The companion
> specification[1] has a "User-agent: webproxy" for this specific case:
>
>  > ### Web proxies
>  > Gemini bots which fetch content in order to translate said content into
> HTML and publicly serve the result over HTTP(S) (in order to make
> Geminispace accessible from within a standard web browser) should respect
> robots.txt directives aimed at a User-agent of "webproxy".
>
> So this should suffice:
>
> ```
> User-agent: webproxy
> Disallow: /
> ```
>
> Regards,
> Johann
>

I must admit, I'm woefully lacking skill or background with robots.txt. It
seems like it could be a great answer.

A few questions to help me educate myself:

 1. How often should that file be referenced by the proxy? It feels like one
answer might be to check that URL before every request, but that goes in the
direction of some of the negative feedback about the favicon: one user action
turning into one gemini request and then some.
 2. Is 'webproxy' a standard reference to any proxy, or is that something
left to us to decide?
 3. Are there globbing-like syntax rules for the Disallow field?
 4. I'm assuming there could be multiple rules that need to be mixed. Is
there a standard algorithm for that process? E.g.:
User-agent: webproxy
Disallow: /a
Allow: /a/b
Disallow: /a/b/c

Again - it seems like this could work out really well.

Thanks for helping me learn a bit more!

7. Emma Humphries (ech (a) emmah.net)

What's the use case behind blocking proxies? 

On Sun, Feb 21, 2021, at 10:53, Mansfield wrote:
> 
> 
> # Overview
> 
> I like the idea of supporting the creating-users freedom to choose to 
> have their content only accessible through the gemini protocol. (Of 
> course this support only goes so far - once content is available, it's 
> available. Stopping someone from taking what you've made available and 
> doing what they will with it... short of a legal license and 
> litigation... ugh.)


8. Sean Conner (sean (a) conman.org)

It was thus said that the Great Emma Humphries once stated:
> What's the use case behind blocking proxies? 

  Some people don't want web crawlers to crawl their Gemini space.  Web
crawlers can be ... stupidly bad [1].  For me, it's for web proxies that


  -spc

[1]	http://boston.conman.org/2019/07/09-12


9. Sean Conner (sean (a) conman.org)

It was thus said that the Great Mansfield once stated:
> 
> I must admit, I'm woefully lacking skill or background with robots.txt. It
> seems like it could be a great answer.
> 
> A few questions to help me educate myself:
> 
>  1. How often should that file be referenced by the proxy? It feels like
> an answer might be, to check that URL before every request, but that goes
> in the direction of some of the negative feedback about the favicon. One
> user action -> one gemini request and more.

  I would say once per "visit" would be good enough (say you have 50
requests to make to a site---check before doing all 50).  Checking
robots.txt for *every* request is a bit too much.  
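
  A small sketch of what "once per visit" could look like in practice, using
a cache with a time limit instead of a hard notion of a visit; the TTL value
and the fetch(host, path) helper are assumptions, not anything from the
companion spec:

```
import time

ROBOTS_TTL = 30 * 60   # e.g. half an hour; pick whatever "a visit" means to you
_robots_cache = {}     # host -> (expires_at, robots.txt text)

def robots_for(host, fetch):
    """fetch(host, path) -> (status, body), e.g. a small Gemini request helper."""
    now = time.monotonic()
    cached = _robots_cache.get(host)
    if cached and cached[0] > now:
        return cached[1]
    status, body = fetch(host, "/robots.txt")
    rules = body if status == 20 else ""  # no robots.txt means nothing is disallowed
    _robots_cache[host] = (now + ROBOTS_TTL, rules)
    return rules
```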

>  2. Is 'webproxy' a standard reference to any proxy, or is that something
> left to us to decide?

  The guide for Gemini [1] says:

	Below are definitions of various "virtual user agents", each of
	which corresponds to a common category of bot.  Gemini bots should
	respect directives aimed at any virtual user agent which matches
	their activity.  Obviously, it is impossible to come up with perfect
	definitions for these user agents which allow unambiguous
	categorisation of bots.  Bot authors are encouraged to err on the
	side of caution and attempt to follow the "spirit" of this system,
	rather than the "letter".  If a bot meets the definition of multiple
	virtual user agents and is not able to adapt its behaviour in a fine
	grained manner, it should obey the most restrictive set of
	directives arising from the combination of all applicable virtual
	user agents.

	...

	# Web Proxies

	Gemini bots which fetch content in order to translate said content
	into HTML and publicly serve the result over HTTP(S) (in order to
	make Geminispace accessible from within a standard web browser)
	should respect robots.txt directives aimed at a User-agent of
	"webproxy".

  So for example, if you are writing a gopher proxy (user makes a gopher
request to get to a Gemini site), then you might want to check for
"webproxy", even though you aren't actually behind a wesite but a gopher
site.  This is kind of a judgement call.

>  3. Are there globbing-like syntax rules for the Disallow field?

  No.  But it's not a complete literal match either.  

	Disallow:

will allow *all* requests.

	Disallow: /

will not allow any requests at all.

	Disallow: /foo

Will only disallow paths that *start* with the string '/foo', so '/foo',
'/foobar', '/foo/bar/baz/' will all be disallowed.

>  4. I'm assuming there could be multiple rules that need to be mixed. Is
> there a standard algorithm for that process? E.g.:
> User-agent: webproxy
> Disallow: /a
> Allow: /a/b
> Disallow: /a/b/c

  Allow: isn't in the standard per se, but many crawlers do accept it.  And
the rules for a user agent are applied in the order they're listed.  First
match wins.
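
  A short sketch of those rules as described here - prefix matching, rules
applied in order, first match wins, and an empty Disallow allowing
everything.  Some crawlers use longest-match instead, so treat this as an
illustration of the reading above rather than of robots.txt in general:

```
def is_allowed(path, rules):
    """rules: [("allow" | "disallow", prefix), ...] for the matching user agent."""
    for kind, prefix in rules:
        if kind == "disallow" and prefix == "":
            return True              # "Disallow:" with no value allows everything
        if path.startswith(prefix):
            return kind == "allow"
    return True                      # no rule matched: allowed

rules = [("disallow", "/a"), ("allow", "/a/b"), ("disallow", "/a/b/c")]
assert is_allowed("/x", rules)
assert not is_allowed("/a/z", rules)  # "Disallow: /a" matches first
assert not is_allowed("/a/b", rules)  # still blocked: "Disallow: /a" comes first
```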

> Again - it seems like this could work out really well.
> 
> Thanks for helping me learn a bit more!

  More about it can be read here [2].

  -spc

[1]	https://portal.mozz.us/gemini/gemini.circumlunar.space/docs/companion/robots.gmi

[2]	http://www.robotstxt.org/robotstxt.html


10. Petite Abeille (petite.abeille (a) gmail.com)



> On Feb 21, 2021, at 21:26, Sean Conner <sean at conman.org> wrote:
> 
>  After reading up on HTTP response codes, I think the most appropriate one
> is 409 Conflict. 

451 Unavailable For Legal Reasons surely.

https://tools.ietf.org/html/rfc7725

?0?


11. Dave Cottlehuber (dch (a) skunkwerks.at)

On Mon, 22 Feb 2021, at 17:46, Petite Abeille wrote:
> 
> 
> > On Feb 21, 2021, at 21:26, Sean Conner <sean at conman.org> wrote:
> > 
> >  After reading up on HTTP response codes, I think the most appropriate one
> > is 409 Conflict. 
> 
> 451 Unavailable For Legal Reasons surely.
> 
> https://tools.ietf.org/html/rfc7725

403 Forbidden is ideal for this, or one of the 50x error codes. In 
practice most systems will retry a 50x request but not a 403.

409 Conflict implies you are updating some resource and this is rejected
because of a conflicting request already being handled by the server. Not
appropriate to this case, as we are not updating anything.

451 is explicitly for legal reasons. Not semantic nor preference but book 
burning lawyer talk. Also not appropriate.

Using uncommon http codes makes things more confusing than necessary.

A+
Dave


12. Petite Abeille (petite.abeille (a) gmail.com)



> On Feb 22, 2021, at 18:22, Dave Cottlehuber <dch at skunkwerks.at> wrote:
> 
> 451 is explicitly for legal reasons. Not semantic nor preference but
> book burning lawyer talk. Also not appropriate.

Opinions, opinions.

Quoting Tim Bray:

   HTTP/1.1 451 Unavailable For Legal Reasons
   Link: <https://spqr.example.org/legislatione>; rel="blocked-by"
   Content-Type: text/html

   <html>
    <head><title>Unavailable For Legal Reasons</title></head>
    <body>
     <h1>Unavailable For Legal Reasons</h1>
     <p>This request may not be serviced in the Roman Province
     of Judea due to the Lex Julia Majestatis, which disallows
     access to resources hosted on servers deemed to be
     operated by the People's Front of Judea.</p>
    </body>
   </html>

Seems appropriate.

?0?


13. Petite Abeille (petite.abeille (a) gmail.com)



> On Feb 22, 2021, at 18:22, Dave Cottlehuber <dch at skunkwerks.at> wrote:
> 
> 403 Forbidden is ideal for this

Acknowledgements

   Thanks to Terence Eden, who observed that the existing status code
   403 was not really suitable for this situation, and suggested the
   creation of a new status code.

   Thanks also to Ray Bradbury.



?0?


14. John Cowan (cowan (a) ccil.org)

On Mon, Feb 22, 2021 at 12:30 PM Petite Abeille <petite.abeille at gmail.com>
wrote:


>      <h1>Unavailable For Legal Reasons</h1>
>      <p>This request may not be serviced in the Roman Province
>      of Judea due to the Lex Julia Majestatis, which disallows
>      access to resources hosted on servers deemed to be
>      operated by the People's Front of Judea.</p>
>
> Seems appropriate.
>

No no no.  It may sound silly today, but violating a law like that would
get your servers raided -- and by raided I mean that a decurio and his
squad would be sent to destroy them, capture anyone they could find, bring
the captives before a magistrate for an exceedingly speedy trial without
jury or lawyers, and then behead them (if citizens) or crucify them (if
non-citizens or slaves).  That's what you got for laesum maiestatum (or
lèse-majesté, as they say in Gaul).

Don't abuse the CCCCLI response.



John Cowan          http://vrici.lojban.org/~cowan        cowan at ccil.org
Ahhh, I love documentation.                           --Stephen C.
Now I know that I know, and why I believe that I know it.
My epistemological needs are so satisfied right now.

15. Petite Abeille (petite.abeille (a) gmail.com)



> On Feb 22, 2021, at 18:53, John Cowan <cowan at ccil.org> wrote:
> 
> violating a law

By "law", understand "non-technical reason(s)", e.g. ideology.

?0?


16. Petite Abeille (petite.abeille (a) gmail.com)



> On Feb 22, 2021, at 18:53, John Cowan <cowan at ccil.org> wrote:
> 
> No no no

Perhaps of interest:

Reflections on Internet Transparency
https://tools.ietf.org/html/rfc4924

?0?


17. Mansfield (mansfield (a) ondollo.com)

On Sun, Feb 21, 2021 at 6:44 PM Sean Conner <sean at conman.org> wrote:

> It was thus said that the Great Mansfield once stated:
> >
> > I must admit, I'm woefully lacking skill or background with robots.txt.
> It
> > seems like it could be a great answer.
> >
> > A few questions to help me educate myself:
> >
>

<snip>


>   More about it can be read here [2].
>
>   -spc
>
> [1]
> https://portal.mozz.us/gemini/gemini.circumlunar.space/docs/companion/robots.gmi
>
> [2]     http://www.robotstxt.org/robotstxt.html


Thanks for the links.

I'll add this to the TODO list.

18. Jason McBrayer (jmcbray (a) carcosa.net)


Dave Cottlehuber writes:
> 403 Forbidden is ideal for this, or one of the 50x error codes. In
> practice most systems will retry a 50x request but not a 403.

I feel that by analogy, status code 418 might be appropriate.

-- 
Jason McBrayer      | "Strange is the night where black stars rise,
jmcbray at carcosa.net | and strange moons circle through the skies,
                    | but stranger still is lost Carcosa."
                    | -- Robert W. Chambers, The King in Yellow


19. Bradley D. Thornton (Bradley (a) NorthTech.US)



On 2/22/2021 6:21 PM, Jason McBrayer wrote:
> 
> Dave Cottlehuber writes:
>> 403 Forbidden is ideal for this, or one of the 50x error codes. In
>> practice most systems will retry a 50x request but not a 403.
> 
> I feel that by analogy, status code 418 might be appropriate.
> 

Oh Jason you beat me to the punch lolz.

Yes I was just this very minute playing with ASCII-art generators for a
kewl teapot response page delivered with status code 20.

I figured I might need this after checking a few webproxies. Some
respect robots.txt, some don't. I understand that some of those authors
may not view their proxy as a bot, but middleware it is. It may not have
ever occurred to them to accommodate that part of the spec.

To further complicate things where my philosophy is concerned, this
was pointed out yesterday or so in the explanation of what exactly
a "webproxy" is, given the necessarily vague verbiage in the spec.

What if someone uses something like what Bitreich has with their SSH
kiosk, to enable people to burrow Gopherholes? I've actually thought of
deploying a service like this for Gemini...

But if I authored such a thing, I myself would have to say that yes, I'm
going to have to consider my product a webproxy and not violate a
Disallow rule for webproxy user agents.

And certainly, my specific intent here is not to stymie that sort of
vehicle. My beef is that I don't want any of the unique content that I
publish in Gopher space or Gemini space leaking into HTTP space. I just
loathe the thought of that - what purpose does Gopher or Gemini have,
where the client side is concerned, if it's all just going to be published
in HTML?

In fact, I rarely, and I mean rarely ever publish anything in HTTP space
that exists in either Gopher or Gemini space. I maintained a package
repo containing well over ten thousand slackware packages back in the
Slackware 9 and 10 days, and none of those packages were obtainable via
HTTP - only gopher.

To me (no point in debating me on this as it's simply how I feel about
it), having "Unique" content in a particular aspect of a network breeds
relevance for the usage of that space. With Gemini, I see much more
potential than simple phlogs in Gemini capsules here in the future.

But enough of why I feel that it's important enough for me to be kinda a
dik about it with respect to the way I brought it up the other day. I
try to be a lot nicer than that most of the time lolz.

Now I'm just thinking of my own properties in the following, so it may
or may not be applicable or attractive to others now or in the future -
or NOT. Either way is kewl :)

So short of /dev/null routing tables, I think a CIDR-based event handler
for Vger might be worth a go, because as it has been pointed out, it's
an actual user that will receive the page for all of those URLs. Instead
of baking this into the daemon of Vger, perhaps it would be more
elegant to forward the packets to another Gemini service with something
like go-mmproxy and simply serve a single page with that HTTP 418
graphic for any and all requests?

I'll still have to manually hunt down the offending webproxy bots that
refuse to comply with the published spec:

gemini://gemini.circumlunar.space/docs/companion/robots.gmi

But that's not so difficult, since most are eager to advertise as a
public service (and most with good intentions, to be certain).

I'm long past being angry about it. Now it just feels like a fun little
tinkering project to play around with. This problem is not unique to
Gemini space - the bad bot phenomenon has plagued the HTTP space for
decades, but as the spec points out, it's different with Gemini, because
there's no user agent to assist in the identification process.

Although I think serving a .gmi with an HTTP 418 graphic is quite
hilarious, there's an undercurrent that is sinister on another,
non-technical level.

I have wanted to believe that I can incorporate copyright law into the
things that I personally wish to share with the world by using things
like the GPL v2 (and in some instances the AGPL) or with a CC-BY-SA. But
this matter has me questioning if that's going to afford others (and
myself) the protections I choose for my works.

Do I need to do something like "Copyright 2021 all rights reserved?" or
will simply ratcheting things up a little bit to a CC-BY-ND legally
protect my intellectual property from being converted into HTML?

And what about the other user agents? Search engines like GUS and
Houston? Will they think that maybe they shouldn't crawl and index
servers that state that webproxies aren't welcome? I certainly don't
wish for that to happen; those are not simply valuable services to the
community, but in the coming years they're going to be vital.

And what about the archiver agents? Are they going to store their
results in space that includes HTTP servers?

It's potentially a whole can o' worms.

My position is quite simple. I don't want ANYONE to be able to read,
surf or access content on Vger from a web browser via HTTP protocol.
This of course, excludes plugins like Geminize, because the user is
actually using native Gemini protocol to access Gemini space.

Very simple concept to me. It makes Gopher more valuable. It makes
Gemini more valuable (provided the content is worth visiting via the
native protocols).

And I really don't want people to have to adhere to a "No Derivatives"
clause in a Creative Commons license. I want them to be able to take my
toys and edit them and share them with others in a form that suits the
greater good (yes, even if that means they put that shit on a web
server - they just have to retrieve it via a Gemini or Gopher client).

I really don't know the answers to all of the questions this may raise,
and maybe that's where the discussion should go, coz I don't see any
roadblocks to addressing this technically, or in a neighborly fashion. It's
the people who choose an immoral approach that may call for licensing and
copyright discussions.

But in the end... Hey man, what do I know? I'm just a teapot :)

Kindest regards,




-- 
Bradley D. Thornton
Manager Network Services
http://NorthTech.US
TEL: +1.310.421.8268


20. Nathan Galt (mailinglists (a) ngalt.com)



> On Feb 23, 2021, at 11:19 AM, Bradley D. Thornton <Bradley at NorthTech.US> wrote:
> 
> 
> 
> On 2/22/2021 6:21 PM, Jason McBrayer wrote:
>> 
>> Dave Cottlehuber writes:
>>> 403 Forbidden is ideal for this, or one of the 50x error codes. In
>>> practice most systems will retry a 50x request but not a 403.
>> 
>> I feel that by analogy, status code 418 might be appropriate.
>> 
> 
> 
> And certainly, my specific intent here is not to stymie that sort of
> vehicle. My beef, is I don't want any of the unique content that I
> publish in Gopher space or Gemini space leaking to HTTP space. I just
> loathe the thought of that - What purpose does Gopher or Gemini have
> where the client side is concerned if it's just going to be published in
> HTML.
[snip]
> I have wanted to believe that I can incorporate copyright law into the
> things that I personally wish to share with the world by using things
> like the GPL v2 (and in some instances the AGPL) or with a CC-BY-SA. But
> this matter has me questioning if that's going to afford others (and
> myself) the protections I choose for my works.
> 
> Do I need to do something like "Copyright 2021 all rights reserved?" or
> will simply racheting things up a little bit to a CC-BY-ND legally
> protect my intellectual property from being converted into HTML?
> 
> And what about the other user agents? Search engines like Gus and
> Houston? Will they think that maybe they shouldn't crawl and index
> servers that state that webproxies aren't welcome - I certainly don't
> wish for that to happen, those are not simply valuable services to the
> community, but in the coming years they're going to be vital.
> 
> And what about the archiver agetns? are they going to store their
> results in space that includes HTTP servers?
> 
> It's potentially a whole can "O" worms.
> 
> My position is quite simple. I don't want ANYONE to be able to read,
> surf or access content on Vger from a web browser via HTTP protocol.
> This of course, excludes plugins like Geminize, because the user is
> actually using native Gemini protocol to access Gemini space.
> 
> Very simple concept to me. It makes Gopher more valuable. it makes
> Gemini more valuable (provided the content is worth visiting via the
> native protocols).
> 
> And I really don't want people to have to adhere to a "No Derivitive"
> clause in a creative commons license. I want them to be able to take my
> toys and edit them and share them with others in a form that suits the
> greater good (Yes, even if  that means they put that shit on a web
> server - they just have to retrieve it via Gemini or Gopher client).
> 
> I really don't know the answers to all of the questions that may raise,
> and maybe that's where the discussion should go, coz I don't see any
> roadblocks adressing this technically, or in a neighborly fashion. It's
> the people who choose an immoral approach that may beg of licensing and
> copyright discussions.

Standard disclaimer: I am very much not a lawyer.

While this won't make you many friends online, it's worth pointing out
that you can get much further with a Disallow: line and a DMCA takedown
request than you can with a Disallow: line alone.

That said, if the other party pushes back, you might end up having to pay
lawyers to litigate novel caselaw regarding proxying of content. Then
again, all this might've been settled by websites that put their banners
on things and stick your site in an <iframe> in the middle of it all.

