💾 Archived View for gemi.dev › gemini-mailing-list › 000424.gmi captured on 2023-11-04 at 12:48:16. Gemini links have been rewritten to link to archived content

-=-=-=-=-=-=-

Proposal about content-size and hash

📧 Messages: 48
🗣️ Authors: 20
📅 First Message: 2020-10-29 11:11
📅 Last Message: 2021-03-03 22:14

Arav K. <nothien (a) uber.space>

📅 Sent: 2020-10-29 11:11
📧 Message 1 of 48

Hi, Gemini peoples!

I have two main proposals for getting information like content-size
(which, as has been discussed previously, would be very useful for even
slightly big responses).

1. Define specific additional MIME type parameters.  For example, one
   could return '20 application/gzip; content-size=42069420' as the
   response line.  This could also be used for giving the client a hash
   of the data (parameter name e.g. 'content-hash-sha256'), useful for
   verification and for checking against local caches of the same page.

   To just check the MIME type (now possibly with size and hash), as
   jmcbray had mentioned on a previous thread on the topic, the best
   solution is to just request the page and terminate the connection
   after the request line.  Perhaps it should be specified in the spec
   that clients may do this, and servers may want to be prepared for
   this outcome.

   A possible issue is that the hash will not fit within the META line,
   given the 1024 byte limitation (this is especially problematic for
   things like SHA 512 hashes).  We could work around this either by
   increasing the mandated maximum META line size (probably not
   possible), or by providing this information using the second proposal
   but still using this proposal (or both) for content-size.

   A potential drawback is that we open ourselves up to further
   extension this way, but I would argue that this avenue has always
   been around.  If the spec defined only these two fields, and mandated
   that aside from them only parameters defined by the MIME spec may be
   included, then it _should_ be fine.

2. Define an additional endpoint for retrieving meta info.

   On ~chat, bjorn.warmedal laid out the possibility of using the
   '/.content' URL to return a content hash.  This would function like a
   normal URL, one which accepts as input the URL of the page to
   retrieve content information about.

   I propose that the response from a '/.content?/<path>' request
   returns the MIME type of the URL '/<path>', optionally including
   content-size and content-hash-* parameters (this works because there
   are no size restrictions for the content, unlike the META line).  Its
   MIME type can be bikeshed if this proposal is agreed upon.  The
   response format may be exactly a MIME type (i.e. no CR LF or
   anything).

   Unfortunately, extension becomes possible simply by using a different
   MIME type.  I don't know how to prevent this.
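As a rough illustration of what proposal #1 would ask of a client, the sketch below splits a response line into status and META, then pulls the parameters out of the MIME type. The function names are hypothetical, 'content-size' is only the parameter name suggested above, and the parser deliberately ignores corner cases such as quoted values that contain ';'. Since proposal #2 would return a bare MIME type with the same parameters, the same parse_meta helper could presumably be applied to its response body.

```python
def parse_response_header(line: str) -> tuple[int, str]:
    """Split '<STATUS><SPACE><META>' into an integer status and the META text."""
    status, _, meta = line.rstrip("\r\n").partition(" ")
    return int(status), meta


def parse_meta(meta: str) -> tuple[str, dict[str, str]]:
    """Split a status-20 META string into (media type, parameters).

    Simplified: does not handle quoted values containing ';' or '='.
    """
    media_type, *raw_params = meta.split(";")
    params: dict[str, str] = {}
    for raw in raw_params:
        name, sep, value = raw.partition("=")
        if sep:  # skip malformed pieces that have no '='
            params[name.strip().lower()] = value.strip().strip('"')
    return media_type.strip().lower(), params


# The example response line from proposal #1.
status, meta = parse_response_header("20 application/gzip; content-size=42069420\r\n")
media_type, params = parse_meta(meta)
# 'content-size' is optional, so a client must cope with its absence.
size = int(params["content-size"]) if "content-size" in params else None
```

A client that does not know about 'content-size' would simply never look the key up, which is the "ignore unknown parameters" behaviour the proposal relies on.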

I think that proposal #1 is definitely doable, but I understand that #2
can be more problematic for some.  I suppose #2 would be beneficial if
we determine that content-hash is really necessary, but further
extensibility is unwanted and should be prevented somehow.

~aravk | ~nothien

Link to individual message.

Björn Wärmedal <bjorn.warmedal (a) gmail.com>

Subject Changed! New Subject: Content length, EOF -- ways to resolve whether we received everything
📅 Sent: 2020-10-30 11:50
📧 Message 2 of 48

There's been a lot of discussions about the lack of an end-of-message
indicator in the protocol. Clearly it's something that a lot of client and
server implementers are missing.

A proposal that arose (I think acdw may have been the first to suggest it;
correct me if I'm wrong) was to include a "content-length: <nr of bytes>"
in the <META> of a status 20 response. It's a simple thing to add, that
doesn't extend the protocol into bloat.

"But, ew0k! The <META> is for MIME types! That's not a MIME type!"

Yes, this is true. Would that cause an issue? In that case calling it
"x-gemi-content-length" should resolve it, as the "x-" prefix is for
experimental types and any receiver that doesn't recognize it will ignore
it. I'm aware that it still doesn't convey info on the *type* of content,
and thereby doesn't belong among MIME type info, but it's a compromise I'm
willing to make.

Are you?

Cheers,
BW/ew0k

Link to individual message.

Martin Keegan <martin (a) no.ucant.org>

📅 Sent: 2020-10-30 12:02
📧 Message 3 of 48

On Fri, 30 Oct 2020, Björn Wärmedal wrote:

> it. I'm aware that it still doesn't convey info on the *type* of content,
> and thereby doesn't belong among MIME type info, but it's a compromise I'm
> willing to make.
>
> Are you?

No.

Mk

-- 
Martin Keegan, @mk270, https://mk.ucant.org/

Link to individual message.

Nick Thomas <gemini (a) ur.gs>

📅 Sent: 2020-10-30 13:00
📧 Message 4 of 48

On Fri, 2020-10-30 at 12:50 +0100, Björn Wärmedal wrote:
> 
> "But, ew0k! The <META> is for MIME types! That's not a MIME type!"

I predict this is not going to be popular ^^.

The HTTP approach to the specific problem of no content-length header
is (was) chunked transfer-encoding: 
https://en.wikipedia.org/wiki/Chunked_transfer_encoding

Perhaps signalling this is more acceptable in the <META> area?
Transfer-encoding is at least directly about the format of the returned
bytes.

Signalling support from the client side is more difficult, though. It
could be an opt-in thing with server-side state when authenticated, but
that's a bit niche.

I predict adding it as a requirement to the spec will not be popular
either ;).

/Nick

Link to individual message.

John Cowan <cowan (a) ccil.org>

📅 Sent: 2020-10-30 13:38
📧 Message 5 of 48

On Fri, Oct 30, 2020 at 7:50 AM Björn Wärmedal <bjorn.warmedal at gmail.com>
wrote:

> There's been a lot of discussions about the lack of an end-of-message
> indicator in the protocol. Clearly it's something that a lot of client and
> server implementers are missing.
>

I confess to not having read these discussions.  But what's the problem?
The server writes an entity-body to the socket and closes it.  The client
reads from the socket until it gets EOF (a zero-length return from read or
recv).  Done.  Gopher has been transmitting binary files like this forever,
and the cognate protocols finger and whois also do it this way: no length
or in-band EOF sequence.
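For readers who want to see the read-until-EOF pattern spelled out, here is a minimal Python sketch. It uses a plain local socket pair rather than a real TLS connection to a Gemini server, and the helper name read_until_eof is made up for the illustration.

```python
import socket


def read_until_eof(sock: socket.socket, chunk_size: int = 4096) -> bytes:
    """Collect everything the peer sends until it closes the connection."""
    chunks = []
    while True:
        chunk = sock.recv(chunk_size)
        if not chunk:  # zero-length read: the peer closed its end
            break
        chunks.append(chunk)
    return b"".join(chunks)


# Local demonstration with a connected socket pair instead of a real server.
server, client = socket.socketpair()
server.sendall(b"20 text/gemini\r\n# Hello\n")
server.close()  # closing is the only end-of-response signal
response = read_until_eof(client)
client.close()
```

Note that the loop only learns the total size once the connection has closed, which is the point the rest of the thread argues about.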

It is the particular mime-type that declares what parameters are meaningful
to it.  Writing "text/plain;charset=utf-8;content-length=32767" will not
mean anything to anyone outside the Gemini world and will probably confuse
them.

Let's not go there.

John Cowan          http://vrici.lojban.org/~cowan        cowan at ccil.org
One art / There is / No less / No more
To do / All things / With sparks / Galore   --Douglas Hofstadter

Link to individual message.

Arav K. <nothien (a) uber.space>

📅 Sent: 2020-10-30 16:13
📧 Message 6 of 48

On Fri Oct 30, 2020 at 2:38 PM WAT, John Cowan wrote:
> I confess to not having read these discussions. But what's the
> problem?  The server writes an entity-body to the socket and closes
> it. The client reads from the socket until it gets EOF (a zero-length
> return from read or recv). Done. Gopher has been transmitting binary
> files like this forever, and the cognate protocols finger and whois
> also do it this way: no length or in-band EOF sequence.

The problem is that clients have no way to know how much stuff they're
receiving.  This would be helpful for interactive clients, so that they
can tell the user some indication of progress, and it would be helpful
for more constrained programs to know whether or not they can store all
their input.  Both of these points were raised in previous discussions.

> It is the particular mime-type that declares what parameters are
> meaningful to it. Writing
> "text/plain;charset=utf-8;content-length=32767" will not mean anything
> to anyone outside the Gemini world and will probably confuse them.

AFAIK, all MIME parsers know to ignore additional MIME parameters.  Even
if someone who did not know about this concept were to see such a MIME
type, the idea is clear.  I don't see how including this is a problem.

~aravk | ~nothien

Link to individual message.

Emery Hemingway <ehmry (a) posteo.net>

Subject Changed! New Subject: Proposal about content-size and hash
📅 Sent: 2020-10-30 16:28
📧 Message 7 of 48

I'd also like to see optional content hashes to aid caching and archiving.

What I miss about the pre-HTTPs days was that personalizing the web
experience was something that could be done independently of the browser
via proxies, be it caching, ad-blocking, or text manipulation.

I'd like to see proxies make a comeback, and local caching proxies seem
like a good place to start.

E.

Link to individual message.

Sean Conner <sean (a) conman.org>

📅 Sent: 2020-10-30 22:06
📧 Message 8 of 48

It was thus said that the Great Arav K. once stated:
> Hi, Gemini peoples!
> 
> I have two main proposals for getting information like content-size
> (which, as has been discussed previously, would be very useful for even
> slightly big responses).
> 
> 1. Define specific additional MIME type parameters.  For example, one
>    could return '20 application/gzip; content-size=42069420' as the
>    response line.  This could also be used for giving the client a hash
>    of the data (parameter name e.g. 'content-hash-sha256'), useful for
>    verification and for checking against local caches of the same page.

  Ah yes ... this is something I proposed back in August of 2019:

	gemini://gemini.conman.org/gRFC/0003

  Solderpunk has always rejected it when it comes up, mumbling "simplicity"
and "not HTTP again" under his breath.  Another issue with this is that the
server might not know how big a given response is---think dynamic output
(via CGI/SCGI or similar).  It would have to be optional in any case.

> 2. Define an additional endpoint for retrieving meta info.

  I also proposed something along these lines in August of 2019:

	https://lists.orbitalfox.eu/archives/gemini/2019/000016.html

  Read the thread to see how that turned out.

  -spc

Link to individual message.

Sean Conner <sean (a) conman.org>

Subject Changed! New Subject: Content length, EOF -- ways to resolve whether we received everything
📅 Sent: 2020-10-30 22:07
📧 Message 9 of 48

It was thus said that the Great Björn Wärmedal once stated:
> Yes, this is true. Would that cause an issue? In that case calling it
> "x-gemi-content-length" should resolve it, as the "x-" prefix is for
> experimental types and any receiver that doesn't recognize it will ignore
> it. 

  RFC-6648 deprecates the whole "X-" thing.

  -spc

Link to individual message.

Arav K. <nothien (a) uber.space>

Subject Changed! New Subject: Proposal about content-size and hash
📅 Sent: 2020-10-30 22:27
📧 Message 10 of 48

On Fri Oct 30, 2020 at 12:06 PM UTC, Sean Conner wrote:
> Ah yes ... this is something I proposed back in August of 2019:
>
> gemini://gemini.conman.org/gRFC/0003
>
> Solderpunk has always rejected it when it comes up, mumbling
> "simplicity" and "not HTTP again" under his breath.

It would be a useful optional header that simply requires a few lines of
info in the spec - I don't know what solderpunk is going on about.
Clients (that conform to the MIME parsing guidelines) will ignore it if
they don't recognize it.  It's not affecting anything else in the spec.

I understand to some extent solderpunk's fear of MIME type parameters
becoming the new HTTP header fields (or whatever they're called), but
this has always been around as an avenue of extension.  We can't even
specify that unrecognized MIME type parameters be considered errors,
because the MIME spec itself leaves open the possibility of adding more
parameters later.  I argue that if this avenue of extension is open, and
it cannot be closed, then may as well make (optional) use of it.

> Another issue with this is that the server might not know how big a
> given response is---think dynamic output (via CGI/SCGI or similar). It
> would have to be optional in any case.

For things like CGI, the server is expected to execute the CGI stuff
before sending a success response (in case the CGI fails; see the
CGI-specific error code in the spec).  Immediately after execution, it
would have the full response to send, so it can still provide the
content size.  Even then, it would still be optional to provide, as no
servers currently provide it.

> I also proposed something along these lines in August of 2019:
>
> https://lists.orbitalfox.eu/archives/gemini/2019/000016.html
>
> Read the thread to see how that turned out.

From what it seems, you suggested the same thing I have a year back, and
solderpunk said 'ah' and then it was forgotten.

Here's what solderpunk's last response to the idea from that thread had
been:

> Yeah, I think we can leave this for now.  It was a hypothetical
> concern that somebody had.  Not necessarily a bad one, but until it's
> observed actually creating significant trouble for actual users on
> actual clients I think we can just table this issue.  If it does come
> up as a practical concern, we can resume discussion of some of the
> ideas here.

The problem has come up again, and more clients are looking for ways to
report this information.  I think we should 'officially' resume
discussion of this, and we can start by figuring out issues with this
MIME type parameter proposal.

In addition, someone reminded me how short content hashes are in
comparison to the available space in the META line.  Even SHA-512 takes
up only 140 characters out of the 1024 available on META, so we can
actually encode content-hash as well in the MIME type.  I understand
this is even more controversial than content-length, but I'm leaving
it out there for discussion.
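To make the space budget concrete, here is a small sketch that builds a META string with a hex-encoded hash parameter and measures what is left of the 1024-byte limit cited in this thread. The parameter name 'content-hash-sha512' is an assumption extrapolated from the 'content-hash-sha256' example in the original proposal, and hex is only one possible encoding. A SHA-512 digest is 64 bytes, so 128 hexadecimal characters; the total for the parameter depends on the name and encoding chosen.

```python
import hashlib

META_LIMIT = 1024  # maximum META length in bytes, as cited in this thread


def hash_param(algorithm: str, data: bytes) -> str:
    """Format a hypothetical 'content-hash-<algorithm>=<hex digest>' parameter."""
    digest = hashlib.new(algorithm, data).hexdigest()
    return f"content-hash-{algorithm}={digest}"


body = b"example response body"
meta = "application/gzip; content-size=42069420; " + hash_param("sha512", body)
remaining = META_LIMIT - len(meta.encode("utf-8"))
```

Even with both parameters present, most of the META budget is still unused in this example.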

~aravk | ~nothien

Link to individual message.

Nathan Galt <mailinglists (a) ngalt.com>

Subject Changed! New Subject: External SHA hashes instead? (was: Re: Content length, EOF -- ways to resolve whether we received everything)
📅 Sent: 2020-10-30 22:34
📧 Message 11 of 48



> On Oct 30, 2020, at 4:50 AM, Björn Wärmedal <bjorn.warmedal at gmail.com> wrote:
> 
> There's been a lot of discussions about the lack of an end-of-message
> indicator in the protocol. Clearly it's something that a lot of client and
> server implementers are missing.
> 
> A proposal that arose (I think acdw may have been the first to suggest
> it; correct me if I'm wrong) was to include a "content-length: <nr of
> bytes>" in the <META> of a status 20 response. It's a simple thing to add,
> that doesn't extend the protocol into bloat.
> 
> "But, ew0k! The <META> is for MIME types! That's not a MIME type!"
> 
> Yes, this is true. Would that cause an issue? In that case calling it
> "x-gemi-content-length" should resolve it, as the "x-" prefix is for
> experimental types and any receiver that doesn't recognize it will ignore
> it. I'm aware that it still doesn't convey info on the *type* of content,
> and thereby doesn't belong among MIME type info, but it's a compromise I'm
> willing to make.
> 
> Are you?
> 
> Cheers,
> BW/ew0k


If you're worried whether the occasional big(gish) file transferred
correctly and don't want to stand up an HTTP server, have you considered
publishing SHA-256 or -512 hashes, like one does for Linux-distribution .iso files?
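A minimal sketch of that out-of-band check, assuming the publisher lists a hex digest somewhere the client can read it (much like a SHA256SUMS file next to an .iso). The function name and the sample data are made up for the illustration.

```python
import hashlib
import hmac


def matches_published_hash(data: bytes, published_hex: str,
                           algorithm: str = "sha256") -> bool:
    """Return True when the downloaded bytes hash to the published digest."""
    actual = hashlib.new(algorithm, data).hexdigest()
    return hmac.compare_digest(actual, published_hex.strip().lower())


# Stand-in for a file fetched over Gemini and the digest its author published.
downloaded = b"pretend this is a big(gish) file"
published = hashlib.sha256(downloaded).hexdigest()

complete_ok = matches_published_hash(downloaded, published)
truncated_ok = matches_published_hash(downloaded[:-5], published)
```

A truncated transfer produces a different digest, so the check fails without the protocol itself carrying any length information.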

Link to individual message.

Ivy Foster <escondida (a) iff.ink>

Subject Changed! New Subject: Content length, EOF -- ways to resolve whether we received everything
📅 Sent: 2020-10-30 23:12
📧 Message 12 of 48

On 30 Oct 2020, at 12:50 pm +0100, Björn Wärmedal wrote:
> A proposal that arose (I think acdw may have been the first to suggest it;
> correct me if I'm wrong) was to include a "content-length: <nr of bytes>"
> in the <META> of a status 20 response. It's a simple thing to add, that
> doesn't extend the protocol into bloat.

> "But, ew0k! The <META> is for MIME types! That's not a MIME type!"

> Yes, this is true. Would that cause an issue? In that case calling it
> "x-gemi-content-length" should resolve it, as the "x-" prefix is for
> experimental types and any receiver that doesn't recognize it will ignore
> it. I'm aware that it still doesn't convey info on the *type* of content,
> and thereby doesn't belong among MIME type info, but it's a compromise I'm
> willing to make.

Alternately, as I [previously pointed out][1], there is [an
established RFC][2] for including size as a parameter to a MIME type.
Specifically, it's part of the specification for email: if any
"attachment" is actually an external part to be fetched rather than an
actual attachment, you specify
"content-type:foo/bar;access-type:how-you-get-it;(optionally)size:octets".
Solderpunk did shoot it down, granted, but since the conversation's
come up again I figure I should point this out instead of having us
come up with new MIME extensions.

[1]: https://lists.orbitalfox.eu/archives/gemini/2020/001534.html
[2]: https://tools.ietf.org/html/rfc1341

Link to individual message.

marc <marcx2 (a) welz.org.za>

Subject Changed! New Subject: Proposal about content-size and hash
📅 Sent: 2020-10-31 08:42
📧 Message 13 of 48


Hi

>   Ah yes ... this is something I proposed back in August of 2019:
> 
> 	gemini://gemini.conman.org/gRFC/0003
> 
>   Solderpunk has always rejected it when it comes up, mumbling "simplicity"
> and "not HTTP again" under his breath.  Another issue with this is that the
> server might not know how big a given response is---think dynamic output
> (via CGI/SCGI or similar).  It would have to be optional in any case.
> 
> > 2. Define an additional endpoint for retrieving meta info.
> 
>   I also proposed something along these lines in August of 2019:
> 
> 	https://lists.orbitalfox.eu/archives/gemini/2019/000016.html
> 
>   Read the thread to see how that turned out.


The list has been full of little proposals to add just one
more protocol element to make things look a bit more like http
or one more tag to make things look a bit more like html.

In a way this is understandable - the features of http+html
are well known, and part of one's mental problem solving kit, and
everybody uses gemini a bit differently, so when encountering
a limitation, suggesting bits of http+html is easy...

But (and I don't speak for solderpunk) I view gemini as a
reaction against http/html - the thesis is that there might
be technical aspects of that stack which have made surveillance
capitalism easier. So I don't see gemini as http+html lite - I
see it as its antithesis.

If people want to see gemini evolve, don't have it converge
on http+html, have it work towards a better world.

For instance: In this thread people worry about partially
downloaded data - we don't know if a gemini document has
been transferred completely. We also have the problem that
people are writing gemini to web gateways but other people
would like to keep their content in gemini-space only.
And then Emery Hemingway astutely points out that the modern
web has forgotten about caching and replication.

So: If (and it is a big if) there are extension proposals to be
made, how about an (optional) footer in the document markup
(not the transfer protocol) which improves on the mean:

  "(c) Joe Soap, all rights reserved"

to something which states under what conditions that document
might be distributed, if it is a cachable static page or one which
updates and who should be credited as author. There are many
parts of the small web which contain a snippet of valuable
information, and often it is unclear how to rescue those
if the original site disappears. A line like

  -- CC-SA: Joe Soap

for creative commons, share alike or

  -- GMI,DYN,NOC: Joe Soap

for gemini space only, dynamic content not to be cached, or just

  --

for end of document. So this mechanism then solves multiple
problems (I think /robots.txt is an ugly afterthought) and
opens the way for people to cache/curate/remix/reuse each other's
ideas. The -- could be a different tag, but -- is in common
use to denote a signature, so a footer, already.
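A sketch of how a client might read such a footer, assuming the exact shapes shown above ("-- FLAGS: Author" or a bare "--") and that the footer is the last non-empty line of the document. None of this is specified anywhere; the Footer type and parse_footer name are hypothetical.

```python
from typing import NamedTuple, Optional


class Footer(NamedTuple):
    flags: tuple[str, ...]   # e.g. ("GMI", "DYN", "NOC")
    author: Optional[str]    # None for a bare "--"


def parse_footer(document: str) -> Optional[Footer]:
    """Return the footer found on the last non-empty line, or None."""
    lines = [line.strip() for line in document.splitlines() if line.strip()]
    if not lines:
        return None
    last = lines[-1]
    if last != "--" and not last.startswith("-- "):
        return None  # no footer: the document lacks one or was cut short
    rest = last[2:].strip()
    if not rest:
        return Footer((), None)  # bare "--": end-of-document marker only
    flags, sep, author = rest.partition(":")
    return Footer(
        tuple(flag.strip() for flag in flags.split(",") if flag.strip()),
        author.strip() if sep else None,
    )
```

A missing footer cannot distinguish "truncated" from "author did not add one", so this only helps for documents known to use the convention.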

However, having just written all that, it is still good to
consider the following:

  "Perfection isn't achieved when nothing more can be added, it
   is achieved when nothing more can be taken away"

                                - An impressive author and aviator

regards

marc

Link to individual message.

Arav K. <nothien (a) uber.space>

📅 Sent: 2020-10-31 09:13
📧 Message 14 of 48

> The list has been full of little proposals to add just one
> more protocol element to make things look a bit more like http
> or one more tag to make things look a bit more like html.
>
> In a way this is understandable - the features of http+html
> are well known, and part of one's mental problem solving kit, and
> everybody uses gemini a bit differently, so when encountering
> a limitation, suggesting bits of http+html is easy...
>
> But (and I don't speak for solderpunk) I view gemini as a
> reaction against http/html - the thesis is that there might
> be technical aspects of that stack which have made surveillance
> capitalism easier. So I don't see gemini as http+html lite - I
> see it as its antithesis.
> 
> If people want to see gemini evolve, don't have it converge
> on http+html, have it work towards a better world.

This is an orthogonal point.  Yes, I agree that there are elements of
HTTP+HTML that allow for surveillance, and yes, we don't want them, but
a content-size header is not wandering in that direction at all.  It is
an optional extension that allows clients to show progress bars and
potentially give ETAs for downloads.  Period.  Please tell me about what
aspects of surveillance capitalism content-size allows for.

> For instance: In this thread people worry about partially
> downloaded data - we don't know if a gemini document has
> been transferred completely. We also have the problem that
> people are writing gemini to web gateways but other people
> would like to keep their content in gemini-space only.
> And then Emery Hemingway astutely points out that the modern
> web has forgotten about caching and replication.
>
> So: If (and it is a big if) there are extension proposals to be
> made, how about an (optional) footer in the document markup
> (not the transfer protocol) which improves on the mean:
>
> "(c) Joe Soap, all rights reserved"
>
> to something which states under what conditions that document
> might be distributed, if it is a cachable static page or one which
> updates and who should be credited as author. There are many
> parts of the small web which contain a snippet of valuable
> information, and often it is unclear how to rescue those
> if the original site disappears. A line like
>
> -- CC-SA: Joe Soap
>
> for creative commons, share alike or
>
> -- GMI,DYN,NOC: Joe Soap
>
> for gemini space only, dynamic content not to be cached, or just
>
> --
>
> for end of document. So this mechanism then solves multiple
> problems (I think /robots.txt is an ugly afterthought) and
> opens the way for people to cache/curate/remix/reuse each other's
> ideas. The -- could be a different tag, but -- is in common
> use to denote a signature, so a footer, already.

It doesn't solve the original problem I was trying to address, as I've
pointed out above (i.e. it doesn't allow for progress bars or ETAs).
What you're suggesting is an orthogonal idea that can be implemented in
conjunction with my proposal.  And it's not a bad idea you've got here!
But it is in no way an argument against my proposal.  Even if you didn't
mean it in that way, I want to make this clear.

~aravk | ~nothien

Link to individual message.

Björn Wärmedal <bjorn.warmedal (a) gmail.com>

Subject Changed! New Subject: External SHA hashes instead? (was: Re: Content length, EOF -- ways to resolve whether we received everything)
📅 Sent: 2020-10-31 09:21
📧 Message 15 of 48

It wouldn't just be about content integrity; it's also about caching. If
the client has fetched a file before, and it hasn't changed since, there's
no need to waste time and bandwidth fetching it again.
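Roughly, the client-side check could look like the sketch below. It assumes the client has some way to learn the current hash without downloading the body (for instance one of the two proposals earlier in this thread), and the cache layout and function name are invented for the example.

```python
import hashlib
from typing import Optional

# URL -> body from an earlier fetch (a real client would persist this).
cache: dict[str, bytes] = {}


def cached_copy_if_current(url: str, advertised_sha256: str) -> Optional[bytes]:
    """Return the cached body if its hash matches the advertised one."""
    body = cache.get(url)
    if body is not None and hashlib.sha256(body).hexdigest() == advertised_sha256.lower():
        return body  # unchanged since the last fetch: no need to download again
    return None      # unknown or changed: the caller should fetch it


# Demonstration with a made-up URL and page.
cache["gemini://example.org/"] = b"# Hello\n"
same = hashlib.sha256(b"# Hello\n").hexdigest()
changed = hashlib.sha256(b"# Hello, again\n").hexdigest()

hit = cached_copy_if_current("gemini://example.org/", same)
miss = cached_copy_if_current("gemini://example.org/", changed)
```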

On Fri, 30 Oct 2020 at 23:34, Nathan Galt <mailinglists at ngalt.com> wrote:

>
>
> > On Oct 30, 2020, at 4:50 AM, Björn Wärmedal <bjorn.warmedal at gmail.com>
> wrote:
> >
> > There's been a lot of discussions about the lack of an end-of-message
> indicator in the protocol. Clearly it's something that a lot of client and
> server implementers are missing.
> >
> > A proposal that arose (I think acdw may have been the first to suggest
> it; correct me if I'm wrong) was to include a "content-length: <nr of
> bytes>" in the <META> of a status 20 response. It's a simple thing to add,
> that doesn't extend the protocol into bloat.
> >
> > "But, ew0k! The <META> is for MIME types! That's not a MIME type!"
> >
> > Yes, this is true. Would that cause an issue? In that case calling it
> "x-gemi-content-length" should resolve it, as the "x-" prefix is for
> experimental types and any receiver that doesn't recognize it will ignore
> it. I'm aware that it still doesn't convey info on the *type* of content,
> and thereby doesn't belong among MIME type info, but it's a compromise I'm
> willing to make.
> >
> > Are you?
> >
> > Cheers,
> > BW/ew0k
>
>
> If you're worried whether the occasional big(gish) file transferred
> correctly and don't want to stand up an HTTP server, have you considered
> publishing SHA-256 or -512 hashes, like one does for Linux-distribution
> .iso files?

Link to individual message.

Arav K. <nothien (a) uber.space>

📅 Sent: 2020-10-31 09:25
📧 Message 16 of 48

On Fri Oct 30, 2020 at 12:34 PM UTC, Nathan Galt wrote:
> If you?re worried whether the occasional big(gish) file transferred
> correctly and don?t want to stand up an HTTP server, have you
> considered publishing SHA-256 or -512 hashes, like one does for
> Linux-distribution .iso files?

The issue with this is twofold:
 * Clients generally don't want the hashes of all files, only of the
   ones they are trying to download.  It is better to only provide this hash,
   which can be done for individual files (either as a MIME type param
   or from a special access point which takes the file name as input).
 * The server would have to constantly update the hash list (or keep it
   in memory and still keep updating it).  This is all the more
   difficult for servers running CGI scripts, as their output can vary.
   Further complications arise if servers change their output based on
   client certs.  Better to serve the hash along with the content
   itself, as then all the details about the content are known (e.g. the
   client cert used to request the content, and the state of databases
   and CGI output which affect the response).

~aravk | ~nothien

Link to individual message.

Solderpunk <solderpunk (a) posteo.net>

Subject Changed! New Subject: Proposal about content-size and hash
📅 Sent: 2020-11-01 13:39
📧 Message 17 of 48

On Fri Oct 30, 2020 at 11:06 PM CET, Sean Conner wrote:

> Ah yes ... this is something I proposed back in August of 2019:
>
> gemini://gemini.conman.org/gRFC/0003
>
> Solderpunk has always rejected it when it comes up, mumbling
> "simplicity"
> and "not HTTP again" under his breath.

Well, that's not quite fair.  What I usually mumble, in the context of
people wanting to add this stuff as part of the MIME type in a status 20
response, is something like this:

1) The parameters which may be legitimately appended to a MIME media
type are those registered in that media type's formal definition.
You can't just stick whatever you want on there.  We can define
parameters for text/gemini, but we can't do anything about other media
types whose definitions are beyond our control.

2) A media type is supposed to specify, well, what *type* of thing a
file is.  They are *categories*, which a client can use to decide how to
handle a given file.  From RFC 2045, section 5:

> The purpose of the Content-Type field is to describe the data
> contained in the body fully enough that the receiving user agent can
> pick an appropriate agent or mechanism to present the data to the
> user, or otherwise deal with the data in an appropriate manner.

It seems very clear to me that information which is unique to a specific
individual file - like its size, hash, last update time, etc. - is just
not semantically appropriate here.

TLDR; MIME types aren't big trucks that you just dump something on.
They have a clearly limited semantic scope, and each media type and
subtype is constrained by a formal definition.

Cheers,
Solderpunk

Link to individual message.

Solderpunk <solderpunk (a) posteo.net>

Subject Changed! New Subject: External SHA hashes instead? (was: Re: Content length, EOF -- ways to resolve whether we received everything)
📅 Sent: 2020-11-01 14:26
📧 Message 18 of 48

On Sat Oct 31, 2020 at 10:21 AM CET, Björn Wärmedal wrote:
> It wouldn't just be about content integrity; it's also about caching. If
> the client has fetched a file before, and it hasn't changed since,
> there's
> no need to waste time and bandwidth fetching it again.

I'm not philosophically opposed to caching or anything - far from it -
but it comes with a complexity cost and whether or not that cost is
worth paying depends upon the benefits it brings.

On the web, caching is a win, because webpages are composed of many
different parts (style sheets, background images, tracking scripts,
sometimes even content!), some of those parts are re-used across
multiple pages, and some of those parts can be fairly large.

In Geminispace, the majority of content is small (a kilobyte or so),
and visiting a single URL only means downloading an individual resource,
so there's no way for a resource to be frequently re-used like a large
background image might be on a website.  I'm not sure the cost to
benefit ratio is quite so clearly favourable.

It's true that frequent use of "the back button" or similar
navigational strategies to get around Geminispace means that certain
individual pages may be loaded several times within a browsing session
(one might go back and forth between a page listing entries in a gemlog
and those individual entries), and that can be a little wasteful.
Smarter navigation tools (like AV-98's "tours") can reduce this waste,
but it's also easily fixed by clients doing something simple like
keeping the 5 or 10 most recently visited pages cached in memory, with
cached pages expiring after 10 minutes or so (AV-98 does this, although
I don't think I've done a release since adding it).  Clients doing this
would obviously have problems with content which updated more frequently
than every 10 minutes, but that kind of content is - and I would argue
*should* be - very rare in Geminispace.  Apps are the one obvious place
where this strategy might fail, which is perhaps yet another argument in
favour of having different clients optimised for reading and for apps.
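The "keep the last few pages for ten minutes" idea can be sketched in a few lines of Python. This is not AV-98's actual code, just one way to do it; the class name, the defaults of 10 pages and 600 seconds, and the injectable clock (there so the expiry can be tested without waiting) are all assumptions.

```python
import time
from collections import OrderedDict
from typing import Callable, Optional


class RecentPageCache:
    """Keep the most recently visited pages in memory for a limited time."""

    def __init__(self, max_pages: int = 10, max_age: float = 600.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._pages: "OrderedDict[str, tuple[float, bytes]]" = OrderedDict()
        self._max_pages = max_pages
        self._max_age = max_age
        self._clock = clock

    def put(self, url: str, body: bytes) -> None:
        self._pages[url] = (self._clock(), body)
        self._pages.move_to_end(url)          # mark as most recently visited
        while len(self._pages) > self._max_pages:
            self._pages.popitem(last=False)   # evict the least recently visited

    def get(self, url: str) -> Optional[bytes]:
        entry = self._pages.get(url)
        if entry is None:
            return None
        stored_at, body = entry
        if self._clock() - stored_at > self._max_age:
            del self._pages[url]              # older than max_age: expired
            return None
        self._pages.move_to_end(url)
        return body
```

Going "back" to a page would call get() first and only hit the network when it returns None.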

Cheers,
Solderpunk

Link to individual message.

Solderpunk <solderpunk (a) posteo.net>

Subject Changed! New Subject: Proposal about content-size and hash
📅 Sent: 2020-11-01 14:47
📧 Message 19 of 48

On Fri Oct 30, 2020 at 11:27 PM CET, Arav K. wrote:
> On Fri Oct 30, 2020 at 12:06 PM UTC, Sean Conner wrote:
> > Ah yes ... this is something I proposed back in August of 2019:
> >
> > gemini://gemini.conman.org/gRFC/0003
> >
> > Solderpunk has always rejected it when it comes up, mumbling
> > "simplicity" and "not HTTP again" under his breath.
>
> It would be a useful optional header that simply requires a few lines of
> info in the spec - I don't know what solderpunk is going on about.
> Clients (that conform to the MIME parsing guidelines) will ignore it if
> they don't recognize it. It's not affecting anything else in the spec.

I've explained in another email why I don't think it's okay to put this
information into the MIME type.

The other option is to stick it *after* the MIME type, separated by a
tab or some other delimiter, and I don't like that idea because as soon
as the precedent is set that extra bits of optional information can be
tacked on after the MIME type and simpler clients can ignore them, it
opens the gateway to endless such extensions, and the risk that some of
them will become so popular and widely implemented that clients which
don't do so are considered "outdated" or "broken", and these "optional"
extensions are now de-facto obligatory spec features.

> > Yeah, I think we can leave this for now.  It was a hypothetical
> > concern that somebody had.  Not necessarily a bad one, but until it's
> > observed actually creating significant trouble for actual users on
> > actual clients I think we can just table this issue.  If it does come
> > up as a practical concern, we can resume discussion of some of the
> > ideas here.
>
> The problem has come up again

I'm behind on my mailing list reading (to put it lightly), so forgive me
if I should know this: but has the problem *actually* come up, as in
people are actually observing real problems in the wild where Gemini
transactions are terminated early and the situation isn't immediately
very obvious to either the client or the human user?  Or has it come up
in the sense that more people have noticed it as an abstract possibility
and are just looking for a way to fix it on principle because we're all
geeks here and enjoy designing perfect things?

Cheers,
Solderpunk

Link to individual message.

Solderpunk <solderpunk (a) posteo.net>

📅 Sent: 2020-11-01 15:22
📧 Message 20 of 48

On Sun Nov 1, 2020 at 3:47 PM CET, Solderpunk wrote:

> The other option is to stick it *after* the MIME type, separated by a
> tab or some other delimiter, and I don't like that idea because as soon
> as the precedent is set that extra bits of optional information can be
> tacked on after the MIME type and simpler clients can ignore them, it
> opens the gateway to endless such extensions, and the risk that some of
> them will become so popular and widely implemented that clients which
> don't do so are considered "outdated" or "broken", and these "optional"
> extensions are now de-facto obligatory spec features.

...I suspect I'll regret sharing this, but intellectual honesty compels
me.  It's just occurred to me that this problem can actually be avoided by
having the <META> value for status code 20 be the content size in bytes,
followed by arbitrary whitespace, followed by the MIME type.  With this
approach, the designated separator between the size and the MIME type
is whitespace, but because MIME types may themselves contain whitespace
(to e.g. separate type/subtype from parameter values), it would be
extremely problematic to attempt to add any optional extensions on after
the MIME type using that same designated separator.  Having the
separator possibly occur inside the final element of the list makes the
list self-terminating, which completely addresses my greatest fear with
having a list at all, as opposed to just a single bit of information.
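
To make the idea concrete, here is a rough Python sketch of how a client might parse such a header. This format is only the hypothetical one floated in this message, not part of the Gemini spec, and the function name is made up for illustration.

```python
def parse_sized_success_header(line):
    """Parse '20 <size><whitespace><MIME type>' into (size, mime_type)."""
    status, _, meta = line.rstrip("\r\n").partition(" ")
    if status != "20":
        raise ValueError("not a status 20 response: %r" % status)
    # Split once, on the first run of whitespace only. Everything after it
    # is the MIME type, which may itself contain whitespace.
    parts = meta.split(None, 1)
    if len(parts) != 2 or not (parts[0].isascii() and parts[0].isdigit()):
        raise ValueError("expected '<size> <MIME type>' in META: %r" % meta)
    return int(parts[0]), parts[1]
```

Because the parser splits only once, anything tacked on after the MIME type with the same separator simply becomes part of the MIME type string, which is the self-terminating property described above.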

So, this is actually solvable, in principle.  It would require a
backward-compatibility breaking change to the protocol, which would
probably throw the Geminiverse into a period of chaos as clients and
servers made the change at different rates.  I haven't done anything
like this since the early days when you could count Gemini
implementations on your fingers and I had pre-existing relationships
with all the authors, so the chaotic period was measured in mere days
before we achieved perfect unity again.  I have no idea how it would go
down this time now that everybody and their dog has written some Gemini
tooling, and I'm not at all convinced that the lack of Content-Size is
such a big practical problem that it's worth this kind of upheaval.

But if content size is ever going to make it in, this seems like the
best approach to me.  It doesn't abuse MIME types at all and it doesn't
provide an obvious place for would-be extenders of the protocol to add
additional content to the response header.  No other previous proposal
has had both those properties, to the best of my knowledge.

Cheers,
Solderpunk

Link to individual message.

Arav K. <nothien (a) uber.space>

📅 Sent: 2020-11-01 15:56
📧 Message 21 of 48

> So, this is actually solvable, in principle. It would require a
> backward-compatibility breaking change to the protocol, which would
> probably throw the Geminiverse into a period of chaos as clients and
> servers made the change at different rates. I haven't done anything
> like this since the early days when you could count Gemini
> implementations on your fingers and I had pre-existing relationships
> with all the authors, so the chaotic period was measured in mere days
> before we achieved perfect unity again. I have no idea how it would go
> down this time now that everybody and their dog has written some
> Gemini tooling, and I'm not at all convinced that the lack of
> Content-Size is such a big practical problem that it's worth this kind
> of upheaval.

This is a great idea, but Gemini seems to be beyond the point where this
is feasible.  Oh well.  I guess content-size won't be getting in after
all.

~aravk | ~nothien

Link to individual message.

Sean Conner <sean (a) conman.org>

Subject Changed! New Subject: External SHA hashes instead? (was: Re: Content length, EOF -- ways to resolve whether we received everything)
📅 Sent: 2020-11-02 02:00
📧 Message 22 of 48

It was thus said that the Great Solderpunk once stated:
> It's true that frequent use of "the back button" or similar
> navigational strategies to get around Geminispace means that certain
> individual pages may be loaded several times within a browsing session
> (one might go back and forth between a page listing entries in a gemlog
> and those individual entries), and that can be a little wasteful.
> Smarter navigation tools (like AV-98's "tours") can reduce this waste,
> but it's also easily fixed by clients doing something simple like
> keeping the 5 or 10 most recently visited pages cached in memory, with
> cached pages expiring after 10 minutes or so (AV-98 does this, although
> I don't think I've done a release since adding it).  Clients doing this
> would obviously have problems with content which updated more frequently
> than every 10 minutes, but that kind of content is - and I would argue
> *should* be - very rare in Geminispace.  Apps are the one obvious place
> where this strategy might fail, which is perhaps yet another argument in
> favour of having different clients optimised for reading and for apps.

  For my gopher client, I cache files that are fetched for the session. 
They're stored to disk, and when I quit out of the gopher client, the files
are deleted.  There is an option to reload an already loaded page, but I
don't think I use that all that often.

  -spc

Link to individual message.

Petite Abeille <petite.abeille (a) gmail.com>

Subject Changed! New Subject: Proposal about content-size and hash
📅 Sent: 2020-11-02 09:55
📧 Message 23 of 48

> On Nov 1, 2020, at 14:39, Solderpunk <solderpunk at posteo.net> wrote:
> 
> It seems very clear to me that information which is unique to a specific
> individual file - like its size, hash, last update time, etc. - is just
> not semantically appropriate here.

This of course depends on the MIME type itself. For example, a multipart 
boundary is very much unique to one specific instance of that content type.

Link to individual message.

Petite Abeille <petite.abeille (a) gmail.com>

📅 Sent: 2020-11-02 10:03
📧 Message 24 of 48

> On Nov 1, 2020, at 16:22, Solderpunk <solderpunk at posteo.net> wrote:
> 
> upheaval

On the plus side, providing some sort of content length/end-of-content 
indicator would open the door to persistent connections, mitigating the 
cost of ephemeral TLS connections, making gemini a bit more nimble altogether.

As an alternative to a content length, perhaps gemini could consider some sort 
of end-of-content marker, à la chunked transfer encoding, or such.
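
For illustration, a Python sketch of what reading such a framed response could look like. The framing here is an assumption, loosely modelled on HTTP/1.1 chunked transfer coding and simplified (a hex length line, the data, a CRLF, and a zero-length chunk as the end marker). Nothing like it exists in Gemini today.

```python
import io


def read_chunked(stream):
    """Read chunks from a binary stream until the zero-length end marker."""
    body = bytearray()
    while True:
        size_line = stream.readline()
        if not size_line.endswith(b"\r\n"):
            raise EOFError("connection closed before a chunk size line")
        size = int(size_line.strip().decode("ascii"), 16)
        if size == 0:
            return bytes(body)  # explicit end-of-content marker
        chunk = stream.read(size)
        if len(chunk) != size or stream.read(2) != b"\r\n":
            raise EOFError("connection closed in the middle of a chunk")
        body += chunk


# Usage: two chunks followed by the zero-length terminator.
example = io.BytesIO(b"5\r\nhello\r\n6\r\n world\r\n0\r\n")
print(read_chunked(example))
```

With an explicit end marker, a reader can tell a complete response from a truncated one, and the connection could stay open for another request.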

Link to individual message.

Petite Abeille <petite.abeille (a) gmail.com>

📅 Sent: 2020-11-02 10:54
📧 Message 25 of 48



> On Nov 1, 2020, at 15:47, Solderpunk <solderpunk at posteo.net> wrote:
> 
> I've explained in another email why I don't think it's okay to put this
> information into the MIME type.

Actually, one could use a Message/External-Body content type, sporting a 
SIZE parameter, e.g:

message/external-body; access-type=URL; URL="gemini://example.com/foo"; SIZE=1024

could != should :D
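
A small Python sketch of how a client could pull the URL and SIZE parameters out of such a META value, using the standard library's email.message.Message for the parameter parsing. The function name is hypothetical, and whether any Gemini client should do this is a separate question.

```python
from email.message import Message


def external_body_info(meta):
    """Return (media_type, url, size) from a message/external-body META value."""
    msg = Message()
    msg["Content-Type"] = meta
    size = msg.get_param("size")  # parameter names match case-insensitively
    return (
        msg.get_content_type(),
        msg.get_param("url"),  # surrounding quotes are removed
        int(size) if size is not None else None,
    )
```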

Link to individual message.

John Cowan <cowan (a) ccil.org>

📅 Sent: 2020-11-02 12:52
📧 Message 26 of 48

This is wonderful and just what I need for Dioscuri, which needs to be able
to return new content as the URL where it is now stored.  I was going to do
that with a new response code, but this is much better.

The Dioscuri spec is at <tinyurl.com/dioscuri-spec>.   Reviews are
welcome.  And it does include content-length, because TLS 1.2 is not able
to close just one side of a connection.

On Mon, Nov 2, 2020 at 5:54 AM Petite Abeille <petite.abeille at gmail.com>
wrote:

>
>
> > On Nov 1, 2020, at 15:47, Solderpunk <solderpunk at posteo.net> wrote:
> >
> > I've explained in another email why I don't think it's okay to put this
> > information into the MIME type.
>
> Actually, one could use a Message/External-Body content type, sporting a
> SIZE parameter, e.g:
>
> message/external-body; access-type=URL; URL="gemini://example.com/foo";
> SIZE=1024
>
> could != should :D
>
>

Link to individual message.

Jason McBrayer <jmcbray (a) carcosa.net>

Subject Changed! New Subject: External SHA hashes instead?
📅 Sent: 2020-11-02 14:37
📧 Message 27 of 48

"Arav K." <nothien at uber.space> writes:

> On Fri Oct 30, 2020 at 12:34 PM UTC, Nathan Galt wrote:
>> If you're worried whether the occasional big(gish) file transferred
>> correctly and don't want to stand up an HTTP server, have you
>> considered publishing SHA-256 or -512 hashes, like one does for
>> Linux-distribution .iso files?
>
> The issue with this is twofold:
>  * Clients generally don't want the hashes of all files, only of the
>    ones they are trying to download.  It is better to only provide this hash,
>    which can be done for individual files (either as a MIME type param
>    or from a special access point which takes the file name as input).
>  * The server would have to constantly update the hash list (or keep it
>    in memory and still keep updating it).  This is all the more
>    difficult for servers running CGI scripts, as their output can vary.
>    Further complications arise if servers change their output based on
>    client certs.  Better to serve the hash along with the content
>    itself, as then all the details about the content are known (e.g. the
>    client cert used to request the content, and the state of databases
>    and CGI output which affect the response).

I believe the suggestion was 1) to do this only for known large files
(like the audio files on konpeito.media) and 2) Provide hashes and/or
signatures for the purpose of manual, not automatic, validation.
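
For the manual case, checking a download against a published SHA-256 hash takes only a few lines. A sketch in Python follows; the function names are made up, and the input is any binary file object, such as one from open(path, "rb").

```python
import hashlib
import io


def sha256_hex(stream, block_size=65536):
    """Hash a binary file object in blocks and return the hex digest."""
    digest = hashlib.sha256()
    for block in iter(lambda: stream.read(block_size), b""):
        digest.update(block)
    return digest.hexdigest()


def matches_published_hash(stream, published_hex):
    """Compare against a published hex digest, ignoring case and whitespace."""
    return sha256_hex(stream) == published_hex.strip().lower()


# Usage with in-memory data standing in for a downloaded file.
print(sha256_hex(io.BytesIO(b"abc")))
```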

-- 
+-----------------------------------------------------------+
| Jason F. McBrayer                    jmcbray at carcosa.net  |
| A flower falls, even though we love it; and a weed grows, |
| even though we do not love it.            -- Dogen        |

Link to individual message.

prisonpotato@tilde.team <prisonpotato (a) tilde.team>

Subject Changed! New Subject: Proposal about content-size and hash
📅 Sent: 2020-11-02 18:54
📧 Message 28 of 48

> ...I suspect I'll regret sharing this, but intellectual honesty compels
> me. It's just occurred to me that this problem can actually be avoided by
> having the <META> value for status code 20 be the content size in bytes,
> followed by arbitrary whitespace, followed by the MIME type. With this
> approach, the designated separator between the size and the MIME type
> is whitespace, but because MIME types may themselves contain whitespace
> (to e.g. separate type/subtype from parameter values), it would be
> extremely problematic to attempt to add any optional extensions on after
> the MIME type using that same designated separator. Having the
> separator possibly occur inside the final element of the list makes the
> list self-terminating, which completely addresses my greatest fear with
> having a list at all, as opposed to just a single bit of information.

This seems like a neat solution to this problem to me, but I'm not sure if 
it would work at this stage of gemini's life cycle.  There are also of 
course the issues with dynamically sized responses as generated by CGI 
scripts and stuff like that, so maybe we could introduce a new response 
code, like 22: Response with size.

20 text/gemini
22 100 text/gemini

This solves both problems by making content length optional again, but 
exposes a risk that this type of extension could be used to add more fields
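
As a rough sketch, a client that understood both forms might parse the header like this in Python. Status 22 is only the hypothetical code proposed here, not part of the spec, and the function name is an assumption.

```python
def parse_response_header(line):
    """Return (status, size_or_None, mime_type) for a 20 or hypothetical 22."""
    status, _, meta = line.rstrip("\r\n").partition(" ")
    if status == "20":
        return 20, None, meta  # size unknown, e.g. CGI output
    if status == "22":
        size, _, mime_type = meta.partition(" ")
        if not (size.isascii() and size.isdigit()) or not mime_type:
            raise ValueError("22 expects '<size> <MIME type>': %r" % meta)
        return 22, int(size), mime_type
    raise ValueError("unhandled status: %r" % status)
```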

Link to individual message.

A. E. Spencer-Reed <easrng (a) gmail.com>

📅 Sent: 2020-11-02 19:04
📧 Message 29 of 48

Wait, what defines whitespace? I have a terrible idea...

On Mon, Nov 2, 2020 at 1:54 PM <prisonpotato at tilde.team> wrote:
>
> > ...I suspect I'll regret sharing this, but intellectual honesty compels
> > me. It's just occurred to me that this problem can actually be avoided by
> > having the <META> value for status code 20 be the content size in bytes,
> > followed by arbitrary whitespace, followed by the MIME type. With this
> > approach, the designated separator between the size and the MIME type
> > is whitespace, but because MIME types may themselves contain whitespace
> > (to e.g. separate type/subtype from parameter values), it would be
> > extremely problematic to attempt to add any optional extensions on after
> > the MIME type using that same designated separator. Having the
> > separator possibly occur inside the final element of the list makes the
> > list self-terminating, which completely addresses my greatest fear with
> > having a list at all, as opposed to just a single bit of information.
>
> This seems like a neat solution to this problem to me, but I'm not sure
> if it would work at this stage of gemini's life cycle.  There are also of
> course the issues with dynamically sized responses as generated by CGI
> scripts and stuff like that, so maybe we could introduce a new response
> code, like 22: Response with size.
>
> 20 text/gemini
> 22 100 text/gemini
>
> This solves both problems by making content length optional again, but
> exposes a risk that this type of extension could be used to add more fields




Link to individual message.

Katarina Eriksson <gmym (a) coopdot.com>

📅 Sent: 2020-11-02 21:38
📧 Message 30 of 48

<prisonpotato at tilde.team> wrote:

> This seems like a neat solution to this problem to me, but I'm not sure if
> it would work at this stage of gemini's life cycle.  There are also of
> course the issues with dynamically sized responses as generated by CGI
> scripts and stuff like that, so maybe we could introduce a new response
> code, like 22: Response with size.
>
> 20 text/gemini
> 22 100 text/gemini
>
> This solves both problems by making content length optional again, but
> exposes a risk that this type of extension could be used to add more fields
>

Remember, simple clients might only read the first "2" and treat it like a
"20".

If this was a year ago, maybe we could have gone with
"20<sp><size><sp><meta>" but what would be a valid size? Byte count only?
Approximate sizes like "20k" and "42M"? Ignoring trailing characters like
"5744267;<other_headers>"?

-- 
Katarina


Link to individual message.

Philip Linde <linde.philip (a) gmail.com>

📅 Sent: 2020-11-02 22:44
📧 Message 31 of 48

On Mon, 02 Nov 2020 18:54:32 +0000
prisonpotato at tilde.team wrote:

> a new response code, like 22: Response with size.
> 
> 20 text/gemini
> 22 100 text/gemini

The spec currently explicitly prohibits this kind of use of the status
codes, but it could be a new first-digit response code, e.g. 7x. This
would naturally take some time for the implementations to adopt.

I personally don't miss content-length in practice, but I think the
fact that it wasn't included in the protocol from the beginning is the
most compelling argument against it. That said, I'd happily implement
something like a new status code to account for such a change in my
client. When would servers dare use it, though?

-- 
Philip

Link to individual message.

Philip Linde <linde.philip (a) gmail.com>

📅 Sent: 2020-11-02 22:49
📧 Message 32 of 48

On Mon, 2 Nov 2020 14:04:57 -0500
"A. E. Spencer-Reed" <easrng at gmail.com> wrote:

> Wait, what defines whitespace? I have a terrible idea...

A single space...I presume that by "terrible idea" you mean something
like a tab+space binary encoding of content length, if the whitespace
separating status from meta was an arbitrary sequence of spaces and/or
tabs. Alas, the spec is quite clear on this and we can only dream
(nightmares?) about hacks like that.

-- 
Philip

Link to individual message.

Sean Conner <sean (a) conman.org>

📅 Sent: 2020-11-03 00:21
📧 Message 33 of 48

It was thus said that the Great Philip Linde once stated:
> On Mon, 02 Nov 2020 18:54:32 +0000
> prisonpotato at tilde.team wrote:
> 
> > a new response code, like 22: Response with size.
> > 
> > 20 text/gemini
> > 22 100 text/gemini
> 
> The spec currently explicitly prohibits this kind of use of the status
> codes, but it could be a new first-digit response code, e.g. 7x. This
> would naturally take some time for the implementations to adopt.
> 
> I personally don't miss content-length in practice, but I think the
> fact that it wasn't included in the protocol from the beginning is the
> most compelling argument against it. That said, I'd happily implement
> something like a new status code to account for such a change in my
> client. When would servers dare use it, though?

  Now.

  As I'm wont to do, I will often code up some weird "proof-of-concept" for
my Gemini server, and if anyone wants to play around with this concept,
there's a server for that:

	gemini://gemini.conman.org/test/testsize.gemini

  The above page will return a normal 20, but all the links on it will
return a 22 with a size, then MIME type.

  -spc (Will implement dodgy specs on a whim for $200, Alex)

Link to individual message.

Ali Fardan <raiz (a) stellarbound.space>

📅 Sent: 2020-11-03 11:49
📧 Message 34 of 48

On Mon, 02 Nov 2020 18:54:32 +0000
prisonpotato at tilde.team wrote:
> This seems like a neat solution to this problem to me, but I'm not
> sure if it would work at this stage of gemini's life cycle.  There
> are also of course the issues with dynamically sized responses as
> generated by CGI scripts and stuff like that, so maybe we could
> introduce a new response code, like 22: Response with size.
> 
> 20 text/gemini
> 22 100 text/gemini
> 
> This solves both problems by making content length optional again,
> but exposes a risk that this type of extension could be used to add
> more fields

The protocol allows minimalist clients to parse only the first digit of
the response code and handle it. If such a thing were to be implemented,
those clients would break, since they'd treat any '2x' as '20', so I'm
not in favor of that.
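
To illustrate the breakage, here is a Python sketch of a client that dispatches on the first digit only. The function is made up for illustration and is not taken from any real client.

```python
def minimal_client_mime_type(header_line):
    """Mimic a minimalist client that looks at the first status digit only."""
    status, _, meta = header_line.rstrip("\r\n").partition(" ")
    if status[:1] == "2":
        return meta  # treated exactly like status 20: META is the MIME type
    raise ValueError("not a success response")
```

Given "22 100 text/gemini", this client ends up with "100 text/gemini" as the MIME type, which it cannot render.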

Not to mention how it will start to complicate the protocol: once it
is accepted for certain response codes to have an alternative header
instead of the unified one specified in the spec, this opens the door
for other feature seekers to want their own set of response
codes with custom headers, and the simple implementations of the
protocol that used to suffice become obsolete and are now required to
handle many, many cases of custom headers.

If so many people are not satisfied with the protocol as is without an
insane amount of features, why don't you move to a different protocol
that satisfies your needs?  Or rather, define your own, the only reason
I'm interested in the Gemini protocol in the first place is the lack of
features, yet ever since I joined this community the majority of
discussion is all about feature proposals, why don't we get creative
with what we have?

Petite Abeille has suggested the use of the message/external-body MIME
type defined in RFC 2046 for such a thing, and I know this looks like an
ugly solution. Guess what? So is adding content length to response
headers. The protocol was designed to make it impossible to do such a
thing, so let's keep it that way.  And by the way, you could outsource
certain operations to external protocols if you really need that;
gemtext allows a clean way of specifying links to different protocol
schemes by design.

Link to individual message.

Björn Wärmedal <bjorn.warmedal (a) gmail.com>

📅 Sent: 2020-11-03 12:11
📧 Message 35 of 48

On Tue, 3 Nov 2020 at 12:49, Ali Fardan <raiz at stellarbound.space> wrote:

>
> If so many people are not satisfied with the protocol as is without an
> insane amount of features, why don't you move to a different protocol
> that satisfies your needs?  Or rather, define your own, the only reason
> I'm interested in the Gemini protocol in the first place is the lack of
> features, yet ever since I joined this community the majority of
> discussion is all about feature proposals, why don't we get creative
> with what we have?
>

I assume the majority of people who suggest a feature want "gemini + X",
but everyone has their own idea of what X is :) If everyone built their own
protocol instead, almost all of those would be doomed from the get-go. To
get traction a potential new protocol needs to be appealing to as many as
possible -- and the creator needs to *reach out* to as many as possible at
that!

The most sold pie in the US is apple pie. It's pretty much nobody's
favourite, but it's virtually everyone's second-favourite.

The fact that so many feature proposals drop in suggests that Gemini has
done mostly everything right and appeals to a whole truckload of people.
It's a good thing, and it doesn't mean people *aren't* getting creative
with what they have (see
https://portal.mozz.us/gemini/mozz.us/files/rfc_gemini_favicon.gmi for
example). The lack of content-size or hash is by no means a deal-breaker
for me; I'll find other ways to reduce bandwidth usage for my use case if
need be. But that doesn't mean it *wouldn't be useful*, for my use case or
others'.

>
> Petite Abeille has suggested the use of the message/external-body MIME
> type defined in RFC 2046 for such a thing, and I know this looks like an
> ugly solution. Guess what? So is adding content length to response
> headers. The protocol was designed to make it impossible to do such a
> thing, so let's keep it that way.  And by the way, you could outsource
> certain operations to external protocols if you really need that;
> gemtext allows a clean way of specifying links to different protocol
> schemes by design.
>

Well, if all I want is gemini + X, then using protocol Y with its bloat of
features I *don't* need is less tempting than sending a feature proposal to
the gemini ML. And again, that's a good thing! It means people are engaging
and shaping the trajectory of their own internet future. A rejected
proposal is a hundred times better than one that was never discussed for
fear of ridicule or social repercussions. The community is alive and
vibrant :D

Cheers,
ew0k

Link to individual message.

Petite Abeille <petite.abeille (a) gmail.com>

📅 Sent: 2020-11-03 12:27
📧 Message 36 of 48



> On Nov 3, 2020, at 12:49, Ali Fardan <raiz at stellarbound.space> wrote:
> 
> gemtext allows a clean way of specifying links to different protocol
> schemes by design

It also allows embedding content through data URI magic. Which may still 
be giving Solderpunk heartburns :D

C'est la vie.

Link to individual message.

Ali Fardan <raiz (a) stellarbound.space>

📅 Sent: 2020-11-03 13:28
📧 Message 37 of 48

On Tue, 3 Nov 2020 13:11:16 +0100
Björn Wärmedal <bjorn.warmedal at gmail.com> wrote:
> I assume the majority of people who suggest a feature want "gemini +
> X", but everyone has their own idea of what X is :) If everyone built
> their own protocol instead, almost all of those would be doomed from
> the get-go. To get traction a potential new protocol needs to be
> appealing to as many as possible -- and the creator needs to *reach
> out* to as many as possible at that!

I agree on the fact that if everyone rolled their own protocol it'll be
a mess, however, the appeal of Gemini would fade away if it starts
growing in terms of features, the way I see to grow the community is
hosting more content in the gemspace and going forward with refining
the spec to a final paper that is more precise and easier for newcomers
to get a grasp on because the protocol has evolved along with the
current spec paper and stuff has been added that wasn't intended to be
there from the beginning.

It would be discouraging for people to have their implementations break
so often because the protocol is never stable and features get
added/removed with stuff changing all the time.

The way I see it, Gemini is complete, the only way going forward is
tidying up the spec and growing the gemspace with more content, and in
the meantime, Gemini implementations will mature and become more
appealing for newcomers.

> Well, if all I want is gemini + X, then using protocol Y with its
> bloat of features I *don't* need is less tempting than sending a
> feature proposal to the gemini ML. And again, that's a good thing! It
> means people are engaging and shaping the trajectory of their own
> internet future. A rejected proposal is a hundred times better than
> one that was never discussed for fear of ridicule or social
> repercussions. The community is alive and vibrant :D

You wouldn't want to add revision control to Gemini, that's what Git is
for, just like you wouldn't add remote shell to Gemini because that's
what SSH is for, this should apply to everything, use the right tool
for the right task.

Link to individual message.

khuxkm@tilde.team <khuxkm (a) tilde.team>

📅 Sent: 2020-11-03 14:07
📧 Message 38 of 48

November 3, 2020 8:28 AM, "Ali Fardan" <raiz at stellarbound.space> wrote:

> On Tue, 3 Nov 2020 13:11:16 +0100
> Björn Wärmedal <bjorn.warmedal at gmail.com> wrote:
> 
>> I assume the majority of people who suggest a feature want "gemini +
>> X", but everyone has their own idea of what X is :) If everyone built
>> their own protocol instead, almost all of those would be doomed from
>> the get-go. To get traction a potential new protocol needs to be
>> appealing to as many as possible -- and the creator needs to *reach
>> out* to as many as possible at that!
> 
> I agree on the fact that if everyone rolled their own protocol it'll be
> a mess, however, the appeal of Gemini would fade away if it starts
> growing in terms of features, the way I see to grow the community is
> hosting more content in the gemspace and going forward with refining
> the spec to a final paper that is more precise and easier for newcomers
> to get a grasp on because the protocol has evolved along with the
> current spec paper and stuff has been added that wasn't intended to be
> there from the beginning.

I don't really think this is a scenario where there are things that 
weren't "intended"; what was intended was to create a protocol, lighter 
than the Web and heavier than Gopher, for serving content securely (at 
least, from where I sit and what I see). Obviously, the protocol isn't 
perfect, so sometimes it needs things added that may not have been there 
at the beginning, but fit the intent of the protocol.

> It would be discouraging for people to have their implementations break
> so often because the protocol is never stable and features get
> added/removed with stuff changing all the time.

The spec was created around August of 2019 (at least, the list was first 
posted to in mid-August). It's only a year old. If a breaking change 
*needs* to be made, it's not too late yet.

> The way I see it, Gemini is complete, the only way going forward is
> tidying up the spec and growing the gemspace with more content, and in
> the meantime, Gemini implementations will mature and become more
> appealing for newcomers.

I'm willing to agree with you on this point; it seems like we don't *need* 
this breaking change, at least not yet.

>> Well, if all I want is gemini + X, then using protocol Y with its
>> bloat of features I *don't* need is less tempting than sending a
>> feature proposal to the gemini ML. And again, that's a good thing! It
>> means people are engaging and shaping the trajectory of their own
>> internet future. A rejected proposal is a hundred times better than
>> one that was never discussed for fear of ridicule or social
>> repercussions. The community is alive and vibrant :D
> 
> You wouldn't want to add revision control to Gemini, that's what Git is
> for, just like you wouldn't add remote shell to Gemini because that's
> what SSH is for, this should apply to everything, use the right tool
> for the right task.

That's not really a good comparison. Obviously you wouldn't add revision 
control and/or a remote shell to Gemini; that has nothing to do with 
serving content. But if the X in "gemini + X" has to do with content 
(serving it, etc.), it's worth considering.

Just my two cents,
Robert "khuxkm" Miles

Link to individual message.

Ali Fardan <raiz (a) stellarbound.space>

📅 Sent: 2020-11-03 15:22
📧 Message 39 of 48

On Tue, 03 Nov 2020 14:07:49 +0000
khuxkm at tilde.team wrote:
> I don't really think this is a scenario where there are things that
> weren't "intended"; what was intended was to create a protocol,
> lighter than the Web and heavier than Gopher, for serving content
> securely (at least, from where I sit and what I see). Obviously, the
> protocol isn't perfect, so sometimes it needs things added that may
> not have been there at the beginning, but fit the intent of the
> protocol.

Let me reword... weren't considered from the beginning, here is an
example: 5.4.2 from the spec goes into great detail even defining what
whitespace is, while 5.5.2 and 5.5.3 aren't defined in the same style
as 5.4.2 was defined, there's nothing wrong with that, the protocol
evolved and these were added later, though they weren't considered at
the very beginning, that's what I meant by weren't "intended".

> The spec was created around August of 2019 (at least, the list was
> first posted to in mid-August). It's only a year old. If a breaking
> change *needs* to be made, it's not too late yet.

If a crucial change were to be made, of course, I'm all for it, but in
my opinion, it seems that there is no need for any more features to be
added, and the way I see it is moving towards stabilizing the spec
without any breaking changes, that's just my opinion.

> That's not really a good comparison. Obviously you wouldn't add
> revision control and/or a remote shell to Gemini; that has nothing to
> do with serving content. But if the X in "gemini + X" has to do with
> content (serving it, etc.), it's worth considering.

I may have gone to the extreme with this, but take a look at web land,
there are applications written to entirely run on the web, those include
mail readers, video and music players (implemented in JS), document
editors (Google docs and whatnot), and finally, control panels which
serve the purpose of a remote shell, obviously, that's not SSH, but if
more and more features get accepted to the protocol, how long until
that becomes possible?

Applications can be implemented in Gemini currently, using CGI, these
applications don't have to do with serving content, you could implement
a calculator, a banner generator, and a git repository viewer, all
using what is currently available, I have no argument against
implementing such applications using CGI which currently allows
extending the protocol without requiring more features to be added.

Link to individual message.

khuxkm@tilde.team <khuxkm (a) tilde.team>

📅 Sent: 2020-11-03 15:33
📧 Message 40 of 48

November 3, 2020 10:22 AM, "Ali Fardan" <raiz at stellarbound.space> wrote:

> On Tue, 03 Nov 2020 14:07:49 +0000
> khuxkm at tilde.team wrote:
> 
>> I don't really think this is a scenario where there are things that
>> weren't "intended"; what was intended was to create a protocol,
>> lighter than the Web and heavier than Gopher, for serving content
>> securely (at least, from where I sit and what I see). Obviously, the
>> protocol isn't perfect, so sometimes it needs things added that may
>> not have been there at the beginning, but fit the intent of the
>> protocol.
> 
> Let me reword... weren't considered from the beginning, here is an
> example: 5.4.2 from the spec goes into great detail even defining what
> whitespace is, while 5.5.2 and 5.5.3 aren't defined in the same style
> as 5.4.2 was defined, there's nothing wrong with that, the protocol
> evolved and these were added later, though they weren't considered at
> the very beginning, that's what I meant by weren't "intended".

I think this is an apples to oranges comparison; 5.5.2 has to do with the 
text/gemini media type, which, while it is a part of the spec, isn't 
protocol based (i.e., I could serve text/gemini on a web server if I really wanted to).

>> The spec was created around August of 2019 (at least, the list was
>> first posted to in mid-August). It's only a year old. If a breaking
>> change *needs* to be made, it's not too late yet.
> 
> If a crucial change were to be made, of course, I'm all for it, but in
> my opinion, it seems that there is no need for any more features to be
> added, and the way I see it is moving towards stabilizing the spec
> without any breaking changes, that's just my opinion.

I agreed with you on that point though; this isn't needed, at least not 
yet. But if the breaking change needed to be made, like you said, it 
should be made. This attitude of "oh, we don't need anything else" might 
convince some people who otherwise would have had good suggestions to back 
away; after all, if we aren't adding new features, why bother giving your suggestion?

>> That's not really a good comparison. Obviously you wouldn't add
>> revision control and/or a remote shell to Gemini; that has nothing to
>> do with serving content. But if the X in "gemini + X" has to do with
>> content (serving it, etc.), it's worth considering.
> 
> I may have gone to the extreme with this, but take a look at web land,
> there's applications written to entirely run on the web, those include
> mail readers, video and music players (implemented in JS), document
> editors (Google docs and whatnot), and finally, control panels which
> serve the purpose of a remote shell, obviously, that's not SSH, but if
> more and more features get accepted to the protocol, how long until
> that becomes possible?

Just for the sake of it, I really want to try and make a remote shell in 
Gemini CGI, just to prove a point. You can do it already, with the 
protocol as-is; send the command as a query to a CGI endpoint.

> Applications can be implemented in Gemini currently, using CGI, these
> applications don't have to do with serving content, you could implement
> a calculator, a banner generator, and a git repository viewer, all
> using what is currently available, I have no argument against
> implementing such applications using CGI which currently allows
> extending the protocol without requiring more features to be added.

Okay, but with all due respect, what does that have to do with content 
size? CGI isn't going to help the fact that the protocol currently has no 
way to indicate "this is how big the response will be" or "this is the 
hash of the file". Those questions, at least in my opinion, need to be 
answered at the protocol level, unless we're going to make a .well-known for Gemini.

Just my two cents,
Robert "khuxkm" Miles
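
For illustration only, here is a minimal sketch of how a client might read
the size and hash parameters proposed at the start of this thread out of a
response header. The parameter names (content-size, content-hash-sha256)
come from that proposal and are not part of the Gemini spec; the function
name parse_header is hypothetical.

```python
def parse_header(line):
    """Split a Gemini response header into (status, mime_type, params).

    Assumes a success response whose META is a MIME type followed by
    optional ';name=value' parameters. This naive split does not handle
    quoted values that themselves contain ';'.
    """
    status, _, meta = line.rstrip("\r\n").partition(" ")
    parts = [p.strip() for p in meta.split(";")]
    mime_type = parts[0].lower()
    params = {}
    for part in parts[1:]:
        if not part:
            continue  # tolerate a trailing ';'
        name, _, value = part.partition("=")
        params[name.strip().lower()] = value.strip().strip('"')
    return int(status), mime_type, params


# Example header taken from the first message of the thread.
status, mime_type, params = parse_header(
    "20 application/gzip; content-size=42069420\r\n"
)
size = int(params["content-size"]) if "content-size" in params else None
```

A client that does not know these parameters would simply ignore them,
which is the usual handling for unrecognised MIME parameters.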

Link to individual message.

Ali Fardan <raiz (a) stellarbound.space>

📅 Sent: 2020-11-03 15:42
📧 Message 41 of 48

On Tue, 03 Nov 2020 15:33:31 +0000
khuxkm at tilde.team wrote:
> I think this is an apples to oranges comparison; 5.5.2 has to do with
> the text/gemini media type, which, while it is a part of the spec,
> isn't protocol based (i.e., I could serve text/gemini on a web server
> if I really wanted to)

I'm referring to this in the context of the spec, not the protocol
itself.

> Just for the sake of it, I really want to try and make a remote shell
> in Gemini CGI, just to prove a point. You can do it already, with the
> protocol as-is; send the command as a query to a CGI endpoint

Great, that'd be a creative way to make use of the limitations of a
protocol instead of suggesting adding more.

> Okay, but with all due respect, what does that have to do with
> content size? CGI isn't going to help the fact that the protocol
> currently has no way to indicate "this is how big the response will
> be" or "this is the hash of the file". Those questions, at least in
> my opinion, need to be answered at the protocol level, unless we're
> going to make a .well-known for Gemini.

With all due respect, EOF should be an indicator.

Link to individual message.

Martin Keegan <martin (a) no.ucant.org>

📅 Sent: 2020-11-03 15:43
📧 Message 42 of 48

On Tue, 3 Nov 2020, Ali Fardan wrote:

> If so many people are not satisfied with the protocol as is without an
> insane amount of features, why don't you move to a different protocol
> that satisfies your needs?  Or rather, define your own, the only reason

The problem, as I see it, is that some people want there *not* to be a 
simple protocol, and will propose modifications to make it extensible. The 
minimalist attitude is perceived, wrongly, by some people as 
self-righteous and worthy of being taken down a peg or two; there are also 
other reasons for wanting to drive up the cost of information sharing 
online.

If the lack of features in Gemini means people go off and use some other, 
possibly new or incompatible, protocol, that's not too much of a problem, 
and more people's preferences will be satisfied. It may be that those who 
want a minimalist protocol should spec up a non-minimalist protocol and 
implement that, and then tell everyone who wants Gemini not to be 
minimalist to go and use this other protocol. In the presence of a viable 
alternative protocol to Gemini, the remaining arguments in favour of 
extending Gemini would much more obviously be in bad faith.

On the other hand, if eventually Solderpunk gives in and makes Gemini 
extensible, then the supporters of a minimalist protocol will just go and 
make their own new protocol and the cycle of agitation against minimalism 
will repeat, so one's just competing for the Gemini name and mindshare.

Mk

-- 
Martin Keegan, @mk270, https://mk.ucant.org/

Link to individual message.

khuxkm@tilde.team <khuxkm (a) tilde.team>

📅 Sent: 2020-11-03 15:51
📧 Message 43 of 48

November 3, 2020 10:42 AM, "Ali Fardan" <raiz at stellarbound.space> wrote:

> On Tue, 03 Nov 2020 15:33:31 +0000
> khuxkm at tilde.team wrote:
> 
>> I think this is an apples to oranges comparison; 5.5.2 has to do with
>> the text/gemini media type, which, while it is a part of the spec,
>> isn't protocol based (i.e., I could serve text/gemini on a web server
>> if I really wanted to)
> 
> I'm referring to this in the context of the spec, not the protocol
> itself.

Alright, fine, I'll cede that point. Still, if anything, 5.5.2 and 5.5.3 
show that new features that have been suggested have, in fact, made it 
into the spec. If we tell people not to suggest new features at all, we 
might miss out on some things we otherwise may have wanted.

>> Okay, but with all due respect, what does that have to do with
>> content size? CGI isn't going to help the fact that the protocol
>> currently has no way to indicate "this is how big the response will
>> be" or "this is the hash of the file". Those questions, at least in
>> my opinion, need to be answered at the protocol level, unless we're
>> going to make a .well-known for Gemini.
> 
> With all due respect, EOF should be an indicator.

But how can I differentiate "EOF, the file is over" vs "EOF, the socket died"?

Just my two cents,
Robert "khuxkm" Miles
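
One partial answer that needs no protocol change: TLS defines a
close_notify alert, so a TLS library can sometimes tell an orderly
shutdown from a dropped connection. The sketch below assumes Python's ssl
module and a socket wrapped with suppress_ragged_eofs=False, so that an
unexpected EOF raises ssl.SSLEOFError instead of looking like a normal end
of stream. Whether servers reliably send close_notify is not established
in this thread, so treat the result as a hint. The name read_body is
hypothetical.

```python
import ssl


def read_body(conn, chunk_size=4096):
    """Read until EOF and return (data, clean_close).

    conn is assumed to be an ssl.SSLSocket created with
    context.wrap_socket(sock, suppress_ragged_eofs=False, ...);
    any object with a compatible recv() works for testing.
    """
    chunks = []
    while True:
        try:
            chunk = conn.recv(chunk_size)
        except (ssl.SSLEOFError, ConnectionError):
            # The peer vanished without a TLS shutdown: "the socket died".
            return b"".join(chunks), False
        if not chunk:
            # Empty read after an orderly shutdown: "the file is over".
            return b"".join(chunks), True
        chunks.append(chunk)
```

A client could show a warning when clean_close is False and keep the
partial data for the user to inspect.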

Link to individual message.

Ali Fardan <raiz (a) stellarbound.space>

📅 Sent: 2020-11-03 16:04
📧 Message 44 of 48

On Tue, 03 Nov 2020 15:51:20 +0000
khuxkm at tilde.team wrote:
> Alright, fine, I'll cede that point. Still, if anything, 5.5.2 and
> 5.5.3 show that new features that have been suggested have, in fact,
> made it into the spec. If we tell people not to suggest new features
> at all, we might miss out on some things we otherwise may have wanted.

I'm absolutely not against these two gemtext features, I'm just
suggesting that for the spec paper to be finalized, there has to be an
agreement within the community that the protocol is ready and there's
no more to be added.  Until of course, a newer version of the protocol
gets released, but that's for the far future when technology evolves,
for example, when TLS becomes obsolete.

> But how can I differentiate "EOF, the file is over" vs "EOF, the
> socket died"?

Suppose you have content length in the header and the content received
didn't match the content length, what would you do then? What purpose
would it serve to know if it's a socket dying EOF or end of stream EOF?

Link to individual message.

Sean Conner <sean (a) conman.org>

📅 Sent: 2020-11-03 16:26
📧 Message 45 of 48

It was thus said that the Great Ali Fardan once stated:
> On Tue, 03 Nov 2020 15:51:20 +0000
> khuxkm at tilde.team wrote:
> 
> > But how can I differentiate "EOF, the file is over" vs "EOF, the
> > socket died"?
> 
> Suppose you have content length in the header and the content received
> didn't match the content length, what would you do then? What purpose
> would it serve to know if it's a socket dying EOF or end of stream EOF?

  Then one of the following is true if the content size doesn't match:

	1. The server has a bug in that it sent a malformed size.

	2. The server sent a malformed file.

	3. The connection was dropped during the transfer.

	4. The client has a bug in counting the bytes being received.

  In any case, what you have may not be complete and an error (or warning)
should be presented to the operator to decide what to do next.

  -spc
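
A minimal sketch of the client-side check described above, assuming the
client has obtained a declared size by some mechanism (none exists in the
spec; the function name check_transfer is hypothetical). It only detects
the mismatch; it cannot tell which of the listed causes is responsible.

```python
def check_transfer(declared_size, body):
    """Compare the declared size with the bytes actually received.

    Returns "complete", "truncated" or "overrun". Any result other than
    "complete" means the data may be incomplete or wrong, and the
    operator should be warned.
    """
    received = len(body)
    if received == declared_size:
        return "complete"
    if received < declared_size:
        return "truncated"
    return "overrun"


# Example: the header promised 11 bytes but only 5 arrived.
result = check_transfer(11, b"hello")
if result != "complete":
    print(f"warning: transfer {result}; declared 11 bytes, got 5")
```

The decision about what to do next (retry, keep the partial file, discard
it) is left to the operator, as suggested above.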

Link to individual message.

A. E. Spencer-Reed <easrng (a) gmail.com>

📅 Sent: 2020-11-04 16:50
📧 Message 46 of 48

Another solution might be to send a single part multipart/mixed
response, which IIRC can do headers.

On Tue, Nov 3, 2020 at 11:27 AM Sean Conner <sean at conman.org> wrote:
>
> It was thus said that the Great Ali Fardan once stated:
> > On Tue, 03 Nov 2020 15:51:20 +0000
> > khuxkm at tilde.team wrote:
> >
> > > But how can I differentiate "EOF, the file is over" vs "EOF, the
> > > socket died"?
> >
> > Suppose you have content length in the header and the content received
> > didn't match the content length, what would you do then? What purpose
> > would it serve to know if it's a socket dying EOF or end of stream EOF?
>
>   Then one of the following is true if the content size doesn't match:
>
>         1. The server has a bug in that it sent a malformed size.
>
>         2. The server sent a malformed file.
>
>         3. The connection was dropped during the transfer.
>
>         4. The client has a bug in counting the bytes being received.
>
>   In any case, what you have may not be complete and an error (or warning)
> should be presented to the operator to decide what to do next.
>
>   -spc
>



Link to individual message.

Sean Conner <sean (a) conman.org>

📅 Sent: 2020-11-09 21:18
📧 Message 47 of 48

It was thus said that the Great Sean Conner once stated:
> 
> 	gemini://gemini.conman.org/test/testsize.gemini
> 
>   The above page will return a normal 20, but all the links on it will
> return a 22 with a size, then MIME type.

  The above page no longer exists.  It's clear that this method (using a
separate status code to signal size data) is not popular (and I wasn't even
a fan of it myself).

  -spc (Will remove dodgy proof-of-concepts on a whim for $400, Alex)

Link to individual message.

Matthew Ernisse <matt (a) going-flying.com>

📅 Sent: 2021-03-03 22:14
📧 Message 48 of 48

On Fri, Oct 30, 2020 at 05:28:40PM +0100, Emery Hemingway said unto me:
> I'd like to see proxies make a comeback, and local caching proxies seem
> like a good place to start.

I don't think size or hash data in the response is going to enable this
unless you are going to man-in-the-middle the TLS connection that gemini
rides on top of.  In this respect it is really no different than the
problem with proxies and HTTPS.

--Matt

---
Matthew Ernisse
matt at going-flying.com
gemini://going-flying.com/

Link to individual message.
