Ambiguity in the gemini document spec

1. colecmac (a) protonmail.com (colecmac (a) protonmail.com)

Hello everyone! Although this is mostly directed to solderpunk.

I've noticed something that's not clearly defined in the part of
the Gemini spec that concerns the text/gemini format. Specifically,
something should be added in section 1.3.5.1 or 1.3.5.2.

The link specification in 1.3.5.3.2 defines links as:

=>[<whitespace>]<URL>[<whitespace><USER-FRIENDLY LINK NAME>]<CR><LF>

But nowhere else in the spec does it say that lines must end with
<CR><LF> characters; in fact, how lines end is not defined at all. I haven't
inspected a lot of gemini sites, but I suspect that most of them
just use <LF>, aka \n, for line endings, because that is the default
for most text editors. And I suspect that most clients are fine with
this, for links or any other text lines.

My suggestion is to change the spec to explicitly say that lines end
with <LF>, with the <CR> part being optional. This could apply to
URL requests, etc., as well, but that's not really necessary; it would
just be nice.
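
A minimal sketch of what "lines end with <LF>, with the <CR> optional" could look like on the client side. The function name split_gemini_lines is hypothetical and is not taken from the spec or from any existing client; it only illustrates the proposed rule.

```python
from typing import List


def split_gemini_lines(text: str) -> List[str]:
    """Split text/gemini content into lines, accepting LF or CRLF endings."""
    lines = text.split("\n")
    # A trailing line ending leaves one empty final element; drop it.
    if lines and lines[-1] == "":
        lines.pop()
    # Remove the single optional CR left behind by a CRLF ending.
    return [line[:-1] if line.endswith("\r") else line for line in lines]
```

With this rule, a document that mixes both styles, such as "a\r\nb\nc", yields the same three lines as one that uses <LF> throughout.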

This change will bring the current practices of users and clients into
spec, and it will also clear up that ambiguity.

Let me know what you think!

Thank you,
makeworld


2. solderpunk (solderpunk (a) SDF.ORG)

On Sat, May 16, 2020 at 03:41:05AM +0000, colecmac at protonmail.com wrote:
> Hello everyone! Although this is mostly directed to solderpunk.
> 
> I've noticed something that's not clearly defined in the part of
> the Gemini spec that concerns the text/gemini format. Specifically,
> something should be added in section 1.3.5.1 or 1.3.5.2.
> 
> The link specification in 1.3.5.3.2 defines links as:
> 
> =>[<whitespace>]<URL>[<whitespace><USER-FRIENDLY LINK NAME>]<CR><LF>
> 
> But nowhere else in the spec does it say that lines must end with
> <CR><LF> characters; in fact, how lines end is not defined at all.

Thanks for bringing this to my attention; somebody raised the same point
to me on Mastodon at about the same time.

I'm quite surprised this spec ambiguity hasn't been mentioned before
now!  Perhaps it has, it sounds very vaguely familiar.  I think at the
time nobody thought it mattered much. But the fact that, unlike HTML,
the text/gemini format is explicitly line-oriented, means that this is
actually an important point.

I strongly suspect the use of <CR><LF> in 1.3.5.3.2 is simply the result
of me being in the habit of using it elsewhere in the spec while talking
about the request and response syntax.

This definitely needs to be cleared up, so I've added it to the list of
things to address once the spec-freeze thaws (soon!).  In principle the
freeze shouldn't apply to real problems which definitely need solving,
but it's close enough to over now, and this problem, while real,
obviously hasn't caused any actual practical difficulties, so there's no
need to rush it.

Cheers,
Solderpunk


3. Fabio (fabrixxm (a) kirgroup.net)


Just to add another small point: also in links section 1.3.5.3.2 it says


	=>[<whitespace>]<URL>[<whitespace><USER-FRIENDLY LINK NAME>]<CR><LF>

	where:

	* <whitespace> is any non-zero number of consecutive spaces or
	  tabs

	[...]

	All the following examples are valid link lines:

	[...]

	=>gemini://example.org/bar Yet another example link at the same host


so, the first whitespace can be a zero number of consecutive spaces or 
tabs :)


4. jan6 (a) tilde.ninja (jan6 (a) tilde.ninja)

May 17, 2020 11:53 AM, "Fabio" <fabrixxm at kirgroup.net> wrote:
> =>[<whitespace>]<URL>[<whitespace><USER-FRIENDLY LINK NAME>]<CR><LF>
> 
> =>gemini://example.org/bar Yet another example link at the same host
> 
> so, the first whitespace can be a zero number of consecutive spaces or tabs :)

it also says that

* Square brackets indicate that the enclosed content is
  optional.
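
As a rough illustration of the link grammar quoted above, where both square-bracketed parts are optional, here is a hedged sketch of a parser. The names LINK_RE and parse_link_line are hypothetical, and the regular expression is one possible reading of the quoted syntax, not the spec's own definition. It also tolerates either <LF> or <CR><LF> at the end of the line.

```python
import re
from typing import Optional, Tuple

# =>[<whitespace>]<URL>[<whitespace><USER-FRIENDLY LINK NAME>]
# <whitespace> is read here as one or more spaces or tabs; the square
# brackets in the grammar make each bracketed part optional.
LINK_RE = re.compile(r"^=>[ \t]*(\S+)(?:[ \t]+(.*))?$")


def parse_link_line(line: str) -> Optional[Tuple[str, Optional[str]]]:
    """Return (url, name) for a link line, or None if it is not a link."""
    line = line.rstrip("\r\n")  # accept LF or CRLF endings
    match = LINK_RE.match(line)
    if match is None:
        return None
    url, name = match.group(1), match.group(2)
    # An absent or empty link name is reported as None.
    return url, (name or None)
```

Under this reading, the example "=>gemini://example.org/bar Yet another example link at the same host" parses with no whitespace after "=>", which matches the list of valid link lines in the spec.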


5. solderpunk (solderpunk (a) SDF.ORG)

On Sun, May 17, 2020 at 08:57:13AM +0000, jan6 at tilde.ninja wrote:
> May 17, 2020 11:53 AM, "Fabio" <fabrixxm at kirgroup.net> wrote:
> > =>[<whitespace>]<URL>[<whitespace><USER-FRIENDLY LINK NAME>]<CR><LF>
> > 
> > =>gemini://example.org/bar Yet another example link at the same host
> > 
> > so, the first whitespace can be a zero number of consecutive spaces or tabs :)
> 
> it also says that
> 
> * Square brackets indicate that the enclosed content is
>   optional.

The bigger problem here (which somebody mentioned in an HN comment), and
one which definitely needs fixing, is that "any non-zero number of
spaces/tabs" includes, say, 13 tabs per atom in the universe.

I actually got a patch for AV-98 recently to address this.  Somebody
wrote a proof-of-concept server which sends infinitely long response
headers and AV-98 stupidly slurped it all down until the Linux OOM
killer stepped in.

(which is as much sloppy programming on my part as it is a problem with
the spec - in principle, a well-written client could slurp down and
immediately discard insignificant whitespace)

But clearly the spec needs to place a maximum length on response
headers.
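
A minimal sketch of how a client might refuse to buffer an unbounded response header. The limit of 1024 bytes is an assumption chosen purely for illustration; the spec as discussed in this thread does not define one. The names MAX_HEADER_BYTES, HeaderTooLong and read_header are hypothetical and are not taken from AV-98 or its patch.

```python
import io
from typing import BinaryIO

MAX_HEADER_BYTES = 1024  # assumed limit, not defined by the spec


class HeaderTooLong(Exception):
    """Raised when a response header exceeds the configured limit."""


def read_header(stream: BinaryIO, limit: int = MAX_HEADER_BYTES) -> bytes:
    """Read one header line, never buffering more than limit + 1 bytes."""
    # readline(size) returns at most `size` bytes, so asking for
    # limit + 1 is enough to detect a header that is too long.
    line = stream.readline(limit + 1)
    if len(line) > limit:
        raise HeaderTooLong(f"header exceeds {limit} bytes")
    if not line.endswith(b"\n"):
        raise ValueError("connection closed before end of header")
    return line.rstrip(b"\r\n")
```

For example, read_header(io.BytesIO(b"20 text/gemini\r\nbody")) returns b"20 text/gemini", while a stream that sends thousands of tabs without a line ending raises HeaderTooLong instead of exhausting memory. For a real connection, a binary file object such as the one returned by socket.makefile("rb") could be passed in place of the BytesIO.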

Cheers,
Solderpunk


6. Fabio (fabrixxm (a) kirgroup.net)



On Sun, 17 May 2020 at 09:55, solderpunk <solderpunk at SDF.ORG> wrote:
> On Sun, May 17, 2020 at 08:57:13AM +0000, jan6 at tilde.ninja wrote:
>> May 17, 2020 11:53 AM, "Fabio" <fabrixxm at kirgroup.net> wrote:
>> > =>[<whitespace>]<URL>[<whitespace><USER-FRIENDLY LINK NAME>]<CR><LF>
>> >
>> > =>gemini://example.org/bar Yet another example link at the same host
>> >
>> > so, the first whitespace can be a zero number of consecutive spaces or tabs :)
>>
>> it also says that
>>
>> * Square brackets indicate that the enclosed content is
>>   optional.

Well, you're right. My brain just skipped the square brackets.
Sorry for the noise.

