💾 Archived View for gemi.dev › gemini-mailing-list › 000091.gmi captured on 2023-12-28 at 15:41:09. Gemini links have been rewritten to link to archived content
-=-=-=-=-=-=-
Hello everyone! Although this is mostly directed to solderpunk.

I've noticed something that's not clearly defined in the part of the Gemini spec that concerns the text/gemini format. Specifically, something should be added in section 1.3.5.1 or 1.3.5.2.

The link specification in 1.3.5.3.2 defines links as:

    =>[<whitespace>]<URL>[<whitespace><USER-FRIENDLY LINK NAME>]<CR><LF>

But nowhere else in the spec does it say that lines must end with <CR><LF> characters; in fact, how lines end is not defined at all.

I haven't inspected a lot of Gemini sites, but I suspect that most of them just use <LF>, aka \n, for line endings, because that is the default for most text editors. And I suspect that most clients are fine with this, for links or any other text lines.

My suggestion is to change the spec to explicitly say that lines end with <LF>, with the <CR> part being optional. This could apply to URL requests, etc., as well, but that's not really necessary; it would just be nice.

This change will bring the current practices of users and clients into spec, and it will also clear up that ambiguity.

Let me know what you think!

Thank you,
makeworld
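As an illustration of the suggestion above, here is a minimal Python sketch of a client-side line splitter that treats <LF> as the line terminator and tolerates an optional preceding <CR>. The function name split_gemtext_lines is made up for this example; it does not come from the spec or from any existing client.

```python
from typing import List


def split_gemtext_lines(body: str) -> List[str]:
    """Split a text/gemini body on <LF>, dropping one optional trailing <CR>.

    Hypothetical helper: it sketches the proposed rule, not spec wording.
    """
    lines = body.split("\n")
    # A body that ends with <LF> produces an empty final element; drop it
    # so the terminator of the last line is not counted as an extra line.
    if lines and lines[-1] == "":
        lines.pop()
    # Remove a single <CR> immediately before the <LF>, if present.
    return [line[:-1] if line.endswith("\r") else line for line in lines]


if __name__ == "__main__":
    mixed = "# Title\r\n=> gemini://example.org/ Home\nplain text\n"
    for line in split_gemtext_lines(mixed):
        print(repr(line))
```

With this approach a document saved with either <CR><LF> or bare <LF> endings yields the same list of lines, which is the behaviour the proposal expects most clients already have.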
On Sat, May 16, 2020 at 03:41:05AM +0000, colecmac at protonmail.com wrote:
> Hello everyone! Although this is mostly directed to solderpunk.
>
> I've noticed something that's not clearly defined in the part of
> the Gemini spec that concerns the text/gemini format. Specifically,
> something should be added in section 1.3.5.1 or 1.3.5.2.
>
> The link specification in 1.3.5.3.2 defines links as:
>
> =>[<whitespace>]<URL>[<whitespace><USER-FRIENDLY LINK NAME>]<CR><LF>
>
> But nowhere else in the spec does it say that lines must end with
> <CR><LF> characters, in fact how lines end is not defined at all.

Thanks for bringing this to my attention; somebody raised the same point to me on Mastodon at about the same time.

I'm quite surprised this spec ambiguity hasn't been mentioned before now! Perhaps it has; it sounds very vaguely familiar. I think at the time nobody thought it mattered much. But the fact that, unlike HTML, the text/gemini format is explicitly line-oriented means that this is actually an important point.

I strongly suspect the use of <CR><LF> in 1.3.5.3.2 is simply the result of me being in the habit of using it elsewhere in the spec while talking about the request and response syntax.

This definitely needs to be cleared up, and I've added it to the list of things to address once the spec-freeze thaws (soon!). In principle the freeze shouldn't apply to real problems which definitely need solving, but it's close enough to over now, and this problem, while real, obviously hasn't caused any actual practical difficulties, so there's no need to rush it.

Cheers,
Solderpunk
Just to add another small point: also in the links section, 1.3.5.3.2, it says

    =>[<whitespace>]<URL>[<whitespace><USER-FRIENDLY LINK NAME>]<CR><LF>

    where:

    * <whitespace> is any non-zero number of consecutive spaces or tabs
    [...]
    All the following examples are valid link lines:
    [...]
    =>gemini://example.org/bar Yet another example link at the same host

so the first whitespace can be a zero number of consecutive spaces or tabs :)
May 17, 2020 11:53 AM, "Fabio" <fabrixxm at kirgroup.net> wrote:
> =>[<whitespace>]<URL>[<whitespace><USER-FRIENDLY LINK NAME>]<CR><LF>
>
> =>gemini://example.org/bar Yet another example link at the same host
>
> so, the first whitespace can be a zero number of consecutive spaces or tabs :)

it also says that

    * Square brackets indicate that the enclosed content is optional.
On Sun, May 17, 2020 at 08:57:13AM +0000, jan6 at tilde.ninja wrote:
> May 17, 2020 11:53 AM, "Fabio" <fabrixxm at kirgroup.net> wrote:
> > =>[<whitespace>]<URL>[<whitespace><USER-FRIENDLY LINK NAME>]<CR><LF>
> >
> > =>gemini://example.org/bar Yet another example link at the same host
> >
> > so, the first whitespace can be a zero number of consecutive spaces or tabs :)
>
> it also says that
>
> * Square brackets indicate that the enclosed content is
> optional.

The bigger problem here (which somebody mentioned in a HN comment) which definitely needs fixing is that "any non-zero number of spaces/tabs" includes, say, 13 tabs per atom in the universe.

I actually got a patch for AV-98 recently to address this. Somebody wrote a proof-of-concept server which sends infinitely long response headers and AV-98 stupidly slurped it all down until the Linux OOM killer stepped in. (which is as much sloppy programming on my part as it is a problem with the spec - in principle, a well-written client could slurp down and immediately discard insignificant whitespace)

But clearly the spec needs to place a maximum length on response headers.

Cheers,
Solderpunk
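A minimal Python sketch of the kind of guard described above: read the response header through a bounded readline so that an endless header cannot exhaust memory. The 1024-byte cap and the names MAX_HEADER_BYTES, HeaderTooLong and read_header are all assumptions made for illustration. The spec under discussion sets no limit, and this is not the actual AV-98 patch.

```python
import io

# Hypothetical cap chosen for this example; the spec discussed in this
# thread does not define a maximum header length.
MAX_HEADER_BYTES = 1024


class HeaderTooLong(Exception):
    """Raised when the header exceeds the allowed number of bytes."""


def read_header(stream, limit: int = MAX_HEADER_BYTES) -> str:
    """Read one response header line from a binary stream, with a size cap."""
    # readline(n) returns at most n bytes, so a server that never sends
    # <LF> cannot make the client buffer more than limit + 1 bytes.
    raw = stream.readline(limit + 1)
    if len(raw) > limit:
        raise HeaderTooLong(f"header longer than {limit} bytes")
    if not raw.endswith(b"\n"):
        raise EOFError("connection closed before the header ended")
    # Strip the line terminator, whether it is <CR><LF> or a bare <LF>.
    return raw.rstrip(b"\r\n").decode("utf-8")


if __name__ == "__main__":
    good = io.BytesIO(b"20 text/gemini\r\n# body\n")
    print(read_header(good))

    hostile = io.BytesIO(b"20 " + b"\t" * 100_000)
    try:
        read_header(hostile)
    except HeaderTooLong as exc:
        print("rejected:", exc)
```

For a real connection, the same function could be given the file object returned by `sock.makefile("rb")` instead of an `io.BytesIO`.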
On Sun, May 17, 2020 at 09:55, solderpunk <solderpunk at SDF.ORG> wrote:
> On Sun, May 17, 2020 at 08:57:13AM +0000, jan6 at tilde.ninja wrote:
>> May 17, 2020 11:53 AM, "Fabio" <fabrixxm at kirgroup.net> wrote:
>> > =>[<whitespace>]<URL>[<whitespace><USER-FRIENDLY LINK NAME>]<CR><LF>
>> >
>> > =>gemini://example.org/bar Yet another example link at the same host
>> >
>> > so, the first whitespace can be a zero number of consecutive spaces or tabs :)
>>
>> it also says that
>>
>> * Square brackets indicate that the enclosed content is optional.

Well, you're right... my brain just skipped the square brackets. Sorry for the noise.
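To illustrate the reading the thread settles on (both whitespace runs in a link line are optional because of the square brackets), here is a rough Python sketch of a link-line parser. LINK_RE and parse_link_line are hypothetical names, and the sketch assumes the URL itself contains no spaces or tabs.

```python
import re
from typing import Optional, Tuple

# "=>", optional spaces/tabs, a URL without spaces/tabs, then optionally
# one or more spaces/tabs followed by the user-friendly link name.
LINK_RE = re.compile(r"^=>[ \t]*([^ \t]+)(?:[ \t]+(.*))?$")


def parse_link_line(line: str) -> Optional[Tuple[str, Optional[str]]]:
    """Return (url, name) for a link line, or None if it is not a link.

    name is None when the line carries no user-friendly link name.
    """
    # Accept <CR><LF>, bare <LF>, or no terminator at all.
    match = LINK_RE.match(line.rstrip("\r\n"))
    if match is None:
        return None
    url, name = match.group(1), match.group(2)
    return url, (name or None)


if __name__ == "__main__":
    samples = [
        "=>gemini://example.org/bar Yet another example link at the same host",
        "=> gemini://example.org/",
        "not a link line",
    ]
    for sample in samples:
        print(parse_link_line(sample))
```

The first sample is the example quoted in the thread, where no whitespace follows the `=>`.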
---