Hi!

I have found a page with a link written this way:

=><whitespace><URL><whitespace><linebreak>

that is, a whitespace with no link name following it. How should a parser interpret such links? Is this a malformed link according to the spec?

My choice was to parse it as if it were:

=><whitespace><URL><linebreak>

that is, as if it had no whitespace (and no link label) after the URL. I wonder whether this is OK, or whether I should interpret this block of text as plain text.

Bye!
C.
On Fri, 17 Jul 2020 17:25:00 +0200 cage <cage-dev at twistfold.it> wrote:

> Hi!
>
> I have found a page with a link written this way
>
> =><whitespace><URL><whitespace><linebreak>
>
> that is, a whitespace without the link name following.

My understanding of the spec is that the link name is optional. Without it, the link would just be a bare URL.

But please don't take that as gospel; I could be wrong. :)

--
Matthew Graybosch
https://matthewgraybosch.com
#include <disclaimer.h>
gemini://starbreaker.org
gemini://tanelorn.city
"Out of order?! Even in the future nothing works."
On Fri, Jul 17, 2020 at 11:45:28AM -0400, Matthew Graybosch wrote:

Hello!

Thank you for your reply!

> On Fri, 17 Jul 2020 17:25:00 +0200
> cage <cage-dev at twistfold.it> wrote:
>
> > Hi!
> >
> > I have found a page with a link written this way
> >
> > =><whitespace><URL><whitespace><linebreak>
> >
> > that is, a whitespace without the link name following.
>
> My understanding of the spec is that the link name is optional. Without
> it, the link would just be a bare URL.
>
> But please don't take that as gospel; I could be wrong. :)

No problem! We are just discussing :)

I think what you wrote is entirely reasonable (in fact I modified my parser to act as you said), but then I checked the documentation, and the spec for links is written as:

=>[<whitespace>]<URL>[<whitespace><USER-FRIENDLY LINK NAME>]

So *if* I understand correctly (and I am not sure I did :-D), and I interpret the square brackets as "optional terms", I can read that line as: "A link is formed by the symbol '=>', followed by optional whitespace, followed by the URL, followed by an optional block formed by non-empty whitespace *and* a link name." So if there is a <whitespace> after <URL>, a link name *must* follow.

If each term after the URL were optional, I would expect the spec to be something like:

=>[<whitespace>]<URL>[<whitespace>][<USER-FRIENDLY LINK NAME>]

but I am just guessing here; I am not a linguist, just a humble self-taught programmer :)

Bye!
C.
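As an illustration of the literal reading described above, here is a minimal Python sketch of a strict matcher. It is not taken from the spec or from any client: the name parse_strict is hypothetical, and it assumes that <whitespace> means one or more spaces or tabs and that a URL contains neither.

```python
import re

# Literal reading of: =>[<whitespace>]<URL>[<whitespace><USER-FRIENDLY LINK NAME>]
# Assumption: <whitespace> is one or more spaces/tabs, and <URL> contains none.
STRICT_LINK = re.compile(
    r"^=>[ \t]*(?P<url>[^ \t]+)(?:[ \t]+(?P<name>[^ \t].*))?$"
)

def parse_strict(line):
    """Return (url, name) or None if the line does not fit the strict grammar."""
    match = STRICT_LINK.match(line)
    if match is None:
        return None
    # name is None when the optional "<whitespace><name>" block is absent.
    return match.group("url"), match.group("name")
```

Under this reading, a link line whose URL is followed only by whitespace is rejected, because the optional block requires a name after the whitespace.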
cage <cage-dev at twistfold.it> wrote:

> On Fri, Jul 17, 2020 at 11:45:28AM -0400, Matthew Graybosch wrote:
> If each terms after the url was optional i expect the specs was
> something like:
>
> =>[<whitespace>]<URL>[<whitespace>][<USER-FRIENDLY LINK NAME>]
>

That one makes the whitespace separator between <URL> and <USER-FRIENDLY LINK NAME> optional, making it hard to parse.

This is what you were looking for:

=>[<whitespace>]<URL>[<whitespace>[<USER-FRIENDLY LINK NAME>]]

However, I think it's reasonable to assume the ending whitespace was unintentional and ignore it.

Postel's law:

    Be conservative in what you do, be liberal in what you accept from others

--
Katarina
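The tolerant reading suggested above, =>[<whitespace>]<URL>[<whitespace>[<USER-FRIENDLY LINK NAME>]], can be sketched in a few lines of Python. This is only one possible approach under stated assumptions: the function name parse_link is hypothetical, any whitespace character is treated as a separator, and no URL validation is attempted.

```python
def parse_link(line):
    """Parse '=>[ws]URL[ws[NAME]]'; return (url, name) or None if not a link."""
    if not line.startswith("=>"):
        return None
    # Assumption: leading and trailing whitespace carry no meaning, so drop them.
    rest = line[2:].strip()
    if not rest:
        return None  # "=>" with no URL is not treated as a link here
    # Split once on the first whitespace run: URL first, then the optional name.
    parts = rest.split(None, 1)
    url = parts[0]
    name = parts[1] if len(parts) > 1 else None
    return url, name
```

With this sketch, a link line ending in stray whitespace yields the same result as one without it: the URL and no name.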
On Sat, Jul 18, 2020 at 11:18:38AM +0200, Katarina Eriksson wrote:

Hi!

> cage <cage-dev at twistfold.it> wrote:
> > If each terms after the url was optional i expect the specs was
> > something like:
> >
> > =>[<whitespace>]<URL>[<whitespace>][<USER-FRIENDLY LINK NAME>]
> >
>
> That one makes the whitespace separator between <URL> and <USER-FRIENDLY
> LINK NAME> optional, making it hard to parse.
>
> This is what you were looking for:
>
> =>[<whitespace>]<URL>[<whitespace>[<USER-FRIENDLY LINK NAME>]]

Yes, I was wrong! Thank you for correcting what I wrote. :)

> However, I think it's reasonable to assume the ending whitespace was
> unintentional and ignore it.
>
> Postel's law:
>
> Be conservative in what you do, be liberal in what you accept from
> others

I was not able to remember the name of this law, thank you!

What bothers me is that my parser no longer follows the grammar the spec describes; but this is just a personal thing that I will have to accept somehow, I guess! :)

Bye!
C.
On 7/18/20 2:18 AM, Katarina Eriksson wrote:

> cage <cage-dev at twistfold.it> wrote:
>
>     On Fri, Jul 17, 2020 at 11:45:28AM -0400, Matthew Graybosch wrote:
>     If each terms after the url was optional i expect the specs was
>     something like:
>
>     =>[<whitespace>]<URL>[<whitespace>][<USER-FRIENDLY LINK NAME>]
>
> That one makes the whitespace separator between <URL> and <USER-FRIENDLY
> LINK NAME> optional, making it hard to parse.
>
> This is what you were looking for:
>
> =>[<whitespace>]<URL>[<whitespace>[<USER-FRIENDLY LINK NAME>]]
>
> However, I think it's reasonable to assume the ending whitespace was
> unintentional and ignore it.
>
> Postel's law:
>
>     Be conservative in what you do, be liberal in what you accept from
>     others
>
> --
> Katarina

For what it's worth, I think one should be careful in applying Postel's law, since it can encourage drift from the spec: if everyone else accepts messages that are misformatted in a particular way, then new implementations need to do so as well.

That being said, I think this case is simple enough that I would 100% support parsers tolerating the trailing whitespace, and even support changing the spec in the way you described.
A possible solution is changing the grammar to be

=>[whitespace]URL[[whitespace][friendly name]][whitespace]

since whitespace shouldn't parse as part of the URL anyway.

On July 18, 2020 3:20:20 PM EDT, Ash <ext0l at riseup.net> wrote:

> For what it's worth, I think one should be careful in applying Postel's
> law, since it can encourage drift from the spec: if everyone else
> accepts messages that are misformatted in a particular way, then new
> implementations need to do so as well.
>
> That being said, I think this case is simple enough that I would 100%
> support parsers tolerating the trailing whitespace, and even support
> changing the spec in the way you described.
On Saturday, July 18, 2020 9:20 PM, Ash <ext0l at riseup.net> wrote:

> For what it's worth, I think one should be careful in applying Postel's
> law, since it can encourage drift from the spec: if everyone else
> accepts messages that are misformatted in a particular way, then new
> implementations need to do so as well.

It is true that keeping the spec strict helps keep code clean and short. If such badly formed links are rare, they should be treated as an error by all Gemini clients and then quickly corrected by the author.

freD.
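If authors are expected to correct such links, a small checker could help them find the offending lines. The following Python sketch is purely illustrative: the function name find_trailing_whitespace_links is hypothetical, "whitespace" is assumed to mean spaces and tabs, and preformatted blocks are not handled.

```python
def find_trailing_whitespace_links(text):
    """Return 1-based line numbers of link lines that end in a space or tab."""
    bad = []
    for number, line in enumerate(text.splitlines(), start=1):
        # Assumption: every line starting with "=>" is a link line;
        # preformatted blocks are ignored for simplicity.
        if line.startswith("=>") and line != line.rstrip(" \t"):
            bad.append(number)
    return bad
```

A strict client could use the same test to reject the line, while an authoring tool could use it to warn before publishing.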
---