💾 Archived View for gemi.dev › gemini-mailing-list › 000304.gmi captured on 2024-08-25 at 10:50:30. Gemini links have been rewritten to link to archived content

-=-=-=-=-=-=-

question about links parsing

📧 Messages: 8
🗣️ Authors: 6
📅 First Message: 2020-07-17 15:25
📅 Last Message: 2020-07-18 21:11

1. cage (cage-dev (a) twistfold.it)

📅 Sent: 2020-07-17 15:25
📧 Message 1 of 8

Hi!

I have found a page with a link written this way

=><whitespace><URL><whitespace><linebreak>

that is, a whitespace without the link name following.

How should a parser interpret such a link? Is it malformed according
to the spec?

My choice was to parse it as if it were:

=><whitespace><URL><linebreak>

as if there were no whitespace (and no link label) after the URI. I
wonder whether this is OK, or whether I should treat this line as plain
text.

Bye!
C.
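
The choice described above (treat whitespace after the URL as if it were not there) can be sketched roughly as follows. This is only an illustration, not the poster's actual parser: the function name `parse_link_line` is hypothetical, and it assumes that splitting on any run of whitespace is an acceptable stand-in for the spec's <whitespace>.

```python
def parse_link_line(line):
    """Parse a gemtext link line, ignoring whitespace after the URL.

    Returns (url, label); label is None when no link name follows the URL.
    Returns None when the line is not a usable link line.
    """
    if not line.startswith("=>"):
        return None
    # split(None, 1) skips leading whitespace and splits once on the first
    # run of whitespace; a remainder that is only whitespace yields no
    # second item, so "=> URL " and "=> URL" give the same result.
    parts = line[2:].split(None, 1)
    if not parts:
        return None  # "=>" with no URL at all
    url = parts[0]
    label = parts[1].rstrip() if len(parts) > 1 else None
    return url, label
```

With this sketch, `parse_link_line("=> gemini://example.org/ ")` and `parse_link_line("=> gemini://example.org/")` both yield a link with no label.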


2. Matthew Graybosch (hello (a) matthewgraybosch.com)

📅 Sent: 2020-07-17 15:45
📧 Message 2 of 8

On Fri, 17 Jul 2020 17:25:00 +0200
cage <cage-dev at twistfold.it> wrote:

> Hi!
> 
> I have found a page with a link written this way
> 
> =><whitespace><URL><whitespace><linebreak>
> 
> that is, a whitespace without the link name following.

My understanding of the spec is that the link name is optional. Without
it, the link would just be a bare URL.

But please don't take that as gospel; I could be wrong. :)

-- 
Matthew Graybosch		https://matthewgraybosch.com
#include <disclaimer.h>		gemini://starbreaker.org
	 			gemini://tanelorn.city
"Out of order?! Even in the future nothing works."

Link to individual message.

3. cage (cage-dev (a) twistfold.it)

📅 Sent: 2020-07-17 16:49
📧 Message 3 of 8

On Fri, Jul 17, 2020 at 11:45:28AM -0400, Matthew Graybosch wrote:

Hello!

Thank you for your reply!

> On Fri, 17 Jul 2020 17:25:00 +0200
> cage <cage-dev at twistfold.it> wrote:
>
> > Hi!
> >
> > I have found a page with a link written this way
> >
> > =><whitespace><URL><whitespace><linebreak>
> >
> > that is, a whitespace without the link name following.
>
> My understanding of the spec is that the link name is optional. Without
> it, the link would just be a bare URL.
>
> But please don't take that as gospel; I could be wrong. :)

No problem! We are just discussing :)

I think what you wrote is entirely reasonable (in fact, I modified my
parser to act as you said), but then I checked the documentation, and
the spec for link lines is written as:

=>[<whitespace>]<URL>[<whitespace><USER-FRIENDLY LINK NAME>]

So *if* I understand correctly (and I am not sure I do :-D), and if I
interpret the square brackets as "optional terms", I can read that
line as: "A link is formed by the symbol '=>', followed by optional
whitespace (any non-zero number of consecutive spaces), followed by the
URL, followed by an optional block formed by non-zero whitespace *and*
a link name."

So if there is a <whitespace> after <URL>, a link name *must* follow.

If each term after the URL were optional, I would expect the spec to
read something like:

=>[<whitespace>]<URL>[<whitespace>][<USER-FRIENDLY LINK NAME>]

but I am just guessing here; I am not a linguist, just a humble
self-taught programmer :)

Bye!
C.
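
The strict reading of the grammar discussed above can be written down as a regular expression. This is a sketch under assumptions that are not taken from the thread: <whitespace> is taken to be one or more spaces or tabs, a URL is any run of non-whitespace characters, and a link name must start with a non-whitespace character. The names `STRICT` and `matches_strict` are hypothetical.

```python
import re

WS = r"[ \t]+"   # assumption: <whitespace> is one or more spaces or tabs
URL = r"\S+"     # assumption: a URL is any run of non-whitespace characters
NAME = r"\S.*"   # assumption: a link name starts with a non-blank character

# =>[<whitespace>]<URL>[<whitespace><USER-FRIENDLY LINK NAME>]
STRICT = re.compile(rf"=>(?:{WS})?({URL})(?:{WS}({NAME}))?")

def matches_strict(line):
    """True when the whole line fits the strict reading of the grammar."""
    return STRICT.fullmatch(line) is not None
```

Under these assumptions a line with whitespace after the URL but no name is rejected, which matches the reading "if there is a <whitespace> after <URL>, a link name must follow".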


4. Katarina Eriksson (gmym (a) coopdot.com)

📅 Sent: 2020-07-18 09:18
📧 Message 4 of 8

cage <cage-dev at twistfold.it> wrote:

> On Fri, Jul 17, 2020 at 11:45:28AM -0400, Matthew Graybosch wrote:
> If each term after the URL were optional, I would expect the spec to
> read something like:
>
> =>[<whitespace>]<URL>[<whitespace>][<USER-FRIENDLY LINK NAME>]
>

That one makes the whitespace separator between <URL> and <USER-FRIENDLY
LINK NAME> optional, making it hard to parse.

This is what you were looking for:

=>[<whitespace>]<URL>[<whitespace>[<USER-FRIENDLY LINK NAME>]]

However, I think it's reasonable to assume the ending whitespace was
unintentional and ignore it.

Postel's law:

    Be conservative in what you do, be liberal in what you accept from
    others

-- 
Katarina
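
One way to see why an optional separator is hard to parse is to enumerate every way the text after "=>" can be divided into a URL and a name. The sketch below is only an illustration: `readings` is a hypothetical helper, and it assumes that any run of non-blank characters counts as a URL and that <whitespace> means spaces and tabs.

```python
BLANK = " \t"  # assumption: <whitespace> means spaces and tabs

def readings(rest, separator_required):
    """List every (url, name) reading of the text that follows '=>'.

    separator_required=False models [<whitespace>][<NAME>];
    separator_required=True models [<whitespace>[<NAME>]].
    """
    rest = rest.lstrip(BLANK)
    found = []
    for i in range(1, len(rest) + 1):
        url, tail = rest[:i], rest[i:]
        if url[-1] in BLANK:
            break  # a URL cannot contain whitespace
        name = tail.lstrip(BLANK)
        if name and separator_required and tail[0] not in BLANK:
            continue  # nested grammar: a name needs a separator before it
        found.append((url, name or None))
    return found
```

For the input `"ab c"`, the optional-separator grammar admits two readings (`"a"` followed by `"b c"`, or `"ab"` followed by `"c"`), while the nested grammar admits only the second. The nested grammar also accepts `"ab "` as a URL with no name.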



5. cage (cage-dev (a) twistfold.it)

📅 Sent: 2020-07-18 12:28
📧 Message 5 of 8

On Sat, Jul 18, 2020 at 11:18:38AM +0200, Katarina Eriksson wrote:

Hi!

> cage <cage-dev at twistfold.it> wrote:
> > If each term after the URL were optional, I would expect the spec to
> > read something like:
> >
> > =>[<whitespace>]<URL>[<whitespace>][<USER-FRIENDLY LINK NAME>]
> >
>
> That one makes the whitespace separator between <URL> and <USER-FRIENDLY
> LINK NAME> optional, making it hard to parse.
>
> This is what you were looking for:
>
> =>[<whitespace>]<URL>[<whitespace>[<USER-FRIENDLY LINK NAME>]]

Yes, I was wrong! Thank you for correcting what I wrote. :)

> However, I think it's reasonable to assume the ending whitespace was
> unintentional and ignore it.
>
> Postel's law:
>
>     Be conservative in what you do, be liberal in what you accept from
>     others

I was not able to remember the name of this law, thank you!

What disturbs me is that now my parser no longer follows the grammar the
spec describes; but this is just a personal thing that I will
have to accept somehow, I guess! :)

Bye!
C.


6. Ash (ext0l (a) riseup.net)

📅 Sent: 2020-07-18 19:20
📧 Message 6 of 8

On 7/18/20 2:18 AM, Katarina Eriksson wrote:
> cage <cage-dev at twistfold.it> wrote:
> 
>     On Fri, Jul 17, 2020 at 11:45:28AM -0400, Matthew Graybosch wrote:
>     If each term after the URL were optional, I would expect the spec to
>     read something like:
> 
>     =>[<whitespace>]<URL>[<whitespace>][<USER-FRIENDLY LINK NAME>]
> 
> 
> That one makes the whitespace separator between <URL> and <USER-FRIENDLY 
> LINK NAME> optional, making it hard to parse.
> 
> This is what you were looking for:
> 
> =>[<whitespace>]<URL>[<whitespace>[<USER-FRIENDLY LINK NAME>]]
> 
> However, I think it's reasonable to assume the ending whitespace was 
> unintentional and ignore it.
> 
> Postel's law:
> 
>     Be conservative in what you do, be liberal in what you accept from
>     others
> 
> -- 
> Katarina
> 

For what it's worth, I think one should be careful in applying Postel's 
law, since it can encourage drift from the spec: if everyone else 
accepts messages that are misformatted in a particular way, then new 
implementations need to do so as well.

That being said, I think this case is simple enough that I would 100% 
support parsers tolerating the trailing whitespace, and even support 
changing the spec in the way you described.


7. Caranatar (caranatar (a) riseup.net)

📅 Sent: 2020-07-18 20:23
📧 Message 7 of 8

A possible solution is to change the grammar to:

=>[whitespace]URL[[whitespace][friendly name]][whitespace]

since whitespace shouldn't be parsed as part of the URL anyway.
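
A grammar of this shape could be approximated by the regular expression below. It is a sketch, not a definitive implementation: `LINK` and `parse` are hypothetical names, whitespace is assumed to mean spaces and tabs, and a separator is assumed to be required before a friendly name because a URL is treated here as any run of non-blank characters.

```python
import re

# =>[whitespace]URL[[whitespace][friendly name]][whitespace]
# The name pattern starts and ends with a non-blank character, so any
# whitespace at the end of the line is consumed by the final [ \t]*.
LINK = re.compile(r"=>[ \t]*(\S+)(?:[ \t]+(\S(?:.*\S)?))?[ \t]*")

def parse(line):
    """Return (url, name) for a link line, or None if it does not match."""
    m = LINK.fullmatch(line)
    return (m.group(1), m.group(2)) if m else None
```

Because the final whitespace is matched outside the name group, it ends up in neither the URL nor the name.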

On July 18, 2020 3:20:20 PM EDT, Ash <ext0l at riseup.net> wrote:
>On 7/18/20 2:18 AM, Katarina Eriksson wrote:
>> cage <cage-dev at twistfold.it> wrote:
>> 
>>     On Fri, Jul 17, 2020 at 11:45:28AM -0400, Matthew Graybosch wrote:
>>     If each term after the URL were optional, I would expect the spec to
>>     read something like:
>> 
>>     =>[<whitespace>]<URL>[<whitespace>][<USER-FRIENDLY LINK NAME>]
>> 
>> 
>> That one makes the whitespace separator between <URL> and <USER-FRIENDLY
>> LINK NAME> optional, making it hard to parse.
>> 
>> This is what you were looking for:
>> 
>> =>[<whitespace>]<URL>[<whitespace>[<USER-FRIENDLY LINK NAME>]]
>> 
>> However, I think it's reasonable to assume the ending whitespace was 
>> unintentional and ignore it.
>> 
>> Postel's law:
>> 
>>     Be conservative in what you do, be liberal in what you accept from
>>     others
>> 
>> -- 
>> Katarina
>> 
>
>For what it's worth, I think one should be careful in applying Postel's
>law, since it can encourage drift from the spec: if everyone else 
>accepts messages that are misformatted in a particular way, then new 
>implementations need to do so as well.
>
>That being said, I think this case is simple enough that I would 100% 
>support parsers tolerating the trailing whitespace, and even support 
>changing the spec in the way you described.

-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.


8. defdefred (defdefred (a) protonmail.com)

📅 Sent: 2020-07-18 21:11
📧 Message 8 of 8

On Saturday, July 18, 2020 9:20 PM, Ash <ext0l at riseup.net> wrote:
> For what it's worth, I think one should be careful in applying Postel's
> law, since it can encourage drift from the spec: if everyone else
> accepts messages that are misformatted in a particular way, then new
> implementations need to do so as well.

It is true that keeping the spec strict helps keep code clean and short.
If such badly formed links are rare, they should be treated as an error by
all Gemini clients and then rapidly corrected by the author.

freD.
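
For authors who want to find and correct such lines themselves, a small check along the following lines might help. It is a sketch only: `lint_links` is a hypothetical name, whitespace is assumed to mean spaces and tabs, and lines inside preformatted blocks are skipped because they are not link lines.

```python
FENCE = "`" * 3  # the gemtext toggle for preformatted blocks

def lint_links(gemtext):
    """Return 1-based numbers of link lines with whitespace after the URL
    and no link name following it."""
    offending = []
    preformatted = False
    for number, raw in enumerate(gemtext.split("\n"), start=1):
        line = raw.rstrip("\r")  # tolerate CRLF line endings
        if line.startswith(FENCE):
            preformatted = not preformatted
            continue
        if preformatted or not line.startswith("=>"):
            continue
        fields = line[2:].split()
        # Exactly one field means a URL with no name; trailing blank
        # characters are then the malformation discussed in this thread.
        if len(fields) == 1 and line[-1] in " \t":
            offending.append(number)
    return offending
```

A client that prefers strictness could use the same test to reject the line instead of reporting it.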

