💾 Archived View for lists.flounder.online › gemini › threads › 266WUHDMV1XBQ.3PET71GXLP3J5@mailbox.o… captured on 2022-07-16 at 15:38:47. Gemini links have been rewritten to link to archived content

-=-=-=-=-=-=-

May user-friendly link names be empty?

From: codesoap@mailbox.org

Date: Mon, 08 Nov 2021 20:07:03 +0100

Message-Id: 266WUHDMV1XBQ.3PET71GXLP3J5@mailbox.org

To: <gemini@lists.orbitalfox.eu>

--------------------------------------

Hi all,

the specification says this about link lines:

=>[<whitespace>]<URL>[<whitespace><USER-FRIENDLY LINK NAME>]
where:
• <whitespace> is any non-zero number of consecutive spaces or tabs
• Square brackets indicate that the enclosed content is optional.
• <URL> is a URL, which may be absolute or relative.

From this description I don't know whether <USER-FRIENDLY LINK NAME> MAY
be empty. Which of these regular expressions would be correct to parse a
link line?

a) ^=>[ \t]*([^ \t]+)(?:$|[ \t]+([^ \t].*)$)

b) ^=>[ \t]*([^ \t]+)(?:$|[ \t]+([^ \t]?.*?)$)

Or are both wrong, because I misunderstood "spaces" as 0x20?
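
To make the difference between the two candidates concrete, here is a small sketch that runs both patterns, copied verbatim from above, over a few sample lines. The helper name `parse` and the sample URL are made up for illustration, and Python's `re` module is assumed as the regex engine.

```python
import re

# The two candidate patterns from the question, copied verbatim.
PATTERN_A = re.compile(r"^=>[ \t]*([^ \t]+)(?:$|[ \t]+([^ \t].*)$)")
PATTERN_B = re.compile(r"^=>[ \t]*([^ \t]+)(?:$|[ \t]+([^ \t]?.*?)$)")


def parse(pattern, line):
    """Return the (url, name) groups if the line matches, else None."""
    match = pattern.match(line)
    return match.groups() if match else None


samples = [
    "=>gemini://example.com",           # URL only
    "=> gemini://example.com Example",  # URL and link name
    "=> gemini://example.com ",         # trailing space, no link name
]

for line in samples:
    print(repr(line), parse(PATTERN_A, line), parse(PATTERN_B, line))
```

The two patterns agree on the first two lines. They differ only on the third: pattern a) rejects a URL followed by nothing but whitespace, while pattern b) accepts it and captures an empty link name.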

Sorry for the nitpicking :p

Greetings

Richard Ulmer

Re: May user-friendly link names be empty?

From: alex@nytpu.com

Date: Mon, 8 Nov 2021 13:01:38 -0700

Message-Id: 20211108200138.dk5htbjnyoztjh7p@GLaDOS.local

To: "Gemini Mailing List" <gemini@lists.orbitalfox.eu>

In-Reply-To: 266WUHDMV1XBQ.3PET71GXLP3J5@mailbox.org

Cc: <codesoap@mailbox.org>

--------------------------------------

On 2021-11-08 08:07PM, codesoap@mailbox.org wrote:

Hi all,
> =>[<whitespace>]<URL>[<whitespace><USER-FRIENDLY LINK NAME>]
From this description I don't know whether <USER-FRIENDLY LINK NAME>
MAY be empty. Which of these regular expressions would be correct to
parse a link line?

The user friendly link name is optional, however if it is present
then the whitespace after the url is mandatory as well. In other words,
the URL is mandatory, and you either have no link name and no space
after the URL or a link name and mandatory space(s) after the URL.

So you have the following possibilities for link lines:

=><URL>        <- no link name, no space between => and url
=> <URL>       <- no link name, one or more spaces separating
=><URL> name   <- link name present, no space between => and url
=> <URL> name  <- link name present, one or more spaces separating

a) ^=>[ \t]*([^ \t]+)(?:$|[ \t]+([^ \t].*)$)
b) ^=>[ \t]*([^ \t]+)(?:$|[ \t]+([^ \t]?.*?)$)

I believe the latter (option b) would be correct. I would remove the
final "[^ \t]?" and simplify it down to this:

^=>[ \t]*([^ \t]+)(?:$|[ \t]+(.*)$)

You don't really need regex though: just skip past the =>[whitespace],
then the URL is everything up to the first space or tab, and the
remainder (minus the separating whitespace) is the link name.
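
The regex-free approach described above could look roughly like the following sketch. The function name `parse_link_line` is invented for illustration, and returning an empty string for a missing link name is an assumption rather than anything the specification prescribes.

```python
def parse_link_line(line):
    """Split a gemtext link line into (url, name) without using regex.

    Returns None if the line is not a link line or has no URL.
    The name is "" when absent (an arbitrary choice for this sketch).
    """
    if not line.startswith("=>"):
        return None
    rest = line[2:].lstrip(" \t")  # skip optional whitespace after "=>"
    if not rest:
        return None  # nothing but whitespace after "=>", so no URL
    # The URL runs up to the first space or tab; the rest is the name.
    for index, char in enumerate(rest):
        if char in " \t":
            return rest[:index], rest[index:].lstrip(" \t")
    return rest, ""


print(parse_link_line("=> gemini://example.com Example link"))
print(parse_link_line("=>gemini://example.com"))
```

Scanning for space and tab by hand, rather than calling `str.split()` without arguments, keeps the notion of whitespace limited to the two characters the specification names.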

~nytpu

--

Alex // nytpu

alex@nytpu.com

gpg --locate-external-key alex@nytpu.com

Re: May user-friendly link names be empty?

From: codesoap@mailbox.org

Date: Mon, 08 Nov 2021 21:29:25 +0100

Message-Id: 2OWBU4L09DGIP.23ZAM2A3BDE20@mailbox.org

To: "Alex // nytpu" <alex@nytpu.com>

In-Reply-To: 20211108200138.dk5htbjnyoztjh7p@GLaDOS.local

Cc: "Gemini Mailing List" <gemini@lists.orbitalfox.eu>

--------------------------------------

Hi Alex,

thanks for your response! Unfortunately your answer still leaves me
uncertain. Here you say that trailing whitespace is illegal if there
is no link name:

The user friendly link name is optional, however if it is present
then the whitespace after the url is mandatory as well. In other words,
the URL is mandatory, and you either have no link name and no space
after the URL or a link name and mandatory space(s) after the URL.

The regular expression you suggest here, however, allows trailing
whitespace, even if there is no link name:

> a) ^=>[ \t]*([^ \t]+)(?:$|[ \t]+([^ \t].*)$)
> b) ^=>[ \t]*([^ \t]+)(?:$|[ \t]+([^ \t]?.*?)$)
I believe the latter (option b) would be correct. I would remove the
final "[^ \t]?" and simplify it down to this:
^=>[ \t]*([^ \t]+)(?:$|[ \t]+(.*)$)
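
The mismatch can be checked directly. This sketch feeds the simplified pattern a line that ends in a single space. Python's `re` module and the sample URL are assumptions made for illustration.

```python
import re

# The simplified pattern suggested in the reply, copied verbatim.
SIMPLIFIED = re.compile(r"^=>[ \t]*([^ \t]+)(?:$|[ \t]+(.*)$)")

with_trailing_space = SIMPLIFIED.match("=>gemini://example.com ")
without_trailing_space = SIMPLIFIED.match("=>gemini://example.com")

# Both lines match; only the second capture group differs.
print(with_trailing_space.groups())
print(without_trailing_space.groups())
```

With the trailing space the second group is an empty string. Without it the group is `None`, so a parser built on this pattern can still tell the two cases apart if it needs to.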

Do you have a link to an authoritative source upon which your knowledge
on the matter is based? I wrote my initial mail on the topic, because
I'm looking for an authoritative answer rather than an opinion.

- Richard Ulmer

Re: May user-friendly link names be empty?

From: sean@conman.org

Date: Tue, 9 Nov 2021 16:29:03 -0500

Message-Id: 20211109212903.GA323@brevard.conman.org

To: "Gemini Mailing List" <gemini@lists.orbitalfox.eu>

In-Reply-To: 2OWBU4L09DGIP.23ZAM2A3BDE20@mailbox.org

--------------------------------------

It was thus said that the Great codesoap@mailbox.org once stated:

Hi Alex,
thanks for your response! Unfortunately your answer still leaves me
uncertain. Here you say that trailing whitespace is illegal if there
is no link name:
> The user friendly link name is optional, however if it is present
> then the whitespace after the url is mandatory as well. In other words,
> the URL is mandatory, and you either have no link name and no space
> after the URL or a link name and mandatory space(s) after the URL.
The regular expression you suggest here, however, allows trailing
whitespace, even if there is no link name:
> > a) ^=>[ \t]*([^ \t]+)(?:$|[ \t]+([^ \t].*)$)
> > b) ^=>[ \t]*([^ \t]+)(?:$|[ \t]+([^ \t]?.*?)$)
> I believe the latter (option b) would be correct. I would remove the
> final "[^ \t]?" and simplify it down to this:
> ^=>[ \t]*([^ \t]+)(?:$|[ \t]+(.*)$)
Do you have a link to an authoritative source upon which your knowledge
on the matter is based? I wrote my initial mail on the topic, because
I'm looking for an authoritative answer rather than an opinion.

The two sources I know of are the one you already quoted (section 5.4.2 of
gemini://gemini.circumlunar.space/docs/specification.gmi) and a proposed BNF
for the revised specification
(https://gitlab.com/gemini-specification/gemini-text/-/issues/7).

Both leave the impression that (where spaces are replaced by '_'):

=>gemini://example.com_

would not be allowed, but as with the real world, you may have to be a bit
lenient with the input (especially if it's user generated). Also note that
as of now, the official spec allows both spaces (0x20) and tabs (0x09) to be
"whitespace", whereas the BNF only allows spaces (0x20).

-spc
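
One way to combine a strict reading with lenient input handling is sketched below. The names `STRICT`, `parse_strict` and `parse_lenient` are invented for illustration. The pattern treats both spaces and tabs as whitespace, as the current specification does, and does not model the space-only variant of the proposed BNF.

```python
import re

# Strict reading: whitespace after the URL is only allowed when a
# non-empty link name follows it. Spaces and tabs both count.
STRICT = re.compile(r"^=>[ \t]*([^ \t]+)(?:[ \t]+([^ \t].*))?$")


def parse_strict(line):
    """Return (url, name) with name None when absent; None on no match."""
    match = STRICT.match(line)
    return match.groups() if match else None


def parse_lenient(line):
    """Tolerate trailing spaces and tabs by stripping them first."""
    return parse_strict(line.rstrip(" \t"))


line = "=>gemini://example.com "
print(parse_strict(line))   # rejected under the strict reading
print(parse_lenient(line))  # accepted once trailing whitespace is dropped
```

Keeping the strict pattern separate from the clean-up step means a client can accept sloppy input while a validator or linter can still flag it.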

Re: May user-friendly link names be empty?

From: sean@conman.org

Date: Tue, 9 Nov 2021 16:35:32 -0500

Message-Id: 20211109213532.GC323@brevard.conman.org

To: "Gemini Mailing List" <gemini@lists.orbitalfox.eu>

In-Reply-To: 2OWBU4L09DGIP.23ZAM2A3BDE20@mailbox.org

--------------------------------------

It was thus said that the Great codesoap@mailbox.org once stated:

Do you have a link to an authoritative source upon which your knowledge
on the matter is based? I wrote my initial mail on the topic, because
I'm looking for an authoritative answer rather than an opinion.

Let me amend my previous response---the BNF at
https://gitlab.com/gemini-specification/gemini-text/-/issues/7, as defined,
will allow the text portion of the link line to be nothing but spaces (per
the text rule).

-spc