💾 Archived View for gemi.dev › gemini-mailing-list › 000675.gmi captured on 2024-06-16 at 14:08:44. Gemini links have been rewritten to link to archived content

-=-=-=-=-=-=-

Proposal to drop double slashes from URL syntax

1. Daniel Nagy (danielnagy (a) posteo.de)

Hello,

I want to propose to drop the double slashes from the gemini URL syntax.

The reason for this is that they don't serve any semantic value and, while the
project is still young, I think it could still be changed. So instead of

    gemini://example.com

we would have:

    gemini:example.com

In fact, Sir Tim Berners-Lee apologized[0] for introducing them in the http URL
syntax. I see the following advantages and disadvantages:

Advantages:
  - Less typing
  - Less wasted screen space
  - Fewer transferred bytes and fewer stored bytes on disk and in memory

Disadvantages:
  - Newcomers might see familiarities with http in the double-slashed syntax
    and recognize that the following token is a hostname, which will be
    contacted. There is a chance that, without the slashes, newcomers might
    expect something other than a hostname after the colon, although I
    personally think that chance is low.
  - Implementations would need to adapt to this, and some URL parsers in their
    respective languages might not support the parsing of such syntax.
  - Automatic URL detectors, for example terminal emulators where you can
    click on a URL and it opens, might have trouble detecting this URL form
    and therefore not recognize links. Those terminal emulators would need
    adaptation.

Of course, the double-slashed syntax could still be supported, but the more
compact format could be encouraged. Any feedback or suggestions would be
greatly welcomed.

Regards,
Daniel

[0]: https://www.sitepoint.com/sir-tim-berners-lee-http-slashes/

2. PJ vM (pjvm742 (a) disroot.org)

> I want to propose to drop the double slashes from the gemini URL
> syntax.

No.

One of the reasons Gemini uses URLs is because the URL syntax is a
*standard*. Gemtext allows linking to things that are accessible through
any other protocol for which there exists a URL scheme. In order for a
different-protocol link to be handled by a client for that protocol, it
is important that the link is a valid, standard URL. And it does not
make sense to use nonstandard fake URLs for gemini links and real URLs
for other protocols.

-- 
pjvm

3. Louis Brauer (louis (a) brauer.family)

URIs are already standardized:
 https://tools.ietf.org/html/rfc3986

specifically in Section 3.3 you'll find the description about //.

I don't think changing gemini URIs to a non-standard format would bring 
any advantage to the project.

- Louis

4. Nico (nico (a) itwont.work)

On 09/02/2021 16:12, Daniel Nagy wrote:
> Hello,
>
> I want to propose to drop the double slashes from the gemini URL syntax.
>
> The reason for this is that they don't serve any semantic value and, while the
> project is still young, I think it could still be changed. So instead of
>
>     gemini://example.com
>
> we would have:
>
>     gemini:example.com
>
> In fact, Sir Tim Berners-Lee apologized[0] for introducing them in the http URL
> syntax. I see the following advantages and disadvantages:
>
> Advantages:
>   - Less typing
>   - Less wasted screen space
>   - Fewer transferred bytes and fewer stored bytes on disk and in memory
>
> Disadvantages:
>   - Newcomers might see familiarities with http in the double-slashed syntax
>     and recognize that the following token is a hostname, which will be
>     contacted. There is a chance that, without the slashes, newcomers might
>     expect something other than a hostname after the colon, although I
>     personally think that chance is low.
>   - Implementations would need to adapt to this, and some URL parsers in their
>     respective languages might not support the parsing of such syntax.
>   - Automatic URL detectors, for example terminal emulators where you can
>     click on a URL and it opens, might have trouble detecting this URL form
>     and therefore not recognize links. Those terminal emulators would need
>     adaptation.
>
> Of course, the double-slashed syntax could still be supported, but the more
> compact format could be encouraged. Any feedback or suggestions would be
> greatly welcomed.
>
> Regards,
> Daniel
>
> [0]: https://www.sitepoint.com/sir-tim-berners-lee-http-slashes/

I don't really feel like the benefits of this are worth the drawbacks:
we would be breaking the URL standard and breaking existing
URL-handling libraries, just breaking things for the sake of, what,
saving two bytes? I don't think it's worth it. If you want to save
screen space, just don't display the gemini://; if you are in a gemini
client it is safe to assume you are speaking gemini (many web browsers
don't display the "https://" for similar reasons).

5. Daniel Nagy (danielnagy (a) posteo.de)

My initial mail should have made that clearer. I was not talking about removing
the double slashes from every link you can find in a Gemtext document. I was
only talking about gemini URI scheme links. http links (and all other URLs)
should stay exactly as they are.

Also, to my understanding, removing them does not violate the standard. For
example, `tel:+1-816-555-1212` and `magnet:?xt=urn:sha1:...` are valid URIs. A
Gemini browser might choose to recognize those `tel:` and `magnet:` links and
open a locally installed VoIP client or torrent client respectively. My initial
assumption was that some of the simpler parsers might not recognize this format.
If they adhere to the URI standard it should be fine.

There is some confusion regarding the terms URI and URL. The W3C tries to clear
that up[0]. It used to be that they meant different things in what they call the
"classical view", but this difference isn't as dominant anymore in the
"contemporary view". In my understanding, a scheme, which would be `gemini:`,
can define on its own what it expects after the colon. You can find more
registered schemes here[1], some of which require a double slash, some don't.

Even though the difference seems small, given the number of links out there I
think this small difference can add up.

Best,
Daniel

[0]: https://www.w3.org/TR/uri-clarification/#uri-partitioning
[1]: https://www.iana.org/assignments/uri-schemes/uri-schemes.xhtml

PJ vM <pjvm742 at disroot.org> writes:

>> I want to propose to drop the double slashes from the gemini URL
>> syntax.
>
> No.
>
> One of the reasons Gemini uses URLs is because the URL syntax is a
> *standard*. Gemtext allows linking to things that are accessible through
> any other protocol for which there exists a URL scheme. In order for a
> different-protocol link to be handled by a client for that protocol, it
> is important that the link is a valid, standard URL. And it does not
> make sense to use nonstandard fake URLs for gemini links and real URLs
> for other protocols.

6. PJ vM (pjvm742 (a) disroot.org)

On 2/9/21 8:57 PM, Daniel Nagy wrote:

> Also, to my understanding removing them does not violate the standard. For
> example, `tel:+1-816-555-1212` and `magnet:?xt=urn:sha1:...` are valid
> URIs.
They are valid URIs, because these schemes do not use the "authority"
component of the URI syntax. The authority is for the hostname, port,
and userinfo. The double slash is grouped with the authority component
and is required by the URI standard to be present when the authority
component is used and to be absent when it is not used. Gemini URIs
always need to specify the hostname, so there is always an authority
component, and to be standard-compliant the double slash cannot be left out.
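
A minimal sketch of this rule, assuming Python's standard `urllib.parse` as
the URI parser. The `describe` helper is a hypothetical name used only for
illustration. With the double slash, the hostname is parsed as the authority;
without it, the same text is treated as part of the path.

```python
from urllib.parse import urlsplit


def describe(uri):
    """Return (scheme, authority, path) as the standard library parses them."""
    parts = urlsplit(uri)
    return parts.scheme, parts.netloc, parts.path


for uri in (
    "gemini://example.com/docs",  # authority present, introduced by "//"
    "gemini:example.com/docs",    # no "//", so no authority is recognized
    "tel:+1-816-555-1212",        # scheme that never uses an authority
):
    print(uri, "->", describe(uri))
```

A generic parser therefore has no way to tell that `example.com` in the second
form was meant as a hostname.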

-- 
pjvm

7. Oliver Simmons (oliversimmo (a) gmail.com)

If you want a good diagram explaining this take a look at
https://en.m.wikipedia.org/wiki/URL#/media/File%3AURI_syntax_diagram.svg

8. PJ vM (pjvm742 (a) disroot.org)

On 2/9/21 9:47 PM, Oliver Simmons wrote:
> If you want a good diagram explaining this take a look at 
> https://en.m.wikipedia.org/wiki/URL#/media/File%3AURI_syntax_diagram.svg
>
Link for desktop:
https://upload.wikimedia.org/wikipedia/commons/d/d6/URI_syntax_diagram.svg

9. Alex // nytpu (alex (a) nytpu.com)

> Also, to my understanding removing them does not violate the standard.
> For example, `tel:+1-816-555-1212` and `magnet:?xt=urn:sha1:...` are
> valid URIs.
The URI standard[1] says that if an "authority" is present in a URI, it
MUST be preceded by "//".  If an authority is not present, the path
MUST NOT begin with two forward slashes.  A magnet link does not
have an authority because it is content addressed via a peer to peer
network, and an authority is only used when one central authority (for
instance the IANA for domain names) dictates the allocation and
structure of identifiers within that space.

[1]: RFC3986, in particular:
https://tools.ietf.org/html/rfc3986#appendix-A  (ABNF of a URI)
https://tools.ietf.org/html/rfc3986#section-3.2 (Description of an
                                                 authority)

> There is some confusion regarding the terms URI and URL. The W3C tries
> to clear that up[0].
The W3C deals exclusively with http and its derivative protocols (https,
etc), so we are not held to standards they pass (although we
unfortunately have to deal with the fallout sometimes).

> It used to be that they meant different things in what they call the
> "classical view" but this difference isn't as dominant anymore in the
> "contemporary view". In my understanding, a scheme, which would be
> `gemini:`, can define on its own what it expects after the colon.
> You can find more registered schemes here, some of which require a
> double-slash, some don't.
In conversational terms, or even in technical discussion by
implementers, yes they have different meanings than they used to.
However, when you're dealing with standards, you have to use formally
defined words and as such the "classical" meanings of URI/URL/URN are
what is generally used on /this list/ because we^Weveryone else loves to
bikeshed the spec a lot.

Also, the gemini spec specifically says:
> This scheme is syntactically compatible with the generic URI syntax
> defined in RFC 3986, but does not support all components of the generic
> syntax. In particular, the authority component is allowed and
> required.
https://gemini.circumlunar.space/docs/specification.html Section 1.2

Now, this is obviously what you're talking about changing, but there are
good reasons to be compliant with the URI spec (which, as previously
mentioned, mandates "//" precedes an authority).  The most compelling
reason being that there's a URI library for every programming language
that's ever been used in the past 30 years, and a main goal of gemini is
to reuse existing specifications wherever possible, to save people
the trouble of having to roll their own everything.  This is the same
reason that gemini uses DNS and TLS despite their respective problems.
They already exist, they're standardized, and they're well used by
everyone already.


I hope I don't seem rude here, I'm just trying to lay out the general
position clearly.

~nytpu

-- 
Alex // nytpu
alex at nytpu.com
GPG Key: https://www.nytpu.com/files/pubkey.asc
Key fingerprint: 43A5 890C EE85 EA1F 8C88 9492 ECCD C07B 337B 8F5B
https://useplaintext.email/

10. Baschdel (baschdel (a) disroot.org)

On 09.02.21 17:12, Daniel Nagy wrote:
> Hello,

Hello!

> Advantages:
>   - Less typing

That's the only real advantage I see of this proposal. (Humans are lazy)
However as it was already explained it is a part of the URL/URI 
standard, without which we couldn't reuse already existing code and we 
would have to add special cases everywhere.

There are two scenarios in which one would want to type less: browser
address bars and Gemini documents.

For browser address bars you can add special cases and whatever bells
and whistles and autocompletion you like, as most web browsers currently
do; just make sure that documents that do not conform to the standard
get what they deserve.

For your gemini documents just use relative URIs by leaving out the
scheme, or the scheme and the host.

# Example
=> gemini://example.org/some_file.gmi
=> //example.org/some_file.gmi
=> /some_file.gmi

If you have the links above in any document served over gemini from a
server at example.org, they will all point to the same resource.
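
The resolution of these three links can be sketched with Python's standard
`urllib.parse.urljoin`. Two assumptions are worth stating: the base document
URL `gemini://example.org/index.gmi` is hypothetical, and `urljoin` only
resolves relative references for schemes listed in its module-level
`uses_relative` and `uses_netloc` lists, which do not include `gemini` by
default, so the sketch registers it first.

```python
import urllib.parse
from urllib.parse import urljoin

# Assumption: urljoin skips schemes it does not know, so register "gemini".
for registry in (urllib.parse.uses_relative, urllib.parse.uses_netloc):
    if "gemini" not in registry:
        registry.append("gemini")

# Hypothetical URL of the document that contains the links.
BASE = "gemini://example.org/index.gmi"


def resolve(link, base=BASE):
    """Resolve a link found in a document against that document's URL."""
    return urljoin(base, link)


for link in (
    "gemini://example.org/some_file.gmi",  # full URL
    "//example.org/some_file.gmi",         # scheme left out
    "/some_file.gmi",                      # scheme and host left out
):
    print(link, "->", resolve(link))
```

All three calls yield the same absolute URL, which is what lets a document
drop the repeated `gemini://example.org` prefix without changing the syntax.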

11. Daniel Nagy (danielnagy (a) posteo.de)

> Also, the gemini spec specifically says:
>> This scheme is syntactically compatible with the generic URI syntax
>> defined in RFC 3986, but does not support all components of the generic
>> syntax. In particular, the authority component is allowed and
>> required.
> https://gemini.circumlunar.space/docs/specification.html Section 1.2
>
> Now, this is obviously what you're talking about changing, but there are
> good reasons to be compliant with the URI spec (which, as previously
> mentioned, mandates "//" precedes an authority).  The most compelling
> reason being that there's a URI library for every programming language
> that's ever been used in the past 30 years, and a main goal of gemini is
> to reuse existing specifications wherever possible, to save people
> the trouble of having to roll their own everything.  This is the same
> reason that gemini uses DNS and TLS despite their respective problems.
> They already exist, they're standardized, and they're well used by
> everyone already.

I agree, adhering to a widely used standard, and therefore being able to
use well-established libraries, is important. My expectation was that it
would be compatible with the standard, but it looks like it is not.

What also contributed to this expectation is that I can still
differentiate where a hostname ends and a path starts, since hostnames
cannot contain slashes.

> I hope I don't seem rude here, I'm just trying to lay out the general
> position clearly.
No rudeness experienced here. Thank you for your clarifying words.
I am happy to have learned about gemini and hope that I can help it
someday in the future.

12. Sean Conner (sean (a) conman.org)

It was thus said that the Great Daniel Nagy once stated:
> Hello,
> 
> I want to propose to drop the double slashes from the gemini URL syntax.
> 
> The reason for this is that they don't serve any semantic value and while
> the project is still young, I think it could still be changed. So instead
> of
> 
>    gemini://example.com
> 
> we would have:
> 
>    gemini:example.com

  All right.  Let's assume we do this.  Let's start with an existing link:

	gemini://gemini.conman.org/test/torture/0001

and work our way through the first few entries with the new style.  The
first is a full link in the Gemini document:

	gemini:gemini.conman.org/test/torture/0002

That works.  Our client can get the second page.  The next is a schemeless
link, which *is* valid in text/gemini documents (but not as the request).
This will now look like:

	gemini.conman.org/test/torture/0003

Here we hit our first potential snag---do we have a hostname or not?  In
fact, here are the links from the next few tests using your proposal:

	/test/torture/0005
	007
	/test/../test/torture/0009

And to further mess with things, I could add a test 0051 with a relative
link to:

	gemini.conman.org

Do I mean the top level page of my server?  Or the page

	/test/torture/gemini.conman.org

Because remember, you can have relative links in text/gemini documents, and
the hypothetical test 0051 has a path of

	/test/torture/0051

(and don't think I wouldn't do it).
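
A short sketch of why a parser cannot guess here, assuming Python's standard
`urllib.parse.urlsplit` and the reference kinds named in RFC 3986 section 4.2.
The `classify` helper is a hypothetical name used only for illustration.

```python
from urllib.parse import urlsplit


def classify(reference):
    """Name the kind of reference a link is, following RFC 3986 section 4.2."""
    if urlsplit(reference).scheme:
        return "URI with a scheme"
    if reference.startswith("//"):
        return "network-path reference"
    if reference.startswith("/"):
        return "absolute-path reference"
    return "relative-path reference"


for link in (
    "//gemini.conman.org/test/torture/0003",  # schemeless link as used today
    "gemini.conman.org/test/torture/0003",    # same link with "//" dropped
    "/test/torture/0005",
    "gemini.conman.org",
):
    print(f"{link!r}: {classify(link)}")
```

Once the "//" is gone, nothing distinguishes a hostname from the first segment
of a relative path, so `gemini.conman.org` is read as a file name next to the
current document.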

> In fact, Sir Tim Berners-Lee apologized[0] for introducing them in the
> http URL syntax. I see the following advantages and disadvantages:

  Not mentioned is the alternative he would have done.

> Advantages:
>  - Less typing
>  - Less wasted screenspace
>  - Fewer transferred bytes and fewer stored bytes on disk and in memory

  We're running over TLS.  There's already quite a bit of overhead, as I
documented here:

	gemini://gemi.dev/gemini-mailing-list/messages/001958.gmi

> Disadvantages:

  - Breaks relative linking in documents.  

  -spc

[0]: https://www.sitepoint.com/sir-tim-berners-lee-http-slashes/

13. Philip Linde (linde.philip (a) gmail.com)

On Tue, 09 Feb 2021 17:12:57 +0100
Daniel Nagy <danielnagy at posteo.de> wrote:

> The reason for this is that they don't serve any semantic value

They do serve a semantic value. They signify the beginning of the
authority portion of a URI. Not all URIs contain authorities, and
the authority means something in the sense that there is a standardized
way to interpret it. That you don't think that this semantic value is
important, or that it's an important characteristic of Gemini to rely
on existing standards, is a different matter.

So to add to your list, further disadvantages:
- Can't rely as heavily on existing URI standards
- "Encouraged" standards lead to unreliable support

For those minuscule advantages I don't see why this should be
considered.

-- 
Philip

14. Daniel Nagy (danielnagy (a) posteo.de)


Sean Conner <sean at conman.org> writes:

> The next is a schemeless link, which *is* valid in text/gemini documents
> (but not as the request). This will now look like:
>        gemini.conman.org/test/torture/0003
I didn't know Gemtext allows those. To me, both before and after, this
would be a file. The client should not guess what could be a hostname or a
directory. I thought that, to switch hostnames in a document, you would always
have needed to write the full "gemini://". But with that scheme-less syntax,
the notation is even shorter.

> We're running over TLS. There's already quite a bit of overhead, as I
> documented here:
>  gemini://gemi.dev/gemini-mailing-list/messages/001958.gmi
Interesting resource, thank you.
