The legality of double slashes in URIs

Martin Chang replied [1] to my musings on processing malformed Gemini requests [2], saying that double slashes in URIs (Uniform Resource Identifiers) are illegal, and pointed to the ABNF (Augmented Backus-Naur Form) grammar from the URI specification [3] to back up his claim:

path          = path-absolute   ; begins with "/" but not "//"
path-absolute = "/" [ segment-nz *( "/" segment ) ]
segment-nz    = 1*pchar
pchar         = unreserved / pct-encoded / sub-delims / ":" / "@"

But he didn't quote the segment rule:

segment       = *pchar

which translated says, “0 or more pchar rules.”

So the ABNF he quoted does indeed rule out //boston/2018/07/04.2. It doesn't rule out /boston//2018/07/04.2, since by the time we hit the double slash, we're in the *( "/" segment ) part of the path-absolute rule, and segment can have 0 characters. But what he quoted only applies to relative links; what I receive is an absolute link. If you follow the ABNF from that perspective:

URI-reference = URI / relative-ref
URI           = scheme ":" hier-part [ "?" query ] [ "#" fragment ]
hier-part     = "//" authority path-abempty
                 / path-absolute
                 / path-rootless
                 / path-empty

path-abempty  = *( "/" segment )

; other rules omitted

not only does this allow gemini://gemini.conman.org//boston/2018/07/04.2 but also gemini://gemini.conman.org///////////boston/2018/07/04.2.
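To make the difference between the two rules concrete, here is a minimal sketch in Python that transcribes the ABNF above into regular expressions. The unreserved, pct-encoded and sub-delims character sets are taken from RFC 3986; the names PCHAR, is_path_absolute and is_path_abempty are my own for illustration and don't come from any library. It only checks the path portion, not a whole URI.

```python
import re

# pchar = unreserved / pct-encoded / sub-delims / ":" / "@"
# (character sets transcribed from RFC 3986)
PCHAR = r"(?:[A-Za-z0-9\-._~!$&'()*+,;=:@]|%[0-9A-Fa-f]{2})"
SEGMENT = PCHAR + "*"      # segment    = *pchar
SEGMENT_NZ = PCHAR + "+"   # segment-nz = 1*pchar

# path-absolute = "/" [ segment-nz *( "/" segment ) ]
PATH_ABSOLUTE = re.compile(rf"/(?:{SEGMENT_NZ}(?:/{SEGMENT})*)?")

# path-abempty  = *( "/" segment )
PATH_ABEMPTY = re.compile(rf"(?:/{SEGMENT})*")


def is_path_absolute(path: str) -> bool:
    """True if the whole string matches the path-absolute rule."""
    return PATH_ABSOLUTE.fullmatch(path) is not None


def is_path_abempty(path: str) -> bool:
    """True if the whole string matches the path-abempty rule."""
    return PATH_ABEMPTY.fullmatch(path) is not None


if __name__ == "__main__":
    for p in ("//boston/2018/07/04.2",
              "/boston//2018/07/04.2",
              "//////////boston/2018/07/04.2"):
        print(p, is_path_absolute(p), is_path_abempty(p))
```

Under path-absolute, only the leading double slash is rejected; under path-abempty (the rule that applies after an authority), every one of these paths is accepted, because each extra slash is just an empty segment.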

I can understand why this was done: to simplify the grammar, as the various path- rules generally end with *( "/" segment ), which allows one to end a URI with a trailing slash or not. I don't think the intent was to allow long strings of slashes, but that's the end result of a lax grammar. While Martin is also correct that multiple slashes are treated as a single slash on POSIX (Portable Operating System Interface) systems (basically, any Unix system), that's not the case across all operating systems. One exception I can think of is AmigaOS, where each slash represents a parent directory. The command cd /// on AmigaOS is the same as cd ../../.. on a POSIX system. Crazy, I know. And maybe not even relevant these days, but I thought I should mention it.
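A small sketch of the two readings, again in Python. The POSIX side uses posixpath.normpath from the standard library, which is only a string-level approximation of what a Unix system does when resolving a path. The AmigaOS side is a hypothetical helper of my own (amiga_leading_parents), and it assumes nothing more than the rule described above: each leading slash means "go up one directory."

```python
import posixpath


def amiga_leading_parents(path: str) -> str:
    """Hypothetical helper: rewrite the leading slashes of an
    AmigaOS-style path as the equivalent run of POSIX ".." components,
    assuming each leading slash means one parent directory."""
    levels = len(path) - len(path.lstrip("/"))
    return "/".join([".."] * levels)


if __name__ == "__main__":
    # POSIX-style normalisation: the empty segment disappears.
    print(posixpath.normpath("/boston//2018/07/04.2"))

    # AmigaOS reading: three slashes, three levels up.
    print(amiga_leading_parents("///"))
```

One wrinkle on the POSIX side: normpath leaves exactly two leading slashes alone (POSIX lets implementations treat that case specially), while three or more collapse to one.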

[1] gemini://gemini.clehaxze.tw/gemlog/2022/05-03-two-cents-on-the-mistery-of-double-slashes-in-urls.gmi

[2] /boston/2022/04/30.1

[3] https://www.ietf.org/rfc/rfc3986.txt
