[spec] IRIs, IDNs, and all that international jazz

> Feedback welcome, especially if I've overlooked anything, which is
> certainly possible.  What I'd be most interested in hearing, at this
> point, is client authors letting me know whether the standard library
> in the language their client is implemented in can straightforwardly:
>
> 1. Parse and relativise URLs with non-ASCII characters (so, yes, okay,
>    technically not URLs at all, you know what I mean) in paths and/or
>    domains?
> 2. Transform back and forth between URIs and IRIs?
> 3. Do DNS lookups of IDNs without them being punycoded first?  You can
>    test this with räksmörgås.josefsson.org.

The main language I use for Gemini software is Go. My clients, Amfora and
gemget, are both programmed using Go, and they use Go's built-in URL
library, called "net/url".

This library cannot properly handle 1, 2, or 3. This is likely because the Go
stdlib is high quality and appears to be coded to follow the RFCs very
strictly, and the library was only designed to support URLs, not IRIs.

For example, it will accept invalid characters in the path when parsing the
URL, but when converting it back into a string, it will percent-encode the
invalid characters. This does not happen with the query string, though.
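
A minimal sketch of that asymmetry, for anyone who wants to reproduce it. It
is not taken from Amfora or gemget; example.org is a placeholder host and
roundTrip is a made-up helper name.

```go
package main

import (
	"fmt"
	"net/url"
)

// roundTrip is a hypothetical helper: it parses raw with net/url and
// immediately serialises the result again.
func roundTrip(raw string) (string, error) {
	u, err := url.Parse(raw)
	if err != nil {
		return "", err
	}
	return u.String(), nil
}

func main() {
	// The same non-ASCII word appears in the path and in the query so
	// that the two can be compared after the round trip.
	out, err := roundTrip("gemini://example.org/räksmörgås?q=räksmörgås")
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println(out)
}
```

With current Go releases the path should come back percent-encoded
(/r%C3%A4ksm%C3%B6rg%C3%A5s) while the query is still the raw q=räksmörgås.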

Because paths and query strings are treated differently, converting IRIs to
URIs is not straightforward. Doing the reverse would require taking the parts
of the parsed URL, decoding each of them compliantly, and then stitching them
back together manually.
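
One possible shape for the IRI-to-URI direction, as a rough sketch only. It
assumes the host is already ASCII, ignores fragments and Unicode
normalisation, and the helper names percentEncodeNonASCII and iriToURI are
invented for this example. The reverse direction is not covered.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// percentEncodeNonASCII percent-encodes every byte >= 0x80 and leaves all
// ASCII bytes, including existing %XX escapes, untouched.
func percentEncodeNonASCII(s string) string {
	var b strings.Builder
	for i := 0; i < len(s); i++ {
		if c := s[i]; c >= 0x80 {
			fmt.Fprintf(&b, "%%%02X", c)
		} else {
			b.WriteByte(c)
		}
	}
	return b.String()
}

// iriToURI is a hypothetical helper. Assumption: the host part of iri is
// already ASCII, so only the path and query need attention.
func iriToURI(iri string) (string, error) {
	u, err := url.Parse(iri)
	if err != nil {
		return "", err
	}
	// String() escapes the path on its own; the query is written out
	// as-is, so it has to be encoded by hand first.
	u.RawQuery = percentEncodeNonASCII(u.RawQuery)
	return u.String(), nil
}

func main() {
	out, err := iriToURI("gemini://example.org/räksmörgås?q=räksmörgås")
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println(out)
}
```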

As for #3, the Go stdlib looks up the domain in the URL as-is, and will not
punycode anything. I have had to do it myself, which was annoying but not
super difficult. Amfora and gemget both have support for IDNs.
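
To illustrate what "punycoding" a hostname involves, here is a stdlib-only
sketch of the bare Punycode encoding from RFC 3492. It is not the code used
in Amfora or gemget, it does none of the IDNA mapping or validation steps,
and it skips the overflow checks the RFC asks for. A real client would more
likely call idna.ToASCII from golang.org/x/net/idna. The names encodeLabel
and toASCIIHost are invented for this example.

```go
package main

import (
	"fmt"
	"strings"
)

// Punycode parameters from RFC 3492, section 5.
const (
	base, tMin, tMax = 36, 1, 26
	skew, damp       = 38, 700
	initialBias      = 72
	initialN         = 128
)

// adapt is the bias adaptation function from RFC 3492, section 6.1.
func adapt(delta, numPoints int, first bool) int {
	if first {
		delta /= damp
	} else {
		delta /= 2
	}
	delta += delta / numPoints
	k := 0
	for delta > ((base-tMin)*tMax)/2 {
		delta /= base - tMin
		k += base
	}
	return k + (base-tMin+1)*delta/(delta+skew)
}

// digit maps 0..25 to 'a'..'z' and 26..35 to '0'..'9'.
func digit(d int) byte {
	if d < 26 {
		return byte('a' + d)
	}
	return byte('0' + d - 26)
}

// encodeLabel returns an all-ASCII label unchanged and otherwise returns
// "xn--" plus its Punycode form. No overflow checks (sketch only).
func encodeLabel(label string) string {
	runes := []rune(label)
	var out []byte
	for _, r := range runes {
		if r < initialN {
			out = append(out, byte(r)) // copy basic code points first
		}
	}
	basic := len(out)
	if basic == len(runes) {
		return label
	}
	if basic > 0 {
		out = append(out, '-')
	}
	n, delta, bias := initialN, 0, initialBias
	for h := basic; h < len(runes); {
		m := 0x110000 // one past the largest code point
		for _, r := range runes {
			if c := int(r); c >= n && c < m {
				m = c // smallest code point not yet handled
			}
		}
		delta += (m - n) * (h + 1)
		n = m
		for _, r := range runes {
			if int(r) < n {
				delta++
			}
			if int(r) != n {
				continue
			}
			// Emit delta as a variable-length integer.
			q := delta
			for k := base; ; k += base {
				t := k - bias
				if t < tMin {
					t = tMin
				} else if t > tMax {
					t = tMax
				}
				if q < t {
					break
				}
				out = append(out, digit(t+(q-t)%(base-t)))
				q = (q - t) / (base - t)
			}
			out = append(out, digit(q))
			bias = adapt(delta, h+1, h == basic)
			delta = 0
			h++
		}
		delta++
		n++
	}
	return "xn--" + string(out)
}

// toASCIIHost encodes each dot-separated label of host.
func toASCIIHost(host string) string {
	labels := strings.Split(host, ".")
	for i, l := range labels {
		labels[i] = encodeLabel(l)
	}
	return strings.Join(labels, ".")
}

func main() {
	fmt.Println(toASCIIHost("räksmörgås.josefsson.org"))
}
```

For räksmörgås.josefsson.org this yields xn--rksmrgs-5wao1o.josefsson.org,
which is the name that would then be handed to the resolver.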

See the link below for how IDN support was added, if it's of interest.

https://github.com/makeworld-the-better-one/go-gemini/compare/a557676343c51dabbc7d5a112d38bb8095db94d7...2f79af7688e88942d0d51d6ed65617b68a91a733


I believe these difficulties have implications for whether or not IRIs should
be added to the spec, but I'd rather let this email and the facts of the matter
stand on their own.


makeworld
