IDN with Gemini?

🗣️ From: Sean Conner (sean (a) conman.org)
📅 Sent: 2020-12-08 00:04
📧 Message 52 of 68

It was thus said that the Great Côme Chilliet once stated:
> On Monday, 7 December 2020, 19:00:02 CET, colecmac at protonmail.com wrote:
> > 
> > This would then require IRI parsing libraries, and as I have explained
> > earlier, these likely don't exist in many programming languages, and
> > when they do, they are third-party.
> 
> From what you said on IRC, the situation is different between URI and IRI
> because most languages have URI parsing either in their stdlib or in a
> well-tested, well-known library. But if no project uses IRIs, of course no
> one will write a library for them; this is a chicken-and-egg situation.

  I'm looking at RFC-3987 [1] and the changes from RFC-3986 [2] are minimal,
and it would be easy to modify my own URI parsing library [3] (which is
based directly off the BNF of RFC-3986), but that only gets me so far.  The
other issue is Unicode normalization and punycode support; for both I would
have to track down existing libraries or (and I shudder to think this)
write my own.
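
  As a rough illustration of the two extra pieces (not taken from the
library in [3]), here is a minimal Python sketch.  It assumes Python's
built-in "idna" codec, which implements the older IDNA 2003 rules, and the
stdlib unicodedata module; the function name host_to_ascii is made up for
this example.

```python
import unicodedata

def host_to_ascii(host: str) -> str:
    """Return the ASCII (punycode) form of a possibly non-ASCII host name.

    Hypothetical helper for illustration only. The built-in "idna" codec
    follows IDNA 2003 (RFC 3490), so newer IDNA 2008 rules are not covered.
    """
    # Normalize to NFC first so composed and decomposed input agree.
    normalized = unicodedata.normalize("NFC", host)
    # The codec converts each label that needs it to its "xn--" form.
    return normalized.encode("idna").decode("ascii")

if __name__ == "__main__":
    print(host_to_ascii("b\u00fccher.example"))
```

  A language without equivalents of these two stdlib pieces is where the
"track down or write my own" problem shows up.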

> Also, for the purpose of a client, it seems to me the parsing needed
> (domain and query extraction) is only to search for the first "/" and the
> last "?", and some minor tweaks on the scheme maybe (which does not
> contain Unicode; I will leave the scheme alone, promise).
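
  For reference, the split described in the quote could be sketched roughly
as follows in Python.  The function name naive_split is invented for this
example, it assumes the URL has a "scheme://" prefix, and it ignores
userinfo, ports and fragments.  Note that RFC-3986 starts the query at the
first "?", so splitting on the last one is a simplification.

```python
from typing import Optional, Tuple

def naive_split(url: str) -> Tuple[str, str, str, Optional[str]]:
    """Split a URL into (scheme, authority, path, query) by plain searching.

    Hypothetical helper: no validation, no percent-decoding, no fragments.
    """
    scheme, sep, rest = url.partition("://")
    if not sep:
        raise ValueError("expected a scheme followed by '://'")
    # The first "/" after the authority starts the path.
    slash = rest.find("/")
    if slash == -1:
        authority, tail = rest, ""
    else:
        authority, tail = rest[:slash], rest[slash:]
    # The last "?" in the remainder starts the query, as in the quote.
    mark = tail.rfind("?")
    if mark == -1:
        return scheme, authority, tail, None
    return scheme, authority, tail[:mark], tail[mark + 1:]

if __name__ == "__main__":
    print(naive_split("gemini://example.org/search?caf\u00e9"))
```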

  And then do some Unicode normalization to match how filenames are stored
on your server:

	http://www.example.org/résumé.html
	http://www.example.org/résumé.html
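
  The two names above render the same, but the first spells each accented
letter as the single code point U+00E9 and the second as "e" followed by
the combining acute accent U+0301.  A small Python sketch of the mismatch
and of normalizing before comparing (the variable names and the helper
same_name are made up for this example; NFC is an arbitrary choice of
form, and a server would have to use whichever form its filenames use):

```python
import unicodedata
from urllib.parse import quote

composed = "r\u00e9sum\u00e9.html"        # precomposed U+00E9
decomposed = "re\u0301sume\u0301.html"    # "e" + combining U+0301

def same_name(a: str, b: str) -> bool:
    """Compare two names after normalizing both to NFC (hypothetical helper)."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)

if __name__ == "__main__":
    print(composed == decomposed)          # False: different code points
    print(quote(composed))                 # r%C3%A9sum%C3%A9.html
    print(quote(decomposed))               # re%CC%81sume%CC%81.html
    print(same_name(composed, decomposed)) # True once both are NFC
```

  Without the normalization step, a request for one spelling will not match
a file stored under the other.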

  -spc

[1]	https://tools.ietf.org/html/rfc3987

[2]	https://tools.ietf.org/html/rfc3986

[3]	https://github.com/spc476/LPeg-Parsers/blob/master/url.lua
