It was thus said that the Great Côme Chilliet once stated:
> On Monday, 7 December 2020, 19:00:02 CET, colecmac at protonmail.com wrote:
> >
> > This would then require IRI parsing libraries, and as I have explained
> > earlier, these likely don't exist in many programming languages, and
> > when they do, they are third-party.
>
> From what you said on irc, the situation is different between URI and IRI
> because most languages have URI parsing either in their stdlib or in a
> well-tested, well-known library. But if no project uses IRI, of course no
> one will write a library for it; this is a chicken and egg situation here.

  I'm looking at RFC-3987 [1] and the changes from RFC-3986 [2] are minimal,
and it would be easy to modify my own URI parsing library [3] (which is
based directly off the BNF of RFC-3986), but that only gets me so far. The
other issue is Unicode normalization and punycode support, and for both of
those I would have to track down existing libraries or (and I shudder to
think this) write my own.

> Also, for the purpose of a client, it seems to me the parsing needed
> (domain and query extraction) is only to search for the first "/" and the
> last "?", and some minor tweaks on the scheme maybe (which does not
> contain unicode, I will leave the scheme alone, promise).

  And then do some Unicode normalization to match how filenames are stored
on your server. An "é" can be spelled as one precomposed code point (U+00E9)
or as "e" followed by a combining acute accent (U+0301), so two URLs that
look the same need not be the same sequence of bytes:

	http://www.example.org/résumé.html
	http://www.example.org/résumé.html

  -spc

[1]	https://tools.ietf.org/html/rfc3987
[2]	https://tools.ietf.org/html/rfc3986
[3]	https://github.com/spc476/LPeg-Parsers/blob/master/url.lua
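
  To make the normalization and punycode point above concrete, here is a
rough sketch in Python (not the LPeg library from [3]), chosen only because
its standard library happens to ship both pieces. The helper name
iri_to_uri and the choice of NFC are assumptions made for illustration, not
anything RFC-3987 or Gemini mandates. The built-in "idna" codec implements
the older IDNA 2003 rules, and the sketch ignores userinfo and does not
attempt full RFC-3987 validation.

```python
import unicodedata
from urllib.parse import quote, urlsplit, urlunsplit


def iri_to_uri(iri: str, form: str = "NFC") -> str:
    """Hypothetical helper: map an IRI to an ASCII-only URI.

    Assumes the server wants `form` (NFC by default) for path and query.
    """
    parts = urlsplit(iri)

    # Host: punycode via the built-in "idna" codec (IDNA 2003 rules).
    host = parts.hostname.encode("idna").decode("ascii") if parts.hostname else ""
    if parts.port is not None:
        host = f"{host}:{parts.port}"

    # Path, query, fragment: normalize first, then percent-encode the
    # UTF-8 bytes. "%" stays safe so existing escapes are not re-encoded.
    def encode(text: str, safe: str) -> str:
        return quote(unicodedata.normalize(form, text), safe=safe)

    return urlunsplit((
        parts.scheme,
        host,
        encode(parts.path, "/%"),
        encode(parts.query, "=&%"),
        encode(parts.fragment, "%"),
    ))


# The same visible URL, spelled two different ways.
composed = "http://www.example.org/r\u00e9sum\u00e9.html"      # U+00E9
decomposed = "http://www.example.org/re\u0301sume\u0301.html"  # e + U+0301
```

  With NFC applied, both spellings come out as the same URI. Without a
normalization step they would percent-encode to different byte sequences,
which is exactly the filename mismatch described above.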