On Mon Dec 28, 2020 at 1:12 PM CET, William Orr wrote:
> Normalization is the process of looking for all of these synonyms for
> characters, and standardizing them to the same set of codepoints. If you
> don't normalize, you could have a case where one user gets the intended
> host for écrire.hostname and another user gets an NXDOMAIN, all
> depending on the sequence of bytes their input method produced.

...and actually, now that I think about it, this issue is not specific to IRI support, is it?

Even if we followed the web's lead and declared that Gemini requests and text/gemini links must contain ASCII-only URLs, and people have to punycode non-ASCII hostnames and percent-encode the UTF-8 representations of non-ASCII paths, it's still possible for the server and client to have different ideas about how a hostname or path is represented, right? With one using a composed form and the other a decomposed form?

Whether you send a UTF-8 string as-is or first punycode and/or percent-encode it so that it's valid ASCII is totally orthogonal to that question. Or have I missed something important?

Cheers,
Solderpunk
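P.S. To make the composed vs. decomposed point concrete, here is a rough sketch in Python. It is only an illustration under some assumptions: the label "écrire" is borrowed from the quoted example, the variable names are made up, and the hostname step uses the raw Punycode algorithm on its own (a full IDNA implementation may normalise its input first, which is deliberately left out here).

```python
import unicodedata
from urllib.parse import quote

# The same visible text, "écrire", as two different codepoint sequences.
composed = unicodedata.normalize("NFC", "\u00e9crire")   # é as the single codepoint U+00E9
decomposed = unicodedata.normalize("NFD", composed)      # "e" followed by U+0301 (combining acute)

# Hostname side: raw Punycode only, with no IDNA preprocessing (assumption).
host_nfc = composed.encode("punycode")
host_nfd = decomposed.encode("punycode")

# Path side: percent-encode the UTF-8 bytes of each form.
path_nfc = quote(composed)      # "%C3%A9crire"
path_nfd = quote(decomposed)    # "e%CC%81crire"

print(host_nfc, host_nfd, host_nfc == host_nfd)
print(path_nfc, path_nfd, path_nfc == path_nfd)

# Normalising both sides to one agreed form before encoding removes the mismatch.
same_after_nfc = unicodedata.normalize("NFC", decomposed) == composed
print(same_after_nfc)
```

Both results are pure ASCII, yet they differ between the two forms, so the ASCII-only encoding step does not by itself settle which form client and server agree on.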