What is required to be IRI compliant?

On Mon Dec 28, 2020 at 1:12 PM CET, William Orr wrote:

> Normalization is the process of looking for all of these synonyms for
> characters, and standardizing them to the same set of codepoints. If you
> don't normalize, you could have a case where one user gets the intended
> host for écrire.hostname and another user gets an NXDOMAIN, all
> depending on the sequence of bytes their input method produced.

...and actually, now that I think about it, this issue is not specific to
IRI support, is it?  Even if we followed the web's lead and declared
that Gemini requests and text/gemini links must contain ASCII-only URLs,
and people have to do punycoding of non-ASCII hostnames and
percent-encoding of UTF-8 representations of non-ASCII paths, it's still
possible for the server and client to have different ideas about how a
hostname or path is represented, right?  With one using a composed form
and the other a decomposed form?  Whether you send a UTF-8 string as-is
or first punycode and/or percent-encode it so it's valid ASCII is
totally orthogonal to that question.  Or have I missed something
important?
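
To make the question concrete, here is a rough Python sketch of what I
mean.  The helper name `to_ascii_forms` is made up for illustration, and
it deliberately uses the raw punycode codec rather than a full IDNA
implementation, on the assumption that nothing normalizes the input
beforehand (a real IDNA library may well normalize for you).

```python
import unicodedata
from urllib.parse import quote

# Two spellings of the same visible word, as different input methods
# might produce them.
composed = "\u00e9crire"      # 'é' as a single code point (NFC form)
decomposed = "e\u0301crire"   # 'e' followed by a combining acute accent (NFD form)

def to_ascii_forms(label):
    """Hypothetical helper: return (punycoded, percent-encoded) ASCII forms.

    No normalization is applied here; that is the point of the sketch.
    """
    puny = "xn--" + label.encode("punycode").decode("ascii")
    pct = quote(label, safe="")  # percent-encodes the UTF-8 bytes
    return puny, pct

# The ASCII-only forms still differ, because the underlying code points differ.
print(to_ascii_forms(composed))
print(to_ascii_forms(decomposed))

# Normalizing both sides to the same form first makes them agree.
nfc = unicodedata.normalize("NFC", decomposed)
print(to_ascii_forms(nfc) == to_ascii_forms(composed))
```

If that is right, then the ASCII encoding step just carries the
disagreement along unchanged, and only an agreed normalization form on
both ends removes it.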

Cheers,
Solderpunk
