IDN with Gemini?

It was thus said that the Great Stephane Bortzmeyer once stated:
> On Tue, Dec 08, 2020 at 01:18:07AM +0100,
>  Philip Linde <linde.philip at gmail.com> wrote 
>  a message of 69 lines which said:
> 
> > homograph attacks
> 
> Homograph attacks are basically a good way to make an english-speaking
> audience laugh when you show them funny Unicode problems (I've seen
> that several times in several meetings: the languages and scripts of
> other people are always funny). No bad guy use them in real life,
> probably because users typically never check the URI or IRI.

  True, there's no need currently for homograph attacks if other, simpler
means are available.

> And they exist with ASCII, too (goog1e.com...)

  True.  But a more concerning attack is bitsquatting [1], a much harder
attack to thwart.  Is it widely used?  Hard to say actually.
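
  To make the bitsquatting idea concrete, here is a minimal sketch in Python. It assumes a lowercase ASCII label and simply lists every label that differs from it by one flipped bit and is still a legal hostname label. The function name `bitsquats` and the set `VALID` are made up for illustration; they are not taken from [1].

```python
import string

# Characters allowed in a hostname label (letters, digits, hyphen).
VALID = set(string.ascii_lowercase + string.digits + "-")

def bitsquats(label):
    """Return the labels that differ from `label` by exactly one flipped bit.

    Assumes `label` is lowercase ASCII.  This is an illustration only; it
    ignores rules such as "no leading or trailing hyphen".
    """
    out = set()
    for i, ch in enumerate(label):
        for bit in range(8):
            # Flip one bit of this octet; DNS is case-insensitive, so fold case.
            flipped = chr(ord(ch) ^ (1 << bit)).lower()
            if flipped in VALID and flipped != ch:
                out.add(label[:i] + flipped + label[i + 1:])
    return sorted(out)

if __name__ == "__main__":
    for name in bitsquats("google"):
        print(name + ".com")
```

  Note that "goog1e" is not in that list: 'l' and '1' differ in more than one bit, so that one is a look-alike, not a bitsquat. The two attacks are unrelated, which is part of why bitsquatting is harder to defend against: there is nothing for the user to look at.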

> > Some browsers deal with homograph attacks by displaying punycode
> > directly based on some basic heuristic (e.g. when a hostname
> > contains both cyrillic and latin codes).
> 
> Which is awful for the UX. Note that such mangling is never done for
> ASCII, which clearly shows a provincial bias toward english.
> 
> > Octet encoded ASCII does have the nice property that there are no
> > homographs, there's no normalization,
> 
> This is not true. Since percent-encoding encodes bytes, there are
> still several ways to represent "the same" string of characters and
> therefore normalization remains an issue.

  Yes, but by "normalization" they mean precomposed characters (like
"\u{00E9}") vs. combining characters (like "e\u{0301}"), along with the
ordering of consecutive combining characters.
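
  A small sketch of both points, using Python's standard `unicodedata` and `urllib.parse` modules. The helper name `same_text` is an invention for this example, and the choice of NFC as the comparison form is an assumption, not something any Gemini document mandates.

```python
import unicodedata
from urllib.parse import quote, unquote

precomposed = "\u00e9"    # é as a single code point
combining = "e\u0301"     # e followed by COMBINING ACUTE ACCENT

def same_text(a, b):
    """Compare two strings after normalizing both to NFC (an assumed choice)."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)

# The two spellings look the same but are different code point sequences,
# so they percent-encode to different octets.
print(precomposed == combining)            # False
print(same_text(precomposed, combining))   # True
print(quote(precomposed))                  # %C3%A9
print(quote(combining))                    # e%CC%81

# Ordering of consecutive combining characters: dot below (U+0323) and
# acute (U+0301) can be typed in either order; NFD puts them in one order.
a = "a\u0301\u0323"
b = "a\u0323\u0301"
print(a == b)                                                              # False
print(unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b))  # True

# Percent-encoding adds its own variants on top: hex digits in either case
# decode to the same octets.
print(unquote("%c3%a9") == unquote("%C3%A9"))                              # True
```

  So both statements hold: percent-encoding leaves several spellings of "the same" string, and Unicode normalization is a separate layer of the same problem.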

> > RFC 4690 is a good read on the topic of IDNs.
> 
> No, it is a one-sided anti-internationalization rant.

  Aside from the "internationalization is hard" message, what's so bad
about the document?  Remember, they *are* (or *were*) trying to retrofit
internationalization into protocols that were never designed for it.

  -spc

[1]	http://www.dinaburg.org/bitsquatting.html
