[spec] IRIs, IDNs, and all that international jazz

On Wed, Dec 23, 2020 at 11:00:58AM +0100,
 marc <marcx2 at welz.org.za> wrote 
 a message of 79 lines which said:

> So I value the decency which wants to include all 
> human languages in the gemini ecosystem.

Actually, all human *scripts*. In any case, a Gemini client or server
won't have to understand the language. (Mandatory AI in Gemini?)

> But in an effort to be inclusive in one dimension one ends up being
> exclusive in another dimension, namely in the space of computer
> languages/host operating systems.

We already do it with the mandatory TLS: some systems cannot run
Gemini (imagine a Gemini server in assembly language). 

> It is one thing to find full I18N support in a language such as
> python (slow batteries included), but what about minorities such
> as tcl, lua, m4 or sed ?

Lua is not a good example since the core language is, by design,
strictly limited. Any real Lua program uses several third-party
libraries.

> And so it strikes me as weird to embed the (combinatorial)
> complexity of human languages deep in the protocol stack,

I agree, but nobody suggested forcing Gemini software to understand
languages, only scripts.
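
To illustrate the point above, here is a minimal Python sketch of what
"handling a script" can look like in practice: purely mechanical encoding,
with no knowledge of the language involved. The function name iri_to_uri,
the host name and the path are hypothetical, the "idna" codec shipped with
Python implements the older IDNA 2003 rules, and nothing here is taken from
the Gemini specification; it only shows one possible mapping from Unicode
to ASCII.

```python
from urllib.parse import quote


def iri_to_uri(host: str, path: str) -> str:
    """Map a Unicode host and path to an ASCII-only gemini:// URI.

    Hypothetical helper, for illustration only.
    """
    # Convert each label of the host name to its ASCII ("xn--") form
    # using Python's built-in IDNA 2003 codec.
    ascii_host = host.encode("idna").decode("ascii")
    # Percent-encode the UTF-8 bytes of the path, leaving "/" untouched.
    ascii_path = quote(path, safe="/")
    return f"gemini://{ascii_host}{ascii_path}"


print(iri_to_uri("bücher.example", "/café"))
# prints gemini://xn--bcher-kva.example/caf%C3%A9
```

The code never needs to know that "bücher" is German or "café" is French;
it only transforms code points and bytes.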

> But even the layer just below that (the competent user level) this
> starts leaking. A gemini url starts with "gemini://" - that is ascii
> text, and even funnier, taken from latin. If a non-english user is
> confused by english (nay, latin, with no native speakers at all)
> words, then surely "gemini://" has to be rewritten as "tweling://"
> or "zwilling://" or whatever farsi, japanese or mongolian use for
> "twin". If not, then a full ascii text url should be manageable
> too...

The Web solved the problem by making the URI scheme optional. I don't
know of any Gemini clients that complete the URI with "gemini://" if
it's missing, but it is a possible approach.
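
A minimal sketch of that approach, in Python, assuming a client that
treats any input without "://" as a Gemini address. The function name
complete_gemini_url is hypothetical and does not describe any existing
client.

```python
def complete_gemini_url(text: str) -> str:
    """Prefix "gemini://" when the user typed no scheme.

    Hypothetical helper: the rule "no '://' means Gemini" is an
    assumption, not something required by the specification.
    """
    text = text.strip()
    if "://" in text:
        # A scheme is already present (gemini://, https://, ...).
        return text
    return "gemini://" + text


print(complete_gemini_url("example.org/page"))
# prints gemini://example.org/page
```

The user then only has to type or read "example.org/page", and the
Latin-derived scheme name stays out of sight.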

> an url is primarily a computer address.

This is clearly false. URIs are both a technical identifier (like an IP
address or an address in memory) *and* a text seen by humans and
displayed in TV ads, business cards, spoken over the phone,
etc. Unlike addresses, they have to be internationalized. (Nobody
would use the Web if HTTP URIs were really addresses.)

> Long ago I came across a version of (I think it was) Pascal that had been
> localised into french with language keywords like "begin" and "if"
> replaced. I am sure somebody can justify this somehow, but I thought
> this was an impediment to interoperability, and view the
> internationalising of computer protocols (as opposed to the user
> interfaces) in a similar way.

The idea is to have many more users than page authors and many more
page authors than programmers. Internationalizing programming
languages is a different issue, since programmers are a smaller group
of professionals.
