IETF policy on encodings and languages

We already have good support for multiple encodings and (in the case of
text/gemini) languages.  However, two questions arise:

a) What character encoding is used for META parts intended for human
consumption?  TL;dr answer: UTF-8.

b) What language is used for those META parts, since the server does not
know what languages are acceptable to the user?  TL;dr answer: start with
English, add other languages as necessary or useful.
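
As a rough illustration of the two TL;dr answers taken together, a server
could build its human-readable META along the following lines.  This is only
a sketch: the function name `build_meta`, the " / " separator, and the
1024-byte cap are assumptions made for the example, not anything this
proposal fixes.

```python
MAX_META_BYTES = 1024  # assumed cap on META size; adjust to whatever the spec says


def build_meta(english: str, *localized: str, sep: str = " / ") -> bytes:
    """Return META as UTF-8 bytes: English text first, then other languages.

    Extra languages are appended only while the encoded result still fits
    within MAX_META_BYTES; the English text is never dropped.
    """
    if not english:
        raise ValueError("an English message is required")
    parts = [english]
    for text in localized:
        candidate = sep.join(parts + [text])
        if len(candidate.encode("utf-8")) > MAX_META_BYTES:
            break  # stop adding languages once the cap would be exceeded
        parts.append(text)
    meta = sep.join(parts)
    if "\r" in meta or "\n" in meta:
        raise ValueError("META must fit on a single header line")
    encoded = meta.encode("utf-8")
    if len(encoded) > MAX_META_BYTES:
        raise ValueError("English message alone exceeds the META cap")
    return encoded
```

A Russian search engine might then call
build_meta("Enter search terms", "Введите поисковый запрос") for its 1x
prompt, so that English-only readers and Russian readers both get something
usable.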

Details:

BCP 18, IETF Policy on Character Sets and Languages
<https://tools.ietf.org/html/bcp18>, says what a spec should say about
character sets and languages.  The MUSTard of this BCP is:

1) Specs MUST say which parts of the protocol are meant to be
human-readable.  The answer should be that the META fields of status lines
1x, 4x (except 44, whose META is a delay in seconds), 5x, and 6x are
human-readable and everything else is part of the protocol.

2) Protocols MUST specify which character encoding is in use, and it MUST
be possible for it to be UTF-8.  Nailing that down for human-readable META
text is what needs to be done.  See (a).

3) Encodings that are used MUST be in the IANA registry.  Because we are
using media types, that happens already.  No action needed.

4) Protocols MUST have a way (which can be a default) of communicating the
encoding in use.  Fixing (2) will fix this one also.

5) Protocols in which users have text presented to them MUST have a way of
dealing with multiple languages.  We have a problem here for 1x that isn't
trivial to solve: what should a Russian search engine indexing both English
and Russian documents return as the META to a 1x response?  (6) is one
approach.

6) Where there is no ability to negotiate languages (Gemini has none), the
"i-default" language SHOULD be used.  "i-default" text MUST be
understandable to an English-speaking person, but MAY include text in other
languages if appropriate (e.g. the languages of the capsule or server).
See (b) and (5).

7) Protocols SHOULD use BCP 47 language tags to specify languages.  We do.

8) Material on i18n SHOULD be collected into a special section so that it
can be found by people concerned with i18n or l10n.  That one's up to
Solderpunk, though it will be necessary if the spec becomes one or more
RFCs.
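
To make points (1), (2), and (4) concrete, here is a minimal client-side
sketch, assuming the response header has the form
<STATUS><SPACE><META><CR><LF> and that META is always UTF-8.  The function
names are made up for this example, and the set of human-readable statuses
is simply the one proposed in (1), not settled spec text.

```python
from typing import Tuple


def meta_is_human_readable(status: int) -> bool:
    """True if META for this two-digit status is text meant for a person.

    Follows the proposal in (1): 1x, 4x (except 44), 5x, and 6x.
    """
    if status == 44:  # META here is a number of seconds, not prose
        return False
    return status // 10 in (1, 4, 5, 6)


def parse_header(line: bytes) -> Tuple[int, str, bool]:
    """Split a response header and decode META as UTF-8.

    Returns (status, meta, human_readable).  Raises ValueError on a
    malformed header; invalid UTF-8 raises UnicodeDecodeError, which is a
    subclass of ValueError.
    """
    if not line.endswith(b"\r\n"):
        raise ValueError("header must end with CRLF")
    status_bytes, _, meta_bytes = line[:-2].partition(b" ")
    if len(status_bytes) != 2 or not status_bytes.isdigit():
        raise ValueError("status must be exactly two digits")
    status = int(status_bytes)
    meta = meta_bytes.decode("utf-8")  # strict decoding: no silent fallback
    return status, meta, meta_is_human_readable(status)
```

A client can use the third element of the result to decide whether META
should be shown to the user as text or handled as protocol data (a media
type, a URL, or a retry delay).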