On Tue, Dec 29, 2020 at 10:11 AM Petite Abeille <petite.abeille at gmail.com> wrote:
>
> On Dec 29, 2020, at 10:03, Sean Conner <sean at conman.org> wrote:
> >
> > Per this wording, any client that receives "text/plain; charset=us-ascii"
> > is allowed to just drop it on the floor and do absolutely nothing with it.
>
> Nonsense. A compliant client MUST support UTF-8. US-ASCII is a strict subset of UTF-8. Therefore a compliant client supports US-ASCII out-of-the-box. Nothing more, and nothing less.

A car contains people. Therefore people are cars.

Petite, you are confusing Is-A and Has-A relationships [1][2]. UTF-8 is a ("separate" from US-ASCII) character encoding that contains ASCII charset. If the spec said "clients MUST support ONLY UTF-8" then any pages specifying "charset=us-ascii" must result in an error.

[1] https://en.wikipedia.org/wiki/Is-a
[2] https://en.wikipedia.org/wiki/Has-a

Back to a more productive topic, the wording in the spec - "clients MUST support UTF-8 encoded responses" - is ambiguous and doesn't actually mean that acceptable value for "charset" must include "utf-8", and says nothing about what values of "charset" are acceptable. It says that clients must at the very least try to decode response using UTF-8 charset decoder. Responses encoded with US-ASCII and UTF-8 (and UTF-PETER, which is a random subset of UTF-8) will indeed work.

Looking at latest stats on gemini://gemini.bortzmeyer.org/software/lupa/stats.gmi it looks like UTF-8 (this includes unspecified charsets which per spec default to UTF-8) is used by 81% of pages, US-ASCII accounts for 17%.

Given this, I suggest the spec be rephrased such that it instead specifies minimum acceptable values of "charset" (specifically us-ascii and utf-8).
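
To make the suggestion concrete, here is a rough sketch of what a client could do if the spec named "us-ascii" and "utf-8" as the minimum acceptable "charset" values. The names ACCEPTED_CHARSETS, charset_from_meta and decode_body are made up for illustration and come from neither the spec nor any existing client, and the parameter parsing is deliberately naive.

```python
# Hypothetical sketch: the names below are illustrative, not from the spec.
ACCEPTED_CHARSETS = {"utf-8", "us-ascii"}


def charset_from_meta(meta: str) -> str:
    """Return the lowercased charset parameter of a MIME type string.

    Falls back to "utf-8" when no charset parameter is present, which is
    the default described in the message above.
    """
    for param in meta.split(";")[1:]:
        name, _, value = param.partition("=")
        if name.strip().lower() == "charset":
            return value.strip().strip('"').lower()
    return "utf-8"


def decode_body(meta: str, body: bytes) -> str:
    """Decode a response body, accepting only the minimum charset set."""
    charset = charset_from_meta(meta)
    if charset not in ACCEPTED_CHARSETS:
        raise ValueError(f"unsupported charset: {charset}")
    # Every US-ASCII byte sequence is also valid UTF-8, so one UTF-8
    # decoder covers both accepted values. Note this is lenient: a body
    # labelled us-ascii that actually holds non-ASCII UTF-8 still decodes.
    return body.decode("utf-8")
```

With this, "text/plain; charset=us-ascii" is decoded rather than dropped on the floor, while something like "charset=iso-8859-1" is rejected explicitly instead of being left to interpretation.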