

Documents with mixed languages

Michael Lazar lazar.michael22 at gmail.com

Sat Dec 11 18:06:25 GMT 2021

- - - - - - - - - - - - - - - - - - - 

On Sat, Dec 11, 2021 at 8:31 AM Wolf <wolf at wolfsden.cz> wrote:

Hello,

Greetings!

I'm considering implementing a gemini client over Christmas, so I've
started to read the spec. And I think I've noticed a missing thing in
the text/gemini format.

Many things are missing from the text/gemini format, as it was intended to be a simple markup format. Instead of "is this possible?", think "does this extension meet the power-to-weight ratio?".

What I think is missing is
an ability to set language for a specific part of the document itself.

Do you have any examples of existing gemini pages in the wild that suffer from this problem? Or examples of pages that you intend to publish on gemini? I'm interested in the real world use-cases (in gemini) and not contrived scenarios.

I'm not sure if the spec is permanently frozen or can still be extended,
but I would like to propose adding a language toggle command to
text/gemini.

I can tell you there's approximately zero chance of this being added to the official spec. Your best bet, if you're serious about this, is to go ahead and implement your proposal in your client. If others find it useful they will start using it too (see: how titan is being adopted).

Not sure about the specific syntax, basically anything would work, ideally
something that would not be very disruptive if just rendered by clients
that do not know about it. So for example:
Here are some examples in Japanese and Chinese:
!lang=ja
こんにちは。
!lang=zh
你好。
!lang=en
And back to the original language.
This is just a proposal, I don't really care if it is a switch (like my
example) or some kind of block.

This doesn't allow for mixed languages inside of a single line/paragraph though.
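
For illustration, here is a minimal sketch of how a client might handle the proposed line-based toggle. It assumes the hypothetical `!lang=` syntax from the example above, which is not part of the text/gemini spec; the function name `split_by_language` and the default language are likewise made up for this sketch, and preformatted blocks are ignored for brevity. Because the toggle occupies a line of its own, the language can only change between lines, never inside one.

```python
from typing import List, Tuple

# Hypothetical toggle prefix taken from the proposal; not part of text/gemini.
LANG_PREFIX = "!lang="


def split_by_language(text: str, default_lang: str = "en") -> List[Tuple[str, str]]:
    """Return (language, line) pairs for every line that is not a toggle."""
    current = default_lang
    tagged = []
    for line in text.splitlines():
        if line.startswith(LANG_PREFIX):
            # A toggle line switches the language for all following lines.
            tag = line[len(LANG_PREFIX):].strip()
            if tag:
                current = tag
            continue  # the toggle itself is not displayed
        tagged.append((current, line))
    return tagged


if __name__ == "__main__":
    doc = "Here are some examples:\n!lang=ja\nこんにちは。\n!lang=zh\n你好。\n!lang=en\nAnd back."
    for lang, line in split_by_language(doc):
        print(lang, line)
```

A client that does not know about the toggle would simply display the `!lang=...` lines as ordinary text, which is the fallback behaviour the proposal asks for.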

Now, why is this necessary? For Japanese and Chinese in particular,
Unicode is simply not enough. The same code point can be rendered very
differently depending on the language. So for proper rendering of the
text, you need to know the language. If the document has just one, there
is currently a way to do so (document language), but for a
mixture, it is simply not possible. And I think that is a shame.
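
As a rough sketch of the document-level mechanism mentioned above: the language of a whole document is carried as a `lang` parameter on the `text/gemini` MIME type in the response header, for example `text/gemini; lang=ja`. The helper below is hypothetical (the name `document_languages` is made up here) and assumes the parameter value is a comma-separated list of language tags.

```python
from typing import List


def document_languages(meta: str) -> List[str]:
    """Extract the language tags from a MIME type string such as
    'text/gemini; charset=utf-8; lang=ja'. Returns [] if none is given."""
    parts = [p.strip() for p in meta.split(";")]
    for param in parts[1:]:  # parts[0] is the media type itself
        name, sep, value = param.partition("=")
        if sep and name.strip().lower() == "lang":
            value = value.strip().strip('"')
            # Assumption: several tags may be listed, separated by commas.
            return [tag.strip() for tag in value.split(",") if tag.strip()]
    return []


if __name__ == "__main__":
    print(document_languages("text/gemini; lang=ja"))
    print(document_languages("text/gemini; charset=utf-8; lang=en,zh"))
```

Even when several tags are listed, this only says which languages occur in the document, not which lines are written in which one.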

I'm curious how this actually works with text rendering engines. Do they expose an API that allows you to specify the language? Which one are you planning on using, and could you point me to the relevant docs?

Thanks for considering.
W.
PS: I'm not subscribed, so please CC.
--
There are only two hard things in Computer Science:
cache invalidation, naming things and off-by-one errors.