💾 Archived View for rawtext.club › ~sloum › geminilist › 007672.gmi captured on 2023-09-08 at 16:28:37. Gemini links have been rewritten to link to archived content


Documents with mixed languages

Wolf wolf at wolfsden.cz

Sat Dec 11 12:00:13 GMT 2021

- - - - - - - - - - - - - - - - - - - 

Hello,

I'm considering implementing a gemini client over Christmas, so I've started to read the spec. And I think I've noticed a missing thing in the text/gemini format, in particular regarding documents whose content is composed of multiple languages. The specification provides this example:

"text/gemini; lang=en,fr" Denotes a text/gemini document written in a mixture of English and French

However, that seems to be a document-wide setting. If I wanted a document to have a mix of Japanese and Chinese, I believe it would be:

"text/gemini; lang=ja,zh"
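
For illustration, here is a minimal sketch of how a client might pull that language list out of the response meta. It assumes the meta is a plain MIME type string with `;`-separated parameters and a comma-separated `lang` value, as in the spec example; the function name `parse_lang` is made up for this sketch.

```python
def parse_lang(meta: str) -> list[str]:
    """Return the language tags from a text/gemini meta string, or [] if none."""
    # Drop the media type itself and keep only the parameters.
    _, *params = meta.split(";")
    for param in params:
        name, sep, value = param.partition("=")
        if sep and name.strip().lower() == "lang":
            # The value is a comma-separated list of language tags.
            tags = value.strip().strip('"').split(",")
            return [tag.strip() for tag in tags if tag.strip()]
    return []


print(parse_lang("text/gemini; lang=ja,zh"))  # prints ['ja', 'zh']
```

Note that the result only says which languages occur somewhere in the document, not where.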

However, that in itself is not very useful. What I think is missing is the ability to set the language for a specific part of the document itself. I'm not sure if the spec is permanently frozen or can still be extended, but I would like to propose adding a language toggle command to text/gemini.

I'm not sure about the specific syntax; basically anything would work, ideally something that would not be very disruptive if simply rendered as-is by clients that do not know about it. So for example:

Here are some examples in Japanese and Chinese:

!lang=ja こんにちは。

!lang=zh 你好。

!lang=en And back to the original language.

This is just a proposal; I don't really care if it is a switch (like in my example) or some kind of block.
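
To make the switch variant concrete, here is a rough sketch of how a client could tag each line with a language before rendering. The `!lang=` prefix is only the hypothetical syntax from the example above and is not part of any spec; the name `tag_lines` and the `default_lang` parameter (which would come from the document-wide `lang`) are likewise made up for this sketch. It ignores preformatted blocks for simplicity.

```python
from typing import Iterator

PREFIX = "!lang="  # hypothetical toggle syntax, not part of the text/gemini spec


def tag_lines(text: str, default_lang: str) -> Iterator[tuple[str, str]]:
    """Yield (lang, line) pairs, switching language at each toggle line."""
    lang = default_lang
    for line in text.splitlines():
        if line.startswith(PREFIX):
            # "!lang=ja こんにちは。" splits into tag "ja" and rest "こんにちは。"
            tag, _, rest = line[len(PREFIX):].partition(" ")
            if tag:
                lang = tag
            if not rest:
                continue  # a bare toggle line only switches the language
            line = rest
        yield lang, line


doc = "Some examples:\n!lang=ja こんにちは。\n!lang=zh 你好。\n!lang=en And back."
for lang, line in tag_lines(doc, default_lang="en"):
    print(lang, line)
```

A client that does not know the toggle would just show the `!lang=...` prefix as ordinary text, which is the "not very disruptive" fallback mentioned above.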

Now, why is this necessary? For Japanese and Chinese in particular, Unicode alone is simply not enough. Because of Han unification, the same code point can be rendered very differently depending on the language. So for proper rendering of the text, you need to know the language. If the document has just one, there is currently a way to do so (the document language), but for a mixture, it is simply not possible. And I think that is a shame.

Thanks for considering.

W.

PS: I'm not subscribed, so please CC.

-- 
There are only two hard things in Computer Science: cache invalidation, naming things and off-by-one errors.

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 833 bytes
Desc: not available
URL: <https://lists.orbitalfox.eu/archives/gemini/attachments/20211211/b47cc774/attachment.sig>