Encoding problems with UTF-8

I was having some trouble rendering UTF-8 text, especially the accented characters used in Portuguese.

"pa?oca éum delicioso doce de amendoim com apenas tr?s ingredientes"

If you only write in English, encoding may never be a problem for you.

For other languages, however, accented characters tend to break or show up as weird characters on your page.
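
A minimal sketch of how this kind of breakage can arise, written in Python purely for illustration. It assumes the text was saved as Latin-1 and then read as UTF-8; which encodings were actually involved in the broken sentence above is not known.

```python
# Illustration only: assumes the text was saved as Latin-1 and then read as UTF-8.
word = "paçoca"

latin1_bytes = word.encode("latin-1")  # 'ç' becomes the single byte 0xE7
utf8_bytes = word.encode("utf-8")      # 'ç' becomes the two bytes 0xC3 0xA7

# Reading the Latin-1 bytes as UTF-8: the lone 0xE7 byte is invalid,
# so the decoder swaps it for the replacement character U+FFFD.
broken = latin1_bytes.decode("utf-8", errors="replace")

# Reading bytes with the same encoding they were written in works fine.
intact = utf8_bytes.decode("utf-8")

print(broken)
print(intact)
```

Many programs draw the replacement character as a question mark or a box, which is roughly what the broken sentence above looks like.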

They may also interfere with how the text editor and the ssh console display lines. This can be very frustrating for the user, making them give up entirely.

This in turn means that speakers of other languages end up facing extra barriers to getting their content online.

If I have to explain the consequences of this any further, then you may already be suffering from them.

To solve that, we must make translated and translatable information on language settings available.

Plain-text technologies such as Gemini are very translatable. But encoding and rendering errors still occur. With a default installation, users can run into a lot of problems publishing their non-English text.

If you are trying to create non-English content, one way to alleviate this is to make sure your text editor's encoding is set to UTF-8.

For vim, you can do that by adding the following lines to your home directory's .vimrc file:

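" encoding: the character encoding vim uses internally
set encoding=utf-8
" fileencoding: the encoding used when writing new files
set fileencoding=utf-8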

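That way, every time you launch vim and save a new file, it will be saved as UTF-8. A file that already exists in another encoding generally keeps the encoding vim detected when opening it.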

To see what encoding a specific file has, you can use the file command:

file file_name.gmi
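
If the file command is not available, a rough check can be sketched in Python. This is only a sketch under stated assumptions: the function names is_utf8 and convert_to_utf8 are made up for this example, and the conversion assumes you already know the file's current encoding (Latin-1 is just the default guess here).

```python
from pathlib import Path


def is_utf8(path):
    """Return True if the file's bytes decode cleanly as UTF-8."""
    try:
        Path(path).read_bytes().decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def convert_to_utf8(path, source_encoding="latin-1"):
    """Rewrite the file as UTF-8, assuming it is currently in source_encoding."""
    text = Path(path).read_bytes().decode(source_encoding)
    Path(path).write_bytes(text.encode("utf-8"))
```

Calling is_utf8("file_name.gmi") returns True or False, and convert_to_utf8("file_name.gmi") rewrites the file in place, so keep a backup. A plain ASCII file also counts as valid UTF-8. Note that decoding as Latin-1 never fails, so a wrong guess for source_encoding will silently produce wrong characters instead of an error.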