Archived View for thebird.nl › gn-gemtext-threads › issues › fix-broken-utf8-chars.gmi captured on 2023-05-24 at 18:08:42. Gemini links have been rewritten to link to archived content
We have had jumbled-up text in our database for years. There are too many cases for a user to fix by hand through the metadata editing form, so a script should be written to repair them.
This thread has some really nice ideas
Detecting broken characters in mysql
An example of a broken Unicode sequence is "ï¼ž". Each of the three characters is valid on its own, but together they are mojibake: the UTF-8 bytes of a single character (EF BC 9E) were decoded as Windows-1252 at some point. This typically happens during transmission or storage, rather than through a typing mistake or a missing font.
To find the correct replacement for "ï¼ž", or for any other broken sequence, recover the original bytes and look up the code point they encode. In this case the bytes EF BC 9E decode to U+FF1E, the fullwidth greater-than sign "＞". You can then search for the broken sequence and replace it with the correct character in the text.
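The lookup described above can be automated. The following is a minimal Python sketch, not the project's actual script: the function name `repair_mojibake` is hypothetical, and it assumes the damage came from UTF-8 bytes being decoded as Windows-1252 (cp1252). It re-encodes the text with the wrong codec to get the original bytes back, then decodes them as UTF-8. Text that does not survive the round trip is returned unchanged.

```python
import unicodedata


def repair_mojibake(text: str, wrong_encoding: str = "cp1252") -> str:
    """Undo one round of UTF-8-read-as-cp1252 damage (hypothetical helper).

    Returns the input unchanged when the round trip fails, which is the
    case for correctly stored text such as "café".
    """
    try:
        # Recover the original bytes, then decode them as UTF-8.
        return text.encode(wrong_encoding).decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # Not representable in the wrong codec, or not valid UTF-8 once
        # re-encoded: assume the text was never broken this way.
        return text


broken = "\u00ef\u00bc\u017e"  # the three characters ï ¼ ž
fixed = repair_mojibake(broken)
print(f"U+{ord(fixed):04X}", unicodedata.name(fixed))
# prints: U+FF1E FULLWIDTH GREATER-THAN SIGN
```

Two caveats apply. Python's cp1252 codec leaves five byte values undefined (0x81, 0x8D, 0x8F, 0x90, 0x9D), so characters whose UTF-8 encoding contains one of them cannot be recovered this way. Text that was damaged more than once needs the function applied repeatedly, and every result should be reviewed before it is written back to the database.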