💾 Archived View for oppen.digital › memex › 20211105 captured on 2021-12-04 at 18:04:22.

Misuse of the Unicode Mathematical Alphanumeric Symbols block

I recently came across the idea of faking inline bold text in Gemtext using characters from the MATHEMATICAL_ALPHANUMERIC_SYMBOLS Unicode block. I'd seen people doing this on Twitter before, of course, but here it was proposed specifically to get around the lack of inline tags in Gemtext.

I was initially against this from a screen reader accessibility standpoint, but Android's TalkBack remaps these characters to standard ones and reads them without issue.

That doesn't settle it, though. Users can install any screen reader they like on Android, so we can't assume everyone uses TalkBack, and not every developer has Google's resources to get their software to the point where it handles these misused blocks. (It has been suggested elsewhere, unkindly, that these developers don't understand Unicode, when it's merely a question of prioritising other things and time being linear.) A further concern is legibility for people with any number of sight issues.

I personally find these characters really irritating, so I am going to add a feature to remap them in Ariane. Below is a small subset of the available characters, which I'm using to test remapping to standard characters behind a feature toggle:

They are 𝐑𝐄𝐀𝐋𝐋𝐘 annoying.

𝐀𝐁𝐂𝐃𝐄𝐅𝐆𝐇𝐈𝐉𝐊𝐋𝐌𝐍𝐎𝐏𝐐𝐑𝐒𝐓𝐔𝐕𝐖𝐗𝐘𝐙

Code block:

𝐀𝐁𝐂𝐃𝐄𝐅𝐆𝐇𝐈𝐉𝐊𝐋𝐌𝐍𝐎𝐏𝐐𝐑𝐒𝐓𝐔𝐕𝐖𝐗𝐘𝐙
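To see why these samples are awkward for software, it helps to look at what they are made of. The following is a minimal sketch, not part of Ariane: `describeCodePoints` is a hypothetical helper, and the sample string is written with escaped surrogate pairs so that it survives any encoding mishap. Each styled "letter" is a single code point above U+FFFF, so it occupies two UTF-16 code units.

```kotlin
// Hypothetical helper: lists each code point in the text with the Unicode block it belongs to
fun describeCodePoints(text: String): List<String> =
    text.codePoints().toArray().map { cp ->
        "U+%04X %s".format(cp, Character.UnicodeBlock.of(cp))
    }

fun main() {
    // "RE" in Mathematical Bold (U+1D411, U+1D404), written as surrogate pairs
    val sample = "\uD835\uDC11\uD835\uDC04"

    println(sample.length)                           // 4 UTF-16 code units
    println(sample.codePointCount(0, sample.length)) // 2 actual characters

    // One line per code point, each followed by the name of its Unicode block
    describeCodePoints(sample).forEach(::println)
}
```

A screen reader or search index that works on plain A-Z sees none of the letters a sighted reader perceives in this string.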

Working Kotlin Solution

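//Assumption: StringEscapeUtils comes from Apache Commons Text, the post does not show the import
import org.apache.commons.text.StringEscapeUtils

private fun remapBoldUnicode(line: String): String {

    val unicodeMapper = UnicodeMathematicalSymbolsMapper()

    //Turn any literal \uXXXX escape sequences into real characters before inspecting the line
    val unescaped = StringEscapeUtils.unescapeJava(line)

    //True for any style in the block, not only bold
    val hasMathematicalSymbols = unicodeMapper.hasMathematicalAlphanumericSymbols(unescaped)

    return when {
        hasMathematicalSymbols -> unicodeMapper.remap(unescaped)
        else -> line
    }
}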
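The mapper class listed after this function matches on raw UTF-16 surrogate values such as 55349 and 56320. As a sketch of where those decimal numbers come from, the hypothetical helper below (`surrogatePair` is not part of the original code) applies the standard UTF-16 encoding arithmetic to a code point from the block.

```kotlin
// Hypothetical helper: splits a supplementary code point into its UTF-16 surrogate pair
fun surrogatePair(codePoint: Int): Pair<Int, Int> {
    require(codePoint in 0x10000..0x10FFFF) { "Only code points above U+FFFF have a surrogate pair" }
    val offset = codePoint - 0x10000
    val high = 0xD800 + (offset shr 10)   // top 10 bits of the offset
    val low = 0xDC00 + (offset and 0x3FF) // bottom 10 bits of the offset
    return high to low
}

fun main() {
    println(surrogatePair(0x1D400)) // (55349, 56320): MATHEMATICAL BOLD CAPITAL A
    println(surrogatePair(0x1D6A3)) // (55349, 56995): MATHEMATICAL MONOSPACE SMALL Z
}
```

Every styled letter from U+1D400 to U+1D6A3 shares the high surrogate 55349, which is why the class only needs to compare low surrogates against its ranges.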

UnicodeMathematicalSymbolsMapper.kt:

package oppen.gemini.gemtext.processor

/**
 * This class maps characters from the Mathematical Alphanumeric Symbols unicode block to standard A-Z a-z
 *
 * @see https://en.wikipedia.org/wiki/Mathematical_Alphanumeric_Symbols
 */
class UnicodeMathematicalSymbolsMapper {

    private val standard = listOf(
        'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J',
        'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T',
        'U', 'V', 'W', 'X', 'Y', 'Z', 'a', 'b', 'c', 'd',
        'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n',
        'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x',
        'y', 'z')

    private val highSurrogateCode = 55349

    //Serif Bold: 𝐀𝐁𝐂𝐃𝐄𝐅𝐆𝐇𝐈𝐉𝐊𝐋𝐌𝐍𝐎𝐏𝐐𝐑𝐒𝐓𝐔𝐕𝐖𝐗𝐘𝐙𝐚𝐛𝐜𝐝𝐞𝐟𝐠𝐡𝐢𝐣𝐤𝐥𝐦𝐧𝐨𝐩𝐪𝐫𝐬𝐭𝐮𝐯𝐰𝐱𝐲𝐳
    private val serifBoldRange = IntRange(56320, 56371)

    //Serif Italic: 𝐴𝐵𝐶𝐷𝐸𝐹𝐺𝐻𝐼𝐽𝐾𝐿𝑀𝑁𝑂𝑃𝑄𝑅𝑆𝑇𝑈𝑉𝑊𝑋𝑌𝑍𝑎𝑏𝑐𝑑𝑒𝑓𝑔ℎ𝑖𝑗𝑘𝑙𝑚𝑛𝑜𝑝𝑞𝑟𝑠𝑡𝑢𝑣𝑤𝑥𝑦𝑧
    private val serifItalicRange = IntRange(56372, 56423)

    //Serif Italic Bold: 𝑨𝑩𝑪𝑫𝑬𝑭𝑮𝑯𝑰𝑱𝑲𝑳𝑴𝑵𝑶𝑷𝑸𝑹𝑺𝑻𝑼𝑽𝑾𝑿𝒀𝒁𝒂𝒃𝒄𝒅𝒆𝒇𝒈𝒉𝒊𝒋𝒌𝒍𝒎𝒏𝒐𝒑𝒒𝒓𝒔𝒕𝒖𝒗𝒘𝒙𝒚𝒛
    private val serifItalicBoldRange = IntRange(56424, 56475)

    //Sans Serif Normal: 𝖠𝖡𝖢𝖣𝖤𝖥𝖦𝖧𝖨𝖩𝖪𝖫𝖬𝖭𝖮𝖯𝖰𝖱𝖲𝖳𝖴𝖵𝖶𝖷𝖸𝖹𝖺𝖻𝖼𝖽𝖾𝖿𝗀𝗁𝗂𝗃𝗄𝗅𝗆𝗇𝗈𝗉𝗊𝗋𝗌𝗍𝗎𝗏𝗐𝗑𝗒𝗓
    private val sansNormalRange = IntRange(56736, 56787)

    //Sans Serif Bold: 𝗔𝗕𝗖𝗗𝗘𝗙𝗚𝗛𝗜𝗝𝗞𝗟𝗠𝗡𝗢𝗣𝗤𝗥𝗦𝗧𝗨𝗩𝗪𝗫𝗬𝗭𝗮𝗯𝗰𝗱𝗲𝗳𝗴𝗵𝗶𝗷𝗸𝗹𝗺𝗻𝗼𝗽𝗾𝗿𝘀𝘁𝘂𝘃𝘄𝘅𝘆𝘇
    private val sansBoldRange = IntRange(56788, 56839)

    //Sans Serif Italic: 𝘈𝘉𝘊𝘋𝘌𝘍𝘎𝘏𝘐𝘑𝘒𝘓𝘔𝘕𝘖𝘗𝘘𝘙𝘚𝘛𝘜𝘝𝘞𝘟𝘠𝘡𝘢𝘣𝘤𝘥𝘦𝘧𝘨𝘩𝘪𝘫𝘬𝘭𝘮𝘯𝘰𝘱𝘲𝘳𝘴𝘵𝘶𝘷𝘸𝘹𝘺𝘻
    private val sansItalicRange = IntRange(56840, 56891)

    //Sans Serif Italic Bold: 𝘼𝘽𝘾𝘿𝙀𝙁𝙂𝙃𝙄𝙅𝙆𝙇𝙈𝙉𝙊𝙋𝙌𝙍𝙎𝙏𝙐𝙑𝙒𝙓𝙔𝙕𝙖𝙗𝙘𝙙𝙚𝙛𝙜𝙝𝙞𝙟𝙠𝙡𝙢𝙣𝙤𝙥𝙦𝙧𝙨𝙩𝙪𝙫𝙬𝙭𝙮𝙯
    private val sansItalicBoldRange = IntRange(56892, 56943)

    //Calligraphy Normal: 𝒜ℬ𝒞𝒟ℰℱ𝒢ℋℐ𝒥𝒦ℒℳ𝒩𝒪𝒫𝒬ℛ𝒮𝒯𝒰𝒱𝒲𝒳𝒴𝒵𝒶𝒷𝒸𝒹ℯ𝒻ℊ𝒽𝒾𝒿𝓀𝓁𝓂𝓃ℴ𝓅𝓆𝓇𝓈𝓉𝓊𝓋𝓌𝓍𝓎𝓏
    private val calligraphyNormalRange = IntRange(56476, 56527)

    //Calligraphy Bold: 𝓐𝓑𝓒𝓓𝓔𝓕𝓖𝓗𝓘𝓙𝓚𝓛𝓜𝓝𝓞𝓟𝓠𝓡𝓢𝓣𝓤𝓥𝓦𝓧𝓨𝓩𝓪𝓫𝓬𝓭𝓮𝓯𝓰𝓱𝓲𝓳𝓴𝓵𝓶𝓷𝓸𝓹𝓺𝓻𝓼𝓽𝓾𝓿𝔀𝔁𝔂𝔃
    private val calligraphyBoldRange = IntRange(56528, 56579)

    //Fraktur Normal: 𝔄𝔅ℭ𝔇𝔈𝔉𝔊ℌℑ𝔍𝔎𝔏𝔐𝔑𝔒𝔓𝔔ℜ𝔖𝔗𝔘𝔙𝔚𝔛𝔜ℨ𝔞𝔟𝔠𝔡𝔢𝔣𝔤𝔥𝔦𝔧𝔨𝔩𝔪𝔫𝔬𝔭𝔮𝔯𝔰𝔱𝔲𝔳𝔴𝔵𝔶𝔷
    private val frakturNormalRange = IntRange(56580, 56631)

    //Fraktur Bold: 𝕬𝕭𝕮𝕯𝕰𝕱𝕲𝕳𝕴𝕵𝕶𝕷𝕸𝕹𝕺𝕻𝕼𝕽𝕾𝕿𝖀𝖁𝖂𝖃𝖄𝖅𝖆𝖇𝖈𝖉𝖊𝖋𝖌𝖍𝖎𝖏𝖐𝖑𝖒𝖓𝖔𝖕𝖖𝖗𝖘𝖙𝖚𝖛𝖜𝖝𝖞𝖟
    private val frakturBoldRange = IntRange(56684, 56735)

    //Monospace: 𝙰𝙱𝙲𝙳𝙴𝙵𝙶𝙷𝙸𝙹𝙺𝙻𝙼𝙽𝙾𝙿𝚀𝚁𝚂𝚃𝚄𝚅𝚆𝚇𝚈𝚉𝚊𝚋𝚌𝚍𝚎𝚏𝚐𝚑𝚒𝚓𝚔𝚕𝚖𝚗𝚘𝚙𝚚𝚛𝚜𝚝𝚞𝚟𝚠𝚡𝚢𝚣
    private val monospaceRange = IntRange(56944, 56995)

    //Double Struck: 𝔸𝔹ℂ𝔻𝔼𝔽𝔾ℍ𝕀𝕁𝕂𝕃𝕄ℕ𝕆ℙℚℝ𝕊𝕋𝕌𝕍𝕎𝕏𝕐ℤ𝕒𝕓𝕔𝕕𝕖𝕗𝕘𝕙𝕚𝕛𝕜𝕝𝕞𝕟𝕠𝕡𝕢𝕣𝕤𝕥𝕦𝕧𝕨𝕩𝕪𝕫
    private val doubleStruckRange = IntRange(56632, 56683)

    fun hasMathematicalAlphanumericSymbols(text: String): Boolean =
        text.codePoints().anyMatch { codePoint ->
            Character.UnicodeBlock.of(codePoint) == Character.UnicodeBlock.MATHEMATICAL_ALPHANUMERIC_SYMBOLS
        }

    fun remap(text: String): String {
        val remapped = StringBuilder(text.length)

        //High surrogate waiting for its low surrogate, or -1 when none is pending
        var pendingHighSurrogate = -1

        text.chars().forEach { code ->

            val char = Char(code)

            when {
                char.isHighSurrogate() -> {
                    //Two high surrogates in a row: keep the unpaired one rather than dropping it
                    if (pendingHighSurrogate != -1) remapped.append(Char(pendingHighSurrogate))
                    pendingHighSurrogate = code
                }
                char.isLowSurrogate() -> {
                    //Only pairs led by 55349 can be styled letters, everything else maps to null
                    val standardChar: Char? = if (pendingHighSurrogate == highSurrogateCode) {
                        when (code) {
                            in serifBoldRange -> standard[code - serifBoldRange.first]
                            in serifItalicRange -> standard[code - serifItalicRange.first]
                            in serifItalicBoldRange -> standard[code - serifItalicBoldRange.first]
                            in sansNormalRange -> standard[code - sansNormalRange.first]
                            in sansBoldRange -> standard[code - sansBoldRange.first]
                            in sansItalicRange -> standard[code - sansItalicRange.first]
                            in sansItalicBoldRange -> standard[code - sansItalicBoldRange.first]
                            in calligraphyNormalRange -> standard[code - calligraphyNormalRange.first]
                            in calligraphyBoldRange -> standard[code - calligraphyBoldRange.first]
                            in frakturNormalRange -> standard[code - frakturNormalRange.first]
                            in frakturBoldRange -> standard[code - frakturBoldRange.first]
                            in monospaceRange -> standard[code - monospaceRange.first]
                            in doubleStruckRange -> standard[code - doubleStruckRange.first]
                            else -> null
                        }
                    } else {
                        null
                    }

                    if (standardChar != null) {
                        remapped.append(standardChar)
                    } else {
                        //Not a styled letter (an emoji, a maths digit, ...): keep the original pair intact
                        if (pendingHighSurrogate != -1) remapped.append(Char(pendingHighSurrogate))
                        remapped.append(char)
                    }
                    pendingHighSurrogate = -1
                }
                else -> {
                    //Ordinary character: flush any unpaired high surrogate, then copy it through
                    if (pendingHighSurrogate != -1) remapped.append(Char(pendingHighSurrogate))
                    pendingHighSurrogate = -1
                    remapped.append(char)
                }
            }
        }

        //A trailing unpaired high surrogate is passed through unchanged
        if (pendingHighSurrogate != -1) remapped.append(Char(pendingHighSurrogate))
        return remapped.toString()
    }
}
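The thirteen ranges in the class above sit back to back, from low surrogate 56320 to 56995, which corresponds to the code points U+1D400 to U+1D6A3 in blocks of 52 letters. As an alternative sketch, and not the code Ariane ships, the same remapping can be expressed on whole code points with a single modulo. The function name `remapByCodePoint` and the constants are assumptions made for this illustration.

```kotlin
// Assumed constants for this sketch: the styled Latin letters span one contiguous run
const val FIRST_STYLED_LETTER = 0x1D400 // MATHEMATICAL BOLD CAPITAL A
const val LAST_STYLED_LETTER = 0x1D6A3  // MATHEMATICAL MONOSPACE SMALL Z
const val ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"

// Hypothetical alternative to remap(): iterates code points, so surrogate pairs never need tracking
fun remapByCodePoint(text: String): String {
    val out = StringBuilder(text.length)
    text.codePoints().forEach { cp ->
        if (cp in FIRST_STYLED_LETTER..LAST_STYLED_LETTER) {
            // Every style is a 52-letter block, so the offset modulo 52 is the letter index
            out.append(ALPHABET[(cp - FIRST_STYLED_LETTER) % ALPHABET.length])
        } else {
            // Anything else, including emoji and maths digits, is copied through unchanged
            out.appendCodePoint(cp)
        }
    }
    return out.toString()
}

fun main() {
    // "Bold" in Mathematical Bold followed by a floppy disk emoji, written as surrogate pairs
    val styled = "\uD835\uDC01\uD835\uDC28\uD835\uDC25\uD835\uDC1D \uD83D\uDCBE"
    println(remapByCodePoint(styled)) // prints "Bold" followed by the untouched emoji
}
```

Like the class above, this leaves the Letterlike Symbols stand-ins that fill the gaps in some styles (ℎ, ℬ, ℂ and friends) untouched, because they live outside the block in the Basic Multilingual Plane. Greek letters and digits from the same block also pass through unchanged.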