💾 Archived View for rawtext.club › ~sloum › geminilist › 000354.gmi captured on 2020-09-24 at 02:37:49. Gemini links have been rewritten to link to archived content


Text reflow woes (or: I want bullets back!)

solderpunk solderpunk at SDF.ORG

Sat Jan 18 12:59:16 GMT 2020

- - - - - - - - - - - - - - - - - - -

On Sat, Jan 18, 2020 at 12:02:42AM -0500, Michael Lazar wrote: 
> Python's textwrap module is fundamentally flawed for unicode and they have no
> intention of ever fixing it [0]. Once you start going down the rabbit hole of
> CJK characters, emojis, grapheme clusters, etc. it becomes exceedingly hard
> to figure out how to correctly determine the width of unicode text. You can
> get it working 99% of the time, but there's always those fringe cases that
> no one thinks about until somebody files a bug report.
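
A minimal sketch of the mismatch described in the quote, assuming Unicode's East Asian Width property is a fair proxy for terminal columns (it is only an approximation and ignores combining marks and emoji sequences). `estimated_columns` is a hypothetical helper written for this illustration, not part of textwrap:

```python
import textwrap
import unicodedata


def estimated_columns(text: str) -> int:
    """Rough column count: Wide/Fullwidth code points are assumed to take 2 cells."""
    return sum(
        2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
        for ch in text
    )


# textwrap measures lines in code points, so both lines come out 10 "long".
ascii_line = textwrap.wrap("a" * 30, width=10)[0]
cjk_line = textwrap.wrap("日" * 30, width=10)[0]

print(len(ascii_line), estimated_columns(ascii_line))  # 10 10
print(len(cjk_line), estimated_columns(cjk_line))      # 10 20
```

Under that assumption, the CJK line that textwrap considers 10 wide would occupy roughly 20 columns on screen.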

...

God, I hate computers.

But, many thanks for bringing this to my attention. 
> I don't know if this has any bearing on the discussion, but it's worth keeping
> in the back of your mind if you intend to make unicode a first-class citizen.

Unicode is already a first-class citizen in Gemini (text/gemini is assumed to be UTF-8 if a different encoding is not explicitly provided in the response header), and I don't think I have any interest in changing that.

As for the present discussion... well, it's obvious this problem is no less of a problem under paragraph-oriented "bidirectional" reflowing. It's not obvious to me if it's less of a problem under a Gopher-style hard-wrapping to a pre-defined maximum width model... I suppose if the width of a line including CJK characters is dependent upon the combination of font and terminal being used (I don't know if it is, but it seems probable) then it's not actually possible for a CJK-using author to comply with a spec like "Hard-wrap all your content at X characters"...
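
One concrete way the rendered width can depend on the environment: Unicode's East Asian Width property has an "Ambiguous" class (Greek letters and "±" are in it), and whether those characters take one cell or two is left to context. The sketch below assumes a simple two-policy model; `columns_under_policy` and its `ambiguous_wide` flag are hypothetical names for illustration, and real terminals and fonts may differ in other ways too.

```python
import unicodedata


def columns_under_policy(text: str, ambiguous_wide: bool = False) -> int:
    """Estimate columns, treating 'Ambiguous' characters as wide only if asked to."""
    total = 0
    for ch in text:
        width_class = unicodedata.east_asian_width(ch)
        if width_class in ("W", "F"):
            total += 2  # Wide / Fullwidth: assumed to take two cells
        elif width_class == "A" and ambiguous_wide:
            total += 2  # Ambiguous: two cells under an East Asian policy
        else:
            total += 1  # everything else: assumed to take one cell
    return total


sample = "α±β"
print(columns_under_policy(sample))                       # 3
print(columns_under_policy(sample, ambiguous_wide=True))  # 6
```

The same three-character string comes out at 3 or 6 columns depending on the policy, so an author cannot know which of the two a reader's setup will use.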

Hmm...

Solderpunk