2019-06-25 Reflowing arbitrary text

So I’ve been struggling with reflowing arbitrary text written with Gopher in mind. This is a culture that is steeped in ASCII art. My Gopher-compatible, simple text client *Soweli Lukin* isn’t doing too badly, but the code is horrible. 😀

Let’s take a look at the preprocessing I do before handing it off to the Markdown library which turns Markdown into HTML:

1. split the text into paragraphs

2. look at each paragraph individually

3. if it has two lines and the second line is nothing but `-` or `=` it’s a Markdown header, so no changes required

4. if the number of punctuation and whitespace characters is higher than the number of alphanumeric characters, then it’s some sort of ASCII art

5. if every line of this paragraph looks like a list item (it starts with an asterisk followed by a non-asterisk), then each line is wrapped in `code` tags and ends in a `br` tag; I’m assuming this is a bunch of Gopher items from a Gopher map, and since each is actually a link, I cannot use a `pre` tag for the entire paragraph: that would prevent the Markdown library from processing the links

6. in all other cases the ASCII art paragraphs are wrapped in `pre` tags

7. in addition to that, check if the paragraph consists of list items with each containing at least three spaces; in this case we’re assuming it’s a file listing with dates and sizes and so we wrap the list item in `code` tags

8. furthermore, if the entire paragraph looks like a definition list, wrap it in `pre` tags as well; but allow for some slack: a definition is a line that starts with a letter and contains a colon followed by some whitespace and some more characters; if the paragraph is at least 5 lines long, one of the lines need not be a definition; if the paragraph is at least 9 lines long, two of the lines need not be definitions
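
To make the rules above a bit more concrete, here is a minimal sketch of the classification in Python. It is not the actual Soweli Lukin code: the function names, the labels (`header`, `links`, `pre`, `listing`, `text`) and the regular expressions are all made up for illustration. It also assumes that “at least three spaces” means three space characters anywhere in the line, and that steps 7 and 8 only apply to paragraphs that aren’t ASCII art.

```python
import re
import string

# All names and labels here are hypothetical; this only illustrates the rules.
HEADER_UNDERLINE = re.compile(r"[-=]+")
LIST_ITEM = re.compile(r"\*[^*]")             # asterisk followed by a non-asterisk
DEFINITION = re.compile(r"[A-Za-z].*?:\s+\S")  # letter, colon, whitespace, more text


def split_paragraphs(text):
    """Step 1: paragraphs are separated by one or more blank lines."""
    return [p for p in re.split(r"\n\s*\n", text.strip("\n")) if p.strip()]


def is_header(lines):
    """Step 3: two lines, the second consisting only of '-' or '='."""
    return len(lines) == 2 and HEADER_UNDERLINE.fullmatch(lines[1].strip()) is not None


def is_ascii_art(paragraph):
    """Step 4: more punctuation and whitespace than alphanumeric characters."""
    alnum = sum(ch.isalnum() for ch in paragraph)
    other = sum(ch.isspace() or ch in string.punctuation for ch in paragraph)
    return other > alnum


def is_definition_list(lines):
    """Step 8: every line is a definition, with one miss allowed from
    5 lines and two misses allowed from 9 lines."""
    misses = sum(1 for line in lines if not DEFINITION.match(line))
    allowed = 2 if len(lines) >= 9 else 1 if len(lines) >= 5 else 0
    return misses <= allowed


def classify(paragraph):
    """Return a label saying how the paragraph should be wrapped."""
    lines = paragraph.split("\n")
    if is_header(lines):
        return "header"    # leave untouched for the Markdown library
    all_items = all(LIST_ITEM.match(line) for line in lines)
    if is_ascii_art(paragraph):
        # Steps 5 and 6: per-line code + br for link lists, pre otherwise.
        return "links" if all_items else "pre"
    # Step 7 (assumption: three spaces anywhere in each line).
    if all_items and all(line.count(" ") >= 3 for line in lines):
        return "listing"   # wrap each list item in code tags
    if is_definition_list(lines):
        return "pre"
    return "text"          # ordinary prose, safe to reflow
```

With this sketch, `classify("Title\n=====")` comes out as `header` and a small cat drawing as `pre`; a five-line block of `Key: value` lines with one stray line still counts as a definition list, while the same block with only four lines falls back to `text`.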

Are you crying yet? I know I am! Some examples I’ve added to comments all over the code:

gopher://circumlunar.space:70/1/~cmccabe/

gopher://gopher.floodgap.com/

gopher://gopher.club:70/1/users/gallowsgryph/

gopher://gopher.club:70/1/users/xiled/

gopher://sdf.org:70/1/users/trnslts/feels/2019-05-13

gopher://sdf.org:70/1/users/tfurrows/phlog/2019/

Examples of text that isn’t handled well:

gopher://baud.baby:70/0/phlog/fs20190616.txt

gopher://sdf.org:70/0/users/trnslts/about.txt

gopher://rawtext.club:70/0/~trnslts/stuff/2019-06-08-Automatic_merge_of_GitHub_security_fixes.txt

It’s hard!

​#Gopher ​#Web ​#Soweli Lukin

Comments

(Please contact me if you want to remove your comment.)

Hmm, it feels wrong to me that you’re basically doing two parsings. Without having seen any code, wouldn’t it be cleaner to extend the Markdown parser to do the steps you describe?

– Andreas Gohr 2019-06-25 12:10 UTC

---

Yeah, that would be better, I’m sure. At the same time, I like using it as a black box. It would make it easy to exchange one Markdown parser for another.

– Alex Schroeder 2019-06-25 15:38 UTC

---

On my own homegrown gopher client, I have a keyboard command (or two) to reflow the text if needed. That way, I don’t go crazy trying to reflow those crazy baudbaby pages.

– Sean Conner 2019-06-26 01:13 UTC

---

Yeah, VF-1 uses `fold -w 70 -s` if you use the `fold` command. So what you’re saying is:

1. a human has to decide whether reflowing is possible

2. a tool is used to reflow the *entire* page

I’m still operating under the assumption that we can do better, but it’s not a bad fallback. 🙂

– Alex Schroeder 2019-06-26 06:20 UTC