💾 Archived View for circadian.gemlog.org › 2023-07-04-gemtext-tweaks.gmi captured on 2024-12-17 at 09:52:47. Gemini links have been rewritten to link to archived content
A few posts have been discussing tweaks to Gemtext.
Tables in Gemtext, the non-hacky way
I agree with Acidus’s points about tables: they are far, far too heavyweight for Gemtext.
The “burned in” tables look good to me.
And then on to this post:
From the sound of it I like Gemtext as-is much more than Zelena does, but I still think they have a few good points.
I like “don’t strip preceding whitespace” as an idea, it seems to fit with the general Gemtext feel.
As Zelena’s post points out, it would allow people to format their own lists without help. In fact, this already works today—the examples on that page use a “figure space” instead of the standard space character to do layout:
1. Ordered list
  a. Subitem one
  b. Subitem two
  c. Subitem three
So all that’s needed is for the normal space character to work the same.
Then, lists can be dropped from the spec altogether—a simplification, hurrah!
I am for the most part enjoying writing Gemtext without any inline formatting.
It forces me to find other ways to convey emphasis: paragraph layout, wording, even the occasional exclamation mark! ... and I like how it turns out.
But I don’t think everyone else should have to write the same way.
Inline italics and perhaps bold would add a lot of value, if we can figure out a good way to do it. I’m not sure it’s worth going beyond that.
Gemtext has to be easy to write as plain text, readable as plain text, and not just easy to “parse” but easy to describe a parser for.
That’s tough.
At first glance it might look like using “_” to mark italics fits all those:
Are you sure this _really works_?
It’s easy to write and it’s readable, even unsurprising.
But, “_” is not easy to parse. Consider:
It might look like using "_" to mark italics fits all those.
Are you sure this _really works_?
But, "_" is not easy to parse.
Simple parsing of this paragraph would italicise the second half of the first line, leaving “really works” unemphasized before returning to italics through the middle of the last line. Whoops.
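To make the failure concrete, here is a minimal Python sketch of that "simple parsing", assuming it means nothing more than toggling italics at every "_" across the whole paragraph. The function name naive_italics is made up for illustration and does not come from any Gemini client.

```python
def naive_italics(text):
    """Toggle italics at every '_' and return (is_italic, text) spans.

    Hypothetical helper: this is the naive rule, not a real client's parser.
    """
    spans = []
    # After splitting on "_", every odd-numbered piece sits between an
    # opening and a closing underscore, so the naive rule italicises it.
    for index, piece in enumerate(text.split("_")):
        if piece:
            spans.append((index % 2 == 1, piece))
    return spans


paragraph = (
    'It might look like using "_" to mark italics fits all those.\n'
    "Are you sure this _really works_?\n"
    'But, "_" is not easy to parse.'
)

for is_italic, piece in naive_italics(paragraph):
    print("ITALIC" if is_italic else "plain ", repr(piece))
```

Run on the three lines above, the phrase "really works" comes out plain while the text on either side of it is italic, which is the opposite of what the writer wanted.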
The problem is how to get an underscore when you want an underscore.
Pretending that you have a structured markup language when you don’t is hard: you get into all kinds of exceptions. So maybe “_” only counts when there are two on the line, or something about where the whitespace has to be, or there can be an escape sequence when you really want a “_”, or ... unfortunately none of these are obvious, simple, or easy to explain.
I don’t have a good idea, but here’s the best I’ve got:
This <word> is in italics, and so is <this phrase>.
<<This>> word is bold, and so is <<this phrase>>.
These equations just work: x > 3, x < 10, x >= 20
And so do these code snippets: register >> bits, register << bits
But these need special care: "<​", x <​= 20
To achieve this, non-code lines get fed one line at a time through a state machine with two-character lookahead running as follows.
Base state:
Italic state:
Bold state:
I said “one line at a time”: the state resets to base state at the start of each line.
Clients that can’t render bold or italics can choose whether to display the markup or to discard it.
The “trick” is that the effect of "<" or "<<" is canceled by a trailing whitespace, which automatically covers common uses like “x < 3” and “register << bits”.
It does not cover other use cases like those I called out as “special care” above. To fix these—to insert a literal "<" anywhere you like—you can use the same “canceled by trailing whitespace” rule, by way of the zero width space character.
This is then a rare, awkward case that is very hard to write—but remains easy to read and easy to parse. I have actually included the requisite zero width spaces in the examples above, so they should work as written.
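For anyone who wants to experiment, here is a rough Python sketch of one state machine that behaves this way. The per-state rules are my own guess, chosen only so that the examples above come out right; they are an assumption, not a specification. The names parse_line, cancels and ZWSP are made up for the sketch. One detail worth noting: Python's str.isspace() does not count the zero width space (U+200B) as whitespace, so it has to be checked for explicitly.

```python
ZWSP = "\u200b"  # zero width space; str.isspace() is False for it


def cancels(ch):
    """True if ch (or end of line, passed as '') cancels a preceding '<' or '<<'."""
    return ch == "" or ch.isspace() or ch == ZWSP


def parse_line(line):
    """Split one non-code line into (style, text) spans.

    Assumed rules (a guess, not a spec):
      base:   '<<' + non-cancelling char starts bold,
              '<'  + non-cancelling char starts italic,
              anything else is literal text.
      italic: '>' returns to base.
      bold:   '>>' returns to base.
    An unclosed span simply ends at the end of the line.
    """
    spans = []
    state = "plain"  # the state resets to base at the start of each line
    buf = []

    def flush():
        # Emit whatever has been collected under the current state.
        if buf:
            spans.append((state, "".join(buf)))
            buf.clear()

    i = 0
    while i < len(line):
        ch = line[i]
        nxt = line[i + 1:i + 2]   # first lookahead character ('' at end)
        nxt2 = line[i + 2:i + 3]  # second lookahead character ('' at end)
        if state == "plain" and ch == "<":
            if nxt == "<":
                if cancels(nxt2):
                    buf.append("<<")  # e.g. "register << bits"
                else:
                    flush()
                    state = "bold"
                i += 2
            elif cancels(nxt):
                buf.append("<")  # e.g. "x < 10"
                i += 1
            else:
                flush()
                state = "italic"
                i += 1
        elif state == "italic" and ch == ">":
            flush()
            state = "plain"
            i += 1
        elif state == "bold" and ch == ">" and nxt == ">":
            flush()
            state = "plain"
            i += 2
        else:
            buf.append(ch)
            i += 1
    flush()
    return spans


print(parse_line("This <word> is in italics, and <<this>> is bold."))
print(parse_line("x > 3, x < 10, register << bits"))
print(parse_line("x <= 20"))           # goes wrong: '= 20' becomes italic
print(parse_line("x <" + ZWSP + "= 20"))  # the zero width space keeps '<' literal
```

A client that cannot render styles could join the span texts back together to discard the markup, or print the original line unchanged to display it.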
I <said> I didn’t have a good idea!
Maybe <<you>> can do better?