💾 Archived View for gemlog.blue › users › isoraqathedh › 1618885436.gmi captured on 2023-09-08 at 17:24:16. Gemini links have been rewritten to link to archived content

-=-=-=-=-=-=-

text/gemini and hard line wraps

One thing I really don't like about text/gemini is that lines must never be hard-wrapped. Or rather, that hard-wrapped lines are not combined together into a coherent unit called a paragraph. In this post I'll rant a little about why I think this is a bad idea and what I do to try and mitigate it.

Background

The lightweight markup language that accompanies the Gemini protocol is severely limited, which largely befits the mission statement of the protocol in general. It is entirely line-oriented, so every piece of markup is essentially a block element of some sort, and the specification calls for only minimal processing so that there is as little friction as possible in turning markup into text.

To that end, there are really only these items:

Line

# *

## **

### ***

```
#+BEGIN_SRC
#+END_SRC
```

> #+BEGIN_QUOTE
>
> #+END_QUOTE

=> https://example.com [[https://example.com][link]]

Each item above is shown next to its equivalent org-mode construct.

Simplicity over usability

As you might have guessed at this point, my markup language of choice is org-mode, which is in fact not supported by any protocol at all and is in some sense the near-polar opposite of text/gemini: full of baroque syntax and obscure features, and nearly impossible to parse in a Perl-like fashion. I accept that there is value in a simpler parser, which is beneficial for a project like Gemini.

However, I do believe that text/gemini goes too far in the opposite direction, favouring easy parsing way too much over convenience for writers. For a format that is meant for ease of writing, this seems to me like an inconsistency.

"Official" (?) page for semantic line breaks

One /particular/ thing that I would still rather have is the ability to insert semantic newlines: newlines that are visible in the source but not rendered in the displayed product. This is not a new concept (the link above shows as much) and most markup languages support it. Semantic newlines are in my opinion useful to the writer for various reasons, including:

Easy version control and diffs
Keeping your writing, especially with regard to phrase length, under control
Making some normally-invisible parts of the language more explicit

While modern text editors and browsers are able to wrap text, so that long lines are no longer a problem to view and edit, semantic newlines still add value for the reasons listed above.
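
To make the point about diffs concrete, here is a small sketch using Python's standard difflib. The helper name changed_lines and the sample sentences are made up for illustration; the only claim is that a one-word edit touches a whole paragraph when it lives on one long line, and a single phrase when the source uses semantic newlines.

```python
import difflib
from typing import List


def changed_lines(before: str, after: str) -> List[str]:
    """Return only the added/removed lines of a unified diff."""
    diff = list(
        difflib.unified_diff(
            before.splitlines(), after.splitlines(), lineterm="", n=0
        )
    )
    # The first two entries are the "---" / "+++" file headers.
    return [line for line in diff[2:] if line.startswith(("+", "-"))]


# The same paragraph, once as a single long line and once with semantic newlines.
wrapped = "Semantic newlines are useful, because diffs stay small, and phrases stay short."
semantic = "Semantic newlines are useful,\nbecause diffs stay small,\nand phrases stay short."

# Apply the same one-word edit to both versions and compare the diffs.
for source in (wrapped, semantic):
    for line in changed_lines(source, source.replace("small", "tiny")):
        print(line)
    print()
```

In the first case the diff repeats the entire paragraph twice; in the second it shows only the phrase that changed.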

Statelessness

The text/gemini parser is not stateless if one is attempting a text rendering routine: the "pre-formatted text" toggle line sees to that, because how a line is treated depends on whether a toggle came before it. My reading of the spec is that this is not an optional feature, unlike block-quotes and lists.
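
A minimal sketch of what that state looks like in practice, assuming a simplified reading of the line types (the function name classify and the kind labels are my own, and heading levels are deliberately not distinguished):

```python
from typing import Iterable, Iterator, Tuple


def classify(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (kind, text) pairs for each line of a text/gemini document."""
    preformatted = False  # the one bit of state the parser has to carry
    for line in lines:
        if line.startswith("```"):
            # Toggle lines switch the mode and are not rendered themselves.
            preformatted = not preformatted
            continue
        if preformatted:
            yield ("pre", line)  # no other markup is recognised in here
        elif line.startswith("=>"):
            yield ("link", line[2:].strip())
        elif line.startswith("#"):
            yield ("heading", line.lstrip("#").strip())
        elif line.startswith("* "):
            yield ("list", line[2:])
        elif line.startswith(">"):
            yield ("quote", line[1:].strip())
        else:
            yield ("text", line)
```

The same line, "# not a heading", comes out as a heading or as pre-formatted text depending on what preceded it, so the parser already cannot look at each line in isolation.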

Adding extra state to track when a new paragraph (two newline characters) appears is in my opinion not too difficult; I've supervised the growth of a new markup language that simply accepts "paragraphs" as a matter of course. text/gemini ignores that in favour of simpler parsing, which it undoubtedly is, but as I said previously I think it takes "less is more" too far.

Indeed, it is not difficult to add a pre-processor step that strips semantic newlines and then parses the result as a now fully conforming text/gemini document.

Why semantic newlines and not, say, inline markup?

Semantic newlines are much simpler to implement than those other things, and require no change in the formatted output – indeed, they expressly /require/ no change in the formatted output. Unlike inline markup, semantic newlines preserve the line-by-line parsing model, meaning that the contents of the lines remain unexamined when joining the words together.

There is a little complexity when attempting to support advanced text/gemini features like headings, lists and blockquotes with semantic newlines, which can offer a reason not to support semantic newlines. Nevertheless, we could instead specify that these things can only work in a new paragraph (so separated with two newlines) which is what in my experience everyone does anyway.
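
Here is a rough sketch of such a pre-processor under one possible reading of that rule. The assumptions are all mine: paragraphs are separated by blank lines, a paragraph whose first line starts with a heading, link, list or quote marker is passed through line by line, any other paragraph is joined into one long line, and pre-formatted blocks are left alone. The names unwrap and MARKERS are made up for this example.

```python
from typing import Iterable, List

# Prefixes of every line type other than plain text (assumed set).
MARKERS = ("#", "=>", "* ", ">")


def unwrap(lines: Iterable[str]) -> List[str]:
    """Join hard-wrapped plain-text paragraphs into single long lines."""
    out: List[str] = []
    paragraph: List[str] = []
    preformatted = False

    def flush() -> None:
        if not paragraph:
            return
        if paragraph[0].startswith(MARKERS):
            # Markup is only honoured at the start of a paragraph;
            # such paragraphs are passed through untouched.
            out.extend(paragraph)
        else:
            # Plain text: strip the semantic newlines.
            out.append(" ".join(part.strip() for part in paragraph))
        paragraph.clear()

    for line in lines:
        if line.startswith("```"):
            flush()
            preformatted = not preformatted
            out.append(line)
        elif preformatted:
            out.append(line)  # never touch pre-formatted content
        elif line.strip() == "":
            flush()
            out.append("")
        else:
            paragraph.append(line)
    flush()
    return out
```

Note that under this reading a line that begins with "#" in the middle of a plain paragraph is simply joined in as text, which is exactly the restriction described above.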

Other small niggles

It's a little funny that the original text/gemini description document says:

It can also be used to write documents like this one, which explain Gemtext syntax with examples - you're able to see the syntax examples above without your client interpreting them like it normally would because they are rendered in preformatted mode.

This is really funny, because the pre-formatted markup cannot be escaped, and the document carefully omits the example of using pre-formatted text to demonstrate how pre-formatted text works. One can fake this by putting an invisible Unicode character in front, so that the backticks are no longer the first characters on the line:

```
```

but then the result can't be copy-pasted.
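
A tiny sketch of why the trick works and why it breaks copy-pasting. The zero-width space (U+200B) is only an assumed choice of invisible character, and is_toggle is a made-up stand-in for the check a client would do:

```python
ZWSP = "\u200b"  # zero-width space; an assumed choice of invisible character


def is_toggle(line: str) -> bool:
    """Stand-in for a client's check for a pre-formatted toggle line."""
    return line.startswith("```")


real = "```"
fake = ZWSP + "```"  # looks identical on screen in most fonts

print(is_toggle(real))  # the genuine toggle line is recognised
print(is_toggle(fake))  # the disguised one is treated as ordinary content
print(real == fake)     # and a copy-pasted "fake" is not a working toggle
```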

⸻⸻⸻⸻⸻

This document is actually written in org-mode and then exported to Gemini. Manual edits are then made to cover for the deficiencies in the export process.