💾 Archived View for mozz.us › markdown › design_document.html captured on 2022-07-16 at 14:51:30.


<div style="max-width: 38rem; padding: 2rem; margin: auto">
<h1>Gemini Markdown Proposal</h1>
<p>This document uses an experimental markdown format that was designed to extend the text/gemini mimetype. The design goals for this format are:</p>
<ul>
<li>Simple enough to write and read as plain text</li>
<li>Simple enough to be easily parsed, with unambiguous line-based syntax</li>
<li>Expressive enough to compose markdown style documents</li>
<li>Easy to understand for those familiar with writing in markdown</li>
</ul>
<p>This format combines several syntax ideas that other people have proposed on the mailing list and elsewhere. I do not claim creative ownership over any of this; I am just too lazy to properly cite where everything came from.</p>
<h2>Overview</h2>
<pre># Title
## Heading
### Sub-Heading

=&gt;Link

* Unordered list

+ Ordered list

--- Horizontal rule

Text blocks surrounded by ``` contain pre-formatted lines.</pre>
<h2>Paragraphs</h2>
<p>Paragraphs are blocks of text that do not start with any other reserved &quot;tokens&quot;. Any whitespace at the beginning and end of lines within a paragraph will be normalized to a single &lt;SPACE&gt;. This allows paragraphs to be easily re-flowed to fit the client&#x27;s screen.</p>
<p>Two or more consecutive newlines will start a new paragraph. If a line contains only whitespace characters, it will be considered equivalent to a newline.</p>
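<p>The paragraph rules above can be sketched in a few lines of Python. This is only an illustration: the function name split_paragraphs is hypothetical, it ignores the other reserved tokens, and it assumes that "normalized to a single space" means stripping each line and joining the lines of a paragraph with one space.</p>

```python
def split_paragraphs(text):
    """Group lines into paragraphs and collapse each into one re-flowable string."""
    paragraphs = []
    current = []
    for line in text.split("\n"):
        stripped = line.strip()
        if stripped:
            current.append(stripped)
        elif current:
            # An empty or whitespace-only line ends the current paragraph.
            paragraphs.append(" ".join(current))
            current = []
    if current:
        paragraphs.append(" ".join(current))
    return paragraphs
```

<p>Because each paragraph comes back as a single string, a client is free to re-wrap it to whatever width its screen allows.</p>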
<p>Inline styles like *italic* and **bold** are NOT supported. They would be nice to have, but they make parsing much more difficult and result in too many ambiguities and edge cases.</p>
<h2>Headings</h2>
<p>Three levels of headings are supported.</p>
<h1>This is a title</h1>
<h2>This is a heading</h2>
<h3>This is a sub-heading</h3>
<p>Headings are denoted by one to three hash characters followed by a single whitespace character. A heading is restricted to a single line, but you can make it as long as you want if you are willing to spill over 80 characters. More than three hash characters are not allowed, because supporting arbitrary heading levels makes parsing more difficult.</p>
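<p>As a rough sketch of the heading rule, the check below accepts one to three hash characters followed by a whitespace character and rejects everything else. The name parse_heading is hypothetical, and trimming the heading text is an assumption rather than something this document specifies.</p>

```python
def parse_heading(line):
    """Return (level, text) for a heading line, or None if it is not a heading."""
    for level in (1, 2, 3):
        prefix = "#" * level
        # The hash run must be followed directly by one whitespace character.
        if line.startswith(prefix) and line[level:level + 1] in (" ", "\t"):
            # Assumption: surrounding whitespace in the heading text is trimmed.
            return level, line[level + 1:].strip()
    return None
```

<p>A line with four or more hashes, or with no whitespace after the hashes, falls through and would be treated as ordinary paragraph text.</p>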
<h2>Links</h2>
<p>Links keep the pre-existing gemini format:</p>
<div>
<a href="/path">/path</a>
</div>
<div>
<a href="/path">An alternate description</a>
</div>
<div>
<a href="gemini://mozz.us">Full URLs are also supported</a>
</div>
<p>If you are writing a gemini client for the terminal, you can get away with parsing only the link lines in a file and displaying everything else as plain text. The markdown elements are designed to gracefully degrade.</p>
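<p>A minimal client that only understands link lines might look something like the sketch below. The name parse_link is hypothetical. It assumes the usual gemini link layout (the "=>" marker, optional whitespace, a URL, then an optional description) and falls back to the URL itself when no description is given, as in the first example above.</p>

```python
def parse_link(line):
    """Return (url, label) for a "=>" link line, or None for any other line."""
    if not line.startswith("=>"):
        return None
    # Split into the URL and the optional description on the first whitespace run.
    parts = line[2:].strip().split(None, 1)
    if not parts:
        return None  # assumption: a bare "=>" with no URL is treated as plain text
    url = parts[0]
    label = parts[1] if len(parts) > 1 else url
    return url, label
```

<p>Every line for which this returns None can simply be printed as-is, which is the graceful degradation described above.</p>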
<h2>Horizontal Rules</h2>
<p>Horizontal rules are lines that start with three hyphens (&quot;---&quot;).</p>
<hr>
<p>These have the same semantic meaning as the &lt;hr&gt; tag in HTML. The text after the first three characters is not significant. You could choose to pad the line out to 80 characters so that it looks better in plain-text displays.</p>
<h2>Lists</h2>
<h3>Unordered</h3>
<p>Unordered (bullet) lists are consecutive lines that start with the &quot;*&quot; character.</p>
<ul>
<li>This is an unordered list</li>
<li>Single items are not allowed to wrap to multiple lines</li>
<li>Nested lists are not supported because they are difficult to parse</li>
</ul>
<h3>Ordered</h3>
<p>Ordered lists behave the same, except they use the &quot;+&quot; character instead of an asterisk. Using a single character is much easier to parse than trying to detect consecutive integers like &quot;1.)&quot;, &quot;2.)&quot;, etc. The client can render the ordered list however it sees fit, for example by using an &lt;ol&gt; in HTML.</p>
<ol>
<li>This is an ordered list</li>
<li>Second element</li>
<li>Third element</li>
</ol>
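<p>Since a list is just a run of consecutive lines sharing the same marker, grouping them needs no lookahead. The sketch below is one possible approach, not the reference parser. The name group_list_items is hypothetical, and it assumes that the marker does not have to be followed by a space.</p>

```python
def group_list_items(lines):
    """Collect runs of consecutive "*" or "+" lines into (kind, items) tuples."""
    markers = {"*": "unordered", "+": "ordered"}
    groups = []
    previous_kind = None
    for line in lines:
        kind = markers.get(line[:1])  # None for any non-list line
        if kind is not None:
            item = line[1:].strip()
            if kind == previous_kind:
                groups[-1][1].append(item)     # continue the current list
            else:
                groups.append((kind, [item]))  # a new list starts here
        previous_kind = kind
    return groups
```

<p>A renderer could then number the items of each "ordered" group itself, which is why the source text never needs to contain integers.</p>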
<h2>Preformatted Text</h2>
<p>A preformatted text block starts and ends with a line containing only the &quot;```&quot; characters. You can use preformatted blocks for ASCII art, code snippets, or anything else that requires a monospace font or significant whitespace.</p>
<pre>def greet(name):
    print(&quot;Hello &quot; + name)</pre>
<pre>           /\
          p  q
      _\| \  / |/_
        \//  \\/
         `|  |`
          |  | ,/_
      _\, |  |//\
       /\\|  ;/
         \;  \
  jgs      &#x27;. \
      .-.    \ |
     `   &#x27;.__//
           `&quot;`

# Other tokens like headings are ignored inside of pre-formatted blocks.</pre>
<h2>Parsing</h2>
<p>I was able to write a syntax parser for this format in about 50 lines of Python. I have never written a markdown parser before, and my code is very naive. I did not need to use recursion or any other advanced constructs. The only state that I had to keep was whether or not a preformatted text block was currently active.</p>
<p>This document leaves some ambiguities as to how whitespace is handled. The parser implementation should be considered the canonical truth for the proposed markdown specification.</p>
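<p>To give a feel for how little state is involved, here is a rough line classifier written for illustration. It is not the parser referred to above and should not be read as canonical. The names FENCE and tokenize, the token labels, and the tolerance for whitespace around the fence line are all assumptions.</p>

```python
FENCE = "`" * 3  # the three-backtick marker for preformatted blocks


def tokenize(text):
    """Classify each line of a document as a (kind, content) pair."""
    tokens = []
    in_pre = False  # the only state carried from one line to the next
    for line in text.split("\n"):
        if line.strip() == FENCE:
            in_pre = not in_pre  # the fence line itself is not emitted
        elif in_pre:
            tokens.append(("pre", line))  # kept verbatim, other tokens ignored
        elif line.startswith("=>"):
            tokens.append(("link", line[2:].strip()))
        elif line.startswith("---"):
            tokens.append(("rule", ""))  # text after the marker is not significant
        elif line.startswith(("# ", "## ", "### ")):
            level = len(line) - len(line.lstrip("#"))
            tokens.append(("h%d" % level, line[level:].strip()))
        elif line.startswith("*"):
            tokens.append(("unordered", line[1:].strip()))
        elif line.startswith("+"):
            tokens.append(("ordered", line[1:].strip()))
        elif line.strip():
            tokens.append(("text", line.strip()))
        else:
            tokens.append(("blank", ""))
    return tokens
```

<p>Each line is classified by its prefix alone, with no recursion or backtracking, so a renderer only has to merge neighbouring "text" tokens into paragraphs and neighbouring list tokens into lists.</p>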
</div>