FW: Text reflow woes (or: I want bullets back!)

Brian Evans <b__m__e (a) mailfence.com>

I've mostly been sitting back through all of this list and raw mode
talk, only occasionally popping in small points here or there. But
I figure if this is a community decision it is good if those in the community
voice their support or lack of support for ideas. 
 
I am mostly ambivalent about the need for a raw text mode. Terminals 
and any graphical clients (GTK, FLTK, Tcl/Tk) all wrap to the window at either
a hard or a word wrap. If we aren't reflowing, then we are already doing what
needs to be done. There has been talk about not wrapping python code. Ok.
I get how that would look bad. But which is more readable:
      1. Hard-wrapped python code
      2. Truncated python code where you just straight up lose the content
Clearly #1. I get it as a nice thing for ascii art, but again I feel like using
ascii art is a risk a content creator takes... knowing that it may degrade
poorly on various output systems.
 
Tomasino writes:
> 1. Listing things
> 2. Quick instructions where headers are overkill
> 3. Track listings
> 4. Top 10 lists
> 5. The same reasons we want bullets the reflow
> 6. Recipes!
 
I think Tomasino's response to Julien, in plain text, just proved the _lack_ of 
need for lists.
 
Want a list? Write a list.... handled. I really really really do not see the point of 
adding all of this markup :-/ Just serve files as markdown if you want
markdown and let clients either support markdown or not. 
 
I really do not want to, and likely will not, add support for lists in my current
or any future gemini client I make. I don't see the point. 
 
I do appreciate all of the work and thought that has gone into trying to
figure these things out. It just feels unnecessary for a simple protocol
like gemini. Why get complex when just serving markdown or html or
whatever you like will do the job and people can read it as they see fit.
We are definitely starting to get into the territory with the proposals
where some client authors will support some things and some won't...
which will create a fractured view of gemini to users. So why not keep it
simple and cohesive by just having links and move on to other things?
 
Just my two cents at the moment.


solderpunk <solderpunk (a) SDF.ORG>

On Sun, Jan 19, 2020 at 08:15:14PM +0100, Brian Evans wrote:

> I've mostly been sitting back through all of this list and raw mode
> talk, only occasionally popping in small points here or there. But
> I figure if this is a community decision it is good if those in the community
> voice their support or lack of support for ideas. 

It definitely is, if people who don't feel good about proposed changes
keep quiet and all I hear on the list is agreement it will just lead to
the most vocal members driving things.  I'm glad you've spoken up.

This will be a quick and kind of partial reply, sorry.  But, very
briefly, I understand the concern about lists (of all the recent
proposed changes, they are the least valuable IMHO because they only
concern presentation).  How do you feel about the heading proposals?
IMHO they are probably the most valuable of these changes because they
can be totally ignored from the perspective of presentation and yet
still do genuinely useful stuff, like automatically generated tables of
contents, which do nothing but make large and well-structured text
content easier to read, or help with automatic generation of
user-friendly menus or feeds to bodies of content.  That's good stuff,
surely?
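
A rough sketch of the table-of-contents idea, assuming headings are lines that start with one or more '#' characters and that ``` toggles a preformatted block (both taken from the proposals in this thread, neither is settled). The function name and the sample document are made up for illustration:

```python
def table_of_contents(lines):
    """Collect (level, title) pairs from heading lines, skipping preformatted blocks."""
    toc = []
    preformatted = False
    for line in lines:
        if line.startswith("```"):
            # Toggle preformatted mode; '#' lines inside it are not headings
            preformatted = not preformatted
            continue
        if preformatted or not line.startswith("#"):
            continue
        stripped = line.lstrip("#")
        level = len(line) - len(stripped)
        toc.append((level, stripped.strip()))
    return toc


document = [
    "# Recipes",
    "Some introduction.",
    "## Bread",
    "```",
    "# not a heading, just preformatted text",
    "```",
    "## Soup",
]
for level, title in table_of_contents(document):
    print("  " * (level - 1) + title)
```

A client that ignores headings for presentation could still run something like this to offer a jump menu.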

> I really do not want to, and likely will not, add support for lists in my current
> or any future gemini client I make. I don't see the point. 

As I intend to actually write all this stuff in the spec, if I decide to
adopt it, supporting the list line types will be strictly optional, so
this is absolutely fine.  For a text-based client like Bombadillo, the
difference between supporting lists and not is an extremely small
aesthetic thing that some users probably won't even notice.

> I do appreciate all of the work and thought that has gone into trying to
> figure these things out. It just feels unnecessary for a simple protocol
> like gemini. Why get complex when just serving markdown or html or
> whatever you like will do the job and people can read it as they see fit.
> We are definitely starting to get into the territory with the proposals
> where some client authors will support some things and some won't...
> which will create a fractured view of gemini to users. So why not keep it
> simple and cohesive by just having links and move on to other things?

The fractured Geminispace thing is a real concern, but actually I think
an argument can be made in both ways here.  If text/gemini is kept
extremely simple and plain and the official policy is "Serve Markdown or
HTML if you want even basic text styling" then people may well do that.
Some clients will add support for rendering those but many won't because
it's so much more complicated (far worse than what has been proposed
here!).  Then we end up with two regions of Geminispace, the text/gemini
subspace which anybody can visit and the Markdown subspace which only
people using the fanciest clients can visit.  IMHO, this is a worse kind
of fracturing than one where we adopt the proposed new text/gemini
syntax and different clients implement different subsets of the optional
features.  After all, the only reason I've been so positive about these
recently proposed changes is that they all seem to degrade very
gracefully if a client doesn't recognise them and treats them equivalent
to text.  The degree of fracturing possible is actually very slight.

> Just my two cents at the moment.

One cent of opinion from client or server authors is valued equal to one
dollar of opinion from anybody else. :)

Cheers,
Solderpunk


solderpunk <solderpunk (a) SDF.ORG>

On Sun, Jan 19, 2020 at 08:01:23PM +0000, solderpunk wrote:
 
> The fractured Geminispace thing is a real concern, but actually I think
> an argument can be made in both ways here.  If text/gemini is kept
> extremely simple and plain and the official policy is "Serve Markdown or
> HTML if you want even basic text styling" then people may well do that.
> Some clients will add support for rendering those but many won't because
> it's so much more complicated (far worse than what has been proposed
> here!).  Then we end up with two regions of Geminispace, the text/gemini
> subspace which anybody can visit and the Markdown subspace which only
> people using the fanciest clients can visit.  IMHO, this is a worse kind
> of fracturing than one where we adopt the proposed new text/gemini
> syntax and different clients implement different subsets of the optional
> features.  After all, the only reason I've been so positive about these
> recently proposed changes is that they all seem to degrade very
> gracefully if a client doesn't recognise them and treats them equivalent
> to text.  The degree of fracturing possible is actually very slight.

Oh, there was also the argument made at some point that if Gemini
doesn't have any standardised syntax for these common formatting tasks,
ambitious clients might start trying to recognise the most popular
non-standard ways of doing it, which could easily lead to divergent
implementations across clients.  So, better to provide a standard way to
do it, to give us control and uniformity.

I don't mean to claim either of these arguments is bulletproof, I just
don't think that a principle of "we shouldn't do anything that risks
fragmentation of Geminispace" (which is obviously a good principle)
necessarily comes down clearly on one side of this question or the
other.

Cheers,
Solderpunk


Michael Lazar <lazar.michael22 (a) gmail.com>

I think there have been a lot of good suggestions coming from all sides. Here's
my take on a compromise format that tries to take everything into account while
also inserting a few of my own opinions:

Parser pseudo-code (actually, it's valid python):

 ```
preformat_mode = False
preformat_buffer = []
for line in document:
    if line.startswith('```'):
        if not preformat_mode:
            # Start preformat block
            preformat_mode = True
            preformat_buffer = []
        else:
            # End preformat block
            preformat_mode = False
            display_preformat_block(preformat_buffer)
    elif preformat_mode:
        # Inside of preformat block
        preformat_buffer.append(line)
    elif line.startswith('###'):
        display_header_level_3(line)
    elif line.startswith('##'):
        display_header_level_2(line)
    elif line.startswith('#'):
        display_header_level_1(line)
    elif line.startswith('=>'):
        display_link(line)
    elif line.startswith('---'):
        display_horizontal_rule()
    else:
        display_paragraph(line)

if preformat_mode:
    # Flush the preformat block if there was no end tag
    display_preformat_block(preformat_buffer)
 ```

This pseudo-code was written with "fancy" gemini clients in mind. In other
words, this should be close to the worst-case scenario for how complicated a
gemini document parser would ever need to be.

## Preformat mode

Many clients are going to want to display a preformat block of text in a
horizontally scrollable window or some other type of block widget. This
pseudo-code reflects that by sticking the pre-formatted lines in a separate
buffer until the end of the block. I think this is a more accurate
representation of what most client parsers would end up looking like.

## Headers

I'm of the opinion that there should only be a fixed number of header levels.
It keeps the matching logic flat and straightforward. Three levels is few
enough that most clients should be able to come up with distinct styles to
display them. Fixed header lines are trivial to parse and provide a lot of
utility for organizing a document and linking to sub-sections.
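
As a minimal sketch of how flat the matching stays with exactly three levels, the helper below (a hypothetical name, not part of the pseudo-code above) checks the longest prefix first and strips the surrounding whitespace from the heading text:

```python
def parse_header(line):
    """Return (level, text) for a header line, or None for anything else."""
    # Longest prefix first, so '###' is not mistaken for a level 1 header
    for level, prefix in ((3, "###"), (2, "##"), (1, "#")):
        if line.startswith(prefix):
            return level, line[len(prefix):].strip()
    return None


print(parse_header("##     heading text"))
print(parse_header("just a paragraph"))
```

Anything deeper than three '#' characters simply falls into level 3 here, which is one possible reading of "fixed number of levels" rather than a decided rule.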

### Ordered Lists & Unordered Lists

Lists are tricky because while they would be nice to have, they complicate the
parsing significantly. In order to parse a list while preserving its semantic
structure, you will need to keep track of where it starts and ends. Nested
lists complicate this even further, no matter which syntax for nesting is used.

Parsing lists semantically would require keeping a separate buffer for each type
of list, and then keeping flags and making sure that these buffers are flushed
after the last element in the list. Because of this, I do not believe that they
pass the power-to-weight ratio smell test.

For authors, they still have a few choices for lists:

1. Stick the list in a preformat block
2. Write the list in plain mode without special formatting

I accept that neither of these options is *ideal* for all use cases, but I
think they are *good enough* for most use cases. Don't forget that unicode
bullets can already be added directly in gemini documents if the author
wishes to do so.

### Quotes

Quote blocks with ">" would be ok if we could count on them being only a single
line long. However, many quotes will necessarily include line breaks that
should be displayed together in a single block. This complicates parsing in the
same way that lists do, so I think that quotes should also be omitted for the
same reason.

If you want to display something like a quote from a mailing list message, I
think that would be a perfect candidate for copying it into a preformat
block. For other types of quotes, stick them between two horizontal rules to
separate them from the surrounding text.

### Horizontal Rule

I find the horizontal rule useful for separating sections of a page. I see them
commonly used on gopher to add a footer to the bottom. They can likewise be
used for header sections.

 ```
Header
---
Content
---
Footer
 ```

The following gemini sites already use some form of horizontal separator on
their front pages (the precise syntax varies):

- gemini://vger.cloud/
- gemini://gemini.conman.org/
- gemini://zaibatsu.circumlunar.space/
- gemini://carcosa.net/
- gemini://yam655.com/

I think that since horizontal rules are easy to parse and they add utility for
structuring pages, they should be included in the spec.

### Other Random Opinions

- Leading and trailing whitespace should be stripped from all lines outside
  of the preformat block. If you're allowing a non-monospace font for these
  elements, then leading whitespace can look inconsistent and trailing
  whitespace serves no real purpose. By leading whitespace, I mean that

##     heading text

  Should be interpreted as "heading text", not "     heading text".
- I have no opinion on whether the ``` should allow text after it on the same
  line.

I think I would be satisfied enough with the above document to at least try it
out by converting all of my existing gemini content. I also think I would be
fine keeping everything fixed-width and hard wrapping. I *don't* think I would
want to implement nested lists or quote blocks, or anything significantly more
complicated than what is outlined above.

- mozz


solderpunk <solderpunk (a) SDF.ORG>

On Sun, Jan 19, 2020 at 08:15:14PM +0100, Brian Evans wrote:

I've been thinking more about lists and, while I understand the concern
about maintaining simplicity, I really do think that:

> I think Tomasino's response to Julien, in plain text, just proved the _lack_ of 
> need for lists.
>  
> Want a list? Write a list.... handled.

is a bit too simplistic.  Tomasino's response consisted entirely of
short list items which didn't require wrapping, and also happened in a
plain text email, which is a hard wrapping environment and so not
comparable to a "long line" text/gemini document.

In Gopherspace, it's very common for people writing lists to format
multi-line list items "nicely", e.g. people write this:

---

  extended periods, at least eight days required for a Moon landing,
  to a maximum of two weeks

  the combined spacecraft using the propulsion system of the target
  vehicle

  outside the protection of the spacecraft, and to evaluate the
  astronauts' ability to perform tasks there

  pre-selected location on land
---

In preference to this:

---

extended periods, at least eight days required for a Moon landing,
to a maximum of two weeks

maneuver the combined spacecraft using the propulsion system of
the target vehicle

space-"walks" outside the protection of the spacecraft, and
to evaluate the astronauts' ability to perform tasks there

touchdown at a pre-selected location on land
---

I had worried, before writing this, that perhaps I was a lone weirdo for
caring about this, but didn't have to search too far to find examples of
others doing it, e.g.
gopher://zaibatsu.circumlunar.space:70/0/~tfurrows/phlog/2020-01-16_worldEndingStuff.txt

Now, this nice list formatting is simple and easy in a hard-wrapping
environment with fixed line width.  In a long line environment where
clients wrap the line to the appropriate length, and the appropriate
length is different for everybody, it's impossible for a content
author to ensure readers get the first, nicer style of list
formatting.  As a consequence of client wrapping, we get stuck with the
second.

For lists of short items, there is no difference.

Some might argue that this really doesn't matter and isn't worth adding
complicating features for.  But the fact that people writing for Gopher
take the time to do the nice formatting suggests people actually care
about this.  And I don't think it's purely a matter of vanity and
wanting our content to simply look pretty - if that's what we wanted, we
wouldn't be writing for gopher!  I actually think the nicer list
formatting makes for a genuinely better reading experience.  The list is
much more easily and immediately recognisable as a list and it's much
easier for the eyes to parse out the separate list items.  This really
is a functional thing, I think.  A small one, I grant you, but a real
one.  It would be a shame if Gemini content had to be in some ways
harder to read than Gopher content, and ironic if that happened as a
result of the decision to have clients wrap lines to fit the viewport
neatly in order to ensure a better reading experience for everyone!

After thinking about this I am more convinced than I was yesterday that
having some defined markup for lists is worthwhile.  That said, I'm
becoming rapidly less convinced about *nested* lists, especially
without a limit on the permitted level of nesting.  More on that when I
reply to Michael's recent email, which will probably be much later
today.

Cheers,
Solderpunk


solderpunk <solderpunk (a) SDF.ORG>

Just a quick response for now: nice post, thanks, there's a lot in here
that I agree with (and I had been starting to think similar things about
quotes), but can I ask you to elaborate on:

On Sun, Jan 19, 2020 at 09:01:17PM -0500, Michael Lazar wrote:
 
> Lists are tricky because while they would be nice to have, they complicate the
> parsing significantly. In order to parse a list while preserving its semantic
> structure, you will need to keep track of where it starts and ends. Nested
> lists complicate this even further, no matter which syntax for nesting is used.
> 
> Parsing lists semantically would require keeping a separate buffer for each type
> of list, and then keeping flags and making sure that these buffers are flushed
> after the last element in the list. Because of this, I do not believe that they
> pass the power-to-weight ratio smell test.

In particular, what do you mean by "parsing lists semantically"?

At no point in these discussions have I been envisaging anything to do
with lists which requires clients to recognise or keep track of whether
or not they are "inside" a list or not, or sticking lists in buffers.
I have imagined list items standing alone and "lists" being an emergent
property of a document that clients have no awareness of - in exactly
the same way that "paragraphs" are an emergent property of lines (if
some of those lines happen to be blank).

Well, that's true for unordered lists, at least.  Ordered lists are
another story 

Cheers,
Solderpunk


James Tomasino <tomasino (a) lavabit.com>

On 1/20/20 10:28 AM, solderpunk wrote:
> Well, that's true for unordered lists, at least.  Ordered lists are
> another story 

Ordered lists are--so far--the only thing that really does break the
linear line-by-line processing approach. If they are sacrificed to the
gods of a cleaner spec, I for one wouldn't cry too much. Unordered lists
that support reflow is the more important bit. One could always cheat
and just use unordered lists and start each one with a number:


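* 1) item one!
* 2) A really long item two that will be wonderfully reflowed on small
screens by our awesome client writers.
** 2a) I'm lookin' at you, Michael!
** 2b) And the rest of you too. ;)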


Saves us a bit o' logic. Keeps everything line based. You could run a
stateless renderer on a stream now and it wouldn't choke.
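
A minimal sketch of what such a stateless renderer could look like, assuming '*' and '**' mark list items as in the example above. The function names and the default width are illustrative only, and preformatted blocks (which do need one flag of state) are left out:

```python
import textwrap


def render_line(line, width=40):
    """Render one line with no knowledge of the lines before or after it."""
    line = line.rstrip("\n")
    stars = len(line) - len(line.lstrip("*"))
    if stars == 0:
        # Ordinary text line: just wrap it
        return textwrap.fill(line, width=width)
    text = line[stars:].strip()
    indent = "  " * (stars - 1)
    return textwrap.fill(
        text,
        width=width,
        initial_indent=indent + "* ",
        subsequent_indent=indent + "  ",
    )


def render_stream(lines, width=40):
    """Yield rendered output lazily, one input line at a time."""
    for line in lines:
        yield render_line(line, width)


sample = [
    "* 1) item one!",
    "* 2) A really long item two that will be wonderfully reflowed on small screens.",
    "** 2a) A nested item",
]
for rendered in render_stream(sample):
    print(rendered)
```

Because the numbers are part of the item text, the renderer never has to count anything.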


solderpunk <solderpunk (a) SDF.ORG>

On Mon, Jan 20, 2020 at 12:09:01PM +0000, James Tomasino wrote:
 
> One could always cheat
> and just use unordered lists and start each one with a number:
> 
> * 1) item one!
> * 2) A really long item two that will be wonderfully reflowed on small
> screens by our awesome client writers.
> ** 2a) I'm lookin' at you, Michael!
> ** 2b) And the rest of you too. ;)
> 
> Saves us a bit o' logic. Keeps everything line based. You could run a
> stateless renderer on a stream now and it wouldn't choke.

This has two other advantages, too.  First, when processed by a simple
client which opts not to treat list items as anything special, the
result is still obviously an ordered list.  Under the + proposal, it
will look just like an unordered list whose author chose a different
bullet character for reasons of taste.  Ambiguous degradation is not
graceful degradation!

Second, authors can unambiguously refer back to a list item in later
writing.  I can say "Tomasino used a winky face in item 2b above" and
you can all go back and find 2b and confirm this.  If I said that and
you were using a simple client that just rendered + and ++, it would
be up to you to mentally figure out which point was 2b.  And even if
you're using an advanced client that does render ordered lists, I
might write "2b" but your fancy client might use Roman numerals for
second level lists and print 2ii instead of 2b, and again the
connection wouldn't be immediate.  This could only be solved by
tediously specifying that first level lists MUST use Arabic numerals,
second level lists MUST use lowercase letters, third level lists MUST
use Roman numerals, and so on and so on.  And then what happens when
somebody uses more than 26 second level list items and we run out of
lowercase letters to use?

A spec that can avoid all these problems will be exactly the kind of
long, tedious, fiddly spec that I really don't want us to use, and which
nobody will want to code to anyway.  I'm starting to think we should
either drop the ordered list idea, or at the very least strictly limit
it to one level with no nesting.

Cheers,
Solderpunk


Michael Lazar <lazar.michael22 (a) gmail.com>

On Mon, Jan 20, 2020 at 5:29 AM solderpunk <solderpunk at sdf.org> wrote:
>
> Just a quick response for now: nice post, thanks, there's a lot in here
> that I agree with (and I had been starting to think similar things about
> quotes), but can I ask you to elaborate on:
>
> On Sun, Jan 19, 2020 at 09:01:17PM -0500, Michael Lazar wrote:
>
> > Lists are tricky because while they would be nice to have, they complicate the
> > parsing significantly. In order to parse a list while preserving its semantic
> > structure, you will need to keep track of where it starts and ends. Nested
> > lists complicate this even further, no matter which syntax for nesting is used.
> >
> > Parsing lists semantically would require keeping a separate buffer for each type
> > of list, and then keeping flags and making sure that these buffers are flushed
> > after the last element in the list. Because of this, I do not believe that they
> > pass the power-to-weight ratio smell test.
>
> In particular, what do you mean by "parsing lists semantically"?
>
> At no point in these discussions have I been envisaging anything to do
> with lists which requires clients to recognise or keep track of whether
> or not they are "inside" a list or not, or sticking lists in buffers.
> I have imagined list items standing alone and "lists" being an emergent
> property of a document that clients have no awareness of - in exactly
> the same way that "paragraphs" are an emergent property of lines (if
> some of those lines happen to be blank).

By "parsing lists semantically" I mean that if I build an AST, I want all of
the list items grouped together inside of single list object. This is how I
did it when I was playing around with markdown a while ago [0]. From my
research this seems to be the common way to do it [1].

Sophisticated gemini clients could utilize this in a variety of ways. Maybe
you want to add a little bit of extra whitespace surrounding the list. Or you
want to make sure that your display does not cut off halfway through the
list. Or you want to support re-ordering list items alphabetically. I don't
know, the sky is the limit.
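
To make the grouping concrete, here is a small sketch that collects consecutive '*' lines into a single list node while leaving everything else flat. The node shapes ("list", "text") and the function name are invented for this example and are not taken from [0] or [1]:

```python
from itertools import groupby


def group_list_items(lines):
    """Group consecutive '*' lines into ('list', [...]) nodes; other lines stay flat."""
    nodes = []
    for is_item, run in groupby(lines, key=lambda l: l.startswith("*")):
        if is_item:
            nodes.append(("list", [l[1:].strip() for l in run]))
        else:
            nodes.extend(("text", l) for l in run)
    return nodes


doc = ["Shopping:", "* pears", "* apples", "Thanks!"]
for kind, value in group_list_items(doc):
    if kind == "list":
        # With the items grouped, re-ordering them is a one-liner
        print(sorted(value))
    else:
        print(value)
```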

I'm willing to admit that HTML has perhaps tainted my thinking here, but it
just feels *wrong* to me to have an <li> without the surrounding <ul>. Doing
the same thing with "paragraphs" (i.e. each line is a new paragraph) doesn't
feel wrong in the same way. I just have a hard time mentally getting past it.

If I understand correctly, the main argument that I'm hearing in favor of
unordered lists is so that users can visually distinguish the first line of
the list from subsequent lines that have been wrapped by the client. I can
empathize with this. Bullet lists have been called out because they're an
obvious example of where this is painful. But this might be a more generalized
problem. For example, a poem will have deliberate line breaks, but you would
also like your poem to be wrapped by the client.

What if I were to say this:

When a client is wrapping a line longer than the viewport, the client may choose
to add indents or other visual indicators to distinguish the beginning of the
line from a continuation line. The simplest way to do this would be by adding a
hanging indent to continuation lines.

Expanding on my previous code example:

 ```
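import textwrap

def display_paragraph(line, width=80):
    # Strip leading and trailing whitespace
    line = line.strip()

    # Hanging indent: continuation lines are indented, the first line is not.
    # width=80 is an assumed default; a real client would use the viewport width.
    wrapped_text = textwrap.wrap(
        line,
        width=width,
        initial_indent='',
        subsequent_indent='    ',
    )
    for wrapped_line in wrapped_text:
        print(wrapped_line)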
 ```

If we generalize this to all lines, we don't need to handle list items as a
special case. Is there anything that this would break?

[0] gemini://mozz.us/markdown/design_document.json
[1] https://github.com/syntax-tree/mdast#list

- mozz


Aaron Janse <aaron (a) ajanse.me>

> I'm of the opinion that there should only be a fixed number of header levels.
> It keeps the matching logic flat and straightforward. Three levels is few
> enough that most clients should be able to come up with distinct styles to
> display them. Fixed header lines are trivial to parse and provide a lot of
> utility for organizing a document and linking to sub-sections.

I like that this would discourage the "markdown hacking" seen on GitHub where
unnecessary depth is used to make the HTML render look nicer.

OTOH, it would be easy for authors to:


> Quote blocks with ">" would be ok if we could count on them being only a single
> line long. However, many quotes will necessarily include line breaks that
> should be displayed together in a single block. This complicates parsing in the
> same way that lists do, so I think that quotes should also be omitted for the
> same reason.

I'm strongly against reflowing quote blocks. I understand that we don't want
to display lists inside quotes in a fancy way, but I still think that we
shouldn't break them:

 ```
> My opinion:
> * these lines
> * should not be
> * reflowed
 ```

Speaking of which, should we explicitly disallow fancy-rendering lists within
quotes, or leave the choice up to client authors?

> If you want to display something like a quote from a mailing list message, I
> think that would be a perfect candidate for copying it into a preformat
> block. For other types of quotes, stick them between two horizontal rules to
> separate them from the surrounding text.

Maybe. I still don't like the idea of quotes not wrapping with display width.

> And then what happens when
> somebody uses more than 26 second level list items and we run out of
> lowercase letters to use?

Simply use multiple letters:

 ```
a. lorem ipsum
b. lorem ipsum
[...]
y. lorem ipsum
z. lorem ipsum
aa. lorem ipsum
ab. lorem ipsum
ac. lorem ipsum
 ```

However, this may be complex. I'll try to write a simple client tonight to see
how difficult all this is.
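
For what it's worth, the multiple-letter labels above can be generated with a few lines of arithmetic (bijective base 26). This is only a sketch and the function name is made up:

```python
def letter_label(n):
    """Map 1 -> 'a', 26 -> 'z', 27 -> 'aa', 28 -> 'ab', and so on."""
    if n < 1:
        raise ValueError("labels start at 1")
    label = ""
    while n > 0:
        # Subtract 1 first because there is no 'zero' letter in this scheme
        n, remainder = divmod(n - 1, 26)
        label = chr(ord("a") + remainder) + label
    return label


for position in (1, 26, 27, 28):
    print(f"{letter_label(position)}. lorem ipsum")
```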

Cheers!


solderpunk <solderpunk (a) SDF.ORG>

On Mon, Jan 20, 2020 at 12:30:14PM -0500, Michael Lazar wrote:
 
> By "parsing lists semantically" I mean that if I build an AST, I want all of
> the list items grouped together inside of single list object. This is how I
> did it when I was playing around with markdown a while ago [0]. From my
> research this seems to be the common way to do it [1].
> 
> Sophisticated gemini clients could utilize this in a variety of ways. Maybe
> you want to add a little bit of extra whitespace surrounding the list. Or you
> want to make sure that your display does not cut off halfway through the
> list. Or you want to support re-ordering list items alphabetically. I don't
> know, the sky is the limit.

Got it, thanks for clarifying.

We'll never be able to stop people going nuts and defining their own
structure on top of the official structure in the spec if they really
want to, but I think if the official spec can define a perfectly flat
structure (such that actually building an AST is unnecessary) which is
rich enough to take care of the most compelling styling that's needed to
achieve good readability, then that's absolutely fine.  There's no need
to have a concept of a list encapsulating consecutive list items in
order to implement the clean formatting I discussed previously, so I
think we can do without it.  It might feel weird compared to HTML or
LaTeX, but if it works, where's the problem?  I think this is how lists
work in common troff macros, actually, but I can't swear to it.
 
> I'm willing to admit that HTML has perhaps tainted my thinking here, but it
> just feels *wrong* to me to have an <li> without the surrounding <ul>. Doing
> the same thing with "paragraphs" (i.e. each line is a new paragraph) doesn't
> feel wrong in the same way. I just have a hard time mentally getting past it.

In the rough spec I sent around for this line-oriented syntax, each line

whitespace between two parts of text which correspond to different lines
before wrapping, you need to put a blank line in between them.  This
facilitates things like
one
word
per
line
for emphasis, acrostic poems, etc.

> When a client is wrapping a line longer than the viewport, the client may choose
> to add indents or other visual indicators to distinguish the beginning of the
> line from a continuation line. The simplest way to do this would be by adding a
> hanging indent to continuation lines.

Hmm.  An elegant idea, but I guess it would look quite strange for
ordinary text?

> Expanding on my previous code example:
> 
> ```
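> import textwrap
>
> def display_paragraph(line, width=80):
>     # Strip leading and trailing whitespace
>     line = line.strip()
>
>     # Hanging indent: continuation lines are indented, the first line is not.
>     # width=80 is an assumed default; a real client would use the viewport width.
>     wrapped_text = textwrap.wrap(
>         line,
>         width=width,
>         initial_indent='',
>         subsequent_indent='    ',
>     )
>     for wrapped_line in wrapped_text:
>         print(wrapped_line)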
> ```

Aah!!!  I hadn't noticed that initial_indent, subsequent_indent feature
of textwrap.wrap.  That makes handling unordered list items in the
proposed way absolutely trivial!

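def display_unordered_list_item(line):
    # Strip * and any whitespace
    line = line[1:].strip()
    # viewportwidth is assumed to be defined elsewhere by the client
    print(textwrap.fill(line, width=viewportwidth - 2,
                        initial_indent="* ", subsequent_indent=" "))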

Cheers,
Solderpunk


Aaron Janse <aaron (a) ajanse.me>

Attached is a <100 line python gemini renderer with the following features:

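* no external dependencies
* only two state variables
* unlimited-depth ordered lists that rotates through
  * numbers
  * letters (does `aa` after `z`, `aaa` after `zz`, etc)
  * roman numerals
* unlimited-depth unordered lists
* unlimited-depth headers with rotating colors
* wraps at word boundaries, with fancy indents for lists and quotes
* colors for all special syntax
* horizontal rules that span the width of the display
* preformatted text
* links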


To use this script, pipe text/gemini into stdin.

I hope this makes a strong case that these features aren't too complex to
implement.


Aaron Janse <aaron (a) ajanse.me>

Whoops, I forgot to reset the list counter. Fixed version attached.

Still < 100 lines of code, including empty lines.


solderpunk <solderpunk (a) SDF.ORG>

Thanks for taking the time to write this!

There are a few details that could be nitpicked (e.g. a lot of this code
seems to assume that *s or #s at the start of lines are followed by
whitespace, which hasn't been specced), but I'm totally happy that this
code is representative of the complexity involved in handling everything
proposed so far.

If a bare-minimum renderer implementing only the compulsory core line
types can be done in ~10 lines and a full-strength renderer implementing
everything to the max can be done in ~100 lines then I'm totally happy
with that - in terms of implementation difficulty.

I still want to think very carefully about graceful degradation, to make
sure those ~10 line renderers still yield usable results.  I still have
real concerns about ordered lists in that regard.
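
As a rough illustration of the bare-minimum end of that range, here is a sketch that assumes the compulsory core is just plain text lines and "=>" link lines. That assumption, the function name and the example URL are all placeholders, and preformatted blocks are ignored:

```python
import textwrap


def render_minimal(lines, width=80):
    """Wrap every line as plain text, except links, which are shown as 'label <url>'."""
    out = []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("=>"):
            parts = line[2:].split(maxsplit=1)
            if parts:
                url = parts[0]
                # Fall back to the URL itself when the link has no label
                label = parts[1] if len(parts) > 1 else url
                line = f"{label} <{url}>"
        out.append(textwrap.fill(line, width=width))
    return "\n".join(out)


print(render_minimal([
    "# Title",
    "=> gemini://example.org/ Example",
    "* an unordered item",
    "+ an ordered item",
]))
```

Headings and '*' items come through readable as-is, while the '+' item shows up with no number at all, which is the degradation worry with ordered lists.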

Cheers,
Solderpunk



On Mon, Jan 20, 2020 at 11:53:51AM -0800, Aaron Janse wrote:
> Attached is a <100 line python gemini renderer with the following features:
> * no external dependencies
> * only two state variables
> * unlimited-depth ordered lists that rotates through
>   * numbers
>   * letters (does `aa` after `z`, `aaa` after `zz`, etc)
>   * roman numerals
> * unlimited-depth unordered lists
> * unlimited-depth headers with rotating colors
> * wraps at word boundaries, with fancy indents for lists and quotes
> * colors for all special syntax
> * horizontal rules that span the width of the display
> * preformatted text
> * links
> 
> To use this script, pipe text/gemini into stdin.
> 
> I hope this makes a strong case that these features aren't too complex to
> implement.

> #!/usr/bin/env python3
> 
> import sys
> import textwrap
> 
> def int2roman(number):
>     numerals = { 1 : "I", 4 : "IV", 5 : "V", 9 : "IX", 10 : "X", 40 : "XL", 
>         		 50 : "L", 90 : "XC", 100 : "C", 400 : "CD", 500 : "D",
>         		 900 : "CM", 1000 : "M" }
>     result = ""
>     for value, numeral in sorted(numerals.items(), reverse=True):
>         while number >= value:
>             result += numeral
>             number -= value
>     return result
> 
> width = 80
> 
> # only two state variables
> preformatted = False
> list_counter = [0]
> 
> for line in sys.stdin:
> 	if line.startswith('```'):
> 		preformatted = not preformatted
> 		continue
> 
> 	if preformatted:
> 		print('\033[37m'+line+'\033[m', end='')
> 		continue
> 
> 	line = line.rstrip()
> 
> 	if line.startswith('=>'):
> 		parts = line[2:].strip().split(maxsplit=1)
> 		print('\033[36m'+parts[0]) # url
> 		print(parts[1]+'\033[m') # text
> 	elif line.startswith('#'):
> 		parts = line.split(maxsplit=1)
> 		depth = len(parts[0]) - 1
> 		colors = ['31', '93', '92', '34']
> 		color = colors[depth % len(colors)]
> 		print('\033['+color+'m'+line+'\033[m')
> 	elif line.startswith('*'):
> 		parts = line.split(maxsplit=1)
> 		depth = len(parts[0])
> 		text = textwrap.fill(parts[1], width)
> 		text = textwrap.indent(text, ' '*(2*depth)).lstrip()
> 		print(2*(depth-1)*' '+'\033[93m•\033[m '+text)
> 	elif line.startswith('+'):
> 		list_counter = list_counter if len(list_counter) > 0 else [0]
> 
> 		parts = line.split(maxsplit=1)
> 		depth = len(parts[0])
> 
> 		if depth > len(list_counter):
> 			list_counter += [0]
> 		elif depth < len(list_counter):
> 			list_counter = list_counter[:depth]
> 
> 		assert len(list_counter) == depth
> 
> 		marker = ''
> 
> 		counter_type = (len(list_counter) - 1) % 3
> 		list_counter[-1] += 1
> 		count = list_counter[-1]
> 		if counter_type == 0:
> 			marker = str(count)
> 		elif counter_type == 1:
> 			while True:
> 				count -= 1
> 				marker = chr(97+(count%26)) + marker
> 				count = count // 26
> 				if count == 0:
> 					break
> 		else:
> 			marker = int2roman(count)
> 
> 		text = textwrap.fill(parts[1], width)
> 		text = textwrap.indent(text, ' '*(3*depth)+' '*(len(marker)-1)).lstrip()
> 		print('\033[93m'+(depth-1)*3*' ' + marker + '. \033[m' + text)
> 	elif line.startswith('>'):
> 		depth = 0
> 		while True:
> 			if line.startswith('>'):
> 				line = line[1:].lstrip()
> 				depth += 1
> 			else:
> 				break
> 		text = textwrap.fill(line, width)
> 		text = textwrap.indent(text, '\033[93m>\033[m '*depth)
> 		print(text)
> 	elif line.startswith('---'):
> 		print('\033[37m'+'-'*width+'\033[m')
> 	else:
> 		print(textwrap.fill(line, width))
>
