A proposed scheme for parsing preformatted alt text

Luke Emmet <luke (a) marmaladefoo.com>

Hi All

We had an interesting discussion on the #Gemini IRC channel earlier 
today about a generalised scheme for parsing the alt text on 
preformatted regions, e.g.

 ```this is the alt text, not normally displayed to the end user
the preformatted content
 ```

It was a collective discussion, but I've written up some of the key 
points in a post here:

gemini://gemini.marmaladefoo.com/blog/7-Sep-2020_Parsing_preformatted_alt_text.gmi

Essentially the key design considerations are as follows:

1. By default the whole alt text can be used as a label (current behaviour)
2. Use CSS style syntax for the remainder, a familiar and low ritual syntax
3. Don't prescribe the attributes, allow practice to suggest them
4. Be backwards compatible and friendly to screen readers etc.

Two initial attributes seem to have obvious initial utility and could be 
used to effectively label content in a practical way:

content-type
lang

Best Wishes

  - Luke


Sean Conner <sean (a) conman.org>

It was thus said that the Great Luke Emmet once stated:
> Hi All
> 
> We had an interesting discussion on the #Gemini IRC channel earlier 
> today about a generalised scheme for parsing the alt text on 
> preformatted regions, e.g.
> 
> ```this is the alt text, not normally displayed to the end user
> the preformatted content
> ```
> 
> It was a collective discussion, but I've written up some of the key 
> points in a post here:
> 
> gemini://gemini.marmaladefoo.com/blog/7-Sep-2020_Parsing_preformatted_alt_text.gmi
> 
> Essentially the key design considerations are as follows:
> 
> 1. By default the whole alt text can be used as a label (current behaviour)
> 2. Use CSS style syntax for the remainder, a familiar and low ritual syntax

  I read the article linked, and I think a better format would be:

 ``` mumble mumble label text mumble; attribute1=value1; attribute2=value2

  Skip the CSS rules since they aren't used in Gemini, but the attributes
for MIME *are* used, and those use the format I've shown above.  If you can
parse MIME types, you can reuse *that* code to parse attributes.

 ```here is a table in CSV; content-type=text/csv; lang=en_US;
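
A minimal sketch of how a client might parse the MIME-style form shown above. The function name `parse_alt_text` is hypothetical, and the sketch assumes values never contain ';' or quoted strings (real MIME parameters may be quoted, which this does not handle). Alt text without attributes comes back as a plain label.

```python
def parse_alt_text(alt: str):
    """Split 'label; key=value; key=value' into (label, attributes).

    Hypothetical helper: assumes no quoted values and no ';' inside values.
    """
    parts = [part.strip() for part in alt.split(";")]
    label = parts[0]
    attributes = {}
    for part in parts[1:]:
        if not part:
            continue  # tolerate a trailing ';'
        key, sep, value = part.partition("=")
        if not sep:
            continue  # not of the form key=value, so ignore it
        attributes[key.strip().lower()] = value.strip()
    return label, attributes


label, attributes = parse_alt_text(
    "here is a table in CSV; content-type=text/csv; lang=en_US;"
)
print(label)
print(attributes)
```

A client that does not care about attributes can still show the first element as the label, or ignore the whole line.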

  The format for tables is *horrible* (at least in my opinion).  The format
I use to generate tables (ultimately in HTML) is the following:

*header1	header2	header3	header4
**footer1	footer2	footer3	footer4
row-11	row-12	row-13	row-14
row-21	row-22	row-23	row-24

  Basically, it's TAB-delimited text, so there's no worry about escaping
commas or dealing with quoted strings or any such nonsense.  I know, tabs
are horrible, don't ever use them, etc. etc. but it's *way* easier to deal
with tabs than just about any other character (short of the field separator
control characters).

  So your table example:

 ```Here is a table; content-type=text/tsv; lang=en
*+	1	2	3
1	2	3	4
2	3	4	5
3	4	5	6
 ```

  (Yeah, way easier to type than '|' between each field)
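
A rough sketch of reading such a TAB-delimited block. It assumes, as the examples elsewhere in this thread suggest, that a line starting with '*' is a header row and one starting with '**' is a footer row. That convention is specific to this one format and is not part of any TSV standard. The name `parse_tab_table` is hypothetical.

```python
def parse_tab_table(lines):
    """Split TAB-delimited lines into header, footer and body rows.

    Assumed convention: '**' prefix marks a footer row, '*' a header row.
    """
    header, footer, rows = [], [], []
    for line in lines:
        if line.startswith("**"):
            footer.append(line[2:].split("\t"))
        elif line.startswith("*"):
            header.append(line[1:].split("\t"))
        else:
            rows.append(line.split("\t"))
    return header, footer, rows


block = ["*+\t1\t2\t3", "1\t2\t3\t4", "2\t3\t4\t5", "3\t4\t5\t6"]
header, footer, rows = parse_tab_table(block)
print(header)
print(rows)
```

No quoting or escaping logic is needed, which is the point being made about tabs above.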

> 3. Don't prescribe the attributes, allow practice to suggest them
> 4. Be backwards compatible and friendly to screen readers etc.
> 
> two initial attributes seem to have obvious initial utility and could be 
> used to effectively label content in a practical way:
> 
> content-type
> lang

  Just my two zorkmids worth.  I don't really have a horse in this race, as
I don't really care for the current gemini text format anyway, and this is
adding complexity to a simple format, but that is solderpunk's call, not
mine.

  -spc


Kevin Sangeelee <kevin (a) susa.net>

Since my services have been retained by the Devil himself, I am obliged to
advocate on his behalf as follows:

Anything that adds text that's really only parseable by a machine is just a
teeny bit user-hostile.

The ideas being discussed go further, because (unless I misunderstand) they
encourage text in weird unaligned formats to be served up by default.

I would argue (on behalf of the Devil, of course) that mono-spaced
preformatted text is already perfect for aligning user-readable tables.
It's just awkward for machines to process semantically. There are however
reasonably reliable heuristics to figure out columns, should the client
want to add decoration, etc.

Perhaps the alt-text could be used to specify something like 'source:
/data/my_original.csv', to give a person-readable route to machine-readable
data that the client could use to fetch and render in-place on behalf of
the user.

Kevin

On Mon, 7 Sep 2020 at 00:21, Luke Emmet <luke at marmaladefoo.com> wrote:

> Hi All
>
> We had an interesting discussion on the #Gemini IRC channel earlier
> today about a generalised scheme for parsing the alt text on
> preformatted regions, e.g.
>
> ```this is the alt text, not normally displayed to the end user
> the preformatted content
> ```
>
> It was a collective discussion, but I've written up some of the key
> points in a post here:
>
> gemini://gemini.marmaladefoo.com/blog/7-Sep-2020_Parsing_preformatted_alt_text.gmi
>
> Essentially the key design considerations are as follows:
>
> 1. By default the whole alt text can be used as a label (current behaviour)
> 2. Use CSS style syntax for the remainder, a familiar and low ritual syntax
> 3. Don't prescribe the attributes, allow practice to suggest them
> 4. Be backwards compatible and friendly to screen readers etc.
>
> two initial attributes seem to have obvious initial utility and could be
> used to effectively label content in a practical way:
>
> content-type
> lang
>
> Best Wishes
>
>   - Luke
>


Luke Emmet <luke (a) marmaladefoo.com>


On 07-Sep-2020 01:47, Sean Conner wrote:
> I read the article linked, and I think a better format would be:
>
> ``` mumble mumble label text mumble; attribute1=value1; attribute2=value2
>
>    Skip the CSS rules since they aren't used in Gemini, but the attributes
> for MIME *are* used, and those use the format I've shown above.  If you can
> parse MIME types, you can reuse *that* code to parse attributes.
>
> ```here is a table in CSV; content-type=text/csv; lang=en_US;

That's a possibility. Either format is simple enough to parse, I think.

> The format for tables is *horrible* (at least in my opinion).  The format
> I use to generate tables (ultimately in HTML) is the following:
>
> *header1	header2	header3	header4
> **footer1	footer2	footer3	footer4
> row-11	row-12	row-13	row-14
> row-21	row-22	row-23	row-24

Yes I think people have jumped on the specific example of table parsing 
I gave. The example is a quote from Bouncepaw's original post, and was 
to illustrate primarily how the parameters are used in the alt text, not 
to propose a new format. Maybe that aspect wasn't clear enough.

The particular format for a table I don't endorse - CSV or TSV is more 
natural and the mime type is already defined.

I'll clarify the example so it's clear I'm not proposing a new format for 
text based tables.

>    So your table example:
>
> ```Here is a table; content-type=text/tsv; lang=en
> *+	1	2	3
> 1	2	3	4
> 2	3	4	5
> 3	4	5	6
> ```
>
>    (Yeah, way easier to type than '|' between each field)

Yes I agree, as you illustrate, using TSV or CSV is probably better.

> Just my two zorkmids worth. I don't really have a horse in this race, as
> I don't really care for the current gemini text format anyway, and this is
> adding complexity to a simple format, but that is solderpunk's call, not
> mine.

I'm not proposing this needs to be institutionalised in the spec 
(although that would be cool).

Rather, what I have in mind is more like a community 
practice. For example people are sometimes using unicode superscript or 
square bracket footnotes to indicate a citation marker for links.

Best Wishes

  - Luke


Luke Emmet <luke (a) marmaladefoo.com>

Hi Kevin

On 07-Sep-2020 10:06, Kevin Sangeelee wrote:
> Anything that adds text that's really only parseable by a machine is 
> just a teeny bit user-hostile.

Well the alt-text never gets shown to the user, it is invisible to them. 
Some clients might act on it (for example at the moment to show a tool tip).

My personal view is that the CSS style delimiters attribute1: value; is 
less hostile than others. So it was part of the design consideration. 
Those clients or users wanting a simple experience can just ignore it 
all anyway.

> The ideas being discussed go further, because (unless I misunderstand) 
> it encourages text in weird unaligned formats to be served up by default.
>
> I would argue (on behalf of the Devil, of course) that mono-spaced 
> preformatted text is already perfect for aligning user-readable 
> tables. It's just awkward for machines to process semantically. There 
> are however reasonably reliable heuristics to figure out columns, 
> should the client want to add decoration. etc.

Yes everyone seems to have noticed the strange table format of the 
example. That example is misleading, as the real intention was to show 
where the attribute lives in technical terms (quoting from Bouncepaw's 
original post). I'm not proposing a new table format - CSV or TSV is 
probably the most appropriate format.

> Perhaps the alt-text could be used to specify something like 'source: 
> /data/my_original.csv', to give a person-readable route to 
> machine-readable data that the client could use to fetch and render 
> in-place on behalf of the user.

That could be a possible attribute people could try out, which might be 
important in some contexts. Perhaps it might apply where source code is 
quoting from a larger source.

Best Wishes

  - Luke


easeout@tilde.team <easeout (a) tilde.team>

On Mon, Sep 07, 2020 at 05:36:08PM +0100, Luke Emmet wrote:
> On 07-Sep-2020 10:06, Kevin Sangeelee wrote:
> > Anything that adds text that's really only parseable by a machine is
> > just a teeny bit user-hostile.
> 
> Well the alt-text never gets shown to the user, it is invisible to them.
> Some clients might act on it (for example at the moment to show a tool tip).

I think it would be helpful to quote the spec here:

> Use of alt text is at the client's discretion, and simple clients may
> ignore it. Alt text is recommended for ASCII art or similar
> non-textual content which, for example, cannot be meaningfully
> understood when rendered through a screen reader or usefully indexed
> by a search engine. Alt text may also be used for computer source code
> to identify the programming language which advanced clients may use
> for syntax highlighting.

Alt text is recommended for, and I think named after, the role of alt
text in HTML <img>. That is, to be an alternative representation of the
content, like "Photograph of a woman on a horse". In that sense it is
meant to be text that users see! Just not every user in all
circumstances. HTML <img> alt text is meant in part for users that use
screen readers, but users nonetheless. So I would prefer we not go
adding extensions to alt text that prevent it from being always
human-readable.

(Refer to the alt attribute as documented here:)
https://developer.mozilla.org/en-US/docs/Web/HTML/Element/Img

The advanced use case cited in the spec matches the way in
GitHub-flavored Markdown you can write ```typescript to get TypeScript
syntax highlighting. While not an alternative content representation,
that is at least human-readable.

> My personal view is that the CSS style delimiters attribute1: value; is less
> hostile than others. So it was part of the design consideration. Those
> clients or users wanting a simple experience can just ignore it all anyway.

I agree it's less hostile than other possibilities, but if you're a user
who depends on alt text for content representation, it's going to be
jarring when some of the time you get code read aloud to you by a screen
reader, or... well, you get what I mean.

Really, I think what we have here is a field with two possible uses that
are at cross purposes, one for people and one for machines. Syntactically
it's based on something in Markdown that's for machine use, but it's
recommended for the human use. But the whole idea of it being for human
use is spoiled by the fact that it will grate on humans in cases when it
is not for their use.

I would rather the spec either make a call and pick one purpose, or omit
the field entirely, rather than leaving this conflict unresolved.

Re: machine-parseable tables in Gemtext, I acknowledge that was not
really your point :) But in general I think the discussion around tables
is a symptom of our wish to keep the spec small and not see it slowly
grow forever.

I believe that extensions, when popular enough, become de facto
standards and force the spec to grow more than we might otherwise
prefer, lest the de facto standard simply become the new spec. This
pressure is not necessarily a bad thing, but consider, an accessibility
feature like the human-oriented use of alt text is unlikely to become
popular enough to force its way in by a de facto standard, and would
therefore need the support of the base spec to see acceptance.


Ecmel Berk Canlıer <me (a) ecmelberk.com>

I just had this interesting formatting idea reading through this thread:


[modifiers...] [type] [:] <description>


This format lets the alt text be mostly human-readable while keeping
some semantics for machines to parse through.

As an example, all these lines should be valid:

"python code: Code to parse this format"
"table: Point leaderboard"
"Description without any machine readable parts"

The [modifiers...] part is a list of adjectives the pre-formatted text
has, like a programming language name, type of art (ascii / ansi /
unicode / whatever), or whatever else other people can think of. They are
all single words separated by spaces.

[type] is the type of the content in the pre-formatted text block. Like
"code", "table" or "art"

The [:] is just a separator between the machine-readable part and the
description. If a machine reads any ":", it will stop processing, as the
continuation of this line is for humans only.

<description> is the human-readable explanation of what's inside the
pre-formatted block. It can contain anything, including ":" characters
since the machines should stop parsing after the first one.

If a machine has any doubts (any unrecognized modifier or type, or maybe some 
other heuristic), it should halt parsing and display the entire line as
is, to make sure malformed input can still be understood by a human.
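
A minimal sketch of the rules above, under stated assumptions: the sets `KNOWN_TYPES` and `KNOWN_MODIFIERS` and the function `parse_alt` are hypothetical names, and the vocabulary is just the handful of words mentioned in this message. Anything the parser does not recognise falls back to showing the entire line as the description.

```python
KNOWN_TYPES = {"code", "table", "art"}
# Assumed starter vocabulary; real practice would decide what belongs here.
KNOWN_MODIFIERS = {"python", "ascii", "ansi", "unicode"}


def parse_alt(alt):
    """Parse '[modifiers...] [type] [:] <description>' with a safe fallback."""
    fallback = {"modifiers": [], "type": None, "description": alt.strip()}
    head, sep, description = alt.partition(":")  # stop at the first ':'
    if not sep:
        return fallback  # no machine-readable part at all
    words = head.lower().split()
    if not words or words[-1] not in KNOWN_TYPES:
        return fallback  # unknown type: show the whole line as is
    if any(word not in KNOWN_MODIFIERS for word in words[:-1]):
        return fallback  # unknown modifier: same fallback
    return {
        "modifiers": words[:-1],
        "type": words[-1],
        "description": description.strip(),
    }


for line in (
    "python code: Code to parse this format",
    "table: Point leaderboard",
    "Description without any machine readable parts",
):
    print(parse_alt(line))
```

Because only the first ":" is significant, a description may itself contain colons.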



I am not entirely sure how well this would actually work "in
production", but I feel like this is definitely something to throw out
here.

-- 
Have a nice (day|night|week(end)?)
~ Ecmel B. Canlıer ~


Luke Emmet <luke (a) marmaladefoo.com>



On 07-Sep-2020 19:16, easeout at tilde.team wrote:
> I think it would be helpful to quote the spec here:
>> Use of alt text is at the client's discretion, and simple clients may
>> ignore it. Alt text is recommended for ASCII art or similar
>> non-textual content which, for example, cannot be meaningfully
>> understood when rendered through a screen reader or usefully indexed
>> by a search engine. Alt text may also be used for computer source code
>> to identify the programming language which advanced clients may use
>> for syntax highlighting.
> Alt text is recommended for, and I think named after, the role of alt
> text in HTML<img>. That is, to be an alternative representation of the
> content, like "Photograph of a woman on a horse". In that sense it is
> meant to be text that users see! Just not every user in all
> circumstances. HTML<img>  alt text is meant in part for users that use
> screen readers, but users nonetheless. So I would prefer we not go
> adding extensions to alt text that prevent it from being always
> human-readable.
>
> (Refer to the alt attribute as documented here:)
> https://developer.mozilla.org/en-US/docs/Web/HTML/Element/Img
>
> The advanced use case cited in the spec matches the way in
> GitHub-flavored Markdown you can write ```typescript to get TypeScript
> syntax highlighting. While not an alternative content representation,
> that is at least human-readable.

Whilst a reference to the HTML spec is interesting, we are defining 
Gemini here, so the Gemini spec isn't beholden to HTML.

The spec mentions both applications of the tail of the "mode switch 
line" (to avoid a normative gloss), both as an alt text and to identify 
the programming language. So this duality is already envisaged in the spec.

> Really, I think what we have here is a field with two possible uses that
> are at cross purposes, one for people and one for machines. Syntactically
> it's based on something in Markdown that's for machine use, but it's
> recommended for the human use. But the whole idea of it being for human
> use is spoiled by the fact that it will grate on humans in cases when it
> is not for their use.
>
> I would rather the spec either make a call and pick one purpose, or omit
> the field entirely, rather than leaving this conflict unresolved.

I prefer that the spec allows for multiple uses. Primarily it is a label 
for human benefit, but that is not to say it cannot be comprehended by a 
machine.

This sort of relates to the other thread on the ML right now - about 
whether there could be a feed format based on gemtext - it would be 
based on certain conventions in the use of the format, without breaking 
gemtext at all.

I don't see how the spec can mandate the content that is in an 
informative part of the content -i.e. a label on an element.

Best wishes

  - Luke


Sandra Snan <sandra.snan (a) idiomdrottning.org>

This feels like the "hot comment" antipattern. Codifying a part of the
language that was meant for humans.
http://wiki.c2.com/?HotComments

As for table output, I've been really happy with how the unicode tables
from md2gemini look. I have used many many ASCII tables via org-mode and
pandoc markdown. Which I use depends on the source data.

For a table that's primarily meant to be seen and used (like the D&D
tables I've posted) by humans I think the unicode tables are great.

And, you _can_ extract the fields and records from them:
The fields are separated by │ characters and whitespace.

Screenreaders could set those characters to silent and then read off the
values in each row separately.
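
A small sketch of that extraction, assuming the tables use the box-drawing vertical bar "│" (U+2502) between fields and that border lines contain no such bar. Both are assumptions about the table style, and `extract_records` is a hypothetical name.

```python
def extract_records(table_lines, separator="\u2502"):
    """Pull cell values out of a box-drawn table, skipping border lines."""
    records = []
    for line in table_lines:
        if separator not in line:
            continue  # top, bottom and rule lines carry no data
        cells = [cell.strip() for cell in line.split(separator)]
        # The outer borders produce an empty first and last cell; drop them.
        if cells and cells[0] == "":
            cells = cells[1:]
        if cells and cells[-1] == "":
            cells = cells[:-1]
        records.append(cells)
    return records


table = [
    "┌────┬────────┐",
    "│ d6 │ Result │",
    "├────┼────────┤",
    "│ 1  │ Goblin │",
    "└────┴────────┘",
]
print(extract_records(table))
```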

For data that's meant for computer usage, it's easy to convert the
unicode tables to computable data, but a separate TSV file (or
sexps, XML or even JSON) is better.

That's kind of how I see the gem files, and I mean the following as a
warm compliment: they're glorified index listings. The "one link per
line" is a great structure for actually linking to actual files.

While I'm on the topic of one link per line, the whole numbers in the
paragraphs to refer to line links style is not cool. It's as if you want
inline links. If you wanted inline links then why didn't you put them in
the spec?

The spec was designed by someone who wanted text paragraphs followed by
(or preceded by) lists of links. _Pages_ are hypertext, but the prose
isn't.


easeout@tilde.team <easeout (a) tilde.team>

On Mon, Sep 07, 2020 at 08:31:58PM +0100, Luke Emmet wrote:
> On 07-Sep-2020 19:16, easeout at tilde.team wrote:
> > I think it would be helpful to quote the spec here:
> > > Use of alt text is at the client's discretion, and simple clients may
> > > ignore it. Alt text is recommended for ASCII art or similar
> > > non-textual content which, for example, cannot be meaningfully
> > > understood when rendered through a screen reader or usefully indexed
> > > by a search engine. Alt text may also be used for computer source code
> > > to identify the programming language which advanced clients may use
> > > for syntax highlighting.
> >
> > Alt text is recommended for, and I think named after, the role of alt
> > text in HTML<img>. That is, to be an alternative representation of the
> > content, like "Photograph of a woman on a horse". In that sense it is
> > meant to be text that users see! Just not every user in all
> > circumstances. HTML<img>  alt text is meant in part for users that use
> > screen readers, but users nonetheless. So I would prefer we not go
> > adding extensions to alt text that prevent it from being always
> > human-readable.
> > 
> > (Refer to the alt attribute as documented here:)
> > https://developer.mozilla.org/en-US/docs/Web/HTML/Element/Img
> > 
> > The advanced use case cited in the spec matches the way in
> > GitHub-flavored Markdown you can write ```typescript to get TypeScript
> > syntax highlighting. While not an alternative content representation,
> > that is at least human-readable.
> 
> Whilst a reference to the HTML spec is interesting, we are defining Gemini
> here, so the Gemini spec isn't beholden to HTML.

To be clear, I'm not saying we are beholden to HTML. But it is the basis
for the spec's vocabulary: In this case "alt text" is a term from HTML.
I'm pointing to HTML's <img> alt attribute because that's what I believe
Solderpunk was talking about when the spec mentioned "alt text". I think
that, and the text of the spec ("Alt text is recommended for..."), support
the notion that human-readable alternative content representation is the
primary point of the field.

> The spec mentions both applications of the tail of the "mode switch line"
> (to avoid a normative gloss), both as an alt text and to identify the
> programming language. So this duality is already envisaged in the spec.

It does, but my next point was that this two-uses-for-one-field
situation creates conflict between the uses with negative consequences
for users. It would be better regarded as a mistake in the spec worth
amending.

> > Really, I think what we have here is a field with two possible uses that
> > are at cross purposes, one for people and one for machines. Syntactically
> > it's based on something in Markdown that's for machine use, but it's
> > recommended for the human use. But the whole idea of it being for human
> > use is spoiled by the fact that it will grate on humans in cases when it
> > is not for their use.
> > 
> > I would rather the spec either make a call and pick one purpose, or omit
> > the field entirely, rather than leaving this conflict unresolved.
> 
> I prefer that the spec allows for multiple uses. Primarily it is a label for
> human benefit, but that is not to say it cannot be comprehended by a
> machine.

I hear you, but I don't see how that addresses the problem created for
users by having the field work two ways.

> This sort of relates to the other thread on the ML right now - about whether
> there could be a feed format based on gemtext - it would be based on certain
> conventions in the use of the format, without breaking gemtext at all.

As I understand it, a feed format based on Gemtext would be usable as a
plain Gemtext page by a plain browser, which is great. And for a feed
aggregator to use it as a subscription mechanism, you'd have to
formalize some piece of it. All of that sounds OK for that purpose.

But in this thread, I think we're talking about interpreting Gemtext alt text
differently, not for a special kind of client, but in regular
user-facing interactive browsers. And not on special documents like
feeds, but in general purpose Gemtext pages.

I think that difference makes these two cases not comparable. Adding
Gemtext-compatible formality for feed pages has no effect on other
Gemtext pages. Adding accessibility-incompatible formality for all
Gemtext pages could have a negative effect on particular users across
Gemini as a whole.

> I don't see how the spec can mandate the content that is in an informative
> part of the content -i.e. a label on an element.

I think what the spec can do is make a clearer recommendation. Maybe
we'd have two fields and they wouldn't have to step on each other's
toes.


Sean Conner <sean (a) conman.org>

It was thus said that the Great Sandra Snan once stated:
> 
> While I'm on the topic of one link per line, the whole numbers in the
> paragraphs to refer to line links style is not cool. It's as if you want
> inline links. If you wanted inline links then why didn't you put them in
> the spec?
>  
> The spec was designed by someone who wanted text paragraphs followed by
> (or preceded by) lists of links. _Pages_ are hypertext, but the prose
> isn't.

  Because it was clear that people *wanted* links in gopher?  Because there
are too many variations on Markdown already?  Because it's not that easy to
parse Markdown? [1]  Solderpunk wanted a way to include links and still make
it easy to parse.

  I do the numbering thing because I'm converting HTML to gemtext, and I
borrowed most of the code from my work in converting HTML to text for
gopher.  I like HTML for its hypertext capabilities.  But I found that it
was harder for me to convert HTML to gemtext than to plain text since
gemtext has *just* enough capabilities to make it seem easier, but not
enough to handle some of the seldom used tags (like <DL><DT><DD>) or even
simple nesting (a <BLOCKQUOTE> within a <BLOCKQUOTE>).

  I could serve HTML, but that would go against the grain of Gemini.

  So I'm looking at this post of mine:

        http://boston.conman.org/2020/07/28.1

and I'm wondering how I would do it differently.  I suppose instead of
(showing the rendered output, not the actual gemtext):

        On Saturday, I sent a message [1] to the party responsible for
        slamming my Gemini server [2] (one among several) and I've yet to
        receive any response.  I removed the block from the firewall, and I
        haven't seen any requests from said bot.  It looks to have been a
        one-off thing at this time.

        [1] /boston/2020/07/25.2
        [2] /boston/2020/07/24.2

        Weird.

        But then again, this is the Intarwebs, where weird things [1] happen
        all the time [2].

        [1] /boston/2006/10/30.1
        [2] /boston/2015/04/29.1

        At this point, I'm hoping it was fixed silently and it won't be an
        issue again.

I could do:

        On Saturday, I sent a message to the party responsible for slamming
        my Gemini server (one among several) and I've yet to receive any
        response.  I removed the block from the firewall, and I haven't seen
        any requests from said bot.  It looks to have been a one-off thing
        at this time.

        I sent a message
        slamming my Gemini server

        Weird.

        But then again, this is the Intarwebs, where weird things happen
        all the time.

        weird things
        all the time

        At this point, I'm hoping it was fixed silently and it won't be an
        issue again.

  I'm not sure how I feel about that.  It doesn't look as good to me as the
numbered links.  Would you prefer I just serve up text/html?  Not use HTML
at all?  Change the output to the sample above?

  And before I go, here are the links to all three versions (HTML, plain
text, gemtext):

	http://boston.conman.org/2020/07/28.1
	gopher://gopher.conman.org/0Phlog:2020/07/28.1
	gemini://gemini.conman.org/boston/2020/07/28.1

  -spc

[1]	Markdown, as initially defined by John Gruber also allowed arbitrary
	HTML.  I think most people either forget that detail, or don't know
	about it in the first place.


rjt <lists (a) ryliejamesthomas.net>

On 7/9/20 9:20 am, Luke Emmet wrote:
> Hi All
> 
> We had an interesting discussion on the #Gemini IRC channel earlier 
> today about a generalised scheme for parsing the alt text on 
> preformatted regions, e.g.
> 
> ```this is the alt text, not normally displayed to the end user
> the preformatted content
> ```
> 
> It was a collective discussion, but I've written up some of the key 
> points in a post here:
> 
> gemini://gemini.marmaladefoo.com/blog/7-Sep-2020_Parsing_preformatted_alt_text.gmi 
> 
> 
> Essentially the key design considerations are as follows:
> 
> 1. By default the whole alt text can be used as a label (current behaviour)
> 2. Use CSS style syntax for the remainder, a familiar and low ritual syntax
> 3. Don't prescribe the attributes, allow practice to suggest them
> 4. Be backwards compatible and friendly to screen readers etc.
> 
> two initial attributes seem to have obvious initial utility and could be 
> used to effectively label content in a practical way:
> 
> content-type
> lang
> 
> Best Wishes
> 
>   - Luke


It's interesting, but I feel like transforming preformatted text (like 
in your example of adding borders to a table) goes against the semantic 
idea of preformatted text. It becomes postformatted :) You've basically 
turned the preformatted text section into a general-purpose wrapper.

   --I keep going back-and-forth on whether-or-not it's a good idea
   though. On one hand: yeah, it's no longer preformatted; on the other
   it's not a dramatic change, and it makes tables actually legible; on
   the other, oh god, I see the return of table-based layouts--

Still, I think it's more within the spirit of Gemini to show the 
preformatted version, and link to a TSV/CSV/.ODS file for those that may 
want it.

Nitpick: I also think your examples misunderstand what alt text is for. 
'here is a table in csv' is not a useful description to a screenreader 
user. Alt text is not a 'simple label', but a description.

As an aside: Unfortunately I don't think there's enough semantic 
information in Gemtext to let screen readers describe tables well.

I'm all for a bit more discussion about adding metadata to Gemini files 
themselves (even if I suspect it would produce too much clutter). 
Perhaps a better approach is to not tie it to the preformatted text 
section though? Preformatted text doesn't need, say, a language 
attribute more than a quote or any other piece of text.


mbays@sdf.org <mbays (a) sdf.org>



>This feels like the "hot comment" antipattern. Codifying a part of the
>language that was meant for humans.
>http://wiki.c2.com/?HotComments

Yes. IIUC, gemtext deliberately avoided having a comment line type in 
order to avoid this problem. It was designed to be non-extensible. But 
preformatting toggle lines -- both the opening ones we're talking about 
here, and (to my mind even more worryingly) the closing ones -- seem to 
have ended up as potential extensibility hooks anyway, because most 
clients ignore any data there. (I haven't researched this properly, but 
I checked that at least Amfora, AV-98, and Bombadillo ignore it by 
default.)

So, with apologies for hijacking this thread with something totally 
opposed to its original idea, I'd like to encourage client authors to 
close this extensibility hole by not suppressing text after the "```" in 
preformatting toggle lines.

For the opening line that means displaying the alt text by default 
somehow. Even if the alt text is primarily intended for screenreaders, 
it could well be interesting to visual readers.

For the closing line, the spec says
> Any text following the leading "```" of a preformat toggle line which 
> toggles preformatted mode off MUST be ignored by clients.
I believe the intention is that having text there is an error (though 
this could definitely be made clearer). So I suggest that when a client 
encounters some text there, it displays some sort of (unobtrusive, but 
not too unobtrusive) warning indicating that there is an error in the 
gemtext. I know that with html it's traditional for renderers to be 
forgiving about syntax errors, but since this and invalid uris are the 
only ways gemtext can be invalid, I think it makes sense to be strict. 
If most clients just ignore the error and suppress the extra text, one 
day people will start using it as an extension mechanism, e.g. embedding 
base64-encoded images.
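
A minimal sketch of the client behaviour suggested above, not taken from any existing client. The function name `render_with_warnings` and the `[alt: ...]` display form are assumptions made for illustration. Alt text on an opening toggle line is shown, and any text after a closing toggle produces a warning instead of being silently dropped.

```python
def render_with_warnings(lines):
    """Return (rendered_lines, warnings) for a list of gemtext lines."""
    rendered, warnings = [], []
    preformatted = False
    for number, line in enumerate(lines, start=1):
        if line.startswith("```"):
            trailing = line[3:].strip()
            if not preformatted:
                if trailing:
                    rendered.append(f"[alt: {trailing}]")  # show, don't hide
            elif trailing:
                warnings.append(
                    f"line {number}: text after closing toggle: {trailing!r}"
                )
            preformatted = not preformatted
        else:
            rendered.append(line)
    return rendered, warnings


page = ["```ascii art of a cat", " /\\_/\\", "``` hidden payload", "after"]
rendered, warnings = render_with_warnings(page)
print(rendered)
print(warnings)
```

How prominently a real client shows the warning is a design choice; the sketch only collects the messages.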


Katarina Eriksson <gmym (a) coopdot.com>

<mbays at sdf.org> wrote:

> If most clients just ignore the error and suppress the extra text, one
> day people will start using it as an extension mechanism, e.g. embedding
> base64-encoded images.
>

Or something like the end of this:
https://lists.orbitalfox.eu/archives/gemini/2020/000930.html

-- 
Katarina

>

Link to individual message.

Luke Emmet <luke (a) marmaladefoo.com>



On 09-Sep-2020 18:38, mbays at sdf.org wrote:
>
> So, with apologies for hijacking this thread with something totally 
> opposed to its original idea, I'd like to encourage client authors to 
> close this extensibility hole by not suppressing text after the "```" 
> in preformatting toggle lines.

As far as I can see, the spec seems clear enough on this: "5.4.3: Any 
line whose first three characters are "```" (i.e. three consecutive back 
ticks with no leading whitespace) are preformatted toggle lines. These 
lines should NOT be included in the rendered output shown to the user."

  - Luke

Link to individual message.

easeout@tilde.team <easeout (a) tilde.team>

On Wed, Sep 09, 2020 at 10:41:12PM +0100, Luke Emmet wrote:
> 
> 
> On 09-Sep-2020 18:38, mbays at sdf.org wrote:
> > 
> > So, with apologies for hijacking this thread with something totally
> > opposed to its original idea, I'd like to encourage client authors to
> > close this extensibility hole by not suppressing text after the "```" in
> > preformatting toggle lines.
> 
> As far as I can see, the spec seems clear enough on this: "5.4.3: Any line
> whose first three characters are "```" (i.e. three consecutive back ticks
> with no leading whitespace) are preformatted toggle lines. These lines
> should NOT be included in the rendered output shown to the user."

This is an interesting detail: 5.4.3 goes on to say,

> Any text following the leading "```" of a preformat toggle line which
> toggles preformatted mode on MAY be interpreted by the client as "alt
> text" pertaining to the preformatted text lines which follow the toggle
> line. Use of alt text is at the client's discretion, and simple clients
> may ignore it. Alt text is recommended for ASCII art or similar
> non-textual content which, for example, cannot be meaningfully understood
> when rendered through a screen reader or usefully indexed by a search engine.

So on one hand, alt text is not to be included in the rendered output of
the preformat block, but alt text is recommended as alternative output
for the preformat block, when the usual rendered output is not useful to
the user.

I think that means clients should probably hide alt text normally, but
to make use of it, might have a way to reveal it. Tooltips or an alt
text visibility toggle function come to mind. Screen readers would want
to read alt text when focusing the preformat block.

Link to individual message.

mbays@sdf.org <mbays (a) sdf.org>



> On 09-Sep-2020 18:38, mbays at sdf.org wrote:
>> So, with apologies for hijacking this thread with something totally 
>> opposed to its original idea, I'd like to encourage client authors to 
>> close this extensibility hole by not suppressing text after the "```" 
>> in preformatting toggle lines.
> 
> As far as I can see, the spec seems clear enough on this: "5.4.3: Any line 
> whose first three characters are "```" (i.e. three consecutive back ticks 
> with no leading whitespace) are preformatted toggle lines. These lines 
> should NOT be included in the rendered output shown to the user."

That passage predates the later additions to the spec in the same 
section about what to do with text on these lines. I guess the intention 
was to make it clear that clients aren't expected to actually print 
"```". I don't think it should be read as ruling out printing the alt 
text.

Link to individual message.

Nathan Galt <mailinglists (a) ngalt.com>



> On Sep 9, 2020, at 4:28 PM, easeout at tilde.team wrote:
> 
> On Wed, Sep 09, 2020 at 10:41:12PM +0100, Luke Emmet wrote:
>> 
[snip]
>> Any text following the leading "```" of a preformat toggle line which
>> toggles preformatted mode on MAY be interpreted by the client as "alt
>> text" pertaining to the preformatted text lines which follow the toggle
>> line. Use of alt text is at the client's discretion, and simple clients
>> may ignore it. Alt text is recommended for ASCII art or similar
>> non-textual content which, for example, cannot be meaningfully understood
>> when rendered through a screen reader or usefully indexed by a search engine.
> 
> So on one hand, alt text is not to be included in the rendered output of
> the preformat block, but alt text is recommended as alternative output
> for the preformat block, when the usual rendered output is not useful to
> the user.
> 
> I think that means clients should probably hide alt text normally, but
> to make use of it, might have a way to reveal it. Tooltips [...]

I strongly recommend against this. The W3C has been trying to get HTML 
authors and browser makers to _only_ display, in tooltips, text in `title` attributes.

Previously, HTML authors would write `alt` attribute values intending them 
to be read by sighted HTML readers who can already see the image. If we 
encourage Gemini-browser authors to put Gemini Alt Text in some kind of 
tooltip visible to people who can already see the ASCII art or code or 
whatever, then all but the most accessibility-conscious authors will stuff 
easter-egg-type text in Gemini Alt Text, similar to the `title` attribute 
contents on each XKCD comic strip.

(You might be thinking “well, what about that Markdown meme where people 
write ‘```javascript’ to start off a JavaScript code block, with the idea 
that a syntax highlighter will read it and colorize the output?” I’d say 
that yes, that sort of thing _might_ be an imposition to blind users, but 
that sort of text is mercifully short and gives them a better idea of when 
they might want to hit a “skip just past the end of the preformatted-text 
block” keystroke combination and get on with the rest of the page.)

Link to individual message.

Alex // nytpu <alex (a) nytpu.com>

>I strongly recommend against this. The W3C has been trying to get HTML
>authors and browser makers to _only_ display, in tooltips, text in
>`title` attributes.  Previously, HTML authors would write `alt`
>attribute values intending them to be read by sighted HTML readers who
>can already see the image.
I strongly agree. While I don't view the W3C's recommendations as the
gold standard of things gemini should look to emulate (EME anyone?), I
agree that if we make the alt text easily visible it will no longer be
accessibility text and will turn into a miscellaneous-purpose field.

>(You might be thinking “well, what about that Markdown meme where
>people write ‘```javascript’ to start off a JavaScript code block, with
>the idea that a syntax highlighter will read it and colorize the
>output?”
The original point of suggesting that it be displayed for all users was
to discourage turning text intended for humans into text intended for
machines anyways, but I believe an alternate solution other than
displaying it for everybody would still be preferable.

Rewriting that portion of the spec to emphasize that it is not to be
parsed in any way other than as natural language?  Maybe say that a
client *should be able to* completely replace the preformatted block
with the contents of the alt text without the document losing
significant meaning? That would work for ascii art and short code
snippets, but it might not be doable for longer code blocks.

-- 
Alex // nytpu
alex at nytpu.com
GPG Key: https://www.nytpu.com/files/pubkey.asc
Key fingerprint: 43A5 890C EE85 EA1F 8C88 9492 ECCD C07B 337B 8F5B
https://e-mail.is-not-s.ms/

Link to individual message.

Gary Johnson <lambdatronic (a) disroot.org>

Just to throw another idea out there:

There has been some extended discussion now on the mailing list about a
conflict in machine-readable vs human-readable content being added after
the opening pre-formatting ``` characters. It seems that enough people
in the Gemini community see both of these kinds of information as
providing value, but the spec currently lacks a clear path forward for
differentiating between them.

Right now, anything written after the closing pre-formatting ``` chars
isn't being used as per the spec's instructions. I can't help but wonder
what would happen if we were able to put machine-readable instructions
(like "table", "image", "code:python") on the opening line (so that
clients could switch their line interpreter modes accordingly) and place
human-readable alt text (mainly for screen readers) on the closing line
(assuming the screen reader will probably just skip over the
pre-formatted block's contents anyway and then read the alt text if
provided).

I realize this would entail a fairly significant change to the spec, but
it seems - at least to me - to resolve the issue in a rather
straightforward fashion. If I'm missing something important here, please
let me know. I'm also interested in hearing anyone else's suggestions
for how to address this issue.
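
To make the proposal concrete, here is a minimal sketch of how a client might read the two lines separately. Nothing here is in the spec: the `PreBlock` type, the `parse_blocks` function, and the idea that a client would simply store both strings are assumptions for illustration only.

```python
from dataclasses import dataclass, field


@dataclass
class PreBlock:
    hint: str   # machine-readable text from the opening line, e.g. "code:python"
    alt: str    # human-readable text from the closing line
    lines: list = field(default_factory=list)


def parse_blocks(gemtext):
    """Collect preformatted blocks, keeping opening and closing text apart."""
    blocks = []
    current = None
    for line in gemtext.splitlines():
        if line.startswith("```"):
            rest = line[3:].strip()
            if current is None:
                current = PreBlock(hint=rest, alt="")
            else:
                current.alt = rest
                blocks.append(current)
                current = None
        elif current is not None:
            current.lines.append(line)
    # A block left unclosed at the end of the document is dropped in this sketch.
    return blocks
```

A screen reader front end could announce `alt` and skip `lines`, while a visual client could use `hint` to pick a rendering mode.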

Onward Geminauts!
  Gary


mbays at sdf.org writes:

> * Wednesday, 2020-09-09 at 22:41 +0100 - Luke Emmet <luke at marmaladefoo.com>:
>
>> On 09-Sep-2020 18:38, mbays at sdf.org wrote:
>>> So, with apologies for hijacking this thread with something totally
>>> opposed to its original idea, I'd like to encourage client authors
>>> to close this extensibility hole by not suppressing text after the
>>> "```" in preformatting toggle lines.
>> As far as I can see, the spec seems clear enough on this: "5.4.3:
>> Any line whose first three characters are "```" (i.e. three
>> consecutive back ticks with no leading whitespace) are preformatted
>> toggle lines. These lines should NOT be included in the rendered
>> output shown to the user."
>
> That passage predates the later additions to the spec in the same
> section about what to do with text on these lines. I guess the
> intention was to make it clear that clients aren't expected to
> actually print "```". I don't think it should be read as ruling out
> printing the alt text.


-- 
GPG Key ID: 7BC158ED
Use `gpg --search-keys lambdatronic' to find me
Protect yourself from surveillance: https://emailselfdefense.fsf.org
=======================================================================
()  ascii ribbon campaign - against html e-mail
/\  www.asciiribbon.org   - against proprietary attachments

Please avoid sending me MS-Office attachments.
See http://www.gnu.org/philosophy/no-word-attachments.html

Link to individual message.

A. E. Spencer-Reed <easrng (a) gmail.com>

What if we use a human readable *and* machine parsable format, something
like "This is code in python. It does ..." or "This is a table from
/data.csv. It contains information about..." and parse it by using
something like /this\s+is\s+(?:an?\s+)?(\S+)\s+(in|of|from)\s+(\S+)\./i You
can then get structured data like what kind of data it contains while still
preserving its use for assistive technologies.
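
As a rough illustration, the regular expression above can be tried in Python like this. The names `ALT_PATTERN` and `parse_alt` are made up for the example, and returning `None` for alt text that does not follow the pattern is an assumption, not part of the suggestion.

```python
import re

# Same pattern as in the message above, with /i expressed as re.IGNORECASE.
ALT_PATTERN = re.compile(
    r"this\s+is\s+(?:an?\s+)?(\S+)\s+(in|of|from)\s+(\S+)\.",
    re.IGNORECASE,
)


def parse_alt(alt_text):
    """Return (kind, preposition, detail), or None if the alt text is free-form."""
    match = ALT_PATTERN.search(alt_text)
    return match.groups() if match else None
```

With the two sample sentences this gives `("code", "in", "python")` and `("table", "from", "/data.csv")`. Note that a two-word kind such as "This is ASCII art of a cat." does not match, so a client would have to treat that alt text as plain prose.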

On Thu, Sep 10, 2020 at 12:43 PM Gary Johnson <lambdatronic at disroot.org>
wrote:

> Just to throw another idea out there:
>
> There has been some extended discussion now on the mailing list about a
> conflict in machine-readable vs human-readable content being added after
> the opening pre-formatting ``` characters. It seems that enough people
> in the Gemini community see both of these kinds of information as
> providing value, but the spec currently lacks a clear path forward for
> differentiating between them.
>
> Right now, anything written after the closing pre-formatting ``` chars
> isn't being used as per the spec's instructions. I can't help but wonder
> what would happen if we were able to put machine-readable instructions
> (like "table", "image", "code:python") on the opening line (so that
> clients could switch their line interpreter modes accordingly) and place
> human-readable alt text (mainly for screen readers) on the closing line
> (assuming the screen reader will probably just skip over the
> pre-formatted block's contents anyway and then read the alt text if
> provided).
>
> I realize this would entail a fairly significant change to the spec, but
> it seems - at least to me - to resolve the issue in a rather
> straightforward fashion. If I'm missing something important here, please
> let me know. I'm also interested in hearing anyone else's suggestions
> for how to address this issue.
>
> Onward Geminauts!
>   Gary
>
>
> mbays at sdf.org writes:
>
> > * Wednesday, 2020-09-09 at 22:41 +0100 - Luke Emmet <
> luke at marmaladefoo.com>:
> >
> >> On 09-Sep-2020 18:38, mbays at sdf.org wrote:
> >>> So, with apologies for hijacking this thread with something totally
> >>> opposed to its original idea, I'd like to encourage client authors
> >>> to close this extensibility hole by not suppressing text after the
> >>> "```" in preformatting toggle lines.
> >> As far as I can see, the spec seems clear enough on this: "5.4.3:
> >> Any line whose first three characters are "```" (i.e. three
> >> consecutive back ticks with no leading whitespace) are preformatted
> >> toggle lines. These lines should NOT be included in the rendered
> >> output shown to the user."
> >
> > That passage predates the later additions to the spec in the same
> > section about what to do with text on these lines. I guess the
> > intention was to make it clear that clients aren't expected to
> > actually print "```". I don't think it should be read as ruling out
> > printing the alt text.
>
>
> --
> GPG Key ID: 7BC158ED
> Use `gpg --search-keys lambdatronic' to find me
> Protect yourself from surveillance: https://emailselfdefense.fsf.org
> =======================================================================
> ()  ascii ribbon campaign - against html e-mail
> /\  www.asciiribbon.org   - against proprietary attachments
>
> Please avoid sending me MS-Office attachments.
> See http://www.gnu.org/philosophy/no-word-attachments.html
>


-- 
? <https://www.google.com/teapot>

Link to individual message.

mbays@sdf.org <mbays (a) sdf.org>



>>I strongly recommend against this. The W3C has been trying to get HTML 
>>authors and browser makers to _only_ display, in tooltips, text in 
>>`title` attributes.
>[...]if we make the alt text easily visible it will no longer be 
>accessibility text and will turn into a miscellaneous-purpose field.

Good point.

I guess it will be abused that way even if most clients offer no easy 
way to view it, as "easter egg" text for those who read the source. But 
yes, if the alt text is typically shown then this abuse could easily 
become normalised. So I retract my encouragement to always show alt 
text. Thanks for explaining my mistake!

How about if clients have an easily toggled switch between showing 
preformatted text and just showing alt text? I guess that would still 
lead to more easter-egging, but maybe not too much? It seems there's 
a tradeoff between discouraging inaccessible uses as human-readable 
text, and discouraging inaccessible uses as machine-readable text...

For the closing "```", I still think clients should enforce the rule 
that no further text on these lines is allowed.

But actually this suggests another possibility: adjust the spec to allow 
text on the closing line, and have it specifically interpreted as 
a subtitle to be rendered (however the client likes, e.g. centred). That 
could divert the pressure to misuse the alt text as a title, and would 
properly close the extensibility hole.

Link to individual message.

James Tomasino <tomasino (a) lavabit.com>

On 9/10/20 5:47 PM, mbays at sdf.org wrote:
> How about if clients have an easily toggled switch between showing
> preformatted text and just showing alt text? I guess that would still
> lead to more easter-egging, but maybe not too much? It seems there's a
> tradeoff between discouraging inaccessible uses as human-readable text,
> and discouraging inaccessible uses as machine-readable text...

I've been thinking about clients toggling the visibility of preformatted 
text. While it may not provide much value in a desktop client for sighted 
users, this could be very useful in mobile clients. Preformatted text is 
one of the troublesome areas that screws up displays on narrow screens. If 
a mobile client were to serve the alternate text instead then visitors 
could choose whether they want to expand it to see the preformatted content.

This sort of flow is exactly what a screen reader would be doing for a 
blind user. It serves up that alternate text first and the user then can 
decide whether it is worth the effort to dive into the contents further.

Maybe it will also help keep alt-text top-of-mind for content authors if 
they run into it themselves in the proper context.
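
A minimal sketch of that flow, assuming a client that tracks which blocks the user has expanded. The function name `render_collapsed`, the `expanded` set of block indexes, and the placeholder wording are all hypothetical.

```python
def render_collapsed(gemtext, expanded=frozenset()):
    """Render lines, showing alt text in place of each collapsed block.

    `expanded` holds the zero-based indexes of blocks the user has opened.
    """
    out = []
    in_pre = False
    index = -1
    for line in gemtext.splitlines():
        if line.startswith("```"):
            in_pre = not in_pre
            if in_pre:
                index += 1
                if index not in expanded:
                    # Fall back to a generic label when no alt text is given.
                    alt = line[3:].strip() or "preformatted text"
                    out.append("[" + alt + "] (collapsed)")
            continue
        if in_pre and index not in expanded:
            continue  # hide the block body until the user expands it
        out.append(line)
    return out
```

Calling it with `expanded={0}` shows the first block's contents instead of its alt text.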

Link to individual message.

A. E. Spencer-Reed <easrng (a) gmail.com>

When parsing closing lines with extra text after the "```", should
they just be treated as part of the block? ex. should
 ```Alt Text
1
 ```2
3
 ```
4

look like
+------------------------------+
| 1                            |
| ```2                         |
| 3                            |
+------------------------------+
4

or

+------------------------------+
| 1                            |
+------------------------------+
3
+------------------------------+
| 4                            |
+------------------------------+


On Thu, Sep 10, 2020 at 1:54 PM James Tomasino <tomasino at lavabit.com> wrote:
>
> On 9/10/20 5:47 PM, mbays at sdf.org wrote:
> > How about if clients have an easily toggled switch between showing
> > preformatted text and just showing alt text? I guess that would still
> > lead to more easter-egging, but maybe not too much? It seems there's a
> > tradeoff between discouraging inaccessible uses as human-readable text,
> > and discouraging inaccessible uses as machine-readable text...
>
> I've been thinking about clients toggling the visibility of preformatted
> text. While it may not provide much value in a desktop client for sighted
> users, this could be very useful in mobile clients. Preformatted text is
> one of the troublesome areas that screws up displays on narrow screens. If
> a mobile client were to serve the alternate text instead then visitors
> could choose whether they want to expand it to see the preformatted content.
>
> This sort of flow is exactly what a screen reader would be doing for a
> blind user. It serves up that alternate text first and the user then can
> decide whether it is worth the effort to dive into the contents further.
>
> Maybe it will also help keep alt-text top-of-mind for content authors if
> they run into it themselves in the proper context.
>


-- 
?

Link to individual message.

Nathan Galt <mailinglists (a) ngalt.com>



> On Sep 10, 2020, at 10:54 AM, James Tomasino <tomasino at lavabit.com> wrote:
> 
> On 9/10/20 5:47 PM, mbays at sdf.org wrote:
>> How about if clients have an easily toggled switch between showing
>> preformatted text and just showing alt text? I guess that would still
>> lead to more easter-egging, but maybe not too much? It seems there's a
>> tradeoff between discouraging inaccessible uses as human-readable text,
>> and discouraging inaccessible uses as machine-readable text...
> 
> I've been thinking about clients toggling the visibility of preformatted
> text. While it may not provide much value in a desktop client for sighted
> users, this could be very useful in mobile clients. Preformatted text is
> one of the troublesome areas that screws up displays on narrow screens. If
> a mobile client were to serve the alternate text instead then visitors
> could choose whether they want to expand it to see the preformatted content.
>
> This sort of flow is exactly what a screen reader would be doing for a
> blind user. It serves up that alternate text first and the user then can
> decide whether it is worth the effort to dive into the contents further.
>
> Maybe it will also help keep alt-text top-of-mind for content authors if
> they run into it themselves in the proper context.
> 

True, but another (probably better in most cases) way for narrow screens 
to handle wide preformatted blocks is to have just those blocks be 
side-scrollable. I’d rather use a client that has the occasional 
side-scrollable block instead of a client that makes me tap on alt text to 
display ``` blocks. Oftentimes I can get a better idea of what’s in the 
block, and whether I want to scroll to the right, just by looking at the left edge.

Link to individual message.

Luke Emmet <luke (a) marmaladefoo.com>



On 10-Sep-2020 19:53, Nathan Galt wrote:
>> I've been thinking about clients toggling the visibility of
>> preformatted text. While it may not provide much value in a desktop client
>> for sighted users, this could be very useful in mobile clients.
>> Preformatted text is one of the troublesome areas that screws up displays
>> on narrow screens. If a mobile client were to serve the alternate text
>> instead then visitors could choose whether they want to expand it to see
>> the preformatted content.
>>
>> This sort of flow is exactly what a screen reader would be doing for a
>> blind user. It serves up that alternate text first and the user then can
>> decide whether it is worth the effort to dive into the contents further.
>>
>> Maybe it will also help keep alt-text top-of-mind for content authors
>> if they run into it themselves in the proper context.
> True, but another (probably better in most cases) way for narrow screens
> to handle wide preformatted blocks is to have just those blocks be
> side-scrollable. I’d rather use a client that has the occasional
> side-scrollable block instead of a client that makes me tap on alt text to
> display ``` blocks. Oftentimes I can get a better idea of what’s in the
> block, and whether I want to scroll to the right, just by looking at the left edge.

The pleasure of Gemini is there can be a genuine diversity of clients - 
unlike the web.

We should not really be specifying client behaviour too much, but rather 
focus on Gemini as a protocol and exchange format intended for humans to 
be displayed in the way that makes sense for them. It is a real benefit 
that there is no server imposed presentation controls and styling 
(unlike HTML)

I'd love to see a Gemini client on KaiOS (offspring of FirefoxOS) which 
is growing rapidly as a simpler smart phone in many countries. In that 
case, an option to skip or keep preformatted text collapsed would make 
perfect sense to me.

What makes sense for one person on a large screen will be different to 
someone on a small phone, and so on for the myriad of users out there.

  - Luke

Link to individual message.

Sean Conner <sean (a) conman.org>

It was thus said that the Great A. E. Spencer-Reed once stated:
> When parsing closing lines with extra text after the "```", should
> they just be treated as part of the block? ex. should
> ```Alt Text
> 1
> ```2
> 3
> ```
> 4
> 
> look like
> +------------------------------+
> | 1                            |
> | ```2                         |
> | 3                            |
> +------------------------------+
> 4
> 
> or
> 
> +------------------------------+
> | 1                            |
> +------------------------------+
> 3
> +------------------------------+
> | 4                            |
> +------------------------------+

  The second example.  From the spec
(gemini://gemini.circumlunar.space/docs/specification.gmi):

	Any text following the leading "```" of a preformat toggle line which
	toggles preformatted mode off MUST be ignored by clients.
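
Read that way, every line beginning with three backticks flips the mode and any text after the backticks is not rendered. A minimal sketch of that reading (the `render` function and its `("pre" | "text", line)` output shape are just for illustration):

```python
def render(gemtext):
    """Return (kind, line) pairs, where kind is "pre" or "text"."""
    out = []
    in_pre = False
    for line in gemtext.splitlines():
        if line.startswith("```"):
            # Toggle lines are never rendered; text after the backticks
            # does not keep the line inside the block.
            in_pre = not in_pre
            continue
        out.append(("pre" if in_pre else "text", line))
    return out
```

For the six-line example in the question this yields "1" as preformatted, "3" as ordinary text, and "4" as preformatted, which matches the second rendering.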

  -spc

Link to individual message.

easeout@tilde.team <easeout (a) tilde.team>

On Thu, Sep 10, 2020 at 12:43:10PM -0400, Gary Johnson wrote:
> Just to throw another idea out there:
> 
> There has been some extended discussion now on the mailing list about a
> conflict in machine-readable vs human-readable content being added after
> the opening pre-formatting ``` characters. It seems that enough people
> in the Gemini community see both of these kinds of information as
> providing value, but the spec currently lacks a clear path forward for
> differentiating between them.
> 
> Right now, anything written after the closing pre-formatting ``` chars
> isn't being used as per the spec's instructions. I can't help but wonder
> what would happen if we were able to put machine-readable instructions
> (like "table", "image", "code:python") on the opening line (so that
> clients could switch their line interpreter modes accordingly) and place
> human-readable alt text (mainly for screen readers) on the closing line
> (assuming the screen reader will probably just skip over the
> pre-formatted block's contents anyway and then read the alt text if
> provided).

This is a possible solution to the conflict, and I don't mind it. I do
think machine-readable on the top line and human-readable on the bottom
line is natural too, because we put captions beneath figures, and
because the top slot is already used in e.g. GitHub Markdown for
language specifiers.

Link to individual message.

easeout@tilde.team <easeout (a) tilde.team>

On Thu, Sep 10, 2020 at 09:55:34AM -0600, Alex // nytpu wrote:
> >I strongly recommend against this. The W3C has been trying to get HTML
> >authors and browser makers to _only_ display, in tooltips, text in
> >`title` attributes.  Previously, HTML authors would write `alt`
> >attribute values intending them to be read by sighted HTML readers who
> >can already see the image.
> I strongly agree. While I don't view the W3C's recommendations as the
> gold standard of things gemini should look to emulate (EME anyone?), I
> agree that if we make the alt text easily visible it will no longer be
> accessibility text and will turn into a miscellaneous-purpose field.

Great points. My main goal is to address the fact that the spec
underdefines the alt text field. I want it to decide on one use such
that clients could depend on it always being suitable for that use.

Thanks for pointing out that what we think of as "alt text" in HTML is
not the same thing as the alternative representation used by screen
readers. In that light, I don't want Gemini's preformat alt text to be
machine-readable, nor do I want it to be an HTML alt text style tooltip
or caption.

I think it should behave like the "title" attribute. I want Gemini to
offer accessibility affordances where the nature of plain text does not
already provide them, and this seems like the right use for that. I
think it would also be suitable to change the name from "alt text" to
something like "accessible description".

> >(You might be thinking “well, what about that Markdown meme where
> >people write ‘```javascript’ to start off a JavaScript code block, with
> >the idea that a syntax highlighter will read it and colorize the
> >output?”
> The original point of suggesting that it be displayed for all users was
> to discourage turning text intended for humans into text intended for
> machines anyways, but I believe an alternate solution other than
> displaying it for everybody would still be preferable.

Yep, that is what I meant, to drop machine-readability in favor of human
readability.

> Rewriting that portion of the spec to emphasize that it is not to be
> parsed in any way other than as natural language?  Maybe say that a
> client *should be able to* completely replace the preformatted block
> with the contents of the alt text without the document losing
> significant meaning? That would work for ascii art and short code
> snippets, but it might not be doable for longer code blocks.

An accessible description for a long code block seems like a non-goal,
though, because when code is text, I imagine that makes it accessible to
a screen reader.

When considering these details it is helping me to draw a clear line
between an accessible content description and a caption. I think we want
the former, not the latter.

Link to individual message.

Sean Conner <sean (a) conman.org>

It was thus said that the Great easeout at tilde.team once stated:
> 
> This is a possible solution to the conflict, and I don't mind it. I do
> think machine-readable on the top line and human-readable on the bottom
> line is natural too, because we put captions beneath figures, and
> because the top slot is already used in e.g. GitHub Markdown for
> language specifiers.

  Luke Emmet wrote:

> The pleasure of Gemini is there can be a genuine diversity of clients -
> unlike the web.

  And back in the mid-90s, there *were* plenty of web clients.  Easily a
dozen that were easily available and that was back in time when it was easy
enough to parse HTML, there was no CSS and no Javascript, and it was
conceivable that someone could write a simple web browser in a weekend [1].

  Then Sandra Snan sent the following link:

	https://drewdevault.com/2020/03/18/Reckless-limitless-scope.html

Wherein it's mentioned that the current "state of the web" is described by
114,000,000 words spread across 1,217 standards documents.  You get here by
incremental changes, all of which are "easy" and "it would be so nice."

  Also, don't forget that Gemini can *easily* serve up HTML documents.  And
Markdown documents.  And PDF.  And a host of other documentation formats
that all do what you want to do.  And then some.

  But hey, I can play this game.  I added the following non-standard
document:

	gemini://gemini.conman.org/test/preformat.gemini

that contains "machine readable text" at the opening preformatted marker,
and a "human readable text" on the ending preformatted marker, just to give
an indication of what it might look like and what might be done with it. 
Enough talk, *someone* has to do an implementation to scare the bejeezus out
of everyone (not that it's particularly scary in what I did).

  -spc (HTML people.  Seriously, HTML.  You want your format, you have it
	already ... )

[1]	https://en.wikipedia.org/wiki/ViolaWWW

Link to individual message.

Nathan Galt <mailinglists (a) ngalt.com>



> On Sep 10, 2020, at 5:16 PM, easeout at tilde.team wrote:
> 
> On Thu, Sep 10, 2020 at 09:55:34AM -0600, Alex // nytpu wrote:
>>> I strongly recommend against this. The W3C has been trying to get HTML
>>> authors and browser makers to _only_ display, in tooltips, text in
>>> `title` attributes.  Previously, HTML authors would write `alt`
>>> attribute values intending them to be read by sighted HTML readers who
>>> can already see the image.
>> I strongly agree. While I don't view the W3C's recommendations as the
>> gold standard of things gemini should look to emulate (EME anyone?), I
>> agree that if we make the alt text easily visible it will no longer be
>> accessibility text and will turn into a miscellaneous-purpose field.
> 
> Great points. My main goal is to address the fact that the spec
> underdefines the alt text field. I want it to decide on one use such
> that clients could depend on it always being suitable for that use.
> 
> Thanks for pointing out that what we think of as "alt text" in HTML is
> not the same thing as the alternative representation used by screen
> readers. In that light, I don't want Gemini's preformat alt text to be
> machine-readable, nor do I want it to be an HTML alt text style tooltip
> or caption.
> 
> I think it should behave like the "title" attribute. I want Gemini to
> offer accessibility affordances where the nature of plain text does not
> already provide them, and this seems like the right use for that. I
> think it would also be suitable to change the name from "alt text" to
> something like "accessible description".

I think you’ve got the terminology backwards. In HTML, `alt` is for people 
who can’t see the image (didn’t download it, eyes don’t work right, etc.). 
In that light, renaming it to “accessible description” changes nothing, 
and “it should behave like the ‘title’ attribute” means, as far as I can 
tell, “it should be accessory information for people who can already see 
the image”, which is not even _trying_ to be helpful for people who can’t see the image.

Let me show an example where the contents of `alt` and `title` differ wildly:

<img src="conned-eventually.jpeg" alt="Chio telling Link 'Ya know, if you
keep doing everything everyone asks of you without question, you're gonna
get conned eventually?'" title="Little does Chio know I've already met Purah.">

The alt text describes the image (a screenshot of me playing The Legend of 
Zelda: Breath of the Wild).

The title text is a joke, hidden behind a tooltip in most browsers, that
should make anyone who's played BotW snicker briefly. (Purah treats Link
as an errand-boy for a good portion of the game.)

> 
>>> (You might be thinking "well, what about that Markdown meme where
>>> people write '```javascript' to start off a JavaScript code block, with
>>> the idea that a syntax highlighter will read it and colorize the
>>> output?"
>> The original point of suggesting that it be displayed for all users was
>> to discourage turning text intended for humans into text intended for
>> machines anyways, but I believe an alternate solution other than
>> displaying it for everybody would still be preferable.
> 
> Yep, that is what I meant, to drop machine-readability in favor of human
> readability.
> 
>> Rewriting that portion of the spec to emphasize that it is not to be
>> parsed in any way other than as natural language?  Maybe say that a
>> client *should be able to* completely replace the preformatted block
>> with the contents of the alt text without the document losing
>> significant meaning? That would work for ascii art and short code
>> snippets, but it might not be doable for longer code blocks.
> 
> An accessible description for a long code block seems like a non-goal,
> though, because when code is text, I imagine that makes it accessible to
> a screen reader.

I have a strong prior that most screen readers won't read code text aloud
in ways that would make sense to a (blind) programmer who's familiar with
that sort of code. For example, I would expect that most screen readers
won't read parentheses and single/double quotation marks out loud, and
those sorts of punctuation marks tend to be superlatively important in
roughly all programming languages.

And then there are blind non-programmers who might stumble across a bit of 
code, or blind programmers who don't understand _that_ kind of code and
would appreciate some short alternative text explaining what it does (this 
gives them a better idea of whether they should tough it out and listen to 
it all or just skip it).

> When considering these details it is helping me to draw a clear line
> between an accessible content description and a caption. I think we want
> the former, not the latter.

For what it's worth, I'm pro-alt-text. Tightly-bound captions might also
be nice (I'd use them if they were in the spec), but they're much, _much_
less important.

Link to individual message.

easeout@tilde.team <easeout (a) tilde.team>

On Thu, Sep 10, 2020 at 07:29:53PM -0700, Nathan Galt wrote:
> 
> > On Sep 10, 2020, at 5:16 PM, easeout at tilde.team wrote:
> > 
> > I think it should behave like the "title" attribute. I want Gemini to
> > offer accessibility affordances where the nature of plain text does not
> > already provide them, and this seems like the right use for that. I
> > think it would also be suitable to change the name from "alt text" to
> > something like "accessible description".
> 
> I think you've got the terminology backwards. In HTML, `alt` is for
> people who can't see the image (didn't download it, eyes don't work right,
> etc.). In that light, renaming it to "accessible description" changes
> nothing, and "it should behave like the 'title' attribute" means, as far
> as I can tell, "it should be accessory information for people who can
> already see the image", which is not even _trying_ to be helpful for
> people who can't see the image.
>
> Let me show an example where the contents of `alt` and `title` differ wildly:
>
> <img src="conned-eventually.jpeg" alt="Chio telling Link 'Ya know, if
> you keep doing everything everyone asks of you without question, you're
> gonna get conned eventually?'" title="Little does Chio know I've already met Purah.">
>
> The alt text describes the image (a screenshot of me playing The Legend
> of Zelda: Breath of the Wild).
>
> The title text is a joke, hidden behind a tooltip in most browsers, that
> should make anyone who's played BotW snicker briefly. (Purah treats Link
> as an errand-boy for a good portion of the game.)
> 

You're right, I did have the concepts reversed. Thanks for spelling it
out so clearly!

Link to individual message.

Nathan Galt <mailinglists (a) ngalt.com>


> On Sep 10, 2020, at 5:30 PM, Sean Conner <sean at conman.org> wrote:
> 
>  I added the following non-standard document:
> 
> 	gemini://gemini.conman.org/test/preformat.gemini
> 
> that contains "machine readable text" at the opening preformatted marker,
> and a "human readable text" on the ending preformatted marker, just to give
> an indication of what it might look like and what might be done with it. 
> Enough talk, *someone* has to do an implementation to scare the bejeezus out
> of everyone (not that it's particularly scary in what I did).
> 
>  -spc (HTML people.  Seriously, HTML.  You want your format, you have it
> 	already ... )

I like sets of concrete examples. Thanks for whipping this up.

What I dislike about this style of "'machine-readable' text up top" (for
some definition of "machine-readable") is that the alt-text function has
been entirely obliterated, at least in these examples.

For the two code bits at the top of the page, the alt text should be the 
contents of the captions at the bottom of each.

For the three "images", the alt-text should be something like:

- a dragon
- Merry Christmas
- a Christmas tree with a rabbit sitting near its base

It seems we have a "pick two" problem. We have (at least!) three different
annotation types that we'd like to adorn preformatted blocks with:

- machine-readable hints for how to parse a block (say, for syntax highlighting)
- alternative representations for people who can't see (/understand?) the
contents of the block
- captions of the block for people who can see it fine

And we have two potential slots to fill. Constraints:

- preexisting Markdown parsers expect their what-language-is-this hint 
right after the first ```
- alt text probably ought to come first, just in case the user is on an 
oversaturated 2 KB/s modem connection, or something. Plus, the Gemini spec
already says that This Is the Way™.

Personally, I think block captions (title-text equivalents) are the least 
important and should be the first to be shoved out the airlock. While HTML 
is better for their addition, people limped along OK before <aside> and 
<figure>+<figcaption> were added to HTML5.

Link to individual message.

Meff <meff (a) meff.me>

(Apologies for the double email Sean, I forgot to Reply All)

Sean Conner <sean at conman.org> writes:

>   And back in the mid-90s, there *were* plenty of web clients.  Easily a
> dozen that were easily available and that was back in time when it was easy
> enough to parse HTML, there was no CSS and no Javascript, and it was
> conceivable that someone could write a simple web browser in a weekend [1].
>
>   Then Sandra Snan sent the following link:
>
> 	https://drewdevault.com/2020/03/18/Reckless-limitless-scope.html
>
> Wherein it's mentioned that the current "state of the web" is described by
> 114,000,000 words spread across 1,217 standards documents.  You get here by
> incremental changes, all of which are "easy" and "it would be so
> nice."

I agree with this. I understand that there's a couple points here that
the alt-text discussion is trying to solve:


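* Formatting hints for non-human clients
** Specifically this benefits users that may be visually impaired and
   their use of tools such as screen readers
* Hints or descriptions for human readers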


I don't see a way out of this without making Gemtext much more
complicated than it is now. As it is, parsing Gemtext preformatted
blocks requires holding onto state, which no other portion of Gemtext
requires. Adding hints will make parsing these blocks even more
complicated. And then the interpretation question comes in: how do we
interpret these blocks that adorn preformatted text? Will these blocks
be abused? Will complicated clients adopt a de-facto meaning for these,
leaving simpler clients to wither?
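
To make the "holding onto state" point concrete, here is a minimal sketch of a Gemtext line parser in Python. The names `PreformattedBlock` and `parse_gemtext` are made up for illustration, and link, heading and list handling is left out on purpose; the only thing carried from one line to the next is whether a preformatted block is open.

```python
from dataclasses import dataclass, field


@dataclass
class PreformattedBlock:
    alt_text: str                      # text after the opening "```"
    lines: list = field(default_factory=list)


def parse_gemtext(text):
    """Split Gemtext into plain lines and preformatted blocks.

    The only state carried from one line to the next is `current`,
    the open preformatted block (or None when outside one).
    """
    items = []
    current = None
    for line in text.splitlines():
        if line.startswith("```"):
            if current is None:
                # Toggle on: the rest of the line is the alt text.
                current = PreformattedBlock(alt_text=line[3:].strip())
            else:
                # Toggle off: text after the closing marker is ignored.
                items.append(current)
                current = None
        elif current is not None:
            current.lines.append(line)   # kept verbatim, never re-parsed
        else:
            items.append(line)           # ordinary line, handled on its own
    if current is not None:
        items.append(current)            # tolerate a missing closing marker
    return items
```

Every other line type can be recognised by looking at that line alone, so any extra meaning attached to the toggle line adds work to the one place that is already stateful.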

There are so many more questions that can come up. If I'm trying to
represent data, should Gemtext convey semantic information? If I'm
rendering the contents of longform text, should Gemtext convey layout
information? And how do we reify layouts between different languages and
their narrative structures? Should Gemtext support compression for large
payloads? What about caching? (ETags come to mind.) I mean, if I'm
distributing copies of Project Gutenberg books, I don't want to force
someone to download uncompressed text when they can get the same content
compressed in often half the time, especially in low/bad connectivity
situations. Oh and how about math? I see lots of discussion about code,
but how do we represent math? How do we give Gemini browsers rendering
hints for math? The potential here to add complexity is endless!

>
>   Also, don't forget that Gemini can *easily* serve up HTML documents.  And
> Markdown documents.  And PDF.  And a host of other documentation formats
> that all do what you want to do.  And then some.
>
>   But hey, I can play this game.  I added the following non-standard
> document:
>
> 	gemini://gemini.conman.org/test/preformat.gemini
>
> that contains "machine readable text" at the opening preformatted marker,
> and a "human readable text" on the ending preformatted marker, just to give
> an indication of what it might look like and what might be done with it. 
> Enough talk, *someone* has to do an implementation to scare the bejeezus out
> of everyone (not that it's particularly scary in what I did).

Thanks for putting the rubber to the road!

I'm just not a fan of trying to bundle more in Gemtext. I'd rather we
try to diversify the formats of content that are available. I think HTML
is a great fallback that can answer almost all the questions I posed
earlier, the questions that are under discussion for alt text, and 
most questions of document presentation. HTML doesn't need to be deeply
tied into a DOM with gobs of Javascript and CSS to work. And there's
plaintext, PDFs, XML, JSON, tons of formats for all sorts of use
cases. I'd rather not shove a round peg into a square hole, but that's
just me.

- meff

Link to individual message.

Sean Conner <sean (a) conman.org>

It was thus said that the Great Nathan Galt once stated:
> 
> > On Sep 10, 2020, at 5:30 PM, Sean Conner <sean at conman.org> wrote:
> > 
> >  I added the following non-standard document:
> > 
> > 	gemini://gemini.conman.org/test/preformat.gemini
> > 
> > that contains "machine readable text" at the opening preformatted marker,
> > and a "human readable text" on the ending preformatted marker, just to give
> > an indication of what it might look like and what might be done with it. 
> > Enough talk, *someone* has to do an implementation to scare the bejeezus out
> > of everyone (not that it's particularly scary in what I did).
> > 
> >  -spc (HTML people.  Seriously, HTML.  You want your format, you have it
> > 	already ... )
> 
> I like sets of concrete examples. Thanks for whipping this up.
> 
> What I dislike about this style of "'machine-readable' text up top" (for
> some definition of "machine-readable") is that the alt-text function has
> been entirely obliterated, at least in these examples.
> 
> For the two code bits at the top of the page, the alt text should be the
> contents of the captions at the bottom of each.
> 
> For the three "images", the alt-text should be something like:
> 
> - a dragon
> - Merry Christmas
> - a Christmas tree with a rabbit sitting near its base

  Okay, check out

	gemini://gemini.conman.org/test/preformat-2.gemini

  -spc (Taking away my fun with the alt-text ... )

Link to individual message.

Gary Johnson <lambdatronic (a) disroot.org>

Thanks for putting together a simple example Gemtext file using the
machine-readable-on-top, alt-text-on-bottom approach I suggested
yesterday.

From the recent mailing list responses, it looks like my proposal was
met with mixed feelings. Some of you liked it. Others didn't and
suggested using HTML or Markdown instead. There also appear to be
several folks harboring fears of a slippery slope situation in which the
Gemini protocol, by means of supporting a machine-readable text line on
its preformatted text blocks, is somehow going to become as complex as
the modern-day web. I'll try to address some of these concerns here.

Just stepping back to first principles, I believe Solderpunk's intention
with the Gemtext spec was to provide a slightly more structured markup
format than plain text that could look nice in a more complex client but
which would still render just fine in a client that did nothing more
than render text/gemini as text/plain.

While I believe that Gemtext accomplishes this purpose very well in its
current state, the preformatted text block is definitely overloaded at
the moment in terms of its purpose. Currently, it is the only way to
include images (as ASCII art), tables, or source-code blocks in a
guaranteed mono-spaced font. All of these provide valuable information
to readers of our Gemtext pages, and I personally am quite happy with
how simple the preformatted blocks make it to include them.

However, one of the issues I first wrote to Solderpunk about when I
became aware of Gemini was whether Gemtext would support optional syntax
highlighting (in advanced clients that chose to support it) by allowing
us to use the ```some-programming-language syntax from Github-flavored
Markdown on the opening line of preformatted blocks. He replied to me
saying that this had been considered already and was intended as one of
the uses of that slot. He also pointed me to the point in the Gemini
spec which mentions this. Here is the relevant paragraph from Section
5.4.3 Preformatting toggle lines:

> Any text following the leading "```" of a preformat toggle line which
> toggles preformatted mode on MAY be interpreted by the client as "alt
> text" pertaining to the preformatted text lines which follow the
> toggle line. Use of alt text is at the client's discretion, and simple
> clients may ignore it. Alt text is recommended for ASCII art or
> similar non-textual content which, for example, cannot be meaningfully
> understood when rendered through a screen reader or usefully indexed
> by a search engine. Alt text may also be used for computer source code
> to identify the programming language which advanced clients may use
> for syntax highlighting.

For the three use cases of preformatted blocks that I mentioned above
(ASCII art images, tables, source code), alt text (as an accessibility
option) could clearly be useful and is also obviously the intended
purpose of this slot. However, the last line of that paragraph indicates
the intention that the ```some-programming-language is meant as one
machine-readable use of the line's contents for the purpose of syntax
highlighting.

To assuage any concerns about opening the door to endless possible
syntaxes for machine-readable processing of preformatted blocks being
introduced to Gemtext, I'll retract my previous suggestion of using the
top line for (generic) machine-readable purposes and the bottom line for
the alt text.

The only real machine-readable behavior I want to see in Gemini is
source code syntax highlighting anyway, and the spec already supports
and encourages that. In all other cases, alt text (which can clearly be
ignored or used "at the client's discretion" as per the spec) is just
fine on the top line of preformatted blocks. I suppose I don't really
see much machine-readable value in tagging a block as "image" or "table"
currently anyway. YMMV

To that end, maybe we just need some community agreement (and/or a
clearer codification in the Gemini spec) of how to use alt text "for
computer source code to identify the programming language which advanced
clients may use for syntax highlighting".

===========================================================================

Here's a simple proposal:

 ```clojure Source code implementing a non-tail-recursive factorial function
(defn fac [n]
  (if (= n 0)
    1
    (* n (fac (- n 1)))))
 ```

If the first word in a preformatted text block's alt text is the name of
a programming language recognized by the client, then it may (at its
discretion) apply syntax highlighting for that language to the block's
contents.

===========================================================================

This remains in line with both the spirit and content of Gemini
specification Section 5.4.3 Preformatting toggle lines. It also
explicitly allows clients to opt-in (or not). Whether or not syntax
highlighting is applied, the entire alt text line can still be used as
perfectly valid human-readable text for accessibility purposes.
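
As a rough sketch of how a client might apply this rule (Python; `KNOWN_LANGUAGES` and `split_alt_text` are hypothetical names, and the set of languages is only an example):

```python
# Hypothetical set of languages this particular client can highlight.
KNOWN_LANGUAGES = {"clojure", "lua", "python", "c"}


def split_alt_text(alt_text):
    """Return (language, label) for a preformatted block's alt text.

    `language` is the first word if the client recognises it, else None.
    `label` is always the whole alt text, so it stays usable as a
    human-readable description whether or not highlighting is applied.
    """
    label = alt_text.strip()
    words = label.split(maxsplit=1)
    first = words[0].lower() if words else ""
    language = first if first in KNOWN_LANGUAGES else None
    return language, label
```

A client that does no highlighting can ignore the first element and use the label alone.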

There is no possibility of a slippery slope happening in the Gemtext
specification because this is just a non-extensible, optional-only
feature that is already in the Gemini spec anyway and is just being more
clearly codified, so it can be used reliably in the wild.

I yield the floor.

Thanks,
  Gary


Meff <meff at meff.me> writes:

> (Apologies for the double email Sean, I forgot to Reply All)
>
> Sean Conner <sean at conman.org> writes:
>
>>   And back in the mid-90s, there *were* plenty of web clients.  Easily a
>> dozen that were easily available and that was back in time when it was easy
>> enough to parse HTML, there was no CSS and no Javascript, and it was
>> conceivable that someone could write a simple web browser in a weekend [1].
>>
>>   Then Sandra Snan sent the following link:
>>
>> 	https://drewdevault.com/2020/03/18/Reckless-limitless-scope.html
>>
>> Wherein it's mentioned that the current "state of the web" is described by
>> 114,000,000 words spread across 1,217 standards documents.  You get here by
>> incremental changes, all of which are "easy" and "it would be so
>> nice."
>
> I agree with this. I understand that there's a couple points here that
> the alt-text discussion is trying to solve:
>
> * Formatting hints for non-human clients
> ** Specifically this benefits users that may be visually impaired and
>    their use of tools such as screen readers
> * Hints or descriptions for human readers
>
> I don't see a way out of this without making Gemtext much more
> complicated than it is now. As it is, parsing Gemtext preformatted
> blocks requires holding onto state, which no other portion of Gemtext
> requires. Adding hints will make parsing these blocks even more
> complicated. And then the interpretation question comes in: how do we
> interpret these blocks that adorn preformatted text. Will these blocks
> be abused. Will complicated clients adopt a de-facto meaning for these,
> leaving simpler clients to wither?
>
> There are so many more questions that can come up. If I'm trying to
> represent data, should Gemtext convey semantic information? If I'm
> rendering the contents of longform text, should Gemtext convey layout
> information? And how do we reify layouts between different languages and
> their narrative structures? Should Gemtext support compression for large
> payloads? What about caching? (ETags come to mind.) I mean, if I'm
> distributing copies of Project Gutenberg books, I don't want to force
> someone to download uncompressed text when they can get the same content
> compressed in often half the time, especially in low/bad connectivity
> situations. Oh and how about math? I see lots of discussion about code,
> but how do we represent math? How do we give Gemini browsers rendering
> hints for math? The potential here to add complexity is endless!
>
>>
>>   Also, don't forget that Gemini can *easily* serve up HTML documents.  And
>> Markdown documents.  And PDF.  And a host of other documentation formats
>> that all do what you want to do.  And then some.
>>
>>   But hey, I can play this game.  I added the following non-standard
>> document:
>>
>> 	gemini://gemini.conman.org/test/preformat.gemini
>>
>> that contains "machine readable text" at the opening preformatted marker,
>> and a "human readable text" on the ending preformatted marker, just to give
>> an indication of what it might look like and what might be done with it. 
>> Enough talk, *someone* has to do an implementation to scare the bejeezus out
>> of everyone (not that it's particularly scary in what I did).
>
> Thanks for putting the rubber to the road!
>
> I'm just not a fan of trying to bundle more in Gemtext. I'd rather we
> try to diversify the formats of content that are available. I think HTML
> is a great fallback that can answer almost all the questions I posed
> earlier, the questions that are under discussion for alt text, and 
> most questions of document presentation. HTML doesn't need to be deeply
> tied into a DOM with gobs of Javascript and CSS to work. And there's
> plaintext, PDFs, XML, JSON, tons of formats for all sorts of use
> cases. I'd rather not shove a round peg into a square hole, but that's
> just me.
>
> - meff


-- 
GPG Key ID: 7BC158ED
Use `gpg --search-keys lambdatronic' to find me
Protect yourself from surveillance: https://emailselfdefense.fsf.org
=======================================================================
()  ascii ribbon campaign - against html e-mail
/\  www.asciiribbon.org   - against proprietary attachments

Please avoid sending me MS-Office attachments.
See http://www.gnu.org/philosophy/no-word-attachments.html

Link to individual message.

Luke Emmet <luke (a) marmaladefoo.com>



On 11-Sep-2020 18:28, Gary Johnson wrote:
> [...]
>
> To assuage any concerns about opening the door to endless possible
> syntaxes for machine-readable processing of preformatted blocks being
> introduced to Gemtext, I'll retract my previous suggestion of using the
> top line for (generic) machine-readable purposes and the bottom line for
> the alt text.
>
> The only real machine-readable behavior I want to see in Gemini is
> source code syntax highlighting anyway, and the spec already supports
> and encourages that. In all other cases, alt text (which can clearly be
> ignored or used "at the client's discretion" as per the spec) is just
> fine on the top line of preformatted blocks. I suppose I don't really
> see much machine-readable value in tagging a block as "image" or "table"
> currently anyway. YMMV
>
> To that end, maybe we just need some community agreement (and/or a
> clearer codification in the Gemini spec) of how to use alt text "for
> computer source code to identify the programming language which advanced
> clients may use for syntax highlighting".

Whilst I think it is nice to support a practice of source code language 
labelling (to assist syntax highlighting), I think it would be 
insufficient to cover current usage practice.

In particular, I'm thinking of ANSI markup that some authors sprinkle in 
their content.

ANSI codes are effectively platform-specific formatting instructions 
(for example foreground, background colours) that are unique to a 
terminal type client.

If authors wish to use these in preformatted regions, they really should 
be hinting this to the client so that the client can take appropriate 
steps to render (or ignore) the terminal ANSI codes. The interpretation 
of these terminal escape codes is not treating the content as plain 
text, but rather to take a particular content interpretation, to drive 
the visual UI. Similar to embedded <font> tags, you might say.

So this in itself suggests the need to be able to hint at content-type 
in the preformatted region.

e.g.

 ```text/x-ansi
(ANSI marked up content in color etc)
 ```

(or perhaps ```content-type: text/x-ansi to label it correctly)

Or are we going to say these implementation terminal escape codes are 
left as an ad-hoc convention? That seems to have its own risks as 
discussed on this thread and elsewhere.
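
A sketch of how a client could read either form of the hint (Python; `content_type_hint` is a hypothetical helper, and the pattern is a simplified type/subtype check rather than a full media-type grammar):

```python
import re

# Simplified "type/subtype" pattern, e.g. "text/x-ansi".
MIME_RE = re.compile(r"^[A-Za-z0-9][A-Za-z0-9.+-]*/[A-Za-z0-9][A-Za-z0-9.+-]*$")


def content_type_hint(alt_text):
    """Return the hinted media type from the alt text, or None."""
    text = alt_text.strip()
    if text.lower().startswith("content-type:"):
        text = text[len("content-type:"):].strip()
    return text.lower() if MIME_RE.match(text) else None
```

A terminal client might pass a `text/x-ansi` block through untouched, while a GUI client could strip or translate the escape codes; anything that does not look like a media type falls back to being an ordinary label.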

Best Wishes


  - Luke

Link to individual message.

Nathan Galt <mailinglists (a) ngalt.com>


> On Sep 11, 2020, at 12:44 AM, Sean Conner <sean at conman.org> wrote:
> 
> It was thus said that the Great Nathan Galt once stated:
>> 
>>> On Sep 10, 2020, at 5:30 PM, Sean Conner <sean at conman.org> wrote:
>>> 
>>> I added the following non-standard document:
>>> 
>>> 	gemini://gemini.conman.org/test/preformat.gemini
>>> 
>>> that contains "machine readable text" at the opening preformatted marker,
>>> and a "human readable text" on the ending preformatted marker, just to give
>>> an indication of what it might look like and what might be done with it. 
>>> Enough talk, *someone* has to do an implementation to scare the bejeezus out
>>> of everyone (not that it's particularly scary in what I did).
>>> 
>>> -spc (HTML people.  Seriously, HTML.  You want your format, you have it
>>> 	already ... )
>> 
>> I like sets of concrete examples. Thanks for whipping this up.
>> 
>> What I dislike about this style of "'machine-readable' text up top" (for
>> some definition of "machine-readable") is that the alt-text function has
>> been entirely obliterated, at least in these examples.
>> 
>> For the two code bits at the top of the page, the alt text should be the
>> contents of the captions at the bottom of each.
>> 
>> For the three "images", the alt-text should be something like:
>> 
>> - a dragon
>> - Merry Christmas
>> - a Christmas tree with a rabbit sitting near its base
> 
>  Okay, check out
> 
> 	gemini://gemini.conman.org/test/preformat-2.gemini
> 
>  -spc (Taking away my fun with the alt-text ... )

Much better.

I don't know if "image" before the ASCII-art images is, or would be, useful to anything.

It seems a little weird to see "code Lua" instead of "lua", and I don't
know how easy it would be to adjust syntax-coloring libraries to account
for this, but this nitpick is largely immaterial.

Like with the last one, thanks for making a concrete example.

Link to individual message.

Nathan Galt <mailinglists (a) ngalt.com>



> On Sep 11, 2020, at 11:36 AM, Luke Emmet <luke at marmaladefoo.com> wrote:
> 
> 
> 
> On 11-Sep-2020 18:28, Gary Johnson wrote:
>> [...]
>> 
>> To assuage any concerns about opening the door to endless possible
>> syntaxes for machine-readable processing of preformatted blocks being
>> introduced to Gemtext, I'll retract my previous suggestion of using the
>> top line for (generic) machine-readable purposes and the bottom line for
>> the alt text.
>> 
>> The only real machine-readable behavior I want to see in Gemini is
>> source code syntax highlighting anyway, and the spec already supports
>> and encourages that. In all other cases, alt text (which can clearly be
>> ignored or used "at the client's discretion" as per the spec) is just
>> fine on the top line of preformatted blocks. I suppose I don't really
>> see much machine-readable value in tagging a block as "image" or "table"
>> currently anyway. YMMV
>> 
>> To that end, maybe we just need some community agreement (and/or a
>> clearer codification in the Gemini spec) of how to use alt text "for
>> computer source code to identify the programming language which advanced
>> clients may use for syntax highlighting".
> 
> Whilst I think it is nice to support a practice of source code language
> labelling (to assist syntax highlighting), I think it would be
> insufficient to cover current usage practice.
>
> In particular, I'm thinking of ANSI markup that some authors sprinkle in
> their content.
>
> ANSI codes are effectively platform-specific formatting instructions
> (for example foreground, background colours) that are unique to a
> terminal type client.
>
> If authors wish to use these in preformatted regions, they really should
> be hinting this to the client so that the client can take appropriate
> steps to render (or ignore) the terminal ANSI codes. The interpretation of
> these terminal escape codes is not treating the content as plain text, but
> rather to take a particular content interpretation, to drive the visual
> UI. Similar to embedded <font> tags, you might say.
>
> So this in itself suggests the need to be able to hint at content-type
> in the preformatted region.
> 
> e.g.
> 
> ```text/x-ansi
> (ANSI marked up content in color etc)
> ```
> 
> (or perhaps ```content-type: text/x-ansi to label it correctly)
> 
> Or are we going to say these implementation terminal escape codes are
> left as an ad-hoc convention? That seems to have its own risks as
> discussed on this thread and elsewhere.
> 
> Best Wishes
> 
> 
> - Luke

[shock and horror that people are using ANSI codes for color]

Prior reading:

=> https://en.wikipedia.org/wiki/Escape_character#ASCII_escape_character
=> https://the.exa.website/ a modern ls(1) replacement

I think ANSI color codes are up to 24-bit color now. Not all terminals
support them (Terminal.app doesn't; iTerm2 does), but they're out there. I
was looking up color codes so I could make my EXA_COLORS variable nicer
and the whole process wasn't pleasant.

Sounds like a good reason to explicitly disallow U+001B in the text/gemini spec and:

- give dirty looks to any page author that uses it
- give dirty looks to any client author that doesn't strip it out before
presenting it to the user (whether the client is terminal-based or in a GUI)

Link to individual message.

Sean Conner <sean (a) conman.org>

It was thus said that the Great Nathan Galt once stated:
> 
> > On Sep 11, 2020, at 12:44 AM, Sean Conner <sean at conman.org> wrote:
> > 
> >  Okay, check out
> > 
> > 	gemini://gemini.conman.org/test/preformat-2.gemini
> 
> Much better.
> 
> I don't know if "image" before the ASCII-art images is, or would be,
> useful to anything.

  It would prevent a screen reader from reading "circumflex circumflex next
line slash backslash slash slash backslash nextline ..."

> It seems a little weird to see "code Lua" instead of "lua", and I don't
> know how easy it would be to adjust syntax-coloring libraries to account
> for this, but this nitpick is largely immaterial.

  Do you really want a client to have to list all these languages?

	https://en.wikipedia.org/wiki/List_of_programming_languages

  At least with the prefix "code" the client can know it's source code, even
if it doesn't know what the language is.  And having the language can let a
client syntax highlight for those languages it does know.  That was my
reasoning.
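
  A small sketch of that reasoning in Python. It assumes the alt text looks like "code Lua", which is only what this thread describes; `classify` and `HIGHLIGHTABLE` are hypothetical names, not anything from the test document.

```python
# Hypothetical set of languages this client can highlight.
HIGHLIGHTABLE = {"lua", "c"}


def classify(alt_text):
    """Return (kind, language) from alt text such as "code Lua".

    kind is "code" when the first word is "code", else "other".
    language is set only when the client can highlight it.
    """
    words = alt_text.split()
    if not words or words[0].lower() != "code":
        return "other", None
    language = words[1].lower() if len(words) > 1 else None
    if language not in HIGHLIGHTABLE:
        language = None        # unknown language, but still known to be code
    return "code", language
```

  The client needs no list of every language to tell code from other content; the language list only matters for the ones it chooses to highlight.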

> Like with the last one, thanks for making a concrete example.

  You're welcome.

  -spc

Link to individual message.

Sean Conner <sean (a) conman.org>

It was thus said that the Great Nathan Galt once stated:
> 
> On Sep 11, 2020, at 11:36 AM, Luke Emmet <luke at marmaladefoo.com> wrote:
> > 
> > Or are we going to say these implementation terminal escape codes are
> > left as an ad-hoc convention? That seems to have its own risks as
> > discussed on this thread and elsewhere.
> 
> [shock and horror that people are using ANSI codes for color]

  I've come across ECMA-48 [1] code usage on both gopher and Gemini.  It is
being done.

> Prior reading:
> 
> => https://en.wikipedia.org/wiki/Escape_character#ASCII_escape_character
> => https://the.exa.website/ a modern ls(1) replacement
> 
> I think ANSI color codes are up to 24-bit color now. Not all terminals
> support them (Terminal.app doesn't; iTerm2 does), but they're out there. I
> was looking up color codes so I could make my EXA_COLORS variable nicer
> and the whole process wasn't pleasant.
> 
> Sounds like a good reason to explicitly disallow U+001B in the text/gemini
> spec and:

  Ban ESC and I can *still* send the codes.  The sequence '<ESC>[' is the
CONTROL SEQUENCE INTRODUCER and is only one of two ways it is represented. 
The other way is with codepoint 155 [2].  To be truly safe, you need to
filter out all control codes (control set 0, from 0 to 31; 127, which is
technically not in any control set; and control set 1, from 128 to 159) with
the exception of HT (horizontal tab), CR (carriage return) and LF (line
feed).

  -spc (Did a deep dive into this a few years ago ... )

[1]	ECMA-48 is the actual standard describing these codes.  ANSI got out
	of the game in the 80s (or maybe very early 90s) if I recall
	correctly.

[2]	In any of the ISO 8-bit character sets, this is character 155.
	Unicode also uses this value, but when encoded as UTF-8, it's
	represented as the bytes 194,155.
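
To illustrate the two forms of the CONTROL SEQUENCE INTRODUCER, here is a
small Python sketch. The constant and function names are illustrative only,
and whether a given terminal actually honours the single-character form is
a separate question:

```python
ESC_CSI = "\x1b["  # 7-bit form: ESC followed by '['
C1_CSI = "\x9b"    # 8-bit form: codepoint 155 (U+009B)


def contains_csi(text):
    """True if either representation of CSI appears in the text."""
    return ESC_CSI in text or C1_CSI in text


def utf8_bytes(ch):
    """Return the UTF-8 encoding of a character as a list of byte values."""
    return list(ch.encode("utf-8"))
```

A check that only looks for ESC would let a string such as "\x9b31m"
through, since it contains no U+001B at all.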

Link to individual message.

Nathan Galt <mailinglists (a) ngalt.com>



> On Sep 11, 2020, at 12:57 PM, Sean Conner <sean at conman.org> wrote:
> 
> It was thus said that the Great Nathan Galt once stated:
>> 
>>> On Sep 11, 2020, at 12:44 AM, Sean Conner <sean at conman.org> wrote:
>>> 
>>> Okay, check out
>>> 
>>> 	gemini://gemini.conman.org/test/preformat-2.gemini
>> 
>> Much better.
>> 
>> I don't know if "image" before the ASCII-art images is, or would be,
>> useful to anything.
> 
>  It would prevent a screen reader from reading "circumflex circumflex next
> line slash backslash slash slash backslash nextline ..."

Oh, huh. My assumption would be that screenreaders wouldn't read anything 
in preformatted-text blocks if there were any alt text available.

(Yes, my assumption is that 99.9999% of the time, the page will download 
faster than a screenreader can speak.)

Back in HTML land, if an image has no alt attribute at all, the usual 
screenreader behavior is to read the filename out loud. Because this is 
time-wasting noise 99.999% of the time, web authors are repeatedly urged 
to add `alt=""` (empty alt attributes) to images that blind people don't 
need to care about (purely presentational ones, for example).

At any rate, if I were blind, I'd want a "skip past the preformatted block 
I'm in" if my client were set to read out preformatted-text blocks. I have 
no idea how hard this would be to program in a GUI-based Gemini client, 
though, for any OS.
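
As a rough illustration of the behaviour assumed above, here is a Python
sketch of a speech front end that reads the alt text in place of the block
body whenever alt text is present. The function name and the
read_preformatted switch are made up for this example, and real screen
readers may well behave differently:

```python
FENCE = "`" * 3  # the text/gemini preformatting toggle marker


def lines_to_speak(gemtext, read_preformatted=True):
    """Return the lines a speech client would read aloud.

    A preformatted block with alt text is replaced by that alt text.
    A block without alt text is read only if read_preformatted is True.
    """
    spoken = []
    in_pre = False
    skip_body = False
    for line in gemtext.splitlines():
        if line.startswith(FENCE):
            if not in_pre:
                # Opening toggle: anything after the marker is the alt text.
                alt = line[len(FENCE):].strip()
                if alt:
                    spoken.append(alt)
                skip_body = bool(alt) or not read_preformatted
            in_pre = not in_pre
            continue
        if in_pre and skip_body:
            continue
        spoken.append(line)
    return spoken
```

Because the whole block is dropped in one pass, a "skip this block" command
would amount to setting skip_body for the block currently being read.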

>> It seems a little weird to see " code Lua" instead of "lua", and I don't
>> know how easy it would be to adjust syntax-coloring libraries to account
>> for this, but this nitpick is largely immaterial.
> 
>  Do you really want a client to have to list all these languages?

Seems doable to me:

$ bat --list-languages | wc -l
     147

...and <https://github.com/sharkdp/bat> is merely a supercharged cat(1) clone.

<https://pygments.org/> boasts support for "over 500" languages/text formats.

> At least with the prefix "code" the client can know it's source code, even
> if it doesn't know what the language is.  And having the language can let a
> client syntax highlight for those languages it does know.  That was my
> reasoning.

Good choice.

Link to individual message.

Nathan Galt <mailinglists (a) ngalt.com>



> On Sep 11, 2020, at 1:16 PM, Sean Conner <sean at conman.org> wrote:
> 
> It was thus said that the Great Nathan Galt once stated:
>> 
>> On Sep 11, 2020, at 11:36 AM, Luke Emmet <luke at marmaladefoo.com> wrote:
>>> 
>>> Or are we going to say these implementation terminal escape codes are
>>> left as an ad-hoc convention? That seems to have its own risks as
>>> discussed on this thread and elsewhere.
>> 
>> [shock and horror that people are using ANSI codes for color]
> 
>  I've come across ECMA-48 [1] code usage on both gopher and Gemini.  It is
> being done.
> 
>> Prior reading:
>> 
>> => https://en.wikipedia.org/wiki/Escape_character#ASCII_escape_character
>> => https://the.exa.website/ a modern ls(1) replacement
>> 
>> I think ANSI color codes are up to 24-bit color now. Not all terminals
>> support them (Terminal.app doesn't; iTerm2 does), but they're out there. I
>> was looking up color codes so I could make my EXA_COLORS variable nicer
>> and the whole process wasn't pleasant.
>> 
>> Sounds like a good reason to explicitly disallow U+001B in the text/gemini
>> spec and:
> 
>  Ban ESC and I can *still* send the codes.  The sequence '<ESC>[' is the
> CONTROL SEQUENCE INTRODUCER and is only one of two ways it is represented. 
> The other way is with codepoint 155 [2].  To be truly safe, you need to
> filter out all control codes (control set 0, from 0 to 31; 127, which is
> technically not in any control set; and control set 1, from 128 to 159) with
> the exception of HT (horizontal tab), CR (carriage return) and LF (line
> feed).
> 

=> https://en.wikipedia.org/wiki/C0_and_C1_control_codes

Ooh, good catch. Yeah, ban 'em all except \t, \r, and \n.
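
A minimal Python sketch of such a filter, assuming the text has already
been decoded from UTF-8 into a string. The names are illustrative and this
is not taken from any existing client:

```python
ALLOWED = {"\t", "\r", "\n"}  # HT, CR and LF are kept


def is_control(ch):
    """True for C0 (0-31), DEL (127) and C1 (128-159) codepoints."""
    cp = ord(ch)
    return cp <= 0x1F or cp == 0x7F or 0x80 <= cp <= 0x9F


def strip_controls(text):
    """Drop every control character except tab, carriage return, line feed."""
    return "".join(ch for ch in text if ch in ALLOWED or not is_control(ch))
```

This removes both the ESC and the codepoint 155 forms of the introducer,
leaving the now-harmless parameter characters (for example "[31m") behind
as ordinary text.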

Link to individual message.

James Tomasino <tomasino (a) lavabit.com>

On 9/11/20 8:22 PM, Nathan Galt wrote:
> Oh, huh. My assumption would be that screenreaders wouldn't read
> anything in preformatted-text blocks if there were any alt text available.
> 
> (Yes, my assumption is that 99.9999% of the time, the page will download
> faster than a screenreader can speak.)
> 
> Back in HTML land, if an image has no alt attribute at all, the usual
> screenreader behavior is to read the filename out loud. Because this is
> time-wasting noise 99.999% of the time, web authors are repeatedly urged
> to add `alt=""` (empty alt attributes) to images that blind people don't
> need to care about (purely presentational ones, for example).
> 
> At any rate, if I were blind, I'd want a "skip past the preformatted
> block I'm in" if my client were set to read out preformatted-text blocks.
> I have no idea how hard this would be to program in a GUI-based Gemini
> client, though, for any OS.

I've invited the Rhapsode maintainer to join us on the mailing list so we 
can get the perspective of a dev of an actual accessible gemini client. I 
hope that will help guide our assumptions in this area better. My own 
expectations of how a screen reader would work differ a bit from what 
you're saying, but rather than muddy the waters more I'll wait and hope 
Adrian jumps in here.

Link to individual message.

Nathan Galt <mailinglists (a) ngalt.com>


> On Sep 11, 2020, at 1:33 PM, James Tomasino <tomasino at lavabit.com> wrote:
> 
> On 9/11/20 8:22 PM, Nathan Galt wrote:
>> Oh, huh. My assumption would be that screenreaders wouldn't read
>> anything in preformatted-text blocks if there were any alt text available.
>> 
>> (Yes, my assumption is that 99.9999% of the time, the page will
>> download faster than a screenreader can speak.)
>> 
>> Back in HTML land, if an image has no alt attribute at all, the usual
>> screenreader behavior is to read the filename out loud. Because this is
>> time-wasting noise 99.999% of the time, web authors are repeatedly urged
>> to add `alt=""` (empty alt attributes) to images that blind people don't
>> need to care about (purely presentational ones, for example).
>> 
>> At any rate, if I were blind, I'd want a "skip past the preformatted
>> block I'm in" if my client were set to read out preformatted-text blocks.
>> I have no idea how hard this would be to program in a GUI-based Gemini
>> client, though, for any OS.
> 
> I've invited the Rhapsode maintainer to join us on the mailing list so
> we can get the perspective of a dev of an actual accessible gemini client.
> I hope that will help guide our assumptions in this area better. My own
> expectations of how a screen reader would work differ a bit from what
> you're saying, but rather than muddy the waters more I'll wait and hope
> Adrian jumps in here.

Good call. I've used VoiceOver occasionally on iOS (it can be difficult to 
see your phone's screen when you're donating platelets), but I'm mostly a 
tourist when it comes to designing things for blind people.

I have a lot more experience using voice _control_ on Windows Vista, but I 
haven't noticed any potential issues with the format that might seriously 
impact people whose arms don't work right for long periods of time.

Link to individual message.

easeout@tilde.team <easeout (a) tilde.team>

On Thu, Sep 10, 2020 at 10:31:51PM -0700, Meff wrote:
> (Apologies for the double email Sean, I forgot to Reply All)
> 
> Sean Conner <sean at conman.org> writes:
> 
> >   And back in the mid-90s, there *were* plenty of web clients.  Easily a
> > dozen that were easily available and that was back in time when it was easy
> > enough to parse HTML, there was no CSS and no Javascript, and it was
> > conceivable that someone could write a simple web browser in a weekend [1].
> >
> >   Then Sandra Snan sent the following link:
> >
> > 	https://drewdevault.com/2020/03/18/Reckless-limitless-scope.html
> >
> > Wherein it's mentioned that the current "state of the web" is described by
> > 114,000,000 words spread across 1,217 standards documents.  You get here by
> > incremental changes, all of which are "easy" and "it would be so
> > nice."
> 
> I agree with this. I understand that there's a couple points here that
> the alt-text discussion is trying to solve:
> 
> * Formatting hints for non-human clients
> ** Specifically this benefits users that may be visually impaired and
>    their use of tools such as screen readers
> * Hints or descriptions for human readers
> 
> I don't see a way out of this without making Gemtext much more
> complicated than it is now.

Here are a few options that avoid complicating Gemtext.

1. Use alt text only as an accessible content description for humans.
   (not a tooltip or caption)
2. Use alt text only as a formatting hint for machine use. Rename it.
3. Get rid of alt text entirely.

All of these options attempt to solve the problem that alt text
currently has multiple uses and would not be reliable to a user who
needed it for a particular use. Any of these options would mean altering
the spec. I like option 1 best.

The reason I didn't mind syntax-on-top, alt-on-bottom was that it solved
the problem of one field with multiple uses, by making them two fields.
However it would complicate Gemtext and I agree we don't need to
do that.

Link to individual message.

Timur Ismagilov <bouncepaw2 (a) yandex.ru>

I've written a gemlog post on the topic of extending gemtext and using 
something other than gemtext. I hope it'll be of interest.
=> gemini://tanelorn.city/~bouncepaw/gemlog/html-over.gemini

Link to individual message.

Katarina Eriksson <gmym (a) coopdot.com>

<easeout at tilde.team> wrote:

> I think it would also be suitable to change the name from "alt text" to
> something like "accessible description".


Changing the name to something to do with accessibility is something I
agree with. While we're at it, let's also change the name of the "```"
lines to "plain text mode switch" so that it doesn't remind us of <pre>
elements which are blocks.

-- 
Katarina

Link to individual message.

easeout@tilde.team <easeout (a) tilde.team>

On Sat, Sep 12, 2020 at 06:11:53PM +0200, Katarina Eriksson wrote:
> <easeout at tilde.team> wrote:
> 
>> I think it would also be suitable to change the name from "alt text" to
>> something like "accessible description".
> 
> Changing the name to something to do with accessibility is something I
> agree with.

For what it's worth, when I suggested the name change I thought that
HTML IMG "alt" was the tooltip and "title" was the accessible
description. But in fact that is backwards; alt text is the accessible
description. So at this point I don't think a name change is necessary.
I'm not opposed to renaming, though, if it would avoid repeating the
misunderstanding I had.

Link to individual message.

Nathan Galt <mailinglists (a) ngalt.com>


> On Sep 12, 2020, at 1:40 PM, easeout at tilde.team wrote:
> 
> On Sat, Sep 12, 2020 at 06:11:53PM +0200, Katarina Eriksson wrote:
>> <easeout at tilde.team> wrote:
>> 
>>> I think it would also be suitable to change the name from "alt text" to
>>> something like "accessible description".
>> 
>> Changing the name to something to do with accessibility is something I
>> agree with.
> 
> For what it's worth, when I suggested the name change I thought that
> HTML IMG "alt" was the tooltip and "title" was the accessible
> description. But in fact that is backwards; alt text is the accessible
> description. So at this point I don't think a name change is necessary.
> I'm not opposed to renaming, though, if it would avoid repeating the
> misunderstanding I had.

=> https://stackoverflow.com/questions/1734806/ Background reading and linkage

Most graphical browsers displayed the contents of the `alt` attribute in a 
tooltip for a decade+, in addition to showing it when images hadn't loaded 
(yet). People used to write "alt text" for people who may or may not be 
able to see the image.

Then Firefox said "No, we're not doing that. It misleads HTML authors into 
thinking that the value of the `alt` attribute is for people who can see 
the image already. If you want tooltips, put your tooltip text into the 
`title` attribute; Internet Explorer handles that just fine, too."

Much wailing and gnashing of teeth ensued, but just about everyone got over it.

I don't think many people today are liable to get their wires crossed like 
this. Referring to tooltip text as "alt text" strikes me as a late-90s 
affectation, and I don't think new authors are liable to slip into this 
mistake easily.

Link to individual message.
