
[SPEC] Backwards-compatible metadata in Gemini

Omar Polo op at omarpolo.com

Thu Feb 25 20:56:27 GMT 2021

- - - - - - - - - - - - - - - - - - - 

John Cowan <cowan at ccil.org> writes:

On Thu, Feb 25, 2021 at 4:17 AM Omar Polo <op at omarpolo.com> wrote:
- we have TLS because it's fundamental to guarantee confidentiality
between servers and clients
I personally don't give a damn about this (after all, who's going to pass
confidential information over Gemini? See below.), but I accept that other
people do.

"confidentiality" maybe was the wrong word. The idea is that I don'twant to let everyone in the same network see (and possibly hijack) thepages I visit. (OK, there are tons of possible issues with this, givenTOFU and how specific clients implements it etc, I don't want to get offtopic though.)

- we have status codes, because a page that says "an error occurred"
or "certificate required" cannot be interpreted correctly otherwise
They aren't actually part of the required protocol engine, except for 1x
vs. 2x. The META for the others is just human-readable content, and could
be replaced by a 2x document, and some other mechanism could be found for
the marginal case of 1x. (Clients might exploit 6x to automatically retry
with a (different) cert, but it's unclear that this is the Right Thing: the
protocol document is, as usual, ambiguous, as it says "should be retried"
but not who should retry it.)

I'm not sure I understand what you mean. If my server fails to execute a CGI script, should it return a 20 reply with "Error while executing the script" instead of 42? The meta is human-readable, but the codes 3x, 4x, 5x and 6x carry a meaning; it's extra information. Heck, you can consider that metadata in some broad interpretation of the word. (Maybe 3x can be dropped, and some codes inside 4x and 5x as well if we really want to be barebones, but again, I don't want to get off-topic.)
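
To make the point about status codes concrete, here is a rough sketch of how a client might read a response header. The category names follow the Gemini specification; the function name `parse_header` and the error handling are only assumptions made for this illustration, not part of any existing client.

```python
# Status categories as named in the Gemini specification. The first digit
# alone tells a client how to react, whatever the human-readable META says.
CATEGORIES = {
    "1": "input",
    "2": "success",
    "3": "redirect",
    "4": "temporary failure",
    "5": "permanent failure",
    "6": "client certificate required",
}


def parse_header(line):
    """Split a header such as '42 Error while executing the script'.

    Hypothetical helper: returns (status, category, meta).
    """
    status, _, meta = line.rstrip("\r\n").partition(" ")
    well_formed = len(status) == 2 and status.isascii() and status.isdigit()
    if not well_formed or status[0] not in CATEGORIES:
        raise ValueError(f"malformed response header: {line!r}")
    return status, CATEGORIES[status[0]], meta
```

With a 20 reply carrying the same error text, the client would have no way to tell the failure apart from an ordinary page; with 42 the category is available before anyone reads the META.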

- we have pre-formatted blocks to allow certain types of
explanations/presentations that otherwise would have been impossible
(how do we teach how to write text/gemini in text/gemini?)
It's still hard to explain ``` within ```: you can talk about it but you
can't show an example.

Touché

Anyway, I don't care about any of these points, just that "necessary" is in
the eye of the beholder.
- the one adding the line-type =: (or whatever): you have to parse the
whole document to extract the metadata
True, but that's true for any approach except metadata-at-the-top. Note
that ^^^ within ``` does not mark metadata, so you have to parse at least
that much.
and it allows for possibly unreadable text/gemini files[1]
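
A rough sketch of what "parse the whole document" means for metadata lines that may appear anywhere. The `=:` prefix is the one proposed in this thread; since its exact syntax is not settled, the sketch only returns the raw text after the prefix. The function name `scattered_metadata` is made up for the example.

```python
def scattered_metadata(text, prefix="=:"):
    """Collect every line starting with `prefix`, wherever it appears.

    Lines inside pre-formatted blocks (toggled by ```) are skipped, so the
    whole document has to be scanned while tracking that toggle.
    """
    entries = []
    preformatted = False
    for line in text.splitlines():
        if line.startswith("```"):
            # A ``` line switches pre-formatted mode on or off.
            preformatted = not preformatted
        elif not preformatted and line.startswith(prefix):
            entries.append(line[len(prefix):].strip())
    return entries
```

Nothing here can stop early: a metadata line may still follow the last paragraph, which is the cost being discussed.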

[...]

you probably will have trouble here

FYI, Emacs handled that wall of text surprisingly well :)

(just out of curiosity, what languages are they? I couldn't recognize them)

Workers at the Tower of Babel:
"Bitte geben Sie mir einen kleineren Schraubenschlüssel.'
'Non ho idea di quello che stai chiedendo.’

Adesso ne ho una, o almeno mezza, sperando che la traduzione sia fedele :)

(sorry, I couldn't resist. I don't often see examples featuring Italian)

‘Давайте чашку чая и домой.’
‘Kei te korero koe i tito noa, ko ahau ngenge o te whare pourewa.'
There. A perfectly valid text/gemini or indeed text/plain document, and
yet quite unintelligible, and it would be trivial to make it far worse.
The question is, what is the _motive_ for writing and serving such rubbish?
Web pages lie to search engines because money. Web pages contain hidden
text because money. Web pages plant tracking pixels because money. The
love of money is the root of all Web evil. But where's the money in
Gemini? I mean, you could put textual ads in your pages and a link to a
website, but why not just use the website?

The motive for serving rubbish can simply be some sloppiness when generating automatic content. If I know `=:'-style metadata can be put anywhere, I can write scripts that *for convenience* don't try to put everything at the top/bottom. But I agree, this is probably a moot point. The real issue I see lies in just adding an explicit syntax for key-value pairs. I explain this better below.

A user on an unsophisticated client cannot (easily) understand
that. It's just full of bloat.
Again, you'd have to be a fool to write something like your example or mine
except for hack value. Reasonable people would either put metadata at the
bottom of the document or the most important entries at the top and less
important ones at the bottom; it's the possibility of doing that, along
with allowing links-with-metadata, that make me want it to be able to go
anywhere. Gemini is like (anarchist) Anarres: everything is open to
everyone's . The Web has become like (capitalist) Urras: there is lots of
glitz on top, but the important stuff is hidden in the cellars, where
people are bleeding to death.
Repeating myself: metadata conventions should go in a metadata spec, *not*
in the text/gemini spec, since neither clients nor servers are required to
take any notice of them.
[snip]
One thing that I hadn't thought about when writing the mail, but only
later when discussing the matter with thfr@, is that we're trying to
hide stuff from users' eyes. Sure, if used correctly those two proposed
syntaxes (=: and ^^^) can be easy to read, but let's be honest: clients
won't show them as-is, in particular the more advanced ones.
I'm sorry, but I am unwilling to take this for granted. Why hide them?
(I'm already annoyed that Gmail hides signatures, and mine aren't always
the same, so I don't use the standard "-- " line to announce them any more.)
As things stand now, there are only two things that Gemini clients
usually hide: the URL of a link-line and the alt-text of a pre-formatted
block.
I actually think hiding the URL is bad UX. People should be able to know
*before* going to a page where it's coming from. Showing the domain is not
enough, because multi-hosting. (This is an option in Lagrange now.)
There's an understandable UX reason for that, but do we really
need to add something else that we know will be hidden to end users?
"Know" is a very strong verb, especially for something that hasn't happened
yet. Metadata in HTML head elements was *required* to be hidden from day 1.
[]
Another thing that I forgot to explicitly say in my previous message is
that we can use some sort of common notation, a convention, rather than
adding new things to the specification. See for instance the
"Subscribing to Gemini pages" companion specification: a lightweight,
convention-based way to provide atom-like feeds. I found it pretty
elegant, and has proven a) easy to implement b) easy for content writers
to use c) easy for end-users to consume and d) avoid adding extra line
types/file types/etc to the specification.
That's what I want too. =: does not have to be a text/gemini line type,
just a convention explained elsewhere.

I feel like we're not talking about the same thing. This thread has grown pretty large, so I apologize if I missed some parts of it.

My understanding of the proposals is that they want to add some sort of extra notation (either as line-type or block, it doesn't matter) to express (potentially) arbitrary key-value pairs. Given this, we know that non-dummy clients will hide them in the document, probably to show that information in a "better" way. I mean, it wouldn't be bad if a client were able to display in a sidebar the authors of the document and the creation/update dates. But then we have given users ways to extend the format way beyond its purpose, so it gets way easier to add styling, formatting, and other stuff.

But from what you write, it seems that you're aiming at some sort of convention for metadata, and this is something I can actually agree on. To make it clear, since in another mail you mentioned two "factions": I'm not against the idea of metadata in the first place, I simply don't like the existing proposals because of the issues already mentioned.

Even if I'm not 100% happy with what follows, if I were to add metadata, I'd advocate for something simpler, without a prefix that smells like an "unofficial line-type", and probably already in use, like:

  Author: Omar Polo
  Published: 2020-02-24
  Edited: 2020-02-25
  Licence: ISC

(even the FAQ document has a Last-Update line near the top, or something like that)

With something like this I think we could achieve metadata in a way that:

 - is not hidden from the end-user's eyes
 - is easy (and natural) to use for those who write content
 - is intuitive for readers
 - is backward- and forward-compatible with the spec
 - doesn't require special treatment by already existing clients

Something like this would also avoid various problems regarding the classification of valid keywords and the troubles regarding internationalisation. In an Italian document I would happily write

  Autore: Omar Polo
  Pubblicato: 2020-02-24
  Aggiornato: 2020-02-25
  Licenza: ISC

and all my readers would understand the meaning. We could then have specific search-engines per-language, adapt tools to cope with this etc, all without modifying the spec and without providing a syntax that can be abused.
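
As a minimal sketch of how a tool could cope with such a convention: the code below picks up a block of "Key: value" lines near the top of a document, without assuming any fixed list of keywords (so Italian keys work just as well). The pattern, the function name `leading_metadata`, and the rule "skip a leading heading and blank lines, then stop at the first line that doesn't fit" are all assumptions made for the example; nothing like this is specified anywhere.

```python
import re

# Assumed shape of a metadata line: a key that starts with a letter and holds
# only word characters, spaces or hyphens, then ": " and a non-empty value.
KEY_VALUE = re.compile(r"^([^\W\d_][\w -]{0,30}):\s+(\S.*)$")


def leading_metadata(text):
    """Return the "Key: value" pairs found near the top of a document."""
    found = {}
    for line in text.splitlines():
        match = KEY_VALUE.match(line)
        if match:
            found[match.group(1)] = match.group(2).strip()
        elif found or not (line.strip() == "" or line.startswith("#")):
            # Either the metadata block just ended, or ordinary content
            # started before any metadata was seen.
            break
    return found
```

A client that ignores all of this still shows the lines as plain text, which is the point: nothing is hidden and nothing new has to be parsed.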

Now, to the OT (I'm lazy today and I don't want to write another mail)

OT: I'm actually designing something like GeminiScript, but not for Gemini
clients. I think the idea of no-install instant-download software with
severe limitations on presentation and no access to the local system except
very limited keyboard/screen/mouse is in fact a good one, but unlike
Brendan Eich I have the luxury of more than two weeks to think about it. I
also want to make it as accessible to non-professionals as microcomputer
Basic was. It would run in its own native client, either CLI or TUI or GUI
(there are some issues around the fact that CLIs linearize access). Like
most languages, it could probably be compiled to JavaScript.

I don't like the idea of instant-download software for various ethical and practical concerns, but I have a soft spot for programming language design and compilers, so, if you don't mind, I'd be curious to read more about it :)

P.S.: I find your way of quoting text strange. Why isn't the first line of every cited block prefixed by ">" when all the others are? Is that some sort of arcane custom that youngsters like me don't understand?

John Cowan http://vrici.lojban.org/~cowan cowan at ccil.org
Your worships will perhaps be thinking that it is an easy thing
to blow up a dog? [Or] to write a book?
--Don Quixote, Introduction