[SPEC] Backwards-compatible metadata in Gemini

Omar Polo op at omarpolo.com

Thu Feb 25 09:16:52 GMT 2021

- - - - - - - - - - - - - - - - - - -

Oliver Simmons <oliversimmo at gmail.com> writes:

On Wed, 24 Feb 2021 at 19:15, PJ vM <pjvm742 at disroot.org> wrote:

Whilst it's true anyone could use any key/values, I would hope that we are civilised enough to be able to agree on what keys and values we use.

I'm a contributor to OSM, and their saying goes:

Feel free to invent new tags! Though it is not "feel free to ignore existing tagging schemes".

Simple: if you start using your own key/value, nothing is going to support it, so you might as well use what everyone else uses.

BUT, as is obvious with OSM, if we don't get the keys/values organised **from the start**, we will end up with different ways of doing the same thing and, as I think anyone would agree, that is awful to work with. If we get keys/values organised at the start, though, this isn't really an issue.

I think this is a bogus point. I never contributed to OSM, but from what you're saying I suppose they use something like XML/SGML/... Those things are *meant* for extensions (the 'X' in XML stands for that), whilst everything around Gemini is focused on non-extensibility and simplicity, even at the cost of missing features.

Let's think about that again, for a moment.

There are various people here, myself included, who would like to add "only a small change" to the spec. Everyone has their 20% of things that they would like to add to Gemini or text/gemini[0] because with that it would be, oh, so much better.

But instead of thinking about what we may add, let's think about what we have:

- we have TLS, because it's fundamental to guarantee confidentiality between servers and clients
- we have status codes, because a page that says "an error occurred" or "certificate required" cannot be interpreted correctly otherwise
- we have a media-type in the response, so users know what kind of document they're getting
- we have links, so we can connect different pages, even across different capsules
- we have titles, paragraphs, quotes and lists to express and organize our writings
- we have pre-formatted blocks to allow certain types of explanations/presentations that otherwise would have been impossible (how do we teach how to write text/gemini in text/gemini?)

From here you can notice how human-centric Gemini is. We don't have features for bots (beyond what is absolutely needed, at least) and, even more importantly, we only have basic and necessary stuff. There's no fluff in Gemini.

If you think about it, we only have features that we can't objectively live without (no links? no paragraphs? no media-types? ...) while we're lacking various things that would be "nice to have".

We don't have headers, because with them come extensibility and complexity, and we're getting along just fine without them. We don't have inline formatting because it's difficult to handle client-side and we're doing really fine without it, etc.

Now, replying to the two proposals specifically, I'm against both of them for various reasons:

- we're doing fine without them, so we can continue to do so
- they bring extensibility, which is against the "spirit" of Gemini (at least until now) and thus dangerous.

Specifically they have their own faults in my opinion:

- the one adding the line-type =: (or whatever): you have to parse the whole document to extract the metadata and it allows for possibly unreadable text/gemini files[1]

    =: foo: bar
    # a document
    =: title: a document
    =: author: Omar Polo
    lorem ipsum dolor sit amet...
    =: x-best-viewed-with: tinmop
    Quia ullam quae repellat. Dicta occaecati beatae qui...
    =: script: gemini://evil.corp/analytics.gms
    =: document-class: article
    =: x-song-im-listening-title: "Norwegian Wood"
    =: x-song-im-listening-by: "The Beatles"
    =: licence: CC-BY-SA
    another paragraph? dunno -- text: CC-BY-SA, code: MIT
    (a lot of capsules have lines like this)
    =: prefetch-page: /some/other/page
    =: x-some-even-more-funny-meta-because-why-not yay!
    ...
    =: preferred-color: black-text-on-white-background

A user on a non-sophisticated client cannot (easily) understand that. It's just full of bloat. (By non-sophisticated I mean something more elaborate than "printf $url\r\n | nc ... | less".)
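As a rough illustration of the "parse the whole document" objection, here is a minimal Python sketch of what a client or crawler would have to do. The `=:` line type and its `key: value` layout are taken from the example above and are not part of text/gemini; `extract_metadata` is a hypothetical helper, not code from any existing client. Because a metadata line may appear anywhere, the scan can never stop early.

```python
FENCE = "`" * 3  # the text/gemini preformatting toggle


def extract_metadata(gemtext: str) -> dict[str, str]:
    """Collect hypothetical '=:' metadata lines from a text/gemini document."""
    metadata: dict[str, str] = {}
    preformatted = False
    for line in gemtext.splitlines():
        # A toggle line switches preformatted mode; text inside is literal.
        if line.startswith(FENCE):
            preformatted = not preformatted
            continue
        if preformatted or not line.startswith("=:"):
            continue
        # Split on the first colon only, so values may contain URLs.
        key, sep, value = line[2:].partition(":")
        if sep:  # ignore lines that are not "key: value"
            metadata[key.strip()] = value.strip()
    return metadata


doc = "\n".join([
    "=: foo: bar",
    "# a document",
    "=: title: a document",
    "lorem ipsum dolor sit amet...",
    FENCE,
    "=: not-metadata: inside a preformatted block",
    FENCE,
    "=: licence: CC-BY-SA",
])
print(extract_metadata(doc))
```

Even this small version has to track preformatted blocks and read to the last line, since the `licence` entry only shows up at the very end.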

- (your?) proposal of the ^^^ toggle line, while aesthetically nice (I'll give you that!), has the additional drawback of breaking concatenation. As things stand, I know I can

    cat file1.gmi file2.gmi ... > result.gmi

and obtain a valid text/gemini file. With your proposal, I have to write a parser that analyzes every file. There are a lot of people who use simple scripts/makefiles to generate their capsules with standard UNIX tools; this would (possibly) break them. And even worse, the cat(1) example I gave before will break only *sometimes*, depending on the content of the files. (Let's not talk about how to merge metadata from multiple files...)
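The concatenation problem can be sketched in a few lines of Python. This assumes, purely for illustration, that a ^^^ block counts as metadata only when it opens on the first line of a document; the actual proposal may define it differently, and `split_document` is a hypothetical helper.

```python
TOGGLE = "^^^"  # hypothetical metadata toggle line from the proposal


def split_document(text: str) -> tuple[list[str], list[str]]:
    """Split a document into (metadata lines, body lines).

    Assumption: a metadata block counts only if it opens on the first line.
    """
    lines = text.splitlines()
    if not lines or lines[0] != TOGGLE:
        return [], lines
    try:
        end = lines.index(TOGGLE, 1)
    except ValueError:
        # Unterminated block: everything after the opening toggle is metadata.
        return lines[1:], []
    return lines[1:end], lines[end + 1:]


file1 = "# First page\nsome text\n"
file2 = "^^^\ntitle: Second page\n^^^\n# Second page\nmore text\n"

# Equivalent of: cat file1.gmi file2.gmi > result.gmi
result = file1 + file2

meta, body = split_document(result)
print(meta)  # the metadata of file2 is no longer recognized as such
print(body)  # its raw lines, toggles included, now sit in the body
```

Concatenating two files without metadata blocks still works, which is why the breakage would show up only sometimes.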

Also, the examples you gave in support of your proposals seem bogus too. Serving a mailing list archive over Gemini? Cool, but why convert the mails to text/gemini? Wrapping them in ``` (with headers visible) or serving them "raw" is not enough?

What I think is missing in all these discussions is a valid reason that outweighs the cons.

However, I feel that simply turning down feature requests is not a good thing. I think we should reflect on what the actual problem is and solve it, because this smells like an XY problem[2] to me.

If we want to give people ways to manage their local data, maybe because they want to search across documents or do some kind of publication over Gemini, then centralising metadata in one place is an option. That's what I'm currently doing with my blog: all entries are pure text/gemini files and there's a posts.edn[3] file with all the metadata (title, tags, date, song I was listening to, relevant XKCD, ...), and I'm happy with the outcome. It's easy to generate pages for either the Web or Gemini, and I can easily adjust the "layout" when I want to.
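A minimal sketch of that centralised approach, in Python rather than Clojure: the `POSTS` list stands in for a file such as posts.edn, and every path, field name and value in it is invented for illustration, since the real file's contents are not shown here. The posts themselves stay plain text/gemini; only the generator reads the metadata.

```python
# Hypothetical stand-in for a central metadata file such as posts.edn.
POSTS = [
    {"path": "/post/first.gmi", "title": "First post",
     "date": "2021-01-10", "tags": ["gemini"]},
    {"path": "/post/second.gmi", "title": "Second post",
     "date": "2021-02-20", "tags": ["gemini", "meta"]},
]


def render_index(posts: list[dict]) -> str:
    """Render a text/gemini index page, newest post first."""
    lines = ["# Blog", ""]
    # ISO dates sort correctly as plain strings.
    for post in sorted(posts, key=lambda p: p["date"], reverse=True):
        # text/gemini link line: "=> URL label"
        lines.append(f"=> {post['path']} {post['date']} {post['title']}")
    return "\n".join(lines) + "\n"


def titles_tagged(posts: list[dict], tag: str) -> list[str]:
    """Local search over the central metadata, no document parsing needed."""
    return [p["title"] for p in posts if tag in p["tags"]]


print(render_index(POSTS))
print(titles_tagged(POSTS, "meta"))
```

The same list could feed an HTML generator, and changing the layout means changing `render_index`, not the posts.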

If we want to build a better GUS, I don't think that adding metadata to text/gemini will solve anything; it will actually make things worse. The point is, you can't trust third-party metadata. Sure, I can stick a description of "About the interpretation of the Will to power in Nietzsche" with tags "philosophy, nietzsche, will-to-power" and a category of "essay", but you cannot trust me to actually talk about those topics in the page; maybe it only contains links to pics of cute kittens :)

Why do I think metadata will make things like GUS worse? While full-text search is not without its drawbacks, as Bortzmeyer reminded us, people will abuse the metadata to "go up" in the search results, and the outcome of that is crystal-clear on the Web. It also makes life harder for whoever runs a search engine, as now they also have to work out whether the metadata is actually relevant or not.

(sorry for the long mail)

[0]: mine? I would love to have a syntax for definition lists and 3 levels of un-ordered lists.
[1]: .gms is GeminiScript of course. A minimal, non-extensible and simple scripting language for your preferred client, hoping it doesn't lack support for it /s
[2]: When people ask for Y because they think that will solve the problem X, instead of asking directly for X. https://en.wikipedia.org/wiki/XY_problem
[3]: edn is like json, but for clojure, kinda.