💾 Archived View for stack.tilde.cafe › gemlog › 2022-02-09.urlencoding.gmi captured on 2023-09-28 at 16:20:34. Gemini links have been rewritten to link to archived content

-=-=-=-=-=-=-

Gemini Spec compliance - my anarchist take

There is ongoing discussion about odd corners of the Gemini spec.

Gemserv - Just Block All Encoded Paths

gemini://gerikson.com/gemlog/gemini-sux/e-m-p-h-a-s-i-s.gmi

A spec for an internet protocol is usually written by an industry group, disguised as 'cooperation' and 'free flow of information', but it is a thinly veiled oligopoly. A spec is a document designed to scare anyone stupid enough to _want_ to implement a browser or a server - just looking at 1800 pages of a spec with all required appendices would prevent any startup from getting into the game. VCs play along: "There is an industry standard you must follow, and it will cost more than we are willing to offer".

A spec written by an individual is a different animal - it is more of a manifesto, complete with opinionated choices. Whatever the politics are, a group of industry experts will usually dig into the most subtle details, and everything that is vague is vague for a (devious) reason. An individual is likely to have a couple of points they really care about, and gloss over things they don't care about, or make arbitrary choices. It is hard for a single person to be an expert in every part of a protocol, no matter how 'simple' it looks.

Remember that kid in kindergarten who wanted to play some complicated game? You needed a passport, so everyone made passports from folded-up paper. No, the picture has to be on the first page, he says - that's how a 'real' passport looks. OK, everyone draws a picture in their 'passport'. The crowd joins in the enforcement. "He needs to stamp it, or it is invalid!". OK, the stamping ceremony starts. Where is the visa page, his cohorts demand. "Your passport is invalid!". All the kids line up to beat you up.

I love the Gemini community, but the #protocol is not a f***ing bible, and _with all due respect_ it's pretty arbitrary. It's a good starting point. It's vague in places, which is a good thing. You can ignore it *when you feel like it*, and no one will arrest you. Some zealot will complain, and you will get a bunch of concerned emails about how they are worried about you destroying the community with your irresponsible actions...

There are a bunch of things you can do, _responsibly_. Highlighting text is one of them. Sure, you can rewrite the sentence in a way that avoids the need to highlight. Sure, some screen reader will not read highlighted text correctly (that is a #bug in the reader, by the way). But why rewrite what you have written, only to satisfy some arbitrary limits? You can't please everyone - there are many valid complaints about not being able to have emphasis -- which we _desperately need_, by the way. There are others writing about the joy of quiet text. Staring at a dense page of text without emphasized words to provide visual and semantic anchoring is sometimes painful, especially to those with certain cognitive issues. Too much emphasis hurts too.

Things you could do

It is explicitly stated that the author cannot dictate how the text is rendered. That is great. You can do almost anything and see if it sticks.

The response-to convention -- providing a link to the original post early on -- is great. Thread aggregators are springing up.

The #tag syntax is gaining popularity. Great. Search engines are catching on.

I see some @name tagging as well. Great.

You can add fourth-level headers. They won't be rendered correctly by spec sticklers, but some browser-writer may say to themselves: "Why 3 levels? I will provide for 6, and it is still compatible with everyone who is spec-compliant. It is a nice surprise for everyone else, and it costs me nearly nothing. And maybe it will help a little bit to semantically organize the document."
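
A rough sketch of what that browser-writer might do, in Python. The name parse_heading and the 6-level cap are my own assumptions for illustration, not anything from the spec or from a real client.

```python
import re

# Hypothetical helper: treat 1 to 6 leading '#' characters as a heading marker.
# Whitespace between the marker and the heading text is optional.
HEADING = re.compile(r"^(#{1,6})\s*(.*)$")

def parse_heading(line):
    """Return (level, text) for a heading line, or None for any other line."""
    match = HEADING.match(line)
    if match is None:
        return None
    return len(match.group(1)), match.group(2).strip()

print(parse_heading("#### Fourth level"))
print(parse_heading("Just a paragraph"))
```

A three-level client fed the same line will still show something readable, just with a stray '#' in front of the heading text.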

There is no reason not to linkify inline URLs. Illegal, you say? What is the harm, really? How often will you put an entire URL into your text and not intend for it to be a link someone may wish to visit? And how hard is it to search the line for a match to 'gemini://' in order to make a link? What kind of a programmer are you?
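
To show how little work that is, here is a minimal Python sketch. It assumes a URL runs until the next whitespace and that trailing punctuation belongs to the prose, not the link; find_inline_links is a made-up name, and a real client may well want smarter rules.

```python
import re

# Assumption: a URL starts at 'gemini://' and ends at the next whitespace.
GEMINI_URL = re.compile(r"gemini://\S+")

def find_inline_links(line):
    """Return every gemini:// URL found in one line of text."""
    # Strip trailing punctuation that is more likely prose than URL.
    return [url.rstrip(".,;:!?)") for url in GEMINI_URL.findall(line)]

text = "Read gemini://gerikson.com/gemlog/gemini-sux/e-m-p-h-a-s-i-s.gmi, then decide."
print(find_inline_links(text))
```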

Removing really necessary features in order to make it possible to implement a crippled client in a day is hardly a good requirement for a protocol. Especially since such clients will be implemented in things like Python, which have libraries that make all kinds of things pretty easy. And terminals support bold text and colors; if you are writing a GUI client, well, you are not doing it in a day.
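
For the terminal case, a hedged Python sketch of turning _emphasis_ or *emphasis* into bold with standard ANSI escape codes. The matching rule (markers must sit at word boundaries) is my own guess at a sensible convention, and render_emphasis is a hypothetical name.

```python
import re

BOLD, RESET = "\033[1m", "\033[0m"  # ANSI codes: bold on, all attributes off

# Assumption: emphasis is text wrapped in a matching pair of '_' or '*',
# with no word character right outside the markers, so snake_case survives.
EMPHASIS = re.compile(r"(?<!\w)([_*])(\S(?:.*?\S)?)\1(?!\w)")

def render_emphasis(line):
    """Replace _text_ or *text* with bold text for an ANSI terminal."""
    return EMPHASIS.sub(lambda m: BOLD + m.group(2) + RESET, line)

print(render_emphasis("which we _desperately need_, by the way"))
```

A client that ignores the convention just prints the underscores, which is still readable.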

URL encoding

URL-encoding in the request? Just don't URL-encode anything. There is enough information in the request to allow arbitrary unencoded text in the query. Everything before the first '?' is the path, and everything that follows is the query, terminated by a '\r\n' combo. If there is no '?', then the entire line is the path. The query can have anything in it, except for the said '\r\n' combo. Bingo, ambiguity solved, with backward compatibility: you can still encode and decode, but you don't have to, and eventually, encoding will wither away.
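
Here is roughly what that rule looks like on the server side, as a Python sketch. It is my illustration of the proposal, not how any existing server works; split_request is a made-up name, and since a real Gemini request carries the whole URL, the 'path' part here includes the scheme and host.

```python
def split_request(raw):
    """Split a raw request line into (path, query) at the first '?'.

    The query is returned untouched: no decoding, no validation.
    It is None when the request contains no '?' at all.
    """
    if not raw.endswith("\r\n"):
        raise ValueError("request must be terminated by CRLF")
    line = raw[:-2]
    if "\r\n" in line:
        raise ValueError("CRLF is only allowed as the terminator")
    path, sep, query = line.partition("?")
    return path, (query if sep else None)

print(split_request("gemini://example.org/search?cats & dogs?\r\n"))
print(split_request("gemini://example.org/\r\n"))
```

Note that a second '?' simply stays inside the query, which is the whole point: only the first one means anything.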

Sounds scary? In practice, a surprising number of clients don't encode properly, as my empirical observations of SpellBinding requests show. And some opinionated choices are made about when to encode. And which characters to encode.

Takeaway

In reality, standards are decided by the big players. A dominant browser will decide how to render _emphasis_, and a few others will follow. An updated standard will sometimes reflect that. Or not. Who cares, really. Anyone writing a browser (especially a labor of love browser) should be aware of the subtle points like this, and understand the culture enough to make good choices, sometimes not specified in the spec.

We are not bound to the spec by regulatory bodies, or venture capitalists. We are an amateur community. We can actually do whatever we want.

Just do whatever, and render the best you can. Whatever you do, make sure that, when rendered literally, it is still ___kind of___ readable. Let the users decide on what is absolutely necessary for a browser -- they will let you know, trust me.

Are we scared of browsers being hard to implement because of things like that? Well, browsers _are_ hard to implement because of all kinds of things, not just the size of the specs. Prohibiting bold text is not a solution to this issue.

And if you make choices, you will get a bunch of complaints no matter what you do.
