Gemini maps

(originally posted in Gopherspace on 2019-06-23)

I have actually mostly been thinking about details *other* than what I'll call "Gemini maps" until a better name comes along. But now that multiple third parties are starting to implement "proto-Gemini", and given that I'll be travelling for a bit in 4 days' time and will have less time to dedicate to this, I feel a kind of time pressure to come up with a very, very preliminary complete spec which gives enough detail for the community to experiment with at least the core functionality in my absence.

At this point, the request and response formats seem pretty well settled. There's still a question mark floating over how to include search queries in requests (though I have pretty well convinced myself that this isn't necessary, more on that one day), but that's, IMHO, not "core functionality". As long as people can put up simple static "Geminiholes" (stupid name, we need something better) and link 'em together, that will do for now. Currently the most under-specced part of this whole thing is the text/gemini default item type.

I previously proposed something kind of like gophermaps, and on the whole I still think there is an awful lot to recommend this idea. The server at gemini.conman.org used a form of that proposal, so I wrote AV-98 to parse that, so now it's *vaguely* settled upon, but there's some wiggle room, and anyway there's overdue discussion to be had about this, so here we go.

First of all - someone phlogged somewhere (I'm truly sorry, I honestly tried to find and cite this, but this conversation has become very widely spread and I couldn't find it. As always, email me if it was you, or you remember who it was) that it could be problematic to erase gopher's text/menu distinction in favour of something like a gophermap because very long text-only files (like the text of Lord of the Rings) would need to be parsed line-by-line looking for links, for no gain. The solution to this is simple! That sort of content should be served as text/plain, not text/gemini. That way the client knows it has no links and can just display it. The power of content type declaration! To make this simple, we could come up with a file extension for text/gemini content (.gem? I know Ruby Gems exist but I've never used Ruby so I don't know if they are literally .gem files), and servers could dish .txt files up as text/plain and .gem files up as text/gemini and that'd be that.
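
As a rough sketch of that extension-based dispatch, something like the following Python would do. Note that `.gem` is only the tentative extension floated above, and the `mime_type_for` name and the `application/octet-stream` fallback are assumptions made purely for illustration.

```python
from pathlib import PurePosixPath

# Hypothetical mapping: ".gem" is only the extension floated in the text,
# not a settled choice.
MIME_BY_EXTENSION = {
    ".txt": "text/plain",
    ".gem": "text/gemini",
}


def mime_type_for(path: str, default: str = "application/octet-stream") -> str:
    """Pick the content type a server would declare for a file, by extension.

    The fallback type is an assumption for this sketch only.
    """
    suffix = PurePosixPath(path).suffix.lower()
    return MIME_BY_EXTENSION.get(suffix, default)
```

With this, a long novel saved as `lotr.txt` would go out as text/plain and never be scanned for links, while `index.gem` would go out as text/gemini.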

Right, the actual format! What is being used in the existing code is as follows. Any line that looks like this is a link:

<TAB><USER FRIENDLY NAME><TAB><LINK><CR><LF>

and anything else is plain text (like a gopher menu 'i' line).

That's it!
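
To make the rule concrete, here is a minimal, hedged sketch of a line classifier in Python. It is not AV-98's actual code: the `parse_map_line` name is hypothetical, and treating a line with an empty name or an empty link as plain text is an assumption that the format above doesn't settle.

```python
from typing import Optional, Tuple


def parse_map_line(line: str) -> Optional[Tuple[str, str]]:
    """Return (name, link) for a link line, or None for a plain text line."""
    line = line.rstrip("\r\n")  # drop the trailing <CR><LF>
    if not line.startswith("\t"):
        return None
    # After the leading tab there must be exactly two tab-separated fields.
    fields = line[1:].split("\t")
    if len(fields) != 2:
        return None
    name, link = fields
    if not name or not link:
        # Assumption: empty fields mean "not a link", so show the line as text.
        return None
    return name, link
```

A client would call this on each line of a text/gemini response, rendering `None` results as text and everything else as a selectable link.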

What can <LINK> be? Definitely, URLs are absolutely allowed. This is a deliberate improvement upon gopher, which was designed only to link to other gopher stuff, and can only link to non-gopher content via an ugly hack which I'm very happy that Gemini is free of. Anything other than URLs? Well, the gemini.conman.org server also serves relative links, HTML style, and AV-98 handles them. This wasn't something that was ever discussed or decided upon, it just kind of happened. I don't *think* I have a problem with it. Commentary is very welcome! This *does* complicate client design slightly, in that clients need to remember the URL where they got the map from in order to translate the relative links to absolute ones. Gopher clients don't need to do this, because each item in a gopher menu specifies a host and a port. This is basically a trade-off between network traffic (relative URLs are shorter than absolute ones) and client complexity. Since we have TLS overhead, small network efficiencies are not necessarily worth chasing (although I do hope that TLS session resumption will become more widely supported in future, so we can really cut that overhead down). This *probably* argues for absolute URLs only. Relative links are more user friendly for authors, but of course Gemini servers could convert them for you, which is how most gopher servers work anyway. This pushes work out of the client and into the server, which I think is how things should be (and is an explicit part of the philosophy of gopher in RFC 1436). For now, let's maybe follow Postel's law: if anybody wants to write a new server, please just send absolute URLs. If anybody wants to write a client, please be prepared to accept relative URLs if they appear. We can defer the final decision on this, based on what we learn in early testing.
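
The "remember where the map came from" step could look roughly like this on the client side. This is a sketch under assumptions, not a description of how AV-98 does it: the `absolutise` name is hypothetical, and because Python's `urljoin` may leave relative references unresolved for schemes it doesn't recognise, the sketch borrows `http` for the join and then puts the original scheme back.

```python
from urllib.parse import urljoin, urlsplit


def absolutise(base_url: str, link: str) -> str:
    """Resolve a possibly-relative link against the URL the map was fetched from."""
    if urlsplit(link).scheme:
        # Already absolute (gemini://, gopher://, https://, ...): leave it alone.
        return link
    # Workaround (an assumption of this sketch): do the join under a scheme
    # urljoin is known to handle, then restore the base URL's own scheme.
    scheme, _, rest = base_url.partition("://")
    joined = urljoin("http://" + rest, link)
    return scheme + "://" + joined[len("http://"):]
```

For a map fetched from `gemini://example.org/docs/index.gem`, the relative link `other.gem` would come out as `gemini://example.org/docs/other.gem`. A server that converts relative links before sending would run the same logic at its end, which is the division of labour the paragraph above leans towards.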

What can <USER FRIENDLY NAME> be? Anything that doesn't have a <TAB> in it, because that would confuse the parsing. Anything else goes, I guess.

So, the link indicator and delimiter is fixed as <TAB>? Well, that is what's being used in practice for now. Sloum pointed out that tabs are perhaps problematic, because some people have configured their editors to produce a sequence of spaces instead of a tab, because, for reasons I never quite understood, most programming language communities seem to have decided that tabs are somehow bad (I love that Lua does not seem to have this culture!). This sounds like a valid argument on the face of it, but then, most gopher servers use tabs for this and it doesn't seem to stop people. All that really matters is that we pick something which is very quick and easy for somebody to produce in any editor, and which people are unlikely to reasonably want to use in <USER FRIENDLY NAME>. This last consideration makes me not very in favour of sloum's suggestion of "@" to separate <USER FRIENDLY NAME> from <LINK>, because I can absolutely see people e.g. linking to their Mastodon profile and using their username, with an @ at the front, as the <USER FRIENDLY NAME>. I kind of like his "~!" link indicator idea. Lines beginning with tabs could occur easily in non-link contexts (paragraph indents, snippets of source code). Identifying link lines correctly requires counting the total number of tabs in the whole line, which is slightly more computational effort than just checking the first two characters. As ever, feedback welcome, but it's tabs for now because, well, there's running code using tabs.

Link to Sloum's 2019-06-19 post indicating problems with tabs
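
The difference in effort between the two ways of spotting a link line can be sketched as follows. Only the "~!" prefix itself comes from sloum's suggestion as described above; the function names are hypothetical, and how the rest of such a line would be laid out is deliberately left out because it isn't specified here.

```python
def is_tab_link(line: str) -> bool:
    """Current format: a leading tab is not enough, so the whole line
    has to be scanned to count its tabs."""
    return line.startswith("\t") and line.count("\t") == 2


def is_prefix_link(line: str) -> bool:
    """Hypothetical "~!" indicator: only the first two characters matter."""
    return line.startswith("~!")
```

Note that a tab-indented line of source code which happens to contain exactly one more tab (say, before a trailing comment) passes `is_tab_link`, which is the kind of non-link context mentioned above.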

In earlier writing I originally proposed a more extensive link format:

<TAB><USER FRIENDLY NAME><TAB><LINK><TAB><MIMETYPE><CR><LF>

Mostly this was just blindly copying gopher. The MIME type is slightly redundant, in that the client will learn what it is when actually fetching it (this is how it works on the web). Having it here doesn't seem popular: gemini.conman.org doesn't provide it and sloum said he didn't think it was necessary. The only reason I am still very slightly attracted to the idea is the following thought: graphical Gemini clients could, as an option the user could turn on/off as they pleased, when seeing image/* links, fetch and display them in-line. I'm imagining this in a very harmless, "images as figures" way, as espoused by @gcupc in their nice post about the "Lynx web": the server has no way of controlling the image's size, or position. It just goes exactly where the link is, centred in the line, at a sensible size that is under the user's direct control. Like a figure in a textbook or something. For certain kinds of documents this is a totally sensible and reasonable thing to want to do, and I really like that doing it this way requires *no* image-specific syntax and it degrades totally gracefully into just a link in clients which don't want to or can't support images. That's nice! The least-offensive way possible to bring images into this!

Jason McBrayer's 2019-04-20 blog post about Gopher and the lynx web

But I'm also worried that people would start serving weird not-really-MIME values in that position, and using it to trigger weird and wonderful behaviour in experimental non-standard clients. I will write an entire post on this some time, but I am terrified of putting extensibility into Gemini, either designed extensibility or accidental scope for sneaky extensibility. Extensibility is not a fundamentally bad thing; from an engineering perspective, when you just want to solve problems, it can be very powerful. But Gemini is an ideological protocol - simple is best, privacy matters! If you let people who don't believe these things add extra features, it presents a possible slippery slope away from those values. This is kind of what happened to the web, with cookies. I would like Gemini to be "closed by design", so I'm trying to avoid places where people could easily slip things in, and that <MIMETYPE> field, that's just a free place to stick arbitrary text, confident that clients which aren't in on the extension will just ignore it as "some MIME type I don't have a special way to handle", and continue without breakage. Way too tempting. Yeah, let's leave that out. It's a shame that we can't have nice things!

Regarding the treatment of plain text content, for now let's just say it should be presented as-is, but there is some discussion around this to have. Many people have pointed out that the convention of not reflowing text in gopher makes gopher content difficult to consume on devices like phones, where 70 or 80 char lines are too wide. Maybe we should explicitly declare Gemini map text to be reflowable? I would also not be opposed to a *very light dash* of *strictly optional* formatting possibilities. For example, in Markdown, you can have:

# Sections
## Subsections
### Sub-subsections

That's an extremely easy thing for even a crappy hand-written parser to recognise. We could say that Gemini clients *may* render lines beginning with #s in larger fonts, but that it's 100% okay not to. This lets very simple clients ignore this issue entirely, and the plain text version remains totally readable, but graphical clients *could* give a very nice and clean representation of structured text, which is not a bad thing at all. We could just not specify any of this at all and leave it entirely up to the discretion of individual clients to recognise and render some things, but with that approach different clients will recognise different things, and so authors will just ignore all of them and there will be no point. Having *one* standardised way to do this kind of thing lets authors who want to partake of it do so in a way they know will maximise client compatibility, but doesn't force anybody who is uninterested to use it. So, it makes sense to me to specify *something*. We should *only* consider things which degrade cleanly when viewed as raw text. If text which "should be" bold turns up with *s around it instead, that's fine, the point is still clear. But Markdown supports, e.g. strike-through text with ~~this syntax~~ and that's *no good* because if you view it unrendered it is not at all obvious that it's supposed to be crossed out and the meaning is confused. So, none of that! Also, absolutely nothing which is remotely difficult to unambiguously parse. If even a shred of cleverness is required to not make a mess of it, I'm not interested. I suspect something usable meets those criteria, but if not, oh well.
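
To show just how little parsing the "#" idea needs, here is a small sketch in Python. The `heading_level` name is hypothetical, and capping recognition at three levels is an assumption taken from the three-line example above rather than anything decided.

```python
from typing import Tuple


def heading_level(line: str) -> Tuple[int, str]:
    """Return (level, text); level 0 means an ordinary text line."""
    stripped = line.rstrip("\r\n")
    # Count the run of '#' characters at the very start of the line.
    level = len(stripped) - len(stripped.lstrip("#"))
    if level == 0 or level > 3:
        # Assumption: only 1-3 '#'s are headings; anything else is shown as-is.
        return 0, stripped
    return level, stripped[level:].strip()
```

A graphical client could pick a font size from the returned level, while a simple client could skip the function entirely and print the raw line, which stays perfectly readable.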

A final consideration: the reflow thing and the light markup thing would both make it quite difficult to include ASCII art in Gemini maps. It of course would still be possible to serve it in text/plain documents, but not in maps, which would kill the gopher tradition of including ASCII art headers in the root menu. This is kind of a shame, but then, maybe it's also nice if there is a clear aesthetic difference between gopher and Gemini. I dunno. This would complicate low-effort bihosting, too. Hmm...
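
The tension between reflowing and ASCII art is easy to see in a tiny sketch. This uses Python's standard `textwrap` module; the `reflow` name and the idea of reflowing a whole paragraph to one client-chosen width are assumptions for illustration only.

```python
import textwrap


def reflow(paragraph: str, width: int) -> str:
    """Re-wrap a paragraph to the given width, ignoring its original line breaks."""
    # Collapse all runs of whitespace (including newlines) to single spaces,
    # then wrap to the width the client wants.
    collapsed = " ".join(paragraph.split())
    return textwrap.fill(collapsed, width=width)


# Two-line ASCII art: its shape depends entirely on the original line breaks.
art = " /\\_/\\\n( o.o )"
flattened = reflow(art, 80)  # both lines end up joined on a single line
```

Ordinary prose survives this treatment and fits a narrow phone screen, but the two lines of `art` are run together into one, which is why reflowable maps and ASCII art headers don't mix.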

That's it! I will condense this into the spec-spec.txt shortly.

If you want to influence my thinking on any of the open questions raised here, write me something convincing. Remember, always: simple is best, privacy matters, beware of sneaky extenders!