Project Gemini FAQ - §4 Protocol design

4.1 Comparisons to Gopher and the web

4.1.1 I'm familiar with Gopher. How is Gemini different?

Compared to Gopher, Gemini allows for:

Unambiguous use of arbitrary non-ASCII character sets.
Identifying content using MIME types instead of a small set of badly outdated item types, and this information comes from the server together with and at the same time as the content, rather than being stored in a separate menu.
Clearly distinguishing successful transactions from failed ones, in a machine-readable way to allow for more robust automated user agents.
Linking to resources served over other protocols via simple URLs, without ugly hacks (item type h plus "URL:").
Redirects to prevent broken links when content moves or is rearranged.
Domain-based virtual hosting.

Text in Gemini documents is wrapped by the client to fit the device's viewport, rather than being "hard wrapped" at ~80 characters with newline characters. This means content displays equally well on phones, tablets, laptops and desktops.

Gemini does away with Gopher's strict dichotomy between menus and text files, and lets you intermingle text and links in a single "item type". Increasingly many phloggers try to push Gopher in this direction by serving everything as item type 1. At the level of the Gopher protocol, this is ugly and inefficient, with phony selectors, hostnames and port numbers transmitted along with every line of text in a post. In Gemini, there's no penalty for this; it's normal. Of course, if you really like the Gopher way, nothing in Gemini stops you from duplicating it. You can serve "item type 0" content with a MIME type of text/plain, and you can write text/gemini documents where every single line is a link line, replicating the look and feel of an RFC 1436-fearing Gopher menu without that pesky non-standard i item type.

Gemini mandates the use of TLS encryption. It even provides a way for servers to request a certificate from clients, which is a way to establish a "session" of requests. This allows developing simple, textual applications where all the state is maintained server-side without relying on fragile mechanisms like binding sessions to IP addresses.

4.1.2 I'm familiar with HTTP and HTML. How is Gemini different?

The Gemini network protocol looks kind of like something between HTTP 0.9 and HTTP 1.0. There's only one kind of request, analogous to GET, and the request itself is nothing but a URL. It's sort of like a HTTP request where the only header allowed is Host. The response is kind of like a HTTP response where the only header allowed is Content-type. By design, the request and response formats are not extensible, so these are the only HTTP header-esque things that will ever be there. This is considered a good thing: Gemini does not and never will contain an equivalent of the Cookie, Referer or User-Agent headers, which goes a very long way to preventing user tracking.
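To make the shape of a transaction concrete, here is a minimal sketch in Python. It assumes the request is the absolute URL followed by CRLF, and that the response header is a two-digit status, a single space, a "meta" string and CRLF. The function names are illustrative only and do not come from any particular client.

```python
from typing import Tuple


def build_request(url: str) -> bytes:
    """A request is nothing but the URL, terminated by CRLF."""
    return url.encode("utf-8") + b"\r\n"


def parse_response_header(line: bytes) -> Tuple[int, str]:
    """Split a header of the assumed form '<STATUS> <META>\\r\\n'."""
    text = line.decode("utf-8").rstrip("\r\n")
    status, _, meta = text.partition(" ")
    if len(status) != 2 or not (status.isascii() and status.isdigit()):
        raise ValueError(f"malformed status: {status!r}")
    return int(status), meta


# For a successful response, the meta string plays the role of Content-type.
print(build_request("gemini://example.org/"))
print(parse_response_header(b"20 text/gemini; charset=utf-8\r\n"))
```

There is no slot in either direction for anything resembling Cookie, Referer or User-Agent, which is the point.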

This freedom from the threat of tracking comes with downsides: Gemini has no support for caching, compression, or resumption of interrupted downloads, and as such it's not very well suited to distributing large files, for values of "large" which depend upon the speed and reliability of your network connection. Without an equivalent of HTTP's POST method, Gemini does not really support uploads, at least not in a simple and straightforward way. That doesn't mean that user input is totally impossible; queries included in URLs can be used to, e.g. send a search term or a username to a server. But it does mean you can't use Gemini itself to put content into Geminispace. You need to use something else, such as (S)FTP, SSH, rsync, git, a web interface, an email interface, etc.

The "native content type" of Gemini (analogous to HTML for HTTP(S)) never requires additional network transactions (there are no inline images, external stylesheets, fonts or scripts, no iframes, etc.). This allows for quick browsing even on slow connections and for full awareness of and control over which hosts connections are made to. The native content type is also strictly a document, with no facility for scripting, allowing for easy browsing even on old computers with limited processor speed or memory. Complete control over the visual styling of these documents is granted to the reader, not the author. There's nothing like CSS for Gemini. That doesn't mean Geminispace is ugly. Beautiful graphical clients exist which allow you to style the whole of Geminispace to look the way that looks and works best for you.

Gemini mandates TLS encryption for all transactions. In lieu of cookies, Gemini allows using TLS client certificates as a way for the user to authenticate themselves to apps. Compared to passwords and cookies, this is far less vulnerable to brute force attacks and session hijacking, and users always have the right to instantly, irrevocably and unilaterally delete their private key, permanently ending a session.

4.2 What were the design principles for Gemini?

Gemini looks very unusual to a lot of people, and sometimes they assume that some aspect of the design is a mistake that arose because the designers didn't know what they were doing. It's true that Gemini is an amateur project (see question 7.5), but more often than not these supposed "defects" and "oversights" are actually things which were designed in very deliberately with full awareness of the consequences, because the consequences were something we actively wanted. The reason Gemini looks strange is that it wasn't designed according to "usual" principles, such as:

Make sure it can do as many different things as possible
Make sure it appeals to the widest possible audience
Make sure it can have extra features added gracefully at any time
Make sure it can effortlessly scale up to the entire planet

We had very different ideas in mind. It's true those ideas weren't always made explicit and crystal clear, and there wasn't a clear ranking of design principles in order of priority, so that when two principles were in conflict and a trade-off had to be made we always bent in a consistent direction. Gemini might not be provably optimal for anything, but ultimately we ended up pretty close to the mark we were intuitively aiming for. It may not be perfect, and it may not suit your favourite use case for the web, but for people who hold our community's values and are interested in using Gemini for the kinds of things it was intended for, it's far and away the best game in town.

The following principles have left a strong mark on Gemini's design.

4.2.1 User privacy

Gemini was designed with an acute awareness that the modern web is a privacy disaster, and that the internet is no longer a safe place for plaintext. It's not enough to simply refrain from deliberately designing in tracking features, which is easy. The history of the web proves that user tracking can and will be snuck in via the backdoor using protocol features which were never designed to facilitate it. Thus, the designers of Gemini assumed active malicious intent and tried to avoid designing in anything which could be subverted to provide effective tracking.

Any bit of information which is capable of making a round trip from a server to a client and then back to the server again without being modified has the potential to be a privacy threat, allowing the server to recognise a client as a repeat visitor even if their IP address has changed between visits. Such round trips are common in HTTP. The contents of a Last-Modified or ETag response header can end up in a subsequent If-Modified-Since or If-None-Match request header. When used as intended, these headers don't serve to uniquely identify a user in isolation, but they leak a non-zero quantity of information. They can be combined with other such identity leaks (such as the many well-known browser fingerprinting techniques) to help narrow down a user's identity. Far worse, they can and have been abused for the express purpose of uniquely tracking users. You can read about ETag-based "supercookies" on the web to learn more.

A protocol which is seriously committed to user privacy must be designed in such a way as to "break all loops", ensuring that nothing contained in a server's response has any way of ever ending up in a client's request. Gemini achieves this by the simple expedient of making the client request as simple as possible. It consists of nothing other than the URL which is being requested, which is of course necessary for the request to be unambiguous. Gemini requests convey literally the bare minimum of information. This not only breaks all loops between server and client, it also thwarts efforts at client fingerprinting (although fingerprinting becomes possible when clients choose to supply a client certificate).

Of course nothing is perfect. A truly malicious Gemini capsule can still track a user by programmatically inserting random unique IDs as path components of all URLs, but even Gopher is vulnerable to this.

4.2.2 User autonomy

Gemini is designed to put client users in the driving seat, not server admins or content publishers. On the web, it's impossible for you to know in advance or exert any meaningful control over whether clicking on a link will cause your computer to connect to one remote server, or ten, or a hundred, or what information it will send to them, or how many files of what type it will download. Web "pages" are allowed by default to run calculations on your CPU and store files on your hard drive. No facility is provided for you to impose fine-grained control over any of this. If it were, most websites would break if you tried to exercise this control, because they're designed on the assumption of a submissive relationship with the user, that the web is just a way for you to hand them control of your computer. Heck, you don't even get to decide whether you read dark text on a light background or vice versa unless the web designer decides to do extra work on their end to offer you the choice.

Gemini completely inverts this relationship. When you click on a link, your client connects to exactly one server (specified in advance in the link URL) and that server gets to send you exactly one file, and chances are very good it'll be a file of gemtext (the only type of file guaranteed to be handled correctly by every Gemini client and overwhelmingly the most commonly served type), and all your client can do with that file is put some text on your screen and offer you some links to other files. How the text is formatted is almost entirely under the client's control. About the only thing the document author can do is clarify that some sections of text really ought to be presented in a fixed width font, so that computer code or ASCII art looks like it's supposed to. Text size and font and colours are entirely up to the client.

It's not just visual styling that is under the user's control. The gemtext format was designed without any way for a document to "pull in" any other file, which would allow authors to cause your computer to connect to arbitrary third parties. This doesn't just safeguard your privacy (although of course we were thinking about that, see answer 4.2.1). Because gemtext documents are standalone entities which cannot meaningfully "contain" any other file or depend upon any other file to specify the appropriate way of handling them, it's also impossible to break them by exerting control. A Gemini client can be configured to refuse to connect to certain servers, or to refuse to download files with certain MIME types, or to terminate downloads which exceed a certain file size, and it's impossible for these decisions to have unintended consequences. Any document which makes it past your chosen restrictions will look and function exactly the same way it would without your restrictions. You can pick and choose at will.
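As an illustration of how little machinery such restrictions need, here is a sketch of a client-side policy object in Python. The class name, field names and example hosts are all hypothetical. The three checks correspond to the three kinds of restriction mentioned above: by server, by MIME type and by download size.

```python
from dataclasses import dataclass, field
from typing import Optional, Set
from urllib.parse import urlsplit


@dataclass
class FetchPolicy:
    """Hypothetical user-chosen restrictions, checked one fetch at a time."""
    blocked_hosts: Set[str] = field(default_factory=set)
    blocked_mime_types: Set[str] = field(default_factory=set)
    max_bytes: Optional[int] = None

    def allows_host(self, url: str) -> bool:
        # Checked before connecting: the link URL names the only server involved.
        host = urlsplit(url).hostname or ""
        return host not in self.blocked_hosts

    def allows_mime(self, meta: str) -> bool:
        # Checked after the response header arrives; parameters are ignored.
        mime = meta.split(";", 1)[0].strip().lower()
        return mime not in self.blocked_mime_types

    def allows_size(self, received: int) -> bool:
        # Checked while reading the body; the client hangs up if this fails.
        return self.max_bytes is None or received <= self.max_bytes


policy = FetchPolicy(blocked_hosts={"tracker.example"}, max_bytes=1_000_000)
print(policy.allows_host("gemini://tracker.example/page"))
```

Because each fetch is a single standalone transaction, refusing one never changes how any other document is displayed.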

The firm commitment to the principle of "one network transaction per click", which has left a few marks on the protocol design, isn't just in aid of user autonomy. It also contributes to user privacy (see answer 4.2.1), because if processing a document served from one host could trigger the automatic fetching of another file from another host, that could be used to facilitate tracking.

4.2.3 Non-extensibility

The web has slowly but very surely mutated from an electronic library of interconnected documents - which is what Gemini first and foremost aims to be - into a general purpose computing platform. User privacy (see answer 4.2.1) and user autonomy (see answer 4.2.2) have necessarily suffered, since the changes along the way have invariably given more control and more abilities to those doing the publishing but none to those doing the browsing. The web was not designed from day one to be anything like what it is today, and yet it's continued to be built on the same foundation of HTML and HTTP the whole time. How was such a dramatic and unforeseen change able to happen without needing to completely replace the foundation? The mutation was possible because HTML and HTTP are both extremely extensible.

If you know a little HTML, it's easy to see this. Suppose that the only four tags that existed were <a>, <b>, <i> and <p>. Later, if somebody wants to add a new feature, it's extremely obvious how to do it. Just pick a new letter and re-use the angle bracket syntax, say, <u>. You don't even need to stick to one letter, so the space of possible tags which could exist is literally infinite. A parser which can parse a version of HTML with 100 tags isn't much more complicated than one that can parse a version with 10 tags. Browser implementers can, and did, add tags that weren't designed into HTML. Browsers that didn't support the tag didn't crash when rendering pages that used it. After all, even if no wild HTML extensions are out there, users will make typos sometimes, so browsers have to be robust against unfamiliar tags.

Suppose there are only two browsers in the world, A and B, each with exactly 50% of the userbase. If browser A adds support for a new HTML tag, the first sites to use it won't work as intended for users of browser B. But because half the world uses A, some users of browser B will see the new site working correctly on the computer of a friend or co-worker. Maybe 10% of them will think the new tag is so cool, they switch browsers. Now browser A has 55% of the userbase. They can tell web authors that their cool new tag works for "the majority of users", so more people start using it. Browser B continues to lose users to browser A. Eventually, they cave in and add support for the new tag. Now it's to all intents and purposes a part of HTML. Rinse and repeat for a few years, and the tags can multiply out of control. It doesn't matter if the people in charge of the official definition of HTML don't want this to happen for very good reasons. They can stomp their feet and yell all they want, but you can't throw browser developers in jail for ignoring a spec, so extensions will happen.

If you know a little about HTTP, you'll see how the same is true of response and request headers, and request methods.

Even Gopher is extensible by design. The items in a Gopher menu are assigned an item type, indicated by a single number, a single symbol (like +) or a single letter, with uppercase and lowercase letters distinguished. That makes for over 62 possible item types! At least that's a finite number. And RFC 1436 explicitly encourages "local experiments" (as long as they aren't machine specific). In practice, despite being around for even longer than the web and never completely dying out, remarkably little extension of Gopher has actually happened. Two item types, i and h, which are not in RFC 1436 are widely used and widely supported, but that's it.

Why didn't Gopher undergo the same uncontrolled ballooning as the web? Part of the reason might be that because Gopher is such a simple protocol it's very easy to write your own client. If something is easy, more people do it, so there are a lot of Gopher clients out there. The dynamic between the two big web browsers A and B described above wouldn't play out anywhere near the same way if there were 10 different browsers each with roughly 10% of the userbase. An extension can only become universally adopted in that scenario if almost everybody is in agreement that it really is for the best. This probably isn't the whole story behind Gopher's stability (once the web came along, Gopher would only have appealed to people who liked it as it was, and were less prone to want to extend it), but it has surely played some role. The principle that client diversity induces protocol stability is real.

Gemini was designed from the beginning with all of the above very clearly in mind. We didn't want to design a protocol around user privacy and user autonomy and then watch those things slowly erode as Gemini mutated in the wild once it left our control, so we went all out on non-extensibility. We tried to design a protocol that was hard to extend. We didn't just refuse to add any deliberate "hooks" to the protocol for people to hang arbitrary additional features off in the future; we tried to avoid inadvertently leaving any slightly pointy bits that weren't intended as hooks but might still function that way if you took care to hang something small enough at just the right angle. We tried to design a smooth, shiny protocol with a mirror polish that any attempted extensions would just slide right off. We really didn't want to take any chances, so in addition to making it smooth and shiny we also tried to make Gemini very simple to implement (see answer 4.2.4), to encourage as many different clients as possible for extra stability.

Our twin commitments to non-extensibility and simplicity of implementation are the reason that a lot of "nice things" from the web are missing in Gemini. Fewer features and the occasional rough edge are the prices we've knowingly and willingly paid to stand our ground as firmly as we can on the things that are most important to us.

4.2.4 Simplicity of implementation

Gemini's design strives for ease of implementation on two levels, conceptual and practical.

On the conceptual level, Gemini is designed to be "radically familiar". It is based on a very traditional internet architecture of clients and servers, requests and responses. There's no peer to peer networking, no distributed hash tables, no blockchains, no content based addressing or any other exciting, fun new technologies you might expect from a young, modern protocol. That's not because we think those things are necessarily bad, but just because they are unfamiliar. A lot of programmers don't understand how they work in any kind of detail. Even the ones who do have a decent theoretical understanding may not have any practical experience. The concepts behind Gemini are really bread and butter stuff in comparison. Anybody who has worked with either the web or Gopher can get a pretty solid handle on Gemini in a single sitting without having to look something up on Wikipedia even once.

On the practical level, Gemini's design is basically a novel way of connecting together some very non-novel technological primitives; mature, standardised and widely used technologies like TLS (1999), URLs (1994) and UTF-8 (1993). You can typically find good support for these technologies in just about any programming language on just about any computing platform that you care to name. As a result, writing Gemini software can feel a lot like joining together Lego bricks. Most of the hard work has already been done. When new languages and new platforms appear, would-be Gemini implementers just need to be patient. Somebody else, probably somebody who hasn't even heard of Gemini, will soon replicate all our required bricks for us, because they are necessary for so many other things.

Another practical aid to implementation is that Gemini was designed with a lot of "opt out" details. For example, Gemini response status codes are two digits long, but are designed so that minimal clients can decide how to handle the response on the basis of the first digit alone in a way which does not impair core functionality. Similarly, the gemtext markup language defines multiple line types, but these are split into basic and extra line types, with clients being permitted to treat extra line types as equivalent to the simplest of the basic types.
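The first "opt out" can be sketched in a few lines of Python. The category names below follow our reading of the status code ranges in the specification and should be checked against it. The function name is illustrative.

```python
# Coarse behaviour chosen from the first digit of the status code alone.
STATUS_CATEGORIES = {
    "1": "input",
    "2": "success",
    "3": "redirect",
    "4": "temporary failure",
    "5": "permanent failure",
    "6": "client certificate required",
}


def categorise(status: str) -> str:
    """Map a two-digit status string to a category using only its first digit."""
    return STATUS_CATEGORIES.get(status[:1], "unknown")


# A minimal client handles 51 exactly like any other 5x response.
for code in ("20", "31", "51"):
    print(code, categorise(code))
```

A fuller client can still inspect the second digit to give a more specific message, but nothing essential depends on it.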

All these design decisions paid off when Gemini was featured on the popular Hacker News website in May of 2020. There was a Cambrian explosion of Gemini software in a variety of languages. Serious interoperability issues never emerged.

In the early days of Project Gemini, simplicity and minimalism were emphasised as much out of aesthetic and philosophical preference as a practical means to encourage diverse implementations to induce protocol stability. That preference for austere elegance is still there for a lot of Geminauts, but it's probably fair to say we've fallen short of some of our earliest aspirations, which were:

It should be possible for somebody who had no part in designing the protocol to accurately hold the entire protocol spec in their head after reading a well-written description of it once or twice.
A basic but usable (not ultra-spartan) client should fit comfortably within 50 or so lines of code in a modern high-level language. Certainly not more than 100.
A client comfortable for daily use which implements every single protocol feature should be a feasible weekend programming project for a single developer.

Even if Gemini isn't quite this simple, it is several orders of magnitude closer to being this simple than it is to being as complicated as the web. In the early days, we also talked about striving to maximise "power to weight ratio" rather than striving to "minimise weight". That's probably still a good way to describe Gemini.

4.3 Questions about the response and request headers

4.3.1 Why is there only one kind of request, wouldn't something like POST be nice?

This is a deliberate decision made in direct service of the guiding principle of non-extensibility (see answer 4.2.3).

In order for Gemini to support more than one kind of request, there would need to be some way to specify in a request which kind it was, similar to how HTTP requests begin with the name of a method, like GET or POST. As soon as there is such a way, it is immediately obvious how to easily add new kinds of request. The door to extension is wide open. This is how the web went from only GET (HTTP/0.9) to GET, HEAD and POST (HTTP/1.0), to GET, HEAD, POST, PUT, DELETE, CONNECT, OPTIONS and TRACE (HTTP/1.1). We wanted to exclude that possibility.

If there's only one kind of request, there's no need to specify which kind it is. That information can be left implicit. Then the door to extension is closed, which is how we like it.

If there's only going to be one kind of request, something analogous to GET is the obvious choice. After all, GET was the first and originally only method in HTTP.

Gemini is in this regard no different to Gopher, and nobody in Gopherspace ever complains about it.

4.3.2 Why isn't there an equivalent of the HTTP Content-length header?

This is a deliberate decision made in direct service of the guiding principle of non-extensibility (see answer 4.2.3).

If the response header for a successful Gemini transaction were to include multiple different kinds of information, it would be necessary to specify some kind of delimiter to separate the components. Once a delimiter is specified, there is an obvious path to extending the design of the header to include additional pieces of information, simply by reusing the same delimiter. The door to extension is wide open.

If there's only one kind of information, there's no need to specify a delimiter. Then the door to extension is closed, which is how we like it.

If there's only going to be one kind of information about a response, an equivalent of HTTP's Content-type is a better choice than Content-length. Among other advantages, this makes it straightforward for a server to convey information about how a text response is encoded. That information is missing in Gopher, and this complicates writing a robust client which needs to be able to handle an undeclared mix of ISO/IEC 8859-1, UTF-8, KOI8-R and more.

Gopher also has no equivalent of the Content-length header, and unlike the lack of specified text encodings, this has not proven to be a practical obstacle in Gopherspace. Unlike Gopher clients, Gemini clients can distinguish between a transaction which has completed successfully and one which has dropped out mid-transfer due to a network fault or malicious attack even in the absence of Content-length information, via the presence or absence of a TLS shutdown message (the close_notify alert).
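A sketch of that check using Python's standard ssl module follows. It assumes the socket was wrapped with suppress_ragged_eofs=False, so that a connection dropped without a proper TLS shutdown raises ssl.SSLEOFError instead of looking like a normal end of file. The function name is illustrative, and a real client would also want to handle other network errors.

```python
import ssl
from typing import List, Tuple


def read_body(tls_sock, chunk_size: int = 4096) -> Tuple[bytes, bool]:
    """Read until the connection ends; return (data, completed_cleanly)."""
    chunks: List[bytes] = []
    while True:
        try:
            chunk = tls_sock.recv(chunk_size)
        except ssl.SSLEOFError:
            # The peer vanished without a TLS shutdown: treat as truncated.
            return b"".join(chunks), False
        if not chunk:
            # Orderly TLS shutdown: the whole response has arrived.
            return b"".join(chunks), True
        chunks.append(chunk)
```

The socket passed in is assumed to come from something like context.wrap_socket(sock, server_hostname=host, suppress_ragged_eofs=False).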

It is true that the inability for clients to tell users how much more of a large file still has to be downloaded and to estimate how long this may take means Gemini cannot provide a very user-friendly experience for large file downloads. This would be true even if Content-length were specified, as such an experience would also require other complications to be added to the protocol e.g. the ability to resume interrupted downloads. Gemini documents can straightforwardly link to resources hosted via HTTPS, BitTorrent, IPFS, DAT, etc. and this is the better option for very large files.

It is true that the lack of Content-length makes it difficult to reuse a connection to reduce network overhead, but that is no great loss for a protocol dedicated to the idea of "one network transaction per click" (see answer 4.2.2).

It is true that with a little care, a "self-terminating" response header could be designed which included more than one kind of information without leaving the door to extension wide open. Since MIME media types are permitted to include whitespace, using whitespace as a delimiter and putting something like Content-type as the final component in a sequence of other components which cannot include whitespace would have resulted in a response header which could not be unambiguously extended. This would allow a simple pair of Content-size and Content-type. This possibility was genuinely overlooked. It would have slightly complicated serving dynamically generated content, where the Content-size is not known in advance, and would render impossible some of the interesting "streaming" experiments which have appeared in Geminispace. It's unclear whether this would have been a good trade-off, given the lack of obvious problems caused by the missing Content-size information in either Geminispace or Gopherspace.
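To see why such a header would resist extension, consider this Python sketch of a purely hypothetical format, "<STATUS> <SIZE> <MIME TYPE>", which was never part of Gemini. The parser splits on the first two spaces only, so anything appended later is indistinguishable from part of the media type.

```python
from typing import Tuple


def parse_hypothetical_header(line: str) -> Tuple[int, int, str]:
    """Parse the imagined '<STATUS> <SIZE> <MIME TYPE>' header (not real Gemini)."""
    status, size, mime = line.rstrip("\r\n").split(" ", 2)
    return int(status), int(size), mime


print(parse_hypothetical_header("20 1024 text/gemini; charset=utf-8\r\n"))

# A would-be fourth field cannot be told apart from the media type's own whitespace:
print(parse_hypothetical_header("20 1024 text/gemini; charset=utf-8 extra-field\r\n"))
```

In the second call "extra-field" simply ends up inside the media type string, so existing clients would have no unambiguous way to read it as a new component.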

4.3.3 Why isn't a protocol version number included with requests or responses?

This is a deliberate decision made in direct service of the guiding principle of non-extensibility (see answer 4.2.3).

Since Gemini was designed from day one to resist any efforts at extension, it would be entirely self-defeating to include a facility for a smooth upgrade to "Gemini 2.0" in the future. There will never be a Gemini 2.0.

It is true that a method for signalling protocol version would be useful even in the absence of a desire for adding new features, by allowing corrections to defects which are discovered in the future. The hope is that Gemini is sufficiently simple that all but the most obscure and inconsequential of bugs can be discovered through extensive testing by the early adopters prior to freezing the specification. We are betting on being able to "get it right the first time", or at least getting close enough to right that we don't find it unbearable to live forever after with our small mistakes.

It is not true that the specification of Gemini follows a "living document" philosophy of eternal changeability. It was planned from day one that the specification would eventually be very clearly and very loudly finished and frozen.

This "one and done" approach to protocol design may seem radical or unwise, but we're cautiously optimistic. As ever, we take direct inspiration from Gopher, a simple protocol whose specification has not been changed in about 30 years, yet remains as functional and useful as it ever was and is still winning hearts and minds today. We see no reason Gemini can't follow a similar path, and we're not the only recent tech project to try something like this. The Hare programming language has a similar philosophy:

we plan to freeze the language once it reaches 1.0 and cease development of new language features. Following 1.0, the only specification changes will be clarifications and minor corrections. We won’t have the perfect language, and we’ll have to live with our oversights, but that’s okay: we’ll let the next language improve on our ideas. What we want is a language that we can rely on for as long as possible, and that requires a deep commitment to stability.

Quoted from "Hare is a boring programming language"

The idea of anything software-related being "finished" is anathema to the modern computing mentality, but it shouldn't be. We'll finish the Gemini protocol once and for all, and then the clients and servers will asymptotically tend toward being finished once and for all as well as the bugs get discovered and squashed. An ever-increasing length of time between subsequent versions of a program ought to be considered a mark of quality and good design, and we look forward to being proud of good quality Gemini programs which are reliable, stable finished products.

4.4 Questions about text/gemini

4.4.1 Why doesn't text/gemini have support for inline links?

There are multiple reasons for this, but this design is primarily in service of the guiding principle of simplicity of implementation (see answer 4.2.4). Because text/gemini is an entirely new format defined specifically for Gemini, client authors will typically need to write their own code to parse and render the format from scratch, rather than being able to rely on a pre-existing, well-tested library. Therefore, it is important that the text/gemini format is extremely simple to handle correctly. A strictly line-based format where text lines and link lines are separate concepts achieves this. There is no need for clients to scan each line character-by-character, testing for the presence of some special syntax to detect a link. Even the simplest possible inline link syntax would introduce the possibility of malformed syntax, e.g. forgetting to close a tag before opening a new one, which clients would need to be robust against. It would likely also introduce various edge cases whose handling would either need to be explicitly specified (leading to a longer, more tedious spec which was less fun to read and harder to hold in your head), or left undefined (leading to inconsistent behaviour across different clients). A simple line-based format is much, much easier to work with.
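For a sense of scale, here is a sketch in Python of how a client might tell link lines from text lines. It assumes the link line form "=>", optional whitespace, a URL, then an optional label. It ignores pre-formatted blocks for brevity, and the function name is illustrative.

```python
from typing import Tuple


def parse_line(line: str) -> Tuple[str, ...]:
    """Classify one gemtext line by its first two characters only."""
    if line.startswith("=>"):
        parts = line[2:].strip().split(maxsplit=1)
        if parts:
            url = parts[0]
            # With no label given, fall back to showing the URL itself.
            label = parts[1] if len(parts) > 1 else url
            return ("link", url, label)
    return ("text", line)


print(parse_line("=> gemini://example.org/ An example capsule"))
print(parse_line("Just an ordinary line of text."))
```

The decision never looks past the start of the line, so there is no tag to leave unclosed and no nesting to get wrong.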

Using a system of text lines and link lines isn't just easy to parse, though. It tends to result in very clean document layouts. It encourages including only the most important or relevant links, organising links into lists which group related links together, and allows you to give each link a maximally descriptive label without having to worry about whether or not that label fits naturally into the flow of your main text. It takes some getting used to, but it's worth it. Gopher menus exert the same influence, and the ease of navigating a Gopherhole is something that lots of Gopher users appreciate and which it was considered worthwhile carrying forward into Gemini. It's not an arbitrary limitation for the sake of toughness, it really bears fruit.

If you're a unix geek, you should appreciate how nice it is to work with a line-based markup format using standard command line tools! Given a group of text/gemini documents, you can extract the URLs from all the links in all the documents, remove the duplicates and then list them in order from most to least commonly linked to using just grep, awk, sort, and uniq. Think how much more involved this task would be with HTML documents!
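For readers who would rather script it than pipe it, roughly the same tally can be sketched in Python. The function name is illustrative, and for brevity the sketch does not skip pre-formatted blocks.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def rank_links(documents: Iterable[str]) -> List[Tuple[str, int]]:
    """Count link URLs across gemtext documents, most linked-to first."""
    counts: Counter = Counter()
    for doc in documents:
        for line in doc.splitlines():
            if line.startswith("=>"):
                fields = line[2:].split()
                if fields:
                    counts[fields[0]] += 1
    return counts.most_common()


docs = [
    "Intro\n=> gemini://a.example/ A\n=> gemini://b.example/ B",
    "=> gemini://a.example/ A again",
]
print(rank_links(docs))
```

As with the command line version, the only "parsing" needed is a test on the first two characters of each line.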

4.4.2 Why doesn't text/gemini have support for bold or italic text?

Basically for the same reason it doesn't support inline links (see question 4.4.1). It seems on the face of it like it would be a trivial extra little thing to add, but it would force a switch from line-by-line parsing of the format with a single bit of state ("am I in pre-formatted mode or not?") to character-by-character, or at least word-by-word, parsing of the format with three bits of state, introduce a bunch of edge cases and make it easy to accidentally mangle a document.
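
To make the "single bit of state" concrete, here is a rough Python sketch of a line-by-line classifier. It is an illustration rather than a complete or authoritative parser: the function name and the kind labels are invented for this example, and it treats any run of leading "#" characters as a heading.

```python
def classify_lines(text):
    """Yield (kind, content) pairs for each line of a text/gemini document."""
    preformatted = False  # the single bit of parser state
    for line in text.split("\n"):
        line = line.rstrip("\r")
        if line.startswith("```"):
            # A toggle line flips the state and is not itself rendered.
            preformatted = not preformatted
            continue
        if preformatted:
            yield ("pre", line)
        elif line.startswith("=>"):
            yield ("link", line[2:].strip())
        elif line.startswith("#"):
            yield ("heading", line.lstrip("#").strip())
        elif line.startswith("* "):
            yield ("list", line[2:])
        elif line.startswith(">"):
            yield ("quote", line[1:].strip())
        else:
            yield ("text", line)
```

Every decision is made by looking at the first few characters of a line and at one boolean. Nothing ever needs to be scanned character-by-character.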

4.4.3 Why doesn't text/gemini support inline images?

This is a deliberate decision made in direct service of the guiding principle of user autonomy (see answer 4.2.2), specifically the idea that text/gemini documents should have no way to trigger additional network requests.

Gemini is in this regard no different to Gopher, and nobody in Gopherspace ever complains about it. On the contrary, just like Gopher users enjoy the quick, easy and transparent navigation of Gopherholes that results from a hierarchical menu system rather than inline links, Gopher users enjoy a strongly text-centric space for distraction-free reading without sensory overload caused by excessive gratuitous imagery, and for its encouragement of substance over style (nothing could be more discouraging for advertisers). All of this was considered worth carrying forward into Gemini.

4.4.4 Why doesn't text/gemini have support for styling?

This is a deliberate decision made in direct service of the guiding principle of user autonomy (see answer 4.2.2).

Gemini takes the position that visual styling of Gemini content belongs under the sole and direct control of the reader, not the writer. Not everybody has the same taste in colours and fonts, and no single way of styling a page will be optimal for all readers, all devices and all lighting conditions. There's much more at stake here than the age-old divide in preference for dark text on a light background or vice versa. People with reading disabilities like dyslexia may benefit tremendously from using specially designed fonts, for example, and people with impaired vision may have difficulty reading text if the contrast between background and foreground is too low, no matter whether it's closer to dark mode or light mode. A "one size fits all" styling system where content looks the same everywhere is guaranteed to perform poorly for a lot of people. A complicated styling system like CSS which can specify different styling for different devices and contexts would not only violate the guiding principle of simplicity of implementation (see answer 4.2.4), it would burden every individual author with the task of making sure their capsule works well everywhere and for everyone. Experience from the web suggests that accessibility issues will typically be an afterthought at best. It's much simpler, and in fact much more liberating for content authors, to let content just be content, and leave styling to the user, who after all knows their own preferences and needs better than anybody else.

Defining something like a simplified CSS would also violate the principle that following a link in Geminispace should result in downloading a single file from a single server, known in advance, and that the functionality of any one downloaded file should be in no way influenced by refusing to download any other file (see answer 4.2.2 again).

It's a myth that without some equivalent of CSS Geminispace is doomed to be visually dull and uninteresting. There are graphical Gemini clients with high quality font rendering and beautiful typography. People who value those things can enjoy that reading experience absolutely everywhere in Geminispace, even when reading content written by authors who don't care about styling at all. Beautiful Gemini clients don't even necessarily have to make capsules look good but all the same, without any individual personality. Clients like GemiNaut and Lagrange use parts of the URL as a seed for random number generators that control subtle automatic variations in capsule styling. All the pages from the same capsule look the same, and pages from different capsules look slightly different. Over time, you subconsciously learn these cues so that your favourite capsules end up looking familiar and comforting, and you can tell when you've ended up somewhere new, or even get a vague sense of "Hey, haven't I been here once before?". This is actually cutting edge UI stuff, and the Gemini community can be rightly proud of it.
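
The general idea of URL-seeded styling can be sketched in a few lines of Python. This is not the algorithm used by GemiNaut or Lagrange, just an assumed illustration of the technique: hash the hostname and turn part of the hash into a hue. The function name capsule_hue is hypothetical.

```python
import hashlib
from urllib.parse import urlsplit


def capsule_hue(url):
    """Map a URL's hostname to a stable hue in degrees (0 to 359)."""
    host = urlsplit(url).hostname or ""
    digest = hashlib.sha256(host.encode("utf-8")).digest()
    # Two bytes of the hash are plenty to pick one of 360 hues.
    return int.from_bytes(digest[:2], "big") % 360
```

Every page on the same capsule gets the same hue because only the hostname feeds the hash, while different hostnames will usually land on different hues.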

4.4.5 Why didn't you just use Markdown instead of defining text/gemini?

The text/gemini markup borrows heavily from Markdown for its syntax. There are lots of Markdown libraries for all major languages and platforms, just like there are TLS libraries, so on the face of it it would seem that adopting Markdown as the "native content type" of Gemini would not be incompatible with the guiding principle of simplicity of implementation (see answer 4.2.4). But actually there are multiple reasons not to do this.

For one, there are actually many subtly different and incompatible variants of Markdown in existence, so unlike TLS all the different libraries are not guaranteed to behave similarly. There is an obvious candidate for a clearly specified "standard Markdown" which could have been used, namely CommonMark. Everything said in the rest of this answer applies if "Markdown" is understood specifically to mean "CommonMark".

Markdown permits inline links, but Gemini deliberately eschews these in an attempt to replicate the clarity of layout and ease of navigation which Gopher has proven arises naturally from a "link lines" system (see question 4.4.1).

Markdown permits inline images, but Gemini deliberately eschews these both in the interests of user autonomy and user privacy and in an attempt to replicate the lack of distraction, focus on substance over style and lack of advertising which Gopher has proven arises naturally from a text-centric space (see question 4.4.3).

Finally, but perhaps most convincingly, Markdown is in fact fundamentally tied to the concept of HTML (and allows the inclusion of arbitrary raw HTML, so it's in no way a simple, clean subset of HTML even if it is most often used as one). All of those Markdown libraries available for all major languages and platforms which would supposedly make it easy to implement a Gemini client if Markdown were the protocol's native content type do not actually do anything for the client author beyond transforming a format which is relatively pleasant to view in raw form (Markdown) into a more complex format which definitely isn't (HTML). If you were writing a simple client for a unix terminal emulator and wanted to use ANSI escape codes to render headings as bold or underlined text, transforming Markdown to HTML using a library does not actually get you closer to this goal. All it would do is require you to then parse the HTML to detect <h1>, <h2> and <h3> elements to emit the ANSI codes. You'd be better off just trying to handle the raw Markdown yourself without using any libraries, and that's less straightforward than trying to handle raw text/gemini yourself.
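
For comparison, here is roughly what the terminal heading case looks like when working from raw text/gemini. This is a sketch under assumptions: the choice of styles per heading level is arbitrary, the function name is made up, and a real client would also track pre-formatted mode so that "#" lines inside a pre-formatted block are left alone.

```python
BOLD = "\033[1m"
UNDERLINE = "\033[4m"
RESET = "\033[0m"


def render_heading(line):
    """Style a heading line with ANSI escape codes; return other lines unchanged."""
    # Check the longest prefix first so "###" is not mistaken for "#".
    for prefix, style in (("###", UNDERLINE), ("##", BOLD), ("#", BOLD + UNDERLINE)):
        if line.startswith(prefix):
            return style + line[len(prefix):].strip() + RESET
    return line
```

There is no intermediate HTML to generate and then parse again. The heading level is known from the first three characters of the line.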

Of course, it is possible to serve Markdown over Gemini. The inclusion of a text/markdown media type in the response header will allow clients to unambiguously recognise it, and nothing prohibits a client author from supporting both text/gemini and text/markdown. In fact, some Gemini clients do support Markdown. In principle, nothing could stop Markdown supplanting text/gemini as the most common content type in Geminispace and clients which support Markdown being used much more widely than clients which don't. In actuality, after more than four years, Markdown is less commonly served via Gemini than JSON or HTML.
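
A client that wants to support both formats only needs to look at the response header. The following Python sketch assumes the header has the form of a two-digit status, a space, and a meta string, terminated by CRLF, with the meta string carrying the media type for success statuses in the 20s. The function names and the returned labels are invented for illustration.

```python
def parse_header(header):
    """Split a response header line into (status, meta)."""
    status, _, meta = header.rstrip("\r\n").partition(" ")
    return int(status), meta


def choose_renderer(header):
    """Pick a renderer name based on the media type in a success response."""
    status, meta = parse_header(header)
    if status // 10 != 2:
        return "not-a-success"
    # Drop parameters such as "; charset=utf-8" before comparing.
    media_type = meta.split(";", 1)[0].strip().lower()
    if media_type == "text/gemini":
        return "gemtext"
    if media_type == "text/markdown":
        return "markdown"
    return "other"
```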

4.5 Questions about cryptography in Gemini

4.5.1 How can you say Gemini is simple if it uses TLS?

Some people are upset that the TLS requirement means they need to use a TLS library to write Gemini code, whereas e.g. Gopher allows them full control by writing everything from scratch themselves.

Of course, even a "from scratch" Gopher client actually depends crucially on thousands of lines of complicated code written by other people in order to provide a functioning IP stack, DNS resolver and filesystem. Using a TLS library to provide a trustworthy implementation of cryptography is little different.

Gemini also turns TLS client certificates - very rarely seen on the web - into a first-class citizen with in-band signalling of their requirement. This allows restricting access to Gemini resources to certain parties, or voluntarily establishing "sessions" with server-side applications, without having to pass around cookies, passwords, authentication tokens or anything else you may be used to. It's much closer to SSH's notion of "authorized keys" and is, in fact, a much simpler approach to user authentication.
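
As a rough sketch of what this looks like from the client side in Python: the in-band signal is assumed here to be a response status of 60 (client certificate required), after which the client can retry with a certificate loaded into its TLS context. The function names are hypothetical, and CA verification is switched off on the assumption that the client validates server certificates separately, e.g. via TOFU.

```python
import ssl

CLIENT_CERT_REQUIRED = 60  # assumed status code for "client certificate required"


def wants_client_certificate(status):
    """True if the server is asking the client to present a certificate."""
    return status == CLIENT_CERT_REQUIRED


def make_context(certfile=None, keyfile=None):
    """Build a TLS client context, optionally presenting a client certificate."""
    context = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    # No CA validation here: server certificates are assumed to be
    # checked by some other mechanism after the handshake.
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE
    if certfile is not None:
        context.load_cert_chain(certfile, keyfile)
    return context
```

A client might call make_context() for ordinary requests and only build a context with a certificate after wants_client_certificate() returns true and the user has agreed to identify themselves.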

4.5.2 Why don't you care about retrocomputing support?

Gopher is so simple that computers from the 80s or 90s can easily implement the protocol, and for some people this is one of the great virtues of Gopher. The TLS requirement of Gemini limits it to more modern machines.

Old machines are awesome, and keeping them running, online and useful for as long as possible is an awesome thing to do. But it also makes no sense for the vast majority of internet users to sacrifice any and all privacy protection to facilitate this. Remember, though, that Gemini does not aim to replace Gopher, so the retro-compatible internet is not directly endangered by it. In fact, people serving content via Gopher right now are strongly encouraged to start also serving that same content via Gemini simultaneously. Retrocomputing fans can continue to access the content via Gopher, while modern computer users who wish to can switch to Gemini and reap some benefits.

And, hey, it's not like Gemini is a strictly retro-no-go-zone. Your ZX Spectrum may not be able to handle TLS, but there is in fact a functioning Gemini client for the Commodore Amiga.

AmiGemini is a Gemini+Spartan+Gopher browser for the Commodore Amiga

In fact, you can use any Gopher-capable retrocomputer to access Geminispace if you run Cosmarmot on any Gemini-capable machine on the same network:

Cosmarmot, a proxy server that translates from gemini servers to gopher clients

4.5.3 Why use TLS for crypto instead of something more modern like the Noise protocol?

TLS is certainly not without its shortcomings, but:

There are bindings to TLS libraries available for almost every programming language under the sun
Many developers are already at least partially familiar with TLS and therefore don't need to learn anything new to implement Gemini
Most users are already trusting TLS to secure their web browsing and email, and therefore don't need to decide whether or not they want to trust some unfamiliar technology to start using Gemini
TLS is a deeply entrenched industry standard, whose definition and implementations will both continue to be scrutinised and improved by security experts for the foreseeable future, and that work will happen for reasons entirely unrelated to Gemini - it makes a lot of sense for a small project to "freeride" like this.

4.5.4 Why bother with crypto at all? Nobody is making credit card purchases on Gemini. It's all just public content.

The 90s called, they would like their attitudes toward encryption back!

If Gemini were a plaintext protocol like Gopher, it would be trivial for your ISP, or, when you are out and about using public WiFi networks to access the net, for whoever runs your favourite cafe or your local public library or some random airport or hotel to:

Read everything that you read (and maybe sell that information)
Censor anything you try to read
Insert adverts into anything you read
Insert malware into software you download

This is not paranoia. All of these things have actually happened in the past. Back when the web was mostly unencrypted, there were multiple cases of commercial ISPs accepting money from advertisers to insert adverts into other people's websites, without either the ISP's customers or the website authors consenting or even knowing it was happening. This really happened! Only cryptography prevents this kind of tampering being widespread today. Again, it's not paranoid to think network providers will do this, it's naive to think that they won't. Trusting the internet is for chumps.

If you're using a public WiFi network which is itself unencrypted, which is not uncommon, then the first ability above, to read everything that you read, would apply not only to the people running the hotspot, but to everybody else sitting in the same cafe. Even if you are only reading perfectly legal public information (like browsing a Gemini interface to Wikipedia), it's nobody's business but your own if you want to read about politics, or religion, or sexual health, or anything else. In a big, bustling, anonymous city, this might not be a big deal. In small towns, where opinions are narrow and rumours travel fast, it's not remotely unreasonable to be uncomfortable literally broadcasting your online reading habits.

It might seem crazy to worry about things like this when Gemini is such a small and obscure technology, but refusing to consider them dooms it in advance to remaining smaller and more obscure than it otherwise might become.

It's true that just requiring TLS is not a silver bullet that makes everything above impossible (e.g. the hostnames of the servers you visit are currently still leaked via the SNI extension in the TLS handshake, and traffic analysis might help to narrow down which specific resources you are requesting), but it substantially raises the bar.

4.5.5 Okay, but why use TLS in a weird way, without Certificate Authorities?

The Certificate Authority system is by no means without shortcomings and failures with regard to security. By now these are quite widely documented and best practices for the web are moving toward layering additional mechanisms on top in an attempt to mitigate these problems (CAA, DANE, HSTS, HPKP, etc). But actually, early objections in the Gemini community to embracing the Certificate Authority system for TLS were more ideological than security-minded. Gemini's designers and early adopters came overwhelmingly from Gopherspace and public access unix systems, both places where the ideals of a non-commercial, decentralised and egalitarian internet are taken very seriously.

On the web, just six certificate authorities account for about 90% of all secured websites. With the exception of the non-profit Let's Encrypt (discussed below), the rest of these CAs are all large commercial companies operated for profit and headquartered in the United States; the CA system is one where "trust" is viewed as a commodity to be bought and sold. The next most common category of CA after for-profit companies are CAs which can be considered to be directly or indirectly owned and operated by national governments, including some governments with less than stellar human rights records.

Let's Encrypt make CA-signed TLS certificates available to anybody free of charge, which is a wonderful thing and has taken the commercial nature of the CA system out of the spotlight compared to earlier times, but Let's Encrypt is free to use, not free to run. It's operated by the Internet Security Research Group, who as a non-profit depend on donations to keep the ACME servers running. The Chrome project (by Google) are a Diamond sponsor, and Meta (owner of Facebook) are a Platinum sponsor. Google and Meta are surveillance companies, and the money they donate to support Let's Encrypt is acquired through precisely the kind of practices that a lot of people use Gemini in an attempt to escape from. This isn't meant as an attack on Let's Encrypt, and it'd be hypocritical if it were, as the web mirror of the Project Gemini capsule uses Let's Encrypt certificates. We're just pointing out that for people who really hate surveillance capitalism and dream of a decentralised and non-commercial internet, it's hard to get warm fuzzies about the CA system even from Let's Encrypt.

TOFU represents an entirely self-sufficient approach to certificate validation, which doesn't rely on any infrastructure beyond the client and server which are talking to each other, and has no additional costs to either party on top of that conversation.
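
The core of a TOFU check fits in a few lines. Below is a minimal Python sketch, not a reference implementation: it assumes the client keeps a mapping from hostname to a SHA-256 fingerprint of the server's DER-encoded certificate (which Python's ssl module can supply via getpeercert(binary_form=True)), and it ignores details like certificate expiry. The function name check_tofu and the three result labels are made up for this example.

```python
import hashlib


def check_tofu(known_hosts, host, cert_der):
    """Return 'new', 'match' or 'mismatch' for the certificate a host presented."""
    fingerprint = hashlib.sha256(cert_der).hexdigest()
    stored = known_hosts.get(host)
    if stored is None:
        # First contact with this host: remember its certificate and trust it.
        known_hosts[host] = fingerprint
        return "new"
    return "match" if stored == fingerprint else "mismatch"
```

On a "mismatch" a client would typically warn the user rather than silently continue. No third party is consulted at any point, which is the self-sufficiency described above.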

4.5.6 But TOFU isn't secure!

Any statement that "X is secure" or "X is insecure" without a lot of additional context is necessarily a gross simplification. All of security is economics. Attacks have costs, risks and probabilities of success, and attackers stand to receive potential rewards. Defences have costs and probabilities of success of their own, and defenders stand to incur potential losses. All any security mechanism can do is push these interrelated factors around, to disincentivise attackers and reduce risks of defenders. Something is "secure" within the framework of a particular threat model when these factors are rationally assessed to be well balanced.

TOFU is weak against active attackers launching targeted MITM attacks against individual users' connections to particular hosts. In a context that involves rich people using the internet to do their banking or shopping, like the web, that's a fatal weakness. But the first-class application of Gemini is to allow people to explore a large electronic public library on their own terms. In that context, there's no incentive for active, targeted attacks. There are, however, incentives for cheap, automated bulk surveillance and opportunistic eavesdropping.

A grounded threat model for Gemini involves ISPs and public WiFi operators who sell information on their users' reading habits for their own profit (legal without consent in the US since 2017) or who are compelled by their government to report people reading certain things (like foreign news sources, pro-democracy texts, banned religious texts, information on reproductive rights, etc.), as well as random creeps who like to passively eavesdrop on unsecured WiFi traffic. TLS+TOFU is 100% effective against the latter. It's not perfect against the former, but in today's highly mobile computing environment it's better than you might think. You can visit a dozen capsules on your home internet connection, then take your laptop, tablet or phone to your favourite cafe and visit the same capsules on their free WiFi, and then repeat this at your local public library with their free WiFi, all on the same day, for the cost of a latte and a bus ticket. If any of those three internet connections are launching bulk MITM attacks on all TLS traffic, you'll soon know about it. Of course not every Geminaut will do this, and certainly not regularly, but it only takes one to do it and spread the word, with the ISP involved potentially incurring reputational damage and losing customers. Compared to a plaintext protocol like Gopher where bulk surveillance is undetectable and therefore plausibly deniable, TLS+TOFU raises the risk of practicing bulk surveillance, which acts to disincentivise it, and raises the awareness of it when it still happens.

Of course, nothing whatsoever prohibits very privacy conscious Geminauts from adding additional certificate validation layers on top of TOFU. There are already proof-of-concept implementations of using the Tor network to validate new certificates from multiple network perspectives. If DNSSEC ever takes off, DANE can add an additional layer too.

Links

Next section: Contributing to the Gemini project

Previous section: Project history, organisation and trivia

All FAQ sections in one document.