(originally posted in Gopherspace on 2019-07-29)
This post is not up to my usual standards due to time pressure and is more rambling/thinking out loud than concrete, but here it is.
Here's a proposal for a two-digit status code scheme for Gemini, inspired by the idea I had at the end of a previous post:
Previous post "Status codes" (2019-06-27)
Two digit schemes are necessarily more complicated and scary looking than single digit schemes. This one is *very* carefully designed so that it is possible for either client or server authors to get away with ignoring the second digit:
Gemini uses two-digit numeric status codes. Related status codes have the same first digit. Importantly, the first digit of a Gemini status code does not group codes into vague categories like "client error" and "server error" as per HTTP. Instead, the first digit alone provides enough information for a client to determine how to handle the response. By design, it is possible to write a simple but feature-complete client which only looks at the first digit. The second digit provides more fine-grained information, for unambiguous server logging and to enable writing smarter bots, or comfier interactive clients which provide a slightly more streamlined user interface.

From the perspective of a simple client looking only at the first digit, there are 6 status codes in Gemini. They are:

1 The requested resource accepts user input. The header text is a prompt which should be displayed to the user. The same resource should then be requested again with the user's input included as a query.
2 The request was handled successfully and a response body will follow the response header. The header text is a MIME type which applies to the response body. cf HTTP status 200.
3 The server is redirecting the client to a new location for the requested resource. There is no response body. The header text is a URL for the requested resource. The URL may be absolute or relative. The redirect should be considered temporary, i.e. clients should continue to request the resource at the original address and should not perform convenience actions like automatically updating bookmarks. cf HTTP status 307.
4 The request has failed. There is no response body. The nature of the failure is temporary, i.e. an identical request MAY succeed in the future. The header text may provide additional information on the failure, and should be displayed to human users.
5 The request has failed. There is no response body. The nature of the failure is permanent, i.e. identical future requests will also fail and should not be attempted. The header text may provide additional information on the failure, and should be displayed to human users.
6 The requested resource requires client-certificate authentication to access. If the request was made without a certificate, it should be repeated with one. If the request was made with a certificate, the server did not accept it and the request should be repeated with a different certificate. The header text may provide additional information on certificate requirements or the reason a certificate was rejected.

Note that for basic interactive clients for human use, errors 4 and 5 may be handled identically. Basic clients may also choose not to support client-certificate authentication, in which case only four distinct status handlers are required (for 1, 2, 3 and a combined 4-5).

The full two-digit system is:

10 Equivalent to the single digit status 1.
20 Equivalent to the single digit status 2.
30 Temporary redirect, i.e. equivalent to the single digit status 3. Could be used for things like load balancing, or redirecting to a region-specific page based on IP geolocation.
31 Permanent redirect. The requested resource should be consistently requested from the new URL provided in future. Tools like search engine indexers or content aggregators should update their configurations, and end-user clients may update bookmarks etc. Note that single digit clients will still end up at the right place if they read this as "3", they just won't be able to make use of the knowledge that this redirect is permanent, so they'll pay a very small performance penalty by having to follow the redirect each time.
40 A temporary error has occurred and no more specific information is available.
41 Server is overloaded.
42 CGI process died or timed out.
43 Rate limiting is in effect, status message indicates number of seconds to wait before another request.
50 A permanent error has occurred and no more specific information is available.
51 Not found, cf HTTP 404.
53 Gone, cf HTTP 410. This resource isn't coming back at this address and it should be removed from indexes.
59 Bad request, cf HTTP 400.
60 A client certificate is required to proceed.
61 The server is requesting the initiation of a transient client certificate session. The client should ask the user if they want to accept this and, if so, generate a disposable key/cert pair and re-request the resource using it.
62 This resource is protected and a client certificate which the server accepts as valid must be used - a disposable key/cert is not appropriate here.
63 The supplied client certificate is not valid for the requested resource.
64 The supplied client certificate was not accepted because its validity start date is in the future.
65 The supplied client certificate was not accepted because its expiry date has passed.

Note that these codes have been constructed so that simple servers can just send 40, 50 or 60 when a more carefully written server might send a more specific code.

In short, all of the detail and power of the full two-digit system is built into the protocol, but both client authors and server authors need to opt in to that more complex system. It is possible for client authors to opt out by only looking at the first digit and for server authors to opt out by just putting a 0 on the end of the first digit and putting any other information into the header message.
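To make the "opt out of the second digit" idea concrete, here is a rough Python sketch of how a basic client might dispatch on the first digit alone. The action names and function names are made up for illustration and are not part of the proposal; only the meaning attached to each first digit comes from the scheme above.

```python
# Hypothetical action names; the proposal only fixes what each first digit means.
ACTIONS = {
    "1": "prompt",       # show the header text as a prompt, re-request with a query
    "2": "display",      # header text is a MIME type, a response body follows
    "3": "redirect",     # header text is a URL to follow
    "4": "show_error",   # temporary failure
    "5": "show_error",   # permanent failure, handled the same way by a basic client
    "6": "certificate",  # client certificate needed or rejected
}


def basic_action(status: str) -> str:
    """Choose an action from the first digit only, ignoring any second digit."""
    if not status or status[0] not in ACTIONS:
        raise ValueError(f"unknown status: {status!r}")
    return ACTIONS[status[0]]


def is_permanent_redirect(status: str) -> bool:
    """A smarter client can opt in to the second digit, e.g. to update bookmarks."""
    return status == "31"
```

A basic client calling basic_action("31") still gets "redirect" and ends up in the right place. Only a client that also checks is_permanent_redirect() learns that it may update its bookmarks.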
I like this proposed system, and the one other person who has seen it so far (Conman Sean) likes it too. But at the same time, I can definitely hear a voice in the back of my head screaming "this is hugely over-engineered, we don't need it and you only like it because you're pleased with yourself about how nicely the two-digit codes degrade into one-digit codes".
The voice is right that I'm pleased as punch with the whole idea of having a functioning one-digit status code system embedded inside a two-digit status code system. I think this is a very cool idea and I'd like to see it used more widely, in non-Gemini contexts. But just because it is a very cool idea doesn't necessarily mean it's the right idea for Gemini. I worry that, at least in this case, the extra power of the two-digit system is not enough to justify its weight. Consider:
The single digit codes 1 and 2 are not expanded upon at all in the two-digit system. The single digit code 3 is expanded into only two two-digit codes, 30 and 31. The arguments for including a temporary redirect are pretty flimsy - in fact, the only reason explicit temporary and permanent redirects are in there is because it was the first example I thought of where a two-digit scheme could degrade to a one-digit scheme in a totally compatible way. We could probably do without this and then fully half of the one-digit codes are not expanded upon at all, leading us seriously into "why bother?" territory.
The one really compelling reason I can come up with for all the 4x and 5x two-digit codes is that if we tried to go without them and just served up 4 and 5 with the particular error explained in the header, then clients are only going to receive an explanation of what actually happened in whatever human language was spoken by the person who wrote the server. Numeric codes, in contrast, allow clients to present translations of the particular error into whatever language the user would prefer. That's not at all an inconsequential thing for a system that one wants to see widely used, and argues strongly for having distinct status codes for at least the most meaningfully distinct conditions.
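The translation argument can be sketched in a few lines of Python. Everything here is an assumption for illustration: the message table, the language keys, the sample German strings and the function name are invented, and a real client would presumably ship a fuller catalogue. The fallback order (specific code, then the generic x0 code, then the server's own header text) is one possible design, not something the proposal specifies.

```python
# Illustrative message catalogue, keyed by language and then by status code.
MESSAGES = {
    "en": {
        "40": "Temporary error",
        "41": "Server is overloaded",
        "50": "Permanent error",
        "51": "Not found",
    },
    "de": {
        "40": "Vorübergehender Fehler",
        "41": "Server ist überlastet",
        "50": "Dauerhafter Fehler",
        "51": "Nicht gefunden",
    },
}


def localized_error(status: str, header_text: str, lang: str) -> str:
    """Prefer a translation of the exact code, then of the generic x0 code,
    and only then fall back to whatever text the server author wrote."""
    table = MESSAGES.get(lang, {})
    generic = status[0] + "0"
    return table.get(status) or table.get(generic) or header_text
```

With this table, a German-speaking user who hits status 51 sees "Nicht gefunden" no matter what language the server's header text was written in, while a code the client has no entry for (such as 53) degrades to the generic message for its first digit.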
Sloum is a fan of the single character status code idea, and raised to me the interesting possibility of using a single hexadecimal digit (i.e. 0-F) as the entire space of status characters. That's a cute idea which gives us 16 codes. If we trim a little bit of the fat from the two-digit system above, can we fit everything into 16 codes, allowing translation of client interfaces?
0 Bad request
1 Input prompt
2 Success
3 Redirect
4 Not found
5 Temporary server error (overload, CGI failure)
6 Gone
7 Rate limiting in effect
8 Unused
9 Unused
A Transient client cert session requested
B Client cert required for protected resource
C Client cert invalid for this resource
D Client cert outside of validity window
E Unused
F Unused
Status codes 1 and A above refer to ideas introduced in a previous post:
Previous post "Inputs and client certificates" (2019-07-25)
Yeah, seems like we can do it with room to spare. I've structured the above so that everything to do with client certs gets an alphabetical code (the two unused codes E and F don't trouble me too much because the client certificate stuff is the most complicated part of Gemini and it's conceivable that the need for extra codes will arise), which makes it easy for simple clients which don't support client certs to detect any code related to that (e.g. in Python a simple code.isalpha() will return True for any cert-related code and False for anything else). The numeric codes are structured for maximum similarity with HTTP equivalents as a memory aid. The unused 8 and 9 *do* make me nervous...
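Here is a minimal Python sketch of the hexadecimal scheme and the isalpha() trick. The table simply restates the assigned codes from the list above; the names HEX_STATUS, describe and is_cert_related are made up for this example. One caveat: str.isalpha() is True for any letter, not just A-F, so the sketch validates the code separately.

```python
# Assigned codes from the hexadecimal scheme; 8, 9, E and F are unused.
HEX_STATUS = {
    "0": "Bad request",
    "1": "Input prompt",
    "2": "Success",
    "3": "Redirect",
    "4": "Not found",
    "5": "Temporary server error",
    "6": "Gone",
    "7": "Rate limiting in effect",
    "A": "Transient client cert session requested",
    "B": "Client cert required for protected resource",
    "C": "Client cert invalid for this resource",
    "D": "Client cert outside of validity window",
}

VALID_CODES = "0123456789ABCDEF"


def describe(code: str) -> str:
    """Return the meaning of a single hex status character."""
    if len(code) != 1 or code not in VALID_CODES:
        raise ValueError(f"not a status code: {code!r}")
    return HEX_STATUS.get(code, "Unused")


def is_cert_related(code: str) -> bool:
    """Letters are reserved for client certificate business."""
    return code.isalpha()
```

A client without client certificate support could check is_cert_related() first, show a single "certificates not supported" message for every letter code, and only write real handlers for the digits.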
Despite all the effort I put into coming up with the two-digit scheme, I have to admit that the above just feels, intuitively, much more "right" for Gemini. It's small and friendly and approachable. It's specific enough that logging a status code alone is adequately informative.
I'm tempted to just say "screw it, we're using this!" (meaning the hexadecimal scheme), but it's late and I'm sleepy and I know that kind of snap decision is a bad idea...