Two proposed status schemes

(originally posted in Gopherspace on 2019-07-29)

This post is not up to my usual standards due to time pressure and is more rambling/thinking out loud than concrete, but here it is.

Here's a proposal for a two-digit status code scheme for Gemini, inspired by the idea I had at the end of a previous post:

Previous post "Status codes" (2019-06-27)

Two digit schemes are necessarily more complicated and scary looking than single digit schemes. This one is *very* carefully designed so that it is possible for either client or server authors to get away with ignoring the second digit:

Gemini uses two-digit numeric status codes.  Related status codes have
the same first digit.  Importantly, the first digit of Gemini status
codes does not group codes into vague categories like "client error" and
"server error" as per HTTP.  Instead, the first digit alone provides
enough information for a client to determine how to handle the
response.  By design, it is possible to write a simple but feature
complete client which only looks at the first digit.  The second digit
provides more fine-grained information, for unambiguous server logging
and to enable writing smarter bots, or comfier interactive clients
which provide a slightly more streamlined user interface.

From the perspective of a simple client looking only at the first
digit, there are 6 status codes in Gemini.  They are:

1	The requested resource accepts user input.  The header text is
	a prompt which should be displayed to the user.  The same
	resource should then be requested again with the user's input
	included as a query.

2	The request was handled successfully and a response body will
	follow the response header.  The header text is a MIME type
	which applies to the response body.  cf HTTP status 200.

3	The server is redirecting the client to a new location for the
	requested resource.  There is no response body.  The header
	text is a URL for the requested resource.  The URL may be
	absolute or relative.  The redirect should be considered
	temporary, i.e. clients should continue to request the
	resource at the original address and should not perform
	convenience actions like automatically updating bookmarks.
	cf HTTP status 307

4	The request has failed.  There is no response body.  The
	nature of the failure is temporary, i.e. an identical request
	MAY succeed in the future.  The header text may provide
	additional information on the failure, and should be displayed
	to human users.

5	The request has failed.  There is no response body.  The
	nature of the failure is permanent, i.e. identical future
	requests will also fail and should not be attempted.
	The header text may provide additional information on the
	failure, and should be displayed to human users.

6	The requested resource requires client-certificate
	authentication to access.  If the request was made without a
	certificate, it should be repeated with one.  If the request
	was made with a certificate, the server did not accept it and
	the request should be repeated with a different certificate.
	The header text may provide additional information on 
	certificate requirements or the reason a certificate was
	rejected.

Note that for basic interactive clients for human use, errors 4 and 5
may be handled identically.  Basic clients may also choose not to
support client-certificate authentication, in which case only four
distinct status handlers are required (for 1, 2, 3 and a combined 4-5).
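
As a rough illustration of how little a first-digit-only client needs, here is a minimal sketch in Python.  The function name, the supports_certs flag and the returned action strings are all made up for this example, and showing an error for status 6 when certificates are unsupported is an assumption rather than something specified above.

```python
def handle_status(status: str, meta: str, supports_certs: bool = False) -> str:
    """Return the action a first-digit-only client would take.

    `status` is the status code as text ("2", "20", "31", ...); only its
    first character is inspected. The action strings are illustrative.
    """
    first = status[:1]
    if first == "1":
        # Input: show the prompt, then re-request with the answer as a query.
        return f"prompt user: {meta}"
    if first == "2":
        # Success: the header text is the MIME type of the body.
        return f"display body as {meta}"
    if first == "3":
        # Redirect: treated as temporary by a first-digit-only client.
        return f"follow redirect to {meta}"
    if first in ("4", "5"):
        # Temporary and permanent failures share one handler here.
        return f"show error: {meta}"
    if first == "6" and supports_certs:
        return f"retry with client certificate: {meta}"
    if first == "6":
        # Assumption: a client without cert support just reports the problem.
        return f"show error: client certificates are not supported ({meta})"
    return f"show error: unrecognised status {status!r}"
```

Because only the first character is examined, the same function copes with one-digit and two-digit codes alike, e.g. handle_status("31", "/new") takes the same branch as handle_status("3", "/new").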

The full two-digit system is:

10	Equivalent to the single digit status 1.

20	Equivalent to the single digit status 2.

30	Temporary redirect, i.e. equivalent to the single digit
	status 3.  Could be used for things like load balancing, or
	redirecting to a region-specific page based on IP geolocation.

31	Permanent redirect.  The requested resource should be
	consistently requested from the new URL provided in future.
	Tools like search engine indexers or content aggregators
	should update their configurations, and end-user clients may
	update bookmarks etc.  Note that single digit clients will
	still end up at the right place if they read this as "3", they
	just won't be able to make use of the knowledge that this
	redirect is permanent, so they'll pay a very small performance
	penalty by having to follow the redirect each time.

40	A temporary error has occurred and no more specific
	information is available.

41	Server is overloaded

42	CGI process died or timed out.

43	Rate limiting is in effect, status message indicates number of
	seconds to wait before another request.

50	A permanent error has occurred and no more specific
	information is available.

51	Not found, cf HTTP 404

53	Gone, cf HTTP 410.  This resource isn't coming back at this
	address and it should be removed from indexes.

59	Bad request, cf HTTP 400

60	A client certificate is required to proceed

61	The server is requesting the initiation of a transient client
	certificate session.  The client should ask the user if they
	want to accept this and, if so, generate a disposable key/cert
	pair and re-request the resource using it.

62	This resource is protected and a client certificate which the
	server accepts as valid must be used - a disposable key/cert
	is not appropriate here.

63	The supplied client certificate is not valid for the requested
	resource.

64	The supplied client certificate was not accepted because its
	validity start date is in the future.

65	The supplied client certificate was not accepted because its
	expiry date has passed.

Note that these codes have been constructed so that simple servers can
just send 40, 50 or 60 when a more carefully written server might send
a more specific code.  In short, all of the detail and power of the
full two-digit system is built into the protocol, but both client
authors and server authors need to opt in to that more complex
system.  It is possible for client authors to opt out by only looking
at the first digit and for server authors to opt out by just putting a
0 on the end of the first digit and putting any other information into
the header message.
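
The two opt-outs described above can be sketched in a few lines of Python.  This is only an illustration: the names FULL_CODES, generic_code and single_digit are invented here, and the descriptions are abbreviated from the table above.

```python
# Every code from the full two-digit table above, with a short description.
FULL_CODES = {
    10: "input", 20: "success",
    30: "temporary redirect", 31: "permanent redirect",
    40: "temporary error", 41: "server overloaded",
    42: "CGI died or timed out", 43: "rate limiting",
    50: "permanent error", 51: "not found", 53: "gone", 59: "bad request",
    60: "client certificate required", 61: "transient certificate session",
    62: "protected resource", 63: "certificate not valid for resource",
    64: "certificate not yet valid", 65: "certificate expired",
}


def generic_code(code: int) -> int:
    """What a simple server might send instead: first digit followed by 0."""
    return (code // 10) * 10


def single_digit(code: int) -> int:
    """What a simple client looks at: the first digit only."""
    return code // 10
```

For example, generic_code(43) gives 40 and single_digit(31) gives 3.  Every code in the table maps onto a generic x0 code which is itself in the table, which is what lets either side ignore the second digit.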

I like this proposed system, and the one other person who has seen it so far (Conman Sean) likes it too. But at the same time, I can definitely hear a voice in the back of my head screaming "this is hugely over-engineered, we don't need it and you only like it because you're pleased with yourself about how nicely the two-digit codes degrade into one-digit codes".

The voice is right that I'm pleased as punch with the whole idea of having a functioning one-digit status code system embedded inside a two-digit status code system. I think this is a very cool idea and I'd like to see it used more widely, in non-Gemini contexts. But just because it is a very cool idea doesn't necessarily mean it's the right idea for Gemini. I worry that, at least in this case, the extra power of the two-digit system is not enough to justify its weight. Consider:

The single digit codes 1 and 2 are not expanded upon at all in the two-digit system. The single digit code 3 is expanded into only two two-digit codes, 30 and 31. The arguments for including a temporary redirect are pretty flimsy - in fact, the only reason explicit temporary and permanent redirects are in there is because it was the first example I thought of where a two-digit scheme could degrade to a one-digit scheme in a totally compatible way. We could probably do without this and then fully half of the one-digit codes are not expanded upon at all, leading us seriously into "why bother?" territory.

The one really compelling reason I can come up with for all the 4x and 5x two-digit codes is that if we tried to go without them and just served up 4 and 5 with the particular error explained in the header, then clients are only going to receive an explanation of what actually happened in whatever human language was spoken by the person who wrote the server. Numeric codes, in contrast, allow clients to present translations of the particular error into whatever language the user would prefer. That's not at all an inconsequential thing for a system that one wants to see widely used, and argues strongly for having distinct status codes for at least the most meaningfully distinct conditions.
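
To make the translation argument concrete, here is a small hypothetical sketch in Python of a client turning numeric codes into messages in the user's language.  The MESSAGES table, the describe function and the German strings are illustrative assumptions, not part of any proposal; falling back to the generic x0 code and then to the server's header text is just one plausible policy.

```python
# Illustrative message tables; a real client would cover more codes.
MESSAGES = {
    "en": {
        40: "Temporary error",
        41: "Server is overloaded",
        50: "Permanent error",
        51: "Not found",
    },
    "de": {
        40: "Vorübergehender Fehler",
        41: "Server ist überlastet",
        50: "Dauerhafter Fehler",
        51: "Nicht gefunden",
    },
}


def describe(code: int, lang: str, header_text: str = "") -> str:
    """Pick a message for `code` in `lang`, with fallbacks."""
    # Unknown languages fall back to English.
    table = MESSAGES.get(lang, MESSAGES["en"])
    # Try the exact code first, then the generic x0 code for its group.
    generic = (code // 10) * 10
    message = table.get(code) or table.get(generic)
    if message is None:
        # Nothing known locally: show whatever the server wrote.
        return header_text or f"Status {code}"
    return message
```

With this table, describe(51, "de") returns "Nicht gefunden" regardless of what language the server author wrote the header text in, while a code the client has no entry for, such as 60, falls back to the header text.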

Sloum is a fan of the single character status code idea, and raised to me the interesting possibility of using a single hexadecimal digit (i.e. 0-F) as the entire space of status characters. That's a cute idea which gives us 16 codes. If we trim a little bit of the fat from the two-digit system above, can we fit everything into 16 codes, allowing translation of client interfaces?

0	Bad request
1	Input prompt
2	Success
3	Redirect
4	Not found
5	Temporary server error (overload, CGI failure)
6	Gone
7	Rate limiting in effect
8	Unused
9	Unused
A	Transient client cert session requested
B	Client cert required for protected resource
C	Client cert invalid for this resource
D	Client cert outside of validity window
E	Unused
F	Unused

Status codes 1 and A above refer to ideas introduced in a previous post:

Previous post "Inputs and client certificates" (2019-07-25)

Yeah, seems like we can do it with room to spare. I've structured the above so that everything to do with client certs gets an alphabetical code, which makes it easy for simple clients which don't support client certs to detect any code related to them (e.g. in Python a simple code.isalpha() will return True for any cert-related code and False for anything else). The two unused codes E and F don't trouble me too much, because the client certificate stuff is the most complicated part of Gemini and it's conceivable that the need for extra codes will arise. The numeric codes are structured for maximum similarity with HTTP equivalents as a memory aid. The unused 8 and 9 *do* make me nervous...
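
Here is a minimal Python sketch of the hexadecimal table and the isalpha() trick.  The names HEX_STATUS, is_valid_status and is_cert_related are invented for this example, and restricting codes to a single uppercase character is an assumption.  The validity check matters because isalpha() on its own is also True for letters outside A-F.

```python
# The hexadecimal scheme from the table above; unused codes are omitted.
HEX_STATUS = {
    "0": "Bad request",
    "1": "Input prompt",
    "2": "Success",
    "3": "Redirect",
    "4": "Not found",
    "5": "Temporary server error (overload, CGI failure)",
    "6": "Gone",
    "7": "Rate limiting in effect",
    "A": "Transient client cert session requested",
    "B": "Client cert required for protected resource",
    "C": "Client cert invalid for this resource",
    "D": "Client cert outside of validity window",
}


def is_valid_status(code: str) -> bool:
    """True for exactly one uppercase hexadecimal digit (an assumption)."""
    return len(code) == 1 and code in "0123456789ABCDEF"


def is_cert_related(code: str) -> bool:
    """True for A-F, i.e. the block set aside for client certificates."""
    return is_valid_status(code) and code.isalpha()
```

A client without certificate support could call is_cert_related(code) once and show a single "not supported" message, and it would keep working if E or F were assigned to new certificate conditions later.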

Despite all the effort I put into coming up with the two-digit scheme, I have to admit that the above just feels, intuitively, much more "right" for Gemini. It's small and friendly and approachable. It's specific enough that logging a status code alone is adequately informative.

I'm tempted to just say "screw it, we're using this!" (meaning the hexadecimal scheme), but it's late and I'm sleepy and I know that kind of snap decision is a bad idea...