💾 Archived View for gemi.dev › gemini-mailing-list › 000124.gmi captured on 2023-11-04 at 12:27:03. Gemini links have been rewritten to link to archived content

-=-=-=-=-=-=-

Status codes

📧 Messages: 4
🗣️ Authors: 4
📅 First Message: 2020-05-20 16:20
📅 Last Message: 2020-05-20 21:13

Dominik Dalek <dominik.dalek (a) thaumatec.com>

📅 Sent: 2020-05-20 16:20
📧 Message 1 of 4

Howdy!

I'm working on my own client and figured I'd write on the status codes.
This is something that, I feel, can be simplified. My comments will be
based on two assumptions:
1. Things should be as simple as possible, but not simpler
2. Complexity is a source of bugs and exploits[1]

I understand that there may be some aversion to changing things that
are already pretty well established, but I hope it's not too late and
my arguments will be at least a little bit convincing. :)

Suggestion #0: Strengthen language around status codes

It is softly stated that the second digit is an extension to the status
code. Then valid first digits are laid out. Upon first reading I found
this confusing so I would welcome rephrasing this into something that
states explicitly:
1. All clients and servers must at the very least recognize status codes
10, 20, 30, 40, 50 and 60.
2. Any xy code not recognized by the client can be safely interpreted
as x0 (e.g. 21 can be safely treated as 20, and 45 can be treated like
40 w/o consequences)
3. The status codes section shouldn't use single digits but the core
codes (10, 20, etc.) or as masks (1y, 2y, etc.)
4. Replace the unique phrasing of "range of codes" in section 1.3.1 with
"status code category" used elsewhere (range felt like a broader term
than category and caused me some time to figure out if I'm missing
something important).
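
Points 1 and 2 above amount to a small fallback rule on the client side. A minimal sketch in Python, assuming a hypothetical `KNOWN_CODES` set that stands in for whatever codes a given client actually implements (the function name is likewise made up for illustration):

```python
# Hypothetical set of codes this client handles explicitly. The six
# core codes (10..60) are the minimum the suggestion asks for.
KNOWN_CODES = {10, 20, 30, 31, 40, 50, 51, 60}


def effective_status(code: int) -> int:
    """Return the code to act on, falling back from xy to x0."""
    if not 10 <= code <= 69:
        raise ValueError(f"not a two-digit Gemini status code: {code}")
    if code in KNOWN_CODES:
        return code
    # Unrecognized second digit: treat the response as the core code.
    return (code // 10) * 10
```

With this rule a client that has never heard of 21 or 45 still behaves sensibly, because it acts on 20 and 40 respectively.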

Suggestion #1: Remove proxy related codes

Current spec is already something that is likely to be two IETF RFCs
in the future (protocol and document format). Adding proxy support into
the mix complicates things even further. Problems with proxy IMO start
with the fact that it's not clear (to me at least) what sort of proxy
gemini would benefit from; nor is it stated in the spec.

Classical reverse proxy roles of load balancing and caching are possible
but I don't see how these can be implemented w/o expiration mechanisms
added to the spec. It could be done, but I think it should be tackled
in a separate RFC. Forwarding proxy, anonymizing or otherwise is
a possibility but, again, I think this should be an extension that
ends up in a separate spec.

There's also a lot of work that has to be done for the proxy to support
client certificates in various setups (when we want anonymity or not,
where we deal with session certs or permanent ones, and so on), how
various error paths should perform, etc. It's a lot of work and would
definitely need fairly stable clients and some reference proxy
implementation to validate assumptions about what could work.

How server and client certs are handled in the proxy scenario is also
something that I don't think is trivial and as such would require some
legwork to get up and running. So, yeah, I'd cut any mention of proxy
at the moment, and with that, codes 43 and 53.

Suggestion #2: Deduplicate client cert errors

Only a single client cert can be sent when establishing the communication
channel. My mental model (correct me if I'm wrong!) is that transient
cert is a session substitute and permanent cert is authentication
mechanism (roughly speaking at least).

Current spec has 3 cert request mechanisms, 3 rejection codes and
a revocation code. This creates numerous corner cases for clients
to handle properly, e.g.

* what do you do if you get 21 in response to a request that included
your permanent key? I sure hope the browser doesn't actually delete
the permanent key from the store :)
* what do you do if you get 64 but your cert is not from the future?
* what do you do if you get 65 but your cert hasn't expired?

I'd like to see a single "client certificate rejected" code eliminating
responses that would potentially make no sense.

I feel that differentiating between the types of rejection opens some
opportunity for oracle attacks (i.e. rogue clients can inspect server
cert processing bugs by observing how responses differ for various
crafted requests). In essence I'd cut 64 and 65.

Suggestion #3: Change end of cert session (21) into a redirection

This will probably be a very controversial one but the way I see it
the end of session typically results in the redirection. This lets you
chain requests on logout in a way that enables permanent client key
delivery or temporary key replacement. With current design you serve
a page in a response to a request that displays something and asks
the client to delete the transient cert. If you want to re-establish
some sort of validation from the client, you need a manual intervention
from the user to do that. I'm not sure if my explanation is clear enough,
I can try and expand upon it if needed.

Suggestion #4: Merge different types of server error to prevent leaking
what happened under the hood

HTTP 500 is often seen as an indication of something wrong in the server
application logic. This would be the primary attack vector for someone
trying to compromise the server (even if only DOS it). I don't think it
makes sense to differentiate codes 40-42 with the exception maybe of
a planned maintenance. Basically I'm sort of allergic to disclosing
information about the server state.

The counter argument is likely going to be: there's no reason not to give
extra details.

To which I say ;) sure, but you can already do that with <META>.

Suggestion #5: A comment, really

5x codes are by design permanent errors but 51 (HTTP 404 equivalent) is
actually a temporary problem according to the spec.
In fact this is precisely what differentiates it from HTTP 410 GONE
(gemini 52). So there seems to be a design error here but I don't really
know what the correct solution is. Either 5x aren't really permanent
errors (how would they be called then?) or 51 shouldn't be a 5x error
to begin with.

This sums up my thoughts about the status codes. I know this reads very
much like "too complex, cut!" and that kinda is exactly that. But if you
can make things simpler, why not do it? :)

Thanks for reading this, cheers!
-Dom

[1] There's some neat research on exploiting HTTP status codes:
https://www.youtube.com/watch?v=4OztMJ4EL1s


Sean Conner <sean (a) conman.org>

📅 Sent: 2020-05-20 20:15
📧 Message 2 of 4

It was thus said that the Great Dominik Dalek once stated:
> Howdy!

  Hello.

  [big ol' snip]

> Suggestion #1: Remove proxy related codes

  [ snip ]

> How server and client certs are handled in the proxy scenario is also
> something that I don't think is trivial and as such would require some
> legwork to get up and running. So, yeah, I'd cut any mention of proxy
> at the moment, and with that, codes 43 and 53.

  There are at least two Gemini servers that handle proxying to various
degrees (to my knowledge).  I expect changes, but not removal.

> Suggestion #2: Deduplicate client cert errors
> 
> Only a single client cert can be sent when establishing the communication
> channel. My mental model (correct me if I'm wrong!) is that transient
> cert is a session substitute and permanent cert is authentication
> mechanism (roughly speaking at least).
> 
> Current spec has 3 cert request mechanisms, 3 rejection codes and
> a revocation code. This creates numerous corner cases for clients
> to handle properly, e.g.
> * what do you do if you get 21 in response to a request that included
> your permanent key? I sure hope the browser doesn't actually delete
> the permanent key from the store :)
> * what do you do if you get 64 but your cert is not from the future?
> * what do you do if you get 65 but your cert hasn't expired?

  Both of those indicate one of

	1) a buggy server
	2) a buggy client
	3) the client clock is incorrect
	4) the server clock is incorrect

  I would only really expect to see these around the time of certificate
expiry (or certificate creation).

> I'd like to see a single "client certificate rejected" code eliminating
> responses that would potentially make no sense.
> 
> I feel that differentiating between the types of rejection opens some
> opportunity for oracle attacks (i.e. rogue clients can inspect server
> cert processing bugs by observing how responses differ for various
> crafted requests). In essence I'd cut 64 and 65.

  I don't agree.  I don't have a fully fleshed out response to the "but muh
security!" argument (well, there is "security through obscurity isn't") but
it's rooted in the following story:

	Ken Thompson [co-creator of Unix] has an automobile which he
	helped design.  Unlike most automobiles, it has neither speedometer,
	nor gas gage, nor any of the numerous idiot lights which plague the
	modern driver.  Rather, if the driver makes any mistake, a giant "?"
	lights up in the center of the dashboard.  "The experienced driver",
	he says, "will usually know what's wrong."

  I've done tech support and have had to deal with "it's broke---fix it!"
questions with nothing more than that.  I'd rather not do that again.

> Suggestion #3: Change end of cert session (21) into a redirection
> 
> This will probably be a very controversial one but the way I see it
> the end of session typically results in the redirection. This lets you
> chain requests on logout in a way that enables permanent client key
> delivery or temporary key replacement. With current design you serve
> a page in a response to a request that displays something and asks
> the client to delete the transient cert. If you want to re-establish
> some sort of validation from the client, you need a manual intervention
> from the user to do that. I'm not sure if my explanation is clear enough,
> I can try and expand upon it if needed.

  It might help to test against a server that actually implements this.  I
kind of see what you are getting at, and a "logout" mechanism is sorely
missing from HTTP (if you are using the actual authentication mechanism and
not HTTP cookies), but I'm not exactly sure what your objection here is.

> Suggestion #4: Merge different types of server error to prevent leaking
> what happened under the hood

  See above---we'll have to agree to disagree on this.

> HTTP 500 is often seen as an indication of something wrong in the server
> application logic. This would be the primary attack vector for someone
> trying to compromise the server (even if only DOS it). I don't think it
> makes sense to differentiate codes 40-42 with the exception maybe of
> a planned maintenance. Basically I'm sort of allergic to disclosing
> information about the server state.

  Nothing to stop a server from just serving up '40' for everything.

> Suggestion #5: A comment, really

  The 40 range of codes map to HTTP 500 range (server errors), and the 50
range of codes map to HTTP 400 range (client errors), and when I first wrote
GLV-1.12556, I used HTTP status codes (because I felt the original status
codes were ... less than optimal), but later solderpunk renamed "client
errors" to "permanent errors" and "server errors" to "temporary errors" (I
think he gave some justification for this, but I don't recall what it was).

> 5x codes are by design permanent errors but 51 (HTTP 404 equivalent) is
> actually a temporary problem according to the spec.
> In fact this is precisely what differentiates it from HTTP 410 GONE
> (gemini 52). So there seems to be a design error here but I don't really
> know what the correct solution is. Either 5x aren't really permanent
> errors (how would they be called then?) or 51 shouldn't be a 5x error
> to begin with.

  It starts to make sense when you realize they were originally server and
client errors.

  But let me be fair, and report back all the errors that GLV-1.12556 can
return (modulo what CGI scripts and the torture test return):

	10	prompt for input
	20	okay
	30	temporary redirect
	31	permanent redirect
	40	temporary error
	51	not found
	52	gone
	59	bad request
	61	transient certificate
	62	authorized certificate
	63	certificate rejected
	64	future certificate
	65	expired certificate

  Even my CGI module only returns 40 (if it can't run a CGI script for
whatever reason).  

> This sums up my thoughts about the status codes. I know this reads very
> much like "too complex, cut!" and that kinda is exactly that. But if you
> can make things simpler, why not do it? :)

  It broke.  Fix it. 8-P

  -spc


solderpunk <solderpunk (a) SDF.ORG>

📅 Sent: 2020-05-20 20:16
📧 Message 3 of 4

> Howdy!

Ahoy!
 
> I understand that there may be some aversion to changing things that
> are already pretty well established, but I hope it's not too late and
> my arguments will be at least a little bit convincing. :)

I do kind of worry that the time to propose changes to "core" stuff has
passed or is passing.  New implementations are being written at an
astonishing rate and with so many clients and servers out there, every
substantial change runs the risk of fracturing the nascent Geminispace
into incompatible subspaces.  Stuff that is very poorly implemented,
like client certificate stuff, doesn't have this risk so much, but
anything fundamental I worry is already more or less "set" now.  It's
the downside to unexpected explosive growth!

> Suggestion #0: Strengthen language around status codes

I'm totally open to rewording parts of the spec to make stuff clearer if
people think what's there now is confusing.  I will take this into
consideration, and please look out for a post to the list sometime this
coming weekend about people being able to more conveniently make
suggestions for changes to the spec and other docs.
 
> Suggestion #1: Remove proxy related codes
 
> Current spec is already something that is likely to be two IETF RFCs
> in the future (protocol and document format).

I like your optimism!!!

> Adding proxy support into
> the mix complicates things even further. Problems with proxy IMO start
> with the fact that it's not clear (to me at least) what sort of proxy
> gemini would benefit from; nor is it stated in the spec.

The most compelling case, IMHO, is proxies which are
protocol-translating gateways.  One of these already exists
(https://tildegit.org/solderpunk/agena), it answers queries for
gopher:// URLs by fetching the original content over gopher and
translating it to text/gemini (which is, by design, very easy to do).
AV-98 can be given the host and port of such a proxy and then it will
automatically use it to follow Gopher links.  Any other client could add
this support to become a combined Gemini-Gopher client without the
author having to write a line of Gopher-related code.
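
The client-side half of this is mostly a routing decision: which host to connect to for a given URL. A rough sketch of that decision in Python follows. It is not how AV-98 does it; the `PROXIES` table, the placeholder host `proxy.example.org`, and the function name are assumptions for illustration. The request itself is the absolute URL followed by CRLF, whether it goes to the origin server or to a proxy.

```python
from urllib.parse import urlsplit

# Hypothetical per-scheme proxy table; host and port are placeholders.
PROXIES = {"gopher": ("proxy.example.org", 1965)}


def plan_request(url: str, proxies: dict = PROXIES):
    """Return (host, port, request_bytes) for a URL.

    gemini:// URLs go straight to the origin server. Any other scheme
    is sent to the proxy configured for that scheme, if there is one.
    """
    parts = urlsplit(url)
    request = (url + "\r\n").encode("utf-8")
    if parts.scheme == "gemini":
        return parts.hostname, parts.port or 1965, request
    if parts.scheme in proxies:
        host, port = proxies[parts.scheme]
        return host, port, request
    raise ValueError(f"no proxy configured for scheme {parts.scheme!r}")
```

The caller would then open a TLS connection to the returned host and port and write the request bytes; no Gopher code is involved on the client.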

In principle this could be done for HTTP too, but converting websites to
text/gemini is less straightforward (and likely to result in something
ugly to look at anyway because all sorts of irrelevant crap will be
converted too, minus some Herculean effort to detect what the actual
content is).

One could also simply run a straight Gemini proxy which listens on a
port other than 1965 in order to circumvent a filter on outgoing traffic
on that port.

There are definitely useful things which can be done.  Perhaps "proxy"
is a misleading word to have used for this.  I was never imagining
anything super complicated with expiration mechanisms, or a proxy server
which basically acts as a packet router.  They would be very explicitly
MITM operations, and you couldn't use them to access services which
relied on client certificates - which should really end up being a
small minority of content.  Nothing that would actually require anything
in the way of a formal spec.

Should these be called something else?

> Suggestion #2: Deduplicate client cert errors
 
> Only a single client cert can be sent when establishing the communication
> channel. My mental model (correct me if I'm wrong!) is that transient
> cert is a session substitute and permanent cert is authentication
> mechanism (roughly speaking at least).

That model is totally accurate (roughly speaking at least), I'm glad
people get this!
 
> Current spec has 3 cert request mechanisms, 3 rejection codes and
> a revocation code. This creates numerous corner cases for clients
> to handle properly, e.g.
> * what do you do if you get 21 in response to a request that included
> your permanent key? I sure hope the browser doesn't actually delete
> the permanent key from the store :)
> * what do you do if you get 64 but your cert is not from the future?
> * what do you do if you get 65 but your cert hasn't expired?
> 
> I'd like to see a single "client certificate rejected" code eliminating
> responses that would potentially make no sense.

I'm not really convinced these are serious problems.  If you get a 64 or
65 but you know your cert is temporally valid, you display an error message
suggesting perhaps the server's clock is faulty.

I mean, what do you do if you get your proposed "client certificate
rejected" status code in response to a request which was made without a
client certificate?
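
The client-side check described here for 64 and 65 could look roughly like the following Python sketch. The function name and message wording are made up for illustration, and the validity bounds are assumed to be passed in as timezone-aware datetimes taken from the certificate the client sent.

```python
from datetime import datetime, timezone


def explain_cert_rejection(status, not_before, not_after, now=None):
    """Return a user-facing message for a 64 or 65 response.

    not_before / not_after are the validity bounds of the client
    certificate that was sent, as timezone-aware datetimes.
    """
    now = now or datetime.now(timezone.utc)
    if status == 64:  # server says: certificate not yet valid
        if now >= not_before:
            return ("Server says the certificate is not yet valid, but it "
                    "is by the local clock; the server's clock may be wrong.")
        return "Certificate is not valid yet."
    if status == 65:  # server says: certificate expired
        if now <= not_after:
            return ("Server says the certificate has expired, but it has "
                    "not by the local clock; the server's clock may be wrong.")
        return "Certificate has expired."
    raise ValueError(f"not a certificate validity status: {status}")
```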

> Suggestion #3: Change end of cert session (21) into a redirection
> 
> This will probably be a very controversial one but the way I see it
> the end of session typically results in the redirection. This lets you
> chain requests on logout in a way that enables permanent client key
> delivery or temporary key replacement. With current design you serve
> a page in a response to a request that displays something and asks
> the client to delete the transient cert. If you want to re-establish
> some sort of validation from the client, you need a manual intervention
> from the user to do that. I'm not sure if my explanation is clear enough,
> I can try and expand upon it if needed.

I'd appreciate it if you tried, I'm not sure I really understand what
you are proposing here.

> Suggestion #4: Merge different types of server error to prevent leaking
> what happened under the hood
> 
> a planned maintenance. Basically I'm sort of allergic to disclosing
> information about the server state.

I don't think it would be any problem at all for a server admin who was
concerned about the security implications of this to configure their
server so that it logged the full two-digit codes for the sake of log
monitoring and debugging, but exclusively sent x0 codes to the client.

I think actually in the very early days of Gemini when debate raged over
how many status codes we need and how many digits was enough, I made the
case that there was no point at all in having codes that distinguish
situations which clients can't possibly react to in meaningfully
different ways.  I was convinced (and rightly so, I think), that it's
good to be able to make these distinctions on the server side.  The
strategy outlined above is kind of like an encoding of this principle.

Of course, no off-the-shelf servers support this mode of operation yet,
but if people like the idea that could change...
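
As a sketch of what such a server mode might look like, here is a minimal Python version of "log the full code, send only x0". The logger name, the function name, and the `hide_detail` switch are assumptions; no existing server is being described.

```python
import logging

log = logging.getLogger("gemini-server")  # hypothetical logger name


def outgoing_status(code: int, hide_detail: bool = True) -> int:
    """Log the precise status and return the one to put on the wire."""
    log.info("status %d", code)
    if hide_detail:
        # Keep only the category digit: 42 -> 40, 51 -> 50.
        return (code // 10) * 10
    return code
```

Masking every category follows the description literally. An admin might prefer to mask only 4x and 5x, so that 31 and the 6x codes keep their meaning for clients.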
 
> Suggestion #5: A comment, really
> 
> 5x codes are by design permanent errors but 51 (HTTP 404 equivalent) is
> actually a temporary problem according to the spec.
> In fact this is precisely what differentiates it from HTTP 410 GONE
> (gemini 52). So there seems to be a design error here but I don't really
> know what the correct solution is. Either 5x aren't really permanent
> errors (how would they be called then?) or 51 shouldn't be a 5x error
> to begin with.

It's true that "not found" is, in principle, temporary, or at least
non-permanent, in the sense that, yes, maybe tomorrow or next month or
next year there will be something at that address.

The temporary/permanent error distinction in Gemini is intended mostly
to be useful for non-human user agents, like search engine crawlers or
feed aggregators or things like that, rather than people sitting in
front of something like Bombadillo or Castor.  If a bot tries to fetch a
feed from a bad URL, it would be nice if it didn't continually try again
every hour on the hour thinking that it's only a temporary failure and
one day the feed will appear!
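
For a crawler or aggregator, that distinction reduces to a retry decision keyed on the first digit. A minimal Python sketch, with a made-up function name:

```python
def should_retry_later(status: int) -> bool:
    """Decide whether a bot should schedule another fetch of this URL."""
    category = status // 10
    if category == 4:  # temporary failure: worth trying again
        return True
    if category == 5:  # permanent failure: stop asking
        return False
    raise ValueError(f"not an error status: {status}")
```

Under this rule a 51 from a bad feed URL drops the feed from the schedule, while a 40 keeps it there.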

> This sums up my thoughts about the status codes. I know this reads very
> much like "too complex, cut!" and that kinda is exactly that. But if you
> can make things simpler, why not do it? :)

It's very refreshing to see discussion on this mailing list aimed at
taking stuff away, not adding it! :)

> Thanks for reading this, cheers!

Thanks for sharing your thoughts!

Cheers,
Solderpunk


jan6@tilde.ninja <jan6 (a) tilde.ninja>

📅 Sent: 2020-05-20 21:13
📧 Message 4 of 4

May 20, 2020 11:16 PM, "solderpunk" <solderpunk at sdf.org> wrote:

> I do kind of worry that the time to propose changes to "core" stuff has
> passed or is passing. New implementations are being written at an
> astonishing rate and with so many clients and servers out there, every
> substantial change runs the risk of fracturing the nascent Geminispace
> into incompatible subspaces. Stuff that is very poorly implemented,
> like client certificate stuff, doesn't have this risk so much, but
> anything fundamental I worry is already more or less "set" now. It's
> the downside to unexpected explosive growth

can't you simply set up different version revisions, though, for that?
maybe just a "gemini-stable" and "gemini-next" branch, where gemini-next 
is explicitly experimental and can change at any time,
and ideas from there that work, can be cycled into gemini-stable at some 
kinda-set intervals?

> 
>> Suggestion #5: A comment, really
>> 
>> 5x codes are by design permanent errors but 51 (HTTP 404 equivalent) is
>> actually a temporary problem according to the spec.
>> In fact this is precisely what differentiates it from HTTP 410 GONE
>> (gemini 52). So there seems to be a design error here but I don't really
>> know what the correct solution is. Either 5x aren't really permanent
>> errors (how would they be called then?) or 51 shouldn't be a 5x error
>> to begin with.
> 
> It's true that "not found" is, in principle, temporary, or at least
> non-permanent, in the sense that, yes, maybe tomorrow or next month or
> next year there will be something at that address.
> 
> The temporary/permanent error distinction in Gemini is intended mostly
> to be useful for non-human user agents, like search engine crawlers or
> feed aggregators or things like that, rather than people sitting in
> front of something like Bombadillo or Castor. If a bot tries to fetch a
> feed from a bad URL, it would be nice if it didn't continually try again
> every hour on the hour thinking that it's only a temporary failure and
> one day the feed will appear!
> 

I think it'd be best if it returned an (optional?) timeout/expiry info, 
for the tag (if not optional, 0 or -1 can signify infinite time), not sure 
what time unit, probably just seconds, though "1h" for 1 hour, and such is also possible
that way the server can specify if it should be flagged as missing for the 
next hour, or next day, or forevermore, etc, 
useful in cases that you temporarily take down a page for whatever reason, 
or where you might change urls every so often, or if you're simply prone 
to typos and sometimes fumble the links, and don't want to bother to 
manually ask for re-index... sometimes you might not even know that a page 
is not crawled by some crawler...
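
To make the idea concrete, here is a sketch in Python of how a crawler might parse such a hint. Nothing like this exists in the spec; the format is only what is floated above (plain seconds, or a number with a unit such as "1h", with 0 or -1 meaning forever), and the "m" and "d" units and the function name are extra assumptions.

```python
import re

# Seconds per unit. Only "h" appears in the proposal; the rest are assumed.
_UNITS = {"s": 1, "m": 60, "h": 3600, "d": 86400}


def parse_retry_hint(text: str):
    """Parse a hypothetical expiry hint into seconds.

    Returns None for "0" or "-1", which the proposal uses for "forever".
    """
    text = text.strip()
    if text in ("0", "-1"):
        return None
    match = re.fullmatch(r"(\d+)([smhd]?)", text)
    if match is None:
        raise ValueError(f"unrecognized expiry hint: {text!r}")
    number, unit = match.groups()
    # A bare number is taken to be seconds.
    return int(number) * _UNITS[unit or "s"]
```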

also while probably not necessarily part of the spec, what should be the 
case if there's a redirect to a nonexistent URL?
should the url that was redirected FROM, be permanently "not found" as well?

and if it's intended for non-human agents, maybe mention that, too? that 
human-controlled clients are allowed to re-request on demand and don't 
have to block it forever?

"client error" makes more sense than "permanent error" in this case, too

