💾 Archived View for gemi.dev › gemini-mailing-list › 000509.gmi captured on 2024-08-31 at 17:21:55. Gemini links have been rewritten to link to archived content

-=-=-=-=-=-=-

CGI and client certificate, or do we need a CGI spec

1. Remco (me (a) rwv.io)

Hi,

I've been working on a gemini server and implemented CGI by skimming
through the CGI/1.1 spec (RFC 3875) and looking through some gemini
server implementations a couple of weeks ago, to figure out what
environment variables to provide.

=> https://git.sr.ht/~rwv/dezhemini/tree/bf5b0ec4/dezhmnsrv.rkt#L253

Currently I'm playing around with client certificates (having learned a
lot more about libopenssl and racket ffi than I bargained for) and was
wondering what environment variables I want to expose to bring that
information into CGI scripts.  So I visited the list of gemini server
software again, browsed some code and found 4 servers supporting both
CGI and client certificates.

Here's what they expose.

# Jetforce

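=> https://github.com/michael-lazar/jetforce/blob/d2d1f63f/jetforce/protocol.py#L180

* AUTH_TYPE : "CERTIFICATE".
* REMOTE_USER : client certificate X509 subject common name
* TLS_CLIENT_HASH : certificate fingerprint
* TLS_CLIENT_NOT_BEFORE : certificate start date
* TLS_CLIENT_NOT_AFTER : certificate end date
* TLS_CLIENT_SERIAL_NUMBER : certificate X509 serial number
* TLS_CLIENT_AUTHORISED : "true" if certificate is validated by server CA store
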
# GLV-1.12556

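=> https://github.com/spc476/GLV-1.12556/blob/13d52b63/Lua/GLV-1/gateway.lua#L156

* AUTH_TYPE : "Certificate"
* REMOTE_USER : client certificate X509 subject common name
* TLS_CLIENT_HASH : certificate fingerprint
* TLS_CLIENT_ISSUER : certificate X509 issuer
* TLS_CLIENT_ISSUER_* : certificate X509 issuer sub fields
* TLS_CLIENT_NOT_AFTER : certificate end date
* TLS_CLIENT_NOT_BEFORE : certificate start date
* TLS_CLIENT_REMAIN : certificate days left
* TLS_CLIENT_SUBJECT : certificate X509 subject
* TLS_CLIENT_SUBJECT_* : certificate X509 subject sub fields
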
# Gemserv

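=> https://git.sr.ht/~int80h/gemserv/tree/ebc22964/src/cgi.rs#L42

* AUTH_TYPE : "Certificate"
* REMOTE_USER : client certificate X509 subject common name
* TLS_CLIENT_HASH : certificate fingerprint
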
# The Unsinkable Molly Brown

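=> https://tildegit.org/solderpunk/molly-brown/src/commit/48f9a206c03c0470e1c132b9667c6daa3583dada/dynamic.go#L151

* TLS_CLIENT_HASH : certificate fingerprint
* TLS_CLIENT_ISSUER : certificate X509 issuer
* TLS_CLIENT_ISSUER_CN : certificate X509 issuer common name
* TLS_CLIENT_SUBJECT : certificate X509 subject
* TLS_CLIENT_SUBJECT_CN : certificate X509 subject common name
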
Looking at these it's obvious everybody is looking at everybody else to
see how they implemented it and just pick whatever they like, so it
seems I am on the right track.  ;-)  Personally I like this minimal
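approach of the latter two and will probably go with no more than:

* REMOTE_USER : client certificate X509 subject common name
* TLS_CLIENT_HASH : certificate fingerprint

Because when writing CGI scripts these are the only things I would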
really need: a way to communicate with the user about their certificate
(REMOTE_USER) and a way to distinguish between offered certificates
(TLS_CLIENT_HASH).  I won't need AUTH_TYPE because if I do get a
TLS_CLIENT_HASH I'll know I can authenticate the user.
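
As an illustration of that minimal approach, here is a rough sketch of how a CGI script might consume just those two variables. It assumes the server exports them under the names above; the function name `client_identity` is hypothetical and Python is used only for illustration.

```python
import os

def client_identity(env=None):
    """Return (fingerprint, display_name), or None when no certificate was offered."""
    env = os.environ if env is None else env
    fingerprint = env.get("TLS_CLIENT_HASH")
    if not fingerprint:
        # No TLS_CLIENT_HASH means there is nothing to authenticate against.
        return None
    # REMOTE_USER (the subject CN) is only a label for talking to the user;
    # the fingerprint is what distinguishes one certificate from another.
    return fingerprint, env.get("REMOTE_USER", "")
```

A script would call `client_identity()` with no arguments to read the real environment; passing a dict makes the function easy to try out by hand.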

But that brings me to the real question here.  Does gemini need a CGI
spec?  Given status code 42 for CGI errors, it kinda committed to
something CGI-ish without actually stating what that is.  The only
server making the effort to implement CGI/1.1 is GLV but, IMHO, that
isn't the kind of simplicity I am here for and it's a bit of a hack to
be honest.

GLV does manage to make CGI scripts more portable, whereas other servers
don't really make the effort.  For instance, some don't provide
PATH_INFO but do provide PATH_TRANSLATED and others provide neither.  I
would like to share my CGI-scripts and have them run anywhere but to
make sharing easier something like a spec would be nice.  What do you
think?

Anyway, back to libopenssl and racket ffi..

Cheers,
Remco

2. Ben Goldberg (ben (a) benaaron.dev)

As someone else trying to write a gemini server with CGI support, I 
wouldn't be opposed to some standardization. It would be nice if CGI 
scripts were portable between instances.

3. René Wagner (rwagner (a) rw-net.de)

I totally agree that we need a definition of what information has to be 
passed to a CGI script by the server. Adjusting scripts to honor specific 
server implementations is a no-go.

For my part, I don't care if this is part of the spec itself or a companion 
spec, as long as it is something that is agreed on and written down.

René

Ben Goldberg wrote on 29.11.2020 19:02 (GMT +01:00):

> As someone else trying to write a gemini server with CGI support, I 
> wouldn't be opposed to some standardization. It would be nice if CGI 
> scripts were portable between instances.
> 
>

4. Robert "khuxkm" Miles (khuxkm (a) tilde.team)

November 29, 2020 11:58 AM, "Remco" <me at rwv.io> wrote:

> Hi,
> 
> I've been working on a gemini server and implemented CGI by skimming
> through the CGI/1.1 spec (RFC 3875) and looking through some gemini
> server implementations a couple of weeks ago, to figure out what
> environment variables to provide.
> 
> => https://git.sr.ht/~rwv/dezhemini/tree/bf5b0ec4/dezhmnsrv.rkt#L253
> 
> Currently I'm playing around with client certificates (having learned a
> lot more about libopenssl and racket ffi than I bargained for) and was
> wondering what environment variables I want to expose to bring that
> information into CGI scripts. So I visited the list of gemini server
> software again, browsed some code and found 4 servers supporting both
> CGI and client certificates.
> 
> Here's what they expose.
> 
> # Jetforce
> 
> => https://github.com/michael-lazar/jetforce/blob/d2d1f63f/jetforce/protocol.py#L180
> 
> * AUTH_TYPE : "CERTIFICATE".
> * REMOTE_USER : client certificate X509 subject common name
> * TLS_CLIENT_HASH : certificate fingerprint
> * TLS_CLIENT_NOT_BEFORE : certificate start date
> * TLS_CLIENT_NOT_AFTER : certificate end date
> * TLS_CLIENT_SERIAL_NUMBER : certificate X509 serial number
> * TLS_CLIENT_AUTHORISED : "true" if certificate is validated by server CA store
> 
> # GLV-1.12556
> 
> => https://github.com/spc476/GLV-1.12556/blob/13d52b63/Lua/GLV-1/gateway.lua#L156
> 
> * AUTH_TYPE : "Certificate"
> * REMOTE_USER : client certificate X509 subject common name
> * TLS_CLIENT_HASH : certificate fingerprint
> * TLS_CLIENT_ISSUER : certificate X509 issuer
> * TLS_CLIENT_ISSUER_* : certificate X509 issuer sub fields
> * TLS_CLIENT_NOT_AFTER : certificate end date
> * TLS_CLIENT_NOT_BEFORE : certificate start date
> * TLS_CLIENT_REMAIN : certificate days left
> * TLS_CLIENT_SUBJECT : certificate X509 subject
> * TLS_CLIENT_SUBJECT_* : certificate X509 subject sub fields
> 
> # Gemserv
> 
> => https://git.sr.ht/~int80h/gemserv/tree/ebc22964/src/cgi.rs#L42
> 
> * AUTH_TYPE : "Certificate"
> * REMOTE_USER : client certificate X509 subject common name
> * TLS_CLIENT_HASH : certificate fingerprint
> 
> # The Unsinkable Molly Brown
> 
> => https://tildegit.org/solderpunk/molly-brown/src/commit/48f9a206c03c0470e1c132b9667c6daa3583dada/dynamic.go#L151
> 
> * TLS_CLIENT_HASH : certificate fingerprint
> * TLS_CLIENT_ISSUER : certificate X509 issuer
> * TLS_CLIENT_ISSUER_CN : certificate X509 issuer common name
> * TLS_CLIENT_SUBJECT : certificate X509 subject
> * TLS_CLIENT_SUBJECT_CN : certificate X509 subject common name
> 
> Looking at these it's obvious everybody is looking at everybody else to
> see how they implemented it and just pick whatever they like, so it
> seems I am on the right track. ;-) Personally I like this minimal
> approach of the latter two and will probably go with no more than:
> 
> * REMOTE_USER : client certificate X509 subject common name
> * TLS_CLIENT_HASH : certificate fingerprint
> 
> Because when writing CGI scripts these are the only things I would
> really need: a way to communicate with the user about their certificate
> (REMOTE_USER) and a way to distinguish between offered certificates
> (TLS_CLIENT_HASH). I won't need AUTH_TYPE because if I do get a
> TLS_CLIENT_HASH I'll know I can authenticate the user.
> 
> But that brings me to the real question here. Does gemini need a CGI
> spec? Given status code 42 for CGI errors, it kinda committed to
> something CGI-ish without actually stating what that is. The only
> server making the effort to implement CGI/1.1 is GLV but, IMHO, that
> isn't the kind of simplicity I am here for and it's a bit of a hack to
> be honest.
> 
> GLV does manage to make CGI scripts more portable, whereas other servers
> don't really make the effort. For instance, some don't provide
> PATH_INFO but do provide PATH_TRANSLATED and others provide neither. I
> would like to share my CGI-scripts and have them run anywhere but to
> make sharing easier something like a spec would be nice. What do you
> think?
> 
> Anyway, back to libopenssl and racket ffi..
> 
> Cheers,
> Remco

Some thoughts:

1. My gemini CGI library for Python assumes AUTH_TYPE is available because 
it should be. I could technically edit it to test for the existence of 
TLS_CLIENT_HASH but why should I, when AUTH_TYPE is how you're supposed to check?
2. I kinda like the idea of having as much info as we can give about the 
cert, although I do think there's something to be said about minimalism 
and finding a balance.

That being said, my proposal:

 - `AUTH_TYPE`: Certificate
 - `REMOTE_USER`: certificate subject CN
 - `TLS_CLIENT_HASH`: Hash of the client cert (with `ALGO:HASHBYTESGOHERE` format)
 - `TLS_CLIENT_NOT_AFTER`: End date
 - `TLS_CLIENT_NOT_BEFORE`: Start date
 - `TLS_CLIENT_ISSUER`: Issuer
 - `TLS_CLIENT_SUBJECT`: Subject

The assumption is that TLS_CLIENT_ISSUER and TLS_CLIENT_SUBJECT are in 
forms where the info can easily be parsed out (so no need for the 
sub-field variables, apart from REMOTE_USER being the subject CN). 
TLS_CLIENT_REMAIN isn't needed, either, since any programming language 
worth its salt is going to have a way to get the days between today and 
TLS_CLIENT_NOT_AFTER.
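
To illustrate the point about computing the remaining days in the script itself, here is a minimal sketch. The thread does not fix a date format for TLS_CLIENT_NOT_AFTER, so the ISO 8601 UTC form used here is purely an assumption, and `days_remaining` is a hypothetical helper name.

```python
from datetime import datetime, timezone

def days_remaining(not_after, now=None):
    """Whole days between `now` and a TLS_CLIENT_NOT_AFTER value."""
    # Assumption: the server formats the date like "2021-11-29T12:00:00Z".
    expires = datetime.strptime(not_after, "%Y-%m-%dT%H:%M:%SZ")
    expires = expires.replace(tzinfo=timezone.utc)
    if now is None:
        now = datetime.now(timezone.utc)
    # A negative result means the certificate has already expired.
    return (expires - now).days
```

A server that emits a different date format would only need a different `strptime` pattern; the subtraction stays the same.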

Just my two cents,
Robert "khuxkm" Miles

5. Sean Conner (sean (a) conman.org)

It was thus said that the Great Remco once stated:
> Here's what they expose.
> 
> # GLV-1.12556
> 
> => https://github.com/spc476/GLV-1.12556/blob/13d52b63/Lua/GLV-1/gateway.lua#L156
> 
> * AUTH_TYPE : "Certificate"
> * REMOTE_USER : client certificate X509 subject common name
> * TLS_CLIENT_HASH : certificate fingerprint
> * TLS_CLIENT_ISSUER : certificate X509 issuer
> * TLS_CLIENT_ISSUER_* : certificate X509 issuer sub fields
> * TLS_CLIENT_NOT_AFTER : certificate end date
> * TLS_CLIENT_NOT_BEFORE : certificate start date
> * TLS_CLIENT_REMAIN : certificate days left
> * TLS_CLIENT_SUBJECT : certificate X509 subject
> * TLS_CLIENT_SUBJECT_* : certificate X509 subject sub fields

  When I wrote the CGI module for GLV-1.12556, I modelled mine after what
Apache did, but renamed the TLS environment variables to better names.  I
figured why not?  And it's not as if all of it is needed, but I was able to
send it.

  Another aspect of GLV-1.12556 is that unless configured otherwise, a CGI
script called with a client certificate will *only* get AUTH_TYPE and
REMOTE_USER set.  AUTH_TYPE and REMOTE_USER are required by the CGI spec.

> But that brings me to the real question here.  Does gemini need a CGI
> spec?  Given status code 42 for CGI errors, it kinda committed to
> something CGI-ish without actually stating what that is.  The only
> server making the effort to implement CGI/1.1 is GLV but, IMHO, that
> isn't the kind of simplicity I am here for and it's a bit of a hack to
> be honest.

  Don't be misled by GLV-1.12556.  Supporting CGI/1.1 isn't hard---the
complexity I have is in supporting CGI scripts meant for the web and
possibly Apache.  I did add the following variables for convenience:

	GEMINI_DOCUMENT_ROOT   = directory
	GEMINI_SCRIPT_FILENAME = full_path_to_script
	GEMINI_URL_PATH        = location.path
	GEMINI_URL             = request

> GLV does manage to make CGI scripts more portable, whereas other servers
> don't really make the effort.  For instance, some don't provide
> PATH_INFO but do provide PATH_TRANSLATED and others provide neither.  I
> would like to share my CGI-scripts and have them run anywhere but to
> make sharing easier something like a spec would be nice.  What do you
> think?

  I think it's easy enough to follow RFC-3875 (the CGI/1.1 spec).  It's not
that hard.  It's just:

	GATEWAY_INTERFACE      = "CGI/1.1"
	QUERY_STRING           = location.query or "" -- Must be set
	REMOTE_ADDR            = auth._remote
	REMOTE_HOST            = auth._remote
	SCRIPT_NAME            = base
	SERVER_NAME            = location.host
	SERVER_PORT            = tostring(location.port)
	SERVER_SOFTWARE        = "GLV-1.12556/1"
	AUTH_TYPE              = "Certificate" -- only if client cert
	REMOTE_USER            = auth.subject.CN or "" -- only if client cert
	PATH_INFO              = ... okay, some explanation required [1]
	PATH_TRANSLATED        = ... okay, some explanation required [1]

  Other variables are possible, but should be prefixed by the protocol name.
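
For readers who do not use Lua, the list above might be sketched in Python roughly as follows. This is not GLV-1.12556's code: the function name, its parameters and the SERVER_SOFTWARE value are all hypothetical, and PATH_INFO and PATH_TRANSLATED are left out because they need the extra explanation mentioned in the footnote.

```python
def base_cgi_env(host, port, script_name, query, remote_addr,
                 has_client_cert=False, subject_cn=None):
    """Build the RFC 3875 variables listed above for one Gemini request."""
    env = {
        "GATEWAY_INTERFACE": "CGI/1.1",
        "QUERY_STRING": query or "",  # must be set, even when empty
        "REMOTE_ADDR": remote_addr,
        "REMOTE_HOST": remote_addr,   # the IP address doubles as the host
        "SCRIPT_NAME": script_name,
        "SERVER_NAME": host,
        "SERVER_PORT": str(port),
        "SERVER_SOFTWARE": "example-server/0",  # hypothetical name
    }
    if has_client_cert:
        # Only present when the client offered a certificate.
        env["AUTH_TYPE"] = "Certificate"
        env["REMOTE_USER"] = subject_cn or ""
    return env
```

The resulting dict could be handed to something like `subprocess.run(..., env=env)` when launching the script, with any protocol-specific extras added under a prefix such as GEMINI_.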

  -spc

[1]	These last two require a bit more explanation to set correctly that
	I don't have time for in this message.  I'll circle back to this
	later tonight when I have a bit more time.

6. colecmac (a) protonmail.com (colecmac (a) protonmail.com)

One important thing to standardize is how TLS_CLIENT_HASH is
calculated. Otherwise CGI scripts will not be able to recognize
clients again if the server software changes.

makeworld

7. Michael Lazar (lazar.michael22 (a) gmail.com)

On Sun, Nov 29, 2020 at 8:32 PM <colecmac at protonmail.com> wrote:
>
> One important thing to standardize is how TLS_CLIENT_HASH is
> calculated. Otherwise CGI scripts will not be able to recognize
> clients again if the server software changes.
>
> makeworld

I think that jetforce is the odd duckling here. I'm the only one using base64
for hashes. This was discussed a while ago on the mailing list and I declared
that I would switch if a consensus was reached [0]. IIRC the discussion kind of
petered out after that...

So I will say right now, unless there's strong opposition, that I'm going to
change TLS_CLIENT_HASH to "SHA256:<HASH>" where <HASH> is the uppercase hex
representation of the certificate hash, with *no* colons in it. This change
will be made in the next release of jetforce, with TLS_CLIENT_HASH_B64 being
added as a backwards-compatible env var to make the transition easier for any existing
CGI scripts.
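
The "SHA256:<HASH>" format described above can be sketched in a few lines of Python. This is not jetforce's code, and it assumes the hash is computed over the DER encoding of the whole certificate, which the message does not spell out; `tls_client_hash` is a hypothetical helper name.

```python
import hashlib

def tls_client_hash(cert_der: bytes) -> str:
    """Format a fingerprint as "SHA256:<HASH>": uppercase hex, no colons."""
    # Assumption: cert_der is the DER-encoded client certificate.
    digest = hashlib.sha256(cert_der).hexdigest()
    return "SHA256:" + digest.upper()
```

With Python's `ssl` module, the DER bytes would typically come from `SSLSocket.getpeercert(binary_form=True)`.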

Is anybody currently using the certificate hash in their CGI scripts? I am very
curious because I haven't seen many real uses of client certs in gemini thus
far.

- Michael

[0] gemini://gemi.dev/gemini-mailing-list/messages/001529.gmi

8. Robert "khuxkm" Miles (khuxkm (a) tilde.team)

November 29, 2020 9:32 PM, "Michael Lazar" <lazar.michael22 at gmail.com> wrote:

> Is anybody currently using the certificate hash in their CGI scripts? I am very
> curious because I haven't seen many real uses of client certs in gemini thus
> far.
> 
> - Michael

I'm using the certificate hash in my Stream of Consciousness script[0], to 
maintain a list of allowed certificates that can post to a given stream. 
My planned port of a text adventure to Gemini is on hold, but such a port 
would likely use client certificates to keep track of what game state 
exists for what user.
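
An allow-list check of this kind might look roughly like the sketch below. It is not taken from the Stream of Consciousness script; the function name is hypothetical, and it assumes the server passes the script's output through as the Gemini response, so the script writes the status line itself. Status codes 60 and 61 are the Gemini codes for "client certificate required" and "certificate not authorised".

```python
def authorise(env, allowed_hashes):
    """Return None if the client may post, else a Gemini response header."""
    fingerprint = env.get("TLS_CLIENT_HASH")
    if not fingerprint:
        # No certificate offered: ask the client to present one.
        return "60 Client certificate required\r\n"
    if fingerprint not in allowed_hashes:
        # A certificate was offered, but it is not on the allow-list.
        return "61 Certificate not authorised\r\n"
    return None
```

The same fingerprint could serve as the key for per-user state, such as a saved game, instead of an allow-list entry.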

Just my two cents,
Robert "khuxkm" Miles

[0]: https://github.com/MineRobber9000/stream-of-consciousness

9. Sean Conner (sean (a) conman.org)

It was thus said that the Great René Wagner once stated:
> I totally agree that we need a definition of what information has to be
> passed to a CGI script by the server. Adjusting scripts to honor specific
> server implementations is a no-go.

  I used RFC-3875 for the basis of my CGI support.  It defines the following
variables to be passed to the script.

	AUTH_TYPE

		Only set if a client certificate is present, and if so, I
		set this to "Certificate".  There's another server that sets
		this to "CERTIFICATE".

	CONTENT_LENGTH
	CONTENT_TYPE

		These two don't apply to Gemini, and are thus not set (which
		is allowed per the RFC).

	GATEWAY_INTERFACE

		Always set to "CGI/1.1"

	PATH_INFO

		This is only set if there's additional text past the CGI
		script in the path segment of the request.  If the CGI
		script is "/example/foo" and the request is
		
			gemini://example.net/example/foo

		then this isn't set.  If the request is

			gemini://example.net/example/foo/path/to/something

		then this would be set to

			/path/to/something

	PATH_TRANSLATED

		This is only set if there's additional text past the CGI
		script in the path segment of the request.  This is defined
		as taking the above, and mapping it to the server's document
		structure.  So, if the base directory of the site is
		
			/var/example.net/gemini/

		and PATH_INFO is set from the request
		
			gemini://example.net/example/foo/path/to/something

		then this would be set to
		
			/var/example.net/gemini/path/to/something

	QUERY_STRING

		Must always be set.  If no query string is defined, it's set
		to ""

	REMOTE_ADDR
	REMOTE_HOST

		I set these to the IP address of the client (allowed per the
		RFC).  An alternative I've suggested in the past is the use
		of "127.0.0.1" or "::" for these two fields.

	REMOTE_IDENT

		I don't bother with this one---no one runs ident anymore.

	REMOTE_USER

		Only set if a client certificate is present, and if so, the
		consensus seems to be to use the CN field of the subject of
		the certificate, or "" if that isn't present.

	REQUEST_METHOD

		I set this to "", as a request method isn't defined.  Most
		use "GET", which is what Gemini *is*, but personally, I feel
		"" is better as there is no method sent at all.

	SCRIPT_NAME

		If the request is

			gemini://example.net/example/foo

		Then this is set to

			/example/foo

		If the request is:

			gemini://example.net/example/foo/path/to/something

		Then this is still set to:

			/example/foo


	SERVER_NAME
	SERVER_PORT

		Set to the server name and port.

	SERVER_PROTOCOL

		I set this to "GEMINI".  

	SERVER_SOFTWARE

		Up to the server to set as it sees fit.

  And that's it as far as RFC-3875 goes.  Not much to it.  I currently
define only four Gemini specific variables (allowed by the RFC):

	GEMINI_DOCUMENT_ROOT

		Base directory of the site.

	GEMINI_SCRIPT_FILENAME

		This is the full path to the CGI script.

	GEMINI_URL_PATH

		This is the path portion of the Gemini request.

	GEMINI_URL

		This is the request as sent by the client.

  I also define some TLS variables, but those are under discussion elsewhere
on this thread.  These (the TLS and GEMINI variables) can be described in a
companion document, but the base I think should be RFC-3875.  It fits quite
well to Gemini.
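
The SCRIPT_NAME, PATH_INFO and PATH_TRANSLATED rules described above can be sketched as follows. This is only an illustration, not any server's actual code: `split_cgi_path` is a hypothetical name, and it assumes the server has already matched the script, decoded the path and rejected ".." segments.

```python
import posixpath

def split_cgi_path(request_path, script_name, document_root):
    """Derive SCRIPT_NAME, PATH_INFO and PATH_TRANSLATED for one request."""
    if not request_path.startswith(script_name):
        raise ValueError("request path does not belong to this script")
    extra = request_path[len(script_name):]
    if extra and not extra.startswith("/"):
        # e.g. "/example/foobar" must not match the script "/example/foo"
        raise ValueError("request path does not belong to this script")
    env = {"SCRIPT_NAME": script_name}
    if extra:
        # Only set when there is text past the script in the path.
        env["PATH_INFO"] = extra
        env["PATH_TRANSLATED"] = posixpath.join(document_root, extra.lstrip("/"))
    return env
```

Leaving both variables unset when there is no extra path matches the "only set if" wording of the descriptions above.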

  -spc

10. Remco (me (a) rwv.io)


2020/11/30 05:33, Sean Conner:

> It was thus said that the Great René Wagner once stated:
>> I totally agree that we need a definition of what information has to be
>> passed to a CGI script by the server. Adjusting scripts to honor specific
>> server implementations is a no-go.
>
>   I used RFC-3875 for the basis of my CGI support.  It defines the following
> variables to be passed to the script.
>>snip<<
>   I also define some TLS variables, but those are under discussion elsewhere
> on this thread.  These (the TLS and GEMINI variables) can be described in a
> companion document, but the base I think should be RFC-3875.  It fits quite
> well to Gemini.

Thanks for the explanation!  I'm considering adding some extra headers
to nudge closer to the spec but I still feel CGI is too HTTP focused to
just "fully" adopt it, and having a clearly defined subset could be
helpful for people building servers.

Cheers,
R.

11. Remco (me (a) rwv.io)


2020/11/30 03:32, Michael Lazar:

> On Sun, Nov 29, 2020 at 8:32 PM <colecmac at protonmail.com> wrote:
>>
>> One important thing to standardize is how TLS_CLIENT_HASH is
>> calculated. Otherwise CGI scripts will not be able to recognize
>> clients again if the server software changes.
>>
>> makeworld
>
> I think that jetforce is the odd duckling here. I'm the only one using base64
> for hashes. This was discussed a while ago on the mailing list and I declared
> that I would switch if a consensus was reached [0]. IIRC the discussion kind of
> petered out after that...
>
> So I will say right now, unless there's strong opposition, that I'm going to
> change TLS_CLIENT_HASH to "SHA256:<HASH>" where <HASH> is the uppercase hex
> representation of the certificate hash, with *no* colons in it. This change
> will be made in the next release of jetforce, with TLS_CLIENT_HASH_B64 being
> added as a backwards-compatible env var to make the transition easier for any existing
> CGI scripts.

I like the TLS_CLIENT_HASH to "SHA256:<HASH>" format and have
implemented that.  Thanks for the link into the archives, I missed that.

BTW, shouldn't your TLS_CLIENT_HASH_B64 be TLS_CLIENT_HASH_SHA256_B64?
I am guessing it is a SHA256 hash?

Cheers,
R.
