CGI and CGI like support (was Re: [ANN] Announcing Molly Brown, a Gemini server in Go)

🗣️ From: Sean Conner (sean (a) conman.org)
📅 Sent: 2020-01-14 23:49
📧 Message 4 of 5
It was thus said that the Great solderpunk once stated:
> 
> This decision should not be interpreted as a criticism of your
> RFC-3875-derived implementation.  I think it makes good sense for there
> to be an option for people to easily convert existing web CGI scripts to
> Gemini.
> 
> But I do think it would be nice if there was one vaguely standard way
> for servers to implement this kind of thing, so that dynamic content
> generating code could be more portable.  I think for that I'd probably
> prefer something as light as possible, and to explicitly distance Gemini
> from many of the ideas baked into RFC-3875, especially that dynamic
> content code should have access to the end user's IP address.

  RFC-3875 wasn't that bad to support, as there aren't that many
meta-variables (as they are called) to support, and several are optional
anyway.  The RFC leaves how the meta-variables are sent to the script up
to the system, but under Unixland, it's via environment variables.

Here's what I currently do:

	AUTH_TYPE
		Not set unless the client provides a certificate, then this
		gets set to "Certificate".

	CONTENT_LENGTH
		Doesn't apply as there's no way to send a document to a
		Gemini server.

	CONTENT_TYPE
		Doesn't apply.

	GATEWAY_INTERFACE
		Set to "CGI/1.1"

	PATH_INFO
		Per RFC (wording is a bit muddled), and not always set.

	PATH_TRANSLATED
		Per RFC, and not always set.  I will say that these two are
		a bit persnickety to get right.

	QUERY_STRING
		Must be set.  If no query string, set to "".

	REMOTE_ADDR
	REMOTE_HOST
		I take it these are the ones you object to the most.  But if
		I'm running a Gemini server, I *already* have your IP
		address anyway.  It seems silly to hide it from me, but I
		don't live in Europe so take what I say with a grain of salt
		or two.  I set these (to just the IP address).

	REMOTE_IDENT
		Nobody supports RFC-1413, so I skip this one.

	REMOTE_USER 
		If a client provides a certificate, I set this to the client
		subject common name.

	REQUEST_METHOD
		I set this to "", as Gemini has no concept of a request
		method (but see below).

	SCRIPT_NAME
		Per RFC.  Not hard to set properly.

	SERVER_NAME 
		Hostname of the current server.  If you support multiple
		hosts per Gemini server, then I would set this to the host
		the client connected to.

	SERVER_PORT
		Set to port number of server.

	SERVER_PROTOCOL
		Set to "GEMINI".

	SERVER_SOFTWARE
		Set to "GLV-1.12556/1".

  And that's it without further configuration.  As a default, a CGI script
will ONLY get these environment variables (whereas your implementation leaks
the parent environment to the script---might want to check that).  I allow
one to set other environment variables per script (like $PATH or $LANG or
whatever).  If you need HTTP compatibility, I set some HTTP_* and change
REQUEST_METHOD to "GET" and SERVER_PROTOCOL to "HTTP/1.0".  I also have an
option to set some variables that Apache sets as well.
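
  The defaults above can be sketched roughly as follows.  This is Python
purely for illustration, not the server's actual code: the names
cgi_environment and run_script, the parameter names, and the sample values
are all hypothetical.  PATH_INFO and PATH_TRANSLATED are left out because
they depend on how the server maps URLs to files, and the HTTP_* variables
are left out because which ones get set isn't listed here.

```python
import subprocess

def cgi_environment(script_name, query, remote_addr, server_name, server_port,
                    server_software, client_cn=None, http_compat=False,
                    extra=None):
    """Build the meta-variables as a fresh dict (hypothetical helper)."""
    env = {
        "GATEWAY_INTERFACE": "CGI/1.1",
        "QUERY_STRING": query or "",      # must be set, even when empty
        "REMOTE_ADDR": remote_addr,
        "REMOTE_HOST": remote_addr,       # just the IP address, no lookup
        "REQUEST_METHOD": "",             # Gemini has no request method
        "SCRIPT_NAME": script_name,
        "SERVER_NAME": server_name,
        "SERVER_PORT": str(server_port),
        "SERVER_PROTOCOL": "GEMINI",
        "SERVER_SOFTWARE": server_software,
    }
    if client_cn is not None:
        # Only set when the client presented a certificate.
        env["AUTH_TYPE"] = "Certificate"
        env["REMOTE_USER"] = client_cn
    if http_compat:
        env["REQUEST_METHOD"] = "GET"
        env["SERVER_PROTOCOL"] = "HTTP/1.0"
    # Per-script additions such as PATH or LANG.
    env.update(extra or {})
    return env

def run_script(argv, env):
    """Run the script with ONLY the given environment.

    Passing env= replaces the child's environment outright, so nothing
    from the server's own environment is inherited.
    """
    return subprocess.run(argv, env=env, stdin=subprocess.DEVNULL,
                          capture_output=True, check=False)
```

  The point of building the dict from scratch, rather than copying the
server's environment and adding to it, is that a script can never see a
variable nobody decided to give it.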

  If the client presents a certificate, I set the following:

	TLS_CIPHER
	TLS_VERSION
	TLS_CLIENT_HASH
	TLS_CLIENT_ISSUER
	TLS_CLIENT_SUBJECT
	TLS_CLIENT_NOT_BEFORE
	TLS_CLIENT_NOT_AFTER
	TLS_CLIENT_REMAIN (time between now and TLS_CLIENT_NOT_AFTER)
	TLS_CLIENT_ISSUER_* (various fields broken down)
	TLS_CLIENT_SUBJECT_* (various fields broken down)

and AUTH_TYPE and REMOTE_USER as mentioned above (if Apache compatibility
requested, the names change but it's largely the same information).  Details
can be seen starting here:

https://github.com/spc476/GLV-1.12556/blob/master/Lua/GLV-1/cgi.lua#L241
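
  As a rough illustration of the certificate-derived variables (again in
Python, and not a transcription of the Lua linked above), the sketch below
takes a dict shaped like the one Python's ssl.SSLSocket.getpeercert()
returns for a validated certificate.  The function names, the use of
seconds for TLS_CLIENT_REMAIN, and the upper-cased field suffixes are
assumptions made for the example.

```python
import ssl

def flatten_name(rdns):
    """Flatten getpeercert()-style nested name tuples into {field: value}."""
    return {key: value for rdn in rdns for key, value in rdn}

def tls_client_vars(cert, now):
    """Derive TLS_CLIENT_* variables from a getpeercert()-style dict.

    `now` is seconds since the epoch.  The unit of TLS_CLIENT_REMAIN and
    the suffix spelling are assumptions of this sketch.
    """
    env = {
        "TLS_CLIENT_NOT_BEFORE": cert["notBefore"],
        "TLS_CLIENT_NOT_AFTER": cert["notAfter"],
    }
    # Time between now and the end of the certificate's validity.
    remain = ssl.cert_time_to_seconds(cert["notAfter"]) - int(now)
    env["TLS_CLIENT_REMAIN"] = str(remain)
    # Break the subject and issuer names down into one variable per field.
    for prefix, key in (("TLS_CLIENT_SUBJECT_", "subject"),
                        ("TLS_CLIENT_ISSUER_", "issuer")):
        for field, value in flatten_name(cert.get(key, ())).items():
            env[prefix + field.upper()] = value
    return env
```

  One caveat: getpeercert() returns an empty dict when the peer
certificate was not validated, so a server accepting self-signed client
certificates would need to parse the raw certificate some other way.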
	
> I think there's a lot to recommend the way Molly Brown works, especially
> if we generalise it just a little to "Gemini CGI apps should endlessly
> read single line URLs over THING, until THING is closed, at which point
> the app should terminate".  

  Oh, so pretty much a Gemini server sans TLS then.

> Here THING could be stdin, or a TCP
> connection (making a CGI app basically a small self-contained server),
> or a unix domain socket.  Simple servers could do what Molly currently
> does, just spawn the script, send a single URL over stdin and then close
> stdin, giving us the good old fashioned one-process-per-request model of
> traditional web CGI.  But more advanced servers could give admins a way
> to configure different approaches where the process is persistent, more
> like FastCGI.  Or they could round-robin load balance between multiple
> servers on a local network.  The actual CGI program would see very
> little difference between these scenarios, you'd just give a slightly
> different argument to a library function which produced some kind of
> iterator over URLs.  This has great power:weight.
> 
> I was very happy with this idea until I realised that CGI programs
> should also have some way to get access to client certificates, not just
> the URL. :(
> 
> I haven't returned since then to thinking about how to achieve this.

  Perhaps a series of lines, like:

	Request: gemini://example.net/foo/bar/script
	TLS_Cipher: ...
	TLS_Version: ...
	TLS_Client_Hash: ...
	TLS_Client_Issuer: ...

  Each request ends with EOF (or a blank line or a NUL byte or some other
way to indicate the end of this one request).  This is consistent (each
line is formatted the same way) and I think, easy to deal with.
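
  To show how little code the receiving side needs, here is a minimal
Python sketch of a reader for that format.  It assumes a blank line
separates requests and EOF ends the stream, which is only one of the
terminators floated above; the function name read_requests is made up for
the example.

```python
def read_requests(stream):
    """Yield one dict of fields per request.

    A blank line ends the current request; EOF ends the last one.
    """
    current = {}
    for raw in stream:
        line = raw.rstrip("\r\n")
        if line == "":
            if current:
                yield current
                current = {}
            continue
        # Split on the first colon only, so the "://" in the URL survives.
        name, sep, value = line.partition(":")
        if not sep:
            raise ValueError(f"malformed line: {line!r}")
        current[name.strip()] = value.strip()
    if current:
        yield current
```

  A script would then loop with something like
"for request in read_requests(sys.stdin):" and answer each request in
turn, whether the server closes stdin after one request or keeps the
process around for many.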

  But whatever you come up with, I would try to avoid calling it CGI, as
that tends to lead to RFC-3875 ...

  -spc