💾 Archived View for gemi.dev › gemini-mailing-list › 000148.gmi captured on 2023-12-28 at 15:41:57. Gemini links have been rewritten to link to archived content

-=-=-=-=-=-=-

Gateway Interfaces for Gemini

1. Sean Conner (sean (a) conman.org)


  I just finished support for SCGI for GLV-1.12556, which makes it the
second gateway interface it supports (the other being CGI).  And I feel this
is probably the time to talk variables---what information can be expected
for CGI and SCGI programs when they run under Gemini.  The CGI specification
[1] calls such information "meta-variables".  The SCGI specification [2]
calls them "headers".  Most people know them as "environment variables" [3],
but whatever, it would be nice to know what a gateway interface program can
expect in the way of data from the server.

  As I've mentioned before, for CGI, I follow the spec (with some minor
variations due to Gemini).  For SCGI, I decided to pass along the same
information but with a minor variation.  So, for BOTH, the data I pass
along:

	GEMINI_DOCUMENT_ROOT	Path to the domain's main content directory
	GEMINI_URL		The requested URL
	GEMINI_URL_PATH		The path portion of the requested URL
	PATH_INFO		optional per CGI
	PATH_TRANSLATED		optional per CGI
	QUERY_STRING		Query string portion of request, or "" [a]
	REMOTE_ADDR		Remote address [b]
	REMOTE_HOST		Remote hostname [b][c]
	SCRIPT_NAME		Name of the script [d]
	SERVER_NAME		Hostname of the request
	SERVER_PORT		Port number from the request
	SERVER_SOFTWARE		"GLV-1.12556/1"

	AUTH_TYPE		"Certificate" [e]
	REMOTE_USER		The subject CN from the client cert [e]

	TLS_CIPHER		TLS cipher used [e][f]
	TLS_CLIENT_HASH		TLS hash [e][f]
	TLS_CLIENT_ISSUER	The issuer field [e][f][g]
	TLS_CLIENT_ISSUER_*	Subfields from the issuer (like C, CN, etc.) [e][f]
	TLS_CLIENT_NOT_AFTER	Expiration date [e][f]
	TLS_CLIENT_NOT_BEFORE	Date the cert becomes valid [e][f]
	TLS_CLIENT_REMAIN	Number of days until cert expires [e][f]
	TLS_CLIENT_SUBJECT	The subject field [e][f]
	TLS_CLIENT_SUBJECT_*	Subfields from the subject (like CN, etc.) [e][f][g]
	TLS_VERSION		TLS version being used [e][f]

[a]	Mandatory per RFC-3875
[b]	Mandatory per RFC-3875---the more security conscious of you might
	not like this, but in that case, I can recommend the value of
	"127.0.0.1" or "::1" 
[c]	Can be the IP address, which is what I do
[d]	In my case, it's the full path to the file (CGI) or symbolic link
	(SCGI) 
[e]	Only set if a client certificate is sent
[f]	Only set if configured to do so.
[g]	For example, TLS_CLIENT_SUBJECT_CN, TLS_CLIENT_SUBJECT_OU

  I added GEMINI_DOCUMENT_ROOT to mimic Apache's DOCUMENT_ROOT, and
GEMINI_URL and GEMINI_URL_PATH because I found a few servers that defined
GEMINI_URL and passed either the full URL or the path portion, and I wanted
to cover both cases with something.
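
  As a rough illustration of the question being asked (a sketch, not code from GLV-1.12556), a Python gateway program could treat every variable above as optional and report which ones the server actually supplied.  The helper names gateway_vars and client_identity are made up for this example; only the variable names come from the list above.

```python
import os

# Variable names copied from the list above.  Which of them a given
# Gemini server really sets is the open question, so each lookup is
# treated as optional.
CORE_VARS = (
    "GEMINI_URL",
    "GEMINI_URL_PATH",
    "QUERY_STRING",
    "REMOTE_ADDR",
    "REMOTE_HOST",
    "SCRIPT_NAME",
    "SERVER_NAME",
    "SERVER_PORT",
    "SERVER_SOFTWARE",
)


def gateway_vars(environ):
    """Return (found, missing) for the core variables in environ."""
    found = {name: environ[name] for name in CORE_VARS if name in environ}
    # RFC-3875 makes QUERY_STRING mandatory, with "" meaning "no query",
    # so an absent value is normalised to the empty string.
    found.setdefault("QUERY_STRING", "")
    missing = [name for name in CORE_VARS if name not in found]
    return found, missing


def client_identity(environ):
    """Return the certificate subject CN, or None if no cert was sent."""
    if environ.get("AUTH_TYPE") != "Certificate":
        return None
    return environ.get("REMOTE_USER")


if __name__ == "__main__":
    found, missing = gateway_vars(os.environ)
    print("set:", sorted(found))
    print("not set:", missing)
```

  Reading from a plain mapping rather than os.environ directly means the same helpers work for SCGI, where the values arrive as headers instead of environment variables.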

  For CGI, the program will also receive the following variable:

	GATEWAY_INTERFACE	"CGI/1.1" (mandatory per RFC-3875)

  And for SCGI, the program will receive the following variables:

	CONTENT_LENGTH		"0" (mandatory per spec)
	SCGI			"1" (mandatory per spec)
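
  For context, SCGI delivers these headers as a netstring: a decimal length, a colon, a run of NUL-terminated name/value pairs, then a comma, with any body following.  Below is a minimal Python sketch of a parser, assuming the whole request is already in memory.  The name parse_scgi_headers is hypothetical and not taken from any server mentioned here, and the sketch only checks that the two mandatory headers are present, not their order.

```python
def parse_scgi_headers(data: bytes):
    """Split a complete SCGI request into (headers, body)."""
    length_text, sep, rest = data.partition(b":")
    if not sep or not length_text.isdigit():
        raise ValueError("missing netstring length")
    length = int(length_text)
    # The header block must be followed by a single comma.
    if len(rest) < length + 1 or rest[length:length + 1] != b",":
        raise ValueError("truncated or malformed netstring")

    block = rest[:length]
    # Every name and every value ends with NUL, so splitting on NUL
    # leaves one empty trailing element when the block is well formed.
    fields = block.split(b"\0")
    if fields[-1] != b"":
        raise ValueError("header block not NUL terminated")
    fields.pop()
    if len(fields) % 2 != 0:
        raise ValueError("header name without a value")

    headers = {}
    for i in range(0, len(fields), 2):
        name = fields[i].decode("latin-1")
        headers[name] = fields[i + 1].decode("latin-1")

    if headers.get("SCGI") != "1" or "CONTENT_LENGTH" not in headers:
        raise ValueError("missing mandatory SCGI headers")

    body = rest[length + 1:]
    return headers, body
```

  A Gemini request carries no body, which is why CONTENT_LENGTH is always "0" here and the returned body would be empty.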

  Why SCGI didn't use GATEWAY_INTERFACE="SCGI/1" is beyond me, but anyway,
those are the variables I pass along for both CGI and SCGI.  You can see
actual values used by following these links:

	gemini://gemini.conman.org/cgi/test
	gemini://gemini.conman.org/cgi/test/path/file
	gemini://gemini.conman.org/scgi-sample
	gemini://gemini.conman.org/scgi-sample/path/file

  If you see some extra data, it's because I allow extra values to be set.

  And my question to you is---what variables should a CGI/SCGI program
depend upon to exist?

  -spc

[1]	RFC-3875

[2]	https://web.archive.org/web/20020403050958/http://python.ca/nas/scgi/protocol.txt

[3]	Because of how they're passed to CGI scripts.

2. solderpunk (solderpunk (a) SDF.ORG)

On Mon, May 25, 2020 at 07:11:04PM -0400, Sean Conner wrote:

> [b]	Mandatory per RFC-3875---the more security conscious of you might
> 	not like this, but in that case, I can recommend the value of
> 	"127.0.0.1" or "::1" 
> [c]	Can be the IP address, which is what I do

It's true that, as I've written in the past, I really am not a fan of
this information being passed along for privacy reasons.  Yes, of
course, I know full well that the server itself already knows your IP
address, by necessity.  I am totally fine with admins logging that
information for the sake of debugging or abuse prevention.

But I just don't see the need to pass this information along to
applications.  What possible legitimate use could they have for it?  If
they want to recognise consecutive requests from the same user so they
can maintain state server side, well, that's what client certificates
are for.  The application can request one, instead of relying on the IP
address, which won't work well anyway if somebody is using a popular VPN
exit node.  The only other thing I can think of which is potentially
even vaguely legitimate is geolocation so the app can e.g. serve a
suitable translated interface.  But even that's iffy in my mind because
geolocation is so terribly unreliable in this day and age because so
many people habitually use VPNs and may not be where they appear to be.

I know this field is mandatory in RFC-3875 - what is the scope of that
RFC with respect to protocols?  Does it only talk about HTTP or is it
supposed to be more general?

Cheers,
Solderpunk

3. Sean Conner (sean (a) conman.org)

It was thus said that the Great solderpunk once stated:
> On Mon, May 25, 2020 at 07:11:04PM -0400, Sean Conner wrote:
> 
> > [b]	Mandatory per RFC-3875---the more security conscious of you might
> > 	not like this, but in that case, I can recommend the value of
> > 	"127.0.0.1" or "::1" 
> > [c]	Can be the IP address, which is what I do
> 
> I know this field is mandatory in RFC-3875 - what is the scope of that
> RFC with respect to protocols?  Does it only talk about HTTP or is it
> supposed to be more general?

  First paragraph of the abstract of RFC-3875:

	The Common Gateway Interface (CGI) is a simple interface for running
	external programs, software or gateways under an information server
	in a platform-independent manner.  Currently, the supported
	information servers are HTTP servers.

so it's clearly HTTP biased, but it also states in 4.1.18:

	The server SHOULD set meta-variables specific to the protocol and
	scheme for the request.  Interpretation of protocol-specific
	variables depends on the protocol version in SERVER_PROTOCOL.  The
	server MAY set a meta-variable with the name of the scheme to a
	non-NULL value if the scheme is not the same as the protocol.  The
	presence of such a variable indicates to a script which scheme is
	used by the request.

which to me reads that it may apply to other protocols, like Gemini (or even
gopher).  From RFC-3875:

	4.1.8.  REMOTE_ADDR

		The REMOTE_ADDR variable MUST be set to the network address
		of the client sending the request to the server.

	4.1.9.  REMOTE_HOST

		The REMOTE_HOST variable contains the fully qualified domain
		name of the client sending the request to the server, if
		available, otherwise NULL. ...

		The server SHOULD set this variable.  If the hostname is not
		available for performance reasons or otherwise, the server
		MAY substitute the REMOTE_ADDR value.

thus my recommendation for "127.0.0.1" or "::1".
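
  On the server side, that recommendation could look something like the following Python sketch.  It is an illustration only, not how GLV-1.12556 is written; the function name remote_vars and the hide_client switch are assumptions made for the example.

```python
import ipaddress


def remote_vars(peer_ip: str, hide_client: bool = False) -> dict:
    """Build REMOTE_ADDR and REMOTE_HOST for a gateway program.

    With hide_client set, the real address is replaced by the loopback
    address of the same family, so both variables are still present as
    RFC-3875 requires but say nothing about the client.
    """
    addr = ipaddress.ip_address(peer_ip)
    if hide_client:
        value = "::1" if addr.version == 6 else "127.0.0.1"
    else:
        value = str(addr)
    # RFC-3875 4.1.9 allows REMOTE_HOST to carry the REMOTE_ADDR value
    # when no hostname lookup is done.
    return {"REMOTE_ADDR": value, "REMOTE_HOST": value}
```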

  -spc

4. colecmac (a) protonmail.com (colecmac (a) protonmail.com)

> But I just don't see the need to pass this information along to
> applications. What possible legitimate use could they have for it?

I use it for gemlikes, because it's just a really simple way to prevent
spam without having to complicate things with client certs. Client
certs are great, but as you said, it would be a huge hassle to have
to do it for each site.
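
To make the idea concrete, here is a minimal Python sketch of throttling by REMOTE_ADDR. It is not how gemlikes is implemented; the class name PerAddressLimiter and the interval are invented for illustration. It keeps state in memory, so it would suit a long-running SCGI program; a plain CGI script starts fresh on every request and would need to store the timestamps somewhere persistent.

```python
import time


class PerAddressLimiter:
    """Allow at most one action per address every min_interval seconds."""

    def __init__(self, min_interval, clock=time.monotonic):
        self.min_interval = min_interval
        self.clock = clock          # injectable so tests can fake time
        self.last_seen = {}         # address -> time of last allowed action

    def allow(self, remote_addr):
        now = self.clock()
        last = self.last_seen.get(remote_addr)
        if last is not None and now - last < self.min_interval:
            return False
        self.last_seen[remote_addr] = now
        return True
```

If the server substitutes a loopback address for REMOTE_ADDR, every client shares one bucket, so a limiter like this only helps when the real address is passed through.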

I think providing the IP address is fine, and applications can use it if they
want. Trying to restrict apps from accessing it will give people a false
sense of security, I think.

makeworld

------- Original Message -------
On Tuesday, May 26, 2020 1:59 PM, solderpunk <solderpunk at SDF.ORG> wrote:

> On Mon, May 25, 2020 at 07:11:04PM -0400, Sean Conner wrote:
>
> > [b] Mandatory per RFC-3875---the more security conscious of you might
> > not like this, but in that case, I can recommend the value of
> > "127.0.0.1" or "::1"
> > [c] Can be the IP address, which is what I do
>
> It's true that, as I've written in the past, I really am not a fan of
> this information being passed along for privacy reasons. Yes, of
> course, I know full well that the server itself already knows your IP
> address, by necessity. I am totally fine with admins logging that
> information for the sake of debugging or abuse prevention.
>
> But I just don't see the need to pass this information along to
> applications. What possible legitimate use could they have for it? If
> they want to recognise consecutive requests from the same user so they
> can maintain state server side, well, that's what client certificates
> are for. The application can request one, instead of relying on the IP
> address, which won't work well anyway if somebody is using a popular VPN
> exit node. The only other thing I can think of which is potentially
> even vaguely legitimate is geolocation so the app can e.g. serve a
> suitable translated interface. But even that's iffy in my mind because
> geolocation is so terribly unreliable in this day and age because so
> many people habitually use VPNs and may not be where they appear to be.
>
> I know this field is mandatory in RFC-3875 - what is the scope of that
> RFC with respect to protocols? Does it only talk about HTTP or is it
> supposed to be more general?
>
> Cheers,
> Solderpunk
