[spec] The Tragedy of &

Sean Conner sean at conman.org

Sun Jan 31 23:16:18 GMT 2021

- - - - - - - - - - - - - - - - - - -

It was thus said that the Great Gary Johnson once stated:

Sean Conner <sean at conman.org> writes:

Not if the CGI interface is properly written. All I had to do was write

this CGI script and drop it into my tests directory [1]:

gemini://gemini.conman.org/test/pathseg.cgi

[ snip ]

Thanks for sharing some code, Sean. I, of course, realize that one could

write a CGI script to pick apart the PATH_INFO for user inputs. The

issue I raised in my message was that this doesn't make any sense in the

context of a CGI script which is looked up using the path on the remote

filesystem.

In your example, your script is located at /test/pathseg.cgi. However,

lacking side information, I see no indicator (outside of the --

admittedly optional -- cgi extension on your file name) of which path

segments should be considered part of the CGI filename lookup and which

parts are meant to be user input data in your example link:

/test/pathseg.cgi/name=and%20a%20one/age=and%20a%20two/action=skidoosh

That's a particular implementation detail of GLV-1.12556 [1]. Other servers could require the extension, or some other mechanism.
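
For illustration, here is a minimal sketch of how a script might pick key=value pairs out of path segments like the ones in the example link. It is not the actual pathseg.cgi; the function name parse_path_params is hypothetical, and it assumes the script is handed the still percent-encoded path portion.

```python
from urllib.parse import unquote

def parse_path_params(path):
    """Split a path such as /name=a%20b/age=c into a dict of decoded values.

    Hypothetical helper: assumes each segment is key=value and that the
    input is still percent-encoded.
    """
    params = {}
    for segment in path.split("/"):
        if not segment:
            continue  # skip the empty piece before the leading "/"
        key, _, value = segment.partition("=")
        params[unquote(key)] = unquote(value)
    return params

print(parse_path_params("/name=and%20a%20one/age=and%20a%20two/action=skidoosh"))
```

Splitting on "/" before decoding means an escaped slash (%2F) inside a value would not be mistaken for a segment boundary.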

This feels like a massive hack to me and an abuse of path segments TBH.

If I were to embrace this approach, I can see that I would have to

reprogram my server to do some additional path preprocessing magic. I

could either:

1. Check every sequence of path segments starting from the document root

to see if any of them correspond to an executable file or have the

blessed CGI file extension for my server.

I see your server just accepts the requested path as is. GLV-1.12556 (once it gets into the filesystem handler) walks down the document root checking each path segment looking for an executable file (which indicates a CGI script) or symbolic link (which indicates a SCGI script).
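
The walk described above could be sketched roughly as follows. This is not GLV-1.12556's code; find_script and its is_script parameter are hypothetical names, only the executable-file case is covered (the symbolic-link/SCGI case is left out), and ".." segments are not sanitized.

```python
import os

def find_script(docroot, request_path, is_script=None):
    """Walk request_path one segment at a time under docroot.

    Returns (script_name, path_info) for the first segment that names a
    script, or None if no segment does.
    """
    if is_script is None:
        # Assumption: an executable regular file marks a CGI script.
        is_script = lambda p: os.path.isfile(p) and os.access(p, os.X_OK)
    segments = [s for s in request_path.split("/") if s]
    current = docroot
    for i, segment in enumerate(segments):
        current = os.path.join(current, segment)
        if is_script(current):
            script_name = "/" + "/".join(segments[: i + 1])
            rest = segments[i + 1 :]
            # Everything past the script becomes the extra path data.
            path_info = "/" + "/".join(rest) if rest else ""
            return script_name, path_info
    return None
```

Passing a custom is_script predicate is just a convenience for trying the walk without touching a real filesystem.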

Once one of these 3 approaches enables the server to successfully detect

that a particular path corresponds to a CGI script that is not actually

located where that path is pointing, then the server would need to

execute that script with PATH_INFO bound to the entire path. Every

installed CGI script would then be responsible for manually removing

SCRIPT_NAME from PATH_INFO and splitting it up to get the user inputs,

which puts an additional burden on CGI developers.

If you want to follow RFC-3875, that's not the case. PATH_INFO only contains data past the script name (section 4.1.5). This link:

gemini://gemini.conman.org/cgi

returns

SCRIPT_NAME = /cgi

There is no PATH_INFO or PATH_TRANSLATED because it's not needed. However:

gemini://gemini.conman.org/cgi/path/to/nowhere

returns

SCRIPT_NAME = /cgi

PATH_INFO = /path/to/nowhere

PATH_TRANSLATED = /home/spc/projects/gemini/non-checkin/gemini.conman.org/path/to/nowhere

The work is on the server side, not the CGI script side.
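
A rough sketch of the server-side work, assuming the script name and the extra path have already been separated. The function name cgi_path_vars is hypothetical, and mapping PATH_TRANSLATED by simply prefixing the document root is an assumption that happens to match the output shown above; a real server may translate paths differently.

```python
def cgi_path_vars(script_name, path_info, docroot):
    """Build the path-related CGI meta-variables for one request.

    PATH_INFO and PATH_TRANSLATED are only set when there is data past
    the script name.
    """
    env = {"SCRIPT_NAME": script_name}
    if path_info:
        env["PATH_INFO"] = path_info
        # Assumption: translation is a plain document-root prefix.
        env["PATH_TRANSLATED"] = docroot.rstrip("/") + path_info
    return env

docroot = "/home/spc/projects/gemini/non-checkin/gemini.conman.org"
for name, value in cgi_path_vars("/cgi", "/path/to/nowhere", docroot).items():
    print(name, "=", value)
```

The script then reads PATH_INFO directly and never has to strip SCRIPT_NAME itself.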

So I've now heard from multiple folks that we should all just get on

with these path segment hacks and accept that as the best we can do in

Gemini.

While I can see that it's technically possible (though arguably ugly) to

do so, I suppose my question is:

"What exactly does Gemini lose by allowing chained query parameters?

(with &)"

Nothing as far as I can see, as long as the characters '=' and '&' are escaped if they appear in the input (to prevent confusion).
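
As a small illustration of the escaping point, here is a sketch that percent-encodes '=' and '&' inside keys and values before chaining them, and reverses the process on the other end. The function names encode_params and decode_params are hypothetical, and nothing here is part of the Gemini specification.

```python
from urllib.parse import quote, unquote

def encode_params(params):
    """Join key=value pairs with '&', escaping every reserved character
    (including '=' and '&') inside each key and value."""
    return "&".join(
        quote(key, safe="") + "=" + quote(value, safe="")
        for key, value in params.items()
    )

def decode_params(query):
    """Reverse encode_params: split on the delimiters first, then decode."""
    params = {}
    for pair in query.split("&"):
        if not pair:
            continue  # ignore empty pieces such as an empty query
        key, _, value = pair.partition("=")
        params[unquote(key)] = unquote(value)
    return params

query = encode_params({"name": "a=b&c", "age": "and a two"})
print(query)
print(decode_params(query))
```

Because the delimiters are split before decoding, an escaped '&' or '=' inside a value cannot be confused with a separator.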

What am I missing here, folks?

Somebody to do a proof-of-concept probably.

Any chance of weighing in here, Solderpunk?

Is he still alive?

-spc

[1] https://github.com/spc476/GLV-1.12556