💾 Archived View for soviet.circumlunar.space › oak › mailinglist › 10.gmi captured on 2024-06-16 at 12:58:47. Gemini links have been rewritten to link to archived content
-=-=-=-=-=-=-
Date: Wed, 27 Jan 2021 17:19:59 -0500
Jason McBrayer wrote:
Having more complex forms is a temptation to implement applications on
Gemini, rather than using pairings of protocol+client that are more
appropriate (e.g. using NNTP for a message board).
Charlie Stanton <charlie at shtanton.com> wrote:
I agree with this completely. I think Gemini should be a protocol for
viewing content only. I missed all the discussion around inimeg, titan
etc. at the time but I feel similarly about those.
I think a different protocol for filling out forms makes a lot more
sense, and we can work on having gemini clients and form clients play
nicely together so the user experience doesn't suffer from using a
different program to fill out a form.
Adding forms would take us wayyyyy too close to the web in my opinion.
And now me...
tl;dr: Gemini can already emulate forms. We just need a spec language
clarification in Section 3.2.1 1x (INPUT) from Solderpunk and for
client authors to update their software accordingly. I illustrate
both points (and provide code) below.
I appreciate the generally conservative nature of the Gemini community
when it comes to extending the Gemini and Gemtext specifications. As a
server author, this certainly keeps my life easier.
However, I'd like to go on record here to say that interactive capsules
are not something that worries me. There are already quite a few of them
out there in Geminispace (hello Astrobotany!), and I'd like to continue
to see this medium grow and thrive in our little corner of the internet.
I don't think form-like data submission should be seen as an evil. It
allows us to implement a wide variety of CGI-style applications that do
all their computing on the server side (often through some script
extension mechanism). This keeps our servers and clients simple,
empowers content authors to build cool things, and still keeps us nicely
insulated from "The Javascript Trap" since our Gemini clients never
download and run any client-side code.
Over the months that I have followed this mailing list, I've seen
broadly two categories of proposals around extending Gemini's simple
input methods:
1. Ways to submit multiple pieces of information to a server at once.
2. Ways to upload files to a server.
Both proposals are pretty self-explanatory since they extend the
possible functionality of interactive Gemini capsules without breaking
any of our privacy or security guarantees. However, option 1 puts an
additional burden on client authors, and option 2 puts an additional
burden on both client and server authors.
Some members of our community have suggested that these features aren't
worth the extra effort. Others have argued in favor of one or both of
them, and a brave few have gone off and created their own sister
protocols to try and implement Gemini-like systems that also support
some variant of these two data upload options (e.g., Titan, Dioscuri,
Inimeg).
From a personal standpoint (and I can only speak for myself here
obviously), I wouldn't mind one or more form types being added to
Gemtext (option 1 above) as it would reduce the total number of
round-trip network requests between client and server to submit multiple
pieces of information (and I have quite a slow satellite internet
connection, so this matters to me).
However, even without (a very unlikely) form enhancement to Solderpunk's
Gemtext spec, I'd like to remind folks that we actually do (or at least
we should) already have the ability to emulate forms in our Gemini
capsules.
Assuming we are currently browsing a page at
gemini://awesome.capsule.net/form, this dynamic Gemtext page could
include forms as follows:
# Welcome to my Gemini Form!

To fill in any field below, simply click it. Everything's a link in Gemini, so you can't really mess up!

=> form?$SESSION&name Name: $NAME
=> form?$SESSION&password Password: $PASSWORD
=> form?$SESSION&smog SMOG is great: $SMOG
=> form?$SESSION&plant Best Astrobotany Plant: $PLANT
=> form?$SESSION&submit Submit Answers
Here, my Gemtext is a template string, which I process in a context in
which $SESSION, $NAME, $PASSWORD, $SMOG, and $PLANT are defined (or
default to empty strings). When the page first loads, we create a new
$SESSION value in our CGI script and insert it into the links to
preserve state across requests until we restart the server or the user
refreshes the page.
(Obviously, a more robust state management mechanism could be achieved
with client certs and a DB, but I just mean to show a very simple
example here.)
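As a rough illustration of the templating step described above, here is a minimal Python sketch. It is not the attached form.clj script; the names FORM_TEMPLATE, new_session_id, and render_form are hypothetical, and the token length is an arbitrary choice.

```python
import secrets
from string import Template

# Hypothetical template mirroring the form above; the $-placeholders are
# filled in on every request.
FORM_TEMPLATE = Template(
    "# Welcome to my Gemini Form!\n"
    "\n"
    "=> form?$SESSION&name Name: $NAME\n"
    "=> form?$SESSION&password Password: $PASSWORD\n"
    "=> form?$SESSION&smog SMOG is great: $SMOG\n"
    "=> form?$SESSION&plant Best Astrobotany Plant: $PLANT\n"
    "=> form?$SESSION&submit Submit Answers\n"
)

FIELDS = ("NAME", "PASSWORD", "SMOG", "PLANT")


def new_session_id():
    """Return a random hex token to embed in the links (assumed format)."""
    return secrets.token_hex(8)


def render_form(session, values):
    """Fill the template; fields not answered yet default to empty strings."""
    context = {field: values.get(field, "") for field in FIELDS}
    context["SESSION"] = session
    return FORM_TEMPLATE.substitute(context)
```

Calling render_form(new_session_id(), {}) would produce the page for a first visit, with every field still blank.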
Here would be the server-side responses for each of those links:
For the boolean choice (SMOG) and the multiple choice (PLANT) inputs,
you could, of course, perform input validation and re-prompt if
necessary. You could also simply include one link per choice in your
form template instead of using a 10 INPUT response.
The intention of this example is that the clients would produce requests
of this form after each input prompt:
gemini://awesome.capsule.net/form?$SESSION&name&Gary%20Johnson
gemini://awesome.capsule.net/form?$SESSION&password&secret
gemini://awesome.capsule.net/form?$SESSION&smog&yes
gemini://awesome.capsule.net/form?$SESSION&plant&Ficus
where $SESSION is whatever value was generated by the CGI script on the
first page load.
With this information in the query params, it would be easy to store a
lookup table in the CGI script that mapped session -> field -> value,
and these values can then be easily inserted into the original Gemtext
template form above (see Section 3.1) in response to these requests.
The form?$SESSION&submit link can then trigger the server to validate
that all of the required form fields have been filled in correctly and
perform whatever next step operation you want.
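The session -> field -> value lookup table could be sketched in Python roughly as follows. This assumes the desired append behavior, so the raw query string arrives as SESSION, SESSION&field, or SESSION&field&value. The function name handle_query and the action labels it returns are hypothetical.

```python
from urllib.parse import unquote

# Field names taken from the example form; "submit" is handled separately.
FIELDS = {"name", "password", "smog", "plant"}


def handle_query(store, query):
    """Update store (session -> field -> value) from one raw query string.

    Returns a label for the response the script should send next.
    """
    # At most three parts: session, field, and the user's input.
    parts = query.split("&", 2)
    answers = store.setdefault(parts[0], {})
    if len(parts) == 1:
        return "show-form"
    field = parts[1]
    if field == "submit":
        missing = FIELDS - set(answers)
        return "show-form" if missing else "accept"
    if field not in FIELDS:
        return "not-found"
    if len(parts) == 2:
        return "prompt"  # reply with a 10 INPUT response for this field
    answers[field] = unquote(parts[2])
    return "show-form"
```

A plain dict only lives as long as the process does, which matches the "until we restart the server" caveat above.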
In addition, as I mentioned several months ago on this list, you could
perform file "uploads" by having one of the input links prompt for a URL
to a file. Then the server could download that file and store it in your
session (or account if you're using client certs and a DB).
While this example creates more back-and-forth requests than a proper
client-side form would generate, I hope it demonstrates that Gemini and
Gemtext in their current incarnations are already sufficiently complete
to build interactive CGI applications with them today.
The only problem I'm running into here is that the various Gemini
clients I've tested (elpher, bombadillo, kristall) don't actually append
a user's input as an additional parameter to an existing query string if
one is present. Instead, bombadillo and kristall just overwrite the
existing query string and only return ?$NEW_INPUT. Elpher, on the other
hand, just creates invalid URLs by simply appending ?$NEW_INPUT to
whatever is already in the URL (e.g.,
gemini://awesome.capsule.net/form?$SESSION&smog?yes). Neither of these
behaviors does what I'd want or expect here.
I think the culprit then is probably Gemini Protocol Specification
section 3.2.1 1x (INPUT):
Status codes beginning with 1 are INPUT status codes, meaning: The requested resource accepts a line of textual user input. The <META> line is a prompt which should be displayed to the user. The same resource should then be requested again with the user's input included as a query component. Queries are included in requests as per the usual generic URL definition in RFC3986, i.e. separated from the path by a ?. Reserved characters used in the user's input must be "percent-encoded" as per RFC3986, and space characters should also be percent-encoded.
As far as I can tell, the fix here is for Solderpunk to update the text
in section 3.2.1 to indicate that if a query string is already part of
the request leading to an INPUT response, then the user's input should
be appended (using &) to the existing query string rather than replacing
it wholesale (using ?).
Otherwise, we really have no way to input more than one query param
(with &) other than asking the user to type it directly into the INPUT
prompt (e.g., cat&dog&pig). I'm hoping this isn't the spec's intention
here and that we just have a case of ambiguous wording that has led some
client authors to create divergent (or broken) implementations.
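To make the difference concrete, here is a small Python sketch of how a client might build the follow-up request after a 10 INPUT response. The append flag switches between the behavior proposed here and the replace behavior observed in bombadillo and kristall. The function name next_request_url is hypothetical, and this is only one reading of the proposal, not spec text.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def next_request_url(url, user_input, append=True):
    """Return the URL to request after the user answers an INPUT prompt."""
    scheme, netloc, path, query, _fragment = urlsplit(url)
    # Percent-encode everything that is not an unreserved character,
    # including spaces and "&".
    encoded = quote(user_input, safe="")
    if append and query:
        query = f"{query}&{encoded}"  # proposed: keep the existing query
    else:
        query = encoded  # observed: replace the query wholesale
    return urlunsplit((scheme, netloc, path, query, ""))
```

With append=True, a request for form?abc123&name followed by the input "Gary Johnson" becomes form?abc123&name&Gary%20Johnson. With append=False it becomes form?Gary%20Johnson, and the field name is lost.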
Okay, that was a LONG message, but I hope I've communicated my points
clearly. Thanks to all who read this far, and thanks to everyone for
making Gemini such an active and engaging community!
I've attached a short (47 line) CGI script (for Space Age) that
implements the dynamic form example described in this email. If clients
would append user input params (with &) to existing query strings rather
than replace them, it should work perfectly. Until then, it will just
have to feel a bit sad and dejected.
Whose client is going to make it work first? I wait eagerly with bated
breath to find out.
Happy hacking!
Gary
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: form.clj
URL: <https://lists.orbitalfox.eu/archives/gemini/attachments/20210127/4d28aea6/attachment.ksh>
-------------- next part --------------
--
GPG Key ID: 7BC158ED
Use `gpg --search-keys lambdatronic' to find me
Protect yourself from surveillance: https://emailselfdefense.fsf.org
=======================================================================
() ascii ribbon campaign - against html e-mail
/\ www.asciiribbon.org - against proprietary attachments
Why is HTML email a security nightmare? See https://useplaintext.email/
Please avoid sending me MS-Office attachments.
See http://www.gnu.org/philosophy/no-word-attachments.html
--------
Date: Fri, 29 Jan 2021 13:05:31 +0100
Gary Johnson <lambdatronic at disroot.org> wrote
=> form?$SESSION&name Name: $NAME
=> form?$SESSION&password Password: $PASSWORD
=> form?$SESSION&smog SMOG is great: $SMOG
=> form?$SESSION&plant Best Astrobotany Plant: $PLANT
=> form?$SESSION&submit Submit Answers
[...]
(Obviously, a more robust state management mechanism could be achieved
with client certs and a DB, but I just mean to show a very simple
example here.)
Yes, if the client supports client certificates, we can skip sending
$SESSION and use the regular inputs:
=> gemini://awesome.capsule.net/form/name Name => gemini://awesome.capsule.net/form/password Password => gemini://awesome.capsule.net/form/smog SMOG is great => gemini://awesome.capsule.net/form/plant Best Astrobotany Plant => gemini://awesome.capsule.net/form/submit Submit Answers
[...]
The intention of this example is that the clients would produce requests
of this form after each input prompt:
=> gemini://awesome.capsule.net/form?$SESSION&name&Gary%20Johnson
=> gemini://awesome.capsule.net/form?$SESSION&password&secret
=> gemini://awesome.capsule.net/form?$SESSION&smog&yes
=> gemini://awesome.capsule.net/form?$SESSION&plant&Ficus
where $SESSION is whatever value was generated by the CGI script on the
first page load.
I do not understand this example.
When using regular inputs, the client will send these requests:
gemini://awesome.capsule.net/form/name?Gary%20Johnson
gemini://awesome.capsule.net/form/password?secret
gemini://awesome.capsule.net/form/smog?yes
gemini://awesome.capsule.net/form/plant?Ficus
gemini://awesome.capsule.net/form/submit
(No "?" on "submit" since it's just telling the server that we're done.)
What is the benefit of doing it your way?
With this information in the query params, it would be easy to store a
lookup table in the CGI script that mapped session -> field -> value,
and these values can then be easily inserted into the original Gemtext
template form above (see Section 3.1) in response to these requests.
If you format the URLs like this:
gemini://$HOST/path/to/script/$FIELD?$VALUE
...then $FIELD should show up as PATH_INFO (probably with a leading "/")
and $VALUE as QUERY_STRING.
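A minimal Python sketch of that split, assuming the server follows RFC 3875 and hands the script PATH_INFO already decoded while QUERY_STRING is still percent-encoded. The function name field_and_value is hypothetical.

```python
from urllib.parse import unquote


def field_and_value(environ):
    """Return (field, value) from CGI variables.

    The value is None when the request carried no user input.
    """
    # PATH_INFO is "/name" for .../script/name; drop the leading slash.
    field = environ.get("PATH_INFO", "").lstrip("/")
    raw = environ.get("QUERY_STRING", "")
    value = unquote(raw) if raw else None
    return field, value
```

In a real CGI script the argument would be os.environ. A request for .../form/name?Gary%20Johnson would then yield ("name", "Gary Johnson").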
[...]
The only problem I'm running into here is that the various Gemini
clients I've tested (elpher, bombadillo, kristall) don't actually append
a user's input as an additional parameter to an existing query string if
one is present. Instead, bombadillo and kristall just overwrite the
existing query string and only return ?$NEW_INPUT. Elpher, on the other
hand, just creates invalid URLs by simply appending ?$NEW_INPUT to
whatever is already in the URL (e.g.,
gemini://awesome.capsule.net/form?$SESSION&smog?yes). Neither of these
behaviors does what I'd want or expect here.
Elpher is doing something weird here but the others are handling inputs as
intended.
I think the culprit then is probably Gemini Protocol Specification
section 3.2.1 1x (INPUT):
[...]
As far as I can tell, the fix here is for Solderpunk to update the text
in section 3.2.1 to indicate that if a query string is already part of
the request leading to an INPUT response, then the user's input should
be appended (using &) to the existing query string rather than replacing
it wholesale (using ?).
This is not a necessary spec change.
Otherwise, we really have no way to input more than one query param
(with &) other than asking the user to type it directly into the INPUT
prompt (e.g., cat&dog&pig).
The responsibility for collecting parameters falls on the server, not on the
client. The only thing the client needs to do is send one query for each
field.
I'm hoping this isn't the spec's intention
here and that we just have a case of ambiguous wording that has led some
client authors to create divergent (or broken) implementations
Sorry to disappoint you. I suggest leaving the ampersands to the web
queries.
[...]
I've attached a short (47 line) CGI script (for Space Age) that
implements the dynamic form example described in this email.
Thank you for providing example code and I'm sorry for not doing the same.
--
Katarina
--------
Date: Sat, 30 Jan 2021 15:54:39 -0500
## Section 3.3: (DESIRED) Client-side Requests
>
>
> The intention of this example is that the clients would produce requests
> of this form after each input prompt:
>
> => gemini://awesome.capsule.net/form?$SESSION&name&Gary%20Johnson
> => gemini://awesome.capsule.net/form?$SESSION&password&secret
> => gemini://awesome.capsule.net/form?$SESSION&smog&yes
> => gemini://awesome.capsule.net/form?$SESSION&plant&Ficus
>
> where $SESSION is whatever value was generated by the CGI script on the
> first page load.
>
I do not understand this example.
When using regular inputs, the client will send these requests:
gemini://awesome.capsule.net/form/name?Gary%20Johnson
gemini://awesome.capsule.net/form/password?secret
gemini://awesome.capsule.net/form/smog?yes
gemini://awesome.capsule.net/form/plant?Ficus
gemini://awesome.capsule.net/form/submit
(No "?" on "submit" since it's just telling the server that we're done.)
What is the benefit of doing it your way?
Hi Katarina,
Thanks for taking the time to reply to my message. I'll try to clarify
my point here.
The issue I'm raising is that there appears to be no way to pass more
than one piece of information at a time in our query strings. This has a
very significant impact on any writers of CGI scripts, which is how many
Gemini servers allow users to add dynamic pages to their capsules.
But why, you ask?
Because each CGI script is available at a particular file path and
therefore additional path segments can't be used to pass information to
them. They have to get their inputs from the query string.
This is a script. It probably returns a 20 response:
gemini://awesome.capsule.net/form.clj
If I want to fill in a name field on that page, I might provide a link
like this:
gemini://awesome.capsule.net/form.clj?name
This calls the CGI script with a query parameter. Great! The script can
use "name" to look up the appropriate response. Here it is:
10 Please enter your name\r\n
However, when the user fills in their name, the browser will now send
this request to the server:
gemini://awesome.capsule.net/form.clj?Gary%20Johnson
There is no way for the CGI script to know that this is a name value and
not the value for any other form field on the page.
And therein lies the rub. If the only way to associate input values with
the variables they represent is with path segments, then CGI scripts
simply can't ever use more than one input field per page. Even then, if
the query string used to trigger a 10 INPUT response is typed by the
user (into the totally free form text field they are presented), then
the server will continue to respond with yet another 10 INPUT response.
This would make a form with N fields require N+1 separate CGI scripts,
all chained together via links that represent the directory structure
into which they are installed.
This is an absolute nightmare scenario for programming anything that
wants to accept user inputs.
So what does this mean for Geminispace?
It means essentially that CGI scripts are currently second-class
citizens, and the only people who can write dynamic capsules are server
authors (or people willing to hack on server code). This is because
encoding information using path segments requires injecting custom
routing table code into the server's request handler.
As a server author, I am capable of creating a custom fork of my server
with a new routing table for each dynamic capsule I want to build.
However, I suspect the majority of Gemini users are not going to have
both the skill and willingness to engage in this level of coding on
their pages.
That is why I and many other authors have added support for CGI scripts
to our servers. But under the "only one piece of information in the
query string" paradigm, these scripts are currently rather handicapped
when it comes to accepting user input.
Hopefully, I've made the technical merits of my case clear here.
## Section 4.2: Append Don't Replace!
>
>
> As far as I can tell, the fix here is for Solderpunk to update the text
> in section 3.2.1 to indicate that if a query string is already part of
> the request leading to an INPUT response, then the user's input should
> be appended (using &) to the existing query string rather than replacing
> it wholesale (using ?).
>
This is not a necessary spec change.
Yes, it really is if anyone other than server authors is ever going to
be able to write their own dynamic pages.
Otherwise, we really have no way to input more than one query param
> (with &) other than asking the user to type it directly into the INPUT
> prompt (e.g., cat&dog&pig).
The responsibility for collecting parameters falls on the server, not on the
client. The only thing the client needs to do is send one query for each
field.
Again, see above. A single query value cannot be associated with its
variable without adding a custom routing table to the server to enable
the parsing of path segment data as additional inputs.
I'm hoping this isn't the spec's intention
> here and that we just have a case of ambiguous wording that has led some
> client authors to create divergent (or broken) implementations
>
Sorry to disappoint you. I suggest leaving the ampersands to the web
queries.
I'm afraid we disagree here.
Thank you for providing example code and I'm sorry for not doing the same.
If you can write a CGI script that can correctly associate INPUT
responses with their intended variables, please share it. I suspect it
would be quite educational.
Happy hacking,
Gary
--------
Date: Sat, 30 Jan 2021 22:19:25 +0000
This is going to be weird, because I disagree with almost everything you've said except that appending the query string should be guaranteed. I hope this is helpful.
January 30, 2021 1:54 PM, "Gary Johnson" <lambdatronic at disroot.org> wrote:
The issue I'm raising is that there appears to be no way to pass more
than one piece of information at a time in our query strings. This has a
very significant impact on any writers of CGI scripts, which is how many
Gemini servers allow users to add dynamic pages to their capsules.
But why, you ask?
Because each CGI script is available at a particular file path and
therefore additional path segments can't be used to pass information to
them. They have to get their inputs from the query string.
%< ------------------------------
This would make a form with N fields require N+1 separate CGI scripts,
all chained together via links that represent the directory structure
into which they are installed.
This is an absolute nightmare scenario for programming anything that
wants to accept user inputs.
Well, you *could* pass extra path info to the script... So, the script at cgi-bin/index.cgi handles all cgi-bin/* and treats the path after cgi-bin as positional arguments
So what does this mean for Geminispace?
It means essentially that CGI scripts are currently second-class
citizens, and the only people who can write dynamic capsules are server
authors (or people willing to hack on server code). This is because
encoding information using path segments requires injecting custom
routing table code into the server's request handler.
CGI scripts *are* second class citizens in Gemini, but it's because the UX and dev-op experience of line based input is terrible. The fact that a static routing table is more performant and has a better security profile than parsing the path info dynamically is less relevant than the fact that this is a line based protocol
%< ------------------------------
> ## Section 4.2: Append Don't Replace!
>> As far as I can tell, the fix here is for Solderpunk to update the text
>> in section 3.2.1 to indicate that if a query string is already part of
>> the request leading to an INPUT response, then the user's input should
>> be appended (using &) to the existing query string rather than replacing
>> it wholesale (using ?).
>
> This is not a necessary spec change.
Yes, it really is if anyone other than server authors is ever going to
be able to write their own dynamic pages.
Now, "Append, don't replace," is a reasonable expectation to make of clients and it's still useful for the devops situation, even if it's not *strictly* necessary
%< ------------------------------
If you can write a CGI script that can correctly associate INPUT
responses with their intended variables, please share it. I suspect it
would be quite educational.
The two alternatives to requiring clients to preserve collected state in the query parameter are to save state in the CGI script or to pass positional arguments via the path. I think append is reasonable. It also preserves principle of least surprise and other desirable qualities
CGI *is* going to be second class in Gemini as long as forms aren't an option, but that's a consequence of the decision to support line-based clients. Appending the query doesn't do violence to that design
Chris
--------
Date: Sat, 30 Jan 2021 18:59:47 -0500
It was thus said that the Great Gary Johnson once stated:
The issue I'm raising is that there appears to be no way to pass more
than one piece of information at a time in our query strings. This has a
very significant impact on any writers of CGI scripts, which is how many
Gemini servers allow users to add dynamic pages to their capsules.
But why, you ask?
Because each CGI script is available at a particular file path and
therefore additional path segments can't be used to pass information to
them. They have to get their inputs from the query string.
[ snip ]
It means essentially that CGI scripts are currently second-class
citizens, and the only people who can write dynamic capsules are server
authors (or people willing to hack on server code). This is because
encoding information using path segments requires injecting custom
routing table code into the server's request handler.
Not if the CGI interface is properly written. All I had to do was write
this CGI script and drop it into my tests directory [1]:
gemini://gemini.conman.org/test/pathseg.cgi
It uses only three of the RFC-3875 defined variables, QUERY_STRING,
SCRIPT_NAME and PATH_INFO to do all the work. The script will ask for three
fields and then present a final page with all three fields. But the script
will only work if all three variables are defined per RFC-3875 (PATH_INFO is
the tricky one).
Yes, it's a bit ugly and yes, it's a second class citizen and yes, it
requires a proper CGI module to work, but it can be done without the
configuration you think it does. The script just simply appends each input
field as the path, so if you enter
and a one
and a two
skidoosh
as the values, the final URL will be:
gemini://gemini.conman.org/test/pathseg.cgi/and%20a%20one/and%20a%20two/skidoosh
Yes, I could have done a bit more processing, naming each segment:
/test/pathseg.cgi/name=and%20a%20one/age=and%20a%20two/action=skidoosh
but I was lazy and wanted to just do a proof-of-concept here.
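For comparison, the named-segment variant could be parsed along these lines. This is a Python sketch rather than part of the Lua script below; the function name parse_named_segments is hypothetical, and it assumes PATH_INFO reaches the script still percent-encoded.

```python
from urllib.parse import unquote


def parse_named_segments(path_info):
    """Turn "/name=a/age=b/action=c" into {"name": "a", "age": "b", ...}."""
    fields = {}
    for segment in path_info.split("/"):
        if not segment:
            continue  # skip the empty piece before the leading slash
        name, sep, value = segment.partition("=")
        if sep:  # ignore segments that are not of the form name=value
            fields[unquote(name)] = unquote(value)
    return fields
```

A value that itself contains "/" or "=" would need to be percent-encoded before it is appended to the path.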
If you can write a CGI script that can correctly associate INPUT
responses with their intended variables, please share it. I suspect it
would be quite educational.
I have added it [3].
-spc
[1] I gave the script a .cgi extension just to drive the point
home---for my server, GLV-1.12556 [2], the extension of a CGI script
doesn't matter at all.
[2] https://github.com/spc476/GLV-1.12556
[3] Here you go. It's in Lua, but it's easy going except for the first
bit, which is a bit of boilerplate I needed for encoding and
decoding various strings. The main logic is marked though, so you
can skip the first section.
-- ************************************************************************
-- Decoding and Encoding crap, not much to see here, citizen! Move along!
-- ************************************************************************

local lpeg = require "lpeg"

local xdigit = lpeg.locale().xdigit

-- Decode one character of a query string: %XX escapes become the byte they
-- name, "+" becomes a space, anything else passes through unchanged.
local char = lpeg.P"%" * lpeg.C(xdigit * xdigit)
           / function(c)
               return string.char(tonumber(c,16))
             end
           + lpeg.P"+" / " "
           + lpeg.P(1)
local decode_query = lpeg.Cs(char^1)

local function tohex(c)
  return string.format("%%%02X",string.byte(c))
end

-- Characters that must be percent-encoded before going back into a URL.
local unsafe = lpeg.P" "  / "%%20"
             + lpeg.P"#"  / "%%23"
             + lpeg.P"%"  / "%%25"
             + lpeg.P"<"  / "%%3C"
             + lpeg.P">"  / "%%3E"
             + lpeg.P"["  / "%%5B"
             + lpeg.P"\\" / "%%5C"
             + lpeg.P"]"  / "%%5D"
             + lpeg.P"^"  / "%%5E"
             + lpeg.P"{"  / "%%7B"
             + lpeg.P"|"  / "%%7C"
             + lpeg.P"}"  / "%%7D"
             + lpeg.P'"'  / "%%22"
             + lpeg.R("\0\31","\127\255") / tohex

-- Inside a path, "?" must be escaped as well.
local char_path = lpeg.P"?" / "%%3F"
                + unsafe
                + lpeg.P(1)
local esc_path = lpeg.Cs(char_path^0)

-- ************************************************************************
-- The main script starts here
-- ************************************************************************

local query       = os.getenv("QUERY_STRING")
local script_name = os.getenv("SCRIPT_NAME")
local pathinfo    = os.getenv("PATH_INFO")

-- First visit: no fields collected and no input yet, so ask for the first one.
if not pathinfo and query == "" then
  io.stdout:write("Status: 10\n")
  io.stdout:write("Content-Type: Input field\n")
  io.stdout:write("\n")
  os.exit(0,true)
end

-- First input received: redirect so it becomes the first path segment.
if not pathinfo then
  query = decode_query:match(query)
  query = esc_path:match(query)
  io.stdout:write("Status: 30\n")
  io.stdout:write(string.format("Location: %s/%s\n",script_name,query))
  io.stdout:write("\n")
  os.exit(0,true)
end

-- All three fields are present in the path: show the final page.
if pathinfo:match("^/[^/]*/[^/]*/[^/]*") then
  local f1,f2,f3 = pathinfo:match("^/([^/]*)/([^/]*)/([^/]*)")
  f1 = decode_query:match(f1)
  f2 = decode_query:match(f2)
  f3 = decode_query:match(f3)

  io.stdout:write("Status: 20\n")
  io.stdout:write("Content-Type: text/gemini\n")
  io.stdout:write("\n")
  io.stdout:write("The three fields you input:\n")
  io.stdout:write("\n")
  io.stdout:write(string.format("* %s\n",f1))
  io.stdout:write(string.format("* %s\n",f2))
  io.stdout:write(string.format("* %s\n",f3))
  io.stdout:write("\n")
  io.stdout:write(string.format("=> %s Try again\n",script_name))
  os.exit(0,true)
end

-- Some fields collected: either prompt for the next one, or redirect so the
-- new input is appended to the path.
if query == "" then
  io.stdout:write("Status: 10\n")
  io.stdout:write("Content-Type: Input next field\n")
  io.stdout:write("\n")
  os.exit(0,true)
else
  query    = decode_query:match(query)
  query    = esc_path:match(query)
  pathinfo = esc_path:match(pathinfo)
  io.stdout:write("Status: 30\n")
  io.stdout:write(string.format("Location: %s%s/%s\n",script_name,pathinfo,query))
  io.stdout:write("\n")
  os.exit(0,true)
end
--------
Date: Sat, 30 Jan 2021 22:09:55 -0500
On Wed, Jan 27, 2021 at 5:20 PM Gary Johnson <lambdatronic at disroot.org>
wrote:
I don't think form-like data submission should be seen as an evil. It
allows us to implement a wide variety of CGI-style applications that do
all their computing on the server side (often through some script
extension mechanism).
+1
which $SESSION, $NAME, $PASSWORD, $SMOG, and $PLANT are defined (or default
to empty strings). When the page first loads, we create a new
$SESSION value in our CGI script and insert it into the links to
preserve state across requests until we restart the server or the user
refreshes the page.
I think this is exactly the Right Thing.
(Obviously, a more robust state management mechanism could be achieved
with client certs and a DB, but I just mean to show a very simple
example here.)
A TLS session is not the same as an application session. I may, for
example, have two tabs (or whatever) open in my Gemini browser that refer
to the same access-controlled capsule, and which therefore must be accessed
with the same cert. Nevertheless, the two pages should operate as distinct
sessions: I should be able to fill out a form in one page while searching
help documents in the other. So I think a session ID is the Right Thing.
However, this is a matter of server/capsule/CGI design, not of the Gemini
protocol.
While this example creates more back-and-forth requests than a proper
client-side form would generate, I hope it demonstrates that Gemini and
Gemtext in their current incarnations are already sufficiently complete
to build interactive CGI applications with them today.
The biggest problem is most likely the cost of setting up and tearing down
all the TLS connections, but there is no help for that.
The requested resource accepts a line of textual user input. The <META>
line is a prompt which should be displayed to the user. The same
resource should then be requested again with the user's input included
as a query component.
"Included" is a vague word, and should be fixed whether we do appending or
not.
As far as I can tell, the fix here is for Solderpunk to update the text
in section 3.2.1 to indicate that if a query string is already part of
the request leading to an INPUT response, then the user's input should
be appended (using &) to the existing query string rather than replacing
it wholesale (using ?).
I suggest that if there is no query part, we append ? followed by the
user's input, whereas if there is, we just append the user's input. That
lets a simple form work like this:
1) Suppose Fluffy (a server) wants me to send my name and email address.
Fluffy sends this bare-bones text/gemini document, which we will say comes
from gemini://fluffy.example/form1, to my client Aarfy.
2) Let's say I choose the first link. Fluffy sends Aarfy 10 Enter your name.
I type John Cowan into Aarfy, which sends the URL
gemini://fluffy.example/form1?session=ABC&name=John%20Cowan. Fluffy sends
this new document to Aarfy:
[John Cowan]: ?session=ABC&name=
?session=ABC&name=John%20Cowan&email=
3) If I choose the first link, I can change my name. If I choose the
second link, Fluffy will send Aarfy 10 Enter your email. I type
cowan at ccil.org into Aarfy, which sends the URL
gemini://fluffy.example/form1?session=ABC&name=John%20Cowan&email=
cowan at ccil.org. Fluffy sends this third document to Aarfy:
[John Cowan]: ?session=ABC&email=cowan at ccil.org&name=
[cowan at ccil.org] ?session=ABC&name=John%20Cowan&email=
[cowan at ccil.org] ?session=ABC&name=John%20Cowan&email=
cowan at ccil.org&submit
4) If I choose the first or second link again, I can change my name or
email address. But if I choose the third link, which Fluffy does *not*
interpret as a search link, Fluffy will write my name and email into a
database, or send me an email saying "HA HA HA!", or whatever it does.
Because all that happens is following links and reading input lines, it
does not matter if Aarfy is a CLI, TUI, or GUI client: the protocol
exchanges work in any case. Furthermore, Fluffy does not have to retain
partial state, because it is passed back and forth between Aarfy and Fluffy
with no real interpretation at either end until Aarfy receives a submission
URL.
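The rule suggested above (add "?" plus the input when the URL has no query part, otherwise add just the input) could be sketched in a client roughly as follows. This is an illustrative Python sketch, not code from any existing client: the function name url_after_input is hypothetical, and percent-escaping the whole input is an assumption.

```python
from urllib.parse import quote


def url_after_input(url: str, user_input: str) -> str:
    """Build the follow-up request URL after a 10 INPUT response.

    No query part yet: append "?" followed by the escaped input.
    Query part already present: append the escaped input directly.
    """
    escaped = quote(user_input, safe="")  # escapes spaces, "&", "=", etc.
    separator = "" if "?" in url else "?"
    return url + separator + escaped
```

With this rule, a link ending in "name=" turns the typed line into the value of that field, while a plain search URL with no query still behaves as it does today.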
For that matter there is no real need to have a submission link: a URL
that specifies both name and email could be interpreted as a submission.
As before, this is a matter of design, not protocol.
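As a rough illustration of the server side of this exchange, here is a Python sketch in which the response is decided from the query string alone, with no stored state. It is not taken from any real server. The names FIELDS, edit_link and respond are hypothetical, and it adopts the variant without a submission link, where a query carrying every field counts as a submission.

```python
from urllib.parse import parse_qsl, quote

FIELDS = ("name", "email")  # hypothetical fields of the example form


def edit_link(fields: dict, target: str) -> str:
    """Relative link whose query ends in "target=", ready for appended input."""
    pairs = [f"{k}={quote(v, safe='')}" for k, v in fields.items() if k != target]
    pairs.append(f"{target}=")
    return "?" + "&".join(pairs)


def respond(query: str):
    """Decide the response from the query alone; nothing is stored server-side."""
    pairs = parse_qsl(query, keep_blank_values=True)
    fields = dict(pairs)
    if pairs and pairs[-1][1] == "":
        # The client followed an edit link: ask for that field's value.
        return ("input", f"Enter your {pairs[-1][0]}")
    if all(fields.get(f) for f in FIELDS):
        # Every field is present, so treat the request as a submission.
        return ("submit", fields)
    # Otherwise show the form again, one edit link per field.
    return ("form", [edit_link(fields, f) for f in FIELDS])
```

For example, respond("session=ABC&name=John%20Cowan") yields the two edit links "?session=ABC&name=" and "?session=ABC&name=John%20Cowan&email=".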
John Cowan http://vrici.lojban.org/~cowan cowan at ccil.org
There is no real going back. Though I may come to the Shire, it will
not seem the same; for I shall not be the same. I am wounded with
knife, sting, and tooth, and a long burden. Where shall I find rest?
--Frodo
--------
Date: Sun, 31 Jan 2021 09:51:53 +0100
> Gemtext in their current incarnations are already sufficiently
> complete to build interactive CGI applications with them today.
> The biggest problem is most likely the cost of setting up and tearing down
> all the TLS connections, but there is no help for that.
Well, there is "0-RTT" TLS session resumption with early data. That
reduces the overheads substantially (though it still requires a fresh
TCP connection for each request). As far as I know no server supports
0-RTT currently, and I think only a few clients do. But it would fit
well with heavy CGI use.
--------
Date: Sun, 31 Jan 2021 13:07:51 -0500
Sean Conner <sean at conman.org> writes:
Not if the CGI interface is properly written. All I had to do was write
this CGI script and drop it into my tests directory [1]:
gemini://gemini.conman.org/test/pathseg.cgi
It uses only three of the RFC-3875 defined variables, QUERY_STRING,
SCRIPT_NAME and PATH_INFO to do all the work. The script will ask for three
fields and then present a final page with all three fields. But the script
will only work if all three variables are defined per RFC-3875 (PATH_INFO is
the tricky one).
Yes, it's a bit ugly and yes, it's a second class citizen and yes, it
requires a proper CGI module to work, but it can be done without the
configuration you think it does. The script just simply appends each input
field as the path, so if you enter
and a one
and a two
skidoosh
as the values, the final URL will be:
gemini://gemini.conman.org/test/pathseg.cgi/and%20a%20one/and%20a%20two/skidoosh
Yes, I could have done a bit more processing, naming each segment:
/test/pathseg.cgi/name=and%20a%20one/age=and%20a%20two/action=skidoosh
but I was lazy and wanted to just do a proof-of-concept here.
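The named-segment variant mentioned above could be unpacked along these lines. This is a hedged Python sketch, not the pathseg.cgi script itself (whose source is not shown here). The function name fields_from_path is hypothetical, and it assumes the extra path is still percent-encoded. A server that follows RFC-3875 closely may pass PATH_INFO already decoded, in which case the unquote step would be dropped.

```python
from urllib.parse import unquote


def fields_from_path(extra_path: str) -> dict:
    """Turn "/name=a/age=b" style path segments into a dict of fields."""
    fields = {}
    # Split on "/" before decoding so an escaped "%2F" inside a value survives.
    for segment in extra_path.split("/"):
        if not segment:
            continue  # skip the empty piece before the leading "/"
        key, sep, value = segment.partition("=")
        if not sep:
            raise ValueError(f"segment without '=': {segment!r}")
        fields[unquote(key)] = unquote(value)
    return fields
```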
Thanks for sharing some code, Sean. I, of course, realize that one could
write a CGI script to pick apart the PATH_INFO for user inputs. The
issue I raised in my message was that this doesn't make any sense in the
context of a CGI script which is looked up using the path on the remote
filesystem.
In your example, your script is located at /test/pathseg.cgi. However,
lacking side information, I see no indicator (outside of the --
admittedly optional -- cgi extension on your file name) of which path
segments should be considered part of the CGI filename lookup and which
parts are meant to be user input data in your example link:
/test/pathseg.cgi/name=and%20a%20one/age=and%20a%20two/action=skidoosh
This feels like a massive hack to me and an abuse of path segments TBH.
If I were to embrace this approach, I can see that I would have to
reprogram my server to do some additional path preprocessing magic. I
could either:
1. Check every sequence of path segments starting from the document root
to see if any of them correspond to an executable file or have the
blessed CGI file extension for my server.
2. (To use Chris Babcock's suggestion), check every sequence of path
segments starting from the document root to see if any of them
correspond to a directory containing an index.cgi file.
3. Include an input parameter to my server (on the command line or in a
config file) that specifies a particular mapping between path
segments and CGI scripts on my filesystem. That is, I would be
defining a routing table at server start time. This approach has the
unfortunate side effect of preventing users on a pubnix from
installing CGI scripts in their ~/public_gemini capsules without
getting the server administrator to update the global routing table
on their behalf. Alternatively, it would require each ~/public_gemini
capsule to include a routing table config file within it if it wanted
to support CGI scripts, and these would have to be read and parsed by
the server both at server start time and/or on a periodic interval or
event-based basis in order to support new user scripts as they are
added without having to restart the server.
Once one of these 3 approaches enables the server to successfully detect
that a particular path corresponds to a CGI script that is not actually
located where that path is pointing, then the server would need to
execute that script with PATH_INFO bound to the entire path. Every
installed CGI script would then be responsible for manually removing
SCRIPT_NAME from PATH_INFO and splitting it up to get the user inputs,
which puts an additional burden on CGI developers.
So I've now heard from multiple folks that we should all just get on
with these path segment hacks and accept that as the best we can do in
Gemini.
While I can see that it's technically possible (though arguably ugly) to
do so, I suppose my question is:
"What exactly does Gemini lose by allowing chained query parameters?
(with &)"
I can't for the life of me see any downside. It should literally be one
line of code changed in your favorite Gemini client. Just append inputs
to the query string if one already exists rather than replacing the
query string outright.
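To make the size of that change concrete, here is a small Python sketch contrasting the two client behaviours. It is only an illustration under assumptions: the function names are hypothetical, no particular client is being quoted, and the input is assumed to be percent-escaped before it is sent.

```python
from urllib.parse import quote


def replace_query(url: str, user_input: str) -> str:
    """Replacing behaviour: the input overwrites any existing query string."""
    base = url.split("?", 1)[0]
    return f"{base}?{quote(user_input, safe='')}"


def append_query(url: str, user_input: str) -> str:
    """Chaining behaviour: keep an existing query and add the input after "&"."""
    base, _, query = url.partition("?")
    escaped = quote(user_input, safe="")
    return f"{base}?{query}&{escaped}" if query else f"{base}?{escaped}"
```

For a URL with no query string the two functions give the same result, so existing single-input capsules would not notice the difference.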
I believe John Cowan is right that "include" is too vague a word in
Solderpunk's current specification for the 10 INPUT field. Both
appending and replacing are forms of inclusion, so any Gemini client
author who chooses to append shouldn't be in violation of the spec as it
is currently worded.
And changing that one line in your client could save every CGI script
writer in Geminispace a lot of additional work (as clearly demonstrated
by the examples shared by Katarina, Chris, and Sean).
Seems like a very positive return on investment for a small change.
What am I missing here, folks?
Any chance of weighing in here, Solderpunk?
With best intentions,
Gary
--
GPG Key ID: 7BC158ED
Use `gpg --search-keys lambdatronic' to find me
Protect yourself from surveillance: https://emailselfdefense.fsf.org
=======================================================================
() ascii ribbon campaign - against html e-mail
/\ www.asciiribbon.org - against proprietary attachments
Why is HTML email a security nightmare? See https://useplaintext.email/
Please avoid sending me MS-Office attachments.
See http://www.gnu.org/philosophy/no-word-attachments.html
--------
Date: Sun, 31 Jan 2021 18:16:18 -0500
It was thus said that the Great Gary Johnson once stated:
Sean Conner <sean at conman.org> writes:
> Not if the CGI interface is properly written. All I had to do was write
> this CGI script and drop it into my tests directory [1]:
>
> gemini://gemini.conman.org/test/pathseg.cgi
[ snip ]
Thanks for sharing some code, Sean. I, of course, realize that one could
write a CGI script to pick apart the PATH_INFO for user inputs. The
issue I raised in my message was that this doesn't make any sense in the
context of a CGI script which is looked up using the path on the remote
filesystem.
In your example, your script is located at /test/pathseg.cgi. However,
lacking side information, I see no indicator (outside of the --
admittedly optional -- cgi extension on your file name) of which path
segments should be considered part of the CGI filename lookup and which
parts are meant to be user input data in your example link:
/test/pathseg.cgi/name=and%20a%20one/age=and%20a%20two/action=skidoosh
That's a particular implementation detail of GLV-1.12556 [1]. Other
servers could require the extension, or some other mechanism.
This feels like a massive hack to me and an abuse of path segments TBH.
If I were to embrace this approach, I can see that I would have to
reprogram my server to do some additional path preprocessing magic. I
could either:
1. Check every sequence of path segments starting from the document root
to see if any of them correspond to an executable file or have the
blessed CGI file extension for my server.
I see your server just accepts the requested path as is. GLV-1.12556
(once it gets into the filesystem handler) walks down the document root
checking each path segment looking for an executable file (which indicates a
CGI script) or symbolic link (which indicates a SCGI script).
Once one of these 3 approaches enables the server to successfully detect
that a particular path corresponds to a CGI script that is not actually
located where that path is pointing, then the server would need to
execute that script with PATH_INFO bound to the entire path. Every
installed CGI script would then be responsible for manually removing
SCRIPT_NAME from PATH_INFO and splitting it up to get the user inputs,
which puts an additional burden on CGI developers.
If you want to follow RFC-3875, that's not the case. PATH_INFO only
contains data past the script name (section 4.1.5). This link:
gemini://gemini.conman.org/cgi
returns
SCRIPT_NAME = /cgi
There is no PATH_INFO or PATH_TRANSLATED because it's not needed. However:
gemini://gemini.conman.org/cgi/path/to/nowhere
returns
SCRIPT_NAME = /cgi
PATH_INFO = /path/to/nowhere
PATH_TRANSLATED = /home/spc/projects/gemini/non-checkin/gemini.conman.org/path/to/nowhere
The work is on the server side, not the CGI script side.
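The server-side split described above can be sketched as follows. This is a simplified Python illustration, not GLV-1.12556's actual code. The name split_script_path is hypothetical, and is_script is a hypothetical predicate standing in for whatever check a server uses (for instance, testing for an executable file under the document root).

```python
from typing import Callable, Optional, Tuple


def split_script_path(
    path: str, is_script: Callable[[str], bool]
) -> Optional[Tuple[str, str]]:
    """Walk the path segment by segment and split it at the first script.

    Returns (SCRIPT_NAME, PATH_INFO), or None if no prefix is a script.
    Empty segments, including a trailing slash, are dropped in this sketch.
    """
    segments = [s for s in path.split("/") if s]
    for i in range(1, len(segments) + 1):
        candidate = "/" + "/".join(segments[:i])
        if is_script(candidate):
            rest = segments[i:]
            path_info = "/" + "/".join(rest) if rest else ""
            return candidate, path_info
    return None
```

The script then receives only the part after its own name in PATH_INFO, so it never has to strip SCRIPT_NAME itself.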
So I've now heard from multiple folks that we should all just get on
with these path segment hacks and accept that as the best we can do in
Gemini.
While I can see that it's technically possible (though arguably ugly) to
do so, I suppose my question is:
"What exactly does Gemini lose by allowing chained query parameters?
(with &)"
Nothing as far as I can see, as long as the characters '=' and '&' are
escaped if they appear in the input (to prevent confusion).
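A short Python sketch of that escaping requirement, using only the standard library. The helper name add_field is hypothetical, and the name=value layout of the chained parameters is an assumption for illustration.

```python
from urllib.parse import parse_qsl, quote


def add_field(query: str, name: str, value: str) -> str:
    """Append name=value to a query string, escaping "=" and "&" in both parts."""
    pair = f"{quote(name, safe='')}={quote(value, safe='')}"
    return f"{query}&{pair}" if query else pair


# Input containing the delimiter characters survives a round trip.
chained = add_field("session=ABC", "motto", "a=b&c")
decoded = dict(parse_qsl(chained))
```

Because quote is called with safe="", the "=" and "&" inside the value become %3D and %26, so the receiving side can still split the query unambiguously.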
What am I missing here, folks?
Somebody to do a proof-of-concept probably.
Any chance of weighing in here, Solderpunk?
Is he still alive?
-spc
[1] https://github.com/spc476/GLV-1.12556
--------