💾 Archived View for gemi.dev › gemini-mailing-list › 000640.gmi captured on 2024-12-17 at 15:27:13. Gemini links have been rewritten to link to archived content

-=-=-=-=-=-=-

Proposal: Simple structured form specification

1. Russtopia (rmagee (a) gmail.com)

Hi all,
I have just started to get into Gemini, now running a simple server locally
with the Kristal client to test things out.

Reading the draft spec, I see there are only two ways to obtain input from
the user -- via responses 10 and 11 (INPUT and SENSITIVE INPUT).

While this allows simple single-field entry of values to the server, there
appears to be no facility to allow users to enter multi-field, structured
data in a single operation -- that is, simple forms.

If it does not violate the fundamental tenets of the Gemini project, I humbly
suggest an extension to the syntax of gemtext in the following manner to
enable structured multi-field forms. The logic and work here would be ~90% on
the client side, as it is mostly a convention for encoding structured forms
within .gmi documents, parsed and acted on by the client. Form submission
would require no additional core server-side logic; interpretation of
submitted form values would follow the standard patterns of request handling
with URL query parameters. No client or server state would be introduced.

I have used an encoding similar to what follows successfully in a past
'minimal HTML' project, to generate simple forms dynamically (but on the
server-side). I hope it might serve well to allow Gemini users a more
convenient way to interface with back-end programs, whilst preserving the
overall ethos of minimalism and privacy.

If the idea of adding a new link type to the gemtext specification is
anathema, please consider the proposal as a purely client-side convention,
displayed but otherwise ignored by existing clients as part of the
<USER-FRIENDLY LINK NAME> portion of the existing '=>' syntax instead.

--

PROPOSAL
5.5.4 Link line form encoding

Lines beginning with the two characters "?>" are form-link lines, which follow
the same rules as standard Link lines [5.4.2], plus a <FORM-SPEC> section
preceding the <USER-FRIENDLY LINK NAME>:

?>[<whitespace>]<URL>[<whitespace>]<FORM-SPEC><whitespace>[<USER-FRIENDLY LINK NAME>]

where:

<whitespace> is any non-zero number of consecutive spaces or tabs
Square brackets indicate that the enclosed content is optional.
<URL> is a URL, which may be absolute or relative.
<FORM-SPEC> is a list of one or more <form-field-specifier> items, separated
 by the Unicode/ASCII forward-slash '/'. The client uses <FORM-SPEC> to build
 a form entry popup, dialog, or series of input prompts to gather structured
 user input.

Each <form-field-specifier> follows the form

  T#VAR#DEFVALUE#LABEL

  where T ∈ [ s | x | c | b ]

's' denotes a string (freeform data) field;
'x' denotes a string/number SENSITIVE field, which the client MUST shroud as
  password-type data;
'c' denotes a choice (dropdown/one-of) field;
'b' denotes a boolean ( checkbox, yes/no ) field

's' fields are freeform text, and may be used for numbers, text, or freeform
  data. It is the server's responsibility to validate submitted data.
For 'c' fields, the first choice is the default, all following choices being
  delimited from it and each other by the pipe '|' character.
For 'b' fields, allowable DEFVALUEs are [0 | 1], [yes | no], or [on | off].
  The client is responsible for detecting these and returning sensible
  counterparts to the DEFVALUE if the user chooses their opposites.

VAR denotes the name of the form variable sent back to the server upon
  submission.
DEFVALUE denotes the default value to be displayed in the form. For choice
  type fields, the first item in the choice list is the default.
LABEL is text to be displayed by the client explaining the form field.
  Clients MUST use the VAR field as a default LABEL if the .gmi link omits one.
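
As an illustration of the 'b' rule above, a client might map a checkbox state back onto the vocabulary used by DEFVALUE roughly as follows. This is only a sketch: the names BOOLEAN_PAIRS and boolean_value are hypothetical, and returning lower-case values is an assumption the proposal does not specify.

```python
# Hypothetical helper: each pair is (false word, true word).
BOOLEAN_PAIRS = [("0", "1"), ("no", "yes"), ("off", "on")]


def boolean_value(defvalue, checked):
    """Return the submitted value for a 'b' field, in DEFVALUE's vocabulary."""
    for false_word, true_word in BOOLEAN_PAIRS:
        if defvalue.lower() in (false_word, true_word):
            return true_word if checked else false_word
    raise ValueError("not a recognised boolean DEFVALUE: " + repr(defvalue))
```

A field declared as b#DEBUG#1# would then submit DEBUG=0 when the user unticks it, and a field whose default is "off" would submit "on" when ticked.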

A valid <FORM-SPEC> field is of the form

  <form-field-specifier>{/<form-field-specifier>/...}

Example - three-field form requesting a string, a choice, and a boolean

  s#DELAY#5#Delay in seconds/c#SIZE#small|big|huge#Size of something/b#DEBUG#1#

Note the above example has no LABEL for the final form field, so the client
  should render a default label using the form variable's name, 'DEBUG'.
This would instruct the client to display a form, popup or series of prompts
  (in the case of a text-based client) to enter three items.
The choice field would default to 'small', the first item in its set.
The client would return a URI upon submission with the query parameters

  ?DELAY=5&SIZE=small&DEBUG=1

if the user submitted with default values.
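
To make the encoding concrete, here is a minimal client-side sketch of parsing a <FORM-SPEC> and producing the default query string. It assumes '/' as the field separator, as the <FORM-SPEC> definition states, and that every field carries all three '#' delimiters (including a trailing '#' when LABEL is empty). The function names parse_form_spec and default_query are hypothetical, not part of the proposal.

```python
from urllib.parse import urlencode


def parse_form_spec(spec):
    """Split a FORM-SPEC into a list of field dictionaries."""
    fields = []
    for item in spec.split("/"):
        # T#VAR#DEFVALUE#LABEL; maxsplit=3 keeps any '#' inside LABEL intact.
        kind, var, defvalue, label = item.split("#", 3)
        # For choice fields DEFVALUE is a '|' separated list, first one default.
        choices = defvalue.split("|") if kind == "c" else []
        fields.append({
            "type": kind,
            "var": var,
            "default": choices[0] if choices else defvalue,
            "choices": choices,
            "label": label or var,  # fall back to VAR when LABEL is omitted
        })
    return fields


def default_query(spec):
    """Build the query a client would send if no default is changed."""
    fields = parse_form_spec(spec)
    return "?" + urlencode([(f["var"], f["default"]) for f in fields])
```

A real client would replace each default with the value the user entered before encoding. Labels containing '/' would break this naive split, which the proposal does not address.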


A fully-realized example of this proposed syntax would thus be

?> gemini://example.org/formSubmit s#DELAY#5#Delay in seconds/c#SIZE#small|big|huge#Size of something/b#DEBUG#1# Please fill out this handy form

Suggested limits on form structure and data

Max <form-field-specifiers>: 8
Max <form-field-specifier> sub-field lengths
  (VAR, DEFVALUE, LABEL): 255 Unicode characters
Max 'c' type choice length per item: 64 Unicode characters
Max 'c' type choices: 64
Max 's' value: enforced by server-side endpoint handler

2. nothien (a) uber.space (nothien (a) uber.space)

Russtopia <rmagee at gmail.com> wrote:
> Hi all,

Hi!

> [... form proposal ... ]

The issue is that Gemini (and gemtext) is too close to being frozen
forever.  And most Gemini content out there doesn't need to use forms.

I have an alternative proposal: make the form on a new page, and use
something other than gemtext for it (and possibly something other than
Gemini the protocol).  I suggest instead making a gemtext-derived format
that is specially suited to forms, which has special line types and
stuff as necessary (in fact, I suggest adding a similar line type to the
one you gave in your proposal).  You could probably borrow from a lot of
the gemtext spec directly.  It would be best to use one of the
Gemini-analogue protocols which allow uploading (I know of Dioscuri,
Titan, and Inimeg) so that users could submit the form by performing an
upload process to the same URL.  Using a different protocol would be
best as then there are no upload limitations; with Gemini, the URL given
by the client has to be less than ~1024 bytes, which is not suited to
all forms.
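
For reference, the size constraint mentioned here can be checked before a client sends a form. The sketch below assumes the limit applies to the UTF-8 encoded request URL and uses 1024 bytes as the bound; the function name is hypothetical.

```python
MAX_REQUEST_URL_BYTES = 1024  # the URL length limit referred to above


def fits_in_gemini_request(url):
    """Return True if the URL, encoded as UTF-8, is within the limit."""
    return len(url.encode("utf-8")) <= MAX_REQUEST_URL_BYTES
```

A client could refuse to submit, or warn the user, when a filled-in form produces a URL for which this returns False.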

~aravk | ~nothien


3. Jason McBrayer (jmcbray (a) carcosa.net)

nothien at uber.space writes:
> Russtopia <rmagee at gmail.com> wrote:
>> [... form proposal ... ]

> The issue is that Gemini (and gemtext) is too close to being frozen
> forever. And most Gemini content out there doesn't need to use forms.

This is my personal opinion, but I would go further and say that Gemini
doesn't need forms. If you have fully featured forms, you can implement
largely any kind of application (excluding very interactive ones), and
in the history of the web, this led to re-implementing the whole
internet over port 80. I don't think Gemini needs to do that. The one
input lets you select different dynamic versions of a page, like for
different locales, or search results, or similar.

Having more complex forms is a temptation to implement applications on
Gemini, rather than using pairings of protocol+client that are more
appropriate (e.g. using NNTP for a message board).

-- 
Jason McBrayer      | "Strange is the night where black stars rise,
jmcbray at carcosa.net | and strange moons circle through the skies,
                    | but stranger still is lost Carcosa."
                    | -- Robert W. Chambers, The King in Yellow


4. Russtopia (rmagee (a) gmail.com)

On Tue, 26 Jan 2021 at 05:43, Jason McBrayer <jmcbray at carcosa.net> wrote:

> nothien at uber.space writes:
> > Russtopia <rmagee at gmail.com> wrote:
> >> [... form proposal ... ]
>
> > The issue is that Gemini (and gemtext) is too close to being frozen
> > forever. And most Gemini content out there doesn't need to use forms.
>
> This is my personal opinion, but I would go further and say that Gemini
> doesn't need forms. If you have fully featured forms, you can implement
> largely any kind of application (excluding very interactive ones), and
> in the history of the web, this led to re-implementing the whole
> internet over port 80. I don't think Gemini needs to do that. The one
> input lets you select different dynamic versions of a page, like for
> different locales, or search results, or similar.
>
> Having more complex forms is a temptation to implement applications on
> Gemini, rather than using pairings of protocol+client that are more
> appropriate (e.g. using NNTP for a message board).
>
>
Point well taken -- I know it is a slippery slope, but I still think forms
could be supported in a manner which keeps complexity down.

This form-builder idea could be implemented with no protocol changes,
100% client-side, within the existing => link syntax. To be user-friendly,
it would require some sort of agreement by clients on the FORM-SPEC syntax.
Consider if clients agreed to act on documents ending in .gmif
('gemini-form'):

=> gemini://example.site/form.gmif This is a form link, click to fill out

.. clients could, by convention, parse such .gmif files containing the
FORM-SPEC, using it to build a dynamic entry form. After form entry, the
client would encode it and send it back to the same endpoint (here,
form.gmif) so the server could act on the values.

For clients that did not choose to implement forms, they would by default
simply display .gmif documents as any other plaintext resource.

The .gmif document holding the <FORM-SPEC> could even have comments with
instructions on how to manually create a query to the endpoint, for users of
older clients, e.g.:

[form.gmif]

s#DELAY#5#Delay in seconds
c#SIZE#small|big|huge#Size of something
b#DEBUG#1#

// If you are seeing this page, your gemini client does not support
// automatic dynamic forms.
// You can still use this service by building a FORM-SPEC matching the
// above syntax and pasting it to this page's URI, eg:
//
//    gemini://example.site/form.gmif?DELAY=5&SIZE=small&DEBUG=1



As ~nothien points out, there's a limit of ~1024 bytes on URLs which I think
is a good thing; working within the protocol as-is will naturally force
relatively small, simple forms.

So I guess I'm now proposing an optional client-side '.gmif' form standard :)

-Russ

5. easrng (easrng (a) gmail.com)

I agree that Gemini should have forms, but that format doesn't look
great to me. What if forms were implemented as their own pages, not a
new link line type? I'm thinking something like this:

# Gemini Form Proposal
## Basic Format
Form fields are on their own line. Any line starting with `[in` is
considered a form element. This allows clients to continue to
determine line types by the first 3 characters. After the `[in`, a
dash is required.
After the dash, the input type is listed. If a client cannot handle an
input type, it SHOULD fall back to text. The rest of the line, up to
the first ] is used as options for the input. Next, there is a ]
character, then an optional text label.
## Escape sequences
In order to allow for ] characters, newlines, and literal backslashes
in input options, preface them with a backslash.
## Input types
Other types could be added, like number, email, phone, color, or even
file, but I feel this is an acceptable minimum set of types.
### text
A text input is a single line text input. Its format is as follows.
(Things in parentheses are optional)
 ```
[in-text <name>( <initial value>)]
 ```
For example,
 ```
[in-text userFullName Alex Fierro] What is your name?
 ```
### password
A password input is a single line text input that MUST hide the value
typed into it. For security reasons, there is no way to prefill a
password input. Its format is as follows.
 ```
[in-password <name>]
 ```
For example,
 ```
[in-password newPassword] Choose a new password
 ```
### multiline
A multiline input is a text input that accepts any number of lines.
Its format is as follows. (Things in parentheses are optional)
 ```
[in-multiline <name> ( <initial value>)]
 ```
For example,
 ```
[in-multiline bio I am a demigod who enjoys art and the colors pink
and green.\nI usually use she/her pronouns, but sometimes I use
he/him.] Tell me about yourself
 ```
### submit
A submit input is a button that submits a form to a URL. Its label
SHOULD be used as the text on the button. Its format is as follows.
 ```
[in-submit /url/to/submit/to]
 ```
For example,
 ```
[in-submit /cgi-bin/profile.py] Update Profile
 ```
## Submitting
When an `[in-submit]` is (clicked|tapped|activated with the
keyboard|etc) all named inputs on the page (submit buttons are not
named) will be added to the URL of the submit button as query
parameters, and the resulting URL will be loaded.
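
A rough sketch of how a client might parse these `[in-` lines and build the submission URL follows. It assumes that `\n` stands for a newline and that any other backslashed character is taken literally, that the name and initial value are separated by the first space, and that a page has a single submit line. The function names are hypothetical.

```python
from urllib.parse import quote, urlencode


def parse_input_line(line):
    """Parse one '[in-TYPE options] label' line; return None otherwise."""
    if not line.startswith("[in-"):
        return None
    options, i = [], 4
    while i < len(line) and line[i] != "]":
        if line[i] == "\\" and i + 1 < len(line):
            # Escape sequence: \n is a newline, anything else is literal.
            options.append("\n" if line[i + 1] == "n" else line[i + 1])
            i += 2
        else:
            options.append(line[i])
            i += 1
    if i >= len(line):
        return None  # no unescaped closing bracket found
    kind, _, rest = "".join(options).partition(" ")
    label = line[i + 1:].strip()
    if kind == "submit":
        return {"type": kind, "url": rest.strip(), "label": label}
    name, _, initial = rest.partition(" ")
    return {"type": kind, "name": name, "value": initial, "label": label}


def submit_url(page_lines):
    """Append all named inputs on the page to the submit URL as a query."""
    fields = [f for f in map(parse_input_line, page_lines) if f]
    params = [(f["name"], f["value"]) for f in fields if f["type"] != "submit"]
    target = next(f["url"] for f in fields if f["type"] == "submit")
    return target + "?" + urlencode(params, quote_via=quote)
```

Here the initial values are submitted unchanged; an actual client would substitute whatever the user typed. Percent-encoding spaces as %20 rather than '+' is a choice made for this sketch.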


- easrng


6. Russtopia (rmagee (a) gmail.com)

On Tue, 26 Jan 2021 at 15:16, easrng <easrng at gmail.com> wrote:

> I agree that Gemini should have forms, but that format doesn't look
> great to me.


Sure, I'm not attached too much to my syntax, it was a hack-ish one I'd
come up with for another project of mine.

> What if forms were implemented as their own pages, not a
> new link line type? I'm thinking something like this:
>
> # Gemini Form Proposal
> ## Basic Format
> Form fields are on their own line. Any line starting with `[in` is
> considered a form element. This allows clients to continue to
> determine line types by the first 3 characters. After the `[in`, a
> dash is required.
> After the dash, the input type is listed. If a client cannot handle an
> input type, it SHOULD fall back to text. The rest of the line, up to
> the first ] is used as options for the input. Next, there is a ]
> character, then an optional text label.
> ## Escape sequences
> In order to allow for ] characters, newlines, and literal backslashes
> in input options, preface them with a backslash.
> ## Input types
> Other types could be added, like number, email, phone, color, or even
> file, but I feel this is an acceptable minimum set of types.
>

I do feel like there should be some explicit allowance for one-of
types and boolean types, to give a client app hints as to what 'widgets'
to best use to present the valid values to the user  (eg., drop-downs or
choose a-b-c-d for a text-based client, and checkboxes). IMHO.

> ### text
> A text input is a single line text input. Its format is as follows.
> (Things in parentheses are optional)
> ```
> [in-text <name>( <initial value>)]
> ```
> For example,
> ```
> [in-text userFullName Alex Fierro] What is your name?
> ```
> ### password
> A password input is a single line text input that MUST hide the value
> typed into it. For security reasons, there is no way to prefill a
> password input. Its format is as follows.
> ```
> [in-password <name>]
> ```
> For example,
> ```
> [in-password newPassword] Choose a new password
> ```
> ### multiline
> A multiline input is a text input that accepts any number of lines.
> Its format is as follows. (Things in parentheses are optional)
> ```
> [in-multiline <name> ( <initial value>)]
> ```
> For example,
> ```
> [in-multiline bio I am a demigod who enjoys art and the colors pink
> and green.\nI usually use she/her pronouns, but sometimes I use
> he/him.] Tell me about yourself
> ```
> ### submit
> A submit input is a button that submits a form to a URL. Its label
> SHOULD be used as the text on the button. Its format is as follows.
> ```
> [in-submit /url/to/submit/to]
> ```
> For example,
> ```
> [in-submit /cgi-bin/profile.py] Update Profile
> ```
> ## Submitting
> When an `[in-submit]` is (clicked|tapped|activated with the
> keyboard|etc) all named inputs on the page (submit buttons are not
> named) will be added to the URL of the submit button as query
> parameters, and the resulting URL will be loaded.
>
>
> - easrng
>

7. Charlie Stanton (charlie (a) shtanton.com)

On Tue Jan 26, 2021 at 1:43 PM GMT, Jason McBrayer wrote:
> Having more complex forms is a temptation to implement applications on
> Gemini, rather than using pairings of protocol+client that are more
> appropriate (e.g. using NNTP for a message board).

I agree with this completely. I think Gemini should be a protocol for viewing
content only. I missed all the discussion around inimeg, titan etc. at the time
but I feel similarly about those.

I think a different protocol for filling out forms makes a lot more sense, and
we can work on having gemini clients and form clients play nicely together so
the user experience doesn't suffer from using a different program to fill out a
form.

Adding forms would take us wayyyyy too close to the web in my opinion.

Charlie (shtanton)


8. me (a) edaha.org (me (a) edaha.org)


I've been following this discussion and while I'm on the side of "gemini 
doesn't need forms", I thought it would be fun to see how far we could 
simplify the concept of a form :) I don't have any direct responses to 
anything said thus far, but my ideas below have been inspired by the back and forth.

Ultimately, there are only a few data types that are truly needed for any 
form: binaries, choices, and text. This helps limit the scope of what 
would need to be implemented and what people would need to learn to be 
able to use forms. Keep it simple, smarty!

With gemini's limitation on URLs, and the fact that data can only be 
passed via the URL, we have to keep in mind that we'd want to limit the 
amount of information needed to be passed. Thus, I'd say that, were forms 
to be implemented, they must pass arguments positionally instead of as named parameters.

# Syntax
To keep it simple (and have fun), I think an input's identifier can simply 
be the reverse of the link identifier: <=

Much like how => signifies "going somewhere else", <= signifies "sending 
information here". 

<= lines take two parameters: type and label. Again, very similar to links :)

<= type label

As mentioned, there are really only three types of data that are 
meaningful and distinct. The valid options for "type" would only be 
"binary" "text" and "choice". "submit" is also a type that is needed as it's an action.

# binary
Binary options are best known as checkboxes on the web. They're simple 
on/off toggles. How they are displayed is up to the client

<= binary I have read this email and understand the binary option type

When submitted, this would be a simple 0/1 value in the url.

# text
The bulk of data that we could ever ask for is just 'text'. 
Differentiating between "tel" "num" and "text" should be done server-side, 
as these are all still just text fields. "password" is deliberately not 
supported -- this is why gemini has client certs.

<= text What's your website?

If an input should be multi-line, then the following could be used:

<= text Tell me what you like about gemini in a paragraph.
<= text

A single `<= text label` input can be followed by exactly one more `<= 
text` line to signify that it should be multi-line input. It is up to the 
client to decide how many lines to display. If a third `<= text` line were 
added, it would be interpreted as a new text input.

# choice
Choice is ultimately optional, IMO. This provides closed-ended responses 
for a user. Again, the idea of "radio" vs. "dropdown" vs. anything else 
does not matter, as those are entirely client-side decisions. What's 
important is that the user is only able to select one response.

<= choice Which of these protocols are we using?
<= choice gemini
<= choice gopher
<= choice HTTP

After an initial `<= choice` toggle, immediately adjacent ones are 
interpreted as options for a single input. The above could be displayed as 
a dropdown, or radio boxes, or anything the client decides.

# submit
Finally, submit. This one's easy:

<= submit /path/to/interpreter

Note that there's no support for a custom label for the button -- again, 
by design :) We don't need them.

# Putting it all together

The last part is how it's passed. As mentioned at the beginning, I think 
(were gemini to support this, which I don't think it should) that form 
inputs are sent /positional/, not named. This is a requirement to reduce 
the chance of hitting the url limit.

Using the above inputs as an example, upon clicking the submit button, the 
url would look like this (for my own sanity I'm not doing proper url 
encoding -- i leave that as an exercise to the reader):

/path/to/interpreter?1&edaha.org&I like how simple and easy it is to 
use\nit's a lot of fun!&gemini

and that's it! I've got to get back to work now, but this was a fun thought experiment :)
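
The thought experiment above can be sketched in a few lines of client code. The parser below groups adjacent `<= choice` lines into one input, treats a bare `<= text` directly after a labelled text line as a multi-line marker, and then joins the collected values positionally. The function names are hypothetical, and percent-encoding each value is an assumption, since the message leaves URL encoding open.

```python
from urllib.parse import quote


def parse_inputs(lines):
    """Collect '<=' input lines; return (inputs, submit_path)."""
    inputs, submit, prev = [], None, None
    for line in lines:
        if not line.startswith("<="):
            prev = None  # only immediately adjacent lines may be merged
            continue
        kind, _, label = line[2:].strip().partition(" ")
        label = label.strip()
        if kind == "submit":
            submit, prev = label, None
        elif (kind == "text" and not label and prev
              and prev["type"] == "text" and not prev["multiline"]):
            prev["multiline"] = True  # second bare text line: multi-line
        elif kind == "choice" and prev and prev["type"] == "choice":
            prev["options"].append(label)  # further option of same input
        elif kind in ("binary", "text", "choice"):
            prev = {"type": kind, "label": label,
                    "multiline": False, "options": []}
            inputs.append(prev)
    return inputs, submit


def positional_url(submit_path, values):
    """Join the user's answers positionally, percent-encoding each one."""
    return submit_path + "?" + "&".join(quote(v, safe="") for v in values)
```

Because values are positional, the server has to know the order of inputs on the page; reordering the form changes the meaning of every submitted value.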


9. easrng (easrng (a) gmail.com)

On January 27, 2021 3:55:41 AM UTC, Russtopia <rmagee at gmail.com> wrote:
>I do feel like there should be some explicit allowance for one-of
>types and boolean types, to give a client app hints as to what 'widgets'
>to best use to present the valid values to the user (eg., drop-downs or
>choose a-b-c-d for a text-based client, and checkboxes). IMHO.

I agree that there should be checkboxes, but I think radio buttons are more
in the spirit of Gemini than menus.


RE: Keeping Gemini simple, what if forms were a separate MIME type like
text/form+gemini? I think forms are important to have because, unlike e.g.
file uploads, I don't know if there is another lightweight solution. The
only options I can think of for form filling and submission are PDF forms,
which are only really usable in Acrobat, and HTML forms, which I suppose
we could serve over Gemini but which often require JavaScript to function.

10. Johann Galle (johann (a) qwertqwefsday.eu)

I think this should have a [TECH] tag? [1]

I am quite firmly on the "Gemini does not need forms" side. At least as 
far as implementing additional syntax in clients. For some clients it 
might be a big leap to have to implement this, especially considering you 
can already build forms in Gemini, you just have to get a bit creative and 
it does not require any new syntax or behaviour on the client side. And 
thus no additional specification. But again: One should think twice if 
this really has to be implemented using Gemini and not another, more suited protocol.

The basic idea is the following: Each form field is presented on a 
separate "page" and the server keeps track of where the client is in the 
form. Ideally the URL (URI/IRI?) contains all the data necessary, thus 
"saving" the data on the client so it might be continued at a later date. 
If the amount of data expected is larger than would fit in the URL, server 
side state with client certificates would be an alternative.

Now to the different types of input fields. I assume the form's base is 
gemini://example.com/form/ which might display some information about the 
form and the first input field.

On 27.01.2021 18:37, me at edaha.org wrote among other things:
 > # binary
 > Binary options are best known as checkboxes on the web. They're simple 
on/off toggles. How they are displayed is up to the client

A check box can be simply implemented with two links for yes and no like 
this for example:
 ```
Does Gemini need forms?
=> 0/ No, it does not.
=> 1/ Yes, it does.
 ```

These two links would then direct the client to either 
<gemini://example.com/form/0/> or <gemini://example.com/form/1/> 
respectively. Now the server would have to understand that this still 
belongs to the form and it should serve the next input field page. In this 
example I just put each input in a path segment as that is the first thing 
I came up with. You could of course put this in the query, separate it 
with spaces, commas, semicolons or something completely different.

A nice side effect of doing it this way is that you could in theory build 
this from a static site server by creating respectively named directories 
and index.gmi files (or whatever your server happens to use).

 > # text
 > The bulk of data that we could ever ask for is just 'text'.

We already have this (and passwords too!) with status codes 10 and 11. So 
when you go to either of the URLs in the example above, the server will 
have to respond with that status code. Upon receiving the data it might 
redirect the client so that the URL holds some representation of the data, 
say for example to <gemini://example.com/form/0/42/>.
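
A minimal server-side sketch of this page-per-field approach is given below. It keeps every answer in a path segment under /form/, serves links for choices, answers with status 10 for free text, and redirects with status 30 so the typed value ends up in the path. The STEPS table, the respond function, and the wording of the second prompt are all hypothetical.

```python
from urllib.parse import quote, unquote

# Hypothetical two-step form: one yes/no choice, then one free-text input.
STEPS = [
    ("choice", "Does Gemini need forms?",
     [("0", "No, it does not."), ("1", "Yes, it does.")]),
    ("input", "Enter a number", []),
]


def respond(path, query=None):
    """Return a (status, meta, body) triple for a request below /form/."""
    answers = [unquote(s) for s in path[len("/form/"):].split("/") if s]
    if len(answers) >= len(STEPS):
        # Every field is present in the path: show the result.
        return 20, "text/gemini", "Answers received: " + ", ".join(answers) + "\n"
    kind, prompt, options = STEPS[len(answers)]
    if kind == "choice":
        # Relative links append the chosen value as the next path segment.
        links = "".join(f"=> {value}/ {label}\n" for value, label in options)
        return 20, "text/gemini", prompt + "\n" + links
    if query is None:
        return 10, prompt, ""  # ask the client for one line of input
    # Move the typed value from the query into the path via a redirect.
    return 30, path + quote(unquote(query), safe="") + "/", ""
```

Since all state lives in the URL, a half-finished form can be bookmarked and resumed, which is the property described above.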

 > Differentiating between "tel" "num" and "text" should be done 
server-side, as these are all still just text fields.

I completely agree.

 > "password" is deliberately not supported -- this is why gemini has client certs.

Odd choice considering Gemini already has support for this.

 > If an input should be multi-line, then the following could be used: [...]

That is not something that could be implemented by using Gemini this way 
[2], but if you want to write big amounts of text, you should really use 
something else, e.g. file upload (ftp) or maybe email, irc, xmpp, nntp, etc.

 > # choice
 > Choice is ultimately optional, IMO. This provides closed-ended 
responses for a user. Again, the idea of "radio" vs. "dropdown" vs. 
anything else does not matter, as those are entirely client-side 
decisions. What's important is that the user is only able to select one response.
 >
 > <= choice Which of these protocols are we using?
 > <= choice gemini
 > <= choice gopher
 > <= choice HTTP

This could be implemented just like a checkbox, just with more options. 
Coincidentally the syntax is very similar to the one proposed.
 ```
Which of these protocols are we using?
=> gemini/ gemini
=> gopher/ gopher
=> http/ HTTP
 ```

 > # submit
 > Finally, submit. This one's easy:

... because it is not necessary. The server will just show you the result 
or take the respective action after the last form field is filled. Maybe a 
final checkbox of "Are you sure" would be nice to let the user know that 
this will result in some action.

 > # Putting it all together
 >
 > The last part is how it's passed. As mentioned at the beginning, I 
think (were gemini to support this, which I don't think it should) that 
form inputs are sent /positional/, not named. This is a requirement to 
reduce the chance of hitting the url limit.

And as I said above, you can pass these however you like, although I would 
guess using the path segment approach would be the easiest and best 
supported. If I recall correctly there were different opinions voiced 
about carrying on a query in a redirect. So who knows what some clients 
might do if the data is all stored in the query and a user decides to go 
to the next form field.

 > and that's it!

... we do not need an additional specification for this. If you insist on 
implementing forms in Gemini, you already can - it just requires you to 
think a bit, and the solution might admittedly be a bit clunky. But 
that's because Gemini is not meant for doing forms!

I hope this was not too bad of a rant but actually helpful.

Johann

[1] gemini://gemi.dev/gemini-mailing-list/messages/004142.gmi

[2] Maybe you could have the users type in some escaped line feeds, but 
the specification says: "The requested resource accepts *a line* of 
textual user input." (§ 3.2.1, emphasis mine)
---
You can verify the digital signature on this email with the public key 
available through web key discovery. Try e.g. `gpg --locate-keys`...


11. Gary Johnson (lambdatronic (a) disroot.org)

Jason McBrayer wrote:

> Having more complex forms is a temptation to implement applications on
> Gemini, rather than using pairings of protocol+client that are more
> appropriate (e.g. using NNTP for a message board).

Charlie Stanton <charlie at shtanton.com> wrote:

> I agree with this completely. I think Gemini should be a protocol for
> viewing content only. I missed all the discussion around inimeg, titan
> etc. at the time but I feel similarly about those.
>
> I think a different protocol for filling out forms makes a lot more
> sense, and we can work on having gemini clients and form clients play
> nicely together so the user experience doesn't suffer from using a
> different program to fill out a form.
>
> Adding forms would take us wayyyyy too close to the web in my opinion.

And now me...

tl;dr: Gemini can already emulate forms. We just need a spec language
       clarification in Section 3.2.1 1x (INPUT) from Solderpunk and for
       client authors to update their software accordingly. I illustrate
       both points (and provide code) below.


# Section 1: Motivation


I appreciate the generally conservative nature of the Gemini community
when it comes to extending the Gemini and Gemtext specifications. As a
server author, this certainly keeps my life easier.

However, I'd like to go on record here to say that interactive capsules
are not something that worries me. There are already quite a few of them
out there in Geminispace (hello Astrobotany!), and I'd like to continue
to see this medium grow and thrive in our little corner of the internet.

I don't think form-like data submission should be seen as an evil. It
allows us to implement a wide variety of CGI-style applications that do
all their computing on the server side (often through some script
extension mechanism). This keeps our servers and clients simple,
empowers content authors to build cool things, and still keeps us nicely
insulated from "The Javascript Trap" since our Gemini clients never
download and run any client-side code.

=> https://www.gnu.org/philosophy/javascript-trap.html The Javascript Trap


# Section 2: The Problem


Over the months that I have followed this mailing list, I've seen
broadly two categories of proposals around extending Gemini's simple
input methods:

1. Ways to submit multiple pieces of information to a server at once.

2. Ways to upload files to a server.

Both proposals are pretty self-explanatory since they extend the
possible functionality of interactive Gemini capsules without breaking
any of our privacy or security guarantees. However, option 1 puts an
additional burden on client authors, and option 2 puts an additional
burden on both client and server authors.

Some members of our community have suggested that these features aren't
worth the extra effort. Others have argued in favor of one or both of
them, and a brave few have gone off and created their own sister
protocols to try and implement Gemini-like systems that also support
some variant of these two data upload options (e.g., Titan, Dioscuri,
Inimeg).

From a personal standpoint (and I can only speak for myself here
obviously), I wouldn't mind one or more form types being added to
Gemtext (option 1 above) as it would reduce the total number of
round-trip network requests between client and server to submit multiple
pieces of information (and I have quite a slow satellite internet
connection, so this matters to me).


# Section 3: A Solution


However, even without (a very unlikely) form enhancement to Solderpunk's
Gemtext spec, I'd like to remind folks that we actually do (or at least
we should) already have the ability to emulate forms in our Gemini
capsules.


## Section 3.1: Form Templates


Assuming we are currently browsing a page at
gemini://awesome.capsule.net/form, this dynamic Gemtext page could
include forms as follows:

 ```Gemtext form template
# Welcome to my Gemini Form!

To fill in any field below, simply click it. Everything's a link in 
Gemini, so you can't really mess up!

=> form?$SESSION&name Name: $NAME

=> form?$SESSION&password Password: $PASSWORD

=> form?$SESSION&smog SMOG is great: $SMOG

=> form?$SESSION&plant Best Astrobotany Plant: $PLANT

=> form?$SESSION&submit Submit Answers
 ```

Here, my Gemtext is a template string, which I process in a context in
which $SESSION, $NAME, $PASSWORD, $SMOG, and $PLANT are defined (or
default to empty strings). When the page first loads, we create a new
$SESSION value in our CGI script and insert it into the links to
preserve state across requests until we restart the server or the user
refreshes the page.

(Obviously, a more robust state management mechanism could be achieved
with client certs and a DB, but I just mean to show a very simple
example here.)
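
As a rough illustration of the templating step, here is a minimal Python sketch. It is not the attached script: the names FORM_TEMPLATE, FIELDS, new_session_id, and render_form are hypothetical, and it assumes Python's standard string.Template, whose $NAME placeholders happen to match the template above.

```python
import secrets
from string import Template

# The Gemtext form from the example above, as a template string.
FORM_TEMPLATE = Template("""\
# Welcome to my Gemini Form!

=> form?$SESSION&name Name: $NAME

=> form?$SESSION&password Password: $PASSWORD

=> form?$SESSION&smog SMOG is great: $SMOG

=> form?$SESSION&plant Best Astrobotany Plant: $PLANT

=> form?$SESSION&submit Submit Answers
""")

FIELDS = ("NAME", "PASSWORD", "SMOG", "PLANT")


def new_session_id() -> str:
    """Hypothetical helper: any unguessable, URL-safe token would do."""
    return secrets.token_hex(8)


def render_form(session: str, values: dict) -> str:
    """Fill in the template; fields without a value default to ""."""
    context = {field: values.get(field, "") for field in FIELDS}
    return FORM_TEMPLATE.substitute(SESSION=session, **context)
```

Calling render_form(new_session_id(), {}) would produce the empty form for a first visit; later requests would pass the values collected so far.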


## Section 3.2: Server-side Responses


Here would be the server-side responses for each of those links:



For the boolean choice (SMOG) and the multiple choice (PLANT) inputs,
you could, of course, perform input validation and re-prompt if
necessary. You could also simply include one link per choice in your
form template instead of using a 10 INPUT response.
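
A minimal Python sketch of how a script might choose its response for each field link follows. The prompt texts, the PROMPTS table, and the helper names are hypothetical; the only things taken from the spec are that status 10 is INPUT, 11 is SENSITIVE INPUT, and 59 is BAD REQUEST.

```python
# Hypothetical prompts, keyed by the field name used in the form links.
PROMPTS = {
    "name": (10, "Please enter your name"),
    "password": (11, "Please enter your password"),  # 11 = SENSITIVE INPUT
    "smog": (10, "Is SMOG great? [yes/no]"),
    "plant": (10, "Which Astrobotany plant is best?"),
}


def input_response(field: str) -> str:
    """Return the Gemini response header line for a form field link."""
    if field not in PROMPTS:
        return "59 Unknown form field\r\n"
    status, prompt = PROMPTS[field]
    return f"{status} {prompt}\r\n"


def valid_smog(answer: str) -> bool:
    """Validate the boolean choice; the caller re-prompts when this fails."""
    return answer.strip().lower() in {"yes", "no"}
```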


## Section 3.3: (DESIRED) Client-side Requests


The intention of this example is that the clients would produce requests
of this form after each input prompt:

=> gemini://awesome.capsule.net/form?$SESSION&name&Gary%20Johnson
=> gemini://awesome.capsule.net/form?$SESSION&password&secret
=> gemini://awesome.capsule.net/form?$SESSION&smog&yes
=> gemini://awesome.capsule.net/form?$SESSION&plant&Ficus

where $SESSION is whatever value was generated by the CGI script on the
first page load.


## Section 3.4: Server-side State Management and Form Submission


With this information in the query params, it would be easy to store a
lookup table in the CGI script that mapped session -> field -> value,
and these values can then be easily inserted into the original Gemtext
template form above (see Section 3.1) in response to these requests.

The form?$SESSION&submit link can then trigger the server to validate
that all of the required form fields have been filled in correctly and
perform whatever next step operation you want.
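
The lookup table could be sketched in Python roughly as below. This assumes the query string has the SESSION&field&value shape described in Section 3.3 and that the table lives in a long-running process; a script started fresh for every request would have to persist it somewhere instead. SESSIONS, REQUIRED, handle_query, and the returned action strings are all hypothetical.

```python
from urllib.parse import unquote

# session id -> field name -> value; lives only as long as the process.
SESSIONS = {}
REQUIRED = ("name", "password", "smog", "plant")


def handle_query(query: str) -> str:
    """Process a raw query string of the form SESSION[&field[&value]].

    Returns a short action name telling the caller what to send back.
    """
    parts = query.split("&", 2)
    fields = SESSIONS.setdefault(parts[0], {})
    if len(parts) == 1:
        return "show-form"
    field = parts[1]
    if field == "submit":
        missing = [name for name in REQUIRED if not fields.get(name)]
        if missing:
            return "incomplete: " + ", ".join(missing)
        return "submitted"
    if len(parts) == 2:
        return "prompt:" + field       # caller answers with a 10/11 response
    fields[field] = unquote(parts[2])  # the value arrives percent-encoded
    return "show-form"
```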


## Section 3.5: File "Uploads"


In addition, as I mentioned several months ago on this list, you could
perform file "uploads" by having one of the input links prompt for a URL
to a file. Then the server could download that file and store it in your
session (or account if you're using client certs and a DB).


# Section 4: What's Stopping This from Working?


While this example creates more back-and-forth requests than a proper
client-side form would generate, I hope it demonstrates that Gemini and
Gemtext in their current incarnations are already sufficiently complete
to build interactive CGI applications with them today.

The only problem I'm running into here is that the various Gemini
clients I've tested (elpher, bombadillo, kristall) don't actually append
a user's input as an additional parameter to an existing query string if
one is present. Instead, bombadillo and kristall just overwrite the
existing query string and only return ?$NEW_INPUT. Elpher, on the other
hand, just creates invalid URLs by simply appending ?$NEW_INPUT to
whatever is already in the URL (e.g.,
gemini://awesome.capsule.net/form?$SESSION&smog?yes). Neither of these
behaviors does what I'd want or expect here.


## Section 4.1: Check the Spec!


I think the culprit then is probably Gemini Protocol Specification
section 3.2.1 1x (INPUT):

 ```Gemini specification section 3.2.1 1x (INPUT)
Status codes beginning with 1 are INPUT status codes, meaning:

The requested resource accepts a line of textual user input. The <META>
line is a prompt which should be displayed to the user. The same
resource should then be requested again with the user's input included
as a query component. Queries are included in requests as per the usual
generic URL definition in RFC3986, i.e. separated from the path by a ?.
Reserved characters used in the user's input must be "percent-encoded"
as per RFC3986, and space characters should also be percent-encoded.
 ```


## Section 4.2: Append Don't Replace!


As far as I can tell, the fix here is for Solderpunk to update the text
in section 3.2.1 to indicate that if a query string is already part of
the request leading to an INPUT response, then the user's input should
be appended (using &) to the existing query string rather than replacing
it wholesale (using ?).

Otherwise, we really have no way to input more than one query param
(with &) other than asking the user to type it directly into the INPUT
prompt (e.g., cat&dog&pig). I'm hoping this isn't the spec's intention
here and that we just have a case of ambiguous wording that has led some
client authors to create divergent (or broken) implementations.
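
To make the difference concrete, here is a small Python sketch of the two client behaviors. The function names are hypothetical, and this is only one possible reading of the proposed wording, not text from the spec.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def with_input_replaced(url: str, user_input: str) -> str:
    """Behavior reported for bombadillo and kristall: drop the old query."""
    parts = urlsplit(url)
    return urlunsplit(parts._replace(query=quote(user_input, safe="")))


def with_input_appended(url: str, user_input: str) -> str:
    """Proposed behavior: keep the old query and add the input after '&'."""
    parts = urlsplit(url)
    encoded = quote(user_input, safe="")  # percent-encode reserved characters
    query = f"{parts.query}&{encoded}" if parts.query else encoded
    return urlunsplit(parts._replace(query=query))
```

Because the input is percent-encoded with no safe characters, a user who types cat&dog produces cat%26dog, so typed ampersands cannot be confused with the separators the server put in the link.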


# Section 5: Conclusion and a Call to Action


Okay, that was a LONG message, but I hope I've communicated my points
clearly. Thanks to all who read this far, and thanks to everyone for
making Gemini such an active and engaging community!

I've attached a short (47 line) CGI script (for Space Age) that
implements the dynamic form example described in this email. If clients
would append user input params (with &) to existing query strings rather
than replace them, it should work perfectly. Until then, it will just
have to feel a bit sad and dejected.

Whose client is going to make it work first? I wait eagerly with bated
breath to find out.

Happy hacking!
  Gary

-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: form.clj
URL: <https://lists.orbitalfox.eu/archives/gemini/attachments/20210127/4d28aea6/attachment.ksh>
-------------- next part --------------

-- 
GPG Key ID: 7BC158ED
Use `gpg --search-keys lambdatronic' to find me
Protect yourself from surveillance: https://emailselfdefense.fsf.org
=======================================================================
()  ascii ribbon campaign - against html e-mail
/\  www.asciiribbon.org   - against proprietary attachments

Why is HTML email a security nightmare? See https://useplaintext.email/

Please avoid sending me MS-Office attachments.
See http://www.gnu.org/philosophy/no-word-attachments.html


12. Katarina Eriksson (gmym (a) coopdot.com)

When this topic has come up in the past, we have concluded that Gemini can
support forms with serial input fields, as opposed to the parallel input
fields people are used to from web forms.

One way to do this is to send status 10 until all the fields are satisfied,
like a CLI. Another way is to have a page with links pointing to one field
at a time.

Johann Galle <johann at qwertqwefsday.eu> wrote:

> The basic idea is the following: Each form field is presented on a
> separate "page" and the server keeps track of where the client is in the
> form. Ideally the URL (URI/IRI?) contains all the data necessary, thus
> "saving" the data on the client so it might be continued at a later date.
> If the amount of data expected is larger than would fit in the URL, server
> side state with client certificates would be an alternative.
>

I haven't seen this approach yet, seems just as valid as the other ones.

> Now to the different types of input fields. I assume the form's base is
> gemini://example.com/form/ which might display some information about the
> form and the first input field.
>
> On 27.01.2021 18:37, me at edaha.org wrote among other things:
>  > # binary
>  > Binary options are best known as checkboxes on the web. They're simple
>  > on/off toggles. How they are displayed is up to the client
>
> A check box can be simply implemented with two links for yes and no like
> this for example:
> ```
> Does Gemini need forms?
> => 0/ No, it does not.
> => 1/ Yes, it does.
> ```
>

Asking a question like this is not a good example for showing off
checkboxes but another way is to send this:
 ```
10 Does Gemini need forms? [yes/no]
 ```
...and repeat until the user supplies a valid answer.

Multiple choice checkboxes can be combined into one input:
 ```
10 My server supports: [c: CGI, v: virtual host, s: sessions]
 ```
...and the user can answer "sv" or "cs" or "v" or whatever other valid
combination.
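
A short Python sketch of the validation step for such a combined answer is shown below. OPTIONS mirrors the example prompt; the helper names are hypothetical, and treating an empty answer or a repeated letter as invalid is an assumption (a capsule might prefer to accept an empty answer as "nothing checked").

```python
from typing import List, Optional

# Option letters from the example prompt above.
OPTIONS = {"c": "CGI", "v": "virtual host", "s": "sessions"}


def prompt_line() -> str:
    """Build the 10 INPUT header listing every option letter."""
    listing = ", ".join(f"{key}: {label}" for key, label in OPTIONS.items())
    return f"10 My server supports: [{listing}]\r\n"


def parse_choices(answer: str) -> Optional[List[str]]:
    """Return the chosen labels, or None if the answer is not valid."""
    letters = answer.strip().lower()
    if not letters or len(set(letters)) != len(letters):
        return None  # empty answer or a repeated letter
    if any(letter not in OPTIONS for letter in letters):
        return None  # unknown option letter
    return [OPTIONS[letter] for letter in letters]
```

When parse_choices returns None, the server would send the same prompt again.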

[...]

>  > # choice
>  > [...]
>  > This refers to a single choice among a list of things.
>  > <= choice Which of these protocols are we using?
>  > <= choice gemini
>  > <= choice gopher
>  > <= choice HTTP
>
> This could be implemented just like a checkbox, just with more options.
> Coincidentally the syntax is very similar to the one proposed.
> ```
> Which of these protocols are we using?
> => gemini/ gemini
> => gopher/ gopher
> => http/ HTTP
> ```
>

This would be:
 ```
10 Which of these protocols are we using? [gemini/gopher/http]
 ```
...in that same example. Though, I do like the links better.

>  > # submit
>  > Finally, submit. This one's easy:
>
> ... because it is not necessary. The server will just show you the result
> or take the respective action after the last form field is filled. Maybe a
> final checkbox of "Are you sure" would be nice to let the user know that
> this will result in some action.
>

Or just "Confirm sending this information"

-- 
Katarina

13. Katarina Eriksson (gmym (a) coopdot.com)

Gary Johnson <lambdatronic at disroot.org> wrote

> => form?$SESSION&name Name: $NAME
>
> => form?$SESSION&password Password: $PASSWORD
>
> => form?$SESSION&smog SMOG is great: $SMOG
>
> => form?$SESSION&plant Best Astrobotany Plant: $PLANT
>
> => form?$SESSION&submit Submit Answers
>

[...]

> (Obviously, a more robust state management mechanism could be achieved
> with client certs and a DB, but I just mean to show a very simple
> example here.)
>

Yes, if the client supports client certificates, we can skip sending
$SESSION and use the regular inputs:

 ```text/gemini list of links
=> gemini://awesome.capsule.net/form/name Name
=> gemini://awesome.capsule.net/form/password Password
=> gemini://awesome.capsule.net/form/smog SMOG is great
=> gemini://awesome.capsule.net/form/plant Best Astrobotany Plant
=> gemini://awesome.capsule.net/form/submit Submit Answers
 ```

[...]

> ## Section 3.3: (DESIRED) Client-side Requests
>
>
> The intention of this example is that the clients would produce requests
> of this form after each input prompt:
>
> => gemini://awesome.capsule.net/form?$SESSION&name&Gary%20Johnson
> => gemini://awesome.capsule.net/form?$SESSION&password&secret
> => gemini://awesome.capsule.net/form?$SESSION&smog&yes
> => gemini://awesome.capsule.net/form?$SESSION&plant&Ficus
>
> where $SESSION is whatever value was generated by the CGI script on the
> first page load.
>

I do not understand this example.

When using regular inputs, the client will send these requests:

gemini://awesome.capsule.net/form/name?Gary%20Johnson
gemini://awesome.capsule.net/form/password?secret
gemini://awesome.capsule.net/form/smog?yes
gemini://awesome.capsule.net/form/plant?Ficus
gemini://awesome.capsule.net/form/submit

(No "?" on "submit" since it's just telling the server that we're done.)

What is the benefit of doing it your way?

> ## Section 3.4: Server-side State Management and Form Submission
>
>
> With this information in the query params, it would be easy to store a
> lookup table in the CGI script that mapped session -> field -> value,
> and these values can then be easily inserted into the original Gemtext
> template form above (see Section 3.1) in response to these requests.
>

If you format the URLs like this:

gemini://$HOST/path/to/script/$FIELD?$VALUE

...then $FIELD should show up as PATH_INFO (probably with a leading "/")
and $VALUE as QUERY_STRING.
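
As a sketch of what the script side could look like in Python: this assumes the server sets PATH_INFO and QUERY_STRING as RFC 3875 describes, and the function name parse_field_request is hypothetical.

```python
from typing import Mapping, Optional, Tuple
from urllib.parse import unquote


def parse_field_request(environ: Mapping[str, str]) -> Tuple[str, Optional[str]]:
    """Split a request for .../script/$FIELD?$VALUE into (field, value).

    value is None when the query string is empty, i.e. the user has not
    answered yet and the script should reply with a 10 INPUT prompt.
    """
    field = environ.get("PATH_INFO", "").lstrip("/")
    query = environ.get("QUERY_STRING", "")
    return field, (unquote(query) if query else None)
```

A real script would call parse_field_request(os.environ) and then either prompt for the field or store the decoded value.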

[...]

> The only problem I'm running into here is that the various Gemini
> clients I've tested (elpher, bombadillo, kristall) don't actually append
> a user's input as an additional parameter to an existing query string if
> one is present. Instead, bombadillo and kristall just overwrite the
> existing query string and only return ?$NEW_INPUT. Elpher, on the other
> hand, just creates invalid URLs by simply appending ?$NEW_INPUT to
> whatever is already in the URL (e.g.,
> gemini://awesome.capsule.net/form?$SESSION&smog?yes. Neither of these
> behaviors do what I'd want or expect here.
>

Elpher is doing something weird here but the others are handling inputs as
intended.

> ## Section 4.1: Check the Spec!
>
>
> I think the culprit then is probably Gemini Protocol Specification
> section 3.2.1 1x (INPUT):
>

[...]

> ## Section 4.2: Append Don't Replace!
>
>
> As far as I can tell, the fix here is for Solderpunk to update the text
> in section 3.2.1 to indicate that if a query string is already part of
> the request leading to an INPUT response, then the user's input should
> be appended (using &) to the existing query string rather than replacing
> it wholesale (using ?).
>

This is not a necessary spec change.

> Otherwise, we really have no way to input more than one query param
> (with &) other than asking the user to type it directly into the INPUT
> prompt (e.g., cat&dog&pig).


The responsibility for collecting parameters falls on the server, not on the
client. The only thing the client needs to do is send one query for each
field.

> I'm hoping this isn't the spec's intention
> here and that we just have a case of ambiguous wording that has led some
> client authors to create divergent (or broken) implementations
>

Sorry to disappoint you. I suggest leaving the ampersands to the web
queries.

[...]

> I've attached a short (47 line) CGI script (for Space Age) that
> implements the dynamic form example described in this email.
>

Thank you for providing example code and I'm sorry for not doing the same.

-- 
Katarina


14. Gary Johnson (lambdatronic (a) disroot.org)

>> ## Section 3.3: (DESIRED) Client-side Requests
>>
>>
>> The intention of this example is that the clients would produce requests
>> of this form after each input prompt:
>>
>> => gemini://awesome.capsule.net/form?$SESSION&name&Gary%20Johnson
>> => gemini://awesome.capsule.net/form?$SESSION&password&secret
>> => gemini://awesome.capsule.net/form?$SESSION&smog&yes
>> => gemini://awesome.capsule.net/form?$SESSION&plant&Ficus
>>
>> where $SESSION is whatever value was generated by the CGI script on the
>> first page load.
>>
>
> I do not understand this example.
>
> When using regular inputs, the client will send these requests:
>
> gemini://awesome.capsule.net/form/name?Gary%20Johnson
> gemini://awesome.capsule.net/form/password?secret
> gemini://awesome.capsule.net/form/smog?yes
> gemini://awesome.capsule.net/form/plant?Ficus
> gemini://awesome.capsule.net/form/submit
>
> (No "?" on "submit" since it's just telling the server that we're done.)
>
> What is the benefit of doing it your way?

Hi Katarina,

  Thanks for taking the time to reply to my message. I'll try to clarify
my point here.

The issue I'm raising is that there appears to be no way to pass more
than one piece of information at a time in our query strings. This has a
very significant impact on any writers of CGI scripts, which is how many
Gemini servers allow users to add dynamic pages to their capsules.

But why, you ask?

Because each CGI script is available at a particular file path and
therefore additional path segments can't be used to pass information to
them. They have to get their inputs from the query string.

This is a script. It probably returns a 20 response:

=> gemini://awesome.capsule.net/form.clj

If I want to fill in a name field on that page, I might provide a link
like this:

=> gemini://awesome.capsule.net/form.clj?name

This calls the CGI script with a query parameter. Great! The script can
use "name" to look up the appropriate response. Here it is:

10 Please enter your name\r\n

However, when the user fills in their name, the browser will now send
this request to the server:

=> gemini://awesome.capsule.net/form.clj?Gary%20Johnson

There is no way for the CGI script to know that this is a name value and
not the value for any other form field on the page.

And therein lies the rub. If the only way to associate input values with
the variables they represent is with path segments, then CGI scripts
simply can't ever use more than one input field per page. Even then, if
the query string used to trigger a 10 INPUT response is typed by the
user (into the totally free form text field they are presented), then
the server will continue to respond with yet another 10 INPUT response.

This would make a form with N fields require N+1 separate CGI scripts,
all chained together via links that represent the directory structure
into which they are installed.

This is an absolute nightmare scenario for programming anything that
wants to accept user inputs.

So what does this mean for Geminispace?

It means essentially that CGI scripts are currently second-class
citizens, and the only people who can write dynamic capsules are server
authors (or people willing to hack on server code). This is because
encoding information using path segments requires injecting custom
routing table code into the server's request handler.

As a server author, I am capable of creating a custom fork of my server
with a new routing table for each dynamic capsule I want to build.
However, I suspect the majority of Gemini users are not going to have
both the skill and willingness to engage in this level of coding on
their pages.

That is why I and many other authors have added support for CGI scripts
to our servers. But under the "only one piece of information in the
query string" paradigm, these scripts are currently rather handicapped
when it comes to accepting user input.

Hopefully, I've made the technical merits of my case clear here.


>> ## Section 4.2: Append Don't Replace!
>>
>>
>> As far as I can tell, the fix here is for Solderpunk to update the text
>> in section 3.2.1 to indicate that if a query string is already part of
>> the request leading to an INPUT response, then the user's input should
>> be appended (using &) to the existing query string rather than replacing
>> it wholesale (using ?).
>>
>
> This is not a necessary spec change.


Yes, it really is if anyone other than server authors is ever going to
be able to write their own dynamic pages.


>> Otherwise, we really have no way to input more than one query param
>> (with &) other than asking the user to type it directly into the INPUT
>> prompt (e.g., cat&dog&pig).
>
>
> The responsibility for collecting parameters falls on the server, not on the
> client. The only thing the client needs to do is send one query for each
> field.


Again, see above. A single query value cannot be associated with its
variable without adding a custom routing table to the server to enable
the parsing of path segment data as additional inputs.


>> I'm hoping this isn't the spec's intention
>> here and that we just have a case of ambiguous wording that has led some
>> client authors to create divergent (or broken) implementations
>>
>
> Sorry to disappoint you. I suggest leaving the ampersands to the web
> queries.


I'm afraid we disagree here.


> Thank you for providing example code and I'm sorry for not doing the same.


If you can write a CGI script that can correctly associate INPUT
responses with their intended variables, please share it. I suspect it
would be quite educational.

Happy hacking,
  Gary


15. Chris Babcock (cbabcock (a) asciiking.com)

This is going to be weird, because I disagree with almost everything 
you've said except that appending the query string should be guaranteed. I 
hope this is helpful

January 30, 2021 1:54 PM, "Gary Johnson" <lambdatronic at disroot.org> wrote:

> The issue I'm raising is that there appears to be no way to pass more
> than one piece of information at a time in our query strings. This has a
> very significant impact on any writers of CGI scripts, which is how many
> Gemini servers allow users to add dynamic pages to their capsules.
> 
> But why, you ask?
> 
> Because each CGI script is available at a particular file path and
> therefore additional path segments can't be used to pass information to
> them. They have to get their inputs from the query string.

%<  ------------------------------

> This would make a form with N fields require N+1 separate CGI scripts,
> all chained together via links that represent the directory structure
> into which they are installed.
> 
> This is an absolute nightmare scenario for programming anything that
> wants to accept user inputs.

Well, you *could* pass extra path info to the script... So, the script at 
cgi-bin/index.cgi handles all cgi-bin/* and treats the path after cgi-bin 
as positional arguments

> So what does this mean for Geminispace?
> 
> It means essentially that CGI scripts are currently second-class
> citizens, and the only people who can write dynamic capsules are server
> authors (or people willing to hack on server code). This is because
> encoding information using path segments requires injecting custom
> routing table code into the server's request handler.

CGI scripts *are* second class citizens in Gemini, but it's because the UX 
and dev-op experience of line based input is terrible. The fact that a 
static routing table is more performant and has a better security profile 
than parsing the path info dynamically is less relevant than the fact that 
this is a line based protocol

%<  ------------------------------

>> ## Section 4.2: Append Don't Replace!
>>> As far as I can tell, the fix here is for Solderpunk to update the text
>>> in section 3.2.1 to indicate that if a query string is already part of
>>> the request leading to an INPUT response, then the user's input should
>>> be appended (using &) to the existing query string rather than replacing
>>> it wholesale (using ?).
>> 
>> This is not a necessary spec change.
> 
> Yes, it really is if anyone other than server authors is ever going to
> be able to write their own dynamic pages.
> 

Now, "Append, don't replace," is a reasonable expectation to make of 
clients and it's still useful for the devops situation, even if it's not 


%<  ------------------------------
 
> If you can write a CGI script that can correctly associate INPUT
> responses with their intended variables, please share it. I suspect it
> would be quite educational.

The two alternatives to requiring clients to preserve collected state in 
the query parameter are to save state in the CGI script or to pass 
positional arguments via the path. I think append is reasonable. It also 
preserves principle of least surprise and other desirable qualities

CGI *is* going to be second class in Gemini as long as forms aren't an 
option, but that's a consequence of the decision to support line-based 
clients. Appending the query doesn't do violence to that design

Chris


16. Sean Conner (sean (a) conman.org)

It was thus said that the Great Gary Johnson once stated:
> 
> The issue I'm raising is that there appears to be no way to pass more
> than one piece of information at a time in our query strings. This has a
> very significant impact on any writers of CGI scripts, which is how many
> Gemini servers allow users to add dynamic pages to their capsules.
> 
> But why, you ask?
> 
> Because each CGI script is available at a particular file path and
> therefore additional path segments can't be used to pass information to
> them. They have to get their inputs from the query string.

  [ snip ]

> It means essentially that CGI scripts are currently second-class
> citizens, and the only people who can write dynamic capsules are server
> authors (or people willing to hack on server code). This is because
> encoding information using path segments requires injecting custom
> routing table code into the server's request handler.

  Not if the CGI interface is properly written.  All I had to do was write
this CGI script and drop it into my tests directory [1]:

	gemini://gemini.conman.org/test/pathseg.cgi

It uses only three of the RFC-3875 defined variables, QUERY_STRING,
SCRIPT_NAME and PATH_INFO to do all the work.  The script will ask for three
fields and then present a final page with all three fields.  But the script
will only work if all three variables are defined per RFC-3875 (PATH_INFO is
the tricky one).

  Yes, it's a bit ugly and yes, it's a second class citizen and yes, it
requires a proper CGI module to work, but it can be done without the
configuration you think it does.  The script just simply appends each input
field as the path, so if you enter

	and a one
	and a two
	skidoosh

as the values, the final URL will be:

	gemini://gemini.conman.org/test/pathseg.cgi/and%20a%20one/and%20a%20two/skidoosh

  Yes, I could have done a bit more processing, naming each segment:

	/test/pathseg.cgi/name=and%20a%20one/age=and%20a%20two/action=skidoosh

but I was lazy and wanted to just do a proof-of-concept here.
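
For comparison, reading the named-segment variant back out could look roughly like this in Python. It is a sketch only: parse_named_segments is a hypothetical name, and it assumes the path info reaches the script still percent-encoded, so that an encoded "/" or "=" inside a value cannot be mistaken for a separator.

```python
from urllib.parse import unquote


def parse_named_segments(path_info: str) -> dict:
    """Turn "/name=a/age=b" into {"name": "a", "age": "b"}."""
    fields = {}
    for segment in path_info.split("/"):
        if not segment:
            continue                  # skip the empty piece before the first "/"
        key, sep, value = segment.partition("=")
        if sep:                       # ignore segments without "key="
            fields[unquote(key)] = unquote(value)
    return fields
```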
	
> If you can write a CGI script that can correctly associate INPUT
> responses with their intended variables, please share it. I suspect it
> would be quite educational.

  I have added it [3].

  -spc

[1]	I gave the script a .cgi extension just to drive the point
	home---for my server, GLV-1.12556 [2], the extension of a CGI script
	doesn't matter at all.

[2]	https://github.com/spc476/GLV-1.12556

[3]	Here you go.  It's in Lua, but it's easy going except for the first
	bit which is a bit of boilerplate I needed for encoding and
	decoding various strings.  The main logic is marked though, so you
	can skip the first section.

#!/usr/bin/env lua

-- ************************************************************************
-- Decoding and Encoding crap, not much to see here, citizen!  Move along!
-- ************************************************************************

local lpeg    = require "lpeg"
local xdigit  = lpeg.locale().xdigit
local char    = lpeg.P"%" * lpeg.C(xdigit * xdigit)
              / function(c)
                  return string.char(tonumber(c,16))
                end
              + lpeg.P"+" / " "
              + lpeg.P(1)
local decode_query = lpeg.Cs(char^1)

local function tohex(c)
  return string.format("%%%02X",string.byte(c))
end

local unsafe     = lpeg.P" "  / "%%20"
                 + lpeg.P"#"  / "%%23"
                 + lpeg.P"%"  / "%%25"
                 + lpeg.P"<"  / "%%3C"
                 + lpeg.P">"  / "%%3E"
                 + lpeg.P"["  / "%%5B"
                 + lpeg.P"\\" / "%%5C"
                 + lpeg.P"]"  / "%%5D"
                 + lpeg.P"^"  / "%%5E"
                 + lpeg.P"{"  / "%%7B"
                 + lpeg.P"|"  / "%%7C"
                 + lpeg.P"}"  / "%%7D"
                 + lpeg.P'"'  / "%%22"
                 + lpeg.R("\0\31","\127\255") / tohex
local char_path  = lpeg.P"?"  / "%%3F"
                 + unsafe
                 + lpeg.P(1)
local esc_path   = lpeg.Cs(char_path^0)

-- ************************************************************************
-- The main script starts here
-- ************************************************************************

local query       = os.getenv("QUERY_STRING")
local script_name = os.getenv("SCRIPT_NAME")
local pathinfo    = os.getenv("PATH_INFO")

if not pathinfo and query == "" then
  io.stdout:write("Status: 10\n")
  io.stdout:write("Content-Type: Input field\n")
  io.stdout:write("\n")
  os.exit(0,true)
end

if not pathinfo then
  query = decode_query:match(query)
  query = esc_path:match(query)
  io.stdout:write("Status: 30\n")
  io.stdout:write(string.format("Location: %s/%s\n",script_name,query))
  io.stdout:write("\n")
  os.exit(0,true)
end

if pathinfo:match("^/[^/]*/[^/]*/[^/]*") then
  local f1,f2,f3 = pathinfo:match("^/([^/]*)/([^/]*)/([^/]*)")
  f1 = decode_query:match(f1)
  f2 = decode_query:match(f2)
  f3 = decode_query:match(f3)
  
  io.stdout:write("Status: 20\n")
  io.stdout:write("Content-Type: text/gemini\n")
  io.stdout:write("\n")
  io.stdout:write("The three fields you input:\n")
  io.stdout:write("\n")
  io.stdout:write(string.format("* %s\n",f1))
  io.stdout:write(string.format("* %s\n",f2))
  io.stdout:write(string.format("* %s\n",f3))
  io.stdout:write("\n")
  io.stdout:write(string.format("=> %s Try again\n",script_name))
  os.exit(0,true)
end

if query == "" then
  io.stdout:write("Status: 10\n")
  io.stdout:write("Content-Type: Input next field\n")
  io.stdout:write("\n")
  os.exit(0,true)
else
  query    = decode_query:match(query)
  query    = esc_path:match(query)
  pathinfo = esc_path:match(pathinfo)
  io.stdout:write("Status: 30\n")
  io.stdout:write(string.format("Location: %s%s/%s\n",script_name,pathinfo,query))
  io.stdout:write("\n")
  os.exit(0,true)
end


17. John Cowan (cowan (a) ccil.org)

On Wed, Jan 27, 2021 at 5:20 PM Gary Johnson <lambdatronic at disroot.org>
wrote:


> I don't think form-like data submission should be seen as an evil. It
> allows us to implement a wide variety of CGI-style applications that do
> all their computing on the server side (often through some script
> extension mechanism).
>

+1

> which $SESSION, $NAME, $PASSWORD, $SMOG, and $PLANT are defined (or default
> to empty strings). When the page first loads, we create a new
> $SESSION value in our CGI script and insert it into the links to
> preserve state across requests until we restart the server or the user
> refreshes the page.
>

I think this is exactly the Right Thing.


> (Obviously, a more robust state management mechanism could be achieved
> with client certs and a DB, but I just mean to show a very simple
> example here.)
>

A TLS session is not the same as an application session.  I may, for
example, have two tabs (or whatever) open in my Gemini browser that refer
to the same access-controlled capsule, and which therefore must be accessed
with the same cert.  Nevertheless, the two pages should operate as distinct
sessions: I should be able to fill out a form in one page while searching
help documents in the other.  So I think a session ID is the Right Thing.
However, this is a matter of server/capsule/CGI design, not of the Gemini
protocol.

> While this example creates more back-and-forth requests than a proper
> client-side form would generate, I hope it demonstrates that Gemini and
> Gemtext in their current incarnations are already sufficiently complete
> to build interactive CGI applications with them today.
>

The biggest problem is most likely the cost of setting up and tearing down
all the TLS connections, but there is no help for that.

> The requested resource accepts a line of textual user input. The <META>
> line is a prompt which should be displayed to the user. The same
> resource should then be requested again with the user's input included
> as a query component.


"Included" is a vague word, and should be fixed whether we do appending or
not.

> As far as I can tell, the fix here is for Solderpunk to update the text
> in section 3.2.1 to indicate that if a query string is already part of
> the request leading to an INPUT response, then the user's input should
> be appended (using &) to the existing query string rather than replacing
> it wholesale (using ?).
>

I suggest that if there is no query part, we append ? followed by the
user's input, whereas if there is, we just append the user's input.  That
lets a simple form work like this:

1) Suppose Fluffy (a server) wants me to send my name and email address.
Fluffy sends this bare-bones text/gemini document, which we will say comes
from gemini://fluffy.example/form1, to my client Aarfy.

=> Name: ?session=ABC&name=
=> Email: ?session=ABC&email=

2) Let's say I choose the first link.  Fluffy sends Aarfy 10 Enter your name.
I type John Cowan into Aarfy, which sends the URL
gemini://fluffy.example/form1?session=ABC&name=John%20Cowan.  Fluffy sends
this new document to Aarfy:

=> Name [John Cowan]: ?session=ABC&name=
=> Email: ?session=ABC&name=John%20Cowan&email=

3) If I choose the first link, I can change my name.  If I choose the
second link, Fluffy will send Aarfy 10 Enter your email.  I type
cowan at ccil.org into Aarfy, which sends the URL
gemini://fluffy.example/form1?session=ABC&name=John%20Cowan&email=cowan at ccil.org.
Fluffy sends this third document to Aarfy:

=> Name [John Cowan]: ?session=ABC&email=cowan at ccil.org&name=
=> Email [cowan at ccil.org]: ?session=ABC&name=John%20Cowan&email=
=> Submit: ?session=ABC&name=John%20Cowan&email=cowan at ccil.org&submit

4) If I choose the first or second link again, I can change my name or
email address.  But if I choose the third link, which Fluffy does *not*
interpret as a search link, Fluffy will write my name and email into a
database, or send me an email saying "HA HA HA!", or whatever it does.
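
The appending rule suggested before this walkthrough (add ? plus the input when the URL has no query part, otherwise add the input alone) could be sketched on the client side roughly as follows. This is only an illustration in Python: the function name next_request_url is made up, and percent-encoding every reserved character in the input is an assumption, not something the quoted spec text states.

```python
from urllib.parse import quote


def next_request_url(url: str, user_input: str) -> str:
    """Build the URL to request after a 10 INPUT response (hypothetical helper)."""
    # Assumption: encode everything outside the unreserved set, so spaces,
    # "&", "=" and "@" in the input cannot be mistaken for query syntax.
    encoded = quote(user_input, safe="")
    if "?" in url:
        # A query part already exists: append the input directly after it.
        return url + encoded
    # No query part yet: start one.
    return url + "?" + encoded


print(next_request_url("gemini://fluffy.example/form1?session=ABC&name=", "John Cowan"))
# prints gemini://fluffy.example/form1?session=ABC&name=John%20Cowan
```

The link target is given here as an absolute URL; resolving a relative target such as ?session=ABC&name= against the current page is left to the client.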

Because all that happens is following links and reading input lines, it
does not matter if Aarfy is a CLI, TUI, or GUI client: the protocol
exchanges work in any case.  Furthermore, Fluffy does not have to retain
partial state, because it is passed back and forth between Aarfy and Fluffy
with no real interpretation at either end until Aarfy receives a submission
URL.
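
A rough server-side sketch of that idea, in Python, might rebuild the whole form from nothing but the incoming query string. The names FIELDS, encode and render_form are invented for this illustration, the session ID is the placeholder ABC from the example, and the link lines are emitted in gemtext order (target first, then label) rather than the label-first shorthand used in the walkthrough.

```python
from urllib.parse import parse_qs, quote

FIELDS = ("name", "email")  # the two fields from the example above


def encode(pairs):
    """Join (key, value) pairs into a query string, percent-encoding the values."""
    return "&".join(f"{key}={quote(value, safe='')}" for key, value in pairs)


def render_form(query: str, new_session: str = "ABC") -> str:
    """Return the gemtext form for the given query string (hypothetical helper)."""
    values = {k: v[-1] for k, v in parse_qs(query, keep_blank_values=True).items()}
    session = values.get("session") or new_session
    filled = {f: values[f] for f in FIELDS if values.get(f)}
    lines = []
    for field in FIELDS:
        # Carry the other filled fields along and put this field last, so the
        # client's input lands directly after the trailing "=".
        others = [(f, v) for f, v in filled.items() if f != field]
        target = "?" + encode([("session", session)] + others) + f"&{field}="
        label = field.capitalize() + (f" [{filled[field]}]" if field in filled else "")
        lines.append(f"=> {target} {label}")
    if len(filled) == len(FIELDS):
        # Every field has a value: offer the submission link as well.
        lines.append("=> ?" + encode([("session", session), *filled.items()]) + "&submit Submit")
    return "\n".join(lines)


print(render_form("session=ABC&name=John%20Cowan"))
```

Nothing is stored between requests in this sketch; every link carries the values entered so far, so the server only has to act once the submit link arrives.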

For that matter there is no real need to have a submission link: an URL
that specifies both name and email could be interpreted as a submission.
As before, this is a matter of design, not protocol.



John Cowan          http://vrici.lojban.org/~cowan        cowan at ccil.org
There is no real going back.  Though I may come to the Shire, it will
not seem the same; for I shall not be the same.  I am wounded with
knife, sting, and tooth, and a long burden.  Where shall I find rest?
                --Frodo


18. Martin Bays (mbays (a) sdf.org)



>> Gemtext in their current incarnations are already sufficiently 
>> complete to build interactive CGI applications with them today.
>
>The biggest problem is most likely the cost of setting up and tearing down
>all the TLS connections, but there is no help for that.

Well, there is "0-RTT" TLS session resumption with early data. That 
reduces the overheads substantially (though it still requires a fresh 
TCP connection for each request). As far as I know no server supports 
0-RTT currently, and I think only a few clients do. But it would fit 
well with heavy CGI use.


19. Gary Johnson (lambdatronic (a) disroot.org)

Sean Conner <sean at conman.org> writes:

> Not if the CGI interface is properly written.  All I had to do was write
> this CGI script and drop it into my tests directory [1]:
>
> 	gemini://gemini.conman.org/test/pathseg.cgi
>
> It uses only three of the RFC-3875 defined variables, QUERY_STRING,
> SCRIPT_NAME and PATH_INFO to do all the work.  The script will ask for three
> fields and then present a final page with all three fields.  But the script
> will only work if all three variables are defined per RFC-3875 (PATH_INFO is
> the tricky one).
>
>   Yes, it's a bit ugly and yes, it's a second class citizen and yes, it
> requires a proper CGI module to work, but it can be done without the
> configuration you think it does.  The script just simply appends each input
> field as the path, so if you enter
>
> 	and a one
> 	and a two
> 	skidoosh
>
> as the values, the final URL will be:
>
> 	gemini://gemini.conman.org/test/pathseg.cgi/and%20a%20one/and%20a%20two/skidoosh
>
>   Yes, I could have done a bit more processing, naming each segment:
>
> 	/test/pathseg.cgi/name=and%20a%20one/age=and%20a%20two/action=skidoosh
>
> but I was lazy and wanted to just do a proof-of-concept here.
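
For illustration only, the named-segment variant quoted above could be unpacked by a script along these lines. This is a Python sketch, not the code behind pathseg.cgi; the function name fields_from_path is invented, and it assumes the script receives the part of the path after its own name still percent-encoded.

```python
from urllib.parse import unquote


def fields_from_path(raw_path: str) -> dict:
    """Turn /name=a/age=b/action=c into a dict of decoded fields (hypothetical helper)."""
    fields = {}
    segments = [segment for segment in raw_path.split("/") if segment]
    for position, segment in enumerate(segments):
        name, separator, value = segment.partition("=")
        if separator:
            fields[unquote(name)] = unquote(value)
        else:
            # Unnamed segment, as in the simpler form: key it by position.
            fields[str(position)] = unquote(segment)
    return fields


print(fields_from_path("/name=and%20a%20one/age=and%20a%20two/action=skidoosh"))
# prints {'name': 'and a one', 'age': 'and a two', 'action': 'skidoosh'}
```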

Thanks for sharing some code, Sean. I, of course, realize that one could
write a CGI script to pick apart the PATH_INFO for user inputs. The
issue I raised in my message was that this doesn't make any sense in the
context of a CGI script which is looked up using the path on the remote
filesystem.

In your example, your script is located at /test/pathseg.cgi. However,
lacking side information, I see no indicator (outside of the --
admittedly optional -- cgi extension on your file name) of which path
segments should be considered part of the CGI filename lookup and which
parts are meant to be user input data in your example link:

/test/pathseg.cgi/name=and%20a%20one/age=and%20a%20two/action=skidoosh

This feels like a massive hack to me and an abuse of path segments TBH.

If I were to embrace this approach, I can see that I would have to
reprogram my server to do some additional path preprocessing magic. I
could either:

1. Check every sequence of path segments starting from the document root
   to see if any of them correspond to an executable file or have the
   blessed CGI file extension for my server.

2. (To use Chris Babcock's suggestion), check every sequence of path
   segments starting from the document root to see if any of them
   correspond to a directory containing an index.cgi file.

3. Include an input parameter to my server (on the command line or in a
   config file) that specifies a particular mapping between path
   segments and CGI scripts on my filesystem. That is, I would be
   defining a routing table at server start time. This approach has the
   unfortunate side effect of preventing users on a pubnix from
   installing CGI scripts in their ~/public_gemini capsules without
   getting the server administrator to update the global routing table
   on their behalf. Alternatively, it would require each ~/public_gemini
   capsule to include a routing table config file within it if it wanted
   to support CGI scripts, and these would have to be read and parsed by
   the server at server start time and again on a periodic or
   event-driven basis in order to support new user scripts as they are
   added without having to restart the server.

Once one of these 3 approaches enables the server to successfully detect
that a particular path corresponds to a CGI script that is not actually
located where that path is pointing, then the server would need to
execute that script with PATH_INFO bound to the entire path. Every
installed CGI script would then be responsible for manually removing
SCRIPT_NAME from PATH_INFO and splitting it up to get the user inputs,
which puts an additional burden on CGI developers.


So I've now heard from multiple folks that we should all just get on
with these path segment hacks and accept that as the best we can do in
Gemini.

While I can see that it's technically possible (though arguably ugly) to
do so, I suppose my question is:

"What exactly does Gemini lose by allowing chained query parameters?
(with &)"

I can't for the life of me see any downside. It should literally be one
line of code changed in your favorite Gemini client. Just append inputs
to the query string if one already exists rather than replacing the
query string outright.

I believe John Cowan is right that "include" is too vague a word in
Solderpunk's current specification for the 10 INPUT field. Both
appending and replacing are forms of inclusion, so any Gemini client
author who chooses to append shouldn't be in violation of the spec as it
is currently worded.

And changing that one line in your client could save every CGI script
writer in Geminispace a lot of additional work (as clearly demonstrated
by the examples shared by Katarina, Chris, and Sean).

Seems like a very positive return on investment for a small change.

What am I missing here, folks?

Any chance of weighing in here, Solderpunk?


With best intentions,
  Gary

-- 
GPG Key ID: 7BC158ED
Use `gpg --search-keys lambdatronic' to find me
Protect yourself from surveillance: https://emailselfdefense.fsf.org
=======================================================================
()  ascii ribbon campaign - against html e-mail
/\  www.asciiribbon.org   - against proprietary attachments

Why is HTML email a security nightmare? See https://useplaintext.email/

Please avoid sending me MS-Office attachments.
See http://www.gnu.org/philosophy/no-word-attachments.html


20. Sean Conner (sean (a) conman.org)

It was thus said that the Great Gary Johnson once stated:
> Sean Conner <sean at conman.org> writes:
> 
> > Not if the CGI interface is properly written.  All I had to do was write
> > this CGI script and drop it into my tests directory [1]:
> >
> > 	gemini://gemini.conman.org/test/pathseg.cgi

  [ snip ]

  
> Thanks for sharing some code, Sean. I, of course, realize that one could
> write a CGI script to pick apart the PATH_INFO for user inputs. The
> issue I raised in my message was that this doesn't make any sense in the
> context of a CGI script which is looked up using the path on the remote
> filesystem.
> 
> In your example, your script is located at /test/pathseg.cgi. However,
> lacking side information, I see no indicator (outside of the --
> admittedly optional -- cgi extension on your file name) of which path
> segments should be considered part of the CGI filename lookup and which
> parts are meant to be user input data in your example link:
> 
> /test/pathseg.cgi/name=and%20a%20one/age=and%20a%20two/action=skidoosh

  That's a particular implementation detail of GLV-1.12556 [1].  Other
servers could require the extension, or some other mechanism.

> This feels like a massive hack to me and an abuse of path segments TBH.
> 
> If I were to embrace this approach, I can see that I would have to
> reprogram my server to do some additional path preprocessing magic. I
> could either:
> 
> 1. Check every sequence of path segments starting from the document root
>    to see if any of them correspond to an executable file or have the
>    blessed CGI file extension for my server.

  I see your server just accepts the requested path as is.  GLV-1.12556
(once it gets into the filesystem handler) walks down the document root
checking each path segment looking for an executable file (which indicates a
CGI script) or symbolic link (which indicates a SCGI script).

> Once one of these 3 approaches enables the server to successfully detect
> that a particular path corresponds to a CGI script that is not actually
> located where that path is pointing, then the server would need to
> execute that script with PATH_INFO bound to the entire path. Every
> installed CGI script would then be responsible for manually removing
> SCRIPT_NAME from PATH_INFO and splitting it up to get the user inputs,
> which puts an additional burden on CGI developers.

  If you want to follow RFC-3875, that's not the case.  PATH_INFO only
contains data past the script name (section 4.1.5). This link:

	gemini://gemini.conman.org/cgi

returns

	SCRIPT_NAME = /cgi

There is no PATH_INFO or PATH_TRANSLATED because it's not needed.  However:

	gemini://gemini.conman.org/cgi/path/to/nowhere

returns

	SCRIPT_NAME = /cgi
	PATH_INFO = /path/to/nowhere
	PATH_TRANSLATED = /home/spc/projects/gemini/non-checkin/gemini.conman.org/path/to/nowhere

  The work is on the server side, not the CGI script side.
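
As a rough illustration of that server-side work, a walk down the document root that stops at the first executable file could look like the Python sketch below. It is not GLV-1.12556's actual code: the function name split_cgi_path is invented, the symbolic-link (SCGI) case is left out, and the request path is assumed to be already percent-decoded.

```python
import os


def split_cgi_path(doc_root: str, request_path: str):
    """Return the CGI variables for request_path, or None if no script is found."""
    segments = [segment for segment in request_path.split("/") if segment]
    current = doc_root
    for index, segment in enumerate(segments):
        if segment in (".", ".."):
            return None  # refuse to step outside the document root
        current = os.path.join(current, segment)
        if os.path.isfile(current) and os.access(current, os.X_OK):
            # Everything up to and including the script is SCRIPT_NAME;
            # whatever follows becomes PATH_INFO.
            env = {"SCRIPT_NAME": "/" + "/".join(segments[: index + 1])}
            rest = segments[index + 1:]
            if rest:
                env["PATH_INFO"] = "/" + "/".join(rest)
                env["PATH_TRANSLATED"] = doc_root.rstrip("/") + env["PATH_INFO"]
            return env
    return None
```

With an executable file named cgi directly under the document root, a request for /cgi/path/to/nowhere would yield SCRIPT_NAME = /cgi and PATH_INFO = /path/to/nowhere, matching the output shown above, and the script itself never has to strip its own name from the path.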

> So I've now heard from multiple folks that we should all just get on
> with these path segment hacks and accept that as the best we can do in
> Gemini.
> 
> While I can see that it's technically possible (though arguably ugly) to
> do so, I suppose my question is:
> 
> "What exactly does Gemini lose by allowing chained query parameters?
> (with &)"

  Nothing as far as I can see, as long as the characters '=' and '&' are
escaped if they appear in the input (to prevent confusion).  
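
A small Python sketch of that escaping, under the assumption that the client percent-encodes the input before chaining it on with &. The helper name append_input is hypothetical, and session=ABC is just a placeholder query.

```python
from urllib.parse import parse_qsl, quote


def append_input(query: str, user_input: str) -> str:
    """Chain one more user input onto an existing query string with "&"."""
    encoded = quote(user_input, safe="")  # "=" becomes %3D, "&" becomes %26
    return f"{query}&{encoded}" if query else encoded


query = append_input("session=ABC", "a=b&c")
print(query)
# prints session=ABC&a%3Db%26c
print(parse_qsl(query, keep_blank_values=True))
# prints [('session', 'ABC'), ('a=b&c', '')]
```

Because the input's own = and & are encoded, the server still sees exactly two items after splitting on &, and the typed text comes back intact after decoding.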

> What am I missing here, folks?

  Somebody to do a proof-of-concept probably.

> Any chance of weighing in here, Solderpunk?

  Is he still alive?

  -spc

[1]	https://github.com/spc476/GLV-1.12556
