After talking to Alex Schroeder about an upload protocol for Gemini, I realized that his version of the protocol is not quite what I envisioned, and that my version of the protocol is not clearly defined. This document is an attempt to define my protocol.
My goals for this protocol are:
This protocol is intended to be used on the same port as a spec-compliant Gemini server, as a companion protocol that allows for uploading files, nothing less and nothing more.
Uploading a file is divided into several stages: (Get it? Gemini? Multistage rocket to get something into space?)
The client opens a connection and writes the following line:
<upload_uri>\t<filesize>\t<mimetype>
\t meaning an ASCII horizontal tab character
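The request line above could be built and parsed roughly like this. This is only a sketch: the function names are made up for illustration, and the trailing CRLF is an assumption taken from the worked examples further down, since the format line itself does not name a terminator.

```python
def build_request_line(upload_uri: str, filesize: int, mimetype: str) -> bytes:
    """Build the stage 1 request: <upload_uri>\\t<filesize>\\t<mimetype>."""
    for field in (upload_uri, mimetype):
        if any(c in field for c in "\t\r\n"):
            raise ValueError("fields must not contain TAB, CR or LF")
    if filesize < 0:
        raise ValueError("filesize must not be negative")
    # Assumption: the line ends with CRLF, as in the examples of this document.
    return f"{upload_uri}\t{filesize}\t{mimetype}\r\n".encode("utf-8")


def parse_request_line(line: bytes) -> tuple[str, int, str]:
    """Split a request line back into (uri, filesize, mimetype)."""
    parts = line.decode("utf-8").rstrip("\r\n").split("\t")
    if len(parts) != 3:
        raise ValueError("expected exactly three TAB-separated fields")
    uri, size, mimetype = parts
    if not (size.isascii() and size.isdigit()):
        raise ValueError("filesize must be a decimal number")
    return uri, int(size), mimetype
```

A server would call `parse_request_line` on the first line it reads and reject anything that raises `ValueError`.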
To which it expects a response in the format of a regular Gemini response header, except that it also accepts non-numeric response codes. The responses can be sorted into three categories. Any non-success response code will cause the upload to fail, which MUST be displayed to the user!
If the response code is “WR”, the client knows that it can carry on to the next stage. The `<META>` part MUST be ignored by the client and MUST be left blank by the server; it may even be absent entirely.
In case of the “WR” response code the server MUST NOT drop any data that was sent in the time between the client finishing the request and the response being sent. This means that a simple client (like the commandline example) does not have to wait for the WR response, but can start writing immediately after sending the request. The worst that will happen is that the server will ignore what the client says if an error comes up.
The upload protocol defines a few other response codes:
`ES` - The file will not be accepted because it is too large, the `<META>` part of the response MUST contain the maximum allowed filesize.
`EM` - The file will not be accepted because of its MIME type, `<META>` contains a message which should be displayed to the user.
`E_` - The file will not be accepted, `<META>` contains a message which should be displayed to the user.
A smarter client could show the user the error and let the user retry the upload with a different file. A less smart client can get away with looking at the first character of the response and tell the user that the file was rejected. After sending an Ex code the server MUST skip to the third stage and ignore any data the client sends, the server MAY close the receiving side of its socket, if the underlying network permits that. When receiving an Ex code the client MUST skip the second stage, or MAY close the connection.
A range of official response codes MAY be used to indicate various errors at this stage, all other codes should be treated like unknown Gemini response codes:
`3x` - the client SHOULD ask the user if redirecting to the same host, it MUST ask if the upload will go to a different host, if the redirect leads to a `gemini+upload:` URI the client MAY ask the user if they want to retry uploading the file
`4x` - the codes mean exactly the same as defined in the gemini spec, a client should offer the option to retry the upload
`41` - The server SHOULD use this code for indicating a temporary lack of storage (a full disk), as it fits within the definition of an overload and tells the client that it can retry in a few hours
`5x` - the codes mean exactly the same as defined in the gemini spec, the resource being the ability to upload a file
`53` - the server will use this one or 59 to indicate, that it does not support this protocol at all, bots should stop trying to upload to the server
`59` - the server will use this one or 53 to indicate, that it does not support this protocol at all, bots should stop trying to upload to the server
`6x` - the codes mean exactly the same as defined in the Gemini spec, a client should offer the option to retry the upload after asking for a certificate
NOTE: The reason for not including the `1x` range in the list of valid response codes is to satisfy the non-interactive upload requirement, if you want to ask the user something do it using normal Gemini and give them the upload link after they respond (and if you really want to, you can point to such an input with a redirect back to a `gemini://` URI) This is intended to make it inconvenient to use this protocol for delivering content or asking the user for things which is what the Gemini protocol is for.
The server MUST close the connection after sending a response using an official code.
The `1x` and `2x` codes MUST be treated like unknown codes in this context! A client should close the connection after receiving an unknown code!
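The stage 1 rules above can be condensed into a small lookup. This is a sketch, not a reference implementation: the enum and function names are invented here, and treating every two-character code starting with `E` as a rejection is an assumption based on the remark that a simple client may look at the first character only.

```python
from enum import Enum


class Stage1(Enum):
    SEND_FILE = "continue with stage 2"
    REJECTED = "skip stage 2, read stage 3"
    REDIRECT = "3x redirect, server closes"
    GEMINI_ERROR = "4x/5x/6x error, server closes"
    UNKNOWN = "unknown code, close the connection"


def classify_stage1(status: str) -> Stage1:
    """Map a stage 1 status code to what the client should do next."""
    if status == "WR":
        return Stage1.SEND_FILE
    if len(status) == 2 and status.startswith("E"):
        # ES, EM, E_ (assumption: other E-prefixed codes are rejections too)
        return Stage1.REJECTED
    if len(status) == 2 and status.isascii() and status.isdigit():
        if status[0] == "3":
            return Stage1.REDIRECT
        if status[0] in "456":
            return Stage1.GEMINI_ERROR
    # Includes 1x and 2x, which are deliberately not valid at this stage.
    return Stage1.UNKNOWN
```

Everything except `SEND_FILE` means the upload failed and the user has to be told.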
In the second stage the client simply sends the file to the server, by simply throwing it down the pipe. When all bytes are received, the server has two options of replying to the client with a line that again looks like a Gemini response:
The two possible status codes are:
OK - The file was uploaded successfully, `<META>` SHOULD contain the URI to use for downloading the file again, it MAY NOT contain anything else, at this point the client can consider the upload a success
EC - The file was rejected because of its content, `<META>` MAY contain a message that a client SHOULD display to the user
NOTE: The EC code was previously an E_ code, but if a client decides to use the statuscodes directly, it would have had no way of telling it from an E_ general error and would either have to guess or store the fact, that it uploaded the file (this is only a minor detail, but it should make it easier to implement a very simple client and reusing statuscodes is not friendly to people who try to debug software)
These codes will tell any client if the upload was a success independent of possible content that the server sends for a human reader in stage 3. If the client receives an invalid response code it MUST assume, that the upload failed, close the connection and tell the user.
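A client could evaluate the stage 2 reply along these lines. The function name and the three outcome strings are made up for this sketch; only `OK` and `EC` come from the text above, and everything else is treated as a failed upload.

```python
def parse_stage2(line: bytes) -> tuple[str, str]:
    """Return (outcome, meta) for the stage 2 reply line.

    outcome is "success" for OK, "rejected" for EC and "invalid" for
    anything else, in which case the client closes the connection.
    """
    text = line.decode("utf-8", errors="replace").rstrip("\r\n")
    status, _, meta = text.partition(" ")
    if status == "OK":
        return "success", meta  # meta should be the download URI
    if status == "EC":
        return "rejected", meta  # meta may hold a message for the user
    return "invalid", ""
```

Only `"success"` lets the client go on to read the stage 3 header as the follow-up to a finished upload.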
After indicating that an upload was successful a server immediately sends another Gemini header line. This one does not use custom codes and only allows a limited range of the official ones.
The allowed ones are:
`1x` - handle as specified by the Gemini specification
`2x` - handle as specified by the Gemini specification
`3x` - handle as specified by the Gemini specification
`40` - indicates that the server won’t send a response, the `<META>` section should be empty and be ignored by the client.
All other response codes have to be treated as unknown, which in this case means that no relevant information will follow and BOTH server and client MUST close the connection.
NOTE: When sending a `2x` response please keep in mind, that some clients will only have a cached copy of the file they currently display, and that this will be lost as soon as the user leaves the page (reloading is not practical as the client would have to reupload the file to get the page again)
NOTE: When using `1x` codes to authenticate a file upload after it finished you are doing it wrong! These are intended for asking the user about a filename, tags or something similar.
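The stage 3 rules are small enough to sketch as one function. The name and the returned action strings are illustrative only, and the actual handling of input, body and redirect is left to the client's existing Gemini code.

```python
def stage3_action(status: str) -> str:
    """Decide what a client does with the stage 3 header."""
    if len(status) == 2 and status.isascii() and status.isdigit():
        if status == "40":
            return "done"      # server sends nothing more, META is ignored
        if status[0] == "1":
            return "input"     # ask the user, as in normal Gemini
        if status[0] == "2":
            return "body"      # read and display the response body
        if status[0] == "3":
            return "redirect"  # follow the redirect as in normal Gemini
    return "close"             # unknown here: both sides close
```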
Most readers will have noticed by now that this protocol reserves no space for authentication. That is because Gemini already provides the option to have a client certificate, which this protocol supports by making the `6x` range of response codes valid return codes. An alternative would be to use “public” endpoints with an access token embedded in the URI.
The thought behind leaving out a dedicated authentication field is that Gemini already supports ways to tell the server, for example, a username and/or password by entering them into Gemini input fields (response code `10`).
NOTE: A server MAY use any of the above authentication schemes in any combination (including none at all or both)
NOTE: A format that utilizes this protocol to upload text only forms would also be great to have, as such a thing allows the user to know all the questions, take their time to answer them and use one request to tell the server all the answers (similar to the `gopher+write` protocol specified in 2017-12-30 Gopher Wiki or what gopher+ specifies).
To avoid a collision with Alex’ proposal I suggest using the following scheme for this protocol:
gemini+upload://<server>[:<port>]/<path>?<query>
How the path and query are used is left to the imagination of the reader.
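Python's standard `urllib.parse.urlsplit` copes with this scheme without extra work. The sketch below assumes that a missing port falls back to 1965, the Gemini default, because the protocol is meant to share the Gemini port; the function name is made up.

```python
from urllib.parse import urlsplit

DEFAULT_PORT = 1965  # assumption: same default as Gemini


def parse_upload_uri(uri: str) -> tuple[str, int, str, str]:
    """Split a gemini+upload URI into (host, port, path, query)."""
    parts = urlsplit(uri)
    if parts.scheme != "gemini+upload":
        raise ValueError("not a gemini+upload URI")
    if not parts.hostname:
        raise ValueError("URI has no server part")
    return parts.hostname, parts.port or DEFAULT_PORT, parts.path, parts.query
```

The query is returned undecoded, since its meaning is left to the server.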
What exactly a Gemini response header looks like is defined by the Gemini specification in section 3.1. At the time of writing (2020-06-05) the Gemini specification had the following to say about response headers:
Gemini response headers look like this:
`<STATUS><SPACE><META><CR><LF>`
[...]
If `<STATUS>` does not belong to the “SUCCESS” range of codes, then the server MUST close the connection after sending the header and MUST NOT send a response body.
If a server sends a `<STATUS>` which is not a two-digit number or a `<META>` which exceeds 1024 bytes in length, the client SHOULD close the connection and disregard the response header, informing the user of an error.
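For this protocol the header parser has to be a little more lenient than the quoted text, because codes like `WR` are not numeric and the examples send `WR` with no `<META>` at all. A sketch under those assumptions (the function name is invented, and which status codes are acceptable is left to the caller):

```python
MAX_META_BYTES = 1024  # limit quoted from the Gemini specification


def parse_header(line: bytes) -> tuple[str, str]:
    """Split <STATUS><SPACE><META><CR><LF> into (status, meta).

    Accepts any two-character ASCII status so that WR, ES, EM, E_, OK
    and EC pass through. Raises ValueError on malformed headers.
    """
    if not line.endswith(b"\r\n"):
        raise ValueError("header must end with CRLF")
    status, _, meta = line[:-2].partition(b" ")
    if len(status) != 2:
        raise ValueError("status must be exactly two characters")
    if len(meta) > MAX_META_BYTES:
        raise ValueError("META exceeds 1024 bytes")
    # UnicodeDecodeError is a subclass of ValueError, so callers
    # only need to catch one exception type.
    return status.decode("ascii"), meta.decode("utf-8")
```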
Stage 1
C: opens a TLS connection with a valid client certificate to port 1965 of example.org
C: gemini+upload://drop.example.org/upload 1035 text/plain<CR><LF>
S: WR<CR><LF>
Stage 2
C: *sends 1035 bytes of plain text*
S: OK gemini://drop.example.org/files/2020-06/9a8d4186-90c8-4d2b-a296-b4990973892f<CR><LF>
Stage 3
S: 20 text/gemini<CR><LF>
S: #Your upload was successful <LF>
S: The sha256 hash of your file is: 183f4e791e9d4f69d01c0fa67e9a7f6fa2cef48663555dc27114d144307b1f24 <LF>
S: => gemini://drop.example.org/files/2020-06/9a8d4186-90c8-4d2b-a296-b4990973892f You can find your file here<LF>
S: => gemini://drop.example.org/files/2020-06/9a8d4186-90c8-4d2b-a296-b4990973892f/delete?cd76f8a7-6ae5-45cd-9fe4-80cd6af0735d<LF>
S: *closes the connection*
Note that the client only sends the URI with the metadata and the file contents, and that it doesn’t even have to wait for a response from the server before sending the latter.
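A client that does not wait for `WR` could look roughly like the sketch below. It works on two binary file-like objects so that it can be tried without a network. In real use they would be the read and write sides of a TLS connection, which is not shown here. The function name is made up, and stage 3 is left unread.

```python
def simple_upload(rfile, wfile, uri: str, data: bytes, mimetype: str):
    """Send request line and file in one go, then read both replies.

    Returns (success, status, meta) of the last header that was read.
    """
    wfile.write(f"{uri}\t{len(data)}\t{mimetype}\r\n".encode("utf-8"))
    wfile.write(data)  # no waiting for WR, the server must not drop this
    wfile.flush()

    def read_header():
        line = rfile.readline().decode("utf-8", errors="replace")
        status, _, meta = line.rstrip("\r\n").partition(" ")
        return status, meta

    status, meta = read_header()      # stage 1 reply
    if status != "WR":
        return False, status, meta    # rejected, redirected or unknown
    status, meta = read_header()      # stage 2 reply
    return status == "OK", status, meta
```

If the server answers stage 1 with anything other than `WR`, the file bytes were sent for nothing, which is the trade-off this simple client accepts.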
Stage 1
C: opens a TLS connection without a client certificate to port 1965 of example.org
C: gemini+upload://drop.example.org/upload?uplo-ad_t-oken 1035 text/plain<CR><LF>
S: WR<CR><LF>
Stage 2
C: *sends 1035 bytes of plain text*
S: OK gemini://drop.example.org/files/2020-06/9a8d4186-90c8-4d2b-a296-b4990973892f<CR><LF>
Stage 3
S: 20 text/gemini<CR><LF>
S: *sends response body*
S: *closes the connection*
NOTE: Sending a redirect on a successful upload is possible, it is just not shown in any of the examples yet.
Stage 1
C: opens a TLS connection without a client certificate to port 1965 of example.org
C: gemini+upload://drop.example.org/upload?uplo-ad_t-oken 999999999 image/png<CR><LF>
S: ES 50000000<CR><LF>
Stage 2 gets skipped because of the Ex statuscode
Stage 3
S: 30 gemini://drop.example.org/uploadguide<CR><LF>
S: *closes the connection*
NOTE: The maximum size is inclusive, so attempting to upload exactly 50000000 bytes (50 MB) would have been okay
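On the server side the inclusive limit comes down to a single comparison. A minimal sketch, with a made-up function name and the 50 MB limit taken from the example above:

```python
def stage1_size_reply(declared_size: int, max_size: int) -> bytes:
    """Return the stage 1 header for a declared file size.

    The limit is inclusive: a file of exactly max_size bytes is accepted.
    """
    if declared_size > max_size:
        # META carries the maximum allowed size, as required for ES
        return f"ES {max_size}\r\n".encode("ascii")
    return b"WR\r\n"
```

A real server would run the other checks (MIME type, authentication, free disk space) before settling on `WR`.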
Stage 1
C: opens a TLS connection without a client certificate to port 1965 of example.org
C: gemini+upload://drop.example.org/upload?uplo-ad_t-oken 34782 application/vnd.microsoft.portable-executable<CR><LF>
S: EM Get out, who do you think you are? Trying to upload a microsoft executable ... rude ...<CR><LF>
Stage 2 gets skipped because of the Ex statuscode
Stage 3
S: 20 text/gemini<CR><LF>
S: To prevent the spread of malware to unsuspecting users, we have banned the direct uploading of executables
S: If you really have to share an executable, precompiled library or similar try packaging them in an archive of some kind.
S: Uploading the sourcecode instead of the application is encouraged
S: => gemini://drop.example.org/uploadguide You can find our upload guide here
S: *closes the connection*
Stage 1
C: opens a TLS connection without a client certificate to port 1965 of example.org
C: gemini+upload://image.example.org/upload?uplo-ad_t-oken 34782 image/png<CR><LF>
S: WR<CR><LF>
Stage 2
C: *uploads an image with a broken header*
S: EC Unfortunately, the file you uploaded is not an image<CR><LF>
Stage 3
S: 20 text/gemini<CR><LF>
S: The file you uploaded did not look like an image.
S: Either the image's format did not match the mimetype "image/png" or the file was damaged
S: Please verify that the image is in the right format, or repair it
S: Hint: Opening it in an image editing program and saving it again can help in a lot of cases
S: => gemini://image.example.org/uploadguide You can find our upload guide here
S: *closes the connection*
This example demonstrates how a server can use existing gemini mechanisms for allowing the user to upload a file after answering a captcha question
This example will use a randomly generated token in the uri as a kind of very limited cookie (more like an on the fly generated API key)
A page containing a link that will lead to the upload as an entry point
URI: gemini://drop.example.org
20 text/gemini
A redirect, to a page that will ask the user the captcha question. This redirect should just generate a random token and not set any state in the server
URI: gemini://drop.example.org/upload
30 gemini://drop.example.org/validate/rand-dom_-code
The user gets asked a captcha question (the server should at this point link the token to the correct answer and an expiry date) giving a correct answer will validate the code for uploading a file
URI: gemini://drop.example.org/validate/rand-dom_-code
10 What is bigger on the inside than on the outside?
NOTE: At this point the server could ask for a username or other information
For a correct answer the user will get redirected to an upload uri already containing the now valid token ...
URI: gemini://drop.example.org/validate/rand-dom_-code?tardis
30 gemini+upload://drop.example.org/upload?rand-dom_-code
..... which can be used to upload a file.
URI: gemini+upload://drop.example.org/upload?rand-dom_-code
For an incorrect answer the server can serve an error page, and delete the token and the associated information.
URI: gemini://drop.example.org/validate/rand-dom_-code?no%20idea
20 text/gemini
Wrong answer!
=> gemini://drop.example.org/validate/rand-dom_-code Try again!
NOTE: The validation thingy should always ask a random question, however the server may ask the same question for the same token until it gets validated and used or it expires
NOTE: Since these codes are temporary they could be kept in RAM by the server. Keep in mind that they take up space and that creating a lot of them without ever validating them is a possible DoS attack (to prevent crawlers from getting trapped in anything, consider blacklisting the entire validate/upload construct in your server's robots.txt)
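An in-memory token store for the flow above might look like the following sketch. All names, the ten minute lifetime and the cap on stored tokens are assumptions made for illustration. The cap is one possible answer to the DoS concern, not something the proposal prescribes.

```python
import secrets
import time


def new_token() -> str:
    """Random token for the /upload redirect, no server state needed."""
    return secrets.token_urlsafe(16)


class TokenStore:
    def __init__(self, ttl=600.0, max_tokens=10_000, clock=time.monotonic):
        self._ttl, self._max, self._clock = ttl, max_tokens, clock
        self._tokens = {}  # token -> [answer, expires_at, validated]

    def _purge(self):
        now = self._clock()
        for tok in [t for t, e in self._tokens.items() if e[1] <= now]:
            del self._tokens[tok]

    def register(self, token: str, answer: str) -> bool:
        """Link a token to the expected answer when the question is asked."""
        self._purge()
        if token not in self._tokens and len(self._tokens) >= self._max:
            return False  # store is full, refuse instead of growing
        self._tokens.setdefault(
            token, [answer.lower(), self._clock() + self._ttl, False])
        return True

    def answer(self, token: str, given: str) -> bool:
        """Validate the token on a correct answer, delete it otherwise."""
        self._purge()
        entry = self._tokens.get(token)
        if entry is None:
            return False
        if given.strip().lower() == entry[0]:
            entry[2] = True
            return True
        del self._tokens[token]
        return False

    def consume(self, token: str) -> bool:
        """Use a validated token for exactly one upload."""
        self._purge()
        entry = self._tokens.get(token)
        if entry is None or not entry[2]:
            return False
        del self._tokens[token]
        return True
```

The upload handler would call `consume` with the query part of the `gemini+upload://` URI and reject the request if it returns `False`.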
This one uploads the file `test.txt` to the URI `gemini+upload://localhost/Test` using an unencrypted connection (replace `netcat` with `gnutls-cli --insecure` to encrypt it).
(sleep 1 \
  && echo -e "gemini+upload://localhost/Test\t"$(wc -c < test.txt)"\t"$(file --brief --mime-type test.txt) \
  && sleep .01 \
  && cat test.txt) \
  | nc localhost 1965
⁂
My writing can be *a bit* complicated to follow without me realising, don’t hesitate to ask for clarifications. Comments, criticism and suggestions are welcome too!
– baschdel 2020-06-06 21:33 UTC
---
The two proposals seem to be very similar. If the world ends up using your proposal, I won’t be sad. 😃
As for specifics: I find the use of TAB to separate fields reminds me of Gopher and that’s not really something we need for Gemini. It’s also harder to write about in documentation because it’s invisible and easily confused with spaces, unlike newlines. I also felt that my proposal just uses status code 59 and manages to do well enough, so I am not convinced we needed all the extra.
I also felt that it was a lot longer and I can’t put my finger on specific sentences and paragraphs, but after reading it, my main reaction was: “wow, that’s long. Do we need all that?” Especially since I already know that the minimal format I use is working.
I also feel that I’m not interested in the kind of client certificate authentication that Gemini uses: I don’t care if two edits are made by the exact same user, and I might want to restrict edits even if they have a client certificate, so the proposal now forces me to keep state, and a database of known certificates, instead of a much simpler list of tokens (possibly just a single editor password for everybody).
As you can see, nothing big, but still a lot of friction.
– Alex Schroeder 2020-06-06
---
The reason I used the TAB-separated approach to send metadata (despite liking and actually preferring Alex’s approach of putting the fields onto separate lines) is to keep the request scheme of Gemini: the client sends one line to the server and the server replies with a status line. An in-spec Gemini client would never write a `gemini+upload://` URI to the server anyway, but the chances of implementation flaws being discovered should be higher when the client fails because of an unrecognised non-numeric status code instead of waiting for the server while the server waits for the client, eventually timing out.
The many response codes are there to always let the client know why the upload failed. As a client developer this is a big one for me.
The third stage, however, could be replaced by a simple optional redirect that sends a URI on one line or an empty line. But because of the way I specified authentication, I wanted to keep the option to save a redirect in the proposal. Also, the server could send some human-readable information directly to the client after the upload without being limited to a single line or having to store it until the user decides to follow the redirect: detailed descriptions of errors, friendly links to manuals, tokens to delete the file etc.
On authentication tokens: I’m still in favour of putting them into the URI (client certificates are OPTIONAL) instead of having a dedicated token field. Asking the user for something without knowing what the server wants to know is extremely inconvenient and is guaranteed to confuse at least some users, especially if the server doesn’t even care about that field, or uses it for something other than an authentication token. Implementing the other one would require me to do about five minutes of work on my client plus another five for fixing bugs, but designing the UI in the least confusing way would be quite a challenge without having a reliable source of information on what to ask the user for.
– baschdel 2020-06-06 22:06 UTC
---
Your point about a lack of good UI is well made. My error messages are necessarily terse and I don’t like that, either. I think the better solution (though unlikely) would be to allow a payload for error messages, too. Like HTTP: if a page does not exist, you have three levels:
1. Status code 404
2. Status message “Page not found”
3. and the complete body of the response to provide additional information
I really like that.
– Alex Schroeder 2020-06-06 22:21 UTC
---
On the three-part error message: The proposal already supports that in its current form by jumping to the third stage (downloading a response from the server) when the server sends an Ex response code with a message (or, in the case of the file being too big, the maximum allowed size).
Examples for failed uploads and uploading using only a token will be added soon (TM)
– baschdel 2020-06-06 22:33 UTC
---
I added a few more examples and tried to give the document a bit more structure by categorising the stage 1 response codes.
The stage two response code “E_” was replaced by “EC” in order to avoid one code having two meanings.
Code 41 in response to Stage 1 is now specified to indicate a full disk, as this makes the uploading functionality temporarily unavailable and is a kind of overload.
– baschdel 2020-06-07 01:13 UTC
---
I made a diagram that visualizes the protocol. To view it, your browser has to support UTF-8 (if it doesn’t, the first version was uploaded as a PNG image): Baschdels_spin_on_Gemini_uploading_diagram.txt
– Anonymous 2020-06-07 21:16 UTC
---
I think my main issue is the multi-staged approach. Perhaps I’m simply used to stateless protocols like HTTP, finger, gopher, etc. Waiting for the WR response seems to offer little benefit.
On the contrary, I currently have the problem that my site cannot tell users “this is a page you can edit”. They just get an error if they try to write to the wrong URL. I also cannot *link* to the editable URL because following the URL is going to be an error due to the unknown URL scheme. Solving that problem seems more interesting to me right now.
I don’t mind more error codes explaining what the errors are, but I also get the feeling that many of them don’t result in clients doing anything other than displaying an error message. And once we’re simply displaying an error message, we could also just use 59.
I’ve used the token idea for leaving comments on this site, though: you have to answer a simple question when trying to leave a comment via Gemini: follow the comment link and you are asked for the question, answer the question and you are redirected to a new URL where you are asked for the comment, and finally you are redirected back to the page with all the comments.
– Alex Schroeder 2020-06-08 07:09 UTC
---
I have tried to improve the section, and made it explicit that the client does not have to wait for the server to send a WR response, at the risk of the uploaded data being ignored.
When it comes to the status codes: You are right, most of them will on most clients simply result in an error message to the user (but that is also the case with the whole 4x and 5x range on Gemini). But a dedicated uploader for, say, media files could offer the user to convert the file to a different format if the current one gets rejected, or to drop just enough quality for the file to be accepted if it’s too large. Also, a well-behaved automated client (bot) will wait at least a few hours if not days when it receives a 59 or 53 code, as this indicates that the server does not support uploading at all.
I’m sure people will come up with other genuine uses that depend on the server telling the client why the upload was rejected. (With a single status code, you have to guess that based on the message, which kind of works when you expect the message to be in English and the server replies in English, but falls apart as soon as any of these assumptions mismatch (also, no assumptions was one of the goals, if I haven’t messed that up too))
– baschdel 2020-06-08 10:21 UTC
---
Also I like how you implemented commenting on Gemini, this is exactly the kind of thing I suggested for upload authentication without using client certificates.
Telling the user how to upload should be simple enough by putting a `Replace this page using gemini+(write|upload)://` at the bottom of the page together with the Raw text link. The Gemini specification already states that clients have to expect unknown URI schemes, and even the simplest clients should offer the option to copy a URI with an unknown scheme (that includes simply writing it on screen where the user can select and copy it).
– baschdel 2020-06-08 10:43 UTC
---
Ok, not exactly what I suggested, but for this wiki, with behaving users this implementation should be fine
– baschdel 2020-06-08 11:16 UTC