Archived view for gemi.dev › gemini-mailing-list › 000101.gmi, captured on 2023-12-28 at 15:41:16. Gemini links have been rewritten to link to archived content.
-=-=-=-=-=-=-
Hey!

I wanted to ask (and maybe discuss, if not done already) why Gemini has no option to upload a file to a server except for a roughly 1024 byte long URL string. Imho this would be a practical use case, e.g. for web forums or other services. As some Gemini servers already work with CGI scripts and similar, I think it's a reasonable thing to allow updates via the spec. And people will eventually use the query string to upload files with multiple requests if they want to upload files...

Another question: Why does Gemini use <CR><LF> line endings instead of a single <LF> or <CR> token? It makes the parser implementation more complex and imho brings no benefit to the protocol and text format itself. I see no reason why I should have a single <CR> or <LF> in a line, and it may confuse users: "12345, World!<CR>Hello" may be printed as "Hello, World!" on most text terminals, which is imho undesirable for non-interactive output.

If a text-based client requires output to be <CR><LF> instead of <LF>, it can patch the data on the fly while outputting it.

Regards
xq
May 17, 2020 3:55 PM, "Felix Queißner" <felix at masterq32.de> wrote:

> Why does gemini use <CR><LF> line endings instead of a single <LF> or
> <CR> token? It makes the parser implementation more complex and imho
> brings no benefit to the protocol and text format itself.

<CR><LF> is the Windows line ending; the HTTP spec, for example, also requires it. No idea why it's required, exactly. Functionally you could just split on <LF> and remove all <CR> occurrences, I think.

CR, the carriage return, would return the cursor to the start of the line, which is almost certainly not wanted in the middle of the text. (Also, ancient Mac OS, before OS X, used ONLY <CR>, iirc, as a fun sidenote.)

Using <CR><LF> would allow you to directly netcat from Windows, or so, I suppose...

I'd think it would be best if the server side accepted both <CR><LF> and a plain <LF>. I'm not sure it really matters what the server returns, because you can strip out all <CR> characters, and on Linux (probably all Unixes?) it displays like a normal newline anyway.
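The "split on <LF> and remove all <CR> occurrences" idea from the message above could look roughly like this. It is only a sketch of that lenient approach, not anything taken from the spec or from an existing client; the function name `split_lines` is made up for illustration.

```python
def split_lines(data: bytes) -> list[bytes]:
    """Split on <LF> and drop every <CR>, wherever it appears.

    This accepts both <CR><LF> and bare <LF> line endings, and it also
    removes a stray <CR> in the middle of a line, so the terminal never
    gets a chance to jump back to the start of the line.
    """
    return [line.replace(b"\r", b"") for line in data.split(b"\n")]


# Mixed line endings come out the same way.
lines = split_lines(b"first\r\nsecond\nthird")

# The confusing example from the original question loses its <CR>.
cleaned = split_lines(b"12345, World!\rHello")
```

One consequence of removing every <CR>, rather than only one directly before an <LF>, is that text which really did contain a carriage return is silently changed. That may or may not be acceptable, depending on the client.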
On Sun, May 17, 2020 at 02:55:03PM +0200, Felix Queißner wrote:

> I wanted to ask (and maybe discuss, if not done already) why Gemini has
> no option to upload a file to a server except for a roughly 1024 byte
> long URL string.

I have to admit, this idea has occasionally crossed my mind. Most recently, when Dave wrote a helper for Git so that people could `git pull` over Gemini, which I thought was super cool - `git push` isn't possible with Gemini as a read-only protocol.

It's not that I don't think there are good uses for this.

The original reason is that I was obsessed from day one with making it extremely hard for people to be able to extend the core Gemini protocol. HTTP, for example, allows as many headers as you like in requests/responses. Clients are expected to read them all, and handle the ones they can handle. This means anybody can come up with a new header, and if it's popular many clients/servers will implement it, and then it becomes a de facto part of the standard, and clients/servers which don't handle it are seen as "broken" or "primitive".

This extensibility is of course a useful thing in many ways from an engineering perspective. But in the long term it is, IMHO, fundamentally totally incompatible with ideals like simplicity and minimalism and privacy and "anybody can implement it themselves over a weekend in < 1000 LOC". Designers of protocols which are extensible effectively lose a lot of control over their protocol. It's pointless me trying very hard to keep stuff which could be abused for tracking out of Gemini if it can be snuck in by popular consensus this way, because inevitably it will be. You've just got to limit the scope for this kind of extension everywhere you can.

If you take this idea seriously, you are basically forced to choose one kind of "thing" a lot, and then have that thing be totally implicit.
If there's only one kind of Gemini request (something analogous to GET), then we don't have to explicitly put anything in the request format saying "this is a GET-ish request". And if there's nothing explicit there, nobody can write an "advanced" server which recognises a different value in that place.

So, thinking from a perspective of simplifying HTTP, I had to choose only one method, so I chose GET. I had to choose only one response header, so I chose Content-type (because my experience maintaining a popular Gopher client convinced me this was the most sorely lacking bit of information). Several people convinced me to use full blown URLs instead of just paths as I originally specced for requests, which is equivalent to choosing Host as the only request header. Basically this theme runs deep all throughout Gemini's design: wherever HTTP allows several things, pick the one most fundamentally important/useful one, and make it an implicit default with no scope for anything else.

If somebody can come up with a way to distinguish GET from POST style requests without also opening up an obvious door to arbitrarily many extra request types, I'll give it some thought. But I'm not optimistic.

Insisting on non-extensibility necessarily imposes limits on how much Gemini can do. That's okay. Limitations encourage creativity, and give different things their own unique style/taste/whatever. Gemini is never going to be able to do everything that the web can do - it can't possibly do that while remaining simpler. We should accept this.

> Another question:
> Why does gemini use <CR><LF> line endings instead of a single <LF> or
> <CR> token? It makes the parser implementation more complex and imho
> brings no benefit to the protocol and text format itself. I see no
> reason why i should have a single <CR> or <LF> in a line and it may
> confuse users: "12345, World!<CR>Hello" may be printed as "Hello,
> World!" on most text terminals which is imho undesirable for
> non-interactive output.

As recently mentioned, the spec doesn't actually explicitly say anything about line endings in text/gemini content itself (although it should). It does suggest that CRLF is needed at the end of => lines, but that was unintentional on my part. I agree that requiring CRLF for actual content is strange and I suspect this will change in the next revision.

CRLF *is* clearly and deliberately specced in the non-content part of the protocol, i.e. for requests and response headers. And the honest answer here is, well, that's how every internet protocol whose spec I've ever looked at works - HTTP, Gopher, SMTP, IRC, for example, all do this. I admit to being ignorant as to the exact historical reason for this convention. But it's a deep and wide convention adhered to by people who know more than I do, and for that reason I'm reluctant to break it without very good reason.

If people have strong feelings in either direction about the line terminator to be used in the protocol and in text/gemini content, I'm very happy to hear it.

Cheers,
Solderpunk
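To make the "implicit GET" point concrete, here is a minimal sketch of what building a Gemini request amounts to under the design described above: the request is just the URL followed by CRLF, with no method and no headers to extend. The helper name `build_request` and the constant name are made up for illustration, and the 1024-byte limit is the figure mentioned earlier in this thread.

```python
MAX_URL_BYTES = 1024  # URL length limit referred to in this thread


def build_request(url: str) -> bytes:
    """Return the bytes a client would send for `url`.

    There is no method and no header section: the whole request is the
    URL terminated by <CR><LF>.
    """
    encoded = url.encode("utf-8")
    if len(encoded) > MAX_URL_BYTES:
        raise ValueError("URL exceeds the maximum request length")
    return encoded + b"\r\n"


request = build_request("gemini://example.org/docs/")
```

Since there is no slot in these bytes for a method name or an extra header, a server has nowhere to look for an "advanced" variant of the request, which is the property the message above argues for.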
First of all: thanks for the very extensive response!

> It's not that I don't think there are good uses for this.
>
> The original reason is that I was obsessed from day one with making it
> extremely hard for people to be able to extend the core Gemini protocol.
> HTTP, for example, allows as many headers as you like in
> requests/responses. Clients are expected to read them all, and handle
> the ones they can handle. This means anybody can come up with a new
> header, and if it's popular many clients/servers will implement it, and
> then it becomes a de facto part of the standard, and clients/servers
> which don't handle it are seen as "broken" or "primitive".

Yes, I can understand this, and it was not my intention to create extensibility in the protocol, but just to allow a single, client-induced data upload to the server.

> This extensibility is of course a useful thing in many ways from an
> engineering perspective. But in the long term it is, IMHO, fundamentally
> totally incompatible with ideals like simplicity and minimalism and
> privacy and "anybody can implement it themselves over a weekend in <
> 1000 LOC". Designers of protocols which are extensible effectively lose
> a lot of control over their protocol.

Yes, true.

> It's pointless me trying very
> hard to keep stuff which could be abused for tracking out of Gemini if
> it can be snuck in by popular consensus this way, because inevitably it
> will be. You've just got to limit the scope for this kind of extension
> everywhere you can.

One proposal for more privacy and less tracking: Explicitly allow clients to remove the query string from any request, as most web stuff also does tracking via request parameters (before cookies). This would prevent servers from relying on per-user generated URLs in between pages, and the user can be asked whether they want to remove the query parameters.
> If you take this idea seriously, you are basically forced to choose
> one kind of "thing" a lot, and then have that thing be totally implicit.
> If there's only one kind of Gemini request (something analogous to GET),
> then we don't have to explicitly put anything in the request format
> saying "this is a GET-ish request". And if there's nothing explicit
> there, nobody can write an "advanced" server which recognises a
> different value in that place.

Yeah, that's why I asked for a specific PUT in the first place. It may start to emerge that people want a more interactive version of Gemini-served pages and would start to abuse standard features like URL queries to introduce that kind of interactivity, and it would be a point where the server would be able to pretty easily "trick" the user into following trackable links.

Having an explicit PUT option in the protocol and preventing servers from relying on queries would make stuff simpler and more straightforward in the long term.

> If somebody can come up with a way to distinguish GET from POST style
> requests without also opening up an obvious door to arbitrarily many
> extra request types, I'll give it some thought. But I'm not optimistic.

I actually came up with an idea, but I don't know how good it is in the end:

Respec the 10 INPUT so that it works like this:

1. Client sends usual request header
2. Server responds with "10 Your forum post:"
3. Client now has two options:
   1. The client drops the connection and sends no bytes. This would be the status quo.
   2. The client now sends a single line with the MIME type of the data, then sends the data, similar to the server responding with a 20 status code (so, instead of the server sending data to the client, the client just sends data to the server)

This would allow several things:

1. The server can notify the client that it needs to upload data; the client can now choose to upload or not
2. With the MIME type in the upload header, the server can just drop the connection after the MIME type, signalling to the client that the data sent is unwanted.

> Insisting on non-extensibility necessarily imposes limits on how much
> Gemini can do. That's okay. Limitations encourage creativity, and give
> different things their own unique style/taste/whatever. Gemini is never
> going to be able to do everything that the web can do - it can't
> possibly do that while remaining simpler. We should accept this.

Yeah, true. But the first idea that comes to my mind when I'd like to upload a file would be: Chunk the file into 256 byte large pieces, and upload the whole data via a huge load of requests containing a query /path/?offset=X&length=Y&blob=Z where X is the offset in the uploaded file, Y is the length of the transferred data and Z would be the URL-encoded data itself.

> As recently mentioned, the spec doesn't actually explicitly say anything
> about line endings in text/gemini content itself (although it should).
> It does suggest that CRLF is needed at the end of => lines, but that was
> unintentional on my part. I agree that requiring CRLF for actual
> content is strange and I suspect this will change in the next revision.
>
> CRLF *is* clearly and deliberately specced in the non-content part of
> the protocol, i.e. for requests and response headers. And the honest
> answer here is, well, that's how every internet protocol whose spec I've
> ever looked at works - HTTP, Gopher, SMTP, IRC, for example, all do
> this. I admit to being ignorant as to the exact historical reason for
> this convention. But it's a deep and wide convention adhered to by
> people who know more than I do, and for that reason I'm reluctant to
> break it without very good reason.

Thanks for clarifying!

> If people have strong feelings in either direction about the line
> terminator to be used in the protocol and in text/gemini content, I'm
> very happy to hear it.
I'd like to see a pure <LF> version, especially for the protocol header. My client atm just reads until the first <LF>, then checks if the <CR> is there and, if not, drops the connection to the server and responds with "InvalidResponse".

I assume a lot of servers/clients either ignore the existence of <CR> or drop the connection for protocol violation, because both options are the sane thing to do. It's not like a lone <CR> or <LF> is allowed in the header anyway.

Regards
xq
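The strict behaviour described in the message above (read until the first <LF>, then insist on a preceding <CR>) might be sketched as follows. This is not the actual client's code: `read_header` and the exception class are hypothetical names, and the stream is assumed to be any object with a `read(n)` method returning bytes.

```python
import io


class InvalidResponse(Exception):
    """Raised when the response header violates the line-ending rule."""


def read_header(stream) -> str:
    """Read one header line, requiring it to end in <CR><LF>."""
    buf = bytearray()
    while True:
        byte = stream.read(1)
        if not byte:
            # The peer closed the connection before sending any <LF>.
            raise InvalidResponse("connection closed before <LF>")
        if byte == b"\n":
            break
        buf += byte
    if not buf.endswith(b"\r"):
        # A bare <LF>: treat it as a protocol violation and give up.
        raise InvalidResponse("header not terminated by <CR><LF>")
    return bytes(buf[:-1]).decode("utf-8")


# A well-formed header followed by the start of the body.
print(read_header(io.BytesIO(b"20 text/gemini\r\n# Hello")))  # prints: 20 text/gemini
```

A lenient client would instead strip an optional trailing <CR> and carry on, which is the other "sane" option mentioned above.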
solderpunk writes:

> On Sun, May 17, 2020 at 02:55:03PM +0200, Felix Queißner wrote:
> I have to admit, this idea has occasionally crossed my mind. Most
> recently, when Dave wrote a helper for Git so that people could `git
> pull` over Gemini, which I thought was super cool - `git push` isn't
> possible with Gemini as a read-only protocol.
>
> It's not that I don't think there are good uses for this.

For what it's worth, I personally don't even think allowing git pushes over Gemini is a good reason for modifying the protocol. Git has its own application-level protocol that works perfectly for this kind of thing.

Gemini is very close to _perfect_ for its intended use: serving up primarily textual content to humans in a relatively secure but pleasantly simple way. I really think it'd be tragic to compromise on this for reasons that are tangential to the original goal.

Just my 2c,
Tim
---