It was thus said that the Great solderpunk once stated:
> Two quick points with regard to the fact that Gemini currently does not
> convey file sizes to users at any point:
>
> * Sean has pointed out in one of his RFCs that this means there is no
>   way for a client to know whether or not a download completed
>   successfully or was interrupted due to an accidentally dropped or
>   even a maliciously severed connection
>
> * I've received an email from somebody watching the Gemini design unfold
>   with interest, who is concerned about Gemini clients with limited
>   system resources unwittingly downloading large files (such as PDFs of
>   scanned documents) which they aren't even capable of opening.  While I
>   quite like the idea of Gemini being friendly to low-end systems, I do
>   wonder whether or not the TLS requirement makes this a little moot.
>
> Anyway, the question is do we want to change anything to address these
> issues and if so how do we want to do it?
>
> I'll quickly note in passing that both of these problems also exist in
> exactly the same form for Gopher, but I've never once heard Gopher users
> complain about them.

Gopher does address this rather obliquely---text files (and gopher
indexes) are supposed to end with a '.' on a line by itself.  This lets
the client know it received the data correctly, and it says as much in
RFC-1436, section 3.8:

    Note that for type 5 or type 9 the client must be prepared to read
    until the connection closes.  There will be no period at the end of
    the file; ...

Not knowing file sizes beforehand isn't necessarily a pain point, but it
does make displaying a progress bar (for example) difficult to implement.

> One possibility, as proposed by Sean, is to add file size to the
> response header, with it optionally appearing after the MIME type.  I'm
> not hugely fond of this myself, simply because it complicates parsing of
> the response header.

I'm not seeing much of an issue.
Assuming tabs separate the components on the status line, then

    (\d+)\t([^\t]+)(\t([^\t]+))*

would parse the line (I suspect; I'm not a fan of regex, but I think the
above would work to parse the status line).  I don't see much of an issue
in parsing any of the following:

    20<HTAB>text/plain; charset=utf-8<HTAB>2123<CRLF>
    20<HTAB>text/plain<CRLF>

> Remember that the MIME type can have multiple
> components specifying encodings etc.  If you just split the META part of
> the header on whitespace, the number of components is variable, so
> recognising whether or not an optional filesize is present requires
> actually inspecting the parts and looking for a number.  In fairness to
> Sean, at the time of writing of his RFC the spec said META was
> separated from STATUS by a tab (whereas now it is just whitespace), so
> tacking something after META with another tab was unambiguous, assuming
> nobody put tabs in their MIME types...

Which could be specified: "don't put tabs in the MIME type section."

> Another possibility ties into another request I got from somebody very
> early on - it would be nice if there was some way to query a Gemini
> server for the time a resource was last modified, so that Gemini
> equivalents of tools like moku pona could avoid needlessly fetching
> unchanged resources over and over again.  At that point I started
> wondering about giving Gemini some equivalent of HTTP HEAD, although I
> abandoned it pretty quickly when I realised that substantial TLS
> overhead probably made making a whole second request to check if a
> resource had changed not such a worthwhile idea.

One way would be to query a well-known endpoint (these exist in the
HTTP world---robots.txt is one such file) that contains timestamps for
various resources.  Slap a MIME type of text/gemini-timestamp on it and
call it done:

    gemini://example.com/       2019-08-15T13:53:00-05:00
    gemini://example.com/feed   2019-07-29T00:00:00-05:00
    gemini://example.com/other  2019-08-01T00:00:00-05:00

That's one way.

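As a rough illustration of how a client might consume such a listing,
here is a minimal sketch.  It assumes one whitespace-separated "URL
timestamp" pair per line with RFC 3339 timestamps; the
text/gemini-timestamp type is only the proposal above, and the names
parse_timestamps and needs_fetch are hypothetical, not part of any
spec or existing client:

```python
from datetime import datetime

def parse_timestamps(body):
    """Map each URL in a timestamp listing to a timezone-aware datetime."""
    stamps = {}
    for line in body.splitlines():
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        # Assumption: the URL contains no whitespace, so the first run of
        # whitespace separates it from the timestamp.
        url, stamp = line.split(None, 1)
        stamps[url] = datetime.fromisoformat(stamp.strip())
    return stamps

def needs_fetch(stamps, url, last_seen):
    """True if url is not listed or was modified after last_seen."""
    modified = stamps.get(url)
    return modified is None or modified > last_seen

listing = """\
gemini://example.com/ 2019-08-15T13:53:00-05:00
gemini://example.com/feed 2019-07-29T00:00:00-05:00
gemini://example.com/other 2019-08-01T00:00:00-05:00
"""
stamps = parse_timestamps(listing)
last_seen = datetime.fromisoformat("2019-08-01T00:00:00-05:00")
```

With something like this, a feed reader could make one request for the
listing and then fetch only the URLs for which needs_fetch() is true,
treating anything unlisted as "fetch it to be safe."
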
> But, we could possibly
> bring this idea back, as the response to such a request could naturally
> include the file size as well.  The real question is how to *make* such
> a request, ideally in a way which doesn't open the door to a half dozen
> other new "methods".

As I mentioned in a private email to solderpunk earlier, one could
always take advantage of the sub-delimiters in the path portion.  I had
at one point mentioned using those to specify the prompt (otherwise the
server would return a status of 10):

    gemini://example.com/search;Search%20for

This could be formalized:

    gemini://example.com/search;prompt=Search%20for
    gemini://example.com/blogfeed;timestamp=2019-08-15T00:00:00Z
    gemini://example.com/wildexample;prompt=Search%20for;timestamp=2019-08-15T00:00:00Z?query=foo&username=bar

So, you have "prompt" and "timestamp".  Others could be proposed.  If the
"timestamp" thing above is accepted, then you might want to have a new
status code meaning "no change" or "okay, but there's no content".

> Regarding ways to enable something like a HEAD request without changing
> the request format to include a method field - I'm not quite sure
> whether using a fixed URL fragment, like #meta, on requests would be a
> kosher way to do this.  Does metadata count as "some portion or subset
> of the primary resource, some view on representations of the primary
> resource, or some other resource defined or described by those
> representations" (from RFC3986)?

Well, there are RFC-5147 and RFC-7111 that give semantics to the URI
fragment section, but I still think using the sub-delimiter of ';' in
the path portion is the way to go.

  -spc