💾 Archived View for gemi.dev › gemini-mailing-list › 000456.gmi captured on 2024-08-19 at 00:46:01. Gemini links have been rewritten to link to archived content
-=-=-=-=-=-=-
Proposal: Cache duration

Gemini servers MAY include a duration (in seconds) for which a client SHOULD cache a resource when the request is successful. When a cache duration of 0 or less is provided, clients SHOULD NOT cache the resource.

Cache for one week:

20 text/gemini; charset=utf-8; cache=604800

Do not cache:

20 text/gemini; charset=utf-8; cache=0

Proposal: Response body size

Gemini servers SHOULD include the size (in bytes) of the response body when the request is successful. Clients SHOULD utilize this information when downloading files to indicate progress.

20 text/gemini; charset=utf-8; size=1108

These proposals are available at gemini://twins.rocketnine.space/proposals.gmi
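[For illustration, here is a minimal sketch of how a client might split such a response header into status, MIME type, and parameters. The `parse_meta` helper and its return shape are hypothetical; only the `cache` and `size` parameter names come from the proposal.]

```python
def parse_meta(header_line):
    """Split a Gemini response header such as
    '20 text/gemini; charset=utf-8; cache=604800'
    into (status, mime_type, params)."""
    status, _, meta = header_line.strip().partition(" ")
    parts = [p.strip() for p in meta.split(";")]
    mime_type, params = parts[0], {}
    for p in parts[1:]:
        key, _, value = p.partition("=")
        params[key.strip().lower()] = value.strip()
    return int(status), mime_type, params

# Unknown parameters simply end up as extra dict entries, which is why
# old clients would remain forwards compatible by ignoring them.
```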
We just had massive discussions about both of these on the ML. I'll provide you with some of my thoughts and (tentative) consensuses from the conversations in the ML:

> Caching
> Content-Size / Content-Length

Both of these proposals, and the general idea around them, focus on the transmission of large files. Caching is geared towards reducing request times, and content-size/length aims to provide progress bars and size verification. Content hashing, which you've not discussed, is likewise aimed at content verification.

Gemini is focused on serving small textual files. These clearly do not need caching. Clients that do wish to cache should only cache backward and forward in history (i.e. pages the user has visited in the current session), and should provide a reload button, at least for these pages, so that they can be refreshed when needed. In addition, large files are not expected to be requested repeatedly (at least, Gemini is not the protocol to do it in), so they do not need to be cached either.

Small text files are also small enough that their size does not need to be known (they load quickly enough to remove the need for progress bars), and they can easily be verified by hand just by reading them (the underlying connection is already over TLS, so receiving bad data is hard enough as it is).

Gemini is just not good at large transmissions, and this is intentional. If you need to do large transmissions, use other protocols that already support content-size (and perhaps content-hash); they are geared towards such transmissions and will support it better.

~aravk | ~nothien
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 833 bytes
Desc: not available
URL: <https://lists.orbitalfox.eu/archives/gemini/attachments/20201110/af871fb9/attachment.sig>
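[The session-only caching Arav describes, with a reload control that bypasses the cache, could be sketched roughly as follows. The class, its method names, and the `fetch` callable are all hypothetical; the thread only specifies the behavior.]

```python
class SessionCache:
    """Cache pages visited in the current session only; a reload action
    drops the cached copy and re-fetches. Nothing survives the session."""

    def __init__(self, fetch):
        self._fetch = fetch   # callable: url -> response body
        self._pages = {}      # url -> cached body, in memory only

    def visit(self, url):
        # Back/forward navigation hits the cache; first visits fetch.
        if url not in self._pages:
            self._pages[url] = self._fetch(url)
        return self._pages[url]

    def reload(self, url):
        # The reload button: discard the stale copy, fetch fresh.
        self._pages.pop(url, None)
        return self.visit(url)
```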
Both of these proposals sound very much like introducing MIME type parameters as Gemini's version of headers, which is something to be avoided if I read the [FAQ] correctly.

> Proposal: Cache duration

This has been discussed at length on this mailing list, and I personally do not see a convincing case for why this is necessary with typical small Gemini documents.

> Proposal: Response body size

There is a section in the FAQ specifically on why Gemini does not have this. tl;dr: it's not sensible for small Gemini documents, and for larger files you should use something else, e.g. torrent, IPFS, etc.

> section 2.11 of the FAQ
> ...
> It is true that the inability for clients to tell users how much more of
> a large file still has to be downloaded and to estimate how long this may
> take means Gemini cannot provide a very user-friendly experience for large
> file downloads. However, this would be the case even if Content-length were
> specified, as such an experience would also require other complications to be
> added to the protocol e.g. the ability to resume interrupted downloads.
> Gemini documents can of course straightforwardly link to resources hosted via
> HTTPS, BitTorrent, IPFS, DAT, etc. and this may be the best option for
> very large files.

[FAQ]: gemini://gemini.circumlunar.space/docs/faq.gmi
On Tue, 10 Nov 2020 04:30:51 +0000
trevor at rocketnine.space wrote:

> Proposal: Cache duration
> Proposal: Response body size

As suggested, these basically reinvent HTTP headers in a way that's not easily distinguishable from MIME parameters, OR add exactly the kind of global MIME parameters that RFC 2045 prohibits:

> There are NO globally-meaningful parameters that apply to all media
> types. Truly global mechanisms are best addressed, in the MIME
> model, by the definition of additional Content-* header fields.

It's clean in implementation (old clients should be forwards compatible insofar as they ignore MIME parameters that they don't know, as mandated by RFC 2045), but it still doesn't conform to RFC 2045. At best it strikes me as a hacky solution to the problem, and at worst a potentially balkanizing vector for protocol extensions.

-- 
Philip
trevor at rocketnine.space writes:

> 20 text/gemini; charset=utf-8; cache=604800
> 20 text/gemini; charset=utf-8; size=1108

I think we've already discussed abusing the MIME type to hold extra "headers", and the idea has been generally rejected.

-- 
+-----------------------------------------------------------+
| Jason F. McBrayer                 jmcbray at carcosa.net  |
| A flower falls, even though we love it; and a weed grows, |
| even though we do not love it.                 -- Dogen   |
+-----------------------------------------------------------+
November 10, 2020 7:49 AM, "Johann Galle" <johann.galle at protonmail.com> wrote:

> Both of these proposals sound very much like introducing MIME type parameters
> as Gemini's version of headers, which is something to be avoided if I read
> the [FAQ] correctly.
>
>> Proposal: Cache duration
>
> This has been discussed at length in this mailing list and I personally do
> not see a convincing case as to why this is necessary with typical small
> gemini documents.
>
>> Proposal: Response body size
>
> There is a section in the FAQ specifically on why gemini does not have this:
>
> tl;dr: It's not sensible for the small gemini documents, for larger files
> you should use something else, e.g. torrent, IPFS etc.
>
> - SNIP -
>
> [FAQ]: gemini://gemini.circumlunar.space/docs/faq.gmi

I keep hearing this argument and I don't agree with it one bit. (I'm not necessarily disagreeing with *you*, Johann, just in general.)

One design criterion of Gemini is generality. As the FAQ puts it:

> But, just like HTTP can be, and is, used for much, much more than serving
> HTML, Gemini should be able to be used for as many other purposes as
> possible without compromising the simplicity and privacy criteria above.

It's *possible* to serve large files over Gemini. I can make a script that generates a large image (or I can just have a large image to serve). Therefore, you should be able to use Gemini to serve large files. The main draw I can see for a Content-Length header is being able to see if the whole file comes across the wire. The ability to resume an interrupted download need not exist; just retry the request if all of the content doesn't get across.

That being said, I am wary of making it even a SHOULD. If the file needs it (i.e., it's a big file that could plausibly get interrupted mid-download), then go ahead, but those of us who don't care about Content-Length or cache duration shouldn't feel "pressured" to do it.
Just my two cents, Robert "khuxkm" Miles (I should really set up an identity in my email client so my name shows up on the mailing list archive next to my email)
On Tue, Nov 10, 2020 at 02:17:13PM +0000, khuxkm at tilde.team wrote:

> It's *possible* to serve large files over Gemini. I can make a script that
> generates a large image (or I can just have a large image to serve).
> Therefore, you should be able to use Gemini to serve large files. The main
> draw I can see for a Content-Length header is being able to see if the
> whole file comes across the wire. The ability to resume an interrupted
> download need not exist; just retry the request if all of the content
> doesn't get across.
>
> That being said, I am wary of making it even a SHOULD. If the file needs it
> (i.e; it's a big file that probably could get interrupted mid-download),
> then go ahead, but those of us who don't care about Content-Length or Cache
> duration shouldn't feel "pressured" to do it.

I'm far from a TLS expert, so I might be completely wrong here, but can't the client rely on the server's TLS close_notify signal to decide if the download was interrupted? As far as I can tell, the entire point of close_notify is to guard against data truncation...?

bie
November 10, 2020 9:34 AM, "bie" <bie at 202x.moe> wrote:

> I'm far from a TLS expert, so I might be completely wrong here, but
> can't the client rely on the server's TLS close_notify signal to decide if
> the download was interrupted? As far as I can tell, the entire point of
> close_notify is to guard against data truncation...?
>
> bie

I mean, yes, that's the point of close_notify, but what if the connection dies *after* the data is sent but *before* the close_notify can be sent? Then it turns into an unsolvable Two Generals problem.

I'm not *advocating* for either proposal; I'm just sick and tired of people acting like "Gemini is meant for small text files" is a good excuse to ignore proposals that could help UX when downloading larger files.

Just my two cents,
Robert "khuxkm" Miles
On Tuesday, 10 November 2020 15:17, <khuxkm at tilde.team> wrote:

> It's possible to serve large files over Gemini.

Yes, I agree that it is possible, but I think it is not desirable to do so. Just because you can does not mean you should.

> Therefore, you should be able to use Gemini to serve large files.

Did you not just say that it is merely possible?

> The main draw I can see for a Content-Length header is being able to see if
> the whole file comes across the wire.

This is really not something Gemini should be worried about; that is the job of TLS, as bie already pointed out.

> The ability to resume an interrupted download need not exist; just retry
> the request if all of the content doesn't get across.

I agree with you here too. I think the FAQ might be a bit quick to jump to this conclusion.

> That being said, I am wary of making it even a SHOULD.
> If the file needs it ...

... you should probably use another protocol which is better suited to transferring files. May I suggest FTP ;)
November 10, 2020 10:18 AM, "Johann Galle" <johann.galle at protonmail.com> wrote:

> On Tuesday, 10 November 2020 15:17, <khuxkm at tilde.team> wrote:
>
> - snip -
>
>> The main draw I can see for a Content-Length header is being able to see
>> if the whole file comes across the wire.
>
> This is really not something Gemini should be worried about; that is the
> job of TLS, as bie already pointed out.

And as I pointed out, relying on a close_notify to determine whether the entire response was received becomes a Two Generals problem.

>> That being said, I am wary of making it even a SHOULD.
>> If the file needs it ...
>
> ... you should probably use another protocol which is better suited for
> transferring files, may I suggest FTP ;)

For the last time, "just use another protocol" isn't a valid excuse. The sentence I quoted from the FAQ doesn't say "Gemini should be used for the things we think it should be used for and nothing more". It says that Gemini should be used for as many things as possible while still keeping simplicity and privacy as its guiding principles. You can serve large files over Gemini, so if there's a way to make downloading large files over Gemini better for the client without massively complicating things, we should at the very least consider it.

If I need to distribute plain text, should I use a protocol which is better suited to transferring plain text? May I suggest Gopher? ;)

Just my two cents,
Robert "khuxkm" Miles
On Tue, 10 Nov 2020 15:11:54 +0000
"Robert \"khuxkm\" Miles" <khuxkm at tilde.team> wrote:

> I mean, yes, that's the point of close_notify, but what if the connection
> dies *after* the data is sent but *before* the close_notify can be sent?
> Then it turns into an unsolvable Two Generals problem.

Then you don't know that you received the complete data. You can safely apply the same strategy as you would if you *did* know that you didn't receive the complete data. It's only a problem if you make it one.

-- 
Philip
November 10, 2020 10:29 AM, "Philip Linde" <linde.philip at gmail.com> wrote:

> Then you don't know that you received the complete data. You can
> safely apply the same strategy as you would if you *did* know that you
> didn't receive the complete data. It's only a problem if you make it
> one.

But then this goes back to the whole discussion we had about caching: sure, the overhead of a TLS connection is low, but it's not zero. If I know for a fact that the response I got has as many bytes as I was told it would, then even in the absence of a close_notify I know I have the whole response and can be sure of that. (If I get a close_notify while not having that many bytes, I can just assume the server either lied or something else broke, and I can ask my user what to do.)

Just my two cents,
Robert "khuxkm" Miles
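[The decision logic argued for here, that a declared size lets a client classify a finished transfer even when close_notify never arrives, could be sketched as a small function. The function name and return values are hypothetical; only the reasoning comes from the thread.]

```python
def transfer_status(received, declared_size=None, got_close_notify=False):
    """Classify a finished transfer. With a declared size, a byte-count
    match is decisive even without close_notify; without a declared size,
    only a clean close_notify lets us call the transfer complete."""
    if declared_size is not None:
        if len(received) == declared_size:
            return "complete"
        if got_close_notify:
            # Clean close but wrong length: server lied or something broke.
            return "size-mismatch"
        return "truncated"
    return "complete" if got_close_notify else "unknown"
```

Note the `"unknown"` branch is exactly the Two Generals case discussed above: without a declared size, a missing close_notify leaves the client unable to distinguish a complete transfer from a truncated one.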
On Tue, 10 Nov 2020 15:36:01 +0000
"Robert \"khuxkm\" Miles" <khuxkm at tilde.team> wrote:

> But then this goes back to the whole discussion we had about caching: sure,
> the overhead of a TLS connection is low, but it's not zero. If I know for a
> fact that the response I got has as many bytes as I was told I had, then
> even in the absence of a close_notify, I know for a fact I have the whole
> response and can be sure in that.

Sure, but as an argument for inclusion this is not compelling enough to warrant the proposed change, IMO. Knowing for sure that you don't know whether you have received a full response is almost as good, in this context, as knowing for sure that you have. It's only in the unlikely edge case where a server wants to send close_notify but the client fails to receive it (and only it) that this saves an additional request, and considering the hacky piggybacking on MIME parameters the proposal entails (which RFC 2045, in the most forgiving interpretation, at least advises against), it's a small win.

A more compelling argument, IMO, is that knowing the expected size in advance lets you support progress bars and time estimates. Another compelling argument might be that a lot of servers apparently ignore the fact that they *have to* send close_notify, which seems like par for the course for web servers too.

-- 
Philip
bie <bie at 202x.moe> writes:

> I'm far from a TLS expert, so I might be completely wrong here, but
> can't the client rely on the server's TLS close_notify signal to decide if
> the download was interrupted? As far as I can tell, the entire point of
> close_notify is to guard against data truncation...?
>
> bie

That's right. Here's a summary of the issues and solutions proposed on the mailing list over the past couple of weeks regarding the various content-size and content-hash response header proposals:

1. What about caching?

   This should either be performed by clients for links visited within a single session or not at all. If a client performs caching, it should provide some way to signal that you want to clear the current page from the cache and reload it.

2. How do I know if I got the entire response body?

   Your client will receive a TLS close_notify message when the server has finished sending the response. If you don't get one, the connection was broken. Retry your request or don't. It's up to you.

3. What if I'm impatient and am prone to canceling requests that take a long time?

   Outside of network latency issues or buggy servers, this should really only happen when requesting large files. Content authors should consider including the file size for such files in their link descriptions, so the user will know roughly how long they might have to wait.

   => /foo.mp3 foo.mp3 (14 MiB)

4. Okay, the link told me it was a big file, so I waited long enough for it to finish, and I know I got the whole file because I received a TLS close_notify message... but how do I know I got all the intended bytes?

   If content validation is desirable, authors should provide a content hash in the link description or provide a separate link to a file containing the content hash (e.g., foo.mp3.md5 or foo.mp3.sha256):

   => /foo.mp3 foo.mp3 (14 MiB) SHA256:e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855

5. Now can we add my proposal for sending content-size, content-hash, and cache-duration headers to Gemini responses?

   See 1-4 above.

That's all, folks!
  Gary

-- 
GPG Key ID: 7BC158ED
Use `gpg --search-keys lambdatronic' to find me
Protect yourself from surveillance: https://emailselfdefense.fsf.org
=======================================================================
() ascii ribbon campaign - against html e-mail
/\ www.asciiribbon.org - against proprietary attachments

Why is HTML email a security nightmare? See https://useplaintext.email/

Please avoid sending me MS-Office attachments.
See http://www.gnu.org/philosophy/no-word-attachments.html
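[A client honoring the link-description convention above, "=> /foo.mp3 foo.mp3 (14 MiB) SHA256:<hex>", might verify a download roughly as follows. The regex and function names are hypothetical; only the link-line format comes from the message.]

```python
import hashlib
import re

# Matches an uppercase "SHA256:" label followed by 64 hex digits,
# as in the link-description convention suggested above.
HASH_RE = re.compile(r"SHA256:([0-9a-fA-F]{64})")

def expected_sha256(link_description):
    """Return the advertised SHA-256 hex digest, or None if absent."""
    m = HASH_RE.search(link_description)
    return m.group(1).lower() if m else None

def verify(body, link_description):
    """Check a downloaded body against the advertised hash.
    Links without a hash are trivially accepted."""
    want = expected_sha256(link_description)
    return want is None or hashlib.sha256(body).hexdigest() == want
```

(Conveniently, the example digest in the message above is the SHA-256 of the empty string, so it verifies an empty body.)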
On 11/10/20 10:33 AM, Gary Johnson wrote:

> That's right. Here's a summary of the issues and solutions proposed on
> the mailing list over the past couple of weeks regarding various
> content-size and content-hash response header proposals:
>
> - SNIP -
>
> If content validation is desirable, authors should provide a content
> hash in the link description or provide a separate link to a file
> containing the content hash (e.g., foo.mp3.md5 or foo.mp3.sha256):
>
> => /foo.mp3 foo.mp3 (14 MiB) SHA256:e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855

To further reduce the opportunity for undetected data corruption, clients could also keep an in-memory hash of the received data and compare this to a hash of the stored file.

> 5. Now can we add my proposal for sending content-size, content-hash,
>    and cache-duration headers to Gemini responses?
>
> See 1-4 above.
>
> That's all, folks!
>   Gary
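[The in-memory hash check suggested here could be sketched as below: hash chunks as they arrive off the wire, write them to disk, then re-hash the stored file and compare. The function name and chunk handling are illustrative, not from the thread.]

```python
import hashlib

def download_and_check(chunks, path):
    """Write received chunks to `path`, hashing them in memory as they
    arrive, then re-read the stored file and compare digests. A mismatch
    would indicate corruption between the wire and the disk."""
    in_memory = hashlib.sha256()
    with open(path, "wb") as f:
        for chunk in chunks:          # chunks as received from the server
            in_memory.update(chunk)
            f.write(chunk)
    with open(path, "rb") as f:
        on_disk = hashlib.sha256(f.read())
    return in_memory.hexdigest() == on_disk.hexdigest()
```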
On Tue, 10 Nov 2020 14:17:13 +0000
khuxkm at tilde.team wrote:

> I keep hearing this argument and I don't agree with it one bit. (I'm
> not necessarily disagreeing with *you*, Johann, just in general.)
>
> One design criterion of Gemini is generality. As the FAQ puts it:
>
> > But, just like HTTP can be, and is, used for much, much more than
> > serving HTML, Gemini should be able to be used for as many other
> > purposes as possible without compromising the simplicity and
> > privacy criteria above.
>
> It's *possible* to serve large files over Gemini. I can make a script
> that generates a large image (or I can just have a large image to
> serve). Therefore, you should be able to use Gemini to serve large
> files. The main draw I can see for a Content-Length header is being
> able to see if the whole file comes across the wire. The ability to
> resume an interrupted download need not exist; just retry the request
> if all of the content doesn't get across.

Just because it is possible to do this within the protocol doesn't mean the protocol should accommodate every use case. Your use case is possible, but not recommended: the protocol is designed for serving small text documents. I have said this a million times; you just skip past it and continue your push for "solutions" to non-existent problems in Gemini.

If you believe the protocol is limited, THAT IS BY DESIGN. If it gets extended to allow the frequent serving of large files, then nothing is stopping anyone from developing a new markup language for Gemini that serves stylesheets and whatnot. I know this is against the Gemini philosophy, but if it is, then the protocol should not provide the means to make it possible.

You have to appreciate that this community gives each other a slap on the wrist when someone's proposal is not acceptable. For example, look at my recent suggestion to add escaped lines to gemtext: it got rejected with valid criticism, it was explained to me why it was a bad idea, so I backed off and agreed. I'm thankful for that. Yet you refuse to address the common concerns brought up against your proposal; it seems that every time the mailing list moves on, you come back advocating for this feature again and again, and you get the exact same replies... move on.

Let me quote Arav K. for you, as he brought up the same points I raised previously that you seem to have ignored anyway:

On Tue, 10 Nov 2020 13:46:30 +0100
"Arav K." <nothien at uber.space> wrote:

> We just had massive discussions about both of these on the ML. I'll
> provide you with some of my thoughts and consensuses(?) from the
> conversations in the ML:
>
> > Caching
> > Content-Size / Content-Length
>
> Both of these proposals and the general idea around them focuses on
> the transmission of large files. Caching is geared towards reducing
> request times, and content-size/length aims to provide progress bars
> and size verification. Content hashing, which you've not discussed,
> is also toward content verification.
>
> Gemini is focused on serving small textual files. They clearly do not
> need caching. Clients that do wish to cache should only cache
> backward and forward in history (i.e. pages that the user had visited
> in the current session), and should provide a reload button at least
> for these pages so that they can be refreshed when needed. In
> addition, large files are not expected to be requested repeatedly (at
> least, Gemini is not the protocol to do it in), so they do not need
> to be cached.
>
> Small text files are also small enough that their size
> does not need to be known (as they load quickly enough to remove the
> need for progress bars) and they can easily be verified by hand just
> by reading them (the underlying connection is already over TLS, so
> receiving bad data is hard enough as it is).
>
> Gemini is just not good at large transmissions, and this is
> intentional. If you need to do large transmissions, use other
> protocols that already support content-size (and perhaps
> content-hash), as they are geared towards such transmissions and will
> support it better.