Archived View for gemi.dev › gemini-mailing-list › 000439.gmi, captured on 2024-08-31 at 17:05:59. Gemini links have been rewritten to link to archived content.
-=-=-=-=-=-=-
Hello everyone!

There's been some light discussion on the gemini irc channel about if and how clients should cache gemini responses, and I'd love to hear how people here think about the issue...

Personally I feel like caching adds a ton of complexity, so the default behavior should be not to cache anything. There's no way to know if the response from a server is the product of a dynamic CGI script or a static file, and trying to guess at what's what will definitely introduce some weird behavior. Just to give some examples - on my personal site I have a script that returns a random jpeg, a guestbook and a simple gemlog, none of which should be cached!

Now, I don't see the need for caching at all when it comes to gemini, but if it's something the gemini community wants or needs then I'd like to suggest a 2x status code that lets a client know that the response is unlikely to change and thus cacheable. This could even kind of match how the temporary/permanent redirect status codes are defined:

20 - SUCCESS
21 - "PERMANENT" SUCCESS (Feel free to cache this)

This allows simple clients to keep doing what they're doing now (send a request, show the response) and allows more complex clients to play around with caching without breaking compatibility with what we already have.

Another suggestion made on irc was to do the opposite and create a "do not cache" 2x status code, but I'm not really sold on that. Caching introduces a lot of complexity even without a separate status code (how should query strings be handled, 1x responses, redirects?) and these issues don't go away if you add a "do not cache" code.

Any thoughts...?

bie
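A minimal sketch of how a client might treat the proposed codes, assuming the usual `<STATUS><SPACE><META>` response header. Status 21 is only the proposal from this message, not part of the Gemini spec, and the names `parse_header`, `is_success` and `may_cache` are hypothetical.

```python
from typing import Tuple


def parse_header(line: str) -> Tuple[int, str]:
    """Split a Gemini response header into (status, meta)."""
    status, _, meta = line.rstrip("\r\n").partition(" ")
    if len(status) != 2 or not (status.isascii() and status.isdigit()):
        raise ValueError(f"malformed header: {line!r}")
    return int(status), meta


def is_success(status: int) -> bool:
    # A simple client only looks at the first digit, so 20 and 21 look alike
    return status // 10 == 2


def may_cache(status: int) -> bool:
    # Hypothetical: 21 is the proposed "permanent success" code
    return status == 21
```

A client that never calls `may_cache` behaves exactly as it does today, which is the backwards-compatibility argument made above.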
Instead of "permanent" caching (what is permanent?) I am thinking about using timestamps.

So, for a requested resource, (if it is available) return a timestamp which denotes when the resource was last changed.

When requesting the resource again, send that timestamp with it and the server checks if the cache is stale or not and responds accordingly (either "resource is modified after $(old_timestamp) so here is the new resource and it was modified on $(new_timestamp)" or "the resource was not changed after $(timestamp)").

But the problem is, where does the timestamp go in the request and response?

~smlckz
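The exchange described above can be sketched as plain decision logic. This is an illustration only: Gemini has no field for a timestamp in either direction (the open question in this message), so `revalidate` and its return values are hypothetical and say nothing about the wire format.

```python
from typing import Optional, Tuple


def revalidate(last_modified: int, body: bytes,
               client_timestamp: Optional[int]) -> Tuple[str, int, Optional[bytes]]:
    """Decide what a server would answer to a (hypothetical) conditional request.

    Timestamps are Unix seconds. Returns (verdict, timestamp, body or None).
    """
    if client_timestamp is not None and last_modified <= client_timestamp:
        # The client's copy is still current, so no body needs to be sent
        return ("not-modified", last_modified, None)
    # First request, or the resource changed after the client's copy was made
    return ("modified", last_modified, body)
```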
I *love* it when clients cache stuff, for performance and environmental reasons. But I think that's something that each individual client (or client dev, I suppose) should deal with in their own fashion and only if they want to. As mentioned before most files served on gemini:// are small text files, and it's likely to remain that way for the foreseeable future. I don't think caching is something that the protocol should care about or cater to specifically, as it adds complexity and ambiguity.

> Now, I don't see the need for caching at all when it comes to gemini, but if it's something the gemini community wants or needs then I'd like to suggest a 2x status code that lets a client know that the response is unlikely to change and thus cacheable. This could even kind of match how the temporary/permanent redirect status codes are defined:
>
> 20 - SUCCESS
> 21 - "PERMANENT" SUCCESS (Feel free to cache this)

Adding a response code like that is counterproductive, I believe, as there's really no way for a server to determine what sort of file or data is unlikely to change, and there's no way for a client to determine what a flag of "unlikely to change" means. Does it mean it won't change in the next hour? The next month?

As mentioned in previous discussions on caching and checksums, there's not really room in the protocol specification for the needed information unless you start misusing MIME type info. I've come around to thinking that the added complexity, ambiguity, and possibly breaking changes far outweigh the added benefit.

On a side note I am (somewhat slowly) playing around with python3/Tkinter to make a client, and my current thinking for the design of that is that when the user clicks a link that returns an image I'll display it inline and cache it (based on its path) *forever*, though it will be clearly marked as a cached resource in the UI somehow as of yet undetermined. As for text/gemini I don't see a reason to cache them for more than very short periods: they're too small and frequently changed. No changes are needed in the protocol for this sort of behaviour, and I'm not really convinced it's a behaviour most clients should adopt either. Client-side caching is something that users have strong opinions about.
For me personally, that brings the protocol a bit too close to the complexity of HTTP, but I'd be curious to hear what the use-cases for something like that would be...?

When I used the word "permanent" it was just to draw a comparison to the 31 status code (REDIRECT - PERMANENT). The idea is just to let the client know that if it wants to cache the response for a session or however long it wants to... it's unlikely to cause any issues.

bie

On Fri, Nov 06, 2020 at 12:57:11PM +0530, Sudipto Mallick wrote:
> Instead of "permanent" caching (what is permanent?) I am thinking about using timestamps.
>
> So, for a requested resource, (if it is available) return a timestamp which denotes when the resource was last changed.
>
> When requesting the resource again, send that timestamp with it and the server checks if the cache is stale or not and responds accordingly (either "resource is modified after $(old_timestamp) so here is the new resource and it was modified on $(new_timestamp)" or "the resource was not changed after $(timestamp)").
>
> But the problem is, where does the timestamp go in the request and response?
>
> ~smlckz
> Adding a response code like that is counterproductive, I believe, as there's really no way for a server to determine what sort of file or data is unlikely to change, and there's no way for a client to determine what a flag of "unlikely to change" means. Does it mean it won't change in the next hour? The next month?

If there's no way for a server to determine what's unlikely to change then there's *definitely* no way for a client to know, in which case caching the response is just plain bad-mannered.

> On a side note I am (somewhat slowly) playing around with python3/Tkinter to make a client, and my current thinking for the design of that is that when the user clicks a link that returns an image I'll display it inline and cache it (based on its path) *forever*, though it will be clearly marked as a cached resource in the UI somehow as of yet undetermined. As for text/gemini I don't see a reason to cache them for more than very short periods: they're too small and frequently changed. No changes are needed in the protocol for this sort of behaviour, and I'm not really convinced it's a behaviour most clients should adopt either. Client-side caching is something that users have strong opinions about.

I definitely have strong opinions about this, yeah - this would, like I mentioned earlier, mean that my script returning a random photo wouldn't work in your client, and that a lot of the tools and toys I was hoping to use the gemini protocol for are probably suited for something else, at least if the consensus among the gemini community is that arbitrary caching is fine.

bie
> I definitely have strong opinions about this, yeah - this would, like I mentioned earlier, mean that my script returning a random photo wouldn't work in your client, and that a lot of the tools and toys I was hoping to use the gemini protocol for are probably suited for something else, at least if the consensus among the gemini community is that arbitrary caching is fine.

I wouldn't worry about that if I were you -- I'm pretty sure I'm the only one on the planet dumb enough to cache like this :D

There's a fair chance I'll change that stance when my browser is ready enough to be tested.
On Fri, 6 Nov 2020 12:57:11 +0530 Sudipto Mallick <smallick.dev at gmail.com> wrote:
> But the problem is, where does the timestamp go in the request and response?

I think that's the advantage of bie's suggested solution. It doesn't require any breaking changes, and a client that doesn't recognize the difference between codes 20 and 21 will still be fully compatible with a server that does.

There can be different codes roughly representing different cache lifetimes. "PERMANENT" for things that should stick on the disk until the user (or user configured policy) removes them. "SESSION" for things that stick for the lifetime of a browsing session.

A solution modelled on HTTP HEAD and Last-Modified also provides little advantage for the smaller documents people typically serve on Gemini. A lot of overhead exists in TLS negotiation, so one request is almost certainly better than two for small blog posts or articles.

--
Philip
On Fri, 6 Nov 2020 08:43:39 +0100 Björn Wärmedal <bjorn.warmedal at gmail.com> wrote:
> Adding a response code like that is counterproductive, I believe, as there's really no way for a server to determine what sort of file or data is unlikely to change, and there's no way for a client to determine what a flag of "unlikely to change" means. Does it mean it won't change in the next hour? The next month?

Why does the server need to determine this? It should be up to the server admin to determine it. If I have some file I want to serve that I anticipate will never change, I configure the server to respond to requests for it with code 21. The client can take this to mean a week, a day or forever depending on how sure the user wants to be that the information is current. The client could override this behavior by allowing the user to force a cache entry to be purged.

> As mentioned in previous discussions on caching and checksums, there's not really room in the protocol specification for the needed information unless you start misusing MIME type info.

This doesn't at all address the suggested solution, which there *is* room for in the protocol. No need to misuse MIME type info. No need for breaking changes to the specification. As suggested, this is entirely backwards-compatible, older clients and servers are entirely forwards-compatible and the change to the spec would be entirely additive.

--
Philip
On Fri, 6 Nov 2020 14:52:17 +0900 bie <bie at 202x.moe> wrote:
> Another suggestion made on irc was to do the opposite and create a "do not cache" 2x status code, but I'm not really sold on that. Caching introduces a lot of complexity even without a separate status code (how should query strings be handled, 1x responses, redirects?) and these issues don't go away if you add a "do not cache" code.
>
> Any thoughts...?

My own (end-user facing) client is a browser plugin and I inherit the caching policy from it. Practically, this means for me that everything is cached throughout the browsing session. An entry can be purged from the cache by a "force reload" that the user can issue. I find this to be a good all round policy for documents. For CGI I usually have to issue a force reload.

There could be three 2x codes:

20 (unspecified caching; unchanged from current spec)
21 ("permanent" caching)
22 (no caching)

A server that knows 21 and 22 can use them as appropriate on documents that will never change and will always change respectively. A client that doesn't understand them will still behave well because 2x is still 2x.

Then there's the question of whether this is something I've missed as a user and implementer. Not really? I suppose being able to specify "no caching" in particular is useful for CGI, but I spend most of my time on Gemini reading documents that rarely change.

--
Philip
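The session-cache-plus-force-reload policy and the three proposed codes can be combined in a short sketch. Everything here is an assumption for illustration: codes 21 and 22 are proposals from this thread rather than part of the spec, and `SessionCache`, its `get` method and the injected `fetch` callable are hypothetical names.

```python
from typing import Callable, Dict, Tuple

Response = Tuple[int, str, bytes]  # (status, meta, body)


class SessionCache:
    """In-memory cache that lives exactly as long as the client process."""

    def __init__(self, fetch: Callable[[str], Response]) -> None:
        self._fetch = fetch
        self._entries: Dict[str, Response] = {}

    def get(self, url: str, force_reload: bool = False) -> Response:
        if not force_reload and url in self._entries:
            return self._entries[url]
        response = self._fetch(url)
        status = response[0]
        if status in (20, 21):
            # 20: unspecified, this client chooses to cache; 21: proposed "permanent"
            self._entries[url] = response
        else:
            # 22 (proposed "no caching") and every non-2x status are never stored
            self._entries.pop(url, None)
        return response
```

With a policy like this, a CGI endpoint answering 22 is fetched on every visit, while a page answering plain 20 is fetched once per session unless the user forces a reload.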
On Fri, Nov 6, 2020 at 5:11 AM Philip Linde <linde.philip at gmail.com> wrote:
> I think that's the advantage of bie's suggested solution. It doesn't require any breaking changes, and a client that doesn't recognize the difference between codes 20 and 21 will still be fully compatible with a server that does.

I agree, except that I am in favor of code 22 meaning "It is inadvisable to cache this", on the assumption that most Gemini documents are static and will continue to be so. Even on the Web, most documents are static. If there is to be just one new code, better it should be 22. If people feel strongly about 21, then both 21 and 22.

> There can be different codes roughly representing different cache lifetimes. "PERMANENT" for things that should stick on the disk until the user (or user configured policy) removes them. "SESSION" for things that stick for the lifetime of a browsing session.

That seems to me too complex for a client to interpret. What is a browsing session, if I always keep my client running?

Note that "don't cache" can also be interpreted as "don't mirror", which is often an important point when dealing with living documents. There are an awful lot of broken mirrors of Wikipedia out there.

John Cowan http://vrici.lojban.org/~cowan cowan at ccil.org
But that, he realized, was a foolish thought; as no one knew better than he that the Wall had no other side.
--Arthur C. Clarke, "The Wall of Darkness"
On Fri, Nov 06, 2020 at 05:19:18PM -0500, John Cowan wrote:
> On Fri, Nov 6, 2020 at 5:11 AM Philip Linde <linde.philip at gmail.com> wrote:
>
> > I think that's the advantage of bie's suggested solution. It doesn't require any breaking changes, and a client that doesn't recognize the difference between codes 20 and 21 will still be fully compatible with a server that does.
>
> I agree, except that I am in favor of code 22 meaning "It is inadvisable to cache this", on the assumption that most Gemini documents are static and will continue to be so. Even on the Web, most documents are static. If there is to be just one new code, better it should be 22. If people feel strongly about 21, then both 21 and 22.

This reduces gemini to a simple file sharing protocol and basically says that dynamic content is out (unless only targeting advanced clients).

Ultimately, I like the gemini protocol just the way it is (and wouldn't be opposed to even a 1000 year feature freeze) but arbitrary caching by clients kills a whole host of use-cases around generated and dynamic responses.

bie
On Fri, Nov 6, 2020 at 10:48 PM bie <bie at 202x.moe> wrote:
> This reduces gemini to a simple file sharing protocol and basically says that dynamic content is out (unless only targeting advanced clients).

Here are my assumptions.

1) Clients are going to cache, like it or not. Some already do.

2) Servers are in the best position to say whether content is dynamic or not. "Dynamic" in this case is not just CGI-generated; it's also static files that change often. (I post a static file on the Web that is recomputed every ten minutes by a cron job.)

3) If the server can communicate "don't cache this", the client can provide a better UX.

> Ultimately, I like the gemini protocol just the way it is (and wouldn't be opposed to even a 1000 year feature freeze) but arbitrary caching by clients kills a whole host of use-cases around generated and dynamic responses.

That horse has sailed and that ship is out of the barn. "The world will go as it will, and not as you or I would have it."

John Cowan http://vrici.lojban.org/~cowan cowan at ccil.org
weirdo: When is R7RS coming out?
Riastradh: As soon as the top is a beautiful golden brown and if you stick a toothpick in it, the toothpick comes out dry.
On Fri, Nov 06, 2020 at 10:55:50PM -0500, John Cowan wrote:
> On Fri, Nov 6, 2020 at 10:48 PM bie <bie at 202x.moe> wrote:
>
> > This reduces gemini to a simple file sharing protocol and basically says that dynamic content is out (unless only targeting advanced clients).
>
> Here are my assumptions.
>
> 1) Clients are going to cache, like it or not. Some already do.
>
> 2) Servers are in the best position to say whether content is dynamic or not. "Dynamic" in this case is not just CGI-generated; it's also static files that change often. (I post a static file on the Web that is recomputed every ten minutes by a cron job.)
>
> 3) If the server can communicate "don't cache this", the client can provide a better UX.

1) Rude clients are going to cache, yes.

2/3) Totally agree, just not about what the default should be.

And just to keep this "grounded", here are some examples of stuff that arbitrary caching would break:

- Adventure games that keep state in the client cert and update the response not only on user input
- The URL on my personal gemini site that responds with a random photo
- Guestbooks
- Streaming content

Meanwhile, a default of not caching anything doesn't break anything. All it does is degrade the UX (minimally, IMO).

> > Ultimately, I like the gemini protocol just the way it is (and wouldn't be opposed to even a 1000 year feature freeze) but arbitrary caching by clients kills a whole host of use-cases around generated and dynamic responses.
>
> That horse has sailed and that ship is out of the barn. "The world will go as it will, and not as you or I would have it."

Well, yes, fair enough. But gemini is only a tiny part of the world. If the aesthetic sensibilities of the community turn out to conflict with mine it's easy to look away ;)

bie
> > Clients are going to cache, like it or not. Some already do.
> Rude clients are going to cache, yes.

I don't understand why a client caching responses is rude. Or rather, I don't understand who it is being rude to. When I configure my HTTP or Gemini browser to cache every response, is my browser now being rude to me? Is it being rude to the server?

How can something that causes less resource usage on the server be rude to the server, or something I configured or downloaded as a "client that has caching" be rude to me for using it?

Is not doing everything a server sends being rude to the server operator? If a server sends a 100000000x100000000 image, is my image viewer being rude for refusing to decode/display it?

--
Leo
On Sat, 07 Nov 2020 10:17:20 +0300 "Leo" <list at gkbrk.com> wrote:
> I don't understand why a client caching responses is rude. Or rather, I don't understand who it is being rude to. When I configure my HTTP or Gemini browser to cache every response, is my browser now being rude to me? Is it being rude to the server?

The rude thing here would be having to serve large files over Gemini and expect them to be served often, the protocol operates under the assumption that caching does not exist, it's by convention, this simplifies the client a LOT and removes such uncertainty when writing dynamic content for Gemini. Consider reading 2.1.1 in gemini://gemini.circumlunar.space/docs/faq.gmi

> How can something that causes less resource usage on the server be rude to the server, or something I configured or downloaded as a "client that has caching" be rude to me for using it?

Because the server operates under the assumption that content is not cached, if you're serving large files over Gemini you should look somewhere else, this is not bittorrent, and if your server is eating up a lot of resources, you're doing Gemini wrong, Gemini servers don't have to be complicated, that's your own problem. Consider using a connection queue and serving connections one by one instead of forking or multithreading because the protocol allows such simple design by closing the connection right after the transaction, it's not like in HTTP land where you have keep-alive.

> Is not doing everything a server sends being rude to the server operator? If a server sends a 100000000x100000000 image, is my image viewer being rude for refusing to decode/display it?

No, it's being sane, this does not apply here.

Does everyone here require a lecture on why their desired features aren't in the protocol yet? seems to be the common point of discussion here, as if the protocol is NOT ENOUGH, I don't know what brought your interest here, did you see it as a great way of avoiding the current scope creep of the modern web, or as a playground to satisfy your bad ideas?

Do you have anything else to help the community with? perhaps hosting content in the Gemini space? or helping in the development of tools interacting with the protocol? or is your interest just satisfied when the spec becomes ten times its current size, then you can MAYBE decide to use the protocol for yourself.

If any feature does not add a great value at an acceptable cost to the simplicity of the protocol, consider it rejected before even proposing it, I don't want to have a different experience browsing Gemini space using netcat than using Kristall.
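The one-connection-at-a-time design mentioned in this message can be sketched roughly as follows. This is a simplified illustration, not a working Gemini server: TLS is left out entirely (a real server would have to wrap each accepted socket, for example with Python's `ssl` module), the request is assumed to arrive in a single read, and `handle_request`, `serve_one_by_one` and the `pages` mapping are hypothetical names.

```python
import socket
from typing import Dict

MAX_REQUEST = 1026  # a URL of at most 1024 bytes, plus CRLF


def handle_request(request: bytes, pages: Dict[str, str]) -> bytes:
    """Build the complete response for one request line."""
    if not request.endswith(b"\r\n") or len(request) > MAX_REQUEST:
        return b"59 Bad request\r\n"
    url = request[:-2].decode("utf-8", errors="replace")
    page = pages.get(url)
    if page is None:
        return b"51 Not found\r\n"
    return b"20 text/gemini\r\n" + page.encode("utf-8")


def serve_one_by_one(listener: socket.socket, pages: Dict[str, str]) -> None:
    """Accept and finish one connection at a time: no forking, no threads."""
    while True:
        conn, _addr = listener.accept()  # other clients wait in the listen backlog
        with conn:
            request = conn.recv(MAX_REQUEST)  # sketch: assumes a single read suffices
            conn.sendall(handle_request(request, pages))
        # leaving the with-block closes the connection, which ends the transaction
```

Because every transaction ends with the connection closing, the loop holds no per-client state between requests; the cost, raised later in the thread, is that a slow client delays everyone queued behind it.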
On Sat, 7 Nov 2020, Ali Fardan wrote:
> Does everyone here require a lecture on why their desired features aren't in the protocol yet? seems to be the common point of discussion here, as if the protocol is NOT ENOUGH, I don't know what brought your interest here, did you see it as a great way of avoiding the current scope creep of the modern web, or as a playground to satisfy your bad ideas?

As I've said too many times: Gemini offends people. Don't let them see you being annoyed when they try to provoke you.

Mk
On Sat, Nov 7, 2020 at 11:44 AM Ali Fardan <raiz at stellarbound.space> wrote:
> The rude thing here would be having to serve large files over Gemini and expect them to be served often, the protocol operates under the assumption that caching does not exist

If clients aren't free to cache, then I'm not free to save a .gmi file on my file system. That's all a client-side cache is.

> Consider using a connection queue and serving connections one by one instead of forking or multithreading because the protocol allows such simple design by closing the connection right after the transaction, it's not like in HTTP land where you have keep-alive.

The advantage of not serving connections one by one is that it provides better service to clients on a heavily-used server. Right now there are no heavily-used servers, but there's nothing in the Gemini ethos that says "documents should only be of interest to a few". That's sheer elitism.

> Does everyone here require a lecture on why their desired features aren't in the protocol yet?

Client caching has nothing to do with the protocol. The idea of 22 is that authors (not servers) may want to advise clients against caching in a particular case.

> Do you have anything else to help the community with?

"I'm thinking! I'm thinking!" --Jack Benny during a holdup

> If any feature does not add a great value at an acceptable cost to the simplicity of the protocol, consider it rejected before even proposing it, I don't want to have a different experience browsing Gemini space using netcat than using Kristall.

If you are browsing with netcat, caching is not even an issue. If nobody wanted to serve dynamic content, 22 wouldn't be useful. It is handy for those who do want to, to communicate their intent. No client and no server has to implement this.

John Cowan http://vrici.lojban.org/~cowan cowan at ccil.org
A witness cannot give evidence of his age unless he can remember being born.
--Judge Blagden
November 7, 2020 11:43 AM, "Ali Fardan" <raiz at stellarbound.space> wrote:
> On Sat, 07 Nov 2020 10:17:20 +0300 "Leo" <list at gkbrk.com> wrote:
>
>> I don't understand why a client caching responses is rude. Or rather, I don't understand who it is being rude to. When I configure my HTTP or Gemini browser to cache every response, is my browser now being rude to me? Is it being rude to the server?
>
> The rude thing here would be having to serve large files over Gemini and expect them to be served often, the protocol operates under the assumption that caching does not exist, it's by convention, this simplifies the client a LOT and removes such uncertainty when writing dynamic content for Gemini. Consider reading 2.1.1 in gemini://gemini.circumlunar.space/docs/faq.gmi

Alright, bet. 2.1.1 in the FAQ says:

> Gemini aims to be simple, but not too simple. Gopher is simpler at a protocol level, but as a consequence the client is eternally uncertain: what character encoding is this text in? Is this text the intended content or an error message from the server? What kind of file is this binary data? Because of this, a robust Gopher client is made less simple by needing to infer or guess missing information. Early Gemini discussion included three clear goals with regard to simplicity:
>
> * It should be possible for somebody who had no part in designing the protocol to accurately hold the entire protocol spec in their head after reading a well-written description of it once or twice.
> * A basic but usable (not ultra-spartan) client should fit comfortably within 50 or so lines of code in a modern high-level language. Certainly not more than 100.
> * A client comfortable for daily use which implements every single protocol feature should be a feasible weekend programming project for a single developer.

Adding separate 2x status codes for "feel free to cache this" and "don't cache this" would follow:
> On Nov 5, 2020, at 9:52 PM, bie <bie at 202x.moe> wrote:
>
> Now, I don't see the need for caching at all when it comes to gemini, but if it's something the gemini community wants or needs then I'd like to suggest a 2x status code that lets a client know that the response is unlikely to change and thus cacheable. This could even kind of match how the temporary/permanent redirect status codes are defined:
>
> 20 - SUCCESS
> 21 - "PERMANENT" SUCCESS (Feel free to cache this)

Bit of a bikeshed comment, but "permanent success", especially the "permanent" part, makes me think of Cache-Control: immutable. That is, the server is claiming that the file will not change, ever. (Gemini doesn't have conditional revalidation.)

https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Cache-Control
I apologise if anyone got this email twice. I forgot to switch out my email address before submitting to the list. Following is my original reply.

Ok when I replied I thought you were going to argue about something ideological about how the content creator should get the final say in how their content is accessed, but it looks like you are arguing on a technical viewpoint. What you are saying does not make a lot of sense to me though.

> Rude clients are going to cache.

This is the statement that bothered me because it seemed very weird to consider a client "rude" for not making requests. So I asked about it for clarification.

> > I don't understand why a client caching responses is rude. Or rather, I don't understand who it is being rude to. When I configure my HTTP or Gemini browser to cache every response, is my browser now being rude to me? Is it being rude to the server?

> The rude thing here would be having to serve large files over Gemini and expect them to be served often,

But here you are saying that the rude thing here is having to serve large files over gemini and expecting them to be served often. This would be the server being rude, and clients caching or not caching resources would not have an impact on how large a server's responses are. If anything, it would result in less traffic and at least partially mitigate the issue.

If your server is being _rude_ and sending large responses, my client is not gonna make it worse by requesting it every single time. Your server might have a lot of bandwidth, my client does not.

> the protocol operates under the assumption that caching does not exist, it's by convention, this simplifies the client a LOT and removes such uncertainty when writing dynamic content for Gemini. Consider reading 2.1.1 in gemini://gemini.circumlunar.space/docs/faq.gmi

The protocol operates under the assumption that caching does not exist in order to make clients and servers simpler.
A client adding caching would NOT affect a server negatively. If you consider caching clients to be rude clients, too bad, you will never notice them because instead of making extra requests, they are making fewer. I suppose you can ask them to apologise for not making you spend money on compute/bandwidth etc.

> > How can something that causes less resource usage on the server be rude to the server, or something I configured or downloaded as a "client that has caching" be rude to me for using it?

> Because the server operates under the assumption that content is not cached, if you're serving large files over Gemini you should look somewhere else, this is not bittorrent, and if your server is eating up a lot of resources, you're doing Gemini wrong, Gemini servers don't have to be complicated, that's your own problem. Consider using a connection queue and serving connections one by one instead of forking or multithreading because the protocol allows such simple design by closing the connection right after the transaction, it's not like in HTTP land where you have keep-alive.

Again, you are talking about making servers more complicated. If my client does not cache and your server sends large responses, this is indeed a problem and you might have to make your server more complex to handle it. Caching clients would make simpler server code possible, as performance tricks would not be necessary. But even ignoring that, a client that caches responses would not make your server more complex.

> > Is not doing everything a server sends being rude to the server operator? If a server sends a 100000000x100000000 image, is my image viewer being rude for refusing to decode/display it?

> No, it's being sane, this does not apply here.
So it is being sane when a server says "display this monstrous image" and the client refuses, but it is insane when the server says "please download this resource that you downloaded 3 seconds ago" and the client says "I'll just use the response I have cached"?

> Does everyone here require a lecture on why their desired features aren't in the protocol yet? seems to be the common point of discussion here, as if the protocol is NOT ENOUGH, I don't know what brought your interest here, did you see it as a great way of avoiding the current scope creep of the modern web, or as a playground to satisfy your bad ideas?

The whole point of the discussion is that clients WILL cache resources. Gemini benefits a lot from having a wide range of different clients. You simply can't control everyone's use case to fit the way you think they should consume content. Just like some browsers/proxies cache HTTP and Gopher responses, some people who are bandwidth constrained or simply not wasteful will choose to cache their Gemini responses too.

The discussion throws out the idea of client HINTS that will say "This might be a dynamic response, maybe don't cache it". Neither the server nor the client needs to take advantage of this; a client can always cache or never cache, and a server can expect everything to be cached or not cached. Just like before.

I understand the desire to keep the spec short, and actually agree with people on this. I believe a new response code for this is not necessary. But a client that caches something is not being rude, at least not more than a client that offers to save the resource to the disk or to send it to a printer.

> Do you have anything else to help the community with? perhaps hosting content in the Gemini space? or helping in the development of tools interacting with the protocol? or is your interest just satisfied when the spec becomes ten times its current size, then you can MAYBE decide to use the protocol for yourself.
Disagreeing with someone is one thing, but you don't have to attack them like this. Before you shame other people, please enlighten us with the tangible value that you have provided to the greater Gemini community. > If any feature does not add a great value at an acceptable cost to the > simplicity of the protocol, consider it rejected before even proposing > it, I don't want to have a different experience browsing Gemini space > using netcat than using Kristall. I agree with this completely. -- Leo
On Sat, Nov 7, 2020 at 2:35 PM <khuxkm at tilde.team> wrote: > Again, your elitism is showing. I feel like adding a way to signal > whether or not you can safely cache a response isn't really sacrificing > any of the Gemini ethos: simplicity, privacy, or generality. > +1 > I fail to see how adding "safe to cache" and "do not cache" status codes > would make using netcat different from Kristall or any other client. Just > treat it as a 2x success status code, and move on. > In addition, if you click on a link saying "Random cool picture!", first of all that's safe (no pixel bug in it), and second, if you notice that it's the same picture as last time, you do a refresh, causing your client to override its cache, and you see a different picture. Everybody wins. John Cowan http://vrici.lojban.org/~cowan cowan at ccil.org There is no real going back. Though I may come to the Shire, it will not seem the same; for I shall not be the same. I am wounded with knife, sting, and tooth, and a long burden. Where shall I find rest? --Frodo
On Sat, 7 Nov 2020 13:42:57 -0500 John Cowan <cowan at ccil.org> wrote: > If clients aren't free to cache, then I'm not free to save a .gmi > file on my file system. That's all a client-side cache is. You're free to save your served documents, just don't operate under the assumption that they're permanent, that's how caching can break current assumptions. > The advantages of not serving connections one by one is that it > provides better service to clients on a heavily-used server. Right > now there are no heavily-used servers, but there's nothing in the > Gemini ethos that says "documents should only be of interest to a > few". That's sheer elitism. I'm not telling you how to implement your server, I just don't like raising the barrier to entry for server implementations, right now, I can use inetd to serve Gemini, if keeping a connection alive is a requirement in the future that wouldn't be the case. > Client caching has nothing to do with the protocol. The idea of 22 > is that authors (not servers) may want to advise clients against > caching in a particular case. How would gemtext author instruct the server not to cache their gemtext document? do we now resort to writing response headers in gemtext? is that the solution? Or suppose you allowed an undefined implementation detail to the server for allowing the distinction of static versus dynamic content, and every server goes their way of implementing it whether it be through configuration files or some other means, now 20 becomes obsolete because existing implementations of servers can't be sure that clients will not cache their dynamic content, brilliant. > If you are browsing with netcat, caching is not even an issue. If > nobody wanted to serve dynamic content, 22 wouldn't be useful. It is > handy for those who do want to, to communicate their intent. No > client and no server has to implement this. If 22 is explicit no caching response, how would 20 be redefined? 
Whenever something becomes optional, but de-facto implementations cover it, and the whole ecosystem operates under the assumption of its presence, the feature shifts from being optional to being a barrier to entry.
On Sat, 07 Nov 2020 19:35:02 +0000 khuxkm at tilde.team wrote: > Elitism much? Let me read you 2.1.3, from the very same FAQ you > quoted: > > > The "first class" application of Gemini is human consumption of > > predominantly written material - to facilitate something like > > gopherspace, or like "reasonable webspace" (e.g. something which is > > comfortably usable in Lynx or Dillo). But, just like HTTP can be, > > and is, used for much, much more than serving HTML, Gemini should > > be able to be used for as many other purposes as possible without > > compromising the simplicity and privacy criteria above. This means > > taking into account possible applications built around non-text > > files and non-human clients. > > I'm not sure how you read this section, but it seems to me like the > intent is to be able to serve large files if you can. Indeed, no arguments here, however, large files typically wouldn't need any caching, you download once and save to your disk, these include ISOs, FLACs, JPEGs and so on. And again: > The "first class" application of Gemini is human consumption of > predominantly written material Written material in gemtext wouldn't need caching because it's a few kilobytes. > > Does everyone here require a lecture on why their desired features > > aren't in the protocol yet? seems to be the common point of > > discussion here, as if the protocol is NOT ENOUGH, I don't know > > what brought your interest here, did you see it as a great way of > > avoiding the current scope creep of the modern web, or as a > > playground to satisfy your bad ideas? > > Again, your elitism is showing. I feel like adding a way to signal > whether or not you can safely cache a response isn't really > sacrificing any of the Gemini ethos: simplicity, privacy, or > generality. 
If you consider keeping the core philosophy (what I suppose it should be) from solutions to non-existent problems elitism, then have fun comforting each other and entertaining bad design choices. Once caching becomes standard, there's nothing stopping servers from serving large content just because major client implementations will cache it anyway, and this turns from optional to *if you don't support it, your client is limited*, what next? And I'm gonna say it again, if the protocol is centered around serving plain text documents, why would caching be a concern? seems to me that just invites extension and the possibility of generalizing the protocol the same way HTTP went.
On Sun, 08 Nov 2020 00:41:10 +0300 "Leo" <list at gkbrk.com> wrote: > This is the statement that bothered me because it seemed very weird to > consider a client "rude" for not making requests. So I asked about it > for clarification. Because as of currently, there's no way to indicate if content is dynamic or static. > But here you are saying that the rude thing here is having to serve > large files over gemini and expecting them to be served often. This > would be the server being rude, and clients caching or not caching > resources would not have an impact on how large a server's reponses > are. > > If anything, it would result in less traffic and at least partially > mitigate the issue. If your server is being _rude_ and sending large > responses, my client is not gonna make it worse by requesting it every > single time. Your server might have a lot of bandwidth, my client does > not. Gemini responses should typically be in the kilobytes range as they are plain text, unless you're serving MP3s or ISOs which are saved to the disk permanently, there's no point of having caching here. > The protocol operates under the assumption that caching does not exist > in order to make clients and servers simpler. A client adding caching > would NOT affect a server negatively. If you consider caching clients > to be rude clients, too bad, you will never notice them because > instead of making extra requests, they are making less. I suppose you > can ask them to apologise for not making you spend money on > compute/bandwidth etc. Again, the protocol by convention should not go in the direction of bittorrent, plain text documents aren't large. > Again, you are talking about making servers more complicated. If my > client does not cache and your server sends large responses, this is > indeed a problem and you might have to make your server more complex > to handle it. Caching clients would make simpler server code > possible, as performance tricks would not be necessary. 
But even > ignoring that, a client that caches responses would not make your > server more complex. They won't take burden off the server, instead, they add more, now the server have to tell the client which content to cache and which it should not. > So it is being sane when a server says "display this monstrous image" > and the client refuses, but it is insane when the server says "please > download this resource that you downloaded 3 seconds ago" and the > client says "I'll just use the response I have cached"? Resources are identified using URIs, that's all the client knows, how does it know that the resource haven't been altered since the last time it got accessed? the client could be showing an outdated document in the case of dynamic content. > The whole point of the discussion is that clients WILL cache > resources. Gemini benefits a lot from having a wide range of > different clients. You simply can't control everyone's use case to > fit the way you think they should consume content. Different use cases can be satisfied with different protocols. > Just like some browsers/proxies cache HTTP and Gopher responses, some > people who are bandwidth constrained or simply not wasteful will > choose to cache their Gemini responses too. The discussion throws out > the idea of client HINTS that will say "This might be a dynamic > response, maybe don't cache it". Neither the server nor the client > needs to take advantage of this, a client can always cache or never > cache, and a server can expect everything to be cached or not cached. > Just like before. Again, there shouldn't be a point where caching is mandatory if plain text documents are being served. > Disagreeing with someone is one thing, but you don't have to attack > them like this. I'm sorry you feel offended. > Before you shame other people, please enlighten us > with the tangible value that you have provided to the greater Gemini > community. Not proposing solutions to non-existent problems. 
> > If any feature does not add a great value at an acceptable cost to > > the simplicity of the protocol, consider it rejected before even > > proposing it, I don't want to have a different experience browsing > > Gemini space using netcat than using Kristall. > > I agree with this completely. Then there's nothing to discuss further.
Just wanted to pop in: November 7, 2020 8:15 PM, "Ali Fardan" <raiz at stellarbound.space> wrote: > On Sat, 7 Nov 2020 13:42:57 -0500 > John Cowan <cowan at ccil.org> wrote: > > -snip- > >> If you are browsing with netcat, caching is not even an issue. If >> nobody wanted to serve dynamic content, 22 wouldn't be useful. It is >> handy for those who do want to, to communicate their intent. No >> client and no server has to implement this. > > If 22 is explicit no caching response, how would 20 be redefined? 20 wouldn't be redefined. A status code of 20 would simply have no assumptions as to the cacheability of a resource (i.e; cache at your own risk). Meanwhile, 21 and 22 would be there for CGI, etc. that can return them. Just my two cents, Robert "khuxkm" Miles
It was thus said that the Great khuxkm at tilde.team once stated: > November 7, 2020 8:15 PM, "Ali Fardan" <raiz at stellarbound.space> wrote: > > On Sat, 7 Nov 2020 13:42:57 -0500 > > John Cowan <cowan at ccil.org> wrote: > > > >> If you are browsing with netcat, caching is not even an issue. If > >> nobody wanted to serve dynamic content, 22 wouldn't be useful. It is > >> handy for those who do want to, to communicate their intent. No > >> client and no server has to implement this. > > > > If 22 is explicit no caching response, how would 20 be redefined? > > 20 wouldn't be redefined. A status code of 20 would simply have no > assumptions as to the cacheability of a resource (i.e; cache at your own > risk). Meanwhile, 21 and 22 would be there for CGI, etc. that can return > them. Ah, clashing proposals! How wonderful! In another thread, not at all related to caching, we have prisonpotato at tilde.team who said: > This seems like a neat solution to this problem to me, but I'm not sure if > it would work at this stage of gemini's life cycle. There are also of > course the issues with dynamically sized responses as generated by CGI > scripts and stuff like that, so maybe we could introduce a new response > code, like 22: Response with size. > > 20 text/gemini > 22 100 text/gemini > > This solves both problems by making content length optional again, but > exposes a risk that this type of extension could be used to add more fields (gemini://gemi.dev/gemini-mailing-list/messages/003010.gmi) and John Cowan, who said this in this thread: > I agree, except that I am in favor of code 22 meaning "It is inadvisable > to cache this", on the assumption that most Gemini documents are static > and will continue to be so. Even on the Web, most documents are static. > If there is to be just one new code, better it should be 22. If people > feel strongly about 21, then both 21 and 22. So, which is it? Sizes? Or caching? 
Or I suppose we could do all the above:

20 status, no size
22 status, size
21 cache, no size
22 cache, size
23 no-cache, no size
24 no-cache, size

and before you know it:

20 status, no size, no future feature
22 status, size, no future feature
21 cache, no size, no future feature
22 cache, size, no future feature
23 no-cache, no size, no future feature
24 no-cache, size, no future feature
25 status, no size, future feature
26 status, size, future feature
27 cache, no size, future feature
28 cache, size, future feature
29 no-cache, no size, future feature
30 no-cache, size, future feature ... oh, wait a second ...

We're done out of status codes and crashing into the next block. It may seem silly to worry about future features now, but hey, the future comes eventually. Even *if* the size doesn't get its own status code, I think my argument stands---features can mix, and if they can mix, the number of status codes explodes:

20 status
21 cache
22 no-cache
23 status, future feature 1
24 cache, future feature 1
25 no-cache, future feature 1
26 status, no future feature 1, future feature 2
27 cache, no future feature 1, future feature 2
28 no-cache, no future feature 1, future feature 2
29 status, future feature 1, future feature 2
30 ... uh oh ...

I have my own ideas about caching, but I want to cobble up a proof-of-concept first before I talk about it more, because from where I come from, working code is worth more than talk. -spc
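The counting argument above can be checked with a few lines of arithmetic. This is only a sketch under a stated assumption: three caching states (plain, cache, no-cache) and every further feature modelled as an independent on/off flag that needs its own status code. The names `codes_needed` and `BLOCK_SIZE` are made up for this illustration.

```python
# Hypothetical illustration: count the 2x codes needed if every optional
# on/off feature gets its own status code for each caching state.

BLOCK_SIZE = 10  # a status block such as 20 through 29 holds ten codes


def codes_needed(cache_states: int, on_off_features: int) -> int:
    """Each independent on/off feature doubles the number of codes."""
    return cache_states * 2 ** on_off_features


if __name__ == "__main__":
    for features in range(4):
        needed = codes_needed(3, features)
        verdict = "fits" if needed <= BLOCK_SIZE else "overflows the block"
        print(f"{features} feature(s): {needed} codes, {verdict}")
```

Under that model, two extra features on top of the three caching states already ask for twelve codes, which is the point where the lists above spill into 30.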
November 8, 2020 12:18 AM, "Sean Conner" <sean at conman.org> wrote: > It was thus said that the Great khuxkm at tilde.team once stated: > >> November 7, 2020 8:15 PM, "Ali Fardan" <raiz at stellarbound.space> wrote: >> On Sat, 7 Nov 2020 13:42:57 -0500 >> John Cowan <cowan at ccil.org> wrote: >> >> If you are browsing with netcat, caching is not even an issue. If >> nobody wanted to serve dynamic content, 22 wouldn't be useful. It is >> handy for those who do want to, to communicate their intent. No >> client and no server has to implement this. >> >> If 22 is explicit no caching response, how would 20 be redefined? >> >> 20 wouldn't be redefined. A status code of 20 would simply have no >> assumptions as to the cacheability of a resource (i.e; cache at your own >> risk). Meanwhile, 21 and 22 would be there for CGI, etc. that can return >> them. > > Ah, clashing proposals! How wonderful! > > In another thread, not at all related to caching, we have prisonpotato at > tilde.team who said: > >> This seems like a neat solution to this problem to me, but I'm not sure if >> it would work at this stage of gemini's life cycle. There are also of >> course the issues with dynamically sized responses as generated by CGI >> scripts and stuff like that, so maybe we could introduce a new response >> code, like 22: Response with size. >> >> 20 text/gemini >> 22 100 text/gemini >> >> This solves both problems by making content length optional again, but >> exposes a risk that this type of extension could be used to add more fields > > (gemini://gemi.dev/gemini-mailing-list/messages/003010.gmi) > > and John Cowan, who said this in this thread: > >> I agree, except that I am in favor of code 22 meaning "It is inadvisable >> to cache this", on the assumption that most Gemini documents are static >> and will continue to be so. Even on the Web, most documents are static. >> If there is to be just one new code, better it should be 22. 
If people >> feel strongly about 21, then both 21 and 22. > > So, which is it? Sizes? Or caching? Or I suppose we could all the > above: > > 20 status, no size > 22 status, size > 21 cache, no size > 22 cache, size > 23 no-cache, no size > 24 no-cache, size > > and before you know it: > > 20 status, no size, no future feature > 22 status, size, no future feature > 21 cache, no size, no future feature > 22 cache, size, no future feature > 23 no-cache, no size, no future feature > 24 no-cache, size, no future feature > > 25 status, no size, future feature > 26 status, size, future feature > 27 cache, no size, future feature > 28 cache, size, future feature > 29 no-cache, no size, future feature > 30 no-cache, size, future feature ... oh, wait a second ... > > We're done out of status codes and crashing into the next block. It may > seem silly to worry about future feature now, but hey, the future comes > eventually. Even *if* the size doesn't get its own status code, I think my > argument stands---features can mix, and if they can mix, the number of > status code explodes: > > 20 status > 21 cache > 22 no-cache > 23 status, future feature 1 > 24 cache, future feature 1 > 25 no-cache, future feature 1 > 26 status, no future feature 1, future feature 2 > 27 cache, no future feature 1, future feature 2 > 28 no-cache, no future feature 1, future feature 2 > 29 status, future feature 1, future feature 2 > 30 ... uh oh ... > > I have my own ideas about caching, but I want to cobble up a > proof-of-concept first before I talk about it more, because from where I > come from, working code is worth more than talk. > > -spc This can't happen, though, because the first proposal breaks the compatibility of <META> in response codes within a block, and the second one is just debating which of the codes we should add. Cache/no-cache would be 2 (at most) response codes. That's all. I'm also going to try and put together some basic code to demonstrate how I think this should work. 
Maybe, then, it'll be a bit clearer. Just my two cents, Robert "khuxkm" Miles
It was thus said that the Great khuxkm at tilde.team once stated: > > This can't happen, though, because the first proposal breaks the > compatibility of <META> in response codes within a block, and the second > one is just debating which of the codes we should add. This *did* happen, back on November 3rd. gemini://gemi.dev/gemini-mailing-list/messages/003015.gmi And it's already received over a hundred hits. > Cache/no-cache would be 2 (at most) response codes. That's all. And my argument, even with the size response code removed, can *still* lead to a combinatoric explosion of response codes. Today it's just *two*, but what about tomorrow? -spc
>We're done out of status codes and crashing into the next block. It >may >seem silly to worry about future feature now, but hey, the future comes >eventually. Even *if* the size doesn't get its own status code, I >think my >argument stands---features can mix, and if they can mix, the number of >status code explodes: > from where >I >come from, working code is worth more than talk. I think there's still a long way for gemini clients to come before demanding the Future Inevitable Feature. Like, the more thought that goes into this, the less likely we'll run out of status codes. One way a client could implement caching is by loading the cached version of a page, and telling the user when it was downloaded, next to a 'refresh' button. A gemini client could even offer a diff view between the current page and the cached copy. All of that *empowers the users* over the servers, and doesn't rely on adding anything to the spec. A "please don't cache" response code would be (ab)used by servers who desire to track their users. -Zach
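The diff view suggested above needs nothing from the protocol. A minimal sketch, assuming the client keeps the cached body and the time it was fetched; the function name `diff_view` and its parameters are invented for this example, and the diff itself comes from Python's standard `difflib` module.

```python
import difflib
from datetime import datetime


def diff_view(cached: str, fetched_at: datetime, fresh: str) -> str:
    """Return a unified diff between the cached copy and a fresh fetch.

    An empty string means the page has not changed.
    """
    lines = difflib.unified_diff(
        cached.splitlines(),
        fresh.splitlines(),
        fromfile=f"cached ({fetched_at.isoformat()})",  # tells the user how old the copy is
        tofile="current",
        lineterm="",
    )
    return "\n".join(lines)
```

A client could show the cached page first, together with its fetch time, and only produce this diff when the user presses the refresh button.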
This is how I *think* clients *should* work with caching: Clients that keep a history and support going ''backwards'' and ''forwards'' through it should cache text/* responses in memory for that browsing session. When the user browses through the history using the ''forward'' and ''backward'' actions, no reloading should happen. But when a user clicks the link for a resource already in cache, writes the link by hand, selects the link from previously visited links, or asks for a reload, the cache is purged and the resource reloaded. It is assumed that requests with a query part are idempotent. Now, when a page is dynamic, it should be stated as such so that the user would reload that page. With that, no new response codes. ~smlckz
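The policy above can be sketched as a small in-memory structure. Everything here is an assumption made for illustration: the class name `SessionBrowser`, the `fetch` callback standing in for a real Gemini request, the simplification that every body is text/*, and the reading of "the cache is purged" as dropping only the entry for the requested URL.

```python
from typing import Callable, Dict, List, Optional


class SessionBrowser:
    """Per-session cache driven only by how the user navigates."""

    def __init__(self, fetch: Callable[[str], str]) -> None:
        self._fetch = fetch  # hypothetical: performs the real request
        self._cache: Dict[str, str] = {}
        self._history: List[str] = []
        self._pos = -1  # index of the current page in the history

    def _load(self, url: str) -> str:
        """Serve from cache, fetching only when there is no entry."""
        if url not in self._cache:
            self._cache[url] = self._fetch(url)
        return self._cache[url]

    def visit(self, url: str) -> str:
        """Clicked or typed link: drop the cached copy and refetch."""
        self._cache.pop(url, None)
        body = self._load(url)
        # A new visit discards any "forward" entries.
        self._history = self._history[: self._pos + 1] + [url]
        self._pos += 1
        return body

    def reload(self) -> Optional[str]:
        """Explicit reload of the current page."""
        if self._pos < 0:
            return None
        url = self._history[self._pos]
        self._cache.pop(url, None)
        return self._load(url)

    def back(self) -> Optional[str]:
        """History navigation never triggers a refetch of cached pages."""
        if self._pos <= 0:
            return None
        self._pos -= 1
        return self._load(self._history[self._pos])

    def forward(self) -> Optional[str]:
        if self._pos + 1 >= len(self._history):
            return None
        self._pos += 1
        return self._load(self._history[self._pos])
```

The cache lives only as long as the object does, which matches the "for that browsing session" wording and keeps the user in charge of when a page is fetched again.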
On November 8, 2020 12:59:09 AM EST, Sudipto Mallick <smallick.dev at gmail.com> wrote: >It is assumed that requests with query part ... That client-implementation could lead to strange uses of the query part... But if it's a result of a 10 or 11 response, it would make sense to make the request fresh. mbays at sdf.org is right, the user should be in control. A caching client could simply split links to cached resources into two parts: clicking one part would pull the cached version, clicking the other would make a new request. -Zach
November 8, 2020 12:52 AM, "Zach DeCook" <zachdecook at librem.one> wrote: >> We're done out of status codes and crashing into the next block. It >> may >> seem silly to worry about future feature now, but hey, the future comes >> eventually. Even *if* the size doesn't get its own status code, I >> think my >> argument stands---features can mix, and if they can mix, the number of >> status code explodes: >> >> from where >> I >> come from, working code is worth more than talk. > > I think there's still a long way for gemini clients to come before demanding the Future Inevitable > Feature. Like, the more thought that goes into this, the less likely we'll run out of status codes. > One way a client could implement caching is by loading the cached version of a page, and telling > the user when it was downloaded, next to a 'refresh' button. A gemini client could even offer a > diff view between the current page and the cached copy. All of that
November 8, 2020 12:18 AM, "Sean Conner" <sean at conman.org> wrote: > I have my own ideas about caching, but I want to cobble up a > proof-of-concept first before I talk about it more, because from where I > come from, working code is worth more than talk. > > -spc Speaking of cobbling up proof-of-concepts, I've created a proof-of-concept of how I feel these cache-hint success codes would work: https://gist.github.com/febd3f5ae2308e8b55449a92c6e58a65 (Yes, I know it's on GitHub, but I have a shell script to make Gists from the command line and so I want to use it.) This includes a spartan client (it literally just spits the raw protocol response out at you) with caching behavior influenced by 21/22 (in practice, it caches all 2x responses except for 22 responses), as well as examples of endpoints to hit that return each code. Hopefully my working code will prove my point better than I could in words. Just my two cents, Robert "khuxkm" Miles
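For readers who do not want to open the gist, the behaviour described in words above (cache every 2x response except 22) fits in a few lines. This is not the code from the gist, only a sketch of the stated rule; 21 and 22 are the proposed codes from this thread and are not part of the Gemini specification, and the helper names are invented here.

```python
from typing import Dict, Tuple


def parse_header(header: str) -> Tuple[int, str]:
    """Split a response header '<STATUS> <META>\\r\\n' into its two parts."""
    status, _, meta = header.rstrip("\r\n").partition(" ")
    return int(status), meta


def may_cache(status: int) -> bool:
    """Cache any 2x success response unless it is the proposed 22."""
    return 20 <= status <= 29 and status != 22


def store(cache: Dict[str, bytes], url: str, header: str, body: bytes) -> None:
    """Keep the body only when the status allows caching."""
    status, _meta = parse_header(header)
    if may_cache(status):
        cache[url] = body
```

A client that ignores the hint entirely simply never calls `may_cache` and treats 21 and 22 like 20.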
It was thus said that the Great khuxkm at tilde.team once stated: > November 8, 2020 12:18 AM, "Sean Conner" <sean at conman.org> wrote: > > > I have my own ideas about caching, but I want to cobble up a > > proof-of-concept first before I talk about it more, because from where I > > come from, working code is worth more than talk. > > > > -spc > > Speaking of cobbling up proof-of-concepts, I've created a proof-of-concept > of how I feel these cache-hint success codes would work: > > https://gist.github.com/febd3f5ae2308e8b55449a92c6e58a65 > > (Yes, I know it's on GitHub, but I have a shell script to make Gists from > the command line and so I want to use it.) > > This includes a spartan client (it literally just spits the raw protocol > response out at you) with caching behavior influenced by 21/22 (in > practice, it caches all 2x responses except for 22 responses), as well as > examples of endpoints to hit that return each code. > > Hopefully my working code will prove my point better than I could in words. And I have my "proof-of-concept" up as well. It's at gemini://gemini.conman.org/test/testcache.gemini My approach is a bit different, probably a bit harder to implement server side, but deals not only with the caching issue, but with repeated requests. I'm not sure how popular it will be, but hey, it's out there, and it only adds one general purpose status code (23 for now) that means: okay, request was okay, but there is no data to serve you. How it works: A plain request: gemini://gemini.conman.org/test/cachefile.txt will always return the content.
However, if you include a timestamp using a path parameter (which is *NOT* the same as a query parameter, and is in the ISO-8601 format): gemini://gemini.conman.org/test/cachefile.txt;2020-11-08T00:00:00 If the file is *newer* than that timestamp, you get the normal response of 20 and all the content; otherwise you get a response of 23 (with the normal MIME type) and no content, meaning it hasn't changed since the given date. This means a client that doesn't wish to deal with caching at all will never see any difference. A client that does, or that (in the case of feeds) does not want to always download content that hasn't changed, can do that as well. -spc
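The server-side decision described above can be sketched as follows. This is not the code running on gemini.conman.org; the function names are invented, the file's modification time is passed in rather than read from disk, and 23 is the experimental code from this message, not part of the Gemini specification. A malformed timestamp raises `ValueError` here; a real server would have to decide what to answer in that case.

```python
from datetime import datetime
from typing import Optional, Tuple


def split_timestamp(path: str) -> Tuple[str, Optional[datetime]]:
    """Split '/file.txt;2020-11-08T00:00:00' into path and timestamp."""
    base, sep, param = path.partition(";")
    if not sep:
        return path, None  # plain request, no path parameter
    return base, datetime.fromisoformat(param)


def respond(path: str, modified: datetime, mime: str,
            body: bytes) -> Tuple[str, bytes]:
    """Return (header, body) for a request that may carry a timestamp."""
    _base, since = split_timestamp(path)
    if since is not None and modified <= since:
        # Not newer than the client's copy: experimental 23, no content.
        return f"23 {mime}\r\n", b""
    # Plain request, or the file is newer: normal 20 with the content.
    return f"20 {mime}\r\n", body
```

A client that never appends the `;timestamp` parameter only ever reaches the last line, which is why clients that do not care about caching see no difference.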
On 2020-11-08 12:49 AM, Sean Conner wrote: > My approach is a bit different, probably a bit harder to implement server > side, but deals not only with the caching issue, but with repeated requests. > I'm not sure how popular it will be, but hey, it's out there, and it only > adds one general purpose status code (23 for now) that means: > > okay, request was okay, but there is no data to serve you. You could get a similar effect with 44 SLOW DOWN as it already exists, and boom, now the protocol doesn't need to know about caching at all.
On Sun, 8 Nov 2020 00:50:18 -0500 Sean Conner <sean at conman.org> wrote: > It was thus said that the Great khuxkm at tilde.team once stated: > > > > This can't happen, though, because the first proposal breaks the > > compatibility of <META> in response codes within a block, and the second > > one is just debating which of the codes we should add. > > This *did* happen, back on November 3rd. > > gemini://gemi.dev/gemini-mailing-list/messages/003015.gmi > > And it's already received over a hundred hits. You're mistaking the sense in which khuxkm is saying that it can't happen. It did happen in the sense that you implemented a server for a protocol that works like this. It can't happen in the sense that it fundamentally breaks compatibility with clients that only concern themselves with the first digit of the response code. That you've implemented the resulting protocol of your suggested change has no bearing on the argument. The overflowing of 2x code resulting from a combinatorial hell is entirely self-inflicted through your choice to ignore this aspect of the existing specification. I've already suggested to you a way to avoid this (use a different first digit). This additionally avoids having to rewrite 3.2 of the spec and invalidate existing clients that take for granted that "the first digit alone provides enough information for a client to determine how to handle the response". Moreover, with TLS already having a mechanism to signal the intended end of a connection, I don't think that content size is a pressing issue. It would allow for download progress bars, but adds nothing over TLS in terms of ensuring that the content is fully received. -- Philip
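The compatibility point can be made concrete with a client that looks only at the first digit. This is a sketch, not code from any existing client; the function name `handle` is invented, and the category names follow the status ranges in the Gemini specification.

```python
from typing import Tuple

# First-digit categories as defined by the Gemini specification.
CATEGORIES = {
    "1": "input",
    "2": "success",
    "3": "redirect",
    "4": "temporary failure",
    "5": "permanent failure",
    "6": "client certificate required",
}


def handle(header: str) -> Tuple[str, str]:
    """Return (category, meta), looking only at the first status digit."""
    status, _, meta = header.rstrip("\r\n").partition(" ")
    if len(status) != 2 or not status.isdigit() or status[0] not in CATEGORIES:
        raise ValueError(f"malformed status: {status!r}")
    return CATEGORIES[status[0]], meta
```

Such a client treats a cache hint like `21 text/gemini` exactly as it treats 20, so nothing breaks. The "response with size" header `22 100 text/gemini` is also classed as a success, but its meta comes back as `100 text/gemini`, which this client would then try to use as a MIME type. That is the incompatibility being described.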
On Sun, 8 Nov 2020 11:29:09 +0530 Sudipto Mallick <smallick.dev at gmail.com> wrote: > This is what I *think* how *should* clients work with caching: > > The clients with history support and supports going ''backwards'' and > ''forwards'' through history should cache text/* responses in memory > for that browsing session. When the user browses through the history > using ''forward'' and ''backward'' action, no reloading should happen. > But, when a user clicks the link for a resource already in cache or > writes the link by hand or selects the link from previously visited > links or asks for reload: the cache is purged and the resource > reloaded. It is assumed that requests with query part are idempotent. > Now, when a page is dynamic, it should be stated as such so that the > user would reload that page. > With that, no new response codes. This is the creativity I like to see when dealing with limited environments, this is a solution that requires no change in the protocol, if a client developer feels the urge to have caching which I still don't understand why it should be necessary for Gemini. (everybody refused to give me an answer) Great solution, but I doubt anyone would listen to you considering the current climate of discussion being focused heavily on adding more.
On 11/8/20 3:44 PM, Ali Fardan wrote: > if a client developer feels the urge to have caching which I > still don't understand why it should be necessary for Gemini. (everybody > refused to give me an answer) I'm not 100% sure if it requires caching or not, but what I would want from a client with a back button is to take me back to where I was in the document's scroll position before I clicked. Some clients are already handling this use case (hooray!). This may be a greater concern for graphical clients than text ones, unless the client is handling paging itself. Losing your place can be frustrating when moving through a series of inter-linked documents. Imagine going through older CAPCOM articles, for instance. Every time you click back needing to scroll back down to wherever you were becomes obnoxious and a reason to just stop. Again, I'm not sure if caching is involved in it or not, but that's the best thing I could come up with as an answer to your question.
On Sun, 8 Nov 2020 16:21:03 +0000 James Tomasino <tomasino at lavabit.com> wrote: > Again, I'm not sure if caching is involved in it or not, but that's > the best thing I could come up with as an answer to your question. This can be done by remembering the offset percentage-wise in the client for each previous page. However, without caching this could break if the document changes in the meantime. If this is a compelling reason to have caching, I'd suggest Mallick's method, as it won't involve any changes to the protocol and it fits the context very well.
On 11/8/20, James Tomasino <tomasino at lavabit.com> wrote: > I'm not 100% sure if it requires caching or not, but what I would want from > a client with a back button is to take me back to where I was in the > document's scroll position before I clicked. Some clients are already > handling this use case (hooray!). This may be a greater concern for > graphical clients than text ones, unless the client is handling paging > itself. Losing your place can be frustrating when moving through a series of > inter-linked documents. Imagine going through older CAPCOM articles, for > instance. Every time you click back needing to scroll back down to wherever > you were becomes obnoxious and a reason to just stop. Now in that case, if the client does not cache, the page is reloaded. Even if the client remembers the position where you were before in that page, imagine the scenario where the page gets changed before reloading (say, new links added to (or removed in) that page), then you have to scroll anyway. (In the worst case, imagine the link you clicked does not exist in the new page.) > Again, I'm not sure if caching is involved in it or not, but that's the best > thing I could come up with as an answer to your question. So caching is required. Your client must keep that page in cache along with your last position in that page for the feature you want. ~smlckz
> This is the creativity I like to see when dealing with limited > environments, this is a solution that requires no change in the > protocol, if a client developer feels the urge to have caching which I > still don't understand why it should be necessary for Gemini. (everybody > refused to give me an answer) > > Great solution, but I doubt anyone would listen to you considering the > current climate of discussion being focused heavily on adding more. I really like this approach, as I like the protocol as-is. I am currently working on a client for Android, and I am planning to make heavy use of caching, although I do not think a change in the protocol is necessary. In my use case, I have a data plan that's free, but just gives me flaky 32kbit/s at best, usually far less. (It's called "messaging option"). Loading a page on Gemini usually takes multiple seconds. This would be similar for packet radio and similar applications. Without caching, this is a huge PITA. I will take care to have a "Reload" button once I use caching, so the users themselves can decide when new content should be fetched. I want to stress that caching is necessary in my use case. It's a much more needed feature than, say, client certificate support. At the moment, the majority of content on Gemini is static and I believe it will continue to be. - Waweic
On Sun, 08 Nov 2020 17:21:17 +0000 Waweic <waweic at protonmail.com> wrote: > I really like this approach, as I like the protocol as-is. Wonderful. > I want to stress that caching is neccessary in my usecase. It's a > much more needed feature than, say, client certificate support. At > the moment, the majority of Content on Gemini is static and I believe > it will continue to be. Go for it.
It was thus said that the Great Philip Linde once stated: > > The overflowing of 2x code resulting from a combinatorial hell is > entirely self-inflicted through your choice to ignore this aspect of > the existing specification. I don't agree that this is self-inflicted on me---what I'm trying (and failing) to point out is that adding more success calls *could* lead to combinatorial hell. HTTP already has 10 success codes in the 200 range, and none of them relate to caching status (since caching status is passed along in headers). Granted, none of the HTTP success statuses (with the exception of 200, which is mapped to 20 in Gemini) apply to Gemini, but in another universe where HTTP *did* have separate success codes for caching information, I can see some combinatoric increase, say for 204, 205 and 206 with nothing said, cache and no-caching of results (that's nine new ones right there). > I've already suggested to you a way to avoid this (use a different > first digit). No, that was to prisonpotato at tilde.team---they were the first to come up with the idea, not me. I just implemented it first (much the same way I implemented the first Gemini server even before solderpunk did [1]). Also, the size breaking code is only active on one link on my site, not everywhere. > This additionally avoids having to rewrite 3.2 of the > spec and invalidate existing clients that take for granted that "the > first digit alone provides enough information for a client to determine > how to handle the response". > > Moreover, with TLS already having a mechanism to signal the intended > end of a connection, I don't think that content size is a pressing > issue. It would allow for download progress bars, but adds nothing > over TLS in terms of ensuring that the content is fully received. The concern is over large responses. It wasn't much of a concern until gemini://konpeito.media/ was created and started serving up large audio files (and archives of said audio files).
I can envision a client being configured to abort the download if say, a 10 megabyte file is being downloaded. It
On 08-Nov-2020 22:22, Sean Conner wrote: > > The concern is over large responses. It wasn't much of a concern until > gemini://konpeito.media/ was created and serving up large audio files (and > archives of said audio files). I can envision a client being configured to > abort the download if say, a 10 megabyte file is being downloaded. It > *sucked* when my DSL went down in late September/early October (yes, about > three weeks) and I had to rely upon my cellphone hot spot. I didn't have a > large data plan for the cell phone because I didn't need it, until I did. > It would have been nice to configure my web browser to not download anything > over 5M at that point. (At the risk of wading into this thread - I should know better) Yes that is a sensible client design IMO, since you cannot know how long to wait. In my client GemiNaut (Windows only atm sorry folks), there are two options: - abandon download after X MB or after Y seconds The gemget client/utility also implements this approach, which is what GemiNaut uses under the hood. The values are tunable according to the desires of the user. Mostly I have mine set to 5 MB or 10 seconds. If something times out beyond that I make a judgement whether I really want to up the threshold temporarily to let it through. Or maybe go look elsewhere ;) By and large text/gemini content is very small and fast which is one of its great selling points. Other binary files are more "meh". - Luke
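The "abandon after X MB or Y seconds" policy described above can be sketched roughly as follows. This is only an illustration under stated assumptions and does not show how GemiNaut or gemget actually implement it: the function name `read_with_limits`, the exception `DownloadAborted` and the parameter names are all hypothetical, and `stream` stands for any object with a blocking `read(n)` method, such as the file object of a TLS socket.

```python
import time


class DownloadAborted(Exception):
    """Raised when a response exceeds the configured size or time budget."""


def read_with_limits(stream, max_bytes=5 * 1024 * 1024, max_seconds=10.0,
                     chunk_size=4096, clock=time.monotonic):
    # Read the response body in chunks, giving up once either limit is hit.
    deadline = clock() + max_seconds
    chunks = []
    total = 0
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:                 # end of stream: the body is complete
            return b"".join(chunks)
        total += len(chunk)
        if total > max_bytes:
            raise DownloadAborted("size limit exceeded")
        if clock() > deadline:
            raise DownloadAborted("time limit exceeded")
        chunks.append(chunk)
```

The time check only runs between reads, so a real client would probably also set a socket timeout; otherwise a server that stalls in the middle of a chunk could hold the connection past the deadline.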
On Sun, Nov 08, 2020 at 05:21:17PM +0000, Waweic wrote: > I really like this approach, as I like the protocol as-is. I am currently working on a client for Android, and I am planning to make heavy use of caching, although I do not think a change in the protocol is necessary. In my use case, I have a data plan that's free, but just gives me flaky 32kbit/s at best, usually far less. (It's called "messaging option"). Loading a page on Gemini usually takes multiple seconds. This would be similar for packet radio and similar applications. Without caching, this is a huge PITA. I will take care to have a "Reload" button once I use caching, so the users themselves can decide when new content should be fetched. > > I want to stress that caching is necessary in my use case. It's a much more needed feature than, say, client certificate support. At the moment, the majority of content on Gemini is static and I believe it will continue to be. I should add that in cases where there's an actual need for caching like this, caching is totally fine by me, but I hope you can find a way to add some kind of indicator to make it clear to the user whether you're showing fresh or cached content! bie
> This is what I *think* how *should* clients work with caching: > > The clients with history support and supports going ''backwards'' and > ''forwards'' through history should cache text/* responses in memory > for that browsing session. When the user browses through the history > using ''forward'' and ''backward'' action, no reloading should happen. > But, when a user clicks the link for a resource already in cache or > writes the link by hand or selects the link from previously visited > links or asks for reload: the cache is purged and the resource > reloaded. It is assumed that requests with query part are idempotent. > Now, when a page is dynamic, it should be stated as such so that the > user would reload that page. > With that, no new response codes. I think your proposal is excellent. I am sure a browser could add a command "press r to reload" for any page. I have to say I am really puzzled by the protracted discussion about caching. The difference between returning the full document and a message saying "document hasn't changed" is really small in the greater scheme of things: The tcp 3 way handshake plus the tls negotiation consume quite a number of packets, and add round trip latency. Network load is often measured in number of frames rather than bytes, and there is space for 1500 or even 9000 bytes per frame - this means that if your document is below that size (I'd venture most good ones are), then a "not changed" response doesn't even change the number of packets sent. That even suggests another heuristic: If your content is dynamic, try generating a short document, the implication being that larger ones should be cached, as there might be an actual benefit. I really think this is a holdover from people still thinking in http+html instead of gemini: Plaintext http allowed several parties to share a cache. This isn't the case here, as things are encrypted. Html often includes other urls ("img src"), which might be shared across pages.
Gemini doesn't do that either. And *if* caching should be done, then it seems a poor idea to have the caching clues live in the transfer/session/transport layer. Instead it should be in the document markup. Even http+html finally realised that with the messy "meta http-equiv" fields in the markup. At least that provides a better path for the document author to tell us how long the document might be valid for. And with a machine readable license that would allow for aggregation/replication/archiving/broadcast/etc which seems a much better way to save bandwidth and have persistent documents. TLDR: Don't think in http+html, do better regards marc
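The frame-count argument in this message can be made concrete with a little arithmetic. The sketch below is deliberately simplified: it takes the 1500-byte and 9000-byte figures from the message at face value as payload per frame and ignores TCP/IP and TLS overhead, so the numbers are rough estimates. The function name `frames_needed` is hypothetical.

```python
import math


def frames_needed(body_bytes, payload_per_frame=1500):
    # Even an empty or tiny response occupies at least one frame.
    return max(1, math.ceil(body_bytes / payload_per_frame))


# A 1200-byte page and a 20-byte "not changed" reply both cost one frame.
small_page = frames_needed(1200)
not_changed = frames_needed(20)

# A 40000-byte page is where skipping the body would start to pay off.
large_page = frames_needed(40000)
large_page_jumbo = frames_needed(40000, payload_per_frame=9000)
```

Under these assumptions a "not changed" reply saves nothing for the small page, while the large page costs 27 frames at 1500 bytes per frame and 5 frames at 9000 bytes per frame.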
On Sun, 8 Nov 2020 17:22:56 -0500 Sean Conner <sean at conman.org> wrote: > I don't agree that this is self-inflicted on me---what I'm trying (and > failing) to point out is that adding more success calls *could* lead to > combinatorial hell. This is unavoidable in any case of adding a sufficient number of features that can be combined. If every feature proposal should be evaluated in terms of the slippery slope of "what if we add more" there wouldn't be a point to discussing additions to the protocol, a notion that I'm actually partial to, but that I think shouldn't be used as a basis for judging against individual feature proposals. > No, that was to prisonpotato at tilde.team---they were the first to come up > with the idea, not me. I just implemented it first (much the same way I > implemented the first Gemini server even before solderpunk did [1]). Also, > the size breaking code is only active on one link on my site, not everywhere. Okay, I'm glad that you picked it up even when it wasn't addressed directly at you. Consider the option, and why I believe that using the 2x range is inappropriate. > The concern is over large responses. It wasn't much of a concern until > gemini://konpeito.media/ was created and serving up large audio files (and > archives of said audio files). I can envision a client being configured to > abort the download if say, a 10 megabyte file is being downloaded. It > *sucked* when my DSL went down in late September/early October (yes, about > three weeks) and I had to rely upon my cellphone hot spot. I didn't have a > large data plan for the cell phone because I didn't need it, until I did. > It would have been nice to configure my web browser to not download anything > over 5M at that point. That would be a nice feature, but is it nice enough to warrant breakage across the growing number of implementations?
text/gemini documents can display an estimated size for linked files, and a user can configure their automatic client to abort at a certain point or choose in their interactive client to cancel a request. Not a complete solution by any means, but worth considering as it is much cheaper. No-cache/please-cache 2x responses however seem like they can add a lot of value for very little cost. Existing client implementations will be compatible, and the idea of the second digit as a hint from the server that can be ignored by a simpler client is retained. -- Philip
On Mon, 9 Nov 2020 17:18:21 +0100 Philip Linde <linde.philip at gmail.com> wrote: > On Sun, 8 Nov 2020 17:22:56 -0500 > This is unavoidable in any case of adding a sufficient number of > features that can be combined. If every feature proposal should be > evaluated in terms of the slippery slope of "what if we add more" > there wouldn't be a point to discussing additions to the protocol, a > notion that I'm actually partial to, but that I think shouldn't be > used as a basis for judging against individual feature proposals. Let's not look at it this way; let's look at what caching truly enables. I've raised this many times and no one seemed to address my concerns: if an in-protocol caching mechanism exists, it opens the window for serving complex, resource-heavy document formats (see HTML pulling in stylesheets and images, for example), whereas using anything other than gemtext for documents is discouraged in Gemini. In the case of serving media files like music, video, and even large PDFs, these are downloaded to be saved on disk anyway, so caching wouldn't be a concern. The only convincing reasoning for caching is Waweic's use case using Mallick's method, and even that doesn't introduce any new protocol features. > That would be a nice feature, but is it nice enough to warrant > breakage across the growing number of implementations? text/gemini > documents can display an estimated size for linked files, and a user > can configure their automatic client to abort at a certain point or > choose in their interactive client to cancel a request. Not a > complete solution by any means, but worth considering as it is much > cheaper.
To each their own, and implementations can behave the way they desire, but I'm against this feature because it breaks sites that are slow to load: I'd have to refresh and hope the page loads fast enough not to time out. What you could do instead is show, in the bottom (or top) status bar or any other indicator, how much of the requested file has been downloaded; the user then judges whether to stop by hitting a cancel button.
Zach DeCook <zachdecook at librem.one> writes: > A "please don't cache" response code would be (ab)used by servers who > desire to track their users. I think this is an important point which has been largely overlooked. We do not have (many?) bad actors on the gemiverse for now, but this can't continue forever. I'm also wondering whether we shouldn't state outright that Gemini requests are expected to be idempotent. I know this would break dynamic pages that people are already using and enjoying (astrobotany, guestbooks), but other types of dynamic pages with user input would still be relevant, like searches and local weather. I dislike the idea of adding caching-related result codes. That's one of the things that drove the increasing complexification of HTTP during the HTTP/1.0 era (i.e., before HTTP/1.1). If we're not doing full-fledged application development on top of Gemini, we don't need server-side cache control. -- +-----------------------------------------------------------+ | Jason F. McBrayer jmcbray at carcosa.net | | A flower falls, even though we love it; and a weed grows, | | even though we do not love it. -- Dogen |
Sudipto Mallick <smallick.dev at gmail.com> writes: > The clients with history support and supports going ''backwards'' and > ''forwards'' through history should cache text/* responses in memory > for that browsing session. When the user browses through the history > using ''forward'' and ''backward'' action, no reloading should happen. > But, when a user clicks the link for a resource already in cache or > writes the link by hand or selects the link from previously visited > links or asks for reload: the cache is purged and the resource > reloaded. It is assumed that requests with query part are idempotent. I also believe that this is the correct approach, and should be considered a client 'best practice'. -- +-----------------------------------------------------------+ | Jason F. McBrayer jmcbray at carcosa.net | | A flower falls, even though we love it; and a weed grows, | | even though we do not love it. -- Dogen |
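The client behaviour quoted above (cache text/* responses for the session, reuse them for back/forward, purge and refetch for everything else) can be sketched as follows. This is a rough illustration of that description and not code from any existing client; `SessionCache`, `navigate`, `history_step` and the injected `fetch` callable are hypothetical names, with `fetch(url)` assumed to return a `(mime_type, body)` pair.

```python
class SessionCache:
    """Hypothetical in-memory cache that lives only for one browsing session."""

    def __init__(self, fetch):
        self._fetch = fetch   # callable: url -> (mime_type, body)
        self._pages = {}

    def navigate(self, url):
        # Link click, typed URL, pick from visited links, or explicit reload:
        # purge any cached copy and fetch the resource again.
        self._pages.pop(url, None)
        return self._load(url)

    def history_step(self, url):
        # Back/forward through history: no reloading if a copy is cached.
        if url in self._pages:
            return self._pages[url]
        return self._load(url)

    def _load(self, url):
        mime, body = self._fetch(url)
        if mime.startswith("text/"):   # only text/* responses are kept
            self._pages[url] = (mime, body)
        return mime, body
```

Keying the cache on the full URL means a request with a query part gets its own entry, which matches the stated assumption that such requests are idempotent.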
On Sun, Nov 8, 2020 at 12:52 AM Zach DeCook <zachdecook at librem.one> wrote: > A "please don't cache" response code would be (ab)used by servers who > desire to track their users. > I don't understand your reasoning there. What does a server learn by sending a 21 YOU CAN CACHE or 22 YOU SHOULD NOT CACHE response back instead of a plain 20 response? (I'm not a security expert and I know there are loopholes I don't see.) John Cowan http://vrici.lojban.org/~cowan cowan at ccil.org So they play that [tune] on their fascist banjos, eh? --Great-Souled Sam
It was thus said that the Great Sean Conner once stated: > > And I have my "proof-of-concept" up as well. It's at > > gemini://gemini.conman.org/test/testcache.gemini I have removed this "proof-of-concept" after some thought about the approach. I agree with Ali Fardan that Mallick's method is the way to handle caching (or not at all). Now, on with destroying my own idea here ... > How it works: A plain request: > > gemini://gemini.conman.org/test/cachefile.txt > > will always return the content. However, if you include a timestamp using a > path parameter (which is *NOT* the same as a query parameter, and is in the > ISO-8601 format): > > gemini://gemini.conman.org/test/cachefile.txt;2020-11-08T00:00:00 > > If the file is *newer* than that timestamp, you get the normal response of > 20 and all the content; otherwise you get a response of 23 (with the normal > MIME type) and no content, meaning it hasn't changed since the given date. The major problem here is timezones. Time zone information is complicated, and from what I've seen, operating system specific (the C standard doesn't mention it; POSIX does it one way; Windows another) so that's a complication for both servers and clients. Also, does the concept apply to each path component? Or only the end? For example: gemini://gemini.conman.org/test;2020-11-08T00:00:00/cachefile.txt return 23 if the directory test hasn't changed, even if cachefile.txt has? Or is it ignored *unless* it's the last path component? My gut instinct is to say "last component" but it gets messy: gemini://gemini.coman.org/test;2020-11-08T00:00:00 will result in a redirect (or should), but gemini://gemini.coman.org/test;2020-11-08T00:00:00/ won't. So, for these reasons, nah. I won't push this. -spc
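For readers who want to see the withdrawn experiment in concrete terms, the decision it describes can be sketched like this. The sketch makes several assumptions that are not in the message: the timestamp is honoured only on the last path component, timestamps without an offset are treated as UTC, and the function names `split_timestamp` and `status_for` are hypothetical. Status 23 is the experimental code from the message and is not part of the Gemini specification.

```python
from datetime import datetime, timezone


def split_timestamp(path):
    # '/a/b.txt;2020-11-08T00:00:00' -> ('/a/b.txt', datetime), else (path, None).
    head, sep, last = path.rpartition("/")
    name, semi, param = last.partition(";")
    if not semi:
        return path, None      # no parameter on the last component
    stamp = datetime.strptime(param, "%Y-%m-%dT%H:%M:%S")
    # Assumption: a timestamp without an offset is interpreted as UTC.
    return head + sep + name, stamp.replace(tzinfo=timezone.utc)


def status_for(path, modified):
    # 'modified' is the resource's last-modification time (timezone-aware).
    clean_path, since = split_timestamp(path)
    if since is not None and modified <= since:
        return clean_path, 23  # experimental: unchanged, send no content
    return clean_path, 20      # newer than the timestamp, or no timestamp
```

A timestamp on an earlier component, as in `/test;2020-11-08T00:00:00/cachefile.txt`, is simply ignored here, which is one of the two readings the message considers; a malformed timestamp raises `ValueError`, which a server would need to map to some error response.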
On Mon, Nov 9, 2020 at 4:33 PM Sean Conner <sean at conman.org> wrote: > The major problem here is timezones. Time zone information is > complicated, and from what I've seen, operating system specific (the C > standard doesn't mention it; POSIX does it one way; Windows another) so > that's a complication for both servers and clients. > ISO 8601 doesn't use named timezones, only offsets, but the best policy is to do everything in UTC, which is pretty universally available on all operating systems now. It's perfectly fine to just write "2020-11-09T22:14:05Z". > Also, does the concept apply to each path component? Or only the end? > I would say "The end", but I would write it as a query term: gemini:// example.com/test?since=2020-11-09T22:14:05Z. After all, the path elements aren't necessarily actual objects with modification dates (in Amazon S3, for example, they are just part of the name). In addition, you could use the "inode/x-empty" MIME-type instead of a different protocol response. But, like you, I doubt this is worth doing. Servers should advise clients about whether the author thinks the result should be cached, and clients can do what they like about it. (There are any number of ways for the author to communicate this intent; it doesn't have to be part of the content.) John Cowan http://vrici.lojban.org/~cowan cowan at ccil.org I am expressing my opinion. When my honorable and gallant friend is called, he will express his opinion. This is the process which we call Debate. --Winston Churchill
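The two suggestions in this message, always writing the timestamp in UTC and carrying it as a `since` query term, can be sketched with the standard library. The helper names `utc_stamp` and `with_since` are hypothetical, and the sketch assumes the caller passes a timezone-aware `datetime` (a naive one would be interpreted as local time by `astimezone`).

```python
from datetime import datetime, timezone
from urllib.parse import urlsplit, urlunsplit


def utc_stamp(moment):
    # Convert any timezone-aware datetime to the '...Z' form used above.
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def with_since(url, moment):
    # Replace the URL's query with a 'since=<UTC timestamp>' term.
    parts = urlsplit(url)
    return urlunsplit(parts._replace(query="since=" + utc_stamp(moment)))
```

Because the conversion to UTC happens in one place, client and server never need to agree on named time zones; they only compare instants.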
John Cowan <cowan at ccil.org> writes: > I don't understand your reasoning there. What does a server learn by > sending a 21 YOU CAN CACHE or 22 YOU SHOULD NOT CACHE response back > instead of a plain 20 response? (I'm not a security expert and I know > there are loopholes I don't see.) The server operator gets a decent guess at whether the user has visited the page before (within a reasonable caching window), because if you sent a 21 YOU CAN CACHE, and they made the request, that means they hadn't seen it recently. Combine this with query strings, IP addresses, and/or fragment identifiers, and you can identify individual users, even users who have refused to set a client certificate when you asked. It's a pretty minor information leak, since it can't be used for cross-site tracking. But give techbros an inch, and they'll take a mile. -- Jason McBrayer | "Strange is the night where black stars rise, jmcbray at carcosa.net | and strange moons circle through the skies, | but stranger still is lost Carcosa." | -- Robert W. Chambers, The King in Yellow
---