Caching and status codes

1. bie (bie (a) 202x.moe)

Hello everyone!

There's been some light discussion on the gemini irc channel about if
and how clients should cache gemini responses, and I'd love to hear how
people here think about the issue...

Personally I feel like caching adds a ton of complexity, so the default
behavior should be not to cache anything. There's no way to know if the
response from a server is the product of a dynamic CGI script or a static
file, and trying to guess at what's what will definitely introduce some
weird behavior. Just to give some examples - on my personal site I have
a script that returns a random jpeg, a guestbook and a simple gemlog,
none of which should be cached!

Now, I don't see the need for caching at all when it comes to
gemini, but if it's something the gemini community wants or needs then
I'd like to suggest a 2x status code that lets a client know that the
response is unlikely to change and thus cacheable. This could even kind
of match how the temporary/permanent redirect status codes are defined:

20 - SUCCESS
21 - "PERMANENT" SUCCESS (Feel free to cache this)

This allows simple clients to keep doing what they're doing now (send a
request, show the response) and allows more complex clients to play
around with caching without breaking compatibility with what we already
have.
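
To make the backwards-compatibility point concrete, here is a minimal
sketch (Python, names invented here; 21 and the cache dict are purely
illustrative, nothing in it is in the spec today):

cache = {}  # url -> (mime, body)

def render(mime, body):
    # stand-in for whatever the client already does with a 2x response
    print(mime)
    print(body)

def handle_success(url, status, mime, body):
    if status == 21:
        # proposed "PERMANENT" SUCCESS: the server says this is unlikely
        # to change, so remembering it should be safe
        cache[url] = (mime, body)
    # a simple client can ignore the second digit entirely and just
    # render every 2x response exactly as it does today
    render(mime, body)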

Another suggestion made on irc was to do the opposite and create a "do
not cache" 2x status code, but I'm not really sold on that. Caching
introduces a lot of complexity even without a separate status code (how
should query strings be handled, 1x responses, redirects?) and these
issues don't go away if you add a "do not cache" code.

Any thoughts...?

bie

Link to individual message.

2. Sudipto Mallick (smallick.dev (a) gmail.com)

Instead of "permanent" caching (what is permanent?) I am thinking
about using timestamps.

So, for a requested resource, (if it is available) return a timestamp
which denotes when the resource was last changed.

When requesting the resource again, send that timestamp with it and
the server checks if the cache is stale or not and responds
accordingly (either "resource is modified after $(old_timestamp) so
here is the new resource and it was modified on $(new_timestamp)" or
"the resource was not changed after $(timestamp)").

But the problem is, where does the timestamp go in the request and response?
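
Purely to illustrate the bookkeeping this would need on the client
side, here is a rough sketch. The get() helper and its
if_modified_since argument are invented; where the timestamp would
actually travel on the wire is exactly the open question:

store = {}  # url -> {"body": ..., "last_modified": ...}

def fetch_with_validation(url, get):
    entry = store.get(url)
    known = entry["last_modified"] if entry else None
    # hypothetical helper: returns (None, None) if unchanged since `known`
    body, modified = get(url, if_modified_since=known)
    if body is None:
        # "the resource was not changed after $(timestamp)"
        return entry["body"]
    # "here is the new resource and it was modified on $(new_timestamp)"
    store[url] = {"body": body, "last_modified": modified}
    return body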


~smlckz

Link to individual message.

3. Björn Wärmedal (bjorn.warmedal (a) gmail.com)

I *love* it when clients cache stuff, for performance and environmental reasons.

But I think that's something that each individual client (or client
dev, I suppose) should deal with in their own fashion and only if they
want to. As mentioned before most files served on gemini:// are small
text files, and it's likely to remain that way for the foreseeable
future. I don't think caching is something that the protocol should
care about or cater to specifically, as it adds complexity and
ambiguity.

> Now, I don't see the need for caching at all when it comes to
> gemini, but if it's something the gemini community wants or needs then
> I'd like to suggest a 2x status code that lets a client know that the
> response is unlikely to change and thus cacheable. This could even kind
> of match how the temporary/permanent redirect status codes are defined:
>
> 20 - SUCCESS
> 21 - "PERMANENT" SUCCESS (Feel free to cache this)

Adding a response code like that is counterproductive, I believe, as
there's really no way for a server to determine what sort of file or
data is unlikely to change, and there's no way for a client to
determine what a flag of "unlikely to change" means. Does it mean it
won't change in the next hour? The next month?

As mentioned in previous discussions on caching and checksums, there's
not really room in the protocol specification for the needed
information unless you start misusing MIME type info. I've come around
to thinking that the added complexity, ambiguity, and possibly
breaking changes far outweigh the added benefit.

On a side note I am (somewhat slowly) playing around with
python3/Tkinter to make a client, and my current thinking for the
design of that is that when the user clicks a link that returns an
image I'll display it inline and cache it (based on its path)
*forever*, though it will be clearly marked as a cached resource in
the UI somehow as of yet undetermined. As for text/gemini I don't see
a reason to cache them for more than very short periods: they're too
small and frequently changed. No changes are needed in the protocol
for this sort of behaviour, and I'm not really convinced it's a
behaviour most clients should adopt either. Client-side caching is
something that users have strong opinions about.
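
A rough sketch of that policy (the names and the 60-second text
lifetime below are invented here, not taken from any real client):

import time

image_cache = {}   # path -> body, kept until the user clears it
text_cache = {}    # path -> (expires_at, body)
TEXT_TTL = 60      # seconds; "very short periods"

def remember(path, mime, body):
    if mime.startswith("image/"):
        image_cache[path] = body
    elif mime.startswith("text/gemini"):
        text_cache[path] = (time.time() + TEXT_TTL, body)

def lookup(path):
    if path in image_cache:
        return image_cache[path]
    entry = text_cache.get(path)
    if entry and entry[0] > time.time():
        return entry[1]
    return None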

The only real benefit of caching would be on the server side; whether
it's in a proxying situation or some type of mirror, or just a server
that wants to be speedier when serving dynamic data. I don't see a
need for a protocol change for that, though; if someone wants to make
a super fancy app server they're free to do it as long as it looks the
same to clients :)

Cheers,
ew0k

Link to individual message.

4. bie (bie (a) 202x.moe)

For me personally, that brings the protocol a bit too close to the
complexity of HTTP, but I'd be curious to hear what the use-cases for
something like that would be...?

When I used the word "permanent" it was just to draw a comparison to the
31 status code (REDIRECT - PERMANENT). The idea is just to let the
client know that if it wants to cache the response for a session or
however long it wants to... it's unlikely to cause any issues.

bie

On Fri, Nov 06, 2020 at 12:57:11PM +0530, Sudipto Mallick wrote:
> Instead of "permanent" caching (what is permanent?) I am thinking
> about using timestamps.
> 
> So, for a requested resource, (if it is available) return a timestamp
> which denotes when the resource was last changed.
> 
> When requesting the resource again, send that timestamp with it and
> the server checks if the cache is stale or not and responds
> accordingly (either "resource is modified after $(old_timestamp) so
> here is the new resource and it was modified on $(new_timestamp)" or
> "the resource was not changed after $(timestamp)").
> 
> But the problem is, where does the timestamp go in the request and response?
> 
> 
> ~smlckz

Link to individual message.

5. bie (bie (a) 202x.moe)

> Adding a response code like that is counterproductive, I believe, as
> there's really no way for a server to determine what sort of file or
> data is unlikely to change, and there's no way for a client to
> determine what a flag of "unlikely to change" means. Does it mean it
> won't change in the next hour? The next month?

If there's no way for a server to determine what's unlikely to change
then there's *definitely* no way for a client to know, in which case
caching the response is just plain bad-mannered.

> On a side note I am (somewhat slowly) playing around with
> python3/Tkinter to make a client, and my current thinking for the
> design of that is that when the user clicks a link that returns an
> image I'll display it inline and cache it (based on its path)
> *forever*, though it will be clearly marked as a cached resource in
> the UI somehow as of yet undetermined. As for text/gemini I don't see
> a reason to cache them for more than very short periods: they're too
> small and frequently changed. No changes are needed in the protocol
> for this sort of behaviour, and I'm not really convinced it's a
> behaviour most clients should adopt either. Client-side caching is
> something that users have strong opinions about.

I definitely have strong opinions about this, yeah - this would, like I
mentioned earlier, mean that my script returning a random photo wouldn't
work in your client, and that a lot of the tools and toys I was hoping to use
the gemini protocol for are probably suited for something else,
at least if the consensus among the gemini community is that arbitrary
caching is fine.

bie

Link to individual message.

6. Björn Wärmedal (bjorn.warmedal (a) gmail.com)

> I definitely have strong opinions about this, yeah - this would, like I
> mentioned earlier, mean that my script returning a random photo wouldn't
> work in your client, and that a lot of the tools and toys I was hoping to use
> the gemini protocol for are probably suited for something else,
> at least if the consensus among the gemini community is that arbitrary
> caching is fine.

I wouldn't worry about that if I were you -- I'm pretty sure I'm the
only one on the planet dumb enough to cache like this :D There's a
fair chance I'll change that stance when my browser is ready enough to
be tested.

Link to individual message.

7. Philip Linde (linde.philip (a) gmail.com)

On Fri, 6 Nov 2020 12:57:11 +0530
Sudipto Mallick <smallick.dev at gmail.com> wrote:

> But the problem is, where does the timestamp go in the request and response?

I think that's the advantage of bie's suggested solution. It doesn't
require any breaking changes, and a client that doesn't recognize the
difference between codes 20 and 21 will still be fully compatible with
a server that does.

There can be different codes roughly representing different cache
lifetimes. "PERMANENT" for things that should stick on the disk until
the user (or user configured policy) removes them. "SESSION" for things
that stick for the lifetime of a browsing session.
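
As a sketch, a client could reduce such codes to a small policy table;
the numbers below are placeholders only (and note that elsewhere in
this thread 22 is proposed to mean the opposite, "do not cache"):

POLICIES = {
    20: "no-assumption",  # current spec: cache (or not) at your own risk
    21: "permanent",      # keep on disk until the user or policy purges it
    22: "session",        # keep only for the current browsing session
}

def cache_policy(status):
    return POLICIES.get(status, "no-assumption")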

An HTTP HEAD/Last-Modified style solution also provides little advantage
for the smaller documents people typically serve on Gemini. A lot of
overhead exists in TLS negotiation, so one request is almost certainly
better than two for small blog posts or articles.

-- 
Philip

Link to individual message.

8. Philip Linde (linde.philip (a) gmail.com)

On Fri, 6 Nov 2020 08:43:39 +0100
Björn Wärmedal <bjorn.warmedal at gmail.com> wrote:

> Adding a response code like that is counterproductive, I believe, as
> there's really no way for a server to determine what sort of file or
> data is unlikely to change, and there's no way for a client to
> determine what a flag of "unlikely to change" means. Does it mean it
> won't change in the next hour? The next month?

Why does the server need to determine this? It should be up to the
server admin to determine it. If I have some file I want to serve that
I anticipate will never change, I configure the server to respond to
requests for it with code 21. The client can take this to mean a week,
a day or forever depending on how sure the user wants to be that the
information is current. The client could override this behavior by
allowing the user to force a cache entry to be purged.
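
For illustration, the server-side half could be as small as this; the
paths and the function name are invented, and 21 is the proposed code,
not a real one:

# paths the admin anticipates will never change
NEVER_CHANGES = {
    "/archive/2019-log.gmi",
    "/images/site-logo.png",
}

def status_for(path):
    # proposed 21 = "feel free to cache this"; everything else stays 20
    return 21 if path in NEVER_CHANGES else 20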

> As mentioned in previous discussions on caching and checksums, there's
> not really room in the protocol specification for the needed
> information unless you start misusing MIME type info.

This doesn't at all address the suggested solution, which there *is*
room for in the protocol. No need to misuse MIME type info. No need for
breaking changes to the specification. As suggested, this is entirely
backwards-compatible, older clients and servers are entirely
forwards-compatible and the change to the spec would be entirely
additive.

-- 
Philip

Link to individual message.

9. Philip Linde (linde.philip (a) gmail.com)

On Fri, 6 Nov 2020 14:52:17 +0900
bie <bie at 202x.moe> wrote:

> Another suggestion made on irc was to do the opposite and create a "do
> not cache" 2x status code, but I'm not really sold on that. Caching
> introduces a lot of complexity even without a separate status code (how
> should query strings be handled, 1x responses, redirects?) and these
> issues don't go away if you add a "do not cache" code.
> 
> Any thoughts...?

My own (end-user facing) client is a browser plugin and I inherit the
caching policy from it. Practically, this means for me that everything
is cached throughout the browsing session. An entry can be purged from
the cache by a "force reload" that the user can issue. I find this to
be a good all round policy for documents. For CGI I usually have to
issue a force reload. There could be three 2x codes:

20 (unspecified caching; unchanged from current spec)
21 ("permanent" caching)
22 (no caching)

A server that knows 21 and 22 can use them as appropriate on documents
that will never change and will always change respectively. A client
that doesn't understand them will still behave well because 2x is still
2x.
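
A sketch of that policy, with invented helper names (do_request stands
in for the network layer; 21 and 22 are the proposed codes, not part of
the spec):

session_cache = {}

def fetch(url, do_request, force_reload=False):
    if force_reload:
        session_cache.pop(url, None)      # user-issued purge
    if url in session_cache:
        return session_cache[url]
    status, mime, body = do_request(url)
    if status == 22 or status // 10 != 2:
        # proposed 22 = never cache; non-2x responses aren't cached either
        return (mime, body)
    # 20 (unspecified) and 21 ("permanent") both go into the session
    # cache, since this client caches for the whole session anyway
    session_cache[url] = (mime, body)
    return (mime, body)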

Then there's the question of whether this is something I've missed as a
user and implementer. Not really? I suppose being able to specify "no
caching" in particular is useful for CGI, but I spend most of my time
on Gemini reading documents that rarely change.

-- 
Philip

Link to individual message.

10. mbays (a) sdf.org (mbays (a) sdf.org)



>There's been some light discussion on the gemini irc channel about if
>and how clients should cache gemini responses, and I'd love to hear how
>people here think about the issue...

Since gemini requests have no guarantee of idempotency, I think it's 
crucial that the user always knows whether an action will cause 
a request or not. That means simple consistent rules on whether to 
retrieve from cache or make a request, that don't depend on subtleties 
like the current time or RAM availability. One simple way to achieve 
this is to always make a fresh request *except* when navigating history 
("back" or "forward"). You could also interpret navigating to an url 
which is in (linear) history as navigation of history, so load from 
cache (in theory this does require unbounded memory use to store the 
urls of an unboundedly long history list, but in practice this is 
unlikely to be more than a few KB). If the page you want to load from 
cache has been deleted due to memory constraints, I'd say you should 
present an error and offer to let the user make the request again, 
rather than doing it automatically. But if all but the tail is 
text/gemini, probably you can afford to store everything anyway.
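
A sketch of that rule, with invented names (do_request stands in for
whatever the client uses to fetch): following a link always hits the
network, while back/forward only ever read the per-entry cache.

class History:
    def __init__(self):
        self.entries = []   # list of [url, cached_body_or_None]
        self.pos = -1

    def follow_link(self, url, do_request):
        body = do_request(url)            # always a fresh request
        del self.entries[self.pos + 1:]   # drop the forward branch
        self.entries.append([url, body])
        self.pos += 1
        return body

    def back(self):
        if self.pos <= 0:
            return None
        self.pos -= 1
        url, body = self.entries[self.pos]
        if body is None:                  # entry was dropped to save memory
            raise LookupError("cached copy of %s is gone; "
                              "ask before re-requesting" % url)
        return body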

Link to individual message.

11. John Cowan (cowan (a) ccil.org)

On Fri, Nov 6, 2020 at 5:11 AM Philip Linde <linde.philip at gmail.com> wrote:


> I think that's the advantage of bie's suggested solution. It doesn't
> require any breaking changes, and a client that doesn't recognize the
> difference between codes 20 and 21 will still be fully compatible with
> a server that does.
>

I agree, except that I am in favor of code 22 meaning "It is inadvisable to
cache this", on the assumption that most Gemini documents are static and
will continue to be so.  Even on the Web, most documents are static.  If
there is to be just one new code, better it should be 22.  If people feel
strongly about 21, then both 21 and 22.


> There can be different codes roughly representing different cache
> lifetimes. "PERMANENT" for things that should stick on the disk until
> the user (or user configured policy) removes them. "SESSION" for things
> that stick for the lifetime of a browsing session.
>

That seems to me too complex for a client to interpret.  What is a browsing
session, if I always keep my client running?  Note that "don't cache" can
also be interpreted as "don't mirror", which is often an important point
when dealing with living documents.  There are an awful lot of broken
mirrors of Wikipedia out there.




John Cowan          http://vrici.lojban.org/~cowan        cowan at ccil.org
But that, he realized, was a foolish thought; as no one knew better than
he that the Wall had no other side.
        --Arthur C. Clarke, "The Wall of Darkness"

Link to individual message.

12. bie (bie (a) 202x.moe)

On Fri, Nov 06, 2020 at 05:19:18PM -0500, John Cowan wrote:
> On Fri, Nov 6, 2020 at 5:11 AM Philip Linde <linde.philip at gmail.com> wrote:
> 
> 
> > I think that's the advantage of bie's suggested solution. It doesn't
> > require any breaking changes, and a client that doesn't recognize the
> > difference between codes 20 and 21 will still be fully compatible with
> > a server that does.
> >
> 
> I agree, except that I am in favor of code 22 meaning "It is inadvisable to
> cache this", on the assumption that most Gemini documents are static and
> will continue to be so.  Even on the Web, most documents are static.  If
> there is to be just one new code, better it should be 22.  If people feel
> strongly about 21, then both 21 and 22.
> 

This reduces gemini to a simple file sharing protocol and basically says
that dynamic content is out (unless only targeting advanced clients).

Ultimately, I like the gemini protocol just the way it is (and wouldn't
be opposed to even a 1000 year feature freeze) but arbitrary caching by
clients kills a whole host of use-cases around generated and dynamic
responses.

bie

Link to individual message.

13. John Cowan (cowan (a) ccil.org)

On Fri, Nov 6, 2020 at 10:48 PM bie <bie at 202x.moe> wrote:


> This reduces gemini to a simple file sharing protocol and basically says
> that dynamic content is out (unless only targeting advanced clients).
>

Here are my assumptions.

1) Clients are going to cache, like it or not.  Some already do.

2) Servers are in the best position to say whether content is dynamic or
not. "Dynamic" in this case is not just CGI-generated; it's also static
files that change often.  (I post a static file on the Web that is
recomputed every ten minutes by a cron job.)

3) If the server can communicate "don't cache this", the client can provide
a better UX.

> Ultimately, I like the gemini protocol just the way it is (and wouldn't
> be opposed to even a 1000 year feature freeze) but arbitrary caching by
> clients kills a whole host of use-cases around generated and dynamic
> responses.
>

That horse has sailed and that ship is out of the barn.  "The world will go
as it will, and not as you or I would have it."


John Cowan          http://vrici.lojban.org/~cowan        cowan at ccil.org
weirdo:    When is R7RS coming out?
Riastradh: As soon as the top is a beautiful golden brown and if you
stick a toothpick in it, the toothpick comes out dry.

Link to individual message.

14. bie (bie (a) 202x.moe)

On Fri, Nov 06, 2020 at 10:55:50PM -0500, John Cowan wrote:
> On Fri, Nov 6, 2020 at 10:48 PM bie <bie at 202x.moe> wrote:
> 
> 
> > This reduces gemini to a simple file sharing protocol and basically says
> > that dynamic content is out (unless only targeting advanced clients).
> >
> 
> Here are my assumptions.
> 
> 1) Clients are going to cache, like it or not.  Some already do.
> 
> 2) Servers are in the best position to say whether content is dynamic or
> not. "Dynamic" in this case is not just CGI-generated; it's also static
> files that change often.  (I post a static file on the Web that is
> recomputed every ten minutes by a cron job.)
> 
> 3) If the server can communicate "don't cache this", the client can provide
> a better UX.

1) Rude clients are going to cache, yes.

2/3) Totally agree, just not about what the default should be.

And just to keep this "grounded", here are some examples of stuff that
arbitrary caching would break:

- Adventure games that keep state in the client cert and update the
  response not only on user input
- The URL on my personal gemini site that responds with a random photo
- Guestbooks
- Streaming content

Meanwhile, a default of not caching anything doesn't break anything. All
it does is degrade the UX (minimally, IMO).

> > Ultimately, I like the gemini protocol just the way it is (and wouldn't
> > be opposed to even a 1000 year feature freeze) but arbitrary caching by
> > clients kills a whole host of use-cases around generated and dynamic
> > responses.
> >
> 
> That horse has sailed and that ship is out of the barn.  "The world will go
> as it will, and not as you or I would have it."

Well, yes, fair enough. But gemini is only a tiny part of the world. If
the aesthetic sensibilities of the community turn out to conflict with
mine it's easy to look away ;)

bie

Link to individual message.

15. Leo (list (a) gkbrk.com)

> > Clients are going to cache, like it or not.  Some already do.
> Rude clients are going to cache, yes.

I don't understand why a client caching responses is rude. Or rather, I
don't understand who it is being rude to. When I configure my HTTP or
Gemini browser to cache every response, is my browser now being rude to
me? Is it being rude to the server?

How can something that causes less resource usage on the server be rude
to the server, or something I configured or downloaded as a "client that
has caching" be rude to me for using it?

Is not doing everything a server sends being rude to the server
operator? If a server sends a 100000000x100000000 image, is my image
viewer being rude for refusing to decode/display it?

--
Leo

Link to individual message.

16. Ali Fardan (raiz (a) stellarbound.space)

On Sat, 07 Nov 2020 10:17:20 +0300
"Leo" <list at gkbrk.com> wrote:
> I don't understand why a client caching responses is rude. Or rather,
> I don't understand who it is being rude to. When I configure my HTTP
> or Gemini browser to cache every response, is my browser now being
> rude to me? Is it being rude to the server?

The rude thing here would be having to serve large files over Gemini
and expect them to be served often, the protocol operates under the
assumption that caching does not exist, it's by convention, this
simplifies the client a LOT and removes such uncertainty when writing
dynamic content for Gemini.  Consider reading 2.1.1 in
gemini://gemini.circumlunar.space/docs/faq.gmi

> How can something that causes less resource usage on the server be
> rude to the server, or something I configured or downloaded as a
> "client that has caching" be rude to me for using it?

Because the server operates under the assumption that content is not
cached, if you're serving large files over Gemini you should look
somewhere else, this is not bittorrent, and if your server is eating up
a lot of resources, you're doing Gemini wrong, Gemini servers don't
have to be complicated, that's your own problem.  Consider using a
connection queue and serving connections one by one instead of forking or
multithreading because the protocol allows such simple design by
closing the connection right after the transaction, it's not like in
HTTP land where you have keep-alive.
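
For what it's worth, the one-connection-at-a-time style really is only
a handful of lines; here is a rough sketch (certificate paths and the
handle() function are placeholders, and error handling is minimal):

import socket, ssl

ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
ctx.load_cert_chain("cert.pem", "key.pem")

def handle(request_line):
    # placeholder: map the requested URL to a file or CGI output
    return b"20 text/gemini\r\n# hello\n"

with socket.create_server(("", 1965)) as srv:
    while True:                          # strictly one connection at a time
        conn, _addr = srv.accept()
        try:
            with ctx.wrap_socket(conn, server_side=True) as tls:
                request = tls.recv(1026).decode("utf-8").rstrip("\r\n")
                tls.sendall(handle(request))
        except (ssl.SSLError, OSError, UnicodeDecodeError):
            conn.close()                 # one bad client shouldn't kill us
        # the connection is closed right after the transaction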

> Is not doing everything a server sends being rude to the server
> operator? If a server sends a 100000000x100000000 image, is my image
> viewer being rude for refusing to decode/display it?

No, it's being sane; this does not apply here.

Does everyone here require a lecture on why their desired features
aren't in the protocol yet? seems to be the common point of discussion
here, as if the protocol is NOT ENOUGH, I don't know what brought your
interest here, did you see it as a great way of avoiding the current
scope creep of the modern web, or as a playground to satisfy your bad
ideas?

Do you have anything else to help the community with? perhaps hosting
content in the Gemini space? or helping in the development of tools
interacting with the protocol? or is your interest just satisfied when
the spec becomes ten times of its current size, then you can MAYBE
decide to use the protocol for yourself.

If any feature does not add a great value at an acceptable cost to the
simplicity of the protocol, consider it rejected before even proposing
it, I don't want to have a different experience browsing Gemini space
using netcat than using Kristall.

Link to individual message.

17. Martin Keegan (martin (a) no.ucant.org)

On Sat, 7 Nov 2020, Ali Fardan wrote:

> Does everyone here require a lecture on why their desired features
> aren't in the protocol yet? seems to be the common point of discussion
> here, as if the protocol is NOT ENOUGH, I don't know what brought your
> interest here, did you see it as a great way of avoiding the current
> scope creep of the modern web, or as a playground to satisfy your bad
> ideas?

As I've said too many times: Gemini offends people. Don't let them see you 
being annoyed when they try to provoke you.

Mk

Link to individual message.

18. John Cowan (cowan (a) ccil.org)

On Sat, Nov 7, 2020 at 11:44 AM Ali Fardan <raiz at stellarbound.space> wrote:

> The rude thing here would be having to serve large files over Gemini
> and expect them to be served often, the protocol operates under the
> assumption that caching does not exist
>

If clients aren't free to cache, then I'm not free to save a .gmi file on
my file system.  That's all a client-side cache is.

> Consider using a
> connection queue and serving connections one by one instead of forking or
> multithreading because the protocol allows such simple design by
> closing the connection right after the transaction, it's not like in
> HTTP land where you have keep-alive.
>

The advantage of not serving connections one by one is that it provides
better service to clients on a heavily-used server.  Right now there are no
heavily-used servers, but there's nothing in the Gemini ethos that says
"documents should only be of interest to a few".  That's sheer elitism.

> Does everyone here require a lecture on why their desired features
> aren't in the protocol yet?


Client caching has nothing to do with the protocol.  The idea of 22 is that
authors (not servers) may want to advise clients against caching in a
particular case.

> Do you have anything else to help the community with?
>

"I'm thinking!  I'm thinking!"  --Jack Benny during a holdup

> If any feature does not add a great value at an acceptable cost to the
> simplicity of the protocol, consider it rejected before even proposing
> it, I don't want to have a different experience browsing Gemini space
> using netcat than using Kristall.
>

If you are browsing with netcat, caching is not even an issue.  If nobody
wanted to serve dynamic content, 22 wouldn't be useful.  It is handy for
those who do want to, to communicate their intent.  No client and no server
has to implement this.



John Cowan          http://vrici.lojban.org/~cowan        cowan at ccil.org
A witness cannot give evidence of his age unless he can remember being born.
                --Judge Blagden

Link to individual message.

19. khuxkm (a) tilde.team (khuxkm (a) tilde.team)

November 7, 2020 11:43 AM, "Ali Fardan" <raiz at stellarbound.space> wrote:

> On Sat, 07 Nov 2020 10:17:20 +0300
> "Leo" <list at gkbrk.com> wrote:
> 
>> I don't understand why a client caching responses is rude. Or rather,
>> I don't understand who it is being rude to. When I configure my HTTP
>> or Gemini browser to cache every response, is my browser now being
>> rude to me? Is it being rude to the server?
> 
> The rude thing here would be having to serve large files over Gemini
> and expect them to be served often, the protocol operates under the
> assumption that caching does not exist, it's by convention, this
> simplifies the client a LOT and removes such uncertainty when writing
> dynamic content for Gemini. Consider reading 2.1.1 in
> gemini://gemini.circumlunar.space/docs/faq.gmi

Alright, bet. 2.1.1 in the FAQ says:

> Gemini aims to be simple, but not too simple. Gopher is simpler at
> a protocol level, but as a consequence the client is eternally
> uncertain: what character encoding is this text in? Is this text the
> intended content or an error message from the server? What kind of 
> file is this binary data? Because of this, a robust Gopher client is
> made less simple by needing to infer or guess missing information. 
> Early Gemini discussion included three clear goals with regard to
> simplicity:
>
> * It should be possible for somebody who had no part in designing
> the protocol to accurately hold the entire protocol spec in their
> head after reading a well-written description of it once or twice.
> * A basic but usable (not ultra-spartan) client should fit
> comfortably within 50 or so lines of code in a modern high-level
> language. Certainly not more than 100.
> * A client comfortable for daily use which implements every single
> protocol feature should be a feasible weekend programming project
> for a single developer.

Adding separate 2x status codes for "feel free to cache this" and
"don't cache this" would follow:

* The first goal: two extra codes aren't much
harder than remembering the status codes that exist today.
* The second goal: a basic client can simply behave as though it
got a 20 response code (that's the point of the 2-digit response
code, AFAICT).
* The third goal: handling them remains a feasible weekend project for
a client that "implements every single protocol feature".

>> How can something that causes less resource usage on the server be
>> rude to the server, or something I configured or downloaded as a
>> "client that has caching" be rude to me for using it?
> 
> Because the server operates under the assumption that content is not
> cached, if you're serving large files over Gemini you should look
> somewhere else, this is not bittorrent, and if your server is eating up
> a lot of resources, you're doing Gemini wrong, Gemini servers don't
> have to be complicated, that's your own problem. Consider using a
> connection queue and serving connections one by one instead of forking or
> multithreading because the protocol allows such simple design by
> closing the connection right after the transaction, it's not like in
> HTTP land where you have keep-alive.

Elitism much? Let me read you 2.1.3, from the very same FAQ you quoted:

> The "first class" application of Gemini is human consumption of
> predominantly written material - to facilitate something like
> gopherspace, or like "reasonable webspace" (e.g. something which is
> comfortably usable in Lynx or Dillo). But, just like HTTP can be, and
> is, used for much, much more than serving HTML, Gemini should be able to
> be used for as many other purposes as possible without compromising the
> simplicity and privacy criteria above. This means taking into account
> possible applications built around non-text files and non-human clients.

I'm not sure how you read this section, but it seems to me like the intent
is to be able to serve large files if you can.

>> Is not doing everything a server sends being rude to the server
>> operator? If a server sends a 100000000x100000000 image, is my image
>> viewer being rude for refusing to decode/display it?
> 
> No, it's being sane; this does not apply here.

Let me rephrase the question: if a server tells you you can cache the 
response, is your client being rude for refusing to cache it?

> Does everyone here require a lecture on why their desired features
> aren't in the protocol yet? seems to be the common point of discussion
> here, as if the protocol is NOT ENOUGH, I don't know what brought your
> interest here, did you see it as a great way of avoiding the current
> scope creep of the modern web, or as a playground to satisfy your bad
> ideas?

Again, your elitism is showing. I feel like adding a way to signal 
whether or not you can safely cache a response isn't really sacrificing 
any of the Gemini ethos: simplicity, privacy, or generality.

> If any feature does not add a great value at an acceptable cost to the
> simplicity of the protocol, consider it rejected before even proposing
> it, I don't want to have a different experience browsing Gemini space
> using netcat than using Kristall.

I fail to see how adding "safe to cache" and "do not cache" status codes 
would make using netcat different from Kristall or any other client. Just 
treat it as a 2x success status code, and move on.

Just my two cents,
Robert "khuxkm" Miles

Link to individual message.

20. Nathan Galt (mailinglists (a) ngalt.com)



> On Nov 5, 2020, at 9:52 PM, bie <bie at 202x.moe> wrote:
> 
> Now, I don't see the need for caching at all when it comes to
> gemini, but if it's something the gemini community wants or needs then
> I'd like to suggest a 2x status code that lets a client know that the
> response is unlikely to change and thus cacheable. This could even kind
> of match how the temporary/permanent redirect status codes are defined:
> 
> 20 - SUCCESS
> 21 - "PERMANENT" SUCCESS (Feel free to cache this)

Bit of a bikeshed comment, but "permanent success", especially the
"permanent" part, makes me think of Cache-Control: immutable. That is, the
server is claiming that the file will not change, ever. (Gemini doesn't
have conditional revalidation.)

https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Cache-Control

Link to individual message.

21. Leo (list (a) gkbrk.com)

I apologise if anyone got this email twice. I forgot to switch out my
email address before submitting to the list. Following is my original
reply.


Ok when I replied I thought you were going to argue about something
ideological about how the content creator should get the final say in
how their content is accessed, but it looks like you are arguing from a
technical viewpoint. What you are saying does not make a lot of sense to
me though.

> Rude clients are going to cache.

This is the statement that bothered me because it seemed very weird to
consider a client "rude" for not making requests. So I asked about it
for clarification.

> > I don't understand why a client caching responses is rude. Or rather,
> > I don't understand who it is being rude to. When I configure my HTTP
> > or Gemini browser to cache every response, is my browser now being
> > rude to me? Is it being rude to the server?
> The rude thing here would be having to serve large files over Gemini
> and expect them to be served often,

But here you are saying that the rude thing here is having to serve
large files over gemini and expecting them to be served often. This
would be the server being rude, and clients caching or not caching
resources would not have an impact on how large a server's responses are.

If anything, it would result in less traffic and at least partially
mitigate the issue. If your server is being _rude_ and sending large
responses, my client is not gonna make it worse by requesting it every
single time. Your server might have a lot of bandwidth, my client does
not.

> the protocol operates under the assumption that caching does not exist,
> it's by convention, this simplifies the client a LOT and removes such
> uncertainty when writing dynamic content for Gemini. Consider reading
> 2.1.1 in gemini://gemini.circumlunar.space/docs/faq.gmi

The protocol operates under the assumption that caching does not exist
in order to make clients and servers simpler. A client adding caching
would NOT affect a server negatively. If you consider caching clients to
be rude clients, too bad, you will never notice them because instead of
making extra requests, they are making less. I suppose you can ask them
to apologise for not making you spend money on compute/bandwidth etc.

> > How can something that causes less resource usage on the server be
> > rude to the server, or something I configured or downloaded as a
> > "client that has caching" be rude to me for using it?
> Because the server operates under the assumption that content is not
> cached, if you're serving large files over Gemini you should look
> somewhere else, this is not bittorrent, and if your server is eating up
> a lot of resources, you're doing Gemini wrong, Gemini servers don't
> have to be complicated, that's your own problem. Consider using a
> connection queue and serving connections one by one instead of forking or
> multithreading because the protocol allows such simple design by
> closing the connection right after the transaction, it's not like in
> HTTP land where you have keep-alive.

Again, you are talking about making servers more complicated. If my
client does not cache and your server sends large responses, this is
indeed a problem and you might have to make your server more complex to
handle it. Caching clients would make simpler server code possible, as
performance tricks would not be necessary. But even ignoring that, a
client that caches responses would not make your server more complex.

> > Is not doing everything a server sends being rude to the server
> > operator? If a server sends a 100000000x100000000 image, is my image
> > viewer being rude for refusing to decode/display it?
> No, it's being sane; this does not apply here.

So it is being sane when a server says "display this monstrous image"
and the client refuses, but it is insane when the server says "please
download this resource that you downloaded 3 seconds ago" and the client
says "I'll just use the response I have cached"?

> Does everyone here require a lecture on why their desired features
> aren't in the protocol yet? seems to be the common point of discussion
> here, as if the protocol is NOT ENOUGH, I don't know what brought your
> interest here, did you see it as a great way of avoiding the current
> scope creep of the modern web, or as a playground to satisfy your bad
> ideas?

The whole point of the discussion is that clients WILL cache resources.
Gemini benefits a lot from having a wide range of different clients. You
simply can't control everyone's use case to fit the way you think they
should consume content.

Just like some browsers/proxies cache HTTP and Gopher responses, some
people who are bandwidth constrained or simply not wasteful will choose
to cache their Gemini responses too. The discussion throws out the idea
of client HINTS that will say "This might be a dynamic response, maybe
don't cache it". Neither the server nor the client needs to take
advantage of this, a client can always cache or never cache, and a
server can expect everything to be cached or not cached. Just like
before.

I understand the desire to keep the spec short, and actually agree with
people on this. I believe a new response code for this is not necessary.
But a client that caches something is not being rude, at least not more
than a client that offers to save the resource to the disk or to send it
to a printer.

> Do you have anything else to help the community with? perhaps hosting
> content in the Gemini space? or helping in the development of tools
> interacting with the protocol? or is your interest just satisfied when
> the spec becomes ten times of its current size, then you can MAYBE
> decide to use the protocol for yourself.

Disagreeing with someone is one thing, but you don't have to attack them
like this. Before you shame other people, please enlighten us with the
tangible value that you have provided to the greater Gemini community.

> If any feature does not add a great value at an acceptable cost to the
> simplicity of the protocol, consider it rejected before even proposing
> it, I don't want to have a different experience browsing Gemini space
> using netcat than using Kristall.

I agree with this completely.

--
Leo

Link to individual message.

22. John Cowan (cowan (a) ccil.org)

On Sat, Nov 7, 2020 at 2:35 PM <khuxkm at tilde.team> wrote:


> Again, your elitism is showing. I feel like adding a way to signal
> whether or not you can safely cache a response isn't really sacrificing
> any of the Gemini ethos: simplicity, privacy, or generality.
>

+1

> I fail to see how adding "safe to cache" and "do not cache" status codes
> would make using netcat different from Kristall or any other client. Just
> treat it as a 2x success status code, and move on.
>

In addition, if you click on a link saying "Random cool picture!", first of
all that's safe (no pixel bug in it), and second, if you notice that it's
the same picture as last time, you do a refresh, causing your client to
override its cache, and you see a different picture.  Everybody wins.



John Cowan          http://vrici.lojban.org/~cowan        cowan at ccil.org
There is no real going back.  Though I may come to the Shire, it will
not seem the same; for I shall not be the same.  I am wounded with
knife, sting, and tooth, and a long burden.  Where shall I find rest?
                --Frodo

Link to individual message.

23. Ali Fardan (raiz (a) stellarbound.space)

On Sat, 7 Nov 2020 13:42:57 -0500
John Cowan <cowan at ccil.org> wrote:
> If clients aren't free to cache, then I'm not free to save a .gmi
> file on my file system.  That's all a client-side cache is.

You're free to save your served documents; just don't operate under the
assumption that they're permanent. That's how caching can break current
assumptions.

> The advantages of not serving connections one by one is that it
> provides better service to clients on a heavily-used server.  Right
> now there are no heavily-used servers, but there's nothing in the
> Gemini ethos that says "documents should only be of interest to a
> few".  That's sheer elitism.

I'm not telling you how to implement your server; I just don't like
raising the barrier to entry for server implementations. Right now, I
can use inetd to serve Gemini; if keeping a connection alive were a
requirement in the future, that would no longer be the case.

> Client caching has nothing to do with the protocol.  The idea of 22
> is that authors (not servers) may want to advise clients against
> caching in a particular case.

How would a gemtext author instruct the server not to cache their gemtext
document? Do we now resort to writing response headers in gemtext? Is
that the solution?

Or suppose you leave it as an undefined implementation detail how the
server distinguishes static from dynamic content, and every server goes
its own way of implementing it, whether through configuration files or
some other means; now 20 becomes obsolete, because existing server
implementations can't be sure that clients will not cache their dynamic
content. Brilliant.

> If you are browsing with netcat, caching is not even an issue.  If
> nobody wanted to serve dynamic content, 22 wouldn't be useful.  It is
> handy for those who do want to, to communicate their intent.  No
> client and no server has to implement this.

If 22 is explicit no caching response, how would 20 be redefined?

Whenever something becomes optional, but de-facto implementations cover
it, and the whole ecosystem operates under the assumption of its
presence, the feature shifts from being optional to being a barrier to
entry.

Link to individual message.

24. Ali Fardan (raiz (a) stellarbound.space)

On Sat, 07 Nov 2020 19:35:02 +0000
khuxkm at tilde.team wrote:
> Elitism much? Let me read you 2.1.3, from the very same FAQ you
> quoted:
> 
> > The "first class" application of Gemini is human consumption of
> > predominantly written material - to facilitate something like
> > gopherspace, or like "reasonable webspace" (e.g. something which is
> > comfortably usable in Lynx or Dillo). But, just like HTTP can be,
> > and is, used for much, much more than serving HTML, Gemini should
> > be able to be used for as many other purposes as possible without
> > compromising the simplicity and privacy criteria above. This means
> > taking into account possible applications built around non-text
> > files and non-human clients.  
> 
> I'm not sure how you read this section, but it seems to me like the
> intent is to be able to serve large files if you can.

Indeed, no arguments here. However, large files typically wouldn't need
any caching: you download them once and save them to disk. These include
ISOs, FLACs, JPEGs and so on.

And again:

> The "first class" application of Gemini is human consumption of
> predominantly written material

Written material in gemtext wouldn't need caching because it's a few
kilobytes.

> > Does everyone here require a lecture on why their desired features
> > aren't in the protocol yet? seems to be the common point of
> > discussion here, as if the protocol is NOT ENOUGH, I don't know
> > what brought your interest here, did you see it as a great way of
> > avoiding the current scope creep of the modern web, or as a
> > playground to satisfy your bad ideas?  
> 
> Again, your elitism is showing. I feel like adding a way to signal 
> whether or not you can safely cache a response isn't really
> sacrificing any of the Gemini ethos: simplicity, privacy, or
> generality.

If you consider keeping the core philosophy (what I suppose it should
be) from solutions to non-existent problems elitism, then have fun
comforting each other and entertaining bad design choices.

Once caching becomes standard, there's nothing stopping servers from
serving large content just because major client implementations will
cache it anyway, and this turns from optional to *if you don't support
it, your client is limited*. What next?

And I'm gonna say it again, if the protocol is centered around serving
plain text documents, why would caching be a concern? seems to me that
just invites extension and the possibility of generalizing the protocol
the same way HTTP went.

Link to individual message.

25. Ali Fardan (raiz (a) stellarbound.space)

On Sun, 08 Nov 2020 00:41:10 +0300
"Leo" <list at gkbrk.com> wrote:
> This is the statement that bothered me because it seemed very weird to
> consider a client "rude" for not making requests. So I asked about it
> for clarification.

Because as of now, there's no way to indicate whether content is
dynamic or static.

> But here you are saying that the rude thing here is having to serve
> large files over gemini and expecting them to be served often. This
> would be the server being rude, and clients caching or not caching
> resources would not have an impact on how large a server's reponses
> are.
>
> If anything, it would result in less traffic and at least partially
> mitigate the issue. If your server is being _rude_ and sending large
> responses, my client is not gonna make it worse by requesting it every
> single time. Your server might have a lot of bandwidth, my client does
> not.

Gemini responses should typically be in the kilobyte range, as they are
plain text. Unless you're serving MP3s or ISOs, which are saved to disk
permanently anyway, there's no point in having caching here.

> The protocol operates under the assumption that caching does not exist
> in order to make clients and servers simpler. A client adding caching
> would NOT affect a server negatively. If you consider caching clients
> to be rude clients, too bad, you will never notice them because
> instead of making extra requests, they are making less. I suppose you
> can ask them to apologise for not making you spend money on
> compute/bandwidth etc.

Again, the protocol by convention should not go in the direction of
bittorrent; plain text documents aren't large.

> Again, you are talking about making servers more complicated. If my
> client does not cache and your server sends large responses, this is
> indeed a problem and you might have to make your server more complex
> to handle it. Caching clients would make simpler server code
> possible, as performance tricks would not be necessary. But even
> ignoring that, a client that caches responses would not make your
> server more complex.

They won't take the burden off the server; instead, they add more: now
the server has to tell the client which content to cache and which it
should not.

> So it is being sane when a server says "display this monstrous image"
> and the client refuses, but it is insane when the server says "please
> download this resource that you downloaded 3 seconds ago" and the
> client says "I'll just use the response I have cached"?

Resources are identified using URIs, and that's all the client knows. How
does it know that the resource hasn't been altered since the last time it
was accessed? The client could be showing an outdated document in the
case of dynamic content.

> The whole point of the discussion is that clients WILL cache
> resources. Gemini benefits a lot from having a wide range of
> different clients. You simply can't control everyone's use case to
> fit the way you think they should consume content.

Different use cases can be satisfied with different protocols.

> Just like some browsers/proxies cache HTTP and Gopher responses, some
> people who are bandwidth constrained or simply not wasteful will
> choose to cache their Gemini responses too. The discussion throws out
> the idea of client HINTS that will say "This might be a dynamic
> response, maybe don't cache it". Neither the server nor the client
> needs to take advantage of this, a client can always cache or never
> cache, and a server can expect everything to be cached or not cached.
> Just like before.

Again, there shouldn't be a point where caching is mandatory if plain
text documents are being served.

> Disagreeing with someone is one thing, but you don't have to attack
> them like this.

I'm sorry you feel offended.

> Before you shame other people, please enlighten us
> with the tangible value that you have provided to the greater Gemini
> community.

Not proposing solutions to non-existent problems.

> > If any feature does not add a great value at an acceptable cost to
> > the simplicity of the protocol, consider it rejected before even
> > proposing it, I don't want to have a different experience browsing
> > Gemini space using netcat than using Kristall.  
> 
> I agree with this completely.

Then there's nothing to discuss further.

Link to individual message.

26. khuxkm (a) tilde.team (khuxkm (a) tilde.team)

Just wanted to pop in:

November 7, 2020 8:15 PM, "Ali Fardan" <raiz at stellarbound.space> wrote:

> On Sat, 7 Nov 2020 13:42:57 -0500
> John Cowan <cowan at ccil.org> wrote:
> 
> -snip-
> 
>> If you are browsing with netcat, caching is not even an issue. If
>> nobody wanted to serve dynamic content, 22 wouldn't be useful. It is
>> handy for those who do want to, to communicate their intent. No
>> client and no server has to implement this.
> 
> If 22 is explicit no caching response, how would 20 be redefined?

20 wouldn't be redefined. A status code of 20 would simply have no
assumptions as to the cacheability of a resource (i.e., cache at your own
risk). Meanwhile, 21 and 22 would be there for CGI, etc., that can return them.
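
As a toy illustration of the CGI side (the 22 code is hypothetical, and
this assumes a server that lets a CGI script write the whole response,
header line included):

#!/usr/bin/env python3
import random, sys

# dynamic output, so opt out of caching with the *proposed* 22 code;
# against today's spec this would have to be a plain 20
sys.stdout.write("22 text/gemini\r\n")
sys.stdout.write("# Number of the moment\n%d\n" % random.randint(1, 100))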

Just my two cents,
Robert "khuxkm" Miles

Link to individual message.

27. Sean Conner (sean (a) conman.org)

It was thus said that the Great khuxkm at tilde.team once stated:
> November 7, 2020 8:15 PM, "Ali Fardan" <raiz at stellarbound.space> wrote:
> > On Sat, 7 Nov 2020 13:42:57 -0500
> > John Cowan <cowan at ccil.org> wrote:
> > 
> >> If you are browsing with netcat, caching is not even an issue. If
> >> nobody wanted to serve dynamic content, 22 wouldn't be useful. It is
> >> handy for those who do want to, to communicate their intent. No
> >> client and no server has to implement this.
> > 
> > If 22 is explicit no caching response, how would 20 be redefined?
> 
> 20 wouldn't be redefined. A status code of 20 would simply have no
> assumptions as to the cacheability of a resource (i.e; cache at your own
> risk). Meanwhile, 21 and 22 would be there for CGI, etc. that can return
> them.

  Ah, clashing proposals!  How wonderful!

  In another thread, not at all related to caching, we have prisonpotato at
tilde.team who said:

> This seems like a neat solution to this problem to me, but I'm not sure if
> it would work at this stage of gemini's life cycle.  There are also of
> course the issues with dynamically sized responses as generated by CGI
> scripts and stuff like that, so maybe we could introduce a new response
> code, like 22: Response with size.
> 
> 20 text/gemini
> 22 100 text/gemini
> 
> This solves both problems by making content length optional again, but
> exposes a risk that this type of extension could be used to add more fields

(gemini://gemi.dev/gemini-mailing-list/messages/003010.gmi)

and John Cowan, who said this in this thread:

> I agree, except that I am in favor of code 22 meaning "It is inadvisable
> to cache this", on the assumption that most Gemini documents are static
> and will continue to be so.  Even on the Web, most documents are static. 
> If there is to be just one new code, better it should be 22.  If people
> feel strongly about 21, then both 21 and 22.

  So, which is it?  Sizes?  Or caching?  Or I suppose we could do all the
above:

	20	status, no size
	21	status, size
	22	cache, no size
	23	cache, size
	24	no-cache, no size
	25	no-cache, size

and before you know it:

	20	status, no size, no future feature
	21	status, size, no future feature
	22	cache, no size, no future feature
	23	cache, size, no future feature
	24	no-cache, no size, no future feature
	25	no-cache, size, no future feature

	26	status, no size, future feature
	27	status, size, future feature
	28	cache, no size, future feature
	29	cache, size, future feature
	30	no-cache, no size, future feature
	31	no-cache, size, future feature ... oh, wait a second ...

  We've run out of status codes and are crashing into the next block.  It
may seem silly to worry about future features now, but hey, the future
comes eventually.  Even *if* the size doesn't get its own status code, I
think my argument stands---features can mix, and if they can mix, the
number of status codes explodes:

	20	status
	21	cache
	22	no-cache
	23	status, future feature 1
	24	cache, future feature 1
	25	no-cache, future feature 1
	26	status, no future feature 1, future feature 2
	27	cache, no future feature 1, future feature 2
	28	no-cache, no future feature 1, future feature 2
	29	status, future feature 1, future feature 2
	30	... uh oh ...
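
The arithmetic behind that explosion is easy to check: each independent
feature multiplies the number of codes needed, and the 2x block only has
ten slots (20-29). The feature counts below just mirror the lists above:

features = [
    ("caching", 3),           # unspecified / cache / no-cache
    ("size", 2),              # with or without a size field
    ("future feature 1", 2),
    ("future feature 2", 2),
]

codes_needed = 1
for name, options in features:
    codes_needed *= options
    over = "  <-- past the ten 2x codes" if codes_needed > 10 else ""
    print("+ %-16s -> %2d codes needed%s" % (name, codes_needed, over))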

  I have my own ideas about caching, but I want to cobble up a
proof-of-concept first before I talk about it more, because from where I
come from, working code is worth more than talk.

  -spc

Link to individual message.

28. khuxkm (a) tilde.team (khuxkm (a) tilde.team)

November 8, 2020 12:18 AM, "Sean Conner" <sean at conman.org> wrote:

> It was thus said that the Great khuxkm at tilde.team once stated:
> 
>> November 7, 2020 8:15 PM, "Ali Fardan" <raiz at stellarbound.space> wrote:
>>> On Sat, 7 Nov 2020 13:42:57 -0500
>>> John Cowan <cowan at ccil.org> wrote:
>>> 
>>>> If you are browsing with netcat, caching is not even an issue. If
>>>> nobody wanted to serve dynamic content, 22 wouldn't be useful. It is
>>>> handy for those who do want to, to communicate their intent. No
>>>> client and no server has to implement this.
>>> 
>>> If 22 is explicit no caching response, how would 20 be redefined?
>> 
>> 20 wouldn't be redefined. A status code of 20 would simply have no
>> assumptions as to the cacheability of a resource (i.e; cache at your own
>> risk). Meanwhile, 21 and 22 would be there for CGI, etc. that can return
>> them.
> 
> Ah, clashing proposals! How wonderful!
> 
> In another thread, not at all related to caching, we have prisonpotato at
> tilde.team who said:
> 
>> This seems like a neat solution to this problem to me, but I'm not sure if
>> it would work at this stage of gemini's life cycle. There are also of
>> course the issues with dynamically sized responses as generated by CGI
>> scripts and stuff like that, so maybe we could introduce a new response
>> code, like 22: Response with size.
>> 
>> 20 text/gemini
>> 22 100 text/gemini
>> 
>> This solves both problems by making content length optional again, but
>> exposes a risk that this type of extension could be used to add more fields
> 
> (gemini://gemi.dev/gemini-mailing-list/messages/003010.gmi)
> 
> and John Cowan, who said this in this thread:
> 
>> I agree, except that I am in favor of code 22 meaning "It is inadvisable
>> to cache this", on the assumption that most Gemini documents are static
>> and will continue to be so. Even on the Web, most documents are static.
>> If there is to be just one new code, better it should be 22. If people
>> feel strongly about 21, then both 21 and 22.
> 
> So, which is it? Sizes? Or caching? Or I suppose we could all the
> above:
> 
> 20 status, no size
> 22 status, size
> 21 cache, no size
> 22 cache, size
> 23 no-cache, no size
> 24 no-cache, size
> 
> and before you know it:
> 
> 20 status, no size, no future feature
> 22 status, size, no future feature
> 21 cache, no size, no future feature
> 22 cache, size, no future feature
> 23 no-cache, no size, no future feature
> 24 no-cache, size, no future feature
> 
> 25 status, no size, future feature
> 26 status, size, future feature
> 27 cache, no size, future feature
> 28 cache, size, future feature
> 29 no-cache, no size, future feature
> 30 no-cache, size, future feature ... oh, wait a second ...
> 
> We're done out of status codes and crashing into the next block. It may
> seem silly to worry about future feature now, but hey, the future comes
> eventually. Even *if* the size doesn't get its own status code, I think my
> argument stands---features can mix, and if they can mix, the number of
> status code explodes:
> 
> 20 status
> 21 cache
> 22 no-cache
> 23 status, future feature 1
> 24 cache, future feature 1
> 25 no-cache, future feature 1
> 26 status, no future feature 1, future feature 2
> 27 cache, no future feature 1, future feature 2
> 28 no-cache, no future feature 1, future feature 2
> 29 status, future feature 1, future feature 2
> 30 ... uh oh ...
> 
> I have my own ideas about caching, but I want to cobble up a
> proof-of-concept first before I talk about it more, because from where I
> come from, working code is worth more than talk.
> 
> -spc

This can't happen, though, because the first proposal breaks the 
consistency of <META> across response codes within a block, and the second 
one is just a debate over which of the codes we should add.

Cache/no-cache would be 2 (at most) response codes. That's all.

I'm also going to try and put together some basic code to demonstrate how 
I think this should work. Maybe, then, it'll be a bit clearer.

Just my two cents,
Robert "khuxkm" Miles

Link to individual message.

29. Sean Conner (sean (a) conman.org)

It was thus said that the Great khuxkm at tilde.team once stated:
> 
> This can't happen, though, because the first proposal breaks the
> compatibility of <META> in response codes within a block, and the second
> one is just debating which of the codes we should add.

  This *did* happen, back on November 3rd.  

	gemini://gemi.dev/gemini-mailing-list/messages/003015.gmi

  And it's already received over a hundred hits.

> Cache/no-cache would be 2 (at most) response codes. That's all.

  And my argument, even with the size response code removed, can *still*
lead to a combinatoric explosion of response codes.  Today it's just *two*,
but what about tomorrow?

  -spc

Link to individual message.

30. Zach DeCook (zachdecook (a) librem.one)

>We're done out of status codes and crashing into the next block.  It may
>seem silly to worry about future feature now, but hey, the future comes
>eventually.  Even *if* the size doesn't get its own status code, I think my
>argument stands---features can mix, and if they can mix, the number of
>status code explodes:
>
>  from where I come from, working code is worth more than talk.

I think there's still a long way for gemini clients to come before 
demanding the Future Inevitable Feature. Like, the more thought that goes 
into this, the less likely we'll run out of status codes.
One way a client could implement caching is by loading the cached version 
of a page, and telling the user when it was downloaded, next to a 
'refresh' button. A gemini client could even offer a diff view between the 
current page and the cached copy. All of that *empowers the users* over 
the servers, and doesn't rely on adding anything to the spec.
A "please don't cache" response code would be (ab)used by servers who 
desire to track their users.
-Zach

Link to individual message.

31. Sudipto Mallick (smallick.dev (a) gmail.com)

This is how I *think* clients *should* work with caching:

Clients with history support, that is, clients that can go ''backwards''
and ''forwards'' through history, should cache text/* responses in memory
for that browsing session. When the user moves through the history using
the ''forward'' and ''backward'' actions, no reloading should happen.
But when the user clicks a link to a resource already in the cache,
writes the link by hand, selects the link from previously visited links
or asks for a reload: the cache entry is purged and the resource is
reloaded. It is assumed that requests with a query part are idempotent.
Now, when a page is dynamic, it should be stated as such so that the
user knows to reload that page.
With that, no new response codes.
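
As a rough sketch of that policy (hypothetical client code; the names and
the fetch() helper are made up for illustration, not taken from any real
client):

    # In-memory session cache following the policy described above.
    cache = {}  # url -> (mime, body), kept only for this browsing session

    def visit(url, fetch, from_history=False):
        # Back/forward navigation reuses the cached copy without a new request.
        if from_history and url in cache:
            return cache[url]
        # A clicked link, hand-typed URL or explicit reload purges and refetches.
        cache.pop(url, None)
        mime, body = fetch(url)           # fetch() performs the actual Gemini request
        if mime.startswith("text/"):      # only text/* responses are kept in memory
            cache[url] = (mime, body)
        return mime, body

Nothing here touches the protocol; it is all client-side bookkeeping.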

~smlckz

Link to individual message.

32. Zach DeCook (zachdecook (a) librem.one)

On November 8, 2020 12:59:09 AM EST, Sudipto Mallick <smallick.dev at gmail.com> wrote:
>It is assumed that requests with query part ...

That client-implementation could lead to strange uses of the query part...
But if it's a result of a 10 or 11 response, it would make sense to make 
the request fresh.
mbays at sdf.org is right, the user should be in control.
A caching client could simply split links to cached resources into two 
parts: clicking one part would pull the cached version, clicking the other 
would make a new request.
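
Rendering that two-part link could be as simple as the following sketch
(illustrative only; no client named here actually does this):

    def render_link(label, url, in_cache):
        # One clickable part opens the saved copy, the other forces a fresh request.
        if in_cache:
            return f"{label}  [cached copy]  [fetch fresh: {url}]"
        return f"{label}  [fetch: {url}]"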
-Zach

Link to individual message.

33. khuxkm (a) tilde.team (khuxkm (a) tilde.team)

November 8, 2020 12:52 AM, "Zach DeCook" <zachdecook at librem.one> wrote:

>> We're done out of status codes and crashing into the next block. It may
>> seem silly to worry about future feature now, but hey, the future comes
>> eventually. Even *if* the size doesn't get its own status code, I think my
>> argument stands---features can mix, and if they can mix, the number of
>> status code explodes:
>> 
>> from where I come from, working code is worth more than talk.
> 
> I think there's still a long way for gemini clients to come before
> demanding the Future Inevitable Feature. Like, the more thought that goes
> into this, the less likely we'll run out of status codes.
> One way a client could implement caching is by loading the cached version
> of a page, and telling the user when it was downloaded, next to a
> 'refresh' button. A gemini client could even offer a diff view between the
> current page and the cached copy. All of that *empowers the users* over
> the servers, and doesn't rely on adding anything to the spec.
> A "please don't cache" response code would be (ab)used by servers who
> desire to track their users.
> -Zach

Again, clients are free to ignore the "do not cache" response code as they 
wish. It would be considered bad form, and it may confuse the everloving 
hell out of somebody who's requesting a random picture and keeps getting 
the same one, but you're free to do it, just as you're free to cache any response now.

Just my two cents,
Robert "khuxkm" Miles

(PS: still working on that implementation)

Link to individual message.

34. khuxkm (a) tilde.team (khuxkm (a) tilde.team)

November 8, 2020 12:18 AM, "Sean Conner" <sean at conman.org> wrote:

> I have my own ideas about caching, but I want to cobble up a
> proof-of-concept first before I talk about it more, because from where I
> come from, working code is worth more than talk.
> 
> -spc

Speaking of cobbling up proof-of-concepts, I've created a proof-of-concept 
of how I feel these cache-hint success codes would work:

https://gist.github.com/febd3f5ae2308e8b55449a92c6e58a65

(Yes, I know it's on GitHub, but I have a shell script to make Gists from 
the command line and so I want to use it.)

This includes a spartan client (it literally just spits the raw protocol 
response out at you) with caching behavior influenced by 21/22 (in 
practice, it caches all 2x responses except for 22 responses), as well as 
examples of endpoints to hit that return each code.
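
For anyone who cannot reach the gist, the rule it demonstrates boils down
to roughly this sketch (not the gist's actual code; 21 and 22 are the
proposed codes, not part of the current spec):

    def maybe_cache(cache, url, status, meta, body):
        # Cache any 2x success except the proposed 22 "do not cache" response.
        if 20 <= status <= 29 and status != 22:
            cache[url] = (status, meta, body)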

Hopefully my working code will prove my point better than I could in words.

Just my two cents,
Robert "khuxkm" Miles

Link to individual message.

35. Sean Conner (sean (a) conman.org)

It was thus said that the Great khuxkm at tilde.team once stated:
> November 8, 2020 12:18 AM, "Sean Conner" <sean at conman.org> wrote:
> 
> > I have my own ideas about caching, but I want to cobble up a
> > proof-of-concept first before I talk about it more, because from where I
> > come from, working code is worth more than talk.
> > 
> > -spc
> 
> Speaking of cobbling up proof-of-concepts, I've created a proof-of-concept
> of how I feel these cache-hint success codes would work:
> 
> https://gist.github.com/febd3f5ae2308e8b55449a92c6e58a65
> 
> (Yes, I know it's on GitHub, but I have a shell script to make Gists from
> the command line and so I want to use it.)
> 
> This includes a spartan client (it literally just spits the raw protocol
> response out at you) with caching behavior influenced by 21/22 (in
> practice, it caches all 2x responses except for 22 responses), as well as
> examples of endpoints to hit that return each code.
> 
> Hopefully my working code will prove my point better than I could in words.

  And I have my "proof-of-concept" up as well.  It's at

	gemini://gemini.conman.org/test/testcache.gemini

  My approach is a bit different, probably a bit harder to implement server
side, but deals not only with the caching issue, but with repeated requests. 
I'm not sure how popular it will be, but hey, it's out there, and it only
adds one general purpose status code (23 for now) that means:

	okay, request was okay, but there is no data to serve you.

  How it works:  A plain request:

	gemini://gemini.conman.org/test/cachefile.txt

will always return the content.  However, you can include a timestamp as a
path parameter (which is *NOT* the same as a query parameter) in ISO-8601
format:

	gemini://gemini.conman.org/test/cachefile.txt;2020-11-08T00:00:00

If the file is *newer* than that timestamp, you get the normal response of
20 and all the content; otherwise you get a response of 23 (with the normal
MIME type) and no content, meaning it hasn't changed since the given date.
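
Server side, that check could look roughly like the sketch below
(illustrative only; it assumes a hard-coded MIME type, naive local
timestamps, and the experimental 23 code described above):

    # Answer 23 when the file hasn't changed since the supplied path parameter.
    import os
    from datetime import datetime

    def respond(path, timestamp_param=None):
        mtime = datetime.fromtimestamp(os.path.getmtime(path))
        if timestamp_param:
            since = datetime.fromisoformat(timestamp_param)   # e.g. 2020-11-08T00:00:00
            if mtime <= since:
                return b"23 text/plain\r\n"                   # unchanged: header only
        with open(path, "rb") as f:
            return b"20 text/plain\r\n" + f.read()            # changed, or no parameter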

  This means a client that doesn't wish to deal with caching at all will
never see any difference.  A client that does, or that (in the case of
feeds) doesn't want to always download content that hasn't changed, can do
that as well.

  -spc

Link to individual message.

36. boringcactus (boringcactus (a) gmail.com)

On 2020-11-08 12:49 AM, Sean Conner wrote:
>    My approach is a bit different, probably a bit harder to implement server
> side, but deals not only with the caching issue, but with repeated requests.
> I'm not sure how popular it will be, but hey, it's out there, and it only
> adds one general purpose status code (23 for now) that means:
> 
> 	okay, request was okay, but there is no data to serve you.
You could get a similar effect with 44 SLOW DOWN as it already exists, 
and boom, now the protocol doesn't need to know about caching at all.

Link to individual message.

37. Philip Linde (linde.philip (a) gmail.com)

On Sun, 8 Nov 2020 00:50:18 -0500
Sean Conner <sean at conman.org> wrote:

> It was thus said that the Great khuxkm at tilde.team once stated:
> > 
> > This can't happen, though, because the first proposal breaks the
> > compatibility of <META> in response codes within a block, and the second
> > one is just debating which of the codes we should add.
> 
>   This *did* happen, back on November 3rd.  
> 
> 	gemini://gemi.dev/gemini-mailing-list/messages/003015.gmi
> 
>   And it's already received over a hundred hits.

You're mistaking the sense in which khuxkm is saying that it can't
happen. It did happen in the sense that you implemented a server for a
protocol that works like this.

It can't happen in the sense that it fundamentally breaks compatibility
with clients that only concern themselves with the first digit of the
response code. That you've implemented the resulting protocol of your
suggested change has no bearing on the argument.

The overflowing of 2x code resulting from a combinatorial hell is
entirely self-inflicted through your choice to ignore this aspect of
the existing specification.

I've already suggested to you a way to avoid this (use a different
first digit). This additionally avoids having to rewrite 3.2 of the
spec and invalidate existing clients that take for granted that "the
first digit alone provides enough information for a client to determine
how to handle the response".

Moreover, with TLS already having a mechanism to signal the intended
end of a connection, I don't think that content size is a pressing
issue. It would allow for download progress bars, but adds nothing
over TLS in terms of ensuring that the content is fully received.

-- 
Philip

Link to individual message.

38. Ali Fardan (raiz (a) stellarbound.space)

On Sun, 8 Nov 2020 11:29:09 +0530
Sudipto Mallick <smallick.dev at gmail.com> wrote:
> This is what I *think* how *should* clients work with caching:
> 
> The clients with history support and supports going ''backwards'' and
> ''forwards'' through history should cache text/* responses in memory
> for that browsing session. When the user browses through the history
> using ''forward'' and ''backward'' action, no reloading should happen.
> But, when a user clicks the link for a resource already in cache or
> writes the link by hand or selects the link from previously visited
> links or asks for reload: the cache is purged and the resource
> reloaded. It is assumed that requests with query part are idempotent.
> Now, when a page is dynamic, it should be stated as such so that the
> user would reload that page.
> With that, no new response codes.


This is the creativity I like to see when dealing with limited
environments: a solution that requires no change to the protocol if a
client developer feels the urge to have caching, though I still don't
understand why it should be necessary for Gemini (nobody has given me an
answer).

Great solution, but I doubt anyone will listen to you, considering the
current climate of discussion is focused heavily on adding more.

Link to individual message.

39. James Tomasino (tomasino (a) lavabit.com)

On 11/8/20 3:44 PM, Ali Fardan wrote:
> if a client developer feels the urge to have caching which I
> still don't understand why it should be necessary for Gemini. (everybody
> refused to give me an answer)

I'm not 100% sure if it requires caching or not, but what I would want 
from a client with a back button is for it to take me back to the scroll 
position I was at in the document before I clicked. Some clients are already 
handling this use case (hooray!). This may be a greater concern for 
graphical clients than text ones, unless the client is handling paging 
itself. Losing your place can be frustrating when moving through a series 
of inter-linked documents. Imagine going through older CAPCOM articles, 
for instance. Every time you click back, having to scroll back down to 
wherever you were becomes obnoxious and a reason to just stop. 

Again, I'm not sure if caching is involved in it or not, but that's the 
best thing I could come up with as an answer to your question.

Link to individual message.

40. Ali Fardan (raiz (a) stellarbound.space)

On Sun, 8 Nov 2020 16:21:03 +0000
James Tomasino <tomasino at lavabit.com> wrote:
> Again, I'm not sure if caching is involved in it or not, but that's
> the best thing I could come up with as an answer to your question.

This can be done by remembering the offset, percentage-wise, in the
client for each previous page. However, without caching this could break
if the document changes in the meantime. If this is a compelling reason
to have caching, I'd suggest Mallick's method, as it doesn't involve any
changes to the protocol and it fits the context very well.

Link to individual message.

41. Sudipto Mallick (smallick.dev (a) gmail.com)

On 11/8/20, James Tomasino <tomasino at lavabit.com> wrote:
> I'm not 100% sure if it requires caching or not, but what I would want from
> a client with a back button is to take me back to where I was in the
> document's scroll position before I clicked. Some clients are already
> handling this use case (hooray!). This may be a greater concern for
> graphical clients than text ones, unless the client is handling paging
> itself. Losing your place can be frustrating when moving through a series of
> inter-linked documents. Imagine going through older CAPCOM articles, for
> instance. Every time you click back needing to scroll back down to wherever
> you were becomes obnoxious and a reason to just stop.

Now in that case, if the client does not cache, the page is reloaded.
Even if the client remembers the position where you were before in
that page, imagine the scenario where the page gets changed before
reloading (say, new links added to, or removed from, that page); then
you have to scroll anyway. (In the worst case, imagine that the link you
clicked does not exist in the new page.)

> Again, I'm not sure if caching is involved in it or not, but that's the best
> thing I could come up with as an answer to your question.
So caching is required. Your client must keep that page in cache along
with your last position in that page for the feature you want.
~smlckz

Link to individual message.

42. Waweic (waweic (a) protonmail.com)

> This is the creativity I like to see when dealing with limited
> environments, this is a solution that requires no change in the
> protocol, if a client developer feels the urge to have caching which I
> still don't understand why it should be necessary for Gemini. (everybody
> refused to give me an answer)
>
> Great solution, but I doubt anyone would listen to you considering the
> current climate of discussion being focused heavily on adding more.

I really like this approach, as I like the protocol as-is. I am currently 
working on a client for Android, and I am planning to make heavy use of 
caching, although I do not think a change in the protocol is necessary. In 
my use case, I have a data plan that's free, but just gives me flaky 
32kbit/s at best, usually far less. (It's called a "messaging option".) 
Loading a page on Gemini usually takes multiple seconds. This would be 
similar for packet radio and similar applications. Without caching, this 
is a huge PITA. I will take care to have a "Reload" button once I use 
caching, so the users themselves can decide when new content should be fetched.

I want to stress that caching is necessary in my use case. It's a much 
more needed feature than, say, client certificate support. At the moment, 
the majority of content on Gemini is static and I believe it will continue to be.

- Waweic

Link to individual message.

43. Ali Fardan (raiz (a) stellarbound.space)

On Sun, 08 Nov 2020 17:21:17 +0000
Waweic <waweic at protonmail.com> wrote:
> I really like this approach, as I like the protocol as-is.

Wonderful.

> I want to stress that caching is neccessary in my usecase. It's a
> much more needed feature than, say, client certificate support. At
> the moment, the majority of Content on Gemini is static and I believe
> it will continue to be.

Go for it.

Link to individual message.

44. Sean Conner (sean (a) conman.org)

It was thus said that the Great Philip Linde once stated:
> 
> The overflowing of 2x code resulting from a combinatorial hell is
> entirely self-inflicted through your choice to ignore this aspect of
> the existing specification.

  I don't agree that this is self-inflicted on me---what I'm trying (and
failing) to point out is that adding more success codes *could* lead to
combinatorial hell.  HTTP already has 10 success codes in the 200 range, and
none of them relate to caching status (since caching status is passed along
in headers).  Granted, none of the HTTP success statuses (with the exception
of 200, which is mapped to 20 in Gemini) apply to Gemini, but in another
universe where HTTP *did* have separate success codes for caching
information, I can see some combinatoric increase, say for 204, 205 and 206,
each with nothing-said, cache, and no-cache variants (that's nine new ones
right there).

> I've already suggested to you a way to avoid this (use a different 
> first digit). 

  No, that was to prisonpotato at tilde.team---they were the first to come up
with the idea, not me.  I just implemented it first (much the same way I
implemented the first Gemini server even before solderpunk did [1]).  Also,
the size-breaking code is only active on one link on my site, not everywhere.

> This additionally avoids having to rewrite 3.2 of the
> spec and invalidate existing clients that take for granted that "the   
> first digit alone provides enough information for a client to determine
> how to handle the response".
> 
> Moreover, with TLS already having a mechanism to signal the intended
> end of a connection, I don't think that content size is a pressing
> issue. It would allow for download progress bars, but adds nothing
> over TLS in terms of ensuring that the content is fully received.

  The concern is over large responses.  It wasn't much of a concern until
gemini://konpeito.media/ was created and serving up large audio files (and
archives of said audio files).  I can envision a client being configured to
abort the download if, say, a 10 megabyte file is being downloaded.  It
*sucked* when my DSL went down in late September/early October (yes, about
three weeks) and I had to rely upon my cellphone hot spot.  I didn't have a
large data plan for the cell phone because I didn't need it, until I did.
It would have been nice to configure my web browser to not download anything
over 5M at that point.

  -spc

[1]     I did it completely ignoring the status codes defined at the time,
        because I felt they were too limiting (single digit).  It took some
        back and forth, along with a few other implementations (that did
        follow the spec), before the current two-digit scheme was defined.

Link to individual message.

45. Luke Emmet (luke (a) marmaladefoo.com)

On 08-Nov-2020 22:22, Sean Conner wrote:
>
>   The concern is over large responses.  It wasn't much of a concern until
> gemini://konpeito.media/ was created and serving up large audio files (and
> archives of said audio files).  I can envision a client being configured to
> abort the download if say, a 10 megabyte file is being downloaded.  It
> *sucked* when my DSL went down in late September/early October (yes, about
> three weeks) and I had to rely upon my cellphone hot spot.  I didn't have a
> large data plan for the cell phone becuase I didn't need it, until I did.
> It would have been nice to configure my web browser to not download anything
> over 5M at that point.

(At the risk of wading into this thread - I should know better)

Yes that is a sensible client design IMO, since you cannot know how long 
to wait. In my client GemiNaut (Windows only atm sorry folks), there are 
two options:

  - abandon download after X Mb or after Y seconds

The gemget client/utility also implements this approach, which is what 
GemiNaut uses under the hood.

The values are tunable according to the desires of the user. Mostly I 
have mine set to 5mb or 10 seconds. If something times out beyond that I 
make a judgement whether I really want to up the threshold temporarily 
to let it through. Or maybe go look elsewhere ;)
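
A rough sketch of that kind of guard (my own illustration with made-up
limits, not GemiNaut's or gemget's actual code):

    import time

    def read_limited(sock, max_bytes=5 * 1024 * 1024, max_seconds=10):
        # Abandon the body once either the byte budget or the time budget runs out.
        body, start = b"", time.monotonic()
        while True:
            chunk = sock.recv(4096)
            if not chunk:
                return body                    # server closed the connection: done
            body += chunk
            if len(body) > max_bytes or time.monotonic() - start > max_seconds:
                raise RuntimeError("download abandoned: over the size or time limit")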

By and large text/gemini content is very small and fast, which is one of 
its great selling points. Other binary files are more "meh".

  - Luke

Link to individual message.

46. bie (bie (a) 202x.moe)

On Sun, Nov 08, 2020 at 05:21:17PM +0000, Waweic wrote:
> I really like this approach, as I like the protocol as-is. I am
> currently working on a client for Android, and I am planning to make heavy
> use of caching, although I do not think a change in the protocol is
> necessary. In my usecase, I have a dataplan that's free, but just gives me
> flaky 32kbit/s at best, usually far less. (It's called "messaging
> option"). Loading a page on Gemini usually takes multiple seconds. This
> would be similar for packet radio and similar applications. Without
> caching, this is a huge PITA. I will take care to have a "Reload" button
> once I use caching, so the users themselves can decide when new content
> should be fetched.
> 
> I want to stress that caching is neccessary in my usecase. It's a much
> more needed feature than, say, client certificate support. At the moment,
> the majority of Content on Gemini is static and I believe it will continue to be.

I should add that in cases where there's an actual need for caching like
this, caching is totally fine by me, but I hope you can find a way to
add some kind of indicator to make it clear to the user whether you're
showing fresh or cached content!

bie

Link to individual message.

47. marc (marcx2 (a) welz.org.za)

> This is what I *think* how *should* clients work with caching:
> 
> The clients with history support and supports going ''backwards'' and
> ''forwards'' through history should cache text/* responses in memory
> for that browsing session. When the user browses through the history
> using ''forward'' and ''backward'' action, no reloading should happen.
> But, when a user clicks the link for a resource already in cache or
> writes the link by hand or selects the link from previously visited
> links or asks for reload: the cache is purged and the resource
> reloaded. It is assumed that requests with query part are idempotent.
> Now, when a page is dynamic, it should be stated as such so that the
> user would reload that page.
> With that, no new response codes.

I think your proposal is excellent. I am sure a browser
could add a command "press r to reload" for any page.

I have to say I am really puzzled by the protracted
discussion about caching. The difference between returning
the full document and a message saying "document hasn't
changed" is really small in the greater scheme of things:

The tcp 3 way handshake plus the tls negotiation
consume quite a number of packets, and add round trip
latency. Network load is often measured in number of frames
rather than bytes, and there is space for 1500 or even
9000 bytes per frame - this means that if your document is
below that size (I'd venture most good ones are), then a
"not changed" response doesn't even change the number of
packets sent. That even suggests another heuristic: If
your content is dynamic, try generating a short document,
the implication being that larger ones should be cached,
as there might be an actual benefit.
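
As a back-of-the-envelope illustration of that point (assuming a common
1500-byte frame and ignoring TCP/TLS header overhead):

    from math import ceil

    FRAME = 1500                                  # rough Ethernet payload per frame
    for doc_size in (800, 1400, 6000):
        full_reply = ceil(doc_size / FRAME)       # frames to resend the whole document
        not_changed = 1                           # a short "unchanged" reply fits in one
        print(f"{doc_size} bytes: {full_reply} frame(s) vs {not_changed}")
    # Below ~1500 bytes the saving is zero frames; only larger documents gain anything.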

I really think this is a holdover from people still thinking
in http+html instead of gemini:

Plaintext http allowed several parties to share
a cache. This isn't the case here, as things are
encrypted. Html often includes other urls ("img src"),
which might be shared across pages. Gemini doesn't do
that either.

And *if* caching should be done, then it seems
a poor idea to have the caching clues live in the
transfer/session/transport layer. Instead it should be in
the document markup. 

Even http+html finally realised that with the messy
"http-equiv" meta fields in the markup. At least that
provides a better path for the document author to
tell us how long the document might be valid for. And
with a machine readable license that would allow for
aggregation/replication/archiving/broadcast/etc, which seems a
much better way to save bandwidth and have persistent documents.

TLDR: Don't think in http+html, do better

regards

marc

Link to individual message.

48. Philip Linde (linde.philip (a) gmail.com)

On Sun, 8 Nov 2020 17:22:56 -0500
Sean Conner <sean at conman.org> wrote:

>   I don't agree that this is self-inflicted on me---what I'm trying (and
> failing) to point out is that adding mroe success calls *could* lead to 
> combinatorial hell.

This is unavoidable in any case of adding a sufficient number of
features that can be combined. If every feature proposal should be
evaluated in terms of the slippery slope of "what if we add more" there
wouldn't be a point to discussing additions to the protocol, a notion
that I'm actually partial to, but that I think shouldn't be used as a
basis for judging against individual feature proposals.

>   No, that was to prisonpotato at tilde.team---they were the first to come up
> with the idea, not me.  I just implemented it first (much the same way I
> implemented the first Gemini server even before solderpunk did [1]).  Also,
> the size breaking code is only active on one link on my site, not everwhere.

Okay, I'm glad that you picked it up even when it wasn't addressed
directly at you. Consider the option, and why I believe that using the
2x range is inappropriate.

>   The concern is over large responses.  It wasn't much of a concern until
> gemini://konpeito.media/ was created and serving up large audio files (and
> archives of said audio files).  I can envision a client being configured to
> abort the download if say, a 10 megabyte file is being downloaded.  It
> *sucked* when my DSL went down in late September/early October (yes, about
> three weeks) and I had to rely upon my cellphone hot spot.  I didn't have a
> large data plan for the cell phone becuase I didn't need it, until I did. 
> It would have been nice to configure my web browser to not download anything
> over 5M at that point.

That would be a nice feature, but is it nice enough to warrant breakage
across the growing number of implementations? text/gemini documents can
display an estimated size for linked files, and a user can configure
their automatic client to abort at a certain point or choose in their
interactive client to cancel a request. Not a complete solution by any
means, but worth considering as it is much cheaper.

No-cache/please-cache 2x responses however seem like they can add a lot
of value for very little cost. Existing client implementations will be
compatible, and the idea of the second digit as a hint from the server
that can be ignored by a simpler client is retained.

-- 
Philip

Link to individual message.

49. Ali Fardan (raiz (a) stellarbound.space)

On Mon, 9 Nov 2020 17:18:21 +0100
Philip Linde <linde.philip at gmail.com> wrote:
> On Sun, 8 Nov 2020 17:22:56 -0500
> This is unavoidable in any case of adding a sufficient number of
> features that can be combined. If every feature proposal should be
> evaluated in terms of the slippery slope of "what if we add more"
> there wouldn't be a point to discussing additions to the protocol, a
> notion that I'm actually partial to, but that I think shouldn't be
> used as a basis for judging against individual feature proposals.

Let's not look at it this way; let's look at what caching truly enables.
I've discussed this many times and no one seemed to address my concerns:
if an in-protocol caching mechanism exists, it opens the window for
serving complex document formats that are resource heavy (see HTML
pulling in stylesheets and images, for example), while it is discouraged
to use anything other than gemtext for documents in Gemini.

In the case of serving media files like music, video, and even large
PDFs, these are downloaded to be saved on disk anyway, so caching
wouldn't be a concern.

The only convincing reasoning for caching is Waweic's use case using
Mallick's method, and even that doesn't introduce any new protocol
features.

> That would be a nice feature, but is it nice enough to warrant
> breakage across the growing number of implementations? text/gemini
> documents can display an estimated size for linked files, and a user
> can configure their automatic client to abort at a certain point or
> choose in their interactive client to cancel a request. Not a
> complete solution by any means, but worth considering as it is much
> cheaper.

While to each their own, and implementations can behave the way they
desire, I'm against this feature, as it breaks certain sites that are
slow to load: I'd have to refresh once again and hope it loads fast if I
don't want it to time out. What you could do instead is show, in the
bottom (or top) status bar or some other information indicator, how much
of the requested file has been downloaded; the user then judges whether
they want to stop by hitting a cancel button.

Link to individual message.

50. Jason McBrayer (jmcbray (a) carcosa.net)

Zach DeCook <zachdecook at librem.one> writes:

> A "please don't cache" response code would be (ab)used by servers who
> desire to track their users.

I think this is an important point which has been largely overlooked. We
do not have (many?) bad actors on the gemiverse for now, but this can't
continue forever.

I'm also wondering whether we shouldn't state outright that Gemini
requests are expected to be idempotent. I know this would break dynamic
pages that people are already using and enjoying (astrobotany,
guestbooks), but other types of dynamic pages with user input would
still be relevant, like searches and local weather.

I dislike the idea of adding caching-related result codes. That's one of
the things that drove the increasing complexification of HTTP during the
HTTP/1.0 era (i.e., before HTTP/1.1). If we're not doing full-fledged
application development on top of Gemini, we don't need server-side
cache control.

-- 
+-----------------------------------------------------------+
| Jason F. McBrayer                    jmcbray at carcosa.net  |
| A flower falls, even though we love it; and a weed grows, |
| even though we do not love it.            -- Dogen        |

Link to individual message.

51. Jason McBrayer (jmcbray (a) carcosa.net)

Sudipto Mallick <smallick.dev at gmail.com> writes:

> The clients with history support and supports going ''backwards'' and
> ''forwards'' through history should cache text/* responses in memory
> for that browsing session. When the user browses through the history
> using ''forward'' and ''backward'' action, no reloading should happen.
> But, when a user clicks the link for a resource already in cache or
> writes the link by hand or selects the link from previously visited
> links or asks for reload: the cache is purged and the resource
> reloaded. It is assumed that requests with query part are idempotent.

I also believe that this is the correct approach, and should be
considered a client 'best practice'.

-- 
+-----------------------------------------------------------+
| Jason F. McBrayer                    jmcbray at carcosa.net  |
| A flower falls, even though we love it; and a weed grows, |
| even though we do not love it.            -- Dogen        |

Link to individual message.

52. John Cowan (cowan (a) ccil.org)

On Sun, Nov 8, 2020 at 12:52 AM Zach DeCook <zachdecook at librem.one> wrote:


> A "please don't cache" response code would be (ab)used by servers who
> desire to track their users.
>

I don't understand your reasoning there.  What does a server learn by
sending a 21 YOU CAN CACHE  or 22 YOU SHOULD NOT CACHE response back
instead of a plain 20 response?  (I'm not a security expert and I know
there are loopholes I don't see.)



John Cowan          http://vrici.lojban.org/~cowan        cowan at ccil.org
So they play that [tune] on their fascist banjos, eh?
        --Great-Souled Sam

Link to individual message.

53. Sean Conner (sean (a) conman.org)

It was thus said that the Great Sean Conner once stated:
> 
>   And I have my "proof-of-concept" up at well.  It's at
> 
> 	gemini://gemini.conman.org/test/testcache.gemini

  I have removed this "proof-of-concept" after some thought about the
approach.  I agree with Ali Fardan that Mallick's method is the way to
handle caching (or not at all).

  Now, on with destroying my own idea here ...

>   How it works:  A plain request:
> 
> 	gemini://gemini.conman.org/test/cachefile.txt
> 
> will always return the content.  However, if you include a timestamp using a
> path parameter (which is *NOT* the same as a query paramter, and is in the
> ISO-8601 format):
> 
> 	gemini://gemini.conman.org/test/cachefile.txt;2020-11-08T00:00:00
> 
> If the file is *newer* than that timestamp, you get the normal response of
> 20 and all the content; otherwise you get a response of 23 (with the normal
> MIME type) and no content, meaning it hasn't changed since the given date.

  The major problem here is timezones.  Time zone information is
complicated, and from what I've seen, operating system specific (the C
standard doesn't mention it; POSIX does it one way; Windows another) so
that's a complication for both servers and clients.

Also, does the concept apply to each path component?  Or only the end?  For
example, should

	gemini://gemini.conman.org/test;2020-11-08T00:00:00/cachefile.txt 

return 23 if the directory test hasn't changed, even if cachefile.txt has? 
Or is it ignored *unless* it's the last path component?  My gut instinct is
to say "last component" but it gets messy:

	gemini://gemini.conman.org/test;2020-11-08T00:00:00

will result in a redirect (or should), but

	gemini://gemini.conman.org/test;2020-11-08T00:00:00/

won't.

  So, for these reasons, nah.  I won't push this.

  -spc

Link to individual message.

54. John Cowan (cowan (a) ccil.org)

On Mon, Nov 9, 2020 at 4:33 PM Sean Conner <sean at conman.org> wrote:


> The major problem here is timezones.  Time zone information is
> complicated, and from what I've seen, operating system specific (the C
> standard doesn't mention it; POSIX does it one way; Windows another) so
> that's a complication for both servers and clients.
>

ISO 8601 doesn't use named timezones, only offsets, but the best policy is
to do everything in UTC, which is pretty universally available on all
operating systems now.  It's perfectly fine to just write
"2020-11-09T22:14:05Z".


> Also, does the concept apply to each path component?  Or only the end?
>

I would say "The end", but I would write it as a query term:
gemini://example.com/test?since=2020-11-09T22:14:05Z.  After all, the path
elements aren't necessarily actual objects with modification dates (in
Amazon S3, for example, they are just part of the name).
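
Producing such a request is a one-liner in most languages; a small sketch
(the since= name is just the example used above, nothing specified
anywhere):

    from datetime import datetime, timezone

    def with_since(url, last_seen=None):
        # Append a UTC ISO-8601 "since" query term of the form suggested above.
        stamp = (last_seen or datetime.now(timezone.utc)).strftime("%Y-%m-%dT%H:%M:%SZ")
        return f"{url}?since={stamp}"

    # with_since("gemini://example.com/test") -> "gemini://example.com/test?since=..."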

In addition, you could use the "inode/x-empty" MIME-type instead of a
different protocol response.

But, like you, I doubt this is worth doing.  Servers should advise clients
about whether the author thinks the result should be cached, and clients
can do what they like about it.  (There are any number of ways for the
author to communicate this intent; it doesn't have to be part of the
content.)



John Cowan          http://vrici.lojban.org/~cowan        cowan at ccil.org
I am expressing my opinion.  When my honorable and gallant friend is
called, he will express his opinion.  This is the process which we
call Debate.                   --Winston Churchill

Link to individual message.

55. Jason McBrayer (jmcbray (a) carcosa.net)

John Cowan <cowan at ccil.org> writes:

> I don't understand your reasoning there. What does a server learn by
> sending a 21 YOU CAN CACHE or 22 YOU SHOULD NOT CACHE response back
> instead of a plain 20 response? (I'm not a security expert and I know
> there are loopholes I don't see.)

The server operator gets a decent guess at whether the user has visited
the page before (within a reasonable caching window), because if you
sent a 21 YOU CAN CACHE, and they made the request, that means they
hadn't seen it recently. Combine this with query strings, IP addresses,
and/or fragment identifiers, and you can identify individual users, even
users who have refused to set a client certificate when you asked. It's
a pretty minor information leak, since it can't be used for cross-site
tracking. But give techbros an inch, and they'll take a mile.

-- 
Jason McBrayer      | "Strange is the night where black stars rise,
jmcbray at carcosa.net | and strange moons circle through the skies,
                    | but stranger still is lost Carcosa."
                    | -- Robert W. Chambers, The King in Yellow

Link to individual message.
