I sent the following email (and many others, which will follow shortly) to the list last night, but SDF's mailserver was acting up and they never went through. I thought they'd appear eventually, but it appears not, so I'm resending them now. Apologies if double posts eventually appear.

----- Forwarded message from solderpunk <solderpunk at SDF.ORG> -----

Date: Fri, 26 Jun 2020 15:47:57 +0000
From: solderpunk <solderpunk@SDF.ORG>
To: gemini at lists.orbitalfox.eu
Subject: Illusory latency due to trailing slash redirects

Gemini servers (and servers for any other protocol supporting relative URLs, including HTTP) will use redirects to get clients to add a trailing slash to a URL which maps to a directory on the server's filesystem. Handling this redirect is often invisible to the client user, with the result that what "looks and feels" like a single request is actually two immediately consecutive requests. This makes latency appear to be much worse than it actually is. Perhaps this underlies some people's perceptions that Gemini has unacceptable latency.

This basic problem is unavoidable, but there are many small things client and server authors can do to minimise how often it happens:

In general, the client can't add a missing trailing slash itself because it never knows if a given URL maps to something file-like or something directory-like. An interesting exception is the root of the server. Molly Brown and Jetforce (but not GLV-1.12556) both seem to redirect gemini.example.com to gemini.example.com/, making the loading of "homepages" feel slower than it needs to. I'm pretty sure it should be safe for clients to automatically append trailing slashes to URLs without paths.

On the server side, server authors should take care when doing things like automatically generating directory listings to put trailing slashes in links to directories so that the redirect is not necessary.

These measures together could cut perceived latency in half for a non-trivial proportion of Gemini requests, including the psychologically heavily weighted request of visiting a server's homepage.

Cheers,
Solderpunk
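As a rough sketch of the server-side suggestion: when a server builds a directory listing, it can append "/" to every link that points at a subdirectory, so that following the link never triggers the redirect. The helper name `listing_lines` is hypothetical and is not taken from Molly Brown, Jetforce or GLV-1.12556; only the `=>` link-line syntax comes from text/gemini itself.

```python
import os
import tempfile
from urllib.parse import quote


def listing_lines(directory):
    """Return text/gemini link lines for the entries of `directory`.

    Hypothetical helper: subdirectories get a trailing slash so that a
    client following the link does not need a redirect round trip.
    """
    lines = []
    for name in sorted(os.listdir(directory)):
        target = quote(name)  # percent-encode spaces and similar characters
        label = name
        if os.path.isdir(os.path.join(directory, name)):
            target += "/"
            label += "/"
        lines.append(f"=> {target} {label}")
    return lines


if __name__ == "__main__":
    # Build a throwaway directory with one file and one subdirectory.
    with tempfile.TemporaryDirectory() as root:
        os.mkdir(os.path.join(root, "servers"))
        with open(os.path.join(root, "about.gmi"), "w") as handle:
            handle.write("# About\n")
        print("\n".join(listing_lines(root)))
```

The links are relative, so the listing itself must be served from a URL ending in "/" for them to resolve inside the directory.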
> I'm pretty sure it should be safe for clients to automatically append
> trailing slashes to URLs without paths.

Indeed it should be, per rfc3986:

> In general, a URI that uses the generic syntax for authority with
> an empty path should be normalized to a path of "/".

An authority in this context is //[<user>[:<password>]@]<host>[:<port>], so "gemini://example.horse" and "gemini://example.horse/" are equivalent in a strictly compliant rfc3986 implementation.

A server that serves different content on //server and //server/ can be a huge pain in the neck. When I wrote my dillo plugin I noticed that it seems to apply this normalization in reverse before the URL is handed to the plugin. So "gemini://example.horse/" could only be seen on the plugin end as "gemini://example.horse", which also makes redirects to "gemini://example.horse/" loop back to "gemini://example.horse/". Redirect hell ensued!

That Dillo does this might be strange, but absolutely fine from a standards perspective. It's the server that's not compliant.

-- 
Philip
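A minimal sketch of that normalization on the client side, assuming Python's standard `urllib.parse`; the function name `add_root_slash` is made up for illustration. It only touches URLs that have an authority and an empty path, and leaves every other URL alone.

```python
from urllib.parse import urlsplit, urlunsplit


def add_root_slash(url):
    """Give a URL with an authority but an empty path the path "/"."""
    parts = urlsplit(url)
    if parts.netloc and parts.path == "":
        parts = parts._replace(path="/")
    return urlunsplit(parts)


if __name__ == "__main__":
    for url in ("gemini://example.horse",
                "gemini://example.horse/",
                "gemini://example.horse/test/doc1"):
        print(url, "->", add_root_slash(url))
```

URLs that already have a path, with or without a trailing slash, pass through unchanged, since the client cannot know whether they name a file or a directory.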
> I'm pretty sure it should be safe for clients to automatically append
> trailing slashes to URLs without paths.

I've done this in Amfora, it makes typing in domains much simpler.

Currently, every redirect in Amfora must be confirmed by the user, but I wonder if it would make sense to automatically accept redirects that just add a slash, as a convenience thing. I would need to add some logic that prevents a redirect hell of course.

makeworld
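One way the "auto-accept slash-only redirects, but avoid redirect hell" idea could look, as a sketch only and not Amfora's actual code. `fetch` and `confirm` are hypothetical callables supplied by the client: `fetch(url)` returns a `(status, meta)` pair of strings, and `confirm(url, target)` asks the user. The sketch assumes the redirect target in `meta` is already an absolute URL and compares plain strings, so it ignores URLs with query strings.

```python
def only_adds_slash(requested, target):
    """True when `target` is `requested` plus a single trailing slash."""
    return target == requested + "/"


def follow(url, fetch, confirm, max_redirects=5):
    """Fetch `url`, following redirects until a non-redirect response.

    Slash-only redirects are followed silently; any other redirect needs
    `confirm` to return True. A set of visited URLs plus a hop limit guard
    against servers that bounce between "x" and "x/".
    """
    seen = {url}
    for _ in range(max_redirects + 1):
        status, meta = fetch(url)
        if not status.startswith("3"):  # Gemini 3x status codes are redirects
            return url, status, meta
        target = meta
        if target in seen:
            raise RuntimeError(f"redirect loop at {target}")
        if not only_adds_slash(url, target) and not confirm(url, target):
            return url, status, meta  # user declined; show the redirect
        seen.add(target)
        url = target
    raise RuntimeError("too many redirects")


if __name__ == "__main__":
    canned = {
        "gemini://example.com/docs": ("31", "gemini://example.com/docs/"),
        "gemini://example.com/docs/": ("20", "text/gemini"),
    }
    print(follow("gemini://example.com/docs", canned.__getitem__,
                 lambda url, target: False))
```

The visited set catches a server that removes the slash again, which is the loop case a client has to survive.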
On Sat, Jun 27, 2020 at 05:52:31PM +0000, colecmac at protonmail.com wrote:

Hi!

[...]

> Currently, every redirect in Amfora must be confirmed by the user,

Funny thing is that this is what I also do. I thought it was a bad idea, turned out it was not, actually! :)

Bye!
C.
On Sat, 27 Jun 2020, cage wrote:

>> Currently, every redirect in Amfora must be confirmed by the user,
>
> Funny thing is that this is what I also do. I thought it was a bad
> idea, turned out it was not, actually! :)

Incidentally, how do these clients behave if someone sets their server to redirect "gemini://hostname.domain.com/" to "gemini://hostname.domain.com", i.e., if they remove the slash?

Mk

-- 
Martin Keegan, +44 7779 296469, @mk270, https://mk.ucant.org/
On Sat, Jun 27, 2020 at 07:33:30PM +0100, Martin Keegan wrote:

[...]

> Incidentally, how do these clients behave if someone sets their server to
> redirect "gemini://hostname.domain.com/" to "gemini://hostname.domain.com",
> i.e., if they remove the slash?

Interesting question! I did not take this case into account, but it seems that my client is not going to remove the slash. I wonder if this is the correct way to process such a URI.

Bye!
C.
On Sat, 27 Jun 2020 08:57:14 +0000 solderpunk <solderpunk at SDF.ORG> wrote: > On the server side, server authors should take care when doing things > like automatically generating directory listings to put trailing > slashes in links to directories so that the redirect is not necessary. Search engines and link aggregators could also help. For example, GUS has this entry: gemini://gemini.circumlunar.space/servers which is a redirect to: gemini://gemini.circumlunar.space/servers/ The search engine could try to detect that kind of redirect during the crawl and update the entry. On Sat, 27 Jun 2020 11:42:39 +0200 Philip Linde <linde.philip at gmail.com> wrote: > A server that serves different content on //server and //server/ can > be a huge pain in the neck. When I wrote my dillo plugin I noticed > that it seems to apply this normalization in reverse before the URL > is handed to the plugin. So "gemini://example.horse/" could only be > seen on the plugin end as "gemini://example.horse", which also makes > redirects to "gemini://example.horse/" loop back to > "gemini://example.horse/". Redirect hell ensued! I also noticed this. Dillo uses whichever URL was loaded first in the session (with or without the "/"). It caches the resulting resource, with the normalized URL as the cache key. So if you visit "gemini://example.horse" first, then subsequently visiting "gemini://example.horse/" retrieves the cached response for "gemini://example.horse". And you can't remove the cache entry except by starting a new Dillo process. Since servers are redirecting "" to "/", I think adding the "/" in the plugin like you are doing is the right thing to do, to work around what I now consider a bug in Dillo. In the older dillo-gemini plugin, I did not think to do that, but instead made redirects not automatic but just an internal page with a link to the target URL (i.e. 
require user confirmation) and work around the cache/normalization problem as a user by manually adding a query string to the URL to get a new cache key, e.g. visiting "gemini://example.horse/?" -- Charles
> I also noticed this. Dillo uses whichever URL was loaded first in the > session (with or without the "/"). It caches the resulting resource, > with the normalized URL as the cache key. So if you visit > "gemini://example.horse" first, then subsequently visiting > "gemini://example.horse/" retrieves the cached response for > "gemini://example.horse". And you can't remove the cache entry except > by starting a new Dillo process. The way I do it in Amfora is normalize each URL for the cache and display, so that URLs like //example.com:1965/ and gemini://example.com/ aren't cached twice. And redirects are never cached, although maybe it makes sense to do that separately. makeworld
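A sketch of what such a cache-key normalization might look like, written against Python's standard library rather than taken from Amfora. The name `cache_key` is hypothetical. The assumptions are that a missing scheme means gemini, that port 1965 (Gemini's default) can be dropped, that an empty path becomes "/", and that userinfo and fragments are not part of the key.

```python
from urllib.parse import urlsplit, urlunsplit

DEFAULT_PORT = 1965  # Gemini's default port


def cache_key(url, default_scheme="gemini"):
    """Normalize a URL so equivalent spellings share one cache entry."""
    parts = urlsplit(url)
    scheme = (parts.scheme or default_scheme).lower()
    host = (parts.hostname or "").lower()
    if ":" in host:  # IPv6 literal: urlsplit strips the brackets
        host = f"[{host}]"
    port = parts.port
    netloc = host if port in (None, DEFAULT_PORT) else f"{host}:{port}"
    path = parts.path or "/"
    # Userinfo and fragment are deliberately left out of the key.
    return urlunsplit((scheme, netloc, path, parts.query, ""))


if __name__ == "__main__":
    for url in ("//example.com:1965/", "gemini://example.com/",
                "gemini://Example.COM"):
        print(url, "->", cache_key(url))
```

All three example spellings map to the same key, so the resource is stored once. Paths other than the empty one are left as they are, because "/foo" and "/foo/" may be different resources.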
On Sat, Jun 27, 2020 at 08:57:14AM +0000, solderpunk wrote:

> Gemini servers (and servers for any other protocol supporting relative
> URLs, including HTTP) will use redirects to get clients to add a
> trailing slash to a URL which maps to a directory on the server's
> filesystem. Handling this redirect is often invisible to the client
> user, with the result that what "looks and feels" like a single request
> is actually two immediately consecutive requests. This makes latency
> appear to be much worse than it actually is. Perhaps this underlies
> some people's perceptions that Gemini has unacceptable latency.
>
> This basic problem is unavoidable, but there are many small things
> client and server authors can do to minimise how often it happens:
>
> In general, the client can't add a missing trailing slash itself because
> it never knows if a given URL maps to something file-like or something
> directory-like. An interesting exception is the root of the server.
> Molly Brown and Jetforce (but not GLV-1.12556) both seem to redirect
> gemini.example.com to gemini.example.com/, making the loading of
> "homepages" feel slower than it needs to. I'm pretty sure it should be
> safe for clients to automatically append trailing slashes to URLs
> without paths.

I have a script which checks for redirects on websites and reports status codes. On the web it is a much smaller problem, since the latency saved is much less. This way I can see dead links and links which could cause problems: http redirects instead of https, or normal redirects which can be optimised.

Maybe something similar could be done for gemini, so that content authors could check if their links are ok. I don't like this idea, though, because content authors should focus on writing content, not on how the content looks or whether they are writing it the "correct way". Linters for gemini code sound like a bad idea and a sign that something is wrong.

> On the server side, server authors should take care when doing things
> like automatically generating directory listings to put trailing slashes
> in links to directories so that the redirect is not necessary.

That's the best solution I think.

Paper
Please excuse something a bit off-topic. If the author/maintainer of the (mostly excellent) bombadillo browser sees this, please email back to me. I want to pass on an issue, and have been stymied by your "issues" page. Thanks. tb
> Please excuse something a bit off-topic. > > If the author/maintainer of the (mostly excellent) bombadillo browser > sees this, please email back to me. I want to pass on an issue, and have > been stymied by your "issues" page. > > Thanks. > > tb You'll need a tildegit account. Feel free to email me the issue and I'll file it for you though. makeworld
Right! one issue with bombadillo, one with the "issues" page.

Bombadillo: I really like the look and feel of the browser, and accessing a few sites (especially gemini.circumlunar.space) is very quick. But most other gopher and gemini sites (eg: bombadillo.colorfield.space) seem to be unreachable. I get a TCP time-out after 5 seconds. This is tough, since my "broadband" connection is pretty slow.

"Issues": you may have answered this already. Submitting an issue requires registering. Registering requires three fields: a user name, an email address, and a password. But when I give an email address, the page tells me that I cannot use an email address (required field). Odd.

Thanks very much for your help.

tb
On 6/27/20 3:54 PM, Terry Brennan wrote: > Right! one issue with bombadillo, one with the "issues" page. > > Bombadillo: I really like the look and feel of the browser, and > accessing a few sites (especially gemini.circumlunar.space) is very > quick. But most other gopher and gemini sites (eg: > bombadillo.colorfield.space) seem to be unreachable. I get a TCP > time-out after 5 seconds. This is tough, since my "broadband" connection > is pretty slow. > > "Issues": you may have answered this already. Submitting an issue > requires registering. Registering requires three fields: a user name, an > email address, and a password. But when I give an email address, the > page tells me that I cannot use an email address (required field). Odd. I'm the admin on tildegit.org. I just added a notice to the signup page about the allowed email domain list. Sorry for the inconvenience but it has curbed automated spam accounts entirely, which was taking a significant amount of my time. Cheers, ~ben
On Sat, Jun 27, 2020 at 03:28:17PM -0400, Charles E. Lehner wrote:

> Search engines and link aggregators could also help.
>
> For example, GUS has this entry:
> gemini://gemini.circumlunar.space/servers
> which is a redirect to:
> gemini://gemini.circumlunar.space/servers/

Out of curiosity, do you remember the query that got you that search result? When I perform the below search, I only seem to be able to find it with the trailing slash. Your result is surprising to me because GUS already does reliably follow (or at least I think it does :) trailing slash redirects and provide the redirected URL in search results.

```
## "servers AND domain:circumlunar"
=> gemini://gemini.circumlunar.space/servers/ gemini.circumlunar.space/servers/ (text/gemini, 2K)
```
Hi Natalie,

> > For example, GUS has this entry:
> > gemini://gemini.circumlunar.space/servers
> > which is a redirect to:
> > gemini://gemini.circumlunar.space/servers/
>
> Out of curiosity, do you remember the query that got you that search
> result? When I perform the below search, I only seem to be able to
> find it with the trailing slash. Your result is surprising to me
> because GUS already does reliably follow (or at least I think it does
> :) trailing slash redirects and provide the redirected URL in search
> results.

My mistake, it is houston, not GUS. I'm glad GUS follows the redirects already.

gemini://houston.coder.town/search?list

Regards,
Charles
Ok, so I have a legitimate question... why are servers even redirecting for something as dumb as a trailing slash anyways?

Christian Seibold
It was thus said that the Great Krixano once stated: > Ok, so I have a legitimate question... why are servers even redirecting > for something as dumb as a trailing slash anyways? Semantically (and stated in RFC-3986) they are separate resources. For a fun time, try the following two links: gemini://gemini.conman.org/test/doc1 gemini://gemini.conman.org/test/doc1/ Spoiler: they aren't the same document. -spc (Also, consider CGI scripts, they don't *have* to have an extension, at least on GLV-1.12556 ... [1]) [1] Not to say those two links are done via CGI.
> Semantically (and stated in RFC-3986) they are separate resources. For a
> fun time, try the following two links:
>
> gemini://gemini.conman.org/test/doc1
> gemini://gemini.conman.org/test/doc1/

Those two are, but consider these:

gemini://gemini.conman.org
gemini://gemini.conman.org/

Per RFC 3986, they should be equivalent. A server respecting the standard should understand an authority without a path specified as referring to the path "/". This is described in section 6.2.3. Of course, this is "in general" for schemes using authorities, and I guess a scheme can define its own normalization rules, but AFAIK no exceptions to this have been specified for Gemini. On that basis I think servers that serve different content on these should fix it to mean exactly the same thing.

For URLs that have paths specified like in your example, clients should as you say be able to handle them being different because there are no such normalization rules for URLs with paths specified.

Best regards,
Philip
On Sun, Jun 28, 2020 at 08:20:40AM +0000, Krixano wrote: > Ok, so I have a legitimate question... why are servers even redirecting for something as dumb as a trailing slash anyways? It might *seem* dumb, but it's essential for relative URLs to work correctly. Links in text/gemini may be relative URLs, e.g. just "other_page.gmi". It's the client's job to turn that into an absolute URL, in accordance with the RFC rules for doing so. If you are at "gemini://example.com/foo", then that relative link turns into "gemini://example.com/other_page.gmi", but if you are at "gemini://example.com/foo/" then it turns into "gemini://example.com/foo/other_page.gmi". Web servers do this all the time for exactly the same reason, you just never notice because browsers never tell you. Cheers, Solderpunk
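The difference described above can be reproduced with Python's `urllib.parse.urljoin`, shown here as an illustrative sketch. One caveat is an assumption about the library rather than about Gemini: `urljoin` only resolves relative references for schemes listed in `urllib.parse.uses_relative` and `uses_netloc`, so the sketch registers "gemini" there first, which is a common workaround. The wrapper name `resolve` is made up.

```python
import urllib.parse

# urljoin only resolves relative references for schemes it knows about,
# so register "gemini" first (done once, at import time).
for registry in (urllib.parse.uses_relative, urllib.parse.uses_netloc):
    if "gemini" not in registry:
        registry.append("gemini")


def resolve(base, link):
    """Turn a (possibly relative) link into an absolute URL."""
    return urllib.parse.urljoin(base, link)


if __name__ == "__main__":
    for base in ("gemini://example.com/foo", "gemini://example.com/foo/"):
        print(base, "->", resolve(base, "other_page.gmi"))
```

Without the trailing slash, "foo" is treated as a file name and replaced by the link; with it, "foo/" is treated as a directory and the link is resolved inside it.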
> On Sun, Jun 28, 2020 at 08:20:40AM +0000, Krixano wrote: > > > Ok, so I have a legitimate question... why are servers even redirecting for something as dumb as a trailing slash anyways? Why are browsers requesting url with servername without the trailing slash? freD.
---