(originally posted in Gopherspace on 2019-08-09)
By far the most frequent request/inquiry I receive about Gemini design is to do something to facilitate virtual hosting, i.e. allowing different Gemini sites with different hostnames to be served from the same IP address. I understand the concern - it's one of the limitations of gopher which has frustrated me personally, as I can't serve differing content from zaibatsu.circumlunar.space and circumlunar.space, as much as I'd like to.
In HTTP, virtual hosting is achieved with a "Host:" request header. This became mandatory for all requests starting from HTTP 1.1: a new revision of the protocol had to be made just to facilitate this functionality.
More than one person has independently suggested to me that we avoid this problem in Gemini by changing the request format so that clients send a full URL instead of just a path/selector/whatever. Sean has written a short RFC for this.
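To be precise, this means that to follow a link to gemini://foo.com/bar/baz, instead of connecting to foo.com and sending "/bar/baz", the procedure would be to connect to foo.com and send the full URL, "gemini://foo.com/bar/baz".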
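As a rough illustration of the two request formats being compared, here is a minimal Python sketch. The function names old_request and new_request are made up for this example, and the <CR><LF> terminator is an assumption based on the line-based format described elsewhere in this post.

```python
from urllib.parse import urlsplit


def old_request(url: str) -> bytes:
    """Current format: send only the path/selector part of the URL."""
    # Assumption: the selector is simply the path component of the URL.
    return urlsplit(url).path.encode("utf-8") + b"\r\n"


def new_request(url: str) -> bytes:
    """Proposed format: send the whole URL, hostname included."""
    return url.encode("utf-8") + b"\r\n"


print(old_request("gemini://foo.com/bar/baz"))
print(new_request("gemini://foo.com/bar/baz"))
```

In both cases the client still connects to foo.com; only the bytes sent after connecting differ.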
The server would then parse this URL to separate out the foo.com and /bar/baz parts and act appropriately. Simple servers which support only a single hostname can simply discard the host part and then proceed as they always have.
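The server-side parsing step might look something like the sketch below. It is only an illustration: the SITES table, its directory names, and the function names are hypothetical, and a real server would also need to guard against things like ".." in the path, which this sketch ignores.

```python
from typing import Optional, Tuple
from urllib.parse import urlsplit

# Hypothetical virtual-host table: hostname -> content directory.
SITES = {
    "foo.com": "/srv/gemini/foo.com",
    "bar.org": "/srv/gemini/bar.org",
}


def split_request(line: bytes) -> Tuple[str, str]:
    """Split a full-URL request line into (host, path)."""
    url = line.decode("utf-8").rstrip("\r\n")
    parts = urlsplit(url)
    host = parts.hostname or ""  # urlsplit lowercases the hostname
    path = parts.path or "/"     # assumption: empty path means the root
    return host, path


def resolve(line: bytes) -> Optional[str]:
    """Map a request to a location on disk, or None for an unknown host."""
    host, path = split_request(line)
    root = SITES.get(host)
    if root is None:
        return None
    return root + path
```

A single-hostname server could call split_request, throw away the host, and carry on with the path exactly as before.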
This is a change which would impact both server and client authors and need everybody to get in line to avoid breakage, so I've put it on the list of important stuff that needs to be finalised ASAP, along with the status codes. Once the status codes and request format are locked in, I think it's no longer folly to start actually writing non-trivial Gemini software and setting up servers.
There are other good questions that people are raising, especially regarding reflowing of text so Gemini content looks good on narrow-window devices like phones. I don't mean to dismiss those things, but they're in some meaningful sense less essential than the actual details of the protocol itself so I'm letting them sit on the backburner so as not to get overwhelmed with scary decision making. So, I'm very interested to hear what people think of this proposed change. Preferably ASAP!
An interesting thing to note is that this change would not only allow virtual hosting, but also proxy servers - servers which would accept URLs starting not with one of the hosts they actually serve content for, but with any other host, which they'd then speak to on the client's behalf.
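To make the idea concrete, here is a small sketch of how a server could tell an ordinary request from a proxy request once it receives full URLs. OWN_HOSTS and the returned labels are invented for illustration; nothing here comes from the spec.

```python
from urllib.parse import urlsplit

# Hypothetical set of hostnames this server serves content for itself.
OWN_HOSTS = {"foo.com", "bar.org"}


def classify(url: str) -> str:
    """Decide what a full-URL request is asking this server to do."""
    host = urlsplit(url).hostname
    if host is None:
        return "bad-request"  # no host part at all
    if host in OWN_HOSTS:
        return "serve"        # one of our own virtual hosts
    return "proxy"            # some other host: fetch it on the client's behalf
```

Whether a given server is willing to act on a "proxy" result is of course up to its operator.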
From the start of this whole project, I've intended to eventually set up a Gemini-to-gopher proxy, so that Gemini clients could be used to access more than just the very few and very small experimental Gemini servers which have popped up so far. This would be an extremely elegant way for this to work - you just connect to that proxy server and send the gopher URL you want and get back a Gemini response, with gopher menus translated to Gemini menus.
The *only* thing which has stopped me just saying "yes, this is a good idea, let's do it!" has been that it entails a change in the way I always imagined this proxying thing would work. I've not yet had the chance to really think about how this stuff would work, so I'm going to do it out loud now...
The current speculative specification says nothing about the structure of Gemini paths/selectors/whatevers. Implicitly, they may not contain <CR><LF> sequences as that would stuff up the whole line-based format, but aside from that, anything goes. So a fully-formed URL would be a perfectly valid selector. And when I thought that we'd access gemini://foo.com/bar/baz by connecting to foo.com and sending "/bar/baz", I thought that we'd have simple proxying with URLs like gemini://hypotheticalgopherproxy.net/gopher%3A%2F%2Fsdf.org, which would involve connecting to hypotheticalgopherproxy.net and sending "gopher://sdf.org" - yeah, the URL encoding/decoding is gross but it lets us use perfectly standard definitions of a URL and there are libraries to handle this kind of chore in every language under the sun, so we can live with it. This would allow Gemini menus to directly link to proxied resources.
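The encoding/decoding chore mentioned above is short in practice. The sketch below uses Python's standard urllib.parse; the helper names proxied_link and selector_sent are made up for this example, and it assumes the selector is the URL path with its leading slash removed and then percent-decoded.

```python
from urllib.parse import quote, unquote, urlsplit


def proxied_link(proxy_host: str, target_url: str) -> str:
    """Build a linkable Gemini URL that wraps another URL."""
    # safe="" makes quote() escape "/" and ":" as well.
    return "gemini://" + proxy_host + "/" + quote(target_url, safe="")


def selector_sent(link: str) -> str:
    """Recover what would be sent to the proxy under the path-only format."""
    path = urlsplit(link).path
    return unquote(path[1:])  # drop the leading "/" and undo the escaping


link = proxied_link("hypotheticalgopherproxy.net", "gopher://sdf.org")
print(link)
print(selector_sent(link))
```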
And, actually, this would still work exactly the same way under the proposed change. I mean, you could still directly link to a proxied resource with a URL just like the above. The change is that the actual network transaction would no longer involve just sending "gopher://sdf.org" to the proxy server. Under the proposed scheme there is no way to send any server a request which does not begin with one of the hostnames associated with that server. And I suppose this isn't actually a problem: nobody is going to *see* what happens on the wire anyway. But it feels like the "old way" is just a whole lot more elegant. It's much easier, and all-round more sensible-seeming, for a server to figure out whether it has received a request to act as a proxy by just looking at the host part of the URL it has received and seeing whether it's one of its own hosts.
This behaviour could be preserved by abandoning the idea that proxied resources should be able to be linked to - which, after all, is not the case with HTTP. The alternative way is that you configure your Gemini client by telling it "Please use hypotheticalgopherproxy.net as a proxy for the gopher:// protocol". Then whenever it sees a gopher:// link in a menu, it knows it should actually connect to hypotheticalgopherproxy.net and then send that URL. This is just how HTTP proxies work in browsers. Of course, hypotheticalgopherproxy.net could just as easily be localhost:1966, with a small single-purpose proxy program running locally to preserve your privacy.
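A client-side version of that configuration could be as small as a table from URL scheme to proxy address. The following is a sketch only: PROXIES, plan_request, and the convention that a port of None means "use the default Gemini port" are all assumptions made for this example.

```python
from typing import Dict, Optional, Tuple
from urllib.parse import urlsplit

# Hypothetical user configuration: scheme -> (proxy host, proxy port).
PROXIES: Dict[str, Tuple[str, int]] = {"gopher": ("localhost", 1966)}


def plan_request(
    url: str, proxies: Dict[str, Tuple[str, int]]
) -> Optional[Tuple[str, Optional[int], bytes]]:
    """Return (host, port, request bytes), or None if the link can't be handled."""
    parts = urlsplit(url)
    request = url.encode("utf-8") + b"\r\n"  # the full URL is sent either way
    if parts.scheme == "gemini":
        if parts.hostname is None:
            return None
        # Port None here stands for "whatever the default Gemini port is".
        return parts.hostname, parts.port, request
    if parts.scheme in proxies:
        host, port = proxies[parts.scheme]
        return host, port, request
    return None  # no proxy configured for this scheme
```

A client that speaks gopher natively would handle gopher:// links itself instead of consulting the table.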
In some ways, this is perhaps the better option. It means that people can just include plain vanilla gopher links in their menus, and it's up to the client how this is handled: maybe directly, if the client supports both Gemini and Gopher (which I suspect and hope many clients will), maybe via a proxy if the user has configured one. The more I think about it, the more the feature of web-to-gopher proxies where you can directly link to proxied content looks like just a work-around for the fact that modern browsers don't give you a way to directly configure a "proper" proxy. It shouldn't be emulated.
Okay, I've quite happily convinced myself that the change to sending a whole URL in Gemini requests is a Good Thing. It definitely increases the power-to-weight ratio. I'll be adding it to the spec unless I receive very compelling, well-argued objections within the next few days.
Guess I'll have to consider whether we need any proxy-related status codes...