💾 Archived View for zaibatsu.circumlunar.space › ~solderpunk › phlog › protocol-pondering-intensifie… captured on 2022-06-03 at 23:17:35.
-=-=-=-=-=-=-
Protocol pondering intensifies, Pt III
--------------------------------------

Having previously[1,2] pondered request and response formats for a hypothetical protocol which is a bit more powerful than gopher but a lot less powerful than full-blown HTTP, now I want to turn my attention to the question of navigation, or how documents served by this protocol can link to one another.

One option, which I briefly mentioned in Part II, is to keep something like the gopher menu, and give it an item type of some sort which is conveyed in the response header. This approach retains gopher's hard conceptual division between navigation and content which, as I wrote about yet earlier[3], I am not sure is something we necessarily want, but it's worthy of consideration.

Even if we retain the idea of a "menu type", we don't necessarily need to use gopher's exact format. Let's think about that. A standard gopher menu line looks like this:

----------
<ITEM TYPE><ITEM NAME><TAB><SELECTOR><TAB><HOST><TAB><PORT>
----------

Why aren't the item type and item name separated by a tab? I'm not sure. If you know, or even just have a hunch, please let me know!

UPDATE 17/06/2019: Visiblink has offered an explanation for this which is so obviously correct that I'm embarrassed for having asked! Gopher item types are guaranteed to be one character long, so there is no need for a tab to unambiguously signal the border between item type and item name. It'd just be a wasted byte.

An obvious update which could be made here is to take advantage of the fact that between when gopher was first invented and now, URLs have been invented! We don't need to specify the selector (path), host and port separately, we have a standard way to build that into one string, and every modern programming language has libraries for parsing/building them.
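To make that concrete, here is a minimal Python sketch of folding a standard menu line into a single URL and then letting a stock library pull it apart again. The function name menu_line_to_url and the example.org host are made up for illustration. The item type is placed as the first character of the URL path, which is the same shape as the gopher:// links in the references at the end of this post.

```python
from urllib.parse import quote, urlsplit


def menu_line_to_url(line):
    """Turn one standard gopher menu line into (item name, gopher:// URL)."""
    # Keep only the first four fields; some servers append extra ones.
    display, selector, host, port = line.rstrip("\r\n").split("\t")[:4]
    # The item type is always exactly one character, so no separator is needed.
    item_type, name = display[0], display[1:]
    url = f"gopher://{host}:{port}/{item_type}{quote(selector)}"
    return name, url


line = "0Protocol pondering\t/~solderpunk/phlog/example.txt\texample.org\t70"
name, url = menu_line_to_url(line)
parts = urlsplit(url)

print(name)
print(url)
print(parts.scheme, parts.hostname, parts.port, parts.path)
```

Everything the three tab-separated fields used to carry is recoverable from the one string, and the scheme comes along for free.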
At first glance this might seem like pointless modernisation for its own sake, just replacing tabs with slashes and colons, but there's one very important extra bit of power that switching to URLs brings, and that's the ability to specify the protocol. Standard gopher menu items can only link to other gopher items, not e.g. to items shared via HTTP(S), FTP, or anything else. I don't think this is necessarily a bad thing, for the record, but there is good evidence that people want to be able to link to arbitrary non-gopher protocols, in the form of the widely adopted ugly hack of 'h' type items whose selector is a URL with a "URL:" prefix. Sufficiently smart clients recognise these, extract the URL and act appropriately (if they support the additional protocol), while dumb ones ask the gopher server for a selector beginning with "URL:", which the *server* recognises and responds to by serving a tiny HTML page with a redirect to the URL. Just putting URLs directly into menus would let us side-step this little dance. It would also, incidentally, solve the problem that there's no way in a standard gopher menu to convey whether or not TLS should be used[4], by allowing the use of gophers:// URLs.

So, we might use something like this as a menu item in a new protocol:

----------
<ITEM TYPE><TAB><ITEM NAME><TAB><URL>
----------

Yep, I put a tab between item type and item name. Not sorry.

In Part II I advocated for including item types in server responses, which arguably makes them redundant here. We *could* simplify these lines even further by just including a name and a URL. I actually kind of like the idea that you know what kind of thing a document is before you fetch it, so you can use that information to decide whether or not you want to fetch it. But it's also kind of weird. That information can only authoritatively come from the server hosting the document, but having item types in menus has arbitrary third parties declaring that information.
I don't really know how I feel about this for now.

An alternative to keeping the menu system would be to take the web approach of drawing no distinction between content and navigation and using some kind of markup language with support for inline links which can facilitate both menus *and* content. I think this is conceptually simpler, although it brings with it the huge can of worms of choosing one particular markup language. If this new protocol is to be vaguely gopherlike I think we'd all agree the language should be simple and minimal and human-readable even when looked at as plain text. Something like, but not necessarily, MarkDown. With this approach you'd build a very gopher-like menu with something like this:

* [<ITEM NAME 1>|<URL 1>]
* [<ITEM NAME 2>|<URL 2>]
* [<ITEM NAME 3>|<URL 3>]

With this approach, there's no way to convey item type in a menu. This doesn't seem to be a big problem for the web, although it would stop us from easily keeping something like gopher's search system, which is based on a special item type. To implement searches without that item type would require something similar to HTML <form>s, and for me that's way too big a step up in complexity. So this approach would leave serious question marks surrounding search. That *sounds* like a big problem, with a web mindset, but I'll point out that while gopher search currently exists, it's very under-developed and under-used, and a strong sense of community that extends across multiple servers has developed despite this.

Here's one last option: a lot of gopher users who like the idea of being able to put links at almost arbitrary points inside content serve things like phlog posts as gopher menus. Most of their content is included as item type i lines. This upsets some gopher purists because i is not standard, and it upsets other gopher purists because it involves telling a lie via item type (declaring something to be a menu when it's actually not).
But what if we standardised on something like this as the main, and indeed only, document type in a new protocol? That is to say, there's just one kind of thing, not necessarily a pure menu, not necessarily pure content, just a file where any line that fits the template:

<ITEM TYPE><TAB><ITEM NAME><TAB><URL>

is interpreted as a link, and any line which doesn't, isn't. This is, actually, exactly the kind of file many people who serve content as item type 1 are already writing. They certainly aren't manually putting an "i" at the beginning of every line and some fake hosts and ports at the end. Their gopher server does this for them, by recognising lines which don't fit the format of a menu item and converting them to items of type i. If we just declared what all those people are already writing to be the standard format, the server wouldn't *need* to do this transformation, and could just send it over the wire as-is. This is basically elevating the gophermap to first-class status, instead of being a behind-the-scenes convenience.

Note that this would reduce network traffic non-trivially in many cases: the cost of serving a phlog post as a menu is that for *every line* of the post you have to send an i, three tabs (the selector field is empty, but its tab is still sent), a dummy hostname and a dummy port (which is often "70"). Assuming a one character dummy hostname, that's 7 bytes. Per line. Which is automatically added by the server and then automatically removed by the client, and never seen by human eyes. Getting rid of that dead weight would easily make up for the extra roughly 20 bytes that the response header I proposed in Part II would add to a transaction. Gopher servers like Tomasino's gopher.black, where All the World's a Menu, would actually have to transfer *fewer* bytes under this protocol than under gopher, to serve *exactly the same* content, in a way that's *friendlier* to the client! I'd call that a win.
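A rough Python sketch of how a client might read such a document, and of what the i-line wrapping costs today. The exact test for whether a line "fits the template" is not pinned down anywhere in this post, so the rule used here is purely an assumption: exactly three tab-separated fields, a non-empty item type, and a URL containing "://". The function names and the example.org URL are likewise made up for illustration.

```python
def parse_document(text):
    """Classify each line as ("link", type, name, url) or ("text", line)."""
    entries = []
    for line in text.splitlines():
        fields = line.split("\t")
        # Assumed test for "fits the template": exactly three tab-separated
        # fields, a non-empty item type, and a URL containing "://".
        if len(fields) == 3 and fields[0] and "://" in fields[2]:
            entries.append(("link", fields[0], fields[1], fields[2]))
        else:
            entries.append(("text", line))
    return entries


def wrap_as_info_line(line, host="x", port="70"):
    """Wrap prose as a gopher type i item: empty selector, dummy host and port."""
    return f"i{line}\t\t{host}\t{port}"


doc = (
    "Welcome to my phlog!\n"
    "0\tLatest post\tgopher://example.org/0/phlog/latest.txt\n"
    "Thanks for reading."
)
entries = parse_document(doc)
text_lines = [entry[1] for entry in entries if entry[0] == "text"]
# All characters involved are ASCII, so string lengths equal byte counts.
overhead = sum(len(wrap_as_info_line(line)) - len(line) for line in text_lines)

for entry in entries:
    print(entry)
print("bytes of i-line wrapping avoided:", overhead)
```

With the dummy host "x" and port "70", the wrapper adds seven bytes to each prose line (the i, three tabs including the one after the empty selector, the host and the port), so the two prose lines in this tiny document account for 14 bytes that would never need to be sent.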
I actually think I really like this idea, compared to something like MarkDown, for one main reason: it forces one link per line, whereas a general markup language with hyperlink support would allow many links per line, scattered about wherever the author wants. Scattered links like that can be hard to spot, and they don't lend themselves as nicely to rapid navigation based on indices, as featured in e.g. VF-1, cgo and Bombadillo. I sure don't want to give that up! Forcing one link per line should also help preserve one of the great virtues of gopher menus, which is that you are more or less forced to lay things out in a nice and neat way. It's *possible* to lay out a MarkDown page every bit as nicely, but it's also possible not to, so that route would involve trusting the community to develop a strong norm of doing that. I think that would probably work out (the early adopters of this protocol, if there were in fact any, would no doubt be gopher-heads), but why take the chance? Of course, there is nothing at all to stop those who want to from serving MarkDown with the text/markdown MIME type in the response header, and clients can optionally implement it.

That's, I think, all I have to say for now on the navigation question. In these three epic posts (if you've read all of every one of them - thank you, really!) I have come the closest I ever have to actually offering a concrete proposal for a protocol "between gopher and the web". There are certainly still details to be ironed out, and I'm not ready yet to give this thing a name and start coding, but I have been thinking, vaguely, about what would be involved in converting VF-1 from a gopher client to a...whatever-this-is client. All the code related to trying to estimate text encodings if UTF-8 doesn't work, reporting encoding errors to the user, and allowing the user to specify their preferred fallback encoding would disappear.
All the code related to trying to assign a MIME type to a non-text document to be able to choose a handler program would disappear. All the places where item types 0 and 1 need to be treated differently would disappear. Of course I won't know for sure until I actually do it, but it seems highly likely to me that a client for this protocol which had exactly the same user interface and capabilities would be a lot less code. I think this exposes an important truth about gopher: it's not just really simple, it's *too* simple, if you want it to do anything other than serve ASCII text. Doing anything else forces a lot of complexity into the client.

Now, to be sure, there are gopher clients out there where the codebase would get *larger* and *more complex* if you converted them to a protocol based on my sketchy outline here. But those same gopher clients would probably explode if you tried to take them into Russian gopherholes where Cyrillic text is encoded with the old KOI8-R Soviet standard. That's not a joke, these exist[5]! VF-1 can go there. No other gopher client I've tried renders the text properly, not one (happy to be corrected, though!). Those other clients also don't let you specify your preferred third-party application for handling PDFs and other file types which don't have any item type more appropriate than the type 9 "binary wastebin". I'm not saying an ASCII-only protocol is useless, it surely has its place. But I *really* like the idea of a protocol that lets you write a quick and simple and obviously trustworthy client which can anonymously Go Anywhere and Do Anything, and gopher is not that. But not much has to be added at all to get there!

I really, really want to hear feedback on the ideas in this long series, even if it's negative (of course, constructive criticism is the best criticism). I'm not super attached to many of the details of what I've sketched here. I'm sure improvements exist, and I'd like to hear ideas for them.

[1] gopher://zaibatsu.circumlunar.space:70/0/~solderpunk/phlog/protocol-pondering-intensifies.txt
[2] gopher://zaibatsu.circumlunar.space:70/0/~solderpunk/phlog/protocol-pondering-intensifies-ii.txt
[3] gopher://zaibatsu.circumlunar.space:70/0/~solderpunk/phlog/the-soul-of-gopher.txt
[4] gopher://gopher.conman.org:70/0Phlog:2019/03/31.1
[5] gopher.pclovers.ru:70/1/rus.koi8