TLS overhead

(originally posted in Gopherspace on 2019-06-21)

Right at the end of a recent post, sloum voiced his support for TLS as a core part of a new protocol. I didn't make any mention of TLS in my recent "protocol pondering intensifies" series, but based on earlier stuff I've written it should come as no surprise that I am all in favour. Perhaps it was disingenuous not to make any mention of this when I wrote about how my proposed protocol could still be used over telnet. I don't feel too bad about it, though. For one thing, I'm *sure* there is something like "telnet for TLS" out there. For another, telnetability is not actually a super important practical property. *I've* never actually surfed gopherspace in that way. It's more of a token property that serves as a "seal of simplicity".

Sloum's 2019-06-19 post

Post I in my "Protocol pondering intensifies" series

Post II in my "Protocol pondering intensifies" series

Post III in my "Protocol pondering intensifies" series

My 2019-03-31 post "Why Gopher needs crypto"

Anyway, today for the first time I asked myself what the overhead of mandating TLS for all connections could be. I quickly came across this write-up which estimated the TLS handshake to need about 6.5 kilobytes. Oof! That was a kick in the teeth. It made a lot of the stuff I wrote recently seem incredibly naive. I made a big deal out of serving gophermaps saving 10 or 20 bytes per line compared to a gopher menu. That benefit would be totally wiped out in most cases by 6.5kB of overhead.

A web article on TLS overhead
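To put that comparison in numbers, here is a rough back-of-the-envelope sketch. It assumes "6.5 kilobytes" means 6500 bytes, and the function name `break_even_lines` is made up for illustration.

```python
HANDSHAKE_BYTES = 6500  # assumption: "about 6.5 kilobytes" taken as 6500 bytes


def break_even_lines(saving_per_line: int, overhead: int = HANDSHAKE_BYTES) -> int:
    """Menu lines needed before the per-line saving covers the overhead."""
    # Ceiling division: a partial line still counts as a whole line.
    return -(-overhead // saving_per_line)


for saving in (20, 10):
    print(saving, "bytes saved per line ->", break_even_lines(saving), "lines")
```

On these assumptions a menu would need somewhere between 325 and 650 lines before the per-line saving paid for a single handshake, which is far longer than most menus.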

My first thought was that I would need to switch to keeping connections open for re-use instead of immediately closing them after the response was sent. This would at least allow spreading the TLS overhead over several requests. It would be a real shame, though. It would complicate client and server programming, and would also require adding an extra component to the response header, equivalent to HTTP's "Content-Length" header.
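The difference between the two approaches can be sketched from the client's side. This is only an illustration: `read_until_close` and `read_exact` are hypothetical helper names, and `stream` stands for any file-like object (in a real client, something like the result of `socket.makefile("rb")`).

```python
import io


def read_until_close(stream) -> bytes:
    """One request per connection: the body is everything up to EOF."""
    return stream.read()


def read_exact(stream, length: int) -> bytes:
    """Reused connection: read exactly `length` bytes and leave the rest."""
    chunks = []
    remaining = length
    while remaining > 0:
        chunk = stream.read(remaining)
        if not chunk:
            # The peer closed the connection before sending the whole body.
            raise EOFError("connection closed before the full body arrived")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


# Two responses sent back to back on one connection can only be told apart
# if the client already knows how long each of them is.
stream = io.BytesIO(b"helloworld")
print(read_exact(stream, 5), read_exact(stream, 5))
```

With one request per connection the server closing the socket marks the end of the response, so no length field is needed. Once the connection stays open, something in the header has to supply `length`.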

With a little more reading I learned that recent versions of TLS support session resumption, where subsequent secure connections to the same server can be established with very low overhead (about 330 bytes). I thought this could save things, but was disappointed to find that Python's standard library `ssl` module doesn't seem to support this. I don't want to design into the protocol a feature of TLS which is not widely supported in high-quality libraries for popular languages. Of course, some clients might be able to make use of this, and I'd encourage it.

With yet more reading of the linked article, I relaxed a little bit. That 6.5kB estimate is based on some assumptions specific to the modern web. In particular, it assumes the server sending a chain of 4 certificates to a trusted root certificate. My plan from the start for this protocol has been to shun the certificate authority system used by the web in favour of a much simpler and less hierarchical "TOFU" system similar to SSH: the first time a client connects to a server, it accepts whatever certificate it gets, but remembers it, and raises the alarm if the same server offers up a different certificate in future. This would allow servers to send only a single, self-signed certificate, which the article states can be as small as 800 bytes. So, maybe we can get a typical case around 1kB. That's still relatively heavy, but it's a heck of a lot better than 6.5kB. I think 1kB is low enough that I would rather swallow it than add complexity by switching to a protocol oriented around reusing connections for multiple requests.
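The TOFU check described above is small enough to sketch in a few lines. This is a minimal illustration, not a specification: the names `fingerprint` and `tofu_check`, the choice of SHA-256, and the in-memory dict are all assumptions made for the example. In a real client the certificate bytes would come from something like `SSLSocket.getpeercert(binary_form=True)` and the store would live on disk.

```python
import hashlib


def fingerprint(cert_der: bytes) -> str:
    """Hex SHA-256 digest of a DER-encoded certificate."""
    return hashlib.sha256(cert_der).hexdigest()


def tofu_check(known: dict, host: str, cert_der: bytes) -> str:
    """Compare a server's certificate with the one remembered for `host`.

    Returns "new" on first contact (and remembers the certificate),
    "match" if it is the same certificate as before, and "mismatch"
    if the server has offered a different one.
    """
    seen = fingerprint(cert_der)
    pinned = known.get(host)
    if pinned is None:
        known[host] = seen  # trust on first use
        return "new"
    if pinned == seen:
        return "match"
    return "mismatch"  # time to raise the alarm


known_hosts = {}
print(tofu_check(known_hosts, "example.org", b"first certificate"))
print(tofu_check(known_hosts, "example.org", b"first certificate"))
print(tofu_check(known_hosts, "example.org", b"some other certificate"))
```

What a client does on a mismatch (refuse, warn, or ask the user) is a policy question the sketch leaves open.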

On the face of it, an unavoidable 1kB overhead on every connection would seem like a license to not care so much about saving 10 or 20 bytes in the response header. I don't want to fall into that trap, though. For one thing, TLS session resumption might become a much more widely supported feature in the future, in which case the overhead might become a lot lower. For another, it's possible that some people (e.g. retrocomputing fans) might want to run Gemini unencrypted. Rest assured this will be in violation of the spec, but folks doing it will be guilty of precisely the same sin that I'm guilty of for including TLS support in VF-1, so I can't really complain. So long as they do it on some non-standard port, that's their prerogative.

So, I'm still in favour of mandating TLS, but a lot of reading and a lot of care is going to be needed to specify using it in a way that minimises overhead. All part of the fun.

There's another kind of overhead associated with TLS, beyond the network traffic, and that's the implementation overhead. This is a big concern of rain, who points out that it's totally impractical for individual programmers to implement TLS (I fully agree), that they will need to use libraries, and that it violates the spirit of gopher to make the implementation so complex that a normal programmer can't implement it in a weekend.

Rain's 2019-06-08 post on encrypting gopher

I'm hugely sympathetic to these concerns. One of the stated design criteria for Gemini in the FAQ is that:

A client comfortable for daily use which implements every single protocol feature should be a feasible weekend programming project for a single developer.

I don't think that relying on TLS conflicts with this. High-level TLS support is now present in the standard libraries for Python and Go. I am sympathetic to developers who like to avoid third-party dependencies at all costs (VF-1 only "softly" depends on chardet), but not using the *standard library* of your language doesn't make a lot of sense. Here's how to do a TLS connection in Python 3, assuming the variable `s` is a regular TCP socket, already connected, of exactly the kind you'd need to construct if Gemini didn't depend on TLS:

import ssl

context = ssl.create_default_context()
s = context.wrap_socket(s)

It's three additional lines of code. Yes, this uses all the default settings and you are trusting the Python standard library developers to have chosen sane and secure defaults. Even if you think you know better than them and want to manually specify some things, you're not talking about more than 10 lines of code total in all likelihood. Python might be ahead of the game here (I honestly don't know), and this might be trickier in other languages, but I strongly suspect it's only going to get easier, on average, over time. Hopefully it will also get easier to link these languages against OpenBSD's LibreSSL instead of OpenSSL, so that the amount and complexity of the code this pulls in will decrease.
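For a sense of what "manually specifying some things" might look like, here is a hedged sketch using only the standard `socket` and `ssl` modules. The particular settings are assumptions chosen for illustration: a TLS 1.2 floor, and CA and hostname checks switched off on the premise that the client does its own TOFU-style certificate check afterwards. The helper names `make_context` and `connect` are made up.

```python
import socket
import ssl


def make_context() -> ssl.SSLContext:
    """Build a client-side TLS context with a few settings chosen by hand."""
    context = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    # Refuse anything older than TLS 1.2 (needs Python 3.7 or later).
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    # Assumption: the caller checks certificates TOFU-style, so the usual
    # CA and hostname checks are turned off. check_hostname must be
    # disabled first, otherwise setting CERT_NONE raises ValueError.
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE
    return context


def connect(host: str, port: int) -> ssl.SSLSocket:
    """Open a TCP connection to host:port and wrap it in TLS."""
    s = socket.create_connection((host, port))
    return make_context().wrap_socket(s, server_hostname=host)
```

Turning verification off like this is only reasonable if something else, such as a remembered certificate fingerprint, takes its place. Used on its own it accepts any certificate from anyone.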

I completely understand the decreased feeling of satisfaction and self-sufficiency that comes from having critical functionality provided by a large chunk of complex code that you didn't write yourself. Though, let's be honest with ourselves - gopher clients which don't have to worry about TLS are still sitting atop the OS's TCP/IP stack, DNS library, filesystem and a bunch of other stuff that the average person has no hope of implementing well in a weekend. I don't see that relying on your programming language's standard library is cheating any more than relying on your operating system is.

I'd love a simpler, lighter alternative, but realistically I don't see any which is going to do the job. Rolling your own crypto is fraught with peril. SSL libraries may be large and complex, but they exist in just about any language and they are used and tested by a lot of people, many of whom know more about what they are doing than the average developer who might implement Gemini. Anything else is almost guaranteed to be less portable and less well vetted. I'm open to concrete suggestions if anybody has them, but for now I still think TLS is our best bet.