Published: 2022-05-16
Updated: 2022-05-16
Tags: #rant #security #tech #web #browsers
Much has been made, in the posts and discussions surrounding the design of the
Gemini protocol, of limiting what user-agents can reveal about their users to
servers. The general call to arms is "It's None of Your Damned Business!",
which more or less sums up the mission statement of the Tor Browser and other
privacy-centred browser plugins.
But despite all the effort that has gone into the Gemini protocol, it is still
technically possible for a nefarious user-agent and a malicious server to
collaborate and share information about the user which the user would prefer
to keep private.
I call user-agents which collaborate with servers in a way that goes against a
user's interests "Disloyal User-Agents".
For reasons (outlined below) I doubt that Gemini clients will become disloyal
user-agents, but there is nothing in principle preventing them.
Modern HTTP/HTML browsers are objectively Disloyal User-Agents. It didn't
start out that way, but the evolution of browsers, and their modern default
behaviour, is much more aligned with the needs of servers than with the needs
of the users they represent.
Cookies are probably the pinnacle of disloyalty. The server and user-agent can
tag a user, and then continuously exchange this information behind the user's
back without notification or permission. I have yet to see a mainstream
browser that allows explicit whitelisting of Cookies as a core feature,
something that was proposed in the original RFC outlining Cookie headers.
RFC 2109: HTTP State Management Mechanism - PRIVACY
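As a rough illustration, here is what per-domain cookie whitelisting could
look like. This is a Python sketch only: the whitelist and the filter hooks
are invented for illustration, and no mainstream browser exposes anything
like them to the user.
```
# Hypothetical sketch: user-controlled cookie whitelisting. COOKIE_WHITELIST
# and these filter hooks are invented; they are not a real browser API.

COOKIE_WHITELIST = {"example.com"}  # domains the user has explicitly approved

def filter_outgoing(request_host: str, headers: dict) -> dict:
    """Only send the Cookie header to hosts the user whitelisted."""
    if request_host not in COOKIE_WHITELIST:
        headers.pop("Cookie", None)
    return headers

def filter_incoming(response_host: str, headers: dict) -> dict:
    """Ignore Set-Cookie from hosts the user never approved."""
    if response_host not in COOKIE_WHITELIST:
        headers.pop("Set-Cookie", None)
    return headers
```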
The many other headers which can be combined to identify the user, or at least
place them in a small bucket of users, are well documented. Again, these are
added implicitly to outgoing requests without any control given to the user.
If you have never tried it, I highly recommend visiting the EFF's Panopticlick
project to see how accurately you can be identified and tracked using
information leaked by the disloyal user-agent (browser) you are using.
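A small Python sketch of the idea, using made-up header values: a server only
needs to hash a handful of implicitly sent headers to obtain a reasonably
stable identifier the user never consented to.
```
# Sketch of passive header fingerprinting. The header values are invented
# examples; real user-agents add these (and more) to every request.
import hashlib

implicit_headers = {
    "User-Agent": "Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Firefox/91.0",
    "Accept-Language": "en-GB,en;q=0.7,de;q=0.3",
    "Accept-Encoding": "gzip, deflate, br",
}

# A stable hash over headers the user never chose to send.
fingerprint = hashlib.sha256(
    "|".join(f"{k}:{v}" for k, v in sorted(implicit_headers.items())).encode()
).hexdigest()
print(fingerprint[:16])
```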
Beyond the default behaviour leaking information, we also have to consider
client-side execution of remote code. Whilst JavaScript has undoubtedly
unleashed the power of HTTP/HTML, the only security model is the Same-Origin
model, and it puts little if any control in the hands of users. Once again the
privacy of a user is mostly controlled by configuration sent by the server,
which the user-agent faithfully follows. There's no way for a user to prevent
a server from inspecting features via the web APIs (these should be
whitelisted), and there's no way to prevent information leakage via DNS
requests. Even pre-flight queries can transport information, which really
limits the protections CORS can provide.
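To make the pre-flight point concrete, here is a hypothetical tracking
endpoint in Python (the names and port are invented). Even if it denies the
actual cross-origin request, the OPTIONS pre-flight has already delivered the
Origin header, whatever was encoded in the URL, and a DNS lookup for its
domain.
```
# Hypothetical sketch: a server that harvests data from CORS pre-flights.
# Refusing the cross-origin request does not undo the information leak.
from http.server import BaseHTTPRequestHandler, HTTPServer

class PreflightLogger(BaseHTTPRequestHandler):
    def do_OPTIONS(self):
        # The Origin header, and anything smuggled into the path
        # (e.g. /track/<user-id>), arrive before any CORS decision is made.
        print("pre-flight from", self.headers.get("Origin"), "path:", self.path)
        self.send_response(403)  # deny the cross-origin request...
        self.end_headers()       # ...but the information has already arrived

if __name__ == "__main__":
    HTTPServer(("127.0.0.1", 8080), PreflightLogger).serve_forever()
```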
The tragedy is that browser standardisation itself has gone a long way towards
creating this situation. Standard headers, default web APIs, and default
expected behaviour have been forced into user-agents with little to no
recourse for users themselves.
The most trusted user-agent is the one you write yourself. One of the great
things about the Gemini and Spartan protocols, and the gemtext document type,
is that they are simple enough to quickly write a custom user-agent with which
to explore this new web of content.
Laconia - A Simple Spartan Protocol Client
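As a rough idea of how little is involved, here is a bare-bones Gemini fetch
in Python. It is a sketch only: it skips TOFU certificate handling, redirects,
status codes and non-default ports, and the capsule URL is just an example.
```
# Bare-bones Gemini fetch: open TLS on port 1965, send the URL plus CRLF,
# read the response. A sketch, not a finished client.
import socket
import ssl
from urllib.parse import urlparse

def gemini_fetch(url: str) -> str:
    host = urlparse(url).hostname
    ctx = ssl.create_default_context()
    ctx.check_hostname = False   # real clients usually do TOFU pinning
    ctx.verify_mode = ssl.CERT_NONE
    with socket.create_connection((host, 1965)) as sock:
        with ctx.wrap_socket(sock, server_hostname=host) as tls:
            tls.sendall((url + "\r\n").encode("utf-8"))
            data = b""
            while chunk := tls.recv(4096):
                data += chunk
    header, _, body = data.partition(b"\r\n")
    return body.decode("utf-8", errors="replace")

if __name__ == "__main__":
    # Example capsule URL; substitute any gemini:// address.
    print(gemini_fetch("gemini://gemini.circumlunar.space/"))
```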
A phenomenal number of user-agents will (hopefully) be produced for protocols
and document types of this simplicity. This will, in effect, create choice and
competition, where users will (also hopefully) find user-agents which value
their needs rather than those of the servers.
The small size of the community may also help, as bad-acting clients & servers
can be publicly shamed, and users will hopefully abandon them.
The constraint that a client should be simple enough for a wide selection of
software developers to code is a large limiting factor on creating dynamic
content and sophisticated internet applications. HTTP/HTML browsers are really
amazing pieces of software and allow incredible experiences to be delivered to
a user. But the chances of you or me building such a piece of software by
ourselves, or even in a small group, are now extremely slim.
For this reason I don't think there's any hope of saving the HTTP/HTML web.
It's simply too complex for new entrants; most new browsers just fork the
Chromium project. Regulators around the world are trying to tackle this
issue, but my prediction is that they're not capable of operating at the speed
and complexity of the tech and its evolving protocols.
Instead of starting with the protocol, what if we start with a framework for a
"Loyal User-Agent"? What is the maximum level of complexity that a user can
understand and configure to manage their privacy? What is the maximum level of
complexity that a large number of software developers can reasonably be
expected to tackle when creating a user-agent? Is it possible to deliver rich
internet applications without centralised standards bodies?
I don't yet have the answer to these questions, although I do have some nascent
ideas forming.