💾 Archived View for rosenzweig.io › gemlog › fun-and-games-with-exposure-notifications.gmi captured on 2021-12-17 at 13:26:06. Gemini links have been rewritten to link to archived content
_Exposure Notifications_ is a protocol developed by Apple and Google for facilitating COVID-19 contact tracing on _mobile phones_ by exchanging codes with nearby phones over Bluetooth, implemented within the Android and iOS operating systems, now available here in Toronto.
Wait -- phones? Android and iOS only? Can't my Debian laptop participate? It has a recent Bluetooth chip. What about phones running GNU/Linux distributions like the PinePhone or Librem 5?
Exposure Notifications breaks down neatly into three sections: a Bluetooth layer, some cryptography, and integration with local public health authorities. Linux is up to the task (via BlueZ, OpenSSL, and some Python).
Given my background, will this build to be a reverse-engineering epic resulting in a novel open stack for a closed system?
...
Not at all. The specifications for the Exposure Notifications are available for both the Bluetooth protocol and the underlying cryptography. A partial reference implementation is available for Android, as is an independent Android implementation in microG. In Canada, the key servers run an open source stack originally built by Shopify and now maintained by the Canadian Digital Service, including open protocol documentation.
All in all, this is looking to be a smooth-sailing weekend project. (Today, Monday, is Labour Day, so this is a 3-day weekend. I started on Saturday and posted this today, so it _technically_ counts.)
The devil's in the details.
Exposure Notifications operates via Bluetooth Low Energy "advertisements". Scanning for other devices is as simple as scanning for advertisements, and broadcasting is as simple as advertising ourselves.
On an Android phone, this is handled deep within Google Play Services. Can we drive the protocol from userspace on a regular GNU/Linux laptop? It depends. Not all laptops support Bluetooth, not all Bluetooth implementations support Bluetooth Low Energy, and I hear not all Bluetooth Low Energy implementations properly support undirected transmissions ("advertising").
Luckily in my case, I develop on a Debianized Chromebook with a Wi-Fi/Bluetooth module. I've never used the Bluetooth, but it turns out the module has full support for advertisements, verified with the `lescan` (**L**ow **E**nergy **Scan**) command of the `hcitool` Bluetooth utility.
`hcitool` is a part of BlueZ, the standard Linux library for Bluetooth. Since `lescan` is able to detect nearby phones running Exposure Notifications, poring over its source code is a good first step to our implementation. With some minor changes to `hcitool` to dump packets as raw hex and to filter for the Exposure Notifications protocol, we can print all nearby Exposure Notifications advertisements. So far, so good.
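To make the filtering step concrete, here is a minimal sketch of picking an Exposure Notifications payload out of a raw advertisement. It assumes the layout from the Bluetooth specification: a Service Data structure (type `0x16`) for the 16-bit UUID `0xFD6F`, carrying a 16-byte Rolling Proximity Identifier followed by 4 bytes of Associated Encrypted Metadata. The function names are made up for illustration and are not taken from `hcitool`.

```python
EN_SERVICE_UUID = 0xFD6F   # 16-bit service UUID used by Exposure Notifications
AD_TYPE_SERVICE_DATA = 0x16  # "Service Data - 16-bit UUID" AD type


def iter_ad_structures(payload: bytes):
    """Yield (ad_type, data) pairs from a raw BLE advertising payload.

    Each structure is: one length byte, one type byte, then length-1 data bytes.
    """
    i = 0
    while i < len(payload):
        length = payload[i]
        if length == 0 or i + 1 + length > len(payload):
            break  # zero padding or a truncated structure: stop parsing
        yield payload[i + 1], payload[i + 2 : i + 1 + length]
        i += 1 + length


def parse_exposure_notification(payload: bytes):
    """Return (rpi, aem) if the payload is an EN advertisement, else None."""
    for ad_type, data in iter_ad_structures(payload):
        if ad_type != AD_TYPE_SERVICE_DATA or len(data) != 22:
            continue
        if int.from_bytes(data[:2], "little") != EN_SERVICE_UUID:
            continue
        return data[2:18], data[18:22]  # 16-byte RPI, 4-byte AEM
    return None
```

Anything that is not an Exposure Notifications advertisement simply returns `None`, so the same function doubles as the filter.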
That's about where the good ends.
While scanning is simple with reference code in `hcitool`, advertising is complicated by BlueZ's lack of an interface at the time of writing. While a general "enable advertising" routine exists, routines to set advertising parameters and data per the Exposure Notifications specification are unavailable. This is not a showstopper, since BlueZ is itself an open source userspace library. We can drive the Bluetooth module the same way BlueZ does internally, filling in the necessary gaps in the API, while continuing to use BlueZ for the heavy lifting.
Some care is needed to multiplex scanning and advertising within a single thread while remaining power efficient. The key is that advertising, once configured, is handled entirely in hardware without CPU intervention. On the other hand, scanning does require CPU involvement, but it is *not* necessary to scan continuously. Since COVID-19 is thought to transmit from *sustained* exposure, we only need to scan every few minutes. (Food for thought: how does this connect to the sampling theorem?)
Thus we can order our operations as:
* Configure advertising.
* Scan for devices.
* Sleep for several minutes.
* Repeat.
Since most of the time the program is asleep, this loop is efficient. It additionally allows us to reconfigure advertising every ten to fifteen minutes, in order to change the Bluetooth address to prevent tracking.
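The real daemon is C, but the shape of the loop is easy to sketch in Python. This is an illustration only: `configure_advertising` and `scan_once` are hypothetical callbacks standing in for the Bluetooth code, and both intervals are assumed values picked to match the prose above rather than numbers from the actual implementation.

```python
import time

ROTATION_SECONDS = 10 * 60      # assumed: rotate the advertisement every ten minutes
SCAN_INTERVAL_SECONDS = 5 * 60  # assumed: wake up to scan every five minutes


def run_daemon(configure_advertising, scan_once, iterations,
               sleep=time.sleep, clock=time.monotonic):
    """Single-threaded advertise/scan loop.

    Advertising runs in hardware once configured, so the CPU only wakes up
    to scan briefly and, when due, to rotate the advertised identifier.
    """
    last_rotation = None
    for _ in range(iterations):
        now = clock()
        if last_rotation is None or now - last_rotation >= ROTATION_SECONDS:
            configure_advertising()  # new address and identifier
            last_rotation = now
        scan_once()                  # short burst of scanning
        sleep(SCAN_INTERVAL_SECONDS)  # asleep most of the time
```

Passing `sleep` and `clock` as parameters is just a convenience so the loop can be exercised with a fake clock instead of waiting around for real.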
All of the above amounts to a few hundred lines of C code, treating the Exposure Notifications packets themselves as opaque random data.
Yet the data is far from random; it is the result of a series of operations in terms of secret keys defined by the Exposure Notifications cryptography specification. Every day, a "temporary exposure key" is generated, from which a "rolling proximity identifier key" and an "associated encrypted metadata key" are derived. These are used to generate a "rolling proximity identifier" and the "associated encrypted metadata", which are advertised over Bluetooth and changed in lockstep with the Bluetooth random addresses.
There are lots of moving parts to get right, but each derivation uses a standard cryptographic primitive: HKDF-SHA256 for key derivation, AES-128 for the rolling proximity identifier, and AES-128-CTR for the associated encrypted metadata. Ideally, we would grab a state-of-the-art library of cryptography primitives like `NaCl` or `libsodium` and wire everything up.
First, some good news: once these routines are written, we can reliably unit test them. Though the specification states that "test vectors... are available upon request", it isn't clear *who* to request from. But Google's reference implementation is itself unit-tested, and sure enough, it contains a `TestVectors.java` file, from which we can grab the vectors for a complete set of unit tests.
After patting ourselves on the back for writing unit tests, we'll need to pick a library to implement the cryptography. Suppose we try `NaCl` first. We'll quickly realize the primitives we need are missing, so we move on to `libsodium`, which is backwards-compatible with NaCl. For a moment, this will work -- `libsodium` has upstream support for HKDF-SHA256. Unfortunately, the version of `libsodium` shipping in Debian testing is too old for HKDF-SHA256. Not a big problem -- we can backport the implementation, written in terms of the underlying HMAC-SHA256 operations, and move on to the AES.
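For a feel of how small that backport is, here is a sketch of HKDF-SHA256 (RFC 5869) written only in terms of HMAC-SHA256, using Python's standard library rather than `libsodium`. The two derivation helpers assume the parameters from the cryptography specification as I read it: no salt, the info strings `EN-RPIK` and `EN-AEMK`, and 16 bytes of output. Treat it as an illustration, not as the code in the actual library.

```python
import hashlib
import hmac

HASH_LEN = 32  # SHA-256 output size in bytes


def hkdf_sha256(ikm: bytes, salt, info: bytes, length: int) -> bytes:
    """HKDF (extract-then-expand) built from HMAC-SHA256."""
    if length > 255 * HASH_LEN:
        raise ValueError("requested output is too long for HKDF")
    if not salt:
        salt = bytes(HASH_LEN)  # RFC 5869: a missing salt means all zeros
    # Extract: condense the input keying material into a pseudorandom key.
    prk = hmac.new(salt, ikm, hashlib.sha256).digest()
    # Expand: chain HMAC blocks until enough output is available.
    okm = b""
    block = b""
    counter = 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]),
                         hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def derive_rpik(tek: bytes) -> bytes:
    """Rolling proximity identifier key from a temporary exposure key."""
    return hkdf_sha256(tek, None, b"EN-RPIK", 16)


def derive_aemk(tek: bytes) -> bytes:
    """Associated encrypted metadata key from a temporary exposure key."""
    return hkdf_sha256(tek, None, b"EN-AEMK", 16)
```

With the vectors from `TestVectors.java` in hand, a unit test for these two helpers is a pair of byte-string comparisons.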
AES is a standard symmetric cipher, so `libsodium` has excellent support... for some modes. However standard, AES is not _one_ cipher; it is a family of ciphers with different key lengths and operating modes, with dramatically different security properties. "AES-128-CTR" in the Exposure Notifications specification is clearly 128-bit AES in CTR (**C**oun**t**e**r**) mode, but what about "AES-128" alone, stated to operate on a "single AES-128 block"?
The mode implicitly specified is known as ECB (**E**lectronic **C**ode**b**ook) mode and is known to have fatal security flaws in most applications. Because AES-ECB is generally insecure, `libsodium` does not have any support for this cipher mode. Great, now we have *two* problems -- we have to rewrite our cryptography code against a new library, and we have to consider if there is a vulnerability in Exposure Notifications.
ECB's crucial flaw is that for a given key, identical plaintext will always yield identical ciphertext, regardless of position in the stream. Since AES is block-based, this means identical blocks yield identical ciphertext, leading to trivial cryptanalysis.
In Exposure Notifications, ECB mode is used only to derive rolling proximity identifiers from the rolling proximity identifier key and the timestamp, by the equation:
RPI_ij = AES_128_ECB(RPIK_i, PaddedData_j)
...where `PaddedData` is a function of the quantized timestamp. Thus the issue is avoided, as every plaintext will be unique (since timestamps are monotonically increasing, unless you're trying to contact trace _Back to the Future_).
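A short sketch shows why the plaintext never repeats under a given key. It assumes the layout from the cryptography specification: the ASCII string `EN-RPI`, six zero bytes, and the interval number (Unix time divided into ten-minute windows) as a little-endian 32-bit integer. The function names are mine.

```python
import struct


def en_interval_number(unix_time: float) -> int:
    """Quantize a Unix timestamp into ten-minute (600 second) intervals."""
    return int(unix_time) // 600


def padded_data(interval_number: int) -> bytes:
    """Build the single 16-byte AES block encrypted to produce an RPI."""
    return b"EN-RPI" + bytes(6) + struct.pack("<I", interval_number)


# A key is used for one day, i.e. 144 consecutive intervals. Every block
# differs in its last four bytes, so ECB never sees a repeated plaintext.
start = en_interval_number(1599480000)
blocks = {padded_data(start + j) for j in range(144)}
assert len(blocks) == 144
```

Each block is exactly one AES block long, which is also why no chaining mode is needed in the first place.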
Nevertheless, `libsodium` doesn't know that, so we'll need to resort to a ubiquitous cryptography library that doesn't, uh, take security quite so seriously...
I'll leave the implications up to your imagination.
While the Bluetooth and cryptography sections are governed by upstream specifications, making sense of the data requires tracking a significant amount of state. At *minimum*, we must:
* Record received packets (Rolling Proximity Identifiers and Associated Encrypted Metadata).
* Query received packets for diagnosed identifiers.
* Record our Temporary Exposure Keys.
* Query our keys to upload if we are diagnosed.
If we were so inclined, we could handwrite all the serialization and concurrency logic and hope we don't have a bug that results in COVID-19 mayhem.
A better idea is to grab SQLite, perhaps the most deployed software in the world, and express these actions as SQL queries. The database persists to disk, and we can even express natural unit tests with a synthetic in-memory database.
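As a rough illustration, the bookkeeping could look like the following with Python's `sqlite3` module. The schema, table names, and helper functions are all invented for this sketch and are not the ones the real daemon uses; the point is that each required action becomes a one-line query, and that `:memory:` gives us a throwaway database for tests.

```python
import sqlite3

# Hypothetical schema: one table for packets we heard, one for our own keys.
SCHEMA = """
CREATE TABLE IF NOT EXISTS received (
    rpi     BLOB NOT NULL,     -- Rolling Proximity Identifier (16 bytes)
    aem     BLOB NOT NULL,     -- Associated Encrypted Metadata (4 bytes)
    seen_at INTEGER NOT NULL   -- Unix timestamp of the scan
);
CREATE INDEX IF NOT EXISTS received_rpi ON received (rpi);
CREATE TABLE IF NOT EXISTS own_keys (
    tek            BLOB NOT NULL,       -- Temporary Exposure Key
    start_interval INTEGER PRIMARY KEY  -- first interval the key covers
);
"""


def open_db(path=":memory:"):
    """Open (or create) the database; the default is a synthetic in-memory one."""
    db = sqlite3.connect(path)
    db.executescript(SCHEMA)
    return db


def record_packet(db, rpi, aem, seen_at):
    with db:  # commits on success, rolls back on error
        db.execute("INSERT INTO received VALUES (?, ?, ?)", (rpi, aem, seen_at))


def was_seen(db, rpi):
    """Did we ever receive this (diagnosed) identifier?"""
    row = db.execute("SELECT 1 FROM received WHERE rpi = ? LIMIT 1", (rpi,))
    return row.fetchone() is not None


def record_key(db, tek, start_interval):
    with db:
        db.execute("INSERT INTO own_keys VALUES (?, ?)", (tek, start_interval))


def keys_since(db, first_interval):
    """Our keys to upload if we are diagnosed, oldest first."""
    rows = db.execute(
        "SELECT tek, start_interval FROM own_keys "
        "WHERE start_interval >= ? ORDER BY start_interval", (first_interval,))
    return rows.fetchall()
```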
With this infrastructure, we're now done with the primary daemon, recording Exposure Notification identifiers to the database and broadcasting our own identifiers. That's not interesting if we never *do* anything with that data, though. Onwards!
Once per day, Exposure Notifications implementations are expected to query the server for Temporary Exposure Keys associated with diagnosed COVID-19 cases. From these keys, the cryptography implementation can reconstruct the associated Rolling Proximity Identifiers, for which we can query the database to detect if we have been exposed.
Per Google's documentation, the servers are expected to return a `zip` file containing two files:
* `export.bin`: [Protocol Buffers](https://en.wikipedia.org/wiki/Protocol_Buffers) containing Diagnosis Keys
* `export.sig`: a signature for the export with the public health agency's key
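Unpacking such an archive takes only the standard library. The sketch below assumes, per my reading of Google's export format documentation, that `export.bin` starts with a 16-byte header (`EK Export v1` padded with spaces) ahead of the serialized protocol buffer; `read_export` is a hypothetical helper name, and decoding the protocol buffer itself is left out.

```python
import io
import zipfile

# Assumed header: "EK Export v1" right-padded with spaces to 16 bytes.
EXPORT_HEADER = b"EK Export v1".ljust(16)


def read_export(zip_bytes: bytes):
    """Return (protobuf_payload, signature) from a downloaded export zip."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        export = archive.read("export.bin")
        signature = archive.read("export.sig")
    if not export.startswith(EXPORT_HEADER):
        raise ValueError("export.bin does not start with the expected header")
    # Everything after the header is the serialized protocol buffer.
    return export[len(EXPORT_HEADER):], signature
```

The returned payload is what gets handed to the Python protocol buffers implementation.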
The signature is not terribly interesting to us. On Android, it appears the system pins the public keys of recognized public health agencies as an integrity check for the received file. However, this public key is given directly to Google; we don't appear to have an easy way to access it.
Does it matter? For our purposes, it's unlikely. The Canadian key retrieval server is already transport-encrypted via HTTPS, so tampering with the data would already require compromising a certificate authority in addition to intercepting the requests to <https://canada.ca>. Broadly speaking, that limits attackers to nation-states, and since Canada has no reason to attack its own infrastructure, that limits our threat model to foreign nation-states. International intelligence agencies probably have better uses of resources than getting people to take extra COVID tests.
It's worth noting other countries' implementations could serve this zip file over plaintext HTTP, in which case this signature check becomes important.
Focusing then on `export.bin`, we may import the relevant protocol buffer definitions to extract the keys for matching against our database. Since this requires only read-only access to the database and executes infrequently, we can safely perform this work from a separate process written in a higher-level language like Python, interfacing with the cryptography routines over the Python foreign function interface `ctypes`. Extraction is easy with the Python protocol buffers implementation, and downloading should be as easy as a `GET` request with the standard library's `urllib`, right?
Here we hit a gotcha: the retrieval endpoint is guarded behind an HMAC, requiring authentication to download the `zip`. The protocol documentation states:
Of course there's no reliable way to truly authenticate these requests in an environment where millions of devices have immediate access to them upon downloading an Application: this scheme is purely to make it much more difficult to casually scrape these keys.
Ah, security by obscurity. Calculating the HMAC itself is simple given the documentation, but it requires a "secret" HMAC key specific to the server. As the documentation is aware, this key is hardly secret, but it's not available on the Canadian Digital Service's official repositories. Interoperating with the upstream servers would require some "extra" tricks.
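To show how little is involved, here is a sketch of the calculation with Python's `hmac` module. The message layout is an assumption made purely for illustration (region, date number, and current hour number joined by colons, hex-encoded HMAC-SHA256); the authoritative layout is whatever the server's protocol documentation specifies, and the key used here is an obvious placeholder.

```python
import hashlib
import hmac
import time


def retrieval_auth(hmac_key: bytes, region: str, date_number: int, now=None) -> str:
    """Hex HMAC-SHA256 over an ASSUMED 'region:date:hour' message layout."""
    hour = int(time.time() if now is None else now) // 3600
    message = f"{region}:{date_number}:{hour}".encode()
    return hmac.new(hmac_key, message, hashlib.sha256).hexdigest()


# Placeholder key and made-up arguments; a sandbox server would supply its own.
token = retrieval_auth(b"not-the-real-key", "302", 18512, now=1599480000)
```

Because the hour is part of the message in this sketch, a token is only good for a short window, which fits the stated goal of discouraging casual scraping rather than providing real authentication.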
Out of purely academic interest, we can write and debug our implementation without any such authorization by running our own sandbox server. Minus the configuration, the server source is available, so after spinning up a virtual machine and fighting with Go versioning, we can test our Python script.
Speaking of a personal sandbox...
There is one essential edge case to the contact tracing implementation, one that we *can't* test against the Canadian servers. And edge cases matter. In effect, the entire Exposure Notifications infrastructure is designed for the edge cases. If you don't care about edge cases, you don't care about digital contact tracing (so please, stay at home.)
The key feature -- and key edge case -- is uploading Temporary Exposure Keys to the Canadian key server in case of a COVID-19 diagnosis. This upload requires an alphanumeric code generated by a healthcare provider upon diagnosis, so if we used the shared servers, we couldn't test an implementation. With our sandbox, we can generate as many alphanumeric codes as we'd like.
Once sandboxed, there isn't much to the implementation itself: the keys are snarfed out of the SQLite database, we handshake with the server over protocol buffers marshaled over POST requests, and we throw in some public-key cryptography via the Python bindings to `libsodium`.
This functionality neatly fits into a second dedicated Python script which does _not_ interface with the main library. It's exposed as a command line interface with flow resembling that of the mobile application, adhering reasonably to the UNIX philosophy. Admittedly I'm not sure wrestling with the command line is top on the priority list of a Linux hacker ill with COVID-19. Regardless, the interface is suitable for higher-level (graphical) abstractions.
Problem solved, but of course there's a gotcha: if the request is malformed, an error should be generated as a key robustness feature. Unfortunately, while developing the script against my sandbox, a bug led the request to be dropped unexpectedly, rather than returning with an error message. On the server (implemented in Go), there was an apparent `nil` dereference. Oops. Fixing this isn't necessary for this project, but it's still a bug, even if it requires a COVID-19 diagnosis to trigger. So I went and did the Canadian thing and sent a pull request.
All in all, we end up with a Linux implementation of Exposure Notifications functional in Ontario, Canada. What's next? Perhaps supporting contact tracing systems elsewhere in the world -- patches welcome. Closer to home, while functional, the aesthetics are not (yet) anything to write home about -- perhaps we could write a touch-based interface for mobile Linux environments like Plasma Mobile and Phosh, maybe even running it on an Android flagship flashed with postmarketOS to go full circle.
Source code for `liben` is available for anyone who dares go near. Compiling from source is straightforward but necessary at the time of writing. As for packaging?
Here's hoping COVID-19 contact tracing will be obsolete by the time `liben` hits Debian stable.