By: The Depressed Milkman
Date: 2021-07-07
Tags: #typesystems
There are a multitude of contexts within which programs must process sensitive data securely. Sensitive data is information such as passwords and private keys - private data which is never publicly revealed. However, programs processing sensitive data occasionally have vulnerabilities due to which they leak sensitive data through side-channels (e.g. GnuPG). Many of these vulnerabilities can be attributed to compiler optimizations and transformations which did not respect the sensitivity of data because they were not informed of it, due to limitations of the programming languages used. This document discusses approaches for providing support within programming languages (more specifically, their typesystems) for identifying and correctly processing sensitive data. It is most applicable to strongly and statically typed programming languages which support zero-cost encapsulation, such as Rust and Zig.
A new built-in type wrapper, `sec T` for every type `T`, is described, called a secure type. A secure 32-bit unsigned integer, for example, would be written as `sec u32`. A secure value (a value of a secure type) is sensitive data, and operations on secure types will (attempt to) not leak that data. The compiler is also required to not transform (e.g. due to optimizations) operations upon secure values such that the results are leaked.
A secure value cannot be used in place of an insecure one. For example, the condition expression in an `if` statement takes a `bool`, not a `sec bool`. This prevents sensitive data from being revealed due to the differing characteristics of the branches of the `if` statement (e.g. time taken or memory accessed). Instead, a secure selection operation is provided, which takes a `sec bool` condition and two secure values and chooses one in a manner which does not leak the condition.
Operations upon secure values typically produce new secure values. For example, adding together two secure integers results in a new secure integer. This ensures that secure data is not accidentally transformed into insecure data.
An explicit cast operation is provided for considering secure values as insecure ones. This must only be used when the secure value is safe to leak. This can happen, for example, when sensitive data like a private key has been transformed into public data like a signature - within the code, all the processed data can be considered sensitive until the final signature data has been formed, at which point it is necessary to make it insensitive so that it can be shared externally. Note that the compiler cannot prove that the signature is safe to leak, partially because the mathematics behind the operation is too complex, and partially because the private key can be leaked if enough signatures are produced - but the compiler would not understand that the probability is too low for this scenario to be significant.
The compiler cannot guarantee that secure values will never be leaked. Hardware vulnerabilities like Spectre break the fundamental assumptions that programs make about memory security, and so it is necessary to make clear that no complete guarantee can be made about the security of sensitive data. However, a program written using secure types should not voluntarily leak any data at all, especially through recognized side channels (primarily time, memory access due to caching, and power consumption). Compilers must explicitly document what side channels they protect against, and if protection varies enough between competing compilers for a language, dedicated feature switches and compile-time tests may be necessary.
-- (C) CC-BY-NC-SA 4.0 2021-07-07 The Depressed Milkman