There's no reason for C to have just a single stack

This patch adds the safe stack instrumentation pass to LLVM, which separates the program stack into a safe stack, which stores return addresses, register spills, and local variables that are statically verified to be accessed in a safe way, and the unsafe stack, which stores everything else. Such separation makes it much harder for an attacker to corrupt objects on the safe stack, including function pointers stored in spilled registers and return addresses. You can find more information about the safe stack, as well as other parts of our control-flow hijack protection technique, in our OSDI paper on code-pointer integrity (http://dslab.epfl.ch/pubs/cpi.pdf [1]) and our project website (http://levee.epfl.ch [2]).

Via Hacker News [3], “Protection against stack-based memory corruption errors using SafeStack ⋅ llvm-mirror/llvm@7ffec83 [4]”

I'm really surprised this wasn't done sooner. Nothing in the C Standard mandates how the call stack [5] must be implemented, but for the past forty years or so the system stack (the stack the CPU (Central Processing Unit) uses to store return addresses for subroutine calls, or state information when handling interrupts) has been the default place to store it. That is the prime reason buffer overflows [6] are so dangerous: by overrunning a buffer on the stack, an attacker can embed machine code and overwrite a return address, so that the CPU returns into the embedded code to do nefarious things. (“Using fingerd, the worm initiated a memory overflow situation by sending too many characters for fingerd to accommodate (in the gets library routine). Upon overflowing the storage space, the worm was able to execute a small arbitrary program. Only 4.3BSD VAX machines suffered from this attack.” [7])

Sure, dedicating another CPU register to point to this secondary stack might appear wasteful, but most C compilers on modern systems already use a second CPU register, the frame pointer, to point into the primary stack (to make it easier to produce stack traces when debugging or analyzing a core dump [8]). Also, the system stack wouldn't have to be so big. Even an 8K (kilobyte) stack on a 64-bit machine would easily allow a call depth (A calls B which calls C which calls D is a call depth of 4) of over 500 (technically 1,024, at eight bytes per return address, but that would leave no room for other uses of the stack, such as temporarily saving registers or handling interrupts), which should handle most programs (the exception being badly written programs with unbounded recursion [9]; sidenote to Google [10]: good one, you got me [11]).
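The call-depth arithmetic above is easy to check. Here is a minimal sketch in C; the function names are made up for illustration, and the 4K reserve used in the note below is only an assumption standing in for "other uses of the stack", not a figure from the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Upper bound on call depth if the system stack held nothing but
 * return addresses: one slot of slot_bytes per active call. */
size_t max_call_depth(size_t stack_bytes, size_t slot_bytes)
{
    assert(slot_bytes > 0);
    return stack_bytes / slot_bytes;
}

/* The same bound after setting aside reserved_bytes for other uses
 * of the stack (saved registers, interrupt state). The size of the
 * reserve is the caller's guess, not something measured. */
size_t usable_call_depth(size_t stack_bytes, size_t slot_bytes,
                         size_t reserved_bytes)
{
    assert(slot_bytes > 0);
    if (reserved_bytes >= stack_bytes)
        return 0;                       /* nothing left for calls */
    return (stack_bytes - reserved_bytes) / slot_bytes;
}
```

With an 8,192-byte stack and 8-byte return addresses, max_call_depth(8192, 8) gives 1,024, and holding back half the stack with usable_call_depth(8192, 8, 4096) still leaves 512 calls.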

This patch, however, still seems to keep most of the call stack on the system stack, with the exception of arrays and variables whose address is taken, which go on the “unsafe stack”; the pointer to the unsafe stack is stored in a thread-specific variable. It turns out the performance loss isn't that bad, as most routines don't have such problematic variables to begin with.
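To make the distinction concrete, here is a rough sketch in C of the two kinds of routines. Both functions are hypothetical, and the comments about where each local would end up are my reading of the patch description, not output from the compiler; the real decision is made by the pass's own analysis. In Clang the feature is enabled with -fsanitize=safe-stack.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Only scalars, and no address is ever taken: by the patch's
 * description, nothing here needs to leave the regular (safe) stack. */
int sum_to(int n)
{
    int total = 0;
    for (int i = 1; i <= n; i++)
        total += i;
    return total;
}

/* buf is an array whose address is handed to memcpy(), which is the
 * kind of local the patch describes as going on the unsafe stack.
 * (Assumption: placement is the compiler's call, not the source's.) */
size_t copy_bounded(const char *src, size_t len)
{
    char buf[16];

    assert(src != NULL);
    if (len >= sizeof buf)
        len = sizeof buf - 1;           /* truncate, leave room for NUL */
    memcpy(buf, src, len);
    buf[len] = '\0';
    return strlen(buf);
}
```

Most routines look like sum_to, which is presumably why the overhead stays small: only functions like copy_bounded need to touch the second stack at all.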

[1] http://dslab.epfl.ch/pubs/cpi.pdf

[2] http://levee.epfl.ch/

[3] https://news.ycombinator.com/item?id=9782368

[4] https://github.com/llvm-mirror/llvm/commit/7ffec838a2b72e6841d9fb993b5fe6a45f3b2a90

[5] https://en.wikipedia.org/wiki/Call_stack

[6] https://en.wikipedia.org/wiki/Buffer_overflow

[7] https://tools.ietf.org/html/rfc1135

[8] https://en.wikipedia.org/wiki/Core_dump

[9] https://en.wikipedia.org/wiki/Recursion

[10] https://www.google.com/

[11] https://www.google.com/search?q=recursion
