Roy Marples added "privilege separation" to dhcpcd.
This basically means that the software runs as multiple processes
which communicate with each other. Lots of regular operations
can be performed in underprivileged subprocesses that aren't
allowed to do much, which greatly minimizes the impact of exploits.
He wrote a couple of blog posts comparing OS-specific techniques
for restricting processes:
Capsicum vs Pledge in a Network Management Tool
Capsicum vs Pledge in a Network Management Tool ... Part 2
... and then followed up with a third after I poked him on IRC
and pointed out that setrlimit is a thing:
Capsicum vs Pledge Final Thoughts
Perhaps you write super awesome code in languages that give you
a nice sense of safety and a remote execution bug is totally
out of the question.
Maybe you're a super devops and containerize everything so you're
totally safe, at least until the latest container escape bug comes out.
Even then, it's nice to clearly state what your program is
allowed to do. It gives you constraints to work with and
makes bugs really obvious.
Perhaps the easiest way to do sandboxing is with separate user
accounts. If your process changes from root to an unprivileged
user shortly after starting, it greatly reduces opportunity
for abuse.
Most sensitive data on most systems is stored in files with
fine-grained permissions, so not being root really helps.
In most cases root is also allowed to do nasty system-wide
things that ordinary users aren't.
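A minimal sketch of the root-to-unprivileged-user switch, assuming the target uid and gid have already been looked up (for example with getpwnam). The function name drop_privileges is made up for illustration. setgroups is not in POSIX but is widely available, and the order matters: groups first, user ID last, because after setuid the process no longer has the right to change its groups.

```c
#include <grp.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: permanently switch to the given user and group.
 * Returns 0 on success, -1 on failure. */
int drop_privileges(uid_t uid, gid_t gid)
{
    /* Only root can change the supplementary group list, so clear it
     * while we still can. */
    if (geteuid() == 0 && setgroups(0, NULL) == -1)
        return -1;
    /* Group before user: once the uid is dropped, setgid is refused. */
    if (setgid(gid) == -1)
        return -1;
    if (setuid(uid) == -1)
        return -1;
    /* Sanity check: getting root back must now be impossible. */
    if (uid != 0 && setuid(0) == 0)
        return -1;
    return 0;
}
```

Call it as early as the program can manage, after anything that genuinely needs root (binding a low port, say) and before touching untrusted input.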
POSIX has a setrlimit function. Combined with resource limits like
RLIMIT_NOFILE (and RLIMIT_NPROC, which is not in POSIX but is widely
available), you can do things like prevent your
process from opening any more files, or spawning any
new subprocesses. This is really nice until you run into
system-specific quirks, as Roy discovered in his last
blog post.
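A minimal sketch of the setrlimit approach. The function names are made up for illustration, and RLIMIT_NPROC is assumed to exist only where the system headers define it. Setting both the soft and hard limit to zero means an unprivileged process cannot raise them again. Descriptors that are already open keep working.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/resource.h>
#include <unistd.h>

/* Hypothetical helper: refuse any new file descriptors from now on. */
int forbid_new_files(void)
{
    struct rlimit none = { .rlim_cur = 0, .rlim_max = 0 };
    return setrlimit(RLIMIT_NOFILE, &none);
}

/* Hypothetical helper: refuse new processes for this user, where the
 * system offers RLIMIT_NPROC. On Linux the limit is not enforced for
 * root, so this belongs after the switch to an unprivileged user. */
int forbid_new_processes(void)
{
#ifdef RLIMIT_NPROC
    struct rlimit none = { .rlim_cur = 0, .rlim_max = 0 };
    return setrlimit(RLIMIT_NPROC, &none);
#else
    errno = ENOSYS;
    return -1;
#endif
}

/* Returns 0 if path can be opened, otherwise the errno from open(). */
int try_open(const char *path)
{
    int fd = open(path, O_RDONLY);
    if (fd == -1)
        return errno;
    close(fd);
    return 0;
}
```

After forbid_new_files() succeeds, try_open("/") should report EMFILE instead of 0. Anything the program needs (log files, sockets, pipes to the parent) has to be opened before the limit is applied.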
chroot is very useful for application authors for
one particular purpose: changing the root filesystem for your
process to an empty directory is a surefire promise that
your process will not open any new files from this point forwards.
It is often claimed that chroot is not a security mechanism.
Indeed, many UNIX vendors have refused to patch potential chroot
escapes.
However, it has clearly been used as one for a long time.
chroot also requires root, so you have to do it before
dropping to your underprivileged user.
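A minimal sketch of the empty-directory chroot, assuming the process is still root at this point and that the caller supplies a directory that is empty and not writable by the unprivileged user. The function name enter_empty_root is made up for illustration.

```c
#include <unistd.h>

/* Hypothetical helper: make empty_dir the root of the filesystem for
 * this process. Returns 0 on success, -1 on failure (errno is set,
 * typically EPERM when not running as root). */
int enter_empty_root(const char *empty_dir)
{
    if (chroot(empty_dir) == -1)
        return -1;
    /* chroot does not move the working directory. Without this chdir,
     * relative paths would still resolve in the old tree. */
    if (chdir("/") == -1)
        return -1;
    return 0;
}
```

On systems that provide it, /var/empty is a common choice for the directory. The switch to the unprivileged user has to come after this call, since it is the root privilege that makes chroot possible in the first place.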
At this point I can point to Daniel J Bernstein's extremesandbox.c
as a classic example of these techniques.
Essentially, you build a list of low level system calls you
expect your process to use into the binary, then pass
this to the kernel in some way, and then it enforces
this usage.
This seems to be very commonly deployed these days, thanks to
Linux's seccomp-bpf. The BSDs previously had something
similar in systrace, but it very much went out of fashion
as bugs were found.
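A minimal sketch of a system call allow list using seccomp-bpf on Linux, assuming x86_64 or aarch64 and kernel headers that ship linux/seccomp.h. The function names and the choice of allowed calls (read, write, exit, exit_group) are made up for illustration. A real program would need a longer list, and working out that list is the hard part.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stddef.h>
#include <linux/audit.h>
#include <linux/filter.h>
#include <linux/seccomp.h>
#include <sys/prctl.h>
#include <sys/syscall.h>
#include <sys/wait.h>
#include <unistd.h>

#if defined(__x86_64__)
#define SANDBOX_ARCH AUDIT_ARCH_X86_64
#elif defined(__aarch64__)
#define SANDBOX_ARCH AUDIT_ARCH_AARCH64
#else
#error "add the AUDIT_ARCH_* constant for this architecture"
#endif

/* Two BPF instructions: if the loaded syscall number equals nr, allow. */
#define ALLOW(nr) \
    BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, (nr), 0, 1), \
    BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_ALLOW)

/* Install the filter on the calling thread. Returns 0 or -1. */
int install_allow_list(void)
{
    struct sock_filter filter[] = {
        /* Syscall numbers differ per ABI, so reject any other ABI. */
        BPF_STMT(BPF_LD | BPF_W | BPF_ABS, offsetof(struct seccomp_data, arch)),
        BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, SANDBOX_ARCH, 1, 0),
        BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_KILL),
        /* Compare the syscall number against the allow list. */
        BPF_STMT(BPF_LD | BPF_W | BPF_ABS, offsetof(struct seccomp_data, nr)),
        ALLOW(__NR_read),
        ALLOW(__NR_write),
        ALLOW(__NR_exit),
        ALLOW(__NR_exit_group),
        /* Everything else fails with EPERM instead of running. */
        BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_ERRNO | EPERM),
    };
    struct sock_fprog prog = {
        .len = (unsigned short)(sizeof(filter) / sizeof(filter[0])),
        .filter = filter,
    };
    /* Needed so an unprivileged process may install a filter. */
    if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0) == -1)
        return -1;
    return prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &prog);
}

/* Fork a child, sandbox it, and have it try getpid, which is not on
 * the list. Returns 0 if the kernel refused the call. */
int demo_filter_blocks_getpid(void)
{
    int status;
    pid_t pid = fork();
    if (pid == -1)
        return -1;
    if (pid == 0) {
        if (install_allow_list() == -1)
            _exit(2);
        long r = syscall(SYS_getpid);
        _exit(r == -1 && errno == EPERM ? 0 : 1);
    }
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The filter only applies to the process that installed it and its descendants, which is why the demo sandboxes a forked child and leaves the parent free to wait for it.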
System call allow lists are system-specific by definition.
If you use libraries that abstract the OS away, you can
probably make a reasonable guess at what they currently do,
but not necessarily what they might do in the future.
System call restrictions have to face a fundamental problem with
how software development works: most of the time we do not use
system calls, we use nice friendly libraries that wrap those
system calls. The precise system calls the library uses are what
we in the trade call an "implementation detail".
The standard way to talk to a UNIX system is through libc.
If you're writing a programming language, it's probably safer
to bind to libc than to use syscalls directly, since system calls have
not traditionally been seen as a stable interface. (Note: in
NetBSD even using libc involves abstractions: functions are
versioned to avoid ABI breakage, and this is hidden from the
programmer.)
This is the approach OpenBSD took with their pledge sandboxing
mechanism.
My primary problem with this is that the categories they chose to let
you allow seem both too broad and too tied to the C programming language:
do I really want to allow stdio?
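For a sense of what those categories look like in code, here is a minimal sketch of a pledge call. It only does anything on OpenBSD; elsewhere the wrapper reports ENOSYS. The function name and the particular promises are made up for illustration.

```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: keep only basic I/O and read-only file access.
 * Returns 0 on success, -1 on failure or on systems without pledge. */
int restrict_to_stdio_and_reading(void)
{
#ifdef __OpenBSD__
    /* "stdio" covers a broad set of calls on already-open descriptors
     * plus basics like memory allocation; "rpath" adds read-only
     * filesystem access. NULL leaves the exec promises unchanged. */
    return pledge("stdio rpath", NULL);
#else
    errno = ENOSYS;
    return -1;
#endif
}
```

Promises can only be reduced by later calls, so a program can start broad during setup and narrow down once initialization is done.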
As a programmer, you're probably far more aware of what resources
your program requires than what system calls it might happen to
use. This is why I like setrlimit - it's much easier to understand
how many files a program might open.
It also happens that most of the harm you can do as a naughty exploiter,
if you happen to take over a process, involves using resources. Maybe
you want to read some private data from a file and send it over a
network socket: that involves opening several new resources.
I'd already spent a while thinking about this before I learned about
Solaris privileges and the setppriv system call in illumos.
The setppriv model provides a nice abstraction where you have
to think about the resources your code is using, but not necessarily
the system calls (or indeed areas of the C library) it wants to use.
I think it's very interesting, and I think it's a shame that, like
many innovative features in OS development, it's been slightly
forgotten.
It's supported in portable OpenSSH, look for the file sandbox-solaris.c.
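A minimal sketch of the privilege-set style, assuming an illumos or Solaris system with priv.h. Elsewhere the wrapper reports ENOSYS. The function name and the choice of privileges to remove are made up for illustration, and this is not a copy of what sandbox-solaris.c does.

```c
#include <errno.h>
#include <stddef.h>
#if defined(__sun)
#include <priv.h>
#endif

/* Hypothetical helper: start from the basic privilege set and give up
 * the ability to fork, exec and (where defined) use the network. */
int drop_fork_exec_and_network(void)
{
#if defined(__sun)
    int rc = -1;
    priv_set_t *set = priv_allocset();
    if (set == NULL)
        return -1;
    priv_basicset(set);  /* what an ordinary user process starts with */
    if (priv_delset(set, PRIV_PROC_FORK) != 0 ||
        priv_delset(set, PRIV_PROC_EXEC) != 0)
        goto out;
#ifdef PRIV_NET_ACCESS
    if (priv_delset(set, PRIV_NET_ACCESS) != 0)
        goto out;
#endif
    /* Shrink the permitted set, then the limit set so the privileges
     * cannot come back, then what child processes would inherit. */
    if (setppriv(PRIV_SET, PRIV_PERMITTED, set) == 0 &&
        setppriv(PRIV_SET, PRIV_LIMIT, set) == 0 &&
        setppriv(PRIV_SET, PRIV_INHERITABLE, set) == 0)
        rc = 0;
out:
    priv_freeset(set);
    return rc;
#else
    errno = ENOSYS;
    return -1;
#endif
}
```

The names being removed describe what the process may do (create processes, run programs, talk to the network) and say nothing about which system calls or libc functions are involved.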