💾 Archived View for thrig.me › blog › 2022 › 12 › 14 › duplicate-environment-variables.gmi captured on 2024-09-29 at 00:20:32. Gemini links have been rewritten to link to archived content
-=-=-=-=-=-=-
A common problem is actually convincing programmers that duplicate environment variables are possible on unix; most programmers interact with the environment through an interface that gives the impression that environment variables are unique in their Platonic splendor.
$ env FOO=bar perl -E 'say $ENV{FOO}'
bar
$ env FOO=bar cfu 'printf("%s\n", getenv("FOO"))'
bar
$ env FOO=bar perl -E '$ENV{FOO} = "baz"; say $ENV{FOO}'
baz
$ env FOO=bar cfu 'setenv("FOO", "baz",1);printf("%s\n", getenv("FOO"))'
baz
See? FOO only has one value that changes when you update it. Therefore, environment variables are unique. Q.E.D.
Wrong!
%ENV (a hash or associative array or dictionary, typical in those filthy scripting languages) and getenv(3) (the C library function interface) are abstractions built on top of something. What is that something? A computer! That models a PDP-11! No, closer to home. Various approaches work here, such as delving into the code for getenv, or reading documentation such as environ(7), which may mention something along the lines of
NAME
     environ - user environment

SYNOPSIS
     extern char **environ;

DESCRIPTION
     An array of strings called the "environment" is made available by
     execve(2) when a process begins.  By convention these strings have the
     form name=value.
That's from OpenBSD; other unixlikes may vary with the documentation. But the gist is that **environ is an array of strings, which if you know anything about C might look a lot like *argv[] or the equivalent **argv for the arguments given to a program.
// argarg - print two args
#include <stdio.h>

int main(int argc, char **argv)
{
    if (argc > 2) {
        printf("%s %s\n", argv[1], argv[2]);
    }
    return 0;
}
Thus we can have duplicate entries in **argv, a point that few will dispute:
$ make argarg
cc -O2 -pipe -o argarg argarg.c
$ ./argarg foo foo
foo foo
Given this, what do you think **environ might allow by way of duplicates? This takes a bit more work to set up, but luckily someone wrote a small program that helpfully creates duplicate environment variables, somewhere under the glorious mess that is
https://thrig.me/src/scripts.git
With dupenv, we can wrap env, here merely to report what environment variables are set, and see if two FOO exist.
$ dupenv FOO=bar FOO=baz env | grep FOO
FOO=bar
FOO=baz
Nope, FOO is not Platonic. More like contingent arising... I digress.
$ dupenv SHELL=/bin/sh SHELL=/bin/ed env | grep \^SHELL
SHELL=/bin/ksh
SHELL=/bin/sh
SHELL=/bin/ed
Whose shell is it, anyways?
The duplication has been known and published since at least the 1990s, though not widely known, even among unix users. There have been various band-aids put in place, because if you pick the wrong environment variable or otherwise fail to clean up the list, you get security vulnerabilities that were there for something like 35 years,
https://www.sudo.ws/repos/sudo/rev/d4dfb05db5d7
whoops, and a complicating factor is that different languages have put the band-aid on in different ways, or not at all, and some will pick the first of any duplicated environment variables (C, Go, Perl, sudo, zsh, ...) while others will pick the last of any duplicated environment variables (bash, ksh, ...). Also, languages vary as to whether they de-duplicate environment variables, whether a de-duplicated list or the original list is passed to child processes, etc.
$ dupenv FOO=aaa FOO=ZZZ cfu 'printf("%s\n", getenv("FOO"))'
aaa
$ dupenv FOO=aaa FOO=ZZZ ksh -c 'echo $FOO'
ZZZ
$ dupenv FOO=aaa FOO=ZZZ zsh -c 'echo $FOO'
aaa
$ dupenv FOO=aaa FOO=ZZZ expect -c 'puts "$env(FOO) [exec sh -c {echo $FOO}]"'
aaa ZZZ
$ dupenv FOO=aaa FOO=ZZZ python3 -uc 'import os;print(os.environ["FOO"]);os.execvp("env",["env"])' | egrep 'aaa|ZZZ'
aaa
FOO=aaa
FOO=ZZZ
$ dupenv FOO=aaa FOO=ZZZ perl -E 'say $ENV{FOO};exec qw(sh -c), q{echo $FOO}'
aaa
aaa
Buyer beware?
Good question! One may note that some of the above languages pass duplicate environment variables to programs they run--garbage in, garbage out--and that some tools use the last of the duplicate values instead of the first. This is wiggle room for an attacker, and perhaps enough wiggle room to embiggen the CVE list. What would happen if say, hypothetically, you have some Python code that runs some bash scripts, and the bash scripts see completely different values for PATH or LD_PRELOAD or who knows what other environment variables? What could an attacker do with that difference? Could there be an information leak, or an escalation of privileges?
Are the programmers in error? At a certain level of abstraction, no. In C, using only the getenv and setenv interface, the environment will not appear to contain duplicates, and this will in most cases not cause a problem. In Go or Perl, where the environment list is de-duplicated, it is even more true that environment variables are unique, though Go does let one create a new []string with duplicates for the syscall.Exec call.
There are security ramifications.
tags #perl #c #go #unix