Robust Defence Against Directory Traversal attacks

2022-02-10 | #security | @Acidus

I've seen a lot of really positive discussion happening as a result of the gemserv security incident (more details coming, I promise). People are updating their software, finding and fixing security issues in other servers, and discussing different ways to handle URL paths to avoid directory traversal attacks.

Updating Gemserv

JAGS-PHP security fix

Alexey's idea for protection

Kevin@susa's idea for protection

An anarchist take on parsing URLs/encoding

Security is something close to my heart. Once upon a time, I had hair, it was green, and I broke into websites for a living. Now I'm mostly bald, somewhat gray, but still I thought I'd provide some feedback on the various approaches to securing capsules I've been seeing, using my lens of experience.

What is a Directory Traversal attack?

For context, what people have been discussing recently are called "Directory Traversal" attacks. Basically, this means sending a request that tricks the server into returning files or directory listings outside the public root of the capsule. Like this:

gemini://example.com/dir/../../../../etc/ssh/ssh_config

OWASP's excellent page on directory traversal attacks.

While we have been talking about directory traversal, it's important to understand that it is just a specific example of a more fundamental problem: insecurely processing untrusted user input. Directory traversals happen because the URL the user is requesting is the input. They completely control it, and can craft it to be malicious. There is nothing you can do about that. Period. Full stop. All you can do is make sure your capsule software safely handles that input, and doesn't fail in a way that allows a bad thing to happen.

Deny-listing and Allow-listing

When someone sees how an attacker exploits a security issue, the response is often: "Ok, I'll just fail if I see someone do that." I've read many gemlog posts about just not allowing specific characters. This is what we now call deny-listing. Deny-listing is when you look at user input and reject it if it has specific "bad" characters you want to deny. For example, if "../" allows someone to go "up" the directory tree, check the input for that and deny it.

Deny-listing can be effective, but it is not usually a good long-term strategy since it fails "open". Meaning, as soon as someone discovers a way to do something dangerous that doesn't use the characters you are blocking, your defense stops working.

As an example, the directory traversal attack against gemserv didn't use "../". It didn't even use "/". It used %2F, which is the URL encoded version of "/". OK, so don't allow %2F. Well, maybe I use %252F, which is the URL encoded version of %2F, because I looked at your source code and I see that a function you call after you do your input validation looking for bad characters will automatically URL decode it. You would be amazed at the number of times I've broken into an application professionally because of a function trying to be "helpful" and doing something the programmer didn't expect.
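To make the double-encoding trick concrete, here is a small Python sketch. The deny-list and both function names are hypothetical, made up for illustration; none of this is taken from gemserv or any real server. It assumes the server decodes the URL once and that some later "helpful" function decodes it a second time.

```python
from urllib.parse import unquote

# Hypothetical deny-list: reject input containing any of these substrings.
DENY = ("../", "/", "%2f")

def naive_deny_list_ok(raw: str) -> bool:
    """Return True if no denied substring appears (case-insensitive)."""
    lowered = raw.lower()
    return not any(bad in lowered for bad in DENY)

def helpful_library_call(raw: str) -> str:
    """Stand-in for any function that URL-decodes its argument."""
    return unquote(raw)

# "%252F" is "%2F" with its "%" encoded again as "%25".
payload = "..%252F..%252Fetc%252Fpasswd"

passed_check = naive_deny_list_ok(payload)  # the deny-list sees nothing bad
once = helpful_library_call(payload)        # first decode: "%252F" becomes "%2F"
twice = helpful_library_call(once)          # second decode: "%2F" becomes "/"

print(passed_check, once, twice)
```

The check ran against the raw string, but the file system sees the twice-decoded one. Validation only means something if it happens after the last transformation.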

Deny-listing is a complex subject, more than I want to go into here. There are a lot of nuances that allow an attacker to do directory traversals besides just using "../". For example, "~" can be used to auto-resolve to a home directory, a leading "/" can be interpreted as an absolute path, or you can use Unicode escaping (even nested inside of URL encoding) to mask characters that will escape a directory. And that's not even talking about the sexy vector of Unicode character transforms.
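Two of those nuances are easy to see in Python's standard library. This is only an illustration, and the root path is an assumed example value. Neither input contains "../", yet both point outside the capsule root.

```python
import os.path
import posixpath

public_root = "/var/gemini/public"  # assumed example root

# An absolute second argument makes join() throw away everything before it.
joined = posixpath.join(public_root, "/etc/passwd")

# expanduser() rewrites a leading "~" to the current user's home directory
# (the exact result depends on the account running the code).
expanded = os.path.expanduser("~/.ssh/id_rsa")

print(joined)
print(expanded)
```

Here `joined` is "/etc/passwd", with the root silently dropped.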

Behold the 75 CVE entries for directory traversal attacks against Apache or its components in the last 20 years

So yeah. Protecting against directory traversal is surprisingly more difficult than you would think.

"So, then, we should just allow [a-z]... oh, and 0-9... wait, crap, also \s... oh wait but also..."

Congratulations! You just rediscovered what we now call allow-listing. With allow-listing you only accept input if it uses specific characters in a specific format. And you also discovered the challenge of allow-listing. Allow-listing is easy for something like a postal code or city name, but not so easy for something like a URL with a large number of allowed characters. For example, I see people on Antenna saying just allow [a-z0-9]+. I think your French and German friends may be annoyed by that. Also, the query string is part of the URL, so this restriction prevents anyone from using a Gemini search engine to query for words in other languages.
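Here is a quick Python sketch of how restrictive that pattern is. The helper name `allow_listed` is hypothetical, and the pattern is the one quoted above.

```python
import re
from urllib.parse import quote

# The allow-list people have been suggesting.
ALLOWED = re.compile(r"[a-z0-9]+")

def allow_listed(segment: str) -> bool:
    """Return True only if the whole segment matches the allow-list."""
    return ALLOWED.fullmatch(segment) is not None

plain = allow_listed("index")            # accepted
with_dot = allow_listed("index.gmi")     # rejected: "." is not in the set
french = allow_listed("café")            # rejected: "é" is not in the set
german = allow_listed("straße")          # rejected: "ß" is not in the set
encoded = allow_listed(quote("café"))    # "caf%C3%A9" is rejected too

print(plain, with_dot, french, german, encoded)
```

Every character you add to fix one of these cases is one more character you have to reason about as an attacker would.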

(I had a whole aside here about the idea of just relying on UTF-8 encoding alone but this is getting long so I'll save that for later)

A Better Way

Let's step back from allow-listing and deny-listing and think about the actual problem.

When serving a static file, a gemini server determines what file to open by combining the path of the request with the path of the capsule's root and resolving that to an absolute file path. A terribly dangerous way to do this is to concatenate them together and open the file, relying on the operating system to resolve it.

using System.IO;

// path to the capsule's public root,
// loaded via a config file or parameter
var publicRoot = "/var/gemini/public/";

// incoming request from the user
// gemini://example.com/cool/stuff/index.gmi
var requestedPath = "cool/stuff/index.gmi";

// DANGEROUS: plain concatenation, nothing is validated
var fileToOpen = publicRoot + requestedPath;
// result: "/var/gemini/public/cool/stuff/index.gmi"
using var file = File.OpenRead(fileToOpen);

Here is why that is bad:

using System.IO;

// path to the capsule's public root,
// loaded via a config file or parameter
var publicRoot = "/var/gemini/public/";

// incoming request from the user
// gemini://example.com/tmp/../../../../../etc/passwd
var requestedPath = "tmp/../../../../../etc/passwd";

// DANGEROUS: the "../" segments are passed straight to the operating system
var fileToOpen = publicRoot + requestedPath;
// result: "/var/gemini/public/tmp/../../../../../etc/passwd"
// resolves to: "/etc/passwd"
using var file = File.OpenRead(fileToOpen);
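If you want to check the "resolves to" claim yourself, this Python snippet does the same concatenation and normalizes the result. Note that `posixpath.normpath` only works on the string. It is a stand-in here for what the operating system does when none of the directories involved are symlinks.

```python
import posixpath

public_root = "/var/gemini/public/"
requested_path = "tmp/../../../../../etc/passwd"

# Same naive concatenation as in the example above.
concatenated = public_root + requested_path

# Collapse the "../" segments; extra ".." at the root are simply ignored.
resolved = posixpath.normpath(concatenated)

print(resolved)
```

The first "../" cancels "tmp", the next three climb out of "public", "gemini", and "var", and the last one has nowhere left to go, so the result is "/etc/passwd".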

To properly secure this, you would have to validate the path of the incoming request in some way, or try to ensure that no parts of the user supplied path could be interpreted to escape the root (e.g., following "~", using "../", using "/", dereferencing a symlink, etc). That's doing deny-listing or allow-listing, and if you are wrong, you are toast. Acidus is reading your ~/.ssh/ keys and hacking all your boxen.

A better option is to use a function that provides the final resolved path (the combination of the capsule root and the requested path), and ensure that this final path is still inside the capsule root. Most programming languages have a function that does this as part of their standard library. In C#, you use "Path.GetFullPath(relativePath, absolutePath)". In Go, you use "filepath.Join()" from the "path/filepath" package, which also cleans the result. This resolves what file path you would end up at and automatically handles things like processing "../". Be aware that these two functions work on the path string only and do not follow symlinks. If your capsule root can contain symlinks, resolve them as well (for example with Go's "filepath.EvalSymlinks()") before doing the check. You can then check that this resolved path is still inside the public root, and fail if it's not, as shown below:

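using System;
using System.IO;

// path to the capsule's public root,
// loaded via a config file or parameter.
// The trailing "/" matters: without it, a sibling such as
// "/var/gemini/public-old" would pass the prefix check below.
var publicRoot = "/var/gemini/public/";

// incoming request from the user
// gemini://example.com/tmp/../../../../../etc/passwd
var requestedPath = "tmp/../../../../../etc/passwd";

var fileToOpen = Path.GetFullPath(requestedPath, publicRoot);
// result: "/etc/passwd"
if (!fileToOpen.StartsWith(publicRoot, StringComparison.Ordinal))
{
	// security exception! Trying to serve a document
	// outside of the capsule root directory!
	throw new UnauthorizedAccessException("Request resolves outside the capsule root");
}
else
{
	using var file = File.OpenRead(fileToOpen);
}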
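For comparison, here is a minimal sketch of the same idea in Python. The function name `resolve_inside_root` is hypothetical, and this is one way to do it under stated assumptions (a POSIX-style file system, and a root directory the attacker cannot modify), not a definitive implementation.

```python
import os

def resolve_inside_root(public_root: str, requested_path: str) -> str:
    """Return the resolved file path, or raise if it leaves public_root."""
    # realpath() collapses "../" and also follows symlinks.
    root = os.path.realpath(public_root)

    # Strip leading slashes so the request is always treated as relative
    # to the root instead of replacing it inside join().
    relative = requested_path.lstrip("/")
    candidate = os.path.realpath(os.path.join(root, relative))

    # commonpath() compares whole path components, so a sibling like
    # "/srv/public-old" is not mistaken for being inside "/srv/public".
    if os.path.commonpath([root, candidate]) != root:
        raise PermissionError(
            f"{requested_path!r} resolves outside the capsule root"
        )
    return candidate
```

Called with the malicious request from the example above, this raises `PermissionError` instead of returning "/etc/passwd". Called with "cool/stuff/index.gmi", it returns the full path under the root. Because `os.path.realpath` follows symlinks, a symlink inside the capsule that points outside of it is rejected as well.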

Conclusion

Don't defend against directory traversal attacks using deny-listing. You will fail. Or you won't, for a while, until someone finds a way to do it without characters in your deny-list, and then you fail open.

Don't defend against directory traversal attacks using allow-listing. Sure, that's better than deny-listing, but URLs are so complex that doing this without being overly restrictive and causing problems with interoperability or other layers of your stack is hard.

Instead, use a function that resolves the user's request to the final, absolute file system path, and then verify that it is still inside the capsule root.