💾 Archived View for wilw.capsule.town › log › 2021-11-06-auto-av.gmi captured on 2023-07-10 at 13:46:57. Gemini links have been rewritten to link to archived content

Automatically scanning for malicious user-uploaded files

Posted on 06 November 2021

If you run a service that accepts file uploads from users (such as images) and then lets other users download those files, then your service is at risk of becoming a system for distributing malware. Without safeguards in place, bad actors could use your service to upload harmful files with the intention of them being downloaded by other users.

Services like Google Drive and some email providers will automatically scan files for malicious payloads, but if you - like many people - rely on more basic object storage for storing files for your apps, then there may be less default protection available.

Luckily there are a number of methods available for addressing this.

Overview

Whilst the concepts are mostly generic and framework/infrastructure agnostic, in this post I'll focus on a process that leverages Amazon S3, Lambda, and ClamAV [1]. ClamAV is open-source antimalware software that can be executed as a binary without requiring a GUI.

1

In this post I'll walk through the key stages rather than a complete implementation. The stages are:

1. Periodic refresh of malware/virus definitions

2. Running the antimalware check upon new file upload

3. Denying downloads of "infected" files

Managing and refreshing virus definitions

This stage allows ClamAV to keep up-to-date and recognise the types of files that might be infected. It involves three steps:

ClamAV provides FreshClam [2] - a tool for updating and managing a local database of virus signatures.

2

In order to obtain a `freshclam` binary for use in Lambda, I recommend installing ClamAV on an EC2 instance running Amazon Linux 2, and then copying the tool from the filesystem (you can find its location by running `which freshclam`). Note that you may also need some shared library files for the binary to work (running `ldd` against the binary lists them, and any that are missing will cause errors which will be pretty self-explanatory).

Once you have the `freshclam` binary, create and upload a Lambda Layer [3] containing the binary. Depending on the runtime your Lambda function will use, you will need to store the binary at a specific path in your Layer. Give your Layer a suitable name, such as `freshclamLayer`.
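As a rough sketch of the packaging step, the following Python builds a Layer archive. It assumes the common layout in which Layer contents are unpacked under `/opt`, with executables in `bin/` and shared libraries in `lib/`. The `build_layer_zip` helper and the file names are hypothetical, so adjust them for your runtime.

```python
import zipfile
from pathlib import Path


def build_layer_zip(binary_path, zip_path, libs=()):
    """Package a binary (and any shared libraries) into a Layer archive.

    Assumes executables belong in bin/ and shared libraries in lib/;
    check what your chosen runtime expects before relying on this.
    """
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        # ZipFile.write records the file mode, so the executable bit is kept.
        archive.write(binary_path, arcname=f"bin/{Path(binary_path).name}")
        for lib in libs:
            archive.write(lib, arcname=f"lib/{Path(lib).name}")
    return zip_path
```

The resulting zip can then be uploaded as the `freshclamLayer` Layer.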

3

Next, we need to create a Lambda function. It doesn't matter what runtime you use, so long as you can use it to execute the `freshclam` binary from the filesystem as some form of subprocess. Once ready, upload it to Lambda and reference the `freshclamLayer` Layer so that the binary can be made available to the function.

The function should output the generated signature database to an S3 bucket. As such, your function will need to have an IAM role that enables write access to your chosen bucket.
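A minimal sketch of such a function in Python might look like the following. It assumes that `freshclam` is on the `PATH` via the Layer, that a `freshclam.conf` is shipped at the hypothetical path `/opt/etc/freshclam.conf`, and that the bucket name arrives in a hypothetical `DEFINITIONS_BUCKET` environment variable. The helper names are illustrative only.

```python
import os
import subprocess

DEFINITIONS_DIR = "/tmp/clamav"  # /tmp is the writable location in Lambda
FRESHCLAM_CONF = "/opt/etc/freshclam.conf"  # hypothetical config in the Layer


def freshclam_command(datadir=DEFINITIONS_DIR, config=FRESHCLAM_CONF):
    """Build the freshclam invocation that writes signatures to `datadir`."""
    return ["freshclam", f"--config-file={config}", f"--datadir={datadir}"]


def upload_definitions(s3_client, bucket, directory=DEFINITIONS_DIR, prefix="clamav/"):
    """Upload every file in `directory` to the bucket; returns the keys written."""
    keys = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if os.path.isfile(path):
            key = prefix + name
            s3_client.upload_file(path, bucket, key)
            keys.append(key)
    return keys


def handler(event, context):
    import boto3  # provided by the Lambda Python runtime

    os.makedirs(DEFINITIONS_DIR, exist_ok=True)
    # Assumption: any non-zero exit is treated as a failure. Check the exit
    # codes documented for your freshclam version before relying on this.
    subprocess.run(freshclam_command(), check=True)
    bucket = os.environ["DEFINITIONS_BUCKET"]  # hypothetical variable name
    return {"uploaded": upload_definitions(boto3.client("s3"), bucket)}
```

Passing the S3 client in as an argument keeps the upload step easy to exercise locally with a stand-in object.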

Finally, use CloudWatch Events [4] to schedule your function to be run periodically. For example, you could update the signatures once or twice per day, depending on your needs.

4

Running the antimalware check on new files

This stage uses the virus definitions to decide whether a newly-uploaded file might be infected. It involves two key steps:

In addition to FreshClam, ClamAV also provides a tool called ClamScan [5] which can be invoked on specific files or directories in order to check them for malicious content.

5

Obtain the `clamscan` binary as described in the previous step, and bundle this into a Layer (again, as above), named something like `clamscanLayer`.

Next, create a new Lambda function (again, choose a runtime that can invoke binary subprocesses). The function should check the `event` passed to it in order to determine the path to the file that was uploaded, download the previously-uploaded virus signatures from S3, and then run `clamscan` against the target file. The output from the binary should be monitored to understand whether the file is suspected to be malicious.
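The core of that function could be sketched in Python as below. It assumes the standard S3 notification event shape and relies on `clamscan`'s exit status (0 for clean, 1 when a virus is found, anything else for an error). The function names are hypothetical.

```python
import subprocess
from urllib.parse import unquote_plus


def uploaded_objects(event):
    """Return (bucket, key) pairs from an S3 notification event."""
    return [
        (
            record["s3"]["bucket"]["name"],
            # Keys arrive URL-encoded in S3 events (spaces become "+").
            unquote_plus(record["s3"]["object"]["key"]),
        )
        for record in event.get("Records", [])
    ]


def is_infected(returncode):
    """Interpret clamscan's exit status: 0 is clean, 1 means a virus was found."""
    if returncode in (0, 1):
        return returncode == 1
    raise RuntimeError(f"clamscan failed with exit status {returncode}")


def scan_file(path, database_dir):
    """Run clamscan on one file against the signatures in `database_dir`."""
    result = subprocess.run(
        ["clamscan", f"--database={database_dir}", "--no-summary", path],
        capture_output=True,
        text=True,
    )
    return is_infected(result.returncode)
```

A handler would download both the uploaded object and the signature files into `/tmp` before calling `scan_file`.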

Once `clamscan` has finished, use an S3 object tag to record the result. For example, you could create a tag named `Infected` and pass a value of `true` or `false` depending on the scan's output. For this, you'll need to give your function relevant IAM permissions to enable object tagging in S3, read access to the uploaded objects themselves, and also read access to the bucket where the virus definitions were stored in the previous stage.
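As an illustration of the tagging step, here is a small Python sketch using the `put_object_tagging` call from boto3's S3 client. The `infected_tagging` and `tag_object` helpers are hypothetical names, and the client is passed in by the caller.

```python
def infected_tagging(infected):
    """Build the Tagging argument that records the scan result."""
    value = "true" if infected else "false"
    return {"TagSet": [{"Key": "Infected", "Value": value}]}


def tag_object(s3_client, bucket, key, infected):
    """Record the scan result on the uploaded object."""
    # Note: put_object_tagging replaces the object's whole tag set, so merge
    # in any existing tags first if your application relies on them.
    s3_client.put_object_tagging(
        Bucket=bucket, Key=key, Tagging=infected_tagging(infected)
    )
```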

Also make sure that your function uses your `clamscanLayer` Layer so that it can access the relevant binary.

Finally, configure the S3 bucket that holds user uploads with a trigger that invokes your scanning function each time a new object is put to the bucket.
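If you prefer to set the trigger up programmatically, the notification configuration could be built along these lines. This is only a sketch: the `scan_trigger_configuration` helper is a hypothetical name and the function ARN is whatever your scanning function was assigned.

```python
def scan_trigger_configuration(function_arn):
    """Build an S3 notification configuration that invokes the scanner."""
    return {
        "LambdaFunctionConfigurations": [
            {
                "LambdaFunctionArn": function_arn,
                # Fire for every kind of object creation (put, copy, etc.).
                "Events": ["s3:ObjectCreated:*"],
            }
        ]
    }
```

The resulting dictionary can be passed as `NotificationConfiguration` to boto3's `put_bucket_notification_configuration`. The function also needs a resource-based permission that allows S3 to invoke it.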

Restrict downloads of infected files

The final stage involves telling S3 to forbid access to infected files. To do so, create (or modify) the bucket policy [6] for the user uploads bucket such that it denies the `GetObject` action for any object carrying an `Infected` tag with the value `true`.
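One way such a policy might look is sketched below, generated from Python so the bucket name can be swapped in. It uses the `s3:ExistingObjectTag/<key>` condition key. The bucket name and the `deny_infected_policy` helper are placeholders.

```python
import json


def deny_infected_policy(bucket_name):
    """Build a bucket policy that blocks downloads of objects tagged as infected."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyInfectedDownloads",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:GetObject",
                "Resource": f"arn:aws:s3:::{bucket_name}/*",
                "Condition": {
                    # Matches only objects whose Infected tag equals "true".
                    "StringEquals": {"s3:ExistingObjectTag/Infected": "true"}
                },
            }
        ],
    }


print(json.dumps(deny_infected_policy("my-uploads-bucket"), indent=2))
```

Objects without the tag are not matched by this condition, so a file remains downloadable until the scan has tagged it.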

6

Conclusion

In this post I've provided a rough overview of a process that allows for scanning user-uploaded files for malicious content. Hopefully this might help if you're looking to make your services more secure for your users!
