💾 Archived View for chirale.org › 2018-05-29_4301.gmi captured on 2024-09-29 at 00:04:44. Gemini links have been rewritten to link to archived content

-=-=-=-=-=-=-

Automate log cleanup for GDPR: the Sentry case

With the General Data Protection Regulation (GDPR) enforced by the European Union, logs have to be cleaned regularly to delete IP addresses and other information about visitors. This can be interpreted as a way to protect an emerging and much-debated right, the right to be forgotten.

General Data Protection Regulation

right to be forgotten

This new regulation affects every automated logging system out there. Since Sentry is a good open source error monitoring tool* and it’s widely used, this guide shows how to clean Sentry logs on Linux systems in line with GDPR using the sentry cleanup command line utility.

Sentry

cleanup command line utility

Set a time limit for logs

Before starting, find out the maximum time a log can be kept under the policy of the service you’re working on.

In the examples below, the maximum time a log can be kept is 26 months, one of the retention periods offered by Google Analytics in its cleanup settings.

A 26-month limit for logs stored in Sentry is set like this:

env SENTRY_CONF='/usr/local/etc/sentry' sentry cleanup --days 749

where /usr/local/etc/sentry is the directory containing config.yml and sentry.conf.py, or, to clean a single project:

env SENTRY_CONF='/usr/local/etc/sentry' sentry cleanup --days 749 --project 5

where 5 is the id of the project, which you can find in Project settings > Client Keys (DSN) as the very last part of the DSN path (always an integer).
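As a minimal sketch of reading the project id off a DSN, the snippet below strips everything up to the last slash. The DSN shown is a made-up placeholder (key and host are assumptions for illustration); paste your own from the Client Keys page.

```shell
#!/bin/sh
# Hypothetical DSN: the key and host are placeholders, not real values.
dsn='https://examplepublickey@sentry.example.com/5'

# Remove the longest prefix ending in "/" (POSIX parameter expansion),
# which leaves only the last path segment: the project id.
project_id="${dsn##*/}"

echo "$project_id"
```

The resulting value is what goes after --project in the command above.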

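The 749 days are calculated like this:

30 days × 26 months = 780 days; 780 days − 31 days = 749 days

The 31 days are a safety margin: because the cleanup runs on the same day of each month, an entry can age up to 31 more days before the next run, so with this margin nothing is kept longer than 780 days.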
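The same arithmetic can be sketched in shell, so that the value passed to --days is derived from the policy instead of typed by hand. The variable names are illustrative and not part of Sentry.

```shell
#!/bin/sh
# Retention policy in months (26 in this guide's example).
retention_months=26
# Margin in days covering the gap between two monthly cleanup runs.
margin_days=31

# Approximate each month as 30 days, then subtract the margin.
retention_days=$(( retention_months * 30 - margin_days ))

echo "$retention_days"
```

For a different policy, change retention_months and pass the printed value to sentry cleanup --days.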

Apparently, sentry cleanup needs to run as root to access the postgres user and thus all Sentry database tables, so we have to put it in root’s crontab.

Schedule the cleanup

Log in as root with su - or sudo bash, run crontab -e, and add a command line like this:

. /usr/local/etc/virtualenvs/sentry/bin/activate && env SENTRY_CONF='/usr/local/etc/sentry' sentry cleanup --days 749 --project 5 && deactivate

The leading dot . is an alternative to source that is available in /bin/sh (the shell cron uses by default) and not only in /bin/bash. This avoids having to set the environment variable SHELL='/bin/bash' in the crontab.
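To see the dot command at work without touching the Sentry virtualenv, here is a small sketch that loads a throwaway file into the current shell. The file and the DEMO_VAR name are made up for the demonstration, and mktemp is assumed to be available, as it is on common Linux systems.

```shell
#!/bin/sh
# Create a throwaway file that plays the role of bin/activate.
env_file="$(mktemp)"
echo 'DEMO_VAR="loaded"' > "$env_file"

# "." reads the file into the current shell. It is specified by POSIX,
# so it works under /bin/sh, where "source" may be unavailable.
. "$env_file"

echo "$DEMO_VAR"
rm -f "$env_file"
```

Because the file is read into the current shell instead of a child process, the variable is still set after the dot command returns, which is what lets activate change the environment of the cron command.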

The resulting cron entry would be:

20 3 28 * * . /usr/local/etc/virtualenvs/sentry/bin/activate && env SENTRY_CONF='/usr/local/etc/sentry' sentry cleanup --days 749 --project 5 && deactivate

It isn’t a bad idea to add a fallback cleanup command the day after, without --project, so if you forget to schedule a cleanup for a specific project it will be done automatically:

20 3 29 * * . /usr/local/etc/virtualenvs/sentry/bin/activate && env SENTRY_CONF='/usr/local/etc/sentry' sentry cleanup --days 749 && deactivate

Now even your Sentry logs are GDPR compliant. The power of this method is that you can set a different cleanup limit for every project, according to its policies. And you don’t have to use any proprietary software to do this, just free/libre open source software.
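As a rough sketch of per-project limits, the helper below prints crontab lines for root in the same shape as the entries in this guide (03:20 on a given day of every month). The cron_line function, project 7 and its 12-month policy are hypothetical. The paths are the ones used above.

```shell
#!/bin/sh
# Print one crontab line for a monthly "sentry cleanup" run.
# Arguments: day of month, retention in days, optional project id.
cron_line() {
    day="$1"
    days="$2"
    project="${3:-}"
    cmd="env SENTRY_CONF='/usr/local/etc/sentry' sentry cleanup --days $days"
    # Limit the cleanup to one project only when an id was given.
    if [ -n "$project" ]; then
        cmd="$cmd --project $project"
    fi
    printf '20 3 %s * * . /usr/local/etc/virtualenvs/sentry/bin/activate && %s && deactivate\n' "$day" "$cmd"
}

cron_line 28 749 5   # project 5, 26-month policy from this guide
cron_line 28 329 7   # hypothetical project 7, 12 months: 12 * 30 - 31
cron_line 29 749     # fallback for every project, the day after
```

The printed lines can be reviewed and then pasted into crontab -e. The script only prints text and does not modify the crontab.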

If you are in a hurry to publish privacy policies and you have dedicated hosting, give JournaKit legalazy on GitHub a try.

* Plus it’s written on top of Django.

Django

GDPR

https://web.archive.org/web/20180529000000*/https://en.wikipedia.org/wiki/Right_to_be_forgotten

https://web.archive.org/web/20180529000000*/https://docs.sentry.io/

https://web.archive.org/web/20180529000000*/https://docs.sentry.io/server/cli/cleanup/

https://web.archive.org/web/20180529000000*/https://github.com/chirale/legalazy