For a long time, my backup methods were... questionable. A few times a year, I would manually copy the files I wanted to keep to an external hard drive. This caused several problems. The backups were rare and irregular (up to six months apart), so it was not unusual for me to lose the previous day's work because of a mistake. They were also done by hand, which was not only tedious but also error-prone: it was easy to forget files. Lastly, the drive was stored at my place, so a theft or a fire could have wiped out everything.
So it was time to adopt a much more serious backup method. My goal is to write a script that automatically backs up the files from several folders into an archive and synchronizes it with a remote server. This archive should:
- deduplicate the files, i.e. store only the changes from one date to the next (while still allowing the complete backup from any date to be extracted), so as not to take up too much space;
- compress the files, also to limit the space taken up;
- encrypt the backup, so that nobody else can read the data and it can be stored safely in the cloud.
Borg is a deduplicating backup program which supports compression and authenticated encryption. After several tries, I ended up preferring it for its simplicity and performance over its many competitors, such as Restic, Duplicacy or Duplicati.
First, a Borg repository must be initialized. To do this, we use the following command:
borg init --encryption=repokey /path/to/repo
A passphrase is required to encrypt the repository. To avoid typing it every time (in a script, for example), you can store it in a `~/.borg-passphrase` file and tell Borg how to read it through the `BORG_PASSCOMMAND` environment variable:
export BORG_PASSCOMMAND="cat $HOME/.borg-passphrase"
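Since this file protects the whole backup, it is worth making sure that only your own user can read it. Here is a minimal sketch, assuming a POSIX shell; the `write_passphrase_file` helper and its arguments are hypothetical names used for illustration:

```shell
# Hypothetical helper: store a passphrase in a file readable only by its owner.
# Usage: write_passphrase_file <file> <passphrase>
write_passphrase_file() {
    file=$1
    passphrase=$2
    # Run in a subshell so the restrictive umask does not leak into the caller.
    (
        umask 077
        printf '%s\n' "$passphrase" > "$file"
    )
    # Also tighten the permissions in case the file already existed.
    chmod 600 "$file"
}
```

It could be called as `write_passphrase_file "$HOME/.borg-passphrase" 'my long passphrase'`. Keep in mind that a passphrase typed on the command line may end up in your shell history.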
Let's imagine that we want to create a "Monday" archive containing two folders:
borg create /path/to/repo::Monday ~/folder1 ~/folder2
Since the files stored by Borg are *compressed*, the archive will be smaller than the original files.
The next day, you can create a new "Tuesday" archive with the same files:
borg create /path/to/repo::Tuesday ~/folder1 ~/folder2
Thanks to deduplication, Borg will only store new data: files that have not been modified are not added a second time, which greatly limits the size of the archive.
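To see how much space deduplication actually saves, `borg info` reports the original, compressed and deduplicated sizes of an archive. A small sketch; the `archive_stats` wrapper is a hypothetical name, and the repository path is the same placeholder as in the commands above:

```shell
# Hypothetical wrapper around `borg info` for a single archive.
# Usage: archive_stats <repo> <archive>
archive_stats() {
    repo=$1
    archive=$2
    # Prints the original, compressed and deduplicated sizes of the archive.
    borg info "${repo}::${archive}"
}
```

After `archive_stats /path/to/repo Tuesday`, the deduplicated size of the "Tuesday" archive should be much smaller than its original size if few files changed since Monday.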
Of course, in a script that automates the backup, for example on a daily basis, it is easiest to use the date as the name of the archive:
DATE=$(date +%Y-%m-%d)
borg create /path/to/repo::$DATE ~/folder1 ~/folder2
Working with the data is relatively simple. `borg list repo` lists the existing archives (dates), `borg list repo::Monday` lists the files in an archive, and `borg extract repo::Monday` extracts it into the current directory. It is even possible to use `borg mount repo::Monday mnt` to mount the archive as a filesystem and browse it directly.
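Because `borg extract` writes into the current working directory, it is safer to restore into a dedicated folder. A minimal sketch; `restore_archive` and its arguments are hypothetical names, and the repository should be given as an absolute path since the function changes directory:

```shell
# Hypothetical helper: extract an archive into a dedicated directory.
# Usage: restore_archive <repo> <archive> <destination>
restore_archive() {
    repo=$1      # absolute path to the repository
    archive=$2
    dest=$3
    mkdir -p "$dest" || return 1
    # `borg extract` writes into the current directory, so change into the
    # destination inside a subshell to leave the caller's directory untouched.
    ( cd "$dest" && borg extract "${repo}::${archive}" )
}
```

For example, `restore_archive /path/to/repo Monday ~/restore` would recreate the backed-up tree under `~/restore`. When browsing with `borg mount` instead, `borg umount mnt` unmounts the archive once you are done.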
A good backup should be replicated to a remote site. I decided to use a cloud service, which I don't mind since the data is encrypted.
I chose Scaleway, which offers a Glacier-type service: C14 Cold Storage. Its price is really low: €0.002 per GB per month, with 75 GB of free storage. Incoming and outgoing transfers, archiving and restoration are free.
For 350 GB, it costs me about €0.50 per month, compared with €3.50 at OVH or Amazon Glacier, which additionally charge for outgoing transfer or retrieval.
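The Scaleway figure can be checked with a quick calculation. This sketch assumes that the 75 GB free tier is simply subtracted from the stored volume before billing, which is my reading of the pricing rather than a documented formula; `monthly_cost` is a hypothetical helper name:

```shell
# Hypothetical helper: estimate the monthly cost in euros for a given volume.
# Assumes the first 75 GB are free and the rest is billed at EUR 0.002 per GB.
# Usage: monthly_cost <stored_gb>
monthly_cost() {
    awk -v gb="$1" 'BEGIN {
        free = 75; price = 0.002
        billable = (gb > free) ? gb - free : 0
        printf "%.2f\n", billable * price
    }'
}
```

Under these assumptions, `monthly_cost 350` prints `0.55`, which is consistent with the amount quoted above.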
Last but not least, it is a French service and allows me to store the files in France, without having to trust foreign laws.
Rclone is to cloud storage what rsync is to SSH servers: a command-line program that lets you manage or synchronize files in the cloud. It works with a large number of cloud providers.
The configuration is simple, and is done through the file `~/.config/rclone/rclone.conf` (where `access_key_id`, `secret_access_key`, `region` and `endpoint` are given by Scaleway):
[scaleway]
type = s3
provider = Scaleway
access_key_id = xxxxx
secret_access_key = xxxxx
region = fr-par
endpoint = s3.fr-par.scw.cloud
acl = private
storage_class = GLACIER
Once your bucket is created, you can then simply synchronize your Borg archive (or, really, any folder) with it:
rclone sync -v /path/to/repo scaleway:my-bucket
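Putting the pieces together, the daily job could look like the sketch below. It is only one possible arrangement: `run_backup`, `REPO` and `REMOTE` are hypothetical names, the paths and bucket are the placeholders used throughout this post, and `BORG_PASSCOMMAND` is assumed to be exported beforehand.

```shell
# Hypothetical backup routine: create today's archive, then mirror the
# repository to the remote. REPO and REMOTE are placeholders to adapt.
REPO="${REPO:-/path/to/repo}"
REMOTE="${REMOTE:-scaleway:my-bucket}"

# Usage: run_backup <folder>...
run_backup() {
    archive=$(date +%Y-%m-%d)
    # Only synchronize if the archive was created successfully, so that a
    # failed backup is not pushed to the remote.
    borg create "${REPO}::${archive}" "$@" &&
        rclone sync -v "$REPO" "$REMOTE"
}
```

Calling `run_backup ~/folder1 ~/folder2` from a scheduled job would then cover both steps. Note that Borg also returns a non-zero status for mere warnings, so you may prefer a looser check than `&&` depending on how strict you want the job to be.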