💾 Archived View for perso.pw › blog › articles › borg-vs-restic.gmi captured on 2022-04-29 at 12:25:03. Gemini links have been rewritten to link to archived content
=> https://bsd.network/@solenepercent/106274498003871032 Comment on Mastodon
Backups are important: much of our life now depends on digital data, and it needs looking after because computers are unreliable, can be stolen, and mistakes happen. I really like two programs, restic and borg. They have nearly the same features, which makes it hard to choose between them, so this is an attempt to understand how they differ for my use case.
Restic is backup software written in Go with a "push" workflow. It supports encryption, data deduplication within a repository, and multiple systems sharing the same repository.
Restic can back up to a remote sftp server and also to many network storage services such as S3/Minio, and to even more when used with the program rclone (which can turn any backend it supports into a restic-compatible backend). Restic seems compatible with Windows (I didn't try).
Borg is backup software written in Python with a "push" workflow. It supports encryption, data deduplication within a repository, and compression. You can back up to a remote server over ssh, but borg must be installed on that remote server.
It's very good and reliable backup software. It has a companion app named "borgmatic" that automates the backup process and snapshot management (daily/hourly/monthly ... and integrity checking).
I backed up my /home/ partition (minus some directories that were excluded in both cases) using borg and restic. I always ran the restic backup first and then the borg backup, measuring the bandwidth and the execution time of each.
There are five steps. The first is the init backup, which transfers a lot of data. The second and third are small changes: basically opening firefox, browsing a few pages, closing it, refreshing my emails in claws-mail (this changes a lot of small files) and using the computer for an hour. The fourth is a massive change: I found a few game installers and unzipped them, producing lots of small files instead of one big file. Finally, there were 24 hours of normal use between the fourth and the last step, which makes the fifth a good representation of a daily backup.
                                restic    borg

Data transmitted (MB)
---------------------
Backup 1 (init)                  62860   53730
Backup 2 (little changes)           15      26
Backup 3 (little changes)          168     171
Backup 4 (massive changes)        4820    3910
Backup 5 (typical day of use)       66      44

Local cache size (MB)
---------------------
Backup 1 (init)                    161      45
Backup 2 (little changes)          163      45
Backup 3 (little changes)          207      46
Backup 4 (massive changes)         211      47
Backup 5 (typical day of use)      216      47

Backup time (seconds)
---------------------
Backup 1 (init)                   2139    2999
Backup 2 (little changes)           38     131
Backup 3 (little changes)           43     114
Backup 4 (massive changes)         201     355
Backup 5 (typical day of use)       50     110

Repository size (GB)                65      56
Borg was a lot slower than restic. In my experiment, however, the remote ssh server is a dual-core Atom system, and borg runs a process on the other end to manage the data, so that CPU may have been slowing the backup down. Nevertheless, this is my real use case, and in it borg is indeed slower.
Most of the time, borg was more bandwidth efficient than restic: it saved about 15% of bandwidth for the first backup and about 19% after the big changes, but for the two small backups it used a bit more bandwidth. I have no explanation for this. I guess it depends on how file chunks are calculated: if a big database file changes, one program may be able to send only the difference and not the whole file. Borg also compresses the data (using lz4 by default), which may explain the bandwidth savings, although compression does not help with binary data.
The local cache (typically in /root/.cache/) was a lot bigger for restic than for borg, and it grew slightly with each new backup, while the borg cache barely changed.
Finally, the repository holding all the snapshots differs in size between restic and borg, 65 GB and 56 GB respectively. That is a 14% difference, which may be due to the compression done by borg.
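The relative figures discussed above can be rechecked from the table with a few lines of Python. This is only a sketch: the numbers are copied from the table, while the function and variable names (`saving_percent`, `slowdown`, `DATA_MB`, `TIME_S`, `REPO_GB`) are made up for this illustration.

```python
# Figures copied from the results table, as (restic, borg) pairs.
DATA_MB = {
    "Backup 1 (init)": (62860, 53730),
    "Backup 2 (little changes)": (15, 26),
    "Backup 3 (little changes)": (168, 171),
    "Backup 4 (massive changes)": (4820, 3910),
    "Backup 5 (typical day of use)": (66, 44),
}
TIME_S = {
    "Backup 1 (init)": (2139, 2999),
    "Backup 2 (little changes)": (38, 131),
    "Backup 3 (little changes)": (43, 114),
    "Backup 4 (massive changes)": (201, 355),
    "Backup 5 (typical day of use)": (50, 110),
}
REPO_GB = (65, 56)


def saving_percent(restic, borg):
    """Percentage saved by borg relative to restic (negative if borg used more)."""
    return (restic - borg) / restic * 100


def slowdown(restic, borg):
    """How many times longer borg took than restic."""
    return borg / restic


for step, (restic, borg) in DATA_MB.items():
    print(f"{step}: borg bandwidth saving {saving_percent(restic, borg):.1f}%")

for step, (restic, borg) in TIME_S.items():
    print(f"{step}: borg took {slowdown(restic, borg):.2f}x the restic time")

print(f"Repository size: borg saving {saving_percent(*REPO_GB):.1f}%")
```

A negative saving means borg transferred more data than restic for that step, which is what happens for the two small backups.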
I tested Restic and Borg because they are both good software using the "push" workflow (the local computer sends the data) and making a full snapshot at every backup, but there are many other backup solutions available.
- duplicity: fully scriptable, works over many remote protocols, but requires a full snapshot followed by incremental snapshots. When you need to make a new full snapshot it takes a lot of space, which is not always convenient. Supports GPG-encrypted backups stored over FTP, which is useful for some dedicated server offers that include 100 GB of free FTP.
- burp: not very well known, the setup uses TLS certificates for encryption, requires a burp server and a burp client
- rsnapshot: based on rsync, automates the rotation of backups, uses hard links to avoid duplicating files that didn't change between two backups, and pulls data from servers to a central backup system.
- backuppc: a perl app that pulls data from servers into its repository, not really easy to use
- bacula: enterprise-grade solution that I never got to work because it's really complicated, but it can support many things, even saving to tapes
In this benchmark, borg is clearly slower but was the more storage and bandwidth efficient of the two. On the other hand, restic is easier to deploy (static binary) and works with a simple sftp server, while borg requires borg installed on both sides.
The biggest difference between restic and borg is that restic supports backing up multiple systems to the same repository, allowing a massive data deduplication gain across machines, while a borg repository is meant for a single system (it could work with multiple systems, but they should not back up at the same time, and they would have to rebuild the local cache every time, which is slow).
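To illustrate why a repository shared between machines can deduplicate more, here is a toy Python sketch of a content-addressed chunk store. It is not how restic or borg are implemented: both use content-defined chunking with much larger chunks, plus encryption, while this sketch uses tiny fixed-size chunks and a hypothetical `ToyRepository` class, only to show the idea.

```python
import hashlib

CHUNK_SIZE = 4  # tiny fixed-size chunks, chosen only to keep the example readable


class ToyRepository:
    """Hypothetical chunk store: each distinct chunk is stored once, keyed by its hash."""

    def __init__(self):
        self.chunks = {}     # sha256 hex digest -> chunk bytes
        self.snapshots = {}  # snapshot name -> ordered list of digests

    def backup(self, name, data):
        """Store a snapshot and return how many new bytes had to be stored."""
        digests = []
        new_bytes = 0
        for start in range(0, len(data), CHUNK_SIZE):
            chunk = data[start:start + CHUNK_SIZE]
            digest = hashlib.sha256(chunk).hexdigest()
            if digest not in self.chunks:
                self.chunks[digest] = chunk
                new_bytes += len(chunk)
            digests.append(digest)
        self.snapshots[name] = digests
        return new_bytes

    def restore(self, name):
        """Rebuild the original data of a snapshot from its chunks."""
        return b"".join(self.chunks[d] for d in self.snapshots[name])

    def stored_bytes(self):
        return sum(len(chunk) for chunk in self.chunks.values())


# Two machines holding mostly identical data.
laptop = b"AAAABBBBCCCCDDDD"
desktop = b"AAAABBBBEEEEDDDD"

# One repository shared by both machines: common chunks are stored once.
shared = ToyRepository()
shared.backup("laptop", laptop)
shared.backup("desktop", desktop)

# One repository per machine: common chunks are stored in each repository.
laptop_repo = ToyRepository()
desktop_repo = ToyRepository()
laptop_repo.backup("laptop", laptop)
desktop_repo.backup("desktop", desktop)

print("shared repository:", shared.stored_bytes(), "bytes")
print("separate repositories:",
      laptop_repo.stored_bytes() + desktop_repo.stored_bytes(), "bytes")
```

With the shared repository, only the one chunk that differs on the second machine has to be stored, and both snapshots can still be restored in full.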
I'll stick with borg because the backup time isn't a real issue, given that it's not dramatically slower than restic, and because I really enjoy using borgmatic to manage the backups automatically.
For backups to a remote server over the Internet, bandwidth efficiency would be my main concern among all the differences, and borg seems a clear winner here.