2023-11-19 Cheap backups

Every day is backup day. I use BorgBackup to back up my laptop to one of two external disks. One of these is always at the office, so there are three copies: on my laptop, on a backup disk at home, and on a backup disk at the office. It's important that those three are never all in the same location.
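
For the curious, a Borg invocation along these lines does the heavy lifting; the repository path and archive name here are made up for illustration, not my actual setup:

# assumes the repository was initialised once with "borg init"
borg create --stats /run/media/alex/backup/borg::laptop-{now} ~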

I do concede, however, that it's a tricky setup.

For my wife, I've set up the Mac to back up the laptop and an external disk with media files to one of two external disks. Again, one of these is always at the office. So, same deal, except now I'm using Time Machine instead of BorgBackup. Somehow, that makes it take a very long time. But it seems to work.

Still, plugging in backup disks, carrying them to the office, bringing the other one back, plugging it in again… as you can imagine, we don't do this nearly often enough.

So I need a quick way to back up stuff. Like, super quick. Super cheap.

The simplest option I can think of is to use rsync on the local disk, because it has an option to create hard links for files that haven't changed. And while a hard link doesn't take zero space, it takes a lot less space than making a copy.

So what I wanted was a quick command that I can run in whatever directory I'm in, and that creates a backup of it using rsync.

I've used the following bash script. Let me know if it can be improved.

#!/usr/bin/bash
# Snapshot the current directory into a dated sibling directory.
# A relative --link-dest is resolved against the destination
# directory, so ../$d is the original directory being snapshotted.
d=$(basename "$(pwd)")
t=$(date --iso-8601)
echo "Creating a snapshot of $d in ../$d-$t"
rsync --link-dest "../$d" --archive . "../$d-$t/"
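
Save it somewhere on your PATH, say as `cheap-backup` (a name I'm making up for this example), and run it from the directory you want to snapshot:

cd ~/src/wiki
cheap-backup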

That is, if the current directory is `/home/alex/src/wiki`, the snapshot ends up in `/home/alex/src/wiki-2023-11-19`. When I use `ls -l` in the parent directory, this is what it looks like:

drwxr-xr-x 67 alex alex 749568 18. Nov 23:56 wiki/
drwxr-xr-x 67 alex alex 548864 17. Oct 23:43 wiki-2023-10-17/
drwxr-xr-x 67 alex alex 548864 19. Oct 22:47 wiki-2023-10-19/
drwxr-xr-x 67 alex alex 552960 30. Oct 13:18 wiki-2023-10-30/

It's not great, but it works.
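
To see how much disk space the snapshots actually share, `du` is handy: within a single invocation it counts every hard-linked file only once, so the snapshot directories only add what changed between runs:

du -shc wiki wiki-2023-*    # -s summarize, -h human-readable, -c grand total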

For a file that hasn't changed, you'll see that `ls -l` reports four links to it (the number in the second column). This is correct: one hard link from each of the four directories. The data exists only once on the disk.

> ls -l w*/2000-03-10_Elendor.md 
-rw-r--r-- 4 alex alex 16313 28. Apr 2017  wiki/2000-03-10_Elendor.md
-rw-r--r-- 4 alex alex 16313 28. Apr 2017  wiki-2023-10-17/2000-03-10_Elendor.md
-rw-r--r-- 4 alex alex 16313 28. Apr 2017  wiki-2023-10-19/2000-03-10_Elendor.md
-rw-r--r-- 4 alex alex 16313 28. Apr 2017  wiki-2023-10-30/2000-03-10_Elendor.md
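
To see that these four names really point at the same data, GNU `stat` can print the inode number directly; all four should report the same one:

stat -c '%i %h %n' w*/2000-03-10_Elendor.md    # inode, link count, name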

For a file that changes all the time, `ls -l` shows just one link per file, meaning that these files are all distinct. Even a layperson can confirm this: the file sizes and last modified timestamps differ.

> ls -l w*/changes.md
-rw-r--r-- 1 alex alex 1997 17. Oct 16:28 wiki-2023-10-17/changes.md
-rw-r--r-- 1 alex alex 2123 19. Oct 22:36 wiki-2023-10-19/changes.md
-rw-r--r-- 1 alex alex 5847 30. Oct 13:18 wiki-2023-10-30/changes.md
-rw-r--r-- 1 alex alex 4255 18. Nov 23:54 wiki/changes.md

#Administration #Backup

Limitations

As @edavies@functional.cafe and @Sandra@idiomdrottning.org have pointed out, this only works if the editing you do always recreates the files you're editing, because then the files get a new inode even if their names stay the same. Examples where this is a problem: database files, which are updated in place, and editors that let you decide how backups are created.

Emacs, for example, has the option `backup-by-copying`, which is nil by default. When you save a file, the original is renamed to the backup and a new file is written, so the saved file gets a new inode. If you set the option, however, the backup is copied first, and then the original is changed in place, like a database file, without getting a new inode. Now you've changed the file in all the "cheap backup" directories, since the hard links keep pointing to the same inode, and that inode now holds the new content.
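
Here's a quick way to see the difference in a scratch directory, with made-up file names:

echo one > file.txt
ln file.txt snapshot.txt    # a hard link, like the cheap backup makes
echo two > file.txt         # in-place update: the inode stays the same
cat snapshot.txt            # prints "two" -- the snapshot changed, too!
echo three > new.txt
mv new.txt file.txt         # a rename gives file.txt a new inode
cat snapshot.txt            # still prints "two" -- this snapshot is safe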

Possibly a better solution would be one that makes a real copy for the first backup, with future backups linking to the most recent backup (and never to the originals):

#!/usr/bin/bash
d=$(basename "$(pwd)")
t=$(date --iso-8601)
# find the most recent existing snapshot, e.g. ../wiki-2023-10-30
p=$(find .. -maxdepth 1 -type d -name "$d-[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]" -prune | sort | tail -n 1)
if [ -z "$p" ]; then
  echo "Creating a snapshot of $d in ../$d-$t"
  rsync --archive . "../$d-$t/"
else
  echo "Creating a snapshot of $d in ../$d-$t with links into $p"
  rsync --link-dest "$p" --archive . "../$d-$t/"
fi

Note that now, if you only ever need a single backup, you might just as well make a copy using your favourite file manager. Not so cheap any more, eh? 🤨

Or use rsync-time-backup, a shell script.

Or rsnapshot, a Perl script.

Another solution

@edavies@functional.cafe offered the following script for their use case: backing up `/home`, `/etc` and `/usr/local` to a 1 TB hard drive next to the computer:

#!/bin/bash

set -e

STAMP=$(date "+%Y-%m-%dT%H:%M:%S")
DEST=/run/media/ed/hitachi
MACHINE=$(uname -n)
TARGET=$DEST/$MACHINE

# first run: create an empty initial snapshot to link against;
# the symlink target is relative to $TARGET, where the link lives
[ -d $TARGET ] || (
    mkdir $TARGET
    mkdir $TARGET/0
    ln -s 0 $TARGET/current
)

[ -L $TARGET/current ] || (
    echo "Target directory $TARGET/current doesn't exist or isn't a symlink"
    false
)

for DIR in home etc usr/local
do
    mkdir -p $TARGET/incomplete-$STAMP/$DIR
    rsync -axv --link-dest=$TARGET/current/$DIR/ /$DIR/ $TARGET/incomplete-$STAMP/$DIR/
done

# only rename the snapshot once it is complete, then repoint "current"
mv $TARGET/incomplete-$STAMP $TARGET/back-$STAMP
rm -f $TARGET/current
ln -s back-$STAMP $TARGET/current
ls -l $TARGET
df -h $DEST
umount $DEST
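
A nice property of this layout is that restoring needs no special tooling: the newest snapshot is always reachable through the `current` symlink, so you can copy files straight back (the file name below is made up):

cp -a "/run/media/ed/hitachi/$(uname -n)/current/home/ed/notes.txt" ~/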