💾 Archived View for thorjhanson.com › blog › 20210203-i-deleted-usr.gmi captured on 2021-12-04 at 18:04:22. Gemini links have been rewritten to link to archived content
-=-=-=-=-=-=-
Friday night, I accidentally deleted the `/usr` directory on my Arch Linux server. Here's how I recovered the system.
Friday night, I was ssh'd into my NAS cleaning out unused files in my home folder. At some point I had extracted something containing a `usr` directory. It was nearly midnight, and without stopping to think, I ran a very bad command:
rm -rf /usr
I stared for five or ten seconds while the command ran before I suddenly realized: I had not deleted the directory `/home/thor/usr`, but rather the top-level directory `/usr`. I hit Ctrl+C as fast as I could, but at this point there was little left.
On most Linux distributions, `/usr` holds most of the executables for the system. In the case of Arch Linux, the `bin`, `lib`, `lib64`, and `sbin` directories are all symlinked (aliased) to locations under `/usr` as well. This means that almost all commands, programs, and libraries are installed under `/usr`. The only commands I could run were the Bash built-ins - which notably don't even include `ls`.
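As an illustration (not something run during the incident), a directory can still be inspected when only built-ins are left. This is a minimal sketch: `list_dir` is a hypothetical helper name, and it relies only on globbing, `[`, and `printf`, all of which Bash handles internally without touching `/usr`.

```shell
# List the entries of a directory using only shell built-ins (no ls needed).
list_dir() {
    for entry in "$1"/*; do
        # An empty directory leaves the pattern unexpanded; skip that case.
        [ -e "$entry" ] || continue
        # Strip the leading directory part and print just the entry name.
        printf '%s\n' "${entry##*/}"
    done
}
```

For a quick look, `echo /etc/*` does much the same job in a single line.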
I run ZFS on my NAS, but the OS directories are stored on a regular ext4 drive. This means there were no snapshots to roll back to. I knew I would have to boot from a flash drive to try repairing the system, so I powered off my NAS and began combing the internet for suggestions. On an Arch Linux forum, I found a snippet of wisdom:
1. mount your partitions
2. pacman -r /mnt -S base
3. reboot into multi-user target on your system
4. pacman -Qk to find remaining broken packages
5. pacman -Syu <results from 4>
While I wouldn't end up exactly following these steps, this gave me a great starting-off point to do my repair. In some ways, it could be said that `/usr` is the most critical directory in a Linux installation, since it holds the executables - but at the same time, it is the most replaceable, since there are no configuration files or pieces of personal data stored there. Hence, if I could reinstall all the packages, the system would theoretically be completely back to normal.
My actual recovery process took several (failed) attempts - I'll describe the steps that actually worked, ordered and arranged as if I got them right the first time (for easier reading).
After booting from an external flash drive, I mounted my OS disk under `/mnt`, then mounted my EFI partition under `/mnt/boot`. Reading pacman's documentation revealed that the `-r` option can be used to specify a "new root" from which pacman should operate. In this case, with my real OS root mounted under `/mnt`, passing `-r /mnt` to pacman would tell it to operate on my real OS rather than the booted flash drive.
Before I could start reinstalling any packages, I would need the `/usr` directory to exist and have an intact structure - I had no idea which subdirectories had already been destroyed. This was easy enough to do by making a new temp directory `/mnt/root2`, then using `pacstrap -i /mnt/root2 base`. This would populate `/mnt/root2` with a fresh base installation of Arch. Once completed, I replaced my OS's `/usr` (`/mnt/usr`) with `/mnt/root2/usr`. This, I hoped, would allow pacman to successfully reinstall packages.
Ideally I would want to reinstall the same versions of all the packages that were previously installed on the system - I figured this would minimize the chance of conflicting libraries. I knew most or all of my system's packages would be stored in the cache, `/mnt/var/cache/pacman/pkg`. The cache has several versions of most packages in it, but I wanted to get a list of just the most recent of each one. After several minutes of examining, trial, and error, I came up with the following:
ls /mnt/var/cache/pacman/pkg/ | sed 's/-/ /' | sort -r | rev | uniq -f 1 | rev | sed 's/ /-/' | sort
Translated roughly to "list all the packages, replace the first dash in each name with a space, sort them by reverse-alphabetical order, flip each name to be backwards (package 3.1 -> 1.3 egakcap), keep only one occurrence of each name but skip field one, reverse the names back, put the dashes back, sort alphabetically". Technically, this command is flawed because it replaces the first dash in the name, but it should replace the last dash. In my case, I lucked out and this didn't cause any problems, but I should fix it if I ever use the command again.
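A possible fix for the dash problem, sketched here rather than taken from the original recovery: Arch package files are named `name-pkgver-pkgrel-arch.pkg.tar.*`, and only the name part may contain dashes, so the name can be recovered by stripping the last three dash-separated fields. The helper name `latest_packages` is hypothetical. The sketch assumes GNU `sed`, `sort`, and `awk`, and it uses `sort -V` as an approximation of pacman's own version comparison, which also avoids the plain alphabetical sort ranking 3.9 above 3.10.

```shell
# Print the newest cached file for each package name.
# Reads file names on stdin, one per line.
latest_packages() {
    # Drop detached signature files, then emit "name version filename".
    # Lines that do not look like package files are discarded by sed -n.
    grep -v '\.sig$' |
        sed -nE 's/^(.*)-([^-]+-[^-]+)-([^-]+)$/\1 \2 &/p' |
        # Group by name, newest version first within each name.
        LC_ALL=C sort -t ' ' -k1,1 -k2,2Vr |
        # Keep only the first (newest) file seen for each name.
        awk '!seen[$1]++ { print $3 }'
}

# Intended use (not run here):
# PACKAGES="$(ls /mnt/var/cache/pacman/pkg/ | latest_packages)"
```

Because the name is split off from the right, packages such as `python-pip` stay separate from `python` instead of being collapsed into one entry.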
Once I was satisfied with the results, I saved the list to a variable:
PACKAGES="$(ls /mnt/var/cache/pacman/pkg/ | sed 's/-/ /' | sort -r | rev | uniq -f 1 | rev | sed 's/ /-/' | sort)"
Then I prepended the full path to each name in the list:
for i in $PACKAGES ; do PACKAGES2="/mnt/var/cache/pacman/pkg/$i $PACKAGES2"; done
At this point, the variable `$PACKAGES2` held a list of the full paths to each package I wanted installed on the system:
pacman -r /mnt -U $PACKAGES2
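The prepending step could also be written as a small filter, which keeps the list in its original order (the loop above builds it in reverse). This is only a sketch; `prefix_paths` is a hypothetical helper name, and it assumes file names without spaces or newlines, which holds for pacman's cache.

```shell
# Prefix every name read from stdin with a directory, one path per line.
prefix_paths() {
    dir="${1%/}"    # drop a trailing slash, if any
    while IFS= read -r name; do
        # Skip blank lines so no bare directory path is emitted.
        if [ -n "$name" ]; then
            printf '%s/%s\n' "$dir" "$name"
        fi
    done
}

# Intended use (not run here):
# PACKAGES2="$(printf '%s\n' $PACKAGES | prefix_paths /mnt/var/cache/pacman/pkg)"
# pacman -r /mnt -U $PACKAGES2
```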
I also had to use the `--assume-installed` option for one particular Python package that pacman tried (and failed) to download. I figured that whatever the package was, it probably wasn't critical for the system to operate, and this option let me tell pacman to skip installing it. Additionally, I had to use the `--overwrite` option to tell pacman to overwrite certain files that already existed (because I had prepopulated the `/usr` directory with a base installation). It took several tries before I had a full list of all the files pacman needed permission to overwrite.
Once I successfully reinstalled all the packages in the cache, I decided I would try to `chroot` into the system so I could fully repair the installation from the inside (I couldn't do this before because there would have been no executables for me to run from inside the chroot).
arch-chroot /mnt bash
Inside the chroot, I was able to use pacman to get a list of which files the system had concerns about:
pacman -Qkk
There was quite a lot wrong with the system. I decided I would first try running a full update with `pacman -Syu`, then try reinstalling all packages on the system. Both of these likely required use of the --overwrite flag again (although I admittedly cannot remember).
Running `pacman -Qqe > /packagelist.txt` gave me a file with all the explicitly installed packages listed (`-q` prints names only, without versions). Trying to install using `pacman -S $(cat /packagelist.txt)` told me which couldn't be found in the repository (due to being installed from other sources, such as the AUR). At that point I just had to go through `/packagelist.txt` and delete the 4 or 5 offending packages. Running `pacman -S $(cat /packagelist.txt)` again successfully reinstalled all the packages, and `pacman -Qkk` confirmed that there were no malignant files to be found!
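The hand-editing step could probably be automated. The sketch below is an assumption-laden alternative, not what was done during the recovery: `drop_packages` is a hypothetical helper and `/foreign.txt` a hypothetical file, filled from `pacman -Qqm`, which lists installed packages that are not found in the sync databases (AUR packages, for example).

```shell
# Print the names in LIST ($1) that do not appear in EXCLUDE ($2).
# Both files hold one package name per line.
drop_packages() {
    # -x matches whole lines only, so excluding "vi" would not drop "vim";
    # -F treats each name as a fixed string rather than a regex.
    grep -vxF -f "$2" "$1"
}

# Intended use inside the chroot (not run here):
# pacman -Qqe > /packagelist.txt    # explicitly installed packages, names only
# pacman -Qqm > /foreign.txt        # packages missing from the sync databases
# pacman -S $(drop_packages /packagelist.txt /foreign.txt)
```

The foreign packages left out this way would still need to be rebuilt or reinstalled from their original source afterwards.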
At this point, I also reinstalled grub and rebuilt the initramfs. These steps were probably not needed, but I didn't want to reboot into disappointment again, and I wasn't sure if all my "repair" work may have corrupted libraries needed by the ZFS module.
Rebooting was a success!
A lot of initial wisdom I found on the internet said recovering from a deleted `/usr` was not worth the time, and it would probably not result in a stable system even if successful. I found the optimism and DIY attitude of Arch Linux resources really shined in helping me get this problem repaired.
Now, I'm backing up my OS installation nightly. In the future, this would mean recovering would be as easy as booting into an external flash drive and re-copying whatever directories I needed from the backup into place (or even the entire OS).