💾 Archived View for gemini.ctrl-c.club › ~michal_atlas › posts › diagnostics › 01.gmi captured on 2024-03-21 at 15:33:25. Gemini links have been rewritten to link to archived content


LVM, RAID1 and Guix chroot

Quick glossary of terms:

LVM - Logical Volume Management

Linux's system to manage pools of disks and build logical partitions on top of them.

PV - physical volume

any block device managed by LVM, be it an entire disk or just a partition

VG - volume group

a group of PVs that operates transparently as one huge container for data, agnostic to which PV the data actually ends up on

LV - logical volume

LVM's equivalent of a partition; it gets created on a VG and can subsequently be formatted and mounted like any other partition

RAID0

Connects two or more storage devices so that they appear as one long block of storage. Usually the data is striped (interleaved) across the devices to spread out operations.

RAID1

Connects two or more storage devices and puts the data on every single one of them. This means that if you put 2 disks in a RAID1, either of the two can die and the other should still contain the entirety of the data.
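
To make the LVM terms above concrete, here is a minimal sketch of how a PV, a VG and an LV chain together. The device names /dev/sdX and /dev/sdY, the VG name demo and the LV name data are placeholders chosen for illustration, not anything from the setup described below. The script only prints the commands; emptying the run variable would execute them as root.

```shell
#!/bin/sh
# Dry run: commands are printed, not executed. Set run= (empty) to run them as root.
run=echo

# Placeholder names (assumptions for illustration only).
disk_a=/dev/sdX
disk_b=/dev/sdY
vg=demo
lv=data

lvm_chain() {
    $run pvcreate "$disk_a" "$disk_b"        # mark both devices as PVs
    $run vgcreate "$vg" "$disk_a" "$disk_b"  # pool the PVs into one VG
    $run lvcreate -L 100G -n "$lv" "$vg"     # carve an LV out of the VG
    $run mkfs.ext4 "/dev/$vg/$lv"            # an LV takes a filesystem like a partition does
    $run mount "/dev/$vg/$lv" /mnt           # and it mounts like one
}

lvm_chain
```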

The story

This is a recollection of today's attempts to solve the write delays on my main home mount.

For context, my main machine has a small SSD holding the EFI partition, swap and the system, with the system on its own LVM VG (spool).

Then there are two 2 TiB hard drives with their own VG (rpool).

It used to be one extremely slow NAS drive, but I swapped that out, and with lovely LVM it was as easy as:

Just like that all the structure, partitions and everything stayed intact, and started a new life on one of the new disks.
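
A disk swap of this kind is commonly done with pvmove, which migrates the data to another PV while the LVs stay usable. The following is a sketch of that general approach, not necessarily the commands used here; /dev/sdX (the outgoing disk) and /dev/sdY (the new one) are placeholder device names. It prints the commands instead of running them.

```shell
#!/bin/sh
# Dry run: commands are printed, not executed. Set run= (empty) to run them as root.
run=echo

vg=rpool
old_pv=/dev/sdX   # placeholder: the disk being retired
new_pv=/dev/sdY   # placeholder: the replacement disk

swap_pv() {
    $run pvcreate "$new_pv"           # prepare the new disk as a PV
    $run vgextend "$vg" "$new_pv"     # add it to the existing VG
    $run pvmove "$old_pv" "$new_pv"   # migrate all extents off the old disk
    $run vgreduce "$vg" "$old_pv"     # drop the now-empty PV from the VG
    $run pvremove "$old_pv"           # wipe the LVM label from the old disk
}

swap_pv
```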

The name rpool is taken from what we called our ZFS pools at school, which I then tried at home back when I had Nix, and now it's an LVM VG. Oh, how names stick. And spool is just "SSD rpool".

rpool has a partition for images, games and the like. I wanted a second one to hold the more important stuff, so I set that one up to be mirrored across the two disks. LVM calls this mirroring and it should behave like a RAID1, though LVM also has a RAID1 LV type, so I'm not yet sure what that's all about. In theory, if one disk fails, the data survives.

And with LVM it's as easy as:

lvcreate -m 1 -L 500G rpool
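
On the mirror versus RAID1 question: as far as I know, recent LVM releases resolve a bare -m 1 to the raid1 segment type by default (governed by mirror_segtype_default in lvm.conf), but that is worth verifying on your own version. A sketch of how one might check what an LV actually is, and how to request a type explicitly; the LV name important is hypothetical, and the script only prints the commands.

```shell
#!/bin/sh
# Dry run: commands are printed, not executed. Set run= (empty) to run them as root.
run=echo

vg=rpool
lv=important     # hypothetical LV name
segtype=raid1    # or "mirror" for the older dm-mirror based type

mirror_type() {
    # Show the segment type and backing devices of every LV in the VG,
    # including the hidden sub-LVs that make up a mirror.
    $run lvs -a -o lv_name,segtype,devices "$vg"

    # Ask for the segment type explicitly instead of relying on the default.
    $run lvcreate --type "$segtype" -m 1 -L 500G -n "$lv" "$vg"
}

mirror_type
```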

What I didn't realize was how much this tanks performance, specifically writes. Sometimes you can feel the fetches, much like when I accidentally installed a NAS disk: you list a directory, hear the drive spin up in real time, and only after a few moments does the listing print to the screen. Similarly, saving a file could take a few seconds. What I found out, though, is that many programs are absolutely not built for this. Firefox, Dovecot and Emacs were the big culprits, crashing basically daily whenever I needed to save something.

I knew that, for consistency, both writes need to go through before the block gets released. According to some online sources this may end up halving write performance, which would make sense if and only if the writes weren't parallelised at all, but I digress.
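
The arithmetic behind that remark, as a toy model with made-up numbers: assume each disk needs 10 ms to complete the same write. If the mirror writes are serialised the latencies add up, while if they are issued in parallel you only wait for the slower disk.

```shell
#!/bin/sh
# Hypothetical figures: each disk takes 10 ms for the same write.
t_disk_a=10
t_disk_b=10

# Serialised mirror: the second write starts only after the first finishes.
t_serial=$((t_disk_a + t_disk_b))

# Parallel mirror: both writes are in flight at once, so wait for the slower one.
if [ "$t_disk_a" -gt "$t_disk_b" ]; then
    t_parallel=$t_disk_a
else
    t_parallel=$t_disk_b
fi

echo "single disk:     ${t_disk_a} ms"
echo "serial mirror:   ${t_serial} ms"
echo "parallel mirror: ${t_parallel} ms"
```

Under these assumptions only the serialised case doubles the write time, which is the "halved performance" figure; the parallel case costs no more than a single disk.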

Nonetheless, I wanted to try adding a writecache on the SSD in front of that mirrored volume. Which... would require an LV in the same VG to sit on the SSD. I already had spool on there, which I had kept separate because back then I was afraid of accidentally placing the system on the HDD, but now I figured there's no harm in joining them.

vgmerge rpool spool

And just like that the root joined the others in rpool.
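
With the SSD in the same VG, attaching a writecache would look roughly like the sketch below, following the pattern from lvmcache(7). The SSD PV /dev/sdZ, the cache LV name fast, its 20G size and the main LV name important are all assumptions for illustration. Depending on the LVM version, the LVs may have to be deactivated before the conversion. The script only prints the commands.

```shell
#!/bin/sh
# Dry run: commands are printed, not executed. Set run= (empty) to run them as root.
run=echo

vg=rpool
main_lv=important   # hypothetical: the slow, mirrored LV
cache_lv=fast       # hypothetical: the LV that will hold the cache
ssd_pv=/dev/sdZ     # placeholder: the PV that lives on the SSD

attach_writecache() {
    # Create the cache LV, restricted to the SSD by naming its PV.
    $run lvcreate -n "$cache_lv" -L 20G "$vg" "$ssd_pv"

    # Put the cache LV in front of the main LV as a writecache.
    $run lvconvert --type writecache --cachevol "$cache_lv" "$vg/$main_lv"

    # To undo later: lvconvert --splitcache "$vg/$main_lv"
}

attach_writecache
```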

Getting Guix to accept the change

This shouldn't have been much of a problem. I mean, I just reconfigured and then went to make the change, but sadly it wasn't quite that simple: the system still looked for the old UUID. Not to worry, I can just chroot inside and reinitialize. Running `guix-daemon` inside the chroot with build sandboxing disabled (`--disable-chroot`) allows a `guix system reconfigure` as normal.

There are a few things to look out for, however. You have to `--rbind` /sys, /dev and /proc from the booted system, and then use `linux64 chroot` instead of plain `chroot` (at least on Fedora), otherwise `grub-install` will try to install for i686 instead of x86_64 and fail.
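
The chroot steps above can be sketched as follows. The root LV path /dev/rpool/root, the EFI partition /dev/sdW1, the mount point /mnt and the config path /etc/config.scm are all assumptions for illustration, and the profile path is the one Guix System normally uses, so adjust to taste. The outer part only prints the commands; the part inside the chroot is given as comments because it has to be typed in the new shell.

```shell
#!/bin/sh
# Dry run: commands are printed, not executed. Set run= (empty) to run them as root.
run=echo

target=/mnt                # assumed mount point for the Guix root
root_dev=/dev/rpool/root   # hypothetical path of the root LV
efi_dev=/dev/sdW1          # placeholder: the EFI system partition

enter_guix_chroot() {
    $run mount "$root_dev" "$target"
    for fs in proc sys dev; do
        # --rbind also carries over submounts such as /dev/pts and efivars
        $run mount --rbind "/$fs" "$target/$fs"
    done
    $run mount "$efi_dev" "$target/boot/efi"   # only needed for an EFI bootloader
    # linux64 instead of plain chroot, per the grub-install caveat above
    $run linux64 chroot "$target" /bin/sh
}

enter_guix_chroot

# Then, inside the chroot shell:
#   . /var/guix/profiles/system/profile/etc/profile
#   guix-daemon --build-users-group=guixbuild --disable-chroot &
#   guix system reconfigure /etc/config.scm
```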

So I did that, and now GRUB got me all the way to shepherd, but shepherd didn't want to mount. It stated something along the lines of "device mapper: failed to run raid array -EINVAL"; I dunno why that was.

I had 'dm-raid' in the initramfs and everything; besides, it was literally the same setup I had before, just with an additional LV.

My guess is that the activation of rpool happened too early, and some module wasn't loaded yet, since disabling autoactivation of the other LVs on startup allowed me to boot.
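
One way to disable autoactivation per LV, assuming a reasonably recent LVM (the --setautoactivation option is not present in older releases), is sketched below. The LV name games is hypothetical, and the script only prints the commands.

```shell
#!/bin/sh
# Dry run: commands are printed, not executed. Set run= (empty) to run them as root.
run=echo

vg=rpool
lv=games   # hypothetical: an LV that is not needed for booting

defer_activation() {
    # Stop this LV from being activated automatically at boot.
    $run lvchange --setautoactivation n "$vg/$lv"

    # Later, once the system is up, activate it by hand.
    $run lvchange -ay "$vg/$lv"
}

defer_activation
```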

In the end I just removed the mirroring, since that solved both my current problem and the original problem I was trying to solve. I have other backup schemes, and besides, one of my professors used to say that the reliability of RAID in small setups is overrated: if you buy a few of the exact same disk from the exact same brand, there is quite a probability that they'll also die close to each other. This is compounded by the fact that after one dies, even if the other still has some life left in it, it'll get hit extra hard by someone racing to pull all the data off.
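
For completeness, dropping a mirror like the one mentioned above can be done in place by converting the LV down to zero extra copies. A sketch, with the LV name important being hypothetical; it only prints the commands.

```shell
#!/bin/sh
# Dry run: commands are printed, not executed. Set run= (empty) to run them as root.
run=echo

vg=rpool
lv=important   # hypothetical name of the mirrored LV

drop_mirror() {
    # -m 0 means "zero additional copies", leaving a plain linear LV.
    $run lvconvert -m 0 "$vg/$lv"

    # Confirm the resulting segment type and which disk the data sits on.
    $run lvs -o lv_name,segtype,devices "$vg"
}

drop_mirror
```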