💾 Archived View for tobykurien.com › articles › 2022-05-17-smol-data-centre.gmi captured on 2023-03-20 at 17:34:16. Gemini links have been rewritten to link to archived content
I've been listening to several podcasts and watching some YouTube videos by tech hobbyists, and many of them talk about setting up a data centre at home for running various things (like a NAS, Home Assistant, etc.). Almost all of them seem to want to go as over the top as possible: having terabytes of NVMe storage with ZFS or RAID or some other complicated setup, running Kubernetes clusters on their Pis, complaining that they aren't getting their full 10 gigabits of throughput from their NAS even though they re-wired the house with Cat6a, that their Starlink isn't roaming so they have to suffer the indignity of using 5G, and so on.
Meanwhile, back at my house, I'm very happily routing my (capped) LTE internet connection through a Pi 2, connected to a 2.4 GHz WiFi AP at 100 Mbit/s, with total shared storage of about 120 GB (of which more than 50% is unused). The Pi 2 is the bottleneck, and replacing it with my Pi 4 would increase the throughput of my internet connection, but I didn't want to waste the Pi 4 on that job because the connection is fast enough as it is. That is to say, my setup is the polar opposite of what seems to be the norm these days among tech enthusiasts, even though I consider myself one.
I get that many of these enthusiasts want to use the hobby as an opportunity to learn some enterprise-grade technology. However, I still clearly remember the days before technology got "good enough" (by my standards anyway), and so I started thinking deeply about what I would consider a minimum viable home data centre, or as I'm calling it here, a Smol Data Centre (inspired by the smolnet). The intention is to host smolnet stuff like Gemini content, but also maybe run my mail server, calendar/contact DAV server, and other small services at home.
My over-arching guiding philosophy in this endeavour is to make it as cheap, simple, and efficient as possible. Unfortunately "simple" can mean different things to different people, so I'd like to define it a bit more clearly. For me, simplicity means:
This basically translates to: I would prefer to use simple UNIX command line tools that have been around forever, rather than YAML incantations to big shiny new monoliths. Also, for efficiency, I would like to have as few "moving parts" as possible.
My current physical implementation of my Smol Data Centre is:
I would have liked to run OpenBSD on all the Pis, but alas it only runs on a Pi 3 or Pi 4, so I have it running on the Pi 3. One thing I really like about OpenBSD is that it's easy to get my head around and to administer. As an example, running `mount`, `set`, `export`, or even `ps ax` returns a sane screenful or less of understandable output, unlike on most modern Linux distributions, where you have to `grep` the result to find the needle in the messy haystack.
For the older Pis, I chose to run Alpine Linux, which is also small, stable, and easy to understand.
Where possible, I make the OS microSD card read-only (for reliability), and mount `tmpfs` for things like `/var`. I have to temporarily make the filesystem read-write to do OS updates.
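A minimal sketch of the "temporarily read-write" step on a Linux host such as Alpine, assuming the root filesystem is normally mounted read-only and the script runs as root. The helper names `run` and `update_with_rw_root` are made up for illustration, and `DRY_RUN` defaults to printing the commands rather than executing them. On OpenBSD the remount syntax differs, so this applies to the Linux machines only.

```sh
#!/bin/sh
# Sketch: remount a read-only root read-write for an update, then put it back.
# DRY_RUN=1 (the default) only prints the commands; use DRY_RUN=0 as root to apply.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Usage: update_with_rw_root COMMAND [ARGS...]
update_with_rw_root() {
    # Give up early if the root filesystem cannot be made writable.
    run mount -o remount,rw / || return 1
    run "$@"
    status=$?
    # Flush pending writes, then return to read-only even if the update failed.
    run sync
    run mount -o remount,ro /
    return "$status"
}
```

On Alpine this could be called as `update_with_rw_root apk upgrade`. The return to read-only happens regardless of whether the update command succeeded, so a failed update does not leave the card writable.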
I'm not using RAID or ZFS, but plain old ext4 on a USB thumbdrive. Sounds crazy, but I've got thumbdrives from over 15 years ago that are still working reliably! As long as it's backed up, a failure shouldn't be a problem. Here are my thoughts on backups:
To simplify connecting my servers to the Internet, I decided to use a tiny VPS at a hosting provider. I am currently just using `autossh` to maintain a constant SSH port forward from each server to the VPS. I would like to learn and use WireGuard, but that adds a lot of complexity and for now the port forward works fine. For web-based services, I have `nginx` on the VPS, which manages the Let's Encrypt TLS certificate, reverse-proxies to my various servers, and adds rate limiting and other protections. This greatly simplifies the admin required on each of my servers, as it's all centrally managed.
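As a rough sketch, a persistent forward of this kind could look like the following. It assumes a remote forward (`-R`), so that a port on the VPS leads back to a service on the home server. The port numbers, the `tunnel@vps.example.org` account, and the helper names `run` and `open_tunnel` are all placeholders. `DRY_RUN` defaults to printing the command instead of running it.

```sh
#!/bin/sh
# Sketch: keep a reverse SSH tunnel from a home server to the VPS alive.
# DRY_RUN=1 (the default) only prints the command; DRY_RUN=0 runs it.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Usage: open_tunnel VPS_PORT LOCAL_PORT USER@VPS_HOST
open_tunnel() {
    vps_port="$1"
    local_port="$2"
    vps="$3"
    # -M 0 turns off autossh's own monitor port and relies on SSH keepalives.
    # -N opens the forward without running a remote command.
    # Binding to 127.0.0.1 keeps the port reachable only from the VPS itself
    # (for example, by a reverse proxy running there).
    run autossh -M 0 -N \
        -o ServerAliveInterval=30 -o ServerAliveCountMax=3 \
        -o ExitOnForwardFailure=yes \
        -R "127.0.0.1:${vps_port}:localhost:${local_port}" "$vps"
}
```

For example, `open_tunnel 8081 80 tunnel@vps.example.org` would make the home server's port 80 available as `127.0.0.1:8081` on the VPS. `ExitOnForwardFailure=yes` makes `ssh` exit when the forward cannot be set up, which gives `autossh` the chance to retry.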
Other layers of security can be added to each server. For example, instead of running services inside Docker containers, I'm running them in sandboxes created using Bubblewrap. I think `bwrap` is a very under-appreciated tool and I've been using it to great effect even for my dev workflow, but that's a topic for another post.
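To give a flavour of what a `bwrap` sandbox for a service can look like, here is a hedged sketch rather than the actual configuration used here. It assumes a Linux host with the usual `/usr`, `/bin`, `/lib`, and `/etc` directories, and a single writable data directory. The helper names `run` and `sandboxed` and the paths in the usage note are hypothetical. `DRY_RUN` defaults to printing the command.

```sh
#!/bin/sh
# Sketch: run a service inside a Bubblewrap sandbox with one writable directory.
# DRY_RUN=1 (the default) only prints the command; DRY_RUN=0 runs it.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Usage: sandboxed DATA_DIR COMMAND [ARGS...]
sandboxed() {
    data_dir="$1"
    shift
    # --unshare-all creates new namespaces; --share-net keeps network access.
    # System directories are mounted read-only; only DATA_DIR (seen as /data
    # inside the sandbox) is writable, and /tmp is a private tmpfs.
    run bwrap \
        --unshare-all --share-net --die-with-parent \
        --ro-bind /usr /usr --ro-bind /bin /bin \
        --ro-bind /lib /lib --ro-bind /etc /etc \
        --proc /proc --dev /dev --tmpfs /tmp \
        --bind "$data_dir" /data --chdir /data \
        "$@"
}
```

A call such as `sandboxed /srv/gemini /usr/bin/some-server` would start the (hypothetical) server with `/srv/gemini` as its only writable location. Exposing all of `/etc` read-only is a convenience in this sketch; a stricter setup would bind only the files the service needs.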
This is an ongoing exploration for me, and I'm struggling to find online discussions around the topic of minimal self-hosting at home. If you have ideas or want to have a discussion, do reply to this post on Antenna or get in touch. I'd love to hear from you!