Server Migration: Reflection

1/13/2021

Almost two weeks ago, my friends and I decided to migrate our VPS from DigitalOcean (DO) to Vultr. Here I discuss the motivation and reflect on the process.

Background

My friend Andrew and I had been renting a VPS from DO since late 2016 or early 2017 - the specifics being lost to time. It had begun as an Ubuntu 16.04 server running Apache and our websites, at the time accessible at "www.thorandrew.com/thor" and "www.thorandrew.com/andrew". It had a single core, 500 MB of RAM, and cost a very affordable 5 USD/mo.

Andrew at ScientificPineapple.com

In the years following, the server grew in purpose and in specs. We installed a number of pieces of open-source software, some of which we used daily, much of it providing valuable alternatives to services offered by megacorps like Google and Microsoft. Nextcloud, Bitwarden, GitLab (replacing Gogs), Mumble, and Wireguard are some of the more prominent ones.

As we added software, we also increased our resource allotment, ending up with a 2-core machine with 4 GB of RAM and an 80 GB SSD. By making use of the Wireguard VPN, we were able to offload the RAM-heavy GitLab software to my NAS. The server also went through an OS upgrade from Ubuntu 16.04 to 18.04, which was a smooth process.

While the DO server served our needs well, we found ourselves increasingly frustrated by the long ping times, particularly since we were running our personal servers through it via Wireguard. The closest datacenter DO offered was in New York. Our research into alternatives kept coming back to Vultr, who offered the same specifications at the same price points as DO, but with a datacenter in Chicago. This would cut the ping times from our residences in North Dakota approximately in half, at no additional cost per month.

A Plan Forms

In spring of 2020, Andrew and I began seriously discussing a migration to Vultr. We were still fuzzy on the details of our desired setup - the OS in particular - but we determined that an orderly migration would require us to first have an orderly server. We decided that Docker containers (particularly making use of Docker Compose) would be the sanest route forward.

We decided we would make a top-level directory `/persistent`, inside which we would store any data that was meant to persist beyond the life of an individual container. We also decided we needed a tool to manage the deployment and updates of our server in an organized fashion. We opted to write a small tool in Bash to manage this for us, and we un-creatively named it "provisioning-tool".

Provisioning tool repository
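
To give a feel for the approach (this is an illustrative sketch rather than the actual provisioning-tool - the directory layout and names here are made up), the core idea amounts to looping over per-service Docker Compose definitions while keeping each service's data under `/persistent`:

#!/usr/bin/env bash
# Illustrative sketch only: deploy every service defined under a services
# directory, keeping its data bind-mounted from /persistent/<service>.
# SERVICES_DIR and the layout are hypothetical, not the real tool's.
set -euo pipefail

SERVICES_DIR=/opt/services   # one subdirectory per service, each with a docker-compose.yml

for service in "$SERVICES_DIR"/*/; do
  name=$(basename "$service")
  echo "Deploying $name..."
  mkdir -p "/persistent/$name"       # persistent data lives outside the container
  (cd "$service" && docker-compose pull && docker-compose up -d)
done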

With this, we had a plan forward: we would, piece by piece, migrate segments of our server to use Docker containers and our provisioning tool. We decided that once everything was all running well with Docker and the provisioning tool, we would migrate to Vultr.

A Plan Becomes A Goal

Over the following few months, we converted some of the lower-hanging fruit to use our tool and Docker. Most of the web-based tools on our server were accessed through our Apache reverse proxy, which made it easy to swap the bare-metal installs for Docker containers. Some of the easier ones:

With the first few items going smoothly, we decided to set a (somewhat ambitious) goal of having the migration done by the first of the year. At that point, we would bring our friend Jake on board, allowing us to split costs three ways and maximize our bang-per-buck (while keeping maintenance lower than if we were each running our own stack).

Challenges

Friendica

One of the pieces of software I was running was a Fediverse server called Friendica. They had an official Docker image, but it wasn't very mature, and there was no documentation on moving an existing bare-metal installation to a container.

Friendica Website

The application is PHP with MariaDB as the database, so after some research I decided I would try dumping the database with `mariadb-dump`, firing up the database container, importing the data, then firing up the official Friendica container. Despite multiple attempts, I could not get Friendica to work from inside the Docker container. The closest I got had Friendica up and running and accessible, with all my data and with email notifications working - but for some reason, even after setting up the cron job correctly, new posts would not show up on the timeline. It was as if Friendica remained frozen at the point where I had dumped the database.
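
For the curious, the general shape of what I attempted looks something like the following (image tags, names, and credentials are placeholders, not my actual setup):

#!/usr/bin/env bash
# Sketch of the attempted flow: dump the bare-metal database, stand up a
# MariaDB container, import the dump, then point the Friendica image at it.
set -euo pipefail

# 1. Dump the existing bare-metal database
mariadb-dump --single-transaction friendica > friendica.sql

# 2. Start the database container, keeping its data under /persistent
docker network create friendica-net
docker run -d --name friendica-db --network friendica-net \
  -v /persistent/friendica/db:/var/lib/mysql \
  -e MYSQL_ROOT_PASSWORD=changeme \
  -e MYSQL_DATABASE=friendica \
  -e MYSQL_USER=friendica -e MYSQL_PASSWORD=changeme \
  mariadb:10.5

sleep 30   # give MariaDB a moment to initialize before importing

# 3. Import the dump into the new container
docker exec -i friendica-db mysql -ufriendica -pchangeme friendica < friendica.sql

# 4. Start the Friendica container against that database (the database host,
#    user, and password are supplied through the image's documented
#    environment variables)
docker run -d --name friendica --network friendica-net \
  -v /persistent/friendica/html:/var/www/html \
  friendica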

For now my Friendica instance has been taken offline, but I hope to have it up and running again in the next week or two.

Nextcloud

Migrating Nextcloud was an incredibly smooth experience. Their official Docker container has instructions on how to migrate. Since Nextcloud is also a PHP and MariaDB application, it basically amounted to dumping the database, starting up the database container, importing the data, then starting up the Nextcloud container with the data directory mounted. Additionally, a cron container needed to be set up. Putting this all in a `docker-compose.yml` keeps the whole thing very tidy and straightforward, and takes away a lot of the more brittle aspects of normal Nextcloud updates.

For a short time I encountered what I thought were bugs in the videoconferencing app inside Nextcloud: URLs were not being filled in properly, so clicking the button to go to the videoconferencing app (Nextcloud Talk) would instead point to the Docker IP of the container. I followed a post in their GitHub tracker, and after a while realized the proper solution was setting `ProxyPreserveHost On` in the Apache reverse proxy configuration. This, along with leaving Nextcloud's automatic domain name detection on and setting Nextcloud's `overwriteprotocol` to `https`, made sure that the URL detected by Nextcloud worked regardless of which of the three domain names it was visited on.

Nextcloud Docker Documentation

Nextcloud Talk Issue on GitHub
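
Those two fixes boil down to one Apache directive on the host plus one `occ` call inside the container - roughly the following, with the container name and paths being whatever your own setup uses:

#!/usr/bin/env bash
# Post-migration tweaks, sketched from memory; check the keys against your
# own Nextcloud instance before relying on them.
set -euo pipefail

# Tell Nextcloud to generate https:// URLs even though the container itself
# only sees plain HTTP behind the reverse proxy.
docker exec -u www-data nextcloud php occ config:system:set overwriteprotocol --value=https

# On the host-side Apache reverse proxy, pass the original Host header
# through to the container so Nextcloud's automatic domain detection works:
#   ProxyPreserveHost On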

Letsencrypt

I was apprehensive about converting the Letsencrypt daemon to Docker. While Letsencrypt does provide a Docker image with their certbot tool included, at the time we were using the Apache plugin for authenticating challenges: whenever it was time to do a challenge, certbot would add an Apache rule pointing a special subdirectory ($DOMAIN/.well-known/acme-challenge/$RANDOMSTRING) to the challenge file. Putting Letsencrypt's certbot and our Apache reverse proxy in separate containers would break certbot's ability to manipulate Apache to pass challenges. After some thought and more than a little time fighting Apache configurations, I came upon a solution:

# Redirect http to https
<VirtualHost *:80>
  ServerName thorjhanson.com
  ServerAlias *.thorjhanson.com
  ServerAdmin admin@thorjhanson.com

  # If it is NOT an Acme challenge, redirect to https
  RewriteEngine On
  RewriteCond %{REQUEST_URI} !^/.well-known/acme-challenge/
  RewriteRule ^/?(.*) https://%{SERVER_NAME}/$1 [R,L]

  ProxyPass /.well-known/acme-challenge/ http://localhost:1075/.well-known/acme-challenge/

  ErrorLog ${APACHE_LOG_DIR}/thorjhanson-error.log
  CustomLog ${APACHE_LOG_DIR}/thorjhanson-access.log combined
</VirtualHost>

Adding a section like this to the Apache configuration of each of the base domains would ensure that requests to the `/.well-known/acme-challenge/` subdirectory specifically get proxied into the Letsencrypt container, while everything else is redirected to HTTPS.

We set up the Letsencrypt container to mount a directory on the host machine, making the certs available to be mounted by whatever other containers would need them (Apache reverse proxy, Postfix, and Mumble). With this, what I expected to be one of the most challenging containers was actually very straightforward.
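
One way to wire this up - a sketch assuming certbot's standalone authenticator, with domains and paths as placeholders - is to publish the container's HTTP port on localhost:1075, which is where the `ProxyPass` above points:

# Run certbot's built-in web server inside the container and publish it on
# localhost:1075, matching the ProxyPass target in the Apache config above.
docker run --rm --name certbot \
  -p 127.0.0.1:1075:80 \
  -v /persistent/letsencrypt:/etc/letsencrypt \
  certbot/certbot certonly --standalone \
  --email admin@thorjhanson.com --agree-tos \
  -d thorjhanson.com -d www.thorjhanson.com

# The certs that land under /persistent/letsencrypt can then be bind-mounted
# read-only into the Apache, Postfix, and Mumble containers; renewals use
# `certbot renew` with the same port mapping.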

Postfix (Mailserver)

Pleasantly, I found that setting up a Postfix container was really easy. My Postfix configuration file, which had initially been a nightmare to get right, required almost no changes. It was likely the easiest container to set up.
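
In sketch form (the image name here is just a stand-in for whatever locally built Postfix image is in use), the whole thing amounts to bind-mounting the existing config and the Letsencrypt certs into the container:

# Placeholder image name "our-postfix"; the existing main.cf and the certs
# from the Letsencrypt container are simply mounted in read-only.
docker run -d --name postfix \
  -p 25:25 -p 587:587 \
  -v /persistent/postfix/main.cf:/etc/postfix/main.cf:ro \
  -v /persistent/letsencrypt:/etc/letsencrypt:ro \
  our-postfix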

The Migration

As December drew to a close, we decided that on Saturday, January 2nd, we would begin the migration process. This was despite not yet having determined which firewall application we would be using, nor having a functioning Docker version of the Apache reverse proxy.

We settled on Arch Linux on ZFS. This would require a custom installation on Vultr, a feature DO did not offer. ZFS would let us use its snapshotting capabilities, as well as send those snapshots to our in-home servers to facilitate easy backups. Handling backups ourselves this way would also drop our server costs from 24 USD/mo to 20 USD/mo, since we wouldn't be paying for Vultr's backup services. We chose Arch Linux because Andrew and I were most familiar with it, and because (with all of our services in Docker anyway) our choice of host OS wasn't as critical as it would be if we were running all the services bare-metal. We also decided to adopt nftables as our firewall application of choice, since it is the designated successor to iptables (we had previously been using UFW).
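
The kind of backup flow this enables looks roughly like the following (pool, dataset, and host names are placeholders): snapshot the dataset on the VPS, then stream it over SSH into a pool on an in-home server.

#!/usr/bin/env bash
# Snapshot the VPS dataset and replicate it to a ZFS pool at home.
# Dataset and host names are illustrative only.
set -euo pipefail

today=$(date +%Y-%m-%d)
zfs snapshot -r zroot/data@"$today"

# Full send the first time; subsequent runs would use an incremental
# send (-i previous-snapshot) instead.
zfs send -R zroot/data@"$today" | ssh backup.home zfs receive -u tank/vps-backup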

We had some initial trouble getting Arch on ZFS installed, but Andrew managed to get it working the next day. With the base OS installed, we could start with some of the "real" migration steps (many done in parallel, since both Andrew and I were working on it):

We had to take a break for the next two days, after which Andrew was able to put the finishing touches on the firewall rules. On Tuesday, January 5th, we decided we were ready to take the leap: I copied the DNS records over from DO to Vultr, and we flipped our domains to point to the new server.

With this done, we would have to wait around 24 hours for the DNS changes to propagate.
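
In the meantime, a quick way to keep an eye on propagation is simply to ask a couple of public resolvers what they currently return for the domain:

# Once the public resolvers all report the new Vultr address,
# propagation is effectively done.
for ns in 1.1.1.1 8.8.8.8; do
  echo -n "$ns: "
  dig +short A thorjhanson.com @"$ns"
done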

By Wednesday night, the DNS all seemed to be in working order. This meant we could start and troubleshoot the remaining services. Andrew got our Mumble server up and running. We also had to edit the firewall rules to allow our Docker applications to talk to our mailserver. It was at that point that we realized Vultr didn't allow us to send outbound mail, since that is a by-request-only feature. I submitted the ticket and went to bed.
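
That firewall edit amounts to something like the rule below. The table and chain names and the Docker bridge subnet are assumptions - check `docker network inspect bridge` for the real subnet, and depending on how the mailserver's port is published the rule may belong in the forward chain instead.

# Allow traffic from the Docker bridge network to reach the mailserver's
# SMTP port; adjust table/chain/subnet to match the actual ruleset.
nft add rule inet filter input ip saddr 172.17.0.0/16 tcp dport 25 accept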

The next day, we got approval to send mail, which meant Bitwarden was up and running successfully (it relies on our mailserver). Over the remaining few days, I got the Nextcloud instance back up and running. On Saturday we took a deep breath, looked around, and decided it was time to shut down and delete the old server for good.

In Retrospect

We are still polishing up a few things here and there (and I still need to try to get my Friendica server back up and running), but by and large I would say we got our migration done in a week, with only a few days of downtime on our most critical services. While it certainly isn't professional grade, for a couple of hobbyists working in our free time I think we did pretty well. We are extremely pleased with the new setup. Performance has been as expected, and traffic doesn't have to travel as far before exiting our VPN.

I'd like to say thank you to DigitalOcean for several years of beyond-excellent service. I highly recommend them for any VPS needs. I would also like to give a thank-you to everyone who helps contribute to the Arch Wiki, since without it we likely wouldn't have been able to pull this off. And, of course, thank you to Andrew and Jake, who both offered technical and moral support. Without them, I would be running a much lonelier server with a lot less ambition.