gemini - gem.snowgoons.ro

In my last article[1], I talked about how this site is built using the Hugo Static Site Generator. But I notably didn't talk about how it's deployed.

I've also written before[2] about the - slightly crazy - local Kubernetes cluster that serves as my sandpit and also the hosting platform for this site. This post is where the two things come together, with a brief guide to the interested on how to easily deploy a Static Site Generator built site onto Kubernetes.

This is not the quickest/simplest way to deploy your SSG site to the web - the prize for that probably goes to something like GitLab Pages[3], or to one of the dedicated hosting providers. But it is nevertheless a *remarkably* simple process if you happen to have a Kubernetes cluster in your house.

Pre-Requisites

For this article, I'm going to assume you already have access to a Kubernetes cluster. It doesn't have to be in your own home - you could use Google Kubernetes Engine[4] or similar - that said, Cloud K8s providers are definitely *not* the cheapest option... I will describe how I arrange an ingress from the Internet to my Kubernetes cluster for cheap, though.

I'll also assume you are familiar with working on the Unix/Mac command line, and that you have your Static Site Generator working (or just have some static content you want to publish).

This isn't really about practicality, it's about playing with some fun tech :-).

**Important:** I should also flag that this isn't necessarily about the *right* way to do this. This is *my* way that I did this, which I like because it's simple and it works.

So what are the key deliverables?

Ultimately, what I wanted was:

1. To host some static content on a webserver deployed in Kubernetes.

2. A simple way to update that webserver whenever I have new content to publish.

3. A way to expose the webserver running on the K8s cluster on my desk to the Internet securely.

To do that, I need to build a few things:

┌─────────────────────────┬────────────────────────────────────────────────────┐
│          Thing          │                       What?                        │
╞═════════════════════════╪════════════════════════════════════════════════════╡
│                         │ This will be the Docker image that contains a      │
│ A webserver image       │ simple webserver, which serves my static content   │
│                         │ to whoever asks                                    │
├─────────────────────────┼────────────────────────────────────────────────────┤
│                         │ This is the description that tells Kubernetes how  │
│ A Kubernetes deployment │ to deploy my webserver - where to download the     │
│                         │ image, how many copies to run, that sort of thing. │
├─────────────────────────┼────────────────────────────────────────────────────┤
│ A Kubernetes ingress    │ This is the Kubernetes component that will route   │
│                         │ traffic from 'the real world' into my webserver    │
├─────────────────────────┼────────────────────────────────────────────────────┤
│                         │ A simple script that will compile the new version  │
│ A build/deploy script   │ of the site, then build the docker container, then │
│                         │ tell Kubernetes to deploy it                       │
└─────────────────────────┴────────────────────────────────────────────────────┘

It seems like a lot of components, but in practice two short configuration files (a Dockerfile and a Kubernetes YAML file) and a simple shell script are all I need to provide all of the above. Fairly amazing.

If I was using a commercial Kubernetes hosting provider, that is likely all I would need. But since I'm crazy enough to be hosting this on my own local server, I need one other thing - a tunnel from my local cluster to the Internet, so I can route traffic from public sites into my cluster. (Since I am using a consumer Internet connection here, it also needs to be resilient to things like my IP address periodically changing.) For this I use the Cloudflare Argo Tunnel[5] service; I'll describe the details in another (very) short post.

A Docker Webserver

Rather than deploy a webserver in a Docker container which mounts some kind of persistent storage device, that I then deploy my site into (with FTP, or over NFS, whatever,) I decided to go for a simple solution. I would build a Docker image which has my site content baked right into it, as part of the image itself. This way, I don't need to worry about configuring any shared storage or complicated mechanisms to update the storage - I deploy the webserver image, and the content is deployed right along with it.

How to deploy updated versions of the site? Just build a new Docker image, with the updated content. We'll use the version tag when we tell Kubernetes to deploy the container to ensure it uses the latest version.

Given this, our `Dockerfile` describing the Docker image is insanely simple. We're going to start with the 'standard' Apache webserver docker image, and then we will copy our static content into Apache's web root directory.

And that's it.

This is what our Dockerfile looks like:

FROM httpd:alpine
COPY ./site /usr/local/apache2/htdocs

No, really, *that's the entire thing*. Build and deploy this docker image, and it will listen on the default port (port 80) for connections, and serve up my static content. Which is exactly what I want it to do.

We're not going to deploy it by hand, though; we're going to ask Kubernetes to do that for us. But we do need to build the container image and push it up to a registry that our Kubernetes cluster can then pull from. I use the private registry that GitLab[6] provides with their Git hosting service, so for me the commands to build and push the Docker image look like this:

docker build -t registry.gitlab.com/snowgoons/websites/snowgoons-ro:latest .
docker push     registry.gitlab.com/snowgoons/websites/snowgoons-ro

One note on detail - you'll notice I use the Alpine Linux version of the httpd image as my base. This is just because, as I've mentioned before, my Kubernetes cluster is tiny and underpowered, and Alpine images are in general much smaller. I use them when I can, but it's not essential.

Haven't you forgotten something?

You mean the site itself, I think. Well, as I mentioned before, I'm using the Hugo[7] Static Site Generator to build my site. So the incantation to build it looks like this:

hugo

OK OK, there is a little bit more than that. I also need to configure where Hugo should output the files - ideally the same place that my Dockerfile will pick it up to copy into the image. This is done by adding the following line to Hugo's `config.toml` file:

publishDir = "docker/site"

And *that* really is everything. If you wanted to test it on a local Docker instance (something like Docker Desktop[8]) you could do it with the following command:

docker run -p 8080:80 registry.gitlab.com/snowgoons/websites/snowgoons-ro:latest

And then point your browser at `localhost:8080` to view the site.

The Kubernetes Deployment

OK, so now I have a Docker image that consists of an Apache webserver, with my static content baked into the image. All I need to do is tell Kubernetes to deploy it. (Actually, I'm going to tell it to deploy two instances, so I have some redundancy against failure. This being Kubernetes, I'll let it handle all the magic involved there.)

To do this, I need a simple deployment descriptor in my `deployment.yaml`:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: snowgoons-httpd
  namespace: dmz
  labels:
    app: httpd
    site: snowgoons
spec:
  replicas: 2
  selector:
    matchLabels:
      app: httpd
      site: snowgoons
  template:
    metadata:
      labels:
        app: httpd
        site: snowgoons
    spec:
      imagePullSecrets:
      - name: gitlab-private-registry
      containers:
      - name: httpd
        image: registry.gitlab.com/snowgoons/websites/snowgoons-ro:latest
        ports:
        - containerPort: 80

This is about as simple as a deployment gets. There are a couple of notes though:

In my Kubernetes cluster I have a namespace called *dmz* - anything in this namespace is potentially exposed to the Internet via my Argo Tunnel. For a more complex application, the only component that would be deployed in the dmz would be a simple proxy (a bastion host, well, container, I guess) that routes traffic to the backend applications in their own namespace. This static site is so simple though, the HTTPD server itself just sits in the DMZ.
Don't forget to make sure your pod labels match the selector labels here. In my DMZ namespace I have a (personal) guideline that I label everything with `app` to identify the service, and `site` to identify the public site it's associated with (if it wasn't associated with a public site, it wouldn't be in the DMZ, after all.) You'll see I use the same labels later to identify the Service description, and also to configure the tunnel that exposes that service to the Internet.
Because I'm using a private registry to host my images, I need to tell Kubernetes how to log in. The login credentials are stored in a Kubernetes secret, `gitlab-private-registry`, which I reference in any deployment description that needs to pull from that registry, with the *imagePullSecrets* key.
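
A minimal sketch of how a secret like `gitlab-private-registry` could be created, assuming credentials that are allowed to read from the registry (for example a GitLab deploy token). The function name `create_pull_secret` and the `KUBECTL` override are hypothetical and only there for illustration; the `kubectl create secret docker-registry` subcommand itself is standard.

```shell
#!/bin/sh
# Sketch only: create the registry credentials secret that the Deployment's
# imagePullSecrets entry refers to. create_pull_secret is a hypothetical helper.
#   $1 = registry username (assumption: e.g. a deploy token name)
#   $2 = registry password or token
create_pull_secret() {
  # Set KUBECTL="echo kubectl" to preview the command without touching a cluster.
  ${KUBECTL:-kubectl} create secret docker-registry gitlab-private-registry \
    --namespace=dmz \
    --docker-server=registry.gitlab.com \
    --docker-username="$1" \
    --docker-password="$2"
}
```

It would be called as `create_pull_secret "$REGISTRY_USER" "$REGISTRY_TOKEN"`, where both variable names are placeholders. Passing a password on the command line can leave it in your shell history, so a short-lived, read-only token is the safer choice.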

The Deployment descriptor isn't quite enough though, I also need to define the Service - that is, to expose the container port that Apache is listening on to the internal Kubernetes network, so the Ingress can talk to it.

This I can do with the following service description:

apiVersion: v1
kind: Service
metadata:
  name: snowgoons-dmz
  namespace: dmz
spec:
  selector:
    app: httpd
    site: snowgoons
  ports:
  - name: http
    protocol: "TCP"
    port: 80
    targetPort: 80
  type: ClusterIP

Note that I'm not worrying about SSL/TLS at all here. Internally, I don't need to worry about it (I'm just throwing non-personal data around my own network, so security is not really an issue,) and externally the Argo Tunnel from Cloudflare will completely automate the creation and management of SSL certificates for the site once it's deployed.

The Kubernetes Ingress

So, the final piece of the Kubernetes puzzle is the *Ingress* - that is, the component that will route traffic from the public Internet to my service.

If you are using a Cloud Provider for your Kubernetes, the Ingress specification will tell your provider to instantiate some kind of public load-balancer - on Amazon's Elastic Kubernetes Service[9], for example, it will automatically create an Elastic Load Balancer (ELB) instance that routes traffic to your service.

I on the other hand am not using a public cloud provider, and I don't have a weapons-grade internet connection or a pool of publicly routable IP addresses that I could spin up my own load-balancers on. That's where Cloudflare's Argo Tunnel comes in - it acts as my ingress provider. When I ask for an Ingress, it sets up a VPN from my cluster to the nearest Cloudflare PoP, routes the traffic, and even automatically creates the DNS entry for the new service. It gives me the same level of simplicity I would get from a cloud managed Kubernetes, but with my on-premises service. *For $5 a month.*

Anyway, what does that look like in real terms? Like this:

apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  annotations:
    kubernetes.io/ingress.class: argo-tunnel
  labels:
    component: ingress
    site: snowgoons
  name: snowgoonsdotro
  namespace: dmz
spec:
  rules:
  - host: snowgoonsdotro.snowgoons.ro
    http:
      paths:
      - backend:
          serviceName: snowgoons-dmz
          servicePort: 80
        path: /
  tls:
    - hosts:
      - snowgoonsdotro.snowgoons.ro
      secretName: snowgoonsdotro-tunnel-cert

This is basically a fairly standard Ingress descriptor. The *ingress.class* annotation tells the Argo Tunnel agent that I'd like it to handle the management of this Ingress for me. The *host* I specify here is **declarative** - I'm not saying "this is the domain I created for this", I'm saying "this is the domain I want you to go away and create for me - handle everything else by magic please."
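
One caveat for anyone following along on a recent cluster: the `extensions/v1beta1` Ingress API was removed in Kubernetes 1.22. The sketch below prints what the same descriptor might look like under `networking.k8s.io/v1`, where each path needs a `pathType` and the backend is written as `service.name` / `service.port.number`. Whether the Argo Tunnel ingress controller accepts this form is an assumption that has not been verified here, and `print_ingress_v1` is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch only: emit the Ingress from the article rewritten for the
# networking.k8s.io/v1 API. Names and hosts are copied from the original
# descriptor; compatibility with the Argo Tunnel controller is an assumption.
print_ingress_v1() {
  cat <<'EOF'
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  annotations:
    kubernetes.io/ingress.class: argo-tunnel
  labels:
    component: ingress
    site: snowgoons
  name: snowgoonsdotro
  namespace: dmz
spec:
  rules:
  - host: snowgoonsdotro.snowgoons.ro
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: snowgoons-dmz
            port:
              number: 80
  tls:
  - hosts:
    - snowgoonsdotro.snowgoons.ro
    secretName: snowgoonsdotro-tunnel-cert
EOF
}
```

The output can be piped straight into the cluster with `print_ingress_v1 | kubectl apply -f -`, or pasted into the deployment YAML in place of the older descriptor.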

And that, really, is everything you need to deploy a static site with Kubernetes. One short Dockerfile, and one Kubernetes deployment YAML file with three short object descriptions: the deployment, the service, and the ingress.

I'm actually still amazed even as I write this. All I need to do to deploy everything, and have the site working, the DNS created, SSL certificates created, all configured together, is type

kubectl apply -f deployment.yaml
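
To confirm that the apply actually produced running pods, a check along these lines may help. It is a sketch, not part of the original setup: `check_rollout` and the `KUBECTL` override are hypothetical names, and the 120 second timeout is an arbitrary choice. The namespace, deployment name and labels are the ones used in the descriptors above.

```shell
#!/bin/sh
# Sketch only: wait for the Deployment to finish rolling out, then list the
# pods that the Service selector (app=httpd, site=snowgoons) would match.
check_rollout() {
  # Set KUBECTL="echo kubectl" to preview the commands without a cluster.
  ${KUBECTL:-kubectl} --namespace dmz rollout status \
    deployment/snowgoons-httpd --timeout=120s &&
  ${KUBECTL:-kubectl} --namespace dmz get pods \
    --selector app=httpd,site=snowgoons
}
```

If the second command lists no pods, the usual culprit is a mismatch between the pod labels and the selector labels.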

Tying it together - a simple deploy script

So, to finally put it all together, we need one last thing - a simple script that will build the content when I want to publish updates, and push those changes to the Kubernetes cluster. The script needs to:

1. Build the site with Hugo

2. Build the updated Docker image & push it to the registry

3. Tell Kubernetes to update its deployment.

Now, there is one tiny wrinkle with that very last point; up until now, I have used the `:latest` tag for my Docker images. But that gives a tiny problem.

Using a :latest tag on an image reference will by default tell Kubernetes to ignore its local image cache, and always pull the image from the registry whenever it starts a container (it has an implicit `imagePullPolicy: Always`, in other words.) This isn't necessarily what I want to happen, and what I *do* want to happen probably won't:

When a container gets recreated on another node (because the server died or for whatever reason), Kubernetes will always pull the *:latest* version of the Docker image with my site. But that isn't necessarily the last version that I wanted deployed - maybe I am building new versions for some reason that I don't want published yet; I don't want Kubernetes accidentally publishing the new version just because a node got restarted.
On the other hand, I want to update the Kubernetes deployment when I have new content. But if I tell it "this deployment should be using the :latest version", Kubernetes is going to say "it already is, so I have nothing to do here". At this point, it's not about the image cache - it's about the 'new' Deployment description being identical to the old one; that means Kubernetes won't even bother trying to download a new version, cached or not, because it doesn't think anything changed to prompt it to.

No, we shouldn't use *:latest* in our deployment description - we should be using a specific version, that is "the version that I built the last time I deployed the site".

If I was integrating this with a proper Continuous Integration system, the answer would be simple - I'd use the identifier of the last version control commit as the version tag. But I haven't done that - I wanted just a simple command I could run to deploy my site; so instead, in my shell script I just create a version tag using `date +%s` to get the current time in seconds since epoch.
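
The two tagging strategies can be combined in a few lines. This is a sketch under the assumption that the site lives in a Git repository; `make_buildtag` is a hypothetical helper name, not something from the deploy script itself.

```shell
#!/bin/sh
# Sketch only: choose a build tag for the Docker image.
make_buildtag() {
  # Prefer the short hash of the last commit when run inside a Git work tree...
  if tag=$(git rev-parse --short HEAD 2>/dev/null); then
    printf '%s\n' "$tag"
  else
    # ...otherwise fall back to seconds since the epoch.
    date +%s
  fi
}
```

One trade-off to keep in mind: a commit hash only changes when you commit, so rebuilding with uncommitted edits would reuse the previous tag and Kubernetes would again see nothing to do. The timestamp never has that problem, which is part of why it suits a run-it-by-hand script.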

The other trick I need to make this work is that a Kubernetes yaml file doesn't by default parse environment variables[^1], but I need some way of getting that new build version into the Deployment description when I update Kubernetes. This is where I do a small cheat - I run the deployment yaml through the Unix `envsubst` command first, to substitute the value in; and the image definition on my deployment.yaml now looks like this:

      containers:
      - name: httpd
        image: registry.gitlab.com/snowgoons/websites/snowgoons-ro:${BUILDTAG}
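
A small sketch of the substitution step on its own. Passing `'${BUILDTAG}'` as an argument tells `envsubst` to replace only that variable, so any other `$` sequences in the YAML are left alone. The helper name `render_manifest` and the `sed` fallback for machines without `envsubst` are assumptions added for illustration.

```shell
#!/bin/sh
# Sketch only: print a manifest with ${BUILDTAG} filled in.
#   $1 = path to the manifest template
render_manifest() {
  # Refuse to render an image reference with an empty tag.
  : "${BUILDTAG:?BUILDTAG must be set}"
  if command -v envsubst >/dev/null 2>&1; then
    # Restrict substitution to BUILDTAG; everything else passes through.
    BUILDTAG="$BUILDTAG" envsubst '${BUILDTAG}' < "$1"
  else
    # Fallback: replace the literal text ${BUILDTAG} with sed.
    sed 's|[$]{BUILDTAG}|'"$BUILDTAG"'|g' "$1"
  fi
}
```

With `BUILDTAG` set, `render_manifest deployment.yaml | kubectl apply -f -` would then play the same role as the plain `envsubst` pipeline.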

So, to finally bring it all together, this is what my build script looks like:

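#!/bin/sh
# Stop at the first failing step, so a broken build is never deployed.
set -e

REGISTRY=registry.gitlab.com/snowgoons/websites/snowgoons-ro

# Unique tag for this build: seconds since the epoch.
BUILDTAG=$(date +%s)
export BUILDTAG

echo "Step 1 - Build Site"
hugo

echo "Step 2 - Build Docker image"
cd docker
docker build -t "${REGISTRY}:${BUILDTAG}" .
# Push the new tag explicitly; recent Docker releases push only :latest
# when no tag is given.
docker push "${REGISTRY}:${BUILDTAG}"
cd ..

echo "Step 3 - Deploy to Kubernetes"
cd k8s
# Substitute ${BUILDTAG} into the manifest, then hand it to Kubernetes.
envsubst < deployment.yaml | kubectl apply -f -
cd ..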

That's all folks

And, that's it for now! If you read this far, I hope you learned something useful - even if it was just how not to do things. As a closing summary, here is the entire process of deploying this updated story to the site, just to prove I'm not making it up:

[Sasha:Development/Websites/snowgoons-ro] timwa% ./deploy.sh 
Step 1 - Build Site

                   | EN  
-------------------+-----
  Pages            | 32  
  Paginator pages  |  2  
  Non-page files   |  0  
  Static files     | 32  
  Processed images |  0  
  Aliases          | 11  
  Sitemaps         |  1  
  Cleaned          |  0  

Total in 351 ms
Step 2 - Build Docker image
Sending build context to Docker daemon  4.469MB
Step 1/2 : FROM httpd:alpine
 ---> eee6a6a3a3c9
Step 2/2 : COPY ./site /usr/local/apache2/htdocs
 ---> 4f1bd1a1dada
Successfully built 4f1bd1a1dada
Successfully tagged registry.gitlab.com/snowgoons/websites/snowgoons-ro:1589467618
The push refers to repository [registry.gitlab.com/snowgoons/websites/snowgoons-ro]
bcd468d0da79: Layer already exists 
75330f1d6f1c: Layer already exists 
0d648f7641c2: Layer already exists 
[... snipped a load of Docker stuff ...]
ee4c544f5273: Layer already exists 
c1a19e8cc45c: Layer already exists 
3e207b409db3: Layer already exists 
latest: digest: sha256:768b5dfb6c4699503b2623e6b15cebe1c83946280a4475fb427fa502204d7c94 size: 1572
Step 3 - Deploy to Kubernetes
service/snowgoons-dmz unchanged
ingress.extensions/snowgoonsdotro unchanged
deployment.apps/snowgoons-httpd configured