A few days ago I built a new website, though calling it that might be a touch too generous. It's called *500 as a Service*, or *500aaS* for short. You can visit it at 500asaservice.com. Don't sue me if you find it disappointing. It's meant to be a failure, after all.
I figured, with the wealth of things available to consume as a service nowadays, it seemed only appropriate to altruistically offer a piece of failure on demand, free of charge, to my fellow humans. I know: I may have peaked with this idea right here.
Take it for what you will, but I probably got more out of building this website than you, dear reader, will after contemplating it in befuddlement. I didn't set out to build a lazy failure, mind you. I wanted to build a massively scalable one. And that poses a somewhat more interesting challenge. How did I do it?
The vision for 500aaS is as follows:
Provide a planet-scale, elastic, resilient, secure and low-latency on-demand HTTP 500 service.
Planet-scale, elastic... those couple of buzzword bingo entries hint at a cloud service... AWS maybe? Correct! (it was
the *elastic* part that gave it away, wasn't it?). In times past, the simplest way to deploy a site like this would've
been to get hold of a server box somewhere, set up Apache or Nginx on it and configure the web server to always return
500 errors, regardless of the URL pattern it receives. Open this server up to the public Internet via a static IP
address and voilà: you've got yourself a homemade failure.
```nginx
# Minimal nginx config that will get you 500 responses forever
# /etc/nginx.conf
events {}

http {
    server {
        location / {
            return 500;
        }
    }
}
```
But there is a big caveat here: this is just one physical server we're talking about. What if 500aaS took off big time and people started swarming onto my site, anxiously seeking their daily fix of foobar? The server could become overwhelmed, unable to even muster the processing power to serve a *faux* HTTP 500, and start returning real ones, if anything at all. You could argue this is technically still OK, as the whole point of 500aaS is to fail, but I'm a bit of a purist, so I couldn't accept that possibility. The question remains, then: how do I deploy a service like this so that it can serve endless botched responses in a controlled manner, to anyone, under any circumstances? By taking it to the cloud, of course!
The easiest, most scalable way to host a site on AWS is to build it on top of their serverless stack: Lambda,
DynamoDB... My application doesn't need any state to remember that it should always be serving a 500 response back, so all I
need is a simple Lambda function to run it. One like this, maybe:
```javascript
exports.handler = async () => {
  const htmlBody = `
    <!doctype html>
    <html>
      <head>
        <title>500 Internal Server Error</title>
      </head>
      <body>
        <h1>Internal Server Error</h1>
        <p>There was an error processing your request.</p>
      </body>
    </html>
  `;

  const response = {
    status: '500',
    statusDescription: 'Internal Server Error',
    headers: {
      vary: [{ key: 'Vary', value: '*' }],
      'last-modified': [{ key: 'Last-Modified', value: '2017-01-13' }],
      'content-type': [{ key: 'Content-Type', value: 'text/html' }],
    },
    body: htmlBody,
  };

  return response;
};
```
I want to return an error with the minimum amount of complexity and effort possible. It turns out it's actually pretty hard
to make a Lambda function truly crash, so I manually craft the 500 status codes instead. Is this cheating? Maybe, but it's not like a user of this service would care. They just want to see a 500 error page, for God's sake!
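Crafting the response by hand has a nice side effect: it can be sanity-checked locally in plain Node, no AWS involved. Here's a quick check using a trimmed-down stand-in for the handler above, inlined so the snippet is self-contained:

```javascript
// Stand-in for the Lambda handler above, inlined for a self-contained run.
// In the real project it would be imported from its own module.
const handler = async () => ({
  status: '500',
  statusDescription: 'Internal Server Error',
  headers: {
    'content-type': [{ key: 'Content-Type', value: 'text/html' }],
  },
  body: '<h1>Internal Server Error</h1>',
});

// Invoke it locally and inspect the crafted response.
handler().then(res => {
  console.log(res.status);            // '500'
  console.log(res.statusDescription); // 'Internal Server Error'
});
```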
Despite the simplicity of its implementation, this approach would still require me to set up an API
Gateway as a frontend, which works, but isn't very cheap in the long run, and not entirely hassle-free either. There is another
option: serve the content as close as possible to the location it was requested from, and generate the response directly where it's served. Does this sound like I'm talking about a CDN? Because that's exactly what I'm talking about.
If you've never come across this approach before, several cloud vendors and CDN providers let you ship your code
directly to the servers at their edge points of presence, which means the client-server round trip is remarkably
shortened. Instead of having the CDN as the middleman that caches the content served from the actual web servers, the
CDN now becomes **the** server. Wait a minute, aren't CDNs just dumb caches serving static files all over the planet? Well, not
anymore! You can now run arbitrary code on them too, which lets them modify payloads passing through them on the fly, as
well as generate new content dynamically!
The first major vendor I know of that started offering this was Cloudflare, with Cloudflare Workers. The technology
that enables this is pretty interesting, but beyond the scope of this article. Anyway, getting started with Cloudflare Workers is fairly easy nowadays. Here's a sample Worker JS script I put together:
```javascript
addEventListener('fetch', event => {
  event.respondWith(handleRequest(event.request));
});

/**
 * Respond with HTTP 500
 * @param {Request} request
 */
async function handleRequest(request) {
  const htmlBody = `
    <html>
      <head>
        <title>500 - Internal Server Error</title>
      </head>
      <body>
        <h1>500</h1>
        <h2>Internal Server Error</h2>
      </body>
    </html>
  `;

  return new Response(htmlBody, {
    status: 500,
    headers: { 'content-type': 'text/html' },
  });
}
```
Deploying it was easy too. The problem came shortly after, when I tried to add my new Worker URL to my AWS Route 53 DNS
records so that I could use the 500asaservice.com domain for it. The bad news is that Cloudflare won't let you do
this unless you pay them a shedload of money. So that was the end of my adventure with Cloudflare Workers. At this
point I decided to return to AWS to see what they could do for me. And sure enough, they have something pretty similar
to Cloudflare Workers: it's called Lambda@Edge, and it allows you to run Lambda functions within CloudFront itself.
With barely any changes to my original Lambda code, I set up a new CloudFront distribution. The origin for the distribution is inconsequential, since every single response will be generated within the
Lambda, so I just gave it a made-up one. Then all I had to do was set up a CloudFront `viewer-request` event as the
trigger for my Lambda and deploy the distribution. Once I got everything working, I encoded the configuration in a
`serverless.yml` so it was easier to change and deploy. And that was pretty much it. I now have a Lambda function which
runs atop Amazon's ubiquitous and nearly infallible CDN. It costs me almost nothing to run it (provided it doesn't start
serving huge amounts of traffic) and requires no maintenance at all. I'm so confident of the performance and uptime
(downtime?) of my application that I even published an SLA for it.
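For reference, a `serverless.yml` for this kind of setup can be quite small, using the Serverless Framework's built-in `cloudFront` event. The sketch below is illustrative rather than my exact file; the service and function names, runtime and dummy origin are all placeholders:

```yaml
# Simplified sketch of a serverless.yml for this setup.
# Service/function names, runtime and the dummy origin are illustrative.
service: 500aas

provider:
  name: aws
  # Lambda@Edge functions must be deployed to us-east-1
  region: us-east-1
  runtime: nodejs18.x

functions:
  serve500:
    handler: index.handler
    events:
      - cloudFront:
          eventType: viewer-request
          origin: https://example.com  # made-up origin, never actually contacted
```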
There are still a couple of bugs in my application. Excuse me, bugs in an app that was built to fail? Yup: it turns out
500aaS does not always return an HTTP 500 status code. It's still susceptible to malformed HTTP requests, which force
CloudFront to step in and return an HTTP 400 error instead, bypassing the Lambda altogether (this is why my SLA does not promise 100% downtime). This is something I could perhaps fix by overriding the custom error
responses CloudFront returns, but those seem to be set up as a function of the origin response, so I don't know whether they
would work with Lambda@Edge. Still a work in progress.
If you're interested in checking out how 500aaS was built, you can browse the source repository on GitHub. Pull requests and suggestions are welcome. I even set up a GitHub Actions pipeline to run a test that ensures it always fails. Because I have standards, you know?
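That pipeline boils down to requesting the site and demanding failure. A workflow along these lines would do it; this is a sketch, with the workflow name, schedule and exact check being illustrative rather than the actual pipeline:

```yaml
# .github/workflows/ensure-it-fails.yml (illustrative sketch)
name: ensure-it-fails

on:
  push:
  schedule:
    - cron: '0 6 * * *'  # also check daily, in case uptime sneaks in

jobs:
  expect-500:
    runs-on: ubuntu-latest
    steps:
      - name: Request the site and demand failure
        run: |
          status=$(curl -s -o /dev/null -w '%{http_code}' https://500asaservice.com)
          echo "Got HTTP $status"
          test "$status" = "500"
```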