💾 Archived View for capsule.adrianhesketh.com › 2022 › 07 › 27 › migrating-fargate-and-lambda-to-arm captured on 2023-05-24 at 17:50:39. Gemini links have been rewritten to link to archived content
-=-=-=-=-=-=-
I've been using AWS's Graviton ARM processors for personal projects for a while.
They're cheaper, use lower power chips, and might be better for the environment [0].
But... can I actually use them for commercial projects? I decided to try and migrate some production services:
Since I develop on an M1 Mac, I was fairly confident that my code would work fine on an ARM processor in AWS.
The migration process is explained in an AWS blog [1], but this post covers the use of CDK instead.
Migrating Node.js Lambda functions with CDK is very straightforward. It's just a case of setting the `architecture` field to `ARM_64` in the `NodejsFunction` properties.
```typescript
import { Stack, StackProps, CfnOutput } from 'aws-cdk-lib';
import { Construct } from 'constructs';
import * as lambda from 'aws-cdk-lib/aws-lambda';
import * as lambdaNodeJS from 'aws-cdk-lib/aws-lambda-nodejs';

export class LfurlStack extends Stack {
  constructor(scope: Construct, id: string, props?: StackProps) {
    super(scope, id, props);
    const f = new lambdaNodeJS.NodejsFunction(this, 'test', {
      entry: 'function.ts',
      architecture: lambda.Architecture.ARM_64,
    });
    const functionUrl = f.addFunctionUrl({
      authType: lambda.FunctionUrlAuthType.NONE,
    });
    new CfnOutput(this, 'functionUrl', {
      value: functionUrl.url,
    });
  }
}
```
Migrating a Go function requires a shift of runtime from `awslambda.Runtime_GO_1_X()` to `awslambda.Runtime_PROVIDED_AL2()` as well as shifting the `Architecture` field to `awslambda.Architecture_ARM_64()`.
In my production account, this caused AWS Security Hub to complain that I was using an unsupported Lambda runtime, even though `provided.al2` is supported. In fact, the custom Amazon Linux 2 runtime is more up-to-date than the built-in Go runtime [2].
This blog post from Capital One [3] also describes using a build tag to reduce the binary size by stripping out some RPC code (no longer required, since we're using `provided.al2`) from the output binary, making it start up faster.
I tried it with and without. Adding `-tags lambda.norpc` to my build config took 280KB off the output binary size, but made negligible difference to cold start times. However, since it's supported and tested in the AWS libraries [4], I stuck with it.
```go
package main

import (
	"github.com/aws/aws-cdk-go/awscdk/v2"
	"github.com/aws/aws-cdk-go/awscdk/v2/awslambda"
	awslambdago "github.com/aws/aws-cdk-go/awscdklambdagoalpha/v2"
	"github.com/aws/constructs-go/constructs/v10"
	jsii "github.com/aws/jsii-runtime-go"
)

func NewExampleStack(scope constructs.Construct, id string, props *awscdk.StackProps) awscdk.Stack {
	stack := awscdk.NewStack(scope, &id, props)
	bundlingOptions := &awslambdago.BundlingOptions{
		GoBuildFlags: &[]*string{jsii.String(`-ldflags "-s -w" -tags lambda.norpc`)},
	}
	f := awslambdago.NewGoFunction(stack, jsii.String("handler"), &awslambdago.GoFunctionProps{
		Runtime:      awslambda.Runtime_PROVIDED_AL2(),
		Architecture: awslambda.Architecture_ARM_64(),
		Entry:        jsii.String("../lambda"),
		Bundling:     bundlingOptions,
	})
	// Add a Function URL.
	url := f.AddFunctionUrl(&awslambda.FunctionUrlOptions{
		AuthType: awslambda.FunctionUrlAuthType_NONE,
	})
	awscdk.NewCfnOutput(stack, jsii.String("lambdaFunctionUrl"), &awscdk.CfnOutputProps{
		ExportName: jsii.String("lambdaFunctionUrl"),
		Value:      url.Url(),
	})
	return stack
}
```
I use a minor variation on the Vercel example [5] to create my production Next.js Docker containers.
I could build and run the Dockerfile perfectly on my ARM Mac already, so I didn't have to make any changes.
However, if you're customising your Docker images to use additional tools, you might need to make sure that you're downloading the appropriate x86 or ARM version of binary distributions.
I found that using `dpkg` to get the architecture and to normalise the filenames made it simpler for me.
```dockerfile
RUN curl -fsSL -o awscli_amd64.zip https://awscli.amazonaws.com/awscli-exe-linux-x86_64-2.7.12.zip
RUN curl -fsSL -o awscli_arm64.zip https://awscli.amazonaws.com/awscli-exe-linux-aarch64-2.7.12.zip
RUN curl -fsSL -o go_amd64.tar.gz "https://go.dev/dl/go1.18.3.linux-amd64.tar.gz"
RUN curl -fsSL -o go_arm64.tar.gz "https://go.dev/dl/go1.18.3.linux-arm64.tar.gz"

# Use the specific architectures.
RUN mv "/downloads/awscli_$(dpkg --print-architecture).zip" /downloads/awscli.zip
RUN mv "/downloads/go_$(dpkg --print-architecture).tar.gz" /downloads/go.tar.gz
```
CDK changes were minimal. I only had to change the `platform` of the `DockerImageAsset` to `LINUX_ARM64` and the `cpuArchitecture` of the `FargateTaskDefinition` to `ARM64`.
```typescript
import { Stack, StackProps, CfnOutput } from 'aws-cdk-lib';
import { Construct } from 'constructs';
import { DockerImageAsset, Platform } from 'aws-cdk-lib/aws-ecr-assets';
import * as path from 'path';
import { ApplicationLoadBalancedFargateService } from 'aws-cdk-lib/aws-ecs-patterns';
import { ContainerImage, FargateTaskDefinition, CpuArchitecture, OperatingSystemFamily } from 'aws-cdk-lib/aws-ecs';

export class ArmTestStack extends Stack {
  constructor(scope: Construct, id: string, props?: StackProps) {
    super(scope, id, props);
    const image = new DockerImageAsset(this, "ArmNodeExample", {
      directory: path.join(__dirname, "../node-docker-example"),
      platform: Platform.LINUX_ARM64,
    });
    const taskDefinition = new FargateTaskDefinition(this, "TaskDef", {
      runtimePlatform: {
        operatingSystemFamily: OperatingSystemFamily.LINUX,
        cpuArchitecture: CpuArchitecture.ARM64,
      },
      cpu: 1024,
      memoryLimitMiB: 2048,
    });
    taskDefinition.addContainer("Web", {
      portMappings: [{ containerPort: 3000 }],
      image: ContainerImage.fromDockerImageAsset(image),
    });
    const service = new ApplicationLoadBalancedFargateService(this, "LoadBalancedService", {
      assignPublicIp: true,
      taskDefinition,
    });
    new CfnOutput(this, "endpointURL", {
      value: service.loadBalancer.loadBalancerDnsName,
    });
  }
}
```
The process for updating the Go-based Docker CDK projects was exactly the same as for migrating the Node.js Docker containers: just set the Docker image platform and the task definition's runtime platform to ARM64.
The `DockerImageAsset` construct in CDK is smart enough to compile the Go code with the `GOARCH` environment variable set to build for ARM64.
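The cross-compilation itself happens inside the Docker build. A minimal multi-architecture Go Dockerfile looks something like the sketch below - not the project's actual Dockerfile, and it assumes pure-Go code (no cgo) with BuildKit supplying the `BUILDPLATFORM`, `TARGETOS`, and `TARGETARCH` build arguments:

```dockerfile
# Run the build stage on the host's native platform, targeting the image's platform.
FROM --platform=$BUILDPLATFORM golang:1.18 AS build
ARG TARGETOS
ARG TARGETARCH
WORKDIR /src
COPY . .
# Cross-compile for the target platform, e.g. GOARCH=arm64.
RUN CGO_ENABLED=0 GOOS=$TARGETOS GOARCH=$TARGETARCH go build -o /app .

FROM gcr.io/distroless/static
COPY --from=build /app /app
ENTRYPOINT ["/app"]
```

Because the build stage runs natively and only the Go compiler targets ARM64, this avoids running the whole compilation under QEMU emulation.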
For TypeScript / Node.js and Go, these all just worked without any changes. Nothing to do at all!
For a simple "Hello World" app, I just needed to add the QEMU and Docker Buildx support to allow the x86 runners to be able to build for ARM64, and that was it.
```yaml
- name: Set up QEMU
  uses: docker/setup-qemu-action@v2
- name: Set up Docker Buildx
  id: buildx
  uses: docker/setup-buildx-action@v2
```
Unfortunately, when I tried to get the Next.js multi-stage example working, I got errors like this:
```
Step 4/23 : COPY package.json yarn.lock* package-lock.json* pnpm-lock.yaml* ./
failed to get destination image "sha256:b146e0c3ee7ccd3e761491562dfa9c96075c7ed9932c6237a9217cc13fb6c527": image with reference sha256:b146e0c3ee7ccd3e761491562dfa9c96075c7ed9932c6237a9217cc13fb6c527 was found but does not match the specified platform: wanted linux/arm64, actual: linux/amd64
```
It seems that Github Actions can also get confused between ARM and x86 images, and might need a reminder to use the ARM version (run `docker pull --platform=linux/arm64 container:label`). Unfortunately, I ran into other weirdness around builds, and wasted a lot of time trying to work out what the problem was.
I "fixed" everything by moving to a self-hosted Github runner that runs on an ARM instance within AWS.
I didn't want to run my own CI/CD instance, but the CDK setup is fairly straightforward, and it comes with some benefits:
```go
package main

import (
	_ "embed"

	"github.com/aws/aws-cdk-go/awscdk"
	"github.com/aws/aws-cdk-go/awscdk/awsec2"
	"github.com/aws/aws-cdk-go/awscdk/awsiam"
	"github.com/aws/aws-cdk-go/awscdk/awsssm"
	"github.com/aws/constructs-go/constructs/v3"
	jsii "github.com/aws/jsii-runtime-go"
)

type CIRunnerStackProps struct {
	awscdk.StackProps
	VPCID string
}

//go:embed userdata.sh
var userData string

func NewCIRunnerStack(scope constructs.Construct, id string, props *CIRunnerStackProps) awscdk.Stack {
	var sprops awscdk.StackProps
	if props != nil {
		sprops = props.StackProps
	}
	stack := awscdk.NewStack(scope, &id, &sprops)
	vpc := awsec2.Vpc_FromLookup(stack, jsii.String("SharedVpc"), &awsec2.VpcLookupOptions{
		VpcId: &props.VPCID,
	})
	role := awsiam.NewRole(stack, jsii.String("BuildServerRole"), &awsiam.RoleProps{
		AssumedBy: awsiam.NewServicePrincipal(jsii.String("ec2.amazonaws.com"), nil),
	})
	machineImage := awsec2.MachineImage_FromSSMParameter(jsii.String("/aws/service/canonical/ubuntu/server/focal/stable/current/arm64/hvm/ebs-gp2/ami-id"), awsec2.OperatingSystemType_LINUX, nil)
	// Add an EC2 instance running ARM64.
	instance := awsec2.NewInstance(stack, jsii.String("BuildServer"), &awsec2.InstanceProps{
		InstanceType:     awsec2.InstanceType_Of(awsec2.InstanceClass_BURSTABLE4_GRAVITON, awsec2.InstanceSize_SMALL),
		MachineImage:     machineImage,
		Vpc:              vpc,
		AllowAllOutbound: jsii.Bool(true),
		BlockDevices: &[]*awsec2.BlockDevice{
			{
				DeviceName: jsii.String("/dev/sda1"),
				Volume: awsec2.BlockDeviceVolume_Ebs(jsii.Number(128), &awsec2.EbsDeviceOptions{
					DeleteOnTermination: jsii.Bool(true),
					VolumeType:          awsec2.EbsDeviceVolumeType_GP3,
				}),
				MappingEnabled: jsii.Bool(true),
			},
		},
		DetailedMonitoring:        jsii.Bool(true),
		RequireImdsv2:             jsii.Bool(true),
		Role:                      role,
		SecurityGroup:             nil,
		UserData:                  awsec2.UserData_Custom(&userData),
		UserDataCausesReplacement: jsii.Bool(true),
		VpcSubnets:                &awsec2.SubnetSelection{},
	})
	instance.Role().AddManagedPolicy(awsiam.ManagedPolicy_FromAwsManagedPolicyName(jsii.String("AmazonSSMManagedInstanceCore")))
	// Enable CloudWatch Agent (https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/install-CloudWatch-Agent-on-EC2-Instance.html).
	instance.Role().AddManagedPolicy(awsiam.ManagedPolicy_FromAwsManagedPolicyName(jsii.String("CloudWatchAgentServerPolicy")))
	// Give the Github Runner instance access to the secrets it needs to register itself.
	patParameter := awsssm.StringParameter_FromSecureStringParameterAttributes(stack, jsii.String("GithubPAT"), &awsssm.SecureStringParameterAttributes{
		ParameterName: jsii.String("/github/actions/runner/pat"),
	})
	patParameter.GrantRead(instance)
	// aws ssm start-session --target ${INSTANCE_ID} --region=eu-west-1
	awscdk.NewCfnOutput(stack, jsii.String("InstanceID"), &awscdk.CfnOutputProps{
		Value: instance.InstanceId(),
	})
	return stack
}
```
You might have noticed the use of two external dependencies. One is the `userdata.sh` file used to configure the instance, the other is a Github Personal Access Token which I'd previously added to SSM parameter store.
The Github Personal Access token is used to register the runner with Github Actions. It must be generated by an owner of the Organisation, and also have the `admin:org` and `admin:enterprise` scopes enabled.
The `userdata.sh` file installs Docker, the AWS CLI, and then installs the Github Actions runner.
Note that I've hard coded it to work for `https://github.com/a-h` - you'll need to adjust this to match your Github organisation.
```shell
#!/bin/sh

# Install Docker.
apt-get -y update
apt-get -y install htop jq apt-transport-https ca-certificates curl gnupg lsb-release unzip
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | gpg --dearmor -o /usr/share/keyrings/docker-archive-keyring.gpg
echo "deb [arch=arm64 signed-by=/usr/share/keyrings/docker-archive-keyring.gpg] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable" | tee /etc/apt/sources.list.d/docker.list > /dev/null
apt-get -y update
apt-get install -y docker-ce docker-ce-cli containerd.io
usermod -a -G docker ubuntu
systemctl start docker
systemctl enable docker

# Install AWS CLI.
apt-get install -y awscli

# https://docs.github.com/en/actions/hosting-your-own-runners/configuring-the-self-hosted-runner-application-as-a-service
# Get the personal access token from SSM.
export GITHUB_PAT=`aws ssm get-parameter --region=eu-west-1 --name="/github/actions/runner/pat" --query "Parameter.Value" --output text --with-decryption`

# Install actions-runner.
mkdir -p /opt/actions-runner
cd /opt/actions-runner
curl -o actions-runner-linux-arm64-2.294.0.tar.gz -L https://github.com/actions/runner/releases/download/v2.294.0/actions-runner-linux-arm64-2.294.0.tar.gz
echo "98c34d401105b83906fd988c184b96d1891eaa1b28856020211fee4a9c30bc2b actions-runner-linux-arm64-2.294.0.tar.gz" | shasum -a 256 -c
tar xzf ./actions-runner-linux-arm64-2.294.0.tar.gz

echo "Configuring"
sudo chown -R github:github /opt/actions-runner
sudo RUNNER_ALLOW_RUNASROOT=1 ./config.sh --unattended --work "_work" --url https://github.com/a-h --pat $GITHUB_PAT --replace --name "ARM Runner"

echo "Installing"
sudo ./svc.sh install

# Start service.
sudo ./svc.sh start
```
To use the runner, I had to set the `runs-on` field in the Github Actions YAML to target it.
```yaml
jobs:
  deploy:
    runs-on: [self-hosted, Linux, ARM64]
```
Built-in ARM64 runners for Github Actions would be really helpful. Unfortunately, they're not even on the Github roadmap. There's a feedback item at [7] if you'd like to vote for it.
The Go ARM Lambda functions started faster than before. It's clearly visible in the graph when I switched.
However, it could be that the performance improvement is down to removing the Lambda RPC code and migrating to AL2, rather than to ARM itself.
./go_arm_lambda_init_duration.png
There was no detectable shift in duration, probably because it's a highly network-bound service.
I didn't spot any performance difference between the ARM and x64 versions of the Next.js container I migrated, so it's simply cheaper for this use case.
ARM Lambda functions and Fargate tasks are my default for new workloads now.
It was really easy to migrate AWS Lambda functions. While there's a risk in the migration, the payoff seems to be a faster cold start and 20% off the cost.
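As a rough check on that 20% figure, here's the saving calculated from the published Lambda duration prices at the time of writing (roughly $0.0000166667 per GB-second for x86 and $0.0000133334 for ARM in us-east-1 - check the current pricing page for your region, as these numbers are an assumption that may have changed):

```go
package main

import "fmt"

func main() {
	// Approximate Lambda duration prices in USD per GB-second (assumed us-east-1, mid-2022).
	const x86 = 0.0000166667
	const arm = 0.0000133334
	fmt.Printf("ARM saving on duration cost: %.0f%%\n", (1-arm/x86)*100) // prints: ARM saving on duration cost: 20%
}
```

Note that this only covers the duration charge; per-request pricing is the same for both architectures, so total savings on a real workload will be slightly lower.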
Docker workloads are also easy to migrate, but Github Actions spoiled it by not having hosted ARM build servers. Still, maintaining the self-hosted Github Actions runner isn't much effort at all; it's been running for a few weeks without problems.