💾 Archived View for silbernagel.dev › gemlog › distributed-sqlite-with-elixir captured on 2023-12-28 at 15:05:11. Gemini links have been rewritten to link to archived content
⬅️ Previous capture (2023-11-04)
-=-=-=-=-=-=-
__ __ __ __ ___ _/ / ___ ____ ___ __ _____/ /__ ___ _/ /_/ / / _ `/ _ \/ _ `/ _ \/ _ \/ // / _ / -_) _ `/ __/ _ \ \_,_/_//_/\_,_/ .__/ .__/\_, /\_,_/\__/\_,_/\__/_//_/ /_/ /_/ /___/
Posted on Sun, January 08 2023
I recently read about a newer SQLite database tool called Litestream which creates backups of the database in S3 compatible storage after every transaction. Litestream restores from that backup, so when scaling horizontally to a new server, the latest version of the DB will be available. Fly.io seems like the ideal platform and Elixir the perfect language for this solution.
See the companion repository for reference
Create a new phoenix app that uses SQLite as the database:
$ mix phx.new distributed_sqlite --database sqlite3
Launch it as a new fly app:
$ fly launch
An environment variable, `DATABASE_PATH`, is needed to indicate which file to use for the SQLite database. Open the `fly.toml` file and add `DATABASE_PATH = /app/distributed_sqlite.db` (use any database name you want here) under the `[env]` section and try to deploy.
$ flyctl deploy
Success. A Phoenix app running on Fly using SQLite. Now, to see it in action.
Lets build a simple, naive counter that just counts the views of each page.
$ mix phx.gen.schema Counter.PageCount page_counts page:string count:integer $ mix ecto.migrate
Add a `Counter` module which can add page view counts
# lib/distributed_sqlite/counter.ex defmodule DistributedSqlite.Counter do alias DistributedSqlite.Counter.PageCount alias DistributedSqlite.Repo def count_page_view(page_name) do page_count = Repo.get_by(PageCount, page: page_name) case page_count do nil -> %PageCount{} |> PageCount.changeset(%{count: 1, page: page_name}) |> Repo.insert() %PageCount{} = page_count -> page_count |> PageCount.changeset(%{count: page_count.count + 1}) |> Repo.update() end end end
Update the `page_controller` to count views
# lib/distributed_sqlite_web/controllers/page_controller.ex alias DistributedSqlite.Counter def index(conn, _params) do page_view = Counter.count_page_view("home") render(conn, "index.html", page_count: page_view.count) end
Display the counter on the page
<!-- lib/distributed_sqlite_web/templates/page/index.html.heex --> <h1> Page Views <%= @view_count %> </h1>
Now deploy again using `flyctl deploy` and then browse to your site to verify that the count shows and updates when refreshing.
The next problem to deal with is that the database will be wiped on our next deploy since it's using ephemeral storage.
One way to resolve this is to use a persistent volume (which should be done for production apps). But since this post is all about Litestream, so lets set that up and see how can help with this.
The first step is to create a bucket in some S3 compatible storage. I like to use Digital Ocean Spaces for this, but you can also use AWS if you want.
See Litestream docs for more options
When you have your storage setup, you'll need 3 pieces of information
1. the bucket URL
2. the access key
3. the secret key
A configuration file, `litestream.yml`, is needed for Litestream to function. Create one in the root of the project with the following text (remember to replace the path with YOUR path):
access-key-id: ${LITESTREAM_ACCESS_KEY_ID} secret-access-key: ${LITESTREAM_SECRET_ACCESS_KEY} dbs: - path: /app/distributed_sql.db replicas: - url: ${REPLICA_URL}
Now set the three variables in Fly to the values you recorded from when setting up the bucket.
$ flyctl secrets set REPLICA_URL=... LITESTREAM_ACCESS_KEY_ID=... LITESTREAM_SECRET_ACCESS_KEY=...
Next, add Litestream to our Docker image. Add the following lines to `Dockerfile` as part of the `builder` phase:
ADD https://github.com/benbjohnson/litestream/releases/download/v0.3.9/litestream-v0.3.9-linux-amd64-static.tar.gz /tmp/litestream.tar.gz RUN tar -C /usr/local/bin -xzf /tmp/litestream.tar.gz
And in the `runner` phase add:
COPY --from=builder /usr/local/bin/litestream /usr/local/bin/litestream COPY litestream.yml /etc/litestream.yml
Update the starting script so that the elixir release is a sub-process of litestream. The easiest way I've found to do this is to create a run script called `run.sh` with the following content:
#!/bin/bash set -e # Restore the database if it does not already exist. if [ -f /app/distributed_sql.db ]; then echo "Database already exists, skipping restore" else echo "No database found, restoring from replica if exists" litestream restore -v -if-replica-exists -o /app/distributed_sql.db "${REPLICA_URL}" fi # Run migrations /app/bin/migrate # Run litestream with your app as the subprocess. exec litestream replicate -exec "/app/bin/server"
Be sure to remove the migration script from fly.toml since it runs in the run.sh script now.
Now update the `Dockerfile` to use this new script to start the app:
COPY run.sh /scripts/run.sh RUN chmod 755 /scripts/run.sh CMD ["/scripts/run.sh"]
Deploying should now start using Litestream to restore the database on deploys and push backups when data changes. You can verify in the monitoring interface of fly. Look for something similar to the image below:
Fly logs showing that Litestream is running
With all of this in place, things would work great when running one instance of you app. But as soon as you add another node things get out of whack. Lets see this in action. Scale the app to 2 and see what happens to the data.
$ flyctl scale count 2
Enough refreshing the browser or opening in different windows/tabs and you'll start to see discrepancies in the view count. This is because we are not replicating the data between the instances. This is a problem Elixir is built for.
Follow the Fly guide to get clustering working
Once clustered, we can begin to replicate our database calls. Add a new GenServer with the following content:
# /lib/distributed_sqlite/repo_replication.ex defmodule DistributedSqlite.RepoReplication do @moduledoc """ Run on each node to handle replicating Repo writes """ use GenServer alias DistributedSqlite.Repo def start_link(args) do GenServer.start_link(__MODULE__, args, name: __MODULE__) end @impl true def init(_args) do {:ok, []} end def handle_cast({:replicate, query, :insert}, state) do Repo.insert!(query) {:noreply, state} end def handle_cast({:replicate, changeset, :update}, state) do Repo.update!(changeset) {:noreply, state} end end
Make sure to start it in the `application.ex`
# lib/distributed_sqlite/application.ex children = [ ..., {DistributedSqlite.RepoReplication, []} ]
Open the `DistributedSqlite.Repo` module and add a `replicate/2` function
@doc """ Replicate the query on the the other nodes in the cluster """ def replicate({:ok, data_to_replicate} = ret, operation) when operation in [:insert, :update] do _ = for node <- Node.list() do GenServer.cast( {DistributedSqlite.RepoReplication, node}, {:replicate, data_to_replicate, operation} ) end ret end def replicate({:error, _changeset} = ret, _), do: ret def replicate(%Ecto.Changeset{} = changeset, operation) when operation in [:insert, :update] do _ = for node <- Node.list() do GenServer.cast( {DistributedSqlite.RepoReplication, node}, {:replicate, changeset, operation} ) end {:ok, changeset} end def replicate(schema, :insert) do _ = for node <- Node.list() do GenServer.cast( {DistributedSqlite.RepoReplication, node}, {:replicate, schema, :insert} ) end {:ok, schema} end
This function can be piped into from an `Repo.insert` or the result of a `Repo.update`. Try this in the `DistributedSqlite.Counter` module:
case page_count do nil -> %PageCount{} |> PageCount.changeset(%{count: 1, page: page_name}) |> Repo.insert() |> Repo.replicate(:insert) %PageCount{} = page_count -> page_count |> PageCount.changeset(%{count: page_count.count + 1}) |> Repo.update() |> case do {:ok, cnt} -> cnt |> PageCount.replicate_changeset() |> Repo.replicate(:update) {:ok, cnt} end end
With this in place, deploy again. Your data is now consistent no matter which node serves the traffic.
I'm not sure how far this can be pushed and there are downsides to this approach, but I plan on continuing this journey that puts my data as close to the app as possible.
There is another worth exploring that removes the need to replicate the data on the application side. It is still in beta, but I plan on trying it out soon as well.