💾 Archived View for wilmhit.pw › blog › docker-is-flawed.gmi captured on 2024-05-10 at 10:24:40. Gemini links have been rewritten to link to archived content

---
title: "Docker is flawed"
date: 2023-12-28
draft: false
---

Imagine yourself in 2010 as an application developer. You're working on your
web application. Suddenly a few red lines show up on the screen: *Port already
bound*. You think to yourself: *This is madness! Why do I have to deal with
this?* Then you remember an approach you used previously to isolate an
environment: chroot. The issue is that chroot only isolates the filesystem. It
won't separate network interfaces. You need a stronger chroot, but virtualizing
the entire system is too much hassle.

That probably isn't how docker came to be. It's just how I imagine it. At the
time the idea was truly great. However, it didn't account for one thing: how
popular docker would become.

My stance on docker

This article may seem like a rant on what docker is and how it's used, but I
want to make this clear: Docker[^1] has already made my life easier. I really
like it and for the foreseeable future I will continue using it. It seems that
the more I use it, the better my opinion of it gets. Now, without further ado,
let's get on to the list of issues.

Now we need docker in docker

Nowadays containers are used in pretty much every corner of software
engineering. Hold your horses: this statement is true. You just need to think
a little wider.

If your application is not containerized, it may still be built by
containerized CI runners. If not, then maybe a tool used during development was
a container. Maybe the monitoring system for your infrastructure is based on
docker. Probably your database deployment was a compose file. Maybe you don't
even know about it. The point is: containers are now everywhere.

This is a problem.

The more containers there are, the more likely it is that you will have to
launch one container inside another. There is a tool for that. It's called DinD
(Docker in Docker).

DinD has its own problems. It needs to run as a privileged container, and it
forces you to use a specific set of images (DinD-compatible). You can also make
your own images, but it's difficult and you're duplicating work.

You can bypass that using DooD (Docker outside of Docker), where the container
talks to the docker daemon of the host. Neither solution is ideal. DooD gives
up separation for the sake of allowing you to use any image. If anything goes
wrong inside your container, it can do the equivalent of `docker system prune`
on the host.
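
The difference between the two approaches can be sketched in shell. This is only an illustration under a few assumptions: the official `docker:dind` and `docker:cli` image tags are used, the host socket lives at `/var/run/docker.sock`, and the function names are made up for this example.

```shell
# Allow the docker binary to be swapped out (for example in tests).
DOCKER="${DOCKER:-docker}"

# DinD: start a nested daemon in a container named "$1".
# It needs --privileged and a DinD-capable image.
start_dind() {
    "$DOCKER" run --privileged --detach --name "$1" docker:dind
}

# DooD: no nested daemon. The host's socket is bind-mounted into the
# container, so the docker client inside controls the host daemon directly.
run_dood() {
    "$DOCKER" run --rm \
        --volume /var/run/docker.sock:/var/run/docker.sock \
        docker:cli docker "$@"
}
```

With this sketch, `run_dood ps` lists the containers of the host, not of some inner daemon. That is exactly why a cleanup command issued from inside such a container hits everything on the host.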

The correct way is the difficult way

How many of your docker setups use a rootless docker installation? How is it
possible that a plain `apt install docker.io` creates such a big security risk?

In case you didn't know: a default installation of docker runs as root. This
means that the user with id 0 (root) in the container is the same user as the
user with id 0 outside the container, on the host machine. If you are able to
spin up a root shell in the container, there are many things you can do to
"escape" it and become root on the host.

This is by design. This is not a bug. This is why so many systems require you
to type `sudo` before docker commands. It also means that you should think
twice before adding a user to the "docker" group.
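
A quick way to see who is affected is to check group membership. The snippet below is a minimal sketch; the function name `in_docker_group` is made up here, and it assumes the group is literally called `docker`, as it is on most installations.

```shell
# Succeed if the given space-separated group list contains "docker".
# With a root daemon, membership in that group is effectively root access.
in_docker_group() {
    case " $1 " in
        *" docker "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Check the groups of the current user.
if in_docker_group "$(id -nG)"; then
    echo "warning: $(id -un) can control the docker daemon without sudo"
fi
```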

There is a way to prevent this and it's called "rootless docker". Podman
does not use root by default. The biggest problem is how little those
solutions are used.

Podman does the right thing and cannot be blamed here for not having market
dominance. Docker works like this for compatibility with previous versions.
I am still gonna put the blame on them. There are no plans to deprecate the
root docker installation. Rootless docker has many downsides and they haven't
been addressed for years now. Some of them were, though not by Docker Inc. but
by the community. The company makes absolutely no effort to make the rootless
installation easier[^3].
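
The uid mapping that rootless mode depends on is configured in `/etc/subuid`, where each line has the form `name:first_uid:count`. Here is a minimal sketch of a check for that prerequisite. The function name `subuid_range` is hypothetical, and a real rootless setup needs more than this one file (for example a matching `/etc/subgid` entry).

```shell
# Print the subordinate UID range assigned to user "$1", or fail if none exists.
# The file to read defaults to /etc/subuid and can be overridden with "$2".
subuid_range() {
    user="$1"
    file="${2:-/etc/subuid}"
    awk -F: -v u="$user" '
        $1 == u { print $2 "-" ($2 + $3 - 1); found = 1 }
        END     { exit !found }
    ' "$file"
}
```

Running `subuid_range "$(id -un)"` prints the range available to the current user, or exits with a non-zero status if the user cannot map any uids yet.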

Version incompatibilites

Have you ever typed in `docker compose` only to find this:

docker: 'compose' is not a docker command.
See 'docker --help'

Or maybe you typed `docker-compose` and found this:

docker-compose: command not found...

Yeah. It's a mess. `docker compose` and `docker-compose` are two very different
things. Officially they are called Compose v2 and v1. But there is also a third
element: the standalone version of Compose v2.

Let me explain.

In its early days docker didn't have a plugin mechanism. There was a separate
project called compose. Some time later it was decided to make it a docker
plugin. As such, the CLI command `docker-compose` became a subcommand of
`docker`.

After this change, to ease the transition, the Compose v2 standalone version
was created. It allowed you to keep using the old command `docker-compose`.

Now we're at the point where Compose v1 is deprecated. Some systems have
already switched to Compose v2; some ~~(mainly Debian)~~ are still using v1. If
you take a random distribution and install compose from its package manager,
you don't know what to expect. Some distros ship the v2 standalone version by
default, along with the normal v2 plugin.
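
Scripts that have to run on more than one distribution usually end up probing for whichever variant is present. A minimal sketch of such a probe follows; the function name `detect_compose` is made up for this example.

```shell
# Print the command that invokes Compose on this machine, or fail if none is found.
detect_compose() {
    if docker compose version >/dev/null 2>&1; then
        echo "docker compose"        # Compose v2 installed as a CLI plugin
    elif command -v docker-compose >/dev/null 2>&1; then
        echo "docker-compose"        # either v1 or the v2 standalone binary
    else
        return 1
    fi
}
```

Note that the second branch cannot tell v1 from v2 standalone by the command name alone. The version number the binary reports is the only thing that distinguishes them.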

Repository in the name? Tags? Versioning?

Versioning of docker images works differently than in any other system. That
alone is no reason to hold it against docker, so let's list other reasons.

We can always use hashes to identify images. We don't like hashes. Humans
cannot memorize them. The average docker image identifier that is not a hash
looks like this:

codeberg.org/wilmhit/akkoma-ebooks:stable

1. The image name also contains the registry. That means that if you use
   another registry to obtain that image, it will be treated as a different
   image.
2. It is even worse if you use compose files to launch a multi-container
   project. To use the same image from a different registry you would have to
   change the compose files directly after obtaining them.
3. There is no way to really know the version just from whatever you find after
   `:`. You may think that if it says `:3.3.10` it will always download the
   same image, but that's not true. The developer may just as well replace
   3.3.10 on the registry with an updated version. We had to implement the
   `@sha256` notation to ensure images are always the same.
4. Since all tags work just like git branches, it all confuses image developers
   too. Different images use different tags to mark a "stable" or "tested"
   release. The only tag that is special is `:latest`, and that is because it
   is assumed when no tag is given.
5. All projects that use semantic versioning should always push 4 tags to
   their registry at every release. Let's say we just released version
   "1.2.3". We should push tags `:latest`, `:1.2.3`, `:1.2`, `:1`. Multiply
   that by the number of registries we use.
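
The last point can be sketched as a small helper that derives the tags from a version number. This is only an illustration: the function name `semver_tags`, the registry hosts and the image name `myapp` are all hypothetical, and the loop prints the push commands instead of running them.

```shell
# Print every tag that a release of version MAJOR.MINOR.PATCH should move.
semver_tags() {
    version="$1"
    major="${version%%.*}"         # 1.2.3 -> 1
    major_minor="${version%.*}"    # 1.2.3 -> 1.2
    printf '%s\n' latest "$version" "$major_minor" "$major"
}

# Hypothetical registries and image name, for illustration only.
for registry in registry.example.org mirror.example.net; do
    for tag in $(semver_tags 1.2.3); do
        echo "docker push $registry/myapp:$tag"
    done
done
```

Two registries already mean eight pushes for a single release. The helper deliberately ignores pre-release versions, which should not move `:latest` at all.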

Packaging

We have to talk about packaging. Containers are really the first system to
stray away from letting users have an image as a file on their system. All
other packaging formats let you "just download a file". When your build ends,
you usually end up with a file. It may be a:

- `.deb` package for use in apt

- `.rpm` package for use in dnf/yum/zypper

- `.msi` installer for windows

- `.jar` archive for java

- binary file for your system after compiling C code.

But in docker you end up with an image and a bunch of build cache somewhere in
the docker engine. Only afterwards can you use `docker save` to write that to a
`tar` file. This is completely useless, however, since you can neither upload
that file to a registry nor download it from one. You *have* to have a
connection to a registry on the target deployment machine; otherwise you will
be using unconventional means of transferring a tar file.
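
For completeness, the registry-less route looks roughly like this. It is a minimal sketch built on `docker save` and `docker load`; the wrapper names `export_image` and `import_image` are made up, and how the archive travels between machines is left out.

```shell
# Allow the docker binary to be swapped out (for example in tests).
DOCKER="${DOCKER:-docker}"

# Write image "$1" (all layers and metadata) to the tar archive "$2".
export_image() {
    "$DOCKER" save --output "$2" "$1"
}

# Load a previously saved archive "$1" into the local engine.
import_image() {
    "$DOCKER" load --input "$1"
}
```

On the build machine this would be `export_image python:3.11 python.tar`, followed by copying the file by whatever means are available, and `import_image python.tar` on the target.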

Inconsistent CLI

Docker (specifically docker, not other runtimes) has a hell-of-inconsistent
CLI. It is especially annoying when writing scripts. Not only does
`docker stats` take over your terminal in a semi-curses mode whereas
`docker ps` does not. Try using custom format options too. It is really hard to
guess the valid names for all the variables. An example is below.

$ docker ps
CONTAINER ID   IMAGE         COMMAND   CREATED         STATUS         PORTS     NAMES
b2b6051d810f   python:3.11   "bash"    3 seconds ago   Up 2 seconds             relaxed_hawking

*Here the output is a single table with headings.*

$ docker ps --format json | jq
{
  "Command": "\"bash\"",
  "CreatedAt": "2023-12-25 21:14:51 +0100 CET",
  "ID": "b2b6051d810f",
  "Image": "python:3.11",
  "Labels": "",
  "LocalVolumes": "0",
  "Mounts": "",
  "Names": "relaxed_hawking",
  "Networks": "bridge",
  "Ports": "",
  "RunningFor": "2 minutes ago",
  "Size": "0B",
  "State": "running",
  "Status": "Up 2 minutes"
}

*Here we can list all containers as JSON, but the keys are not the same as in the table.*

$ docker stats

*This takes the entire screen.*

$ docker top b2b6051d810f

*This table is so long it won't fit the screen even though all the others did.*

$ docker top --help
Usage:  docker top CONTAINER [ps OPTIONS]

*So I can use options from the ps command, huh?*

$ docker top --format json
unknown flag: --format
See 'docker top --help'.

*It doesn't seem so :/ Those turn out to be options of the system `ps`, not of `docker ps`.*
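
One way to cope with the naming problem is to treat the keys of the JSON output as the list of valid template fields, since `docker ps` accepts them as `{{.Key}}` placeholders. The sketch below assumes that, and the wrapper name `list_containers` is made up for this example.

```shell
# Allow the docker binary to be swapped out (for example in tests).
DOCKER="${DOCKER:-docker}"

# Print a custom table built from field names taken from the JSON keys
# (ID, Names, RunningFor) instead of the column headings.
list_containers() {
    "$DOCKER" ps --format 'table {{.ID}}\t{{.Names}}\t{{.RunningFor}}'
}
```

The heading `CREATED` from the default table would not work as a field name here. The matching field is `RunningFor`, which is only discoverable through the JSON output or the documentation.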

One admirable thing is that docker didn't fall for the trap that Go designers
set for Go programmers. If you didn't know, pretty much the entire container
ecosystem is written in Go. Go's standard library provides a way of parsing
arguments from the CLI. The sad thing here is that it **doesn't distinguish**
short options from long ones: `-name` and `--name` are treated the same, and
single-letter options cannot be combined. It misses the GNU-style convention
that most Unix tools follow.

Wait, are registry frontends really that bad?

Since containers are really used everywhere now, you would think there is a
wide market of infrastructure solutions. This is only somewhat true.

It is hard for me to say why this happened, but maybe the whole ecosystem
expands at a speed that outpaces its own solutions.

As an example, please try to find a really simple frontend that will display
not only images but also other types of artifacts. For more than a few versions
now you have been able to upload any file as an ORAS artifact to an OCI
registry. There are only 2 registry frontends I could find that are fully open
source, fairly light (that excludes Harbor) and can be used to just display
what is in the registry.

I don't want to redeploy my registry. I don't want to have a bunch of extra
functionality. I just want to display what's on the `/v2/_catalog` endpoint in
a fairly pretty manner. I'm sure this is not a niche use case.
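
For reference, the raw data such a frontend would have to render is two HTTP requests away. This is a minimal sketch assuming a registry that implements the catalog endpoint, is reachable over HTTPS and needs no authentication; the function names are made up, and any host passed in is up to the reader.

```shell
# Allow the HTTP client to be swapped out (for example in tests).
CURL="${CURL:-curl}"

# Fetch the list of repositories from registry host "$1".
# The response is JSON of the form {"repositories": [...]}.
list_repositories() {
    "$CURL" --silent --fail "https://$1/v2/_catalog"
}

# Fetch the tags of repository "$2" on registry host "$1".
# The response is JSON of the form {"name": "...", "tags": [...]}.
list_tags() {
    "$CURL" --silent --fail "https://$1/v2/$2/tags/list"
}
```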

![Registry frontend screenshot](/image/registry-frontend.png)

![Registry UI screenshot](/image/registry-ui.png)

Both of those frontends do a bad job of displaying anything other than images.
One completely skips ORAS artifacts in its listing and the other one displays
only the names. All other fields are *null*.

Swarm

Please Docker Inc, I'm not going to use it. Don't push it. K8s won. Get over

it. This is annoying.

Mainly because the documentation is mixed. If you browse through the docker
documentation you find links to many interesting features. The problem is that
many of them link to swarm documentation.

This is really a nitpick, but an annoying one.

Done the open source homework. Now we make money.

One more annoying thing is how *Docker Inc.* is managed. In recent years their
policy on software freedom has changed. Previously they wrote free software
(although under the OSS flag, not under the FOSS flag[^2]). Now it's all about
docker swarm, Docker Desktop and the centralized docker hub.

I don't regard services where someone else has to pay for server upkeep and
maintenance as strictly evil. I would be happy to pay for storing my images on
dockerhub if that service weren't treated by their software as better than all
other registries.

I can also mention what happened to Docker Desktop. This application
transitioned from just closed-source to paid (for some customers). It is
absolutely absurd to me that companies just keep getting baited by these tricks
in this day and age.

Image usage hints

A Dockerfile can include quite a lot of hints that tell you how to use the
image later on. This information should be exposed to users.

Right now only the ports are. If you specify a port in your Dockerfile, it
will appear in the output of the `docker ps` command. Where are the volumes
that were specified? The defaults for environment variables? This is all
information that was included during the image build. It can be dug up with
the `inspect` commands.
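
Digging it up looks roughly like this. This is a sketch assuming the usual layout of the `docker image inspect` output, where the declared ports, volumes and environment live under `.Config`; the wrapper name `image_hints` is made up for this example.

```shell
# Allow the docker binary to be swapped out (for example in tests).
DOCKER="${DOCKER:-docker}"

# Print the ports, volumes and environment defaults declared in image "$1".
image_hints() {
    "$DOCKER" image inspect --format \
        'ports={{json .Config.ExposedPorts}} volumes={{json .Config.Volumes}} env={{json .Config.Env}}' \
        "$1"
}
```

Calling `image_hints python:3.11` prints all three on one line as JSON fragments, which is workable in scripts but hardly something a user would discover by accident.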

There is also a part that wasn't included: the part that lives in the readme.
Details like which ports are crucial for the application and which ports can be
left unexposed. Which volumes need to be kept across application restarts.
Which environment variables need to be provided additionally.

I would like to have all that on the command line. I would like to be able to
launch containers without being forced to browse README files (which very
often are not even included on Dockerhub). Extra commands could be added to
Dockerfiles so that the docker engine can tell you that your command line is
incomplete instead of launching the app just so it can shut down a second later
with an error message.

[^1]: For the purposes of this article I assume that all containers are
Docker-based unless explicitly specified otherwise.

[^2]: With this statement I mean how the company was marketing itself.

[^3]: Rootless installation isn't as straightforward as you might think, as it
requires giving a particular user the ability to map uids.