---
title: "Docker is flawed"
date: 2023-12-28
draft: false
---
Imagine yourself in 2010 as an application developer. You're working on your web
application. Suddenly a few red lines show up on the screen: *Port already
bound*. You think to yourself: *This is madness! Why do I have to deal with this?* <!--more-->
Then you remember an approach you used previously to isolate an environment:
chroot. The issue is that chroot only works for the filesystem. It won't separate
network interfaces. You need a stronger chroot, but virtualizing an entire system
is so much hassle.
That probably isn't how Docker actually came to be; it's just how I imagine it. At the
time the idea was truly great. However, it didn't account for one thing: how
popular Docker would become.
This article may seem like a rant on what Docker is and how it's used, but I
want to make this clear: Docker[^1] has already made my life easier. I really like
it, and for the foreseeable future I will continue using it. It seems the more I
use it, the better my opinion of it gets. Now, without further ado, let's get
to the list of issues.
Nowadays containers are used in pretty much every corner of software
engineering. Hold your horses: this statement is true. You just need to think
a little wider.
If your application is not containerized, it may still be built
by containerized CI runners. If not, then maybe a tool you used during development was a
container. Maybe the monitoring system for your infrastructure is based on
Docker. Maybe your database deployment was a compose file. Maybe you don't
even know about it. The point is: containers are now everywhere.
This is a problem.
The more containers there are, the more chances you will have to launch one
container inside another. There is a tool for that. It's called DinD (Docker in
Docker).
DinD has its own problems. It needs to run as a privileged container, and it
forces you to use a specific set of images (DinD-compatible). You can also
make your own images, but it's difficult and you're duplicating work.
You can bypass that using DooD (Docker outside of Docker). Neither solution is
ideal. DooD gives up separation for the sake of allowing you to use any image.
If anything goes wrong inside your container, it can do the equivalent of
`docker system prune` on the host.
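To make the difference concrete, here is a minimal sketch of both approaches. The image names are the usual upstream ones; treat this as an illustration rather than a recipe:

```
# DinD: run a nested Docker daemon inside a privileged container
docker run --privileged --name dind -d docker:dind

# DooD: reuse the host daemon by mounting its socket into the container;
# anything this container tells Docker to do happens on the host
docker run -v /var/run/docker.sock:/var/run/docker.sock -it docker:cli docker ps
```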
How many of your Docker setups use a rootless Docker installation? How is it
possible that `apt install docker` creates such a big security risk?
If you didn't know: the default installation of Docker runs as root. This means that
the user with id 0 (root) in the container is the same user as the user with id 0
outside the container, on the host machine. If you have the ability to spin up a root
shell in the container, there are many things you can do to "escape" it
and become root on the host.
This is by design. This is not a bug. This is why so many systems require you
to type `sudo` before docker commands. It also means that you should think
twice before adding a user to the "docker" group.
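A classic illustration: anyone allowed to start containers on a root Docker daemon can simply mount the host's filesystem and chroot into it. A sketch only, not something to run on machines you care about:

```
# Being in the "docker" group is effectively being root on the host:
docker run --rm -it -v /:/host alpine chroot /host sh
# You now have a root shell over the host's real filesystem.
```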
There is a way to prevent this, and it's called "rootless Docker". Podman
doesn't use root by default. The biggest problem is how little
used those solutions are.
Podman does the right thing and cannot be blamed here for not having market
dominance. Docker works like this for compatibility with previous versions.
I am still going to put the blame on them. There are no plans to deprecate the root
Docker installation. Rootless Docker has many downsides and they haven't been
addressed for years now. Some of them were, but not by Docker Inc.; the
community fixed them. The company makes absolutely no effort to make the rootless
installation easier[^3].
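For reference, this is roughly what a rootless setup involves today, assuming your distro ships Docker's rootless extras; the subordinate uid ranges are the part footnote 3 refers to:

```
# Each user needs a range of subordinate uids/gids for uid mapping
grep "$(whoami)" /etc/subuid /etc/subgid

# Then the per-user daemon is set up and pointed at by the client
dockerd-rootless-setuptool.sh install
systemctl --user enable --now docker
export DOCKER_HOST=unix:///run/user/$(id -u)/docker.sock
```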
Have you ever typed `docker compose` only to find this:
docker: 'compose' is not a docker command. See 'docker --help'
Or maybe you typed `docker-compose` and found this:
docker-compose: command not found...
Yeah. It's a mess. `docker compose` and `docker-compose` are two very different
things. Officially they are called Compose v2 and v1. But there is also a third
element: the Compose v2 standalone version.
Let me explain.
Docker in its early days didn't have a plugin mechanism. There was a separate
project called Compose. It was decided to make it a Docker plugin some time
later. As such, the CLI command `docker-compose` became a subcommand of
`docker`.
After this change, to ease the transition, the Compose v2 standalone version was
created. It lets you keep using the old command, `docker-compose`.
Now we're at the point where Compose v1 is deprecated. Some systems have
already switched to Compose v2; some ~~(mainly Debian)~~ are still using v1. If you
take a random distribution and install Compose from the package manager, you don't
know what to expect. Some distros ship the v2 standalone version by default, along
with the normal v2.
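If you're not sure which of the three you have in front of you, the version commands usually disambiguate. A quick sanity check, nothing more:

```
# Compose v2 as a docker CLI plugin
docker compose version

# Compose v1, or the v2 standalone binary, depending on what the distro shipped
docker-compose version
```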
Versioning Docker images works differently than in any other system. That alone
is not a reason to hold against it. So let's build up some other reasons.
We can always use hashes to identify images. We don't like hashes. Humans cannot
memorize them. An average Docker image identifier that is not a hash looks like
this:
codeberg.org/wilmhit/akkoma-ebooks:stable
1. The image name also contains the registry. That means if you use another
   registry to obtain the image, it will be treated as a different image.
2. It is even worse if you use compose files to launch a multi-container
   project. To use the same image from a different registry you have to
   edit the compose files directly after obtaining them.
3. There is no way to really know the version just from whatever you find after
   the `:`. You may think that if it says `:3.3.10` it will always download the
   same image, but that's not true. The developer may as well replace 3.3.10 on
   the registry with an updated version. We had to invent the `@sha256` notation to
   ensure images are always the same (see the sketch after this list).
4. Since all tags work just like git branches, this confuses image developers
   too. Different images use different tags to mark a "stable" or "tested"
   release. The only tag that is special is `:latest`, and that is only because it is
   assumed by default.
5. All projects that use semantic versioning should push four tags to
   their registry at every release. Let's say we just released version
   1.2.3. We should push the tags `:latest`, `:1.2.3`, `:1.2` and `:1`. Multiply that
   by the number of registries we use.
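A sketch of what points 3 and 5 mean in practice; the registry name and digest are made up for illustration:

```
# Point 5: one release, four tags, pushed to every registry you publish to
docker tag myapp:build registry.example.com/myapp:1.2.3
docker tag myapp:build registry.example.com/myapp:1.2
docker tag myapp:build registry.example.com/myapp:1
docker tag myapp:build registry.example.com/myapp:latest
docker push --all-tags registry.example.com/myapp

# Point 3: the only way to be sure you always get the same bytes
docker pull registry.example.com/myapp@sha256:<digest>
```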
We have to talk about packaging. Containers are really the first system to
stray away from letting users have an image as a file on their system. All
other packaging formats let you "just download a file". When your build ends, you
usually end up with a file. It may be:
- a `.deb` package for use with apt
- an `.rpm` package for use with dnf/yum/zypper
- an `.msi` installer for Windows
- a `.jar` archive for Java
- a binary file for your system after compiling C code
But in Docker you end up with an image and a bunch of build cache somewhere inside
the Docker engine. Only afterwards can you use `docker save` to dump it to a `tar` file.
This is mostly useless, however, since you can neither upload that tar to a registry
nor download one from it. You *have* to have a connection to a registry from the target
deployment machine, or otherwise you will be using unconventional means of
transferring a tar file.
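For completeness, this is the unconventional route the paragraph alludes to. It works; it's just not how the ecosystem expects you to move images around (names are placeholders):

```
# On the build machine
docker save myapp:1.2.3 -o myapp.tar

# Move myapp.tar however you like (scp, USB stick, ...), then on the target
docker load -i myapp.tar
```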
Docker (specifically Docker, not other runtimes) has a hell of an inconsistent
CLI. It is especially annoying when writing scripts. Not only does `docker stats`
take over your terminal in a semi-curses mode whereas `docker ps` (and, despite the
name, `docker top`) prints once and exits; try using the custom format options.
It is really hard to guess what the valid names for all the variables are, and some
commands don't support them at all. Examples are below.
```
$ docker ps
CONTAINER ID   IMAGE         COMMAND   CREATED         STATUS         PORTS     NAMES
b2b6051d810f   python:3.11   "bash"    3 seconds ago   Up 2 seconds             relaxed_hawking

$ docker ps --format json | jq
{
  "Command": "\"bash\"",
  "CreatedAt": "2023-12-25 21:14:51 +0100 CET",
  "ID": "b2b6051d810f",
  "Image": "python:3.11",
  "Labels": "",
  "LocalVolumes": "0",
  "Mounts": "",
  "Names": "relaxed_hawking",
  "Networks": "bridge",
  "Ports": "",
  "RunningFor": "2 minutes ago",
  "Size": "0B",
  "State": "running",
  "Status": "Up 2 minutes"
}

$ docker stats
$ docker top b2b6051d810f

$ docker top --help
Usage:  docker top CONTAINER [ps OPTIONS]

$ docker top --format json
unknown flag: --format
See 'docker top --help'.
```
One admirable thing is that Docker didn't fall for the trap that Go's designers set for
Go programmers. If you didn't know, pretty much the entire container ecosystem is
written in Go. Go's standard library provides a way of parsing CLI arguments,
the `flag` package. The sad thing is that it ignores the Unix convention of
distinct short (`-f`) and long (`--flag`) options: it treats one and two dashes the
same and doesn't let you group short flags. Docker's CLI uses a GNU-style parser instead.
Since containers are used everywhere now, you might think there is a wide
market of infrastructure solutions. This is only somewhat true.
It is hard for me to reason why this happened, but maybe the whole ecosystem
expands at a speed that outpaces its own solutions.
As an example, please try to find a really simple frontend that will display
not only images but also other types of artifacts. For quite a few versions now
you have been able to upload any file as an ORAS artifact to an OCI registry. There are only 2
registry frontends that I could find that are fully open source, fairly light
(that excludes Harbor) and can be used to just display what is in a registry.
I don't want to redeploy my registry. I don't want a bunch of extra
functionality. I just want to display what's on the `/v2/_catalog` endpoint in a
fairly pretty manner. I'm sure this is not a niche use case.
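For the record, the raw data such a frontend would need to render is already one request away; the registry address below is a placeholder, the endpoints are part of the standard registry HTTP API:

```
# List repositories in a registry
curl -s https://registry.example.com/v2/_catalog
# {"repositories":["myapp","wilmhit/akkoma-ebooks"]}

# List tags for one repository
curl -s https://registry.example.com/v2/myapp/tags/list
# {"name":"myapp","tags":["1.2.3","latest"]}
```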
![Registry frontend screenshot](/image/registry-frontend.png)
![Registry UI screenshot](/image/registry-ui.png)
Both of those frontends do a bad job displaying anything other than images. One
completely skips ORAS artifacts in the listing and the other one displays only their
names. All other fields are *null*.
Please, Docker Inc., I'm not going to use Swarm. Don't push it. K8s won. Get over
it. This is annoying.
Mainly because the documentation is mixed. If you browse through the Docker documentation
you find links to many interesting features. The problem is that many of them
link to Swarm documentation.
This is really a nitpick, but an annoying one.
One more annoying thing is how *Docker Inc.* is managed. In recent years their
policy on software freedom has changed. Previously they wrote free software
(although under the OSS flag, not the FOSS flag[^2]). Now it's all about Docker
Swarm, Docker Desktop and the centralized Docker Hub.
I don't regard services where someone else has to pay for server upkeep and
maintenance as strictly evil. I would be happy to pay for storing my images on
Docker Hub if that service weren't treated as better than all other registries by their
software.
I can also mention what happened to Docker Desktop. This application
transitioned from merely closed-source to paid (for some customers). It is
absolutely absurd to me that companies keep getting baited by these tricks
in this day and age.
A Dockerfile can include quite a lot of hints that tell you how to
use the image later on. This information should be exposed to the users.
Right now only the ports are. If you specify a port in your Dockerfile, it
will appear in the output of the `docker ps` command. Where are the volumes that were
specified? The defaults for environment variables? This is all information that was
included during the image build. It can be dug up with `inspect` commands.
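This is what I mean by digging it up; the image name is a placeholder, and the `.Config` fields are where this build-time metadata ends up:

```
# Hints baked in at build time: exposed ports, volumes, env defaults
docker image inspect myapp:1.2.3 --format '{{json .Config.ExposedPorts}}'
docker image inspect myapp:1.2.3 --format '{{json .Config.Volumes}}'
docker image inspect myapp:1.2.3 --format '{{json .Config.Env}}'
```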
There is also a part that isn't included: the part that lives in the README.
Details like which ports are crucial for the application and which can be left
unexposed. Which volumes need to be kept across application restarts. Which
environment variables need to be provided additionally.
I would like to have all of that on the command line. I would like to be able to
launch containers without being forced to browse README files (which very
often are not even included on Docker Hub). Extra instructions could be added to
Dockerfiles so that the Docker engine can tell you that your command line is
incomplete, instead of launching the app just so it can shut down a second later
with an error message.
[^1]: For the purposes of this article I imply that all containers are
Docker-based unless specified otherwise explicitly.
[^2]: With this statement I mean how the company was marketing itself.
[^3]: A rootless installation isn't as straightforward as you might think, as it
requires giving the particular user the ability to map uids.