Tuesday, 15 November 2022
[This article has been bi-posted to Gemini and the Web]
One of the more revolutionary developments in computing history is _virtualization_. I've wanted to write a series of posts about _CBSD_ for quite a while now. This post, however, is another one that has been on my list for just as long: an introduction to what virtualization actually is.
One Monday morning, after a night with several incidents during on-call duty, I drove my children to school. I don't remember exactly why, but I apparently mumbled something about that one darn Virtual Machine that had ruined my night. Turns out that children can be pretty sharp-eared, and so I found myself having to explain what a VM is.
Well, with virtualization it is somewhat of a challenge to wrap your head around the basic idea when you first encounter it (try explaining it in simple terms to your eight-year-old if you think that it's really nothing!). And even when you think you've got a pretty good grasp of it, you never cease to be surprised by the new use-cases people come up with! It's a huge topic, so a small recap doesn't hurt - and it'll provide a bit of context before we actually explore CBSD.
I'm also deliberately trying to write this in a way that lets people point their less tech-savvy friends at the article; they should get it even if they don't have a lot of IT knowledge. If you could hold an off-the-cuff lecture on Virtual Machines, paravirtualization, emulation and containers without breaking a sweat, you'll almost certainly learn nothing new and probably want to skip this article. Otherwise you may want to at least skim over it and maybe pick up a couple of new ways to look at things.
When discussing a complex topic, I usually like to briefly touch on adjacent topics to better define the actual scope. The next step is to discuss the terminology, so it's clear what something means (and what it doesn't). However, things here are somewhat entangled, as we see pretty often in Information Technology: You can't simply start explaining A, move on to B and then to C. Why? Well, to really comprehend A, you need to already understand at least C - which in turn requires prior knowledge of B and A! A classic deadlock situation. To resolve it, you start digging into _any_ of those topics but have to live with some simplifications for the time being (which are either not the whole truth or actually kind of wrong). You can fix any misconceptions later, when you move on to the next topic. That's what we will be doing here.
One neighboring field to virtualization is _emulation_. So what does that mean? The term is derived from the Latin word "aemulus", which meant "rivaling" or "following suit" - well, or just "emulating", as that became an English word. An "aemulator", then, was a "rival" or someone striving to perfectly copy somebody else's behavior. In modern usage, to _emulate_ basically means to _imitate_ the characteristics and behavior of something. The person (or software) doing the emulation is an _emulator_.
For the term _virtualization_ we can trace things back to the most basic Latin form, "vir", which means "man". Getting closer, there's "virtus", which used to mean "manliness" in ancient Latin but morphed into "chivalry", "bravery" and "strength" in classical Latin. The word was adopted into English as "virtue", now referring to a person's good character traits (no longer limited to males) and integrity, but also to _value_, _benefit_ and such.
In today's use this leads to (at least?) two cases for _virtual_:
1) used like "almost" (because of the high similarity). E. g. "virtually nothing".
2) used as "it shows exactly the traits of X without actually being X".
Of course we're talking about the latter here, be it virtual network interfaces, virtual LANs or Virtual Machines.
There are two common forms of representation for programs (we'll run into a third one below): _source code_ and _machine code_ (also called _binary_). The latter can usually be run directly on the platform it was built for, but not on other platforms. What's a platform, though, and why "usually"? A _platform_ is the combination of ISA (more on that in a minute) and operating system.
At a computer's very core is the Central Processing Unit (CPU), also called the _processor_. There are various kinds of processors out there that work differently. If you're reading this, it's probably a safe bet that you've at least heard of Intel and AMD. Both vendors sell processors (among other things). Have you ever come across software that offers separate downloads based on whether you have an Intel or an AMD machine? Not very likely. Why? Because the most common processors that either of them sells are largely _compatible_, as they share the same processor _architecture_. The latter determines which instructions (think mathematical operations) a processor supports. Both Intel's and AMD's processors belong to the _instruction set architecture_ (ISA) known as "amd64" or "x86_64" (AMD introduced 64-bit PC CPUs when Intel's were still all 32-bit).
But if they are compatible, why did I bring up the question of separate software for them? Well, because such cases exist! Have a look at the download page for OpenMandriva Linux, for example, which offers separate builds for generic x86_64 and for modern AMD families of CPUs. While every x86_64 processor is guaranteed to support the baseline instruction set, both Intel and AMD have introduced various extensions to speed up specialized calculations. Not every Intel feature is available in all AMD processors and vice versa. You can build your programs to ignore those advanced features, making them run on either vendor's CPUs at the cost of possibly sub-optimal performance. Or you can take advantage of them with highly optimized builds that will only run on specific processors.
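If you're curious about which of those extensions your own processor supports, you can ask it at runtime. Here's a minimal sketch in C - assuming GCC or Clang on an x86_64 machine, since it relies on a CPU feature detection builtin specific to those compilers:

```c
/* Minimal sketch: runtime detection of optional x86_64 instruction
 * set extensions (GCC/Clang builtins, x86 targets only). */
#include <stdio.h>

int main(void)
{
    __builtin_cpu_init(); /* populate the compiler's CPU feature cache */

    printf("SSE2:    %s\n", __builtin_cpu_supports("sse2")    ? "yes" : "no");
    printf("AVX:     %s\n", __builtin_cpu_supports("avx")     ? "yes" : "no");
    printf("AVX2:    %s\n", __builtin_cpu_supports("avx2")    ? "yes" : "no");
    printf("AVX512F: %s\n", __builtin_cpu_supports("avx512f") ? "yes" : "no");
    return 0;
}
```

Depending on the CPU this runs on, the answers will differ - same binary, same ISA, but a different set of optional features.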
The previous ISA used for the PC from the mid-80s till the mid-2000s was the 32-bit architecture called "i386" or "x86". Think of x86_64 as a superset of x86, which is why modern CPUs can still execute old 32-bit programs if they have to. This is one of the special cases where it is possible to run unmodified binaries built for a different ISA. There's also another special case: FreeBSD, for example, supports "Linux mode" and can thus run (many but not all) Linux binaries even though it's a different operating system. Combining both is possible, too, like running an i386 Linux binary on FreeBSD for amd64.
But those are the exceptions to the rule. You cannot take a binary from your PC and expect it to run on your Raspberry Pi - even if both run the same operating system. The "aarch64" ISA is wildly different from x86_64 and thus the binaries will not work on the respective other platform.
There's a good chance of getting your program running if you've got the source code, though. A program's source code is what its programmer(s) wrote: the human-readable code (well, readable for those who understand the programming language used). It is processed by a _compiler_ and translated to machine code, which is then stored in a binary file so that the program can run on its own (well, it usually needs the help of the operating system, but I digress). Compilers must know the target architecture that they produce binary files for. So unless your source code makes use of specialized things that are simply not available on the other architecture, you can e. g. build two binaries from it: one for the PC (amd64) and one for your Raspberry Pi (aarch64). Those programs can then run _natively_.
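To make this a bit more tangible, here's a tiny example program. Built once with a native compiler and once with a cross-compiler (the exact toolchain invocation depends on your system, so take the comments as an illustration rather than a recipe), the very same source yields two binaries that each run natively on one platform only:

```c
/* hello.c - one source file, multiple possible target ISAs.
 * Illustrative build commands (details vary by system):
 *   cc hello.c -o hello                            (native build)
 *   clang --target=aarch64-unknown-linux-gnu ...   (cross build,
 *                      requires a matching sysroot to be set up) */
#include <stdio.h>

int main(void)
{
    /* these macros are predefined by the compiler for its target ISA */
#if defined(__x86_64__)
    const char *isa = "amd64";
#elif defined(__aarch64__)
    const char *isa = "aarch64";
#else
    const char *isa = "some other ISA";
#endif
    printf("Hello from %s!\n", isa);
    return 0;
}
```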
There are programming languages out there whose source code is not compiled. Programs written in one of those cannot run on their own; they need an _interpreter_, a program which is kind of like a compiler but does its translation to machine code ad hoc, each time the program runs.
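Boiled down to a toy "language" of single-letter commands, the idea looks like this: instead of producing a binary up front, the interpreter walks over the program text and acts on each instruction as it encounters it - every single time the program runs:

```c
/* A toy interpreter sketch: 'i' increments, 'd' decrements and
 * 'p' prints an accumulator. No binary is ever produced; the
 * "program" is read and executed instruction by instruction. */
#include <stdio.h>

int main(void)
{
    const char *program = "iiidp";
    int acc = 0;

    for (const char *pc = program; *pc != '\0'; pc++) {
        switch (*pc) {
        case 'i': acc++; break;
        case 'd': acc--; break;
        case 'p': printf("%d\n", acc); break; /* prints 2 here */
        }
    }
    return 0;
}
```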
Let's take a brief look at emulation next. You probably know that software exists that lets you play classic Game Boy games on your PC. Any program capable of doing this is a Game Boy _emulator_. There are other emulators for the PlayStation 3, the Nintendo 64 or basically whatever gaming console you might name. Emulation is absolutely not limited to gaming, though. The right software can make your PC look like an 80's mainframe as far as software for those beasts is concerned. It can emulate machines for which a specification exists but which were never actually implemented in hardware. For a skilled coder there's really no limit to this concept, as long as the machine you're using is more powerful than the emulated platform you're targeting.
What does an emulator do to make things work? Well, emulating a different ISA means translating the instructions of the machine code into the _equivalent_ ones native to the platform that the emulator runs on. If there is no exact equivalent, the program has to do some trickery: It needs to know how to achieve the same result by combining instructions that are available. Both the translation and especially the "missing" instructions that need to be emulated by more complex operations of course come with a performance penalty.
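Here's a deliberately simplified sketch of such a fetch-decode-execute loop for a completely made-up ISA. Note the MUL case: this toy machine pretends to have no native multiplication, so the emulator synthesizes it from repeated additions - correct, but noticeably more work than a single native instruction would be:

```c
/* Toy emulator sketch: fetch-decode-execute over made-up machine code.
 * OP_MUL has no direct equivalent here and is emulated via addition. */
#include <stdio.h>
#include <stdint.h>

enum { OP_LOAD, OP_ADD, OP_MUL, OP_PRINT, OP_HALT };

int main(void)
{
    /* "machine code": r0 = 6; r1 = 7; r0 = r0 * r1; print r0; halt */
    uint8_t code[] = { OP_LOAD, 0, 6,  OP_LOAD, 1, 7,
                       OP_MUL, 0, 1,  OP_PRINT, 0,  OP_HALT };
    int32_t reg[2] = { 0, 0 };
    size_t pc = 0;

    for (;;) {
        switch (code[pc]) {
        case OP_LOAD:  reg[code[pc + 1]] = code[pc + 2];       pc += 3; break;
        case OP_ADD:   reg[code[pc + 1]] += reg[code[pc + 2]]; pc += 3; break;
        case OP_MUL: { /* "missing" instruction: emulate by adding */
            int32_t sum = 0;
            for (int32_t i = 0; i < reg[code[pc + 2]]; i++)
                sum += reg[code[pc + 1]];
            reg[code[pc + 1]] = sum;
            pc += 3;
            break;
        }
        case OP_PRINT: printf("%d\n", reg[code[pc + 1]]);      pc += 2; break;
        case OP_HALT:  return 0;
        }
    }
}
```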
Emulating another ISA is not the only kind of emulation possible. You can also emulate the behavior of other operating systems (or combine both). There is, for example, _DOSBox_, a program popular with people who still like to play DOS-era games. As the name implies, it gives you a DOS-like environment in a window on whatever operating system you actually run. For me that's FreeBSD, a Unix-like OS. Even though the commands that my native OS uses differ from those that DOS offers, the environment inside DOSBox behaves as if it were a DOS machine of its own. This is required so that DOS programs can run at all: _to them_ everything looks like they are running on a DOS computer.
The emulator does not only emulate the operating system, though; it emulates an x86 PC as well, so with DOSBox you can also run your x86 DOS games on a Raspberry Pi that's actually running a Unix-like OS. It also emulates other _hardware_ that was in common use back in the DOS days. In contrast to the operating systems we use today, DOS was a pretty simplistic one. It ran in so-called _real mode_ and gave programs direct access to the hardware without any abstractions. The consequence of this is that there were no _shared drivers_; each and every program had to deal with the various hardware itself! It most certainly knew how to make a Sound Blaster card play sound. It would have no idea whatsoever how to talk to any sound card that you might use today. Therefore DOSBox makes the environment look like there's a sound card of the old days present in the machine (i. e. it emulates it) and takes care of everything needed so that your system can actually play the sound as intended.
Now that was quite some text before we finally actually get to virtualization, eh? Well, to be honest, I tricked you. We've been talking virtualization for a while now! But let's take a look from a different angle before we combine the pieces. While the term Virtual Machine (or VM) is very common these days, asking people what that actually means can lead to pretty interesting results. You can expect IT professionals and also a lot of hobbyists to have a pretty good idea of what it is, but giving a definition can be hard. It's useful, though, so let's give it a try.
Right at the beginning we're facing another recurring problem: term overloading. Even if we don't leave the scope of IT topics, you cannot tell _which kind_ of VM is being talked about if you don't have at least a little bit of context. When the context is programming languages, there is e. g. the JVM or _Java Virtual Machine_. It's a runtime environment that allows for running programs compiled to _bytecode_ (the third form of program representation promised earlier) on actual hardware. While this is certainly a related topic, it's not what we're looking at. Our topic is _hardware virtualization_.
So what is a Virtual Machine? Simply put, it's something that _behaves_ like a real machine but isn't one. Remember the emulated sound card? It doesn't exist at all, but software emulates its behavior, and so even though it's not there, you can use it as if it were real. You have a virtual sound card!
Take all the other pieces that make up a computer, simulate them using software and combine them. Final result: A Virtual Machine! As far as a program inside the VM is concerned, there is no "outside" machine, it is blissfully ignorant of the fact that the "hardware" it makes use of is purely virtual. And indeed it doesn't make any difference as long as the virtualization works flawlessly and the virtual hardware behaves as expected.
While software that emulates other platforms is called an emulator, something that provides Virtual Machines is called a _hypervisor_ or a _Virtual Machine Monitor_ (VMM). (The predecessors of operating systems used to be called _supervisors_ because they were responsible for all running processes; a hypervisor stands another level above that.)
You might sometimes encounter the terms "type-1", "bare-metal" or "native" hypervisor as well as "type-2" or "hosted" hypervisor. The distinction here was that the former ran directly on some hardware (as a special-purpose program totally on its own) while the latter was a normal program that ran on top of a general-purpose operating system. These terms are kind of obsolete as some very popular hypervisors take a middle ground approach: They are _part of_ a general-purpose OS which means that they both run directly on the hardware and need an operating system to run. It's still useful to have heard about these classifications, though.
With some classic type-1 hypervisors come so-called LPARs (_Logical Partitions_), as IBM calls them, or LDOMs (_Logical Domains_), as Sun named theirs. A Logical Partition / Domain is a subset of a machine's hardware set aside to form one Virtual Machine. Usually the hardware configuration can be changed at runtime if the operating system running inside the VM can cope with that. Type-1 hypervisors that you may encounter in the wild are VMware's ESXi and Xen, as well as Hyper-V (the latter if you're working in a Microsoft environment).
There are several popular type-2 hypervisors out there (usually involving kernel modules and thus blurring the lines as described above), both commercial and Open Source. The most widely used one is KVM (Kernel-based Virtual Machine), which gives the Linux kernel hypervisor capabilities. Very popular on the Mac is Parallels' virtualization. Mostly used on desktop systems and available for many operating systems is VirtualBox. And then there are the lesser-known but rather interesting newcomers that originate in the BSD family of operating systems: bhyve (FreeBSD), VMM/VMD (OpenBSD) and NVMM as well as HAXM (both NetBSD).
While virtualization enables a lot of new possibilities, it comes at a price, too. All the work that the hypervisor has to do uses system resources that are then not available to the actual programs you want to run. One mechanism to lessen this impact is so-called _paravirtualization_. In contrast to _full virtualization_, where the guest operating system runs in a VM without knowing it, with paravirtualization the OS actually knows that it is virtualized. There often are _guest extensions_ and such which allow the virtualized OS to communicate with the hypervisor. These usually include drivers for specialized virtual hardware that takes fewer system resources than the simulation of actual, complex hardware.
While this can lower the price of virtualization and even considerably boost the speed of VM execution, it also has its limits. If there are no extensions that fit both the OS you want to virtualize and your hypervisor, all you can do is fall back to full virtualization.
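To illustrate the difference conceptually - this is a toy sketch with made-up function names, not any real hypervisor interface - full virtualization means the hypervisor has to intercept every single access to an emulated device register, while a paravirtualized driver hands over one complete request and notifies the hypervisor just once:

```c
/* Conceptual sketch only: both functions below are hypothetical.
 * In a real VM, each "trap" is an expensive exit to the hypervisor. */
#include <stdio.h>

/* full virtualization: every register write traps */
static void emulated_register_write(int reg, unsigned char value)
{
    printf("trap:      device register %d <- 0x%02x\n", reg, value);
}

/* paravirtualization: one hypercall hands over the whole request */
static void paravirt_submit(const unsigned char *buf, size_t len)
{
    (void)buf;
    printf("hypercall: process %zu bytes in one go\n", len);
}

int main(void)
{
    unsigned char data[8] = "request";

    for (size_t i = 0; i < sizeof(data); i++)   /* 8 traps... */
        emulated_register_write(0, data[i]);

    paravirt_submit(data, sizeof(data));        /* ...vs. 1 hypercall */
    return 0;
}
```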
The last topic that we are going to discuss is so-called _containerization_. This is a very light-weight albeit limited form of virtualization. It was pioneered by FreeBSD, which implemented what is called _jails_. While a standard PC running a hypervisor can rarely run more than 10 VMs that are somewhat useful, running several hundred jails on the same machine is not much of a problem. What's the magic here? Enter _OS-level virtualization_.
A jail (or anything following the same idea like Solaris zones or Linux containers) is NOT a VM. It doesn't provide all the virtual hardware to run an operating system in it. What it does instead is isolate processes running on the _main operating system_ from each other. So as long as you want to virtualize FreeBSD on FreeBSD or Linux on Linux, you can do that thanks to light-weight containers with no hypervisor involved. You cannot run any foreign system since there's no way to run another kernel. Everything runs on the one operating system instance and is made to look like all the containers are mini systems of their own.
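One ancient building block of this idea on Unix-like systems is the chroot(2) system call, which restricts a process's view of the filesystem. Real jails (created via jail(2) on FreeBSD) go much further and also isolate processes, users and networking, but the following sketch shows the basic principle of "one kernel, many restricted views". It needs root privileges, and the directory used is just an example:

```c
/* Minimal sketch: confine this process's filesystem view.
 * The path /tmp/mini-root is an example and must exist. */
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(void)
{
    if (chroot("/tmp/mini-root") != 0) { /* change the root directory */
        perror("chroot");
        return EXIT_FAILURE;
    }
    if (chdir("/") != 0) {               /* actually step inside it */
        perror("chdir");
        return EXIT_FAILURE;
    }
    /* From here on, "/" means /tmp/mini-root for this process - it
     * cannot see or open anything outside, yet to the one running
     * kernel it's still just an ordinary process. */
    printf("New root in effect.\n");
    return 0;
}
```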
Thanks to syscall translation it is possible to expand somewhat beyond these strict borders: Solaris introduced "LX-branded zones" which, while still running on a Solaris system, make the software in the zone believe that it runs on Linux. FreeBSD can do the same thing with "Linux jails" that use the Linuxulator for translation.
While this is nothing new (FreeBSD introduced jails to the general public in the late 1990s, and they were conceived a while before that), the mainstream largely ignored it until the mid-2010s, when it turned into a real hype as everybody and their cat embraced Docker. Today Linux has the most mature tooling around containers, with programs like Kubernetes being used basically everywhere. It has the most fragile implementation, too: Linux containers are a security nightmare. To be fair - they were never meant as a tool for security. Jails and zones, on the other hand, (still) don't receive as much attention but are much more solid and an interesting option for people who dare to look left and right.
So much for a very general overview of virtualization! In the next post we'll start taking a close look at a complete virtualization management framework - CBSD.