Nim Solo5 platform support

Early this year I reached a significant milestone in my personal UNIX jihad: successfully building and testing Nim unikernels.

I was introduced to the unikernel concept several years ago while I was using Genode. My definition is a loose one: a unikernel is basically an application that executes as independently as possible from an operating system. Some unikernels can execute on bare metal, whereas most are built to run as a native process of a host operating system while communicating only over synthetic devices. Why run an application as neither a bare-metal program nor a normal OS process but instead in some strange middle ground? There is plenty of experience running programs on bare metal, but there is no turning back from time-sharing operating systems. Most of our programs these days, server or client, are tightly coupled to the interfaces of an operating system. If the operating system interface is reduced to a few simple device abstractions, then mismatches with the system ABIs are eliminated, and the potentially harmful effects of applications are sharply reduced.

I came across unikernels and Solo5 while using Genode because porting software from a monolithic to a componentized OS can be an enormous amount of effort, and reducing the surface area between applications and their OS minimizes that effort. I was first introduced to MirageOS, which is a large collection of libraries for building OCaml unikernels, and then to Solo5, which is middleware that defines and implements the abstract device interface between unikernel and host.

As a side note, the timeline of my personal involvement in Solo5 was:

The porting

There are many reasons why Nim is my preferred language, but one of the most important is that it is easy to port to new platforms. I've ported Nim to Genode and partially to Plan9, so I immediately knew where to start editing files.

A technical description follows for those interested in the different tricks to make it work.

The entrypoint

Nim is a compile-to-C language, so the compiler is modified to include the Solo5 headers in the C code that it generates and to replace the entrypoint

N_LIB_PRIVATE int cmdCount;
N_LIB_PRIVATE char** cmdLine;
N_LIB_PRIVATE char** gEnv;

int main(int argc, char** args, char** env) {
        cmdLine = args;
        cmdCount = argc;
        gEnv = env;
        NimMain();
        return nim_program_result;
}

with the entrypoint

const struct solo5_start_info *nim_start_info;

int solo5_app_main(const struct solo5_start_info *start_info) {
        nim_start_info = start_info;
        NimMain();
        return nim_program_result;
}

The allocator

Solo5 calls the application entrypoint with a description of a contiguous region of memory that the application is free to use. The backend allocator is a pointer to the top of a heap that starts at the beginning of that region. This is sloppy but it gets things going in a hurry.

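elif defined(solo5):
  import solo5/solo5
  const stackMargin = 1 shl 20 # 1 MiB

  var heapTop: ByteAddress # next free address; 0 until first use

  proc osTryAllocPages(size: int): pointer {.inline.} =
    # The address of a local approximates the current stack pointer.
    var stackMarker: ByteAddress
    stackMarker = cast[ByteAddress](addr stackMarker)
    if heapTop == 0: heapTop = cast[ByteAddress](nim_start_info.heap_start)
    # Refuse to grow the heap to within stackMargin of the stack.
    if heapTop+size+stackMargin < stackMarker:
      result = cast[pointer](heapTop)
      inc heapTop, size
    # Otherwise result stays nil and the allocation fails.

  proc osDeallocPages(p: pointer, size: int) {.inline.} =
    # Only the most recent allocation can be returned to the heap.
    if heapTop-size == cast[ByteAddress](p):
      dec heapTop, size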
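
To make the consequences of that design concrete, here is a small C model of the same bump-pointer scheme. It is only a sketch: the fixed "arena" array stands in for the region that Solo5 hands to the application, and the names "bump_alloc" and "bump_free" are made up for this illustration. As in the Nim code, memory is reclaimed only when the most recent allocation is released.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the memory region described by Solo5 (hypothetical size). */
static uint8_t arena[4096];
static size_t heap_top = 0; /* offset of the first unused byte */

/* Hand out `size` bytes from the top of the heap, or NULL if exhausted. */
static void *bump_alloc(size_t size) {
    if (size > sizeof(arena) - heap_top)
        return NULL;
    void *p = &arena[heap_top];
    heap_top += size;
    return p;
}

/* Only the most recent allocation can be returned; anything else leaks. */
static void bump_free(void *p, size_t size) {
    if (size <= heap_top && (uint8_t *)p == &arena[heap_top - size])
        heap_top -= size;
}
```

Freeing blocks in any order other than last-in, first-out leaves the heap top where it was, which is the sloppiness admitted to above.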

Floating-point conversions

Converting floating-point numbers from a binary representation to text and back is a non-trivial task. This is usually done by the libc, but the Nim standard library does have some standalone converters. Currently the port cannot convert with the same precision as a native Linux program, but the FPU works as expected in any case.

Clock and timers

Solo5 has a monotonic clock with an unspecified start time and nanosecond precision as well as a UNIX wall clock. Solo5 has only three time-related procedures and a single 64-bit representation of time, so this is trivial to glue to the Nim runtime.
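
As a rough illustration of how little glue is needed, the sketch below splits a 64-bit nanosecond timestamp into the seconds and nanoseconds that most time APIs want, and computes an absolute deadline for a timer. It is plain C with a local "nanos_t" typedef standing in for the Solo5 time type; "split_time" and "deadline_after" are hypothetical names, not part of Solo5 or of the Nim port.

```c
#include <stdint.h>

typedef uint64_t nanos_t; /* stand-in for a 64-bit nanosecond timestamp */

#define NANOS_PER_SEC UINT64_C(1000000000)

struct split_time_result {
    uint64_t seconds;
    uint32_t nanoseconds;
};

/* Split a nanosecond count into whole seconds and the remainder. */
static struct split_time_result split_time(nanos_t t) {
    struct split_time_result r;
    r.seconds = t / NANOS_PER_SEC;
    r.nanoseconds = (uint32_t)(t % NANOS_PER_SEC);
    return r;
}

/* Absolute deadline `timeout_ms` after `now`, saturating on overflow. */
static nanos_t deadline_after(nanos_t now, uint64_t timeout_ms) {
    uint64_t max_ms = (UINT64_MAX - now) / UINT64_C(1000000);
    if (timeout_ms > max_ms)
        return UINT64_MAX;
    return now + timeout_ms * UINT64_C(1000000);
}
```

Because the monotonic clock has an unspecified start time, values from it are only meaningful as differences or deadlines; only wall-clock values should be split and shown as dates.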

Filesystem

Stub out the "os" module to behave as if there is an empty read-only filesystem. In my experience this is better than throwing errors immediately because you cannot anticipate or mitigate every file-system access made by library code.
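
The behaviour I mean can be sketched in C as follows. The function names are hypothetical and this is not the code in the patched standard library; it only shows the idea that lookups and reads report "no such file", anything that would modify the filesystem reports "read-only", and directory listings succeed with zero entries.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Nothing exists on an empty filesystem. */
static bool stub_file_exists(const char *path) {
    (void)path;
    return false;
}

/* Opening for reading fails as it would for any missing file. */
static int stub_open_read(const char *path) {
    (void)path;
    return -ENOENT;
}

/* Creating or writing fails because the filesystem is read-only. */
static int stub_open_write(const char *path) {
    (void)path;
    return -EROFS;
}

/* Listing any directory yields zero entries rather than an error. */
static size_t stub_list_dir(const char *path, const char **entries, size_t max) {
    (void)path;
    (void)entries;
    (void)max;
    return 0;
}
```

Code that probes for an optional configuration file then takes its ordinary "file not found" path instead of hitting an unexpected failure.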

Networking

Stub out anything in the standard library that directly uses the BSD socket API. Yes, this does break some things.

Mutexes

Solo5 doesn't support creation of additional threads (UNIX wasn't designed for threads either) so stub it all out.

Async dispatcher

Solo5 supports up to 64 devices and has an interface for yielding execution until a network packet arrives. Applications register Nim handler callbacks for each device and the async dispatcher drives these. Async programming works as expected.
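
A model of that dispatch step, in C: with at most 64 devices, the set of ready devices fits in one 64-bit word, one bit per device handle. The sketch below assumes such a bitmask and a table of handlers indexed by handle. The names "handler_fn", "register_handler", "dispatch_ready", and "count_event" are invented here; the real port drives Nim callbacks from its async dispatcher instead.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_DEVICES 64

typedef void (*handler_fn)(unsigned handle, void *ctx);

static handler_fn handlers[MAX_DEVICES];
static void *contexts[MAX_DEVICES];

/* Install a callback for one device handle; returns 0 on success. */
static int register_handler(unsigned handle, handler_fn fn, void *ctx) {
    if (handle >= MAX_DEVICES || handlers[handle] != NULL)
        return -1;
    handlers[handle] = fn;
    contexts[handle] = ctx;
    return 0;
}

/* Call the handler of every device whose bit is set in `ready`. */
static unsigned dispatch_ready(uint64_t ready) {
    unsigned called = 0;
    for (unsigned h = 0; h < MAX_DEVICES && ready != 0; h++, ready >>= 1) {
        if ((ready & 1u) && handlers[h] != NULL) {
            handlers[h](h, contexts[h]);
            called++;
        }
    }
    return called;
}

/* Example handler: count how many times the device became ready. */
static void count_event(unsigned handle, void *ctx) {
    (void)handle;
    ++*(int *)ctx;
}
```

A main loop built on this would yield to the host until a deadline or a device event, then pass the resulting ready set to "dispatch_ready" before running any expired timers.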

Device manifest

Solo5 binaries contain an ELF note that describes the devices that a unikernel expects to boot with. Solo5 comes with a utility for generating C code from a JSON description, which might look like this:

{
  "type": "solo5.manifest",
  "version": 1,
  "devices": [
    { "name": "blk-0", "type": "BLOCK_BASIC" },
    { "name": "net-a", "type": "NET_BASIC" },
    { "name": "net-b", "type": "NET_BASIC" }
  ]
}

For Nim the manifest must match the device handlers that are installed. I refuse to manually synchronize the content of two files written in different languages, and I resent the expectation that I do so because a build system cannot. My solution is to make the compiler do it for me, so I can install the handlers and generate the manifest in the same Nim module.

when defined(solo5):
  from solo5/devices import acquireDevices, netBasic
  from taps import netAcquireHook
  acquireDevices([("net-a", netBasic), ("net-b", netBasic)], netAcquireHook)

asyncdispatch.runForever()

"acquireDevices" is a compile-time template that generates a manifest in C form and uses the "{.emit.}" compiler pragma to inject it into the generated C sources. It also registers the devices with the async dispatcher at runtime. It can only be called once per-program, which is a blessing because a library cannot sneak in a device handler without a compile-time error.

Taps

As mentioned earlier, I've removed or stubbed out all BSD socket support from the Nim standard library. Solo5 (and MirageOS) is single-threaded so applications must respond to events asynchronously, which is not a problem because Nim is relatively ergonomic for writing async code. Making the application asynchronous is preferable because networking is in reality asynchronous in every regard, and a blocking operation is magic that is best avoided.

I am not in a position to develop my own networking API, but luckily there is the Taps interface, which is language-agnostic and specified by a number of IETF drafts. I've anticipated a Nim-native Taps implementation for some time now, so as a baseline I already had a CoAP library implemented over Taps, and Taps implemented over BSD sockets.

Taps is a callback-oriented interface, for which Nim is well suited.

lwIP

Taps works for Linux using sockets, but I need an IP stack to position between the Taps API and the Solo5 devices. The only readily available stack would be lwIP, which is fine for the first iteration of networking.

Configuration

LwIP can be a pain to configure but it's worth the effort. I assume that lwIP is mostly known as an embedded implementation of sockets, but this sits above a low-level callback API that is also exported. In this case I enabled only the callback API for UDP, TCP, and IPv6. In my previous work with lwIP I had enabled IPv4 and disabled IPv6 but never did the extra work for IPv6, so this time I made sure it got done. The IPv6 address allocation mechanism is not as complex as you might think.
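
For orientation, a callback-only configuration along those lines might contain something like the following in "lwipopts.h". These are standard lwIP option names, but the selection and values are assumptions chosen for illustration, not the exact file from the Taps repository.

```c
/* lwipopts.h (illustrative fragment, values are assumptions) */
#define NO_SYS               1  /* no OS threads or semaphores underneath lwIP */
#define LWIP_SOCKET          0  /* no BSD socket layer */
#define LWIP_NETCONN         0  /* no sequential netconn API */
#define LWIP_UDP             1
#define LWIP_TCP             1
#define LWIP_IPV6            1
#define LWIP_IPV6_AUTOCONFIG 1  /* stateless address autoconfiguration */
```

With "NO_SYS" set, lwIP expects the application to call its timeout processing regularly, which fits a single-threaded event loop.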

LwIP debugging requires "sprintf", which you will not have without a libc. The Solo5 tests use a standalone "sprintf" header implementation that I was able to copy. If debugging is not enabled then this shouldn't be necessary.

Building

As a courtesy to others I vendored lwIP within the Taps repository using git subtree. LwIP comes with a CMake build system that can be embedded in a parent project. I hate on principle any build tool with "make" in the name, so I wrote a Tupfile to generate Nim modules that contain compiler pragma directives to add the lwIP C sources to the Nim project being built.

This Tupfile:

LWIPDIR = $(TUP_CWD)/upstream/src
include $(LWIPDIR)/Filelists.mk

: *.c $(COREFILES) $(APIFILES) $(LWIPDIR)/netif/ethernet.c \
	|> !dumpNimCompilePragmas |> core.nim
: $(CORE4FILES) |> !dumpNimCompilePragmas |> core4.nim
: $(CORE6FILES) |> !dumpNimCompilePragmas |> core6.nim

generates Nim modules like this:

{.compile: "./upstream/src/core/ipv6/dhcp6.c".}
{.compile: "./upstream/src/core/ipv6/ethip6.c".}
{.compile: "./upstream/src/core/ipv6/icmp6.c".}
{.compile: "./upstream/src/core/ipv6/inet6.c".}
{.compile: "./upstream/src/core/ipv6/ip6.c".}
{.compile: "./upstream/src/core/ipv6/ip6_addr.c".}
{.compile: "./upstream/src/core/ipv6/ip6_frag.c".}
{.compile: "./upstream/src/core/ipv6/mld6.c".}
{.compile: "./upstream/src/core/ipv6/nd6.c".}

and no additional build system is necessary. Tup needs to be invoked when the vendored sources are updated, which is better than tossing some goofy extra build steps downstream.

Usage

Building

To build a unikernel you will need a patched compiler and standard library, see this PR for details:

https://github.com/nim-lang/Nim/pull/19536

The compiler must be invoked with the "--os:solo5" flag. The Solo5 sub-platform can be selected with "-d:solo5tender=…", which defaults to hvt, the hardware virtualized tender.

# Sandboxed process tender (spt) target
echo 'echo "Hello spt!"' > hello.nim
nim c --os:solo5 -d:solo5tender=spt -o:hello.spt hello.nim
solo5-spt ./hello.spt
# Virtio target
echo 'echo "Hello virtio!"' > hello.nim
nim c --os:solo5 -d:solo5tender=virtio -o:hello.virt hello.nim
solo5-virtio-mkimage hello.img hello.virt
qemu-system-x86_64 -machine q35 -display none -serial stdio -drive file=hello.img

Networking

I test my unikernels using a tap device. The tap device is in a network bridge and I announce an IPv6 prefix into the bridge with radvd.
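
With a router advertisement on the bridge, a host doing stateless autoconfiguration combines the announced /64 prefix with a 64-bit interface identifier. One common way to derive that identifier is the modified EUI-64 form of the MAC address, sketched below in C. This assumes the stack is set up to use MAC-derived identifiers; the function names are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Build a modified EUI-64 interface identifier from a 48-bit MAC address. */
static void mac_to_interface_id(const uint8_t mac[6], uint8_t iid[8]) {
    iid[0] = mac[0] ^ 0x02; /* flip the universal/local bit */
    iid[1] = mac[1];
    iid[2] = mac[2];
    iid[3] = 0xff;          /* ff:fe is inserted in the middle */
    iid[4] = 0xfe;
    iid[5] = mac[3];
    iid[6] = mac[4];
    iid[7] = mac[5];
}

/* Combine a /64 prefix (first 8 bytes) with the interface identifier. */
static void slaac_address(const uint8_t prefix[8], const uint8_t mac[6],
                          uint8_t addr[16]) {
    memcpy(addr, prefix, 8);
    mac_to_interface_id(mac, addr + 8);
}
```

For example, the documentation prefix 2001:db8:1:2::/64 and the MAC address 52:54:00:12:34:56 combine to 2001:db8:1:2:5054:ff:fe12:3456.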

Configuration

I have not tried it yet, but I plan to use Syndicate over TCP to communicate configuration from host to unikernel. It's a bit heavy but it comes with a decent serialization format and configuration reloading support.

Future work