Cross Compilers: Part 1

2023-01-11

Preface

One of my recent projects has had me exploring the feasibility of cross compiling Rust code for several architectures on Linux. It turns out that this is not difficult once you have a suitable cross toolchain for C, but getting to that point is often a challenge, as what documentation is available is often severely out of date. Worse, pretty much all of the documentation carries a caveat saying that you should just use crosstool-ng, and my experience with that tool has been less than great. I'm writing this series both to help others who may wish to take a DIY approach to cross compilation and as documentation for my own future reference.

Note that there are probably other methods to get a working cross toolchain, and some of them may be more efficient. Your distro may even have a suitable cross toolchain already built in its repositories for you. This is what works for me, and while I have been working with cross toolchains for a number of years at this point, YMMV.

The Outline

A functional cross compiler includes more than a compiler and is probably better referred to as a cross-toolchain. Included in the toolchain are a cross-linker, a cross-compiler, and a sysroot containing kernel headers and a C library compiled for the target arch. To get to a working cross-toolchain we are going to follow these steps, in order:

Set up the build environment and download sources
Build the cross-linker
Build a crippled bootstrap version of gcc
Install the kernel headers into the sysroot
Compile the C library and install it into the sysroot
Build our final version of GCC
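
Because each step depends on the one before it, it can help to wrap the whole sequence in a small driver that runs the steps in order and remembers which ones have finished. What follows is only a sketch: the step names, the step_ functions and the stamp directory are hypothetical and not part of the procedure in this article.

```shell
# run_steps STAMPDIR STEP...: run each step_NAME function once, in order.
# A stamp file marks a finished step so a rerun picks up where it stopped.
run_steps() {
    stampdir=$1
    shift
    mkdir -p "$stampdir"
    for step in "$@"; do
        if [ -e "$stampdir/$step.done" ]; then
            echo "skip $step"
            continue
        fi
        echo "run $step"
        "step_$step" || { echo "failed $step" >&2; return 1; }
        : > "$stampdir/$step.done"
    done
}

# Hypothetical step names, one per item in the outline. The stubs do nothing;
# real ones would hold the commands from the matching section.
steps="setup binutils gcc_bootstrap kernel_headers libc gcc_final"
for s in $steps; do
    eval "step_$s() { :; }"
done

run_steps "$(mktemp -d)" $steps
```

Running it a second time against the same stamp directory skips every step that already completed, which is handy when a late step fails and you do not want to rebuild binutils again.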

Setting Up

It's a good idea to start this process in a clean environment, and keep things confined to a dedicated build directory so as not to litter files all over your home directory. We're going to assume the cross-toolchain will live at /usr/local/${tgt} where ${tgt} is the target triple. We're going to build everything inside of ${HOME}/cross. For the sake of this article we're going to build for riscv64 and use musl as the C library, but building for other architectures is generally as easy as changing some environment variables and adjusting a couple commands. Similarly, using Glibc is pretty much the same process, although it will take significantly longer and there will be some different flags passed to ./configure.

We're starting this guide with the assumption that you already have a working C compiler, preferably Gcc and preferably a very recent version of it. If your distro's version of Gcc is old, it would be a good idea to build a normal toolchain first using the versions of binutils and gcc in this tutorial.

So let's set up a clean working environment and grab some sources.

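# Start a fresh shell without any CFLAGS, LDFLAGS or other crap from your
# environment. env -i with no command only prints variables, so we exec a new
# shell (this assumes bash lives at /bin/bash; --norc skips ~/.bashrc)
exec env -i HOME=$HOME PATH=/usr/bin:/bin TERM=$TERM /bin/bash --norc
# Create the build directory
install -dv ${HOME}/cross
cd ${HOME}/cross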
# Add some useful environment vars
export arch=riscv64
export libc=musl
export tgt=${arch}-linux-${libc}
export sysroot=/usr/local/${tgt}
export prefix=${sysroot}/toolchain
# Get the sources
wget -c https://ftp.gnu.org/gnu/gcc/gcc-12.2.0/gcc-12.2.0.tar.xz
wget -c https://ftp.gnu.org/gnu/binutils/binutils-2.39.tar.xz
wget -c https://ftp.gnu.org/gnu/mpc/mpc-1.3.1.tar.gz
wget -c https://ftp.gnu.org/gnu/mpfr/mpfr-4.2.0.tar.xz
wget -c https://ftp.gnu.org/gnu/gmp/gmp-6.2.1.tar.xz
wget -c https://musl.libc.org/releases/musl-1.2.3.tar.gz
wget -c https://cdn.kernel.org/pub/linux/kernel/v6.x/linux-6.1.4.tar.xz
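
To make the relationship between these variables explicit, here is a minimal sketch of a helper that derives the target triple, sysroot and prefix from an architecture and a C library name. The function name cross_env is made up for illustration, and the aarch64/gnu call only shows the pattern for a Glibc target; this article builds riscv64 with musl only.

```shell
# cross_env ARCH LIBC: print the variable assignments used in this article.
cross_env() {
    arch=$1
    libc=$2
    tgt=${arch}-linux-${libc}
    sysroot=/usr/local/${tgt}
    prefix=${sysroot}/toolchain
    printf 'tgt=%s\nsysroot=%s\nprefix=%s\n' "$tgt" "$sysroot" "$prefix"
}

cross_env riscv64 musl
cross_env aarch64 gnu
```

The first call prints tgt=riscv64-linux-musl, sysroot=/usr/local/riscv64-linux-musl and prefix=/usr/local/riscv64-linux-musl/toolchain, which are the values the rest of the article relies on.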

The Cross Linker

The first piece we actually build must be the linker, as our compiler and C library will need to know the capabilities of the linker at build time. This linker is provided, along with a number of other low level tools used to manipulate binaries, by the GNU binutils package. We're going to build it inside a dedicated build directory. Actually, we're going to do that with all of our packages with the exception of the kernel headers. This keeps the source directory pristine, so it can be reused to build for different configurations.

tar -xJf binutils-2.39.tar.xz
install -dv binutils-build
cd binutils-build
../binutils-2.39/configure \
    --build=$(gcc -dumpmachine) \
    --target=${tgt} \
    --prefix=${prefix} \
    --with-sysroot=${sysroot} \
    --disable-nls
make -j4
make install-strip
cd ../ && rm -rf binutils-build

Command explanations

--build=$(gcc -dumpmachine) tells the build system that these tools will run on the current system.
--target=${tgt} tells the build system that these tools will generate binaries for ${tgt}
--prefix=${prefix} the installation prefix, under which everything will be installed.
--with-sysroot=${sysroot} the location where the system headers and C library will live for the target.
--disable-nls - do not build native language support.
make -j4 - the -j4 tells make to run up to 4 jobs in parallel. Adjust this to match your available CPU cores.
make install-strip - the install-strip target installs and strips the binaries in one step, saving space.
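
Rather than hard coding the job count, you can ask the system how many CPUs are online. This is a small sketch, assuming either nproc (GNU coreutils) or getconf is available; the variable name njobs is my own.

```shell
# Pick a job count for make from the number of online CPUs.
# nproc comes from GNU coreutils; getconf is the fallback, then 1.
njobs=$(nproc 2>/dev/null || getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)
echo "make -j${njobs}"
```

You would then run make -j${njobs} wherever the article uses make -j4.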

Bootstrap Compiler

Our initial version of Gcc is going to be severely crippled, statically linked and only used once, to compile the C library for our target. It solves the chicken and egg problem that we need to have a working C library in order to build the final compiler. There are, or at least were, ways to build Gcc slowly, in stages, interspersed with building libc and installing headers, but that is a brittle process that is failure prone and likely to change. Using a bootstrap compiler like this works for me with more successes than failures.

tar -xJf gcc-12.2.0.tar.xz
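tar -xJf gmp-6.2.1.tar.xz
# A symlink target is resolved relative to the directory holding the link,
# hence the ../ in front of each source directory
ln -sv ../gmp-6.2.1 gcc-12.2.0/gmp
tar -xzf mpc-1.3.1.tar.gz
ln -sv ../mpc-1.3.1 gcc-12.2.0/mpc
tar -xJf mpfr-4.2.0.tar.xz
ln -sv ../mpfr-4.2.0 gcc-12.2.0/mpfr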
install -dv gcc-build
cd gcc-build
../gcc-12.2.0/configure \
    --build=$(gcc -dumpmachine) \
    --target=${tgt} \
    --prefix=/usr/local/${tgt}/toolchain \
    --with-sysroot=/usr/local/${tgt} \
    --with-newlib \
    --without-headers \
    --disable-nls \
    --disable-shared \
    --disable-multilib \
    --disable-decimal-float \
    --disable-threads \
    --disable-libatomic \
    --disable-libgomp \
    --disable-libquadmath \
    --disable-libssp \
    --disable-libvtv \
    --disable-libstdcxx \
    --enable-languages=c,c++
make -j4
make install
cd ../ && rm -rf gcc-build

Command explanations

The ln -sv commands - we link the sources for gmp, mpc and mpfr into the gcc source tree (as gcc-12.2.0/gmp, gcc-12.2.0/mpc and gcc-12.2.0/mpfr) so that they will be built as a part of the gcc build process. This is much less complicated than building them separately, which would be wasted effort anyway, as we're not going to use the libraries for anything other than linking into gcc.
--build= --target= --prefix= --with-sysroot= --disable-nls - as above for binutils
--with-newlib - Newlib is a lightweight embeddable libc, and this flag tells the build to assume it is the target's C library. Combined with --without-headers, it causes inhibit_libc to be defined while libgcc is built, so no code that needs libc support gets compiled in. Newlib itself is not downloaded or built.
--disable-* - shutting off a lot of features that are not needed and which would probably fail when building this bootstrap compiler.
--enable-languages=c,c++ - only build the C and C++ compilers, which are the only ones needed.
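
Before running ./configure it is worth confirming that the three in-tree libraries actually resolve, since a dangling symlink only shows up as a confusing failure later on. Here is a minimal sketch; check_intree is a hypothetical helper, not something shipped with gcc.

```shell
# check_intree GCC_SRC_DIR: report whether gmp, mpc and mpfr resolve
# to real directories inside the gcc source tree. The -d test follows
# symlinks, so a dangling link is reported as missing.
check_intree() {
    gccdir=$1
    status=0
    for lib in gmp mpc mpfr; do
        if [ -d "$gccdir/$lib" ]; then
            echo "ok $lib"
        else
            echo "missing $lib"
            status=1
        fi
    done
    return $status
}

# Usage, from inside the build directory ~/cross:
#   check_intree gcc-12.2.0
```

A non-zero exit status means at least one of the three is absent or points nowhere.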

Kernel headers

Before we can go further we're going to need the headers for our kernel installed into the sysroot. There are a couple of gotchas with kernel headers, and the first one is that Linux does not follow the same naming scheme for machine architecture as the GNU tools. So instead of setting ARCH to aarch64, for instance, we would instead use arm64. If you are building for an architecture other than what I've listed here, check in the kernel source directory inside the subdirectory arch to see what your architecture will be named (each arch has its own subdirectory named correspondingly).

tar -xJf linux-6.1.4.tar.xz
cd linux-6.1.4
make mrproper
make ARCH=riscv headers
find usr/include -name '.*' -delete
rm -rf usr/include/Makefile
install -dv ${sysroot}/usr/include
cp -rv usr/include/* ${sysroot}/usr/include
cd ../

Command explanations

make mrproper - cleans up anything not needed and left behind in the source directory and sets it to a known state.
make ARCH=riscv headers - generates the sanitized headers for the riscv platform in usr/include inside the kernel source tree. Note that the kernel keeps both 32 bit and 64 bit riscv under the single name riscv, which is why there is no 64 in ARCH.
find usr/include.. - removes some unneeded dotfiles from the usr/include directory
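
The naming mismatch between GNU triples and the kernel can be captured in a small lookup. This is a sketch covering only a handful of architectures whose directory names under arch/ I am sure of; kernel_arch is a made up helper, and anything it does not recognise is passed through unchanged, so you should still check the arch subdirectory as described above.

```shell
# kernel_arch NAME: map a GNU arch name or target triple to the kernel's ARCH.
kernel_arch() {
    case $1 in
        aarch64*)          echo arm64 ;;
        arm*)              echo arm ;;
        riscv32*|riscv64*) echo riscv ;;
        i?86*|x86_64*)     echo x86 ;;
        powerpc*)          echo powerpc ;;
        mips*)             echo mips ;;
        s390*)             echo s390 ;;
        *)                 echo "$1" ;;  # unknown: pass through, verify by hand
    esac
}

kernel_arch riscv64-linux-musl
kernel_arch aarch64
```

With this in place the headers step could be written as make ARCH=$(kernel_arch ${tgt}) headers.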

Building the C library

Since the C library is going to be built using the bootstrap compiler that we created earlier, we need to prepend the directory where those binaries reside to our $PATH environment variable in order for the build system to find them.

export PATH=${prefix}/bin:${PATH}
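
A quick way to confirm that the PATH change took effect is to ask the shell whether it can find the cross tools. This is only a sketch; need_tools is a hypothetical helper name.

```shell
# need_tools NAME...: print every command that cannot be found on PATH
# and return non-zero if any are missing.
need_tools() {
    missing=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "not found: $tool"
            missing=1
        fi
    done
    return $missing
}

# Usage, with tgt set as earlier in the article:
#   need_tools ${tgt}-gcc ${tgt}-ld ${tgt}-as ${tgt}-ar
```

If anything is reported as not found, recheck ${prefix} before going on.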

Now that we have made the bootstrap compiler accessible in our shell, we can build the C library.

tar -xzf musl-1.2.3.tar.gz
install -dv musl-build
cd musl-build
../musl-1.2.3/configure \
    --target=${tgt} \
    --prefix=/usr \
    --libdir=/lib
make -j4
make DESTDIR=/usr/local/${tgt} install
cd ../
rm -rf musl-build

Notice that this time around we've passed `--prefix=/usr` and `--libdir=/lib` to ./configure. We want our C library to be built with the same hard coded paths as if it were installed as the system library in a real filesystem. Then we set the `DESTDIR` variable when running `make install`, which prepends the path of our sysroot to the actual installation paths.
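
The path arithmetic is simple enough to spell out. Here is a minimal sketch, with staged_path as a made up helper, showing where a file configured for a given path ends up once DESTDIR is applied.

```shell
# staged_path DESTDIR PATH: where a file configured for PATH lands on disk.
# A trailing slash on DESTDIR is dropped so the result has no double slash.
staged_path() {
    printf '%s%s\n' "${1%/}" "$2"
}

staged_path /usr/local/riscv64-linux-musl /lib/libc.so
staged_path /usr/local/riscv64-linux-musl /usr/include/stdio.h
```

The first call prints /usr/local/riscv64-linux-musl/lib/libc.so, so the library believes it lives in /lib while the file on the host sits inside the sysroot.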

Building the final Cross Compiler

In this step we reuse the source directory for gcc to build the final cross compiler. There are significantly fewer switches that need to be passed to ./configure this time, because this will be a fully capable compiler for the target system.

install -dv gcc-build
cd gcc-build
../gcc-12.2.0/configure \
    --target=${tgt} \
    --prefix=/usr/local/${tgt}/toolchain \
    --with-sysroot=/usr/local/${tgt} \
    --enable-languages=c,c++ \
    --disable-bootstrap \
    --disable-multilib \
    --disable-libssp
make -j4
make install-strip
cd ../
rm -rf gcc-build

Note: If desired, it is completely possible to build compilers for languages other than C and C++ at this stage by appending to the `--enable-languages` list. I have personally had success building both the Fortran and Dlang frontends previously. In particular, if you wish to use this cross compiler to bootstrap a fully functional system, it is a good idea to build GFortran, as certain widely used packages such as numpy rely on Fortran code. If you want to build a Fortran compiler, you will need a Fortran compiler, so you should install GFortran on the host system first. This cross compiler can then be used to build a system compiler which will run on the target system.

Testing the compiler

It's a good idea to test out the new compiler by compiling a small `Hello World` program.

cat > hello.c << 'EOF'
#include <stdio.h>

int main() {
    printf("Hello, world!\n");
    return 0;
}
EOF
riscv64-linux-musl-gcc -o hello -static hello.c
file hello

The compilation should succeed without errors, and the last command should reveal that we have a statically linked riscv binary. If you happen to have qemu binaries installed, you can even run the resulting program on the host machine.

qemu-riscv64-static hello

outputs `Hello, world!`
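
If the file utility is not installed, the target architecture can also be read straight from the ELF header, where the two byte e_machine field sits at offset 18. Below is a minimal sketch using od; elf_machine is a hypothetical helper and it only knows a few machine codes (243 is RISC-V, 183 is AArch64, 62 is x86-64, 40 is ARM).

```shell
# elf_machine FILE: print the architecture recorded in an ELF header.
elf_machine() {
    file=$1
    # Bytes 0-3 hold the magic number 0x7f 'E' 'L' 'F'.
    magic=$(od -An -tx1 -N4 "$file" | tr -d ' \n')
    [ "$magic" = "7f454c46" ] || { echo "not-elf"; return 1; }
    # Byte 5 (EI_DATA): 1 = little endian, 2 = big endian.
    data=$(od -An -tu1 -j5 -N1 "$file" | tr -d ' \n')
    # Bytes 18-19 hold e_machine; split them into $1 and $2.
    set -- $(od -An -tu1 -j18 -N2 "$file")
    if [ "$data" = "2" ]; then
        code=$(( $1 * 256 + $2 ))
    else
        code=$(( $2 * 256 + $1 ))
    fi
    case $code in
        243) echo riscv ;;
        183) echo aarch64 ;;
        62)  echo x86_64 ;;
        40)  echo arm ;;
        *)   echo "unknown($code)" ;;
    esac
}

# Usage: elf_machine hello
```

For the hello binary built above this should report riscv, matching what file says.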

All content for this site is released under the CC BY-SA license.
