One of my recent projects has had me exploring the feasibility of cross compiling Rust code for several architectures on Linux. It turns out that it is not difficult to do once you have a suitable cross toolchain for C, but getting to that point is often a challenge, as what documentation is available is often severely out of date. Worse, pretty much all of the documentation has a caveat saying that you should just use crosstool-ng, and my experience with that tool has been less than great. I'm writing this series both as a way to help others who may wish to take a DIY approach to cross compilation, and as documentation for myself for future reference.
Note that there are probably other methods to get a working cross toolchain and some of them may be more efficient. Your distro may even have a suitable cross toolchain already built in its repositories for you. This is what works for me, and while I have been working with cross toolchains for a number of years at this point, YMMV.
A functional cross compiler includes more than a compiler and is probably better referred to as a cross-toolchain. Included in the toolchain are a cross-linker, cross-compiler, and a sysroot containing kernel headers and a C library compiled for the target arch. To get to a working cross-toolchain we are going to follow these steps in order: build binutils for the target, build a bootstrap version of Gcc, install the kernel headers into the sysroot, build the C library, and finally build the full cross compiler.
It's a good idea to start this process in a clean environment, and keep things confined to a dedicated build directory so as not to litter files all over your home directory. We're going to assume the cross-toolchain will live at /usr/local/${tgt} where ${tgt} is the target triple. We're going to build everything inside of ${HOME}/cross. For the sake of this article we're going to build for riscv64 and use musl as the C library, but building for other architectures is generally as easy as changing some environment variables and adjusting a couple commands. Similarly, using Glibc is pretty much the same process, although it will take significantly longer and there will be some different flags passed to ./configure.
We're starting this guide with the assumption that you already have a working C compiler, preferably Gcc and preferably a very recent version of it. If your distro's version of Gcc is old, it would be a good idea to build a normal toolchain first using the versions of binutils and gcc in this tutorial.
So let's set up a clean working environment and grab some sources.
# Remove any CFLAGS, LDFLAGS or other crap from your shell's environment
# (env needs a command to run, so we start a fresh shell with the clean environment)
env -i HOME=$HOME PATH=/usr/bin:/bin TERM=$TERM /bin/bash
# Create the build directory
install -dv cross
cd cross
# Add some useful environment vars
export arch=riscv64
export libc=musl
export tgt=${arch}-linux-${libc}
export sysroot=/usr/local/${tgt}
export prefix=${sysroot}/toolchain
# Get the sources
wget -c https://ftp.gnu.org/gnu/gcc/gcc-12.2.0/gcc-12.2.0.tar.xz
wget -c https://ftp.gnu.org/gnu/binutils/binutils-2.39.tar.xz
wget -c https://ftp.gnu.org/gnu/mpc/mpc-1.3.1.tar.gz
wget -c https://ftp.gnu.org/gnu/mpfr/mpfr-4.2.0.tar.xz
wget -c https://ftp.gnu.org/gnu/gmp/gmp-6.2.1.tar.xz
wget -c https://musl.libc.org/releases/musl-1.2.3.tar.gz
wget -c https://cdn.kernel.org/pub/linux/kernel/v6.x/linux-6.1.4.tar.xz
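Retargeting the build mostly comes down to changing the architecture and libc inputs. The following is a minimal sketch, assuming a hypothetical helper named `cross_env` (it is not part of any existing tool) that derives the triple and the two paths using the naming from this guide.

```shell
# cross_env is a hypothetical helper, not part of any existing tool.
# It prints the variables used in this guide for a given arch and libc.
cross_env() {
    arch=$1
    libc=${2:-musl}
    tgt=${arch}-linux-${libc}
    sysroot=/usr/local/${tgt}
    prefix=${sysroot}/toolchain
    printf 'tgt=%s\nsysroot=%s\nprefix=%s\n' "$tgt" "$sysroot" "$prefix"
}

cross_env riscv64 musl
```

For a Glibc target the libc part of the triple is `gnu`, so the equivalent call would be something like `cross_env aarch64 gnu`.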
The first piece we actually build must be the linker, as our compiler and C library will need to know at build time the capabilities of the linker. This linker is provided, along with a number of other low level tools used to manipulate binaries, by the GNU binutils package. We're going to build it inside a dedicated build directory. Actually, we're going to do that with all of our packages with the exception of the kernel headers. This keeps the source directory pristine and it is able to be used over and over to build for different configurations.
tar -xJf binutils-2.39.tar.xz
install -dv binutils-build
cd binutils-build
../binutils-2.39/configure \
    --build=$(gcc -dumpmachine) --target=${tgt} \
    --prefix=${prefix} \
    --with-sysroot=${sysroot} \
    --disable-nls
make -j4
make install-strip
cd ../ && rm -rf binutils-build
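Since the same create, build, and delete pattern repeats for every package, it can be wrapped up once. This is only a sketch: `build_in` is a hypothetical helper name, and unlike the commands above it removes the build directory even when the build fails.

```shell
# build_in is a hypothetical helper: run a command inside a throwaway
# build directory, then remove the directory, leaving the source tree untouched.
build_in() {
    dir=$1
    shift
    rm -rf "$dir"
    mkdir -p "$dir" || return 1
    # The subshell keeps the caller's working directory unchanged.
    ( cd "$dir" && "$@" )
    status=$?
    rm -rf "$dir"
    return "$status"
}

# Example: print the directory the command runs in.
build_in demo-build pwd
```

A real invocation would pass the configure and make steps as a single command, for example via `sh -c '...'`.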
Our initial version of Gcc is going to be severely crippled, statically linked and only used once to compile the C library for our target. It solves the chicken and egg problem that we need to have a working C library in order to build the final compiler. There are, or at least were, ways to build Gcc slowly, in stages, interspersed with building libc and installing headers, but that is a brittle process that is failure prone and likely to change. Using a bootstrap compiler like this works for me with more successes than failures.
tar -xJf gcc-12.2.0.tar.xz
tar -xJf gmp-6.2.1.tar.xz
ln -sv gmp-6.2.1 gcc-12.2.0/gmp
tar -xzf mpc-1.3.1.tar.gz
ln -sv mpc-1.3.1 gcc-12.2.0/mpc
tar -xJf mpfr-4.2.0.tar.xz
ln -sv mpfr-4.2.0 gcc-12.2.0/mpfr
install -dv gcc-build
cd gcc-build
../gcc-12.2.0/configure \
    --build=$(gcc -dumpmachine) --target=${tgt} \
    --prefix=/usr/local/${tgt}/toolchain \
    --with-sysroot=/usr/local/${tgt} \
    --with-newlib \
    --without-headers \
    --disable-nls \
    --disable-shared \
    --disable-multilib \
    --disable-decimal-float \
    --disable-threads \
    --disable-libatomic \
    --disable-libgomp \
    --disable-libquadmath \
    --disable-libssp \
    --disable-libvtv \
    --disable-libstdcxx \
    --enable-languages=c,c++
make -j4
make install
cd ../ && rm -rf gcc-build
--disable-* - shutting off a lot of features that are not needed and which would probably fail when building this bootstrap compiler
--enable-languages=c,c++ - only build the C and C++ compilers, which are the only ones needed.
Before we can go further we're going to need the headers for our kernel installed into the sysroot. There are a couple of gotchas with kernel headers, and the first one is that Linux does not follow the same naming scheme for machine architecture as the GNU tools. So instead of setting ARCH to aarch64, for instance, we would instead use arm64. If you are building for an architecture other than what I've listed here, check in the kernel source directory inside the subdirectory arch to see what your architecture will be named (each arch has its own subdirectory named correspondingly).
tar -xJf linux-6.1.4.tar.xz
cd linux-6.1.4
make mrproper
make ARCH=riscv headers
find usr/include -name '.*' -delete
rm -rf usr/include/Makefile
install -dv ${sysroot}/usr/include
cp -rv usr/include/* ${sysroot}/usr/include
cd ../
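The naming mismatch between GNU triples and the kernel's ARCH value can be captured in a small lookup. This is a sketch only: `kernel_arch` is a hypothetical helper, and it covers just a handful of common architectures, so the arch subdirectory of the kernel source remains the authority for anything not listed.

```shell
# kernel_arch is a hypothetical helper mapping the CPU part of a GNU
# triple to the directory name used under arch/ in the kernel tree.
kernel_arch() {
    case $1 in
        aarch64*)        echo arm64 ;;
        arm*)            echo arm ;;
        riscv32|riscv64) echo riscv ;;
        x86_64|i?86)     echo x86 ;;
        powerpc*)        echo powerpc ;;
        mips*)           echo mips ;;
        *)               echo "unknown arch: $1" >&2; return 1 ;;
    esac
}

kernel_arch riscv64    # prints "riscv"
kernel_arch aarch64    # prints "arm64"
```

With that in place the headers step could be written as `make ARCH=$(kernel_arch ${arch}) headers`.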
Since the C library is going to be built using the bootstrap compiler that we created earlier, we need to prepend the directory where those binaries reside to our $PATH environment variable in order for the build system to find them.
export PATH=${prefix}/bin:${PATH}
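Before starting the libc build it can be worth confirming that the shell really does find the bootstrap tools. A minimal sketch, assuming a hypothetical helper named `have_tools` and falling back to this guide's riscv64 triple when `${tgt}` is not set:

```shell
# have_tools is a hypothetical helper: succeed only if every named
# program can be found through the current PATH.
have_tools() {
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "not found on PATH: $tool" >&2
            return 1
        fi
    done
}

# Fall back to this guide's triple if ${tgt} is not set in this shell.
t=${tgt:-riscv64-linux-musl}
if have_tools "${t}-gcc" "${t}-ld" "${t}-as"; then
    echo "bootstrap toolchain found"
else
    echo "bootstrap toolchain missing; check ${prefix:-the toolchain prefix}/bin"
fi
```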
Now that we have made the bootstrap compiler accessible in our shell, we can build the C library.
tar -xzf musl-1.2.3.tar.gz
install -dv musl-build
cd musl-build
../musl-1.2.3/configure \
    --target=${tgt} \
    --prefix=/usr \
    --libdir=/lib
make -j4
make DESTDIR=/usr/local/${tgt} install
cd ../
rm -rf musl-build
Notice that this time around we've passed `--prefix=/usr` and `--libdir=/lib` to ./configure. We want our C library to be built with the same hard coded paths as if it were installed as the system library in a real filesystem. Then we set the `DESTDIR` variable when running `make install`, which prepends the path of our sysroot to the actual installation paths.
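To make the split between configured paths and staged paths concrete, here is a small sketch. `staged_path` is a hypothetical helper that only joins strings the way `DESTDIR` does; the two example files are ones musl installs under the `--libdir` and `--prefix` given above.

```shell
# Paths compiled into the library come from --prefix and --libdir;
# DESTDIR only changes where "make install" copies the files.
# staged_path is a hypothetical helper that shows the combination.
staged_path() {
    destdir=$1
    installed=$2
    printf '%s%s\n' "$destdir" "$installed"
}

root=/usr/local/riscv64-linux-musl
staged_path "$root" /lib/libc.so          # the shared C library
staged_path "$root" /usr/include/stdio.h  # a libc header
```

The library itself still believes it lives at `/lib/libc.so`, which is what we want once the sysroot is treated as the root of the target filesystem.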
In this step we reuse the source directory for gcc to build the final cross compiler. There are significantly fewer switches that need to be passed to ./configure this time, because this will be a fully capable compiler for the target system.
install -dv gcc-build
cd gcc-build
../gcc-12.2.0/configure \
    --target=${tgt} \
    --prefix=/usr/local/${tgt}/toolchain \
    --with-sysroot=/usr/local/${tgt} \
    --enable-languages=c,c++ \
    --disable-bootstrap \
    --disable-multilib \
    --disable-libssp
make -j4
make install-strip
cd ../
rm -rf gcc-build
Note: If desired it is completely possible to build compilers for languages other than C and C++ at this stage by appending to the `--enable-languages` list. I have personally had success building both the Fortran and Dlang frontends previously. In particular, if you wish to use this cross compiler to bootstrap a fully functional system it is a good idea to build GFortran, as certain widely used packages such as numpy rely on Fortran code. If you want to build a Fortran compiler, you will need a Fortran compiler, so you should install Gfortran on the host system first. This cross compiler can then be used to build a system compiler which will run on the target system.
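As a rough illustration of extending the language list, the flag can be assembled from a plain list of names. This is a sketch, not the only way to do it; `fortran` is the name Gcc's configure uses for the Fortran frontend, and `d` is the one for Dlang.

```shell
# Build the --enable-languages flag from a space-separated list.
langs="c c++ fortran"
lang_flag="--enable-languages=$(printf '%s' "$langs" | tr ' ' ',')"
echo "$lang_flag"    # prints --enable-languages=c,c++,fortran
```

The resulting value would replace `--enable-languages=c,c++` in the final configure invocation.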
It's a good idea to test out the new compiler by compiling a small `Hello World` program.
cat > hello.c << EOF
#include <stdio.h>
int main() {
    printf("Hello, world!\n");
    return 0;
}
EOF
riscv64-linux-musl-gcc -o hello -static hello.c
file hello
The compilation should succeed without errors, and the last command should reveal that we have a statically linked riscv binary. If you happen to have qemu binaries installed you can even run the resulting program on the host machine.
qemu-riscv64-static hello
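If you would rather script the check than read the output by eye, something like the following could work. It is a sketch under two assumptions: `is_static_riscv` is a hypothetical helper, and the description printed by `file` contains the strings "RISC-V" and "statically linked", which may vary between versions of `file`.

```shell
# is_static_riscv is a hypothetical helper: it inspects a line of
# `file` output and succeeds if it describes a static RISC-V binary.
# Assumption: `file` reports "RISC-V" followed later by "statically linked".
is_static_riscv() {
    case $1 in
        *RISC-V*"statically linked"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Only run the check if the binary and the `file` utility are present.
if [ -f hello ] && command -v file >/dev/null 2>&1; then
    if is_static_riscv "$(file hello)"; then
        echo "hello is a static RISC-V binary"
    else
        echo "unexpected result: $(file hello)"
    fi
fi
```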
All content for this site is released under the CC BY-SA license.
© 2023 by JeanG3nie