More stupid benchmarks about compiling a million lines of code

I'm looking at the code GCC (Gnu's Not Unix Compiler Collection) [1] produced for the 32-bit system (I cut down the number of lines of code [2]):

 804836b:       68 ac 8e 04 08          push   0x8048eac
 8048370:       e8 2b ff ff ff          call   80482a0 <puts@plt>
 8048375:       68 ac 8e 04 08          push   0x8048eac
 804837a:       e8 21 ff ff ff          call   80482a0 <puts@plt>
 804837f:       68 ac 8e 04 08          push   0x8048eac
 8048384:       e8 17 ff ff ff          call   80482a0 <puts@plt>
 8048389:       68 ac 8e 04 08          push   0x8048eac
 804838e:       e8 0d ff ff ff          call   80482a0 <puts@plt>
 8048393:       68 ac 8e 04 08          push   0x8048eac
 8048398:       e8 03 ff ff ff          call   80482a0 <puts@plt>
 804839d:       68 ac 8e 04 08          push   0x8048eac
 80483a2:       e8 f9 fe ff ff          call   80482a0 <puts@plt>
 80483a7:       68 ac 8e 04 08          push   0x8048eac
 80483ac:       e8 ef fe ff ff          call   80482a0 <puts@plt>
 80483b1:       68 ac 8e 04 08          push   0x8048eac
 80483b6:       e8 e5 fe ff ff          call   80482a0 <puts@plt>
 80483bb:       83 c4 20                add    esp,0x20

My initial thought was, "Why doesn't GCC just push the address once?" but then I remembered that in C, function parameters can be modified. That led me down a slight rabbit hole to see whether printf() (with my particular version of GCC) even changes its parameters. It turns out that no, it doesn't (your mileage may vary though). So with that in mind, I wrote the following assembly code:

        bits    32
        global  main
        extern  printf

        section .rodata
msg:
                db      'Hello, world!',10,0

        section .text
main:
                push    msg
                call    printf
                ;; 1,199,998 more calls to printf
                call    printf
                pop     eax
                xor     eax,eax
                ret

Yes, I cheated a bit by not repeatedly pushing and popping the stack. But I was also interested in seeing how well nasm [3] fares compiling 1.2 million lines of code. Not too badly, compared to GCC:

[spc]lucy:/tmp>time nasm -f elf32 -o pg.o pg.a

real    0m38.018s
user    0m37.821s
sys     0m0.199s
[spc]lucy:/tmp>

I don't even need to generate a 17 MB assembly file, though; nasm can do the repetition for me:

        bits    32
        global  main
        extern  printf

        section .rodata

msg:            db      'Hello, world!',10,0

        section .text

main:           push    msg
        %rep 1200000
                call    printf
        %endrep

                pop     eax
                xor     eax,eax
                ret

It can skip reading 16,799,971 bytes and assemble the entire thing in 25 seconds:

[spc]lucy:/tmp>time nasm -f elf32 -o pf.o pf.a

real    0m24.830s
user    0m24.677s
sys     0m0.144s
[spc]lucy:/tmp>

Nice. But then I was curious about Lua. So I generated 1.2 million lines of Lua:

print("Hello, world!")
-- 1,199,998 more calls to print()
print("Hello, world!")

And timed how long it took Lua to load (but not run) the 1.2 million lines of code:

[spc]lucy:/tmp>time lua zz.lua
function: 0x9c36838

real    0m1.666s
user    0m1.614s
sys     0m0.053s
[spc]lucy:/tmp>

Sweet!

[1] https://gcc.gnu.org/

[2] /boston/2019/10/04.2

[3] https://nasm.us/
