💾 Archived View for arcanesciences.com › gemlog › 22-07-28 captured on 2024-09-29 at 00:37:38. Gemini links have been rewritten to link to archived content
Code Density Compared Between Way Too Many Instruction Sets
A lot of code density claims I see online ("RISC-V code density is best in class", "x86 code density is better than any RISC") have always struck me as unlikely and inconsistent with what I've seen in the trenches. After I tried and failed to find a modern comparison covering a broad range of instruction sets, I decided to run my own. The cool-kid approach would be to use SPEC or similar and look at density alongside dynamic and static instruction counts, but I have a deep-seated loathing of both SPECtools and the subtests themselves, and had no desire to try to make them build for m68k or Xtensa. (nb: SPEC is actually a great benchmark - the best available. It just isn't always much fun.) Instead, I did it the janky way: do a buildroot run with -Os and as few changes to default settings as possible, then count the bytes in the busybox executable. The results were unsurprising in places (Thumb2 being excellent, for instance), but I was surprised to see just how terrible the density of the "classic" RISCs is.
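The measurement step itself is tiny, and can be sketched in a few lines of Python. This is a minimal sketch and not necessarily how the numbers in this post were gathered: the path output/target/bin/busybox assumes a stock buildroot output tree, the names DEFAULT_BUSYBOX and binary_size are made up for illustration, and it measures plain on-disk file size, which is only one reading of "count the bytes".

```python
import os

# Assumed location of busybox in a stock buildroot output tree
# (an assumption for illustration; adjust to your own output directory).
DEFAULT_BUSYBOX = os.path.join("output", "target", "bin", "busybox")


def binary_size(path):
    """Return the on-disk size of the file at `path`, in bytes."""
    return os.stat(path).st_size


if __name__ == "__main__":
    # Only report a size if a build has actually produced the binary.
    if os.path.exists(DEFAULT_BUSYBOX):
        print(f"{DEFAULT_BUSYBOX}: {binary_size(DEFAULT_BUSYBOX):,} bytes")
    else:
        print(f"{DEFAULT_BUSYBOX} not found; run a buildroot build first")
```

File size includes ELF headers and data sections as well as code, so it is a coarse proxy for instruction density; that coarseness is part of what makes this the janky way.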
Without further ado, here's the table, one build per line: ISA, busybox size, and build settings. CSky didn't finish building and I wasn't particularly in a mood to diagnose it, so it's not included. (Sorry, CSky.) Every other supported ISA has at least one result. Where buildroot offered further options, I preferred little-endian and tried to match a common embedded config.
- ARC LE - 623,912 bytes - 8K pages, HS38 Quad MAC + FPU target
- ARM Thumb1 LE - 632,004 bytes - Thumb, softfloat, ARM926T target
- ARM Thumb2 LE - 599,248 bytes - Thumb2, VFPv4-D16, Cortex-A7 target
- ARM64 LE - 779,936 bytes - Cortex-A76 target, FP-ARMv8
- x86 - 713,916 bytes - i686 target
- m68k - 698,776 bytes - M68040 target
- Microblaze LE - 1,223,148 bytes - No target specific settings available
- MIPS32 LE - 1,017,196 bytes - P5600 target, no softfloat
- MIPS64 LE - 989,024 bytes - P6600 target, no softfloat, n32 ABI
- NDS32 - 888,124 bytes - No target specific settings available
- Nios II - 892,316 bytes - No target specific settings available
- OpenRISC - 1,133,164 bytes - No target specific settings available
- PPC32 - 984,396 bytes - 476FP target
- PPC64LE - 985,552 bytes - Power8 target
- RV64G - 934,368 bytes - RV64G platform defaults
- RV64GC - 741,856 bytes - lp64d ABI
- RV32GC - 719,916 bytes - ilp32d ABI
- S390X - 907,712 bytes - z15 target
- SH-4A LE - 842,884 bytes - No target specific settings
- SPARC32 - 931,064 bytes - SPARCv8
- SPARC64 - 997,624 bytes - SPARCv9
- x86_64 - 747,224 bytes - Haswell target
- Xtensa - 1,216,228 bytes - fsf target
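Since the list above is in no particular order, here is a short sketch that ranks the results and expresses each one relative to the smallest binary. The byte counts are copied verbatim from the list; the names SIZES, ranked, and relative_to are my own for illustration.

```python
# Busybox sizes in bytes, copied from the list above.
SIZES = {
    "ARC LE": 623_912,
    "ARM Thumb1 LE": 632_004,
    "ARM Thumb2 LE": 599_248,
    "ARM64 LE": 779_936,
    "x86": 713_916,
    "m68k": 698_776,
    "Microblaze LE": 1_223_148,
    "MIPS32 LE": 1_017_196,
    "MIPS64 LE": 989_024,
    "NDS32": 888_124,
    "Nios II": 892_316,
    "OpenRISC": 1_133_164,
    "PPC32": 984_396,
    "PPC64LE": 985_552,
    "RV64G": 934_368,
    "RV64GC": 741_856,
    "RV32GC": 719_916,
    "S390X": 907_712,
    "SH-4A LE": 842_884,
    "SPARC32": 931_064,
    "SPARC64": 997_624,
    "x86_64": 747_224,
    "Xtensa": 1_216_228,
}


def ranked(sizes):
    """Return (name, bytes) pairs sorted from smallest to largest binary."""
    return sorted(sizes.items(), key=lambda item: item[1])


def relative_to(sizes, baseline):
    """Return each size as a multiple of the baseline entry's size."""
    base = sizes[baseline]
    return {name: size / base for name, size in sizes.items()}


if __name__ == "__main__":
    order = ranked(SIZES)
    smallest = order[0][0]
    ratios = relative_to(SIZES, smallest)
    for name, size in order:
        print(f"{name:<14} {size:>10,} bytes  {ratios[name]:.2f}x {smallest}")
```

Normalizing against the smallest build makes the spread easy to see: the largest binary comes out at roughly twice the size of the smallest.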
Observations
- Z (S390X) code density is worse than I expected from a rich variable-length ISA. IBM has large enough L1s that they probably don't need to care much, though.
- ARM64 code density, for a 64-bit ISA with fixed-length 32-bit instructions, is actually pretty good.
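- x86 isn't as dense as it's cracked up to be: the i686 build lands behind Thumb2, ARC, Thumb1, and m68k, and x86_64 is also behind both compressed RISC-V builds.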
- Microblaze, OpenRISC, and MIPS are very sparse. Compressed RISC-V density, while not as good as some of its hype would suggest, is a massive improvement over OpenRISC.
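To put rough numbers on a few of these observations, here is a small sketch that computes pairwise differences. The byte counts come from the list above; the helper name percent_smaller is made up for illustration, and the comparisons are only as meaningful as the single busybox build behind each figure.

```python
def percent_smaller(small, large):
    """Return how much smaller `small` is than `large`, as a percentage of `large`."""
    return (large - small) / large * 100.0


# Byte counts copied from the results list.
RV64G = 934_368
RV64GC = 741_856
OPENRISC = 1_133_164
X86 = 713_916
THUMB2 = 599_248

if __name__ == "__main__":
    # Effect of the compressed extension on the same 64-bit RISC-V base.
    print(f"RV64GC vs RV64G:    {percent_smaller(RV64GC, RV64G):.1f}% smaller")
    # Compressed RISC-V against OpenRISC.
    print(f"RV64GC vs OpenRISC: {percent_smaller(RV64GC, OPENRISC):.1f}% smaller")
    # Thumb2 against i686.
    print(f"Thumb2 vs x86:      {percent_smaller(THUMB2, X86):.1f}% smaller")
```

On these figures the compressed extension shrinks the RV64 build by about a fifth, and the compressed RV64 build is about a third smaller than the OpenRISC one.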