

📎 Making C Code Uglier

By Artyom Bologov

[Figure: engraving with a hell landscape filled with people and demons chopping, spiking, eating, and otherwise harming each other.]

Caption: A group of C programmers arguing about which indentation style is better. Just kidding, it's "Anger" by Pieter van der Heyden, 1558.

C++ is practical, yet sometimes scary.

C is outright frightening. If someone writes code in C++, they must be smart.

If someone writes code in C, they must be crazy (well, at least I am.)

But still, C—with its guts full of eldritch horrors—is

the lingua franca of programming and the most portable assembly language.

C is readable enough to most programmers, because most mainstream languages are C progenies.

Pointers and macros are loathsome, but they are rare enough (are they?) to ignore.

So how scary can C code get?

Not for production use, but rather as an exercise in aesthetics.

This post goes through a set of things that can convolute/obfuscate C code,

from the minute details to critical readability losses.

Note that some obvious things like

are not mentioned to leave space for the scarier ones.

Test Program (#test-program)

I'm going to use a slightly modified

C version of the Trabb Pardo-Knuth algorithm

from Wikipedia because it's small enough

while still showcasing most C constructs and features:

#include <math.h>
#include <stdio.h>

double f (double t)
{
    return sqrt(fabs(t)) + 5 * pow(t, 3);
}

void tpk (void)
{
    double y, a[11] = {0};
    for (int i = 0; i < 11; i++)
        scanf("%lf", &a[i]);

    for (int i = 10; i >= 0; i--)
        if ((y = f(a[i])) > 400)
            printf("%d TOO LARGE\n", i);
        else
            printf("%d %.16g\n", i, y);
}

int main (void)
{
    tpk();
    return 0;
}

Benign: Indentation and Bracket Placement Style (#style)

Two or four spaces? Eight? Or three, maybe?

Or—God almighty—tabs?

C code styles are numerous

and these styles have only one thing in common:

all the styles are mutually incompatible and un-aesthetic.

No matter which style one prefers—they're delusional and wrong,

at least to the ones exhorting another style.

I use Linux kernel style, which might make you scream from the 8-space-wide tabs. But I'm not surrendering it.

As a matter of example, I'll use the Pico indentation style (four/five spaces)

and bracket placement (before the first expression and after the last one.)

Plus added spaces mimicking the GLib style:

double
f (double t)
{ return
    sqrt (fabs (t)) + 5 * pow (t, 3); }

Ugh, block scope and control flow are illegible now.

Confusing: Subscripts (#subscripts)

A queer behavior of the standard array subscripts:

the index and array parts can be swapped:

double a[11] = {0}, y;
for (int i = 0; i < 11; i++)
    scanf ("%lf", &i[a]);

This reversal is modest, but nonetheless galling.

An exercise to the reader: can you find the exact spot where the subscript is reversed?
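
The swap is legal because C defines a[i] as *((a) + (i)), and that addition commutes. Here is a minimal sketch of the equivalence; same_element is a made-up helper for illustration and is not part of the TPK program:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: true when every spelling names the same element. */
bool
same_element (const double *a, size_t i)
{
    /* a[i] means *(a + i); i[a] means *(i + a). Same address either way. */
    return &a[i] == a + i
        && &i[a] == a + i
        && &a[i] == &i[a];
}
```

Calling same_element (xs, 2) on any array with at least three elements should return true, which is why 2[xs] and xs[2] read the same value.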

Antiquated: K&R style (#k-n-r-style)

That's where the post gets shuddery.

K&R style, or, as they call it, "I don't understand old C code".

double
f(t)
double t;
{ return
    sqrt (fabs (t)) + 5 * pow (t, 3); }

This style

Luckily, C23 finally removes it,

after more than thirty years of yielding to the horror

and maintaining it in deprecated status.

Smart: Recursion (#recursion)

Reordering and refactoring functions is always fun.

So how about turning all the for-loops into recursion?

Recursion is cool, I've heard.

So here's a recursive rendering of the number printing loop:

void
print_nums(a, i)
double *a;
int i;
{ if (i < 0)
         return;
     double y = f (i[a]);
     if (y > 400)
         printf ("%d TOO LARGE\n", i);
     else
         printf ("%d %.16g\n", i, y);
     print_nums (a, --i); }

Five more code lines, lots of stack frames (unless you have tail call elimination),

and overall less comprehensible control flow. Yay!
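
To see the loop-to-recursion correspondence without the I/O, here is a sketch that sums the array instead of printing it. The names sum_loop and sum_rec are made up, and the accumulator parameter is my own addition: it puts the recursive call in tail position, which is the shape a compiler can turn back into a jump. C itself never promises that optimization.

```c
/* Loop version: visit a[n-1] down to a[0], like the printing loop. */
double
sum_loop (const double *a, int n)
{
    double acc = 0;
    for (int i = n - 1; i >= 0; i--)
        acc += a[i];
    return acc;
}

/* Recursive version: the call is the last thing the function does,
   so nothing is left to do in this frame once it returns. */
double
sum_rec (const double *a, int i, double acc)
{
    if (i < 0)
        return acc;
    return sum_rec (a, i - 1, acc + a[i]);
}
```

Starting the recursion with sum_rec (a, n - 1, 0) visits the same elements in the same order as sum_loop (a, n).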

<details>

<summary>

Recursion is good, actually

</summary>

Like some un-aesthetic and alienating changes this page lists, recursion might be useful. It can make your algorithms simple and powerful when done right. I often use recursion when writing Lisp. But I can relate to people seeing it as vile and perplexing.

</details>

Terse: Ternaries (#ternaries)

This is my favorite: switching from if-else to ternaries.

It's shorter, expression-only, and it makes code look more daunting.

And there's a rumor that compilers increase the optimization level

when they see ternaries.

Likely out of regard for the programmer's bravery.

void
print_nums (a, i)
double *a;
int i;
{ double y;
     (i < 0) ? 0 :
               (y = f (i[a]),
                (y > 400 ? printf ("%d TOO LARGE\n", i) :
                           printf ("%d %.16g\n", i, y)),
                print_nums (a, --i)); }

If only the comma operator allowed variable declarations

(wink wink C standard committee),

this function might've had no <code>double y</code> in it either.

But, for now, let this stateful statement stay there.
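
The two building blocks of that function can be tried in isolation. This is only a sketch: verdict and bump_then_double are hypothetical names, not taken from the TPK code. The ternary selects a value, and the comma operator evaluates its left operand, discards the result, and yields the right one.

```c
/* Hypothetical helper: the if-else from print_nums as a single expression. */
const char *
verdict (double y)
{
    return y > 400 ? "TOO LARGE" : "OK";
}

/* Comma operator: the left side runs first (there is a sequence point),
   then the value of the right side becomes the value of the whole thing. */
int
bump_then_double (int *counter)
{
    return (*counter += 1, *counter * 2);
}
```

With a counter holding 3, bump_then_double leaves 4 in the counter and returns 8. That ordering guarantee is what lets the ternary version assign y before testing it.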

<details>

<summary>

Ternaries are good, actually

</summary>

I like ternary-formatted code because it forces side-effect-free algorithms where I want them. It's even more useful in other C-like languages because they have less restrictive blocks and more abstractions compatible with functional style.

</details>

Unconventional: Delimiter-First Code (#delimiter-first)

There are reasons one can use

leading-delimiter style in SQL and Haskell.

But in other languages...

void
print_nums (a, i)
double *a;
int i;
{ double y;
     (i < 0)
     ? 0
     : (y = f (i[a])
        , (y > 400
           ? printf ("%d TOO LARGE\n"
                     , i)
           : printf ("%d %.16g\n"
                     , i, y))
        , print_nums (a, --i)); }

I like how the ternaries become more pronounced

and how it promotes a functional-ish style.

But I bet your eyes are already hemorrhaging,

so feel free to ignore my aesthetic preferences.

Awful: Alternative representations (#alt-representations)

That's the most horrifying one:

C has alternatives to some characters

that weren't there at the time of the first standard.

There are two-character (digraphs)

and three-character (trigraphs, removed in C23) encodings

for <code>[</code>, <code>^</code>, <code>{</code> etc.

Here's a table of transformations:

C char   Digraph   Trigraph
{        <%        ??<
}        %>        ??>
[        <:        ??(
]        :>        ??)
#        %:        ??=
\        (none)    ??/
^        (none)    ??'
|        (none)    ??!
~        (none)    ??-

Table: All the digraphs and trigraphs with their decoding

And here's the code with encoded parts:

void
read_nums (a, i)
double *a;
int i;
<% if (i == 11)
    <% return; %>
    else
    <% scanf ("%lf", &i<:a:>);
        read_nums (a, ++i);%> %>

And that's just digraphs, trigraphs are even worse!
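
Digraphs are ordinary tokens, so, as far as I know, they need no special compiler flag (unlike trigraphs and the -trigraphs switch used later in this post). Here is a small sketch using only digraphs; sum_all and COUNT are names I made up for illustration:

```c
%:define COUNT 3

/* Same as: double sum_all (const double a[COUNT]) { ... } */
double
sum_all (const double a<:COUNT:>)
<%
    double total = 0;
    for (int i = 0; i < COUNT; i++)
        total += a<:i:>;
    return total;
%>
```

Because digraphs are tokens and not textual replacements, a "%>" inside a string literal stays exactly as written.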

<details>

<summary>

Alternative representations are good, actually

</summary>

There is a more useful side to alternative encodings.

<code><iso646.h></code> provides spelled-out operator names,

far more readable than the symbolic operators:

&&, and
&=, and_eq
&, bitand
|, bitor
~, compl
!, not
!=, not_eq
||, or
|=, or_eq
^, xor
^=, xor_eq

Even though it's atypical, I'm tempted to use these in my projects.
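
Here is a short sketch of how that reads in practice. The functions in_range and flip_byte are hypothetical examples, not part of the TPK program:

```c
#include <iso646.h>
#include <stdbool.h>

/* Hypothetical predicate: lo <= y <= hi, spelled with words. */
bool
in_range (double y, double lo, double hi)
{
    return y >= lo and y <= hi;
}

/* Invert the low eight bits: compl is ~, bitand is &. */
unsigned
flip_byte (unsigned x)
{
    return compl x bitand 0xFFu;
}
```

The names are plain macros for the usual operators, so precedence is unchanged: compl x bitand 0xFFu parses as (~x) & 0xFFu.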

</details>

Wrapping Up (#wrapping-up)

Here's the final code for the TPK algorithm. It compiles under Clang 13.0.1 on my <code>x86_64-unknown-linux-gnu</code> (the exact command is <code>clang tpk.c -trigraphs -lm</code>).

%:include <math.h>
??=include <stdio.h>

double
f(t)
double t;
??< return
    sqrt (fabs (t))
    + 5
    * pow (t
           , 3); %>

void
read_nums(a, i)
double *a;
int i;
<% if (i == 11)
    <% return; %>
    else
    ??< scanf ("%lf"
               , &i<:a??));
        read_nums (a
                   , ++i);%> %>


void
print_nums(a, i)
double *a;
int i;
<% double y;
     (i < 0)
     ? 0
     : (y = f (i??(a:>)
        , (y > 400
           ? printf ("%d TOO LARGE\n"
                     , i)
           : printf ("%d %.16g\n"
                     , i, y))
        , print_nums (a
                      , --i)); ??>

void tpk ()
??< double a<:11:> = ??<0??>
           , y;
    read_nums (a
               , 0);
    print_nums (a
                , 10);
    /* Absolutely unnecessary, but irritating. */
    return; %>

int main ()
<% tpk();
    return 0; ??>

If you want some job security as a C or C++ programmer,

you might use some of the things discussed above.

But in any other scenario: you don't want to write code this way!

Be kind to each other, even when y'all write chthonic C code.

Update: some commenters on Reddit mentioned

IOCCC

as an additional inspiration and further research direction.

This post is by no means exhaustive,

and you will likely find many more gory details if you explore IOCCC.

Another update:

u/insanelygreat

shared an absolutely horrendous piece of code

and set of macros that turn C into something BASIC.

Here's a small piece of code from their

comment you should read in full:

/* check for meta chars */
BEGIN
   REG BOOL slash; slash=0;
   WHILE !fngchar(*cs)
   DO IF *cs++==0
        THEN IF rflg ANDF slash THEN break; ELSE return(0) FI
        ELIF *cs=='/'
        THEN slash++;
        FI
   OD
END
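
Keywords like these can be had with a handful of object-like macros. The definitions below are my own guess at the general technique, written for illustration; they are not the macros from the quoted comment, and count_slashes is a made-up function:

```c
/* Hypothetical macros: each keyword expands to the C punctuation it hides. */
#define IF    if (
#define THEN  ) {
#define FI    }
#define WHILE while (
#define DO    ) {
#define OD    }

/* Count '/' characters in a NUL-terminated string. */
int
count_slashes (const char *cs)
{
    int n = 0;
    WHILE *cs != '\0'
    DO IF *cs == '/'
       THEN n++;
       FI
       cs++;
    OD
    return n;
}
```

After preprocessing, the body is an ordinary while loop with an if inside, so count_slashes ("a/b/c") returns 2.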


Copyright 2022-2024 Artyom Bologov (aartaka).
