💾 Archived View for thrig.me › blog › 2024 › 05 › 19 › gemfeed2atom.gmi captured on 2024-06-19 at 22:49:06. Gemini links have been rewritten to link to archived content

gemfeed2atom

Previously I dabbled with shell and Perl for a minimal conversion of a gemfeed to Atom, with various assumptions and simplifications. This version applies the same minimalism in C:

gemfeed2atom.c
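
To give a rough idea of what such a conversion involves, here is a minimal sketch of the parsing step. It is not the contents of gemfeed2atom.c. It assumes the common gemfeed convention of link lines shaped like "=> URL YYYY-MM-DD - Title"; the names struct entry, is_date, and parse_entry are made up for illustration, as are the buffer sizes.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical record for one feed entry; the sizes are arbitrary. */
struct entry {
    char url[1024];
    char date[11];   /* "YYYY-MM-DD" plus the terminating NUL */
    char title[1024];
};

/* Return 1 if s begins with something shaped like YYYY-MM-DD.
 * The loop stops at the first mismatch, so a short string is safe. */
static int is_date(const char *s)
{
    for (int i = 0; i < 10; i++) {
        if (i == 4 || i == 7) {
            if (s[i] != '-')
                return 0;
        } else if (!isdigit((unsigned char)s[i])) {
            return 0;
        }
    }
    return 1;
}

/* Parse "=> URL YYYY-MM-DD[ - ]Title" into e.
 * Returns 1 on success, 0 if the line is not a dated link line.
 * Over-long titles are silently truncated (a simplification). */
static int parse_entry(const char *line, struct entry *e)
{
    size_t n;

    assert(line != NULL && e != NULL);
    if (strncmp(line, "=>", 2) != 0)
        return 0;
    line += 2;
    line += strspn(line, " \t");

    n = strcspn(line, " \t\r\n");          /* the URL ends at whitespace */
    if (n == 0 || n >= sizeof e->url)
        return 0;
    memcpy(e->url, line, n);
    e->url[n] = '\0';
    line += n;
    line += strspn(line, " \t");

    if (!is_date(line))                    /* undated links are not entries */
        return 0;
    memcpy(e->date, line, 10);
    e->date[10] = '\0';
    line += 10;

    line += strspn(line, " \t");           /* optional " - " separator */
    if (*line == '-')
        line++;
    line += strspn(line, " \t");

    n = strcspn(line, "\r\n");
    if (n >= sizeof e->title)
        n = sizeof e->title - 1;
    memcpy(e->title, line, n);
    e->title[n] = '\0';
    return 1;
}
```

A caller would read index.gmi a line at a time (getline(3) or fgets(3)), pass each line to parse_entry, and emit an Atom entry for each line that returns 1. Links without a date, headings, and plain text are skipped.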

Efficiency

If judged by "time spent putting the code together", the C is very inefficient. On the CPU (and memory usage) front, we can compare the C with a library-using Perl script.

gem2atom.pl

    $ fgrep -c '=> ' index.gmi
    360
    $ time ./gemfeed2atom -b x index.gmi >/dev/null
        0m00.01s real     0m00.01s user     0m00.00s system
    $ time ./gem2atom.pl -b x index.gmi >/dev/null
        0m03.86s real     0m02.97s user     0m00.81s system

That was with the CPU set as slow as possible, which is how I usually have it set, except when playing silly games or doing something with audio that needs more CPU. With the CPU turned all the way up (this makes the fans run), the runtime is less bad for Perl:

    $ time ./gemfeed2atom -b x index.gmi >/dev/null
        0m00.00s real     0m00.00s user     0m00.00s system
    $ time ./gem2atom.pl -b x index.gmi >/dev/null
        0m00.55s real     0m00.45s user     0m00.08s system

Even so, the speed difference is large enough that a longer benchmark is not needed; languages or implementations that are closer in speed would need longer and more careful checks to suss out whether there really is a noteworthy difference.

Some of the slowness here is due to FFS on OpenBSD, especially when large numbers of files are involved, as is common for Perl scripts or other such languages that pull in various modules. Hence my porting a lot of my scripts to C; the spinny metal hard drive on the 2009 MacBook (RIP 2022) also demanded such speed improvements. The various modules may also be prone to security or supply chain attacks, and can add up to quite a lot of code that several someones would ideally need to review for any such problems (but probably will not). On the other hand, the C is pretty fragile, not very flexible, and may not generate correct XML, so there are tradeoffs here.
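
One way hand-rolled XML goes wrong is unescaped text in titles. What follows is a minimal sketch of escaping the five predefined XML entities, not necessarily what gemfeed2atom.c does. The function name xml_escape and its buffer-based interface are assumptions for illustration, and it does nothing about control characters that XML 1.0 forbids or about invalid UTF-8.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy src into dst, replacing & < > " ' with their XML entities.
 * Returns 0 on success, -1 if dst (dstlen bytes) is too small.
 * On failure the contents of dst are unspecified. */
static int xml_escape(char *dst, size_t dstlen, const char *src)
{
    size_t used = 0;

    assert(dst != NULL && src != NULL);
    if (dstlen == 0)
        return -1;
    for (; *src != '\0'; src++) {
        char one[2] = { *src, '\0' };  /* ordinary characters pass through */
        const char *rep;
        size_t len;

        switch (*src) {
        case '&':  rep = "&amp;";  break;
        case '<':  rep = "&lt;";   break;
        case '>':  rep = "&gt;";   break;
        case '"':  rep = "&quot;"; break;
        case '\'': rep = "&apos;"; break;
        default:   rep = one;      break;
        }
        len = strlen(rep);
        if (used + len >= dstlen)      /* keep one byte for the NUL */
            return -1;
        memcpy(dst + used, rep, len);
        used += len;
    }
    dst[used] = '\0';
    return 0;
}
```

A title such as C & "Perl" <fast> would come out as C &amp; &quot;Perl&quot; &lt;fast&gt;, which is safe both as element text and inside an attribute value.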

P.S. I also have a Go version of the script, but that pulls in random Atom and RSS modules, which fail to put an ultimate '\n' at the end of the file, and most of my Go scripts eventually end up being rewritten in C.

P.P.S. I'm dogfooding the C code here, so if the Atom is broke, that's probably why.