Converting BBCode to gemtext

aka

Linux and low-spec gaming blog

This is my very first Gemini log post, and I've decided to dive right into writing some Gemini code. All the cool kids write their own servers, their own toys and proxies, and so on. It's not a mandatory rite of passage, but it did give me an extra push to commit to contributing to Geminispace, as opposed to just passively consuming feeds.

My evening project is not quite as cool as a server, but it was fun nonetheless.

How I got here

Remember bulletin boards?

I'm a bit too young to fondly remember email, but looking back, I've always preferred slower kinds of 'social media'. I've never understood the appeal of Twitter or Mastodon (and still don't), and I may have enjoyed Reddit, but my true comfort zone has always been forums. Also known as bulletin boards.

A community dedicated to a specific topic. Perfectly balanced. Posts are not too long (unlike blogs, gosh, I'm feeling the pressure now, you're telling me I need to develop *writing skills*?) but also not too short. No full anonymity (which discourages bad-faith behavior) but also no karma system (which discourages the toxic desire to be popular). Easier to use than mailing lists, there, I said it.

Not that it's impossible to use this tech in an unhealthy way: I remember publishing my Minecraft and Terraria mods to the respective communities' forums, and then compulsively monitoring and refreshing the forum pages all day long. But I do still kinda miss it.

How long to beat?

A more recent obsession for me is this website called HowLongToBeat.

https://howlongtobeat.com

It's a game database, the goal of which is to aggregate data about, well, how long it takes to beat a video game. One of those things that I didn't think I wanted until I saw it. After all, any other form of media has a known length (minutes, episodes, pages) but not games. Am I about to invest 100 hours or 10 hours into this adventure? Let's find out!

Sadly, this turned out to be a case of Jevons paradox for me: I use this website to more 'efficiently' plan which games to play (and perhaps not waste time on mediocre games if they're known to take dozens of hours), and yet because of the increased efficiency, I end up spending more time gaming, not less. The act of cataloguing your 'backlog' by itself forces you to build a stronger attachment to your games, plus you get satisfaction from contributing data to a greater cause (almost like contributing actual knowledge to a wiki, but worse).

Wikipedia - Jevons Paradox

Gemini mirror of Wikipedia if you're into that sort of thing

Uhh, anyway, I can't say that I don't like video games, but I needed to figure out a better relationship with this hobby.

(for now, my HLTB profile is public, so you can see my game library there)

Linux and low-spec-ish gaming blog

So, I found out that HowLongToBeat has a forum attached to it. The people turned out to be pretty nice! I suspect this has to do with the nature of the website: it does not focus on speedruns, it does not focus on competitive online gaming or 100% achievement completion - it focuses mainly on single-player experiences. People who enjoy exploring single-player worlds tend to be pretty chill, apparently.

Anyway, I've decided to make my time spent on video games at least somewhat meaningful by promoting Linux and low-spec gaming on the forum via a blog.

gardenapple's Linux and Low Spec gaming on HLTB

People on the forums expressed interest, though by Gemini's standards, this is not at all new or radical.

I have no interest in CPUs or GPUs so some of these specs are complete gibberish to me.

This makes it very close to the typical 'Linux gaming' publication in a market that is, in some sense, over-saturated nowadays. Maybe I could do something to make it slightly cooler?

Converting BBCode to Gemtext

Like most forums, HowLongToBeat's forum uses a markup language called BBCode. It looks like this:

[URL=example.com]This text is a link to a website[/URL] with some [B]bold text[/B], and now some more [I]italics[/I].
[QUOTE=Some user]
I am 'some user' and I said something that one time.[/QUOTE]

It's ugly and I never enjoyed writing it, but I figured that at least it's simple. I could take the 'source' for my blog and convert it to Gemtext via a quick-and-dirty parser.

However, I quickly realized that, while BBCode implementations usually have a minimal set of features, actually rendering it is not easy at all. The intended use-case for it is simply converting it 1:1 to HTML. So all bets are off when it comes to multiline and nested tags.

[URL=example.com]Hyperlinks can (and do) stretch across multiple lines.
[B]Part of the hyperlink can be bold,
[I]and sometimes even bold italic and [U]underlined[/U][/I][/B][/URL]

The heck do I do with this? Sacrifices have to be made when converting this to Gemini, which has no inline formatting.

Luckily, I never promised to write a *compliant* BBCode parser (or even pretended that this project will be generally useful). I can just hack it together and test it only against my own blog/forum posts.
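
To make the 'sacrifice' concrete, here is a rough sketch of one way to flatten inline formatting into plain-text markers. It is written in Python purely for illustration and is not the Gawk script from this post. The function name `flatten_inline` is made up, the `*` and `/` markers follow the pseudo-bold and pseudo-italics style the converted posts end up with, and the `_` marker for underline is only a guess.

```python
import re

# *bold* and /italics/ follow the style of the converted posts;
# _underline_ is an assumption made for this sketch.
MARKERS = {"B": "*", "I": "/", "U": "_"}


def flatten_inline(text):
    """Swap every [B], [I], [U] opening or closing tag for a plain-text marker."""
    def swap(match):
        return MARKERS[match.group(1).upper()]

    # Opening and closing tags get the same marker, so nesting and
    # tags that span several lines need no special handling.
    return re.sub(r"\[/?([BIU])\]", swap, text, flags=re.IGNORECASE)


print(flatten_inline("[B]Part of this is bold, [I]and bold italic[/I][/B]"))
```

Since each tag is replaced on its own, nothing has to be balanced or parsed; the trade-off is that a stray unmatched tag simply leaves a stray marker in the output.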

Awk

I decided to use (G)awk for this. It's both a standard Unix utility and a somewhat robust programming language. Would've been much easier to use a proper language, obviously, but also I've been in a kind of 'programming funk' lately where I just don't feel like diving deeper into any particular programming lang. My one area of expertise is Kotlin & Android but I'd like to get away from that. So instead I spent about an hour sifting through this standard utility's documentation, and many more hours debugging it. Not the worst way to spend an evening?

Wikipedia: AWK is a domain-specific language designed for text processing

Gemini mirror

GAWK is GNU's implementation of AWK with some non-standard quality-of-life additions.

To my surprise, it is actually a fairly robust programming language. It has variables, arrays, ifs, loops, it can open input from files and processes, there is even a GNU extension for doing TCP/IP networking (!), because GNU's Not Unix, dammit!

Anyway, I am not the kind of person to write a full-fledged parser using Awk. My script is mostly very simple search-and-replace commands (à la 'sed'). It uses arrays to store encountered [URL]s within a line, which it then spits out after the end of the line.
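
The 'collect links, print them after the line' idea might look roughly like this. Again, this is a Python sketch under assumptions rather than the real Gawk code: `convert_line` and `URL_TAG` are hypothetical names, and it only handles [URL=...] tags that open and close on the same line.

```python
import re

# Matches [URL=target]label[/URL] within a single line.
URL_TAG = re.compile(r"\[URL=([^\]]+)\](.*?)\[/URL\]", re.IGNORECASE)


def convert_line(line):
    """Return the gemtext lines produced from one BBCode line."""
    links = []  # (target, label) pairs found in this line, in order

    def remember(match):
        links.append((match.group(1), match.group(2)))
        return match.group(2)  # keep only the label in the running text

    text = URL_TAG.sub(remember, line)
    # Gemtext links live on their own lines: "=> target label".
    return [text] + [f"=> {target} {label}" for target, label in links]


for out in convert_line("[URL=example.com]This text is a link[/URL] with more text."):
    print(out)
```

The list plays the role of the array: the text keeps reading naturally, and the link lines follow right after it.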

Multiline [URL]s are generally not supported, though I do occasionally use them in my blog, so I've added specific hacks for that.

For example, one common element in a lot of my blog posts was this:

[URL=example.com][B]Game Title[/B] - OS/Platform
[IMG]game-promotional-artwork-which-is-not-essential.png[/IMG][/URL]

My parser completely rearranges this, turning it into:

Game Title

example.com

(and the image is completely removed, as that was really just fluff)
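
A hack like this can be approximated with a single multiline pattern. The Python sketch below is an illustration, not the actual script: `GAME_BLOCK` and `rearrange_game_blocks` are invented names, and turning the title into a `##` heading and the URL into a `=>` link line is an assumption about what the gemtext should look like.

```python
import re

GAME_BLOCK = re.compile(
    r"\[URL=([^\]]+)\]\[B\](.*?)\[/B\][^\n]*\n"  # target, bold title, rest of line 1
    r"\[IMG\][^\[]*\[/IMG\]\[/URL\]",            # image line, dropped entirely
    re.IGNORECASE,
)


def rearrange_game_blocks(text):
    """Rewrite each two-line game block as a title followed by its link."""
    # Assumption: the title becomes a heading and the target a bare link line.
    return GAME_BLOCK.sub(lambda m: f"## {m.group(2)}\n\n=> {m.group(1)}", text)


source = (
    "[URL=example.com][B]Game Title[/B] - OS/Platform\n"
    "[IMG]game-promotional-artwork-which-is-not-essential.png[/IMG][/URL]"
)
print(rearrange_game_blocks(source))
```

Running this over the whole post before any line-by-line processing keeps the special case out of the way of the simpler rules.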

My parser also features two modes: one for when we're inside a [QUOTE] block, and the default mode.
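
The two modes boil down to a single flag. Here is a minimal Python sketch of that idea, not the Gawk original: `convert` is a hypothetical name, it assumes [QUOTE] opens at the start of a line, it drops anything after [/QUOTE] on the same line, and the 'wrote:' attribution line is invented for the example.

```python
import re

QUOTE_OPEN = re.compile(r"\[QUOTE(?:=([^\]]*))?\]", re.IGNORECASE)
QUOTE_CLOSE = re.compile(r"\[/QUOTE\]", re.IGNORECASE)


def convert(lines):
    """Prefix quoted lines with '> ' and pass everything else through."""
    out = []
    in_quote = False
    for line in lines:
        if not in_quote:
            opened = QUOTE_OPEN.match(line)
            if opened is None:
                out.append(line)  # default mode: pass the line through
                continue
            in_quote = True
            if opened.group(1):
                # Assumed attribution style, not taken from the real script.
                out.append(f"{opened.group(1)} wrote:")
            line = line[opened.end():]
        # Quote mode from here on.
        closed = QUOTE_CLOSE.search(line)
        if closed:
            line = line[:closed.start()]
            in_quote = False
        if line.strip():
            out.append(f"> {line}")  # gemtext quote line
    return out


post = [
    "[QUOTE=Some user]",
    "I am 'some user' and I said something that one time.[/QUOTE]",
    "And this is my reply.",
]
print("\n".join(convert(post)))
```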

You can see the Gawk program here, it even has a few comments:

/play/bb2gem.gawk

Now all that's left is to get the source code for my forum posts... manually... by clicking 'Edit' on every post and copy-pasting the source. Luckily there is not *too* much content, and I guess the manual process gave me plenty of time to debug and adjust the script as it was applied to each post.

Tiny helper script for testing purposes:

/play/bb2gem

Amfora is the nicest-looking Gemini TUI that I'm aware of.

Results

You can compare the input and output here:

Civilization VI - BBCode

Civilization VI - Gemtext

The output is a bit ugly, there is tons of inline formatting like *pseudo-bold text* and /pseudo-italics/, but I suppose that makes it less boring(?) Overall, I'm happy with it, though arguably you have to try really hard to screw up Gemtext.

Do not worry, I will *not* flood Antenna with what I know to be relatively low-quality content. Instead, I will keep a separate feed here:

Linux and low-spec-ish gaming gemlog

Oh, and considering that Gemini users probably have very different standards for hardware, the Gemini version will use the word "low-spec-ish" as opposed to "low-spec" to refer to my computer. Your Pinebook Pro might be cool, but can it run Disco Elysium, huh? (it's okay if it doesn't, I still want to get a proper ARM machine at some point)

-- gardenapple 2021-11-25

Home