💾 Archived View for ecs.d2evs.net › posts › 2024-02-07-awkbot.gmi captured on 2024-08-18 at 17:17:18. Gemini links have been rewritten to link to archived content
if you've hung out around me long enough, you'll know that i have a bit of an unhealthy love of awk. some of that is because it's just a nice language, but a lot of it can be traced back to this one exchange i had on irc a few years ago:
2021-09-22 22:02:51 @etj hm
2021-09-22 22:02:54 @etj acctually
2021-09-22 22:03:06 @etj i think i know the right language for writing microbots in
2021-09-22 22:03:10 @ecs do tell
2021-09-22 22:03:12 @etj awk.
on 2020-04-20, i decided to write an irc bot to display search results from duckduckgo. i'm not really sure why i wanted to do that, but i had this to say about it at the time:
2020-04-20 16:50:54 @ecs let's write a *useful* irc bot
2020-04-20 16:51:13 @ecs something that'll interpret commands like:
2020-04-20 16:51:22 @ecs !ddg this should show the top search result
for reasons that were funny at the time, she ended up being named "qta", and she soon grew to a healthy 1500loc. she could set reminders, forecast the weather, produce fascinating insights with a markov chain dynamically trained on messages sent in the channel, control an mpd server which i'm pretty sure only ever played metallica because look i don't know, and a bunch of other small things i ended up wanting her to do
on 2020-10-01, inspiration struck, and the first¹ of my jsbot clones was born
¹: technically second, but the scheme bot i wrote on a phone in the middle of a 6-week hiking trip and never actually used doesn't really count
Drew's blog post about jsbot (cw some kinda iffy language)
2020-10-01 00:39:57 @ecs it would be nice to have a good™ embeddable scripting language
2020-10-01 00:40:36 @ecs https://github.com/d5/tengo maybe
2020-10-01 00:41:17 @etj !tengo when
2020-10-01 00:42:22 @ecs https://github.com/traefik/yaegi huh
[snip]
2020-10-01 01:43:22 @ecs !go fmt.Println("hello, world!")
2020-10-01 01:43:23 quaternia-test => hello, world!
2020-10-01 01:43:27 @ecs \o/
we developed some infrastructure for writing little irc bots inline in go, but it never worked particularly well. go is a kinda verbose language, and that's a problem when your code needs to fit within a 512-byte irc message. we also hacked together persistent data by manually appending go code to a file in /etc which got sourced on boot, with the idea that if you wanted to, e.g., increment a number, you'd run `foo++` and then append `foo++` to /etc/quaternia/init.go
this worked about as well as you'd expect it to
a year later, etj was complaining about just how bad the !go persistence mechanism was. after having messed with it manually in order to remove a macro that expanded "lol" to "lingerie of love" (don't ask), they were thinking of just removing it entirely ("i really think that the scriptable bot that you write bots in is a bad idea" "or at least langbot shouldn't be the final destination for bots"), and i just thought that "it would be nice to switch to a non-go language", because "livecoding one line at a time over im protocols is the best software development method". etj came up with a half-serious proposal of using awk, and i was "not actually as horrified by that as i feel like i should be", so i got to work implementing it
after a few hours of hacking and a brief break to determine qta's birthday (valentine's day in 1970, apparently):
2021-09-23 03:00:47 @ecs let's see if this works
2021-09-23 03:01:12 @ecs .awk add test /^\.awkping$/ { print("pong") }
2021-09-23 03:01:12 +qta success
2021-09-23 03:01:16 @ecs .awkping
2021-09-23 03:01:16 +qta pong
(note: the syntax for adding snippets has changed since these logs)
initially, i was worried about persistence
2021-09-23 03:07:25 @ecs one caveat about this is that
2021-09-23 03:08:03 @ecs .awk add foo BEGIN { foo = 0 } /bar/ { foo++; print foo; }
2021-09-23 03:08:03 +qta success
2021-09-23 03:08:07 @ecs bar
2021-09-23 03:08:07 +qta 1
2021-09-23 03:08:08 @ecs bar
2021-09-23 03:08:09 +qta 1
2021-09-23 03:08:24 @ecs it doesn't retain state
but i've since come around. both gobot and jsbot have issues with persistence being optional: it's possible to write programs which look like they work, but which lose data when the bot is rebooted. because awkbot's awk context isn't kept around between messages, you're forced to use the postgres database it provides bindings for if you want to keep any data around at all, and rebooting the bot is guaranteed to never break anything
once i'd written the initial version of awkbot, i slowly started making it more and more powerful in order to be able to rewrite more and more of the original bot in awk. i even managed to rewrite half of the bot itself in awk - the interface for adding, listing, and removing awk snippets was originally written in go, but once i added a function to run arbitrary sql queries from awk², i was able to delete those 127 lines of code
²: not a trivial task. goawk has ffi, but go functions that're callable from awk can only take in and return primitive types and strings, so i had to do some creative escaping in order to pull the argument and result arrays across that boundary
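to give a feel for what "creative escaping" across a strings-only boundary can look like, here's a minimal go sketch. it is not awkbot's actual scheme: the `|` separator, the backslash escape, and the names `pack` and `unpack` are all made up for illustration, and the goawk side is left out entirely

```go
package main

import (
	"fmt"
	"strings"
)

// sep is a hypothetical field separator; awkbot's real encoding isn't
// described in the post, so treat this whole scheme as an assumption.
const sep = '|'

// pack flattens a list of strings into one string. a separator or a
// backslash inside a field is prefixed with a backslash so that unpack
// can tell it apart from a real field boundary.
func pack(fields []string) string {
	var b strings.Builder
	for i, f := range fields {
		if i > 0 {
			b.WriteByte(sep)
		}
		for j := 0; j < len(f); j++ {
			if f[j] == sep || f[j] == '\\' {
				b.WriteByte('\\')
			}
			b.WriteByte(f[j])
		}
	}
	return b.String()
}

// unpack reverses pack. note that an empty list and a list holding one
// empty string both pack to "", so unpack("") returns one empty field.
func unpack(s string) []string {
	fields := []string{}
	var cur strings.Builder
	escaped := false
	for i := 0; i < len(s); i++ {
		c := s[i]
		switch {
		case escaped:
			cur.WriteByte(c)
			escaped = false
		case c == '\\':
			escaped = true
		case c == sep:
			fields = append(fields, cur.String())
			cur.Reset()
		default:
			cur.WriteByte(c)
		}
	}
	return append(fields, cur.String())
}

func main() {
	row := []string{"a|b", `c\d`, ""}
	wire := pack(row)
	fmt.Println(wire)
	fmt.Printf("%q\n", unpack(wire))
}
```

the same idea works in the other direction: the awk side would split on unescaped separators to rebuild an array from the single string it gets back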
one substantial improvement i managed to make over jsbot is the ability to execute code at an arbitrary time, rather than being limited to replying to messages. you can call `at(date, cmd)` to add `cmd` to a table, marked with the timestamp `date`. every 10 seconds, the go code executes the snippet named "__ontick__", which looks through that table and executes any code whose timestamp is in the past. another hacked-together system sits on top of this for implementing repeating commands, allowing you to, for example, print "hi" every 10 minutes by running `.cron now "in 10m" '{ print "hi" }'`. don't ask how that works, you don't want to know
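the at/tick mechanism described above can be sketched roughly like this. it's a simplified stand-in under stated assumptions: the real table lives in postgres and the real tick runs an awk snippet, whereas here the "table" is a slice, timestamps are plain unix seconds, and the names `schedule`, `at`, and `tick` are hypothetical

```go
package main

import (
	"fmt"
	"sort"
)

// job is one row of the (hypothetical) schedule table.
type job struct {
	due int64  // unix seconds
	cmd string // code to run once due has passed
}

// schedule stands in for the database table.
type schedule struct {
	jobs []job
}

// at records cmd to be run at or after the timestamp due.
func (s *schedule) at(due int64, cmd string) {
	s.jobs = append(s.jobs, job{due: due, cmd: cmd})
}

// tick removes and returns every command whose timestamp is not after
// now, oldest first. anything still in the future stays in the table.
func (s *schedule) tick(now int64) []string {
	sort.SliceStable(s.jobs, func(i, j int) bool {
		return s.jobs[i].due < s.jobs[j].due
	})
	var ready []string
	var pending []job
	for _, j := range s.jobs {
		if j.due > now {
			pending = append(pending, j)
		} else {
			ready = append(ready, j.cmd)
		}
	}
	s.jobs = pending
	return ready
}

func main() {
	var s schedule
	start := int64(1000) // arbitrary starting timestamp for the demo
	s.at(start+5, `{ print "hi" }`)
	s.at(start+3600, `{ print "later" }`)

	// simulate one 10-second tick instead of actually sleeping
	for _, cmd := range s.tick(start + 10) {
		fmt.Println("would run:", cmd)
	}
}
```

a repeating command could be layered on top by having each job re-add itself with a later timestamp when it runs, though that's a guess at the general shape rather than a description of how `.cron` actually does it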
somewhere along the line we decided that it'd be a good idea to give awkbot an http client, so now there's some awk code to print out url titles, interface with the schedule api for my local public transit agency, check the weather, and a bunch more stuff. she also knows how to parse json, xml, and html, though i'm thinking of trying to rewrite some of that in awk
the old bot weighed in at around 2.2kloc at her peak, and i finished rewriting the last part (my rss reader³) in awk on 2022-12-10. the new bot consists of 156 snippets of awk code totalling 29,981 characters, and 428 lines (9,155 characters) of go code. she's grown a brainfuck interpreter, an implementation of the geohashing algorithm, half of a cube timer, and dozens of other things i don't have time to list
³: said rss reader once got me a politely worded email from someone whose blog i followed asking me to please stop hammering her rss feed once every 10 seconds. i fixed it, and now qta only hammers rss feeds once every 10 minutes
every so often i come back and hack on her some more, but for the most part qta's just become part of my life at this point. sometimes i decide to add another organizational tool to her, and on occasion some of them even get a bit of use. i recently sorted out some race conditions in the output-channel management⁴, because a youtube rss feed outage was causing her to yell at me really really loudly and incessantly in dms. i also just rewrote the geocoding, reverse geocoding, shlexing, and human-friendly datetime parsing bits in awk, which shaved off around 80 lines of go code and got rid of quite a few dependencies
⁴: the go code provides a setchan() function, which controls the channel that data is printed to, but because user code used to run inside a `print eval(...)`, that data wasn't actually sent to the pseudo-stdout until it finished evaluating, which meant that if you did something along the lines of `print "hi"; setchan("#a")`, the "hi" would be sent to #a. in order to solve this i added a "passthrough" parameter to eval(), which tells it to immediately write everything to the channel in addition to buffering it up to be returned as a string. then i discovered that it still didn't work because for some reason i was having goawk print things into an io.Pipe which a different goroutine was reading from, rather than just having an io.Writer which writes messages to irc. anyways it works now, though i didn't quite manage to fix it before youtube fixed their feeds
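the ordering problem in that footnote is easier to see in miniature. the go sketch below is an illustration only, not awkbot's code: `chanWriter`, `eval`, and `demo` are hypothetical names, there's no goawk or irc involved, and the channel switch is just a field assignment playing the role of setchan()

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"strings"
)

// chanWriter stands in for an io.Writer that sends lines to irc. it only
// records what would have been sent, tagged with the current channel.
type chanWriter struct {
	channel string
	sent    []string
}

func (w *chanWriter) Write(p []byte) (int, error) {
	w.sent = append(w.sent, w.channel+": "+strings.TrimRight(string(p), "\n"))
	return len(p), nil
}

// eval runs body and returns everything it printed. with passthrough
// set, each write also goes to live immediately, so a channel switch
// later in body can't redirect output that was already printed.
func eval(live io.Writer, passthrough bool, body func(out io.Writer)) string {
	var buf bytes.Buffer
	var out io.Writer = &buf
	if passthrough {
		out = io.MultiWriter(live, &buf)
	}
	body(out)
	return buf.String()
}

// demo plays out `print "hi"; setchan("#a")` and reports the buffered
// result plus what reached irc. without passthrough, the caller prints
// the result afterwards, mimicking the old `print eval(...)` wrapper.
func demo(passthrough bool) (string, []string) {
	irc := &chanWriter{channel: "#default"}
	res := eval(irc, passthrough, func(out io.Writer) {
		fmt.Fprintln(out, "hi")
		irc.channel = "#a" // plays the role of setchan("#a")
	})
	if !passthrough {
		fmt.Fprint(irc, res)
	}
	return res, irc.sent
}

func main() {
	_, buffered := demo(false)
	_, direct := demo(true)
	fmt.Printf("buffered: %q\n", buffered)
	fmt.Printf("passthrough: %q\n", direct)
}
```

in the buffered case the "hi" only reaches the writer after the channel has changed, which is the bug; with passthrough it goes out while the original channel is still selected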
there's no actual moral to this story. you probably shouldn't ever use this code, but the source code is linked below if you're curious. bye!