I was going through my logs (I've been on vacation [1] for the past two weeks) and I noticed a few crashes of mod_blog [2]. It was easy enough to determine that a call to assert() was the culprit (the clue is the call to abort() in the stack trace):
CRASH(32421/000): pid=32421 signal='Aborted'
CRASH(32421/001): reason='Unspecified/untranslated error'
CRASH(32421/002): CS=B7EA0073 DS=007B ES=007B FS=0000 GS=0033
CRASH(32421/003): EIP=B7FE87A2 EFL=00000246 ESP=BFF9AE28 EBP=BFF9AE3C ESI=00007EA5 EDI=B7FAFFF4
CRASH(32421/004): EAX=00000000 EBX=00007EA5 ECX=00007EA5 EDX=00000006
CRASH(32421/005): UESP=BFF9AE28 TRAPNO=00000000 ERR=00000000
CRASH(32421/006): STACK DUMP
CRASH(32421/007): BFF9AE28: A5 07 EB B7 00 00 00 00 F4 FF FA B7 00 00 00 00
CRASH(32421/008): BFF9AE38: C0 86 E8 B7 6C AF F9 BF 09 22 EB B7 06 00 00 00
CRASH(32421/009): BFF9AE48: 50 AE F9 BF 00 00 00 00 20 00 00 00 00 00 00 00
CRASH(32421/010): BFF9AE58: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
CRASH(32421/011): BFF9AE68: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
CRASH(32421/012): BFF9AE78: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
CRASH(32421/013): BFF9AE88: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
CRASH(32421/014): BFF9AE98: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
CRASH(32421/015): BFF9AEA8: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
CRASH(32421/016): BFF9AEB8: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
CRASH(32421/017): BFF9AEC8: 00 00 00 00 00 00 00 00 C7 04 FB B7 C8 04 FB B7
CRASH(32421/018): BFF9AED8: F4 FF FA B7 C7 04 FB B7 80 04 FB B7 08 AF F9 BF
CRASH(32421/019): BFF9AEE8: 28 85 CA 08 F4 FF FA B7 9F 70 EE B7 02 00 00 00
CRASH(32421/020): BFF9AEF8: C8 78 CA 08 4C 00 00 00 C8 78 CA 08 4C 00 00 00
CRASH(32421/021): BFF9AF08: 44 AF F9 BF EC 72 EE B7 80 04 FB B7 C8 78 CA 08
CRASH(32421/022): BFF9AF18: 4C 00 00 00 27 00 00 00 C7 04 FB B7 00 00 00 00
CRASH(32421/023): STACK TRACE
CRASH(32421/024): /home/spc/web/sites/boston.conman.org/htdocs/boston.cgi[0x805ccf0]
CRASH(32421/025): /home/spc/web/sites/boston.conman.org/htdocs/boston.cgi[0x805d46b]
CRASH(32421/026): /lib/tls/libc.so.6[0xb7eb0890]
CRASH(32421/027): /lib/tls/libc.so.6(abort+0xe9)[0xb7eb2209]
The hard part was trying to figure out which of the three calls [3] to assert() was being triggered. Fortunately, there was enough information logged to reproduce the error (for the record, it was assert(month < 13)). Unfortunately, it has to do with the tumbler parsing code [4].
One of the unique features of mod_blog is the “entry addressing scheme,” where you can address not only a single entry like 2018/10/14.1 [5] but a range of entries like 2000/08/10.2-15.5 [6]. In fact, the same code internally changes a reference like 2018/09 [7] to 2018/09/11.1-09/30.1 [8] (the first and last entry in the given month; it also works for days and years). When I wrote the code, I had a particular way of it working in mind, and the bug here comes from my inattention to detail in checking what I've received.
The code in question, when it sees a request of the form “number / number - number”, assumes that the number after the literal “-” is a month and not a year. “The Knowledge AI” program was making a request of 2015/04-2015, so max_monthday() was being given an invalid month (2015), which made assert(month < 13) fail and triggered the crash. That I can fix.
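As a rough illustration of the kind of fix I mean (this is not the actual mod_blog code), the sketch below range-checks the parsed months and rejects the request instead of letting an assert() fire. The names parse_month_range(), days_in_month() and struct month_range are made up for this example, and it only handles the “year/month-month” form. It keeps assert() for programmer errors such as a NULL pointer, and treats anything that came from the request as untrusted input.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical result type for this sketch; not taken from mod_blog. */
struct month_range
{
  int year;
  int first_month;
  int last_month;
};

/* Number of days in a month, or 0 if the month is out of range.
 * Returning 0 lets the caller reject bad input instead of aborting. */
static int days_in_month(int year, int month)
{
  static const int days[] = { 31,28,31,30,31,30,31,31,30,31,30,31 };

  if ((month < 1) || (month > 12))
    return 0;
  if ((month == 2) && (((year % 4 == 0) && (year % 100 != 0)) || (year % 400 == 0)))
    return 29;
  return days[month - 1];
}

/* Parse text of the form "year/month-month".  Returns false for anything
 * else, including a month outside 1..12 or trailing characters. */
static bool parse_month_range(const char *text, struct month_range *out)
{
  int  year;
  int  first;
  int  last;
  char trailing;

  /* These are programmer errors, so assert() is still appropriate here. */
  assert(text != NULL);
  assert(out  != NULL);

  /* Exactly three numbers must convert; a fourth conversion means junk
   * followed the range. */
  if (sscanf(text, "%d/%d-%d%c", &year, &first, &last, &trailing) != 3)
    return false;

  /* Data from the request is validated, never asserted. */
  if ((days_in_month(year, first) == 0) || (days_in_month(year, last) == 0))
    return false;

  out->year        = year;
  out->first_month = first;
  out->last_month  = last;
  return true;
}
```

With this, parse_month_range("2015/04-2015", &range) returns false, and the caller can answer with an error page (a 400 or 404, say) rather than crash.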
But I do question the programming of “The Knowledge AI” crawler. I don't have any links in that form, and I'm not aware of any links on other pages of that form (in fact, that particular feature of entry addressing is not used that often, even by me), so I have to wonder how it got a link like that. Does it try randomly generating links to see what it gets? Is it a bug in their code? It's inexplicable.
[2] https://github.com/spc476/mod_blog