Maybe this time I'll get it

I think I nailed that heisenbug [1] in the greylist daemon [2]. Given that it somehow ends up in the weeds (as a friend of mine used to say), I decided on a lark to delete all logging from the program.

Okay, it's not as insane as it sounds. The program supports logging to either syslogd or to stdout, selectable at runtime. To support this, I use a function pointer to store which logging routine to use (why I do this is a topic for another time). The functions themselves work very much like printf(), meaning they take a variable number of arguments and a format string describing the type of each argument.
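For the curious, here is a minimal sketch of that arrangement. It is not the daemon's actual code: the names log_fn, log_stdout, log_syslog, g_log and select_logger are made up for illustration, and vsyslog() is a common BSD/glibc extension rather than part of POSIX.

```c
#define _DEFAULT_SOURCE          /* exposes vsyslog() on glibc */
#include <stdarg.h>
#include <stdio.h>
#include <syslog.h>

/* Hypothetical type: a printf()-style logging routine. */
typedef void (*log_fn)(int level, const char *fmt, ...);

/* Log to stdout, prefixing each line with the numeric level. */
static void log_stdout(int level, const char *fmt, ...)
{
    va_list ap;

    va_start(ap, fmt);
    printf("<%d> ", level);
    vprintf(fmt, ap);
    putchar('\n');
    va_end(ap);
}

/* Log to syslogd by handing the argument list to vsyslog(). */
static void log_syslog(int level, const char *fmt, ...)
{
    va_list ap;

    va_start(ap, fmt);
    vsyslog(level, fmt, ap);
    va_end(ap);
}

/* The routine currently in use; the rest of the program calls g_log(). */
static log_fn g_log = log_stdout;

/* Pick the logging routine at runtime (say, from a command line option). */
static void select_logger(int use_syslog)
{
    g_log = use_syslog ? log_syslog : log_stdout;
}
```

Every call then looks like g_log(LOG_INFO, "%s connected", host), and nothing at the call site says whether the arguments agree with the format string.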

The easiest way to test the hypothesis that a bad logging call was the culprit was to rip out that code (only on the production server).

Six hours later, it's still running, which is a very good sign.

I then audited the code, and yes, there were a few type mismatches and one instance of a mismatched number of parameters. Fixed those up, and restarted the greylist daemon.
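This class of bug can be caught at compile time rather than by audit. The sketch below assumes GCC or Clang, whose format attribute (a compiler extension, not standard C) makes -Wformat check the arguments against the format string, even when the call goes through a function pointer. The names PRINTF_LIKE, format_line and format_fn are hypothetical, and the function writes into a buffer only to keep the example small.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Expands to the GCC/Clang format attribute, or to nothing elsewhere. */
#if defined(__GNUC__)
#  define PRINTF_LIKE(f, a) __attribute__((format(printf, f, a)))
#else
#  define PRINTF_LIKE(f, a)
#endif

/* Argument 3 is the format string; the variable arguments start at 4. */
static int format_line(char *buf, size_t size, const char *fmt, ...)
    PRINTF_LIKE(3, 4);

static int format_line(char *buf, size_t size, const char *fmt, ...)
{
    va_list ap;
    int     n;

    va_start(ap, fmt);
    n = vsnprintf(buf, size, fmt, ap);
    va_end(ap);
    return n;
}

/* The attribute goes on the pointer as well, so indirect calls get checked. */
static int (*format_fn)(char *, size_t, const char *, ...)
    PRINTF_LIKE(3, 4) = format_line;

/*
 * With the attribute in place, -Wall (which enables -Wformat) should flag
 * calls like these, the two kinds of mistake the audit turned up:
 *
 *   format_fn(buf, sizeof buf, "%s:%d", 25, "host");   wrong types
 *   format_fn(buf, sizeof buf, "%s:%d", "host");       too few arguments
 */
```

Without the attribute, both of those bad calls compile silently and the behavior at runtime is undefined, which fits a bug that only shows up now and then.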

Hopefully this will fix the problem.

Update a few hours later …

[stupid bugs] [3]

[1] /boston/2007/10/16.1

[2] /boston/2007/08/16.1

[3] /boston/2005/07/14/love_your_job.gif
