Internet access at Chez Boca was non-existant today (scuttlebut: a fibre cut, which is currently the third one in about two months time) and without the Intarweb pipes, I can't work (good news! I get a day off! Bad news! I can't surf the web!), so I figured I would take the time to get a few personal projects out the door.
First one up—a new version of the greylist daemon [1] has been released. a few bugs are fixed—the first being an error codition wasn't properly sent back to the gld-mcp (Greylist Daemon Master Control Program). The second one a segmentation fault (not fatal actually—the greylist daemon [2] restarts itself on a segfault) if it recieved too many requests (by “too many” I mean “the tuple storage is filled to capacity”—when that happens, it just dumps everything it has and starts over, but I forgot to take that into account elsewhere in the code). The last prevented the Postfix [3] interface from logging any syslog messages (I think I misunderstood how setlogmask() worked).
The other changes are small (a more pedantic value for the number of seconds per year, adding sequence numbers to all logged messages (that's another post) and set the version number directly from source control) but the reason I'm pushing out version 1.0.12 now (well, aside from there was nothing else to do yesterday) is related to the one outstanding bug (that I know of) in the program. That bug deals with bulk data transfers (Mark [4] has been bit by it; I haven't) and I suspect I know the problem, but the solution to that problem requires a incompatible change to the program.
Well, okay—the easy solution requires an incompatible change to the program. The problem is the protocol used between the greylist daemon and the master control program. The easy solution is to just change the protocol and be done with it; the harder solution would be to work the change into the existing protocol, and that could be done, but I'm not sure if it's worth the work. The number of people (that I know of) using the greylist daemon I can count on the fingers of one hand, so an incompatible change wouldn't affect that many people.
Then again, I might not hear the end of it from Mark.
In any case, there are more metrics I want to track (more on that below) and those would require a change to the protocol as well (or more technically, an extention to the protocol). The addtional metrics may possibly help with some long term plans that involve making the greylist daemon multithreaded (yes, the main page states I can get over 6,000 requests a second; that's a very conservative value—drop the amount of logging and I can handle over 32,000/second, all on a single core).
Some of the new metrics involve tracking the the lowest and highest number of tuples during any given period of time. This should help with fine-tuning the --max-tuples parameter (currently defaults to 65,536). I've noticed over the years that I don't seem to get much past 3,000 tuples at any one time, but I would like to make sure before I tweak the value on my mail server.
The other metrics I want to track are the number of tuple searches (or “reads”) and the number of tuple updates (additions or deletions, but in other words, “writes”). These metrics should help if I decide to go multithreaded with the greylist daemon, and how best to store the tuples depending if the application is read heavy or write heavy (but my guess is that reads and writes are nearly equal, which presents a whole set of challenges). But it's not like the program isn't fast as it is—while I claim over 6,000 requests per second, that's a rather conservative figure—drop the logging to a minimum and it can handle over 30,000 requests per second on a single core. I would be interested to see if I can improve on that with additional cores.
Since the there are quite a few changes that require protocol changes, I decided to just make the last release of the 1x line, and start work on version 2 of the greylist daemon—or maybe 1.5 and leave 2x for the multithreaded version.