More thoughts on optimizing a greylist daemon

I ran the updated stress test [1] on a faster (2.6 GHz) machine and managed to get some impressive results.

I ran the test three different ways. In the first, the stress program sends a request and waits for the reply. This was by far the slowest of the tests, but the most reliable (in terms of actually processing every request), with the greylist daemon [2] handling between 4,000 and 6,300 tuples per second. In the second, a separate process waits for the replies; that goes faster, between 11,000 and 17,000 tuples per second, but drops a ton of requests (on the order of 70%). The last doesn't even bother with replies. It does both the best and the worst: 30,000 tuples per second, but it drops something like 90%.
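For the first option, here is a minimal sketch of what "send a request and wait for a reply" can look like. It is not the stress program's actual code: the function name request_and_wait(), the use of a connected datagram socket, and the timeout are all assumptions made for illustration.

```c
#include <poll.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: send one request on a connected datagram socket,
 * then block until a reply arrives or timeout_ms elapses.
 * Returns the reply length, 0 on timeout (the request counts as dropped),
 * or -1 on error.  A zero-length reply is indistinguishable from a
 * timeout in this sketch. */
static ssize_t request_and_wait(
    int fd, const void *req, size_t reqlen,
    void *reply, size_t replymax, int timeout_ms)
{
  struct pollfd pfd = { .fd = fd, .events = POLLIN };
  int           rc;

  if (send(fd, req, reqlen, 0) < 0)
    return -1;

  /* Nothing else is sent until this returns, so the sender can never
   * get ahead of the daemon; that is why this mode is slow but reliable. */
  rc = poll(&pfd, 1, timeout_ms);
  if (rc < 0)
    return -1;
  if (rc == 0)
    return 0;

  return recv(fd, reply, replymax, 0);
}
```

The other two options drop the wait: the sender keeps sending and the replies are either read by another process or ignored, which would explain both the higher rate and the dropped requests.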

So, the program can easily handle about 5,000 requests per second on a nice server, which is probably way more than most SMTP (Simple Mail Transfer Protocol) servers can handle (and it's much nicer than the 130/second I thought it could handle).

I profiled the program again, and this time, got actual results I could use:

Table: Each sample counts as 0.01 seconds.
% time	cumulative seconds	self seconds	calls	self Ts/call	total Ts/call	name
------------------------------
21.24	0.48	0.48	2260060	0.00	0.00	crc32
14.38	0.81	0.33	443203	0.00	0.00	tuple_search
11.51	1.07	0.26	565012	0.00	0.00	ip_match
8.85	1.27	0.20	565012	0.00	0.00	type_graylist
7.97	1.45	0.18	1	0.18	2.20	mainloop
6.64	1.60	0.15	565015	0.00	0.00	send_packet
4.87	1.71	0.11	7648182	0.00	0.00	tuple_cmp_ift
4.87	1.82	0.11	565012	0.00	0.00	graylist_sanitize_req
3.98	1.91	0.09	1761756	0.00	0.00	edomain_search
3.54	1.99	0.08	2637054	0.00	0.00	edomain_cmp
3.10	2.06	0.07	421359	0.00	0.00	tuple_add
2.21	2.11	0.05	565012	0.00	0.00	send_reply
2.21	2.16	0.05	1	0.05	0.05	whitelist_dump_stream
0.89	2.18	0.02	565127	0.00	0.00	ipv4
------------------------------

Again, nothing terribly surprising here, except for the code gcc generated for the crc32() function (two lines of C code, one of which is while(size--)). But I used the default compiler settings; if it really bothers me, I can turn up the optimization settings and see what I get.
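For reference, a table-driven CRC-32 really does come down to a two-line loop around while(size--). The sketch below is not the daemon's code: the name crc32_sketch(), the lazily built table, and the choice of the common reflected polynomial 0xEDB88320 are assumptions for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Lookup table for the reflected CRC-32 polynomial 0xEDB88320,
 * filled in on first use. */
static uint32_t crc_table[256];
static int      crc_table_ready = 0;

static void crc32_init_table(void)
{
  for (uint32_t i = 0; i < 256; i++)
  {
    uint32_t c = i;
    for (int bit = 0; bit < 8; bit++)
      c = (c & 1) ? (c >> 1) ^ UINT32_C(0xEDB88320) : (c >> 1);
    crc_table[i] = c;
  }
  crc_table_ready = 1;
}

/* Hypothetical stand-in for the daemon's crc32().
 * Pass 0 as crc to start; pass a previous result to continue. */
static uint32_t crc32_sketch(uint32_t crc, const void *data, size_t size)
{
  const unsigned char *p = data;

  if (!crc_table_ready)
    crc32_init_table();

  crc = ~crc;
  while (size--)  /* the two-line inner loop: one table lookup per byte */
    crc = crc_table[(crc ^ *p++) & 0xFF] ^ (crc >> 8);
  return ~crc;
}

/* Usage: the standard check value for the ASCII string "123456789". */
static void crc32_sketch_check(void)
{
  assert(crc32_sketch(0, "123456789", 9) == UINT32_C(0xCBF43926));
}
```

With a loop this small, the 2,260,060 calls in the profile are mostly per-byte work, so the quality of the generated code for those two lines matters.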

[1] /boston/2007/10/29.2

[2] /boston/2007/08/16.1
