I have seen this many, many times: something runs fast during development, and maybe even during testing, because the data used in testing was too small and didn't match real-world conditions. If you are working on a small set of data, everything is fast, even slow things.
“Software performance with large sets of data [1]”
The primary test I use with the greylist daemon [2] is (I think) brutal. I have a list of 27,155 tuples (real tuples, logged from my own SMTP (Simple Mail Transfer Protocol) server) of which 25,261 are unique. When I run the greylist daemon, I use an embargo timeout of one second (to ensure a significant number of tuples make it to the whitelist), a greylist timeout of at least two minutes, with the cleanup code (which checks for expired records and removes them) running every minute. Then to run the actual test, I pump the tuple list through a small program that reformats the tuples into the form the Postfix [3] module expects, which then connects to the daemon. There is no delay in the sending of these tuples: we're talking thousands of tuples per minute, for several minutes, being pumped through the greylist daemon.
Now, there are several lists the tuple is compared against. I have a list of IP (Internet Protocol) addresses that will cause the daemon to accept or reject the tuple. I can check the sender email address or sender domain, and the recipient email address or domain. If it passes all those, then I check the actual tuple list. The IP list is a trie [4] (nice for searching through IP blocks). The other lists are all sorted arrays, using a custom binary search [5] to help with inserting new records.
Any request that the server can handle immediately (say, checking a tuple, or returning the current config or statistics to the Greylist Daemon Master Control Program [6]) is done in the main processing loop; for longer operations (like sending back a list of tuples to the Master Control Program) it calls fork() and the child process handles the request.
I haven't actually profiled the program, but at this point, I haven't had a need to. It doesn't drop a request, even when I run the same grueling test on a 150MHz (megahertz) PC (Personal Computer).
I just might though … it would be interesting to see the results.
[1] http://spinthecat.blogspot.com/2007/09/software-performance-with-large-