On Saturday, I bumped into Rob at a “After Thanksgiving Party” and we discussed the use of X-Grey at Negiyo [1], at least, those parts of Negiyo email that Rob helps to manage.
The code, as is, won't work with their setup. First problem, the sheer volume of email—something like 100,000 connections per second. These are fed through two load balancers and farmed out to about 100 servers, so each server is responsible for 10,000 connections per second. While I suspect X-Grey can handle 10,000 connections per second, the major problem are the load balancers—there's just no guarantee that the load balancers will be consistent on which machine they send the connection to.
For instance, we have some machine, on IP (Internet Protocol) address 10.20.30.40 sending an email from alice@example.net to bob@negiyo.com. The load balancer will send that to server A, which doesn't find the tuple [10.20.30.40 , alice@example.net , bob@negiyo.com], stores it for later reference, and sends back “try again later.” Later, the machine at 10.20.30.40 tries sending the email again, only this time, the load balancer sends the connection to server B, which doesn't find the tuple, stores it, and sends back “try again later.” Lather, rinse, repeat until the sender gives up, or the load balancer manages to send the traffic to a machine that actually has the tuple stored.
There's just no way of knowing which server the load balancer will send the traffic to. So, we point all the servers to a single greylist server, which now has to handle 100,000 requests per second. Okay, so assuming X-Grey can handle that load (it's a real beefy box on a fat pipe), and given that we store greylisted tuples for six hours … carry the one … 2,160,000,000 tuples.
Blink.
Blink.
Okay, now that I'm actually doing the math instead of sitting around in a comfortable chair listening to Rob while chowing down on turkey and stuffing, I find it rather difficult to believe that Negiyo is getting around 8½ billion emails per day—even a billion per day is stretching my credibility. The worst we get at The Company is 8 per second, with an average hovering around 1.4 (or 122,540 per day, which I calculated twice, using two different statistics that are recorded). More believable is 100,000 per hour (or even up to 1,000,000 per hour, which is 11 emails per second).
I'll have to get back with Rob on this …