A month ago, I re-evaluated the use of SPF (Sender Policy Framework) as an anti-spam measure [1] and found it wanting. Today, I decided to re-evaluate my stance on the various real-time blackhole lists [2] that exist. I was reluctant to use an RBL (Real-time Blackhole List) because over-aggressive classification of even the smallest infractions could lead to false positives (wanted email being rejected as spam). It has been over a decade since I first rejected the idea, and I was curious to see just how it would all shake out.
I used the Wikipedia list of RBLs [3] as a starting point, figuring it would be pretty up-to-date. I then dumped information from my greylist daemon [4]. The idea is to see how much additional spam would be caught if, after getting a “GO!” from the greylist daemon, I do an RBL check.
Out of the current 2,830 entries, only 145 had not been whitelisted. I didn't filter these out before running the test, but I don't think it would throw off the results too much. Half an hour of coding later, and I had a simple script to query the various RBLs for each unique IP (Internet Protocol) address (1,446). I let it run for a few hours, as it had quite a few queries to make: 1,446 IP addresses, each one requiring one query to see if the IP address is a known spammer and a possible second one for the reason, across 45 RBL servers. It took a while.
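For illustration, here is a minimal Python sketch of the first of those two queries. It is not the script used for this test. It assumes the usual DNSBL convention (reverse the IPv4 octets, append the list's zone, and look for an A record), and the function names `dnsbl_query_name` and `is_listed` are made up for this example. The second query, for the reason, is normally a TXT lookup, which Python's standard library cannot do, so it is left out.

```python
import ipaddress
import socket


def dnsbl_query_name(ip, zone):
    """Build the DNS name to look up: reversed IPv4 octets, then the zone."""
    octets = str(ipaddress.IPv4Address(ip)).split(".")
    return ".".join(reversed(octets)) + "." + zone.rstrip(".") + "."


def is_listed(ip, zone, resolve=socket.gethostbyname):
    """Return True if the zone answers with an A record for this IP.

    Assumption: any successful answer counts as a hit. A failed lookup
    (including a resolver error) is treated as "not listed", which is a
    simplification; a real checker should tell the two apart.
    """
    try:
        resolve(dnsbl_query_name(ip, zone))
    except socket.gaierror:
        return False
    return True
```

Calling `dnsbl_query_name("66.220.155.136", "ix.dnsbl.manitu.net.")` gives `136.155.220.66.ix.dnsbl.manitu.net.`, which is the name that would actually be sent to the resolver. The `resolve` parameter exists only so the lookup can be swapped out when testing without a network.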
First up, how many “spam” results did I get from each RBL:
Table: Results from each RBL

RBL                            hits   reasons given
---------------------------------------------------
truncate.gbudb.net.             108     108
dnsbl.proxybl.org.                0       0
dnsbl-1.uceprotect.net.         132     132
dnsbl-2.uceprotect.net.         145     145
dnsbl-3.uceprotect.net.          23      23
dnsbl.sorbs.net.                 65      65
safe.dnsbl.sorbs.net.            65      65
http.dnsbl.sorbs.net.             0       0
socks.dnsbl.sorbs.net.            0       0
misc.dnsbl.sorbs.net.             0       0
smtp.dnsbl.sorbs.net.             0       0
web.dnsbl.sorbs.net.             21      21
new.spam.dnsbl.sorbs.net.        37      37
recent.spam.dnsbl.sorbs.net.    184     184
old.spam.dbsbl.sorbs.net.         0       0
spam.dbsbl.sorbs.net.             0       0
escalations.dbsbl.sorbs.net.      0       0
block.dnsbl.sorbs.net.            0       0
zombie.dbsbl.sorbs.net.           0       0
dui.dnsbl.sorbs.net.              0       0
rhsbl.sorbs.net.                  0       0
badconf.rhsbl.sorbs.net.          0       0
nomail.rhsbl.sorbs.net.           0       0
sbl.spamhaus.org.               293     293
xbl.spamhaus.org.                53      53
pbf.spamhaus.org.                 0       0
cbl.abuseat.org.                 36      37
psbi.surriel.com.                 0       0
intercept.datapacket.net.       186     186
db.wpbi.info.                     0       0
bl.spamcop.net.                  65      65
noptr.spamrats.com.             224     224
dyna.spamrats.com.              208     208
spam.spamrats.com.               15      15
bl.spamcannibal.org.             96      96
spamtrap.drbl.drand.net.          0       0
blacklist.hostkama.com.           0       0
dnsbl.dronebl.org.                2       2
list.quorum.to.                1309    1309
ix.dnsbl.manitu.net.             48      48
dnsbl.inps.de.                  627     627
bl.blocklist.de.                  6       6
srnblack.surgate.net.            21      21
all.s5h.net.                    363     363
rbl.megarbl.net.                 54      54
As you can see, some of them were not worth querying. Also, about list.quorum.to … it's not straightforward to use that server [5], as it always sent back a result even when the others did not. I ultimately decided to treat any result whose only “hit” came from list.quorum.to as “non-spam” because of these issues.
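The list.quorum.to rule is simple enough to state as code. This is only a sketch of the rule as described above, not the original script; the names `UNRELIABLE` and `marked_as_spam` are invented for the example, and `hits` is assumed to be the collection of RBL zones that returned a result for one entry.

```python
# Zones whose answer is ignored when nothing else reports a hit.
UNRELIABLE = {"list.quorum.to."}


def marked_as_spam(hits):
    """Return True if at least one zone other than the unreliable ones hit."""
    return bool(set(hits) - UNRELIABLE)
```

So an entry flagged only by list.quorum.to counts as “non-spam”, while one flagged by list.quorum.to and, say, sbl.spamhaus.org still counts as spam.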
I then proceeded to pore through all 2,830 results.
Table: Email classification from RBLs

Marked as SPAM        2739   97%
Not marked as SPAM      91    3%
Total                 2830  100%
And out of the 91 that were not marked as spam, only 7 were spam not marked by any of the RBLs. Not bad. But the real test is false positives: email marked as spam that isn't. And unfortunately, there were a few:
Table: False positives

bounce.twitter.com          10
icpbounce.com                2
bounce.linkedin.com          3
returns.groups.yahoo.com     8
facebookmail.com             6
Total                       29
Now, I realize that some of my readers might very well consider email from Twitter [6] or Facebook [7] as spam, but hey, don't judge me!
Ahem.
Anyway, that's a problem for me. I will occasionally have issues with the greylisting (rare, but it does happen, and I have to explicitly authorize the email when I become aware of the issue), but it's even worse with this. For instance:
66.220.155.148 update+iedcilif@facebookmail.com XXXXXXXXXXXXX@conman.org GO!
66.220.155.151 update+iedcilif@facebookmail.com XXXXXXXXXXXXX@conman.org GO!
66.220.155.143 update+iedcilif@facebookmail.com XXXXXXXXXXXXX@conman.org GO!
66.220.155.136 update+iedcilif@facebookmail.com XXXXXXXXXXXXX@conman.org FAIL ix.dnsbl.manitu.net.
66.220.155.172 update+iedcilif@facebookmail.com XXXXXXXXXXXXX@conman.org GO!
66.220.155.140 update+iedcilif@facebookmail.com XXXXXXXXXXXXX@conman.org GO!
66.220.155.142 update+iedcilif@facebookmail.com XXXXXXXXXXXXX@conman.org FAIL intercept.datapacket.net. ix.dnsbl.manitu.net.
66.220.155.144 update+iedcilif@facebookmail.com XXXXXXXXXXXXX@conman.org FAIL ix.dnsbl.manitu.net.
66.220.155.137 update+iedcilif@facebookmail.com XXXXXXXXXXXXX@conman.org FAIL recent.spam.dnsbl.sorbs.net. ix.dnsbl.manitu.net.
66.220.155.141 update+iedcilif@facebookmail.com XXXXXXXXXXXXX@conman.org FAIL dnsbl.dronebl.org. ix.dnsbl.manitu.net.
66.220.155.147 update+iedcilif@facebookmail.com XXXXXXXXXXXXX@conman.org GO!
66.220.155.152 notification+iedcilif@facebookmail.com XXXXXXXXXXXXX@conman.org GO!
66.220.155.152 update+iedcilif@facebookmail.com XXXXXXXXXXXXX@conman.org GO!
66.220.155.150 update+iedcilif@facebookmail.com XXXXXXXXXXXXX@conman.org FAIL ix.dnsbl.manitu.net.
66.220.155.138 update+iedcilif@facebookmail.com XXXXXXXXXXXXX@conman.org GO!
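A quick way to summarize output like the log above is to tally the verdicts and the zones that caused each FAIL. The following Python sketch assumes each record is whitespace-separated as IP address, sender, recipient, verdict, and then zero or more RBL zones; those field names are my labels, not anything the log itself states. `SAMPLE` holds four records copied from the log, and `parse_line` and `tally` are hypothetical helpers.

```python
from collections import Counter

# Four records copied from the log above.
SAMPLE = """\
66.220.155.148 update+iedcilif@facebookmail.com XXXXXXXXXXXXX@conman.org GO!
66.220.155.136 update+iedcilif@facebookmail.com XXXXXXXXXXXXX@conman.org FAIL ix.dnsbl.manitu.net.
66.220.155.142 update+iedcilif@facebookmail.com XXXXXXXXXXXXX@conman.org FAIL intercept.datapacket.net. ix.dnsbl.manitu.net.
66.220.155.152 notification+iedcilif@facebookmail.com XXXXXXXXXXXXX@conman.org GO!
"""


def parse_line(line):
    """Split one record; anything after the verdict is a list of RBL zones."""
    ip, sender, recipient, verdict, *zones = line.split()
    return ip, sender, recipient, verdict, zones


def tally(text):
    """Count verdicts and how often each zone appears across all records."""
    verdicts = Counter()
    zones_hit = Counter()
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        _ip, _sender, _recipient, verdict, zones = parse_line(line)
        verdicts[verdict] += 1
        zones_hit.update(zones)
    return verdicts, zones_hit


verdicts, zones_hit = tally(SAMPLE)
print(verdicts)
print(zones_hit)
```

Counted by hand over all fifteen records in the log, that works out to 9 GO! and 6 FAIL, with ix.dnsbl.manitu.net. present in every one of the FAIL records.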
It's hit-or-miss within the IP range Facebook uses to send email. This would make troubleshooting quite difficult. I could whitelist the problematic domains but for any new site I might want to receive email from, I would have to watch the logs very closely for issues like this. But it's not as bad as I thought it would be, and it would cut out a lot of the spam I do get. It's tempting.
I shall have to think about this.
[2] http://en.wikipedia.org/wiki/DNSBL
[3] http://en.wikipedia.org/wiki/Comparison_of_DNS_blacklists