LaBrea [1] is actually logging about half a gig a day [2]. Over a 24-hour period (from about 6 am Thursday to 6 am today) I tarpitted 82,359 connections from 2,059 unique IP addresses (24,252 of those connections came from a single IP address). And while the number of network ports being accessed has increased a bit, it's the Microsoft-specific ports (139, 445, 1433 and 135) that are still the most popular targets, with 72% of the connections:
Table: Top 10 ports captured by LaBrea in the past 24 hours

Port #  Port description                                        # connections
-----------------------------------------------------------------------------
139     NetBIOS (Basic Input/Output System) Session Service             24941
445     Microsoft-DS (Directory Service?) Service                       23013
1433    Microsoft SQL (Structured Query Language) Server                 6772
4899    Remote Administration [3]                                        5620
135     Microsoft-RPC (Remote Procedure Call) service                    4722
80      Hypertext Transfer Protocol                                      3697
8080    Hypertext Transfer Protocol (typical alternative port)           1686
7212    (unknown)                                                        1683
8000    (unknown)                                                        1471
10000   (some web-based control panels use this port)                     951
-----------------------------------------------------------------------------
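For anyone who wants to check the 72% figure, here is a quick sketch of the arithmetic in Python (not the Perl I actually used). The counts are copied from the table, and treating ports 139, 445, 1433 and 135 as the "Microsoft-specific" group is my reading of the port descriptions.

```python
# Total connections tarpitted over the 24-hour period (from the text).
TOTAL_CONNECTIONS = 82_359

# Connection counts for the Microsoft-specific ports, copied from the table.
MICROSOFT_PORTS = {
    139: 24_941,   # NetBIOS Session Service
    445: 23_013,   # Microsoft-DS
    1433: 6_772,   # Microsoft SQL Server
    135: 4_722,    # Microsoft-RPC
}

microsoft_total = sum(MICROSOFT_PORTS.values())
microsoft_share = microsoft_total / TOTAL_CONNECTIONS

print(f"{microsoft_total} of {TOTAL_CONNECTIONS} connections ({microsoft_share:.0%})")
# prints "59448 of 82359 connections (72%)"
```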
The program I'm using to generate the stats is written in Perl, and it took about 4 hours to run over a day's worth of data (the machine that does the tarpitting isn't the fastest machine we have, but it's more than enough to dedicate to just running LaBrea). I definitely want to write a program to process LaBrea data in real time.
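As a rough idea of what that real-time program might look like, here is a minimal sketch in Python. It keeps running tallies and updates them one log line at a time, instead of rereading a whole day of data. The line format it expects (source IP, source port, an arrow, destination IP, destination port) is an assumption for illustration, as are the `RunningStats` class and the sample lines; the pattern would need adjusting to whatever the real log actually contains.

```python
import re
from collections import Counter

# ASSUMPTION: each connection line contains
#   "<src_ip> <src_port> -> <dst_ip> <dst_port>"
# This is a hypothetical format; adjust the pattern to the real log.
CONNECTION = re.compile(
    r"(?P<src>\d{1,3}(?:\.\d{1,3}){3}) \d+ -> "
    r"\d{1,3}(?:\.\d{1,3}){3} (?P<port>\d+)"
)


class RunningStats:
    """Running tallies, updated one log line at a time."""

    def __init__(self):
        self.connections = 0
        self.by_port = Counter()    # destination port -> connection count
        self.by_source = Counter()  # source IP -> connection count

    def feed(self, line):
        """Update the tallies from one line; return True if it matched."""
        match = CONNECTION.search(line)
        if match is None:
            return False  # not a connection line, ignore it
        self.connections += 1
        self.by_port[int(match.group("port"))] += 1
        self.by_source[match.group("src")] += 1
        return True

    def top_ports(self, n=10):
        """Return the n most-hit ports as (port, count) pairs."""
        return self.by_port.most_common(n)


# Made-up sample lines (documentation-only addresses), not real log data.
SAMPLE = [
    "tarpit: 192.0.2.10 1042 -> 198.51.100.5 139",
    "tarpit: 192.0.2.10 1043 -> 198.51.100.6 139",
    "tarpit: 192.0.2.77 2210 -> 198.51.100.5 445",
    "a line that is not a connection",
]

stats = RunningStats()
for line in SAMPLE:
    stats.feed(line)

print(stats.connections, len(stats.by_source), stats.top_ports(2))
```

In live use the loop would read from `sys.stdin` rather than `SAMPLE`, with the log piped in through something like `tail -f`, so the top 10 is available at any moment without a four-hour batch run.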