Are you tired of network tarpits yet?

You would not believe how hard it was to write a binary search that returned the correct index for a missing record in an array.

“Some notes on a binary search implementation [1]”

A week later, and I finally have it working.

One technique used to debug a program is to have another program that does the same thing, but implemented using a different method or language (or both), and compare the results. So that's what I did: I ran the Perl program I had over the 1.1G (gigabyte) log file [2], then ran ltpstat over the same log file, and got two different results.

Not good.

ltpstat returned 2% more connections than the Perl script. Getting a dump from the currently running version on the LaBrea [3] system and cleaning the output showed a 2% difference again.

So I spent the past week trying to track down the problem. It was obvious that ltpstat was storing duplicate records, but why it was doing so was a different matter. My testing sample of about 1,100 connections is apparently too small to completely test the program, so I had to test using the 1.1G log file, which has approximately 230,000 connections.

To help debug this problem, I wrote a linear search and would call it as well as the binary search. If both agreed, I would return the information; otherwise, I would log the discrepancy, do the search again, then exit. The reason for doing the search a second time? So I could set a breakpoint there, and let the program run for a couple of hours until it triggered. Then I could step through both searches to see where the problem was.
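In rough outline, the cross-check looked something like the sketch below. This is not the actual ltpstat source; the record type, the search signatures, and the logging are stand-ins I'm assuming for illustration, with the binary search under test only declared, not shown.

/* Sketch of the cross-check: run the binary search under test and a
   trusted (but slow) linear search over the same table, return the
   result only when they agree, otherwise log the mismatch, repeat the
   binary search (a handy line for a breakpoint) and bail out.        */

#include <stdio.h>
#include <stdlib.h>

typedef struct { unsigned long key; } record_t;  /* stand-in record type */

/* the binary search routine under test, assumed to exist elsewhere */
extern size_t binary_search(const record_t *tab, size_t len, unsigned long key, int *found);

/* trusted reference implementation */
static size_t linear_search(const record_t *tab, size_t len, unsigned long key, int *found)
{
  size_t i;

  for (i = 0 ; i < len ; i++)
    if (tab[i].key >= key)
      break;

  *found = (i < len) && (tab[i].key == key);
  return i;
}

size_t checked_search(const record_t *tab, size_t len, unsigned long key, int *found)
{
  int    bf;
  int    lf;
  size_t bi = binary_search(tab, len, key, &bf);
  size_t li = linear_search(tab, len, key, &lf);

  if ((bi == li) && (bf == lf))
  {
    *found = bf;
    return bi;
  }

  fprintf(stderr, "search mismatch: key=%lu binary=%zu/%d linear=%zu/%d\n",
          key, bi, bf, li, lf);

  binary_search(tab, len, key, &bf);  /* do it again; set the breakpoint here */
  exit(EXIT_FAILURE);
}

The linear search is trivially correct, so any disagreement points the finger squarely at the binary search.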

Yup, each run took several hours to trigger the bug.

I ended up testing four different binary search routines (the original one I thought worked, one I modified from The Standard C Library [4], and two other versions I wrote) before sitting down and working through things on paper.

And I still missed corner cases.
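For what it's worth, here is a rough illustration of the kind of routine in question (not the code that ended up in ltpstat): a lower-bound style binary search that reports whether the key was found and, if not, the index where it belongs.

/* Lower-bound binary search over a sorted array: returns the index of
   the key if present, or the index where it would be inserted if not. */

#include <stddef.h>
#include <stdio.h>

size_t find_index(const unsigned long *tab, size_t len, unsigned long key, int *found)
{
  size_t low  = 0;
  size_t high = len;                      /* half-open range [low,high) */

  while (low < high)
  {
    size_t mid = low + (high - low) / 2;  /* avoids overflow of low + high */

    if (tab[mid] < key)
      low = mid + 1;
    else
      high = mid;
  }

  *found = (low < len) && (tab[low] == key);
  return low;                             /* insertion point when not found */
}

int main(void)
{
  unsigned long keys[] = { 2 , 3 , 5 , 8 , 13 };
  size_t        len    = sizeof(keys) / sizeof(keys[0]);
  int           found;
  size_t        idx;

  idx = find_index(keys, len, 7, &found);          /* missing key */
  printf("7: found=%d index=%zu\n", found, idx);   /* found=0 index=3 */

  idx = find_index(keys, len, 8, &found);          /* existing key */
  printf("8: found=%d index=%zu\n", found, idx);   /* found=1 index=3 */

  return 0;
}

Keeping the range half-open is one common way to keep the missing-key case honest: low always ends up at the first element that is not less than the key, whether or not the key is actually there.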

But finally, I tested my final version against the Perl script and only had 122 discrepancies out of some 230,000 records (about 0.05%, too small for me to worry about after spending a week on this).

* * * * *

I took a snapshot of the currently running version (which by now had been running for a bit over three days), cleansed the output of duplicates, and the final tally was 416,230 connections from 12,911 unique IPs. Again, nothing surprising about the ports being attacked:

Table: Top 10 ports captured by LaBrea in the past 3 days
Port #	Port description	# connections
------------------------------
139	NetBIOS (Network Basic Input/Output System) Session Service	160,799
135	Microsoft-RPC (Remote Procedure Call) service	108,958
445	Microsoft-DS (Directory Service?) Service	67,506
80	Hypertext Transfer Protocol	23,921
4899	Remote Administration [5]	9,225
22	Secure Shell Login	7,253
1433	Microsoft SQL (Structured Query Language) Server	6,503
8080	Hypertext Transfer Protocol (typical alternative port)	3,717
3128	Squid HTTP (Hypertext Transfer Protocol) Proxy	3,329
1080	W32.Mydoom.F@mm worm [6]	3,150
------------------------------

And again, the Microsoft-specific ports account for 81% of the scans (ports 139, 135, and 445 together make up 337,263 of the 416,230 connections). I'll need to talk with Smirk about blocking those ports in the core router. If nothing else, LaBrea is giving me an indication of which ports to block.

[1] /boston/2006/01/15.3

[2] /boston/2006/01/18.2

[3] http://sourceforge.net/projects/labrea

[4] http://www.amazon.com/exec/obidos/ASIN/0131315099/conmanlaborat-20

[5] http://www.famatech.com/

[6] http://securityresponse.symantec.com/avcenter/venc/data/w32.mydoom.f@mm.html
