Less traffic and lost packets … I'm stumped

Part of my job at The Corporation is load testing. And generally, when I do load testing, I pretty much write code that sends requests to the server as quickly as possible (and I might run multiple programs that spam the server with requests). But recently, Smirk brought Poisson distributions [1] to my attention, claiming they present a more realistic load.

I was skeptical. I really didn't see how randomly sending requests using a Poisson distribution would behave any differently than a (large) constant load, but Smirk insisted.

So I wrote a simple UDP service and ran that on one machine. I then coded up two client programs. They were actually identical except for one line of code—the “constant” client:

```
sleep(0)
```

and the “Poisson distribution” client:

```
sleep(-log(1.0 - random()) / 5000.0)
```

(sleep(0) doesn't actually sleep, but it does cause the calling process to be rescheduled. This limits the process to about 10,000 messages a second, so it's a good baseline, and we can always run more processes. random() returns a value between 0 and 1 (includes 0, excludes 1), and we subtract that from 1.0 to prevent taking the log of 0 (which is undefined). Logarithms of numbers less than 1 are negative, so we negate to get a positive value, then divide by 5,000, which makes the average delay 1/5,000 of a second; in other words, we average 5,000 messages per second. Yes, that's half the rate of the constant client, but there are two reasons for this: one, we can always run more clients, and two, there's a limit to how short we can sleep. A value of 0 just reschedules the process; under Solaris, you can't sleep for less than 1/100 of a second (values less than .01 are rounded up to .01), and Linux is around 1/500 or 1/1000 of a second (depending upon configuration), so 5,000 is kind of an “eh, it's good enough” value)
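As a quick sanity check of that math, a few lines of Lua (illustrative only, not part of the actual test clients) can sample the delay a million times and confirm that the average really does work out to 1/5,000 of a second:

```
-- Sanity check of the Poisson delay math: sample the delay a million times
-- and confirm the average is about 1/5000 of a second, i.e. about 5,000
-- messages per second.  (Illustrative only; not part of the test clients.)
math.randomseed(os.time())

local rate = 5000.0
local N    = 1000000
local sum  = 0.0

for _ = 1,N do
  -- math.random() is in [0,1), so 1.0 - math.random() is in (0,1] and the
  -- logarithm is always defined (and never positive)
  sum = sum + (-math.log(1.0 - math.random()) / rate)
end

print(string.format("average delay: %.6f seconds",sum / N))
print(string.format("expected:      %.6f seconds",1.0 / rate))
```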

(I should also mention that the version of sleep() I'm using can take a fractional number of seconds, like sleep(0.125), since all the code I'm talking about is in Lua [2], because I was writing a “proof-of-concept” server, not a “high performance” server)
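And for the curious, here's a rough sketch of what one of these clients could look like. The actual client code isn't shown in this post, so this assumes LuaSocket for the UDP socket and for the fractional sleep, and the host and port are made up:

```
-- Rough sketch of a test client (assumes LuaSocket; the real client code
-- isn't shown in the post, and the host/port below are made up).
local socket = require("socket")

local HOST    = "192.168.1.10"        -- hypothetical address of the UDP service
local PORT    = 9999                  -- hypothetical port
local POISSON = (arg[1] == "poisson") -- which flavor of client to run

local udp = socket.udp()
udp:setpeername(HOST,PORT)
math.randomseed(os.time())

while true do
  udp:send("X")
  if POISSON then
    -- exponentially distributed delay, averaging 5,000 packets per second
    socket.sleep(-math.log(1.0 - math.random()) / 5000.0)
  else
    -- the constant client; note that socket.sleep(0) may just return
    -- immediately rather than reschedule the process the way the sleep(0)
    -- described above does
    socket.sleep(0)
  end
end
```

The only real difference between the two flavors is which branch of that if runs, which is the one-line change the rest of this post is about.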

So, I run 64 “constant” clients and get:

Table: 64 “constant” client results, packets per 10-second interval

| packets sent | packets received | packets dropped |
|-------------:|-----------------:|----------------:|
| 636002       | 636002           | 0               |
| 621036       | 621036           | 0               |
| 631890       | 631890           | 0               |
| 631051       | 631051           | 0               |
| 613912       | 613912           | 0               |

That works out to just under 10,000 packets per client per interval (around 63,000 messages a second in aggregate), with no dropped data. And now, for 128 “Poisson distribution” clients:

Table: 128 “Poisson distribution” client results, packets per 10-second interval

| packets sent | packets received | packets dropped |
|-------------:|-----------------:|----------------:|
| 348620       | 348555           | 65              |
| 439038       | 438988           | 50              |
| 375482       | 375436           | 46              |
| 382650       | 382600           | 50              |
| 396886       | 396828           | 58              |

Um … what?

Half the number of packets, and I'm losing some as well? What weirdness is this? No matter how many times I run the tests, or for how long, the “Poisson distribution” client gets similarly horrible results.

And as Smirk said, that's exactly the point.

And the odd thing is, I can't explain this behavior. I can't comprehend what could be happening to cause it, all from a one-line change.

Disturbing.

[1] http://preshing.com/20111007/how-to-generate-random-timings-for-a-poisson-process

[2] http://www.lua.org/
