Part of my job at The Corporation is load testing. And generally, when I do load testing, I pretty much write code that sends requests to the server as quickly as possible (and I might run multiple programs that spam the server with requests). But recently, Smirk brought Poisson distributions [1] to my attention, claiming that they present a more realistic load.
I was skeptical. I really didn't see how randomly sending requests using a Poisson distribution would behave any differently than a (large) constant load, but Smirk insisted.
So I wrote a simple UDP service and ran that on one machine. I then coded up two client programs. They were actually identical except for one line of code—the “constant” client:
```
sleep(0)
```
and the “Poisson distribution” client:
```
sleep(-log(1.0 - random()) / 5000.0)
```
(sleep(0) doesn't actually sleep, but it does cause the calling process to be rescheduled. That limits the process to about 10,000 messages a second, which makes it a good baseline, and we can always run more processes. random() returns a value between 0 and 1 (including 0, excluding 1), and we subtract it from 1.0 to avoid taking the log of 0, which is undefined. Logarithms of numbers between 0 and 1 are negative, so we negate the result to get a positive value, then divide by 5,000, which means we average 5,000 messages per second. Yes, that's half the rate of the constant client, but there are two reasons for this: one, we can always run more clients, and two, there's a limit to how short we can sleep. A value of 0 just reschedules the process; under Solaris you can't sleep for less than 1/100 of a second (values less than .01 are rounded up to .01), and Linux is around 1/500 or 1/1000 of a second, depending upon configuration, so 5,000 is kind of an “eh, it's good enough” value.)
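Just to make the timing math concrete, here's the calculation from [1] wrapped up as a small function. This is only a sketch; next_delay() is a name I'm making up here, not something from the actual test code.

```
-- Sketch: delay (in seconds) until the next event of a Poisson process
-- with the given mean rate (events per second).  Inter-arrival times of
-- a Poisson process are exponentially distributed, so we push a uniform
-- random value through the inverse CDF of the exponential distribution.
local function next_delay(rate)
  -- math.random() returns a value in [0,1); subtracting it from 1.0
  -- keeps us from ever taking the log of 0.
  return -math.log(1.0 - math.random()) / rate
end

-- e.g. sleep(next_delay(5000)) averages 5,000 messages per second
```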
(I should also mention that the version of sleep() I'm using can take a fractional number of seconds, like sleep(0.125). All the code I'm talking about is in Lua [2], because I was writing a “proof-of-concept” server, not a “high performance” one.)
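Since only the differing line is shown above, here's a minimal sketch of the overall shape such a client could take. It assumes LuaSocket for the UDP socket and for a fractional-second sleep; the host, port, and payload are invented for illustration, and the real clients may well be structured differently.

```
local socket = require "socket"          -- LuaSocket: UDP sockets and socket.sleep()

local HOST, PORT = "192.168.1.10", 9000  -- hypothetical server address
local udp = assert(socket.udp())
assert(udp:setpeername(HOST, PORT))

local poisson = arg[1] == "poisson"      -- pick the client variant on the command line

while true do
  udp:send("test")                       -- hypothetical payload
  if poisson then
    -- exponentially distributed gap, averaging 5,000 packets per second
    socket.sleep(-math.log(1.0 - math.random()) / 5000.0)
  else
    -- "constant" client: a zero-length sleep, then send again as fast as possible
    socket.sleep(0)
  end
end
```

LuaSocket's socket.sleep() accepts fractional seconds, which is all the Poisson variant needs; whether a zero-second sleep actually reschedules the process will depend on the sleep implementation underneath.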
So, I run 64 “constant” clients and get:
Table: 64 “constant” client results, packets per 10-second interval

| packets sent | packets received | packets dropped |
|-------------:|-----------------:|----------------:|
| 636002       | 636002           | 0               |
| 621036       | 621036           | 0               |
| 631890       | 631890           | 0               |
| 631051       | 631051           | 0               |
| 613912       | 613912           | 0               |
Pretty much a steady 63,000 or so packets per second, with no dropped data. And now, for 128 “Poisson distribution” clients:
Table: 128 “Poisson distribution” client results, packets per 10-second interval

| packets sent | packets received | packets dropped |
|-------------:|-----------------:|----------------:|
| 348620       | 348555           | 65              |
| 439038       | 438988           | 50              |
| 375482       | 375436           | 46              |
| 382650       | 382600           | 50              |
| 396886       | 396828           | 58              |
Um … what?
Half the number of packets, and I'm losing some as well? What weirdness is this? No matter how many times I run the tests, or for how long, I get similar results. The “Poisson distribution” client gets horrible results.
And as Smirk said, that's exactly the point.
And the odd thing is, I can't explain this behavior. I can't comprehend what could be happening to cause such a difference from a one-line change.
Disturbing.
[1] http://preshing.com/20111007/how-to-generate-random-timings-for-a-poisson-process
[2] http://www.lua.org/