BLOG UPDATE (09-JUN-2015): The link to the downloadable SLB tar archive has been updated below.
BLOG UPDATE (23-NOV-2010): Please note that I have updated the SLB tarball to address some irregularities readers found during their testing. I still need to update this post further, because the numbers in the boxes below came from the previous SLB and are therefore no longer relevant.
A few years ago I was working on a series about AMD Opteron, Hypertransport, NUMA and what that meant to Oracle. Along the way I put out the Silly Little Benchmark (SLB), as discussed in my post entitled “Oracle on Opteron with Linux-The NUMA Angle (Part III). Introducing The Silly Little Benchmark.” I’ve had a lot of requests recently for updated copies of SLB. If you are looking for SLB, please download it at the following link:
SLB (Silly Little Benchmark) Tar Archive 09 June 2015
New SLB Kit
I’d like to point out a couple of things about the new SLB tar archive.
- The code has changed so results from this kit are not comparable to prior kits.
- The kit now performs 30 seconds of random memory reads followed by 30 seconds of random memory writes.
- The kit includes a wrapper script called runit.sh that runs SLB processes, each with 512MB of physical memory. The argument to runit.sh is a loop control that sets how many SLB processes to run on each invocation of the benchmark.
- The kit includes a README that shows how to compile the kit and also offers further explanation of the runit.sh item above.
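For readers who want a feel for the shape of the test before unpacking the tarball, here is a minimal Python sketch of the pattern described above: a timed phase of random reads followed by a timed phase of random writes over a single buffer. This is not the SLB source. The names timed_phase and run are hypothetical, the batch size of 1000 is arbitrary, and Python's interpreter overhead swamps memory latency, so the numbers it prints say nothing about your hardware. It only illustrates the structure.

```python
import random
import time


def timed_phase(buf, seconds, write):
    """Touch random offsets in buf until `seconds` elapse; return (ops, elapsed_secs)."""
    size = len(buf)
    ops = 0
    start = time.perf_counter()
    deadline = start + seconds
    while time.perf_counter() < deadline:
        # Batch the accesses so the clock is checked once per 1000 operations.
        for _ in range(1000):
            i = random.randrange(size)
            if write:
                buf[i] = 1      # random memory write
            else:
                _byte = buf[i]  # random memory read
        ops += 1000
    return ops, time.perf_counter() - start


def run(buffer_bytes, seconds):
    """Run a read phase followed by a write phase over one buffer."""
    buf = bytearray(buffer_bytes)
    rops, rsecs = timed_phase(buf, seconds, write=False)
    wops, wsecs = timed_phase(buf, seconds, write=True)
    return {
        "rops": rops,
        "rops_per_sec": rops / rsecs,
        "read_nsec_per_op": rsecs * 1e9 / rops,
        "wops": wops,
        "wops_per_sec": wops / wsecs,
        "write_nsec_per_op": wsecs * 1e9 / wops,
    }


# Tiny demo values (assumption); the kit uses 512MB per process and 30 seconds per phase.
result = run(buffer_bytes=1 << 20, seconds=0.1)
print("Total rops", result["rops"], "TPUT ops/sec", round(result["rops_per_sec"], 2))
print("Total wops", result["wops"], "TPUT ops/sec", round(result["wops_per_sec"], 2))
```

The average nanoseconds per operation is simply elapsed time divided by the operation count, which matches how the Avg nsec/op column relates to the Secs and Total columns in the SLB output.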
Previous SLB Blog Posts
The following are a few pointers to prior content that dealt with SLB in one way or the other.
- “Feel” Your Processor Cache. Oracle Does. Part I.
- Linux Bogomips, Or Is That Bogusmips. Part – I
- Oracle on Opteron with Linux-The NUMA Angle (Part VI). Introducing Cyclops.
- Kevin Closson’s Silly Little Benchmark is Silly Fast on Nehalem (Greg Rahn)
Some Recent SLB Results
The following are a few updated SLB results using the new kit.
The first recent result is from a two-socket (2s) Westmere EP (Xeon 5600) system. I passed “6” into runit.sh to see what one socket’s worth of performance looks like.
$ ./runit.sh 6
Users: 6
Buffer area size 524288 KB ADDR 0x2AD3ABA02010
Waiting for semaphore...
Total wops 239999724 Secs 30.1 Avg nsec/op 125 TPUT ops/sec 7978921.92
Total rops 399999540 Secs 30.1 Avg nsec/op 75 TPUT ops/sec 13300920.06
Buffer area size 524288 KB ADDR 0x2B56949FA010
Waiting for semaphore...
Total wops 379999563 Secs 30.6 Avg nsec/op 80 TPUT ops/sec 12416873.45
Total rops 619999287 Secs 30.1 Avg nsec/op 48 TPUT ops/sec 20623014.23
Buffer area size 524288 KB ADDR 0x2AAF02293010
Waiting for semaphore...
Total wops 239999724 Secs 30.2 Avg nsec/op 125 TPUT ops/sec 7939962.30
Total rops 459999471 Secs 30.1 Avg nsec/op 65 TPUT ops/sec 15257316.97
Buffer area size 524288 KB ADDR 0x2AFADC4C9010
Waiting for semaphore...
Total wops 379999563 Secs 31.1 Avg nsec/op 81 TPUT ops/sec 12216920.78
Total rops 599999310 Secs 30.2 Avg nsec/op 50 TPUT ops/sec 19873638.29
Buffer area size 524288 KB ADDR 0x2AEB7B430010
Waiting for semaphore...
Total wops 379999563 Secs 31.2 Avg nsec/op 82 TPUT ops/sec 12174302.22
Total rops 599999310 Secs 30.1 Avg nsec/op 50 TPUT ops/sec 19941302.38
Buffer area size 524288 KB ADDR 0x2B6A80F63010
Waiting for semaphore...
Total wops 239999724 Secs 30.2 Avg nsec/op 125 TPUT ops/sec 7938049.67
Total rops 479999448 Secs 31.0 Avg nsec/op 64 TPUT ops/sec 15474601.85
Test Summary: Total wops 1859997861 Total rops 3159996366 Runtime seconds: 31 wops/s 59615316 rops/s 101281934
That was a bit bumpy. I re-ran it with affinity (taskset) and collected the following results:
$ taskset -pc 0-5 $$
pid 15320's current affinity list: 0-23
pid 15320's new affinity list: 0-5
$ sh ./runit.sh 6
Users: 6
Buffer area size 524288 KB ADDR 0x2B28784C4010
Waiting for semaphore...
Total wops 379999563 Secs 31.0 Avg nsec/op 81 TPUT ops/sec 12238869.35
Total rops 499999425 Secs 30.2 Avg nsec/op 60 TPUT ops/sec 16580155.46
Buffer area size 524288 KB ADDR 0x2B1241B67010
Waiting for semaphore...
Total wops 379999563 Secs 31.4 Avg nsec/op 82 TPUT ops/sec 12118541.38
Total rops 499999425 Secs 30.4 Avg nsec/op 60 TPUT ops/sec 16446948.61
Buffer area size 524288 KB ADDR 0x2B4893BFD010
Waiting for semaphore...
Total wops 379999563 Secs 31.3 Avg nsec/op 82 TPUT ops/sec 12136661.49
Total rops 499999425 Secs 30.5 Avg nsec/op 60 TPUT ops/sec 16403891.60
Buffer area size 524288 KB ADDR 0x2B94FD5AA010
Waiting for semaphore...
Total wops 379999563 Secs 31.0 Avg nsec/op 81 TPUT ops/sec 12272774.98
Total rops 519999402 Secs 30.9 Avg nsec/op 59 TPUT ops/sec 16820126.30
Buffer area size 524288 KB ADDR 0x2B0D09454010
Waiting for semaphore...
Total wops 379999563 Secs 31.4 Avg nsec/op 82 TPUT ops/sec 12107983.29
Total rops 499999425 Secs 30.5 Avg nsec/op 61 TPUT ops/sec 16368642.72
Buffer area size 524288 KB ADDR 0x2AAD4513E010
Waiting for semaphore...
Total wops 379999563 Secs 31.4 Avg nsec/op 82 TPUT ops/sec 12097160.14
Total rops 499999425 Secs 30.6 Avg nsec/op 61 TPUT ops/sec 16354937.65
Test Summary: Total wops 2279997378 Total rops 3019996527 Runtime seconds: 31 wops/s 72611381 rops/s 96178233
That result was a lot smoother and the wops (write ops per second) improved 22%. The rops (read ops per second), on the other hand, suffered a small 5% degradation. I’ll blog further about that in another post.
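For anyone who wants to check the arithmetic, the short Python sketch below recomputes those percentages from the two Test Summary lines and puts a number on "bumpy" versus "smooth" using the per-process write throughput figures copied from the two runs. The helper names pct_change and spread are hypothetical, and using the coefficient of variation as the smoothness measure is my assumption, not something SLB reports.

```python
from statistics import mean, pstdev

# Per-process "TPUT ops/sec" values for wops, copied from the two runs above.
WOPS_NO_AFFINITY = [7978921.92, 12416873.45, 7939962.30,
                    12216920.78, 12174302.22, 7938049.67]
WOPS_TASKSET = [12238869.35, 12118541.38, 12136661.49,
                12272774.98, 12107983.29, 12097160.14]

# Aggregate wops/s and rops/s from the two Test Summary lines.
SUMMARY_NO_AFFINITY = {"wops": 59615316, "rops": 101281934}
SUMMARY_TASKSET = {"wops": 72611381, "rops": 96178233}


def pct_change(before, after):
    """Percentage change from before to after (negative means a degradation)."""
    return (after - before) / before * 100.0


def spread(values):
    """Coefficient of variation: population standard deviation over the mean."""
    return pstdev(values) / mean(values)


for kind in ("wops", "rops"):
    delta = pct_change(SUMMARY_NO_AFFINITY[kind], SUMMARY_TASKSET[kind])
    print(f"{kind}/s change with taskset: {delta:+.1f}%")

print(f"wops spread without affinity: {spread(WOPS_NO_AFFINITY):.3f}")
print(f"wops spread with taskset:     {spread(WOPS_TASKSET):.3f}")
```

A lower spread means the six processes finished closer to one another, which is what the eye picks up as a smoother result.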
Other Results?
It sure would be nice if folks could try this out on other platforms. I’ve compiled and run it on Power6 so I know that it works on AIX 5L.
… also available again on the new Oakie site:
http://www.oaktable.net/contribute/slb-silly-little-benchmark
Thanks Kurt.