On 30 November, 2011 Oracle published the second result in a recent series of TPC-H benchmarks. The prior result was a 1000GB scale result with a single SPARC T4-4 connected to 4 Sun Storage F5100 Flash Arrays configured as direct attached storage (DAS). We can ascertain the DAS aspect by reading the disclosure report where we see there were 16 SAS host bus adaptors in the T4-4. As an aside, I’d like to point out that the F5100 is “headless” which means in order to provision Real Application Clusters storage one must “front” the device with a protocol head (e.g., COMSTAR) such as Oracle does when running TPC-C with the SPARC SuperCluster. I wrote about that style of storage presentation in one of my recent posts about SPARC SuperCluster. It’s a complex approach, is not a product, but it works.
The more recent result, published on 30 November, was a 3000GB scale result with a single SPARC T4-4 server and, again, the storage was DAS. However, this particular benchmark used Sun Storage 2540-M2 (OEMed storage from LSI or NetApp?) attached with Fibre Channel. As per the disclosure report there were 12 8GFC FC HBAs (dual port) for a maximum read bandwidth of 19.2GB/s (24 x 800MB/s). The gross capacity of the storage was 45,600GB, which fit entirely in a single 42U rack.
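The 19.2GB/s ceiling is simple arithmetic, and a minimal sketch makes it easy to redo for other HBA counts. The function and constant names are only illustrative, and the 800MB/s per-port figure is the nominal 8GFC rate quoted above, not a measured value.

```python
# Configuration figures as quoted from the disclosure report above.
HBAS = 12                # dual-port 8GFC host bus adaptors
PORTS_PER_HBA = 2
MB_S_PER_8GFC_PORT = 800  # nominal per-port rate used in the post


def max_read_bandwidth_gb_s(hbas, ports_per_hba, mb_s_per_port):
    """Theoretical aggregate read bandwidth in GB/s (1 GB = 1000 MB)."""
    return hbas * ports_per_hba * mb_s_per_port / 1000


peak_gb_s = max_read_bandwidth_gb_s(HBAS, PORTS_PER_HBA, MB_S_PER_8GFC_PORT)
print(f"{HBAS * PORTS_PER_HBA} paths, {peak_gb_s:.1f} GB/s theoretical maximum")
```

This is a ceiling on the Fibre Channel links only; it says nothing about what the arrays or the server can actually sustain.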
So What Is My Take On All This?
Shortly after this 3TB result went public I got an email from a reader wondering if I intended to blog about the fact that Oracle did not use Exadata in this benchmark. I replied that I am not going to blog that point because while TPC-H is an interesting workload it is not a proper DW/BI workload. I’ve blogged about that fact many times in the past. The lack of Exadata TPC benchmarks is in itself a non-story.
What I do appreciate gleaning from these results is information about the configurations and, when offered, any public statements about I/O bandwidth achieved by the configuration. Oracle’s press release on the benchmark specifically called out the bandwidth achieved by the SPARC T4-4 as it scanned the conventional storage via 24 8GFC paths. As the following screen shot of the press release shows, Oracle states that the single-rack of conventional storage achieved 17 GB/s.
Oracle Press Release: 17 GB/s Conventional Storage Bandwidth.
I could be wrong on the matter, but I don’t believe the Sun Storage 2540 supports 16GFC Fibre Channel yet. If it had, the T4-4 could have gotten away with as few as 6 dual-port HBAs. It is my opinion that 24 paths is a bit cumbersome. However, since it wasn’t a Real Application Clusters configuration, the storage network topology even with 24 paths would be doable by mere mortals. But, again, I’d rather have a single rack of storage with a measly 12 FC paths for 17 GB/s and since 16GFC is state of the art that is likely how a fresh IT deployment of similar technology would transpire.
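To put numbers behind the path-count argument, here is a rough sketch of how many dual-port HBAs it takes to carry 17GB/s at each Fibre Channel generation. The 1600MB/s per-port figure for 16GFC is my assumption (double the 800MB/s used for 8GFC), and the helper name is hypothetical.

```python
import math

ACHIEVED_MB_S = 17_000   # 17 GB/s from the press release
PORTS_PER_HBA = 2
# Nominal per-port rates; the 16GFC value is an assumption (2 x 8GFC).
NOMINAL_MB_S = {"8GFC": 800, "16GFC": 1600}


def hbas_needed(target_mb_s, port_mb_s, ports_per_hba=PORTS_PER_HBA):
    """Fewest HBAs whose ports can carry target_mb_s at nominal rate."""
    ports = math.ceil(target_mb_s / port_mb_s)
    return math.ceil(ports / ports_per_hba)


for generation, rate in NOMINAL_MB_S.items():
    print(generation, hbas_needed(ACHIEVED_MB_S, rate), "dual-port HBAs")

# Fraction of the 24 x 800MB/s ceiling that 17 GB/s represents.
utilization = ACHIEVED_MB_S / (24 * NOMINAL_MB_S["8GFC"])
print(f"link utilization: {utilization:.1%}")
```

Under these assumptions 17GB/s needs at least 11 dual-port HBAs at 8GFC with no headroom, which squares with the 12 in the tested configuration, and 6 at 16GFC.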
SPARC T4-4 Bandwidth
I do not doubt Oracle’s 17GB/s measurement in the 3TB result. The fact is, I am quite astounded that the T4-4 has the internal bandwidth to deal with 17GB/s data flow. That’s 4.25GB/s of application data flow per socket. Simply put, the T4-4 is a very high-bandwidth server. In fact, when we consider the recent 1TB result the T4-4 came within about 8% of the HP ProLiant DL980 G7 with 8 Xeon E7 sockets and their PREMA chipset. Yes, within 8% (QphH) of 8 Xeon E7 sockets with just 4 T4 sockets. But is bandwidth everything?
The T4 architecture favors highly-threaded workloads just like the T3 before it. This attribute of the T4 is evident in the disclosure reports as well. Consider, for instance, that the 1TB SPARC T4 test was conducted with 128 query streams whereas the HP ProLiant DL980 case used 7. The disparity in query response times between these two configurations running the same scale test is quite dramatic as the following screen shots of the disclosure reports show. With the HP DL980, only query 18 required more than 300 seconds of processing whereas not a single query on the SPARC T4 finished in less than 1200 seconds.
Summary
These recent SPARC T4-4 TPC results prove several things:
1. Conventional Storage Is Not Dead. Achieving 17GB/s from storage with limited cabling is nothing to sneeze at.
2. Modern servers have a lot of bandwidth.
3. There is a vast difference between a big machine and a fast machine. The SPARC T4 is a big (bandwidth) system.
Finally, I did not blog about the fact that the SPARC T4 TPC-H benchmarks do not leverage Exadata storage. Why? Because it simply doesn’t matter. TPC-H is not a suitable test for a system like Exadata. Feel free to Google the matter…you’ll likely find some of my other writings stating the same.
Right to the heart of the critical resource tradeoffs. Thanks Kevin
mwf
Thanks for stopping by, Mark.
Seems like 3000TB is a typo and it should be 3000GB.
Yes, Sabastian, it was a typo. Thanks for pointing that out.
Does this summarize your point(s)?
TPC-H produces a number which is a reflection of (hourly?!?) system throughput.
System throughput may not be indicative of system “performance” to its users because users are typically most interested in response time. Thus, TPC-H is an easily misused benchmark for comparing real-world performance.
Oracle is using our misunderstanding of throughput as performance to market systems which are excellent throughput machines as excellent performance machines, when in fact their performance may be less than desirable.
Hello ajmaidak,
A reply will require a new post I think. Please watch for that.
Thanks for the follow up. The takeaway I’m getting from this is that when reviewing TPC-H results, be sure to pay attention not just to the throughput metric (QphH) but also to the geometric mean of the power run and the number of query streams. Depending on your situation, the ability to complete 128 1200s streams may be more or less desirable than 8 441s streams.
The thing that confuses me is that the 1TB HP Intel Oracle RAC result has such a strong geometric mean for the power run (4.6s) but the 1TB DL980 G7 SQL Server result has such a poor geometric mean for the power run (58.3s). I suspect this is what you were talking about when saying it’s a mistake to compare SQL Server results to Oracle-based results?
Hello ajmaidak,
Yes, I do think it is important to go beyond just the pure throughput. That is my point. And, yes, there is so much difference between products like SQL Server and Oracle for workloads like TPC-H that comparisons are moot. I don’t feel that way, however, about TPC-C. There really are no benchmark-specials these vendors can do for TPC-C. The code path of the TPC-C New Order transaction has been optimized down to the high hundreds of thousands of instructions. There is no room left to optimize. But I digress.
My oh my how timely was my remark about comparing TPC-C results between vendors: http://finance.yahoo.com/news/Oracle-Delivers-Record-iw-1910757519.html?x=0
http://yhoo.it/tqnhP9
So Kevin, Why aren’t you trying to compare response times, database load times, and even pricing while you are at it? And where are your comparisons to IBM’s Power7, which I believe is the real target for the SPARC T4 since they are both RISC architectures and run UNIX. It sure seems that you are trying to prove that the SPARC T4 still doesn’t perform well in single threaded workloads, yet you haven’t provided any real substantiation? I think the 3TB TPC-H submission eliminates many of your arguments regarding using Flash storage as Oracle ran this benchmark using standard disks.
Why not compare the 3 closest results to the SPARC T4-4? vs IBM Power 780 and HP ProLiant DL980 G7, both of which are 8-socket systems!
Database Load time:
SPARC T4-4 = 4.14hrs
Power 780-8 = 5.86 hrs
HP ProLiant DL980 G7-8 = 8.35 hrs
SPARC T4-4 is 2x faster in loading DB than the DL980 and even beats IBM. SPARC T4-4 is 29% faster than the IBM Power 780 for data loading
TPC-H Refresh Function:
SPARC T4-4 is up to 3.2 x faster than HP ProLiant DL980 G7
SPARC T4-4 is up to 3.4 x faster than IBM Power 780
Disclosed HW Pricing (Server + Storage w/o SW as they are not running same DB):
SPARC T4-4 = $456,993
HP ProLiant DL980 G7-8 = $460,869
Power 780-8 = $1,513,105 (yes, that’s in the MILLIONS)
Wow, now I know why IBM doesn’t publish too many benchmarks where they need to disclose pricing!
And all of these comparisons are probably why you are spending so much of your time (hopefully not on EMC’s dime) talking about SPARC T4!
>Why aren’t you trying to compare response times, database load times, and even pricing
Because I don’t want to. But since we are now on the topic, I think it is only fair to discuss price/QphH which is $6.37 versus $4.10 in that IBM example, by the way.
If you’re dead set on opening up the field of comparisons, how about the SPARC T4 versus the Dell cluster result with EXASOL (http://bit.ly/u4vVDe).
But I’d rather talk about Oracle.
Follow the series. It’s three-parts. My focus is Oracle on SPARC versus Oracle on other platforms.
Do you have yourself convinced that an Oracle/Power7 or Oracle/Xeon E7 TPC-H wouldn’t make a better showing than the SPARC? I happen to think that Oracle on the same hardware would beat SQL Server and perhaps Sybase IQ. That’s my angle. Does that sound so anti-Oracle to you?
By the way, your prior ignorant comment (http://bit.ly/tJfFpo) about the circumstances under which I came to EMC and this slur about how I spend my time are strikes 1 and 2 (google American baseball rules) in blog comment etiquette. Three strikes and you’re out. Your comments are welcome here until that third strike. Please don’t get that next strike.