In my blog entry entitled Exadata Storage Server: 485x Faster than…Exadata Storage Server. Part I., I took issue with the ridiculous multiple-orders-of-magnitude performance improvement claims routinely made by the DW Appliance vendors. These claims are usually touted as comparisons to “Oracle” (without any substantive accounting of what sort of Oracle configuration they are comparing to) and never seem to include any accounting of where the performance improvement comes from. After learning a bit about marketing from a breakfast cereal commercial, I decided to share with my readers how easy it is to do what these guys generally do: compare apples to bicycles. To make it more interesting, I decided to show a 485-fold performance increase of Oracle Exadata versus Oracle Exadata. A long comment thread ensued and ultimately ended with a reader posting the following:
Block density perhaps?
You didn’t mention that the number of records per block was a constant. So it would be possible that in the first scenario you created a table with a low amount of records per block, resulting in a large segment, needing a lot of io’s. (you could have used 1 row/block for example)
While in the second scenario you could have used a high number of blocks per record, resulting in a smaller segment, and thus needing a lower amount of io’s to fulfill the query.
Here’s the deal. I chose my words carefully and took a huge dose of Semantic-a-sol(tm). I set the stage as follows:
- I said, “There are no … partitioning or any sort of data elimination.” True, there were no forms of data elimination. I didn’t say anything about eliminating unused space.
- I said the data in the table was the same; I never said it was the same table.
- I said there was the same storage bandwidth and the same number of CPUs and that was true.
The 485x was a product of querying a table with PCTFREE 0 versus PCTFREE 99. When I queried the vacuous blocks I also did so with a normal scan instead of a Smart Scan. So it is true that storage bandwidth remained constant, but I created an artificial bottleneck upstream by forcing the single database host (used in both cases) to ingest the full 1.6 TB, which is how much round-brown spinning stuff it took to store the vacuous blocks (PCTFREE 99). That took 970 seconds.
With ~107 million rows and a query that cited only the PURCHASE_AMT column, the amount of data actually needed by the SQL layer was a measly 86 MB. So, when I “magically” switched the card_trans synonym to point to the PCTFREE 0 table (which is only 8.4 GB) and scanned it with the full power of 14 Exadata Storage Servers, the data was off disk, and the PURCHASE_AMT column was plucked from the middle of each row and DMAed into the address space of the Parallel Query Processes on the database host, in 1.96 seconds…a 485x speedup.
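The arithmetic is easy to reproduce. A minimal Python sketch, assuming decimal units and the figures quoted above (1.6 TB in 970 seconds for the vacuous PCTFREE 99 scan; 8.4 GB in 1.96 seconds for the Smart Scan of the PCTFREE 0 table), shows the speedup decomposing into a data-volume ratio times an effective-bandwidth ratio:

```python
# Figures quoted in the text; decimal TB/GB assumed for illustration.
FAT_BYTES = 1.6e12    # PCTFREE 99 segment: full 1.6 TB ingested by the host
THIN_BYTES = 8.4e9    # PCTFREE 0 segment: 8.4 GB scanned by 14 storage cells
FAT_SECS = 970.0
THIN_SECS = 1.96

speedup = FAT_SECS / THIN_SECS           # ~495x, in the ballpark of the 485x headline
data_ratio = FAT_BYTES / THIN_BYTES      # ~190x less data to move off disk
bw_fat = FAT_BYTES / FAT_SECS / 1e9      # ~1.65 GB/s into a single database host
bw_thin = THIN_BYTES / THIN_SECS / 1e9   # ~4.3 GB/s effective with Smart Scan offload
bw_ratio = bw_thin / bw_fat              # ~2.6x faster effective scan rate

print(f"speedup ~{speedup:.0f}x = {data_ratio:.0f}x less data x {bw_ratio:.1f}x scan rate")
```

In other words, nearly all of the "gain" came from shrinking the segment, and the rest from changing the scan method; the storage hardware was identical in both runs.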
So, does anyone else hate it when these DW Appliance guys go around spewing ridiculous multiple-orders-of-magnitude performance increases over who-knows-what without any accounting? It truly is an insult to your intelligence.
There is no reason to be mystified. If DW Appliance vendor XYZ is spouting off about a query-processing speedup of, say, X, just plug the values into the following magic decoder ring. Quote me on this: performance increase X is the product of:
1. Executing on a platform with X-fold storage bandwidth, or
2. Executing on a platform with X-fold processor bandwidth, or
3. The query being measured manipulating 1/Xth the amount of data, or
4. Some combination of items 1 through 3
Any reasonable vendor will gladly itemize for you where they get their magical performance gains. Just ask them; you might learn more about them than you thought.
Part II in this series can be found here.