Transistors/chip: >100,000x since 1971
Disk density: >100,000,000x since 1956
Disk speed: 12.5x since 1956
Disk Speed == Rotational Speed?
The slide compared the growth in “disk speed” since 1956 with the growth in CPU transistor count since 1971. I accept the notion that processors have outpaced disk capabilities over that period, no doubt! However, I think too much emphasis is placed on disk rotational speed and not enough on the 100 million-fold increase in density. The topic at hand is DW/BI, and I don’t think rotational delay deserves that much attention there. I’m not trying to read too much into Curt’s message, because I wasn’t at the presentation, but it is food for thought. Are disks really that slow?
Are Disks Really That Slow?
Instead of comparing modern drives to the prehistoric “Winchester” drive of 1956, I think a better comparison is to the ST-506, the father of modern disks. The ST-506 of 1984 would have found itself paired with an Intel 80286 in the PC of the day. Going from the 80286 to a “Harpertown” Xeon, transistor count has increased 3280-fold and clock speed has improved 212-fold. The ST-506 (circa 1984) had a throughput capability of 625KB/s. Modern 450GB SAS drives can scan at 150MB/s, an improvement of 245-fold. Considered in these terms, hard drive throughput and CPU clock speed have seen a surprisingly similar increase in capability. Of course, Intel is cramming 3280x more transistors into a processor these days, but read on.
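The 245-fold figure is easy to check. Here is a minimal sketch, assuming binary units (1MB = 1024KB), which is the convention that reproduces the number quoted above; the function and variable names are mine, chosen only for illustration.

```python
def fold_increase(old, new):
    """Return how many times larger `new` is than `old` (same units)."""
    return new / old

# Throughput figures from the text, both expressed in KB/s.
# Assumption: 1MB = 1024KB (binary units).
st506_kb_per_s = 625
sas_kb_per_s = 150 * 1024

disk_fold = fold_increase(st506_kb_per_s, sas_kb_per_s)

# Clock speed gain quoted in the text (80286 to "Harpertown" Xeon).
cpu_clock_fold = 212

print(f"disk throughput gain: {disk_fold:.1f}-fold")
print(f"disk gain relative to clock gain: {disk_fold / cpu_clock_fold:.2f}")
```

With decimal units (1MB = 1000KB) the same calculation gives 240-fold, so the conclusion does not depend on the unit convention.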
The point I’m trying to make is that disks haven’t lagged as far behind CPUs as is sometimes portrayed. In fact, I think the refrigerator-cabinet array manufacturers disingenuously draw attention to things like rotational delay in order to distract from the real bottleneck, which is the flow of data from the platters through the storage processors to the host. This bottleneck is built into modern storage arrays and is felt all the way through the host bus adaptors. Let’s not punish ourselves by mentioning the plumbing complexities of storage networking models like Fibre Channel.
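To make the flow-of-data argument concrete, here is a rough sketch that treats the path from platters to host as a chain of stages, where the slowest stage sets the scan rate. Only the 150MB/s per-drive figure comes from this post; the drive count and the storage processor and host bus adaptor bandwidths are hypothetical numbers picked purely for illustration, not measurements of any real array.

```python
def find_bottleneck(stage_mb_per_s):
    """Return (stage name, MB/s) for the slowest stage in the data path."""
    name = min(stage_mb_per_s, key=stage_mb_per_s.get)
    return name, stage_mb_per_s[name]

drives = 48  # hypothetical drive count

stages = {
    "platters": drives * 150,    # 150MB/s per drive, from the text
    "storage processors": 1600,  # hypothetical aggregate MB/s
    "host bus adaptors": 800,    # hypothetical aggregate MB/s
}

bottleneck, rate = find_bottleneck(stages)
per_drive = rate / drives

print(f"bottleneck: {bottleneck} at {rate}MB/s")
print(f"effective scan rate per drive: {per_drive:.1f}MB/s of a possible 150MB/s")
```

Under these made-up numbers each drive delivers only a fraction of what it can scan, and no change in rotational speed would alter that.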
Focus on Flow of Data, Not Spinning Speed.
Oracle Exadata Storage Server, in the HP Oracle Database Machine offering, configures 1.05 processor cores per hard drive (176:168). Even if I clump Flash SSD into the mix (about 60% increase in scan throughput over round, brown spinning disks) it doesn’t really change that much (i.e., not orders of magnitude).
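The ratio above is simple arithmetic, sketched below. The Flash adjustment is my own reading of the remark: it assumes a device with about 60% more scan throughput counts as 1.6 spinning disks of scan capacity, which is an illustrative assumption rather than a measured result.

```python
# Core and drive counts quoted in the text (176:168).
cores = 176
drives = 168

cores_per_drive = cores / drives

# Assumption: about 60% more scan throughput per device with Flash SSD,
# so each device counts as 1.6 "disk equivalents" of scan capacity.
flash_scan_gain = 1.6
cores_per_disk_equivalent = cores / (drives * flash_scan_gain)

print(f"cores per hard drive: {cores_per_drive:.2f}")
print(f"cores per disk equivalent with Flash: {cores_per_disk_equivalent:.2f}")
```

Either way the ratio stays within the same order of magnitude, which is the point being made.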
Junk Science? Maybe.
So, am I just throwing out the 3280x gain in transistor count I mentioned? No, but when we compare the richness of processing applied to data coming off disk in today’s world (e.g., DW/BI) with that of the 80286 and ST-506 days (e.g., VisiCalc, a 26KB executable), I think the transistor count gets factored out. That leaves us with 245-fold disk performance gains and 212-fold CPU clock gains. So, is it a total coincidence that a good ratio of CPU to disk for DW/BI is about 1:1? Maybe not. Maybe this is all just junk science. If so, we should all continue connecting as many disks to the back of our conventional storage arrays as they will support.
Stop bottlenecking your disk drives. Then, and only then, you’ll be able to see just how fast they are and whether you have a reasonable ratio of CPU to disk for your DW/BI workload.