BLOG CORRECTION: The next-to-last paragraph has been edited to offer more clarity on which components impose limits on I/O transfer sizes.
I’m going to tell you something nobody else knows. You’ve heard it here first. Ready? Here’s the deal: no more than 800 MB/s can pass through two 4 Gb Fibre Channel HBAs into any host system memory. It’s that simple. If you want more than 800 MB/s available for your CPUs, you have to either add more 4 Gb HBAs, go with 8 Gb Fibre, or drop FCP altogether and go with something that can deliver at that level. But this isn’t a plug for the Manly Man Series on Fibre Channel Technology; I’m blogging about Data Warehouse Appliance technology, specifically DATAllegro.
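If you want to check that arithmetic yourself, here’s a back-of-envelope sketch in Python. The 4.25 Gbaud line rate and 8b/10b encoding are the standard 4 Gb FC figures; once framing overhead is paid, usable payload lands right around 400 MB/s per link, per direction.

```python
# Back-of-envelope: usable payload bandwidth of N 4 Gb Fibre Channel links.
# 4 Gb FC runs at 4.25 Gbaud with 8b/10b encoding, so each link carries
# roughly 425 MB/s of payload before framing overhead; the commonly
# quoted usable figure is ~400 MB/s per link.
LINE_RATE_BAUD = 4.25e9        # 4 Gb FC line rate
ENCODING = 8 / 10              # 8b/10b encoding efficiency

def fc_payload_mb_per_s(links):
    """Approximate aggregate payload bandwidth in MB/s."""
    return links * LINE_RATE_BAUD * ENCODING / 8 / 1e6

print(fc_payload_mb_per_s(2))  # ~850 MB/s best case; ~800 MB/s in practice
```

Two links, roughly 800 MB/s into host memory. That’s the ceiling, no matter whose name is on the appliance.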
Exit Conventional Wisdom, and Electronics!
Here is a graphic of the V3 DATAllegro building block. It’s two Dell 2950s (a.k.a. Compute Nodes), each plumbed with two 4 Gb Fibre Channel HBAs to a small EMC CX3 array. According to this piece on DATAllegro’s website, they are the only people on the planet to push more than is electronically possible through two 4 Gb HBAs. I quote:
Data for each compute node is partitioned into six files on dedicated disks with a shared storage node. Multi-core allows each of these six partitions to be read in parallel. Data is streamed off these partitions using DATAllegro Direct Data Streaming™ (DDS) technology that maximizes sequential reads from each disk in the array. DDS ensures the appliance architecture is not I/O bound and therefore pegged by the rate of improvement of storage technology. As a result, read rates of over 1.2 GBps per compute node are possible.
That’s right. I wasn’t going to point out that each compute node is fed by six disks, because if I did I’d also have to tell you they are 7200 RPM SATA drives, mirrored. Supposedly we are to believe that the pixie dust known as Direct Data Streaming™ can, uh, pull data at what rate per spindle? Yes, that’s right, they say 200 MB/s per drive! Folks, I’ve got 7200 RPM LFF SATA drives all over the place and you can’t get more than 80 MB/s per drive from these things (and that is actually fairly tough to do). Even EMC’s own specification sheet for the CX3 spells out the limit as 31-64 MB/s. I’ll attest that if your code stays out on the outer, say, 10% of the drive you can stream as much as 75-80 MB/s from these things. So with the DATAllegro system, and using my best numbers (not EMC’s published numbers), you’d only expect to get some 480 MB/s from six 7200 RPM SATA drives (6 × 80 MB/s). Wow, that Direct Data Streaming™ technology must be really cool, albeit totally cloak-and-dagger. Let’s not stop there.
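To make the arithmetic explicit, here’s the claim reduced to numbers in a throwaway Python sketch, using my generous 80 MB/s per-spindle figure rather than EMC’s published range:

```python
# The per-spindle rate implied by "over 1.2 GBps per compute node" from
# six mirrored 7200 RPM SATA drives, versus a generous real-world figure.
SPINDLES = 6
CLAIMED_NODE_MB_S = 1200           # DATAllegro's claim
BEST_CASE_SPINDLE_MB_S = 80        # outer-edge streaming, being generous
EMC_CX3_SPEC_MB_S = (31, 64)       # EMC's own published range per drive

print(CLAIMED_NODE_MB_S / SPINDLES)       # 200 MB/s per drive (!)
print(SPINDLES * BEST_CASE_SPINDLE_MB_S)  # 480 MB/s realistic ceiling
```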
What about this 1.2 GB/s per compute node claim? How do you pump that through two 4 Gb FC HBAs? You don’t. Not even DATAllegro with all those Cool Sounding™ technologies. What’s really being said in that DATAllegro overview piece is that their effective ingestion rate is some 1.2 GB/s. I quote:
Compression expands throughput: Within each node, two of the multi-core processors are reserved for software compression. This increases I/O throughput from 800MBps from the shared storage node to over 1.2 GBps for each compute node.
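So there it is: the 1.2 GB/s is logical throughput measured after software decompression, not what actually moves across the wire. The implied compression ratio, sketched below in Python, is a perfectly ordinary 1.5:1. That’s compression, not magic.

```python
# The quote concedes the trick: 800 MB/s of physical I/O becomes 1.2 GB/s
# of logical throughput after software decompression. The implied ratio:
PHYSICAL_MB_S = 800    # the two-HBA Fibre Channel ceiling
LOGICAL_MB_S = 1200    # the advertised per-node "read rate"
print(f"{LOGICAL_MB_S / PHYSICAL_MB_S:.2f}:1")  # 1.50:1 compression ratio
```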
They could just come out and say it, but they expect you to believe in magic. I’ll quote Stuart Frost (CEO, DATAllegro) on more of this magical secret sauce:
Another very important aspect of performance is ensuring sequential reads under a complex workload. Traditional databases do not do a good job in this area – even though some of the management tools might tell you that they are! What we typically see is that the combination of RAID arrays and intervening storage infrastructure conspires to break even large reads by the database into very small reads against each disk.
Traditional databases are only victims of what storage arrays do with their I/O requests by way of slicing and dicing. Further, the OS and FC HBA impose limits on the size of large I/O requests. It is not a characteristic of a traditional database system. Even a Totally Rad Non-Traditional RDBMS™ like the one DATAllegro embeds in their compute nodes (spoiler: it’s Ingres, nothing new) will fall prey to what the array controller does with large I/O requests. But more to the point, FC HBAs and the Linux (CentOS, in DATAllegro’s case) block I/O layer impose limits on the size of transfers, and that limit is generally 1 MB.
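Don’t take my word for it. On any Linux system you can read the block layer’s transfer-size limits straight out of sysfs. Here’s a minimal Python sketch; the device name sda is a placeholder for whatever FC LUN you actually have.

```python
# A minimal sketch: read the Linux block layer's transfer-size limits
# from sysfs. These queue attributes are standard on 2.6+ kernels;
# "sda" is a hypothetical device name, substitute your actual FC LUN.
from pathlib import Path

DEV = "sda"  # placeholder device name

def queue_attr_kb(name):
    """Return a block-queue attribute; these two are reported in KB."""
    return int(Path(f"/sys/block/{DEV}/queue/{name}").read_text())

print("max_sectors_kb   :", queue_attr_kb("max_sectors_kb"))     # current I/O size cap
print("max_hw_sectors_kb:", queue_attr_kb("max_hw_sectors_kb"))  # HBA/driver hard cap
```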
If I’m wrong, I expect DATAllegro to educate us, with proof, not more implied Awesomely Fabulicious CoolFlips Technology™. In the end, however, whether or not they managed to code custom FC HBA drivers and somehow obtain custom firmware for the CX3 to achieve larger transfer sizes than anyone else, I’ll bet dollars to donuts they can’t push more than 800 MB/s through dual 4 Gb FCP HBAs, and certainly not from six 7200 RPM SATA drives.