In Robin Harris’ latest installment about ZFS action at Apple, he let out a glimpse of one of his other apparent morbid curiosities—flash. Just joking, I don’t think ZFS on Mac or flash technology are morbid, it just sounded catchy. Anyway, he says:
I’ve been delving deep into flash disks. Can you say “weird”? My take now is that flash drives are to disk drives what quantum mechanics is to Newtonian physics. I’m planning to have something out next week.
I look forward to what he has to say. I too have a great interest in flash.
Now, folks, just because we are Oracle-types and Jim Grey was/is a Microsoft researcher, you cannot overlook what sorts of things Jim was/is interesting in. Jim’s work has had a huge impact on technology over the years and it turns out that Jim took/takes an interest in flash technology with servers in mind. Just the abstract of that paper makes it a natural must-read for Oracle performance minded individuals. Why? Because it states (with emphasis added by me):
Executive summary: Future flash-based disks could provide breakthroughs in IOps, power, reliability, and volumetric capacity when compared to conventional disks.
Yes, IOps! Nothing else really matter where Oracle database is concerned. How can I say that? Folks, round-brown spinning things do sequential I/O just fine—naturally. What they don’t do is random I/O. To make it worse, most SAN array controllers (you know, that late 1990’s technology) pile on overhead that further choke off random I/O performance. Combine all that with the standard IT blunder of allocating space for Oracle on a pure capacity basis and you get the classic OakTable Network response:
Attention DBAs, it’s time for some déjà vu. I’ll state with belligerent repetition, redundantly, over and over, monotonously reiterating this one very important recurrent bit of advice: Do everything you can to get spindles from your storage group—not just capacity.
Yes that’s right, it wont be long (in relative terms) until you see flash memory storage fit for Oracle databases. The aspect of this likely future trend that I can’t predict, however, is what impact such technology would have on the entrenched SAN array providers. Will it make it more difficult to keep the margins at the levels they demand, or will flash be the final straw that commoditizes enterprise storage? Then again, and Jim Grey points out in that paper, flash density isn’t even being driven by the PC—and most certainly not enterprise storage—ecosystem. The density is being driven by consumer and mobile applications. Hey, I want my MTV. Um, like all of it, crammed into my credit-card sized mpeg player too.
When it gets cheaper and higher capacity of course. Well, its not exactly that simple. I went spelunking for that Samsung 1.8” 32GB SSD and found two providers with street price of roughly USD $700.00 for 32GB here and here. In fact, upon further investigation, Ritek may soon offer a 32GB device at some $8 per GB. But let’s stick with current product for the moment. At $22 per GB, we’re not exactly talking SATA which runs more on the order of $.35 per GB. But then we are talking enterprise applications here, so a better comparison would be to Fibre drives which go for about $3-$4 per GB.
Now that is interesting since Jim Grey pointed out that in-spite of some industry predictions setting the stage for NAND to double every year, NAND had in fact gained 16 fold in 4 years–off by year. If that pace continues, could we really expect 512GB 1.8″ SSD devices in the next 4 years? And would the price stay relatively constant yielding a cost of something like $1.35 per GB? Remember, even the current state of the art (e.g., the Samsung 1.8″ 32GB SSD) delivers on the order of 130,000 random single-sector IOps–that’s approximately 7usec latency for a random I/O. At least that is what Samsung’s literature claims. Jim’s paper, on the other hand reports grim current art performance when measured with DskSpd.exe:
The story for random IOs is more complex – and disappointing. For the typical 4-deep 8KB random request, read performance is a spectacular 2,800 requests/second but write performance is a disappointing 27 requests/second.
The technology is young and technically superior, but there is work to do in getting the most out of NSSD as the paper reports. Jim suspects that short term quick fixes could be made to bring the random I/O performance for 8KB transfers on today’s NSSD technology up to about 1,700 IOps split evenly between read and write. Consider, however, that real world applications seldom exhibit a read:write ratio of 50:50. Jim generalized on the TPC-C workload as a case in point. It seems with “some re-engineering” (Jim’s words) even today’s SSD would be a great replacement for hard drives for typical Oracle OLTP workloads since you’ll see more 70:30 read:write ratios in the real world. And what about sequential writes? Well, there again, even today’s technology can handle some 35MB/s of sequential writes so direct path writes (e.g., sort spills) and redo log writes would be well taken care of. But alas, the $$/GB is still off. Time will fix that problem and when it does, NSSD will be a great fit for databases.
Don’t think for a moment Oracle Corporation is going to pass up on enabling customers to exploit that sort of performance–with or without the major SAN vendors.
But flash burns out, right? Well, yes and no. The thing that matters is how long the device lasts-the sum of its parts. MTBF numbers are crude, but Samsung sticks a 1,000,000hr MTBF on this little jewel-how cool.
Well, I’ve got the cart well ahead of the horse here for sure because it is still too expensive, but put it on the back burner, because we aren’t using Betamax now and I expect we’ll be using fewer round-brown spinning things in the span of our careers.