I’ve been reading a lot of blogosphere content about Data Warehousing these days. I’ve taken a lot of interest in technologies such as Netezza, GreenPlum, DATAllegro and others, and blog reading proves to be an interesting way to augment one’s knowledge. Who’d have thought I’d learn so much about OLTP through this reading?
Memory is Faster than Disk, So Let’s Do a Complete Rewrite
Why, just today I found out that it is time for a total rewrite of commercial RDBMS products. Uh huh. More interestingly, though, I learned:
- Memory is faster than disk. Really, truly, it is!
- A dual-core (2.8GHz) server with 4GB memory and four 250GB SATA drives can perform 51,000 tpmC
- Disabling transaction logging entirely in a commercial RDBMS will increase throughput (tpmC) about three-fold
I found these pearls of wisdom while reading a Stonebraker paper referred to in this blog post. Yes, I know that blog is basically a store-front for Vertica, but I like to learn about different things that are going on in database technology. Unfortunately, this time I was wasting my time. The URL in that blog post points to the VLDB front page, but a little sleuthing found the paper posted here: The End of an Architectural Era (It’s Time for a Complete Rewrite).
Recite after me:
If you get two orders of magnitude performance gain, you are either not doing it or you’ve moved it closer to the processor.
Dang, and I ain’t even got no too pretty good pedigree. Pshaw, I dasn’t fidget ‘mungst the quality!
Central versus De-centralized versus Shared-Nothing
No, it isn’t time for a rewrite, especially one that requires a complete shared-nothing database approach. Now don’t get me wrong, I’m all for de-coupling and grid architecture, most particularly where storage is concerned. If I hear of another poor production site that is head-saturated on a $500,000 storage array when driving a measly 15 or so 15K RPM drives, I’ll BAARF. Please see the following post for what I’m talking about: