This is installment number two in my series on Oracle Exadata Storage Server and HP Oracle Database Machine frequently asked questions. I recommend you also visit Exadata Storage Server Frequently Asked Questions Part I. I’m mostly cutting and pasting questions from the comment threads of my blog posts about Exadata and mixing in some assertions I’ve seen on the web and re-wording them as questions.
Later today The Pythian Group will be conducting a podcast question and answer interview with me that will be available on their website shortly thereafter. I’ll post a link to that when it is available.
Questions and Answers
Q. [I’m] willing to bet this is a full-blown Oracle instance running on each Exabyte [sic] Storage Server.
A. No, bad bet. Exadata Storage Server Software is not an Oracle Database instance. I happened to have an xterm with a shell process sitting in a directory with the self-extracting binary distribution file in it. We can tell by the size of the file that there is no room for a full-blown Oracle Database distribution:
|$ ls -l cell*
-rwxr-xr-x 1 root root 206411729 Sep 12 22:04 cell-080905-1-rpm.bin
Q. This must certainly be a difficult product to install, right?
A. HP installs the software on their manufacturing floor. Nonetheless I’ll point out that installing Oracle Exadata Storage Server Software is a single execution of the binary distribution file without options or arguments. Further, initializing a Cell is a single command with two options of which only one requires an argument; such as the following example where I specify a bonded Infiniband interface for interconnect 1:
CellCLI: Release 184.108.40.206.0 – Production on Fri Sep 26 10:56:17 PDT 2008
Copyright (c) 2007, 2008, Oracle. All rights reserved.
Cell Efficiency Ratio: 10,956.8
CellCLI> create cell cell01 interconnect1=bond0
After this command completes I’ve got a valid cell. There are no preparatory commands (e.g., disk partitioning, volume management, etc).
Q. I’m trying to grasp whether this is really just being pitched at the BI and data warehouse space, or whether it has real value in the OLTP space as well.
A. Oracle Exadata Storage Server is the best block server for Oracle Database, bar none. That being said, in the current release, Exadata Storage Server is in fact optimized for DW/BI workloads, not OLTP.
Q. I know we shouldn’t set too much store in these things, but are there plans to submit TPC benchmarks?
A. You are right that there should not be as much stock placed in TPC benchmarks, but they are a necessary evil. I don’t work in that space, but could you imagine Oracle not doing some audited benchmarks? Seems unlikely to me.
On the topic of TPC benchmarks, I was taking a gander at the latest move in the TPC-C “arms race.” This IBM RS600 p595 result of 6,085,166 TpmC proves that the TPC-C is not (has never been) an I/O efficiency benchmark. If you throw more gear at it, you get a bigger number! Great!
How about a stroll down memory lane.
When I was in Database Engineering in Sequent Computer Systems back in 1998, Sequent published a world-record Oracle TPC-C result on our NUMA system. We achieved 93,901 TpmC using 64GB main memory. The 6,085,166 IBM number I just cited used 4TB main memory. So how fulfilling do you think it must be to do that much work on a TPC-C just to prove that in 10 years nothing has changed! The Sequent result comes in at 1 TpmC per 714KB main memory and the IBM result at 1 TpmC per 705KB main memory. Now that’s what I want to do for a living! Build a system with 10,992 disk drives and tons of other kit just to beat a 10-year-old result by 1.3%. Yes, we are now totally convinced that if you throw more memory at the workload you get a bigger number! In the words of Gomer Pyle, “Soo-prise, Soo-prise, Soo-prise.” Ok, enough of that, I don’t like arms-race benchmarks.
Q. From the Oracle Exadata white paper: “No cell-to-cell communication is ever done or required in an Exadata configuration.”and a few paragraphs later: “Data is mirrored across cells to ensure that the failure of a cell will not cause loss of data, or inhibit data accessibility” Can both these statements be true and would we need to purchase a minimum of two cells for a small-ish ASM environment?
A. Cells are entirely autonomous and the two statements are true indeed. Consider two ASM disks out in a Fibre Channel SAN. Of course we know those two disks are not “aware of each other” just because ASM is using blocks from each to perform mirroring. The same is true for Oracle Exadata Storage Server cells and the drives housed inside them. As for the second part of the questions, yes, you must have a minimum of two cells. In spite of the fact that Cells are shared nothing (unaware of each other), ASM is in fact Cell-aware. ASM is intelligent enough to not mirror between 2 drives in the same Cell.
Q. Can this secret sauce help with write speeds?
A. That depends. If you have a workload suffering from the loss of processor cycles associated with standard Unix/Linux I/O libraries then, sure. If you have an application that uses storage provisioned from an overburdened back-end Fibre Channel disk loop (due to application collusion) then, sure. Strictly speaking, the “secret sauce” is the Oracle Exadata Storage Server Software and it does not have any features for write acceleration. Any benefit would have to come from the fact that the I/O pipes are ridiculously fast and the I/O protocol is ridiculously lightweight and the system on a whole is naturally balanced. I’ll blog about the I/O Resource Management (IORM) feature of Exadata soon as I feel it has positive attributes that will help OLTP applications. Although it is not an acceleration feature, it eliminates situations where applications steal storage bandwidth from each other.
Q. I like your initial overview of the product, but I believe that you need to compare both Netezza and Exadata side by side in real-world scenarios to gauge their performance.
A. I partially agree. I cannot go and buy a Netezza and legally produce competitive benchmark results based on the gear. Just read any EULA for any storage management software and you’ll see the bold print. Now that doesn’t mean Oracle’s competitors don’t do that. I think the comparison will come in the form of reduced Netezza sales. Heaven knows the 16% drop in Netezza stock was not as brutal as I expected.
Q. Re. [your] comparison to Netezza [in your first Exadata related post]. It’s bit of apple to oranges, really. You assume 80MB/s per disk for Exadata and for some reason only 70MB/s per disk for Netezza. Also, you have 168 disks spinning in parallel on Exadata and 112 on Netezza. Had your assumptions been tha same, sequential IO throughput would be similar, at least theoretically.
A. Reader, I invite you to explain to us how you think native SATA 7,200 RPM disk drives are going to match 15K RPM SAS drives. When I put 70 MB/s into the equation I was giving quite a benefit of the doubt (as if I’ve never measured SATA performance). Please, if you have a Netezza let us know how much streaming I/O you get from a 7,200 RPM SATA drive once you read beyond the first few outside sectors. I have also been using the more conservative 80 MB/s for our SAS drives. I’m highballing SATA and low-balling SAS. That sounds fair to me. As for the comparison between the numbers of drives, well, Netezza packaging limits the drive (SPU) count to 112 per cabinet. It would suit me fine if it takes a 1 plus another half rack to match a single HP Oracle Database Machine. That empty half of the rack would be annoying from a space constraint point of view though. Nonetheless, if you did go with a rack and a half (168 SPU), would that somehow cancel out the base difference in drive performance between SATA and SAS?