BLOG UPDATE 21-SEP-2009: The session Glenn Fawcett and I were scheduled to deliver has been cancelled.
They are letting me out of my cage long enough to attend Open World 2009. I’ll be working some of the Sun Oracle Database Machine demos and offering a couple of low-key sessions. One of the sessions is a joint-session with my old friend Glenn Fawcett. Glenn and I have been doing some performance engineering work on a full-rack Sun Oracle Database Machine. I don’t yet know the time slot for that session but I’ll post it here when I find out.
I’ve also signed up to deliver a session on Monday, October 12 in the Open World UnConference. I signed up before the Sun Oracle Database Machine announcement, so I gave the session a bit of a stealth title. I’ll be talking about Exadata, but perhaps more importantly I’ll have a lengthy question and answer session. If you check out the schedule you’ll see my session is in the same room following two more interesting sessions: one by my friend, co-worker and fellow OakTable Network member Greg Rahn, and one by fellow OakTable Network member, and luminary, Cary Millsap:
1pm
Overlook II: Chalk & Talk: The Core Performance Fundamentals Of Oracle Data Warehousing (Greg Rahn, Database Performance Engineer, Real-World Performance Group @ Oracle)
2pm
Overlook I: Fundamentals of Performance (Oracle ACE Director Cary Millsap)
3pm
Overlook II: Oracle Exadata Storage Server FAQ Review and Q&A with Kevin Closson (Performance Architect, Oracle)
Kevin,
March 2009: Exadata was all HP.
Wow, big change. I’m curious how Exadata software, which had been integrated with HP hardware, was modified to work on Sun. I would assume that there were lots of HP and Oracle engineers working together to make the software and hardware work together. How was it possible to replace all that hardware so quickly?
Maybe, that’s between you & Larry, but it sure is an interesting idea for me-who-is-interested-in-hardware-software-integrations.
-paul
“I would assume that there were lots of HP & Oracle engineers working together to make the software/hardware work together.”
Why?
BTW, Paul, are there any of the guys left at your operation that remember their Sequent Symmetry Clusters running OPS circa 1993 or so? I sure do.
Why? Ignorance, I guess. So, how did/does it work? How portable is/was the software design and what did it take to make the move? I’m just curious and, hopefully, not too nosy.
—-
You remember 1993? In their youth, I hope the (current) old timers were not too mean… BTW, I think of top2 as a bit of a gold standard. I understand that DYNIX/ptx still survives somewhere in the bowels of IBM. I came to American TV in 1995. I heard lots of stories about Oracle7 OPS version 7.0. I believe OPS v7.0 would be fairly unforgettable 🙂 (now that it is 16 yrs later, we can laugh, ay?). Very ambitious! The servers were named “left” and “right”. I moved the systems to “numa” (#1), then to “numb”, then to “numc”, and, in 2004, revived clustering on Sun V1280s and Veritas SFRAC.
Paul,
I was onsite at your operation helping stabilize that cluster in about 1993, as I recall, and it wasn’t all that easy. They went live just before a huge Labor Day sale, and, uh, back-to-school retail in your part of the world is significant business…the SKUs were whizzing through the system! Oh well, that is memory lane.
I’m not trying to put you on the spot (after all, we are in an odd way kindred spirits), but I have to ask how much you investigated Exadata V1. Between the papers, presentations, my webcasts, my posts and so forth, it is pretty clear that Exadata software is portable. It’s not hardwired to the hardware much beyond the disk management layer it possesses. Think of that layer of Exadata the same way you think about the OSDs in the Oracle Server. Since it is portable, it was able to be adapted to exploit FlashFire flash devices. And, by the way, I’m surprised people aren’t more jazzed about that technology. There are 4 FlashFire cards per cell, each able to deliver 1 gigabyte per second to memory. Tremendous! Oh, and I’ll point out that the FlashFire -> memory bandwidth does not come at the expense of the spinning disk controller -> memory bandwidth.
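The arithmetic behind that claim is worth spelling out. A quick back-of-the-envelope sketch; the 4-cards-per-cell and 1 GB/s figures come from the text above, while the 14-cell full-rack count is my own assumption for illustration:

```python
# Back-of-the-envelope FlashFire bandwidth arithmetic.
# From the text: 4 FlashFire cards per Exadata cell, each delivering
# roughly 1 GB/s to memory. The 14-cell full-rack count is an
# assumption added here for illustration.
CARDS_PER_CELL = 4
GBPS_PER_CARD = 1          # GB/s, per the post
CELLS_PER_FULL_RACK = 14   # assumed full-rack cell count

per_cell = CARDS_PER_CELL * GBPS_PER_CARD
full_rack = per_cell * CELLS_PER_FULL_RACK

print(f"Per cell:  {per_cell} GB/s from flash")
print(f"Full rack: {full_rack} GB/s from flash")
```

And remember, per the point above, that flash bandwidth is additive with, not subtracted from, the spinning-disk controller bandwidth.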
I agree with you Kevin, it’s all about optimizing a concept of architecture for x86-based iron. Exadata V1 and V2 share the very same architecture: they just upgraded components for faster ones, but it’s just a bunch of x86 servers, RDS over InfiniBand, SAS/SATA disks, ASM, … nothing changed, therefore it’s as easy to port as changing servers from a DL380 to an X4170.
The real new value of V2 is FlashFire, the Oracle optimizations for that technology, and Hybrid Columnar Compression. Amazing, amazing. I’m really waiting to see that in action!
Just a small nit: Exadata never ran on the ProLiant DL380. Exadata is the storage server software. It ran on the DL180 (and other hardware before that) and now runs on the X4275. There was work to do to get that working. I didn’t do it, but I work with the guys who did. That work was to port the cell management and disk management software from the HP infrastructure to the Sun infrastructure. It is unfair to say that “nothing changed.” Once you log into an X4275, it is true that it “looks and feels” just like any other Xeon 5500-based system running the same OS (e.g., a ProLiant DL180 G6 with OEL 5). However, when you approach the KVMs of the two systems they are totally different, so think of it that way. Exadata is more than just the runtime; there is a lot of management stuff in there as well (e.g., disk management).
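The OSD comparison above suggests a familiar structure for this kind of port. Purely as an illustration (every class, method, and device name below is hypothetical; this is not the actual Exadata code base), porting software of this shape means rewriting only a thin platform layer while the portable cell logic above it is untouched:

```python
# Illustrative sketch of an OSD-style split: portable cell logic on
# top, a thin platform-specific layer underneath. All names here are
# hypothetical, invented for this example.
from abc import ABC, abstractmethod

class PlatformDisks(ABC):
    """The small, platform-dependent layer that a port replaces."""
    @abstractmethod
    def enumerate_disks(self) -> list[str]: ...

class HpCellDisks(PlatformDisks):
    def enumerate_disks(self) -> list[str]:
        # HP Smart Array-style device naming (illustrative)
        return [f"/dev/cciss/c0d{i}" for i in range(12)]

class SunCellDisks(PlatformDisks):
    def enumerate_disks(self) -> list[str]:
        # Generic Linux SCSI-style device naming (illustrative)
        return [f"/dev/sd{chr(ord('a') + i)}" for i in range(12)]

class CellServer:
    """Portable cell logic: unchanged across an HP -> Sun port."""
    def __init__(self, platform: PlatformDisks):
        self.platform = platform
    def disk_count(self) -> int:
        return len(self.platform.enumerate_disks())

# The same CellServer code runs on either platform layer.
print(CellServer(HpCellDisks()).disk_count())
print(CellServer(SunCellDisks()).disk_count())
```

The point of the sketch: the work the port team did lives in the lower layer, which is exactly why “nothing changed” understates it while “total rewrite” overstates it.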
Nice to see folks picking out the named software optimizations. I haven’t even started blogging about that yet. I will be speaking at OW about these optimizations.
I am assuming that given the current speed of SPARC chips, Oracle does not have any plans to port it on SPARC?
Hi Amir,
Port what? Remember folks, Database Machine has 2 grids. Are you asking about the storage grid or the database grid or both?
OK, I’ll bite, at the risk of shamelessly exposing more of the width of my ignorance.
First: where do I find information about Sun FlashFire technology? More admitted ignorance: the hyperlinks seem a bit circular. I can’t find anything beyond “flash accelerator”.
What is a “flash accelerator”? How does Sun FlashFire work? Where is its place in the stream of data between the Exadata software and the physical disk drives? How might a thing like Sun FlashFire compete against the DMX-4 cache functions, or is that nonsensical? (BTW, I now know that the main point of Exadata Server is to simplify Oracle’s unique ways of getting a row.)
How does FlashFire mitigate the random I/O problem? What is new/exciting about the algorithms that populate and flush the cache? What are the issues surrounding the time period for warming the cache (i.e., never reboot, ay!)?
ttfn, sunsets, rivers and kids are calling….
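The populate/flush/warming questions above do have a generic answer shape, even before getting into Oracle's specifics. The following toy LRU model is explicitly NOT Oracle's actual flash cache algorithm (all names and figures here are invented for illustration); it only shows why any cache must warm before its hit ratio stabilizes, which is why "never reboot" is only half a joke:

```python
# Toy cache model: populate on miss, flush least-recently-used on
# overflow. A generic illustration only, not Oracle's flash cache.
from collections import OrderedDict
import random

class ToyFlashCache:
    def __init__(self, capacity: int):
        self.capacity = capacity
        self.blocks = OrderedDict()
        self.hits = self.reads = 0

    def read(self, block_id: int) -> None:
        self.reads += 1
        if block_id in self.blocks:
            self.hits += 1
            self.blocks.move_to_end(block_id)    # mark recently used
        else:
            self.blocks[block_id] = True         # populate on miss
            if len(self.blocks) > self.capacity:
                self.blocks.popitem(last=False)  # flush the LRU block

    def hit_ratio(self) -> float:
        return self.hits / self.reads

# Warming: the hit ratio climbs as the working set lands in cache.
# Invented workload: 2000-block working set, 1000-block cache.
random.seed(42)
cache = ToyFlashCache(capacity=1000)
for _ in range(50_000):
    cache.read(random.randint(0, 1999))
print(f"hit ratio after warm-up: {cache.hit_ratio():.2f}")
```

With half the working set fitting in cache, the steady-state hit ratio settles near one half; the early misses that drag the average below that are exactly the warm-up cost being asked about.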
There is a nice YouTube video from Andy Bechtolsheim about the underlying technology in Sun FlashFire:
Paul,
here’s the link about the Flash Accelerator on the Sun website:
http://www.sun.com/storage/disk_systems/sss/f20/index.xml
Hi Kevin.
Thanks for clarifying that this database machine is a db grid plus a storage grid, and that Exadata is storage software customized for some hardware. Wanting to understand (and explain to customers) Exadata storage, I’d like to ask if the following analogy/comparison to traditional enterprise storage is close enough to clarify how it works (Exadata concept on the left, traditional enterprise storage equivalent on the right, with “~” meaning “equivalent”):
Exadata storage grid (interconnected Exadata cells) ~ no close equivalent (loosely analogous to multiple storage processors on same storage server)
Exadata cell ~ storage server with 1 storage processor
Infiniband “SAN” ~ Fiber, Ethernet network SAN
ZDP (RDS) storage protocol ~ FC-AL, iSCSI storage protocol
Exadata cell “disk”? (e.g. ASM disk?) ~ physical disk (?)
ASM disk group (of Exadata ASM disks) ~ RAID 0, 1, 10 array
ASM native “file” (db, ocr, css, etc.) ~ LUN
ASM logical volume (via ADVM) ~ LUN
I am not sure what is “presented” to ASM on the db grid by an Exadata cell: is it a logical volume that could possibly be carved out from mirrored/striped internal disks, or else from some driver going to flash memory?