In my recent blog entry entitled “Oracle Exadata Storage Server. No Magic in an Imperfect World. Excellent Tools and Really Fast I/O Though”, I concluded with a reference to some anti-Exadata comments made by EMC’s Chuck Hollis in his blog entry entitled “Oracle Does Hardware.” I directed my readers to that blog post by writing the following:
In spite of how many times EMC’s Chuck Hollis may claim that there is “nothing new” or “no magic” when referring to Oracle Exadata Storage Server, I think it is painfully obvious that there is indeed “something new” here. Is it magic? No, and we don’t claim that it’s magic.
Six days after I posted that blog entry, Chuck submitted a lengthy comment on the post. Instead of responding to Chuck’s comments in the comment thread I’ve decided to do so here.
Readers please don’t confuse this as some sort of Kevin versus Chuck thread because it isn’t. What you’ll see in this post is an analysis of the words of someone representing one of the (if not the premier) conventional storage providers (EMC). My motives are to provide useful information in this analysis.
If you read Chuck’s assessment of Oracle Exadata Storage Server, you’ll see a positioning piece with an overtly anti-Exadata slant. Chuck’s words in that post are aimed at conveying facts. My first handling of that anti-Exadata piece was very light. I aimed to capitalize foremost on the repetitious use of the words “nothing new” and “magic.” Chuck likely saw my post where I called this out. Chuck answered my calling-out in the comment thread of this post. Chuck wrote:
Sorry, Kevin, didn’t mean to come across as too pessimistic in my blog.
Asserting Beliefs
I need to point out that one cannot be pessimistic about facts. The word pessimistic only applies to beliefs and emotions. Chuck’s piece wasn’t pessimistic–it was flawed on technical grounds. Chuck continued with:
Leaving hardware issues aside, how much of the software functionality shown here is available on generic servers, operating systems and storage that Oracle supports today? I was under the impression that most of this great stuff was native to Oracle products, and not a function of specific tin …
Last Chance for a First Impression
Chuck’s “pessimistic” post came out the day after the Oracle Exadata Storage Server launch so harboring such questions in his mind at that time would have been understandable. However, Chuck visited my blog some 22 days later and continued to ask questions that clearly demonstrate a lack of understanding of Oracle Exadata Storage Server. Chuck may have been “under the impression” that the underpinnings of Exadata are Oracle-generic (“native to Oracle products”), but he is wrong. Oracle Exadata Storage Server software is not a scalpel-job on the Oracle Database server. It is a totally new storage server software package.
To answer Chuck’s question about the software, none of the software functionality (Exadata) is available on generic servers, operating systems or storage. Chuck continued with:
If the Exadata product has unique and/or specialized Oracle logic, well, that’s a different case.
Yes, it is unique and specialized and a different case. Even light reading of the available material (e.g., the Exadata paper and, shucks, maybe a few of my blog posts) would have made that glaringly obvious. Chuck continued with:
Speaking strictly as a storage guy, here’s what I know.
– using commodity servers and storage arrays, we can usually feed in more data than a server can process, specifically true in an Oracle DW environment.
Chuck, swamping a commodity server is not the goal. Of course it’s easy to produce more raw, streaming data from even a midrange storage array than can be ingested by a single commodity server. Even the best commodity servers choke at less than 2GB/s data ingest rate when Oracle is performing data-rich functionality (e.g., joins, sorts, aggregation, etc.). The design goal of Exadata was not to swamp commodity servers more efficiently. That would be a storage-only, bigger hose, speed-and-feed mentality: the “brute force only approach.”
You Don’t Always Get What You Want: Enter Exadata
The value proposition of Exadata is to scan disk without bottlenecks and return to the Database grid only the data the query wants, not blocks of disk. It’s a feature we call Smart Scan. Chuck needs to have his folks brief him on that. However, Exadata is more than capable of holding its own in the pure “brute force” camp.
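To make the difference between shipping blocks and shipping results concrete, here is a toy Python sketch. It is not how Exadata is implemented. The table, the column names, the Row type and both scan functions are hypothetical, invented purely for illustration. It only shows the principle described above: filter rows and project columns where the data lives, then send back what the query asked for.

```python
from dataclasses import dataclass

ROWS_PER_BLOCK = 4  # hypothetical; small so the toy table spans many blocks


@dataclass(frozen=True)
class Row:
    order_id: int
    region: str
    amount: float
    comment: str  # wide column the query never asks for


def make_table(n_rows):
    """Build a toy table; every tenth row belongs to the 'EMEA' region."""
    return [
        Row(i, "EMEA" if i % 10 == 0 else "APAC", float(i), "x" * 100)
        for i in range(n_rows)
    ]


def block_server_scan(table):
    """Conventional storage: ship every block and let the database host filter."""
    return [table[i:i + ROWS_PER_BLOCK] for i in range(0, len(table), ROWS_PER_BLOCK)]


def smart_scan(table, predicate, columns):
    """Cell-side scan: apply the predicate and the projection before shipping."""
    return [
        tuple(getattr(row, col) for col in columns)
        for row in table
        if predicate(row)
    ]


table = make_table(1000)
blocks = block_server_scan(table)
shipped_rows = smart_scan(table, lambda r: r.region == "EMEA", ("order_id", "amount"))

print(sum(len(b) for b in blocks), "full rows shipped by the block server")
print(len(shipped_rows), "two-column rows shipped after cell-side filtering")
```

In this toy table one row in ten matches the predicate and the query wants two of the four columns, so the cell-side scan returns 100 narrow tuples where the block server ships all 1,000 full rows.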
Even Without Smart Scan, Exadata is Faster Than Conventional Storage
As a simple block server, Exadata is able to deliver 1GB/s per cell to the Database grid. If you don’t think that is “brute force”, consider a moderate Oracle Database Machine configuration consisting of a single rack serving 14 GB/s to the Database grid. If those numbers don’t speak loudly enough, just investigate what sort of conventional SAN array configuration it would take to deliver 14 GB/s to a Database grid. So, yes, Exadata is both “brute force” and intelligent and that is why I had to call out Chuck’s blog remarks about how Exadata is “nothing new.”
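The arithmetic behind those numbers is simple enough to write down. The short Python sketch below uses the per-cell, per-rack and server-ingest figures quoted in this post. The per-link Fibre Channel figure is my own assumption (roughly 400 MB/s usable on a 4Gb link) and is there only to give a feel for the plumbing a conventional array would need. Treat the result as a back-of-the-envelope estimate, not a sizing guide.

```python
import math

# Figures taken from the post above, expressed in MB/s.
CELL_MBPS = 1_000          # 1 GB/s delivered per Exadata cell
RACK_MBPS = 14_000         # 14 GB/s from a single-rack Database Machine
HOST_INGEST_MBPS = 2_000   # commodity servers choke below this ingest rate

# ASSUMPTION (not from the post): roughly 400 MB/s usable per 4Gb FC link.
FC_LINK_MBPS = 400

cells = math.ceil(RACK_MBPS / CELL_MBPS)
min_hosts = math.ceil(RACK_MBPS / HOST_INGEST_MBPS)
fc_links = math.ceil(RACK_MBPS / FC_LINK_MBPS)

print(f"{cells} cells at {CELL_MBPS} MB/s each")
print(f"at least {min_hosts} database hosts to ingest {RACK_MBPS} MB/s")
print(f"about {fc_links} FC links at an assumed {FC_LINK_MBPS} MB/s each")
```

Since the servers choke at less than 2GB/s, seven hosts is a floor, not a target. That is exactly why returning less data matters as much as returning it quickly.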
Chuck finished that paragraph with:
I’m having a hard time seeing the advantages of pairing a commodity Xeon-based server with JBOD and claiming a performance advantage for this part of the equation.
Oh my, where to start. Chuck, I understand why you would have difficulty seeing the advantage in what you just described, but what on earth does any of that have to do with Oracle Exadata Storage Server? First, where did you get “JBOD?” An Oracle Exadata Storage Server cell is not just a Xeon processor sitting in front of some disks (JBOD). The disks are down-wind of an intelligent HP P400 Smart Array with 512MB battery-backed write cache. And, what’s so terrible about fronting some disks with Xeon technology anyway? There are a few conventional storage arrays on the market that use Xeon in the array head.
It’s All About Balance
Fingering the fact that Intel Xeon processors execute storage intelligence software in the Oracle Exadata Storage Server doesn’t hold water–especially since the ratio is 2 sockets per 12 hard drives. Perhaps Chuck will tell us the maximum number of Xeon processors EMC supports in front of 960 drives in a fully loaded midrange EMC array (e.g., CX)?
Oracle has purpose-built a balanced system by coupling the power of 2 Xeon processors (Harpertown quad-core) in front of 12 drives.
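For scale, here is what that ratio would mean across 960 drives. This small Python sketch uses only the two numbers from this post (2 sockets per 12 drives, and a 960-drive array). It deliberately does not guess at EMC's processor count, which I have left as a question above.

```python
SOCKETS_PER_CELL = 2    # from the post: two Xeon sockets per cell
DRIVES_PER_CELL = 12    # from the post: twelve drives per cell
ARRAY_DRIVES = 960      # from the post: a fully loaded midrange array

# How many cells' worth of drives the array holds, and the socket count
# it would need in order to match the cell's processor-to-drive ratio.
cell_equivalents = ARRAY_DRIVES // DRIVES_PER_CELL
sockets_at_cell_ratio = cell_equivalents * SOCKETS_PER_CELL
drives_per_socket = DRIVES_PER_CELL // SOCKETS_PER_CELL

print(f"{drives_per_socket} drives per socket in a cell")
print(f"{ARRAY_DRIVES} drives would need {sockets_at_cell_ratio} sockets at that ratio")
```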
Infiniband: The Exadata “Secret Sauce?”
Chuck continued with:
– you may be more knowledgeable than I, but we are under the impression that the IB compute node connection doesn’t bring much to the party. When we looked at many clustered Oracle DW implementations, there was plenty of bandwidth available between the compute nodes, using multiple 1Gb/sec links.
That’s why we don’t talk about it much. Infiniband is not why Exadata is so fast. Infiniband is one of the reasons why Exadata is not bottlenecked. First, I’ll point out that Infiniband is a unified fabric for both disk and inter-node communications with Exadata. I’ve been writing about storage up to this point and now the focus is shifted to Real Application Clusters (RAC) interconnect technology. I’ll be brief on this topic. I don’t doubt, nor do I care, that there are clustered Oracle DW systems currently deployed that are able to get by with multiple UDP Gigabit Ethernet networks configured as the RAC interconnect. That’s just fine with me. Does that somehow negate the value of Exadata because Oracle so foolhardily engineered a zero-copy RDMA interconnect for RAC while unifying interconnect and storage networking into a single fabric? I shouldn’t think so. UDP costs some (lots) of cycles compared to ZDP over Infiniband. Just because a network has headroom left over doesn’t mean resources are otherwise being utilized efficiently. Oracle didn’t aim to engineer bottlenecks into the Exadata architecture.
Chuck continued with:
And, I know this only matters to storage people, but there’s the minor matter of having two copies of everything, rather than the more efficient parity RAID approaches. Gets your attention when you’re talking 10-40TB usable, it does.
Yes, the initial release of Exadata requires 1:1 mirroring. Does that somehow insinuate that Exadata will never offer the more space-saving RAID approaches Chuck is alluding to? Life is, after all, an unending series of choices.
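For readers who want to see the capacity trade-off Chuck is pointing at, here is a rough Python sketch. The 4+1 parity group width is a hypothetical example of mine, not something taken from Chuck’s comment or from any product, and the function names are made up. The sketch ignores hot spares and says nothing about performance or rebuild behavior.

```python
def raw_tb_mirrored(usable_tb):
    """Raw capacity needed when every block is stored twice (1:1 mirroring)."""
    return usable_tb * 2


def raw_tb_parity(usable_tb, data_disks, parity_disks=1):
    """Raw capacity needed for a parity group of data_disks + parity_disks."""
    return usable_tb * (data_disks + parity_disks) / data_disks


for usable in (10, 40):
    mirrored = raw_tb_mirrored(usable)
    parity = raw_tb_parity(usable, data_disks=4)  # hypothetical 4+1 group
    print(f"{usable} TB usable: {mirrored} TB raw mirrored, "
          f"{parity} TB raw with 4+1 parity")
```

Under those assumptions, 40TB usable costs 80TB of raw disk when mirrored and 50TB in a 4+1 parity layout. That is the gap Chuck is describing.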
Everyone Includes EMC
Chuck continued with:
Bottom line – what does the hardware bring to the party, rather than software? And if you can get the same benefits without dictating that customers buy a specific piece of tin, isn’t that a win for everyone?
Chuck, and my blog readers alike, should know by now what the hardware brings to the party. Oracle Exadata Storage Server hardware is, unlike conventional storage arrays, not configured with guaranteed throughput bottlenecks built in. That warrants a party. On the other hand, the software is the secret sauce. The choice of which “tin” gets to run the software is, of course, someone else’s decision. I will say, however, if you were to execute the software on systems less balanced than the current platform (HP ProLiant DL180 G5), you would not realize the benefit. It’s all about balance.
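What I mean by balance can be put in a few lines of Python: data flows through a chain of stages, and the slowest stage sets the pace for all of them. The stage names and every MB/s figure below are hypothetical, chosen only to illustrate the point. They are not measurements of Exadata or of any other product.

```python
def bottleneck(stages):
    """Return (stage_name, throughput) for the slowest stage in a pipeline."""
    name = min(stages, key=stages.get)
    return name, stages[name]


# Hypothetical per-stage capacities in MB/s.
balanced = {"disks": 1000, "controller": 1100, "cpu_scan": 1050, "network": 1200}
unbalanced = {"disks": 4000, "controller": 1100, "cpu_scan": 1050, "network": 400}

for label, stages in (("balanced", balanced), ("unbalanced", unbalanced)):
    stage, rate = bottleneck(stages)
    idle_disk = stages["disks"] - rate  # disk bandwidth that cannot be used
    print(f"{label}: limited to {rate} MB/s by {stage}, "
          f"{idle_disk} MB/s of disk bandwidth stranded")
```

In the unbalanced example the disks could stream 4,000 MB/s, but the system delivers only 400 MB/s because of the network stage. Adding spindles to that configuration buys nothing.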
Chuck finished with:
Finally, I’d be interested in your thoughts on how enterprise flash drives fit into all of this. Yes, they’re rather expensive now, but this won’t be the case before too long.
I’ve bored you all to death already. I’ll hit FLASH SSD in my next blog entry.
I’m sorry, Kevin, I was under the impression that you were a somewhat balanced observer of such things.
I didn’t know you had a quota for hardware now.
My bad .. I’ll put you in the category of other cheerleaders.
Best regards …
Chuck
Hi ,
I agree with Chuck Hollis. Exadata is a very good product only if you really don’t know how to design a DW with Best Practices.
If you know the best disk formatting , operation system , database distribution and parallel processing (with or without smart scan filter techniques) and so on …. then you can do DW environment faster.
Obviously Exadata has all the best practices by default, and the price will be good.
But I can see no miracles either.
Hello Kevin,
With 10GigE Ethernet prices dropping and the CPU power of generic COTS (commercial off the shelf) hardware increasing, how does such customized hardware as Exadata keep up in the race?
I mean, are there plans to use generic COTS server boards to run the offload component in Exadata? That way this becomes an “all software solution” (RAID of course being done by RAID controller HBAs in these servers). Users can leverage the increasing computing power of current COTS servers at any time.
It is only a matter of time before standard COTS servers outrun such customized hardware, and there may not be a significant performance difference between Exadata and database servers directly connected to JBODs via RAID HBAs (i.e., no offload).
regards
Sudhir
I have been following the Exadata information mainly on this blog. Sure, as COTS gets better and cheaper it will be better and better. However, my takeaway from this blog has been that unless you really understand the underlying nuts and bolts of the system you can end up with horribly unbalanced systems. In the future it may cost you less to create such systems, but one bottleneck can really throttle down the whole system. (This is basic queuing theory.)
Yes, current technology eventually gets pushed down to COTS, but the leading edge keeps moving on. The scale keeps getting bigger and bigger. When I first used computers it was a PDP-8i with 32K of memory for a 5 user system and had a huge 1 meg Winchester disk. Now for a couple of hundred dollars I can buy a 1 Terabyte SAN for my home network. It is all a matter of scale.
Thanks for the quick response Jim.
As a user and a buyer, I will feel awful if I am forced to buy some hardware that I know is outdated or not powerful enough, just because the manufacturer (Oracle) lagged behind in hardware design…is this a case where Oracle is unknowingly encouraging the competition (read Microsoft) to come up with such a distributed, node-based, software-only solution which is hardware agnostic?…as a user I will definitely want to go that way, rather than stay with such “custom hardware”.
regards
Sudhir
Good Heavens. This comment thread requires another new post. Too much here to polish in comments.
Kevin,
let me give it a try:
First to Chuck: Of course Kevin worked on this technology, and hopes it will be as widely used as possible. But coming from someone who got your position at EMC, your comment is very strange, as if you don’t have any interest in selling EMC product and positioning them as better than Exadata, which is not a competing offer in all your market: it’s just for Oracle databases.
Second Cesar: You really underestimate the importance of Smart Scan (with all its aspects, plus I/O resource management, ….). It is a very big part of the picture and makes the whole thing much better.
Third Sudhir: The Exadata servers are COTS based, there is nothing in there that was designed or built especially for Oracle. It’s all standard parts. I don’t really understand your comment. The software is a very essential part of the system. It’s not that Oracle just cobbled together some disks with an IB connection to make it faster than FC ones. Sure, IB matters, but the brains behind it matter a lot also.
Best regards
Good try Salem.
Bottom line for me: Can I replace an Exadata node with a Dell/Intel server/motherboard, or do I still have to buy the node from Oracle?
Your statement “it is all standard parts”: does it apply in a general way to whatever happens in the hardware industry, i.e., even customized x86 boards have a “standard” Intel/AMD processor inside 🙂 !! To that extent, even Mac can say they are made of “standard COTS components”. I am sure you understand that is not the issue….or have I got it wrong?
regards
Sudhir
Sudhir,
The issue with your position is separating the hardware from the software. That is, taking the Exadata storage server software and running it on whatever you like. I’m not sure you can do that, for the time being, at least for support reasons. The Exadata server software requires a particular hardware (and OS) configuration, and you have to stick with that. Oracle may (or may not) allow it in the future.
The difference between Exadata and other DWH appliances is that many of them require specialized hardware: Netezza has their FPGA in the hardware, Teradata has their Bynet. You cannot replace a Netezza ‘cell’ with any other hardware you may find on the market, nor replace Bynet with another Ethernet technology.
rgds
Hi ,
Sudhir ,
The Smart Scan technology is not underestimated by me… I’m just saying that this technology is not new …
If you look at what the Teradata AMP has done since the 1970s, Smart Scan takes the same approach when it filters and sorts the “local data”.
Look… Exadata can be a very good setup for DW, but the concepts and its implementation do not have anything new.
Anyone who knows Teradata/NeoView/Netezza is able to design a high performance system as good as Exadata using Oracle/DB2!
That´s why i agree with Chuck.
César
Cesar,
The “locality” of the data is different between Exadata and Teradata. In the TD case, the storage system sends the data over FC to the compute server for filtering; in the ED case, it is done in the storage cell.
That said, every DB vendor can try something like ED, Netezza does the same thing, but the main problem there is that you don’t have any feature in the DB that would reduce the number of full table scans. That’s why they usually advertise their systems to cater for a small number of users, running queries that need FTS. A system that does only FTS cannot scale beyond a small number of users.
rgds
This is in response to Salem’s point about DWH appliances being proprietary. While I agree that such may have been the route some time back, when COTS servers were not as powerful as they are today, the compute environment has since improved significantly, allowing COTS devices to do the job. DatAllegro-v3 is probably one such example.
Thank you Salem ,
Then I can think of Exadata as a parser/executor outside the database… After that the rows are sent to the database itself.
What kind of operations (filter , sort , etc…) can Exadata Cell perform ?
Is Exadata Cell a “mini-me” of Oracle main RDBMS ?
Best Regards,
César
@César
Exadata is passed query information from the Optimizer (columns in the select list, and predicate filters) and filters out rows and columns before the data is sent back to the database grid. There is no RDBMS running on the Exadata Storage Server.
This and more is covered in the technical documentation and presentation. There is even an example.
exadata-technical-whitepaper.pdf
exadata-storage-technical-overview.pdf
Cesar,
As discussed in the different docs on ED on oracle.com: the cells do:
filtering, projection, bloom filtering (join),
the cells do also offload the block modification control for rman (that is, checking if the block should be taken when doing incremental backups), and the block initialization when creating datafiles.
The cells also implement I/O resource allocation plans, so you can use the same cells for different databases, while preventing anyone from taking too much resources, and thus impacting the other ones, performance wise.
The ED storage server software is by no means a scaled down Oracle Kernel; it is much smaller than that.
rgds
You folks have to be joking me. It took my friends Greg and Salem less than 100 words to get across to Cesar what my thousands of words have not?
Cesar wrote:
“If you know the best disk formatting , operation system , database distribution and parallel processing (with or without smart scan filter techniques) and so on …. then you can do DW environment faster.”
I let this comment pass through my moderation because it appeared as though Cesar may have had at least a cursory understanding of Smart Scan upon which to build his anti-position. He finished his argument with:
“What kind of operations (filter , sort , etc…) can Exadata Cell perform ?”
This statement is a clear indication that he did not read a single word of any of my blog posts or the Exadata product whitepaper. I’m glad to have visitors to my blog, but it is helpful when visitors actually read the posts.
I clearly need to do a better job of weeding out unqualified comments.
Kevin,
I am waiting for your comment on the DatAllegro-v3, which to me seems a more rational way of doing “DWH noding”
regards
Sudhir
Sudhir,
Have you read my DATAllegro posts?
No…I apologize. I did not know you had a post on that. I would have liked it better had you provided the link. I just googled it. Is it: https://kevinclosson.wordpress.com/2008/07/07/i-know-nothing-about-data-warehouse-appliances-and-now-so-wont-you-part-ii-datallegro-supercharges-fibre-channel-performance/?
I would like to review that before I respond.
thanks
regards
Sudhir
Hello Kevin,
I did go through your posts and some additional details on both products: Exadata and DATAllegro. Following are my observations. May I request you and similarly technically enlightened people to respond?
1. The key strength of Exadata seems to be the backend access to storage, which it does with an HP SA-P400 (really an LSI-Logic SAS controller with 8 SAS ports). DATAllegro is rather restricted by its “dual FC only” approach to backend storage. I personally don’t see any reason why DATAllegro needs to be restricted to FC, especially since the cable distances between the Dell compute nodes and the storage box could be within the permissible SAS cable distances. With minor reengineering, DATAllegro could put in an Adaptec SAS controller (which I believe is far superior to the LSI controller that HP has, and has more ports) and replace the CLARiiON with a SAS enclosure, and this could become a formidable combination.
2. I am rather surprised why Exadata wants to call itself a “hardware solution included”…I now see Salem’s point-thank you Salem!! I personally feel there is no real reason why it should be tied to the HP hardware and I say that for the following reasons.
a. The computing platform used is a general purpose server from HP: it is an HP ProLiant DL180 G5, which also runs Windows 🙂 !! This can probably be replaced by a COTS server board?
b. Backend storage in the Exadata is via the LSI SAS controller, on a PCIe slot. That can probably be replaced by an Adaptec or similar higher performing HBA. The storage enclosure too can be a COTS device.
c. Ditto for the Infiniband PCIe HBA
I personally feel that with some intelligent replacements, the Exadata can actually be made to perform better and the offering can be at a lower cost. I wonder if the “tied to hardware” approach is more to ensure serviceability and general hardware support from a reliable Hardware vendor like HP, than real technical/design reasoning.
Thank you for going through this rather long text…may I request responses?
Thank you
regards
sudhir.brahma@gmail.com