Oracle Exadata Database Machine Handily Handles The Largest Database In Oracle IT. What Does That Really Mean?

In my recent post entitled Oracle Executives Underestimate SPARC SuperCluster I/O Capability–By More Than 90 Percent! I offered some critical thinking regarding the nonsensical performance claims attributed to the Sun SPARC SuperCluster T4 in one of the keynotes at Oracle Openworld 2011. In the comment thread of that post a reader asks:

All – has anyone measured the actual IOPS from disk as well as from flash in your Exadata (production) environment and compare with what the Oracle white paper or CXO presentations claimed?

That is a good question. It turns out the reader is in luck. There happens to be really interesting public information that can answer his question. According to this  article, Campbell Webb, Oracle’s VP of Product Development IT refers to Oracle’s Beehive email and collaboration database as “Oracle’s largest backend database.” Elsewhere in the article, the author writes:

Oracle’s largest in-house database is a 101-terabyte database running Beehive, the company’s in-house email and collaboration software, running on nine Oracle Exadata boxes.

As it turns out, Oracle’s Campbell Webb delivered a presentation on the Beehive system at Oracle Openworld 2011.  The presentation (PDF) can be found here. I’ll focus on some screenshots of that PDF to finish out this post.

According the following slide from the presentation we glean the following facts about Oracle’s Beehive database:

  • The Beehive database is approximately 93TB
  • Redo generation peaks at 20MB/s—although that is unclear if that is per instance or the aggregate of all instances of this Real Application Clusters Database
  • User SQL executions peaks at roughly 30,000 per second

The piece quotes Campbell Webb as stating the configuration is 9 racks of Exadata gear with 24 instances of the database—but “only” 16 are currently active. That is a lot of Oracle instances and, indeed, a lot of instances can drive a great deal of physical I/O. Simply put, a 9-rack Exadata system is gargantuan.

The following is a zoom-in photo of slide 12 from the presentation. It spells out that the configuration has the standard 14 Exadata Storage Servers per rack (126 / 14 == 9) and that the hosts are X2-2 models. In a standard configuration there would be 72 database hosts in a 9-rack X2-2 configuration but the article quotes Webb as stating 16 are active and there only 24 in total. More on that later.

With this much gear we should expect astounding database throughput statistics. That turns out to not be the case. The following slide shows:

  • 4,000,000 logical I/O per second at peak utilization. That’s 250,000 db block gets + db block consistent gets (cache-buffers chain walks) per second per active host (16 hosts). That’s a good rate of SGA buffer pool cache activity—but not a crushing load for 2S Westmere EP.
  • The physical read to write ratio is 88:12.
  • Multiblock physical I/Os are fulfilled by Exadata Storage Servers on average at 6 or less milliseconds
  • Single block reads are largely satisfied in Exadata Smart Flash Cache as is evidenced by the 1ms waits
  • Finally, database physical I/O peaks at 176,000 per second

176,000 IOPS
With 126 storage servers there is roughly 47TB of Exadata Smart Flash Cache. Considering the service times for single block reads there is clear evidence that the cache management is keeping the right data in the cache. That’s a good thing.

On the other hand, I see a cluster of 16 2U dual-socket Westmere-EP Real Application Clusters servers driving peak IOPS of 176,000. Someone please poke me with a stick because I’m bored to death—falling asleep. Nine racks of Exadata is capacity for 13,500,000 IOPS (read operations only of course). This database is driving 1% of that. 

Nine racks of Exadata should have 72 database hosts. I understand not racking them if you don’t need them, but the configuration is using less than 2 active hosts per rack—but, yes, there are 24 cabled (less than 3 per rack). Leaving out 48 X2-2 hosts is 96U—more than a full rack worth of aggregate wasted space. I don’t understand that.  The servers are likely in the racks—powered off. You, the Oracle customer, can’t do that because you aren’t Oracle Product Development IT. You’ll be looking at capex—or a custom Exadata configuration if you need 16 hosts fed by 126 cells.

Parting Thoughts
It is not difficult to configure a Real Application Clusters system capable of beating 16 2-socket Westmere EP servers, with their 176,000 IOPS demand, with far, far less than 9 racks of hardware.  It would be Oracle software just the same—just no Exadata bragging rights. And, once a modern, best-of-breed system  is happily steaming along its way hustling 176,000 IOPS, you could even call it an “Engineered System.” There’d be no Exadata bragging rights though. Just a good system handling a moderate workload. There is nothing about this workload that can’t be easily handled with conventional, best-of-breed storage. EMC technology with FAST quickly comes to mind.

Beehive is Oracle’s largest database and it runs on a huge Exadata configuration. Those two facts put together do not make any earth-shattering proof point when you study the numbers.

I don’t get it. Well, actually I do.

By the way, did I mention that 176,000 IOPS is not a heavy IOPS load–especially when only 12% of them are writes?

6 Responses to “Oracle Exadata Database Machine Handily Handles The Largest Database In Oracle IT. What Does That Really Mean?”

  1. 1 James Morle November 17, 2011 at 7:34 am

    Hey Kevin,

    Interesting analysis. I think you have an error (or inconsistency) in the last sentence: It should be 12% *writes* per the main text?

    So apart from the fact that the workload profile does not match the default size and shape of Exadata, I have a major architectural beef with people taking a naturally distributed application such as email and jerry-rigging them into a complex single system image. Is it a macho thing? Or just a good way to spend marketing dollars? It certainly isn’t intelligent architectural choice.

  2. 3 storagetuning November 17, 2011 at 8:11 pm

    Hi Kevin,

    Thanks for posting this. It always amazes me how the performance stats from an “amazingly” heavily accessed production database differ from what hardware vendors think of as heavily accessed.

    Jamon Bowen

    • 4 kevinclosson November 17, 2011 at 9:43 pm

      Hey Jamon…long time no speak… I don’t get out much these days (doggone day job) so I missed you at Openworld…

      Yeah, I thought you guys would like this post. Say “Hi” to Mike for me.

  3. 5 Mike Ault November 17, 2011 at 8:38 pm

    Using 5 small servers we can easily drive close to 1,000,000 IOPS through 1-5 TB RamSan630 serving at far less than 1 ms latency. If we configured 20 of our new RS810 – 10 TB units (mirrored for redundancy) to give 100 TB of storage we could serve in excess of 3,000,000 IOPS per mirror at less than 500 microseconds latency and 40 GB/s of sustained throughput. Needless to say, the hardware and license costs would be greatly reduced and dare I say, performance would probably be better since they can’t use all the wonderful Exadata software for a mostly small IO load.

Leave a Reply

Please log in using one of these methods to post your comment: Logo

You are commenting using your account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s


I work for Amazon Web Services but all of the words on this blog are purely my own. Not a single word on this blog is to be mistaken as originating from any Amazon spokesperson. You are reading my words this webpage and, while I work at Amazon, all of the words on this webpage reflect my own opinions and findings. To put it another way, "I work at Amazon, but this is my own opinion." To conclude, this is not an official Amazon information outlet. There are no words on this blog that should be mistaken as official Amazon messaging. Every single character of text on this blog originates in my head and I am not an Amazon spokesperson.

Enter your email address to follow this blog and receive notifications of new posts by email.

Join 2,900 other followers

Oracle ACE Program Status

Click It

website metrics

Fond Memories


All content is © Kevin Closson and "Kevin Closson's Blog: Platforms, Databases, and Storage", 2006-2015. Unauthorized use and/or duplication of this material without express and written permission from this blog’s author and/or owner is strictly prohibited. Excerpts and links may be used, provided that full and clear credit is given to Kevin Closson and Kevin Closson's Blog: Platforms, Databases, and Storage with appropriate and specific direction to the original content.

%d bloggers like this: