Lab Report: Oracle Database on EMC XtremIO. A Compression Technology Case Study.

If you are interested in array-level data reduction services and how such technology mixes with Oracle Database application-level compression (such as Advanced Compression Option), I offer the link below to an EMC Lab Report on this very topic.

To read the entire Lab Report please click the following link:   Click Here.

The following is an excerpt from the Lab Report:

Executive Summary
The EMC XtremIO storage array offers powerful data reduction features. In addition to thin provisioning, XtremIO applies both deduplication and compression algorithms to blocks of data as they are ingested into the array. These features are always on and intrinsic to the array; XtremIO data reduction involves no added licensing, tuning, or configuration.

Oracle Database also supports compression. The most common form of Oracle Database compression is the Advanced Compression Option—commonly referred to as ACO. With Oracle Database most “options” are separately licensed features, and ACO is one such option. As of the publication date of this Lab Report, ACO is licensed at $11,000 per processor core on the database host. Compressing Oracle Database blocks with ACO can offer benefits beyond simple storage savings. Blocks compressed with ACO remain compressed as they pass through the database host. In short, blocks compressed with ACO will hold more rows of data per block. This can be either a blessing or a curse. Allowing Oracle to store more rows per block has the positive benefit of caching more application data in main memory (i.e., the Oracle SGA buffer pool). On the other hand, compacting more data into each block often results in increased block contention.
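As a hedged illustration of what enabling ACO looks like (the table and column names below are hypothetical; the clause is COMPRESS FOR OLTP in Oracle Database 11g and ROW STORE COMPRESS ADVANCED in 12c and later):

```sql
-- Hypothetical example: a table created with ACO row compression.
-- Syntax varies by release: 11g uses COMPRESS FOR OLTP,
-- 12c and later use ROW STORE COMPRESS ADVANCED.
CREATE TABLE orders_aco (
  order_id   NUMBER      NOT NULL,
  order_date DATE        NOT NULL,
  status     VARCHAR2(1)
)
ROW STORE COMPRESS ADVANCED;

-- Confirm the compression attributes from the data dictionary:
SELECT table_name, compression, compress_for
  FROM user_tables
 WHERE table_name = 'ORDERS_ACO';
```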

Oracle offers tuning advice to address this contention in My Oracle Support note 1223705.1. However, the tuning recommendations for reducing block contention with ACO also lower the compression ratios. Oracle also warns users to expect higher CPU overhead with ACO, as per the following statement in the Oracle Database product documentation:

Compression technology uses CPU. Ensure that you have enough available CPU to handle the additional load.
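The specific recommendations of the My Oracle Support note are not reproduced here, but the usual levers for reducing block contention are leaving more free space per block and pre-allocating more transaction slots. A hedged sketch follows (table name hypothetical); note how a higher PCTFREE means fewer rows per block, which is precisely why such tuning lowers the compression ratio:

```sql
-- Hypothetical illustration of contention-reducing attributes on an
-- ACO-compressed table. PCTFREE 20 reserves more free space per block
-- (fewer rows per block, less contention, lower compression ratio);
-- INITRANS 8 pre-allocates ITL slots for concurrent transactions.
ALTER TABLE orders_aco MOVE
  PCTFREE  20
  INITRANS 8
  ROW STORE COMPRESS ADVANCED;
```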

Application vendors, such as SAP, also produce literature to further assist database administrators in making sensible choices about how and when to employ Advanced Compression Option. The importance of understanding the possible performance impact of ACO is made quite clear in such publications as SAP Note 1436352, which states the following about SAP performance with ACO:

Overall system throughput is not negatively impacted and may improve. Should you experience very long runtimes (i.e. 5-10 times slower) for certain operations (like mass inserts in BW PSA or ODS tables/partitions) then you should set the event 10447 level 50 in the spfile/init.ora. This will reduce the overhead for insertion into compressed tables/partitions.
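One way to set the event the SAP note refers to is sketched below; this is a hedged illustration only (verify the exact event string against the SAP note and your own support guidance before touching a production spfile):

```sql
-- Hypothetical sketch of setting event 10447 at level 50 in an
-- spfile-managed instance, per the SAP note quoted above.
ALTER SYSTEM SET EVENT = '10447 trace name context forever, level 50'
  SCOPE = SPFILE;
-- The event takes effect after an instance restart. In a pfile
-- (init.ora) the equivalent line would be:
--   event = "10447 trace name context forever, level 50"
```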

The SAP note offers further words of caution regarding transaction logging (a.k.a., redo) in the following quote:

Amount of redo data generated can be up to 30% higher

Oracle Database Administrators with prior ACO experience are largely aware of the trade-offs where ACO is concerned. Those who have customarily used ACO in their Oracle Database deployments may wish to continue to use it after adopting EMC XtremIO. For this reason Database Administrators are interested in learning how XtremIO compression and Advanced Compression Option interact.

This Lab Report offers an analysis of space savings with and without ACO on XtremIO. In addition, a performance characterization of an OLTP workload manipulating the same application data in ACO and non-ACO tablespaces will be covered…please click the link above to continue reading…


3 Responses to “Lab Report: Oracle Database on EMC XtremIO. A Compression Technology Case Study.”


  1. Kishore Kannan (@Kishore2Tweets) June 1, 2015 at 8:25 am

    Hi Kevin
    I have a few queries related to the tests performed in your nice and clear WP above on compression with XtremIO and the Oracle ACO feature.
    1. Related to the test data setup: any idea how long it took, in your controlled lab setup with the XtremIO apparatus, to load the 80 billion rows (i.e., 9.42 TB) of TPC-H lineitem data?
    I assume DBGen, which comes with TPC-H, is a C-based utility that generates the data to be loaded into the customized single table of your setup. How was the sorting during loading performed, as mentioned in the doc/WP?

    2. Were any tests performed to see the compression benefits of an Oracle partitioned IOT with key compression vs. the out-of-the-box XtremIO savings?

    thx again for your WP.

  2. Kishore Kannan (@Kishore2Tweets) June 1, 2015 at 10:36 pm

    For query 1 above I googled a bit for the loading part and found that DBGen simply generates flat data files, so I guess external-table-style or sqlldr-style loading would later be used. However, I am still interested to see the time taken for the data loading, whether it was around 200MB/s or higher. thx

    • kevinclosson June 3, 2015 at 2:37 pm

      Hello Kishore,

      Sorry for my late reply. The method I used to load that 9.42 TB was to produce batches of data from DBGEN to which I added a unique number column as the first ordinal column so as to function with the SLOB framework (documented in Appendix A). I then loaded each batch with INSERT /*+ APPEND */ (direct path IAS) with an ORDER BY clause on SHIPDATE as per the Analyzing Space Savings section of the paper.
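      A minimal sketch of that loading pattern, under the assumption of hypothetical staging and target table names (the paper's actual DDL is in its appendix, and the unique leading column was added while producing the DBGEN batches, before the load):

      ```sql
      -- Hypothetical sketch of one batch: direct-path insert-as-select
      -- (IAS) with an ORDER BY on SHIPDATE so that like values land in
      -- adjacent blocks, which favors compression.
      INSERT /*+ APPEND */ INTO lineitem_sorted
      SELECT *
        FROM lineitem_stage_batch
       ORDER BY l_shipdate;
      COMMIT;  -- a direct-path loaded segment is unusable until commit
      ```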

      The real answer to your question is not available in this Case Study, as the data loading phase was not part of the study. After all, nobody includes an ORDER BY in their data loading proof points, given that loading even without an ORDER BY clause is already a CPU problem, not a storage problem.

      Finally, you cite 200MB/s as a rate of interest. I assure you that 200MB/s is not a challenge for any 2S Xeon host attached to EMC XtremIO… or the majority of storage on the market, for that matter. 200MB/s is simply not a difficult bulk loading goal to achieve. I can refer you to the following blog post, which shows 600GB/h data loading. I’ll point out that SLOB 2.3 (not yet released) is showing 1TB/h not merely for data loading, but 1TB/h loaded, indexed, and with CBO statistics gathered.

      https://kevinclosson.net/2014/08/06/slob-data-loading-case-studies-part-i-a-simple-concurrent-parallel-example/

      Never forget…data loading is a CPU problem…rarely a storage problem.






DISCLAIMER

I work for Amazon Web Services, but all of the words on this blog are purely my own and reflect my own opinions and findings. Not a single word here should be mistaken as originating from any Amazon spokesperson: this is not an official Amazon information outlet, and nothing on this blog is official Amazon messaging.


Copyright

All content is © Kevin Closson and "Kevin Closson's Blog: Platforms, Databases, and Storage", 2006-2015. Unauthorized use and/or duplication of this material without express and written permission from this blog’s author and/or owner is strictly prohibited. Excerpts and links may be used, provided that full and clear credit is given to Kevin Closson and Kevin Closson's Blog: Platforms, Databases, and Storage with appropriate and specific direction to the original content.
