If you are interested in array-level data reduction services and how such technology interacts with Oracle Database application-level compression (such as the Advanced Compression Option), I offer the link below to an EMC Lab Report on this very topic.
To read the entire Lab Report please click the following link: Click Here.
The following is an excerpt from the Lab Report:
Executive Summary
The EMC XtremIO storage array offers powerful data reduction features. In addition to thin provisioning, XtremIO applies both deduplication and compression algorithms to blocks of data as they are ingested into the array. These features are always on and intrinsic to the array: there is no added licensing, and no tuning or configuration is involved where XtremIO data reduction is concerned.
Oracle Database also supports compression. The most common form of Oracle Database compression is the Advanced Compression Option, commonly referred to as ACO. With Oracle Database, most “options” are separately licensed features, and ACO is one such option. As of the publication date of this Lab Report, ACO is licensed at $11,000 per processor core on the database host. Compressing Oracle Database blocks with ACO can offer benefits beyond simple storage savings. Blocks compressed with ACO remain compressed as they pass through the database host; in short, blocks compressed with ACO hold more rows of data per block. This can be either a blessing or a curse. Allowing Oracle to store more rows per block has the positive benefit of caching more application data in main memory (i.e., the Oracle SGA buffer pool). On the other hand, compacting more data into each block often results in increased block contention.
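For readers who have not used ACO, the following is a minimal sketch of how Advanced Row Compression is typically enabled. The table name is hypothetical, the clause shown is the 12c syntax, and on 11g the equivalent clause is COMPRESS FOR OLTP:

-- Hypothetical sketch: create a table with ACO Advanced Row Compression (12c syntax).
CREATE TABLE orders_aco (
  order_id   NUMBER,
  order_date DATE,
  status     VARCHAR2(1),
  amount     NUMBER(12,2)
)
ROW STORE COMPRESS ADVANCED;

-- An existing table can be altered, but only blocks written after the ALTER are
-- compressed; a MOVE rewrites (and compresses) the blocks that already exist.
ALTER TABLE orders_aco ROW STORE COMPRESS ADVANCED;
ALTER TABLE orders_aco MOVE ROW STORE COMPRESS ADVANCED;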
Oracle offers tuning advice to address this contention in My Oracle Support note 1223705.1. However, the tuning recommendations for reducing block contention with ACO also lower the compression ratios. Oracle also warns users to expect higher CPU overhead with ACO, as per the following statement in the Oracle Database product documentation:
Compression technology uses CPU. Ensure that you have enough available CPU to handle the additional load.
Application vendors, such as SAP, also produce literature to further assist database administrators in making sensible choices about how and when to employ the Advanced Compression Option. The importance of understanding the possible performance impact of ACO is made quite clear in publications such as SAP Note 1436352, which states the following about SAP performance with ACO:
Overall system throughput is not negatively impacted and may improve. Should you experience very long runtimes (i.e. 5-10 times slower) for certain operations (like mass inserts in BW PSA or ODS tables/partitions) then you should set the event 10447 level 50 in the spfile/init.ora. This will reduce the overhead for insertion into compressed tables/partitions.
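For reference, events of this kind are conventionally set with Oracle's standard event syntax, roughly as sketched below; this is only a hedged sketch, and the exact level and applicability should be taken from the SAP note itself:

-- Hedged sketch: set event 10447 at level 50 in an spfile-based instance
-- (EVENT is a static parameter, so it takes effect after an instance restart).
ALTER SYSTEM SET EVENT='10447 trace name context forever, level 50' SCOPE=SPFILE;

-- Or, in a pfile/init.ora:
-- event="10447 trace name context forever, level 50"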
The SAP note offers further words of caution regarding transaction logging (a.k.a., redo) in the following quote:
Amount of redo data generated can be up to 30% higher
Oracle Database Administrators with prior ACO experience are largely aware of the trade-offs where ACO is concerned. Database Administrators who have customarily used ACO in their Oracle Database deployments may wish to continue to use ACO after adopting EMC XtremIO. For this reason, Database Administrators are interested in learning how XtremIO compression and the Advanced Compression Option interact.
This Lab Report offers an analysis of space savings with and without ACO on XtremIO. In addition, it covers a performance characterization of an OLTP workload manipulating the same application data in ACO and non-ACO tablespaces… please click the link above to continue reading…
Hi Kevin
I have a few queries related to the tests performed in your nice and clear WP above on compression with XtremIO and the Oracle ACO feature.
1. Related to the test data setup: any idea how long it took, in your controlled lab setup with the XtremIO apparatus, to load the 80 billion rows (i.e., 9.42 TB) of TPC-H lineitem data?
I assume DBGen, which comes with TPC-H, is a C-based utility that generates data which would then be loaded into the customized single table of your setup. How was the sorting during loading performed, as mentioned in the doc/WP?
2. Were any tests performed to see the compression benefits of an Oracle partitioned IOT with key compression versus the out-of-the-box XtremIO savings?
Thanks again for your WP.
For query 1 above, I googled a bit about the loading part and found that DBGen simply generates flat data files, so I guess external-table-style or SQL*Loader-style loading would be used afterwards. However, I am still interested in the time taken to do the data loading and whether it was around 200 MB/s or higher. Thanks.
Hello Kishore,
Sorry for my late reply. The method I used to load that 9.42 TB was to produce batches of data from DBGEN, to which I added a unique number column as the first ordinal column so as to function with the SLOB framework (documented in Appendix A). I then loaded each batch with INSERT /*+ APPEND */ (direct-path insert as select, or IAS) with an ORDER BY clause on SHIPDATE, as per the Analyzing Space Savings section of the paper.
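A minimal sketch of that kind of load step follows. The object names are hypothetical, and the staged DBGEN flat files are assumed to be presented through an external table that already carries the added unique-number column as its first column:

-- Hedged sketch of one batch: direct-path insert as select, ordered on ship date.
INSERT /*+ APPEND */ INTO lineitem_slob
SELECT *
FROM   lineitem_batch_ext
ORDER  BY l_shipdate;

-- Direct-path loaded rows are not visible, even to the loading session, until commit.
COMMIT;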
The real answer to your question is not available in this Case Study because the data loading phase was not a part of the study. After all, nobody includes an ORDER BY in their data loading proof points, given that loading even without an ORDER BY clause is already a CPU problem, not a storage problem.
Finally, you cite 200 MB/s as a rate of interest. I assure you that 200 MB/s is not a challenge for any 2S Xeon host attached to EMC XtremIO… or the majority of storage on the market, for that matter. 200 MB/s is simply not a difficult bulk loading goal to achieve. I can refer you to the following blog post, which shows 600 GB/h data loading. I’ll point out that SLOB 2.3 (not yet released) is showing 1 TB/h not just for data loading, but 1 TB/h with the data loaded, indexed, and CBO statistics gathered.
https://kevinclosson.net/2014/08/06/slob-data-loading-case-studies-part-i-a-simple-concurrent-parallel-example/
Never forget…data loading is a CPU problem…rarely a storage problem.
Hi Kevin,
I have a few questions:
1. On page 3 of the following link, it is stated that, due to the way Oracle database blocks are structured, deduplication will not help with providing compression for a single database, and that it will help in situations where multiple databases were created from, say, the same copy of production:
Click to access h16753-dell-emc-xtremio-x2-oracle-db-compression-wp.pdf
In this type of scenario, considering that XtremIO is an expensive array, wouldn’t it make more sense to use something like Delphix for non-production and use XtremIO for production and its supporting performance testing environments?
2. XtremIO X2 uses an 8k block size. Is this the physical size of the flash block?
Oracle datafiles do not really deduplicate, but I don’t know that I can pivot to recommending one storage approach or the other based on that fact.
There is another post you might find helpful in understanding datafile dedup: https://kevinclosson.net/2016/11/09/yes-storage-arrays-can-deduplicate-oracle-database-here-is-exactly-why-it-doesnt-matter/