September 1, 2009 11:07 pm
I see fellow OakTable Network member Tanel Poder has blogged that Oracle Database 11g Release 2 has switched to offer columnar store technology. Or at least one could infer that from the post.
I left a comment on Tanel’s blog but would like to make a quick entry here on the topic as well. Oracle Database 11g Release 2 does not offer column-store technology as thought of in the Vertica (for example) sense. The technology, available only with Exadata storage, is called Hybrid Columnar Compression. The word hybrid is important.
Rows are still used. They are stored in an object called a Compression Unit. Compression Units can span multiple blocks. Like values are stored in the compression unit with metadata that maps back to the rows.
So, “hybrid” is the word. But, none of that matters as much as the effectiveness. This form of compression is extremely effective.
Posted by kevinclosson
Categories: oracle
Tags:
You must be logged in to post a comment.
Mobile Site | Full Site
Get a free blog at WordPress.com Theme: WordPress Mobile Edition by Alex King.
Kevin
Is the restriction of HCC to Exadata due to technical reasons or is it a marketing ploy? if it’s the latter, seems like a bit of a gamble to me…
Regards
Andrew
By Andrew Gregovich on September 2, 2009 at 4:24 am
It is technical for reasons.
By kevinclosson on September 2, 2009 at 4:31 am
Kevin,
the marketing guys at Oracle did exactly what you never liked in marketing guys (of other companies). They claim: DWH query performance is 10-fold with Exadata Storage Server and Database Machine. Full stop. 😉
You can see it on http://www.oracledatabase11g.com/Main/Questionnaire/EducationQuiz.html in the first question
Nevertheless, we all know that Oracle Exadata Storage Server is 485x Faster Than…Oracle Exadata Storage Server.
Regards
Dani
By Dani Rey on September 2, 2009 at 7:02 am
[…] https://kevinclosson.wordpress.com/2009/09/01/oracle-switches-to-columnar-store-technology-with-oracl… […]
By Oracle 11gR2 has been released – and with column oriented storage option | Tanel Poder's blog: Core IT for Geeks and Pros on September 2, 2009 at 11:48 am
Hi Kevin,
Thanks for the correction! However the new features overview doc is somewhat misleading in this case..
“Hybrid columnar compression is a new method for organizing how data is stored. Instead of storing the data in traditional rows, the data is grouped, ordered and stored one column at a time.”
By Tanel Poder on September 2, 2009 at 1:33 pm
The Exadata documentation is much more clear. I don’t think anything is misleading. It’s just not perfectly clear. I see no problem with that though because HCC is not generically available as it is specific to Exadata. To that end, the Exadata documentation is the best place to go into the topic deeply. That’s just my opinion.
By kevinclosson on September 2, 2009 at 3:09 pm
The “switch” to column storage is eluded to at the Structured Data bloag as well…
http://structureddata.org/2009/09/01/oracle-database-11g-release-2-is-now-generally-available/
By Darryl on September 2, 2009 at 3:32 pm
Never a war between Greg and I …too good friends … there will be terminology missteps
By kevinclosson on September 2, 2009 at 5:41 pm
@Darryl
Kevin and I are not in disagreement; it’s just my way of describing what is in the documentation (which I have quoted in my blog post).
By Greg Rahn on September 3, 2009 at 8:11 am
[…] Oracle’s Kevin Closson explains in a blog post: “The technology, available only with Exadata storage, is called Hybrid […]
By The future of the database is… plaid? — Too much information on September 2, 2009 at 5:16 pm
[…] 2 white paper says “data is grouped, ordered, and stored one column at a time.” But Kevin Closson clarifies: The word hybrid is […]
By Oracle Exadata Hybrid Columnar Compression | DBMS2 -- DataBase Management System Services on September 3, 2009 at 9:33 am
Gents,
I wasn’t inferring any disagreement. I was just pointing out this topic was mentioned on another blog.
Thanks
By Darryl on September 3, 2009 at 3:00 pm
Kevin,
I’m quite confused by your post. You say:
“Rows are still used. They are stored in an object called a Compression Unit. Compression Units can span multiple blocks. Like values are stored in the compression unit with metadata that maps back to the rows.”
What is “they” referring to in “they are stored in an object…”? Is it rows? If so, then what does it mean that “like values” are stored in the compression unit? “Like values” would seem to indicate values from the same column.
In short, it is unclear from your post if a compression unit contains multiple rows, but each column is stored separately inside the compression unit, or if it is just a bunch of data from one particular column.
Cheers,
Daniel Abadi
Assistant Professor, Yale University
By Daniel Abadi on September 3, 2009 at 3:10 pm
Hi Daniel,
The hybrid columnar compression feature is documented in the Exadata Storage Server product documentation. However, I’m not permitted to expose the contents of that documentation here. My main objective in following on to my friend Tanel Poder’s post was to ensure folks don’t run off with the idea that Oracle Database 11g Release 2 for Exadata is a columnar store database.
I can answer some of your questions safely though. A compression unit is a collection of data blocks. Multiple rows are stored in a compression unit. Columns are stored separately within the compression unit. Likeness amongst the column values within a compression unit yields the space savings. There are still rowids (that change when a row is updated by the way) and row locks. It is a hybrid.
So, in short, no a compression unit does not have values from only one column.
I apologize for the confusing information. There are times I find myself trying to make bricks without straw as it were.
By kevinclosson on September 3, 2009 at 4:13 pm
Thanks Kevin, that’s what I wanted to know.
I’m also curious if Exadata performs selection predicates directly on the HCC data, or if it is decompressed into regular tuples before the predicate is evaluated. However, if this is restricted information, I understand.
Cheers,
Daniel Abadi
By Daniel Abadi on September 3, 2009 at 4:24 pm
Exadata performs filtration and projection (Smart Scan) on compressed data in Oracle Database 11g Release 2 just as it did in 11gR1. With the columnar compression, Smart Scans send uncompressed data to the database grid though.
By kevinclosson on September 3, 2009 at 4:39 pm
Hello:
Just one clarification..
So with HCC, not only will we have compression (and grouping) within the blocks, also there will be compression units that will have another grouping of
data based on the *likeness* of the column values among blocks?
Can you please explain?
Thanks
Ganesh
By Ganesh on September 3, 2009 at 6:31 pm
Well, “we” of course means Exadata customers. I do understand the curiosity. Think of the blocks in a compression unit as buckets for column values. Highly compressible columns wont take up much space in the “buckets” but non-compressible columns will.
By kevinclosson on September 3, 2009 at 6:39 pm
Kevin,
Thanks for your responses. The information you shared was enough for me to take a stab at placing the Oracle scheme in my taxonomy of hybrid row/column storage schemes. Please see: http://dbmsmusings.blogspot.com/2009/09/tour-through-hybrid-columnrow-oriented.html
If you disagree with anything I said, please let me know.
Cheers,
Daniel Abadi
By Daniel Abadi on September 3, 2009 at 11:02 pm
Why have you compared with Vertica instead of Sybase IQ? As far as I know Sybase IQ is much mature and also there are many great references!
By Sebastián on September 4, 2009 at 4:23 pm
Sebastian,
It was a contrast, not a comparison and I’m not here to prop one or the other of those two companies. I’m not on either of their teams so I don’t care which of those win their piece of the pie.
By kevinclosson on September 4, 2009 at 4:37 pm
[…] it turns out that it’s only available if you use Exadata storage to hold your data files. According to Kevin Closson this is for for technical (as opposed to commercial) reasons, and it certainly enhances the appeal […]
By Thoughts on the New DW Features in Oracle Database 11gR2 | Oracle on September 9, 2009 at 7:12 am
[…] Oriented Storage, but is in fact simply a compression technique (see Kevin Clossan’s post: Oracle Switches To Columnar Store Technology ). It may be highly effective though as the compression possibilities provided by column oriented […]
By Kerry Osborne’s Oracle Blog » Blog Archive 11gR2 - My Top Ten List of Interesting New Features - Kerry Osborne’s Oracle Blog on October 22, 2009 at 12:05 am
Kevin,
What is the size of the compression unit ? Is it the size of a physical I/O typically 1 MB ?
Thanks
Rajan
By Rajan on January 21, 2010 at 12:08 am
Rajan,
CU size varies. It is either 32KB or 64KB depending on the type of compression and other factors.
By kevinclosson on January 21, 2010 at 4:03 pm
[…] Closson of Oracle graciously confessed in a blog post and comment thread, that Oracle Exadata is not much of a column […]
By Columnar compression vs. column storage | DBMS 2 : DataBase Management System Services on February 6, 2011 at 9:24 am
[…] Closson在一篇blog和其评论中坦率承认,Oracle […]
By 列压缩与列存储 on February 8, 2011 at 8:08 am
[…] fueron Sybase con su IQ. Parece que la tendencia se va iponiendo en entornos DW/BI y que incluso Oracle comienza a implementar esta tecnología. Hay que decir que este tipo de almacenamiento de la información está pensado para aumentar el […]
By Teradata Database 14 « CarlosAL on October 10, 2011 at 4:52 pm
[…] Oracle would be forced to buy Exadata if they wanted to use this feature. HCC was originally planned as standard Oracle 11g release 2 feature for many platforms but then decided to limit it to Exadata […]
By Exadata Hybrid Columnar Compression (HCC) for (storage) dummies « Dirty Cache on August 3, 2012 at 3:54 am
Column Store is getting very critical in DWH environments and its all about comressing data. Please have a look on the below article
[ link removed ]
By Gaurav Mittal on September 3, 2012 at 10:14 am
@Guarav : Columnar orientation is not “getting critical” it is critical. I’m in R&D with EMC’s Greenplum I do indeed know the value of columnar vis a vis compression efficacy. Note, I removed the link to your post. If you’d like feel free to send a trackback.
By kevinclosson on September 3, 2012 at 11:27 am
[…] together efficiently. Kevin Clossion explain the structure of CU in one of his blog posts (https://kevinclosson.wordpress.com/2009/09/01/oracle-switches-to-columnar-store-technology-with-oracl… ) as “A compression unit is a collection of data blocks. Multiple rows are stored in a […]
By Exadata Hybrid Columnar Compression | Saurabh K. Gupta's Oracle Blog on May 4, 2013 at 12:33 pm