BLOG UPDATE 24 SEP 2011: This blog entry has been viewed slightly more than 50 times per day, on average, since it was originally posted several months ago. At this point I’d like to update the post with these words to serve as a bit of a preface to the post itself. The final point made in this post offers a glimpse into one of the technical reasons I resigned my position as Performance Architect in Oracle’s Exadata development organization.
In my recent post entitled Exadata Database Machine: The Data Sheets Are Inaccurate! Part – I, I drew attention to the fact that there is increasing Exadata-related blog content produced by folks who know what they are talking about. I think that is a good thing, since it would be a disaster if I were the only one providing Exadata-related blog content.
The other day I saw Tanel Poder blogging about objects that are suitable targets for Smart Scan. Tanel has added bitmap indexes to his list. Allow me to quickly interject that the list of what can and cannot be scanned with Smart Scan is not proprietary information. There are DBA views in every running Oracle Database 11g Release 2 instance that can be queried to obtain this information. Tanel’s blog entry is no taboo.
So, while Tanel is correct, I think it is also good to simply point out that the seven core Exadata fundamentals do in fact cover this topic. I’ll quote the relevant fundamentals:
Full Scan or Index Fast Full Scan.
- The required access method chosen by the query optimizer in order to trigger a Smart Scan.
Direct Path Reads.
- The required buffering model for a Smart Scan. The flow of data from a Smart Scan cannot be buffered in the SGA buffer pool. Direct path reads can be performed for both serial and parallel queries. Direct path reads are buffered in process PGA (heap).
So, another way Tanel could have gone about it would have been to ask, rhetorically, why wouldn’t Exadata perform a Smart Scan on a bitmap index if the plan chooses a full-scan access method? The answer would be simple: no reason. It is an index, after all, and can be scanned with an index fast full scan. So why am I blogging about this?
Can I Add Index Organized Tables To That List?
In a recent email exchange, Tanel asked me why Smart Scan cannot attack an index organized table (IOT). Before I go into the outcome of that email exchange I’d like to revert to a fundamental aspect of Exadata that eludes a lot of folks. It’s about the manner in which data is stored in the Exadata Storage Servers and how that relates to offload processing such as Smart Scan.
Data stored in cells is striped by Automatic Storage Management (ASM) across the cells with coarse-grain striping (granularity established by the ASM allocation unit size). With Exadata, the allocation unit size by default—and best-practice—is 4MB. Therefore, tables and indexes are scattered in 4MB chunks across all the cells’ disks.
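To make the scattering concrete, here is a minimal sketch in Python. The 4MB allocation unit size comes from the text above; the round-robin placement, the disk names, and the `stripe` helper are illustrative assumptions of mine, not how ASM actually chooses disks.

```python
AU_SIZE = 4 * 1024 * 1024  # 4MB allocation unit, the Exadata default per the text


def stripe(segment_bytes, disks):
    """Assign each 4MB allocation unit of a segment to a disk.

    Round-robin placement is an assumption made for illustration only;
    ASM's real allocation policy is more involved.
    """
    au_count = -(-segment_bytes // AU_SIZE)  # ceiling division
    return [(au, disks[au % len(disks)]) for au in range(au_count)]


# Hypothetical grid: 3 cells with 2 disks each.
disks = [f"cell{c}_disk{d}" for c in range(3) for d in range(2)]

# A 40MB segment becomes ten 4MB chunks spread over all six disks.
layout = stripe(40 * 1024 * 1024, disks)
for au, disk in layout:
    print(au, disk)
```

The point of the sketch is only that no single cell ends up holding a whole table or index; each holds a scattering of 4MB pieces.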
Smart Scan performs multiple, asynchronous 1MB reads for each allocation unit (thus four asynchronous 1MB reads for the four adjacent 1MB storage regions of a 4MB allocation unit). As the I/O operations complete, Smart Scan performs predicate operations (filtration) upon each 1MB storage region. If the data contained in a 1MB region references another portion of the database (e.g., a chained row), Smart Scan cannot completely process that storage region. The blocks that reference indirect data are sent to the database grid in standard block form (the same form as when reading an ASM disk on conventional storage). The database server then chases the indirection because only it has the code to map the block-level indirection to an ASM AU in some cell, somewhere. Cells cannot ask other cells for data because cells don’t know anything about each other. The storage grid of Exadata is shared-nothing.
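The flow just described can be sketched as a toy model in Python. Everything here is hypothetical naming of mine (`Block`, `read_requests`, `scan_region`), and a block is reduced to a list of values plus a `chained` flag; real Oracle blocks and the real cell software look nothing like this. The sketch only shows the split between rows filtered in the cell and blocks handed back whole.

```python
from dataclasses import dataclass

MB = 1024 * 1024
AU_SIZE = 4 * MB    # allocation unit size from the text
READ_SIZE = 1 * MB  # Smart Scan read size from the text


@dataclass
class Block:
    rows: list             # row values stored entirely in this block
    chained: bool = False  # True if a row continues in some other block


def read_requests(au_offset):
    """Return the four (offset, length) 1MB reads that cover one 4MB AU."""
    return [(au_offset + i * READ_SIZE, READ_SIZE)
            for i in range(AU_SIZE // READ_SIZE)]


def scan_region(blocks, predicate):
    """Filter one storage region inside a cell.

    Returns (filtered_rows, passthrough_blocks). A block with indirection
    is not filtered; it goes back whole so the database server, which is
    the only party able to locate the rest of the row, can resolve it.
    """
    rows, passthrough = [], []
    for block in blocks:
        if block.chained:
            passthrough.append(block)
        else:
            rows.extend(r for r in block.rows if predicate(r))
    return rows, passthrough


region = [Block([1, 50, 99]), Block([7], chained=True), Block([120, 3])]
rows, raw_blocks = scan_region(region, lambda r: r > 40)
print(rows)        # [50, 99, 120]
print(raw_blocks)  # the chained block, returned unfiltered
```

Note that `scan_region` never reaches outside the blocks it was given, which mirrors the shared-nothing constraint: a cell has no way to fetch the other half of a chained row from a neighboring cell.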
Thus far, in this blog post, I’ve taken the recurring question of whether Smart Scan works on a certain type of object (in this case IOT) and broadened the discussion to focus on a fundamental aspect of Exadata. So what does this broadened scope have to do with Smart Scan on IOT? Well, when I read that email from Tanel I used logic based on the fundamentals and shot off an answer. Before that hasty reply to Tanel I recalled that IOT has the concept of an overflow tablespace. The concept of an overflow tablespace, in my mind, has “indirection” written all over it. Later I became more curious about IOT, so I scanned through the Oracle source code (server side) and couldn’t find any hard barriers against Smart Scan on IOT. I was stumped (trust me, that aspect of the code is not all that straightforward), so I asked the developers who own that specific part of the server. I found out my logic was faulty. I was wrong. It turns out that Smart Scan for IOT is simply not implemented. I’m not insinuating that means “not implemented yet” either. That isn’t the point of this blog entry. Neither is admitting I was wrong in my original answer to Tanel. There is more to this train of thought.
Will The List Of Smart Scan Compatible Objects Keep Growing And Growing?
Neither my confession about the hasty answer to Tanel nor the specifics of IOT Smart Scan support is the central point of this blog entry. So, just what is my agenda? Primarily, I wanted to remind folks about the fundamental aspect of Exadata regarding indirection and Smart Scan (e.g., chained rows) and, secondarily, I wanted to point out that the list of objects suitable for Smart Scan is limited for reasons other than feasibility. Time to market is important. I know that. If an object like IOT is not commonly used in the data warehousing use case, it is unnecessary work to implement support for Smart Scan. But therein lies the third, hidden agenda item for this post, which is to question our continual pondering over the list of objects that support Smart Scan.
Offload processing is a good thing. I wonder, is the goal to offload more and more? Some is good, so certainly more must be better in a scale-out solution. Could offload support grow to the point where Exadata nears a state of “total offload processing”? Would that be a bad thing? Well, “total offload processing” is, in fact, impossible since cells do not contain discrete segments of data but instead the scattering of data I wrote about above. However, more can be offloaded. The question is just how far that goes and what it means in architectural terms. Humor me for another moment in this “total offload processing” train of thought.
If, over time, “everything”—or even nearly “everything”—is offloaded to the Exadata Storage Servers there may be two problems. First, offloading more and more to the cells means the query-processing responsibility in the database grid is systematically reduced. What does that do to the architecture? Second, if the goal is to pursue offloading more and more, the eventual outcome gets dangerously close to “total offload processing.” But, is that really dangerous?
So let me ask: In this hypothetical state of “total offload processing” to Exadata Storage Servers (that do not share data by the way), isn’t the result a shared-nothing MPP? Some time back I asked myself that very question and the answer I came up with put in motion a series of events leading to a significant change in my professional career. I’ll blog about that as soon as I can.