I’ve never met Mike Ault, but some friends of mine, who are fellow OakTable Network members, say he’s a great guy and I believe them. Mike works at Texas Memory Systems and I know some of those guys as well (Hi Woody, Jamon). Pleasantries aside, I have to call out some of the content Mike posted on a recent blog entry about HP Oracle Database Machine and Exadata Storage Server. Just because I blog about someone else’s posted information doesn’t mean I’m “out to get them.” Mike’s post made it clear I need to address a few things. Mike’s post was not vicious anti-Technical Marketing by any means, but it was riddled with inaccuracies that deserve correction.
Errata
While the first two errata I point out may seem trivial to many readers, I think accuracy is important if one intends to contrast one technology offering against another. After all, you won’t find me posting blog entries claiming Texas Memory Systems SSD is based on core memory or green Jell-O.
The first error I need to point out is that Mike refers to Oracle Exadata Storage Server cells as “block” or “blocks” eight times in his post. The correct term is cell.
The second error I need to point out is rooted in the following quote from Mike’s blog entry:
These new storage and database devices offer up to 168 terabytes of raw storage with 368 gigabytes of caching and 64 main CPUs
That is partially true. With the SATA option, the HP Oracle Database Machine does offer 168TB of gross disk capacity. The error is in the “368 gigabytes of caching” bit. The HP Oracle Database Machine does indeed come with 8 Real Application Clusters hosts in the Database grid configured with 32GB RAM each and 14 Exadata Storage Server cells with 8GB each. However, it is entirely erroneous to suggest that the entirety of physical memory across both the Database grid and the Storage grid somehow works in unison as “cache.” It’s not that the gross 368GB (8×32 + 14×8) isn’t usable as cache. It’s more the fact that none of it is used as user-data cache, at least not cache that somehow helps out with DW/BI workloads. The notion that it makes sense to put 368GB of cache in front of, say, a 10TB table scan and somehow boost DW/BI query performance is the madness that Exadata aims to put to rest. Here’s a rule:
If you can’t cache the entirety of a dataset you are scanning, don’t cache at all.
– Kevin Closson
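For the record, the 368GB figure from the errata above is simple arithmetic over the host counts and per-host RAM just described:

```python
# Aggregate physical memory across the HP Oracle Database Machine as
# described above: 8 database hosts at 32GB each, 14 Exadata cells at 8GB.
db_hosts, db_ram_gb = 8, 32
cells, cell_ram_gb = 14, 8

total_gb = db_hosts * db_ram_gb + cells * cell_ram_gb
print(total_gb)  # 368 -- the quoted figure, which is NOT one unified cache
```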
Cache, Gas or a Full Glass. Nobody Rides for Free.
No, we don’t use the 8x32GB physical memory in the Database grid as cache because cycling, say, the results of a 2TB table scan through 368GB aggregate cache would do nothing but impede performance. Caching costs, and if there are no cache hits there is no benefit. Anyone who claims to know Oracle would know that parallel query table scans do not pollute the shared cache of Oracle Database. A more imaginative, and correct, use for the 32GB RAM in each of the hosts in the Database grid would be for sorting, joins (hash, etc) and other such uses. Of course you don’t get the entire 32GB anyway as there is an OS and other overhead on the server. But what about the 8GB RAM on each Oracle Exadata Storage cell?
One of the main value propositions of Oracle Exadata Storage Server is the fact that lower-half query functionality has been offloaded to the cells (e.g., filtering, column projection, etc). Now, consider the fact that we can scan disks in the SAS-based Exadata Storage Server at the rate of 1GB/s. We attack the drives with 1MB physical reads and buffer the read results in a shared cache visible to all threads in the Oracle Exadata Storage Server software. To achieve 1GB/s with 1MB I/O requests requires 1000 physical I/Os per second. OK, now I’m sure all the fully-cached-conventional-array guys are going to point out that 1000 IOPS isn’t worth talking about, and I’d agree. Forget for the moment that 1GB/s is in fact very close to the limit of data transfer many mid-range storage arrays have to offer. No, I’m not trying to get you excited about the 1GB/s because if that isn’t enough you can add more. What I’m pointing out is the fact that the results of those 1000 IOPS (each 1MB in size) must be buffered somewhere while the worker threads rip through the data blocks applying filtration and plucking out the referenced columns. With 8 processor cores per cell, that’s 125 1MB filtration and projection operations per second per core. There is a lot going on and we need ample buffering space to do the offload processing.
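The per-cell arithmetic above can be sketched in a few lines (using the ~1GB/s scan rate and the 8 cores per cell implied by 112 storage-grid cores across 14 cells):

```python
# Back-of-envelope for one SAS-based Exadata cell, per the figures above.
scan_rate_mb_s = 1000      # ~1GB/s scan rate per cell
io_size_mb = 1             # 1MB physical reads
cores_per_cell = 8         # 112 storage-grid cores / 14 cells

iops = scan_rate_mb_s / io_size_mb     # 1000 physical I/Os per second
ops_per_core = iops / cores_per_cell   # 125 1MB filtration/projection ops
print(iops, ops_per_core)              # per second, per core
```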
Mike then moved on to make the following statement:
The Oracle Database Machine was actually designed for large data warehouses but Larry assured us we could use it for OLTP applications as well. Performance improvements of 10X to 50X if you move your application to the Database Machine are promised.
I’m not going to write guarantees, but no matter; that statement only led into the following:
This dramatic improvement over existing data warehouse systems is provided through placing an Oracle provided parallel processing engine on each Exadata building block so instead of passing data blocks, results are returned. How the latency of the drives is being defeated wasn’t fully explained.
Exadata Storage Server Software == Oracle Parallel Query
Folks, the Storage Server software running in the Exadata Storage Server cell is indeed parallel, threaded software; however, it is not entirely correct to state that there is a “parallel processing engine” that returns “results” from Exadata cells. More correctly, we offload scans (a.k.a. Smart Scan) to Exadata cells. Smart Scan technology embodies I/O, filtration, column projection and rudimentary joins. Insinuating otherwise makes Exadata out to be more of a database engine than intelligent storage, and there is more than a subtle difference between the two concepts. So, no, “results” aren’t returned; filtered rows and projected columns are. That is not a nit-pick.
DW/BI and I/O Latency
Mike finished that paragraph with the comment about how Oracle Exadata Storage Server “defeats” (or doesn’t) drive latency. I’ll simply point out that drive latency is not an issue with DW/BI workloads. The problem (addressed by Exadata) is the fact that attaching just a few modern hard drives to a conventional storage array leaves you with a throughput bottleneck. Exadata doesn’t do anything for drive latency because, shucks, the disks are still round, brown spinning thingies. Exadata does, however, make a balanced offering that doesn’t bottleneck the drives.
Mike continued with the following observation:
So in a full configuration you are on the tab for a 64 CPU Oracle and RAC license and 112 Oracle parallel query licenses
Yes, there are 64 processor cores in the Database grid component of the HP Oracle Database Machine, but Mike mentioning the 112 processor cores in the Exadata Storage Server grid is clearly indicative of the rampant misconception that Exadata Storage Server software is either some, most, or all of an Oracle Parallel Query instance. People who have not done their reading quickly jump to this conclusion and it is entirely false. So, mentioning the 112 Exadata Storage Server grid processors and “Oracle parallel query licenses” in the same breath is simple ignorance.
Mike continues with the following assertion:
Targeting the product to OLTP environments is just sloppy marketing as the system will not offer the latency needed in real OLTP transaction intensive shops.
While Larry Ellison and other important people have stated that Exadata fits in OLTP environments as well as DW/BI, I wouldn’t say it has been marketed that way, and certainly not sloppily. Until you folks see our specific OLTP numbers and value propositions I wouldn’t set out to craft any positioning pieces. Let me just say the following about OLTP.
OLTP Needs Huge Storage Cache, Right?
OLTP is I/O latency sensitive, but mostly for writes. Oracle offers a primary cache in the Oracle System Global Area (SGA) disk buffer cache. Applications generally don’t miss SGA blocks and immediately re-read them at a rate that requires sub-millisecond service times. Hot blocks don’t age out of the cache. Oracle SGA cache misses generally access wildly random locations, or result in scanning disk. So, for storage cache to offer read benefit it must cover a reasonable amount of the wildly, randomly accessed blocks. The SGA and intelligent storage arrays share a common characteristic: the same access patterns that blow out the SGA cache also blow out storage array cache. After all, architecturally speaking, the storage array cache serves as a second-level cache behind the SGA. If it is the same size as the SGA it is pretty worthless. If it is, say, 10 times the size of the SGA but only 1/50th the size of the database it is also pretty useless, with the exception of those situations when people use storage array cache to make up for the fact that they are using, say, 1/10th the number of drives they actually need. Under-provisioning spindles is not good, but that is an entirely different topic.
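A crude model shows why a second-level cache covering only a sliver of the database buys little: under uniformly random access (a worst case, but close to the “wildly random” SGA misses described above), the array-cache hit rate is roughly bounded by the fraction of the database it holds. The sizes below are purely illustrative:

```python
# Illustrative second-level (storage array) cache model. Blocks that stay
# hot tend to live in the SGA already, so the array cache mostly sees the
# random remainder; its hit rate is bounded by the database fraction it holds.
db_size_gb = 16_000        # example: a 16TB database
sga_gb = 32                # first-level (SGA) cache
array_cache_gb = 320       # "10 times the SGA", yet ~1/50th of the database

hit_rate = array_cache_gb / (db_size_gb - sga_gb)
print(f"{hit_rate:.1%}")   # 2.0% -- pretty useless, as argued above
```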
I know there are SAN array caches in the terabyte range and Mike speaks of multi-terabyte FLASH SSD disk farms. I suppose these are options, for a very select few.
Most Oracle OLTP deployments will do just fine running against non-bottlenecked storage with a reasonable amount of write-cache. Putting aside the idea of an entirely FLASH SSD deployment for a moment, the argument about storage cache helping OLTP boils down to what percentage of the SGA cache misses can be satisfied in the storage array cache and what overall performance increase that yields.
The Eye of a Needle
Recently, I was looking at the specification sheet for a freshly released mid-range Fibre Channel SAN storage array that supports up to 960 disk drives plumbed through a two-headed controller. The specification sheet shows a maximum of 16GB cache per storage processor (up to two of them). I should think the cache is mirrored to accommodate storage processor failure; maybe it isn’t, I don’t know. If it is mirrored, let’s pretend for a moment that mirroring storage processor cache is free even with modify-intensive workloads (subliminal man says it isn’t). Given this example, I have to ask who thinks 16GB of storage array cache in front of hundreds of drives offers any performance increase? It doesn’t, so let’s put to rest the OLTP storage cache benefit argument.
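To make the point concrete, here is the cache-per-drive arithmetic for that example array (assuming, as supposed above, that the cache is mirrored across the two storage processors):

```python
# Mid-range array example from above: two storage processors, 16GB cache
# each, in front of up to 960 drives.
storage_processors = 2
cache_per_sp_gb = 16
drives = 960

total_cache_gb = storage_processors * cache_per_sp_gb  # 32GB raw
effective_gb = total_cache_gb / 2                      # mirrored -> 16GB unique
mb_per_drive = effective_gb * 1024 / drives
print(round(mb_per_drive, 1))  # ~17MB of cache per drive
```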
But Mike Wasn’t Talking About Storage Array Cache
Right, Mike wasn’t talking about storage array cache benefit in an OLTP environment, but he was talking about nosebleed IOP rates from FLASH SSD. When referring to Exadata, Mike stated (quote):
What might be an alternative? Well, how about keeping your existing hardware, keep your existing licenses, and just purchase solid state disks to supplement your existing technology stack? For that same amount of money you will shortly be able to get the same usable capacity of Texas Memory Systems RamSan devices. By my estimates that will give you 600,000 IOPS, 9 GB/sec bandwidth (using Fibre Channel, more with Infiniband), 48 terabytes of non-volatile flash storage, 384 GB of DDR cache and a speed up of 10-50X depending on the query (based on tests against the TPCH data set using disks and the equivalent RamSan SSD configuration).
OK, there is a lot to dissect in that paragraph. First there is the attractive sounding 600,000 IOPS with sub-millisecond response time. But wait, Mike suggests keeping your existing hardware. Folks, if you have existing hardware that is capable of driving OLTP I/O at the rate of 600,000 IOPS I want to shake your hand. Oracle OLTP doesn’t just issue I/O. It performs transactions that hammer the SGA cache and suffer some cache misses (logical to physical I/O ratio). The CPU cost wrapped around the physical I/O is not trivial. Indeed, the idea is to drive up CPU utilization and reduce physical I/O through schema design and proper SGA caching. Those of you who are current Oracle practitioners are invited to analyze your current production OLTP workload and assess the CPU utilization associated with your demonstrated physical I/O rate. If you have an OLTP workload that is doing more than, say, 5000 IOPS (physical) per processor core and you are not 100% processor-bound, tell us about it.
Yes, there are tricked out transactional benchmarks that shave off real-world features and code path and hammer out as much as 10,000 IOPS per processor core (on very powerful CPUs), but that is not your workload, or anyone else’s workload that reads this blog. So, if real OLTP saturates CPU at, say, 5000 IOPS I have to wonder what your “existing hardware” would look like if it were also able to take advantage of 600,000 IOPS. That would be a very formidable Database grid with something like 120 CPUs. Remember, Mike was talking about using existing hardware to take advantage of SSD instead of Exadata. If you have a 120 CPU Database grid, I suspect it is so critical that you wouldn’t be migrating it to anything. It is simply too critical to mess with. I should hope. Oh, it’s actually more like 2000 IOPS per processor core in real life anyway, but that doesn’t change the point much. And Exadata isn’t really about OLTP.
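The CPU sizing above falls straight out of dividing the advertised IOPS by the per-core rates just discussed:

```python
# Database-grid cores needed to consume 600,000 OLTP IOPS at the per-core
# rates discussed above (benchmark-special, generous, and real-life).
target_iops = 600_000
for iops_per_core in (10_000, 5_000, 2_000):
    cores = target_iops // iops_per_core
    print(iops_per_core, "IOPS/core ->", cores, "cores")
# at 5,000 IOPS/core that's the ~120 CPUs above; at ~2,000 it's 300
```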
Let’s focus more intently on Mike’s supposition that an alternative to Exadata is “keeping your existing hardware” and feeding it 9GB/s from SSD. OK, first, that is 36% less I/O bandwidth than a single HP Oracle Database Machine can do, but let’s think about this for a moment. The Fibre Channel plumbing required for the Database grid to ingest 9GB/s is 23 active 4GFC FC HBAs at max theoretical throughput. That’s a lot of HBAs, and you need systems to plug them into. Remember, this is your “existing system.”
How much CPU does your “existing hardware” require to drive the 23 FC HBAs? Well, it takes a lot. Yes, I know you can use just a blip of CPU to mindlessly issue I/O in such a fashion as Orion or some pure I/O subsystem invigoration like dd if=/dev/sda of=/dev/null bs=1024k, but we are talking about DW/BI and Oracle. Oracle actually does stuff with the data returned from an I/O call. With non-Exadata storage, the CPU cost associated with I/O (e.g., issuing, reaping), filtration, projection, joining, sorting, aggregation, etc. is paid by the Database grid. So your “existing system” has to be powerful enough to do the entirety of SQL processing at a rate of 9GB/s. Let’s pretend for a moment that there existed on the market a 4-socket server that could accommodate 23 FC HBAs. Does anyone think for a moment that the 4 processors (perhaps 8 or 16 cores) can actually do anything reasonable with 9GB/s of I/O bandwidth? A general rule is to associate approximately 4 processor cores with each 4GFC HBA (purposefully ignoring trick benchmark configurations). It looks like “your existing system” needs about 96 processor cores.
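The plumbing-and-cores estimate above is two divisions (taking ~400MB/s as the max theoretical payload of one active 4GFC HBA, per the 23-HBA figure):

```python
# What "existing hardware" must look like to ingest 9GB/s over Fibre Channel.
target_mb_s = 9_000
hba_mb_s = 400            # ~max theoretical payload of one active 4GFC HBA

hbas = -(-target_mb_s // hba_mb_s)   # ceiling division -> 23 HBAs
cores = hbas * 4                     # ~4 cores per 4GFC HBA rule of thumb
print(hbas, cores)                   # 23 HBAs, ~92 (call it 96) cores
```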
A Chump Challenge
I’d put an HP Oracle Database Machine (64-core/14-cell) up against a 96-core/9GB/s FLASH SSD system any day of the week. I’d even give them 128 Database tier CPUs and not worry.
People keep forgetting that scans are offloaded to Exadata with the HP Oracle Database Machine. People shouldn’t craft their position pieces against Exadata by starting at the storage, regardless of the storage speeds and feeds.
It will always take more Database grid horsepower, in a non-Exadata environment, to drive the same scan rates offered by the HP Oracle Database Machine.
FLASH SSD
Did I mention that there is nothing (technically) preventing us from configuring Exadata Storage Server with 3.5″ FLASH SSD drives? I’ll leave that thought for another time.