AWS Database Blog – Added To My Blog Roll

This is just a brief blog post to share that I’ve added the AWS Database Blog to my blogroll.  I recommend you do the same! Let’s follow what’s going on over there.

Some of my favorite categories under the AWS Database Blog are:



Readers: I do intend to eventually get proper credentials to make some posts on that blog. All in proper time and with proper training and clearance.

SLOB Use Cases By Industry Vendors. Learn SLOB, Speak The Experts’ Language.

For general SLOB information, please visit:

List of Vendors Who Publish SLOB Testing Results

The vendors whose SLOB use cases are discussed in this blog post are (in no particular order):

  • VMware
  • A joint paper co-branded by Intel and Quanta Cloud Technologies
  • VCE
  • Nutanix
  • NetApp
  • Microsoft (Azure)
  • HPE
  • Pure Storage
  • Nimble
  • IBM
  • Red Hat
  • Dell EMC
  • Red Stack Tech.
  • Vexata
  • Datrium

Beyond vendors, I’ll show SLOB usage at as well.


This is just a quick blog entry to highlight a few publications in which IT vendors showcase SLOB. SLOB allows performance engineers to speak in short sentences. As I’ve pointed out before, SLOB is not used to test how well Oracle handles transactions. If you are worried that Oracle cannot handle transactions then you have bigger problems than what can be tested with SLOB. SLOB is how you test whether–or how well–a platform can satisfy SQL-driven database physical I/O.

SLOB testing is not at all like using a transactional test kit (e.g., TPC-C). Transactional test kits are, first and foremost, Oracle intrinsic code testing kits (the code of the server itself). Here again I say if you are questioning (testing) Oracle transaction layer code then something is really wrong. Sure, transactional kits involve physical I/O but the ratio of CPU utilization to physical I/O is generally not conducive to testing even mid-range modern storage without massive compute capability. This is why vendors and dutiful systems experts rely on SLOB.

Recent SLOB testing on top-bin Broadwell Xeons (E5-2699v4) shows that each core is able to drive over 50,000 physical read IOPS (db file sequential read). In contrast, 50,000 IOPS is about what one would expect from over a dozen such cores with a transactional test kit because the CPU is being used to execute Oracle intrinsic transaction code paths and, indeed, some sundry I/O.
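To put those two figures side by side, here is a minimal back-of-the-envelope sketch. The per-core numbers come from the paragraph above ("over a dozen" is taken as exactly 12, which is an assumption), and the 500,000 IOPS target is a hypothetical array rating chosen only for illustration.

```python
# Figures quoted in the post; "over a dozen" cores is simplified to 12 (assumption).
SLOB_IOPS_PER_CORE = 50_000
TRANSACTIONAL_IOPS = 50_000
TRANSACTIONAL_CORES = 12


def cores_needed(target_iops: float, iops_per_core: float) -> float:
    """Cores required to drive target_iops at a given per-core IOPS rate."""
    return target_iops / iops_per_core


# Per-core physical I/O rate when the CPU is mostly busy with transaction code.
transactional_iops_per_core = TRANSACTIONAL_IOPS / TRANSACTIONAL_CORES

# Hypothetical target: an array rated for 500,000 read IOPS.
target = 500_000
slob_cores = cores_needed(target, SLOB_IOPS_PER_CORE)
transactional_cores = cores_needed(target, transactional_iops_per_core)

print(f"SLOB-style workload:   ~{slob_cores:.0f} cores")
print(f"Transactional kit:     ~{transactional_cores:.0f} cores")
```

Under these assumptions the transactional kit needs roughly an order of magnitude more cores to present the same physical I/O demand, which is the point about "massive compute capability" made earlier.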

SLOB Use Cases By IT Vendors

The following are links and screenshots from various vendors showing some of their SLOB use cases. Generally speaking, if you are shopping for modern storage–optimized for Oracle Database–you should expect to see SLOB results in a vendor’s literature.


The first case I’d like to share is that of a solution built by FlashGrid. This solution is all about using AWS EC2 instances, along with FlashGrid technology and best practices for Real Application Clusters, in the AWS Cloud. I am not an expert on FlashGrid and am merely reporting their usage of SLOB as can be seen in the following paper and blog post:

I do recommend getting a copy of this paper!

FlashGrid Characterizing Real Application Clusters Performance with SLOB in the AWS Cloud (EC2 instances)


VMware showcasing VSAN with Oracle using SLOB at:


VMware Using SLOB to Assess VSAN Suitability for Oracle Database

VMware has an additional publication showing SLOB results at the following URL:

Intel and Quanta Cloud Technologies – a Co-Branded Whitepaper

The following is a link to a Principled Technologies publication. This whitepaper is co-branded by Intel and Quanta Cloud Technologies. The paper proves platform suitability of  VMware/Quanta Cloud Technologies and Intel processors for Oracle I/O intensive workloads with SLOB results:

Principled Technologies Co-Branded Whitepaper with Intel and QCT


The VCE Solution guide for consolidating databases includes proof points based on SLOB testing at the following link:


VCE Solution Guide Using SLOB Proof Points


Next is Nutanix with this publication:

Nutanix Using SLOB for Platform Suitability Testing

More SLOB proof points by Nutanix:


NetApp has a lot of articles showcasing SLOB results. The first is at the following link:


NetApp Testing FlexPod Select for High-Performance Oracle RAC with SLOB

NetApp AFF A800 Performance with Oracle RAC Database

In March 2019, NetApp published a great technical article (tr-4767.pdf) on the value of NVMeOF for Real Application Clusters. I recommend this article because NVMeOF is the emerging best of breed storage connectivity technology the industry has to offer.

Kudos to NetApp for sharing platform performance proof points with a freely available, understandable and believable Oracle Database I/O testing toolkit–SLOB. The paper can be downloaded here.


NetApp’s Memory Accelerated Data (MAX Data)

In December 2018, it was reported that NetApp’s Memory Accelerated Data (MAX Data) has been proven by SLOB to offer “dramatic” impact on database workloads. There is a paper available to download at the StorageReview site:

NetApp Testing MAX Data with SLOB

The following NetApp article entitled NetApp AFF8080 EX Performance and Server Consolidation with Oracle Database also features SLOB results and can be found here:

NetApp Testing the AFF8080 with SLOB

Yet another SLOB-related NetApp article entitled Oracle Performance Using NetApp Private Storage for SoftLayer can be found here:

NetApp Testing NetApp Private Storage for SoftLayer with SLOB

NetApp teamed with Enterprise Strategy Group to produce the report at the following link which shows proof points including SLOB: 

NetApp explains its Direct NFS value add using SLOB in this paper: Oracle Databases on ONTAPSelect

When searching the NetApp main webpage I find 11 articles that offer SLOB testing results:


Searching NetApp Website shows 11 SLOB-Related Articles

Microsoft Azure

Microsoft has posted testing results using SLOB on Azure compute with NetApp storage. This is a Direct NFS proof point and is very much worth a read.



Hewlett-Packard Enterprise offers an article entitled HPE Reference Architecture for Oracle 12c license savings with HPE 3PAR StoreServ All Flash and ProLiant DL380 Gen9. The article can be found at the following link:


HPE Using SLOB Proof Points

Pure Storage

In the Pure Storage article called Pure Storage Reference Architecture for Oracle Databases, the authors also show SLOB results. The article can be found here:


Pure Storage Featuring SLOB Results in Reference Architecture

Other Pure Storage publications with SLOB proof points:

Nimble Storage

Nimble Storage offers the following blog post with SLOB testing results:

Nimble Storage Blogging About Testing Their Array with SLOB


There is an IBM “8-bar logo” presentation showing SLOB results here:


IBM Material Showing SLOB Testing

I also find it interesting that folks contributing code to the Linux Kernel include SLOB results showing value of their contributions such as here:


Linux Kernel Contributors Use SLOB Testing of Their Submissions

Red Hat

Next we see Red Hat disclosing Live Migration capabilities that involve SLOB workloads:


Red Hat Showcasing Live Migration with SLOB Workload

Dell EMC

DellEMC has many publications showcasing SLOB results. This reference, however, merely suggests the best-practice of involving SLOB testing before going into production:


DellEMC Advocates Pre-Production Testing with SLOB

EMC using SLOB to characterize XtremIO array-level compression in combination with Oracle Advanced Compression Option:

EMC XtremIO Compression Testing with SLOB

EMC documenting XtremIO X2 best practices with SLOB testing:

EMC XtremIO X2 Best Practices Testing with SLOB

An example of a detailed DellEMC publication showing SLOB results is the article entitled VMAX ALL FLASH AND VMAX3 ISCSI DEPLOYMENT GUIDE FOR ORACLE DATABASES which can be found here:


EMC Testing VMAX3 All-FLASH with SLOB

Another usage of SLOB by DellEMC can be found at the following link: This paper is a partner effort with Principled Technologies and it showcases a VMAX 250F All-Flash Array performance characterization with SLOB.

Dell EMC Partnering with Principled Technologies: SLOB Testing with VMAX 250F All-Flash

I took a moment to search the main DellEMC website for articles containing the word SLOB and found 76 such articles!


Search for SLOB Material on DellEMC Main Web Page

Red Stack Tech

Red Stack Tech offer DBaaS and even showcase the ability to test the platform for I/O suitability with SLOB:


Red Stack Tech Offering SLOB Testing as Proof of Concept


Vexata and Lenovo teamed up to produce a fantastic SLOB proof point that showcases their VX-100F array as the following graphic shows:


Vexata / Lenovo 4-Node RAC Configuration

The report can be downloaded at the following link:

Vexata also commissioned ESG to conduct a performance assessment of their Vexata VX-100 Scalable Storage Systems. The results are available in the following paper:

Vexata have updated their literature to include a SLOB proof point of leveraging their all-flash storage via Oracle Database Smart Flash Cache in their paper entitled UTILIZE VEXATA WITH FLASH CACHE TO BOOST ORACLE DATABASE PERFORMANCE available here:

Here is a glimpse:


Datrium have posted SLOB testing results for their Datrium AllFlash suite.

Non-Vendor References

Although not a vendor, it deserves mention that Greg Schulz of Server StorageIO and UnlimitedIO LLC lists SLOB alongside other platform and I/O testing toolkits. Greg’s exhaustive list can be found here:



More and more people are using SLOB. If you are into Oracle Database platform performance I think you should join the club! Maybe you’ll even take interest in joining the Twitter SLOB list:

Get SLOB, use SLOB!

SLOB 2.3 Data Loading Failed? Here’s a Quick Diagnosis Tip.

The upcoming SLOB 2.4 release will bring improved data loading error handling. While still using SLOB 2.3, users can suffer data loading failures that may appear–on the surface–to be difficult to diagnose.

Before I continue, I should point out that the most common data loading failure with SLOB in pre-2.4 releases is the concurrent data loading phase suffering lack of sort space in TEMP. To that end, here is an example of a SLOB 2.3 data loading failure due to shortage of TEMP space. Please notice the grep command (in Figure 2 below) one should use to begin diagnosis of any SLOB data loading failure:


Figure 1

And now, the grep command:


Figure 2
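The screenshots are not reproduced here, so as a rough stand-in, the following sketch shows the same idea in Python: scan the output of a data load for Oracle error codes. The sample log lines are made up for illustration and are not real SLOB output; only the ORA-01652 message text (the classic symptom of exhausted TEMP space) is a genuine Oracle error.

```python
import re
from typing import Iterable, List, Tuple

# Oracle errors are reported as "ORA-" followed by a five-digit code.
ORA_ERROR = re.compile(r"ORA-\d{5}")


def find_ora_errors(lines: Iterable[str]) -> List[Tuple[int, str]]:
    """Return (line_number, text) for every line that contains an ORA- error."""
    hits = []
    for lineno, line in enumerate(lines, start=1):
        if ORA_ERROR.search(line):
            hits.append((lineno, line.rstrip("\n")))
    return hits


# Illustrative log content only; a real run would read the loader's output file.
sample_log = [
    "Loading schema user1\n",
    "ORA-01652: unable to extend temp segment by 128 in tablespace TEMP\n",
    "Load for user1 finished\n",
]

for lineno, text in find_ora_errors(sample_log):
    print(f"line {lineno}: {text}")
```

To check a real load, pass an open file object for the loader's output file instead of `sample_log`.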


Yes, Storage Arrays Can Deduplicate Oracle Database. Here Is Exactly Why It Doesn’t Matter!

I recently had some cycles on a freshly installed Dell EMC XtremIO Storage Array. I took this opportunity to prepare a blog entry about the never-ending topic of whether or not storage arrays are able to reduce physical data capacity through deduplication of blocks in Oracle Database.

Of Course There Is Duplicate Data In Oracle Datafiles

Before I continue, let me say something that may come as a surprise to you. Yes, Oracle Database has duplicate blocks in tablespaces! Yes, modern storage arrays can achieve astonishing data reduction rates through deduplication–even when the only data in the array is Oracle Database (whether ASM or file systems)!

XtremIO computes and displays a global data reduction rate. This makes it a bit more difficult to show the effect of deduplication on Oracle Database because averages across diverse data make pin-point focus impossible. However, as I was saying, I took some time on a freshly-installed XtremIO array and collected what I hope will be interesting information on the topic of deduplication.

Please take a look at Figure 1. To start the testing I created a 4TB XtremIO volume, attached it as a LUN to a test host and then created an XFS file system on it. Please be aware that the contents of an Oracle datafile are precisely the same whether stored in ASM or in a file system file. After the file system was created I used the SLOB database creation kit (SLOB/misc/create_database_kit) to create a small database with Oracle Database 12c. As Figure 1 shows, the small database consumed 11.83GB of logical space in the 4TB volume. However, since the data enjoyed a slight deduplication ratio of 1.1:1 and a healthy compression ratio of 3.3:1 for a 3.6:1 data reduction ratio, only 3.27GB physical space was consumed in the array.


Figure 1
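The ratios quoted for Figure 1 can be cross-checked with a few lines of arithmetic. This sketch assumes the overall data reduction ratio is simply the product of the deduplication and compression ratios, which matches the numbers in the text but is my assumption about how the array reports it.

```python
def combined_ratio(dedup: float, compression: float) -> float:
    """Overall reduction if deduplication and compression multiply (assumption)."""
    return dedup * compression


def reduction_ratio(logical_gb: float, physical_gb: float) -> float:
    """Overall reduction measured directly from logical vs. physical space."""
    return logical_gb / physical_gb


# Values quoted in the text for the small database.
from_ratios = combined_ratio(1.1, 3.3)
from_space = reduction_ratio(11.83, 3.27)

print(f"dedup x compression : {from_ratios:.2f}:1")
print(f"logical / physical  : {from_space:.2f}:1")
```

Both approaches land on roughly 3.6:1, so the dashboard figures are internally consistent.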

The next step in the testing was to consume the majority of the 4TB file system with a BIGFILE tablespace. Figure 2 shows the DDL I used to create the tablespace.


Figure 2

Figure 3 shows the file system file that corresponds to the tablespace created with DDL in Figure 2.


Figure 3

After creating the 3.9TB BIGFILE tablespace I took a screenshot of the XtremIO GUI Dashboard. As Figure 4 shows, there was no deduplication! Instead, the data was compressed 4.0:1 resulting in only 977.66GB physical space being consumed in the array. So why in the world would I blog the opposite of what I said above? Why show the array did not, in fact, deduplicate the 3.9TB datafile? The answer is in the fact that I said there are duplicate data blocks in tablespaces. I didn’t say there are duplicate blocks in the same datafile!


Figure 4

To return the array to the state prior to the BIGFILE tablespace creation, I dropped the tablespace (including contents and datafiles thus unlinking the file) and then used the Linux fstrim(8) command to return the space to the array as shown in Figure 5.


Figure 5

Once the fstrim command completed I took another screenshot of the XtremIO GUI Dashboard as shown in Figure 6. Figure 6 shows that the array space utilization and data reduction had returned to that of what was seen before the BIGFILE tablespace creation.


Figure 6

OK, Now For The Duplicate Data

The next step in the testing was to fill up the majority of the 4TB file system with SMALLFILE tablespaces. To do so I created 121 tablespaces each consisting of a single SMALLFILE datafile of 32GB. The output shown in Figure 7 is from a data dictionary query to display the size of each of the 121 datafiles and how the sum of these datafiles consumed 3.87TB of the 4TB file system.


Figure 7
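For readers who want to sanity-check the total, here is a small sketch of the arithmetic. The file count and size are from the text; which unit convention each tool used for "TB" is not stated in the post, so both conversions are shown and neither is claimed as the one actually used.

```python
DATAFILES = 121
GB_PER_FILE = 32

total_gb = DATAFILES * GB_PER_FILE

# The same total expressed with two different divisors for "TB".
tb_divide_1000 = total_gb / 1000
tb_divide_1024 = total_gb / 1024

print(f"total: {total_gb} GB")
print(f"GB / 1000: {tb_divide_1000:.2f} TB")
print(f"GB / 1024: {tb_divide_1024:.2f} TB")
```

Different divisors are one plausible reason why slightly different terabyte totals appear for the same set of files in this post.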

That’s Duplicate Data

Once the file system was filled with SMALLFILE datafiles I took another screenshot of the XtremIO GUI Dashboard. Figure 8 shows that the SMALLFILE datafiles enjoyed a deduplication ratio of 81.8:1 combined with a compression ratio of 3.8:1 resulting in a global data reduction rate of 306.9:1. Because of the significant data reduction rate only 12.68GB of physical space was consumed in the array in spite of the 3.79TB logical space (the sum of the SMALLFILE datafiles) being allocated.


Figure 8

So here we have it! I had a database created with Oracle Database 12c that consisted of 121 32GB files for roughly 3.8TB database size yet XtremIO deduplicated the data down by a factor of 82:1!

So arrays can deduplicate Oracle Database contents! Right? Well, yes, but it matters none whatsoever. Allow me to explain.

Oracle datafiles consist of initialized blocks but vast portions of that initialized content are the same from file to file. This fact can be seen with simple md5sum(1) output. Consider Figure 9 where you can see the output of the md5sum command used to compute Oracle datafile checksums but only after skipping the first 8,692 blocks (8KB blocks). It’s approximately the first 68MB of each datafile that is unique when a datafile is freshly initialized. Beyond that threshold we can see (Figure 9) that the rest of the file content is identical.


Figure 9
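The same check can be sketched in Python: checksum each file only after skipping a fixed number of leading 8KB blocks. The block size and skip count are the ones quoted above; the demo files are tiny synthetic stand-ins (hypothetical names, a 2-block unique prefix) and not real datafiles.

```python
import hashlib
import os
import tempfile

BLOCK_SIZE = 8192     # 8KB blocks, as in the text
SKIP_BLOCKS = 8692    # leading blocks skipped in the text's example


def md5_after_skip(path: str, skip_blocks: int = SKIP_BLOCKS,
                   block_size: int = BLOCK_SIZE) -> str:
    """MD5 of a file's contents after skipping the first skip_blocks blocks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        f.seek(skip_blocks * block_size)
        for chunk in iter(lambda: f.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Demo on synthetic files: unique 2-block prefix, identical 4-block remainder.
tmpdir = tempfile.mkdtemp()
file_a = os.path.join(tmpdir, "a.dbf")
file_b = os.path.join(tmpdir, "b.dbf")
tail = b"\0" * (4 * BLOCK_SIZE)
with open(file_a, "wb") as f:
    f.write(b"A" * (2 * BLOCK_SIZE) + tail)
with open(file_b, "wb") as f:
    f.write(b"B" * (2 * BLOCK_SIZE) + tail)

same_tail = md5_after_skip(file_a, 2) == md5_after_skip(file_b, 2)
same_whole = md5_after_skip(file_a, 0) == md5_after_skip(file_b, 0)
print("identical after prefix:", same_tail)
print("identical as a whole:  ", same_whole)
```

Run against real, freshly initialized datafiles with the default skip count, matching digests would indicate the behavior described above.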

Thus far this blog post has proven that initialized, but empty, Oracle Database datafiles have duplicate data. As the title of this post says, however, it does not matter.

Introduce Application Data To The Mix

Figure 10 shows the commands I used to populate each of the 121 tablespaces with a single table. The table has the sparse characteristic we are all accustomed to with SLOB. That is, I am only creating a single row in each block. Moreover, I’m populating each of these 121 tables with the same application data! This is precisely why I say deduplication of Oracle Database doesn’t matter because it only holds true until any application data is loaded into the data blocks. Figure 10 shows this set of DDL commands.


Figure 10

After populating the blocks in each of the 121 tables (each residing in a dedicated SMALLFILE tablespace) with blocks containing just a single row of application data I took another screenshot of the XtremIO GUI Dashboard. Figure 11 shows how putting any data into the data blocks reverts the deduplication. Why? Well, remember that the block header of every block has the SCN of the last change made to the block. For this reason I can put the same application data in blocks and still have 100% unique blocks–at least at the 8KB level.

Please note that the application table I used to populate the 121 tables does not consume 100% of the data blocks in each of the SMALLFILE tablespaces. There were a few blocks remaining in each tablespace and thus there remained a scant amount of deduplication as seen in Figure 11. Most XtremIO customers see some insignificant deduplication in their Oracle Database environments. Some even see significant deduplication–at least until they insert data into the database.


Figure 11
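To illustrate why a per-block SCN defeats block-level deduplication, here is a toy model. The block layout (an 8-byte SCN followed by a zero-padded payload) is invented for the example and is not the real Oracle block format; the fingerprinting by hash is likewise only a stand-in for whatever an array does internally.

```python
import hashlib
import struct

BLOCK_SIZE = 8192


def make_block(scn: int, payload: bytes) -> bytes:
    """Toy block: 8-byte SCN header followed by the payload, zero-padded to 8KB."""
    header = struct.pack(">Q", scn)
    return header + payload.ljust(BLOCK_SIZE - len(header), b"\0")


def unique_fraction(blocks) -> float:
    """Fraction of blocks with a distinct fingerprint (1.0 means nothing dedupes)."""
    fingerprints = {hashlib.sha1(b).digest() for b in blocks}
    return len(fingerprints) / len(blocks)


payload = b"the same application row in every block"

# Initialized-but-empty blocks: identical content, so they collapse to one.
empty_blocks = [make_block(0, b"") for _ in range(100)]

# Same payload everywhere, but each block was changed at a different SCN.
loaded_blocks = [make_block(scn, payload) for scn in range(1, 101)]

print("empty blocks unique fraction: ", unique_fraction(empty_blocks))
print("loaded blocks unique fraction:", unique_fraction(loaded_blocks))
```

In this model the empty blocks deduplicate almost entirely while the loaded blocks are all unique, even though the application data in them is byte-for-byte the same.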

In a follow-up post I’ll say a few words about the deduplication granularity and how it affects the ability to achieve small amounts of deduplication of unused space in initialized data blocks. However, bear in mind that the net result of any deduplication of Oracle Database data files is that the only space that can be deduplicated is space that has never had application data in it. After all, a SQL DELETE command doesn’t remove data–it only marks it as free in the block.


I don’t think there are that many Oracle shops that have an urgent need for data reduction of space that’s never been used to store application data. I could be wrong. Until I find out either way, I say that yes you can see deduplication of Oracle Database datafiles but it doesn’t matter one bit.








How Many ASM Disks Per Disk Group And Adding vs. Resizing ASM Disks In An All-Flash Array Environment

I recently posted a 4-part blog series that aims to inform readers that, in an All-Flash Array environment (e.g., XtremIO), database and systems administrators should consider opting for simplicity when configuring and managing Oracle Automatic Storage Management (ASM).

The series starts with Part I which aims to convince readers that modern systems, attached to All-Flash Array technology, can perform large amounts of low-latency physical I/O without vast numbers of host LUNs. Traditional storage environments mandate large numbers of deep I/O queues because high latency I/O requests remain “in-flight” longer. The longer an I/O request takes to complete, the longer other requests remain in the queue. This is not the case with low-latency I/O. Please consider Part I a required primer.
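The relationship between latency and required queue depth can be made concrete with Little's Law (average requests in flight = throughput × latency). The sketch below applies it to I/O; the IOPS target and the two latencies are hypothetical values chosen for illustration, not measurements from the series.

```python
def outstanding_ios(iops: float, latency_seconds: float) -> float:
    """Little's Law: average number of I/O requests in flight."""
    return iops * latency_seconds


# Hypothetical workload: 100,000 IOPS against two kinds of storage.
TARGET_IOPS = 100_000
high_latency = outstanding_ios(TARGET_IOPS, 0.005)    # 5 ms per I/O
low_latency = outstanding_ios(TARGET_IOPS, 0.0005)    # 0.5 ms per I/O

print(f"5 ms storage:   ~{high_latency:.0f} I/Os in flight")
print(f"0.5 ms storage: ~{low_latency:.0f} I/Os in flight")
```

With a tenth of the latency, the same throughput needs a tenth of the aggregate queue depth, which is why fewer host LUNs can suffice on low-latency storage.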

To add more detail to what was offered in Part I,  I offer Part II.  Part II shares a very granular look at the effects of varying host LUN count (aggregate I/O queue depth) alongside varying Oracle Database sessions executing zero-think time transactions.

Part III begins the topic of resizing ASM disks when additional ASM disk group capacity is needed.  Parts I and II are prerequisite reading because one might imagine that a few really large ASM disks is not going to offer appropriate physical I/O performance. That is, if you don’t think small numbers of host LUNs can deliver the necessary I/O performance you might be less inclined to simply resize the ASM disks you have when extra space is needed.

Everything we know in IT has a shelf-life. With All-Flash Array storage, like XtremIO, it is much less invasive, much faster and much simpler to increase your ASM disk group capacity by resizing the existing ASM disks.

Part IV continues the ASM disk resizing topic by showing an example in a Real Application Clusters environment.


Resizing ASM Disks On Modern Systems. Real Application Clusters Doesn’t Make It Any More Difficult. An XtremIO Example With RAC.

My recent post about adding space to ASM disk groups by resizing them larger, as opposed to adding more disks, did not show a Real Application Clusters example. Readers’ comments suggested there is concern amongst DBAs that resizing disks (larger) in a RAC environment might somehow be more difficult than in non-RAC environments. This blog entry shows that, no, it is not more difficult. If anything is true it is that adding disks to ASM disk groups is, in fact, difficult and invasive and that resizing disks–whether clustered systems or not–is very simple. The entire point of this short blog series is to endear DBAs to the modern way of doing things.

For more background on the topics of LUN sizes and LUN counts in All-Flash Array environments based on proof and data from an XtremIO environment, I recommend the following links. The first and second links in the following list make the case that administrators really do not need to make ASM disk groups out of large numbers of host LUNs. The third link covers resizing ASM disks in a non-RAC environment.

  1. Yes, Host Aggregate I/O Queue Depth is Important. But Why Overdo It When Using All-Flash Array Technology? Complexity is Sometimes a Choice.
  2. Host I/O Queue Depth with XtremIO and SLOB Session Count. A Granular Look.
  3. Stop Constantly Adding Disks To Your ASM Disk Groups. Resize Your ASM Disks On All-Flash Array Storage. Adding Disks Is Really “The Y2K Way.” Here’s Why.

A Real Application Clusters Example

The example I give in this post is based on an XtremIO storage array, although the principles discussed are applicable to most modern enterprise storage arrays. It is my assertion, however, that adding space to ASM disk groups by resizing the individual ASM disks (LUNs) is really only something one should do in an All-Flash Array environment like XtremIO. I’ve made that point in the above-cited linked articles.

Resizing ASM disks in an XtremIO environment is every bit as simple as it is in non-RAC environments. The following example shows just how simple.

Figure 1 shows a screen capture of ASMCA reporting that all disk groups are mounted on both nodes of the RAC cluster and that the SALESDATA disk group has 2TB capacity at the beginning of the testing.


Figure 1

Figure 2 shows the XtremIO GUI after all 4 of the ASM disks in the SALESDATA disk group have been resized to 1TB. Resizing XtremIO volumes is a completely non-disruptive operation.


Figure 2

Figure 3 shows the simple commands the administrator needs to execute to rescan for block device changes on all nodes of the RAC cluster. Figure 3 also shows the commands necessary to verify that the block device reflects the new capacity given to each of the LUNs that map to the XtremIO volumes.


Figure 3
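Since the screenshot is not reproduced here, the following is only a sketch of the general idea on Linux, not the commands from the figure. It assumes SCSI block devices, where writing `1` to `/sys/block/<dev>/device/rescan` asks the kernel to re-read the device and `/sys/block/<dev>/size` reports capacity in 512-byte units. The device name `sdx` and the fake sysfs tree in the demo are hypothetical so the example can run without root.

```python
import os
import tempfile

SECTOR_BYTES = 512  # /sys/block/<dev>/size is expressed in 512-byte units


def rescan_device(dev: str, sysfs_root: str = "/sys/block") -> None:
    """Ask the kernel to re-read a SCSI device (needs root on a real host)."""
    path = os.path.join(sysfs_root, dev, "device", "rescan")
    with open(path, "w") as f:
        f.write("1\n")


def device_size_bytes(dev: str, sysfs_root: str = "/sys/block") -> int:
    """Current capacity of a block device as seen by the kernel."""
    with open(os.path.join(sysfs_root, dev, "size")) as f:
        return int(f.read().strip()) * SECTOR_BYTES


# Demo against a fake sysfs tree standing in for a LUN resized to 1TB.
fake_root = tempfile.mkdtemp()
os.makedirs(os.path.join(fake_root, "sdx", "device"))
with open(os.path.join(fake_root, "sdx", "size"), "w") as f:
    f.write("2147483648\n")

rescan_device("sdx", fake_root)
size_tib = device_size_bytes("sdx", fake_root) / 2**40
print(f"sdx capacity: {size_tib:.0f} TiB")
```

On a multipath host each underlying path device would need the same treatment, on every node of the cluster.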

Figure 4 shows how a simple shell script (called in this example) can be used to direct the multipathd(8) command to resize internal metadata for specific XtremIO volumes. The script can be executed on remote hosts via the bash(1)  “-s” option.


Figure 4

Figure 5 shows how the ASM disks were 512GB each until the disk group was altered to resize all the disks. That is, in spite of the fact that the block devices were resized at the operating system level, ASM had not yet been updated.


Figure 5

Once the ASM disks are resized as shown in Figure 5, the ASMCA command will also show that the disk group (SALESDATA in the example) has 4TB capacity as seen in Figure 6.


Figure 6
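The capacity change in this example is easy to verify. The sketch below also builds the resize statement as a string; the `ALTER DISKGROUP ... RESIZE ALL` form is standard ASM SQL, but the helper function names are my own and the exact command shown in Figure 5 is not reproduced here.

```python
def diskgroup_capacity_gb(disk_count: int, disk_size_gb: int) -> int:
    """Raw capacity of a disk group made of equally sized disks."""
    return disk_count * disk_size_gb


def resize_all_statement(diskgroup: str) -> str:
    """SQL that tells ASM to adopt the new size of every disk in the group."""
    return f"ALTER DISKGROUP {diskgroup} RESIZE ALL"


# SALESDATA: 4 disks grown from 512GB to 1TB (1024GB) each.
before_gb = diskgroup_capacity_gb(4, 512)
after_gb = diskgroup_capacity_gb(4, 1024)

print(f"before: {before_gb // 1024}TB, after: {after_gb // 1024}TB")
print(resize_all_statement("SALESDATA"))
```

The statement only updates ASM metadata, so it needs to be run once from a single ASM instance after every node has seen the larger block devices.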

This example has shown that resizing ASM disks in an XtremIO environment is the simplest, least impactful way to add space to an ASM disk group in a Real Application Clusters environment–just as it is in a non-RAC environment.





Stop Constantly Adding Disks To Your ASM Disk Groups. Resize Your ASM Disks On All-Flash Array Storage. Adding Disks Is Really “The Y2K Way.” Here’s Why.

This blog post is centered on All-Flash Array (AFA) technology. I mostly work with EMC XtremIO but the majority of my points will be relevant for any AFA. I’ll specifically call out an array that doesn’t fit any of the value propositions / methods I’m writing about in this post.

Oracle Automatic Storage Management (ASM) is a very good volume manager and since it is purpose-built for Oracle Database it is the most popular storage presentation model DBAs use today. That is not to say alternatives such as NFS (with optional Direct NFS) and simple non-clustered file systems are obsolete. Not at all. However, this post is about adding capacity to ASM disk groups in an all-flash storage environment.

Are You Adding Capacity or Adding I/O Performance?

One of the historical strengths of ASM is the fact that it supports adding a disk even though the disk group is more or less striped and mirrored (in the case of normal or high redundancy). After adding a disk to an ASM disk group there is a rebalancing of existing data to spread it out over all of the disks–including the newly-added disk(s). This was never possible with a host volume manager in, for example, RAID-10. The significant positive effect of an ASM rebalance is realized, first and foremost, in a mechanical storage environment. In short, adding a disk historically meant adding more read/write heads over your data, therefore, adding capacity meant adding IOPS capability (presuming no other bottlenecks in the plumbing).

The historical benefit of adding a disk was also seen at the host level. Adding a disk (or LUN) means adding a block device and, therefore, more I/O queues at the host level. More aggregate queue depth means more I/O can be “in-flight.”

With All-Flash Array technology, neither of these reasons for rebalance makes it worth adding ASM disks when additional space is needed. I’ll just come out and say it in a quotable form:

If you have All-Flash Array technology it is not necessary to treat it precisely the same way you did mechanical storage.

It Isn’t Even A Disk

In the All-Flash Array world the object you are adding as an ASM disk is not a disk at all and it certainly has nothing like arms, heads and actuators that need to scale out in order to handle more IOPS. All-Flash Arrays allow you to create a volume of a particular size. That’s it. You don’t toil with particulars such as what the object “looks like” inside the array. When you allocate a volume from an All-Flash Array you don’t have to think about which controller within the array, which disk shelf, nor what internal RAID attributes are involved. An AFA volume is a thing of a particular size. That’s it. These words are 100% true about EMC XtremIO and, to the best of my knowledge, most competitors’ offerings are this way as well. The notable exception is the HP 3PAR StoreServ 7450 All-Flash Array which burdens administrators with details more suited to mechanical storage as is clearly evident in the technical white paper available on the HP website (click here).

What About Aggregate Host I/O Queue Depth?

So, it’s true that adding a disk to an ASM disk group in the All-Flash Array world is not a way to make better use of the array–unlike an array built on mechanical storage. What about the host-level benefit of adding a block device and therefore increasing host aggregate I/O queue depth? As it turns out, I just blogged a rather in-depth series of posts on the matter. Please see the following posts where I aim to convince readers that you really do not need to assemble large numbers of block devices in order to get significant IOPS capacity on modern hosts attached to low-latency storage such as EMC XtremIO.

What’s It All Mean?

To summarize the current state of the art regarding adding disks to ASM disk groups:

  • Adding disks to ASM disk groups is not necessary to improve All Flash Array “drive” utilization.
  • Adding disks to ASM disk groups is not necessary to improve aggregate host I/O queue depth–unless your database instance demands huge IOPS–which it most likely doesn’t.

So why do so many–if not most–Oracle shops still do the old add-a-disk-when-I-need-space thing? Well, I’m inclined to say it’s because that’s how they’ve always done it. By saying that I am not denigrating anyone! After all, if that’s the way it’s always been done then there is a track record of success and in today’s chaotic IT world I have no qualms with doing something that is proven. But loading JES3 card decks into a card reader to fire off an IBM 370 job was proven and we don’t do much of that these days.

If doing something simpler has no ill effect, it’s probably worth consideration.

If You Need More Capacity, Um, Why Not Make Your Disk(s) Larger?

I brought that up on Twitter recently and was met with a surprising amount of negative feedback. I understood the face value of the objections and that’s why I’m starting this section of the post with objection-handling. The objections all seemed to revolve around the number of “changes” involved with resizing disks in an ASM disk group when more space is needed. That is, the consensus seemed to be that resizing, say, 4 ASM disks accounts for more “changes” than adding a single disk to 4 existing disks. Actually, adding a disk makes more changes. Please read on.

Note: Please don’t forget that I’m writing about resizing disks in an All-Flash Array like EMC XtremIO or even competitive products in the same product space.

A Scenario

Consider, for example, an ASM disk group that is comprised of 4 LUNs mapped to 4 volumes in an All-Flash Array (like XtremIO). Let’s say the LUNs are each 128GB for a disk group capacity of 512GB (external redundancy of course). Let’s say further that the amount of space to be added is another 128GB–a 25% increase–and that the existing space is nearly exhausted. The administrators can pick from the following options:

  1. Add a new 128GB disk (LUN). This involves a) creating the volume in the array and b) discovering the block device on the host and c) editing udev rules configuration files for the new device and d) adding the disk to ASM and, finally, e) performing a rebalance.
  2. Resize the existing 4 LUNs to 160GB each. This involves a) modifying 4 volumes in the array to increase their size and b) discovering the block device on the host and c) updating the run-time multipath metadata (runtime command, no config file changes) and d) executing the ASM alter diskgroup resize all command (merely updates ASM metadata).
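The 160GB figure in option 2 follows directly from spreading the extra capacity evenly over the existing LUNs. A minimal sketch of that arithmetic, using the numbers from the scenario (the function name is my own):

```python
def resized_lun_gb(lun_count: int, lun_size_gb: float, extra_gb: float) -> float:
    """New per-LUN size when extra capacity is spread evenly over existing LUNs."""
    return (lun_count * lun_size_gb + extra_gb) / lun_count


# Scenario from the text: 4 x 128GB LUNs, 128GB more needed.
current_capacity_gb = 4 * 128
new_lun_gb = resized_lun_gb(4, 128, 128)
growth_pct = 100 * 128 / current_capacity_gb

print(f"current capacity: {current_capacity_gb}GB")
print(f"new LUN size:     {new_lun_gb:.0f}GB each")
print(f"growth:           {growth_pct:.0f}%")
```

Either option ends at the same 640GB disk group; the difference is in how many objects and how much data movement it takes to get there.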

Option #1 in the list makes a change in the array (adding a volume deducts from fixed object counts) and two Operating System changes (you are creating a block device and editing udev config files) and–most importantly–ASM will perform significant physical I/O to redistribute the existing data to fan it out from 4 disks to 5 disks.

Option #2 in the list actually makes no changes.

If doing something simpler has no ill effect, it’s probably worth consideration.

The Resizing Approach Really Involves No Changes?

How can I say resizing 4 volumes in an array constitutes no changes? OK, I admit I might be splitting hairs on this but bear with me. If you create a volume in an array you have a new object that has to be associated with the ASM disk group. This means everything from naming it to tagging it and so forth. Additionally, arrays do not have an infinite number of volumes available. Moreover, arrays like XtremIO support vast numbers of volumes and snapshots but if your ASM disk groups are comprised of large numbers of volumes it takes little time to exhaust even the huge supported limit of snapshots in a product like XtremIO. If you can take the leap of faith with me regarding the difference between creating a volume in an All-Flash Array versus increasing the size of a volume then the difference at the host and ASM level will only be icing on the cake.

The host in Option #2 truly undergoes no changes. None. In the case study below you’ll see that resizing block devices on modern Linux hosts is an operation that involves no changes. None.

But It’s Really All About The Disruption

If you add a disk to an ASM disk group you are making storage and host changes and you are disrupting operations due to the rebalancing. On the contrary the resize disks approach is clearly free of changes and is even more clearly free of disruption. Allow me to explain.

The Rebalance Is A Disruption–And More

The prime concern about adding disks should be the overhead of the rebalance operation. But so many DBAs say they can simply lower the rebalance power limit (throttle the rebalance to lessen its toll on other I/O activity).

If administrators wish to complete the rebalance operation as quickly as possible then the task is postponed for a maintenance window. Otherwise production I/O service times can suffer due to the aggressive nature of ASM disk rebalance I/O. On the other hand, some administrators add disks during production processing and simply set the ASM rebalance POWER level to the lowest value. This introduces significant risk. If an ASM disk is added to an ASM disk group in a space-full situation the only free space for new data being inserted is in the newly added disk. The effect this has on data distribution can be significant if the rebalance operation takes significant time while new data is being inserted.
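
The skew risk can be illustrated with a toy model. The sketch below is not ASM's allocator; it simply assumes new data is placed in 1GB chunks, round-robin, on whichever disks still have free space. The free-space figures are hypothetical.

```python
def place_new_data(free_gb, new_gb):
    """Toy allocator: put each 1GB chunk on the next disk that has free space."""
    free = list(free_gb)
    placed = [0] * len(free)
    disk = 0
    for _ in range(new_gb):
        # Skip over disks that are already full.
        for _ in range(len(free)):
            if free[disk] >= 1:
                break
            disk = (disk + 1) % len(free)
        else:
            raise ValueError("disk group is out of space")
        free[disk] -= 1
        placed[disk] += 1
        disk = (disk + 1) % len(free)
    return placed


# Hypothetical: four nearly full disks (1GB free each) plus one new, empty 128GB disk,
# with the rebalance throttled so far that it has moved essentially nothing yet.
space_full = place_new_data([1, 1, 1, 1, 128], new_gb=40)

# Hypothetical: the same group after a completed rebalance (free space spread evenly).
balanced = place_new_data([26, 26, 26, 26, 26], new_gb=40)

print("space-full placement:", space_full)
print("balanced placement:  ", balanced)
```

In the space-full case nearly all of the new data lands on the single new disk, which is the placement skew described above.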

In other words, with the add-disk method administrators are a) making changes in the array, b) making changes in the Operating System and c) physically rebalancing existing data, doing so either in a maintenance window or with a low rebalance power limit that likely causes data placement skew.

The resize-disk approach makes no changes and causes no disruption and is nearly immediate. It is a task administrators can perform outside maintenance windows.

What If My Disks Cannot Be Resized Because They are Already Large?

An ASM disk in 11g can be 2TB and in 12c, 4PB. Now, of course, Linux block devices cannot be 4PB but that’s what Oracle documentation says they can (obviously theoretically) be. If you have an ASM disk group where all the disks have been resized to 2TB then you have to add a disk. What’s the trade off? Well, as the disks were being resized over time to 2TB you made no changes in the array or the operating system and you never once suffered a rebalance operation. Sure, eventually a disk needed to be added but that is a much less disruptive evolution for a disk group.

Case Study

The following section of this blog post shows a case study of what’s involved when choosing to resize disks as opposed to constantly adding disks. The case study was, of course, conducted on XtremIO so the array-level information is specific to that array.

Every task necessary to resize ASM disks can be conducted without application interruption on modern Linux servers attached to an XtremIO storage array. The following section shows an example of the tasks necessary to resize ASM disks in an XtremIO environment, without application interruption.

Figure 1 shows a screen shot of the ASM Configuration Assistant (ASMCA). In the example, SALESDATA is the disk group that will be resized from one terabyte to two terabytes.


Figure 1

Figure 2 shows the XtremIO GUI with focus on the four volumes that comprise the SALESDATA disk group. Since all of the ASM disk space for SALESDATA has been allocated to tablespaces in the database, the Space in Use column shows that the volume space is entirely consumed.


Figure 2

Figure 3 shows the simple, non-disruptive operating system commands needed to determine the multipath device name that corresponds to each XtremIO volume. This is a simple procedure. The NAA Identifier (see Figure 2) is used to query the Device Mapper metadata. As Figure 3 shows, each LUN is 256GB and the corresponding multipath device for each LUN is reported in the left-most column of the xargs(1) output.


Figure 3

The next step in the resize procedure is to increase the size of the XtremIO volumes. Figure 4 shows the screen output just prior to resizing the fourth of four volumes from the original size of 256GB to the new size of 512GB.


Figure 4

Once the XtremIO volume resize operations are complete (these operations are immediate with XtremIO), the next step is to rescan SCSI busses on the host for any attribute changes to the underlying LUNs. As Figure 5 shows, only a matter of seconds is required to rescan for changes. This, too, is non-disruptive.


Figure 5

Once the rescan has completed, the administrator can once again query the multipath devices to find that the LUNs are, in fact, recognized as having been resized as seen in Figure 6.


Figure 6

The final operating system level step is to use the multipathd(8) command to resize the multipath device (see Figure 7). This is non-disruptive as well.


Figure 7

As Figure 8 shows, the next step is to use the ALTER DISKGROUP command while attached to the ASM instance. The execution of this command is nearly immediate and, of course, non-disruptive. Most importantly, after this command completes the new capacity is available and no rebalance operation was required!


Figure 8

Finally, as Figure 9 shows, ASM Configuration Assistant will now show the new size of the disk group. In the example, the SALESDATA disk group has been resized from 1TB to 2TB in a matter of seconds—with no application interruption and no I/O impact from a rebalance operation.


Figure 9
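
Since the screenshots are not reproduced here, the host-side and ASM steps can be summarized as a dry-run plan. The sketch below only builds the command strings and executes nothing. The device names (sdb, mpatha and so on) are hypothetical, and the exact invocations are assumptions based on common Linux and ASM usage rather than a copy of what the figures showed. The array-side volume resize is done in the XtremIO GUI and is not represented.

```python
def resize_plan(diskgroup, multipath_maps, path_devices):
    """Return the ordered host and ASM commands for a resize (nothing is executed)."""
    steps = []
    # 1. Rescan each SCSI path device so the kernel notices the larger LUN
    #    (assumed method: the per-device sysfs rescan attribute).
    for dev in path_devices:
        steps.append(f"echo 1 > /sys/block/{dev}/device/rescan")
    # 2. Tell multipathd to pick up the new size of each multipath map.
    for mpath in multipath_maps:
        steps.append(f"multipathd -k'resize map {mpath}'")
    # 3. From the ASM instance, grow every disk in the group to its new device size.
    steps.append(f"ALTER DISKGROUP {diskgroup} RESIZE ALL;")
    return steps


# Hypothetical device names; a real system would report its own.
plan = resize_plan(
    diskgroup="SALESDATA",
    multipath_maps=["mpatha", "mpathb", "mpathc", "mpathd"],
    path_devices=["sdb", "sdc", "sdd", "sde"],
)
for number, command in enumerate(plan, start=1):
    print(f"{number}. {command}")
```

Note that no configuration file is edited at any step, which is the point made earlier about the resize approach involving no changes on the host.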


If you have an All-Flash Array, like EMC XtremIO, take advantage of modern technology. The chore of constantly adding disks to ASM disk groups all over your datacenter can fade into vague memory–just like loading those JES3 decks into the card reader of your IBM 370. And, yes, I’ve written and loaded JES3 decks for an IBM 370 but I don’t feel compelled to do that sort of thing any more. Just like constantly adding disks to ASM disk groups, some of the old ways are no longer the best ways.


Host I/O Queue Depth with XtremIO and SLOB Session Count. A Granular Look.

In my recent post about aggregate host I/O queue depth I shared both 100% SQL SELECT and 20% SQL UPDATE test results (SLOB) at varying LUN (ASM disk) counts. The LUNs mapped to XtremIO volumes but the assertions in that post were really applicable in most All-Flash Array situations.

I received quite a bit of email from readers about the granularity of session counts shown in the charts in that post. Overwhelmingly, folks asked to see more granular data. It so happens that the charts in that post were a mere snippet of the test suite results so I charted the full data set and am posting them here.

Test Description

The testing consisted of varying the number of ASM disks in a disk group from 1 to 16 host LUNs mapped to XtremIO volumes. SLOB was executed with varying numbers of zero-think time sessions from 1 to 250 sessions for the 20% UPDATE test and from 1 to 450 sessions for the 100% SELECT test.  The SLOB scale was 1TB and I used SLOB Single-Schema Model. The array was a 4 X-Brick XtremIO array connected to a single 2s36c72t Xeon server running single-instance Oracle Database 12c and Linux 7.  The array was attached via 6 runs of 8GFC Fibre Channel and multipathing was supplied by DM-MPIO. The default Oracle Database block size (8KB) was used.

Remember that the sessions are zero think-time in this testing; therefore, IOPS are a direct reflection of latency, and in this case latency is mostly attributable to host queueing as I explained in the prior post.

The prime message in this data is the Total IOPS values demonstrated at even low host LUN counts and, as such, it makes little sense to create complex ASM disk groups (consisting of large numbers of host LUNs mapped to All-Flash Array storage like XtremIO). Unless, that is, you manage one of the very few production databases that demands IOPS above 100,000. I know these databases exist, but there aren’t as many of them as some might think. High IOPS-capable platforms like XtremIO are generally used for consolidation.

If you click on the image you can get the full-size chart.



Figure 1. 100% SQL SELECT.



Figure 2. 80% SQL SELECT with 20% SQL UPDATE.



Yes, Host Aggregate I/O Queue Depth is Important. But Why Overdo It When Using All-Flash Array Technology? Complexity is Sometimes a Choice.

Blog Update. Part II is available. Please Click the following link after you’ve finished this post: click here.

That’s The Way We’ve Always Done It

I recently updated the EMC best practices guide for Oracle Database on XtremIO. One of the topics in that document is how many host LUNs (mapped to XtremIO storage array volumes) administrators should use for each ASM disk group. While performing the testing for the best practices guide it dawned on me that this topic is suitable for a blog post. I think too many DBAs are still using the ASM disk group methodology that made sense with mechanical storage. With All Flash Arrays–like XtremIO–administrators can rethink the complexities of the way they’ve always done it–as the adage goes.

Before reading the remainder of the post, please be aware that this is the first installment in a short series about host LUN count and ASM disk groups in all-flash environments. Future posts will explore additional reasons why simple ASM disk groups in all-flash environments make a lot of sense.

How Many Host LUNs are Needed With All Flash Array Technology

We’ve all come to accept the fact that–in general–mechanical storage offers higher latency than solid state storage (e.g., All Flash Array). Higher latency storage requires more aggregate host I/O queue depth in order to sustain high IOPS. The longer I/O takes to complete the longer requests have to linger in a queue.
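
The relationship between latency and queue depth follows from Little's Law: the average number of I/Os in flight equals the I/O rate multiplied by the time each I/O takes. The following sketch uses assumed, round-number latencies (5 ms for mechanical storage, 0.5 ms for all-flash) purely for illustration; they are not measurements from this testing.

```python
def outstanding_ios(iops, latency_seconds):
    """Little's Law: average I/Os in flight = I/O rate x time per I/O."""
    return iops * latency_seconds


target_iops = 100_000

# Assumed latencies, for illustration only.
mechanical_in_flight = outstanding_ios(target_iops, 0.005)    # about 500 in flight
all_flash_in_flight = outstanding_ios(target_iops, 0.0005)    # about 50 in flight

print(f"Mechanical (5 ms):  {mechanical_in_flight:.0f} I/Os in flight")
print(f"All-flash (0.5 ms): {all_flash_in_flight:.0f} I/Os in flight")
```

A tenfold drop in latency means a tenfold drop in the aggregate queue depth needed to sustain the same IOPS, which is why far fewer host LUNs are needed.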

With mechanical storage it is not at all uncommon to construct an ASM disk group with over 100 (or hundreds of) ASM disks. That may not sound too complex to the lay person, but that’s only a single ASM disk group on a single host. The math gets troublesome quite quickly with multiple hosts attached to an array.

So why are DBAs creating ASM disk groups consisting of vast numbers of host LUNs after they adopt all-flash technology? Well, generally it’s because that’s how it has always been done in their environment. However, there is no technical reason to assemble complex, larger disk-count ASM disk groups with storage like XtremIO. With All Flash Array technology latencies are an order of magnitude (or more) shorter than with mechanical storage. Driving even large IOPS rates is possible with very few host LUNs in these environments because the latencies are low. To put it another way:

With All Flash Array technology host LUN count is strictly a product of how many IOPS your application demands

Lower I/O latency allows administrators to create ASM disk groups of very low numbers of ASM disks. Fewer ASM disks means fewer block devices. Fewer block devices means a simpler physical storage layout and simpler is always better–especially in modern, complex IT environments.

Case Study

In order to illustrate the relationship between concurrent I/O and host I/O queue depth, I conducted a series of tests that I’ll share in the remainder of this blog post.

The testing consisted of varying the number of ASM disks in a disk group from 1 to 16 host LUNs mapped to XtremIO volumes. SLOB was executed with varying numbers of zero-think time sessions from 80 to 480 and with slob.conf->UPDATE_PCT set to 0 and 20. The SLOB scale was 1TB and I used SLOB Single-Schema Model. The array was a 4 X-Brick XtremIO array connected to a single 2s36c72t Xeon server running single-instance Oracle Database 12c and Linux 7. The default Oracle Database block size (8KB) was used.

Please note: Read Latencies in the graphics below are db file sequential read wait event averages taken from AWR reports and therefore reflect host I/O queueing time. The array-level service times are not visible in these graphics. However, one can intuit such values by observing the db file sequential read latency improvements when host I/O queue depth increases. That is, when host queueing is minimized the true service times of the array are more evident.

Test Configuration HBA Information

The host was configured with 8 Emulex LightPulse 8GFC HBA ports. HBA queue depth was configured in accordance with the XtremIO Storage Array Host Configuration Guide thus lpfc_lun_queue_depth=30 and lpfc_hba_queue_depth=8192.

Test Configuration LUN Sizes

All ASM disks in the testing were 1TB. This means that the 1-LUN test had 1TB of total capacity for the datafiles and redo logs. Conversely, the 16-LUN test had 16TB capacity.  Since the SLOB scale was 1TB readers might ponder how 1TB of SLOB data and redo logs can fit in 1TB. XtremIO is a storage array that has always-on, inline data reduction services including compression and deduplication. Oracle data blocks cannot be deduplicated. In the testing it was the XtremIO array-level compression that allowed 1TB scale SLOB to be tested in a single 1TB LUN mapped to a 1TB XtremIO volume.

Read-Only Baseline

Figure 1 shows the results of the read-only workload (slob.conf->UPDATE_PCT=0). As the chart shows, Oracle database is able to perform 174,490 read IOPS (8KB) with average service times of 434 microseconds with only a single ASM disk (host LUN) in the ASM disk group. This I/O rate was achieved with 160 concurrent Oracle sessions. However, when the session count increased from 160 to 320, the single LUN results show evidence of deep queueing. Although the XtremIO array service times remained constant (detail that cannot be seen in the chart), the limited aggregate I/O queue depth caused the db file sequential read waits at 320, 400 and 480 sessions to increase to 1882us, 2344us and 2767us respectively. Since queueing causes the total I/O wait time to increase, adding sessions does not increase IOPS.
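
The single-LUN, 160-session figures can be cross-checked with simple closed-loop arithmetic. The sketch below assumes each zero-think-time session issues one read at a time, so the average time per read cycle is the session count divided by IOPS. Treating everything outside the db file sequential read wait as "non-wait" time (CPU and anything else) is a simplification made for this sketch.

```python
# Figures quoted above for the 1-LUN, 160-session read-only test.
sessions = 160
read_iops = 174_490
avg_wait_us = 434            # db file sequential read average, in microseconds

# Closed loop, one read at a time per session: cycle time = sessions / IOPS.
cycle_us = sessions / read_iops * 1_000_000

# Whatever part of the cycle is not spent waiting on the read (simplification).
non_wait_us = cycle_us - avg_wait_us

# Little's Law applied to the wait itself: reads in flight on average.
reads_in_flight = read_iops * avg_wait_us / 1_000_000

print(f"Average cycle per read:   {cycle_us:.0f} us")
print(f"Time outside the I/O wait: {non_wait_us:.0f} us")
print(f"Reads in flight (average): {reads_in_flight:.1f}")
```

Under these assumptions each read cycle takes roughly 917 microseconds, of which a little under half is spent waiting on I/O, and on average fewer than half of the 160 sessions have a read outstanding at any instant.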

As seen in the 2 LUN group (Figure 1), adding an XtremIO volume (host LUN) to the ASM disk group had the effect of nearly doubling read IOPS in the 160 session test but, once again, deep queueing started to occur in the 320 session case and thus db file sequential read waits approached 1 millisecond—albeit at over 300,000 IOPS. Beyond that point the 2 LUN case showed increasing latency and thus no improvement in read IOPS.

Figure 1 also shows that from 4 LUNs through 16 LUNs latencies remained below 1 millisecond even as read IOPS approached the 520,000 level. With the information in Figure 1, administrators can see that host LUN count in an XtremIO environment is actually determined by how many IOPS your application demands. With mechanical storage administrators were forced to assemble large numbers of host LUNs for ASM disks to accommodate high storage service times. This is not the case with XtremIO.


Figure 1

Read / Write Test Results

Figure 2 shows measured IOPS and service times based on the slob.conf->UPDATE_PCT=20 testing. The IOPS values shown in Figure 2 are the combined foreground and background process read and write IOPS. The I/O ratio was very close to 80:20 (read:write) at the physical I/O level. As was the case in the 100% SELECT workload testing, the 20% UPDATE testing was also conducted with varying Oracle Database session counts and host LUN counts. Each host LUN mapped to an XtremIO volume.

Even with moderate SQL UPDATE workloads, the top Oracle wait event will generally be db file sequential read when the active data set is vastly larger than the SGA block buffer pool—as was the case in this testing. As such, the key performance indicator shown in the chart is db file sequential read.

As was the case in the read-only testing, this series of tests also shows that significant amounts of database physical I/O can be serviced with low latency even when a single host LUN is mapped to a single XtremIO volume. Consider, for example, the 160 session count test with a single LUN where 130,489 IOPS were serviced with db file sequential read wait events serviced in 754 microseconds on average. The positive effect of doubling host aggregate I/O queue depth can be seen in Figure 2 in the 2 LUN portion of the graphic. With only 2 host LUNs the same 160 Oracle Database sessions were able to process 202,931 mixed IOPS with service times of 542 microseconds. The service time decrease from 754 to 542 microseconds demonstrates how removing host queueing allows the database to enjoy the true service times of the array, even as IOPS rose by roughly 55%.

With the data provided in Figures 1 and 2, administrators can see that it is safe to configure ASM disk groups with very few host LUNs mapped to an XtremIO storage array, making for a simpler deployment. Only those databases demanding significant IOPS need to be created in ASM disk groups with large numbers of host LUNs.


Figure 2

Figure 3 shows a table summarizing the test results. I invite readers to look across their entire IT environment and find their ASM disk groups that sustain IOPS that require even more than a single host LUN in an XtremIO environment. Doing so will help readers see how much simpler their environment could be in an all-flash array environment.


Figure 3


Everything we know in IT has a shelf-life. Sometimes the way we’ve always done things is no longer the best approach. In the case of deriving ASM disk groups from vast numbers of host LUNs, I’d say All-Flash Array technology like XtremIO should have us rethinking why we retain old, complex ways of doing things.

This post is the first installment in a short series on ASM disk groups in all-flash environments. The next installment will show readers why low host LUN counts can even make adding space to an ASM disk group much, much simpler.

For Part II Please click here.

Introducing a VCE White Paper. Consolidating SAP, SQL Server and Oracle Production/Test/Dev/OLTP and OLAP Into a Single XtremIO Array with VCE Converged Infrastructure.

This is just a short blog post to direct readers to a fantastic mixed-workload and heterogeneous database consolidation Proof of Concept. This VCE paper should not be missed. I assert that the VCE converged infrastructure platforms–most notably the Vblock 540–are the best off-the-shelf solution for provisioning XtremIO all-flash storage to large numbers of hosts each processing vastly differing workloads (production, test/dev, OLTP, OLAP).

This paper is full of useful information. It explains the XtremIO 24:1 data reduction realized in the test. It also shows a great deal of configuration tips such as controlling I/O on Linux hosts with CGROUPS and on VMware virtual hosts via VMware Storage I/O Control.

The following is an overview of the testing landscape proven in the paper:

  • A high frequency online transaction processing (OLTP) application with Oracle using the Silly Little Oracle Benchmark (SLOB) tool
  • A modern OLTP benchmark simulating a stock trading application representing a second OLTP workload for SQL Server
  • ERP hosted on SAP with an Oracle data store simulating a sell-from-stock business scenario
  • A decision support system (DSS) workload accessing an Oracle database
  • An online analytical processing (OLAP) workload accessing two SQL Server analysis and reporting databases
  • Ten development/test database copies for each of the Oracle and SQL Server OLTP and five development/test copies of the SAP/Oracle system (25 total copies)

The following graphic helps visualize the landscape:

Screen Shot 2016-08-03 at 7.59.16 AM

The following graphic shows an example of one of the test scenario I/O performance metrics discussed in the paper:

Screen Shot 2016-08-03 at 8.01.03 AM

I encourage you to click the following link to download the paper: VCE Solutions for Enterprise Mixed Workloads on Vblock System 540

Expecting Sum-Of-Parts Performance From Shared Solid State Storage? I Didn’t Think So. Neither Should Exadata Customers. Here’s Why.


Last month I had the privilege of delivering the keynote session to the quarterly gathering of Northern California Oracle User Group. My session was a set of vignettes in a theme regarding modern storage advancements. I was mistaken on how much time I had for the session so I skipped over a section about how we sometimes still expect systems performance to add up to a sum of its parts. This blog post aims to dive into this topic.

To the best of my knowledge there is no marketing literature about XtremIO Storage Array that suggests the array performance is due to the number of solid state disk (SSD) drives found in the device. Generally speaking, enterprise all-flash storage arrays are built to offer features and performance–otherwise they’d be more aptly named Just a Bunch of Flash (JBOF).  The scope of this blog post is strictly targeting enterprise storage.

Wild, And Crazy, Claims

Lately I’ve seen a particular slide–bearing Oracle’s logo and copyright notice–popping up to suggest that Exadata is vastly superior to EMC and Pure Storage arrays because of Exadata’s supposed unique ability to leverage aggregate flash bandwidth of all flash components in the Exadata X6 family. You might be able to guess by now that I aim to expose how invalid this claim is. To start things off I’ll show a screenshot of the slide as I’ve seen it. Throughout the post there will be references to materials I’m citing.

DISCLAIMER: The slide I am about to show was not a fair use sample of content from and it therefore may not, in fact, represent the official position of Oracle on the matter. That said, these slides do bear logo and copyright! So, then, the slide:


Figure 1

I’ll start by listing a few objections. My objections are always based on science and fact so objecting to content–in particular–is certainly appropriate.

  1. The slide (Figure 1) suggests an EMC XtremIO 4 X-Brick array is limited to 60 megabytes per second per “flash drive.”
    1. Objection: An XtremIO 4 X-Brick array has 100 Solid State Disks (SSD)–25 per X-Brick. I don’t know where the author got the data but it is grossly mistaken. No, a 4 X-Brick array is not limited to 60 * 100 megabytes per second (6,000MB/s). An XtremIO 4 X-Brick array is a 12GB/s array: click here. In fact, even way back in 2014 I used Oracle Database 11g Real Application Clusters to scan at 10.5GB/s with Parallel Query (click here). Remember, Parallel Query spends a non-trivial amount of IPC and work-brokering setup time at the beginning of a scan involving multiple Real Application cluster nodes. That query startup time impacts total scan elapsed time thus 10.5 GB/s reflects the average scan rate that includes this “dead air” query startup time. Everyone who uses Parallel Query Option is familiar with this overhead.
  2. The slide (Figure 1) suggests that 60 MB/s is “spinning disk level throughput.”
    1. Objection: Any 15K RPM SAS (12Gb) or FC hard disk drive easily delivers sequential scan throughput of more than 200 MB/s.
  3. The slide (Figure 1) suggests XtremIO cannot scale out.
    1. Objection: XtremIO architecture is 100% scale out so this indictment is absurd. One can start with a single X-Brick and add up to 7 more. In the current generation scaling out in this fashion with XtremIO adds 25 more SSDs, storage controllers (CPU) and 4 more Fibre Channel ports per X-Brick.
  4. The slide (Figure 1) suggests “bottlenecks at server inputs” further retard throughput when using Fibre Channel.
    1. Objection: This is just silly. There are 4 x 8GFC host-side FC ports per XtremIO X-Brick. I routinely test Haswell-EP 2-socket hosts with 6 active 8GFC ports (3 cards) per host. Can a measly 2-socket host really drive 12 GB/s Oracle scan bandwidth? Yes! No question. In fact, challenge me on that and I’ll show AWR proof of a single 2-socket host sustaining Oracle table scan bandwidth at 18 GB/s. No, actually, I won’t make anyone go to that much trouble. Instead, click the following link for AWR proof that a single host with 2 6-core Haswell-EP (2s12c24t) processors can sustain Oracle Database 12c scan bandwidth of 18 GB/s: click here. I don’t say it frequently enough, but it’s true; you most likely do not know how powerful modern servers are!
  5. The slide (Figure 1) says Exadata achieves “full flash throughput.”
    1. Objection: I’m laughing, but that claim is, in fact, the perfect segue to the next section.
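
The arithmetic in the first objection is easy to reproduce. The sketch below uses only figures quoted above and assumes decimal units (1 GB/s = 1,000 MB/s); the variable names are illustrative.

```python
# Figures quoted in the first objection.
x_bricks = 4
ssds_per_x_brick = 25
slide_mb_per_s_per_ssd = 60      # the per-drive limit the slide claims
array_gb_per_s = 12              # the array's stated bandwidth

ssd_count = x_bricks * ssds_per_x_brick                      # 100 SSDs
slide_total_mb_per_s = ssd_count * slide_mb_per_s_per_ssd    # 6,000 MB/s

# Assumption: decimal units, 1 GB/s = 1,000 MB/s.
array_mb_per_s = array_gb_per_s * 1000
per_ssd_at_array_rate = array_mb_per_s / ssd_count           # 120 MB/s per SSD

print(f"Slide implies {slide_total_mb_per_s:,} MB/s for the whole array")
print(f"Stated array bandwidth is {array_mb_per_s:,} MB/s")
print(f"That averages {per_ssd_at_array_rate:.0f} MB/s per SSD")
```

In other words, the slide's per-drive figure understates the array's stated bandwidth by a factor of two.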

Full Flash Throughput

Scan Bandwidth

The slide in Figure 1 accurately states that the NVMe flash cards in the Exadata X6 model are rated at 5.5GB/s. This can be seen in the F320 datasheet. Click the following link for a screenshot of the F320 datasheet: click here. So the question becomes, can Exadata really achieve full utilization of all of the NVMe flash cards configured in the Exadata X6? The answer is no, but sort of. Please allow me to explain.

The following graph (Figure 2) shows data cited in the Exadata datasheet and depicts the reality of how close a full-rack Exadata X6 comes to realizing full flash potential.

As we know, a full-rack Exadata has 14 storage servers. The High Capacity (HC) model has 4 NVMe cards per storage server purposed as a flash cache. The HC model also comes with 12 7,200 RPM hard drives per storage server as per the datasheet.

The following graph shows that yes, indeed Exadata X6 does realize full flash potential when performing a fully-offloaded scan (Smart Scan). After all, 4 * 14 * 5.5 is 308 and the datasheet cites 301 GB/s scan performance for the HC model. This is fine and dandy but it means you have to put up with 168 (12 * 14) howling 7,200 RPM hard disks if you are really intent on harnessing the magic power of full-flash potential!

Why the sarcasm? It’s simple really–just take a look at the graph and notice that the all-flash EF model realizes just a slight bit more than 50% of the full flash (aggregate) performance potential. Indeed, the EF model has 14 * 8 * 5.5 == 616 GB/s of full potential available–but not realizable.

No, Exadata X6 does not–as the above slide (Figure 1) suggests–harness the full potential of flash. Well, not unless you’re willing to put up with 168 round, brown, spinning thingies in the configuration. Ironically, it’s the HDD-Flash hybrid HC model that enjoys the “full flash potential.” I doubt the presenter points this bit out when slinging the slide shown in Figure 1.
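
The figures behind this comparison can be laid out as a short calculation. The sketch uses only numbers cited above; the EF model's datasheet scan rate is not quoted in the text, so it is deliberately left out rather than guessed.

```python
# Figures cited above for a full-rack Exadata X6.
storage_servers = 14
card_gb_per_s = 5.5              # F320 NVMe card rating
hc_cards_per_server = 4
ef_cards_per_server = 8
hc_hdds_per_server = 12
hc_datasheet_scan_gb_per_s = 301

hc_potential = storage_servers * hc_cards_per_server * card_gb_per_s   # 308.0 GB/s
ef_potential = storage_servers * ef_cards_per_server * card_gb_per_s   # 616.0 GB/s
hc_hdd_count = storage_servers * hc_hdds_per_server                    # 168 drives

hc_fraction = hc_datasheet_scan_gb_per_s / hc_potential

print(f"HC aggregate flash potential: {hc_potential:.0f} GB/s")
print(f"HC datasheet scan rate:       {hc_datasheet_scan_gb_per_s} GB/s ({hc_fraction:.1%} of potential)")
print(f"EF aggregate flash potential: {ef_potential:.0f} GB/s")
print(f"Hard drives in the HC rack:   {hc_hdd_count}")
```

The HC model's datasheet rate comes within a few percent of its aggregate flash potential, while the EF model has twice the potential to live up to.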


Figure 2


The slide in Figure 1 doesn’t actually suggest that Exadata X6 achieves full flash potential for IOPS, but since these people made me crack open the datasheets and use my brain for a moment or two I took it upon myself to do the calculations. The following graph (Figure 3) shows the delta between full flash IOPS potential for the full-rack HC and EF Exadata X6 models using data taken from the Exadata datasheet.

No…Exadata X6 doesn’t realize full flash potential in terms of IOPS either.


Figure 3


Here is a link to the full slide deck containing the slide (Figure 1) I focused on in this post:

Just in case that copy of the deck disappears, I pushed a copy up to the Wayback Machine: click here.


XtremIO Storage Array literature does not suggest that the performance characteristics of the array are a simple product of how many component SSDs the array is configured with. To the best of my knowledge neither does Pure Storage suggest such a thing.

Oracle shouldn’t either. I have now made that point crystal clear.

You Scratch Your Head And Ponder Why It Is You Go With Maximum Core Count Xeons. I Can’t Explain That, But This Might Help.

Folks that have read my blog for very long know that I routinely point out that Intel Xeon processors with fewer cores (albeit same TDP) get more throughput per core. Recently I had the opportunity to do some testing of a 2-socket host with 6-core Haswell EP Xeons (E5-2643v3) connected to networked all-flash storage. This post is about host capability so I won’t be elaborating on the storage. I’ll say that it was block storage, all-flash and networked.

Even though I test myriads of systems with modern Xeons it isn’t often I get to test the top-bin parts that aren’t core-packed.  The Haswell EP line offers up to 18-core parts in a 145w CPU.  This 6-core part is 135w and all cores clock up to 3.7GHz–not that clock speed is absolutely critical for Oracle Database performance mind you.

Taking It For a Spin

When testing for Oracle OLTP performance the first thing to do is measure the platform’s ability to deliver random single-block reads (db file sequential read). To do so I loaded 1TB scale SLOB 2.3 in the single-schema model. I did a series of tests to find a sweet-spot for IOPS which happened to be at 160 sessions. The following is a snippet of the AWR report from a 5-minute SLOB run with UPDATE_PCT=0. Since this host has a total of 12 cores I should think 8KB read IOPS of 625,000 per second will impress you. And, yes, these are all db file sequential reads.


At 52,093 IOPS per CPU core I have to say this is the fastest CPU I’ve ever tested. It takes a phenomenal CPU to handle this rate of db file sequential read payload. So I began to wonder how this would compare to other generations of Xeons. I immediately thought of the Exadata Database Machine data sheets.
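
The per-core figure is a simple division. The sketch below starts from the rounded 625,000 IOPS quoted above, so it lands slightly below 52,093; the assumption here is that the per-core number was derived from the unrounded AWR value.

```python
# Host description from above: 2 sockets x 6 cores.
sockets = 2
cores_per_socket = 6
total_cores = sockets * cores_per_socket        # 12 cores

rounded_iops = 625_000                          # rounded figure quoted in the text
iops_per_core = rounded_iops / total_cores      # about 52,083

# Working backwards from the quoted per-core figure (assumes it came from
# the unrounded AWR number rather than from 625,000).
quoted_iops_per_core = 52_093
implied_total_iops = quoted_iops_per_core * total_cores   # 625,116

print(f"{rounded_iops:,} IOPS / {total_cores} cores = {iops_per_core:,.0f} IOPS per core")
print(f"{quoted_iops_per_core:,} IOPS per core x {total_cores} cores = {implied_total_iops:,} IOPS")
```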

Before I share some comparisons I’d like to point out that there was a day when the Exadata data sheets made it clear that IOPS through the Oracle Database buffer cache costs CPU cycles–and, in fact, CPU is often the limiting factor. The following is a snippet from the Exadata Database Machine X2 data sheet that specifically points out that IOPS are generally limited by CPU. I/O buffered–and cached–in application shared memory is a CPU problem even if the buffers are never snooped.  That is, in fact, why I invented SLOB way back in the early 1990s. I’ve never seen an I/O testing kit that can achieve more IOPS per DB CPU than is possible with SLOB.


Oracle stopped using this footnote in the IOPS citations for Exadata Database Machine starting with the X3 generation. I have no idea why they stopped using this correct footnote. Perhaps they thought it was a bit like stating the obvious. I don’t know. Nonetheless, it is true that host CPU is a key limiting factor in a platform’s ability to cycle IOPS through the SGA. As an aside, please refer to this post about calibrate_io for more information about the processor ramifications of SGA versus PGA IOPS.

So, in spite of the fact that Oracle has stopped stating the limiting nature of host CPU on IOPS, I will simply assert the fact in this blog post. Quote me on this:

Everything is a CPU problem

And cycling IOPS through the Oracle SGA is a poster child for my quotable quote.

I think the best way to make my point is to simply take the data from the Exadata Database Machine data sheets and put it in a table that has a row for my E5-2643v3 results as well. Pictures speak thousands of words. And here you go:


AWR Report

If you’d like to read the full AWR report from the E5-2643v3 SLOB test that achieved 625,000 IOPS please click on the following link: AWR (click here).


X2 data sheet
X3 data sheet
X4 data sheet
X5 data sheet
X6 data sheet


Yes, You Must Use CALIBRATE_IO. No, You Mustn’t Use It To Test Storage Performance.

I occasionally get questions from customers and colleagues about performance expectations for the Oracle Database procedure called calibrate_io on XtremIO storage. This procedure must be executed in order to update the data dictionary. I assert, however, that it shouldn’t be used to measure platform suitability for Oracle Database physical I/O. The main reason I say this is because calibrate_io is a black box, as it were.

The procedure is, indeed, documented so it can’t possibly be a “black box”, right? Well, consider the fact that the following eight words are the technical detail provided in the Oracle documentation regarding what calibrate_io does:

This procedure calibrates the I/O capabilities of storage.

OK, I admit it. I’m being too harsh. There is also this section of the Oracle documentation that says a few more words about what this procedure does but not enough to make it useful as a platform suitability testing tool.

A Necessary Evil?

Yes, you must run calibrate_io. The measurements gleaned by calibrate_io are used by the query processing runtime (specifically involving Auto DOP functionality). The way I think of it is similar to how I think of gathering statistics for CBO. Gathering statistics generates I/O but I don’t care about the I/O it generates. I only care that CBO might have half a chance of generating a reasonable query plan given a complex SQL statement, schema and the nature of the data contained in the tables. So yes, calibrate_io generates I/O—and this, like I/O generated when gathering statistics, is I/O I do not care about. But why?

Here are some facts about the I/O generated by calibrate_io:

  • The I/O is 100% read
  • The reads are asynchronous
  • The reads are buffered in the process heap (not shared buffers in the SGA)
  • The code doesn’t even peek into the contents of the blocks being read!
  • There is limited control over what tablespaces are accessed for the I/O
  • The results are not predictable
  • The results are not repeatable

My Criticisms

Having provided the above list of calibrate_io characteristics, I feel compelled to elaborate.

About Asynchronous I/O

My main issue with calibrate_io is it performs single-block random reads with asynchronous I/O calls buffered in the process heap. This type of I/O has nothing in common with the main reason random single-block I/O is performed by Oracle Database. The vast majority of single-block random I/O is known as db file sequential read—which is buffered in the SGA and is synchronous I/O. The wait event is called db file sequential read because each synchronous call to the operating system is made sequentially, one after the other by foreground processes. But there is more to SGA-buffered reads than just I/O.
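The difference between the two call patterns can be sketched outside the database. This is only an illustration, assuming a Unix-like host, an 8 KB block size and a small scratch file standing in for a datafile. The thread pool is my stand-in for asynchronous I/O (Python's standard library has no direct equivalent of the operating system's asynchronous read interface), and none of the names below come from Oracle or from SLOB.

```python
import os
import random
import tempfile
from concurrent.futures import ThreadPoolExecutor

BLOCK_SIZE = 8192   # assumed database block size (8 KB)
NUM_BLOCKS = 256    # size of the scratch file, in blocks (hypothetical)


def make_scratch_file() -> str:
    """Create a small scratch file that stands in for a datafile."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        f.write(os.urandom(BLOCK_SIZE * NUM_BLOCKS))
    return path


def read_block(fd: int, block_no: int) -> bytes:
    """One single-block read at a block-aligned offset."""
    return os.pread(fd, BLOCK_SIZE, block_no * BLOCK_SIZE)


def sequential_sync_reads(fd, block_nos):
    """Issue one blocking read, wait for it to finish, then issue the next."""
    return [read_block(fd, b) for b in block_nos]


def overlapped_reads(fd, block_nos, in_flight=8):
    """Keep several reads outstanding at once (a stand-in for async I/O)."""
    with ThreadPoolExecutor(max_workers=in_flight) as pool:
        return list(pool.map(lambda b: read_block(fd, b), block_nos))


def demo():
    """Run both patterns over the same random block list and compare them."""
    path = make_scratch_file()
    fd = os.open(path, os.O_RDONLY)
    try:
        rng = random.Random(42)
        block_nos = [rng.randrange(NUM_BLOCKS) for _ in range(1000)]
        one_at_a_time = sequential_sync_reads(fd, block_nos)
        many_in_flight = overlapped_reads(fd, block_nos)
        return one_at_a_time == many_in_flight, len(one_at_a_time)
    finally:
        os.close(fd)
        os.remove(path)


if __name__ == "__main__":
    print(demo())
```

Both functions return the same bytes. What differs is how many requests the storage sees at any instant: the first never has more than one outstanding per process, which is the pattern behind db file sequential read.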

About Server Metadata and Mutual Exclusion

Wrapped up in SGA-buffered I/O is all the necessary overhead of shared-cache management. Oracle can’t just plop a block of data from disk in the SGA and expect that other processes will be able to locate it later. When a process is reading a block into the SGA buffer cache it has to navigate spinlocks for the protected cache contents metadata known as cache buffers chains. Cache buffers chains tracks what blocks are in the buffer cache by their on-disk address.  Buffer caches, like that in the SGA, also need to track the age of buffers. Oracle processes can’t just use any shared buffer. Oracle maintains buffer age in metadata known as cache buffers lru—which is also spinlock-protected metadata.

All of this talk about server metadata means that as the rate of SGA buffer cache block replacement increases—with newly-read blocks from storage—there is also increased pressure on these spinlocks. In other words, faster storage means more pressure on CPU. Scaling spinlocks is a huge CPU problem. It always has been—and even more so on NUMA systems. Testing I/O performance without also involving these critical CPU-intensive code paths provides false comfort when trying to determine platform suitability for Oracle Database.
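To make the bookkeeping concrete, here is a toy model of the two read paths. It is a conceptual sketch only and not Oracle's implementation: the class names, the eight hash chains and the four-buffer capacity are all hypothetical, and ordinary Python locks stand in for spinlocks. The lock counter is only meaningful in a single-threaded run like the one shown.

```python
import threading
from collections import OrderedDict


class CountingDisk:
    """Stand-in for storage; counts how many physical reads were issued."""

    def __init__(self):
        self.reads = 0

    def read(self, block_addr):
        self.reads += 1
        return f"block-{block_addr}".encode()


class PrivateBuffer:
    """PGA-style read: the block lands in memory owned by one process."""

    def __init__(self, disk):
        self.disk = disk
        self.lock_acquisitions = 0  # stays 0: nothing is shared

    def read(self, block_addr):
        return self.disk.read(block_addr)


class SharedBufferCache:
    """SGA-style read: lock-protected hash chains plus a lock-protected LRU."""

    def __init__(self, disk, capacity=4, num_chains=8):
        self.disk = disk
        self.capacity = capacity
        self.chains = [dict() for _ in range(num_chains)]
        self.chain_locks = [threading.Lock() for _ in range(num_chains)]
        self.lru = OrderedDict()
        self.lru_lock = threading.Lock()
        self.lock_acquisitions = 0

    def _chain(self, block_addr):
        return hash(block_addr) % len(self.chains)

    def read(self, block_addr):
        c = self._chain(block_addr)
        # Locate (or install) the block under the lock for its hash chain.
        with self.chain_locks[c]:
            self.lock_acquisitions += 1
            data = self.chains[c].get(block_addr)
            if data is None:
                data = self.disk.read(block_addr)
                self.chains[c][block_addr] = data
        # Record buffer age, and pick a victim if the cache is over capacity.
        victim = None
        with self.lru_lock:
            self.lock_acquisitions += 1
            self.lru[block_addr] = True
            self.lru.move_to_end(block_addr)
            if len(self.lru) > self.capacity:
                victim, _ = self.lru.popitem(last=False)
        # Unlink the aged-out block from its own hash chain.
        if victim is not None:
            v = self._chain(victim)
            with self.chain_locks[v]:
                self.lock_acquisitions += 1
                self.chains[v].pop(victim, None)
        return data


def run_demo():
    """Drive both paths with the same access pattern and report the counters."""
    pattern = [1, 2, 3, 1, 2, 3, 4, 5, 6, 1]
    shared = SharedBufferCache(CountingDisk(), capacity=4)
    private = PrivateBuffer(CountingDisk())
    for addr in pattern:
        shared.read(addr)
        private.read(addr)
    return (shared.lock_acquisitions, private.lock_acquisitions,
            shared.disk.reads, private.disk.reads)


if __name__ == "__main__":
    print(run_demo())
```

The shared path pays for metadata on every read, hit or miss, and pays again whenever a buffer is aged out. The private path pays for none of it, which is exactly why an I/O test that bypasses the shared cache looks cheaper per I/O than real SGA-buffered work.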

Since applications do not drive random single-block asynchronous reads in Oracle Database, why measure it? I say don’t! Yes, execute calibrate_io, for reasons related to Auto DOP functionality, but not for a relevant reading of storage subsystem performance.

About User Data

This is one that surprises me quite frequently. It astounds me how quick some folks are to dismiss the importance of test tools that access user data. Say what?  Yes, I routinely point out that neither calibrate_io nor Orion access the data that is being read from storage. All Orion and calibrate_io do is perform the I/O and let the data in the buffer remain untouched.  It always seems strange to me when folks dismiss the relevance of this fact. Is it not database technology we are talking about here? Databases store your data. When you test platform suitability for Oracle Database I hold fast that it is best to 1) use Oracle Database (thus an actual SQL-driven toolkit as opposed to an external kit like Orion or fio or vdbench or any other such tool) and 2) that the test kit access rows of data in the blocks! I’m just that way.

Of course, SLOB (and other SQL-driven test kits such as Swingbench) do indeed access rows of data. Swingbench handily tests Oracle Database transaction capabilities and SLOB uses SQL to perform maximum I/O per host CPU cycle. Different test kits for different testing.

A Look At Some Testing Results

The first thing about calibrate_io I’ll discuss in this section is how the user is given no control or insight into what data segments are being accessed. Consider the following screenshot which shows:

  1. Use of the calibrate.sql script found under the misc directory in the SLOB kit (SLOB/misc/calibrate.sql) to achieve 371,010 peak IOPS and zero latency. This particular test was executed with a Linux host attached to an XtremIO array. Um, no, the actual latencies are not zero.
  2. I then created a 1TB tablespace. What is not seen in the screenshot is that all the tablespaces in this database are stored in an ASM disk group consisting of 4 XtremIO volumes. So the tablespace called FOO resides in the same ASM disk group. The ASM disk group uses external redundancy.
  3. After adding a 1TB tablespace to the database I once again executed calibrate_io and found that the IOPS increased 13% and latencies remained at zero. Um, no, the actual latencies are not zero!
  4. I then offlined the tablespace called FOO and executed calibrate_io to find that IOPS fell back to within 1% of the first sample.
  5. Finally, I onlined the tablespace called FOO and the IOPS came back to within 1% of the original sample that included the FOO tablespace.

A Black Box

My objection to this result is that calibrate_io is a black box. I’m left with no way to understand why adding a 1TB tablespace improved IOPS. After all, the tablespace was created in the same ASM disk group consisting of block devices provisioned from an all-flash array (XtremIO). There is simply no storage-related reason for the test result to improve as it did.


More IOPS, More Questions. I Prefer Answers.

I decided to spend some time taking a closer look at calibrate_io but since I wanted more performance capability I moved my testing to an XtremIO array with 4 X-Bricks and used a 2-Socket Xeon E5-2699v3 (HSW-EP 2s36c72t) server to drive the I/O.

The following screenshot shows the result of calibrate_io. This test configuration yielded 572,145 IOPS and, again, zero latency. Um, no, the real latency is not zero. The latencies are sub-millisecond though. The screenshot also shows the commands in the SLOB/misc/calibrate.sql file. The first two arguments to DBMS_RESOURCE_MANAGER.CALIBRATE_IO are “in” parameters. The value seen for parameter 2 is not the default. The next section of this blog post shows a variety of testing with varying values assigned to these parameters.


As per the documentation, the first parameter to calibrate_io is “approximate number of physical disks” being tested and the second parameter is “the maximum tolerable latency in milliseconds” for the single-block I/O.


As the table above shows I varied the “approximate number of physical disks” from 1 to 10,000 and the “maximum tolerable latency” from 10 to 20 and then 100. For each test I measured the elapsed time.
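A sweep like this is easy to script. The sketch below only renders the SQL*Plus text for each combination, it does not connect to a database, and it is not the contents of SLOB/misc/calibrate.sql. The parameter names follow the documented signature of DBMS_RESOURCE_MANAGER.CALIBRATE_IO. The function names are hypothetical, and I use only the two endpoint disk counts mentioned above because the intermediate values are not given here.

```python
from itertools import product

# Anonymous PL/SQL block, wrapped in SQL*Plus settings so that elapsed time
# and the three "out" values are printed for every run.
PLSQL_TEMPLATE = """\
SET SERVEROUTPUT ON
SET TIMING ON
DECLARE
  iops PLS_INTEGER;
  mbps PLS_INTEGER;
  lat  PLS_INTEGER;
BEGIN
  DBMS_RESOURCE_MANAGER.CALIBRATE_IO(
    num_physical_disks => {disks},
    max_latency        => {latency},
    max_iops           => iops,
    max_mbps           => mbps,
    actual_latency     => lat);
  DBMS_OUTPUT.PUT_LINE('max_iops=' || iops ||
                       ' max_mbps=' || mbps ||
                       ' actual_latency=' || lat);
END;
/
"""


def render_calibrate_block(disks: int, latency: int) -> str:
    """Render one calibrate_io invocation for the two "in" parameters."""
    if disks < 1 or latency < 1:
        raise ValueError("disks and latency must be positive integers")
    return PLSQL_TEMPLATE.format(disks=disks, latency=latency)


def sweep(disk_counts, latencies):
    """Map every (disks, latency) combination to its rendered script text."""
    return {
        (d, l): render_calibrate_block(d, l)
        for d, l in product(disk_counts, latencies)
    }


if __name__ == "__main__":
    # Endpoints of the disk range and the three latency values from the text.
    for key, script in sweep((1, 10_000), (10, 20, 100)).items():
        print(f"-- disks={key[0]} latency={key[1]}")
        print(script)
```

Each rendered script can be saved and run with sysdba privilege, one at a time, since calibrate_io takes minutes per run and should not overlap with another calibration.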

The results show us that the test requires twice the elapsed time with 1 approximate physical disk as it does with 10,000 approximate physical disks. This is a nonsensical result but without any documentation on what calibrate_io actually does we are simply left scratching our heads. Another oddity is that with 10,000 approximate disks the throughput in megabytes per second is reduced by nearly 40% and that is without regard for the “tolerable latency” value. This is clearly a self-imposed limit within calibrate_io but why is the big question.

I’ll leave you, the reader, to draw your own conclusions about the data in the table. However, I use the set of results with “tolerable latency” set to 20 as validation for one of my indictments above. I stated calibrate_io is not predictable. Simply look at the set of results in the 20 “latency” parameter case and you too will conclude calibrate_io is not predictable.

So How Does CALIBRATE_IO Compare To SLOB?

I get this question quite frequently. Jokingly I say it compares in much the same way a chicken compares to a snake. They both lay eggs. Well, I should say they both perform I/O.

I wrote a few words above about how calibrate_io uses asynchronous I/O calls to test single-block random reads. I also have pointed out that SLOB performs the more correct synchronous single block reads. There is, however, an advanced testing technique many SLOB users employ to test PGA reads with SLOB as opposed to the typical SLOB reads into the SGA. What’s the difference? Well, revisit the section above where I discuss the server metadata management overhead related to reading blocks into the SGA. If you tweak SLOB to perform full scans you will test the flow of data through the PGA and thus the effect of eliminating all the shared-cache overhead. The difference is dramatic because, after all, “everything is a CPU problem.”

In a subsequent blog post I’ll give more details on how to configure SLOB for direct path with single-block reads!

To close out this blog entry I will show a table of test results comparing some key time model data. I collected AWR reports when calibrate_io was running as well as SLOB with direct path reads and then again with the default SLOB with SGA reads. Notice how the direct path SLOB increased IOPS by 19% just because blocks flowed through the PGA as opposed to the SGA. Remember, both of the SLOB results are 100% single-block reads. The only difference is the cache management overhead is removed. This is clearly seen by the difference in DB CPU. When performing the lightweight PGA reads the host was able to drive 29,884 IOPS per DB CPU but the proper SLOB results (SGA buffered) show the host could only drive 19,306 IOPS per DB CPU. Remember, DB CPU represents processor thread utilization on a threaded processor. These results are from a 2s36c72t (HSW-EP) so these figures could also be stated as per DB CPU or per CPU thread.
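The arithmetic behind those two per-CPU figures is worth spelling out. The snippet below uses only the numbers quoted above. The implied DB CPU ratio at the end is my own derivation and assumes the 19% IOPS gain and the two per-DB-CPU figures come from the same pair of runs.

```python
# IOPS per DB CPU, as quoted in the text.
PGA_IOPS_PER_DB_CPU = 29_884   # direct path (PGA) single-block reads
SGA_IOPS_PER_DB_CPU = 19_306   # default SLOB (SGA-buffered) reads

# Raw IOPS gain of the direct path run over the SGA run, as quoted.
IOPS_GAIN = 1.19


def pct_more(a: float, b: float) -> float:
    """How much larger a is than b, as a percentage."""
    return (a / b - 1.0) * 100.0


# Per-CPU efficiency gap between the two read paths.
efficiency_gap_pct = pct_more(PGA_IOPS_PER_DB_CPU, SGA_IOPS_PER_DB_CPU)

# Derived: IOPS = (IOPS per DB CPU) x (DB CPU), so the DB CPU consumed by the
# direct path run relative to the SGA run is the ratio of the two ratios.
implied_db_cpu_ratio = IOPS_GAIN / (PGA_IOPS_PER_DB_CPU / SGA_IOPS_PER_DB_CPU)

if __name__ == "__main__":
    print(f"PGA path drives {efficiency_gap_pct:.1f}% more IOPS per DB CPU")
    print(f"implied DB CPU, direct path vs SGA: {implied_db_cpu_ratio:.2f}x")
```

In other words, the 19% headline gain understates the difference. Per unit of DB CPU the lightweight path does roughly half again as much I/O, which is the cache management overhead showing up as processor time.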

If you are testing platform suitability for Oracle it’s best to not use a test kit that is artificially lightweight. Your OLTP/ERP application uses the SGA, so test that!

The table also shows that calibrate_io achieved the highest IOPS but I don’t care one bit about that–because it isn’t true database I/O.


AWR Reports

I’d like to offer the following links to the full AWR reports summarized in the above table:

Additional Reading


Use calibrate_io. Just don’t use it to test platform suitability for Oracle Database.

Is SLOB AWR Generation Really, Really, Really Slow on Oracle Database? Yes, Unless…

If you are testing SLOB and find that the AWR report generation phase is taking an inordinate amount of time (e.g., more than 10 seconds) then please be aware that, in the SLOB/awr subdirectory, there is a remedy script rightly called 11204-awr-stall-fix.sql.

Simply execute this script when connected to the instance with sysdba privilege and the problem will be solved.


Performance Data Visualization for SLOB. The SLOB Expert Community is Vibrant!

Thanks to Nikolay Savvinov (@oradiag) for his excellent post on how to wrap his scripts around the SLOB test driver to capture and produce performance data visualization graphs. I recommend a visit to his post here:

Performance Data Visualization with SLOB


As always, the link for SLOB is: Obtain the SLOB Kit and Helpful Information Here


I work for Amazon Web Services. The opinions I share in this blog are my own. I'm *not* communicating as a spokesperson for Amazon. In other words, I work at Amazon, but this is my own opinion.

All content is © Kevin Closson and "Kevin Closson's Blog: Platforms, Databases, and Storage", 2006-2015. Unauthorized use and/or duplication of this material without express and written permission from this blog’s author and/or owner is strictly prohibited. Excerpts and links may be used, provided that full and clear credit is given to Kevin Closson and Kevin Closson's Blog: Platforms, Databases, and Storage with appropriate and specific direction to the original content.
