Yes, Host Aggregate I/O Queue Depth is Important. But Why Overdo It When Using All-Flash Array Technology? Complexity is Sometimes a Choice.

Blog update: Part II is available. Please click the following link after you’ve finished this post: click here.

That’s The Way We’ve Always Done It

I recently updated the EMC best practices guide for Oracle Database on XtremIO. One of the topics in that document is how many host LUNs (mapped to XtremIO storage array volumes) administrators should use for each ASM disk group. While performing the testing for the best practices guide, it dawned on me that this topic is suitable for a blog post. I think too many DBAs are still using the ASM disk group methodology that made sense with mechanical storage. With All Flash Arrays–like XtremIO–administrators can rethink the complexities of the way they’ve always done it–as the adage goes.

Before reading the remainder of the post, please be aware that this is the first installment in a short series about host LUN count and ASM disk groups in all-flash environments. Future posts will explore additional reasons why simple ASM disk groups make a lot of sense in all-flash environments.

How Many Host LUNs are Needed With All Flash Array Technology?

We’ve all come to accept that mechanical storage generally offers higher latency than solid-state storage such as an All Flash Array. Higher-latency storage requires more aggregate host I/O queue depth in order to sustain high IOPS: the longer an I/O takes to complete, the longer requests linger in a queue.
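That relationship is Little’s Law: sustainable IOPS equals outstanding I/O count divided by average service time. A minimal sketch of the arithmetic (the latency figures here are illustrative round numbers, not measurements from this test):

```python
# Little's Law: IOPS ceiling = outstanding I/Os / average service time (seconds)
def max_iops(queue_depth: int, latency_us: float) -> float:
    """Upper bound on IOPS sustainable at a given aggregate host queue depth."""
    return queue_depth / (latency_us / 1_000_000)

# One mechanical-storage LUN at queue depth 30 and ~8ms service time:
print(max_iops(30, 8_000))  # 3750.0 -- hence the historical many-LUN disk groups
# One all-flash LUN at the same queue depth and ~500us service time:
print(max_iops(30, 500))    # 60000.0 -- far fewer LUNs for the same IOPS
```

The arithmetic is the whole point: cutting service time by an order of magnitude cuts the aggregate queue depth, and therefore the LUN count, needed for a given IOPS target by the same order.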

With mechanical storage it is not at all uncommon to construct an ASM disk group with over 100 (or hundreds of) ASM disks. That may not sound too complex to the lay person, but that’s only a single ASM disk group on a single host. The math gets troublesome quite quickly with multiple hosts attached to an array.

So why do DBAs create ASM disk groups consisting of vast numbers of host LUNs after they adopt all-flash technology? Generally, it’s because that’s how it has always been done in their environment. However, there is no technical reason to assemble complex, large disk-count ASM disk groups with storage like XtremIO. With All Flash Array technology, latencies are an order of magnitude (or more) lower than with mechanical storage. Driving even large IOPS rates is possible with very few host LUNs in these environments because the latencies are so low. To put it another way:

With All Flash Array technology host LUN count is strictly a product of how many IOPS your application demands

Lower I/O latency allows administrators to create ASM disk groups with very low numbers of ASM disks. Fewer ASM disks mean fewer block devices, and fewer block devices mean a simpler physical storage layout. Simpler is always better–especially in modern, complex IT environments.

Case Study

In order to illustrate the relationship between concurrent I/O and host I/O queue depth, I conducted a series of tests that I’ll share in the remainder of this blog post.

The testing consisted of varying the number of ASM disks in a disk group from 1 to 16 host LUNs, each mapped to an XtremIO volume. SLOB was executed with varying numbers of zero-think-time sessions from 80 to 480, with slob.conf->UPDATE_PCT set to 0 and 20. The SLOB scale was 1TB using the SLOB Single-Schema Model. The array was a 4 X-Brick XtremIO array connected to a single 2s36c72t Xeon server running single-instance Oracle Database 12c on Linux 7. The default Oracle Database block size (8KB) was used.
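For readers who want to approximate the setup, the relevant slob.conf parameters looked roughly like this. This is a sketch: only UPDATE_PCT, the session counts, and the 1TB scale are stated above, so treat the remaining values as assumed defaults.

```shell
# slob.conf excerpt (illustrative; only UPDATE_PCT and SCALE are taken from
# the test description above -- the rest are assumed SLOB defaults)
UPDATE_PCT=0           # set to 20 for the read/write test series
SCALE=1T               # 1TB active data set, Single-Schema Model
THINK_TM_FREQUENCY=0   # zero think time, as in the tests
RUN_TIME=300           # run duration in seconds (assumed)
```

Session count is passed on the runit.sh command line rather than in slob.conf, which is how the 80 through 480 session sweeps were driven.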

Please note: Read Latencies in the graphics below are db file sequential read wait event averages taken from AWR reports and therefore reflect host I/O queueing time. The array-level service times are not visible in these graphics. However, one can intuit such values by observing the db file sequential read latency improvements when host I/O queue depth increases. That is, when host queueing is minimized the true service times of the array are more evident.

Test Configuration HBA Information

The host was configured with 8 Emulex LightPulse 8GFC HBA ports. HBA queue depth was configured in accordance with the XtremIO Storage Array Host Configuration Guide: lpfc_lun_queue_depth=30 and lpfc_hba_queue_depth=8192.
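On Linux, these lpfc driver parameters are typically set via a modprobe options file followed by an initramfs rebuild. A sketch (the file name is a convention and the rebuild command varies by distribution):

```shell
# /etc/modprobe.d/lpfc.conf -- Emulex lpfc queue depth settings per the
# XtremIO host configuration guide (file name is a convention, not mandated)
options lpfc lpfc_lun_queue_depth=30 lpfc_hba_queue_depth=8192

# Rebuild the initramfs so the options apply at boot, then verify after reboot:
#   dracut -f && reboot
#   cat /sys/module/lpfc/parameters/lpfc_lun_queue_depth
```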

Test Configuration LUN Sizes

All ASM disks in the testing were 1TB. This means the 1-LUN test had 1TB of total capacity for the datafiles and redo logs, while the 16-LUN test had 16TB. Since the SLOB scale was 1TB, readers might wonder how 1TB of SLOB data plus redo logs can fit in 1TB. XtremIO is a storage array with always-on, inline data reduction services, including compression and deduplication. Oracle data blocks cannot be deduplicated; in this testing it was XtremIO’s array-level compression that allowed a 1TB-scale SLOB data set to be tested in a single 1TB LUN mapped to a 1TB XtremIO volume.

Read-Only Baseline

Figure 1 shows the results of the read-only workload (slob.conf->UPDATE_PCT=0). As the chart shows, Oracle Database performed 174,490 read IOPS (8KB) with average service times of 434 microseconds with only a single ASM disk (host LUN) in the ASM disk group. This I/O rate was achieved with 160 concurrent Oracle sessions. However, when the session count increased from 160 to 320, the single-LUN results show evidence of deep queueing. Although the XtremIO array service times remained constant (a detail that cannot be seen in the chart), the limited aggregate I/O queue depth caused the db file sequential read waits at 320, 400 and 480 sessions to increase to 1,882us, 2,344us and 2,767us respectively. Since queueing causes the total I/O wait time to increase, adding sessions does not increase IOPS.

As seen in the 2 LUN group (Figure 1), adding an XtremIO volume (host LUN) to the ASM disk group had the effect of nearly doubling read IOPS in the 160 session test but, once again, deep queueing started to occur in the 320 session case and thus db file sequential read waits approached 1 millisecond—albeit at over 300,000 IOPS. Beyond that point the 2 LUN case showed increasing latency and thus no improvement in read IOPS.

Figure 1 also shows that from 4 LUNs through 16 LUNs latencies remained below 1 millisecond even as read IOPS approached the 520,000 level. With the information in Figure 1, administrators can see that host LUN count in an XtremIO environment is actually determined by how many IOPS your application demands. With mechanical storage administrators were forced to assemble large numbers of host LUNs for ASM disks to accommodate high storage service times. This is not the case with XtremIO.
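A rough sizing sketch follows from the single-LUN result above. The per-LUN rate of roughly 174,000 read IOPS is specific to this test rig; your array, HBA settings, and workload will differ, so treat the default below as a placeholder to be replaced with your own measurement.

```python
import math

def luns_for_target(target_iops: int, per_lun_iops: int = 174_000) -> int:
    """Minimum LUN count for an IOPS target, by naive linear scaling.

    per_lun_iops defaults to the single-LUN, 160-session read result above;
    measure your own per-LUN rate rather than trusting this placeholder.
    """
    return max(1, math.ceil(target_iops / per_lun_iops))

print(luns_for_target(100_000))  # 1 -- most databases fit in a single LUN
print(luns_for_target(520_000))  # 3 by linear estimate; Figure 1 shows 4 LUNs
                                 # were what held latency under 1ms at this rate
```

Note the naive estimate slightly undercounts at the high end, which is why measuring (as in Figure 1) beats extrapolating.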


Figure 1

Read / Write Test Results

Figure 2 shows measured IOPS and service times based on the slob.conf->UPDATE_PCT=20 testing. The IOPS values shown in Figure 2 are the combined foreground and background process read and write IOPS. The I/O ratio was very close to 80:20 (read:write) at the physical I/O level. As was the case in the 100% SELECT workload testing, the 20% UPDATE testing was also conducted with varying Oracle Database session counts and host LUN counts. Each host LUN mapped to an XtremIO volume.

Even with moderate SQL UPDATE workloads, the top Oracle wait event will generally be db file sequential read when the active data set is vastly larger than the SGA block buffer pool—as was the case in this testing. As such, the key performance indicator shown in the chart is db file sequential read.

As was the case in the read-only testing, this series of tests also shows that significant amounts of database physical I/O can be serviced with low latency even when a single host LUN is mapped to a single XtremIO volume. Consider, for example, the 160 session count test with a single LUN where 130,489 IOPS were serviced with db file sequential read wait events serviced in 754 microseconds on average. The positive effect of doubling host aggregate I/O queue depth can be seen in Figure 2 in the 2 LUN portion of the graphic.  With only 2 host LUNs the same 160 Oracle Database sessions were able to process 202,931 mixed IOPS with service times of 542 microseconds. The service time decrease from 754 to 542 microseconds demonstrates how removing host queueing allows the database to enjoy the true service times of the array—even when IOPS nearly doubled.
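A quick throughput-times-latency calculation (Little’s Law again) applied to those two data points shows the concurrency effect directly: the number of I/Os actually in flight rose when the second LUN doubled the aggregate queue depth.

```python
def in_flight(iops: float, avg_wait_us: float) -> float:
    """Little's Law: average in-flight I/Os = throughput x average wait time."""
    return iops * avg_wait_us / 1_000_000

# The 160-session, 20% UPDATE data points reported above:
print(round(in_flight(130_489, 754)))  # ~98 I/Os in flight on 1 LUN
print(round(in_flight(202_931, 542)))  # ~110 in flight on 2 LUNs
```

More I/Os in flight at lower average wait time is exactly what relieving a host-side queueing bottleneck looks like.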

With the data provided in Figures 1 and 2, administrators can see that it is safe to configure ASM disk groups with very few host LUNs mapped to XtremIO storage array volumes, making for a simpler deployment. Only those databases demanding significant IOPS need to be created in ASM disk groups with larger numbers of host LUNs.


Figure 2

Figure 3 shows a table summarizing the test results. I invite readers to look across their entire IT environment and find the ASM disk groups that sustain IOPS rates requiring more than a single host LUN in an XtremIO environment. Doing so will help readers see how much simpler their environment could be with an all-flash array.


Figure 3

Summary

Everything we know in IT has a shelf-life. Sometimes the way we’ve always done things is no longer the best approach. In the case of deriving ASM disk groups from vast numbers of host LUNs, I’d say All-Flash Array technology like XtremIO should have us rethinking why we retain old, complex ways of doing things.

This post is the first installment in a short series on ASM disk groups in all-flash environments. The next installment will show readers why low host LUN counts can even make adding space to an ASM disk group much, much simpler.

For Part II, please click here.

9 Responses to “Yes, Host Aggregate I/O Queue Depth is Important. But Why Overdo It When Using All-Flash Array Technology? Complexity is Sometimes a Choice.”


  1. ozprem (August 9, 2016 at 4:54 pm)

    Thanks for the brilliant post. Now we can convert those religious arguments to more scientific ones 😉

    I was hoping a punch line in your summary like “Use X devices for a safe approach”. I think X=4 a safe approach to cover the basics.

    • kevinclosson (August 15, 2016 at 9:02 am)

      Hi Prem,

      I don’t agree that it is necessary to start with 4 LUNs “to be safe.” Think about it. It is not that difficult to add an ASM disk if the app is one of the very few that a customer has that might even demand that level of IOPS.

  2. Riggi, Maria (August 10, 2016 at 5:53 am)

    Very interesting research Kevin!

    Maria

  3. Mahmoud Hatem (August 11, 2016 at 3:06 am)

    Hi Kevin,

    Great and helpful post ! Thank you for sharing 🙂

    Have you tested the new multiqueue block layer subsystem (blk-mq) introduced in UEKR4 and if yes have it any remarkable performance impact ?

    I covered it on my blog here (https://mahmoudhatem.wordpress.com/2016/02/08/oracle-uek-4-where-is-my-io-scheduler-none-multi-queue-model-blk-mq/) but sadly was not able to SLOB it because our test server is overused.


