Manly Men Only Deploy Oracle with Fibre Channel – Part 1. Oracle Over NFS is Weird. | Kevin Closson's Blog: Platforms, Databases and Storage

Beware, lots of tongue-in-cheek in this one. If you’re not the least bit interested in storage protocols, saving money, or a helpful formula for safely configuring I/O bandwidth for Oracle, don’t read this.

I was reading Pawel Barut’s Log Buffer #48 when the following phrase caught my attention:

For many of Oracle DBAs it might be weird idea: Kevin Closson is proposing to install Oracle over NFS. He states that it’s cheaper, simpler and will be even better with upcoming Oracle 11g.

Yes, I have links to several of my blog entries about Oracle over NFS on my CFS, NFS, ASM page, but that is not what I want to blog about. I’m blogging specifically about Pawel’s assertion that “it might be a weird idea,” referring to using NAS via NFS for Oracle database deployments.

Weird
I think the most common misconception people have is regarding the performance of such a configuration. True, NFS has a lot of overhead that would surely tax the Oracle server way too much—that is if Oracle didn’t take steps to alleviate the overhead. The primary overhead is in NFS client-side caching. Forget about it. Direct I/O and asynchronous I/O are available to the Oracle server for NFS files with just about every NFS client out there.

Manly Men™ Choose Fibre Channel
I hear it all the time when I’m out in the field or on the phone with prospects. First I see the wheels turning while math is being done in the head. Then, one of those cartoon thought bubbles pops up with the following:

Hold it, that Closson guy must not be a Manly Man™. Did he just say NFS over Gigabit Ethernet? Ugh, I am Manly Man and I must have 4Gb Fibre Channel or my Oracle database will surely starve for I/O!

Yep, I’ve been caught! Gasp, 4Gb has more bandwidth than 1Gb. I have never recommended running a single path to storage though.

Bonding Network Interfaces
Yes, it can be tricky to work out 802.3ad Link Aggregation, but it is more than possible to have double or triple bonded paths to the storage. And yes, scalability of bonded NICs varies, but there is a simplicity and cost savings (e.g., no FCP HBAs or expensive FC switches) with NFS that cannot be overlooked. And, come in closely and don’t tell a soul, you won’t have to think about bonding NICs for Oracle over NFS forever, wink, wink, nudge, nudge.

But, alas, Manly Man doesn’t need simplicity! Ok, ok, I’m just funning around.

No More Wild Guesses
A very safe rule of thumb to keep your Oracle database servers from starving for I/O is:
100Mb I/O per GHz CPU

So, for example, if you wanted to make sure an HP c-Class server blade with 2-socket 2.66 GHz “Clovertown” Xeon processors had sufficient I/O for Oracle, the math would look like this:

12 * 2.66 * 4 * 2 == 255 MB/s

Since the Xeon 5355 is a quad-core processor and the 480c c-Class blade supports two of them, there are 21.28 GHz for the formula. And 100 Mb is about 12 MB (100 megabits divided by 8 bits per byte is 12.5 megabytes, which I round down to 12). So if Manly Man configures, say, two 4Gb FC paths (for redundancy) to the same c-Class blade, he is allocating about 1000 MB/s bandwidth. Simply put, that is expensive overkill. Why? Well, for starters, the blade would be 100% saturated at the bus level if it did anything with 1000 MB/s, so it certainly couldn’t satisfy Oracle performing physical I/O and actually touching the blocks (e.g., filtering, sorting, grouping, etc). But what if Manly Man configured the two 4Gb FCP paths for failover with only one active path (approximately 500 MB/s bandwidth)? That is still overkill.
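For anyone who wants to plug in their own server, here is a minimal Python sketch of the arithmetic above. The function and constant names are mine and purely illustrative, and the 500 MB/s per 4Gb FC path is the rough figure used in this post, not a measurement.

```python
# Sketch of the rule of thumb: provision roughly 100 Mb/s of I/O per GHz of CPU.
MB_PER_GHZ = 12  # 100 megabits / 8 = 12.5 megabytes; the post rounds down to 12


def io_target_mb_per_s(sockets: int, cores_per_socket: int, ghz_per_core: float) -> float:
    """Return the rule-of-thumb I/O bandwidth target in MB/s for one server."""
    total_ghz = sockets * cores_per_socket * ghz_per_core
    return MB_PER_GHZ * total_ghz


# The blade from the example: two quad-core Xeon 5355 sockets at 2.66 GHz.
blade_target = io_target_mb_per_s(sockets=2, cores_per_socket=4, ghz_per_core=2.66)
print(f"Rule-of-thumb target: {blade_target:.0f} MB/s")

# Per-path bandwidth as approximated in the post (an assumption, not a measured value).
FC_4GB_PATH_MB_PER_S = 500
for label, paths in (("one active 4Gb FC path", 1), ("two active 4Gb FC paths", 2)):
    provisioned = paths * FC_4GB_PATH_MB_PER_S
    print(f"{label}: {provisioned} MB/s, {provisioned / blade_target:.1f}x the target")
```

Even the single active path comes out at roughly twice the target, which is the overkill point being made here.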

Now don’t get me wrong. I am well aware that 2 “Clovertown” Xeons running Parallel Query can scoop up 500MB/s from disk without saturating the server. It turns out that simple lightweight scans (e.g., select count(*) ) are about the only Oracle functionality that breaks the rule of 100Mb I/O per GHz CPU. I’ve even proven that countless times such as in this dual processor, single core Opteron 2.8 GHz proof point. In that test I had IBM LS20 blades configured with dual processor, single-core Opterons clocked at 2.8 GHz. So if I plug that into the formula I’d use 5.6 for the GHz figure which supposedly yields 67 MB/s as the throughput at which those processors should have been saturated. However, on page 16 of this paper I show those two little single-core Opterons scanning disk at the rate of approximately 380MB/s. How is that? The formula must be wrong!

No, it’s not wrong. When Oracle is doing a lightweight scan it is doing very, very little with the blocks of data being returned from disk. On the other hand, if you read further in that paper, you’ll see on page 17 that a measly 21MB/s of data loading saturated both processors on a single node, due to the amount of data manipulation required by SQL*Loader. OLTP goes further. Generally, when Oracle is doing OLTP, as few as 3,000 IOps from each processor core will result in total saturation. There is a lot of CPU intensive stuff wrapped around those 3,000 IOps. Yes, it varies, but look at your OLTP workload and take note of the processor utilization when/if the cores are performing on the order of 3,000 IOps each. Yes, I know, most real-world Oracle databases don’t even do 3,000 IOps for an entire server which takes us right back to the point: 100Mb I/O per GHz CPU is a good, safe reference point.
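To put the Opteron numbers side by side, here is a small Python sketch that compares the throughput figures quoted above against the rule-of-thumb value. The variable names are illustrative only. The 380 MB/s and 21 MB/s figures are the ones cited from the paper, nothing new.

```python
# Figures quoted in the post for the dual-processor, single-core 2.8 GHz Opteron blades.
MB_PER_GHZ = 12            # 100 Mb/s per GHz, rounded down to 12 MB/s
total_ghz = 2 * 1 * 2.8    # two sockets, one core each, 2.8 GHz per core
rule_mb_per_s = MB_PER_GHZ * total_ghz  # about 67 MB/s

observed_mb_per_s = {
    "lightweight scan (page 16 of the paper)": 380,
    "SQL*Loader data loading (page 17 of the paper)": 21,
}

for workload, mb_per_s in observed_mb_per_s.items():
    ratio = mb_per_s / rule_mb_per_s
    print(f"{workload}: {mb_per_s} MB/s is {ratio:.1f}x the rule-of-thumb figure")
```

The scan lands well above the rule and the load lands well below it, which is why the rule is best read as a safe provisioning reference point rather than a prediction for any one workload.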

What Does the 800 Pound Gorilla Have To Say?
When it comes to NFS, Network Appliance is the 800lb gorilla. They have worked very hard to get to where they are. See, Network Appliance likely doesn’t care if Manly Man would rather deploy FCP for Oracle instead of NFS since their products do both protocols, and iSCSI too. All told, they may stand to make more money if Manly Man does in fact go with FCP since they may have the opportunity to sell expensive switches too. But, no, Network Appliance dispels the notion that 4Gb (or even 2Gb) FCP for Oracle is a must.

In this NetApp paper about FCP vs iSCSI and NFS, measurements are offered that show equal performance with DSS-style workloads (Figure 4) and only about a 21% deficit when comparing OLTP on NFS to FCP. How’s that? The paper points out that the FCP test was fitted with 2Gb Fibre Channel HBAs and the NFS case had two GbE paths to storage, yet Manly Man only achieved 21% more OLTP throughput. If NFS were so inherently unfit for Oracle, this test case with bandwidth parity would have surely made the point clear. But that wasn’t the case.

If you look at Figure 2 in that paper, you’ll see that the NFS case (with jumbo frames) spent 31.5% of cycles in kernel mode compared to 22.4% in the FCP case. How interesting. The NFS case lost 28% more CPU to kernel mode overhead and delivered 21% less OLTP throughput. Manly Man must surely see that addressing that 28% extra kernel mode overhead associated with NFS will bring OLTP throughput right in line with FCP and:

– NFS is simpler to configure

– NFS can be used for RAC and non-RAC

– NFS is cheaper since GbE is cheaper (per throughput) than FCP

Now isn’t that weird?
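For readers who want to check the kernel-mode figures themselves, here is a short Python sketch using only the two Figure 2 percentages quoted above. How the gap is expressed depends on which case is taken as the baseline. My reading, which is an assumption, is that the 28% corresponds to the gap measured relative to the NFS figure.

```python
# Kernel-mode CPU percentages quoted from Figure 2 of the NetApp paper.
nfs_kernel_pct = 31.5  # NFS with jumbo frames
fcp_kernel_pct = 22.4  # FCP

gap_points = nfs_kernel_pct - fcp_kernel_pct        # absolute gap in percentage points
gap_vs_nfs = gap_points / nfs_kernel_pct * 100      # gap relative to the NFS figure
gap_vs_fcp = gap_points / fcp_kernel_pct * 100      # gap relative to the FCP figure

print(f"Absolute gap: {gap_points:.1f} percentage points")
print(f"Relative to NFS: {gap_vs_nfs:.1f}%")
print(f"Relative to FCP: {gap_vs_fcp:.1f}%")
```

Whichever baseline you prefer, the point stands: the extra kernel-mode time is the piece that needs addressing.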

The 28%.

I can’t tell you how and when the 28% additional kernel-mode overhead gets addressed, but, um, it does. So, Manly Man, time to invent the wheel.

20 Responses to “Manly Men Only Deploy Oracle with Fibre Channel – Part 1. Oracle Over NFS is Weird.”


1 Paweł Barut June 15, 2007 at 5:09 pm

Hello Kevin,

I hope you didn’t get me wrong. I made my assumption based on Oracle installations I’ve seen. In almost all cases it was iSCSI disk arrays. On the other hand, storage architecture is not my strongest point as I’m developing applications rather than deciding on hardware configurations.
Finally, my intention was to get more attention to what you have to say, as I see it as very valuable.

Pawel

P.S. Please correct typing error in my name 🙂

2 kevinclosson June 15, 2007 at 5:50 pm

Pawel,

I’m very sorry for misspelling your name. Also, the part of your blog entry that I found important is the mentality about Oracle over NFS seeming weird. I believe what you report is in fact the case–that this is the sentiment amongst many IT shops…so I wanted to blog that a bit…

3 Mark October 5, 2007 at 10:26 pm

So will Oracle come up with a new license for NAS attached databases which is 21% less?

I won’t hold my breath.

Why is it every innovation in running Oracle which comes from Oracle (RAC, NAS, etc.) results in higher Oracle licensing costs?

4 kevinclosson October 5, 2007 at 11:02 pm

Mark,

Don’t get wrapped up in that 21%. That is NetApp’s number with 10g. I’ve stated elsewhere that 21% is a bit high. I’m more accustomed to 10-15%. No matter, the majority of that difference is due to time spent in kernel mode–cost that is addressed in 11g with Direct NFS.

5 User December 26, 2007 at 9:12 pm

Can you please explain what you mean
100 Mb is about 12 MB (you mentioned just after 12 * 2.66 * 4 * 2 == 255 MB/s)
Thank you,

6 Awesome article - I agree March 7, 2008 at 7:57 pm

Great article. We have over 20 terabytes all running on Netapp, and we have over 6 terabytes in Oracle 10gR2 databases alone, all running on Netapp and Sun Solaris v10.

7 ast March 10, 2010 at 7:16 am

But in certain operations, especially create indexes, NetApp is several times slower than SAN in our case. We’d love to go all the way to NFS since it IS much cheaper and easier to implement. But the performance simply not there, at least not for production. That’s why we use NFS for Dev and Test databases. Since all the servers are blades with identical configuration, the performance comparison makes more sense.

- 8 kevinclosson March 11, 2010 at 11:20 pm
  
  ast,
  
  You can’t say “the performance simply not there” when speaking generically about Oracle on NFS. If you have a NetApp model that is slower than your FC SAN that is not attributable to the storage architecture, but instead, the specific NFS filer you are using and perhaps the plumbing. Can’t compare a single path of GbE to 8GFC for table scans for instance. There is 800% difference. Now, 10GbE to 4GFC (much, much more common than end to end 8GFC) would be a reasonable comparison and much in favor of the 10GbE.
  
9 Alen November 23, 2010 at 1:21 pm

Hello Kevin,

Great article !

Just wondering, the 100Mb I/O per GHz CPU you mentioned in your article is it still a valid figure or perhaps modern CPUs (Westmere) raised the bar ? Do you have some recent figures ?

Thanks.

10 Adam Boliński (@boliniak) January 23, 2017 at 1:44 pm

Hello Kevin, I must add a few words. Your tests were done using standard kNFS or dNFS, but now I’m doing tests using RDMA over Ethernet at 40 or 10 Gb, and the results are very, very good, much better than standard SAN.
Here is the first of these results: https://twitter.com/boliniak/status/815140049327190016
I’m doing more tests and I will send a blog post about it.

Adam

- 11 kevinclosson January 24, 2017 at 12:28 pm
  
  That blog post of mine is old but of historical value.
  
  I do not doubt your results a single bit! The future is RDMA Ethernet (RoCE specifically). Think about the fact that future Xeons will have 4x25Gbit controllers *on the processor die* !!! Fibre Channel will have its sunset. Please do share your results. I’m presuming these will be SLOB results?
  