Manly Men Only Deploy Oracle with Fibre Channel – Part VI. Introducing Oracle11g Direct NFS!

Since December 2006, I’ve been testing Oracle11g NAS capabilities with Oracle’s revolutionary Direct NFS feature. This is a fantastic feature. Let me explain. As I’ve laboriously pointed out in the Manly Man Series, NFS makes life much simpler in the commodity computing paradigm. Oracle11g takes the value proposition further with Direct NFS. I co-authored Oracle’s paper on the topic:

Here is a link to the paper.

Here is a link to the joint Oracle/HP news advisory.

What Isn’t Clearly Spelled Out. Windows Too?
Windows has no native NFS support, in spite of add-ons like SFU and Hummingbird. That doesn't stop Oracle. With Oracle11g, you can mount directories from the NAS device as CIFS shares and Oracle will access them with high availability and performance via Direct NFS. No, not CIFS, Direct NFS. The mounts only need to be visible as CIFS shares during instance startup.
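
To make that concrete, here is a minimal, illustrative oranfstab sketch for a Windows server. The filer name, address, and paths are invented for the example, and if memory serves the file lives in ORACLE_HOME\dbs on Windows; check the 11g documentation for the exact syntax on your platform:

    server: myfiler
    path: 192.168.1.10
    export: /vol/oradata mount: C:\oracle\oradata\orcl

The mount: location is simply where the datafiles appear locally (via the mapped CIFS share at startup); once the instance is up, Direct NFS speaks NFS to the filer directly from the Oracle server processes.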

Who Cares?
Anyone who likes simplicity and cost savings.

The World's Largest Installation of Oracle Databases
…is Oracle's On Demand hosting datacenter in Austin, TX. Folks, that is a NAS shop. They aren't stupid!

Quote Me

The Oracle11g Direct NFS feature is another classic example of Oracle implementing features that offer choices in the Enterprise data center. Storage technologies, such as tiered and clustered storage (e.g., NetApp OnTAP GX, HP Clustered Gateway), give customers choices, yet Oracle is the only commercial database vendor that has done the heavy lifting to make its product work extremely well with NFS. With Direct NFS we get a single, unified connectivity model for both storage and networking and save the cost associated with Fibre Channel. With built-in multi-path I/O for both performance and availability, we have no worries about I/O bottlenecks. Moreover, Oracle Direct NFS supports running Oracle on Windows servers accessing databases stored on NAS devices, even though Windows has no native support for NFS! Finally, simple, inexpensive storage connectivity and provisioning for all platforms that matter in the Grid Computing era!
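
The built-in multi-path I/O mentioned above comes from the oranfstab file rather than from OS-level bonding. A rough sketch for a Linux server with two storage networks (all names and addresses are invented for illustration; see the 11g documentation and the paper linked above for the real syntax):

    server: myfiler
    local: 192.168.1.1
    path: 192.168.1.10
    local: 192.168.2.1
    path: 192.168.2.10
    export: /vol/oradata mount: /u02/oradata

Direct NFS spreads database I/O across the listed paths and fails over to a surviving path if one drops, which is where the "no worries about I/O bottlenecks" claim comes from.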

23 Responses to “Manly Men Only Deploy Oracle with Fibre Channel – Part VI. Introducing Oracle11g Direct NFS!”


  1. 1 Richard July 11, 2007 at 10:59 pm

    Very cool! Now all they need to do is update their download links so that people can actually check it out…

    I am actually a little disappointed in the way that everything is heading from a grid standpoint, though. What would be very cool is a tighter integration between RAC concepts and ASM auto-balancing, to the point that Oracle would support a true shared-nothing system. In a lot of ways, it seems like they’re very close.

    When we can get to the point that I can deploy some nice machines with a fast chunk of directly-attached disk (as well as the option to use network-attached disk), and just tell ASM which volumes to use and have it handle data-sharing between machines (ideally with a tie-in to the DB itself, so that frequently read data that wasn’t updated often would be replicated on all the nodes (space permitting), whereas frequently updated data would only be in two places, et cetera)… that will be a wonderful day. And scaling will simply be plugging in more disk, more servers, or both.

    In theory, of course.

    • 2 pieter February 23, 2010 at 10:15 am

      Kevin,

      I was reading the Oracle documentation.
      Direct NFS is not supported (yet?) for the clusterware files. Does it work after all? Or does it simply not work?
      Do you have any idea if it will ever be supported?
      If I want to use DNFS for the database files, what do you recommend for the clusterware files?

      Pieter

      • 3 kevinclosson February 23, 2010 at 3:04 pm

        I recommend NFS for the clusterware files. That's the RAC model. Use kernel NFS-mounted space for the clusterware files and the Oracle software, and Direct NFS for the database.
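
        For anyone wiring that up, a rough sketch of the split follows. The filer name, paths, and mount options are only an example (hard and nointr plus a vers=3 TCP mount with actimeo=0 is the commonly documented starting point, but check your platform and filer vendor's notes), and the relink step shown is the 11.2-style switch, if memory serves:

          # /etc/fstab : kernel NFS for clusterware files and the Oracle software
          filer:/vol/crs      /u01/crs         nfs  rw,hard,nointr,tcp,vers=3,rsize=32768,wsize=32768,actimeo=0  0 0
          filer:/vol/orahome  /u01/app/oracle  nfs  rw,hard,nointr,tcp,vers=3,rsize=32768,wsize=32768,actimeo=0  0 0
          filer:/vol/oradata  /u02/oradata     nfs  rw,hard,nointr,tcp,vers=3,rsize=32768,wsize=32768,actimeo=0  0 0

          # The database files sit on a kernel NFS mount too, but once the Direct NFS
          # ODM library is enabled the instance bypasses the kernel client for them:
          #   cd $ORACLE_HOME/rdbms/lib && make -f ins_rdbms.mk dnfs_on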

      • 4 Freek February 24, 2010 at 11:06 am

        Kevin,

        In Oracle 10g, in case of network problems, you have no failover between the OCR and ocrmirror volumes when using NFS (due to the required hard/nointr mount options). See http://preview.tinyurl.com/yzdkbf7

        I was hoping that 11g would solve this with Direct NFS, but apparently Direct NFS is not supported for the OCR volumes.
        Do you know if 11g uses a separate monitoring process (as with the voting disks) to check whether I/O requests to the OCR/ocrmirror volumes are hanging?

        Also, when using Direct NFS, does Oracle require the "normal" kernel NFS mount points only for administration, or also during startup of the database?

  2. 5 kevinclosson July 11, 2007 at 11:01 pm

    Richard,

    Two words: Storage Grid.

  3. 6 Jeff July 12, 2007 at 1:44 pm

    Kevin,

    What type of savings are we looking at in comparison to doing storage on an FC SAN? Also, what type of replication capabilities are available in the event I want to copy data from one NAS device to another at a DR site (Oracle and non-Oracle related, without having to use Data Guard)?

  4. 7 Glen September 26, 2007 at 5:01 pm

    Does that mean you don’t recommend a NAS unless running 11g? Where does 10gR2 fit into the NAS picture?

  5. 8 kevinclosson September 26, 2007 at 5:30 pm

    Hi Glen,

    When it comes to NAS, yes, 11g is the best so far. However, 10g on NAS is a good match as well, just not as good. It always goes like that, right? Oracle9i requires a lot of patches for NAS, so that is a bit more work; honestly, if you have a 9i database running, just keep it where it is as long as it works.

    What I don't recommend is running old Linux distros for Oracle over NFS. Stay with the 2.6 kernels.

    And like the front page of my blog states, these are MY opinions, not Oracle's, so please, folks, don't drum up a list of Metalink notes showing Oracle-over-NFS tips with RHAS 2.1 and Oracle9i and stuff like that. All technology information is time-relevant.

  6. 9 Freek April 8, 2009 at 10:27 pm

    Kevin,

    I'm in the process of setting up a new two-node Linux 10gR2 RAC environment that will use NFS to mount the NetApp volumes.
    Unfortunately the application vendor (still) does not support Oracle 11g, so I can't use Direct NFS.

    I had planned to use bonding on the server NICs connected to the NetApp, to get more throughput (and more redundancy).

    At that point, however, I got a bit of a cold shower from our network admin, who stated that none of the load-balancing schemes would allow me to get more than 1Gb of throughput between a RAC node and the NetApp.
    The switch would always select one NIC port on the NetApp to send the data to (or one NIC port on the RAC node when reading from the NetApp).
    Tests with dd and Orion seem to confirm this.

    So, my question is: how did you increase your throughput by adding an additional NIC to the configuration?
    The paper only states: "The network paths from the Oracle Database server to the storage was first configured with the best possible bonded Ethernet interfaces supported by the hardware at hand."

    The only thing I can think of is that I have to assign virtual IPs on the "storage" vif on the NetApp and then mount different volumes over which I would need to spread out my datafiles.
    When the load-balance scheme on the switch is then set to src_dest_ip, the traffic should use multiple NICs.
    But this seems very difficult to maintain, as we would need to know which datafiles will be accessed frequently and make sure that these are not all on the same volume.

    regards

    • 10 kevinclosson April 9, 2009 at 5:00 pm

      Freek,

      The troubles you are having are exactly why DNFS was brought to market. Like I was saying in the paper, it is often difficult to work out all the bonding stuff because it is driver+NIC+switch related, and perhaps it even varies by filer. To that end, I was not using a NetApp. I was using an HP EFS Gateway, which is a totally different beast than a NetApp filer.

      May I ask if your network admin has a scalable bonding setup anywhere? Perhaps it is just the Ethernet switches deployed in your datacenter? I can't troubleshoot that from here.

      I can't remember all the details about the testing I did for that paper because it was nearly two years ago, which leads me to point out that I cannot understand why an application vendor would still be forcing new deployments onto 10g when 11g has been shipping for two years. But I suppose I should understand that.

  7. 11 sq April 10, 2009 at 8:36 pm

    Freek,

    I have a small datacenter that uses bonding on everything. We use it not only for performance but also for failover. On our Cisco side it's configured for LACP; on the Linux side it's handled by setting the bonding driver to mode=4. In this configuration we do get more than 1Gb of network bandwidth utilization on the node, and we use it constantly for our RAC interconnect.

    Through LACP (and our on-server VLAN configuration), the two network cards are essentially trunked and bonded with the same MAC address on both, which allows multiple NICs to be used for performance.

    Not sure how you could load balance on the NetApp side.
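
    For reference, a minimal sketch of the kind of bonding sq describes, in the style of an older RHEL system (interface names and addresses are invented for illustration, and the switch ports must be configured for LACP as well):

        # /etc/modprobe.conf : 802.3ad (LACP) bonding
        alias bond0 bonding
        options bond0 mode=4 miimon=100 lacp_rate=1

        # /etc/sysconfig/network-scripts/ifcfg-bond0
        DEVICE=bond0
        IPADDR=192.168.1.50
        NETMASK=255.255.255.0
        ONBOOT=yes
        BOOTPROTO=none

        # /etc/sysconfig/network-scripts/ifcfg-eth2 (and likewise eth3)
        DEVICE=eth2
        MASTER=bond0
        SLAVE=yes
        ONBOOT=yes
        BOOTPROTO=none

    One caveat worth keeping in mind: 802.3ad aggregation balances per flow, so a single TCP connection still rides one physical link. That is exactly the limitation Freek describes in the next comment.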

  8. 13 Freek April 11, 2009 at 1:53 pm

    Kevin & sq,

    Thanks for the response.

    In the setup I talked about, we are using two Cisco Catalyst 3750s placed in a single stack (well, at least I think I got the term correct).
    Currently the NICs on the NetApp dedicated to the NFS traffic are combined into a vif using LACP round-robin, and on the Linux server the bond is also created in mode 4 with xmit_hash_policy set to layer2 (XOR of source and destination MAC).
    But I have also tried several other combinations (including round-robin on the Linux server).

    As I understood from our network admin, the problem lies with the methods the switch has available to decide to which port it will forward the traffic.
    All methods would result in a situation in which the switch always selects the same port for all the network traffic between a single server and a single NetApp head. So, even if the server distributes the network packets over both NICs in the bond (as it does when using round-robin), the switch would still only use one destination port to forward all traffic to. This would in effect cap the bandwidth at 1Gbit/s.

    Is my understanding correct? If so, this would mean that I had best split the DB data over two different NetApp heads (the NetApp is a clustered FAS 3140) and use, for instance, a volume on head1 for my datafiles and a second volume on head2 for the archived redo logs.
    But then, how do the different NFS performance papers get higher throughput by bonding NICs?

    On the Cisco support wiki, I found an article (http://tinyurl.com/cisco-etherchannel) about EtherChannel on the different switch models, and according to this article some of the switches can include layer 4 in the port-forwarding calculation, but I'm not sure whether this would be of any use with NFS (I'm guessing the same port number is always used).

    sq,

    You said you get more than 1Gb of network bandwidth utilization.
    Is this for traffic between two nodes or between multiple nodes?
    What did you mean by "server VLAN configuration"?

    regards,

    Freek
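
    For reference, the two knobs being discussed here are the transmit hash on the Linux bond and the EtherChannel load-balancing method on the switch. A hedged sketch (device names invented; whether this helps depends on how many distinct TCP connections the NFS traffic actually uses):

        # Linux bonding: hash on IP addresses and ports instead of MAC addresses only
        options bond0 mode=4 miimon=100 lacp_rate=1 xmit_hash_policy=layer3+4

        # Cisco IOS: pick the EtherChannel load-balancing method globally
        switch(config)# port-channel load-balance src-dst-ip

    As far as I recall, the kernel NFS client tends to funnel traffic to one filer over a single TCP connection, so even layer-4 hashing may not spread one mount across links; Direct NFS sidesteps this by opening its own connections for each path listed in oranfstab.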

  9. 14 John Darrah June 29, 2009 at 4:05 pm

    Kevin,

    Great article. I have been testing Direct NFS and have run into the following problem. When I start up the database, the following message appears in my alert log:

    Direct NFS: Invalid filer wtmax 61440 on filer 192.169.0.100
    Direct NFS: Filer wtmax 61440 must be an even multiple of 32768

    I don't have any direct control over the filer, so I cannot change wtmax on the server. Is there anything I can put into oranfstab to override the wtmax at the client? I know this isn't a support site; I'm just hoping you might know how to do this.

    Thanks,

    John

    • 15 kevinclosson July 7, 2009 at 5:33 pm

      I understand that you can’t make adjustments on the filer, but I’d still like to know what brand/model of filer it is.

      • 16 John Darrah July 7, 2009 at 6:25 pm

        Kevin,
        I'm under NDA, so I need to err on the side of caution in terms of what I disclose. I opened an SR and was told that wtmax cannot be overridden; it is set to whatever the filer returns when queried by the client (bummer).
        On a more general question, is there a toolkit or test kit that can be used to verify the level of compatibility a filer has with Direct NFS? I read in another one of your blog entries that one product will actually corrupt Oracle data if used in conjunction with Direct NFS. That was a little worrisome. What is the best approach for storage vendors to determine compatibility with Direct NFS?

        Thanks,

        John

  10. 18 Adam Garsha December 4, 2009 at 8:12 pm

    Great articles/blogs.

    We are moving (soon/slowly) Oracle DBs from HP-UX/Veritas/FC-SAN to Linux/PolyServe/Direct NFS/11gR2, and so far I am loving NFS/PolyServe.

    Question for you, sir: have you done any updated benchmarking of HP PolyServe and Oracle Direct NFS with the newer HP BL460c G6s with Flex-10 interfaces (e.g., an 8Gb pipe)?

    I'd be curious what you see/learn when the network isn't the bottleneck (or did you find that beyond 2Gb, NFS itself becomes the bottleneck in an 11gR2/Direct NFS environment?).

    Thanks.

    • 19 kevinclosson December 7, 2009 at 10:02 pm

      Hello Adam,

      I can't blame you for loving dNFS to a high-powered, multi-headed NAS device such as HP/PolyServe. I have not done any hands-on dNFS work, however, since joining Oracle. Too focused on Exadata and the Oracle Database Machine. As for bottlenecks, I can't imagine you could bottleneck a symmetric multi-headed NAS device like HP/PolyServe, as it scales to, what, 16 NAS heads? That would take a tremendous database grid's worth of I/O.

      In the end, however, I cannot talk about anything HP/PolyServe in the present-tense as I have been out of the loop for over two years.


  1. 1 NFS and Stats Collection in Oracle 11g : Ardent Performance Computing Trackback on July 12, 2007 at 2:25 pm
  2. 2 ITC Test Agg » Blog Archive » NFS and Stats Collection in Oracle 11g Trackback on July 12, 2007 at 2:39 pm
  3. 3 Yet Another Excellent RAC Install Guide « Kevin Closson’s Oracle Blog: Platform, Storage & Clustering Topics Related to Oracle Databases Trackback on August 17, 2007 at 7:58 pm
  4. 4 Automatic Databases Automatically Detect Storage Capabilities, Don’t They? « Kevin Closson’s Oracle Blog: Platform, Storage & Clustering Topics Related to Oracle Databases Trackback on September 26, 2007 at 4:14 pm
