Building a Stretch Real Application Clusters Configuration? Get The CRS Voting Disk Setup Right!

Published February 6, 2008 in Geo-Clusters, Real Application Clusters

The topic of “stretch clusters” has been interesting to a lot of folks for quite some time. A stretch cluster is one where one or more cluster nodes, one or more portions of the SAN, or both are geographically remote. Geographically remote could be within eyesight (1-2km) or a long distance away.

YottaYotta (Robin Harris of StorageMojo.com will notice that name) reached out to me (with hardware to offer) several years ago to set up a 3500km stretch cluster with three 10gR2 RAC nodes. Two of the RAC nodes were co-located and the third was put at 3500km distance using communications hardware that simulates the latency imposed by such great distance. And, yes, it is a valid simulation. It was an interesting exercise and, with the YottaYotta distributed block server, PolyServe (HP) and RAC were totally oblivious to the topology. It was a cool project, but that technology has had a difficult time catching on.

In the interim, mainstream vendors have stepped up to offer stretch clustering technology and, in the name of business continuity, folks are considering these sorts of solutions, but they are expensive. Because of that cost, most shops would tend to buy, at most, a two-legged SAN. Therein lies the problem. Such a configuration could suffer a disaster on the leg of the SAN that has the majority of the CRS voting disks, resulting in a total outage of the solution.

The remedy for this problem is to implement a third leg of storage for more voting disks, so that a majority of the voting disks remains available when either SAN leg is lost, but at what cost? The solution is to implement an inexpensive NFS share in which to host these additional voting disks. And, yes, you can use a simple low-end Unix/Linux host as the NFS server for this purpose, so long as the host is running Solaris, AIX, HP-UX or Linux. The following is a link to a paper that covers Oracle’s recommended/supported approach to this solution with Oracle Database 10g Release 2.

Using NFS for a Third CRS Voting Device
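
Before getting to what the paper says, the voting disk arithmetic behind the problem can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the site names, the disk counts and the `survives_loss_of` helper are hypothetical, and the only rule modeled is that the cluster needs access to a strict majority (more than half) of the configured voting disks.

```python
def survives_loss_of(layout, failed_site):
    """Return True if the voting disks left after losing failed_site
    still form a strict majority of all configured voting disks.

    layout maps a (hypothetical) site name to the number of voting
    disks hosted at that site.
    """
    total = sum(layout.values())
    remaining = total - layout.get(failed_site, 0)
    # Strict majority: more than half of the configured disks.
    return remaining * 2 > total


# Assumed layouts, for illustration only.
two_legs = {"site_a": 2, "site_b": 1}                     # two-legged SAN
three_legs = {"site_a": 1, "site_b": 1, "nfs_site": 1}    # third leg on NFS

for name, layout in (("two legs", two_legs), ("three legs", three_legs)):
    for site in layout:
        outcome = "cluster stays up" if survives_loss_of(layout, site) else "total outage"
        print(f"{name}: lose {site} -> {outcome}")
```

With two legs, losing the leg that holds two of the three disks leaves only one disk, which is not a majority. With one disk at each of three locations, any single loss still leaves two of three.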

The paper is clear about the fact that using some plain Unix/Linux server to host NFS shares for Oracle files is limited to this specific purpose:

Oracle does NOT support standard NFS for any files, with the one specific exception documented in this white paper.

The paper appears to have a small contradiction about mount options. Specifically, it states that the noac option is required for Linux servers (see Figure 1), which seems to contradict Metalink 279393.1. I’ve sent an email to the authors about that. We’ll see if it changes.

12 Responses to “Building a Stretch Real Application Clusters Configuration? Get The CRS Voting Disk Setup Right!”


1 pier00 February 6, 2008 at 8:39 pm

Hi Kevin,
You are totally right concerning the voting disks. However I find the OCR part much more difficult. See my blog entry on http://geertdepaep.wordpress.com/2008/01/24/my-experiences-with-ocrmirror-voting-disks-and-stretched-clusters/
Bottom line is that the ocrmirror is not a failover of the ocr. You would even be less redundant when putting the ocr on one san and the ocrmirror on the other san, compared to putting them both on the same san. There are many error messages possible on ocr failure and the failover behaviour is very dependent on this.
OS-mirroring of ocr doesn’t look like an easy or good solution either.
Is Oracle working on some kind of solution for this problem? For the moment this prevents me from recommending stretched clusters with 2 sans to my customers.
Geert
geert.depaep@uptime.be

2 joel garry February 8, 2008 at 12:01 am

The “Oracle does NOT support standard NFS for any files” link in the paper ( http://www.oracle.com/technology/deploy/availability/htdocs/vendors_nfs.html ) gives a 404. Is it still true? Some of the Linux papers on Metalink suggest that it is supported with proper options (for example, EL5 on Note:279069.1 – or is that considered abnormal?).

I wouldn’t want to see myths propagated in either direction. Many people say Oracle doesn’t support NFS, we need to verify. Searching oracle.com, “We did not find any search results for: vendors_nfs.html” and the references from google all seem to point at that one mysteriously missing doc.

3 Arup Nanda January 25, 2009 at 4:54 pm

Kevin,

Thank you for the post. Your solution to the problem makes sense; but my question is not on your solution but rather the requirement. I am puzzled by the requirement of the stretch cluster.

What business (or technical) need was the stretch cluster supposed to address? Business continuity? I doubt it. How would you address the single point of failure of the database? Perhaps you can address it by placing two SANs at two different places; but then how do you synchronize between them? OS mirroring, since SAN mirroring over fiber may be impossible over such long distances? But in that case, updates will take much longer. If you enable async mirroring, that takes care of I/O latency for OS mirroring but it defeats the purpose of the DR.

In a nutshell I am looking for a use case for stretch clusters and can’t really find it. Would you happen to know one?

Thanks a lot in advance.

4 kevinclosson January 28, 2009 at 4:34 pm

Arup,

This post was not meant to promote stretch clusters as much as to inform of some of the things others are thinking about. I personally see no value in building one unless it also includes a “stretch SAN” using something like the now defunct YottaYotta Distributed Block Server. Stretch clusters to me sound like something much better handled by Data Guard.

5 NHoyos September 22, 2009 at 8:29 pm

The single point of failure (SPOF) in a RAC is the central disk storage. In a stretch cluster there is an ASM instance taking care of the replication between two SANs. RAC nodes exist on either side, thus a failure on one SAN won’t affect access to the database.

6 steven May 30, 2012 at 11:16 pm

It’s not possible to implement long distance stretch RAC. A year ago, I had a talk with the technicians of NetApp and Dell, and all were concerned about network latency and electrical signal fading. I don’t think it can come true at the present time. If you know of any new implementation that achieves a stretch cluster, pls tell me. I am also looking for the same kind of solution to have a multi-site transaction system.

- 7 kevinclosson June 1, 2012 at 1:34 pm
  
  @steven : You can with VPLEX and it is certified by Oracle.
  
8 orasuds November 15, 2012 at 10:25 am

All,
One of our customers implemented a stretch cluster to protect against storage frame failures. Their data center is just a mile away. ASM mirroring is done between two frames for the stretch cluster. Previously they had a Veritas LVM writing to both frames at the same time. In fact they had the server on Site1 considering the frame on Site2 as primary. When VCS fails over to the Site2 server, the Site2 server considers the Site1 frame as primary (for reading).
When I questioned the need for the stretch cluster, they said Veritas LVM based mirroring saved them a couple of times when someone turned off the frame switch at Site1 in the past. But the stretch cluster is done without the 3rd voting disk. Now I am wondering how to react if the site with most voting disks crashes. Oracle document says “If you have an extended cluster and do not configure a third site, you must find out which of the two sites is the primary site. Then, if the primary site fails, you must manually restart the secondary site.”. Not sure that statement is correct, since how does the secondary site come up with less than half of the voting disks available?
Any ideas?

- 9 kevinclosson November 15, 2012 at 1:52 pm
  
  @orasuds, see:
  
  h7113-vplex-architecture-deployment.pdf
  
10 Bart Sjerps November 15, 2012 at 3:38 pm

If you use VPLEX for Stretched RAC (with the Witness option) then you don’t have to worry about setting up another 3rd site with NFS/Voting disk. You define voting (CRS) disks on VPLEX and they are virtual volumes. If on one site the storage environment would go down then the VPLEX witness keeps the other one up. Therefore the RAC node that loses storage will go down (because it will lose access to the voting disks) and the other will stay up automatically (because it still has access to the “virtual” voting disk). The beauty of the EMC solution is that you would install RAC exactly the same as you would on a local (single storage system) cluster. No messing around with failure groups, priority reads, cluster witness on NFS (duh) etc.

For more info, check my blog, which has a series of articles on RAC/VPLEX here: http://bartsjerps.wordpress.com/category/vplex/

Hope this helps.

11 Dave November 16, 2012 at 3:03 am

Check out the following solutions
Business Continuity and Disaster Recovery for stretched RAC
http://tinyurl.com/6ndcbww
and
Business Continuity for SAP with stretched RAC on VPLEX
http://tinyurl.com/bpr3yqg


1 My Blog Posts Prove Oracle Doesn’t Support NFS! « Kevin Closson’s Oracle Blog: Platform, Storage & Clustering Topics Related to Oracle Databases Trackback on February 8, 2008 at 8:07 pm


Kevin Closson's Blog: Platforms, Databases and Storage