Several months back I made a blog entry about the RAC poll put together by Jared Still. The poll can be found here. Thus far there have been about 150 participants through the poll—best I can tell. Some of the things I find interesting about the results are:
1. Availability was cited 46% of the time as the motivating factor for deploying RAC whereas scalability counted for 37%.
2. Some 46% of the participants state that RAC has met between 75% and 100% of their expectations.
3. More participants (52%) say they’d stay with RAC given the choice to revert to non-RAC.
4. 52% of the deployments are Linux (42% Red Hat, 6% Oracle Enterprise Linux, 4% SuSE) and 34% are using the major Legacy Unix offerings (Solaris 17%, AIX 11%, HP-UX 6%).
5. 84% of the deployments are using block storage (e.g., FCP, iSCSI) with 42% of all respondents using ASM on block storage. Nearly one quarter of the respondents say they use a CFS. Only 13% use file storage (NAS via NFS).
Surveys often make for tough cipherin’. It sure would be interesting to see which of the 52% that use Linux also state they’d stay with RAC given the choice to revert or re-deploy with a non-RAC setup. Could they all have said they’d stick with RAC? Point 1 above is also interesting because Oracle markets RAC as a prime ingredient for availability as per MAA.
Of course point 5 is very interesting to me.
RAC is Simple…on Simple Storage
We are talking about RAC here, so the 84% from point 5 above get to endure the Storage Buffet. On the other hand, the 24% of the block storage deployments that layered a CFS over the raw partitions didn’t have it as bad, but the rest of them had to piece together the storage aspects of their RAC setup. That is, they had to figure out what to do with the clusterware files, database, Oracle Home and so forth. The problem with CFS is that there is no one CFS that covers all platforms. That war was fought and lost. NFS on the other hand is ubiquitous and works nicely for RAC. On that note, an email came in to my inbox last Friday on this very topic. The author of that email said:
[…] we did quite a lot of tests in the summer last year and figured out that indeed using Oracle/NFS can make a very good combination (many at [COMPANY XYZ] were spectical, I had no opinion as I had never used it, I wanted to see the fact). So I have convinced our management to go the NFS way (performance ok for the workload under question, way simpler management).
[…] The production setup (46 nodes, some very active, some almost idle accessing 6 NAS “heads”) does its job with satisfying performance […]
What do I see in this email? NFS works well enough for this company that they have deployed 46 nodes—but that’s not all. I pay particular attention to the 3 most important words in that quote: “way simpler management.”
Storage Makes or Breaks Many RAC Deployments
I watched intently as Charles Schultz detailed his first forray into RAC. First, I’ll point out that Charles and I had an email side-bar conversation on this topic. He is aware that I intended to weave his RAC experience into a blog entry of my own. So what’s there to blog about? Well, I’ll just come right out and say it—RAC is usually only difficult when difficult storage is used. How can I say that? Let’s consider Charles’ situation.
First, Charles is an Oracle Certified Master who has no small amount of exposure to large Oracle environments. Charles points out on his blog that the environment they were trying to deploy RAC into has some 150 or more databases consuming some 10TB of storage! That means Charles is no slouch. And being the professional he is, Charles points out that he took specialized RAC training to prepare for the task of deploying Oracle in their environment. So why did Charles struggle with setting up a 2-node RAC cluster to the point of making a post to the oracle-l email list for assistance? The answer is simply that the storage wasn’t simple.
It turned out that Charles’ “RAC difficulty” wasn’t even RAC. I assert that the highest majority of what is termed “RAC difficulty” isn’t RAC at all, but the platform or storage instead. By platform I mean Linux RPM dependency and by storage I mean SAN madness. Charles’ difficulties boiled down to Linux FCP multipathing issues. Specifically, multipathing was causing ASM to see multiple entries for each LUN. I made the following comment on Charles’ blog:
Hmm, RHEL4 and two nodes. Things should not be that difficult. I think what you have is more on your hands than RAC. I’ve seen OCFS2, and ASM [in Charles’ blog thread]. That means you also have simple raw disks for OCR/CSS and since this is Dell, is my guess right that you have EMC storage with PowerPath?
Lot’s on your plate. You know me, I’d say NAS…
Ok, I’m sorry for SPAMing your site, Charles, but your situation is precisely what I talk about. You are a Certified Master who has also been to specific RAC training and you are experiencing this much difficulty on a 2 node cluster using a modern Linux distro. Further, most of your problems seem to be storage related. I think that all speaks volumes.
Charles replied with:
[…] I agree whole-heartedly with your statements; my boss made the same observations after we had already sunk over 40 FTE of 2 highly skilled DBAs plunking around with the installation.
If I read that correctly, Charles and a colleague spent a week trying to work this stuff out and Charles is certainly not alone in these types of situations that generally get chalked up as “RAC problems.” There was a lengthy thread on oracle-l about very similar circumstances not that long ago.
Back To The Poll
It has been my experience that most RAC difficulties are storage related—specifically the storage presentation. As point 5 in the poll above shows, some 84% of the respondents had to deal with raw partitions at one time or another. Indeed, even with CFS, you have to get the raw partitions visible and like-named on each node of the cluster before you can create a filesystem. If I hear of one more RAC deployment falling prey to storage difficulties, I’ll…
Ah, forget that. I use the following mount options on Linux RAC NFS clients:
and I generally widen up a few kernel tunables when using Oracle over NFS:
net.core.rmem_default = 524288
net.core.wmem_default = 524288
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
net.ipv4.tcp_rmem=4096 524288 16777216
net.ipv4.tcp_wmem=4096 524288 16777216
net.ipv4.tcp_mem=16384 16384 16384
Once the filesystem(s) is/are mounted, I have 100% of my storage requirements for RAC taken care of. Most importantly, however, is to not forget Direct I/O when using NFS, so I set the following init.ora parameter filesystemio_options as follows:
Life is an unending series of choices. Choosing between simple or difficult storage connectivity and provisioning is one of them. If you overhear someone lamenting about how difficult “RAC” is, ask them how they like their block storage (FCP, iSCSI).