Real Application Clusters: The Shared Database Architecture for Loosely-Coupled Clusters

The typical Real Application Clusters (RAC) deployment is a true enigma. Sometimes I just scratch my head because I don’t get it. I’ve got this to say, if you think Shared Nothing Architecture is the way to go, then deploy it. But this is an Oracle blog, so let’s talk about RAC.

RAC is a shared disk architecture, just like DB2 on IBM mainframes. It is a great architecture, one that I agree with, as evidenced by my having worked for shared data clustering companies all these years. Again, since this is an Oracle blog I think arguments about shared disk versus shared nothing are irrelevant.

Dissociative Identity Disorder
The reason I’m blogging this topic is because in my opinion the typical RAC deployment exhibits the characteristics of a person suffering from Dissociative Identity Disorder. Mind you, I’m discussing the architecture of the deployment, not the people that did the deployment. That is, we spend tremendous amounts of money for shared disk database architecture and then throw it into a completely shared nothing cluster. How much sense does that make? What areas of operations does that paradigm affect? Why does Oracle promote shared disk database deployments on shared-nothing clusters? What is the cause of this Dissociative Identity Disorder? The answer: the lack of a general purpose shared disk filesystem that is suited to Oracle database I/O that works on all Unix derivations and Linux. But wait, what about NFS?

Shared “Everything Else”
I can’t figure out any other way to label the principle I’m discussing so I’ll just call it “Shared Everything Else”. However, the term Shared Everything Else (SEE for short) insinuates that there is less importance in that particular content, an insinuation that could not be further from the truth. What do I mean? Well, consider the Oracle database software itself. How do you suppose an Oracle RAC (shared disk architecture) database can exist without having the product installed somewhere?

The product install directory for the database is called Oracle Home. Oracle has supported the concept of a shared Oracle Home since the initial release of RAC, even with Oracle9i. Yes, Metalink note 240963.1 describes the requirement for Oracle9i to have context dependent symbolic links (CDSL), but that was Oracle9i. Oracle10g requires no context dependent symbolic links. Oracle Universal Installer will install a functional shared Oracle Home without any such requirement.

What if you don’t share a software install? It is very easy to end up with botched or mismatched product installs, which doesn’t sit well with a shared disk database. In a recent post on the oracle-l list, a list participant sent the following call for help:

We are trying to install a 2-node RAC with ASM (Oracle 10.2.0.2.0 on Solaris 10) and getting the error below when using dbca to create the database. The error occurs when dbca is done creating the DB (100%). Any suggestions?

We have tried starting atlprd2 instance manually and get the error below regarding an issue with spfile which is on ASM.

ORA-01565: error in identifying file ‘+SYS_DG/atlprd/spfileatlprd.ora’
ORA-17503: ksfdopn:2 Failed to open file +SYS_DG/atlprd/spfileatlprd.ora
ORA-03113: end-of-file on communication channel

OK, for those who are not Oracle-minded, this sort of deployment is what I call the Dissociative Identity Disorder, since the database will be deployed on a bunch of LUNs provisioned, masked and accessed as RAW disk from the OS side (ASM is a collection of RAW disks). This is clearly not a SEE deployment. The original poster followed up with a status of the investigatory work he had to do to try and get around this problem:

[…] we have checked permissions and they are the same. We also checked and the same disk groups are mounted in both ASM instances also. We have also tried shutting everything down (including reboot of both servers) and starting everything from scratch (nodeapps, asm, listeners, instances), but the second node won’t start. Keep getting the same error […]

What a joy. Deploying a shared disk database in a shared nothing cluster! There he was on each server checking file permissions (I just counted, there are 20,514 files in one of my Oracle10g Oracle Homes), investigating the RAW disk aspects of ASM, rebooting servers and so on. Good thing this is only a 2 node cluster. What if it was an 8 node cluster? What if he had 10 different clusters?

As usual, the oracle-l support channel comes through. Another list participant posted the following:

Seems to be a known issue (Metalink Note 390591.1). We encountered a similar issue in a Linux RAC cluster and it has been resolved by following this note.

The cause was included in his post (emphasis added by me):

Cause

Installing the 10.2.0.2 patchset in a RAC installation on any Unix platform does not correctly update the libknlopt.a file on all nodes. The local node where the installer is run does update libknlopt.a but remote nodes do not get the updated file. This can lead to dumps or internal errors on the remote nodes if Oracle is subsequently relinked.

That was the good and bad, now the ugly—his post continues with the following excerpt from the Oracle Metalink note:

There are two solutions for this problem:

1) Manual copy of the “libknlopt.a” library to the offending nodes:

-ensure all instances are shut down
-manually copy $ORACLE_HOME/rdbms/lib/libknlopt.a from the local node to all remote nodes

-relink Oracle on all nodes :
make -f ins_rdbms.mk ioracle

2) Install the patchset on every node using the “-local” option:
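
The kind of drift the note describes (one node gets the updated library, the others don't) is easy to check for before anyone starts rebooting servers. What follows is my own minimal Python sketch, not part of the Metalink note. It assumes each node's Oracle Home is reachable as a local path, for example through a mount or a copy pulled over for inspection. The names `file_digest` and `find_drift` and the node names in the usage note are hypothetical.

```python
from __future__ import annotations

import hashlib
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def find_drift(homes: dict[str, Path], relative_path: str) -> dict[str, list[str]]:
    """Group node names by the digest of the same file under each Oracle Home.

    `homes` maps a node name to the (assumed locally reachable) root of that
    node's Oracle Home. A node whose copy is absent is grouped under the key
    "MISSING". More than one key in the result means the copies differ.
    """
    groups: dict[str, list[str]] = {}
    for node, home in homes.items():
        target = home / relative_path
        key = file_digest(target) if target.is_file() else "MISSING"
        groups.setdefault(key, []).append(node)
    return groups
```

Called as `find_drift({"node1": Path("/mnt/node1/oracle_home"), "node2": Path("/mnt/node2/oracle_home")}, "rdbms/lib/libknlopt.a")` (paths invented for illustration), a result with a single key means the copies match. With a shared Oracle Home there is only one copy to begin with, so there is nothing to compare.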

What’s So Bad About Shared Nothing Clusters?
I’m not going to get into that, but one of the central knocks Oracle levels against shared-nothing database architecture is the fact that replication is required. Since the software used to access RAC needs to be kept in lock-step, replication is required there as well, and as we see from this oracle-l email thread, replication is not all that simple with a complex software deployment like the Oracle database product. But speaking of complex, the Oracle database software pales in comparison to the Oracle E-Business Suite. How in the world do people manage to deploy E-Biz on anything other than a huge central server? Shared Applications Tier.

Shared Applications Tier
Yes, just like Oracle Home, the huge, complex Oracle E-Business Suite can be installed in a shared fashion as well. It is called a Shared Applications Tier. One of the other blogs I read has been discussing this topic as well, but this is not just a blogosphere topic—it is mainline. Perhaps the best resource for Shared Applications Tier is Metalink note 243880.1, but Metalink notes 384248.1 and 233428.1 should not be overlooked. The long story short is that Oracle supports SEE, but they don’t promote it for who-knows-what-reason.

Is SEE Just About Product Installs?
Absolutely not. Consider intrinsic RAC functionality that doesn’t function at all without a shared filesystem:

External Tables with Parallel Query Option
UTL_FILE
BFILE

I’m sure there are others (perhaps compiled PL/SQL), but who cares. The product is expensive and if you are using shared disk architecture you should be able to use all the features of shared disk architecture. However, without a shared filesystem, External Tables and the other features listed are not cluster-ready. That is, you can use External Tables, UTL_FILE and BFILE, but only from one node. Isn’t RAC about multi-node scalability?

So Why the Rant?
The Oracle Universal Installer will install a fully functional Oracle10g shared Oracle Home to simplify things, the complex E-Business Suite software is architected for shared install, and there are intrinsic database features that require shared data outside of the database. So why deploy a shared database architecture product on a platform that only shares the database? You are going to have to explain it to me like I’m six years old; because I know I’m not going to understand. Oh, yes, and don’t forget that with a shared-nothing platform, all the day-to-day stuff like imp/exp, SQL*Loader, compressed archive redo, logging, trace, scripts, spool and so on means you have to pick a server and go. How symmetric is that? Not as symmetric as the software for which you bought the cluster (RAC), that’s for certain.

Shared Oracle Home is a Single Point of Failure
And so is the SYSTEM tablespace in a RAC database, so what is the point? People who choose to deploy RAC on a platform that doesn’t support shared Oracle Home often say this. Yes, a single shared Oracle Home is a single point of failure, but like I said, so is the SYSTEM tablespace in every RAC database out there. Shops that espouse shared software provisioning (e.g., shared Oracle Home) are not dolts, so the off-the-cuff single point of failure red herring is just that. When we say shared Oracle Home, do we mean a single shared Oracle Home? Well, not necessarily. If you have, say, a 4 or 8 node RAC cluster, why assume that SEE or not to SEE is a binary choice? It is perfectly reasonable to have 8 nodes share something like 2 Oracle Homes. That is a significant condensing factor and appeases the folks that concentrate on the possible single point of failure aspect of a shared Oracle Home (whilst often ignoring the SYSTEM tablespace single point of failure). A total availability solution requires Data Guard in my opinion, and Data Guard is really good, solid technology.
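
The 8-nodes-on-2-homes arithmetic can be written down as a back-of-the-envelope Python sketch. It rests on two simplifying assumptions of mine: nodes are spread across the shared homes as evenly as possible, and losing a home takes down every node that runs from it. The class and property names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HomeLayout:
    nodes: int         # RAC nodes in the cluster
    shared_homes: int  # Oracle Homes those nodes share

    def __post_init__(self):
        # A layout needs at least one home and no more homes than nodes.
        if not 1 <= self.shared_homes <= self.nodes:
            raise ValueError("need 1 <= shared_homes <= nodes")

    @property
    def installs_to_patch(self) -> int:
        """Software installs that must be kept in lock-step."""
        return self.shared_homes

    @property
    def max_nodes_per_home(self) -> int:
        """Largest group of nodes on one home (ceiling division)."""
        return -(-self.nodes // self.shared_homes)

    @property
    def nodes_surviving_one_home_loss(self) -> int:
        """Nodes still running if the most heavily used home is lost."""
        return self.nodes - self.max_nodes_per_home
```

Under those assumptions, `HomeLayout(nodes=8, shared_homes=2)` leaves 2 installs to patch instead of 8, and 4 nodes still running if one home is damaged. `HomeLayout(8, 1)` is the single shared home case, with 1 install and 0 survivors, and `HomeLayout(8, 8)` is the fully shared-nothing case.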

Choices
All told, NFS is the only filesystem that can be used across all Unix (and Linux) platforms for SEE. However, not all NFS offerings are sufficiently scalable and resilient for SEE. This is why there is a significant technology trend towards clustered storage (e.g., NetApp OnTAP GX, PolyServe (HP) EFS Clustered Gateway, etc.).

Finally, does anyone think I’m proposing some sort of mix-match NFS here with a little SAN there sort of ordeal? Well, no, I’m not. Pick a total solution and go with it…either NFS or SAN, the choice is yours, but pick a total platform solution that has shared data to complement the database architecture you’ve chosen. RAC and SEE!

9 Responses to “Real Application Clusters: The Shared Database Architecture for Loosely-Coupled Clusters”


1 Noons February 9, 2007 at 2:25 am

SEE, I told you! 🙂

Great post, Kevin. And all this complexity becomes even worse when one considers the consequences of for example:

regular security patch sets and the need to apply them to what is supposed to be a non-stop system.

or the need to test new releases, new functionality. what, folks supposed to duplicate an expensive RAC production system for testing?

” You are going to have to explain it to me like I’m six years old; because I know I’m not going to understand.”

Guv, I’m nearly 53 and I *still* don’t understand it!

😉

2 Doug Burns February 9, 2007 at 2:52 am

That is, you can use External Tables, UTL_FILE and BFILE, but only from one node.

I think I came across another example of this type of thing today, during a presentation by Jason Arneil of Nominet – http://www.ukoug.org/calendar/show_presentation.jsp?id=6713

He found that he ran into problems with parallelising Data Pump when some of the parallel slaves executed on different nodes. The solution in his case was to run with one node, at least for the Data Pump part of the migration. So not only did he have to use one node, but had to make sure only one node was available to guarantee he wouldn’t run into problems.

I wish I could remember all of the details.

3 kevinclosson February 9, 2007 at 4:57 am

Doug,

Thanks for stopping by. You don’t have to remember all the details of any such war story. Any “anecdotal” story will only elaborate on the principles: shared disk database architecture on otherwise shared-nothing clusters. It’s a trainwreck.

4 kevinclosson February 9, 2007 at 4:59 am

Noons,

As always, thanks for stopping by!

5 Arnoud Roth May 1, 2007 at 5:51 am

Hi Kevin,

Apparently my previous reply to this topic didn’t get through, for whatever reason.
I came across this weblog entry of yours when writing a weblog that states almost the opposite of yours (http://technology.amis.nl/blog/?p=1873).
I can go along with you a great deal, but I am not so sure about SEE when it comes to E-Business Suite.
One thing for example that you didn’t find on metalink is a general procedure on how to install a shared ORACLE_HOME for the E-Business Suite database tier. Trivial you might think, but if you have any experience on the E-Business Suite, you may well know that Oracle has added at least one directory (appsutil) under the DB ORACLE_HOME, which is only partially so-called “context sensitive” (differentiates between nodes/instances in the directory structure). Apart from that, I keep asking myself: Why does oracle push a shared APPL_TOP for the e-Business Suite, even a shared Applications Technology Stack (8.0.6 and iAS ORACLE_HOME), but doesn’t provide any documentation for shared RDBMS ORACLE_HOME?
I have asked this question (requested a support statement) to Oracle. The answer I got from Oracle is rather worrying… In short:
a) there is no documentation about this at least not within Oracle,
b) 10G supports a shared O_H, 10G is certified with APPS, so you might try it with APPS (yes! they stated this literally in their so-called official answer!),
c) there might be manual workarounds necessary for everything to work correctly, and probably (again, quote from Oracle Support), some things might not work as expected

I also have some personal experience on a SEE environment (with EBS). For example: A shared ORACLE_HOME for Clusterware. Doesn’t work consistently out of the box (duh? I agree…), there are bugs identified in 10gR1 that haven’t been solved in 10gR2 yet (Ever tried getting ONS to work stable on the cluster nodes when sharing the ORACLE_HOME?). I’ll admit it, there are workarounds (Metalink Note 304767.1), but still… When implementing E-Business Suite on 10gR2 with RAC on a shared RDBMS ORACLE_HOME, you will have to add some additional scripting into ORACLE_HOME/bin/racgwrap in order to be able to start the database using srvctl. Oracle’s whitepaper on implementing RAC with ASM for EBS covers this only partially, not the part where you would have a shared ORACLE_HOME.
Just a number of issues that you will have to deal with when implementing SEE strategy.
As stated in the first line of my reply, I can imagine SEE could work for a regular (e.g. non E-Business Suite) RAC environment, and I can clearly see its advantages, but apparently Oracle isn’t ready yet to push it to the market.
Besides, when you consider an architecture like this, isn’t one of the first questions the one about availability? Personally I would like to minimize the chances of down-time as much as possible. SEE will increase the chances of downtime significantly. With SEE, when problems occur on any component which is shared, you risk unavailability of your entire application (patches applied to your ORACLE_HOME that do not work out as they were expected, etc.). When you do not share the code, this could only happen when something happens to your database. When the ORACLE_HOME/APPL_TOP etc. get damaged (and I am sure you will agree with me, logical corruption caused by people or procedures is the most frequently occurring corruption and cannot be eliminated by e.g. mirroring) you will still have the other nodes in your cluster being able to serve your environment, thereby preventing complete loss of service in such circumstances. Especially with the E-Business Suite, which is, for most companies, a crucial part of their primary business process.
In short, there definitely is a trade-off when such an architecture is considered. You cannot just say someone is suffering from Dissociative Identity Disorder. There may be perfectly valid reasons not to implement SEE.
(I do hope the 6-year-old in you understands a little;-)

Regards,
Arnoud

6 Marco Gralike May 2, 2007 at 8:29 am

I haven’t followed up in years, since the days I worked with Oracle Parallel Server, but my grudge against, and I think the major flaw in, these kinds of architectures is that the problem lies in the Oracle kernel itself. This is your SPOF, in my honest opinion, because it hasn’t been made for parallelism (1 executable across multiple nodes, instead of multiple executables over multiple nodes) and therefore can’t deal with SEE or shared nothing architectures. This will always lead to compromises and a lot of hard work.

But maybe, as Anjo (Kolk) described it (“RAC”): “You will have to hate it before you love it” (in that light at least I made the first step 😉

7 Arnoud Roth May 3, 2007 at 5:56 am

How true, Marco.
I really hated RAC before I started lovin’ it.
However, without putting any doubts on Mr. Kolk’s expertise, imho this could also have to do with the fact that we don’t understand the product to its full extent, thereby misjudging its features, possibilities and impossibilities.

8 Robert December 27, 2011 at 5:59 am

I agree with Arnoud. The primary reason that I use local installations of Oracle software in a RAC cluster is to have the ability to patch in a rolling fashion. Until Oracle can solve hot patching and no downtime, then sharing the Oracle Home is not in my future. I will admit, I would like to patch only once though….

9 kevinclosson December 27, 2011 at 10:56 am

Hi Robert,

Good point, but then I never advocate a single anything. My writings have been about the idea of “Shared homes” more than “Shared home.” In a large complex cluster of, say, 8 RAC hosts it would seem like savings to have 2 shared homes (4 hosts each sharing 1). But, honestly, I’m tired of that topic because in the world of modern non-FSB 8-socket servers with huge memories I question the need to scale out with RAC anyway.


Kevin Closson's Blog: Platforms, Databases and Storage