Coming back from vacation and failing to catch up on oracle-l list topics is a bad mistake. I was wondering why Kirk McGowan decided to make a post about fencing in the context of Oracle Clusterware. After finally catching up on my oracle-l backlog, I see that the stimulus for Kirk’s blog entry was likely this post to the oracle-l list where the list member was asking whether Oracle Clusterware implements STONITH as its fencing model. It seems the question was asked after the list member watched this Oracle webcast about RAC where slide 11 specifically states:
IO Fencing via Stonith algorithm (remote power reset)
The list member was conflicted over the statement in Oracle’s webcast. It seems he had likely seen my blog entry entitled RAC Expert or Clusters Expert where I discuss the clusters concept of fencing. In that blog entry, and in the paper I reference therein, I point out that Oracle Clusterware doesn’t implement STONITH because it doesn’t. Oh boy, there he goes again contradicting Oracle. Well, no, I’m not. The quote from Oracle’s webcast says they implement their fencing using a “STONITH algorithm” and they do. The bit about remote power reset is splitting hairs a bit since the way the fenced node excuses itself from the cluster is by executing an immediate shutdown (e.g., Linux reboot(8) command). Kirk correctly points out that the correct term is actually suicide. Oracle uses algorithms common to STONITH implementations to determine what nodes need to get fenced. When a node is alerted that it is being “fenced” it uses suicide to carry out the order.
What Time Is It?
From about 2003 through 2005 I had dozens of people ask me for in-depth clusters concepts information with both a generic view and an Oracle-centric view-I was working for a clustering company and had a long history of clustered Oracle behind me after all. It seems people were getting confused as to why there were options to use vendor-integrated host clusterware on all platforms except Linux and Windows. People wanted to better understand both generic clustering concepts as well as Oracle Clusterware. It seems some merely wanted to “know what time it is” while others wanted to “know how to tell time” and some even wanted to know “how the clock works.” About the time Oracle implemented the Third Party Clusterware Validation Program I decided I need to write a paper on the matter, so I did and posted it on the OakTable Network site. In the paper, and my blog post, I point out that Oracle Clusterware is not STONITH-clinically speaking-and indeed it isn’t. STONITH requires healthy servers to take action against ill servers via:
- Remote Power Reset. This technology is not expensive, nor spooky. In fact, here is a network power switch for $199 that allows SNMP commands to power cycle outlets. Academic (and some commercial) approaches use these sorts of devices when implementing clusters. A healthy server will simply issue an SNMP command to power off the ill server. Incidentally, not all servers that run Oracle even have a power cord (think blades) and some don’t even use AC (see Rackable’s DB Power servers) so Oracle couldn’t use this approach without horrible platform-specific porting issues.
- Remote System Management. There are a plethora of remote system management technologies (e.g., power cycle a server remotely) such as DRAC, IPMI, iLO, ALOM, RSC. Oracle is not crazy enough to tailor their fencing requirements around each of these. What a porting nightmare that would be. Oracle has stated more than once that there are no standards in this space and thus no useable APIs. The closest thing would have been either IPMI or OPMA, but the industry hasn’t seemed to want a cross-platform standard in this space.
The lack of standards where cluster fencing is concerned leaves us with a wide array of vendor clusterware such as Service Guard, HACMP, VCS, PolyServe, Red Hat Cluster Suite and on and on. I had a lot of Oracle customers asking me to inform them of the fundamental differences between these various clusterware and Oracle’s Clusterware so I did.
Gasp, Oracle Doesn’t Implement STONITH!
Henny penny: the sky is falling. So Oracle doesn’t really implement STONITH. So what. That doesn’t mean nobody wants to understand the general topic of clustering-and fencing in particular-a little bit better. It would not be right to tell them that their quest for information is moot just because other cluster approaches are not embodied in Oracle Clusterware. However, the importance of Oracle’s choice of fencing method is probably summed up the best in that oracle-l email thread which dried up and died within 24 hours after another member posted the following:
Has anyone see a RAC data corruption due to Clusterware unable to shoot itself?
I can assure you all that if anyone reading the oracle-l list had such a testimonial we would have heard it. The oracle-l list membership is huge and there are also a lot of consultants on the list who have contacts with a lot of production sites. The thread dried up, dropped to the ground and died. I think what I just wrote mirrors Kirk McGowan’s position on the matter.
What Would It Take
No clustering approach is perfect. Whether STONITH, fabric fencing or suicide, clusters can melt down. That is, after all, why Oracle offers an even higher level of protection in their Maximum Availability Architecture through such technology as DataGuard.
What would it take for an Oracle Clusterware fencing breach and why would I blog such taboo? It takes a lot of unlikely (yet possible) circumstances and because some people want to know. With Oracle Clusterware, a fencing breach would require:
- Failed Suicide. If for any reason Oracle’s Clusterware process is not able to successfully execute a software reboot of the ill server.
- Hangcheck Failure. The hangcheck kernel module executes off a kernel timer. If the system is so ill that these kernel events are not getting triggered then that would mean hardclock interrupts are not working and I should think the system would likely PANIC. All told a PANIC is just as good as hangcheck timer succeeding really. Nonetheless, it is possible that such a situation could arise.
A Waste of My Time
So over the last few years I spent a little time explaining clusters concepts to people with Oracle in mind. In my writings I discussed such topics as fencing, kernel mode/user mode clusterware, skgxp(), skgxn() and a host of other RAC-related material. Was it a waste of my time? No. Do I agree with Kirk McGowan’s post? Yes. Most importantly, however, I hang my hat on the oracle-l thread that dead-ended when the last poster on the thread asked:
Has anyone see a RAC data corruption due to Clusterware unable to shoot itself?
…and then there was silence.
Recent Comments