Now don’t get me wrong. What I’m blogging about is not really an Oracle Database 10g Real Application Clusters (RAC) problem. All of the problems I will mention in this entry were clearly related to a botched configuration at the OS level. Since Oracle10g will be around for a long time, I suspect that some day someone else might run into this sort of a problem and I aim to make their lives easier. That said, there could be some goodies in here for all the regular readers of this blog.
The Scenario
I took the brand new 4-node RAC cluster I have in the lab and aimed to see how much the oracle-validated-1.0.0-4.el4.x86_64.rpm package assists in setting up the system in preparation for a RAC install. I was excited since this was the first 2-socket Xeon “Clovertown” 5355-based cluster I’ve had the chance to test. These processors are murderously fast so I was champing at the bit to give them a whirl.
The systems were loaded with RHEL4 U4 x86_64. I tried to install the oracle-validated-1.0.0-4.el4.x86_64.rpm package and here is what it returned:
# rpm -ivh oracle-validated-1.0.0-4.el4.x86_64.rpm
warning: oracle-validated-1.0.0-4.el4.x86_64.rpm: V3 DSA signature: NOKEY, key ID b38a8516
error: Failed dependencies:
    /usr/lib/libc.so is needed by oracle-validated-1.0.0-4.el4.x86_64
    control-center is needed by oracle-validated-1.0.0-4.el4.x86_64
    fontconfig >= 2.2.3-7.0.1 is needed by oracle-validated-1.0.0-4.el4.x86_64
    gnome-libs is needed by oracle-validated-1.0.0-4.el4.x86_64
    libdb.so.3()(64bit) is needed by oracle-validated-1.0.0-4.el4.x86_64
    libstdc++.so.5()(64bit) is needed by oracle-validated-1.0.0-4.el4.x86_64
    xscreensaver is needed by oracle-validated-1.0.0-4.el4.x86_64
Suggested resolutions:
    compat-db-4.1.25-9.x86_64.rpm
    compat-libstdc++-33-3.2.3-47.3.x86_64.rpm
    control-center-2.8.0-12.rhel4.5.x86_64.rpm
    gnome-libs-1.4.1.2.90-44.1.x86_64.rpm
    xscreensaver-4.18-5.rhel4.11.x86_64.rpm
So I chased down what was recommended and installed those RPMs as well:
# cat > /tmp/list
compat-db-4.1.25-9.x86_64.rpm
compat-libstdc++-33-3.2.3-47.3.x86_64.rpm
control-center-2.8.0-12.rhel4.5.x86_64.rpm
gnome-libs-1.4.1.2.90-44.1.x86_64.rpm
xscreensaver-4.18-5.rhel4.11.x86_64.rpm
# rpm -ivh --nodeps `cat /tmp/list | xargs echo`
warning: compat-db-4.1.25-9.x86_64.rpm: V3 DSA signature: NOKEY, key ID db42a60e
Preparing... ########################################### [100%]
1:xscreensaver ########################################### [ 20%]
2:compat-db ########################################### [ 40%]
3:compat-libstdc++-33 ########################################### [ 60%]
4:control-center ########################################### [ 80%]
5:gnome-libs ########################################### [100%]
At this point I thought I was ready to install Oracle Clusterware (CRS). Well, the install failed miserably during the linking phase and, shame on me, I didn’t save any of the logs or screen shots. No matter, I can still make a good blog entry out of this, so long as you are willing to believe that the linking phase of the CRS install failed. I decided to clean up the botched installation and use my normal installation process.
The approach I generally take is to look at a comparable Oracle Validated Configuration and pull the list of RPMs specified. And that is precisely what I did:
$ cat > /tmp/list
binutils-2.15.92.0.2-21.x86_64.rpm
compat-db-4.1.25-9.x86_64.rpm
compat-libstdc++-33-3.2.3-47.3.x86_64.rpm
control-center-2.8.0-12.rhel4.5.x86_64.rpm
gcc-3.4.6-3.x86_64.rpm
gcc-c++-3.4.6-3.x86_64.rpm
glibc-2.3.4-2.25.i686.rpm
glibc-2.3.4-2.25.x86_64.rpm
glibc-devel-2.3.4-2.25.i386.rpm
glibc-devel-2.3.4-2.25.x86_64.rpm
glibc-common-2.3.4-2.25.x86_64.rpm
glibc-headers-2.3.4-2.25.x86_64.rpm
glibc-kernheaders-2.4-9.1.98.EL.x86_64.rpm
gnome-libs-1.4.1.2.90-44.1.x86_64.rpm
libgcc-3.4.6-3.x86_64.rpm
libstdc++-3.4.6-3.x86_64.rpm
libstdc++-devel-3.4.6-3.x86_64.rpm
libaio-0.3.105-2.x86_64.rpm
make-3.80-6.EL4.x86_64.rpm
pdksh-5.2.14-30.3.x86_64.rpm
sysstat-5.0.5-11.rhel4.x86_64.rpm
xorg-x11-deprecated-libs-6.8.2-1.EL.13.36.x86_64.rpm
xorg-x11-deprecated-libs-6.8.2-1.EL.13.36.i386.rpm
xscreensaver-4.18-5.rhel4.11.x86_64.rpm
# rpm -ivh --nodeps `cat /tmp/list | xargs echo`
[…output deleted…]
# rpm -ivh oracle-validated-1.0.0-4.el4.x86_64.rpm
warning: oracle-validated-1.0.0-4.el4.x86_64.rpm: V3 DSA signature: NOKEY, key ID b38a8516
error: Failed dependencies:
    /usr/lib/libc.so is needed by oracle-validated-1.0.0-4.el4.x86_64
    fontconfig >= 2.2.3-7.0.1 is needed by oracle-validated-1.0.0-4.el4.x86_64
# rpm -ivh fontconfig-2.2.3-7.x86_64.rpm
warning: fontconfig-2.2.3-7.x86_64.rpm: V3 DSA signature: NOKEY, key ID db42a60e
Preparing... ########################################### [100%]
package fontconfig-2.2.3-7 is already installed
So, once again I should have been ready to go. That was not the case. I got the following stream of error output when I ran vipca:
# sh ./vipca
PRKH-1010 : Unable to communicate with CRS services.
[PRKH-1000 : Unable to load the SRVM HAS shared library
[PRKN-1008 : Unable to load the shared library "srvmhas10" or a dependent library, from
LD_LIBRARY_PATH="/opt/oracle/crs/jdk/jre/lib/i386/client:/opt/oracle/crs/jdk/jre/lib/i386:/opt/oracle/crs/jdk/jre/../lib/i386:/opt/oracle/crs/lib32:/opt/oracle/crs/srvm/lib32:/opt/oracle/crs/lib:/opt/oracle/crs/srvm/lib:"
[java.lang.UnsatisfiedLinkError: /opt/oracle/crs/lib32/libsrvmhas10.so: libclntsh.so.10.1: cannot open shared object file: No such file or directory]]]
PRKH-1010 : Unable to communicate with CRS services.
[PRKH-1000 : Unable to load the SRVM HAS shared library
[PRKN-1008 : Unable to load the shared library "srvmhas10" or a dependent library, from
LD_LIBRARY_PATH="/opt/oracle/crs/jdk/jre/lib/i386/client:/opt/oracle/crs/jdk/jre/lib/i386:/opt/oracle/crs/jdk/jre/../lib/i386:/opt/oracle/crs/lib32:/opt/oracle/crs/srvm/lib32:/opt/oracle/crs/lib:/opt/oracle/crs/srvm/lib:"
[java.lang.UnsatisfiedLinkError: /opt/oracle/crs/lib32/libsrvmhas10.so: libclntsh.so.10.1: cannot open shared object file: No such file or directory]]]
PRKH-1010 : Unable to communicate with CRS services.
[PRKH-1000 : Unable to load the SRVM HAS shared library
[PRKN-1008 : Unable to load the shared library "srvmhas10" or a dependent library, from
LD_LIBRARY_PATH="/opt/oracle/crs/jdk/jre/lib/i386/client:/opt/oracle/crs/jdk/jre/lib/i386:/opt/oracle/crs/jdk/jre/../lib/i386:/opt/oracle/crs/lib32:/opt/oracle/crs/srvm/lib32:/opt/oracle/crs/lib:/opt/oracle/crs/srvm/lib:"
[java.lang.UnsatisfiedLinkError: /opt/oracle/crs/lib32/libsrvmhas10.so: libclntsh.so.10.1:
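For anyone chasing a similar UnsatisfiedLinkError, here is a minimal sketch of one way to see which directories in that LD_LIBRARY_PATH actually hold the library the loader complained about. The function name find_lib_in_path is made up for this example and is not part of any Oracle tooling. It only checks that a file with that name exists; it does not tell you whether the copy it finds is a 32-bit or a 64-bit build.

```shell
#!/bin/sh
# find_lib_in_path LIBNAME PATHLIST   (hypothetical helper, not an Oracle tool)
# Prints every directory in the colon-separated PATHLIST that contains a
# file named LIBNAME. Returns 0 if at least one was found, 1 otherwise.
find_lib_in_path() {
    lib=$1
    found=1
    old_ifs=$IFS
    IFS=:                      # split the path list on colons
    for dir in $2; do
        [ -n "$dir" ] || continue          # skip empty path entries
        if [ -e "$dir/$lib" ]; then
            printf '%s\n' "$dir"
            found=0
        fi
    done
    IFS=$old_ifs
    return $found
}
```

Since vipca builds its own LD_LIBRARY_PATH, the path list would be pasted from the error message, for example: find_lib_in_path libclntsh.so.10.1 "/opt/oracle/crs/lib32:/opt/oracle/crs/lib". No output and a non-zero return mean the library is in none of the listed directories.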
Egad! What I’m about to tell you should prove beyond the shadow of a doubt that I am not a DBA! I don’t need VIPs or gsd or srvctl for that matter. My test harnesses do not use any of that, or at least the test harness I was intending to use for this specific battery of testing does not. So, I ignored the vipca problem and figured I’d just install the database and get to work. I’ve never before seen what I’m about to show you.
During the installation of the database, OUI detected all nodes of the cluster so I thought I might be able to sneak this one through. However, during that same installation, OUI presented a dialogue for database upgrades. Now, this is seriously odd since this was a freshly built cluster. There were no other databases installed, but that isn’t the most peculiar aspect of this dialogue. Check it out:
That is strangely beyond strange! I’m not sure if you can tell what that is but the cells in the dialogue are populated with error output (PRKH-* and LD_LIBRARY_PATH, etc). Wow, that was crazy. What I’m about to tell you should be proof positive that not only am I not a DBA, I do things that are only done by someone who has no idea what a DBA is! I needed the database installed and I didn’t have any databases to upgrade, so I thought I’d ignore this database upgrade mess and push on. Bad idea.
Of course the install was a total failure. I spent some time figuring out what was going on and then it dawned on me. I forgot to install the 32-bit glibc-devel RPM. After doing so I cleaned up the mess, walked through the install again and voilà, no problems.
The Moral of the Story
Don’t forget the required 32-bit library when installing 10g on x86_64 Linux, and pay particular attention to the fact that neither the Validated Configuration list of RPMs nor the oracle-validated-1.0.0-4.el4.x86_64.rpm package made any mention of this library.
I hope it helps someone, someday. Nonetheless, that was a weird screen shot, wasn’t it?
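A quick pre-install check along these lines might have saved me the trouble. This is only a sketch: missing_arches is a name invented for this example, and it reads "name.arch" lines on standard input rather than calling rpm itself, so which packages and architectures to ask about is left to you.

```shell
#!/bin/sh
# missing_arches PKG ARCH...   (hypothetical helper)
# Reads "name.arch" lines on stdin, one per installed package, and prints
# each requested ARCH for which PKG is absent. Returns 1 if any is missing.
missing_arches() {
    pkg=$1
    shift
    installed=$(cat)           # slurp the package list from stdin
    status=0
    for arch in "$@"; do
        # -F fixed string, -x whole-line match, -q quiet
        if ! printf '%s\n' "$installed" | grep -Fqx "$pkg.$arch"; then
            printf '%s\n' "$arch"
            status=1
        fi
    done
    return $status
}
```

On a real system the input would come from the RPM database, for example: rpm -qa --queryformat '%{NAME}.%{ARCH}\n' | missing_arches glibc-devel i386 x86_64. Keeping the rpm call outside the function means the check can also be run against a saved package list from another node.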
This has bitten me more than once. Oracle should detect that the 32-bit library is missing and let the installer know. I’ve installed Oracle 10g at least 20 times on Linux and Solaris this year and each time it was a PAINFUL experience.
I’m a 19-year Unix admin/developer… Oracle should take note of its competitors’ install process. SQLserver is a snap to install and is much easier to manage. From my latest Quest Benchmark Factory benchmarks comparing SQLserver vs Oracle/Linux, SQLserver is the clear winner.
Wow!
Your story reminds me of my first attempts at installing a 10.1 database on 64-bit AMD with SUSE SLES 9. It drove me crazy. At that time validated configurations did not exist yet and I followed only the installation manual. The problem was the same: I hadn’t installed the 32-bit versions of some libraries. I think it would be a good thing if Oracle Enterprise Linux were installed with all the packages required for an Oracle RAC installation.
Thanks for your publications, Kevin.
At least ‘Uses ASM’ came out as N/A!
I’ve always objected to the prerequisite checks in OUI, since on other occasions I’ve also found that they checked some but not all prerequisites. If you are going to code a check that validates your install, you had better make sure it really does.
cheers
Niall
Quite frankly, install issues have plagued Oracle deployments for years. Even with diligent research, thorough planning and documentation, the best laid plans……&c.
I would venture to postulate that the most successful implementations (in recent memory that is), have been on SuSE (SLES 9 and above) and most likely due to the OraRun rpm which I believe is written or at least maintained by Arun Singh at SuSE.
Thanks for the blog post Kevin, always entertaining, and illuminating as well!
It’s great to know that others forget RPMs sometimes like I do. I wrote some cfengine scripts for new Oracle server builds that set kernel parameters, setup rawdevices, install RPMs, create oracle and dba user/groups, etc. They have saved me countless times, but the DBAs still give me a hard time if I forget one of their precious 32-bit RPMs…
Thanks for the entertaining article.
Nice to know I am not a complete jerk,
I can’t believe that in 10 years of this ‘beautiful’ Java installer Oracle has made little or no improvement to the user experience of installing Oracle products. It’s just a pain, and it is a shame.
With all their customers and money they should really do something to keep bugs away from the OUI. The 7.x text installer was fast, at least.
Aloha!
Claudio
PS: Although certified on paper, do not install RAC 10gR2 on RedHat5: pain in the pain, you have to fake OUI.
How can it be certified since the OUI won’t even start because it is not in the list?
===============================================================
Red Hat Enterprise AS/ES 5 10gR2 64-bit Oracle Clusterware 10g Certified None None None None
================================================================