The Most Horribly Botched Oracle Database 10g Real Application Clusters Install Attempt.

Now don’t get me wrong. What I’m blogging about is not really an Oracle Database 10g Real Application Clusters (RAC) problem. All of the problems I will mention in this entry were clearly related to a botched configuration at the OS level. Since Oracle10g will be around for a long time, I suspect that some day someone else might run into this sort of problem and I aim to make their lives easier. That said, there could be some goodies in here for all the regular readers of this blog.

The Scenario
I took the brand new 4-node RAC cluster I have in the lab and set out to see how much the oracle-validated-1.0.0-4.el4.x86_64.rpm package helps in preparing a system for a RAC install. I was excited since this was the first 2-socket Xeon “Clovertown” 5355-based cluster I’ve had the chance to test. These processors are murderously fast so I was champing at the bit to give them a whirl.

The systems were loaded with RHEL4 U4 x86_64. I tried to install the oracle-validated-1.0.0-4.el4.x86_64.rpm package and here is what it returned:

# rpm -ivh oracle-validated-1.0.0-4.el4.x86_64.rpm
warning: oracle-validated-1.0.0-4.el4.x86_64.rpm: V3 DSA signature: NOKEY, key ID b38a8516
error: Failed dependencies:
        /usr/lib/libc.so is needed by oracle-validated-1.0.0-4.el4.x86_64
        control-center is needed by oracle-validated-1.0.0-4.el4.x86_64
        fontconfig >= 2.2.3-7.0.1 is needed by oracle-validated-1.0.0-4.el4.x86_64
        gnome-libs is needed by oracle-validated-1.0.0-4.el4.x86_64
        libdb.so.3()(64bit) is needed by oracle-validated-1.0.0-4.el4.x86_64
        libstdc++.so.5()(64bit) is needed by oracle-validated-1.0.0-4.el4.x86_64
        xscreensaver is needed by oracle-validated-1.0.0-4.el4.x86_64
    Suggested resolutions:
        compat-db-4.1.25-9.x86_64.rpm
        compat-libstdc++-33-3.2.3-47.3.x86_64.rpm
        control-center-2.8.0-12.rhel4.5.x86_64.rpm
        gnome-libs-1.4.1.2.90-44.1.x86_64.rpm
        xscreensaver-4.18-5.rhel4.11.x86_64.rpm

So I chased down what was recommended and installed those RPMs as well:

 # cat > /tmp/list
        compat-db-4.1.25-9.x86_64.rpm
        compat-libstdc++-33-3.2.3-47.3.x86_64.rpm
        control-center-2.8.0-12.rhel4.5.x86_64.rpm
        gnome-libs-1.4.1.2.90-44.1.x86_64.rpm
        xscreensaver-4.18-5.rhel4.11.x86_64.rpm

# rpm -ivh --nodeps `cat /tmp/list | xargs echo`
warning: compat-db-4.1.25-9.x86_64.rpm: V3 DSA signature: NOKEY, key ID db42a60e
Preparing...                ########################################### [100%]
   1:xscreensaver           ########################################### [ 20%]
   2:compat-db              ########################################### [ 40%]
   3:compat-libstdc++-33    ########################################### [ 60%]
   4:control-center         ########################################### [ 80%]
   5:gnome-libs             ########################################### [100%]

At this point I thought I was ready to install Oracle Clusterware (CRS). Well, the install failed miserably during the linking phase and, shame on me, I didn’t save any of the logs or screenshots. No matter, I can still make a good blog entry out of this, so long as you are willing to believe that the linking phase of the CRS install failed. I decided to clean up the botched installation and use my normal installation process.

The approach I generally take is to look at a comparable Oracle Validated Configuration and pull the list of RPMs specified. And that is precisely what I did:

$ cat > /tmp/list
binutils-2.15.92.0.2-21.x86_64.rpm
        compat-db-4.1.25-9.x86_64.rpm
        compat-libstdc++-33-3.2.3-47.3.x86_64.rpm
        control-center-2.8.0-12.rhel4.5.x86_64.rpm
        gcc-3.4.6-3.x86_64.rpm
        gcc-c++-3.4.6-3.x86_64.rpm
        glibc-2.3.4-2.25.i686.rpm
        glibc-2.3.4-2.25.x86_64.rpm
        glibc-devel-2.3.4-2.25.i386.rpm
        glibc-devel-2.3.4-2.25.x86_64.rpm
        glibc-common-2.3.4-2.25.x86_64.rpm
        glibc-headers-2.3.4-2.25.x86_64.rpm
        glibc-kernheaders-2.4-9.1.98.EL.x86_64.rpm
        gnome-libs-1.4.1.2.90-44.1.x86_64.rpm
        libgcc-3.4.6-3.x86_64.rpm
        libstdc++-3.4.6-3.x86_64.rpm
        libstdc++-devel-3.4.6-3.x86_64.rpm
        libaio-0.3.105-2.x86_64.rpm
        make-3.80-6.EL4.x86_64.rpm
        pdksh-5.2.14-30.3.x86_64.rpm
        sysstat-5.0.5-11.rhel4.x86_64.rpm
        xorg-x11-deprecated-libs-6.8.2-1.EL.13.36.x86_64.rpm
        xorg-x11-deprecated-libs-6.8.2-1.EL.13.36.i386.rpm
        xscreensaver-4.18-5.rhel4.11.x86_64.rpm

# rpm -ivh --nodeps `cat /tmp/list | xargs echo`

[…output deleted…]
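A list like this is easy to misread when some packages are needed in both 32-bit and 64-bit builds. The following is a minimal sketch, not part of the original session: it reads a file of RPM file names (such as /tmp/list) and prints the architectures each package name appears with. The function name list_arches is made up for illustration, and the parsing assumes every file name follows the usual name-version-release.arch.rpm pattern.

```shell
#!/bin/sh
# Hypothetical helper: summarize the architectures present for each
# package in a file of RPM file names (one per line, indentation allowed).
# Assumes names look like name-version-release.arch.rpm.
list_arches() {
    awk '
        $1 ~ /\.rpm$/ {
            file = $1
            sub(/\.rpm$/, "", file)          # strip the .rpm suffix
            arch = file
            sub(/.*\./, "", arch)            # keep the text after the last dot
            name = file
            sub(/\.[^.]*$/, "", name)        # drop the .arch suffix
            sub(/-[^-]*-[^-]*$/, "", name)   # drop -version-release
            if (name in arches)
                arches[name] = arches[name] "," arch
            else
                arches[name] = arch
        }
        END { for (n in arches) print n, arches[n] }
    ' "$1" | sort
}
```

Called as list_arches /tmp/list, it prints one line per package name followed by a comma-separated list of architectures, so a package that shows up only as x86_64 stands out at a glance.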

# rpm -ivh oracle-validated-1.0.0-4.el4.x86_64.rpm
warning: oracle-validated-1.0.0-4.el4.x86_64.rpm: V3 DSA signature: NOKEY, key ID b38a8516
error: Failed dependencies:
        /usr/lib/libc.so is needed by oracle-validated-1.0.0-4.el4.x86_64
        fontconfig >= 2.2.3-7.0.1 is needed by oracle-validated-1.0.0-4.el4.x86_64

# rpm -ivh fontconfig-2.2.3-7.x86_64.rpm
warning: fontconfig-2.2.3-7.x86_64.rpm: V3 DSA signature: NOKEY, key ID db42a60e
Preparing...                ########################################### [100%]
        package fontconfig-2.2.3-7 is already installed
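
The two dependencies still failing here are a file path and a versioned capability rather than plain package names. A hedged sketch of how one might ask the RPM database which installed package, if any, provides each of them follows. It relies only on rpm -q --whatprovides; the helper name who_provides is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: for each capability or file path given, report
# which installed package provides it according to the RPM database.
who_provides() {
    for cap in "$@"; do
        # rpm -q exits non-zero when no installed package provides it
        if owner=$(rpm -q --whatprovides "$cap" 2>/dev/null); then
            echo "$cap: $owner"
        else
            echo "$cap: no installed package provides this"
        fi
    done
}
```

For the output above, the call would be who_provides /usr/lib/libc.so fontconfig. A path that nothing provides points at a package that was never installed, while a capability that is provided but still fails the dependency check points at a version mismatch.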

So, once again I should have been ready to go. That was not the case. I got the following stream of error output when I ran vipca:

# sh ./vipca
PRKH-1010 : Unable to communicate with CRS services.
  [PRKH-1000 : Unable to load the SRVM HAS shared library
  [PRKN-1008 : Unable to load the shared library "srvmhas10"
  or a dependent library, from
  LD_LIBRARY_PATH="/opt/oracle/crs/jdk/jre/lib/i386/client:/opt/oracle/crs/jdk/jre/lib/i386:/opt/oracle/crs/jdk/jre/../lib/i386:/opt/oracle/crs/lib32:/opt/oracle/crs/srvm/lib32:/opt/oracle/crs/lib:/opt/oracle/crs/srvm/lib:"
  [java.lang.UnsatisfiedLinkError: /opt/oracle/crs/lib32/libsrvmhas10.so: libclntsh.so.10.1: cannot open shared object file: No such file or directory]]]
PRKH-1010 : Unable to communicate with CRS services.
  [PRKH-1000 : Unable to load the SRVM HAS shared library
  [PRKN-1008 : Unable to load the shared library "srvmhas10"
  or a dependent library, from
  LD_LIBRARY_PATH="/opt/oracle/crs/jdk/jre/lib/i386/client:/opt/oracle/crs/jdk/jre/lib/i386:/opt/oracle/crs/jdk/jre/../lib/i386:/opt/oracle/crs/lib32:/opt/oracle/crs/srvm/lib32:/opt/oracle/crs/lib:/opt/oracle/crs/srvm/lib:"
  [java.lang.UnsatisfiedLinkError: /opt/oracle/crs/lib32/libsrvmhas10.so: libclntsh.so.10.1: cannot open shared object file: No such file or directory]]]
PRKH-1010 : Unable to communicate with CRS services.
  [PRKH-1000 : Unable to load the SRVM HAS shared library
  [PRKN-1008 : Unable to load the shared library "srvmhas10"
  or a dependent library, from
  LD_LIBRARY_PATH="/opt/oracle/crs/jdk/jre/lib/i386/client:/opt/oracle/crs/jdk/jre/lib/i386:/opt/oracle/crs/jdk/jre/../lib/i386:/opt/oracle/crs/lib32:/opt/oracle/crs/srvm/lib32:/opt/oracle/crs/lib:/opt/oracle/crs/srvm/lib:"
  [java.lang.UnsatisfiedLinkError: /opt/oracle/crs/lib32/libsrvmhas10.so: libclntsh.so.10.1:
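
The UnsatisfiedLinkError says the 32-bit libsrvmhas10.so could not find libclntsh.so.10.1 anywhere on the library path. One way to confirm which dependencies the dynamic loader cannot resolve is to run ldd against the library and pick out the "not found" entries. This is only a sketch: the function name missing_libs is hypothetical, and it assumes ldd on the system can inspect a 32-bit object.

```shell
#!/bin/sh
# Hypothetical helper: list the shared-library dependencies of a binary
# or library that the dynamic loader reports as "not found".
missing_libs() {
    # ldd prints lines such as "libfoo.so.1 => not found" for unresolved
    # dependencies; the first field is the library name.
    ldd "$1" 2>/dev/null | awk '/not found/ { print $1 }'
}
```

With the paths from the error above, the call would be missing_libs /opt/oracle/crs/lib32/libsrvmhas10.so, run with LD_LIBRARY_PATH set the same way vipca sets it. An empty result would mean the loader can resolve everything in the current environment.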

Egad! What I’m about to tell you should prove beyond the shadow of a doubt that I am not a DBA! I don’t need VIPs or gsd or srvctl for that matter. My test harnesses do not use any of that, at least not the test harness I was intending to use for this specific battery of testing. So, I ignored the vipca problem and figured I’d just install the database and get to work. I’ve never before seen what I’m about to show you.

During the installation of the database, OUI detected all nodes of the cluster so I thought I might be able to sneak this one through. However, partway through the installation, OUI presented a dialogue for database updates. Now, this is seriously odd since this was a freshly built cluster. There were no other databases installed, but that isn’t the most peculiar aspect of this dialogue. Check it out:

[Screenshot: botched.jpg]

That is strangely beyond strange! I’m not sure if you can tell what that is but the cells in the dialogue are populated with error output (PRKH-* and LD_LIBRARY_PATH, etc). Wow, that was crazy. What I’m about to tell you should be proof positive that not only am I not a DBA, I do things that are only done by someone who has no idea what a DBA is! I needed the database installed and I didn’t have any databases to upgrade, so I thought I’d ignore this database upgrade mess and push on. Bad idea.

Of course the install was a total failure. I spent some time figuring out what was going on and then it dawned on me. I forgot to install the 32-bit glibc-devel RPM. Once I installed it, I cleaned up the mess, walked through the install again and voilà, no problems.

The Moral of the Story
Don’t forget the required 32-bit library when installing 10g on x86_64 Linux, and pay particular attention to the fact that neither the Validated Configuration list of RPMs nor the oracle-validated-1.0.0-4.el4.x86_64.rpm package made any mention of this library.
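
A small pre-flight check can catch this before OUI ever starts. The sketch below is an assumption-laden illustration rather than anything Oracle ships: it asks rpm for the architecture of every installed glibc-devel instance and complains if either the i386 or the x86_64 build is absent. The function names has_arch and preflight are hypothetical.

```shell
#!/bin/sh
# Hypothetical pre-flight check: verify that both the 32-bit and 64-bit
# glibc-devel packages are installed before starting an Oracle install.

# has_arch PACKAGE ARCH: succeed if PACKAGE is installed for ARCH.
has_arch() {
    # --qf '%{ARCH}\n' prints one architecture per installed instance
    rpm -q --qf '%{ARCH}\n' "$1" 2>/dev/null | grep -qx "$2"
}

# preflight: report on each architecture; return non-zero if one is missing.
preflight() {
    rc=0
    for arch in i386 x86_64; do
        if has_arch glibc-devel "$arch"; then
            echo "glibc-devel ($arch): installed"
        else
            echo "glibc-devel ($arch): MISSING"
            rc=1
        fi
    done
    return $rc
}
```

Running preflight on each node before the install gives a one-line answer per architecture, and the non-zero return code makes it easy to wire into a larger build script.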

I hope it helps someone, someday. Nonetheless, that was a weird screenshot, wasn’t it?

6 Responses to “The Most Horribly Botched Oracle Database 10g Real Application Clusters Install Attempt.”


  1. Dan August 30, 2007 at 2:42 am

    This has bitten me more than once. Oracle should detect that the 32-bit library is missing and let the installer know. I’ve installed Oracle 10g at least 20 times on Linux and Solaris this year and each time it was a PAINFUL experience.

    I’m a 19-year Unix admin/developer… Oracle should take note of its competitors’ install process. SQL Server is a snap to install and is much easier to manage. From my latest Quest Benchmark Factory benchmarks comparing SQL Server vs Oracle/Linux, SQL Server is the clear winner.

  2. cristiancudizio August 30, 2007 at 7:08 am

    Wow!
    Your story reminds me of my first attempts at installing the 10.1 database on 64-bit AMD with SUSE SLES 9. It drove me crazy. At that time validated configurations did not exist yet and I followed only the installation manual. The problem was the same: I hadn’t installed the 32-bit version of some libraries. I think it would be a good thing if Oracle Enterprise Linux were installed with all the packages required for an Oracle RAC installation.
    Thanks for your publications, Kevin.

  3. Niall Litchfield August 30, 2007 at 8:30 am

    At least ‘Uses ASM’ came out as N/A!

    I’ve always objected to the pre-requisite checks in oui – since on other occasions I’ve also found that they checked some but not all pre-requisites. If you are going to code a check that validates your install, you better make sure it really does.

    cheers

    Niall

  4. Marc Handelman September 4, 2007 at 12:58 am

    Quite frankly, install issues have plagued Oracle deployments for years. Even with diligent research, thorough planning and documentation, the best laid plans……&c.

    I would venture to postulate that the most successful implementations (in recent memory that is), have been on SuSE (SLES 9 and above) and most likely due to the OraRun rpm which I believe is written or at least maintained by Arun Singh at SuSE.

    Thanks for the blog post Kevin, always entertaining, and illuminating as well!

  5. Luke Youngblood March 15, 2008 at 1:08 am

    It’s great to know that others forget RPMs sometimes like I do. I wrote some cfengine scripts for new Oracle server builds that set kernel parameters, setup rawdevices, install RPMs, create oracle and dba user/groups, etc. They have saved me countless times, but the DBAs still give me a hard time if I forget one of their precious 32-bit RPMs…

    Thanks for the entertaining article.

  6. Claudio September 24, 2008 at 9:03 am

    Nice to know I am not a complete jerk,
    I can’t believe that Oracle, in 10 years of this ‘beautiful’ Java installer, made little or no improvement to the user experience of installing Oracle products. It’s just a pain, and it is a shame.
    With all their customers and money they should really do something
    to keep bugs away from the OUI; the 7.x text installer was fast at least.

    Aloha!
    Claudio

    PS: Although certified on the paper, do not install RAC 10gr2 on RedHat5, pain in the pain, you have to fake OUI.
    How can it be certified since the OUI won’t even start because it is not in the list?
    ===============================================================
    Red Hat Enterprise AS/ES 5 10gR2 64-bit Oracle Clusterware 10g Certified None None None None
    ================================================================


DISCLAIMER

I work for Amazon Web Services but all of the words on this blog are purely my own. Not a single word on this blog is to be mistaken as originating from any Amazon spokesperson. You are reading my words on this webpage and, while I work at Amazon, all of the words on this webpage reflect my own opinions and findings. To put it another way, "I work at Amazon, but this is my own opinion." To conclude, this is not an official Amazon information outlet. There are no words on this blog that should be mistaken as official Amazon messaging. Every single character of text on this blog originates in my head and I am not an Amazon spokesperson.

Copyright

All content is © Kevin Closson and "Kevin Closson's Blog: Platforms, Databases, and Storage", 2006-2015. Unauthorized use and/or duplication of this material without express and written permission from this blog’s author and/or owner is strictly prohibited. Excerpts and links may be used, provided that full and clear credit is given to Kevin Closson and Kevin Closson's Blog: Platforms, Databases, and Storage with appropriate and specific direction to the original content.
