Don’t Install Oracle on Linux Servers With Long Kernel Names

Chalk this blog entry up under the “might help some poor Googler someday” column. This is a really weird Oracle installation error.

Lots of Clusters, Less Confusion
We have a lot of clusters here running Oracle on everything from Red Hat RHEL4 x86 and x86_64 to SuSE SLES9 x86 and x86_64. We also build clusters for certain test purposes such as analyzing how different kernels affect performance (thanks Carel-Jan), stability and so on. To keep things straight we generally build kernels and then name them with earmarks so that simple uname(1) output will tell us what the configuration is. For example, if we are running a test called “test3”, with the kernel built from the kernel-source-2.6.5-7.244.i586.rpm package, we might see the following when running the uname(1) command:

# uname -r
2.6.5-7.244-default-test3-244-0003-support

That is a long name for a kernel, but who should care? The manpage for the uname(2) call on Linux defines the arrays returned by the call as being unspecified in length:

The utsname struct is defined in <sys/utsname.h>:
struct utsname {
char sysname[];
char nodename[];
char release[];
char version[];
char machine[];
#ifdef _GNU_SOURCE
char domainname[];
#endif
};

The length of the arrays in a struct utsname is unspecified; the fields are NUL-terminated.

[Blog Correction: Before updating this page I had erroneously pointed out that NUL was a misspelling. I was wrong. See the comment stream below.]
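
To see what a given system actually hands back, a small sketch along these lines can help. The function names (release_length, report_release) are made up for illustration; uname(2) and strlen(3) are the only real interfaces involved. On Linux with glibc each field happens to be a 65-byte array, but as the manpage says, portable code should not lean on any particular number.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/utsname.h>

/* Length of the running kernel's release string (what `uname -r`
 * prints), or -1 if uname(2) fails. Hypothetical helper name. */
long release_length(void)
{
    struct utsname u;

    if (uname(&u) != 0)
        return -1;
    return (long)strlen(u.release);
}

/* Print the release string, its length, and the size of the field
 * on this platform. Returns the fprintf(3) result, or -1 on error. */
int report_release(FILE *out)
{
    struct utsname u;

    assert(out != NULL);
    if (uname(&u) != 0)
        return -1;
    return fprintf(out, "release=\"%s\" length=%zu field_size=%zu\n",
                   u.release, strlen(u.release), sizeof u.release);
}
```

Calling report_release(stdout) on the box in question shows at a glance whether the release string is longer than whatever fixed buffer some program might be assuming.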

What Does This Have to Do With Oracle?
Installation! We were trying to install Oracle10g Release 2 version 10.2.0.1 on SuSE SLES9 U3 x86 and ran into the following:

$ ./runInstaller
*** glibc detected *** free(): invalid next size (fast): 0x0807aa80 ***
*** glibc detected *** free(): invalid next size (fast): 0x0807ab00 ***
*** glibc detected *** free(): invalid next size (fast): 0x0807ab28 ***
*** glibc detected *** free(): invalid next size (fast): 0x0807ab50 ***
[…much error text deleted…]
*** glibc detected *** free(): invalid next size (fast): 0x0807ab70 ***
*** glibc detected *** free(): invalid next size (fast): 0x0807ab98 ***
*** glibc detected *** free(): invalid next size (fast): 0x0807abc0 ***
*** glibc detected *** free(): invalid next size (normal): 0x0807ad88 ***
*** glibc detected *** free(): invalid next size (normal): 0x0807aef0 ***
./runInstaller: line 63: 11294 Segmentation fault $CMDDIR/install/.oui $* -formCluster

What? The runInstaller script executed .oui, which in turn suffered a segmentation fault. Investigating .oui with ltrace(1) made it clear that .oui mallocs 30 bytes and then calls uname(2). In our case, the release[] array returned to .oui by the uname(2) call was a bit large: our kernel name (2.6.5-7.244-default-test3-244-0003-support) is 42 characters, 43 bytes counting the terminating NUL. In spite of the fact that the uname(2) manpage says the size of the release[] array is unspecified, .oui presumes it will fit in 30 bytes. The strcpy(P) call that followed tried to stuff our long kernel name into the 30-byte space at 0x8075940, overrunning the allocation and trampling the heap bookkeeping next to it, which is consistent with the glibc free() complaints above. The end result was a segmentation fault:

malloc(1024) = 0x8073118
malloc(1024) = 0x8073520
malloc(8192) = 0x8073928
malloc(8) = 0x8075930
malloc(30) = 0x8075940
uname(0xbfffba48 <unfinished ...>
SYS_uname(0xbfffba48) = 0
<... uname resumed> ) = 0
strcpy(0x8075940, "2.6.5-7.244-default-test3-244-00"...) = 0x8075940
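
We don't have the .oui source, so the following is only a sketch of what a safer copy could look like, not a description of Oracle's code. The helper names copy_bounded and copy_exact are hypothetical. The first keeps the fixed-size buffer but truncates instead of overrunning it; the second sizes the allocation from the string itself.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Copy src into a fixed-size buffer, truncating if it does not fit.
 * dst is always NUL-terminated on success.
 * Returns 1 if the string was truncated (or formatting failed),
 * 0 if it fit. */
int copy_bounded(char *dst, size_t dst_size, const char *src)
{
    int needed;

    assert(dst != NULL && src != NULL && dst_size > 0);
    needed = snprintf(dst, dst_size, "%s", src);
    return needed < 0 || (size_t)needed >= dst_size;
}

/* Allocate exactly as much as the string needs, so the length of the
 * source never has to be guessed. The caller frees the result.
 * Returns NULL if malloc(3) fails. */
char *copy_exact(const char *src)
{
    size_t n;
    char *dst;

    assert(src != NULL);
    n = strlen(src) + 1;          /* +1 for the terminating NUL */
    dst = malloc(n);
    if (dst != NULL)
        memcpy(dst, src, n);
    return dst;
}
```

With a 30-byte buffer, copy_bounded() would keep the first 29 characters of our kernel name and report the truncation, which the caller can then treat as an error or ignore. Either approach avoids writing past the end of the allocation.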

The Moral
Don’t use long kernel names on Oracle systems. And, oh, when a manpage says something is unspecified, that doesn’t necessarily mean 30.

13 Responses to “Don’t Install Oracle on Linux Servers With Long Kernel Names”


  1. Nigel Thomas April 19, 2007 at 11:46 am

    Trivia corner:

    The length of the arrays in a struct utsname is unspecified; the fields are NUL-terminated.

    “Let’s overlook the misspelling of NULL (there are a lot of typos in Linux manpages).”

    NUL (with one L) is not a typo – it’s the official ASCII symbol name for 0x00. See eg http://www.unicode.org/charts/PDF/U0000.pdf.

    (Now I’ll go get myself a life…)

  2. Hemant April 19, 2007 at 1:54 pm

    I wouldn’t fault Oracle for a normal assumption
    that unames are not longer than 30 characters.

    This is an example of two developers working independently — one in the Linux space, one in the Oracle space !
    Hemant

  3. kevinclosson April 19, 2007 at 2:41 pm

    Hemant,

    I’m not saying this is some hideous bug. On the other hand, there is nothing “normal” about assuming a documented array of unspecified size will be 30 bytes or less at runtime. Of all the arrays returned by this call, the release[] array is the one that historically has not had any sort of “convention” for contents.

  4. kevinclosson April 19, 2007 at 3:00 pm

    Yes, Nigel, you are right. I have fallen prey to common usage. I feel exonerated after having searched out the fact that both forms (NUL/NULL) share the same etymology–Latin nullus. 🙂 Not really…precision is important (thus the blog post about the 30 byte thing).

    We (re)learn something new every day. Having said that, I still hold fast that there are a LOT of typos in Linux manpages.

  5. Hemant April 20, 2007 at 9:24 am

    I only say that expecting uname to be 30 characters might
    be normal. After all, what is relevant to the Installer
    is the uname of the server. Whether underlying it is
    a bounded or unbounded array and whether Oracle should
    handle an unbounded array is taking it, probably, to too
    much detail.
    uname is commonly “not long” and 30 characters is “long enough”.
    What is the standard on non-Linux platforms ? Oracle
    would try to use a standard (or be platform independent).
    Hemant

  6. kevinclosson April 20, 2007 at 2:59 pm

    “I only say that expecting uname to be 30 characters might
    be normal. After all, what is relevant to the Installer
    is the uname of the server. Whether underlying it is
    a bounded or unbounded array and whether Oracle should
    handle an unbounded array is taking it, probably, to too
    much detail.”

    Hemant,

    We are talking about a very unimportant bug, I know, because very few people would compile a kernel with a very long kernel description field (utsname.release). However, you are completely missing the point. It is no small oversight to take the return from a system call that is documented as an unspecified length array and cram it into a 30-byte space. And for that matter, 30 is a really odd size to pull out of a hat since it isn’t even a power of 2. Regardless of how unimportant we all think it is, it is a bug. Further, you keep stating the problem is the “uname” of the server. That is not the case. It would be uname -r or, more precisely, the release[] array returned by the uname(2) call.

  7. Mark D Powell April 20, 2007 at 5:42 pm

    If you are going to access a string of undefined length but only intend to allocate 30 bytes to store the string it would seem reasonable to sub-string the value.

  8. kevinclosson April 20, 2007 at 9:19 pm

    Yes Mark, that would be reasonable.

    As an aside, I’m told by a fellow OakTable member that this is a known bug (6006775) but I can’t find it in metalink.

    Finally, I have said that this is a blog entry I’d hope to chalk up under the “help some poor googler someday” column. This isn’t rocket science.

  9. Brian Tkatch April 9, 2008 at 3:18 pm

    Kevin, i’m one of those poor googler. Thanx!

  10. RD October 15, 2008 at 6:02 pm

    I get the same error for even with short name. 2.6.18-8.el5 I guess it is short name.

    ./runInstaller -silent -responseFile /home/oracle/sw/11g/LINUX/database/response/custom1.rsp
    Starting Oracle Universal Installer…

    Checking Temp space: must be greater than 80 MB. Actual 866020 MB Passed
    Checking swap space: must be greater than 150 MB. Actual 1983 MB Passed
    Preparing to launch Oracle Universal Installer from /tmp/OraInstall2008-10-15_10-58-33AM. Please wait …./runInstaller: line 81: 6998 Segmentation fault $CMDDIR/install/.oui $*
    [oracle@p8-crmdb-standby database]$ vi runInstaller
    [oracle@p8-crmdb-standby database]$ uname -r
    2.6.18-8.el5

  11. RD October 15, 2008 at 6:38 pm

    I even changed the hostname it does not work:

    [oracle@dbstdby database]$ ./runInstaller -silent \
    > -responseFile $DISTRIB/response/custom1.rsp \
    > FROM_LOCATION=$DISTRIB/stage/products.xml \
    > ORACLE_BASE=/u01/app/oracle \
    > ORACLE_HOME=/u01/app/oracle/product/11g/db_1 \
    > ORACLE_HOME_NAME=11g
    Starting Oracle Universal Installer…

    Checking Temp space: must be greater than 80 MB. Actual 865127 MB Passed
    Checking swap space: must be greater than 150 MB. Actual 1983 MB Passed
    Preparing to launch Oracle Universal Installer from /tmp/OraInstall2008-10-15_11-37-34AM. Please wait …./runInstaller: line 81: 3127 Segmentation fault $CMDDIR/install/.oui $*
    [oracle@dbstdby database]$

  12. mulyana June 16, 2010 at 4:46 am

    I found some error with my installation oracle 10g on centOS 5.4, when 62% install oracle : Error in invokin target ‘ntcontab.o’ makefile ‘/home/oracle/oracle/product/10.2.0/db_4/network/lib/ins_net_client.mk’ See ‘/home/oracle/oraInventory/logs/installActions2010-06-16_07-19-27PM.log’ for details. Click ‘Help’, Retry’,’Ignore’,’Cancel’.

    How problem solve with my installation, thank’s for your attention.

    Best regards

    A. Mulyana

  13. Aaron Cirillo July 17, 2013 at 12:50 pm

    So I know nobody has posted here in 3 years, but I just ran into this bug and came up with a fix you can use without depending on Oracle. What I did was to create my own uname() and put it in a shared library. I will try to outline what I have done here.

    1. Create a C file with the following code in it:

    #include <string.h>       /* strncpy */
    #include <sys/utsname.h>  /* struct utsname */

    /* Replacement uname(): reports "test" for every field it sets.
       nodename is left untouched. */
    int uname(struct utsname *name)
    {
        strncpy(name->sysname, "test", sizeof(name->sysname));
        strncpy(name->release, "test", sizeof(name->release));
        strncpy(name->version, "test", sizeof(name->version));
        strncpy(name->machine, "test", sizeof(name->machine));
        return 0;
    }

    2. Compile that code with the following command:

    gcc new_uname.c -o libuname.so -ldl -shared -fPIC -I.

    3. Make sure that programs you run are using your new library by setting LD_PRELOAD:

    export LD_PRELOAD=/path/to/new/libuname.so

    4. Test it out:

    aaron@amcirillo-linux ~/code/uname $ ldd `which uname`
    linux-vdso.so.1 (0x00007fff647ff000)
    /home/aaron/code/uname/libuname.so (0x00007f9371c2a000)
    libc.so.6 => /lib64/libc.so.6 (0x00007f9371883000)
    libdl.so.2 => /lib64/libdl.so.2 (0x00007f937167f000)
    /lib64/ld-linux-x86-64.so.2 (0x00007f9371e2c000)

    aaron@amcirillo-linux ~/code/uname $ uname
    test
    aaron@amcirillo-linux ~/code/uname $ uname -r
    test
    aaron@amcirillo-linux ~/code/uname $ uname -v
    test
    aaron@amcirillo-linux ~/code/uname $ uname -m
    test

    5. Go ahead and run the oracle installer again, but be sure to export LD_PRELOAD first so that it uses your custom uname()
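
A variation on the shim in comment 13, offered as a minimal and untested-against-Oracle sketch: rather than replacing every field with “test”, forward to the real uname() and only shorten the release string. It assumes a glibc system where dlsym(3) with RTLD_NEXT is available (which would also explain the -ldl in the compile command above). MAX_RELEASE_LEN and clamp_release are names invented here, and 29 is simply the most that fits in the 30-byte buffer seen in the ltrace output.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE               /* needed for RTLD_NEXT on glibc */
#endif
#include <assert.h>
#include <dlfcn.h>
#include <string.h>
#include <sys/utsname.h>

/* Assumption: 29 characters plus the NUL fit a 30-byte buffer. */
#define MAX_RELEASE_LEN 29

/* Cut the release string down to at most max_len characters. */
static void clamp_release(struct utsname *u, size_t max_len)
{
    assert(max_len < sizeof u->release);
    if (strlen(u->release) > max_len)
        u->release[max_len] = '\0';
}

/* Interposed uname(): call the next uname() in the lookup order
 * (normally libc's), then shorten only the release field. */
int uname(struct utsname *name)
{
    static int (*real_uname)(struct utsname *);
    int rc;

    if (real_uname == NULL)
        real_uname = (int (*)(struct utsname *))dlsym(RTLD_NEXT, "uname");
    if (real_uname == NULL)
        return -1;                /* could not locate the real uname() */

    rc = real_uname(name);
    if (rc == 0)
        clamp_release(name, MAX_RELEASE_LEN);
    return rc;
}
```

Built and preloaded the same way as in steps 2 and 3, this should leave uname -s, -v and -m reporting their real values while uname -r is cut to 29 characters. Whether the installer is happy with a truncated release string is a separate question.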


DISCLAIMER

I work for Amazon Web Services but all of the words on this blog are purely my own. Not a single word on this blog is to be mistaken as originating from any Amazon spokesperson. You are reading my words on this webpage and, while I work at Amazon, all of the words on this webpage reflect my own opinions and findings. To put it another way, "I work at Amazon, but this is my own opinion." To conclude, this is not an official Amazon information outlet. There are no words on this blog that should be mistaken as official Amazon messaging. Every single character of text on this blog originates in my head and I am not an Amazon spokesperson.


Copyright

All content is © Kevin Closson and "Kevin Closson's Blog: Platforms, Databases, and Storage", 2006-2015. Unauthorized use and/or duplication of this material without express and written permission from this blog’s author and/or owner is strictly prohibited. Excerpts and links may be used, provided that full and clear credit is given to Kevin Closson and Kevin Closson's Blog: Platforms, Databases, and Storage with appropriate and specific direction to the original content.