Out of Memory Errors With Oracle Database On RHEL 4.8 Or OEL 4.8? Freeing Memory Perhaps?

In my recent post about free versus reclaimable memory on Linux (Linux Free Memory, Is It Free Or Reclaimable?), I discussed a technique to quickly move all clean, buffered file system pages to the free list. If you are interested, please see that post for an explanation of why I would mess with free versus reclaimable pages. The following is a quick reminder of what /proc/sys/vm/drop_caches is used for:

To clear clean pages from the pagecache:

  • echo 1 > /proc/sys/vm/drop_caches

To free dentries and inodes (cached filesystem metadata):

  • echo 2 > /proc/sys/vm/drop_caches

To free pagecache, dentries and inodes:

  • echo 3 > /proc/sys/vm/drop_caches
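
The three modes above can be wrapped in a small helper. This is only a sketch, not something from the original experiment: the function names and the DROP_CACHES_FILE and MEMINFO_FILE override variables are hypothetical, added so the script can be dry-run against ordinary files.

```shell
#!/bin/sh
# Sketch only: wraps the three drop_caches modes listed above.
# DROP_CACHES_FILE and MEMINFO_FILE are hypothetical overrides for dry runs.
DROP_CACHES_FILE=${DROP_CACHES_FILE:-/proc/sys/vm/drop_caches}
MEMINFO_FILE=${MEMINFO_FILE:-/proc/meminfo}

# Print the value (in kB) of one /proc/meminfo field, e.g. MemFree or Cached.
meminfo_kb() {
    awk -v key="$1:" '$1 == key { print $2 }' "$MEMINFO_FILE"
}

# Flush dirty pages to disk, then ask the kernel to drop clean caches.
# Mode: 1 = pagecache, 2 = dentries and inodes, 3 = both.
drop_caches() {
    case "$1" in
        1|2|3) ;;
        *) echo "usage: drop_caches 1|2|3" >&2; return 2 ;;
    esac
    sync                              # drop_caches only frees clean pages
    echo "$1" > "$DROP_CACHES_FILE"   # requires root on a real system
}
```

Run as root, something like `meminfo_kb MemFree; drop_caches 3; meminfo_kb MemFree` should show the free list growing. Note that the space before `>` matters: the shell parses `echo 3>file` as a redirection of file descriptor 3, so nothing is written to the file.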

The drop_caches feature was mainlined in the Linux 2.6.16 kernel and thus has been around since about 2006. Writing to drop_caches frees only clean pages, so it should in no way impact dirty data. As such, this should be a totally benign operation. In my experience, it is. I do it all the time with Oracle Database instances running and even while transactions are running. My reasons are not ones a production Database Administrator would share, but I do it nonetheless. I was, therefore, very surprised to receive feedback from a reader who tried this experiment on an Enterprise Linux 4 system. I was also surprised to hear that drop_caches was even available on OEL 4.8 (or RHEL 4.8 for that matter) since it is 2.6.9 based. I have not touched a RHEL4 or EL4 system in quite some time, but it looks like drop_caches was back-ported from 2.6.16 to the RHEL4 kernel at some point. The reader was testing on OEL version 4.8. His comment went as follows:
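
Since a back-port means the kernel version alone does not say whether drop_caches exists, it may help to check both the version and the file itself. The following is a rough sketch under that assumption; both function names are hypothetical.

```shell
#!/bin/sh
# Succeeds if the kernel release string (default: `uname -r`) is at least
# 2.6.16, the mainline release that introduced drop_caches.
kernel_has_mainline_drop_caches() {
    printf '%s\n' "${1:-$(uname -r)}" |
        awk -F'[.-]' '{
            major = $1 + 0; minor = $2 + 0; patch = $3 + 0
            ok = (major > 2) || (major == 2 && (minor > 6 || (minor == 6 && patch >= 16)))
            exit !ok
        }'
}

# Succeeds if the drop_caches file exists, which also covers vendor
# kernels that received the feature through a back-port.
drop_caches_present() {
    [ -e "${1:-/proc/sys/vm/drop_caches}" ]
}
```

On a system where `kernel_has_mainline_drop_caches` fails but `drop_caches_present` succeeds, the feature was presumably back-ported by the vendor, which is the situation described for the RHEL4 kernel.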

Here’s an interesting outcome. On a 4.8 OEL system with about 4 databases running (non-prod). I did a sync, then echo 3>/proc/sys/vm/drop_caches. Looked at free and I got the expected result. However, all the DB’s then returned this error when connecting to them:

ERROR:
ORA-01034: ORACLE not available
ORA-27102: out of memory
Linux-x86_64 Error: 12: Cannot allocate memory
Additional information: 1
Additional information: 262149

Interesting, indeed! The reader (Mike) and I tried to investigate this a bit but ran into a wall since none of his OEL 4.8 systems had the strace utility installed. I cannot imagine what sort of memory allocation error this is, but suffice it to say that a memory allocation failure caused by freeing memory is quite an unexpected result.

If anyone has an OEL 4.8 (or RHEL 4.8) test system with Oracle running and would care to run strace for us, we’d appreciate it. Simply clear the page cache with the echo into drop_caches and see if you too suffer this connection-time error. If so, simply connect again under strace as follows:

$ strace -o /tmp/bizzare.txt -f sqlplus scott/tiger

I’d love to get my hands on such a bizarre strace file!
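
Once such a trace file exists, the failing system calls are the first thing to look at. A minimal sketch, assuming the usual strace output format in which a failed call ends in `= -1 ERRNO (message)`; the function names are hypothetical.

```shell
#!/bin/sh
# Print every line of an strace output file where a syscall returned an error.
failed_syscalls() {
    grep -E ' = -1 E[A-Z0-9]+ ' "$1"
}

# Count the failures for one errno name, such as ENOMEM or EACCES.
count_errno() {
    grep -c -E " = -1 $2 " "$1" || true   # grep -c prints 0 but exits 1 on no match
}
```

For example, `count_errno /tmp/bizzare.txt ENOMEM` would show whether any call in the trace actually failed with the "Cannot allocate memory" error reported by ORA-27102.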

5 Responses to “Out of Memory Errors With Oracle Database On RHEL 4.8 Or OEL 4.8? Freeing Memory Perhaps?”


  1. Iliya Peregoudov December 9, 2009 at 7:50 am

    Not sure whether it is RHEL 4.8 or 4.9:

    [iliyap@absolut1:~]$ cat /etc/issue
    Red Hat Enterprise Linux ES release 4 (Nahant Update 8)
    Kernel \r on an \m

    [iliyap@absolut1:~]$ rpm -qf /etc/issue
    redhat-release-4ES-9

    [iliyap@absolut1:~]$ free; sudo /bin/sh -c "echo 3 >/proc/sys/vm/drop_caches"; free
                 total       used       free     shared    buffers     cached
    Mem:       3974508    3844720     129788          0     318416     697244
    -/+ buffers/cache:    2829060    1145448
    Swap:      2097112    1579992     517120
                 total       used       free     shared    buffers     cached
    Mem:       3974508    3439600     534908          0        560     660460
    -/+ buffers/cache:    2778580    1195928
    Swap:      2097112    1579992     517120
    [iliyap@absolut1:~]$ strace -o after_cache_drop.txt -ff sqlplus ip9i403a

    SQL*Plus: Release 9.2.0.7.0 - Production on Wed Dec 9 10:20:07 2009

    Copyright (c) 1982, 2002, Oracle Corporation. All rights reserved.

    Enter password:
    ERROR:
    ORA-01034: ORACLE not available
    ORA-27123: unable to attach to shared memory segment
    Linux Error: 13: Permission denied

    There are two trace files, one for the sqlplus process and one for the oracle process. The trace file for the oracle process contains the following syscall attempts (the oracle process tries to attach to the shared memory segment, but the kernel denies access):

    shmget(0xdec84abc, 4096, 0) = 0
    open("/dev/shm/ora_absolut1_0", O_RDONLY|O_LARGEFILE) = -1 ENOENT (No such file or directory)
    shmget(0xdec84abc, 0, 0) = 0
    open("/dev/shm/ora_absolut1_0", O_RDONLY|O_LARGEFILE) = -1 ENOENT (No such file or directory)
    shmctl(0, IPC_64|IPC_STAT, 0xfeffe880) = 0
    open("/dev/shm/ora_absolut1_0", O_RDONLY|O_LARGEFILE) = -1 ENOENT (No such file or directory)
    shmat(0, 0, 0) = -1 EACCES (Permission denied)

    It is very strange, but later in the trace there are the following syscall attempts (the oracle process tries to open a log file, but the kernel denies access):

    lstat64("/usr/oracle/absolut1/rdbms/log/absolut1_ora_17730.trc", 0xfeffdfe4) = -1 ENOENT (No such file or directory)
    stat64("/usr/oracle/absolut1/rdbms/log/absolut1_ora_17730.trc", 0xfeffdfe4) = -1 ENOENT (No such file or directory)
    open("/usr/oracle/absolut1/rdbms/log/absolut1_ora_17730.trc", O_WRONLY|O_CREAT|O_TRUNC|O_LARGEFILE, 0660) = -1 EACCES (Permission denied)

    ipcs -a output:

    [iliyap@absolut1:~]$ ipcs -a

    ------ Shared Memory Segments --------
    key        shmid      owner      perms      bytes      nattch     status
    0xdec84abc 0          absolut1   640        2367684608 38         locked

    ------ Semaphore Arrays --------
    key        semid      owner      perms      nsems
    0xc9f93990 98304      absolut1   640        154

    ------ Message Queues --------
    key        msqid      owner      perms      used-bytes   messages

  2. Iliya Peregoudov December 9, 2009 at 8:06 am

    It seems that strace is the cause of the EACCES. strace does not allow a setuid executable to change its euid on exec:

    [iliyap@absolut1:~]$ COLUMNS=1000 ps -fHu iliyap
    UID PID PPID C STIME TTY TIME CMD
    iliyap 29468 29425 0 10:31 ? 00:00:00 sshd: iliyap@pts/5
    iliyap 29475 29468 0 10:31 pts/5 00:00:00 -bash
    iliyap 31014 29475 0 11:03 pts/5 00:00:00 ps -fHu iliyap
    iliyap 21748 21723 0 10:17 ? 00:00:00 sshd: iliyap@pts/3
    iliyap 21753 21748 0 10:17 pts/3 00:00:00 -bash
    iliyap 13146 21753 0 10:59 pts/3 00:00:00 strace -o second_try.txt -ff sqlplus ip9i403a/usrofip9i403a
    iliyap 13147 13146 0 10:59 pts/3 00:00:00 sqlplus
    iliyap 13154 13147 0 10:59 ? 00:00:00 oracleabsolut1 (DESCRIPTION=(LOCAL=YES)(ADDRESS=(PROTOCOL=beq)))

    Without strace, connecting to the database works as it should:

    [iliyap@absolut1:~]$ free; sudo /bin/sh -c "echo 3 >/proc/sys/vm/drop_caches"; free; sqlplus ip9i403a/usrofip9i403a
                 total       used       free     shared    buffers     cached
    Mem:       3974508    3274808     699700          0       7948     680196
    -/+ buffers/cache:    2586664    1387844
    Swap:      2097112    1579448     517664
                 total       used       free     shared    buffers     cached
    Mem:       3974508    3244216     730292          0        456     659608
    -/+ buffers/cache:    2584152    1390356
    Swap:      2097112    1579448     517664

    SQL*Plus: Release 9.2.0.7.0 - Production on Wed Dec 9 11:04:56 2009

    Copyright (c) 1982, 2002, Oracle Corporation. All rights reserved.

    Connected to:
    Oracle9i Enterprise Edition Release 9.2.0.7.0 - Production
    With the Partitioning, OLAP and Oracle Data Mining options
    JServer Release 9.2.0.7.0 - Production

  3. Timur Akhmadeev December 9, 2009 at 11:17 am

    Hi Kevin,

    I’ve tried to reproduce it on 64-bit Oracle 10.2.0.4 @ OEL4u7 with no “luck”. I could not reproduce it on 32-bit 11.2.0.1 @ OEL5u3 (under VMware) either.

  4. Derek May 19, 2010 at 1:13 am

    Hi guys,

    I have been researching this issue as well, for DB2 and Redhat 4.x. It looks like you should NOT be using the drop_caches feature in Redhat 4 (even though it has been backported to 4), though it is fine to run in Redhat 5.

    Using it in Redhat 4 may cause a system deadlock:

    http://kbase.redhat.com/faq/docs/DOC-23223


  1. My little IT blog now Oracle oriented Trackback on October 31, 2010 at 9:50 am

DISCLAIMER

I work for Amazon Web Services. The opinions I share in this blog are my own. I'm *not* communicating as a spokesperson for Amazon. In other words, I work at Amazon, but this is my own opinion.

Copyright

All content is © Kevin Closson and "Kevin Closson's Blog: Platforms, Databases, and Storage", 2006-2015. Unauthorized use and/or duplication of this material without express and written permission from this blog’s author and/or owner is strictly prohibited. Excerpts and links may be used, provided that full and clear credit is given to Kevin Closson and Kevin Closson's Blog: Platforms, Databases, and Storage with appropriate and specific direction to the original content.
