Oracle Database 10g 10.2.0.4 Cannot Boot a Large SGA on AMD Servers Running Linux

Published July 18, 2008 oracle 9 Comments

In the comment thread of my recent blog entry entitled Of Gag-Orders, Excitement, and New Products, a fellow blogger, Jeff Hunter, wrote:

I’d be happy if the major innovation was being able to run a 10.2.0.4 16G SGA on x86_64.

He offered a link to a thread on his blog where he has been chronicling his unsuccessful attempts to boot a 16GB SGA on the same iron that seemed to have no problem doing so with 10.2.0.3.

What’s New?

Oracle Database 10g release 10.2.0.4 has additional rudimentary support for NUMA in the Linux port, true, but Jeff has tried with NUMA both enabled and disabled (via boot options), and neither has fixed his problems. In his latest installment on this thread I noticed that the post retitles the thread “The Great NUMA debate” and ends with Jeff reporting that he is still having trouble with his 16GB SGA, and also that he can’t boot even a 4GB SGA. Jeff wrote:

I still couldn’t start a 16GB SGA. Interestingly enough, I couldn’t start a 4G SGA either! I had to go back to booting without numa=off. The saga continues…

Unfortunately, I can’t jump in and debug what is wrong on his configuration and I don’t know what the debate is. However, I can take a moment to post evidence that Oracle Database 10g 10.2.0.4 can in fact boot a 16GB SGA in both AMD Opteron SUMA mode and NUMA mode. No, I don’t have any large memory AMD systems around to test this myself. But I certainly used to. So, I decided to call in a favor from my old friend Mary Meredith (yes, old Sequent folks stick together), who has taken over for me in the role I vacated at HP/PolyServe when I left to join Oracle. I asked Mary if she’d mind booting a 16GB SGA on one of those large memory AMD systems I used to have available to me…and she did:

$ sqlplus / as sysdba
SQL*Plus: Release 10.2.0.4.0 - Production on Mon Jul 6 09:15:35 2008
Copyright (c) 1982, 2007, Oracle.  All Rights Reserved.
Connected to an idle instance.
SQL> startup pfile=create1.ora
ORACLE instance started.
Total System Global Area 1.7700E+10 bytes
Fixed Size                  2115104 bytes
Variable Size             503319008 bytes
Database Buffers         1.7180E+10 bytes
Redo Buffers               14659584 bytes
Database mounted.
Database opened.

$ numactl --hardware
available: 1 nodes (0-0)
node 0 size: 32146 MB
node 0 free: 13821 MB
node distances:
node   0
  0:  10

So, here we see 10.2.0.4 on a SUMA-configured ProLiant DL585 with a 16GB buffer pool. I asked Mary if she’d be willing to boot in NUMA mode (Linux boot option) and give it a try, and she did:

$ sqlplus / as sysdba
SQL*Plus: Release 10.2.0.4.0 - Production on Mon Jul 7 10:03:35 2008
Copyright (c) 1982, 2007, Oracle.  All Rights Reserved.
Connected to an idle instance.
SQL> startup pfile=create1.ora
ORACLE instance started.
Total System Global Area 1.7700E+10 bytes
Fixed Size                  2115104 bytes
Variable Size             503319008 bytes
Database Buffers         1.7180E+10 bytes
Redo Buffers               14659584 bytes
Database mounted.
Database opened.
SQL> quit

But she reported that she didn’t get any hugepages:

$ cat /proc/meminfo|grep Huge
HugePages_Total:  8182
HugePages_Free:   8182
HugePages_Rsvd:      0
Hugepagesize:     2048 kB

I pointed out that a pool of roughly 16GB worth of 2MB hugepages is not big enough for an SGA of about 17.7GB. I recommended she raise the count to 8500 and then start the database under strace so we could capture the shmget() call and confirm it was passing the SHM_HUGETLB flag, and it was:

$ cat /proc/meminfo|grep Huge
HugePages_Total:  8500
HugePages_Free:   7132
HugePages_Rsvd:   7073
Hugepagesize:     2048 kB

And from the strace:

6510  shmget(0x1420f290, 17702060032, IPC_CREAT|IPC_EXCL|SHM_HUGETLB|0600) = 393219
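For anyone who wants to repeat this check, here is a minimal sketch of a filter that picks the shmget() calls out of an strace log and labels each one by whether it carries SHM_HUGETLB. The function name and the log path mentioned afterwards are hypothetical, not something Mary used; the only assumption about the input is that it looks like the strace line shown above.

```shell
#!/bin/sh
# Hypothetical helper (not from the original post): reads strace output on
# stdin and labels every shmget() call as "hugetlb" or "regular", depending
# on whether the SHM_HUGETLB flag appears in the call.
shmget_hugetlb_report() {
    grep 'shmget(' | while IFS= read -r line; do
        case "$line" in
            *SHM_HUGETLB*) printf 'hugetlb: %s\n' "$line" ;;
            *)             printf 'regular: %s\n' "$line" ;;
        esac
    done
}
```

One way to produce the log would be something like strace -f -o /tmp/startup.trc sqlplus / as sysdba, then piping /tmp/startup.trc through shmget_hugetlb_report. A "regular" line for the SGA-sized segment would suggest the instance fell back to ordinary pages.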

And…

$ ipcs -m
------ Shared Memory Segments --------
key        shmid      owner      perms      bytes      nattch     status
0x00000000 0          root      644        72         2
0x00000000 32769      root      644        16384      2
0x00000000 65538      root      644        280        2
0x1420f290 393219     oracle    600        17702060032 12

Also, in the NUMA configuration we see a good, even distribution of pages allocated from each of the “nodes”, with the exception of node zero, which will always be over-consumed until Linux gets fully NUMA-aware:

$ numactl --hardware
available: 4 nodes (0-3)
node 0 size: 7906 MB
node 0 free: 2025 MB
node 1 size: 8080 MB
node 1 free: 3920 MB
node 2 size: 8080 MB
node 2 free: 3969 MB
node 3 size: 8080 MB
node 3 free: 3926 MB
node distances:
node   0   1   2   3
  0:  10  20  20  20
  1:  20  10  20  20
  2:  20  20  10  20
  3:  20  20  20  10
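To make the per-node comparison easier to eyeball, here is a small sketch that turns numactl --hardware output into "used" figures (size minus free) for each node. The function name is made up for illustration, and the number it prints covers everything resident on the node, not just the SGA.

```shell
#!/bin/sh
# Hypothetical helper: reads `numactl --hardware` output on stdin and prints
# size - free for each node. This is total memory in use on the node, which
# includes the kernel and every other process, not only Oracle.
numa_used_per_node() {
    awk '
        $1 == "node" && $3 == "size:" { size[$2] = $4 }
        $1 == "node" && $3 == "free:" {
            printf "node %s used %d MB\n", $2, size[$2] - $4
        }
    '
}
```

Fed the output above, this gives 5881 MB for node 0 and 4160, 4111 and 4154 MB for nodes 1 through 3, which is the even spread (plus the usual node zero bulge) described earlier.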

We also see that the shmget() call did flag in SHM_HUGETLB and correspondingly we see the shmkey in the ipcs output. We also see hugepages being used, although mostly just reserved.
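The hugepage arithmetic behind those numbers can be sketched as follows. Both function names are hypothetical. The first rounds a segment size up to whole hugepages; the second reads /proc/meminfo-style text and reports pages already faulted in (Total minus Free), pages reserved but not yet touched (Rsvd), and their sum.

```shell
#!/bin/sh
# Hypothetical helpers for checking a hugepage pool against an SGA segment.

# hugepages_needed BYTES HUGEPAGESIZE_KB
# Rounds the segment size up to a whole number of hugepages.
hugepages_needed() {
    echo $(( ($1 + $2 * 1024 - 1) / ($2 * 1024) ))
}

# Reads /proc/meminfo text on stdin.
# in_use    = pages actually faulted in (Total - Free)
# reserved  = pages promised to a segment but not yet touched (Rsvd)
# committed = in_use + reserved
hugepages_committed() {
    awk '
        /^HugePages_Total:/ { total = $2 }
        /^HugePages_Free:/  { free  = $2 }
        /^HugePages_Rsvd:/  { rsvd  = $2 }
        END {
            printf "in_use=%d reserved=%d committed=%d\n",
                   total - free, rsvd, total - free + rsvd
        }
    '
}
```

For the 17702060032-byte segment in the shmget() call, hugepages_needed 17702060032 2048 prints 8441, which is more than the first pool held and less than 8500. Run against the second /proc/meminfo listing, hugepages_committed prints in_use=1368 reserved=7073 committed=8441, so the whole segment is accounted for, mostly as reservations.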

So, I haven’t been able to see Jeff’s strace output or other such diagnostic information so I can’t help there. However, this blog post is meant to be a confidence booster to any wayward googler who might happen to be having difficulty booting a VLM SGA on AMD Opteron running Linux with Oracle Database 10g release 10.2.0.4.

Extra Credit

So, if Mary had booted in NUMA mode without hugepages, does anyone think it would have resulted in such a nice even consumption of pages from the nodes, or would it have looked like Cyclops? We all recall Cyclops, don’t we? In case you don’t, here is a link:
Oracle on Opteron with Linux–The NUMA Angle Part VI. Introducing Cyclops.

9 Responses to “Oracle Database 10g 10.2.0.4 Cannot Boot a Large SGA on AMD Servers Running Linux”


1 Noons July 18, 2008 at 5:08 am

Pardon the apparently cynical comment, Kevin: it isn’t meant that way and you know it isn’t. Others might not, hence the disclaimer.

But isn’t it about time Oracle got itself configured automatically for all these things, maybe with a single and easy to recall parameter?

I mean: you and I (well, I can spell the things involved, you actually *know* them) and many others of our crop can grok what’s going on and adjust accordingly.

I doubt if a handful of dbas of the last 8 year crop will be able to even knock on the door of what’s ticking and why, other than: “I can’t get it going”.

Which is not the case with Jeff, BTW: I think his is just a weird h/w+s/w combo causing the probs. Hey, it happens in the best families: don’t anyone get me started on oracle+aix… 🙂

But you get the picture. This is the sort of thing I’d expect the Linux port to be able to figger by itself, preferably without one having to cough up a licence of the “linux/numa” performance pack for grid/oem at a cost that pales the national debt of a medium-size country.

Or am I aiming too high?

2 kevinclosson July 18, 2008 at 5:21 am

I don’t think you’re entirely off base, Noons. It really shouldn’t be that way. Even though it is reactive to want so, I do wish I could get a full diagnostic from Jeff because I can’t for the life of me figure out how his setup could be that broken. I’ve asked for strace but I really could use init.ora, strace and the alert log. Since he is currently not able to boot even a 4GB SGA something is really fishy. I think Jeff will get some click-throughs and perhaps he’ll join this thread. Jeff is sharp, so this is likely not a surface problem. Maybe we could learn something that could come in handy for 10.2.0.5, and, as usual, help the wayward googler someday.

3 Jeff Hunter July 18, 2008 at 2:56 pm

I emailed you the strace output to the email indicated in the “Appearances/Contact” section of this blog.

4 Jeff Hunter July 22, 2008 at 4:34 pm

After following through some of the examples, I may have more clues about my issue. Although my Hugepagesize is 2048kB, the output from /proc/meminfo indicates my HugePages_Total and HugePages_Free are both 0.

5 Michael July 23, 2008 at 3:07 pm

Kevin,

Do you know what version of linux Mary was running when she started oracle 10.2.0.4 up on the proliant with 16G of sga? Was it redhat 4 or redhat 5 and which update?

Mary’s oracle appears to only use 1 shared memory segment. I noticed recently that my linux x86_64 box uses multiple shared memory segments but my 10.2.0.4 x86 one uses 1 single shared memory segment. Just curious to see if we’re all on the same version of redhat or linux.

Michael

6 Anthony August 19, 2008 at 7:00 pm

Maybe I can offer some twist as well. I have similar issues. My issue, though, has to do with pre_page_sga=true. I can in fact boot any size SGA as long as I get the huge pages proportionately higher than the SGA. The ratio is a mystery to me at this moment but hovers around 65% of the SGA. That is, if the huge pages are 35% higher than the size of the SGA.

Take a look at my test results below

Test Results

With pre_page_sga=true
SGA=12g,13g, 14g
Huge Pages= 16GB

I get the ora-00443 background process “PMON” did not start

With pre_page_sga=true
SGA=11g and below
Huge Pages= 16GB

Works okay

With pre_page_sga=true
SGA=10g , 9g 8g
Huge Pages= 11GB
I get the ora-00443 background process “PMON” did not start

Problem resolved when sga dropped to 7g

Final series of tests

With pre_page_sga =true
SGA=21g
Huge Pages= 24GB
I get the ora-00443 background process “PMON” did not start

With pre_page_sga =false
SGA=21g
Huge Pages= 24GB

No issues

With pre_page_sga =true
SGA=21g,23g 24gb
Huge Pages= 30GB

No issues.

In all cases the NUMA optimization is set off. With NUMA optimization set to true any size SGA can be booted and use huge pages regardless of the size of the huge pages provided it is bigger than the SGA…even by a few Mbytes

I am battling this one with oracle now as we speak.

7 Be Bravo September 24, 2008 at 10:43 pm

Anthony, maybe you can share which Oracle 10G and Linux you have tested the above. I’m planning to test similar parameters with 30GB SGA.

8 Sunny November 19, 2011 at 7:33 pm

I’m running 2 oracle 11.2 databases on SUSE Linux Enterprise Server (SLES) having total 16 gb physical memory.
1) sga_target 6gb sga_max_size =8g
2) sga_target=sga_max_size=1536 mb

I configured huge pages and ran hugepages_settings.sh as per Document 401749.1, and it is giving
Recommended setting: vm.nr_hugepages = 4868
which looks wrong to me.
I am also getting the following message via OEM:

Significant virtual memory paging was detected on the host operating system.

oracle@oracle1:/tmp> cat /proc/meminfo | grep Huge
HugePages_Total: 4868
HugePages_Free: 28
HugePages_Rsvd: 26
HugePages_Surp: 0
Hugepagesize: 2048 kB

I am getting this error
PROCESS J000 And M000 Die
Process J000 died, kkjcre1p: unable to spawn jobq slave process
ORA-00445: Background Process “J000” Did Not Start After 120 Seconds

- 9 kevinclosson November 19, 2011 at 8:56 pm
  
  You need to aggregate up to a number that fits both. 4868 might be enough for the 8g instance but not enough for both instances. You need somewhere north of 5000 hugepages. If you are just working this out, I recommend you set hugepages to 6000, boot both instances, examine /proc/meminfo to see how over-configured you are, adjust down, and then keep those values.
  
  There is some patch to 11.2 that implements an init.ora parameter (can’t remember specifics at the moment) that forces the instance to use hugepages or fail the instance startup. That’s the best way to go.
  


Kevin Closson's Blog: Platforms, Databases and Storage