Oracle Database 11g Automatic Memory Management – Part IV. Don’t Use PRE_PAGE_SGA, OK?

BLOG UPDATE (05.14.09): The bug number for this PRE_PAGE_SGA with Automatic Memory Management issue is 8505803.

It has been quite a while since I’ve blogged about Automatic Memory Management (AMM).  I had to dig out the following three posts before making this blog entry just to see what I’ve said about AMM in the past:

Recently my friend Steve Shaw of Intel reported to me that he has had some problems combining AMM with the PRE_PAGE_SGA init.ora parameter. I’ve looked into this a bit and thought I’d throw out a quick heads-up post. I won’t blog yet about the specific PRE_PAGE_SGA-related problem Steve saw, but the problems with combining PRE_PAGE_SGA and AMM are generic enough to warrant this blog entry.

I could make this a really short blog entry by simply warning not to combine PRE_PAGE_SGA with AMM, but that would be boring. Nonetheless, don’t combine PRE_PAGE_SGA with AMM. There is a bug in 11.1.0.7 with AMM where PRE_PAGE_SGA causes every process to touch every page of the entire AMM space, not just the SGA! This has a significant impact on page table consumption and session connect time. To make some sense out of this, consider the following…

I’ll set the following init.ora parameters:

MEMORY_TARGET=8G
SGA_TARGET=100M
PARALLEL_MAX_SERVERS=0

Next, I booted the instance and took a peek at ps(1) output. As you can see, every background process has a resident set of roughly 8G. Ignore the SZ column since it is totally useless on Linux (see the man page). Actually, that topic also warrants a post in the Little Things Doth Crabby Make series! Sorry, I digress. Anyway, here is the ps(1) output:


$ ps -elF | grep -v grep | grep -v ASM | egrep 'RSS|test'
F S UID        PID  PPID  C PRI  NI ADDR SZ WCHAN    RSS PSR STIME TTY          TIME CMD
4 S root     27940     1  0  85   0 -  3276 pipe_w  1000   7 09:35 ?        00:00:00 ora_dism_test1
0 S oracle   27943     1  8  75   0 - 2162332 -    8404660 2 09:35 ?        00:00:06 ora_pmon_test1
0 S oracle   28022     1  3  58   - - 2161773 -    8403480 0 09:36 ?        00:00:02 ora_vktm_test1
0 S oracle   28077     1  3  75   0 - 2163861 159558 8411180 2 09:36 ?      00:00:02 ora_diag_test1
0 S oracle   28141     1  3  75   0 - 2162467 -    8405576 2 09:36 ?        00:00:02 ora_dbrm_test1
0 S oracle   28157     1  3  76   0 - 2161774 150797 8403900 2 09:36 ?      00:00:02 ora_ping_test1
0 S oracle   28171     1  4  75   0 - 2162570 -    8405808 6 09:36 ?        00:00:02 ora_psp0_test1
0 S oracle   28187     1  4  78   0 - 2161774 -    8403480 2 09:36 ?        00:00:02 ora_acms_test1
0 S oracle   28201     1  4  75   0 - 2164626 126590 8414448 7 09:36 ?      00:00:02 ora_dia0_test1
0 S oracle   28256     1  4  75   0 - 2164131 159729 8413160 6 09:36 ?      00:00:02 ora_lmon_test1
0 S oracle   28306     1  4  75   0 - 2165750 276166 8419012 2 09:36 ?      00:00:02 ora_lmd0_test1
0 S oracle   28364     1  5  58   - - 2165485 276166 8418852 2 09:36 ?      00:00:02 ora_lms0_test1
0 S oracle   28382     1  5  58   - - 2165485 277884 8418848 3 09:36 ?      00:00:02 ora_lms1_test1
0 S oracle   28398     1  5  75   0 - 2161773 -    8403504 2 09:36 ?        00:00:02 ora_rms0_test1
0 S oracle   28412     1  6  78   0 - 2161774 -    8403752 7 09:36 ?        00:00:02 ora_mman_test1
0 S oracle   28426     1  6  75   0 - 2162572 -    8406832 3 09:36 ?        00:00:02 ora_dbw0_test1
0 S oracle   28491     1  6  75   0 - 2161773 -    8403720 2 09:36 ?        00:00:02 ora_lgwr_test1
0 S oracle   28550     1  7  75   0 - 2162467 -    8406164 3 09:36 ?        00:00:02 ora_ckpt_test1
0 S oracle   28608     1  7  78   0 - 2161774 -    8403428 2 09:36 ?        00:00:02 ora_smon_test1
0 S oracle   28624     1  8  78   0 - 2161773 -    8403500 2 09:36 ?        00:00:02 ora_reco_test1
0 S oracle   28638     1  9  75   0 - 2162560 -    8406436 2 09:36 ?        00:00:02 ora_rbal_test1
0 S oracle   28652     1  9  78   0 - 2162487 pipe_w 8407412 2 09:36 ?      00:00:02 ora_asmb_test1
0 S oracle   28666     1 10  75   0 - 2161773 -    8404092 2 09:36 ?        00:00:02 ora_mmon_test1
0 S oracle   28729     1 12  75   0 - 2161773 -    8403528 2 09:36 ?        00:00:02 ora_mmnl_test1
0 S oracle   28776     1 14  75   0 - 2162597 277884 8406572 3 09:36 ?      00:00:02 ora_lck0_test1
0 S oracle   28860     1 19  75   0 - 2162597 276166 8406220 2 09:36 ?      00:00:02 ora_rsmn_test1
0 S oracle   28893 23881 37  78   0 - 2162210 -    8409436 2 09:37 ?        00:00:02 oracletest1 (DESCRIPTION=(LOCAL=YES)(ADDRESS=(PROTOCOL=beq)))
0 S oracle   28914     1 54  81   0 - 2162070 -    8405276 2 09:37 ?        00:00:02 ora_o000_test1
0 R oracle   28997     1 99  82   0 - 2161643 -    7213532 0 09:37 ?        00:00:01 ora_dskm_test1

$ grep -i pagetable /proc/meminfo
PageTables:     590404 kB

As you can see, I followed up the ps command with a grep to see how much memory is being spent on page tables. With all these 8GB resident sets it comes to roughly 575MB. That got me thinking: what would other init.ora combinations result in? Those 575MB of page tables were begat of an 8G MEMORY_TARGET and no PQ slaves. I wrote a couple of quick and dirty scripts to probe around for some other values.
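For a rough sanity check on that page table number, here is a minimal sketch of the arithmetic. It assumes 4KB pages and 8-byte page table entries (typical for x86-64 Linux without huge pages, but an assumption here, not something measured above), counts only the lowest level of the page tables, and uses the 28 oracle-owned processes in the ps listing. The function name pte_mib is made up for illustration.

```shell
# Back-of-the-envelope page table estimate.
# Assumptions (not from the measurements above): 4 KiB pages, 8-byte entries,
# lowest-level page tables only.

# pte_mib GIB: MiB of page table entries one process needs to map GIB gibibytes
pte_mib() {
    pages=$(( $1 * 1024 * 1024 / 4 ))      # GiB -> KiB -> count of 4 KiB pages
    echo $(( pages * 8 / 1024 / 1024 ))    # 8 bytes per entry, reported in MiB
}

per_proc=$(pte_mib 8)   # every process touches the whole 8G MEMORY_TARGET
procs=28                # oracle-owned processes in the ps listing

echo "per process: ${per_proc} MiB"
echo "for ${procs} processes: $(( per_proc * procs )) MiB"
```

Under those assumptions the estimate is 16MB per process and 448MB for the 28 processes, which is in the same ballpark as the 590404 kB that /proc/meminfo reports. The PageTables figure is system-wide, so the remainder presumably belongs to upper-level tables and to everything else running on the box.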

I created 6 init.ora files where, not surprisingly, the only setting that varied was the number of PQ slaves. MEMORY_TARGET and SGA_TARGET remained constant. The following script is the driver. It boots the instance with 16, 32 … or 96 PQ slaves, sleeps for 5 seconds, and then executes the rss.sh script, also listed in the following box:


$ cat doit.sh
for i in 16 32 48 64 80 96
do

sqlplus '/ as sysdba' <<EOF
startup force pfile=./$i.ora
host sleep 5
host sh ./rss.sh "MEMORY_TARGET=8G SGA_TARGET=100M PRE_PAGE_SGA=TRUE $i SLAVES" >> rss.out
exit;
EOF

done

$ cat rss.sh

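# Label for this run, passed in by doit.sh
DESC="$1"

# Sum the RSS column (field 12 of ps -elF, in KB) over the instance's processes and convert to GB
RSS=$(ps -elF | grep test | grep -v ASM | grep -v grep | awk '{ t += $12 } END { printf("%7.2f\n", (t * 1024) / 2^30) }')
# System-wide page table size in KB
PT=$(grep -i paget /proc/meminfo | awk '{ print $2 }')

echo "$RSS  $PT   $DESC"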

The rss.sh script sums up the resident set sizes of all the interesting processes and reports the total in gigabytes. The script also reports the page table size in KB. The driver appends that output to a file called rss.out. The following box shows the output generated by the scripts. The first line of output is with 16 PQ slaves, the next is 32 PQ slaves and so forth through the 6th line, which used 96 PQ slaves.


$ cat rss.out
391.84  838644   MEMORY_TARGET=8G SGA_TARGET=100M PRE_PAGE_SGA=TRUE 16 SLAVES
529.00  1124688   MEMORY_TARGET=8G SGA_TARGET=100M PRE_PAGE_SGA=TRUE 32 SLAVES
657.03  1391100   MEMORY_TARGET=8G SGA_TARGET=100M PRE_PAGE_SGA=TRUE 48 SLAVES
785.32  1658000   MEMORY_TARGET=8G SGA_TARGET=100M PRE_PAGE_SGA=TRUE 64 SLAVES
918.41  1935368   MEMORY_TARGET=8G SGA_TARGET=100M PRE_PAGE_SGA=TRUE 80 SLAVES
1041.08  2190548   MEMORY_TARGET=8G SGA_TARGET=100M PRE_PAGE_SGA=TRUE 96 SLAVES

Pretty cut and dried. The aggregate RSS grows by roughly 8GB x 16 in accordance with each increment of 16 PQ slaves and the page tables grow to roughly 2GB through the increases in PQ slave count.
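To put a per-slave number on that, here is a small sketch that works out the growth between consecutive lines of rss.out. The function name per_slave_growth is invented for this example, and it assumes the line layout shown above (RSS in GB first, page tables in KB second, slave count as the next-to-last field). The six lines are pasted inline so the snippet runs on its own.

```shell
# per_slave_growth: read rss.out-style lines on stdin and print, for each
# consecutive pair, the RSS (GB) and page table (MB) growth per added slave.
per_slave_growth() {
    awk '
        NR > 1 {
            added = $(NF-1) - prev_slaves
            printf "%d->%d slaves: %.2f GB RSS, %.2f MB page tables per slave\n",
                   prev_slaves, $(NF-1), ($1 - prev_rss) / added, ($2 - prev_pt) / added / 1024
        }
        { prev_rss = $1; prev_pt = $2; prev_slaves = $(NF-1) }
    '
}

per_slave_growth <<'EOF'
391.84  838644   MEMORY_TARGET=8G SGA_TARGET=100M PRE_PAGE_SGA=TRUE 16 SLAVES
529.00  1124688   MEMORY_TARGET=8G SGA_TARGET=100M PRE_PAGE_SGA=TRUE 32 SLAVES
657.03  1391100   MEMORY_TARGET=8G SGA_TARGET=100M PRE_PAGE_SGA=TRUE 48 SLAVES
785.32  1658000   MEMORY_TARGET=8G SGA_TARGET=100M PRE_PAGE_SGA=TRUE 64 SLAVES
918.41  1935368   MEMORY_TARGET=8G SGA_TARGET=100M PRE_PAGE_SGA=TRUE 80 SLAVES
1041.08  2190548   MEMORY_TARGET=8G SGA_TARGET=100M PRE_PAGE_SGA=TRUE 96 SLAVES
EOF
```

With the numbers above, each added slave costs about 8GB of resident set and somewhere around 16 to 17MB of page tables, which is what you would expect if every slave touches the whole MEMORY_TARGET.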

Bug Number
I don’t have the bug number for this one yet. But it is a bug. Just don’t use PRE_PAGE_SGA with AMM. That setting was very significant many years ago for reasons that had mostly to do with ensuring Oracle on BSD-derived Unix implementations didn’t suffer from swappable SGA pages. The PRE_PAGE_SGA functionality ensured that each page was multiply referenced and therefore could not leave physical memory. But that was a long time ago. Time for old dogs to learn new tricks. And, no, my friend Steve Shaw does not suffer from old-dog-clamoring-for-new-trickitis. As I said above, I fully intend to blog about what Steve ran into with his recent PRE_PAGE_SGA-related issue…soon.

By the way, did I forget to mention that you really shouldn’t combine PRE_PAGE_SGA with AMM? Like they say, the memory is the first thing to go…

And, before I forget, this is 11.1.0.7 on 64-bit Linux. I have no idea how PRE_PAGE_SGA works on other platforms. Maybe Glenn or Tanel will chime in on a Solaris x64 result?

Oh, I am forgetful today. I nearly forgot to mention that with AMM, PRE_PAGE_SGA and an 8G MEMORY_TARGET, a simple connect as scott/tiger followed by an immediate exit takes 2.3 seconds on Xeon 5400 processors. With PRE_PAGE_SGA commented out, the same test completes in .19 seconds. Hey, I should start rambling on about recovering 12x performance! 🙂
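If you want to repeat that kind of connect-time test, something along these lines would do. This is only a sketch: time_ms is a made-up helper name, it leans on GNU date's %N (nanoseconds), and the sqlplus invocation in the comment is a hypothetical example rather than the exact command used for the numbers above.

```shell
# time_ms CMD [ARGS...]: run a command, discard its output, and print the
# elapsed wall-clock time in milliseconds.
# Assumes GNU date, which supports %N (nanoseconds).
time_ms() {
    start=$(date +%s%N)
    "$@" >/dev/null 2>&1
    end=$(date +%s%N)
    echo $(( (end - start) / 1000000 ))
}

# Harmless demonstration that needs no database:
echo "sleep 0.2 took $(time_ms sleep 0.2) ms"

# Hypothetical use against a test instance, once with PRE_PAGE_SGA set and
# once with it commented out:
#   echo exit | time_ms sqlplus -S -L scott/tiger
```

Run it a few times per configuration and compare, since a single connect can be skewed by whatever else the box is doing at that moment.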

6 Responses to “Oracle Database 11g Automatic Memory Management – Part IV. Don’t Use PRE_PAGE_SGA, OK?”


  1. 1 DuncanE May 9, 2009 at 10:08 am

    Kevin,

    Off topic for this blog post I know, but given your interest in all things NUMA, I wonder if you’d possibly comment on a recent article that popped up on Metalink (759565.1).

    It seems to be a very strong steer to NOT enable Oracle’s NUMA optimizations if you want a stable system, although it’s not clear whether it is excluding 11.1.0.7 from this.

    Thanks

    Duncan

  2. 2 Glenn Fawcett May 14, 2009 at 6:23 am

    I have always advised against PRE_PAGE_SGA, mainly because I didn’t think it was necessary, at least for Solaris. Solaris uses ISM, which is locked shared memory usually backed by large pages (256M on SPARC). But with Intel only 2M pages are supported. It might be time to do some experiments with x64/Solaris since the supported page sizes are so small. Maybe in the future we can see some larger page sizes? The memory sizes just continue to grow and, with more threading on the way, it will be necessary.

    • 3 Christo Kutrovsky September 30, 2010 at 2:39 pm

      Glenn,

      ISM only in manual sga or sga_max = sga_target.

      In automatic where sga_target < sga_max, it uses DISM, which is swappable and requires ORADISM process to lock it in memory.

      Being swappable also means you need the size of your SGA in swap space available.

  3. 4 Alex September 27, 2009 at 3:24 pm

    Hello Kevin,

    What I cannot understand (at least from the Oracle documentation and Metalink notes) is what the actual benefit is of PRE_PAGE_SGA=TRUE and huge pages together (without AMM, i.e. with it turned off). We tried to run an 80GB SGA on 11gR1 (RHEL4 64-bit) with huge pages, AMM off and pre_page_sga=TRUE, and the instance just cannot start. I have SR 7712866.994 opened on that (I think it is also quoted on Oracle’s BDA forum) and so far I am not clear what the reason is. On 10.2.0.4/RHEL5 (64-bit) with AMM off and around 28GB SGA that works just fine. Where I am getting to is that the Oracle documentation does not seem to provide an answer (or at least I cannot find it): is there any benefit to using pre_page_sga=TRUE and huge pages? Thanks

    Regards,
    Alex


  1. 1 Ronny Egners Blog » MEMORY_TARGET (SGA_TARGET) or HugePages – which to choose? Trackback on March 31, 2010 at 9:04 am

DISCLAIMER

I work for Amazon Web Services. The opinions I share in this blog are my own. I'm *not* communicating as a spokesperson for Amazon. In other words, I work at Amazon, but this is my own opinion.

Copyright

All content is © Kevin Closson and "Kevin Closson's Blog: Platforms, Databases, and Storage", 2006-2015. Unauthorized use and/or duplication of this material without express and written permission from this blog’s author and/or owner is strictly prohibited. Excerpts and links may be used, provided that full and clear credit is given to Kevin Closson and Kevin Closson's Blog: Platforms, Databases, and Storage with appropriate and specific direction to the original content.
