Archive for the 'use_large_pages' Category

Configuring Linux Hugepages for Oracle Database Is Just Too Difficult! Isn’t It? Part – I.

Allocating hugepages for Oracle Database on Linux can be tricky. The following is a short list of some of the common problems associated with faulty attempts to get things properly configured:

  1. Insufficient Hugepages.You can be short just a single 2MB hugepage at instance startup and Oracle will silently fall back to no hugepages. For instance, if an instance needs 10,000 hugepages but there are only 9,999 available at startup Oracle will create non-hugepages IPC shared memory and the 9,999 (x 2MB) is just wasted memory.
    1. Insufficient hugepages is an even more difficult situation when booting with _enable_NUMA_support=TRUE as partial hugepages backing is possible.
  2. Improper Permissions. Both limits.conf(5) memlock and the shell ulimit –l must accommodate the desired amount of locked memory.

In general, list item 1 above has historically been the most difficult to deal with—especially on systems hosting several instances of Oracle. Since there is no way to determine whether an existing segment of shared memory is backed with hugepages, diagnostics are in short supply. Oracle Database 11g Release 2 (11.2.0.2) The fix for Oracle bugs 9195408 (unpublished) and 9931916 (published) is available in 11.2.0.2. In a sort of fast forward to the past, the Linux port now supports an initialization parameter to force the instance to use hugepages for all segments or fail to boot. I recall initialization parameters on Unix ports back in the early 1990s that did just that. The initialization parameter is called use_large_pages and setting it to “only” results in the all or none scenario. This, by the way, addresses list item 1.1 above. That is, setting use_large_pages=only ensures an instance will not have some NUMA segments backed with hugepages and others without. Consider the following example. Here we see that use_large_pages is set to “only” and yet the system has only a very small number of hugepages allocated (800 == ~1.6GB). First I’ll boot the instance using an init.ora file that does not force hugepages and then move on to using the one that does. Note, this is 11.2.0.2.

$ sqlplus '/ as sysdba'

SQL*Plus: Release 11.2.0.2.0 Production on Tue Sep 28 08:10:36 2010

Copyright (c) 1982, 2010, Oracle.  All rights reserved.

Connected to an idle instance.

SQL>
SQL> !grep -i huge /proc/meminfo
HugePages_Total:   800
HugePages_Free:    800
HugePages_Rsvd:      0
Hugepagesize:     2048 kB
SQL>
SQL> !grep large_pages y.ora x.ora
use_large_pages=only
SQL>
SQL> startup force pfile=./x.ora
ORACLE instance started.

Total System Global Area 4.4363E+10 bytes
Fixed Size                  2242440 bytes
Variable`Size            1406199928 bytes
Database Buffers         4.2950E+10 bytes
Redo Buffers                4427776 bytes
Database mounted.
Database opened.
SQL> HOST date
Tue Sep 28 08:13:23 PDT 2010

SQL>  startup force pfile=./y.ora
ORA-27102: out of memory
Linux-x86_64 Error: 12: Cannot allocate memory

The user feedback is a trite ORA-27102. So the question is,  which memory cannot be allocated? Let’s take a look at the alert log:

Tue Sep 28 08:16:05 2010
Starting ORACLE instance (normal)
****************** Huge Pages Information *****************
Huge Pages memory pool detected (total: 800 free: 800)
DFLT Huge Pages allocation successful (allocated: 512)
Huge Pages allocation failed (free: 288 required: 10432)
Startup will fail as use_large_pages is set to "ONLY"
******************************************************
NUMA Huge Pages allocation on node (1) (allocated: 3)
Huge Pages allocation failed (free: 285 required: 10368)
Startup will fail as use_large_pages is set to "ONLY"
******************************************************
Huge Pages allocation failed (free: 285 required: 10368)
Startup will fail as use_large_pages is set to "ONLY"
******************************************************
NUMA Huge Pages allocation on node (1) (allocated: 192)
NUMA Huge Pages allocation on node (1) (allocated: 64)

That is good diagnostic information. It informs us that the variable portion of the SGA was successfully allocated and backed with hugepages. It just so happens that my variable SGA component is precisely sized to 1GB. That much is simple to understand. After creating the segment for the variable SGA component Oracle moves on to create the NUMA buffer pool segments. This is a 2-socket Nehalem EP system and Oracle allocates from the Nth NUMA node and works back to node 0. In this case the first buffer pool creation attempt is for node 1 (socket 1). However, there were insufficient hugepages as indicated in the alert log. In the following example I allocated  another arbitrarily insufficient number of hugepages and tried to start an instance with use_large_pages=only. This particular insufficient hugepages scenario allows us to see more interesting diagnostics:

SQL>  !grep -i huge /proc/meminfo
HugePages_Total: 12000
HugePages_Free:  12000
HugePages_Rsvd:      0
Hugepagesize:     2048 kB

SQL> startup force pfile=./y.ora
ORA-27102: out of memory
Linux-x86_64 Error: 12: Cannot allocate memory

…and, the alert log:

Starting ORACLE instance (normal)
****************** Huge Pages Information *****************
Huge Pages memory pool detected (total: 12000 free: 12000)
DFLT Huge Pages allocation successful (allocated: 512)
NUMA Huge Pages allocation on node (1) (allocated: 10432)
Huge Pages allocation failed (free: 1056 required: 10368)
Startup will fail as use_large_pages is set to "ONLY"
******************************************************
Huge Pages allocation failed (free: 1056 required: 10368)
Startup will fail as use_large_pages is set to "ONLY"
******************************************************
Huge Pages allocation failed (free: 1056 required: 5184)
Startup will fail as use_large_pages is set to "ONLY"
******************************************************
NUMA Huge Pages allocation on node (0) (allocated: 704)
NUMA Huge Pages allocation on node (0) (allocated: 320)

In this example we see 12,000 hugepages was sufficient to back the variable SGA component and only 1 of the NUMA buffer pools (remember this is Nehalem EP with OS boot string numa=on). Summary In my opinion, this is a must-set parameter if you need hugepages. With initialization parameters like use_large_pages, configuring hugepages for Oracle Database is getting a lot simpler. Next In Series

  1. “[…] if you need hugepages”
  2. More on hugepages and NUMA
  3. Any pitfalls I find.

More Hugepages Articles Link to Part II in this series: Configuring Linux Hugepages for Oracle Database Is Just Too Difficult! Isn’t It? Part – II. Link to Part III in this series: Configuring Linux Hugepages for Oracle Database is Just Too Difficult! Isn’t It? Part – III. And more: Quantifying hugepages Memory Savings with Oracle Database 11g Little Things Doth Crabby Make – Part X. Posts About Linux Hugepages Makes Some Crabby It Seems. Also, Words About Sizing Hugepages. Little Things Doth Crabby Make – Part IX. Sometimes You Have To Really, Really Want Your Hugepages Support For Oracle Database 11g. Little Things Doth Crabby Make – Part VIII. Hugepage Support for Oracle Database 11g Sometimes Means Using The ipcrm Command. Ugh. Oracle Database 11g Automatic Memory Management – Part I.


EMC Employee Disclaimer

The opinions and interests expressed on EMC employee blogs are the employees' own and do not necessarily represent EMC's positions, strategies or views. EMC makes no representation or warranties about employee blogs or the accuracy or reliability of such blogs. When you access employee blogs, even though they may contain the EMC logo and content regarding EMC products and services, employee blogs are independent of EMC and EMC does not control their content or operation. In addition, a link to a blog does not mean that EMC endorses that blog or has responsibility for its content or use.

This disclaimer was put into place on March 23, 2011.

Enter your email address to follow this blog and receive notifications of new posts by email.

Join 2,058 other followers

Oracle ACE Program Status

Click It

website metrics

Fond Memories

Copyright

All content is © Kevin Closson and "Kevin Closson's Blog: Platforms, Databases, and Storage", 2006-2013. Unauthorized use and/or duplication of this material without express and written permission from this blog’s author and/or owner is strictly prohibited. Excerpts and links may be used, provided that full and clear credit is given to Kevin Closson and Kevin Closson's Blog: Platforms, Databases, and Storage with appropriate and specific direction to the original content.

Follow

Get every new post delivered to your Inbox.

Join 2,058 other followers