Allocating hugepages for Oracle Database on Linux can be tricky. The following is a short list of some of the common problems associated with faulty attempts to get things properly configured:
- Insufficient Hugepages. You can be short just a single 2MB hugepage at instance startup and Oracle will silently fall back to no hugepages. For instance, if an instance needs 10,000 hugepages but only 9,999 are available at startup, Oracle will create non-hugepages IPC shared memory and the 9,999 hugepages (x 2MB each, roughly 19.5GB) are just wasted memory.
  - Insufficient hugepages is an even more difficult situation when booting with _enable_NUMA_support=TRUE, as partial hugepages backing is possible.
- Improper Permissions. Both the limits.conf(5) memlock setting and the shell ulimit -l must accommodate the desired amount of locked memory.
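Both problems in the list above can be checked before a startup is attempted. What follows is a minimal Python sketch under stated assumptions, not anything Oracle ships. The function names (`parse_meminfo`, `hugepage_shortfall`, `memlock_allows`, `preflight`) are made up for illustration. The page count is only a lower bound: it assumes the SGA maps onto hugepages with a simple ceiling division, whereas an instance creates several shared memory segments and each may round up on its own.

```python
import re
import resource

HUGEPAGE_FIELDS = ("HugePages_Total", "HugePages_Free", "HugePages_Rsvd", "Hugepagesize")


def parse_meminfo(text):
    """Return the hugepage counters found in /proc/meminfo-style text as ints."""
    values = {}
    for name in HUGEPAGE_FIELDS:
        match = re.search(rf"^{name}:\s+(\d+)", text, re.MULTILINE)
        if match:
            values[name] = int(match.group(1))
    return values


def hugepage_shortfall(meminfo, sga_bytes):
    """Extra free hugepages needed to back sga_bytes (0 if already enough).

    Lower bound only: assumes one ceiling division over the whole SGA.
    """
    page_bytes = meminfo["Hugepagesize"] * 1024  # Hugepagesize is reported in kB
    needed = -(-sga_bytes // page_bytes)  # integer ceiling division
    return max(0, needed - meminfo["HugePages_Free"])


def memlock_allows(sga_bytes, soft_limit):
    """True if a memlock soft limit (in bytes) covers sga_bytes."""
    return soft_limit == resource.RLIM_INFINITY or soft_limit >= sga_bytes


def preflight(sga_bytes, meminfo_path="/proc/meminfo"):
    """Run both checks against the live system for a given SGA size in bytes.

    Assumes a Linux kernel with hugepage support, so the counters are present.
    The memlock limit is that of the calling process, in bytes (ulimit -l
    reports the same limit in kB).
    """
    with open(meminfo_path) as f:
        info = parse_meminfo(f.read())
    soft_limit, _hard_limit = resource.getrlimit(resource.RLIMIT_MEMLOCK)
    return {
        "hugepages_short_by": hugepage_shortfall(info, sga_bytes),
        "memlock_ok": memlock_allows(sga_bytes, soft_limit),
    }
```

Calling `preflight(sga_bytes)` from the account that starts the instance gives a quick read on both list items. A non-zero `hugepages_short_by` or a false `memlock_ok` suggests the startup would not end up fully backed by hugepages.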
In general, the first list item above has historically been the most difficult to deal with, especially on systems hosting several instances of Oracle. Since there is no way to determine whether an existing segment of shared memory is backed with hugepages, diagnostics are in short supply.
Oracle Database 11g Release 2 (11.2.0.2)
The fix for Oracle bugs 9195408 (unpublished) and 9931916 (published) is available in 11.2.0.2. In a sort of fast forward to the past, the Linux port now supports an initialization parameter to force the instance to use hugepages for all segments or fail to boot. I recall initialization parameters on Unix ports back in the early 1990s that did just that. The initialization parameter is called use_large_pages, and setting it to “only” results in the all-or-none scenario. This, by the way, also addresses the NUMA case in the list above. That is, setting use_large_pages=only ensures an instance will not have some NUMA segments backed with hugepages and others without.
Consider the following example. Here we see that use_large_pages is set to “only” and yet the system has only a very small number of hugepages allocated (800 == ~1.6GB). First I’ll boot the instance using an init.ora file that does not force hugepages (x.ora) and then move on to the one that does (y.ora). Note, this is 11.2.0.2.
$ sqlplus '/ as sysdba'
SQL*Plus: Release 11.2.0.2.0 Production on Tue Sep 28 08:10:36 2010
Copyright (c) 1982, 2010, Oracle. All rights reserved.
Connected to an idle instance.
SQL>
SQL> !grep -i huge /proc/meminfo
HugePages_Total: 800
HugePages_Free: 800
HugePages_Rsvd: 0
Hugepagesize: 2048 kB
SQL>
SQL> !grep large_pages y.ora x.ora
use_large_pages=only
SQL>
SQL> startup force pfile=./x.ora
ORACLE instance started.
Total System Global Area 4.4363E+10 bytes
Fixed Size 2242440 bytes
Variable Size 1406199928 bytes
Database Buffers 4.2950E+10 bytes
Redo Buffers 4427776 bytes
Database mounted.
Database opened.
SQL> HOST date
Tue Sep 28 08:13:23 PDT 2010
SQL> startup force pfile=./y.ora
ORA-27102: out of memory
Linux-x86_64 Error: 12: Cannot allocate memory
The user feedback is a trite ORA-27102. So the question is, which memory cannot be allocated? Let’s take a look at the alert log:
Tue Sep 28 08:16:05 2010
Starting ORACLE instance (normal)
****************** Huge Pages Information *****************
Huge Pages memory pool detected (total: 800 free: 800)
DFLT Huge Pages allocation successful (allocated: 512)
Huge Pages allocation failed (free: 288 required: 10432)
Startup will fail as use_large_pages is set to "ONLY"
******************************************************
NUMA Huge Pages allocation on node (1) (allocated: 3)
Huge Pages allocation failed (free: 285 required: 10368)
Startup will fail as use_large_pages is set to "ONLY"
******************************************************
Huge Pages allocation failed (free: 285 required: 10368)
Startup will fail as use_large_pages is set to "ONLY"
******************************************************
NUMA Huge Pages allocation on node (1) (allocated: 192)
NUMA Huge Pages allocation on node (1) (allocated: 64)
That is good diagnostic information. It informs us that the variable portion of the SGA was successfully allocated and backed with hugepages. It just so happens that my variable SGA component is precisely sized to 1GB, which matches the 512 hugepages (x 2MB) reported for the DFLT allocation. That much is simple to understand. After creating the segment for the variable SGA component, Oracle moves on to create the NUMA buffer pool segments. This is a 2-socket Nehalem EP system, and Oracle allocates from the Nth NUMA node and works back to node 0. In this case the first buffer pool creation attempt is for node 1 (socket 1). However, there were insufficient hugepages, as indicated in the alert log: only 288 remained free while 10,432 were required.
In the following example I allocated another arbitrarily insufficient number of hugepages and tried to start an instance with use_large_pages=only. This particular insufficient hugepages scenario allows us to see more interesting diagnostics:
SQL> !grep -i huge /proc/meminfo
HugePages_Total: 12000
HugePages_Free: 12000
HugePages_Rsvd: 0
Hugepagesize: 2048 kB
SQL> startup force pfile=./y.ora
ORA-27102: out of memory
Linux-x86_64 Error: 12: Cannot allocate memory
…and, the alert log:
Starting ORACLE instance (normal)
****************** Huge Pages Information *****************
Huge Pages memory pool detected (total: 12000 free: 12000)
DFLT Huge Pages allocation successful (allocated: 512)
NUMA Huge Pages allocation on node (1) (allocated: 10432)
Huge Pages allocation failed (free: 1056 required: 10368)
Startup will fail as use_large_pages is set to "ONLY"
******************************************************
Huge Pages allocation failed (free: 1056 required: 10368)
Startup will fail as use_large_pages is set to "ONLY"
******************************************************
Huge Pages allocation failed (free: 1056 required: 5184)
Startup will fail as use_large_pages is set to "ONLY"
******************************************************
NUMA Huge Pages allocation on node (0) (allocated: 704)
NUMA Huge Pages allocation on node (0) (allocated: 320)
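In this example we see that 12,000 hugepages were sufficient to back the variable SGA component and only one of the NUMA buffer pools (remember this is Nehalem EP with OS boot string numa=on). The 512 hugepages for the variable component plus the 10,432 for node 1 left 1,056 free, well short of the 10,368 required for the next segment.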
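The arithmetic in these alert log excerpts can be pulled out mechanically. The following Python sketch is only an illustration: the function names and the regular expressions are my own, written against the message wording shown in the excerpts above, and other Oracle releases may word these lines differently. It extracts the pool size, the successful allocations and the failures, and converts page counts to GiB assuming 2048 kB hugepages.

```python
import re

# Patterns written against the alert log wording quoted in this post (assumption:
# other releases may phrase these messages differently).
POOL = re.compile(r"Huge Pages memory pool detected \(total: (\d+) free: (\d+)\)")
ALLOCATED = re.compile(
    r"(DFLT|NUMA) Huge Pages allocation (?:successful|on node \((\d+)\)) "
    r"\(allocated: (\d+)\)"
)
FAILED = re.compile(r"Huge Pages allocation failed \(free: (\d+) required: (\d+)\)")

SAMPLE_LOG = """\
Huge Pages memory pool detected (total: 12000 free: 12000)
DFLT Huge Pages allocation successful (allocated: 512)
NUMA Huge Pages allocation on node (1) (allocated: 10432)
Huge Pages allocation failed (free: 1056 required: 10368)
Startup will fail as use_large_pages is set to "ONLY"
"""


def summarize_hugepages(log_text):
    """Collect pool size, allocations and failures from alert log text."""
    pool = POOL.search(log_text)
    allocations = [
        (kind, int(node) if node else None, int(pages))
        for kind, node, pages in ALLOCATED.findall(log_text)
    ]
    failures = [(int(free), int(required)) for free, required in FAILED.findall(log_text)]
    return {
        "pool": tuple(int(n) for n in pool.groups()) if pool else None,
        "allocations": allocations,  # (kind, NUMA node or None, pages)
        "failures": failures,        # (free pages, required pages)
        # Shortfall for the first failed segment only, not for the whole SGA.
        "first_shortfall_pages": failures[0][1] - failures[0][0] if failures else 0,
    }


def pages_to_gib(pages, page_kb=2048):
    """Convert a hugepage count to GiB for a given page size in kB."""
    return pages * page_kb / (1024 * 1024)


summary = summarize_hugepages(SAMPLE_LOG)
print(summary["first_shortfall_pages"])  # 10368 - 1056
print(pages_to_gib(512))                 # size of the DFLT (variable SGA) segment
```

The shortfall reported here covers only the first segment that failed. The total number of hugepages an instance needs is the sum over every segment it creates, so treat the figure as a starting point for sizing rather than the final answer.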
Summary
In my opinion, this is a must-set parameter if you need hugepages. With initialization parameters like use_large_pages, configuring hugepages for Oracle Database is getting a lot simpler.
Next In Series
- “[…] if you need hugepages”
- More on hugepages and NUMA
- Any pitfalls I find.
More Hugepages Articles
- Link to Part II in this series: Configuring Linux Hugepages for Oracle Database Is Just Too Difficult! Isn’t It? Part – II.
- Link to Part III in this series: Configuring Linux Hugepages for Oracle Database is Just Too Difficult! Isn’t It? Part – III.
And more:
- Quantifying hugepages Memory Savings with Oracle Database 11g
- Little Things Doth Crabby Make – Part X. Posts About Linux Hugepages Makes Some Crabby It Seems. Also, Words About Sizing Hugepages.
- Little Things Doth Crabby Make – Part IX. Sometimes You Have To Really, Really Want Your Hugepages Support For Oracle Database 11g.
- Little Things Doth Crabby Make – Part VIII. Hugepage Support for Oracle Database 11g Sometimes Means Using The ipcrm Command. Ugh.
- Oracle Database 11g Automatic Memory Management – Part I.