Recently I had someone ask me in email why I bother posting installments on my Little Things Doth Crabby Make series. I responded by saying I think it is valuable to IT professionals to know they are not alone when confronted by something that makes little sense, or makes them crabby if that be the case. It’s all about the Wayward Googler(tm).
Well, Wayward Googler, it’s coming on thick.
Using Memory and Then Allocating HugePages (Or Die Trying)
I purposefully booted my system with no hugepages allocated in /etc/sysctl.conf (vm.nr_hugepages = 0). I then booted an Oracle Database 11g instance with sga_target set to 8000M. Next, I fired off 500 dedicated connections using the following goofy stuff:
$ cat doit
cnt=0
until [ $cnt -eq 500 ]
do
sqlplus rw/rw @foo.sql &
(( cnt = $cnt + 1 ))
done
wait
$ cat foo.sql
HOST sleep 120
exit;
The script ran in a matter of moments since I’m using a Xeon 5500 (Nehalem) based dual-socket server running Linux with a 2.6 kernel. Yes, these processors are really, really fast. But that, of course, isn’t what made me crabby.
Directly before I invoked the script that fired off my 500 dedicated connections, I started a loop that peeked every 10 seconds at how much memory was being wasted on page tables. Remember, without hugepages (hugetlb) backing the IPC Shared Memory for the SGA, there is page table overhead for every connection to the instance. The size of the SGA and the number of dedicated connections compound to consume potentially significant amounts of memory. Although that is also not what made me crabby, let’s look at what 500 dedicated sessions attaching to an 8000 MB SGA looks like as the user count ramps up:
$ while true
> do
> grep PageTables /proc/meminfo
> sleep 10
> done
PageTables: 3764 kB
PageTables: 4696 kB
PageTables: 65848 kB
PageTables: 176956 kB
PageTables: 287616 kB
PageTables: 366540 kB
PageTables: 478224 kB
PageTables: 588424 kB
PageTables: 699832 kB
PageTables: 792356 kB
PageTables: 802468 kB
PageTables: 834004 kB
PageTables: 851980 kB
PageTables: 835432 kB
PageTables: 834948 kB
PageTables: 835052 kB
PageTables: 1463260 kB
PageTables: 2072864 kB
PageTables: 2679572 kB
PageTables: 3283456 kB
PageTables: 3892628 kB
PageTables: 4496868 kB
PageTables: 5100908 kB
PageTables: 6846256 kB
PageTables: 6866820 kB
PageTables: 6829388 kB
PageTables: 6874752 kB
PageTables: 6879360 kB
PageTables: 6883076 kB
PageTables: 6895244 kB
PageTables: 6901528 kB
PageTables: 6917256 kB
PageTables: 6927984 kB
PageTables: 6999196 kB
PageTables: 6999472 kB
PageTables: 7000048 kB
PageTables: 7088160 kB
PageTables: 7087960 kB
PageTables: 7088812 kB
PageTables: 7132804 kB
PageTables: 7121120 kB
Got Spare Memory? Good, Don’t Use Hugepages
Uh, just short of 7 GB of physical memory lost to page tables! That’s ugly, but that’s not what made me crabby. Before I forget, did I mention that it is a really good idea to back your SGA with hugepages if you are running a lot of dedicated connections and have a large SGA?
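For a sense of where a number like that comes from, here is a back-of-the-envelope sketch. It is not from the measurements above: it assumes 4 KB base pages, 8-byte page table entries (as on x86_64), and that every process ends up mapping every page of the SGA. The function name is made up for illustration, and the estimate ignores upper-level page table pages and non-SGA mappings.

```shell
# Rough upper bound, in kB, on page table memory when many processes
# map the same shared memory segment with 4 KB pages.
# Assumptions (not from the original post): 8-byte PTEs, every process
# touches every page of the segment.
pagetable_estimate_kb() {
    sga_mb=$1      # size of the shared segment in MB
    procs=$2       # number of processes attached to it
    page_kb=4      # base page size in kB
    pte_bytes=8    # bytes per page table entry
    pages=$(( sga_mb * 1024 / page_kb ))
    echo $(( pages * pte_bytes * procs / 1024 ))
}

# 8000 MB SGA, 500 dedicated connections
pagetable_estimate_kb 8000 500
```

Under those assumptions the ceiling works out to 8,000,000 kB, so the observed peak of 7,132,804 kB is in the expected neighborhood.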
So, What Did Make Him Crabby Anyway?
Wasting all that physical memory with page tables was just part of some analysis I’m doing. I never aim to waste memory (nor processor cycles for TLB misses) like that. So, I shut my Oracle Database 11g instance down in order to implement hugepages and move on. This is where I started getting crabby.
The first thing I did was verify there were, in fact, no allocated hugepages. Next, I checked to see if I had enough free memory to mess with. In this case I had most of the 16 GB of physical memory free. So, I tried to allocate 6200 2 MB hugepages by echoing the value into /proc/sys/vm/nr_hugepages. Finally, I checked to make sure I was granted the hugepages I requested…Irk. Now that made me crabby. Instead of 6200 I was given what appears to be some random number someone pulled out of the clothes hamper: 604 hugepages.
# grep HugePages /proc/meminfo
HugePages_Total:     0
HugePages_Free:      0
HugePages_Rsvd:      0
# free
             total       used       free     shared    buffers     cached
Mem:      16427876     422408   16005468          0      24104     209060
-/+ buffers/cache:     189244   16238632
Swap:      2097016      29836    2067180
# echo 6200 > /proc/sys/vm/nr_hugepages
# grep HugePages /proc/meminfo
HugePages_Total:   604
HugePages_Free:    604
HugePages_Rsvd:      0
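To put the request in perspective, a quick sanity check of the arithmetic. This is only a sketch: the function name is hypothetical, the 2048 kB page size is the Hugepagesize this system reports, and the free-memory figure is copied from the free output above.

```shell
# kB of memory needed for a given number of hugepages.
# Second argument is the hugepage size in kB (defaults to 2048).
hugepages_request_kb() {
    pages=$1
    page_kb=${2:-2048}
    echo $(( pages * page_kb ))
}

need_kb=$(hugepages_request_kb 6200)
free_kb=16005468    # "free" column of the Mem: line above

if [ "$need_kb" -le "$free_kb" ]; then
    echo "request of $need_kb kB fits in $free_kb kB free"
else
    echo "request of $need_kb kB exceeds $free_kb kB free"
fi
```

So the request was for roughly 12.1 GB out of about 15.3 GB free. The total was plainly there, although free memory alone does not guarantee the kernel can find enough physically contiguous 2 MB regions to build hugepages from.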
So, I then checked to see what free memory looked like:
# free
             total       used       free     shared    buffers     cached
Mem:      16427876    1670400   14757476          0      27040     207924
-/+ buffers/cache:    1435436   14992440
Swap:      2097016      29696    2067320
Clearly I was granted those oddball 604 hugepages I didn’t ask for. Maybe I’m supposed to just take what I’m given and be happy?
I thought perhaps the system just didn’t hear me clearly. So, without changing anything, I belligerently repeated my command and found that doing so increased my allocated hugepages by a whopping 4:
# echo 6200 > /proc/sys/vm/nr_hugepages
# grep HugePages /proc/meminfo
HugePages_Total:   608
HugePages_Free:    608
HugePages_Rsvd:      0
I began to wonder if there was some reason 6200 was throwing the system a curve-ball. Here’s what happened when I lowered my expectations by requesting 3100:
# echo 3100 > /proc/sys/vm/nr_hugepages;grep HugePages /proc/meminfo
HugePages_Total:   610
HugePages_Free:    610
HugePages_Rsvd:      0
Great. I began to wonder how long I could continually whack my head against the wall picking up little bits and pieces of hugepages along the way. So, I scripted 1000 consecutive requests for hugepages. I thought, perhaps, it was necessary to really, really want those hugepages:
# cnt=0;until [ $cnt -eq 1000 ]
> do
> echo 6200 > /proc/sys/vm/nr_hugepages
> (( cnt = $cnt + 1 ))
> done
# grep HugePages /proc/meminfo
HugePages_Total:  5502
HugePages_Free:   5502
HugePages_Rsvd:      0
Brilliant! Somewhere along the way the system decided to start doling out more than those piddly 2-page allocations in response to my request for 6200, otherwise I would have exited this loop with 2,610 hugepages. Instead, I exited the loop with 5502.
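If you find yourself stuck doing this on a running system, the head-banging loop can at least be made to stop once it gets what it wants. The following is a minimal sketch, not what I ran above. The function names are hypothetical, and the two path variables default to the real /proc files only so they can be pointed elsewhere for a dry run. Writing to nr_hugepages requires root.

```shell
# Paths default to the real kernel interfaces; override for a dry run.
MEMINFO=${MEMINFO:-/proc/meminfo}
NR_HUGEPAGES=${NR_HUGEPAGES:-/proc/sys/vm/nr_hugepages}

# Print the current HugePages_Total value.
hugepages_total() {
    awk '/^HugePages_Total:/ { print $2 }' "$MEMINFO"
}

# Keep asking for $1 hugepages until the kernel has granted that many
# or $2 attempts (default 1000) have been made, then report the result.
request_hugepages() {
    target=$1
    max_tries=${2:-1000}
    tries=0
    while [ "$(hugepages_total)" -lt "$target" ] && [ "$tries" -lt "$max_tries" ]; do
        echo "$target" > "$NR_HUGEPAGES"
        tries=$(( tries + 1 ))
    done
    echo "tries=$tries total=$(hugepages_total)"
}
```

Run as root, `request_hugepages 6200` would repeat the request at most 1000 times and print how many attempts it took and how many hugepages it ended up with.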
Well, since some is good, more must be better. I decided to run that stupid loop again just to see if I could pick up any more crumbs:
# cnt=0;until [ $cnt -eq 1000 ]; do echo 6200 > /proc/sys/vm/nr_hugepages; (( cnt = $cnt + 1 )); done
# grep PageTables /proc/meminfo
PageTables: 7472 kB
# grep '^Hu' /proc/meminfo
HugePages_Total:  5742
HugePages_Free:   5742
HugePages_Rsvd:      0
Hugepagesize:     2048 kB
That makes me crabby.
We should all do ourselves a favor and make sure we boot our servers with sufficient hugepages to cover our SGA(s). And, of course, you don’t get hugepages if you use Automatic Memory Management.
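As a starting point for the boot-time setting, here is a small sketch that turns an SGA size into a page count. The function name is hypothetical, it assumes the 2048 kB Hugepagesize shown above, and it counts only the sga_target figure itself, so any extra headroom is left to your judgment.

```shell
# Number of hugepages needed to cover an SGA of the given size in MB,
# rounded up. Second argument is the hugepage size in kB (default 2048).
hugepages_for_sga() {
    sga_mb=$1
    page_kb=${2:-2048}
    echo $(( (sga_mb * 1024 + page_kb - 1) / page_kb ))
}

# Candidate line for /etc/sysctl.conf for the 8000M sga_target used here
echo "vm.nr_hugepages = $(hugepages_for_sga 8000)"
```

For the 8000M SGA in this post that comes to 4000 pages, comfortably less than the 6200 I was begging for after the fact.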