I have been exploring the effect of process migration between CPUs in a multi-core Linux system while running long-duration Oracle jobs. While the Linux scheduler does its best to preserve L2 cache affinity, I do see migrations on my HP DL 585 Opteron 850 box. Cache affinity is important, and routine migrations can slow down long-running jobs. In fact, when a process gets scheduled on a CPU other than the one it last ran on, that CPU will stall immediately while the cache is loaded with the process’ page tables, regardless of cache warmth. That is, the cache might have pages of text, data, stack and shared memory, but it won’t have the right versions of the page tables. Bear in mind that we are talking about really small stalls here, but on long-running jobs they can add up.
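To get a rough feel for how often a process moves, one option is to poll the CPU number from inside the process. The following is only a minimal sketch, assuming glibc's sched_getcpu(3) is available; the helper name count_cpu_changes is made up for illustration. Because it only samples, the count is a lower bound on the real number of migrations.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE          /* exposes sched_getcpu() in glibc */
#endif
#include <sched.h>

/* Hypothetical helper: poll sched_getcpu() `iterations` times and count how
 * often the reported CPU differs from the previous sample.
 * Returns -1 if sched_getcpu() fails. */
long count_cpu_changes(long iterations)
{
    long changes = 0;
    int last = sched_getcpu();

    if (last < 0)
        return -1;

    for (long i = 0; i < iterations; i++) {
        int now = sched_getcpu();

        if (now < 0)
            return -1;
        if (now != last) {   /* the scheduler moved us to another CPU */
            changes++;
            last = now;
        }
    }
    return changes;
}
```

Calling count_cpu_changes() with a large iteration count before and after setting hard affinity should show the count drop to zero once the process is pinned to a single CPU.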
This Linux Journal webpage has the source for a program called cpu_bind that uses the Linux 2.6 sched_setaffinity(2) system call to establish hard affinity for a process to a specified CPU. I’ll be covering more of this in my NUMA series, but I thought I’d make a quick blog entry about this now to get the ball rolling.
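For readers who want to see the shape of the call before downloading anything, here is a minimal sketch of binding a PID to one CPU with sched_setaffinity(2). This is not the Linux Journal cpu_bind.c source; the helper name pin_pid_to_cpu and its error handling are my own assumptions.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE          /* exposes cpu_set_t macros and sched_setaffinity() */
#endif
#include <errno.h>
#include <sched.h>
#include <sys/types.h>

/* Hypothetical helper: restrict `pid` (0 means the caller) to a single CPU.
 * Returns 0 on success, -1 on failure with errno set. */
int pin_pid_to_cpu(pid_t pid, int cpu)
{
    cpu_set_t mask;

    if (cpu < 0 || cpu >= CPU_SETSIZE) {
        errno = EINVAL;
        return -1;
    }

    CPU_ZERO(&mask);         /* start with an empty CPU set */
    CPU_SET(cpu, &mask);     /* allow only the requested CPU */
    return sched_setaffinity(pid, sizeof(mask), &mask);
}
```

A small main() that parses a PID and a CPU number from argv and passes them to pin_pid_to_cpu() would behave much like the cpu_bind invocation shown in the session below. Note that the call fails if the requested CPU is not online or not permitted for the target process.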
After downloading the cpu_bind.c program, it is simple to compile and execute. The following session shows compilation and execution to bind my current bash(1) shell (PID $$) to CPU 3 with hard affinity, followed by a busy loop in that shell:
$ cc -o cpu_bind cpu_bind.c
$ cpu_bind $$ 3
$ while true
> do
> :
> done
The following is a screen shot of top(1) with CPU 3 utilized 100% in user mode by my looping shell. Note, you may have to right-click->view image:
If you want to experiment with Oracle, you could start a long-running job and execute cpu_bind on its PID once it is running, or do what I did with $$ and then invoke sqlplus, for instance (child processes inherit the shell’s affinity). Also, a SQL*Net listener process could be started with hard affinity to a certain CPU, and you could connect through it when running a long CPU-bound job. Just a thought, but I’ll be showing real numbers in my NUMA series soon.
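Starting a program with hard affinity already in place can also be done directly, since the affinity mask is inherited across fork(2) and preserved across execve(2). The following is a minimal sketch under that assumption; the helper name run_on_cpu and the exit codes 126 and 127 (borrowed from shell convention) are my own choices, not anything from cpu_bind.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE          /* exposes cpu_set_t macros and sched_setaffinity() */
#endif
#include <sched.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run argv[0] with hard affinity to `cpu` and wait for it.
 * Returns the child's exit status, or -1 if the child could not be forked
 * or did not exit normally. */
int run_on_cpu(int cpu, char *const argv[])
{
    cpu_set_t mask;
    pid_t child;
    int status;

    if (cpu < 0 || cpu >= CPU_SETSIZE)
        return -1;

    child = fork();
    if (child < 0)
        return -1;

    if (child == 0) {
        /* Child: pin ourselves first, then exec; the mask survives the exec. */
        CPU_ZERO(&mask);
        CPU_SET(cpu, &mask);
        if (sched_setaffinity(0, sizeof(mask), &mask) != 0)
            _exit(126);      /* could not set affinity */
        execvp(argv[0], argv);
        _exit(127);          /* only reached if exec failed */
    }

    /* Parent: wait for the pinned child and report its exit status. */
    if (waitpid(child, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

For example, passing an argv of {"sqlplus", NULL} with cpu set to 3 would start sqlplus already bound to CPU 3, and any processes it spawns locally would inherit the same mask.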
Give it a thought, see what you think.
The NUMA series links are: