Oracle on Opteron with Linux-The NUMA Angle (Part VII).

This installment in my series about Oracle on Linux with NUMA hardware is very, very late. I started this series at the end of last year and it just kept getting put off—mostly because the hardware I needed to use was being used for other projects (my own projects). This is the seventh in the series and it’s time to show some Oracle numbers. Previously, I laid groundwork about such topics as SUMA/NUMA, NUMA API and so forth. To make those points I relied on microbenchmarks such as the Silly Little Benchmark. The previous installments can be found here.

To bring home the point that Oracle should be run on AMD boxes in NUMA mode (as opposed to SUMA), I decided to pick an Oracle workload that is very easy to understand as well as processor intensive. After all, the difference between SUMA and NUMA is higher memory latency so testing at any level below processor saturation actually provides the same throughput-albeit the SUMA result would come at a higher processor cost. To that end, measuring SUMA and NUMA at processor saturation is the best way to see the difference.

The workload I’ll use for this testing is what my friend Anjo Kolk refers to as the Jonathan Lewis Oracle Computing Index workload. The workload comes in script form and is very straightforward. The important thing about the workload is that it hammers memory which, of course, is the best way to see the NUMA effect. Jonathan Lewis needs no introduction of course.

The test was set up to execute 4, 8 16 and 32 concurrent invocations of the JL Comp script. The only difference in the test setup was that in one case I booted the server in SUMA mode and in another I booted in NUMA mode and allocated hugepages. As I point out in this post about SUMA, hugepages are allocated in a NUMA fashion and booting an SGA into this memory offers at least crude fairness placement of the SGA pages—certainly much better than a Cyclops. In short, what is being tested here one case where memory is allocated at boot time in a completely round-robin fashion versus the SGA being quasi-round robin yet page tables, kernel-side process-related structures and heap are all NUMA-optimized. Remember, this is no more difficult than a system boot option. Let’s get to the numbers.

jlcomp.jpg

I have also rolled up all the statspack reports into a word document (as required by WordPress). The document is numa-statspack.doc and it consist of 8 statspacks each prefaced by the name of what the specific test was. If you pattern search for REPORT NAME you will see each entry. Since this is a simple memory latency improvement, you might not be surprised how uninteresting the stats are-except of course the vast improvement in the number of logical reads per second the NUMA tests were able to push through the system.

SUMA or NUMA
A picture speaks a thousand words. This simple test combined with this simple graph covers it all pretty well. The job complete time ranged from about 12 to 15 percent better with NUMA at each of the concurrent session counts. While 12 to 15% isn’t astounding, remember this workload is completely processor bound. How do you usually recuperate 12-15% from a totally processor-bound workload without changing even a single line of code? Besides, this is only one workload and the fact remains that the more your particular workload does outside the SGA (e.g., sorting, etc) the more likely you are to see improvement. But by all means, do not run Oracle with Cyclops memory.

The Moral of the Story

Processors are going to get more cores and slower clock rates and memory topologies will look a lot more NUMA than SUMA as time progresses. I think it is important to understand NUMA.

What is Oracle Doing About It?
Well, I’ve blogged about the fact that the Linux ports of 10g do not integrate with libnuma. That means it is not NUMA-aware. What I’ve tried to show in this series is that the world of NUMA is not binary. There is more to it than SUMA or NUMA-aware. In the middle is booting the server and database in a fashion that at least allows benefit from the OS-side NUMA-awareness. The next step is Oracle NUMA-awareness.

Just recently I was sitting in a developer’s office in bldg 400 of Oracle HQ talking about NUMA. It was a good conversation. He stated that Oracle actually has NUMA awareness in it and I said, “I know.” I don’t think Sequent was on his mind and I can’t blame him—that was a long time ago. The vestiges of NUMA awareness in Oracle 10g trace back to the high-end proprietary NUMA implementations of the 1990s.  So if “it’s in there” what’s missing? We both said vgetcpu() at the same time. You see, you can’t have Oracle making runtime decisions about local versus remote memory if a process doesn’t know what CPU it is currently executing on (detection with less than a handful of instructions).  Things like vgetcpu() seem to be coming along. That means once these APIs are fully baked, I think we’ll see Oracle resurrect intrinsic NUMA awareness in the Linux port of Oracle Database akin to those wildcat ports of the late 90s…and that should be a good thing.

5 Responses to “Oracle on Opteron with Linux-The NUMA Angle (Part VII).”


  1. 1 Jeremy Schneider June 6, 2007 at 1:22 pm

    “To bring home the point that Oracle should be run on AMD boxes…”

    Just wondered – I sortof took that quote out of context and I think your point was more about NUMA vs SUMA, but wouldn’t AMD be pretty much out since Intel’s multi-core processors have proven so much faster in virtually every benchmark? Now I’m just thinking of the Athlon’s vs the Core Duo’s which I follow since I’m thinking of building a new home computer soon… maybe AMD’s higher-end processors don’t stack up so poorly to Intel’s offering…

    Did you mention AMD in this post simply because that’s what you had on hand or do you have an opinion one way or the other about whether they are the best choice for Oracle workloads?

  2. 2 kevinclosson June 6, 2007 at 2:41 pm

    Jeremy,

    Since this is a thread about NUMA and Linux, I must use AMD as the model. Intel woodcrest and clovertown multicore processors are astounding. Since I haven’t seen a single announcement from HP, IBM or Dell about a server based on upcoming AMD “Barcelona” processor, I don’t think there will be much to blog about there…

    Penroe and Nahelem Intel processors will connect to memory with a more NUMA-like architecture (CSI). This thread might be more mainline in that timeframe. In the meantime, folks that want to get a little more out of their 8820 and 2220 Opteron systems should appreciate this thread as well…or then there are the morbidly curious about how things work.

    My AMD stuff is here:

    http://kevinclosson.wordpress.com/kevin-closson-index/oracle-on-opteron-k8l-numa-etc/

  3. 3 Michal November 3, 2008 at 11:43 am

    Remarkable could be the fact that in Oracle 11.1.0.7 there is new row in v$sgainfo dynamic perspective:

    SQL> select * from v$sgainfo;

    NAME BYTES RES
    ——————————– ———- —

    Startup NUMA Shared Pool memory 536870912 No

    I believe that Oracle started utilizing platform with awareness.


  1. 1 Monitoring Oracle on SMP with Hyper-Threading : Ardent Performance Computing Trackback on June 22, 2007 at 5:10 am
  2. 2 ITC Test Agg » Blog Archive » Monitoring Oracle on SMP with Hyper-Threading Trackback on June 22, 2007 at 12:47 pm

Leave a Reply

Please log in using one of these methods to post your comment:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s




EMC Employee Disclaimer

The opinions and interests expressed on EMC employee blogs are the employees' own and do not necessarily represent EMC's positions, strategies or views. EMC makes no representation or warranties about employee blogs or the accuracy or reliability of such blogs. When you access employee blogs, even though they may contain the EMC logo and content regarding EMC products and services, employee blogs are independent of EMC and EMC does not control their content or operation. In addition, a link to a blog does not mean that EMC endorses that blog or has responsibility for its content or use.

This disclaimer was put into place on March 23, 2011.

Enter your email address to follow this blog and receive notifications of new posts by email.

Join 2,176 other followers

Oracle ACE Program Status

Click It

website metrics

Fond Memories

Copyright

All content is © Kevin Closson and "Kevin Closson's Blog: Platforms, Databases, and Storage", 2006-2013. Unauthorized use and/or duplication of this material without express and written permission from this blog’s author and/or owner is strictly prohibited. Excerpts and links may be used, provided that full and clear credit is given to Kevin Closson and Kevin Closson's Blog: Platforms, Databases, and Storage with appropriate and specific direction to the original content.

Follow

Get every new post delivered to your Inbox.

Join 2,176 other followers

%d bloggers like this: