AMD Quad-Core “Barcelona” Processor For Oracle (Part III). NUMA Too!

To continue my thread about AMD’s future Quad-core processors code-named “Barcelona” (a.k.a. K8L), I need to elaborate a bit on my last installment on this thread where I pointed out that AMD’s marketing material suggests we should expect 70% better OLTP performance from Barcelona than Socket F (Opteron 2220). To be precise, the marketing materials are predicting a 70% increase on a per-processor basis. That is a huge factor that I need to blog about, so here it is.

“Friendemies”
While doing the technical review for the Julian Dyke/Steve Shaw RAC on Linux Book I got to know Steve Shaw a bit. Since then we have become more familiar with each other especially after manning the HP booth in the exhibitor hall at UKOUG 2006. Here is a photo of Steve in front of the HP Enterprise File Services Clustered Gateway demo. The EFS is an OEMed version of the PolyServe scalable file serving utility (scalable clustered storage that works).

shaw_4.JPG

People who know me know I’m a huge AMD fan, but they also know I am not a techno-religious zealot. I pick the best, but there is no room for loyalty in high technology (well, on second thought, I was loyal to Sequent to the bitter end…oh well). So over the last couple of years, Steve and I have occasionally agreed to disagree about the state of affairs between Intel and AMD processor fitness for Oracle. Steve and I are starting to see eye to eye a lot more these days because I’m starting to smell the coffee as they say.

It’s All About The Core
When it comes to Oracle performance on industry standard servers, the only thing I can say is, “It’s the core, stupid”—in that familiar Clintonian style of course. Oracle licenses the database at the rate of .5 per core, rounded up. So a quad-core processor is licensed as 2 CPUs. Let’s look at some numbers.
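
The licensing rule just described can be written out as a few lines of arithmetic. This is only a sketch of the rule as stated in this post (0.5 license per core, rounded up); the function name `oracle_ee_licenses` and its `core_factor` parameter are made up for illustration and are not part of any Oracle tool.

```python
import math


def oracle_ee_licenses(cores: int, core_factor: float = 0.5) -> int:
    """Processor licenses for a given core count: cores x factor, rounded up.

    The 0.5 default is the per-core rate quoted in this post; the function
    itself is a hypothetical helper, not an Oracle-provided calculation.
    """
    return math.ceil(cores * core_factor)


print(oracle_ee_licenses(2))  # dual-core processor -> 1
print(oracle_ee_licenses(4))  # quad-core processor -> 2
```

The rounding only matters for odd core counts; for the dual-core and quad-core parts discussed here the result is simply half the core count.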

Since AMD’s Quad-core promo video is based on TPC results, I think it is fair to go with them. TPC-C is not representative of what real applications do to a processor, but the workload does one thing really well—it exploits latency issues. For OLTP, memory latency is the most important performance characteristic. Since AMD’s material sets our expectations for some 70% improvement in OLTP over the Opteron 2200, we’ll look at TPC-C.

This published TPC-C result shows that the Opteron 2200 can perform 69,846 TpmC per processor. If the AMD quad-core promotional video proves right, the Barcelona processor will come it at approximately 118,739 TpmC per processor (a 70% improvement).
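
For readers who want to reproduce that projection, here is a minimal sketch. It assumes the two-socket Opteron system total of 139,693 TpmC that is cited in the comment thread on this post, and it takes AMD's 70% claim at face value; the variable and function names are illustrative only.

```python
def per_socket(system_tpmc: float, sockets: int) -> float:
    """Naive per-processor figure: system result divided by socket count."""
    return system_tpmc / sockets


# Two-socket Opteron 2200 system total (figure cited in the comments).
opteron_system_tpmc = 139_693
opteron_per_socket = per_socket(opteron_system_tpmc, sockets=2)

# AMD's marketing claim: 70% more OLTP throughput per processor.
barcelona_per_socket = opteron_per_socket * 1.70

print(f"Opteron 2200 per socket: {opteron_per_socket:,.1f} TpmC")
print(f"Projected Barcelona per socket: {barcelona_per_socket:,.0f} TpmC")
```

Dividing a system result by socket count ignores scaling effects, so treat the output as a back-of-the-envelope number rather than a prediction.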

TpmC/Oracle-license
Since a quad-core AMD is licensed by Oracle as 2 CPUs, it looks like Barcelona will be capable of 59,370 TpmC per Oracle license. Therein lies the rub, as they say. There are a couple of audited TPC-C results with the Intel “Tulsa” processor (a.k.a. Xeon 7140, 7150), such as this IBM System x result, that show this current high-end Xeon processor is capable of some 82,771 TpmC per processor. Since the Xeon 71[45]0 is a dual-core processor, the Oracle-license price factor is 82,771 TpmC per Oracle license. If these numbers hold any water, some 9 months from now when Barcelona ships, we’ll see a processor that is 28% less price-performant from a strict Oracle licensing standpoint. My fear is that it will be worse than that because Barcelona is socket-compatible with Socket F systems—such as the Opteron 2200. I’ve been at this stuff for a while and I cannot imagine the same chipset having enough headroom to feed a processor capable of 70% more throughput. Also, Intel will not stand still. I am comparing current Xeon to future Barcelona.
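
The per-license comparison above is easy to check. The sketch below uses only the per-processor figures from this post (118,739 TpmC projected for Barcelona, 82,771 TpmC for the Xeon 7150) and the 0.5-per-core licensing rate; the helper name `tpmc_per_license` is an assumption made for this example.

```python
import math


def tpmc_per_license(tpmc_per_processor: float, cores: int,
                     core_factor: float = 0.5) -> float:
    """Throughput per Oracle license for one processor (hypothetical helper)."""
    licenses = math.ceil(cores * core_factor)
    return tpmc_per_processor / licenses


barcelona = tpmc_per_license(118_739, cores=4)  # quad-core: 2 licenses
tulsa = tpmc_per_license(82_771, cores=2)       # dual-core: 1 license

# Fraction by which Barcelona trails Tulsa per license.
deficit = 1 - barcelona / tulsa

print(f"Barcelona: {barcelona:,.1f} TpmC per license")
print(f"Tulsa:     {tulsa:,.1f} TpmC per license")
print(f"Barcelona deficit: {deficit:.0%}")
```

Under these inputs the deficit comes out at roughly 28%, which is where the figure in the paragraph above comes from.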

A Word About TPC-C Analysis
I admit it! I routinely compare TPC-C results on the same processor using results achieved by different databases. For instance, in this post, I use a DB2/SLES on IBM System x result to make a point about the Xeon 7150 (“Tulsa”) processor. E-gad, how can I do that with a clear conscience? Well, think about it this way. If DB2 on IBM System x running SuSE can achieve 82,771 TpmC per Xeon 7150 and this HP result shows us that SQL Server 2005 on ProLiant ML570G4 (Xeon 7140) can do 79,601 TpmC per CPU, you have to at least believe Oracle would do as well. There are no numbers anywhere that suggest Oracle is head and shoulders above either of these two software configurations on identical hardware. We can only guess because Oracle seems to be doing TPC-C with Itanium exclusively these days. I think that is a bummer, but Steve Shaw likes it (he works for Intel)!

What Does NUMA Have To Do With It?
Uh, Opteron/HyperTransport systems are NUMA systems. I haven’t blogged much about that yet, but I will. I know a bit about Oracle on NUMA—a huge bit.

I hope you’ll stay tuned because we’ll be looking at real numbers.

21 Responses to “AMD Quad-Core “Barcelona” Processor For Oracle (Part III). NUMA Too!”


  1. 1 Jeff January 1, 2007 at 6:30 pm

    Kevin,

    Happy new year, and I really enjoy the website. I’ve been struggling a bit on performing capacity analysis w.r.t. hardware for a RAC environment using an EMC DMX-3 SAN. While I have good numbers on my application in terms of DML activity, can you give any pointers on where I should start from a hardware perspective?

    Thanks Jeff

  2. 2 kevinclosson January 2, 2007 at 8:46 pm

    Hi Jeff,

    Well, I’m not sure exactly what you are trying to do. Are you trying to predict headroom? Maybe more elaboration on your request might help me out?

    BTW, isn’t Switch a PolyServe site?

  3. 3 Mogens Nørgaard January 3, 2007 at 8:25 pm

    I have a dream.

    When the dual-core thingies came out, Oracle was insisting on licensing per core, so a dual-core = 2 CPU licenses.

    Microsoft has stated (very wisely!) that multi-core = One, and only One, CPU license.

    After a while Oracle changed it to 0.5 per core, meaning One CPU = One CPU license.

    (They did one exception – the Sun T1 8-core was declared equal to two CPU-licenses since one core = 0.25 license for this processor.)

    I hereby predict that Oracle will change their licensing so that a core becomes 0.25 CPU – or they’ll do the elegant and timeless Microsoft thing. May as well.

    I have a dream.

  4. 4 kevinclosson January 3, 2007 at 8:33 pm

    Hi Mogens,

    I’d like to think you are right, but there is one problem. Intel is currently shipping quad-core (the “Clovertown” Xeon 5355 MCM) and yet Oracle has not addressed the .5 Intel core licensing. I doubt they will change it just for AMD when Barcelona ships…but then crazy things do happen!

  5. 5 Mogens Nørgaard January 3, 2007 at 8:41 pm

    They didn’t change it for a while after the first dual-core came out. Think about the price difference between Oracle and SQL Server suddenly, and you’ll see why they’ll change it in a conservative way :).

  6. 6 Jeff January 4, 2007 at 3:02 pm

    Kevin,

    What I’m trying to predict is the headroom of a 4 node cluster. For instance, I can look at TPC-C benchmarks, but TPC-C isn’t going to accurately reflect my numbers considering the servers and equipment will be newly purchased. Before purchasing the server equipment, I’d like to at least have some sort of idea of what performance numbers I should expect considering I’m looking at 1K TPS. I’ve looked at Erik Peterson’s paper as well as yours (Federated benchmark), so I already have rough ideas.

    What I’m going to do initially is benchmark on our app against a 4 node cluster I have in house using the same storage we will use in the field (and attempt to use Orion to see storage performance). Won’t be identical to the new equipment, but should give a good baseline. Any other suggestions would be appreciated.

    BTW, we do use Polyserve at one of our customer sites and works very well for our RAC environment.

  7. 7 kevinclosson January 4, 2007 at 9:59 pm

    Jeff,
    I have an idea. Why don’t you email me at ora_kclosson (at) yahoo.com and let’s take this offline. What I’d like to do is see if I can help you out and then make a mini case study of what comes of it. I have some standalone tools that I use to give me a fairly good idea of what Oracle can do on a platform. If you execute them we might be able to decipher this headroom you seek….

    BTW, thanks for being a PolyServe customer.

  8. 8 DavidC January 16, 2007 at 12:28 pm

    “This published TPC-C result shows that the Opteron 2200 can perform 69,846 TpmC per processor. If the AMD quad-core promotional video proves right, the Barcelona processor will come it at approximately 118,739 TpmC per processor (a 70% improvement).”

    Duh. That’s because Barcelona has double the amount of cores per processor. Barcelona is a quad-core(4 cores) while AMD Opteron 2220 is a dual core(2 cores)

    And dividing the total TPC-C score by number of processors to get “per processor” result is WRONG. There is NO linear scaling.

    Xeon Clovertown can score 220K-240K with just TWO processors. AMD Opteron 2220 can do 139K with two processors.

    139K*1.7=236K(this would be Barcelona with 2 processors)

    We see that according to AMD, Barcelona is not faster per core than Clovertown.

  9. 9 kevinclosson January 16, 2007 at 3:59 pm

    David,

    I’m not sure I get your angle. We are saying the same thing. I’ll quote myself (typo and all):

    “This published TPC-C result shows that the Opteron 2200 can perform 69,846 TpmC per processor. If the AMD quad-core promotional video proves right, the Barcelona processor will come it at approximately 118,739 TpmC per processor (a 70% improvement).”

    I’m saying if AMD is right Barcelona will do 118,739 per processor or 237,478 for a 2 CPU system. Your comment says 139K*1.7=236K would be Barcelona with 2 processors.

    Did you mean to make a long blog comment about .6% difference between your number and my number?

    As for dividing system results to get a per-processor or per-core number, that is nothing new. That aside, Oracle doesn’t factor in scalability when they license their product and I’m blogging about performance per Oracle license.

  10. 10 DavidC January 24, 2007 at 3:14 pm

    Ok, let me clarify. No I am not nit-picking about the 0.6% difference between your calculation and mine. Also, I forgot about the licensing factor.

    Rather, I was talking about how you viewed 70% performance improvement as astonishing, but it’s possible as Barcelona has twice the amount of cores, along with the fact it’ll significantly outperform Clovertown as you were looking at it per core. From what I can remember from the original article, you deduced that:

    Opteron 2220SE gets 69K per processor. If Barcelona is 70% faster, then it should get 69K x 1.7 = Approximately 117K

    Which is false because Barcelona has twice as many cores as the Opteron and Oracle licensing would count it as twice as many. In terms of Oracle licensing Barcelona would actually indicate a LOSS in score for same licensing prices.

    It did look heavily biased towards AMD without very good arguments towards it.

    Now you clarified that, it’s ok.

  11. 11 kevinclosson January 24, 2007 at 3:32 pm

    DavidC,

    Sorry if it was confusing. You must have sped read the most important paragraph in the post:

    “TpmC/Oracle-license
    Since a quad-core AMD is licensed by Oracle as 2 CPUs, it looks like Barcelona will be capable of 59,370 TpmC per Oracle license. Therein lies the rub, as they say. There are a couple of audited TPC-C results with the Intel “Tulsa” processor (a.k.a. Xeon 7140, 7150), such as this IBM System x result, that show this current high-end Xeon processor is capable of some 82,771 TpmC per processor. Since the Xeon 71[45]0 is a dual-core processor, the Oracle-license price factor is 82,772 TpmC per Oracle license.”

    This post sadly offers a grim outlook for Barcelona given what we know today–most particularly regarding the Oracle per-core licensing scheme.

  12. 12 DavidC January 24, 2007 at 4:13 pm

    I see. I am sure I have seen yet another article by you though.

  13. 13 Mark January 24, 2007 at 7:19 pm

    The real comparisons will be AMD Barcelona to Intel Clovertown and Tigerton. Comparing quad-core Barcelona to dual-core Woodcrest or Tulsa. The same price performance disadvantage exists between Clovertown and Woodcrest as exists between Barcelona and Woodcrest.

  14. 14 information_is_king January 24, 2007 at 9:37 pm

    check your math on the xeon system. tpc is 331,087 and the box has 4 dual core processors for a total of 8 physical processors. 331,087/8 = 41386.

    now compare that to the 2 way dual core opteron system. tpc is 139,693 (multiply by 1.7 to estimate barcelona) = 237478 for 4 physical cpus or 59367.

    the barcelona@59367 > xeon@41386 by a factor of 1.44

    you’re welcome… and i’m glad you aren’t my IT buyer.

  15. 15 kevinclosson January 24, 2007 at 10:35 pm

    Do I really have to do this?

    The Xeon system at 331,087 is 4 socket, 8 core not “8 physical processors” as you state. The terminology is very important and the term “physical processors” has generally been replaced with the term “socket.”

    The Opteron number is 139,693 for 2 sockets, 4 cores. AMD expects an increase of 70% per socket, not core. So you are right, the projected Barcelona number is 1.7x or 237,478, but that would be for a 2 socket system–albeit 8 cores.

    This is an Oracle blog and I’m blogging about performance per core. So I’ll reiterate:

    Opteron 2200 34,923 TpmC per core (139,693/4)
    Barcelona ~29,684 TpmC per core (237,478/8)
    Tulsa 41,385 TpmC per core (331,087/8)

    Oracle licenses by the core. That is all that matters on this blog.

    There was nothing wrong with my math, you just don’t know the difference between a socket and a core.

    Web 2.0…good grief.

  16. 16 information_is_king January 25, 2007 at 5:32 pm

    kevin – you are correct. your math is fine. though, i may still disagree about core being a better term than “physical processor”, but that is neither here, nor there.

    my gut told me, based upon working with servers and knowing both architectures, that your calculations were incorrect. instead i erred in my math as you pointed out. *but*, i did uncover an error in your logic that makes your case worthless.

    you are comparing a commodity chip with a specialized chip. those xeon processors in the ibm TPC have 16MB of L3 cache and cost about 6k a piece. amd most likely gave us the performance increase of the commodity version of barcelona, not a specialized version of barcelona. they specifically used it as a comparison, or upgrade of current socket TDP (65W,89W) parts.

    the benchmark likely runs in cache on the special case hardware. we all know the p4 architecture is on the way out and intel has even put an end of line date on the architecture. compare the barcelona to woodcrest or intel’s new two die on one core for a more accurate representation of where the battle ahead lies.

    so, you still can’t buy my hardware for me 😉

  17. 17 Ron March 13, 2007 at 2:32 pm

    OK, there seems to be some confusion

    1= Oracle basic strategy is to charge =per socket= AKA =per physical CPU=.
    1C, 2C, 4C, 8C, etc CPUs are each going to be considered =1= Oracle license once the CPU in question is common enough.
    Oracle is greedy, not stupid.
    P*ssing off the customer base to the point where they vote with their dollars and feet to become someone else’s customer en masse is not something Ellison’s Oracle Corp is going to let happen.

    2= Intel and AMD use very, very different memory and cache coherency technologies. AMD’s MOESI is considerably better than Intel’s MESI; and we all know how much better HT is for large DBMS workloads that don’t fit into on die SRAM caches.

    3= Comparing TPC numbers w/o =CAREFUL= regard to all of HW, OS, and DBSW is very likely to result in erroneous conclusions.
    In addition, as good as TPC is, it is still an artificial workload based bench.
    In particular, many TPC efforts are made using system configs that are grossly unrealistic ITRW. Check the system disclosure info for many of the TPC results to see what I mean.
    The only really trustworthy bench is =your= workload running on =your= system. (…and done correctly of course)

    4= Comparing CPUs of wildly differing prices as if they are equivalent is both logically flawed and fiscally irresponsible.
    IF CPU “x” costs me ~10x the TCO of CPU “y”, CPU “x” d@mn well better give me +at least+ ~10x the performance if it is going to claim superiority over the lower TCO product.
    That holds even more true for comparative overall system costs.

    While I share the concerns regarding these issues, the lack of careful clarity here is more likely to add to the FUD rather than reduce it.

  18. 18 kevinclosson March 13, 2007 at 3:04 pm

    Ron,

    As of Feb 16 2007, Oracle charges per socket for Standard Edition up to 4 sockets. Enterprise Edition was and is still .5 licenses per CORE. I’ll repeat that.

    EE per Core. SE per socket (up to 4 sockets).

    As for comparing TPC across platforms, I present my reasoning for doing so very early in the thread. We expect that Oracle is at least as good as SQL Server, so I use SQL Server numbers with that thought in mind. I ran audited TPC benchmarks for a living dating way back to the early 90s. I “get it.” But, if you think I’m so very wrong in doing so, then please tell us just how much faster you think SQL Server is. On the other hand, if Oracle can outperform SQL Server on the same hardware, then my ideas are extremely pessimistic which would play in favor of Oracle on Barcelona. However, since you are so far off base with your point 1 above, none of the arithmetic, nor the concepts involved are going to make sense to you.

    As for point 4, TPCC has a cost metric ($$/TpmC) so I must be missing your point.

    In the end, this thread is meant to raise the issue of Oracle Enterprise Edition licensing on dense multi-core processors. If Barcelona is so murderously fast (even more so than AMD’s predictions), then sure, the “Barcelona Effect” will cancel out the “Oracle Per Core Licensing Effect.” But please, do not mistake the fact that Oracle licenses EE at .5 license per core because it distracts from the thread.

    About point 3 in your comment, AMD is the one that used TPCC first to set the expectations for Barcelona. I didn’t just start swirling TPC numbers around in a vat with visions of Barcelona in my head. If you read this thread, you’ll see I provide links to AMD’s site where they show audited AMD and Intel TPCC results and lay out a projected Barcelona number. I suspect you googled into part III of this thread and didn’t go back to see Parts I and II before posting.

  19. 19 Ron March 13, 2007 at 4:47 pm

    No one is arguing with you about Oracle’s current pricing scheme.

    The argument is with your implied assumption that Oracle is going to keep it the way it presently is.
    History shows there’s not a chance of that happening.
    Oracle will keep their prices where they are only until 4C is common enough that it is “main streamed”, at which point they will do the same thing they did when 2C mainstreamed- they will change their license model.
    Oracle will permanently lose customers if they don’t, and they know it.
    When the number of customers Oracle rates to lose gets large enough because 4C is common enough, the licenses will change enough to reduce lost customers to an acceptable level.
    Oracle is greedy, not stupid.
    Your basic pricing premise is flawed.

    My issues with using TPC at face value have nothing to do with assumptions re: Oracle vs SQL Server.
    Anyone reading this knows that Oracle is the OLTP king of the DB world. So does M$.

    The issue with TPC is that in order to do a reasonable comparison, you have to look very closely at the system configs used for each specific TPC effort.
    For instance, I’ve seen ~$25M configs where ~$20M of it was HD subsystems. Maybe this is realistic for your organization. I’d argue it isn’t for most.
    Worse, I’ve seen configs where it is obvious that the big performance enhancer was not the DB at all!
    Instead some new disruptive technology (10GbE vs 1GbE, 15Krpm SAS HDs vs “other”, 4Gb FC vs 2Gb FC vs 1Gb FC, etc) was what was actually driving the improvement in TPC results obtained.
    The point here is “There are lies, damn lies, and statistics… …and benchmarks.”

    Everyone in this discussion appears experienced enough to know that the only bench that can be trusted is the bench done on your own HW with your own workload. Anything and everything else has to be treated with a large degree of skepticism.

    Then we get to your fuzziness with regards to MOSI vs MOESI and other HW details re: NUMA and ccNUMA as well as comparing CPUs of wildly differing price points as equivalent while being exacting about Oracle licensing issues.
    I read your entire thread. I “grok” your entire thread. I agree there are valid issues in what you are bringing up.
    But you are hurting your own case by using poor methods and logic.

    If you want to make a valid engineering and financial argument (and I agree there is a need for such analysis, so =please= do), Do It Right.

    Like you, I’ve been “around the block” (Oracle 6.5 to 10g, SQL Server from 1st release to now, just about every release of mySQL, just about every release of PostgreSQL, etc).
    One lesson we all know is that “The devil is in the details. If you don’t pay attention to the details, the devil comes out to play.”

  20. 20 kevinclosson March 13, 2007 at 7:17 pm

    “Oracle will keep their prices where they are only until 4C is common enough that it is “main streamed”, at which point they will do the same thing they did when 2C mainstreamed- they will change their license model.”

    Ron, Oracle did just respond. They changed SE to a per-socket model. Being a Feb 16 2007 change, I think it will remain the model for quite some time. Since Intel is shipping Clovertown, I should think that qualifies as “main streamed” 4C.

    I allowed your comment through because if Barcelona ends up being widely adopted for Oracle EE deployments with today’s licensing scheme, then my blog posts on the matter are nothing but hot air and you get to be right.

    You are 100% correct that customer workloads are much more appropriate than TPC, but I’ll reiterate (seems I have to do that a lot on this thread) AMD is the one that started the TPC/Barcelona thread–not me.

    “…well as comparing CPUs of wildly differing price points as equivalent while being exacting about Oracle licensing issues.”

    Ron. On this thread I’ve compared commodity processors such as Xeon (Tulsa, Woodcrest and Clovertown) and AMD. Yes, I suppose there are “wildly differing price points” between them but how can that possibly matter when the SOFTWARE costs exponentially more than the processor. List EE is $60,000 for a single dual-core x86 processor and $120,000 for a single quad-core x86 processor. Are you really going to scrap over the list price of the server? With Oracle on commodity hardware, the hardware cost is moot. If you scrutinize between a server that is $5,000 or $35,000 with, say, 8 cores to run $240,000 worth of software, doesn’t that make you a bit penny-wise/pound-foolish? That is the very point I’m trying to make.

    There are three possible outcomes: 1) Barcelona is so very fast that it neutralizes the Oracle licensing effect, making me henny-penny, 2) Oracle changes EE core-based licensing to accommodate AMD’s projected OLTP performance or 3) Oracle EE on Barcelona is too expensive. Of these, only 3 would upset me.

  21. 21 ossiovasen December 20, 2007 at 11:03 am

    Hi.
    Because Barcelona is having delay after delay on delivery, we are looking into the Intel Xeon Quad core instead of the AMD Dual Core for Oracle databases. Are there any tests on these two architectures? Any news on when Barcelona will be released?
    Rgds OSSI


DISCLAIMER

I work for Amazon Web Services. The opinions I share in this blog are my own. I'm *not* communicating as a spokesperson for Amazon. In other words, I work at Amazon, but this is my own opinion.

Copyright

All content is © Kevin Closson and "Kevin Closson's Blog: Platforms, Databases, and Storage", 2006-2015. Unauthorized use and/or duplication of this material without express and written permission from this blog’s author and/or owner is strictly prohibited. Excerpts and links may be used, provided that full and clear credit is given to Kevin Closson and Kevin Closson's Blog: Platforms, Databases, and Storage with appropriate and specific direction to the original content.
