May 30, 2007. BLOG UPDATE: Note, the author of the papers I discussed in this blog entry has visited and commented. If nothing else, I recommend reading my follow up regarding the fact that these papers don’t even have the word Oracle in them.
It isn’t very often that you get a tier one hardware vendor directly comparing RAC with non-RAC. When it happens, it is generally by accident. That doesn’t stop me from learning from the information. I hope you will find it interesting too.
So, Dell didn’t exactly set out to compare RAC to non-RAC, but they inadvertently did. In October 2006, they released a series of whitepapers that compare Dell with Oracle to Sun with Oracle. I personally think such comparisons are a complete waste of time since Sun shops are going to run Sun and Windows shops are going to run Windows.
The whitepapers take two swipes at the Sun V490 with 8 UltraSPARC IV+ processors. The first is a cluster of Dell 2950s, each with 2 dual-core Xeon 5160 (Woodcrest) processors, running Red Hat Enterprise Linux 4. The second is a single Dell 6850 with 4 dual-core Xeon 7140 (Tulsa) processors running Windows Server 2003. If only they had both been Linux! No matter, though; the comparison is still very interesting. The papers are available at the following URLs:
- http://www.dell.com/downloads/global/power/pe2950_vs_sunv490.pdf
- http://www.dell.com/downloads/global/power/pe6850_vs_sunv490.pdf
Even though the papers were intended to cast stones at the Sun V490, there was one particularly interesting aspect of the testing that makes the results helpful in so many other ways. See, Dell did all this testing with the same SAN. In fact, a good portion of these papers is identical. The description of the SAN used for the V490, the clustered 2950s, and the 6850 appears in both papers as follows:
Storage for both the Dell and Sun servers was provided by a Storage Area Network (SAN) attached Dell/EMC CX3-80 fibre channel storage array. Each server was attached to the SAN via two QLogic Host Bus Adapters.
There we have it, 3 configurations with the same application code, the same database schema and the same database size. How tasty!
The Workload
They used Dell’s DVD Store test application suite, which has been available since about 2005 at http://linux.dell.com/dvdstore/. I have used this workload quite a bit myself, actually. It exhibits a lot of the same characteristics as TPC-C—for what it is worth. By the way, the link I provided works; the one in the whitepapers is faulty. I guess that will be my value add.
The Numbers
Like I said, forget about the comparison to Sun. I say look at the comparison of clustered versus non-clustered Oracle. I’ll let you read the papers for the full nitty-gritty, but the summary is worth a lengthy discussion:
Configuration        Cost        Throughput (Orders/Minute)
Dell 6850            $185,747    32,264
Dell 2950 Cluster    $266,852    22,169
Remarkable, and remember, all the important aspects of this sort of test were constant between the two. By important I mean the application, database schema, database size, and storage.
Highly Available
Yes, the Dell 2950 cluster theoretically offers more availability. That is important in the event of a failure, sure, but it delivers 31% less throughput than the 6850 solution when it is fully healthy. The important comparison, I believe, is the 6850 to the “brown-out” effect of running an application on a single surviving node of the 2950 cluster. With only one node surviving in the event of a failure, the 2950 cluster solution would be capable of 11,084 orders per minute—about 66% less throughput than the 6850. I think it breaks down like this: the clustered 2950 solution costs 44% more and performs 31% worse, and in the event of a failure, a surviving 2950 will offer about one third the throughput of a 6850.
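The arithmetic above is easy to check directly from the published figures. A quick sketch (the dollar and orders/minute numbers come from the Dell papers; the 50/50 split of cluster throughput across nodes is a simplifying assumption):

```python
# Quick check of the deltas, using the figures from the Dell papers.
cost_6850, tput_6850 = 185_747, 32_264      # single Dell 6850
cost_2950, tput_2950 = 266_852, 22_169      # two-node Dell 2950 RAC cluster

cost_premium = cost_2950 / cost_6850 - 1        # how much more the cluster costs
healthy_deficit = 1 - tput_2950 / tput_6850     # cluster shortfall when fully healthy
surviving_node = tput_2950 / 2                  # assume a 50/50 split across nodes
brownout_deficit = 1 - surviving_node / tput_6850

print(f"cluster cost premium:  {cost_premium:.0%}")     # ~44%
print(f"healthy-state deficit: {healthy_deficit:.0%}")  # ~31%
print(f"surviving-node throughput: {surviving_node:,.0f} orders/min")  # ~11,084
print(f"brown-out deficit:     {brownout_deficit:.0%}") # ~66%
```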
Kevin, very nice catch. Thanks for this post.
One of your assumptions might be wrong when estimating throughput for a single surviving node of the 2850 cluster. You assumed that a single node will be able to process only 50% of what a two-node RAC cluster can (22,169/2), which is probably not true in practice.
Put it the other way around – you don’t usually double your throughput by going from single-node RAC (i.e., practically non-RAC) to two-node RAC. If you are lucky you can get a 50% increase, provided you are CPU bound. Dell’s DVD Store application may be different, but my experience with real-life applications suggests that moving to two-node RAC requires an additional 70% of CPU capacity to sustain the same throughput.
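Alex’s point can be sketched numerically. The 85% scaling-efficiency value below is an illustrative assumption in the spirit of his anecdotal extra-capacity rule, not a number from the Dell papers:

```python
# If two-node RAC scales at less than 100%, the lone survivor of a node
# failure actually recovers some of the cross-node overhead, so it can do
# better than half the cluster's throughput.
def surviving_node_throughput(cluster_tput, scaling_efficiency=1.0):
    """Estimate one node's throughput after the other node fails.

    scaling_efficiency = 1.0 means the cluster scaled perfectly (each node
    carried exactly half the load); lower values mean the survivor sheds
    interconnect/coherency overhead and does proportionally better.
    """
    per_node_share = cluster_tput / 2
    return per_node_share / scaling_efficiency

cluster_tput = 22_169  # two-node 2950 cluster, orders/minute
print(surviving_node_throughput(cluster_tput))        # naive 50/50 split: 11084.5
print(surviving_node_throughput(cluster_tput, 0.85))  # assuming 85% scaling: ~13,041
```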
Alex,
You know and I know that RAC doesn’t usually scale 100%…I didn’t want to get into that…I get in enough political trouble as it is 🙂 Concentrate on the delta between the cluster and the 6850…that is the real gotcha…
As always, thanks for stopping by
The question I have is why would Oracle Corp. allow this white paper to be published? It doesn’t really say anything good about RAC. Maybe that is exactly what Dell wanted. Isn’t Dell using Oracle Apps internally? Maybe they have some problems and they think that Oracle Corp. is not really supporting them. This paper may be a way to put pressure on Oracle.
Oracle has been really hard on people who say bad (and often true) things about RAC. Why haven’t they succeeded in convincing Dell to remove the papers? (Well, maybe that is the reason the links didn’t work :))
Anjo,
Which links don’t work? I just walked through this blog entry and tested each URL, and they all seem to work.
Isn’t it also possible that Dell is always looking for a way to make Linux look bad wrt Windows? Is it just coincidence that they didn’t make the Windows system the clustered one? Dell has always been one of the most loyal remoras hanging around the various orifices of the Shark Of Redmond, after all…
yfeefy,
No. List price is not subjective. Also, their goal was to make Sun look bad, so they put their best into both the Linux and Windows tests.
Why wasn’t the Windows test setup clustered? Well, think about it. They used a non-clustered 6850 to stomp both a clustered 2950 and a Sun V490. The only thing a clustered 6850 setup would show is even more cost (MUCH more cost) and a heavy-handed slam of the other two configurations.
I don’t think there is any sleight of hand here on Dell’s part. I just think their Technical Marketing guys didn’t realize that they were actually releasing an apples-to-apples comparison of RAC versus non-RAC. Oh, and that sticky bit about using Solaris in violation of the EULA…but, what the heck, Sun doesn’t care.
From the production experience that we have (figures are approximate, but they give some trends):
15 months running in production
Red Hat 3.0 – 10.0.1.4 in a two-node RAC – IBM HS40, 2×2 processors, 2×4 GB RAM, and a FAStT600 disk array.
The workload is about 250 users (up to 1,000 Oracle connections, due to a mix of Windows applications).
With the two nodes running: 50–60% CPU. Users are not fully happy with response times. Few internode data cache transfers. (15 years of experience with single-instance Oracle, but I consider that we are still rookies with RAC, at one year in.)
Many commands are different from a single instance; take into consideration the effort of training, beginning with the installation.
Running a failover system in production (this is the target) means running something :
– robust (fails over and recovers on its own)
– simple to understand
– simple to manage, because in production, many people are involved over the years…
I hope that, in the future, Oracle will be able to meet these goals; the technology is not new (acquired along with Rdb in the 1990s…) and was probably better on VMS 10 years ago.
Regards.
Kevin,
is there a little typo in your story?
You start by mentioning the Dell 2950 cluster but later you refer to a 2850 cluster… My guess is it’s a minor typo.
Your apples-to-apples comparison is not even close.
The RAC setup has 3.0 GHz processors; the non-RAC has 3.4 GHz (a 10% difference).
The RAC setup has only 2MB of L3 cache; the non-RAC has 16MB. It is difficult to quantify this delta without a better understanding of the workload.
Regards,
Bryon
Byron,
The workload is fully available for you to better understand. I provided the URL to it. This is an Oracle workload and if you could increase the L3 cache in a 2950 to 16MB (to match the 6850) it would not bring the performance up 45% (to match the single node 6850). Cache does not benefit Oracle to that degree since the code and data footprint blows away the cache anyway. Oracle is a load and store workload.
However, I’m just pointing out that there are several differences between the two system which can be quantified. I don’t have all of the data to do that quantifying but I can fill in some holes if you’d like.
It has been my experience that installing RAC software on a system will give you about a 10-15% hit in performance due to the overhead of the additional RAC code path.
Giving simple estimates:
- 10% for CPU speed difference (3.0 GHz vs. 3.4 GHz)
- 20% for FSB/memory speed difference (667 MHz FBDs vs. 800 MHz DDR2)
- 15% for RAC vs. non-RAC overhead
All in all the numbers are indeed interesting.
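Composed multiplicatively (a simplifying assumption; these factors are not truly independent), Bryon’s per-factor estimates land reasonably close to the observed ratio from the papers:

```python
# Treat each estimated penalty as an independent multiplicative factor.
cpu_clock   = 1 - 0.10  # 3.0 GHz vs. 3.4 GHz
memory_path = 1 - 0.20  # 667 MHz FBDs vs. 800 MHz DDR2
rac_code    = 1 - 0.15  # RAC code-path overhead

combined = cpu_clock * memory_path * rac_code
observed = 22_169 / 32_264  # cluster vs. 6850 orders/minute from the papers

print(f"combined estimate: {combined:.3f}")  # 0.612
print(f"observed ratio:    {observed:.3f}")  # 0.687
```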
Thanks,
Bryon
I have to agree with Byron. IMHO, the size of level 2 and 3 cache IS extremely important in Oracle performance. I have only anecdotal evidence:
When Oracle, Intel, and Dell were working on the “Mega Grid” project at Oracle’s Austin data center, they performed comparison tests on Linux using Intel’s 64-bit Itanium processors and Intel’s 32-bit Xeon processors. The performance for the Itanium processor was twice as high. After analysis, Intel concluded that this was largely not due to the difference in architecture or the 32-bit vs. 64-bit comparison, but due to the difference in cache on the processors. The Itanium had over twice as much level 2 and level 3 cache as the Xeon processor.
Another bit of anecdotal evidence comes from Dell. I have talked with Dell engineers, who tell me that the performance advantage of Itanium processors over EM64T began to erode as the EM64T offered cache equivalent to the Itanium. Now, you can get up to 16MB of cache on the EM64T processor, and Dell no longer sells the Itanium processor. Draw your own conclusions.
Larry,
Good input. I believe nothing I hear and only half of what I see, however. So I would have to see with my own eyes how anyone could finger processor cache when accounting for the performance delta between two such entirely different beasts–32-bit Xeon and IA64. The only thing that would make it stranger would be to hear that it was Oracle9i versus Oracle10g. I’m not saying that you are wrong, but that just smells fishy. The only way to compare processor cache effectiveness is to change NOTHING other than the cache size–same processor, memory, OS, Oracle bits. I had the luxury of doing a lot of that with the Sequent port of Oracle (a long time ago, I know).
Did the Intel guys bother to mention the difference in memory latency on the Itanium versus the Xeon in that particular case? Oracle’s footprint mashes the cache so it all boils down to memory latency.
And yes, Dell no longer sells Itanium. Isn’t it odd that about the only thing Oracle uses for TPC-C is Itanium? I need to go dig, but I can’t recall the last x86 Oracle TPC-C result.
Kevin,
I’ve never bought into the idea that RAC is a good option for scalability. Your observations from the Dell papers seem intuitive and obvious. IMO, aggressive censorship, by Oracle, seems to be the only believable explanation as to why there are not more examples of this.
In 2003, the company I work for accepted a good sales pitch and bought RAC. The reason was purely the (advertised) simplicity of recovery from a failure. For us, owning RAC means owning database servers that are always underutilized. Since then, the extra costs and complexities of Vx SFRAC (or any clustering software), combined with the extra demands on SA and DBA expertise, have been quite expensive.
Why put such a high load on a set of servers, when
– there is no room to sustain that load in the event of failure and
– a smaller number of bigger servers seems to be easier and cheaper?
What is your take on using RAC primarily for availability?
-paul
What is your take on using RAC primarily for availability?
-paul
…sounds like a good idea to me.
Ok, so, using RAC for scalability is:
– actively advertised by Oracle, for whatever reason.
– probably not the best (or even a good) choice
AND
using RAC for availability (assuming there is human expertise and unutilized hardware resources to cover losses) is a good idea.
Maybe, that’s a bit altruistic and naive, but it is my opinion and I get the idea that you think so, too. Yes?
-paul
ps: I’ve been enjoying the reading on your blog. It jogs memories of the days when we had two Symmetry 2000s, named “right” and “left”, because the operators could easily figure out where to put the tape. From there we went through 3 NUMA-Qs and now we are on Sun. Apparently, before my time, you came here (American TV) to help with problems with OPS v7.0 (!!). 7.0 OPS, wow, talk about iron men and wooden ships. top2, nice job.
-paul
Hello PJ,
Yes, I did come to Madison (American TV) to troubleshoot OPS problems on the right/left cluster…and, yes, it was 7.0.12 in production 🙂
Wow, that was a long time ago.
I’m glad you are enjoying the blog.
I think RAC for scalability depends on the application. If you go to the papers section on my blog you’ll see what sort of testing I’ve done to prove (at least to myself if nobody else) that RAC does scale OLTP. Further, RAC scales parallel query nearly for free (that is, it pretty much just works). The problem is that I know of folks who cobble 2 way servers into, say, 2 node RAC clusters (total of 4 CPUs) and say they are doing so for scalability. Such a config would be more for availability because you’d have to work long and hard to find a workload that scaled to 4 CPUs separated by a RAC interconnect as well as a 4 way SMP.
Kevin,
Thanks for clarifying your rationale for the commentary about the Dell tests. Your answer makes perfect sense.
Amtv has grown quite a bit since the “days of old”. Having survived 9 Black Fridays, the biggest OLTP load of the whole year, I can attest that RAC has not caused any performance problems.
Lately, turnover in our sys admin function has reaffirmed (in spades) the importance of:
a) in depth knowledge of the environment. You are right, “dba’s who don’t know clustering, don’t know RAC”.
b) choosing rac for the right reasons. “Fools rush in where angels fear to tread”.
-paul
Kevin-
I recently learned of your blog and your interest in these two papers, of which I am a co-author, along with Todd Muirhead.
First of all, thanks for picking up the bad URL for our DVD Store database test workload. The papers show http://www.linux.dell.com/dvdstore but the correct URL as you point out doesn’t contain the www. We have no idea how that bug got past our quality control team. (Just kidding, Todd and I don’t have a quality control team, just a few internal reviewers and, of course, our Dell Legal team).
I think enough people have already responded to your main point, that these two papers constitute an “apples to apples” test of RAC vs single node Oracle, but, just to summarize, we purposely chose these two configurations to have different processors, OSes, cache sizes and memory speeds exactly to prevent such a comparison. In our view we don’t care if customers run Oracle on Windows or Linux, single node or RAC, 2 socket or 4 socket, as long as they run it on Dell.
Finally, to counter some of the comments that our papers show a bias against Linux or RAC, let me point out that the Dell Information Technology department runs much of our factories and online business on Oracle RAC on Linux.
In fact, Todd and I have recently published another paper, with two Dell IT guys, on exactly that topic: how Dell IT moved a mission-critical supply chain management application from large Sun systems running single-node Oracle to RAC clusters running on PowerEdge servers. That paper and others are available on our new enterprise wiki at delltechcenter.com. We invite you and your readers to come check out the (growing) collection of Oracle (and other) articles, pointers, blogs, etc.
Dave Jaffe
dave_jaffe@dell.com
Dave,
Thanks for stopping by. I’d like to follow up a bit. First, I’m surprised that although this blog entry has been viewed thousands of times (and now once more since you’ve visited :-)) nobody–not one soul–has pointed out that neither of these papers (pe2950_vs_sunv490.pdf, pe6850_vs_sunv490.pdf) even has the word Oracle in it. Not a single mention. I have waited patiently for someone else to notice that, and when I saw your comment in my moderation queue, I thought for sure the time had come!
You state:
“In our view we don’t care if customers run Oracle on Windows or Linux, single node or RAC, 2 socket or 4 socket, as long as they run it on Dell.”
An excellent position to take! I do see a lot of “religion” wrapped up in all this. At the end of the day, it so happens that Oracle works well enough on both Linux and Windows to make a solid solution. Couple that with such tools as RAC and DataGuard and you have enough on hand to tackle just about any problem. You continued with:
“Finally, to counter some of the comments that our papers show a bias against Linux or RAC, let me point out that the Dell Information Technology department runs much of our factories and online business on Oracle RAC on Linux.”
I was indeed aware that Dell migrated off the Sun servers some time back. As for the comparison in the papers, I was only pointing out that using the same SAN and same workload (DVD) is sufficient for comparing the solution architecture (clustered versus unclustered). After all, does anyone really think that when Oracle is running full-bore there is really that much difference between Windows and Linux overhead? I should hope not.
Hi Kevin,
A little background on our direction for systems running Oracle: as a corporate directive, we are going to get off of SPARC and move to RHEL/x86_64. I am in the process of migrating one of our mission-critical Oracle ERP systems from Solaris/SPARC over to RHEL 6. On the HW side, we are standardizing on Cisco’s UCS blade servers. In one of your past blog posts, you had mentioned that roughly “x” number of CPU cycles are spent on doing an IO. From a capacity planning standpoint, I am trying to figure out: if my application goes from 40k IOPS to 60k IOPS in the next two years, how many x86 CPUs would I need in my RAC farm, or how many nodes would I need?
Thank you for always responding and providing guidance.
Amir
@Amir: So nice of you to stop by. I’m still digging for that post. However, these calculations for CPU/IOP are very easy to do with SLOB, and I’m quite sure today’s servers are much more powerful than most folks know. Much more powerful.
Consider that 128 Sandy Bridge E5-2690 CPUs (an 8-node cluster) in a VCE vBlock Specialty Solution for High Performance Database deliver 4M random 8K IOPS to the SGAs at a rate of nearly 25,000 IOPS per DB CPU second. Consider:
http://wp.me/a21zc-17e
http://wp.me/a21zc-17f
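As a sketch of the capacity math behind Amir’s question: the 25,000 IOPS per DB CPU second figure is the vBlock measurement cited above, while the 65% utilization ceiling is a hypothetical planning assumption; real numbers should come from running SLOB on the actual hardware.

```python
# Rough CPU sizing from an IOPS-per-CPU-second measurement.
def cpus_needed(target_iops, iops_per_cpu_sec, max_utilization=0.65):
    """CPUs required to drive target_iops while staying under max_utilization."""
    return (target_iops / iops_per_cpu_sec) / max_utilization

iops_per_cpu = 25_000  # measured with SLOB on the vBlock configuration above
for iops in (40_000, 60_000):
    print(f"{iops:,} IOPS -> ~{cpus_needed(iops, iops_per_cpu):.1f} CPUs")
```

Even the 60k IOPS target lands at a handful of CPUs, which is the point: modern servers have far more I/O-driving capacity than most sizing exercises assume.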
I think you made a mistake here. Your numbers for orders/min make no sense:
Dell 6850 $185,747 32,264
Dell 2950 Cluster $266,85 222,169
In the paper it’s 22,169.
@David yep, old post with a typo. Sorry. Will fix it.