The things I routinely hear from DBAs lead me to believe that they often don’t understand storage. Likewise, the things I hear from Storage Administrators convince me they don’t always know what DBAs and system administrators have to do with those chunks of disk they dole out for Oracle. This is a long blog entry aimed at closing that gap, with a particular slant toward Oracle over NFS. Hey, it is my blog after all.
I also want to clear up some confusion about points I made in a recent blog entry. The confusion was rampant, as my email box will attest, so I clearly need to fix this.
I was catching up on some blog reading the other day when I ran across this post on Nuno Souto’s blog dated March 18, 2006. The blog entry was about how Noons’ datacenter had just taken on some new SAN gear. The gist of the blog entry is that they did a pretty major migration from one set of SAN gear to the other with very limited impact—largely due to apparent 6-Ps-style forethought. Noons speaks highly of the SAN technology they have.
Anyone who participates in the oracle-l email list knows Noons and his important contributions to the list. In short, he knows his stuff—really well. So why am I blogging about this? It dawned on me that my recent post about Manly Men Only Deploy Oracle with Fibre Channel Storage jumped over a lot of groundwork. I assure you all that neither Noons nor the tens of thousands of Oracle shops running Oracle on FCP are the Manly Men I depicted in my blog entry. I’m not trying to suggest that people are fools for using Fibre Channel SANs. Indeed, waiting patiently from about 1997 to about 2001 for the stuff to actually work warrants at least some commitment to the technology. OK, ok, I’m being snarky again. But wait, I do have a point to make.
Deploying Oracle on NAS is Simpler and Cheaper, Isn’t It?
In my blog entry about “Manly Man”, I stated matter-of-factly that it is less expensive to deploy Oracle on NAS using NFS than on SANs. Guess what, I’m right, it is. But I didn’t sufficiently qualify what I was talking about. I produced that blog entry presuming readers would have the collective information of my prior blog posts about Oracle over NFS in mind. That was a weak presumption. No, when someone like Noons says his life is easier with SAN, he means it. Bear in mind his post was comparing SAN to DAS, but no matter. Yes, Fibre Channel SAN was a lifesaver for too many sites to count in the late 90s: for instance, sites that bought into the “server consolidation” play of that era. In those days, people turned off their little mid-range Unix servers with DAS and crammed the workloads into a large SMP. The problem was that eventually the large SMP couldn’t physically attach any more DAS. It turns out that Fibre was needed first and foremost to get large numbers of disks connected to the huge SMPs of the era. That is an entirely different problem from getting large numbers of servers connected to storage.
Put Your Feet in the Concrete
Most people presume that Oracle over NFS must be dramatically slower than Fibre Channel SAN. They presume this because, at face value, the Fibre Channel wires are faster (e.g., 4Gb FCP versus 1Gb Ethernet). True, 4Gb is more bandwidth than 1Gb, but you can have more than one NFS path to storage and the latencies are a wash. I wanted to provide some numbers, so I thought I’d use Network Appliance’s data suggesting that a particular test of 8-way Solaris servers running Oracle OLTP over NFS comes within 21% of what is possible on a SAN. Using someone else’s results was mistake number 1. Folks, 21% degradation for NFS compared to SAN is not a number cast in stone. I just wanted to show that it is not a night-and-day difference, and I wanted to use Network Appliance numbers for validity. I would not be happy with 21% either, and that is good, because the numbers I typically see are not even in that range to start with. I see more like 10%, and that is with 10g. 11g closes the gap nicely.
I’ll be producing data for those results soon enough, but let’s get back to the point. 21% of 8 CPUs’ worth of Oracle licenses would put quite a cost-savings burden on NAS in order to yield a net gain. That is, unless you accept the fact that we are comparing Oracle on NAS versus Oracle on SAN, in which case the Oracle licensing gets cancelled out. And, again, let’s not hang every thought on that 21%-of-8-CPUs performance difference, because it is by no means a constant.
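If you want to play along at home, here is that arithmetic as a trivial Python sketch. The per-CPU license figure is purely my assumption for illustration; plug in whatever you actually pay.

```python
# Back-of-the-envelope math for the "21% of 8 CPUs" argument.
# The per-CPU license price is an assumed round number, not a quote.
cpus = 8                      # size of the server in that NetApp test
gap = 0.21                    # NFS deficit reported in the paper
license_per_cpu = 40_000      # assumption -- plug in your own figure

burden = cpus * gap * license_per_cpu
print(f"{cpus * gap:.2f} CPU licenses ~= ${burden:,.0f}")   # 1.68 ~= $67,200

# The base licensing is the same whether Oracle sits on NAS or SAN, so it
# cancels out; only the gap itself is in play.  At the ~10% I typically see:
print(f"{cpus * 0.10:.2f} CPU licenses ~= ${cpus * 0.10 * license_per_cpu:,.0f}")   # 0.80 ~= $32,000
```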
Snarky Email
After my Manly Man post, a fellow member of the OakTable Network emailed me the viewpoint of their very well-studied Storage Administrator. He calculated the cost of SAN connectivity for a very, very small SAN (using inexpensive 8-port FC switches) and factored in Oracle Enterprise Edition licensing to produce a cost per unit of throughput using the data from that Network Appliance paper—the one with the 21% deficit. That is, he used the numbers at hand (21% degradation), Oracle Enterprise Edition licensing cost, and his definition of a SAN (low connectivity requirements), and did the math correctly. Given those inputs, the case for NAS was pretty weak. To my discredit, I lashed back with the following:
…of course he is right that Oracle licensing is the lion’s share of the cost. Resting on those laurels might enable him to end up the last living SAN admin.
Folks, I know that 21% of 8 is 1.7, and that 1.7 Enterprise Edition licenses can buy a lot of dual-port FCP HBAs and even a midrange Fibre Channel switch, but that is not the point I failed to make. The point I failed to make was that I’m not talking about solving the supposed difficulties of provisioning storage to those one or two remaining refrigerator-sized Legacy Unix boxes you might have. There is no there there. It is not difficult at all to run a few 4Gb FCP wires to separate 8- or 16-port FC switches and then back to the storage array. Even Manly Man can do that. That is not a problem that needs solving, because it is neither difficult nor expensive (at least the SAN aspect isn’t). As the adage goes, a picture is worth a thousand words. The following is a visual of a problem that doesn’t need to be solved—a simple SAN connected to a single server. Ironically, what it depicts is potentially millions of dollars’ worth of server and storage connected with merely thousands of dollars’ worth of Fibre Channel connectivity gear. In case the photo isn’t vivid enough, I’ll point out that on the left is a huge SMP (e.g., HP Superdome) and on the right is an EMC DMX. In the middle is a redundant set of 8-port switches—cheap and simple. Even providing private and public Ethernet connectivity in such a deployment is a breeze, by the way.
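To be fair to the math in that exchange, here is roughly the calculation the Storage Administrator was making, sketched in Python. The host count, switch, HBA, and license prices are my assumptions for illustration, not his numbers.

```python
# A sketch of the small-SAN case that was priced out: a couple of big
# hosts, a redundant pair of 8-port FC switches, a few dual-port HBAs.
# All prices are assumptions for illustration.
hosts = 2
hba_price = 1_500             # assumed dual-port 4Gb FC HBA
fc_switch_price = 5_000       # assumed 8-port FC switch
switches = 2                  # redundant pair

san_connectivity = hosts * hba_price + switches * fc_switch_price
print(f"Small-SAN connectivity: ${san_connectivity:,}")          # $13,000

extra_licenses = 0.21 * 8     # the 21% deficit expressed in CPU licenses
print(f"{extra_licenses:.2f} EE licenses @ an assumed $40,000/CPU "
      f"~= ${extra_licenses * 40_000:,.0f}")                     # $67,200

# At this tiny scale the connectivity savings cannot offset the license
# math, which is exactly why the case for NAS looked weak given those inputs.
```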
I Ain’t Never Doing That Grid Thing.
Simply put, if the only Oracle you have deployed—now and forever—sits in a couple of refrigerator-sized legacy SMP boxes, I’m going to sound like a loon on this topic. I’m talking about provisioning storage to commodity servers—grid computing. Grid may not be where you are today, but it is in fact where you will be someday. Consider the fact that most datacenters are taking their huge machines and chopping them up into little machines with hardware/software virtualization anyway, so we might as well cut to the chase and deploy commodity servers. When we do, we feel the pain of Fibre Channel SAN connectivity and storage provisioning, because connecting large numbers of servers to storage was not exactly a design center for Fibre Channel SAN technology. Just the opposite is true; SANs were originally meant to connect a few servers to a huge number of disks—more than was possible with DAS.
Commodity Computing (Grid) == Huge SAN
Connecting large numbers of servers to a SAN makes the SAN very complex. It is not necessarily a matter of more disks; the presentation and connectivity aspects are what get very difficult to deal with.
If you are unlucky enough to be up to your knees in the storage provisioning, connectivity, and cost nightmare associated with even a moderate number of commodity servers in a SAN environment, you know what I’m talking about. In these types of environments, people are deploying and managing director-class Fibre Channel switches where each port can cost up to $5,000, and they are deploying more than one switch for redundancy’s sake. That is, each commodity server needs a two-port FC HBA and two paths to two different switches. Between the HBAs and the FC switch ports, the cost is as much as $10,000-$12,000 just to connect a “pizza box” to the SAN. That’s the connectivity story, and the provisioning story is not much prettier.
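To see how quickly that per-server figure compounds across a farm, here is a quick sketch. The $5,000 director-class port price is the one quoted above; the HBA and Ethernet prices are my assumptions for illustration.

```python
# SAN attach cost versus plain GbE attach cost across a commodity farm.
# The $5,000 director-class port price comes from the text; the HBA and
# Ethernet prices are assumptions for illustration.
servers = 38                  # e.g., the farm pictured below
fc_port_price = 5_000         # director-class FC switch port
dual_port_hba = 1_500         # assumed dual-port FC HBA
gbe_port_price = 300          # assumed GbE switch port
gbe_nic_price = 100           # assumed dual-port GbE NIC

fc_per_server = 2 * fc_port_price + dual_port_hba    # one port on each of two fabrics
gbe_per_server = 2 * gbe_port_price + gbe_nic_price  # two Ethernet paths

print(f"FC attach:  ${fc_per_server:,}/server, ${servers * fc_per_server:,} for the farm")
print(f"GbE attach: ${gbe_per_server:,}/server, ${servers * gbe_per_server:,} for the farm")
# FC attach:  $11,500/server, $437,000 for the farm
# GbE attach: $700/server, $26,600 for the farm
```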
Once the cabling is done, the Storage Administrator has to zone the switches and provision storage (e.g., create LUNs, set up LUN masking, and so on). For RAC, that would be a minimum of 3 masked LUNs for each database. Then the System Administrator has to make sure Oracle has access to those LUNs. That is a lot of management overhead. NAS, on the other hand, uses very inexpensive NICs and switches. Ah, now there is an interesting point. Using NAS means each server has only one type of network connectivity instead of two (FC and Ethernet). Storage provisioning is also simpler—the database server administrator simply mounts the NFS filesystem and the DBA can go straight to work with RAC or non-RAC Oracle databases. How simple. And yes, the Oracle licensing cost is a constant, so in this paradigm the only way to recoup cost is on the storage connectivity side. The savings are worth consideration, and the simplicity is very difficult to argue with.
It’s time for another picture. The picture below depicts a small commodity server deployment—38 servers that need storage.
Let’s consider the total connectivity problem starting with the constant—Ethernet. Yes, every one of these 38 servers needs both Ethernet and Fibre Channel connectivity. For simplicity, let’s say only 8 of these servers are using RAC. The 8 that host RAC will need a minimum of 4 Gigabit Ethernet NICs/cables each (2 for the public interfaces and 2 for a bonded, private network for Oracle Cache Fusion, i.e., GCS/GES), for a total of 32. The remaining 30 could conceivably do fine with 2 public network connections each, for a subtotal of 60. All told, we have 92 Ethernet paths to deal with before we even look at storage networking.
On the storage side, we’ll need redundant paths for all 38 servers to multiple switches, so we start with 38 dual-port HBAs and 76 front-side Fibre Channel switch ports. Each switch will need a minimum of 2 paths back to storage, but honestly, would anyone try to feed 38 modern commodity servers with 2 4Gb paths’ worth of storage bandwidth? Likely not. On the other hand, it is unlikely the 30 smaller servers will each need dedicated 4Gb I/O bandwidth to storage, so we’ll play zone trickery on the switch and group the 30 into sets of 2, yielding a requirement for 15 back-side I/O paths from each switch, for a subtotal of 30 back-side paths. Following suit, the 8 RAC servers will require 4 back-side paths from each of the two switches, for a subtotal of 8 back-side paths. To sum it up, we have 76 front-side and 38 back-side paths, for a total of 114 storage paths. Yes, I know this can be made a lot simpler by limiting the number of switch-to-storage paths. That’s a game called Which Servers Should We Starve for I/O, and it isn’t fun to play. These arrangements are never attempted with small switches. That’s why the picture depicts large, expensive director-class switches.
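For anyone keeping score, here is the same counting as a trivial Python sketch, using the node counts from the example above.

```python
# Path arithmetic for the hypothetical 38-server farm described above.
rac_nodes, other_nodes = 8, 30

# Ethernet: 4 NICs per RAC node (2 public + 2 bonded private interconnect),
# 2 public NICs for every other server.
ethernet_paths = rac_nodes * 4 + other_nodes * 2          # 32 + 60 = 92

# Storage, front side: one dual-port HBA per server, one port to each of
# the two FC switches.
front_side = (rac_nodes + other_nodes) * 2                # 76

# Storage, back side: the 30 smaller servers are zoned in pairs
# (15 switch-to-storage paths per switch); the RAC nodes get 4 per switch.
back_side = (other_nodes // 2) * 2 + 4 * 2                # 30 + 8 = 38

print(f"Ethernet paths: {ethernet_paths}")                # 92
print(f"Storage paths:  {front_side + back_side}")        # 114
```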
Here’s our mess. We have 92 Ethernet paths and 114 storage paths. How would NAS make this simpler? Well, Ethernet is the constant here, so we simply add more inexpensive Ethernet infrastructure. We still need redundant switches and I/O paths, but Ethernet is cheap and simple, and we are down to a single network topology instead of two. Just add some simple NICs and simple Ethernet switches and go. And oh, by the way, the two-network-topology model (GbE + FCP) generally means two different “owners,” since the SAN would typically be owned by the Storage Group and the Ethernet by the Networking Group. With NAS, all connectivity from the Ethernet switches forward can be owned by the Networking Group, freeing the Storage Group to focus on storage—as opposed to storage networking.
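And just to put a number on what “add more inexpensive Ethernet infrastructure” means, here is a sketch of the NAS version of the same farm. The two bonded storage NICs per server are my assumption, purely for illustration.

```python
# What the same 38-server farm might look like as NAS-only, assuming
# (my assumption, for illustration) two bonded storage NICs per server
# for the NFS traffic, on top of the Ethernet counted above.
servers = 38
existing_ethernet = 92                      # public + RAC interconnect, from above
nfs_storage_paths = servers * 2             # 76 assumed storage NICs/ports

total_ethernet = existing_ethernet + nfs_storage_paths
print(f"Ethernet paths, NAS model: {total_ethernet}")   # 168
print("Fibre Channel paths, NAS model: 0")

# The path count doesn't shrink much.  What changes is that every one of
# those paths is a cheap GbE port on a single network topology with a
# single owner, instead of half of them being $5,000 director ports.
```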
And, yes, Oracle 11g has features that make the connectivity requirements on the Ethernet side even simpler, but 10g environments can benefit from this architecture too.
Not a Sales Pitch
Thus far, this blog entry has been the what. It would make a pretty hollow blog entry if I didn’t at least mention the how. The odds are very slim that your datacenter would be able to do a 100% NAS storage deployment, so Network Appliance handles this by offering multi-protocol storage from their Filers. The devil shall not remain in the details.
Total NAS? Nope. Multi-Protocol Storage.
I’ll be brief. You are going to need both FCP and NAS; I know that. If you have SQL Server (ugh), you certainly aren’t going to connect those servers to NAS. There are other reasons FCP isn’t going to go away soon enough. I accept the fact that both protocols are required in real life. So let’s take a look at multi-protocol storage and how it fits into this thread.
Network Appliance Multi-Protocol Support
A Network Appliance Filer is, at heart, an NFS device. If you want to use it for FCP or iSCSI SAN, large files in the Filer’s filesystem (WAFL) are served up as LUNs over either FCP or iSCSI connectivity. Fine. It works. I don’t like it that much, but it works. In this paradigm, you’d choose to run whichever connectivity type you deem fit. You could run some FCP to a few huge Legacy SMPs, FCP to some servers running SQL Server (ugh), and most importantly Ethernet for NFS to whatever you choose—including Oracle on commodity servers. Multi-protocol storage in this fashion means total vendor lock-in, but it would allow you to choose between the protocols, and it works.
SAN Gateway Multi-Protocol Support
Don’t get rid of your SAN until there is something reasonable to replace it with. How does that statement fit this thread? Well, as I point out in this paper, SAN-NAS gateway devices are worth consideration. Products in this space are the HP Enterprise File Services Clustered Gateway and EMC Celerra. With these devices you leverage your existing SAN by connecting the “NAS Heads” to the SAN using very low-end, simple Fibre Channel SAN connectivity (e.g., small switches, few cables). From there, you can provision NFS mounts to untold numbers of NFS clients—a few, dozens, or hundreds. The mental picture here should be a very small amount of the complex, expensive connectivity (Fibre Channel) and a very large amount of the inexpensive, simple connectivity (Ethernet). What a pleasant mental picture that is. So what’s the multi-protocol angle? Well, since there is a SAN downwind of the NAS gateway, you can still directly cable your remaining Legacy Unix boxes with FCP. You get native FCP storage (unlike NetApp’s blocks-from-a-file approach) for the systems that need it and NAS for the ones that don’t.
I’m an Oracle DBA, What’s in It for Me?
Excellent question, and the answer is simply simplicity! I’m not just talking simplicity, I’m talking simple, simple, simple. And I’m not just talking about simplicity in the database tier, either. As I’ve pointed out umpteen times, NFS will support you from top to bottom—not just the database tier, but all your unstructured data such as software installations as well. Steve Chan chimes in on the simplicity of shared software installs in the E-Biz world too. After the NFS filesystem is mounted, you can put everything there: ORACLE_HOME, APPL_TOP, clusterware files (e.g., the OCR and CSS disks), databases, RMAN, imp/exp, SQL*Loader/External Tables, ETL, compiled PL/SQL, UTL_FILE, BFILE, trace/logging, scripts, and on and on. Without NFS, what sort of mix-and-match of raw, filesystem, and raw+ASM combinations would be required? A complex one—and the really ironic part is you’d probably still end up with some NFS mounts in addition to all that raw disk and non-CFS filesystem space as well!
Whew. That was a long blog entry.