Oracle Over NFS. I Need More Monitoring Tools? A Bonded NIC Roller Coaster.

As you can tell from my NFS-related blog entries, I am an advocate of Oracle over NFS. Forget those expensive FC switches and HBAs in every Oracle Database server. That is just a waste. Oracle11g will make that point even more clearly soon enough. I’ll start sharing how and why as soon as I am permitted. In the meantime…

Oracle over NFS requires bonded NICs for redundant data paths and performance. That is an unfortunate requirement that Oracle10g is saddled with. And, no, I’m not going to blog about such terms as IEEE 802.3ad, PAgP, LACP, balance-tlb, balance-alb or even balance-rr. The days are numbered for those terms, at least in the Oracle database world. I’m not going to hint any further about that though.
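
For anyone who does have to wire up bonding by hand in the meantime, it need not be exotic. Here is a minimal sketch of a RHEL-style balance-alb setup; the mode, interface names, and addresses are examples only, not a recommendation:

# /etc/modprobe.conf -- load the bonding driver for bond0
alias bond0 bonding
options bond0 mode=balance-alb miimon=100

# /etc/sysconfig/network-scripts/ifcfg-bond0
DEVICE=bond0
IPADDR=192.168.1.10
NETMASK=255.255.255.0
ONBOOT=yes
BOOTPROTO=none

# /etc/sysconfig/network-scripts/ifcfg-eth2 (likewise for eth3)
DEVICE=eth2
MASTER=bond0
SLAVE=yes
ONBOOT=yes
BOOTPROTO=none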

Monitoring Oracle over NFS
If you are using Oracle over NFS, there are a few network monitoring tools out there. I don’t like any of them. Let’s see, there’s akk@da and Andrisoft WanGuard. But don’t forget Anue and Aurora, Aware, BasicState, CommandCenter NOC, David, Dummmynet, GFI LANguard, Gomez, GroundWork, Hyperic HQ, IMMonitor, Jiploo, Monolith, moods, Network Weathermap, OidView, Pandetix, Pingdom, Pingwy, skipole-monitor, SMARTHawk, Smarts, WAPT, WFilter, XRate1, arping, Axence NetVision, BBMonitor, Cacti, CSchmidt collection, Cymphonix Network Composer, Darkstat, Etherape, EZ-NOC, Eye-on Bandwidth, Gigamon University, IPTraf, Jnettop, LITHIUM, mrtg-ping-probe, NetMRG, NetworkActiv Scanner, NimTech, NPAD, Nsauditor, Nuttcp, OpenSMART, Pandora FMS, PIAFCTM, Plab, PolyMon, PSentry, Rider, RSP, Pktstat, SecureMyCompany, SftpDrive, SNM, SpeedTest, SpiceWorks, Sysmon, TruePath, Unbrowse, Unsniff, WatchMouse, Webalizer, Web Server Stress Tool, Zenoss, Advanced HostMonitor, Alvias, Airwave, AppMonitor, BitTorrent, bulk, BWCTL, Caligare Flow Inspector, Cittio, ClearSight, Distinct Network Monitor, EM7, EZMgt, GigaMon, Host Grapher II, HPN-SSH, Javvin Packet Analyzer, Just-ping, LinkRank, MoSSHe, mturoute, N-able OnDemand, Netcool, netdisco, Netflow Monitor, NetQoS, Pathneck, OWAMP, PingER, RANCID, Scamper, SCAMPI, Simple Infrastructure Capacity Monitor, Spirent, SiteMonitor, STC, SwitchMonitor, SysUpTime, TansuTCP, thrulay, Torrus, Tstat, VSS Monitoring, WebWatchBot, WildPackets, WWW Resources for Communications & Networking Technologies, ZoneRanger, ABwE, ActivXpets, AdventNet Web NMS, Analyse It, Argus, Big Sister, CyberGauge, eGInnovations, Internet Detective, Intellipool Network Monitor, JFF Network Management System, LANsurveyor, LANWatch, LoriotPro, MonitorIT, Nagios, NetIntercept, NetMon, NetStatus, Network Diagnostic Tool, Network Performance Advisor, NimBUS, NPS, Network Probe, NetworksA-OK, NetStat Live, Open NerveCenter, OPENXTRA, Packeteer, PacketStorm, Packetyzer, PathChirp, Integrien, Sniff’em, Spong, StableNet PME, TBIT, Tcptraceroute, Tping, Trafd, Trafshow, TrapBlaster, Traceroute-nanog, Ultra Network Sniffer, Vivere Networks, ANL Web100 Network Configuration Tester, Anritsu, aslookup, AlertCenter, Alertra, AlertSite, Analyse-it, bbcp, BestFit, Bro, Chariot, CommView, Crypto-PAn, elkMonitor, DotCom, Easy Service Monitor, Etherpeek, Fidelia, Finisar, Fpinger, GDChart, HipLinkXS, ipMonitor, LANExplorer, LinkFerret, LogisoftAR, MGEN, Netarx, NetCrunch, NetDetector, NetGeo, NEPM, NetReality, NIST Net, NLANR AAD, NMIS, OpenNMS PageREnterprise, PastMon, Pathprobe, remstats, RIPmon, RFT, ROMmon, RUDE, Silverback, SmokePing, Snuffle, SysOrb, Telchemy, TCPTune, TCPurify, UDPmon, WebAttack, Zabbix, AdventNet SNMP API, Alchemy Network Monitor, Anasil analyzer, Argent, Autobuf, Bing, Clink, DSLReports, Firehose, GeoBoy, PacketBoy, Internet Control Portal, Internet Periscope, ISDNwatch, Metrica/NPR, Mon, NetPredict, NetTest, Nettimer, Net-One-1, Pathrate, RouteView, sFlow, Shunra, Third Watch, Traceping, Trellian, HighTower, WCAT, What’s Up Gold, WS_FTP, Zinger, Analyzer, bbftp, Big Brother, Bronc, Cricket, EdgeScape, Ethereal (now renamed Wireshark), gen_send/gen_recv, GSIFTP, Gtrace, Holistix, InMon, NcFTP, Natas, NetAlly, NetScout, Network Simulator, Ntop, PingGraph, PingPlotter, Pipechar, RRD, Sniffer, Snoop, StatScope, Synack, View2000, VisualPulse, WinPcap, WU-FTPD, WWW performance monitoring, Xplot, Cheops, Ganymede, hping2, Iperf, JetMon, MeasureNet, MatLab, MTR, NeoTrace, Netflow, NetLogger, Network 
health, NextPoint, Nmap, Pchar, Qcheck, SAA, SafeTP, Sniffit, SNMP from UCSD, Sting, ResponseNetworks, Tcpshow, Tcptrace, WinTDS, INS Net Perf Mgmt survey, tcpspray, Mapnet, Keynote, prtraceroute, clflowd, flstats, fping, tcpdpriv, NetMedic, Pathchar, CAIDA Measurement Tool Taxonomy, bprobe & cprobe, mrtg, NetNow, NetraMet, Network Probe Daemon, InterMapper, Lachesis, Optimal Networks and, last but not least, Digex.

Simplicity Please
The networking aspect of Oracle over NFS is the simplest type of networking imaginable. The database server issues I/O to NFS filesystems exported over simple, age-old Class C private networks (192.168.N.N). We have Oracle Statspack to monitor what Oracle is asking of the filesystem. However, if the NFS traffic is being sent over a bonded NIC, monitoring the flow of data is important as well. That is also a simple feat on Linux, since /proc tracks all of that on a per-NIC basis.
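
For the curious, each interface gets a single line in /proc/net/dev; the first field after the colon is cumulative bytes received and the ninth is cumulative bytes transmitted. A quick way to peek at the receive counters (the interface names and counter values here are illustrative):

$ awk -F'[: ]+' '/eth2|eth3/ { print $2, $3 }' /proc/net/dev
eth2 3141592653
eth3 2718281828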

I hacked out a very simple little script to monitor eth2 and eth3 on my system. It isn’t anything special, but it shows some interesting behavior with bonded NICs. The following screen shot shows the simple script executing. A few seconds after starting the script, I executed an Oracle full table scan with Parallel Query in another window. Notice how the /proc data shows that the throughput has peaks and valleys on a per-second basis. The values reported are megabytes per second, so it is apparent that the bursts of I/O are achieving the full bandwidth of the GbE network storage paths, but what’s up with the pulsating action? Is that Oracle, or the network? I can’t tell you just yet. Here is the screen shot nonetheless:

[Screenshot ntput.jpg: the script’s per-second output showing pulsating throughput on eth2 and eth3]

For what it is worth, here is a listing of that silly little script. Its accuracy compensates for its lack of elegance. Cut and paste this on a Linux server and tailor eth[23] to whatever you happen to have.

$ cat ntput.sh
#!/bin/bash
#
# Print per-second receive throughput (MB/s) for two NICs by sampling
# the cumulative byte counters in /proc/net/dev.

function get_data() {
# One line per interface; the first field after the colon is bytes received.
cat /proc/net/dev | egrep "${token1}|${token2}" \
| sed 's/^.*://g' | awk '{ print $1 }' | xargs echo
}

token1=eth2
token2=eth3
INTVL=1

while true
do

# Counters before the interval
set -- `get_data`
b_if1=$1
b_if2=$2

sleep $INTVL

# Counters after the interval
set -- `get_data`
a_if1=$1
a_if2=$2

# Delta in bytes -> MB/s over the interval
echo $a_if1 $b_if1 | awk -v intvl=$INTVL '{
printf("%7.3f\t", (($1 - $2) / 1048576) / intvl) }'
echo $a_if2 $b_if2 | awk -v intvl=$INTVL '{
printf("%7.3f\n", (($1 - $2) / 1048576) / intvl) }'

done
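
The arithmetic is simple: each column is (bytes after minus bytes before) divided by 1048576 and by $INTVL, i.e., receive throughput in MB/s per NIC. Note that it watches only the receive direction, which is the interesting one for the full table scan above since the database server is reading from NFS storage.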

10 Responses to “Oracle Over NFS. I Need More Monitoring Tools? A Bonded NIC Roller Coaster.”


  1. Richard June 5, 2007 at 9:35 pm

    I’m curious. Would you recommend Oracle over NFS over, say, Oracle over iSCSI? Both have very similar hardware requirements, although with iSCSI you can (optionally) offload some of the network overhead onto dedicated HBAs as well (but with NFS you might have less need to).

    I find that an interesting problem because in both cases you’ve got GigE cabling, ASM doing management, et cetera, with just a protocol difference… and the fact that, if I’m reading the docs correctly, Oracle only supports NFS when using a few dedicated filers, whereas quite a lot of boxes support iSCSI now.

    Great site, btw – always educational.

  2. kevinclosson June 5, 2007 at 9:59 pm

    Richard,

    Excellent question. The problem I see with iSCSI is you still have to fiddle with RAW devices. I simply prefer the simplicity of the filesystem model. If you use NFS, you use files. I have blogged before that NFS supports ASM as well, but there is a short list of reasons to use that model.

    Between now and the time that Oracle supplies a real, general purpose cluster filesystem on more than just the Linux platform, we will continually get into these quasi-religious wars about files versus ASM. The day will come, however, that Oracle will indeed provide filesystems and then the dust will settle and you can choose where to put things as it makes sense in your environment. In the meantime, NFS is the only storage option that covers all the bases with simplicity. And, as I allude to, 11g makes the picture even better. That is my way of saying, why not do the simplest thing today and then get a good boost when you adopt 11g. In the end, you’ll look smarter to management than if someone down the line points out that you could have gotten away with much lower cost (FCP versus simple networking)… just a thought…

  3. Jeff June 8, 2007 at 6:25 am

    Why not iSCSI a partition as an attached drive formatted with the native host’s filesystem for storage of remote files? Would that not be the best of both worlds?

  4. Mark Seger December 15, 2007 at 9:45 pm

    Sorry to hear you haven’t found any tools you like, but perhaps you haven’t looked at collectl yet. This is something I wrote years ago and open sourced this past summer. I use it every day, as do a number of people I know. One of my goals is for it to be the one-stop shopping place for everything to do with system performance monitoring, so it can collect data on just about any system component and either log it to a file or display it interactively. Perhaps that’s one of the reasons it has so many switches! When someone finds something it can’t do, I often add another switch to allow it!

    Anyhow, if you’re serious about monitoring check out http://collectl.sourceforge.net/index.html and see if you add it to your list of those tools you don’t like OR start a new list with collectl in it. 😎

    -mark

  5. kevinclosson December 17, 2007 at 1:06 am

    Hi Mark,

    This post is some 6 months old. You have stolen my thunder, er, well, you wrote collectl so I guess yours is the thunder! That is, I was **just about to** make a blog entry about collectl because I do use it and absolutely love it! I guess I better make that entry tomorrow morning.

  6. Fred November 11, 2008 at 3:33 pm

    Kevin,

    Did you ever discover the cause of the pulsating peaks and valleys in your IO? I recently cloned a single-instance 10g database running on NFS into a 2-node RAC database with mounts running off the same NFS servers. When I run an FTS on the RAC database, I see the same pulsating IO that you report (I’m using collectl to monitor it). But if I run the same query on the single-instance, I get no pulsing.

    So I’m wondering if you ever got to the bottom of this. I’m plowing through all your NFS-related postings in the hope of finding something, but maybe you could help me refine my search. Great blog btw! A real treasure trove.

    Thanks.

    -Fred

  7. kevinclosson November 11, 2008 at 5:04 pm

    Hi Fred,

    Sorry, but no, I never got the chance to get to the bottom of that issue. Shortly after, I shifted all my focus to 11g DNFS, which exhibited no such problem. If I had stayed focused on it, however, I would have concentrated on the bonding end more than the Oracle end.

    Is your RAC query utilizing PQO slaves on both instances? My case wasn’t RAC at all, but your situation sounds interesting. Are you trafficking RAC interconnect over the same interfaces? It might be helpful for you to narrow down your scenario. See if the pulsation occurs with a totally “benign” instance of Oracle up. That is, just start the other instance but set DOP specifically not to use that instance. In fact, perhaps you should open the other instance read-only. It would also be interesting to know the scan complete time with and without the pulsating symptoms.

  8. Fred November 11, 2008 at 9:14 pm

    Kevin,

    Thanks for your ideas. I normally wouldn’t be so verbose in a blog comment, but you asked for it 😉

    I ran to check if the interconnect and storage were on the same wire (that could have been a slam-dunk), but alas, IC is eth1 and storage is eth2.

    Here are the facts:

    There are three RHEL servers configured (more or less) the same: srv1, srv2, and srv3. Their storage is served up via NFS from an Isilon. We have tweaked NFS options with no effect.

    srv1 hosts a single-instance database, DEV1.

    srv2 and srv3 are nodes in a RAC cluster. They host DEV2. The RAC database is a clone of DEV1. All are 10gR2.

    Test 1: First, I establish the target performance goal on srv1, which is our “old” dev server, where performance is adequate.

    I flush the buffer cache and run an FTS on DEV1:
    select /*+ FULL(foo) */ count(*) from foo;

    This takes about 20 seconds to complete. According to v$session_longops, it’s chugging away at about 8,000 blocks/second.

    In collectl, KBIn shows about 79,000 per cycle. No dips.

    Test 2: Second, I run the same FTS from DEV2 on srv2. It takes about 4 minutes. v$session_longops reports a dismal 2,000 blocks/sec or less. collectl reports a KBIn pattern of 38571, 74, 40755, 1, and so on. Up and down.

    Another interesting comparison between the collectl runs on srv1 and srv2 is the ctxsw column. There are twice as many context switches going on on srv2 as on srv1! So far I can’t figure out who is doing that. Some clusterware component fighting for priority, perhaps?

    Test 3: We set up a single-instance database (TEST) on srv2 and ran the FTS again. The results were about halfway between the DEV1 and DEV2 tests: 2-minute completion, KBIn alternating between 60,000 and 2, and reading (according to v$session_longops) around 4,000 blocks/sec. Here, context switches were no greater than they were on srv1. I have not yet had a chance to shut down RAC and CRS on srv2 and rerun this test; I suspect that if I did, my non-RAC TEST database might behave like DEV1.

    Test 4: We created a big table on DEV2 in /tmp, which is not on the NFS storage (it’s an ext3 filesystem on an MSA device), and now we get metrics comparable to the DEV1 test. No dips and peaks in the collectl IO stats.

    Summary: RAC alone is not to blame because of Test 4. NFS alone is not to blame because of Test 3. But put NFS and RAC together on this system, and performance collapses.

    I’m theorizing some kind of caching taking place at the clusterware level, but troubleshooting CRS is new territory for me.

    Well, if you have comments I would appreciate them, even if it’s to point out a flaw in my testing.

  9. kevinclosson November 12, 2008 at 3:35 pm

    At the expense of sounding like simple “Certification Police”, I do need to point out the RTCM (URL follows). My understanding of Isilon architecture would make me very surprised to hear it works well with RAC. I’ll send you email, Fred.

    http://www.oracle.com/technology/products/database/clustering/certify/tech_generic_linux_new.html

    The views expressed on this comment are my own and do not necessarily reflect the views of Oracle. The views and opinions expressed by others on this blog are theirs, not mine.


  1. It’s Your Choice: Collectl or Some Odd Collection of Sundry Commands « Kevin Closson’s Oracle Blog: Platform, Storage & Clustering Topics Related to Oracle Databases Trackback on December 18, 2007 at 6:10 pm
