There has been a bit of a spat going on between Jonathan Lewis and Don Burleson. Don’s summary of the topic can be found here. I don’t care about the spat. I want to point out something clearly. Very clearly. But first, a quote from Don:
[…] Lewis failed to mention the hidden parameter _lgwr_io_slaves nor the Metalink note that clearly states that initially, only one LGWR process is started and that multiple LGWR processes will only appear under high activity.
The Oracle docs are very clear that multiple log writer processes exist:
“Prior to Oracle8i you could configure multiple log writers using the LGWR_IO_SLAVES parameter.”
Note: There is a big difference between LGWR I/O Slaves and LGWR. Don continues:
(Note: Starting in 8.1.7, lgwr_io_slaves became a hidden parameter _lgwr_io_slaves, and it remains a hidden parm in Oracle10g).
Further, Metalink note 109582.1 says that multiple log writer I/O slaves were first introduced almost a decade ago, way back in Oracle8:
Folks, there is only ever one LGWR whose job it is to clear the redo buffer. DBWR on the other hand is multi-stated (support for multiple true DBWR processes) and that is due entirely to the NUMA requirements of certain ports from the late 1990s. Although it is possible for a single DBWR to saturate a CPU, and therefore need multiple DBWR processes, that is rare (generally related to pathological latch contention on cache buffers chains).
As long as LGWR and DBWR have asynchronous I/O support—and you are not running on a NUMA-optimized port—there really should never be a reason to configure multiple DBWR processes, nor should LGWR saturate a CPU.
Let me put it this way, there are examples of large high-end systems doing OLTP that don’t need multiple LGWR processes. Here is a system with roughly 7,000 spindles, 40 FC HBAs, 128 cores and 2TB RAM that needed neither multiple LGWR nor DBWR processes. And if you think you have a workload that generates more redo on a per-CPU basis than TPC-C you’re probably wrong.
If I ever hear the term I/O slaves again, I’ll:
Summary
Oracle demonstrates more LGWR scalability than 99.99942% of all production sites would ever need when they run such high end TPC-C benchmarks as the one I cited above. And please, let’s stop the I/O slave madness!
So you are comfortable with NOT using multiple DBWRs
(ie DBWR I/O Slaves) ?
I don’t use multiple DBWR I/O slaves either but keep seeing
“recommendations” to do so, along with claims that multiple DBWR I/O
slaves can be used to “simulate” async I/O. I’m not
very sure how that would really work. Does the “simulation” mean that DBWR0 hands over to DBWR1
the way it would have handed over to the OS’s async I/O?
I noticed that the 10g Admin Guide includes a new parameter
“DB_WRITER_PROCESSES” different from “DBWR_IO_SLAVES”.
What is this new parameter for ?
Hemant
So you are comfortable with NOT using multiple DBWRs
(ie DBWR I/O Slaves) ?
Hemant,
Don’t confuse slaves with writers. Oracle8i introduced true multiple writers. These are not slaves, but writers that tend to their own LRUs, build their own batches, perform their own I/O and post waiters. Slaves are slaves. If you have 1 writer and no async I/O (OS support), you can configure slaves and get a little relief. All told however, let’s be aware that this is 2007 and any platform that matters supports async I/O. Let’s use it.
If you have one writer and async I/O but see free buffer waits, take a look at DBWR’s processor utilization. If it is saturating a CPU, add another DBWR process (DB_WRITER_PROCESSES=2).
Folks: LGWR and DBWR can use slaves for systems that don’t support async I/O. If you use DBWR slaves, you can only have one DBWR. You can only ever have 1 LGWR since that is all the product supports. Having said all that, use async I/O and you will most likely not need multistated background processes (multiple DBWR/LGWR).
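The rules of thumb in this reply can be written down as a small decision sketch. This is only an illustration of the reasoning above, not an Oracle tool: the names `WriterObservation`, `recommend`, and `config_is_consistent` are hypothetical, and the 90% figure used for "saturating a CPU" is an assumed threshold that nothing in this thread specifies.

```python
from dataclasses import dataclass
from enum import Enum


class Advice(Enum):
    USE_IO_SLAVES = "no async I/O: keep one DBWR and configure I/O slaves"
    ADD_DBWR = "add a DBWR process (for example DB_WRITER_PROCESSES=2)"
    KEEP_SINGLE_DBWR = "keep a single DBWR using async I/O"


@dataclass
class WriterObservation:
    async_io_supported: bool   # does the platform offer asynchronous I/O?
    free_buffer_waits: bool    # are sessions seeing free buffer waits?
    dbwr_cpu_pct: float        # CPU utilization of the lone DBWR, 0-100


# Assumption: "saturating a CPU" is modeled as >= 90% utilization.
CPU_SATURATED_PCT = 90.0


def recommend(obs: WriterObservation) -> Advice:
    """Apply the rules of thumb from the reply above, in order."""
    if not obs.async_io_supported:
        # Slaves are the fallback for platforms without async I/O.
        return Advice.USE_IO_SLAVES
    if obs.free_buffer_waits and obs.dbwr_cpu_pct >= CPU_SATURATED_PCT:
        # Only a CPU-bound DBWR justifies a second true writer.
        return Advice.ADD_DBWR
    return Advice.KEEP_SINGLE_DBWR


def config_is_consistent(db_writer_processes: int, dbwr_io_slaves: int) -> bool:
    """Model the rule 'if you use DBWR slaves, you can only have one DBWR'."""
    if db_writer_processes < 1 or dbwr_io_slaves < 0:
        return False
    return dbwr_io_slaves == 0 or db_writer_processes == 1


if __name__ == "__main__":
    busy = WriterObservation(async_io_supported=True,
                             free_buffer_waits=True,
                             dbwr_cpu_pct=98.0)
    print(recommend(busy).value)
    print(config_is_consistent(db_writer_processes=2, dbwr_io_slaves=4))
```

The order of the checks mirrors the argument: async I/O support is the first question, and DBWR CPU utilization only matters once free buffer waits show up.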
My apologies. I started reading your references
to “LGWR IO Slaves” and the paragraph beginning “As long as LGWR and DBWR have asynchronous I/O support ..” so I thought you also meant DBWR I/O slaves (because slaves are for systems that don’t provide async I/O).
You are right. We generally do not need DBWR_IO_SLAVES. We may need DB_WRITER_PROCESSES if the single process is saturating CPU.
I haven’t used either parameter (not being comfortable with DBWR_IO_SLAVES) and have never noticed DBWR saturating CPU (so I hadn’t come across references to DB_WRITER_PROCESSES in search of a solution for a non-existent problem).
Hemant
Folks, there is absolutely no reason to use db_io_slaves anymore. Actually there probably never *really* was a reason, except for some port-specific problems that appeared in the distant past. This was a point-specific fix for a point-specific problem…sorta like the ‘alter system suspend/resume’ command. If I hear one more user say they use this command… I’m gonna barf!
Nitin,
If someone finds themselves on a platform that doesn’t support asynchronous I/O, they might need slaves. Let’s just not run Oracle on any platform that doesn’t support direct I/O…now there is an idea!
Kevin,
I found one exception to your recommendation — ” Multiple DB Writers on modern OSes that support async i/o only when 1 DB Writer is consuming all of the CPU.”
I recall that when I ran my tests on RHEL4 u4 for my paper (see http://www.netapp.com/library/tr/3495.pdf), I found that using multiple DB writers helped some. As I increased them up to the number of cores (8 on that V40z system), performance improved; beyond that it didn’t help. This was an OLTP benchmark.
Perhaps this is indicative of how well async i/o on linux is implemented.
So yes, we’re in 2007 and all modern OSes support async i/o. But perhaps some do it better than others.
I’d love to hear if you have had different experiences on 64bit linux with async i/o and linux kernel nfs.
thanks,
Sanjay Gulabani
http://netappdb.blogspot.com/
Sanjay,
I read that paper some time ago and I don’t recall seeing any figures for IOPS in there. I don’t doubt you when you say you saw benefit from adding multiple DBWR processes. I just don’t find that to be the case generally, that is, unless DBWR processes become CPU-bound. What sort of I/O rate did those tests push?
Peter K. xxxx,
>> You could see why DC got so confused about IO_SLAVES and the LGWR process itself.
The assertion that I may have misunderstood was “the log writer does not have multiple processes”.
They can be seen with a “ps -ef|grep ORA|grep IX” command, and I guess I fail to see how they are not processes.
>> enabled DB to sic his legal representatives for copyright violations.
This is a LIE, Peter K.
I have NEVER threatened ANY BLOGGER for copyright violations. I don’t even go after people who copy my whole web site:
http://www.copyscape.com/view.php?o=56414&u=http%3A%2F%2Fwww.globaldata.ro%2Fconsultanta.html&t=1185055303&s=http%3A%2F%2Fwww.dba-oracle.com&w=50&c=
http://www.copyscape.com/view.php?o=56414&u=http%3A%2F%2Fwww.3rpco.com%2Fservices.asp&t=1185055303&s=http%3A%2F%2Fwww.dba-oracle.com&w=20&c=
“BC is expert at Oracle instance tuning, SQL tuning and server-side tuning for Oracle.”
Why can’t you people play fair, without threatening “smear campaigns” and telling lies?
BTW, my wife made me remove that complaint page, after receiving veiled threats from Jonathan Lewis that “he suspects” that my complaints about him would damage me, if they were not removed. Last time I was smeared, the scumbags felt compelled to drag my wife into the fight.
Peter K. XXX, please retract this LIE.
Don,
Keep it short and sweet here, ok.
>> Keep it short and sweet here, ok.
Ok, how’s this:
Peter K. is a liar.
The nasty nature of these assaults is evident to all, and the attacks have been likened to the attack on Kathy Sierra:
http://tkyte.blogspot.com/2007/03/this-is-really-bad.html
“I have seen many posts in regards to comments made by “Burleson vs Lewis” fanboys that would probably not be much different than those made against Kathy.”
FYI, this is what happened to Kathy:
http://headrush.typepad.com/
Readers:
There were 4 comments on this thread that pertained to the strife between Don Burleson and Jonathan Lewis. Although I have the utmost respect for Jonathan Lewis, I have nothing to do with this war and so I’m removing comments from Peter K and Don Burleson.
This thread is about LGWR, nothing else.
Kevin,
My apologies. You are of course right to do so.
Peter.
For Sanjay, Kevin:
– So is it okay to set db_writer_processes equal to the # of CPUs?
– Or do you monitor DBWR and, if it’s consuming high CPU, add another one, up to the # of CPUs, to offload work from the existing DBWR?
Kevin,
What are your thoughts about log_archive_max_processes?
The default is 2, but the official doc and the MAA best practice white paper recommend setting it to 30. This will create 30 LGWR processes, but I still don’t know how to find out whether these processes are all working in parallel.
I’ll appreciate your inputs..