In Part II of this small series of blog entries on DBWR, a reader made note of the fact that the test I was using suffered log file switch (checkpoint incomplete) wait events and wanted an explanation as to why those were happening. I’d like to answer that and cover a couple of mistakes I made as well.
When I configured the OLTP database for that set of tests I sized the online redo logs small enough to ensure I’d get some log switches into the overall mix as well. My rationale for that move was to take measurements at the varying DBWR counts with a workload that put DBWR under as much duress as possible. To that end, some may consider the test too artificial and honestly I don’t blame them. Afterall, the remedy for log file switch (checkpoint incomplete) is to either make the redo logs larger or address DBWR’s ability to flush buffers. Of course the approach for the latter would be either a faster I/O subsystem or improve DBWR’s ability to do his job-which is what this thread is about. So that begs the question, what are the numbers if redo logs are large enough to rule out checkpoint bottlenecks. Good question. After changing my redo logs to very large sizes I ran another set of tests that lead me to believe that on systems with a small number of processor cores and a normal DBWR workload (e.g., reasonable checkpoints) there is no appreciable difference in throughput when going with as many as 1 DBWR per processor core. The results surprised me…read on…
Admissions of Error
The first of two admissions of error that is. After reading my own blog entries on this thread a couple of times I believe I need to clarify some things. I’ve stated matter-of-factly that over-configuring DBWR processes will impact performance. Well, that is true as my test example showed but is it a completely true statement? No. I met a guy once that couldn’t admit when he was wrong. I didn’t like him much.
Here is the situation: over-configuring DBWR processes is not going to impact performance if your workload isn’t a challenge for DBWR functionality. I have myriads of proof that normal DBWR workloads can be handled by multiple DBWR processes without any problem. It’s more likely the case that if your system has unnecessary multiple writers and you reconfigure to a single writer you wont see any difference. That’s because multiple writers in and of themselves is not a bad thing. And most production systems do not torture the I/O subsystem like I do here. My intent was simply to dig into the topic a bit deeper. That leads me to admission number 2.
Earlier in this series I mentioned the cache effect of multiple writers. Well, I was digging around some audited TPC-C full disclosure reports and noticed the practice of setting db_block_checksum=FALSE. This jarred my memory to the fact that since Oracle9i, DBWR has in fact been peeking at block contents by default. Let me explain. There was a day when DBWR’s accesses to memory differed significantly from that of foreground processes. Those days were pre-Oracle9i and in those releases DBWR didn’t peek at the contents of buffers. Instead, DBWR’s work consisted of figuring out what to write, writing it and performing the necessary housekeeping so that the written buffer would have the correct state. To that end, DBWR processes manipulated much more SGA metadata than did foreground processes. While that is still true to a large degree, Oracle9i changed that situation because DBWR now peeks at buffer contents.
With Oracle9i and later releases, DBWR processes peek at the contents of the buffers to calculate the block checksum and then store the value in the block itself. When re-read, the checksum is calculated by the reading process to ensure the block hadn’t been corrupted after it was flushed from the SGA. Now don’t get me wrong, these buffer reads are nowhere as data intensive as what foreground processes do for a living. Consider, for instance, a block of data manipulated by a foreground process servicing a payroll application. The foreground process exercises myriads of code that DBWR never would (SQL, transaction layer, etc), touches shared resources DBWR never would (shared pool, redo, etc) and most importantly, manipulates the row pieces in the block. When DBWR flushes the buffer with block check summing, it will peek at the contents of the block which will require the memory be installed into DBWR’s processor cache. That wasn’t the case “in the old days.” DBWR used to only manipulate metadata.
With db_block_checksum=FALSE, DBWR doesn’t peek at the contents of the block. When DBWR simply writes a buffer (without peeking into it), the contents of that buffer do not get installed into the processor cache. The page is in DBWR’s address space, but that is not the same as a memory access. Don’t get me wrong, I’m not saying db_block_checksum=TRUE is miserably expensive, because I haven’t tested the difference on modern systems so it I don’t know. To be honest, I wouldn’t run a database without it because it sure seems like good insurance. I would think systems with really large processor caches would see the most performance hit, but that would just be my best guess. After all, most commodity systems have small processor caches that are simply overwhelmed long before any of these concepts come into play. As for huge systems, well, I’ve been unfortunate to have been out of that space for a few years. It’s good to have friends, so I’ll quote the comment my old buddy Glenn Fawcett made on Part I of this series:
I agree… One DBWR process per cpu is not sage advice.
On large NUMA based servers, such as the Sun E25K, Oracle will by *default* create one dbwr process per system board. This allows the DBWR process to search LRUs out of local memory which is very efficient… Kevin could explain the benefits of this far better than I could.
The best advice I could give on “db_writer_processes” is don’t set it AT ALL. Let Oracle figure out the best setting.
Glenn knows really big systems due to his role in Sun’s SAE and other roles.
I think all this was my long-winded way to say that it is good to dive into DBWR topics, but it is best to remember that things like db_block_checksum defaults change from time to time. Further, there should be no panic over having multiple writers and finally, mistakes happen.