In my recent post about LGWR processing, I used a title that fits my recurring Manly Men motif just for fun. The title of that post was Manly Men Only Use Solid State Disk for Redo Logging. The post then set out to show that LGWR performance issues are not always I/O related by digging into what a log file sync wait event really encompasses. Well, I'm learning to choose my words better, the hard way. Let me explain.
The purpose of building the test case for that post with solid state disk (SSD) for the redo log files was to establish that even when redo logging I/O is essentially free, it is quite possible to see long-duration log file sync wait events. I also mentioned that I did one pass of the test with the initialization parameter _disable_logging set to true and achieved the same results as the SSD run. That establishes that SSD for redo logging is as fast as fast can be. The point of the exercise was to show that when the I/O component is not the root cause of long-duration log file sync wait events, it would be foolish to "throw hardware at the problem." What I failed to point out, in that post at least, was that SSD for redo is most certainly a huge win if your system is doing any reasonable amount of redo. The test case only generates about 500 redo writes per second, so it is not a showcase example of when to apply SSD technology to redo. On the other hand, I would not have been able to make such a rich example of LGWR processing without solid state disk.
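To see why cheap redo I/O does not guarantee cheap commits, it helps to compare what foreground sessions wait on (log file sync) against what LGWR actually spends on I/O (log file parallel write). Here is a minimal sketch of that arithmetic; the wait-event figures are invented for illustration and are not taken from my test case.

```python
# Hypothetical wait-event totals, in the style of what v$system_event reports.
# The point: average log file sync can dwarf the actual redo write time even
# when the I/O itself is essentially free.

def avg_wait_ms(total_wait_ms, total_waits):
    """Average wait duration in milliseconds."""
    return total_wait_ms / total_waits

# Invented numbers for illustration only:
log_file_sync_ms, log_file_sync_waits = 120_000, 30_000        # foreground waits
log_file_parallel_write_ms, lgwr_writes = 6_000, 30_000        # LGWR I/O time

sync_avg = avg_wait_ms(log_file_sync_ms, log_file_sync_waits)        # 4.0 ms
write_avg = avg_wait_ms(log_file_parallel_write_ms, lgwr_writes)     # 0.2 ms

# The gap is the non-I/O component of commit latency: post/wait signalling,
# scheduling delays, CPU starvation of LGWR, and so on.
non_io_ms = sync_avg - write_avg
print(f"avg log file sync: {sync_avg:.1f} ms, "
      f"I/O portion: {write_avg:.1f} ms, "
      f"non-I/O portion: {non_io_ms:.1f} ms")
```

With numbers like these, 95% of the commit wait is not I/O at all, which is exactly the situation where throwing faster storage at the problem buys nothing.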
Biting The Hand That Feeds You
The good folks at Texas Memory Systems were nice enough to loan me the solid state disk that I happened to use in the LGWR processing post. I'm sad to report that, judging by email I received on the topic, their reading of the LGWR processing post left the impression that I do not promote the use of solid state disk for redo. Nothing could be further from the truth. The problem is that I am still working on the tests for my latest proof case for solid state using a much higher-end configuration. Regular readers of this blog know that I have consistently promoted solid state disk for high-end redo logging situations. I have been doing so since about 2000, when I got my first solid state disk, an Imperial MegaRam. That technology could not even hold a candle to the Texas Memory Systems device I have in my lab now, though.
In a comment on that post, Noons wrote: "Folks should forget SSDs for logs: they are good to reduce seek and rotational delays, that is NOT the problem in 99% of the cases with redo logs. They might be much better off putting the indexes in the SSDs!"
Noons is correct to point out that rotation and seek are generally not a problem with LGWR writes to traditional round-brown spinning thingies, but I'd like to take it a bit further. While it is true that, given a great deal of tuning and knowledge of the workload, it is possible to do a tremendous amount of redo logging to traditional disks, that is generally only possible in isolated test situations. The perfect case in point is the high-end TPC-C results Oracle achieves on SMP hardware. The TPC-C workload generates a lot of redo, and yet Oracle manages to achieve results in the millions of TpmC without solid state disk. That is because there are not as many variables in the TPC-C workload as there are in production ERP environments. And there is one other dirty little secret: log switches.
Real Life Logging. Don’t Forget Log Switches.
Neither the test case I used for the LGWR processing post nor any audited Oracle TPC-C run is conducted with the database in archivelog mode. What? That's right. The TPC-C specification stipulates that the total system price include disk capacity for a specific number of days' worth of transaction logging, but it is not required that the logs actually be kept, so the databases are never set up in archivelog mode.
One of the most costly aspects of redo logging is not LGWR’s log file parallel write, but instead the cost of ARCH spooling the inactive redo log to the archive log destination. When Oracle performs a log switch, LGWR and ARCH battle for bandwidth to the redo log files.
Noons points out in his comment that rotation and seek are not a problem for LGWR writes, which is generally true. However, all too often folks set up their redo logs on a single LUN. And although that single LUN may have many platters under it, LGWR and ARCH going head to head, performing sequential writes and large sequential reads against the same spindles, can introduce performance bottlenecks, and significant ones at that. To make matters worse, it is very common to have many databases stored on one SAN array. Who is paying attention to religiously carve out LUNs for one database or the other based on logging requirements (both LGWR and ARCH)? Usually nobody.
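The squeeze at log switch time is simple arithmetic. Here is a deliberately crude back-of-envelope model; every bandwidth figure is hypothetical, and the 50% interleaving penalty is an assumption standing in for the head movement two competing sequential streams inflict on each other.

```python
# Crude model of LGWR/ARCH contention on a shared LUN during a log switch.
# All figures are hypothetical, chosen only to illustrate the squeeze.

SEQ_BW_MB_S = 100.0        # single-stream sequential bandwidth of the LUN
INTERLEAVE_PENALTY = 0.5   # assumed: two competing sequential streams seek
                           # against each other, halving combined throughput

lgwr_demand = 40.0   # MB/s of redo the workload generates
arch_demand = 40.0   # MB/s ARCH needs to spool the inactive log

# Outside a log switch, LGWR alone enjoys the full sequential bandwidth
# and has plenty of headroom:
assert lgwr_demand < SEQ_BW_MB_S

# During a log switch, LGWR and ARCH share the degraded combined bandwidth:
combined_bw = SEQ_BW_MB_S * INTERLEAVE_PENALTY
shortfall = (lgwr_demand + arch_demand) - combined_bw
print(f"combined demand {lgwr_demand + arch_demand:.0f} MB/s vs "
      f"{combined_bw:.0f} MB/s available -> shortfall {shortfall:.0f} MB/s")
```

A LUN that looked comfortably over-provisioned for LGWR alone ends up 30 MB/s short the moment ARCH joins in, and the foregrounds feel it as stalled commits.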
The thing I like most about solid state for redo logging is that it neutralizes concern for both sides of the redo equation (LGWR and ARCH), and does so regardless of how many databases you have. Moreover, if you log on solid state from the start, you don't have to worry about which databases will become logging-critical, because all of them get zero-cost LGWR writes and log switches.
Storage Level Caches
Simply put, redo logging wreaks havoc on SAN and NAS caches. Think about it this way: a cache is important for revisiting data. Although DBAs care for redo logs and archived redo logs with religious fervor, very few ever want to revisit that data, at least not for the sake of rolling a database forward after a restore. So if redo logs are just a sort of necessary evil, why allow redo I/O to punish your storage cache? I realize that some storage cache technology out there supports tuning to omit the caching of sequential reads or writes and other such tricks, but folks, how much effort can you put into this? After all, we surely don't want to eliminate all sequential read and write caching just because the reads and writes LGWR and ARCH perform are trashing the storage cache. And even if such tuning could be done on a per-LUN basis, does that really make things much simpler? I don't think so. The point I'm trying to make is that if you want to eliminate all facets of redo logging as a bottleneck, while offloading the storage caches, solid state redo is the way to go.
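The cache-trashing effect is easy to demonstrate with a toy model. The sketch below runs a tiny LRU cache twice: once serving only a hot working set that fits comfortably, and once with a write-once, read-never redo stream flooding through. The cache sizes, block names, and 4:1 redo ratio are all invented for illustration.

```python
from collections import OrderedDict

# Toy LRU cache: enough to show once-written, never-revisited redo blocks
# evicting genuinely reusable data blocks. Purely illustrative.
class LRUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.data = OrderedDict()
        self.hits = self.misses = 0

    def access(self, block):
        if block in self.data:
            self.hits += 1
            self.data.move_to_end(block)       # mark most recently used
        else:
            self.misses += 1
            self.data[block] = True
            if len(self.data) > self.capacity:
                self.data.popitem(last=False)  # evict least recently used

def hit_rate(cache):
    return cache.hits / (cache.hits + cache.misses)

hot_blocks = [f"data{i}" for i in range(64)]   # working set fits in the cache

# Run 1: only the hot working set. After the first pass, every access hits.
quiet = LRUCache(capacity=128)
for _ in range(10):
    for b in hot_blocks:
        quiet.access(b)

# Run 2: same workload, but each data access drags 4 unique redo blocks
# through the cache. The redo stream pushes the hot set out between passes.
noisy = LRUCache(capacity=128)
redo_seq = 0
for _ in range(10):
    for b in hot_blocks:
        noisy.access(b)
        for _ in range(4):
            noisy.access(f"redo{redo_seq}")
            redo_seq += 1

print(f"hit rate without redo traffic: {hit_rate(quiet):.2f}")
print(f"hit rate with redo traffic:    {hit_rate(noisy):.2f}")
```

In this toy setup the hit rate collapses from 90% to 0%: the redo blocks are never re-read, yet they evict every block that would have been. That is the sense in which redo I/O "punishes" a shared storage cache.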
No, I don’t throw partners under the bus. I just can’t do total brain dumps in every blog entry and yes, I like catchy blog titles. I just hope that people read the blog entry too.
Hey, I’m new to this blogging stuff after all.