It is pretty well known that the Oracle database relies quite heavily on gettimeofday(2) for timing everything from I/O calls to latch sleeps; the wait interface is coated with gettimeofday() calls. I've blogged about Oracle's heavy reliance upon gettimeofday(2) before, such as in this entry about DBWR efficiency. In fact, Oracle's gettimeofday() usage is so high that the boutique platforms of yesteryear even went so far as to map the system clock into user space so that a simple CPP macro could be used to read the time, eliminating the function-call overhead and the kernel dive associated with the library routine. Well, it looks like there is relief on the horizon for folks running Linux on AMD. According to this AMD webpage about RDTSCP, there is roughly a 30% reduction in processor overhead for every call when using a gettimeofday() implementation based upon the new RDTSCP instruction in AMD's Socket F compatible processors. The webpage states:
Testing shows vast improvements on RDTSCP-capable CPUs in the time it takes to make gettimeofday() (GTOD)
calls. It takes 324 cycles per call to complete 1 million GTOD calls without RDTSCP, and 221 cycles per call with the capability.
Of course, that would be a kernel-mode reduction in CPU consumption, which is even better for an Oracle database system.
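That sort of measurement is easy to approximate, by the way. What follows is just a back-of-the-envelope sketch of my own, not AMD's actual test harness: it reads the TSC around one million gettimeofday() calls and reports cycles per call. It assumes x86-64 Linux and GCC, and on a CPU with a variable-frequency TSC the numbers are only rough.

/*
 * gtod_bench.c - rough sketch: time one million gettimeofday()
 * calls and report the average cost in cycles per call.
 * Build: gcc -O2 -o gtod_bench gtod_bench.c
 */
#include <stdio.h>
#include <sys/time.h>

#define CALLS 1000000ULL

/* read the time stamp counter */
static inline unsigned long long rdtsc(void)
{
    unsigned int lo, hi;
    __asm__ __volatile__("rdtsc" : "=a"(lo), "=d"(hi));
    return ((unsigned long long)hi << 32) | lo;
}

int main(void)
{
    struct timeval tv;
    unsigned long long i, start, end;

    start = rdtsc();
    for (i = 0; i < CALLS; i++)
        gettimeofday(&tv, NULL);
    end = rdtsc();

    printf("%llu cycles per gettimeofday() call\n",
           (end - start) / CALLS);
    return 0;
}

A box where gettimeofday() is a real trip into the kernel versus one where it is satisfied in user space shows up immediately in the per-call figure.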
I need to get my hands on a Socket F system to see whether the kernel support in RHEL4 U4 and the glibc side of things are set to use this RDTSCP-enabled gettimeofday() right out of the box. If not, it might require the vgettimeofday() routine that is under development. If the latter is true, it will require Oracle to release a patch to make the correct call, but only on AMD. Hmm, porting trickery. Either way, an optimized gettimeofday() can be a nice little boost. I'll be sure to blog about it when I get the information. In the meantime, it is nice to see that folks like AMD are trying to address these pain points.
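Short of having the hardware, it is at least possible to check whether a given processor advertises the instruction. Per AMD's CPUID documentation, RDTSCP support is reported in extended function 0x80000001, EDX bit 27. A minimal probe, again assuming x86-64 and GCC inline assembly:

/*
 * rdtscp_check.c - probe CPUID for the RDTSCP feature bit
 * (extended function 0x80000001, EDX bit 27).
 * Assumes the extended CPUID leaves exist, which is true on
 * any x86-64 part.
 * Build: gcc -O2 -o rdtscp_check rdtscp_check.c
 */
#include <stdio.h>

int main(void)
{
    unsigned int eax = 0x80000001, ebx, ecx, edx;

    __asm__ __volatile__("cpuid"
                         : "=a"(eax), "=b"(ebx), "=c"(ecx), "=d"(edx)
                         : "a"(eax));

    printf("RDTSCP is %s\n",
           (edx & (1u << 27)) ? "available" : "not available");
    return 0;
}

Whether glibc actually uses the instruction once the bit is set is a separate question; that is the kernel/vgettimeofday() side I still need to verify.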
Since Oracle calls gettimeofday() so frequently, and they are so very serious about Linux, I wonder why you are reading this here first?
I too am getting a ton of these gettimeofday() calls when using strace on my DBWR process, and it's showing up as massive System I/O wait in my Grid Control. If I'm getting the reads as in this example, that verifies my AIO is working, but are these gettimeofday() calls a blocking aspect of my I/O subsystem, bottlenecking whatever the DBWR needs? Has anybody found a fix for this yet? Here's what I'm getting in my strace on RHEL 4 U4, 10.2.0.3, ASMLib on a Hitachi SAN; the LUN is RAID 10 (3+3) on 146GB 15k drives.
gettimeofday({1178725123, 526849}, NULL) = 0
read(16, "MSA\2\10P \335i\6"..., 80) = 80
gettimeofday({1178725123, 528090}, NULL) = 0
gettimeofday({1178725123, 528156}, NULL) = 0
gettimeofday({1178725123, 528215}, NULL) = 0
gettimeofday({1178725123, 528274}, NULL) = 0
gettimeofday({1178725123, 528334}, NULL) = 0
gettimeofday({1178725123, 528394}, NULL) = 0
read(16, "MSA\2\10P\222\377\377\377 \335i\6"..., 80) = 80
times(NULL) = 540087496
gettimeofday({1178725123, 528864}, NULL) = 0
read(16, "MSA\2\10P \335i\6"..., 80) = 80
gettimeofday({1178725123, 529040}, NULL) = 0
gettimeofday({1178725123, 529104}, NULL) = 0
gettimeofday({1178725123, 529163}, NULL) = 0
gettimeofday({1178725123, 529222}, NULL) = 0
gettimeofday({1178725123, 529280}, NULL) = 0
gettimeofday({1178725123, 529342}, NULL) = 0
times(NULL) = 540087496
gettimeofday({1178725123, 529568}, NULL) = 0
gettimeofday({1178725123, 529628}, NULL) = 0
gettimeofday({1178725123, 529687}, NULL) = 0
gettimeofday({1178725123, 529745}, NULL) = 0
times(NULL) = 540087496
gettimeofday({1178725123, 529857}, NULL) = 0
gettimeofday({1178725123, 529923}, NULL) = 0
semtimedop(1867777, 0x7fbfffddf0, 548682063376, NULL) = 0
gettimeofday({1178725123, 908480}, NULL) = 0
gettimeofday({1178725123, 908556}, NULL) = 0
gettimeofday({1178725123, 908618}, NULL) = 0
gettimeofday({1178725123, 908680}, NULL) = 0
gettimeofday({1178725123, 908744}, NULL) = 0
times(NULL) = 540087534
read(16, "MSA\2\10P\222\377\377\377 \335i\6"..., 80) = 80
gettimeofday({1178725123, 929319}, NULL) = 0
read(16, "MSA\2\10P\222\377\377\377 \335i\6"..., 80) = 80
read(16, "MSA\2\10P \335i\6"..., 80) = 80
read(16, "MSA\2\10P\222\377\377\377 \335i\6"..., 80) = 80
Lance,
This is an age-old artifact of how the OSDs (the Operating System Dependent code) work. Oracle knows they need to work on the overuse of gettimeofday(); it has been a known problem for a very long time. Fortunately it doesn't really add up to much on normal production systems. It hurts in benchmarking, though. Give it a try with timed statistics turned off.
Well, I'll have to turn off my Automatic Memory Management if I turn timed statistics off, but I'll give it a try. Do you also think this wait is tracked more heavily under System I/O in the newer Grid Control with 10.2.0.3 than in the older Grid Control? I ask because on my new system there is much more System I/O wait (mainly when doing updates or running a process that writes undo/redo) than on my other production systems, and it looks like abnormal behavior as far as Grid goes, whereas I don't see nearly that much System I/O wait on my other 10.1 Grid system. Am I making sense?
Well, I turned timed_statistics off. It did reduce CPU quite a bit, but I am still getting about the same number of gettimeofday() calls in my DBWR strace. Has anyone tried this on Red Hat Enterprise Linux 4 U4?
# create the timedev clock pseudo-device (character special, major 15, minor 0)
mknod /dev/timedev c 15 0
# owner/group read-write, world read
chmod 664 /dev/timedev
Or is this just a RAC and Tru64 thing?
Hi Kevin,
This was a very useful post that you created almost two years ago. I have a couple of questions about it.
(1) Why do we talk about gettimeofday() only for Linux implementations? Is it not a problem on UNIX?
(2) Metalink document #436797.1 mentions that there will be a new timer process in 11gR1, but I could not locate any reference to it either on Metalink or via a Google search.
(3) Any newer updates/references/links on this topic would be great!
Thanks
Regards
Amit