This is the third installment in this thread. For context, please see:
Copying Files on Solaris. Slow or Fast, It’s Your Choice. Part I
Copying Files on Solaris. Slow or Fast, It’s Your Choice. Part II
What About cp8M Versus Stock cp(1) with Non-forcedirectio?
That is a good question. The saga continues after my post about copying files on Solaris. Once again, Padraig O’Sullivan was kind enough to test cp8M (available here) versus stock cp(1), this time on a normally mounted UFS (non-forcedirectio). He reports:
Ok, I ran the benchmark in the same manner as before WITHOUT forcedirectio i.e. I rebooted the machine before each copy of the file.
# time /usr/bin/cp large_file large_file.1
# time /usr/bin/cp8m large_file large_file.
I don’t know. I certainly did not expect an increase in kernel-mode cycles for the mmap-enabled cp(1). Please refer to Part I in this series to see that the comparison here is 14.363s versus 10.853s of kernel-mode cycles. We’re not talking about a little increase. No, what Padraig’s measurements show is an increase of 32% in kernel-mode cycles when copying a 1000MB file using stock cp(1) on a regular mount compared to the same work on a file in a forcedirectio mount. But hey, at least the performance (in MB/s) was consistently 16% less than cp8M. Yes, that was sarcasm.
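For anyone who wants to check the 32% figure, here is the arithmetic as a small Python sketch. The two timings are the kernel-mode figures quoted above; the variable names are mine and purely illustrative.

```python
# Kernel-mode (sys) seconds for stock cp(1), as quoted in the text above.
regular_mount_sys = 14.363   # normal UFS mount (this installment)
forcedirectio_sys = 10.853   # forcedirectio mount (Part I)

# Relative increase of the regular mount over the forcedirectio mount.
increase = (regular_mount_sys - forcedirectio_sys) / forcedirectio_sys
print(f"{increase:.1%}")
```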
I haven’t yet gotten my head around why the standard mmap-enabled cp(1) suffers such a jump in kernel mode processor overhead when switching from direct I/O to a normal UFS mount. I need to think about that a bit.
As usual, a picture speaks a thousand words, so I’ll provide two:
Remember my rant about the “small test?”
Sharing, and Caring
There was a comment by a reader on Part I of this blog thread that is worthy of discussion. The reader commented:
Perhaps a fairly obvious statement this, but notice the use of MAP_SHARED on the mmap call? – (I suspect you’ve spotted that already). This means that multiple processes can attach to the same memory mapped file simultaneously.
That is a good blog comment and evidence of someone giving it some thought. But I’d like to comment on the sharable aspect of the 8MB map the reader brought up. I replied:
Your point about the kernel bcopy from UFS read buffers to the heap buffer in the address space of the cp(1) process is a good one, but this is a forcedirectio case.
That means there is no copy from the page cache into the virtual address space of the process since it is direct I/O. My reply continued:
Since this is an Oracle blog, I would naturally go with the forcedirectio comparison first. It will be interesting to see with a normal UFS mount.
I’ve got a $2 bet that the MAP_SHARED is only there to facilitate copying an already mmapped file…the odds of a process jumping in and sharing an 8MB map that only lives for the duration of an I/O in and an I/O out seem pretty slim to me…but then that is 8MB twice…hmmm…I guess that 8MB mmap could exist for as much as 2-3 seconds if the I/O is headed for a single, simple drive. Sounds like a race just to share an 8MB map to me.
A Closer Look
Yes, when the stock cp(1) mmaps each 8MB segment of the input file it does so with MAP_SHARED. Like I said, I suspect the only thought behind that flag usage was to ensure there wouldn’t be “twinkling” mmap failures by other processes that could potentially be mmapping parts of that file while cp(1) is walking through it.
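To make the access pattern concrete, here is a minimal Python sketch of a copy loop that maps the input one 8MB window at a time with MAP_SHARED, writes the window out, and unmaps it. This is not the source of stock cp(1), which is C; the function name `copy_with_mmap` and the `chunk` parameter are my own, and the sketch only illustrates how short-lived each mapping is.

```python
import mmap
import os

CHUNK = 8 * 1024 * 1024  # 8MB window, the segment size discussed above


def copy_with_mmap(src_path, dst_path, chunk=CHUNK):
    """Copy src_path to dst_path by mapping one window of the input at a time.

    chunk must be a multiple of mmap.ALLOCATIONGRANULARITY, because every
    window starts at an offset that is a multiple of chunk.
    """
    copied = 0
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        size = os.fstat(src.fileno()).st_size
        offset = 0
        while offset < size:
            # The last window may be shorter than a full chunk.
            length = min(chunk, size - offset)
            # MAP_SHARED + PROT_READ: a read-only window onto the file's pages.
            window = mmap.mmap(src.fileno(), length,
                               flags=mmap.MAP_SHARED,
                               prot=mmap.PROT_READ,
                               offset=offset)
            try:
                dst.write(window)  # write straight out of the mapping
            finally:
                window.close()     # unmap: the window only lives this long
            copied += length
            offset += length
    return copied
```

Each window exists only between the `mmap.mmap()` call and `window.close()`, which is the transient lifetime the $2 bet above is about.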
The reader’s comment continued:
That’s not to say that they all need to be “cp”’s – anything using mmap() on the same file at about same time will yield a benefit – the 8MB chunk paged in by mmap should only be later reclaimed by the pagescanner (or when the last process detaches?).
I already discussed the odds of another process getting in there and benefiting from that very transient mmap. It is 8MB in size and only valid during the read in and the write out, about 2-3 seconds on a really slow disk subsystem.
What’s this about reclaims? Good topic. When the mmap is dissolved through munmap(), the pages of the file are put on the free list (pagecache). Here is where the non-forcedirectio cp8M and cp(1) have a lot in common. In both cases, the blocks from the input file remain in main memory. Now that is where there is some true opportunity for sharing but only in the non-forcedirectio case. All said, it doesn’t take mmap() to get sharing of file contents being copied when you are using UFS with a normal mount.
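For comparison, here is a minimal Python sketch of the other approach: a plain read/write loop through a single 8MB buffer with no mmap() at all. I am assuming that is roughly what cp8M does, as its name suggests; the real tool is C, and `copy_with_read_write` and `buf_size` are hypothetical names of mine. On a normal mount, the blocks read this way still pass through the page cache, which is why no mapping is needed for other processes to benefit.

```python
import os

BUF_SIZE = 8 * 1024 * 1024  # one 8MB buffer, reused for every transfer


def copy_with_read_write(src_path, dst_path, buf_size=BUF_SIZE):
    """Copy src_path to dst_path with read/write calls through one buffer."""
    copied = 0
    buf = bytearray(buf_size)
    view = memoryview(buf)
    # buffering=0 gives raw file objects, so each readinto()/write()
    # is a single request of up to buf_size bytes.
    with open(src_path, "rb", buffering=0) as src, \
         open(dst_path, "wb", buffering=0) as dst:
        while True:
            n = src.readinto(buf)
            if not n:          # 0 bytes read means end of file
                break
            done = 0
            while done < n:    # a raw write may be partial, so loop
                done += dst.write(view[done:n])
            copied += n
    return copied
```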
So the question remains, what’s up with the mmap()-enabled cp(1)?
Is this thread making you sleepy?