My last installment in the Little Things Doth Crabby Make series had a lot of readers stepping up to remind me that ls(1) and du(1) aren’t always supposed to report the same size-related information on files. Uh, I actually knew that!
The post wasn’t about sparse files or any other such remedial aspects of file sizes.
In the post I mentioned that I was taking some rather unseemly actions against my XFS file system.
One particular unseemly thing I did was the result of a bug in a small piece of my code. Imagine for a moment that the loff_t variable sz in the following snippet was stupidly uninitialized/unassigned and the program steps on this syscall(__NR_fallocate,,,,) landmine.
if ((ret = syscall(__NR_fallocate, fd, 0, (loff_t)0, (loff_t)sz)) != 0 ) perror ("syscall.fallocate");
Well, if whatever happens to be stored in the variable sz is a really large value you’ll have a.out (allocate_file in my case) spinning in kernel mode for the rest of your life (at least on a 2.6.18 kernel). However, I got tired of it shortly after I snapped the following top(1) information:
top - 11:47:27 up 3 days, 17 min, 4 users, load average: 1.00, 1.00, 1.00
Tasks: 481 total, 2 running, 479 sleeping, 0 stopped, 0 zombie
Cpu(s): 0.0%us, 4.2%sy, 0.0%ni, 95.8%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 49451520k total, 4065088k used, 45386432k free, 121492k buffers
Swap: 50339636k total, 1044k used, 50338592k free, 3609352k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
12682 root      25   0  3648  308  248 R 99.7  0.0 880:23.09 allocate_file
 3997 root      15   0 13008 1416  816 R  1.0  0.0  10:25.16 top
10100 gpadmin   15   0  111m  17m 2032 S  1.0  0.0   9:13.49 collectl
    1 root      15   0 10352  692  580 S  0.0  0.0   0:13.40 init
    2 root      RT  -5     0    0    0 S  0.0  0.0   0:00.10 migration/0
    3 root      34  19     0    0    0 S  0.0  0.0   0:00.00 ksoftirqd/0
    4 root      RT  -5     0    0    0 S  0.0  0.0   0:00.00 watchdog/0
    5 root      RT  -5     0    0    0 S  0.0  0.0   0:00.10 migration/1
    6 root      34  19     0    0    0 S  0.0  0.0   0:00.00 ksoftirqd/1
    7 root      RT  -5     0    0    0 S  0.0  0.0   0:00.00 watchdog/1
    8 root      RT  -5     0    0    0 S  0.0  0.0   0:00.21 migration/2
    9 root      34  19     0    0    0 S  0.0  0.0   0:00.08 ksoftirqd/2
   10 root      RT  -5     0    0    0 S  0.0  0.0   0:00.00 watchdog/2
   11 root      RT  -5     0    0    0 S  0.0  0.0   0:04.91 migration/3
   12 root      34  19     0    0    0 S  0.0  0.0   0:00.00 ksoftirqd/3
   13 root      RT  -5     0    0    0 S  0.0  0.0   0:00.00 watchdog/3
   14 root      RT  -5     0    0    0 S  0.0  0.0   0:00.09 migration/4
It turned out my stupid error put the file system up to the task of allocating nearly 14TB to a file in a file system with about 200GB free. My mistake. However, the call should have failed instead of leaving me with a kernel-mode process that required a server reset to clear. But, alas, I was using a very old interface. If the particular test system I was investigating had been running a more recent kernel, I would have called fallocate(2) and the situation would most likely have been different, but the kernel was older than the 2.6.23 minimum requirement for the fallocate(2) call.
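For what it's worth, here is a minimal sketch of the kind of guard that would have kept my grenade from going off: refuse a request that is non-positive or larger than the free space reported by fstatvfs(3) before asking for the allocation. The helper name checked_preallocate is hypothetical, and the sketch assumes a system where posix_fallocate(3) is available, which is not the raw syscall path I was stuck with on that old kernel. The check is only advisory because free space can change between the check and the call.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <sys/statvfs.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from allocate_file): preallocate sz bytes
 * starting at offset 0, but only if the request looks sane.
 * Returns 0 on success, -1 on failure with errno set.
 */
int checked_preallocate(int fd, off_t sz)
{
    struct statvfs vfs;

    /* An uninitialized or garbage size often shows up as <= 0 or huge. */
    if (sz <= 0) {
        errno = EINVAL;
        return -1;
    }

    if (fstatvfs(fd, &vfs) != 0)
        return -1;

    /* Bytes available to unprivileged callers on this file system. */
    unsigned long long avail =
        (unsigned long long)vfs.f_bavail * (unsigned long long)vfs.f_frsize;

    /* Conservative: ignores space the file may already own. */
    if ((unsigned long long)sz > avail) {
        errno = ENOSPC;
        return -1;
    }

    /* posix_fallocate returns an error number rather than setting errno. */
    int err = posix_fallocate(fd, 0, sz);
    if (err != 0) {
        errno = err;
        return -1;
    }
    return 0;
}
```

A caller would open the file, call checked_preallocate(fd, sz), and report the failure with perror(3) if it returns -1.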
So what does this have to do with ls(1) and du(1)? Well, I had a lot of programs running that were thrashing the file system. I unearthed a race condition of some sort where my looping call to ls(1) managed to catch a glimpse of the file being populated by PID 12682 (see the top(1) output above). The ls(1) command reported zero bytes. The next line of the script executed microseconds (or less) later, at which point du(1) was of the opinion the file was 287GB. Both the initial and subsequent df(1) information was consistent. I haven’t studied the transactional nature of this old rendition of fallocate so I can’t speculate what was going on. The only thing executing on the system at the time was, indeed, several invocations of the allocate_file program. It turns out that none of the others branched to that call with an uninitialized grenade, as it were.
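If I were chasing this again, I would take both numbers from a single stat(2) call instead of two separate commands, so that the apparent size (what ls -l shows) and the allocated size (what du shows) come from one snapshot of the inode. The following is only a sketch; size_snapshot is a hypothetical helper name, and the 512-byte unit for st_blocks is the Linux convention.

```c
#include <sys/stat.h>
#include <sys/types.h>

/*
 * Hypothetical helper: report both views of a file's size from one
 * stat(2) call. Returns 0 on success, -1 on failure (errno set by stat).
 */
int size_snapshot(const char *path, long long *apparent, long long *allocated)
{
    struct stat st;

    if (stat(path, &st) != 0)
        return -1;

    /* Logical length of the file: the number ls -l prints. */
    *apparent = (long long)st.st_size;

    /* Space actually allocated: st_blocks counts 512-byte units on Linux. */
    *allocated = (long long)st.st_blocks * 512;

    return 0;
}
```

A file that is mid-preallocation could plausibly show a small apparent size next to a large allocated size here, which would at least rule out the gap between two command invocations as the cause.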
I was unable to reproduce the situation and lost interest after fixing that stupid bug in the allocate_file program.
If there is any moral to this story it would be that the level of unpredictability is unpredictable if a process unpredictably asks the kernel to do something it cannot possibly do, such as allocate terabytes to a file in gigabytes of free space. I would predict, however, that fallocate(2) on a 2.6.23 or later kernel would handle my goofy mistake differently.
I hate it when I can’t reproduce a problem.