Yes, File Systems Still Need To Support Concurrent Writes! Yet Another Look At XFS Versus EXT4.

My post entitled "File Systems For A Database? Choose One That Couples Direct I/O and Concurrent I/O. What’s This Have To Do With NFS? Harken Back 5.2 Years To Find Out" has not been an incredibly popular post by way of page views (it has averaged about 10 per day for the last six months), but it has generated some email from readers asking about EXT4.

I’ve been putting off the topic but it is fresh on my mind.

Today I put out a quick tweet about concurrent writes on EXT4 (https://twitter.com/kevinclosson/status/177111985790525440) that started a small tweet-thread by others looking for clarification. This blog entry aims to clarify my point about concurrent writes on EXT4 compared to XFS. As an aside, if you have not read the above-referenced blog post, and you are interested in concurrent writes and how the topic pertains to several file systems including NFS, I recommend you give it a read.

The topic at hand, concurrent write handling on EXT4 versus XFS, is a very brief one, so this will be a brief blog post. Allow me to explain. The following really sums it up:

EXT4 does not support concurrent writes, XFS does.

So, in spite of the fact that the topic is brief, I’d like to expound upon the matter and offer some proof.

In the following you will see two proof cases, one for EXT4 and the other for XFS. The proof case, shown here for XFS, is as follows:

  1. The previous file system is unmounted
  2. An XFS file system is created on my md(4) software RAID LUN
  3. The XFS file system is mounted on /mnt/dsk
  4. A script called simple.sh is executed to prove the volume supports high-performance sequential writes by first initializing a test file through the direct I/O code path
  5. The simple.sh script then measures 196,608 64KB sequential writes to the test file. The file is opened without truncate so this is an operation that merely over-writes the file. The writes are performed with direct I/O.
  6. The simple.sh script then performs concurrent writes of the same file. Again, the writes are through the direct I/O code path and the file is not truncated. There are two dd(1) processes: one over-writes the first half of the file and the other over-writes the second half.
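For readers who want to reproduce steps 1 through 3, the setup can be sketched roughly as follows. The device name /dev/md0 is an assumption (the post only says the LUN is an md(4) software RAID device), the -f and -F force flags are my choice rather than anything stated in the post, and the function name prep_cmds is made up for illustration. The sketch only prints the commands; nothing is executed unless you deliberately pipe the output to a root shell.

```shell
#!/bin/bash
# Print (do not run) the commands that recreate the test file system.
# /dev/md0 is a hypothetical device name; /mnt/dsk is the mount point from the post.
prep_cmds() {
    local fstype=$1 dev=${2:-/dev/md0} mnt=${3:-/mnt/dsk} mkfs
    case "$fstype" in
        xfs)  mkfs="mkfs.xfs -f $dev" ;;    # -f: overwrite an existing file system
        ext4) mkfs="mkfs.ext4 -F $dev" ;;   # -F: force creation
        *)    echo "unsupported file system: $fstype" >&2; return 1 ;;
    esac
    echo "umount $mnt"                      # step 1
    echo "$mkfs"                            # step 2
    echo "mount -t $fstype $dev $mnt"       # step 3
}

prep_cmds "${1:-xfs}"
```

Running the same sketch with ext4 as the argument prints the equivalent commands for the EXT4 proof case.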

I’ll paste the silly little simple.sh script at the bottom of this post.
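As a quick sanity check on the numbers in steps 5 and 6, the dd(1) arguments used by simple.sh can be multiplied out. This is only arithmetic on the block sizes and counts that appear in the script; the variable names are made up for illustration.

```shell
#!/bin/bash
# Sizes taken from the dd(1) arguments in simple.sh.
init_bytes=$(( 12 * 1024 * 1024 * 1024 ))   # bs=1024M count=12
single_bytes=$(( 196608 * 64 * 1024 ))      # bs=64K count=196608
half_bytes=$(( 98304 * 64 * 1024 ))         # bs=64K count=98304 (each writer)
second_start=$(( 98304 * 64 * 1024 ))       # seek=98304 output blocks of 64K

echo "file size:       $init_bytes bytes"
echo "single writer:   $single_bytes bytes"
echo "two writers:     $(( 2 * half_bytes )) bytes"
echo "writer 2 offset: $second_start bytes"
```

The single writer covers exactly the 12 GiB file, and the two concurrent writers each cover a non-overlapping 6 GiB half, so the two test cases move the same amount of data.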

The measure of goodness is, of course, whether or not the two-process case is able to push more I/O in aggregate than the single-writer case. You’ll see that with very large writes the LUN can sustain 3.7 GB/s with a single writer through the direct I/O code path on both XFS and EXT4 files. The concurrent versus single write test cases were conducted with 64KB writes. Again, with both file systems (XFS and EXT4) the single writer was able to push 1.4 GB/s. As the following shows, the XFS two-writer case scaled at 1.7x, to an aggregate of 2.4 GB/s.

Now it’s time to move on to EXT4. Here you’ll see the same baseline of 3.7 GB/s when initializing the file and the familiar 1.4 GB/s for the single 64KB serial writer. That, however, is the extent of the similarities. The two-writer case on EXT4 sadly de-scales. The 2.4 GB/s seen in the XFS case falls to an aggregate of 1048 MB/s with two writers on EXT4.
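The scaling factors fall straight out of the rounded figures quoted above. The snippet below is just that division, assuming dd's decimal units (1.4 GB/s taken as 1400 MB/s, 2.4 GB/s as 2400 MB/s); the function name scaling is made up for illustration.

```shell
#!/bin/bash
# Scaling factor = aggregate two-writer throughput / single-writer throughput.
# Inputs are the rounded figures quoted in the post, in MB/s.
scaling() {
    awk -v two="$1" -v one="$2" 'BEGIN { printf "%.2f\n", two / one }'
}

echo "XFS  two-writer scaling: $(scaling 2400 1400)x"   # prints 1.71x
echo "EXT4 two-writer scaling: $(scaling 1048 1400)x"   # prints 0.75x
```

In other words, adding a second writer on EXT4 left the aggregate at roughly three quarters of what a single writer managed on its own.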

The following is the simple.sh script:

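#!/bin/bash
# simple.sh: compare one direct I/O writer against two concurrent direct I/O writers.
# Usage: simple.sh <test file>   (run as root; writing to drop_caches requires it)

myfile=$1
[ -n "$myfile" ] || { echo "usage: $0 <test file>" >&2; exit 1; }

echo "Creating test file $myfile using direct I/O"
# 12 x 1 GiB writes create a 12 GiB file
dd if=/dev/zero of="$myfile" bs=1024M count=12 oflag=direct

# Flush dirty pages, then drop the page cache, dentries and inodes
sync;sync;sync;echo 3 > /proc/sys/vm/drop_caches

echo "Single Direct I/O writer"
# 196608 x 64 KiB = 12 GiB; conv=notrunc over-writes the file in place
( dd if=/dev/zero of="$myfile" bs=64K count=196608 conv=notrunc oflag=direct > thread1.out 2>&1 ) &

wait
cat thread1.out

echo "Two Direct I/O writers"
# Writer 1 over-writes the first half; writer 2 seeks past it and over-writes the second half
( dd if=/dev/zero of="$myfile" bs=64K count=98304 conv=notrunc oflag=direct > thread1.out 2>&1 ) &
( dd if=/dev/zero of="$myfile" bs=64K count=98304 seek=98304 conv=notrunc oflag=direct > thread2.out 2>&1 ) &

wait
cat thread1.out thread2.out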
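The script leaves the aggregate for the two-writer case to be worked out by eye from the two dd(1) summaries. One possible way to compute it is sketched below: sum the bytes from both result files and divide by the longer of the two elapsed times. This is not part of the original script. It assumes GNU dd summary lines of the form "<bytes> bytes (...) copied, <seconds> s, <rate>" in the C locale, and the function name aggregate_mbps is made up for illustration.

```shell
#!/bin/bash
# Sum bytes across dd(1) result files and divide by the longest elapsed time.
# Assumes GNU dd summary lines such as:
#   "<bytes> bytes (...) copied, <seconds> s, <rate>"
aggregate_mbps() {
    awk '/bytes/ && /copied/ {
             total += $1                      # first field is the byte count
             for (i = 2; i <= NF; i++)        # the field before "s," is the elapsed time
                 if ($i == "s,") { secs = $(i - 1) + 0; if (secs > max) max = secs }
         }
         END { if (max > 0) printf "%.0f\n", total / max / 1000000 }' "$@"
}

# Only report when both result files from the two-writer run exist.
if [ -r thread1.out ] && [ -r thread2.out ]; then
    echo "Aggregate: $(aggregate_mbps thread1.out thread2.out) MB/s"
fi
```

Dividing by the longer elapsed time is a conservative choice: if one writer finishes early, the aggregate still reflects how long it took to get all of the data written.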

16 Responses to “Yes, File Systems Still Need To Support Concurrent Writes! Yet Another Look At XFS Versus EXT4.”


  1. 1 goryszewskig March 7, 2012 at 8:27 am

    OK, so what about BTRFS? Is that a good competitor for XFS?
    Best Regards.
    GregG

    • 2 kevinclosson March 7, 2012 at 10:27 am

      Hello GregG,

      I would put btrfs way ahead of ext4 for features and so forth. I fully admit I am biased towards XFS at this time. All those years of all those long-of-tooth SGI engineers’ time stack up to goodness.

      I do not test btrfs so mum’s the word from me, unfortunately. I have heard my friend Dave Chinner speak kind words about btrfs in the past…maybe he was dodging drop bears 🙂

  2. 3 Bart Sjerps March 7, 2012 at 12:04 pm

    BTRFS is (like WAFL and ZFS) copy-on-write. It is my understanding that this means every block of written data eventually ends up at a pseudo-random place on disk (i.e. fragmentation by design, which might work well for fileservers but less so for small-block-update OLTP databases). If you later do a sequential read (and I believe Oracle does a lot of short-sequential-“ish” I/O; better explanation needed 😉), then it might cause a lot of excess physical disk seeks and therefore unnecessarily heavy disk utilization (more than you would have on an FS with minor fragmentation). Until everybody is running on 100% flash disk sometime in the future, I think this will not improve performance (actually, the opposite…)

    Not to mention the negative effects that physically moving datafile logical block locations has on things like virtual provisioning, EMC FLASH cache, FAST-VP and the like…

    Am I right or am I missing something?

  3. 5 Lee Johns March 22, 2012 at 1:18 pm

    Nice article. Good validation points for our choice of XFS. Thank you SGI & Dave Chinner.

    • 6 kevinclosson March 22, 2012 at 1:24 pm

      Dave Chinner is a ridiculously talented individual (and quite a good chuckle over beers I’ll add).

      We all owe Dave and Red Hat a lot for their commitment. And, as you point out, to the heritage of the SGI code.

  4. 7 Andrey B. Panfilov April 8, 2012 at 1:48 pm

    Kevin,

    You got very strange results, because according to http://www.mysqlperformanceblog.com/2012/03/15/ext4-vs-xfs-on-ssd/ XFS is definitely slower than ext4 on concurrent writes.

  5. 13 Gadi Chen December 8, 2015 at 7:35 am

    can we mount both xfs and ext4 on the same RDBMS (11gR2) Linux?


  1. 1 Interesting observations executing SLOB2 with ext4 and xfs on SSD « Martins Blog Trackback on October 31, 2014 at 7:27 am
  2. 2 Oracle redo log performance on Linux filesystems – Ilmar Kerm Trackback on March 18, 2023 at 2:00 am

DISCLAIMER

I work for Amazon Web Services. The opinions I share in this blog are my own. I'm *not* communicating as a spokesperson for Amazon. In other words, I work at Amazon, but this is my own opinion.


Copyright

All content is © Kevin Closson and "Kevin Closson's Blog: Platforms, Databases, and Storage", 2006-2015. Unauthorized use and/or duplication of this material without express and written permission from this blog’s author and/or owner is strictly prohibited. Excerpts and links may be used, provided that full and clear credit is given to Kevin Closson and Kevin Closson's Blog: Platforms, Databases, and Storage with appropriate and specific direction to the original content.