Linux Software

Benchmarking XFS, ext2, ReiserFS, FAT32

Posted by timothy
from the cool-developments dept.
blakestah writes: "Well, it looks like someone on the LKML has taken upon himself to do some benchmarking of ReiserFS, ext2, and XFS using the 2.4 kernel series. It is not a real benchmark test, but kind of interesting nonetheless. See the results (in Spanish) at this LUG in Mallorca. Simple runs of dd, tar, and rm are shown, and for most of the tests XFS is pretty dern fast, beating all the others. The exception is removal of a large source tree (the kernel source), for which XFS is the slowest by a fair amount. See this kernel post for the translations of important words. It will be nice to see more such open benchmarking posted, because benchmarks provide developers goals." The contrast between FAT32 and XFS is particularly interesting to see.

  • by Anonymous Coward
    For every benchmark that Microsoft has showing that its software is faster, the Linux crowd has one that shows the opposite.

    If the benchmark doesn't operate similarly to the application you're going to use it for, the benchmark may be next to meaningless to you. Maybe you're going to create, and then soon after delete a small file, 10,000 times. Which FS is better for you? Maybe you don't know that your application does this, now you have even less of a clue.

    There should be fewer benchmarks because 1) people trust them more than they should, and 2) their utility is low when compared to test-driving your application on the real thing.

  • by Anonymous Coward
    Turn on SoftUpdates and apply the FFS Directory preference patch, *bam!*, runs faster than anything on the market. Unfortunately you still have to fsck unless you run -CURRENT.
  • by Anonymous Coward
    32mb?? How big is your hard drive, 100 megs? My old 386 had a bigger drive than that. Now, 32 megs out of my 40 gig hard drive is nothing. I waste more than that in core files I never bother to delete. :-)
  • by Anonymous Coward
    Even if you have -CURRENT, you still have to fsck, it just does it in the background. It's slightly better to have a painfully slow system back up immediately, rather than having a full-speed system back up after a long fsck, but not a whole lot better.

    There is a reason that journaling is considered required for an enterprise level OS. I'd love to see ReiserFS or XFS ported to FreeBSD. Too bad about that GPL thing...
  • by Anonymous Coward
    Deleting files on a FAT* partition doesn't actually delete the files. It just marks the first character of the filename with an invalid character (0 ASCII IIRC) so the filesystem thinks the space is free.

    It's not ASCII 0, 0 marks the end of the directory (I think it's 0xE5, not that it's that important). And if the file has a long filename, the first letter of the short filename entry is overwritten, but the long filename is still available. So newer undelete programs don't always need the first letter.

    Anyway, when a file is deleted, each cluster of the file must be zeroed in the FAT (note that there are usually two identical copies of the FAT, both must be updated). Since the FAT is at the beginning of the disk, the write heads may need to move a large distance (between the directory entry and each FAT), so write caching will give a huge speed increase. Also, directories can only be deleted when they are empty, so this needs to be done for each file (and subdirectory), and then for the directory itself. And then the free space count has to be updated.

    My guess here is that FAT32 just marks the root source directory as deleted and then moves along.

    This would not work, you'd end up with "lost clusters" and lots of missing hard drive space. You could try this with Norton Disk Editor some time if you were curious - it's a good way to look at how the filesystem is stored, FAT is pretty easy to understand. Scandisk would find and fix this problem easily (it would either restore the directory, or just delete everything that used to be in it).
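
To make the two deletion strategies in this comment concrete, here is a toy model in Python. It is only a sketch: the class, its field names, and the end-of-chain marker are invented for illustration and do not match the real FAT32 on-disk layout. The 0xE5 marker is the value the commenter recalls. It shows a proper delete (mark the directory entry, then free every cluster of the chain in both FAT copies) next to the "just mark the directory" shortcut, which leaves lost clusters behind.

```python
# Toy model of FAT deletion. NOT the real on-disk layout: the class, field
# names, and END marker are illustrative assumptions only.
DELETED = 0xE5   # first byte of a deleted directory entry (as recalled above)
FREE = 0         # a FAT slot holding 0 means the cluster is free
END = -1         # stand-in for the end-of-chain marker


class ToyFat:
    def __init__(self, n_clusters):
        # Two identical copies of the FAT; both must be kept in sync.
        self.fats = [[FREE] * n_clusters, [FREE] * n_clusters]
        self.entries = {}  # name -> {"first_byte": int, "start": int}

    def create(self, name, clusters):
        """Link a list of cluster numbers (>= 2) into a chain and add an entry."""
        for fat in self.fats:
            for here, nxt in zip(clusters, clusters[1:] + [END]):
                fat[here] = nxt
        self.entries[name] = {"first_byte": ord(name[0]), "start": clusters[0]}

    def delete(self, name):
        """Proper delete: mark the entry, then zero the chain in both FATs."""
        entry = self.entries[name]
        entry["first_byte"] = DELETED
        cluster = entry["start"]
        while cluster != END:
            nxt = self.fats[0][cluster]
            for fat in self.fats:
                fat[cluster] = FREE
            cluster = nxt

    def mark_only(self, name):
        """The shortcut guessed at in the quoted post: touch the entry only."""
        self.entries[name]["first_byte"] = DELETED

    def lost_clusters(self):
        """Clusters still allocated in the FAT but unreachable from a live entry."""
        reachable = set()
        for entry in self.entries.values():
            if entry["first_byte"] == DELETED:
                continue
            cluster = entry["start"]
            while cluster != END:
                reachable.add(cluster)
                cluster = self.fats[0][cluster]
        return [i for i, v in enumerate(self.fats[0])
                if v != FREE and i not in reachable]
```

After `delete`, `lost_clusters()` reports nothing; after `mark_only`, the file's clusters stay allocated with no live entry pointing at them, which is the situation a tool like Scandisk would have to clean up.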

    [Offtopic] A fun trick to play on someone is to change their disk's volume label, and then use Disk Editor to set the read-only flag on it (you'll find the volume label in the root directory). They will be unable to change the label from Windows or DOS. Unfortunately, I think Scandisk (or Norton Disk Doctor) can detect this now, but they won't think to run Scandisk for a few minutes anyway.

  • by Anonymous Coward
    If this is the case, I think it is quite a good tradeoff, given that space is allocated much more often.

    >Only if you've got an infinite drive..

    So the fact that you get one credit to your bank account each month but make lots of payments means you have an infinite balance?

    lots of allocation operations, fewer delete operations. It is just that the delete operations are bigger

    ..d

  • by Anonymous Coward
    It is amazing how fast ReiserFS can remove a source tree to the Linux kernel. Almost as soon as you hit the "return" key, it is finished. Blows me away. It is so fast it could be dangerous to the careless. It works so fast that by the time you realize your mistake *everything* would be gone--in the blink of an eye.
  • by Anonymous Coward on Thursday May 10, 2001 @10:47AM (#232207)
    Deleting files on a FAT* partition doesn't actually delete the files. It just marks the first character of the filename with an invalid character (0 ASCII IIRC) so the filesystem thinks the space is free. That's why undelete utilities work -- they just search for all files with the invalid first character and then change the character to something else (which the program prompts the user for). Assuming the space the file had before hasn't since been used by another file, you get the file back.

    My guess here is that FAT32 just marks the root source directory as deleted and then moves along. (This is how it works under DOS/Windoze at any rate.) If this were so, FAT32 is hardly fast here -- it takes 6.7 seconds to do just one write to the FAT table!

    To be honest though, I'm not sure how the various Linux FSes work, so maybe they're doing it the same way, though I doubt one FS entry would take 10 - 20 seconds.

  • by Anonymous Coward on Thursday May 10, 2001 @09:12AM (#232208)
    Benchmarks... bah. I don't think the ones linked to on that Spanish site are worthy of the name. For those that can't see it because it's /.ed, it makes out XFS to be quite a bit faster. Some relevant comments from the LKML... [the last one showing that XFS is not always faster]

    Alan Cox:
    "reiserfs seems to handle large amounts of small files well, up to a point but it also seems to degrade over time. ext3 isnt generally available for 2.4 but is proving very solid on 2.2 and has good fsck tools. Ext3 does not add anything over ext2 in terms of large directories of files and other ext2 performance limits. XFS is very fast most of the time (deleting a file is sooooo slow its like using old BSD systems). Im not familiar enough with its behaviour under Linux yet."

    Andi Kleen:
    "On one not very scientific test: unpacking and deleting a cache hot 40MB/230MB gzipped/unzipped tar on ext2 and xfs on a IDE drive on a lowend SMP box.

    XFS (very recent 2.4.4 CVS, filesystem created with mkxfs defaults)

    > time tar xzf ~ak/src.tgz
    real 1m58.125s
    user 0m16.410s
    sys 0m44.350s

    > time rm -rf src/
    real 0m50.344s
    user 0m0.190s
    sys 0m13.950s

    ext2 (on same kernel as above)

    > time tar xzf ~ak/src.tgz
    real 1m26.126s
    user 0m16.100s
    sys 0m36.080s

    > time rm -rf src/
    real 0m1.085s
    user 0m0.160s
    sys 0m0.930s

    ext2 seems to be faster and the difference on deletion is dramatic, so at least here it looks like Alan's statement is true. The test did not involve very large files, the biggest files in the tar are a few hundred K with most of them being much smaller. The values stay similar over multiple runs. I did not do any comparisons recently with reiserfs, but at least in the past reiserfs usually came out ahead of ext2 for similar tests (especially being much faster for deletion)"
  • I once accidentally typed rm -rf /home when I was extremely sleepy. Funny that I did notice that something was wrong and so I pressed ctrl-c immediately, I think it's about half a second after I issued the command. More than 1 GB data was gone! Dunno why but it just deleted all of my mp3s instead of my thesis's tex files =)

    So beware, there's no undelete utilities last time I checked, though the authors would like to write one if you choose to pay them.

  • certainly it can't be slower than BSD's UFS, though, can it? I could manually delete the bits off the platter with a tiny magnet faster than FreeBSD can do rm -r /usr/ports...

    - A.P.

    --
    Forget Napster. Why not really break the law?

  • I would imagine they didn't test against NTFS because the kernel drivers are very unstable for it. Sure, they could have installed Win2K and tested that way, but then OS differences would affect the results. A pure filesystem test should use the same kernel setup with only the fs drivers changing.


  • http://kernelnewbies.org/~phillips/

  • Can you point me to information about IBM's Next Generation PThread book?
  • Yeah, now that I think of it, 1024 Gb isn't that impressive. I think you're correct, it was a multiple of terabytes.

    <SLIGHTLY OT>
    Doesn't it piss you off that disk drive manufacturers are claiming that 1 Gb = 1000 Mb in order to protect the "sanctity" of the metric system? What a load of crap. Byte magnitudes have always been on the order of 2^(n*10). Now all of a sudden they need to protect the SI naming convention? Isn't it convenient that this "protection" makes their drives sound larger without actually being that large?
    </SLIGHTLY OT>

  • by Omega (1602) on Thursday May 10, 2001 @09:09AM (#232215) Homepage
    I guess I've always been partial to XFS and I hope that it becomes the new default filesystem for Linux.

    This guy Dave (I forget his last name now), from sgi [sgi.com] gave a presentation to the DC-LUG [tux.org] back in 1999 and talked about XFS and how sgi wanted to release it as GPL to become a core component of Linux. He also talked about the history of XFS and how they had to invent a new size prefix to describe how large a filesystem XFS could accommodate ("exo-byte" = 1024 Gb). XFS has been used by sgi for their MIPS and Cray machines ever since 1984, and now that sgi has donated it to the Linux community, I think we'd be remiss if we didn't welcome it with open arms.

    But that's just MHO. ;)

  • by slim (1652)
    "It is quite surprising to see the write time be so slow for linux, as quite frankly, FAT32 is so simple (no transaction) it *should* be only slightly slower than optimal in medium to large file size cases."

    I would imagine that the reason FAT32 isn't highly optimised on Linux is that it's only there as a compatibility tool. Nobody running Linux would put data on a FAT32 partition when performance was an issue. Being able to get at the filesystem at all is enough.
    --
  • To get XFS to work with Linux 2.4.4, you'll need the version in their CVS repository.

    XFS and MOSIX -might- work, with sufficient application of brute-force. Neither work with 2.4.4-ac6, and I believe there's a lot of problems with FreeSWAN's IPSEC, too.

    The only patches I've found to be relatively unstressful are the International Patch, the POSIX Timer patch, and IBM's Next Generation PThread code.

  • Absolutely understood. I've e-mailed the authors of the various packages, asking if there's some way of getting the packages to work together.

    The closest to a printable response was some sneer to the effect that the writers weren't interested in supporting Alan Cox's "private OS".

    (Curiously, the patches tend to break with every release of the official kernel. Hmmm. Maybe if these coders got off their high horses long enough to see how code migrates between the official kernel and the AC series, they'd have an easier time keeping up.)

    As of right now, though, you're going to have to do some of the merge by hand. And it's likely to prove difficult.

    Look on the bright side, though! If you =DO= get a merged patch, it'll be YOUR patch that's used. Who wants second-best?

  • by jd (1658) <{moc.oohay} {ta} {kapimi}> on Thursday May 10, 2001 @09:46AM (#232219) Homepage Journal
    Size of Linux-2.4.4, with no patches: 3,239,950 lines (as calculated by doing a wc -l on all .c's and .h's in the tree, then summing the result), and 131,448,832 bytes (for the whole tree, using du).

    Size of Linux-2.4.4 with XFS (as taken from SGI's CVS tree): 3,612,541 lines (same algorithm) and 145,367,040 bytes.
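
The counting method described here (run `wc -l` over every .c and .h file and sum the results) can be sketched in Python as below, together with the arithmetic behind the "XFS is about 10% of the kernel" figure quoted elsewhere in the thread. The function names are made up for this illustration; the line and byte counts are the ones given in the comment.

```python
import os


def count_source_lines(root, suffixes=(".c", ".h")):
    """Sum newline counts over matching files, like `wc -l` on each file."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(suffixes):
                with open(os.path.join(dirpath, name), "rb") as fh:
                    # Read in chunks so large files are not loaded whole.
                    for chunk in iter(lambda: fh.read(1 << 16), b""):
                        total += chunk.count(b"\n")
    return total


def added_share(before, after):
    """Fraction of the patched tree that was added by the patch."""
    return (after - before) / after


# Figures quoted in the comment above (plain 2.4.4 vs. 2.4.4 with XFS).
lines_share = added_share(3_239_950, 3_612_541)
bytes_share = added_share(131_448_832, 145_367_040)
print(f"added lines: {lines_share:.1%}, added bytes: {bytes_share:.1%}")
```

Both shares come out at roughly a tenth of the patched tree. As later replies point out, the CVS tree also carries kdb and possibly userspace commands, so this is an upper bound on the filesystem's own share rather than an exact figure.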

  • Or they've a serious PostgreSQL bug. :)

    Seriously, XFS looks interesting. I did some stats on source size, and posted them to K5. :) Dropped in score so rapidly, it'll reach the Earth's core sometime in the next half hour.

    Essentially, XFS is 10% of the entire kernel size, making it perhaps the single-most sophisticated driver available. On the flip-side, I can't help but feel that that much code is going to have -some- impact on the rest of the system.

    Talking of benchmarking, how does IBM's "Next Generation PThread" code stand up? And how on Earth are you supposed to install it? It clashes with glibc, making an RPM install, ummm, of questionable safety. And once you start with RPM (or tarballs, or any other system), it's unwise to mix-and-match. Either you keep track of where things are, or the system does. Half-and-half is NOT good.

    Lastly, anyone found a way to get XFS, JFS and AFS into the same kernel? (Without using a sledgehammer, preferably.)

  • According to Steve Lord, one of the XFS developers, current xfs from cvs has problems. (link to Steves post on linux-xfs [sgi.com]) /jarek
  • It's true. I ran bonnie++ on a test machine under ext2 and xfs. xfs had REALLY slow deletes, but fast throughput. XFS did 36 deletes per second, while ext2 did 802 deletes per second. As far as throughput, XFS did 538K/sec on block reads, while ext2 only got 436. Also, in general, xfs used less CPU resources. Also, ext2 kicked xfs's butt in all areas when there was only one bonnie++ process running. The benchmark data I've given was running three simultaneously on the filesystem. So, if the server only has one process, I'd NOT go with xfs. If the server has multiple processes, xfs probably beats most things, especially on block read performance.

    Note, also, that my test machine was only a AMD K6-2 266 w/28M RAM
  • There is no word for slashdotted in Spanish

    How about barrapuntar: "to slashdot".

    - Sam

  • I don't think the ones linked to on that Spanish site are worthy of the name

    Keep in mind that the article starts out by saying "This is a little more rigorous than what was posted to the mailing list the other day". It also says "This comparison only measures one aspect of the filesystems, and there are other important aspects to consider when deciding on a filesystem".

    - Sam

  • Anybody know the Spanish word for "slashdotted"

    Well, "slashdot" is Barrapunto [barrapunto.com] when translated into Spanish, so the verb would be "barrapuntar". E.g. ellos barrapuntaron la página (They slashdotted the page).

    - Sam

  • by Kiwi (5214) on Thursday May 10, 2001 @10:05AM (#232226) Homepage Journal
    Since I know a little Spanish, here is my very crude translation of the core benchmarks:

    First benchmark:

    Writing, reading, and deleting a fairly large file (256MB). The commands used are as follows (omitting the redundant 'time' command used in all cases):

    • dd if=/dev/zero of=./prova bs=1M count=256
    • dd if=./prova of=/dev/null bs=1M count=256
    • rm -f ./prova
    And the time these commands took (in seconds):

    Filesystem  Write  Read   Delete
    ReiserFS    18.5   23.41  0.4
    Ext2FS      20.3   21.38  0.57
    XFS         16.32  19.42  0.26
    FAT32       43.65  27.98  1.59

    Second benchmark: This benchmark was done with the source code of Linux 2.4.4. The .tar.gz file was first uncompressed, so that all the work was done on the tar file (which is larger than 100MB). The commands used on the uncompressed .tar file are as follows (with the time command omitted again):

    • cp linux-2.4.4.tar prova.tar
    • tar xf prova.tar
    • rm -f prova.tar
    • rm -rf linux
    The times these commands took (in seconds):

    FS        Copy   Extract  rm file.tar  rm -rf dir
    ReiserFS  38.48  58.44    0.45         10.09
    Ext2FS    21.31  59.19    2.88         11.12
    XFS       16.21  35.44    0.18         21.96
    FAT32     39.76  134.19   1.2          6.7
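
For readers who prefer rates to raw seconds, the first table can be converted to throughput. This is a small sketch using only the numbers above and the 256 MB file size from the dd commands; the variable and function names are made up for the example.

```python
FILE_MB = 256  # dd wrote and read 256 blocks of 1M

# (write_seconds, read_seconds) taken from the first table above.
times = {
    "ReiserFS": (18.5, 23.41),
    "Ext2FS": (20.3, 21.38),
    "XFS": (16.32, 19.42),
    "FAT32": (43.65, 27.98),
}


def throughput(seconds, size_mb=FILE_MB):
    """Average transfer rate in MB/s for a transfer of size_mb."""
    return size_mb / seconds


rates = {fs: (throughput(w), throughput(r)) for fs, (w, r) in times.items()}
for fs, (w, r) in rates.items():
    print(f"{fs:<9} write {w:5.1f} MB/s  read {r:5.1f} MB/s")
```

Several of the write rates come out above the 12.5 MB/s that the translated article reports from hdparm for the same drive. One possible reading, and it is only a guess, is that part of the data was still in the page cache when dd returned; the article itself does not say.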

  • What the gut got:

    $ dd if=/dev/zero of=./prova bs=1M count=256
    $ dd if=./prova of=/dev/null bs=1M count=256
    $ rm -f ./prova

    FS        Write  Read   rm
    ReiserFS  18.5   23.41  0.4
    Ext2FS    20.3   21.38  0.57
    XFS       16.32  19.42  0.26
    FAT32     43.65  27.98  1.59

    $ cp linux-2.4.4.tar prova.tar
    $ tar xf prova.tar
    $ rm -f prova.tar
    $ rm -rf linux

    FS        cp         extract  rm    rmdir
    ReiserFS  38.48 [1]  58.44    0.45  10.09
    Ext2FS    21.31      59.19    2.88  11.12
    XFS       16.21      35.44    0.18  21.96
    FAT32     39.76      134.19   1.2   6.7



    [1] ReiserFS has a high (reproducible) variance when copying the kernel tarball.

    All tests ran three times with the exception of the reiserfs kernel tarball copy, which ran six times. Machine is running in single user mode. Results timed with time(1).
  • Anybody know the Spanish word for "slashdotted"?
  • Also, one more tidbit, the CVS tree also contains the kdb code, so that's a (small) part of the increase.
  • Just realized, CVS has a kernel/ dir and a cmds/ dir.

    Any chance your tallies include the whole cmds/ tree? That's all userspace stuff...
  • by Booker (6173) on Thursday May 10, 2001 @09:21AM (#232231) Homepage
    I can't help but feel that that much code is going to have -some- impact on the rest of the system.

    Although XFS is big, it doesn't stomp on the kernel that much.

    If you're looking at it from a pure "volume of code" point of view, here's some info:

    The XFS patches are split into 2 parts, one which contains kernel changes, and one which is the filesystem itself.

    The "core-kernel" patch is about 190k, while the actual filesystem code patch is about 4.3M.

    Bear in mind that a 190k patch to the kernel does not mean 190k of new code, either, since you have to take out the context lines, the headers, and the delete lines.

    Overall, the impact on the kernel isn't as big as you might think, looking at the overall size.

    Now, whether or not executing code in the filesystem slows down the rest of the system, I don't have any real data on that, although I have not noticed any detrimental effect on my system.
  • They configured PostgreSQL with a low limit for the number of client processes (default = 32 usually). Either that should be increased, or the connection pooling configured in a way so that this limit isn't exceeded. Limiting the number of simultaneous Apache processes also helps, sometimes.

  • That must mean that it's possible to calculate (or guestimate) the number of words spoken by one person during his/her lifetime. I wonder how much storage it would take to hold that?

    Macka
  • zetabyte.

    exobyte, yottabyte, zetabyte. x-y-z.

    THEN you do a lottabyte. :)

    Rob

  • by sl70 (9796) on Thursday May 10, 2001 @10:13AM (#232235) Homepage
    I'm using ReiserFS now and I'm really happy with performance/crash recovery. I have a lot of cat-related outages...

    I installed xfs a week ago. Yesterday I wrote about it to my ex-office mate, saying I couldn't wait for a power failure. Got my wish this morning at about 4:00 a.m. My machine took 15 seconds to reboot. My colleague's machine with ext2 took nearly 3 minutes. Cool messages in the logs:

    May 10 03:55:11 musuko kernel: Start mounting filesystem: ide0(3,5)
    May 10 03:55:11 musuko kernel: XFS: WARNING: recovery required on readonly filesystem.
    May 10 03:55:11 musuko kernel: XFS: write access will be enabled during mount.
    May 10 03:55:11 musuko kernel: Starting XFS recovery on filesystem: ide0(3,5) (dev: 3/5)
    May 10 03:55:11 musuko kernel: Ending XFS recovery on filesystem: ide0(3,5) (dev: 3/5)
  • You're sitting comfortably in front of the console of your server happily clickity-clicking away, the syslog quietly printing its timestamps every 5 minutes, load is normal, users are quiet and the world seems to be in place.

    Suddenly an alarm rings, your syslog turns all red and the security specialist comes storming through the door shouting "someone executed 'rm -rf /&' as root"...

    At this point, would you prefer to have a filesystem that is *sooo* fast deleting that it's already gone through /bin and moved on to /home/, or would you like to have one that has barely just started? :)

    Of course that is all just hypothetical, not that anyone has ever had to recover a system from a deleted /bin :)

    regards: andrey
  • Essentially, XFS is 10% of the entire kernel size, making it perhaps the single-most sophisticated driver available. On the flip-side, I can't help but feel that that much code is going to have -some- impact on the rest of the system.

    Well, it seems to break modules in the one patch of the 2.4.3 kernel that I've tried so far. (Unpack fresh kernel source, patch with both XFS patches, compile...XFS works [compiled directly into kernel], but every module I try to load [sound, drm driver, etc.] complains about unresolved symbols. Not sure what to do about that). Anybody else run into this one? I've been itching to try XFS on a couple of my systems...

    Lastly, anyone found a way to get XFS, JFS and AFS into the same kernel? (Without using a sledgehammer, preferably.)

    If this isn't bad enough, I'd like to get XFS and MOSIX into the same kernel. How's THAT for masochistic? Need to figure out how to get the XFS patch into 2.4.4, though...


    ---
  • by MSG (12810) on Thursday May 10, 2001 @10:12AM (#232238)
    The differences between FAT32 and XFS may be interesting, but keep perspective. What you're seeing is the difference between the Linux driver for XFS and the Linux driver for FAT32, and not necessarily the inherent properties of either filesystem.

    Don't get me wrong, I'm not comparing FAT32 to XFS by a long shot! But FAT32 is a fs that not a lot of hackers care about enough to improve the performance under Linux. Personally, I've always found that FAT32 access under Linux has been abysmal compared to access to the same filesystem under Windows.
  • by Jethro (14165) on Thursday May 10, 2001 @09:15AM (#232239) Homepage
    I used ext3 about a year or two ago. It worked a lot better than it should have. Or so I thought.

    It claimed that it'll still have fsck run when you crash, but actually nothing was getting run... I'm pretty sure that it was just not repairing itself after crashes. I'm really surprised I never lost any data until, well, one day I lost the whole damn partition.

    I'm using ReiserFS now and I'm really happy with performance/crash recovery. I have a lot of cat-related outages...


    --
  • by rangek (16645) on Thursday May 10, 2001 @09:18AM (#232240)

    they had to invent a new size prefix to describe how large a filesystem XFS could accommodate ("exo-byte" = 1024 Gb)

    This is a slight exaggeration. A kilobyte is 1024 bytes, a megabyte is 1024 kilobytes, a gigabyte is 1024 megabytes, a terabyte is 1024 gigabytes. Oh wow, it is not even an exaggeration, it is just wrong, I guess. Anyway, if we keep going up the table of SI prefixes in my CRC, we could have petabytes, which would be 1024 terabytes, and then exabytes = 1024 petabytes, and so on. There is no need to invent things, it is all taken care of, from 10^-24 (yocto-) to 10^24 (yotta-) (now that's a lotta stuff) ;)
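
The ladder of 1024-based units described in this comment is easy to tabulate. The sketch below assumes the power-of-two convention the commenter uses (not the decimal SI meaning of the same prefixes); the list and function names are invented for the example.

```python
# SI prefix names, in order, each step being a factor of 1024 in this convention.
PREFIXES = ["kilo", "mega", "giga", "tera", "peta", "exa", "zetta", "yotta"]


def binary_size(prefix):
    """Bytes in one <prefix>byte when every step up is a factor of 1024."""
    return 1024 ** (PREFIXES.index(prefix) + 1)


for step, prefix in enumerate(PREFIXES, start=1):
    print(f"1 {prefix}byte = 2^{10 * step} bytes = {binary_size(prefix):,}")
```

Under this convention 1024 gigabytes is a terabyte, and an exabyte is 1024 petabytes, or 2^60 bytes.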

  • by rangek (16645) on Thursday May 10, 2001 @10:29AM (#232241)

    This isn't specifically for bytes, but the list of SI prefixes is here [nist.gov]

    The new prefixes for binary units (which nobody uses (the prefixes, not binary units)) are here [nist.gov]

  • by Scudsucker (17617) on Thursday May 10, 2001 @10:18AM (#232242) Homepage Journal
    Maybe because this was a test of filesystems that work under Linux, not filesystems in general. To have a reliable test of NTFS, you'd have to use it under NT or Win2k, unless you wanted to do some shady benchmarking and use the developing NTFS driver for linux. Also, the point was to use the exact same commands for each filesystem...dd, rm, etc, which you couldn't do in Windows.

    Fat32 was probably included because it's so well supported and easy to do.
  • I think it is quite a good tradeoff, given that space is allocated much more often.

    If you think about it, almost as much space gets allocated as gets deleted overall. In fact, the difference between total space allocated and total space freed is equal to the space in use.

  • It was also mentioned on l.k.m.l that the slow delete for XFS may have been fixed; oddly enough, it was Hans Reiser who pointed it out.

    http://www.uwsg.iu.edu/hypermail/linux/kernel/0105.1/0334.html [iu.edu]

  • On my main machine (reiserfs, of course!) I had the stupidity to quickly type "rm -rf /lib" rather than "rm -rf lib". I quickly control-Ced the operation, but ReiserFS deletes FAST.

    It was funny...all applications currently running (browser, etc.) were all still happily running and usable, but I could not execute ANY new commands at the shell prompt.

    I decided to take that opportunity to upgrade to SuSE 7.0. I'm very glad I didn't wipe my wife's files...
  • Any mistakes are mine...but this is how I read
    it...
    --
    These are not really benchmarks... I have only done some fairly simple tests, but some things have surprised me and I would
    like to give you a look at them. :-)

    Well, this is a little more rigorous than that which happened on the mailing list the other day... the idea is to measure the
    performance of various filesystems, which, nevertheless, are very important: XFS and ReiserFS have journaling, and XFS has
    advanced characteristics like ACLs, extended attributes, etc. That is to say, this "comparison" takes into account a single aspect of
    the filesystems, but there are many other important aspects to take into account when you choose one.

    To begin, I have done these tests with RedHat 7.1 with the installer modified by SGI for directly installing XFS 1.0 for Linux, with
    the kernel and all the tools to date. I have recompiled the kernel in order to have support for ReiserFS without DEBUG activated
    (which makes it very slow).

    Aside from that, I have to say that I have done all these tests on a hard drive quicker than that which I used for that comment on the
    mailing list (it runs about 12.5 MB/s according to hdparm -t). All the tests were done on the same machine, with the same system,
    the same kernel, and they were done on the same hard drive on the same partition. The only thing that has changed has been the
    filesystem: my intention was basically to test XFS and compare it with ReiserFS, but I have added Ext2FS and FAT32 as
    examples (though some will say that I'm cruel to include FAT32 ;-). I have used the 'time' command to measure the execution
    time of the commands, and the times that show up in the tables are the average of the times obtained in three experiments. I have
    looked at the "real" time elapsed during execution, and you have to take into account that the machine was in single-user mode so that it
    was for all practical purposes not doing anything other than executing my commands.
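
The measurement procedure described in this paragraph (wall-clock "real" time, averaged over three runs) could be scripted along these lines. This is only a sketch: `average_real_time` is a hypothetical helper, not the author's script, and it does nothing about cache state or single-user mode, both of which affect the numbers.

```python
import subprocess
import time


def average_real_time(argv, runs=3):
    """Run the command `runs` times and return the mean wall-clock seconds."""
    elapsed = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(argv, check=True)  # raises if the command fails
        elapsed.append(time.perf_counter() - start)
    return sum(elapsed) / len(elapsed)


# Example call, mirroring the first dd command of the test (not run here):
# average_real_time(["dd", "if=/dev/zero", "of=./prova", "bs=1M", "count=256"])
```

Passing the command as an argument list avoids shell quoting issues. Keeping the individual timings in a list also makes it easy to look at the spread between runs, which matters for results like the ReiserFS copy mentioned in the conclusions.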

    First test: writing, reading, and deletion of a relatively large file (256 MB). The commands used were these (omitting 'time' which
    was used with all the commands):

    * dd if=/dev/zero of=./prova bs=1M count=256
    * dd if=./prova of=/dev/null bs=1M count=256
    * rm -f ./prova

    And the times were these (in seconds):

    FS        Writing  Reading  Deletion
    ReiserFS  18.5     23.41    0.4
    Ext2FS    20.3     21.38    0.57
    XFS       16.32    19.42    0.26
    FAT32     43.65    27.98    1.59

    Second test: this was based on the source code of kernel 2.4.4. I had downloaded the tar.gz and had uncompressed it in order to
    work with only the tar (of more than 100 MB) without adding the implications of gzip. The commands used were these (again,
    omitting 'time'):

    * cp linux-2.4.4.tar prova.tar
    * tar xf prova.tar
    * rm -f prova.tar
    * rm -rf linux

    And the times were these (in seconds):

    FS        Copy .tar  Extract .tar  Delete .tar  Delete tree
    ReiserFS  38.48      58.44         0.45         10.09
    Ext2FS    21.31      59.19         2.88         11.12
    XFS       16.21      35.44         0.18         21.96
    FAT32     39.76      134.19        1.2          6.7

    Conclusions: I was surprised by the speed of XFS because in the first test that I did (a single experiment) it didn't seem so quick.
    For writing, reading, and deletion of big files (sequentially), it seems that this is the one that runs best. Although ReiserFS didn't do
    badly, I was hoping that it would beat XFS. In this test, something else caught my attention: all but FAT32 took longer
    reading than writing. Can anyone explain that? Something related to /dev/null and /dev/zero???

    As far as the tests with the kernel source code, copying the .tar gave somewhat strange results: at first sight, the fastest was XFS, then
    Ext2FS, then ReiserFS, and finally (what a surprise ]:-) FAT32. But this is looking at the averages. There is another curious thing
    (which I didn't put in the table): in this test, the variance using ReiserFS was very large, oscillating between 16.94 and 63.48
    seconds. I cannot explain that. I repeated this test six times instead of three, and it didn't stabilize. Running it in batches of three
    (I had a script prepared for doing this, and I ran it twice), the first copy took some 20 seconds, the second some 60 and the third
    about 30. This happened to me twice. Poltergeist?

    The extraction of the .tar (which implies reading a large file mixed with the creation of a large tree with many files) was slow with
    FAT32, while XFS behaved very well, and ReiserFS and Ext2FS were only so-so (both took about the same time). If in the previous
    test my attention was called to the enormous variance in the time of copying large files with ReiserFS, here my attention was called
    to the consistency of the results of XFS (identical in all three experiments). Not that this is a promotion of XFS, but this is the
    impression the tests have left me with. O:-)

    Any comments are welcome ;-)

  • Apparently, he never changed the default number of clients in PostgreSQL. It defaults to 32. That probably filled up in the first 30 seconds.
  • by Dragonmaster Lou (34532) on Thursday May 10, 2001 @09:14AM (#232249) Homepage
    I believe 1024 TB is a petabyte. 1024 PB is an exobyte, and 1024 EB is a yottabyte.
  • Little time or patience required, I can explain the delete behavior easily. (Not that I would label myself an FS hacker... distributed, maybe.)

    Modern inode-based filesystems scatter both files and metadata across the disk, to reduce average seek time and fragmentation problems (free blocks are always nearby when using cylinder groups, unless the disk is very very full). So deletion can involve visiting many areas of the disk, and also traversing the allocation tree for each file and adding its blocks back to the free list (or vector).

    MS-DOS FAT filesystems have most of the metadata tightly clustered in the two file allocation tables... only subdirectories containing filenames, some metadata, and pointers into the FAT are stored outside. (32 bytes each.) So for FAT you have to seek, read, and write once (usually) for each subdirectory, but for all the normal files you just make one pass on each FAT, which probably involves no seeking, since they fit entirely within a single cylinder on modern disks.

    DOS just made this process seem excruciatingly slow because you were typically using it without caching and on floppies. FAT-based filesystems are a bit of a naive approach, and their average-case performance, especially under load, can be quite poor if you can't cache the entire FAT in memory. But they aren't all that awful.
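
    The "one pass on the FAT" idea can be illustrated with a toy model. This is a simplified sketch, not real driver code: the table is a plain Python list, the names `FREE`, `END_OF_CHAIN` and `free_chain` are made up for the example, and the real on-disk format uses different marker values.

```python
FREE = 0            # marker for an unused cluster
END_OF_CHAIN = -1   # marker for the last cluster of a file (simplified)


def free_chain(fat, first_cluster):
    """Release every cluster of one file by walking its chain in the table.

    `fat[i]` holds the number of the cluster that follows cluster `i`.
    Clusters 0 and 1 are treated as reserved, as in real FAT volumes, so a
    valid chain never passes through them. Returns the clusters released.
    """
    released = 0
    cluster = first_cluster
    while cluster != END_OF_CHAIN:
        next_cluster = fat[cluster]
        fat[cluster] = FREE
        released += 1
        cluster = next_cluster
    return released


if __name__ == "__main__":
    # Toy table with 8 clusters; one file occupies clusters 2 -> 5 -> 3.
    fat = [FREE] * 8
    fat[2], fat[5], fat[3] = 5, 3, END_OF_CHAIN
    print(free_chain(fat, 2), fat)
```

    Every update lands in the same small table, which is the property the explanation above relies on: no per-file allocation tree has to be visited elsewhere on the disk.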

  • Hmm... I don't know exactly why metadata operations should be slow on XFS, but my understanding is that XFS is primarily useful for:

    • Journaling, which depending on how it is implemented, would make exactly when metadata writes occur a lot less important. So you may just be seeing the forced sync of 5-second-old dirty metadata from bdflush fouling up the performance. Perhaps XFS was designed with a much looser disk/memory binding in mind.
    • Very high throughput sequential I/O on very large files on very large disks/RAIDs (think video editing), which might involve optimizations or tradeoffs that are not synergistic with low-latency metadata operations.

    Just my thoughts....

  • by LordXarph (38837) on Thursday May 10, 2001 @09:02AM (#232252) Homepage
    fat32 is an interesting control, but in an ideal benchmark, ntfs would have been used, as it is designed closer to other filesystems, as opposed to fat32, which is more like Baby's First Filesystem(tm).

    Though I will grant that NTFS would have been hard or impossible to benchmark in this test, given the lack of robust drivers for it.

    -Lx?
  • FAT32 has almost zero relevance here because if we were all using Win2k or NT we would be using NTFS just like Windows servers do. Why on god's green earth this test didn't test against NTFS is completely beyond me, other than to have a weaker MS filesystem to poke fun at. Real objective, guys.
  • Maybe it's not legal to benchmark NTFS.
  • by sid crimson (46823) on Thursday May 10, 2001 @08:57AM (#232255)
    Maybe they should benchmark their DB system.

    ;-)

    -sid
  • It's not about how *much* space, but about how often. Space gets allocated much more often than space gets freed, and usually at more time-critical moments. Take, for example, a media system. If you're capturing a video (what else do you do on an XFS system?), the file will get enlarged several times during its lifetime. It will get deleted only once. Moreover, you care how quickly you can allocate space, since it affects your capture quality, but don't really care if deleting files is slower (within reason, of course). Or take log files. Log files grow many times, and are deleted only once. Also, as another poster pointed out, you usually delete log files when the system is lightly loaded and doing internal maintenance. There is almost no application that I can think of (save artificial benchmarks) where XFS's deletion performance would be more important than its file creation/enlargement performance.
  • by be-fan (61476) on Thursday May 10, 2001 @12:16PM (#232257)
    Yes, XFS does kick that much ass. Still, the deletion performance surprises me. Of course, the other speed things make up for it, but it's still a puzzle. If you don't know, rm -rf has historically been a slow operation on XFS.

    Here's my theory. Ext2 uses a bitmap to track free blocks, and I'm pretty sure ReiserFS does as well. Free block runs on XFS are managed by two B+trees, one keyed by address, one keyed by size. Thus, allocating space is very fast on XFS, and it is easy to keep things contiguous. However, inserting runs of blocks into both trees is a slower operation than simply clearing the appropriate bits. This would explain the difference between the file creation speed (extraction test) and file deletion speed (rm -rf test). If this is the case, I think it is quite a good tradeoff, given that space is allocated much more often.

    Of course, IANAXE (I am not an XFS engineer) so this is just my theory. I'd appreciate it if someone more informed about XFS could tell me the reason for the performance delta.
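
    To make the theory concrete, here is a toy comparison of the two bookkeeping styles. It is a sketch under loud assumptions: sorted Python lists stand in for the two B+trees, neighbouring free runs are not merged, and the class names `BitmapFreeSpace` and `ExtentFreeSpace` are invented for the example. It is not how ext2 or XFS are actually implemented.

```python
import bisect


class BitmapFreeSpace:
    """Free-space tracking with one flag per block (bitmap style)."""

    def __init__(self, total_blocks):
        self.used = [False] * total_blocks

    def free(self, start, length):
        # Freeing is just clearing the flags for the run of blocks.
        for block in range(start, start + length):
            self.used[block] = False


class ExtentFreeSpace:
    """Free-space tracking with two indexes over free runs (extents).

    Sorted lists stand in for the two B+trees; this is a simplification.
    """

    def __init__(self):
        self.by_start = []  # (start, length), sorted by start address
        self.by_size = []   # (length, start), sorted by run length

    def free(self, start, length):
        # Every freed run has to be inserted into both indexes.
        bisect.insort(self.by_start, (start, length))
        bisect.insort(self.by_size, (length, start))

    def allocate(self, length):
        # Best fit: the smallest free run that is at least `length` long.
        position = bisect.bisect_left(self.by_size, (length, 0))
        if position == len(self.by_size):
            return None
        run_length, start = self.by_size.pop(position)
        self.by_start.remove((start, run_length))
        if run_length > length:
            # Give the unused tail back as a smaller free run.
            self.free(start + length, run_length - length)
        return start
```

    In this model a contiguous run of the right size is found with a single lookup in the size index, while each free has to update two ordered structures, which matches the direction of the tradeoff described above.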
  • I just did some similar benchmarks on XFS vs. ReiserFS with and without the notail option, and ext2 (for reference -- I wanted to switch to a journaled filesystem).

    My tests were 1) copy a several gigabyte file tree (/home) to a temp filesystem, 2) run du on the tree, 3) run Bonnie++, 4) build a kernel, and 5) rm -rf the whole tree.

    My results were that ext2 and xfs had high read/write throughput, reiserfs-notail was slightly lower, and reiserfs without notail was a fair bit slower.

    All metadata intensive operations (rm -rf, du, bonnie++ small files, etc) were blazing fast on ReiserFS, slow as molasses on xfs, and fast on ext2, but scaling poorly to high numbers of files.

    xfs could do slightly more random seeks than ext2, which in turn beat out Reiserfs by about the same amount.

    reiserfs doesn't have an fsck -- at least the one I have is a noop.

    I switched to reiserfs-notail, and was promptly annoyed by the lack of fsck, but otherwise it is running well. I ran a pair of make -j6's, some directory copies, and an updatedb, then hit the power and it is running fine.

    My personal preference would be to have ext3 in 2.4 along with Daniel Phillips' hash-tree directory patch stabilized--I think that would smoke reiserfs and still give me journaling.
  • Did you look through the mailing list for ReiserFS?

    If you had, you'd have seen that Reiserfs is journaling (Well, actually log structured)

    It reserves 32 MB for the log... Ergo, 32 MB of missing space. If I remember right, you can decrease this when formatting it.

  • 1 kilobyte = 1024 bytes
    1 megabyte = 1024 kilobytes
    1 gigabyte = 1024 megabytes
    1 terabyte = 1024 gigabytes
    1 petabyte = 1024 terabytes
    1 exabyte = 1024 petabytes

    Data Powers of Ten [caltech.edu]
  • I think we all knew FAT sucks already. And besides, I've found the Linux vfat driver to be at least as good as the native Windows version. I'm really quite impressed with it, actually. Now, the filesystem itself is another matter...
  • You've got a good point. I wish, though, that they would have included BFS in the benchmarking. I'd have been interested in seeing how it compares to ReiserFS (which I've heard is also blazingly fast). I much prefer BeOS over Linux for a desktop OS (in large part because of the excellent filesystem), but given the state that Be Inc.'s in right now, I'm looking for a Linux filesystem that can even begin to compare with the speed I get with BFS (of course, I'm not too fond of the RAM-hogging X Windowing System, either, but that's another story . . .).

    I'm just not sure, though, if there are any Linux BFS drivers that are past the compatibility stage of the NTFS drivers . . .
  • With current processor speeds, hundreds of thousands of lines of code optimizing hard drive access will pay off. It's just a damned bore to write.
  • I believe it's still too beta to test...
  • Actually, rm would have walked the entire tree doing unlink and rmdir on every file. rm doesn't know that it can just rmdir the root directory because the filesystem is FAT32. Actually, the kernel would probably not let it, seeing as the VFS would probably require the directory to be empty before removing it anyway, regardless of the underlying filesystem. That's why in FAT32 it took 6.7 seconds, because it was writing to the FAT a number of times.
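
    The walk described above looks roughly like this in code. It is a minimal sketch, not the source of rm: `remove_tree` is a hypothetical name, and it only uses `os.walk`, `os.unlink` and `os.rmdir` from the Python standard library.

```python
import os


def remove_tree(root):
    """Delete a directory tree the way `rm -rf` has to: entry by entry.

    Returns (entries_unlinked, directories_removed).
    """
    unlinked = directories = 0
    # topdown=False yields the deepest directories first, so each directory
    # is already empty by the time rmdir is called on it.
    for current, subdirs, filenames in os.walk(root, topdown=False):
        for name in filenames:
            os.unlink(os.path.join(current, name))
            unlinked += 1
        for name in subdirs:
            # Symlinks to directories are listed here but never descended
            # into, so they must be unlinked like files.
            path = os.path.join(current, name)
            if os.path.islink(path):
                os.unlink(path)
                unlinked += 1
        os.rmdir(current)
        directories += 1
    return unlinked, directories
```

    Each `unlink` and `rmdir` is a separate request to the filesystem, so the cost of deleting a source tree scales with the number of entries whatever the on-disk format is.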
  • The other problem being that if the Fat32 test was run under Linux as well, all it's showing is how good the Linux driver is, not the filesystem itself.

    That's true of all the filesystems being benchmarked. There's no way to separate the performance of the filesystem from the performance of the driver, unless there are two drivers for the same filesystem on the same OS. Benchmarking it on another OS would change way too many things to get any meaningful results.

    In fact, all of the filesystems except ReiserFS have drivers under other operating systems. (FAT32 is obvious. ext2, FreeBSD has it, I think there's an NT driver, etc. XFS was originally for IRIX.)

  • http://eies.njit.edu/~walsh/powers/newstd.html
  • Conclusions: the speed of XFS surprised me, because in the first test I did (a single experiment) it didn't seem so fast. For writing, reading, and deleting large files (sequentially) it seems to be the fastest. Although ReiserFS doesn't do badly at all, I had hoped it would beat XFS. Another thing caught my attention in this test: all except FAT32 take longer reading than writing. Can somebody explain that? Something related to /dev/null and /dev/zero?

    As for the tests with the kernel source code, copying the tar gave somewhat odd results: at first sight the fastest was XFS, then Ext2FS, then ReiserFS, and finally (what a surprise ]:-) FAT32. But that is looking at the averages. There is another curious thing (which I did not put in the table): in this test the variance with ReiserFS was very large, oscillating between 16.94 and 63.48 seconds. I cannot explain that. I repeated this test six times instead of three, and it did not stabilize. Running it three at a time (I had a script prepared to do that, and I ran it twice), the first copy took about 20 seconds, the second about 60, and the third about 30. That happened to me twice. Poltergeist?

    The extraction of the tar (which involves reading a large file together with creating a large tree with many files) was slow on FAT32, whereas XFS behaved more than well, and ReiserFS and Ext2FS were so-so (they take almost the same time). If in the previous test the enormous variance in the copy time of large files with ReiserFS caught my attention, here it was the consistency of the XFS results (identical in the three experiments). It is not that I am promoting XFS, but that is the impression the tests have left me with O:-)

    As for deletion, I thought ReiserFS would be the fastest, but XFS beat it at deleting a large file. However, XFS is by far the slowest at deleting a complex directory structure with many files, and FAT32 is the fastest. Can somebody explain it? Any comments are welcome ;-)
  • The other problem being that if the Fat32 test was run under Linux as well, all it's showing is how good the Linux driver is, not the filesystem itself.
  • Cat-related outages? I don't let my
    cat near my computer. But I do get the daily
    dodgy hardware random resets so I've switched
    to ReiserFS.

    BTW my cat's name is Linus and he is smart and
    he looks like a penguin
  • I believe that terabyte would be the prefix for that order of magnitude. 10^3=kilo-; 10^6=mega-; 10^9=giga-; 10^12=tera-; 10^15=peta-; 10^18=exa-. 1000 Gbyte=1000*10^9 byte=1 Tbyte. Pedantic enough for you? And, by the way, the prefix "exa-" has been around a while and is a perfectly acceptable SI prefix.
  • That sure is a yottabytes!

    HAR HAR

    NAS
  • You should just run everything through gzip, that way you don't waste as much space on the chatterboxes - their conversations are usually redundant and very compressible ;).

    Valley-speak compresses pretty well too, like, you know, whatever :).

    Cheerio,
    Link.
  • here [altavista.com] Have fun! Link.
  • This guy Dave (I forget his last name now)

    Probably Olsen.

    XFS has been used by sgi for their MIPS and Cray machines ever since 1984

    You mean 1994. And I can assure you that using XFS in 1994 was a thrill a minute ;-).

  • by kfg (145172) on Thursday May 10, 2001 @10:43AM (#232279)
    a 'Yottafied' filesystem.

    "Hmmmmmm,large it is. Strong is the file system in this one, hmmmmm."

    KFG
  • A couple of years ago, one of the international standards organisations (I don't remember which) got annoyed at the "mega=10^6" versus "mega=2^20" confusion, and came up with new prefixes. According to these:
    1000 bytes = 1 kilobyte
    1024 bytes = 1 kibibyte
    1000000 bytes = 1 megabyte
    1048576 bytes = 1 mebibyte
    and so on - gibibytes, tebibytes, pebibytes, exbibytes, each larger than the previous one by a factor of 1024.

    Even though the prefix names sound silly, I Am Not Making This Up.
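
    For anyone who wants to see how far the two conventions drift apart, here is a small sketch that tabulates them. The function name `prefix_table` is made up for the example; the prefix names are the standard decimal and binary ones.

```python
DECIMAL = ["kilo", "mega", "giga", "tera", "peta", "exa"]
BINARY = ["kibi", "mebi", "gibi", "tebi", "pebi", "exbi"]


def prefix_table():
    """Return (decimal name, decimal bytes, binary name, binary bytes) rows."""
    rows = []
    for power, (dec, bin_) in enumerate(zip(DECIMAL, BINARY), start=1):
        rows.append((dec + "byte", 1000 ** power, bin_ + "byte", 1024 ** power))
    return rows


if __name__ == "__main__":
    for dec_name, dec_size, bin_name, bin_size in prefix_table():
        # Percentage by which the binary unit exceeds the decimal one.
        gap = (bin_size / dec_size - 1) * 100
        print(f"{dec_name:>9} = {dec_size:>25,}   "
              f"{bin_name:>9} = {bin_size:>25,}   (+{gap:.1f}%)")
```

    The gap grows with every step: 2.4% at the kilo level, and more than 15% by the time you reach exa.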

  • by electricmonk (169355) on Thursday May 10, 2001 @10:09AM (#232284) Homepage
    ...that FAT32 actually beat ext2 on two tests, and even beat EVERYONE ELSE on the test that involved removing the kernel source tree? Could an FS hacker with some time and patience on his hands please explain what's going on?


    --

  • I found two things while our site is being slashdotted:

    1. The maxclients in php.ini doesn't work.

    2. We want MySQL !!!

    Right now, the author is doing more extensive tests. We will put them in a static page.

    Hope the server is still alive when I arrive home.

    Cheers, assassins ;-)

    --ricardo

  • It wasn't a formal benchmark, just a small test by a young student, and it wasn't released for Slashdot or any other publication. Just a game... Neither we nor the author wanted it to appear on a site like Slashdot.

    Sorry if it annoyed you. But don't complain to a guy who just likes to play around with Linux.

    --ricardo

  • by gallir (171727) on Thursday May 10, 2001 @10:36AM (#232287) Homepage
    And it's hard to keep Postgres running.

    You should tell me before you try to slashdot us ;-) Then I could have time to increase the PG backends.

    Hope we can keep it running now... (poor PIII 500MHz)

    --ricardo

  • by ortholattice (175065) on Thursday May 10, 2001 @09:27AM (#232288)
    Possibly off-topic, but what's the story on the Tux2 http://slashdot.org/features/00/10/13/2117258.shtml [slashdot.org] file system? It sounded like a great idea, now it seems even the links are broken.
  • by enrico_suave (179651) on Thursday May 10, 2001 @09:43AM (#232290) Homepage
    "I believe 1024 TB is a petabyte. 1024 PB is an exobyte, and 1024 EB os a yottabyte."

    Actually I believe 1024 EB is a lottabyte (*badoom tch!*) =P

    e.

  • by hhg (200613) on Thursday May 10, 2001 @08:57AM (#232293)
    That their webserver can not keep up with their filesystem anyway...
  • If this is the case, I think it is quite a good tradeoff, given that space is allocated much more often.

    Only if you've got an infinite drive...

    --
    BACKNEXTFINISHCANCEL

  • You're forgetting temporary files: what about all those ccxxxxxx.[iso] files that gcc creates--and then deletes--for each source file you compile, or the test programs autoconf's configure script generates for every test it does?

    I will agree that allocation has greater importance overall than deletion, though.

    --
    BACKNEXTFINISHCANCEL

  • You're sitting comfortably in front of the console of your server happily clickity-clicking away, the syslog quietly printing its timestamps every 5 minutes, load is normal, users are quiet and the world seems to be in place.

    Suddenly alarm rings, your syslog becomes all red and the security specialist comes storming through the door shouting "someone executed 'rm -rf /&' as root"...

    At this point, would you prefer to have a filesystem that is *sooo* fast deleting that it's already gone through /bin and moved on to /home/, or would you like to have one that has barely just started? :)

    At this point, I quit my job, and wander the wastes, warning about using "password" for the root password...

    Seriously - this is what full backups are for. Over the lifetime of the system, fast deletes by every user may compensate for the lost time of doing a full restore, rather than catching the global rm halfway through. Plus, a fast delete often means that nothing is really being deleted, file system references are just being dropped - there may be some easy recovery tool for this old worst-case-scenario.

  • by foobar104 (206452) on Thursday May 10, 2001 @09:22AM (#232297) Journal
    While I agree with the substance of your message, you've got a couple of facts wrong.

    First, the maximum filesystem size that XFS can handle is 18 exabytes. Since exabyte is also the name of a brand of tape drive, it's more common to hear of people talking of 18 million terabytes.

    (Aside: an interesting statistic found on this [caltech.edu] page says that as of 1995, 5 million terabytes was about enough data to store all words ever spoken by humans, ever. Cool.)

    Also, XFS was never used on Cray systems. XFS made its first appearance (if I remember correctly) on IRIX 5.3 back in 1994. Other than being off by four orders of magnitude in your sizing and by ten years in your dates, I think you're exactly right. ;-)
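
    The "18 exabytes" and "18 million terabytes" figures line up if you assume the limit comes from 64-bit byte addressing. That assumption is mine, not something stated in this thread, and `describe_limit` is just a made-up helper for the arithmetic:

```python
def describe_limit(bits=64):
    """Express a 2**bits byte limit in decimal exabytes and terabytes."""
    limit = 2 ** bits
    return {
        "bytes": limit,
        "exabytes": limit / 10 ** 18,                     # 1 EB = 10**18 bytes
        "million_terabytes": limit / 10 ** 12 / 10 ** 6,  # 1 TB = 10**12 bytes
    }


if __name__ == "__main__":
    for name, value in describe_limit().items():
        print(name, value)
```

    With decimal prefixes, 2^64 bytes is about 18.4 exabytes, which is the same thing as about 18.4 million terabytes.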

  • Isn't 1024 GB a terabyte? And 1024TB = 1 exobyte?
  • While speed is important, let's not lose sight of one of the other main benefits of a JFS - recovery time. Even if XFS is a little slower than ext2, I'd still use it given the recovery advantage. Yes, ReiserFS will do it too, though I believe XFS to be a more 'compliant' JFS in terms of b-trees, etc., and I believe (IMHO!!!) that as the drivers mature XFS will pull away from ReiserFS in terms of performance. Only time will tell.

    Needless to say I've got a pair of 61GB IDE's screaming for XFS once the 2.4.4 patches are out. Can't wait.

    --

  • by Scoria (264473) <slashmail.initialized@org> on Thursday May 10, 2001 @09:59AM (#232309) Homepage
    Mirror here [initialized.org].

    No idea why they're including something as outdated as FAT...

  • Neither of those will usually comprise the majority of usage, though.
  • I can assure you that whatever the quality of Reiserfs, it beats ext2fs hands down in the cases it's designed to handle particularly well: large (huge) directories and small files. The company I work for operates a mail server for about 1.1 million people, and we have quite a few million files, of an average size of less than one kilobyte (we're using maildir format for the accounts, and also store a lot of small XML objects as separate files for each user), and some of our directories have tens of thousands of entries.

    Running something like that on ext2fs would (apart from the agony of fsck'ing 800GB of storage) be completely hopeless.

    When it comes to setups with a few large files, though, the advantage isn't that great, and the numbers in the article make reasonable sense.

    Reiserfs should be your filesystem of choice if: a) you want to be able to put gazillions of files in a single directory when there is no logical hierarchy to the data (our tests indicate that Reiserfs handles shallow directory hierarchies with many files per directory faster than the opposite), and b) you want to be able to efficiently store a 100 byte file when grouping your data logically would give you file sizes of that magnitude, instead of grouping many pieces of data together.

    We've been running Reiserfs for well over a year now, and it works great. It's important that you keep up to date on bugfixes, though, and that you're very careful about your recovery procedures - reiserfsck is really a last resort, and you should always ensure you copy out as much as possible of any data on a damaged volume, and preferably take a raw copy of the entire volume, before running it. That said, I wouldn't hesitate to recommend Reiserfs to anyone with the specific needs I mentioned above.

  • Or if you never delete the full contents of your drive - which most people likely won't.

    It still holds for most cases: You'll copy a bunch of data to the drive over time, and some of it will be deleted, but most of the time a lot of the data you accumulate will be data that you'll keep around for the lifetime of the drive.

    Which equals less deletion than allocation...

    Perhaps not as significant as the original poster thinks, but his claim still holds true for most users.

    More significant, I'd assume, is that a lot of deletion often is maintenance that won't take place during peak usage of the system, and that often may just be left running in the background, to complete whenever it completes.

    Allocation is in many applications much more time critical, as allocation is mostly done by applications that the users of the system are much more likely to care about performance for.

  • I have not seen the FAT32 source for Linux, but I have for Windows and NT (as I wrote it). I can say that (as another poster has speculated) the quality of the source algorithms has a huge impact on the performance of the filesystem. It is quite surprising to see the write time be so slow for Linux, as quite frankly, FAT32 is so simple (no transactions) that it *should* be only slightly slower than optimal in medium to large file size cases. NTFS should certainly show slower on writes of larger files, and if it doesn't, you know you have a huge bug somewhere in the FAT32 driver. I don't know enough about the other FSs to comment, but I do suppose they should not be significantly faster.
  • XFS is optimised for dealing with streaming media, and so deals well with high IO and large files.

    JFS has been around for years under AIX. It's a well proven general purpose journalling filesystem.

    ReiserFS is the best established of the Linux journalling filesystems. It has several fairly innovative features and is more efficient than ext2 in terms of space utilisation. People are using it as their primary filesystem now, although it's still in development.

    EXT3 is (unsurprisingly) a development of EXT2. It lacks most of the pretty features of the other journalled filesystems, but has the significant advantage that you can turn EXT2 partitions into EXT3 (and vice versa) without any trouble at all.
  • What many Linux users forget (or don't know) is that FreeBSD (and the other BSDs) default to synchronous writes with drive write caching turned off. This is done to prevent losing large chunks of data in the event of a sudden system shutdown. Linux defaults to asynchronous writes with write caching turned on. The other piece of information missing from the post is that /usr/ports on FreeBSD is 110MB with over 17,000 directories and over 62,000 files. If you plan on removing that many directories and files on a regular basis (not common on most servers) then just turn on asynchronous writes.

