SGI open-sourcing XFS

Yun Ye writes "Finally, a journaling FS for Linux! Get the full story." Excellent; we'll have to ask the SGI people about it tomorrow. And who came up with the name change? Whatever the case, this will help Linux continue to crack the high-end market.
  • One nitpick... with standard Linux devices you can access 2G /blocks/, which comes out to 1TB.
  • From the white paper:
    7.1.2 Future Directions

    The ever evolving future of XFS shall include items such as:

    Access Control Lists
    Disk Quotas

    It looks like the white paper might not be completely up-to-date. Have Quotas and ACLs been implemented yet?

  • Well, now you know one sysadmin who has made serious use of NT's ACLs. They actually do come in handy if you know how to use them effectively.
    The u/g/o permissions are sufficient in most instances, but being able to set up one person as the "owner", a group with "rwx" perms, everyone else with "rx" permissions, and exclude one group from even looking in the directory? On a corporate LAN, that type of thing comes in handy.
  • I knew this was coming, but I thought they were going to delay it for one or two months.

    Well, they are:

    While SGI's announcement will come tomorrow, the software itself won't be available for download until the summer. SGI still is deciding how to structure the open source license...

    At least there will be a "hacker" reason to wait for the summer :) If they don't hurry, however, we will be getting the next ext2fs with journaling and B-tree code (faster directory/file data access) by then from Ted Ts'o and Stephen Tweedie, and XFS will be useful only for IRIX compatibility :)

    Licensing is still an issue. The GPL would be nice for them, IMHO, because of the protection it gives: no one can commercialize it without contributing their improvements back, and it's also a perfect fit for kernel code. If it's not GPL, it can't be used by directly linking into the kernel (a minor issue, but still inconvenient). Let's hope for the best.

    It's also not the "full" XFS, according to the report, it's got only the features the kernel hackers are planning to build into ext2.. really weird.

  • >5. No high availability clustering (Beowolf is cool but completely unrelated to this)

    HA clustering, load balancing, hot swapping of nodes, etc. using Linux boxen is something I've been looking into recently. I'm currently having wet dreams about hierarchical, infinitely-scalable Linux clusters, using SAN technology...

    aka Mad Systems Engineer-type person.
  • What happens if your raid card goes? Use software raid to mirror accross two controllers and you've eliminated a single point of failure.
    Buying a controller card with RAID built in is not typically how you'd set up a high-end server.

    Instead, you buy a box from someone like Baydel [] or Sun Storage Solutions []. This box has a bunch of disk drives in it and a couple of controller boards that have SCSI controllers to talk to the disks, RAID hardware, and a SCSI controller to talk to the host. This disk box deals with all of the RAID stuff (striping, parity, rebuilding failed disks on hot spares, etc.). The host sees the entire array as one or more disks.

    For the host, you just buy a couple of regular SCSI controllers and plug each one of these into one of the controllers on the disk array box. Then you just have to set it up to fail over to the second controller if the first one fails in some way (be it the host controller, the cable, or the controller in the disk array box).


  • by Anonymous Coward
    this will be interesting. if it is to be linked into the kernel as source - not as a binary - surely it will have to be GPLed... i guess that would prevent it being ported elsewhere (bsd and friends). if this is the case, then SGI will actually manage to ** effectively limit ** the release to linux. nobody else would dare to touch it for fear of license pollution.

    i could be wrong of course... i wonder how rms will react.
  • by ^BR ( 37824 )

    Sorry, but there is GPLed code in the *BSD kernels; in fact, the same FPU emulation as in Linux can be used in OpenBSD's kernel (I'm not 100% sure, but I think it's an available option for FreeBSD too).

    The only restriction is that the system must be able to run without GPLed code. In the previous example a BSD licenced emulator is an option (and it sucks compared to the GPLed one), and most people don't use a FPU emulator at all since they have a FPU.

    Please don't spread FUD. Even a GPLed XFS will be usable in the BSDs.

  • I guess they could release it under both GPL and some other (e.g., the Artistic Licence).
  • How about support for ACLs? That the other major weakness in the current Linux file system... the UNIX file permission system is extremely limited compared to NTFS or the Netware file system in this way.
  • The SGI Myths paper mentioned looks like it hasn't been touched in 2.5 years, which makes it embarrassingly outdated in places. For example, quoting Byte (Jan '97) labelling the O2 as a "Wintel killer." Well, in 1999, the writing on the wall is that Wintel, in the form of SGI's own VisualPC boxes, is going to kill the O2.

    Also, the 6.4 GB/sec figure quoted is Origin memory bandwidth, and has nothing to do with XFS or disk I/O rates. You can get really good throughput out of an Onyx/Origin, like in the hundreds of megabytes per second (I've witnessed that), if you've got lots and lots of striped disks, but gigabytes per second disk bandwidth isn't realistic.
  • This is internet gift economy at its best. I really hope that no-one will forget SGI for this generous move!

    Short overview of XFS at ge/ti_xfs.html

    All we need to get Linux into the data centers is a decent clustering/failover package...

    Compaq, are you listening? How about GPL'ing your old failover/clustering package (ASE)? It would be a proper gift to match SGI...

    (not that I don't appreciate or notice the release of the Compaq 64-bit math libraries... i really do... but SGI just raised the bar ;-)

    / Henning
  • by Anonymous Coward
    Well, my nominations for the other two would be:

    1. No large files support (files over 2Gb). This is required for any work with medium-large databases or digital video editing.

    2. Poor support for RAID - hot spares, hot swap, etc.

    3. Poor NFS performance (speed and locking)

    4. Poor desktop environment

    5. No high availability clustering (Beowolf is cool but completely unrelated to this)

    All these except (4) and possibly (5) are more or less requirements for use in medium to large enterprise situations.

    But while these are significant, let's not forget the cool things Linux has, like the /proc filesystem, which Sun has only just introduced in Solaris 7.
  • by Anonymous Coward
    It's not fully finished (have not tried it yet, but will be soon), but we now have NFS Version 3 support for linux.

  • >If you add automatic failure-detection to MOSIX, you have HA+speed. You'd only need to make sure state was mirrored across the cluster.

    Yeah, all you need to add high availability is make sure state is mirrored across the cluster and add failure detection. All you need to write a kernel is run a few lines of code through a compiler. All you need to build a car is weld a few bits of metal together.
  • While it would rock on NT, Microsoft wants an inordinate amount of money for their IFS development kit (I can't remember if it is one or five grand). I'd be quite impressed if someone actually took the time to port XFS to NT, given the cost associated with the development tools.

    Then again, IIRC, I have seen some sort of GPL'd IFS development kit. Anyone have any details on this?
  • Free/Net/OpenBSD will be able to add XFS support too.

    Rather depends on the License. I don't believe the BSD folks allow GPL software into their kernel.

  • when I read that NTFS could journal and ext2 could not, it made me feel really crummy...
    This is excellent news!!!
    I just hope XFS can match NTFS blow for blow...
    anyone know of a good comparison?
  • by Anonymous Coward
    So ACLs, properly implemented, are a really nice feature. They're certainly not properly implemented in NT, but if you look at Unix with a filesystem like AFS, they can make life a lot easier. If I'm working on a project for a class, I can set up a new group, stick all the people on that project into the group, and we can have shared files all of us can access. It's really something that's a must for an environment with thousands of users (such as my school network).
  • All the BSDs ship with plenty of GPLed code, just as all Linux distributions ship with plenty of BSD code; get real.

    The various BSDs generally only require that the GPLed code not be necessary to have a running system, e.g. the kernel scheduler couldn't be GPLed.
  • by Anonymous Coward
    Here is a good article that explains it simply in the context of NTFS's failings. i?vc=docid_9-129837.html
  • I stand corrected. However, Solaris 6 had 64 bit file support, and that runs on non-Ultra hardware. There are limitations involved in architecture, but Sun managed to work round them...

  • 4dwm would be nice, but I'm still waiting for fsn to go open source! :)
  • >Let's just say smart management doesn't give away one of company's few technological advantages for some vague future benefit.

    It seems to me that giving something to Linux is not necessarily giving away a technical advantage.

    Seriously, SGI is not fundamentally in the operating system kernel business, it's what they build on top of that. If the IRIX kernel was replaced by a Linux kernel (I have no idea how practical that is or how the feature sets compare), then SGI could dispense with a large part of their R&D expenses (maintaining the IRIX kernel) without giving away anything that they really sell to customers. Even if they don't make such a move, XFS for Linux makes dual boot situations nicer, and compatibility with Linux is a feature.

    I had hoped that BeOS would do this with the BFS.
  • The article says that the lack of a journaling filesystem is one of linux's three major weaknesses. What are the other two?
  • Yeah. As I understand it, the way many journaling systems work, the filesystem itself is a database. Just as a database uses a journal to record transactions in preparation for commits or rollbacks, so does the journaling filesystem. The journal holds the transactions (block writes) until they are written; then the journal entries for them are cleared. If there is corruption in the filesystem, the journal can be used to bring the filesystem back to sanity.

    I've never lost an ext2 filesystem; it does, however, take some time to fsck... a journaling capability would be nice to have. :)
  • by elyard ( 928 )
    Great! Another clueless comment from a failing wannabe pundit!
  • I don't know about support for ACLs, but I've never heard that discussed as a weakness. To the contrary, all I've heard is much complaining about how hard it is to manage ACLs and how the user and group model tends to be more practical. (In fact, I've never met an NT sysadmin who's made serious use of NT's ACLs.)

    Is this a byproduct of poor implementations, or am I missing something?
  • by matomira ( 2943 )
    After many years of walking in circles, SGI has finally found the new focus needed to grow beyond graphics. And they are showing great strategic savvy: it's much better to periodically release bits of very free source, instead of going `Open Source' in a big gulp. You get much more exposure that way.

    It's not risk-free, of course. For many people `just' 64-bit and journaling will be more than they need.

    I hope the license will only allow THEM to use this source to add XFS to Windows NT.
  • If they GPL it they'll have to GPL all derivative work, including all the work they did improving it for Irix. It's exactly for this reason that Netscape didn't use the GPL; they had other products using the same code and didn't want to open-source them.
  • by slim ( 1652 ) <> on Wednesday May 19, 1999 @11:22PM (#1885972) Homepage
    It's slightly worrisome that SGI haven't decided on a licence, even though the piece says it'll certainly be Open Source.
    If we need a journalled FS (I guess we do), then we need a GPLd journalled FS.
    How's this going to be implemented? If it's a kernel patch, then correct me if I'm wrong, but doesn't it *have* to be GPL?
    I guess if it's only a module, it can be any licence, right?
  • Linus doesn't have to approve *anything* -- unless it's going to be part of the kernel, and even then the GPL allows anyone to start distributing a forked version of the kernel, without consulting Linus, as long as that too is released under the GPL.
    See the top level post I am about to make, discussing this and the GPL.
  • > No large files support (files over 2Gb). This is
    > required for any work with medium-large
    > databases or digital video editing.

    Well, I could argue with its necessity in database support, where a raw partition would make more sense, but I concede it would be nice on the digital video front.

    This has started to be addressed in the early 2.3 kernels unless I'm mistaken.

    > Poor support for RAID - hot spares, hot swap,
    > etc.

    In software yes. But I don't see this as an obstacle, since hardware RAID options are cheap on the x86 platform.

    And if you want controller redundancy, set up two RAID 5's off the back of the box and use either MD or LVM to mirror those.

    >Poor NFS performance (speed and locking)

    I would actually say that speed is no longer the issue with the 2.2.x series of kernels; it's NFSv3 compatibility. The patches to make it compatible are out there and are going to be integrated RSN, from what I've heard.

    >Poor desktop environment

    Compared to *what*?

    CDE is a joke. Inflexible, bloated and ugly.

    4DWM is okay, but it uses X extensions not available on all platforms, and it doesn't really provide anything that outstanding.

    NT is okay. Its major strengths there are consistency and decent inter-application communication.

    Gnome is already more flexible and usable than CDE and 4DWM, has already integrated the drag-and-drop protocol and a CORBA infrastructure, and the release of its COM-like component model is right around the corner.

    And, as someone else has probably pointed out, this is irrelevant in terms of a server OS.

    >No high availability clustering (Beowolf is cool
    >but completely unrelated to this)

    MOSIX was just GPL'ed. This is a good foundation.

    Not to mention that NT's clustering is abysmal and SCO's barely exists, yet both of those have a presence in the enterprise.
  • by Anonymous Coward on Thursday May 20, 1999 @04:32AM (#1885977)
    For real high-end stuff:

    1. Poor large memory support. I'm not sure how the most recent kernels fare, but last I checked Linux only supports a maximum of 2GB of ram. This is probably one of the FIRST things which need to be fixed. I have heard that SGI is working on a patch to give 3.8GB on Intel machines... that sounds promising!

    2. No raw I/O support. This is used for large RDBMSes, for example. AFAIK Linus hates the idea, so Linux will probably never get it. Mind you, this is minor because there are ways around it.

    3. "fsync()" on large files is extremely inefficient. Again, this affects RDBMSes to the extreme. It is so bad that in some cases Windows NT is 30 or 40 times faster than Linux on equivalent hardware doing database inserts and updates. To witness this for yourself, write a quick program that opens a file and loops, appending a line and doing an fsync each time. Notice the slowdown as the file grows. That shouldn't happen.

    4. Poor performance under load. When the load on a Linux box goes up, context switch time increases a LOT. This is bad.

    And these are just off the top of my head...

    Mind you, I am not diminishing Linux; it is still my primary development platform of choice. However, as it stands, it doesn't make a good platform for really big single-computer tasks. I'm sure this will change in the future, but right now, as much as I hate to say it, NT and commercial Unix have the lead. I'd also love to hear corrections to my points above; they would be good news to tired eyes.

  • Well, I meant that the availability of MOSIX provides a LOT of the architecture needed for high availability clusters. MOSIX provides a framework to build HA on.
  • by Thunderbear ( 4257 ) on Wednesday May 19, 1999 @09:57PM (#1885979) Homepage
    Some time ago Silicon Graphics asked for comments from users regarding what we wanted, and I replied that opensourcing xfs was probably the most relevant thing they could do. Since apparently SGI is moving away from their Irix95, and they have a huge investment in xfs development, this appears to be a natural step towards having xfs on their Linux platform. This is really a good thing, and the only OpenSource initiative SGI has done so far which will actually matter to most users. Now 4dwm would be nice too - that is really a nice window manager.
  • It looks like free software has reached a critical mass. There is enough usable and successful source out there to make it more profitable to add to it than develop your own proprietary code.

    This could explode.

    This is one of the advantages of the GPL; with BSDish licenses they could just port it and include it and keep it as proprietary as before. If it should be possible to boot from XFS, it would have to be in the kernel, which forces it to be GPL.

    This is unfortunately also one of the disadvantages. SGI don't want Sun or MS to be able to use it and probably don't care at all for *BSD. So they will probably release it under GPL only, which means *BSD can't use it.
  • by Erik Corry ( 2020 ) on Thursday May 20, 1999 @12:40AM (#1885981)
    No large files support; xfs will fix this. Read the press release.

    Actually, there is a limitation in the VFS (virtual file system) layer that means right now no FS can have files larger than 2GB.

    It could well be that SGI have patches to address that, but that would be separate to the XFS code as such.

    This restriction only applies to 32 bit platforms such as x86, non-Ultra Sparc and PowerPC of course. On Alpha, Ultrapenguin and Merced there is/will be no such limitation.

  • Linux is still technically inferior to commercial unices. Just like windows NT. Truth hurts. SCO?

  • okay...

    OpenVault - opensource . check
    XFS - opensource . check

    DMF - opensource . Well, not yet

  • Or can any linux distribution modify their code to run on it?
    I know the article said Linus would have to approve but journalists are great for misinformation.
  • I don't really know much of the comparisons between NTFS and XFS, but here are my experiences:

    SGI said that XFS would never get corrupted, so fsck wasn't written to check it; then they realised that XFS can get corrupted, so they released xfscheck (or is it xfschk? can't remember). In any event, I've seen disk corruption that even xfscheck can't fix; it's not foolproof. That said, all file systems suffer from corruption.

    By the same token, M$ said that NTFS wouldn't fragment, and I can tell you that it does, and the performance hit is phenomenal. I believe NT 5^H2000 is going to include DiskKeeper by Execsoft for this reason.

    In short, I think that XFS is better in my experience. I'm very glad of this development, and hope the legal issues of licencing can be met so that it becomes part of linux proper.

  • I've been thinking about it for a while, but now, as soon as I make sure there is a driver for Linux (haven't looked yet), I will buy one of those swanky digital flat-panel monitors.
  • Wouldn't it be possible to do a perl-like thing and release it under the GPL along with another license? That way it could be a kernel patch on linux but other OS's could use it under whatever license they liked.

    Also future enhancements could be shared between the two licenses.

    Personally I hope this is what they do.
  • Journalling filesystems slower than non-journalling ones? Not XFS!

    XFS is actually faster than its non-journalling predecessor, EFS. I had both on the same hardware (a low-end Indy). EFS itself wasn't bad; I thought it was faster than Sun's ufs on otherwise comparable hardware with the apps I used back then.

    I'd never go back to an old-style FS if I had a choice. Glad to see a key piece of technology open sourced.
  • I use 4Dwm daily; its features pale in comparison to the linux offerings.

    I'd sure like to hear what features you're talking about, because I can't imagine how anyone could think this. SGI's desktop tools are the best I've seen on Unix, bar none.

    it has a major flaw in that many of the features available on the desktop are not available to a non-SGI X-Server

    I think by ``non-SGI X server'' you actually mean ``X servers that do not support the GLX server extension.'' Is it really a surprise that SGI, who invented OpenGL, would write a lot of code that takes advantage of it?

    It seems quite unfair to me to blame SGI for the failings of X servers shipped by other vendors.

    You can argue that SGI should have met the least-common-denominator of all other vendors with these tools, but if they did, they would be throwing away all the technology related to their area of expertise.

  • It's out there.

    It is not considered production code yet, but I haven't had a problem with it yet.

    Very similar to the HP-UX implementation of the Veritas volume manager, though IMO the Linux implementation is shaping up to have better tools.
  • by Per Abrahamsen ( 1397 ) on Thursday May 20, 1999 @01:42AM (#1885994) Homepage
    > SGI still is deciding how to structure the open
    > source license, the company said, though it is
    > sure to meet the requirements of the Open
    > Source Definition, a spokesman said.

    They need more than OSD compliance. If they want it to go into the Linux kernel, they need to use a GPL-compliant license. The main question is whether they want (or will accept) that the other proprietary Unixen can use it. If not, then the obvious choice is to GPL it. It will cause problems for the BSDs, but why should SGI care? They don't have the hype (momentum) of Linux.

    If they do want their file system to become a standard, they could LGPL it, or even use an X-like license. That would also make it easier to backport changes to Irix. Using the LGPL would make it more problematic for competitors to keep their changes proprietary.
  • "XFS has many advanced features, but SGI isn't releasing all of them as open source, Iams added. The open source version--which anyone will be able to see, modify, and distribute--is limited to 64-bit file support and the journaling system."

    Not to put a wet blanket on the party, but if it already doesn't include mentioned things like quota support, etc.. and they aren't releasing all of the features, and they don't even have a licence decided on yet...

    Looks like it will just be a part of the code that might be incorporated into ext2, and it will help some, if there are enough talented people to actually do it (I sure know I won't be doing it). And given that it will take time for them to decide on the licence, time to deal with the licence, and time to incorporate the code, it might be a long time before we even see the effects of this _part_ of XFS incorporated into an Open Source OS (and who says Linux will be the first to use it?).

  • Apparently, this file system does not have quota or ACL support? As per the SGI white-paper [] on the XFS filesystem, those features are on the TODO list.

    Is this something we will be able to put in given the source, and call the whole thing ext3fs, and release it in Linux 2.4?

    Good filesystems are some of the most difficult code in an operating system, so having an excellent base like XFS will certainly help. Thank you SGI!
  • I disagree. While its purpose is not HA, it still lets two machines be seen as one. If you add automatic failure-detection to MOSIX, you have HA+speed. You'd only need to make sure state was mirrored across the cluster.
  • I'm not sure off the top of my head whether XFS supports ACLs; I would tend to think so. e2fs has supported ACLs for a long time, but the Linux kernel does not, nor are there stable ACL-aware versions of the ACL utilities or of the e2fs utils. I did come across a group that was working on it, but they were only at the alpha stage and seemed to halt development after kernel 2.2.0pre9. Anyway, the issue of ACLs is one of the kernel and user-space programs, not the filesystem.
  • PS - Slashdot could use a full dictionary of terms from around the site... That'd rule..."

    ... Then you may want to check out What is it -
    a really good web dictionary.
  • by rjk ( 10763 )

    It wouldn't have to be GPL'd as such, just GPL-compatible - examples of such include the LGPL and the XFree86 copyright [].

    IANAL so don't bet your house on this. l-)

  • by Anonymous Coward
    Another thing that would help get Linux into the data center would be a real disk manager like the Veritas Volume Manager. Let's hope that they port it to Linux, or maybe even open-source it!

    Lack of a volume manager is the last thing keeping me from recommending Linux to the management at the company I work for.
  • Of course he does. Linus holds the copyright to much of the kernel. This is a condition that he set some time ago. No one else would have standing in a court to force a developer to release the source code to a binary-only module.
  • Not to mention Hans Reiser's fs called (guess what) Reiserfs [].
    It wouldn't have journaling at first, but it is planned.
  • by Salamander ( 33735 ) <> on Thursday May 20, 1999 @09:49AM (#1886008) Homepage Journal
    > thing that hasn't been pointed out yet: journaling file systems don't (immediately) overwrite a file when it is changed.
    >To elaborate, imagine a long tape that represents your hard drive...

    It hasn't been pointed out because it's not true. What you're talking about is a _log structured_ file system. That's a whole different thing. A _journaling_ file system looks after the metadata using a (duh) journal that records changes to directories, attributes, allocation maps etc. so they can be either rolled forward or rolled back upon reboot. However, a JFS (not necessarily IBM's JFS, though they were pioneers in this area) has pretty much the same ways of handling the actual file data as any other FS, including deferred writes etc.

    Any FS including a JFS or LSFS may support features for increased synchrony providing greater data integrity, such as fsync() or O_SYNC, but that's really a separate matter.
  • If I'm not mistaken, I believe journaling plays a role in file recovery. fsck'ing 10 gigs can take a while.

    Anyone else more versed on how this works?

  • by matomira ( 2943 ) on Wednesday May 19, 1999 @11:27PM (#1886010)
    If you want grio, you'll have to get IRIX. They are only releasing the journaling part, and it's limited to 64-bit. This is the perfect move. They give Linux something great, get extremely good PR, establish XFS as an industry standard, and still manage to keep a proprietary advantage to make you want to buy their machines for technical reasons. Really smart.

    Even such a `limited' version will be better than NTFS.

  • You don't beat someone by giving them technology they can add whatever they want to and turn into a closed non-standard; that's the way to beat yourself.

    What you get is OpenStandards and ClosedSource. I prefer OpenStandards and FreeSoftware.

    I say the BSD licence sucks because it doesn't encourage cooperation, but rather forks in 1000 different directions, with just a few free versions and tons of un-free ones. But if you want your code in every application except GPLed ones, BSD is the right way... Or if you want to get ripped off by practically everyone but get your college's name written on every ad, it could be good too...

    If Microsoft had a bad TCP/IP stack it would be perfect; they would be weak in networking and not the slightest threat to anyone...
  • If SGI wants to run their graphics programs under Linux, they need support for large (>2GB) files. Updating Linux's filesystem with their XFS code is one way to provide this. Linux currently doesn't support something they need, so it's time to add it in. They are giving away a bit of tech (which is unimportant to them) to get a bigger market for their main product, which is graphics programs. They want to sell licenses to their graphics programs. They may make some money on hardware, but their big bucks come in from selling high-end graphics capability. In order to make it into the Linux market they need certain capabilities from the underlying OS/environment. A filesystem that handles larger files is one of them.

    Geez guys, look for the self-interest angle here. If I want to do X, what do I need before I can make it happen?

  • Try Everything [] And, if a definition doesn't exist, you can write one yourself.

    It's a slash-cousin.

  • XFS is fast and it does support big files. Here are some more XFS resources: SGI Performance Comparisons [], a text about Myths about SGI dispelled [] (see Myth 8 [] for more information about what XFS does: 6.4 GB/sec sustained rates for a 16 processor Origin) and the text of the Sweeney Paper [].

    In the Sweeney Paper, read chapter 5. Allocation groups allow XFS concurrent activity, where current ext2 blocks the entire filesystem when a single process grows a file. Sparse large file support works well with 64-bit file sizes, while producing only a little overhead for many small files. Dynamic allocation of inodes, organized in B+-trees, allows for a dynamic number of inodes and for a very large number (64-bit again) of files per FS. Organizing directories as B+-trees makes searching very large directories very fast. The log structure makes crash recovery fast even for TB-sized filesystems.

    By delaying allocation and not assigning physical block numbers until the buffer cache is being flushed, XFS can cluster blocks in a file much better than ext2 can. Integrating this change into Linux will need some work on the caching subsystem, though.
  • XFS is so, so, so superior to NTFS that there is no comparison.

    XFS has high end features like guaranteed rate I/O for media streaming application, among others, that make NTFS look like a joke.
  • by Anonymous Coward
    They plan to use Linux to compete in the low-middle range of servers, as well as under Intel systems. Having XFS available for Linux will only help them, competing with NT in that market. Finally, we see a company actually contributing something besides a few cash investments in RedHat. With more moves like this, companies investing in Linux and the open source movement, what can Microsoft NT really do against both the open source movement AND companies with wary eyes at Microsoft's monopolistic practices? The fall of the evil Empire is not far away...
  • XFS has "Attributes" which can be attached to inodes. Apparently they're user defined. Maybe Linux use of XFS can reserve certain values for ACL or Capabilities use. Yes, we have to deal with verification...but that's already an issue with networked drives, clusters, and trusting any disk mount.
  • by Anonymous Coward
    A friend of mine spoke with Linus back at the 96 USENIX conference. He asked him if there were any plans for a new filesystem such as XFS to replace ext2 because of its limiting factors. Linus said that he was open to anyone porting XFS because he didn't have the time to address it since other portions of the kernel required more attention at that time.

    Just my two cents worth
  • by kiva ( 722 )
    then again... NTFS isn't that good...
  • On the other hand, the article said that the only components being opensourced were the 64-bit filesystem and the journaling support. I suspect this means that the volume manager won't be part of the package.

    Still, what's there is nothing to be sneezed at. Kudos to SGI.
  • If it can only be used as a module and can't be compiled into the kernel, you can't use the root partition with it. (Unless you go through all the mess of an initrd setup.) Arguably this is not a big issue since on most servers the root filesystem is distinct from /usr, /var, /home and the rest of the big stuff, so it's small and doesn't take much time to fsck. OTOH most of the configuration is on it (/etc) so losing anything there could be a problem.
  • If they only release it as GPL, it can't go into the BSD kernel. But nothing stops them from releasing it under multiple licences. But I highly doubt they'll go with anything BSD-like. That would be like telling Sun 'here's our XFS file system, please adopt and expand it for your own proprietary use'.
  • > 2. Poor support for RAID - hot spares, hot swap, etc.

    I run a RAID-1 system here at work with Linux 2.0.34. Oh wait, no, it's 2.2.1 now. Dual 4G Barracudas with a DPT 2044U and an RC4040 cache/RAID module. Works like a charm. It would be hot swap if I had the case/bays for it.

    The only thing I don't like is that it doesn't have Linux-native utilities. I have to use iBCS to shut off the alarms when they occur. (The last time was when I accidentally kicked the case and one drive's power connector came loose, about a month ago.)

    This isn't an UltraHighPerformance system, the drives/controllers are UltraNarrow (does that sound like 4-bit communications to anyone else? :-) but it works VERY well.
    Having "just" journaling alone would not actually
    be *that* important; logical volume management
    is more important -- fortunately XFS gives you both.

    Journaling gives you "just" faster startup times
    after crashes, because the filesystem should by
    definition never be in an inconsistent state
    (no fsck or equivalent required). However,
    for big disk farms the flexibility given
    by a logical volume manager is really important.
    (The flexibility is not bad for small setups, either.)

    Here's a nice white paper on XFS:
  • by Anonymous Coward on Thursday May 20, 1999 @05:11AM (#1886035)
    Just a thought - it hasn't crashed yet on my Amiga, and I'm using an early beta from ages ago, and I deliberately tested it by power cycling in the middle of lots of writes on several occasions.
    It's great to never have to use l:disk-validator (Amiga fsck) again (OK, it's in ROM, not l:, on all Amigas above 1.3, but hey...)

    The website has an exceptionally clear description of how the filesystem has been implemented. It's 64-bit, using the NSD (New Style Device) API.

    It's also free.

    here's the site : []

    In the interest of /. effect minimisation,
    the feature list is duplicated here:

    (Note that some of the things described are amiga-specific. The dos.library limitation, in particular, is irrelevant to linux, and probably to future amigas, too. The 2GB max single file size limitation arises from the amiga's incomplete transition to 64-bit APIs - CBM went bust as the NSD spec was released, and, once again, may not be relevant to a linux implementation)

    This page gives you an overview of what SFS is capable of. It will also give you an idea what features we are planning to add in the near future in planned features, and what features we are considering later on.

    Below you'll find a list of features which are already implemented in SFS.

    Fast reading of directories.
    Fast seeking, even in extremely large files.
    Blocksizes of 512 bytes up to 32768 bytes (32 kB) are supported.
    Supports large partitions. The limit is about 2000 GB, but it can be more depending on the blocksize.
    Support for partitions larger than 4 GB or partitions located (partially) beyond the 4 GB barrier on your drive. There is support for New Style Devices (NSD) which support 64 bit access, the 64-bit trackdisk commands and SCSI direct.
    The length of file and directory names is internally limited only by blocksize. Limitations in the dos.library however will reduce the effective length of file and directory names to about 100 characters.
    The size of a file in bytes is limited to slightly less than 4 GB. Because of limitations in dos.library we will however probably not allow files larger than 2 GB, to avoid potential problems.
    Modifying data on your disk is very safe. Even if your system is reset, crashes, or loses power, your disk will not be corrupted and will not require long validation procedures before you can use it again. In the worst case you will only lose the last few modifications made to the disk. See Safe writing for detailed information on how this works.
    To be able to ensure that your disk never gets corrupted we use an internal caching system which keeps track of modifications before writing them to disk. This cache has the additional benefit that creating and copying files can be a lot faster, especially if the drive used isn't very fast (ZIP & floppy drives for example).
    There is a built-in low-level read-ahead cache system which tries to speed up small disk accesses. This cache has as a primary purpose to speed up directory reading but also works very well to speed up files read by applications which use small buffers.
    Disk space is used very efficiently. See the Space efficiency page for a comparison between a few filesystems.
    Supports notification and Examine All.
    Supports Soft links (hard links are not supported for now).
    Using the SFSformat command you can format your SFS partition with case sensitive or case insensitive file and directory names. Default is case insensitive (like FFS).
    There is a special directory which contains the last few files which were deleted. See deldir.
    Planned features
    The list of planned features below are features which are either already in development or are very likely to be added to the filesystem in the near future.

    Multiuser support.
    Built-in background file and free space defragmenter. Already the filesystem is set up in such a way to allow for easy implementation of this feature without having to do extensive scanning of the disk before the defragmenter can begin. This means defragmenting can be done in the background and can be interrupted at any time (even by a reset, crash or power failure) without loss of data.
    Mirroring of important filesystem administration blocks to make the filesystem more robust.
    Features we are considering
    The features below are either features which are very application specific or not used very often. If there is enough demand for some of these features we will consider implementing them in the filesystem.

    Mirroring of complete partitions. Such a feature would not only ensure that all your data is very safe since everything is stored twice on two different drives, but it will also speed up multiple concurrent read accesses since both drives can be used to deliver data. This feature however normally is only used on mission critical systems (like file servers) and would be of little use on systems not equipped with high speed SCSI controllers.
    Support for striping. To put it simply, striping can be used to distribute data across multiple drives, which increases the total available bandwidth as each disk is used simultaneously to access part of the data. If you for example have 2 drives, then with striping all odd 64 kB blocks would be stored on drive 1, and all even 64 kB blocks on drive 2. A similar scheme is used with more than 2 drives. With striping there is also an option to use one of the drives as a parity drive. If one of the drives crashes or becomes unusable, the data on that drive can be reconstructed using the remaining drives, which ensures that your data is very safe. However, although it may seem that striping could speed up disk accesses by a factor of 2 or more, this is usually only the case when working with very large video streams or multi-user systems. Under normal conditions you will be hard pressed to find any speed gains at all.
    Support for hard links (soft links are already implemented).
    The ability to extend a partition without having to copy all your data and format the partition.
    New DOS packets. There are lots of ways to exploit the ability of a filesystem better than is possible at the moment. New packets are the key to this. For example, support for paths larger than 255 characters, live directories (directories which are updated in realtime), enforcing recordlocking and many more. This however must be a team effort and we'll need support from writers of important applications and people willing to build new interfaces to access these new abilities.
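    The striping layout described in the feature list above can be sketched in a few lines: with N drives and a fixed stripe unit, consecutive stripe-sized chunks round-robin across the drives. This is an illustrative model, not any real filesystem's code; the 64 kB unit and two-drive count are just the example values from the text.

    ```python
    STRIPE_SIZE = 64 * 1024  # 64 kB stripe unit, as in the example above
    NUM_DRIVES = 2

    def locate(byte_offset):
        """Map a byte offset in the striped volume to (drive, offset_on_drive)."""
        chunk = byte_offset // STRIPE_SIZE   # which stripe unit the byte falls in
        drive = chunk % NUM_DRIVES           # round-robin choice of drive
        row = chunk // NUM_DRIVES            # how many chunks precede it on that drive
        return drive, row * STRIPE_SIZE + byte_offset % STRIPE_SIZE

    # First 64 kB lands on drive 0, the next 64 kB on drive 1, then back
    # to drive 0 -- so a large sequential read keeps both drives busy.
    assert locate(0)[0] == 0
    assert locate(STRIPE_SIZE)[0] == 1
    assert locate(2 * STRIPE_SIZE) == (0, STRIPE_SIZE)
    ```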

  • There have been a few good answers to this question in this thread, but there's one thing that hasn't been pointed out yet: journaling file systems don't (immediately) overwrite a file when it is changed.

    To elaborate, imagine a long tape that represents your hard drive. The tape is written from left to right. When a file is changed, the new version is appended to the end of the existing data, while the old version remains "untouched" farther to the left. When the kernel has finished updating all pending file writes, it can write a "checkpoint" at the end of the existing data. Essentially the checkpoint says "everything up to this point is kosher." If the disk gets really full, then the filesystem can double back and overwrite the really old data at the beginning of the tape.

    Now, let's see how this works to help recover from crashes; say the computer crashes as it's writing out a file to disk. In a conventional filesystem, a lot of things could go wrong: it could have been overwriting the old data, but finished only half of the job. Then, at best, you've got a corrupted file with a hybrid of new data and old data. The file allocation table may not have been updated, so the file may be completely lost. It's a bad situation.

    Conversely, if the crash happened with a JFS, the computer would run an "fsck" and look for the last checkpoint. It's guaranteed that all data preceding the checkpoint has integrity. Then the filesystem would just work from that checkpoint and ignore any non-checkpointed data. This can still lead to some data loss, but never to filesystem corruption.

    Of course, this is a simplified account, and there are implementation details. But that's the gist of it.
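    The checkpoint idea above can be shown as a minimal sketch: the log is an append-only list of records, and recovery keeps only what precedes the last checkpoint. This is purely illustrative (not XFS's actual on-disk format), and all the names are made up for the example.

    ```python
    # Append-only log, modeling the "tape" from the explanation above.
    CHECKPOINT = ("CHECKPOINT",)
    log = []

    def write(path, data):
        log.append(("WRITE", path, data))

    def checkpoint():
        log.append(CHECKPOINT)

    def recover():
        """Rebuild file state from the log, ignoring records after the last checkpoint."""
        last_cp = max((i for i, rec in enumerate(log) if rec == CHECKPOINT),
                      default=-1)
        state = {}
        for rec in log[:last_cp + 1]:
            if rec[0] == "WRITE":
                _, path, data = rec
                state[path] = data
        return state

    write("/etc/motd", "hello")
    checkpoint()
    write("/etc/motd", "half-writ")  # crash hit before the next checkpoint
    # Recovery ignores the un-checkpointed record: some data is lost,
    # but the filesystem is never corrupt.
    assert recover() == {"/etc/motd": "hello"}
    ```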

    Nope, they could have a GPL'ed version and a proprietary version if they wish. However, they would need permission to include _others'_ improvements to the free version in the proprietary version.

    Actually, they need such a permission in any case. But in practice it will be easier to get if they use a license that also allows others to make their own proprietary versions. It will then tend to become the "default" license for all modifications.

    SGI is back on its feet!

    Let's all (day traders) go and buy a decent amount of their shares ...

    ... Sun, I can only say I feel sorry for you being so short-sighted! But you still have time to back up, have McNealy eat his words, and join the FORCE. Otherwise I feel you will become the next Darth Vader after MS falls down; and guess what ... The FORCE is with US!

    Linux ... what dreams may become!

  • SGI could choose to use the GPL plus alternate licensing, and still get back improvements from others. This can be done either by getting assignments from contributors (if they are willing) or recoding the changes in a different way. Regardless of what licensing is chosen, getting assignments (as FSF and the egcs project do) is probably the safest thing to do, as I expect that within the next year, Microsoft or someone they put up to it will attempt to sue some high-profile open source project for theft of code (just find some contributor who didn't have the right to contribute, because of an employment contract or something), thus spreading a piracy taint over the whole movement.

    It should be emphasized that at the least, SGI would be shooting themselves in the foot if they choose a GPL-incompatible license such as an NPL-like license. The reason is that this would force all Linux distributors to use their filesystem only as a module, which would be inconvenient. If SGI's work requires changes in the kernel itself, then it wouldn't even be valid to use it as a Linux kernel module if it's not under a GPL-compatible license.

    SGI could use a BSD-like license (without the advertising clause), which would permit both BSD and Linux to use the code. They might not want to give that much away, though. You'll never beat Microsoft if you write code for them (yes, Microsoft networking has tons of Berkeley code in there, you can tell from the bug-compatibility).

    I hope that they either go in the GPL or the BSD direction, and don't try to do one of those one-sided NPL-like licenses that is becoming popular with companies (e.g. we can take your changes proprietary, but you have to distribute source).

  • Given this press release: []
    SGI and Veritas Form Strike Team to Investigate Development of Journaled File System Solution for Linux

    I'd guess a Logical Volume Manager would be part of what SGI contributes to the free software community.

  • > ...the OS itself has no software RAID facility.
    > SGI/IRIX achieves HA RAID using HARDWARE. Their
    > external RAIDs are made by Clariion and simply
    > look like a SCSI device to the OS, you can do
    > the same with Linux today.

    If you'd rather do software RAID, then you have never worked in an enterprise environment where speed and reliability are a concern. Why somebody would use software RAID when hardware devices are available at low cost COMPLETELY blows my mind.


  • They used to be the state of the art bad-ass workstations.

    And yet, they still are. I'm sorry, but SGI's closest gfx workstation competitor remains Intergraph. (Sun? Gimme a break. Maybe when Sun actually offers something in the same league as the original Onyx I'll reconsider. Sun sells boring-flavored business workstations.)

    If you think SGIs sell poorly, perhaps you should check on how Intergraph's sales are doing.

    I don't know why SGI machines don't sell better. I still think they offer better price/performance for graphics (and server and computational performance) than anything else out there.

    I, as a previous SGI fan, felt isolated by the jump to NT, and I wasn't alone. Their NT workstations just didn't sell as well as they wanted, so now they're jumping onto the Linux bandwagon.

    They haven't even been out a year. Give 'em a chance.

    And how do you know anyway? Facts and figs, please.

    And, why should anyone have felt *isolated* by SGI's adoption (note the word adoption instead of *jumped*) of NT? I don't get it. Maybe it's just because I didn't care.
  • I wonder whether we'll ever see the day when billionaire philanthropists buy the rights to successful commercial software and then turn around and release it under the GPL in the interests of humanity.

  • as i recall, someone out there already has a volume manager for linux that works almost exactly like hp's volume manager...

    anyone have more details on its location, etc?

  • Let's not forget that Linux won't be the only OS able to take advantage of this code release. Free/Net/OpenBSD will be able to add XFS support too.

    I'd love to be able to run XFS on our apartment FreeBSD fileserver/firewall, as well as my Linux desktops. I am really looking forward to playing with this! Thank you SGI!

  • What you're talking about is a _log structured_ file system.
    Right. A few more notes on this: a log-structured filesystem is a great thing if you do a lot of small writes, especially on a lot of different files. (Such as, say, on a news server.) You end up doing far fewer seeks than you would on a more traditional filesystem. On the other hand, you want a lot of buffer cache if you're doing a fair number of reads.

    NetBSD 1.4 has a log-structured filesystem called LFS, though it may still have a bug here or there. And it really wants a cleaner that also consolidates (defragments) files as it cleans.


  • by sboss ( 13167 ) on Thursday May 20, 1999 @04:28AM (#1886056) Homepage
    Journaling filesystems keep a "redo log" of all activity (changes) to the filesystem. If the system dumps (crashes), the redo log is re-run at "fsck" time so the filesystem is complete again, and the fsck takes almost no time. I have a very large Sun machine at work with a terabyte of Oracle tablespace that would take almost an hour to boot (due to the basic fsck of the Oracle tablespace filesystems); if it had crashed, it was almost 2 hours or more. I moved the Oracle tablespace filesystems to a journaling filesystem and now it takes about 12-15 minutes to boot, maybe 20 minutes if I crash the box. Before the journaling filesystems, whenever it crashed (or I should say almost always when it crashed) I had to manually fix filesystems in maintenance mode. Once I moved the filesystems to a journaling fs, I have not had to do that again. If the journaling filesystem is stable and works like it is supposed to, I would move *all* my machines (including laptops, desktops, and servers) to it.
    sboss dot net
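    The redo-log replay described above can be sketched roughly like this: each transaction's changes are appended to the log, followed by a COMMIT marker, and recovery re-applies only fully committed transactions. So replay cost is proportional to the log size, not the filesystem size, which is why the fsck-time speedup is so dramatic. This is an illustrative model with made-up record names, not any real journal format.

    ```python
    def replay(log):
        """Rebuild state from a redo log, applying only committed transactions."""
        state, pending, committed = {}, {}, {}
        for rec in log:
            if rec[0] == "BEGIN":
                pending[rec[1]] = []          # open a new transaction
            elif rec[0] == "PUT":
                _, txn, key, value = rec
                pending[txn].append((key, value))
            elif rec[0] == "COMMIT":
                committed[rec[1]] = pending.pop(rec[1])
        for txn in sorted(committed):         # re-apply in transaction order
            for key, value in committed[txn]:
                state[key] = value
        return state

    log = [
        ("BEGIN", 1), ("PUT", 1, "inode42", "v1"), ("COMMIT", 1),
        ("BEGIN", 2), ("PUT", 2, "inode42", "v2"),  # crash: no COMMIT record
    ]
    # The half-finished transaction is simply discarded on recovery.
    assert replay(log) == {"inode42": "v1"}
    ```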
    >>No high availability clustering (Beowulf is cool but completely unrelated to this)
    >MOSIX was just GPL'ed. This is a good foundation.

    MOSIX is also cool, but also totally unrelated to HA.
  • If SGI so desired, they could actually release a binary-only module which implements XFS (complete with GRIO (*drool*) if they really wanted to), and do so without GPL or anything that resembled an Open Source license. (Ick.) With that in mind, it stands to reason that a filesystem module could be distributed under any license, since it could be built separately from the kernel and modprobe'd in.

    A different question is whether or not code that is part of the kernel source tree proper (eg. /usr/src/linux, also known as "everything in the linux-X.Y.ZZZ tarball") must be GPL'd, and I believe that said code does inherit the GPL from the rest of the kernel's GPL-ness. (Any license lawyers out there care to expound on this point for us?) If this is the case, then if SGI wants this to be part of the Linux kernel source tree, they'll have to jump on the GPL bandwagon.

    I think "kernel patches", if they're distributed separately from the official kernel and must be applied manually by the users of the patch, are also exempt from being GPL'd, although this is a significantly greyer area. Kernel patches distributed in this fashion act very similarly to programs that #include GPL'd header files. eg. If I #include a GPL'd foo.h in my program, but I don't distribute my code with said foo.h, I don't believe my code becomes GPL'd -- even if foo.h is under GPL rather than LGPL -- although I'm not 100% certain. Anyone care to clarify on that particular grey area?


  • by ^BR ( 37824 ) on Thursday May 20, 1999 @12:47AM (#1886064)

    Excerpt from http://www.OpenBSD.ORG/policy.html []

    The GNU Public License and licenses modeled on it impose the restriction that source code must be distributed or made available for all works that are derivatives of the GNU copyrighted code.

    While this may be a noble strategy in terms of software sharing, it is a condition that is typically unacceptable for commercial use of software. As a consequence, software bound by the GPL terms can not be included in the kernel or "runtime" of OpenBSD, though software subject to GPL terms may be included as development tools or as part of the system that are "optional" as long as such use does not result in OpenBSD as a whole becoming subject to the GPL terms.

    As an example, some ports include GNU Floating Point Emulation - this is optional and the system can be built without it or with an alternative emulation package. Another example is the use of GCC and other GNU tools in the OpenBSD tool chain - it is quite possible to distribute a system for many applications without a tool chain, or the distributor can choose to include a tool chain as an optional bundle which conforms to the GPL terms.

    So a GPL'd part only has to be optional, and XFS qualifies.

  • by Anonymous Coward
    There have been some excellent White Papers on
    XFS over the years I recall at both USENIX and
    the LISA mtgs. Might want to browse them.
    There was one in 95 about XFS and guaranteed IO;

    I KNOW there was one in Jan.96 in San Diego as I
    have a copy still. Titled "Scalability in the XFS File System" sorry no link :(

    Try this, it looks recent:

    Check em out!
  • by Anonymous Coward

    Yet SGI/IRIX is considered an Enterprise OS:

    2. Poor support for RAID - hot spares, hot swap, etc.

    I manage several Origin 2000s with Fibre Channel and SCSI RAIDs. While the IRIX OS does support powering down a SCSI chain to swap out a drive, the OS itself has no software RAID facility.

    SGI/IRIX achieves HA RAID using HARDWARE. Their external RAIDs are made by Clariion and simply look like a SCSI device to the OS; you can do the same with Linux today.

    3. Poor NFS performance (speed and locking)

    While SGI (almost) properly implements NFS3, good luck if you are serving/sharing files with anything other than a Genuine SGI. They've screwed with the Sun implementation enough that it really pays to just stick with SGI for all your workstation/server needs.

    5. No high availability clustering (Beowulf is cool but completely unrelated to this)

    While there is an expensive, fun-to-set-up FailSafe IRIX available, it is limited to TWO nodes and is not really true clustering (kind of like the original NT implementation).

    4. Poor desktop environment
    I use 4Dwm daily; its features pale in comparison to the Linux offerings. While 4Dwm is "ok", it has a major flaw in that many of the features available on the desktop are not available to a non-SGI X server (such as Linux/eXceed). As servers such as the Origin 2000/Origin 200 ship headless (no display) and we interact with them using Linux/NT eXceed, this is a major shortcoming. I wonder if the SGI NT-eXceed combo is any better.
  • ... for the fs gurus

    and several million hours saved worldwide for
    fs non-gurus like myself. When can we burn the
    rescue disks?

    Let's hope it all comes together in an acceptable way. If/when it does, what a gift for humanity!! (or at least the
    short fat flippered version thereof)

  • by kris ( 824 ) <> on Wednesday May 19, 1999 @10:15PM (#1886073) Homepage
    If the license for XFS is at all sensible (i.e. a true Open Source license), this is the single most intelligent thing SGI could have done to score with the Open Source movement. Linux is in dire need of a journaling file system, and XFS is one of the very best of this flock.

    Their white paper [] on XFS explains how XFS is different from conventional file systems and what they did to make it fast with very large files as well as with many, many small files (SGI is not open-sourcing their GRIO capabilities, which together with RT scheduling would make Linux a serious multimedia contender).

    If you are a USENIX member, you will be able to download the Sweeney paper Scalability in the XFS File System [] from the USENIX server. It was published in the Spring 1996 proceedings of USENIX, so you may also read it in your university's library.
  • by 8Complex ( 10701 ) on Wednesday May 19, 1999 @11:03PM (#1886082)
    Forgive my ignorance here for I haven't learned much about file systems, but what is the difference between journaling and non-journaling file systems?


    PS - Slashdot could use a full dictionary of terms from around the site... That'd rule...
  • by kkreamer ( 27852 ) on Thursday May 20, 1999 @12:02AM (#1886085)
    Disclaimer: I'm not a fs guru, but...

    I think a journaling filesystem is one that keeps a journal of what, when, and where it is writing data, between the time it schedules a write and the time the write actually happens. That way, if the system crashes, the filesystem can look at the journal and continue where it left off, not losing any data. A non-journaling filesystem schedules stuff to be written, then acts as if the write has happened. This works OK until the system crashes. If a write was pending when the system crashed, that data is lost.

  • Since I have no knowledge of TOPS, I'd like to see someone expand on your comment. It seems to me that if a file has an ACL, that's a file property, and would need to be stored in the file system.

    While the UNIX people pooh-pooh ACLs, Linux's main competitors in the x86 market (NT, NetWare) have them, so they are important for migration purposes at the very least. (Plus, I fail to see how "other" works if you are in a large NDS or other directory tree, but maybe I'm missing something.)

    Journaling is not about not losing data (that would require no-delay writes to disk, etc.), but about being able to return to a consistent state fast. You may lose several seconds of updates (or more), but you are then guaranteed to have all your files as they were supposed to be (from the system's and applications' point of view) at that time, not a "hopefully correctly fixed fs" like what most *fsck tools can give you.

    Well, hardware RAID is important, if only because there are tons of x86 server boxes out there that have hardware RAID cards in them, including many 486/586 Compaq boxes that are being decommissioned. These would make perfect Linux boxes.

    Note that people go for hardware RAID on x86 even though WinNT has workable software RAID. So both are important.
  • Things are improving...

    1. Large files: does xfs support _big_ files? Is xfs what SGI uses on their BigIron Origin servers (the fastest WinNT servers on the face of the planet, thanks to Samba ;)

    3. NFS: Cats and kids -- NFS sucks. As in, "will never be worth a damn." Use something good. Coda, anyone? Actually, the Coda guy at CMU just wrote a distributed file system (in PERL!!) that runs on top of whatever FS the OS is already using -- it would be very nice on top of xfs. It's a bit easier to maintain and add functionality to, since it's ~8k lines to Coda's 500k. It's called InterMezzo.

    4. DEs: GNOME & KDE blow 4Dwm, Win95, Mac, and everything else away. 'Nuff said.

    5. HA clustering: so how good is TurboLinux's? They must be convinced to open-source it. Or we'll have to rewrite it :)
