EXT4 Is Coming

Posted by CowboyNeal on Saturday July 01, 2006 @09:31AM from the we're-going-five-blades dept.

ah admin writes "A series of patches was proposed on the Linux kernel mailing list earlier by a team of engineers from Red Hat, ClusterFS, IBM and Bull to extend the Ext3 filesystem with support for very large filesystems. After a long-winded discussion, the developers came forward with a plan to roll these changes into a new version: Ext4."

Yes but (Score:5, Interesting)

by Anonymous Coward writes: on Saturday July 01, 2006 @09:40AM (#15642354)

Yes, but will it be enough if you have the energy to boil all the oceans?

Interesting bit from wiki/ZFS:
ZFS is a 128-bit file system, which means it can provide 16 billion billion times the capacity of current 64-bit systems. The limitations of ZFS are designed to be so large that they will never be encountered in any practical operation. When contemplating the capacity of this system, Bonwick stated "Populating 128-bit file systems would exceed the quantum limits of earth-based storage. You couldn't fill a 128-bit storage pool without boiling the oceans."

In reply to a question about filling up the ZFS without boiling the ocean, Jeff Bonwick, an engineer at Sun Microsystems who led the team in developing ZFS for Solaris, offered this answer:

"Although we'd all like Moore's Law to continue forever, quantum mechanics imposes some fundamental limits on the computation rate and information capacity of any physical device. In particular, it has been shown that 1 kilogram of matter confined to 1 liter of space can perform at most 10^51 operations per second on at most 10^31 bits of information [see Seth Lloyd, "Ultimate physical limits to computation." Nature 406, 1047-1054 (2000)]. A fully-populated 128-bit storage pool would contain 2^128 blocks (nibbles) = 2^137 bytes = 2^140 bits; therefore the minimum mass required to hold the bits would be (2^140 bits) / (10^31 bits/kg) = 136 billion kg.

To operate at the 10^31 bits/kg limit, however, the entire mass of the computer must be in the form of pure energy. By E=mc^2, the rest energy of 136 billion kg is 1.2x10^28 J. The mass of the oceans is about 1.4x10^21 kg. It takes about 4,000 J to raise the temperature of 1 kg of water by 1 degree Celsius, and thus about 400,000 J to heat 1 kg of water from freezing to boiling. The latent heat of vaporization adds another 2 million J/kg. Thus the energy required to boil the oceans is about 2.4x10^6 J/kg * 1.4x10^21 kg = 3.4x10^27 J. Thus, fully populating a 128-bit storage pool would, literally, require more energy than boiling the oceans."
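For anyone who wants to check the arithmetic in the quote, here is a rough sketch in Python. The constants are the ones quoted above; the value for the speed of light and all the variable names are my own additions, so treat this as a sanity check and not as part of Bonwick's argument.

```python
# Rough check of the numbers in the Bonwick quote.
# Every constant except C comes from the quote itself.
BITS_PER_KG = 1e31        # information limit per kilogram, as quoted
C = 2.998e8               # speed of light in m/s (assumed, not in the quote)
OCEAN_MASS_KG = 1.4e21    # mass of the oceans, as quoted
BOIL_J_PER_KG = 2.4e6     # heating plus vaporization per kg, as quoted

pool_bits = 2 ** 140                     # fully populated pool, as quoted
pool_mass_kg = pool_bits / BITS_PER_KG   # minimum mass needed to hold the bits
pool_energy_j = pool_mass_kg * C ** 2    # rest energy, E = mc^2
ocean_energy_j = BOIL_J_PER_KG * OCEAN_MASS_KG

print(f"pool mass:     {pool_mass_kg:.3g} kg")
print(f"pool energy:   {pool_energy_j:.3g} J")
print(f"boil oceans:   {ocean_energy_j:.3g} J")
print(f"pool > oceans: {pool_energy_j > ocean_energy_j}")
```

The mass comes out near 1.4x10^11 kg, slightly above the quoted 136 billion kg, but the comparison goes the same way: the rest energy of the pool is a few times the energy needed to boil the oceans.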

Modularizable filesystem (Score:2, Interesting)

by Square Snow Man ( 985909 ) writes: on Saturday July 01, 2006 @09:48AM (#15642371)

What about a modularizable filesystem that can be upgraded with modules for compression, encryption, larger file support, etc.? Is this impossible, or is it an unknown area for the Linux developers?

Re:Sounds like a good idea. (Score:2, Interesting)

by Anonymous Coward writes: on Saturday July 01, 2006 @10:00AM (#15642392)

It's BS that people think it should be considered stable. Apart from XFS under very heavy writes, I've never had more corruption than with Reiser4. It needs at least another year. ext3 on its own, though not awesome in all areas, hasn't lost me any data yet.

Why EXT4 ? (Score:4, Interesting)

by Anonymous Coward writes: on Saturday July 01, 2006 @10:36AM (#15642449)

Ext4 is an extension of ext3, much like ext3 is an extension of ext2. The plan is to ensure backwards compatibility and sanity for when things break, and with filesystems... things break.

There are many factors that influence filesystems, not just "how fast it can write", but rather.. how it breaks when it does.

While the fanboys of XFS, JFS, and ZFS may promise that their filesystems are faster, have no problems, are secure and will not eat your data, they simply are not as proven as ext2 and ext3.

Scream fanboys scream, someone will listen, but the problem is that these filesystems are not proven in the field, or in some circumstances even in the kernel itself.

Re:Why only 48 bits? (Score:5, Interesting)

by r00t ( 33219 ) writes: on Saturday July 01, 2006 @10:59AM (#15642499) Journal

With a block size of 32 kB (64 kB is expected to be supported soonish) the 48-bit numbers will take you 1 byte over the maximum file size that apps can support. There is no UNIX-like OS that lets an app handle files bigger than 2**63.
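To make the "1 byte over" claim concrete, here is a small sketch of the arithmetic. It assumes that the application-side limit is a signed 64-bit file offset, so the largest usable size is 2**63 - 1 bytes; the constant names are made up for this example.

```python
BLOCK_SIZE = 32 * 1024      # 32 kB blocks, as in the comment
BLOCK_NUMBER_BITS = 48      # width of the proposed block numbers
MAX_OFFSET = 2 ** 63 - 1    # largest signed 64-bit offset (assumed app limit)

# Number of addressable blocks times the size of each block.
max_fs_bytes = (2 ** BLOCK_NUMBER_BITS) * BLOCK_SIZE
overshoot = max_fs_bytes - MAX_OFFSET

print(max_fs_bytes == 2 ** 63)   # 2**48 blocks * 2**15 bytes = 2**63 bytes
print(overshoot)                 # bytes beyond what a signed 64-bit offset holds
```

With 64 kB blocks the same calculation gives 2**64 bytes, which is already past what such an offset can express.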

We'll need to adjust other things if filesystems ever get so huge. The whole design probably needs a rethink, but we can't do it now. We don't know what the future holds in terms of seek times, transfer rates, sector sizes, etc.

Re:define very large (Score:2, Interesting)

by zlogic ( 892404 ) writes: on Saturday July 01, 2006 @11:50AM (#15642639)

Though this may be needed in some rare applications, I don't see ext4 as something needed in the near future. As I understand it, the larger the maximum partition and file size, the more space the indexes will need (not to mention that speed will probably drop).
For example, if we have 20-bit indexes (2^20 clusters max) and use 4-kilobyte clusters, then to increase the maximum space we'll either have to add one bit to the indexes, which doubles the maximum space, or we'll have to increase the cluster size and have problems storing small files (remember the FAT16->FAT32 transition?)
ext4's limit is thousands of times larger than ext3's, which will probably mean that indexes will need a lot more space, which will be bad for 8TB volumes (and besides, no one would notice any benefits!)
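The trade-off in that example can be sketched in a few lines. This is only the arithmetic from the comment (index width times cluster size), not how ext3 or ext4 actually lay out their metadata, and the function names are invented for illustration.

```python
KIB = 1024

def max_volume_bytes(index_bits: int, cluster_bytes: int) -> int:
    """Largest volume addressable with index_bits-wide cluster numbers."""
    return (2 ** index_bits) * cluster_bytes

def slack_bytes(file_bytes: int, cluster_bytes: int) -> int:
    """Space wasted in the last, partly filled cluster of a file."""
    return (-file_bytes) % cluster_bytes

base = max_volume_bytes(20, 4 * KIB)            # 20-bit index, 4 KiB clusters
wider_index = max_volume_bytes(21, 4 * KIB)     # one more index bit
bigger_cluster = max_volume_bytes(20, 8 * KIB)  # same index, larger clusters

print(base // KIB ** 3, "GiB")                  # baseline capacity
print(wider_index // base, bigger_cluster // base)   # both options double it
print(slack_bytes(100, 4 * KIB), slack_bytes(100, 8 * KIB))  # waste for a 100-byte file
```

Both options double the capacity; the wider index costs one bit per index entry, while the larger cluster roughly doubles the space wasted by each small file.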

Re:Why EXT4 ? (Score:3, Interesting)

by Carewolf ( 581105 ) writes: on Saturday July 01, 2006 @11:53AM (#15642653) Homepage

In enterprise.. Exactly!

Note that servers with extensive mirroring and other hardware error-handling rarely need error-recovery from the filesystem. Filesystem errors happen on ordinary people's hard drives when they grow old, and ext* has a million times more experience in handling those than any enterprise FS.

What they really should do is (Score:1, Interesting)

by Anonymous Coward writes: on Saturday July 01, 2006 @01:10PM (#15642846)

In ext4 they should get rid of some legacy stuff to foster development and usage of new technologies. The users of legacy technologies could still use ext3, and it would be very nice for ext4 users. I'm talking mostly about dropping support for the old-style octal file access permissions system, making the ACL system the default, and enabling the metadata features by default.

The fact that nothing ever pressures the distribution builders into using anything new has led to a major slowdown in the development of Linux.

Re:Sounds like a good idea. (Score:3, Interesting)

by mnmn ( 145599 ) writes: on Saturday July 01, 2006 @05:05PM (#15643522) Homepage

Who cares? Linux has more than its fair share of filesystems, including XFS. I'm still wondering why XFS isn't used universally on desktop and server Linux installations everywhere. Is ext2/3 just 'traditional'?

Re:Well, how does a Honda Civic ... (Score:1, Interesting)

by Anonymous Coward writes: on Saturday July 01, 2006 @05:31PM (#15643591)

"ZFS is an exotic beast with a totally ridiculous maximum capacity and tons of advanced features that do not exist in any other Unix filesystem, but are only useful for Big Iron."

Actually, except for its highly advanced algorithms, ZFS code is very small and simple, and on top of that, ZFS is really nice in small desktop deployments, where its "big iron" features give it the ability to detect and automatically correct garbage being delivered by that cheap SATA drive.

In fact, having been ported (compiles, doesn't yet run) to Linux and in the process of being ported to OS X and FreeBSD, ZFS is on a pretty good track to becoming ubiquitous... which would be the exact opposite of exotic.

Re:Sounds like a good idea. (Score:3, Interesting)

by raxx7 ( 205260 ) writes: on Saturday July 01, 2006 @07:08PM (#15643834) Homepage

There are or were a few quirks.

First off the bat: you can't install the bootloader in an XFS partition, since XFS uses the first 512-byte block on the partition. Of course, most people install the bootloader in the MBR, but for some it's an issue.

GRUB had a bug with XFS. When you tried to use an XFS partition as /boot, you could corrupt XFS.

For a considerable period of time, ext3's code was more stable than XFS.

ext3 has an ordered data mode (which is the default). Other journaled file systems only support writeback mode. In general, ordered data mode doesn't provide any better guarantee of consistency than writeback mode, but it does matter in a few special cases, and those can make a substantial difference to a desktop user.

Typical annoying case:
- You're editing a file in your favorite text editor and you save it.
- The editor opens the file in overwrite mode, meaning the file is actually deleted and a new one is created (under Linux's default settings, the OS will commit the changes to the metadata in 5 seconds or less and the changes to the data in 30 seconds or less).
- The changes to the metadata are committed to disk.
- The system crashes!
When the system comes back up, the new file is there, but it's full of garbage.

With ext3's ordered data mode, the contents of the file would have been committed to disk before the associated changes to metadata. It's probable (but not assured!!) that after a crash you'll have either the old version or the new version of the file.
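An application can also protect itself without relying on the filesystem's journaling mode. A common approach, sketched below in Python, is to write the new contents to a temporary file, force them to disk, and only then rename the temporary file over the old one. This is not something the editors discussed here are known to do; `save_atomically` is a name made up for this example.

```python
import os
import tempfile

def save_atomically(path: str, data: bytes) -> None:
    """Replace path with data so that a crash leaves the old or the new file."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temporary file in the same directory so that the final
    # rename stays within one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())   # push the data to disk before renaming
        os.replace(tmp_path, path)   # swap the new file into place
    except BaseException:
        # Clean up the temporary file if anything went wrong before the swap.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
```

Calling `save_atomically("notes.txt", b"new contents")` never truncates the existing file in place, so the "file full of garbage" case above should not arise from the save itself. The sketch does not sync the containing directory and does not preserve the original file's permissions, which a real editor would probably want to handle.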

Re:Sounds like a good idea. (Score:3, Interesting)

by szap ( 201293 ) writes: on Sunday July 02, 2006 @12:29AM (#15644578)

Just a quick chime in, take it with a grain of salt. Some rambling thoughts.

I've just converted my main partition (non-/boot) on a notebook from XFS to reiser3, mainly because I work with huge svn working copies, and svn loves to keep small files around as well as create lots of small files (lock files, etc.) during routine svn work. xfs is just considerably slower than reiserfs for svn status, update, commit, and cleanup. Besides, reiser3's tail feature means svn's penchant for small files uses less space overall on my tiny notebook hard drive. Not sure if performance of reiser3 will degrade over time (I've been on xfs on this partition for longer than a year), but we'll see.

BTW, see http://www.debian-administration.org/articles/388 for comparison. My observations differ from theirs (operations on file tree). I do have a significantly larger number of files, and many of those are smaller than the default block size, so that might affect things.

On the server side, XFS just creams reiser3 and ext3 on multiple concurrent large random writes (postgresql). (IIRC: battery-backed SCSI RAID controller, tested with both RAID1+0 and RAID5, Linux 2.6.x, 6 x 15000RPM 132(?)GB HDD.) Read operations and single-thread sequential/random writes are too similar in performance across the various filesystems to separate them.

Another feature of XFS I used a lot (before converting to reiser3) is xfs_fsr, which defrags a mounted xfs filesystem. It's oddly buggy though: after some runs, some inodes tend to have max_extents corrupted (endian problem?). I'd recommend an xfs_repair after an xfs_fsr, which effectively makes xfs_fsr a utility for defragging *UN*mounted filesystems. So yeah, xfs is a tad unstable. I've only had one real corruption, though, and that was from killing the notebook power during some writes. Not sure if that's from the fs, or the hard disk misbehaving.

A real O/S filesystem needs defrag! (Score:2, Interesting)

by ArtStone ( 745847 ) writes: on Sunday July 02, 2006 @08:41AM (#15645433)

The main change and advantage described for the proposed ext4 is that a file's allocation is tracked via "extents" (a specified number of contiguous 2k blocks) rather than a chain of inode pointers (with up to 3 levels of indirection).
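As a rough illustration of the difference, the sketch below collapses a per-block list (what a pointer-based scheme stores) into (start, length) extents. It is a toy model only: it does not reflect the on-disk format in the proposed patches, and `blocks_to_extents` is a hypothetical helper.

```python
def blocks_to_extents(blocks):
    """Collapse an ordered list of block numbers into (start, length) runs."""
    extents = []
    for block in blocks:
        if extents and extents[-1][0] + extents[-1][1] == block:
            # The block directly follows the last run, so extend that run.
            start, length = extents[-1]
            extents[-1] = (start, length + 1)
        else:
            # Otherwise start a new run of length 1.
            extents.append((block, 1))
    return extents

# A file stored as one long run plus a short fragmented tail.
file_blocks = list(range(1000, 1512)) + [9000, 9001, 40]
print(len(file_blocks), "block pointers")
print(blocks_to_extents(file_blocks))
```

A mostly contiguous file needs only a handful of extents where the pointer scheme needs one entry per block, and the saving shrinks as the file becomes fragmented.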

This is based not only on the need for a larger maximum file system, but a recognition that there is significant performance advantage to reducing read/write head movement and initiating large reads from consecutive blocks that can take advantage of the high transfer rates of today's drives. (this assumes that the OS filesystem doesn't attempt/require that the entire disk drive be cached in RAM to get decent performance)

Except for "write once" files, over time this will cause files to become physically spread over the disk and the performance benefit is reduced, unless a process periodically consolidates the blocks back into a contiguous series of blocks (ignoring for the moment that on today's disk drives, blocks may be "spared" into place that are not really physically consecutive, but just logically appear to be)...

One of the "proofs" that *nix is superior to other O/Ss has been the absence of a need to "Defrag" the file system.

A commenter on the article also raises the question of why the "right" solution isn't to increase the 2k block size limit rather than rework the internals of the block pointers, and got the response that since the Linux kernel manages memory in 2k blocks, it is a nightmare in the kernel to support larger I/O transfers (although others here seem to indicate this is one of the solutions people have implemented).

Isn't "extents" a concept contained in NTFS? Has anyone looked into the patent implications of these proposed changes?

Re:fsck quality (Score:3, Interesting)

by hansreiser ( 6963 ) writes: on Sunday July 02, 2006 @01:29PM (#15646279) Homepage

ext2fsck has a history of plenty of problems, just like everyone. I get reports from users swearing they will never again use ext*. Ted Tso goes walking around FUD'ing everyone else's fsck. He does this because ext* performance is poor, so there is not much else to do but FUD. Some users suspect that high performance is a little sinful, so this works on some.

All of the major filesystems have a decent fsck, and all of them are by now stable to the point that you should worry about your hardware and backups failing, not your FS. The only qualifier on that is that ZFS is new, and I hope no one will view that as my FUDing.

Re:Sounds like a good idea. (Score:3, Interesting)

by fbjon ( 692006 ) writes: on Monday July 03, 2006 @03:32AM (#15648617) Homepage Journal

But if the code's already been changed, why hasn't it been included yet?
