NFS In A Disk Write Intensive Environment?
tolldog asks: "At Big Idea Productions, a 3D animation studio, we are looking at ways of improving our network and I/O performance. We have been toying with the idea of load-balancing the NFS server over multiple machines. We have also thought about buying one large machine with several interfaces. What are the advantages and drawbacks of these methods? Are there other approaches that could work?"
"We will probably be going from a gigabit backbone to gigabit to all of the clients, so network speed will be less of a concern than I/O over NFS. Currently we have a 4 proc Origin serving our filespace with a mixture of SGI workstations and Linux render boxes all mounting the same space. The SGI boxes are mostly reads with a few writes when saving files. The Linux boxes are mostly writes, all over NFS2, with large reads at the begining to get the textures and models loaded into memory."
Well, yeah (Score:1)
But I was referring to the numbers he gave for RAID levels in his description.
WHY do people keep claiming this? (and #'s wrong) (Score:3)
First off, your numbers are wrong. RAID 0 is striping (no parity), RAID 1 is mirroring.
RAID 5 does not require the multiple read/write combinations. The parity chunk can be (and is, in all but the most horribly broken systems) calculated on the fly; the system then assumes the write is done unless the parity chunk comes back wrong.
RAID 4, OTOH, is simply RAID 5 without dispersed parity. For writes this is the *WORST* possible RAID level, because it puts all the parity pressure on a single disk; RAID 3 is just as bad.
The absolute fastest "standard", "safe" RAID for reads is RAID 0+1: striping, then mirroring the stripe. It fares fairly well for writes too, particularly when backed by a hardware controller, which helps mitigate the need for matching drives; write performance is limited by the slowest drive (its biggest flaw, IMO, other than cost).
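For illustration, here is a minimal sketch of how a RAID 0+1 address mapping works: chunks of the logical address space are striped across the data disks, and every write is duplicated on each mirror of the stripe set. The disk counts and chunk size below are made up for the example, not any particular controller's geometry:

```python
# RAID 0+1 mapping sketch: stripe first, then mirror the whole stripe set.
# n_stripe_disks, n_mirrors, and chunk are illustrative values.

def raid01_map(lba, n_stripe_disks=4, n_mirrors=2, chunk=64):
    """Map a logical block address to (mirror, disk, block-on-disk) tuples."""
    chunk_index, offset = divmod(lba, chunk)
    disk = chunk_index % n_stripe_disks        # striping spreads chunks round-robin
    stripe_row = chunk_index // n_stripe_disks
    # A write must land on the same position on every mirror; a read can be
    # served by any one of them (which is where the read speed comes from).
    return [(mirror, disk, stripe_row * chunk + offset)
            for mirror in range(n_mirrors)]

# Logical block 300 falls in chunk 4, which wraps back to disk 0, row 1.
assert raid01_map(300) == [(0, 0, 108), (1, 0, 108)]
```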
RAID 5, OTOH, can be quite effective, even doing "read, modify, write". The trick is to ensure a high probability that a block's old data is already in the cache, so it doesn't have to be read off the disks again. In fact, it can outperform (yes, you read that right) RAID 0+1 in a write-heavy environment when backed by a sufficiently good caching system and when every transaction is larger than a full block size.
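The "read, modify, write" shortcut comes from XOR arithmetic: the new parity is the old parity XOR'd with the old and new data, so cached copies of the old data and parity eliminate the extra reads entirely. A minimal sketch, with byte strings standing in for disk chunks:

```python
# RAID 5 small-write parity update, sketched with XOR over byte strings.

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def parity(*chunks: bytes) -> bytes:
    p = bytes(len(chunks[0]))
    for c in chunks:
        p = xor(p, c)
    return p

# A 3-data + 1-parity stripe.
d0, d1, d2 = b"AAAA", b"BBBB", b"CCCC"
p_old = parity(d0, d1, d2)

# Overwrite d1 in place: new parity needs only the old d1 and old parity,
# not a re-read of d0 or d2 -- hence the value of a good cache.
d1_new = b"XXXX"
p_new = xor(p_old, xor(d1, d1_new))
assert p_new == parity(d0, d1_new, d2)

# Redundancy check: a lost chunk is the XOR of the survivors and the parity.
assert xor(parity(d0, d1_new), p_new) == d2
```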
So, to answer the question: if you have a large cache AND hardware controller cards, you can go RAID 5 (about 1 MB of cache per 2-5 GB of disk, IME). The only caveat is that if you are working with larger individual files, you will need to increase the cache per GB.
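Applying that rule of thumb (1 MB of cache per 2-5 GB of disk) to a hypothetical 500 GB array, just to make the arithmetic concrete:

```python
# Cache sizing per the rule of thumb above: 1 MB per 2-5 GB of disk.
# The 500 GB array size is a made-up example.

def cache_range_mb(disk_gb):
    """Return (conservative, generous) controller cache sizes in MB."""
    return (disk_gb / 5, disk_gb / 2)

low, high = cache_range_mb(500)
assert (low, high) == (100.0, 250.0)   # i.e. 100-250 MB of cache
```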
If you need sheer pedal to the metal performance with fault tolerance, RAID 0+1 looks to be it.
Kinda sounds like roll ur own DFS (Score:1)
Re:Not Linux NFS... (Score:2)
Admittedly, we're using Linux for clients and our server is a NetApp box, but if Linux's NFSv2 is screwed up for clients I'd guess it's screwed up for servers...
-JF
NFS? Why? (Score:1)
Re:WHY do people keep claiming this? (and #'s wrong) (Score:2)
Not Linux NFS... (Score:3)
After days and days of testing, we eventually came to the conclusion that Linux NFS cannot deal with multiple writes to the same server at the same time. We were using quad Xeon IIIs and gigabit networking (cLAN by Giganet) on a dedicated hardware RAID 5 array, so hardware was definitely not an issue (unless the RAID 5 was tripping us up). I have heard the same complaints from other Beowulfers, so this does seem to be a major problem.
I would suggest you look to a Solaris server with maybe Linux clients, if you indeed want to deploy Linux (I am assuming, since this is
engineers never lie; we just approximate the truth.
NFSv3 (Score:2)
Mmmmm Origin!! (Score:3)
If you are going to have a fairly disk-intensive app, don't use RAID 5!
It requires: 1 read per disk, 1 write per disk, then another read-verify from each disk. That's 6 I/Os per disk for a single write. If you need redundancy, use mirroring or a stripe set with no parity (RAID 1 or RAID 4).
You'll get much better performance out of the disks!