Best Server Storage Setup?
new-black-hand asks: "We are in the process of setting up a very large storage array and we are working toward having the most cost-effective setup. Until now, we have tried a variety of different architectures, using 1U servers or 6U servers packed with drives. Our main aims are to get the best price per GB of storage that we can, while having a reliable and scalable setup at the same time. The storage array will eventually become very large (in the PB range) so saving just a few dollars on each server means a lot. What do people out there find is the most effective hardware setup? Which drives and of what size? Which motherboards, etc? I am familiar with the Petabox solution which is what the Internet Archive uses — they have made good use of Open Source software. So what are some of the architectures out there that, together with Open Source, can give us a storage array that is much better than the $3 per GB plus that the commercial vendors ask for?"
How about some requirements? (Score:4, Insightful)
Easy - Think SAN - Apple XServe RAID + DNFStorage (Score:5, Insightful)
1) Run a FC SAN as the backend. This lets you connect anything you want without betting on which transport wins out later - ATA-over-Ethernet, iSCSI, whatever comes next.
2) Love thy Apple. XServe RAIDs are 3U, 7 TB (raw) and $13,000 - get a bunch. Each of the two controllers sees 7 disks; set each bank up as a RAID 0 and uplink the unit to a FC switch.
3) Use DNFStorage.com's SANGear 4002 / 6002 devices to RAID 5 across the XServe RAID 0 LUNs. Your data then survives half of an XServe RAID going offline; RAID 6 tolerates an entire unit going DOA. Make sure to have an online spare or two.
4) Repeat - but remember, just because you can create it doesn't mean you can reasonably back it up.
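Taking the figures quoted above at face value ($13,000 per XServe RAID, 7 TB raw), a quick back-of-the-envelope check suggests this setup lands under the $3/GB commercial figure even after parity overhead. The 8-unit array size below is an illustrative assumption, not a number from the post:

```shell
#!/bin/sh
# Back-of-the-envelope $/GB for the setup sketched above.
# From the post: $13,000 per XServe RAID, 7 TB (7,000 GB) raw per unit.
# N_UNITS=8 is a hypothetical example array size.
UNIT_COST=13000
UNIT_RAW_GB=7000
N_UNITS=8

awk -v cost="$UNIT_COST" -v gb="$UNIT_RAW_GB" -v n="$N_UNITS" 'BEGIN {
    # Raw cost per GB of a single unit
    printf "raw:           $%.2f/GB\n", cost / gb
    # RAID 5 across the n RAID-0 LUNs: one unit-equivalent lost to parity
    printf "RAID 5 (n=%d):  $%.2f/GB\n", n, (n * cost) / ((n - 1) * gb)
    # RAID 6: two units of parity, tolerates a whole unit DOA plus one more
    printf "RAID 6 (n=%d):  $%.2f/GB\n", n, (n * cost) / ((n - 2) * gb)
}'
```

Even the RAID 6 case comes out around $2.50 per usable GB before switches, servers, and spares, which is where the savings over the commercial vendors would have to hold up.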
Now the stupid question: what are you trying to do that requires this much space when you don't have the budget for a "tested, supported, enterprise" solution? Building things is fun, but at some point you need to step back and ask, "Am I willing to risk my company on my solution?" EMC, HDS, IBM, HP and the other big vendors are willing to step up and make sure your solution works, runs and will not fail (see that video with the SAN array getting shot?).
Backup software (Score:1, Insightful)
Reliability... (Score:4, Insightful)
How important is performance? Reliability? Scalability? If you are building PB of storage, how do you plan to back it up? What are the uptime requirements? Hitless codeload? Do you need multipathing? Snapshots? They all do RAID, but do you need SATA or FC disk? How much throughput (MB/s) and IO/sec do you require? What management tools do you need? Callhome? SNMP/pager/email notification? Could you put in a high-end array and still come out ahead by using iSCSI on the hosts, saving roughly $800/host?
In the end it comes down to this: how valuable is your data, and what is the impact to your business if it is down?
It sounds sexy on
Comment removed (Score:5, Insightful)
Re:What's it for ? (Score:3, Insightful)
iSCSI, using the linux-iscsi-utils in CentOS 4.3.
You could also use NBD or AoE (that's ATA over Ethernet, not Age of Empires ;) ), but I have found iSCSI to be the fastest, most reliable, most flexible and most supported solution.
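The CentOS 4-era initiator in linux-iscsi-utils is driven by a config file rather than the later iscsiadm tool. A minimal sketch, assuming a single target portal (the address below is a made-up example, not from the comment):

```shell
# /etc/iscsi.conf -- minimal config for the old linux-iscsi (sfnet)
# initiator shipped with CentOS 4.x.
# The portal address is a hypothetical example.
DiscoveryAddress=192.168.10.20:3260

# Then start the initiator and list what was discovered:
#   service iscsi restart
#   iscsi-ls
# Discovered LUNs show up as ordinary /dev/sd* block devices,
# ready for mdadm, LVM, or a plain filesystem.
```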