Interesting to see how my accepted submission turned out: I submitted a link to an article that I have some issues with but that has good info. It got turned into an "in depth" piece (I did warn that it was 19 pages long) and had a copy of the conclusion added to it.
If you have enough memory, I would not use the disk for tertiary storage. Just read the file into memory all at once and scan the whole thing. But I suspect I don't know enough about it.
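To show what I mean by reading it in one go, here is a rough Python sketch. It assumes the file fits comfortably in RAM and that "scanning" just means counting occurrences of a byte string; count_matches is a name I made up for illustration, not anything from the article.

```python
from pathlib import Path


def count_matches(path, needle: bytes) -> int:
    """Read the whole file into memory with one call, then scan it there.

    Hypothetical helper: assumes the file is small enough to fit in RAM.
    """
    data = Path(path).read_bytes()  # a single read, no temporary files on disk
    return data.count(needle)       # counts non-overlapping occurrences
```

If the file turns out to be bigger than memory, this approach falls over and you are back to reading it in chunks.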
As far as RAID goes, I got the impression that you wanted to use a mirror to speed up disk reads, as opposed to a stripe (with no data redundancy).
If I understand correctly,
RAID 0 is fastest: it is a stripe (not quite the same thing as JBOD), so the data is broken in two parts and one part is written to each disk.
RAID 1 is a mirror and is the same speed as just one disk (although you suggest it's a bit faster).
RAID 2-4 are (mostly) academic.
RAID 5 (taking more than two disks) stripes the data across multiple disks and is almost as fast as RAID 0 for reads (you don't get the speed advantage of whichever disk holds the parity stripe). It is also the slowest for writing, as the (hopefully) coprocessor has to calculate the parity stripe and work out which disk it goes on.
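For what it's worth, the parity calculation is just an XOR across the data blocks of a stripe. Here is a small Python sketch of that, assuming a three-disk array and a simple rotating parity placement. The function names and the rotation rule are my own illustration; real controllers may lay the parity out differently.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, one byte position at a time."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def parity_disk(stripe_number, disk_count):
    """Pick the disk holding parity for a stripe (assumed rotation scheme)."""
    return (disk_count - 1 - stripe_number) % disk_count


# One stripe on a 3-disk array: two data blocks plus one parity block.
data = [b"\x0f\xf0", b"\x33\x33"]
parity = xor_blocks(data)

# If the disk holding data[0] dies, XOR of the survivors rebuilds it.
rebuilt = xor_blocks([data[1], parity])
```

This is also why writes are slow: every write has to update the parity block as well as the data block.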
What is the point system?
1+1+1+2+0-1= bad karma?