Comment Re:incomplete access (Score 1) 70
1MB is plenty for most research needs. Each nucleotide (nt) triplet encodes one amino acid, so you can get the equivalent of a 333,333 aa protein out of this (assuming that all of the nts are coding and that there is only one isoform of the protein). The highly repetitive bits of the genome won't be represented, due to the difficulty of fitting these pieces of sequence data in with the rest (a particular problem with Celera's method), and if a splice site in the middle of the gene pushes the bits you're interested in past the limit, you could always do more than one download. It has to be said, though, that for genomics work it's a PIA, but they have thrown a huge amount of cash at the problem, and if you don't want to look at their data, use the HGP instead.
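For anyone who wants to check the arithmetic, here is a rough sketch. It assumes 1MB means 1,000,000 nucleotides (one byte per base) and that every base is coding; the function names are made up for illustration and have nothing to do with Celera's actual interface.

```python
NT_PER_CODON = 3  # one amino acid per nucleotide triplet


def max_protein_length(download_nt: int) -> int:
    """Upper bound on amino acids encoded by download_nt nucleotides,
    assuming every base is coding and there is a single isoform."""
    return download_nt // NT_PER_CODON


def downloads_needed(region_nt: int, limit_nt: int = 1_000_000) -> int:
    """Number of separate downloads needed to cover a region when each
    download is capped at limit_nt nucleotides (ceiling division)."""
    return -(-region_nt // limit_nt)


print(max_protein_length(1_000_000))  # 333333
print(downloads_needed(2_500_000))    # 3
```

So a gene whose exons and introns together span 2.5 million nt would take three downloads under that cap.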
The ability to search the data is much more important than being able to download all of it anyhow.