
Comment Re:the way I see it (Score 1) 533

There's legitimacy in identifying it as a distinctive act—if someone sets off a bomb in a public place that is supposed to kill thousands of people, but by fluke it only kills one, you'd have to put the perpetrator down for... how many attempted murders? And of whom? Plus that ignores the possible property damage. The way the law is written, 18 USC sec. 2332a is more of a summary of damages, actual and potential, than some pigeonhole that crimes have to be squeezed into. In the Tsarnaev case it was added on top of the murder charges. This is another crime separate from those acts.

Comment Re:the way I see it (Score 4, Informative) 533

If you look at the laws themselves it's a bit weird; 18 USC sec. 2332a seems to introduce the term "weapons of mass destruction" for the sole purpose of re-naming a definition provided in 18 USC sec. 921 called "destructive device," which dates to 1934 at the latest. I'm not savvy enough to figure out when the "WMD" terminology was introduced, but it's at least older than 1996 and seems to serve no purpose other than sounding grandiose.

Comment Re:Good. (Score 1) 433

Of course, that also requires that the patient is unaware of it (unlikely), or that they cannot speak (in which case there's probably a caregiver who can...) I suppose it could happen, but it's pretty combinatorially hard. Antibiotic allergies tend to manifest symptoms that are fairly non-life-threatening and limited by the dose size, they can be tested for in advance if the medical practitioner has cause for concern, and they tend to go away as you grow up. That being said, there can be other fairly serious drug allergies, so the point's not moot.

Comment Re:At least they're not rolling their own. (Score 1) 138

Here's the lowdown on how BGZF works, as one example. In this case, there are many short distinct fragments of DNA being stored together, each with offset and quality information, many of which may be identical. The compression is localized to smaller blocks (I'm not sure if they're 4096-byte disk sectors or something else.) You're right that there's probably some performance lost due to the misalignment, but 6-bit codons and 8-bit bytes line up every 24 bits, so at worst that means patterns of four codons or three bytes—and a step of four amino acids is ideal for alpha helix motifs, so it's not all a loss.
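The block idea can be sketched in a few lines. This is a simplified illustration, not the real BGZF container (the actual format uses gzip blocks of up to 64 KB with extra header fields for virtual file offsets); the point is just that compressing fixed-size blocks independently buys you random access at a small ratio cost:

```python
import gzip

def compress_blocks(data: bytes, block_size: int = 65536) -> list[bytes]:
    """Compress data in independent fixed-size blocks, BGZF-style,
    so any single block can be decompressed without the others."""
    return [gzip.compress(data[i:i + block_size])
            for i in range(0, len(data), block_size)]

def read_block(blocks: list[bytes], index: int) -> bytes:
    # Random access: inflate only the one block we need.
    return gzip.decompress(blocks[index])

reads = b"ACGTACGTGGCCAATT" * 8192  # toy run of repetitive sequence data
blocks = compress_blocks(reads)
assert b"".join(read_block(blocks, i) for i in range(len(blocks))) == reads
```

Each block decompresses on its own, which is what lets indexed tools seek into the middle of a multi-gigabyte compressed file.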

And, yes, regarding individual genomes: I'm pretty sure that'd be all anyone stored if they didn't have to hold onto the FASTQ files for auditability.

Comment Re:At least they're not rolling their own. (Score 2) 138

It's a neat thought, but it would never beat the basics. While there are a lot of genes that have common ancestors (called paralogues), the hierarchical history of these genes is often hard to determine, or pre-dates human speciation entirely; for example, there's only one species (a weird blob a little like a multi-cellular amoeba) that has a single homeobox gene.

While building a complete evolutionary history of gene families is of great interest to science, it's pointless to try exploiting it for compression when we can just turn to standard string methods; as has been mentioned elsewhere on this story, gzip can be faster than the read/write buffer on standard hard drives. Having to replay an evolutionary history we can only guess at would be a royal pain.

That being said, we can store individuals' genomes as something akin to diff patches, which brings 3.1 gigabytes of raw ASCII down to about 4 MB of high-entropy data, even before compression.
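A hypothetical sketch of the diff-patch idea, with made-up data: store only the positions where an individual differs from the reference (for a real person that's on the order of a few million substitutions) and rebuild the full sequence on demand:

```python
# Toy reference and a variant list of (position, ref_base, alt_base).
reference = "ACGTACGTACGT"
variants = [(3, "T", "G"), (7, "T", "A")]  # millions of entries in real data

def apply_variants(ref: str, variants) -> str:
    """Reconstruct an individual genome by patching the reference."""
    seq = list(ref)
    for pos, ref_base, alt_base in variants:
        assert seq[pos] == ref_base, "patch does not match reference"
        seq[pos] = alt_base
    return "".join(seq)

individual = apply_variants(reference, variants)
# individual == "ACGGACGAACGT"
```

The variant list is the only thing you need to store per person, which is where the gigabytes-to-megabytes reduction comes from (this is essentially what VCF files do).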

Comment Re:To put this into perspective (Score 1) 138

Well, if you really need to have that kind of contest...

The data files being discussed are text files generated as summaries of the raw sensor data from the sequencing machine. In the case of Illumina systems, the raw data consists of a huge high-resolution image; different colours in the image are interpreted as different nucleotides, and each pixel is interpreted as the location of a short fragment of DNA. (Think embarrassingly parallel multithreading.)
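As a cartoon of that interpretation step (the colour-to-base mapping below is an assumption for illustration, not Illumina's actual chemistry): each cluster of DNA yields one colour per imaging cycle, and the sequence of colours is read off as a sequence of bases.

```python
# Hypothetical colour mapping, one base called per imaging cycle.
COLOUR_TO_BASE = {"red": "A", "green": "C", "blue": "G", "yellow": "T"}

def call_bases(pixel_cycles: list[str]) -> str:
    """pixel_cycles: the dominant colour observed at one cluster,
    one entry per sequencing cycle."""
    return "".join(COLOUR_TO_BASE[c] for c in pixel_cycles)

assert call_bases(["red", "blue", "green", "yellow"]) == "AGCT"
```

Every cluster on the image can be called independently, which is why the problem parallelizes so trivially—and why the raw per-cycle images dwarf the final text output.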

If we were to keep and store all of this raw data, the storage requirements would probably be a thousand to a million times what they currently are—to say nothing of the other kinds of biological data that's captured on a regular basis, like raw microarray images.

Comment Re:Oddly... I have a clue about this stuff lately (Score 1) 138

CNVs actually can be detected if you have enough read depth; it's just that most assemblers are too stupid (or, in computer science terms, "algorithmically beautiful") to account for them. SAMtools can generate a coverage/pileup graph without too much hassle, and it should be obvious where significant differences in copy number occur.
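The depth-based signal is simple enough to sketch with fake alignments (this is not SAMtools, just the underlying idea): pile up per-base coverage from aligned read intervals, and a duplicated region shows up as roughly double the background depth.

```python
def coverage(read_intervals, genome_len: int) -> list[int]:
    """Per-base read depth from aligned (start, end) read intervals."""
    depth = [0] * genome_len
    for start, end in read_intervals:
        for i in range(start, end):
            depth[i] += 1
    return depth

# Toy alignment: a second tier of reads over 50-99 mimics a
# copy-number gain, doubling the depth there.
reads = [(i, i + 10) for i in range(0, 90)] \
      + [(i, i + 10) for i in range(50, 90)]
depth = coverage(reads, 100)

assert depth[70] > 1.5 * depth[30]  # duplicated region stands out
```

Real callers smooth and segment this signal, but the ratio between a region's depth and the genome-wide average is the core evidence.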

(Also, the human genome is about 3.1 gigabases, so about 3.1 GB in FASTA format. De novo assemblies will tend to be smaller because they can't deal with duplications.)
