An anonymous reader writes: Most genetic research to date has focused on just 1 percent of the genome — the areas that code for proteins. But new research, published today in Science, provides an initial map for the sections of the genome that orchestrate this protein-building process. “It’s one thing to have the book — the big question is how you read the book,” said Brendan Frey, a computational biologist at the University of Toronto who led the new research.
Frey compares the genome to a recipe that a baker might use. All recipes include a list of ingredients — flour, eggs and butter, say — along with instructions for what to do with those ingredients. Inside a cell, the ingredients are the parts of the genome that code for proteins; surrounding them are the genome’s instructions for how to combine those ingredients.
Just as flour, eggs and butter can be transformed into hundreds of different baked goods, the protein-coding segments of a gene can be assembled in many different combinations. This process is called alternative splicing, and it’s how cells create such variety out of a single genome. Frey and his colleagues used a sophisticated form of machine learning to identify mutations in this instruction set and to predict what effects those mutations have.
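
For readers who want a feel for the combinatorics behind alternative splicing, here is a toy sketch in Python. It is an illustration only and not the method used in the research: the function name `splice_variants` and the segment labels `E1` to `E4` are made up for this example, and it assumes the simplest case, in which the first and last coding segments are always kept and each segment in between is either included or skipped. Real cells regulate which combinations actually occur.

```python
from itertools import combinations


def splice_variants(exons):
    """Enumerate toy transcripts: keep the first and last segment and
    include any subset of the segments in between, preserving order."""
    if len(exons) < 2:
        return [tuple(exons)]
    first, middle, last = exons[0], exons[1:-1], exons[-1]
    variants = []
    # Choose 0, 1, 2, ... of the middle segments to include.
    for k in range(len(middle) + 1):
        for kept in combinations(middle, k):
            variants.append((first, *kept, last))
    return variants


# Hypothetical gene with four coding segments (labels are placeholders).
for variant in splice_variants(["E1", "E2", "E3", "E4"]):
    print("-".join(variant))
```

Under this simplified model, a gene with n optional middle segments yields 2^n possible transcripts, which is one way to see how a modest number of ingredients can produce a large menu.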