When the Human Genome Project mapped out the chemical composition of human DNA, its researchers determined that only a few percent of the whole chain was important in determining the structure of the human body and how it worked, and concluded that the rest was unused 'junk' left over from the process of our evolution. The US National Human Genome Research Institute (NHGRI) started ENCODE nine years ago to study these initial findings and expand our knowledge of the sequence.
The papers from ENCODE, published in the journals Nature, Genome Research and Genome Biology, reveal two major discoveries: Over 80 percent of the DNA sequence can be linked to specific biological processes, and there are over 4 million regions of the sequence that act as switches, where proteins can interact with the DNA chain to turn genes on and off.
"ENCODE has revealed that most of the human genome is involved in the complex molecular choreography required for converting genetic information into living cells and organisms," said Dr. Eric Green, director of the National Human Genome Research Institute, according to Science Daily.
Dr. Ewan Birney, lead analysis coordinator for the ENCODE project, added, "By carefully piecing together a simply staggering variety of data, we've shown that the human genome is simply alive with switches, turning our genes on and off and controlling when and where proteins are produced."
The problem that the researchers of the Human Genome Project faced, once they had laid down the human DNA sequence, was that they essentially had a book written in a different language. What they needed to do after that was figure out how to translate it. They needed a primer. Over the last nine years, by employing hundreds of researchers in more than 30 research labs around the world, ENCODE designed that primer.
"This is the first truly comprehensive view of how the three billion letter instruction book for human biology actually carries out its work, across many tissues and over the course of development," Dr. Francis Collins, director of the National Institutes of Health, told NBC News in an interview, calling the findings "awesome and elegant."
Of particular note is how important ENCODE's findings have become in the study of disease.
"We were surprised that disease-linked genetic variants are not in protein-coding regions," said Dr. Mike Pazin, Program Director of NHGRI's Division of Extramural Research, according to Science Daily. "We expect to find that many genetic changes causing a disorder are within regulatory regions, or switches, that affect how much protein is produced or when the protein is produced, rather than affecting the structure of the protein itself. The medical condition will occur because the gene is aberrantly turned on or turned off or abnormal amounts of the protein are made. Far from being junk DNA, this regulatory DNA clearly makes important contributions to human health and disease."
ENCODE has made all of its data publicly accessible, and this resource is already being used by scientists around the world.