Stanford Report Online



Stanford Report, February 12, 2001
Stanford researchers make major contribution to human genome sequence

Stanford researchers, as members of the International Human Genome Sequencing Consortium, today announced the first analyses of the human genome sequence - the 3 billion DNA letters that make up the complete set of human genes. Among the 20 groups involved in the collaboration - from the United States, the United Kingdom, Japan, France, Germany and China - the combined efforts of the two Stanford teams placed Stanford ninth in the amount of draft sequence contributed to the global effort.

Richard Myers, PhD, professor of genetics, and his team at the Stanford Human Genome Center, and Ronald Davis, PhD, professor of biochemistry and of genetics, and his team at the Stanford Genome Technology Center, each produced approximately 30 million units of draft sequence. The international consortium's results are published in the February 12 issue of Nature along with a series of papers on the human genome.

"I think this group is a band of heroes for having done this," Myers said of the international collaborators. "You had to keep cranking the handle. It wasn't always one big happy family and it wasn't always easy. But overall it was a remarkable commitment to do something like this."

Humans now join the list of organisms - including more than 600 bacteria and viruses, one fungus, two animals and one plant - that have had their genomes sequenced.

The human genome is the largest yet to be sequenced, 25 times larger than any previously studied genome. It is also the most complex. Less than 5 percent of the human DNA sequence codes for proteins - the basic molecules that build the cells of the body - whereas at least half of all human DNA consists of strings of repeated letters often described as 'junk' DNA. Bacterial genomes have been found to contain only 1.5 percent of this repetitive DNA, and the fruit fly's genome has 3 percent.

Since announcing completion of the draft sequence of the human genome in June, the researchers have been finalizing the sequence, which includes correcting errors, resolving ambiguities and closing gaps. About 10 percent of the genome is incomplete. "It's like having a big book like Don Quixote, where you have about 1,000 pages. But you're missing 100 words, and some words are in the wrong order, some are even on the wrong page and there are a lot of typos," said Myers. "[The draft sequence] is pretty good. I'm very proud of it. But we all agreed from the beginning that we want that book to be pristine. It's going to be used by lots of researchers for a long time."

Myers' group is one of five centers now focused on finalizing the draft sequence. So far, his group has transformed 90 million units of sequence into final form. "Now the page is in the right order, all the sentences are correct and there are very few or no errors. Now it's possible to read that page without any confusion," he said.

According to Myers, 30 percent of the human genome sequence is now finalized. The international team expects to have a completed final version no later than 2003.

While Myers' team of 22 researchers has concentrated on producing high-quality sequence, Davis' team of 20 has largely focused on technology development. The members of the Human Genome Project believed from the outset that a decentralized international team of researchers would generate a diversity of approaches that would speed scientific progress in the long term. "Part of the genome project was to develop completely new ways of doing things," said Myers.

"Our main focus was developing technology for the advancement of high-throughput sequencing," said Michael Proctor, research and development scientist at the Stanford Genome Technology Center. "We focused on automating sequencing, lowering costs and getting higher throughput." Other genome centers, such as the U.S. Department of Energy Joint Genome Institute in Walnut Creek, Calif., have taken advantage of technology advancements developed at the Stanford center.

With the draft sequence of the human genome essentially complete, researchers have begun comparing features of the human sequence to those of the fly and worm, the only other animals for which the entire sequence is known. Scientists are also searching for genes among the millions of DNA letters, called base pairs.

They have found that far from the figure of 100,000 genes that has been widely quoted since the mid-1980s, the human genome contains approximately 31,000 genes - only about twice as many as worms or flies. To date the researchers have been able to assign about 5,000 human genes to the correct site in the genome. "The ultimate goal is to compile a complete list of all human genes and their encoded proteins, to serve as a 'periodic table' for biomedical research," the authors write in the Nature paper.

Members of the sequencing consortium hail the near-completion of the draft sequence, yet they readily admit that much work remains before the human genome yields all its secrets. "I think people who think the sequence will be the answer to everything have got it wrong. The hard work has only just begun. It requires many dedicated biochemists and biologists to siphon through and understand it," said Proctor.

Myers agrees. "Let's revel in this and take advantage of it. It's a major landmark, but it isn't over. The future is just as exciting to me as what we've done so far. This is not an end point, it's just a landmark along the way."

The idea of sequencing the entire human genome was first proposed by scientists in the early 1980s. The program was launched in the United States as a joint effort of the DOE and the National Institutes of Health. Genome centers were also created in Europe, Japan and elsewhere, and by late 1990 the official Human Genome Project had been established. The project was founded on two inviolate principles: that the collaboration would be open to centers from any nation "because we felt that the human genome sequence is the common heritage of all humanity," the authors write, and that the sequence would be updated daily and available without restrictions.

The international consortium that makes up the Human Genome Project has been engaged in a race with a private company, Celera Genomics of Rockville, Md., to complete the human genome sequence. The private and public efforts used different methods to produce sequence. The international consortium chose a more conservative approach, believing it to be more accurate, with less chance of mis-assemblies in the final sequence. The two groups jointly announced in June the completion of the draft sequence of the human genome.

Myers is co-director of the Stanford Human Genome Center and Davis is director of the Stanford Genome Technology Center. Other Stanford authors named on the 63-page Nature paper are David Cox, MD, PhD, professor of genetics and pediatrics and co-director of the SHGC; Jeremy Schmutz, Mark Dickson and Jane Grimwood from the SHGC; and Michael Proctor, Nancy Federspiel and A. Pia Abola from the SGTC. Funding for the two Stanford teams was provided by the NIH and the DOE.