Welcome to the Genome - Michael Yudell
1 From Mendel to Molecules
Since the nineteenth century, scientists have been working to unravel the biological basis of inheritance. With Gregor Mendel’s mid‐nineteenth‐century discovery of the basic mechanisms of heredity, genetics was born, and humanity took its first small steps toward deciphering the genetic code. No longer would heredity solely be the domain of philosophers and farmers. Indeed, Mendel’s discoveries set the stage for major advances in genetics in the twentieth century and helped put in motion the series of discoveries that led to the sequencing of human and nonhuman genomes. This age of discovery, from Mendel to genome sequencing, is the subject of the first four chapters of this book. Chapter 1 covers some basic biology and tells the story of the evolution of genetics by examining some of the most significant discoveries in the field, discoveries that enabled the development of genomics. Chapter 2 looks specifically at the evolution of genetic and genomic sequencing technologies. Chapter 3 examines the human genome itself and the ways in which we are exploring and exploiting it now and in the future. And, finally, Chapter 4 looks at the sequencing and genome analysis tools of the post‐genomic era, tools known as next‐generation sequencing (NGS).
Without any further ado, may we present to you the human genome!
This photo (Figure 1.1), also known as a karyotype, shows the 46 human chromosomes, the physical structures in the nuclei of your cells that carry almost the entire complement of your genetic material, also known as your genome. But don’t let this two‐dimensional representation of the genome fool you into believing in its simplicity. Almost 20 years ago biologist Richard Lewontin called DNA a “triple‐helix” to explain how genes function, and how they interact with each other and the environment. This triple helix is largely inseparable, and genetics doesn’t make sense unless these effects are taken into account.
We could also have introduced you to your genome with a slew of the DNA sequence units—As, Ts, Gs, and Cs—in a string, or we could have shown you a picture of DNA in a test tube or even a picture of a nucleus of one of your cells where the DNA would be visible as dark stringy stuff. There are many ways to visualize the genome and this is part of its beauty.
Figure 1.1 This picture, known as a karyotype, is a photograph of all 46 human chromosomes. With an X and a Y chromosome, this is a male’s karyotype. A female’s karyotype would show two X chromosomes.
Credit: Photo Researchers
Figure 1.2 The nucleus of every human cell (the large purple mass inside the cell) contains DNA. Mitochondria, organelles in cells that produce energy (the smaller purple objects within the cell), also contain some DNA.
Credit: Wiley
Still, to understand function, we do need to learn about basic form. And a karyotype, despite its limitations as a representation of the genome, illustrates that in almost all the cells in the human body there are 22 pairs of chromosomes and two sex‐determining chromosomes. These cells are called somatic cells, and they are found in almost all nonreproductive tissue. The double helices that make up their chromosomes are composed of deoxyribonucleic acid, also known as DNA, on which are found approximately 20,000 genes.
Humans also have cells with 23 nonpaired chromosomes. In these cells, each chromosome is made up of a single double helix of DNA, and together the 23 chromosomes contain approximately 20,000 genes. These cells are called germ cells and are the sperm and egg cells produced for reproduction. These germ cells carry a single genome’s worth of DNA, or more than 3 billion bases of nucleic acid.
Chromosomes are somewhat like genetic scaffolding: they hold in place the long, linearly arranged sequences of the nucleotides or base pairs that make up our genetic code. There are four different nucleotides that make up this code: adenine, thymine, guanine, and cytosine. These four nucleotides are commonly abbreviated as A, T, G, and C. Found along that scaffolding are our genes, which are made from DNA, the most basic building block of life. These genes code for proteins, which are the structural and machine‐like molecules that make up our bodies and shape our physiology and our mental state. Through the Human Genome Project scientists are not simply learning the order of this DNA sequence, but are also beginning to locate and study the genes that lie on our chromosomes. But not all DNA contains genes.
On average, about 3 billion base pairs exist in the collection of chromosomes your mother transmitted to you. Add to that the chromosomes your father gave you, and in your cells there are around 6 billion bases, a complete diploid human genome. There are long stretches of DNA between genes known as intergenic or noncoding regions. And even within genes some DNA may not code for proteins. These areas, when they are found within genes, are called introns. While these genomic regions were once believed to have no products and/or no function, scientists now understand that both introns and intergenic regions play a role in regulating DNA function. The Encyclopedia of DNA Elements or ENCODE Project estimates, for example, that while only 2.94% of the entire human genome is protein coding, 80.4% of genome sequences might govern the regulation of genes. (1) Unlike the human genome and all other eukaryotic genomes, however, bacterial genomes do not have introns and have very short intergenic regions. Curiously, though, the archaea, a third major domain of life (in addition to eukaryotes and bacteria), do have introns, but not necessarily the same kind of introns as eukaryotes.
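To get a feel for the scale of these numbers, the arithmetic can be sketched in a few lines of Python. This is only an illustration: the constants are the rounded figures quoted above, and the function names (`diploid_base_pairs`, `bases_in_fraction`) are hypothetical helpers introduced here, not part of any genomics library.

```python
# Rounded figures quoted in the text; actual genome sizes vary slightly.
HAPLOID_BASE_PAIRS = 3_000_000_000   # roughly one parent's contribution
PROTEIN_CODING_FRACTION = 0.0294     # ENCODE estimate cited in the text
REGULATORY_FRACTION = 0.804          # ENCODE estimate cited in the text


def diploid_base_pairs(haploid: int = HAPLOID_BASE_PAIRS) -> int:
    """Total bases inherited from both parents (two haploid sets)."""
    return 2 * haploid


def bases_in_fraction(fraction: float, genome_size: int = HAPLOID_BASE_PAIRS) -> int:
    """Number of bases covered by a given fraction of the genome, rounded."""
    return round(fraction * genome_size)


if __name__ == "__main__":
    print(f"Diploid genome:       {diploid_base_pairs():,} bases")
    print(f"Protein coding:       {bases_in_fraction(PROTEIN_CODING_FRACTION):,} bases")
    print(f"Possibly regulatory:  {bases_in_fraction(REGULATORY_FRACTION):,} bases")
```

Under these rounded assumptions, the protein‐coding portion comes to tens of millions of bases out of billions, which is why the regulatory role of noncoding DNA matters so much.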
Let’s begin our tour of the human genome with a very basic lesson in genetic terminology. For example, what exactly is genetics, and how is it different from genomics? Genetics is the study of the mechanisms of heredity. The distinction between genetics and genomics is one of scale. Geneticists may study single or multiple human traits. In genomics, an organism’s entire collection of genes, or at least many of them, is examined to see how entire networks of genes influence various traits. A genome is the entire set of an organism’s genetic material. The fundamental goal of the Human Genome Project was to sequence all of the DNA in the human genome. Sequencing a genome, whether human or nonhuman, simply means deciphering the linear arrangement of the DNA that makes up that genome. In eukaryotes (plants, animals, fungi, and single‐celled organisms called protists), the vast majority of the genetic material is found in the cell’s nucleus. The Human Genome Project has been primarily interested in the more than 3 billion base pairs of nuclear DNA. A tiny amount of DNA is also found in the mitochondria, cellular structures responsible for the production of energy within a cell. Whereas the human nuclear genome contains more than 3 billion base pairs of DNA and approximately 20,000 genes (that’s nearly 10,000 genes fewer than when the first edition of this book was published in 2005), the reference human mitochondrial genome contains only 16,568 bases and 37 genes. (2) Like bacterial genomes, mitochondrial DNA, or mtDNA, has short intergenic regions and its genes do not contain introns. Another interesting characteristic of mtDNA is that it is always maternally inherited. This has made mtDNA very helpful for tracking human evolutionary history through the female line. Discoveries of this kind were made possible, in part, by sequencing mtDNA.
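One rough way to see how compact the mitochondrial genome is compared with the nuclear genome is to divide each genome’s size by its gene count. The sketch below uses only the approximate figures quoted above; “bases per gene” is a crude measure chosen here for illustration, and the name `bases_per_gene` is a hypothetical helper.

```python
# Approximate figures quoted in the text.
NUCLEAR = {"bases": 3_000_000_000, "genes": 20_000}
MITOCHONDRIAL = {"bases": 16_568, "genes": 37}


def bases_per_gene(genome_size: int, gene_count: int) -> float:
    """Average stretch of genome per gene, including any noncoding DNA."""
    return genome_size / gene_count


if __name__ == "__main__":
    for name, genome in (("nuclear", NUCLEAR), ("mitochondrial", MITOCHONDRIAL)):
        density = bases_per_gene(genome["bases"], genome["genes"])
        print(f"{name}: about {density:,.0f} bases per gene")
```

With these figures the nuclear genome averages on the order of a hundred thousand bases per gene, while mtDNA averages only a few hundred, which fits the description of mtDNA as having short intergenic regions and no introns.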
What about heredity? In the most basic sense we should think about heredity as the transmission of traits from one generation to the next. When we talk about heredity in this book we refer to the ways in which traits are passed between generations via genes. The term heredity is also sometimes used to describe the transmission of cultural traits. Such traits are shared through a variety of means including laws, parental guidance, and social institutions. Unlike genetic inheritance, however, this type of transmission is not governed by physical laws.
What are genes? Genes are regions of DNA and are the basic units of inheritance in all living organisms. These words, genes and DNA, are too often used interchangeably. Both genes and DNA are components of heredity, but we identify genes by examining regions of DNA. In other words, DNA is the basic molecular ingredient of life, whereas genes are discrete components of that molecular brew.
If you look at any family you’ll see both shared and unique traits. Family members typically look alike, sharing many features such as eye color and nose shape, but they may also have very different body types and be susceptible to different diseases. This diversity is possible for two reasons. The first reason is that genes come in multiple forms. These alternative forms are known as alleles, and in sexual reproduction they are the staple of organismal diversity. According to the laws of genetics, siblings can inherit different traits from the same biological parents because there is an assortment of alleles that can be randomly passed along. The second reason is that the environment can exert a significant influence on the expression of genes. For example, an individual may inherit a gene that makes him or her susceptible to lung cancer. Such susceptibility is typically revealed, however, only after years of genetic damage caused by cigarette smoking or other lung‐related environmental impacts. (3) Recent advances in the field of epigenetics have brought new complexity to our understanding of how our genes interact with our environments, and how such interactions can be passed between generations (through the germline). Over the past decade epigenetic research has accelerated our understanding of how environmental factors can alter the peripheral structure of DNA—not the DNA sequence itself but the molecular structures that interact with and support the sequence—to elicit changes in the expression of a gene (the gene’s phenotype).
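The random assortment of alleles described above can be sketched with a toy model. The code assumes a single gene with two alleles, written `A` and `a`, and parents who each carry two copies and pass one on at random. This is a deliberately simplified illustration that ignores environment and epigenetics, and the function names are hypothetical.

```python
import random
from collections import Counter
from itertools import product


def offspring_genotypes(mother: str, father: str) -> Counter:
    """Count the equally likely genotypes when each parent passes on one allele."""
    return Counter("".join(sorted(m + f)) for m, f in product(mother, father))


def random_child(mother: str, father: str, rng: random.Random) -> str:
    """Draw one child's genotype by picking one allele at random from each parent."""
    return "".join(sorted(rng.choice(mother) + rng.choice(father)))


if __name__ == "__main__":
    # Both parents carry one copy of each allele.
    print(offspring_genotypes("Aa", "Aa"))
    rng = random.Random()
    print([random_child("Aa", "Aa", rng) for _ in range(4)])
```

For two `Aa` parents, the enumeration gives one `AA`, two `Aa`, and one `aa` combination, so siblings drawn from the same parents can easily end up with different genotypes.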
So how did science progress from thinking about the mechanisms of heredity to understanding that genes are the basic units of heredity, to deciphering and finally manipulating the DNA code that underlies all life on Earth? The results of the Human Genome Project were the fruits of over a century of struggle by scientists around the globe. Most historians of science would measure this progress beginning with Gregor Mendel’s work on pea plants during the middle of the nineteenth century. Although premodern thinkers did have a basic grasp of the idea of heredity—that is, that identifiable traits could be passed down from generation to generation—it was not until Mendel that science began to understand the mechanisms underlying the transmission of these traits. (4)
The journey from abstract notions of inheritance to the sequencing of the human genome abounds with stories of discoveries both great and small that led to where we are today. Science seldom progresses in a straight line. The genome was always there for us to find but took centuries to discover because knowledge and the technological application of that knowledge advance fitfully, revealing gradually more over time, and the social and cultural context that prioritizes different types of knowledge ebbs and flows with that time. Scientists have not always made the right choices. Even today, in what has been called the post‐genomic age, we are likely making assumptions about our genes that future generations will look back on and ask, “How could they have thought that?” The trials and errors of science are part of what makes this process so interesting.
Several major building blocks of life had to be discovered to make possible our entry into the genomic world. First, scientists needed to determine what constitutes the hereditary material that passes from one generation to the next. Second, they needed to find out what constitutes the biochemical basis for the expression of this intergenerational legacy. This endeavor required the ability to take cells apart and analyze the chemical components from different parts of cells. Scientists then needed to determine the ways in which these chemicals, the building blocks of life, interacted, how they were structured, and how that structure influenced the hereditary process. Finally, technologies needed to be developed to use this information to improve human health, agriculture, and our understanding of our place in the history of life on Earth.
It took almost 150 years from the discovery of the hereditary principles to the sequencing of the human genome. The stories behind these discoveries explain how scientists came to understand the biological basis of heredity. What follows does not represent the comprehensive history of all the important genetic work of the past century or so. Yet without the discoveries we highlight, the discovery of the genome would never have occurred or would have happened very differently.
The meanings and mechanisms of heredity were pondered and debated millennia before the development of modern genetics. In the fifth century BCE, the Greek dramatist Euripides wrestled with the complexities of the relationship between parent and child in his play Electra:
I oft have seen,
One of no worth a noble father shame,
And from vile parents worthy children spring,
Meanness oft groveling in the rich man’s mind,
And oft exalted spirits in the poor. (5)
Without knowledge of genes or genomes, premodern thinkers had many ideas concerning the nature of heredity, some of which were surprisingly sophisticated and accurate. To Euripides heredity must have been a mystifying and seemingly random process. How else could he and his contemporaries explain the inconsistencies among inherited traits within families? Other ancients carefully considered similar questions. Lucretius, a Roman philosopher, wrote that traits could skip generations, as children sometimes resembled their grandparents. (6) Around the globe, premodern farmers had already developed sophisticated breeding techniques that depended, in part, on a basic understanding of heredity. We know, for example, that the ancient Assyrians and Babylonians artificially pollinated date palm trees and that many animals, including sheep, camels, and horses, were domesticated during ancient times. (7) The domestication and breeding of plants and animals show that many early thinkers recognized that traits were passed between generations.
Perhaps the most advanced premodern thinker on heredity was Aristotle (384–322 BCE). (8) Aristotle dedicated much of his work to questions concerning the specific mechanisms of heredity. He theorized that inherited traits were passed between generations by what he called the eidos, or the blueprint, that gave form to a developing organism. Aristotle’s eidos was entirely theoretical—he could not see this invisible configuration—a fact that makes his theory all the more remarkable. Aristotle understood the mechanisms of heredity only in the broadest sense and remained handicapped by the limited technology of his time, a primitive understanding of biology, and the cultural limitations of his worldview. Yet a keen perception, buttressed by his emphasis on observation and description, made him a brilliant interpreter of the natural world.
The concept of the eidos remained the most complete theory of heredity until the modern era of genetics. More than two millennia later scientists use a genetic language strikingly similar to Aristotle’s. The eidos is in many ways analogous to the modern concept of a genome, and, like Aristotle, today’s scientists often refer to a genome as a blueprint for life. (9)