Cell Biology - Stephen R. Bolsover - Page 75

Reading the Genetic Code

The sequence of bases along the DNA strand determines the sequence of amino acids in proteins. There are four different bases in DNA (G, A, T, and C). Each amino acid is specified by a codon, a group of three bases. Because there are four bases in DNA, a three-letter code gives 64 (4 × 4 × 4) possible codons. These 64 codons form the genetic code: the set of instructions that tells a cell the order in which amino acids are to be joined together to form a protein (Figure 3.8). Although the linear sequence of codons in DNA determines the linear sequence of amino acids in proteins, the DNA helix does not itself play a role in protein synthesis. The translation of the sequence from codons into amino acids occurs through the intervention of members of a third class of molecule, messenger RNA (mRNA). Messenger RNA acts as a template, guiding the assembly of amino acids into a polypeptide chain. It uses the same code as DNA with one difference: in mRNA the base uracil (U) is used in place of thymine (T). When we write the genetic code we usually use the RNA format, that is, we use U instead of T.
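
The codon count and the RNA notation can be checked with a short Python sketch. This is only an illustration of the arithmetic above; the names `BASES`, `dna_codons`, and `to_rna_notation` are invented for the example and do not come from the book.

```python
from itertools import product

BASES = "GATC"  # the four DNA bases named in the text

# Every ordered triplet of bases is one possible codon: 4 * 4 * 4 = 64.
dna_codons = ["".join(triplet) for triplet in product(BASES, repeat=3)]


def to_rna_notation(codon: str) -> str:
    """Write a DNA codon in RNA format by replacing T with U (hypothetical helper)."""
    return codon.replace("T", "U")


rna_codons = [to_rna_notation(codon) for codon in dna_codons]

print(len(dna_codons))         # number of possible codons
print(to_rna_notation("TTT"))  # the same codon written in RNA format
```

Swapping T for U changes only the spelling of each codon, so the list of RNA-format codons still has 64 distinct entries.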

The code is read in sequential groups of three, codon by codon. Adjacent codons do not overlap, and each triplet of bases specifies one particular amino acid. This discovery was made by Sydney Brenner, Francis Crick, and their colleagues by studying the effect of various mutations (changes in the DNA sequence) on the bacteriophage T4, which infects the common bacterium E. coli. If a mutation caused either one or two nucleotides to be added or deleted from one end of the T4 DNA, then a defective polypeptide was produced, with a completely different sequence of amino acids, because every codon after the change was read out of step. However, if three bases were added or deleted, then the protein made often retained its normal function. These proteins were found to be identical to the original protein, except for the addition or loss of one amino acid.
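
The effect of adding one base versus three can be sketched in Python by splitting a sequence into non-overlapping triplets. The sequence and the helper `split_codons` are made up for illustration; they are not data from the T4 experiments.

```python
def split_codons(rna: str) -> list[str]:
    """Read a sequence in non-overlapping groups of three, dropping any incomplete tail."""
    usable = len(rna) - len(rna) % 3
    return [rna[i:i + 3] for i in range(0, usable, 3)]


# Hypothetical message, written in RNA format.
original = "AAAUUUCCCGUU"

# Insert one base after the first codon: every later triplet is read out of step.
one_added = original[:3] + "G" + original[3:]

# Insert three bases at the same point: one extra codon, the rest unchanged.
three_added = original[:3] + "GUU" + original[3:]

print(split_codons(original))
print(split_codons(one_added))
print(split_codons(three_added))
```

With one base added, only the first codon survives unchanged. With three bases added, the original codons all reappear in order around a single new one, which matches the observation that such proteins differ by just one amino acid.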

Figure 3.7. Amino acids and the peptide bond.

Figure 3.8. DNA makes RNA makes protein: the central dogma of molecular biology.

The identification of the triplets encoding each amino acid began in 1961. This was made possible by using a cell-free protein synthesis system prepared by breaking open E. coli cells. Synthetic RNA polymers of known sequence were added to the cell-free system together with the 20 amino acids. When the RNA template contained only uridine residues, poly(U), the polypeptide produced contained only phenylalanine, so the codon UUU must specify phenylalanine. A poly(A) template produced a polypeptide of lysine and a poly(C) template one of proline: AAA and CCC must therefore specify lysine and proline, respectively. Synthetic RNA polymers containing all possible combinations of the bases G, A, U, and C were added to the cell-free system to determine the codons for the other amino acids. A template made of the repeating unit CU gave a polypeptide with the alternating sequence leucine–serine. Because the first amino acid in the chain was found to be leucine, CUC must code for leucine and UCU must code for serine. Although much of the genetic code was read in this way, the amino acids defined by some codons were particularly hard to determine. Only when specific transfer RNA molecules (page 85) were used was it possible to demonstrate that GUU codes for valine. The genetic code was finally solved by the combined efforts of several research teams. The leaders of two of these, Marshall Nirenberg and Har Gobind Khorana, received the Nobel Prize in 1968 for their part in cracking the code.
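
The logic of the synthetic-template experiments can be mimicked in a few lines of Python. This sketch uses only the six codon assignments mentioned in the paragraph above, so the table is deliberately incomplete; the names `KNOWN_CODONS` and `translate` are illustrative, and any codon outside the table is reported as `"?"`.

```python
# Codon assignments stated in the text (a small subset of the 64 codons).
KNOWN_CODONS = {
    "UUU": "Phe",  # phenylalanine, from the poly(U) template
    "AAA": "Lys",  # lysine, from the poly(A) template
    "CCC": "Pro",  # proline, from the poly(C) template
    "CUC": "Leu",  # leucine, from the repeating CU template
    "UCU": "Ser",  # serine, from the repeating CU template
    "GUU": "Val",  # valine, shown using specific transfer RNAs
}


def translate(rna: str) -> list[str]:
    """Translate complete, non-overlapping triplets; unknown codons become '?'."""
    usable = len(rna) - len(rna) % 3
    return [KNOWN_CODONS.get(rna[i:i + 3], "?") for i in range(0, usable, 3)]


poly_u = "U" * 9      # template containing only uridine residues
repeat_cu = "CU" * 6  # template built from the repeating unit CU

print(translate(poly_u))
print(translate(repeat_cu))
```

Read in threes, the repeating CU template alternates between CUC and UCU, which is why it yields an alternating leucine and serine chain rather than a single amino acid.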
