An Introduction to Molecular Biotechnology - Multiple authors

2.3 Structure and Function of Proteins


Proteins represent the most important tools of the cell (Table 2.2). They catalyze chemical reactions, transport metabolites through membranes, recognize other molecules, and can regulate gene activity. If we consider genes as the legislative branch, proteins then function as the executive branch (i.e. as the executing organs). Proteins are built according to the same principles in both prokaryotes and eukaryotes.

Twenty amino acids serve as building blocks for peptides and proteins, linked to one another by peptide bonds (Figure 2.6). Polypeptides, therefore, are polymers made from amino acids. Polypeptides are polar molecules, possessing an NH2 group (amino‐ or N‐terminal) on one end and a COOH group (carboxyl‐ or C‐terminal) on the other. The diverse tasks and functions of proteins result from different arrangements (sequences) of amino acids.


Figure 2.6 General structure of amino acids and peptides.

The 20 amino acids differ in their side chains (Figure 2.7). The functional groups of the side chains, which protrude from the α‐C atom, dictate the conformation and thus the functionality of the protein in molecular recognition or biocatalysis. With the exception of glycine, amino acids exist as two optical isomers: the D‐ and L‐forms. Ribosomally synthesized polypeptides are composed exclusively of L‐amino acids. D‐Amino acids can be found in bacterial cell walls and in many antibiotics (gramicidin, valinomycin). Since proteases generally cleave only peptides composed of L‐amino acids, the incorporation of D‐amino acids provides a certain protection from untimely degradation.


Figure 2.7 Structures of proteinogenic amino acids. (Cysteine belongs with the amino acids with apolar residues.)

The proteinogenic amino acids can be divided into different groups according to their functional groups and residues (Figure 2.7 and Table 2.4):

 Amino acids with apolar, lipophilic residues.

 Amino acids with polar but uncharged residues (i.e. with hydroxyl or amide groups).

 Amino acids with acid groups that are negatively charged.

 Amino acids with basic groups that are positively charged.
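The four groups above can be expressed as a small lookup. The following is a minimal sketch that assumes the assignment of one‐letter codes used in Table 2.4 of this section; the function names and the example peptide are hypothetical and serve only as illustration.

```python
# One-letter codes grouped as in Table 2.4 of this section.
RESIDUE_GROUPS = {
    "nonpolar": set("GAVLIWFMCP"),
    "polar": set("STYNQ"),
    "basic": set("KRH"),
    "acidic": set("DE"),
}


def classify(residue: str) -> str:
    """Return the side-chain group of a single one-letter amino acid code."""
    for group, members in RESIDUE_GROUPS.items():
        if residue.upper() in members:
            return group
    raise ValueError(f"not a proteinogenic amino acid code: {residue!r}")


def composition(peptide: str) -> dict[str, int]:
    """Count how many residues of a peptide fall into each group."""
    counts = {group: 0 for group in RESIDUE_GROUPS}
    for residue in peptide:
        counts[classify(residue)] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical example peptide, written from N- to C-terminus.
    print(composition("MKTAYIAKQRDE"))
```

Such a count gives only a rough impression of how lipophilic or charged a peptide is; it says nothing about where the residues end up in the folded protein.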

Table 2.4 Compilation and grouping of the proteinogenic amino acids: two types of abbreviations are recognized internationally, which either consist of one or three letters; the codons that represent the amino acids in the genetic code are also given.

Classification | Symbols | Codons
Neutral and nonpolar amino acids
Glycine | Gly; G | GGA GGC GGG GGU
Alanine | Ala; A | GCA GCC GCG GCU
Valine | Val; V | GUA GUC GUG GUU
Leucine | Leu; L | UUA UUG CUA CUC CUG CUU
Isoleucine | Ile; I | AUA AUC AUU
Tryptophan | Trp; W | UGG
Phenylalanine | Phe; F | UUC UUU
Methionine | Met; M | AUG
Cysteine | Cys; C | UGC UGU
Proline | Pro; P | CCA CCC CCG CCU
Neutral and polar amino acids
Serine | Ser; S | AGC AGU UCA UCC UCG UCU
Threonine | Thr; T | ACA ACC ACG ACU
Tyrosine | Tyr; Y | UAC UAU
Asparagine | Asn; N | AAC AAU
Glutamine | Gln; Q | CAA CAG
Basic amino acids
Lysine | Lys; K | AAA AAG
Arginine | Arg; R | AGA AGG CGA CGC CGG CGU
Histidine | His; H | CAC CAU
Acidic amino acids
Aspartate | Asp; D | GAC GAU
Glutamate | Glu; E | GAA GAG
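The codon assignments in Table 2.4 can be turned directly into a lookup for translating an mRNA sequence. The sketch below is illustrative only: the function name `translate` and the example sequence are hypothetical, and the three stop codons are an addition from the standard genetic code because the table does not list them.

```python
# Codon assignments copied from Table 2.4 (RNA alphabet, one-letter symbols).
TABLE_2_4 = {
    "G": "GGA GGC GGG GGU",
    "A": "GCA GCC GCG GCU",
    "V": "GUA GUC GUG GUU",
    "L": "UUA UUG CUA CUC CUG CUU",
    "I": "AUA AUC AUU",
    "W": "UGG",
    "F": "UUC UUU",
    "M": "AUG",
    "C": "UGC UGU",
    "P": "CCA CCC CCG CCU",
    "S": "AGC AGU UCA UCC UCG UCU",
    "T": "ACA ACC ACG ACU",
    "Y": "UAC UAU",
    "N": "AAC AAU",
    "Q": "CAA CAG",
    "K": "AAA AAG",
    "R": "AGA AGG CGA CGC CGG CGU",
    "H": "CAC CAU",
    "D": "GAC GAU",
    "E": "GAA GAG",
}

# Invert the table: codon -> one-letter amino acid symbol.
CODON_TO_AA = {
    codon: aa for aa, codons in TABLE_2_4.items() for codon in codons.split()
}

# Assumption: stop codons of the standard genetic code (not part of Table 2.4).
STOP_CODONS = {"UAA", "UAG", "UGA"}


def translate(mrna: str) -> str:
    """Translate an mRNA reading frame codon by codon until a stop codon."""
    mrna = mrna.upper()
    peptide = []
    for i in range(0, len(mrna) - 2, 3):  # incomplete trailing codons are ignored
        codon = mrna[i:i + 3]
        if codon in STOP_CODONS:
            break
        peptide.append(CODON_TO_AA[codon])
    return "".join(peptide)


if __name__ == "__main__":
    print(translate("AUGGCCUGGUAA"))  # Met-Ala-Trp, then stop
```

The 61 codons of the table together with the three stop codons account for all 64 possible triplets.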

The human body is capable of synthesizing some amino acids; others must be obtained through nutrition (essential amino acids). The amino acids phenylalanine, tryptophan, lysine, methionine, valine, leucine, isoleucine, histidine, and threonine belong to the essential amino acids.

Proteins often undergo posttranslational modification, for example glycosylation, in which oligosaccharide residues are transferred to asparagine residues (N‐glycosidic) or to serine or threonine residues (O‐glycosidic) (see Section 5.4). Glycoproteins are found on the outside of the cell, in cell walls, and in the extracellular matrix, especially in connective tissue. Glycosylation is important for the biological activity and antigenic properties of these proteins.

While the peptide bond itself is inflexible, the substituents at the α‐C atom of an amino acid can rotate freely. As a result, a polypeptide chain can adopt a number of spatial structures (conformations). Under the aqueous conditions found in the cell, polypeptide chains are not present in a linear form, but spontaneously form secondary and tertiary structures, which are energetically more favorable. These structures rely on many noncovalent bonds and forces, the most important of which are:

 Hydrogen bonds (bond strength of 4 kJ mol⁻¹ under aqueous conditions).

 Ionic bonds (electrostatic attraction) (bond strength of 12.5 kJ mol⁻¹).

 van der Waals forces (bond strength of 0.5 kJ mol⁻¹).

 Hydrophobic attractions.

Figure 2.8 summarizes the most common hydrogen bonds present in a cell. Electronegative atoms, such as oxygen and nitrogen, try to withdraw electrons from neighboring atoms such as hydrogen. This results in oxygen and nitrogen having a slight negative charge, while hydrogen is slightly positively charged. Positive and negative charges attract one another. The resulting attractions are known either as hydrogen bonds or as hydrogen bridges. The ability to form hydrogen bonds is especially present in water molecules (the hydrogens are positive; the oxygen atom is negatively charged), and water is therefore considered as the universal solvent of the cell. Biomolecules with polar groups easily take up water molecules (they are water soluble), while nonpolar residues repel water (hydrophobic) and group together with other apolar molecules (which are fat soluble). Figure 2.9 illustrates the importance of noncovalent and covalent bonds for the formation of protein folds. Through the formation of disulfide bridges between two cysteine residues, the conformation of a protein can also be covalently influenced (Figure 2.9).


Figure 2.8 Important hydrogen bonds in biomolecules.


Figure 2.9 Noncovalent bonds and disulfide bridges lead to a spatial folding and stabilization of a peptide. Bond types: hydrogen bonds, ionic bonds, van der Waals forces, and disulfide bridges.

In comparison with covalent bonds (bond strength of 348–469 kJ mol⁻¹), noncovalent bonds are 5–100 times weaker. When many noncovalent bonds are present simultaneously, they can act cooperatively, leading to the formation of stable and thermodynamically favored structural elements in polypeptides. Hydrophobic amino acid residues cluster together in order to exclude water. In polypeptides this can lead to a globular tertiary structure in which the hydrophobic residues are oriented toward the inside and the polar and charged residues toward the outside (Figure 2.10). Under aqueous conditions, proteins usually fold spontaneously into a stable conformation in which the free energy is at its lowest.


Figure 2.10 Folding of peptide chains under aqueous conditions leads to a compact globular conformation with a hydrophobic core.

However, the conformation of proteins can easily change if they come into contact with other proteins or other components of the cell. Protein modifications such as phosphorylation (of the hydroxyl groups of tyrosine, serine, and threonine) or dephosphorylation likewise lead to a change in conformation. Experimentally, it is simple to alter the conformation of a protein using detergents or urea. For example, when globular proteins are dissolved in a 4 M urea solution, the polypeptide chain unfolds (i.e. the protein is denatured). If the urea is removed, the polypeptide chain refolds into its previous conformation (renaturation).

Even though each protein has an individual conformation, when the structures of many proteins are compared, two folding patterns that regularly appear are recognized. These structural elements are:

 α‐Helix structures.

 β‐Pleated sheet structures.

α‐Helix structures and β‐pleated sheet structures arise from hydrogen bonds between the NH and CO groups in the backbone of the polypeptide chain. Functional groups on the side chains do not take part in these structural elements. Figure 2.11 describes the structure of helices and pleated sheets more precisely. Other structures include loops and random coils.


Figure 2.11 Importance of hydrogen bonds for the construction of α‐helix and β‐sheet structures. (a) The right twisting helix has 3.6 residues per turn. The dotted lines represent the hydrogen bonds between CO and NH groups. (b) The zigzag‐shaped representation of a β‐pleated sheet. Dotted lines symbolize hydrogen bonds. The side chains alternate between being present below and above the folded plane.

Source: Voet et al. (2016). Reproduced with permission of John Wiley and Sons.

A β‐sheet structural element is often found at the inner core of proteins. The β‐pleated sheet can form between neighboring polypeptide chains that have the same orientation (parallel chains). When a polypeptide chain folds back on itself so that the adjacent segments run in opposite directions, the chains are termed antiparallel. In both cases, the chains are held strongly together by hydrogen bonds (Figure 2.11).

An α‐helix forms when a single peptide chain winds around itself and forms a sturdy cylinder. In doing so, a hydrogen bond forms between every fourth peptide bond (i.e. between the CO group of one peptide bond and the NH group of the peptide bond four residues further along). This results in the formation of an ordered helix with a complete turn every 3.6 amino acids. Short α‐helix structures can be found in membrane proteins that possess a transmembrane region. In this case, the α‐helix consists largely of amino acids with nonpolar residues. The nonpolar residues are oriented toward the outside of the helix, where they shield the hydrophilic backbone of the peptide chain and interact with the lipophilic components of the phospholipids.
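The figure of 3.6 residues per turn fixes the geometry of the helix: each residue is rotated by 360°/3.6 = 100° relative to its predecessor. The following sketch computes these angles (the basis of a "helical wheel" projection) and a rough helix length. The rise of 0.15 nm per residue is a commonly cited textbook value and is an assumption here, since this section does not state it; all function names are hypothetical.

```python
RESIDUES_PER_TURN = 3.6      # from the text
RISE_PER_RESIDUE_NM = 0.15   # assumption: commonly cited value, not given in this section


def rotation_per_residue_deg() -> float:
    """Angle by which successive residues are rotated around the helix axis."""
    return 360.0 / RESIDUES_PER_TURN


def helical_wheel_angles(n_residues: int) -> list[float]:
    """Angular position (degrees, 0-360) of each residue viewed along the helix axis."""
    step = rotation_per_residue_deg()
    return [round((i * step) % 360.0, 1) for i in range(n_residues)]


def helix_length_nm(n_residues: int) -> float:
    """Approximate length of the helix along its axis."""
    return n_residues * RISE_PER_RESIDUE_NM


if __name__ == "__main__":
    print(rotation_per_residue_deg())
    print(helical_wheel_angles(8))
    print(helix_length_nm(20))
```

Under these assumptions, a helix of about 20 residues is roughly 3 nm long, which is of the same order as the thickness of the hydrophobic part of a lipid bilayer.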

In fibrous proteins (e.g. α‐keratin), two or three longer helices can twist around each other (coiled coil) and form long ropelike structures.

The structure of proteins is very complex, because there are thousands of covalent and noncovalent bonding possibilities between the atoms of the peptide chains and the amino acid residues. Through X‐ray and nuclear magnetic resonance (NMR) analysis, the spatial structures of many hundreds of proteins have been determined. Structure analysis is a challenge not only for basic research but also for applied pharmaceutical research. If the structure or binding sites of a receptor or enzyme are known in detail, it should be possible to design new active substances that have the correct fit and act either as an agonist or as an antagonist. Successes in rational drug design so far concern active substances in the area of AIDS (HIV protease inhibitors: Viracept, Agenerase) and influenza (neuraminidase inhibitors: Relenza, Tamiflu).

There are four structural levels of protein structure:

 Primary structure. Primary structure corresponds to the amino acid sequence.

 Secondary structure. Secondary structure corresponds to α‐helix and β‐pleated sheet formations.

 Tertiary structure. Tertiary structure corresponds to the three‐dimensional conformation of a polypeptide chain.

 Quaternary structure. If a protein complex consists of several subunits (e.g. hemoglobin), then the entire structure is referred to as the quaternary structure.

The proteins of a cell usually contain between 50 and 2000 amino acid residues. Theoretically, each of the 20 amino acids can appear at each position of a polypeptide chain. For an oligopeptide with a length of four amino acids, there are 20 × 20 × 20 × 20 = 160 000 different possible sequences. In general, the number of possible peptide molecules is $20^n$, where $n$ denotes the chain length. For a protein with the average length of 300 amino acids (Figure 2.12), this gives $20^{300} \approx 10^{390}$ possible variants, more than the number of atoms in our universe. Of this great number of variants, only a comparatively small fraction has apparently been realized by nature. In the course of evolution, many more proteins were created than exist today; following natural selection, only those proteins that have proven to be of value remain. Protein families have developed through gene duplication from the first proteins with defined functions, and the original sequence has diverged in the new proteins.
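The size of this sequence space is easy to check numerically. The short sketch below uses exact integer arithmetic for small chain lengths and a logarithm for long ones; the function names are hypothetical.

```python
import math


def sequence_space(n: int, alphabet: int = 20) -> int:
    """Exact number of distinct sequences of length n over the given alphabet."""
    return alphabet ** n


def log10_sequence_space(n: int, alphabet: int = 20) -> float:
    """Decimal exponent of the sequence space, i.e. log10(alphabet ** n)."""
    return n * math.log10(alphabet)


if __name__ == "__main__":
    print(sequence_space(4))                 # tetrapeptides: 160000
    print(log10_sequence_space(300))         # exponent for 300 residues, about 390.3
    print(len(str(sequence_space(300))))     # number of decimal digits of 20**300
```

The exponent of roughly 390 for a chain of 300 residues confirms the order of magnitude quoted in the text.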


Figure 2.12 Size of proteins in yeast (Saccharomyces cerevisiae). The yeast genome project allowed a first estimate of the size of yeast proteins.

During the analysis of genome projects, individual structural domains of many proteins have been identified with the help of bioinformatics. Large proteins are usually made up of several functional domains or modules. Domains usually have defined structures and functions (Figures 2.13 and 2.14). They often correspond to the exons in a eukaryotic gene (see Section 4.2). They developed early in evolution, apparently independently of each other. In a later evolutionary phase, the gene sections coding for a domain were newly combined. Through domain shuffling, proteins with new characteristics could thus be created. As a consequence, most proteins can be seen as variants of previously existing proteins or of their domains. As an example, Figure 2.13 shows the structure of an Src protein that has four domains. Examples of domain shuffling are illustrated in Figure 2.14. Domain shuffling is important for the explanation of evolutionary development: evolutionary advancement arises not only from individual point mutations but mainly from new combinations of functional modules (prefabricated building blocks).


Figure 2.13 Structure of Src protein with four domains. The four domains are the (a) small kinase domain, (b) large kinase domain, (c) SH2 domain, and (d) SH3 domain.


Figure 2.14 Occurrence of domains in different proteins.

Many proteins contain binding sites for ligands; ligands can be not only low‐molecular‐weight substances but also macromolecules such as nucleic acids or other proteins. The binding of a ligand to a binding site can be viewed as a molecular recognition process. Such molecular recognition processes are common in the cell, but they are understood in detail in only a few cases. Their relevance to cell function, metabolism, and “life” should nevertheless not be underestimated. Experiments in structural biology have shown that the binding of a ligand in a binding site functions according to the lock‐and‐key principle. The binding site has a specific spatial structure into which a ligand fits selectively. Binding of the ligand involves the formation of several noncovalent bonds (Figure 2.15) between the functional groups of the ligand and those of the protein. Binding generally brings about a change of the protein conformation (induced fit). The binding site is usually not formed by amino acid residues that lie beside each other on the peptide chain; instead, it often consists of amino acids located in different parts of the peptide chain that are brought together spatially by specific folding (Figure 2.15).


Figure 2.15 Structure of binding sites within proteins. (a) Schematic illustration of the significance of noncovalent bonds in the lock‐and‐key principle. (b) cAMP is locked into a binding site via ionic and hydrogen bonds.

Interactions that occur between antigens and antibodies (see Chapter 28), between ligands and hormone receptors, and between enzymes and their substrates are particularly intimate and selective. The topic of protein–protein interactions is discussed further in Chapter 23.

Most of the cellular building blocks are inert molecules that are not prone to react chemically. A significant activation energy has to be overcome in order to start a chemical reaction. In the laboratory, this can be achieved by heating and by adding acids or bases. In biological systems, evolution has developed enzymes as biological catalysts that are able to catalyze all necessary reactions without higher temperatures being necessary. Enzymes do not change the reaction equilibrium, but increase the reaction rate. Enzymes contain an active center in which a substrate is bound. After the enzyme has catalyzed a reaction, the product is released, but the enzyme remains unchanged and is ready for a new reaction. Noncovalent interactions (hydrogen bonds, ionic bonds) and transient covalent bonds between protein and substrate play a key role during binding and catalysis. Detailed elucidation of such interactions at the atomic scale is the task of biophysics and biochemistry. This research is also important for biotechnology in relation to the synthesis of new enzyme inhibitors or enzyme modulators.
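Why a catalyst changes the rate but not the equilibrium can be illustrated with the Arrhenius relation, k = A · exp(−Ea/RT), which is not part of this section and is brought in here only as a simple model. The sketch assumes that the enzyme lowers the activation energy of the forward and reverse reactions by the same amount and leaves the pre‐exponential factor unchanged; all energy values are hypothetical.

```python
import math

R = 8.314  # gas constant in J mol^-1 K^-1


def rate_constant(ea_kj_per_mol: float, temperature_k: float = 310.0,
                  prefactor: float = 1.0) -> float:
    """Arrhenius rate constant for a given activation energy (simple model)."""
    return prefactor * math.exp(-ea_kj_per_mol * 1000.0 / (R * temperature_k))


def equilibrium_constant(ea_forward: float, ea_reverse: float) -> float:
    """Equilibrium constant as the ratio of forward and reverse rate constants."""
    return rate_constant(ea_forward) / rate_constant(ea_reverse)


if __name__ == "__main__":
    # Hypothetical barriers in kJ mol^-1; the enzyme lowers both by 30.
    uncatalyzed = (80.0, 100.0)
    catalyzed = (50.0, 70.0)

    speedup = rate_constant(catalyzed[0]) / rate_constant(uncatalyzed[0])
    print(f"forward rate increases by a factor of about {speedup:.3g}")
    print(equilibrium_constant(*uncatalyzed))
    print(equilibrium_constant(*catalyzed))  # same value within rounding
```

In this model, a modest reduction of the barrier increases the rate by several orders of magnitude at body temperature, while the ratio of forward to reverse rate constants stays the same.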

Enzymes show high substrate specificity. It is believed that for almost every biosynthetic step that happens in the cell, a specific enzyme is also present. This does not rule out that enzymes that catalyze chemically similar reactions can be derived from a common original enzyme. Such enzymes belong to a common protein family. Most enzymes have particular pH and temperature optima. Enzymes are divided into different classes according to the processes catalyzed (Table 2.5). Coenzymes or inorganic ions often take part in the catalysis itself. Many coenzymes must be ingested in the form of vitamins (Table 2.6) because the human body cannot synthesize them itself. Biochemists and biotechnologists are interested in the elucidation of enzymatic reaction mechanisms because they can provide hints for new catalysts for organic synthesis. Apart from this, scientists are attempting to create new biological catalysts by producing artificial enzymes through genetic engineering of existing enzymes.

Table 2.5 Important classes of enzymes.

Enzyme | Reaction catalyzed
Hydrolases | Catalyze hydrolytic cleavage (amylase, lipase, glucosidase, esterase)
Nucleases | Hydrolyze nucleic acids (DNase, RNase)
Proteases | Cleave peptides (pepsin, trypsin, chymotrypsin)
Isomerases | Catalyze the rearrangement of bonds within a molecule
Synthases | General name for an enzyme that catalyzes condensation reactions in anabolic processes
Polymerases | Catalyze the formation of RNA and DNA
Kinases | Transfer phosphate residues; the protein kinases (PKA, PKC) are particularly important
Phosphatases | Remove phosphate residues from a molecule
ATPases | Hydrolyze ATP (e.g. H⁺‐ATPase, Na⁺/K⁺‐ATPase, Ca²⁺‐ATPase); motor proteins, such as myosin
GTPases | Hydrolyze GTP; many GTP‐binding proteins work as GTPases
Oxidoreductases | Enzymes that catalyze redox reactions, in which one molecule is reduced and another is oxidized; they are grouped into oxidases, reductases, and dehydrogenases

Table 2.6 Many vitamins serve as essential coenzymes for enzyme reactions.

Vitamin | Coenzyme | Enzyme reactions that require the coenzyme
Thiamine (vitamin B1) | Thiamine pyrophosphate | Activation and transfer of aldehydes
Pyridoxine (vitamin B6) | Pyridoxal phosphate | Transaminases and decarboxylases
Biotin (vitamin B7) | Biotin | Activation and transfer of CO2
Riboflavin (vitamin B2) | FADH2 | Oxidations–reductions
Niacin (vitamin B3) | NADH, NADPH | Oxidations–reductions
Pantothenic acid (vitamin B5) | Coenzyme A | Activation and transfer of acyl groups
Lipoic acid | Lipoamide | Activation of acyl groups; oxidations–reductions
Folic acid (vitamin B9) | Tetrahydrofolate | Activation and transfer of single‐carbon groups
Vitamin B12 | Cobalamin | Isomerization and methyl group transfer

In addition to a catalytic center, many enzymes (especially those composed of several subunits) also have a regulatory center where allosteric ligands bind. For example, the second messenger cAMP binds to the tetrameric protein kinase A complex; after binding, the two regulatory subunits dissociate from the two catalytic subunits, which results in the activation of the latter (Figure 3.9). Enzymes can be inhibited by inhibitors. We distinguish between reversible, irreversible, competitive, and noncompetitive inhibitors.

A further important way to regulate the activity of enzymes or regulatory proteins is that of reversible conformational change. This is achieved by phosphorylation/dephosphorylation with the help of protein kinases or phosphatases, respectively. Most of the protein kinases utilize adenosine triphosphate (ATP); other molecular switches work through the binding of guanosine triphosphate (GTP) and guanosine diphosphate (GDP) (Figure 2.16, Table 2.7). A reversible reduction of disulfide bridges (e.g. through thioredoxin) plays an important role during the regulation of light‐dependent chloroplast enzymes. Biochemists and cell biologists are working extensively to define all cellular proteins that are regulated through phosphorylation and GTP/GDP to gain a better understanding of regulation processes and regulatory pathways or networks inside the cell (see Section 3.1.1.3).


Figure 2.16 Reversible activation and inactivation of enzymes and regulatory proteins. (a) Phosphorylation/dephosphorylation. (b) Binding of GTP/GDP. GEF, guanine nucleotide exchange factor; GAP, GTPase‐activating protein.

Table 2.7 Nomenclature of DNA and RNA building blocks.

Base | Nucleoside (abbreviation) | RNA nucleotides with 1, 2, 3 phosphate groups | DNA nucleotides with 1, 2, 3 phosphate groups
Adenine | Adenosine (A) | AMP, ADP, ATP | dAMP, dADP, dATP
Guanine | Guanosine (G) | GMP, GDP, GTP | dGMP, dGDP, dGTP
Cytosine | Cytidine (C) | CMP, CDP, CTP | dCMP, dCDP, dCTP
Thymine | Thymidine (T) | - | dTMP, dTDP, dTTP
Uracil | Uridine (U) | UMP, UDP, UTP | -

AMP, adenosine monophosphate; ADP, adenosine diphosphate; ATP, adenosine triphosphate; d, deoxy.
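The abbreviations in Table 2.7 follow a regular pattern: an optional "d" for deoxy, the letter of the nucleoside, a letter for the number of phosphate groups (M, D, or T), and a final "P". The sketch below reproduces this pattern; the function name is hypothetical, and the restriction of thymine to DNA and uracil to RNA simply mirrors the entries of the table.

```python
BASE_LETTER = {
    "adenine": "A",
    "guanine": "G",
    "cytosine": "C",
    "thymine": "T",
    "uracil": "U",
}
PHOSPHATE_LETTER = {1: "M", 2: "D", 3: "T"}  # mono-, di-, triphosphate


def nucleotide_abbreviation(base: str, phosphates: int, deoxy: bool = False) -> str:
    """Build an abbreviation such as 'ATP' or 'dTMP' following Table 2.7."""
    letter = BASE_LETTER[base.lower()]
    # Table 2.7 lists thymine only for DNA and uracil only for RNA.
    if letter == "T" and not deoxy:
        raise ValueError("Table 2.7 lists thymine nucleotides only in the deoxy (DNA) form")
    if letter == "U" and deoxy:
        raise ValueError("Table 2.7 lists uracil nucleotides only in the ribo (RNA) form")
    return ("d" if deoxy else "") + letter + PHOSPHATE_LETTER[phosphates] + "P"


if __name__ == "__main__":
    print(nucleotide_abbreviation("adenine", 3))              # ATP
    print(nucleotide_abbreviation("guanine", 2))              # GDP
    print(nucleotide_abbreviation("thymine", 1, deoxy=True))  # dTMP
```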

Many pathways have been optimized during evolution to increase their rate and efficiency. One way is to organize all proteins of a certain pathway or reaction in the form of multienzyme complexes, in which enzymes that share substrates and intermediates are in close vicinity, thus shortening diffusion paths. Another strategy is to concentrate the pathway enzymes in a particular cellular compartment, e.g. the citric acid cycle in mitochondria.

