Bioinformatics, by various authors
List of Illustrations
Chapter 1
Figure 1.1 The landing page for ENA record U54469.1, providing a graphical vie...
Figure 1.2 Results of a search for the human heterogeneous nuclear ribosomal p...
Figure 1.3 The Subcellular location and Pathology & Biotech sections of ...
Figure 1.4 The Feature viewer rendering of the record for the human heterogene...
Figure 1.5 Expanding the PTM, Structural features, and Variants sections withi...
Chapter 2
Figure 2.1 The exponential growth of GenBank in terms of number of nucleotides...
Figure 2.2 Results of a text-based Entrez query against PubMed using Boolean o...
Figure 2.3 An example of a PubMed record in Abstract format, as returned throu...
Figure 2.4 Neighbors to an entry found in PubMed. The original entry from Figu...
Figure 2.5 The Entrez Gene page for the DCC (deleted in colorectal carcinoma) ...
Figure 2.6 A section of the Database of Single Nucleotide Polymorphisms (dbSNP...
Figure 2.7 Entries in the RefSeq protein database corresponding to the origina...
Figure 2.8 The RefSeq entry for the netrin receptor, the protein product of th...
Figure 2.9 The same RefSeq entry for the netrin receptor shown in Figure 2.8, ...
Figure 2.10 Protein structures associated with the RefSeq entry for the human ...
Figure 2.11 The structure summary page for pdb:4URT, the crystal structure of ...
Figure 2.12 A list of structures deemed similar to pdb:4URT using VAST+. The t...
Figure 2.13 Online Mendelian Inheritance in Man (OMIM) entries related to the
Figure 2.14 The Online Mendelian Inheritance in Man (OMIM) entry for the DCC g...
Figure 2.15 An example of a list of allelic variants that can be found through...
Figure 2.16 The ClinicalTrials.gov page showing all actively recruiting clinic...
Figure 2.17 A clickable map showing where actively recruiting clinical trials ...
Figure 2.18 The Mouse Genome Informatics (MGI) entry for the Dcc gene in mouse...
Figure 2.19 The Zebrafish Information Network (ZFIN) gene page for the dcc gen...
Figure 2.20 An example of gene expression data available through the Zebrafish...
Chapter 3
Figure 3.1 The BLOSUM62 scoring matrix (Henikoff and Henikoff 1992). BLOSUM62 ...
Figure 3.2 A nucleotide scoring table. The scoring for the four nucleotide bas...
Figure 3.3 The initiation of a BLAST search. The search begins with query word...
Figure 3.4 BLAST search extension. Length of extension represents the number o...
Figure 3.5 The National Center for Biotechnology Information (NCBI) BLAST land...
Figure 3.6 The upper portion of the BLASTP query page. The first section in th...
Figure 3.7 The lower portion of the BLASTP query page, showing algorithm param...
Figure 3.8 Graphical display of BLASTP results. The query sequence is represen...
Figure 3.9 The BLASTP “hit list.” For each sequence found, the user is present...
Figure 3.10 Detailed information on a representative BLASTP hit. The header pr...
Figure 3.11 Performing a BLAST 2 Sequences alignment. Clicking the check box a...
Figure 3.12 Typical output from a BLAST 2 Sequences alignment, based on the qu...
Figure 3.13 Constructing a position-specific scoring matrix (PSSM). In the upp...
Figure 3.14 Performing a PSI-BLAST search. See text for details.
Figure 3.15 Selecting algorithm parameters for a PSI-BLAST search. See text fo...
Figure 3.16 Results of the first round of a PSI-BLAST search. For each sequenc...
Figure 3.17 Results of the second round of a PSI-BLAST search. New sequences i...
Figure 3.18 Submitting a BLAT query. A rat clone from the Cancer Genome Anatom...
Figure 3.19 Results of a BLAT query. Based on the query submitted in Figure 3....
Figure 3.20 The FASTA search strategy. (a) Once FASTA determines words of leng...
Figure 3.21 Search summary from a protein–protein FASTA search, using the sequ...
Figure 3.22 Hit list for the protein–protein FASTA search described in Figure ...
Chapter 4
Figure 4.1 The home page of the UCSC Genome Browser, showing a query for the g...
Figure 4.2 The default view of the UCSC Genome Browser, showing the genomic co...
Figure 4.3 The genomic context of the human HIF1A gene, after clicking on zoom...
Figure 4.4 The RefSeq Track Settings page. The track settings pages are used t...
Figure 4.5 The genomic context of the human HIF1A gene, after displaying RefSe...
Figure 4.6 The Get Genomic Sequence page that provides an interface for users ...
Figure 4.7 The genomic context of the human HIF1A gene, after changing the dis...
Figure 4.8 Configuring the track settings for the Common SNPs(150) track. Set ...
Figure 4.9 The genomic context of the human HIF1A gene, after changing the col...
Figure 4.10 The GTEx Gene track, which depicts median gene expression levels i...
Figure 4.11 BLAT search at the UCSC Genome Browser. (a) This page shows the re...
Figure 4.12 Configuring the UCSC Table Browser. The link to the Table Browser ...
Figure 4.13 The home page of the Ensembl Genome Browser, showing a query for t...
Figure 4.14 The Gene tab for the human PAH gene. This landing page provides li...
Figure 4.15 Computationally predicted orthologs of the human PAH gene, from th...
Figure 4.16 The Location tab for the human PAH gene. The Location tab is divid...
Figure 4.17 Zooming in on the bottom section of the Location tab from Figure 4...
Figure 4.18 The Ensembl Variant tab. (a) To get more details about SNP rs76296...
Figure 4.19 The Ensembl Regulatory Build track. (a) Go to Configure this page ...
Figure 4.20 The Synteny view at Ensembl. (a) An overview of the syntenic block...
Figure 4.21 Ensembl BLAST output, showing an alignment between the human ADAM1...
Figure 4.22 Using BioMart to retrieve the mouse orthologs of the human RefSeqs...
Figure 4.23 JBrowse display of a predicted Mnemiopsis gene (ML05372a) from the...
Chapter 5
Figure 5.1 A simplified depiction of a prokaryotic gene or open reading frame ...
Figure 5.2 A simplified depiction of a eukaryotic gene illustrating the multi-...
Figure 5.3 A schematic illustration of the upstream regions of a eukaryotic ge...
Figure 5.4 A schematic illustration of the splice site regions around exons an...
Figure 5.5 Sample output from a GENSCAN analysis of the uroporphyrinogen decar...
Figure 5.6 Schematic representation of measures of gene prediction accuracy at...
Figure 5.7 The typical L-shaped structure of a tRNA molecule. This depicts the...
Figure 5.8 A screenshot montage of the PHASTER web server showing the website ...
Figure 5.9 A screenshot of a BASys bacterial genome annotation output for the ...
Chapter 6
Figure 6.1 The three levels of organization of RNA structure. (a) The primary ...
Figure 6.2 The RNA secondary structure of the 3′ untranslated region of the Dr...
Figure 6.3 An illustration of the equilibria of RNA structures in solution. (a...
Figure 6.4 Prediction of conformational free energy for a conformation of RNA ...
Figure 6.5 A simple RNA pseudoknot. This figure illustrates two representation...
Figure 6.6 The input form for the version 3.1 Mfold server. (a) The top and (b...
Figure 6.7 The output page for the Mfold server. Please refer to the text for ...
Figure 6.8 Sample output from the Mfold web server, version 3.1. (a) The secon...
Figure 6.9 RNAstructure web server input form. (a) The top and (b) the bottom ...
Figure 6.10 Sample output from the RNAstructure web server showing the predict...
Figure 6.11 Input form for the RNAstructure web server for multiple-sequence p...
Figure 6.12 Sample output from the RNAstructure web server for multiple-sequen...
Chapter 7
Figure 7.1 Dashboard of the PredictProtein web server. PredictProtein (Yachdav...
Figure 7.2 Protein secondary structure. Experimentally determined three-dimens...
Figure 7.3 Accessible surface area (ASA). The ASA describes the surface that i...
Figure 7.4 Protein secondary structure. Prediction of secondary structure, sol...
Figure 7.5 Types of transmembrane proteins. Experimentally determined three-di...
Figure 7.6 Transmembrane helix prediction by TMSEG. TMSEG (Bernhofer et al. 20...
Figure 7.7 Annotations of human tumor suppressor P53 (P53_HUMAN). (a) InterPro...
Figure 7.8 Prediction of subcellular localization. Visual output from LocTree3...
Figure 7.9 From predicting single amino acid sequence variant (SAV) effects to...
Chapter 8
Figure 8.1 An example multiple sequence alignment of seven globin protein sequ...
Figure 8.2 An outline of the simple progressive multiple alignment process. Th...
Figure 8.3 Aligner accuracy versus total single-threaded run time using the BA...
Figure 8.4 Total single-threaded execution time (y-axis) for different aligner...
Figure 8.5 Ratio of total run time relative to single-threaded execution (y-ax...
Figure 8.6 Protein and RNA multiple sequence alignments as visualized using Ja...
Figure 8.7 Linked coding sequence (CDS), protein, and three-dimensional struct...
Chapter 9
Figure 9.1 Different ways to visualize a tree. In this example, the same tree ...
Figure 9.2 Alignments illustrating sequence similarity versus sequence identit...
Figure 9.3 The differences between orthologs, paralogs, and xenologs. The ance...
Figure 9.4 The difference between phylogenetic signal and phylogenetic noise. ...
Figure 9.5 Character-based versus distance-based phylogenetic methods. Charact...
Figure 9.6 Rooting a tree with an outgroup. Escherichia coli bacteria are comm...
Figure 9.7 Workflow for a protein-based phylogenetic analysis using the PHYLIP...
Figure 9.8 Phylogenetic relationships can be visualized using different types ...
Figure 9.9 Excerpt of a Salmonella minimum spanning tree. Types of Salmonella ...
Chapter 10
Figure 10.1 Example of an MA plot before (a) and after (b) normalization. A, o...
Figure 10.2 Histogram of the base mismatch (MM) rate across multiple RNA-seq s...
Figure 10.3 Overview of quantile normalization. We start with the box on the t...
Figure 10.4 Batch effects principal components analysis (PCA) example. Boxplot...
Figure 10.5 A simple illustration of the process of hierarchical clustering. (...
Figure 10.6 Heatmap showing clustering of gene expression data of the 100 most...
Figure 10.7 First two components of principal component analysis (PCA) on the ...
Figure 10.8 Principal component analysis (PCA) is a dimensionality reduction m...
Figure 10.9 Illustration of how one can select k when performing consensus clu...
Figure 10.10 Receiver operating characteristic (ROC) curve for a model desig...
Chapter 11
Figure 11.1 Gene(s) to proteoforms. This figure illustrates the complexity of ...
Figure 11.2 Quadrupole mass analyzer. Schematic of a quadrupole mass analyzer,...
Figure 11.3 Time of flight (TOF) mass analyzer. Schematic of a TOF mass analyz...
Figure 11.4 (a) Tandem mass spectrometry (MS). Schematic of a triple quadrupol...
Figure 11.5 Fragmentation tandem mass spectrometry (MS/MS, or MS2) spectrum. A...
Figure 11.6 Polypeptide backbone cleavage produces different product ion speci...
Figure 11.7 Post-translational modifications (PTMs) take place at different am...
Figure 11.8 Data pre-processing workflow of a mass spectrum. Different steps i...
Figure 11.9 Shotgun proteomics workflow. Schematic showing different steps inv...
Figure 11.10 A schematic diagram comparing the label-free approach with the di...
Figure 11.11 Peptide mass fingerprinting (PMF) workflow. Schematic showing dif...
Figure 11.12 Mascot peptide mass fingerprinting (PMF). PMF submission screen a...
Figure 11.13 Peptide sequencing via tandem mass spectrometry (MS/MS) spectra i...
Figure 11.14 Peptide sequence tag searching. Schematic illustrating how a sequ...
Figure 11.15 Peptide spectrum match (PSM). Annotated MS2 spectrum showing matc...
Figure 11.16 Mascot search engine. Mascot MS2 database search submission windo...
Figure 11.17 Proteomics. A broad classification of proteomics and the biologic...
Chapter 12
Figure 12.1 A flow diagram illustrating the steps used to experimentally prepa...
Figure 12.2 An example of a nuclear magnetic resonance (NMR) “blurrogram” of a...
Figure 12.3 The different levels of protein structures illustrating: (a) prima...
Figure 12.4 Examples of different types of protein folds including (a) the fou...
Figure 12.5 An illustration of standard amino acid residue and peptide bond ge...
Figure 12.6 An example of a Protein Data Bank formatted file showing the first...
Figure 12.7 A Ramachandran plot for the thioredoxin protein (Protein Data Bank...
Figure 12.8 A screenshot of the Research Collaboratory for Structural Bioinfor...
Figure 12.9 A screenshot of an image of Escherichia coli thioredoxin as genera...
Figure 12.10 An illustration of the four major approaches to rendering protein...
Figure 12.11 An example of the high-quality images that can be created using a...
Figure 12.12 An illustration of a homology model (b) of Escherichia coli thior...
Figure 12.13 A schematic illustration of how threading is performed. (a) A que...
Figure 12.14 An example of the high-quality postscript output data from PROCHE...
Figure 12.15 An example of the CATH database description of Escherichia coli t...
Chapter 13
Figure 13.1 The Reactome database pathway view. The central view shows pathway...
Figure 13.2 The EcoCyc database cellular overview of Escherichia coli metaboli...
Figure 13.3 An example of metabolic pathway reconstruction from Kyoto Encyclop...
Figure 13.4 A BioGRID database record. A screenshot of the result page for a B...
Figure 13.5 An IntAct database search for the human MDM2 gene. A summary of al...
Figure 13.6 An example of the main STRING query result page. A network of rela...
Figure 13.7 A query result from GeneMANIA. Each node in the network represents...
Figure 13.8 The AKT pathway as represented by a traditional method (top left, ...
Figure 13.9 The main components of the Proteomics Standards Initiative–Molecul...
Figure 13.10 The valine biosynthesis pathway dynamically drawn by the Pathway ...
Figure 13.11 Output from the PathVisio software showing a portion of a human c...
Figure 13.12 The set of symbol types available in the Systems Biology Graphica...
Figure 13.13 The Drosophila melanogaster cell cycle drawn using Systems Biolog...
Figure 13.14 The results of pathway enrichment analysis using the g:Profiler t...
Figure 13.15 A Gene Set Enrichment Analysis (GSEA) enrichment figure. The bott...
Figure 13.16 An enrichment map showing two enriched themes. Each node represen...
Figure 13.17 An introduction to terminology and visual notation used in the co...
Figure 13.18 Zooming in on a network in Cytoscape shows part of a large connec...
Figure 13.19 An overview of a pathway analysis workflow, summarizing multiple ...
Chapter 14
Figure 14.1 A diagram illustrating the typical workflow for a metabolomic expe...
Figure 14.2 An example of a Molecular Design Limited (MDL) chemical fingerprin...
Figure 14.3 An example of a MOL file for a two-dimensional representation of L...
Figure 14.4 An example of an nmrML data file for L-alanine. The actual file is...
Figure 14.5 The JSpectraViewer image for L-alanine. JSpectraViewer is a Java a...
Figure 14.6 A selection of two screenshots from the PubChem web pages for the ...
Figure 14.7 Two screenshots of the gas chromatography–mass spectrometry (GC-MS...
Figure 14.8 Two screenshots from the Human Metabolome Database (HMDB) entry fo...
Figure 14.9 A simplified illustration of how spectral deconvolution works for ...
Figure 14.10 Two screenshots of the Bayesil web server. (a) A nuclear magnetic...
Figure 14.11 An illustration of how spectral deconvolution works for gas chrom...
Figure 14.12 An illustration of how principal component analysis can be though...
Figure 14.13 A three-dimensional principal component analysis (PCA) “scores” p...
Figure 14.14 The MetaboAnalyst Module Overview page. This page allows users to...
Figure 14.15 The MetaboAnalyst Data Upload page. This page allows users to upl...
Figure 14.16 The MetaboAnalyst Data Normalization page. The optimal normalizat...
Figure 14.17 The MetaboAnalyst Data Normalization and Scaling results, generat...
Figure 14.18 A two-dimensional principal component analysis (PCA) “scores” plo...
Figure 14.19 The principal component analysis (PCA) “loadings” plot, showing t...
Figure 14.20 The partial least squares discriminant analysis (PLS-DA) plot sho...
Figure 14.21 An example of an R2/Q2 plot generated by MetaboAnalyst using the ...
Figure 14.22 A variable importance in projection plot showing which metabolite...
Figure 14.23 A pathway impact plot showing the importance of different pathway...
Chapter 15
Figure 15.1 Principal components analysis (PCA) of nine world populations and ...
Figure 15.2 The coalescent process. Although the ancestral population contains...
Figure 15.3 Multiple sequentially Markovian coalescent (MSMC) estimate of popu...
Figure 15.4 Admixture analysis of nine populations and three test samples. Ind...
Figure 15.5 A Manhattan plot of Composite of Multiple Signals (CMS) scores (Y-...
Chapter 16
Figure 16.1 General workflow for DNA-based microbiome analysis.
Figure 16.2 FastQC summary of DNA sequence read quality for an Illumina sequen...
Figure 16.3 Primary structure and variable regions of the 16S ribosomal RNA ge...
Figure 16.4 k-mer decomposition of a nucleotide sequence with k = 2. Two seque...
Figure 16.5 Rarefaction curves for microbial communities sampled from six diff...
Figure 16.6 Unweighted phylogenetic alpha- and beta-diversity measures. Left: ...
Figure 16.7 Principal coordinate analysis (a) vs. non-metric multidimensional ...
Figure 16.8 Visualizing the differences between two groups of gut microbiome s...
Chapter 17
Figure 17.1 ClinVar entry for a benign variant in the cystic fibrosis gene (CF...
Figure 17.2 Receiver operating characteristic (ROCs) curves of five submissi...
Chapter 18
Figure 18.1 Relationships between observation, data, information, and knowledg...
Figure 18.2 Types of variables and their hierarchical relationships.
Figure 18.3 Organization of an example dataset. (a) Part of a two-dimensional ...
Figure 18.4 Commonly used descriptive statistics for sample variables. Light b...
Figure 18.5 Covariance versus correlation. The red sample has higher sample va...
Figure 18.6 Example histogram demonstrating the frequency of black cherry tree...
Figure 18.7 Example boxplot and related variant graphs. (a) Schematic diagram ...
Figure 18.8 Anscombe's quartet. Scatterplots with regression lines for four fa...
Figure 18.9 Scatterplot of the first two principal components (PCs) from princ...
Figure 18.10 Example of how to make a graph descriptive.
Figure 18.11 The standard normal distribution.
Figure 18.12 Other well-described discrete and continuous distributions common...
Figure 18.13 Bond length and coordination angle histograms for coordinated met...
Figure 18.14 Overview of the process of statistical inference. FUV stands for ...
Figure 18.15 Truth table with descriptions of type I and II errors.
Figure 18.16 Diagram illustrating the relationships between a probability dens...
Figure 18.17 Using a Student's t-test to test a null hypothesis.
Figure 18.18 Relationship between population and sample mean distributions.
Figure 18.19 An approximate power analysis diagram for a Student's t-test.
Appendices
Figure 6.A.1 Pseudo-computer code for the fill order of V(i,j) and W(i,j). Thi...
Figure 6.A.2 The filled V(i,j) array for sequence GCGGGUACCGAUCGUCGC.
Figure 6.A.3 The filled W(i,j) array for sequence GCGGGUACCGAUCGUCGC.
Figure 6.A.4 Illustrations of maximum hydrogen bond conformations as found by ...
Figure 6.A.5 Flowchart for structure traceback. Traceback starts by placing 1,...
Figure 6.A.6 The secondary structure of rGCGGGUACCGAUCGUCGC with 17 hydrogen b...