Algorithms in Bioinformatics

Genre: Mathematics. Publisher: John Wiley & Sons Limited. ISBN: 9781119697992. Format: e-book. Price: RUB 13,035.6 (USD 129.48). Age restriction: 0+.


Book description

ALGORITHMS IN BIOINFORMATICS

Explore a comprehensive and insightful treatment of the practical application of bioinformatic algorithms in a variety of fields.

Algorithms in Bioinformatics: Theory and Implementation delivers a thorough treatment of some of the main algorithms used to explain biological functions and relationships. It introduces readers to the art of algorithms in a practical manner that is linked with biological theory and interpretation. The book covers many key areas of bioinformatics, including global and local sequence alignment, forced alignment, detection of motifs, sequence logos, Markov chains, and information entropy. Other novel approaches are also described, such as Self-Sequence alignment, Objective Digital Stains (ODSs), Spectral Forecast, and the Discrete Probability Detector (DPD) algorithm. The text uses graphical illustrations to highlight the technical details of the computational algorithms it presents and to support the reader's understanding and retention of the material. Throughout, the book is written in an accessible and practical manner, showing how algorithms can be implemented and used in JavaScript in web browsers. The author has included more than 120 open-source implementations of the material, as well as 33 ready-to-use presentations. The book contains original material that has been class-tested by the author, and numerous cases are examined in a biological and medical context.
Readers will also benefit from the inclusion of:

- A thorough introduction to biological evolution, including the emergence of life, classifications and some known theories and molecular mechanisms
- A detailed presentation of new methods, such as Self-sequence alignment, Objective Digital Stains and Spectral Forecast
- A treatment of sequence alignment, including local sequence alignment, global sequence alignment and forced sequence alignment with full implementations
- Discussions of position-specific weight matrices, including the count, weight, relative frequencies, and log-likelihoods matrices
- A detailed presentation of the methods related to Markov Chains as well as a description of their implementation in Bioinformatics and adjacent fields
- An examination of information and entropy, including sequence logos and explanations related to their meaning
- An exploration of the current state of bioinformatics, including what is known and what issues are usually avoided in the field
- A chapter on philosophical transactions that allows the reader a broader view of the prediction process
- Native computer implementations in the context of the field of Bioinformatics
- Extensive worked examples with detailed case studies that point out the meaning of different results

Perfect for professionals and researchers in biology, medicine, engineering, and information technology, as well as upper level undergraduate students in these fields, Algorithms in Bioinformatics: Theory and Implementation will also earn a place in the libraries of software engineers who wish to understand how to implement bioinformatic algorithms in their products.

Contents

Paul A. Gagniuc. Algorithms in Bioinformatics

Table of Contents

List of Tables

List of Illustrations

Guide

Pages

Algorithms in Bioinformatics. Theory and Implementation

Preface

About the Companion Website

1 The Tree of Life (I)

1.1 Introduction

1.2 Emergence of Life

1.2.1 Timeline Disagreements

1.3 Classifications and Mechanisms

1.4 Chromatin Structure

1.5 Molecular Mechanisms

1.5.1 Precursor Messenger RNA

1.5.2 Precursor Messenger RNA to Messenger RNA

1.5.3 Classes of Introns

1.5.4 Messenger RNA

1.5.5 mRNA to Proteins

1.5.6 Transfer RNA

1.5.7 Small RNA

1.5.8 The Transcriptome

1.5.9 Gene Networks and Information Processing

1.5.10 Eukaryotic vs. Prokaryotic Regulation

1.5.11 What Is Life?

1.6 Known Species

1.7 Approaches for Compartmentalization

1.7.1 Two Main Approaches for Organism Formation

1.7.2 Size and Metabolism

1.8 Sizes in Eukaryotes

1.8.1 Sizes in Unicellular Eukaryotes

1.8.2 Sizes in Multicellular Eukaryotes

1.9 Sizes in Prokaryotes

1.10 Virus Sizes

1.10.1 Viruses vs. the Spark of Metabolism

1.11 The Diffusion Coefficient

1.12 The Origins of Eukaryotic Cells

1.12.1 Endosymbiosis Theory

1.12.2 DNA and Organelles

1.12.3 Membrane-bound Organelles with DNA

1.12.4 Membrane-bound Organelles Without DNA

1.12.5 Control and Division of Organelles

1.12.6 The Horizontal Gene Transfer

1.12.7 On the Mechanisms of Horizontal Gene Transfer

1.13 Origins of Eukaryotic Multicellularity

1.13.1 Colonies Inside an Early Unicellular Common Ancestor

1.13.2 Colonies of Early Unicellular Common Ancestors

1.13.3 Colonies of Inseparable Early Unicellular Common Ancestors

1.13.4 Chimerism and Mosaicism

1.14 Conclusions

2 Tree of Life: Genomes (II)

2.1 Introduction

2.2 Rules of Engagement

2.3 Genome Sizes in the Tree of Life

2.3.1 Alternative Methods

2.3.2 The Weaving of Scales

Additional algorithm 2.1 Note that the source code is in context and works with copy/paste

Additional algorithm 2.2 Note that the source code is in context and works with copy/paste

2.3.3 Computations on the Average Genome Size

2.3.4 Observations on Data

2.4 Organellar Genomes

2.4.1 Chloroplasts

2.4.2 Apicoplasts

2.4.3 Chromatophores

2.4.4 Cyanelles

2.4.5 Kinetoplasts

2.4.6 Mitochondria

2.5 Plasmids

2.6 Virus Genomes

2.7 Viroids and Their Implications

2.8 Genes vs. Proteins in the Tree of Life

2.9 Conclusions

3 Sequence Alignment (I)

3.1 Introduction

3.2 Style and Visualization

Additional algorithm 3.1 Note that the source code is out of context and is intended for explanation of the method

Additional algorithm 3.2 Note that the source code is out of context and is intended for explanation of the method

3.3 Initialization of the Score Matrix

Additional algorithm 3.3 Note that the source code is in context and works with copy/paste

3.4 Calculation of Scores

3.4.1 Initialization of the Score Matrix for Global Alignment

Additional algorithm 3.4 Note that the source code is out of context and is intended for explanation of the method

Additional algorithm 3.5 Note that the source code is out of context and is intended for explanation of the method

Additional algorithm 3.6 Note that the source code is in context and works with copy/paste

3.4.2 Initialization of the Score Matrix for Local Alignment

Additional algorithm 3.7 Note that the source code is out of context and is intended for explanation of the method

Additional algorithm 3.8 Note that the source code is in context and works with copy/paste

3.4.3 Optimization of the Initialization Steps

Additional algorithm 3.9 Note that the source code is out of context and is intended for explanation of the method

3.4.4 Curiosities

Additional algorithm 3.10 Note that the source code is out of context and is intended for explanation of the method

Additional algorithm 3.11 Note that the source code is in context and works with copy/paste

3.5 Traceback

Additional algorithm 3.12 Note that the source code is out of context and is intended for explanation of the method

3.6 Global Alignment

Additional algorithm 3.13 Note that the source code is in context and works with copy/paste

3.7 Local Alignment

Additional algorithm 3.14 Note that the source code is in context and works with copy/paste

3.8 Alignment Layout

Additional algorithm 3.15 Note that the source code is out of context and is intended for explanation of the method

3.9 Local Sequence Alignment – The Final Version

Additional algorithm 3.16 Note that the source code is in context and works with copy/paste

3.10 Complementarity

Additional algorithm 3.17 Note that the source code is out of context and is intended for explanation of the method

Additional algorithm 3.18 Note that the source code is in context and works with copy/paste

3.11 Conclusions

4 Forced Alignment (II)

4.1 Introduction

4.2 Global and Local Sequence Alignment

4.2.1 Short Notes

4.2.2 Understanding the Technology

4.2.3 Main Objectives

4.3 Experiments and Discussions

4.3.1 Alignment Layout

4.3.2 Forced Alignment Regime

4.3.3 Alignment Scores and Significance

4.3.4 Optimal Alignments

4.3.5 The Main Significance Scores

4.3.6 The Information Content

4.3.7 The Match Percentage

4.3.8 Significance vs. Chance

4.3.9 The Importance of Randomness

4.3.10 Sequence Quality and the Score Matrix

4.3.11 The Significance Threshold

4.3.12 Optimal Alignments by Numbers

4.3.13 Chaos Theory on Sequence Alignment

4.3.14 Image-Encoding Possibilities

4.4 Advanced Features and Methods

4.4.1 Sequence Detector

4.4.2 Parameters

4.4.3 Heatmap

Heatmap Area

Fractional Parts and Discretization

Top-Left Sequences

Heatmap Charts

Information Window

4.4.4 Text Visualization

4.4.5 Graphics for Manuscript Figures and Didactic Presentations

4.4.6 Dynamics

4.4.7 Independence

4.4.8 Limits

4.4.9 Local Storage

Web Storage API

Format and Limitations

Record Navigation

Local Storage Expansion Capabilities

Import and Export

Disk Operations

4.5 Conclusions

5 Self-Sequence Alignment (I)

5.1 Introduction

5.2 True Randomness

5.3 Information and Compression Algorithms

5.4 White Noise and Biological Sequences

5.5 The Mathematical Model

5.5.1 A Concrete Example

5.5.2 Model Dissection

5.5.3 Conditions for Maxima and Minima

5.6 Noise vs. Redundancy

5.7 Global and Local Information Content

5.8 Signal Sensitivity

5.9 Implementation

5.9.1 Global Self-Sequence Alignment

Additional algorithm 5.1 Note that the source code is in context and works with copy/paste

Additional algorithm 5.2 Note that the source code is in context and works with copy/paste

5.9.2 Local Self-Sequence Alignment

Additional algorithm 5.3 Note that the source code is in context and works with copy/paste

5.10 A Complete Scanner for Information Content

Additional algorithm 5.4 Note that the source code is in context and works with copy/paste

5.11 Conclusions

6 Frequencies and Percentages (II)

6.1 Introduction

6.2 Base Composition

6.3 Percentage of Nucleotide Combinations

6.4 Implementation

Additional algorithm 6.1 Note that the source code is in context and works with copy/paste

Additional algorithm 6.2 Note that the source code is in context and works with copy/paste

6.5 A Frequency Scanner

Additional algorithm 6.3 Note that the source code is in context and works with copy/paste

6.6 Examples of Known Significance

6.7 Observation vs. Expectation

6.8 A Frequency Scanner with a Threshold

Additional algorithm 6.4 Note that the source code is in context and works with copy/paste

6.9 Conclusions

7 Objective Digital Stains (III)

7.1 Introduction

7.2 Information and Frequency

Additional algorithm 7.1 Note that the source code is in context and works with copy/paste

7.3 The Objective Digital Stain

Additional algorithm 7.2 Note that the source code is in context and works with copy/paste

7.3.1 A 3D Representation Over a 2D Plane

Additional algorithm 7.3 Note that the source code is in context and works with copy/paste

7.3.2 ODSs Relative to the Background

Additional algorithm 7.4 Note that the source code is in context and works with copy/paste

7.4 Interpretation of ODSs

7.5 The Significance of the Areas in the ODS

7.6 Discussions

7.6.1 A Similarity Between Dissimilar Sequences

7.7 Conclusions

8 Detection of Motifs (I)

8.1 Introduction

8.2 DNA Motifs

8.2.1 DNA-binding Proteins vs. Motifs and Degeneracy

8.2.2 Concrete Examples of DNA Motifs

8.3 Major Functions of DNA Motifs

8.3.1 RNA Splicing and DNA Motifs

8.4 Conclusions

9 Representation of Motifs (II)

9.1 Introduction

9.2 The Training Data

9.3 A Visualization Function

Additional algorithm 9.1 Note that the source code is out of context and is intended for explanation of the method

9.4 The Alignment Matrix

Additional algorithm 9.2 Note that the source code is out of context and is intended for explanation of the method

Additional algorithm 9.3 Note that the source code is out of context and is intended for explanation of the method

9.5 Alphabet Detection

Additional algorithm 9.4 Note that the source code is out of context and is intended for explanation of the method

Additional algorithm 9.5 Note that the source code is out of context and is intended for explanation of the method

9.6 The Position-Specific Scoring Matrix (PSSM) Initialization

Additional algorithm 9.6 Note that the source code is out of context and is intended for explanation of the method

9.7 The Position Frequency Matrix (PFM)

Additional algorithm 9.7 Note that the source code is out of context and is intended for explanation of the method

9.8 The Position Probability Matrix (PPM)

Additional algorithm 9.8 Note that the source code is out of context and is intended for explanation of the method

9.8.1 A Kind of PPM Pseudo-Scanner

Additional algorithm 9.9 Note that the source code is out of context and is intended for explanation of the method

9.9 The Position Weight Matrix (PWM)

Additional algorithm 9.10 Note that the source code is out of context and is intended for explanation of the method

Additional algorithm 9.11 Note that the source code is out of context and is intended for explanation of the method

Additional algorithm 9.12 Note that the source code is out of context and is intended for explanation of the method

9.10 The Background Model

9.11 The Consensus Sequence

Additional algorithm 9.13 Note that the source code is out of context and is intended for explanation of the method

9.11.1 The Consensus – Not Necessarily Functional

9.12 Mutational Intolerance

9.13 From Motifs to PWMs

Additional algorithm 9.14 Note that the source code is in context and works with copy/paste

9.14 Pseudo-Counts and Negative Infinity

Additional algorithm 9.15 Note that the source code is in context and works with copy/paste

9.15 Conclusions

10 The Motif Scanner (III)

10.1 Introduction

10.2 Looking for Signals

Additional algorithm 10.1 Note that the source code is out of context and is intended for explanation of the method

10.3 A Functional Scanner

Additional algorithm 10.2 Note that the source code is in context and works with copy/paste

10.4 The Meaning of Scores

10.4.1 A Score Value Above Zero

10.4.2 A Score Value Below Zero

10.4.3 A Score Value of Zero

10.5 Conclusions

11 Understanding the Parameters (IV)

11.1 Introduction

11.2 Experimentation

11.2.1 A Scanner Implementation Based on Pseudo-Counts

Additional algorithm 11.1 Note that the source code is in context and works with copy/paste

11.2.2 A Scanner Implementation Based on Propagation of Zero Counts

Additional algorithm 11.2 Note that the source code is in context and works with copy/paste

11.3 Signal Discrimination

11.4 False-Positive Results

11.5 Sensitivity Adjustments

11.6 Beyond Bioinformatics

11.7 A Scanner That Uses a Known PWM

Additional algorithm 11.3 Note that the source code is out of context and is intended for explanation of the method

Additional algorithm 11.4 Note that the source code is out of context and is intended for explanation of the method

Additional algorithm 11.5 Note that the source code is in context and works with copy/paste

11.8 Signal Thresholds

Additional algorithm 11.6 Note that the source code is out of context and is intended for explanation of the method

Additional algorithm 11.7 Note that the source code is out of context and is intended for explanation of the method

11.8.1 Implementation and Filter Testing

Additional algorithm 11.8 Note that the source code is in context and works with copy/paste

11.9 Conclusions

12 Dynamic Backgrounds (V)

12.1 Introduction

12.2 Toward a Scanner with Two PFMs

12.2.1 The Implementation of Dynamic PWMs

Additional algorithm 12.1 Note that the source code is in context and works with copy/paste

12.2.2 Issues and Corrections for Dynamic PWMs

12.2.3 Solutions for Aberrant Positive Likelihood Values

Virtual Additions to the Background Set

Real Additions to the Background Set

Verification of the Two Methods

Additional algorithm 12.2 Note that the source code is out of context and is intended for explanation of the method

12.3 A Scanner with Two PFMs

Additional algorithm 12.3 Note that the source code is in context and works with copy/paste

12.4 Information and Background Frequencies on Score Values

12.5 Dynamic Background vs. Null Model

12.6 Conclusions

13 Markov Chains: The Machine (I)

13.1 Introduction

13.2 Transition Matrices

13.3 Discrete Probability Detector

13.3.1 Alphabet Detection

Additional algorithm 13.1 Note that the source code is out of context and is intended for explanation of the method

13.3.2 Matrix Initialization

Additional algorithm 13.2 Note that the source code is out of context and is intended for explanation of the method

13.3.3 Frequency Detection

Additional algorithm 13.3 Note that the source code is out of context and is intended for explanation of the method

13.3.4 Calculation of Transition Probabilities

Additional algorithm 13.4 Note that the source code is out of context and is intended for explanation of the method

Additional algorithm 13.5 Note that the source code is out of context and is intended for explanation of the method

Additional algorithm 13.6 Note that the source code is in context and works with copy/paste

Additional algorithm 13.7 Note that the source code is in context and works with copy/paste

13.3.5 Particularities in Calculating the Transition Probabilities

13.4 Markov Chains Generators

13.4.1 The Experiment

13.4.2 The Implementation

Additional algorithm 13.8 Note that the source code is in context and works with copy/paste

13.4.3 Simulation of Transition Probabilities

13.4.4 The Markov Machine

13.4.5 Result Verification

13.5 Conclusions

14 Markov Chains: Log Likelihood (II)

14.1 Introduction

14.2 The Log-Likelihood Matrix

14.2.1 A Log-Likelihood Matrix Based on the Null Model

Additional algorithm 14.1 Note that the source code is out of context and is intended for explanation of the method

Additional algorithm 14.2 Note that the source code is out of context and is intended for explanation of the method

14.2.2 A Log-Likelihood Matrix Based on Two Models

Additional algorithm 14.3 Note that the source code is out of context and is intended for explanation of the method

14.3 Interpretation and Use of the Log-Likelihood Matrix

Additional algorithm 14.4 Note that the source code is out of context and is intended for explanation of the method

14.4 Construction of a Markov Scanner

Additional algorithm 14.5 Note that the source code is in context and works with copy/paste

Additional algorithm 14.6 Note that the source code is out of context and is intended for explanation of the method

14.5 A Scanner That Uses a Known LLM

Additional algorithm 14.7 Note that the source code is in context and works with copy/paste

14.6 The Meaning of Scores

14.7 Beyond Bioinformatics

14.8 Conclusions

15 Spectral Forecast (I)

15.1 Introduction

15.2 The Spectral Forecast Model

15.3 The Spectral Forecast Equation

15.4 The Spectral Forecast Inner Workings

15.4.1 Each Part on a Single Matrix

15.4.2 Both Parts on a Single Matrix

15.4.3 Both Parts on Separate Matrices

15.4.4 Concrete Example 1

15.4.5 Concrete Example 2

15.4.6 Concrete Example 3

15.5 Implementations

15.5.1 Spectral Forecast for Signals

Additional algorithm 15.1 Note that the source code is in context and works with copy/paste

15.5.2 What Does the Value of d Mean?

Additional algorithm 15.2 Note that the source code is in context and works with copy/paste

15.5.3 Spectral Forecast for Matrices

Additional algorithm 15.3 Note that the source code is in context and works with copy/paste

15.6 The Spectral Forecast Model for Predictions

15.6.1 The Spectral Forecast Model for Signals

Additional algorithm 15.4 Note that the source code is in context and works with copy/paste

Additional algorithm 15.5 Note that the source code is in context and works with copy/paste

15.6.2 Experiments on the Similarity Index Values

15.6.3 The Spectral Forecast Model for Matrices

Additional algorithm 15.6 Note that the source code is in context and works with copy/paste

15.7 Conclusions

16 Entropy vs. Content (I)

16.1 Introduction

16.2 Information Entropy

16.3 Implementation

Additional algorithm 16.1 Note that the source code is in context and works with copy/paste

Additional algorithm 16.2 Note that the source code is in context and works with copy/paste

16.4 Information Content vs. Information Entropy

16.4.1 Implementation

Additional algorithm 16.3 Note that the source code is in context and works with copy/paste

Additional algorithm 16.4 Note that the source code is in context and works with copy/paste

16.4.2 Additional Considerations

16.5 Conclusions

17 Philosophical Transactions

17.1 Introduction

17.2 The Frame of Reference

17.2.1 The Fundamental Layer of Complexity

17.2.2 On the Complexity of Life

17.3 Random vs. Pseudo-random

Additional algorithm 17.1 Note that the source code is in context and works with copy/paste

Additional algorithm 17.2 Note that the source code is in context and works with copy/paste

17.4 Random Numbers and Noise

17.5 Determinism and Chaos

17.5.1 Chaos Without Noise

Additional algorithm 17.3 Note that the source code is in context and works with copy/paste

Additional algorithm 17.4 Note that the source code is in context and works with copy/paste

17.5.2 Chaos with Noise

Additional algorithm 17.5 Note that the source code is in context and works with copy/paste

17.5.3 Limits of Prediction

17.5.4 On the Wings of Chaos

17.6 Free Will and Determinism

17.6.1 The Greatest Disappointment

17.6.2 The Most Powerful Processor in Existence

17.6.3 Certainty vs. Interpretation

17.6.4 A Wisdom that Applies

17.7 Conclusions

Appendix A

A.1 Association of Numerical Values with Letters

Additional algorithm A.1 Note that the source code is in context and works with copy/paste

A.2 Sorting Values on Columns

Additional algorithm A.2 Note that the source code is in context and works with copy/paste

A.3 The Implementation of a Sequence Logo

Additional algorithm A.3 Note that the source code is in context and works with copy/paste

A.4 Sequence Logos Based on Maximum Values

Additional algorithm A.4 Note that the source code is in context and works with copy/paste

A.5 Using Logarithms to Build Sequence Logos

Additional algorithm A.5 Note that the source code is in context and works with copy/paste

A.6 From a Motif Set to a Sequence Logo

Additional algorithm A.6 Note that the source code is in context and works with copy/paste

References

Index

WILEY END USER LICENSE AGREEMENT

Excerpt from the book

Paul A. Gagniuc

University Politehnica of Bucharest

.....

Life shows two main approaches for organism formation. The first approach forced a cooperation between biochemical processes (virtual cells in one physical boundary). The second approach forced a cooperation for gradual specialization among individual cells of the same species (or even between species). Interestingly, the second approach shows a lower entropy than the first. However, cooperation between individual cells did not rule out further specialization between biochemical processes inside individual cells.

Competition and gravity preclude the emergence of unicellular organisms over a certain size. Moreover, gradient-based biochemical signaling and interactions would be inefficient over long distances inside large unicellular organisms. Multicellular organisms seem to have found a balance between the speed of response and the size of the cells. Small cells have a larger surface area relative to their volume, so each unit of volume can exchange gases and nutrients at a higher rate than in larger cells. The same principle explains why small salt granules dissolve faster in water than large ones. Cooperation toward cell specialization, leading to the formation of a circulatory system, ensured an optimal exchange with the outside environment and a fast response for the entire organism. In the case of very large unicellular organisms, the response time for any stimulus may be dictated by distances inside the cell and the metabolic rate. For instance, a biochemical interaction between two points in the cytoplasm of such an organism would require time and high amounts of messenger molecules to diffuse in a large volume until the target is stochastically encountered. In other words, “time contracts” for giant unicellular organisms. It is likely that giant single-celled organisms have existed in the distant past. However, competition with smaller unicellular organisms that respond faster may have eliminated them from the evolutionary chain.
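The size argument above can be made concrete with a rough numerical sketch. The example below is not taken from the book. It assumes perfectly spherical cells, uses the standard relation t = x²/(6D) for the characteristic time of three-dimensional diffusion over a distance x, and picks a purely hypothetical diffusion coefficient (ASSUMED_D) for a messenger molecule in the cytoplasm. The function names are illustrative only.

```python
import math


def sphere_surface_to_volume(radius_um: float) -> float:
    """Surface-area-to-volume ratio (1/um) of a spherical cell; equals 3/r."""
    area = 4.0 * math.pi * radius_um ** 2
    volume = (4.0 / 3.0) * math.pi * radius_um ** 3
    return area / volume


def diffusion_time_s(distance_um: float, d_um2_per_s: float) -> float:
    """Characteristic 3D diffusion time over a distance x: t = x^2 / (6 D)."""
    return distance_um ** 2 / (6.0 * d_um2_per_s)


# Hypothetical diffusion coefficient (um^2/s), chosen only for illustration.
ASSUMED_D = 10.0

if __name__ == "__main__":
    for radius in (1.0, 10.0, 100.0, 1000.0):
        ratio = sphere_surface_to_volume(radius)
        t = diffusion_time_s(radius, ASSUMED_D)
        print(f"radius {radius:8.1f} um  SA:V {ratio:8.4f} 1/um  "
              f"diffusion time across radius {t:12.4f} s")
```

Under these assumptions, a tenfold increase in radius cuts the surface available per unit of volume by a factor of ten and lengthens the diffusion time a hundredfold, which is the trade-off between cell size and speed of response described in the excerpt.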

.....
