Principles of Microbial Diversity - James W. Brown - Page 57

Converting a similarity matrix into an evolutionary distance matrix


The next step is to estimate evolutionary distances from sequence similarities. You might think that the distance would just be 1 − similarity (i.e., “difference”), and you would be right except that the number of differences you count between any two sequences misses some of the changes that have probably occurred between them. More than one evolutionary change at a single position (e.g., A to G to U, or A to G in one sequence and the same A to U in another) counts as only one difference between the two sequences, and in the case of reversion or convergence it counts as no difference at all (e.g., A to G to A, or A to G in one organism and the same A to G in another). As a result, the observed difference between two sequences underestimates the evolutionary distance that separates them.

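One common way to estimate evolutionary distances from similarity is the Jukes and Cantor method, which uses the following equation:

$$d = -\frac{3}{4}\,\ln\left[1 - \frac{4}{3}\,(1 - S)\right]$$

where d is the estimated evolutionary distance and S is the observed fractional similarity between the two sequences.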


As shown graphically in Fig. 4.1, similarity and distance track each other closely at first (e.g., 0.90 similarity ≈ 0.10 distance), but the curve levels off toward 0.25 similarity, where the evolutionary distance is infinite. This makes sense: for two sequences that are very similar, the probability of more than one change at a single site is low, so only a small correction is needed, whereas two sequences that have changed beyond all recognition (infinite evolutionary distance) are still approximately 25% similar, simply because there are only four bases and so approximately one position in four will match entirely by chance.
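The size of the correction can be checked numerically. The following is a minimal Python sketch, assuming the standard Jukes and Cantor correction for four equally frequent bases, distance = −3/4 ln[1 − 4/3(1 − similarity)]. The function name `jukes_cantor_distance` and the sample similarity values are illustrative assumptions, not taken from the text.

```python
import math


def jukes_cantor_distance(similarity: float) -> float:
    """Estimate evolutionary distance from observed fractional similarity.

    Assumes the standard Jukes and Cantor correction (four equally
    frequent bases); the name and interface are illustrative only.
    """
    if not 0.0 <= similarity <= 1.0:
        raise ValueError("similarity must be between 0 and 1")
    if similarity == 1.0:
        return 0.0  # identical sequences: nothing to correct
    # Uncorrected "difference": the fraction of sites that do not match.
    difference = 1.0 - similarity
    # The log argument falls to zero at 0.25 similarity, which is where
    # the estimated distance becomes infinite.
    log_argument = 1.0 - (4.0 / 3.0) * difference
    if log_argument <= 0.0:
        return math.inf
    return -0.75 * math.log(log_argument)


if __name__ == "__main__":
    # Compare the raw difference with the corrected distance.
    for s in (0.99, 0.90, 0.75, 0.50, 0.30):
        d = jukes_cantor_distance(s)
        print(f"similarity {s:.2f}  difference {1 - s:.2f}  distance {d:.3f}")
```

Under these assumptions, 0.90 similarity corresponds to a distance of about 0.107 rather than 0.10, while 0.50 similarity corresponds to about 0.82 rather than 0.50, so the correction grows as similarity falls toward 0.25.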


Figure 4.1 The Jukes and Cantor equation plotted as observed sequence similarity (from the similarity matrix) versus estimated evolutionary distance. doi:10.1128/9781555818517.ch4.f4.1

To convert a similarity matrix to a distance matrix, just convert each value in the similarity matrix to evolutionary distance using either the graph or the equation. In our example:
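In code, the conversion is an element-by-element transformation of the matrix. The sketch below assumes the standard Jukes and Cantor correction, distance = −3/4 ln[1 − 4/3(1 − similarity)]. The three sequence names and their similarity values are hypothetical placeholders chosen for illustration; they are not the values of the example in the text.

```python
import math


def jukes_cantor_distance(similarity):
    """Jukes and Cantor distance for one observed fractional similarity."""
    if similarity >= 1.0:
        return 0.0  # identical sequences
    log_argument = 1.0 - (4.0 / 3.0) * (1.0 - similarity)
    if log_argument <= 0.0:
        return math.inf  # at or below the 0.25 chance-similarity floor
    return -0.75 * math.log(log_argument)


def similarity_to_distance_matrix(similarity_matrix):
    """Apply the correction to every entry of a square similarity matrix."""
    return [
        [jukes_cantor_distance(s) for s in row]
        for row in similarity_matrix
    ]


# Hypothetical similarity matrix (NOT the example from the text).
names = ["seq_A", "seq_B", "seq_C"]
similarities = [
    [1.00, 0.90, 0.70],
    [0.90, 1.00, 0.75],
    [0.70, 0.75, 1.00],
]

distances = similarity_to_distance_matrix(similarities)

if __name__ == "__main__":
    for name, row in zip(names, distances):
        print(f"{name}  " + "  ".join(f"{d:.3f}" for d in row))
```

Because the same function is applied to every entry, a symmetric similarity matrix yields a symmetric distance matrix with zeros on the diagonal.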


