2.4.2. Extending the DEC and BIB models
Over time, the DEC and BIB models have been expanded to include more complexity and greater realism. The original DEC model (Ree and Smith 2008) included dispersal, or range expansion, only as an anagenetic event, modeled as a time-dependent rate within the instantaneous rate matrix Q (Figure 2.5(b)). Matzke (2014) extended this model to include “cladogenetic dispersal” or “founder-event speciation”: a dispersal event that coincides with speciation, with one daughter lineage instantaneously “jumping” into a new area that was not part of the ancestral range, for example, from A to A and C in Figure 2.5(b). The DEC+J model captures this new cladogenetic scenario with a separate parameter j (Matzke 2014), which is not part of the CTMC process that governs range evolution along branches. Therefore, the j parameter is not equivalent to the jump dispersal rates p and q in the BIB model (Figure 2.5(a)), and, unlike the D_AB or E_A parameters in DEC, it does not depend on time. Ree and Sanmartín (2018) showed that, by decoupling “jump dispersal” from time, the DEC+J model can produce highly counterintuitive scenarios and degenerate likelihood inferences, in particular if founder speciation is assigned a higher likelihood (“weight”) relative to other cladogenetic scenarios such as allopatry or peripheral isolate speciation. Moreover, when j is estimated at its maximum value, its inclusion can lead to underestimation of the rates of the anagenetic, time-dependent parameters: range expansion and range contraction. As a result, the DEC+J model can generate reconstructions with rates of anagenetic dispersal and (especially) of extinction close to zero, and distribution patterns that are explained almost exclusively by cladogenetic events.
The end result is a diminished role for time (branch lengths) in biogeographic inference, even though the use of time is considered the key advance of parametric over parsimony-based approaches (Ree and Sanmartín 2018). Figure 2.7 shows an example of this potential bias. As pointed out by Ronquist and Sanmartín (2011) and Ree and Sanmartín (2018), the proper modeling of cladogenetic events in parametric range evolution requires the use of trait-dependent speciation-extinction models (Maddison et al. 2007), discussed in more detail below. A different solution is adopted by another DEC-derived model, BayArea (Landis et al. 2013). It uses a Bayesian data augmentation approach in which the parameters of the Q matrix are estimated by simulating outcomes (evolutionary histories of geographic ranges) along the branches of the phylogeny. This allows for a larger number of areas and geographic ranges in the model, including widespread states. Unlike DEC, BayArea does not model speciation scenarios: ranges are inherited identically by the two descendants, which also helps to simplify the model.
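The idea of simulating range histories along a branch can be illustrated with a small forward simulation. This is only a sketch under stated assumptions, not the BayArea algorithm: the function name `simulate_range_history`, the per-area `gain` and `loss` rates, and the rule that the last occupied area cannot be lost are all choices made for this example. BayArea itself samples histories conditional on the observed tip ranges within an MCMC, which is not reproduced here.

```python
import random

def simulate_range_history(start, gain, loss, t_end, rng):
    """Simulate one range history along a branch of length t_end.

    A range is a tuple of 0/1 flags, one per area. Each absent area is
    gained at rate `gain`; each occupied area is lost at rate `loss`.
    Assumption of this sketch: the last occupied area cannot be lost.
    Returns a list of (time, range) pairs, starting with the initial range.
    """
    state = list(start)
    t = 0.0
    history = [(t, tuple(state))]
    while True:
        # One event rate per area, depending on whether it is occupied.
        rates = [loss if occupied else gain for occupied in state]
        if sum(state) == 1:
            rates[state.index(1)] = 0.0  # forbid the empty range
        total = sum(rates)
        if total == 0.0:
            break
        # Exponential waiting time to the next event (Gillespie step).
        t += rng.expovariate(total)
        if t >= t_end:
            break
        # Pick the area that changes, with probability proportional to its rate.
        area = rng.choices(range(len(state)), weights=rates)[0]
        state[area] = 1 - state[area]
        history.append((t, tuple(state)))
    return history

# Hypothetical example: five areas, lineage starts in the first one.
rng = random.Random(1)
history = simulate_range_history((1, 0, 0, 0, 0), gain=0.3, loss=0.2,
                                 t_end=10.0, rng=rng)
```

Because each area is tracked as a separate presence/absence flag, the cost of one simulated event grows with the number of areas rather than with the number of possible ranges (2^n), which is the intuition behind handling many areas this way.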
Figure 2.7. Effect of decoupling biogeographic inference from time in DEC+J. Two phylogenies with identical topology and tip distributions, but with internal branches elongated or shortened by half. The software BioGeoBEARS was used to infer ancestral ranges and rates of dispersal (d) and extinction (e) under the DEC and DEC+J models; the latter includes a parameter, j, for cladogenetic jump dispersal (“founder speciation”). a) DEC (short branches): d=0.0542, e=0.0436, j=0; LnL=−7.75. b) DEC (long branches): d=0.0289, e=0.0375, j=0; LnL=−8.94. c) DEC+J (short branches): d=0, e=0, j=0.4265; LnL=−3.99. d) DEC+J (long branches): d=0, e=0, j=0.4265; LnL=−3.99. Notice that the DEC reconstruction for the most basal nodes changes with the branch lengths, but the DEC+J reconstruction does not. Only the range with the highest relative likelihood is shown; the maximum number of areas in widespread ranges was constrained to two. LnL: model log-likelihood. For a color version of this figure, see www.iste.co.uk/guilbert/biogeography.zip
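The role of branch lengths in this comparison can be sketched numerically. The example below assumes a toy two-area DEC model (states: empty, A, B, AB) with a single dispersal rate d and a single extinction rate e; the helper names `mat_mul`, `mat_exp` and `dec_q`, and the branch lengths 1 and 2, are illustrative assumptions and do not come from the analysis in Figure 2.7. It computes the transition probabilities P(t) = exp(Qt) along a branch. When d = e = 0, as in the DEC+J estimates, P(t) is the identity matrix for any t, so branch lengths drop out of the likelihood.

```python
def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_exp(q, t, squarings=10, terms=12):
    """Approximate exp(Q*t) by scaling and squaring a truncated Taylor series."""
    n = len(q)
    scale = t / 2 ** squarings
    m = [[q[i][j] * scale for j in range(n)] for i in range(n)]
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms + 1):
        # term becomes m^k / k!
        term = [[x / k for x in row] for row in mat_mul(term, m)]
        result = [[result[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    for _ in range(squarings):
        result = mat_mul(result, result)
    return result

def dec_q(d, e):
    """Anagenetic rate matrix for two areas; state order: empty, A, B, AB."""
    return [
        [0.0, 0.0, 0.0, 0.0],        # the empty range is absorbing
        [e, -(d + e), 0.0, d],       # A  -> empty (extinction), A -> AB (expansion)
        [e, 0.0, -(d + e), d],       # B  -> empty, B -> AB
        [0.0, e, e, -2.0 * e],       # AB -> A or B (contraction)
    ]

# Rates from panel (a) of the figure, applied to two arbitrary branch lengths.
q = dec_q(0.0542, 0.0436)
p_short = mat_exp(q, 1.0)[1][3]   # P(A -> AB) on the shorter branch
p_long = mat_exp(q, 2.0)[1][3]    # P(A -> AB) on the longer branch

# With d = e = 0 nothing can happen along a branch, whatever its length.
p_frozen = mat_exp(dec_q(0.0, 0.0), 5.0)
```

Under DEC, the probability of range expansion grows with branch length, so stretching or shrinking branches changes the reconstruction. In the d = e = 0 case, all range change must be attributed to events at nodes, which carry no information about time.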
Extensions of the BIB model have gone in the direction of introducing species-specific rates of geographic movement or implementing procedures for reducing the size of the Q matrix. The original BIB model was used to infer patterns of colonization in oceanic (Sanmartín et al. 2008) or continental (Sanmartín et al. 2010) islands. It implemented a hierarchical Bayesian approach in which relative dispersal rates between islands and island carrying capacities were estimated from phylogenetic and distribution data from multiple, co-distributed island lineages. Phylogenetic and biogeographic parameters were estimated simultaneously from species DNA sequence data and the associated geographic distributions, while allowing each species to have its own rates of molecular and biogeographic (dispersal) evolution. This hierarchical, species-partitioned approach allows researchers to infer general, broad-scale patterns of island colonization while accounting for (marginalizing over) organism-specific differences in rates of molecular evolution, age of origin or dispersal ability. The BIB model was subsequently implemented in an epidemiological context to study patterns of viral spread (Lemey et al. 2009). These authors also extended the BIB model to include a stepwise regression approach, Bayesian stochastic search variable selection, to identify those transition rates or dispersal pathways in the CTMC Q matrix that are better supported by the data (Lemey et al. 2009). This BIB extension has also been used to infer migration patterns at the population, within-species level (Mairal et al. 2015). Other extensions of BIB have made dispersal rates dependent on external factors or predictors (Faria et al. 2013), or have allowed the inferred dispersal pathways to differ across taxa (Cybis et al. 2013).
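How relative dispersal rates, carrying capacities and variable-selection indicators might combine in a Q matrix can be sketched as follows. The parameterization is an assumption made for illustration: a GTR-like form in which the rate from island i to island j is the symmetric relative dispersal rate r_ij times the carrying capacity of the destination, optionally multiplied by a 0/1 indicator that switches a pathway on or off. The function name `bib_rate_matrix` and all numerical values are hypothetical.

```python
def bib_rate_matrix(rel_rates, capacities, indicators=None):
    """Build an island-to-island rate matrix Q.

    Assumed form: q_ij = indicator_ij * r_ij * capacity_j for i != j,
    with each diagonal entry set so that the row sums to zero.
    """
    n = len(capacities)
    q = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            on = 1 if indicators is None else indicators[i][j]
            q[i][j] = on * rel_rates[i][j] * capacities[j]
        q[i][i] = -sum(q[i])  # diagonal is still 0.0 when the sum is taken
    return q

# Hypothetical three-island system.
rel_rates = [[0.0, 2.0, 0.5],
             [2.0, 0.0, 1.0],
             [0.5, 1.0, 0.0]]          # symmetric relative dispersal rates
capacities = [0.5, 0.3, 0.2]           # relative carrying capacities, sum to 1

q_full = bib_rate_matrix(rel_rates, capacities)

# Switch off the direct pathway between islands 0 and 2.
indicators = [[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]]
q_sparse = bib_rate_matrix(rel_rates, capacities, indicators)
```

With symmetric relative rates, the carrying capacities are the stationary distribution of the chain, which is why they can be read as the long-run share of lineages each island holds. Setting an indicator to zero removes one pathway from Q without changing that property, as long as the indicators are symmetric as well.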
Applications of BIB in epidemiology and phylogeography are probably among the most popular uses of the model at present. In these fields, BIB is termed discrete trait analysis (DTA) or the “mugration” model, because it equates migration with mutation events (De Maio et al. 2015). Though treating migration events as instantaneous mutations in a sequence might be acceptable at geological time scales and at the species level, as was done in the original BIB (Sanmartín et al. 2008), it can be more problematic under the coalescent process, the model used to build phylogenetic relationships at short time scales and at the population level (De Maio et al. 2015). Subsequent authors have extended the BIB-DTA model to account for geographically structured populations under the coalescent process (De Maio et al. 2015; Muller et al. 2017).
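What it means to treat location like a mutating character can be made concrete with a small likelihood calculation. The sketch below assumes the simplest possible case, k locations with one equal migration rate m between every pair and a uniform prior at the root, and applies the pruning algorithm used for sequence characters to tip locations. The tree encoding (a leaf is a location index, an internal node is a list of (child, branch length) pairs) and the function names are assumptions of this example, not the DTA implementation in any particular software.

```python
import math

def transition_prob(i, j, t, k, m):
    """P(location j after time t | location i) under equal rates m."""
    decay = math.exp(-k * m * t)
    if i == j:
        return 1.0 / k + (k - 1.0) / k * decay
    return 1.0 / k - decay / k

def conditional_likelihoods(node, k, m):
    """Likelihood of the tip locations below `node`, for each state at `node`."""
    if isinstance(node, int):
        # Leaf: the sampled location is observed without error.
        return [1.0 if s == node else 0.0 for s in range(k)]
    like = [1.0] * k
    for child, t in node:
        child_like = conditional_likelihoods(child, k, m)
        for s in range(k):
            like[s] *= sum(transition_prob(s, c, t, k, m) * child_like[c]
                           for c in range(k))
    return like

def tree_likelihood(tree, k, m):
    """Likelihood of the tip locations, with a uniform prior at the root."""
    return sum(conditional_likelihoods(tree, k, m)) / k

# Hypothetical tree: one tip sampled in location 0, two sister tips in location 1.
tree = [(0, 0.3), ([(1, 0.1), (1, 0.1)], 0.2)]
lik = tree_likelihood(tree, k=3, m=0.5)
```

Nothing in this calculation refers to population sizes or sampling effort per location: the tree is taken as given and locations simply change along its branches, which is the simplification that structured coalescent extensions revisit.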