Fig. 4.
Results for amino acid divergence-time analyses of the 21-locus empirical data set of Meredith et al. (2011; see Materials and Methods for details). NT refers to relative time estimates of node divergence times. (A) Time estimate scatter diagram for analysis using the full set of 21 genes with 60% sparseness. In this case, because true divergence times are unknown, truth is taken to be the full-coverage case with all data. The R2 coefficient of determination is 0.98. (B) Depiction of the mammalian clades used for the systematic sampling. Nineteen of the clades are marked by black diamonds, and the 20th is taken to be all of the remaining taxa. Each gene contains sequences for exactly two orders, assigned so that for each gene g0, there exists exactly one gene g1 and another gene g2 such that g0 shares exactly one clade with g1 and the other clade with g2. In this case, there is strictly limited species overlap among genes. We then added one “backbone” gene with a sequence for each taxon. The phylogenetic tree in the figure is based on the one that appears in Meredith et al. (2011). (C) Time estimate scatter diagram for analysis under the systematic sampling method described in the text, but without inclusion of the universal “backbone” gene. Again, truth is taken to be the full-coverage case with all data. (D) Time estimate scatter diagram for analysis under the systematic sampling method described in the text but in this case with the universal backbone gene. Resultant sparseness (percentage gaps in the data matrix) is approximately 90%. As before, truth is taken to be the full-coverage case with all data. We see that the R2 coefficient (0.9) is much higher than in the case without backbone (0.7). (E) Effect of evolutionary rate of backbone gene on mean divergence time estimation error. Our mammalian data set contained 21 genes, each of which was used in turn as a backbone gene (black markers in the graph). The x axis represents the total number of substitutions observed for that gene and the y axis represents the total error of the resulting time tree. Gray markers represent 20-gene data sets without the backbone gene for each case (x value). Regression lines are fit to both sets of results and have R2 values of 0.1 and 0.4 for the no-backbone and backbone case, respectively. We see that the use of the backbone gene is effective in reducing error and that faster-evolving backbone genes tend to be more effective than slowly evolving ones. (F) Distribution of signed error in the 60% sparse case when compared with the systematically sampled case with backbone. As we might expect from the fact that the systematic case is effectively 90% sparse, we see more spread to higher error in that case.

Results for amino acid divergence-time analyses of the 21-locus empirical data set of Meredith et al. (2011; see Materials and Methods for details). NT refers to relative time estimates of node divergence times. (A) Time estimate scatter diagram for analysis using the full set of 21 genes with 60% sparseness. In this case, because true divergence times are unknown, truth is taken to be the full-coverage case with all data. The R2 coefficient of determination is 0.98. (B) Depiction of the mammalian clades used for the systematic sampling. Nineteen of the clades are marked by black diamonds, and the 20th is taken to be all of the remaining taxa. Each gene contains sequences for exactly two orders, assigned so that for each gene g0, there exists exactly one gene g1 and another gene g2 such that g0 shares exactly one clade with g1 and the other clade with g2. In this case, there is strictly limited species overlap among genes. We then added one “backbone” gene with a sequence for each taxon. The phylogenetic tree in the figure is based on the one that appears in Meredith et al. (2011). (C) Time estimate scatter diagram for analysis under the systematic sampling method described in the text, but without inclusion of the universal “backbone” gene. Again, truth is taken to be the full-coverage case with all data. (D) Time estimate scatter diagram for analysis under the systematic sampling method described in the text but in this case with the universal backbone gene. Resultant sparseness (percentage gaps in the data matrix) is approximately 90%. As before, truth is taken to be the full-coverage case with all data. We see that the R2 coefficient (0.9) is much higher than in the case without backbone (0.7). (E) Effect of evolutionary rate of backbone gene on mean divergence time estimation error. Our mammalian data set contained 21 genes, each of which was used in turn as a backbone gene (black markers in the graph). The x axis represents the total number of substitutions observed for that gene and the y axis represents the total error of the resulting time tree. Gray markers represent 20-gene data sets without the backbone gene for each case (x value). Regression lines are fit to both sets of results and have R2 values of 0.1 and 0.4 for the no-backbone and backbone case, respectively. We see that the use of the backbone gene is effective in reducing error and that faster-evolving backbone genes tend to be more effective than slowly evolving ones. (F) Distribution of signed error in the 60% sparse case when compared with the systematically sampled case with backbone. As we might expect from the fact that the systematic case is effectively 90% sparse, we see more spread to higher error in that case.

Close
This Feature Is Available To Subscribers Only

Sign In or Create an Account

Close

This PDF is available to Subscribers Only

View Article Abstract & Purchase Options

For full access to this pdf, sign in to an existing account, or purchase an annual subscription.

Close