Performance of genotype calling using the naive genotype caller method and BCFtools multiallelic caller with beta distributed errors in the estimation of the quality scores, by read depth. Colors indicate different variances in the beta distribution (0 and , respectively). The line types indicate the Li GL model (1) and McKenna GL model (2) (for details, see the Supplementary Material, Section 1.4). The genotype calling error rates (y-axis) and call rates (x-axis) are defined in the main text. The average per-site read depth is indicated in the top left corner of each plot. The curves are obtained by varying the GQ threshold for genotype calling. The vertical line segments below 0 on the y-axis denote the minimum GQ threshold of 20, and the horizontal line segments after the final call rate on the x-axis denote the final error rate of each group. The data are from 20 replicates of 100 diploid individuals simulated using msprime, resulting in 328 230 variable sites per simulation replicate (for details, see the Supplementary Material, Section 2). The BCFtools multiallelic caller is used for each population separately, and the prior parameter is disabled.
This PDF is available to Subscribers Only
View Article Abstract & Purchase OptionsFor full access to this pdf, sign in to an existing account, or purchase an annual subscription.