Brian Charlesworth, Deborah Charlesworth, Neutral Variation in the Context of Selection, Molecular Biology and Evolution, Volume 35, Issue 6, June 2018, Pages 1359–1361, https://doi.org/10.1093/molbev/msy062
Abstract
In its initial formulation by Motoo Kimura, the neutral theory was concerned solely with the level of variability maintained by random genetic drift of selectively neutral mutations, and the rate of molecular evolution caused by the fixation of such mutations. The original theory considered events at a single genetic locus in isolation from the rest of the genome. It did not take long, however, for theoreticians to wonder whether selection at one or more loci might influence neutral variability at linked sites. Once DNA sequence variability could be studied, and especially when resequencing of whole genomes became possible, it became clear that patterns of neutral variability in genomes are affected by selection at linked sites. These patterns can advance our understanding of natural selection and can be used to detect the action of selection in genomic regions, including selection much weaker than could be detected by direct measurements of the relative fitnesses of different genotypes. We outline the different types of processes that have been studied, in approximate order of their historical development.
Introduction
The first process to be studied in which selection on sites in genomes affects neutral variants was associative overdominance (AOD), first analyzed in a paper in the same year as Kimura's formulation of the neutral theory (Sved 1968), and later developed by Ohta and Kimura (1970). In a finite population, genetic drift causes linkage disequilibrium (LD) between a neutral locus and one or more loci under selection, even in randomly mating populations. These early studies showed that, if the selected loci experience heterozygote advantage, or segregate for partially recessive deleterious alleles that are maintained by mutation pressure, homozygotes at the neutral locus also appear to have lower fitnesses than heterozygotes. The effects on “apparent fitnesses” can be large if the neutral and selected loci are closely linked. In a partially inbreeding population, such as a plant population that reproduces by a mixture of self-fertilization and outcrossing, another form of AOD arises, because inbreeding produces between-individual correlations in homozygosity among loci. This identity disequilibrium (ID) occurs even between unlinked loci. The effects can be strong enough to be detected empirically, and were indeed observed in plants and molluscs, and even in mammals, in the pre-DNA-sequencing era, using genotypes of allozyme and microsatellite markers (Strauss 1986; Leberg et al. 1990; Bierne et al. 2000).
For nearly 50 years, it was thought that the apparent heterozygote advantage generated by AOD would delay loss of variability at the neutral site, compared with the rate predicted by standard neutral theory. However, AOD induced by ID does not change allele frequencies at neutral loci (Charlesworth 1991), and the same has recently been shown for LD-induced AOD (Zhao and Charlesworth 2016); indeed, this is implicit in the equations in the original papers. At first sight, this suggests that AOD should not retard the loss of neutral variability. However, a re-analysis of LD-induced AOD has shown that this is incorrect (Zhao and Charlesworth 2016). Heterozygote advantage retards loss of variability under fairly unrestrictive conditions; with partially recessive deleterious mutations, retardation can occur if the product of the selection coefficient s against homozygotes and the effective population size, Ne, is of the order of 0.5 or less. Situations also exist in which loss of variability is actually accelerated (see the discussion of background selection [BGS]). Unexpectedly, therefore, the pioneering theoretical work on AOD has taken on a new lease of life.
The second type of effect of selection on variability at linked neutral sites, hitchhiking by positively selected mutations (selective sweeps), has attracted greater interest than AOD. The basic theory was developed by John Maynard Smith and John Haigh (1974), stimulated by Lewontin’s (1974) observation that levels of variability at allozyme loci are only weakly related to population size. They proposed that the spread of a selectively favorable allele would reduce variability at a linked neutral site, if the mutation to the favored allele had originated on a single haplotype. In the extreme case of no recombination, the spread to fixation of such a mutation will “sweep” to fixation all variants present on the haplotype in question. With recombination, they showed that the reduction of variability at a neutral site declines rapidly as r/sa increases, where r is the frequency of recombination with the selected locus, and sa is the selection coefficient for the advantageous allele. They suggested that the overall variability in a species would depend more on the frequency of selective sweeps than on the rate of genetic drift determined by its effective population size, resolving Lewontin’s paradox. This proposal was later elaborated by Gillespie (2001) in his theory of “genetic draft,” but the basic idea remains the same.
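The dependence of a sweep's effect on r/sa can be illustrated with a small deterministic calculation. The sketch below is not the derivation in Maynard Smith and Haigh (1974); it assumes a haploid two-locus model in which the favored allele starts at frequency 1/(2N) on a single haplotype, selection acts before recombination in each generation, and drift is ignored. The function name and parameters are hypothetical. Under these assumptions, if a is the final fraction of chromosomes that still carry the neutral allele of the founding haplotype, heterozygosity at the neutral site relative to its pre-sweep value works out to 1 − a².

```python
def sweep_diversity_ratio(s, r, pop_size):
    """Heterozygosity at a linked neutral site after a sweep, relative to before.

    Deterministic haploid sketch (an illustrative assumption, not the
    published model): s is the advantage of the favored allele, r the
    recombination frequency with the neutral site, pop_size the number
    of diploid individuals (so 2 * pop_size chromosomes).
    """
    p0 = 1.0 / (2 * pop_size)      # favored allele starts as a single copy
    p = p0
    # Frequency of the founding haplotype's neutral allele on each background.
    q_sel, q_wild = 1.0, 0.0
    while p < 1.0 - p0:
        # Selection changes the favored-allele frequency only.
        p = p * (1 + s) / (1 + s * p)
        # Recombination exchanges neutral alleles between backgrounds.
        q_sel, q_wild = (
            q_sel + r * (1 - p) * (q_wild - q_sel),
            q_wild + r * p * (q_sel - q_wild),
        )
    # q_sel is now the fraction 'a' that never recombined away.
    return 1.0 - q_sel ** 2


if __name__ == "__main__":
    for r in (0.0, 1e-5, 1e-4, 1e-3):
        ratio = sweep_diversity_ratio(s=0.01, r=r, pop_size=10_000)
        print(f"r/s = {r / 0.01:g}: relative diversity = {ratio:.3f}")
```

With no recombination the ratio is zero (a complete sweep), and it rises toward one as r/sa increases, which is the qualitative pattern described above.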
Despite empirical evidence for the hitchhiking of restriction site variants associated with the spread of the human hemoglobin S mutation (Kan and Dozy 1978), relatively little attention was paid to selective sweeps until Drosophila population geneticists, led by Chuck Langley, noticed that levels of sequence variability (mostly detected using restriction site variants) tended to be unusually low in genome regions that recombine infrequently, notably telomere and centromere regions (Aguadé et al. 1989). This stimulated the development of more elaborate models of selective sweeps (Kaplan et al. 1989), especially after Begun and Aquadro (1992) demonstrated a significant correlation between the local recombination rate in Drosophila melanogaster and the level of restriction site variability in genes. Their finding that no such correlation is seen between recombination rate and divergence between D. melanogaster and its close relative Drosophila simulans ruled out potential confounding factors, such as a direct influence of recombination on the mutation rate, strongly indicating that variability at neutral or nearly neutral sites is influenced by selection at linked sites. Similar patterns have now been detected in many other taxa, including humans, and in further Drosophila population studies (Cutter and Payseur 2013).
The literature on the theory of selective sweeps, and on its application to statistical tools for detecting sweeps from patterns of variability at putatively neutral sites, is now very large, and we cannot do it justice in this brief overview. We simply note the development of theory for recurrent selective sweeps by Wiehe and Stephan (1993), and improved theory concerning the effects of a single sweep (Barton 1998).
Several methods to detect selective sweeps from their effects on patterns of variability, including departures of the frequency distribution of variants at segregating sites from neutral expectation and effects on LD, have been developed and successfully applied (Nielsen et al. 2005). In addition, much interest has been aroused by the possibility that there may be “soft sweeps” that involve favorable mutations present initially as more than one copy, rather than single new mutations (Hermisson and Pennings 2005). Similarly, it is possible to detect introgression of a selectively favorable allele from another species or population using patterns of variability at linked loci (Bradburd et al. 2016).
Selective sweeps are not the only possible mode of hitchhiking. It also occurs when deleterious mutations are eliminated from a population, taking linked variants with them. This process is commonly known as BGS. It was described verbally by R.A. Fisher in connection with the fate of a beneficial mutation arising on a background of deleterious mutations (Fisher 1930). The first model of this process (Birky and Walsh 1988) did not consider the effect of deleterious mutations on neutral variability at linked sites. A formal treatment of this aspect of BGS (Charlesworth et al. 1993) was quickly followed by more sophisticated theoretical work (Hudson and Kaplan 1995; Nordborg et al. 1996). BGS provides an additional possible explanation for the low neutral variability in genome regions with low local recombination rates mentioned above. With Nes ≫ 1, so that the deleterious alleles are held close to their equilibrium frequencies under mutation–selection balance, neutral variability at linked sites is reduced below the value in the absence of selection. However, as mentioned above, partially recessive mutations in a diploid organism with a sufficiently small Nes value can retard loss of neutral variability. The same basic equation for the effect of allele frequency change at a site under directional selection on the allele frequency at a linked neutral site underlies all three of the processes discussed so far (Zhao and Charlesworth 2016). They can all be regarded as different versions of hitchhiking.
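The size of the BGS effect when Nes ≫ 1 can be illustrated numerically. The sketch below assumes the approximation usually attributed to Hudson and Kaplan (1995) for a neutral site at the center of a region of map length R, in which diversity relative to the neutral expectation is exp(−U/(2sh + R)); with R = 0 this reduces to the no-recombination result usually attributed to Charlesworth et al. (1993). Neither formula is quoted in this article, so both are stated here as assumptions, and the function and parameter names are hypothetical.

```python
import math


def bgs_diversity_ratio(U, sh, R=0.0):
    """Neutral diversity under BGS relative to the value without selection.

    Assumed approximation: exp(-U / (2*sh + R)), valid only when
    deleterious alleles stay near mutation-selection balance (Ne*s >> 1).
    U  : deleterious mutation rate per diploid genome region per generation
    sh : selection coefficient against heterozygous carriers
    R  : map length of the region in Morgans (0 for no recombination)
    """
    return math.exp(-U / (2.0 * sh + R))


if __name__ == "__main__":
    # Same mutational input, with and without recombination in the region.
    print(bgs_diversity_ratio(U=0.1, sh=0.02))          # no recombination
    print(bgs_diversity_ratio(U=0.1, sh=0.02, R=0.1))   # recombining region
```

Under this approximation, the same mutational input depresses diversity far more in a nonrecombining region than in a recombining one, in line with the low variability of regions with low local recombination rates.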
When sites subject to deleterious mutation are closely linked, the efficacy of selection on each site is reduced (McVean and Charlesworth 2000), through Hill–Robertson interference (Hill and Robertson 1966). This interference weakens the effects of BGS on neutral variability in the region, explaining why the observed variability in nonrecombining portions of the genome is higher than is predicted by the standard BGS model (Kaiser and Charlesworth 2009). In this interference selection limit, predictions of the effects of selection on linked variability require more complex models (Good et al. 2014).
If variants are maintained by long-term balancing selection within a single population, the equilibrium variability at linked sites can be higher than expected in the absence of selection (Hudson 1990). This situation is similar to a subdivided population: the alternative alleles at the selected site correspond to different subpopulations, and recombination between the neutral and selected sites acts like migration. Given enough time, sites that are closely linked to a target of selection will accumulate sequence differences associated with the alternative alleles at the selected site; the population as a whole will show a peak in variability at neutral sites around the target of selection, which declines with the recombination distance from the selected site. In the region, neutral variants will have higher frequencies than expected in the absence of the selected site, which can allow such balancing selection to be detected (e.g., DeGiorgio et al. 2014).
A similar pattern of enhanced variability can be produced by alleles maintained for a long time by differences in local selective pressures, because the selective elimination of the “wrong” type of allele reduces the effective migration rate of neutral variants closely linked to the target of selection (Petry 1983; Charlesworth et al. 1997). In this case, however, the size of the region affected is related to the ratio of the recombination rate to the selection coefficient against the locally nonadaptive allele, and will often be much larger than expected with balancing selection. Genome scans can potentially detect signatures of enhanced population subdivision at neutral markers, but it is difficult to distinguish reduced variability within local populations caused by local selective sweeps from between-population differences in the direction of selection (Cruickshank and Hahn 2014).
Overall, levels of neutral variability in genomes cannot be understood without taking the effects of selection at linked sites into account. In organisms with compact genomes, such as Drosophila, the combined actions of selective sweeps and BGS have probably more than halved the average nucleotide diversity at nearly neutral sites, relative to the value expected in the absence of selection (Elyashiv et al. 2016). The low diversity in highly inbreeding species and in asexual lineages (Cutter and Payseur 2013) must also reflect, at least in part, such effects. In addition to the positive correlations observed between variability and the local recombination rate, selection at linked sites can account for other genomic patterns, such as an increase in nucleotide diversity with increasing distance from coding sequences in mammals (e.g., Halligan et al. 2013). Much work remains to be done to distinguish between the effects of selective sweeps and BGS in causing these patterns, as well as between hard and soft sweeps. In addition, the extent to which variability is affected by loci under balancing selection remains to be determined.