-
PDF
- Split View
-
Views
-
Cite
Cite
Bridgett M vonHoldt, Alexandra L DeCandia, Kira A Cassidy, Erin E Stahler, Janet S Sinsheimer, Douglas W Smith, Daniel R Stahler, Patterns of reproduction and autozygosity distinguish the breeding from nonbreeding gray wolves of Yellowstone National Park, Journal of Heredity, Volume 115, Issue 4, July 2024, Pages 327–338, https://doi.org/10.1093/jhered/esad062
- Share Icon Share
Abstract
For species of management concern, accurate estimates of inbreeding and associated consequences on reproduction are crucial for predicting their future viability. However, few studies have partitioned this aspect of genetic viability with respect to reproduction in a group-living social mammal. We investigated the contributions of foundation stock lineages, putative fitness consequences of inbreeding, and genetic diversity of the breeding versus nonreproductive segment of the Yellowstone National Park gray wolf population. Our dataset spans 25 years and seven generations since reintroduction, encompassing 152 nuclear families and 329 litters. We found more than 87% of the pedigree foundation genomes persisted and report influxes of allelic diversity from two translocated wolves from a divergent source in Montana. As expected for group-living species, mean kinship significantly increased over time but with minimal loss of observed heterozygosity. Strikingly, the reproductive portion of the population carried a significantly lower genome-wide inbreeding coefficients, autozygosity, and more rapid decay for linkage disequilibrium relative to the nonbreeding population. Breeding wolves had significantly longer lifespans and lower inbreeding coefficients than nonbreeding wolves. Our model revealed that the number of litters was negatively significantly associated with heterozygosity (R = −0.11). Our findings highlight genetic contributions to fitness, and the importance of the reproductively active individuals in a population to counteract loss of genetic variation in a wild, free-ranging social carnivore. It is crucial for managers to mitigate factors that significantly reduce effective population size and genetic connectivity, which supports the dispersion of genetic variation that aids in rapid evolutionary responses to environmental challenges.

Introduction
Territorial, cooperatively breeding species can exhibit a diversity of breeding strategies (e.g. monogamy, polygamy, and polyandry) and reproductive skew (Keller and Reeve 1994; Clutton-Brock 2016). Yet regardless of these patterns, access to reproduction is restricted and effective population size (Ne) is reduced (Jennions and Macdonald 1994; Frankham 1995; Komdeur and Deerenberg 1997). Estimates of Ne provide critical information during the application and interpretation of evolutionary and population genetic theory (Wright 1931, 1969; Crow and Kimura 1970; Lanfear et al. 2014), as well as for wildlife management programs (Rowe and Beebee 2004). Species that have experienced either a natural population decline or bottleneck are expected to have a dramatically reduced census and effective size, and an increased probability of inbreeding and genetic identity by descent or autozygosity (Hedrick and Kalinowski 2000; Charlesworth and Willis 2009). Consequently, inbreeding can dramatically impact Ne estimates through increased genetic correlations and decreased frequency of heterozygotes (Ellegren 1999; Wang et al. 2016). Viability is then threatened by inbreeding depression, which is the negative fitness consequence expected to manifest in inbred offspring (Lynch and Walsh 1998; Charlesworth and Willis 2009; Hedrick and Garcia-Dorado 2016). Inbreeding depression is often measured in fitness-related traits of life history (e.g. fecundity, survival, and morphological measurements), often with captive populations providing a small yet detailed population for quantifying complex fitness traits (Hedrick and Kalinowski 2000). Measuring autozygosity further provides an assessment of the full effect of deleterious alleles, as individuals who carry greater levels of autozygosity are expected to exhibit decreased fitness (Charlesworth and Willis 2009).
Although there has been much attention paid to inbreeding avoidance and its potential impact on reproductive traits in captive breeding programs from pedigree estimates (Ralls et al. 1988; Laikre and Ryman 1991; Laikre et al. 1993; Kalinowski et al. 2000; Kalinowski and Hedrick 2001; Jiménez-Mena et al. 2016), there are rare opportunities for detailed studies about the genetic and fitness consequences of inbreeding in wild populations when pedigree relationships are often less certain (Lacy 1997; Keller and Waller 2002; Curik et al. 2017). Furthermore, pedigrees that are unbalanced, incorrect, or incomplete can lead to errors in estimating expected inbreeding coefficients and reduction in reproductive fitness (Curik et al. 2017). One critical aspect of pedigree-based errors is the assumption that pedigree founders are unrelated. Genome data now provide an opportunity to directly and accurately measure autozygosity, which quantifies the fraction of the genome that is contained within long stretches of homozygosity delineated along chromosomes (FROH; McQuillan et al. 2008). Keller et al. (2011) also showed that FROH outperformed inbreeding estimates derived from pedigree data.
We studied the dynamic landscape of inbreeding and life-history traits in a pedigreed population of gray wolves (Canis lupus). As a cooperatively breeding species with a single annual estrous, wolf populations are structured with limited access to reproduction that is typically reinforced by a kin-structured social rank-related stress reflected in glucocorticoid levels (Sands and Creel 2004; vonHoldt et al. 2008). Although monogamy is considered common among canids, gray wolves in exhibit a diversity of reproductive strategies and inbreeding avoidance (Smith et al. 1997; Jedrzejewski et al. 2005; vonHoldt et al. 2008; Ausband 2019). Several studies of small, isolated, bottlenecked, or captive gray wolf populations across their Holarctic range have found genetic evidence of high inbreeding levels with a corresponding reduction in fitness and related population genetic health metrics. Two primary examples of such isolated island populations of wolves are those inhabiting Isle Royale in Lake Superior and Scandinavia (Liberg et al. 2005; Räikkönen et al. 2013; Åkesson et al. 2016; Robinson et al. 2019).
Specific to the wolves inhabiting the contiguous United States, effective government wildlife control programs eradicated nearly all gray wolves by the early 1930s (except in Minnesota; Fritts et al. 1997). Over five decades later, substantial support developed to establish a reintroduction plan for gray wolf restoration to the Rocky Mountains (Fritts et al. 1997). During the winters of 1995 and 1996, the US Fish and Wildlife Service released 35 wolves in central Idaho, and in a joint effort with the National Park Service, released 31 wolves in Yellowstone National Park (YNP). Additionally, there was the translocation of 10 wolf pups from the Sawtooth pack in northwestern Montana to YNP where they were released in 1997. Foundation stock (a.k.a. YNP founders) were captured from two source locations in western Canada, and vonHoldt et al. (2008) utilized 26 microsatellites to provide valuable genetic insights with respect to their relatedness and population genetic health during the first decade of recovery. vonHoldt et al. (2008) found YNP founder breeding pairs to be unrelated with near absolute inbreeding avoidance despite living in groups of related individuals. Yet, given the resolution limits of microsatellite data for investigating genomic architecture, little is known about the nature of genetic variation or chromosomal regions that are identical by descent (IBD).
Building upon the recent release of their updated pedigree constructed using genome-wide single nucleotide polymorphism (SNP) data (vonHoldt et al. 2020), we explore here the genetics of YNP founders’ lineages, demography, genomic contributions, inbreeding, and consequences thereof in YNP gray wolves during their 25 years post-reintroduction. As this population has exhibited inbreeding avoidance with long-term data recorded on mating, reproduction, and genetic variation patterns (vonHoldt et al. 2008, 2020), we expect to find lower values of inbreeding and autozygosity in breeding wolves relative to nonbreeding wolves. We separately assessed pedigree- and marker-based aspects of the wolf population with a unique opportunity to integrate life-history fitness correlates with respect to reproduction for a social and recovered canid population. With the pedigree reconstructed utilizing genomic data, these two perspectives have provided new details regarding the genetic viability, reproduction, and recovery status of this social carnivore. We highlight the differences in estimates from pedigree-based methods, which provide an expectation, compared with those of genomic-based estimators which provide an estimate of the realized metrics and perceived to be more accurate.
Materials and methods
YNP’s reintroduced wolf population
After reintroduction, the YNP population rapidly expanded, reaching a high of 174 wolves in 2003. However, the population stabilized over the last decade (Smith et al. 2020), averaging 97.9 (±3.9 SE) at the year-end official census count. These gray wolves are annually monitored, with helicopters used in winter to tranquilize wolves for physical assessments, radio-collaring, morphometric data collection (age, sex, and breeding status), and collection of whole blood in EDTA vacutainers following protocols approved by the National Park Service (IACUC #IMR_YELL_Smith_wolves_2012). Furthermore, YNP collects tissue specimens from carcasses to ensure a high representation of individuals in the curated collections. For this study, the original 31 wolves translocated to YNP and the parents of the 10 translocated orphaned pups were designated as YNP founders. This is not to be confused with the animals at the foundation of the pedigree, known as the pedigree founders, which included two wolves sampled in Canada who were not translocated but known to be related to YNP wolves. The original and updated genealogy for this population confirmed close kinships (e.g. parent/offspring of full siblings) and genetic similarities among a subset of the YNP founders (vonHoldt et al. 2008, 2020). These kinship ties were expected given the nature of the wild capture methods and release. Several individuals were captured together, likely representing a pack or a subset thereof. These relationships are reflected by increased allele sharing in the genetic-based analysis below, while the pedigree-based analysis incorporated these known relationships. These relationships are critical to account for as most treatments of founders assume they are unrelated (i.e. their kinship fxy and inbreeding coefficients F are set to 0; Jiménez-Mena et al. 2016).
Rationale for pedigree- and population-based analyses
We used two approaches to explore inbreeding in the YNP wolf population as each is based on different estimators that may give different results. We first leverage the unique opportunity of having a pedigree for the YNP wolf population that was constructed from genomic data. Such pedigree reconstruction using SNP data requires heavy and careful filtering that results in approximately a few hundred loci that meet several expectations of linkage disequilibrium (LD), polymorphism, diversity, and neutrality. Many of the pedigree-based metrics are considered less accurate than molecular-based estimates; however, given the pedigree constructed from genomic data, we focused on features estimated from the pedigree that is typical of pedigree-based studies (see below). In parallel, we utilized a population-level approach that does not require a prior known pedigree relationships where we were able to include animals that lack parentage information but have a genomic sample. Although pedigree relationships are valuable to include, we utilized a marker-based genomics approach to explore the impact of pedigree structure on genomic variation and life-history-related traits (Kardos et al. 2015). A subset of wolves in the pedigree were excluded from the genetic analyses due to low sequence coverage or lack of a genetic sample (i.e. only observation data supported their pedigree information). As such, the pedigree contains more individuals than analyzed with SNP data. Furthermore, we included observational data regarding parentage and reproduction. Such an exception was made when reproduction was observed but a genetic sample was unavailable.
Pedigree-based analysis
We analyzed 474 YNP gray wolves with pedigree data, of which 391 also had genome-wide SNP data derived from previously published restriction-site associated DNA sequencing (RADseq) data obtained from blood and tissue samples collected between 1995 and 2020 (vonHoldt et al. 2020). We updated the full YNP wolf pedigree by adding three new individuals (Supplementary Table S1) and used the same genomic preparation and analytical methods described by vonHoldt et al. (2020). Briefly, we used high molecular weight genomic DNA in a modified protocol for RADseq capture protocol (Ali et al. 2016) where DNA was digested with SbfI. We ligated unique 8-bp barcoded biotinylated adapters to the resulting fragments for recovery after pooling samples into a single library for paired-end (2 × 150 nt) Illumina NovaSeq 6000 sequencing at Princeton University’s Lewis-Sigler Genomics Institute core facility. We randomly sheared each library to 400 bp in a Covaris LE220 followed by an enrich for the adapter-ligated fragments using a Dynabeads M-280 streptavidin binding assay. We used the NEBnext Ultra II DNA Library Prep Kit (New England Biolabs) for library preparation. We used the STACKS v2.6 (Catchen et al. 2013; Rochette et al. 2019) process_radtags module to rescue our barcoded reads with a 2 bp mismatch and excluded reads with a quality score <10, followed by the clone_filter to remove PCR duplicates and then mapped to the reference dog genome CanFam3.1 assembly (Lindblad-Toh et al. 2005) using bwa-mem (Li 2013a,b). We also included the Y chromosome (KP081776.1; Li et al. 2013a,b). We removed low-quality alignments (MAPQ <20). We implemented the gstacks and populations modules, and increased the minimum significance threshold using the marukilow genotype model (--vt-alpha and --gt-alpha with P = 0.01). We used default values in STACKS for retaining read stacks that had a minimum stack depth of 3 reads. We removed loci with singleton and private doubleton alleles, loci with more than 90% missing data across all samples and excluded individuals with more than 30% missing data. We removed loci with a minor allele frequency (MAF <0.03) and required at least an 80% genotyping rate per locus (--geno 0.2) in PLINK v1.90b3i (Chang et al. 2015).
We constructed a parentage-informative dataset of 736 SNPs through strict filtering with parameter settings recommended for pedigree reconstruction (Huisman 2017): Hardy-Weinberg equilibrium (HWE) (--hwe 0.001) to identify SNPs that meet neutrality expectations, MAF (--maf 0.45), and statistical LD (--indep-pairwise 50 5 0.2) to identify SNPs that are uncorrelated and unlinked. We calculated pairwise relatedness coefficients (r) using the coancestry function within the R v3.6.0 (2019) package related and implemented the dyadic likelihood estimator (dyadml = 1; Milligan 2003) with allowance for inbreeding (allow.inbreeding = TRUE) (Pew et al. 2015). We then used these estimates alongside observation data and likely parent-offspring (P-O) pairs assigned by the R v3.6.0 package sequoia (Huisman 2017) to update the full YNP wolf pedigree to include 871 P-O pairs among 474 wolves (n, females = 226, males = 239, unknown = 9; Supplementary Table S1) (vonHoldt et al. 2020). Wolves that lacked genetic confirmation of parents or offspring were excluded from the pedigree analysis. For individuals where the likely birthday was a range of years, we assigned it the earliest birth year. We completed several of the described methods for the full pedigreed population, as well as annually between 1995 and 2020.
We used PedScope v.2.4.01, a proprietary program used for animal population management and breeding recommendations, to perform pedigree-based analyses. The pedigree contained 26 pedigree founders, defined as individuals where no parents are included in the pedigree (Supplementary Table S1). Correlations were assessed using the product-moment correlation coefficient (r). We estimated several measures of gene diversity in the pedigreed population. We estimated the number of founder equivalents (fe) and founder genome equivalents (fg). The former measured gene diversity in a specified set of the pedigree (e.g. pedigree founders, annual, or entire) and represents the number of equally contributing founders expected to produce the observed level of genetic diversity in the specified population. The latter is a related metric that accounts for drift, where fg represents an estimate that incorporates random loss of alleles. Furthermore, we estimated the number of effective ancestors (fa) and identified influential ancestors as those with the greatest contributions toward fa using an algorithm developed by Boichard et al. (1997). Estimates of fa are similar to that of fe but account for bottlenecks in the pedigree.
We surveyed genome uniqueness (GU) or rarity, defined as the animal’s likelihood of carrying pedigree founder alleles not present in another individual in the pedigree and determined by a gene drop analysis (MacCleur et al. 1986). We also estimated mean kinship (MK), Wright’s pedigree-based inbreeding coefficient (F), and Ballou’s (1997) direct method for estimating ancestral inbreeding coefficient. In a pedigree-based approach, pairwise kinship values describe the probability that two alleles shared are IBD, a measure also estimated by the pedigreed population’s inbreeding coefficient F. We obtained the pairwise additive genetic relationship matrix (“A” matrix) to also assess pairwise relatedness. We described trends over time using a correlation coefficient.
We counted the number of litters per individual as the total number of years a specific mating pair reproduced and included only offspring where birth year was known. We opted to quantify the trait as the number of litters per individual wolf rather than the total number of offspring per individual wolf given the uncertainty of paternity and the number of surviving offspring (e.g. number of pups born versus the number observed to exit the den). We also included litter counts if at least one parent was known. We used the natural log (ln) of the number of genetic litters as a proxy for fitness, surveyed as a function of inbreeding coefficients, where the slope of the fitted line is inferred as the inbreeding load (−B) (Keller and Waller 2002). We further used the pedigree to estimate the total number of lethal equivalents (LE) per diploid genome (2B). The inbreeding load is often defined as the reduction in survival expected in a completely homozygous individual. We then estimated the expected depression in fitness (δ) following δ = 1−e−B × F, where B is the estimated number of LE per gamete and F is the inbreeding coefficient (Ralls et al. 1988; Keller and Waller 2002).
Genome-wide SNP genotype data
We analyzed 56,576 SNP loci (referred to as 56K) genotyped across 391 gray wolves (n, females = 193, males = 195, unknown = 3), which were previously collected using the same RADseq capture method (vonHoldt et al. 2020). The average number of reads uniquely mapped to the reference was 3,869,692 with an average of 7.2-fold sequence coverage (Supplementary Table S2). The previously published data analyzed in this study is publicly available on NCBI’s Sequence Read Archive (PRJNA577957). The mapped bam files for three newly collected individuals can be found at PRJNA743299 with additional metadata found in Supplementary Table S1. These public data were previously filtered to exclude individuals with >20% missing data and sites with >10% per-site missing data, MAF <1%, or significant deviation from HWE (P < 0.0001), leading to a final genotyping rate of 96.4%. We recognize that missing data thresholds present possible biases, both for the inclusion or exclusion of such (Arnold et al. 2013; Huang and Knowles 2016). As such, we opted to permit samples with no more than 20% and SNPs with no more than 10% of the data missing to avoid biasing allele frequency spectra in addition to the reduced representation library construction method. We estimated LD decay using the --r2 flag in PLINK across 391 YNP wolves genotyped for 56K SNPs. We further removed SNPs with a genotype correlation of r2 > 0.5 using the --indep-pairwise 50 5 0.5 flag to construct a pruned set of 24,235 statistically unlinked SNPs (referred to as 24K). Individual wolves with confirmed offspring via genetics or observation were categorized as “breeding” individuals (n = 152), while wolves lacking offspring were considered “non-breeding” individuals (n = 235), with four wolves of unknown breeding status (Supplementary Table S2). We will refer to these as separate portions of the population, not to be confused with demes or genetic populations. Rather, the difference in life histories (reproductive or not) is the central trait that distinguishes these groups of wolves within the same genetic population. We estimated pairwise genetic differentiation as Weir and Cockerham’s FST in VCFtools v0.1.17 (Danecek et al. 2011). We reported average FST across the genome (autosomes and X chromosome combined). We also completed several of the described methods for the full pedigreed population overall as well as annually and censored, between 1995 and 2018, with the pruned SNP set. We assessed differences in annual HO and FROH estimates using a 1-way analysis of variance (ANOVA) in R. We used the Kolmogorov–Smirnov test with the null hypothesis that comparisons of data subsets are drawn from the same distribution. We estimated standard observed and expected heterozygosity (HO and HE, respectively) in PLINK with the --hardy and --het flags.
Estimating the inbreeding coefficient from genotype data
We analyzed neutral and unlinked SNPs on the autosomes for a total length (Lgenome) of 2,202,059,258 nucleotides surveyed by our RADseq loci with the function homozyg in PLINK (flags --chr 1-38 --homozyg group). We estimated inbreeding coefficients from autozygosity, which was detected from runs of homozygosity (ROH). We scanned windows defined by a maximum of 200 SNPs and permitted a maximum of 50% missing data within each window (--homozyg-window-missing 50 --homozyg-window-snp 200). For a window to be considered a ROH, it could not contain more than three heterozygous genotypes (--homozyg-window-het 3) and was a minimum of 1000 kb (Grossen et al. 2018). We then converted the ROH segments to an individual-level inbreeding coefficient (FROH) following:
where LROH is the summed length of all ROHs detected in an individual. Furthermore, we estimated FROH across four genotype missingness thresholds (1%, 5%, 8%, and 10%) where individual-level missingness was limited to 20% in conjunction with estimates for no more than 10% missing genotype data and four individual thresholds of missing data (1%, 5%, 10%, and 15%).
Modeling reproduction, life history, and genetic parameters
We examined the age at first litter (Age_first_litter) using Cox proportional hazards regression (survival version 3.2-11) with time (in years) to the earliest of the three events first litter, death, or last documented observation. Animals who died or were lost to observation before breeding were considered right censored data in the analysis. We used the exact method to deal with ties when two or more animals had the same survival times. We standardized the potential predictors (observed heterozygosity and FROH) with a z-score transformation and included sex and the first two principal components (PCs) from genotype data as covariates in all analyses. We fit the following model structure:
For the Cox proportional hazards regression analysis, although Wald test P-values are reported, we elected the best-fit models through assessment of the Akaike information criterion (AIC).
Results
Pedigree structure
We analyzed a pedigree that was composed of 26 pedigree founders (individuals where no parents are found within the pedigree), 152 nuclear families (individuals that share the same parents), and 319 litters (Fig. 1A). The pedigree did not incorporate 17 wolves that lacked both confirmed offspring and either or both parents (Supplementary Table S1). Compared with the census size of 41 YNP founder individuals, we inferred 19.92 pedigree founder equivalents (fe), 18.53 founder genome equivalents (fg), and 18.79 effective ancestors (fa) in the pedigree analysis. There was a maximum ancestry depth of seven generations (average ± standard deviation [SD] = 3.8 ± 1.6), with an average kinship (MK) of 3.2%. We restricted the founder analysis to the wolves translocated in years 1995 and 1996, which included the individuals translocated from the Sawtooth pack in northwestern Montana. We found that wolf 009F had the highest genealogical contribution as measured by the percent contribution (pC) in the current pedigree due to that individual (pC = 12%; average = 3.8%), followed by 005F with 9% to the full pedigree of 474 wolves, and then by their respective mates 010M and 004M both with 8% contributions each (Fig. 1B). Nearly all pedigree founders had ≥87% of their alleles retained in the total pedigree (Average ± SD = 88% ± 20%), with only three individuals at 50% retention (015M, 036F, and a Canada source wolf Y38 whose offspring were translocated to the Chief Joseph pack in YNP) (Fig. 1B).

A) A small example of the Yellowstone gray wolf pedigree for 155 individuals that reproduced within the study period (circles = females; squares = males). The dashed lines represent interlineage parentage connections. B) Estimates of the pedigree founder contributions and genomic uniqueness or rarity in Yellowstone gray wolves. GU is the probability that a gene from the pedigree foundation was inherited “uniquely” with respect to other pedigree founders. Sawtooth F and M represent the known (and unsampled) parents of 10 full-sibling pups translocated (and genetically sampled) from the Sawtooth pack in northwestern Montana and fostered by the Nez Perce pack. Two of the translocated Sawtooth individuals assumed a breeding status in Yellowstone and represent gene flow (outbreeding) events. (Abbreviations: MC, marginal contribution to the total pedigree; pAR, proportion of alleles retained; pC, proportion of genes in the pedigreed population due to that founder). C) Number of litters as a function of pedigree-based inbreeding coefficient (F) estimates with data from 153 Yellowstone gray wolves (product-moment correlation coefficient, r = −0.11; Spearman rank correlation coefficient, rs = 0.27). D) Linkage disequilibrium (r2) decay for Yellowstone gray wolves (n, all = 391, breeding = 122, nonbreeding = 235, unknown = 4). E) Violin plots of genome-wide inbreeding coefficients (FROH) for 391 gray wolves across all reproductive categories. F) Number of litters as a function of FROH across both reproductive categories.
Average levels of founder GU were low (GU = 0.10 ± 0.3), with only a single pedigree founder (036F) displaying absolute uniqueness (GU = 1) (Fig. 1B). Four additional individuals had nonzero values (013M, 015M, 035M, Y53), indicating the likelihoods that alleles are not present in another individual in the pedigree are low (Fig. 1B). A survey of the total pedigree also revealed 20 influential pedigree founders, highlighting the largest marginal contributions by mated pair 009F (12%) and 010M (8%), and mated pair 004M and 005F (8% and 9%, respectively) (Fig. 1B). Across annual estimates, we found a positive correlation of MK (r = 0.77), genomic uniqueness or rarity over time (r = 0.55), and in founder equivalents (r = 0.43) (Fig. 1B). Furthermore, we found a negative correlation in the number of founder genome equivalents over time (r = −0.66) over time, with a weak but still negative correlation in the number of effective founders (r = −0.16) (Fig. 1B).
When we excluded observations of 0 litters, 153 observations remained with low levels of inbreeding coefficients (average F = 0.009 ± 0.03; range = 0.0 to 0.25) and was negatively correlated to the number of litters (m = −0.595, R2 = 0.004, P = 0.35) (Fig. 1C). From this, we estimated the inbreeding load as the slope of the trendline between ln(litter number) and inbreeding coefficient (y = −1.3689x + 0.6719), and thus |2B|=2.74 LE in a diploid genome. Following, the estimated fitness (δ) is expected to be depressed in 9.7% of progeny from a mating of first-degree relatives (F = 0.25).
Lower inbreeding coefficients in the breeding portion of the population
We obtained genome-wide SNP-based estimations for LD, heterozygosity, and inbreeding for 391 Yellowstone gray wolves, 387 of which have known breeding status within the study period of 1995 to 2018 (Supplementary Table S2). We did not find evidence that wolves with a confirmed breeding status had higher levels of observed heterozygosity relative to nonbreeding wolves across the 56K SNP genotypes (average ± SD HO: breeding = 0.208 ± 0.02, nonbreeding = 0.206 ± 0.02, one-tailed t-test of equal variance P = 0.0916). LD in the total YNP wolf population extended for an average of 150 kb before decaying below r2 = 0.5, with a similar length of LD decay found in the breeding population (230 kb) and substantially longer LD decay in nonbreeding wolves (750 kb) (Fig. 1D).
We found an overall consistency in FROH estimates with increasingly strict filtering (Supplementary Table S3); and selected the SNP set of limiting genotype missingness to 10% and individual-level missingness to 20%. Furthermore, the breeding population had significantly lower levels of inbreeding estimates than nonbreeding individuals (FROH: breeding = 0.0016 ± 0.003, nonbreeding = 0.0023 ± 0.004, P = 0.0390) (Fig. 1E), despite these wolves representing the same genetic population (mean, weighted FST = −0.0004, −0.0005). We further find that breeding wolves carried fewer nucleotides in ROH segments than nonbreeding wolves (mean length in kb, breeding = 3436.4, nonbreeding = 5012.6, Welch two-sample t-test, t = −1.90, df = 365.6, P = 0.05863), but no differences in the total number of ROH segments across breeding status (mean number of ROH segments, breeding = 0.31, nonbreeding = 0.41, t = −1.51, df = 354.4, P = 0.1329) (Supplementary Fig. S2).
We observed stable annual heterozygosity estimates that were not significantly different between 152 breeding and 235 nonbreeding wolves for 21 of the 24 years surveyed, likely due to small intra-annual genetic sample size relative to the census size (Fig. 2; Supplementary Table S4). For three years (2011 to 2013), 36 unique nonbreeding wolves carried significantly higher observed heterozygosity than the 50 unique breeding wolves (ANOVA P ~ 0.05; Supplementary Table S4), although this trend can be noted in nearly every year surveyed and should be interpreted with caution given the small annual sample sizes. A similar but opposite trend was noted for the skewness of inbreeding coefficients estimated from FROH, where the breeding population had a significantly larger proportion of lower FROH than nonbreeding wolves for two years (2012 and 2013; KS P = 0.0016 and 0.0271, respectively) (Fig. 2).

Loess smoothed annual population averages of observed heterozygosity (HO) estimated from 56K SNPs and inbreeding coefficients (FROH) from 24K pruned SNPs compared across breeding and nonbreeding gray wolves of Yellowstone National Park (see Supplementary Table S3 for more details). Years with significant or notable differences in the metric between the populations are indicated by vertical dashed lines and the associated P-value.
Reproduction influenced by genomic diversity and autozygosity
We restricted the dataset to explore fitness and genetic data to 385 pedigreed wolves with sufficient and reliable life-history information (known year of birth, death or last observation, and years of reproduction). We found that breeding wolves on average had longer duration of being observed (lifespan or dispersal out of YNP) than nonbreeders (average breeding = 6.1 years, N = 150; nonbreeding = 2.1 years, N = 225, Welch two-sample t-test t = 17.4, df = 205.4, P < 2.2 × 10−16) although observed heterozygosity was not significantly different between reproductive status (average HO: breeding = 0.208, nonbreeding = 0.205, t = 1.2, df = 277.4, P = 0.2313) (Supplementary Table S2). We found that nonbreeding wolves had higher inbreeding coefficients than breeding wolves (average FROH: breeding = 0.0016, nonbreeding = 0.0023, t = −1.9, df = 365.5, P = 0.05948). We found no significant difference between breeding status in the number of ROH segments or their lengths (Supplementary Table S2, Supplementary Fig. S1). Furthermore, we used a linear regression and found a significant negative relationship between inbreeding and the number of litters documented (All wolves R = −0.11, P = 0.029; Breeding status wolves only R = −0.13, P = 0.15) (Fig. 1F). For 142 breeding wolves with known litters, the average time to first litter was 2.93 ± 1.3 years, with the time stratified by sex where breeding males had their first litter at a significantly older age than females (males = 3.2 ± 1.3, females = 2.7 ± 1.2; t = 2.4, df = 119.4, P = 0.02065).
We examined the relationship of FROH and heterozygosity with the number of litters that each wolf was observed to produce. We restricted the data to breeders with known number of litters; recall that nonbreeders have no documented litters. Please note, we recognize that young wolves in the nonbreeder cohort may ultimately acquire breeding status if they had lived long enough. Thus, we restricted wolves in either cohort to be at least two years of age as a proxy for sexual maturity and we feel this would confidently reduce any potential confounding in the nonbreeding cohort. We also excluded wolves known to have reproduced but the exact number was unknown and wolves who were breeders but their age at such was unknown. Under these restrictions, we analyzed 109 breeders and 100 nonbreeders in a negative binomial distribution to model the number of litters. We used a best subset approach to choose the covariates for the null model, which resulted in age rounded to the nearest year as a covariate (βrounded_age = 0.373, P < 2.00 × 10−16, AIC = 569.4) but not the PCs or sex. With this base model, we then explored the association of standardized estimates of heterozygosity and FROH, each with the number of litters. Using AIC as the criterion, the number of litters was significantly associated with standardized heterozygosity (βstdhet = −0.187, P = 0.0117, AIC = 565.3) and to a lesser extent with standardized FROH (βstdFROH = −0.190, P = 0.0673, AIC = 567).
We further explored the relationship of similar genetic parameters with respect to age at first litter for 136 breeding and 100 nonbreeding wolves using Cox proportional hazards survival analysis. We similarly used the standardized FROH and heterozygosity estimates, and again used a best subset approach to choose the covariates for the null model (AIC = 1258.7) (Table 1). As a result, we included sex as a covariate (βsex = −0.597, P = 0.000663) and found that males had their first litter at an older age than females, along with the second PC from SNPs covering the genome to account for the relatedness (βPC2 = 0.00231, P = 0.00139). Similar results were obtained without these covariates (results not shown). Although we found a general trend that individuals who had their first litter at an older age also had higher inbreeding coefficients, we found that this inbreeding trend did not significantly change the Cox proportional hazard model and was not associated with reproduction at an older age. Thus, FROH was not a significant predictor of age at first litter (Log hazard coefficient for standardized βFROH = −0.0916, P = 0.351, AIC = 1259.8). The standardized heterozygosity did not significantly change the Cox proportional hazard model (βHO=0.0222, P = 0.828, AIC = 1260.6).
Cox proportional hazards regression with age at first litter as the outcome response. Time is recorded in years and represents time to first litter (n = 136), time to last observation, or death (n = 100) whichever is sooner. Data from animals who die or are no longer observed before reproducing are considered censored. The predictors are time invariant and have been standardized. The best-fitting model has the lowest AIC.
Predictor . | Log hazard coefficient . | Coefficient P-value . | AIC . |
---|---|---|---|
Base model: Sex | −0.597 | 0.000663 | 1258.7 |
PC2 | 0.00231 | 0.00139 | |
Model: HO | 0.0222 | 0.828 | 1260.6 |
Sex | −0.602 | 0.000667 | |
PC2 | 0.00240 | 0.00415 | |
Model: FROH | −0.0916 | 0.351 | 1259.8 |
Sex | −0.595 | 0.000697 | |
PC2 | 0.00235 | 0.00121 |
Predictor . | Log hazard coefficient . | Coefficient P-value . | AIC . |
---|---|---|---|
Base model: Sex | −0.597 | 0.000663 | 1258.7 |
PC2 | 0.00231 | 0.00139 | |
Model: HO | 0.0222 | 0.828 | 1260.6 |
Sex | −0.602 | 0.000667 | |
PC2 | 0.00240 | 0.00415 | |
Model: FROH | −0.0916 | 0.351 | 1259.8 |
Sex | −0.595 | 0.000697 | |
PC2 | 0.00235 | 0.00121 |
Abbreviations: HO, observed heterozygosity.
Cox proportional hazards regression with age at first litter as the outcome response. Time is recorded in years and represents time to first litter (n = 136), time to last observation, or death (n = 100) whichever is sooner. Data from animals who die or are no longer observed before reproducing are considered censored. The predictors are time invariant and have been standardized. The best-fitting model has the lowest AIC.
Predictor . | Log hazard coefficient . | Coefficient P-value . | AIC . |
---|---|---|---|
Base model: Sex | −0.597 | 0.000663 | 1258.7 |
PC2 | 0.00231 | 0.00139 | |
Model: HO | 0.0222 | 0.828 | 1260.6 |
Sex | −0.602 | 0.000667 | |
PC2 | 0.00240 | 0.00415 | |
Model: FROH | −0.0916 | 0.351 | 1259.8 |
Sex | −0.595 | 0.000697 | |
PC2 | 0.00235 | 0.00121 |
Predictor . | Log hazard coefficient . | Coefficient P-value . | AIC . |
---|---|---|---|
Base model: Sex | −0.597 | 0.000663 | 1258.7 |
PC2 | 0.00231 | 0.00139 | |
Model: HO | 0.0222 | 0.828 | 1260.6 |
Sex | −0.602 | 0.000667 | |
PC2 | 0.00240 | 0.00415 | |
Model: FROH | −0.0916 | 0.351 | 1259.8 |
Sex | −0.595 | 0.000697 | |
PC2 | 0.00235 | 0.00121 |
Abbreviations: HO, observed heterozygosity.
Discussion
The reintroduced gray wolves of YNP are unique in that they are observed every single year, with life events documented and supported by a wealth of molecular data. Despite the rising accessibility of genome sequencing, Yellowstone wolves stand among only a few other systems with similar multidimensional datasets that merge static, dynamic, and molecular perspectives (Stahler et al. 2020). We explored this interface by conducting a pedigree- and marker-based assessment of genetic variation over time within the breeding portion of the population and modeled fitness-related traits. The pedigree is large and complex, capturing seven generations since the reintroduction of gray wolves to the Rocky Mountains of the United States in 1995 and 1996. Although Canadian founders were carefully selected and indeed have contributed to the genetic success of the population, equally important was the translocation of gray wolf pups from northwest Montana in 1997 that carried divergent genetics (vonHoldt et al. 2008, 2010). As two of these pups matured and assumed a social rank with reproductive access, this lineage continues to provide an influx of genetic variation distinct from the Canadian founders into YNP wolves through their descendants. This finding illustrates the critically positive impact that a few successful breeders have on gene flow into a population. Furthermore, this is exemplary as a design for reintroduction programs to establish a genetically diverse founding population.
Given the known genealogy of this population, we identified a handful of founders that contributed a significant amount of kinship and GU, with more than 87% of the founders’ genomes persisting in the pedigree. The remaining fraction was lost through random drift or emigration. As expected for group-living species, MK significantly increased over time; however, this appears to be mitigated by the continued reproductive success of the translocated individuals from northwest Montana (now the Nez Perce lineage). Furthermore, concordant with vonHoldt et al. (2008), mate choice and inbreeding avoidance also mitigates the degree to which MK increased over time in the pedigree, which is expected to be significantly higher under random breeding alone. We found that the number of effective founders and genomic uniqueness increased over time; this positive consequence is expected in a social species with structured mating, which can alleviate the potentially strong impacts of the reintroduction event which inevitably forces a population genetic bottleneck.
We utilized genome-wide SNP data to explore the distribution of genetic variation across the pedigree relative to time, reproduction, and fitness-related traits (e.g. litters, inbreeding). Overall, we document high retention of genetic variation over the 25 years since reintroduction. We found that the population of breeding wolves carried a significantly shorter distances at which LD decays. This is consistent with past findings that the reproductively active portion of the YNP gray wolf population carried a higher mean allelic richness, which is expected to show more immediate changes than other allele diversity metrics like heterozygosity, than the census in nearly all years surveyed (1995 to 2015) (Cornuet and Luikart 1996; DeCandia et al. 2021).
We hypothesized that reproductive status reflected overall fitness and predicted that if the breeding population showed decreased genetic variation with increased autozygosity levels, fitness-related traits would be reduced. Elevations of autozygosity levels indicate both a lack of genetic diversity and inflation of alleles that are IBD via inbreeding and are often associated with increased expression of recessive deleterious traits (Keller 2002; Charlesworth and Willis 2009). Indeed, we found remarkably lower levels of autozygosity and inbreeding coefficients in the breeding population relative to the nonbreeding population. Such a trend is far more significant on overall inference of genetic health than levels of homozygosity. Our finding is in direct contrast to the gray wolf population in Isle Royale National Park (IRNP; Michigan, USA), which had persisted at exceedingly low numbers for over a decade with negative fitness and genome-wide consequences recently documented (Robinson et al. 2019). We do not suggest the genetic health of YNP wolves is similar to IRNP; rather, they represent differences along the complex gradient of interactions between breeding population size, genetic isolation, and fitness.
The suggested traits impacted by autozygosity in YNP gray wolves were reflected in reproduction, a fitness consequence concordant with studies of IBD and complex diseases in humans (Szpiech et al. 2013; Ceballos et al. 2018; Pemberton et al. 2018). After controlling for sex-specific differences in primiparity, individuals that are older at the time of their first litter carried higher levels of autozygosity, which was also positively associated with a significant reduction in the total number of litters for that individual. Proximate mechanisms for why these older individuals were unable to breed during previous years may be due to intra-pack composition (e.g. lower social status or no access to unrelated mates) or challenges with finding potential mates through dispersal. Ultimately, individuals with higher levels of autozygosity that breed later in life may be losing out to higher quality competitors before eventually breeding, indicating a fitness cost. Sex-specific differences in age at first litter likely reflect differential breeding opportunities through male-biased dispersal and female natal philopatry documented in YNP wolves (Smith et al. 2020; Stahler et al. 2020).
Reproductive fitness can be challenging to measure, and estimates thereof, in a wild natural population as a large enough sample size for a well powered is needed, especially in species with longer generation times (Keller and Waller 2002). Nielsen et al. (2012) utilized a long-term pedigree-based design for meerkats (Suricata suricatta), a cooperative and group-living species, and found that increased inbreeding coefficients were associated with negative morphologic consequences for pups with higher inbreeding coefficients. Increased pathogen load was found in Soay sheep (Ovis aries) in sheep with reduced genetic diversity (Coltman et al. 1999) and depressed fitness across developmental stages for the endangered red-cockaded woodpecker (Picoides borealis) (Daniels and Walters 2000). We also recently found similar trends for the YNP wolves whereas reductions in genome-wide diversity estimates are associated with increased disease severity with respect to sarcoptic mange infections (DeCandia et al. 2021). Although YNP wolves overall are genetically diverse, we detected reproduction consequences associated with moderate increases in autozygosity as a result of reduced genetic variation. Similar patterns of reproductive consequences were found with maternal inbreeding levels in red deer (Cervus elaphus) being correlated with reduced offspring survival, further suggesting that depression of fitness traits in adult individuals is likely to be more accurately estimated from marker-based data (Huisman et al. 2016). Chu et al. (2019) reported a negative correlation between fecundity and autozygosity in a longitudinal study of domestic dogs, which also supported pedigree-based assessments of such (LeRoy et al. 2015).
Surveying LE assumes that inbreeding depression is caused by deleterious recessive variation found in the homozygous state (or the equivalent of alleles across loci with a similar contribution toward inbreeding depression) and is defined by the number of deaths expected in a group where each individual carried a single lethal allele in a homozygous genotype (Morton et al. 1956). Pedigree-based estimates of inbreeding revealed a negative correlation to the number of litters, with an estimation that each wolf likely carries nearly two LEs in their genome, similar to previous estimates made for humans, Drosophila, and great tits (Szulkin et al. 2007; Gao et al. 2015). This directly translated to an expected decrease in offspring survival to maturity by 14% due to a full-sibling mating. Nietlisbach et al. (2018) conducted a literature survey and reported that wild vertebrate populations carried on average 3.5 LEs. However, 13 of the 18 studies analyzed represented bird species, likely across a diversity of mating systems relative to the highly structured one of gray wolves.
Limitations of a small sample size and reduced genomic surveillance
The limitation of the total sample size in addition to the reduced number of wolves that have all metadata result in a lack of statistical power for some of the analyses. Although increasing the sample size is always preferred, the focus on the uniqueness and value of a wild, pedigreed population has merit. We explored the degree that missing genotype data (per locus or per individual) might reveal significant inconsistencies in ROH identification and the resulting inbreeding inference. Across nine parameter combinations (genotype missingness 1%, 5%, 8%, and 10%; individual missingness 1%, 5%, 10%, 15%, and 20%), we found consistent FROH inferences albeit fewer ROH fragments were detected per individual as SNP densities decreased (from 27,360 to 18,222 loci) (Supplementary Table S2). Locus filtering had a greater impact on ROH detection than did individual exclusion based on missingness. These parameter explorations are useful to ensure that estimates are not artifacts of methods or data quality, and builds confidence that, despite small sample sizes, trends can still provide biological meaning.
Long-term implications
These results provide a valuable baseline through which continued monitoring can evaluate long-term genetic health of wolves in YNP. Similar approaches could be applied across a larger geographic scale to monitor genetic health and connectivity of wolf populations throughout the contiguous United States, a task that is identified in the final endangered species act (ESA) delisting requirements (USFWS 2020). In light of our findings, the future genetic health of wolves in YNP depends upon the critical role of gene flow and preserving landscape corridors to support effective dispersal. The YNP wolf population, which represents the core of a larger population throughout the Greater Yellowstone Ecosystem, serves as an important source for wolves that disperse beyond the protective boundaries (vonHoldt et al. 2010). As the YNP wolf population has stabilized at lower densities over time, the pedigreed population exhibited an increase in GU (e.g. Table 1), suggestive of successful effective dispersal that has been documented through field observations of radio-collared dispersers (Stahler et al. 2020). Although this study does not evaluate gene flow into YNP, our findings suggest that immigration of effective dispersers over time will be essential for safeguarding their future through the incorporation of new and adaptive genetic variation. Such variation provides an important mechanism for rapid evolutionary response to changing environmental challenges caused by disease, climate change, and human alteration of habitats (Kardos et al. 2021).
To maintain larger-scale wolf population connectivity and counteract loss of genetic variation, natural dispersal dynamics should be promoted and anthropogenic factors that significantly reduce genetic connectivity and effective population size should be mitigated (vonHoldt et al. 2010). These goals face increased challenges under recent wolf management directives in some northern Rocky Mountain states that aim to significantly reduce the number of wolves on the landscape. For example, recent legislative actions in Montana (2021 Montana Code Annotated 87-1-901) and Idaho (2021 Idaho Senate Bill No. 1211) include aggressive policies to reduce wolf population sizes to levels close to minimum threshold requirements to prevent ESA relisting (e.g. Idaho SB 1211). These actions include regulations that allow for killing wolves using methods beyond shooting, such as baiting, snaring, trapping, night-time hunting on private land, aerial gunning, unlimited quotas, large numbers of wolf tags per hunter (20 wolves in Montana) or unlimited (Idaho), and extensive hunting seasons (e.g. year-round in Idaho and 6 months in Montana). If continued, these specific regulations have the potential to significantly reduce regional wolf population sizes, not only limiting gene flow into YNP, but disrupting the genetic connectivity that was demonstrated to have occurred across a larger regional scale following the first decade of wolf recovery in the western United States (vonHoldt et al. 2010). Such management policies that fail to incorporate larger meta-population dynamics of dispersal and interregional connectivity with adequate effective population sizes could jeopardize the tremendous success of wolf recovery efforts and genetic health in the Western United States over the last 25 years. These findings can be more broadly applicable to species of conservation concern by demonstrating methods for evaluating the maintenance of genetic variation through time, providing accurate estimates of inbreeding and its impact on reproductive traits, and identifying resulting fitness consequences that are crucial for predicting their future viability.
Supplementary Material
Supplementary material is available at Journal of Heredity online.
Acknowledgments
We are grateful for funding support from Yellowstone Forever and their many donors, especially Valerie Gates and Annie and Bob Graham. We also are appreciative to the many field technicians who continue to document in amazing details the lives of wolves in Yellowstone National Park.
Funding
JSS is partially funded by the NIH (R35GM141798, R01AI153044, and R01HG009120) and the National Science Foundation (DMS-1264153). DRS and DWS were partially funded by the National Science Foundation (DEB-0613730 and DEB-1245373).
Conflict of interest statement. None declared.
Data availability
The previously published data analyzed in this study is publicly available on NCBI’s Sequence Read Archive (SRA PRJNA577957). The mapped bam files for three newly collected individuals can be found at PRJNA743299 with additional metadata found in Supplementary Table S1.
Benefit-sharing statement
Benefits from this research accrue from the sharing of our data and results on public databases as described above.
Author contributions
The project was designed by BMV, with data collected by BMV, ALD, KAC, EES, and DRS. The analysis was conducted by BMV, ALD, and JSS. All authors contributed toward drafting and reviewing the manuscript.