Abstract

Cross-species application of single-nucleotide polymorphism (SNP) chips is a valid, relatively cost-effective alternative to the high-throughput sequencing methods generally required to obtain a genome-wide sampling of polymorphisms. Kharzinova et al. (2015) examined the applicability of SNP chips developed in domestic bovids (cattle and sheep) to a semi-wild cervid (reindeer). The ancestors of bovids and cervids diverged between 20 and 30 million years ago (Hassanin and Douzery 2003; Bibi 2013). Empirical work has shown that when a SNP chip developed in a bovid is applied to a cervid species, approximately 50% genotyping success, with 1% of the loci being polymorphic, is expected (Miller et al. 2012). The genotyping results of Kharzinova et al. (2015) follow this pattern; however, these data are not appropriate for identifying runs of homozygosity (ROH) and can be problematic for estimating linkage disequilibrium (LD), and we caution readers in this regard.

Inbreeding (mating between relatives) results in parts of the genome that are identical-by-descent (IBD) in the offspring. More specifically, IBD segments of the genome occur when identical copies of a chromosome segment, originating from a common ancestor of the parents, are transmitted to an offspring. Importantly, IBD is manifested as runs of homozygosity (ROH), and inbreeding between individuals with a more recent shared ancestor produces longer ROH (Kardos et al. 2015). Domestic cattle have a long history of selective breeding programs, and the rate of inbreeding has generally been increasing (USDA 2006). Screening a variety of cattle breeds with the bovineSNP50 chip, Ferenčaković et al. (2013) found ROH lengths to vary between 2.36 and 4.01 Mb. Using the same SNP chip, but with fewer than 1500 polymorphic SNPs, Kharzinova et al. (2015) documented values around 15 Mb (with some over 30 Mb). This is despite the fact that there is no strong evidence of inbreeding in reindeer (Roed 1985; Holand et al. 2007).
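The link between ancestor recency and ROH length can be made concrete with a rough calculation. The sketch below is not part of the original analysis: it assumes the standard approximation that IBD segments inherited through a common ancestor g generations back are exponentially distributed with a mean of 100/(2g) cM, and it assumes a uniform recombination rate of 1 cM per Mb purely for illustration. The function name is hypothetical.

```python
def expected_ibd_length_mb(generations, cm_per_mb=1.0):
    """Approximate mean IBD segment length (Mb) for a common ancestor
    `generations` back.

    Assumptions (not from the source text): segment lengths are
    exponential with mean 100 / (2 * generations) cM, and recombination
    is uniform at `cm_per_mb` centimorgans per megabase.
    """
    if generations <= 0:
        raise ValueError("generations must be positive")
    mean_cm = 100.0 / (2.0 * generations)
    return mean_cm / cm_per_mb


# Mean segment length shrinks quickly as the shared ancestor gets older.
for g in (2, 5, 10, 25):
    print(g, "generations:", round(expected_ibd_length_mb(g), 2), "Mb")
```

Under these assumptions, mean lengths of 15 Mb or more would correspond to common ancestors only a few generations back, which is why such values are surprising in a species with no strong evidence of inbreeding.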

It is difficult, if not impossible, to reliably detect ROH using a small number of polymorphic loci. To demonstrate this, we tested the effect of the number of polymorphic loci on the detection of ROH with data simulated using the approach of Kardos et al. (2015). We simulated a single population with an effective population size (Ne) of 75 diploid individuals and an immigration rate of m = 1/75 (1 immigrant per generation on average). The simulated genomes included 20 equally sized autosomes, a genetic map length of 3600 cM, and a total physical length of 3 Gb. We simulated 100 000 polymorphic SNPs (mean expected heterozygosity = 0.30). We compared the length distribution of the true IBD chromosome segments to the lengths of ROH detected with PLINK, applying the same PLINK settings as in Kharzinova et al. (2015) to panels of 25K, 50K, and 100K SNPs.
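To give a sense of scale, the following sketch computes the mean spacing between SNPs for each simulated panel, using the 3 Gb genome and 3600 cM map described above. The final case, 1500 SNPs spread over a 3 Gb genome, is an illustrative assumption rather than a figure from either study, and it assumes evenly spaced markers.

```python
GENOME_BP = 3_000_000_000   # 3 Gb physical length, as in the simulation
MAP_CM = 3600               # genetic map length, as in the simulation


def mean_spacing_kb(n_snps, genome_bp=GENOME_BP):
    """Mean distance between adjacent SNPs in kb, assuming even spacing."""
    return genome_bp / n_snps / 1_000


# Average recombination rate implied by the simulated genome (cM per Mb).
cm_per_mb = MAP_CM / (GENOME_BP / 1_000_000)
print("recombination rate:", cm_per_mb, "cM/Mb")

# Simulated panels, plus a hypothetical sparse panel of 1500 SNPs.
for n in (100_000, 50_000, 25_000, 1_500):
    print(n, "SNPs:", mean_spacing_kb(n), "kb between markers")
```

With markers roughly 2 Mb apart, as in the hypothetical sparse panel, only a handful of informative loci would fall inside even a 15 Mb segment.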

The distributions of the lengths of the true IBD segments and of the ROH detected with 50K and 100K SNPs are shown in Figure 1. Analysis based on 25K loci failed to detect any ROH. Apparently, the ROH analyses of Kharzinova et al. (2015) were carried out on all of the SNPs in the array, including those that were not polymorphic, which invalidates the analysis and explains the very high ROH values (we thank the authors for providing the original log files that confirmed fixed sites were used). It is inappropriate to include fixed loci in analyses of ROH because these loci are uninformative of the presence of ROH within a population sample (i.e., ROH are inferred using only polymorphic sites). It is clear from our simulations that the majority of IBD segments cannot be detected using small numbers of loci. For this reason, it is recommended that more than 100 000 SNPs be used for ROH analyses (Purcell et al. 2007). Another consideration in the cross-species application of SNP chips is that ROH analysis depends on genome coordinates, which will introduce a bias if karyotypes differ between species. In the case of Kharzinova et al. (2015), reindeer have a karyotype of 2n = 70 (Nes et al. 1965), whereas cattle have 2n = 60 (Wurster and Benirschke 1968) and sheep have 2n = 54 (Di Meo et al. 2007). The cross-species application of SNP chips to ROH analysis will thus produce misleading inferences on inbreeding unless a high number of polymorphic SNPs cross-amplify from a species with a similar karyotype.
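The effect of including fixed loci can be illustrated with a toy example. The sketch below is not the PLINK algorithm and does not reproduce our simulations: it uses a simple consecutive-run scan on one simulated outbred individual, with 1% of loci polymorphic and a heterozygosity of 0.30 at those loci. All function names and parameter values are illustrative assumptions.

```python
import random


def simulate_individual(n_loci, poly_fraction, het=0.30, seed=1):
    """Simulate genotypes for one outbred individual along a chromosome.

    Genotypes are coded 0 or 2 (homozygous) and 1 (heterozygous).
    Fixed loci are always 0; polymorphic loci are heterozygous with
    probability `het`, otherwise homozygous for a random allele.
    """
    rng = random.Random(seed)
    genotypes, is_poly = [], []
    for _ in range(n_loci):
        poly = rng.random() < poly_fraction
        is_poly.append(poly)
        if not poly:
            genotypes.append(0)
        elif rng.random() < het:
            genotypes.append(1)
        else:
            genotypes.append(rng.choice((0, 2)))
    return genotypes, is_poly


def longest_homozygous_run(genotypes):
    """Return (start, length) of the longest run of homozygous calls."""
    best_start, best_len = 0, 0
    run_start, run_len = 0, 0
    for i, g in enumerate(genotypes):
        if g == 1:
            run_len = 0          # a heterozygous call ends the run
        else:
            if run_len == 0:
                run_start = i
            run_len += 1
            if run_len > best_len:
                best_start, best_len = run_start, run_len
    return best_start, best_len


genotypes, is_poly = simulate_individual(50_000, poly_fraction=0.01)

# Run counted over every locus on the array, fixed sites included.
start, length = longest_homozygous_run(genotypes)
informative = sum(is_poly[start:start + length])

# Run counted over polymorphic loci only.
poly_only = [g for g, p in zip(genotypes, is_poly) if p]
_, poly_length = longest_homozygous_run(poly_only)

print("longest run, all loci:", length, "SNPs")
print("polymorphic loci inside that run:", informative)
print("longest run, polymorphic loci only:", poly_length, "SNPs")
```

In this toy setting, a run that looks long when counted over all loci is supported by only a small number of informative genotypes, even though the individual is not inbred.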

Figure 1. True IBD segments and ROH detected in simulated data. Histograms show the distribution of the lengths of true IBD segments (a), ROH detected using 100K SNPs (b), and ROH detected using 50K SNPs (c) in PLINK analyses as described in the main text.

The measurements of linkage disequilibrium (LD) by Kharzinova et al. (2015) also appear problematic. Applying the ovineSNP50 chip to wild sheep, Miller et al. (2011) observed a genome-wide r² of 0.04, with syntenic comparisons (i.e., loci on the same chromosome) approaching an r² of 0.19. With the bovineSNP50 chip, Bohmanova et al. (2010) observed a mean r² of 0.24 for SNPs less than 40 kb apart. Kharzinova et al. (2015) reported r² values of 0.54 and 0.41 for these respective chips. The cause of this extreme LD is not immediately clear, but the high level of polymorphism could be the problem. High heterozygosity results in increased power to detect LD (Ott and Rabinowitz 1997); however, the observed levels of heterozygosity in this study are nearly double what is expected for both chips (0.91–0.48; 0.79–0.44), and in stark contrast to a similar study using the canine SNP chip on a wild phocid, a pairing of species that are even more divergent (0.24–0.25; Hoffman et al. 2013). It is likely that paralogues, balancing selection, small sample size, or contamination are affecting these estimates (as well as the ROH analysis). Elevated LD and heterozygosity do not appear to be a common trend with cross-species application of SNP chips (e.g., Miller et al. 2011; Hoffman et al. 2013), but here they limit the utility of these markers for population genetic inference in reindeer.
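For reference, the r² statistic compared above can be computed from two-locus haplotype frequencies as r² = D² / [pA(1 − pA) pB(1 − pB)], where D = pAB − pA pB. The sketch below assumes phased haplotypes with alleles coded 0/1; SNP chip data are unphased, so in practice r² is estimated from genotypes, and this function is an illustration rather than the procedure used in any of the cited studies.

```python
def r_squared(haplotypes):
    """Compute r^2 between two biallelic loci from phased haplotypes.

    `haplotypes` is a list of (a, b) pairs, where a and b are the
    alleles (coded 0 or 1) carried at locus A and locus B.
    """
    n = len(haplotypes)
    if n == 0:
        raise ValueError("no haplotypes supplied")
    p_a = sum(a for a, _ in haplotypes) / n      # frequency of allele 1 at A
    p_b = sum(b for _, b in haplotypes) / n      # frequency of allele 1 at B
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("r^2 is undefined when either locus is monomorphic")
    d = p_ab - p_a * p_b                          # disequilibrium coefficient D
    return d * d / denom


# Alleles always co-occur: complete LD.
print(r_squared([(1, 1), (1, 1), (0, 0), (0, 0)]))
# All four haplotypes equally frequent: no LD.
print(r_squared([(1, 1), (1, 0), (0, 1), (0, 0)]))
```

Note that the statistic is undefined for monomorphic loci, which is one more reason fixed sites must be excluded before such analyses.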

In conclusion, we agree that cross-species application of chips is a valid approach to obtain SNPs in non-model organisms. However, these markers will generally not be suitable for estimating meaningful ROH values, for the reasons stated above. Studies should further ensure that LD and heterozygosity estimates produced by chips are in accordance with commonly observed values before making population genetic inferences.

References

Bohmanova J, Sargolzaei M, Schenkel FS. 2010. Characteristics of linkage disequilibrium in North American Holsteins. BMC Genomics. 11:421.

Bibi F. 2013. A multi-calibrated mitochondrial phylogeny of extant Bovidae (Artiodactyla, Ruminantia) and the importance of the fossil record to systematics. BMC Evol Biol. 13:166.

Di Meo GP, Perucatti A, Floriot S, Hayes H, Schibler L, Rullo R, Incarnato D, Ferretti L, Cockett N, et al. 2007. An advanced sheep (Ovis aries, 2n = 54) cytogenetic map and assignment of 88 new autosomal loci by fluorescence in situ hybridization and R-banding. Anim Genet. 38:233–240.

Ferenčaković M, Sölkner J, Curik I. 2013. Estimating autozygosity from high-throughput information: effects of SNP density and genotyping errors. Genet Select Evol. 45:42.

Hassanin A, Douzery EJP. 2003. Molecular and morphological phylogenies of Ruminantia and the alternative position of the Moschidae. Syst Biol. 52:206–228.

Hoffman JI, Thorne MA, McEwing R, Forcada J, Ogden R. 2013. Cross-amplification and validation of SNPs conserved over 44 million years between seals and dogs. PLoS One. 8:e68365.

Holand O, Askim KR, Røed KH, Weladji RB, Gjøstein H, Nieminen M. 2007. No evidence of inbreeding avoidance in a polygynous ungulate: the reindeer (Rangifer tarandus). Biol Lett. 3:36–39.

Kardos M, Luikart G, Allendorf FW. 2015. Measuring individual inbreeding in the age of genomics: marker-based measures are better than pedigrees. Heredity (Edinb). 115:63–72.

Kharzinova VR, Sermyagin AA, Gladyr EA, Okhlopkov IM, Brem G, Zinovieva NA. 2015. A study of applicability of SNP chips developed for bovine and ovine species to whole-genome analysis of Reindeer Rangifer tarandus. J Hered. 106:758–761.

Miller JM, Kijas JW, Heaton MP, McEwan JC, Coltman DW. 2012. Consistent divergence times and allele sharing measured from cross-species application of SNP chips developed for three domestic species. Mol Ecol Resour. 12:1145–1150.

Miller JM, Poissant J, Kijas JW, Coltman DW. 2011. A genome-wide set of SNPs detects population substructure and long range linkage disequilibrium in wild sheep. Mol Ecol Resour. 11:314–322.

Ott J, Rabinowitz D. 1997. The effect of marker heterozygosity on the power to detect linkage disequilibrium. Genetics. 147:927–930.

Nes N, Amrud J, Tondevold O. 1965. Kromosomstudier hos rein (Rangifer tarandus). Nord Vet Med. 17:589–593.

Purcell S, Neale B, Todd-Brown K, Thomas L, Ferreira MA, Bender D, Maller J, Sklar P, de Bakker PI, Daly MJ, et al. 2007. PLINK: a tool set for whole-genome association and population-based linkage analyses. Am J Hum Genet. 81:559–575.

Roed KH. 1985. Genetic variability in Norwegian semi-domestic reindeer (Rangifer tarandus). Hereditas. 102:177–184.

United States Department of Agriculture. 2006. Animal Improvement Programs Laboratory. Inbreeding coefficients for Holstein cows. http://aipl.arsusda.gov/dynamic/inbrd/current/HOt.html (accessed 18 October 2015).

Wurster DH, Benirschke K. 1968. Chromosome studies in the superfamily Bovoidea. Chromosoma. 25:152–171.

Author notes

Address correspondence to Aaron B.A. Shafer at the address above, or e-mail: [email protected].

Corresponding editor: Taras Oleksyk