Abstract

The introduction of animals from a different environment or population is a common practice in commercial livestock populations. In this study, we modeled the inclusion of a group of external birds into a local broiler chicken population for the purpose of genomic evaluations. The pedigree was composed of 242,413 birds and genotypes were available for 107,216 birds. A five-trait model that included one growth, two yield, and two efficiency traits was used for the analyses. The strategies to model the introduction of external birds were to include a fixed effect representing the origin of parents and to use unknown parent groups (UPG) or metafounders (MF). Genomic estimated breeding values (GEBV) were obtained with single-step GBLUP using the Algorithm for Proven and Young. Bias, dispersion, and accuracy of GEBV for the validation birds, that is, from the most recent generation, were computed. The bias and dispersion were estimated with the linear regression (LR) method,whereas accuracy was estimated by the LR method and predictive ability. When fixed UPG were fit without estimated inbreeding, the model did not converge. In contrast, models with fixed UPG and estimated inbreeding or random UPG converged and resulted in similar GEBV. The inclusion of an extra fixed effect in the model made the GEBV unbiased and reduced the inflation. Genomic predictions with MF were slightly biased and inflated due to the unbalanced number of observations assigned to each metafounder. When combining local and external populations, the greatest accuracy can be obtained by adding an extra fixed effect to account for the origin of parents plus UPG with estimated inbreeding or random UPG. To estimate the accuracy, the LR method is more consistent among scenarios, whereas the predictive ability greatly depends on the model specification.

Introduction

The introduction of animals from a different environment or population is a common practice in some livestock breeding populations (Lo, 1994). This practice is done to increase the genetic performance or to reduce the inbreeding in the current population. The inclusion of external animals should be appropriately accounted for in the genetic evaluation model; otherwise, the estimated breeding values (EBV) or the genomic EBV (i.e., GEBV if genomic information is used) may be not accurately assessed and comparison of birds in the same selection group coming from distinct populations can be compromised. Two issues arise from this topic: first, the methods to include external animals in the evaluation, and second, the comparison between those methods and how to choose the best one.

Nowadays, the most used methods to account for the difference in the origin of the animals in the evaluation are to: 1) include an extra fixed effect in the model, 2) use genetic groups (Quaas, 1988), and 3) use metafounders (MF; Legarra et al., 2015). The addition of an extra fixed effect in the model can be done by fitting a cross-classified effect. This can be interpreted as the inclusion of the mean of each group of animals, for example, one mean per line or origin.

Regarding genetic groups, they are widely used in animal breeding to model the lack of pedigree information. In this setting, animals are assigned to groups according to a specific criterion. Hence, this approach allows to represent the average genetic merit of a group of individuals with missing parents and this is known as unknown parent groups (UPG). The UPG can be easily introduced in genetic evaluations by modifying the inverse of the pedigree relationship matrix (A-1), as proposed by Quaas (1988). To obtain a more accurate estimation of actual relationships, inbreeding could be considered for UPG (VanRaden, 1992). Additionally, UPG can be considered as random effects in the model (Sullivan and Schaeffer, 1994; Legarra et al., 2007; Misztal et al., 2013).

With the advent of genomic information, genetic groups can be used for two purposes: first, to model missing pedigrees as in a pedigree-based evaluation (Misztal et al., 2013), and second, to overcome the incompatibility between the pedigree relationship matrix (A) and the genomic relationship matrix (G) (Vitezica et al., 2011). Both issues can be solved with MF theory (Christensen, 2012; Legarra et al., 2015). The basic difference between MF and UPG is that relationships can exist among MF but not among UPG. MF under single-trait models provided the best results among other methods to model missing pedigrees in simulated datasets (Bradford et al., 2019).

The performance of genetic evaluation models is usually evaluated by cross-validation (Gianola and Schön, 2016). A statistic widely used for this purpose is the predictive ability, which is defined as the correlation between EBV and the phenotypes of the individual adjusted for the fixed effects in the model (Legarra et al., 2008). When divided by the square root of the heritability, the predictive ability gives an estimator of the accuracy of the model. Legarra and Reverter (2018) derived a method called the LR method to obtain estimates of bias, dispersion, accuracy, and other parameters related to the quality of EBV. The statistics from the LR method are easy to obtain and, theoretically, work with any type of model (Legarra and Reverter, 2018). However, since it is a new method, it has not been tested with complex models and large data sets.

The primary objective of this study was to model the inclusion of a group of external birds into a local population using either a fixed effect representing the origin of parents, UPG, or MF. As a secondary goal, we determined whether the predictive ability or LR method gives a more reliable estimation of the accuracy of GEBV.

Materials and Methods

Data

The dataset used in this study was provided by Cobb-Vantress Inc. (Siloam Springs, AR). The traits used in the evaluation were weight (WGT), carcass yield component 1 (CYC1), feed efficiency 1 (FE1), carcass yield component 2 (CYC2), and feed efficiency 2 (FE2). The number of records and genotyped birds per trait and the number of birds used to validate the models (i.e., focal individuals) are presented in Table 1. Genotypes from the 60k single nucleotide polymorphism (SNP) panel were available for 107,216 birds and pedigree information was available for 242,413 birds. About 1% of the birds in the whole pedigree had missing parents. Quality control removed SNP with call rates lower than 0.9, minor allele frequencies lower than 0.05, and heterozygosity deviation greater than 0.15 from the Hardy–Weinberg equilibrium expectation (Wiggans et al., 2009). Markers with unknown position or located on sex chromosomes were also excluded from the analyses. After quality control, 47,952 SNPs were kept for analysis. All birds were classified into two groups according to their origin, that is, local and external individuals. Local individuals belonged to the original Cobb-Vantress Inc. (Siloam Springs, AR) breeding program, and the external individuals belonged to another breeding company and were recently acquired. The number of individuals in each group was 240,929 and 1,484, respectively. The number of animals with missing parents was 2,161 in the first group and 656 in the second group. This classification will be referred to as C1. The birds from the second group of C1 were used as parents and those whose parents were known had records. Based on C1, the birds with both parents known were classified into three groups: 1) with both parents belonging to the original breeding program, 2) with both parents coming from an external population, and 3) with one parent coming from an external population. The number of individuals in each group was 203,759, 18,599, and 16,407, respectively. This second classification will be referred to as C2.

Table 1.

Number of records and number of focal individuals of each trait1

TraitAbbreviationNumber of recordsNumber of focal individuals
WeightWGT74,484 (38,044)8,689 (3,030)
Carcass yield component 1CYC115,293 (14,928)823 (585)
Feed efficiency 1FE159,567 (28,478)8,917 (3,012)
Carcass yield component 2CYC215,208 (14,849)820 (584)
Feed efficiency 2FE260,126 (28,566)8,963 (3,004)
TraitAbbreviationNumber of recordsNumber of focal individuals
WeightWGT74,484 (38,044)8,689 (3,030)
Carcass yield component 1CYC115,293 (14,928)823 (585)
Feed efficiency 1FE159,567 (28,478)8,917 (3,012)
Carcass yield component 2CYC215,208 (14,849)820 (584)
Feed efficiency 2FE260,126 (28,566)8,963 (3,004)

1The number of genotyped animals is in parenthesis.

Table 1.

Number of records and number of focal individuals of each trait1

TraitAbbreviationNumber of recordsNumber of focal individuals
WeightWGT74,484 (38,044)8,689 (3,030)
Carcass yield component 1CYC115,293 (14,928)823 (585)
Feed efficiency 1FE159,567 (28,478)8,917 (3,012)
Carcass yield component 2CYC215,208 (14,849)820 (584)
Feed efficiency 2FE260,126 (28,566)8,963 (3,004)
TraitAbbreviationNumber of recordsNumber of focal individuals
WeightWGT74,484 (38,044)8,689 (3,030)
Carcass yield component 1CYC115,293 (14,928)823 (585)
Feed efficiency 1FE159,567 (28,478)8,917 (3,012)
Carcass yield component 2CYC215,208 (14,849)820 (584)
Feed efficiency 2FE260,126 (28,566)8,963 (3,004)

1The number of genotyped animals is in parenthesis.

The focal individuals used to validate the models were phenotyped individuals from the most recent generation in the pedigree. Table 1 presents the number of focal individuals used for each trait. Hereafter, the subscript w will denote the scenarios when the phenotypes of the focal individuals were used in the evaluation, whereas the subscript p will denote when the phenotypes were not used.

Models

Two five-trait models that included growth and efficiency traits in broilers were used in the analyses. The first model, denoted as MREG, was:

Where y is the vector of phenotypes, b is the vector of fixed effects including generation and sex of the bird, u is the vector of breeding values, p is the vector of permanent environmental effects for WGT, e is the vector of errors, and X, Z, and W are incidence matrices.

The second model, denoted as MFIX, was equal to MREG plus an extra cross-classified fixed effect representing the origin of parents. Therefore, MFIX was:

Where α is a vector representing the C2 classification, and U is an incidence matrix. For this model, records from animals with missing parents were removed from the analyses.

For both models, it was assumed that Var(e)=IVe, Var(u)=HVu, and cov(e,u)=0, where Ve and Vu are the covariance matrices among traits, and H is defined as in Legarra et al. (2009).

Both models were evaluated without the inclusion of UPG (MREGN and MFIXN), with UPG without estimated inbreeding for UPG (VanRaden, 1992) (MREGUPG/MFIXUPG), with UPG and estimated inbreeding for UPG (MREGUPG_INB/MFIXUPG_INB), with random UPG and estimated inbreeding for UPG (MREGUPG_INB_RAN/ MFIXUPG_INB_RAN), and with MF (MREGMF/MFIXMF). Both UPG and MF were defined by using the C1 classification; hence, only two UPG or MF were used in the analyses. These differences within a model will be referred to as “scenarios.” For example, for MREGMF, the model is MREG, whereas the scenario is with MF.

UPG were defined as in Misztal et al. (2013), where the Quaas-Pollack (QP) transformation (Quaas and Pollak, 1981) is applied to the pedigree relationship matrix (A), the genomic relationship matrix (G), and the pedigree relationship matrix among genotyped animals (A22). Inbreeding coefficients estimated as in Aguilar and Misztal (2008) were used to compute both inverses of A and A22; therefore, inbreeding coefficients were updated when inbreeding for UPG was considered. Inbreeding for UPG was calculated as the average inbreeding of known parents of the same period (t), that is, Fupg=F¯pt (VanRaden, 1992).

For MF, their covariance matrix (Γ) was computed by generalized least squares for multiple populations (Legarra et al., 2015; Garcia-Baccino et al., 2017 ) using Gammaf90 from the BLUPF90 software suite (not publicly released). The resulting matrix was Γ=[0.50.380.380.57]. Following the methods in Legarra et al. (2015), the estimated genetic variance (σ^g2) was divided by k=1+diag(Γ)¯/2Γ¯, and the inverse of the numerator relationship matrix was computed using the rules provided in the same study. The variance components used in this study were provided by Cobb-Vantress Inc. (Siloam Springs, AR). Heritability (h^2), genetic, and phenotypic correlations for all traits are presented in Table 2.

Table 2.

Heritability (diagonal), genetic correlation (above-diagonal), and phenotypic correlation (below-diagonal) for each trait

WGTCYC1FE1CYC2FE2
WGT0.160.170.180.08−0.09
CYC10.460.59−0.10−0.49−0.09
FE10.01−0.040.360.070.87
CYC20.01−0.380.060.620.05
FE2−0.37−0.120.730.010.24
WGTCYC1FE1CYC2FE2
WGT0.160.170.180.08−0.09
CYC10.460.59−0.10−0.49−0.09
FE10.01−0.040.360.070.87
CYC20.01−0.380.060.620.05
FE2−0.37−0.120.730.010.24
Table 2.

Heritability (diagonal), genetic correlation (above-diagonal), and phenotypic correlation (below-diagonal) for each trait

WGTCYC1FE1CYC2FE2
WGT0.160.170.180.08−0.09
CYC10.460.59−0.10−0.49−0.09
FE10.01−0.040.360.070.87
CYC20.01−0.380.060.620.05
FE2−0.37−0.120.730.010.24
WGTCYC1FE1CYC2FE2
WGT0.160.170.180.08−0.09
CYC10.460.59−0.10−0.49−0.09
FE10.01−0.040.360.070.87
CYC20.01−0.380.060.620.05
FE2−0.37−0.120.730.010.24

Estimation of the model effects

Estimates for all effects in the models were obtained with single-step genomic best linear unbiased prediction (ssGBLUP; Legarra et al., 2009; Christensen and Lund, 2010) using BLUP90IOD2 (Misztal et al., 2014) with the Algorithm for Proven and Young (APY) (Misztal, 2016). For BLUP90IOD2, the mixed model equations were solved with the preconditioned conjugate gradient algorithm, with a convergence statistic equal to the squared norm of the difference between the right-hand side and the coefficient matrix times the current vector of solutions, divided by the squared norm of the right-hand side. A convergence criterion of 10–14 was adopted. For more details, the reader is referred to Tsuruta et al. (2001). The genomic relationship matrix was computed by the first method of VanRaden (2008) with observed allele frequencies, except for the scenarios with MF, where the allele frequencies were set to 0.5. For APY, the core was comprised of 15,867 randomly selected birds, which represents more than 99% of the variance in G but avoids further core changes. As APY requires the inverse of the subset of G for core animals, this matrix was blended with 5% of A22.

Validation

All scenarios were compared by the estimated bias, dispersion, and accuracy of the predictions for the focal individuals. The bias and dispersion were estimated as in Legarra and Reverter (2018). The accuracy was estimated by predictive ability (PA) and the LR-method: 1) acc^PA=corr(yadj,u^)h^2, where yadj is the vector of phenotype minus the estimates of all nongenetic effects (Legarra et al., 2008) and u^ is the vector of GEBV and 2) acc^LR=cov(u^w,u^p)(1F¯)σ^u2 (Legarra and Reverter, 2018; Macedo et al., 2020). Both cov and corr correspond to the sample covariance and sample correlation coefficient, respectively. The denominator of acc^LR corresponds to one minus the average inbreeding of the focal individuals (F¯) times the estimate of the additive genetic variance of the trait (σ^u2).

Results

Models with fixed UPG did not converge; hence, the results for MREGUPG and MFIXUPG were not obtained. The lack of convergence is because the coefficient matrix became nonpositive definite. All the results regarding UPG with estimated inbreeding and random UPG were the same due to the small number of UPG in the models and the large number of records for each trait. Thus, referring to one of them will suffice.

Figure 1 shows the rounds until convergence for each model and each scenario. For both models, the use of MF reached convergence in 500 to 1,000 rounds earlier than the rest of the methods. The slowest convergence was seen for fixed and random UPG with estimated inbreeding.

Rounds until convergence for each scenario. The first box corresponds to MREG, while the second corresponds to MFIX. The dashed line denotes the logarithm with base 10 of the convergence criterion. MREG refers to a model without parents’ origin fitted as a fixed effect. MFIX refers to a model with parents’ origin fitted as a fixed effect.
Figure 1.

Rounds until convergence for each scenario. The first box corresponds to MREG, while the second corresponds to MFIX. The dashed line denotes the logarithm with base 10 of the convergence criterion. MREG refers to a model without parents’ origin fitted as a fixed effect. MFIX refers to a model with parents’ origin fitted as a fixed effect.

Table 3 presents the estimated accuracy of the predictions for each scenario and each trait. Values near one indicate that the EBV are accurately estimated. The concordance between acc^PA and acc^LR depends on the model and the trait. For MREG, both acc^PA and acc^LR were different for CYC1, FE1, and FE2. When MF were used, the two measures of prediction accuracy became similar for FE1 and FE2. On the other hand, for MFIX, the predictive ability and LR method performed similarly, except for CYC1. The largest difference between acc^PA and acc^LR was 0.34 for MREG and 0.15 for MFIX. In general, differences among models were not large. The highest accuracies were obtained using MF in MREG, whereas, for MFIX, the greatest accuracies were obtained with UPG with estimated inbreeding. Within models, no differences were observed among scenarios for some traits (e.g., CYC1).

Table 3.

Accuracy of predictions for each combination of model scenario1 and each trait estimated with both acc^PA and acc^LR

WGTCYC1FE1CYC2FE2
acc^PAacc^LRacc^PAacc^LRacc^PAacc^LRacc^PAacc^LRacc^PAacc^LR
MREGN0.720.720.500.590.390.630.590.570.290.63
MREGUPG_INB0.730.710.500.580.400.590.580.610.320.52
MREGUPG_INB_RAN0.730.710.500.580.400.590.580.610.320.52
MREGMF0.730.700.620.830.690.690.670.800.580.61
MFIXN0.790.740.590.690.630.650.580.610.630.64
MFIXUPG_INB_RAN0.820.790.590.730.630.670.570.650.630.66
MFIXUPG_RAN0.820.790.590.730.630.670.570.650.630.66
MFIXMF0.710.690.580.740.590.620.610.710.520.57
WGTCYC1FE1CYC2FE2
acc^PAacc^LRacc^PAacc^LRacc^PAacc^LRacc^PAacc^LRacc^PAacc^LR
MREGN0.720.720.500.590.390.630.590.570.290.63
MREGUPG_INB0.730.710.500.580.400.590.580.610.320.52
MREGUPG_INB_RAN0.730.710.500.580.400.590.580.610.320.52
MREGMF0.730.700.620.830.690.690.670.800.580.61
MFIXN0.790.740.590.690.630.650.580.610.630.64
MFIXUPG_INB_RAN0.820.790.590.730.630.670.570.650.630.66
MFIXUPG_RAN0.820.790.590.730.630.670.570.650.630.66
MFIXMF0.710.690.580.740.590.620.610.710.520.57

1Models are defined as MREG and MFIX, while scenarios are denoted with the subscript in each line (N, UPG, UPG_INB, UPG_INB_RAN, and MF).

Table 3.

Accuracy of predictions for each combination of model scenario1 and each trait estimated with both acc^PA and acc^LR

WGTCYC1FE1CYC2FE2
acc^PAacc^LRacc^PAacc^LRacc^PAacc^LRacc^PAacc^LRacc^PAacc^LR
MREGN0.720.720.500.590.390.630.590.570.290.63
MREGUPG_INB0.730.710.500.580.400.590.580.610.320.52
MREGUPG_INB_RAN0.730.710.500.580.400.590.580.610.320.52
MREGMF0.730.700.620.830.690.690.670.800.580.61
MFIXN0.790.740.590.690.630.650.580.610.630.64
MFIXUPG_INB_RAN0.820.790.590.730.630.670.570.650.630.66
MFIXUPG_RAN0.820.790.590.730.630.670.570.650.630.66
MFIXMF0.710.690.580.740.590.620.610.710.520.57
WGTCYC1FE1CYC2FE2
acc^PAacc^LRacc^PAacc^LRacc^PAacc^LRacc^PAacc^LRacc^PAacc^LR
MREGN0.720.720.500.590.390.630.590.570.290.63
MREGUPG_INB0.730.710.500.580.400.590.580.610.320.52
MREGUPG_INB_RAN0.730.710.500.580.400.590.580.610.320.52
MREGMF0.730.700.620.830.690.690.670.800.580.61
MFIXN0.790.740.590.690.630.650.580.610.630.64
MFIXUPG_INB_RAN0.820.790.590.730.630.670.570.650.630.66
MFIXUPG_RAN0.820.790.590.730.630.670.570.650.630.66
MFIXMF0.710.690.580.740.590.620.610.710.520.57

1Models are defined as MREG and MFIX, while scenarios are denoted with the subscript in each line (N, UPG, UPG_INB, UPG_INB_RAN, and MF).

Table 4 presents the estimated dispersion of the predictions from the LR method for each scenario and each trait. Values near one indicate that there is no over/under dispersion in the GEBV. For MREG, the GEBV for CYC1, FE1, and FE2 were inflated except when MF were implemented in the evaluation. No considerable inflation was observed for any situation in MFIX. Within the range of 0.90 to 1.10, we considered the inflation/deflation as acceptable. Overall, for MREG, the best performance was achieved when MF were included in the model, whereas the worst performance was with UPG and estimated inbreeding. For MFIX, the best performance was obtained without modeling the genetic groups.

Table 4.

Dispersion of predictions for each scenario1 and each trait

WGTCYC1FE1CYC2FE2
MREGN1.010.870.711.050.62
MREGUPG_INB0.980.850.691.000.62
MREGUPG_INB_RAN0.980.850.691.000.62
MREGMF1.050.901.110.961.01
MFIXN0.990.991.011.000.98
MFIXUPG_INB0.980.960.980.950.95
MFIXUPG_INB_RAN0.980.960.980.950.95
MFIXMF0.990.930.970.920.91
WGTCYC1FE1CYC2FE2
MREGN1.010.870.711.050.62
MREGUPG_INB0.980.850.691.000.62
MREGUPG_INB_RAN0.980.850.691.000.62
MREGMF1.050.901.110.961.01
MFIXN0.990.991.011.000.98
MFIXUPG_INB0.980.960.980.950.95
MFIXUPG_INB_RAN0.980.960.980.950.95
MFIXMF0.990.930.970.920.91

1Models are defined as MREG and MFIX, while scenarios are denoted with the subscript in each line (N, UPG, UPG_INB, UPG_INB_RAN, and MF).

Table 4.

Dispersion of predictions for each scenario1 and each trait

WGTCYC1FE1CYC2FE2
MREGN1.010.870.711.050.62
MREGUPG_INB0.980.850.691.000.62
MREGUPG_INB_RAN0.980.850.691.000.62
MREGMF1.050.901.110.961.01
MFIXN0.990.991.011.000.98
MFIXUPG_INB0.980.960.980.950.95
MFIXUPG_INB_RAN0.980.960.980.950.95
MFIXMF0.990.930.970.920.91
WGTCYC1FE1CYC2FE2
MREGN1.010.870.711.050.62
MREGUPG_INB0.980.850.691.000.62
MREGUPG_INB_RAN0.980.850.691.000.62
MREGMF1.050.901.110.961.01
MFIXN0.990.991.011.000.98
MFIXUPG_INB0.980.960.980.950.95
MFIXUPG_INB_RAN0.980.960.980.950.95
MFIXMF0.990.930.970.920.91

1Models are defined as MREG and MFIX, while scenarios are denoted with the subscript in each line (N, UPG, UPG_INB, UPG_INB_RAN, and MF).

Table 5 presents the estimated bias of the predictions for each scenario and each trait, expressed proportionally to the genetic standard deviation of each trait. Values near zero indicate that the EBV are unbiased. For all the traits, the bias was reduced when a fixed effect representing the origin of parents was fitted to the data (i.e., MFIX). The absolute value of the bias for MREG ranged from 0.10 to 0.69 standard deviations, whereas for MFIX the absolute bias ranged from 0.01 to 0.18. Except for WGT, the least biased predictions for MREG were obtained when MF were included in the model. For MFIX, the least biased predictions were obtained without including genetic groups and with UPG with estimated inbreeding.

Table 5.

Bias of predictions for each scenario1 expressed in terms of the genetic standard deviation of each trait

WGTCYC1FE1CYC2FE2
MREGN−0.100.30−0.69−0.24−0.59
MREGUPG_INB−0.070.27−0.61−0.21−0.51
MREGUPG_INB_RAN−0.070.27−0.61−0.21−0.51
MREGMF0.190.170.52−0.020.40
MFIXN−0.010.030.01−0.020.02
MFIXUPG_INB−0.010.030.01−0.020.02
MFIXUPG_INB_RAN−0.010.030.01−0.020.02
MFIXMF0.030.090.18−0.090.18
WGTCYC1FE1CYC2FE2
MREGN−0.100.30−0.69−0.24−0.59
MREGUPG_INB−0.070.27−0.61−0.21−0.51
MREGUPG_INB_RAN−0.070.27−0.61−0.21−0.51
MREGMF0.190.170.52−0.020.40
MFIXN−0.010.030.01−0.020.02
MFIXUPG_INB−0.010.030.01−0.020.02
MFIXUPG_INB_RAN−0.010.030.01−0.020.02
MFIXMF0.030.090.18−0.090.18

1Models are defined as MREG and MFIX, while scenarios are denoted with the subscript in each line (N, UPG, UPG_INB, UPG_INB_RAN, and MF).

Table 5.

Bias of predictions for each scenario1 expressed in terms of the genetic standard deviation of each trait

WGTCYC1FE1CYC2FE2
MREGN−0.100.30−0.69−0.24−0.59
MREGUPG_INB−0.070.27−0.61−0.21−0.51
MREGUPG_INB_RAN−0.070.27−0.61−0.21−0.51
MREGMF0.190.170.52−0.020.40
MFIXN−0.010.030.01−0.020.02
MFIXUPG_INB−0.010.030.01−0.020.02
MFIXUPG_INB_RAN−0.010.030.01−0.020.02
MFIXMF0.030.090.18−0.090.18
WGTCYC1FE1CYC2FE2
MREGN−0.100.30−0.69−0.24−0.59
MREGUPG_INB−0.070.27−0.61−0.21−0.51
MREGUPG_INB_RAN−0.070.27−0.61−0.21−0.51
MREGMF0.190.170.52−0.020.40
MFIXN−0.010.030.01−0.020.02
MFIXUPG_INB−0.010.030.01−0.020.02
MFIXUPG_INB_RAN−0.010.030.01−0.020.02
MFIXMF0.030.090.18−0.090.18

1Models are defined as MREG and MFIX, while scenarios are denoted with the subscript in each line (N, UPG, UPG_INB, UPG_INB_RAN, and MF).

Discussion

The best way to model the inclusion of a group of external individuals in the evaluation was via the addition of an extra fixed effect representing the origin of the parents of the individuals, that is, local and external. Although the impact in accuracy is small, the GEBV with MFIX were almost unbiased and without inflation/deflation. For all traits, the phenotypic means in C2 were different among groups. Thus, dismissing α leads to a model misspecification (Rencher and Schaalje, 2008, Section 7.9).

Although the data and results show that α should be included in the model, this new parameter is hard to interpret in this situation because all focal individuals were born in the local environment. Hence, it is likely that α may be modeling a genetic effect absent in MREG. In fact, fitting α as a fixed effect accounts for nongenetic and genetic differences between parents of local and external individuals, whereas UPG account only for the genetic differences. If only UPG are considered, the nongenetic differences are not going to be modeled. The consideration of α as a fixed effect instead of random was to avoid the reestimation of variance components and to keep model simplicity. Although the reestimation of variance components may not be a limitation for smaller datasets, it can be time-consuming when the number of genotyped animals and traits is large. This can lead to a limitation in the interval between successive genetic evaluations and, therefore, become a problem for companies.

In this study, two estimators for the accuracy were used. Predictive ability is widely used but depends on model specification and will tend to be higher with over-parametrized models. That is, when increasing the number of fixed effects, the predictive ability will tend to overestimate the accuracy of GEBV. As an extreme example, with a saturated model (one fixed effect per animal) acc^PA will be equal to 1, whereas acc^LR will be equal to zero. It can be observed in Table 3 that acc^PA greatly varied when the extra fixed effect was added into the model, whereas acc^LR showed smaller changes. The average difference between acc^PA and acc^LR for MREG was 0.10, whereas for MFIX it was 0.06. It can be observed that for some traits acc^PA greatly varied from MREG to MFIX. For example, it varied from 0.39 to 0.63 for FE1 when no genetic groups were modeled. On the other hand, acc^LR varied from 0.63 to 0.65 for the same scenario. The addition of a simple fixed effect may not justify the increase in accuracy when using acc^PA. Additionally, as an example, the correlation between GEBV for FE1 estimated with MREGN and MFIXN was 0.92. Clearly, this shows that the accuracy is wrongly estimated by acc^PA in MREG. For the same scenario in Table 5, it can be observed that the bias greatly decreased from −0.69 to 0.01 standard deviations. Thus, it can be deduced that acc^PA is more influenced by the model specification than acc^LR. In other words, when a fixed effect like α is omitted from the model, the estimates of the accuracy with predictive ability greatly vary. Therefore, the use of acc^LR for further evaluations is recommended since it depends less on the modeling of fixed effects. Different ways to compute acc^LR exist depending on how the denominator is constructed (Legarra and Reverter, 2018; Macedo et al., 2020). However, the differences may be small and all the forms are proportional.

The accuracy of predictions with MF estimated with both acc^PA and acc^LR decreased significantly from MREG to MFIX. Also, predictions with MF in MFIX were more biased than with other methods. These issues may be related with the scaling of the genetic variance due to MF. As mentioned in Materials and Methods, when using MF in ssGBLUP, the genetic variance and consequently the inverse of the relationship matrix are divided by k. The way of computing k implicitly assumes that all MF have the same amount of information in the population. When groups to define MF are very unbalanced, as, in the case of C1, this assumption may be incorrect. If the group of external birds was larger, there would possibly be no issues of reduced accuracy and biased predictions with the inclusion of MF. However, further research is needed to solve this issue when the groups used to define MF are unbalanced and no other group definition can be used to determine the number of MF. In agreement with previous studies (van Grevenhof et al., 2019), the use of MF resulted in better convergence properties than other methods because of a reduction in the condition number of the system.

Conclusions

The inclusion of animals from an external group into local genetic evaluations should be correctly modeled to obtain accurate, unbiased, and not under-/over-dispersed GEBV. This can be done by including an extra fixed effect in the model, by using UPG or MF. For most of the traits in this study, the best scenario includes an extra fixed effect and with an extra adjustment as UPG with estimated inbreeding or MF. When the groups to define the MF greatly differ in the number of animals, predictions can be biased. In such a situation, it may be better to use UPG. Further research on this topic is needed. In this study, UPG without estimated inbreeding did not converge. Thus, taking into account estimated inbreeding for UPG is recommended. The use of MF reduced the condition number of the system, resulting in faster convergence for both models; however, this result cannot be generalized to other models and datasets. The estimation of accuracy using the LR method is less sensitive to the fixed effects of the model than predictive ability, therefore, being a proper method to the estimate accuracy of GEBV, independently of model specification.

Abbreviations

    Abbreviations
     
  • APY

    Algorithm for Proven and Young

  •  
  • BLUP

    best linear unbiased prediction

  •  
  • CYC1

    carcass yield component 1

  •  
  • CYC2

    carcass yield component 2

  •  
  • EBV

    estimated breeding values

  •  
  • FE1

    feed efficiency 1

  •  
  • FE2

    feed efficiency 2

  •  
  • GEBV

    genomic estimated breeding values

  •  
  • MF

    metafounders

  •  
  • MREGMF/MFIXMF

    models evaluated with MF

  •  
  • MREGN/MFIXN

    models evaluated without the inclusion of UPG

  •  
  • MREGUPG/MFIXUPG

    models evaluated with UPG without estimated inbreeding for UPG

  •  
  • MREGUPG_INB/MFIXUPG_INB

    models evaluated with UPG and estimated inbreeding for UPG

  •  
  • MREGUPG_INB_RAN/MFIXUPG_INB_RAN

    models evaluated with random UPG and estimated inbreeding for UPG

  •  
  • ssGBLUP

    single-step GBLUP

  •  
  • UPG

    unknown parent groups

  •  
  • WGT

    weight

Acknowledgment

This study was supported by Cobb-Vantress Inc. (Siloam Springs, AR).

Conflict of interest statement

The authors declare that they do not have any conflict of interest.

Data Availability

The data belong to Cobb-Vantress and, therefore, cannot be shared.

Literature Cited

Aguilar
,
I.
, and
I.
Misztal
.
2008
.
Technical Note: Recursive algorithm for inbreeding coefficients assuming nonzero inbreeding of unknown parents
.
J. Dairy Sci
.
91
:
1669
1672
. doi:10.3168/jds.2007-0575

Bradford
,
H. L.
,
Y.
Masuda
,
P. M.
Vanraden
,
A.
Legarra
, and
I.
Misztal
.
2019
.
Modeling missing pedigree in single-step genomic BLUP
.
J. Dairy Sci
.
102
(
3
):
2336
2346
. doi:10.3168/jds.2018-15434

Christensen
,
O. F
.
2012
.
Compatibility of pedigree-based and marker-based relationship matrices for single-step genetic evaluation
.
Genet. Sel. Evol
.
44
:
37
. doi:10.1186/1297-9686-44-37

Christensen
,
O. F.
, and
M. S.
Lund
.
2010
.
Genomic prediction when some animals are not genotyped
.
Genet. Sel. Evol
.
42
:
2
. doi:10.1186/1297-9686-42-2

Garcia-Baccino, C. A., A. Legarra, O. F. Christensen, I. Misztal, I. Pocrnic, Z. G. Vitezica, R. J. C. Cantet. 2017. Metafounders are related to F st fixation indices and reduce bias in single-step genomic evaluations. Gen. Sel. Evol. 49(1):34. doi:10.1186/s12711-017-0309-2

Gianola
,
D.
, and
C. C.
Schön
.
2016
.
Cross-validation without doing cross-validation in genome-enabled prediction
.
G3 (Bethesda)
.
6
(
10
):
3107
3128
. doi:10.1534/g3.116.033381

van Grevenhof
,
E. M.
,
J.
Vandenplas
, and
M. P. L.
Calus
.
2019
.
Genomic prediction for crossbred performance using metafounders
.
J. Anim. Sci
.
97
:
548
558
. doi:10.1093/jas/sky433

Legarra
,
A.
,
I.
Aguilar
, and
I.
Misztal
.
2009
.
A relationship matrix including full pedigree and genomic information
.
J. Dairy Sci
.
92
:
4656
4663
. doi:10.3168/jds.2009-2061

Legarra
,
A.
,
J. K.
Bertrand
,
T.
Strabel
,
R. L.
Sapp
,
J. P.
Sánchez
, and
I.
Misztal
.
2007
.
Multi-breed genetic evaluation in a Gelbvieh population
.
J. Anim. Breed. Genet
.
124
:
286
295
. doi:10.1111/j.1439-0388.2007.00671.x

Legarra
,
A.
,
O. F.
Christensen
,
Z. G.
Vitezica
,
I.
Aguilar
, and
I.
Misztal
.
2015
.
Ancestral relationships using metafounders: finite ancestral populations and across population relationships
.
Genetics
200
:
455
468
. doi:10.1534/genetics.115.177014

Legarra
,
A.
, and
A.
Reverter
.
2018
.
Semi-parametric estimates of population accuracy and bias of predictions of breeding values and future phenotypes using the LR method
.
Genet. Sel. Evol
.
50
:
53
. doi:10.1186/s12711-018-0426-6

Legarra
,
A.
,
C.
Robert-Granié
,
E.
Manfredi
, and
J. M.
Elsen
.
2008
.
Performance of genomic selection in mice
.
Genetics
180
:
611
618
. doi:10.1534/genetics.108.088575

Lo
,
L. L
.
1994
.
Genetic evaluation and selection in multibreed population
[PhD dissertation].
Champaign (IL)
:
University of Illinois at Urbana-Champaign.
Available from http://hdl.handle.net/2142/22306. Accessed on April 2020.

Macedo
,
F. L.
,
A.
Reverter
, and
A.
Legarra
.
2020
.
Behavior of the linear regression method to estimate bias and accuracies with correct and incorrect genetic evaluation models
.
J. Dairy Sci
.
103
(
1
):
529
544
, doi:10.3168/jds.2019-16603

Misztal
,
I
.
2016
.
Inexpensive computation of the inverse of the genomic relationship matrix in populations with small effective population size
.
Genetics
202
:
401
409
. doi:10.1534/genetics.115.182089

Misztal
,
I.
,
S.
Tsuruta
,
D. A. L.
Lourenco
,
I.
Aguilar
,
A.
Legarra
, and
Z.
Vitezica
.
2014
.
Manual for BLUPF90 family of programs.
Available from http://nce.ads.uga.edu/wiki/lib/exe/fetch.php?media=blupf90_all7.pdf. Accessed on April 2020.

Misztal
,
I.
,
Z.
Vitezica
,
A.
Legarra
,
I.
Aguilar
, and
A.
Swan
.
2013
.
Unknown-parent groups in single-step genomic evaluation
.
J. Anim. Breed. Genet
.
130
:
252
258
. doi:10.1111/jbg.12025

Quaas
,
R. L
.
1988
.
Additive genetic model with groups and relationships
.
J. Dairy Sci
.
71
(
5
):
1338
1345
. doi:10.3168/jds.S0022-0302(88)79691-5

Quaas
,
R. L.
, and
E. J.
Pollak
.
1981
.
Modified equations for sire models with groups
.
J. Dairy Sci
.
64
:
1868
1872
. doi:10.3168/jds.S0022-0302(81)82778-6.

Rencher
,
A.
, and
G.
Schaalje
.
2008
.
Linear models in statistics
.
Hoboken
:
Wiley & Sons.

Sullivan
,
P. G.
, and
L. R.
Schaeffer
.
1994
.
Fixed versus random genetic groups
. In:
Proceedings of the 5th World Congress on Genetics Applied to Livestock Production;
August 7 to 12 1994.
Guelph (ON, Canada)
:
Department of Animal & Poultry Science, University of Guelph
. Vol.
18
; pp.
483
486
.

Tsuruta
,
S.
,
I.
Misztal
, and
I.
Strandén
.
2001
.
Use of the preconditioned conjugate gradient algorithm as a generic solver for mixed-model equations in animal breeding applications
.
J. Anim. Sci
.
79
:
1166
1172
. doi:10.2527/2001.7951166x

VanRaden
,
P.M
.
1992
.
Accounting for inbreeding and crossbreeding in genetic evaluation of large populations
.
J. Dairy Sci
.
75
(
11
):
3136
3144
. doi:10.3168/jds.S0022-0302(92)78077-1

VanRaden
,
P. M
.
2008
.
Efficient methods to compute genomic predictions
.
J. Dairy Sci
.
91
:
4414
4423
. doi:10.3168/jds.2007-0980

Vitezica
,
Z. G.
,
I.
Aguilar
,
I.
Misztal
, and
A.
Legarra
.
2011
.
Bias in genomic predictions for populations under selection
.
Genet. Res. (Camb)
.
93
:
357
366
. doi:10.1017/S001667231100022X

Wiggans
,
G. R.
,
T. S.
Sonstegard
,
P. M.
VanRaden
,
L. K.
Matukumalli
,
R. D.
Schnabel
,
J. F.
Taylor
,
F. S.
Schenkel
, and
C. P.
Van Tassell
.
2009
.
Selection of single-nucleotide polymorphisms and quality of genotypes used in genomic evaluation of dairy cattle in the United States and Canada
.
J. Dairy Sci
.
92
:
3431
3436
. doi:10.3168/jds.2008-1758

This is an Open Access article distributed under the terms of the Creative Commons Attribution-NonCommercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact [email protected]