Abstract

In a genetically admixed population, admixed individuals possess genealogical and genetic ancestry from multiple source groups. Under a mechanistic model of admixture, we study the number of distinct ancestors from the source populations that the admixture represents. Combining a mechanistic admixture model with a recombination model that describes the probability that a genealogical ancestor is a genetic ancestor, for a member of a genetically admixed population, we count genetic ancestors from the source populations—those genealogical ancestors from the source populations who contribute to the genome of the modern admixed individual. We compare patterns in the numbers of genealogical and genetic ancestors across the generations. To illustrate the enumeration of genetic ancestors from source populations in an admixed group, we apply the model to the African-American population, extending recent results on the numbers of African and European genealogical ancestors that contribute to the pedigree of an African-American chosen at random, so that we also evaluate the numbers of African and European genetic ancestors who contribute to random African-American genomes. The model suggests that the autosomal genome of a random African-American born in the interval 1960–1965 contains genetic contributions from a mean of 162 African (standard deviation 47, interquartile range 127–192) and 32 European ancestors (standard deviation 14, interquartile range 21–43). The enumeration of genetic ancestors can potentially be performed in other diploid species in which admixture and recombination models can be specified.

Introduction

The genealogical pedigree of any individual person can be viewed as a structure that has been shaped by demographic events such as migrations and population admixtures. The pedigree contains the individual’s recent ancestors, who have contributed in a genealogical sense to the individual, and with increasing probability as time proceeds toward the most recent generations, in a genetic sense as well.

The distinction between genealogical and genetic ancestry is inconsequential in recent generations: an individual necessarily contains genetic material from both parents, and almost certainly from all 4 grandparents and 8 great-grandparents as well. However, genetic transmission involves chromosomal segments, the number of which is finite. Hence, going back in time, the number of genealogical ancestors increases rapidly, and proportionally fewer of them are genetic ancestors: individuals who contribute to the genetic material of the modern individual. In the memorable description of Donnelly (1983), “This means that someone descended from the Scottish poet Robert Burns (born 1759) probably carries some of his genes, but that someone unilineally descended from the English playwright William Shakespeare (born 1564) is unlikely to have any genes in common with him.”

A number of studies have explored the peculiar consequences of the distinction between genealogical and genetic ancestors (Wiuf and Hein 1997; Baird et al. 2003; Matsen and Evans 2008; Gravel and Steel 2015; Buffalo et al. 2016; Kelleher et al. 2016). For example, one simulation study (Rohde et al. 2004), based on earlier mathematical work (Chang 1999), argued that the most recent genealogical ancestor shared by all living humans might have lived as few as 5,000 years ago, even though the most recent genetic ancestor lived much earlier. The rate at which recent genealogical ancestors dissipate from an individual’s genetic ancestry has been studied by Coop (2013), who used approximations to the human recombination process in order to calculate the number of autosomal fragments a genealogical ancestor passes to a descendant. Through that quantity, Coop (2013) computed the probability that a genealogical ancestor k generations ago is also a genetic ancestor. This analysis finds that although the number of genealogical ancestors grows exponentially in the number of generations back from the present, the number of genetic ancestors grows only linearly.

Recent admixture introduces a new dimension to the challenge of understanding the distinction between genealogical and genetic ancestry. In a recently admixed population, genealogical ancestors ultimately trace to 2 or more source populations. Some of these genealogical ancestors are genetic ancestors and some are not, so that the fraction of the genetic ancestors that trace to a specific source group need not equal the corresponding fraction of the genealogical ancestors that trace to that source.

Building on a mechanistic admixture model (Verdu and Rosenberg 2011), we have devised a model for counting genealogical ancestors in an admixed individual’s pedigree (Mooney et al. 2023), evaluating the numbers of individuals that enter the pedigree from each specific source population. Our goal here is to extend this genealogical model of an admixed pedigree to count the genetic ancestors that enter the pedigree. That is, we seek to count genetic ancestors from a certain source population that contribute to an individual’s genome, considering genetic ancestors in each generation in the pedigree.

To answer the new question posed by the study—how many genetic ancestors from the source populations does the genetic admixture of a random member of an admixed population represent?—we combine 2 mathematical approaches. The first is the extension of the admixture model studied by Mooney et al. (2023). The second is the method of Coop (2013) for approximating the probability that a genealogical ancestor is also a genetic ancestor. We develop a model that counts across the generations both genealogical and genetic ancestors from a certain source population of an admixed individual. We apply it to the African-American population, elaborating on the strictly genealogical approach of Mooney et al. (2023).

For this purpose, extending the work of Mooney et al. (2023), for a member of the admixed population, we study the random number of admixed genealogical ancestors in the pedigree in each generation by proceeding recursively back in time. From this random variable, we evaluate properties of the number of genetic ancestors from the admixed population and the number of genetic ancestors from the source populations, as well as the number of genealogical ancestors from the source populations as studied by Mooney et al. (2023).

The model

Admixture process

We build upon the model of Verdu and Rosenberg (2011) and Mooney et al. (2023), which considers the formation of a new admixed population. Two source populations that were present in generation 0 form the new admixed population in generation 1. After the initial admixture event, in each subsequent generation after generation 1, individuals from both source populations and the admixed population can be parents of an individual in the admixed population. Our interest is in an admixed individual in generation g after the initial admixture.

We call the source populations “source 1” and “source 2.” For each $n=1,2,\ldots,g$, we denote by $s_{1,n-1}$ the probability that for an admixed individual in generation $n$ ($n$ generations after members of generation 0 admix to form generation 1), a specific parent is from source population 1. We denote by $h_{n-1}$ the probability that the parent is from the admixed population, and by $s_{2,n-1}$ the corresponding probability for source 2. Therefore, for each $n=1,2,\ldots,g$, we have $s_{1,n-1}+h_{n-1}+s_{2,n-1}=1$, where $h_0=0$ because the admixed population does not yet exist in generation 0 (Fig. 1). The 2 parents are independent and identically distributed, amounting to an assumption that they are exchangeable members of the previous generation. The population is large, so that the chance that a particular individual is sampled twice can be ignored.

The general admixture model. Starting from generation 0, 2 source populations form an admixed population in generation 1, with admixture proportions s1,0 and s2,0. In the following generations, n=2,3,…,g, the admixed population receives contributions from both the source populations and the admixed population, in proportions s1,n−1, s2,n−1, and hn−1.
Fig. 1.


Genealogical ancestors in a pedigree

Consider Fig. 2a, describing the pedigree of an admixed individual. Tracing back from the admixed individual on each genealogical line, we eventually reach genealogical ancestors from the source populations. In each lineage that reaches ancestors who are only in source populations, we tabulate only the most recent one in our count of genealogical ancestors from source populations.

Counting genealogical and genetic ancestors from the source populations for an admixed individual. a) Pedigree of an admixed individual. Ancestors can be from source populations or from the admixed population itself. Ancestors from the source populations can be both genealogical and genetic ancestors (solid color), or genealogical ancestors only (striped). Along each genealogical line that reaches a source population, we count the most recent ancestor (dark color). b) Counting genealogical ancestors from source populations. For the pedigree in a), this panel goes back in time from an admixed individual in generation g (circled), on each line stopping when a source population is reached. The number of individuals from source 1 is 4 (red), and the number from source 2 is 3 (blue). c) Counting genetic ancestors from source populations. As in b), we traverse all admixed individuals in the pedigree, irrespective of genetic ancestry status. However, if a source-1 or source-2 ancestor is not a genetic ancestor, then that individual is not tabulated. Note that for ease of interpretation, the figure contains a higher number of genealogical but nongenetic ancestors than is likely in real pedigrees.
Fig. 2.


In the figure, some genealogical ancestors are genetic ancestors and some are not. In Mooney et al. (2023), we counted genealogical ancestors; the mathematical strategy followed previous studies (Verdu and Rosenberg 2011; Goldberg et al. 2014; Goldberg and Rosenberg 2015; Goldberg et al. 2020; Kim et al. 2021), in which source ancestry proportions were calculated recursively, beginning with the count of ancestors one generation after the initial admixture (n=1), and moving forward in time.

To count genetic ancestors, the approach of Mooney et al. (2023) is not straightforward to apply, because the probability that a genealogical ancestor is a genetic ancestor depends on that ancestor’s number of generations back from the present, even if the admixture process itself is constant in time. Further, a genetic ancestor of an individual in some generation $g-n$, with $0<n\le g$, is not necessarily a genetic ancestor of the individual of interest in generation $g$.

To address these problems, we develop a model in which we count genealogical and genetic ancestors by proceeding backward in time (Fig. 2b and c). Tracing back from the admixed individual of interest in generation g, we examine, in each step, the parents of all the admixed individuals present in the pedigree. We tabulate those who are from a certain source population in our count of genealogical ancestors from that source population (Fig. 2b). We tabulate as genetic ancestors those who, in addition to being genealogical ancestors from the source, are also genetic ancestors (Fig. 2c). For this step, we use the calculations of Coop (2013) for generationwise probabilities of genetic ancestry.

Genetic ancestors and recombination

Coop (2013) used a model of recombination in humans to evaluate the probability that 2 individuals with an ancestor–descendant relationship share at least 1 piece of DNA. In other words, the model gives an approximate probability that a descendant separated by k generations from a genealogical ancestor possesses at least 1 genomic fragment from the ancestor. The model takes into account approximations to the recombination process.

In the model of Coop (2013), the number of genomic fragments that a genealogical ancestor passes to a descendant $k$ generations forward in time is treated as a random variable $N_k$. This random variable is approximated as Poisson-distributed owing to an assumption that recombination breakpoints are Poisson-distributed. The probability $p_k$ that a genealogical ancestor is a genetic ancestor to a $k$-generation descendant then equals $1-P[N_k=0]=1-e^{-\lambda_k}$, where $\lambda_k$ is the Poisson mean $E[N_k]$.

Considering the autosomal genome, the mean number of genomic pieces that a parent passes to its offspring, $\lambda_1$, is 22, the number of autosomes. In each generation, a crossover event occurs on average every 100 megabases (Mb), adding 1 piece. Because the haploid genome is about 3,300 Mb long, 33 pieces are added on average in each generation after the first. In each generation back in time after the first, those pieces are distributed between 2 parents. Hence, in generation $k\ge 2$, the total number of pieces for one of an individual’s 2 genomic copies, maternal or paternal, is $22+33(k-1)$. Those pieces trace to $2^{k-1}$ genealogical ancestors $k$ generations back from the present. Hence, the mean number of fragments contributed by a specific ancestor $k$ generations back from the present is $\lambda_k=[22+33(k-1)]/2^{k-1}$. The Poisson probability that at least 1 fragment traces to such an ancestor then equals 1 minus the probability that no fragments trace to the ancestor, or for $k\ge 2$,

$$p_k = 1 - e^{-[22+33(k-1)]/2^{k-1}}. \tag{1}$$

We also define $p_1=1$.

Figure 3 shows pk across the generations, illustrating its decline as k increases. With a 25-year generation time, the claim (Donnelly 1983) that an individual living in 1983, say, born in 1960, probably possesses genetic material from a randomly chosen genealogical ancestor born in 1759 corresponds to 8 generations and p8=0.8615. The claim that the individual probably does not possess genetic material from a randomly chosen genealogical ancestor born in 1564 corresponds to 16 generations and p16=0.0157. Interestingly, the period in which this probability of sharing genetic material with an ancestor decreases from a high to a low number corresponds to the period of interest in the founding of the African-American population, on which our example analysis focuses.
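
As a rough numerical check of the values quoted above, the following Python sketch evaluates $p_k$ from the quantities stated in the text (22 autosomes, a haploid genome of about 3,300 Mb, and one crossover per 100 Mb on average). The function name and parameter names are illustrative choices, not notation from Coop (2013), and the defaults are the human values assumed above.

```python
import math

def prob_genetic_ancestor(k, q=22, genome_mb=3300.0, mb_per_crossover=100.0):
    """Approximate probability that a genealogical ancestor k generations back
    is also an autosomal genetic ancestor (Poisson approximation)."""
    if k < 1:
        raise ValueError("k must be a positive integer")
    if k == 1:
        return 1.0  # p_1 is defined to be 1: a parent always transmits DNA
    pieces_added = genome_mb / mb_per_crossover      # 33 per generation for humans
    # Mean number of fragments traced to one specific k-generation ancestor
    mean_fragments = (q + pieces_added * (k - 1)) / 2 ** (k - 1)
    return 1.0 - math.exp(-mean_fragments)

# The two generations discussed in the text, plus a few others for context
for k in (2, 4, 8, 12, 16):
    print(k, round(prob_genetic_ancestor(k), 4))
```

With the default arguments, the values for $k=8$ and $k=16$ agree with the probabilities 0.8615 and 0.0157 quoted above. Supplying other values of `q`, `genome_mb`, and `mb_per_crossover` gives the corresponding approximation for another diploid organism.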

The probability pk that a genealogical ancestor is an (autosomal) genetic ancestor as a function of the number of generations back in time from the present. This plot is based on Eq. 1.
Fig. 3.


The human-specific Eq. 1 can be written in a more general form suitable for other diploid organisms. Denote the number of pairs of autosomes by q and the haploid genome length in megabases by ℓ. Denote by m the distance in megabases over which the mean number of crossover events is 1. As in the special case for humans, p1=1. The probability that a k-generation (k>1) genealogical ancestor is also a genetic ancestor is

$$p_k = 1 - \exp\left(-\frac{q + (\ell/m)(k-1)}{2^{k-1}}\right). \tag{2}$$

The computation requires basic parameters of genomes and recombination maps, quantities that are available for diverse organisms (Milo and Phillips 2015; Stapley et al. 2017).

Results for the general model

To count genetic ancestors from source populations in a pedigree of a random admixed individual, we first trace the pedigree back, counting admixed individuals. We then use the count of admixed genealogical ancestors to count genetic ancestors. We also show how this approach can be used to recover the distribution of the number of genealogical ancestors from source populations in each generation, extending beyond calculations from Mooney et al. (2023) that focused on the expectation.

Counting admixed individuals in a pedigree

Continuing to consider a model with $g$ generations, we now index generations by $k$, setting $k=0$ in generation $g$, with $k$ increasing backward in time. Let $X_k$ be the random number of admixed individuals in the pedigree at step $k$. When $k=0$, we consider a random admixed individual of interest in generation $g$, and $X_0=1$. For $1\le k\le g$, we proceed backward in time. At step $k$, or generation $g-k$, a randomly chosen parent of an admixed individual in the previous step, or generation $g-(k-1)$, has probability $h_{g-k}$ of being an admixed individual. Consequently, because an individual has 2 parents, $X_k\sim\mathrm{Bin}(2X_{k-1},h_{g-k})$.

The number of admixed individuals in the pedigree is a nonhomogeneous branching process going back in time. It follows from Appendix A that for $0\le k\le g$,

$$E[X_k] = 2^k\prod_{j=1}^{k} h_{g-j}, \tag{3}$$
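$$\mathrm{Var}[X_k] = \sum_{j=1}^{k} 2^{2k-j}\,(1-h_{g-j})\Big(\prod_{i=1}^{j} h_{g-i}\Big)\Big(\prod_{i=j+1}^{k} h_{g-i}^{2}\Big), \tag{4}$$

where an empty product equals 1 and an empty sum equals 0.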
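
The recursion $X_k\sim\mathrm{Bin}(2X_{k-1},h_{g-k})$ determines the first two moments of $X_k$ through the laws of total expectation and total variance. The sketch below computes them step by step; the function name and the example values of $h_n$ are hypothetical and serve only to illustrate the bookkeeping between the forward index $n$ and the backward index $k$.

```python
def admixed_ancestor_moments(h):
    """Moments of X_k for k = 0..g.

    h[n] is the admixed-population contribution h_n for n = 0..g-1, with h[0] = 0.
    Returns two lists, (mean, var), indexed by the backward step k.
    """
    g = len(h)
    mean, var = [1.0], [0.0]          # X_0 = 1: the focal individual
    for k in range(1, g + 1):
        hk = h[g - k]                 # step k looks at generation g - k
        # E[X_k] = 2 h E[X_{k-1}]
        mean.append(2 * hk * mean[k - 1])
        # Var[X_k] = E[Var(X_k | X_{k-1})] + Var(E[X_k | X_{k-1}])
        var.append(2 * hk * (1 - hk) * mean[k - 1] + 4 * hk ** 2 * var[k - 1])
    return mean, var

# Hypothetical contributions h_0, ..., h_4 for a model with g = 5
h_example = [0.0, 0.3, 0.5, 0.7, 0.8]
mean_x, var_x = admixed_ancestor_moments(h_example)
for k, (m, v) in enumerate(zip(mean_x, var_x)):
    print(k, round(m, 4), round(v, 4))
```

Because $h_0=0$, both moments are 0 at step $k=g$: every lineage that is still admixed in generation 1 has parents in the source populations.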

For the sum of the number of admixed genealogical ancestors across all generations, computing the variance of the sum in Appendix A, we have

(5)
(6)

Genealogical ancestors

In step $k$, $1\le k\le g$, let $U_{ki}$ be the random number of source-1 genealogical ancestors of the generation-$g$ admixed individual who are parents of individual $i$, one of the $X_{k-1}$ admixed individuals in the pedigree in step $k-1$. Proceeding back in time, after step $k$, $\sum_{\ell=1}^{k}\sum_{i=1}^{X_{\ell-1}}U_{\ell i}$ genealogical ancestors from source 1 have been counted (Fig. 2b).

Random variable Uki takes values 0, 1, and 2, with probabilities as follows:

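$$P[U_{ki}=u]=\begin{cases}(h_{g-k}+s_{2,g-k})^2, & u=0,\\ 2s_{1,g-k}(h_{g-k}+s_{2,g-k}), & u=1,\\ s_{1,g-k}^2, & u=2.\end{cases} \tag{7}$$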

In fact, $U_{ki}\sim\mathrm{Bin}(2,s_{1,g-k})$, as $1-s_{1,g-k}=h_{g-k}+s_{2,g-k}$. The number of source-2 genealogical ancestors can be counted symmetrically by transposing subscripts 1 and 2 in Eq. 7.

The $\{U_{ki}\}_{i=1}^{X_{k-1}}$ are independent and identically distributed. Therefore, using $U_k=\sum_{i=1}^{X_{k-1}}U_{ki}$ to sum across all $X_{k-1}$ admixed genealogical ancestors in step $k-1$, we have for each $k$, $1\le k\le g$,

$$U_k\sim\mathrm{Bin}(2X_{k-1},s_{1,g-k}). \tag{8}$$

Indeed, considering all parents of the admixed individuals in generation $g-(k-1)$, the distribution of the vector of counts of genealogical ancestors in source population 1, the admixed population, and source population 2 can be summarized by a multinomial distribution. If we denote by $U'_k$ the number of source-2 genealogical ancestors reached in generation $g-k$, then

$$(U_k, X_k, U'_k)\sim\mathrm{Multinomial}(2X_{k-1};\ s_{1,g-k},\ h_{g-k},\ s_{2,g-k}). \tag{9}$$

By Eq. 3,

$$E[U_k]=2s_{1,g-k}\,E[X_{k-1}]=2^{k}s_{1,g-k}\prod_{j=1}^{k-1}h_{g-j}. \tag{10}$$

This equation accords with the summand in Eq. 12 of Mooney et al. (2023), noting that generation $i$ in that equation is equivalent to generation $g-k$ in Eq. 10. If we consider all $2^k$ genealogical ancestors of the generation-$g$ admixed individual present in step $k$, $1\le k\le g$, then the expected fraction of them who are source-1 individuals who are parents of step-$(k-1)$ admixed individuals is $E[U_k]/2^k$.

We calculate the variance using the law of total variance together with Eqs. 3 and 4:

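$$\mathrm{Var}[U_k]=E\big[\mathrm{Var}[U_k\mid X_{k-1}]\big]+\mathrm{Var}\big[E[U_k\mid X_{k-1}]\big]=2s_{1,g-k}(1-s_{1,g-k})\,E[X_{k-1}]+4s_{1,g-k}^{2}\,\mathrm{Var}[X_{k-1}]. \tag{11}$$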
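
To illustrate how the moments of $U_k$ follow from those of $X_{k-1}$, the sketch below applies the laws of total expectation and total variance to $U_k\sim\mathrm{Bin}(2X_{k-1},s_{1,g-k})$. The function name and the example parameters are hypothetical; the example uses a single founding admixture event, for which every lineage reaches a source population only at step $k=g$.

```python
def source_genealogical_moments(s1, h):
    """Return {k: (E[U_k], Var[U_k])} for k = 1..g.

    s1[n] and h[n] are the source-1 and admixed contributions for n = 0..g-1.
    """
    g = len(h)
    ex, vx = 1.0, 0.0                 # moments of X_0 = 1
    out = {}
    for k in range(1, g + 1):
        s, hk = s1[g - k], h[g - k]
        # U_k | X_{k-1} ~ Bin(2 X_{k-1}, s)
        out[k] = (2 * s * ex, 2 * s * (1 - s) * ex + 4 * s ** 2 * vx)
        # Advance the moments of X from step k-1 to step k
        ex, vx = 2 * hk * ex, 2 * hk * (1 - hk) * ex + 4 * hk ** 2 * vx
    return out

# Hypothetical example with g = 3: founding admixture only (h_1 = h_2 = 1)
moments = source_genealogical_moments(s1=[0.7, 0.0, 0.0], h=[0.0, 1.0, 1.0])
for k, (mean_u, var_u) in moments.items():
    print(k, round(mean_u, 4), round(var_u, 4))
```

In this example the only nonzero count occurs at $k=3$, where $U_3$ is binomial with $2^3$ trials and success probability 0.7.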

We write $\tilde{s}_{1,g-k}=s_{1,g-k}/(1-h_{g-k})$ for convenience. Summing genealogical ancestors across generations in Eqs. 10 and 11 and computing the variance in Appendix B, we have

(12)
(13)

Genetic ancestors

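Next, we count genetic ancestors. Let $Y_{ki}$ be the number of source-1 genetic ancestors of the generation-$g$ admixed individual who are parents of individual $i$, one of the admixed genealogical ancestors in step $k-1$. Proceeding back in time, after step $k$, $\sum_{\ell=1}^{k}\sum_{i=1}^{X_{\ell-1}}Y_{\ell i}$ genetic ancestors from source 1 have been counted. We have for $1\le k\le g$ probabilities

$$P[Y_{ki}=y]=\begin{cases}[h_{g-k}+s_{2,g-k}+s_{1,g-k}(1-p_k)]^2, & y=0,\\ 2s_{1,g-k}p_k\,[h_{g-k}+s_{2,g-k}+s_{1,g-k}(1-p_k)], & y=1,\\ (s_{1,g-k}p_k)^2, & y=2.\end{cases}$$

Here, $p_k$ is the probability that a genealogical ancestor $k$ generations ago is also a genetic ancestor (Eq. 1). The count of genetic ancestors from source 2 is obtained symmetrically.

We can also see that $Y_{ki}\sim\mathrm{Bin}(2,s_{1,g-k}p_k)$, as $1-s_{1,g-k}p_k=h_{g-k}+s_{2,g-k}+s_{1,g-k}(1-p_k)$.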

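We write $Y_k=\sum_{i=1}^{X_{k-1}}Y_{ki}$ for the number of genetic ancestors tabulated in step $k$. By analogy with the tabulation of genealogical ancestors, we conclude by Eqs. 3 and 4 that for $1\le k\le g$,

$$Y_k\sim\mathrm{Bin}(2X_{k-1},s_{1,g-k}p_k), \tag{14}$$
$$E[Y_k]=2s_{1,g-k}p_k\,E[X_{k-1}]=2^{k}s_{1,g-k}p_k\prod_{j=1}^{k-1}h_{g-j}, \tag{15}$$
$$\mathrm{Var}[Y_k]=2s_{1,g-k}p_k(1-s_{1,g-k}p_k)\,E[X_{k-1}]+4s_{1,g-k}^{2}p_k^{2}\,\mathrm{Var}[X_{k-1}]. \tag{16}$$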
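
The backward counting procedure can also be checked by simulation. The sketch below is a minimal Monte Carlo version under the assumptions stated in the text: each admixed individual in the pedigree has 2 independent parents, each parent is from source 1, the admixed population, or source 2 with the generation-specific probabilities, and each source-1 parent reached at step $k$ is independently a genetic ancestor with probability $p_k$ (human autosomal values). The function names and the parameter values are hypothetical.

```python
import math
import random

def p_genetic(k):
    """Human autosomal approximation of Coop (2013), with p_1 = 1."""
    if k == 1:
        return 1.0
    return 1.0 - math.exp(-(22 + 33 * (k - 1)) / 2 ** (k - 1))

def expected_genetic(s1, h):
    """E[Y_k] for k = 1..g; s1[n] and h[n] are indexed by n = 0..g-1."""
    g = len(h)
    ex, out = 1.0, []                 # ex holds E[X_{k-1}], starting at X_0 = 1
    for k in range(1, g + 1):
        out.append(2 * s1[g - k] * p_genetic(k) * ex)
        ex *= 2 * h[g - k]
    return out

def simulate_once(s1, h, rng):
    """One backward pass; returns lists (U, Y) of source-1 counts for k = 1..g."""
    g = len(h)
    x, U, Y = 1, [], []
    for k in range(1, g + 1):
        s, hk, pk = s1[g - k], h[g - k], p_genetic(k)
        u = y = x_next = 0
        for _ in range(2 * x):        # 2 parents per admixed individual
            r = rng.random()
            if r < s:                 # parent from source 1
                u += 1
                if rng.random() < pk: # also a genetic ancestor
                    y += 1
            elif r < s + hk:          # parent from the admixed population
                x_next += 1
            # otherwise the parent is from source 2 and the lineage stops
        U.append(u)
        Y.append(y)
        x = x_next
    return U, Y

# Hypothetical parameters for a model with g = 6
s1 = [0.5, 0.2, 0.2, 0.2, 0.2, 0.2]
h = [0.0, 0.6, 0.6, 0.6, 0.6, 0.6]
rng = random.Random(0)
reps = 2000
totals = [0.0] * len(h)
for _ in range(reps):
    _, Y = simulate_once(s1, h, rng)
    totals = [t + y for t, y in zip(totals, Y)]
print("simulated E[Y_k]:", [round(t / reps, 3) for t in totals])
print("formula   E[Y_k]:", [round(e, 3) for e in expected_genetic(s1, h)])
```

The simulated means should approach the values from the formula as the number of replicates grows; the simulation makes no claim about real pedigrees beyond the independence assumptions listed above.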

For the sum of the number of genetic ancestors across all generations, computing the variance in Appendix B, we have

(17)
(18)

Among all $2^k$ genealogical ancestors of the generation-$g$ admixed individual who are present in step $k$, $1\le k\le g$, the expected fraction of them who are source-1 individuals who are parents of step-$(k-1)$ admixed individuals and are genetic ancestors is $E[Y_k]/2^k$.

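In the same way that we count genetic ancestors among the genealogical ancestors from the source populations, we can count the number of admixed genealogical ancestors who are also genetic ancestors. Denoting the random number of admixed genetic ancestors in step $k$ by $X_k^*$, this random variable is binomially distributed for $1\le k\le g$, so that

$$X_k^*\sim\mathrm{Bin}(2X_{k-1},h_{g-k}p_k), \tag{19}$$
$$E[X_k^*]=2h_{g-k}p_k\,E[X_{k-1}]=p_k\,E[X_k]=2^{k}p_k\prod_{j=1}^{k}h_{g-j}, \tag{20}$$
$$\mathrm{Var}[X_k^*]=2h_{g-k}p_k(1-h_{g-k}p_k)\,E[X_{k-1}]+4h_{g-k}^{2}p_k^{2}\,\mathrm{Var}[X_{k-1}]. \tag{21}$$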

The expected fraction of the $2^k$ genealogical ancestors of the generation-$g$ admixed individual who are themselves admixed individuals and who are also genetic ancestors is $E[X_k^*]/2^k$.

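Considering all parents of the admixed individuals in generation $g-(k-1)$, the distribution of the vector of counts of genetic ancestors in source population 1, the admixed population, and source population 2 follows a multinomial distribution. If we denote by $Y'_k$ the number of source-2 genetic ancestors reached in generation $g-k$, then

$$(Y_k,\ X_k^*,\ Y'_k,\ 2X_{k-1}-Y_k-X_k^*-Y'_k)\sim\mathrm{Multinomial}(2X_{k-1};\ s_{1,g-k}p_k,\ h_{g-k}p_k,\ s_{2,g-k}p_k,\ 1-p_k), \tag{22}$$

where the final category collects the genealogical ancestors who are not genetic ancestors.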

For the sum of the number of genetic ancestors across generations, we have

(23)
(24)

A single admixture event

We now consider 2 specific cases of the admixture model, where after the initial generation of admixture, the contributions from the 2 sources and from the admixed population are constant across generations. First, we study the case in which the source contributions are 0. We examine the situation in which no subsequent admixture occurs after the admixed population is founded: in other words, $s_{1,0},s_{2,0}>0$ and for all $n$, $1\le n\le g-1$, $s_{1,n}=s_{2,n}=0$ and $h_n=1$.

For each $k=1,2,\ldots,g-1$, the random number of admixed individuals in the pedigree of a randomly chosen admixed individual follows $X_k\sim\mathrm{Bin}(2X_{k-1},1)$. Recalling that $X_0=1$ for the single admixed individual in generation $g$, we have $X_k=2^k$ for all $k=0,1,2,\ldots,g-1$: all $2^k$ ancestors of an individual $k$ generations back from the present are admixed.

To consider genealogical ancestors from the source populations, we separate 2 cases, $1\le k\le g-1$ and $k=g$. For $1\le k\le g-1$, $U_k\sim\mathrm{Bin}(2^k,0)$ and no individuals from sources 1 and 2 are reached. Consequently, $U_k=0$ for all $k$ with $1\le k\le g-1$.

Next, we proceed one generation back from the case of $k=g-1$. If $k=g$, then by Eq. 8, $U_g\sim\mathrm{Bin}(2\cdot 2^{g-1},s_{1,0})$. Therefore, $E[U_g]=2^g s_{1,0}$ and $\mathrm{Var}[U_g]=2^g s_{1,0}(1-s_{1,0})$.

For genetic ancestors, we again separate $1\le k\le g-1$ from $k=g$. For $1\le k\le g-1$, $Y_k\sim\mathrm{Bin}(2^k,0)$, and the count of genetic ancestors is $Y_k=0$ for all $k$ with $1\le k\le g-1$, as is seen with genealogical ancestors. For $k=g$, by Eq. 14, $Y_g\sim\mathrm{Bin}(2^g,s_{1,0}p_g)$. Therefore, $E[Y_g]=2^g s_{1,0}p_g$ and $\mathrm{Var}[Y_g]=2^g s_{1,0}p_g(1-s_{1,0}p_g)$. The numbers of genetic ancestors from the source populations, like the corresponding numbers of genealogical ancestors, are determined by parameters of the initial admixture, as tabulated by $n=0$ looking forward in time, or by $k=g$ looking backward.
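
As a worked illustration of the single-admixture case, the sketch below evaluates the binomial means and variances for one hypothetical choice of parameters, $g=10$ and $s_{1,0}=0.8$, using the human autosomal approximation for $p_g$. The variable names are illustrative only.

```python
import math

def p_genetic(k):
    """Human autosomal approximation of Coop (2013), with p_1 = 1."""
    if k == 1:
        return 1.0
    return 1.0 - math.exp(-(22 + 33 * (k - 1)) / 2 ** (k - 1))

g, s10 = 10, 0.8                      # hypothetical values for illustration
n_lineages = 2 ** g                   # all 2^g lineages reach a source at step g
pg = p_genetic(g)

# Genealogical ancestors from source 1: U_g ~ Bin(2^g, s10)
mean_u = n_lineages * s10
var_u = n_lineages * s10 * (1 - s10)

# Genetic ancestors from source 1: Y_g ~ Bin(2^g, s10 * p_g)
mean_y = n_lineages * s10 * pg
var_y = n_lineages * s10 * pg * (1 - s10 * pg)

print("p_g      :", round(pg, 4))
print("E[U_g], Var[U_g]:", round(mean_u, 2), round(var_u, 2))
print("E[Y_g], Var[Y_g]:", round(mean_y, 2), round(var_y, 2))
```

With these inputs, fewer than half of the expected source-1 genealogical ancestors are expected to be genetic ancestors, because $p_{10}$ is below one half.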

Constant positive admixture

We now examine the situation in which $s_{1,0},s_{2,0}>0$, after which the contributions from the sources are constant and positive. We denote $s_{1,n}=s_1$ and $s_{2,n}=s_2$ for all $n$, $1\le n\le g-1$, with $s_1,s_2>0$. Then $h_n=1-s_{1,n}-s_{2,n}$ is also constant for all $n$, $1\le n\le g-1$; we denote this constant by $h$.

Mathematical results

The number of admixed genealogical ancestors $X_k$ follows a homogeneous branching process. For $k=0$, $E[X_k]=1$. By Eq. 3, for $k=1,2,\ldots,g-1$,

$$E[X_k]=(2h)^k. \tag{25}$$

For $k=g$, $E[X_k]=0$, because $h_0=0$.

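For the variance of the number of admixed genealogical ancestors, by Eq. 4, $\mathrm{Var}[X_0]=0$ and for $1\le k\le g-1$,

$$\mathrm{Var}[X_k]=\begin{cases}\dfrac{(1-h)(2h)^k\,[(2h)^k-1]}{2h-1}, & h\neq\tfrac{1}{2},\\[1ex] \dfrac{k}{2}, & h=\tfrac{1}{2}.\end{cases} \tag{26}$$

For $k=g$, $\mathrm{Var}[X_k]=0$.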

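To count genealogical and genetic ancestors, we again separate $1\le k\le g-1$ from $k=g$. When $k=g$, by Eq. 8, $U_g\sim\mathrm{Bin}(2X_{g-1},s_{1,0})$. Hence, by Eqs. 10 and 25, for genealogical ancestors, we have

$$E[U_g]=2s_{1,0}(2h)^{g-1}. \tag{27}$$

For the variance, starting from Eq. 11 and applying Eqs. 25 and 26, we have

$$\mathrm{Var}[U_g]=2s_{1,0}(1-s_{1,0})(2h)^{g-1}+4s_{1,0}^{2}\,\mathrm{Var}[X_{g-1}], \tag{28}$$

with $\mathrm{Var}[X_{g-1}]$ given by Eq. 26. For $1\le k\le g-1$, by Eq. 8, $U_k\sim\mathrm{Bin}(2X_{k-1},s_1)$. By Eqs. 10 and 25,

$$E[U_k]=2s_1(2h)^{k-1}. \tag{29}$$

We then obtain, by Eqs. 11, 25, and 26,

$$\mathrm{Var}[U_k]=2s_1(1-s_1)(2h)^{k-1}+4s_1^{2}\,\mathrm{Var}[X_{k-1}]. \tag{30}$$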

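For genetic ancestors, when $k=g$, similarly to the calculations for genealogical ancestors, we use Eq. 14 to obtain $Y_g\sim\mathrm{Bin}(2X_{g-1},s_{1,0}p_g)$. By Eqs. 15 and 25,

$$E[Y_g]=2s_{1,0}p_g(2h)^{g-1}. \tag{31}$$

Following the reasoning underlying Eq. 16, with Eqs. 25 and 26,

$$\mathrm{Var}[Y_g]=2s_{1,0}p_g(1-s_{1,0}p_g)(2h)^{g-1}+4s_{1,0}^{2}p_g^{2}\,\mathrm{Var}[X_{g-1}]. \tag{32}$$

For $1\le k\le g-1$, by Eq. 14, $Y_k\sim\mathrm{Bin}(2X_{k-1},s_1p_k)$. Hence, by Eqs. 15 and 25,

$$E[Y_k]=2s_1p_k(2h)^{k-1}. \tag{33}$$

By Eqs. 16, 25, and 26,

$$\mathrm{Var}[Y_k]=2s_1p_k(1-s_1p_k)(2h)^{k-1}+4s_1^{2}p_k^{2}\,\mathrm{Var}[X_{k-1}]. \tag{34}$$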
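
The expectations under constant admixture are simple enough to tabulate directly. The sketch below does so for the setting described for Fig. 4 ($g=15$, $s_{1,0}=s_{2,0}=0.5$, $s_1=s_2$, and $h$ equal to 0.4, 0.5, or 0.6), using $E[X_{k-1}]=(2h)^{k-1}$ together with the binomial forms of $U_k$ and $Y_k$ and the human autosomal approximation for $p_k$. The function names are hypothetical, and the code is intended as an illustration of the formulas rather than a reproduction of the figure.

```python
import math

def p_genetic(k):
    """Human autosomal approximation of Coop (2013), with p_1 = 1."""
    if k == 1:
        return 1.0
    return 1.0 - math.exp(-(22 + 33 * (k - 1)) / 2 ** (k - 1))

def expected_counts(g, h, s1, s1_0):
    """Return {k: (E[U_k], E[Y_k])} for k = 1..g under constant admixture.

    s1 applies for backward steps k = 1..g-1 and s1_0 for the founding step k = g.
    """
    out = {}
    for k in range(1, g + 1):
        s = s1_0 if k == g else s1
        e_u = 2 * s * (2 * h) ** (k - 1)   # 2 s E[X_{k-1}]
        out[k] = (e_u, e_u * p_genetic(k))
    return out

g = 15
for h in (0.4, 0.5, 0.6):
    s1 = (1 - h) / 2                       # s1 = s2, so s1 + h + s2 = 1
    counts = expected_counts(g, h, s1, s1_0=0.5)
    print(f"h = {h}")
    for k in (1, 5, 10, 14, 15):
        e_u, e_y = counts[k]
        print(f"  k={k:2d}  n={g - k:2d}  E[U_k]={e_u:9.4f}  E[Y_k]={e_y:9.4f}")
```

For each $k$, the expected number of genetic ancestors is the expected number of genealogical ancestors multiplied by $p_k$, so the two quantities nearly coincide for small $k$ and diverge as $k$ grows.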

Analysis of temporal trends

In the case of constant positive admixture, we analyze the way in which genealogical and genetic ancestors accumulate across the generations of the admixture process. Comparing generation $k$, $2\le k\le g-1$, to the generation $k-1$ of its offspring, Eq. 29 gives

$$\frac{E[U_k]}{E[U_{k-1}]}=2h.$$

If $h<\frac{1}{2}$, then $2h<1$ and $E[U_k]$ decreases with increasing $k$ and hence decreasing $n=g-k$ (Fig. 4a). The number of admixed ancestors is small, so that the source populations are likely to be reached in a small number of generations back from the present; hence, the numbers of genealogical ancestors from the source populations are also small. The contribution from the admixed population is low enough and the contributions from the source populations are high enough that the number of genealogical ancestors from the source populations is greatest in the most recent generations.

Genealogical and genetic ancestors in a model of constant admixture with g=15, evaluated forward in time from generation n=0 to generation n=g−1=14. The forward-time generation n corresponds to the backward-time generation k=g−n. a–c) Expected number of source-1 genealogical ancestors (Eqs. 27, 29) and genetic ancestors (Eqs. 31, 33). The 3 panels use s1,0=s2,0=0.5 and s1=s2 with different values of h. a) h=0.4. b) h=0.6. c) h=0.5. d) The ratio of the conditional probabilities of genetic ancestry given genealogical ancestry for generations k and k−1, pk/pk−1 (Eq. 1), where k=0 in generation g=15 and n=g−k. Note that this plot stops at n=13 and k=2 with the value of p2/p1.
Fig. 4.


If, on the other hand, $h>\frac{1}{2}$, then $2h>1$ and $E[U_k]$ increases with increasing $k$ and decreasing $n=g-k$ (Fig. 4b). The number of admixed genealogical ancestors is larger than with $h<\frac{1}{2}$, so that the number of genealogical ancestors from the source populations is also larger. With a high contribution from the admixed population to itself, the number of genealogical ancestors from the source populations is greatest farther back in time. A transition occurs at $h=\frac{1}{2}$, where $2h=1$ and $E[U_k]$ is constant in time, equaling $2s_1$ by Eq. 29 (Fig. 4c).

For genetic ancestors, for $2\le k\le g-1$, Eq. 33 gives

$$\frac{E[Y_k]}{E[Y_{k-1}]}=2h\,\frac{p_k}{p_{k-1}}.$$

Although the admixture process is constant in time after the founding of the admixed population, the dependence of $p_k$ on $k$ (Eq. 1) affects the time at which genetic ancestors from the sources accumulate.

We now examine $p_k/p_{k-1}$. Denote the event “a $k$-generation genealogical ancestor is a $k$-generation genetic ancestor” by $A_k$, $k\ge 1$. Irrespective of the form chosen for $P[A_k]$, we argue that

$$\frac{1}{2}\le\frac{P[A_k]}{P[A_{k-1}]}\le 1. \tag{35}$$

For the right-hand side of Eq. 35, a necessary condition for a $k$-generation genealogical ancestor of a descendant to be a $k$-generation genetic ancestor is that it is a $(k-1)$-generation genetic ancestor of the parent of the descendant. In other words, $A_k\subseteq A_{k-1}$ and $P[A_k]\le P[A_{k-1}]$.

For the left-hand side of Eq. 35, because $A_k\subseteq A_{k-1}$,

$$\frac{P[A_k]}{P[A_{k-1}]}=\frac{P[A_k\cap A_{k-1}]}{P[A_{k-1}]}=P[A_k\mid A_{k-1}].$$

$A_k\mid A_{k-1}$ is the event that, conditional on a $k$-generation ancestor transmitting at least 1 genomic segment to the parent of a descendant, the $k$-generation ancestor transmits at least 1 segment to the descendant itself. The probability that a parent transmits a certain segment to an offspring is $\frac{1}{2}$, and therefore $\frac{1}{2}\le P[A_k\mid A_{k-1}]$.

For the functional form of $P[A_k]$ used by Coop (2013), Eq. 1, a proof that $\frac{1}{2}<p_k/p_{k-1}<1$ for all $k\ge 2$ appears in Appendix C. An example of $p_k/p_{k-1}$ appears in Fig. 4d, illustrating a decrease in $p_k/p_{k-1}$ with increasing $k$ and decreasing $n=g-k$.
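
The bounds on the ratio $p_k/p_{k-1}$ can be inspected numerically for the human autosomal form of $p_k$. The sketch below evaluates the ratio for $k=2,\ldots,40$; it is a numerical illustration over a finite range, not a substitute for the proof, and the upper limit of 40 generations is an arbitrary choice. The function name is hypothetical, and `math.expm1` is used only to keep the computation accurate when $p_k$ is very small.

```python
import math

def p_genetic(k):
    """Human autosomal approximation of Coop (2013), with p_1 = 1."""
    if k == 1:
        return 1.0
    mean_fragments = (22 + 33 * (k - 1)) / 2 ** (k - 1)
    # 1 - exp(-x), computed stably for small x
    return -math.expm1(-mean_fragments)

ratios = {k: p_genetic(k) / p_genetic(k - 1) for k in range(2, 41)}

for k in (2, 5, 8, 10, 12, 16, 20, 40):
    print(k, round(ratios[k], 4))

print("all ratios strictly between 1/2 and 1:",
      all(0.5 < r < 1.0 for r in ratios.values()))
```

For large $k$ the ratio is close to the ratio of the Poisson means, $\lambda_k/\lambda_{k-1}$, which is slightly above one half.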

Application to African-Americans

Model and methods

We apply our model to count genetic ancestors for a random individual in the African-American population in the United States. In Mooney et al. (2023), relying on demographic data on the history of the population, we considered a model with g=14 generations, ending in 1960–1965. Using information on current patterns of genetic admixture, we inferred admixture parameters (s1,n,hn,s2,n), with source 1 representing Africans and source 2 representing Europeans. The model divided the demographic history of the population into 3 epochs: 1619–1808, during which the population was founded, with importation of enslaved African captives and admixture with Europeans; 1808–1865, during which enslavement and admixture continued but importation of enslaved persons was illegal; and 1865–1965, after the end of legal enslavement. The 1965 endpoint for the model was chosen to accord with the approximate timing of the birth of individuals in whom genetic ancestry has been measured, and to precede subsequent major demographic changes.

The model considered 25-year generations, initializing the population solely with Africans (s1,0=1,s2,0=0). The first epoch had 7 generations (1635–1640, 1660–1665, 1685–1690, 1710–1715, 1735–1740, 1760–1765, 1785–1790; n=1–7), the second epoch had 3 (1810–1815, 1835–1840, 1860–1865; n=8–10) and the third had 4 (1885–1890, 1910–1915, 1935–1940, 1960–1965; n=11–14). In the first epoch, s2,n was kept constant, and the values of s1,n and hn were specified by estimating the value of s1,n/(s1,n+hn) using demographic data (Hacker 2020) about newly transported enslaved individuals from Africa and births in the African-American population. In both the second and third epochs, s1,n, hn, and s2,n were maintained as constants for all generations in the epoch.

Mooney et al. (2023) identified sets of parameter values that recovered features of genetic ancestry measured in African-Americans: an expected African genetic ancestry in [0.75,0.85] with standard deviation in [0.08,0.15]. A summary of generationwise mean parameter values across all accepted parameter sets appears in Fig. 5a. The figure reports mean values of s1, h, and s2, summarizing distributions that appear in Fig. 4 of Mooney et al. (2023). It shows the high African contribution to the African-American population in the earliest generations (s1), with an increasing contribution of the African-American population to itself (h), and with European contributions occurring across the generations (s2). For each set of accepted parameters, Mooney et al. (2023) calculated the generationwise expected numbers of African and European genealogical ancestors associated with the set.

Generation-specific genealogical and genetic ancestry features for African-Americans. a) Generationwise mean admixture contributions s1 (African), h (African-American), and s2 (European) across accepted parameter sets. Error bars show standard deviations. b) Means across accepted parameter sets of the expected numbers of African-American genealogical and genetic ancestors possessed by a random individual, as calculated by Eqs. 3 and 20. Error bars show the standard deviations of these expected numbers across accepted parameter sets. The values plotted in a) are obtained by summarizing the distributions underlying Fig. 4 of Mooney et al. (2023). The values in b) are given in Table 1.
Fig. 5.

Table 1.

Generation-specific expectations of the numbers of African-American genealogical and genetic ancestors across accepted parameter sets.

Number of African-American ancestors

| Generation (n) | Birth year | Genealogical: mean of expectation | Genealogical: standard deviation of expectation | Genealogical: mean of standard deviation | Genetic: mean of expectation | Genetic: standard deviation of expectation | Genetic: mean of standard deviation |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | 1610–1615 | | | | | | |
| 1 | 1635–1640 | 0.07 | 0.03 | 0.27 | 0.01 | 0.00 | 0.08 |
| 2 | 1660–1665 | 2.32 | 0.98 | 1.83 | 0.40 | 0.17 | 0.66 |
| 3 | 1685–1690 | 8.93 | 3.38 | 4.41 | 2.60 | 0.98 | 1.88 |
| 4 | 1710–1715 | 33.30 | 11.15 | 12.71 | 15.44 | 5.17 | 6.60 |
| 5 | 1735–1740 | 83.09 | 24.37 | 28.93 | 55.91 | 16.39 | 20.05 |
| 6 | 1760–1765 | 97.93 | 25.08 | 33.17 | 84.36 | 21.61 | 28.99 |
| 7 | 1785–1790 | 62.39 | 14.22 | 21.05 | 60.39 | 13.76 | 20.57 |
| 8 | 1810–1815 | 34.43 | 6.97 | 11.51 | 34.33 | 6.95 | 11.52 |
| 9 | 1835–1840 | 19.04 | 3.52 | 6.28 | 19.04 | 3.52 | 6.28 |
| 10 | 1860–1865 | 10.55 | 1.87 | 3.40 | 10.55 | 1.87 | 3.40 |
| 11 | 1885–1890 | 5.84 | 0.77 | 1.83 | 5.84 | 0.77 | 1.83 |
| 12 | 1910–1915 | 3.24 | 0.28 | 0.94 | 3.24 | 0.28 | 0.94 |
| 13 | 1935–1940 | 1.80 | 0.08 | 0.42 | 1.80 | 0.08 | 0.42 |
| Total | | 362.93 | 90.16 | 119.94 | 293.89 | 69.66 | 99.54 |

Suppose $\theta_i$ denotes an accepted parameter set and $\theta=\{\theta_i\}_{i=1}^{|\theta|}$ denotes the collection of all accepted parameter sets. For each generation $n=g-k$ with $g=14$ ($k=1,2,\ldots,g$), the mean of the expectation of the genealogical ancestors is $\mathrm{Mean}_\theta\{E[X_k(\theta_i)]\}$ (Eq. 3; Eq. 20 for genetic ancestors); the standard deviation of the expectation is $\sigma_\theta\{E[X_k(\theta_i)]\}$; the mean of the standard deviation is $\mathrm{Mean}_\theta\{\sqrt{\mathrm{Var}[X_k(\theta_i)]}\}$ (Eq. 4; Eq. 21 for the genetic ancestors). For the total, the mean of the expectation of the genealogical ancestors is $\mathrm{Mean}_\theta\{E[\sum_{k=1}^{g} X_k(\theta_i)]\}$ (Eq. 5; Eq. 23 for the genetic ancestors); the standard deviation of the expectation is $\sigma_\theta\{E[\sum_{k=1}^{g} X_k(\theta_i)]\}$; the mean of the standard deviation is $\mathrm{Mean}_\theta\{\sqrt{\mathrm{Var}[\sum_{k=1}^{g} X_k(\theta_i)]}\}$ (Eq. 6; Eq. 24 for genetic ancestors). The table shows the generationwise values plotted in Fig. 5b for the mean and standard deviation of the expectation.
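The three column summaries described in this note can be sketched as follows, taking the standard deviation as the square root of the variance. The function name `summarize` and the toy inputs are hypothetical, and the use of the population (rather than sample) standard deviation across parameter sets is an assumption, since the text does not specify which was used.

```python
import math
import statistics


def summarize(expectations, variances):
    """Summarize per-parameter-set moments of an ancestor count.

    expectations[i] and variances[i] stand for E[X_k] and Var[X_k] under the
    i-th accepted parameter set (hypothetical inputs for illustration).
    """
    mean_of_expectation = statistics.fmean(expectations)
    # Population standard deviation is an assumption; the sample version
    # (statistics.stdev) would differ slightly for small collections.
    sd_of_expectation = statistics.pstdev(expectations)
    mean_of_sd = statistics.fmean(math.sqrt(v) for v in variances)
    return mean_of_expectation, sd_of_expectation, mean_of_sd


if __name__ == "__main__":
    # Toy values, not taken from the accepted parameter sets.
    print(summarize([2.0, 4.0, 6.0], [1.0, 4.0, 9.0]))
```

The "standard deviation of expectation" column measures spread across parameter sets, whereas the "mean of standard deviation" column averages the within-model spread, so the two need not be similar in size.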

Here, using these parameter sets, we calculate the generationwise expected numbers of African-American genealogical ancestors and the expected numbers of African, European, and African-American genetic ancestors (Eq. 15), in a pedigree of a person drawn randomly from the African-American population born between 1960 and 1965. We also show the distribution across parameter sets, in each generation, of the expected numbers of genealogical ancestors from each population.

Genealogical ancestors

For each accepted parameter set, using Eq. 3, we evaluated the generationwise expected number of African-American ancestors that appear in a random genealogy, represented by $E[X_k]$. The mean across accepted parameter sets is shown in Fig. 5b and Table 1. Forward in time, the mean number of African-American genealogical ancestors is initially small, increasing to a peak in generation 6 (1760–1765) with a value of 98. It decreases toward the end of the admixture process.

At each generation n, genealogical ancestry is split across 5 groups: Africans reached in generation n, African-Americans present in generation n, Europeans reached in generation n, Africans who are ancestors to Africans reached subsequent to generation n, and Europeans who are ancestors to Europeans reached subsequent to generation n. The first and third of these categories were studied by Mooney et al. (2023). The fourth and fifth are individuals who are genealogical ancestors of individuals who contributed directly to the African-American population, but who are not themselves parents of African-Americans; the expected number of Africans who are ancestors to African genealogical ancestors reached only subsequent to generation n is obtained from Eq. 10 by $\sum_{i=n+1}^{13} 2^{i-n} E[U_{14-i}]$. A similar computation can be performed for Europeans.

Figure 6a plots the fractions among all genealogical ancestors assigned to the 5 categories, and the values plotted appear in Table 2. In the earliest generations, all genealogical ancestors are Africans and Europeans who do not directly contribute to the African-American population. As the admixture continues, African and European genealogical ancestors who directly contribute are reached, and eventually, African-Americans represent most of the genealogical ancestors. In generation 0 (1610–1615), ∼79% of genealogical ancestors are African and ∼21% are European, reflecting the fractions of an African-American genome that trace to African genetic ancestry and to European genetic ancestry.

Generation-specific genealogical and genetic ancestry fractions for African-Americans. a) Generationwise genealogical ancestry for a random African-American individual, partitioned across 5 categories and averaged across accepted parameter sets. The fraction of genealogical ancestors who are Africans in generation n who contribute directly to the African-American population is obtained from Eq. 10 as $E[U_{14-n}]/2^{14-n}$; the fraction of genealogical ancestors who are African but who only contribute to the African-American population through their subsequent African descendants is $\left(\sum_{i=n+1}^{13} 2^{i-n} E[U_{14-i}]\right)/2^{14-n}$. Similar calculations are performed for Europeans. The fraction of genealogical ancestors who are African-American is $E[X_{14-n}]/2^{14-n}$, calculated using Eq. 3. The values plotted appear in Table 2. b) Generationwise expected African genetic ancestry contributed to a descendant as a fraction of the total expected African genetic ancestry in the descendant, and expected European genetic ancestry contributed to the descendant as a fraction of the total expected European genetic ancestry in the descendant. The values are obtained from Eq. 36, with $n=14-k$. Error bars represent standard deviations of the values from Eq. 36 across accepted parameter sets.
Fig. 6.

Table 2.

Generation-specific expectations of the fractions of genealogical ancestry assigned to 5 categories, across accepted parameter sets.

Fraction of genealogical ancestors

| Generation (n) | Birth year | African | African-American | European | African, not counted | European, not counted |
| --- | --- | --- | --- | --- | --- | --- |
| 0 | 1610–1615 | 0.0000 | | | 0.7934 | 0.2066 |
| 1 | 1635–1640 | 0.0005 | 0.0000 | 0.0000 | 0.7929 | 0.2066 |
| 2 | 1660–1665 | 0.0035 | 0.0006 | 0.0003 | 0.7894 | 0.2062 |
| 3 | 1685–1690 | 0.0257 | 0.0044 | 0.0024 | 0.7637 | 0.2038 |
| 4 | 1710–1715 | 0.1171 | 0.0325 | 0.0127 | 0.6466 | 0.1911 |
| 5 | 1735–1740 | 0.1890 | 0.1623 | 0.0313 | 0.4575 | 0.1599 |
| 6 | 1760–1765 | 0.0632 | 0.3825 | 0.0417 | 0.3944 | 0.1182 |
| 7 | 1785–1790 | 0.0320 | 0.4874 | 0.0186 | 0.3624 | 0.0996 |
| 8 | 1810–1815 | 0.0362 | 0.5380 | 0.0208 | 0.3262 | 0.0788 |
| 9 | 1835–1840 | 0.0409 | 0.5950 | 0.0234 | 0.2853 | 0.0554 |
| 10 | 1860–1865 | 0.0584 | 0.6593 | 0.0118 | 0.2269 | 0.0436 |
| 11 | 1885–1890 | 0.0663 | 0.7296 | 0.0130 | 0.1606 | 0.0305 |
| 12 | 1910–1915 | 0.0752 | 0.8088 | 0.0145 | 0.0854 | 0.0161 |
| 13 | 1935–1940 | 0.0854 | 0.8985 | 0.0161 | | |

The table shows the values plotted in Fig. 6a.
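The "not counted" columns of Table 2 can be checked against the direct-contribution columns. Dividing the sum over later generations of $2^{i-n}E[U_{14-i}]$ by $2^{14-n}$ gives the sum over later generations of $E[U_{14-i}]/2^{14-i}$, so each "African, not counted" entry should equal the sum of the "African" entries in all later generations. The sketch below, with hypothetical names, verifies this for the rounded African values in the table; small discrepancies from rounding are expected.

```python
# "African" column of Table 2 for n = 0..13 (direct contributors).
AFRICAN_DIRECT = [0.0000, 0.0005, 0.0035, 0.0257, 0.1171, 0.1890, 0.0632,
                  0.0320, 0.0362, 0.0409, 0.0584, 0.0663, 0.0752, 0.0854]

# "African, not counted" column of Table 2 for n = 0..12.
AFRICAN_NOT_COUNTED = [0.7934, 0.7929, 0.7894, 0.7637, 0.6466, 0.4575, 0.3944,
                       0.3624, 0.3262, 0.2853, 0.2269, 0.1606, 0.0854]


def not_counted(direct):
    """Return, for n = 0..len(direct)-2, the sum of direct[n+1:]."""
    out = []
    running = 0.0
    for value in reversed(direct[1:]):  # generations 13, 12, ..., 1
        running += value
        out.append(running)             # value for the preceding generation
    out.reverse()
    return out


if __name__ == "__main__":
    for n, (got, reported) in enumerate(
            zip(not_counted(AFRICAN_DIRECT), AFRICAN_NOT_COUNTED)):
        print(f"n={n:2d}  computed={got:.4f}  reported={reported:.4f}")
```

The generation-0 entry recovered this way, about 0.79, is the total African share of the pedigree.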

Genetic ancestors

Considering the accepted parameter sets from Mooney et al. (2023), we used Eq. 15 to calculate generationwise expected numbers of African and European genetic ancestors. These values enable evaluation of expected fractions of the total African and European ancestry that have contributed to a descendant genome by each generation of genetic ancestors. For example, the fraction of the genome that traces to a specific African genetic ancestor from k generations before the descendant is, on average, $1/W_k$, where $W_k$ is the number of genetic ancestors in that generation. $W_k$ has expectation $2^k p_k$, the product of the number of genealogical ancestors k generations ago and the probability that a genealogical ancestor is a genetic ancestor. Therefore, the expected contribution to the African genetic ancestry fraction from all African genetic ancestors k generations before the present can be approximated by $E[Y_k]/(2^k p_k)$, the ratio of the expected number of African genetic ancestors k generations prior to the descendant and the expected total number of genetic ancestors in that generation. By Eqs. 10 and 15, $E[Y_k]/(2^k p_k)=E[U_k]/2^k$.

Figure 6b shows the expected African and European genetic ancestry contributed by the genetic ancestors from each generation as fractions of the total African and European genetic ancestry, or

$$\frac{E[U_k]/2^k}{\sum_{j=1}^{g} E[U_j]/2^j}. \qquad (36)$$

The figure converts between the backward-time perspective indexed by k and the forward-time $n=g-k$. Because a genetic ancestor from the more recent generations (large n) contributes more genetic ancestry on average than a genetic ancestor in previous generations (small n), we observe nonnegligible contributions from these later generations. However, ∼40% of the African genetic ancestry traces to generations 4 and 5, and ∼35% of the European genetic ancestry traces to generations 5 and 6, with an additional ∼30% of European genetic ancestry tracing to generations 7, 8, and 9.
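The percentages quoted above can be approximated from the rounded values in Table 2, treating each generation's share as its direct-contribution fraction divided by the total over generations. This is a rough sketch with hypothetical names: it uses values already averaged across parameter sets, whereas Fig. 6b averages the per-parameter-set fractions, so agreement is only approximate.

```python
# Direct-contribution fractions E[U_{14-n}]/2^{14-n} for n = 0..13, from Table 2.
AFRICAN = [0.0000, 0.0005, 0.0035, 0.0257, 0.1171, 0.1890, 0.0632,
           0.0320, 0.0362, 0.0409, 0.0584, 0.0663, 0.0752, 0.0854]
# The n = 0 European cell is blank in Table 2; it is treated as 0 here.
EUROPEAN = [0.0, 0.0000, 0.0003, 0.0024, 0.0127, 0.0313, 0.0417,
            0.0186, 0.0208, 0.0234, 0.0118, 0.0130, 0.0145, 0.0161]


def share(fractions, generations):
    """Share of a source's total ancestry contributed by the given generations."""
    total = sum(fractions)
    return sum(fractions[n] for n in generations) / total


if __name__ == "__main__":
    print(f"African, generations 4-5:  {share(AFRICAN, [4, 5]):.3f}")
    print(f"European, generations 5-6: {share(EUROPEAN, [5, 6]):.3f}")
    print(f"European, generations 7-9: {share(EUROPEAN, [7, 8, 9]):.3f}")
```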

The generationwise mean values across parameter sets of the expected numbers of genetic ancestors appear in Fig. 7, alongside expected numbers of genealogical ancestors for comparison. The genealogical values are replotted from Fig. 7 of Mooney et al. (2023): the numbers of genealogical ancestors are greater for Africans than for Europeans, and the expected total numbers of genealogical ancestors, summing across generations, are 314 Africans and 51 Europeans (Tables 3 and 4). Looking forward in time from the founding of the population, the numbers of genealogical ancestors increase to peak values and then decrease. The numbers of genetic ancestors also reach peaks and decrease toward the present. The expected total numbers of genetic ancestors are 162 Africans and 32 Europeans.

Generation-specific expectations of the numbers of African and European genealogical and genetic ancestors. The expected number of African genealogical ancestors is calculated according to Eq. 10 (standard deviation, Eq. 11). The expected number of African genetic ancestors is calculated according to Eq. 15 (standard deviation, Eq. 16). Similar calculations are performed for Europeans. The plot shows means of the expectation and standard deviation across accepted parameter sets. The values plotted appear in Table 4.
Fig. 7.

Table 3.

Summary statistics for the expected numbers of African, European, and African-American genealogical and genetic ancestors for a random individual from the African-American population across the accepted parameter sets.

| Quantity | Mean | Standard deviation | Minimum | First quartile | Median | Third quartile | Maximum |
| --- | --- | --- | --- | --- | --- | --- | --- |
| African genealogical ancestors | 314 | 99 | 124 | 240 | 299 | 376 | 680 |
| African genetic ancestors | 162 | 47 | 72 | 127 | 155 | 192 | 332 |
| European genealogical ancestors | 51 | 24 | 4 | 32 | 51 | 69 | 125 |
| European genetic ancestors | 32 | 14 | 4 | 21 | 32 | 43 | 77 |
| African-American genealogical ancestors | 363 | 90 | 202 | 294 | 345 | 418 | 709 |
| African-American genetic ancestors | 294 | 70 | 172 | 240 | 280 | 336 | 566 |

The estimates consider random individuals in the 1960–1965 birth cohort, assumed to be generation g=14 in a 3-epoch model. The standard deviations are standard deviations of the means across accepted parameter sets; means and standard deviations are rounded from Tables 1 and 4. The values for African and European genealogical ancestors appear in Table 3 in Mooney et al. (2023).

Table 4.

Generation-specific expectations of the numbers of African and European genealogical and genetic ancestors across accepted parameter sets.

Number of African ancestors

| Generation (n) | Birth year | Genealogical: mean of expectation | Genealogical: standard deviation of expectation | Genealogical: mean of standard deviation | Genetic: mean of expectation | Genetic: standard deviation of expectation | Genetic: mean of standard deviation |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | 1610–1615 | 0.14 | 0.07 | 0.54 | 0.01 | 0.00 | 0.09 |
| 1 | 1635–1640 | 4.25 | 1.99 | 3.41 | 0.41 | 0.19 | 0.71 |
| 2 | 1660–1665 | 14.27 | 6.04 | 7.28 | 2.45 | 1.03 | 1.93 |
| 3 | 1685–1690 | 52.70 | 19.95 | 20.47 | 15.33 | 5.80 | 6.91 |
| 4 | 1710–1715 | 119.90 | 40.15 | 42.22 | 55.60 | 18.62 | 20.50 |
| 5 | 1735–1740 | 96.76 | 28.37 | 33.45 | 65.10 | 19.09 | 23.12 |
| 6 | 1760–1765 | 16.18 | 4.14 | 6.60 | 13.94 | 3.57 | 5.88 |
| 7 | 1785–1790 | 4.10 | 2.71 | 2.50 | 3.97 | 2.62 | 2.45 |
| 8 | 1810–1815 | 2.31 | 1.57 | 1.70 | 2.31 | 1.57 | 1.70 |
| 9 | 1835–1840 | 1.31 | 0.91 | 1.20 | 1.31 | 0.91 | 1.19 |
| 10 | 1860–1865 | 0.94 | 0.39 | 0.99 | 0.94 | 0.39 | 0.99 |
| 11 | 1885–1890 | 0.53 | 0.23 | 0.72 | 0.53 | 0.23 | 0.72 |
| 12 | 1910–1915 | 0.30 | 0.14 | 0.53 | 0.30 | 0.14 | 0.53 |
| 13 | 1935–1940 | 0.17 | 0.08 | 0.39 | 0.17 | 0.08 | 0.39 |
| Total | | 313.86 | 98.58 | 102.62 | 162.37 | 46.72 | 52.66 |

Number of European ancestors

| Generation (n) | Birth year | Genealogical: mean of expectation | Genealogical: standard deviation of expectation | Genealogical: mean of standard deviation | Genetic: mean of expectation | Genetic: standard deviation of expectation | Genetic: mean of standard deviation |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | 1610–1615 | | | | | | |
| 1 | 1635–1640 | 0.32 | 0.14 | 0.61 | 0.03 | 0.01 | 0.18 |
| 2 | 1660–1665 | 1.28 | 0.61 | 1.31 | 0.22 | 0.10 | 0.48 |
| 3 | 1685–1690 | 4.98 | 2.52 | 3.10 | 1.45 | 0.73 | 1.36 |
| 4 | 1710–1715 | 12.98 | 7.07 | 6.50 | 6.02 | 3.28 | 3.53 |
| 5 | 1735–1740 | 16.01 | 9.44 | 7.80 | 10.77 | 6.35 | 5.60 |
| 6 | 1760–1765 | 10.67 | 6.85 | 5.58 | 9.19 | 5.90 | 4.96 |
| 7 | 1785–1790 | 2.38 | 1.84 | 1.84 | 2.30 | 1.78 | 1.81 |
| 8 | 1810–1815 | 1.33 | 1.04 | 1.27 | 1.33 | 1.04 | 1.27 |
| 9 | 1835–1840 | 0.75 | 0.60 | 0.90 | 0.75 | 0.60 | 0.90 |
| 10 | 1860–1865 | 0.19 | 0.12 | 0.44 | 0.19 | 0.12 | 0.44 |
| 11 | 1885–1890 | 0.10 | 0.06 | 0.32 | 0.10 | 0.06 | 0.32 |
| 12 | 1910–1915 | 0.06 | 0.03 | 0.24 | 0.06 | 0.03 | 0.24 |
| 13 | 1935–1940 | 0.03 | 0.02 | 0.18 | 0.03 | 0.02 | 0.18 |
| Total | | 51.08 | 24.32 | 18.68 | 32.44 | 14.31 | 12.18 |

Values are calculated as in Table 1. The mean of the expectation for African genealogical ancestors is obtained by averaging values of Eq. 10 across accepted parameter sets (Eq. 15 for genetic ancestors); the standard deviation of the expectation takes the standard deviation of those values. The mean of the standard deviation for African genealogical ancestors is obtained as the mean of Eq. 11 across accepted parameter sets (Eq. 16 for genetic ancestors). For the total, the mean of the expectation of the sum of the African genealogical ancestors is calculated by averaging values of Eq. 12 across accepted parameter sets (Eq. 17 for genetic ancestors); the standard deviation of the expectation takes the standard deviation of those values. The mean of the standard deviation for the total African genealogical ancestors is obtained as the mean of Eq. 13 across accepted parameter sets (Eq. 18 for genetic ancestors). Corresponding quantities for European ancestors are calculated by replacing each $s_{1,g-k}$ with $s_{2,g-k}$. The values of the total means for the expectation and standard deviation of African and European genealogical ancestors are those that appear in Table 3 of Mooney et al. (2023). The table shows the generationwise values plotted in Fig. 7 for the means and standard deviations of the expectation across the accepted parameter sets.

By a similar computation, Fig. 5b provides the generationwise expected numbers of African-American genetic ancestors, comparing them to corresponding numbers of genealogical ancestors. The expected total number of African-American genealogical ancestors, summing from generations 0 to 13, is 363, and the expected total for genetic ancestors is 294 (Tables 1 and 3).

In Fig. 7, the peak expected number of African genealogical ancestors appears in generation 4 (1710–1715). However, the corresponding peak for genetic ancestors occurs in generation 5 (1735–1740). The difference occurs because the peak for African genealogical ancestors occurs far enough back in time that the probability of genetic ancestry for those genealogical ancestors is well below 1 ($p_{10}=p_{14-4}\approx 0.4637$ by Eq. 1); the number of genetic ancestors among the smaller number of generation-5 genealogical ancestors is greater than among the larger number of generation-4 genealogical ancestors.

For Europeans, the peak of genealogical ancestors occurs later than for Africans, in generation 5 (1735–1740). In that later generation, the fraction of genealogical ancestors who are also genetic ancestors is greater than in generation 4 ($p_9=p_{14-5}\approx 0.6728$ by Eq. 1). Because the peak in genealogical ancestors occurs later for Europeans, the fraction of all European genealogical ancestors who are genetic ancestors ($32/51\approx 0.63$) exceeds the corresponding fraction for Africans ($162/314\approx 0.52$).
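The peak shift and the two overall fractions can be reproduced from the mean expectations in Table 4. The sketch below uses hypothetical names; the per-generation ratio of genetic to genealogical means recovers the probability of genetic ancestry for that generation, because that probability does not vary across parameter sets.

```python
# Mean expectations for generations n = 0..13, from Table 4.
AFR_GENEALOGICAL = [0.14, 4.25, 14.27, 52.70, 119.90, 96.76, 16.18,
                    4.10, 2.31, 1.31, 0.94, 0.53, 0.30, 0.17]
AFR_GENETIC = [0.01, 0.41, 2.45, 15.33, 55.60, 65.10, 13.94,
               3.97, 2.31, 1.31, 0.94, 0.53, 0.30, 0.17]
# The n = 0 European cells are blank in Table 4; they are treated as 0 here.
EUR_GENEALOGICAL = [0.0, 0.32, 1.28, 4.98, 12.98, 16.01, 10.67,
                    2.38, 1.33, 0.75, 0.19, 0.10, 0.06, 0.03]
EUR_GENETIC = [0.0, 0.03, 0.22, 1.45, 6.02, 10.77, 9.19,
               2.30, 1.33, 0.75, 0.19, 0.10, 0.06, 0.03]


def peak_generation(values):
    """Return the generation index n at which values is largest."""
    return max(range(len(values)), key=lambda n: values[n])


def genetic_fraction(genetic, genealogical, n):
    """Ratio of genetic to genealogical ancestors in generation n."""
    return genetic[n] / genealogical[n]


def overall_fraction(genetic, genealogical):
    """Ratio of total genetic to total genealogical ancestors."""
    return sum(genetic) / sum(genealogical)


if __name__ == "__main__":
    print("African peaks:", peak_generation(AFR_GENEALOGICAL),
          peak_generation(AFR_GENETIC))
    print("European peaks:", peak_generation(EUR_GENEALOGICAL),
          peak_generation(EUR_GENETIC))
    print(f"n=4 ratio: {genetic_fraction(AFR_GENETIC, AFR_GENEALOGICAL, 4):.4f}")
    print(f"n=5 ratio: {genetic_fraction(AFR_GENETIC, AFR_GENEALOGICAL, 5):.4f}")
    print(f"African overall:  {overall_fraction(AFR_GENETIC, AFR_GENEALOGICAL):.2f}")
    print(f"European overall: {overall_fraction(EUR_GENETIC, EUR_GENEALOGICAL):.2f}")
```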

This observation can be illustrated in a computation shown in Fig. 8, which compares the ratio of African and European genetic ancestors to the ratio of African and European genealogical ancestors across accepted parameter sets. The African:European ratio of genetic ancestors is consistently lower than the African:European ratio of genealogical ancestors. The comparative recency of the European genealogical ancestors—and the resulting increased probability of genetic ancestry for those genealogical ancestors—produces a greater value for the fraction of all genetic ancestors who are European compared to the fraction of all genealogical ancestors who are European.
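As a rough check of this pattern, the same comparison can be made with the totals in Table 4. The names below are hypothetical, and the sketch computes ratios of means across parameter sets; these differ from the means of per-parameter-set ratios reported for Fig. 8, so only the direction of the inequality is comparable.

```python
# Totals of the mean expectations, from the "Total" rows of Table 4.
AFRICAN_GENEALOGICAL = 313.86
EUROPEAN_GENEALOGICAL = 51.08
AFRICAN_GENETIC = 162.37
EUROPEAN_GENETIC = 32.44


def ratio(african, european):
    """African:European ratio of ancestor counts."""
    return african / european


def european_share(african, european):
    """Fraction of source-population ancestors who are European."""
    return european / (african + european)


if __name__ == "__main__":
    print(f"genealogical ratio: {ratio(AFRICAN_GENEALOGICAL, EUROPEAN_GENEALOGICAL):.2f}")
    print(f"genetic ratio:      {ratio(AFRICAN_GENETIC, EUROPEAN_GENETIC):.2f}")
    print(f"European share, genealogical: "
          f"{european_share(AFRICAN_GENEALOGICAL, EUROPEAN_GENEALOGICAL):.3f}")
    print(f"European share, genetic:      "
          f"{european_share(AFRICAN_GENETIC, EUROPEAN_GENETIC):.3f}")
```

With these totals, the genetic ratio is lower than the genealogical ratio, and the European share is correspondingly higher among genetic ancestors than among genealogical ancestors.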

Fig. 8. Ratios of the number of African ancestors to the number of European ancestors. The x-axis shows the ratio for genealogical ancestors, and the y-axis shows the ratio for genetic ancestors. For each of 45,189 accepted parameter sets, we calculated $\left( \sum_{n=0}^{13} E[U_{14-n}] \big/ \sum_{n=0}^{13} E[U'_{14-n}],\ \sum_{n=0}^{13} E[Y_{14-n}] \big/ \sum_{n=0}^{13} E[Y'_{14-n}] \right)$, visualizing the ordered pair of ratios in a density plot. The 89% of the pairs (40,201) that have both ratios below 20 are presented in the plot, with the color of a $\frac{1}{2} \times \frac{1}{2}$ square representing the number of pairs located in that square. The mean ratios across all accepted parameter sets are (9.99, 7.07), and the standard deviations are (11.87, 6.32), with covariance 73.56. For the 89% of points shown, the mean ratios are (6.74, 5.36), with standard deviations (4.18, 2.96) and covariance 12.15. The $y=x$ line is shown for comparison. Among the accepted parameter sets, the ratio we observed for genetic ancestors was always smaller than the ratio for genealogical ancestors; hence, for squares along the diagonal, only the lower triangle is colored. Note that although a smaller value for the ratio of genetic ancestors compared to the ratio of genealogical ancestors was always observed, such a relationship need not hold in principle.

In Fig. 5b, the peak number of African-American genealogical ancestors appears still later than the peaks for African and European genealogical ancestors, in generation 6 (1760–1765). In that generation, the fraction of genealogical ancestors who are also genetic ancestors is $p_8 = p_{14-6} \approx 0.8615$ (by Eq. 1). Hence, the fraction of African-American genealogical ancestors who are also genetic ancestors ($294/363 \approx 0.81$) exceeds corresponding fractions for Africans and Europeans.

Discussion

We have developed an approach to counting genetic ancestors of an admixed individual, estimating the number of genetic ancestors who contributed directly to the admixed population and the number of genetic ancestors belonging to the admixed population itself. The approach proceeds by recursively treating the number of such ancestors in a given generation as a random variable that is binomially distributed based on a corresponding random variable for the subsequent generation. Using an admixture model together with a model of African-American demographic history, we estimated that a random African-American born between 1960 and 1965 has a mean of 162 African genetic ancestors (standard deviation 47) and 32 European genetic ancestors (standard deviation 14) who contributed to the African-American population directly from the source populations, and a mean total of 294 African-American genetic ancestors (standard deviation 70).

Genetic and genealogical ancestors

In population-genetic studies of genetically admixed populations, genetic ancestry that traces to the source populations has generally been analyzed by evaluation of estimated admixture fractions in members of an admixed population. The statistical models used for this estimation consider admixture in terms of the fractions of genomes contributed rather than via contributions of specific ancestors. With the increasing use of these genomic contributions to report information to individuals about their own genealogies, the meaning of concepts of genetic ancestry and admixture—and their estimates—have been increasingly queried (Weiss and Long 2009; Lawson et al. 2018; Mathieson and Scally 2020). Our use of mechanistic admixture models enables new perspectives on the interpretation of genetic admixture and ancestry estimates, seeking to describe the timing at which the ancestors entered pedigrees of individuals and to count genetic ancestors across the length of the admixture process.

The number of genetic ancestors is bounded above by the number of genealogical ancestors, as each genetic ancestor must also be a genealogical ancestor. Both for genealogical and for genetic ancestors, the number of ancestors in a given generation is binomially distributed based on the number of genealogical ancestors in the subsequent generation (Eqs. 3, 10, 15, 20). The difference between the distributions of genealogical and genetic ancestors is in the binomial probability of success. For genealogical ancestors, the distribution depends only on parameters of the admixture process (Eqs. 3, 10), whereas for genetic ancestors, it depends also on a genetic ancestry probability for a genealogical ancestor separated from a descendant by a specified number of generations (Eqs. 15, 20). Depending on the features of the admixture process, the number of genetic ancestors from a source population can be close to the number of genealogical ancestors, or far smaller (Fig. 4).
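The binomial recursion described here can be sketched as a small simulation. This is an illustration under stated assumptions, not the authors' implementation: the conditional distributions follow the statements in Appendix B ($X_k \mid X_{k-1} \sim \mathrm{Bin}(2X_{k-1}, h_{g-k})$ and $U_k \mid X_{k-1}, X_k \sim \mathrm{Bin}(2X_{k-1}-X_k, \tilde{s}_{1,g-k})$), Eq. 1 is assumed to have the form implied by Appendix C, and genetic ancestors are drawn by thinning the genealogical ancestors with probability $p_k$, which yields the success probability $\tilde{s}_{1,g-k} p_k$. The parameter values in `PARAMS` and all function names are hypothetical.

```python
import math
import random


def genetic_ancestry_prob(k):
    """Assumed form of Eq. 1: p_1 = 1, else 1 - exp(-(33k - 11) / 2**(k - 1))."""
    return 1.0 if k == 1 else -math.expm1(-(33 * k - 11) / 2 ** (k - 1))


def binomial(rng, n, p):
    """Draw from Bin(n, p) by summing n Bernoulli trials (adequate for small n)."""
    return sum(rng.random() < p for _ in range(n))


def simulate_pedigree(params, rng):
    """Simulate one pedigree; params[t] = (s1, h, s2) for generation t = 0, ..., g - 1.

    Returns lists indexed by k, the number of generations back from the descendant:
    x = admixed genealogical ancestors, u = genealogical ancestors from source 1,
    y = genetic ancestors from source 1.
    """
    g = len(params)
    x, u, y = [1], [0], [0]  # step 0 is the descendant
    for k in range(1, g + 1):
        s1, h, s2 = params[g - k]
        parents = 2 * x[k - 1]
        x_k = binomial(rng, parents, h)  # parents drawn from the admixed population
        from_sources = parents - x_k     # parents drawn from a source population
        s1_tilde = s1 / (s1 + s2) if from_sources > 0 else 0.0
        u_k = binomial(rng, from_sources, s1_tilde)
        y_k = binomial(rng, u_k, genetic_ancestry_prob(k))  # thinning by p_k
        x.append(x_k)
        u.append(u_k)
        y.append(y_k)
    return x, u, y


# Hypothetical (s1, h, s2) triples for generations 0..5; h = 0 in the founding generation.
PARAMS = [(0.80, 0.00, 0.20), (0.50, 0.30, 0.20), (0.30, 0.60, 0.10),
          (0.15, 0.80, 0.05), (0.08, 0.90, 0.02), (0.04, 0.95, 0.01)]

x, u, y = simulate_pedigree(PARAMS, random.Random(1))
print("admixed genealogical ancestors:", x)
print("source-1 genealogical ancestors:", u)
print("source-1 genetic ancestors:", y)
```

Because each genetic ancestor is obtained by thinning a genealogical ancestor, every simulated count in `y` is at most the corresponding count in `u`, mirroring the upper bound stated in the text.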

The evaluation of genetic ancestors extends the mechanistic admixture model of Mooney et al. (2023). From a mathematical perspective, the focus on genealogical ancestors by Mooney et al. (2023) proceeded by adding a well-placed factor of 2 to the work of Verdu and Rosenberg (2011), converting a genomic fraction in a single-generation recursion into a genealogical ancestor count. The mathematical extension here is substantial, incorporating into the admixture model not only the factor of 2 but also the time-varying probability that a genealogical ancestor is a genetic ancestor.

Viewed from the perspective of the recombination-based genetic ancestry model of Coop (2013), our approach extends the analysis of genetic ancestors by separating them across source populations. If we were to follow Coop (2013) and consider all populations together as one, then Eq. 3 would reduce to $E[X_k] = 2^k$, and the random number of genetic ancestors in generation $k$ given by Eq. 20 would reduce to $X_k^* \sim \mathrm{Bin}(2^k, p_k)$. In other words, with no ancestry proportion considered (or alternatively, with all genealogical ancestors treated as members of the admixed population), the number of genealogical ancestors in generation $k$ is $2^k$, and the probability that a genealogical ancestor is tabulated as a genetic ancestor depends only on the genetic ancestry probability $p_k$. The expectation of this random variable gives the Coop (2013) calculation of the expected number of genetic ancestors in generation $k$, $E[X_k^*] = 2^k p_k$ (Eq. 1, Fig. 3).
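The reduced single-population case can be tabulated directly. The sketch below assumes the form of Eq. 1 implied by Appendix C ($p_1 = 1$, $p_k = 1 - e^{-(33k-11)/2^{k-1}}$ for $k \geq 2$) and uses the standard mean and variance of a binomial random variable; the function names are illustrative.

```python
import math


def genetic_ancestry_prob(k: int) -> float:
    """Assumed form of Eq. 1: p_1 = 1, else 1 - exp(-(33k - 11) / 2**(k - 1))."""
    return 1.0 if k == 1 else -math.expm1(-(33 * k - 11) / 2 ** (k - 1))


def expected_genetic_ancestors(k: int) -> float:
    """Mean of Bin(2**k, p_k): expected genetic ancestors k generations back."""
    return 2 ** k * genetic_ancestry_prob(k)


def variance_genetic_ancestors(k: int) -> float:
    """Variance of Bin(2**k, p_k)."""
    p = genetic_ancestry_prob(k)
    return 2 ** k * p * (1.0 - p)


print(f"{'k':>2} {'genealogical':>12} {'E[genetic]':>11} {'SD':>7}")
for k in range(1, 15):
    mean = expected_genetic_ancestors(k)
    sd = math.sqrt(variance_genetic_ancestors(k))
    print(f"{k:>2} {2 ** k:>12} {mean:>11.1f} {sd:>7.1f}")
```

The table makes the divergence visible: the genealogical count doubles every generation, whereas the expected genetic count grows more slowly once $p_k$ falls below 1.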

African-American demographic history

With the Mooney et al. (2023) 14-generation model of African-American demographic history, we examined the expected numbers of genetic ancestors from Africa, Europe, and the African-American population itself, for random African-Americans born 1960–1965. We found for the mean numbers of genetic ancestors 162 Africans and 32 Europeans (Fig. 7, Table 3), smaller than the corresponding numbers of genealogical ancestors, 314 Africans and 51 Europeans (Mooney et al. 2023). Tabulating ancestors within the African-American population itself, the expected numbers of genealogical and genetic ancestors are 363 and 294, respectively (Fig. 5b, Table 1).

The peak number of genealogical ancestors occurs in generation 4 for Africans (1710–1715), generation 5 for Europeans (1735–1740), and generation 6 for African-Americans (1760–1765, Tables 1 and 4). Tracing genealogical ancestors back in time, noting that the total number of genealogical ancestors doubles in each generation, we find that the proportion of African-Americans among genealogical ancestors is greatest in generation 13, decreasing back in time (Fig. 6a, Table 2). The highest proportion occurs for Africans in generation 5 and for Europeans in generation 6. Eventually, African and European genealogical ancestors are reached who are parents solely of Africans or of Europeans; the proportions of these Africans and Europeans increase back in time until all genealogical ancestors are in these categories, in an approximate ratio of 79% Africans to 21% Europeans (Table 2). These quantities, which estimate fractions of all genealogical ancestors tracing to Africans and Europeans, lie in the range of permissible mean empirical genomic ancestry coefficients (Mooney et al. 2023).

For genetic ancestors, the contribution to African genetic ancestry is greatest for generations 4 and 5; the European genetic ancestry is highest in generations 5 and 6 (Fig. 6b). The peak number of genetic ancestors occurs in generation 5 for Europeans and generation 6 for African-Americans, matching corresponding peaks for genealogical ancestors (Tables 1 and 4). However, the peak for African genetic ancestors occurs in generation 5, one generation later than for African genealogical ancestors (Table 4). Many African genealogical ancestors are far enough back in time that a large share of them are not genetic ancestors, so that the peak for genetic ancestors occurs later than for genealogical ancestors. The fact that African genealogical ancestors occur on average farther in the past than European genealogical ancestors means that the 314:51 ratio of the mean numbers of African and European genealogical ancestors is larger than the 162:32 ratio of the mean numbers of African and European genetic ancestors (Fig. 8), as a larger fraction of the African genealogical ancestors have been lost as genetic ancestors. In effect, because the European genealogical ancestors are later on average than the African genealogical ancestors, the probability that a European genealogical ancestor is also a genetic ancestor exceeds the corresponding probability for Africans.
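The comparison of ratios can be checked with simple arithmetic on the reported means. The sketch below uses only the four mean counts stated in the text; the variable names are illustrative. Note that these are ratios of means, which differ from the means of per-parameter-set ratios reported in the caption of Fig. 8.

```python
# Mean ancestor counts reported in the text (Fig. 7, Table 3; Mooney et al. 2023).
genealogical = {"African": 314, "European": 51}
genetic = {"African": 162, "European": 32}

# African:European ratios for each kind of ancestor.
ratio_genealogical = genealogical["African"] / genealogical["European"]
ratio_genetic = genetic["African"] / genetic["European"]

# Fraction of genealogical ancestors who are also genetic ancestors, by source.
retained = {pop: genetic[pop] / genealogical[pop] for pop in genealogical}

print(f"genealogical African:European ratio = {ratio_genealogical:.2f}")
print(f"genetic African:European ratio      = {ratio_genetic:.2f}")
for pop, frac in retained.items():
    print(f"{pop}: fraction retained as genetic ancestors = {frac:.2f}")
```

The genealogical ratio (about 6.16) exceeds the genetic ratio (about 5.06) because a smaller fraction of African genealogical ancestors (about 0.52) than of European genealogical ancestors (about 0.63) is retained as genetic ancestors.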

An interesting difference occurs between the peak of the African ancestor counts and the subsequent peak of the Transatlantic Slave Trade. The fraction of Africans transported by 1760 is about half of the total (Hacker 2020, Table 1); however, the comparable fraction of African genealogical ancestors, individuals born in generation 5 (born 1735–1740, reproductive age at 1760) or earlier, is 92% (Table 4). Hence, although the many transported Africans born in generations 6 and 7 certainly contributed in great numbers to the African-American population, a typical pedigree likely contains multiple lines that trace to the earlier enslaved migrants of generations 5 and earlier. In other words, by the time of the birth of generations 6 (1760–1765) and 7 (1785–1790), the African-American population was large enough that among all genealogical lines of a person born 1960–1965, many trace to genealogical ancestors who were already resident in the African-American population at the time of those generations. Indeed, for generation 6 onward and even for generation 5, African-Americans are a nontrivial fraction of the genealogical ancestors of a modern person (Fig. 6), from ∼38% in generation 6 up to ∼90% in generation 13 (Table 2). The other major component in generation 6 onward is African genealogical ancestors who did not contribute directly to the African-American population. These Africans are the genealogical ancestors of Africans newly contributing to the African-American population. The substantial fraction for this category results from the accumulation of many African genealogical ancestors who contributed to pedigrees in generations later than generation 5.

Limitations and extensions

As our approach follows the assumptions of Mooney et al. (2023), it is subject to many of the same limitations. For example, we do not consider a Native American component of admixture in African-Americans. Our treatment of a “random African-American” born in the 1960–1965 window does not take into consideration regional variation across the African-American population in admixture processes or other demographic phenomena. We also disregard the possibility that the same genealogical ancestor might occur in multiple positions in a pedigree, so that our count of the number of ancestors might double-count some individuals; the time over which this assumption is sensible is the period in which the number of genealogical ancestors in a pedigree is small in relation to the pool of potential ancestors. Our discretization of the generations oversimplifies the demographic history, as does our 3-epoch model, though this model does accord with the perspective of one of the most comprehensive empirical analyses of African-American genetic admixture (Baharian et al. 2016). Another limitation is that our model in principle allows an unlikely scenario in which the 2 parents of an African-American are 2 Europeans. We also do not consider distinct ancestry parameters for males and females. Each of these limitations is shared between the assessment of genealogical ancestors by Mooney et al. (2023) and our analysis of genetic ancestors here. As is discussed by Mooney et al. (2023, p. 13), each is possible to address by extensions and modifications of the model, potentially leading to further understanding of both genealogical and genetic ancestors.

Additional limitations not shared in the work of Mooney et al. (2023), which focused solely on genealogical ancestors, arise from the use of the Coop (2013) model to evaluate the probability that a genealogical ancestor is a genetic ancestor. This approach does not account for recombination phenomena such as recombination-rate variation across the genome, gene conversion, the particular sizes of chromosomes, crossover interference that perturbs the Poisson distribution assumed for the number of new genomic segments each generation, differing male and female recombination rates, or the X chromosome. With its simple treatment of the recombination process, the Coop (2013) model ignores many complexities that affect the probability that some segment from a genealogical ancestor might be retained in a descendant. Although extensions to accommodate such phenomena could be developed, in a single simple equation (Eq. 1), the Coop (2013) recombination model does capture the basic phenomenon—as explained by Donnelly (1983)—that as the time between ancestor and descendant increases, the probability that the descendant retains a segment from the ancestor decreases (Fig. 3), and a steep drop in probability occurs when the separation increases from 7–8 generations (Robert Burns and descendants born 1960–1965) to 15–16 generations (descendants of William Shakespeare).

Our empirical focus has been on an example from human populations, but the model can be applied more generally to diploid species in which mechanistic admixture models and recombination models can be specified. To take one example, Armstrong et al. (2023) have studied genetic variation in captive tigers, a population formed through admixture of wild source populations from several different parts of Asia. Armstrong et al. (2023) have estimated genomic proportions that trace to the various source populations. With a generalization to permit more than 2 sources, our model can assist in understanding the properties of the genetic ancestors that have given rise to typical individual captive tigers.

Conclusions

Further study of a mechanistic admixture model has deepened the analysis of the number of genealogical ancestors who contribute from a source population to an admixed pedigree, and it has also introduced an approach to evaluating the number of contributing genetic ancestors. For African-Americans, the distinction between genealogical and genetic ancestors suggests that although the number of African genealogical ancestors in a pedigree greatly exceeds the number of European genealogical ancestors, because the African genealogical ancestors are on average earlier in time than the European genealogical ancestors, the number of African genetic ancestors does not exceed the number of European genetic ancestors by as great a margin. More generally, the calculations contribute to understanding the relationship between an admixed population’s demographic history, its ancestral individuals who have given rise to the modern population, and the genomes of its current members.

Data availability

The 45,189 sets of accepted parameter values $(s_{1,0}, h_0, s_{2,0}), (s_{1,1}, h_1, s_{2,1}), \ldots, (s_{1,13}, h_{13}, s_{2,13})$ from Mooney et al. (2023), on which the analysis of the African-American population is based, are available in Supplementary File 1. Supplemental material is available at GENETICS online.

Acknowledgments

We thank Jonathan Pritchard and 3 reviewers for comments.

Funding

We acknowledge support from National Science Foundation grant BCS-2116322 and from a Council for Higher Education of Israel Scholarship for Outstanding Postdoctoral Fellows in Data Science.

Literature cited

Armstrong EE, Mooney JA, Solari KA, Kim BY, Barsh GS, Grant V, Greenbaum G, Kaelin CB, Panchenko K, Pickrell JK, et al. 2023. Unraveling the genomic diversity and evolutionary history of captive tigers in the United States. bioRxiv 545608. https://doi.org/10.1101/2023.06.19.545608.

Baharian S, Barakatt M, Gignoux CR, Shringarpure S, Errington J, Blot WJ, Bustamante CD, Kenny EE, Williams SM, Aldrich MC, et al. 2016. The great migration and African-American genomic diversity. PLoS Genet. 12:e1006059.

Baird SJE, Barton NH, Etheridge AM. 2003. The distribution of the surviving blocks of an ancestral genome. Theor Pop Biol. 64:451–471.

Buffalo V, Mount SM, Coop G. 2016. A genealogical look at shared ancestry on the X chromosome. Genetics. 204:57–75.

Chang JT. 1999. Recent common ancestors of all present-day individuals. Adv Appl Probab. 31:1002–1026.

Donnelly KP. 1983. The probability that related individuals share some section of genome identical by descent. Theor Pop Biol. 23:34–63.

Goldberg A, Rastogi A, Rosenberg NA. 2020. Assortative mating by population of origin in a mechanistic model of admixture. Theor Pop Biol. 134:129–146.

Goldberg A, Rosenberg NA. 2015. Beyond 2/3 and 1/3: the complex signatures of sex-biased admixture on the X chromosome. Genetics. 201:263–279.

Goldberg A, Verdu P, Rosenberg NA. 2014. Autosomal admixture levels are informative about sex bias in admixed populations. Genetics. 198:1209–1229.

Gravel S, Steel M. 2015. The existence and abundance of ghost ancestors in biparental populations. Theor Pop Biol. 101:47–53.

Hacker JD. 2020. From ‘20 and odd’ to 10 million: the growth of the slave population of the United States. Slavery Abol. 41:840–855.

Kelleher J, Etheridge AM, Véber A, Barton NH. 2016. Spread of pedigree versus genetic ancestry in spatially distributed populations. Theor Pop Biol. 108:1–12.

Kim J, Edge MD, Goldberg A, Rosenberg NA. 2021. Skin deep: the decoupling of genetic admixture levels from phenotypes that differed between source populations. Am J Phys Anthropol. 175:406–421.

Lawson DJ, van Dorp L, Falush D. 2018. A tutorial on how not to over-interpret STRUCTURE and ADMIXTURE bar plots. Nature Commun. 9:3258.

Mathieson I, Scally A. 2020. What is ancestry? PLoS Genet. 16:e1008624.

Matsen FA, Evans SN. 2008. To what extent does genealogical ancestry imply genetic ancestry? Theor Pop Biol. 74:182–190.

Milo R, Phillips R. 2015. Cell biology by the numbers. New York: Garland Science.

Mooney JA, Agranat-Tamir L, Pritchard JK, Rosenberg NA. 2023. On the number of genealogical ancestors tracing to the source groups of an admixed population. Genetics. 224:iyad079.

Rohde DLT, Olson S, Chang JT. 2004. Modelling the recent common ancestry of all living humans. Nature. 431:562–566.

Stapley J, Feulner PGD, Johnston SE, Santure AW, Smadja CM. 2017. Variation in recombination frequency and distribution across eukaryotes: patterns and processes. Philos Trans R Soc B. 372:20160455.

Verdu P, Rosenberg NA. 2011. A general mechanistic model for admixture histories of hybrid populations. Genetics. 189:1413–1426.

Weiss KM, Long JC. 2009. Non-Darwinian estimation: my ancestors, my genes’ ancestors. Genome Res. 19:703–710.

Wiuf C, Hein J. 1997. On the number of ancestors to a DNA sequence. Genetics. 147:1459–1468.

Appendix A: Proofs of Eqs. 3, 4 and 6

We prove Eq. 3, describing $E[X_k]$, by induction. For $k=0$,

We assume that the claim holds for $k-1$,

Using the inductive hypothesis and the fact that $X_k \sim \mathrm{Bin}(2X_{k-1}, h_{g-k})$ for $1 \leq k \leq g$, we obtain

Next, we prove Eq. 4, again by induction. For $k=0$, $X_0=1$ has variance 0; Eq. 4 holds trivially, as it is an empty sum. For $k=1$, $X_1 \sim \mathrm{Bin}(2, h_{g-1})$, and therefore,

We assume that the claim holds for $k-1$,

We use the law of total variance with Eq. 3 and the inductive hypothesis. We have

Finally, we prove Eq. 6. First, we prove that if $0 \leq m < n \leq g$, then

Fixing $m$ with $0 \leq m \leq g-1$, we proceed by induction on $n$. For $n=m+1$, we have

We now assume that for $(n,m)$ with $0 \leq m < n \leq g$ and $n \geq m+2$,

Then

Having obtained the covariance $\mathrm{Cov}[X_n, X_m]$, we conclude

Appendix B: Proofs of Eqs. 13 and 18

We prove Eq. 13, starting with the law of total variance.

For line (ii), given $X_0, \ldots, X_{k-1}, X_k$, $U_k$ depends only on $X_{k-1}$ and $X_k$. Among the genealogical ancestors in step $k$ of the descendant from step 0, $2X_{k-1}$ are parents of admixed individuals from step $k-1$, and $X_k$ are admixed individuals in step $k$; $2X_{k-1}-X_k$ reach a source population in step $k$, with binomial probabilities $s_{1,g-k}/(s_{1,g-k}+s_{2,g-k}) = s_{1,g-k}/(1-h_{g-k}) = \tilde{s}_{1,g-k}$ for source 1 and $s_{2,g-k}/(s_{1,g-k}+s_{2,g-k}) = s_{2,g-k}/(1-h_{g-k}) = \tilde{s}_{2,g-k}$ for source 2, respectively. In other words, $U_k \mid X_{k-1}, X_k \sim \mathrm{Bin}(2X_{k-1}-X_k, \tilde{s}_{1,g-k})$.

For line (iii), in the sum $\sum_{k=1}^{g} \tilde{s}_{1,g-k}(2X_{k-1}-X_k)$, for $k=1,2,\ldots,g-1$, $X_0=1$ and $X_g=0$ are constants and have zero variance. We also use the law of total expectation. Line (iv) follows from Eq. 6 and from the binomial distribution of $X_k \mid X_{k-1}$, so that $E[E[2X_{k-1}-X_k \mid X_{k-1}]] = 2E[X_{k-1}] - 2h_{g-k}E[X_{k-1}] = (1-h_{g-k})(2E[X_{k-1}])$. Finally, for (v), we simplify $\tilde{s}_{1,g-k}(1-h_{g-k}) = s_{1,g-k}$.

Similarly, we also use the law of total variance to prove Eq. 18:

The proof is entirely analogous, except that $\tilde{s}_{1,g-k}\,p_k$ appears in place of $\tilde{s}_{1,g-k}$.
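The expectation steps used in these proofs can be illustrated numerically. Taking expectations in $X_k \mid X_{k-1} \sim \mathrm{Bin}(2X_{k-1}, h_{g-k})$ gives $E[X_k] = 2h_{g-k}E[X_{k-1}]$, and the simplification $\tilde{s}_{1,g-k}(1-h_{g-k}) = s_{1,g-k}$ gives $E[U_k] = 2 s_{1,g-k} E[X_{k-1}]$; replacing $\tilde{s}_{1,g-k}$ by $\tilde{s}_{1,g-k}\,p_k$ multiplies the latter by $p_k$. The sketch below is a minimal illustration of these recursions, assuming the form of Eq. 1 implied by Appendix C; the function names and the two-generation parameter values are hypothetical.

```python
import math


def genetic_ancestry_prob(k: int) -> float:
    """Assumed form of Eq. 1: p_1 = 1, else 1 - exp(-(33k - 11) / 2**(k - 1))."""
    return 1.0 if k == 1 else -math.expm1(-(33 * k - 11) / 2 ** (k - 1))


def expected_counts(params):
    """Expected ancestor counts per step k = 1, ..., g.

    params[t] = (s1, h, s2) for generation t = 0, ..., g - 1.
    Each row holds E[X_k] (admixed genealogical ancestors), E[U_k] (source-1
    genealogical ancestors), and E[Y_k] (source-1 genetic ancestors).
    """
    g = len(params)
    e_x_prev = 1.0  # E[X_0] = 1: the descendant at step 0
    rows = []
    for k in range(1, g + 1):
        s1, h, s2 = params[g - k]
        e_x = 2.0 * h * e_x_prev   # E[X_k] = 2 h_{g-k} E[X_{k-1}]
        e_u = 2.0 * s1 * e_x_prev  # E[U_k] = 2 s_{1,g-k} E[X_{k-1}]
        e_y = genetic_ancestry_prob(k) * e_u
        rows.append({"k": k, "E[X]": e_x, "E[U]": e_u, "E[Y]": e_y})
        e_x_prev = e_x
    return rows


# Hypothetical two-generation history: generation 0 is founding (h = 0).
EXAMPLE = [(0.6, 0.0, 0.4), (0.2, 0.5, 0.3)]
for row in expected_counts(EXAMPLE):
    print(row)
```

In this toy history, the first step back has expected counts of 1.0 admixed parents and 0.4 source-1 parents, and the second step back has no admixed ancestors and an expected 1.2 source-1 genealogical ancestors.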

Appendix C: Proof of Eq. 35

We prove inequalities concerning $p_k/p_{k-1}$: (1) $p_k/p_{k-1} < 1$ for $k \geq 2$; (2) $p_k/p_{k-1} > \frac{1}{2}$ for $k \geq 2$.

  1. By Eq. 1, $p_k/p_{k-1} = [1-e^{-a(k)}]/[1-e^{-b(k)}]$ for $k \geq 3$, where $a(k) = (33k-11)/2^{k-1}$ and $b(k) = (33k-44)/2^{k-2}$. For $k \geq 3$, $0 < a(k) < b(k)$, and hence, $1-e^{-a(k)} < 1-e^{-b(k)}$ and $p_k/p_{k-1} < 1$. For $k=2$, $p_k/p_{k-1} < 1$ as $p_k < 1$ by Eq. 1 and $p_{k-1} = 1$.

  2. For $k=2$, $p_k/p_{k-1} = p_2 = 1-e^{-55/2} > \frac{1}{2}$. For $k \geq 3$, we rearrange Eq. 1 to find that the inequality $p_k/p_{k-1} > \frac{1}{2}$ is equivalent to
    (C1)
    The inequality $e^x + e^{-x} \geq 2$ holds for all $x$, as it is equivalent to $\cosh x \geq 1$. Hence, for $c > 1$, $ce^x + e^{-x} > e^x + e^{-x} \geq 2$. We see that Eq. C1 then follows, with
    in place of $(c,x)$. As Eq. C1 holds, we conclude $p_k/p_{k-1} > \frac{1}{2}$.
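Both inequalities can be checked numerically. The sketch below uses $a(k)$ and $b(k)$ as defined in item 1. For the second inequality it uses one possible rearrangement, offered as an assumption rather than as the authors' exact form: $p_k/p_{k-1} > \frac{1}{2}$ is equivalent to $e^{a(k)} + e^{a(k)-b(k)} > 2$, which has the shape $ce^x + e^{-x} > 2$ with the hypothetical choice $c = e^{66/2^{k-1}}$ and $x = (33k-77)/2^{k-1}$. The function names are illustrative.

```python
import math


def a(k: int) -> float:
    return (33 * k - 11) / 2 ** (k - 1)


def b(k: int) -> float:
    return (33 * k - 44) / 2 ** (k - 2)


def ratio(k: int) -> float:
    """p_k / p_{k-1} for k >= 2, using p_1 = 1."""
    if k == 2:
        return -math.expm1(-a(2))
    # expm1 keeps 1 - exp(-z) accurate for small z; the two minus signs cancel.
    return math.expm1(-a(k)) / math.expm1(-b(k))


def rearranged_lhs(k: int) -> float:
    """c * exp(x) + exp(-x) for the assumed choice of (c, x), k >= 3."""
    c = math.exp(66 / 2 ** (k - 1))
    x = (33 * k - 77) / 2 ** (k - 1)
    return c * math.exp(x) + math.exp(-x)


# Restricted to k <= 40 so that c - 1 stays well above floating-point resolution.
for k in range(2, 41):
    r = ratio(k)
    status = "ok" if 0.5 < r < 1.0 else "VIOLATION"
    print(f"k={k:2d}  p_k/p_(k-1) = {r:.6f}  {status}")
```

The ratio starts just below 1 for small $k$ and decreases toward $\frac{1}{2}$ as $k$ grows, staying strictly between the two bounds over the range checked.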

Author notes

Conflicts of interest The author(s) declare no conflicts of interest.

This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
Editor: J Novembre