-
PDF
- Split View
-
Views
-
Cite
Cite
João Pimenta, Alexandra M Lopes, David Comas, António Amorim, Miguel Arenas, Evaluating the Neolithic Expansion at Both Shores of the Mediterranean Sea, Molecular Biology and Evolution, Volume 34, Issue 12, December 2017, Pages 3232–3242, https://doi.org/10.1093/molbev/msx256
- Share Icon Share
Abstract
During the Neolithic, human populations underwent cultural and technological developments that led to an agricultural revolution. Although the population genetics and evolution of European Neolithic populations have been extensively studied, little is known regarding the Neolithic expansion in North Africa with respect to Europe. One could expect that the different environmental and geological conditions at both shores of the Mediterranean Sea could have led to contrasting expansions. In order to test this hypothesis, we compared the Neolithic expansion in Europe and North Africa accounting for possible migration between them through the Strait of Gibraltar. We analyzed the entire X chromosome of 580 individuals from 20 populations spatially distributed along the North of Africa and Europe. Next, we applied approximate Bayesian computation based on extensive spatially explicit computer simulations to select among alternative scenarios of migration through the Strait of Gibraltar and to estimate population genetics parameters in both expansions. Our results suggest that, despite being more technologically advanced, Neolithic populations did not expand faster than Paleolithic populations, which could be interpreted as a consequence of a more sedentary lifestyle. We detected reciprocal Neolithic migration between the Iberian Peninsula and North Africa through the Strait of Gibraltar. Counterintuitively, we found that the studied Neolithic expansions presented similar levels of carrying capacity and migration, and occurred at comparable speeds, suggesting a similar demic process of substitution of hunter–gatherer populations. Altogether, the Neolithic expansion through both Mediterranean shores was not so different, perhaps because these populations shared similar technical abilities and lifestyle patterns.
Introduction
The Neolithic transition in Europe and North Africa started in the Near East around 10,000 years ago (ya) and generated a revolution in the living-style with the emergence of farmers after previous hunter–gatherer Paleolithic populations (Zeder and Hesse 2000; Skoglund et al. 2012; Broushaki et al. 2016; Gallego-Llorente et al. 2016; Lazaridis et al. 2016; Meiklejohn et al. 2017). Two contrasting hypotheses were postulated for the expansion of Agriculture, the demic diffusion model (Ammerman and Cavalli-Sforza 1984; Sokal et al. 1991) and the cultural diffusion model (Fort 2012). The first one supports a migratory flux of Neolithic populations across Europe, which partially replaced the indigenous hunter–gatherer populations and implies gene flow between hunter–gatherers and farmers. The second model predicts the cultural adoption of the Neolithic lifestyle by neighbor populations with limited gene flow from Near Eastern populations (Sampietro et al. 2007; Fort 2012; Fu et al. 2012). Agriculture spread rapidly along the Southeast of Europe (Zeder 2008), leading to a pronounced cultural revolution, and left profound genetic marks in European populations (Currat and Excoffier 2005; Bramanti et al. 2009). In contrast to Europe, the tempo and mode of the Neolithic expansion throughout the North of Africa is still obscure, mostly because of a lack of archaeological and biological data (Barich 2014). Most evidence of the presence of domesticated species in North Africa dates around 8,000 ya and older (Linseele et al. 2014; Madella et al. 2014), suggesting an early Neolithic presence in the region, which is in agreement with the Mediterranean expansion route (Zeder 2008; Isern et al. 2017). Beforehand, one could predict that the transition to a farmer lifestyle in North Africa could have been slower than in Europe due to the arid conditions of the Sahara desert. It is also noteworthy that the region suffered frequent and drastic climatic changes during the Paleolithic, with wet and arid periods (Adkins et al. 2006; McGee et al. 2013), which could have caused major demographic and cultural shifts (Clarke et al. 2016). However, North Africa experienced a long African Humid Period (AHP) between 12,000 and 5,000 ya (Adkins et al. 2006; Dunne et al. 2012; McGee et al. 2013), coincident with the Neolithic expansion period.
Importantly, the Mediterranean Sea was a geographical barrier to migration between Neolithic populations of North Africa and Europe, generating great cultural differences between both regions, exemplified by the expansion of Indo-European languages in Europe (Haak et al. 2015) and the expansion of Afro-Asiatic languages in North Africa (Ruhlen 1991). However, the geographical proximity at the westernmost region of both continents (Strait of Gibraltar) could have provided a migration corridor between the two shores (Bosch et al. 2001; González-Pérez et al. 2003; Rhouda et al. 2009; Botigué et al. 2013), although still little is known about the specific direction and amount of migration through this Strait during the Neolithic.
Here, we aim to evaluate and compare the Neolithic expansion on both shores of the Mediterranean Sea accounting for the role of the Strait of Gibraltar as a potential migration corridor between them. Using an approximate Bayesian computation (ABC) method based on extensive spatially explicit computer simulations, we investigated the population genetics and evolution of the Neolithic expansion in Europe and North Africa. First, we selected the evolutionary scenario of migration through the Strait of Gibraltar that fitted best with the real data and second, under the previously selected evolutionary scenario, we estimated population genetics parameters for the Neolithic expansion at each shore of the Mediterranean Sea. For this purpose, we built an X-chromosome data set that combines data from several SNP data set panels and includes samples from Europe, North Africa, Middle East, and Sub-Saharan Africa. The use of the X-chromosome SNPs allowed us to perform an ABC analysis in a computational cost–benefit manner, and avoids errors of haplotype reconstruction since phasing is directly inferred in male individuals.
Results
We first describe the selection between alternative evolutionary scenarios of migration through the Strait of Gibraltar. After that, we present the parameter estimation under the best fitting evolutionary scenario, with a focus on those parameters that are specific for the Neolithic expansion over each shore of the Mediterranean Sea.
Migration through the Strait of Gibraltar Occurred in Both Directions
The selection of the evolutionary scenario (fullMIG vs. noMIG) presented very high posterior probabilities to identify the correct scenario. We found that all the applied ABC methods were able to identify the correct scenario with probabilities that ranged from 0.78 (rejection method implemented in the abc package) to 0.98 (multinomial logistic regression and neural network methods; supplementary table S5, Supplementary Material online).
Next, real data clearly fitted better with the fullMIG scenario (posterior probabilities around 0.99 under all the ABC methods excepting the rejection method of abc package, 0.85; supplementary fig. S3A, Supplementary Material online). Goodness of fit analyses also indicated this finding (supplementary fig. S4, Supplementary Material online). Altogether the results suggest reciprocal migration through the Strait of Gibraltar.
Given that the real data fitted better with the presence of migration, we next investigated the direction of the migration through the Strait of Gibraltar. The validation of the selection of the evolutionary scenario (nsMIG vs. snMIG) also presented very high posterior probabilities to identify the correct scenario. We found that all the ABC methods performed acceptably to recognize the correct scenario with probabilities that ranged from 0.85 (rejection method implemented in the abc package) to 1.00 (Pritchard‘s rejection method; supplementary table S6, Supplementary Material online).
Next, the fitting of real data with both evolutionary scenarios, nsMIG and snMIG, suggested migration in both directions, although with a higher probability for the nsMIG scenario (supplementary fig. S3B, Supplementary Material online). The results varied among ABC methods but led to the same overall conclusion.
Next, we describe the results (evaluation of model selection and analysis of real data) from the two additional comparisons of evolutionary scenarios, (1) the four evolutionary scenarios together (fullMIG vs. noMIG and nsMIG vs. snMIG) and (2) the evolutionary scenario of bidirectional migration against each evolutionary scenario of unidirectional migration (fullMIG vs. nsMIG and fullMIG vs. snMIG).
Concerning the evaluation of the model selection, when the four scenarios are evaluated together (supplementary table S7, Supplementary Material online), the scenario noMIG was recognized with the highest probability (reaching 0.85) when applying the multinomial logistic regression and the neural network methods, whereas the three other scenarios presented lower probabilities (which is expected since fullMIG can sometimes be near to nsMIG or snMIG, see Materials and Methods) but showed the highest probability with the correct scenario. Concerning comparisons between the scenario of bidirectional migration with each scenario of unidirectional migration, we found that all the ABC methods performed acceptably to recognize the correct scenario with probabilities between 0.65 and 0.85 (supplementary tables S8 and S9, Supplementary Material online). Again despite of fullMIG can sometimes be near to nsMIG or snMIG (Materials and Methods).
Concerning the fitting with real data, the highest probability corresponded to the fullMIG scenario, as expected followed by the scenarios with unidirectional migration since these scenarios are very near to the fullMIG scenario (fig. 1, see also supplementary fig. S4, Supplementary Material online). We found similar results when comparing the scenario of bidirectional migration with each scenario of unidirectional migration (supplementary figs. S5 and S6, Supplementary Material online).

Fitting of the four alternative evolutionary scenarios of migration through the Strait of Gibraltar (fullMIG, noMIG, nsMIG, and snMIG) to the real data. Posterior probabilities of evolutionary scenarios were estimated with four different ABC methods (the rejection method by Pritchard and the rejection, multinomial logistic regression, and neural network methods implemented in the abc package, Csilléry et al. 2012).
Similar Neolithic Expansion at Both Shores of the Mediterranean Sea
The estimation of the population genetics parameters was validated with the tests data sets of the best-fitting evolutionary scenario (fullMIG). These estimations presented a performance with overall small estimation errors that fell within the 50% HPDI with respect to the true value (supplementary table S10, Supplementary Material online), although for some parameters related with very ancestral populations (i.e., ancestral Paleolithic population size) the precision was low, as expected and also found in other studies (Wegmann and Excoffier 2010; Alves et al. 2016). The estimated population genetics parameters for the real data are shown in table 1 and in supplementary figure S7, Supplementary Material online. Again, some parameters (especially those related with Paleolithic populations and ancestral population sizes) presented wide posterior distributions, which is expected due to loss of very ancestral genetic signatures (Fagundes et al. 2007; Wegmann and Excoffier 2010; Alves et al. 2016). However, we preferred to account for this variation instead of fixing parameters’ values since previous studies disagree on specific parameters values (i.e., Benguigui and Arenas 2014). Regarding the Paleolithic population, our estimated date for the onset of the expansion of Anatomically Modern Humans (AMH) in Africa is similar (≈100 kya) to that obtained by Alves et al. (2016), Voight et al. (2005), and Fagundes et al. (2007), and like in those studies we also obtained a wide posterior distribution. The estimated ancestral size (≈15,000) was slightly larger but still in agreement with previous analyses based on other genetic markers and also with wide posterior distributions (Fagundes et al. 2007; Alves et al. 2016). The estimation of the onset of the out-of-Africa expansion (≈70 kya but with a wide posterior distribution) is very similar to that obtained by Alves et al. (2016) and within the intervals obtained by Fagundes et al. (2007). The carrying capacity estimated for the Paleolithic population (≈1,500) was slightly larger than in other studies (Hamilton et al. 2007; Deshpande et al. 2009) but falls within the estimations by Alves et al. (2016). The population growth rate per generation (≈0.70) fell within the range estimated in other studies for hunter–gatherer populations (between 0.2 and 0.9; Alves et al. 2016). The migration rate for Paleolithic populations (≈0.23) also fell within the range estimated by Alves et al. (2016). The estimated interbreeding rate was ≈0.011, which corresponds to ∼71% of admixture between Paleolithic and Neolithic populations (obtained with a polynomial regression based on, Arenas et al. 2013) and falls within the range described in other studies (Bramanti et al. 2009; Lazaridis et al. 2014, 2016; Haak et al. 2015).
Process . | Population Genetics Parameter . | Median . | Mean . | Mode . | 95% HPDI . |
---|---|---|---|---|---|
Time | Initial expansion of modern humans (TSTART)a | 100,313 | 100,182 | 88,041 | 81,048–119,050 |
Out of Africa (TOOA)a | 66,313 | 65,864 | 75,516 | 50,899–79,300 | |
Initial Neolithic Expansion (TNEO)a | 10,425 | 10,279 | 11,321 | 8,174–11,900 | |
Population size and demographics | Ancestral size Paleolithic (NANC) | 13,966 | 13,428 | 16,687 | 5,726–19,625 |
Ancestral size Neolithic (NANCNEO) | 26,917 | 27,354 | 21,314 | 6,272–49,117 | |
Population growth rate Paleolithic (GPALEO) | 0.671 | 0.662 | 0.730 | 0.414–0.890 | |
Population growth rate Neolithic (GNEO) | 0.671 | 0.665 | 0.836 | 0.416–0.888 | |
Migration | Migration rate Paleolithic (MPALEO) | 0.205 | 0.191 | 0.233 | 0.075–0.248 |
Migration rate Neolithic in Europe (MNEOEUR) | 0.210 | 0.199 | 0.232 | 0.092–0.248 | |
Migration rate Neolithic in North Africa (MNEONA) | 0.192 | 0.183 | 0.231 | 0.074–0.248 | |
Migration rate Neolithic in rest of the world (MNEOPER) | 0.167 | 0.163 | 0.223 | 0.056–0.247 | |
Migration rate at Strait of Gibraltar—North to South route (MnsSG) | 0.149 | 0.151 | 0.138 | 0.056–0.245 | |
Migration rate at Strait of Gibraltar—South to North route (MsnSG) | 0.153 | 0.152 | 0.122 | 0.055–0.245 | |
Carrying capacity | Carrying capacity Paleolithic (KPALEO) | 1,477 | 1,414 | 1,845 | 571–1,970 |
Carrying capacity Neolithic in Europe (KNEOEUR) | 5,173 | 4,996 | 6,315 | 2,285–6,928 | |
Carrying capacity Neolithic in North Africa (KNEONA) | 5,034 | 4,841 | 6,155 | 2,280–6,920 | |
Carrying capacity Neolithic in rest of the world (KNEOPER) | 4,681 | 4,655 | 6,384 | 2,224–6,863 | |
Admixture | Interbreeding rate (Irate) | 0.015 | 0.017 | 0.011 | 0.004–0.042 |
Process . | Population Genetics Parameter . | Median . | Mean . | Mode . | 95% HPDI . |
---|---|---|---|---|---|
Time | Initial expansion of modern humans (TSTART)a | 100,313 | 100,182 | 88,041 | 81,048–119,050 |
Out of Africa (TOOA)a | 66,313 | 65,864 | 75,516 | 50,899–79,300 | |
Initial Neolithic Expansion (TNEO)a | 10,425 | 10,279 | 11,321 | 8,174–11,900 | |
Population size and demographics | Ancestral size Paleolithic (NANC) | 13,966 | 13,428 | 16,687 | 5,726–19,625 |
Ancestral size Neolithic (NANCNEO) | 26,917 | 27,354 | 21,314 | 6,272–49,117 | |
Population growth rate Paleolithic (GPALEO) | 0.671 | 0.662 | 0.730 | 0.414–0.890 | |
Population growth rate Neolithic (GNEO) | 0.671 | 0.665 | 0.836 | 0.416–0.888 | |
Migration | Migration rate Paleolithic (MPALEO) | 0.205 | 0.191 | 0.233 | 0.075–0.248 |
Migration rate Neolithic in Europe (MNEOEUR) | 0.210 | 0.199 | 0.232 | 0.092–0.248 | |
Migration rate Neolithic in North Africa (MNEONA) | 0.192 | 0.183 | 0.231 | 0.074–0.248 | |
Migration rate Neolithic in rest of the world (MNEOPER) | 0.167 | 0.163 | 0.223 | 0.056–0.247 | |
Migration rate at Strait of Gibraltar—North to South route (MnsSG) | 0.149 | 0.151 | 0.138 | 0.056–0.245 | |
Migration rate at Strait of Gibraltar—South to North route (MsnSG) | 0.153 | 0.152 | 0.122 | 0.055–0.245 | |
Carrying capacity | Carrying capacity Paleolithic (KPALEO) | 1,477 | 1,414 | 1,845 | 571–1,970 |
Carrying capacity Neolithic in Europe (KNEOEUR) | 5,173 | 4,996 | 6,315 | 2,285–6,928 | |
Carrying capacity Neolithic in North Africa (KNEONA) | 5,034 | 4,841 | 6,155 | 2,280–6,920 | |
Carrying capacity Neolithic in rest of the world (KNEOPER) | 4,681 | 4,655 | 6,384 | 2,224–6,863 | |
Admixture | Interbreeding rate (Irate) | 0.015 | 0.017 | 0.011 | 0.004–0.042 |
Note.—For each parameter, we present the median, mean, and mode of the estimated posterior distribution. Parameters in bold correspond to comparisons between European and North African Neolithic populations. In italic, we indicate the most accurate measure (median, mean, or mode) according to supplementary table S10, Supplementary Material online. 95% HPDI indicates the 95% highest posterior probability.
Time is shown in years, assuming a 25 year generation time.
Process . | Population Genetics Parameter . | Median . | Mean . | Mode . | 95% HPDI . |
---|---|---|---|---|---|
Time | Initial expansion of modern humans (TSTART)a | 100,313 | 100,182 | 88,041 | 81,048–119,050 |
Out of Africa (TOOA)a | 66,313 | 65,864 | 75,516 | 50,899–79,300 | |
Initial Neolithic Expansion (TNEO)a | 10,425 | 10,279 | 11,321 | 8,174–11,900 | |
Population size and demographics | Ancestral size Paleolithic (NANC) | 13,966 | 13,428 | 16,687 | 5,726–19,625 |
Ancestral size Neolithic (NANCNEO) | 26,917 | 27,354 | 21,314 | 6,272–49,117 | |
Population growth rate Paleolithic (GPALEO) | 0.671 | 0.662 | 0.730 | 0.414–0.890 | |
Population growth rate Neolithic (GNEO) | 0.671 | 0.665 | 0.836 | 0.416–0.888 | |
Migration | Migration rate Paleolithic (MPALEO) | 0.205 | 0.191 | 0.233 | 0.075–0.248 |
Migration rate Neolithic in Europe (MNEOEUR) | 0.210 | 0.199 | 0.232 | 0.092–0.248 | |
Migration rate Neolithic in North Africa (MNEONA) | 0.192 | 0.183 | 0.231 | 0.074–0.248 | |
Migration rate Neolithic in rest of the world (MNEOPER) | 0.167 | 0.163 | 0.223 | 0.056–0.247 | |
Migration rate at Strait of Gibraltar—North to South route (MnsSG) | 0.149 | 0.151 | 0.138 | 0.056–0.245 | |
Migration rate at Strait of Gibraltar—South to North route (MsnSG) | 0.153 | 0.152 | 0.122 | 0.055–0.245 | |
Carrying capacity | Carrying capacity Paleolithic (KPALEO) | 1,477 | 1,414 | 1,845 | 571–1,970 |
Carrying capacity Neolithic in Europe (KNEOEUR) | 5,173 | 4,996 | 6,315 | 2,285–6,928 | |
Carrying capacity Neolithic in North Africa (KNEONA) | 5,034 | 4,841 | 6,155 | 2,280–6,920 | |
Carrying capacity Neolithic in rest of the world (KNEOPER) | 4,681 | 4,655 | 6,384 | 2,224–6,863 | |
Admixture | Interbreeding rate (Irate) | 0.015 | 0.017 | 0.011 | 0.004–0.042 |
Process . | Population Genetics Parameter . | Median . | Mean . | Mode . | 95% HPDI . |
---|---|---|---|---|---|
Time | Initial expansion of modern humans (TSTART)a | 100,313 | 100,182 | 88,041 | 81,048–119,050 |
Out of Africa (TOOA)a | 66,313 | 65,864 | 75,516 | 50,899–79,300 | |
Initial Neolithic Expansion (TNEO)a | 10,425 | 10,279 | 11,321 | 8,174–11,900 | |
Population size and demographics | Ancestral size Paleolithic (NANC) | 13,966 | 13,428 | 16,687 | 5,726–19,625 |
Ancestral size Neolithic (NANCNEO) | 26,917 | 27,354 | 21,314 | 6,272–49,117 | |
Population growth rate Paleolithic (GPALEO) | 0.671 | 0.662 | 0.730 | 0.414–0.890 | |
Population growth rate Neolithic (GNEO) | 0.671 | 0.665 | 0.836 | 0.416–0.888 | |
Migration | Migration rate Paleolithic (MPALEO) | 0.205 | 0.191 | 0.233 | 0.075–0.248 |
Migration rate Neolithic in Europe (MNEOEUR) | 0.210 | 0.199 | 0.232 | 0.092–0.248 | |
Migration rate Neolithic in North Africa (MNEONA) | 0.192 | 0.183 | 0.231 | 0.074–0.248 | |
Migration rate Neolithic in rest of the world (MNEOPER) | 0.167 | 0.163 | 0.223 | 0.056–0.247 | |
Migration rate at Strait of Gibraltar—North to South route (MnsSG) | 0.149 | 0.151 | 0.138 | 0.056–0.245 | |
Migration rate at Strait of Gibraltar—South to North route (MsnSG) | 0.153 | 0.152 | 0.122 | 0.055–0.245 | |
Carrying capacity | Carrying capacity Paleolithic (KPALEO) | 1,477 | 1,414 | 1,845 | 571–1,970 |
Carrying capacity Neolithic in Europe (KNEOEUR) | 5,173 | 4,996 | 6,315 | 2,285–6,928 | |
Carrying capacity Neolithic in North Africa (KNEONA) | 5,034 | 4,841 | 6,155 | 2,280–6,920 | |
Carrying capacity Neolithic in rest of the world (KNEOPER) | 4,681 | 4,655 | 6,384 | 2,224–6,863 | |
Admixture | Interbreeding rate (Irate) | 0.015 | 0.017 | 0.011 | 0.004–0.042 |
Note.—For each parameter, we present the median, mean, and mode of the estimated posterior distribution. Parameters in bold correspond to comparisons between European and North African Neolithic populations. In italic, we indicate the most accurate measure (median, mean, or mode) according to supplementary table S10, Supplementary Material online. 95% HPDI indicates the 95% highest posterior probability.
Time is shown in years, assuming a 25 year generation time.
Regarding the Neolithic population, the estimated time for the onset of the Neolithic expansion (around 10 kya) was in agreement with previous findings (Goring-Morris and Belfer-Cohen 2011). As expected, the estimated Neolithic ancestral size (around 23,000) and carrying capacity (≈6,200) were higher than for the Paleolithic population (table 1). However, the population growth rate and migration rate for Neolithic and Paleolithic populations were similar. We did not find significant differences (at the 95% HPDI) between the migration rate from the Iberian Peninsula to the North of Africa and the migration rate in the opposite direction. Indeed and as expected, the migration rate through the Strait (≈0.15) was much lower than the migration rate in other regions, which is indicative of the barrier to gene flow generated by the Strait. Importantly, the Neolithic migration rate through Europe and through the North of Africa presented similar levels (≈0.23). Indeed, the Neolithic carrying capacity for European and North African populations also exhibited similar levels (table 1).
Discussion
The change from the hunter–gatherer lifestyle to the more sedentary farmers of the Neolithic was a major transition in human history (Bocquet-Appel 2011). The Near Eastern Neolithic revolution expanded to several regions, including Europe and North of Africa. However, little is known about the Neolithic expansion in North of Africa and its comparison with the expansion in Europe. Here, we evaluate the Neolithic progression in Europe, a region where the expansion from Anatolia was fast and agriculture was adopted in ≈3,000 years (Bocquet-Appel et al. 2012), and in North Africa, a region still poorly studied. To address this aim, we analyzed the entire X chromosome (16,917 SNPs) from 580 male individuals that belong to 20 populations from Europe, Africa, and Near East.
Migration through the Strait of Gibraltar during the Neolithic is abundantly verified by archaeological and genetic data (Hervella et al. 2016), even in studies based on uniparental markers (Bosch et al. 2001; Gonzalez et al. 2003; González-Pérez et al. 2003; Cruciani et al. 2004; Ennafaa et al. 2009; Rhouda et al. 2009) and genome-wide data (Botigué et al. 2013; Arauna et al. 2017). Although we also detected migration through this strait, we found that the Strait of Gibraltar was a barrier to gene flow since the estimated migration rate through this strait was lower than the estimated migration rates in Europe or in the North of Africa (table 1). Our results support a reciprocal gene flow through the Strait of Gibraltar, which suggests European influence on the genetic diversity of North African populations (and vice versa), increasing their heterogeneity. Concerning North Africa, this is in agreement with the recent study by Arauna et al. (2017), which showed that North African populations present, in general, high levels of genetic heterogeneity and low geographical structure. Note that North Africa has a peculiar geographic landscape that contributed for its complex genetic diversity, limited by the Sahara desert and the Mediterranean Sea and connected to the Middle East through the Sinai and Arabian Peninsulas. Both the Sahara desert and the Mediterranean Sea operated as permeable barriers, not completely blocking the contact between North Africa and the rest of the African continent and Europe (Henn et al. 2012; Botigué et al. 2013; Arauna et al. 2017). Those migrations, together with the particular above described migration through the Strait of Gibraltar, played an important role on current genetic diversity of North African populations (Henn et al. 2012; Arauna et al. 2017). In contrast, Southern Europe might have presented a lower number of different migration routes, leading to a higher population structure (Hofmanová et al. 2016; Baker et al. 2017).
The parameter estimation showed that Neolithic migration and carrying capacity in Europe and North Africa were very similar, even presenting comparable speeds of expansion on both regions. This finding fits well with the existence of: (1) maritime migrations that could potentiate a regular contact between both shores (Fernández et al. 2014; Paschou et al. 2014) and (2) favorable climatic conditions in both Southern Europe and North Africa that facilitated the adoption of the farmer lifestyle (Dunne et al. 2012; Manning and Timpson 2014; Olalde et al. 2015). Archaeological data show that the introduction of the Neolithic in the Iberian Peninsula involved a pioneer expansion of the first farmers by sea (Zilhão 2001) and that this rapid expansion most likely occurred through long-distance dispersal (LDD) events along the coast (Isern et al. 2017). Furthermore, the archaeological evidence in North Africa and the Iberian Peninsula suggests a simultaneous arrival of the Neolithic by seafarers based on similar pottery and manufacturer techniques in both regions (Cortés Sánchez et al. 2012; Linstadter et al. 2012). Also, these results suggest a regular contact between western Mediterranean populations, which could be favored by their geographical proximity (through the Strait of Gibraltar) and are in agreement with our selected evolutionary scenario of reciprocal migration between both shores.
Regarding the speed of the Neolithic range expansion, we performed a set of computer simulations based on the estimated posterior distributions (within the 95% HPDI) to estimate the dates of Neolithic arrival to the Iberian Peninsula and to perform comparisons with other studies (Isern et al. 2017). We found that the timing based on empirical data (Cortés Sánchez et al. 2012; Linstadter et al. 2012; Bernabeu Aubán et al. 2015; Isern et al. 2017) fell within our estimated time intervals, which indeed may provide an additional validation of our models. Next, our models generated a similar time of arrival of Agriculture at the Iberian Peninsula and Western North Africa, which is in agreement with radiocarbon dates of Neolithic plants found in both regions (Morales et al. 2013; Isern et al. 2017).
The transition from hunter–gatherer to farmer lifestyle is still shrouded in controversy with two main hypotheses proposed for the way it took place: cultural or demic diffusion. Archaeological and classical genetic analyses of the European Neolithic expansion suggested a more influential role of the demic diffusion at the continental level (Cavalli-Sforza et al. 1993,, 1995), whereas the cultural diffusion had a secondary role, despite possible regional variations (Fort 2012; Rasteiro et al. 2013; Fort et al. 2015). Recent whole-genome data (Lazaridis et al. 2014,, 2016) reinforce the idea of the Neolithic demic diffusion from the Near East and differential admixture with hunter–gatherers in Europe. The spread of the Neolithic in Europe was estimated to be around 1.0 km per year (Pinhasi et al. 2005), which is in agreement with predictions for the demic diffusion model. Our results suggest a similar scenario for the Neolithic expansion in North Africa considering that this region presented a similar migration rate to the corresponding one in Europe (table 1). However, despite our estimates of a similar general velocity of the Neolithic front-wave in Europe and in North Africa, the Neolithic transition of Northern African populations could have been more heterogeneous. Archaeological evidence denotes a longer period of coexistence of hunter–gatherers and farmers in North Africa than in the Iberian Peninsula, which could be attributed to harsher environmental conditions for crops and cattle production and could lead to a slower Neolithization process by cultural assimilation (Linstadter et al. 2012). Indeed, genetic data show some discontinuity between both shores of the Mediterranean (Botigué et al. 2013) despite the regular contact between the two regions, at least since the emergence of the Neolithic, as indicated by our results and archaeological studies (Linstadter et al. 2012; Mulazzani et al. 2016). Our results follow those based on archaeological data (in terms of timing), which might suggest that the Neolithic cultural component was more important for the contact between both shores than a genetic exchange.
Interestingly, we did not find significant differences between migration rates of Paleolithic and Neolithic populations despite the larger carrying capacities and ancestral sizes presented in the Neolithic. These results, which are in agreement with previous studies (e.g., Pinhasi et al. 2005), could be caused by a space-competition between Mesolithic and Neolithic populations leading to a slowdown of the front wave transition of the Neolithic (Fort and Méndez 1999; Isern and Fort 2010; Isern et al. 2012).
Although the X chromosome is usually ignored in population genetics studies, it encloses some useful features (in comparison to genetic markers on the autosomes, Y chromosome and mtDNA) that we took advantage while performing this study. The X chromosome has a lower recombination rate than the autosomes due to its particular inheritance pattern, with recombination in the X chromosome nonPAR only occurring in females, resulting in the presence of long chromosomal segments that share a common history. Furthermore, haplotypes can be directly inferred in male individuals. In contrast to the Y chromosome and mtDNA, genetic drift has a lower effect on X chromosome markers while having a larger number of loci with potential different histories (Schaffner 2004). Additionally, male haplotypes of a given generation are the result of meiotic recombination in females in the previous generation and half of the X chromosomes in females in the following generation, meaning that for studies involving a large number of generations, problems concerning sex-bias are mostly avoided indicating that results would not be substantially different using autosomal markers. Indeed, Goldberg and Rosenberg (2015) demonstrated that, under a model of constant migration rate with no sex-bias, the admixture fractions for both autosomes and X chromosome are very similar when considering a large number of generations. Interestingly, a recent study based on three X chromosome regions with little or no recombination where European populations were analyzed (Delser et al. 2017), showed evidence of preNeolithic expansions at 15–18 kya and diversity in one of the studied regions supported the existence of a Neolithic expansion around 10 kya. This is in line with our model for a post LGM recolonization of Europe and a Near East Neolithic expansion, respectively. In conclusion, our study and other recent studies (Goldberg and Rosenberg 2015; Delser et al. 2017) showed that the X chromosome per se can provide a large amount of useful information for population genetics analyses aimed to dissect the complex genetic history of modern humans, not merely as a complement for other genetic markers.
Materials and Methods
Data and Genotyping
A data set composed by 20 populations with 846 individuals (580 males and 266 females) was built with published data (fig. 2 and supplementary table S1, Supplementary Material online). The data set included 11 European populations, Finland (FIN), Northern and Western Europeans currently in Utah (CEU), Great-Britain (GBR), Tuscany (TSI), East Spain (ESP), and Central Spain (CSP) populations retrieved from 1,000 Genomes Phase 3 (The 1000 Genomes Project Consortium 2012); Basque (BAS), Andalusia (AND), and Galicia (GAL) populations retrieved from Henn et al. (2012) and 1,000 Genomes Phase 3; and Porto (POR) and Lisbon (LIS) populations retrieved from Lopes et al. (2013). It also included seven North African populations retrieved from Henn et al. (2012) and a Sub-Saharan [Yoruba (YRI)] populations from 1,000 Genomes Phase 3. Finally, a Syrian population retrieved from Arauna et al. (2017).

Location and sample size (580 male individuals) of the 20 populations that composed the studied data set. Populations were divided in eight geographic groups abbreviated as EUR (European populations), BAS (Basque Country population), IBE (Iberian populations), WNA (western North Africa populations), CNA (central North Africa populations), ENA (eastern North Africa populations), SYR (Syrian population), and YRI (Yoruba population).
A quality control filter was applied to the X chromosome data using PLINK 1.07 (Purcell et al. 2007) and SNPs with missing rate >10%, SNPs that failed the Hardy–Weinberg equilibrium under a threshold of 0.05 and SNPs with Minor Allele Frequency (MAF) <0.05 were excluded to avoid biases. In addition, individuals with a missing rate >10% and those with an identity by state, for the autosomes, >85%, were also excluded. Moreover, SNPs within both pseudoautosomal regions (PAR1 and PAR2) and those that were heterozygous in the X specific region in male individuals were also excluded. A total of 16,939 SNPs were maintained to proceed with the phasing step.
Phasing
As a sanity check, the data set was phased using the SHAPEIT2 software (Delaneau et al. 2013), the HapMap PhaseII b37 (The International Hapmap Consortium 2003) with the genetic map and the 1,000 genomes data set as the reference panel (The 1000 Genomes Project Consortium 2012). The data set was phased after an alignment with the reference panel and the removal of SNPs that did not align. Next, all genotypes were phased simultaneously with a total of 100 iterations, half of them for burn-in and pruning stages. A total of 16,917 SNPs were kept for subsequent analyses. For the ABC analysis, only male samples were considered, since X chromosome haplotypes can be directly (without biases) identified.
Spatially Explicit Computer Simulations
Paleolithic and Neolithic expansions were simulated with the evolutionary framework SPLATCHE2 (Ray et al. 2010). SPLATCHE2 is a well-established spatially explicit computer simulator that has been widely used to investigate human evolution (Benguigui and Arenas 2014). This program is based on three major steps. The first one consists of a forward-in-time simulation of the whole population under demographic parameters such as population size at the onset of the simulation, population growth rate and number of generations. This demographic simulation is performed over a 2D landscape (grid of demes) that considers a carrying capacity per deme (number of individuals that can be sustained by the resources of the deme) and where individuals can move between demes according to a migration rate and under a 2D stepping-stone migration model (Kimura and Weiss 1964). The landscape used for this study was obtained by a geographic information system (GIS) and includes 4,814 demes with deme size 150 × 150 km following Alves et al. (2016; fig. 2). A land bridge was added to connect Great Britain with mainland Europe, allowing the colonization of these islands. The second step is a backward-in-time (coalescent) simulation that generates the evolutionary history of a user-specified sample accounting for the history of the whole population stored in the previous step (further details in Currat et al. 2004). In the third step, the program simulates the genetic data of the sample. To perform this task, a randomly selected sequence is assigned to the root node of the previously simulated coalescent history and is evolved going forward-in-time over such coalescent history, according to a substitution model of evolution, to generate genetic sequences for the sampled nodes (Yang 2006; Arenas 2012). Altogether, the simulated genetic data are influenced by the user-specific population genetics parameters (i.e., large population size can lead to high genetic diversity; Ray et al. 2010).
We performed computer simulations following settings described in recent studies (François et al. 2010; Arenas et al. 2013; Alves et al. 2016). The Paleolithic expansion was assumed to start from the Ethiopian region (Pritchard et al. 2000) at TSTART generations ago with an ancestral population size of NANC haploid individuals. At each generation, individuals migrated to the adjacent demes at a migration rate MPALEO and the population density for each deme was determined by the population growth rate GPALEO and the carrying capacity KPALEO (Currat et al. 2004). A range contraction was simulated to mimic the last glacial maximum (LGM) period in Africa and Eurasia described by Alves et al. (2016). Briefly, this range contraction consisted of 30 consecutive events that progressively reduced the habitable regions (i.e., at the North of Europe or increasing deserts size; supplementary fig. S1, Supplementary Material online) of human settlements (Arenas et al. 2012). The range contraction started at 25,000 ya and took 3,000 years to reach its maximum. Next, individuals were only living at the refugia (South of Europe, North Africa, and South of Africa; see supplementary fig. S1, Supplementary Material online) during 4,000 years and finally re-expanded at 18,000 ya (Alves et al. 2016).
In addition, we simulated a Neolithic expansion that started from the Middle East (Zeder 2008) at TNEO generations ago and with an ancestral size at the onset of the expansion of NANCNEO haploid individuals and a population growth rate GNEO. Here, the map was divided into three different geographic regions: Europe, North Africa and peripheral regions (i.e., Arabian Peninsula and Sub-Saharan Africa; fig. 3A). Each of these regions had a specific migration rate (MNEOEUR, MNEONA, and MNEOPER) and carrying capacity (KNEOEUR, KNEONA, and KNEOPER), actually these are the only parameters implemented in SPLATCHE2 with spatio-temporal variation. The contribution of the Paleolithic to the final genetic pool was considered with the parameter interbreeding rate Irate (Barbujani et al. 1995; Currat and Excoffier 2004, 2005, 2011; Currat et al. 2008; François et al. 2010; Arenas et al. 2013), which is assumed to be spatio-temporal invariable in the simulator.

Landscape features for the spatially explicit computer simulations of the Neolithic expansion. (A) 2D map of the three different regions considered for the Neolithic expansion: Europe, North Africa and peripheral regions. Uninhabited regions (i.e., deserts) are shown in light gray. The origin of the simulated Neolithic expansion is shown in black. (B) Alternative evolutionary scenarios of migration through the Strait of Gibraltar: (a) reciprocal migration between both regions (fullMIG), (b) lack of migration between both regions (noMIG), (c) migration allowed only from the Iberia Peninsula to North Africa (nsMIG), (d) migration allowed only from North Africa to the Iberia Peninsula (snMIG).
All the parameters were drawn from Uniform prior distributions (supplementary table S2, Supplementary Material online) based on previous studies (Arenas et al. 2013; Alves et al. 2016) and were estimated with ABC (details below).
Next, based on the previously simulated demographic history, we simulated genetic data for the sample of 580 male individuals of the 20 populations. Each individual included an X chromosome sequence of 16,917 SNPs divided in 14 independent blocks (supplementary table S3, Supplementary Material online). These blocks were determined by the presence of particular regions without recombination based on published data (Tillmar et al. 2008) and based on estimated recombination hotspots for the CEU, POR and YRI populations (chosen arbitrarily) with the program LDhat 2.2 (McVean et al. 2004). LDhat was run following the recommendations provided in the software documentation. We applied a sliding window of 2,000 SNPs with an overlap of 500 SNPs and performed six independent runs (two per population) with 10 million iterations each (supplementary fig. S2A, Supplementary Material online). We specified 2,000 iterations per sampling and a block penalty of 20 for changes in recombination rate ρ (ρ = 4Ner). Following the documentation, 10% of iterations were discarded as burn-in. Convergence between the independent runs was assessed considering a recombination rate threshold of 20; however, most SNPs presented a difference <1 indicating a very high convergence (supplementary fig. S2B–D, Supplementary Material online). To perform the computer simulations, the MAF value was set to be not <0.05 for every block, in line with the SNP filter applied to the real data set, as described above.
Alternative Scenarios
In order to properly study the Neolithic expansion over Europe and over North Africa, we explored genetic signatures of migration through the Strait of Gibraltar during the Neolithic, assuming absence or little migration between these regions before that period (e.g., Timmermann and Friedrich 2016). Although some studies suggested Paleolithic migration through this Strait (Currat et al. 2010), other studies described or assumed lack of this migration (Ferembach 1985; Bocquet-Appel and Demars 2000; Liu et al. 2006; Timmermann and Friedrich 2016). We believe that if Paleolithic migration through the Strait took place, the corresponding genetic signatures could be insignificant due to the small number of migrants and evolutionary processes like genetic drift. Next, Neolithic migration through the Strait of Gibraltar could affect Neolithic populations at both shores of the Mediterranean Sea by exchanging genetic material between them. Therefore, first, we investigated the fitting of alternative evolutionary scenarios to the real data with ABC (details below). One of these scenarios considered migration in both directions between Iberian Peninsula and North Africa (fullMIG, fig. 3Ba), with migration from North to South and from South to North through the Strait of Gibraltar with rates MnsSG and MsnSG, respectively. The other scenario considered lack of migration (noMIG, fig. 3Bb).
Next, since fullMIG better fitted with the real data (see Results), we designed two additional scenarios to compare both directions of migration through the Strait of Gibraltar. Both scenarios allowed migration but under only one direction (anisotropic migration): a scenario of migration from the Iberian Peninsula to North Africa (nsMIG, fig. 3Bc) at MnsSG rate and, other scenario of migration from North Africa to the Iberian Peninsula (snMIG, fig. 3Bd) with rate MsnSG.
Finally, we investigated model selection under some additional combinations of evolutionary scenarios: (1) the four scenarios together (fullMIG vs. noMIG and nsMIG vs. snMIG) and (2) the scenario of bidirectional migration against each scenario of unidirectional migration (fullMIG vs. nsMIG and fullMIG vs. snMIG). However here note that fullMIG includes a different prior distribution for the migration rate towards each direction, whose sampling can sometimes lead to situations (large anisotropic migration) near to nsMIG or snMIG and this must be considered in the analysis of these scenarios.
ABC for Selection of Evolutionary Scenarios and Parameter Estimation
The real data were analyzed with ABC (Beaumont et al. 2002; Beaumont 2010), a statistical approach to obtain posterior probabilities in the selection among alternative scenarios and the estimation of parameters. Next, we describe the different steps of our ABC method.
Computer Simulations
For each evolutionary scenario (of the four above described), we performed a total of 100,000 spatially explicit computer simulations according to the prior distributions described in supplementary table S2, Supplementary Material online.
Summary Statistics (SS)
SS from real and simulated data sets were obtained with ARLEQUIN ver.3.5 (Excoffier and Lischer 2010). To perform this task, the 20 populations were classified into eight geographical groups (described in fig. 2) that allowed the estimation of diverse group-based SS. This grouping is important because the ABC performance reduces quickly as the number of statistics grows (Beaumont 2010). The computed SS included: number of alleles (k), heterozygosity (H), number of segregating sites (s), and FST between groups and populations. Next, we selected a small set (36) of the most informative SS for the estimations (supplementary table S4, Supplementary Material online). This was done by testing the contribution of the different SS to the estimations from 100 simulated data sets (hereafter test data sets) that are independent of the 100,000 simulated data sets used to generate the ABC method. We did not apply partial least square (PLS) components to reduce the number of SS (Wegmann et al. 2009) because our selected SS led to accurate estimations and because PLS components can generate SS without biological meaning and thus being difficult to interpret, and also can generate biases when comparing components that were separately computed (Aeschbacher et al. 2012).
Selection of Evolutionary Scenarios
As noted above, we generated 100,000 simulated data sets for each of the four migration scenarios. In addition, we simulated 100 test data sets for each scenario to evaluate the accuracy of the method. We applied four ABC methods to perform the selection of the best-fitting evolutionary scenario to the real data: the rejection method developed by Pritchard et al. (1999); and the methods rejection, multinomial logistic regression, and neural network, implemented in the abc package of R (Csilléry et al. 2012). Following the authors recommendation we retained 1% of simulations that were closer to the observed data (Csilléry et al. 2012). Before analyzing the real data, the accuracy of the ABC methods was assessed with the test data sets (considering them as true data) and by cross-validation based on 100 permutations and a tolerance of 1% (following the recommendations described in the package abc, Csilléry et al. 2012). Although traditionally the goodness of fit analysis is only applied to the preferred scenario, we performed goodness of fit analyses (based on principal component analysis (PCA) and histogram of null distributions from the selected SS, Csilléry et al. 2012) to the four investigated evolutionary scenarios.
Parameters Estimation
We performed the parameters estimation under the evolutionary scenario (fullMIG), which is the scenario that fitted best with the real data. We applied the rejection method implemented in the abc package (which generated the most accurate inferences for the studied data set) by retaining 1,000 simulated data sets of the closest simulations. The robustness of the parameter estimation was assessed with the 100 test data sets simulated under the fullMIG scenario, considering them as true data. Then, we computed the distance between the true value and the estimated value (median, mean, and mode) from each of the 100 test data sets.
Supplementary Material
Supplementary data are available at Molecular Biology and Evolution online.
Acknowledgments
We would like to thank Lara R. Arauna (Universitat Pompeu Fabra) for technical advice with the data sets used. IPATIMUP integrates the i3S Research Unit, which is partially supported by FCT in the framework of the project “Institute for Research and Innovation in Health Sciences” (POCI-01-0145-FEDER-007274). J.P., A.M.L., and M.A. were funded by the Portuguese Government through the FCT fellowship SFRH/BD/97200/2013 and research contracts IF/01262/2014 and IF/00955/2014, respectively. M.A. was also supported by the “Ramón y Cajal” grant RYC-2015-18241 from the Spanish Government. D.C. was supported by the Spanish MINECO grant CGL2013-44351-P.
References
Author notes
Associate editor: Evelyne Heyer