The details of Eurasian back-migration into Africa

Carl Zimmer has an interesting write-up on the new method to detect Neanderthal ancestry in Africa, Neanderthal Genes Hint at Much Earlier Human Migration From Africa. There are two quotes from researchers that are of note.

First, from David Reich:

Despite his hesitation over the analysis of African DNA, Dr. Reich said the new findings do make a strong case that modern humans departed Africa much earlier than thought.

“I was on the fence about that, but this paper makes me think it’s right,” he said.

It’s possible that humans and Neanderthals interbred at other times, and not just 200,000 years ago and again 60,000 years ago. But Dr. Akey said that these two migrations accounted for the vast majority of mixed DNA in the genomes of living humans and Neanderthal fossils.

Over the years I have had several discussions with members of the Reich lab about whether there was a major migration of the antecedent lineage of modern humans before the one that we detect 60,000 years ago. Many were quite skeptical because of the lack of clear genetic signal of anything before 60,000 years ago, as well as its correlation with a strong archaeological record. But, it seems now that David Reich at least is convinced that the evidence of admixture into Neanderthals means that there were descendants of the same lineage that led to the major “Out of Africa” expansion 60,000 years ago who had spread earlier (though the footprint was small, and their impact on later humans difficult to detect).

Second, Sarah Tishkoff says something that I forgot to mention in my earlier post:

Sarah Tishkoff, a geneticist at the University of Pennsylvania, is doing just that, using the new methods to look for Neanderthal DNA in more Africans to test Dr. Akey’s hypothesis.

Still, she wonders how Neanderthal DNA could have spread between populations scattered across the entire continent.

The second part isn’t that inexplicable. In the paper, they mention that they don’t have the power to analyze small sample numbers. So they focused on the 1000 Genomes samples, which are from West and East Africa, and from agriculturalist and agro-pastoralist populations. If you listen to this week’s episode of The Insight, Spencer and I talk extensively about the recent agriculturally mediated expansions within Africa. Much of the genetic landscape of the continent is new, and of short historical time-depth. The Africa of Old Kingdom Egypt, 4,500 years ago, was very different.

As hinted by Tishkoff the key is going to be when we get samples from hunter-gatherers. Some of these have much lower Eurasian affinities, and likely they’ll carry less Neanderthal ancestry.

On a final note, this paper and the first author, Joshua Akey, hint at some resolution in the interminable disagreement about continuous gene flow vs. pulse admixture. Some of the methods to infer and detect admixture assume pulse admixture, and so our conception of the past has been skewed. On the other hand, I think it is plausible that in a patchy low population density Paleolithic landscape continuous gene flow may have been quite attenuated over long distances. Admixture then would occur when there were cultural revolutions and long-distance contact for short periods of time, before an equilibration. Basically, it’s some of both.

The great Han Empire in Africa

Howard French’s China’s Second Continent: How a Million Migrants Are Building a New Empire in Africa is a bit cliche. Rather than a scholarly book it’s more an observational travelogue, and it suffers somewhat from the fact that it is focused on Chinese who live in Africa, but are never of it. Chinese are Chinese, and those who migrate to Africa have more commonalities than most. So French’s attempts to spin out distinct experiences were a bit stretched. Basically, the same thing is happening over and over across the African continent.

When I say that it is cliche, I’m alluding to the fact that for many the Chinese presence in Africa is rather well known. But the reality is not everyone knows about it. So I was happy to see The New York Times put this issue front and center, Is China the World’s New Colonial Power?

There are twists which are important to remember. First, China’s working age population has been declining since 2012. This is going to put a crimp in any “imperial” ambitions. Second, this Chinese “empire” is not going to be an explicitly political one, but rather one of influence, control, and tough soft power.

That being said, we shouldn’t underestimate the will and need of the Chinese to have their “time in the sun.” Fifteen years ago Ross Terrill wrote The New Chinese Empire. In it he observed that for much of Chinese history there has been a division between a moralistic/ideological camp and a more nationalist realpolitik element. He traces this division back to antiquity, with Confucianism and Legalism as the prototypes (I’m not sure I believe this). But Terrill observed that Deng Xiaoping and the leaders whom he cultivated and promoted to succeed him were generally much more nakedly nationalistic than Mao ever was.

Just something to keep in mind as we look to the future….

Beyond “Out of Africa” and multiregionalism: a new synthesis?

For several decades before the present era there have been debates between proponents of the recent African origin of modern humans, and the multiregionalist model. Though molecular methods in a genetic framework have come to the fore of late, these were originally paleontological theories, with Chris Stringer and Milford Wolpoff being the two most prominent public exponents of the respective paradigms.

Oftentimes the debate got quite heated. If you read books from the 1990s, when multiregionalism in particular was on the defensive, there were arguments that the recent out of Africa model was more inspirational in regards to our common humanity. As a riposte the multiregionalists asserted that those suggesting recent African origins with total replacement were saying that our species came into being through genocide.

Though some had long warned against this, the dominant perception outside of population genetics was that results such as the “mitochondrial Eve” had given strong support to the recent African origin of modern humans, to the exclusion of other ancestry. 2002’s Dawn of Human Culture took it for granted that the recent African origin of modern humans to the total exclusion of other hominin lineages was established fact.

In 2008 I went to a talk where Svante Paabo presented some recent Neanderthal ancient mtDNA work. It was rather ho-hum, as Paabo showed that the Neanderthal lineages were highly diverged from modern ones, and did not leave any descendants. Though of course most modern human lineages did not leave any descendants from that period, Paabo took this as evidence supporting the proposition that Neanderthals did not contribute to the modern human gene pool.

When his lab reported autosomal Neanderthal admixture in 2010, it was after initial skepticism and shock internally. I know Milford Wolpoff felt vindicated, while Chris Stringer began to emphasize that the recent African origin of modern humanity also was defined by regional assimilation of other lineages. The data have ultimately converged to a position somewhere between the extreme models of total replacement or balanced and symmetrical gene flow.

This is not surprising. Extreme positions are often rhetorically useful and popular when there’s no data. But reality does not usually conform to our prejudices, so ultimately one has to come down at some point.

The data for non-Africans is rather unequivocal. The vast majority (>90%) of the ancestry of non-Africans seems to go back to a small number of common ancestors ~60,000 years ago. Perhaps in the range of ~1,000 individuals. These individuals seem to be a node within a phylogenetic tree where all the other branches are occupied by African populations. Between this period and ~15,000 years ago these non-Africans underwent a massive range expansion, until modern humans were present on all continents except Antarctica. Additionally, during the Holocene some of these non-African groups also experienced huge population growth due to intensive agricultural practice.

To give a sense of what I’m getting at, the bottleneck and common ancestry of non-Africans goes back ~60,000 years, but the shared ancestry of Khoisan peoples and non-Khoisan peoples goes back ~150,000-200,000 years. A major lacuna of the current discussion is that often the dynamics which characterize non-Africans are assumed to be applicable to Africans. But they are not.

A 2014 paper illustrates one major difference by inferring effective population from whole genomes: African populations have not gone through the major bottleneck which is imprinted on the genomes of all non-African populations. The Khoisan peoples, the most famous of which are the Bushmen of the Kalahari, have the largest long term effective populations of any human group. The Yoruba people of Nigeria have a history where they were subject to some population decline, but not to the same extent as non-Africans.

What do we take away from this?

One thing is that we have to consider that the assimilationist model which seems to be necessary for non-Africans, also applies to Africans. For years some geneticists have been arguing that some proportion of African ancestry as well is derived from lineages outside of the main line leading up to anatomically modern humans. Without the smoking gun of ancient genomes this will probably remain a speculative hypothesis. I hope that Lee Berger’s recent assertion that they’ve now dated Homo naledi to ~250,000 years before the present may offer up the possibility that ancient DNA will help resolve the question of African archaic admixture (i.e., if naledi is related to the “ghost population”?).

The second dynamic is that the bottleneck-then-range-expansion which is so important in defining the recent prehistory of non-Africans is not as relevant to Africans during the Pleistocene. The very deep split dates being inferred from whole genome analysis of African populations make me wonder if multiregional evolution is actually much more important within Africa in the development of modern humans in the last few hundred thousand years. Basically, the deep split dates may highlight that there was recurrent gene flow over hundreds of thousands of years between different closely related hominin populations in Africa.

Ultimately, it doesn’t seem entirely surprising that the “Out of Africa” model does not quite apply within Africa.

Addendum: Over the past ~5,000 years we have seen the massive expansion of agricultural populations within the continent. The “deep structure” therefore may have been erased to a great extent, with Pygmies, Khoisan, and Hadza being the tip of the iceberg in terms of the genetic variation which had characterized Africa during the Pleistocene.

“Out of Africa” bottleneck is what really matters for mutations


At least in relation to mutational load, if you read a new preprint on bioRxiv, The demographic history and mutational load of African hunter-gatherers and farmers:

The distribution of deleterious genetic variation across human populations is a key issue in evolutionary biology and medical genetics. However, the impact of different modes of subsistence on recent changes in population size, patterns of gene flow, and deleterious mutational load remains to be fully characterized. We addressed this question, by generating 300 high-coverage exome sequences from various populations of rainforest hunter-gatherers and neighboring farmers from the western and eastern parts of the central African equatorial rainforest. We show here, by model-based demographic inference, that the effective population size of African populations remained fairly constant until recent millennia, during which the populations of rainforest hunter-gatherers have experienced a ~75% collapse and those of farmers a mild expansion, accompanied by a marked increase in gene flow between them. Despite these contrasting demographic patterns, African populations display limited differences in the estimated distribution of fitness effects of new nonsynonymous mutations, consistent with purifying selection against deleterious alleles of similar efficiency in the different populations. This situation contrasts with that we detect in Europeans, which are subject to weaker purifying selection than African populations. Furthermore, the per-individual mutation load of rainforest hunter-gatherers was found to be similar to that of farmers, under both additive and recessive modes of inheritance. Together, our results indicate that differences in the subsistence patterns and demographic regimes of African populations have not resulted in large differences in mutational burden, and highlight the role of gene flow in reshaping the distribution of deleterious genetic variation across human populations.

There are two major moving parts in this preprint. First, they use phylogenomic methods to explicitly model population history. Second, they integrate their demographic results in generating and interpreting the distribution of mutations within the exomes of these populations. That is, they used phylogenomics to gain insight into population genomics, as the latter focuses more on the parameters which define variation within a population.

The data they worked with was from the exome, the regions of the genome which code for proteins. That’s ~30 million bases. They get really good precision due to high coverage, hitting each site about 70 times. Their sample was about 300 Africans and 100 Europeans, and they got ~500,000 polymorphisms or variants for their trouble.
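To put those figures in perspective, here is a back-of-the-envelope sketch of the sequencing volume they imply. The exome size, coverage, sample counts, and variant count are the round numbers from the paragraph above; the function and variable names are mine, and the arithmetic ignores real-world details such as uneven coverage and off-target reads.

```python
# Rough sequencing arithmetic using the approximate figures quoted in the text.
EXOME_BASES = 30_000_000   # ~30 million bases in the exome
MEAN_COVERAGE = 70         # each site read about 70 times
N_AFRICANS = 300
N_EUROPEANS = 100
N_VARIANTS = 500_000       # ~500,000 polymorphisms found


def sequenced_bases(target_bases, coverage, n_samples):
    """Total bases read if every target base is covered `coverage` times."""
    return target_bases * coverage * n_samples


per_sample = sequenced_bases(EXOME_BASES, MEAN_COVERAGE, 1)
total = sequenced_bases(EXOME_BASES, MEAN_COVERAGE, N_AFRICANS + N_EUROPEANS)

# Average spacing between variable sites across the whole sample.
bases_per_variant = EXOME_BASES / N_VARIANTS

print(f"per sample: {per_sample:,} bases")
print(f"whole study: {total:,} bases")
print(f"about one variant every {bases_per_variant:.0f} bases")
```

So on these rough numbers each exome is about 2.1 billion sequenced bases, and the variable sites in the combined sample are spaced about 60 bases apart on average.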

The populations were labeled by subsistence and provenance. The Europeans were Belgians. For the Africans they had two groups of hunter-gatherer Pygmies, and two groups of Bantu agriculturalists, sampled from western and eastern locations as you see on the map above.

The admixture plots, which separate out individuals into K populations, break out in a way that makes sense. First, Europeans separate, and the eastern agriculturalist populations have a little bit of evidence of European-like ancestry. This is almost certainly Middle Eastern farmer ancestry, which has been found in many East African populations, and those populations which have mixed with them. Then the hunter-gatherers separate from the agriculturalists. This is in line with expectation and earlier research; the hunter-gatherers of Africa seem very different from the agriculturalists, and are actually more closely related to each other than to the agriculturalists in their neighboring regions.

The exception to this pattern is caused by recent gene flow, which is clearly evident above. Due to population size differences it looks like there is more agricultural ancestry in the Pygmies than vice versa. I wish that they had sampled Mbuti Pygmies. I’m told that this group has the least agricultural admixture.

But then they decided to get fancy and explicitly model demographic histories with fastsimcoal2. What does this do? From the website for the software:

While preserving all the simulation flexibility of simcoal2, fastsimcoal is now implemented under a faster continous-time sequential Markovian coalescent approximation, allowing it to efficiently generate genetic diversity for different types of markers along large genomic regions, for both present or ancient samples. It includes a parameter sampler allowing its integration into Bayesian or likelihood parameter estimation procedure.

fastsimcoal can handle very complex evolutionary scenarios including an arbitrary migration matrix between samples, historical events allowing for population resize, population fusion and fission, admixture events, changes in migration matrix, or changes in population growth rates. The time of sampling can be specified independently for each sample, allowing for serial sampling in the same or in different populations.

The models you see that were tested are pretty simple, and they all seem plausible I suppose. Their simulations suggested that the three above scenarios, with alternative branching patterns and various gene flows, were all of equal likelihood. That is, given the models and the data that they had (4-fold synonymous sites which are likely to be neutral) you can’t distinguish which is right.

In all the models hunter-gatherers diverged relatively recently and so did the agriculturalists. Europeans, who are stand-ins for all non-Africans in this scenario, diverged pretty early from the Africans. But how the Africans relate to each other and Europeans is not totally clear. Why? Because of ancient population structure. It is becoming rather obvious now that ~100,000 years ago, and earlier, there were many different modern human lineages which had already diversified. The Khoisan seem to have diverged from other human lineages closer to 200,000 than 100,000 years ago. What this means is that for most of the history of anatomically modern humans population structure existed between distinct lineages. And some of that persists down to today within Africa.

I’ll bullet point some of their inferences from these models (verbatim quotes below):

  1. Our results suggest that the ancestors of the contemporary RHG, AGR and EUR populations diverged between 85 and 140 thousand years ago (kya), from an ancestral population that underwent demographic expansion between 173 and 191 kya
  2. After the initial population splits, the Ne of AGR and RHG (NaAGR and NaRHG) remained within a range extending from 0.55 to 2.2 times the ancestral African Ne (NHUM), whereas EUR (NaEUR) experienced a decrease in Ne by a factor of three to seven.
  3. The ancestors of the wRHG and eRHG populations diverged 18 to 20 kya (TRHG), and underwent a decreased in Ne by a factor of 3.8 to 5.7 for the wRHG (NwRHG) and 7.1 to 11 for the eRHG (NeRHG), regardless of the branching model considered.
  4. The ancestors of the AGR (NaAGR) split into western and eastern populations 6.7 to 11 kya (TAGR), and underwent a mild expansion, by a factor of 2.3 to 3.1 for the wAGR (NwAGR) and 1.2 to 2.2 for the eAGR (NeAGR).
  5. The EUR population experienced a 7.1- to 8.3-fold expansion (NEUR) 12 to 22 kya (TEUR).

No results are perfect. But some of these dates do not make sense. There’s a lot of circumstantial evidence that the ancestors of European populations began to expand over the last 10,000 years. The dates above suggest there was a Pleistocene expansion. Basically you can divide that value by two, and then you get a reasonable range.
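The halving is simple enough to check by hand, but here is a small sketch of it anyway. The 12 to 22 kya range is the TEUR estimate quoted in the list above, and the 10,000-year window comes from my own comment; the helper names are hypothetical, and the factor of one half is my rule of thumb, not a correction derived from the preprint.

```python
# Rescale the inferred European expansion time and compare it with the
# "last 10,000 years" window mentioned in the text. All times are in kya.

def scale_interval(interval_kya, factor):
    """Multiply both ends of a (low, high) interval by `factor`."""
    low, high = interval_kya
    return (low * factor, high * factor)


def overlaps(a, b):
    """True if closed intervals a and b share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


EUR_EXPANSION_KYA = (12, 22)  # TEUR range quoted from the preprint
LAST_10K_KYA = (0, 10)        # window suggested by the circumstantial evidence

halved = scale_interval(EUR_EXPANSION_KYA, 0.5)

print("inferred:", EUR_EXPANSION_KYA, "overlaps window:",
      overlaps(EUR_EXPANSION_KYA, LAST_10K_KYA))
print("halved:  ", halved, "overlaps window:",
      overlaps(halved, LAST_10K_KYA))
```

As quoted, the inferred range sits entirely outside the last 10,000 years, while the halved range of 6 to 11 kya mostly falls inside it.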

Second, both of the agriculturalist groups sampled here are Bantu-speaking, and there’s a good amount of cultural and genetic data for recent shared ancestry of the Bantu over the last 3,000 years. I understand that admixture with a very diverged lineage (e.g., eastern Bantu agriculturalist samples mixing with Nilotic populations, which is how they got some non-African ancestry, as well as local Pygmy groups) can inflate these divergence dates. If that’s the case, they should note that in the text.

We don’t have much historical or archaeological clarity from what I know about divergences between Pygmy groups. This particular group has studied the topic and published on it before, so I’m inclined to trust them more than anyone else. But, the above dates for groups we do know make me a bit more skeptical of a simple divergence around the Last Glacial Maximum.

Then there are the earliest divergences. An 85,000 to 140,000 year interval is huge for when non-Africans split off from Africans. If closer to 140 than 85, then that means that non-African divergence from Africans preserves ancient African diversity. That is, non-Africans descend from an African group that no longer exists (or has not been sampled in this study at least!). I’ve poked around this question, and when you take into account recent gene flow, it is hard to find the specific African group that non-Africans descend from, though there is some consensus that they branched off from the non-Khoisan Africans later than from the Khoisan.

But there is also a lot of archaeological and some ancient DNA evidence now that indicates that the vast majority of non-African ancestry began to expand rapidly around 50-60,000 years ago. This is tens of thousands of years after the lowest value given above. Therefore, again we have to resort to a long period of separation before the expansion. This is not implausible on the face of it, but we could do something else: just assume there’s an artifact with their methods and the inferred date of divergence is too old. That would solve many of the issues.

I really don’t know if the above quibbles have any ramifications for the site frequency spectrum of deleterious mutations. My own hunch is that no, they don’t impact the qualitative results at all.
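For readers who haven’t run into the term, the site frequency spectrum is just a histogram: for each variable site you count how many sampled chromosomes carry the derived allele, then tally how many sites fall at each count. Here is a minimal sketch with made-up toy counts. The function name and data are illustrative only, and this is not the preprint’s pipeline.

```python
# Minimal unfolded site frequency spectrum (SFS) from per-site derived allele
# counts. The toy data below are invented for illustration.

def site_frequency_spectrum(derived_counts, n_chromosomes):
    """Return a list where entry i-1 is the number of sites at which the
    derived allele is seen on exactly i chromosomes (1 <= i <= n-1).
    Sites that are fixed (count 0 or n) are not polymorphic and are skipped."""
    sfs = [0] * (n_chromosomes - 1)
    for count in derived_counts:
        if 0 < count < n_chromosomes:
            sfs[count - 1] += 1
    return sfs


# Ten hypothetical sites scored in 3 diploid individuals (6 chromosomes).
toy_counts = [1, 1, 2, 1, 5, 3, 1, 2, 0, 6]
toy_sfs = site_frequency_spectrum(toy_counts, 6)
print(toy_sfs)  # [4, 2, 1, 0, 1]
```

An excess of sites in the first entries (rare variants) relative to a neutral expectation is the kind of signal one reads as purifying selection or recent growth, which is why demography and selection have to be modeled together.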

Figure 3 clearly shows that Europeans are enriched for weak and moderately deleterious mutations (the last category produces weird results, and I wish they’d talked about this more, but they observe that strongly deleterious mutations have issues getting detected). Ne is just the effective population size and s is the selection coefficient (bigger number, stronger selection).
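What matters for how well selection works is the product of Ne and s, not s by itself. Here is a rough sketch of how a mutation gets binned by that scaled coefficient. The four categories mirror the way the figure is described above, but the bin edges (1, 10, 100) are my assumption about a common convention, not values taken from the preprint, and the population sizes are invented.

```python
# Bin a mutation by its scaled selection coefficient |Ne * s|.
# ASSUMPTION: the bin edges below are a common convention, not the preprint's.
BINS = [
    (1.0, "nearly neutral"),
    (10.0, "weakly deleterious"),
    (100.0, "moderately deleterious"),
]


def selection_class(ne, s):
    """Classify a mutation with selection coefficient `s` in a population of
    effective size `ne`; anything at or above the last edge is 'strongly'."""
    scaled = abs(ne * s)
    for upper, label in BINS:
        if scaled < upper:
            return label
    return "strongly deleterious"


# Same mutation (s = 0.005) in a large and in a bottlenecked population.
# Both Ne values are hypothetical round numbers.
print(selection_class(10_000, 0.005))  # moderately deleterious
print(selection_class(1_000, 0.005))   # weakly deleterious
```

The same mutation slides toward the “weak” end when Ne drops, which is the intuition behind a bottleneck letting more mildly harmful variants drift up in frequency.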

Why are the middle two values enriched? Presumably it’s the non-African bottleneck. This is where another non-African population would have been a nice check to make sure that it was the “Out of Africa” bottleneck…but it’s probably asking a bit much to sequence more individuals to 70x coverage.

The lack of difference between the African populations is an indication that recent demography is not shaping the distribution much. Additionally, they note that gene flow between the African groups probably increased diversity in some ways, so that as long as a group is connected with other populations it will probably be rescued (note that none of these in their data were particularly inbred, judging by runs of homozygosity).

Finally, they found that the number of homozygote mutations that were deleterious is higher in their model results for Europeans than the African groups. This is not surprising, and what one expects. But, they found that this is likely a function of continuous gene flow between the African groups. Without gene flow homozygosity would have been much higher. This gets back to the fact that gene flow is a powerful homogenizing tool, and the lack of gene flow has to be pretty extreme for divergence to occur.

Which brings us back to the “Out of Africa” event. The next ten years are going to see a lot of investigation of African phylogenomics and population genomics. Basically, the relationships, and selection pressures. It is totally implausible that Bantu groups in Kenya and Tanzania did not absorb local non-Nilotic populations. We’ll figure that out. Additionally, selection pressures are probably different between different groups. We’ll know more about that. But, ancient DNA will probably give us some understanding of why non-Africans went through such a massive demographic sieve. We know in broad sketches. But most people want to fill in the details.

Citation: The demographic history and mutational load of African hunter-gatherers and farmers, Marie Lopez, Athanasios Kousathanas, Helene Quach, Christine Harmant, Patrick Mouguiama-Daouda, Jean-Marie Hombert, Alain Froment, George H Perry, Luis B Barreiro, Paul Verdu, Etienne Patin, Lluis Quintana-Murci, doi: https://doi.org/10.1101/131219