Very ancient ghosts in the African genome

The above figure is from a preprint (updated from last year), Recovering signals of ghost archaic introgression in African populations. But to truly get a sense of this preprint, I would highly recommend you read the supplementary material. And, to be honest, also a publication from 2007, The Joint Allele-Frequency Spectrum in Closely Related Species, since the core of the method used in the preprint is developed in that paper.

Here is the abstract:

While introgression from Neanderthals and Denisovans has been well-documented in modern humans outside Africa, the contribution of archaic hominins to the genetic variation of present-day Africans remains poorly understood. Using 405 whole-genome sequences from four sub-Saharan African populations, we provide complementary lines of evidence for archaic introgression into these populations. Our analyses of site frequency spectra indicate that these populations derive 2-19% of their genetic ancestry from an archaic population that diverged prior to the split of Neanderthals and modern humans. Using a method that can identify segments of archaic ancestry without the need for reference archaic genomes, we built genome-wide maps of archaic ancestry in the Yoruba and the Mende populations that recover about 482 and 502 megabases of archaic sequence, respectively. Analyses of these maps reveal segments of archaic ancestry at high frequency in these populations that represent potential targets of adaptive introgression. Our results reveal the substantial contribution of archaic ancestry in shaping the gene pool of present-day African populations.

To get a sense of how much work went into this preprint, really do read the supplementary material. The step-by-step analysis convinced me pretty thoroughly that these results are not due to straightforward errors in the genotypes or in how the genotypes were classified. Such things do happen, so it was nice to see the authors be very careful about that.

The key point is that the conditional site frequency spectrum (CSFS) in West Africans does not align with theoretical expectations. The condition here is the allele state in the archaic outgroup, generally the Vindija Neanderthal. The authors ran a bunch of simulations and models and found a subset that could produce the CSFS they see, the U-shaped distribution. It is represented by the graph you see at the top-right. Basically, it is a scenario where an archaic lineage that diverged from the other human lineages before the Neanderthal-Denisovan lineage left Africa contributed to the ancestry of West Africans within the last ~100,000 years (the most likely time is ~50,000 years ago).
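To make the CSFS idea a bit more concrete, here is a minimal Python sketch. It is not the authors' code, and the function names and the toy data are made up for illustration. It assumes each site has already been polarized into ancestral and derived alleles, and it simply tallies derived-allele counts in the modern sample at sites where the archaic genome carries the derived allele. The U-shape check is a crude heuristic, not a statistical test.

```python
def conditional_sfs(derived_counts, archaic_derived, n):
    """Tally a conditional site frequency spectrum (CSFS).

    derived_counts: derived-allele count per site among n sampled chromosomes
    archaic_derived: per site, True if the archaic genome carries the derived allele
    n: number of sampled modern chromosomes

    Returns a list of length n - 1; entry k - 1 is the number of sites that are
    derived in the archaic and have exactly k derived copies in the sample.
    Sites that are fixed (k == n) or absent (k == 0) in the sample are skipped.
    """
    spectrum = [0] * (n - 1)
    for count, is_derived in zip(derived_counts, archaic_derived):
        if is_derived and 0 < count < n:
            spectrum[count - 1] += 1
    return spectrum


def looks_u_shaped(spectrum):
    """Crude heuristic: both ends of the spectrum exceed the middle bin."""
    middle = spectrum[len(spectrum) // 2]
    return spectrum[0] > middle and spectrum[-1] > middle


# Toy data (hypothetical): 8 sites, 4 sampled chromosomes.
toy_counts = [1, 1, 2, 3, 3, 0, 4, 2]
toy_archaic = [True, True, True, True, True, True, True, False]

toy_csfs = conditional_sfs(toy_counts, toy_archaic, n=4)
print(toy_csfs, looks_u_shaped(toy_csfs))
```

With real data the counts would come from something like a VCF, and the interesting part is comparing the observed spectrum against spectra simulated under different demographic models, which is what the preprint does at scale.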

This is not a new finding at the highest level of generality. Jeff Wall has been beating this drum for nearly 15 years. For example, Genetic evidence for archaic admixture in Africa.

What has changed is that whole-genome sequencing, including high-quality sequences of ancient hominins, has allowed for a more robust exploration of the topic. The analysis of site frequencies was really not useful 20 years ago without genome-wide data. More data has allowed for more subtle methods.

Within the supplements, the authors are quite modest that many elements of their model are likely to be wrong. The bigger picture though is that they believe they are capturing some general dynamics. It seems rather clear from multiple lines of evidence in the preprint, as well as earlier work, that there are strong suggestions of very deep structure within Africa that assimilated into an expanding modern human population. They actually tested for a scenario of continuous gene flow, and a rapid pulse admixture of the 2-19% is a better fit to the data.

Additionally, there are peculiarities which they haven’t resolved in their results. The Luhya give really bizarre numbers, and the authors don’t have a good explanation for it. It could be a problem with their model specification in some deep way, or the history of East Africa (the Luhya are a Bantu group who mixed with East Africans) is more complicated than we may have understood.

They also did some cool things identifying possible introgressed segments. Their methods seem to agree on the regions, and with older literature which had earlier identified these as targets for introgression. Finally, there was also some validation of the finding that West Africans may have some “Basal modern human” ancestry. That is, ancestry from the modern human lineage that split off first from everyone else.

As the coverage of populations and the number of genome sequences in Africa increases, we will probably get more resolution. I do wonder at how computationally intensive some of this work is, and how many moving parts there are. Replicating this work is doable, as all the code is provided, but it would take time.

In general, these results align with most of my priors, so I am pretty confident they’ve grasped onto a thread of reality here. I would, wouldn’t I? Basically, ~50,000 years ago there was a massive expansion of a core modern human lineage which absorbed other human groups as it expanded outward. Though the easiest explanation is that it was one group, the Holocene agricultural expansion should tell us that sometimes differently related groups in close proximity can undergo the same cultural revolution and expand in different directions.

Note: It is clear that admixture from “super-deep” lineages is going to be the next big thing. See Alan Rogers’ recent work.

6 thoughts on “Very ancient ghosts in the African genome”

  1. Hello Razib! Layman again here.

    Am I right in reading the Unknown Archaic as the group whose ancestry diverged from the ancestors of modern humans/Neanderthals/Denisovans 1.4 million years ago then admixed with Denisovans roughly 200k years ago? Also, it seems the chart is showing another group whose ancestors diverged from modern humans over a million years ago and mixed back in roughly 50-100k years ago.

    That is an incredibly deep time apart to be mixed in again! I can’t even begin to fathom what that looks like. Is there any period longer than these we know of where two populations were split then later merged? To my layman eyes that’s just nothing short of stunning. I mean mind boggling. I just want to say if there’s any way to draw interest into this subject this would be a good start.

    I really, really hope the rest of the African continent is explored and the studies into this deep ancestry get….deeper. Good on Reich and co for this revolution.

  2. Hi Razib,
    I read the link provided but it was not clear if the Archaic introgression into Africans is also present in non Africans.
    The estimated time of admixture is wide enough to suggest that either before or after OOA is possible.
    I am guessing it is not shared, but wondered if it was clear to you?

  3. One buried finding in the supplemental text:

    “We observe that the CEU CSFS, like the CSFS in the African populations, is U-shaped with an increase in the counts of high frequency alleles. (Figures S4 and S6). Further, we also observe a U-shape in the CSFS computed in the Han Chinese (CHB) population (Figure S6). These results suggest that a component of the archaic ancestry that we detect in African populations is shared with non-African populations. We view these results as provisional…”

    which is pretty important, if true! And which also reminds me of this recently published work:

    Approximate Bayesian computation with deep learning supports a third archaic introgression in Asia and Oceania

  4. Could someone help me, a novice, understand why the range of purported archaic admixture (~2-19%) is so wide? Especially if we consider that the 4 tested West African populations are all closely related. Elsewhere in the supplementary documents the rate of purported archaic admixture is narrowed to ~6-7%.

  5. According to the image at the top (top right), the 2-19% of archaic “Unknown Hominin” admixture seems to have mixed into the common ancestors of Africans and Europeans/non-Africans (modern humans pre-OOA), implying that it (or some of it) is also present in non-Africans—which fits with the above mentioned finding from the supplements (cited by the commenter above) that the admixture, or at least some/a component of the admixture, is shared by both African and non-African populations.

  6. @Mwami I’d say that Bantu populations are heavily admixed and, as the authors admit, that made interpreting the results for the modalised Luhya population much harder, as they are basically [West-Central African+Nilotic+Eurasian via Cushitic pastoralists], similar to basically all the great-lakes Bantu [Ganda, Banyarwanda, Soga, Sukuma, Pare etc] even though phenotypically these groups are very different. I think the Luhya sample used so extensively [LWK – Webuye] needs to be updated to include Luhyas from Kakamega, Mumias, Busia etc because the LWK Luhyas are HIGHLY admixed with Nilotic and you can see that physically…whereas some Luhyas look typically Nigerian and I would bet have very, very little Nilotic admixture.
