The above figure is from a preprint (updated from last year), Recovering signals of ghost archaic introgression in African populations. But to truly get a sense of this preprint, I would highly recommend you read the supplementary material and, to be honest, a publication from 2007, The Joint Allele-Frequency Spectrum in Closely Related Species, since the core of the method used in the preprint was developed in that paper.
Here is the abstract:
While introgression from Neanderthals and Denisovans has been well-documented in modern humans outside Africa, the contribution of archaic hominins to the genetic variation of present-day Africans remains poorly understood. Using 405 whole-genome sequences from four sub-Saharan African populations, we provide complementary lines of evidence for archaic introgression into these populations. Our analyses of site frequency spectra indicate that these populations derive 2-19% of their genetic ancestry from an archaic population that diverged prior to the split of Neanderthals and modern humans. Using a method that can identify segments of archaic ancestry without the need for reference archaic genomes, we built genome-wide maps of archaic ancestry in the Yoruba and the Mende populations that recover about 482 and 502 megabases of archaic sequence, respectively. Analyses of these maps reveal segments of archaic ancestry at high frequency in these populations that represent potential targets of adaptive introgression. Our results reveal the substantial contribution of archaic ancestry in shaping the gene pool of present-day African populations.
To get a sense of how much work went into this preprint, really do read the supplementary material. The step by step analysis convinced me pretty thoroughly that these results are not due to straightforward errors in the genotypes and classifications of the genotypes. Such things do happen, so it was nice to see them be very careful about that.
The key point is that the conditional site frequency spectrum (CSFS) in West Africans does not align with theoretical expectations. The condition here is the state in the archaic outgroup, generally the Vindija Neanderthal. The authors ran a bunch of simulations and models and found a subset that could produce the CSFS they see, the U-shaped distribution. That model is represented by the graph you see at the top-right. Basically, it is a scenario where an archaic lineage, which diverged from the other human lineages before the Neanderthal-Denisovan lineage left Africa, contributed to the ancestry of West Africans within the last ~100,000 years (the most likely time is ~50,000 years ago).
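To make the idea of a conditional site frequency spectrum concrete, here is a minimal sketch in Python. It assumes you already have, for each site, the derived allele count in the African sample and a flag for whether the archaic genome carries the derived allele. The function name, argument names, and toy numbers are all made up for illustration, and this is not the authors' pipeline.

```python
import numpy as np

def conditional_sfs(derived_counts, archaic_derived, n_chromosomes):
    """Proportion of sites at each derived allele count (1..n-1),
    restricted to sites where the archaic genome carries the derived allele.

    Hypothetical helper for illustration only; inputs are assumed to be
    per-site arrays of equal length.
    """
    derived_counts = np.asarray(derived_counts, dtype=int)
    archaic_derived = np.asarray(archaic_derived, dtype=bool)

    # The "condition": keep only sites where the archaic is derived.
    kept = derived_counts[archaic_derived]

    # Keep sites that are still polymorphic in the sample (count 1..n-1).
    kept = kept[(kept > 0) & (kept < n_chromosomes)]

    # Histogram over derived allele counts, dropping the 0 bin.
    counts = np.bincount(kept, minlength=n_chromosomes)[1:n_chromosomes]
    total = counts.sum()
    return counts / total if total > 0 else counts.astype(float)

# Toy data: 9 sites, 10 sampled chromosomes (invented numbers).
toy_counts = [1, 9, 5, 9, 1, 3, 0, 10, 2]
toy_archaic = [1, 1, 1, 1, 1, 0, 1, 1, 0]
toy_csfs = conditional_sfs(toy_counts, toy_archaic, n_chromosomes=10)
```

In this toy input most of the retained sites sit at very low or very high derived allele counts, so the resulting spectrum has more weight in the first and last bins than in the middle, which is what a U-shape means here.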
What has changed is that whole-genome sequencing, including high-quality sequences of ancient hominins, has allowed for a more robust exploration of the topic. The analysis of site frequencies was really not useful 20 years ago without genome-wide data. More data has allowed for more subtle methods.
Within the supplements, the authors are quite candid that many elements of their model are likely to be wrong. The bigger picture though is that they believe they are capturing some general dynamics. It seems rather clear from multiple lines of evidence in the preprint, as well as earlier work, that there are strong suggestions of very deep structure within Africa that was assimilated into an expanding modern human population. They actually tested a scenario of continuous gene flow, and a rapid pulse of admixture contributing the 2-19% is a better fit to the data.
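For readers wondering what "a better fit to the data" can mean for a frequency spectrum, one generic way to compare two candidate models is to ask which model's expected spectrum gives the observed bin counts a higher multinomial log-likelihood. The sketch below shows only that generic idea. The function name and all the numbers are invented, and the preprint's own fitting procedure is described in its supplement and may well differ.

```python
import numpy as np

def multinomial_loglik(observed_counts, expected_proportions):
    """Log-likelihood of observed bin counts under a model's expected
    proportions, up to a constant that is the same for every model
    evaluated on the same data.

    Hypothetical helper; assumes every expected proportion is > 0.
    """
    obs = np.asarray(observed_counts, dtype=float)
    exp = np.asarray(expected_proportions, dtype=float)
    exp = exp / exp.sum()  # normalise so the proportions sum to 1
    return float(np.sum(obs * np.log(exp)))

# Toy observed spectrum with five frequency bins (invented numbers).
observed = np.array([30, 10, 5, 10, 30])

# Expected spectra from two hypothetical models (invented numbers).
model_a = np.full(5, 0.2)                            # flat expectation
model_b = np.array([0.35, 0.12, 0.06, 0.12, 0.35])   # U-shaped expectation

ll_a = multinomial_loglik(observed, model_a)
ll_b = multinomial_loglik(observed, model_b)
better = "model_b" if ll_b > ll_a else "model_a"
```

With these toy numbers the U-shaped expectation scores higher, simply because the observed counts are themselves U-shaped. In a real analysis the expected spectra would come from simulations or theory under each demographic scenario.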
Additionally, there are peculiarities in their results which they haven’t resolved. The Luhya give really bizarre numbers, and the authors don’t have a good explanation for it. It could be a problem with their model specification in some deep way, or the history of East Africa (the Luhya are a Bantu group who mixed with East Africans) may be more complicated than we have understood.
They also did some cool things identifying possible introgressed segments. Their methods seem to agree with each other on the regions, and with older literature which had earlier identified these as targets for introgression. Finally, there was also some validation of the finding that West Africans may have some “Basal modern human” ancestry. That is, ancestry from the modern human lineage that split off first from everyone else.
As the coverage of populations and the number of genome sequences in Africa increases, we will probably get more resolution. I do wonder at how computationally intensive some of this work is, and how many moving parts there are. Replicating this work is doable, as all the code is provided, but it would take time.
In general, these results align with most of my priors, so I am pretty confident they’ve grasped onto a thread of reality here. I would, wouldn’t I? Basically, ~50,000 years ago there was a massive expansion of a core modern human lineage which absorbed other human groups as it expanded outward. Though the easiest explanation is that it was one group, the Holocene agricultural expansion should tell us that sometimes differently related groups in close proximity can undergo the same cultural revolution and expand in different directions.
Note: It is clear that admixture from “super-deep” lineages is going to be the next big thing. See Alan Rogers' recent work.