Very ancient ghosts in the African genome

The above figure is from a preprint (updated from last year), Recovering signals of ghost archaic introgression in African populations. But to truly get a sense of this preprint, I would highly recommend you read the supplementary material. And, to be honest, I would also recommend a publication from 2007, The Joint Allele-Frequency Spectrum in Closely Related Species, as the core of the method used in the preprint is developed in that paper.

Here is the abstract:

While introgression from Neanderthals and Denisovans has been well-documented in modern humans outside Africa, the contribution of archaic hominins to the genetic variation of present-day Africans remains poorly understood. Using 405 whole-genome sequences from four sub-Saharan African populations, we provide complementary lines of evidence for archaic introgression into these populations. Our analyses of site frequency spectra indicate that these populations derive 2-19% of their genetic ancestry from an archaic population that diverged prior to the split of Neanderthals and modern humans. Using a method that can identify segments of archaic ancestry without the need for reference archaic genomes, we built genome-wide maps of archaic ancestry in the Yoruba and the Mende populations that recover about 482 and 502 megabases of archaic sequence, respectively. Analyses of these maps reveal segments of archaic ancestry at high frequency in these populations that represent potential targets of adaptive introgression. Our results reveal the substantial contribution of archaic ancestry in shaping the gene pool of present-day African populations.

To get a sense of how much work went into this preprint, really do read the supplementary material. The step-by-step analysis convinced me pretty thoroughly that these results are not due to straightforward errors in the genotypes or in how the genotypes were classified. Such errors do happen, so it was nice to see the authors be very careful about that.

The key point is that the conditional site frequency spectrum (CSFS) in West Africans does not align with theoretical expectations. The condition here is the state in the archaic outgroup, generally the Vindija Neanderthal. The authors ran a bunch of simulations and models and found a subset that could produce the CSFS they see, the U-shaped distribution. It is represented by the graph you see at the top-right. Basically, it is a scenario where an archaic lineage, which diverged from the other human lineages before the Neanderthal-Denisovan lineage left Africa, contributed to the ancestry of West Africans within the last ~100,000 years (the most likely time is ~50,000 years ago).
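To make the idea of a conditional site frequency spectrum concrete, here is a minimal sketch in Python. It is not the authors' pipeline: the function name, its arguments, and the toy data are all made up for illustration. It assumes each site is summarized by the number of derived alleles in the sample and by a flag saying whether the archaic genome carries the derived allele.

```python
from collections import Counter


def conditional_sfs(derived_counts, archaic_is_derived, n_chromosomes):
    """Proportion of segregating sites at each derived-allele count,
    restricted to sites where the archaic genome carries the derived allele.

    Hypothetical helper for illustration only; not taken from the preprint.
    """
    counts = Counter(
        k
        for k, archaic in zip(derived_counts, archaic_is_derived)
        # Condition on the archaic state and keep polymorphic sites only.
        if archaic and 0 < k < n_chromosomes
    )
    total = sum(counts.values())
    if total == 0:
        return [0.0] * (n_chromosomes - 1)
    # Entry i holds the share of retained sites with i + 1 derived alleles.
    return [counts[k] / total for k in range(1, n_chromosomes)]


# Toy, invented data: derived-allele counts in a sample of 6 chromosomes,
# and whether the archaic genome is derived at the same site.
toy_counts = [1, 1, 5, 2, 5, 3, 0, 6, 1, 5]
toy_archaic = [True, True, True, False, True, True, True, True, False, True]

spectrum = conditional_sfs(toy_counts, toy_archaic, n_chromosomes=6)
for k, share in enumerate(spectrum, start=1):
    print(f"{k} derived alleles: {share:.2f}")
```

The toy data are contrived so that the retained sites pile up at the lowest and highest counts, which is the kind of U shape discussed here. With real data you would tabulate this from genotype calls polarized against an ancestral allele, and then compare the result against spectra simulated under different demographic models.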

This is not a new finding at the highest level of generality. Jeff Wall has been beating this drum for nearly 15 years. See, for example, Genetic evidence for archaic admixture in Africa.

What has changed is that whole-genome sequencing, including high-quality sequences of ancient hominins, has allowed for a more robust exploration of the topic. The analysis of site frequencies was really not useful 20 years ago without genome-wide data. More data has allowed for more subtle methods.
