Very ancient ghosts in the African genome

The above figure is from a preprint (updated from last year), Recovering signals of ghost archaic introgression in African populations. But to truly get a sense of this preprint, I would highly recommend you read the supplementary material. And, to be honest, I would also recommend a publication from 2007, The Joint Allele-Frequency Spectrum in Closely Related Species, since the core of the method used in the preprint is developed in that paper.

Here is the abstract:

While introgression from Neanderthals and Denisovans has been well-documented in modern humans outside Africa, the contribution of archaic hominins to the genetic variation of present-day Africans remains poorly understood. Using 405 whole-genome sequences from four sub-Saharan African populations, we provide complementary lines of evidence for archaic introgression into these populations. Our analyses of site frequency spectra indicate that these populations derive 2-19% of their genetic ancestry from an archaic population that diverged prior to the split of Neanderthals and modern humans. Using a method that can identify segments of archaic ancestry without the need for reference archaic genomes, we built genome-wide maps of archaic ancestry in the Yoruba and the Mende populations that recover about 482 and 502 megabases of archaic sequence, respectively. Analyses of these maps reveal segments of archaic ancestry at high frequency in these populations that represent potential targets of adaptive introgression. Our results reveal the substantial contribution of archaic ancestry in shaping the gene pool of present-day African populations.

To get a sense of how much work went into this preprint, really do read the supplementary material. The step-by-step analysis convinced me pretty thoroughly that these results are not due to straightforward errors in the genotypes or in how the genotypes were classified. Such things do happen, so it was nice to see the authors be very careful about that.

The key point is that the conditional site frequency spectrum (CSFS) in West Africans does not align with theoretical expectations. The condition here is the state of the allele in the archaic outgroup, generally the Vindija Neanderthal. The authors ran a bunch of simulations and models and found a subset that could produce the CSFS they see, the U-shaped distribution. It is represented by the graph you see at the top-right. Basically, this is a scenario where an archaic lineage that diverged from the other human lineages before the Neanderthal-Denisovan lineage left Africa contributed to the ancestry of West Africans within the last ~100,000 years (the most likely time is ~50,000 years ago).
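To make the idea of a conditional spectrum concrete, here is a minimal sketch of how one might tabulate it. It assumes the conditioning is on the archaic genome carrying the derived allele, and the function name, inputs, and toy numbers are all made up for illustration; this is not the preprint's pipeline.

```python
import numpy as np

def conditional_sfs(derived_counts, archaic_derived, n_chrom):
    """Toy conditional site frequency spectrum (hypothetical helper).

    derived_counts : derived-allele count per site in the modern sample
    archaic_derived: True where the archaic genome carries the derived allele
    n_chrom        : number of sampled modern chromosomes
    """
    derived_counts = np.asarray(derived_counts)
    archaic_derived = np.asarray(archaic_derived, dtype=bool)
    # Keep sites that are derived in the archaic and polymorphic in the sample.
    keep = archaic_derived & (derived_counts > 0) & (derived_counts < n_chrom)
    # Count sites in each derived-allele class 1 .. n_chrom - 1.
    counts = np.bincount(derived_counts[keep], minlength=n_chrom + 1)[1:n_chrom]
    total = counts.sum()
    return counts / total if total > 0 else counts.astype(float)

# Invented data: 9 sites, 6 sampled chromosomes.
derived_counts = [1, 1, 2, 5, 5, 3, 0, 6, 4]
archaic_derived = [True, True, False, True, True, True, True, True, False]
csfs = conditional_sfs(derived_counts, archaic_derived, n_chrom=6)
print(csfs)
```

In this invented example the mass piles up at the low and high frequency classes, which is the kind of U shape the post describes. Real data would of course involve millions of sites and careful filtering.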

This is not a new finding at the highest level of generality. Jeff Wall has been beating this drum for nearly 15 years. For example, Genetic evidence for archaic admixture in Africa.

What has changed is that whole-genome sequencing, including high-quality sequences of ancient hominins, has allowed for a more robust exploration of the topic. The analysis of site frequencies was really not useful 20 years ago without genome-wide data. More data has allowed for more subtle methods.

Drivers of selection for ghosts in the genome

A new preprint on bioRxiv, Strong selective sweeps before 45,000BP displaced archaic admixture across the human X chromosome, is suggestive of an exciting new phase in human evolutionary genomics: leveraging whole genomes in diverse populations to explore selection dynamics.

The authors looked at X chromosomes in males for reasons of technical tractability. Human males carry a single X chromosome, so it’s easy to determine the sequence of genetic variants across a physical stretch of DNA (females, with two X chromosomes, require phasing). The X is interesting for two other reasons: it is present in females about 2/3 of the time (because they have two copies), and it is subject to really strong selection during the 1/3 of the time it spends in males. Basically, males exhibit no recessive dynamics on the X chromosome because we carry only one copy. This means that genetic variants whose effects are recessive, and so mostly hidden from selection in females, are fully exposed in males.
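The arithmetic behind those two points can be sketched in a few lines. This assumes an equal sex ratio and random mating, and the function names and the allele frequency of 0.01 are purely illustrative.

```python
from fractions import Fraction

def x_share_in_females(females=1, males=1):
    """Fraction of all X chromosome copies that sit in females."""
    # Each female carries two X copies, each male carries one.
    return Fraction(2 * females, 2 * females + males)

def recessive_exposure(q):
    """Chance that a recessive X-linked allele at frequency q is
    expressed in a random individual of each sex (random mating assumed)."""
    # Females need two copies to express it; males need only their single copy.
    return {"female": q * q, "male": q}

share = x_share_in_females()
exposure = recessive_exposure(0.01)
print(share, exposure)
```

With a rare allele, nearly all of the exposure to selection happens in males, even though males hold only a third of the X copies.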

The fact that the X is disproportionately found in females also means that all sorts of intra-genomic conflict driven by sex occur on this chromosome. You will know this if you read Matt Ridley’s excellent The Red Queen.

The specific result here is that the authors found a common family of haplotypes in males on the X chromosome outside of Africa whose homogeneity is indicative of a very strong sweep. From the text:

The identified selective sweeps are as strong or even stronger than the most dramatic sweeps previously found in humans. Ten sweeps span between 500kb and 1.8Mb in more than 50% of non-Africans (Table S2). The strongest sweep span 900kb in 91% of non-Africans and affects 53% of non-Africans across a 1.8Mb region. For comparison, the strongest sweep previously reported surrounds the lactase gene and spans 800kb in 77% of European Americans (24). The selection coefficient on the genetic variant driving this sweep was estimated to 0.15 (24) suggesting even stronger selection for several of the X chromosome sweeps we have identified.

The swept regions we identify here may be recurrent targets of strong selection during human evolution. To investigate this possibility, we intersect our findings with our previously reported evidence of selective sweeps in the human-chimpanzee ancestor (16). We find a strong overlap between the sweeps reported here and regions swept during the 2-4 my that separated the human-chimpanzee and human-gorilla speciation events (17, 25) shown as grey regions in Figure 2 (Jaccard stat.: 0.17, p-value: <1e-5) (Materials and Methods). This suggests that the identified regions of the X chromosome are continually subjected to extreme positive selection.

A selection coefficient of 0.15 is eye-popping. Selection coefficients of 0.01 are reasonable. For humans, anything in the 0.10 range is more like a weird artifact than a true result. But here it is. The fact that the regions overlap with earlier targets of selection during the speciation event that led up to our lineage is clearly of interest.
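To get a feel for why 0.15 is so large, here is a rough sketch of how quickly a favored allele rises under a simple deterministic model. It assumes genic (haploid-style) selection in an infinite population, where the odds of the allele are multiplied by 1 + s each generation; the function name and the starting and ending frequencies are arbitrary choices for illustration, not values from the preprint.

```python
import math

def generations_to_sweep(s, p0, p1):
    """Generations for an allele to go from frequency p0 to p1 when its
    odds p / (1 - p) grow by a factor of (1 + s) per generation."""
    odds0 = p0 / (1.0 - p0)
    odds1 = p1 / (1.0 - p1)
    return math.log(odds1 / odds0) / math.log(1.0 + s)

# Arbitrary illustration: rise from 0.1% to 90% frequency.
fast = generations_to_sweep(0.15, 0.001, 0.9)
slow = generations_to_sweep(0.01, 0.001, 0.9)
print(round(fast), round(slow))
```

Under these assumptions the s = 0.15 case takes on the order of 65 generations, against roughly 900 for s = 0.01. Drift, dominance, and the peculiarities of X-linked inheritance would change the details, but not the order-of-magnitude gap.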

Looking more closely at the regions of the X which were subject to the selection, they found almost no archaic ancestry. That is, the ~1% Neanderthal ancestry that is expected across the X chromosome is almost absent in these segments derived from the selection event. The authors infer that these sweeps may have occurred through introgression from a sister modern human lineage, perhaps an earlier wave out of Africa which never mixed with Neanderthals. The archaeological evidence that these people existed is now compelling, and there are tentative suggestions from genomics which attest to their presence as well (e.g., modern human admixture into the Altai Neanderthals).

Looking at the 45,000-year-old Siberian genome, the authors found the same signatures that they see in other non-Africans. This means the event had to happen between 55 and 45 thousand years ago, after the Neanderthal admixture (which is found all around these zones in the genome), but before the geographical diversification and expansion of the modern human lineage.

The authors conclude:

We hypothesize that our observations are due to meiotic drive in the form of an inter-chromosomal conflict between the X and the Y chromosomes for transmission to the next generation. If an averagely even transmission in meiosis is maintained by a dynamic equilibrium of antagonizing drivers on X and Y, it is possible that the main bottlenecked out-of-Africa population was invaded by drivers retained in earlier out-of-Africa populations. If this hypothesis is true, the swept regions represent the only remaining haplotypes from such early populations not admixed with Neanderthals.

Meiotic drive is a form of segregation distortion, a kind of intra-genomic selection which is potentially very powerful. Some hypothesize that alleles subject to meiotic drive normally sweep through the population so fast that researchers underestimate the phenomenon’s ubiquity because they haven’t caught sweeps in action.

This strange evolutionary genetic process may then have preserved a genetic relic within the human genome. But on Twitter Iosif Lazaridis suggests that perhaps the donor population was “Basal Eurasians,” and all non-Africans may have some Basal Eurasian ancestry, with Near Easterners exhibiting more than baseline ancestry (presumably through later admixture).