Over the last ten years David Reich and other researchers have been constructing what is basically an atlas of human demographic history: taking the genealogies written in our DNA, mapping them onto population bifurcations and admixtures, and synthesizing that with what we know from history and archaeology.
To a great extent, this is a project of human phylogenomics: taking genome-wide data and constructing phylogenies out of it (or, perhaps more precisely, graphs, as this is mostly on an intra-species time scale and characterized by lots of gene flow across the “tips” of the tree). But there’s another thing you can do with modern human genomics and evolution: look at patterns of selection within the genome.
The Reich group has already started doing this. For example, they have adduced that the CCR5 delta 32 mutation seems to have emerged out of the Yamnaya horizon.
Last fall, a paper came out in MBE, Ancestry-Specific Analyses Reveal Differential Demographic Histories and Opposite Selective Pressures in Modern South Asian Populations, which I gave a cursory read at the time, but which I’ve now looked at more closely. It takes a “natural experiment,” the emergence of Indian subcontinental populations from a massive admixture between lineages which diverged 40,000 years ago, and looks to see which genetic regions deviate from what you would expect based on the overall genome.
The method is simple: imagine that “Ancestral North Indians” are fixed for an allele at a gene in one state and “Ancestral South Indians” are fixed in the other state. Indian populations are about 50:50 (with a range). If the frequency today in Indian populations is 95% for the allele that is from the “Ancestral North Indians”, one might be suspicious as to what’s going on. Or, vice versa.
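To make that intuition concrete, here is a toy sketch of the arithmetic in Python. It is only an illustration of the idealized example above, not the paper's actual pipeline, which works from local ancestry assignments across many populations. The function names and the 50:50 and 95% figures are hypothetical stand-ins taken from the example.

```python
def expected_frequency(ani_fraction, freq_in_ani, freq_in_asi):
    """Allele frequency expected in an admixed population if nothing but
    admixture (no selection, no drift) shaped it."""
    return ani_fraction * freq_in_ani + (1.0 - ani_fraction) * freq_in_asi


def ancestry_excess(observed, ani_fraction, freq_in_ani=1.0, freq_in_asi=0.0):
    """Observed minus expected frequency of the ANI-derived allele.

    Positive values mean the ANI allele is over-represented relative to
    genome-wide ancestry; negative values mean it is under-represented.
    The defaults encode the idealized case where each source is fixed
    for a different allele.
    """
    return observed - expected_frequency(ani_fraction, freq_in_ani, freq_in_asi)


# Idealized example from the text: 50:50 ancestry, ANI allele observed at 95%.
expected = expected_frequency(0.5, 1.0, 0.0)
excess = ancestry_excess(0.95, 0.5)
print(f"expected={expected:.2f}, excess={excess:+.2f}")
```

A large excess in a single population could easily be drift. What makes a locus a candidate for selection is seeing the same direction of deviation repeated across many populations.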
In the paper, they used whole genomes to reconstruct the ancestral steppe/Iranian population without any residual “Ancient Ancestral South Indian” (AASI), the latter of which has no West Eurasian ancestry. They did the same for the AASI. These reconstructions are always dicey, but they made a good faith effort to check their work. On the whole, that section was impressive. The authors seem to be roughly aligned with the results in Narasimhan et al. 2019. First, the AASI seems to be homogeneous, with the exception of attempts to model them from donors which were Munda or Burusho, both groups with deep East Asian admixture (illustrating the problem with deconvolution). Second, they show that the AASI are not clustering with the Andamanese, which makes sense since these groups diverged closer to 40,000 years ago. Finally, the steppe/Iranian group looks most like Armenian middle-to-late Bronze Age people, a synthesis of steppe and some Iranian-like ancestry.
But this isn’t the most interesting part of the paper. It’s the selection. Here are the very top candidates:
| Component | # of Pops with Sig Value | Genes (±50-kb Region) |
|---|---|---|
| ANI | 22 (percentile = 99.9949) | THUMPD3, SETD5 |
| | 21 (percentile = 99.9814) | SNAP91, RIPPLY2, CYB5R4, MRAP2, CEP162, TBX18 |
| | 21 (percentile = 99.9814) | TRIM31, TRIM40, TRIM10, TRIM15, TRIM26, HLA-L |
| | 19 (percentile = 99.9383) | Intergenic |
| | 18 (percentile = 99.9195) | ZNF681, ZNF726, ZNF254 |
| ASI | −21 (percentile = 0.0057) | RXFP3, SLC45A2, AMACR, C1QTNF3, ADAMTS12 |
| | −16 (percentile = 0.038) | SRXN1, SCRT2, SLC52A3 |
| | −16 (percentile = 0.038) | Intergenic |
| | −15 (percentile = 0.0757) | Intergenic |
| | −14 (percentile = 0.1268) | ATP6V1H, RGS20, TCEA1, LYPLA1, MRPL15 |
I’ll quote the authors at length from the “Discussion”:
We also show that the interaction between alleles that were highly polarized between the two ancestry sources that admixed in South Asia caused patterns of admixture imbalance across the majority of sampled groups, hence unlikely explainable by population specific random drift, and perhaps due to positive or negative environmental pressures. Interestingly, we report how loci that include genes involved with diabetes (SETD5), diet (ZNF) and the immune response (HLA) show West Eurasian (N) haplotypes to be significantly more represented compared with the South Asian (S) counterparts. This might be a stark contrast to what is expected, given the long-term history of local adaptation of S haplotypes in local environment. We speculate that the diet-related signal may be linked with post-Neolithic dietary shifts that might have followed the arrival of the West Eurasian component in the area, whereas the overrepresentation of West Eurasian HLA haplotypes might have some similarity, although at a different time scale, with what has happened in Native American populations after recent colonization likely caused by European borne epidemic (Lindo et al. 2016).
On the other hand, the top region for significant enrichment of South Asian ancestry includes the rs16891982-G allele of SLC45A2 gene (associated with light skin pigmentation in West Eurasians), suggesting purifying selection at this locus following admixture…the overall abundance of these West Eurasian alleles is drastically reduced in 21 out of 25 South Asian populations analyzed here…Such a strong negative pressure against a light pigmentation allele may be explained by the high ultraviolet (UV) radiation at South Asian latitudes and this result seems to be further corroborated by similar N ancestry deficiencies in TYRP1 and BNC2 genes for as many as 11 South Asian populations (supplementary table 4, Supplementary Material online). However, purifying selection against maladaptive light pigmentation alleles in high UV environment is not observed for all pigmentation alleles; in fact, the rs1426654-A allele of the SLC24A5 gene…shows instead an increase of frequency in South Asian…Taken together, our results point to opposite pressures on some West Eurasian alleles involved in skin and eye pigmentation. On one hand, SLC45A2 seem to have undergone some selective pressure that removed most of West Eurasian alleles that arrived in the area after the admixture event. Conversely, the SLC24A5 (rs1426654-A) West Eurasian allele seems to have escaped such a negative pressure perhaps thanks to its apparent neutral role with respect to susceptibility to skin carcinoma caused by UV radiation…
As I said, in the phylogenomic analysis above the authors suggest that the AASI population was homogeneous. I think this suggests that a single ancestral population was absorbed into expanding Iranian-related farmers in NW South Asia. The prevalence of deeply diverged haplogroup M on the mtDNA in subcontinental peoples points to female-mediated admixture. The positive selection for various “lifestyle” alleles indicates to me that expanding Iranian-related farmers absorbed AASI tribes, in particular the women, and assimilated them to the new lifestyle.
The results from pigmentation are surprising, but not shocking. Knowing what I know about the ancestral frequency distribution of the various alleles, it was clear that the derived fraction of SLC24A5 was enriched. As for many of the other alleles responsible for variation in Europeans, either they were selected against, or the ancestral Indo-Aryans et al. were not quite like modern Europeans. These data point to in situ selection.
But why selection for some pigmentation alleles and not others? First, I don’t think cancer is a major selective pressure. That happens late in life. Rather, I think SLC24A5 in the derived variant does something that has nothing to do with pigmentation. It was positively selected among the Khoisan people of Southern Africa and looks to have been selected in Ethiopia as well after the admixture event. In Europe itself its frequency is so high that there has clearly been lots of positive selection since the “great admixture.”
As far as the other alleles, perhaps it is pigmentation. But perhaps it is something else?
Round and round we’ve been going with these genome-wide studies, but in the 2020s I think biologists who know the molecular pathways in a way that plumbs the depths of pleiotropy need to get involved.