Hard sweeps and natural selection obscured by Bronze Age admixture

The above is the map from the Online Ancient Genome Repository. You can see the variation by region. There’s a lot of ancient DNA in Europe. Very little in Asia. And only moderate amounts elsewhere.

The map is from a new preprint, Ancient human genomes reveal a hidden history of strong selection in Eurasia:

The role of selection in shaping genetic diversity in natural populations is an area of intense interest in modern biology, especially the characterization of adaptive loci. Within humans, the rapid increase in genomic information has produced surprisingly few well-defined adaptive loci, promoting the view that recent human adaptation involved numerous loci with small fitness benefits. To examine this we searched for signatures of hard sweeps – the selective fixation of a new or initially rare beneficial variant – in 1,162 ancient western Eurasian genomes and identified 57 sweeps with high confidence. This unexpectedly extensive signal was concentrated on proteins acting at the cell surface, and potential selection pressures include cold adaptation in early Eurasian populations, and oxidative stress from carbohydrate-rich diets in farming populations. Critically, these sweep signals have been obscured in modern European genomes by subsequent population admixture, especially during the Bronze Age (5-3kya) and empires of classical antiquity.

So the “big thing” that they found here is that admixture obscures signals of selection. More precisely, it obscures signals of hard selective sweeps, the classical variant where a single position in a single haplotype rises up in frequency rapidly due to positive selection.

If you read further into the paper you note that they believe admixture, due to the mixing of backgrounds, attenuates the signal of hard sweeps, and may even imply that these hard sweeps are soft sweeps through the mixing of distinct genetic backgrounds. I honestly didn’t follow that too closely, but I guess it depends on the selection coefficient and rate of mixing. They are reporting lots of selection events of >1%, and I wonder about how credible this is (Haldane’s dilemma?).
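To get a feel for what a selection coefficient above 1% implies, here is a minimal sketch of the deterministic allele-frequency trajectory under simple genic selection. The function name, the starting frequency, and the end threshold are illustrative assumptions, not values from the preprint, and drift and admixture are ignored entirely.

```python
def generations_to_sweep(s, p0=0.001, p_end=0.99):
    """Count generations for a beneficial allele to rise from p0 to p_end.

    Deterministic genic selection: carriers have fitness 1 + s, so
    p' = p * (1 + s) / (1 + p * s). Drift and dominance are ignored.
    """
    p, gens = p0, 0
    while p < p_end:
        p = p * (1 + s) / (1 + p * s)
        gens += 1
    return gens


# Compare a 1% advantage with a hypothetical, much stronger 5% advantage
for s in (0.01, 0.05):
    print(s, generations_to_sweep(s))
```

In this toy model a 1% advantage needs roughly 1,150 generations to carry a rare variant to near fixation, so the timescale is very sensitive to the selection coefficient being claimed.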

That being said, the functional significance of these selection events is important. Basically, they look like adaptations to climate and changes in diet. What the authors seem to be suggesting here is that the shift in lifestyle and expansion of farmers in the early Holocene was a pretty big deal, and the mixing between various divergent streams during the Bronze Age muddled the signals.

If the authors are right, that means that ancient DNA is going to be very big for understanding the trajectory of selection, because it’s not just going to be subtle polygenic changes.

Selection on height in Sardinians

Evidence of polygenic adaptation at height-associated loci in mainland Europeans and Sardinians:

Adult height was one of the earliest putative examples of polygenic adaptation in human. By constructing polygenic height scores using effect sizes and frequencies from hundreds of genomic loci robustly associated with height, it was reported that Northern Europeans were genetically taller than Southern Europeans beyond neutral expectation. However, this inference was recently challenged. Sohail et al. and Berg et al. showed that the polygenic signature disappeared if summary statistics from UK Biobank (UKB) were used in the analysis, suggesting that residual uncorrected stratification from large-scale consortium studies was responsible for the previously noted genetic difference. It thus remains an open question whether height loci exhibit signals of polygenic adaptation in any human population. In the present study, we re-examined this question, focusing on one of the shortest European populations, the Sardinians, as well as on the mainland European populations in general. We found that summary statistics from UKB significantly correlate with population structure in Europe. To further alleviate concerns of biased ascertainment of GWAS loci, we examined height-associated loci from the Biobank of Japan (BBJ). Applying frequency-based inference over these height-associated loci, we showed that the Sardinians remain significantly shorter than expected (~ 0.35 standard deviation shorter than CEU based on polygenic height scores, P = 1.95e-6). We also found the trajectory of polygenic height scores decreased over at least the last 10,000 years when compared to the British population (P = 0.0123), consistent with a signature of polygenic adaptation at height-associated loci. Although the same approach showed a much subtler signature in mainland European populations, we found a clear and robust adaptive signature in UK population using a haplotype-based statistic, tSDS, driven by the height-increasing alleles (P = 4.8e-4). 
In summary, by examining frequencies at height loci ascertained in a distant East Asian population, we further supported the evidence of polygenic adaptation at height-associated loci among the Sardinians. In mainland Europeans, we also found an adaptive signature, although becoming more pronounced only in haplotype-based analysis.

The whole literature on selection and height is confused. This is definitely an unformed and new area of exploration, so I wouldn’t put my money on any particular result. But I think it is important to note, first, that the association of particular genetic variants with differences in height is stronger than the signature of selection on those variants. Second, the preprint is hard to follow because the huge datasets necessary to analyze polygenic traits come with all sorts of complications, like ascertainment, that date from the way the data were generated in the late 2000s (and new datasets keep coming online).

I think looking at variants in East Asians, and how they impact Europeans, is pretty neat. Obviously, some of the variants that impact polygenic traits are going to be rare, and so not shared between populations, but a lot of it is probably “standing variation” that dates back to before the Out of Africa event. In other words, the key thing is to look at differences in frequencies of alleles which are present in most populations, not different alleles which are not present in all populations.
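The frequency-based logic can be sketched in a few lines: take per-allele effect sizes estimated in one population and compare the expected polygenic score implied by allele frequencies in two others. Every number and name below is made up for illustration; nothing here comes from the preprint's data.

```python
def mean_polygenic_score(effects, freqs):
    """Expected polygenic score of a population: sum of 2 * beta_i * p_i."""
    if len(effects) != len(freqs):
        raise ValueError("effects and freqs must have the same length")
    return sum(2 * b * p for b, p in zip(effects, freqs))


# Hypothetical per-allele effects (in SD units) from a GWAS in a distant population
effects = [0.02, -0.01, 0.03, 0.015]
# Hypothetical frequencies of the same alleles in two target populations
freqs_a = [0.40, 0.60, 0.25, 0.50]
freqs_b = [0.30, 0.70, 0.20, 0.45]

diff = mean_polygenic_score(effects, freqs_a) - mean_polygenic_score(effects, freqs_b)
print(round(diff, 4))
```

The difference in scores is only the starting point. Whether it is larger than drift alone would produce is the separate question the preprint's tests are meant to answer.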

One element that jumps out at me is the trajectory of selection, and how much is due to events that date deep into the past, to such an extent that it might not make sense to talk about populations as we understand them today. So, for example, they talk about selection events going back to beyond 10,000 years…but all the populations that we survey today did not really exist that deeply in time. This doesn’t mean that selection didn’t happen. “Populations” is a human construct, alleles are alleles. They may have been subject to selection in a variety of populations which admixed themselves out of existence in turn (there was selection for larger brains on and off for millions of years up until about ~200,000 years ago in various hominin populations).

The strongest selection result in this preprint seems to be that something is going on with Sardinians, the most direct descendants of Neolithic farmers. As noted on Twitter I think this has more to do with the nature of calorie restriction, or lack thereof, than selection on height per se. A lot more has to be done on understanding how the “secondary products revolution” (going from simple cereal farming to agro-pastoralism) impacted on human nutrition to understand selection on height, which does seem to be a reoccurring signal across human groups.

 

Inventing the whites, what hath fog wrought?


One of the first posts on this blog relating to archaeogenetics was an essay of mine reflecting on the fact that a particular Y chromosomal haplogroup, N1c (N3a now), had a peculiar distribution which ranged from Siberia to Finland. The argument, at the time, was whether it was a lineage which moved east to west (as suggested by the decline of microsatellite diversity in that direction), or whether it moved west to east (as was suggested by the frequency, which was highest in parts of Uralic Europe).

Today we know the general outline of the answer. The N1c lineage seems to have moved westward along the forest-tundra fringe, along with Uralic peoples in general. Genome-wide evidence shows minor but significant affinities with Siberian people among many European Uralic groups, including the Finns, and to a lesser extent Estonians. Though the genome-wide fraction is small in Finns, 5% or less, because this minor component is so genetically different from the generic Northern European ancestry of this group, it shifts Finns off the normal dimensions of variation for Europeans (in addition to the fact that many Finns have been subject to bottlenecks). The fraction is higher in the Sami, and lower in the Estonians.

Additionally, ancient DNA suggests that the arrival of this ‘eastern’ Uralic mediated ancestry dates to the early Iron Age. The hypothesis that the Finnic languages were primal to Baltic Europe is on shaky ground which has cracked open. Rather, the circumstantial evidence is that Finnic languages replaced Indo-European dialects.

A new paper from Estonia adds some more detail to the general outline, as well as highlighting some aspects of adaptation. The Arrival of Siberian Ancestry Connecting the Eastern Baltic to Uralic Speakers further East:

In this study, we compare the genetic ancestry of individuals from two as yet genetically unstudied cultural traditions in Estonia in the context of available modern and ancient datasets: 15 from the Late Bronze Age stone-cist graves (1200–400 BC) (EstBA) and 6 from the Pre-Roman Iron Age tarand cemeteries (800/500 BC–50 AD) (EstIA). We also included 5 Pre-Roman to Roman Iron Age Ingrian (500 BC–450 AD) (IngIA) and 7 Middle Age Estonian (1200–1600 AD) (EstMA) individuals to build a dataset for studying the demographic history of the northern parts of the Eastern Baltic from the earliest layer of Mesolithic to modern times. Our findings are consistent with EstBA receiving gene flow from regions with strong Western hunter-gatherer (WHG) affinities and EstIA from populations related to modern Siberians. The latter inference is in accordance with Y chromosome (chrY) distributions in present day populations of the Eastern Baltic, as well as patterns of autosomal variation in the majority of the westernmost Uralic speakers [1, 2, 3, 4, 5]. This ancestry reached the coasts of the Baltic Sea no later than the mid-first millennium BC; i.e., in the same time window as the diversification of west Uralic (Finnic) languages [6]. Furthermore, phenotypic traits often associated with modern Northern Europeans, like light eyes, hair, and skin, as well as lactose tolerance, can be traced back to the Bronze Age in the Eastern Baltic.


It’s raining selective sweeps

A week ago a very cool new preprint came out, Identifying loci under positive selection in complex population histories. It’s something that you couldn’t even have imagined just ten years ago. The authors basically figure out ways to identify deviations of markers from the allele frequencies expected under a null neutral evolutionary model. The method is put first, which I really like, before getting to results or discussion. Additionally, they did a lot of simulation ahead of time, the sort of simulation that really was not possible before we had the computational resources we have now.
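The core idea of testing a marker against a neutral expectation can be sketched for a single branch. This is not the preprint's method, just a simplified analog under stated assumptions: along a branch with drift parameter F (roughly t/2N), the allele-frequency change has mean zero and variance about F·p(1−p), so the squared standardized change is approximately chi-square with one degree of freedom. The function names, SNP labels, and frequencies are hypothetical.

```python
import math


def drift_z(p_anc, p_desc, drift):
    """Standardized allele-frequency change under a pure-drift null.

    Under neutrality the change has mean 0 and variance ~ drift * p(1 - p),
    where `drift` is the branch's drift parameter (roughly t / 2N).
    """
    var = drift * p_anc * (1 - p_anc)
    return (p_desc - p_anc) / math.sqrt(var)


def chi2_1df_pvalue(z):
    """Upper-tail p-value of z**2 under a chi-square with 1 df."""
    return math.erfc(abs(z) / math.sqrt(2))


# Hypothetical SNPs: (ancestral freq, descendant freq) on a branch with drift 0.01
snps = {"snp_a": (0.30, 0.33), "snp_b": (0.30, 0.60)}
for name, (p_a, p_d) in snps.items():
    z = drift_z(p_a, p_d, drift=0.01)
    print(name, round(z, 2), chi2_1df_pvalue(z))
```

The first SNP moves about as much as drift predicts. The second has moved far more than drift plausibly allows, which is the kind of outlier a selection scan flags.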

Here’s the abstract:

Detailed modeling of a species’ history is of prime importance for understanding how natural selection operates over time. Most methods designed to detect positive selection along sequenced genomes, however, use simplified representations of past histories as null models of genetic drift. Here, we present the first method that can detect signatures of strong local adaptation across the genome using arbitrarily complex admixture graphs, which are typically used to describe the history of past divergence and admixture events among any number of populations. The method – called Graph-aware Retrieval of Selective Sweeps (GRoSS) – has good power to detect loci in the genome with strong evidence for past selective sweeps and can also identify which branch of the graph was most affected by the sweep. As evidence of its utility, we apply the method to bovine, codfish and human population genomic data containing multiple population panels related in complex ways. We find new candidate genes for important adaptive functions, including immunity and metabolism in under-studied human populations, as well as muscle mass, milk production and tameness in particular bovine breeds. We are also able to pinpoint the emergence of large regions of differentiation due to inversions in the history of Atlantic codfish.

On a related note in regards to selection, On the well-founded enthusiasm for soft sweeps in humans: a reply to Harris, Sackman, and Jensen. The authors are responding to a recent preprint criticizing their earlier work. The reason that it’s fascinating to me is that these sorts of arguments today are really concrete and not so theoretical. There’s a lot of data for analytic techniques to chew through, and computation has really transformed the possibilities.

A generation ago these sorts of debates would be a sequence of “you’re wrong!” vs. “no, you’re wrong!” Today the disputes involve a lot of data, and so have a reasonable chance of resolution.

The first preprint identifies the usual candidates in humans that you normally see, and expected targets in cattle and cod. Sure, that will give biologists more interested in mechanisms and pathways things to chew upon, but imagine once researchers have large numbers of genomes for thousands and thousands of species. Then they’ll be testing deviations from neutral allele frequencies across many trees, and getting a more general and abstract sense of the parameter space that selection explores, conditional on particularities of evolutionary history.

This is why I’m excited about plans to sequence lots and lots of species.

So merfolk are a real thing now: adaptation to diving

When Rasmus Nielsen presented preliminary work on diving adaptations a few years ago at ASHG I really didn’t know what to think. To be honest it seemed kind of crazy. Everyone was freaking out over it…and I guess I should have. But it just seemed so strange I couldn’t process it. High altitude adaptations, I understood. But underwater adaptations?

The paper is out now, and open access, Physiological and Genetic Adaptations to Diving in Sea Nomads. There are a lot of moving parts in it, so I really recommend Carl Zimmer’s piece, Bodies Remodeled for a Life at Sea:

On Thursday in the journal Cell, a team of researchers reported a new kind of adaptation — not to air or to food, but to the ocean. A group of sea-dwelling people in Southeast Asia have evolved into better divers.

When Dr. Ilardo compared scans from the two villages, she found a stark difference. The Bajau had spleens about 50 percent bigger on average than those of the Saluan.

Only some Bajau are full-time divers. Others, such as teachers and shopkeepers, have never dived. But they, too, had large spleens, Dr. Ilardo found. It was likely the Bajau are born that way, thanks to their genes.

A number of genetic variants have become unusually common in the Bajau, she found. The only plausible way for this to happen is natural selection: the Bajau with those variants had more descendants than those who lacked them.

As some of you might know “sea nomads” are common across much of Southeast Asia. The Bajau are just one major group. The anthropology here is not surprising…but the biology most definitely is. For various technical reasons, the authors didn’t have extremely fine-grained genome data (high coverage sequence data, or very high-density chips). So they didn’t do some haplotype-based tests (e.g., iHS), though that might not matter anyhow (see below why). But, looking at the genome-wide relatedness and comparing that to markers which deviated from that expectation, both of which they could do robustly, the authors narrowed in on candidates for targets of selection. From the paper: “Remarkably, the top hit of our selection scan (Table 1) is SNP rs7158863, located just upstream of BDKRB2, the only gene thus far suggested to be associated with the diving response in humans.”

There are many cases where researchers find selection signals in an ORF of unknown function. In this case, the top hit happens to be exactly in line with the biological characteristic you’re already curious about. The alignment is so good it’s hard to believe.

But wait, there’s more! Spleen size variation is not due to variation on just one locus. It’s polygenic, albeit probably dominated by larger effect quantitative trait loci (QTLs) than something like height (so more like skin color). They compared the Bajau to a nearby population, the Saluan, as well as Han Chinese as an outgroup. On the whole the distribution of allele frequency differences should reflect the phylogeny (Han(Bajau, Saluan)). The key is to look for cases where the Bajau are the outgroup. From the paper:

While some of the selection signals uniquely present in the Bajau may be related to other environmental factors, such as the pathogens, several of the other top hits also fall in candidate genes associated with traits of possible importance for diving. Examples include FAM178B, which encodes a protein that forms a stable complex with carbonic anhydrase, the primary enzyme responsible for maintaining carbon dioxide/bicarbonate balance, thereby helping maintain the pH of the blood….
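The outgroup logic described above, looking for loci where the Bajau branch is unusually long relative to the (Han(Bajau, Saluan)) tree, can be sketched with the population branch statistic (PBS), a standard statistic built on exactly this idea. Treating it as the form of the authors' scan is an assumption here, and the per-SNP Fst values below are invented for illustration.

```python
import math


def branch_length(fst):
    """Convert pairwise Fst to an approximately additive divergence time."""
    return -math.log(1 - fst)


def pbs(fst_target_sister, fst_target_out, fst_sister_out):
    """Population branch statistic for the target population."""
    t_ts = branch_length(fst_target_sister)
    t_to = branch_length(fst_target_out)
    t_so = branch_length(fst_sister_out)
    return (t_ts + t_to - t_so) / 2


# Hypothetical per-SNP Fst values: (Bajau-Saluan, Bajau-Han, Saluan-Han)
typical_snp = pbs(0.02, 0.08, 0.08)
outlier_snp = pbs(0.40, 0.45, 0.08)
print(round(typical_snp, 3), round(outlier_snp, 3))
```

At a typical SNP the Bajau branch is short, as the phylogeny predicts. At the outlier the Bajau differ sharply from both the Saluan and the Han while those two remain close to each other, which puts nearly all the change on the Bajau branch.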

FAM178B shows up again later:

We identified one region overlapping chr2:97627143, which falls in the gene FAM178B, that falls in the 99% quantile of the genome-wide distribution for the fD statistic (Martin et al., 2015). Of the populations considered, this region exclusively stands out in the Bajau, and the signal appears strongest when using Denisova as source. Notably, this region was also proposed as a candidate for Denisovan introgression in Oceanic populations by….

What they’re saying here is that the allele at this locus adapted to diving may have come originally from the Denisovans! Remember, we already know that one of the Tibetan high altitude adaptations comes from the Denisovans. So this isn’t surprising, but it is pretty cool. But most of the other hits don’t seem to be introgressed. That is, they come from modern humans (or have been segregating in our species for a long, long, time).

Many of the alleles found at high frequencies in the Bajau are found in other populations, just at very low frequencies. This implies that selection is operating on standing variation. Another suggestion that this is so is that the widths of the regions of the genome impacted by selection seem rather narrow. In contrast, the Eurasian adaptation to lactose digestion is from a de novo mutation, something that wasn’t at high frequency at all in the ancestral human populations. The sweep is strong and powerful around that single mutation, and huge swaths of the genome around it “hitchhiked” along so that on a population-wide level the area around the mutational target was homogenized (basically, a lot of one single original mutant human is found around that causal variant for lactase persistence).

Anyone who has learned basic quantitative genetics knows that one way to change a mean trait value is just to change the allele frequencies at a lot of different loci…over time you’ll have a lot of low-frequency alleles present in an individual which would otherwise never have occurred. Eventually, you can have a median value which is outside of the range of the original distribution. The mechanism here in a dynamic sense seems totally comprehensible, though as Carl Zimmer notes, and the rather short shrift given in the Cell paper suggests, they’re not sure in a proximate sense how the selection is working (i.e., obviously there is a fitness implication but how does it manifest? Do people die? Are they unable to support a family?).
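The quantitative-genetics point above can be made concrete with a worked toy calculation. It assumes an additive trait built from many identical, independent biallelic loci in Hardy-Weinberg proportions. The number of loci, the frequencies, and the effect size are all arbitrary assumptions chosen to make the arithmetic easy.

```python
import math


def additive_trait(n_loci, freq, effect=1.0):
    """Mean and SD of an additive trait with n_loci identical biallelic loci."""
    mean = 2 * n_loci * freq * effect
    sd = math.sqrt(2 * n_loci * freq * (1 - freq)) * effect
    return mean, sd


# Shift every trait-increasing allele from 10% to 20%; none comes close to fixing
before_mean, before_sd = additive_trait(n_loci=500, freq=0.10)
after_mean, _ = additive_trait(n_loci=500, freq=0.20)

shift_in_sd = (after_mean - before_mean) / before_sd
print(round(before_mean, 1), round(before_sd, 2), round(after_mean, 1), round(shift_in_sd, 1))
```

A modest frequency change at each locus moves the mean by more than ten of the original standard deviations, well outside anything seen in the starting population, without a classic sweep at any single site.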

One key issue is to consider the demographic history of these people. The authors tried to model it genetically:

We found a model compatible with the data that has a divergence time of ∼16 kya, with subsequent high migration from Bajau to Saluan and low migration from Saluan to Bajau (for details see STAR Methods). We note that the estimate of 16 kya may reflect the divergence of old admixture components shared in different proportions by the Saluan and the Bajau, similarly to, for example, European populations being closely related to each other but differing in the proportion of ancient admixture components….

The authors cite papers which outline the real story about what happened, so they know that the model is somewhat unrealistic. For example, Ancient genomes document multiple waves of migration in Southeast Asian prehistory:

Southeast Asia is home to rich human genetic and linguistic diversity, but the details of past population movements in the region are not well known. Here, we report genome-wide ancient DNA data from thirteen Southeast Asian individuals spanning from the Neolithic period through the Iron Age (4100-1700 years ago). Early agriculturalists from Man Bac in Vietnam possessed a mixture of East Asian (southern Chinese farmer) and deeply diverged eastern Eurasian (hunter-gatherer) ancestry characteristic of Austroasiatic speakers, with similar ancestry as far south as Indonesia providing evidence for an expansive initial spread of Austroasiatic languages. In a striking parallel with Europe, later sites from across the region show closer connections to present-day majority groups, reflecting a second major influx of migrants by the time of the Bronze Age.

The upshot is that the predominant genetic character of Southeast Asia dates to the Neolithic, and to a great extent even more recently. The deep divergence between two Austronesian groups may be an artifact of drift in one group (probably the Bajau), or different proportions of admixture from the primary ancestral components in maritime Southeast Asia: Austronesian, Austro-Asiatic, and indigenous hunter-gatherer. As per Lipson 2014 the Bajau are probably mostly Austronesian but may have Negrito ancestry from the Philippines, as well as indigenous hunter-gatherer more closely related to Malaysian Negritos. There probably isn’t so much Austro-Asiatic in Sulawesi, but I’d bet the farmers have more of that.

Ultimately the question here is are the adaptations to diving old or new? Anthropologists and historians have all sorts of theories, as reported in the Carl Zimmer article and hinted at in the paper. My own bet is that they are both old and new. By this, I mean that some sort of maritime lifestyle was surely practiced by indigenous people between the end of the last Ice Age and the arrival of farmers. But if the variation was present in humans more generally, the Austronesians would probably also have the capacity for the diving adaptations. Mixing with hunter-gatherers and another bout of selection could have done the trick in concert. So the adaptations and lifestyle are old, but the Bajau people may date to the last 2,000 years, and selection within this population may be that recent.

A lot of the answer might be found in looking at the other sea nomad groups….

Natural selection in humans (OK, 375,000 British people)

 


The above figure is from Evidence of directional and stabilizing selection in contemporary humans. I’ll be entirely honest with you: I don’t read every UK Biobank paper, but I do read those where Peter Visscher is a co-author. It’s in PNAS, and a draft which is not open access. But it’s a pretty interesting read. Nothing too revolutionary, but confirms some intuitions one might have.

The abstract:

Modern molecular genetic datasets, primarily collected to study the biology of human health and disease, can be used to directly measure the action of natural selection and reveal important features of contemporary human evolution. Here we leverage the UK Biobank data to test for the presence of linear and nonlinear natural selection in a contemporary population of the United Kingdom. We obtain phenotypic and genetic evidence consistent with the action of linear/directional selection. Phenotypic evidence suggests that stabilizing selection, which acts to reduce variance in the population without necessarily modifying the population mean, is widespread and relatively weak in comparison with estimates from other species.

The stabilizing selection part is probably the most interesting part for me. But let’s hold up for a moment, and review some of the major findings. The authors focused on ~375,000 samples which matched their criteria (white British individuals old enough that they are well past their reproductive peak), and the genotyping platforms had 500,000 markers. The dependent variable they’re looking at is reproductive fitness. In this case specifically, “rLRS”, or relative lifetime reproductive success.

With these huge data sets and the large number of measured phenotypes they first used the classical Lande and Arnold method to detect selection gradients, which leveraged regression to measure directional and stabilizing dynamics. Basically, how does change in the phenotype impact reproductive fitness? So, it is notable that shorter women have higher reproductive fitness than taller women (shorter than the median). This seems like a robust result. We’ve seen it before on much smaller sample sizes.
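The regression idea can be sketched as follows: regress relative fitness w on a standardized trait z and on ½z². The linear coefficient is the directional gradient β and the coefficient on ½z² is the quadratic gradient γ, with negative γ indicating stabilizing selection. This is a simplified single-trait version fit in one model, not the paper's pipeline. The helper names and the toy data are assumptions, and the toy fitness values are not renormalized to mean 1.

```python
def solve_3x3(a, b):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    n = 3
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def selection_gradients(z, w):
    """Least-squares fit of w = a + beta*z + (gamma/2)*z**2.

    z: standardized trait values; w: relative fitness.
    Returns (beta, gamma); gamma < 0 indicates stabilizing selection.
    """
    cols = [[1.0] * len(z), list(z), [0.5 * v * v for v in z]]
    xtx = [[sum(p * q for p, q in zip(ci, cj)) for cj in cols] for ci in cols]
    xty = [sum(p * q for p, q in zip(ci, w)) for ci in cols]
    _, beta, gamma = solve_3x3(xtx, xty)
    return beta, gamma


# Toy data generated from known gradients (beta = -0.05, gamma = -0.02)
z = [i / 10 for i in range(-30, 31)]
w = [1 - 0.05 * v - 0.01 * v * v for v in z]
beta, gamma = selection_gradients(z, w)
print(round(beta, 3), round(gamma, 3))
```

A negative β here plays the role of the finding that shorter women have higher fitness, while a γ near zero would correspond to the weak stabilizing selection the authors report.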

The results using phenotypic correlations for directional (β) and stabilizing (γ) selection are shown below, separated by sex. The abbreviations are the same as above.

 

There are many cases where directional selection seems to operate in females, but not in males. But they note that that is often due to near zero non-significant results in males, not because there were opposing directions in selection. Height was the exception, with regression coefficients in opposite directions. For stabilizing selection there was no antagonistic trait.

A major finding was that compared to other organisms stabilizing selection was very weak in humans. There’s just not that much pressure against extreme phenotypes. This isn’t entirely surprising. First, you have the issue of the weirdness of a lot of studies in animal models, with inbred lines, or wild populations selected for their salience. Second, prior theory suggests that a trait with lots of heritable quantitative variation, like height, shouldn’t be subject to that much selection. If it were, the genetic variation which is the raw material of the trait’s distribution wouldn’t be there.

Using more complex regression methods that take into account confounds, they pruned the list of significant hits. But, it is important to note that even at ~375,000, this sample size might be underpowered to detect really subtle dynamics. Additionally, the beauty of this study is that it added modern genomic analysis to the mix. Detecting selection through phenotypic analysis goes back decades, but interrogating the genetic basis of complex traits and their evolutionary dynamics is new.

To a first approximation, the results were broadly consonant across the two methods. But, there are interesting details where they differ. There is selection on height in females, but not in males. This implies that though empirically you see taller males with higher rLRS, the genetic variance that is affecting height isn’t correlated with rLRS, so selection isn’t occurring in this sex.

~375,000 may seem like a lot, but from talking to people who work in polygenic selection there is still statistical power to be gained by going into the millions (perhaps tens of millions?). These sorts of results are very preliminary but show the power of synthesizing classical quantitative genetic models and ways of thinking with modern genomics. And, it does have me wondering about how these methods will align with the sort of stuff I wrote about last year which detects recent selection on time depths of a few thousand years. The SDS method, for example, seems to be detecting selection for increasing height the world over…which I wonder is some artifact, because there’s a robust pattern of shorter women having higher fertility in studies going back decades.

Selection for pigmentation in Khoisan?

In the recent paper, Reconstructing Prehistoric African Population Structure, there was a section on natural selection. Since my post on the paper was already very long I didn’t address this dynamic.

But now I want to highlight this section:

The functional category that displays the most extreme allele frequency differentiation between present day San and ancient southern Africans is ‘‘response to radiation’’ (Z = 3.3 compared to the genome-wide average). To control for the possibility that genes in this category show an inflated allele frequency differentiation in general, we computed the same statistic for the Mbuti central African rainforest hunter-gatherer group but found no evidence for selection affecting the response to radiation category.

We speculate that the signal for selection in the response to radiation category in the San could be due to exposure to sunlight associated with the life of the Khomani and Juj’hoan North people in the Kalahari Basin, which has become a refuge for hunter-gatherer populations in the last millenia due to encroachment by pastoralist and agriculturalist groups.

I’m a bit puzzled here, because the implication seems to be that the San populations are darker than they were in the past. And yet earlier this summer I saw a talk which strongly suggested that there was selection in modern Bushman populations for the derived variant of SLC24A5, presumably introduced through admixture from East African populations with Eurasian admixture.

In comparison to their neighbors the San are quite light-skinned, so it’s a reasonable supposition that they have been subject to natural selection recently. The Hadza, in contrast, seem to have the same complexion as their Bantu neighbors.

So what’s the point of demographic models which leave you scratching your head


There’s a new paper on Tibetan adaptation to high altitudes, Evolutionary history of Tibetans inferred from whole-genome sequencing. The focus of the paper is on the fact that more genes than have previously been analyzed seem to be the targets of natural selection. And I buy most of their analyses (not sure about the estimate of Denisovan ancestry being 0.4%…these sorts of things can be tricky).

But they fancy it up with a ∂a∂i model of population history, as well as using MSMC to account for gene flow. I don’t understand why they didn’t use something simpler like TreeMix, which can also handle more complex models. I guess because they wanted to focus on only a few populations?

Years ago I asked the developer of MSMC, Stephan Schiffels, if assuming an admixed population is not admixed might cause weird inferences. Why yes, it would. For example, admixed populations might show higher effective population size since they’re pooling the histories of two separate populations. As for ∂a∂i, the model above leaves me literally scratching my head.
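Why pooling two histories inflates apparent effective size can be illustrated with a crude analog (this is not how MSMC computes anything): expected heterozygosity in a mixture of two diverged populations exceeds the weighted average of the sources, and diversity-based estimators read extra diversity as a larger population. The allele frequencies and the 50/50 mixing fraction are hypothetical.

```python
def heterozygosity(p):
    """Expected heterozygosity at a biallelic site with allele frequency p."""
    return 2 * p * (1 - p)


def admixed_heterozygosity(p1, p2, alpha):
    """Heterozygosity after mixing source 1 (fraction alpha) with source 2."""
    p_mix = alpha * p1 + (1 - alpha) * p2
    return heterozygosity(p_mix)


# Hypothetical allele frequencies at three sites in two diverged source populations
source_1 = [0.10, 0.80, 0.50]
source_2 = [0.60, 0.30, 0.50]

h1 = sum(map(heterozygosity, source_1)) / 3
h2 = sum(map(heterozygosity, source_2)) / 3
h_mix = sum(admixed_heterozygosity(a, b, 0.5) for a, b in zip(source_1, source_2)) / 3
print(round(h1, 3), round(h2, 3), round(h_mix, 3))
```

In this toy case the admixed population is more diverse than either source, so a method that assumes a single unadmixed lineage would attribute the excess to a bigger population.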

…predicted that the initial divergence between Han and Tibetan was much earlier, at 54kya (bootstrap 95% C.I 44 kya to 58 kya). However, for the first 45ky, the two populations maintained substantial gene flow (6.8×10⁻⁴ and 9.0×10⁻⁴ per generation per chromosome). After 9.4 kya (bootstrap 95% C.I 8.6 kya to 11.2 kya), the gene flow rate dramatically dropped (1.3×10⁻¹¹ and 4×10⁻⁷ per generation per chromosome), which is consistent with the estimate from MSMC.

Mystifying. The separation between Chinese and Tibetans is pretty much immediately after modern humans arrive in East Asia. Then there’s a lot of reciprocal gene flow…which ends during the Holocene.

We’re being told here that there are two populations which persisted in some form for ~45,000 years. Is this believable? That these two populations maintained some sort of continuity, and, remained in close proximity to engage in gene flow. And then ~10,000 years ago the ancestors of the Tibetans separated from the ancestors of the modern Han Chinese.

The latter scenario I can imagine. It’s this ~45,000 year dance I’m confused by. If there is substantial gene flow between the two groups why did they keep enough distinctive drift to be separate populations?
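A back-of-the-envelope way to frame that question is Wright's island-model approximation, Fst ≈ 1/(1 + 4·Ne·m), applied to the migration rates quoted above. The effective population size used here is purely an assumption for illustration, and the equilibrium island model is a much cruder thing than the fitted ∂a∂i model, so treat the output as an order-of-magnitude guide only.

```python
def island_model_fst(n_e, m):
    """Wright's equilibrium approximation: Fst ~ 1 / (1 + 4 * Ne * m)."""
    return 1 / (1 + 4 * n_e * m)


ASSUMED_NE = 10_000  # hypothetical effective size, not taken from the paper

# Migration rates quoted above (per generation), before and after ~9.4 kya
for label, m in [("early", 6.8e-4), ("early", 9.0e-4), ("late", 4e-7)]:
    print(label, m, round(island_model_fst(ASSUMED_NE, m), 3))
```

Under these assumptions the early-period rates imply dozens of migrants per generation and an equilibrium Fst of only a few percent. That is another way of saying the two groups would have been barely distinct for most of those 45,000 years.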

With what we know about ancient DNA from Europe if we posited such a model for that continent we’d be way off. There have been too many population turnovers. Is East Asia different? I’m moderately skeptical of that. I think perhaps researchers should be very aware of the limitations of ∂a∂i when it comes to fine-grained population genomic analyses.

Note: This is a cool paper, and this small section is not entirely relevant. Which is why I’m confused about it since it seems the weakest part of the analysis in terms of originality, and the least believable.