The two Pleistocene people of Europe

Dual ancestries and ecologies of the Late Glacial Palaeolithic in Britain:

Genetic investigations of Upper Palaeolithic Europe have revealed a complex and transformative history of human population movements and ancestries, with evidence of several instances of genetic change across the European continent in the period following the Last Glacial Maximum (LGM). Concurrent with these genetic shifts, the post-LGM period is characterized by a series of significant climatic changes, population expansions and cultural diversification. Britain lies at the extreme northwest corner of post-LGM expansion and its earliest Late Glacial human occupation remains unclear. Here we present genetic data from Palaeolithic human individuals in the United Kingdom and the oldest human DNA thus far obtained from Britain or Ireland. We determine that a Late Upper Palaeolithic individual from Gough’s Cave probably traced all its ancestry to Magdalenian-associated individuals closely related to those from sites such as El Mirón Cave, Spain, and Troisième Caverne in Goyet, Belgium. However, an individual from Kendrick’s Cave shows no evidence of having ancestry related to the Gough’s Cave individual. Instead, the Kendrick’s Cave individual traces its ancestry to groups who expanded across Europe during the Late Glacial and are represented at sites such as Villabruna, Italy. Furthermore, the individuals differ not only in their genetic ancestry profiles but also in their mortuary practices and their diets and ecologies, as evidenced through stable isotope analyses. This finding mirrors patterns of dual genetic ancestry and admixture previously detected in Iberia but may suggest a more drastic genetic turnover in northwestern Europe than in the southwest.

Cool paper that shows that the British can still get some things done. Basically, they found that genetically and culturally there were really two different populations in late Pleistocene Europe, and that the earlier Magdalenian-associated populations left some impact on the mostly Villabruna-descended populations like Cheddar Man.

But why is the lactase persistent allele not in HWE?

Dairying, diseases and the evolution of lactase persistence in Europe:

In European and many African, Middle Eastern and southern Asian populations, lactase persistence (LP) is the most strongly selected monogenic trait to have evolved over the past 10,000 years [1]. Although the selection of LP and the consumption of prehistoric milk must be linked, considerable uncertainty remains concerning their spatiotemporal configuration and specific interactions [2,3]. Here we provide detailed distributions of milk exploitation across Europe over the past 9,000 years using around 7,000 pottery fat residues from more than 550 archaeological sites. European milk use was widespread from the Neolithic period onwards but varied spatially and temporally in intensity. Notably, LP selection varying with levels of prehistoric milk exploitation is no better at explaining LP allele frequency trajectories than uniform selection since the Neolithic period. In the UK Biobank [4,5] cohort of 500,000 contemporary Europeans, LP genotype was only weakly associated with milk consumption and did not show consistent associations with improved fitness or health indicators. This suggests that other reasons for the beneficial effects of LP should be considered for its rapid frequency increase. We propose that lactase non-persistent individuals consumed milk when it became available but, under conditions of famine and/or increased pathogen exposure, this was disadvantageous, driving LP selection in prehistoric Europe. Comparison of model likelihoods indicates that population fluctuations, settlement density and wild animal exploitation (proxies for these drivers) provide better explanations of LP selection than the extent of milk exploitation. These findings offer new perspectives on prehistoric milk exploitation and LP evolution.

Two issues

1) Doesn’t seem to explain why LP started becoming common in Britain before the continent

2) Why are the alleles not in HWE? There’s not really any assortative mating.
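For anyone who wants to see what "not in HWE" means operationally, here is a minimal sketch in Python. It assumes you have genotype counts at a single biallelic site; the function name and the counts in the usage lines are made up for illustration and are not taken from the paper. It compares observed counts against the Hardy-Weinberg expectations (p², 2pq, q²) with a one-degree-of-freedom chi-square statistic, where values above about 3.84 correspond to p < 0.05.

```python
def hwe_chi_square(n_AA, n_Aa, n_aa):
    """Return (frequency of allele A, chi-square statistic) for one biallelic locus.

    Hypothetical helper for illustration: compares observed genotype counts
    with the counts expected under Hardy-Weinberg equilibrium.
    """
    n = n_AA + n_Aa + n_aa
    if n == 0:
        raise ValueError("need at least one genotyped individual")
    # Allele frequency of A: each homozygote carries two copies, each het one.
    p = (2 * n_AA + n_Aa) / (2 * n)
    q = 1 - p
    expected = (p * p * n, 2 * p * q * n, q * q * n)
    observed = (n_AA, n_Aa, n_aa)
    # Skip classes with zero expectation (monomorphic site) to avoid dividing by zero.
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    return p, chi2


# Made-up counts: a deficit of heterozygotes relative to expectation.
p, chi2 = hwe_chi_square(30, 40, 30)
print(p, chi2)  # compare chi2 with 3.84 (1 degree of freedom, alpha = 0.05)
```

A heterozygote deficit like this is what assortative mating, population structure, or selection against heterozygotes would produce, which is why the question above is worth asking.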

Back migration into Africa by Eurasians

Two preprints/papers.

Identifying and Interpreting Apparent Neanderthal Ancestry in African Individuals:

Admixture has played a prominent role in shaping patterns of human genomic variation, including gene flow with now-extinct hominins like Neanderthals and Denisovans. Here, we describe a novel probabilistic method called IBDmix to identify introgressed hominin sequences, which, unlike existing approaches, does not use a modern reference population. We applied IBDmix to 2,504 individuals from geographically diverse populations to identify and analyze Neanderthal sequences segregating in modern humans. Strikingly, we find that African individuals carry a stronger signal of Neanderthal ancestry than previously thought. We show that this can be explained by genuine Neanderthal ancestry due to migrations back to Africa, predominately from ancestral Europeans, and gene flow into Neanderthals from an early dispersing group of humans out of Africa. Our results refine our understanding of Neanderthal ancestry in African and non-African populations and demonstrate that remnants of Neanderthal genomes survive in every modern human population studied to date.

Basically, this paper concludes that Eurasian back-migration related to Europeans/West Asians seems to account for around 30% of Sub-Saharan African ancestry. Sub-Saharan Africans carry about 30% of the Neanderthal ancestry of Eurasians.

Then, a preprint that uses a pretty sophisticated method, Ancient Admixture into Africa from the ancestors of non-Africans:

Genetic diversity across human populations has been shaped by demographic history, making it possible to infer past demographic events from extant genomes. However, demographic inference in the ancient past is difficult, particularly around the out-of-Africa event in the Late Middle Paleolithic, a period of profound importance to our species’ history. Here we present SMCSMC, a Bayesian method for inference of time-varying population sizes and directional migration rates under the coalescent-with-recombination model, to study ancient demographic events. We find evidence for substantial migration from the ancestors of present-day Eurasians into African groups between 40 and 70 thousand years ago, predating the divergence of Eastern and Western Eurasian lineages. This event accounts for previously unexplained genetic diversity in African populations and supports the existence of novel population substructure in the Late Middle Paleolithic. Our results indicate that our species’ demographic history around the out-of-Africa event is more complex than previously appreciated.

This paper estimates 35-40% back-migration from the ancestral proto-Eurasian population, with less (~20%) in African hunter-gatherers. This paper didn’t detect Neanderthal ancestry and argues that the back-migration predates the West vs East Eurasian split. It plausibly argues African effective population sizes are inflated by the admixture event.

The two results here clearly contradict each other in the details.

The genetics of Southeast Asia gets more complex…

Ancient genomes from the last three millennia support multiple human dispersals into Wallacea:

Previous research indicates that the human genetic diversity found in Wallacea – islands in present-day Eastern Indonesia and Timor-Leste that were never part of the Sunda or Sahul continental shelves – has been shaped by complex interactions between migrating Austronesian farmers and indigenous hunter-gatherer communities. Here, we provide new insights into this region’s demographic history based on genome-wide data from 16 ancient individuals (2600-250 yrs BP) from islands of the North Moluccas, Sulawesi, and East Nusa Tenggara. While the ancestry of individuals from the northern islands fit earlier views of contact between groups related to the Austronesian expansion and the first colonization of Sahul, the ancestry of individuals from the southern islands revealed additional contributions from Mainland Southeast Asia, which seems to predate the Austronesian admixture in the region. Admixture time estimates for the oldest individuals of Wallacea are closer to archaeological estimates for the Austronesian arrival into the region than are admixture time estimates for present-day groups. The decreasing trend in admixture times exhibited by younger individuals supports a scenario of multiple or continuous admixture involving Papuan- and Asian-related groups. Our results clarify previously debated times of admixture and suggest that the Neolithic dispersals into Island Southeast Asia are associated with the spread of multiple genetic ancestries.

This paper is hard to parse. But here are my takeaways:

– the samples in this study do not seem particularly closely related to the 7,000-year-old sample from Sulawesi

– there was likely an earlier mainland migration into Wallacea of Austro-Asiatic speaking people

– gene flow seems to have been recurrent from the east, in western Melanesia, as well as from Austronesians to the northeast

Yemen and the Yemeni Jews


In my Substack post Under pressure: the paradox of the diamond I said this:

The implication of these DNA results is that Yemeni Jews are by and large descended from natives of this region of Arabia. They are converts, and their genetic uniqueness is a function of their isolation from demographic currents that swept across Arabia with the rise of Islam. The Yemenis of the highlands, isolated by geography, show the same genetic signature of isolation, as they descend solely from the original inhabitants of the region. This is the nth demonstration that culture and geography are both powerful factors driving genetic distinctiveness.

Some people objected, or inquired further, as to why I said this. From High-resolution inference of genetic relationships among Jewish populations:

Four Jewish populations included in the study—Ethiopian Jews, Indian Jews from Cochin, Indian Jews from Mumbai, and Yemenite Jews—are considered to be culturally distinct and not part of the Ashkenazi, Mizrahi, North African, or Sephardi groups; they are therefore not analyzed in sets…

…Figure 1b reveals a distinctive position for the Yemenite Jewish samples in relation to other Jewish populations…

…The resulting MDS plot (Fig. 1c) places the Yemenite Jews near Bedouin, Saudi Arabian, and Yemenite non-Jewish populations…

…Jewish populations have mixed membership in the two clusters, with the exception of the Yemenite Jews, who are placed primarily in the main cluster among Middle Eastern populations. For K = 3, the third cluster (dark blue) separates the Mozabite and Moroccan populations. Non-Jewish populations from the Levant generally have substantial membership in this cluster, as do North African and Yemenite Jews.

For K = 6, Yemenite Jews have relatively high membership in the new cluster, which also has substantial membership from Middle Eastern populations such as Bedouins and Saudi Arabians (pink)…

We further reduced the population set, exploring structure among Jewish populations, continuing to exclude Ethiopian and Indian Jews, and also excluding the relatively dissimilar Yemenite Jews (population set 4)…

You can look at the plots above. I also added some of my own after I added the Vyas et al. Yemen samples (warning, only 7,000 SNP intersection!). Using my own Fst, PCA, and TreeMix, I think it’s possible that the modern Yemenis aren’t related to ancient Yemenis, but Yemeni Jews clearly cluster with modern Arabian populations.

What does the three-population test say? You can look here, but Yemeni Jews don’t show a significant deviation from a three-population phylogeny when they’re an outgroup with the populations I have. That means with my particular model they’re probably best thought of as an ancient Arabian population without much gene flow from external sources (they don’t have much African admixture, unlike other Yemenis).
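For readers who haven't run a three-population test, here is a rough sketch of the statistic behind it. This is the naive form of f3(target; source1, source2), the average over SNPs of (c - a)(c - b) on allele frequencies. It is an illustration only: the function name and the frequencies are hypothetical, and real tools such as ADMIXTOOLS add a bias correction for finite sample size and get a Z-score from a block jackknife, neither of which is done here. A clearly negative value is the signal of admixture between groups related to the two sources; a non-negative value is simply uninformative.

```python
def f3_naive(target, source1, source2):
    """Naive f3(target; source1, source2) from per-SNP allele frequencies.

    Illustrative only: no finite-sample bias correction, no standard error.
    Each argument is a list of frequencies of the same allele at the same SNPs.
    """
    if not target or not (len(target) == len(source1) == len(source2)):
        raise ValueError("frequency lists must be non-empty and equal in length")
    products = [(c - a) * (c - b) for c, a, b in zip(target, source1, source2)]
    return sum(products) / len(products)


# Made-up frequencies: the target sits exactly halfway between the two sources,
# as a 50/50 mixture would, so the statistic comes out negative.
mixed = f3_naive([0.4, 0.6], [0.2, 0.8], [0.6, 0.4])

# Made-up frequencies: the target is off to one side of both sources,
# so the statistic is positive and there is no evidence of admixture.
unmixed = f3_naive([0.9, 0.1], [0.2, 0.8], [0.6, 0.4])
print(mixed, unmixed)
```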

If you want to see the alternative, please read Mitochondrial DNA reveals distinct evolutionary histories for Jewish populations in Yemen and Ethiopia. I’m not spending any more time on this.

So many assumptions about Africa


I have been staring at this figure and rereading Ancient West African foragers in the context of African population history. The Shum Laka samples from this paper, dating to four to eight thousand years ago, have drawn my attention, and I’m just looking at them a lot.

It seems ridiculous I’ve been using Nigerians as my “African reference” for decades. Most African populations, including Pygmies and Khoisan, have Eurasian admixture from the last 10,000 years. And what about deeper back-to-Africa ancestry? That seems likely and is hinted at in the above paper.

Modern human lineages have a deep history in Africa and the Near East. I think we’re going to have a transformation of our understanding of what happened in these regions in the near future.

The Greeks in the mountains

The New Yorker has a long feature that explores the strange results from the paper last year, Ancient DNA from the skeletons of Roopkund Lake reveals Mediterranean migrants in India. Basically, they found a bunch of Indians who died 1,000 years ago, and a bunch of Greeks who died a few centuries ago. They were buried naturally in a very isolated lake high in the Himalayas. There are all sorts of hypotheses regarding the Greeks, whose bones indicate a Mediterranean diet, and whose closest genetic match is to individuals from Crete. My personal experience is that “mainland Greeks” tend to be a bit Northern European shifted, so these individuals may have been Anatolian or Aegean Greeks.

Stuart Fiedel, who sometimes comments on this weblog, suggests these were Armenian traders. But David Reich correctly points out Armenians are very distinct genetically from Greeks (though the two are not entirely different obviously!). Another hypothesis is a bone mix-up, but the issue here is there are a lot of individuals who are of the same population and seem to have lived in the same region. How could bone mix-ups produce so many systematic errors?

Ultimately there’s no final answer in the piece, though hopefully, someone will present a reasonable conjecture.

Because the piece has Reich and his lab spotlighted, they allude to the controversy around him. This is ultimately going to be the legacy of the hit-piece from a few years back. He’s now a “controversial figure,” which is, to be frank, not a bad thing in the eyes of some of the Reich lab’s scientific rivals. Most media treatments that aren’t purely about his research (i.e., Carl Zimmer’s column in The New York Times covering the Reich lab publications) will mention this now.

Here’s why he’s a mensch:

Still, some anthropologists, social scientists, and even geneticists are deeply uncomfortable with any research that explores the hereditary differences among populations. Reich is insistent that race is an artificial category rather than a biological one, but maintains that “substantial differences across populations” exist. He thinks that it’s not unreasonable to investigate those differences scientifically, although he doesn’t undertake such research himself. “Whether we like it or not, people are measuring average differences among groups,” he said. “We need to be able to talk about these differences clearly, whatever they may be. Denying the possibility of substantial differences is not for us to do, given the scientific reality we live in.”

This, in 2020, is an old-fashioned view. There are now young American researchers who frankly express disquiet and discomfort at the idea of studying human population genetic variation, period, including people who themselves have studied topics such as polygenic adaptation in humans. This would be a very strange view for older researchers, but it’s not totally out of the norm today, so expect someone like Reich to be viewed as quite the dinosaur in a decade. It seems ridiculous to say, but I do wonder if we’re seeing the end of the “humans as a model organism” era. Lots of people are not happy with the new atmosphere, but lots of people just keep quiet and go along.

Correlated response is a big story of selection

Adaptation is clearly one of the most important processes in understanding how evolution occurs. In a classical sense, it’s easy to understand. Parallel adaptations in body plans make dolphins and swordfish shaped the same. It’s physics.

But with the emergence of DNA, a lot of the focus on adaptation has been displaced to the signatures of natural selection on the molecular level. Phenotypes are controlled by variation in genotypes, and instead of description and hypothesizing, researchers can actually infer from the genetic patterns the history and arc of adaptation. 

At least that’s the theory.

The initial tests for signatures of natural selection mostly focused on adaptation between species, for example the HKA and McDonald-Kreitman tests, alongside frequency-based statistics computed within a single sample, such as Tajima’s D. Usually this took the form of comparing variation across two lineages of Drosophila. In the 2000s with genome-wide data new methods predicated on looking at ‘haplotype structure’ (variation across sequences of genes) emerged. Instead of between species, these methods focused on the selection within species (e.g., why are some humans adapted to malaria?). These methods were good at picking up strong signals at a few genes where the selective sweeps were recent.
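To make one of these classic statistics concrete, here is a minimal sketch of Tajima's D in Python. It takes the summary numbers directly (sample size, number of segregating sites, and mean pairwise differences) rather than reading sequences, and the function name and example inputs are illustrative assumptions, not anything from the papers discussed here. D contrasts two estimates of diversity: one from pairwise differences and one from the count of segregating sites. Under neutrality and constant population size it should hover around zero.

```python
import math


def tajimas_d(n, seg_sites, pi):
    """Tajima's D from summary statistics.

    n          number of sampled sequences (at least 4 so the variance is positive)
    seg_sites  number of segregating sites S (at least 1)
    pi         mean number of pairwise differences between sequences
    """
    if n < 4 or seg_sites < 1:
        raise ValueError("need n >= 4 sequences and at least one segregating site")
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    # Numerator: pairwise-difference estimate minus Watterson-style estimate.
    d = pi - seg_sites / a1
    return d / math.sqrt(e1 * seg_sites + e2 * seg_sites * (seg_sites - 1))


# Illustrative inputs: 10 sequences, 16 segregating sites, pi of about 3.89.
# The result is negative, meaning an excess of rare variants relative to neutrality.
print(tajimas_d(10, 16, 3.888889))
```

A negative D like this is compatible with a recent sweep, but also with population expansion, which is part of why single-locus frequency statistics are hard to interpret on their own.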

But as datasets and genomics got bigger and better, researchers focused on more fundamental patterns and analyses, such as looking at ‘site frequency spectra.’ Ultimately the goal was to go beyond selection at a single locus (e.g., lactase persistence), and understand polygenic characteristics (e.g., height). Obviously, this is much harder because polygenic characters are distributed across many genetic loci, and issues of statistical power are always going to loom large (and there is the soft vs hard sweep issue too!).

A new preprint is an excellent introduction to this wild world, Disentangling selection on genetically correlated polygenic traits using whole-genome genealogies:

We present a full-likelihood method to estimate and quantify polygenic adaptation from contemporary DNA sequence data. The method combines population genetic DNA sequence data and GWAS summary statistics from up to thousands of nucleotide sites in a joint likelihood function to estimate the strength of transient directional selection acting on a polygenic trait. Through population genetic simulations of polygenic trait architectures and GWAS, we show that the method substantially improves power over current methods. We examine the robustness of the method under uncorrected GWAS stratification, uncertainty and ascertainment bias in the GWAS estimates of SNP effects, uncertainty in the identification of causal SNPs, allelic heterogeneity, negative selection, and low GWAS sample size. The method can quantify selection acting on correlated traits, fully controlling for pleiotropy even among traits with strong genetic correlation (|rg| = 80%; c.f. schizophrenia and bipolar disorder) while retaining high power to attribute selection to the causal trait. We apply the method to study 56 human polygenic traits for signs of recent adaptation. We find signals of directional selection on pigmentation (tanning, sunburn, hair, P=5.5e-15, 1.1e-11, 2.2e-6, respectively), life history traits (age at first birth, EduYears, P=2.5e-4, 2.6e-4, respectively), glycated hemoglobin (HbA1c, P=1.2e-3), bone mineral density (P=1.1e-3), and neuroticism (P=5.5e-3). We also conduct joint testing of 137 pairs of genetically correlated traits. We find evidence of widespread correlated response acting on these traits (2.6-fold enrichment over the null expectation, P=1.5e-7). We find that for several traits previously reported as adaptive, such as educational attainment and hair color, a significant proportion of the signal of selection on these traits can be attributed to correlated response, vs direct selection (P=2.9e-6, 1.7e-4, respectively). 
Lastly, our joint test uncovers antagonistic selection that has acted to increase type 2 diabetes (T2D) risk and decrease HbA1c (P=1.5e-5).

There’s a lot going on here. This is my favorite passage:

To address these issues, we recently developed a full-likelihood method, CLUES, to test for selection and estimate allele frequency trajectories [21]. The method works by stochastically integrating over both the latent ARG using Markov Chain Monte Carlo, and the latent allele frequency trajectory using a dynamic programming algorithm, and then using importance sampling to estimate the likelihood function of a focal SNP’s selection coefficient, correcting for biases in the ARG due to sampling under a neutral model.

Alrighty then! Someone’s a major-league nerd.

The preprint is fine, but ultimately this is something you get a “feel” for by working with models, data, and general analyses in the field. And I don’t have a strong feel since I don’t work with these sorts of data and questions myself. So what do I know? That being said, I like the preprint because it satisfies an intuition I’ve long had: correlated response is a big part of the story of polygenic selection.

Basically, you have to remember that complex traits are subject to variation at a host of genetic positions. And genetic variants rarely have singular effects. That is, one locus usually exhibits pleiotropy. The genetic effect shapes a lot of characteristics. Therefore, if there is strong selection on a gene, more traits than simply the target of selection will be impacted. In animal breeding, making huge, meaty, fast-growing lineages can render them infertile if selection is taken too far. That’s a bad correlated response.
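The textbook way to put numbers on correlated response is the multivariate breeder's equation from quantitative genetics (Lande's equation): the per-generation change in trait means is the additive genetic covariance matrix G times the vector of selection gradients. The sketch below is a toy in Python with made-up values, and it is not the method in the preprint, which works from genealogies and GWAS summary statistics. Here two standardized traits have a genetic correlation of 0.8 and only the first is under direct selection.

```python
def response_to_selection(G, beta):
    """Predicted change in trait means per generation: delta_z = G * beta.

    G     additive genetic (co)variance matrix, as a list of rows
    beta  selection gradients, one per trait (direct selection only)
    """
    if any(len(row) != len(beta) for row in G):
        raise ValueError("each row of G must have one entry per trait")
    return [sum(g * b for g, b in zip(row, beta)) for row in G]


# Made-up numbers: unit genetic variances, genetic covariance of 0.8.
G = [[1.0, 0.8],
     [0.8, 1.0]]
beta = [0.1, 0.0]  # trait 1 is selected directly, trait 2 is not

delta_z = response_to_selection(G, beta)
print(delta_z)  # trait 2 moves about 80% as far as trait 1 despite zero direct selection
```

Looking only at trait 2, you would see a shift in the mean and be tempted to call it adaptation on that trait. That is the confound the joint test is trying to pull apart.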

After correcting for the genetic correlation the authors note that some traits, such as EDU and hair color, are not really selected directly at all. This is like the fact that we know EDAR is associated with hair thickness and is a strong target of selection. We have no idea what the trait of interest is. But it’s a pretty big deal. All these quantitative traits controlled by variation across the genome are being reshaped by adaptation on other traits. What are those traits? This preprint doesn’t answer that really.

Hopefully, we’ll make some headway in the 2020s because we’re definitely looking through the mirror darkly.

Humans are basically invasive weeds

One of the somewhat surprising things we have learned over the last decade is that massive admixture and homogenization has occurred between distinct human lineages over the last 10,000 years. By this, I mean that we’re not talking simply about continuous gene-flow between neighboring populations, but massive expansions of small groups and assimilation of very different groups by the expanding groups. As a stylized fact, it looks like “Early European Farmers” were as distinct from Mesolithic hunter-gatherers as modern Northern Europeans are from Han Chinese (pairwise Fst ~0.10). The fusion of these two groups later merged in much of Europe with migrants from the east, from the western edge of the forest-steppe.
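If the Fst number is opaque, here is a small sketch of what it measures, using the simplest definition: the fraction of total expected heterozygosity that is due to allele frequency differences between two populations. This is a toy in Python with hypothetical frequencies. It weights the two populations equally and ignores sample-size corrections, so it is not the estimator you would use on real genotype data.

```python
def fst_two_pops(freqs1, freqs2):
    """Simple Fst for two equally weighted populations, pooled over loci.

    freqs1, freqs2  per-locus frequencies of the same allele in each population.
    Uses a ratio of sums across loci: sum(H_T - H_S) / sum(H_T).
    """
    if len(freqs1) != len(freqs2):
        raise ValueError("frequency lists must be the same length")
    numerator = 0.0
    denominator = 0.0
    for p1, p2 in zip(freqs1, freqs2):
        # Mean within-population expected heterozygosity.
        h_s = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2
        # Expected heterozygosity of the pooled population.
        p_bar = (p1 + p2) / 2
        h_t = 2 * p_bar * (1 - p_bar)
        numerator += h_t - h_s
        denominator += h_t
    return numerator / denominator if denominator > 0 else 0.0


# Made-up single locus: frequencies of 0.3 versus 0.7 already give Fst above 0.10.
print(fst_two_pops([0.3], [0.7]))
```

The point of the toy is scale: an Fst around 0.10 does not need fixed differences, just moderately shifted frequencies across many loci.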

The empirical pattern seems to be that cultural innovations (e.g., agriculture) trigger demographic revolutions, which homogenize and admix vast regions. This is a story of demographic history. Phylogeography.

But there is another aspect, natural selection. Humans are not exempt from this. Selection operates upon genetic variation, which is preexistent (“standing variation”), or, comes from new mutations (de novo).

It seems plausible that cultural innovation has resulted in a great deal of selection over the last 10,000 years. So where did the raw material come from? One argument that has been playing out is between those who argue that it’s from variation within human populations that is ancestral and shared, and those who argue that it’s from new variation. This is where admixture comes into play.

A new preprint on bioRxiv uses the 1000 Genomes data in the New World to suggest that admixture resulted in the introduction of a lot of adaptive alleles into populations of mostly European and Native background from African ancestry. Basically, it seems likely that the American tropics were colonized by African tropical diseases, which entailed adaptations which were already existent within African populations. Admixture-enabled selection for rapid adaptive evolution in the Americas:

Background: Admixture occurs when previously isolated populations come together and exchange genetic material. We hypothesized that admixture can enable rapid adaptive evolution in human populations by introducing novel genetic variants (haplotypes) at intermediate frequencies, and we tested this hypothesis via the analysis of whole genome sequences sampled from admixed Latin American populations in Colombia, Mexico, Peru, and Puerto Rico. Results: Our screen for admixture-enabled selection relies on the identification of loci that contain more or less ancestry from a given source population than would be expected given the genome-wide ancestry frequencies. We employed a combined evidence approach to evaluate levels of ancestry enrichment at (1) single loci across multiple populations and (2) multiple loci that function together to encode polygenic traits. We found cross-population signals of African ancestry enrichment at the major histocompatibility locus on chromosome 6, consistent with admixture-enabled selection for enhanced adaptive immune response. Several of the human leukocyte antigen genes at this locus (HLA-A, HLA-DRB51 and HLA-DRB5) showed independent evidence of positive selection prior to admixture, based on extended haplotype homozygosity in African populations. A number of traits related to inflammation, blood metabolites, and both the innate and adaptive immune system showed evidence of admixture-enabled polygenic selection in Latin American populations. Conclusions: The results reported here, considered together with the ubiquity of admixture in human evolution, suggest that admixture serves as a fundamental mechanism that drives rapid adaptive evolution in human populations.

The period after 1492 is easy for us to think about. But what ancient DNA has shown us is that it’s not as uncommon a phase as we might have thought.

Whole genome sequencing comes to Cavalli-Sforza’s samples

More than twenty years ago L. L. Cavalli-Sforza published The History and Geography of Human Genes. Based on decades of analysis of ‘classical’ markers, this work lays out results of statistical genetic analyses based on a few hundred genes, as well as displaying Cavalli-Sforza’s encyclopedic ethnographic knowledge. A close look at this book will yield some familiar population groups to readers of this weblog. The reason for this is simple: the cell lines continued onward to contribute to the HGDP data set.

In 2002 Rosenberg et al. used these populations in Genetic Structure of Human Populations by looking at “377 autosomal microsatellite loci.” Microsatellites are highly variable genetic regions. They pack a lot of diversity per locus. With more input variation Rosenberg et al. advanced beyond Cavalli-Sforza’s earlier work (instead of pairwise comparisons between populations, one could infer individual relatedness as displayed in a bar plot).

But times change, and in 2008 the same data set was used in Worldwide human relationships inferred from genome-wide patterns of variation, which utilized a 650,000 marker SNP-array. Though Rosenberg et al.’s work advanced the ball considerably, the move to genome-wide analysis was even bigger. For many years this data set has been a widely used benchmark and reference (these markers and populations were part of the early basis of 23andMe’s analyses in terms of population genetic inference). As the 1000 Genomes Project moved us beyond the SNP-array period, looking at the whole genome, as opposed to a specific set of SNPs, the HGDP populations were still an important complement.

The reason was simple: Cavalli-Sforza was an ethnographic genius in comparison to most geneticists and had selected very interesting and informative populations. In some ways, the original motivation given for selecting these groups, that they may have preserved phylogenetic patterns obscured in cosmopolitan populations, has only been partially justified.

Ancient DNA has shed light on the reality that almost all populations, indigenous and cosmopolitan, come out of periods of admixture between lineages which had heretofore been distinct and separate. But some of Cavalli-Sforza’s populations have been inordinately important in informing us about branches of the human family tree less well represented in the cosmopolitan samples accessible in the 1000 Genomes Project (or earlier, the HapMap). I’m thinking here of the Kalash (a relatively good proxy for “Ancestral North Indians”), Sardinians (the best representatives in the modern world of “Early European Farmers”), and African hunter-gatherers (who carry the deepest diverging lineage within the modern human clade).

With all that, finally, the HGDP whole genome preprint is out. Anders Bergstrom superstar!

Insights into human genetic variation and population history from 929 diverse genomes:

Genome sequences from diverse human groups are needed to understand the structure of genetic variation in our species and the history of, and relationships between, different populations. We present 929 high-coverage genome sequences from 54 diverse human populations, 26 of which are physically phased using linked-read sequencing. Analyses of these genomes reveal an excess of previously undocumented private genetic variation in southern and central Africa and in Oceania and the Americas, but an absence of fixed, private variants between major geographical regions. We also find deep and gradual population separations within Africa, contrasting population size histories between hunter-gatherer and agriculturalist groups in the last 10,000 years, a potentially major population growth episode after the peopling of the Americas, and a contrast between single Neanderthal but multiple Denisovan source populations contributing to present-day human populations. We also demonstrate benefits to the study of population relationships of genome sequences over ascertained array genotypes. These genome sequences are freely available as a resource with no access or analysis restrictions.

The authors were able to make recourse to many more subtle analytic methods with their phasing, which seems to have been considerably superior to population phasing (the HGDP does have some closely related individuals due to endogamy, but no traditional trios). Because their population set included some undersampled groups with a lot of diversity (e.g., San Bushmen), they detected about as many variants with ~1,000 individuals at good coverage as the 1000 Genomes Project with ~2,500 individuals at variable coverage.

And there are major lacunae within the 1000 Genomes Project data set even after taking into account ethnographically and historically important groups such as the San Bushmen. There are no Middle Eastern populations in the 1000 Genomes Project. The HGDP has the Druze, Bedouin, Palestinians, and Mozabites.

The preprint requires a lot of deep reading. There is much in there that one can mull over (frankly, I’m excited about the supplementary text, but that’s just me). One thing that came to mind is that ancient DNA and other more narrow studies laid the groundwork for the interpretations that naturally fall out of this extremely rich potential set of analyses. For example, by looking at shared variants across western and central Africa the authors confirm the likely result that there is a basal human population of some sort mixed into peoples of far western Africa. And, they also confirm that the Yoruba are about 5-10% Eurasian.

These sequences generate so much data that there are lots of potential models that might conform to them. Earlier work eliminates some possibilities and highlights others.

Ancient DNA has confirmed for many that non-Africans have Neanderthal ancestry. But there have been several debates about whether there are issues with the assumption that Africans have no Neanderthal ancestry, and how it skews statistics (e.g., if Africans have some Neanderthal ancestry, one will underestimate the Neanderthal fraction in everyone else). Though there are still details to be hashed out, by looking at coalescence patterns of haplotypes the authors seem to be able to infer the presence of deeply diverged lineages in various populations without positing a prior model of which populations did not have the introgression as a baseline. Basically, Neanderthal and Denisovan ancestry is going to result in some “long branches” in the phylogenies of the genes within non-African populations which are lacking in Africans, and that is what they see.
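To make the underestimation point concrete, here is a toy sketch in Python. It is my own illustration, not anything from the preprint: it assumes a ratio-style estimator (in the spirit of an f4-ratio) in which allele frequencies are a simple linear mixture of a "modern" and a "Neanderthal" source, and the function name and example numbers are hypothetical. Under those assumptions, if the African baseline really carries a fraction beta of Neanderthal ancestry, an estimator that treats it as zero returns (alpha - beta) / (1 - beta) instead of the true alpha.

```python
def naive_neanderthal_fraction(alpha: float, beta: float) -> float:
    """Toy model of a ratio estimator that assumes the African baseline
    has zero Neanderthal ancestry.

    alpha: true Neanderthal fraction in the test (non-African) population
    beta:  true Neanderthal fraction in the African baseline
    Returns the fraction the naive estimator would report under a simple
    linear-mixture assumption (an illustration, not the preprint's method).
    """
    if not (0.0 <= alpha <= 1.0 and 0.0 <= beta < 1.0):
        raise ValueError("fractions must lie in [0, 1], with beta < 1")
    # Numerator signal scales with (alpha - beta); denominator with (1 - beta).
    return (alpha - beta) / (1.0 - beta)


if __name__ == "__main__":
    true_alpha = 0.02  # hypothetical true fraction in a non-African group
    for beta in (0.0, 0.001, 0.003, 0.005):  # hypothetical baseline fractions
        est = naive_neanderthal_fraction(true_alpha, beta)
        print(f"baseline beta={beta:.3f} -> estimated fraction={est:.5f}")
```

The estimate equals the true value only when the baseline really is zero, and it shrinks as the baseline fraction grows, which is the skew the debates are about.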

These researchers also confirm the model presented by others that Neanderthal contribution seems to have been from a single admixture event (I do wonder if perhaps Neanderthals were not simply extremely homogeneous, so multiple close admixture events may not be differentiable). They also find that the “Denisovan” population structure was more complex, and there were several admixture events into eastern Eurasian and Oceanian populations.

Finally, there are attempts to adduce the nature of population differentiation, and times of separation. As noted in the text, all of these sorts of analyses are sensitive to assumptions within models. They used a variety of methods which came to different results, but one thing that seems clear is that Africa had a lot of deep structure for a long time, and that gene flow between regional populations meant that genetic differentiation emerged gradually, rather than in a rapid fashion due to geographic separation. Over five years ago Iain Mathieson casually told me that he viewed much of the past 200,000 years as the collapse of deep population structure, and that does not seem to have been a crazy prediction if you read through this preprint (though the collapse may be increased rates of gene flow, rather than massive pulse admixtures).

But the separation and differentiation outside of Africa, and between the archaic lineages and Africans, seems to have exhibited more punctuation. For the past twenty years John Hawks has been emphasizing that we need to remember that during the Pleistocene Africa likely had a much larger hominin population than the rest of the world (with perhaps a caveat for lower latitude Asia). The relatively “clean” separation between the proto-modern African lineage and the Eurasian hominins, and then the quick separation between Neanderthals and the eastern group which became Denisovans, emphasizes perhaps the importance of particular geographic barriers (deserts in the Near East), as well as the lower carrying capacity in much of Eurasia. With lower population densities and patchy occupation patterns, gene flow would be sharply reduced. This would result in drift and sharply different lineages.
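A back-of-the-envelope way to see why small, poorly connected populations diverge so sharply is Wright's classic island-model approximation, F_ST ≈ 1 / (1 + 4Nm), where N is the effective size of each local population and m is the migration rate per generation. This is a textbook idealization that I am bringing in for illustration; it is not an analysis from the preprint, and the parameter values below are made up.

```python
def island_model_fst(n_effective: float, migration_rate: float) -> float:
    """Equilibrium F_ST under Wright's infinite-island approximation.

    n_effective:    effective size of each local population (N)
    migration_rate: fraction of each population replaced by migrants
                    per generation (m)
    """
    if n_effective <= 0 or migration_rate < 0:
        raise ValueError("need N > 0 and m >= 0")
    return 1.0 / (1.0 + 4.0 * n_effective * migration_rate)


if __name__ == "__main__":
    # Hypothetical contrast: well-connected demes versus sparse, patchy ones.
    scenarios = {
        "dense, well connected": (10_000, 0.01),
        "sparse, patchy": (1_000, 0.0001),
    }
    for label, (n, m) in scenarios.items():
        print(f"{label}: N={n}, m={m} -> F_ST ~ {island_model_fst(n, m):.3f}")
```

The point is only qualitative: when the product Nm falls well below one, drift dominates and differentiation climbs steeply, while even modest gene flow into large populations keeps it low.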

There are arguments out there about whether humans are a clinal species or not. These verbal descriptions really don’t tell us much. The combination of ancient DNA and whole genome data will allow us to specify, at particular times and places, the nature of population dynamics. If human population relationships can be thought of as a graph, a set of populations connected by edges, in some areas the connections will be thicker (ergo, lots of continuous gene flow), and in other areas, the graphs will be easier to represent as diverging trees.

I think the last 10,000 years of the Holocene have brought to Eurasia a more African pattern, as deep structure comes crashing down due to rapid population expansion and mixing….