The modern human family tree might be shallower than I’ve been saying

Estimating population split times and migration rates from historical effective population sizes:

The estimation of effective population sizes (Ne) through time is of fundamental interest in population genetics, but the interpretation of Ne as the effective number of breeding individuals in the population is challenged by the effect of population structure. In fact, variation in Ne reported in many studies may be a consequence of changes in migration rates between populations rather than changes in actual population size. We address this long-standing problem here by constructing joint models of population size changes, migration, and divergence that can adjust temporal estimates of Ne and estimate the actual Ne of a local deme connected to another population through migration. We also develop a method for estimating divergence times and migration rates taking into account complex scenarios of changing population sizes. We apply the method to previously published data from humans, and show that, when taking migration and changes in Ne into account, the estimated divergence between the San and Dinka populations is approximately 108 kya, and not 255 kya as reported in a previous study. Using simulations, we demonstrate that the previously reported and surprisingly old estimates of divergence between San and Dinka is in fact caused by a quantifiable estimation bias due to changes in Ne through time.

If you read this blog you know I’ve alluded to really deep structure in Africa for a while. Some work using ancient Khoisan samples from South Africa puts the divergence between this population and other moderns as far back as 300,000 years ago. Privately a lot of geneticists were skeptical of this, but the published record is what it is. But there were many simplifying assumptions in these models. You can see this in Ne calculations that treat recently admixed populations as if they were unadmixed ancient groups. If you have an admixed population with higher genetic diversity you’re going to estimate a larger effective population…but really it’s just the pooled population of two distinct ancestral populations (or more than two, depending).

This preprint, along with A weakly structured stem for human origins in Africa, is pushing the divergence times within Africa closer to 100-125,000 years ago, as opposed to 200,000 years ago. They do make an estimate for Eurasians that seems to be corroborated by archaeology and ancient DNA:

For Han-French divergence, the model with the highest composite likelihood was one with a split time of 1505 generations (i.e. 43,645 years ago assuming 29 years per generation) and a mostly unidirectional migration rate of 2.92 from Han to French (Table 1). We also replicate the results from Sjödin et al. [2021], in which the TT method infers nonsensical negative split times between Han and French. The unidirectional migration inferred from Han to French is in line with current models of the peopling of Europe through waves of farmers coming from central Eurasia [Haak et al., 2015].

The 43,645 years ago estimate seems broadly correct. Ancient DNA and archaeology I think point to a period definitely before 40,000 years ago, but admixture with Neanderthals and the spread of modern human technologies means it is unlikely to be very much before 50,000 years ago (i.e., not 60,000 years ago). The “Han” to “French” migration is strange, but there is suggestive evidence deep in supplements of East Asian migration into late Pleistocene/early Holocene Europe into Mesolithic foragers. This might be common Ancestral North Eurasian ancestry, or something different. I’m not sure that this model totally checks out and we know what’s going on. Probably one reason it remains in the supplements of these papers.
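The date depends directly on the assumed generation time, so it is worth seeing how much it moves. This is a minimal sketch: 1505 generations and 29 years per generation come from the preprint's text quoted above, while the 25 and 30 year alternatives and the function name are my own assumptions for comparison.

```python
def split_time_in_years(generations, years_per_generation=29.0):
    """Convert a split time in generations to years before present."""
    return generations * years_per_generation

# 29 years is the preprint's stated assumption; 25 and 30 are hypothetical alternatives.
for generation_time in (25.0, 29.0, 30.0):
    years = split_time_in_years(1505, generation_time)
    print(f"{generation_time:.0f} years/generation: {years:,.0f} years ago")
```

With a 25 year generation time the same 1505 generations would fall below 40,000 years, so the generation-time assumption matters for how well the estimate lines up with archaeology.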

They’re getting estimates between Sardinians and Africans a bit before 100,000 years ago…though they admit that it’s probably inflated by archaic (Neanderthal) admixture. That value seems about right and indicates a long period of incubation of the ancestrally non-African populations within the context of African/perhaps West Asian population structure.

For more complexity/detail, see The genomic origins of the world’s first farmers, which purports to better model Ne variations to get better divergence times within Europe between various forager and farmer lineages. This group has not used “Basal Eurasians” in their human genetics papers the last few times. They don’t believe it’s needed.

Cities are where people go to flourish, and then die

Stable population structure in Europe since the Iron Age, despite high mobility:

Ancient DNA research in the past decade has revealed that European population structure changed dramatically in the prehistoric period (14,000-3,000 years before present, YBP), reflecting the widespread introduction of Neolithic farmer and Bronze Age Steppe ancestries. However, little is known about how population structure changed in the historical period onward (3,000 YBP – present). To address this, we collected whole genomes from 204 individuals from Europe and the Mediterranean, many of which are the first historical period genomes from their region (e.g. Armenia, France). We found that most regions show remarkable inter-individual heterogeneity. Around 8% of historical individuals carry ancestry uncommon in the region where they were sampled, some indicating cross-Mediterranean contacts. Despite this high level of mobility, overall population structure across western Eurasia is relatively stable through the historical period up to the present, mirroring the geographic map. We show that, under standard population genetics models with local panmixia, the observed level of dispersal would lead to a collapse of population structure. Persistent population structure thus suggests a lower effective migration rate than indicated by the observed dispersal. We hypothesize that this phenomenon can be explained by extensive transient dispersal arising from drastically improved transportation networks and the Roman Empire’s mobilization of people for trade, labor, and military. This work highlights the utility of ancient DNA in elucidating finer scale human population dynamics in recent history.

This is the most important: ‘According to a longstanding historical hypothesis, the Urban Graveyard Effect, the influx of migrants in city-centers disproportionately contributed to death rate over birth rate; a process which would contribute to observing individuals as “transient” migrants…’

To me, it confirms that the urban demographics of the ancient world were always transient because of low total fertility.
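To get a feel for why high dispersal and persistent structure are in tension, here is a minimal sketch using Wright's island model, where equilibrium differentiation is approximately F_ST = 1 / (1 + 4Nm). This is a textbook simplification I am swapping in for illustration, not the model the authors fit. The deme size of 10,000 is hypothetical, and treating the 8% figure as a per-generation migration rate is a deliberate oversimplification.

```python
def island_model_fst(deme_size, migration_rate):
    """Wright's island-model equilibrium approximation: F_ST ~ 1 / (1 + 4*N*m)."""
    return 1.0 / (1.0 + 4.0 * deme_size * migration_rate)

# Hypothetical deme size; migration rates are per generation.
DEME_SIZE = 10_000
for m in (0.0001, 0.001, 0.01, 0.08):
    print(f"m = {m}: F_ST ~ {island_model_fst(DEME_SIZE, m):.5f}")
```

Under these assumptions differentiation is essentially erased long before m gets anywhere near 0.08. If most of the observed movers leave few descendants where they are buried, the effective m is much lower than the observed dispersal, which is the urban graveyard point.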

Ashkenazi Jewish ethnogenesis in light of the Erfurt medieval DNA


Genome-wide data from medieval German Jews show that the Ashkenazi founder event pre-dated the 14th century:

We report genome-wide data for 33 Ashkenazi Jews (AJ), dated to the 14th century, following a salvage excavation at the medieval Jewish cemetery of Erfurt, Germany. The Erfurt individuals are genetically similar to modern AJ and have substantial Southern European ancestry, but they show more variability in Eastern European-related ancestry than modern AJ. A third of the Erfurt individuals carried the same nearly-AJ-specific mitochondrial haplogroup and eight carried pathogenic variants known to affect AJ today. These observations, together with high levels of runs of homozygosity, suggest that the Erfurt community had already experienced the major reduction in size that affected modern AJ. However, the Erfurt bottleneck was more severe, implying substructure in medieval AJ. Together, our results suggest that the AJ founder event and the acquisition of the main sources of ancestry pre-dated the 14th century and highlight late medieval genetic heterogeneity no longer present in modern AJ.

I’ve been asked how this modifies the narrative in my Substack piece. I’d say only on the margins.

The Erfurt community dates to the 1300’s. Interestingly it shows variation in Eastern European ancestry. The authors suggest that genetically there are actually two clusters here, though socioculturally they’re identical. The best guess is that the Eastern European enriched population migrated to Erfurt from the east (there is some difference in the isotope analysis for the teeth). Modern Ashkenazi Jews show less variability and are positioned between the two Erfurt communities in PCA and admixture space. The group without Eastern European ancestry seems to resemble Sephardi and Italian Jews. This isn’t surprising, since they confirm Ashkenazi Jews are some proportional mix of a Middle Eastern population, Italians, and Eastern Europeans, while Sephardi and Italian Jews clearly just lack the last.

The Erfurt community experienced a strong bottleneck, stronger than the one in modern Ashkenazi Jews. This implies that there are other groups out there unsampled, and modern Ashkenazim descend from those. This isn’t surprising; one piece of feedback I got is that there are so many medieval Jews (going by the texts) relative to the inferred population size during this period. I think one issue might be that a lot of the medieval Jewish communities simply went extinct. Many of them were undergoing similar dynamics, but not all contributed to future Ashkenazi ancestry.

The homogeneity (relative) of modern Ashkenazim is probably due to late-medieval metapopulation dynamics.

The Japanese as a creation of the Christian Era

The traditional model, which I’ve alluded to before on this weblog, is that Japan is a synthesis of Jomon and Yayoi, with the latter dominant, and bringing rice-agriculture to the islands. A new paper in Science indicates it may be more complicated than that.

Ancient genomics reveals tripartite origins of Japanese populations:

Prehistoric Japan underwent rapid transformations in the past 3000 years, first from foraging to wet rice farming and then to state formation. A long-standing hypothesis posits that mainland Japanese populations derive dual ancestry from indigenous Jomon hunter-gatherer-fishers and succeeding Yayoi farmers. However, the genomic impact of agricultural migration and subsequent sociocultural changes remains unclear. We report 12 ancient Japanese genomes from pre- and postfarming periods. Our analysis finds that the Jomon maintained a small effective population size of ~1000 over several millennia, with a deep divergence from continental populations dated to 20,000 to 15,000 years ago, a period that saw the insularization of Japan through rising sea levels. Rice cultivation was introduced by people with Northeast Asian ancestry. Unexpectedly, we identify a later influx of East Asian ancestry during the imperial Kofun period. These three ancestral components continue to characterize present-day populations, supporting a tripartite model of Japanese genomic origins.

The Kofun period begins around 300 AD. The implication here is that there was a mass migration from the Asian continent less than 2,000 years ago, likely from Korea. The first agriculturalists, the Yayoi, seem to be a mix of native Jomon and individuals with strong affinities to populations in Manchuria.

Here’s a stylized representation that captures the turnover:

The Jomon are interesting because these results indicate low effective population, and, deep connections with ANE (Ancient North Eurasians). They also seem a clade deep within Northeast Asians, dating to the Pleistocene.

In any case, the authors admit that their sampling of the Yayoi is weak, so there needs to be follow-up here. If it does turn out that the Japanese are mostly Kofun-period, then I think that recalibrates our sense of its history a great deal. The Japan of the 7th century which enters into history was a very young nation.

Funnel Beaker, Corded Ware, Únětice, oh my!


Since David hasn’t mentioned it, I’m going to post some notes on Dynamic changes in genomic and social structures in third millennium BCE central Europe. This is a big deal because there’s a huge data-set spanning the Neolithic (older than 3000 BC) to the Bronze Age in Bohemia, looking at Globular Amphora, Corded Ware, Bell Beaker, and Únětice. Since I’m not too familiar with European archaeology, the most surprising thing that jumped out at me is that there was structure and variability in the nature and origins of the Neolithic societies in the region. The Bohemian Funnel Beaker populations seem to have been migrants from the west, for example.

The two big takeaways:

  1. Confirms serial admixture that tends to be female-mediated from Neolithic (though some “pure” steppe women also migrated)
  2. The Corded Ware and successor cultures in the region seem to have an affinity for an unsampled population to the north of the Yamnaya zone, in the forest-steppe

The first part is highlighted by the fact that several individuals with ~0% steppe ancestry are buried early on as “Corded Ware.” These were clearly individuals who were culturally assimilated, but their ancestry was totally different. Some of these women in particular seem to have been non-local as well, though from Neolithic societies. This suggests, unsurprisingly, that the ethnogenesis of Indo-European cultures was synthetic and complex. The figure to the top/right illustrates the trend whereby the earliest Corded Ware population exhibited far greater genetic distances between individuals than is to be found in modern European pairwise comparisons. This is part of the broader trend that over the recent past there’s been a massive worldwide panmixia.

Second, the Corded Ware has always been an awkward fit with a simple Yamnaya+Neolithic admixture. The stylized model, which I’ve repeated for simplicity, is that the Yamnaya moved west and mixed with the locals. Kristian Kristiansen explicitly refers to the Corded Ware as basically Yamnaya when I pushed him on this, and who am I to disagree with him? I think the key distinction here is that archaeologically the Corded Ware seems so much like European adaptations of the Yamnaya cultural toolkit…but genetically there are subtle indications of difference. Basically, the authors argue, plausibly, that the Corded Ware is not derived from the Yamnaya as such (their Y chromosomes do not match anyway), but a Yamnaya-adjacent population in the forest-steppe. This region seems to have also contributed a second pulse of migration which resulted in increased northeastern affinity, and a higher fraction of R1a lineages.

When it comes to the Y chromosomes, the authors conclude that inter-group competition was intense, and resulted in serial replacements of paternal lineages. The reproductive fitness gain they estimate for the elite lineages is 15% per generation, which is a very large number in evolutionary genetics (2% selection coefficients are large in this field). The Bell Beaker group seems to have been reflux from the west, and it itself was replaced later on by the Únětice.
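To see why 15% per generation is so large, here is a minimal sketch of how fast a paternal lineage spreads under simple haploid selection, where a lineage with relative fitness 1+s moves from frequency p to p(1+s)/(1+sp) each generation. The deterministic model, the 1% starting frequency, the 50% target, and the function names are my assumptions for illustration, not the paper's method.

```python
def next_frequency(p, s):
    """One generation of deterministic haploid selection with advantage s."""
    return p * (1.0 + s) / (1.0 + s * p)

def generations_to_reach(start, target, s):
    """Count generations for a lineage to grow from start to at least target."""
    p = start
    generations = 0
    while p < target:
        p = next_frequency(p, s)
        generations += 1
    return generations

# Hypothetical scenario: a lineage starting at 1% rising to a majority.
fast = generations_to_reach(0.01, 0.5, s=0.15)
slow = generations_to_reach(0.01, 0.5, s=0.02)
print(f"s = 0.15: {fast} generations; s = 0.02: {slow} generations")
```

Under these assumptions the 15% advantage takes a lineage from rare to majority in a few dozen generations, roughly a millennium at 29 years per generation, while a 2% advantage needs several times as long.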

One of the less supported, though still useful, models for the Corded Ware is a genetic influx from Pitted Ware samples, the mostly “EHG” hunter-gatherer group from Sweden. I think this supports the proposition that a group of early Yamnaya penetrated the forest-steppe, and assimilated hunter-gatherers in the southern portions of the taiga. If my read of the archaeology is correct, the overwhelmingly dominant culture of these synthetic groups was Yamnaya-like.

Finally, I have to wonder about these peoples’ association with and relationship to the Fatyanovo culture of western Russia, right in the forest-steppe. These groups seem to have been proto-Indo-Iranian judging by their R1a1a-Z93. One of the individuals in these data was clearly Z282, which is so common among Slavs (and Europe).

Italian genetics in the Bronze Age

A new paper on Italian Bronze Age and Iron Age genomics, Ancient genomes reveal structural shifts after the arrival of Steppe-related ancestry in the Italian Peninsula. The abstract:

Across Europe, the genetics of the Chalcolithic/Bronze Age transition is increasingly characterized in terms of an influx of Steppe-related ancestry. The effect of this major shift on the genetic structure of populations in the Italian Peninsula remains underexplored. Here, genome-wide shotgun data for 22 individuals from commingled cave and single burials in Northeastern and Central Italy dated between 3200 and 1500 BCE provide the first genomic characterization of Bronze Age individuals (n = 8; 0.001-1.2× coverage) from the central Italian Peninsula, filling a gap in the literature between 1950 and 1500 BCE. Our study confirms a diversity of ancestry components during the Chalcolithic and the arrival of Steppe-related ancestry in the central Italian Peninsula as early as 1600 BCE, with this ancestry component increasing through time. We detect close patrilineal kinship in the burial patterns of Chalcolithic commingled cave burials and a shift away from this in the Bronze Age (2200-900 BCE) along with lowered runs of homozygosity, which may reflect larger changes in population structure. Finally, we find no evidence that the arrival of Steppe-related ancestry in Central Italy directly led to changes in frequency of 115 phenotypes present in the dataset, rather that the post-Roman Imperial period had a stronger influence, particularly on the frequency of variants associated with protection against Hansen’s disease (leprosy). Our study provides a closer look at local dynamics of demography and phenotypic shifts as they occurred as part of a broader phenomenon of widespread admixture during the Chalcolithic/Bronze Age transition.

The samples pick up steppe ancestry around 1600 BC, but that’s due to a lacuna in the transect. We know now that steppe ancestry arrived in Spain and Greece before 2000 BC. It seems to me unlikely that it would be notably tardy in Italy.

Another thing I want to mention is there is clearly something West Asian (CHG-related) that is moving westward ~2000 BC in a straight shot from Anatolia to the Balkans to southern Italy. This migration seems associated with Y chromosomal lineage J2. Trying to estimate how much exogenous post-Imperial eastern ancestry is present in Southern Italians is somewhat difficult for this reason. The differences between the far south and central and northern Italy may date to the Bronze Age because of this minority component of West Asian ancestry that extended itself across the Mediterranean.

Related: my Substack piece from March on the genetic/cultural history of Italy.

Y chromosomes around the Baltic

A new paper on rare Y chromosomal lineages around the Baltic, Phylogenetic history of patrilineages rare in northern and eastern Europe from large-scale re-sequencing of human Y-chromosomes:

…a considerable number of men in every population carry rare paternal lineages with estimated frequencies around 5%…Here we harness the power of massive re-sequencing of human Y chromosomes to identify previously unknown population-specific clusters among rare paternal lineages in NEE. We construct dated phylogenies for haplogroups E2-M215, J2-M172, G-M201 and Q-M242 on the basis of 421 (of them 282 novel) high-coverage chrY sequences collected from large-scale databases focusing on populations of NEE. Within these otherwise rare haplogroups we disclose lineages that began to radiate ~1–3 thousand years ago in Estonia and Sweden and reveal male phylogenetic patterns testifying of comparatively recent local demographic expansions. Conversely, haplogroup Q lineages bear evidence of ancient Siberian influence lingering in the modern paternal gene pool of northern Europe…

For context, over 90% of the Y chromosomal lineages in Northeastern Europe localize to just four haplogroups. R1a, R1b, I1, and N3 (TAT-C). R1a and R1b are associated with early Indo-Europeans. I1 is local to European hunter-gatherers, but probably got integrated early on into the Corded Ware lineages (it shows recent star phylogeny). N3 is associated with the male-mediated expansion of Siberians over the last 3,000 years and the expansion of Finnic languages in the region.

Taking a step back it’s rather shocking how high the frequency here is of these common lineages. Finland stands out: “the screened sample of 506 Finnish males we did not detect any rare NEE lineages as almost all Finnish samples belong to hgs common among neighbouring populations – a probable reflection of either differing migration history or of demographic bottleneck(s) that have affected the Finnish population.” This is partly due to the overwhelming dominance of N3 in Finland. But, it is also a function of the fact that Neolithic agriculture never took root in Finland. The “Neolithic” ancestry in Finland is due to Corded Ware migration, and that varied depending on the Corded Ware population (some of the early Corded Ware in Estonia seem to have been pure Yamnaya).

G-M201 seems to be a survival from European farmers. The low frequency of this lineage shows the great winnowing of older paternal lineages with the arrival of Corded Ware. Not totally clear about J2 in this paper, but that too might be a survival. The E2 lineage they adduce to hunter-gatherers associated with the Villabruna culture, because the coalescence with Middle Eastern lineages is too ancient (E has been found in Villabruna). Seems weak. But the result on Q is fascinating. I assumed it came with the Corded Ware or the Siberian migration. But it’s not really found in the Finns, and the Estonian lineages seem to be derived from the more common Swedish ones? The authors infer from this that it’s a hunter-gatherer (Scandinavian hunter-gatherer) survival, as this lineage has been found in Mesolithic European populations. I still think it might be due to Corded Ware, as Q is found in some of the Sintashta too. But it warrants further investigation.

Australasian ancestry in Pacific coast Americans?

About five years ago researchers discovered that there was some affinity between people in the Amazon and populations in Australasia. This was very strange but robust. After that, an ancient sample from Brazil also showed this affinity.

Now PNAS has a paper with a bigger data set that finds this ancestry more widely in South America:

Different models have been proposed to elucidate the origins of the founding populations of America, along with the number of migratory waves and routes used by these first explorers. Settlements, both along the Pacific coast and on land, have been evidenced in genetic and archeological studies. However, the number of migratory waves and the origin of immigrants are still controversial topics. Here, we show the Australasian genetic signal is present in the Pacific coast region, indicating a more widespread signal distribution within South America and implicating an ancient contact between Pacific and Amazonian dwellers. We demonstrate that the Australasian population contribution was introduced in South America through the Pacific coastal route before the formation of the Amazonian branch, likely in the ancient coastal Pacific/Amazonian population. In addition, we detected a significant amount of interpopulation and intrapopulation variation in this genetic signal in South America. This study elucidates the genetic relationships of different ancestral components in the initial settlement of South America and proposes that the migratory route used by migrants who carried Australasian ancestry led to the absence of this signal in the populations of Central and North America.

The intrapopulation variation makes me suspicious. If it has been tens of thousands of years I would have expected intra-population variation to disappear.
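My intuition here can be sketched with a back-of-the-envelope model. Assume a single admixture pulse followed by random mating: each child's ancestry fraction is the average of its two parents', so the variance among individuals roughly halves every generation. The 3% admixture fraction is hypothetical, the function name is mine, and this ignores the small residual variance that comes from finite recombination as well as any ongoing structure or later gene flow.

```python
def ancestry_variance(admixture_fraction, generations):
    """Variance in individual ancestry fraction after a single admixture pulse,
    assuming random mating so the variance halves each generation."""
    initial = admixture_fraction * (1.0 - admixture_fraction)
    return initial / (2.0 ** generations)

# Hypothetical 3% contribution; generation counts are illustrative.
for t in (0, 5, 10, 50, 400):
    print(f"generation {t}: variance ~ {ancestry_variance(0.03, t):.3e}")
```

Under this toy model the variance is negligible within a few dozen generations, so variation that persists after hundreds of generations needs another explanation, such as long-standing substructure, more recent gene flow, or noise in the statistic.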

The statistic looks correct. But we still don’t know what that means. The hypothesis presented about a coastal migration seems reasonable enough. But who knows?

The rise and fall of the Scythians

A new paper on Scythians in Science, Ancient genomic time transect from the Central Asian Steppe unravels the history of the Scythians:

The Scythians were a multitude of horse-warrior nomad cultures dwelling in the Eurasian steppe during the first millennium BCE. Because of the lack of first-hand written records, little is known about the origins and relations among the different cultures. To address these questions, we produced genome-wide data for 111 ancient individuals retrieved from 39 archaeological sites from the first millennia BCE and CE across the Central Asian Steppe. We uncovered major admixture events in the Late Bronze Age forming the genetic substratum for two main Iron Age gene-pools emerging around the Altai and the Urals respectively. Their demise was mirrored by new genetic turnovers, linked to the spread of the eastern nomad empires in the first centuries CE. Compared to the high genetic heterogeneity of the past, the homogenization of the present-day Kazakhs gene pool is notable, likely a result of 400 years of strict exogamous social rules.

This follows up on earlier work on Scythians and Sarmatians. The basic finding seems to be that the classical Scythians, an Iranian-speaking nomadic group, had an ethnogenesis in the eastern Kazakh steppe. And, their origins involve the amalgamation of earlier Bronze Age Eurasian pastoralists, probably out of the Indo-Iranian Andronovo horizon societies, with admixture with Bactria-Margiana populations to the south, and East Asian Bronze Age hunter-gatherers and pastoralists, to the east in Mongolia. The Sarmatians, also presumably Iranian-speaking, are somewhat different in that they had less East Asian ancestry, though they too had more Near Eastern ancestry than earlier Indo-Iranian steppe pastoralists.

The whole paper is worth reading. But I think the key thing to note is that Iron Age steppe pastoralists seem to have been much more interconnected with each other and with the world around them than their Bronze Age predecessors. Though there was some gene flow to the steppe from West Asia and elsewhere during the Bronze Age, it was a marginal phenomenon. By the Iron Age, it was ubiquitous. Additionally, there was now structure and connectedness across the steppe.

By the Iron Age the steppe had become an integrated social-political unit.

The French Bronze Age is what matters

A new preprint on ancient DNA, Ancient genomes from present-day France unveil 7,000 years of its demographic history. It goes from the late Pleistocene to the Iron Age, and has a lot of Neolithic samples, as well as Mesolithic and Bronze Age samples.

Major takeaways:

– The Magdalenian populations, as represented by Goyet2, seem to have contributed ancestry to groups in substantial numbers down to the Mesolithic period. Earlier work showed the persistence of this group mostly in Iberia, but these data suggest they were present in France, and perhaps even Central Europe.

– The Neolithic populations are what you’d expect. The transition is what you’d expect (little initial admixture, later increase in Mesolithic hunter-gatherer ancestry). That being said, they seem to not establish whether the Neolithic farmers in France were mostly Cardial [Southwest European] or LBK [Central European]. It seems they lean to the proposition that they were more Cardial. Certainly for the southern samples.

– The arrival of the Beaker Culture heralded major genetic change as in Britain, though perhaps not as much. R1b became common, and steppe ancestry as well was ubiquitous. But, there was lots of variation. One of their samples is only about 25% steppe (the balance Neolithic-farmer), while another is 100% steppe. Southwest France, which had many non-Indo-European speakers until relatively late, had more Mesolithic hunter-gatherer and less steppe.

– Unlike in Iberia, there was significant mtDNA turnover. What this means is that the Indo-European expansion into Iberia was very male-mediated, but in France it wasn’t. Though the Neolithic impact seems higher than in Britain, on the whole there seem broad similarities here.

The shift from the Bronze Age to the Iron Age didn’t result in a change in the average ancestry, but the variance seems to have decreased. The reason for this is that prehistoric France seems to have been undergoing genetic mixing across regions.

– Finally, strong very recent selection on lactase persistence and pigmentation.