
The figure to the left is from The genetic prehistory of the Greater Caucasus. If you are a regular reader of this weblog, or Eurogenes, you can figure out what’s going on, and keep track of the terminology. But in 2018 I think we’re getting to the end of the line in making sense of “admixture graphs” in relation to West Eurasian population structure. The models are just getting too complicated to keep everything straight, and the idea of distinct populations subject to pulse admixture is an assumption that may not hold.
To get a sense of what I’m talking about, the above preprint focuses on populations in and around the Caucasus region. One of the major reasons that this is important is that the Caucasus was, and to some extent still is, a continental hinge connecting Eastern Europe and the Pontic steppe to the Near East. The Arab Muslims pushed north of the Caucasus and came into conflict with the Khazars, while Cimmerians and Scythians moved south from the Pontic steppe.
The elephant in the room is the relevance to the “Indo-European controversy.” Colin Renfrew long ago posited that the Indo-European languages derive from West Asian farmers who expanded into Europe as early as ~9,000 years ago. A rival theory is that Indo-Europeans spread out of the Pontic steppe ~4,000 to 5,000 years ago. In 2015 two major papers suggested that the steppe was a major source of Indo-European expansion. Case closed? This preprint suggests perhaps not.
But we’ll get to that later. What do the results here show? The prose is a little hard to tease apart, but the major finding seems to be that in antiquity, or at least the period they’re focusing on, much of the gene flow ran from the south (Near East) to the north (through the Caucasus, and out to the north slope). To some extent, we already knew this: the Yamna people of the Pontic steppe have “southern” ancestry from the Near East that earlier East European/Pontic people do not. In this preprint, the authors show that groups such as the Maykop of the north slope of the Caucasus carry Y haplogroups such as G2, and not the R1 lineages commonly found in the steppe. David W. suggests that this confirms that Near Eastern gene flow into the steppe was female-mediated. This is plausible, but I would caution that Y chromosomes alone can be deceptive, due to the power of particular patrilineages. We’ll probably have to rely on the X chromosome to make a final judgment.
The plot below shows many of the relationships as a function of location and time. The green component is modal among “Iranian farmers,” the orange among “Anatolian farmers,” and the blue among “Western hunter-gatherers.”
A major aspect of this preprint is that it has to work hard to differentiate two Anatolian farmer-like signals: the first from Anatolian farmers proper, and the second from the descendants of European farmers, who themselves are a mix of Anatolian farmers with minority ancestry from European hunter-gatherers. The answers would probably be totally unintelligible if not for archaeology. It’s clear that the steppe people had contact with both European and Near Eastern farmers, and that later East European groups that succeeded the Yamna were subject to reflux from Central Europe and received European farmer ancestry.
Another curious nugget in their results is the early detection of both Ancestral North Eurasian (ANE) ancestry and some East Eurasian gene flow (related to Han Chinese). One of their individuals carries the East Eurasian variant of EDAR, which among Europeans today is found only in Finns, though it was found at reasonable frequencies among the Motala hunter-gatherers of Scandinavia. Additionally, Fu et al. 2016 found that the ancestors of Mesolithic hunter-gatherers received some gene flow from Eastern Eurasians as well (also in the supplements of Lazaridis et al. 2016).
The authors admit that there is probably population structure among ANE and undiscovered groups of East Eurasians who were traversing the Inner Asian landscape. I think this is all suggestive of some long-distance contacts, though the intensity and magnitude increased a lot with high-density societies and the mobility of pastoralism.
Much of the genetic mixing in the Near East, and to some extent in the trans-Caucasian region, seems to date to the 4th millennium BC. This is technically prehistory, but it is also the Uruk period. This was a phase of Mesopotamian cultural expansion between 4000 and 3100 BC which resulted in replicas of Uruk-style settlements as far away as Syria and southeastern Anatolia. There is even evidence of Uruk-related migration to the North Caucasus.
The Uruk expansion ended in abrupt collapse. Uruk settlements outside the core zone of Mesopotamia disappear.
It’s the final paragraph that warrants discussion:
The insight that the Caucasus mountains served not only as a corridor for the spread of CHG/Neolithic Iranian ancestry but also for later gene-flow from the south also has a bearing on the postulated homelands of Proto-Indo-European (PIE) languages and documented gene-flows that could have carried a consecutive spread of both across West Eurasia…Perceiving the Caucasus as an occasional bridge rather than a strict border during the Eneolithic and Bronze Age opens up the possibility of a homeland of PIE south of the Caucasus, which itself provides a parsimonious explanation for an early branching off of Anatolian languages. Geographically this would also work for Armenian and Greek, for which genetic data also supports an eastern influence from Anatolia or the southern Caucasus. A potential offshoot of the Indo-Iranian branch to the east is possible, but the latest ancient DNA results from South Asia also lend weight to an LMBA spread via the steppe belt…The spread of some or all of the proto-Indo-European branches would have been possible via the North Caucasus and Pontic region and from there, along with pastoralist expansions, to the heart of Europe. This scenario finds support from the well attested and now widely documented ‘steppe ancestry’ in European populations, the postulate of increasingly patrilinear societies in the wake of these expansions (exemplified by R1a/R1b), as attested in the latest study on the Bell Beaker phenomenon….
But instead of tackling this let’s focus on the paper that came out of the Willerslev group, The first horse herders and the impact of early Bronze Age steppe expansions into Asia. This is a final manuscript in Science. That means it was probably written before The Genomic Formation of South and Central Asia. When it comes to South Asia, the results from the two publications are consonant. There is no conflict.*
More interesting are the results in West Asia, and the linguistic supplement. In the latter, the authors note first that tablets now indicate an Indo-Aryan presence in Syria ~1750 BC. Second, Assyrian merchants record Indo-European Hittite, or Nesili (the language of the people of Nesa), as early as ~2500 BC.
As suggested in earlier work, Hittite-era remains show no sign of steppe influence. David W. says:
The apparent lack of steppe ancestry in five Hittite-era, perhaps Indo-European-speaking, Anatolians was interpreted in Damgaard et al. 2018 as a major discovery with profound implications for the origin of the Anatolian branch of Indo-European languages.
But I disagree with this assessment, simply because none of these Hittite-era individuals are from royal Hittite, or Nes, burials. Hence, there’s a very good chance that they were Hattians, who were not of Indo-European origin, even if they spoke the Indo-European Hittite language because it was imposed on them.
The main point I’d raise against this is that in other areas steppe ancestry has spread deeply and widely into the population, including non-Indo-European ones. It is certainly possible that the sampling is not targeted enough to pick up the genuinely Hittite elite, but I lean toward the likelihood that the steppe signal won’t be found. It seems that the Anatolian languages were already diversified by ~2000 BC, and perhaps earlier. Linguists have long suggested that they are the outgroup to other Indo-European languages, though this could just be a function of their isolation among highly settled and socially complex populations.
Two alternative models present themselves for these results. First, the Anatolian Indo-European languages expanded through elite diffusion, part of the same general migrations that emerged out of the Yamna culture ~3000 BC. The lack of a steppe signal may be due to sampling bias, as David W. suggested, or, more likely in my opinion, simple dilution of the signal. Second, the steppe migrations were one part of a broader palette of population movements and cultural diffusions, and the Anatolian Indo-Europeans are basal to the efflorescence of the steppe-derived branches.
The evidence of the explosion of Indo-Aryans in the years after 2000 BC in West and South Asia, as well as the expansion of Iranians across vast swaths of Inner Asia during the same period, suggests to me that Indo-Iranians are most definitely part of the steppe pulse. The connection to the Sintashta charioteers presents itself, and connections to the Uralic languages indicate incubation in the trans-Volga region.
In West Asia, the Indo-Aryans crashed against the most advanced civilizations of their time. Like the Bulgars, and unlike the Hittites, the Indo-Aryan Mitanni were totally absorbed by their non-Indo-European Hurrian substrate. Indo-Aryan linguistic influence was preserved in their names, their gods, and in particular words relating to chariots. And yet in 2017’s Continuity and Admixture in the Last Five Millennia of Levantine History from Ancient Canaanite and Present-Day Lebanese Genome Sequences, the authors observe:
We next tested a model of the present-day Lebanese as a mixture of Sidon_BA and any other ancient Eurasian population using qpAdm. We found that the Lebanese can be best modeled as Sidon_BA 93% ± 1.6% and a Steppe Bronze Age population 7% ± 1.6% (Figure 3C; Table S6). To estimate the time when the Steppe ancestry penetrated the Levant, we used, as above, LD-based inference and set the Lebanese as admixed test population with Natufians, Levant_N, Sidon_BA, Steppe_EMBA, and Steppe_MLBA as reference populations. We found support (p = 0.00017) for a mixture between Sidon_BA and Steppe_EMBA which has occurred around 2,950 ± 790 ya (Figure S13B).
This needs to be explored further. The admixture could have come from many sources. I am curious about the frequency of R1a1a-Z93 among modern-day Syrians and Lebanese.
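To put rough numbers on the quoted estimates, here is a back-of-the-envelope sketch in Python. It is my own arithmetic, not anything from the paper: I am assuming the "±" values are single standard errors, that "ya" is counted from the 2017 publication year, and I ignore the missing year zero. The function and variable names are made up for illustration.

```python
# Figures copied from the quoted passage; everything else here is an assumption.
STEPPE_FRACTION = 0.07      # qpAdm estimate of Steppe Bronze Age ancestry
STEPPE_SE = 0.016           # quoted "± 1.6%", treated as one standard error
ADMIX_YEARS_AGO = 2950      # LD-based admixture date
ADMIX_SE_YEARS = 790        # quoted "± 790", treated as one standard error
REFERENCE_YEAR = 2017       # assumption: "ya" counted from the publication year


def z_score(estimate, standard_error):
    """How many standard errors the estimate sits away from zero."""
    return estimate / standard_error


def to_calendar_year(years_ago, reference_year=REFERENCE_YEAR):
    """Convert years-ago to a calendar year; negative means BC (no year-zero fix)."""
    return reference_year - years_ago


def one_se_window(years_ago, se_years):
    """Return (oldest, youngest) calendar years for a +/- one-SE window."""
    oldest = to_calendar_year(years_ago + se_years)
    youngest = to_calendar_year(years_ago - se_years)
    return oldest, youngest


if __name__ == "__main__":
    z = z_score(STEPPE_FRACTION, STEPPE_SE)
    oldest, youngest = one_se_window(ADMIX_YEARS_AGO, ADMIX_SE_YEARS)
    print(f"steppe fraction is {z:.1f} standard errors above zero")
    print(f"point estimate: ~{-to_calendar_year(ADMIX_YEARS_AGO)} BC")
    print(f"one-SE window: ~{-oldest} BC to ~{-youngest} BC")
```

Under those assumptions the steppe fraction sits a bit more than four standard errors above zero, but the one-SE date window stretches from roughly the 18th century BC to the 2nd century BC, which is wide enough to accommodate quite a few candidate sources.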
For me these arguments can only be resolved with a deeper understanding of linguistic evolution. The close relationship of Indo-Aryan and Iranian languages is obvious to any speaker of either of these languages (I can speak some Bengali). A divergence in the range of 4 to 5 thousand years before the present seems most likely to me. But the relationship of the other Indo-European languages is much less clear.
One of the arguments in Peter Bellwood’s First Farmers is that the Indo-European languages exhibit a “rake-like” topology with the exception of Indo-Iranian, which forms a clear clade. To him and others in his camp, this argues for deep divergences very early in time.
It is hard to deny that the steppe migrations between 4 and 5 thousand years ago had something to do with the distribution of modern Indo-European languages. But it is harder to falsify the model that there were earlier Indo-European migrations, perhaps out of the Near East, that preceded these. Only a deeper understanding of linguistic evolution, and multidisciplinary analysis of regional substrates, will generate the clarity we need.
* I’m going to skip the Botai angle in this post.






