Migration at the roof of West Asia


The figure to the left is from The genetic prehistory of the Greater Caucasus. If you are a regular reader of this weblog, or Eurogenes, you can figure out what’s going on, and keep track of the terminology. But in 2018 I think we’re getting to the end of the line in making sense of “admixture graphs” in relation to West Eurasian population structure. The models are just getting too complicated to keep everything straight, and the distinct-populations-subject-to-pulse-admixture seems to be an assumption that may not necessarily hold.

To get a sense of what I’m talking about, the above preprint focuses on populations in and around the Caucasus region. One of the major reasons that this is important is that the Caucasus was and is to some extent a continental hinge, connecting Eastern Europe and the Pontic steppe, to the Near East. The Arab Muslims pushed north of the Caucasus, and came into conflict with the Khazars, while Cimmerians and Scythians moved south from the Pontic steppe.

The elephant in the room is the relevance to the “Indo-European controversy.” Colin Renfrew long ago posited that the Indo-European languages derive from West Asian farmers who expanded into Europe as early as ~9,000 years ago. A rival theory is that Indo-Europeans spread out of the Pontic steppe ~4,000 years ago. In 2015 two major papers suggested that the steppe was a major source of Indo-European expansion. Case closed? This preprint suggests perhaps not.

But we’ll get to that later. What do the results here show? The prose is a little hard to tease apart, but the major finding seems to be that in antiquity, or at least the period they’re focusing on, much of the gene flow ran from the south (Near East) to the north (through the Caucasus, and out to the north slope). To some extent, we already knew this: the Yamna people of the Pontic steppe have “southern” ancestry from the Near East that earlier East European/Pontic people do not. In this preprint, the authors show that groups such as the Maykop of the north slope of the Caucasus carry Y haplogroups such as G2, and not the R1 lineages commonly found in the steppe. David W. suggests that this confirms that Near Eastern gene flow into the steppe was female-mediated. This is plausible, but I would caution that Y chromosomes alone can be deceptive, due to the power of particular patrilineages. We’ll probably rely on the X chromosome to make a final judgment.

The plot below shows many of the relationships as a function of location and time. The green component is modal among “Iranian farmers,” the orange among “Anatolian farmers,” and the blue among “Western hunter-gatherers.”

A major aspect of this preprint is that it has to work hard to differentiate two Anatolian farmer-like signals: the first, from Anatolian farmers proper, and the second from the descendants of European farmers, who themselves are a mix of Anatolian farmers with a minority ancestry among the hunter-gatherers. The answers would probably be totally unintelligible if not for archaeology. It’s clear that the steppe people had contact with both European and Near Eastern farmers and that later East European groups that succeeded the Yamna were subject to reflux from Central Europe, and received European farmer ancestry.

Another curious nugget in their results is the early detection of both Ancestral North Eurasian (ANE) ancestry and some East Eurasian gene flow (related to Han Chinese). One of their individuals carries the East Eurasian variant of EDAR, which among Europeans today is found mainly in Finns, though it was found in reasonable frequencies among the Motala hunter-gatherers of Scandinavia. Additionally, Fu et al. 2016 found that the ancestors of Mesolithic hunter-gatherers received some gene flow from Eastern Eurasians as well (also in the supplements of Lazaridis et al. 2016).

The authors admit that there is probably population structure among ANE and undiscovered groups of East Eurasians who were traversing the Inner Asian landscape. I think this is all suggestive of some long-distance contacts, though the intensity and magnitude increased a lot with high-density societies and the mobility of pastoralism.

Much of the genetic mixing in the Near East, and to some extent in the trans-Caucasian region, seems to date to the 4th millennium BC. This is technically prehistory, but it is also the Uruk period. This was a phase of Mesopotamian cultural expansion between 4000 and 3100 BC which resulted in replicas of Uruk-style settlements as far away as Syria and southeastern Anatolia. There is even evidence of Uruk-related migration to the North Caucasus.

The Uruk expansion ended in abrupt collapse. Uruk settlements outside of the core zone of Mesopotamia disappeared.

It’s the final paragraph that warrants discussion:

The insight that the Caucasus mountains served not only as a corridor for the spread of CHG/Neolithic Iranian ancestry but also for later gene-flow from the south also has a bearing on the postulated homelands of Proto-Indo-European (PIE) languages and documented gene-flows that could have carried a consecutive spread of both across West Eurasia…Perceiving the Caucasus as an occasional bridge rather than a strict border during the Eneolithic and Bronze Age opens up the possibility of a homeland of PIE south of the Caucasus, which itself provides a parsimonious explanation for an early branching off of Anatolian languages. Geographically this would also work for Armenian and Greek, for which genetic data also supports an eastern influence from Anatolia or the southern Caucasus. A potential offshoot of the Indo-Iranian branch to the east is possible, but the latest ancient DNA results from South Asia also lend weight to an LMBA spread via the steppe belt…The spread of some or all of the proto-Indo-European branches would have been possible via the North Caucasus and Pontic region and from there, along with pastoralist expansions, to the heart of Europe. This scenario finds support from the well attested and now widely documented ‘steppe ancestry’ in European populations, the postulate of increasingly patrilinear societies in the wake of these expansions (exemplified by R1a/R1b), as attested in the latest study on the Bell Beaker phenomenon….

But instead of tackling this let’s focus on the paper that came out of the Willerslev group, The first horse herders and the impact of early Bronze Age steppe expansions into Asia. This is a final manuscript in Science. That means it was probably written before The Genomic Formation of South and Central Asia. When it comes to South Asia, the results from the two publications are consonant. There is no conflict.*

More interesting are the results in West Asia, and the linguistic supplement. In it, the authors first note that tablets now indicate an Indo-Aryan presence in Syria ~1750 BC. Second, Assyrian merchants record Indo-European Hittite, or Nesili (the language of the people of Nesa), as early as ~2500 BC.

As suggested in earlier work, Hittite remains show no sign of steppe influence. David W. says:

The apparent lack of steppe ancestry in five Hittite-era, perhaps Indo-European-speaking, Anatolians was interpreted in Damgaard et al. 2018 as a major discovery with profound implications for the origin of the Anatolian branch of Indo-European languages.

But I disagree with this assessment, simply because none of these Hittite-era individuals are from royal Hittite, or Nes, burials. Hence, there’s a very good chance that they were Hattians, who were not of Indo-European origin, even if they spoke the Indo-European Hittite language because it was imposed on them.

The main point I’d bring up here is that in other areas steppe ancestry has spread deeply and widely into the population, including non-Indo-European ones. It is certainly possible that the sampling is not dense enough to pick up the genuinely Hittite elite, but I lean toward the likelihood that the steppe signal won’t be found. It seems that the Anatolian languages were already diversified by ~2000 BC, and perhaps earlier. Linguists have long suggested that they are the outgroup to other Indo-European languages, though this could just be a function of their isolation among highly settled and socially complex populations.

Two alternative models present themselves for these results. First, the Anatolian Indo-European languages expanded through elite diffusion, part of the same general migrations that emerged out of the Yamna culture ~3000 BC. The lack of a steppe signal may be due to sampling bias, as David W. suggested, or, more likely in my opinion, simple dilution of the signal. Second, the steppe migrations were one part of a broader palette of population movements and cultural diffusions, and the Anatolian Indo-Europeans are basal to the efflorescence of the steppe-derived branches.

The evidence of the explosion of Indo-Aryans in the years after 2000 BC in West and South Asia, as well as the expansion of Iranians across vast swaths of Inner Asia during the same period, suggests to me that Indo-Iranians are most definitely part of the steppe pulse. The connection to the Sintashta charioteers presents itself, and connections to the Uralic languages indicate incubation in the trans-Volga region.

In West Asia, the Indo-Aryans crashed themselves against the most advanced civilizations of their time. Like the Bulgars, and unlike the Hittites, Indo-Aryan Mitanni was totally absorbed by their non-Indo-European Hurrian substrate. Indo-Aryan linguistic influence was preserved in their names, their gods, and in particular words relating to chariots. And yet in 2017’s Continuity and Admixture in the Last Five Millennia of Levantine History from Ancient Canaanite and Present-Day Lebanese Genome Sequences, the authors observe:

We next tested a model of the present-day Lebanese as a mixture of Sidon_BA and any other ancient Eurasian population using qpAdm. We found that the Lebanese can be best modeled as Sidon_BA 93% ± 1.6% and a Steppe Bronze Age population 7% ± 1.6% (Figure 3C; Table S6). To estimate the time when the Steppe ancestry penetrated the Levant, we used, as above, LD-based inference and set the Lebanese as admixed test population with Natufians, Levant_N, Sidon_BA, Steppe_EMBA, and Steppe_MLBA as reference populations. We found support (p = 0.00017) for a mixture between Sidon_BA and Steppe_EMBA which has occurred around 2,950 ± 790 ya (Figure S13B).

This needs to be explored further. The admixture could have come from many sources. I am curious about the frequency of R1a1a-Z93 among modern-day Syrians and Lebanese.

For me these arguments can only be resolved with a deeper understanding of linguistic evolution. The close relationship of Indo-Aryan and Iranian languages is obvious to any speaker of either of these languages (I can speak some Bengali). A divergence in the range of 4 to 5 thousand years before the present seems most likely to me. But the relationship of the other Indo-European languages is much less clear.

One of the arguments in Peter Bellwood’s First Farmers is that the Indo-European languages exhibit a “rake-like” topology with the exception of Indo-Iranian, which forms a clear clade. To him and others in his camp, this argues for deep divergences very early in time.

It is hard to deny that the steppe migrations between 4 and 5 thousand years ago had something to do with the distribution of modern Indo-European languages. But, it is harder to falsify the model that there were earlier Indo-European migrations, perhaps out of the Near East, that preceded these. Only a deeper understanding of linguistic evolution, and multidisciplinary analysis of regional substrates will generate the clarity we need.

* I’m going to skip the Botai angle in this post.


18 thoughts on “Migration at the roof of West Asia”

  1. The assertion in the linguistic supplement, shifting the date of the Anatolian languages back more than 500 years from any other attestation of Anatolian languages, is based entirely on twenty proper names of individuals in a particular kingdom within Anatolia in Syrian clay tablets, mixed in with lots of other names, some of which are even Semitic.

    The differences between Hattic proper names and Hittite proper names are quite subtle, at best; both have features (sometimes called “Banana names” for the repeated syllables) that are distinct from Indo-European and Semitic languages elsewhere.

    If these proper name conventions in the Anatolian languages are due to substrate influence, then the chronology falls apart.

    Similarly, the linguistic supplement cites Mallory 1989 on a hypothesis regarding the Tocharians that Mallory has since disavowed.

    All in all, the linguistic supplement does not make a credible or solid case for the positions that it is backing. It repeatedly overstates the evidence and treats as mainstream hypotheses that do not have wide linguistic support.

  2. They need to sample early elite graves from all Anatolian language speakers and Mycenaeans. That way what’s in common and a possible Balkan route could be (dis)proven.
    What they have now is at best a starter.

  3. Also, those Anatolian guys in Armi 2500 BC are non-elite/being ruled by Semites. There was no IE elite language imposition. IE language likely moved from the East Anatolia-South Caucasus area to South/Central Anatolia (ancestors of Hittites) and north to the Pontic-Caspian steppe (ancestors of the rest).

  4. “The prose is a little hard to tease apart”

    You don’t say. This is the most amateur-unfriendly ancient DNA paper I have read, which is a problem due to the intricacy of the results they are reporting.

    “A major aspect of this preprint is that it has to work hard to differentiate two Anatolian farmer-like signals”

    David Reich mentioned in his book that using ancient DNA to work out the interaction of closely-related lineages is still in its infancy, because the state of the art is best at picking out highly-differentiated tracer dyes. This paper makes the limits of current approaches clear, I think.

    More important than the Anatolian ancestry, to me, is picking out the various streams of CHG/Iranian ancestry! The authors discuss introgression of CHG/Iranian from Steppe Maykop into the steppe, but then say at the end of the paper that EHG and CHG were already heavily mixed in the steppe at the beginning of the Eneolithic! How did that mixture come about? They lack the data, which is unfortunate…

    “It is hard to deny that the steppe migrations between 4 and 5 thousand years ago had something to do with the distribution of modern Indo-European languages. But, it is harder to falsify the model that there were earlier Indo-European migrations, perhaps out of the Near East, that preceded these.”

    I find this conundrum hilarious. If it is proven that IE languages originated south of the Caucasus but spread to the steppe, everyone gets to go home happy! Gamkrelidze, Anthony, steppe hypothesis linguists, Renfrew and the Anatolianists, Dienekes, and the Indians.

    It is also funny to watch full steppe hypothesis supporters making some of the same criticisms of this publication that OIT supporters made of the Indian paper. (I still sympathize with the full steppe hypothesis but am becoming agnostic on the issue.)

  5. Re: X chromosome and sex biased admixture, one thing I don’t currently understand is why there’s not much more use of modern dna on this. Trying to test with ancients using the X has been done, with significant problems with methods (see Amy Goldberg’s disputed paper), due to the effects of low coverage and overlap even with best ancient samples.

    Yet if admixture was sex biased in ancestors of present day Europeans, it seems you should be able to obviously see it in comparisons purely of moderns. If Steppe admixture into Europe was sex biased to males, Fst/differentiation should be lower for Europeans on the X than on the autosomes, relative to normalization against world populations and outgroups, in patterns that correlate strongly with autosomal models of estimated ancestry proportions. No real problem doing this as far as I would expect, unless I’m very wrong.

  6. The discussion about sex bias here isn’t referring to I.E. speakers coming into Europe, it’s about the composition of the EHG-CHG mixture in the Eneolithic steppe. I think this question can only be solved with ancient DNA.

  7. Yudi, yes, I know, I’m doing what is called using an analogous example, rather than misunderstanding Razib’s post. You may still be able to work out something about sex-bias in the Eneolithic steppe via X vs autosome comparisons using only modern Europeans and Caucasian populations though (e.g. in this case for instance, that X sharing with the Caucasus should peak for populations with high steppe ancestry, rather than high EEF, and other patterns).

  8. @Matt: Sex biased admixture might be more normal than you might think. It’s not restricted to the PIE case for sure.

  9. Well said Razib, this is where things have been headed for a while. The steppe spread IE to N. Europe, but when you get to Mycenaeans and Hittites there’s a different picture emerging, and in the supplement they finally talk about Greek and Armenian maybe coming from the Caucasus. More samples are obviously needed to clarify, and I think II languages aren’t clear yet. J2 also seems to be involved in Greece and Anatolia, and R1A to Europe, IE taking 2 routes to Europe. Reich also talked about PIE south of the Caucasus.

  10. Thanks for explaining in simple language which even non-academician general audience can understand. contrary to the other popular blog, you don’t get high that often. i don’t understand whats with the steppe fascination! anyway, i think, picture will be more complicated and different from what it is today as more and more “evidence” comes. i think computer software is a creation of humans not the creator himself. Hope to see more articles accessible to general public.

  11. The Maykop paper almost opens up as many questions as it answers. Haha. The Catacomb results aren’t surprising at all though, R1b associated with earlier steppe cultures, not as much EEF-WHG admixture as later steppe etc. We clearly need much more early data.

    I don’t think there’s any real argument about Greek coming from the East as things stand now. Unlike the less steppe influenced Mycenaeans who can plausibly be derived from Armenia too, the one low-quality genome that they couldn’t analyze too much and similarly doesn’t make it into amateur analyses is much more plausibly admixed from a Balkan than an Anatolian-Armenian source. Too much Euro-HG, too little extra CHG and we know that the late Greek Neolithic was overall very Basal and not Euro-HG-heavy at all so admixture from a northern source is needed. Pre-Greek from Anatolia, Greek from the Balkans (ultimately from the steppe) was the more commonly made argument in archaeology and linguistics too.

    The only attested branch that’s not as clearly from the steppe is Anatolian but even there we have plenty of good archaeological arguments about its connection to the earlier (i.e. pre-Yamnaya) steppe via a succession of cultures in the eastern Balkans.

    So while in general agreement, I’m not sure I agree with Andrew’s specific point as I generally understand it, i.e. that Anatolian is not so much an early splitter as simply a heavily influenced branch. Even if the presented onomastic material, which is admittedly often not very transparent, isn’t Anatolian and even if Anatolian itself is a late arrival *to* Anatolia, it certainly seems to be the earliest split according to most (not all) linguists. It doesn’t seem part of the same general phenomenon as the later attested branches, which is an interesting possibility in its own way.

    I still have some questions about Corded Ware and its exact connections to the steppe and forest steppe area too…

  12. Mycenaean is very close to reconstructed PIE and if PIE is south of the Caucasus, I know that’s not clear yet but hints are being dropped, why not directly from there? For now there’s very little steppe and R1A, and that possibility is mentioned in the supplement. Now we have Mycenaean data and whatever the consensus was doesn’t mean much.

  13. We finally get some data from early and attested IE speakers and magically a small amount of steppe means Greek came from there. Hittites have some EHG apparently, which had been entering the Caucasus since the Mesolithic, not from the steppe at all. At any rate, negligible steppe and lots of J2.

  14. “We found support (p = 0.00017) for a mixture between Sidon_BA and Steppe_EMBA which has occurred around 2,950 ± 790 ya (Figure S13B).”

    That leaves just enough wiggle room to be caused by the expansions of the Hittite New Kingdom.

  15. @Everyone,

    Let’s be honest, the Maykop results aren’t very friendly to the idea that PIE expanded from the Near East onto the steppe, unless we now start talking about a purely linguistic transmission.

    That’s because there’s no clear signal of Maykop ancestry in Yamnaya overall, nor in Corded Ware, and not even much of one in the so called Yamnaya_Caucasus.

    Yeah, there’s Near Eastern related ancestry on the steppe, but most of that now looks very old and native to the North Caucasus region, thus very difficult to link to any specific expansion from south of the Caucasus that might have brought early Indo-European languages there.

    So that leaves the Hittites and other Anatolian-speakers. If there’s dense enough sampling of Bronze Age Anatolia, and no steppe ancestry found, even in specifically elite Hittite remains, then that will be interesting, and perhaps a sign that ancient DNA isn’t always able to solve linguistics issues.

    But guesses and predictions are for fun. Let’s see what the data show when it’s released. Don’t bet on it that all angles will be covered in every paper. For instance, during the last two years a whole series of high brow papers were arguing that Yamnaya had no Euro farmer ancestry, when it was rather clear from the data that it did.


    Just to reiterate, Steppe Maykop didn’t have an impact on Yamnaya overall, nor on Corded Ware or those steppe-like Bell Beakers, because it had Botai-like ancestry, which is practically missing in all of those groups. So it’s very difficult to link it to anything Indo-European, unless we now say that Corded Ware wasn’t the first Indo-European culture in Northern Europe, and so on.

    At the same time, there’s plenty of Steppe_MLBA admixture in South Asia, and especially among the upper castes there, so there’s no issue.

    @fred mertz

    Mycenaeans do have a fair whack of steppe ancestry that looks very Steppe_MLBA or Corded Ware-like. Around 20% or so.

    Sure, you can whittle that away to something like 5% by focusing just on EHG, but that’s not exactly a proximate and useful model, because EHG was long gone by that stage. On the other hand, Steppe_MLBA-like people were moving into the Balkans precisely during the early Mycenaean period, and we have direct evidence of this in ancient DNA. So why ignore it?

  16. Well I’m not ignoring it and they do have some steppe ancestry apparently, and some folks are trying to get that number up as much as possible. Based on that paper and the Hittites you should at least consider a Caucasus route for Greek, Armenian, and Anatolian. Reich and others are. We all want more samples and can speculate. If the main goal starting your blog, which I visit a lot and is the best out there, was to show that z93 is from Europe and not India then there’s no argument against that. Also a steppe homeland for many IE is not a small thing.

  17. “i don’t understand whats with the steppe fascination!”

    Gay fantasies of blond Aryan warriors bringing civilization to Europe and India. It’s like 19th century Nordicism (even though we now know they didn’t look like that).

    But coming from a Slav it’s more like Afrocentrism: someone with very little significant history enviously trying to steal other people’s history to boost his self-esteem.

    It’s all political, like the “Out of India” stuff, just in reverse. Their whole identity is based on PIE coming from the steppe, and when it doesn’t work out they get pissed.

    I posted relevant quotes from this blog post in Davidski’s comments and he deleted them and banned me. If he could ban Razib from the internet I’m sure he would. LOL

  18. “…connections to the Uralic languages indicates incubation in the trans-Volga region.”

    The curious thing is, there are no traces of Uralic superstrates/adstrates/loanwords in any Indo-Aryan languages, old or modern. Since Old Indo-Aryan has been “preserved like a tape-recording” (Michael Witzel’s words) in the rituals of the Brahmins, if there is any place where such Uralic loanwords should be preserved, it is the Rig Veda. So this absence seems to be extremely curious.

    So, we have only these 3 possibilities:

    [A] Indo-Aryan speakers were spread out across a vast region in the Steppes, with the northern IA speakers in the Volga interacting with the Uralic speakers while the southern branch entered the Indian sub-continent, and there was no further contact between the northern and southern IA speakers.

    [B] A group of IA speakers migrated back from the sub-continent or southern Central Asia to mix with the Uralic speakers.

    [C] If we assume the loanwords in Uralic are not Indo-Aryan but are Indo-Iranian, then the situation of (A) is reenacted with IIr speakers in place of IA speakers.

