
New David Reich talk

Eurogenes points me to a new talk by David Reich, which has a nice long new abstract online. I’ll just insert my comments within the blockquote…

We present an integrative genetic history of the Southern Arc, an area divided geographically between West Asia and Europe, but which we define as spanning the culturally entangled regions of Anatolia and its neighbors, in both Europe (Aegean and the Balkans), and in West Asia (Cyprus, Armenia, the Levant, Iraq and Iran). We employ a new analytical framework to analyze genome-wide data at the individual level from a total of 1,320 ancient individuals, 731 of which are newly reported and address major gaps in the archaeogenetic record. We report the first ancient DNA from the world’s earliest farming cultures of southeastern Anatolia and northern Mesopotamia, as well as the first Neolithic period data from Cyprus and Armenia, and discover that it was admixture of Natufian-related ancestry from the Levant—mediated by Mesopotamian and Levantine farmers, and marked by at least two expansions associated with dispersal of pre-pottery and pottery cultures—that generated a pan-West Asian Neolithic continuum [Does “it was” refer to Cyprus and Armenia? How are Mesopotamian farmers related to the Zagros-Levant-Anatolian trichotomy?]. Our comprehensive sampling shows that Anatolia received hardly any genetic input from Europe or the Eurasian steppe from the Chalcolithic to the Iron Age; this contrasts with Southeastern Europe and Armenia that were impacted by major gene flow from Yamnaya steppe pastoralists [I believe Southeastern Europe had both patchy early Yamnaya and later Indo-Europeans? Armenia on the other hand seems unique].

In the Balkans, we reveal a patchwork of Bronze Age populations with diverse proportions of steppe ancestry in the aftermath of the ~3000 BCE Yamnaya migrations, paralleling the linguistic diversity of Paleo-Balkan speakers. We provide insights into the Mycenaean period of the Aegean by documenting variation in the proportion of steppe ancestry (including some individuals who lack it altogether), and finding no evidence for systematic differences in steppe ancestry among social strata, such as those of the elite buried at the Palace of Nestor in Pylos [Mycenaean Greece starts at 1750 BC, so probably at least 500 years from the major penetration of Indo-Europeans, so that’s 20 generations or so. That seems enough time for status-gene correlations to break down if there’s no endogamous caste-like structure].

A striking signal of steppe migration into the Southern Arc is evident in Armenia and northwest Iran where admixture with Yamnaya patrilineal descendants occurred, coinciding with their 3rd millennium BCE displacement from the steppe itself. This ancestry, pervasive across numerous sites of Armenia of ~2000-600 BCE, was diluted during the ensuing centuries to only a third of its peak value [Looking online, there’s a 2012 paper that indicates that modern Armenians carry the specifically Yamnaya R1b lineage. If this is true, it might explain why Armenian is so hard to place within an Indo-European tree, as Celtic, Germanic, Balto-Slavic and Indo-Iranian seem to come out of a broader Corded Ware cultural complex], making no further western inroads from there into any part of Anatolia, including the geographically adjacent Lake Van center of the Iron Age Kingdom of Urartu. The impermeability of Anatolia to exogenous migration contrasts with our finding that the Yamnaya had two distinct gene flows [David of Eurogenes does not like this, but this could mean Anatolian and CHG/Iranian pulses?], both from West Asia, suggesting that the Indo-Anatolian language family originated in the eastern wing of the Southern Arc and that the steppe served only as a secondary staging area of Indo-European language dispersal. The demographic significance of Anatolia on a Mediterranean-wide scale is further documented by our finding that following the Roman conquest, the Anatolian population remained stable and became the geographic source for much of the ancestry of Imperial Rome itself.

35 thoughts on “New David Reich talk”

  1. Put my main comment on the open thread a few days ago (about 8 hours after the link got uploaded 😉 !) so won’t repost.

    Some additional comments though based on further thoughts and discussion at Eurogenes:

    1) As well as the Hajji Firuz BA female (from NW Iran around the late 3rd millennium) who has a large amount of steppe ancestry, we also have the evidence of two Iron Age males from NW Iran, from Hajji Firuz and Hasanlu (both from around 1100-800 BCE). These guys both have the R1b-Z2103, which is from Yamnaya/Steppe_EMBA.

    This certainly strongly suggests to me this was patrilineally Yamnaya related (or their later descendants, such as Catacomb) and we know that we have south moving Yamnaya related people who are admixing with people in the Caucasus, from the samples we have from the admixed Late Kubano-Tersk culture around 2000 BCE.

    This finding was already suggested by present day clines of R1b/R1a in Iran and Turkey, where there is a West->East cline of R1b-Z2103 to Central Asian R1a.

    So this does seem to firmly link the Steppe related ancestry to Yamnaya and descendant cultures, and not a later Corded Ware derived culture from Steppe_MLBA.

    2) This being the case, it’s probably now even more unsustainable to take the position that “Yamnaya was maybe Indo-Anatolian, but only Corded Ware was late Proto-Indo-European”. Unless we are to say that this movement is not responsible for the Armenian language (since Armenian is lPIE).

    Now rejecting this connection between this movement and Armenian would be in line with the hypothesis that Armenian came via Northern Anatolia, from the Balkans, but that seems not parsimonious at all given that we’re then asking for a coincidence where the Yamnaya made a big impact in this region, but then were linguistically replaced by something from the Balkans, without this leaving an enduring or obvious genetic trace in the Anatolian sample. And then this phenomenon is subsequently entirely linguistically localized in impact within the same place where the steppe ancestry actually made an impact.

    3) Another thing here is that the Caucasus route for Armenian has been claimed to be infeasible in the past, due to the strong linguistic diversity of the Caucasus.

    Well, it seems like it happened anyway! It also seems that people in the Northern Caucasus, in contrast to Yamnaya, have received Steppe related geneflow without any major disruption to their languages or patrilineages (e.g. steppe ancestry into Northern Caucasus groups may have been mostly from women).

    This might tell us something about how mountain communities can maintain high patrilocality and linguistic continuity over time, even as significant migrations move through them.

    It also weakens an argument against proto-IE migration up through the Caucasus, that this would have levelled out linguistic diversity there. This doesn’t necessarily strengthen that argument much, but it is something.

    4) Regarding the links of Greek and Armenian (which seem the most common family grouping), although I doubt the Greek migration strongly involved a disproportionate success of patrilines compared to some other IE movements (i.e. it’s doubtful on the basis of the male Mycenaean we have and the autosomally identical much later Greek colonists in Iberia), it’ll be interesting to see if Mycenae turns up R1b-Z2103. That will definitely tend to favour the argument that some have made that Greek has something to do with Catacomb Culture, and has nothing much to do with the Corded Ware or Corded Ware derived cultures. This is as per one of Anthony/Mallory’s suggestions, that Graeco-Armenian was from Catacomb, with some influence here into Indo-Iranian too. Although these waves may be difficult to separate entirely.

    5) Re: Mycenaean lack of social structure correlation with steppe ancestry, yes this could fall by the wayside very quickly, even in a few generations I’d guess (approximately 100 years even at a push?), if there’s nothing going on like European colonial racism or Hindu caste structures. Especially if you have the opposite, where higher status people are more likely to migrate and admix.
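
    The back-of-envelope reasoning here can be sketched in a few lines of Python. This is an illustration only, under assumptions that are not in the abstract: a single admixture pulse, random mating afterwards, and a 25-year generation (the figure implied by "500 years ... 20 generations" in the post). Under those assumptions the between-individual variance in steppe ancestry roughly halves each generation, since a child's ancestry is about the average of two independently drawn parents. The function names are made up for this sketch.

```python
def generations_elapsed(years, years_per_generation=25):
    """Number of generations in a span of years (25 y/gen is an assumption)."""
    return years / years_per_generation


def remaining_variance_fraction(generations):
    """Fraction of the initial between-individual ancestry variance left
    after a single pulse followed by random mating (idealized model:
    the variance halves every generation)."""
    return 0.5 ** generations


if __name__ == "__main__":
    for years in (100, 500):
        g = generations_elapsed(years)
        print(years, "years:", g, "generations, variance fraction",
              remaining_variance_fraction(g))
```

    With endogamy or a caste-like structure the halving does not apply, which is the caveat raised both in the post and in this comment.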

    But it also may have been lacking to begin with; it’s possible that the proto-Greeks expanded into Greece not as a conquering army (as per Robert Drews) but as a somewhat less dominant people settling in hinterlands, but then over time becoming integrated with a more linguistically diverse local landscape, and then becoming the dominant source of a shared language and rituals that enabled broader networking.

    There’s that finding here that: “Our results indicate multi-phased genetic shifts in the Aegean populations since the early Neolithic that can be traced to populations related to Anatolia and then, during the Late Bronze Age, to Central-Eastern Europe. Besides the long-lasting biological connections with these adjacent regions, we also found that Bronze Age Aegeans exhibited endogamy in high frequencies so far unobserved in the rest of the ancient West Eurasia. These close-kin marital practices, likely equivalent to first-cousins unions, were substantially higher in Crete and other Aegean islands than in Mainland Greece.” (Raises the question: did the Anatolians who migrated into Greece at this time before the Indo-Europeans also practice close-kin marriage? Or was this something to do with Greece’s broken, mountainous, insular territory? This new paper may tell us.)

    Possibly the coming of the Greeks gave some shared culture that started to assimilate these groups into a single population. Though possibly not completely – if there were close breeding practices still through LBA, that might help to explain why variable steppe ancestry proportions could exist, where the endogamy is maintaining the variation?

    Many of David Anthony’s early ideas on the spread of IE emphasized the idea of IE as a shared elite culture that diffuses down. That might still be more relevant in SE Europe. Just one possibility anyway.

  2. Echoing a comment I saw on Eurogenes: I wish they would give more attention to other language families. There are linguistic theories attempting to link Basque to Caucasian languages or Afroasiatic, independent theories about Afroasiatic in pre-IE Europe, theories linking NE Caucasian languages to Hurro-Urartians (or Sino-Tibetan…), and mountains of speculation on Sumerian and Elamite.

    Would be interesting to see how the genetic data weeds out or reinforces those theories. Most are dismissed by mainstream linguists, but most also haven’t been given a serious look beyond easily-dismissed individual efforts. If one or more of those theories gets serious support from ancient DNA, it could get the attention it needs to be properly evaluated.

  3. @Marco, it would be nice, although I’m not sure how much light the aDNA will shed on those. The major components of ancestry can’t be correlated with the distribution of those languages or putative higher level groupings, which are anyway subject to debate. I think we already have a fair idea that Basque-Caucasian and things like this aren’t going to have any genetic signal to them? Could be possible but seems doubtful?

    I do think hopefully we might see something on whether Afro-Asiatic might be better explained by a neolithic or pre-neolithic dispersal, perhaps, and maybe on whether there is any signal correlated with Semitic specifically, because that’s an expansive family and there are some more dates around that which can be compared to genetics.

    As another sort of Eurogenes linked comment that I’ve said there before, I think at least the paper does suggest that Urartian / Hurro-Urartian has no links to the steppe, in that the Urartian kingdom lacks it. This was a minor hypothesis, due to the equestrian links of Urartu in historically attested material.

    (This is another indication that the horse focused culture around the expansion of the Sintashta DOM2 horse, which we find the first samples of basically simultaneously in Turkey and the steppe, happened far faster and independently of any very large movement of people and population genetic change.)

    @Razib; Possible typo above: “If this, true might explain why Armenian is so hard to place within a Indo-European tree, as Celtic, Germanic, Balto-Slavic and Indo-European seem to come out of a broader Corded Ware cultural complex” – Iranian?

    As another side point, perhaps some confirmation that Armenian is from Catacomb and not the post Corded Ware cultures may help to add some weight to the suggestion that the connected satem and ruki laws are more areal and don’t actually so much connect Balto-Slavic and Indo-Iranian in a linguistic family connection (IIRC Armenian shows some incomplete satemization?).

  4. While I tend to agree more with the “Northern Model” for the formation of the Mycenaeans, I don’t know if I agree that steppe was connected to status in Greece.

    @Razib Khan, have you gotten a chance to check out this:

    Our thread on the topic:

    The leaked PCA shows that other Ancient Greeks from other areas looked very similar to the Mycenaeans of Pylos. Though it is not clear what date they are from.

  5. You guys are still not willing to accept that BMAC was Indo-Iranian/Indo-Aryan starting 2200 BCE.

    Indo-Iranians didn’t come from Corded Ware. All ingredients of Aryan culture were present in BMAC way before they mixed with Andronovians.

    Fire temples, Soma, horse, carts/wagons, Indra etc. I am no expert in genetics, but those who are hold the opinion that the Mitanni Indo-Aryan vocabulary most likely came from a BMAC/Shahr-i Sokhta-like source, as seen in the ancestry profile of the Alalakh lady (1500 BCE), whose steppe input is lower than that of modern Dravidians.

    BTW I am Kashmiri (Indian) and I have been told we have significant ancestry from Kangju-like groups. Your take, Razib?

  6. Some things to ponder.

    “These include a wide variety of ivory objects like dice and hairpins; Bakry adds “ivory objects were not only found at Altyn Depe and Gonur Depe, there were also unique ivory discs unearthed at Djarkutan (Southern Uzbekistan), where we can also find the depictions of the pipal leaves (typical Indus motif)” (p. 424). Then there are the ceramics; about a fifth of Harappan types have counterparts in the later (Namazga V) phase of BMAC culture (about 2300-1800 BCE). These includes the dish-on-stand and perforated jar types. Seals have been found, Bakry notes: “It is important that Altyn Depe seals were uncovered in a ritual center including a tomb for priests, where various valuable artifacts were found near the altar, including the golden head of a bull and the seal number 2 [a swastika seal]” (p. 427). Indeed, it seems as if it is in burials of the elites and what could be priestly or ritual centers that many of the finds of the most sophisticated goods were made. Beads, bangles, even a faience monkey figurine has been found at Gonur Depe.”

    Then we have scholars like Parpola who are of this opinion, and I quote:

    “Asko Parpola states that dasa referred only to Central Asian peoples. Vedic texts that include prayers for the defeat of the dasa as an “enemy people”, according to Parpola, possibly refers to people from the so-called Bactria–Margiana Archaeological Complex (BMAC), who spoke a different language and opposed Aryan religious practices.”

    Archaeological evidence is to the contrary. Even the genetic evidence, I believe, is to the contrary.

    “New research in Murghab region, in excavations at defensive walls of Adji Kui 1, showed pastoralists presence as early as the second half of the Middle Bronze Age (c. 2210-1960 BC), with the coexistence of BMAC people living in the ‘citadel’ and pastoral population living on the edge of the town.”

    So BMAC and Andronovians didn’t mix with each other for nearly half a millennium. Why? Were they hostile to each other, or did BMAC people not consider them worthy of intermixing with?

    That brings us to another important pointer. Since the Aryan cult was already present in BMAC, what then was the identity of the Andronovians?

    Were Andronovians the Dasyus of the Rig Veda?

    Quoting Witzel

    “Michael Witzel in his review of Indo-Iranian texts in 1995, states that dasa in the Vedic literature represented a North Iranian tribe, who were enemies of the Vedic Aryans, and das-yu meant “enemy, foreigner.””

    So did Andronovians speak some kind of Proto-Iranian or Proto-Balto-Slavic-Iranian?

  7. Early Kassites (~18th c. BCE) have personal names like Gandash and Abi-Rattash, both of which appear to be of Indo-Aryan character. Their god list has another set of Aryan gods like Bugash (Bhaga), Suriash (Surya), Marutash (Maruta).

    If these gods are indeed Aryan, it’s highly unlikely the source for them was Andronovo. It also seems like the Mitanni and Kassites came into contact with two separate tribes with their separate sets of deities.

    From: Archaeogenetics

    Hasanlu IA 900bce, Hasanlu, W Iran

    Target: IRN_Hasanlu_IA
    Distance: 2.3395% / 0.02339468
    35.0 Levant_Ashkelon_LBA
    29.4 UZB_Bustan_BA
    26.6 Kura-Araxes_ARM_Kalavan
    9.0 RUS_Sintashta_MLBA
    0.0 ARM_LBA
    0.0 ARM_Lchashen_MBA
    0.0 ARM_MBA
    0.0 MNG_TUK001
    0.0 PAK_Katelai_IA

    30% BMAC ancestry

    A Mede, perhaps.
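
    For readers unfamiliar with output like the model above: as I understand it, tools of this kind place each population at a point in a coordinate space (for example scaled PCA coordinates), then search for the mixture of source populations, with non-negative weights summing to 100%, whose weighted average lies closest to the target. "Distance" is how far the best mixture remains from the target. A minimal sketch of that idea, assuming made-up 2-D coordinates, hypothetical source names, and a brute-force grid search rather than whatever optimizer the actual tool uses:

```python
from itertools import product
from math import sqrt


def mix(sources, weights):
    """Weighted average of the source coordinate vectors."""
    dims = len(next(iter(sources.values())))
    return [sum(w * sources[name][d] for name, w in weights.items())
            for d in range(dims)]


def distance(a, b):
    """Euclidean distance between two coordinate vectors."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def best_fit(target, sources, step=5):
    """Try every mixture whose percentages are multiples of `step`
    and sum to 100; return (distance, weights) for the closest one."""
    names = list(sources)
    best = None
    for combo in product(range(0, 101, step), repeat=len(names) - 1):
        last = 100 - sum(combo)
        if last < 0:
            continue  # percentages would exceed 100
        weights = {n: p / 100 for n, p in zip(names, combo + (last,))}
        d = distance(target, mix(sources, weights))
        if best is None or d < best[0]:
            best = (d, weights)
    return best


# Toy coordinates, invented purely for illustration (not real samples).
TOY_SOURCES = {
    "source_A": [0.0, 0.0],
    "source_B": [1.0, 0.0],
    "source_C": [0.0, 1.0],
}
# A target built as a known 35/30/35 mixture, so the fit can be checked.
TOY_TARGET = mix(TOY_SOURCES,
                 {"source_A": 0.35, "source_B": 0.30, "source_C": 0.35})

if __name__ == "__main__":
    fit_distance, fit_weights = best_fit(TOY_TARGET, TOY_SOURCES)
    print(fit_distance, fit_weights)
```

    Real runs use many more dimensions and sources, and several quite different mixtures can sit at nearly the same distance, so a single percentage such as the ~30% here is better read as one plausible fit than as a measurement.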


    Are these ancestries related to Hurrians or Assyrians? From what I have gathered from the internet, it seems to me that this BMAC-like group, after mixing with the other source, ditched Vedic Indo-Aryan culture and adopted an Assyrian god. There does appear to be a resemblance in the iconography of Assur and the Zoroastrian Ahura Mazda, and Ashurbanipal’s clay tablets attest to Assara Mazaš a few centuries before Darius’s inscription, but I don’t know the context.

  8. this sort of stuff makes you seem like an idiot:
    “Michael Witzel in his review of Indo-Iranian texts in 1995, states that dasa in the Vedic literature represented a North Iranian tribe, who were enemies of the Vedic Aryans, and das-yu meant “enemy, foreigner.”

    who cares what Michael Witzel thought in 1995? he has changed his mind on many things in the last ten years, let alone 25. do you have no understanding of the literature you quote? probably.

    overall your comments are just a vomit of random information. not persuasive if that’s your intent, as opposed to just babbling at us.

  9. I am confused. Aren’t Yamnaya a mix of EHG and CHG? So did EHG and CHG mix in Anatolia and then migrate to the Steppe?

  10. “731 of which are newly reported and address major gaps in the archaeogenetic record.”

    Kudos to Reich’s group on this very important score that shouldn’t be forgotten. All analysis rests on this foundation.

    “We report the first ancient DNA from the world’s earliest farming cultures of southeastern Anatolia and northern Mesopotamia, as well as the first Neolithic period data from Cyprus and Armenia, and discover that it was admixture of Natufian-related ancestry from the Levant—mediated by Mesopotamian and Levantine farmers, and marked by at least two expansions associated with dispersal of pre-pottery and pottery cultures—that generated a pan-West Asian Neolithic continuum.”

    This sentence reminds us why colleges still insist that everyone take English composition before receiving their degrees.

    I’m also a bit confused about what they are trying to say here.

    The ancient DNA we’ve seen so far, rather than reflecting a “pan-West Asian Neolithic continuum” has shown a stark divide, disproportionate to geographic separation, between Levantine early Neolithic farmers on one hand, and Caucasian-Iranian early Neolithic farmers on the other. The Levantine farmers were closely related to Natufian hunter-gatherers and proto-farmers. The Caucasian-Iranian early Neolithic farmers were closely related to Caucasian hunter-gatherers. Almost all other early Neolithic farmers were migrants derived mostly from one or the other of these populations, with minor local introgression.

    Are they saying that early Anatolian farmers bridged this gap in a continuum?

    Were Mesopotamian early Neolithic farmers distinct genetically from Caucasian-Iranian early Neolithic farmers?

    Where do Cyprus and Armenia fit in that sprawling sentence?

    “If this, true might explain why Armenian is so hard to place within a Indo-European tree, as Celtic, Germanic, Balto-Slavic and Indo-Iranian seem to come out of a broader Corded Ware cultural complex]”

    It wasn’t all that hard to explain why Armenian was hard to place without this hypothesis, since Armenia is on the border of several different major branches of IE languages that have ebbed and flowed in their control of the region, and because it had areal influences that no other branch of IE enjoyed.

    Critically, there is almost nothing in Armenian that isn’t found in some other geographically close branch of IE. Few, if any, features or words that can’t be readily attributed to substrate influences are found in it. Instead, Armenian looks a bit like English, mugging other languages in dark alleys and stealing from them to produce an unnatural mix of pieces of many languages.

    “suggesting that the Indo-Anatolian language family originated in the eastern wing of the Southern Arc and that the steppe served only as a secondary staging area of Indo-European language dispersal.”

    I too find this not even remotely plausible.

    There is no solid evidence that Indo-European Anatolian languages were widely spoken pre-2000 BCE in any large territory within Anatolia, and no archaeology to back up an Anatolian sourced language shift in the steppe at the right time thousands of years earlier. The hypothesis isn’t consistent with the available historical documents. Indo-European Anatolian languages were present in Anatolia before the Hittite empire came into being, but didn’t have the momentum to expand and spread widely to other lands until then. In 2000 BCE the speakers of Indo-European Anatolian languages were a decidedly minority population of the peninsula that was rising in clout. An expanding culture of Indo-European Anatolians capable of producing language shift on the steppe would have conquered Anatolia first and soundly dominated Anatolia many centuries earlier than they did.

    The divergence of the Indo-European Anatolian languages is much better explained as a strong substrate influence atypical of the substrate influences found in Europe, in an Anatolian elite that wasn’t as completely dominant and hence more vulnerable to substrate influence than in Europe where civilization had more or less collapsed when Indo-Europeans arrived and there was mass population replacement. At a very crude level, Hittite words and names sound very non-IE Hattic shifted, compared, e.g. to Mycenaean Greek or to Sanskrit.

    Their hypothesis of a primary Anatolian origin of Indo-European languages requires a far bolder leap that has nothing to support it. Where are the temples to the Indo-European gods in late Neolithic Anatolia? Where are the burial practices that appear in the steppe? Where are the Anatolian Y-DNA haplogroups in the steppe given that Indo-European culture always and everywhere through the Bronze Age at least, was intensely patrilineal and almost always expanded in a male dominated fashion that subjugated other peoples they encountered? How do you get elite dominance language transmission without any signs of a late Neolithic/Copper Age, genetically distinct, Anatolian dominated elite (as you do, for example, in Hungary, one of the leading and rare cases of elite dominated language shift without much demic impact)?

  11. @ohwilleke; I think your comment is good but I have caveats around what patrilineality could tell us.

    One thing either Anthony, in his lecture now unfortunately taken offline, or Kristian Kristiansen in an interview on Tides of History, said was that the DNA showed that Yamnaya were interacting in SE Europe in possibly a less patrilineally biased way than Corded Ware did in the north, and also it seems possible that this reached a stronger patrilineal clan pattern in later CWC and Beaker than early, where early CWC in Bohemia show some more diverse Y-DNA.

    When it comes to the putative proto-Armenians, it’ll be interesting to see if they’re all R1b-Z2103, or if it’s some with other groups about.

    I think as I recall Narasimhan’s paper, the steppe_MLBA admixed folks in Central Asia did not strongly, if male, tend to be R1a, but there was a bit of J and Q in proportion to BMAC and ANE related ancestries.

    The tightness of the patrilineality may have varied over time, not that I’d deny patrilineal themes in IE. E.g. how useful were local foreign males vs clan males?

    Hopefully there is enough Sredni Stog to talk about patrilocality vs matrilocality in that culture.

    I think if we do get these findings that indicate that the earlier Anthony model is correct in SE Europe, where IE spreads by steppe chiefs entering into client relationships with local chiefs that allowed ydna to persist (“You, farmer chief are my client and marry my daughter, and our clan is strengthened”), it’ll be hard to claim very strongly that this is not possible to have happened on the Eneolithic steppe. Though on balance of probabilities it still may seem less likely… Currently we’re sampling heavily from Corded Ware and Beaker who dealt with declining, smaller, more peripheral farmers, and the culture around integration for them may have been distinct.

  12. @ohwilleke, great points.

    Especially “The divergence of the Indo-European Anatolian languages is much better explained as a strong substrate influence atypical of the substrate influences found in Europe, in an Anatolian elite that wasn’t as completely dominant and hence more vulnerable to substrate influence than in Europe where civilization had more or less collapsed when Indo-Europeans arrived and there was mass population replacement. At a very crude level, Hittite words and names sound very non-IE Hattic shifted, compared, e.g. to Mycenaean Greek or to Sanskrit.”

    As we haven’t yet seen the more than 700 newly genotyped individuals, and as Reich’s text is somewhat unclear, we are baffled by this ‘abstract’. Nothing we previously knew pointed to something like that. Maybe some successive waves of CHG introgressions into southern Steppes, but not a “southern Caucasus origin of PIE” or a solid Indo-Anatolian phase before PIE.

    @Razib, I re-heard your talk with David Anthony and he didn’t hint at a really southern ultimate origin of PIE or at a decisive Indo-Anatolian phase before Steppe PIE. That’s weird, would Reich’s people keep him in the dark about the evidence they now claim to hold about these sorts of things?

  13. Re; Anatolian as derived but strongly shifted, rather than branching from an ancestor, that’s a really strongly linguistic argument. Historical linguists tend to be very firm on their ideas of which features can logically have developed simpler/more complex forms through change vs where simpler/more complex forms must be ancestral, and they have some deep expertise here that is difficult to challenge. They have strong ideas about what the directions of change should be, and what lexical changes are likely to be due to loss or preservation of an archaic phase lacking the term, or independent derivation. Fundamentally any argument like that would need to be made on a strong purely linguistic case and reach linguistic acceptance before it could form a basis for us to think too much about the DNA.

  14. Anyway I don’t want to naysay the idea that Anatolian is diverged because of contact effects, but just wanted to emphasize that linguists haven’t just counted derived features and split their tree first with Anatolian because of a number; they have actually done it on the basis of what is logically to them derived from an ancestral form vs what could plausibly be derived from late pIE under any circumstances (even those of heavy contact). So that takes a linguistic response to counter.

  15. @Matt

    “caveats around what patrilineality could tell us.”

    This isn’t wrong, although if there is too much of a cultural difference in litmus test distinctions like patrilineality, there is also a decent argument that there might not be linguistic identity either.

    Certainly, one could argue that a default patrilineality of the IE source will manifest most extremely in places where local civilization had collapsed the most like Europe further to the North and West, and in the former Harappan and BMAC civilization, and might be manifested more moderately in places like Anatolia and the Balkans that may have had a more healthy civilization in place (perhaps, in part, because those locations were the ones ideal for the Fertile Crescent plus Neolithic package with domesticates already struggling beyond their sweet spot in other ecological regions). The stronger the local chief is, the more the cost-benefit analysis of making him a client rather than obliterating and replacing him makes sense.

    “Anatolian as derived but strongly shifted, rather than branching from an ancestor, that’s a really strongly linguistic argument.”

    It is, but it is an important one. There is a fair amount of literature out there pointing to the conclusion that the proportion of linguistic change that is punctuated and contact driven relative to the proportion of linguistic change that is due to random drift proportional to time elapsed is much more contact driven than conventional wisdom assumes. Relatively isolated linguistic communities, e.g. at frontiers or in liturgical languages that don’t have living counterparts, can be very static over time. Examples include (1) the similarities that persist between Icelandic and Old Norse, (2) the archaic Spanish constructions that persisted in Southern Colorado and Northern New Mexico which were isolated frontiers until quite recently while disappearing in all other Spanish dialects, and (3) the persistence of very old (almost Shakespearian) pronunciations and construction of English in isolated Appalachian communities in the U.S. As another familiar example, the evolution of English from Old English to Middle English and to modern English was likely heavily contact driven. (Examples of liturgical language persistence include Sumerian, Hattic, Coptic, Hebrew, Latin, Old Church Slavonic, and Sanskrit.)

    Quentin D. Atkinson’s initial work favored an Anatolian hypothesis but didn’t account for language contact (e.g. Russell Gray and Quentin Atkinson, Nature (2003)). Atkinson recognized this was a problem, and in a 2008 study fit the data to a more complex model that attempted to determine how much language change was attributable to change in the formative period of a language, and how much language change was due to random drift. In the Indo-European languages, Atkinson’s effort determined that about 21% of language change was due to language formation effects, a result that neatly produces an estimated age of proto-Indo-European of about 6600 years BP, a match within the margin of error with the most widely accepted Kurgan hypothesis of Indo-European language origins. See Atkinson, Q. D., Meade, A. M., Venditti, C., Greenhill, S. J. and Pagel, M. (2008). Languages evolve in punctuational bursts. Science, 319: 588. DOI: 10.1126/science.1149683. The great importance of punctuational bursts is a relatively recent linguistic insight.

    Also, it is easy to miss substrate influences in most Indo-European languages in Europe, because all of them have substrates that probably derived from the same Western Anatolian first farmer population’s language family or families, so the substrate influences would be similar over a huge range and hard to determine if they have a major IE branch source or a substrate source. This isn’t true of Anatolia itself, however, because it probably experienced language replacement in the Eneolithic that put the West Asian highland sourced Hattic people (and probably remote linguistic ancestors of Minoans as well) in place.

    The archaeology and historical evidence also just don’t support the deep time depth.

  16. Saw your tweet 🙂

    In ancient times commoners such as tanners, landless labourers etc. were
    less likely to have a foreign spouse than, say, a merchant, warrior or philosopher.

    Scythian Rudradaman married his daughter to a Satavahana prince of the Deccan.

    Gupta kings were known to have matrimonial alliances.

    Seleucus Nikator too married his daughter to Chandragupta.

    Merchants since the Harappan days were always on the move.

    Philosophers too weren’t lazy bums and kept migrating.

    What little steppe & R1a you see outside the so-called upper castes could most likely be a sign of people who were outcasted from their clans for breaking sacred clan rules.

    There are Dravidian speaking groups in Andhra, Karnataka & Kerala who are genetically closer to North West Indians than to Indo-Aryans of the Ganga plains & other Dravidians surrounding them.

    Scholars of the Common Era attest to the Indus valley being overrun by foreigners/Mleccha.

    I wonder if children of women sex slaves ever grew up to leave behind their progeny.

    4 classes of Hindu society

    Brahmans – Scholars/Philosopher/Priest etc

    Kshatriya/Rajanya – Ruling class/Governors/Satraps

    Vaishya – Merchants/Traders

    Shudra – labourers/Servants

    Castes – Jatis – there are thousands of them.

  17. There were Priests & Merchants even in BMAC & Harappa. Since the assumption is that no type of steppe ancestry reached these places till 1800bce, this could possibly mean these people shared the same ancestry with the rest of the population.

    I am not sure what your average IVC profile was, as the IranN-like : AASI ratio shows huge variation.

    If Harappan priests and merchants turned out to be IranN-like shifted, with mine workers & the rural population more AASI shifted, would that mean Varna- and Jati-like groupings were already present in India since Harappan times?

    I leave this to geneticists to ponder upon.

  18. Question to your subscribers.

    What language did the ~2100bce Fire-S/homa priests of BMAC chant their mantras in? 🙂

    Language X

    Make a logical guess not assumptions based on wrong inferences…..

  19. [stop spamming the comments or i’ll start deleting]

    Suppose a Harappan male in 2000bce brought a steppe bride and they had 4 sons who would be unable to pass on their steppe mtdna to their offspring but their grandchildren would carry around 12.5% of steppe ancestry.

    In 100 years this family expanded in size (50-64 individuals with this kind of ancestry)(no case of cousin marriage)

    Now some of these individuals had brides as well as grooms from steppe.

    100 years later we have population size increasing to >1000 with steppe ancestry ranging from 5-18%

    Now if originally there were 25-50 such cases reported from different parts of Indus valley the population carrying steppe autosomal ancestry will multiply exponentially.

    So you are possibly looking at a population of 50000 who carry 5-20% steppe ancestry. From 0 to ~50000 in 200 years

    Do you guys really think this ancestry shift will lead to language shift?
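A rough sketch of the dilution arithmetic in the comment above, assuming (a simplification not stated there) that every later spouse carries no steppe ancestry, so the share simply halves each generation. The function name `steppe_share` is illustrative only. Under these assumptions a fully steppe bride gives 50% in her sons, 25% in the grandchildren and 12.5% one generation later; a 12.5% figure for the grandchildren would fit a bride who was herself about 50% steppe.

```python
def steppe_share(founder_share, generations):
    """Expected steppe percentage after a number of generations.

    Assumes each descendant marries a partner with 0% steppe ancestry,
    so the expected share halves every generation (hypothetical model).
    """
    return founder_share / 2 ** generations


# Bride assumed to be 100% steppe.
for generation, label in [(1, "sons"), (2, "grandchildren"), (3, "great-grandchildren")]:
    print(f"{label}: {steppe_share(100.0, generation)}%")

# Bride assumed to be about 50% steppe herself.
print(f"grandchildren of a 50% steppe bride: {steppe_share(50.0, 2)}%")
```

Later steppe brides and grooms marrying into the family, as the comment goes on to describe, would push the share back up, which this sketch does not model.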

  20. To frame my argument in a much better way;

    World population estimate: 75 million
    South Asia: 15 million
    Let’s assume the North was more densely populated: 10 million approx.

    If the claim is that post 1500bce is a bad source, then most of the steppe came between 2000-1500bce.

    Archaeologically there is no proof of such a large migration.

    A better model is slow and gradual migration with higher birth rates, with post 1500bce arrivals serving as steppe enrichment rather than as the source population for most of the groups.

    These post 1000bce steppe populations with a 50% steppe component are an excellent source for steppe enrichment.

    Their East Asian component will go down to negligible levels, as is visible in the negligible Greek component in modern North West Indians despite a historical Greek presence there.

    To frame it better, there was selective mixing, plus the likelihood of low to no steppe populations/soldiers dying in wars with Persians, Greeks, Huns and Turks.

  21. Another example

    Indian Female – 90% Harappan + 10% steppe
    Scythian Male – 50% steppe + 20% BMAC + 30% EA

    Result = 45% Harappan + 30% steppe + 15% EA + 10% BMAC

    Both females.

    Mixed with local men with minor steppe.

    Result = 67.5% Harappan + 20% steppe + 5% BMAC + 7.5 % EA

    Both women had 4 children each
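The percentages in this example can be checked with a short sketch that treats a child's expected ancestry as the midpoint of its parents' ancestry. The helper name `mix` is illustrative, and the profile of the "local men with minor steppe" (90% Harappan + 10% steppe) is an assumption chosen because it reproduces the second result quoted above.

```python
def mix(parent_a, parent_b):
    """Expected ancestry of a child: the average of both parents, per component."""
    components = set(parent_a) | set(parent_b)
    return {c: (parent_a.get(c, 0.0) + parent_b.get(c, 0.0)) / 2 for c in components}


indian_female = {"Harappan": 90.0, "steppe": 10.0}
scythian_male = {"steppe": 50.0, "BMAC": 20.0, "EA": 30.0}

# First generation: daughters of the Indian female and the Scythian male.
daughter = mix(indian_female, scythian_male)

# Assumed profile for a local man with minor steppe ancestry (not given in the comment).
local_man = {"Harappan": 90.0, "steppe": 10.0}

# Second generation: children of a daughter and a local man.
grandchild = mix(daughter, local_man)

print(daughter)
print(grandchild)
```

This is only the expected value; real individuals vary around the midpoint because of recombination.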


  22. @ohwilleke, yeah, I think your point around substrate and contact influences can be true. Although there are many non-contact reasons that a language can develop very fast too – say speakers of some dialect differentiate themselves from their linguistic and genetic neighbours by adopting some sound change that just makes loads of words unintelligible, then they’ll have to innovate loads of new/alternative vocabulary and the language will change much faster, and they may have to change the grammar to make it intelligible. So I’d be very wary of saying as default “If a language has apparently changed a lot, it is because of contact” or anything like this.
    And as a mutual sprachbund can develop between different language families – e.g. Indo-Aryan and Dravidian – these can mislead us into saying “Well, it must be that the more local language family is just representative of its ancestral state, and has influenced the incoming family” where actually both are derived.

    But contact may be underrated.

    The point is just that as I understand it linguists aren’t basing the position of Anatolian in trees on the amount of change, but that they view the logical sequence of changes to be such that Anatolian cannot have derived from the same last common ancestor as the rest of the IE languages.

    That is independent of how extensive the sound and grammar changes are in the languages. I’m not sure that this could be attributed to a shared substrate in IE languages and a different shared one in Anatolian. But anyway, I guess this is really a point for linguists and is hard for us to assess, so I understand Harvard or any group of geneticists, particularly without an Anatolian expert, just taking the current consensus at face value.

  23. @ohwilleke, just on that tantalising question of whether there was a similar linguistic substrate for IE expanding into Europe west of the steppe, I think there are some challenges to it just from the population genetics:

    – Globular Amphora had a complete takeover of Euro HG y DNA and seems patrilocal from limited intentional cemeteries (e.g. males related through the father). If we follow that patrilocality + ydna replacement indicates language from the fathers’ group that would suggest Euro HG language for GAC. (or we could abandon this idea in the general case, but we couldn’t just make a special exception here).

    – In Scandinavian late Neolithic, we have a complete replacement of previous Corded Ware y DNA (which in this case was largely R1b with some I2 and a few R1a) with presumably EuroHG derived I1. Now nothing is happening in terms of autosome really here. But it’s hard to avoid the inference that linguists’ perception of a HG substrate in Germanic has some connection here.

    – The Balkans seem most certain to have EEF language and substrate, though in Greece we have the potential influence of people coming in from Anatolia to Greece and to Crete and the Aegean (e.g. see the models for Mycenaeans in Lazaridis’s paper).

    – Even when it comes to Italo-Celtic, there is a question of whether proto-Italo-Celtic is from Beaker or more possibly from the Balkans, developing into Italic and Celtic in Italy or in France/Central Europe respectively, with lots of absorption of people. Celtic does crop up in Central Europe with an increase in G2a, I2, which might make sense if these guys had been absorbed by Yamnaya migrations here and spread NW… (Or maybe they were speaking a Beaker derived language but had wider trade with Italy and the Balkans so brought in the ydna – hard to know…).

    – A particularly big problem though arises if Armenian derives not from a movement from the Balkans but directly from the Catacomb Culture with no Corded Ware EEF substrate. Then it’s really hard to say that anything reconstructed for IE including Armenian but not in Indo-Anatolian is due to European substrate.

  24. Razib, could swat valley samples be related to Ashvakan tribes and thus ancestral to Afghan-Pashtuns?

    What does your data say?

  25. @Matt

    Re Celtic

    In sensu stricto Celtic archaeological cultural traces start in the Central European La Tene culture, and expand from there in the Iron Age, and one of the very recent ancient DNA papers finds traces of demic expansion to the British Isles post-Beaker that could be attributed to it. But the historic maximal range of Celtic and of Beaker derived cultures is so similar, and the population turnover with Beaker is so great, that there just has to have been a language shift at that point which was at a minimum a substrate of Celtic. Alternately, Celtic languages date to the Bell Beaker, and the time frame we attribute to narrow Celtic physical culture might have only resulted in some dialect innovations within an existing language rather than a full on language shift. The presence of primitive Tartans and other Celtic-like styles in physical objects in the Tarim basin likewise argues for an older origin of Celtic, with the La Tene culture in the Iron Age merely “renovating” and updating Bell Beaker culture rather than replacing a Bell Beaker language and culture wholesale.

    In a late Celtic dispersal and language shift hypothesis, however, the Unetice and/or Urnfield cultures make sense to associate with Italo-Celtic.

    Almost all of the Beaker-Celtic range would have had an EEF substrate.

    Re Non-EEF Substrates In Europe

    Globular Amphora and Scandinavia could have European HG substrate and might be a good place to try to distinguish substrate influences from early IE branch terms found in other Baltic, Slavic, Italo-Celtic and Germanic branches. Another would be to compare Sanskrit and Avestan on one hand, to the European IE branches on the other.

    Re Armenian

    One challenge for working out Armenian is that it is a single-language branch within IE and is attested only quite late. Nothing written in Armenian survives from earlier than the 5th century CE, and even the Armenian people aren’t mentioned in any written texts before the 6th century BCE, post-Bronze Age collapse and more than a thousand years after the earliest attested mention of the Hittites in a written document.

    Armenian is definitely at a crossroads, appropriate for its geography, with loan words from Western Middle Iranian languages like Parthian, and also Greek, Persian, Syriac, the Anatolian languages Luwian and Hittite, Aramaic, Arabic, Mongol, Kartvelian, Northeast Caucasian languages, and possibly Akkadian or Sumerian via Hurro-Urartian. It also has similarities to Phrygian and grammatical features found in Balto-Slavic languages. More recently it acquired Turkish loan words that were excised after the Armenian genocide. Likewise it has a mix of satem and centum features. Also notable is Armenian’s particularly rich phoneme inventory for an Indo-European language, somewhat reminiscent of the particularly rich phoneme inventory in some dialects of New York City, which also has lots of contact with languages that have phonemes not found in most dialects of English.

    The genetic connection direct to the Yamnaya people via Y-DNA cements the antiquity of its origins and underlines how big a gap there is to fill, something that pretty much has to be left to archaeology and ancient DNA, since history offers so little insight into Eneolithic and Bronze Age Armenia.

  26. @ohwilleke, good point, re: Armenian and how little is known. I am quite fascinated by the idea of learning how far east into Iran from NW Iran this steppe EMBA infusion went, given that it evidently did not reach the zone of the BMAC. It seems like the whole of Northern Iran between 2500-1000 BCE is just Terra Incognita.

    (Are these migrations related to the Gutians? Probably wrong and that feels too far south to my intuition…).

    (There are huge expanses of time and territory in the area of the Near East unsampled –

  27. @Matt

    “we also have the evidence of two Iron Age males from NW Iran, from Hajji Firuz and Hasanlu (both from around 1100-800 BCE). These guys both have the R1b-Z2103, which is from Yamnaya/Steppe_EMBA.”

    It’s easier to argue that the female outlier is likely not connected to the Steppe_MLBA-connected Indo-Iranian expansions based on the dating (or did you also have other reasons in mind too?) but Y-DNA aside, kinda hard to know what the later Iranian guys represent, right? They come from EIA contexts where we expect Iranic to have arrived in western Iran. Do we know that the Y-DNA in their specific case wasn’t picked up in the northeast from the previous steppe expansions and brought to the southwest, rather than more locally via Steppe_EBA expansions that influenced the region previously (tbh that might be a good question for the Y-DNA-focused people on other fora, specific subclades etc)?

    Can I ask why you seem to be suggesting a Caucasus route for Armenian (based on these sorts of samples? or the ARM_MLBA ones too? those latter ones seem, so far, to arguably represent a purer Yamnaya-like admixing source than anything already mixed in the Balkans – and we see those early MBA samples from Greece were already significantly mixed – so it might make more sense for a “hop over” than a roundabout trip…) in your comment? What is pointing towards it in your view? I might be misreading part of your comment though.

  28. @forgetful, seems like if there’s no genetic impact in Turkey and you have south moving steppe in the mountain region happening at this time (the samples Wang labels “Late North Caucasus” here, KBD –), then it seemed to make sense. Could’ve gone via elsewhere; maybe they came via boat from the Black or Caspian Sea, but I don’t see a reason for any far detour like via the Balkans and Anatolia.

    It seems like a big coincidence that Reich would claim “Yamnaya patrilineal descendants occurred, coinciding with their 3rd millennium BCE displacement from the steppe itself” in exactly the same region where these lineages are later found, and that there’s a modern west->east cline in these lineages in Iran… but that these were actually brought in later via Indo-Iranian movements. It just seems highly unlikely.

  29. @Forgetful; I may have misunderstood something in your comment. Where do you mean by “northeast” and “southwest”? The IA samples with R1b-Z2103 are from northwest Iran.

  30. @Matt

    I suppose if we do see absolutely no impact in Anatolia – though the “hardly any” might at least work for Anatolian depending on the circumstances on that movement, though otherwise one might want to opt for a scenario like ohwilleke’s whether that means simply a late split of Anatolian after all or some longer stay in the Balkans before crossing to Anatolia relatively late or whatever else – but a more consistent one in the Transcaucasus, it’s a good argument. Or at least it’s handwavy to assume otherwise just due to pre-existing arguments. From what I remember, from a while ago, their sample location map even showed quite a few from all over specifically Anatolia so it might not be a case of a lack of sampling like we could argue before with the few BA-IA samples. Certainly might constrain some possibilities about both branches at least.

    Under that scenario a hypothetical Greco-Armenian split (assuming that’s valid for the sake of discussion) would happen while still on the steppe itself, not implausible even going by the datings arrived at in the recent paper you brought to attention by Kassian et al., and you’d have a sequence of Yamnaya-like -> Kuban-Tersk-like (I haven’t noticed if it’s a good source for whatever is around Armenia at the time, though we have lots of gaps anyway, but it looks like it might work) -> ARM_MLBA for Armenian and Yamnaya-like -> Logkas-like -> Mycenaean-like for Greek.

    By “southwest” I meant the location of the samples in western Iran relative to the movement from the “northeast” that would have brought Iranic there by the time of these samples (EIA). But yeah, no disagreement on your overall point. More curious about whether any of the earlier R1b on the eastern steppe was also picked up and brought southwest (I do remember northeast Iranic speakers showing it too though at lower frequencies, though no idea whether that’s specifically from there or was also to some extent brought later from the southwest instead) or if it overwhelmingly/solely comes from that earlier movement into the area.

  31. @All, by the way I mentioned above an upcoming paper by Harald Ringbauer and collaborators (lead author Eirini Skourtanioti), that mentions that Bronze Age Greeks seem to be inbreeding at levels far higher than the average of other known cultures.

    There’s some unpublished evidence that this may have endured down to the Iron Age or Archaic Period world. Was watching a video from Carles Lalueza-Fox from March 2022, on “Inequality: A Genetic History”. Most of the results are probably generally known to people interested, but one interesting unpublished slide at 43:47 talked about some samples from a Greek colony at Baria (post 600 BCE), where the people they’ve sampled have a high inbreeding level.

    Inbreeding and Greek civilization generally? Or a symptom of the Greek colonists forming a certain kind of mercantile caste in this place?

    @Forgetful, just to confirm I read your post above, nothing really to dispute with it. (As I’ve discussed on Eurogenes, actually lots of problems with the idea of Armenian as both from Yamnaya and also sitting on a family tree with an Indo-Baltic clade… But this may not be necessary.)

  32. @Matt

    Yeah, I saw what you’re referring to. My general assumption going by traditional arguments too was that Armenian might have gone through the Balkan route, though the ARM_MLBA group that we’ve had a while seemed to need a source that was still quite steppe-rich/eastern overall (though, as you’ve pointed out too, again ‘local’ representation is lacking a bit, all other populations of the general area can be quite different and distant in time) compared to what we generally see in the Balkans around that time, especially in potentially geographically relevant regions, and these forthcoming results might also provide a more direct argument for a ‘hop over’ as you mentioned. Interesting either way, the more the better.

    Thanks for the youtube link, I hadn’t come across it. Seems worth a watch later.

    IIRC, that preprint’s claim about the E(?)BA was more specifically about small island endogamy and that the same effect wasn’t seen in the mainland as much, right?

    Baria/Villaricos was a Punic colony in the area of that southwestern Phoenician cluster on that screenshot you took, so I’m curious where those apparently Aegean individuals more specifically come from. Maybe they are part of just a few migrant families marrying each other over generations due to ethnic/linguistic/cultural ties. Would be interesting if we had similar results with say Empuries samples to see what the situation is in an actual Greek colony of some size in the general Iberian area as well.

  33. @Forgetful, as I read Skourtanioti’s abstract, it seemed to be saying both that Aegeans in general – which I’d take to include Mainland Greece too – had more than is typical (“Bronze Age Aegeans exhibited endogamy in high frequencies so far unobserved in the rest of the ancient West Eurasia.”), and that the islands had more than the mainland (“These close-kin marital practices, likely equivalent to first-cousins unions, were substantially higher in Crete and other Aegean islands than in Mainland Greece”, suggesting they found some in Greece?). However, if they are restricting their description of “Bronze Age Aegeans” to the islanders only, then that’s maybe not the case. I can’t really tell from the abstract – a reading where it may just be insular inbreeding and not anything general to Greece at all is probably pretty sensible too, looking at it another way.

Comments are closed.