Update: Here is a post that you must read, A note on the early expansions of the Indo-Europeans. The post dates to the middle of December, and is similar in many ways to my own thoughts. But, the author rejects a two wave model where the first wave has a deep time history, and seems to give the balance of opinion that agriculture is predominantly indigenous in development to South Asia, and not primarily an exogenous event. Rather, they suggest that there were multiple waves of Indo-Aryans into South Asia, with the steppe cultures being parallel pulses from the Indo-Iranian ur-heimat. The primary criticism of the genetic interpretation that I would make is that, from what I understand, LD decay methods seem to catch the last admixture event and/or underestimate time since initiation of mixture. Therefore, though I accept a substantial mixture event ~4,000 years before the present, my own model presented below suggests that older ones occurred thousands of years earlier.
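To make the LD decay concern concrete, here is a minimal sketch, not the actual rolloff implementation. It assumes that admixture LD from a single pulse decays as exp(-n·d), where n is generations since mixture and d is genetic distance in Morgans, and it builds a curve from two pulses at roughly 9,000 and 4,000 years ago. The 29-year generation time, the equal weighting of the two pulses, and the 0.5 to 20 cM fitting range are all illustrative assumptions, as are the function names. Fitting a single exponential to the two-pulse curve yields a date younger than the older pulse.

```python
import math

def fit_single_pulse(distances_morgans, ld_values):
    """Least-squares fit of log(LD) against distance; returns -slope,
    which under a one-pulse model is the number of generations."""
    xs = distances_morgans
    ys = [math.log(v) for v in ld_values]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    return -cov / var

def two_pulse_curve(d, gens_old, gens_recent, weight_old=0.5):
    """Idealized LD curve: a weighted sum of two exponential decays."""
    return (weight_old * math.exp(-gens_old * d)
            + (1 - weight_old) * math.exp(-gens_recent * d))

YEARS_PER_GEN = 29                   # assumed generation time
gens_old = 9000 / YEARS_PER_GEN      # hypothetical first pulse
gens_recent = 4000 / YEARS_PER_GEN   # hypothetical second pulse

# Genetic distances from 0.5 cM to 20 cM, expressed in Morgans.
distances = [0.005 + 0.001 * i for i in range(196)]
curve = [two_pulse_curve(d, gens_old, gens_recent) for d in distances]

estimate = fit_single_pulse(distances, curve)
print(f"single-pulse estimate: ~{estimate * YEARS_PER_GEN:.0f} years")
```

The older pulse decays faster with distance, so it contributes little signal over most of the fitted range. This is one way a single-date estimate can understate when mixture began.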

That being said, I have updated my own views to being rather uncertain at this point. I would not be surprised if on the whole a model like the one proposed in the blog post is closer to the truth than the one below. My reasoning has less to do with the details of the argumentation, and more to do with authority.

1) the individual who wrote the above post has comparable mastery of the historical genetic descriptive results.

2) but, the individual has far superior understanding of the archaeology and philology in comparison to me.

Ignoring the details of any argument, on a priori grounds I find that the individual above could give a better appraisal of the probabilities in regards to South Asian archaeogenetics than I could. The main thing that is holding me back from suggesting that I now find their model more probable than mine is the issue in regards to LD and rolloff methods. But I’ve definitely increased my uncertainty, from ~25% to ~50%, with the balance split between the two models (or some combinations thereof).

End Update

Sometimes you see things in fragments, disparate threads, which only snap into focus in hindsight. In this post I will hazard a prediction of results which are going to come out of remains from Indus valley sites in South Asia, which will confirm that there were two major demographic pulses which entered the subcontinent from the Northwest over the past 10,000 years. The first wave was the dominant one in comparison to the second genetically, and began at Mehrgarh 9,000 years ago. Its locus of origin was in the highlands of Western Asia, between the Caucasus and the Fertile Crescent. The second wave though left its mark culturally, as it is associated with Indo-Aryans, and likely derives ultimately from the trans-Volga steppe societies. The genetic signatures of the former people are found in nearly every indigenous South Asian group, as they amalgamated with a deeply entrenched local group of peoples who were distantly related to those of Oceania and eastern Eurasia. In short, the latter are the “Ancestral South Indians” (ASI) and the former are the “Ancestral North Indians” (ANI, see Reconstructing Indian Population History).

The figure above is from Upper Palaeolithic genomes reveal deep roots of modern Eurasians (open access), which found that ancient DNA from two samples in the northern Caucasus region is representative of a population which contributed to the origins of the steppe people who swept into Northern Europe ~4,500 years ago. It shows how contemporary populations are best modeled as admixture events between reference populations. What you see is that most South Asian groups are well modeled as a mixture between “Caucasian hunter-gatherers” (CHG), and another element which is labeled “South Asian” because it is mostly restricted to the subcontinent. But wait there’s more! In the supporting materials the statistics show that though most South Asian groups have more potential mixture from the high quality CHG sequence, Kotias, a subset, unspecified Gujarati groups and Tiwaris, share more drift with the Afanasevo culture, which flourished in the Altai region of Central Asia between 5,500 and 4,500 years ago. We have enough ancient DNA to infer that the Afanasevo were basically the same people as the Yamna culture, who were present between the Volga and Dnieper, far to the west. The Tiwari are an upper caste group which is present across Northern India. The second wave component is clearly strongest in the Northwest, as indicated by the Kalash sharing so much drift with Mal’ta. Before subsequent waves of gene flow into the steppe people, which brought dollops of European farmer and hunter-gatherer ancestry into the mix, they had a higher fraction of Ancestral North Eurasian (ANE) than any contemporary Northern European population. Their contribution to South Asian groups on the Northwest fringe of the subcontinent explains then the presence of high fractions of ANE there.

A final aspect which needs to be mentioned is that the Z93 subclade of R1a1a is found across much of South Asia. Though it is correlated with higher caste, and Indo-Aryan speaking, populations, it is not exclusive to them. In fact it is found in substantial fractions among notionally primal tribal people in South India who traditionally practice primitive slash and burn agriculture and engage in extensive hunting and gathering. Ancient DNA results from the Srubna culture of Central Eurasia have yielded Z93 among buried males. This subclade is rather rare in this region today, and, it succeeded groups which were carrying R1b, today dominant across Western Europe. The details are to be worked out, but, I believe that they are associated with, but more expansive than, the Indo-Aryans. Beyond the limits of the folk migrations were outrider groups of males who integrated themselves into indigenous societies, often taking elite positions as members of a dominant patrilineage. If there was a strong bias for male descendants of a small number of these individuals, but not female ones, to have higher reproductive fitness, then over time their Y chromosomes might be far more common than their total genome contribution (to illustrate what I’m talking about, a recent paper on Australian Aboriginals reports that 56% of their Y chromosomes introgressed over the past 200 years from Europeans!).
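The patrilineage argument can be sketched with a toy deterministic model. Everything here is an illustrative assumption rather than an estimate from data: the starting frequency of intrusive males (2%), the per-generation reproductive advantage passed from father to son (10%), random mating with respect to mothers, and the function and parameter names. The model tracks the intrusive Y chromosome frequency alongside the population-wide autosomal ancestry from the same founders.

```python
def simulate_patriline(p0=0.02, s=0.10, generations=50):
    """Return (Y frequency, autosomal ancestry fraction) for an elite
    patriline whose males father (1 + s) times as many children."""
    p = p0                      # frequency of the intrusive Y among males
    a_elite = 1.0               # intrusive autosomal ancestry, elite-line males
    a_other = 0.0               # same, for all other males
    a_female = 0.0              # same, for females (the founders are all male)
    for _ in range(generations):
        # Share of children fathered by elite-line males this generation.
        q = p * (1 + s) / (1 + p * s)
        # Daughters draw fathers from both male classes in proportion to paternity.
        father_mean = q * a_elite + (1 - q) * a_other
        next_female = (father_mean + a_female) / 2
        # Sons inherit the father's Y and status, but half their autosomes
        # come from a randomly chosen mother.
        a_elite = (a_elite + a_female) / 2
        a_other = (a_other + a_female) / 2
        a_female = next_female
        p = q
    male_mean = p * a_elite + (1 - p) * a_other
    autosomal = (male_mean + a_female) / 2
    return p, autosomal

y_freq, autosomal = simulate_patriline()
print(f"Y frequency: {y_freq:.3f}, autosomal ancestry: {autosomal:.3f}")
```

The advantage rides on the Y chromosome every generation, while the founders' autosomes are halved at each mating. In this sketch the Y lineage can therefore become common even as the genome-wide contribution stays small. Setting `s=0` leaves both quantities at their starting values.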

Bringing it together, one implication of the above is that the Dravidian languages of the Indian subcontinent were probably brought by the West Asian farmers (perhaps confirming an ancient link to Elamite?). Therefore, the language(s) of the Indus valley civilization was probably a form of Dravidian. Another aspect to consider is that no South Asian population lacks the genetic imprint of these West Asian farmers. It seems likely that as in Europe the farmer populations which entered the subcontinent via the northwest totally marginalized most of the hunter-gatherer groups, which were numerically less substantial in any case. But, why do all South Asian groups also exhibit ASI ancestry, which is deeply rooted in the subcontinent? Just as in Europe the initial populations of farmers on the fringes of the subcontinent mixed with the local hunter-gatherers, producing a synthetic population which over time evolved its cultural toolkit to become better adapted to South Asian geographies. Once the crucial cultural adaptations occurred then the synthetic population underwent a phase of massive demographic expansion beyond its delimited ghetto on the fringes, where West Asian climatic parameters allowed for the initial phase of near total cultural transplantation. As in Europe the expanding South Asian farmer groups absorbed hunter-gatherer substrate, accruing greater and greater ASI fractions on the wave of demographic advance, and so generating the ANI-ASI cline evident in genetic analyses. The presence of ASI in groups like the Pashtuns in Afghanistan is probably due to the fact that the synthetic populations, what we now term “South Asians” or “Indians” or “desis”, exhibited enough cultural hegemony and influence to reach deep into the plateau of modern Afghanistan and impacted both the pre-Iranic and East Iranic people of Afghanistan (also, note that Indians were very common as slaves in the cities of Afghanistan during the early Islamic period).

The reason I took time to put this post up now is that it looks like the publication of ancient South Asian genomes from the Indus valley period is imminent. From The Guardian on December 30th, Rakhigarhi: Indian town could unlock mystery of Indus civilisation:

One has stood out: who exactly were the people of the Indus civilisation? A response may come within weeks.

“Our research will most definitely provide an answer. This will be a major breakthrough. I am very excited,” said Vasant Shinde, an Indian archaeologist leading current excavations at Rakhigarhi, which was discovered in 1965.

Shinde’s conclusions will be published in the new year. They are based on DNA sequences derived from four skeletons – of two men, a woman and a child – excavated eight months ago and checked against DNA data from tens of thousands of people from all across the subcontinent, central Asia and Iran.

They looked somewhat like a recent Miss America!
I predict that the Y chromosomal haplogroups will be H or J2. Both these are common in Dravidian speaking groups of Southern India, and, are found at some fractions in West Asia. I predict that these individuals will share gene flow with Kotias, and not with Central Eurasian groups. I predict that these individuals will not be enriched for ANE ancestry. I predict these individuals will have mtDNA lineages present in modern Indian populations, probably M. Though excavated in a region of South Asia where today lactase persistence (LP) is common, none of the individuals will carry the common derived Eurasian haplotype conferring LP. They will segregate for the derived variant of SLC24A5. On a PCA plot these individuals will cluster with non-Brahmin upper/middle caste South Indian populations, such as the Reddys of Andhra Pradesh.

Note: I’ve been told by friends for two years and more that there are efforts to sequence and type Indus valley individuals. But I have no inside information. If you are an individual in the media who has early access feel free to send me a PDF with the understanding that I will honor the embargo! (if you don’t send me the PDF I’m mildly confident I’ve already hit the major themes you are safeguarding)

