About 36% of the world’s population are citizens of the People’s Republic of China and the Republic of India. Including the other nations of South Asia (Pakistan, Bangladesh, etc.), 43% of the world’s population lives in China or South Asia.
But, as David Reich mentions in Who We Are and How We Got Here, China is dominated by one ethnicity, the Han, while India is a constellation of ethnicities. And this is reflected in the genetics. The relative diversity of India stands in contrast to the homogeneity of China.
At the current time, the best research on population genetic variation within China is probably the preprint A comprehensive map of genetic variation in the world’s largest ethnic group – Han Chinese. The authors used low-coverage sequencing of over 10,000 women to get a huge sample of variation from all across China. The PCA recapitulated earlier work. Genetic relatedness among the Han of China is geographically structured. The largest component of variance is north-south, but a smaller component is also east-west. The north-south element explains more than 4.5 times as much variance as the east-west.
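To make the variance comparison concrete, here is a minimal sketch of how the share of variance per principal component can be computed from a genotype matrix. It is not the preprint's pipeline: the toy data, the gradient strengths, and the function name `pc_variance_ratios` are all my own assumptions for illustration.

```python
import numpy as np

def pc_variance_ratios(genotypes):
    """Fraction of total variance explained by each principal component.

    genotypes: (individuals x variants) array of allele counts (0, 1, 2).
    """
    X = np.asarray(genotypes, dtype=float)
    X = X - X.mean(axis=0)                    # center each variant
    s = np.linalg.svd(X, compute_uv=False)    # singular values, largest first
    var = s ** 2                              # proportional to PC variances
    return var / var.sum()

# Hypothetical toy data: allele frequencies shift strongly with "latitude"
# and weakly with "longitude". These numbers are invented, not from the preprint.
rng = np.random.default_rng(0)
n_individuals, n_variants = 200, 500
lat = rng.uniform(0, 1, n_individuals)
lon = rng.uniform(0, 1, n_individuals)
base = rng.uniform(0.2, 0.8, n_variants)
ns_effect = rng.normal(0, 0.15, n_variants)   # assumed north-south gradient
ew_effect = rng.normal(0, 0.05, n_variants)   # assumed weaker east-west gradient
freq = np.clip(
    base + np.outer(lat - 0.5, ns_effect) + np.outer(lon - 0.5, ew_effect),
    0.01, 0.99,
)
geno = rng.binomial(2, freq)

ratios = pc_variance_ratios(geno)
print("PC1 / PC2 variance ratio:", ratios[0] / ratios[1])
```

A figure like "4.5 times" is the kind of number the last line prints: the variance along the first axis divided by the variance along the second.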
The abstract is long, but I’ll reproduce it in full:
The genetic formation of Central and South Asian populations has been unclear because of an absence of ancient DNA. To address this gap, we generated genome-wide data from 362 ancient individuals, including the first from eastern Iran, Turan (Uzbekistan, Turkmenistan, and Tajikistan), Bronze Age Kazakhstan, and South Asia. Our data reveal a complex set of genetic sources that ultimately combined to form the ancestry of South Asians today. We document a southward spread of genetic ancestry from the Eurasian Steppe, correlating with the archaeologically known expansion of pastoralist sites from the Steppe to Turan in the Middle Bronze Age (2300-1500 BCE). These Steppe communities mixed genetically with peoples of the Bactria Margiana Archaeological Complex (BMAC) whom they encountered in Turan (primarily descendants of earlier agriculturalists of Iran), but there is no evidence that the main BMAC population contributed genetically to later South Asians. Instead, Steppe communities integrated farther south throughout the 2nd millennium BCE, and we show that they mixed with a more southern population that we document at multiple sites as outlier individuals exhibiting a distinctive mixture of ancestry related to Iranian agriculturalists and South Asian hunter-gatherers. We call this group Indus Periphery because they were found at sites in cultural contact with the Indus Valley Civilization (IVC) and along its northern fringe, and also because they were genetically similar to post-IVC groups in the Swat Valley of Pakistan.
By co-analyzing ancient DNA and genomic data from diverse present-day South Asians, we show that Indus Periphery-related people are the single most important source of ancestry in South Asia — consistent with the idea that the Indus Periphery individuals are providing us with the first direct look at the ancestry of peoples of the IVC — and we develop a model for the formation of present-day South Asians in terms of the temporally and geographically proximate sources of Indus Periphery-related, Steppe, and local South Asian hunter-gatherer-related ancestry. Our results show how ancestry from the Steppe genetically linked Europe and South Asia in the Bronze Age, and identifies the populations that almost certainly were responsible for spreading Indo-European languages across much of Eurasia.
Though the abstract is focused on South Asia, the preprint actually has quite a bit about Inner Asia, because of the provenance of the samples. We often view the typical person in the past as a peasant in an agricultural society, and therefore relatively immobile over their lifetime. The story we like to tell ourselves is that non-elites in premodern societies, on the whole, had narrow horizons, delimited by their home village, or the neighboring network of villages.
But results from this work and others show that mobile populations, in which individuals ranged across vast areas of Eurasia over their lifetimes, were not that uncommon among pastoralists. We know this historically, as empires such as those of the Turks and Mongols were defined by a ruling elite whose writ extended from eastern to western Eurasia. The Sintashta samples exhibit genetic heterogeneity, with some individuals very different from the norm in their settlement, which is exactly what you’d expect from a social and political culture which was united in some fashion over huge distances.
As the sample sizes for ancient DNA have increased it seems rather clear that the demographic dynamics that we see in later historical expansions of Inner Asian polities extend back to the Bronze Age. With expanding populations across the ecologically friendly landscape, the ancient proto-Indo-Europeans seem to have mixed with the local substrate wherever they went, just as Turks did later. As they moved west, they mixed with late Neolithic Europeans; as they went east, they mixed with Siberian populations; and as they conquered south they mixed with descendants of West Asian farmers.
One thing that I think one needs to keep in mind is that this can’t be understood as simple diffusion dynamics. Historically the boundary between pastoralists and peasants could be fluid, but when political resistance collapses pastoralists have been able to use their military prowess to swarm across the lands of agriculturalists. In other words, centuries of gradual inter-demic gene flow might be interrupted by a rapid “pulse” admixture. There’s no reason that pre-literate polities couldn’t exist. The Inca were one such example, and the homogeneity of the Uruk civilization in the 4th millennium BC is strongly suggestive of an imperial hegemony or paramountcy.
Another dynamic is that pastoralists are highly mobile, and so may leapfrog over territory which is unsuitable. Or, they may move so rapidly that there isn’t much mixing with populations in between point A and point B.
This is apparently the case with the Bactria–Margiana Archaeological Complex. These people were mostly descended from people related to the eastern farmers of West Asia, those in modern day Iran. Some of their ancestry had affinities with Anatolian farmers, and there is some evidence even of Siberian admixture in this region. But there are three important takehomes of this preprint in relation to this area: 1) the BMAC did not contribute much genetically to South Asia at all, 2) steppe ancestry, related to that of the Yamna culture of the Pontic region, only shows up in BMAC ~2000 BC, and 3) there is actually evidence of South Asian (Indus Valley?) migration into the BMAC.
The fact that Yamna-like ancestry shows up in the BMAC region so late is a strong reason to suspect that Indo-Iranian peoples did not move to Iran and India until after 2000 BC. In earlier comments on this issue, I was rather vague about timing, because the Corded-Ware people show up in Europe before 2500 BC, and I was going along with the parsimonious idea that this was part of one single cultural and social revolution.
I was wrong. Going back to the Turkic analogy, there were multiple waves of migration and folk wandering by Turkic pastoralists. By different Turkic groups. One of the major ones occurred due to the rise of the Mongols, and the Mongols were not even Turks. The same seems to be true of Inner Eurasian Indo-European groups.
Moving on to South Asia, there are two primary constructs which come out of this preprint: “Indus Periphery” and “Ancient Ancestral South Indians.” I’ll call the former InPe and the latter is termed AASI. To some extent these complement and replace the earlier terms “Ancestral North Indian” and “Ancestral South Indian” (ANI and ASI). The AASI are the ancient hunter-gatherers of the Indian subcontinent. The authors suggested that divergence of this group from other eastern Eurasians occurred very early, and that the division between the ancestors of the Papuans, Onge, and AASI was even polytomic (that is, they separated very quickly without discernible structure).
The InPe samples are from eastern Iran and the BMAC. They’re unique in having AASI ancestry, at variable fractions (indicating contemporaneous admixture). They also resemble samples from Swat Valley which date to 1200 BC and later, with one major difference: the Swat Valley samples have steppe ancestry.
There are no samples from the Indus Valley proper, so the authors suggest that the InPe are reasonable proxies. Additionally, they assert that ASI can best be modeled as a mixture between InPe and AASI. In other words, there were two admixture events. Their Pulliyar samples are actually pretty good proxies for the resultant ASI, while the Kalash of Pakistan are good proxies for the ANI, who are presumably now modeled as a mixture of steppe populations with the InPe.
This resolves the enigmatic result that Priya Moorjani reported to me last year: less than 4,000 years ago “pure” ANI and ASI people existed. She was presumably going off admixture timing estimates. These results suggest that in some form ANI and ASI still exist, and the first admixture occurred with the creation of InPe.
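For readers wondering how an admixture date falls out of genetic data, one common approach fits the exponential decay of admixture-induced linkage disequilibrium against genetic distance; under a single-pulse model the decay rate is the number of generations since mixture. The sketch below illustrates that idea on invented numbers. It is not the method used in the preprint or in any specific paper, and the function name, the synthetic curve, and the 28 years per generation are all assumptions of mine.

```python
import numpy as np

def generations_from_ld_decay(dist_morgans, weighted_ld):
    """Estimate generations since a single admixture pulse.

    Fits weighted_ld ~ A * exp(-g * d) by a straight-line fit on log(LD),
    where d is genetic distance in Morgans. Requires strictly positive LD values.
    """
    d = np.asarray(dist_morgans, dtype=float)
    y = np.asarray(weighted_ld, dtype=float)
    slope, _intercept = np.polyfit(d, np.log(y), 1)
    return -slope

# Synthetic, noise-free decay curve (hypothetical values, for illustration only).
true_generations = 100
d = np.linspace(0.005, 0.2, 40)                 # 0.5 cM to 20 cM
ld = 0.02 * np.exp(-true_generations * d)

est_generations = generations_from_ld_decay(d, ld)
YEARS_PER_GENERATION = 28                        # assumed; not a value from this post
print("generations:", est_generations)
print("years before sampling:", est_generations * YEARS_PER_GENERATION)
```

Real data are noisy and populations rarely mix in one clean pulse, so a date like this is better read as a weighted average of mixing times than as a single event.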
Using a new method the authors contend that InPe emerged 4700-3000 BC. If this is true then the Indus Valley Civilization (IVC) was a compound of AASI and Iranian agriculturalists (sampled from the eastern end of the cline of admixture with Anatolians, that is, they had none of that ancestry). These dates also postdate the first arrival of agriculture at Mehrgarh by 2,000 years at the least. I suspect that it will turn out there were earlier admixtures, which are not being detected. For various ecological reasons the West Asian cultural complex was portable only to the northwest fringe of South Asia, and there it persisted for ~4,000 years. This served as a natural eastern limit for cultures which were migrating out of the West Asian zone, and a point where AASI hunter-gatherers constantly mixed into the local population.
As the IVC sites begin to get sampled in the future I predict that instead of a homogeneous transect of admixture over time and space we’ll see a lot of heterogeneity.
In the Swat samples, the authors see two correlated trends, an increase in steppe ancestry, and an increase in AASI ancestry. No doubt this dates to the “great admixture” which occurred between 2000 BC, and some time before 1000 AD (the Bengali admixture with East Asians dates to between 0 and 1000 AD, as does that of Brahmins who left the North Indian plain and mixed with local populations elsewhere).
Finally, the authors detect a skew toward steppe ancestry among some populations, in particular, Brahmins. The skew is in relation to Iranian farmer ancestry, the two being the primary constituents of ANI ancestry. In Who We Are and How We Got Here David Reich says some of the ANI admixture is much more recent than the rest, judging by tract length. And also going by the BMAC and Swat samples it seems that the time period for when Indo-Aryans arrived in South Asia has to be in the interval between 2000 BC and 1200 BC.
There’s another aspect of the preprint which allows for dating. The arrival of Austro-Asiatic people in South Asia probably has to postdate the expansion of the same group in Vietnam about 4,000 years ago (though not necessarily obviously). But the Munda Austro-Asiatic people of northeast India exhibit curious genetic patterns. They clearly have East Asian ancestry related to other Austro-Asiatic populations in Southeast Asia, but they have a lot less “West Eurasian” in their ANI/ASI mix. The authors resolve this by suggesting that the Munda arrived in South Asia when there was still heterogeneity among the ASI, and unadmixed AASI.
After 2000 BC the IVC went into decline. Various groups of Indo-Aryans were expanding and admixing. From the other end of the subcontinent arrived rice cultivators from Southeast Asia. At some point, they ran into an ASI population that had some Iranian admixture, but not as much as typical. All of this probably occurred in the period between 2000 BC and 1000 BC. I know that some researchers have argued that the Gangetic plain was inhabited by Munda speaking peoples before it was inhabited by Indo-Aryans. The main issue I’ve had with this is that modern Munda peoples are very genetically distinctive, and there’s no evidence of East Asian ancestry in most populations of the Gangetic plain (the main exceptions are those which have experienced Tibetan influence/contact).
So here is my interpretation of the genetic and historical evidence:
1) IVC emerges out of a matrix that was a synthesis of West Asian farmers and indigenous hunter-gatherers. I would not be surprised if later genetic work recapitulates the findings in Europe of an initial period of separation, and then a “resurgence” of indigenous ancestry as the barriers between the two groups break.
2) The period between 2000 BC and 1000 BC is the beginning of the transformation of the South Asian genetic and ethnolinguistic landscape, with the intrusion of two different groups from different directions, Indo-Aryans from the west and Austro-Asiatics from the east. Austro-Asiatic rice culture was superior to western wheat culture because rice is more delicious than wheat, but the Indo-Aryans ultimately established cultural supremacy across South Asia by the Iron Age.
3) The situation in South India is more complicated and confused. The admixture of groups like Pulliyar from InPe and AASI into the classic ASI configuration seems to be more recent than 2000 BC (their low bound dates go as late as 400 BC). The admixture may have occurred in various places, not just in South India. The evidence from this paper suggests that the Andronovo/Sintashta cultural zone was characterized by some genetic heterogeneity due to variation in admixture with neighboring peoples, and the same could be said for the IVC then. I would not be surprised if northern IVC locations had more AASI than southern IVC, as the latter were more insulated from the east due to the Thar desert (the results are consistent with earlier work that suggests modern populations in the lower Indus basin have less Indo-Aryan and more Iranian, with less AASI).
4) We need to be careful about assuming that everything here is a linear combination of distinct and separable atomic units of cultural integrity and wholeness. What I mean is that though Brahmins and some other North Indian groups are enriched for steppe ancestry, it is not only their purview. Rather, it may be that these upper caste groups simply mixed less with the other populations with Iranian and AASI ancestry. The statistics in this paper do not detect enrichment of steppe ancestry in South Indian Brahmins. I believe this is simply an artifact of the reality that South Indian Brahmins mixed with Iranian-enriched elites, like Reddys, when they emigrated to the south.
Though the model outlined in the preprint is much more complicated than a simple ANI/ASI mix, it still simplifies the demographic histories of many populations. For example, my own survey of the data suggests that Brahmins who left the Indo-Gangetic plain mixed with local elites wherever they went (Bengali Brahmins have East Asian ancestry, just as South Indian Brahmins have more Iranian-like ancestry).
5) Language is important but is not determinative. R1a1a-Z93 arrived in South Asia relatively late with groups from the steppe. Its frequency is highest in the northwest, and among upper castes. That is, it is correlated in a coarse manner to steppe ancestry. But R1a1a-Z93 is pervasive throughout South Asia irrespective of caste and region. Even in Dravidian speaking southern populations, some groups have quite a bit of R1a1a-Z93.
The analogy that presents itself here is Southern Europe, where some groups with high frequencies of R1b, such as the Basques and Sardinians, are clearly descended in the main from pre-steppe populations. What this suggests is that a broad social-cultural prestige network mediated by males extended itself into regions where its cultural hegemony was not assured. Additionally, the autosomal genetic impact was modest, even if privileges given to particular male lineages allowed them to sweep other groups out of the gene pool.
Tamil history precipitates out only a little later than that of North Indian Indo-Aryan civilization. I suspect that this is not a coincidence, and that South Asia after the collapse of the IVC and the arrival of the Indo-Aryans and Mundas could be thought of as a broad mixing cauldron, genetically and culturally. In many regions, Dravidian languages persisted in the face of the expansive Indo-Aryan, but there was a cultural influence, likely reciprocal. This is why, once Indian civilization reemerged, its coherent unity set against peoples to the west and east was not strange despite the linguistic gap between the north and the south.
The only exception here might be the Munda. As I have said, R1a1a-Z93 is pervasive. But it is nearly absent among the Munda, who tend to carry relatively exotic Southeast Asian Y lineages such as O. I believe that the Munda were in some way losers in a cultural conflict, but they maintained themselves in the hills above the Gangetic plain.
Finally, two reflections, one navel-gazing, one big picture. Genome bloggers in the years around 2010 actually anticipated many of these results. There’s some hindsight bias here because you remember the times you are right and not the times you were wrong. We were right that there was more than one ANI pulse. Additionally, we were looking at the ratio between “Eastern European” and “West Asian” ancestry years ago and noticing the skewed patterns, with North Indian Brahmins biased toward the former and South Indian elite non-Brahmins skewed toward the latter. Chaubey 2010 suggested to us that something was different about the Munda not only in their East Asian ancestry but in their ANI/ASI ancestry. They just didn’t seem to have any Indo-European ancestry (steppe), and a lot of ASI. Over the past few years I’ve been suggesting that Dravidian languages were not primal to South India, but the product of a recent expansion (though part of this is due to scientific publications).
The truth was out there. It just took ancient DNA and the analytic chops of the Reich group and their collaborators to prune the tree of possibilities so that we could zero in on a few precise and likely models.
In general, I wonder about the role of clines, diffusions, and pulses. The models that the foremost practitioners of the science of ancient DNA utilize tend to assume pulse admixtures, rather than isolation-by-distance gene flow. This isn’t always a crazy assumption. But there was a discussion in the paper of a west-east admixture cline between Anatolian farmers and Iranian farmers. Is this cline due to admixture, or was it always there? A paper from a few years ago implied that early farmers were highly structured, structure that broke down later.
Also, the polytomy at the base of the eastern Eurasian human family tree, where all the major lineages diverge rapidly from each other, makes me wonder about gene flow vs. admixture. It seems possible that the polytomy may mask a phylogenetic tree topology which had gradually bifurcating nodes, if periodically a single daughter population replaced all its sister lineages in a local geographic zone. Much of history in human meta-populations may be characterized by isolation-by-distance and gene flow, erased by the extinction of most lineages and expansion of a favored lineage.
The relationship between China and India is clearly one-sided: India is obsessed with a China which is approaching lift-off toward becoming a developed nation within a generation (certain urban areas are already basically developed, albeit not particularly wealthy in comparison to Hong Kong or Singapore).
Often when I see interviews with regular Chinese about their opinions of the other country the fixation is upon the manifest Third World nature of India, which seems to be changing much more slowly than their own nation. For me GDP is less important than vital statistics like child mortality or life expectancy. And it is in these sorts of statistics where you see the gap opening up between the two nations. India is developing…. but China is leading, and converging faster with developed nations.
While Amazon.com has sellers hailing from many countries, Mr. Cheris said that India and China are the two most important places for Amazon to recruit new merchants since both nations are sources of cheap manufactured goods.
Unlike China, where local companies dominate e-commerce, India is also a huge domestic market for Amazon. Although most of India’s commerce is conducted offline, Indians are coming onto the internet at a rapid clip through their smartphones. Amazon’s chief executive, Jeff Bezos, views India and its 1.3 billion residents as vital to his company’s future, and he has vowed to spend at least $5 billion building up his India operations.
I was aware that Amazon really hadn’t gotten any traction in the Chinese market. I did not know that Amazon was so competitive in India, though Flipkart is still dominant there.
The story outlined seems to be part of a bigger trend whereby India is on a very different path from China in its relationship to the rest of the world. China’s economy is big enough and insular enough that it sees the world as either an export market or a source of commodities. It is quickly taking back its place of old as a lumbering hegemon. India, in contrast, seems to be developing a more integrative relationship with large economies such as the United States, despite its command and regulatory economy legacy.
Of course, the India-USA relationship is nothing like “Chimerica” in terms of magnitude, but the Sino-American relationship strikes me as very transactional. Despite the recent tendency of Indian society to espouse a stronger Hindu nationalist line, which is at odds with the West, it seems that there is more cultural exchange between elite Indians and Western societies in the deep sense of values, than has occurred with the Chinese and the West. And, yoga and aspects of spirituality notwithstanding, most of the cultural exchange seems to be toward cosmopolitan elite Indians assimilating to global values which draw from the mode of the West.
Ultimately all of this seems to have geopolitical implications. I’m assuming smarter people than me are keeping track of these trends….
Is India’s caste system the remnant of ancient India’s social practices or the result of the historical relationship between India and British colonial rule? Dirks (history and anthropology, Columbia Univ.) elects to support the latter view. Adhering to the school of Orientalist thought promulgated by Edward Said and Bernard Cohn, Dirks argues that British colonial control of India for 200 years pivoted on its manipulation of the caste system. He hypothesizes that caste was used to organize India’s diverse social groups for the benefit of British control. His thesis embraces substantial and powerfully argued evidence. It suffers, however, from its restricted focus to mainly southern India and its near polemic and obsessive assertions. Authors with differing views on India’s ethnology suffer near-peremptory dismissal. Nevertheless, this groundbreaking work of interpretation demands a careful scholarly reading and response.
The condensation is too reductive. Dirks does not assert that caste structures (and jati) date to the British period, but the thrust of the book clearly leaves the impression that this particular identity’s formative shape on the modern landscape derives from the colonial experience. The British did not invent caste, but the modern relevance seems to date to the British period.
This is in keeping with a mode of thought flourishing today under the rubric of postcolonialism, with roots back to Edward Said’s Orientalism. As a scholar of literature Said’s historical analysis suffered from the lack of deep knowledge. A cursory reading of Orientalism picks up all sorts of errors of fact. But compared to his heirs Said was actually a paragon of analytical rigor. I say this after reading some contemporary postcolonial works, and going back and re-reading Orientalism.
To not put too fine a point on it, postcolonialism is more about a rhetorical posture which aims to destroy what it perceives as Western hegemonic culture. In the process it transforms the modern West into the causal root of almost all social and cultural phenomena, especially those that are not egalitarian. Anyone with a casual grasp of world history can see this, which basically means very few can, since so few actually care about details of fact.
Castes of Mind is an interesting book, and a denser piece of scholarship than Orientalism. Its perspective is clear, and though it is not without qualification, many people read it to mean that caste was socially constructed by the British.
This seems false. It has become quite evident that even the classical varna categories seem to correlate with genome-wide patterns of relatedness. And the Indian jatis have been endogamous for on the order of two thousand years. From The New York Times, In South Asian Social Castes, a Living Lab for Genetic Disease:
The Vysya may have other medical predispositions that have yet to be characterized — as may hundreds of other subpopulations across South Asia, according to a study published in Nature Genetics on Monday. The researchers suspect that many such medical conditions are related to how these groups have stayed genetically separate while living side by side for thousands of years.
Unfortunately, though, the science is not known in any depth among the general public. The ascendency of social constructionism is such that a garbled and debased view that “caste was invented by the British” will continue to be the “smart” and fashionable view among many intellectual elites.
Over at Brown Pundits I’ve mentioned the continuing simmer of controversy over a recent piece, How genetics is settling the Aryan migration debate. This has prompted responses in the Indian media from a Hindu nationalist perspective. One of these notes that the author of the piece above cites me, and then goes on to observe I was fired from The New York Times a few years ago due to accusations of racism (also, there is the implication that I’m just a blogger and we should trust researchers with credibility like Gyaneshwer Chaubey; well, perhaps he should know that Gyaneshwer Chaubey considers me “unbiased” according to an email exchange which I had with him last week [we all have biases, so I think he’s wrong in a literal sense]).
I was a little surprised that a right-wing magazine would lend legitimacy to the slanders of social justice warriors, but this is the world we live in. Those of you who believe everything written about me in the media, I invite you to submit your name and background to me. I have contacts in the media and can get things written if I so choose. Watch me write something which is mostly fact, but can easily be misinterpreted by those who Google you, and watch how much you value the objective “truth-telling” power of the press all of a sudden.
There’s a reason so many of us detest vast swaths of the media, though to be fair we the public give people who don’t make much money a great deal of power to engage in propaganda. Should we be surprised they sensationalize and misrepresent with no guilt or shame? I have seen most of those who snipe at me in the comments disappear once I tell them that I know what their real identity is. Most humans are cowards. I have put some evidence into the public record to suggest that I’m not.
Perhaps more strange for me is that the above piece was passed around favorably by Sanjeev Sanyal, who I was on friendly terms with (we had dinner & drinks in Brooklyn a few years back). I asked him about the slander in the piece and he unfollowed me on Twitter (a friend of Hindu nationalist bent asked Sanjeev on Facebook about the article’s attack on me, but the comment was deleted). It shows how strongly people feel about these issues.
I’m in a weird position because I’m brown and have a deep interest in Indian history. But that interest in Indian history isn’t because I’m brown, I’m pretty interested in all the major zones of the Old World Oikoumene. Aside from some jocular R1a1a chauvinism I don’t have much investment personally (I just told said Hindu nationalist friend who turns out to be R2 to clean my latrine; joking of course, though I’m sure he resents that I’m descended on the direct paternal line from the All-Father & Lord of the Steppes and he is not!).
In the aughts I accepted the model outlined in 2006’s The Genetic Heritage of the Earliest Settlers Persists Both in Indian Tribal and Caste Populations. But to be frank it always struck me as a little confusing because the tentative autosomal data we had suggested that many South Asians were closer to West Eurasians than deep divergences dating to the Last Glacial Maximum would suggest. Since I’ve written something like 5 million words in 15 years, I actually can check if I’m remembering correctly. So here’s a post from 2008 where I express reservations about the idea of long term deep heritage of Indians separate from other West Eurasians. The reason I was so impressed by 2009’s Reconstructing Indian Population History is that it resolved the paradox of South Asian genetic relatedness.
To recap, Reich et al. proposed that modern Indians (South Asians) could be modeled as a two way mixture between two distinct populations with separate evolutionary genetic histories, Ancestral North Indians and Ancestral South Indians (ANI and ASI). How distinct? ANI were basically another West Eurasian population, while ASI was likely nested in the clade with Eastern Non-Africans. Additionally, there was a NW-to-SE and caste admixture cline. In other words, the higher you were on the caste ladder the more ANI you had, and the closer your ancestors were to the north and west, the more ANI you had. The difference between Y and mtDNA, male and female, could be explained by sex-biased migration.
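The core of a two way mixture model can be sketched in a few lines: if a population is a fraction alpha of one source and (1 - alpha) of another, its allele frequencies should be the same weighted average of the source frequencies. The toy below recovers alpha by least squares. It assumes you can observe both source populations directly, which was exactly what was not possible for ANI and ASI (the actual paper worked around that with "ghost" populations), so treat this as an illustration of the idea only. The function name and all the numbers are invented.

```python
import numpy as np

def mixture_fraction(p_target, p_source_a, p_source_b):
    """Least-squares estimate of alpha in p_target = alpha*p_a + (1-alpha)*p_b.

    Inputs are per-variant allele frequencies. Assumes both sources are
    directly observed, a simplification relative to real ANI/ASI modeling.
    """
    t = np.asarray(p_target, dtype=float)
    a = np.asarray(p_source_a, dtype=float)
    b = np.asarray(p_source_b, dtype=float)
    diff = a - b
    alpha = np.dot(t - b, diff) / np.dot(diff, diff)
    return float(np.clip(alpha, 0.0, 1.0))

# Hypothetical allele frequencies, for illustration only.
rng = np.random.default_rng(1)
p_ani = rng.uniform(0.05, 0.95, 1000)
p_asi = rng.uniform(0.05, 0.95, 1000)
p_mixed = 0.7 * p_ani + 0.3 * p_asi      # a population that is 70% "ANI"

alpha = mixture_fraction(p_mixed, p_ani, p_asi)
print("estimated ANI fraction:", alpha)
```

The admixture cline in the text then amounts to alpha varying systematically with geography and caste.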
But there were still aspects of the paper which I had reservations about. After all, it was a model.
Models are imperfect fits onto reality. The idea of mass migration seemed ridiculous to me at the time, because even by the time of the Classical Greeks it was noted that India was reputedly the most populous land in the world (to their knowledge). But ancient DNA has convinced me of the reality of mass migrations.
I wasn’t sure about the nature of the closest modern populations to the ANI. The researchers themselves (in particular, Nick Patterson) told me that the relatedness of ANI to Europeans was very close (on the order of intra-European differences). But modern Indians do not look to be descended from a population that is half Northern European physically. Again, ancient DNA has shown that there was lots of population turnover, and it turns out that Europeans and ANI were likely both compounds and mixed daughter populations of common ancestors (also, typical European physical appearance seems to have emerged in situ over the past 5,000 years).
The two way admixture model seemed too simple. I had run some data and it struck me that North Indian populations like Jats had something different than South Indian groups like Pulayars. In 2013 Priya Moorjani’s paper pretty much confirmed that it was more than a two way admixture along the ANI-ASI cline.
This March BMC Evolutionary Biology published Silva et al’s A genetic chronology for the Indian Subcontinent points to heavily sex-biased dispersals. It has made a huge splash in India, arguably triggering the write up in The Hindu. But for me it was a bit ho-hum. If you read my 2008 post it is pretty clear that I suspected the most general of the findings in this paper at least 10 years back. It is nice to get confirmation of what you suspect, but I’m more interested to be surprised by something novel.
Nevertheless A genetic chronology for the Indian Subcontinent points to heavily sex-biased dispersals has come in for lots of repeated attack in the right-wing Indian press. This is unfair, because it is a rather good paper. I suspect that it wasn’t published in a higher ranked journal because most scientists don’t consider the history of India to be that important, and they didn’t really apply new methods, as opposed to bringing a bunch of data and methods together (in contrast, the 2009 Reich et al. paper was one of the first publications which showed how to utilize “ghost populations” in explicit phylogenetic models with relevance to human demographic history).
As it happens I will be writing up my thoughts in detail in an article for a major Indian publication (similar circulation numbers as The Hindu). This has been in talks for over six months, but I’ve been busy. But a month or so ago I thought it was time that I put something into print for the Indian audience, because I felt there was some misrepresentation going on (i.e., the Aryan invasion theory has not been refuted by genetics, but this is what many Indians assert).
For many years people have told me there are certain topics that shouldn’t be talked about. I have offended people greatly. There are many things people do not want to know. I have come to the conclusion this is not an entirely indefensible viewpoint (though if you accept this viewpoint, I think acceptance of authoritarianism is inevitable, so I hope people will toe the line when the new order arrives; knowing their personalities I think they will conform fine). But my nature is such that I continue to have nothing but contempt for the duplicitous and craven manner in which people go about these sorts of private conversations. I assume that as someone with the name “Razib Khan” I will be attacked vociferously by Hindu nationalists, who will no doubt make recourse to the Left-wing hit pieces against me to undermine my credibility. The fact that these groups are fellow travelers should tell us something, though I will leave that as an exercise for the reader.
I will write my piece that reflects the science as I believe it is, without much consideration of the attacks. That is rather easy for me to do in part because I live in the United States, where denigrating the deeply held views and self-esteem of Hindu nationalists is not sensitive or politically protected (unlike say, Muslims). And Hindu nationalists are less likely to kill me by orders of magnitude than Muslim radicals, and they have far less purchase in this nation than the latter (though you may be interested to know that very conservative Muslims follow me on Twitter; they’re actually more open-minded than many SJWs to be entirely honest).
Let me go over some general points that I see coming up over and over on the relationship between Indian (pre)history and genetics in the critiques.
One of the major critiques has to do with the nature of R1a-Z93 and its subclades. Basically this Y chromosomal haplogroup, the greatest that has ever been known, exhibits a strong signature of very rapid expansion over the past 4,000 years or so. It is divided from Z282. While Z93 is found in South Asia, Central Asia, and Siberia, Z282 is European, with its dominant subclade the one associated with Eastern Europeans. Both of these clades of R1a have gone through massive expansion. In the Altai region R1a is 40% of the heritage of peoples who are now predominantly East Eurasian today. But they are Z93. Additionally, ancient DNA from the Pontic Steppe dated ~4,000 years ago from Srubna remains is Z93, as are Scythian remains from the Iron Age.
Much of the argument comes down to dating, and citing papers that give deep coalescence numbers between different branches of R1a1a. Hindu nationalists and their fellow travelers point to recent papers which give dates >10,000 years ago, and so place the origin of Z93 plausibly in the Pleistocene. The problem is that Y chromosomal coalescence dating is something of a mug’s game. Often they use microsatellite data whose mutational rates are highly uncertain. In contrast, using SNP data, which has a slower mutation rate but requires a lot more data, you get a TMRCA (time to most recent common ancestor) between Z93 and Z282 of ~5,800 years ago. But coalescence estimates often have wide confidence intervals of thousands of years. And even with these intervals, the assumptions you make (e.g., mutation rate) strongly influence your midpoint estimate.
The Y chromosomal data is powerful, but its interpretation still rests on other assumptions. The really big picture framework is the nature of ancient genome-wide variation across Eurasia. Lazaridis et al. 2016 condition us to a prior where much of Eurasia was subject to massive population-wide genetic changes since the Holocene. Therefore, I am much less surprised if there was massive genetic change in India relatively recently. The methods in Priya Moorjani’s paper and in other publications make it obvious that mixture was extensive in South Asia between very distinct groups until ~2,000 years ago. In fact, Moorjani et al. used patterns of variation across the genome to arrive at two to four thousand years ago as the period of massive admixture.
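To make the point about mutation rates concrete, here is a minimal sketch of the simplest possible SNP clock. It assumes two Y lineages differ at d sites over L callable base pairs, so that T = d / (2 * mu * L). The count of 88 differences, the 10 Mb of callable sequence, and both mutation rates are illustrative assumptions picked for the arithmetic, not values taken from any of the papers discussed, and real analyses use far more elaborate models.

```python
import math

def tmrca_years(snp_differences, callable_bp, mu_per_bp_per_year):
    """Strict-clock estimate: two lineages diverging for T years are
    expected to differ at 2 * mu * L * T sites, so T = d / (2 * mu * L)."""
    return snp_differences / (2.0 * mu_per_bp_per_year * callable_bp)

def poisson_interval(count, z=1.96):
    """Rough normal-approximation interval for a Poisson-distributed count."""
    half_width = z * math.sqrt(count)
    return max(count - half_width, 0.0), count + half_width

# Hypothetical inputs, chosen only for illustration.
SNP_DIFFERENCES = 88
CALLABLE_BP = 10_000_000
ASSUMED_RATES = (0.76e-9, 1.0e-9)  # per base pair per year

for mu in ASSUMED_RATES:
    low_d, high_d = poisson_interval(SNP_DIFFERENCES)
    midpoint = tmrca_years(SNP_DIFFERENCES, CALLABLE_BP, mu)
    low_t = tmrca_years(low_d, CALLABLE_BP, mu)
    high_t = tmrca_years(high_d, CALLABLE_BP, mu)
    print(f"mu={mu:.2e}: midpoint {midpoint:,.0f} years "
          f"(interval {low_t:,.0f} to {high_t:,.0f})")
```

The same data shift by more than a thousand years when only the assumed rate changes, and the counting noise on top of that widens the interval further.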
Though we don’t have relevant ancient DNA from India proper to answer any questions yet, we do have ancient DNA from across much of Europe, Central Asia, and the Near East. What they show is that Indian populations share ancestry from both Neolithic Iranians and peoples of the Pontic steppe, who flourished ~5 to ~10,000 years ago. To some extent the latter population is a daughter population of the former…which makes things complicated. Conversely, no West Eurasian population seems to harbor ancient signals of ASI ancestry.
One scientist who holds to the position that most South Asian ancestry dates to the Pleistocene argued to me that we don’t know if ancient Indian samples from the northwest won’t share even more ancestry than the Iranian Neolithic and Pontic steppe samples. In other words, ANI was part of some genetic continuum that extended to the west and north. This is possible, but I do not find it plausible.
The reasons are threefold. First, it doesn’t seem that continuous isolation-by-distance works across huge and rugged regions of Central Eurasia. Rather, there are demographic revolutions, and then relative stasis as the new social-cultural environment crystallizes. This inference I’m making from ancient DNA and extrapolating. This may be wrong, but I would bet I’m not off base here.
Second, it strikes me as implausible that there was literally apartheid between ASI and ANI populations for the whole Holocene right up until ~4,000 years before the present. That is, if Northwest India was involved in reciprocal gene flow with the rest of Eurasia over thousands of years I expect there should have been some distinctive South Asian ASI-like ancestry in the ancient DNA we have. We do not see it.
Third, one of the populations with strong affinities to some Indian populations are those of the Pontic steppe. But we know that this group itself is a compound of admixture that arose 5,000-6,000 years ago. Because of the complexity of the likely population model of ANI this is not definitive, but it seems strange to imagine that ANI could have predated one of the populations with which it was in genetic continuum as part of a quasi-panmictic deme.
Finally, many of the critiques involve evaluation of the scientific literature in this field. Unfortunately this is hard to do from the outside. Citing papers from the aughts, for example, is not wrong, but evolutionary human population genomics is such a fast moving field that even papers published a few years ago are often out of date.
Many are citing a 2012 paper by a respected group which argues for the dominant model of the aughts (marginal population movement into South Asia). One of their arguments, that Central Asian migrants should have East Asian ancestry, is a red herring since it is well known that this dates to the last ~2,000 years or so (we know more now with ancient DNA). But the second point that is more persuasive in the paper is that when they look at local ancestry of ANI vs. ASI in modern Indians, the ANI haplotypes are more diverse than West Eurasians, indicating that they are not descendants but rather antecedents (usually the direction of ancestry is from more diverse to less due to subsampling).
There are two points that I have to make here. First, local ancestry analysis is difficult, so I would not be surprised if they integrated ASI regions into ANI and so elevated the diversity in that way (though they think they’ve taken care of it in the paper). Second, if the ANI are a compound of several West Eurasian groups then we expect them to be more diverse than their parents. In other words, the paper is refuting a model which is almost certainly incorrect, but the alternative hypothesis is not necessarily the true hypothesis (which is a more complex demographic model than many were testing in 2012).
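The second point, that a compound population can be more diverse than either parent, is easy to check with toy numbers. The sketch below uses expected heterozygosity at biallelic sites, 2p(1-p), as a stand-in for the haplotype diversity measured in the 2012 paper, which is a simpler statistic than the one they used. The allele frequencies and the 50/50 mixing proportion are made up for illustration.

```python
def heterozygosity(p):
    """Expected heterozygosity at a biallelic site with allele frequency p."""
    return 2.0 * p * (1.0 - p)

def mean_heterozygosity(freqs):
    """Average expected heterozygosity across sites."""
    return sum(heterozygosity(p) for p in freqs) / len(freqs)

def admix(freqs_a, freqs_b, alpha):
    """Allele frequencies after mixing: alpha from source A, the rest from B."""
    return [alpha * a + (1.0 - alpha) * b for a, b in zip(freqs_a, freqs_b)]

# Hypothetical allele frequencies at four sites in two source populations.
source_a = [0.10, 0.80, 0.30, 0.95]
source_b = [0.70, 0.20, 0.60, 0.50]
mixed = admix(source_a, source_b, alpha=0.5)

print("source A:", round(mean_heterozygosity(source_a), 4))
print("source B:", round(mean_heterozygosity(source_b), 4))
print("mixed:   ", round(mean_heterozygosity(mixed), 4))
```

Because 2p(1-p) is concave, mixing sources whose frequencies differ pushes the compound toward intermediate frequencies, so higher diversity in the mixed group says nothing by itself about which population is ancestral.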
But there are many things we do not know still. Many free variables which we haven’t nailed down. Here are some major points:
Y chromosomal lineages have a correlation with ethno-linguistic groups, but the correlation is imperfect. R1b and R1a seem correlated with Indo-European groups, but both these are found in high proportions in groups which are putatively mostly “pre-Indo-European” in origin (e.g., Basques, Sardinians, and South Indian tribals and non-Brahmin Dravidian speaking groups). Also, haplogroups like I1 in Europe expand with Indo-Europeans locally, suggesting there was lots of heterogeneity in Indo-Europeans as they expanded. In other words, Indo-European expansion in relation to powerful paternal lineages did not always correlate with ethno-linguistic change.
There are probably at minimum two Holocene intrusions from the northwest into South Asia, but this is a floor. The models that are constructed always lack power to detect more complexity. E.g., it is not impossible that there were several migrations of Indo-Europeans into South Asia which we cannot distinguish genetically over a period of a few thousand years.
If one looks over all of South Asia it may be that ASI ancestry in totality is >50% of the total genome ancestry. I don’t have a good guess of the numbers. If this is correct, perhaps most South Asian ancestors 10,000 years ago were living in South Asia (though fertility rates are such in Pakistan that ANI ancestry is increasing right now in relative terms).
But, this presupposes that ASI were present in South Asia in totality 10,000 years ago, rather than being migrants themselves. If ancient DNA confirms that ANI were long present in Northwest India, I hold then it is entirely likely that ASI was intrusive to South Asia! The BMC Evolutionary Biology Paper does a lot of interpretation of deep structure in haplogroup M in South Asia. I’m moderately skeptical of this. Europe may not be a good model for South Asia, but there we see lots of Pleistocene turnover.
So where does this leave us? Ancient DNA will answer a lot of questions. Pretty much all scientists I’ve talked to agree on this. My predictions, some of which I’ve made before:
The first period of admixture is old, and dates to the founding of Mehrgarh as an agricultural settlement. The dominant ANI component dates to this period and mixture event, all across South Asia. The presence in South India is due to expansion of these farming populations.
A second admixture event occurred with the arrival of steppe people. Those who argue for the Aryan invasion model posit 1500 BCE as the date. But these people probably were expanding in some form before this date.
We still don’t know who the antecedents for the Indo-Aryans were. Probably they were a compound of different steppe groups, and also other populations which were mixed in (by analogy, in Europe it is obvious now that there was some mixture with the local European farmers and hunter-gatherers as Europeans expanded their frontier westward; the same probably applies for Indo-Aryans and the BMAC).
The Washington Post has a piece typical of its genre, A Chinese student praised the ‘fresh air of free speech’ at a U.S. college. Then came the backlash. It’s the standard story; a student from China with somewhat heterodox thoughts and sympathies with some Western ideologies and mores expresses those views freely in the West, and social media backlash makes them walk it back. We all know that the walk back is insincere and coerced, but that’s the point: to maintain the norm of not criticizing the motherland abroad. The truth of the matter of how you really feel is secondary.
Tacit in these stories is that of course freedom of speech and democracy are good. And, there is a bit of confusion that even government manipulation aside, some of the backlash from mainland Chinese seems to be sincere. After all, how could “the people” not defend freedom of speech and democracy?
Reading this story now I remember what an academic and friend (well, ex-friend, we’re out of touch) explained years ago in relation to what you say and public speech: one can’t judge speech by what you intend and what you say in a descriptive sense, but you also have to consider how others take what you say and how it impacts them. In other words, intersubjectivity is paramount, and the object or phenomenon “out there” is often besides the point.
At the time I dismissed this viewpoint and moved on.
Though in general I do not talk to people from China about politics (let’s keep it real, it’s all about the food, and possible business opportunities), it was almost amusing to hear them offer their opinions about Tibet and democracy, because so often very educated and competent people would trot out obvious government talking points. In this domain there was little critical rationalism. One could have a legitimate debate about the value of economic liberalization vs. political liberalization. But it was ridiculous to engage with the thesis that China was always unitary between the Former Han and today. That is just a falsehood. Though the specific detail was often lacking in their arguments, it was clearly implied that they knew the final answer. I would laugh at this attitude, because I thought ultimately facts were the true weapon. The world as it is is where we start and where we end.
Or is it? From the article:
Another popular comment expressed disappointment in U.S. universities, suggesting without any apparent irony that Yang should not have been allowed to make the remarks.
“Are speeches made there not examined for evaluation of their potential impact before being given to the public?” the commentator wrote.
“Our motherland has done so much to make us stand up among Western countries, but what have you done? We have been working so hard to eliminate the stereotypes the West has put on us, but what are you doing? Don’t let me meet you in the United States; I am afraid I could not stop myself from going up and smacking you in the face.”
Others were critical not of Yang’s comments but of the venue in which she chose to make them.
“This kid is too naive. How can you forget the Chinese rule about how to talk once you get to the United States? Just lie or make empty talk instead of telling the truth. Only this will be beneficial for you in China. Now you cannot come back to China,” @Labixiaoxin said.
There is a lot of texture even within this passage. I do wonder if the writers and editors at The Washington Post knew the exegetical treasures they were offering up.
To me, there is irony in the irony. Among the vanguard of the intelligentsia in these United States there is plenty of agreement with the thesis that some remarks should not be made, some remarks should not be thought. Especially in public. The issue is not on the principle, but specifically what remarks should not be made, and what remarks should not be public. That is, the important and substantive debates are not about a positive description of the world, but the values through which you view the world. The disagreements with the Chinese here are not about matters of fact, but matters of values. Facts are piddling things next to values.
So let’s take this at face value. Discussions about Tibetan autonomy and Chinese human rights violations cause emotional distress for many Chinese. I’ve seen this a little bit personally, when confronting Chinese graduate students with historical facts. It’s not that they were ignorant, but their views of history were massaged and framed in a particular manner, and it was shocking to be presented with alternative viewpoints when much of one’s national self-identity hinged on a particular narrative. Responses weren’t cogent and passionate, they were stuttering and reflexive.
Now imagine the psychic impact on hundreds of millions of educated Chinese. They’ve been sold a particular view of the world, and these students get exposed to new ideas and viewpoints and relay it back, and it causes emotional distress. Similarly, for hundreds of millions of Muslims expressing atheism is an ipso facto assault on their being, their self-identity. This is why I say that the existence of someone like me, an atheist from a Muslim background, is by definition an affront to many. My existence is blasphemy and hurtful.
And the Chinese view of themselves and their hurt at insults to their nationhood do not come purely from government fiction. There’s a factual reality that needs to be acknowledged. China was for thousands of years one of the most significant political and cultural units in the world. But the period from 1850 to 1980 were dark decades. The long century of eclipse. China was humiliated, dismembered, and rendered prostrate before the world. It collapsed into factious civil war and warlordism. Tens of millions died in famines due to political instability.
In the late 1950s and early 1960s between 20 and 50 million citizens of the Peoples’ Republic of China starved due to Mao’s crazy ambitions. This is out of a population of ~650 million or so. Clearly many Chinese remember this period, and have relatives who survived through this period. A nation brought low, unable to feed its own children, is not an abstraction for the Chinese.
On many aspects of fact there are details where I shrug and laugh at the average citizen of China’s inability to look beyond the propaganda being fed to them. And I am not sure that the future of the Chinese state and society is particularly as rosy as we might hope for, as its labor force already hit a peak a few years ago. But the achievement of the Chinese state and society over the past generation in lifting hundreds of millions out of grinding poverty has been a wonder to behold. A human achievement greater than the construction of the Great Wall, not just a Chinese achievement.
But it is descriptively just a fact that nations which have been on the margins and find themselves at center stage want their “time in the sun.” The outcomes of these instances in history are often not ones which redound to the glory of our species, but it is likely that group self-glorification and hubris come out of a specific evolutionary context.
There are on the order of ~300 million citizens of the United States. There are 1.3 billion Chinese. If offense and hurt are the ultimate measures of the acceptance of speech then an objective rendering might suggest that we lose and they win. There are more of them to get hurt than us.
But perhaps the point is that there is no objectivity. There is no standard “out there.” Once the measuring stick of reality falls away, and all arguments are reduced to rhetoric, it is sophistry against sophistry. Power against power. Your teams and views are picked for you, or chosen through self-interest, or derived from some aesthetic bias. Sometimes the team with the small numbers wins, though usually not.
Discourse is like a season of baseball. At the end there is a winner. But there is no final season. Just another round of argument.
Ten years ago I read Alister McGrath’s The Twilight of Atheism. I literally laughed at the time when I closed that book, because the numbers did not seem to support him in his grand confidence about atheism’s decline. And since the publication of that book the proportion of people in the United States who are irreligious has increased. Contrary to perceptions there has been no great swell of religion across the world.
But on a deep level McGrath was correct about something. Much of the book was aimed at the “New Atheism” specifically. A bold and offensive movement which prioritized the idea of facts first (in the ideal if not always the achievement), McGrath argued that this was a last gasp of an old modernist and realist view of the world, which would be swallowed by the post-modern age. He, a traditional Christian, had a response to the death of reason and empiricism uber alleles, his God of Abraham, God of Isaac, and the God of Jacob. Primordial identities of religion, race, and nationality would emerge from the chaos and dark as reason receded from the world.
With the rise of social constructionism McGrath saw that the New Atheists would lose the cultural commanding heights, their best and only weapons, the glittering steel of singular facts over social feelings. On the other hand, if facts derive from social cognition, then theistic views have much more purchase, because on the whole the numbers are with God, and not his detractors.
And going back to numbers. The Washington Post is owned by Jeff Bezos. And China is a massive economic shadow over us all. Anyone who works in the private sector dreams of business in China. Currently Amazon is nothing in China. What if the Chinese oligarchs made an offer Bezos couldn’t refuse? Do you think The Washington Post wouldn’t change its tune?
When objectivity and being right are no defense, then all that remains is self-interest. Ironically, cold hard realism may foster more universal empathy by allowing us to be grounded in something beyond our social unit. In the near future if the size of social units determines who is, and isn’t, right, then those who built a great bonfire on top of positivism’s death may die first at the hands of the hungry cannibal hordes. Many of us will shed no tears. We were not the ones in need of empathy, because we were among the broad bourgeois masses.
In the end the truth only wins out despite our human natures, not because of them.
It seems every post on Indian genetics elicits dissents from loquacious commenters who are woolly on the details of the science, but convinced in their opinions (yes, they operate through uncertainty and obfuscation in their rhetoric, but you know where the axe is lodged). This post is an attempt to answer some questions so I don’t have to address this in the near future, as ancient DNA papers will finally start to come out soon, I hope (at least earlier than Winds of Winter).
The current distribution of the M17 haplotype is likely to represent traces of an ancient population migration originating in southern Russia/Ukraine, where M17 is found at high frequency (>50%). It is possible that the domestication of the horse in this region around 3,000 B.C. may have driven the migration (27). The distribution and age of M17 in Europe (17) and Central/Southern Asia is consistent with the inferred movements of these people, who left a clear pattern of archaeological remains known as the Kurgan culture, and are thought to have spoken an early Indo-European language (27, 28, 29). The decrease in frequency eastward across Siberia to the Altai-Sayan mountains (represented by the Tuvinian population) and Mongolia, and southward into India, overlaps exactly with the inferred migrations of the Indo-Iranians during the period 3,000 to 1,000 B.C. (27). It is worth noting that the Indo-European-speaking Sourashtrans, a population from Tamil Nadu in southern India, have a much higher frequency of M17 than their Dravidian-speaking neighbors, the Yadhavas and Kallars (39% vs. 13% and 4%, respectively), adding to the evidence that M17 is a diagnostic Indo-Iranian marker. The exceptionally high frequencies of this marker in the Kyrgyz, Tajik/Khojant, and Ishkashim populations are likely to be due to drift, as these populations are less diverse, and are characterized by relatively small numbers of individuals living in isolated mountain valleys.
In a 2002 interview with the India site Rediff, the first author was more explicit:
Some people say Aryans are the original inhabitants of India. What is your view on this theory?
The Aryans came from outside India. We actually have genetic evidence for that. Very clear genetic evidence from a marker that arose on the southern steppes of Russia and the Ukraine around 5,000 to 10,000 years ago. And it subsequently spread to the east and south through Central Asia reaching India. It is on the higher frequency in the Indo-European speakers, the people who claim they are descendants of the Aryans, the Hindi speakers, the Bengalis, the other groups. Then it is at a lower frequency in the Dravidians. But there is clear evidence that there was a heavy migration from the steppes down towards India.
But some people claim that the Aryans were the original inhabitants of India. What do you have to say about this?
I don’t agree with them. The Aryans came later, after the Dravidians.
Over the past few years I’ve gotten to know the above first author Spencer Wells as a personal friend, and I think he would be OK with me relaying that to some extent he was under strong pressure to downplay these conclusions. Not only were, and are, these views not popular in India, but the idea of mass migration was in bad odor in much of the academy during this period. Additionally, there was later work which was less clear, and perhaps supported an Indian origin for R1a1a. Spencer himself told me that it was not impossible for R1a to have originated in India, but a branch eventually back-migrated to southern Asia.
By 2009 one might have admitted that perhaps Spencer was wrong. I was certainly open to that possibility. There was very persuasive evidence that the mtDNA lineages of South Asia had little to do with Europe or the Middle East.
Yet a closer look at the above papers reveals two major systematic problems.
First, ancient DNA has made it clear that there has been major population turnover during the Holocene, but this was not the null hypothesis in the 2000s. Looking at extant distributions of lineages can give one a distorted view of the past. Frankly, the 2009 Indian paper was egregious in this way because they included Turkic groups in their Central Asian data set. Even in 2009 there was a whole lot of evidence that Central Asian Turkic groups were likely very different from Indo-European Turanian populations which would have been the putative ancestors of Indo-Aryans. Honestly the authors either consciously loaded the die to reduce the evidence for gene flow from Central Asia, or they were ignorant (the nature of the samples is much clearer in the supplements than the primary text for what it’s worth).
Second, Y chromosomal marker sets in the 2000s were constrained to fast mutating microsatellite regions or less than 100 variant SNPs on the Y. Because it is so repetitive the Y chromosome is hard to sequence, and it really took the technologies of the last ten years to get it done. Both the above papers estimate the coalescence of extant R1a1a lineages to be 10-15,000 years before the present. In particular, they suggest that European and South Asian lineages date back to this period, pushing back any possible connection between the groups, and making it possible that European R1a1a descended from a South Asian founder group which was expanding after the retreat of the ice sheets. The conclusions were not unreasonable based on the methods they had. But now we have better methods.*
Whole genome sequencing of the Y, as well as ancient DNA, seems to falsify the above dates. Though microsatellites are good for very coarse grain phylogenetic inferences, one has to be very careful about them when looking at more fine grain population relationships (they are still useful in forensics to cheaply differentiate between individuals, since they accumulate variation very quickly). They mutate fast, and their clock may be erratic.
Additionally, diversity estimates were based on a subset of SNPs that were clearly not robust. R1a1a is not diverse anywhere, though basal lineages seem to be present in ancient DNA on the Pontic steppe in some cases.
To show how lacking in diversity R1a1a is, here are the results of a 2016 paper which performed whole genome sequencing on the Y. Instead of relying on the order of 10 to 100 SNPs, this paper discovered over 65,000 Y variants worldwide. Notice how little difference there is between different South Asian groups below, indicative of a massive population expansion relatively recently in time which didn’t even have time to exhibit regional population variation. They note that “The most striking are expansions within R1a-Z93 [the South Asian clade], ~4.0–4.5 kya. This time predates by a few centuries the collapse of the Indus Valley Civilization, associated by some with the historical migration of Indo-European speakers from the western steppes into the Indian sub-continent.”
For some reason women do not seem to migrate much into South Asia. In the late 2000s I, along with others, noticed a strange discrepancy in the Y and mtDNA lineages which trace one’s direct male and female lines: in South Asia the male lineages were likely to cluster with populations to the north and west, while the female lines did not. South Asia’s female lines in fact had a closer relationship to the mtDNA lineages of Southeast and East Asia, albeit distantly.
One solution which presented itself was to contend there was no paradox at all. That the Y chromosomal lineages found in South Asia were basal to those to the west and north. In particular, there were some papers suggesting that perhaps R1a1a originated in South Asia at the end of the Pleistocene. Whole genome sequencing of Y chromosomes does not bear this out though. R1a1a went through rapid expansion recently, and ancient DNA has found it in Russia first. But in 2009 David Reich came out with Reconstructing Indian population history, which offered up something of a possible solution.
What Reich and his coworkers found was that South Asia seems to be characterized by the mixture of two very different types of populations. One set, ANI (Ancestral North Indian), are basically another western or northwestern Eurasian group. ASI (Ancestral South Indian), are indigenous, and exhibit distant affinities to the Andaman Islanders. The India-specific mtDNA then were from ASI, while the Y chromosomes with affinities to people to the north and west were from ANI. In other words, the ANI mixture into South Asia was probably through a mass migration of males.
But it’s not just Y and mtDNA in this case only. A minority of South Asians speak Austro-Asiatic languages. The most interesting of these populations are the Munda, who tend to occupy uplands in east-central India. Older books on Indian history often suggest that the Munda are the earliest aboriginals of the subcontinent, but that has to confront the fact that most Austro-Asiatic languages are spoken in Southeast Asia. There was no true consensus where they were present first.
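For readers who want to see what a two-way mixture model means mechanically, here is a minimal sketch. It is not the estimator used in the 2009 paper; it is a plain least-squares stand-in that assumes a target group's allele frequencies are a weighted average of two source proxies, p_target = alpha * p_ANI + (1 - alpha) * p_ASI. All frequencies are invented, and the proxy populations are hypothetical.

```python
def estimate_mixture_fraction(target, source_a, source_b):
    """Least-squares alpha for target ~= alpha * source_a + (1 - alpha) * source_b.

    Rearranged, (target - source_b) ~= alpha * (source_a - source_b), so alpha
    is the slope of a regression through the origin across sites.
    """
    numerator = sum((t - b) * (a - b) for t, a, b in zip(target, source_a, source_b))
    denominator = sum((a - b) ** 2 for a, b in zip(source_a, source_b))
    if denominator == 0:
        raise ValueError("sources are identical; the mixture fraction is undefined")
    return numerator / denominator

# Hypothetical allele frequencies at four sites.
ani_proxy = [0.90, 0.20, 0.60, 0.10]
asi_proxy = [0.30, 0.70, 0.10, 0.50]
target_group = [0.66, 0.40, 0.40, 0.26]  # built as 60% ANI proxy, 40% ASI proxy

alpha = estimate_mixture_fraction(target_group, ani_proxy, asi_proxy)
print(f"estimated ANI-like fraction: {alpha:.2f}")
```

With real data the sources are themselves unsampled "ghost" populations, which is exactly why the actual methods are more involved than this toy regression.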
Genetics seems to have solved this question. The evidence is building up that Austro-Asiatic languages arrived with rice farmers from Southeast Asia. Though most of the ancestry of the Munda is of ANI-ASI mix, a small fraction is clearly East Asian. And interestingly, though they carry no East Asian mtDNA, they do carry East Asian Y. Again, gene flow mediated by males.
Zoroastrianism is one of the oldest extant religions in the world, originating in Persia (present-day Iran) during the second millennium BCE. Historical records indicate that migrants from Persia brought Zoroastrianism to India, but there is debate over the timing of these migrations. Here we present novel genome-wide autosomal, Y-chromosome and mitochondrial data from Iranian and Indian Zoroastrians and neighbouring modern-day Indian and Iranian populations to conduct the first genome-wide genetic analysis in these groups. Using powerful haplotype-based techniques, we show that Zoroastrians in Iran and India show increased genetic homogeneity relative to other sampled groups in their respective countries, consistent with their current practices of endogamy. Despite this, we show that Indian Zoroastrians (Parsis) intermixed with local groups sometime after their arrival in India, dating this mixture to 690-1390 CE and providing strong evidence that the migrating group was largely comprised of Zoroastrian males. By exploiting the rich information in DNA from ancient human remains, we also highlight admixture in the ancestors of Iranian Zoroastrians dated to 570 BCE-746 CE, older than admixture seen in any other sampled Iranian group, consistent with a long-standing isolation of Zoroastrians from outside groups. Finally, we report genomic regions showing signatures of positive selection in present-day Zoroastrians that might correlate to the prevalence of particular diseases amongst these communities.
The paper uses lots of fancy ChromoPainter methodologies which look at the distributions of haplotypes across populations. But some of the primary results are obvious using much simpler methods.
1) About 2/3 of the ancestry of Indian Parsis derives from an Iranian population
2) About 1/3 of the ancestry of Indian Parsis derives from an Indian population
3) Almost all the Y chromosomes of Indian Parsis can be accounted for by Iranian ancestry
4) Almost all the mtDNA haplogroups of Indian Parsis can be accounted for by Indian ancestry
5) Iranian Zoroastrians are mostly endogamous
6) Genetic isolation has resulted in drift and selection on Zoroastrians
The fact that the ancestry proportion is clearly more than 50% Iranian for Parsis indicates that there was more than one generation of males who migrated. They did not contribute mtDNA, but they did contribute genome-wide to Iranian ancestry. There are wide intervals on the dating of this admixture event, but they are consonant with the oral history that was later written down by the Parsis.
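The arithmetic behind that inference can be sketched in a few lines. This is a toy model under loud assumptions, not the method of the paper: founding mothers are all local, every later mother comes from within the community, and in each generation some assumed share of fathers are newly arrived, fully Iranian men. The function name and the shares are hypothetical.

```python
def iranian_autosomal_trajectory(migrant_father_shares):
    """Iranian autosomal fraction of the community, generation by generation.

    migrant_father_shares[t] is the assumed share of fathers in generation t
    who are new, fully Iranian male migrants. The remaining fathers and all
    mothers are drawn from the existing community (toy-model assumption).
    """
    community = 0.0  # the founding mothers are local, so we start at 0% Iranian
    trajectory = []
    for share in migrant_father_shares:
        fathers = share * 1.0 + (1.0 - share) * community
        mothers = community
        community = (fathers + mothers) / 2.0  # children average their parents
        trajectory.append(community)
    return trajectory


# One founding wave of men, then no further migration.
single_wave = iranian_autosomal_trajectory([1.0, 0.0, 0.0])
# A second generation in which all fathers are again new migrants.
two_waves = iranian_autosomal_trajectory([1.0, 1.0, 0.0])
print(single_wave, two_waves)
```

In this model a single wave of men can never push the community past 50% Iranian on the autosomes, while any further male migration does, and the mtDNA stays local throughout because no migrant women are ever added. For example, a second generation in which two thirds of the fathers are new migrants lands at about 2/3 Iranian ancestry.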
So there you have it. Another example of a population formed from admixture because women hate going to India.
Citation: The genetic legacy of Zoroastrianism in Iran and India: Insights into population structure, gene flow and selection.
Saioa Lopez, Mark G Thomas, Lucy van Dorp, Naser Ansari-Pour, Sarah Stewart, Abigail L Jones, Erik Jelinek, Lounes Chikhi, Tudor Parfitt, Neil Bradman, Michael E Weale, Garrett Hellenthal
bioRxiv 128272; doi: https://doi.org/10.1101/128272
Pretty much any person of Indian subcontinental origin in the United States of a certain age who isn’t very dark skinned has probably had the experience of being spoken to in Spanish at some point. When I was younger, growing up in Oregon, I had the experience multiple times of Spanish speakers, probably Mexican, pleading with me to interpret for them because there was no one else who seemed likely. It isn’t a genius insight to conclude I was most likely South Asian…but it wasn’t out of the question I was Mexican. This applies even more to lighter skinned South Asians. In the Central Valley of California, where there are many Sikhs from Punjab and many Mexicans, this confusion occurred a lot for some Indian kids.
Of course biogeographically there isn’t that much connection between South Asia and the New World. But it isn’t crazy that Christopher Columbus labelled the peoples of the New World “Indian.” After all, they were a brown-skinned people whose features were not African, East Asian, or West Eurasian. And, it turns out genetically there is a coincidence that connects the New World and South Asia: the mixed peoples of Latin America with Amerindian and European ancestry recapitulate an admixture which resembles what occurred in South Asia thousands of years ago. It looks as if about half the ancestry of South Asians is West Eurasian and half something more like eastern Eurasians.
On principal component analysis that means that South Asian and Mexican and Peruvian samples often overlap. This is somewhat curious because the non-West Eurasian ancestors of South Asians and Amerindians diverged on the order of 25 to 45 thousand years before the present. And the Iberian ancestry of the mixed people of the New World is almost as far from the character of South Asian West Eurasian ancestry as you can get (in the parlance of this blog, lots of EEF, less CHG, not too much ANE).
I actually began writing about this in the late 2000s, when it was already obvious that South Asian mtDNA was very different from West Eurasian mtDNA, while South Asian Y chromosomes were mostly West Eurasian. Then work using genome-wide data sets began to point to massive intra-Eurasian admixture between very diverged lineages. The paper is not revolutionary, but worth reading for its thoroughness and how it brings together all the lines of evidence.
Finally, no ancient DNA. That’s probably for the future, but I don’t expect any surprises.