The genetics of the St. Thomas Christians

First, I have to say I appreciate everyone who keeps sending data to the South Asian Genotype Project. Basically, I’m automating the pipeline, finding ways to merge data from a host of sources, but also figuring out how to refine the analysis.

But until then, today I decided to do some more manual analysis of three samples I have from St. Thomas Christians (also called Nasranis). The reason is that there were some questions on Twitter in relation to the genetics of this group, and though three is not a great sample size, it’s better than nothing.

The St. Thomas Christians are a group of people in the southern Indian state of Kerala with diverse origin stories. Today they have a range of denominational and sectarian affinities, but their origins probably have something to do with the Church of the East.

These Christians claim roots among the local Brahmin community, Jews, and West Asian settlers. To be honest, whenever people tell me about Brahmin ancestors, unless they were recent converts, I discount this, because there are about ten times as many St. Thomas Christians in Kerala as there are Brahmins. There is a small Jewish community in the area, and this region of India was long part of the Indian Ocean trade network of the Arabs.

I merged the three Nasrani samples with a lot of other populations. Zooming in on the South Asians, the PCA plot shows that they are not in the same cluster as the South Indian Brahmins (Brahmins from the four South Indian states are very similar to each other). But, in comparison to non-Brahmin South Indians, they do seem Brahmin-shifted.

As I have observed before, these South Indian Brahmins can be thought of as more than 50% North Indian Brahmin, with the remainder being South Indian non-Brahmin. Aside from exotic exceptions (Parsis, Bengalis), most South Asians exist on an ANI-ASI “cline,” with lower caste South Indians being at one end of the cline (more ASI), and populations in the far northwest, such as the Kalash, being at the other end (more ANI). The PCA would suggest that the Nasrani are more ANI-shifted than a generic South Indian group, but less so than South Indian Brahmins.
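The “more than 50%” framing is a two-way linear mixture model. As a back-of-the-envelope sketch (not the method used for this post), if a single ancestry component is treated as a proxy for position on the cline, the mixing fraction can be read off by linear interpolation. The function name `mixture_fraction` is hypothetical. The inputs are the Iranian-component means from the supervised ADMIXTURE table further down in this post, and using UP_Brahmin and Telugu_Reddy as stand-ins for the northern and southern sources is an assumption made purely for illustration.

```python
def mixture_fraction(target, source_a, source_b):
    """Solve target = alpha * source_a + (1 - alpha) * source_b for alpha."""
    if source_a == source_b:
        raise ValueError("sources must differ on the proxy component")
    return (target - source_b) / (source_a - source_b)


# Iranian-component means (%) from the supervised ADMIXTURE table in this post.
# Assumed proxies: UP_Brahmin = northern source, Telugu_Reddy = southern source.
UP_BRAHMIN, TELUGU_REDDY = 26.0, 0.0
SI_BRAHMIN, NASRANI = 16.0, 12.0

alpha_si_brahmin = mixture_fraction(SI_BRAHMIN, UP_BRAHMIN, TELUGU_REDDY)
alpha_nasrani = mixture_fraction(NASRANI, UP_BRAHMIN, TELUGU_REDDY)

print(f"SI_Brahmin position on the proxy cline: {alpha_si_brahmin:.2f}")
print(f"Nasrani position on the proxy cline:    {alpha_nasrani:.2f}")
```

On this crude one-component proxy the South Indian Brahmins come out a bit above one half, and the Nasrani sit below them. These numbers describe position along the cline, not a literal ancestry fraction from any specific group.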

Using Treemix to detect gene flow events, what I found is that the Nasranis look like a generic South Indian group. There’s no evidence of gene flow from Middle Eastern populations (Jews, Persians).

I also ran some f3 tests, and I see nothing conclusive to suggest Middle Eastern gene flow into the Nasranis.
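For readers who haven’t seen it, the f3 statistic for a target C and two candidate sources A and B is the average over SNPs of (c - a)(c - b), where a, b, and c are allele frequencies. A significantly negative value is evidence that C is admixed between populations related to A and B. Below is a minimal sketch of the uncorrected statistic. The function name and the toy frequencies are invented for illustration, and the sketch omits the finite-sample correction and block-jackknife standard errors that real tools such as ADMIXTOOLS apply.

```python
def f3(target, source_a, source_b):
    """Uncorrected f3(target; source_a, source_b), averaged over SNPs."""
    if not target or not (len(target) == len(source_a) == len(source_b)):
        raise ValueError("need equal-length, non-empty frequency lists")
    products = [(c - a) * (c - b) for c, a, b in zip(target, source_a, source_b)]
    return sum(products) / len(products)


# Toy allele frequencies at four SNPs (invented, not real data).
pop_a = [0.10, 0.80, 0.30, 0.90]
pop_b = [0.70, 0.20, 0.90, 0.10]

# A target built as an even mix of A and B lies between them at every SNP,
# so every (c - a)(c - b) term is negative.
mixed = [(a + b) / 2 for a, b in zip(pop_a, pop_b)]
f3_mixed = f3(mixed, pop_a, pop_b)

# A target lying outside the A-B range at every SNP gives positive terms.
outside = [0.05, 0.90, 0.95, 0.95]
f3_outside = f3(outside, pop_a, pop_b)

print(f"mixed target:   {f3_mixed:+.4f}")
print(f"outside target: {f3_outside:+.4f}")
```

A non-negative f3 does not rule out admixture, since drift in the target after mixing can mask the signal, which is one reason an inconclusive result is only weak evidence.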

Finally, I ran ADMIXTURE in supervised mode. Here are the mean ancestry fractions for a set of South Asian populations:

Group                  Druze  Georgian  Han  Iranian  Telugu  Yemenite Jew
Bangladeshi            1%     2%        12%  1%       83%     1%
Chamar                 0%     0%        3%   0%       97%     0%
Gujurati_Patel         0%     1%        0%   10%      89%     0%
UP Kshatriya           0%     3%        1%   21%      76%     0%
Nasrani                0%     4%        1%   12%      83%     0%
Pathan                 0%     4%        1%   55%      40%     0%
Piramalai_Kallar       0%     0%        2%   0%       97%     0%
SI_Brahmin             0%     4%        1%   16%      78%     0%
Telugu_Reddy           0%     3%        0%   0%       94%     3%
UP_Brahmin             0%     4%        1%   26%      69%     0%
UP_Kayastha            0%     0%        1%   20%      79%     0%
Velama                 1%     1%        0%   2%       96%     0%
West_Bengal_Kayastha   0%     0%        7%   8%       85%     0%

In these results, the Nasrani do look shifted in the same direction as South Indian Brahmins, though less so. Observe that there is no clear Middle Eastern signal in the Nasrani above and beyond what you see in South Asians. This, despite the fact that Indian Jews show a very strong signal of admixture from the Middle East. At this point, I am confident in rejecting Nasrani St. Thomas Christian origins in a converted Jewish community, or one with a large degree of West Asian admixture.
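To make that comparison concrete, here is a small sketch that collapses the table into two summary numbers per group. The grouping is an assumption made for illustration: Druze plus Yemenite Jew as a Levantine/Arabian signal, and Georgian plus Iranian as the component that tracks position on the ANI-ASI cline. The values are copied from the table for the South Indian groups and UP Brahmins; the names `ROWS` and `summarize` are hypothetical.

```python
# Mean supervised ADMIXTURE fractions (%) copied from the table above.
# Tuple order: Druze, Georgian, Han, Iranian, Telugu, Yemenite Jew.
ROWS = {
    "Nasrani": (0, 4, 1, 12, 83, 0),
    "SI_Brahmin": (0, 4, 1, 16, 78, 0),
    "UP_Brahmin": (0, 4, 1, 26, 69, 0),
    "Telugu_Reddy": (0, 3, 0, 0, 94, 3),
    "Velama": (1, 1, 0, 2, 96, 0),
    "Piramalai_Kallar": (0, 0, 2, 0, 97, 0),
}


def summarize(row):
    """Collapse one table row into the two assumed summary components."""
    druze, georgian, han, iranian, telugu, yemenite_jew = row
    return {
        "levant_arabia": druze + yemenite_jew,
        "georgian_iranian": georgian + iranian,
    }


summary = {group: summarize(row) for group, row in ROWS.items()}

# Rank groups from most to least Georgian + Iranian.
ranked = sorted(summary, key=lambda g: summary[g]["georgian_iranian"], reverse=True)

for group in ranked:
    s = summary[group]
    print(f"{group:<18} georgian+iranian={s['georgian_iranian']:>3}%  "
          f"druze+yemenite_jew={s['levant_arabia']:>2}%")
```

With these groupings the Nasrani fall between the non-Brahmin South Indian groups and the South Indian Brahmins on the Georgian plus Iranian sum, while their Druze plus Yemenite Jew sum is zero.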

Though the genetic profile of these three individuals does not support clear descent from South Indian Brahmins, I cannot reject the model of Brahmin admixture into this community. On the contrary, a plausible model would seem to be that various South Indian groups, including Brahmins, contributed to the Nasrani community over the centuries.

To be continued….

“Rakhigarhi paper” out in January 2018? (maybe?)

Tony Joseph has an interesting piece up, Who built the Indus Valley civilisation?, which people are asking me about via email. First, I don’t have any inside information. Last I heard in September was that the Rakhigarhi results were “one or two months away,” like they have been for a year or so. So I put it out of mind.

In any case, here are the important points:

All this could now change thanks to the science of genetics and four ancient skeletons excavated from a village called Rakhigarhi in Haryana. The four people to whom these bones once belonged — a couple, a boy and a man — lived roughly 4,600 years ago when the Indus Valley civilisation was in full bloom.

In the three-and-a-half years since its excavation, Shinde has brought together scientists from Indian and international institutions like the Centre for Cellular and Molecular Biology, Hyderabad (CCMB), Harvard Medical School, Seoul National University, and the University of Cambridge to work on different parts of the project, including extracting and analysing DNA from these ancient people, reconstructing their faces, and studying the remains of their habitation to understand their daily habits and ways of life.

The DNA analysis will also help figure out their height, body features, and even the colour of their eyes….

Joseph also asserts that the publication will happen in a “leading international journal” in a month or so. If I had to bet, I’d say Nature.

Harvard Medical School suggests to me they finally got David Reich’s group involved. As for Cambridge University, Eske Willerslev now has an appointment there. He’s apparently assembling a paleogenetics group.

The piece specifically highlights Y and mtDNA. But if they are talking about height, body features, and color of eyes, they must have gotten genome-wide data. If Eske Willerslev is involved they may have sequenced the whole genome at some coverage of at least one of the samples.

If I had to bet, I think the Rakhigarhi samples will be Y haplogroups J2 or the Indian branch of L, and the mtDNA will be an Indian branch of M. In terms of genome-wide patterns they will exhibit a mixture between West Eurasian ancestry, with strong affinities to Near Eastern farmers from the Zagros, and what we now term “Ancestral South Indians” (ASI), who descend from the aboriginal peoples of the subcontinent, and are genetically somewhat closer to East Eurasians than West Eurasians (to be fair, I think it is not implausible that much of ASI heritage is the product of westward migration out of Southeast Asia during the Pleistocene and early Holocene).

Overall, genetically these samples may look the most like South Indian non-Brahmin middle-to-upper castes. Think the Reddy people of Andhra Pradesh. Additionally, going back to R1a1a-Z93, I do think it was intrusive with the Indo-Aryans. Its highest frequencies do tend to be among upper castes, and there is an increasing cline toward the northwest of the subcontinent.

But R1a1a-Z93’s presence at appreciable frequencies in South India among non-Brahmins, including tribal populations, indicates a more complex ethnogenesis of Dravidian-speaking groups than we might have realized. Priya Moorjani told me specifically that 4,000 years ago there were “unmixed ANI and ASI groups” in the subcontinent. I think for the former she’s picking up the signal of intrusive Indo-Aryans. But what about the latter? I doubt there were unmixed ASI in the Indus Valley. But they probably still persisted to the south and east when the Indus Valley people were in decline and the Indo-Aryans arrived. The South Indian Neolithic dates from 3000 to 1400 BC.

Here is my moderate-confidence sketch. The collapse of the Indus Valley civilization was probably ultimately due to the fact that these early antique societies were not very robust to exogenous shocks and endogenous decay of asabiya. Once these societies, which have accumulated some level of surplus wealth by squeezing it out of the Malthusian margin, start to totter, social collapse and dissolution can happen fast, and barbarian groups outside the gates with more social cohesion can engage in a takeover.

In the case of the collapse of the Sumerian-Akkadian civilization, the barbarian Amorites actually took over and maintained cultural continuity. In post-Roman Britain, the Roman civilization collapsed in totality, and “Roman Christianity” had to be reintroduced from the European continent and from the Celts into Anglo-Saxon England. The barbarian takeover resulted in the total cultural obliteration of the Britons. Finally, you have instances such as post-Roman Gaul, which transformed into Francia. Unlike the case of the transition from the rule of the Third Dynasty of Ur to that of the Amorites, the Frankish rulers oversaw a wholesale reimagining of the identity of the people of Gaul. Even as late as 800, a ruler such as Charlemagne still spoke a dialect of German as his first language. And yet the Franks of Neustria were ultimately transformed and became one with the “Romans” whom they ruled.

In the post-Harappan world of northwest India I suspect something close to the Anglo-Saxon precedent is likely. Though the majority of the ancestry of the Upper Gangetic plain is not Indo-Aryan, a substantial proportion is. And this ancestry is detectable at lower fractions even among non-Brahmin Bengalis. In Central and South India the situation was probably more like Mesopotamia around 2000 BC or Gaul post-500 AD. There were various sorts of interactions between Indo-Aryans and local populations, as well as the final assimilation of aboriginal peoples into Indo-Aryan and Dravidian-speaking peoples.*

* The Munda people clearly have some East Asian ancestry. And, they are mostly a mix of ANI and ASI. But whenever I look at their genome-wide results it strikes me they may not have any Indo-Aryan ancestry. This may ultimately be totally comprehensible in light of the chronology of migration and segregation.

Update: One of the researchers involved indicates Eske Willerslev is not involved.

The Indo-Aryan migration to the Indian subcontinent

The piece is up at India Today. The headline and title are of course optimized for clicks. I would, for example, say that the Indo-Aryans came from the west, not the West.

In the course of writing this it has become clear that many people have very specific commitments on this issue. I think it is clear I do not. Genetic inference methods have wide confidence intervals around particular dates. So I’ll leave it to those with more archaeological knowledge to argue over specific dates. But it strikes me that the dates point to a likelihood that much of the expansion and diversification of Indo-Aryans may precede their expansion into the Gangetic plain ~1500 BCE, the date preferred by many scholars.

Apparently we shouldn’t have to wait too long for ancient DNA from Rakhigarhi (months, not years). But I doubt that will settle anything, as opposed to being preliminary and setting off new debates.