What the Harappa Ancestry Project has resolved

My friend Zack Ajmal has been running the Harappa Ancestry Project for several years now. This is a non-institutional complement to the genomic research which occurs in the academy. His motivation was in large part to fill in the gaps in population coverage within South Asia which one sees in the academic literature. Many of these gaps are due to politics, as the government of India has traditionally been reluctant to allow sample collection (ergo, the HGDP data uses Pakistanis as its South Asian reference, while the HapMap collected DNA from Indian Americans in Houston). Of course this sort of project is not without its own blind spots. Zack must rely on public data sets to get a better picture of groups like tribal populations and Dalits, because they are so underrepresented in the diaspora from which he draws many of the project participants.

Once Zack has a participant’s genotype, one of the primary things he does is add it to his broader data set (which includes many public samples) and analyze it with the Admixture model-based clustering package. Admixture takes a specified number of populations (e.g., K = 12) and assigns each individual a fraction of ancestry from each of them. So, for example, at K = 2 individual A might be assigned 40% population 1 and 60% population 2, while individual B might be 45% population 1 and 55% population 2. These are not necessarily ‘real’ populations. Rather, the populations and their proportions are there to allow you to discern patterns of relationships across individuals.
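To make the shape of that output concrete, here is a minimal sketch using the two example individuals from the paragraph above. It does not run Admixture or reproduce its model. It only shows that each individual ends up with a vector of fractions summing to 1, and that individuals can then be compared by how similar those vectors are. The helper names and the choice of Euclidean distance are my own assumptions for illustration, not anything taken from the Harappa project.

```python
# Ancestry fractions for the two example individuals from the text (K = 2).
# Each entry is one individual; each position is one inferred population.
proportions = {
    "A": [0.40, 0.60],
    "B": [0.45, 0.55],
}

def is_valid_assignment(fractions, tol=1e-9):
    """A valid assignment has non-negative fractions that sum to 1."""
    return all(f >= 0 for f in fractions) and abs(sum(fractions) - 1.0) < tol

def distance(p, q):
    """Euclidean distance between two proportion vectors (an illustrative choice)."""
    return sum((a - b) ** 2 for a, b in zip(p, q)) ** 0.5

for name, fractions in proportions.items():
    print(name, fractions, "valid:", is_valid_assignment(fractions))

# A small distance means the two individuals have similar ancestry profiles.
print("distance A-B:", round(distance(proportions["A"], proportions["B"]), 4))
```

The absolute numbers mean little on their own. What carries information is which individuals sit close together and which sit far apart.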

Since Zack has put his results online, I thought it would be useful to review what patterns have emerged over the past two years, as his sample sizes for some regions are now reasonably large. Though he uses K = 16 populations, not all of them will concern us, because South Asians do not tend to exhibit many of the components. I will focus on seven: S Indian, Baloch, Caucasian, NE Euro, SE Asian, Siberian and NE Asian. These are not real populations, but the labels tell you the region in which each component is modal. So, for example, the “S Indian” component peaks in southern India, the “Baloch” component among the Baloch people of southeastern Iran and southwestern Pakistan, and the “NE Euro” component among the eastern Baltic peoples. The last three are Asian components, running in latitude from south to north to center. They concern only the first population of interest, Bengalis, so I will combine them as “Asian.”

Below is a table, mostly of individuals from Zack’s results (though there are some aggregate results from public data sets). My comments follow the table.

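| Ethnicity | S Indian | Baloch | Caucasian | NE Euro | Asian |
|---|---|---|---|---|---|
| Bengali | 53% | 28% | 2% | 5% | 8% |
| Bengali Baidya | 45% | 30% | 3% | 5% | 12% |
| Bengali Baidya | 45% | 27% | 3% | 6% | 12% |
| Bengali Brahmin | 45% | 35% | 2% | 11% | 4% |
| Bengali Brahmin | 44% | 35% | 5% | 11% | 4% |
| Bengali Brahmin | 43% | 35% | 4% | 10% | 4% |
| Bengali Brahmin | 42% | 32% | 4% | 8% | 6% |
| Bengali Brahmin | 41% | 33% | 7% | 8% | 5% |
| Bengali Brahmin | 40% | 33% | 4% | 10% | 4% |
| Bengali Brahmin | 40% | 30% | 6% | 10% | 7% |
| Bengali Muslim | 50% | 25% | 1% | 5% | 15% |
| Bengali Muslim | 49% | 28% | 3% | 4% | 15% |
| Bengali Muslim | 45% | 27% | 4% | 4% | 17% |
| Bengali Muslim | 45% | 26% | 2% | 2% | 16% |
| Bengali Muslim | 45% | 24% | 1% | 3% | 19% |
| Bengali Muslim | 43% | 25% | 3% | 2% | 18% |
| Bengali Muslim | 48% | 27% | 0% | 5% | 15% |
| Tamil Brahmin | 48% | 37% | 6% | 5% | |
| Tamil Brahmin | 48% | 37% | 3% | 5% | |
| Tamil Brahmin | 48% | 35% | 5% | 6% | |
| Tamil Brahmin | 47% | 38% | 6% | 4% | |
| Tamil Brahmin | 47% | 40% | 3% | 5% | |
| Tamil Brahmin | 46% | 40% | 3% | 6% | |
| Tamil Brahmin Iyengar | 50% | 35% | 2% | 8% | |
| Tamil Brahmin Iyengar | 47% | 38% | 6% | 4% | |
| Tamil Brahmin Iyengar | 47% | 35% | 6% | 6% | |
| Tamil Brahmin Iyer | 48% | 38% | 4% | 5% | |
| Tamil Brahmin Iyer | 48% | 38% | 2% | 5% | |
| Tamil Brahmin Iyer | 47% | 37% | 2% | 5% | |
| Tamil Brahmin Iyer | 47% | 37% | 6% | 8% | |
| Tamil Brahmin Iyer | 43% | 35% | 6% | 5% | |
| Tamil Muslim | 58% | 28% | 3% | 2% | |
| Tamil Nadar | 62% | 30% | 0% | 0% | |
| Tamil Nadar | 59% | 32% | 3% | 0% | |
| Tamil Nadar | 55% | 30% | 3% | 0% | |
| Tamil Vellalar | 50% | 35% | 6% | 1% | |
| Tamil Vellalar | 51% | 32% | 5% | 0% | |
| Tamil Vellalar (Sri Lankan) | 60% | 32% | 5% | 0% | |
| Tamil Vellalar (Sri Lankan) | 60% | 33% | 0% | 0% | |
| Tamil Vellalar (Sri Lankan) | 56% | 36% | 0% | 0% | |
| Tamil Vishwakarma | 70% | 23% | 0% | 0% | |
| Tamil Vishwakarma | 66% | 25% | 4% | 0% | |
| Andhra Pradesh | 60% | 34% | 2% | 0% | |
| Andhra Pradesh | 54% | 36% | 2% | 3% | |
| Andhra Pradesh (Hyderabad) | 56% | 29% | 5% | 0% | |
| Andhra Pradesh (Hyderabad) | 47% | 35% | 8% | 4% | |
| Andhra Pradesh Gouda | 61% | 30% | 2% | 1% | |
| Andhra Pradesh Kamma | 51% | 33% | 7% | 0% | |
| Andhra Pradesh Kapu | 62% | 30% | 2% | 1% | |
| Andhra Pradesh Naidu | 51% | 32% | 4% | 2% | |
| Andhra Pradesh Reddy | 57% | 37% | 1% | 0% | |
| Andhra Pradesh Reddy | 54% | 38% | 3% | 0% | |
| Andhra Pradesh Reddy | 51% | 35% | 4% | 0% | |
| Andhra Pradesh Reddy | 50% | 36% | 2% | 1% | |
| Andhra Pradesh Telegu Brahmin | 45% | 33% | 6% | 4% | |
| AP Brahmin (Xing, N = 25) | 49% | 36% | 3% | 6% | |
| AP Naidu (Reich, N = 4) | 61% | 31% | 1% | 1% | |
| Kannada Devanga | 60% | 31% | 3% | 1% | |
| Karnataka Catholic Christian | 56% | 37% | 3% | 0% | |
| Karnataka Lingayat | 55% | 34% | 4% | 0% | |
| Karnataka | 54% | 36% | 2% | 0% | |
| Karnataka Brahmin | 51% | 35% | 3% | 5% | |
| Karnataka Iyengar | 49% | 36% | 5% | 5% | |
| Karnataka Iyengar | 48% | 39% | 3% | 5% | |
| Karnataka Iyengar | 48% | 37% | 3% | 7% | |
| Karnataka Brahmin | 47% | 38% | 4% | 6% | |
| Karnataka Konkani Brahmin | 47% | 37% | 2% | 6% | |
| Karnataka Konkani Brahmin | 46% | 33% | 6% | 7% | |
| Karnataka Kokani Brahmin | 44% | 34% | 6% | 5% | |
| Kerala | 47% | 33% | 7% | 2% | |
| Kerala Brahmin | 43% | 39% | 4% | 6% | |
| Kerala Christian | 53% | 35% | 4% | 0% | |
| Kerala Christian | 50% | 35% | 8% | 1% | |
| Kerala Christian | 45% | 33% | 7% | 3% | |
| Kerala Muslim Rawther | 53% | 35% | 2% | 1% | |
| Kerala Muslim Rawther | 51% | 28% | 4% | 3% | |
| Kerala Nair | 48% | 40% | 4% | 0% | |
| Kerala Nair | 47% | 38% | 5% | 5% | |
| Kerala Syrian Christian | 50% | 37% | 6% | 0% | |
| Kerala Syrian Christian | 50% | 35% | 9% | 1% | |
| Kerala Syrian Christian | 46% | 33% | 5% | 4% | |
| Kerala Syrian Christian | 44% | 33% | 6% | 4% | |
| Pathan (HGDP, N = 23) | 23% | 42% | 16% | 11% | |
| Kalash (HGDP, N = 23) | 22% | 43% | 18% | 11% | |
| Burusho (HGDP, N = 25) | 23% | 41% | 12% | 10% | |
| Brahui (HGDP, N = 25) | 12% | 58% | 12% | 2% | |
| Sindhi (HGDP, N = 24) | 29% | 46% | 10% | 6% | |
| Kashmiri Pandit (Reich, N = 5) | 32% | 39% | 12% | 9% | |
| Punjabi | 43% | 36% | 5% | 9% | |
| Punjabi | 39% | 39% | 9% | 7% | |
| Punjabi | 34% | 43% | 7% | 7% | |
| Punjabi | 34% | 40% | 12% | 8% | |
| Punjabi | 33% | 44% | 5% | 10% | |
| Punjabi | 31% | 41% | 14% | 8% | |
| Punjabi | 29% | 36% | 11% | 11% | |
| Punjabi Arain (Xing, N = 25) | 31% | 44% | 10% | 7% | |
| Punjabi Brahmin | 35% | 40% | 8% | 11% | |
| Punjabi Brahmin | 33% | 41% | 13% | 10% | |
| Punjabi Chamar | 40% | 33% | 9% | 6% | |
| Punjabi Jatt | 28% | 39% | 11% | 10% | |
| Punjabi Jatt | 30% | 44% | 6% | 14% | |
| Punjabi Jatt | 28% | 42% | 8% | 13% | |
| Punjabi Jatt | 28% | 46% | 7% | 13% | |
| Punjabi Jatt | 28% | 40% | 10% | 15% | |
| Punjabi Jatt | 27% | 44% | 10% | 13% | |
| Punjabi Jatt | 27% | 35% | 16% | 11% | |
| Punjabi Jatt Muslim | 30% | 39% | 13% | 8% | |
| Punjabi Khatri | 30% | 42% | 12% | 12% | |
| Punjabi Lahori Muslim | 31% | 44% | 11% | 8% | |
| Punjabi Pahari Rajput | 34% | 43% | 11% | 7% | |
| Punjabi Pakistan | 28% | 36% | 16% | 7% | |
| Punjabi Ramgarhia | 35% | 43% | 5% | 9% | |
| Haryana Jat | 25% | 33% | 12% | 17% | |
| Haryana Jat | 25% | 33% | 12% | 17% | |
| Haryana Jatt | 28% | 38% | 5% | 20% | |
| Haryana Jatt | 26% | 39% | 10% | 17% | |
| Rajasthan Marwari Jain | 47% | 34% | 5% | 6% | |
| Rajasthani Agarwal | 51% | 37% | 6% | 1% | |
| Rajasthani Brahmin | 32% | 38% | 9% | 15% | |
| Rajasthani Marwari | 48% | 34% | 6% | 2% | |
| Rajasthani Rajput | 45% | 38% | 5% | 9% | |
| UP | 40% | 28% | 10% | 8% | |
| UP Brahmin | 41% | 37% | 7% | 11% | |
| UP Brahmin | 40% | 37% | 7% | 11% | |
| UP Brahmin | 37% | 38% | 2% | 14% | |
| UP Kayastha | 47% | 38% | 5% | 3% | |
| UP Muslim | 33% | 33% | 10% | 9% | |
| UP Muslim | 28% | 35% | 12% | 11% | |
| UP Muslim Pathan | 48% | 36% | 7% | 4% | |
| UP Muslim Syed | 33% | 31% | 13% | 7% | |
| UP Syed | 36% | 37% | 7% | 8% | |
| UP/Haryana Agarwal | 52% | 35% | 6% | 2% | |
| UP/Haryana Jatt | 28% | 42% | 7% | 18% | |
| UP/Madhya Pradesh | 51% | 27% | 1% | 7% | |
| UP/Punjabi | 40% | 33% | 7% | 10% | |
| UP/Punjabi Khatri | 27% | 43% | 10% | 11% | |
| Bihari Baniya | 47% | 31% | 5% | 5% | |
| Bihari Brahmin | 39% | 38% | 5% | 11% | |
| Bihari Kayastha | 53% | 33% | 1% | 7% | |
| Bihari Muslim | 48% | 28% | 5% | 8% | |
| Bihari Muslim | 42% | 34% | 9% | 6% | |
| Bihari Muslim | 41% | 36% | 7% | 8% | |
| Bihari Muslim | 42% | 32% | 7% | 9% | |
| Bihari Syed | 42% | 35% | 4% | 9% | |
| Gujarati (HapMap, N = 63, Patel) | 54% | 42% | 0% | 1% | |
| Gujarati (HapMap, N = 34, Non-Patel) | 44% | 39% | 5% | 7% | |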

A recent paper suggested that there was a single pulse of admixture between South and East Asians in the environs of what is today Bangladesh, which occurred ~500 A.D. The traditional accounts of the arrival of Brahmins in Bengal suggest a period around and after 1000 A.D. (Bengal was one of the last redoubts of institutional Buddhism in northern India, so presumably it would have had less need for the services of Brahmins). The results are easy to align with these two facts. All the Bengali non-Brahmins (Baidya are a non-Brahmin high caste in West Bengal) have substantial East Asian ancestry. The Bengali Brahmins have far less of this. Additionally, their “NE Euro” component is about double that of non-Brahmins. There is still room for the Bengali Brahmins being a synthetic community with some admixture (their East Asian fraction is still notably higher than elsewhere in South Asia), but the traditional narrative seems to explain the broad outline of these results.
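As a rough arithmetic check on the Bengali comparison, the sketch below averages the “NE Euro” and “Asian” columns of the Bengali rows in the table, split into Brahmins and everyone else. The variable and function names are mine, and unweighted means over a handful of self-reported individuals are a blunt instrument, so treat the output as a sanity check rather than an estimate.

```python
from statistics import mean

# (group, NE Euro %, Asian %) copied from the Bengali rows of the table.
bengali_rows = [
    ("Bengali", 5, 8),
    ("Baidya", 5, 12), ("Baidya", 6, 12),
    ("Brahmin", 11, 4), ("Brahmin", 11, 4), ("Brahmin", 10, 4),
    ("Brahmin", 8, 6), ("Brahmin", 8, 5), ("Brahmin", 10, 4),
    ("Brahmin", 10, 7),
    ("Muslim", 5, 15), ("Muslim", 4, 15), ("Muslim", 4, 17),
    ("Muslim", 2, 16), ("Muslim", 3, 19), ("Muslim", 2, 18),
    ("Muslim", 5, 15),
]

def column_means(rows):
    """Return (mean NE Euro %, mean Asian %) for the given rows."""
    return mean([ne for _, ne, _ in rows]), mean([asian for _, _, asian in rows])

brahmins = [row for row in bengali_rows if row[0] == "Brahmin"]
non_brahmins = [row for row in bengali_rows if row[0] != "Brahmin"]

brahmin_ne, brahmin_asian = column_means(brahmins)
other_ne, other_asian = column_means(non_brahmins)

print(f"Brahmin:     NE Euro {brahmin_ne:.1f}%, Asian {brahmin_asian:.1f}%")
print(f"Non-Brahmin: NE Euro {other_ne:.1f}%, Asian {other_asian:.1f}%")
print(f"NE Euro ratio (Brahmin / non-Brahmin): {brahmin_ne / other_ne:.2f}")
```

On these rows the Brahmin “NE Euro” mean comes out a little more than twice the non-Brahmin mean, while the “Asian” mean is roughly a third, in line with the pattern described above.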

When you look at South Indians from the four Dravidian states, four facts strike me as notable:

– There is a distinct difference between Brahmins and non-Brahmins (most of the non-Brahmins Zack has in the Harappa data set are upper caste, though the public data sets have Dalits and tribal populations).

– There is very little difference between South Indian Brahmins by region and sect (e.g., Iyengars and Iyers are Tamil Brahmins divided by theological differences).

– South Indian Brahmins are genetically distinct from North Indian Brahmins. They seem to have about one half the proportion of the “NE Euro” component found in North Indian Brahmins (e.g., compare to Bengali Brahmins).

– South Indian non-Brahmin upper castes have very little of the “NE Euro” component, which is found at low but consistent fractions among non-Brahmins in the Gangetic plain (and at much higher fractions as one moves toward the Punjab).
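The second and third points can be checked roughly against the table. The sketch below takes the Tamil Brahmin rows, grouped by the sect label given in the table, and compares their mean “NE Euro” fraction to that of the Bengali Brahmin rows. Restricting the comparison to Tamil versus Bengali Brahmins is my simplification, the names are illustrative, and the sample sizes are small.

```python
from statistics import mean

# (S Indian %, NE Euro %) for Tamil Brahmin rows, keyed by the table's sect label.
tamil_brahmins = {
    "unspecified": [(48, 5), (48, 5), (48, 6), (47, 4), (47, 5), (46, 6)],
    "Iyengar": [(50, 8), (47, 4), (47, 6)],
    "Iyer": [(48, 5), (48, 5), (47, 5), (47, 8), (43, 5)],
}

# NE Euro % for the seven Bengali Brahmin rows in the table.
bengali_brahmin_ne_euro = [11, 11, 10, 8, 8, 10, 10]

# Per-sect means: similar values across sects suggest little sect-level structure.
per_sect = {
    sect: (mean([s for s, _ in rows]), mean([ne for _, ne in rows]))
    for sect, rows in tamil_brahmins.items()
}
for sect, (s_indian, ne_euro) in per_sect.items():
    print(f"{sect:12s} S Indian {s_indian:.1f}%  NE Euro {ne_euro:.1f}%")

# Pooled Tamil Brahmin NE Euro relative to Bengali Brahmins.
all_tamil_ne_euro = [ne for rows in tamil_brahmins.values() for _, ne in rows]
ratio = mean(all_tamil_ne_euro) / mean(bengali_brahmin_ne_euro)
print(f"Tamil / Bengali Brahmin NE Euro ratio: {ratio:.2f}")
```

The per-sect means sit within a point or two of each other, and the pooled ratio lands a bit above one half.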

I do not know much about the origin of the Pancha-Dravida group of Brahmins, but they look to be endogamous, to derive from the same source, and to have probably had some admixture with the local substrate early on. This would explain their uniformity and their lower fraction of “NE Euro” in relation to North Indian Brahmins. The results above also suggest that the Syrian Christians derive from converts from the Nair community, or related communities. This should not be surprising.

Finally, let’s move to North India, and the zone stretching between Punjab in the Northwest and Bihar in the East. Though in much of this region Brahmins have higher “NE Euro” fractions, this relationship seems to break down as you go northwest. The Jatt community in particular seems to have the highest fraction in the subcontinent. There are inchoate theories for the origins of the Jatts in Central Asia. I had dismissed them, but am thinking now they need a second look. The reasoning is simple. The Jatts of the eastern Punjab have a higher fraction of “NE Euro” than both populations to their northwest (Pathans, Kalash, etc.) and Brahmin groups in their area (e.g., Pandits) who are theoretically higher in caste status. The violation of these two trends implies something not easily explained by straightforward social and geographic processes. The connection between ancestry and caste status also seems to break down somewhat in the Northwest, as there is wide variation in ancestral components.
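The Jatt comparison can be laid out the same way. The sketch below compares the mean “NE Euro” fraction of the Punjabi Jatt and Haryana Jat/Jatt rows against the northwestern and Brahmin reference rows from the table. Note the assumptions: the reference values for Pathan, Kalash, Burusho and Kashmiri Pandit are aggregate averages from public data sets while the Jatt rows are individuals, and treating the Haryana rows as a separate group is my own choice.

```python
from statistics import mean

# NE Euro % for individual Jatt rows in the table.
jatt_ne_euro = {
    "Punjabi Jatt": [10, 14, 13, 13, 15, 13, 11],
    "Haryana Jat/Jatt": [17, 17, 20, 17],
}

# NE Euro % for comparison groups (aggregates except Punjabi Brahmin).
reference_ne_euro = {
    "Pathan (HGDP)": 11,
    "Kalash (HGDP)": 11,
    "Burusho (HGDP)": 10,
    "Kashmiri Pandit (Reich)": 9,
    "Punjabi Brahmin (mean of 2 rows)": mean([11, 10]),
}

jatt_means = {group: mean(values) for group, values in jatt_ne_euro.items()}

# Does each Jatt group's mean exceed every reference value?
exceeds_all = {
    group: all(group_mean > ref for ref in reference_ne_euro.values())
    for group, group_mean in jatt_means.items()
}

for group, group_mean in jatt_means.items():
    print(f"{group}: mean NE Euro {group_mean:.2f}%, "
          f"above all references: {exceeds_all[group]}")
```

Both Jatt groups come out above every reference value, with the Haryana rows well above, though individual Punjabi Jatts overlap the Pathan and Kalash averages.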

Someone with more knowledge of South Asian ethnography should weigh in. But until then I invite readers of South Asian heritage to submit their results to Zack.
