The Parsis are about 25% South Asian genetically

[Figure: Parsi mtDNA]

In the comments below I noted that the Parsi people of India, who reputedly arrived from Iran ~1,000 years ago, are about 25 percent South Asian. By this I mean that their ancestry is about 75 percent Iranian (presumably Persian), with 25 percent admixture from the South Asian populations among whom they lived. But my feeling about this was vague, so I decided to check the scientific literature. Unfortunately there hasn’t been a lot of work done in this area with cutting-edge genomics. But a cursory examination shows that there’s been substantial gene flow from Indian women into the Parsi population, visible in the mtDNA. In the figure to the right you see that “PA”, the Parsis, have a lot of “South Asian” mtDNA lineages compared to the Iranian groups. These mostly consist of South Asian branches of haplogroup M, and they jump out immediately when you look at the mtDNA haplotypes the Parsis carry. I found less on the Y chromosomes, which are less informative for differentiating South Asians from Iranians in any case (the mtDNA difference between these two regions is much greater), but what I did find is that Parsis can be modeled as 100% Iranian on their paternal lineages. This is probably an exaggeration, but as a stylized fact I think it gets to the heart of the matter.

But what would really be useful are autosomal results. Those were hard to find. Noah Rosenberg’s 2006 paper on Indian genetic differentiation using microsatellites did have a Parsi sample. If you look at the results, the Parsis do seem South Asian, roughly equivalent to Pathans, an Iranian-speaking group in Pakistan with strong South Asian affinities. But the sample set does not include any Iranian groups from Iran proper, only Middle Eastern groups from the Arab world or the Caucasus. Without such a reference population it is hard to gauge how closely the Parsis are related to Iranians.

There was one last hope. Harappa DNA has been collecting results for many years now, and I was hoping that there was a Parsi in the sample. There was, just one. I took the Parsi and compared this individual to various Iranian and a few select Indian groups. Here are the admixture results (edited to show only the relevant ancestral clusters):

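| Ethnicity | S.Indian | Baloch | Caucasian | NE.Euro | Mediterranean | SW.Asian |
|---|---|---|---|---|---|---|
| Kurd (Iraqi) | 0 | 29 | 40 | 4 | 6 | 16 |
| Iraqi Arab | 1 | 11 | 30 | 0 | 5 | 44 |
| Kurd (Iraqi) | 1 | 26 | 43 | 5 | 5 | 16 |
| Kurd (Iraqi) | 1 | 28 | 43 | 5 | 5 | 13 |
| Kurd (Iranian) | 1 | 29 | 41 | 7 | 6 | 12 |
| Kurd Zaza Turkey | 2 | 23 | 43 | 6 | 6 | 13 |
| Iranian | 2 | 24 | 43 | 5 | 7 | 13 |
| Kurd (Turkish) | 2 | 26 | 46 | 6 | 6 | 10 |
| Iranian | 2 | 28 | 47 | 7 | 3 | 10 |
| Iranian | 2 | 29 | 43 | 3 | 8 | 8 |
| Iranian | 2 | 30 | 44 | 4 | 2 | 13 |
| Iraqi Arab | 3 | 20 | 39 | 0 | 10 | 19 |
| Kurd Kurmanji Iraq | 4 | 21 | 41 | 4 | 7 | 15 |
| Kurd from Turkey | 4 | 24 | 41 | 4 | 8 | 12 |
| Iranian | 4 | 26 | 39 | 7 | 7 | 12 |
| Kurd Yezidi Iraq | 4 | 26 | 39 | 4 | 7 | 13 |
| Iranian | 4 | 27 | 41 | 5 | 6 | 11 |
| Iranian | 4 | 29 | 37 | 4 | 4 | 12 |
| Iraqi Arab | 5 | 19 | 38 | 5 | 7 | 19 |
| Kurd Kurmanji Iraq | 5 | 24 | 39 | 4 | 8 | 13 |
| Iranian | 5 | 25 | 38 | 5 | 7 | 12 |
| Kurd (Iraqi) | 5 | 27 | 41 | 5 | 5 | 14 |
| Iranian | 6 | 25 | 37 | 6 | 6 | 12 |
| Kurd (Feyli) | 6 | 25 | 38 | 3 | 7 | 14 |
| Iranian Khorasani | 8 | 29 | 35 | 9 | 2 | 11 |
| Afghan Pashtun | 14 | 32 | 25 | 12 | 3 | 4 |
| Pashtun (Kandahar) | 15 | 34 | 25 | 10 | 0 | 5 |
| Mumbai Parsi | 16 | 28 | 28 | 5 | 4 | 12 |
| Afghan Pashtun | 20 | 36 | 17 | 11 | 0 | 5 |
| Afghan Pashtun | 21 | 33 | 17 | 9 | 2 | 2 |
| Pashtun | 21 | 35 | 18 | 10 | 0 | 5 |
| Gujarati Khoja | 28 | 47 | 13 | 7 | 0 | 1 |
| Gujarati Patel Muslim | 34 | 32 | 13 | 3 | 3 | 6 |
| Gujarati Sunni Vohra Surti | 35 | 34 | 13 | 5 | 2 | 4 |
| Gujarati Ganchi | 38 | 42 | 5 | 9 | 3 | 0 |
| Gujarati Vaishnav Vania | 45 | 36 | 4 | 4 | 1 | 3 |
| Gujarati Jain | 46 | 36 | 6 | 4 | 0 | 0 |
| Gujarati Vaniya | 52 | 37 | 2 | 6 | 0 | 1 |
| Gujarati | 53 | 43 | 0 | 0 | 2 | 0 |
| Gujarati | 56 | 39 | 0 | 0 | 2 | 0 |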

The key is to focus on the “South Indian” ancestry. Though this is found in some Iranian groups, it drops off very rapidly once you move past groups like the Pathans. The Parsi individual has a 16 percent South Indian ancestral component. Looking at the Iranian individuals, you would probably expect about 5 percent from that population. The question is, what is the Indian source population? There’s a lot of variation among the candidates. But if you take 50 percent South Indian for the South Asian source population, and model the Parsis as a mix of 25 percent South Asian and 75 percent Iranian, then you get:

(50 percent)*(0.25) + (5 percent)*(0.75) = 12.5 percent + 3.75 percent = 16.25 percent

So, at least going by this one individual, something like ~25 percent is probably correct for how much “native” South Asian ancestry the Parsis have picked up. Since they are genetically quite homogeneous at this point, an N = 1 might be sufficient to reach a conclusion. I’d be curious if anyone finds anything different.
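The arithmetic above can also be run in reverse: instead of plugging in 25 percent and checking that it lands near 16, you can solve for the South Asian fraction directly. What follows is only a rough sketch under a simple two-way mixing assumption, not a formal admixture method. The function name `admixture_fraction` is made up for illustration, the 5 percent Iranian baseline is the eyeballed figure from the text, and the 28 and 56 percent sources are just the lowest and highest Gujarati “South Indian” values in the table.

```python
def admixture_fraction(observed, source_a, source_b):
    """Fraction f of ancestry from source_a in a two-way mix.

    Solves observed = f * source_a + (1 - f) * source_b for f.
    All three arguments are levels of one marker component
    (here the "South Indian" percentage).
    """
    if source_a == source_b:
        raise ValueError("sources must differ in the marker component")
    return (observed - source_b) / (source_a - source_b)


PARSI_S_INDIAN = 16.0    # the single Mumbai Parsi in the table
IRANIAN_S_INDIAN = 5.0   # rough baseline assumed in the text

# Candidate South Asian sources: lowest Gujarati value, the
# 50 percent used in the text, and the highest Gujarati value.
for south_asian in (28.0, 50.0, 56.0):
    f = admixture_fraction(PARSI_S_INDIAN, south_asian, IRANIAN_S_INDIAN)
    print(f"source at {south_asian:.0f}% S.Indian -> "
          f"{100 * f:.1f}% South Asian ancestry")
```

With the 50 percent source this gives about 24 percent, in line with the estimate above. The answer is sensitive to the choice of source, though: the lowest Gujarati value pushes it to nearly half, while the highest brings it down to a bit over 20 percent.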
