Rakhigarhi sample doesn’t have steppe ancestry (probably “Indus Periphery”)

We’ve been waiting for two years now, and it looks like they’re about to pull the trigger. From “Indus Valley People Did Not Have Genetic Contribution From The Steppes: Head Of Ancient DNA Lab Testing Rakhigarhi Samples”:

Niraj Rai, the head of the Ancient DNA Laboratory at Lucknow’s Birbal Sahni Institute of Palaeosciences (BSIP), where the DNA samples from the Harappan site of Rakhigarhi in Haryana are being analysed, has revealed that a forthcoming paper on the work will show that there is no steppe contribution to the DNA of the Harappan people….

“It will show that there is no steppe contribution to the Indus Valley DNA,” Rai said. “The Indus Valley people were indigenous, but in the sense that their DNA had contributions from near eastern Iranian farmers mixed with the Indian hunter-gatherer DNA, that is still reflected in the DNA of the people of the Andaman islands.” He added that the paper based on the examination of the Rakhigarhi samples would soon be published on bioRxiv (pronounced “bio-archive”), a preprint repository of papers in the life sciences.

At this point none of this is surprising. I also wonder if this preprint was hastened by the release of The Genomic Formation of South and Central Asia. It seems that the results here are totally consonant with what came before. My expectation is that the lone sample that they got genetic material out of will be similar to the “Indus Periphery” (InPe) individuals in the earlier preprint: a mix of West Asian ancestry, strongly shifted toward eastern Iran, and indigenous South Asian “hunter-gatherer.” That’s pretty much what Niraj Rai states in the piece. I think genetically the individual won’t be that different from the Chamars of modern day Punjab.

In fact, Rai, the lead researcher, ends by twisting the knife:

In other words, the preprint observes that the migration from the steppes to South Asia was the source of the Indo-European languages in the subcontinent. Commenting on this, Rai said, “any model of migration of Indo-Europeans from South Asia simply cannot fit the data that is now available.”

A major caveat here is that we’re talking about one sample from the eastern edge of the Indus Valley Civilization (IVC). I’m not sure that this should adjust our probabilities that much. From all the other things we know, as well as copious ancient DNA from Central Asia, our probability for the model which the Rakhigarhi result aligns with should already be quite high.

Again, since it’s one sample, we need to be cautious…but I bet once we have more samples from the IVC the Rakhigarhi individual will probably be enriched for AASI relative to other samples from the IVC. The InPe samples in The Genomic Formation of South and Central Asia exhibited some variation, and it’s likely that the IVC region was genetically heterogeneous.

But, this is going to be a DNA sample from an individual who lived 4,600 years ago within the orbit of the IVC when it was in its mature phase. That’s still a big deal. As most of you know the IVC is prehistory because we haven’t deciphered the seals which are associated with this civilization. But, the IVC clearly had relationships with West Asia and Central Asia, with parts of eastern Iran and the BMAC culture both being influenced by it and interacting with it. Traders who were likely from the IVC seem to be mentioned in Mesopotamian records.

Additionally, the genetics of one individual can be highly informative if it’s high-quality whole-genome data (I’m skeptical of that in this case). One could possibly even identify the time period that admixture between West Asian and AASI components occurred from a single genome, by looking at ancestry tract lengths.
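A minimal sketch of the tract-length idea, assuming the simplest possible model: a single pulse of admixture, after which tracts of the incoming ancestry are roughly exponentially distributed with mean length 1 / ((1 - m) × g) Morgans, where m is the admixture fraction and g the number of generations elapsed. The function name, the tract lengths, the 50% fraction, and the 28-year generation time are all illustrative assumptions, not values from the Rakhigarhi data.

```python
def generations_since_admixture(tract_lengths_cm, source_fraction):
    """Estimate generations since a single admixture pulse.

    Assumes tracts of the source ancestry are exponentially distributed
    with mean 1 / ((1 - source_fraction) * g) Morgans (a textbook
    single-pulse approximation, not a full demographic model).
    """
    if not tract_lengths_cm:
        raise ValueError("need at least one tract length")
    if not 0.0 < source_fraction < 1.0:
        raise ValueError("source_fraction must be strictly between 0 and 1")
    # Convert the mean tract length from centimorgans to Morgans.
    mean_morgans = sum(tract_lengths_cm) / len(tract_lengths_cm) / 100.0
    return 1.0 / ((1.0 - source_fraction) * mean_morgans)


# Hypothetical tract lengths (in cM) called in one genome.
example_tracts_cm = [1.0, 3.0, 2.0, 2.0]
estimated_generations = generations_since_admixture(example_tracts_cm, 0.5)

# Assumed generation time in years; published values vary.
ASSUMED_YEARS_PER_GENERATION = 28
estimated_years = estimated_generations * ASSUMED_YEARS_PER_GENERATION
print(f"{estimated_generations:.0f} generations, ~{estimated_years:.0f} years")
```

Shorter average tracts imply older admixture, which is why a single high-coverage genome can in principle carry dating information. Real inference would also have to handle multiple pulses, tract-calling error, and recombination-map uncertainty.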

A single sample isn’t going to falsify the idea held by some that steppe peoples were long present within the IVC. Perhaps they’ll show up in other samples? That’s possible, and it’s what I would argue if I held their position, but I think the constellation of evidence on the balance now does suggest that a relatively late incursion into South Asia is likely. The steppe ancestry with Northern European affinities shows up in BMAC only around 4,000 years ago. It is hard to imagine it was in South Asia before it was in Central Asia.

As I’ve been saying for a while it seems that though there will be more genetic work written on India in the near future, the real analysis is going to have to come out of archaeology and mythology.

It’s pretty clear that in Northern Europe the arrival of the Corded Ware peoples from the steppe zone resulted in great tumult. A linguistic analysis suggests that the languages of Northern Europe have words related to agriculture with a non-Indo-European origin, of common provenance.  But we don’t have much in the way of mythos about the arrival of the Corded Ware.

In contrast, India has a rich mythos which seems to date to the early period of the arrival of the Indo-Aryans. One interpretation has been that since these myths seem to take as a given that Indo-Aryans were autochthonous to India, they were. But the genetic data seem to be strongly suggesting that the arrival of pastoralists occurred in South Asia concomitant with their arrival in West Asia, and somewhat after their expansion westward into Europe. Indian tradition and mythos could actually be a window into the general process of how these pastoralists dealt with native peoples and an illustration of the sort of cultural synthesis that often occurred.

36 thoughts on “Rakhigarhi sample doesn’t have steppe ancestry (probably “Indus Periphery”)”

  1. In contrast, India has a rich mythos which seems to date to the early period of the arrival of the Indo-Aryans. One interpretation has been that since these myths seem to take as a given that Indo-Aryans were autochthonous to India, they were.
    The contrast may be strong if one compares India with North-Western Europe, but looking elsewhere, while the Avesta describes the travel of the Iranian ancestors in the general Westerly / SW direction, people argued that they didn’t travel very far and stayed within the general region. The Epic poetry of the Greeks is rich on travel, but it is understood as intra-regional as well; a further similarity with India is that many aspects of material culture, and indeed much Minoan DNA, have survived the Steppe-related changes, and so even today, there is a nearly universal “IE invasion denialism” in Greece (although not “OIG”). There is a lot of continuity of material culture of India since, and through, the Bronze Age (no change in pottery which would parallel European events). But there are numerous Rigveda references to West-to-East travel of the ancestors, and to their semi-nomadic lifestyle amid the rubble of the cities, so the myth isn’t completely silent about the population-on-the-move.

    Back to DNA … if Siberian HG DNA is unique to the descendants of Indo-Iranians (but not Europeans), then it also defines the direction of the travel, right? And makes the Migration Theory more politically palatable by underscoring the non-European nature of the invaders? So the question is, do we know the extent of Westerly reach of the SibHG DNA? Did the ancestors of the Balto-Slavs or Celts lack it?

  2. I want to watch the Bollywood musical of the geneticists’ / archaeologists’ nationalists’ politicking that brought us to this point.

  3. Thanks for doing the honorable thing by affirming that one sample isn’t going to prove anything.

    There are a lot of trust issues between Hindutva folks and researchers. Comments like Rai’s, that this one sample is going to make all the difference to disproving OIT, and doubling down on it by saying “because you have genome-wide data and not just the Y-chromosome”, only help to worsen them, as they give the impression that these guys will say anything to sell their paper.

  4. Who really cares… The darkest skinned tribals have been found to have some of the steppe ancestry and the lightest skinned kashmiri pandits have ancestral south indian heritage in their blood too. Whatever the past, today we are one people.

    People named Bharadvaj and Agnihotri weren’t doing vedic yajnas in the Ukraine. Most of our gods are clearly described as dark or brown/golden skinned and our religious texts are clearly written in South Asia (sapta saindhavah).

    We are one people with a common set of beliefs and heritage. None of these studies change that. In fact they add more fuel to the fire of what’s already been happening in India – i.e. continued elimination of the caste system. I love what’s been happening in many temples now with brahmin priest being an earned title rather than born title.

  5. BTW how does the arrival date of 1600 BC square with earlier results that the splintering of Steppe Y-DNA in today’s Indians stopped 2500-2000 BC? Thanks.

  6. @froginthewell:

    I’m not sure if you subscribe to OIT or hold a skeptical view towards both sides of the debate, but I’ll ask you nonetheless:

    How does the OIT explain the existence and variations of the various other branches of IE beside the Indo-Aryan one? It seems to be that AIT has broad consensus and OIT has none (outside of Indian circles) because the former does a better job of taking all the evidence from the various cultures and geographies into account. The proponents of OIT, on the other hand, seem satisfied to just cast doubt on AIT by focusing all of their attention on the “Indian” evidence, in particular the Indian texts. But they offer no positive evidence for how, say, the Irish or the Albanians came to be descendants of folk wanderings from the Indian subcontinent. I’ve had some arguments with OIT proponents on other forums (Disqus) and their approach is typically to completely deny linguistics and call it a pseudo-science.

  7. BTW how does the arrival date of 1600 BC square with earlier results that the splintering of Steppe Y-DNA in today’s Indians stopped 2500-2000 BC? Thanks.

    there was a lot of substructure on the Y on the steppe. see how yamna r1b turned into srubna/cw r1a.

    There are a lot of trust issues between Hindutva folks and researchers. Comments like Rai’s, that this one sample is going to make all the difference to disproving OIT, and doubling down on it by saying “because you have genome-wide data and not just the Y-chromosome”, only help to worsen them, as they give the impression that these guys will say anything to sell their paper.

    a month ago hindutva twitter was citing the great indian scientist rai to refute my pseudoscience 😉 anyway i think perhaps they sat on the results too long and the release of the earlier preprint took wind out of their sails. 1 genome sample from IVC is a big deal…but, i think at this point we know enough that it doesn’t change the big picture. it causes more problems with OIT types though, but if i was them i’d bring up sample size issues. they’ll keep doing that until we’re at N = 100 or so i assume.

  8. Numinous,

    You need to appreciate some things before making the comparison between the Western (mostly European) scholarship on PIE versus Indian scholarship on the OIT.

    1. First and foremost, Europeans have been studying the origins of the Indo-Europeans for more than 2 centuries, and many countries in Europe have dedicated institutions and scholarship exclusively devoted to it.

    Is the situation in India even remotely comparable? For most of the last 2 centuries we were a British colony. Whatever Indian scholarship existed during the colonial era could not be expected to challenge the colonial narrative, could it? Since Independence, not much has been allocated to any type of research or scholarship in India, forget about Indo-European studies, which is quite exotic, when we compare it with the European nations.

    Since Independence, we have done little to challenge the colonial narrative of Indian history. It’s only now that this is slowly changing with the rise of our economy.

    2. For more than 200 years, there have been countless linguists from the Western countries who have studied the Indo-European languages and who have dedicated their lives to this sole pursuit.

    How many linguists have come out of India in this period, and that too from Indian institutions? Not more than a handful. There is not a single linguist in India devoted to the study of Indo-European languages. The last one was Satya Swarup Mishra, who passed away a few years ago. There is no institution dedicated to the study of Indo-European languages or Indo-European history.

    And it is mostly the linguists who have played the key role in arguing and debating about the PIE homeland. It is they who have reconstructed the hypothetical PIE.

    Given the dedicated research and enormous funding at the disposal of Western IE scholarship for 2 centuries versus the abysmal state in India in the intervening period, how do you expect Indian scholarship to keep up with the West ? Indian scholars simply cannot compete with the funding available to Western scholarship and they cannot keep up with the volumes upon volumes of literature that come out from the West.

    ——————————-

    But the question is – does it mean we should let the Western scholarship do what they want and blindly accept their version of our history? Should we naively assume that Western scholarship on this matter is without prejudice and bias, given the fact that the very reason Aryan Invasion was invented and India was demoted from being the PIE homeland was because of extreme racial prejudice of the Europeans who were uncomfortable with the idea that the genesis of their ‘glorious’ Western civilization lay with the degenerate heathen Hindus? Shouldn’t we at least debate and discuss the Western scholarship and see if it fits with what we know of our history?

    And if you want to know about some of the biases and prejudices of Western scholarship let me just give you one example (there are others but it would be too lengthy to list more than one)

    The most popular model of PIE origins is the Pontic Caspian steppe hypothesis. This hypothesis was first propounded by the archaeologist Marija Gimbutas and its most well known proponents today are James Mallory & David Anthony. The whole basis for the theory was the finding of evidence that there was some sort of migration/invasion of steppe cultures into Europe around the turn of the 3rd millennium BC. This was automatically correlated with the migration of Indo-Europeans into Europe. It led to the argument that in fact the steppe was the original homeland of the Indo-Europeans. However, there was no attempt made to find the archaeological evidence that showed the migration/invasion of steppe cultures into South Asia, Central Asia or Iran.

    To this date, as James Mallory himself admits, these Western academics simply do not have much archaeological evidence to show how the steppe people got into the advanced civilizations of Central Asia, Iran & South Asia and managed to change their entire religio-socio-linguistic landscape.

    Look at David Anthony’s book ‘The Horse, the Wheel, and Language’ and see how much time he spends discussing the migration of IE languages from the steppe into Europe and how pitifully few pages with extremely flimsy evidence he dedicates to the movement of Indo-Aryans into South Asia.

    You can also go through this blogpost to see the questionable manner in which Sanskrit was demoted from its initial exalted status

    http://new-indology.blogspot.in/2013/07/indo-european-linguistics-indo-iranian.html

    If you find such biases acceptable it’s your choice.

    ——————————–

    And so that is what some of the Indians are trying to do. It is indeed a worthwhile thing to do. It is not borne out of some prejudice or bias or inferiority complex.

    And if you want to know, there is good enough evidence to make a linguistic and archaeological case for PIE in South Asia.

    THE LINGUISTIC CASE

    There is a massive book by Soviet scholars Gamkrelidze and Ivanov who argue for the Indo-European homeland in Eastern Turkey / Armenia. They showcase various lines of evidence to support their theory. Quite a few of them, also easily support an Indian homeland origin for PIE. Let me list a few:-

    1. The authors try to reconstruct the flora and fauna of the PIE homeland among several other things. In this there are 3 animals of PIE they reconstruct, which are found nowhere else in the IE world but India – the lion, the elephant & the monkey.

    2. They talk of the PIE homeland being close to high mountains with swift flowing rivers. Don’t we have the Himalayas ?

    3. They argue that the European languages had contacts with Central Asian languages and therefore in their model they make the European languages come from Anatolia all the way to Central Asia before moving north to the steppe and going to Europe. In an Indian homeland scenario, will not this be much better explained? The IE languages have to just move North from South Asia into Central Asia and then take the route from there to Europe. What’s more, Indian literature explicitly talks of the Druhyu people with origins in Punjab who were forced to migrate to Afghanistan and who eventually spread towards the North in Central Asia.

    4. They argue that the PIE people were one of the independent domesticators of cattle. South Asia, apart from the Near East, is the only other place with an independent domestication of cattle which is the Bos indicus or Zebu.

    5. The Indo-European homeland was apparently in close proximity with the Near Eastern civilizations since it shows links with the Sumerian, Semitic and other languages of the Near East. The Indus civilization was in an ideal location to effect such an exchange.

    https://archive.org/stream/GamkrelidzeAndIvanovIndoEuropeanAndTheIndoEuropeans1995/Gamkrelidze%20and%20Ivanov_IndoEuropean%20and%20the%20IndoEuropeans_1995_djvu.txt

    Johanna Nichols, a well known linguist, mostly agreed with the evidence marshalled by Gamkrelidze & Ivanov but argued that the evidence much more strongly favours the spread of Indo-European languages from a locus in South Central Asia, in Bactria-Margiana. Her arguments are spread over two long articles and they are beautifully argued. Read them.

    https://www.academia.edu/28869625/The_epicenter_of_the_Indo-European_linguistic_spread

    https://www.academia.edu/18306905/The_Eurasian_spread_zone_and_the_Indo-European_dispersal

    And if you want the archaeological evidence for the same have a look

    https://www.scribd.com/document/326936678/Kaukasus-und-Orient-Mariya-Ivanova

    Read the abstract in English

    Also, refer to chapter 4, page 50 in this following book,

    https://www.dropbox.com/s/gzn50m89zkhurmt/ivanova_2013a.pdf?dl=0

  9. @Razib: “a month ago hindutva twitter was citing the great indian scientist rai to refute my pseudoscience”

    But I always have the luxury to admit that my side doesn’t always cover itself with glory 😀

    I already (in my vague sense) knew that bit about splintering. My question was about the earlier work (I think Poznik was the lead author) which looked at the splintering in Indian R1a-Z93 and concluded that the Steppe migration should have happened 2500-2000 BC.

    What happens to that study if the migration is around 1500 BC? For instance, is this to be explained by saying that the Steppe people were already mixed with Turanic or BMAC folks around 2500-2000 BC before arriving in the Indus valley?

    Which reminds me of another question: for the 1200 BC-onwards Swat valley samples which have Steppe admixture, does one know when these individuals’ admixture must have occurred?

    Thanks again!

  10. Hi Razib – thanks for your posts on David Reich’s new book and on these latest results regarding the IVC. Reich’s chapter on India is good on when India’s caste system might have arisen (2-3,000 years BP), on its durability, and its role in preserving India’s extraordinarily high genetic diversity compared to Europe.

    What it doesn’t (and perhaps can’t) do is explain how and why the caste system arose in India but not in Europe. Reich’s book makes this question even more intriguing because of the evidence it provides on the closely parallel genetic building blocks of Europe and India – hunter-gatherers, ‘first farmers’ and Yamnaya. Wouldn’t this rule out racialist or genetic determinist causes for the caste system? Do you have thoughts on the origins of caste? I don’t think I have ever seen anything convincing on this.

    I have some discussion of these points in this blog post: “Scenes from Two Weddings – England and India” (https://naimisha_forest.silvrback.com/scenes-from-two-weddings). Thanks again!

  11. I already (in my vague sense) knew that bit about splintering. My question was about the earlier work (I think Poznik was the lead author) which looked at the splintering in Indian R1a-Z93 and concluded that the Steppe migration should have happened 2500-2000 BC.

    the z93 mutation is spread across many iranic groups. today it’s found in the altai groups (who are mongolic and turkic mostly actually). it shows up earliest in a srubna sample. i think 1850 bc or so?

    basically the dating

    1) always had interval that was wide
    2) only refers to the emergence of the lineage. the spread to different areas is later. as we get more WGS of south asian r1as we’ll get more clarity. if it’s exogenous, and i think it is, you’ll get a south asian clade which is nested within more diverse c. asian ones.

  12. @froginthewell
    I doubt that Dr. Rai is going to be worsening trust issues between ‘OIT’ supporters sympathetic to Hindutva and researchers. If anything it’s journalists like Tony Joseph who will be responsible. While the OIT might be wrong, some of the points it raises are pretty genuine. In a way the mainstream, i.e. primarily leftist, academia in India (and also their predecessors, the pre-WWII Indologists) were also wrong in somehow making this into a racial and class conflict (between an Indo-Aryan elite and the people of the egalitarian IVC) and somehow the primary basis of all conflict in modern India. He certainly is not doing science a favour by using this issue to attack Hindutva on Twitter. Hopefully he and his colleagues will realize this.

  13. @froginthewell

    You are simply making the question unnecessarily complicated.

    Simply ask where is the Indian L-657 in the data in the SCA paper. All the DNA unearthed till date has not turned out a single L-657. (Including the Genomic History of Eurasian Steppe).

    @razib So who invaded Steppe women?

  14. Jijnasu: thanks. If I may ask, I know quite closely someone who used to have a handle “jijnyasu” – are you he?

    Razib: Thanks. Now I understand your explanation, and it makes sense to me. I don’t know what I’m talking about, but it may be interesting if someone can narrow and possibly forward-shift Poznik’s wide interval (the claim about Z93 expansion between 2500-2000 BC) by crosschecking with the Srubna and other samples. Also, sorry for having bothered you more than the default quota today.

  15. @razib What I am saying is there is just one R1a sample, which is not L657, found in 500 BC. And many steppe mtDNA samples in Swat Valley from 1200 BC to 300 BC.

    Based on uniparental markers it looks like it was an influx of steppe women.

    So where is the R1a-L657 which you say is the steppe marker for the invasion/migration which ever side you lean to.

  16. So where is the R1a-L657 which you say is the steppe marker for the invasion/migration which ever side you lean to.

    i’m not talking much about uniparentals for a reason. it’s not clear what’s going on with this data.

    don’t put words into my mouth or i won’t publish your comments (not joking, watch yourself).

    you can think what you want. honestly, i don’t even care anymore.

  17. @Jaydeepsinh Rathod:

    Believe it or not, I’m well-aware of the politics and history surrounding this topic, but thanks for the excellent summary nonetheless.

    My personal evolution on this issue is something you may not have expected, but I came to be aware of the problems with the AIT model (as outlined in my school history textbooks growing up) back in the late 90s, when I first got on the Internet. I found the arguments for how the references in the Vedas may have been misrepresented (e.g., mapping “dasas” or “dasyus” to Dravidians, who were portrayed as dark-skinned aboriginals defeated by the Aryans. I think all of those criticisms still stand) quite convincing, and for a while, I subscribed to the Out-of-India theory, not knowing much (if anything) about linguistics, genetics, nor archaeology.

    It was only later when I read more about the IE issue, especially the data on all the IE cultures outside of India, that I started to find the OIT model untenable, and that is the view I still hold, regardless of the (many sound) objections you raise to the AIT/AMT model. Anthony’s book was one of the factors in my evolution, but so have Razib’s many writings on this topic over the past decade.

    The AIT/AMT model seems to me to be the more plausible (or Occam’s Razor) explanation by far. OIT proponents are right to wonder about why the Aryans failed to remember and record their peregrinations from the north, but then could the question not be flipped at them: if the Irish are the remnants of the Druhyu and the Iranians of the Anu (or whatever best fits the RigVedic accounts), then how come they do not have any memory of migration from a warm southern land? And if it was Indians migrating to the north and west rather than the other way round, how come there is far more skin color variation in India than in those lands?

    Lastly, as someone who has lived in the West for an extended period of time (though I eventually moved back to India), it is impossible for me to so easily and flippantly ascribe malice and bad faith to the research being carried out there. I just don’t believe there is a cabal of people who are perennially engaged in the project of undermining India. Maybe that was true back in the 19th century, but it definitely has no basis today. They are doing the best they can with the data they possess. Only if I had never traveled out of India, and saw Steve Sailer’s comments section as being representative of Western academia, might I jump to the opposite conclusion.

  18. @razib

    I know what I asked. I think you misunderstood it as only pertaining to the data from this paper.

    let me reiterate, why has no R1a-L657 been found after combing all of north eurasia by multiple studies. As it stands currently L657 is estimated to be 5500 to 6000 years old.

    Also i feel you should be bold enough to take questions, when you take the liberty to comment and write extensively about anything and everything in public domain. Because when you do no one says I am going to delete your post.

  19. @froginthewell

    Poznik study is a bit dated.

    Check this out, https://www.yfull.com/tree/R-L657/

    They do WGS to uncover the phylogeny of Y-DNA, and it is a trusted source, however the dates are deflated by 15 to 20%. Which is supported by clades found earlier in ancient DNA than their expected TMRCA as per their calculation.

    Majority of R1a-L657 is what is found in India, the sporadic non-L657 we see could really have been remnants of Kushan, Saka, or later intrusions or simply migrants from central asia.

  20. @CallingBS

    The kind predominant in Sintashta and Andronovo (R1a-Z2124) is the second most common clade in South Asia and much more than ‘sporadic’, so apparently the steppe horde was not entirely Amazons. 😉

    I do agree the missing R1a-L657 is a hole in the present theory though; where do you think it comes from? West Siberian Neolithic-type ancestry shared across Central Asia prior to the Chalcolithic?

  21. @Megalophias

    Do you have any credible references to look up what is the percentage of second most frequent clade vs the first?

    All I could find is Wikipedia reference for sample sizes and estimates of R1a prevalence in South Asia. It appeared very sparse and not necessarily a representative sample for all of Indian population. But if there are better sources with that much detail I am interested to know.

  22. @megalophias

    At least based on the current data it DOES look like amazon hordes came riding in.

    Calling Z2124 second most common is just playing with words. Because #100-L657 and #5 – Z2124 also makes it the second most populous. But in the larger scheme of things it is just in noise proportions.

    Z2124 peaks in Afghan Pashtuns and Kyrgyzstan areas which come under central asian.

    And no one denies the later Kushan Saka and historic period intrusions as well as migrants coming in from exactly those regions.

  23. let me reiterate, why has no R1a-L657 been found after combing all of north eurasia by multiple studies. As it stands currently L657 is estimated to be 5500 to 6000 years old.

    no idea. you know the Y better than i do!

    Also i feel you should be bold enough to take questions, when you take the liberty to comment and write extensively about anything and everything in public domain. Because when you do no one says I am going to delete your post.

    you can feel however you want. and i can delete however i want. to leave comments:

    1) you need to be clear
    2) not hector
    3) be substantive

    etc. etc.

    as long as i feel you are fulfilling those conditions your comments will go through. if you aren’t, i won’t let them through. end of story.

  24. also, i’m confused as to why you think there has to be one wave of migrants with one diagnostic r1a lineage? perhaps r1a came later with a different migrant wave.

    in general i put more weight on autosomal since it lends itself to multiple analytic techniques.

    not as sure what’s going on with star shaped phylogenies after the recent horse paper (could be natural selection after all….).

  25. Thanks very much folks for the relatively layman-friendly explanations, it is very nice to hear of these considerations.

  26. @Violet (and CallingBS)

    I don’t think there is anything like a representative sample of South Asians anywhere, it would be very hard to do. 🙂 We just use what we got. (Apologies for the over-sized comment.)

    Take the South Asians from the 1000 Genomes Project – samples of Punjabis, Gujaratis, Bangladeshis, Telugus, and Sri Lankan Tamils, total 260 Y sequences (that I can find anyway). Combined, they have 20% R1a-L657, 7% R1a-Z2125, and 2% other R1a-Z93, so 3 times as much L657 as Z2125. In this case it’s the southerners who have the greater proportion of Z2125 – both Telugus and Tamils have 10-11% Z2125 and 17% L657 (~2:3 ratio), while the Punjabis and Gujaratis have 4% Z2125 and 24% L657 (~1:6) and the Bangladeshis 2% Z2125 and 16% L657 (~1:8).
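The ratios quoted above can be checked with a few lines of arithmetic. This is only a sketch using the percentages as stated in the comment; the 10.5% figure for Telugus and Tamils is an assumed midpoint of the quoted "10-11%", and the function and dictionary names are illustrative.

```python
def l657_per_z2125(z2125_percent, l657_percent):
    """Return how many L657 carriers there are per Z2125 carrier."""
    if z2125_percent <= 0:
        raise ValueError("Z2125 percentage must be positive")
    return l657_percent / z2125_percent


# (Z2125 %, L657 %) as quoted in the comment; 10.5 is an assumed midpoint.
quoted_percentages = {
    "All five samples combined": (7, 20),
    "Telugus and Tamils": (10.5, 17),
    "Punjabis and Gujaratis": (4, 24),
    "Bangladeshis": (2, 16),
}

ratios = {
    group: l657_per_z2125(z2125, l657)
    for group, (z2125, l657) in quoted_percentages.items()
}

for group, ratio in ratios.items():
    print(f"{group}: about {ratio:.1f} L657 per Z2125")
```

The quoted "~2:3", "~1:6" and "~1:8" figures are Z2125:L657, i.e. the inverse of what the function returns.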

    In Underhill et al’s large survey of R1a, “The phylogenetic and geographic structure of Y-chromosome haplogroup R1a”, there is a wide range of samples geographically but mostly unidentified as to caste or ethnicity. In total 12% R1a-L657 (M780 actually) and 4% R1a-Z2125, so overall R1a is lower, but again 3 times as much L657 as Z2125. The highest proportion of Z2125 is in North Pakistan, where it actually exceeds L657 at 5:1, while in South Pakistan L657 is almost twice as common as Z2125. Keep in mind though that the Pakistanis here are roughly equal amounts of Kalash, Burusho, Hazara, Pashtun, Balochi, Brahui, Makrani, and Sindhi, hence wildly unrepresentative. Nepal has almost all L657 and no Z2125; a tribal population from Andhra Pradesh has 2:5 Z2125:L657, the same proportion as the Northwest Indian sample; the East India sample has only 4% R1a, mostly neither L657 nor Z2125; the Northeast and South Indian samples have no Z2125; the Central Indian sample (14% R1a) is entirely Z2125.

    So yeah, CallingBS might want to change his name to TalkingBS: R1a-Z2125 is certainly not “noise level” and is well-represented even in the furthest south.
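For anyone who wants to check the ratios in comment 26, here is a minimal sketch that recomputes them from the percentages quoted there for the 1000 Genomes samples. The dictionary layout and the function name `l657_per_z2125` are made up for illustration, and the 10.5% figure is an assumed midpoint of the "10-11%" quoted for Telugus and Tamils; no per-population sample sizes are given in the thread, so this works only from the rounded percentages.

```python
# Percentages as quoted in comment 26 (1000 Genomes South Asian samples).
# 10.5 is an assumed midpoint of the quoted "10-11%" for Telugus and Tamils.
quoted = {
    "Telugu + Tamil": {"Z2125": 10.5, "L657": 17.0},
    "Punjabi + Gujarati": {"Z2125": 4.0, "L657": 24.0},
    "Bangladeshi": {"Z2125": 2.0, "L657": 16.0},
    "All combined": {"Z2125": 7.0, "L657": 20.0},
}


def l657_per_z2125(row):
    """How many L657 carriers there are for each Z2125 carrier."""
    return row["L657"] / row["Z2125"]


ratios = {group: l657_per_z2125(row) for group, row in quoted.items()}

for group, ratio in ratios.items():
    print(f"{group}: about {ratio:.1f} L657 per Z2125")
```

Because the inputs are rounded percentages, the results can only be expected to match the approximate ratios in the comment (~2:3, ~1:6, ~1:8, and roughly 3 to 1 overall), not exact counts.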

  27. @megalophias

    You are clutching at straws to push a narrative. Here’s why:

    1) You are looking at percentages of small communities in isolation which is not representative of India.

    2) All your calculations of Z2125 not being at noise level take a hit when you calculate the aggregate percentage of Z2125 for South Asia as a whole, which comes out to a mere 3%.

    3) If Z2125 is found at equal frequency in the Northwest, the supposed entry point for any incursions, and in the South among isolated tribals, it does not look like we are looking at any Indo-Aryan-speaking people here.

    4) The 1000 Genomes Project has sourced most samples from diaspora Indian populations, which are not totally representative of the real diversity within the country, for obvious reasons.

    So, going back to the main sequence: where is the L-657, and where did it come from?

    And I will not be as irrational as asking you to change your name or anything if you cannot find where it is.:)

    Hopefully next you will not say it came from Nepal because Nepal shows a frequency of 64% of L-657 in the paper you quote. 🙂

  28. @CallingBS

    I got no narrative to push, you have no actual data to respond with, so I guess we are done here.

  29. @Megalophias

    Thanks for pointing out the sources. I knew of 1000 Genomes, but in my view it is nowhere near a representative sample. This is based on what I know about Telugus and their popular emigration patterns.

    There are 62 unrelated male samples of UK-based Telugus in the dataset. (There are >40 million Telugu males per the 2011 census.) It is a very sparse sample, but better than nothing.

    However, the UK is an attractive place for medical doctors, since a US visa is very difficult to obtain for MBBS doctors from India (unless the 99th percentile in the USMLE is obtained). So, most people going to the UK are doctors (unless they are on-site IT people), while all other Telugus line up for Houston (jk :)). In any case, medicine is more elitist than engineering in the Telugu states (competition for admission, minimum yearly fees, etc). So, it is more likely that Reddys, Kamma, and Kapu will be over-represented in the sample (one point of reference is the Telugu cultural association memberships).

    As noted in Narasimhan et al, Reddys and other upper-caste non-Brahmins are already tilted toward higher steppe. There are >10 million SC/ST Telugu males (2011 census), and it is unlikely that the ~20-25% R1a derived from 62 samples of Telugus will be applicable to them at all. I mean, what kind of ratios are representative when we are talking about 10-12 people from an already biased sample?

    So, it is kind of disappointing that the narrative is driven by this kind of sparse and unrepresentative stats. Hope a wealthier Indian population in future will help crowd-source more data using personal genomics.
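To put a rough number on how sparse 62 samples is, here is a minimal sketch of a Wilson score interval for a proportion. The count of 14 R1a carriers out of 62 is a hypothetical stand-in for the "~20-25%" mentioned in comment 29, not a figure taken from the dataset, and the function name `wilson_interval` is only for illustration.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 is about 95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half_width, centre + half_width


# Hypothetical count: 14 of 62 is about 22.6%, inside the quoted "~20-25%" range.
low, high = wilson_interval(14, 62)
print(f"R1a share among the sampled group: {low:.1%} to {high:.1%}")
```

With these assumed inputs the interval runs from roughly 14% to 34%, and that covers random sampling error only. It says nothing about the selection bias described above (who emigrates to the UK), which no interval computed from the same sample can correct.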

  30. So, it is more likely that Reddys, Kamma, and Kapu will be over-represented in the sample (one point of reference is the Telugu cultural association memberships).

    there are 3 brahmins in the telugu data set. mostly reddy etc. only few dalits. many more in the tamil data from sri lanka.

  31. Hi Razib,

    Can you please point me to the source of where to get this caste data?

    I opened several spreadsheets from 1000 genomes website but couldn’t figure out where caste, age etc are located.

  32. violet, it’s not in the 1000 genomes. i simply plotted them against caste data from estonian biocentre. i blogged the results in 2015. it’s in my archives.

  33. >> The steppe ancestry with Northern European affinities shows up in BMAC only around 4,000 years ago. It is hard to imagine it was in South Asia before it was in Central Asia.

    Actually, Narasimhan et al. mention that they did not find Steppe ancestry in the BMAC samples but only in the Swat samples (the Steppe people bypassed BMAC and interacted with north-west India).
