R1a1 and the peopling of Eurasia

A few weeks ago people in the comments were nagging me a bit about some new papers on the haplogroup R1a1. This Y chromosomal lineage is found at very high frequencies from East-Central Europe into India. Initially, researchers such as Spencer Wells assumed that R1a1 signaled the arrival of Indo-Aryans to the Indian subcontinent, since its frequencies decline along a northwest-to-southeast gradient, and from high to low castes. In Europe the modal frequencies are among Slavic groups, with a high representation among Germanic-speakers. The frequency of R1a1 declines sharply in Western and Southern Europe. It is very common in Central Asia as well as eastern Iran and Afghanistan. One parsimonious explanation would be that R1a1 spread with Kurgan males, along with Indo-European languages, on the order of 4-5,000 years ago.

There is a problem with this model though. One of the new papers reiterates the finding that the coalescence of the European and South Asian lineages is on the order of 10,000 years ago: Separating the post-Glacial coancestry of European and Asian Y chromosomes within haplogroup R1a (R1a1 is the dominant clade within R1a). A second paper reports the finding that R1a1 is very diverse in India, indicating deep time depth: The Indian origin of paternal haplogroup R1a1* substantiates the autochthonous origin of Brahmins and the caste system. For both R1a1 and “Ancestral North Indians” (ANI) in Reich et al., the frequency seems intuitively way too high among tribal populations, even in South India. Remember that the low bound for ANI was ~40%. R1a1 is found at frequencies as high as 25% or so among some South Indian tribals. If this lineage arrived with the Indo-Aryans it is peculiar that it is found in such high frequencies in populations which were marginal and isolated from the dominant non-Indo-Aryan populations of South India. Back to Europe, here is a section from the abstract of the first paper:

Conversely, marker M458 has a significant frequency in Europe, exceeding 30% in its core area in Eastern Europe and comprising up to 70% of all M17 chromosomes present there. The diversity and frequency profiles of M458 suggest its origin during the early Holocene and a subsequent expansion likely related to a number of prehistoric cultural developments in the region. Its primary frequency and diversity distribution correlates well with some of the major Central and East European river basins where settled farming was established before its spread further eastward. Importantly, the virtual absence of M458 chromosomes outside Europe speaks against substantial patrilineal gene flow from East Europe to Asia, including to India, at least since the mid-Holocene.

The Holocene started 11,700 years ago. We are living in the Holocene. So that means that gene flow can’t be any later than 6,000 years ago. The paper which focuses specifically on Indian lineages reports a coalescence time on the order of 10,000 years in the past for South Asian R1a1 branches. Additionally, they confirm earlier findings of a caste ranking of R1a1 in terms of frequency, as well as Brahmins having the most diversity of all groups in terms of haplotypes (ergo, the title of the paper).

Both Dienekes and Polish Genetics and Anthropology suggest that the calibration is wrong on these coalescence times. They argue that one should reduce the time to a common ancestor by a factor of 3. This would of course make a huge difference. With regard to the Reich et al. paper, which argued for a plausible two-way admixture between ANI and “Ancestral South Indians” (ASI), the linkage disequilibrium has decayed too much from the time of admixture to peg a date. This was a method used to calculate the emergence of the Uyghurs as a hybrid population, on the order of 2-3 thousand years ago (admixture between two very different populations generates linkage disequilibrium which decays over time due to recombination). In terms of Fst the ANI have a value in relation to Northern Europeans which is about 3 times larger than the mean between-population differences in Europe. This is somewhat greater than the pairwise values between any European populations except for the Baltic peoples (in particular, the swath from Karelia to Lithuania) to the groups of Southern Europe. The degree of Neolithic Middle Eastern ancestry within Europe is under debate, but I think one can assume that Southern Italians and Karelians are likely at opposite ends in terms of frequency of this contribution to the pre-Ice Age demographic substratum of Europe. From this I offer that it is not totally unreasonable to posit that the ANI contribution to South Asian ancestry was closer to the margins of the last Ice Age, rather than the period of the Indo-European expansion, and that its Fst values are not unreasonable in relation to modern European groups.
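To get a feel for the two quantitative points above, here is a back-of-the-envelope sketch in Python. It is only an illustration, not the method used in any of the papers. The factor of 3 is the rescaling mentioned above; the 25-year generation time and the 1% recombination fraction between two markers are my own illustrative assumptions, and the function names are made up for this example. The decay formula is the textbook one, where linkage disequilibrium between two loci shrinks by a factor of (1 - r) each generation.

```python
# Rough sketch only. GENERATION_YEARS and the recombination fraction used
# below are illustrative assumptions, not values taken from the papers.

GENERATION_YEARS = 25.0  # assumed average generation time


def rescale_age(age_years, factor=3.0):
    """Apply the 'divide by three' recalibration to a coalescence estimate."""
    return age_years / factor


def ld_fraction_remaining(years_since_admixture, recomb_fraction=0.01):
    """Fraction of the initial admixture LD left between two loci.

    Uses the standard decay D_t = D_0 * (1 - r)^t, where t is the number of
    generations and r is the recombination fraction between the loci.
    """
    generations = years_since_admixture / GENERATION_YEARS
    return (1.0 - recomb_fraction) ** generations


if __name__ == "__main__":
    # A ~10,000 year coalescence estimate under the alternative calibration.
    print(f"rescaled age: {rescale_age(10_000):.0f} years")

    # LD left after a Uyghur-like admixture date vs. a much older one.
    for years in (2_500, 10_000):
        left = ld_fraction_remaining(years)
        print(f"{years} years: {left:.1%} of initial LD remains")
```

Under these assumptions an admixture event a couple of thousand years old still leaves a sizable share of the original signal between markers this far apart, while one several times older leaves very little, which is the intuition behind not being able to peg a date when the decay has gone too far.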

The main issue that is confusing is the diversity of R1a1 in South Asia. A first order model going from just this data would be that R1a1 derives from India, and spread to the Eurasian plain. But Reich et al. show data that imply little likelihood of South Asian contribution to European ancestry. The only possibility would be if ANI and ASI were totally separated when a branch of ANI left South Asia for the Eurasian plain, at which point the process of admixture between ANI and ASI began. Another possibility is that the distribution of R1a1 in Eurasia is a palimpsest. Recent work in ancient DNA is suggesting that inferring past distributions from contemporary ones may lead us astray. It could be that R1a1 was once far more diverse in Europe and Central Asia, but that subsequent demographic events eliminated most of that diversity, while such events did not occur in South Asia. Y chromosomal lineages may be particularly likely to be wiped out by the expansion of new tribes as old elites are killed or marginalized. The current distribution of a particular branch of R1a1 in Europe, associated in particular with Slavs, may be an expansion of the lineage which managed to survive elimination at some point in the mid-Holocene.

Though do note I put little weight in my speculations. It seems rather confusing. But since I was asked….

  1. Razib, you said… 
    “In Europe the modal frequencies are among Slavic groups, with a high representation among German-speakers” 
    (Assuming by “German-speakers” you mean Germany….) Wonder how much of that R1a1 got put into Germany when the Russians left Germany during the Second World War?! (The story I’ve been told is that there wasn’t a virgin left in the country.) There was also the Germanizing of Polish children. 
    I’m not even sure what the magnitudes of these are, so maybe it’s trivial.

  2. *polish genes* points out that r1a1 has been found in ancient DNA in modern germany, so it’s old. also, i really mean germanic, so i’ll fix that. it’s in relatively high frequencies in scandinavia. it’s low in england, but there are signs of a east-west cline. might be the saxon + scandinavian settlements of the early medieval period (probably disproportionately male).

  3. Don’t understand why you highlighted the M458 marker and the flow from East Europe to Asia. Why this obsession? Aryan is just an old-fashioned, and not very applicable, name for a language family. I don’t know anyone, anyone who thinks, who accepts that Europeans seeded Northern India or Iran, or contributed genetically to any of those countries’ peoples, high or low caste, Brahman or Tribal. 
    Dienekes has a thing with dating methods, particularly, I have found, with J-M267, which he wants to believe is of very recent origin, and totally due to recent foreigners to Europe from the historic Middle East, namely Jews and Arabians. It is like the opposite of the “Aryan” thing of India. In this case true Europeans are R1b, R1a, and I. Anyone who is E or J or G is not authentic, especially the J people, though J2 has been whitewashed and made European due to its high frequencies in guess where, Greece and Italy. J2 is just as “Middle Eastern” as J1 or even haplogroup I, which originated east of geographic Europe. Dienekes just uses the divide by three rule, the Germ Line Mutation theory, to support his views of the origins of different Europeans. It is quasi racism. Separating the Whites from the Darkies. It seems to have been proven that the R group has a Central Asian, West Siberian origin and entered Europe after the LGM. Too much effort has gone into proving beliefs like the Paleolithic continuation of Europe’s population into the modern age with little or no contribution from anywhere outside Europe. Yet every new paper seems to show that no haplogroups in Europe, even the old ones, entered Europe much before the LGM. Recent studies show a discontinuity between modern Europeans and both Paleolithic (actually Mesolithic people living at the time of the farming introduction into Europe) and the Neolithic people who were present in Europe complete with Near Eastern plants and animals. 
    R1a’s entry into both Europe and South Asia probably occurred at the same time and is not autochthonous to either, and has little to do with the mythical blue eyed blond men but men of generalized Caucasoid looks. The Slavic languages are not particularly old, and it is a mistake to correlate language with genetics. The correlation if it exists is most likely coincidental.

  4. Ah, those Kurgan groups. Am reading Cunliffe and his very interesting chapter on the Carpathian basin circa 4000YA and their chariot remains… Marija Gimbutas still has a partial case…