


One of the major weaknesses brought up in The Indo-European Controversy: Facts and Fallacies in Historical Linguistics is that these Bayesian phylogenetic models use lexical information as their data input: in particular, a set of a few hundred cognates. There are two elements to the objection. First, the choice of cognates might be biased, or at least bias the output. Second, vocabulary may not be the best foundation on which to generate a phylogeny of language. Rather, something like grammar may be more phylogenetically informative. The authors of the works under criticism actually state they’re trying to use grammar as an input too. But in any case, the tendency for vocabulary to be exchanged between nearby groups, irrespective of their phylogenetic origin, is presumably the reason that the Romani languages drifted far enough away from the other Indo-Aryan languages to seem like an outgroup. No matter how ingenious your method, if your input data is biased or not informative, your output is not likely to be useful. Pereltsvaig and Lewis allude to the fact that linguistics has not found its “atoms” yet. I’d state it differently: linguistics lacks its DNA sequence. Using a biological analogy, these linguistic applications of Bayesian phylogenetics are attempting to discern evolutionary history from phenotype.

Unfortunately this model is almost certainly wrong for human history. Ancient DNA has revolutionized everything, because it has shown just how punctuated demographic shifts can be. “Ancient DNA reveals key stages in the formation of central European mitochondrial genetic diversity” highlighted this dynamic a few years back. More recently, “Population genomics of Bronze Age Eurasia” and “Massive migration from the steppe is a source for Indo-European languages in Europe” indicate discontinuity. I want to emphasize the term discontinuity, as this is very different from gradual diffusion. Rather than a methodologically individualistic model, where higher fertility in farmsteads or at least villages gradually resulted in the transition from one group to another, a more likely scenario in my opinion is inter-group tension, conflict, and amalgamation. In some cases, near total replacement. It may not always have been violent; rather, agriculturalists on the Malthusian margins may not have been able to withstand the shock of a new culture arriving and sequestering critical resources (an analogy I’m thinking of is the massive collapse of Roman culture in the Balkans whenever the imperial limes withdrew toward the coasts; without state support and scaffolding the way of life of the Latin peasantry just wasn’t feasible, so they quickly migrated or died off).
For example, it looks as if the Uygurs are not descended in large part from the first Indo-Europeans on the fringes of western China. I took the data the Reich lab posted, reduced the number of populations, and ran TreeMix on it. Below are 10 plots. The West Eurasian ancestry of the Uygurs is not overwhelmingly Northern European-like. Weirdly, the graphs below suggest it is somewhat less Northern European than the West Eurasian ancestry contributing to the Hazara! Though that may be an artifact of some sort. The point is that, as suggested by many scholars, it seems highly likely that the Indo-European population of the Tarim basin was composite, and that Tocharians and Indo-Iranians were both present. And they probably did not appear at the same time.
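For anyone who wants to try the population-reduction step themselves, here is a minimal sketch of one way to do it. It is not the script used for the plots in this post. It assumes the usual TreeMix input layout (a header line of population names, then one line per SNP with a pair of allele counts per population). The function names, the file names, the choice of Yoruba as the root, and the allele counts are all made up for illustration.

```python
import gzip
from typing import Iterable, List, Sequence


def subset_populations(lines: Iterable[str], keep: Sequence[str]) -> List[str]:
    """Return TreeMix-style lines restricted to the populations in `keep`.

    Assumed layout: first line is space-separated population names; each
    later line holds one "count1,count2" allele-count pair per population.
    """
    rows = iter(lines)
    header = next(rows).split()
    missing = [pop for pop in keep if pop not in header]
    if missing:
        raise ValueError(f"populations not in header: {missing}")
    columns = [header.index(pop) for pop in keep]
    out = [" ".join(keep)]
    for row in rows:
        fields = row.split()
        if not fields:  # skip blank lines
            continue
        out.append(" ".join(fields[i] for i in columns))
    return out


def write_gzip(path: str, lines: Iterable[str]) -> None:
    """Write lines to a gzipped text file, the form TreeMix reads."""
    with gzip.open(path, "wt") as handle:
        for line in lines:
            handle.write(line + "\n")


def treemix_command(infile: str, out_prefix: str, migrations: int, root: str) -> List[str]:
    """Build a TreeMix command line: -i input, -o output prefix,
    -m number of migration edges, -root outgroup population."""
    return ["treemix", "-i", infile, "-o", out_prefix,
            "-m", str(migrations), "-root", root]


# Toy data: the allele counts are invented, not taken from any real data set.
toy = [
    "Yoruba Yamnaya Pathan Uygur",
    "10,0 4,6 5,5 7,3",
    "2,8 9,1 6,4 5,5",
]
reduced = subset_populations(toy, ["Yoruba", "Yamnaya", "Pathan"])
command = treemix_command("reduced.treemix.gz", "run_m2", 2, "Yoruba")
```

To actually run it you would call `write_gzip("reduced.treemix.gz", reduced)` and hand `command` to `subprocess.run`, which requires TreeMix to be installed. Repeating the run with different values of `migrations` is one way to end up with a series of plots to compare.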
So a second question that came to mind has to do with the origin of the Indo-Aryans, and the genetic history of the Indian subcontinent. About five years ago I told John Hawks that I was skeptical of too much European-like contribution to the Indian population because not enough European pigmentation alleles were segregating in the population. My inference was based on a wrong assumption. It turns out that the earliest steppe dwellers were not particularly pale of mien, going by their genetic architecture on pigmentation loci. My objection had no basis, because the modern European phenotype is very new, and likely post-dates the arrival of Indo-Europeans in India. Additionally, there is suggestive evidence of a steppe connection, such as the widespread presence of the “European” allele for lactase persistence in Northwest India. This allele is new, and swept up in frequency very recently. Its presence in Northwest India almost certainly indicates non-trivial demographic connections.
The blogger at Eurogenes has illustrated the dynamic, but it’s pretty obvious that Northwest Indian populations have some affinity to the Yamnaya population in particular. Below are the results from TreeMix using a narrower set of populations than above. Notice how Pathan tends to move toward the Yamnaya…
But why the affinity to the Pathan, and not the Iranian samples? Who knows. I’ll pull down the data set from the Willerslev lab soon, but I think ancient DNA from India is going to have to answer the question. But I’m curious how the “Out of India” people spin this, because they will have a ridiculous rationale….




















