Language, genes, & peoples of Southeast Asia

I am currently reading Victor Lieberman’s magisterial Strange Parallels: Volume 2, so I was very interested in a new paper from BMC Genetics, Genetic structure of the Mon-Khmer speaking groups and their affinity to the neighbouring Tai populations in Northern Thailand, pointed to by Dienekes today. Here are the results and conclusions:

A large fraction of genetic variation is observed within populations (about 80% and 90 % for mtDNA and the Y-chromosome, respectively). The genetic divergence between populations is much higher in Mon-Khmer than in Tai speaking groups, especially at the paternally inherited markers. The two major linguistic groups are genetically distinct, but only for a marginal fraction (1 to 2 %) of the total genetic variation. Genetic distances between populations correlate with their linguistic differences, whereas the geographic distance does not explain the genetic divergence pattern.
…
The Mon-Khmer speaking populations in northern Thailand exhibited the genetic divergence among each other and also when compared to Tai speaking peoples. The different drift effects and the post-marital residence patterns between the two linguistic groups are the explanation for a small but significant fraction of the genetic variation pattern within and between them.
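A claim like "genetic distances correlate with linguistic differences" is the kind of statement usually checked with a Mantel test: correlate the off-diagonal entries of two distance matrices, then shuffle the population labels of one matrix to see how often chance alone does as well. The sketch below illustrates that idea only. It is not taken from the paper, the function name `mantel_test` is hypothetical, and the two small matrices are invented toy numbers, not data from the study.

```python
import numpy as np


def mantel_test(dist_a, dist_b, permutations=999, seed=0):
    """Permutation-based Mantel test between two square distance matrices.

    Returns (Pearson r over the upper triangles, one-sided p-value).
    Hypothetical helper for illustration only.
    """
    a = np.asarray(dist_a, dtype=float)
    b = np.asarray(dist_b, dtype=float)
    if a.ndim != 2 or a.shape[0] != a.shape[1] or a.shape != b.shape:
        raise ValueError("both inputs must be square matrices of equal size")

    n = a.shape[0]
    upper = np.triu_indices(n, k=1)  # each population pair counted once
    observed = np.corrcoef(a[upper], b[upper])[0, 1]

    rng = np.random.default_rng(seed)
    at_least_as_large = 0
    for _ in range(permutations):
        order = rng.permutation(n)
        # Relabel populations in one matrix: rows and columns move together.
        shuffled = a[np.ix_(order, order)]
        r = np.corrcoef(shuffled[upper], b[upper])[0, 1]
        if r >= observed:
            at_least_as_large += 1

    # Add one to both counts so the observed labelling is included.
    p_value = (at_least_as_large + 1) / (permutations + 1)
    return observed, p_value


# Invented toy data: five populations, the first three in one language
# group and the last two in another. These are NOT values from the paper.
genetic = np.array([
    [0.00, 0.10, 0.20, 0.60, 0.70],
    [0.10, 0.00, 0.15, 0.65, 0.70],
    [0.20, 0.15, 0.00, 0.60, 0.60],
    [0.60, 0.65, 0.60, 0.00, 0.10],
    [0.70, 0.70, 0.60, 0.10, 0.00],
])
linguistic = np.array([
    [0, 1, 1, 3, 3],
    [1, 0, 1, 3, 3],
    [1, 1, 0, 3, 3],
    [3, 3, 3, 0, 1],
    [3, 3, 3, 1, 0],
], dtype=float)

r, p = mantel_test(genetic, linguistic)
print(f"Mantel r = {r:.3f}, permutation p = {p:.3f}")
```

Running the same function with a geographic distance matrix in place of the linguistic one would be the analogous check for the claim that geography does not explain the divergence pattern.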

There are many occasions when it has taken a synthetic scholar to point out to me the overall structure of a constellation of facts of which I was already aware. So it is with Lieberman’s work. I had known that the eruption of the Thai peoples into Southeast Asia occurred within the last 1,000 years, before which the peninsula was divided between Tibeto-Burman populations to the west and Austro-Asiatic populations to the east (the latter divided between the Khmer and Vietnamese). Additionally, it is presumed that the Tibeto-Burman languages themselves displaced Austro-Asiatic in the western zone (as evidenced by the persistence of Mon in modern Burma). What was noted in volume 1 of Strange Parallels though is that the three geographical regions engaged with and assimilated the Thai invasions differently. In the center the Thai succeeded in dominating the previous groups and imposing their identity upon the region. It is often asserted that modern Cambodia’s existence as an independent state is a function of the protection conferred upon it by the French from the expansive ambitions of the Empire of Siam. But in the east the Vietnamese state was barely impacted by the Thai folk wandering. As in China, the Thai in Vietnam are marginalized “mountain tribes.” Finally, in the west, in the zone which became Burma, the Thai did not take over the cultural commanding heights. But neither were they absolutely marginalized as in the east. Rather, the Shan people became part of the Burmese landscape, integrated into the Theravada Buddhist culture, but also a significant secondary ethnos to the Burman majority (along with Karens, Mons, etc.).

What does this have to do with genetics? Possibly everything and nothing, and all answers in between.

The massive shift in ethno-linguistic identity in the center of mainland Southeast Asia, its lack in the east, and the position at the equipoise in the west, should be excellent tests of propositions as to the nature of the spread of such ethno-linguistic identities. Is it pure construction, demographic replacement, or some quantitative combination of the two parameters? Unfortunately the BMC Genetics paper focuses only on Y chromosomes and mtDNA, the paternal and maternal lineages. These markers are informative, but I’d rather look at total genome content. The ethnic coverage in a small area of northern Thailand though is impressive. The open circles represent Mon-Khmer ethnic groups, the dark ones Thai. The Mon-Khmer are the presumed indigenes, while the Thai are intrusive. At least over the past 1,000 years.

Below I’ve reedited the Y and mtDNA multidimensional scaling plots. The Y is on the left, and mtDNA on the right. The clustering pattern shows relationships across the lineages. Again, the open markers represent Mon-Khmer groups, and the closed ones Thai.
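For readers unfamiliar with how such plots are made, multidimensional scaling takes a matrix of pairwise distances between populations and finds coordinates whose distances approximate it. Here is a minimal sketch of the classical (Torgerson) variant using NumPy. The paper may well have used a different MDS variant, the function name `classical_mds` is hypothetical, and the example distances come from four made-up points rather than any genetic data.

```python
import numpy as np


def classical_mds(dist, dims=2):
    """Classical (Torgerson) MDS: embed a distance matrix in `dims` dimensions.

    Hypothetical helper for illustration; not the procedure from the paper.
    """
    d = np.asarray(dist, dtype=float)
    if d.ndim != 2 or d.shape[0] != d.shape[1]:
        raise ValueError("dist must be a square matrix")

    n = d.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    # Double-centre the squared distances to get an inner-product matrix.
    gram = -0.5 * centering @ (d ** 2) @ centering

    eigenvalues, eigenvectors = np.linalg.eigh(gram)
    top = np.argsort(eigenvalues)[::-1][:dims]  # largest eigenvalues first
    # Negative eigenvalues can appear for non-Euclidean input; clip to zero.
    scale = np.sqrt(np.clip(eigenvalues[top], 0.0, None))
    return eigenvectors[:, top] * scale


def pairwise_distances(points):
    """Euclidean distance between every pair of rows in `points`."""
    diff = points[:, None, :] - points[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


# Made-up example: four points on a 3 x 4 rectangle stand in for populations.
toy_points = np.array([[0.0, 0.0], [3.0, 0.0], [0.0, 4.0], [3.0, 4.0]])
toy_dist = pairwise_distances(toy_points)

coords = classical_mds(toy_dist, dims=2)
print(coords)
```

The recovered coordinates are only defined up to rotation and reflection, which is why the axes of an MDS plot carry no meaning on their own; only the relative clustering of populations does.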

Since the paper is open access I invite you to read their interpretations. All I’d say is that the clustering of male Thai lineages is very interesting, and is well explained by the model of groups of related men being intrusive to a region, and taking wives from the indigenes. In contrast the Mon-Khmer Y chromosomal lineages scatter about more, and that may be due to the fact they coalesce back to common ancestors far further back in history. The intrusion of the Thai into Southeast Asia may then be demographically characterized by a migration of male warbands. In regions where these warbands managed to topple the previous order, as in central mainland Southeast Asia, they may have then monopolized access to women and entered into a period of demographic expansion.

Luckily we do have some thick-marker autosomal data. To the left I’ve reedited a figure generated with the HUGO Pan-Asian data. The bar plot is at K = 14. I’ve excised many of the extraneous populations. The colors within the bar plot correspond to associations with broader language families. So red seems to be Austro-Asiatic, while blue is Thai. You can see in the figure that the Chinese Thai lack the red Mon-Khmer component. Interestingly, the Hmong of upland Southeast Asia, who are culturally marginal to the dominant Theravada Buddhist culture of the lowlands, exhibit evidence of very sharp differentiation from the Thai and the Austro-Asiatic groups. They lack the affinity with island Southeast Asians, Malays, and Taiwanese Aborigines, which seems common amongst the South Chinese more broadly. The Karen of Thailand are probably the best proxy we have for the Tibeto-Burman people of Burma, who post-date the Austro-Asiatic, and predate the Thai. Going by these data it looks as if the Karen are very hard to differentiate from the Austro-Asiatic populations, though very distinct from the Thai.

The Pan-Asian data set leaves a lot to be desired. There’s not much coverage of the east or west. I suspect that Southeast Asia is going to be somewhat complex, and extrapolating from the correlations between languages and genes in Thailand is going to get us only so far. But it’s a start. In Strange Parallels the author makes the case that mainland Southeast Asia can tell us a lot about generic Eurasian historical process. I hope, and suspect, that it can tell us something more general about the interplay between language and genes over time in other regions as well.
