The deep origins of the Han Chinese

A new paper came out today on ancient East Asian DNA. More precisely, this work focused on early and late Neolithic samples from China, especially the lower Yellow River basin (north-central China) and Fujian in southeast China. A major result can be boiled down to the Admixturegraph to the right.

The first ancient DNA out of East Eurasia was that from Tianyuan cave near modern Beijing. As you can see, that individual is basal to other ancient (and modern) East Asians. That is, it isn’t representative of the ancestors of modern East Asians. But the Tianyuan individual was already closer to modern East Asians than to West Eurasians. Since the Tianyuan individual is ~40,000 years old, that means the bifurcation between eastern and western Eurasian groups predates 40,000 years ago.

This is not a surprising result, as the bifurcations between various “eastern” Eurasian groups (e.g., the ancestors of the Andamanese and East Asians) date to close to 50,000 years ago. The separation from Western Eurasians had to have happened after ~55,000 years ago, since that’s about when the shared Neanderthal admixture occurred.

The graph also shows that some ancient West Eurasian ancestry did come into the ancestors of East Asians through Siberians. More precisely, the Paleo-Siberian populations (replaced more recently by Neo-Siberian groups) had some ancestry from Ancient North Eurasians, who themselves were ~70% West Eurasian in ancestry (the other ~30% being from a deeply basal East Eurasian lineage). These Paleo-Siberians contributed ancestry to many northern East Asian groups, and likely explain the affinity between these groups and the Mal’ta-related individuals.

Finally, most of the edges show the separations between northern and southern East Asians and differences between inland and coastal populations. Though there is a deep distinction between northern and southern groups, the paper makes it clear that there is gene flow between coastal groups. This may explain affinities between the Japanese and Koreans, and peoples in southern China.

In terms of broad dynamics, one pattern that is evident, and repeats what we see all across Eurasia, is that the more recent periods seem to have undergone some level of panmixia. Ancient samples from northern and southern China are well differentiated, with pairwise Fst of around 0.04. Modern individuals sampled from these regions are closer to 0.02. Part of this is due to a significant expansion of “northern” ancestry at the expense of “southern”. But there is also some flow northward of “southern” ancestry. Though not highlighted in this paper because they lacked the samples, population movement throughout the Chinese Empire over the last 2,000 years is surely mediating this. In instances of famine or war resulting in depopulation in a province, the Chinese central authorities routinely encouraged migration from overpopulated provinces (modern Sichuan was repopulated from Hunan after a series of wars during the Ming-Qing transition). After 800 AD the demographic center of China was in the Yangzi river valley, and south.
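To make the Fst numbers above a bit more concrete, here is a minimal sketch of how a pairwise Fst can be computed from allele frequencies. It assumes the textbook heterozygosity-based definition, Fst = (H_T - H_S) / H_T, taken as a ratio of sums over loci with equal weight on the two populations. The function name and the toy frequencies are my own illustrative assumptions; this is not the estimator or the data used in the paper.

```python
def pairwise_fst(freqs_a, freqs_b):
    """Heterozygosity-based Fst between two populations.

    freqs_a, freqs_b: per-locus frequencies of the same allele in
    population A and population B (hypothetical inputs, values in [0, 1]).
    """
    if len(freqs_a) != len(freqs_b):
        raise ValueError("both populations need the same set of loci")

    h_within = 0.0  # sum over loci of the mean within-population heterozygosity
    h_total = 0.0   # sum over loci of the heterozygosity of the pooled population

    for p_a, p_b in zip(freqs_a, freqs_b):
        h_a = 2.0 * p_a * (1.0 - p_a)
        h_b = 2.0 * p_b * (1.0 - p_b)
        p_bar = (p_a + p_b) / 2.0
        h_within += (h_a + h_b) / 2.0
        h_total += 2.0 * p_bar * (1.0 - p_bar)

    if h_total == 0.0:
        # No variation at all in the pooled sample: treat as undifferentiated.
        return 0.0
    return (h_total - h_within) / h_total


# Toy illustration only: a single locus at 40% in one group and 60% in the other.
toy_fst = pairwise_fst([0.4], [0.6])
```

With these toy frequencies the result is roughly 0.04. Pulling the two frequencies toward each other, which is what gene flow does, shrinks the value toward zero.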

Unsurprisingly, the authors find that the southern samples from Fujian seem most similar to Austronesians. Today no one from these regions is “pure” southern. Rather, they are a mix. The Austronesians migrated out early enough that they carry southern East Asian ancestry exclusively. This recapitulates a common phenomenon where the ancestral “homeland” of a given group changes over time, reducing the ability to infer origins (e.g., the percentage of “Middle Eastern” ancestry in Southern Europe was underestimated because Anatolian farmers were partially replaced in Anatolia by migrants from the east).

There are also details in the supplements which confirm earlier inferences. For example, the Tianyuan individual has affinities with the Goyet Aurignacian sample from Belgium which dates to 35,000 years ago. But other East Asians do not. This seems to imply that Tianyuan was much more closely connected to a population that had trans-Eurasian affinities (another possibility is ancient structure, but the bifurcation between eastern and western Eurasian populations was more than 15,000 years before the time of Goyet, so I am skeptical). Additionally, they also detect possible gene flow into Mesolithic Europeans from a population with East Asian ancestry (one possibility here that doesn’t seem to be explored is shared Ancient North Eurasian ancestry into both groups).

What is the overall takeaway? I think this confirms the other early papers that East Asia exhibits more continuity with its past than Europe and South Asia, rather like West Asia. While Europeans and South Asians have substantial ancestry from profoundly intrusive groups during the Holocene, the Han Chinese are in many ways “sons of the soil.” They did to some extent marginalize and absorb many other peoples in the modern area of “China proper”, and are themselves a compound of two ancestral streams, but at the end of the last Ice Age, more than 90% of their ancestors were living within the boundaries of China proper.

More generally, modern imperial polities are exactly what some of their critics accuse them of being: panmixia machines. Pre-state people were more genetically differentiated across local spatial scales. This seems to be the case everywhere there are good transects.

Related: The Deep Origins Of East Eurasians.


Version alpha of trying to understand East Asian population history is now out!

We’ve been waiting for ancient DNA to answer some questions about eastern Eurasia for a while. I always thought Qiaomei Fu would spearhead it, but it doesn’t seem like it worked out that way. I say that because she’s not an author on a new preprint, The Genomic Formation of Human Populations in East Asia, which fills in a lot of gaps and clears up confusing aspects of what has been reported in the fragmentary publications that came before (e.g., this clarifies a lot of things with Japan, see below). Since there has already been ancient DNA work on eastern Siberia and Southeast Asia, this is really focusing on the area in and around what is today the People’s Republic of China. The first author has an affiliation with a university in Fujian, a province in southeast China.

Much of the analysis can be understood as organized around language families, and the demographics associated with them. In this way, it goes back to L. L. Cavalli-Sforza’s correlations between gene trees and language trees, as well as his later work on agricultural diasporas.

First, there isn’t anything radically surprising here in their results. As I suggest above, the mass of ancient DNA in the preprint and the model-building just snap together a lot of what you can see in other work, some going back decades.

Let’s start with the “Onge-like/related” ancestry.

Below you see the strange pattern of Y chromosomal haplogroup D. It’s common in Tibet, Japan, and among the Andamanese.

In the preprint, the authors argue that there is a deep division among East Eurasian populations, going back further than 40,000 years, between a set of populations descended from groups related to Tianyuan man, and populations with affinities to the indigenous peoples of southeast Eurasia and Australia (“Ancient Ancestral South Indians”, AASI, the Onge, the Negritos of Malaysia and the Philippines, and Oceanians). Modern populations in East Asia can be thought of as a mix between these two groups, in various pulses and waves. The finding that some peoples in the Amazon had “Australo-Melanesian” affinity is very strange, but note that there’s no guarantee that the geographic distribution of the two clades was so skewed in the past in a north-south manner.

The Onge-related ancestry is apparently found as the deepest layer in the Tibetan plateau and contributes 45% of the ancestry to the Jomon of Japan. Among ancient proto-Austronesian peoples of Taiwan, it contributed 14% of the ancestry. Earlier work on Southeast Asia indicated that even before the expansion of Austro-Asiatic farmers out of southern China they mixed with a basal East Eurasian lineage related to the Onge.

Chinese annals record the presence of dark-skinned peoples in Yunnan nearly into historical periods. These could very well be legends or rumors, or, they could be the last relic populations that had not been fully absorbed into the Tianyuan-descended farmer expansion.

Moving closer to the present, the preprint finds that among the Tianyuan-descended populations in East Asia there is a northern and a southern grouping. The northern grouping has been discussed before: it is the classic Amur-river valley population. It turns out that a sample from 5,000 years ago in northern Shaanxi, just to the north of the hearth of classical Chinese civilization in Henan, resembles these Amur-river valley populations. Though the authors don’t have samples from southern China, or even the Yangzi, they use modern samples from southern Chinese peoples, as well as ancient samples from Taiwan, to infer that it is likely that the Yangzi river valley was inhabited during prehistory by a somewhat different group than the modern Han Chinese.

In the preprint, the argument is made that Austronesian, Tai-Kadai, and Austro-Asiatic all emerged out of the Yangzi valley and its rice cultures. As noted above, other papers have already outlined the peopling of Southeast Asia using ancient DNA, so I will ignore that. But, note that for Austro-Asiatic populations, ~1/3 of the ancestry is Onge-related. Some of this was mixed in while in southern China, but some of it probably accrued later on in Southeast Asia.

Modern Austro-Asiatic populations can then be thought of as a compound of Tianyuan-related and various Onge-related groups.




The Neolithic roots of modern East Asian human geography

Because of the long and thorough tradition of Chinese historiography, we have a good and deep chronological record of East Asia going back two to three thousand years. Chinese records also help illuminate and clarify aspects of Japanese, Korean, and Southeast Asian history. For example, what we know about the Indianized kingdom of Funan in eastern mainland Southeast Asia comes from Chinese textual sources.

But history can take us only so far. We know this for Western Eurasia, where ancient DNA has revolutionized our understanding of Holocene transformations. Unfortunately, we don’t have that much ancient DNA from East Asia. So we still have to make recourse mostly to modern data. A new preprint, Inland-coastal bifurcation of southern East Asians revealed by Hmong-Mien genomic history, proposes to use a lot of modern (and some ancient) data to answer a very specific question. The basic results are totally unsurprising:

Consistent with the two distinct routes of agricultural expansion from southern China, this Hmong-Mien founding ancestry is phylogenetically closer to the founding ancestry of Neolithic Mainland Southeast Asians and present-day isolated Austroasiatic-speaking populations than Austronesians. The spatial and temporal distribution of the southern East Asian lineage is also compatible with the scenario of out-of-southern-China farming dispersal. Thus, our finding reveals an inland-coastal genetic discrepancy related to the farming pioneers in southern China and supports an inland southern China origin of an ancestral meta-population contributing to both Hmong-Mien and Austroasiatic speakers.

More interesting to me is the admixture graph to the right. It uses a bunch of ancient and modern populations to model ancient and modern populations. You can see some general patterns and suggestions of what might come out of ancient DNA.

For example, the green component is defined by the Hoabinhian samples. These are the people who are distantly related to the Andaman Islanders, and occupied Southeast Asia before the arrival of rice farmers. They are distantly related to “Ancient Ancestral South Indians” (AASI) as well. It is unsurprising that this component is well represented in a Munda tribe (Kharia) from northeast India, or in Austro-Asiatic people of Southeast Asia. But notice that it is well represented in the Jomon of Japan, and modern Tibetans.

If you read the preprint, the authors clearly don’t think that this is Hoabinhian ancestry as such. Rather, the model is looking for something very basal to (distant from) other East Eurasians, and Hoabinhians fit that (and are somewhat closer to this basal group). This is probably the same phenomenon as the “Australo-Melanesian” ancestry in the Amazon. Curiously, Y haplogroup D is found in Tibet, Japan, and among the Andaman Islanders.

The largest group in East Asia, the Han Chinese, can be modeled as an admixture of the ancient Northeast Asian Devil’s Gate Cave people and modern Ami Taiwanese aboriginals (Austronesians). This is basically a north-south cline. Obviously, one doesn’t need to posit that the modern Han are truly a mix of these two groups, but rather that Han identity emerged out of a synthesis of various Neolithic groups with differential affinities to these two groups.
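As a rough illustration of what “modeled as an admixture of two sources” means, here is a minimal sketch that places a target population on a line between two proxy sources using allele frequencies. It assumes the simplest possible model, target = alpha * source_a + (1 - alpha) * source_b at every locus, and solves for alpha by least squares. The function name and the made-up frequencies are hypothetical; this is not the preprint’s method, and it ignores drift, sampling error, and the outgroup machinery that real tools rely on.

```python
def mixture_fraction(target, source_a, source_b):
    """Least-squares share of source_a in target under a two-source model.

    Each argument is a list of per-locus allele frequencies (hypothetical
    inputs). Assumes target = alpha * source_a + (1 - alpha) * source_b.
    """
    if not (len(target) == len(source_a) == len(source_b)):
        raise ValueError("all three populations need the same set of loci")

    numerator = 0.0
    denominator = 0.0
    for t, a, b in zip(target, source_a, source_b):
        # Project (target - source_b) onto the (source_a - source_b) axis.
        numerator += (t - b) * (a - b)
        denominator += (a - b) ** 2

    if denominator == 0.0:
        raise ValueError("the two sources are identical at every locus")
    return numerator / denominator


# Made-up frequencies for a "northern" and a "southern" proxy population.
northern = [0.10, 0.50, 0.90, 0.30]
southern = [0.60, 0.20, 0.40, 0.80]
# A synthetic target built as 70% northern and 30% southern.
target = [0.7 * n + 0.3 * s for n, s in zip(northern, southern)]
alpha = mixture_fraction(target, northern, southern)
```

On this synthetic target the estimate recovers the 70% northern share that was built in. The point of the sketch is that alpha is just a position along the north-south cline; it does not say the two proxies were the literal parent populations.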

Two ancient samples give a good picture of how these groups are related to West Eurasians. The Afanasievo were almost exactly like the Yamnaya. The Namazga sample comes from ancient prehistoric Khorasan, on the border of modern Iran and Turkmenistan. These two samples do have some affinities with each other. Both have ancestry that is related to or derived from “Ancient North Eurasians” (ANE) and “Caucasus Hunter-Gatherers” (CHG), with the Yamnaya having more ANE and Namazga more CHG. But the Yamnaya also had affinities with “Western Hunter-Gatherers” (WHG) that Namazga lacked. You see that the Kharia have affinities to Namazga, but not Afanasievo. This is not surprising: the Munda tribes of Northeast India seem almost untouched by Indo-Aryan influence (they are entirely lacking in R1a1a, which is found in South Indian tribals). Rather, they mixed with Indian populations which were impacted by migrations of farmers from West Asia.

The proportions of Afanasievo and Namazga are illustrative of particular historical dynamics. Mongols and Xiongnu (ancient) had some connection to the Afanasievo. This is almost certainly Indo-European (probably East Iranian) contact. In contrast, the Hui, Chinese Muslims who are mostly no different from Han aside from religion, have contributions from both Afanasievo and Namazga. This is a strong indication that Hui do have more recent Central Asian (Muslim) ancestry, while Mongolians do not. The increase in Namazga ancestry across Central Asia is probably a function of the rise of Persian and Islamic polities, and the movement north of agriculturalists. The shift to Turkic dominated polities integrated Turan with the rest of the Islamic steppe, which happens to exclude the Mongolians.

It is also interesting that the Thai have more Namazga than the Khmer. This is strongly suggestive of a large contribution of Indian ancestry to the Dvaravati culture (the enrichment for Devil’s Gate Cave in the Khmer is probably due to the reality that a few of the HGDP samples seem to be mixed with Chinese), though it could be more recent admixture from India. Note however that the Mon people of Burma seem to have more Indian ancestry, and were often associated with Dvaravati.

Finally, the authors point out that the red southern Northeast Asian component is now common in peoples like the Koreans and Japanese. This is a clear indication of the spread of farming from southern people, as well as the likely later demographic impact of the expansion of the Chinese state and its spillover impact on Korea.