The rise of China and Chinese identity was inevitable

I have heard that Benedict Anderson’s Imagined Communities: Reflections on the Origin and Spread of Nationalism is the most assigned text at American universities. Before I had read the book I had heard it mentioned many times in the media or in print. Anderson’s narrow thesis is fine as far as it goes, but I was underwhelmed overall by its general relevance. Rather, I found Azar Gat’s Nations: The Long History and Deep Roots of Political Ethnicity and Nationalism more interesting and illuminating, in large part because it is a powerful rejoinder to the sentiment that nationalism is a relatively new “invention”, a product of early modernity, first manifesting itself in its full flower with the French Revolution.

This cartoon cutout view is certainly one I would probably have unreflectively parroted in my teens. It seems erudite and counterintuitive. A classic, “well actually…” fact. But the more history I read, the less and less plausible I found the implications of the recent invention of nationalism. The nation-state as conceived between the Peace of Westphalia and the Congress of Vienna shapes and dictates modern understandings, but the sentiments and elements that come together to make the nation-state a powerful cultural phenomenon are quite old and widespread. Human tribalism emerges out of our innate cognitive architecture and is further selected through the process of cultural evolution. To some extent, this is extensible and scalable.

With that being said, how natural is the Han Chinese identity, which has come to the fore and will determine the course of this century? This is the ethnic-national group which makes up 90-95% of China’s current population. They are Chinese qua Chinese in a fundamental sense. People united by the written Chinese language, speaking related dialects which diverged over the past 2,000 years and bound together by a historical-cultural tradition with 3,000 years of continuity.

If you read History and Geography of Human Genes, one of the peculiar results from the analyses within is that North Chinese cluster with Japanese, Koreans, etc., while South Chinese cluster with Southeast Asians. This did not turn out to be true. More specifically, the South Chinese have a greater affinity for Southeast Asian groups (e.g., the Vietnamese Kinh) than North Chinese do, but they are not closer to Southeast Asians than they are to North Chinese (the furthest southern dialect groups, such as those of Guangdong, are about equidistant to Vietnamese).

But what about the North Chinese? Are they simply Sinicized Mongols? It is clear that some of the North Chinese exhibit shifts toward West Eurasians. I think this is mostly through Mongols and Turks, who have a minor West Eurasian component. But, I believe that both North and South Chinese will be shown to have 50% or more of their ancestry attributable to people who founded the Erlitou culture of Henan. The Han exhibit signs in their genomes of massive demographic expansion in the Holocene. Some of the geographic variations we see today are due to differentiation driven by isolation by distance. Another proportion of it is through admixture with the substrate (e.g., the Yue have left a noticeable cultural imprint on parts of South China, and I suspect it’s a genetic impact as well). And finally, some of it is through admixture with newcomers. This is particularly true in China north of the Yangzi, which has been impacted by barbarian peoples since the rise of the Zhou and the interactions with the Rong and Di.

But China is too large, extensive, and long-lasting to imagine it has a strong ethnic core with a genetic coherency in the way Finland has a strong Suomi core. Rather, genetics may more usefully be pointing to the powerful integrative and anti-centrifugal forces at work across Chinese history. Hakka moved south, while southern families moved north again with the rise of the Sui-Tang. The 20th century has been characterized by the demographic Sinicization of Inner Mongolia and Manchuria, which had for most of Chinese history been outside of the domains of China proper.

Though I think one can argue that Classical China really crystallized during the Han Dynasty, my reading of works such as Li Feng’s Early China indicates that the root of later developments really dates to the Zhou Dynasty. Confucius’ valorization of the Duke of Zhou may simply reflect the importance of the Zhou revolution in setting the tone of what China became. If the Shang were the Mycenaeans, the Zhou were the Classical Greeks. The Shang Dynasty and the Mycenaeans spoke the language and worshipped the gods of the people who would succeed them, but they lacked the spirit which would define Chinese and Greek civilization. For China, that spirit is reflected in the ideas, the canon, of the Spring and Autumn Period.

Though Classical Greek civilization and culture has persisted in a form, to a great extent it was adopted, synthesized, and transmuted. The integration of Greek philosophy into Christian theology preserved Greek thought, but it also transformed its import and made it so that that cultural inheritance was not defining or exclusive to the Greeks. This is far less the case with Classical China. An intellectual in the year 1900 arguably expressed the living tradition which had its genesis during the Spring and Autumn Period. It is as if the Platonic Academy had maintained its institutional integrity for 2,000 years. Or, if the civilian Roman senatorial elite had not been dissolved between the 4th and 6th-century A.D., to eventually be replaced by illiterate barbarian warlords (there was a bridge period of barbarians who exhibited some of the best aspects of Romanitas).

All this to say that China and Han identity is not a purely contingent construction of the 19th-century or a response to modernity and European hegemony. This is more clear to me after having read The Han: China’s Diverse Majority. The author engages in an ethnography and intellectual history, teasing out the parameters of the Hanzu self-identification promoted by Chinese nationalists in the 19th and 20th-century. The argument goes that this identity superseded and suppressed deep regional divergence, between north and south, Mandarin dialect and non-Mandarin. The Han does not address this position directly, but the intellectual history outlined makes it clear that what we substantively think of as the Chinese people had a self-conception even before the Han Dynasty. Just like the Egyptians or Indians, the early Chinese thought of themselves as the center of the world, as the civilized people par excellence. They did not think of themselves as a nationality at parity with other groups. Rather, they saw other groups as barbarians who could still be civilized, and so become Chinese.

Perhaps a useful analogy here might be a “what-if” scenario where the Latin Western Roman Empire did not fall permanently in the late 5th-century but resurrected itself. But even here I think it understates the integrative and unitary nature of Chinese self-conception even before the Han Dynasty. The Latinization of Iberia and Gaul seems to have mostly been due to acculturation. I believe that Sinicization was accompanied by demographic expansion.

The People’s Republic of China is not just an imagined community. It is an outgrowth of a political and social unit that has been evolving for 3,000 years.

Finally, I think at this point it is useful to end with a comparative exercise that compares the attitudes of the civilizations of the Eurasian oikumene to a very important and universal human phenomenon: religion. The “Greater West” (The West + Arab-Turkic-Persian Islam), India, and China, overlap and differ in very particular ways.

The Greater West has developed exclusive and socially universal religious confessionalization to a very great extent. Exclusive, insofar as, on paper, religious confessionalization in its mature state is not about pluralistic competition, but the solidifying of a monopoly. Universal, in that the religious identity cuts across class and ethnicity in a very cohesive fashion.

Modern India, and to some extent premodern India, seems to have developed strong confessional identities which are somewhat exclusive. Or have become so. People die because they are Muslim or Hindu, and the boundaries are sharp and stark. But, Indian society is not so universalizing. Within Hinduism, the Sanatana Dharma, there is a wide range of practices and beliefs. Buddhism is part of this broader tradition and engaged in confessionalization and universalizing very early on. But, like Hinduism, it tends not to seek exclusive monopoly on society.

Finally, we have the situation in China. Though “world religions” have been prominent historically, the Chinese do not develop exclusive or socially universal attachments. A single religion does not bind society together, and individuals can “consume” religious services and beliefs from a wide array of systems. It is sometimes said that in East Asia religion is unimportant. This is false. Rather, religion is not homogeneous or monopolistic. And often confessional identities are weak.

I bring this up because though there are deep human universals, there are also striking cultural differences. Indians often scoff at the Chinese tendency to convert to Christianity in the West, suggesting that perhaps the Chinese lack cultural pride. This is a false inference because the issue that Indians do not understand is that Chinese society does not tie itself to a strong confessional religious identity. Chinese identity at the core does not have to do with supernatural belief systems. Similarly, Westerners are often perplexed by the open-minded latitudinarianism of many Hindus. But Westerners do not internalize that Hindu religious beliefs are less about individual identity and more about collective communal customs and ties. Undergirded often by a monistic metaphysical system, Indians see little need to convert the world to become like themselves, because even within India communal diversity is the norm, and universalizing tendencies in religion have been marginal until lately.

We’re going into an interesting century. Whether that’s good or bad, I’ll leave to you.

The lineage of the ancient sage kings

After recording the “India genetics” podcast for The Insight and reading Early China: A Social and Cultural History, I wonder what surprises we’re going to get from China from ancient DNA when it comes online. If there is one thing we are learning by looking closely at DNA, modern and ancient, it’s that at least for humans there are very few ‘primal’ populations from the “Out of Africa” event which haven’t been threaded together from pulse admixtures or continuous gene flow across the landscape.

Early China makes it clear that the Erlitou culture, which dates from ~1900 to 1500 BC, was almost certainly the legendary Xia dynasty. This means that the ethnogenesis of the modern Han Chinese probably dates to ~4,000 years ago at the latest. This is centuries before the Indo-Aryans were likely arriving in South Asia, and around the same time that Indo-European groups were pushing into peninsular Southern Europe.

The Y chromosome data does not indicate a Bronze Age ‘star phylogeny’ expansion in East Asia that I know of, so the dynamics were not entirely similar to Western Eurasia. But, it seems quite plausible that the Han themselves are not a chrysalis from the late Pleistocene.


Hui have a lot of West Eurasian Y chromosomes

O C R1a R1b R2 E1b G H I1 I2 J1 J2 L N Q T Total N
Han 2581222211211792300
Hui 24721191311411113144106
Tibetan 49111811333371100

It’s been a while since I checked in on the genetics of the Hui people. I found the paper, Analysis of 17 Y‐STR loci haplotype and Y‐chromosome haplogroup distribution in five Chinese ethnic groups. About 50% of the Hui Y chromosomal haplogroups are normally classified as “West Eurasian” (R, E, G, I and J). But curiously, a fraction of the Han have these too, as do some Tibetans.

Additionally, I know that some Mongols also have R1a1a. It’s hard to differentiate different periods of admixture. But to me the presence of R2 and J2 points to a Central/South Asian origin of a lot of the Hui R1a as well.

The great genetic map and history of China


About 20 percent of the world’s population is Chinese (and since over 90% of Chinese citizens are ethnically Han, by Chinese here I mean Han to a first approximation). In comparison to other non-European groups a fair amount of genetics research has been done with Chinese populations. But in comparison to their overall numbers, not too much has really been done. That will change.

A new preprint, A comprehensive map of genetic variation in the world’s largest ethnic group – Han Chinese, aims to enrich our knowledge set somewhat. The authors used low coverage next generation sequencing to greatly increase their sample size (it is cheaper). By low coverage, I mean that instead of hitting each genetic position on average 30 times or more, as is the norm in medical genomics, they sampled a position closer to twice.
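To get a feel for what twofold coverage means for a single genome, here is a minimal sketch. It assumes (my assumption, not something stated in the preprint) that read depth at a site is Poisson distributed and that each read at a heterozygous site carries the variant allele half the time. The function name is made up for illustration.

```python
import math

def prob_variant_unseen(mean_depth, alt_fraction=0.5):
    """Chance that no read shows the variant allele at a heterozygous site.

    Assumes Poisson read depth, so the number of reads carrying the
    variant is Poisson with mean mean_depth * alt_fraction.
    """
    return math.exp(-mean_depth * alt_fraction)

low = prob_variant_unseen(2)    # roughly 0.37 at twofold coverage
high = prob_variant_unseen(30)  # vanishingly small at thirtyfold coverage
print(f"2x: {low:.3f}, 30x: {high:.2e}")
```

Under these assumptions a single low coverage genome misses a heterozygous variant more than a third of the time, which is why the value of this design comes from the number of individuals rather than from any one of them.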

But while any given genome was usually not given much close attention, their overall sample size of individuals was 11,670 Han Chinese women. Impressive. This means that if they called a position as a variant, they could assess their confidence that it was a variant by looking at how many times it was called as a variant across their data set (as coverage declines, one’s confidence that a call of a variant is a true call declines because there is a relatively high base rate of error set against the proportion of true expected polymorphisms; in contrast, if you sample 30 times the error rate gets overwhelmed by repeated sampling). Overall they counted 25,057,223 variants, which sounds about right. They also found 548,401 novel variants with at least a count of 10 in the data set (a ~0.04% allele frequency, so a very low cut-off).
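The ~0.04% figure follows from simple counting, assuming each of the 11,670 women contributes two copies of every autosomal position. A quick check (the function name is just for illustration):

```python
def allele_frequency(allele_count, n_individuals, ploidy=2):
    """Share of sampled chromosomes that carry the allele."""
    return allele_count / (ploidy * n_individuals)

# 10 copies out of 2 * 11,670 = 23,340 sampled chromosomes
cutoff = allele_frequency(10, 11_670)
print(f"{cutoff:.2%}")
```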

The most important thing about this preprint is not that the sample size is large enough that they could detect low frequency variants and add to the catalog. No, for me, it is that they sampled so many of the provinces. As you can see in the figure up top, just as in Europe, China’s Han population recapitulates the map of China. That is, populations arrange themselves spatially when projected onto a principal components analysis plot in the same manner that they do geographically. This is a new finding in some ways because previous sampling strategies had not been robust enough to detect the east-west cline (though to be honest if you looked at the Chinese samples in the 1000 Genomes there was suggestion of this).

All that being said, please note that the PCA is not to scale, insofar as most of the variation is north-south (4 to 5 times more than east-west). Rather like Europe in this regard. Part of this difference is due to the fact that gene flow from non-Han populations, particularly in the South, inflates the genetic variation on the first dimension. Another aspect of interest is that genetic variation between Han populations is rather low to begin with.

One way to visualize this is a matrix like the one to the left. You see pairwise population Fst statistics. The largest is between Guangdong in the south, home to Hong Kong and Guangzhou (Canton), and the northern provinces. The Fst value between Guangdong and Shanxi in the center-north is 0.0029. You may know that the Fst value between Han Chinese and Northern Europeans is ~0.10. A roughly 34-fold difference, more than one order of magnitude. As a point of comparison you can find Fst tables which show that values between English and Croatians, and between English and Spaniards, are about the same as between Guangdong and Shanxi.
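For readers who want to see where numbers like these come from, here is a minimal sketch of Fst at a single biallelic site. It uses the textbook definition (total heterozygosity minus mean within-population heterozygosity, divided by total heterozygosity) and assumes two equally weighted populations. The preprint may well use a different estimator, and the allele frequencies in the usage lines are invented. The last lines redo the fold-difference arithmetic from the two values quoted above.

```python
def fst_two_pops(p1, p2):
    """Textbook Fst for one biallelic site in two equally weighted populations."""
    p_bar = (p1 + p2) / 2
    h_total = 2 * p_bar * (1 - p_bar)                       # pooled heterozygosity
    h_within = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2  # mean within-population
    if h_total == 0:
        return 0.0  # the site is fixed for the same allele in both populations
    return (h_total - h_within) / h_total

# Invented frequencies: a small frequency difference gives a small Fst.
print(fst_two_pops(0.50, 0.55))

# Ratio of the two Fst values quoted in the text (0.10 versus 0.0029).
fold = 0.10 / 0.0029
print(round(fold, 1))
```

Real estimates average over many sites and correct for sample size, so treat this only as an intuition pump for how modest allele frequency differences translate into values in the 0.001 to 0.01 range.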

What is just as interesting is the very low genetic differentiation on the North China plain. Why is this? There are two reasons I can think of. The easy explanation is that across politically unified flat landscapes gene flow occurs so easily that genetic differences disappear over time.

But, this presupposes there were genetic differences in the first place. The reason I say this is that though there was an early period of migration from the north to the south (from the Han dynasty onward), and absorption of non-Chinese peoples, there were also periods when much of China north of the Yangtze river valley was under barbarian domination or politically unstable. Elite northern families fled to the south, and eventually when political stability reemerged migrated back to the north (similarly, persistent north-south migration occurred, as the Hakka people of South China are clearly of northern provenance).

The low genetic differentiation across northern China may then be thought of as the outcome of structural fixtures of the landscape (no mountains to obstruct gene flow), as well as possibly due to historical instances of copious back-migration from various regions of southern China (or perhaps more accurately Central China, as I’m presuming much of the settlement would come from the lower Yangtze river valley). Both of these dynamics may have led to little intra-regional structure. In contrast you notice that genetic distance between Fujian and Guangdong, two regions adjacent to each other in the South, is still higher than between any of the northern regions.

Again, this is not surprising due to both geography and history. The dialect map of China shows that southeast China is more fragmented than the north (or southwest). These differences are long-standing and date to the initial founding of Han communities in the south via migrants from the north. Unlike North China, South China is a topographically diverse landscape, with beautiful escarpments and deep gorges. Fujian literally hugs the ocean, and has long had a relationship to overseas communities for this reason. Geographic barriers mean there are genetic barriers. Combined with admixture with local populations, this means it is not surprising that there are greater genetic differences between southern regions than in the north.

Additionally, China south of the Yangtze has been relatively shielded from foreign conquest and invasion compared to the North China plain. Obviously events like the Taiping rebellion and famine more generally had impacts on South China, but North China has had more periods of domination in a destabilizing manner by non-Chinese invaders over the past 2,000 years.

Perhaps more intriguing than the modern genetic relationships within China are the relations with non-Chinese populations. It is not surprising that the South Chinese populations show evidence of admixture with Dai and Taiwanese aboriginals (the basal group of the Austronesian migration). The genetics and cultural practices in parts of South China have long suggested relationships to indigenous groups, as well as Sinicization. Honestly I suspect many were surprised how similar North and South Chinese were, indicating either continuous gene flow or descent from a large demographic expansion.

More curious is that some North Chinese seem to show evidence of admixture with West Eurasians. In particular, they show affinities with European populations. Again, this is not surprising. Some earlier analyses have shown evidence of European-like admixture in northern China, and among ethnic groups like Mongolians. More precisely there are strong signals of European-like admixture in the northwestern provinces of Gansu, Shaanxi, and Shanxi.

The details here are important though. The authors note that Hellenthal et al. detected admixture from Northern Europeans into North China, dated with haplotype-based methods to around 1200 AD. This preprint finds a similar admixture date. But they caution that these admixture dates may only signal the latest of the events.

As for what that event could be, there was clearly turmoil on the Silk Road in the years around 1000 AD. After 750 AD for all practical purposes the Chinese lost control of their portion of the Silk Road, what is now Xinjiang. Turkic groups like Uyghurs and Iranian ones such as Sogdians were prominent in China due to a power vacuum (the Uyghurs were used by the Tang emperors like the Germans were used by the later Roman Emperors, as federates). Later on one saw the emergence of Tanguts, various groups from Manchuria, and finally the Mongols. Since both haplotype-based methods and this preprint suggest something around 1000 AD, the most likely candidate was the absorption of Central Asians with some European-like ancestry into the Chinese substrate. The Uyghur conquest of the major cities of the region in the centuries before the rise of the Mongols famously resulted in the assimilation of a European-like population which had earlier spoken Indo-European languages.

But admixture was not a feature of just recent Chinese history. The figure to the right is somewhat difficult to read, but it shows on the y-axis variance in the f3 statistic. In short, how well does the Chinese data set here form a clade with the outgroup, and how much does that statistic vary between groups? The x-axis is for the D statistic, which measures the relationship of four populations, with two clades. On the bottom left you see the Siberian genome from 45,000 years ago. On the y-axis you can see all provinces show very little variation, and that’s because the Siberian genome is old enough that it is basal to all the Chinese and Europeans. The D statistic indicates no gene flow between the Siberian populations and modern groups. Not so with other populations. You see the Pleistocene European populations are shifted to the right, and that’s because they all contribute to later Europeans. The Chinese-European clade is not a good fit. This is true across the Chinese populations (so the variance of the f3 statistic is very low).
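Since the figure leans on the D statistic, a small sketch may help. This uses the allele-frequency form of Patterson's D for four populations, arranged so that the first pair and the second pair are each assumed to form a clade; a value near zero is what you expect if that tree holds without gene flow between the pairs. The frequencies below are toy numbers I made up, not data from the preprint, and a real analysis would also need standard errors (usually from a block jackknife), which this omits.

```python
def d_statistic(sites):
    """Patterson's D from per-site allele frequencies (w, x, y, z).

    The tree is assumed to be ((W, X), (Y, Z)). D near 0 fits that tree;
    a positive value here means W shares extra alleles with Y (or X with Z).
    """
    num = sum((w - x) * (y - z) for w, x, y, z in sites)
    den = sum((w + x - 2 * w * x) * (y + z - 2 * y * z) for w, x, y, z in sites)
    return num / den

# Toy data: W matches Y at one site and Z at the other, so the signal cancels.
balanced = [(1.0, 0.0, 1.0, 0.0), (1.0, 0.0, 0.0, 1.0)]
# Toy data: W consistently matches Y, as if the two had exchanged genes.
skewed = [(1.0, 0.0, 1.0, 0.0), (0.0, 1.0, 0.0, 1.0)]

print(d_statistic(balanced), d_statistic(skewed))
```

In the figure's terms, a sample sitting at zero on the x-axis behaves like the balanced case, while the samples shifted to the right behave more like the skewed one.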

Also in the text they note that there is high shared drift with the three “Ancient North Eurasian” (ANE) samples from Siberia. This is discussed extensively in the supplements to Lazaridis et al. 2016. Another replicated finding is that the Chinese share drift with ancient European hunter-gatherers. The drift declines later on, likely because the Chinese do not share as much drift with the early farmers. This is due in part to the “Basal Eurasian” (BEu) element. But in Fu et al. 2016 they observe that drift between East Eurasians and European hunter-gatherers increases after 15,000 years BP, when there was a genetic turnover, and the Villabruna cluster (in their terminology) came to dominate the landscape.

The most probable, though not certain, explanation for this pattern is that ANE populations contributed ancestry to both antipodes of Eurasia. To European hunter-gatherers, and, to the ancestors of the Chinese in Pleistocene East Asia (remember that there was a fusion between a proto-East Asian population and ANE to give rise to the ancestors of Amerindians 15-20,000 years ago). Another explanation could be East Asian gene flow rather early on into Europe, some time after the Last Glacial Maximum ~20,000 years ago. We don’t have the sample density outside of Europe to really say with certainty.

Finally, I have to mention that at SMBE Melinda Yang of Qiaomei Fu’s lab gave a talk about the Tianyuan genome. Their group has found that the Tianyuan individual, who dates to 40,000 years ago, is the likely ancestor of modern East Asians. That is, Tianyuan shares more drift with modern East Asians than Europeans. No huge surprise. What was surprising though is that Tianyuan also shared appreciable drift with GoyetQ116, a 35,000 year old sample from Belgium, whose descendants seem to have played a role in the emergence of the Magdalenian culture. But not later European hunter-gatherer populations. The Tianyuan sample also seemed to share some drift with Australasian samples (a possible resolution for why some Amerindians share drift with Oceanians presents itself here obviously). Overall, the group’s conclusion was that this might be evidence of ancient population structure rather early on in the “Out of Africa” populations, which eventually carried over as the groups dispersed (rather than each geographic region being direct descendants from a single panmictic “Out of Africa” group). The implications here are beyond the purview of Chinese genetics so I’ll address it in a later post.

I have to mention there is a fair amount within this paper on selection as well as medical genetics. I didn’t tackle that in this post since there’s so much phylogenomics one could talk about.

Citation: A comprehensive map of genetic variation in the world’s largest ethnic group – Han Chinese. Charleston W. K. Chiang, Serghei Mangul, Christopher R. Robles, Warren W. Kretzschmar, Na Cai, Kenneth S. Kendler, Sriram Sankararaman, Jonathan Flint.