
Version alpha of trying to understand East Asian population history is now out!

We’ve been waiting for ancient DNA to answer some questions about eastern Eurasia for a while. I always thought Qiaomei Fu would spearhead it, but it doesn’t seem to have worked out that way: she’s not an author on a new preprint, The Genomic Formation of Human Populations in East Asia, which fills in a lot of gaps and clears up confusing aspects of what had been reported in the fragmentary publications that came before (e.g., it clarifies a lot of things about Japan; see below). Since there has already been ancient DNA work on eastern Siberia and Southeast Asia, this preprint really focuses on the area in and around what is today the People’s Republic of China. The first author has an affiliation with a university in Fujian, a province in southeast China.

Much of the analysis can be understood as organized around language families and the demographics associated with them. In this way, it goes back to L. L. Cavalli-Sforza’s correlations between gene trees and language trees, as well as his later work on the agricultural diasporas.

First, there isn’t anything radically surprising in their results. As I suggest above, the mass of ancient DNA in the preprint and the model-building just snap together a lot of what you can see in other work, some of it going back decades.

Let’s start with the “Onge-like/related ancestry.”

Below you see the strange pattern of Y chromosomal haplogroup D. It’s common in Tibet, Japan, and among the Andamanese.

In the preprint, the authors argue that there is a deep division among East Eurasian populations, going back further than 40,000 years, between a set of populations descended from groups related to Tianyuan man, and populations with affinities to the indigenous peoples of southeast Eurasia and Australia (“Ancient Ancestral South Indians”, AASI, the Onge, the Negritos of Malaysia and the Philippines, and Oceanians). Modern populations in East Asia can be thought of as a mix between these two groups, in various pulses and waves. The finding that some peoples in the Amazon had “Australo-Melanesian” affinity is very strange, but note that there’s no guarantee that the geographic distribution of the two clades was so skewed in the past in a north-south manner.

The Onge-related ancestry is apparently found as the deepest layer in the Tibetan plateau and contributes 45% of the ancestry to the Jomon of Japan. Among ancient proto-Austronesian peoples of Taiwan, it contributed 14% of the ancestry. Earlier work on Southeast Asia indicated that even before the expansion of Austro-Asiatic farmers out of southern China they mixed with a basal East Eurasian lineage related to the Onge.

Chinese annals record the presence of dark-skinned peoples in Yunnan nearly into historical periods. These could very well be legends or rumors, or, they could be the last relic populations that had not been fully absorbed into the Tianyuan-descended farmer expansion.

Moving to the more recent past, the preprint finds that among the Tianyuan-descended populations in East Asia there is a northern and a southern grouping. The northern grouping has been discussed before: it is the classic Amur-river valley population. It turns out that a sample from 5,000 years ago in northern Shaanxi, just to the north of the hearth of classical Chinese civilization in Henan, resembles these Amur-river valley populations. Though the authors don’t have samples from southern China, or even the Yangzi, they use modern samples from southern Chinese peoples, as well as ancient samples from Taiwan, to infer that it is likely that the Yangzi river valley was inhabited by a somewhat different group during prehistory than the modern Han Chinese.

In the preprint, the argument is made that Austronesian, Tai-Kadai, and Austro-Asiatic all emerged out of the Yangzi valley and its rice cultures. As noted above, other papers have already outlined the peopling of Southeast Asia using ancient DNA, so I will ignore that. But, note that for Austro-Asiatic populations, ~1/3 of the ancestry is Onge-related. Some of this was mixed in while in southern China, but some of it probably accrued later on in Southeast Asia.

Modern Austro-Asiatic populations can then be thought of as a compound of Tianyuan and various Onge-related groups.

China:

The modern Han Chinese seem to be a fusion of the two idealized ancestral populations described above:


No great surprise. The Han have more of an affinity for northern East Asian populations than southern ones, with those in the south having more of an affinity for southerners than those in the north. A simple model might be expansion out of Shaanxi and Henan across a landscape with many southern agriculturalists. But that makes us ask: why is there “southern” ancestry among many northern Han today?

I think the explanation is that the expansion of the Han was characterized by reversals, as well as panmixia induced by political unification. Let me outline this explicitly:

– proto-Han identity is focused around Henan and Shaanxi between 2000 BC and 300 AD. As this culture expanded into the margins of the Yangzi and into Sichuan, it absorbed “southern” ancestry (as well as elements of culture).

– During the Han dynasty, 200 BC to 200 AD, the Chinese colonized portions of the far south, and aspects of panmixia occurred, as individuals moved across China north to south and vice versa.

– The fall of the Han dynasty after 200 AD saw North China come to be ruled by “barbarians”, usually of Turkic provenance. South China maintained classical Han culture and political forms without external influence. Many northern families moved south between 200 AD and 600 AD. Many barbarians “became” Han, and mixed into the population. I believe this is when the 2-4% “West Eurasian” ancestry started to become prevalent in the north. This western ancestry was mediated through Turkic groups who were predominantly Siberian or Amur-river valley in ancestry.

R1a1a is found in North China, so I believe that this ancestry is from Iranian groups absorbed into the Turco-Mongol populations.

– The reemergence of an integrated China after 600 AD sees the center of gravity of the Chinese economy shift to the center and south, in particular the Yangzi river valley (often attributed to “Champa rice”). The northward movement of South Chinese, repopulating areas that had been left uninhabited, moves “southern” ancestry north. Most of the population growth in the south is endogenous, and not due to migration. There is very little to no West Eurasian ancestry in the south, whereas one might expect some if large numbers of North Chinese had moved south (the exception is probably the Hakka, who are known to be Northerners).

– There are still ethnic minorities in the South. Over the past 1,000 years, they have slowly been Sinicized and assimilated in many areas, so the proportion of “southern” ancestry in places like Guangdong has increased in part through such processes.

Japan:

The Japanese are not entirely surprising. Using a two-way model with Han or Korean vs. Jomon, the Japanese are about 85% the former and 15% the latter. The proportion is a bit higher for Korean. The reason is straightforward: the Yayoi rice farmers probably derived from the Korean peninsula. Even into the edge of history Japan and the Baekje kingdom of Korea had close relations.

The interesting thing about Japan is that this is an area where agriculturalists nearly overwhelmed the indigenous population, albeit absorbing them. The Jomon culture is unique because it was a sedentary hunter-gatherer society that also used pottery extensively. Previously, analysis of Jomon remains produced “strange” results. In this preprint the authors give a good explanation of why: the Jomon are an even mixture of a population descended from the Onge-related clade and another one that is closer to the Amur river valley Northeast Eurasian populations, who descend from Tianyuan.
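To make the arithmetic of these nested mixture models concrete, here is a minimal back-of-envelope sketch in Python. It takes the rough figures quoted in this post (about 85% Han/Korean-related and 15% Jomon for modern Japanese, and about 45% Onge-related ancestry in the Jomon) and multiplies them through. The function name, the dictionary labels, and the assumption that the mainland source carries no Onge-related ancestry at all are mine, for illustration only; the preprint's estimates come from formal model fitting, not from this kind of product.

```python
def implied_fraction(mix_weights, component_in_source):
    """Weighted sum of a deep ancestry component across admixture sources."""
    if abs(sum(mix_weights.values()) - 1.0) > 1e-9:
        raise ValueError("mixture weights must sum to 1")
    return sum(w * component_in_source[src] for src, w in mix_weights.items())


# Rough two-way model for modern Japanese, as quoted in the post.
japanese_sources = {"mainland": 0.85, "jomon": 0.15}

# Onge-related share within each source. The 0.0 for the mainland
# source is a simplifying assumption, not a result from the preprint.
onge_related = {"mainland": 0.0, "jomon": 0.45}

onge_in_japanese = implied_fraction(japanese_sources, onge_related)
print(f"Implied Onge-related ancestry in modern Japanese: {onge_in_japanese:.1%}")
```

Under these assumptions the implied genome-wide figure is 0.15 × 0.45, or a little under 7%. Changing either input (for example, using the "even mixture" of 50% for the Jomon) only moves the result by a fraction of a percentage point.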

Basically, the Ainu are a fusion of a Siberian group and a population that has affinities with those indigenous to Southeast Asia before the arrival of agriculturalists. Before genetics, archaeologists and anthropologists argued about the Ainu affinities. Despite their sometimes looking “European,” early blood group analysis quickly established an eastern affinity, but morphology and culture suggested different connections to Siberia or Australia. The Australian Aboriginals descend from one of the Onge-related groups to a great extent, so the affinities are now intelligible.

Tibet:

Tibetans seem to be mixed between a small proportion of Onge-related, a larger proportion of an East Asian population descended from Tianyuan and closer to the Amur river valley groups than “southern” rice farmers, and finally a population similar to the Shaanxi Han. The latter mixed with the fusion of the first two ~3,000-4,000 years ago. This makes intelligible the “Sino-Tibetan” language family, whose validity I’m not clear on. But the linguistic affinity might date to this period.

It also resolves confusion about the emergence of Tibetans that arose around the hypoxia papers of that period.

Mongolia/Xinjiang:

This is the portion that is somewhat “controversial.” In Mongolia, they find that there was the arrival of an early western group, the post-Yamnaya Afanasievo, about 5,000 years ago. They flourished in and around the Altai. They are genetically almost identical to the Yamnaya. Then, at some point in the Bronze Age, this group was totally replaced by another much more like the Sintashta-Andronovo. These groups were similar to the Yamnaya, but ~30% of their ancestry is like “European-farmers.” The conjecture you can make here is that there was reflux from Europe that came back onto the steppe. These were almost certainly Iranian. This second wave clearly contributed much of the western ancestry into Mongols, judging by the high fraction of R1a1a-Z93 in the Altai.

But the more intriguing aspect is that south and east in Xinjiang, overlapping the zone occupied by the Indo-European Tocharians, the populations remained similar to the Afanasievo, albeit mixing with East Eurasian groups over time. The implication then is that the authors have “pegged” a date for the separation of the Tocharian Indo-European branch from the others, about 5,000 years ago. Aside from Anatolian (e.g., Hittite), Tocharian is often seen to be the most basal.

Later, Xinjiang also saw the arrival of Iranians. The western and southern oases of Xinjiang were Iranian, while the northern and eastern ones were Tocharian.

(the authors cite David of Eurogenes, who disagrees with their interpretation)

Genetic admixture:

They find that genetic distance between populations in East Asia declines over time. This is analogous to what happened in Western Eurasia.

This might be a generalized process, but I think there’s a specific thing driving this: the rise of the Chinese state-polity. Not only did the Han expand and absorb, but there was gene flow to neighboring groups. It is well known that Han Chinese have been moving into Vietnam, and assimilating, for 2,000 years. Similarly, many Han in the north have been known to “go barbarian.”
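As a toy illustration of why gene flow between neighbors pulls genetic distance down, here is a minimal Python sketch. It uses FST as the distance measure, which is my assumption as one common choice and not necessarily the only statistic behind the result described above. The estimator is the ratio-of-averages form of Hudson's FST without the finite-sample correction, and the allele frequencies, population labels, and 20% mixing rate are all made up for the example.

```python
def hudson_fst(freqs_a, freqs_b):
    """Ratio-of-averages Hudson FST from population allele frequencies.

    Omits the finite-sample correction, so it treats the frequencies as
    known exactly rather than estimated from a handful of genomes.
    """
    num = sum((a - b) ** 2 for a, b in zip(freqs_a, freqs_b))
    den = sum(a * (1 - b) + b * (1 - a) for a, b in zip(freqs_a, freqs_b))
    return num / den if den > 0 else 0.0


def admix(recipient, donor, rate):
    """Replace a fraction `rate` of the recipient's gene pool with the donor's."""
    return [(1 - rate) * r + rate * d for r, d in zip(recipient, donor)]


# Made-up allele frequencies at five SNPs for two hypothetical populations.
north = [0.10, 0.80, 0.35, 0.60, 0.05]
south = [0.40, 0.50, 0.70, 0.20, 0.30]

fst_before = hudson_fst(north, south)

# One pulse of symmetric gene flow: each population takes 20% from the other.
north_after = admix(north, south, 0.20)
south_after = admix(south, north, 0.20)
fst_after = hudson_fst(north_after, south_after)

print(f"FST before gene flow: {fst_before:.3f}")
print(f"FST after gene flow:  {fst_after:.3f}")
```

Each pulse of exchange shrinks the frequency differences between the two populations, so FST can only go down in this setup. Real populations also drift apart in between pulses, which this sketch ignores.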

23 thoughts on “Version alpha of trying to understand East Asian population history is now out!”

  1. Wuzhuangguoliang samples should really push back the clock on what we can say about China. I suspect they’re still a bit too late to really be treating as synonymous with early neolithic Yellow River, but hey ho.

    One question, I noticed that they depart from using the HO capture SNPs for the Wuzhuangguoliang samples:

    “For all but the Chinese samples we enriched the ancient DNA for a targeted set of about 1.2 million single nucleotide polymorphisms (SNPs), while for the Wuzhuangguoliang samples from China we used exome capture (18 individuals) or shotgun sequencing (2 individuals) (Figure 1, Supplementary Data files 1 and 2 and Supplementary Information section 1). We performed quality control to test for contamination by other human sequences, assessed by the rate of cytosine to thymine substitution in the terminal nucleotide and polymorphism in mitochondrial DNA sequences as well as X chromosome sequences in males, and restricted analysis to individuals with minimal contamination(Online Table 1)”

    Probably no one knows but I wonder what the effect of this will be on the accuracy of these.

  2. I suspect the Taiwan Hanben population to be an off-shore remnant of Yangtze and coastal/lowland neolithic populations that could have originally lived all the way north to Shandong (I am thinking of Liangzhu and Dawenkou cultures in particular) and have been at the root of Austro-Tai languages.
    Whereas the Sino-Tibetan Wuzhuangguoliang matches inland Yellow river cultures like Yangshao, which would have started spreading East before 3000BCE.
    In this light, I expect that if one were to test Longshan (from 3000BCE onwards) remains from the lower Yellow River, one would already find this mixed Wuzhuangguoliang/Hanben profile (with a likely cultural and linguistic replacement in favour of Yangshao-Wuzhuangguoliang incomers… A precursor of Han expand/absorb culture?).
    Of course, later panmixia would have greatly contributed to the current homogeneous mix, but I do think it started before the Han came together and that Hanben is coastal and not just Southern per-se.

  3. “The finding that some peoples in the Amazon had “Australo-Melanesian” affinity is very strange”

    Do tell.

  4. @Walter I’m also a noob, but I just finished this chapter of David Reich’s book yesterday. He’s referring to ‘Population Y’. Some tribes in Amazonia have ancestry unrelated to that of other indigenous Americans; this ghost population seems to be most closely related to Australo-Melanesians.

  5. Would appreciate some clarification:

    1. So where did the Tianyuan/Amur population come from, if not from a divergent Onge-like population in Northeast Asia?

    2. Is the implication that there were two waves out of the north, the earlier one becoming the Yangzi rice-farmers, and the later one the Han who eventually migrated south during the imperial period?

    3. It seems as though the Sino-Tibetan languages originate close to the Turkic, Mongolic, and Korean languages, yet there is almost no affinity between them. How can this be explained?

    4. Isn’t it strange that the Onge-like ancestry is higher in Tibet and Japan than in Thailand/Malaysia?

  6. 45% Onge-related ancestry among the Jomon does not explain the high rate (~35%) of Y-HG D among modern Japanese – not in a two-way model of 85% Han-or-Korean, 15% Jomon; especially as the Jomon were the aboriginal, subject peoples.

  7. “45% Onge-related ancestry among the Jomon does not explain the high rate (~35%) of Y-HG D among modern Japanese – not in a two-way model of 85% Han-or-Korean, 15% Jomon; especially as the Jomon were the aboriginal, subject peoples.”

    this could easily be explained by drift, which is most extreme on the Y due to lower effective population size

  8. D among modern Japanese is maybe akin to I2 among Sardinians (transmitted via early WHG introgression, despite their notoriously overwhelming AF ancestry).

  9. I2 among Sardinians has a pretty strong founder effect, big expansion from a small base in the LCA->BA. D in Japan the same, or not so much?

  10. So am I understanding this correctly?

    Onge-related ancestry is a non-trivial contribution to East Asian genetics, especially, it seems, for the ancestral Japanese?

    Does that mean East asians onge like ancestor and the indian subcontinent AASI ancestry are closely related? Or is there a big distance between them?

  11. Is the East Eurasian component of the Amerindians related to the Amur River Valley population, or is it an earlier off-shoot which is probably associated with paleo-Siberians (i.e. before Turkics, Tungusics and Mongolics)?

    Also, were the ancestral Uralics and y-dna haplogroup N carriers also descended from the Amur River Valley population?

    Honestly, this whole topic of the origins of East Eurasians is quite fascinating. Hopefully one day they’ll have a clear phylogenetic tree of the various groups.

  12. @KM

    They are very closely related but I don’t understand why AASI ydna (H) and Onge ydna (D) profiles are so different.

  13. @thejkhan

    “They are very closely related but I don’t understand why AASI ydna (H) and Onge ydna (D) profiles are so different.”

    Honestly, we can’t say that with much certainty. Without aDNA, we’ll never have a legitimately grounded conception of what we mean when we say “AASI”.

    We still don’t even know where to draw the line between ENA and West Eurasian, in the context of South Asia! Iran_N is skewed towards South Asia when compared to CHG; Mesolithic Iranians are skewed towards South Asia when compared to Iran_N; Indus_Periphery is skewed towards South Asia when compared to Iran_Hotu/Belt.

    ^ So, how much of this is a function of increasing “AASI”, and how much is a function of increasing West Eurasian ancestry unique to South Asia and its environs?

    You can model some Indus_Periphery as only 10% Onge… but that’s using Iran_N. You can model Iran_N as 15% Levant_N + 85% Indus_Periphery. Which is closer to reality? Do some Indus_Periphery samples even have AASI (whatever AASI is)… or, does Iran_N even have Levant_N?

    And again, we’re still fuzzy on the whole AASI problem… so we might actually be asking conceptually confused questions. We really need empirical data to clear up said conceptual ambiguity.

    Anyway, if my memory serves me right, the South/Central Asian aDNA paper has a model of AASI diverging from the same point at which Onge, Australasians, and East Asians split. Again, going by memory (so I might be wrong).

    And for what it’s worth, there’s y-DNA haplogroup H in Neolithic Anatolians and Europeans (different lineage from what we see in South Asia… but H nonetheless).

    So not sure if we can’t just construe it as a very ancient set of West Eurasian lineages in the context of South Asia.

  14. Just throwing this out there: There is a yDNA D Hoabinhian from Malaysia.

    I think that the H in south Asia is intrusive within the last ~30,000 years and it largely replaced the previous yDNA haplogroups K2b, C1b and possibly D.

    Regarding the Iran populations- I think any model will be flimsy without knowing what the Basal Eurasian is.

  15. “Probably no one knows but I wonder what the effect of this will be on the accuracy of these.”

    usually it’s better if they can get enough markers. phylogenetic analysis is inured to noisy data if it’s not systematic

  16. “4. Isn’t it strange that the Onge-like ancestry is higher in Tibet and Japan than in Thailand/Malaysia?”

    this paper doesn’t talk about thailand and malaysia. there’s more ‘onge-like’ in these groups since they mixed with local ppl

  17. “Did ancestors of Genghis Khan remove the Yamnaya from Altai?”

    they absorbed the iranian groups. the afanasievo are really the yamnaya, and they seem to have been kicked to the curb earlier

  18. “Does that mean East asians onge like ancestor and the indian subcontinent AASI ancestry are closely related? Or is there a big distance between them?”

    a lot of stuff diverged btwn 50 and 40 K bp. AASI ancestry in modern indians and Onge diverged 40K BP. this is really an old old split. so all these onge-like groups indicate a deeper split in east eurasians. it’s not as simple as north-south obv. it looks like amazonians have some of the onge-like component.

  19. “Is the East Eurasian component of the Amerindians related to the Amur River Valley population, or is it an earlier off-shoot which is probably associated with paleo-Siberians (i.e. before Turkics, Tungusics and Mongolics)?”

    in the paper they assert that it is like the devil’s gate people. but it’s more nuanced than that. see this paper https://www.biorxiv.org/content/10.1101/448829v1

    also, na dene is more recent

  20. “Anyway, if my memory serves me right, the South/Central Asian aDNA paper has a model of AASI diverging from the same point at which Onge, Australasians, and East Asians split. Again, going by memory (so I might be wrong).”

    the east asians split a little earlier, but yeah, it is almost polytomic for the others

  21. Are the Ancient northeast Asians – modeled by the Amur-river valley populations or the Devil’s Gate sample – also carrying deep Onge-related ancestry, or it is just the southern group that carry it? (not counting Tibetans and Japanese, of course)
