
The population genetic structure of China (through noninvasive prenatal testing)


This week a big whole genome analysis of China was published in Cell: “Genomic Analyses from Non-invasive Prenatal Testing Reveal Genetic Associations, Patterns of Viral Infections, and Chinese Population History.” The abstract:

We analyze whole-genome sequencing data from 141,431 Chinese women generated for non-invasive prenatal testing (NIPT). We use these data to characterize the population genetic structure and to investigate genetic associations with maternal and infectious traits. We show that the present day distribution of alleles is a function of both ancient migration and very recent population movements. We reveal novel phenotype-genotype associations, including several replicated associations with height and BMI, an association between maternal age and EMB, and between twin pregnancy and NRG1. Finally, we identify a unique pattern of circulating viral DNA in plasma with high prevalence of hepatitis B and other clinically relevant maternal infections. A GWAS for viral infections identifies an exceptionally strong association between integrated herpesvirus 6 and MOV10L1, which affects piwi-interacting RNA (piRNA) processing and PIWI protein function. These findings demonstrate the great value and potential of accumulating NIPT data for worldwide medical and genetic analyses.

In The New York Times write-up there is an interesting detail, “This study served as proof-of-concept, he added. His team is moving forward on evaluating prenatal testing data from more than 3.5 million Chinese people.” So what he’s saying is that this study with >100,000 individuals is a “pilot study.” Let that sink in.

The PCA at the top of the post is a bit busy, so I want to highlight the salient aspect. These results confirm that 5-10% of the ancestry of the Hui, Chinese-speaking Muslims, is West Eurasian. The Uygur and Kazakh, on the left of the plot, are about 40%. The authors note that the Manchus overlapped almost perfectly with individuals sampled from Northern China. This is expected because by the end of the Qing dynasty most of the Manchus had been fully Sinicized, and in the 20th century they were fully assimilated. Recently, due to an emphasis on “national minorities” and some privileges granted therein, many people who are in all other ways simply northern Han have identified as Manchu on the basis of some ancestry (the Manchu language is moribund).

The sections on particular adaptations which vary by region are not surprising. In books like The Retreat of the Elephants the slow, gradual, and inexorable expansion of the Chinese beyond the Yangzi basin is described in a way that makes it clear that southern diseases and climate were a major impediment. But through a process of acclimation, assimilation of local peoples, and adaptation, by 1000 AD the center of demographic gravity had shifted to the south.

There is a section of the text which I think will be falsified though:

After removing participants with 49bp read length and with sequencing error rate >0.00325, a principal component analysis of 45,387 self-reported Han Chinese from the 31 administrative divisions showed that the greatest differentiation of Han Chinese is along a latitudinal gradient (Figures S3E and S3F), consistent with previous studies (Chen et al., 2009, Xu et al., 2009). In contrast, there is, perhaps surprisingly, very little differentiation from East to West. This observation may be explained by the fact that a large proportion of the western Han populations in China are recent immigrants organized by the central government starting from 1949 when the People’s Republic of China was founded (Liang and White, 1996).

I don’t think there’s any need to make recourse to migration from 1949 and after. The argument in Guns, Germs, and Steel suffices: it’s just easier to move east-west along a latitude than north-south across latitudes. The people of the north eat noodles made from wheat, and the people of the south eat rice. This is a big cultural transition for peasants to make, and so it didn’t happen as often as moving toward the coast or inland. We have documented instances of mass migrations from adjacent provinces due to famine and political instability. In the 17th century, conflicts resulted in the depopulation of Sichuan and the arrival of large numbers of people from Hunan and Hubei to the east.

The plot below is one of the more interesting ones from the paper. From left to right: alleles private to the HapMap Utah whites (CEU) that are also found among all individuals in a given province, and then among just the Han; then alleles private to ethnic Telugu Indians (from South India) that are also found among all individuals in a given province, and then among just the Han.


The first thing to notice is that there is a correlation between the Han and non-Han. This shouldn’t be surprising. Plenty of ethnic groups have become Han through acculturation and become demographically absorbed. This is probably truer in parts of the south than in the north, but southern Chinese ethnic minorities are genetically and culturally much more like the Han in the first place.

The private alleles shared with Northern Europeans (CEU) almost certainly have to do with the interaction sphere of the steppe pastoralists, which extends from the Carpathians to Mongolia. The relatively high frequency of R1a, and to a lesser extent R1b, among many Turkic/Central Asian peoples is a pretty good sign of where this West Eurasian ancestry comes from.

The Indian affinity is perhaps more interesting. To be honest I was surprised at the high affinity in Yunnan and Hainan. Tibet has strong cultural connections to India through its form of Buddhism. But it’s interesting that Qinghai, where many Tibetans also live, does not have the affinity with India. What’s going on in the other provinces? I suspect that the aboriginal peoples assimilated by the Han and other groups in this region probably had some distant connections to the non-West Eurasian ancestry in South Asia.

9 thoughts on “The population genetic structure of China (through noninvasive prenatal testing)”

  1. First time I went to Hainan, it was interesting to me that there are a couple of prominent, and very old looking, mosques in Sanya; possibly more that might not be so noticeable.

    The other thing I found interesting is that the local Han, as opposed to Han from other parts of China who have moved there in recent time to work, are very small in stature and pale brownish in skin tone, and very friendly and talkative. That’s as opposed to the aboriginal ethnic minorities, who are definitely not friendly or talkative. So, I assumed that was due to mixing – question is, mixing with whom?

  2. If I understand correctly, the Xibe still (to some extent) identify as a separate ethnic group and speak a Manchu-related language. Is anything known about their genetics?

  3. In Outer Manchuria (Primorsky Krai, Amur Oblast, Jewish Autonomous Oblast, Khabarovsk Krai) in Russia, I believe that there are still some Manchu/Jurchen-related peoples (Udege, Nanai, Ulchi).

    Have there been studies done on them?

  4. Xibe Y-DNA (from the HGDP sample set; cf. Shi et al. 2010, Lippold et al. 2014, YFull, etc.)
    1/8 C-L1373(xM48, M401) [probably C-F1756]
    1/8 C-M77/M86(xB469)
    1/8 C-B469*(xB89)
    1/8 J2a1h-M530
    1/8 K-M9(xL-M20, M1-M106, NO-M214, P-M45) [most likely haplogroup T; less likely, it may be a rare instance of K*, most of which have been found in South Asia, Southeast Asia, and Oceania, or a geographically outlying member of Near Oceanian haplogroup S]
    1/8 N-Tat [probably N-F4205; TMRCA with an ethnic Mongol is approximately 1,500 ybp; that pair’s TMRCA with a pair of Russians is approximately 4,000 ybp]
    1/8 O-M122(xO2a1-KL1/L465, O2a2-IMS-JST021354/P201)
    1/8 O-CTS335*(xCTS3856) [a branch of O-F444; O-CTS3856 has been found in at least Hunan Han, Beijing Han, and Thailand, and another instance of O-CTS335*(xCTS3856) has been found in at least one Japanese in Tokyo; TMRCA estimated to be 4,200 [95% CI 3,100–5,300] ybp]

    Xibe mtDNA (from the HGDP sample set)
    1/9 U4a2a1 [U4a is found in Spain, Italy, Czech, Slovakia, Poland, Belarus, Russia, Finland. U4a2 is found in England, Slovakia, Poland, Bulgarian Turks, Belarus, Estonia, Finland, Russia, Pamir. U4a2a is found in Spain, Italy, Czech, Slovakia, Serbia, Poland, Lithuania, Sweden, Russia, Armenian. U4a2a1 is found in Serbia, Swedish.]
    1/9 F1a1c [Russia, Mongolia, Japan, China, Tibet, Thailand, Moken]
    1/9 R11b [Altai-Kizhi, China, Tibet, Thailand-Laos]
    1/9 C4a1’5 [all over Central Asia and southern Siberia]
    1/9 C5a1 [Ulchi, Khanty, Yakut, Altai-Kizhi, Buryat, Bargut, Severo-Evensk, Khamnigan, Mongolia, Uyghur]
    2/9 C5d1 [Yukaghir, Tuvan, Altai-Kizhi, Stony Tunguska Evenk, Tompo Even, Kamchatka Even]
    1/9 Z3a [China, Tibet; Z3a1 is found in Yakut, China, Thailand-Laos]
    1/9 D4b2 [Japan, Russia, Buryat, Bargut, China, Uyghur, Tibet, Pamir, Thailand, Thailand-Laos, Armenian, Saudi Arabia]

    Xibe Y-DNA (Xue et al. 2006)
    1/41 = 2.4% BT-SRY10831.1(xC-M130, DE-YAP, J-12f2, K-M9) [probably Western Eurasian G-M201, European I-M170, or mainly South Asian H-L901/M2939]
    9/41 = 22.0% C-M217(xM48) [typically “Altaic,” East Asian, or aboriginal North American]
    2/41 = 4.9% C-M48 [typically Tungusic, Nivkh, Yukaghir, Chukotko-Kamchatkan, Mongolic, or Turkic]
    1/41 = 2.4% DE-YAP(xE-M40) [typically Andamanese, Tibetan, or Japanese]
    3/41 = 7.3% J-12f2 [typically Southwest Asian or Mediterranean]
    2/41 = 4.9% K-M9(xNO-M214, P-92R7)
    4/41 = 9.8% N-LLY22g(xM128, P43, Tat) [typically found in populations of western/southwestern China, such as Yi]
    1/41 = 2.4% N-M128
    2/41 = 4.9% N-M178
    3/41 = 7.3% O-M119 [typically found in populations of southeastern China, Daic peoples, and Austronesian peoples]
    1/41 = 2.4% O-M176(x47z) [typically Korean]
    2/41 = 4.9% O-M122(xM159, M7, M134) [LINE1-]
    2/41 = 4.9% O-M122(xM159, M7, M134) [LINE1+]
    5/41 = 12.2% O-M134(xM117) [possibly from assimilation of local Kazakhs]
    2/41 = 4.9% O-M117
    1/41 = 2.4% P-92R7(xR1a-SRY10831.2)

    Xibo Y-DNA (Shou et al. 2010)
    4/32 = 12.5% C-M130
    1/32 = 3.1% J-M304(xJ2-M172)
    3/32 = 9.4% K-M9(xN-M231, O-M175, P-M45)
    5/32 = 15.6% N-M231
    7/32 = 21.9% O-M175(xM119, M95, M122) [Note this major difference from the Xibe samples of Xue et al. 2006 and the HGDP. O-M175(xM119, M95, M122) most likely should belong to O1b1-F2320(xM95), which is most often found among Han Chinese (approx. 5%), or to O1b2-M176, which is most often found among Japanese and Koreans (approx. 30%).]
    4/32 = 12.5% O1a-M119
    2/32 = 6.3% O-M122(xM134)
    5/32 = 15.6% O-M134
    1/32 = 3.1% R1a1-M17

    Xibe Y-DNA (Zhong et al. 2010, 2011)
    2/61 = 3.3% D-M174
    2/61 = 3.28% C-M130(xM8, M38, M217, M347, M356, P55)
    12/61 = 19.67% C-M217(xM93, P39, M48, M407, P53.1, P62)
    6/61 = 9.84% C-P53.1
    2/61 = 3.28% C-M356 [Usually found among South Asians.]
    1/61 = 1.6% J2b2-M241
    11/61 = 18.0% N-M231
    24/61 = 39.3% O-M175
    1/61 = 1.6% R-M17

    The Xibe Y-DNA pool seems to be overall quite similar to the Manchu Y-DNA pool as one might expect according to history and linguistic phylogeny. Both populations appear to be genetically intermediate between Koreans/Northern Chinese on one side and indigenous Siberians (including their linguistic relatives) on the other. Both populations have historically recent origins in the region that is now Northeast China, so their Y-DNA seems to agree with the generally observed high correlation between geography and genetics. However, different studies of these populations have found greatly differing frequencies of certain haplogroups, most notably C-M48 and O-M176, so either the sampling in at least some of the studies must be inadequate or the real populations themselves must be inhomogeneous.
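
    To get a feel for how much of the disagreement between studies could be sampling noise alone, here is a minimal sketch that puts an approximate 95% Wilson score interval around a few of the counts quoted above. The choice of the Wilson interval, the function name, and the z value of 1.96 are my assumptions, not something taken from the cited papers.

```python
import math

def wilson_interval(k, n, z=1.96):
    """Approximate 95% Wilson score interval for an observed proportion k/n."""
    p = k / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half

# Counts copied from the Xibe lists above; the labels are only for printing.
samples = {
    "C-M48, Xue et al. 2006": (2, 41),
    "O-M176(x47z), Xue et al. 2006": (1, 41),
    "O-M175(xM119, M95, M122), Shou et al. 2010": (7, 32),
}

for label, (k, n) in samples.items():
    lo, hi = wilson_interval(k, n)
    print(f"{label}: {k}/{n} = {k / n:.1%}, interval roughly {lo:.1%} to {hi:.1%}")
```

    With samples of a few dozen men the intervals are wide, so inadequate sampling cannot be ruled out on these numbers alone.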

    Data regarding Xibe mtDNA are insufficient. There is not much information available regarding Manchu mtDNA, either, but what little is available suggests that they cannot be easily distinguished from other northern Chinese on the basis of mtDNA.

    Like other populations of northern China and Mongolia, both populations exhibit small amounts of Western Eurasian admixture in both male (e.g. J-12f2(xJ2-M172), J2a1h-M530, J2b2-M241, R1a1-SRY10831.2) and female lineages (e.g. T, U4a2a1). They may also contain small fractions of male-mediated Iranian- or South Asian-like admixture (e.g. C-M356 in some Xibe males, probable R2 and L in some Manchu males).

  5. HGDP01245 from the HGDP sample of Xibe does indeed belong to T1a-M70 according to Chuan-Chao Wang, Lei Shang, Hui-Yuan Yeh, and Lan-Hai Wei, “The Consistencies of Y-Chromosomal and Autosomal Continental Ancestry Varying among Haplogroups,” J Forensic Sci Med 2016;2:229-32. Therefore, T-M70 should be counted among the Western Eurasian Y-DNA haplogroups that have been observed among the Xibe; others are J2a1h-M530, J2b2-M241, J-M304(xJ2-M172), and R1a1a-M17.

    I have collected the following data regarding the Y-DNA of Northern Tungusic/Ewenic speakers in Siberia:

    Even from Eveno-Bytantaysky National district and Momsky district of Sakha Republic Y-DNA (Fedorova et al. 2013)
    10/24 = 41.7% C-M48
    10/24 = 41.7% N-Tat
    2/24 = 8.3% N-P43
    1/24 = 4.2% J-12f2
    1/24 = 4.2% R-M269

    Even from Sakkyryyr, Eveno-Bytantay region, Sakha Republic Y-DNA (Pakendorf et al. 2007, Duggan et al. 2013)
    1/25 = 4.0% C3c-M48(xM86)
    4/25 = 16.0% C3c1-M86
    1/25 = 4.0% N1b-P43
    19/25 = 76.0% N1c-Tat

    Even from Sebjan-Küöl [i.e. Sebyan-Kyuyol], Kobyaysky District, Sakha Republic Y-DNA (Pakendorf et al. 2007, Duggan et al. 2013)
    1/14 = 7.1% C-M130(xM217) [Although this individual’s Y-chromosome has been classified by the authors as C(xC3), it shares an identical 11-marker Y-STR haplotype with the Y-DNA of the C3c*-M48(xM86) Even from Sakkyryyr. Most likely, these two individuals both belong to C-M48(xM86).]
    4/14 = 28.6% C3c1-M86
    9/14 = 64.3% N1c-Tat

    Even from Tompo District, Sakha Republic Y-DNA (Pakendorf et al. 2007, Duggan et al. 2013)
    1/28 = 3.6% C3-M217(xM48)
    13/28 = 46.4% C3c1-M86
    12/28 = 42.9% N1b-P43 [“all the Tompo Evens carrying haplogroup N1b share the same STR haplotype”]
    2/28 = 7.1% N1c-Tat

    Even from Berezovka, Srednekolymsky District, Sakha Republic Y-DNA (Duggan et al. 2013)
    7/7 = 100% C3c1-M86

    Even from Magadan Oblast Y-DNA (Karafet et al. 2002, Tambets et al. 2004, Hammer et al. 2006)
    4/31 = 12.9% C-M217(xM86)
    19/31 = 61.3% C-M86
    1/31 = 3.2% I-P19
    4/31 = 12.9% N-M178
    1/31 = 3.2% Q-P36
    2/31 = 6.5% R1a-SRY10831.2

    Even from Kamchatka Y-DNA (Duggan et al. 2013)
    15/15 = 100% C3c1-M86

    Even Y-DNA total (Karafet et al. 2002/Tambets et al. 2004/Hammer et al. 2006 + Fedorova et al. 2013 + Duggan et al. 2013)
    79/144 = 54.9% C-M130 [mostly C-M86, though there are also some cases of C-M48(xM86) and C-M217(xM48)]
    44/144 = 30.6% N-Tat [mostly of the Yakut type; especially frequent among Evens in Yakutia]
    15/144 = 10.4% N-P43 [especially frequent among Evens in Tompo District of eastern Yakutia, though all cases sampled there share an identical Y-STR haplotype]
    1/144 = 0.7% I-P19
    1/144 = 0.7% J-12f2
    1/144 = 0.7% Q-P36
    2/144 = 1.4% R1a-SRY10831.2
    1/144 = 0.7% R1b-M269
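
    The pooled Even figures above can be reproduced by summing the per-sample counts. This is a rough sketch: collapsing the subclades into the coarse labels of the pooled table (for example, counting N-M178 and N1c-Tat together as N-Tat, and all C subclades as C-M130) is my reading of how the totals were built, and the dictionary keys are only illustrative.

```python
from collections import Counter

# Per-sample counts transcribed from the lists above, collapsed to the
# coarse labels of the pooled table (the collapsing is an assumption).
samples = {
    "Fedorova 2013": {"C-M130": 10, "N-Tat": 10, "N-P43": 2, "J-12f2": 1, "R1b-M269": 1},
    "Sakkyryyr": {"C-M130": 5, "N-P43": 1, "N-Tat": 19},
    "Sebyan-Kyuyol": {"C-M130": 5, "N-Tat": 9},
    "Tompo": {"C-M130": 14, "N-P43": 12, "N-Tat": 2},
    "Berezovka": {"C-M130": 7},
    "Magadan": {"C-M130": 23, "I-P19": 1, "N-Tat": 4, "Q-P36": 1, "R1a-SRY10831.2": 2},
    "Kamchatka": {"C-M130": 15},
}

# Sum the counts across all samples.
pooled = Counter()
for counts in samples.values():
    pooled.update(counts)

total = sum(pooled.values())
for haplogroup, k in pooled.most_common():
    print(f"{k}/{total} = {k / total:.1%} {haplogroup}")
```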

    Negidal Y-DNA (Lell et al. 2002)
    2/17 = 11.8% C-M130(xM48)
    9/17 = 52.9% C-M48
    6/17 = 35.3% N-Tat

    Evenk from Taimyr Y-DNA (Duggan et al. 2013)
    8/18 = 44.4% C3c1-M86
    7/18 = 38.9% N1b-P43
    3/18 = 16.7% R1a

    Evenk/Siberia, middle reaches of the Nizhnyaya Tunguska River according to map (Derenko et al. 2006)
    20/50 = 0.400 C-RPS4Y
    9/50 = 0.180 N1-LLY22g(xN1c1-Tat)
    8/50 = 0.160 N1c1-Tat
    7/50 = 0.140 R1a1a-M17
    3/50 = 0.060 R1-M173(xR1a1a-M17)
    2/50 = 0.040 F-M89(xG-M201, H1-M52, I-M170, J-12f2, K-M9)
    1/50 = 0.020 I-M170

    Evenk from Stony Tunguska River basin Y-DNA (Pakendorf et al. 2006, Duggan et al. 2013)
    28/40 = 70.0% C3c1-M86
    1/40 = 2.5% I-M170
    11/40 = 27.5% N1b-P43

    Yenisey Evenk Y-DNA (Lell et al. 2002)
    18/31 = 58.1% C-M48
    1/31 = 3.2% O-M119
    6/31 = 19.4% K-M9(xM119, Tat, M45) [likely N-P43]
    3/31 = 9.7% N-Tat
    3/31 = 9.7% R-M17

    Evenk from Iengra River basin Y-DNA (Pakendorf et al. 2007, Duggan et al. 2013)
    2/9 = 22.2% C3c*-M48(xM86)
    4/9 = 44.4% C3c1-M86
    2/9 = 22.2% N1c-Tat
    1/9 = 11.1% O-M175

    Evenk/Ust-Maysky, Oleneksky, and Zhigansky districts of Sakha Republic (Fedorova et al. 2013)
    15/57 = 26.3% C3b2-M48
    3/57 = 5.3% C3-M217(xC3b2-M48, C3e1a-M407)
    29/57 = 50.9% N1c1-Tat
    5/57 = 8.8% N1c2b-P43
    3/57 = 5.3% R1a1a-M198(xR1a1a1b1a1-M458)
    1/57 = 1.8% R1a1a1b1a1-M458
    1/57 = 1.8% I2a1-P37

    Siberian Evenk Y-DNA (Karafet et al. 2001, Hammer et al. 2006) [Apparently, most of these Evenk DNA samples were collected in the Nyukzha River basin.]
    13/95 = 13.7% C3-M217(xC3b2a-M86)
    52/95 = 54.7% C3b2a-M86
    16/95 = 16.8% N1c1a-M178
    2/95 = 2.1% N1c2b-P43
    5/95 = 5.3% I-P19
    2/95 = 2.1% J-12f2 [Hammer et al. 2006] or 1/95 = 1.1% F-P14(xI-P19, G2a-P15, J-p12f2, K-M9) and 1/95 = 1.1% J2-M172 [Karafet et al. 2001]
    4/95 = 4.2% Q-P36(xM3)
    1/95 = 1.1% R1a1-SRY10831.2

    Okhotsk Evenk Y-DNA (Lell et al. 2002) [“Samples from a geographically isolated group of Evenks were collected in several small settlements on the mainland Okhotsk Sea shore in the Tugur-Chumikan District of the Khabarovsk Region.”]
    10/16 = 62.5% C-M48
    6/16 = 37.5% N-Tat

    Siberian Evenk Y-DNA total (Lell et al. 2002 + Hammer et al. 2006 + Derenko et al. 2006 + Fedorova et al. 2013 + Duggan et al. 2013)
    173/316 = 54.7% C-M130 [mostly C-M86, but there are also some cases of C-M48(xM86) and C-M217(xM48), as among the Evens]
    64/316 = 20.3% N-M46 [tends to be more frequent toward the east, in Evenks from Yakutia and the shores of the Sea of Okhotsk]
    40/316 = 12.7% K-M9(xM119, Tat, M45) [appears to be all N-P43; tends to be more frequent toward the west, in Evenks from Taimyr Peninsula and the basin of the Yenisei River]
    18/316 = 5.7% R1a
    8/316 = 2.5% I-P19/M170
    4/316 = 1.3% Q-P36(xM3)
    3/316 = 0.95% R1-M173(xM17)
    3/316 = 0.95% F(xG2a-P15, I-P19/M170, J-12f2, K-M9) [one of these may actually belong to J-12f2(xJ2-M172)]
    1/316 = 0.32% J2-M172
    1/316 = 0.32% O-M175 with subclade undetermined
    1/316 = 0.32% O-M119

    It appears that most Ewenic-speaking males belong to haplogroup C-M86. The TMRCA of C-M86 is currently estimated by YFull to be 3,800 [95% CI 3,100–4,600] ybp. The rest of the population seems to consist of assimilated descendants of other indigenous Siberians (perhaps Yakuts, Samoyeds, Yukaghirs, and Koryaks), non-indigenous Russians/Soviets, and perhaps a few Mongols.

  6. Oroqen Y-DNA (HGDP sample set)
    3/7 C-M86 [probably C-B469]
    1/7 C-F5483/SK1074 [related to Y-chromosomes found among Daurs, Buryats, Mongols, Manchus, Xibes]
    1/7 C-F2613(xM407) [probably C-F2613(xCTS8579)]
    2/7 N-M46/Page70/Tat [HGDP01203 belongs to N-Y125664, which also has been found in Hebei; TMRCA with N-Y23749 of Japan is estimated to be 6,600 [95% CI 5,400–7,800] ybp]

    Oroqen Y-DNA (Hammer et al. 2006)
    5/22 = 22.7% C-M217(xM86)
    15/22 = 68.2% C-M86
    1/22 = 4.5% N-M178
    1/22 = 4.5% O-P31(xM176, M95)

    Oroqen Y-DNA (Xue et al. 2006)
    6/31 = 0.194 C3-M217(xC3b2-M48)
    13/31 = 0.419 C3b2-M48
    1/31 = 0.032 K-M9(xNO-M214, P-92R7)
    2/31 = 0.065 N1c2b-P43
    1/31 = 0.032 O-M175(xO1a-M119, O2-P31, O3-M122)
    2/31 = 0.065 O2-P31(xO2a1-M95, O2b-M176)
    2/31 = 0.065 O3-M122(xO3a2a-M159, O3a2b-M7, O3a2c1-M134)
    2/31 = 0.065 O3a2b-M7
    1/31 = 0.032 O3a2c1-M134(xO3a2c1a-M117)
    1/31 = 0.032 O3a2c1a-M117

    Oroqen Y-DNA total (HGDP + Hammer et al. 2006 + Xue et al. 2006)
    44/60 = 73.3% C-M130 total [mostly C-M86]
    1/60 = 1.7% K-M9(xNO-M214, P-92R7)
    3/60 = 5.0% N-M46 [including at least one member of N-Y125664]
    2/60 = 3.3% N-P43
    3/60 = 5.0% O-P31(xM95, M176)
    2/60 = 3.3% O-M122(xM159, M7, M134)
    2/60 = 3.3% O-M7
    1/60 = 1.7% O-M175(xM119, P31, M122)
    1/60 = 1.7% O-M134(xM117)
    1/60 = 1.7% O-M117

    Ewenki from NE China (Xue et al. 2006) [Most so-called “Ewenki” in NE China are members of the Solon ethnic group. This group has less in common with the Evenks of Siberia than the Oroqen have with the latter.]
    1/26 = 3.8% C-M130(xM8, M217)
    7/26 = 26.9% C-M217(xM48)
    7/26 = 26.9% C-M48
    1/26 = 3.8% K-M9(xNO-M214, P-92R7)
    1/26 = 3.8% N-M214(xM128, P43, Tat, M175)
    2/26 = 7.7% O-M119
    1/26 = 3.8% O-M95(xM88)
    1/26 = 3.8% O-M122(xM159, M7, M134) [LINE1+]
    1/26 = 3.8% O-M134(xM117)
    4/26 = 15.4% O-M117

    Manchurian/Chinese Evenk (Karafet et al. 2001, Hammer et al. 2006)
    4/41 = 9.8% C-M217(xM86)
    14/41 = 34.1% C-M86
    1/41 = 2.4% N-M214/LLY22g(xM128, P43, M178)
    1/41 = 2.4% N-P43
    6/41 = 14.6% O-M134
    4/41 = 9.8% O-M122(xM134) [LINE+]
    1/41 = 2.4% O-M119(xM110)
    2/41 = 4.9% O-P31(xM176, M95)
    1/41 = 2.4% O-M176(x47z)
    1/41 = 2.4% O-M95(xM111)
    4/41 = 9.8% Q-P36(xM3)
    2/41 = 4.9% R1a-SRY10831.2

    Ewenki (Inner Mongolia) (Zhong et al. 2010, 2011)
    3/31 = 9.7% C-M48
    1/31 = 3.2% C-M217(xM93, P39, M48, M407, P62) [P53.1+]
    4/31 = 12.9% N-M231
    23/31 = 74.2% O-M175

    PRC Evenk total (Xue et al. 2006 + Hammer et al. 2006 + Zhong et al. 2010/2011)
    37/98 = 37.8% C-M130 total [mostly C-M48, most of which is probably C-M86]
    1/98 = 1.0% K-M9(xNO-M214, P-92R7)
    7/98 = 7.1% N-M231 [at least one N-P43]
    47/98 = 48.0% O-M175
    4/98 = 4.1% Q-P36(xM3)
    2/98 = 2.0% R1a-SRY10831.2

    Ewenic-speaking populations in the PRC have much more O-M175 and much less N-M231 than their linguistic relatives in Russian territory. This may be a result of greater opportunity for Ewenic-speaking women to marry Chinese (or other typically O-M175-carrying) men in the PRC versus greater opportunity for Ewenic-speaking women to marry Yakut, Nenets, or other typically N-M231-carrying men in Russia.
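
    The contrast can be read straight off the two pooled tables above. A minimal sketch follows; treating the 40 Siberian K-M9(xM119, Tat, M45) chromosomes as N-P43 (and therefore as N-M231) follows the bracketed note in the Siberian total and is an assumption, as are the variable and function names.

```python
# Pooled counts taken from the "PRC Evenk total" and "Siberian Evenk Y-DNA total" lists.
prc = {"n": 98, "O-M175": 47, "N-M231": 7}

# Siberia: O = O-M175 (1) + O-M119 (1); N = N-M46 (64) + 40 K-M9(xM119, Tat, M45),
# the latter counted as N-P43 (assumption based on the note in the total table).
siberia = {"n": 316, "O-M175": 1 + 1, "N-M231": 64 + 40}

def share(group, haplogroup):
    """Fraction of the pooled sample carrying the given haplogroup."""
    return group[haplogroup] / group["n"]

for hap in ("O-M175", "N-M231"):
    print(f"{hap}: PRC {share(prc, hap):.1%} vs Siberia {share(siberia, hap):.1%}")
```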

    Note that most people officially classified as Èwēnkèzú in the PRC belong to a subgroup known as the Solons. I am not familiar with any secure etymology of this ethnonym, but I note that it resembles Jurchen-Manchu (Solgo ~ Solho), Mongolian (Soluŋɣus ~ Solongos), and Old Japanese (Siraki ~ Siragi), all of which are exonyms for Korea and Koreans (or, more precisely, for Silla and Sillans in the case of the Old Japanese word). The Old Japanese name sounds like “Whitecastle” or “Whitewood” in Old Japanese, the Mongolian name is almost identical to the Mongolian word for “rainbow” (solongo), and the Jurchen-Manchu word somewhat resembles Manchu words for “yellow” (although those lack the /l/ phoneme). The etymology of “Silla” and the Japanese, Jurchen-Manchu, and Mongolian exonyms for Silla/Korea is also unclear. The similarity between “Solon” and “Silla” may be coincidental, but it is something to consider.

  7. Mitochondrial DNA of Evens

    Even from Eveno-Bytantaysky National District and Momsky District of Sakha Republic (Sardana A Fedorova, Maere Reidla, Ene Metspalu, et al., “Autosomal and uniparental portraits of the native populations of Sakha (Yakutia): implications for the peopling of Northeast Eurasia,” BMC Evolutionary Biology 2013, 13:127. http://www.biomedcentral.com/1471-2148/13/127)
    15/105 = 14.3% C4a1c
    4/105 = 3.8% C4a2
    19/105 = 18.1% C4a total

    2/105 = 1.9% C4b1
    1/105 = 0.95% C4b1a
    1/105 = 0.95% C4b2
    3/105 = 2.9% C4b3a
    2/105 = 1.9% C4b7
    2/105 = 1.9% C4b9
    2/105 = 1.9% C4b
    13/105 = 12.4% C4b total

    2/105 = 1.9% C5a1
    1/105 = 0.95% C5a2a
    3/105 = 2.9% C5a total

    11/105 = 10.5% C5d1

    2/105 = 1.9% C7a1c
    48/105 = 45.7% C total

    4/105 = 3.8% Z1a1b
    4/105 = 3.8% Z1a3
    8/105 = 7.6% Z1a total

    1/105 = 0.95% D4a
    1/105 = 0.95% D3
    5/105 = 4.8% D4c2
    1/105 = 0.95% D2b1
    3/105 = 2.9% D4i2
    1/105 = 0.95% D4j4
    3/105 = 2.9% D4j5
    5/105 = 4.8% D4l2
    2/105 = 1.9% D4m2
    1/105 = 0.95% D4o2
    23/105 = 21.9% D4 total

    3/105 = 2.9% D5a2a2
    26/105 = 24.8% D total

    1/105 = 0.95% M7d
    2/105 = 1.9% M7c
    3/105 = 2.9% M7 total

    9/105 = 8.6% G1b
    1/105 = 0.95% G2a
    10/105 = 9.5% G total

    5/105 = 4.8% Y1a

    1/105 = 0.95% B4

    2/105 = 1.9% F1b
    1/105 = 0.95% F2b1
    3/105 = 2.9% F total

    1/105 = 0.95% J1c5 [also found in 4/125 = 3.2% Evenks from Sakha Republic and 6/423 = 1.4% Yakuts from Sakha Republic]

    Even from Sakkyryyr (also known as Batagay-Alyta), Eveno-Bytantaysky National District, Sakha Republic mtDNA (Ana T. Duggan, Mark Whitten, Victor Wiebe, et al., “Investigating the Prehistory of Tungusic Peoples of Siberia and the Amur-Ussuri Region with Complete mtDNA Genome Sequences and Y-chromosomal Markers,” PLoS ONE 8(12): e83570. doi:10.1371/journal.pone.0083570)
    2/23 = 8.7% C4a1c
    1/23 = 4.3% C4a1d
    2/23 = 8.7% C4b
    1/23 = 4.3% C4b1
    1/23 = 4.3% Z1a1
    1/23 = 4.3% D3
    3/23 = 13.0% D4i2
    1/23 = 4.3% D4j5
    1/23 = 4.3% D4l2
    2/23 = 8.7% D4m2
    2/23 = 8.7% D5a2a2
    2/23 = 8.7% M7c1d
    4/23 = 17.4% F1b1

    Even from Sebjan-Küöl [i.e. Sebyan-Kyuyol], Kobyaysky District, Sakha Republic mtDNA (Duggan et al. 2013)
    3/18 = 16.7% C4a1c
    6/18 = 33.3% C4b
    1/18 = 5.6% C5b1b
    1/18 = 5.6% D3
    1/18 = 5.6% D4e4a
    6/18 = 33.3% D4l2

    Even from Tompo District, Sakha Republic mtDNA (Duggan et al. 2013)
    6/27 = 22.2% C4a1c
    3/27 = 11.1% C4a2
    2/27 = 7.4% C4b
    2/27 = 7.4% C4b1
    1/27 = 3.7% C5d1
    3/27 = 11.1% Z1a
    3/27 = 11.1% D3
    4/27 = 14.8% D4l2
    1/27 = 3.7% D4m2
    1/27 = 3.7% G1b
    1/27 = 3.7% F1b1

    Even from Berezovka, Srednekolymsky District, Sakha Republic mtDNA (Duggan et al. 2013)
    1/15 = 6.7% C4b
    1/15 = 6.7% C4b1
    4/15 = 26.7% C4b3a
    1/15 = 6.7% C4b7
    3/15 = 20.0% Z1a
    1/15 = 6.7% Z1a2
    1/15 = 6.7% D3
    1/15 = 6.7% D4e4a1
    2/15 = 13.3% Y1a

    Even from Kamchatka mtDNA (Duggan et al. 2013)
    2/39 = 5.1% C4a1c
    6/39 = 15.4% C4b1
    3/39 = 7.7% C4b3a
    1/39 = 2.6% C5a2
    1/39 = 2.6% C5d1
    3/39 = 7.7% Z1a
    8/39 = 20.5% Z1a2
    8/39 = 20.5% D4o2
    6/39 = 15.4% G1b
    1/39 = 2.6% Y1a

    Even mtDNA total (Fedorova et al. 2013 + Duggan et al. 2013)
    28/227 = 12.3% C4a1c
    1/227 = 0.44% C4a1d
    7/227 = 3.1% C4a2
    36/227 = 15.9% C4a total

    13/227 = 5.7% C4b
    12/227 = 5.3% C4b1
    1/227 = 0.44% C4b1a
    1/227 = 0.44% C4b2
    10/227 = 4.4% C4b3a
    3/227 = 1.3% C4b7
    2/227 = 0.88% C4b9
    42/227 = 18.5% C4b total

    2/227 = 0.88% C5a1
    1/227 = 0.44% C5a2
    1/227 = 0.44% C5a2a
    4/227 = 1.8% C5a total

    1/227 = 0.44% C5b1b
    13/227 = 5.7% C5d1
    2/227 = 0.88% C7a1c
    98/227 = 43.2% C total

    1/227 = 0.44% D4a
    7/227 = 3.1% D3
    5/227 = 2.2% D4c2
    6/227 = 2.6% D4i2
    1/227 = 0.44% D4j4
    4/227 = 1.8% D4j5
    16/227 = 7.0% D4l2
    5/227 = 2.2% D4m2 [found with great frequency among Nivkhs and with lesser frequency among Tungusic peoples, Yukaghirs, Uyghurs, Yakut, Tuvans, Tubalar, Buryat, and Bargut; D4m1 has been found in Japan; D4m has been observed in a Uyghur and in an individual from the Pamir]
    9/227 = 4.0% D4o2 [found with great frequency among Evens in Kamchatka and with lesser frequency among Hezhens, Yakut, Evenks and Evens in Sakha Republic, and Koryaks]
    2/227 = 0.88% D4e4a
    1/227 = 0.44% D2b1
    57/227 = 25.1% D4 total
    5/227 = 2.2% D5a2a2
    62/227 = 27.3% D total

    9/227 = 4.0% Z1a
    1/227 = 0.44% Z1a1
    4/227 = 1.8% Z1a1b
    9/227 = 4.0% Z1a2
    4/227 = 1.8% Z1a3
    27/227 = 11.9% Z1a total

    16/227 = 7.0% G1b
    1/227 = 0.44% G2a
    17/227 = 7.5% G total

    8/227 = 3.5% Y1a

    2/227 = 0.88% F1b
    5/227 = 2.2% F1b1
    1/227 = 0.44% F2b1
    8/227 = 3.5% F total

    2/227 = 0.88% M7c
    2/227 = 0.88% M7c1d
    1/227 = 0.44% M7d
    5/227 = 2.2% M7 total

    1/227 = 0.44% B4

    1/227 = 0.44% J1c5

    The Evens exhibit very little mtDNA (1/227 = 0.44% J1c5) or Y-DNA (2/144 = 1.4% R1a-SRY10831.2, 1/144 = 0.7% R1b-M269, 1/144 = 0.7% I-P19, 1/144 = 0.7% J-12f2, 5/144 = 3.5% total) of Western Eurasian origin or affinity. They appear to be an example of relatively unadmixed Tungus, though they do exhibit some affinity with non-Tungusic indigenous populations of the extreme east of Siberia (Nivkhs, Yukaghirs, Koryaks, Chukchis, Itelmens, Eskimos) in Y-DNA and especially in mtDNA (e.g. 7.0% G1b + 3.5% Y1a + 2.2% D4m2 = 12.8% Nivkh-like mtDNA total). A few Evens belong to mtDNA haplogroups of generic East Asian affinity (e.g. 2/227 = 0.88% M7c, 2/227 = 0.88% M7c1d, 1/227 = 0.44% M7d, 1/227 = 0.44% B4, 1/227 = 0.44% F2b1) that are rather unlikely candidates for mtDNA haplogroups of proto-Tungusic-speaking women, and may descend from later female migrants from e.g. China (possibly through the medium of some other Siberian ethnic group, e.g. Yakut).
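
    A note on the arithmetic: adding the three rounded percentages gives 12.7%, while the 12.8% quoted above is what the raw counts give when pooled before rounding. A small sketch using the counts from the pooled Even mtDNA table (the variable names are mine):

```python
from fractions import Fraction

n = 227  # pooled Even mtDNA sample size from the table above
nivkh_like = {"G1b": 16, "Y1a": 8, "D4m2": 5}

# Pool the raw counts first, then convert to a percentage once.
pooled = Fraction(sum(nivkh_like.values()), n)
pooled_pct = round(100 * float(pooled), 1)

# Alternative: round each percentage first, then add them up.
sum_of_rounded = sum(round(100 * k / n, 1) for k in nivkh_like.values())

print(f"pooled: {pooled.numerator}/{pooled.denominator} = {pooled_pct}%")
print(f"sum of rounded percentages: {sum_of_rounded:.1f}%")
```

    The same approach gives the Western Eurasian Y-DNA total of 5/144.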

  8. Evenk from Ust-Maysky, Oleneksky, and Zhigansky districts of Sakha Republic mtDNA (Fedorova et al. 2013)
    1/125 = 0.8% Z3(xZ3a)

    1/125 = 0.8% C4a1(xC4a1c, C4a1d)
    11/125 = 8.8% C4a1c
    1/125 = 0.8% C4a1d
    8/125 = 6.4% C4a2
    13/125 = 10.4% C4b1(xC4b1a)
    1/125 = 0.8% C4b3a
    9/125 = 7.2% C4b9
    5/125 = 4.0% C4b(xC4b1, C4b2, C4b3, C4b7, C4b8, C4b9)
    1/125 = 0.8% C5a1
    3/125 = 2.4% C5a2(xC5a2a)
    4/125 = 3.2% C5b1b
    2/125 = 1.6% C5d1
    59/125 = 47.2% C total

    2/125 = 1.6% D4b1(xD3)
    2/125 = 1.6% D3
    2/125 = 1.6% D4c2
    3/125 = 2.4% D2b1
    2/125 = 1.6% D4e4a1
    2/125 = 1.6% D4j4a
    2/125 = 1.6% D4j5
    1/125 = 0.8% D4j8
    8/125 = 6.4% D4l2
    1/125 = 0.8% D4o2
    25/125 = 20.0% D4 total

    10/125 = 8.0% D5a2a2
    35/125 = 28.0% D total

    1/125 = 0.8% M7c

    2/125 = 1.6% G1b
    1/125 = 0.8% G2a(xG2a5)
    3/125 = 2.4% G total

    3/125 = 2.4% A4(xA4b)
    2/125 = 1.6% A4b
    5/125 = 4.0% A4 total

    3/125 = 2.4% F1b

    4/125 = 3.2% J1c5
    7/125 = 5.6% J2
    11/125 = 8.8% J total

    1/125 = 0.8% U4a1

    4/125 = 3.2% H(xH1, H8, H20)
    1/125 = 0.8% H1(xH1a)
    1/125 = 0.8% H8
    6/125 = 4.8% H total

    18/125 = 14.4% J+U+H total

    Evenk from Taimyr mtDNA (Duggan et al. 2013)
    2/24 = 8.3% A2a
    5/24 = 20.8% C4a1c
    1/24 = 4.2% C4a1c1a
    1/24 = 4.2% C4a2
    4/24 = 16.7% C4b1
    1/24 = 4.2% D3
    1/24 = 4.2% D4e4a
    1/24 = 4.2% D4e4a1
    2/24 = 8.3% D4l2
    1/24 = 4.2% D5a2a2
    1/24 = 4.2% G2a1
    1/24 = 4.2% M13a1b
    3/24 = 12.5% Y1a

    Evenk from Stony Tunguska River basin mtDNA (Duggan et al. 2013)
    2/39 = 5.1% A4b
    6/39 = 15.4% C4a1c
    1/39 = 2.6% C4a1c1a
    7/39 = 17.9% C4a2
    3/39 = 7.7% C4b
    5/39 = 12.8% C4b1
    3/39 = 7.7% C4b3
    1/39 = 2.6% C5b1b
    4/39 = 10.3% C5d1
    1/39 = 2.6% D3
    5/39 = 12.8% D4e4a
    1/39 = 2.6% F1b1

    Evenk from Nyukzha River basin mtDNA (Duggan et al. 2013)
    5/46 = 10.9% A4
    2/46 = 4.3% C4a1c
    1/46 = 2.2% C4a1d
    10/46 = 21.7% C4a2
    3/46 = 6.5% C4b1
    1/46 = 2.2% C4b1a
    1/46 = 2.2% C5a2
    1/46 = 2.2% C7a1c
    1/46 = 2.2% D4e4a
    1/46 = 2.2% D4j4
    4/46 = 8.7% D4l2
    3/46 = 6.5% D5a2a2
    5/46 = 10.9% H8b
    2/46 = 4.3% J2a2b
    1/46 = 2.2% M7a2a
    3/46 = 6.5% Z1a
    2/46 = 4.3% Z1a1

    Evenk from Iengra River basin mtDNA (Duggan et al. 2013)
    2/21 = 9.5% A4
    2/21 = 9.5% C4a2
    1/21 = 4.8% C4b1
    1/21 = 4.8% C5a2
    2/21 = 9.5% D4i2
    1/21 = 4.8% D4j4
    2/21 = 9.5% D4j4a
    1/21 = 4.8% D4l2
    6/21 = 28.6% D5a2a2
    2/21 = 9.5% G1b
    1/21 = 4.8% Z1a1

    Siberian Evenk mtDNA total (Duggan et al. 2013 + Fedorova et al. 2013)
    10/255 = 3.9% A4
    4/255 = 1.6% A4b
    2/255 = 0.78% A2a
    16/255 = 6.3% A total

    1/255 = 0.39% C4a1
    24/255 = 9.4% C4a1c
    2/255 = 0.78% C4a1c1a
    2/255 = 0.78% C4a1d
    29/255 = 11.4% C4a1 total

    28/255 = 11.0% C4a2
    57/255 = 22.4% C4a total

    8/255 = 3.1% C4b
    26/255 = 10.2% C4b1
    1/255 = 0.39% C4b1a
    3/255 = 1.2% C4b3
    1/255 = 0.39% C4b3a
    9/255 = 3.5% C4b9
    48/255 = 18.8% C4b total

    1/255 = 0.39% C5a1
    5/255 = 2.0% C5a2
    6/255 = 2.4% C5a total

    5/255 = 2.0% C5b1b

    6/255 = 2.4% C5d1
    17/255 = 6.7% C5 total

    1/255 = 0.39% C7a1c
    123/255 = 48.2% C total

    2/255 = 0.78% D4b1
    4/255 = 1.6% D3
    6/255 = 2.4% D4b total

    2/255 = 0.78% D4c2

    3/255 = 1.2% D2b1
    7/255 = 2.7% D4e4a
    3/255 = 1.2% D4e4a1
    13/255 = 5.1% D4e total

    2/255 = 0.78% D4i2

    2/255 = 0.78% D4j4
    4/255 = 1.6% D4j4a
    2/255 = 0.78% D4j5
    1/255 = 0.39% D4j8
    9/255 = 3.5% D4j total

    15/255 = 5.9% D4l2

    1/255 = 0.39% D4o2
    48/255 = 18.8% D4 total

    20/255 = 7.8% D5a2a2
    68/255 = 26.7% D total

    3/255 = 1.2% F1b
    1/255 = 0.39% F1b1
    4/255 = 1.6% F1b total

    4/255 = 1.6% G1b
    1/255 = 0.39% G2a
    1/255 = 0.39% G2a1
    6/255 = 2.4% G total

    4/255 = 1.6% H
    1/255 = 0.39% H1
    1/255 = 0.39% H8
    5/255 = 2.0% H8b
    11/255 = 4.3% H total

    4/255 = 1.6% J1c5
    7/255 = 2.7% J2
    2/255 = 0.78% J2a2b
    13/255 = 5.1% J total

    1/255 = 0.39% M7a2a
    1/255 = 0.39% M7c
    2/255 = 0.78% M7 total

    1/255 = 0.39% M13a1b

    1/255 = 0.39% U4a1

    3/255 = 1.2% Y1a

    3/255 = 1.2% Z1a
    3/255 = 1.2% Z1a1
    1/255 = 0.39% Z3
    7/255 = 2.7% Z total

    Evenk/53 Stony (Podkamennaya) Tunguska River basin + 18 Sea of Okhotsk region (Elena B. Starikovskaya, Rem I. Sukernik, Olga A. Derbeneva, et al., “Mitochondrial DNA Diversity in Indigenous Populations of the Southern Extent of Siberia, and the Origins of Native American Haplogroups,” Annals of Human Genetics (2005) 69, 67-89)
    4/71 = 0.056 A(xA2)
    1/71 = 0.014 F
    41/71 = 0.577 C2
    10/71 = 0.141 C3
    13/71 = 0.183 D(xD1a, D2, D3, D5)
    1/71 = 0.014 D3
    1/71 = 0.014 D5

    Fedorova et al. 2013’s sample of Evenks from Ust-Maysky, Oleneksky, and Zhigansky districts of Sakha Republic and Duggan et al. 2013’s sample of Evenks from the Nyukzha River basin each contains approximately 15% of mtDNA of Western Eurasian origin or affinity and approximately 9% of Y-DNA of Western Eurasian origin or affinity. Other samples of Siberian Evenks exhibit variable levels of Y-DNA of Western Eurasian origin or affinity (e.g. 1/40 = 2.5% I-M170 in a sample of Evenks from the Stony Tunguska River basin, 3/18 = 16.7% R1a in a sample of Evenks from the Taimyr Peninsula), but no corresponding Western Eurasian mtDNA.

    Comparing the Siberian Evenks with the Evens, it appears that most members of their common ancestral proto-Ewenic population probably belonged to Y-DNA haplogroup C-M48 (and especially its C-M86 subclade) and mtDNA haplogroups C and D. A few members of Y-DNA haplogroup N-Tat and mtDNA haplogroups G1b, Z1a, Y1a, F1b, M7c, B4, G2a, and J1c5 also may have been present, but it is also possible that members of each of these haplogroups may have assimilated into each of the descendant populations at some later time. I am a bit surprised at how much diversity there is among members of mtDNA haplogroups C and D, both within and among the sampled populations. Some members of these haplogroups, too, may be descendants of females assimilated after dissolution of the hypothetical proto-Ewenic unity.
