The Pleistocene roots of East Asian genetic variation

Posted on November 30, 2019 by Razib Khan

In the comments below there was a mention of the fact that East Eurasians are less genetically varied than people to their west. The main reason for this is probably the serial bottleneck of modern people as they left Africa/Near East ~50,000 years ago. Similarly, people of the New World and Oceania are also less genetically diverse, because they’re at the “end of the line.”

But that doesn’t mean that East Asian populations were particularly small after the founding compared to West Eurasians. It’s simply that genetic diversity over the long term is sensitive to bottlenecks, rather than rolling census counts (long-term effective population size is a harmonic mean of the size in each generation, so the smallest values dominate).
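To make the harmonic mean point concrete, here is a minimal sketch. The population sizes are made-up numbers chosen only for illustration, not estimates for any real population.

```python
from statistics import harmonic_mean, mean

# Hypothetical per-generation sizes: 95 generations at 10,000 individuals
# plus a brief 5-generation bottleneck of 100 individuals.
sizes = [10_000] * 95 + [100] * 5

# What a "rolling census" average would suggest.
arithmetic = mean(sizes)

# Long-term effective size behaves like the harmonic mean,
# which is pulled strongly toward the bottleneck values.
long_term_ne = harmonic_mean(sizes)

print(f"arithmetic mean: {arithmetic:.0f}")
print(f"harmonic mean:   {long_term_ne:.0f}")
```

In this toy series the bottleneck covers only 5% of the generations, yet the harmonic mean comes out at a small fraction of the arithmetic mean. That is the sense in which a population can be large for most of its history and still carry low diversity.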

With that taken care of, I think we need to be open to the possibility that the peculiar patterns of population expansion in East Asia could be a function of its Pleistocene paleoecology. From what I know of biogeography, temperate China is more speciose in trees than temperate Europe. The reason offered is that China had more “ecological depth” due to its geographic configuration. During the Last Glacial Maximum, areas of Europe suitable for forests retreated and disappeared as the Mediterranean blocked further progress. In contrast, China extends in a continuous zone far to the south and east.

European hunter-gatherers have noticeably low genetic diversity and repeated population turnovers. The Mesolithic peoples of Europe were themselves the product of a late Pleistocene expansion, and quite genetically homogeneous (these groups “break” the iron correlation between distance from Africa and homozygosity). It seems plausible that European Pleistocene populations drew from a much shallower demographic reservoir than East Asian ones. This, to me, may explain why population turnover and lineage expansions were a more common feature of the West Eurasian landscape than in East Asia, where the local hunter-gatherer cultures came through the Ice Age with more robustness and deeper roots.

Well, at least part of the reason. I doubt it explains everything…

Note: You may wonder why I posted a photo of David Epstein’s Range: Why Generalists Triumph in a Specialized World. I’m to a great extent a generalist. I know enough pop-gen to have an intuition about effective populations and genetic diversity, as well as enough human genomics background and experience to know the empirical distributions. When I was eight I spent a fair amount of time reading about climate science, and so know the Köppen system to this day and the differences between temperate Europe and East Asia (maritime vs continental as well as huge latitudinal difference and why). Finally, at some point in college, I was interested in biogeography and read a paper that explained why species richness varies across the two temperate (non-boreal) zones of Eurasia.

And that’s how this post came about.

Patterns in the Y chromosome and Holocene expansion

Posted on November 29, 2019 by Razib Khan

I’ve been looking at uniparental lineages recently. That is, direct maternal and paternal lineages. The mtDNA and Y chromosomal phylogenies. Between the late 1990s and 2000s phylogenetic reconstructions of these lineages dominated historical population genetic inference.

Today with ancient DNA and genomewide SNP analysis, and now even whole-genome analysis, there isn’t nearly as much focus on Y and mtDNA. But curiously in some ways, the scaffold of autosomal ancestry through ancient DNA allows for better leveraging of the insights of Y and mtDNA patterns.

We now know that the Austro-Asiatic Munda people of northeast India are about 30% East Asian in their ancestry. But their dominant Y chromosomal haplogroup, which connects them to East Asia, is present at 60% frequency. In contrast, their mtDNA is totally indistinguishable from that of their Indian neighbors.

One sees a similar dynamic among the Finnic people. These populations have high frequencies of a Siberian Y chromosomal haplogroup (~60%), which seems to have arrived within the last 3,000 years. But the mitochondrial lineages are very similar to those of their neighbors (though there are a few stray East Eurasian mtDNA lineages). In fact, Finns and Sami are enriched for U5, which seems to be the dominant lineage of Mesolithic hunter-gatherers. Additionally, “non-European” (East Asian-like) ancestry in Finns is in the 5-10% range, and ~20% for the Sami. This seems considerably lower than the frequency of N1c, the haplogroup in question, but a reasonable hypothesis is that the men who brought N1c were already quite mixed between eastern and western ancestry.
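The arithmetic behind these Y-versus-autosome comparisons can be sketched with a deliberately crude toy model. It assumes a single pulse in which incoming men father some fraction of the next generation while all mothers are local, and it ignores drift, selection, and later gene flow. The function names are hypothetical, and nothing here is taken from the papers discussed.

```python
def autosomal_from_male_pulse(y_freq, migrant_east_frac=1.0):
    """Expected East Eurasian autosomal fraction after a male-only pulse.

    Toy assumption: migrant men father a fraction `y_freq` of the next
    generation and every mother is local, so migrants supply y_freq / 2
    of the autosomal ancestry. `migrant_east_frac` is how East Eurasian
    the migrants themselves were.
    """
    return (y_freq / 2) * migrant_east_frac


def implied_migrant_east_frac(y_freq, observed_autosomal):
    """East Eurasian fraction the migrants would need to match the data."""
    return observed_autosomal / (y_freq / 2)


# Munda-like case: 60% Y frequency, migrants assumed fully East Asian.
munda_autosomal = autosomal_from_male_pulse(0.60)

# Finn-like case: 60% Y frequency but only 5-10% East Asian-like ancestry.
finn_low = implied_migrant_east_frac(0.60, 0.05)
finn_high = implied_migrant_east_frac(0.60, 0.10)

print(f"Munda-like autosomal expectation: {munda_autosomal:.2f}")
print(f"Finn-like implied migrant fraction: {finn_low:.2f} to {finn_high:.2f}")
```

Under these assumptions a 60% Y frequency corresponds to about 30% autosomal ancestry if the incoming men were unmixed, which is in line with the Munda figures. For the Finnish figures, the same model only works if the incoming men carried roughly one sixth to one third East Eurasian ancestry themselves, which is the "already quite mixed" hypothesis expressed as a number.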

In any case, I want to close out this post with a quote which won’t surprise close readers, from Punctuated bursts in human male demography inferred from 1,244 worldwide Y-chromosome sequences:

Potential correspondences between genetics and archaeology in South and East Asia have received less investigation. In South Asia, we detect eight lineage expansions dating to ~4.0–7.3 kya and involving haplogroups H1-M52, L-M11, and R1a-Z93 (Supplementary Figs. 14b, 14d, and 14e). The most striking are expansions within R1a-Z93, ~4.0–4.5 kya. This time predates by a few centuries the collapse of the Indus Valley Civilization, associated by some with the historical migration of Indo-European speakers from the western steppes into the Indian sub-continent (27). There is a notable parallel with events in Europe, and future aDNA evidence may prove to be as informative as it has been in Europe. Finally, East Asia stands out from the rest of the Old World for its paucity of sudden expansions, perhaps reflecting a larger starting population or the coexistence of multiple prehistoric cultures wherein one lineage could rarely dominate. We observed just one notable expansion within each of the O2b-M176 and O3-M122 clades.

In First Farmers: The Origins of Agricultural Societies Peter Bellwood argues that agricultural expansions explain most of the genetic variation in the world. In Europe and South Asia, this needs to be updated and modified. Late Neolithic and Bronze Age migrations clearly had a major impact. But what about East Asia?

Japan, for one, transformed in the recent past with the assimilation of post-Jomon people into the Yayoi 2,000 years ago. And the expansion of the Han entailed recent assimilations. But, perhaps there is no “star-shape” phylogeny because these expansions were fundamentally bottom-up demographic expansions as posited by Bellwood.

The Aryan Integration Theory (AIT)

Posted on September 12, 2019 by Razib Khan

Over the past week, there have been lots of reactions to the two papers which came out last week, The formation of human populations in South and Central Asia and An Ancient Harappan Genome Lacks Ancestry from Steppe Pastoralists or Iranian Farmers. The Insight is still on hiatus, but I managed to interview Vagheesh Narasimhan for my other podcast, so check that out. Like many people, Narasimhan is not keen on the “Aryan invasion theory.” Myself, I don’t have a problem with the term, but it turns out that many Indians dislike the connotations of “AIT” quite a bit.

Since I’m not very invested in semantics, I’m going to just move on and propose another term that identifies a real dynamic. I present then the new AIT, the “Aryan Integration Theory.”

For various reasons, Narasimhan et al. propose that steppe pastoralists who flourished between 2000 and 1500 BCE are the most likely candidates for the “steppe” contribution to modern Indian genomes. In the Swat valley samples, which date initially to ~1000 BCE, the authors noticed that the proportion of Iranian-farmer-related ancestry decreased over time, giving way to steppe and Andamanese-related ancestry.

This pattern over time is related to something you see in the geographical and communal distribution of ancestry in the “three-way admixture”:

The day of the Dasa

Posted on September 5, 2019 by Razib Khan

Unless you have been sleeping today you may have noticed two important papers on South Asian historical population genetics have been published. The simple and short paper is An Ancient Harappan Genome Lacks Ancestry from Steppe Pastoralists or Iranian Farmers. The longer paper, which is basically a book if you read the supplements, is The Formation of Human Populations in South and Central Asia (an update on a preprint which came out over a year ago).

So the “Rakhigarhi genome” is finally out. She turns out to be an interesting individual: she has some, but not much, Andamanese-related hunter-gatherer ancestry, a lot of Iranian-farmer-related ancestry, and no steppe ancestry. She is very similar to the dozen or so “Indus Periphery” samples found outside of South Asia, in the region’s near-abroad (Khorasan and into Turan). Her mtDNA is U2b2. My mtDNA is U2b. So my mother’s maternal lineage dates back to the IVC period. Not a surprise, but still cool.

The major finding of great interest is that the “Iranian-farmer” ancestry of the Indus Valley Civilization population was possibly not “Iranian” at all. That is, it seems unlikely that the West Asian-related ancestry in the IVC people was due to a migration out of the Zagros agricultural hearth. The reasoning here is simple. There was ancient population structure in the Near East at the beginning of the Holocene. There were, roughly, three major groups which expanded: Anatolian farmers, related Levantine farmers, and more distantly related Iranian (Zagros) farmers. These groups intermixed copiously during the Holocene. All the farmers of the Holocene in western Iran and even the hunter-gatherers had some ancestry from the Anatolian lineage.

Anatolian heritage is not present in the IVC people. Because Anatolian ancestry is found in Iranian hunter-gatherers at the beginning of the Holocene, the West Asian-related ancestors of the IVC people must have diverged earlier. One option is that there was a set of hunter-gatherer populations in the territory of modern Iran, Afghanistan, and Pakistan (and possibly northwest India) who were related to each other but differentiated due to distance and separation. Modern Iran is bifurcated by some rather harsh deserts between the west and the east. There is no reason the same could not have applied to the Pleistocene, in particular during the Last Glacial Maximum.

Related to this, Iosif Lazaridis has a preprint out which argues that the difference between the “Anatolian” and “Iran” clusters lay in differential admixture with “Ancient North Eurasians” (ANE) into the latter. The non-Rakhigarhi paper above highlights the role of Turan in mediating interaction and gene flow between northern Eurasia and the Iran-Afghanistan-Central Asia region. The difference between the quasi-Iranian ancestors of the IVC people and those of the Zagros, the Iranians proper, may simply be that the ANE-related admixture was stronger further east. Or not. In some ways, the paper opens up a lot of possibilities as to the landscape of late Pleistocene western Asia. It is a reasonable interpretation in the paper that agriculture was spread to northwest South Asia not through mass migration (e.g., Bantu expansion, farming in Neolithic Europe, etc.), but through cultural diffusion. But working out the distribution and origin of the quasi-Iranian population needs a lot more ancient DNA.

The origin and distribution of Andamanese-related hunter-gatherers (AHG), earlier described as “Ancient Ancestral South Indians” (AASI), also need more elucidation. It has long been known that the various East Eurasian groups seem to have separated very soon after 40,000 years ago. The AHG clade is only distantly related to the Andamanese themselves, who have more of an affinity with the Hoabinhian people of Southeast Asia. Though the diversity of mtDNA macro-haplogroup M is suggestive of long-term habitation of South Asia by some of the AHG, we cannot reject the possibility that they were intrusive from the east during the Pleistocene or Holocene, at least in part.

The awkward construct proposed by Indian researchers to David Reich to term the ancestral populations “ANI” and “ASI” (Ancestral North Indian and Ancestral South Indian) was to some extent a political move. It left open the possibility of deep geographical indigeneity of most of the ancestry of modern South Asians. I was moderately skeptical because I suspected the ANI was intrusive from West Asia (the Iranian-farmer and steppe migration models). These results do not support that, and it may, in fact, be the case that ANI-like quasi-Iranians occupied northwest South Asia for a long time, and AHG populations hugged the southern and eastern fringes, during the height of the Pleistocene.

What a lot of these questions need are people with detailed paleoclimate knowledge. The human geography would be much easier to infer if we had a sense of the primary carrying capacity. Hunter-gatherers tend to be very thin in desert areas, so those would serve as natural gene flow barriers. The divergence between western and eastern Eurasian populations is rather stark, so one might suppose that the Thar desert region was particularly difficult during the Pleistocene to traverse.

At some point, I have to come back to the “Aryan question.” These papers strongly point to the likelihood that the Aryans were intrusive to the Indian subcontinent.

From the Cell paper:

Since language spreads in pre-state societies are often accompanied by large-scale movements of people (Bellwood, 2013), these results argue against the model (Heggarty, 2019) of a trans-Iranian-plateau route for Indo-European language spread into South Asia. However, a natural route for Indo-European languages to have spread into South Asia is from Eastern Europe via Central Asia in the first half of the 2nd millennium BCE, a chain of transmission now documented in detail with ancient DNA. The fact that the Steppe pastoralist ancestry in South Asia matches that in Bronze Age Eastern Europe (but not Western Europe [de Barros Damgaard et al., 2018; Narasimhan et al., 2019]) provides additional evidence for this theory, as it elegantly explains the shared distinctive features of Balto-Slavic and Indo-Iranian languages (Ringe et al., 2002).

From the Science paper:

Our results not only provide negative evidence against an Iranian plateau origin for Indo-European languages in South Asia, but also positive evidence for the theory that these languages spread from the Steppe. While ancient DNA has documented westward movements of Steppe pastoralist ancestry providing a likely conduit for the spread of many Indo-European languages to Europe (7, 8), the chain-of-transmission into South Asia has been unclear because of a lack of relevant ancient DNA. Our observation of the spread of Central_Steppe_MLBA ancestry into South Asia in the first half of the 2nd millennium BCE provides this evidence, and is particularly striking as it provides a plausible genetic explanation for the linguistic similarities between the Balto-Slavic and Indo-Iranian sub-families of Indo-European, which despite their vast geographic separation, share the Satem innovation and Ruki sound laws (63). If the spread of people from the Steppe in this period was a conduit for the spread of South Asian Indo-European languages, then it is striking that there are so few material culture similarities between the central Steppe and South Asia in the Middle to Late Bronze Age (i.e. after the middle of the 2nd millennium BCE). Indeed, the material culture differences are so substantial that some archaeologists recognize no evidence of a connection. However, lack of material culture connections does not provide evidence against spread of genes, as has been demonstrated in the case of the Beaker Complex, which originated largely in western Europe, but in Central Europe was associated with skeletons that harbored ~50% ancestry related to Yamnaya Steppe pastoralists (18).

If you look deeper in the paper you see that the authors zeroed in on the period between 2000 and 1000 BCE for a reason. The people of the Eurasian steppe are diverse, and always in flux, and the earlier and later agro-pastoralists were genetically distinct. The Yamnaya culture lacked a “European” element that arrived on the forest-steppe through demographic reflux. The later Indo-European agro-pastoralists, such as the Scythians and Kushans, tended to have East Asian ancestry which is lacking in northwest South Asia. The particular profile found in groups such as North Indian Brahmins fits best with the steppe people who were ascendant in the period between 2000 and 1500 BCE.

There is, of course, the assertion by some Indians that Indo-European languages are indigenous to South Asia. If that is the case, then they would have had to expand elsewhere. I won’t address archaeological or linguistic issues. Rather, the problem is that the spread of “steppe” ancestry in the period between 3000 and 1000 BCE across the whole zone of Indo-European languages is so clear that it is the most likely candidate, and the steppe ancestry has origins in the…forest-steppe. Indian counter-arguments are not impossible but tend to be highly complicated.

To me, the more interesting aspect of the story is not the origin of the Indo-Aryans, but how they came to be what they were as depicted in the Vedas, and later the epics such as the Mahabharata and Ramayana. Let me quote from the Science paper:

Taken together, the poor fits at both extremes of the Indian Cline imply that the Indian Cline does not represent a simple mix of two homogeneous ancestral populations, ANI and ASI. Instead, in the Middle to Late Bronze Age both of these groups were themselves part of metapopulations—relatively well represented by the Steppe Cline and the Indus Periphery Cline—that were not completely homogenized at the time they met and mixed. Most groups in India today can be represented as mixtures of average points along the Steppe Cline (we show below that the ANI fit along the Steppe Cline) and the Indus Periphery Cline (the ASI) but there are deviations from this simple model that contribute to the observed patterns.

Between 1500 and 500 BCE South Asia saw the development of Indian genetics and culture in a way that we understand it today, from the north to the south. One of the striking aspects of the Swat valley samples in the Science paper is that AHG ancestry increases over time (along with steppe ancestry). The Swat people seem to have started out with a much higher fraction of IVC sorts, very high on Iranian-related ancestry. But after 1000 BCE they integrated more and more with people to their south and east. Meanwhile, in South India, groups like Nadars from the Tamil country are still about 5% steppe in their heritage, and non-trivial fractions of R1a1a are found among these groups.

There is now a good amount of evidence that the Austro-Asiatic Munda expanded into a landscape where unmixed AHG/AASI populations existed. Though the Science paper puts this in the 3rd millennium, I think the period between 2000 and 1000 BCE is more likely, since Austro-Asiatic rice farmers are found in northern Vietnam in 1900 BCE. The existence of unmixed AHG/AASI suggests to me that the expansion and dominance of Dravidian-speaking agricultural societies in much of South India in the form we recognize them today does not predate the arrival of Indo-Aryans by much if at all. Rather than thinking of Indian culture as the application of Indo-Aryan elements atop a Dravidian base, it is more accurate I think to consider them a synthesis that developed simultaneously. Though it is quite likely that the IVC language was related to that of the Dravidians, the impact of the Indo-Aryans shapes most Dravidian-speaking societies both culturally and genetically.

In fact, the Indo-Aryans themselves had changed genetically and culturally by the time they occupied territory within South Asia. They had mixed with people in eastern Iran and Afghanistan, reducing their steppe fraction, and then mixed again with local South Asian populations. The Indo-Iranian soma/homa cult may have been picked up from the culture of Bactria-Margiana.

A major takeaway from these sorts of papers is the uniqueness of humans and the integrative and panmictic power of culture. From a population genetic perspective parameters such as distance and topography matter a lot. Major ecological barriers such as deserts also have an impact. But the spread of Indo-European languages and genes is more than just a matter of diffusion. A powerful cultural organism expanded, assimilated, and in some cases integrated and synthesized, huge swaths of Eurasia. The IVC society was successful for several thousand years. But it is clear that there were plenty of AHG peoples in the Indian subcontinent while they flourished in the northwest. It was the arrival of Indo-Aryans which revolutionized things so that no “pure” AHG community exists in South Asia today.

Ironically, the sons of Indra spread the seed of the Dasa far and wide, from the Himalaya to Kanyakumari.

Population genetic structure of the Italian peninsula

Posted on September 4, 2019 by Razib Khan

A new open-access paper on Italian genetics, Population structure of modern-day Italians reveals patterns of ancient and archaic ancestries in Southern Europe:

European populations display low genetic differentiation as the result of long-term blending of their ancient founding ancestries. However, it is unclear how the combination of ancient ancestries related to early foragers, Neolithic farmers, and Bronze Age nomadic pastoralists can explain the distribution of genetic variation across Europe. Populations in natural crossroads like the Italian peninsula are expected to recapitulate the continental diversity, but have been systematically understudied. Here, we characterize the ancestry profiles of Italian populations using a genome-wide dataset representative of modern and ancient samples from across Italy, Europe, and the rest of the world. Italian genomes capture several ancient signatures, including a non–steppe contribution derived ultimately from the Caucasus. Differences in ancestry composition, as the result of migration and admixture, have generated in Italy the largest degree of population structure detected so far in the continent, as well as shaping the amount of Neanderthal DNA in modern-day populations.

My interpretation of what’s in the paper

– The largest impact on genetic variation across all Italians is from “Early European Farmers” who derive from the expansion out of Anatolia. The descendants of Pleistocene hunter-gatherers had a marginal impact and were mostly absorbed.

– A “steppe” component shows a north-south gradient and seems to have arrived in the 3rd millennium. It’s almost absent in Sardinia. It is a minority component, but I believe it brought Indo-European languages to the Italian peninsula.

– Looking at the Tuscan results (more affinity with northern than southern Italy), it seems to me that the genetic impact of West Asians leading to the emergence of Etruscans is now no longer quite so viable. We’ll see. But the demographic impact of the steppe people seems to have been lesser in the Southern European peninsulas than in Northern Europe. Basque survived into the modern period in Spain. Paleo-Sardinian, which persisted into Classical times, was probably not Indo-European. And the ancient languages of Crete seem to have been non-Indo-European. It seems entirely plausible that Etruscan then was a pre-Indo-European survival, though the relationship to Lemnian is still there.

– Southern Italy and Sicily are interesting because of the strong West Asian (“Caucasus”) imprint. A 2017 paper on ancient Mycenaean and Minoan genomes showed evidence of gene-flow from the same area, likely during the Copper or Bronze Age. This could be part of the same migration. Or, it could be part of the legacy of Magna Graecia, the colonization of the region by Greeks during antiquity. Or, it could be due to Roman and later admixtures and migrations.

– The evidence of North African ancestry in Sicily is probably due to the settlements during the two centuries of Islamic rule, when Sicily was in many ways part of greater North Africa.

Cryptic Ashkenazi ancestry across Eastern Europe

Posted on August 30, 2019 by Razib Khan

One of the great things about the spread of ‘direct to consumer’ genomics is that it’s increasing sample size in countries where for various reasons there isn’t much coverage. It was brought to my attention that My Heritage DNA results have been analyzed by the company, and yielded the surprising result that Hungary has been most impacted by Ashkenazi Jewish admixture in the Diaspora. This is surprising since it is well known that the United States of America is home to the second-largest Jewish community in the world, with more than 90 percent of that being Ashkenazi.

The main issue here is the distinction between genetic and cultural definitions. Ashkenazi Jews are a coherent population genetic classification, emerging out of a series of admixtures during the medieval period as a strong endogamous group. This means that this community has a distinctive genetic profile, just as Finns or Cambodians have distinctive genetic profiles. But Ashkenazi Jews are also a cultural and religious entity. Because of social and cultural constraints imposed by Christian societies, Jews could leave their religious identity, but Christians could not become Jewish.

In the United States, the massive wave of Jewish migration occurred around 1900. This is not so many generations in the past, so not too many people have very distant Jewish ancestry. Additionally, anti-Semitism has been a more marginal factor on the American landscape, so Jewish ancestry has been less hidden (though not always).

The situation in Eastern Europe is very different. A massive wave of demographic expansion occurred among Ashkenazi Jews after 1500. In the 18th century, Jewish fertility was far greater than gentile fertility in Poland. This resulted in an increase in the Jewish proportion over time, but likely also assimilation of some Jews into Christian society. The “Jewish Enlightenment”, spanning the 100 years between 1780 and 1880, was also a period when massive defections occurred from the more integrated elements of the Central European Jewry. Moses Mendelssohn’s last male descendant to practice Judaism died in 1871, after one century of assimilation and conversion.

Overall, this result confirms what history would suggest to us. I believe if My Heritage DNA looks specifically at IBD tracts they will see an early peak of admixture centered around 1830, during the height of the Jewish Enlightenment, in Central Europe. The admixture will be later further east in Europe, due to the later period of assimilation of Jews in those societies. In contrast, in the USA exogamy rates for Jews remained at 10% as late as 1960. Only in the past few generations have they risen to around 50% or more.
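As a rough illustration of why tract lengths carry dating information, the sketch below converts an admixture year into generations and then into an expected tract length. It relies on a common single-pulse approximation, in which the mean length of an introgressed segment is about 100/g centimorgans after g generations. The 28-year generation time, the 2019 sampling year, and the function names are assumptions made for this example, and real IBD-based inference is considerably more involved.

```python
def generations_since(event_year, sample_year=2019, years_per_generation=28):
    """Generations elapsed between an admixture event and sampling.

    The generation time and sampling year are illustrative assumptions.
    """
    return (sample_year - event_year) / years_per_generation


def expected_tract_cm(generations):
    """Approximate mean tract length in centimorgans.

    Single-pulse approximation: recombination breaks segments at a rate
    of about one crossover per Morgan per generation, so the mean length
    is roughly 1/g Morgans, i.e. 100/g cM. The minor-ancestry fraction
    and chromosome ends are ignored here.
    """
    return 100.0 / generations


g_1830 = generations_since(1830)
tract_1830 = expected_tract_cm(g_1830)

g_1900 = generations_since(1900)
tract_1900 = expected_tract_cm(g_1900)

print(f"1830 pulse: {g_1830:.2f} generations, ~{tract_1830:.1f} cM mean tract")
print(f"1900 pulse: {g_1900:.2f} generations, ~{tract_1900:.1f} cM mean tract")
```

The qualitative point is that later admixture leaves longer segments. If assimilation happened later further east, the Ashkenazi-derived segments in those populations should on average be longer than those in Central Europe.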

The origins of “Lucky Arabia”

Posted on August 29, 2019 by Razib Khan

One of the benefits of reading Arabs is that the author is an expert on Yemen, which often gets short shrift in works focused on the Arab peoples. As noted in the book itself this is not entirely unfair, insofar as until the past thousand years or so the peoples of Yemen did not even speak “Arabic.” Rather, they spoke various South Arabian languages more closely related to Semitic Ethiopian languages. In contrast, Arabic’s origins are probably along the northern fringes of the Arabian peninsula.

Historically, the various dialects (some of which are unintelligible to each other) of modern Arabic derive from the Arabic of the Koran, which seems to be quite similar to Nabataean Arabic. A revisionist model of the origins of the Arab conquests under Islam posits that they emerged at the margins of the Byzantines and Persians, in northern Arabia (where there had been Arabs for over 1,000 years), with the southward locus of the Islamic mythos in the Hejaz a later grafting upon the tradition.

But this post is not about that. Rather, one thing to note is that despite the ethnolinguistic differences between north Arabians (proto-Arabs qua Arabs), and south Arabians (the ancestors of Yemenis and Omanis), Arabs argues that there was extensive contact and migration between the two very habitable poles of the Arabian peninsula (the fringes of the Fertile Crescent in the north, and the highlands of Yemen in the south).

In the early Islamic period, there was a purported distinction between tribes of the north and the south, but these are often less about geography than genealogy. The Ghassanids of northern Arabia, who were a major Roman client people for centuries, had their origins in the highlands of Yemen. Further back in time, it seems likely that the settlement of southern Arabia was due to the impact of agriculturalists whose origins were in the Fertile Crescent, at the beginning of the Holocene.

A new preprint, Insight into the genomic history of the Near East from whole-genome sequences and genotypes of Yemenis, is broadly consonant with this framework:

We report high coverage whole genome sequencing data from 46 Yemeni individuals as well as genome wide genotyping data from 169 Yemenis from diverse locations. We use this dataset to define the genetic diversity in Yemen and how it relates to people elsewhere in the Near East. Yemen is a vast region with substantial cultural and geographic diversity, but we found little genetic structure correlating with geography among the Yemenis, probably reflecting continuous movement of people between the regions. African ancestry from admixture in the past 800 years is widespread in Yemen and is the main contributor to the country’s limited genetic structure, with some individuals in Hudayda and Hadramout having up to 20% of their genetic ancestry from Africa. In contrast, individuals from Maarib appear to have been genetically isolated from the African gene flow and thus have genomes likely to reflect Yemen’s ancestry before the admixture. This ancestry was comparable to the ancestry present during the Bronze Age in the distant Northern regions of the Near East. After the Bronze Age, the South and North of the Near East therefore followed different genetic trajectories: in the North the Levantines admixed with a Eurasian population carrying steppe ancestry whose impact never reached as far south as the Yemen, where people instead admixed with Africans leading to the genetic structure observed in the Near East today.

By coincidence, Maarib is also the purported homeland of the Ghassanids mentioned above. In any case, it is not surprising that they found such an admixture cline. They note in the paper that the Yemenis exhibit little geographic structure. This could reflect recent settlement and demographic expansion, or, lots of localized gene-flow. I’m putting my money on the former due to the rugged terrain in much of the highlands.

Again, the integrative and assimilative impact of the Islamic period is evident, as all the genetics suggests that the major (if not exclusive) admixture of African ancestry due to slavery occurred within the last 1,000 years. There were pre-Islamic empires in the region, but they had a marginal effect in comparison (the “Asian” admixture in people in southeast Yemen is probably in large part Indian, as they detect R1a Y chromosomes there).

The second issue is that looking at Yemenis from Maarib the authors got a better handle on later Eurasian gene-flow into the Levant. On the order of 20% of the ancestry in the Levant seems to post-date the Bronze Age (pegged by the 1800 BCE Sidon samples). This pulse has shared drift with Ancient North Eurasians. If I had to bet I think the various migrations of barbaric peoples such as the Mitanni and Guti are the likely culprits, along with possible later Roman era overlay. I suspect that this later gene-flow is why Yemenis are the supposed “source” population of the Eurasian ancestry within Ethiopians in naive admixture analysis.

Ethiopians lack the later Eurasian pulse with enriched Ancient North Eurasian, just like Yemenis. But, looking at other statistics such as identity by descent tracts the Eurasian ancestry looks more like that of the Levant. For me, the most obvious resolution is that the original Levantine pastoralists who spread Cushitic languages into eastern Africa pre-date the Bronze Age. This means that modern Levantine genetic profiles with too much Ancient North Eurasian are seen as not a good fit in the model, though modern Levantines are in some ways the parent population of both these pastoralists and Yemenis.

Finally, I suspect that the presence of South Arabian languages in some parts of Ethiopia indicates cultural and genetic influence directly from Yemen far later than the expansion of agro-pastoralism. Samples from the highlands of northern Ethiopia are normally a bit enriched for Eurasian ancestry, and I think this greater proportion of non-Sub-Saharan African ancestry reflects later waves of culturally influential Semitic-speaking peoples.

The Neolithic roots of modern East Asian human geography

Posted on August 10, 2019 (updated August 11, 2019) by Razib Khan

Because of the long and thorough tradition of Chinese historiography, we have a good and deep chronological record of East Asia going back two to three thousand years. Chinese records also help illuminate and clarify aspects of Japanese, Korean, and Southeast Asian history. For example, what we know about the Indianized kingdom of Funan in eastern mainland Southeast Asia comes from textual sources that are Chinese.

But, history can take us only so far. We know this for Western Eurasia, where ancient DNA has revolutionized our understanding of Holocene transformations. Unfortunately, we don’t have that much ancient DNA from East Asia. So we still have to rely mostly on modern data. A new preprint proposes to use a lot of modern (and some ancient) data to answer a very specific question: Inland-coastal bifurcation of southern East Asians revealed by Hmong-Mien genomic history. The basic results are totally unsurprising:

Consistent with the two distinct routes of agricultural expansion from southern China, this Hmong-Mien founding ancestry is phylogenetically closer to the founding ancestry of Neolithic Mainland Southeast Asians and present-day isolated Austroasiatic-speaking populations than Austronesians. The spatial and temporal distribution of the southern East Asian lineage is also compatible with the scenario of out-of-southern-China farming dispersal. Thus, our finding reveals an inland-coastal genetic discrepancy related to the farming pioneers in southern China and supports an inland southern China origin of an ancestral meta-population contributing to both Hmong-Mien and Austroasiatic speakers.

More interesting to me is the admixture graph to the right. It uses a bunch of ancient and modern populations to model ancient and modern populations. You can see some general patterns and suggestions of what might come out of ancient DNA.

For example, the green component is defined by the Hoabinhian samples. These are the people who are distantly related to the Andaman Islanders, and occupied Southeast Asia before the arrival of rice farmers. They are distantly related to “Ancient Ancestral South Indians” (AASI) as well. It is unsurprising that this component is well represented in a Munda tribe (Kharia) from northeast India, or in Austro-Asiatic people of Southeast Asia. But notice that it is well represented in the Jomon of Japan, and modern Tibetans.

If you read the preprint, the authors clearly don’t think that this is Hoabinhian ancestry as such. Rather, the model is looking for something very basal (distant) from other East Eurasians, and Hoabinhians fit that (and are somewhat closer to this basal group). This is probably the same phenomenon as the “Australo-Melanesian” ancestry in the Amazon. Curiously, Y haplogroup D is found in Tibet, in Japan, and among the Andaman Islanders.

The largest group in East Asia, the Han Chinese, can be modeled as an admixture of the ancient Northeast Asian Devil’s Gate Cave people and modern Ami Taiwanese aboriginals (Austronesians). This is basically a north-south cline. Obviously, one doesn’t need to posit that the modern Han are truly a mix of these two groups, but rather that Han identity emerged out of a synthesis of various Neolithic groups with differential affinities to these two groups.
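
To make “modeled as an admixture” of two sources a bit more concrete, here is a minimal sketch in Python. It is not the preprint’s method (admixture graphs are fit across many populations at once). It simply assumes that at each locus the target population’s allele frequency is a weighted average of two source frequencies, and recovers the weight by least squares. All frequencies are simulated, and the function names, the 0.6 mixing weight, and the sample sizes are hypothetical choices for illustration.

```python
import random

def estimate_mixture_weight(target, source_a, source_b):
    """Least-squares alpha in: target ~ alpha*source_a + (1 - alpha)*source_b."""
    # Rearranged as (target - b) = alpha * (a - b): a regression through the origin.
    num = sum((t - b) * (a - b) for t, a, b in zip(target, source_a, source_b))
    den = sum((a - b) ** 2 for a, b in zip(source_a, source_b))
    if den == 0:
        raise ValueError("sources are identical; the weight is not identifiable")
    return num / den

def simulate(true_alpha, n_loci=5000, n_chrom=200, seed=1):
    """Simulate allele frequencies for two sources and a mixed target (all made up)."""
    rng = random.Random(seed)
    north = [rng.uniform(0.05, 0.95) for _ in range(n_loci)]
    south = [rng.uniform(0.05, 0.95) for _ in range(n_loci)]
    target = []
    for a, b in zip(north, south):
        p = true_alpha * a + (1 - true_alpha) * b
        # Sample n_chrom chromosomes from the target to add sampling noise.
        count = sum(rng.random() < p for _ in range(n_chrom))
        target.append(count / n_chrom)
    return target, north, south

# Hypothetical target: 60% "northern" source, 40% "southern" source.
target, north, south = simulate(true_alpha=0.6)
alpha_hat = estimate_mixture_weight(target, north, south)
print(f"estimated northern-source weight: {alpha_hat:.2f}")
```

Note that a good fit here says nothing about history: any population sitting on a cline between the two sources would fit equally well, which is the caveat made above about the Han.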

Two ancient samples give a good picture of how these groups are related to West Eurasians. The Afanasievo sample was almost exactly like the Yamnaya. The Namazga sample comes from prehistoric Khorasan, on the border of modern Iran and Turkmenistan. These two samples do have some affinities with each other. Both have ancestry that is related to or derived from “Ancient North Eurasians” (ANE) and “Caucasus Hunter-Gatherers” (CHG), with the Yamnaya having more ANE and Namazga more CHG. But the Yamnaya also had affinities with “Western Hunter-Gatherers” (WHG) that Namazga lacked. You see that the Kharia has affinities to Namazga, but not Afanasievo. This is not surprising: the Munda tribes of Northeast India seem almost untouched by Indo-Aryan influence (they are entirely lacking in R1a1a, which is found in South Indian tribals). Rather, they mixed with Indian populations which were impacted by migrations of farmers from West Asia.

The proportions of Afanasievo and Namazga ancestry are illustrative of particular historical dynamics. Mongols and Xiongnu (ancient) had some connection to the Afanasievo. This is almost certainly Indo-European (probably East Iranian) contact. In contrast, the Hui, Chinese Muslims who are mostly no different from Han aside from religion, have contributions from both Afanasievo and Namazga. This is a strong indication that Hui do have more recent Central Asian (Muslim) ancestry, while Mongolians do not. The increase in Namazga ancestry across Central Asia is probably a function of the rise of Persian and Islamic polities, and the movement north of agriculturalists. The shift to Turkic dominated polities integrated Turan with the rest of the Islamic steppe, which happens to exclude the Mongolians.

It is also interesting that the Thai have more Namazga than Khmer. This is strongly suggestive of a large contribution of Indian ancestry to the Dvaravati culture (the enrichment for Devil’s Gate Cave in the Khmer is probably due to the reality that a few of the HGDP samples seem to be mixed with Chinese), though it could be more recent admixture from India. Note however that the Mon people of Burma seem to have more Indian ancestry, and were often associated with Dvaravati.

Finally, the authors point out that the red southern Northeast Asian component is now common in peoples like the Koreans and Japanese. This is a clear indication of the spread of farming from southern peoples, as well as the likely later demographic impact of the expansion of the Chinese state and its spillover impact on Korea.

Hungarian nationalism and the ghosts of Turan

Posted on July 19, 2019 by Razib Khan

In Harper’s, The Call of the Drums: Hungary’s far right discovers its inner barbarian:

The Great Kurultáj, an event held annually outside the town of Bugac, Hungary, is billed as both the “Tribal Assembly of the Hun-Turkic Nations” and “Europe’s Largest Equestrian Event.” When I arrived last August, I was fittingly greeted by a variety of riders on horseback: some dressed as Huns, others as Parthian cavalrymen, Scythian archers, Magyar warriors, csikós cowboys, and betyár bandits. In total there were representatives from twenty-seven “tribes,” all members of the “Hun-Turkic” fraternity. The festival’s entrance was marked by a sixty-foot-tall portrait of Attila himself, wielding an immense broadsword and standing in front of what was either a bonfire or a sky illuminated by the baleful glow of war. He sported a goatee in the style of Steven Seagal and, shorn of his war braids and helmet, might have been someone you could find in a Budapest cellar bar. A slight smirk suggested that great mirth and great violence together mingled in his soul.

The whole article is fascinating, though the author is clearly disapproving of Hungary’s current nationalist resurgence. The description of Turanism and the cult of Attila in Hungary may seem strange, but it isn’t surprising when you consider that Mongolia has a cult of Genghis Khan.

Hungary is unique in Europe because its people speak a language whose closest relatives are the languages of two groups in western Siberia, the Mansi and Khanty. Most linguists place these Ugric languages as a distant sister clade to the Finno-Permic group. But it seems incontrovertible that the modern Magyar people are culturally descended from a group of people who were in close association with various Turkic nomads (e.g., the Khazars) in the lower Volga region. Their migration westward seems to have recapitulated the movement of the ancient Huns, who were likely Turkic. Additionally, not only did the Magyar tribes absorb Turkic tribes as they moved out of Khazar territory, but in later centuries they gave refuge to Turkic groups fleeing the Mongols.

The Turanism described in the article is a real thing, but much of it seems to consist of the co-option of the lifestyle of the Altaic nomadic peoples, the Turks and Mongols, to add glamor to Hungarian history. In fact, the inclusion of groups such as Scythians and Sarmatians (Indo-European Iranians) indicates that what is common is not descent or ethnolinguistic affinity, but a lifestyle. It’s the lifestyle and ethos that Christopher Beckwith writes about in Empires of the Silk Road.

The mobile steppe nomads were not born, they were made. For thousands of years, peoples that occupied the fringe of the forest-zone seem to have taken up the horse, and full pastoralism, and so become part of a lifestyle which was optimally suited to militarization and therefore extraction of resources out of wealthy sedentary societies. The transition was natural because humans would rather be predators than prey.

This reality, that what Turanism celebrates is the idealization of a brutal martial past, mitigates the fact that genetically modern Magyars descend overwhelmingly from the conquered, not the conquerors. The conquest elites did have an eastern affinity. But the best recent data indicate that modern Hungarians are only a few percent enriched for this ancestry. Rather, the ancestors of modern Hungarians probably are Slavic peasants as well as the post-Roman peoples of Pannonia.

One explanation for the discrepancy between elite burials from the Late Antique and Medieval period and modern Hungarians is that military conflicts between the first Mongol invasion and the Ottoman conquest took a disproportionate toll on the nobility descended from the Magyars and Turks. But I suspect a more prosaic one is that Hungary is an open plain, and gene flow with neighboring regions would have diluted the initial signature of admixture over the centuries.
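
The dilution argument can be put in rough numbers. Under a deliberately simple assumption, that in each generation a fixed fraction m of the gene pool is replaced by migrants from neighboring regions who carry none of the eastern ancestry, the initial proportion shrinks geometrically as (1 - m)^t. The starting proportion, the migration rates, and the 28-year generation time below are illustrative assumptions, not estimates from any study. The only historical input is that the Magyar conquest dates to roughly 1,100 years ago.

```python
def remaining_ancestry(initial, migration_rate, generations):
    """Ancestry proportion left after repeated dilution by unadmixed migrants."""
    if not 0 <= migration_rate <= 1:
        raise ValueError("migration_rate must be between 0 and 1")
    return initial * (1 - migration_rate) ** generations

YEARS_SINCE_CONQUEST = 1100   # late 9th century to the present, roughly
GENERATION_YEARS = 28         # assumed generation time
INITIAL_PROPORTION = 0.30     # hypothetical starting value, for illustration only

generations = YEARS_SINCE_CONQUEST // GENERATION_YEARS

for m in (0.01, 0.03, 0.05):  # hypothetical per-generation migration rates
    frac = remaining_ancestry(INITIAL_PROPORTION, m, generations)
    print(f"m = {m:.2f}: {frac:.1%} remaining after {generations} generations")
```

Even modest per-generation gene flow compounds over a few dozen generations, which is why an open plain does not need any dramatic event to erase most of a conquest-era signal.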

Modern Hungarians are surely aware of the genetic realities on an intuitive level: they don’t look particularly different from their neighbors, and they know this. But, culturally they are distinctive, and that is due to the history and lives of the Turks and Magyars, and Hungarian nationalists nod to this reality in forming their own mythos.

The Magyarization of Pannonia requires a deeper investigation by both historians and cultural evolutionists. A pastoralist pagan people imposed their language on recently Christianized Slavs. How? Why? This is a sharp contrast to the Bulgars, who were Turks absorbed by their Slavic subjects.

“….for whither thou goest, I will go; and where thou lodgest, I will lodge: thy people shall be my people, and thy God my God.”

The late emergence of Semitic Ethiopia

Posted on July 8, 2019 by Razib Khan

Some of you have asked me about a new paper on East Africa, Ancient DNA reveals a multistep spread of the first herders into sub-Saharan Africa. The reality is some of you know this topic better than I, so I don’t have much original to add. But, I was curious today when a preprint dropped, West Asian sources of the Eurasian component in Ethiopians: a reassessment. This is out of Luca Pagani’s group, and it was he who published 2012’s Ethiopian Genetic Diversity Reveals Linguistic Stratification and Complex Influences on the Ethiopian Gene Pool. I spent a fair amount of time at his ASHG poster talking to him about his admixture estimate. He found West Eurasian ancestry in (highland) Ethiopians which “may represent gene flow into Africa, which we estimate to have occurred ∼3 thousand years ago (kya).”

The West Eurasian ancestry in Ethiopians surprised no one. Rather, the relatively recent estimate of admixture was surprising. The Bible and Homer refer to “Ethiopians.” These mentions date to a period between 1000 and 500 BCE (even if the events predate that). My question was simple, “who were the Ethiopians mentioned by the Greeks and the Hebrews if the genetic character of the people who we today call Ethiopians only came into being ~3,000 years ago?”

Obviously, the term “Ethiopian” can be used in a more generic sense than for the people of Ethiopia proper. The Greeks confused Indians and Ethiopians because both were dark-skinned peoples, and presumably black African people of non-Ethiopian origin were sometimes identified as Ethiopian in Antiquity.

Let me quote the abstract of the preprint:

Previous genome-scale studies of populations living today in Ethiopia have found evidence of recent gene flow from an Eurasian source, dating to the last 3,000 years. Haplotype and genotype data based analyses of modern and ancient data (aDNA) have considered Sardinia-like proxy, broadly Levantine or Neolithic Levantine populations as a range of possible sources for this gene flow. Given the ancient nature of this gene flow and the extent of population movements and replacements that affected West Asia in the last 3000 years, aDNA evidence would seem as the best proxy for determining the putative population source. We demonstrate, however, that the deeply divergent, autochthonous African component which accounts for ~50% of most contemporary Ethiopian genomes, affects the overall allele frequency spectrum to an extent that makes it hard to control for it and, at once, to discern between subtly different, yet important, Eurasian sources (such as Anatolian or Levant Neolithic ones). Here we re-assess pattern of allele sharing between the Eurasian component of Ethiopians (here called NAF for Non African) and ancient and modern proxies area after having extracted NAF from Ethiopians through ancestry deconvolution, and unveil a genomic signature compatible with population movements that affected the Mediterranean area and the Levant after the fall of the Minoan civilization.