When Western Near Eastern Farmers carried North Eurasian Y chromosomes into Central Africa


Whenever you look at a map which shows the distribution of Y chromosomal haplogroup R1b you see two areas where the frequency seems very high. First, Western Europe has a very high frequency. Before 2010 it was commonly assumed that R1b was the heritage of late Pleistocene European hunter-gatherers. Around 2010 deeper analysis suggested that this might not be so, and that the deepest divisions in the phylogeny of Eurasian R1b could be found to the east. The high frequency of this haplogroup in Western Europe, then, may have been an artifact of the Holocene.

Ancient DNA has confirmed this hypothesis. The high frequency of R1b in Western Europe seems to date to the Bronze Age. Though R1b is not found exclusively in Indo-European peoples and existed at low frequencies in Pleistocene Europe, its current ubiquity in Europe seems likely related to demographic turnover between 3 and 5 thousand years ago.

If I had to bet, I would say that R1b, like R1a, originated among the North Eurasian people who mixed with West Eurasians and Amerindians. The Mal’ta boy, for example, seems to have been a basal R.

But notice a secondary mode of R1b in Africa. This is R-V88. The highest frequencies of this Y chromosomal haplogroup are found in Chadic-speaking populations. Chadic is a basal group in the Afro-Asiatic language family. A few years ago a paper using autosomal DNA from populations in Chad suggested that Eurasian backflow occurred in deep antiquity. From that paper:

We estimate that [autosomal] mixture occurred 4,750–7,200 ya, thus after the Neolithic transition in the Near East…In Chad, we found a Y chromosome lineage (R1b-V88) that we estimate emerged during the same period 5,700–7,300 ya

A new paper, The peopling of the last Green Sahara revealed by high-coverage resequencing of trans-Saharan patrilineages, really gets to the origin of R-V88, with a massive Y data-set. There are a lot of other Y lineages surveyed in this work, but a figure in the supplements makes it clear that Sardinian R-V88 is basal to star-like African topologies. The implication here is that the African lineages derive from European ones.

The autosomal paper found that Chad populations (though the one in question was not Chadic-speaking) seem to share drift with Sardinians in particular. Looking at ancient genomes, Early European Farmers seem to have been the primary donor population. Additionally, the coalescence of the African lineages seems to date to 5 to 6 thousand years before the present.

Though not definitive, the association of Afro-Asiatic populations with R-V88 is strongly suggestive to me of the possibility that some western Near Eastern Farmers spoke Afro-Asiatic languages.

The sons of Ham and Shem


Recently I had the pleasure of having lunch with David Reich and he asked me about my opinions in relation to the Afro-Asiatic languages. I thought it was a strange question, in that I get asked about that in the comments of this weblog too. Why would I have any particular insight? I gave him what I thought was the likely answer: Afro-Asiatic languages probably emerged from the western Levant. The ancient textual evidence indicates that to the north and east of Mesopotamia the languages were not Semitic. Though Akkadian, a Semitic language, was present at the dawn of civilization, Sumerian was the dominant language culturally in the land between the two rivers, and it was not Semitic. As Lazaridis et al. did not detect noticeable Sub-Saharan African ancestry in Natufians, or later Near Easterners, I have become skeptical of any Sub-Saharan African origin for Afro-Asiatic.

But after the earlier post I made a few mental connections, and so I’ll put something up which pushes forward my confidence on a few issues. They lean predominantly on Y chromosomes. I understand that this sort of phylogeography has been shown to be not too powerful in the past, but in the scaffold of the ancient DNA framework it can resolve some issues.

About a decade ago a study of Adolf Hitler’s paternal lineage (through male relatives) indicated that his haplogroup was E1b1b. This is a very common lineage in non-Europeans, as well as Jews, which led to reports that Hitler was non-European. Those reports were incorrect, but it does turn out that Hitler’s paternal lineage is not associated with the Indo-European migrations. That is, unlike me, Adolf Hitler does not descend from the All-father, but rather from one of the men who were conquered and assimilated by the steppe pastoralists.

But E1b1b is an interesting lineage. First, it is very common in much of Africa, especially the north. Second, it is common among the Natufian people according to Lazaridis et al. In contrast, the Neolithic Iranian farmers seem to have harbored haplogroup J. Today the Near East is a mix of the two, which makes sense in light of the fact that reciprocal gene flow has occurred in the last 6,000 years.

Looking at E1b1b frequencies you notice a few things. The highest frequencies with large N’s are among speakers of Cushitic and Berber languages. Haplogroup J has a different distribution, being skewed more to West Asia. In Ethiopia E1b1b is the more common of the two, but J is far more prevalent among the Semitic Amhara than among the Cushitic Oromo. Though it is subtle, autosomal DNA makes it clear that the Semitic-speaking populations in Ethiopia-Somalia have more Eurasian ancestry than the Cushitic ones. I believe this is evidence of the multiple migration pattern discerned earlier.

If you go further south in East Africa and compare E1b1b and J you see a skew in the ratio. E1b1b declines in frequency, but J basically disappears. Among the Masai, who have a clear minor West Eurasian ancestral component, albeit far less than Ethiopians, 50% carry E1b1b. Among the Sandawe, who speak a language isolate with clicks but exhibit Cushitic genetic affinities, 34% carry E1b1b. Among their Hadza hunter-gatherer neighbors, 15% do so. Among many Khoisan groups the frequency of E1b1b is 10%. Most of these groups exhibit no J haplogroup. This aligns easily with what Skoglund was reporting earlier: the first pastoralists had no “eastern farmer” ancestry, but did have “western farmer” ancestry. The Natufians were E1b1b. The wider reach of E1b1b in Africa in comparison to J is likely due to the fact that the admixed pastoralists were pushing into relatively virgin territories. Later Eurasian backflow events, which brought Semitic languages, encountered a much more densely populated Africa.

The hypothesis I present is that after the descendants of the Natufians made the transition to farming, some immediately pushed into areas of Africa suitable for farming and/or pastoralism. They quickly diversified into the various Berber and Cushitic languages. The adoption of Nilo-Saharan languages, and later Khoisan ones, was simply the process of successive and serial admixture into local populations as these paternal lineages introduced their lifestyle. In the Near East many distinct Semitic languages persisted across the Fertile Crescent, and for whatever reason the various non-Semitic languages faded and Semitic ones flourished.