The genetic palimpsest of the Horn of Africa

Over the last few days I’ve been looking at genetic data related to the Middle East, and as part of that process I added some Ethiopian samples (in particular, Beta Israel). That has brought me to thinking about the origins of the Ethiopians. In 2012 Pagani et al. published a paper which concluded that modern Ethiopian peoples by and large emerged out of an admixture event that occurred ~3,000 years ago. The Sub-Saharan African ancestors of Ethiopians seem to be most similar to the peoples of Sudan. In historical terms this admixture date is very recent. In fact Homer mentions Ethiopians, which suggests that people in the Near East and Eastern Mediterranean may have had some awareness of various populations in this region at the very moment the mixing between different ancestral streams was occurring (recall that most admixture datings pick up the last signals, not earlier ones).
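To see why a single admixture date tends to reflect the most recent mixing, here is a toy sketch in Python. It is not how Pagani et al. or any published tool computes dates. It only assumes the standard result that admixture-induced linkage disequilibrium decays roughly as exp(-t·d), with t in generations and d the genetic distance in Morgans. The two pulse dates (200 and 100 generations), the equal weights, the 0.5 to 20 cM fitting window, and the 29 years per generation are all illustrative assumptions.

```python
import math

def two_pulse_curve(distances, t_old, t_recent, w_old=0.5):
    """LD decay curve for two admixture pulses (hypothetical weights)."""
    return [
        w_old * math.exp(-t_old * d) + (1 - w_old) * math.exp(-t_recent * d)
        for d in distances
    ]

def single_pulse_date(distances, ld_values):
    """Fit log(LD) = c - t*d by ordinary least squares; return t in generations."""
    logs = [math.log(v) for v in ld_values]
    n = len(distances)
    mean_d = sum(distances) / n
    mean_l = sum(logs) / n
    cov = sum((d - mean_d) * (l - mean_l) for d, l in zip(distances, logs))
    var = sum((d - mean_d) ** 2 for d in distances)
    return -cov / var

# Genetic distances from 0.5 cM to 20 cM, expressed in Morgans (assumed window).
distances = [0.005 + i * 0.005 for i in range(40)]

# Simulated curve: an old pulse 200 generations ago plus a recent one 100 generations ago.
curve = two_pulse_curve(distances, t_old=200, t_recent=100)

# Naively fitting a single pulse to the two-pulse curve.
estimate = single_pulse_date(distances, curve)

YEARS_PER_GENERATION = 29  # assumed generation time
print(f"single-pulse estimate: {estimate:.1f} generations, "
      f"about {estimate * YEARS_PER_GENERATION:.0f} years")
```

Under these assumptions the single-pulse estimate lands very close to the recent pulse rather than midway between the two, because the older pulse’s signal has already decayed away over most of the distance range being fitted.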

Pagani and those working with him published a follow-up paper which indicated that the Eurasian ancestry in Ethiopians is most similar to that in Egypt and the Levant, and not to that in southern Arabia (in particular, Yemen). Their method broke down “tracts” of Eurasian ancestry and compared affinities segment by segment. This is curious because genome-wide methods (e.g., Admixture, Treemix) always indicate that the Eurasian affinity of Ethiopians is with Yemenis. Seeing as how Yemen is literally across the Red Sea, this is reasonable. Pagani et al. suggest that the Yemeni affinity is due to Ethiopian gene-flow into Yemen (the two regions are historically bound together through conquests, etc.).

To make a long story short, I’m not totally convinced by this analysis. Over the past few years, we have gained more information on the genesis of the East African genetic landscape. First, an ancient genome from Mota in the Ethiopian highlands dated to 4,500 years ago did not have Eurasian admixture. This confirms Pagani et al.’s supposition of a relatively recent admixture event in the Ethiopian highlands. Second, Skoglund et al. 2017 reported on ancient DNA from a Tanzanian pastoralist dating to 3,100 years before the present. This individual’s Eurasian ancestry (~40 percent) is similar to that of pre-pottery Neolithic Levantines. In other words, this individual lacks genetic affinity with farmers from the eastern regions of the Near East. In contrast, Skoglund et al. report that modern Somalis have about 15 percent of their ancestry from these eastern (“Iranian”) farmers, as well as the Levantine ancestry.

The data seem to indicate that the emergence of the genetic patterns in the “Horn of Africa” was likely complex, and occurred through multiple waves of interaction and migration. As the map above makes clear, there are two major branches of the Afro-Asiatic language family present in Ethiopia and Somalia, Semitic and Cushitic. The presence of Arabic to the north and west is a relatively recent phenomenon. The Nubian languages were Nilo-Saharan, while the language of ancient Egypt was a separate branch of Afro-Asiatic from Cushitic and Semitic.

The ancient languages of Yemen are part of the same “South Semitic” family as the Ethiopian Semitic languages. Though this may have been cultural diffusion, it does suggest that the genetic signal of connection points to real migration. Yemenite Jews and many Yemeni non-Jews do not have very much Sub-Saharan African ancestry, suggesting to me that before the rise of Islam most of the gene-flow was from southern Arabia to Ethiopia. That dynamic reversed with Islam: a substantial minority of the ancestry of most Yemenis is now Sub-Saharan African.

This does not account for the Cushitic languages. It seems that the Savanna Pastoral Neolithic cultures, of whom the Tanzanian pastoralist in Skoglund et al. was a representative, spoke Cushitic languages. This would mean that the language family of the largest number of Ethiopians, and of Somalia, is that of the earliest Eurasian migrants into much of Sub-Saharan East Africa. The “Iranian farmer” ancestry in modern Somalis indicates long-term contacts with later migrants, possibly Semitic-speaking populations. Only in the highlands of northern Ethiopia did Semitic languages overtake Cushitic ones. Meanwhile, in much of East Africa Cushitic gave way to other languages, often from the Nilo-Saharan family.

There is the broader question of where Afro-Asiatic languages come from. The diversity of languages in Ethiopia has suggested to some that one should look in Africa. I think that Ethiopia’s diversity is like that of the Caucasus: an artifact of rugged mountainous terrain. Rather, the existence 5,000 years ago of a very distinct Egyptian language, quite different from contemporaneous Semitic Akkadian, suggests that the roots of this language family are quite old. I suspect that Semitic was intrusive to Ethiopia from Yemen, and that Cushitic became dominant in the period after the Mota individual flourished, and probably arrived from the north. Both the affinities with Levant populations and those with Yemenis make sense in this light. The much stronger genome-wide affinities with Yemenis could be due to Ethiopian admixture into Yemen at some basal level quite early in history.

Update: The always well informed “Lank” leaves this comment:

Several problems with this. The Somali model you are referring to models them as a mixture of the 3100 ybp Tanzanian pastoralist, modern Sudanese Dinka, and Iranian farmers. This is unrealistic for several reasons, which could explain the strange 15% Iran-related ancestry that would imply very significant Semitic ancestry in Somalis, since early Semites themselves were mostly not of Iran-related ancestry as far as we know.

Analyses of the 3100 YBP Tanzanian pastoralist’s raw data, e.g. using David’s G25, reveal her to be very similar to Somalis. She can actually be modeled as ~90% Somali, with admixture related to East/South African hunter-gatherers. This is plausible as hunter-gatherers were the natives of the Rift Valley, and mixed with the proto-South Cushites of the Savannah Pastoralist culture represented by this sample. We see this in modern South Cushitic Tanzanian Iraqw (and ‘Nilo-Hamites’ like the Datog, of mostly Cushitic ancestry and cultural affinity) as well, who have very significant mtDNA related to the native hunter-gatherers of the Rift Valley. Much more than the currently more numerous Bantus, who have arrived more recently, South Cushites have mixed with hunter-gatherers. So using admixed early South Cushites like the Tanzanian to model Somalis, who despite being a modern population are actually fairly similar to pre-proto-South Cushites, may be what results in the strange model. Other analyses show that Iran/CHG-related ancestry in Somalis, if present, is very low. The raw data is out there if you want to try the models yourself.

Mota is a highly interesting sample, but not relevant for dating the admixture in early Cushites. He was found in remote southwestern Ethiopia, not really a stronghold of Cushites even today. The African component in Somalis (and most of the SSA in Cushitic/Semitic Ethiopians) is more closely related to the Sudanese, not the Omotic-speaking groups, who we now know tend to have high levels of ancestry related to Mota (other than Omotic groups like the Wolayta living closer to Cushitic/Semitic groups).

The recent 3 kya admixture model for the majority of the Eurasian admixture in Cushitic/Semitic populations does not hold up to scrutiny. The predominant overall ancestry as well as Eurasian admixture levels of the Tanzanian 3100 YBP sample is actually very similar to Somalis, with some local admixture. Finding this sample resembling modern Cushites all the way in Tanzania supports that its admixture traces back to the very earliest Cushites, who are certainly older than 3000 years.