The genetic palimpsest of the Horn of Africa


Over the last few days I’ve been looking at genetic data related to the Middle East, and as part of that process, I added some Ethiopian samples (in particular, Beta Israel). That has brought me to thinking about the origins of the Ethiopians. In 2012 Pagani et al. published a paper which concluded that modern Ethiopian peoples by and large emerged out of an admixture event that occurred ~3,000 years ago. The Sub-Saharan African ancestors of Ethiopians seem to be most similar to the peoples of Sudan. The dating of this admixture is really recent historically. In fact Homer mentions Ethiopians, which suggests that people in the Near East and Eastern Mediterranean may have had some awareness of various populations in this region at the very moment the mixing between different ancestral streams was occurring (recall most admixture datings pick up the last signals, not earlier ones).

Pagani and those working with him published a follow-up paper which indicated that the Eurasian ancestry in Ethiopians is most similar to that in Egypt and the Levant, and not to that in southern Arabia (in particular, Yemen). Their method broke down “tracts” of Eurasian ancestry, and compared affinities segment-by-segment. This is curious because genome-wide methods (e.g., Admixture, Treemix) always indicate that the Eurasian affinity of Ethiopians is with Yemenis. Seeing as how Yemen is literally across the Red Sea, this is reasonable. Pagani et al. suggest that the Yemeni affinity is due to Ethiopian gene-flow into Yemen (the two regions are historically bound together through conquests, etc.).

To make a long story short, I’m not totally convinced by this analysis. Over the past few years, we have gained more information on the genesis of the East African genetic landscape. First, an ancient genome from Mota in the Ethiopian highlands dated to 4,500 years ago did not have Eurasian admixture. This confirms Pagani et al.’s supposition of a relatively recent admixture event in the Ethiopian highlands. Skoglund et al. 2017 reported on ancient DNA from a Tanzanian pastoralist dating to 3,100 years before the present. This individual’s Eurasian ancestry (~40 percent) is similar to that of pre-pottery Neolithic Levantines. In other words, this individual lacks genetic affinity with farmers from the eastern regions of the Near East. In contrast, Skoglund et al. report that modern Somalis have ~15 percent of their ancestry from these eastern (“Iranian”) farmers, as well as the Levantine ancestry.

The data seem to point to the fact that the emergence of the genetic patterns in the “Horn of Africa” was likely complex, and occurred through multiple waves of interaction and migration. As the map above makes clear, there are two major branches of the Afro-Asiatic language family present in Ethiopia and Somalia, Semitic and Cushitic. The presence of Arabic to the north and west is a relatively recent phenomenon. The Nubian languages were Nilo-Saharan, while the language of ancient Egypt was a separate branch of Afro-Asiatic from Cushitic and Semitic.

The ancient languages of Yemen are part of the same “South Semitic” family as the Ethiopian Semitic languages. Though this may have been cultural diffusion, it does suggest that the genetic signal of connection points to a real phenomenon in terms of migration. Yemenite Jews and many Yemeni non-Jews do not have very much Sub-Saharan African ancestry, suggesting to me that before the rise of Islam most of the gene-flow was from southern Arabia to Ethiopia. That dynamic reversed with Islam, as a substantial minority of the ancestry of most Yemenis is now Sub-Saharan African.

This does not account for the Cushitic languages. It seems that the Savanna Pastoral Neolithic cultures, of whom the Tanzanian pastoralist in Skoglund et al. was a representative, spoke Cushitic languages. This would mean that the languages of the largest number of Ethiopians, and that of Somalia, are those of the earliest Eurasian migrants into much of Sub-Saharan East Africa. The “Iranian farmer” ancestry in modern Somalis indicates long-term contacts with later migrants, possibly Semitic-speaking populations. Only in the highlands of northern Ethiopia did Semitic languages overtake Cushitic ones. Meanwhile in much of East Africa Cushitic gave way to other languages, often from the Nilo-Saharan family.

There is the broader question of where Afro-Asiatic languages come from. The diversity of languages in Ethiopia has suggested to some that one should look in Africa. I think that Ethiopia’s diversity is like that of the Caucasus: an artifact of rugged mountainous terrain. Rather, the existence of a very distinct Egyptian language 5,000 years ago, quite different from contemporaneous Semitic Akkadian, suggests that the roots of this language family are quite old. I suspect that Semitic was intrusive to Ethiopia from Yemen, and that Cushitic became dominant in the period after the Mota individual flourished, and probably arrived from the north. Both the affinities with Levant populations and Yemenis make sense in this light. Much stronger genome-wide affinities with Yemenis could be because of the Ethiopian admixture into Yemen at some basal level quite early in history.

Update: The always well informed “Lank” leaves this comment:

Several problems with this. The Somali model you are referring to models them as a mixture of the 3100 ybp Tanzanian pastoralist, modern Sudanese Dinka, and Iranian farmers. This is unrealistic for several reasons, which could explain the strange 15% Iran-related ancestry that would imply very significant Semitic ancestry in Somalis, since early Semites themselves were mostly not of Iran-related ancestry as far as we know.

Analyses of the 3100 YBP Tanzanian pastoralist’s raw data, e.g. using David’s G25, reveal her to be very similar to Somalis. She can actually be modeled as ~90% Somali, with admixture related to East/South African hunter-gatherers. This is plausible as hunter-gatherers were the natives of the Rift Valley, and mixed with the proto-South Cushites of the Savannah Pastoralist culture represented by this sample. We see this in modern South Cushitic Tanzanian Iraqw (and ‘Nilo-Hamites’ like the Datog, of mostly Cushitic ancestry and cultural affinity) as well, who have very significant mtDNA related to the native hunter-gatherers of the Rift Valley. Much more than the currently more numerous Bantus, who have arrived more recently, South Cushites have mixed with hunter-gatherers. So using admixed early South Cushites like the Tanzanian to model Somalis, who despite being a modern population are actually fairly similar to pre-proto-South Cushites, may be what results in the strange model. Other analyses show that Iran/CHG-related ancestry in Somalis, if present, is very low. The raw data is out there if you want to try the models yourself.

Mota is a highly interesting sample, but not relevant for dating the admixture in early Cushites. He was found in remote southwestern Ethiopia, not really a stronghold of Cushites even today. The African component in Somalis (and most of the SSA in Cushitic/Semitic Ethiopians) is more closely related to the Sudanese, not the Omotic-speaking groups, who we now know tend to have high levels of ancestry related to Mota (other than Omotic groups like the Wolayta living closer to Cushitic/Semitic groups).

The recent 3 kya admixture model for the majority of the Eurasian admixture in Cushitic/Semitic populations does not hold up to scrutiny. The predominant overall ancestry as well as Eurasian admixture levels of the Tanzanian 3100 YBP sample is actually very similar to Somalis, with some local admixture. Finding this sample resembling modern Cushites all the way in Tanzania supports that its admixture traces back to the very earliest Cushites, who are certainly older than 3000 years.

11 thoughts on “The genetic palimpsest of the Horn of Africa”

  1. A nice summary. There is one minor quibble I have, which is that while modern Nubians speak a Nilo-Saharan language, we really don’t know what the languages of ancient Nubia – or even the Meroitic language, which was likely spoken till around 350 AD – were. It could well have been Cushitic or another Afro-Asiatic branch, considering Beja is still spoken on the Red Sea coast.

    There was a study a few years back of the genetics of Sudan (perhaps you remember what paper in particular it was). One of the interesting things to me is the Nubians clustered with Arabized North Sudanese and Ethiopians, but didn’t cluster at all with the other speakers of Nubian languages in the Nuba Mountains and Darfur. This seems to suggest that elite dominance and language shift happened in one direction or the other – likely, given the origins of the Nilo-Saharan language family, with groups migrating from the Nuba Mountains into the Nile Valley.

  2. First, an ancient genome from Mota in the Ethiopian highlands dated to 4,500 years ago did not have Eurasian admixture. This confirms Pagani et al.’s supposition of a relatively recent admixture event in the Ethiopian highlands. Skoglund et al. 2017 reported on ancient DNA from a Tanzanian pastoralist dating to 3,100 years before the present. This individual’s Eurasian ancestry (~40 percent) is similar to that of pre-pottery Neolithic Levantines. In other words, this individual lacks genetic affinity with farmers from the eastern regions of the Near East. In contrast, Skoglund et al. report that modern Somalis have ~15 percent of their ancestry from these eastern (“Iranian”) farmers, as well as the Levantine ancestry.

    Several problems with this. The Somali model you are referring to models them as a mixture of the 3100 ybp Tanzanian pastoralist, modern Sudanese Dinka, and Iranian farmers. This is unrealistic for several reasons, which could explain the strange 15% Iran-related ancestry that would imply very significant Semitic ancestry in Somalis, since early Semites themselves were mostly not of Iran-related ancestry as far as we know.

    Analyses of the 3100 YBP Tanzanian pastoralist’s raw data, e.g. using David’s G25, reveal her to be very similar to Somalis. She can actually be modeled as ~90% Somali, with admixture related to East/South African hunter-gatherers. This is plausible as hunter-gatherers were the natives of the Rift Valley, and mixed with the proto-South Cushites of the Savannah Pastoralist culture represented by this sample. We see this in modern South Cushitic Tanzanian Iraqw (and ‘Nilo-Hamites’ like the Datog, of mostly Cushitic ancestry and cultural affinity) as well, who have very significant mtDNA related to the native hunter-gatherers of the Rift Valley. Much more than the currently more numerous Bantus, who have arrived more recently, South Cushites have mixed with hunter-gatherers. So using admixed early South Cushites like the Tanzanian to model Somalis, who despite being a modern population are actually fairly similar to pre-proto-South Cushites, may be what results in the strange model. Other analyses show that Iran/CHG-related ancestry in Somalis, if present, is very low. The raw data is out there if you want to try the models yourself.

    Mota is a highly interesting sample, but not relevant for dating the admixture in early Cushites. He was found in remote southwestern Ethiopia, not really a stronghold of Cushites even today. The African component in Somalis (and most of the SSA in Cushitic/Semitic Ethiopians) is more closely related to the Sudanese, not the Omotic-speaking groups, who we now know tend to have high levels of ancestry related to Mota (other than Omotic groups like the Wolayta living closer to Cushitic/Semitic groups).

    The recent 3 kya admixture model for the majority of the Eurasian admixture in Cushitic/Semitic populations does not hold up to scrutiny. The predominant overall ancestry as well as Eurasian admixture levels of the Tanzanian 3100 YBP sample is actually very similar to Somalis, with some local admixture. Finding this sample resembling modern Cushites all the way in Tanzania supports that its admixture traces back to the very earliest Cushites, who are certainly older than 3000 years.

  3. Thanks for keeping an open mind about this. I look forward to your update!

    Hirbo’s thesis reveals a lot about East African population history, sampling mtDNA from Kenyan hunter-gatherers (Yaaku/Boni/Sanye), who seem to be related to the HG group admixing into South Cushites. Very interestingly, mtDNA L0f was found in one of Skoglund’s Malawian hunter-gatherers.

    Other than South Cushites and so-called Nilo-Hamites, Tutsi who have tested privately also show significant mtDNA related to these hunter-gatherers, along with other lineages typical of Cushites and East Africans in general.

    Relevant quote from the above link:

    “Frequency patterns for the L4 mtDNA lineage, specifically of the derived form, L4b2a2 found in East African hunter-gatherer populations (Appendix 9 – A9.2.3) and two other mtDNA lineages, L0d3 and L0f, support the argument of gene-flow between East African hunter-gatherers and pastoralist populations, specifically the southern Cushitic speaking pastoralist populations. The frequency profile for L4b2a2 clearly shows that most of the non-hunter-gatherer populations that carry the haplotype are those living in the vicinity of East African hunter-gatherer populations, and are likely to have had historical interactions (Appendix 9 – A9.2.3) [108, 113, 121, 122, 348] with them. However, L0f is found at highest frequency among the southern Cushitic speaking populations, followed by East African hunter-gatherer populations who have the haplotype at moderate frequencies (Appendix 9). This observation is consistent with the assertion that the East African hunter gatherers had most extensive interaction with southern Cushitic speaking populations, who represent the Savannah pastoral traditions [110, 133].”

  4. from DW Phillipson 1977, The Later Prehistory of Eastern and Southern Africa, p.91-92:

    “It is probable that these Semitic languages owe their presence in this region [northern Ethiopia] initially to the gradual infiltration of the northern highlands by agricultural people from southern Arabia, a process which seems to have occupied the greater part of the first half of last millennium BC….By the fifth century BC a literate urban culture had been established by these South Arabian immigrants in the fertile highlands of Tigre. The best-known site is at Yeha near Adua….Stone carvings from Yeha are in a typical South Arabian style…, while inscriptions employ a South Arabian Himyaritic syllabary….Of particular importance is the use of the crescent and disc, symbol of the Sabaean moon-god ‘Ilumquh….Associated with these objects and monuments of South Arabian origin are objects of bronze and, occasionally, iron.”

    Semitic languages probably originated in Syria or NW Mesopotamia around 6000 BP. The exact phylogeny of these languages is debated, although an East (Mesopotamian) vs. West split is apparent. A common origin of Canaanite and South Arabian (both West Semitic) around 1800-1500 BC seems to be implied by the assumption that scripts used for both were derived from Proto-Sinaitic.

  5. I thought “Ethiopia” was more a conscious modern callback to Homer, rather than Homer happening to mention the name of a modern country? Αἰθίοψ means “burned face”, reflecting an awareness that it’s hot far to the south, and the people are black there.

    It’s like how Brits call themselves British because that’s what Romans called the island, so when they needed a name for the new country that was both England and Scotland in 1600, they called back to that and named it “Britain”. Shakespeare never mentions Britain except in the context of Roman times or myth (Lear).

  6. This just in: modern SW Ethiopian hunter-gatherers who look like they may have even less Eurasian admixture than Mota, who was also a hunter-gatherer from SW Ethiopia.

  7. Razib would you give me your insight on the question i asked bwt the bantu/west african distinction in particular about south nigerians on this post seeing as its loosely based on African genetics or nah?

  8. thanks man, confirmed what i thought…a majority bantoid pop being shoehorned as the modal west african population. wild. lol
