The origins of "Lucky Arabia"

One of the benefits of reading Arabs is that the author is an expert on Yemen, which often gets short shrift in works focused on the Arab peoples. As noted in the book itself, this is not entirely unfair, insofar as until the past thousand years or so the peoples of Yemen did not even speak “Arabic.” Rather, they spoke various South Arabian languages more closely related to Semitic Ethiopian languages. In contrast, Arabic’s origins are probably along the northern fringes of the Arabian peninsula.

Historically, the various dialects (some of which are unintelligible to each other) of modern Arabic derive from the Arabic of the Koran, which seems to be quite similar to Nabataean Arabic. A revisionist model of the origins of the Arab conquests under Islam posits that they emerged at the margins of the Byzantines and Persians, in northern Arabia (where there had been Arabs for over 1,000 years), with the southward locus of the Islamic mythos in the Hejaz a later grafting upon the tradition.

But this post is not about that. Rather, one thing to note is that despite the ethnolinguistic differences between north Arabians (proto-Arabs qua Arabs), and south Arabians (the ancestors of Yemenis and Omanis), Arabs argues that there was extensive contact and migration between the two very habitable poles of the Arabian peninsula (the fringes of the Fertile Crescent in the north, and the highlands of Yemen in the south).

In the early Islamic period, there was a purported distinction between tribes of the north and the south, but this was often less about geography than genealogy. The Ghassanids of northern Arabia, who were a major Roman client people for centuries, had their origins in the highlands of Yemen. Going further back, it seems likely that the settlement of southern Arabia was due to the impact of agriculturalists whose origins were in the Fertile Crescent, at the beginning of the Holocene.

A new preprint, Insight into the genomic history of the Near East from whole-genome sequences and genotypes of Yemenis, is broadly consonant with this framework:

We report high coverage whole genome sequencing data from 46 Yemeni individuals as well as genome wide genotyping data from 169 Yemenis from diverse locations. We use this dataset to define the genetic diversity in Yemen and how it relates to people elsewhere in the Near East. Yemen is a vast region with substantial cultural and geographic diversity, but we found little genetic structure correlating with geography among the Yemenis, probably reflecting continuous movement of people between the regions. African ancestry from admixture in the past 800 years is widespread in Yemen and is the main contributor to the country’s limited genetic structure, with some individuals in Hudayda and Hadramout having up to 20% of their genetic ancestry from Africa. In contrast, individuals from Maarib appear to have been genetically isolated from the African gene flow and thus have genomes likely to reflect Yemen’s ancestry before the admixture. This ancestry was comparable to the ancestry present during the Bronze Age in the distant Northern regions of the Near East. After the Bronze Age, the South and North of the Near East therefore followed different genetic trajectories: in the North the Levantines admixed with a Eurasian population carrying steppe ancestry whose impact never reached as far south as the Yemen, where people instead admixed with Africans leading to the genetic structure observed in the Near East today.

By coincidence, Maarib is also the purported homeland of the Ghassanids mentioned above. In any case, it is not surprising that they found such an admixture cline. They note in the paper that the Yemenis exhibit little geographic structure. This could reflect recent settlement and demographic expansion, or lots of localized gene-flow. I’m putting my money on the former, due to the rugged terrain in much of the highlands.

Again, the integrative and assimilative impact of the Islamic period is evident, as all the genetics suggests that the major (if not exclusive) admixture of African ancestry due to slavery occurred within the last 1,000 years. There were pre-Islamic empires in the region, but they had a marginal effect in comparison (the “Asian” admixture in people in southeast Yemen is probably in large part Indian, as they detect R1a Y chromosomes there).
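For readers wondering how a date like “within the last 1,000 years” can come out of genetic data: one common family of methods uses the fact that admixture-induced linkage disequilibrium decays roughly exponentially with genetic distance, at a rate set by the number of generations since the mixing. Below is a minimal sketch of that idea on synthetic data. It is not the preprint’s pipeline; the function names, the 28-year generation time, and the sampling year are all assumptions chosen for illustration.

```python
import math


def fit_generations(distances_morgans, ld_values):
    """Estimate generations since admixture from an LD decay curve.

    Assumes LD(d) = A * exp(-n * d), with d in Morgans, and fits
    log(LD) against d by ordinary least squares. Returns n.
    """
    xs = list(distances_morgans)
    ys = [math.log(v) for v in ld_values]
    count = len(xs)
    mean_x = sum(xs) / count
    mean_y = sum(ys) / count
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return -slope  # the decay rate is the negative of the fitted slope


def generations_to_year_ce(generations, years_per_generation=28, sampling_year=2020):
    """Convert generations before sampling into a calendar year (CE).

    Both defaults are illustrative assumptions, not values from the paper.
    """
    return sampling_year - generations * years_per_generation


# Synthetic, noise-free decay curve (hypothetical numbers, not real data).
true_generations = 28
distances = [0.005 * k for k in range(1, 21)]  # 0.5 cM to 10 cM
ld_curve = [0.02 * math.exp(-true_generations * d) for d in distances]

estimated = fit_generations(distances, ld_curve)
year = generations_to_year_ce(round(estimated))
print(f"estimated generations: {estimated:.2f}, approximate year: {year} CE")
```

Real data are noisy, so published estimates come with confidence intervals, and the calendar date shifts noticeably depending on the generation time one assumes.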

The second issue is that by looking at Yemenis from Maarib, the authors got a better handle on later Eurasian gene-flow into the Levant. On the order of 20% of the ancestry in the Levant seems to post-date the Bronze Age (pegged by the 1800 BCE Sidon samples). This pulse has shared drift with Ancient North Eurasians. If I had to bet, I’d say the various migrations of barbaric peoples such as the Mitanni and Guti are the likely culprits, along with a possible later Roman-era overlay. I suspect that this later gene-flow is why Yemenis are the supposed “source” population of the Eurasian ancestry within Ethiopians in naive admixture analysis.

Ethiopians lack the later Eurasian pulse enriched for Ancient North Eurasian ancestry, just like Yemenis. But, looking at other statistics such as identity-by-descent tracts, the Eurasian ancestry looks more like that of the Levant. For me, the most obvious resolution is that the original Levantine pastoralists who spread Cushitic languages into eastern Africa pre-date the Bronze Age. This means that modern Levantine genetic profiles, with too much Ancient North Eurasian ancestry, are not seen as a good fit in the model, though the ancestors of modern Levantines are in some ways the parent population of both these pastoralists and Yemenis.

Finally, I suspect that the presence of South Arabian languages in some parts of Ethiopia indicates cultural and genetic influence directly from Yemen, far later than the expansion of agro-pastoralism. Samples from the highlands of northern Ethiopia are normally a bit enriched for Eurasian ancestry, and I think what we are seeing here are later waves of culturally influential Semitic-speaking peoples, visible even in the greater proportion of non-Sub-Saharan African ancestry.

8 thoughts on “The origins of “Lucky Arabia””

  1. Epic! Here’s my question: it can be assumed that migrations from the Levant were associated with the spread of South Semitic, and eventually, Arabic, deep into Arabia. What, if any, signature exists of whoever came before? Arabia was more fertile than today, rather than less, on the eve of the great Near Eastern mixing. Anatolian farmers mixed with SE European hunter gatherers to form EEFs—who did the original Levantine settlers assimilate? Are non-African-admixed groups in South Arabia really *that* similar to Bronze Age Levantines? Remarkable if so.

    [Have been thinking about these migrations w/new discovery that my Jewish Y-subclade has its closest relative, ca. 1500 BCE very roughly, in a cluster of Gulf Arabs with a tradition of origins in Maarib.]

    Also, I see you leaning into the Levantine homeland hypothesis for PAA! Can’t blame you; in some ways it’s a more parsimonious container for the genetic evidence. I just can’t shake the feeling that both sides in that debate are seriously vulnerable. Still lean (Northeast) African on the phylogenetic evidence…

  2. One quick point I’d add is that page 5 of their study shows Maarib and Lebanese are about equidistant from Neolithic Levant (using outgroup f4 measure), but Maarib closer to Natufians. Some subtle excess of Natufian is kind of in line with some previous genetic data on Yemen and can tend to provide a subtle offset between some of the different Levant BA sets that have been sampled.

  3. What evidence do you have for Cushitic and Afrasian being from Levantine herders? Isn’t the consensus settled on an eastern Saharan homeland?

    Afrasian wasn’t a Neolithic language, and E1b1b isn’t West Eurasian to say the least….IBM made that clear enough..

  4. “The Northern Near Easterners are themselves structured on the African cline with Palestinians, Jordanians, Syrians, and Lebanese Muslims having more African ancestry than Assyrians, Jews, Druze and Lebanese Christians (Figure 4) confirming our previous observation [5]. We investigated the time when admixture has occurred in North and South of the Near East by using linkage disequilibrium [17, 18], using the Lebanese Muslims and the Yemenis to represent the two areas, and setting Africans, East Asians and West Eurasians in the geno set as references (Figure 6). We found two significant admixture events in the Lebanese Muslims; the first occurred around 600BCE-500CE (Z=3.5) and the second occurred around 1580CE-1750CE (Z=3.4), confirming our previous results on the date of admixture in North of the Near East [19]. In contrast, we detect one significant admixture event in Yemen occurring 1190CE-1290CE (Z=14.6) and thus these results suggest that the shifts of the North and South of the Near East along the African cline could arise from independent events”

    A little perplexed by this part here. So Lebanese Muslims received a pulse of African admixture sometime between 600 BCE – 500 CE, well before they obviously existed as an actual population. Do Leb Christians, Druze, Samaritans show this same pulse, but not the secondary one dated to 1580-1750 AD? And what would even be the mechanism for the earlier episode of African gene flow? Low level admixture from Ptolemaic-era Egyptians?

    And for any Lebanese historians that might be lurking, what happened in the 17th century that apparently resulted in African admixture into Lebanese Muslims?

  5. @Mick, the language was confusing, but if I interpreted right, 600 BCE — 500 CE refers to the pulse of northerly admixture? Not certain though.

  6. Lol @ the anon. I don’t understand how a person can refute solid evidence, but it happens all the time…continuously…. Anyway the study reminds me of the Africa over the Sea element you lot discussed with Jeff Rose (?)…makes sense tbh. Really looking forward to both your Tutsi/East African and South Asian genotype work (im a lil biased cuz I have ancestry from both regions lol)

  7. “insofar as until the past thousand years or so the peoples of Yemen did not even speak “Arabic.””

    This is not correct. In fact there is a variant of South Semitic called Amiritic, a sub-dialect of North Sabaic, which shows Arabic influence somewhere in Najran and Northern Yemen. Revisionists often assume that the Nabataeans had two languages: a spoken language (Arabic Hismaic) and a written language (Nabataean Aramaic), which is a flimsy hypothesis on which to found Arabic history. It is basically like saying the Nabataeans spoke a secret language that no one besides them knew.


