Selection is going on with SLC24A5….

The ancestral allele for rs1426654 at SLC24A5

 
On this week’s episode of The Insight, I talked to Matt Hahn about why he wrote his new book, his opinions on “Neutral Theory”, and what he thought about David Reich’s op-ed. Without Spencer’s supervision, I have to admit that I think I lost control and just went “full nerd”. Next week we’re dropping Carl Zimmer’s podcast, so rest assured that the world will come back into balance, and The Insight will be more welcoming to civilians!

At a certain point, Matt and I were discussing allele frequency differences between populations and he came close to saying all such differences between human populations were of modest frequency in relation to pairwise comparisons (e.g., 40% vs. 49%). Obviously, this is not true, because there is always the huge difference in SLC24A5 at SNP rs1426654 (and at Duffy and a few other loci). At rs1426654, a substitution of an A for the ancestral G converts the codon from alanine to threonine.

You have heard of this locus because of a paper in 2005, SLC24A5, a putative cation exchanger, affects pigmentation in zebrafish and humans. This paper came out in December of 2005, a few years after Armand Leroi wrote in Mutants that geneticists still hadn’t come to grips with normal variation in pigmentation in humans. The above publication was the first step in solving this question in the years between 2005 and 2010, at least to a good first approximation.

In the sample in the paper they explain 25-40% of the variation in melanin index between Africans and Europeans with this single genetic change (for various technical reasons it’s probably not that big an effect, though it is still big, and probably the largest effect quantitative trait locus for pigmentation in the human genome).

It turns out that this mutation, the derived variant, is almost disjoint in frequency between Europeans and Africans. That is, ~100% of Africans carry the ancestral G base while ~0% of Europeans carry the G base (as opposed to the A base). Interestingly, East Asians carry the G base at ~100% frequency as well. If you genotype an anonymous individual and their genotype is AG or GG at rs1426654 then it is highly likely that that individual is not a European.

To give an example of how this works, in 2013 I stumbled onto a paper which genotyped 101 Europeans from Cape Town in South Africa. That means there are 202 alleles (two per person) at rs1426654. Of these, 5 of the alleles were ancestral (G). From this, I immediately concluded that it was highly likely that the Afrikaner people of South Africa have non-European ancestry. I came to this conclusion because 5 copies of the ancestral allele, ~2.5%, is shockingly high for a European population, and it was long surmised that the Afrikaner people had some non-European (Khoisan, Bantu, South and Southeast Asian) ancestry. The majority of the whites sampled in Cape Town could have been Afrikaners (I’ve confirmed this with genome-wide data).

To get a sense of where my intuitions come from you need to look at allele counts within populations. Using 1000 Genomes, Yale’s Alfred, and Gnomad I assembled a representative list to give you a sense of what’s going on. Of the 126,548 counted alleles in Gnomad for individuals of European (non-Finnish) descent, 486, or 0.38% of the total, are ancestral.
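To make the intuition concrete, here is a minimal sketch of the arithmetic behind the Cape Town example. It treats the Gnomad non-Finnish European frequency as the expected background rate and asks how likely it is to see 5 or more ancestral alleles among 202. The binomial model (every allele an independent draw) is my simplifying assumption, not something from the paper, and the function name is just for illustration.

```python
from math import comb


def binomial_tail(n: int, k_min: int, p: float) -> float:
    """Return P(X >= k_min) for X ~ Binomial(n, p)."""
    below = sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(k_min))
    return 1.0 - below


# Background rate: 486 ancestral alleles out of 126,548 (Gnomad, non-Finnish Europeans)
p_ref = 486 / 126548

# Cape Town sample: 101 people, so 202 alleles, 5 of them ancestral
n_alleles = 2 * 101
observed = 5

sample_freq = observed / n_alleles          # about 2.5%
expected_count = n_alleles * p_ref          # fewer than one ancestral allele expected
tail = binomial_tail(n_alleles, observed, p_ref)

print(f"sample frequency:         {sample_freq:.2%}")
print(f"expected ancestral count: {expected_count:.2f}")
print(f"P(5 or more by chance):   {tail:.4f}")
```

Under these assumptions you would expect fewer than one ancestral allele in a sample of this size, and five or more turns up well under 1% of the time. Alleles within admixed individuals and families are not truly independent, so treat the exact probability loosely; the point is the order of magnitude.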

Population | Ancestral alleles | Total alleles | Ancestral allele frequency
Samaritan | 0 | 74 | 0%
Basque | 0 | 216 | 0%
Greeks (Thrace, Athens) | 0 | 184 | 0%
Burusho | 0 | 50 | 0%
Pandit Brahmin, Kashmir | 0 | 40 | 0%
European (Non-Finnish) | 486 | 126548 | 0%
Ashkenazi Jewish | 47 | 10148 | 0%
European (Finnish) | 329 | 25790 | 1%
Iraq Kurds | 1 | 68 | 2%
Yemenite Jews | 2 | 78 | 3%
Havyaka Brahmin, Karnataka | 2 | 62 | 3%
Palestinian | 4 | 122 | 3%
Gujarati | 10 | 206 | 5%
Tunisian Berber | 6 | 110 | 5%
Andalusian | 14 | 252 | 6%
Iranian | 6 | 84 | 7%
Pashtun | 21 | 190 | 11%
Uttar Pradesh Brahmin | 4 | 34 | 12%
Pandit Brahmin, Haryana | 13 | 78 | 17%
Punjabi | 42 | 192 | 22%
South Asian | 6921 | 30774 | 22%
Kalash | 14 | 48 | 29%
Telugu | 71 | 204 | 35%
Bangladeshi | 80 | 172 | 47%
Sri Lanka Tamil | 105 | 204 | 51%
Adi-Dravida, Karnataka | 21 | 34 | 62%
Masai Kenya | 192 | 286 | 67%
Austro-Asiatic tribe, Odisha | 43 | 56 | 77%
Luhya Kenya | 155 | 188 | 82%
Hausa | 68 | 76 | 90%
Mende Sierra Leone | 155 | 170 | 91%
Gambian | 209 | 226 | 92%
Ibo | 90 | 94 | 96%
Austro-Asiatic tribe, Odisha | 92 | 96 | 96%
Esan Nigeria | 193 | 198 | 97%
Yoruba Nigeria | 213 | 216 | 99%
Biaka | 135 | 136 | 99%
East Asian | 18728 | 18856 | 99%
Ghana | 140 | 140 | 100%
Mbuti | 74 | 74 | 100%

Last fall Crawford et al. reported that rs1426654 is embedded in a haplotype that’s ~30,000 years old. Additionally, they contend that its presence within Africa is probably no earlier than the Holocene, the last ~12,000 years. Martin et al. report that KhoeSan exhibit higher frequencies of the derived allele because of Eurasian back-migration and then in situ natural selection. Of course, not all Eurasians carry the derived allele. Most East Asians have the ancestral variant of rs1426654.

This leaves us with West Eurasians, North Africans, and South Asians. I’ve put a few South Asian populations in the list to show you that there is a wide range of variation in allele frequencies. The South Asians in Gnomad, probably mostly Diaspora, have the ancestral variant at only 22%. In contrast, Austro-Asiatic speaking South Asian groups from northeast India have very high frequencies of the ancestral variant. There has clearly been in situ selection in some South Asian populations for the derived variant at rs1426654. Ancestral North Indian groups (ANI) probably brought the derived allele, and Ancient Ancestral South Indians (AASI) probably tended to carry the ancestral allele, like East Eurasians and Oceanians. Additionally, South Asian populations often have high drift. Some of the differences in the Alfred data seem to be impacted by this.

The situation in the Middle East, North Africa, and Europe is different. In the Middle East and North Africa, the ancestral variant is present at frequencies around 1-10%. Some of this can probably be attributed to admixture from Africa and in some cases South and East Asian populations. Ancient DNA from the Middle East and North Africa presents a mixed picture. The farmers who brought the Neolithic to Europe carried the derived variant at rs1426654, and some of the ancient Middle Eastern samples carry it. But not all. The recent Iberomaurusian samples which date to ~15,000 years ago don’t seem to have had the derived variant.

Though the hunter-gatherers of Western Europe only seem to have carried the ancestral variant at rs1426654, the hunter-gatherers of Scandinavia and Eastern Europe did exhibit the derived variant in some frequency, though lower than modern Europeans.

My own hunch is that the original genetic background against which the A mutation at rs1426654 emerged will be found increasing in frequency first somewhere in the Near East after the Last Glacial Maximum. But no ancient population shows the frequencies of the derived variant we see in modern Europeans. In isolated populations subject to drift it wouldn’t be surprising if the ancestral variant decreased to ~0%. But in European populations today in the vast majority of cases the ancestral variant is far lower than 1%, even though we know that within the last 10,000 years the ancestral population streams had several groups with very high frequencies of that ancestral variant. The low frequency is not due to a freakish bottleneck all across Europe. It has to be selection.

One thing I have pointed out is that this very low frequency of the ancestral variant indicates that the advantage at rs1426654 for the A allele in Europe is additive. In Northern Europe, the frequency of the derived variant that confers lactase persistence tops out at around ~90 percent. We know this region of the genome has been targeted by natural selection, but lactase persistence also happens to be genetically dominant. That is, one copy of the mutant allele confers the phenotype. Once you hit ~90 percent of the derived variant only ~1 percent of the population would be lactose intolerant homozygotes (two copies of the ancestral variant). Selection on a dominant trait stalls at that point, because most of the remaining ancestral alleles sit in heterozygotes who already show the phenotype. In the Gnomad sample of 60,000+ Europeans, they count three homozygote genotypes for the ancestral variant at rs1426654. That’s 0.005%.
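The homozygote numbers above can be checked with a quick Hardy-Weinberg calculation. This is a rough sketch that assumes random mating within the sample, and assumes the "60,000+ Europeans" are the same 126,548 non-Finnish European alleles cited earlier (so 63,274 people). Both are my assumptions, and the helper name is made up for illustration.

```python
def expected_homozygote_fraction(allele_freq: float) -> float:
    """Hardy-Weinberg expectation: the fraction of people carrying two copies."""
    return allele_freq**2


# Lactase persistence: derived allele at ~90%, so the ancestral allele is at ~10%
lactase_ancestral_freq = 0.10
lactose_intolerant_fraction = expected_homozygote_fraction(lactase_ancestral_freq)

# rs1426654: 486 ancestral alleles out of 126,548 in non-Finnish Europeans
ancestral_freq = 486 / 126548
n_people = 126548 // 2  # two alleles per person
expected_homozygotes = expected_homozygote_fraction(ancestral_freq) * n_people
observed_fraction = 3 / n_people

print(f"lactase ancestral homozygotes:     {lactose_intolerant_fraction:.1%}")
print(f"rs1426654 homozygotes expected:    {expected_homozygotes:.2f} people")
print(f"rs1426654 homozygotes observed:    3 people ({observed_fraction:.4%})")
```

With the ancestral allele at ~10% you get the ~1 percent of homozygotes mentioned above. At rs1426654 the expectation is about one ancestral homozygote in the whole sample, so observing three is in the same ballpark given how small the counts are (some excess would not be surprising if the ancestral alleles cluster in people with recent non-European ancestry).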

Something is happening at rs1426654. Selection. But why? No one really has any explanation beyond the obvious.

Our Edo period future?

The second season of Westworld has some scenes set in Edo period Japan. To spoil things for you there is apparently a scene-by-scene re-creation of a plot arc from the first season of the show set in the American West. Watching this scene, and comparing it to the earlier version, I can’t help but feel that the Edo period setting is more grand and refined. If the first season’s violent attack was brutalist, the scene above is more neoclassical.

Then again, Edo Japan and the American West are perhaps antipodes of second-millennium civilization. Where the 19th century American West was anarchic, chaotic, and creative, the Edo period in Japan was notable for its stability, order, and the perception that it was a culture in chrysalis. Old forms may have been reinvented, but those forms were treasured.

The context for the Edo period is that 16th century Japan was a dynamo. Not always in a good way. The islands were riven by internal warfare. The Japanese were known to be a piratical race by the Ming dynasty, and the 16th century ended with the warlord Hideyoshi’s disastrous invasion of Korea. Prefiguring the later Japanese ability to imitate the West in industriousness, they developed a skill in the making of guns, while Roman Catholic Christianity had great success in the southern island of Kyushu.

Eventually, Tokugawa Ieyasu set the stage for Japan’s nearly three hundred year exile from the congress of nations, turning his back on Hideyoshi’s adventurousness. Of course, it is false to assume that the Japanese were totally insulated from the outside world. Not only did they connect with the West through the Dutch, but the Japanese maintained a more intense relationship with Korea. Even in the 17th and 18th centuries, a movement of “Western Learning” persisted through the interaction with the Dutch (though arguably late Confucian influences may have been more significant).

The violent suppression of Christianity in the 17th century and the emergence of a static caste system strike modern sensibilities as brutal, barbaric and regressive. But the Edo period’s reduction in distribution and production of lethal firearms shows the upside of a conservative and controlling social and political elite. Violence continued, but it was relatively controlled and channeled.

We think of the future as endlessly protean and dynamic. But science fiction offers up an alternative possibility far more like Edo period Japan: technologically stagnant, culturally conservative. Frank Herbert’s Dune was set in the context of a universe where there had been a religious jihad against artificial intelligence. Meanwhile, Isaac Asimov’s Foundation series was originally based on imperial Rome, but later incarnations admitted that the better model was imperial China. Just as in the Dune series, the Foundation universe had to grapple with humanity’s protean and chaotic violence, which threatened to take down our civilization periodically due to enthusiasms.

The Edo period stretches from the early 17th century down to the middle of the 19th. All in all this is not a bad run. Our own republic’s 250 year anniversary will be on us in 2026.

The mutation accumulation controversy continues….

Every few years I check to see if the great mutation accumulation controversy has resolved itself. I don’t know if anyone calls it that, but that’s what I think of it as. There are two major issues that matter here: mutation rates are a critical parameter in evolutionary models, and, mutation accumulation over time matters for parental age effects when it comes to disease (speaking as an older father!).

In the latter case, I’m talking about the reasons that people freeze their eggs or sperm. In the former case, I’m talking about whether we can easily extrapolate mutation rates over evolutionary time as semi-fixed, so we can infer dates of last common ancestry and such. To give a concrete example of what I’m talking about, if mutation rates varied a lot over the evolutionary history of our hominin lineage, then we might need to rethink some of the inferred timings.

Today two preprints came out on mutation accumulation. First, Overlooked roles of DNA damage and maternal age in generating human germline mutations. Second, Reproductive longevity predicts mutation rates in primates. What a coincidence in synchronicity!

Additionally, the last author on the second preprint, Matt Hahn, is someone I’ll be doing a podcast with this week. So aside from talking about neutral theory, and his book Molecular Population Genetics, I’m going to have to bring up this mutation business.

The figure above from the first preprint shows that the proportion of mutations derived from the father doesn’t increase over time, as textbooks generally state. Why would we expect this? Sperm keeps replicating after puberty so you should be gaining more mutations. In contrast, the eggs are arrested in meiosis. There are various mechanistic reasons that the authors of the first preprint give for why the ratio does not change between paternal and maternal mutations (e.g., non-replicative mutations seem to be the primary one). The authors are using a very “pedigree” strategy, rather than an “evolutionary” one. They’re looking at sequenced trios, and noticing patterns. I think in the near future they’ll be far more sure of what’s going on because they’ll have bigger sample sizes. They admit the effects are subtle (also, some of the p-values are getting close to 0.05).

Instead of focusing on a human pedigree, the second preprint does some sequencing on owl monkeys (I had no idea there were “owl monkeys” before this paper). They find that the mutation rate is ~32% lower in owl monkeys than in humans. Why is this?

The plot to the left shows that mutations increase with age across species (though the number of data points is pretty small). The authors contend that:

The association between mutation rates and reproductive longevity implies that changes in life history traits rather than changes to the mutational machinery are responsible for the evolution of these rates. Species that have evolved greater reproductive longevity will have a higher mutation rate per generation without any underlying change to the replication, repair, or proofreading proteins.

If I read this right: owl monkeys reproduce fast and don’t have as much reproductive longevity. Ergo, lower mutation rates (less mutational build-up from paternal side).
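Here is a toy sketch of that logic as I read it. Every number below is made up for illustration and none of it comes from the preprint; the function and parameter names are hypothetical too. The only point is that two species with identical mutational "machinery" (same mutations accumulated by puberty, same yearly accumulation afterwards) end up with different per-generation rates purely because one reproduces later.

```python
def mutations_per_generation(
    at_puberty: float,
    per_year_after_puberty: float,
    puberty_age: float,
    mean_reproductive_age: float,
) -> float:
    """Toy linear model: a fixed load at puberty plus steady accumulation afterwards."""
    reproductive_years = max(0.0, mean_reproductive_age - puberty_age)
    return at_puberty + per_year_after_puberty * reproductive_years


# Shared, purely illustrative "machinery" parameters (NOT estimates from any paper)
AT_PUBERTY = 20.0
PER_YEAR = 2.0

# Hypothetical fast-reproducing species: puberty at 1, average parent aged 6
fast = mutations_per_generation(AT_PUBERTY, PER_YEAR, puberty_age=1, mean_reproductive_age=6)

# Hypothetical slow-reproducing species: puberty at 13, average parent aged 30
slow = mutations_per_generation(AT_PUBERTY, PER_YEAR, puberty_age=13, mean_reproductive_age=30)

print(f"fast life history: {fast:.0f} new mutations per generation")
print(f"slow life history: {slow:.0f} new mutations per generation")
print(f"fast species is {1 - fast / slow:.0%} lower per generation")
```

Nothing about replication, repair, or proofreading differs between the two hypothetical species; the gap comes entirely from the number of years between puberty and reproduction.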

After all these years I’m still not convinced about anything. I assume that eventually bigger data sets will come online and we’ll resolve this. Someone has to be right!

(not too many people on Twitter get what’s going on either)

Beyond “Out of Africa” within Africa

It looks as if the vast majority (95% or more depending on the population) of the ancestry of non-African humans derives from a population expansion which began around ~60,000 years ago. Before this period some researchers argue there was a non-trivial period of isolation. The “long bottleneck” (David Reich alludes to this in Who We Are and How We Got Here: Ancient DNA and the New Science of the Human Past). For the vast majority of humans then the last 60,000 years is characterized by a branching process, some reticulation (e.g., South Asians merge West and East Eurasian lineages) between these branches from a common ancestor, as well as introgression from archaic lineages like Neanderthals and Denisovans.

Though I do accept that it seems that modern humans probably migrated out of Africa before 60,000 years ago, mostly due to the results from archaeology, I think the genetic evidence is strong that these groups contributed very little genetically to contemporary populations.

The situation within Africa is very different. Being conservative it seems likely that the Khoisan ancestral lineage diverged from some other Africans ~200,000 years ago. I say conservative because there are researchers who want to push the divergence much further back. Additionally, several different research groups are now converging on a result that West Africans are a mixture between eastern Sub-Saharan Africans (think the population ancestral to Mota in Ethiopia) and a lineage basal to all other humans. That means that the Khoisan are not the most basal, so even assuming the conservative 200,000 year divergence point for Khoisan, modern humans share a common ancestor earlier than 200,000 years ago.

The upshot here is that around 75 percent of the history of modern humans is within (greater)* Africa. The distinctive “Out of Africa” bottleneck and expansion defines most humans only in the last 25 percent of the history of our species. And, within Africa, the dynamics were very different. The biggest difference is that African populations are not defined by a large number of lineages emerging and diverging around the same period, because there wasn’t a massive and singular expansion within Africa analogous to what occurred outside of Africa (at least until the recent past, with the Bantu expansion). That’s why there’s deep structure within Africa today between groups as divergent as the Bantu, Mbuti, Hadza, and Khoisan.
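The "around 75 percent" figure is back-of-the-envelope, and it depends on how old you take the common ancestor to be. A quick sketch, where the three candidate ages are just illustrative assumptions bracketing "earlier than 200,000 years ago" and the function name is made up:

```python
def share_before_expansion(common_ancestor_years_ago: float,
                           expansion_years_ago: float = 60_000) -> float:
    """Fraction of a lineage's history that predates the Out of Africa expansion."""
    return 1.0 - expansion_years_ago / common_ancestor_years_ago


# Illustrative ages for the common ancestor of modern humans (assumptions, not estimates)
for age in (200_000, 240_000, 300_000):
    share = share_before_expansion(age)
    print(f"common ancestor {age:>7,} years ago: {share:.0%} of history before the expansion")
```

A 200,000 year floor gives 70 percent, and anything older pushes the share toward 75 to 80 percent.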

The term “Basal Eurasian” kind of makes sense in the non-African context because of the singular importance of divergence between lineages in the first 10,000 years or so after the “Out of Africa” event. I’m not sure “Basal human” makes as much sense because there wasn’t a singular event within Africa that allowed for the emergence of modern humans. Rather, it was a process, and probably quite resembles something like multiregionalism.

* Some wiggle room here for the likelihood that modern humans were long present in the liminal Near East.

The end of the century of privacy

Reading The Rise and Fall of American Growth: The U.S. Standard of Living since the Civil War has made me think more about the unique nature of urban civilization of the long 20th century. The expansion of public health, in particular provision of clean water, meant that for the first time in the history of the world you had a situation where people in cities actually had a higher life expectancy than those in rural areas. Prior to this cities were demographic sinks. We have data from the 19th century which makes it clear that morbidity was higher for city dwellers. This is probably the major reason, in my opinion, the cosmopolitan worlds of antiquity had such a marginal demographic impact: the culturally vibrant city-dwellers who dominated Classical civilization politically and socially didn’t leave many descendants.

Even though cities were dominant politically and central to many earlier societies, only in the last century or so have predominantly urban societies emerged. Before that most humans lived in villages or in hunter-gatherer bands. Everyone was in everyone else’s business. Anonymity was simply not a thing for most humans in most periods of our species’ history.

This changed with the rise of cities. In the 1990s the anthropologist Robin Dunbar argued that people could maintain ~150 genuine social relationships in their mind. This is Dunbar’s number. Over the past two decades, there have been lots of arguments about Dunbar’s number. One can stipulate that the value may not be 150. Additionally, it seems likely that some people have a higher Dunbar’s number than others. But the general point that human social competencies have a ceiling value seems to be right.

And, that ceiling is smaller than the number of people who live in close proximity to each other in cities. The potential facelessness of your neighbors in a city, and its diversity and cosmopolitanism is one reason that it was in cities that written laws displayed in public places emerged as a custom. Societies not bound together by social interaction and kinship needed abstractions which could scale. Laws, kings, and religions are just some of the cultural inventions that were essential to maintain order in a city where strangers interacted daily.

But were these cities really incubators for anonymity? I would argue that the premodern city offered far less anonymity, and therefore privacy, than the modern city. Premodern cities were dense, due to limitations in transportation. They were defined by neighborhoods. Additionally, economic activities in cities were often defined by relationships between people, whether it be between a patron and an artisan, or members of a cooperative guild. In some ways, premodern cities were a collection of villages.

What defined the 20th century was the rise of massive corporations that rationalized economic consumption and production. The supermarket is cheaper than your local green-grocer, but there is also less of a personal relationship between you and the supermarket staff. They may not even know who you are. Rather than having economic relationships directly to other people, you have an economic relationship with an institution, which acts as an intermediary.

By the second half of the 20th century, individuals in cities could be totally self-sufficient and isolated from other human beings if they so chose when it came to personal relationships. The rationalization of modern life made deep human interaction a choice, and to some extent, privacy was the default state.

The rationalization of economic relations continues. But over the last 20 years, and especially the last ten or so, the default state of privacy has disappeared. If you know someone’s name you can usually find their age, where they have lived their adult life, who they lived with, and who their relatives are. Websites like Zillow can tell you their home value or when/if they bought their home and for how much. Facebook, Twitter, and other social media make it so you can find out many things about a person.

Recently a friend of mine who became newly single after ten years in a relationship decided to try out online dating (for the first time). One thing he found is that you have to assume that your matches may have Googled you beforehand (presumably this depends on whether the site gives your full name or not). If you are too shy to talk to your neighbors, just look up who lives at the various addresses around you. Once you have their names you can find out everything else.

Obviously, modern information technology doesn’t make it so that we live in a premodern village. But, it does mean that the faceless anonymity enabled by rationalized modern economics and socio-political systems is stripped away. In its place, you become a set of values for various parameters (age, income, political orientation, geographical mobility). You don’t know people in a tacit and natural manner, you know them through their data.

Whereas the political and social views of most employees of a corporation were out of view in the 20th century, today many companies are snooping around in Facebook feeds and doing simple background checks. You may not have a personal relationship with a large company, but it has a relationship with the data that it defines you by.

The 20th century was the century of privacy because the machinery of information distribution appropriate to hunter-gatherers and villages did not scale to cities. And 20th-century technology never caught up to the scale of the cities and economies of that period in terms of distributing information. As the 21st century proceeds, it seems that information technology is finally now in place.

Open Thread, 5/20/2018

Warren Treadgold’s The University We Need: Reforming American Higher Education is going to come out in early July, but I’ve written my review. Don’t know when NRO will post it. In general, I’m positive. Though Treadgold has some ideological issues with Leftism in the academy, much of the book is apolitical and shines a light on structural problems with contemporary academia.

It’s not a secret that I’m a fan of the author’s earlier work, A History of the Byzantine State and Society. So I checked some of the footnotes in The University We Need, and it turns out he’s a skeptic about the accolades given to Chris Wickham’s Framing the Early Middle Ages. Myself, I think both of these huge books are worth reading.

Bernard Lewis has died. He gets a lot of bad press from people like Edward Said of Orientalism fame, and over the last 20 years has become inextricably connected to neoconservatives who cheered on our nation’s foreign adventures. But a lot of his work is pretty interesting, especially the earlier stuff. I like The Middle East: A Brief History of the Last 2,000 Years.

On The Number Of Siblings And p-th Cousins In A Large Population Sample. I can’t say I follow all the mathematical details but jump to equation 7. But this preprint heavily informs Edge & Coop’s How lucky was the genetic investigation in the Golden State Killer case?

The Coming Wave of Murders Solved by Genealogy. The horse has left the barn and the great rush is on. Ultimately this is all going to be a normal part of forensic work soon enough.

I’m not sure that there’s a single fact yet in The Rise and Fall of American Growth: The U.S. Standard of Living since the Civil War that’s surprised me. Is this because so much of this stuff has now percolated across our culture (e.g., the increased demand for horses in the late 19th century due to complementarity with railroads)?

That being said there is a lot of specific detail that’s of interest. For example, the proportion of households with telephones during the Great Depression dropped, but those with radios kept increasing as a fraction of the American populace. The reason is that telephones were rented and required recurrent payments, which many families could no longer afford, while radios were purchased once, after which usage was free.

I don’t know much about Jordan Peterson. Curiously the people who talk to me about him the most are moderate liberals who are annoyed about the demonization of him by the further Left. I don’t have much to say, except it’s shocking how many patrons he has, and, the Left-media attacks on him probably are making him more popular.

Men are far more dangerous than women:

Problematic anti-Semitism bill passes in South Carolina:

The Act, which if not challenged in court and struck down as unconstitutional, will require South Carolina’s public institutions of higher education to “take into consideration the [State Department’s] definition of anti-Semitism for purposes of determining whether the alleged practice was motivated by anti-Semitic intent” when “investigating, or deciding whether there has been a violation of a college or university policy prohibiting discriminatory practices on the basis of religion.”

Heavy-handed suppression of anti-Semitism on campus is going to lead to more, not less, anti-Semitism. You know why.

Genetic analysis of Sephardic ancestry in the Iberian Peninsula.

Hybridization and postzygotic isolation promote reinforcement of male mating preferences in a diverse group of fishes with traditional sex roles.

A New Way for DTC? Nathan Pearson, Root Deep Insight.

Was Kevin Cooper Framed for Murder?

Farmers, tourists, and cattle threaten to wipe out some of the world’s last hunter-gatherers.

The new book, The Book of Why, is important.

The material consequences of Rome’s decline


The plot at the top is from a Peter Turchin post, History Is Now a Quantitative Science. Peter has been on this for more than ten years now. I’ve long been broadly sympathetic, but of late it’s been nice to see his formal and data-intensive approach take hold and make some waves. Using raw data from a PNAS paper on the concentration of lead in Greenland ice caps one can illustrate the theory of secular cycles, as the western edge of the oikoumene went through periods of rise and fall. I don’t say specifically Rome because as Peter observes the first rise probably had more to do with Carthage than Rome, and the last recovery was particularly mild probably because its focus was on the eastern Mediterranean, rather than the west.

As readers of this weblog know this lead data is not entirely new. I remember stumbling on it in The Fall of Rome: And the End of Civilization. It’s just more fine-grained and detailed than what came before. This sort of result definitively convinced me in a flash that the “fall of Rome” was neither fiction nor propaganda, but a true material event.

And yet the materiality is important. Like Song China, the Augustan and Antonine periods were characterized by a phase of intensive coordinated economic activity and productive output that one can’t deny. It’s right there in the material record. But from the perspective of a Christian or a Muslim, the collapse of the power of the Roman state coincided with the rise to power of the most important development in human history: the cultural dominance of singular religious visions.

The point being that when we say that “Rome fell,” it hides within it assumptions of value and importance. History is not fiction and can be understood in all its reality, but it is always critical to expose your assumptions and gain an understanding of the common ground shared between individuals whose viewpoints may differ.

 

Migration at the roof of West Asia


The figure to the left is from The genetic prehistory of the Greater Caucasus. If you are a regular reader of this weblog, or Eurogenes, you can figure out what’s going on, and keep track of the terminology. But in 2018 I think we’re getting to the end of the line in making sense of “admixture graphs” in relation to West Eurasian population structure. The models are just getting too complicated to keep everything straight, and the distinct-populations-subject-to-pulse-admixture seems to be an assumption that may not necessarily hold.

To get a sense of what I’m talking about, the above preprint focuses on populations in and around the Caucasus region. One of the major reasons that this is important is that the Caucasus was and is to some extent a continental hinge, connecting Eastern Europe and the Pontic steppe, to the Near East. The Arab Muslims pushed north of the Caucasus, and came into conflict with the Khazars, while Cimmerians and Scythians moved south from the Pontic steppe.

The elephant in the room is the relevance to the “Indo-European controversy.” Colin Renfrew long ago posited that the Indo-European languages derive from West Asian farmers who expanded into Europe as early as ~9,000 years ago. A rival theory is that Indo-Europeans spread out of the Pontic steppe ~4,000 years ago. In 2015 two major papers suggested that the steppe was a major source of Indo-European expansion. Case closed? This preprint suggests perhaps not.

But we’ll get to that later. What do the results here show? The prose is a little hard to tease apart, but the major issue seems to be that in antiquity, or at least the period they’re focusing on, much of the gene flow seems to have been from the south (Near East) to the north (through the Caucasus, and out to the north slope). To some extent, we already knew this: the Yamna people of the Pontic steppe have “southern” ancestry from the Near East that earlier East European/Pontic people do not. In this preprint, the authors show that groups such as the Maykop of the north slope of the Caucasus carry Y haplogroups such as G2, and not the R1 lineages commonly found in the steppe. David W. suggests that this confirms that Near Eastern gene flow into the steppe was female-mediated. This is plausible, but I would caution that Y chromosomes alone can be deceptive, due to the power of particular patrilineages. We’ll probably rely on the X chromosome to make a final judgment.

The plot below shows many of the relationships as a function of location and time. The green component is modal among “Iranian farmers,” the orange among “Anatolian farmers,” and the blue among “Western hunter-gatherers.”

A major aspect of this preprint is that it has to work hard to differentiate two Anatolian farmer-like signals: the first, from Anatolian farmers proper, and the second from the descendants of European farmers, who themselves are a mix of Anatolian farmers with a minority ancestry among the hunter-gatherers. The answers would probably be totally unintelligible if not for archaeology. It’s clear that the steppe people had contact with both European and Near Eastern farmers and that later East European groups that succeeded the Yamna were subject to reflux from Central Europe, and received European farmer ancestry.

Another curious nugget in their results is the early detection of both Ancestral North Eurasian (ANE) ancestry and some East Eurasian gene flow (related to Han Chinese). One of their individuals carries the East Eurasian variant of EDAR, which today is only found in Finns, though it was found in reasonable frequencies among the Motala hunter-gatherers of Scandinavia. Additionally, Fu et al. 2016 found that the ancestors of Mesolithic hunter-gatherers received some gene flow from Eastern Eurasians as well (also in the supplements of Lazaridis et al. 2016).

The authors admit that there is probably population structure among ANE and undiscovered groups of East Eurasians who were traversing the Inner Asian landscape. I think this is all suggestive of some long-distance contacts, though the intensity and magnitude increased a lot with high-density societies and the mobility of pastoralism.

Much of the genetic mixing in the Near East, and to some extent in the trans-Caucasian region, seems to date to the 4th millennium. This is technically prehistory, but it is also the Uruk period. This was a phase of Mesopotamian culture expansion between 4000 and 3100 BC which resulted in replicas of Uruk style settlements as far away as Syria and southeastern Anatolia. There is even evidence of Uruk-related migration to the North Caucasus.

The Uruk culture experienced an abrupt collapse. Uruk settlements outside of the core zone of Mesopotamia disappear.

It’s the final paragraph that warrants discussion:

The insight that the Caucasus mountains served not only as a corridor for the spread of CHG/Neolithic Iranian ancestry but also for later gene-flow from the south also has a bearing on the postulated homelands of Proto-Indo-European (PIE) languages and documented gene-flows that could have carried a consecutive spread of both across West Eurasia…Perceiving the Caucasus as an occasional bridge rather than a strict border during the Eneolithic and Bronze Age opens up the possibility of a homeland of PIE south of the Caucasus, which itself provides a parsimonious explanation for an early branching off of Anatolian languages. Geographically this would also work for Armenian and Greek, for which genetic data also supports an eastern influence from Anatolia or the southern Caucasus. A potential offshoot of the Indo-Iranian branch to the east is possible, but the latest ancient DNA results from South Asia also lend weight to an LMBA spread via the steppe belt…The spread of some or all of the proto-Indo-European branches would have been possible via the North Caucasus and Pontic region and from there, along with pastoralist expansions, to the heart of Europe. This scenario finds support from the well attested and now widely documented ‘steppe ancestry’ in European populations, the postulate of increasingly patrilinear societies in the wake of these expansions (exemplified by R1a/R1b), as attested in the latest study on the Bell Beaker phenomenon….

But instead of tackling this let’s focus on the paper that came out of the Willerslev group, The first horse herders and the impact of early Bronze Age steppe expansions into Asia. This is a final manuscript in Science. That means it was probably written before The Genomic Formation of South and Central Asia. When it comes to South Asia, the results from the two publications are consonant. There is no conflict.*

More interesting are the results in West Asia, and the linguistic supplement. First, the authors note that tablets now indicate an Indo-Aryan presence in Syria ~1750 BC. Second, Assyrian merchants record Indo-European Hittite, or Nesili (the people of Nesa), as early as ~2500 BC.

As suggested in earlier work, Hittite remains don’t suggest steppe influence. David W. says:

The apparent lack of steppe ancestry in five Hittite-era, perhaps Indo-European-speaking, Anatolians was interpreted in Damagaard et al. 2018 as a major discovery with profound implications for the origin of the Anatolian branch of Indo-European languages.

But I disagree with this assessment, simply because none of these Hittite-era individuals are from royal Hittite, or Nes, burials. Hence, there’s a very good chance that they were Hattians, who were not of Indo-European origin, even if they spoke the Indo-European Hittite language because it was imposed on them.

The main point I’d bring up here is that in other areas steppe ancestry has spread deeply and widely into the population, including non-Indo-European ones. It is certainly possible that the sample is not broad enough to pick up the genuinely Hittite elite, but I probably lean to the likelihood that the steppe signal won’t be found. It seems that the Anatolian languages were already diversified by ~2000 BC, and perhaps earlier. Linguists have long suggested that they are the outgroup to other Indo-European languages, though this could just be a function of their isolation among highly settled and socially complex populations.

Two alternative models present themselves for these results. First, the Anatolian Indo-European languages expanded through elite diffusion, part of the same general migrations that emerged out of the Yamna culture ~3000 BC. The lack of a steppe signal may be due to sampling bias, as David W. suggested, or, more likely in my opinion, simple dilution of the signal. Second, the steppe migrations were one part of a broader palette of population movements and cultural diffusions, and the Anatolian Indo-Europeans are basal to the efflorescence of the steppe-derived branches.

The evidence of the explosion of Indo-Aryans in the years after 2000 BC in West and South Asia, as well as the expansion of Iranians across vast swaths of Inner Asia during the same period, suggests to me that Indo-Iranians are most definitely part of the steppe pulse. The connection to the Sintashta charioteers presents itself, and connections to the Uralic languages indicate incubation in the trans-Volga region.

In West Asia, the Indo-Aryans crashed themselves against the most advanced civilizations of their time. Like the Bulgars, and unlike the Hittites, Indo-Aryan Mitanni was totally absorbed by their non-Indo-European Hurrian substrate. Indo-Aryan linguistic influence was preserved in their names, their gods, and in particular words relating to chariots. And yet in 2017’s Continuity and Admixture in the Last Five Millennia of Levantine History from Ancient Canaanite and Present-Day Lebanese Genome Sequences, the authors observe:

We next tested a model of the present-day Lebanese as a mixture of Sidon_BA and any other ancient Eurasian population using qpAdm. We found that the Lebanese can be best modeled as Sidon_BA 93% ± 1.6% and a Steppe Bronze Age population 7% ± 1.6% (Figure 3C; Table S6). To estimate the time when the Steppe ancestry penetrated the Levant, we used, as above, LD-based inference and set the Lebanese as admixed test population with Natufians, Levant_N, Sidon_BA, Steppe_EMBA, and Steppe_MLBA as reference populations. We found support (p = 0.00017) for a mixture between Sidon_BA and Steppe_EMBA which has occurred around 2,950 ± 790 ya (Figure S13B).

This needs to be explored further. The admixture could have come from many sources. I am curious about the frequency of R1a1a-Z93 among modern-day Syrians and Lebanese.

For me these arguments can only be resolved with a deeper understanding of linguistic evolution. The close relationship of Indo-Aryan and Iranian languages is obvious to any speaker of either of these languages (I can speak some Bengali). A divergence in the range of 4 to 5 thousand years before the present seems most likely to me. But the relationship of the other Indo-European languages is much less clear.

One of the arguments in Peter Bellwood’s First Farmers is that the Indo-European languages exhibit a “rake-like” topology with the exception of Indo-Iranian, which forms a clear clade. To him and others in his camp, this argues for deep divergences very early in time.

It is hard to deny that the steppe migrations between 4 and 5 thousand years ago had something to do with the distribution of modern Indo-European languages. But, it is harder to falsify the model that there were earlier Indo-European migrations, perhaps out of the Near East, that preceded these. Only a deeper understanding of linguistic evolution, and multidisciplinary analysis of regional substrates will generate the clarity we need.

* I’m going to skip the Botai angle in this post.

Open Thread, 05/13/2018

The University We Need: Reforming American Higher Education is a funny book. The author, Warren Treadgold, is someone I know from his magisterial A History of the Byzantine State and Society. One of the complaints about A History of the Byzantine State and Society is that it’s too dry and academic. The University We Need is not dry at all, unless you are referring to the mordant wit on display.

Since I’ll be reviewing The University We Need for NRO I won’t say much more than that, except that Treadgold is most definitely in a “gives no fucks” mood. Yes, he attacks administrators as you’d expect, but he also slams Hillsdale, the professoriate, and students. He also has opinions about cafeteria food!

Italy’s 5 Star, League Reach Deal to Govern Nation. This sort of Left-Right hybrid to me illustrates that we’re in a “crisis of capitalism,” or more precisely a crisis of Western civilization. Italian total fertility rate is ~1.40. The rest is commentary.

Localizing and classifying adaptive targets with trend filtered regression. In Who We Are David Reich talks about his ambition to create a sort of encyclopedia of human genomic history. But once that history is established, we’ll need to move on to understanding selection. This preprint looks like it will be important.

23andMe Hits Ancestry.com With Patent Suit Over DNA Kit. One thing they are doing is suing over notifying about identity-by-descent. The “non-obvious” reason for awarding the patent is apparently the notification.

Intellectual property is a joke. But it’s also big business.

This week on the podcast we talked about the grandmother hypothesis with Kristen Hawkes. I should have been more aggressive in jumping in to get her to clarify what “life history theory” was. Live and learn.

We are now at 20 podcasts. If you can, please subscribe with iTunes or Stitcher. Also rate us highly and leave positive reviews if you can! It looks like we’ll get to 100 in iTunes soon, at which point I’ll stop pestering, though we’re only at 6 reviews on Stitcher.

Carl Zimmer’s podcast has been recorded already, but won’t be dropped until close to when She Has Her Mother’s Laugh is published. We’re also recording a podcast with Patrick Wyman of Tides of History this week. This should be “evergreen”, at least on the scale of a year, so expect that in June or later. It will be nice to have a historian on.

The next few weeks we’re going to go it alone, covering a few topics Spencer and I are both interested in. It’s a good change of pace. We’ve got some ideas for what we’re going to talk about in June too. I think it is most definitely important to follow-up on the Indo-Aryan podcast, which we actually recorded in July of 2017, though we didn’t drop it until 2018.

There was an interesting fiasco on Twitter recently where some semi-prominent people asserted that Andrew Sullivan was against gay marriage. This is really bizarre because Sullivan was a major proponent of the idea before it was ever mainstream, and for a period also got resistance from the more radical anti-bourgeois faction of the gay rights movement.

Anyway, the mistaken tweets occurred because Sullivan is transforming into a hate-figure on the far left, and a lot of people on Twitter are stupid and ignorant, so they just inferred facts from their theory that Sullivan is right-wing. Some of these people retracted the falsehood grudgingly. But as usual, you can see that retweets and likes are/were more evident on the original tweet.

The point in repeating all this is that this is how “knowledge” is created now. If you have a prominent Twitter account it’s trivial to inject falsehoods into the debate. I’ve seen people doing this pretty consciously several times (this is really common in anonymous/pseudo accounts).

At the Brown Pundits weblog, I put up a post on this strange Slate piece on how the 1990s TV show Friends is contributing to sexism and homophobia in India. Though ostensibly about India, and narrated by an immigrant from India, the piece is about a preoccupation within American culture in 2018.

A publication like Slate is going to get a lot of clicks if they post something about misogyny and homophobia in Friends, but how to make it novel? Pretend it’s actually about India! To me, this is to journalism what science fantasy is to science fiction. Even the editor of this piece must have known it was hilarious bullshit, but they were also aware that in 2018 this is what Slate readers want.

Societies and cultures in relative decline and stagnation tend to undergo a period of involution. Narcissism writ-large.

I also wrote a post on Brown Pundits on why India did not become mostly Muslim. Need to think a lot more on this. Not all the comments were dumb. I wish more people would know more things.

Reading the coalescent chapter in Molecular Population Genetics, and it’s amusing to observe that the coalescent’s big advantage over forward-simulations in terms of computational horsepower needed isn’t really that big of a deal today. Even a few years back this was a huge issue. This is like in phylogenetics where everyone runs Bayesian stuff, when 15 years ago people were having a hard time imagining max-likelihood!
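To make the computational point concrete, here is a minimal sketch of why the coalescent is so cheap: it only tracks the lineages of the sample backward in time, not every individual in the population. This is my own illustration of the standard Kingman coalescent, not code from the book; the function names, sample size, and replicate count are assumptions chosen for the example. Time is measured in coalescent units (2N generations for a diploid population).

```python
import random

def simulate_tmrca(n: int, rng: random.Random) -> float:
    """Simulate the time to the most recent common ancestor of n lineages.

    While k lineages remain, the waiting time to the next coalescence is
    exponential with rate k*(k-1)/2 (one unit of rate per pair of lineages).
    """
    t = 0.0
    for k in range(n, 1, -1):
        rate = k * (k - 1) / 2
        t += rng.expovariate(rate)
    return t

def expected_tmrca(n: int) -> float:
    """Closed-form expectation: sum over k of 2/(k(k-1)) = 2(1 - 1/n)."""
    return 2.0 * (1.0 - 1.0 / n)

rng = random.Random(42)   # fixed seed so the run is repeatable
n = 10                    # hypothetical sample size
reps = 20000              # hypothetical number of replicates
mean_tmrca = sum(simulate_tmrca(n, rng) for _ in range(reps)) / reps
print(f"simulated mean TMRCA: {mean_tmrca:.3f}, expected: {expected_tmrca(n):.3f}")
```

Each replicate costs only n - 1 random draws regardless of population size, whereas a forward simulation has to update every individual every generation.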

While reading Molecular Population Genetics I keep hearing the author’s voice in my head. I think this has to do with the fact that I knew the author before I read his book. This didn’t happen when I read She Has Her Mother’s Laugh because I had read Carl Zimmer before I got to know Carl in person. At least that’s my theory (The 10,000 Year Explosion was all in Greg Cochran’s voice).

Reading too much about Rome. So Carthage Must Be Destroyed is in the stack.

A systematic assessment of ‘Axial Age’ proposals using global comparative historical evidence. The argument here is that the “Axial Age” wasn’t a singular time period, but a continuous event that spanned thousands of years. I think this is probably right, though “ages” are conceptually useful mental bookkeeping. This is similar to the idea that age cohorts are a real thing, but generations are not.

The Infectious Enthusiasm of Breaking the Bee.

Detection of shared balancing selection in the absence of trans-species polymorphism.

Self domestication and the evolution of language.

I need to set aside a day to catch up on the South Asian Genotype Project (SAGP). Also, figure out which plugin is causing the 500 errors.

The Roman, the Hun and the sun


I chose a fortuitous time to read Kyle Harper’s The Fate of Rome: Climate, Disease, and the End of an Empire. This is a great book, and a nice complement to Bryan Ward-Perkins’ The Fall of Rome: And the End of Civilization. Where Ward-Perkins attempts to convince you that Rome did indeed fall, and that that fall mattered, Harper takes it as a given that you accept this position. Rather, he tries to show you in The Fate of Rome that a series of contingent and necessary causal factors set the Roman system up for its fall. The fall of Rome is not just an idea, but a material event that was given a strong push by material factors.

The Fate of Rome was published in the fall of 2017, so it was written well before recent work which highlights both the nature and role of steppe barbarians in triggering the changes which we dramatically term the “fall of Rome” and the “barbarian migrations.” A few months ago I wrote about a paper which reported that post-Hunnic people of the Balkans were genetically different from typical Europeans in that they exhibited some East Asian admixture. Harper does assume that the Huns were barbarians whose ultimate provenance was somewhere in the region of modern Mongolia, but emphasizes that their peregrinations transformed them.

And so it did. A new paper in Nature, 137 ancient human genomes from across the Eurasian steppes, nails the overall dynamics. As illustrated in the figure above, the early steppe was dominated by peoples of a West Eurasian provenance, while the later steppe shifted toward a more East Asian population.

These early groups go by various names. But the Cimmerians, Scythians, and Sarmatians have origins on the Pontic steppe. They flourished in the first millennium before Christ. To be precise I should label them “Iranian,” but that might mislead readers a bit since some of these groups were never resident within Iran. The Scythians were a presence across a huge zone of Inner Asia and were a force in Eastern Europe, West Asia, South Asia, and in Eastern Asia. They likely emerged out of the Andronovo culture, and genetically the results from the paper confirm earlier work showing that Scythians mixed with the local substrate wherever they went. In this way, they prefigure later steppe populations. Being a nomad was a lifestyle, the genetic correlates to some extent an accident.

In The Fate of Rome the Huns have a role to play as a push for the migration of Goths into the Roman Empire, which eventually leads to their rebellion and a collapse in both the prestige and military manpower of the Roman state. The genetic evidence above and elsewhere is strongly indicative of the likelihood that the Huns were originally part of the Xiongnu confederacy. As they moved west they mixed with post-Scythian and other Iranian and Siberian elements, and presumably by the time they arrived on the European frontier of Rome they had picked up some Germanic and proto-Slavic ancestry. In 137 ancient human genomes from across the Eurasian steppes the authors also report that the East Asian gene flow was somewhat “male-mediated” in the later steppe. Similarly, earlier work on proto-Iranian peoples in the Altai region is strongly suggestive of male-mediation in West Eurasian gene flow.

The obligate and exclusive Eurasian nomad lifestyle was one dominated by men, though, as one can see from the importance of Genghis Khan’s wives and daughters, women maintained independence as well.

For whatever reason, full-blown nomadism only became a feature of the landscape north of what became China in the last few centuries before Christ. The mobile and militarized nomadic lifestyle that emerged in western Eurasia in the years around 1000 BC seems to have taken five centuries to penetrate the far eastern fringes. Until the crushing of the Dzhungars by the Manchus in the 18th century, 2,000 years later, the dynamic between nomad and settled was a defining feature of Chinese statecraft and political culture. And, it was also a major feature of nomad culture, because the wealthy Chinese state was an almost irresistible attraction to steppe elites as a source of plunder and tribute.

But human action is not the only relevant parameter in human history. The Fate of Rome is fundamentally a work of history, but it also takes ecology and evolution seriously. In fact, it foregrounds them. Kyle Harper makes the argument that the expansionary phase of the Roman Empire was not necessarily coincidental, or at least it was lucky indeed because there was a climatic optimum, similar to the one which preceded the demographic expansion of medieval Europe. In contrast, in the 6th century, the world went through some of the coldest years in the Holocene because of a combination of fluctuations in solar radiation and volcanic explosions. I assume that the likelihood of the latter is Poisson distributed, so the combination of decreased radiation and several successive volcanic events can be chalked up to randomness. But its consequences were not random at all.
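To put a rough shape on the Poisson intuition, here is a small sketch of how unlikely a cluster of eruptions is when events arrive independently at a constant rate. The rate and window below are made-up numbers for illustration only, not estimates from Harper or from the volcanology literature, and the function names are my own.

```python
import math

def poisson_pmf(k: int, lam: float) -> float:
    """P(N = k) when the expected number of events in the window is lam."""
    return math.exp(-lam) * lam**k / math.factorial(k)

def prob_at_least(k: int, lam: float) -> float:
    """P(N >= k): one minus the probability of seeing fewer than k events."""
    return 1.0 - sum(poisson_pmf(i, lam) for i in range(k))

# Hypothetical inputs: one climate-altering eruption per century on average,
# and a window of 20 years (0.2 centuries).
RATE_PER_CENTURY = 1.0
WINDOW_CENTURIES = 0.2

lam = RATE_PER_CENTURY * WINDOW_CENTURIES
p_cluster = prob_at_least(2, lam)
print(f"P(at least 2 eruptions in the window) = {p_cluster:.4f}")
```

Under these assumed numbers a cluster in any given 20-year window is rare, but across many such windows over a long history a cluster somewhere is not surprising, which is the sense in which it can be chalked up to randomness.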

Climatic changes can obviously have demographic and social consequences. Desperate armed pastoralists can overwhelm states, and change the course of history, just as peasants can rebel against taxes and subordination. And, pastoralists can also bring Yersinia pestis, the plague. Climate is an abiotic pressure which is to some extent an exogenous shock which occurs randomly, and does not react to human feedback.* Disease though is a biotic pressure, and though it may relate to abiotic forces, human interaction and agency matter quite a bit.

The Fate of Rome clearly hinges on abiotic factors as initial drivers: a good harvest is good for the state. But the biotic factors, disease, are partly under the control of the state. The Romans did not have germ theory, and were under constant stress due to the high pathogen load, especially of the cities. Harper presents the evidence of high mortality within Roman society well. Because of the endemic ubiquity of disease even elites were impacted by it. But Rome was not just affected by endemic ailments, it was subject to pandemics and plagues. Three loom large in The Fate of Rome:

  • The Antonine Plague, which ended the expansionary phase of the High Empire in the middle to late 2nd century.
  • The Plague of Cyprian of the middle 3rd century which ushered in a period of state collapse.
  • And finally, the Justinian Plague which marked the end of Late Antiquity and the beginning of the “Dark Ages.”

One of the major insights that Kyle Harper reiterates is that these plagues, these pandemics, are a feature/bug of the Roman imperial system. They are not just the consequence of simply settled agricultural society. As described in books such as Pandora’s Seed, agriculture and settled society transformed the lifestyles of human groups, and many diseases which were rare in hunter-gatherer populations probably became common among farmers. But in The Fate of Rome the author argues that pandemics were a novel outcome of complex imperial state-systems with long-distance trade-networks. Small-scale pre-state Neolithic chiefdoms did not have the scale and interconnections to foster plague.

Mass pandemics of smallpox, plague, and influenza are then aspects of civilized life, not settled agricultural life. This puts the argument of Charles C. Mann in 1493 into greater focus. It wasn’t just more extensive and intensive agriculture in the Old World which left Amerindians vulnerable, it was also that the Old World had thrown up several massive imperial systems which had incubated pandemic-producing pathogens (smallpox and influenza epidemics were a major issue in New World societies). These were unleashed at once upon New World societies.

It also suggests to us why adaptation seems to be occurring in the last few thousand years. Bouts of plague which persisted for generations may have driven immunological responses.

Kyle Harper also seems to agree with the general thesis in The Fall of Rome that this period in European civilization was in some ways proto-modern, with economic specialization resulting in a modicum of affluence in ways unimaginable in times before, or after. Trade and some level of mass production allowed British peasants to eat off tableware that was standardized, and not homemade. In contrast, after Britain’s post-Roman regression a more local economy had to step in. The most curious fact from The Fall of Rome is that pollution in British ponds did not attain Roman levels until the early modern period, with the rise of industrialization. Again and again The Fate of Rome emphasizes that the social and economic complexity achieved in the Roman Empire was not attained in Europe again on the same scale until the early modern period.

Roman wealth was fundamentally due to the returns on scale and specialization that are the hallmark of Smithian growth. Though the Romans did invent a few things, Roman prosperity was not fundamentally driven by innovation. Rather, the Roman peace was a framework for trade and exchange that took advantage of abiotic clement conditions (the Roman climatic optimum highlighted in The Fate of Rome).

But this political system had biotic costs, as well as being subject to biotic shocks. Though Romans may have been wealthier than their Iron Age predecessors in things, and also wealthier than their early medieval successors, they were also a smaller people. Using isotope data Harper suggests that this is not due to Malthusian immiseration as the imperial population pushed up against food supply. Apparently Romans did not subsist on gruel alone, but ate a fair amount of meat, especially pork. Rather it was the high pathogen load enabled by the advancement of Roman urban life and its scale. Rome was a world of intense morbidity.

Unlike physical/abiotic forces, biological/biotic pressures on human existence are adaptive. Moderns know this from the rise of antibiotic resistance; it’s the eternal race. The Romans were not aware of the consequences of their means of prosperity, and were not ready for the exogenous shocks of climate and disease which were to perturb their state system.

But The Fate of Rome is not just a story of exogenous factors, climate and disease. Rather, Harper puts into stark relief the variables which might push an empire over the edge, or eat into its seed corn of human capital. That does not negate the fact that endogenous variables matter. The Roman elite of the early centuries exhibited some level of asabiyyah, social cohesion. The Empire was fundamentally not a strong state in comparison to modern ones. It was a thin skein of cities and fortifications binding together an overwhelmingly rural population of villages. Its achievement of peace and prosperity was bound up in an ideology and identity focused project which bound together an elite (or bound together elites).

The origins of this elite were not always arbitrary. Though the Empire was famously cosmopolitan, The Fate of Rome crystallizes something that anyone who had sat back and thought about it could see: certain groups bound the imperial state together as a ruling caste. Harper observes that between the reigns of Claudius II and Phocas, from 268 to 602, 75% of the Emperors were of Illyrian/Balkan stock. That is, 75% of the Emperors were drawn from 2% of the Roman Empire’s territory. The exception was the Theodosian dynasty, which was of Iberian origin and jumped into the breach after the defeat of Valens at the Battle of Adrianople.

This is a fascinating fact in and of itself. Harper points out that these Emperors from the Danube frontier did not enrich their own region to the detriment of others. They were ideological heirs of the earlier Roman project, and their identity was as Romans first, Illyrians and Thracians of Latin stock second (or third, after Christianization). But they brought particular skills of administration and an overall martial attitude which served to lead the Empire through a period of greater stress than it had been subject to during the earlier climatic optimum.

The Fate of Rome does not plumb the depths of ideological and social change but emphasizes their interaction with biotic and abiotic factors. Harper observes that public temple building decreases sharply after the Cyprian plague. Why? Perhaps there was a loss of faith in the old religious institutions. Though popular paganism remained dominant, new elite religious ideologies such as the cult of the Invincible Sun and later Christianity came out of the shadows during this period.

These cultural and political aspects remain bit players and mostly offstage in The Fate of Rome. If you are interested in political narrative, then something like Peter Heather’s The Fall of the Roman Empire may be more to your taste. If culture, then Mary Beard’s SPQR. But ultimately social, political, economic, biological, and climatological factors are critical and interconnected. The rise of plague is hard to understand outside of the context of trade, which was enabled by political power and unity. Ecological factors may have driven Yersinia pestis out of its Central Eurasian reservoir, and those ecological factors may have been triggered by climatic variables.

The fall of Rome is a huge topic. I’m just glad that we’re beyond the revisionism of the previous generation which denied that it happened in the first place. The reason that it occurred is probably contingent in the details, though inevitable over the long-term. All things must end, even the Roman peace.

* This is not totally true, but over the time-scales we’re talking about probably mostly true.