The cultural conditions of star-shaped phylogenies

In general, I think intergroup selection of paternal lineages is the answer to why star-shaped phylogenies are so evident in the phylogenetic record ~4,000 years ago. More precisely, most of the major clades of R1a, R1b, and I1 underwent massive expansion after a sharp reduction in effective population size around this period. The R lineages diversified during the Pleistocene, probably in Central Eurasia (R is a brother clade to Q). The I lineage derives from Western European hunter-gatherers, probably from the late Pleistocene expansion which eventually gave rise to the Mesolithic groups that encountered the early farmers.

But what happened here specifically? Let me quote a section of Peter Turchin’s excellent Ultrasociety: How 10,000 Years of War Made Humans the Greatest Cooperators on Earth:

Lanchester’s Square Law yields an enormous return to social scale. If the opposing forces use a mix of ranged and shock weapons, numerical superiority will still be amplified, although not as much as with purely projectile weapons. So there is an intense selection pressure for cultural groups living in flat terrain to scale up, and a very high price to pay by those that fail to do s….

Though human interaction with horses as domesticates is probably older, light chariots emerged on the Pontic steppe ~4,000 years ago. Within a few centuries, this technology was ubiquitous in the Near East. The Indo-Aryan Mitanni arrive with chariots in modern Syria/Northern Iraq by ~3,750 years ago.

In the Near East chariots and bows were closely associated. The evidence from the Eurasian steppe during the Bronze Age seems less definitive (simply, bows may not preserve very well), though by the Iron Age the mounted archer became a ubiquitous feature of the military landscape.

The combination of chariots, likely bows, and the Sintashta/Srubna/Andronovo culture’s known focus on metallurgy makes it hard for me to deny the likelihood that the expansion of R1a1a-Z93 has something to do with intergroup conflict. The reality is that Lanchester’s Square Law means that even a small initial advantage for a given paternal lineage will probably snowball. One victory will lead to an increase in territory and resources, which will produce later advantage. A sort of Y chromosomal Matthew Effect.
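To see why a small edge snowballs, here is a minimal sketch of the textbook aimed-fire form of Lanchester's Square Law, where each side's losses are proportional to the number of enemy fighters. The function name, the effectiveness parameters, and the force sizes are hypothetical illustrations, not anything taken from Turchin's book.

```python
import math

def lanchester_square_survivors(a0, b0, alpha=1.0, beta=1.0):
    """Outcome of a fight to the finish under Lanchester's Square Law.

    Model: dA/dt = -beta * B and dB/dt = -alpha * A, so the quantity
    alpha * A**2 - beta * B**2 stays constant during the battle.
    alpha and beta are per-fighter effectiveness (assumed equal by default).
    """
    invariant = alpha * a0**2 - beta * b0**2
    if invariant > 0:
        return "A", math.sqrt(invariant / alpha)
    if invariant < 0:
        return "B", math.sqrt(-invariant / beta)
    return "draw", 0.0

# Hypothetical forces: A is only 25% larger than B.
winner, survivors = lanchester_square_survivors(1000, 800)
print(winner, survivors)  # A 600.0
```

Under these assumptions a 25% numerical edge lets A annihilate all 800 of B while losing only 400, which is the sense in which returns to scale compound from one victory to the next.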

But this doesn’t explain what occurred in Europe, where R1b and I1 also underwent a massive expansion (and R1a as well). Europe’s relatively forested territory beyond the Hungarian plain always blunted the power and reach of mounted archers later in history. We do know that chariots arrived in the Mediterranean around the same time as in the Near East. But the rise to dominance of the Corded Ware and Bell Beaker peoples predates light chariots. Perhaps it is something as simple as the fact that metaethnic institutions and identities that could dampen intergroup conflict hadn’t emerged, but it’s still curious to me that one could have a ~90% population replacement in Britain in a few centuries.

Perhaps we will find out that it has to do with a disease as our understanding of ancient epidemics gets better.

 

Soft & hard selection vs. soft & hard sweeps


When I was talking to Matt Hahn I made a pretty stupid semantic flub, confusing “soft selection” with “soft sweeps.” Matt pointed out that soft/hard selection were terms more appropriate to quantitative genetics than to population genomics. His viewpoint is defensible, though going back into the literature on soft/hard selection, e.g., Soft and hard selection revisited, the main thinkers pushing the idea were population geneticists who were also considering ecological questions.*

The strange thing is that I had already known the definitions of hard and soft selection on some level because I had read about them as I was getting confused with hard and soft sweeps! But this was more than ten years ago now, and since then I haven’t given the matter enough thought obviously, as I defaulted back to confusing the two classes of terms, just as I used to.

Matt pointed out that truncation selection is a form of hard selection. All individuals below (or above) a certain phenotype value have a fitness of zero, as they don’t reproduce. In a single locus context, hard selection would involve deleterious lethal alleles, whose impact on the genotype is the same irrespective of ecological context. So hard selection operates by reducing the fitness of individuals/genotypes to zero.

For soft selection, context matters much more, and you would focus more on relative fitness differences across individuals/genotypes. Some definitions of soft vs. hard selection emphasize that in the former case fitness is defined relative to the local ecological patch, while the latter is a universal estimate. Soft selection does not necessarily operate through the zero fitness value for a genotype, but rather differential fitness. Hard selection can crash your population size. Soft selection does not necessarily do that.

Though I won’t outline the details, one of the originators of the soft/hard selection concept analogized them to density-dependent/independent dynamics in ecology. If you know the ecological models, the correspondence probably is obvious to you.

As for hard and soft sweeps, these are particular terms of relevance to genomics, because genome-wide data has allowed for their detection through the impact they have on the variation in the genome. A “sweep” is a strong selective event that tends to sweep away variation around the focus of selection. A hard sweep begins with a single mutant, and positive selection tends to drive it toward fixation.

A classical example is lactase persistence in Northern Europeans and Northwest South Asians (e.g., Punjabis). The mutation in the LCT gene is the same across a huge swath of Eurasia. And the region of the genome around the mutation is also the same, because regions adjacent to that single mutation increased in frequency as well (they “hitchhiked”). This produces a genetic block of highly reduced diversity, since the hard selective sweep increases the frequency of so many variants which are associated with the advantageous one, and may drive most other competing variants to extinction.

Someone is free to correct me in the comments, but it strikes me that many hard selective sweeps are driven by soft selection. Fitness differentials between those with the advantageous alleles and those without it are not so extreme, and obviously context dependent, even in cases of hard sweeps on a single locus.

The key to understanding soft sweeps is that there isn’t a focus on a singular mutation. Rather, selection can target multiple mutations, which may have the same genetic position, but be embedded within different original gene copies. In fact, soft sweeps often operate on standing variation, preexistent alleles which were segregating in the population at low frequencies or were totally neutral. Genetic signatures of these events are less striking than those for hard sweeps because there is far less diminishment of diversity, since it’s not the increase in the frequency of a singular mutation and the hitchhiking of its associated flanking genomic region.

Soft sweeps can clearly occur with soft selection. But truncation selection can occur on polygenic traits, so depending on the architecture of the trait (i.e., effect size distribution across the loci) one can imagine them associated with hard selection as well.

Going back to the conversation I had with Matt the reason semantics is important is that terms in population genetics are informationally rich, and lead you down a rabbit-hole of inferences. If population genetics is a toolkit for decomposing reality, then you need to have your tools well categorized and organized. On occasion it is important to rectify the names.

* There are two somewhat related definitions of soft/hard selection. I’ll follow Wallace’s original line here, though I’m not sure they differ that much.

Open Thread, 05/28/2018

She Has Her Mother’s Laugh is now available. The interview with Carl Zimmer will be live on The Insight Wednesday night (EDT).

If you haven’t, please consider leaving a 5-star review on iTunes or Stitcher.

I’ve been told that you can already read The University We Need on Google Books. I can’t vouch for this, but on Amazon the publication date is July 10th.

I suspect the field of cultural evolution is going to become big in the next ten years, breaking out of its relatively rarefied ghetto. If you haven’t, I’d recommend The Secret of Our Success by Joe Henrich.

The older, more technical books are Cultural Transmission and Evolution and Culture and the Evolutionary Process.

I noticed the other day that the spam filter was a little overactive recently. Just in case you notice comments not going through….

Y chromosomal star-phylogenies as inter-group competition between paternal lineages

The figure to the left should be familiar to readers of this weblog. It is taken from A recent bottleneck of Y chromosome diversity coincides with a global change in culture (Karmin et al.). Over the past few years a peculiar fact long suspected or inferred has come into sharp focus: some of the Y chromosome haplogroups very common today were not so common in the past, and their frequency changed very rapidly over a short time period.

What Karmin et al. did was look at sequence data across the Y chromosome to make deeper inferences. The issue is that the Y chromosome is not genetically very diverse. Earlier generations of researchers focused on highly mutable microsatellite regions for identification. While microsatellites are good for identification and classification because of their genetic diversity, they are not as good when it comes to making evolutionary inferences about parameters such as time since last common ancestor. They have very high and variable mutation rates.

Single nucleotide polymorphisms (SNPs) are probably better for a lot of evolutionary inference, but the Y chromosome doesn’t have too many of these. SNP-chip era technology which focuses on a select subset of polymorphisms at specific locations didn’t have much to choose from and likely missed rare variants.

This is where whole-genome sequencing of the Y comes in. It retrieves maximal information, and with that, the authors of Karmin et al. could definitively confirm that some Y chromosomal lineages underwent explosive expansion ~4,000 years ago after a bottleneck.

By and large ancient DNA studies take a different angle, focusing on genome-wide autosomal ancestry, and lacking in high-coverage whole-genome sequences. But they have confirmed the inferences from whole-genomes that some of these lineages exhibit explosive growth in the last ~4,000 years. One moment they were rare, and the next moment ubiquitous.

But geneticists are geneticists. They’re interested in genetical questions, methods, and dynamics. To be frank, cultural models for how those genetic patterns might have come about are either exceedingly simple and probably true (e.g., gene-culture coevolution with lactase persistence), or vague and handwavy. With the surfeit of genomic data to analyze it isn’t surprising that this happens.

This is why researchers in the field of cultural evolution need to get involved. They’re model-builders and should see which models predict the copious empirical results we have now when it comes to genetic change over time.

For several years now I have been asserting that inter-group competition of paternal lineages best explains the pattern of Y chromosome expansions ~4,000 years ago. A new paper brings forth a formal model which explores this hypothesis, Cultural hitchhiking and competition between patrilineal kin groups explain the post-Neolithic Y-chromosome bottleneck:

In human populations, changes in genetic variation are driven not only by genetic processes, but can also arise from cultural or social changes. An abrupt population bottleneck specific to human males has been inferred across several Old World (Africa, Europe, Asia) populations 5000–7000 BP. Here, bringing together anthropological theory, recent population genomic studies and mathematical models, we propose a sociocultural hypothesis, involving the formation of patrilineal kin groups and intergroup competition among these groups. Our analysis shows that this sociocultural hypothesis can explain the inference of a population bottleneck. We also show that our hypothesis is consistent with current findings from the archaeogenetics of Old World Eurasia, and is important for conceptions of cultural and social evolution in prehistory.

Their model is interesting because inter-group competition between paternal lineages can result in a loss of haplogroup diversity without huge reproductive skew. That is, instead of a highly polygynous society, one can simply posit that group dynamics of expansion and extinction produce expansions of Y chromosomal lineages.
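The logic can be illustrated with a toy simulation. This is only a sketch under my own simplifying assumptions, not the model in the paper: group sizes are fixed, there is no reproductive skew within groups at all, and a group that goes extinct is simply replaced by a copy of a randomly chosen surviving group. All function names and parameter values are hypothetical.

```python
import random

def surviving_lineages(num_groups=50, group_size=20, generations=200,
                       extinction_rate=0.1, patrilineal=True, seed=0):
    """Count distinct Y lineages left after repeated group extinction/fission."""
    rng = random.Random(seed)
    if patrilineal:
        # Every man in group g carries that group's own lineage g.
        groups = [[g] * group_size for g in range(num_groups)]
    else:
        # Same lineages at the same frequencies, but shuffled across groups.
        pool = [g for g in range(num_groups) for _ in range(group_size)]
        rng.shuffle(pool)
        groups = [pool[i * group_size:(i + 1) * group_size]
                  for i in range(num_groups)]
    for _ in range(generations):
        for i in range(num_groups):
            if rng.random() < extinction_rate:
                # Group i dies out; another group fissions into its territory.
                donor = rng.choice([j for j in range(num_groups) if j != i])
                groups[i] = list(groups[donor])
    return len({lineage for group in groups for lineage in group})

patrilineal_count = surviving_lineages(patrilineal=True)
mixed_count = surviving_lineages(patrilineal=False)
print(patrilineal_count, mixed_count)
```

When groups are patrilineal, each group extinction removes copies of a whole lineage at once, so haplogroup diversity collapses quickly. When the same lineages are mixed across groups, identical group turnover leaves far more lineages standing, even though no man in either scenario out-reproduces his neighbors.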

A formal model synthesized with genomic results is a major step forward, though I haven’t dug into the methods (computational or analytic). Presumably, this is a first step.

But the discussion does review a lot of anthropological literature about the nature of human conflict and social interaction. Basically, it seems that in the stage after nomadic hunter-gatherer bands and before chiefdoms, biologically defined paternal clans were often the organizing principle of society. To some extent this makes total sense since the meta-ethnic religious and social identities explicitly appeal to fictive relationships of blood even after blood was no longer paramount. Ancient Near Eastern kings addressed each other in familial terms (e.g., “brother” and “son”), while universal religions deploy the construct of brotherhood.

In Empires of the Silk Road the author makes the case that these bands of brothers were more influential in shaping history than we realize today. Not surprisingly, the authors of the above paper suggest that the Inner Asian nomad zone is where star-phylogenies have been most pervasive and persist down to historical time. As in Steven Pinker’s The Better Angels of Our Nature it seems that the rise of the state suppressed the viciousness of the paternal kin group. How do we know this? Because the period of the maximal explosion of star-phylogenies seems to be a transient between the early Neolithic and the historical age.

The Y chromosomal literature is just the low hanging fruit. I suspect in the next decade cultural evolutionary models will be brought to bear on the huge mountain of genomic data….

Citation: Cultural hitchhiking and competition between patrilineal kin groups explain the post-Neolithic Y-chromosome bottleneck Tian Chen Zeng, Alan J. Aw & Marcus W. Feldman.

The souls of peoples gone



Stonehenge was first erected around 3100 BC, though the timber was only replaced with stone in 2600 BC. The great monument was a product of the Late Neolithic in Britain. Ancient DNA today tells us that these people were distantly related to the modern Sardinians, and derive from a wave of farmers that radiated out of Anatolia across much of Europe.

About a century after the stone form of Stonehenge was erected, prehistoric Britain was culturally and genetically transformed. In the space of a few centuries after 2500 BC there was a ~90% genetic turnover, and a new people more closely related to Northern Europeans in Germany and further east became ascendant. The majority of the ancestry in Britain today probably derives from this migration period.

And yet the new people continued to utilize Stonehenge for over 1,000 years. Clearly, they co-opted a monument erected by their predecessors and maintained its significance across an enormous cultural disruption.

This is on my mind because on the episode of The Insight recorded with Patrick Wyman (it will probably drop in June) we talked extensively about Roman demography. And one of the peculiarities of 2013’s The Geography of Recent Genetic Ancestry across Europe is that Italy has a lot of deep population structure. From the paper:

There is relatively little common ancestry shared between the Italian peninsula and other locations, and what there is seems to derive mostly from longer ago than 2,500 ya. An exception is that Italy and the neighboring Balkan populations share small but significant numbers of common ancestors in the last 1,500 years, as seen in Figures S16 and S17. The rate of genetic common ancestry between pairs of Italian individuals seems to have been fairly constant for the past 2,500 years, which combined with significant structure within Italy suggests a constant exchange of migrants between coherent subpopulations.

The implication here is that there’s population structure deeper than the Roman period. When I first saw these results I was surprised. Looking at genome-wide data I was pretty sure that most of the modern Italian population dated to the Roman Republican period, but I was not expecting provincial level substructure. It was like telling me that the Samnites and Umbrians were still with us!

But what about the great cosmopolitan cities of Neapolis, Rome, and Ravenna? Some commenters on this blog routinely get frustrated when I dismiss the textual and epigraphic evidence of massive migration into the Italian peninsula during the height of the Roman Empire. Actually, I believe that this migration occurred. I just do not believe it was particularly impactful genetically today. Though my general outlook on this issue goes back over ten years (in part thanks to the suggestion of Greg Cochran), I believe the issue here is that cities are such incredible demographic sinks.

Roman urban cosmopolitanism was parasitic on migration. Demographically it was never self-sustaining. In fact, as Patrick points out urban areas probably did not see sustained above replacement reproduction anywhere in the world before about 1900, with the emergence of germ theory and massive public sanitation works, especially in the United States. This is evident in books as diverse as Kyle Harper’s The Fate of Rome and The Rise and Fall of American Growth.

So did Roman urban civilization leave nothing to posterity? On the contrary. Though, like much of Rodney Stark’s work in the last twenty years, Cities of God is needlessly polemical and oftentimes unscholarly*, it gets at the reality that Christianity was fundamentally an urban cult. It was brought to Italy by people from the Eastern Mediterranean, Jews and Greeks. In its early period it was dominated by urban cosmopolitans. Some of the sermons in urban churches even castigated rural peasants as pagan beasts of the field.

Christianity was an international religion with foreign origins, and like many elite cultural constructions of the pre-modern oikoumene it existed operationally as a social network across the various cities around which elites congregated. In some ways the vast sea of villages which filled in the landscape were untouched by many of the cultural innovations occurring in the cities. A Neolithic person might be confused by some aspects of Roman village life (in particular, access to standardized manufactured goods), but they would be totally flabbergasted by the city of Rome.

Over the 200 years between 400 AD and 600 AD the population of Rome probably went from ~500,000 to ~50,000. The decline of the Western Empire and the period of the Gothic Wars choked off the economic subsidies which could maintain the city’s population by drawing newcomers. And yet the Bishop of Rome, the Pope, remained in the city. If Patrick and I are correct then medieval Rome was repopulated by the descendants of peasants from Lazio, the hinterlands around the city.

Some scholars, albeit often from a partisan Protestant viewpoint, have suggested that the Western Christian Church of the early Middle Ages did not truly Christianize the peasantry. Whether this is true or not, it does seem correct to say that deeply rooted popular Christianity took many centuries to become pervasive in rural areas. Despite their relative decline in the medieval period, both substantively and in terms of cultural prestige, cities remained the stalwart redoubts of Roman Christianity. They were the braintrust of European civilization, even if they were not demographically self-sustaining.

To a great extent the last ten years have seen a refutation of “pots not peoples.” It turns out that many of the archaeological transitions seen in the physical record correlate with demographic changes inferred from genetic changes. And yet we know from history that some peoples and social groups which were highly influential left far less of a demographic footprint. I suspect that the rise of cities and complex polities transformed the “pots not peoples” calculus significantly.

* Google the fact that about ten years ago Stark was dismissing reports that Americans were getting more secular as wishful thinking by biased liberal scholars. Who do you really think had a bias with hindsight?

Selection is going on with SLC24A5….

The ancestral allele for rs1426654 at SLC24A5

 
On this week’s episode of The Insight, I talked to Matt Hahn about why he wrote his new book, his opinions on “Neutral Theory”, and what he thought about David Reich’s op-ed. Without Spencer’s supervision, I have to admit that I think I lost control and just went “full nerd”. Next week we’re dropping Carl Zimmer’s podcast, so rest assured that the world will come back into balance, and The Insight will be more welcoming to civilians!

At a certain point, Matt and I were discussing allele frequency differences between populations and he came close to saying all such differences between human populations were of modest frequency in relation to pairwise comparisons (e.g., 40% vs. 49%). Obviously, this is not true, because there is always the huge difference in SLC24A5 at SNP rs1426654 (as well as at Duffy and a few other loci). A substitution of an A for the ancestral G converts the codon from alanine to threonine.

You have heard of this locus because of a paper in 2005, SLC24A5, a putative cation exchanger, affects pigmentation in zebrafish and humans. This paper came out in December of 2005, a few years after Armand Leroi wrote in Mutants that geneticists still hadn’t come to grips with normal variation in pigmentation in humans. The above publication was the first step in solving this question in the years between 2005 and 2010, at least to a good first approximation.

In the sample in the paper they explain 25-40% of the variation in melanin index between Africans and Europeans with this single genetic change (for various technical reasons it’s probably not that big an effect, though it is still big, and probably the largest effect quantitative trait locus for pigmentation in the human genome).

It turns out that this mutation, the derived variant, is almost disjoint in frequency between Europeans and Africans. That is, ~100% of Africans carry the ancestral G base while ~0% of Europeans carry the G base (as opposed to the A base). Interestingly, East Asians carry the G base at ~100% frequency as well. If you genotype an anonymous individual and their genotype is AG or GG at rs1426654 then it is highly likely that that individual is not a European.

To give an example of how this works, in 2013 I stumbled onto a paper which genotyped 101 Europeans from Cape Town in South Africa. That means there are 202 alleles (two per person) at rs1426654. Of these, 5 of the alleles were ancestral (G). From this, I immediately concluded that it was highly likely that the Afrikaner people of South Africa have non-European ancestry. I came to this conclusion because 5 copies of the ancestral allele, ~2.5%, is shockingly high for a European population, and it was long surmised that the Afrikaner people had some non-European (Khoisan, Bantu, South and Southeast Asian) ancestry. The majority of the whites sampled in Cape Town could have been Afrikaners (I’ve confirmed this with genome-wide data).

To get a sense of where my intuitions come from you need to look at allele counts within populations. Using 1000 Genomes, Yale’s Alfred, and Gnomad I assembled a representative list to give you a sense of what’s going on. Using 126,548 counted alleles in Gnomad for individuals of European (non-Finnish) descent you see that 486 of the total, or 0.38%, are ancestral.

| Population | Ancestral alleles | Total alleles | Freq |
| --- | --- | --- | --- |
| Samaritan | 0 | 74 | 0% |
| Basque | 0 | 216 | 0% |
| Greeks (Thrace, Athens) | 0 | 184 | 0% |
| Burusho | 0 | 50 | 0% |
| Pandit Brahmin, Kashmir | 0 | 40 | 0% |
| European (Non-Finnish) | 486 | 126548 | 0% |
| Ashkenazi Jewish | 47 | 10148 | 0% |
| European (Finnish) | 329 | 25790 | 1% |
| Iraq Kurds | 1 | 68 | 2% |
| Yemenite Jews | 2 | 78 | 3% |
| Havyaka Brahmin, Karnataka | 2 | 62 | 3% |
| Palestinian | 4 | 122 | 3% |
| Gujarati | 10 | 206 | 5% |
| Tunisian Berber | 6 | 110 | 5% |
| Andalusian | 14 | 252 | 6% |
| Iranian | 6 | 84 | 7% |
| Pashtun | 21 | 190 | 11% |
| Uttar Pradesh Brahmin | 4 | 34 | 12% |
| Pandit Brahmin, Haryana | 13 | 78 | 17% |
| Punjabi | 42 | 192 | 22% |
| South Asian | 6921 | 30774 | 22% |
| Kalash | 14 | 48 | 29% |
| Telugu | 71 | 204 | 35% |
| Bangladeshi | 80 | 172 | 47% |
| Sri Lanka Tamil | 105 | 204 | 51% |
| Adi-Dravida, Karnataka | 21 | 34 | 62% |
| Masai Kenya | 192 | 286 | 67% |
| Austro-Asiatic tribe, Odisha | 43 | 56 | 77% |
| Luhya Kenya | 155 | 188 | 82% |
| Hausa | 68 | 76 | 90% |
| Mende Sierra Leone | 155 | 170 | 91% |
| Gambian | 209 | 226 | 92% |
| Ibo | 90 | 94 | 96% |
| Austro-Asiatic tribe, Odisha | 92 | 96 | 96% |
| Esan Nigeria | 193 | 198 | 97% |
| Yoruba Nigeria | 213 | 216 | 99% |
| Biaka | 135 | 136 | 99% |
| East Asian | 18728 | 18856 | 99% |
| Ghana | 140 | 140 | 100% |
| Mbuti | 74 | 74 | 100% |
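To put a rough number on the Cape Town intuition, here is a small sketch that asks how often you would see 5 or more ancestral alleles among 202 if the true frequency matched the Gnomad non-Finnish European figure. It assumes the alleles are independent draws (a plain binomial model) and that the Gnomad frequency is the right baseline; both are my simplifying assumptions, and the function name is hypothetical.

```python
from math import comb

def binomial_tail(n, p, k_min):
    """P(X >= k_min) for X ~ Binomial(n, p)."""
    below = sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(k_min))
    return 1.0 - below

# Baseline ancestral-allele frequency from the Gnomad row: 486 of 126,548.
p_european = 486 / 126548

# Cape Town sample: 101 people, 202 alleles, 5 ancestral copies observed.
expected_copies = 202 * p_european
tail_probability = binomial_tail(202, p_european, 5)

print(f"expected ancestral copies: {expected_copies:.2f}")
print(f"P(5 or more copies): {tail_probability:.4f}")
```

Under these assumptions you would expect fewer than one ancestral copy in a sample of this size, so five copies is well out in the tail.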

Last fall Crawford et al. reported that rs1426654 is embedded in a haplotype that’s ~30,000 years old. Additionally, they contend that its presence within Africa is probably no earlier than the Holocene, the last ~12,000 years. Martin et al. report that KhoeSan exhibit higher frequencies of the derived allele because of Eurasian back-migration and then in situ natural selection. Of course, not all Eurasians carry the derived allele. Most East Asians have the ancestral variant of rs1426654.

This leaves us with West Eurasians, North Africans, and South Asians. I’ve put a few South Asian populations in the list to show you that there is a wide range of variation in allele frequencies. The South Asians in Gnomad, probably mostly Diaspora, have the ancestral variant at only 22%. In contrast, Austro-Asiatic speaking South Asian groups from northeast India have very high frequencies of the ancestral variant. There has clearly been in situ selection in some South Asian populations for the derived variant at rs1426654. Ancestral North Indian groups (ANI) probably brought the derived allele, and Ancient Ancestral South Indians (AASI) probably tended to carry the ancestral allele, like East Eurasians and Oceanians. Additionally, South Asian populations often have high drift. Some of the differences in the Alfred data seem to be impacted by this.

The situation in the Middle East, North Africa, and Europe is different. In the Middle East and North Africa, the ancestral variant is present at frequencies around 1-10%. Some of this can probably be attributed to admixture from Africa and in some cases South and East Asian populations. Ancient DNA from the Middle East and North Africa presents a mixed picture. The farmers who brought the Neolithic to Europe carried the derived variant at rs1426654, and some of the ancient Middle Eastern samples carry it. But not all. The recent Iberomaurusian samples which date to ~15,000 years ago don’t seem to have had the derived variant.

Though the hunter-gatherers of Western Europe only seem to have carried the ancestral variant at rs1426654, the hunter-gatherers of Scandinavia and Eastern Europe did exhibit the derived variant in some frequency, though lower than modern Europeans.

My own hunch is that the original genetic background against which the A mutation at rs1426654 emerged will be found increasing in frequency first somewhere in the Near East after the Last Glacial Maximum. But no ancient population shows the frequencies of the derived variant we see in modern Europeans. In isolated populations subject to drift it wouldn’t be surprising if the ancestral variant decreased to ~0%. But in European populations today in the vast majority of cases the ancestral variant is far lower than 1%, even though we know that within the last 10,000 years the ancestral population streams had several groups with very high frequencies of that ancestral variant. The low frequency is not due to a freakish bottleneck all across Europe. It has to be selection.

One thing I have pointed out is that this very low frequency of the ancestral variant indicates that the advantage at rs1426654 for the A allele in Europe is additive. In Northern Europe, the frequency of the derived variant that confers lactase persistence tops out at around ~90 percent. We know this region of the genome has been targeted by natural selection, but lactase persistence also happens to be genetically dominant. That is, one copy of the mutant allele confers the phenotype. Once you hit ~90 percent of the derived variant only ~1 percent of the population would be lactose intolerant homozygotes (two copies of the ancestral variant), so selection has few targets left. In the Gnomad sample of 60,000+ Europeans, they count three ancestral homozygote genotypes at rs1426654. That’s 0.005%.
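The contrast between additive and dominant advantage can be sketched with the standard one-locus selection recursion. The selection coefficient, starting frequency, and number of generations below are hypothetical values chosen only to show the qualitative difference, and random mating (Hardy-Weinberg proportions) is assumed throughout.

```python
def next_ancestral_freq(q, s, h):
    """One generation of selection against the ancestral allele (frequency q).

    Fitnesses: derived/derived = 1, heterozygote = 1 - h*s,
    ancestral/ancestral = 1 - s. h = 0.5 is additive; h = 0 means the
    derived allele is fully dominant, so heterozygotes pay no cost.
    """
    p = 1.0 - q
    w_het = 1.0 - h * s
    w_anc = 1.0 - s
    mean_fitness = p * p + 2 * p * q * w_het + q * q * w_anc
    return (p * q * w_het + q * q * w_anc) / mean_fitness

def run(q0, s, h, generations):
    q = q0
    for _ in range(generations):
        q = next_ancestral_freq(q, s, h)
    return q

# Hypothetical: 5% selection, start at 50%, 400 generations (~10,000 years).
q_additive = run(0.5, 0.05, 0.5, 400)
q_dominant = run(0.5, 0.05, 0.0, 400)

# Hardy-Weinberg share of ancestral homozygotes at a 10% ancestral frequency,
# as in the lactase persistence example.
lactase_homozygotes = 0.10 ** 2

print(f"additive: {q_additive:.6f}  dominant: {q_dominant:.4f}")
print(f"ancestral homozygotes at q = 0.10: {lactase_homozygotes:.2%}")
```

With an additive advantage the ancestral allele is exposed to selection in every heterozygote and is driven to very low frequency. With a dominant advantage it hides in heterozygotes once it is rare, so it lingers at a frequency of a few percent, which is the lactase persistence pattern rather than the rs1426654 one.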

Something is happening at rs1426654. Selection. But why? No one really has any explanation beyond the obvious.

Our Edo period future?

The second season of Westworld has some scenes set in Edo period Japan. To spoil things for you there is apparently a scene-by-scene re-creation of a plot arc from the first season of the show set in the American West. Watching this scene, and comparing it to the earlier version, I can’t help but feel that the Edo period setting is more grand and refined. If the first season’s violent attack was brutalist, the scene above is more neoclassical.

Then again, Edo Japan and the American West are perhaps antipodes of second-millennium civilization. Where the 19th century American West was anarchic, chaotic, and creative, the Edo period in Japan was notable for its stability, order, and the perception that it was a culture in chrysalis. Old forms may have been reinvented, but those forms were treasured.

The context for the Edo period is that 16th century Japan was a dynamo. Not always in a good way. The islands were riven by internal warfare. The Japanese were known to be a piratical race by the Ming dynasty, and the 16th century ended with the warlord Hideyoshi’s disastrous invasion of Korea. Prefiguring Japanese ability to imitate the West in industriousness they developed a skill in the making of guns, while Roman Catholic Christianity had great success in the southern island of Kyushu.

Eventually, Tokugawa Ieyasu set the stage for Japan’s nearly three hundred year exile from the congress of nations, turning his back on Hideyoshi’s adventurousness. Of course, it is false to assume that the Japanese were totally insulated from the outside world. Not only did they connect with the West through the Dutch, but the Japanese maintained a more intense relationship with Korea. Even in the 17th and 18th century, a movement of “Western Learning” persisted through the interaction with the Dutch (though arguably late Confucian influences may have been more significant).

The violent suppression of Christianity in the 17th century and the emergence of a static caste system strike modern sensibilities as brutal, barbaric and regressive. But the Edo period’s reduction in distribution and production of lethal firearms shows the upside of a conservative and controlling social and political elite. Violence continued, but it was relatively controlled and channeled.

We think of the future as endlessly protean and dynamic. But science fiction offers up an alternative possibility far more like Edo period Japan: technologically stagnant, culturally conservative. Frank Herbert’s Dune was set in the context of a universe where there had been a religious jihad against artificial intelligence. Meanwhile, Isaac Asimov’s Foundation series was originally based on imperial Rome, but later incarnations admitted that the better model was imperial China. Just as in the Dune series, the Foundation universe had to grapple with humanity’s protean and chaotic violence, which threatened to take down our civilization periodically due to enthusiasms.

The Edo period stretches from the early 17th century down to the middle of the 19th. All in all this is not a bad run. Our own republic’s 250th anniversary will be on us in 2026.

The mutation accumulation controversy continues….

Every few years I check to see if the great mutation accumulation controversy has resolved itself. I don’t know if anyone calls it that, but that’s what I think of it as. There are two major issues that matter here: mutation rates are a critical parameter in evolutionary models, and mutation accumulation over time matters for parental age effects when it comes to disease (speaking as an older father!).

In the latter case, I’m talking about the reasons that people freeze their eggs or sperm. In the former case, I’m talking about whether we can easily extrapolate mutation rates over evolutionary time as semi-fixed, so we can infer dates of last common ancestry and such. To give a concrete example of what I’m talking about, if mutation rates varied a lot over the evolutionary history of our hominin lineage, then we might need to rethink some of the inferred timings.

Today two preprints came out on mutation accumulation. First, Overlooked roles of DNA damage and maternal age in generating human germline mutations. Second, Reproductive longevity predicts mutation rates in primates. What a coincidence in synchronicity!

Additionally, the last author on the second preprint, Matt Hahn, is someone I’ll be doing a podcast with this week. So aside from talking about neutral theory, and his book Molecular Population Genetics, I’m going to have to bring up this mutation business.

The figure above from the first preprint shows that the proportion of mutations derived from the father doesn’t increase over time, contrary to what textbooks generally state. Why would we expect an increase? Sperm keeps replicating after puberty so you should be gaining more mutations. In contrast, the eggs are arrested in meiosis. There are various mechanistic reasons that the authors of the first preprint give for why the ratio between paternal and maternal mutations does not change (e.g., non-replicative mutations seem to be the primary one). The authors are using a very “pedigree” strategy, rather than an “evolutionary” one. They’re looking at sequenced trios, and noticing patterns. I think in the near future they’ll be far more sure of what’s going on because they’ll have bigger sample sizes. They admit the effects are subtle (also, some of the p-values are getting close to 0.05).

Instead of focusing on a human pedigree, the second preprint does some sequencing on owl monkeys (I had no idea there were “owl monkeys” before this paper). They find that the mutation rate is ~32% lower in owl monkeys than in humans. Why is this?

The plot to the left shows that mutations increase with age across species (though the number of data points is pretty small). The authors contend that:

The association between mutation rates and reproductive longevity implies that changes in life history traits rather than changes to the mutational machinery are responsible for the evolution of these rates. Species that have evolved greater reproductive longevity will have a higher mutation rate per generation without any underlying change to the replication, repair, or proofreading proteins.

If I read this right: owl monkeys reproduce fast and don’t have as much reproductive longevity. Ergo, lower mutation rates (less mutational build-up from paternal side).
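A toy calculation makes the logic concrete. This is a minimal sketch in Python, assuming a deliberately simple linear model in which paternal mutations accumulate at a constant rate per year after puberty. The function name and every number in it are hypothetical placeholders for illustration, not values from either preprint.

```python
def mutations_per_generation(age_at_puberty, age_at_conception,
                             at_puberty=30.0, per_year=2.0):
    """Toy linear model: expected new mutations passed on by a father.

    at_puberty and per_year are made-up illustrative values, held
    identical across species to mimic unchanged mutational machinery.
    """
    if age_at_conception < age_at_puberty:
        raise ValueError("conception cannot precede puberty")
    return at_puberty + per_year * (age_at_conception - age_at_puberty)


# Hypothetical life histories: only the ages differ between the two species.
fast_breeder = mutations_per_generation(age_at_puberty=1, age_at_conception=6)
slow_breeder = mutations_per_generation(age_at_puberty=13, age_at_conception=30)

print(f"fast breeder: {fast_breeder:.0f} mutations per generation")
print(f"slow breeder: {slow_breeder:.0f} mutations per generation")
```

With the per-year rate held fixed, the longer reproductive span alone produces the higher per-generation figure, which is the shape of the argument in the quoted passage.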

After all these years I’m still not convinced about anything. I assume that eventually bigger data sets will come online and we’ll resolve this. Someone has to be right!

(not too many people on Twitter get what’s going on either)

Beyond “Out of Africa” within Africa

It looks as if the vast majority (95% or more depending on the population) of the ancestry of non-African humans derives from a population expansion which began ~60,000 years ago. Some researchers argue that before this there was a non-trivial period of isolation, the “long bottleneck” (David Reich alludes to this in Who We Are and How We Got Here: Ancient DNA and the New Science of the Human Past). For the vast majority of humans, then, the last 60,000 years is characterized by a branching process, some reticulation (e.g., South Asians merge West and East Eurasian lineages) between these branches from a common ancestor, as well as introgression from archaic lineages like Neanderthals and Denisovans.

Though I accept that modern humans probably migrated out of Africa before 60,000 years ago, mostly due to the results from archaeology, I think the genetic evidence is strong that these groups contributed very little genetically to contemporary populations.

The situation within Africa is very different. Being conservative, it seems likely that the Khoisan ancestral lineage diverged from some other Africans ~200,000 years ago. I say conservative because there are researchers who want to push the divergence much further back. Additionally, several different research groups are now converging on a result that West Africans are a mixture between eastern Sub-Saharan Africans (think the population ancestral to Mota in Ethiopia) and a lineage basal to all other humans. That means that the Khoisan are not the most basal, so even assuming the conservative 200,000 year divergence point for Khoisan, modern humans share a common ancestor earlier than 200,000 years ago.

The upshot here is that around 75 percent of the history of modern humans is within (greater)* Africa. The distinctive “Out of Africa” bottleneck and expansion defines most humans only in the last 25 percent of the history of our species. And, within Africa, the dynamics were very different. The biggest difference is that African populations are not defined by a large number of lineages emerging and diverging around the same period, because there wasn’t a massive and singular expansion within Africa analogous to what occurred outside of Africa (at least until the recent past, with the Bantu expansion). That’s why there’s deep structure within Africa today between groups as divergent as the Bantu, Mbuti, Hadza, and Khoisan.
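The 75/25 split is simple arithmetic on the dates above. Here is a minimal sketch in Python, assuming the ~60,000 year expansion date. The candidate ages for the common ancestor are assumptions chosen for illustration (the text only says it is older than 200,000 years), and the function name is hypothetical.

```python
def share_before_expansion(species_age_years, expansion_years_ago=60_000):
    """Fraction of the species' history that predates the non-African expansion."""
    if species_age_years <= expansion_years_ago:
        raise ValueError("species age must exceed the expansion date")
    return 1 - expansion_years_ago / species_age_years


# Assumed ages for the common ancestor of all modern humans (illustrative only).
for age in (200_000, 240_000, 300_000):
    share = share_before_expansion(age)
    print(f"{age:,} years: {share:.0%} of our history predates the expansion")
```

A common ancestor at about 240,000 years gives exactly the 75 percent figure, and pushing the date further back only raises the share.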

The term “Basal Eurasian” kind of makes sense in the non-African context because of the singular importance of divergence between lineages in the first 10,000 years or so after the “Out of Africa” event. I’m not sure “Basal human” makes as much sense because there wasn’t a singular event within Africa that allowed for the emergence of modern humans. Rather, it was a process, and it probably resembled something like multiregionalism.

* Some wiggle room here for the likelihood that modern humans were long present in the liminal Near East.

The end of the century of privacy

Reading The Rise and Fall of American Growth: The U.S. Standard of Living since the Civil War has made me think more about the unique nature of the urban civilization of the long 20th century. The expansion of public health, in particular the provision of clean water, meant that for the first time in the history of the world you had a situation where people in cities actually had a higher life expectancy than those in rural areas. Prior to this cities were demographic sinks. We have data from the 19th century which makes it clear that morbidity was higher for city dwellers. This is probably the major reason, in my opinion, the cosmopolitan worlds of antiquity had such a marginal demographic impact: the culturally vibrant city-dwellers who dominated Classical civilization politically and socially didn’t leave many descendants.

Even though cities were dominant politically and central to many earlier societies, only in the last century or so have predominantly urban societies emerged. Before that most humans lived in villages or in hunter-gatherer bands. Everyone was in everyone else’s business. Anonymity was simply not a thing for most humans in most periods of our species’ history.

This changed with the rise of cities. In the early 1990s the anthropologist Robin Dunbar argued that people could maintain ~150 genuine social relationships in their mind. This is Dunbar’s number. In the decades since, there have been lots of arguments about Dunbar’s number. One can stipulate that the value may not be 150. Additionally, it seems likely that some people have a higher Dunbar’s number than others. But the general point that human social competencies have a ceiling value seems to be right.

And that ceiling is smaller than the number of people who live in close proximity to each other in cities. The potential facelessness of your neighbors in a city, along with its diversity and cosmopolitanism, is one reason that it was in cities that written laws displayed in public places emerged as a custom. Societies not bound together by social interaction and kinship needed abstractions which could scale. Laws, kings, and religions are just some of the cultural inventions that were essential to maintain order in a city where strangers interacted daily.

But were these cities really incubators for anonymity? I would argue that the premodern city offered far less anonymity, and therefore privacy, than the modern city. Premodern cities were dense, due to limitations in transportation. They were defined by neighborhoods. Additionally, economic activities in cities were often defined by relationships between people, whether it be between a patron and an artisan, or members of a cooperative guild. In some ways, premodern cities were a collection of villages.

What defined the 20th century was the rise of massive corporations that rationalized economic consumption and production. The supermarket is cheaper than your local green-grocer, but there is also less of a personal relationship between you and the supermarket staff. Similarly, they may not know who you are. Rather than having economic relationships directly with other people, you have an economic relationship with an institution, which acts as an intermediary.

By the second half of the 20th century, individuals in cities could be totally self-sufficient and isolated from other human beings if they so chose when it came to personal relationships. The rationalization of modern life made deep human interaction a choice, and to some extent, privacy was the default state.

The rationalization of economic relations continues. But over the last 20 years, and especially the last ten or so, the default state of privacy has disappeared. If you know someone’s name you can usually find their age, where they have lived their adult life, who they lived with, and who their relatives are. Websites like Zillow can tell you their home value or when/if they bought their home and for how much. Facebook, Twitter, and other social media make it so you can find out many things about a person.

Recently a friend of mine who became newly single after ten years in a relationship decided to try out online dating (for the first time). One thing he found is that you have to assume that your matches may have Googled you beforehand (presumably this depends on whether the site gives your full name or not). If you are too shy to talk to your neighbors, just look up who lives at the various addresses around you. Once you have their names you can find out everything else.

Obviously, modern information technology doesn’t make it so that we live in a premodern village. But it does mean that the faceless anonymity enabled by rationalized modern economics and socio-political systems is stripped away. In its place, you become a set of values for various parameters (age, income, political orientation, geographical mobility). You don’t know people in a tacit and natural manner, you know them through their data.

Whereas the political and social views of most employees of a corporation were out of view in the 20th century, today many companies are snooping around in Facebook feeds and doing simple background checks. You may not have a personal relationship with a large company, but it has a relationship with the data that it defines you by.

The 20th century was the century of privacy because the machinery of information distribution appropriate to hunter-gatherers and villages did not scale to cities. And 20th-century technology never caught up to the scale of the cities and economies of that period in terms of distributing information. As the 21st century proceeds, it seems that information technology is finally now in place.