All the commotion within Africa

Unless you’ve been asleep, you probably know by now that the Reich lab has come out with a paper that analyzes the remains of 4 individuals from western Cameroon, dating to 8,000 and 3,000 years ago (2 of each, with one of the older individuals yielding 18.5x coverage DNA!). The location and timing both matter.

This area of Cameroon is hypothesized to be the point of expansion for the Bantu migration. This expansion began about 3,000 years ago and swept east and south until the agricultural streams met back up in southern Africa.

Perhaps, then, the authors “caught history in action” with a change between 8,000 and 3,000 years ago? No such luck, actually. Here is the abstract of Ancient West African foragers in the context of African population history:

… One individual carried the deeply divergent Y chromosome haplogroup A00, which today is found almost exclusively in the same region…However, the genome-wide ancestry profiles of all four individuals are most similar to those of present-day hunter-gatherers from western Central Africa, which implies that populations in western Cameroon today—as well as speakers of Bantu languages from across the continent—are not descended substantially from the population represented by these four people. We infer an Africa-wide phylogeny that features widespread admixture and three prominent radiations, including one that gave rise to at least four major lineages deep in the history of modern humans.

Basically, just like elsewhere in Africa where the Bantu expanded, you see massive discontinuity in this region of Cameroon (the modern agriculturalists in the area are Bantu-speaking). If you have ever analyzed African genetic data, the lack of high-magnitude structure among Bantu populations over wide areas is pretty shocking. The reason there’s little structure seems to be two-fold:

  1. Rapid population expansion, so not much time to accumulate distinct variants (you see this in Northern Europe too)
  2. Minimal admixture with local populations, at least until you get to modern-day South Africa (then there is an admixture cline with Khoisan)

Meanwhile, you have these zones of relic hunter-gatherers here and there. These samples seem to be one of those cases. I think it’s analogous to the fact that hunter-gatherers persisted in pockets for thousands of years after the initial arrival of Neolithic farmers in Europe.

There are two types of things you can take away from a paper like this. General insights. And specific details. The plot at the top of this post illustrates a model that they generated with these data. It seems quite clear that the details are not crisp, and are subject to further revision. But the general insights seem robust and extend what we already knew.

First, there were several human lineages that diverged 500,000 to 1 million years ago. In Eurasia, these became Neanderthals and Denisovans. In Africa, one of the branches led to what we call “modern” humans. But a variety of lines of evidence indicate that within Africa there were also highly diverged human groups, analogous to Neanderthals and Denisovans. One could call them “African Neanderthal” analogs. But within the context of this paper, they are “ghost archaics.” But those aren’t the only “ghosts.”

Extant human populations sample only a fraction of the “modern” family tree, which seems to have diversified from one of the African human groups 300,000 years ago or so.

There is now a fair amount of circumstantial evidence that Neanderthals mixed with an African lineage that is an outgroup to most other Africans and descended-from-Africans. Because of its size and warm climate, I believe that Africa was quite a good habitat for humans, and there were a variety of them across the continent. Though I don’t discount deep-time back migration of Neanderthal/Denisovan groups into Africa, I think due to the different population sizes it is probably more the case that Africans went into western Eurasia than vice versa. Additionally, Southeast Asia seems to be a good target habitat for any African species due to similarities of biome (e.g., Sundaland).

Finally, there is the fact that it seems non-African ancestry is closest to the Mota sample, dated to 4,500 years ago in Ethiopia. This makes geographic sense, though I do wonder if this is an artifact of continuous gene flow back from Eurasia, as much as the likelihood that this is near the exit path of African humans.

What about the details of this paper? Look at the supplements and notice all the admixture graphs. There are lots of potential fits to the data, and more data will come in. The paper is careful not to put too much faith in any one set of weights for gene flows, and different graphs might explain the patterns in the data. Additionally, a highly dense African landscape of hominins might exhibit lots of continuous gene flow and isolation by distance. There’s a lot more to learn. Nothing is being closed in this case.

Selection swimming against the genomic tide

One of the major issues that confuses people is that the distribution of a trait or gene is often only weakly correlated with overall phylogeny and the rest of the genome.

To give a strange but classic example, the MHC loci are subject to strong balancing selection. This means that novel alleles do not substitute and replace ancestral alleles. Substitution of this sort results in “lineage sorting,” so that when you look at chimpanzees and humans you can see many polymorphic loci where all humans carry one variant and all chimpanzees the other. In contrast at the MHC loci there is frequency-dependent selection for rare variants, so the normal cycling process does not occur. Humans and chimpanzees overlap quite a bit on MHC, and any given human may have a more similar profile to a given chimpanzee than another human.
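The contrast between ordinary lineage sorting and selection for rare variants can be made concrete with a toy simulation. What follows is a minimal sketch, not a model of real MHC dynamics: a single two-allele locus in a small Wright-Fisher population, where the function names, population size, and the `strength` parameter are all my own illustrative assumptions.

```python
import random

def wright_fisher(n_copies, generations, p0, strength, rng):
    """Follow one allele's frequency among n_copies gene copies.

    strength = 0 is pure drift. strength > 0 favors the allele when it is
    rare (p < 0.5) and penalizes it when common: a toy stand-in for
    negative frequency-dependent selection (an assumption, not real MHC).
    """
    p = p0
    for _ in range(generations):
        if p in (0.0, 1.0):
            break  # one allele has replaced the other
        w_a = 1.0 + strength * (0.5 - p)  # fitness of the tracked allele
        w_b = 1.0 + strength * (p - 0.5)  # fitness of the alternative allele
        p_sel = p * w_a / (p * w_a + (1 - p) * w_b)
        # Binomial sampling of the next generation (this is the drift step).
        count = sum(1 for _ in range(n_copies) if rng.random() < p_sel)
        p = count / n_copies
    return p

def fraction_polymorphic(strength, replicates=100, seed=1):
    """Share of replicate loci that still carry both alleles at the end."""
    rng = random.Random(seed)
    kept = 0
    for _ in range(replicates):
        p = wright_fisher(100, 500, 0.5, strength, rng)
        if 0.0 < p < 1.0:
            kept += 1
    return kept / replicates

if __name__ == "__main__":
    print("neutral:", fraction_polymorphic(0.0))
    print("rare-allele advantage:", fraction_polymorphic(0.5))
```

Under drift alone most replicates lose one allele within a few hundred generations, which is the sorting process that leaves humans and chimpanzees with different fixed variants. With a rare-allele advantage, both alleles tend to hang around, which is the qualitative pattern behind shared MHC variation.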

There are 19,000 human genes. Of 3 billion base pairs, only ~100 million are polymorphic on a worldwide scale (using some liberal definitions). There are lots of unique stories to tell here.

A new preprint, Inferring adaptive gene-flow in recent African history, illustrates how certain genes with functional significance may differ from the genome-wide background. The authors find that among the Fula (Fulani) people of West Africa there has been introgression of a Eurasian mutation that confers lactase persistence. The area of the genome around this gene is much more Eurasian than the rest of the genome. In contrast, the area around the Duffy allele is much less Eurasian. The variation in this locus is related to malaria resistance. Finally, in other African populations, they found gene flow of MHC variants.
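The basic logic of "this region is more (or less) Eurasian than the rest of the genome" can be sketched as an outlier scan over local ancestry estimates. To be clear, this is not the preprint's haplotype-based method. It is a crude stand-in using a median/MAD robust z-score, and the window names, ancestry fractions, and cutoff below are invented purely for illustration.

```python
from statistics import median

def robust_outlier_windows(window_ancestry, cutoff=3.5):
    """Flag windows whose ancestry fraction departs from the genome-wide norm.

    window_ancestry maps a window label to its estimated Eurasian ancestry
    fraction. Scores are robust z-scores: 0.6745 * (x - median) / MAD.
    """
    values = list(window_ancestry.values())
    center = median(values)
    mad = median([abs(v - center) for v in values])
    if mad == 0:
        return {}  # no spread in the background, nothing to scale by
    flagged = {}
    for name, frac in window_ancestry.items():
        score = 0.6745 * (frac - center) / mad
        if abs(score) >= cutoff:
            flagged[name] = score
    return flagged

def toy_windows():
    """Hypothetical data: 40 background windows near 20% Eurasian ancestry,
    one window with an excess and one with a deficit (made-up numbers)."""
    windows = {f"window_{i}": 0.20 + 0.01 * ((i % 5) - 2) for i in range(40)}
    windows["lactase_region"] = 0.60  # assumed excess, illustration only
    windows["duffy_region"] = 0.02    # assumed deficit, illustration only
    return windows

if __name__ == "__main__":
    for name, score in robust_outlier_windows(toy_windows()).items():
        direction = "excess" if score > 0 else "deficit"
        print(name, direction)
```

A positive score marks a window with more Eurasian ancestry than the background, a negative score marks one with less. Real analyses have to deal with linkage between windows and uncertainty in the ancestry calls, both of which this sketch ignores.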

None of this is entirely surprising, though the authors apply novel haplotype-based methods which should have wider utility.

Genetic variation and disease in Africa


Very readable review, Gene Discovery for Complex Traits: Lessons from Africa. It’s open access, so I recommend it. The summary:

The genetics of African populations reveals an otherwise “missing layer” of human variation that arose between 100,000 and 5 million years ago. Both the vast number of these ancient variants and the selective pressures they survived yield insights into genes responsible for complex traits in all populations.

The main issue I might have is I’m not sure that focusing on 5 million year time spans is particularly useful. Rather, looking at the last major bottleneck for modern humans before the “Out of Africa” event would be key, since that’s when a lot of the common variation would disappear, and very rare variants probably don’t have deep time depth in any case. With all that being said, the qualitative analysis is on point.

One of the major issues in the “SNP-chip” era has been that ascertainment of variation has been skewed toward Europeans. Though more recent techniques have tried to fix this…this review points out that if you by necessity constrain the SNPs of interest to those that vary outside of Africa (most of the world’s population), you are taking many alleles private to Africa off the table. This is relevant because the “Out of Africa” bottleneck ~50,000 years ago means that African populations harbor a lot more genetic variation than non-African populations do.
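To see why ascertaining SNPs in a bottlenecked population drops variation, here is a toy sketch. Everything in it is an assumption for illustration: the number of sites, the rare-skewed frequency distribution, and the tiny founder group are made up, and it ignores drift and new mutation after the bottleneck.

```python
import random

def simulate_ascertainment(n_sites=2000, founder_copies=20, seed=7):
    """Count source-population variants that a chip designed in a
    bottlenecked daughter population would include versus miss.

    Each site segregates in the source population at frequency p, drawn
    from a distribution skewed toward rare alleles (an assumed shape).
    The daughter population is founded by sampling founder_copies gene
    copies; a site makes it onto the chip only if both alleles survive.
    """
    rng = random.Random(seed)
    on_chip = 0
    missed = 0
    for _ in range(n_sites):
        p = 0.5 * rng.random() ** 3  # mostly rare variants
        copies = sum(1 for _ in range(founder_copies) if rng.random() < p)
        if 0 < copies < founder_copies:
            on_chip += 1  # still polymorphic after the bottleneck
        else:
            missed += 1   # variable only in the source population
    return on_chip, missed

if __name__ == "__main__":
    on_chip, missed = simulate_ascertainment()
    print("sites on the chip:", on_chip)
    print("source-only sites missed:", missed)
```

The sites counted as missed are variable in the source population but invisible to the chip, which is the ascertainment problem in miniature. Shrinking `founder_copies` makes it worse, since rare alleles are the first to be lost.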

The move to high-quality whole genome sequencing obviates these concerns. As a matter of course African variation will be “picked up” since the marker set is not constrained ahead of time.

Importantly, the authors focus on South Africa and the Xhosa population. This group has ~20% Khoisan genetic ancestry, which is very diverse and very distinct from the remaining ~80% of its ancestry. With its large African immigrant population and highly diverse native groups, some of them quite admixed, South Africa could actually provide some hard-to-substitute value in biomedical genetics.