Models uncovering African population genetic history


In a deep sense, we know a lot more about the population genetic history of England at the fine-grain than we do about the whole continent of Africa. That’s going to change in the near future, as researchers now realize that the history and emergence of modern humans within the continent was a more complex, and perhaps more multi-regional, affair than had been understood.

Because of the relative dearth of ancient DNA, there has been a lot of deeply analytic work that draws from some pretty abstruse mathematical tools operating on extant empirical data. A series of preprints have come out which use different methods, and arrive at different particular details of results, but ultimately seem to be illuminating a recurring set of patterns. Dimly perceived, but sensed nonetheless.

Here’s the latest offering, Models of archaic admixture and recent history from two-locus statistics. I can’t pretend to have read the whole preprint (lots of math), but these empirical results jumped out at me:

We inferred an archaic population to have contributed measurably to Eurasian populations. This branch (putatively Eurasian Neanderthal) split from the branch leading to modern humans between ∼470−650 thousand years ago, and ∼1% of lineages in modern CEU and CHB populations were contributed by this archaic population after the out-of-Africa split. This range of divergence dates compares to previous estimates of the time of divergence between Neanderthals and human populations, estimated at ∼650 kya (Prüfer et al., 2014). The “archaic African” branch split from the modern human branch roughly 460−540 kya and contributed ∼7.5% to modern YRI in the model (Table A2).

We chose a separate population trio to validate our inference and compare levels of archaic admixture with different representative populations. This second trio consisted of the Luhya in Webuye, Kenya (LWK), Kinh in Ho Chi Minh City, Vietnam (KHV), and British in England and Scotland (GBR). We inferred the KHV and GBR populations to have experienced comparable levels of migration from the putatively Neanderthal branch. However, the LWK population exhibited lower levels of archaic admixture (∼ 6%) in comparison to YRI, suggesting population differences in archaic introgression events within the African continent (Table A3).

To be frank I’m not sure as to the utility of the term “archaic” anymore. I sometimes wish that we’d rename “modern human” to “modal human.” That is, the dominant lineage that was around ~200,000 years ago in relation to modern population ancestry.

Skull from Iwo Eleru, Nigeria. Photo credit: Katerina Harvati and colleagues CC-BY

But, these results are aligned with other work from different research groups which indicate that something basal to all other modern humans, but within a clade of modern humans in relation to Neanderthal-Denisovans, admixed with a modern human lineage expanding out of eastern Africa. The LWK sample is Bantu, and has a minority Nilotic component that has West Eurasian ancestry. This probably accounts for the dilution of the basal lineage from 7.5% to 6%.
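As a rough check on the dilution argument, here is a minimal sketch of a two-source mixing model. This is my own illustrative assumption, not the preprint's method: it supposes the diluting ancestry carries none of the basal lineage, and asks how much of it would be needed to move the estimate from 7.5% to 6%.

```python
# Toy two-source mixing model (an illustrative assumption, not the preprint's method).
# The YRI-like source carries source_pct of the basal lineage; the diluting
# source is assumed to carry none of it.

def diluting_fraction(source_pct: float, observed_pct: float) -> float:
    """Fraction of ancestry from a basal-free source needed to dilute
    source_pct down to observed_pct."""
    if not 0 < observed_pct <= source_pct:
        raise ValueError("observed_pct must be positive and no larger than source_pct")
    return 1.0 - observed_pct / source_pct

yri_basal_pct = 7.5  # from the quoted results
lwk_basal_pct = 6.0  # from the quoted results

print(f"Implied diluting ancestry: {diluting_fraction(yri_basal_pct, lwk_basal_pct):.0%}")
```

Under these assumptions the implied diluting fraction is about a fifth of LWK ancestry. If the diluting source itself carried some of the basal lineage, the required fraction would be larger.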

I wouldn’t be surprised if the final proportions differ. And other research groups have found deep lineages with African hunter-gatherers. My own view is that it does seem likely that one of the African human populations that flourished ~200,000 years ago expanded and assimilated many of the other lineages. The “Out of Africa” stream is one branch of this ancient population. But it seems possible that the expansion was incomplete, and that other human lineages persisted elsewhere until a relatively late date.

How paternity testing is like international trade

Population                        Nonpaternity rate (%)    N
Switzerland                       0.83                     1607
USA, Michigan, white              1.49                     1417
USA, California, white            2.1                      6960
USA, Hawaii                       2.3                      2839
UK, West London                   3.7                      2596

Paternity Testing Laboratories
UK                                16.6                     1702
USA, Los Angeles, white           24.9                     1393
Sweden                            38.7                     5018
South Africa, Cape Coloured       40                       1156

The results above are from Kermyt Anderson’s How Well Does Paternity Confidence Match Actual Paternity? This is still one of the best surveys of the field, despite being 12 years old. A more recent paper, Cuckolded Fathers Rare in Human Populations, uses more powerful genetic genealogy methods to come to the same conclusion as Anderson’s survey: extrapair paternity, or nonpaternity events, are rare in Western societies. I don’t think it is limited to Western societies. I suspect that when high throughput sequencing is applied to Chinese clan lineages and Hindu gotras, you will find that nonpaternity events are similar to those in the West.*
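To summarize the table, here is a small sketch that pools each group of rows into a sample-size-weighted nonpaternity rate. The group names `general_samples` and `testing_labs` are my own labels, and pooling across such different studies is a simplification, not something Anderson's survey does in this form.

```python
# (population, nonpaternity rate in %, sample size N), copied from the table above.
# The group names are labels chosen for this sketch.
general_samples = [
    ("Switzerland", 0.83, 1607),
    ("USA, Michigan, white", 1.49, 1417),
    ("USA, California, white", 2.1, 6960),
    ("USA, Hawaii", 2.3, 2839),
    ("UK, West London", 3.7, 2596),
]
testing_labs = [
    ("UK", 16.6, 1702),
    ("USA, Los Angeles, white", 24.9, 1393),
    ("Sweden", 38.7, 5018),
    ("South Africa, Cape Coloured", 40, 1156),
]

def pooled_rate(rows):
    """Sample-size-weighted mean nonpaternity rate, in percent."""
    total_n = sum(n for _, _, n in rows)
    return sum(rate * n for _, rate, n in rows) / total_n

print(f"General samples: {pooled_rate(general_samples):.1f}%")
print(f"Paternity testing laboratories: {pooled_rate(testing_labs):.1f}%")
```

Even in the laboratory group, where suspicion is presumably high, the pooled rate stays well under half.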

On the other hand, in some small-scale societies, the rates are much higher.

I won’t delve into the evolutionary anthropology here. Rather, I want to point to a new paper, Growth of ancestry DNA testing risks huge increase in paternity issues. Ancestry testing is huge. Within the next year, it is almost certain that 10% of the American population will have had some sort of high-density genomic testing done.

As the author of the paper pointed out to me on Twitter, 1% of 16 million people is still a lot. Yes, in absolute terms. But we need to look at the other side of the equation.

In Anderson’s original data one of the interesting results is that in most datasets drawn from paternity testing laboratories, where there is a very high suspicion of nonpaternity events, most of the fathers nevertheless were biological fathers! Outside of a paternity-testing context, nonpaternity events will be much closer to ~1%. But, I think it is reasonable to suppose that some of the 99% of the fathers who turn out to be biological fathers also have suspicions…which are unfounded.

Like free trade, you tend to see one side of the equation much more than the other. In free trade scenarios, a minority of workers may lose their jobs or have to work under reduced wages, but the vast majority of consumers will get cheaper or better products. The former is much more salient than the latter.

Similarly, the small minority of fathers and families who are going to be “surprised” in a negative way, is balanced out by the likely larger number who have low-grade suspicions, but in fact, are confirmed in their biological relatedness.
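The two sides of this ledger can be put into numbers with a minimal sketch. The 16 million testers and the ~1% nonpaternity rate come from the discussion above; the share of biological fathers with unfounded suspicions is a purely hypothetical value that I made up to show the accounting.

```python
# Figures from the post: ~16 million people tested, ~1% nonpaternity
# outside of paternity-testing laboratories.
testers = 16_000_000
nonpaternity_rate = 0.01

# HYPOTHETICAL: share of biological fathers who harbor an unfounded suspicion.
# This value is not from any study; it only illustrates the bookkeeping.
unfounded_suspicion_rate = 0.05

def two_sides(testers, nonpaternity_rate, unfounded_suspicion_rate):
    """Return (negatively surprised, reassured) counts under the toy model."""
    surprised = testers * nonpaternity_rate
    reassured = testers * (1 - nonpaternity_rate) * unfounded_suspicion_rate
    return surprised, reassured

def break_even_suspicion_rate(nonpaternity_rate):
    """Suspicion rate among biological fathers at which both sides are equal."""
    return nonpaternity_rate / (1 - nonpaternity_rate)

surprised, reassured = two_sides(testers, nonpaternity_rate, unfounded_suspicion_rate)
print(f"Surprised: {surprised:,.0f}")
print(f"Reassured: {reassured:,.0f}")
print(f"Break-even suspicion rate: {break_even_suspicion_rate(nonpaternity_rate):.2%}")
```

In this toy model the reassured outnumber the surprised whenever slightly more than 1% of biological fathers have doubts, whatever the true figure turns out to be.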

Addendum: Needless to say, if you are part of the “cuckold community”, you should probably not get this sort of testing.

* The necessity of good quality whole-genome sequencing is due to the fact that male relatives are excellent candidates for nonpaternity events. To get a certain estimate one would want to count unique mutations across the pedigree.

Patterns of genetic diversity within Africa

The violin-plot above is from a new preprint, Runs of Homozygosity in sub-Saharan African populations provide insights into a complex demographic and health history. Here’s the abstract:

The study of runs of homozygosity (ROH), contiguous regions in the genome where an individual is homozygous across all sites, can shed light on the demographic history and cultural practices. We present a fine-scale ROH analysis of 1679 individuals from 28 sub-Saharan African (SSA) populations along with 1384 individuals from 17 world-wide populations. Using high-density SNP coverage, we could accurately obtain ROH as low as 300Kb using PLINK software. The analyses showed a heterogeneous distribution of autozygosity across SSA, revealing a complex demographic history. They highlight differences between African groups and can differentiate between the impact of consanguineous practices (e.g. among the Somali) and endogamy (e.g. among several Khoe-San groups). The genomic distribution of ROH was analysed through the identification of ROH islands and regions of heterozygosity (RHZ). These homozygosity cold and hotspots harbour multiple protein coding genes. Studying ROH therefore not only sheds light on population history, but can also be used to study genetic variation related to the health of extant populations.

This sort of run-of-homozygosity analysis is enabled by high-density genotyping or whole-genome sequencing. After quality control, the authors had 1 to 1.5 million SNPs for all populations.
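To make the idea of a run of homozygosity concrete, here is a deliberately naive sketch that scans one individual's genotypes for uninterrupted homozygous stretches of at least 300 kb. It is not PLINK's sliding-window procedure (which also tolerates some heterozygous and missing calls); the function name, the 0/1/2 genotype coding, and the toy data are all assumptions for illustration.

```python
from typing import List, Tuple

def find_roh(positions: List[int], genotypes: List[int],
             min_length_bp: int = 300_000) -> List[Tuple[int, int]]:
    """Return (start_bp, end_bp) for each maximal run of homozygous SNPs
    spanning at least min_length_bp.

    genotypes are alternate-allele counts: 0 or 2 = homozygous, 1 = heterozygous.
    positions must be sorted base-pair coordinates on one chromosome.
    """
    if len(positions) != len(genotypes):
        raise ValueError("positions and genotypes must have the same length")

    runs = []
    start = None  # index of the first SNP in the current homozygous run
    for i, g in enumerate(genotypes):
        if g != 1:
            if start is None:
                start = i
        elif start is not None:
            runs.append((start, i - 1))  # a heterozygous call ends the run
            start = None
    if start is not None:
        runs.append((start, len(genotypes) - 1))

    return [(positions[a], positions[b]) for a, b in runs
            if positions[b] - positions[a] >= min_length_bp]

# Toy data: one SNP every 100 kb, heterozygous calls at the 5th and 8th SNPs.
toy_positions = [0, 100_000, 200_000, 300_000, 400_000, 500_000, 600_000, 700_000]
toy_genotypes = [0, 2, 2, 0, 1, 0, 2, 1]
print(find_roh(toy_positions, toy_genotypes))
```

With sparse markers, a short stretch of homozygous calls can easily arise by chance, which is why dense SNP coverage matters for calling ROH as short as 300 kb.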

The interesting thing about this preprint is that by looking at the violin-plots you can see exactly all the things that population geneticists have learned about the demography, structure, and history of humans in the past generation or so.

  • The rightmost panel shows the average total length of short ROH. Partly the pattern fits into the older serial bottleneck model of the settlement of the world: Amerindian > East Asian > European > African. But what about the lower fractions for mixed Latin Americans and Gujaratis? This is a consequence of admixture, as these populations are mixtures in a sense of other groups.
  • The length of the long ROH segments, the second to last panel on the right, is indicative of recent patterns of marriage. Within Africa, you see some groups have many individuals with lots of long ROH segments. This is because of consanguinity. As the authors observe, the Oromo and Somali are both Cushitic speaking groups from the Horn of Africa, but the latter are universally Muslim, while only a minority of the former are. Islamic cultures have traditionally encouraged consanguineous marriages, and you can see the difference between these groups (whose total length of short segments is similar).
  • The pattern of ROH here can be predicted by simple genetic models: the extent of random mating within populations, recombination rates across the genome, and total population size. What modern genomic technology does is provide data to test the models.
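One of the simple models behind the long-versus-short distinction can be sketched in a few lines. A standard textbook approximation, not taken from the preprint, is that an autozygous segment inherited from a common ancestor g generations back passes through 2g meioses, so its length is roughly exponential with mean 100/(2g) centimorgans. The 1 cM per Mb conversion used below is a crude genome-wide assumption.

```python
def expected_roh_cm(generations_to_ancestor: float) -> float:
    """Mean autozygous segment length in cM for a common ancestor g generations
    back (2g meioses separate the two copies). Textbook approximation."""
    if generations_to_ancestor <= 0:
        raise ValueError("generations_to_ancestor must be positive")
    return 100.0 / (2.0 * generations_to_ancestor)

def generations_for_length(length_cm: float) -> float:
    """Invert the approximation: ancestor depth implied by a mean length in cM."""
    if length_cm <= 0:
        raise ValueError("length_cm must be positive")
    return 100.0 / (2.0 * length_cm)

CM_PER_MB = 1.0  # ASSUMPTION: rough genome-wide average recombination rate

# Offspring of first cousins: the shared ancestors are 3 generations back.
print(f"First-cousin offspring: ~{expected_roh_cm(3):.1f} cM per segment")

# A 300 kb ROH, the shortest class called in the preprint.
short_cm = 0.3 * CM_PER_MB
print(f"300 kb ROH: ancestor ~{generations_for_length(short_cm):.0f} generations back")
```

Under this approximation, long ROH point to ancestors only a few generations back (recent consanguinity), while very short ROH reflect shared ancestry hundreds of generations deep (old bottlenecks and small population size).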

 

The golden age of pigmentation is yet to come

Skin color is important and interesting. It is important because people think it is important. Humans often classify each other by complexion, and it has a high social importance in many cultures.

This tendency starts at a very young age. As toddlers, all of my children have misidentified photographs of black American males with a medium brown complexion as their father (for example, my son recently misidentified a photograph of the singer Pharrell as me). In terms of my background though, I’m 100% Eurasian in ancestry. On a PCA plot, I’m about halfway between Europeans & Near Easterners and East Asians (I have 15% East Asian ancestry so I’m more shifted to East Asians than the typical South Asian).

Skin is the largest human organ, and we are a visual species. It is an incredibly salient canvas. So it’s no surprise that we use complexion as a diagnostic marker for taxonomic purposes. The ancient Greeks correctly observed that the peoples of southern India have dark skin like Sub-Saharan Africans (“Ethiopians”), but that their hair is not woolly. Islamic commentators regularly referred to South Asians as “black crows”, while European observers of the 17th century noted that the ruling class of Indian Muslims tended to be white (i.e., mostly Turkic and Iranian in provenance) while the non-elites were black (descendants of Indian converts).*

Luckily, for a characteristic that we’re fascinated by, pigmentation has been reasonably tractable to genetics. As early as the 1950s human geneticists using classical methods of pedigree analysis predicted that pigmentation was polygenic, but that most of the variation was due to a small number of loci (see The Genetics of Human Populations). In particular, they focused on families of mixed European and African ancestry in British ports with known pedigrees.

When genomic methods came on the scene in the 2000s, pigmentation was one of the first traits that yielded positive GWAS hits as well as population genetic findings related to natural selection. In Mutants, written in the middle aughts, the author observed that there wasn’t much known about the basis of normal human variation in pigmentation. This all changed literally a year after the publication of this book. By the middle of 2006, a review paper came out with the title, A golden age of human pigmentation genetics. The reason this paper was written is that a host of studies on European populations had identified several loci which explained a substantial proportion of the intercontinental difference in pigmentation between Africans and Europeans.


Hominins are still having sex, caught in flagrante delicto

Assuming you haven’t been sleeping under a rock, you have probably heard that a Nature paper came out on an F1 Neanderthal-Denisovan hybrid. The major new science in my opinion from the results of the genome itself is to be found in the figure above. It confirms that there was a lot of population turnover among Neanderthals, as this individual’s mother is more closely related to European Neanderthals who flourished ~40,000 years later than conspecifics from the same region 30,000 years earlier. This is not surprising in light of what we know about the genetics and paleoecology of this group, though it confirms what we know and increases our confidence.

Rather, what is surprising is that this paper was published because they found an F1. From their conclusion:

It is notable that one direct offspring of a Neanderthal and a Denisovan (Denisova 11) and one modern human with a close Neanderthal relative (Oase 1) have been identified among the few individuals from whom DNA has been retrieved and who lived at the time of overlap of these groups…In conjunction with the presence of Neanderthal and Denisovan DNA in ancient and present-day people…this suggests that mixing among archaic and modern hominin groups may have been frequent when they met.

The number of ancient genomes from these species/groups/lineages is literally in the range of a handful. And among the early finds is an F1! This seems highly unlikely. It could be a fluke. Or, as inferred above, F1’s may have been very common when different hominin lineages met.

But that makes one ask: how is it that Neanderthals and Denisovans remained so genetically distinct over hundreds of thousands of years? The two reasons offered are that the lineages were geographically very distant from each other on the whole, and that hybrid individuals had very low fitness. I think the former is the primary dynamic to focus on.

For my assertion to make sense, consider some context in the published literature and theory. From 2004 and 2011 respectively, Modern Humans Did Not Admix with Neanderthals during Their Range Expansion into Europe and Strong reproductive isolation between humans and Neanderthals inferred from observed patterns of introgression.

From the first paper:

…we estimate that maximum interbreeding rates between the two populations should have been smaller than 0.1%. We indeed show that the absence of Neanderthal mtDNA sequences in Europe is compatible with at most 120 admixture events between the two populations despite a likely cohabitation time of more than 12,000 y. This extremely low number strongly suggests an almost complete sterility between Neanderthal females and modern human males, implying that the two populations were probably distinct biological species.

And the second:

Recent studies have revealed that 2–3% of the genome of non-Africans might come from Neanderthals, suggesting a more complex scenario of modern human evolution than previously anticipated. In this paper, we use a model of admixture during a spatial expansion to study the hybridization of Neanderthals with modern humans during their spread out of Africa. We find that observed low levels of Neanderthal ancestry in Eurasians are compatible with a very low rate of interbreeding (<2%), potentially attributable to a very strong avoidance of interspecific matings, a low fitness of hybrids, or both.

Models are models, and they have assumptions. Don’t hate the player, hate the model assumptions, and revisit your priors.

There are 22 ancient genomes from 40,000 years ago or before. One of them is an F1 between Neanderthals and Denisovans. And another, Oase 1, has a Neanderthal in their very recent ancestry. The sampling locations may not be totally representative. The Denisova cave is likely to be special because it’s at the nexus of the ranges of the two Eurasian archaic lineages. But with that out of the way, it seems very unlikely to me that very low fitness or a very low likelihood of mating when in close contact is the reason that the lineages remained distinct. After less than half a dozen samples from Denisova cave, researchers hit on an F1. What are the chances?
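The "what are the chances" question can be framed with a simple binomial sketch. If a fraction p of individuals at a site were F1 hybrids and samples were independent draws, the chance of seeing at least one F1 among n samples is 1 - (1 - p)^n. The p values below are hypothetical, and n = 5 is just my reading of "less than half a dozen"; real fossil sampling is certainly not independent and random.

```python
def prob_at_least_one(p_f1: float, n_samples: int) -> float:
    """Probability of at least one F1 among n independent samples,
    if a fraction p_f1 of individuals are F1 hybrids."""
    if not 0.0 <= p_f1 <= 1.0:
        raise ValueError("p_f1 must be between 0 and 1")
    if n_samples < 0:
        raise ValueError("n_samples must be non-negative")
    return 1.0 - (1.0 - p_f1) ** n_samples

n = 5  # ASSUMPTION: "less than half a dozen" samples
for p in (0.001, 0.01, 0.1):  # HYPOTHETICAL F1 frequencies
    print(f"p = {p:<5} -> P(at least one F1 in {n}) = {prob_at_least_one(p, n):.3f}")
```

If F1s were truly rare, on the order of one in a thousand, hitting one in a handful of samples would be a fluke well under a 1% chance. That is why the find pushes toward hybrids having been common where the lineages met.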

And yet, if matings between the lineages occurred when they were in close contact, and they were genetically distinct nevertheless over such long periods, then that demands an explanation. Denisova hominins and Neanderthals were genetically closer than modern humans are to either. At the time that F1 was conceived the two lineages had been distinct for ~300,000 years. This is not qualitatively much longer than some modern human groups (e.g., Khoisan vs. everyone else) have been diverging. And yet, like the Denisovan-Neanderthal split, modern humans have a lot of population structure and evidence of isolation (also, note that modern humans show no evidence of reduced reproductive fitness from offspring and purification of admixture, as has been inferred for Neanderthal genomic regions in modern human genomes).

All this leads me to conclude that in Pleistocene hominins allopatry and metapopulation dynamics are the solutions to this quandary. The population density of archaic hominins was on average low, but you need to go beyond the average. The distribution was possibly highly patchy, with large zones of little habitation. Gene flow across populations may have occurred, but it would have run up against a wall of emptiness equivalent to the Atlantic ocean. Additionally, both Neanderthal and modern human ancient DNA indicates a recurrent pattern of local population extinction and replacement. My hypothesis is that populations which were liminal to the range of both lineages, and so likely to have a higher load of admixture from the other lineage, were also in a marginal territory and most likely to go extinct and leave no descendants. Then, less admixed populations with larger numbers close to the core of the lineage range would repopulate the liminal region.

If the model is correct, I think the Altai was resettled by Neanderthals from the west after the Eemian interglacial.

A contrasting way to maintain genetic separation, as opposed to allopatry (physical distance and barriers), is group cultural identities which maintain very strict endogamy. We see this over 2,000 years in India, where populations are co-localized but almost totally unrelated in any way you’d predict from geography. But 2,000 years is a blink of an eye geologically. The explanation for why Neanderthals and Denisovans, and various African human lineages, remained separate for hundreds of thousands of years as coherent populations despite some gene flow on the margins, has to be geology, geography and ecology. These are domains where hundreds of thousands of years of stasis are quite possible.

The HGDP in the post-ascertainment era

In the 1990s there was a huge debate around the “Human Genome Diversity Project” (HGDP). By the HGDP I don’t mean what you probably know as the HGDP panel, but a more ambitious attempt to genotype tens of thousands of individuals across the world. In the end activists “won”, and the grand plans came to naught. If you want to read about it, The Human Genome Diversity Project: An Ethnography of Scientific Practice has a scholarly viewpoint, though you can also just ask someone who was involved with the human population genetics community in the 1990s (this is not a large set of scholars).

Ultimately the HGDP became the samples from L. L. Cavalli-Sforza’s dataset which you read about in The History and Geography of Human Genes. This is what drives the HGDP Browser. It’s also the data set at the heart of papers like Worldwide Human Relationships Inferred from Genome-Wide Patterns of Variation. Here is the abstract:

Human genetic diversity is shaped by both demographic and biological factors and has fundamental implications for understanding the genetic basis of diseases. We studied 938 unrelated individuals from 51 populations of the Human Genome Diversity Panel at 650,000 common single-nucleotide polymorphism loci. Individual ancestry and population substructure were detectable with very high resolution. The relationship between haplotype heterozygosity and geography was consistent with the hypothesis of a serial founder effect with a single origin in sub-Saharan Africa. In addition, we observed a pattern of ancestral allele frequency distributions that reflects variation in population dynamics among geographic regions. This data set allows the most comprehensive characterization to date of human genetic variation.

These SNPs though were ascertained on European populations. That is, the genetic variation tended to be genetic variation found in Europe. This is a problem, and one reason that the Human Origins Array was developed. The ascertainment problem was really obvious when researchers were looking at Khoisan genomes, and noticed how much variation they had that wasn’t being captured on SNP-arrays.

Today, we’re finally moving beyond the era where ascertainment is so much of an issue. At the SMBE meeting earlier this month Anders Bergstrom presented results from the HGDP using whole-genome analysis. When you look at the whole genome, you obviate the problem with selecting a biased subset of the variation. You can look at all the variation, or vary the variation you want to look at.

Bergstrom & company will have a paper on the whole-genome analysis of the HGDP in the near future. I assume it will be somewhat like the 1000 Genomes paper, but I bet you the SNP count will be higher, because they have Khoisan in their samples (along with Mbuti, etc.). Anders shared with me some of the preliminary data that the Sanger Institute has generated.

Below the fold I plotted a PCA of the HGDP data. First, the classic SNP-chip data. Second, SNPs pulled out of the WGS which are very high quality calls (though they may still have wrong calls), but have a minor allele frequency of at least 1% (~1.5 million). You immediately notice the Eurasian compression along PC 1. Finally, using ~15 million SNPs that had no missingness in the data, you see PC 2 being defined by San Bushmen vs. non-San-Bushmen, while Mbuti Pygmies along with Biaka clearly are the furthest along PC 1 excepting the San. There are 6 San Bushmen in the data. If there are SNPs which are very distinct to this group, and not polymorphic in other populations, then my 1% cut-off would actually remove that variation.
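The arithmetic behind that last point is easy to sketch. An allele found only in a small group can never exceed a global frequency of (2 × group size) / (2 × total sample size), even if it is fixed in that group. The total of 938 individuals is borrowed from the SNP-chip abstract quoted earlier and is only an assumption here; the whole-genome panel may differ.

```python
import math

def max_private_allele_freq(n_group: int, n_total: int) -> float:
    """Highest possible global frequency of an allele that is fixed in a group
    of n_group diploid individuals and absent from everyone else."""
    if not 0 < n_group <= n_total:
        raise ValueError("need 0 < n_group <= n_total")
    return (2 * n_group) / (2 * n_total)

def min_group_size_to_pass(maf_cutoff: float, n_total: int) -> int:
    """Smallest group whose fixed private alleles reach the MAF cutoff."""
    return math.ceil(maf_cutoff * n_total)

n_san = 6       # from the post
n_total = 938   # ASSUMPTION: panel size taken from the SNP-chip abstract
cutoff = 0.01   # the 1% minor allele frequency filter

freq = max_private_allele_freq(n_san, n_total)
print(f"Max global frequency of a San-private allele: {freq:.2%}")
print(f"Survives the {cutoff:.0%} filter: {freq >= cutoff}")
print(f"Group size needed to pass: {min_group_size_to_pass(cutoff, n_total)}")
```

With these numbers, even variants fixed in the San and absent elsewhere fall below the cutoff, so a global frequency filter discards exactly the variation that distinguishes a small, highly diverged group.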

It’s an interesting world we live in, thanks to research groups like the Sanger Institute, Estonian Biocentre, and the 1000 Genomes Project, as well as tools such as PLINK. Analysis that took decades in the 20th century can now be whipped out in a matter of hours. Better analyses in fact.


Complex evolution of pigmentation in modern humans

Last fall Crawford et al., Loci associated with skin pigmentation identified in African populations, was published in Science and made a huge splash. As I’ve been saying recently, and most people agree, much of the remaining “low hanging fruit” in human evolutionary genomics, and to some extent, human medical genetics, is going to be in Africa on Africans. From an evolutionary perspective, that’s probably because from a gene-centric viewpoint most of our recent evolutionary history was within Africa. As a friend once told me, “most of the last 200,000 years is about the collapse of ancient population structure.” This goes too far, but at least it gets at something we’ve not been too conscious of.

Top left clockwise: Luo Kenya, Khoisan, South Asian, Arrernte Australia

Crawford et al. was important because it was a deep dive into a topic which has been understudied, the variation of pigmentation genetics within Africa (also see Martin et al.). The fact that there is variation in pigmentation within Africa should not be surprising, though some people are surprised that there is variation in pigmentation within Sub-Saharan Africa. But anyone who has seen photos of San Bushmen knows they are very distinct from South Sudanese, who are very distinct from West Africans. As documented by both Crawford et al. and Martin et al. some of this variation is likely novel.

By this, I mean there has been backflow of the derived Eurasian variant of a mutation on SLC24A5. Arguably the first major human pigmentation locus of the “post-genomic era”, its discovery was enabled by its huge effect in explaining variation among Eurasian populations and their differences from African groups. In Crawford et al. the authors observe that within Africans ~30% of the trait variance was due to four loci, with ~13% due to SLC24A5. In earlier work comparing just people of European and African descent, SLC24A5 explains closer to 30% of the pigmentation difference. It seems that the genetic effects on pigmentation exhibit an exponential distribution: a small number of loci have large effects, and a large number of loci have small effects.

Distribution of rs1426654 at SLC24A5

The results from Crawford et al. and Martin et al., a naive inspection of the modern distribution of the derived rs1426654 allele, and ancient DNA, seem to indicate a mutation associated with lighter skin emerged after 40,000 years ago. That is, after the expansion of non-African humans, and after the divergence between eastern and non-eastern branches of non-Africans. A common haplotype around this mutation suggests that it wasn’t part of the ancestral “standing variation” of the human lineage. Ancient samples from Scandinavia, the Caucasus, and modern samples from Eurasia and from Africa, all exhibit the same pattern, suggesting recent common descent.

And though a mutation on rs1426654 is associated with lighter skin, it does not produce white skin. I have the homozygote derived genotype on rs1426654, as does my whole nearby pedigree. All of us have brown skin, to varying degrees. And interestingly, the locus around rs1426654 seems to be under strong selection in both South Asia and Africa, including East Africa. This makes me somewhat skeptical that there is a simple story to tell on this locus in relation to skin pigmentation being the driver here.

Let me quote from Crawford et al.:

Most alleles associated with light and dark pigmentation in our dataset are estimated to have originated prior to the origin of modern humans ~300 ky ago (26). In contrast to the lack of variation at MC1R, which is under purifying selection in Africa (61), our results indicate that both light and dark alleles at MFSD12, DDB1, OCA2, and HERC2 have been segregating in the hominin lineage for hundreds of thousands of years (Fig. 4). Further, the ancestral allele is associated with light pigmentation in approximately half of the predicted causal SNPs…These observations are consistent with the hypothesis that darker pigmentation is a derived trait that originated in the genus Homo within the past ~2 million years after human ancestors lost most of their protective body hair, though these ancestral hominins may have been moderately, rather than darkly, pigmented (63, 64). Moreover, it appears that both light and dark pigmentation has continued to evolve over hominid history….

For over ten years it has been clear that very light skin in eastern and western Eurasia is due to different mutational events. Crawford et al. give us results that indicate this pattern of evolutionary complexity is primal and ancient.

But there is often a tacit understanding that the selection process is the same over time and space. Something to do with protection from UV light and also synthesis of vitamin D at higher latitudes. So this paper that just came out definitely piqued my interest, Darwinian Positive Selection on the Pleiotropic Effects of KITLG Explain Skin Pigmentation and Winter Temperature Adaptation in Eurasians. The authors looked at a lot of variants in KITLG with a focus on East Asians. They confirmed that there were at least two selection events, one just around the “Out of Africa” period, and possibly another one later, during a period when West and East Eurasians were genetically distinct.

This section is very intriguing: “Besides pigmentation, KITLG is also involved in mitochondrial function and energy expenditure in brown adipose tissue under cold condition (Nishio et al. 2012; Huang et al. 2014). We demonstrated that winter temperature showed a much stronger correlation than UV for rs4073022.” Earlier the authors review work which suggests that large melanocytes are much more susceptible to damage due to cold than smaller ones. Dark-skinned individuals tend to have large melanocytes (and more of them!). The KITLG locus does a lot of things; some of you may know its relationship to testicular cancer.

What Crawford et al. tells us is that there seems to have been recurrent and sometimes balancing selection around loci implicated in pigmentation for hundreds of thousands of years. What ancient DNA is telling us is that the genetic architectures we take for granted as typical across much of Eurasia are relatively novel. But, I think people are perhaps taking the implications of modern genetic architecture too far in predicting the variation of characteristics in the past. Even the best genomic predictors seem to account for only around half the variance in pigmentation. “Ancestry” accounts for the rest, which basically means there are many other loci which are not accounted for. It is not unreasonable to suppose that ancient northern Eurasian populations may have been light-skinned due to genetic variants which we are not aware of.

Of course, there are people at high latitudes who retain darker complexions. From what we know the Aboriginal people of Tasmania were isolated for about 10,000 years at the same latitude as Beijing and Barcelona, and yet their skin color remained dark brown. In contrast, Martin et al. report that Khoisan people who lived 10 degrees further north, in a much sunnier climate, were selected at loci that strongly correlate with lighter skin.

I think it is safe to say that in the near future we will close in on much of the remaining genetic factors accounting for variation in pigmentation in modern populations. It is polygenic, but almost certainly far less polygenic and more tractable than height or intelligence. But the story of why humans have varied so much over time, and why loci implicated in pigmentation are so often targets of selection in so many contexts, remains to be told.

India vs. China, genetically diverse vs. homogeneous

About 36% of the world’s population are citizens of the People’s Republic of China and the Republic of India. Including the other nations of South Asia (Pakistan, Bangladesh, etc.), 43% of the population lives in China and/or South Asia.

But, as David Reich mentions in Who We Are and How We Got Here, China is dominated by one ethnicity, the Han, while India is a constellation of ethnicities. And this is reflected in the genetics. The relative diversity of India stands in contrast to the homogeneity of China.

At the current time, the best research on population genetic variation within China is probably the preprint A comprehensive map of genetic variation in the world’s largest ethnic group – Han Chinese. The authors used low-coverage sequencing of over 10,000 women to get a huge sample size of variation all across China. The PCA analysis recapitulated earlier work. Genetic relatedness among the Han of China is geographically structured. The largest component of variance is north-south, but a smaller component is also east-west. The north-south element explains more than 4.5 times as much variance as the east-west.

What Neanderthals tell us about modern humans

In Who We Are and How We Got Here: Ancient DNA and the New Science of the Human Past David Reich spends a fair amount of time on Neanderthal admixture into modern human lineages. Reich details exactly how his team came to analyze the data that Svante Paabo’s group had produced, and how they replicated some peculiar patterns. In short, they eventually concluded that modern humans outside of Africa have Neanderthal ancestry, because the Neanderthal genome that Paabo’s group had recovered happened to be subtly, but distinctively, closer to all non-Africans than to Africans. At the time, the group reported that Neanderthal ancestry was relatively evenly spread across non-African populations, which led them to suggest that it was likely a singular admixture event early on during the expansion phase of modern humans.

Nearly a decade later, things have changed. There is a consistent pattern of West Eurasians having less Neanderthal ancestry than East Eurasians. That is, Europeans have lower Neanderthal ancestry fractions than Chinese (South Asians are in between, in direct proportion to their West Eurasian ancestral quantum). There have been a variety of arguments and explanations for why this might be, which fall into two classes:

  1. Neanderthal ancestry was purged more efficiently from West Eurasians due to larger effective population sizes (selection is stronger in large populations).
  2. There may have been multiple admixture events into modern humans, or gene flow into West Eurasians may have diluted their Neanderthal ancestry.
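
To make the first argument concrete, here is a minimal sketch of why selection is more efficient in large populations. It uses Kimura's standard diffusion approximation for the fixation probability of a new mutation, assuming census size equals effective size and additive selection. The selection coefficient and the two population sizes are hypothetical values picked for illustration; none of them come from the studies discussed here.

```python
import math

def fixation_probability(s, n_e):
    """Kimura's diffusion approximation for a new mutation in a diploid
    population, starting at frequency 1/(2N) and assuming N equals n_e.
    s is the selection coefficient (negative means deleterious).
    Intended for moderate values of 4 * n_e * |s| (math.exp can overflow
    for very large ones)."""
    if s == 0:
        return 1.0 / (2 * n_e)  # neutral expectation
    return (1 - math.exp(-2 * s)) / (1 - math.exp(-4 * n_e * s))

def relative_to_neutral(s, n_e):
    """Fixation probability as a multiple of the neutral value 1/(2N)."""
    return fixation_probability(s, n_e) * 2 * n_e

# Hypothetical values: one weakly deleterious allele, two population sizes.
S_DELETERIOUS = -1e-4
for n_e in (1_000, 10_000):
    ratio = relative_to_neutral(S_DELETERIOUS, n_e)
    print(f"Ne = {n_e}: fixation at {ratio:.3f} times the neutral rate")
```

The same weakly deleterious allele drifts almost like a neutral one in the small population, but is much more reliably purged in the large one.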

But what if all these arguments are mostly wrong? That’s what a new preprint seems to suggest: The limits of long-term selection against Neandertal introgression:

Several studies have suggested that introgressed Neandertal DNA was subjected to negative selection in modern humans due to deleterious alleles that had accumulated in the Neandertals after they split from the modern human lineage. A striking observation in support of this is an apparent monotonic decline in Neandertal ancestry observed in modern humans in Europe over the past 45 thousand years. Here we show that this apparent decline is an artifact caused by gene flow between West Eurasians and Africans, which is not taken into account by statistics previously used to estimate Neandertal ancestry. When applying a more robust statistic that takes advantage of two high-coverage Neandertal genomes, we find no evidence for a change in Neandertal ancestry in Western Europe over the past 45 thousand years. We use whole-genome simulations of selection and introgression to investigate a wide range of model parameters, and find that negative selection is not expected to cause a significant long-term decline in genome-wide Neandertal ancestry. Nevertheless, these models recapitulate previously observed signals of selection against Neandertal alleles, in particular a depletion of Neandertal ancestry in conserved genomic regions that are likely to be of functional importance. Thus, we find that negative selection against Neandertal ancestry has not played as strong a role in recent human evolution as had previously been assumed.

The basic argument in the preprint is that the model assumed for the ancestry of West Eurasians and Africans was wrong, and wrong assumptions can lead to wrong inferences. Using two Neanderthal genomes from different populations, one of which directly contributed to the Neanderthal ancestry in modern humans, the authors computed a new statistic that is insensitive to model assumptions about modern human phylogeny.

The older statistic assumed that West Eurasians and Africans were distinct clades with no gene flow between them in ~50,000 years. Using simulations, the authors argue that the best fit to the statistics they observe, both the earlier flawed one and the current more robust one, is a scenario in which a population of West Eurasian origin mixed with Africans starting ~20,000 years ago.

This explains why there appeared to be a consistent decline in Neanderthal ancestry: with continuous gene flow into Africa over the past 20,000 years, the earlier statistic’s model assumption got worse and worse over time, and so the statistic underestimated Neanderthal ancestry more and more.

Not everything that came before is wrong. It could still be that there were multiple admixtures. And the authors do agree that some selection against Neanderthal alleles has occurred. It’s just that it’s not the primary reason for the apparent decline of Neanderthal ancestry in West Eurasians.

As for the other explanation, that Neanderthal-less Basal Eurasian ancestry diluted the European hunter-gatherer fractions, the authors seem very skeptical of that. One point the authors make is that though an early European farmer was estimated to have ~40% Basal Eurasian ancestry, its Neanderthal estimate is still quite high. Iosif Lazaridis points out that this is an old estimate, and the Reich group now puts it closer to ~25%. Additionally, another recent preprint put the fraction closer to ~10%. With such low values, it is possible that Basal Eurasians had low Neanderthal fractions, but that this had only a marginal effect on the aggregate West Eurasian ancestry quantum from Neanderthals.
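
The dilution arithmetic is simple enough to sketch. The snippet below assumes the most extreme case, where Basal Eurasians carried no Neanderthal ancestry at all, and uses a hypothetical 2% Neanderthal fraction for the non-Basal ancestors (an illustrative number, not an estimate from the preprint). The Basal Eurasian shares are the ~40%, ~25%, and ~10% figures mentioned above.

```python
def diluted_neanderthal_fraction(source_fraction, basal_eurasian_share):
    """Neanderthal fraction left after mixing with a Basal Eurasian
    component that is assumed to carry zero Neanderthal ancestry."""
    if not 0.0 <= basal_eurasian_share <= 1.0:
        raise ValueError("basal_eurasian_share must be between 0 and 1")
    return source_fraction * (1.0 - basal_eurasian_share)

# Hypothetical starting value, for illustration only.
ASSUMED_SOURCE_FRACTION = 0.02

for share in (0.40, 0.25, 0.10):
    diluted = diluted_neanderthal_fraction(ASSUMED_SOURCE_FRACTION, share)
    print(f"Basal Eurasian share {share:.0%}: Neanderthal fraction {diluted:.2%}")
```

The proportional reduction is just the Basal Eurasian share itself, so at ~10% the effect is small even under this extreme assumption.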

I think the bigger thing to consider is that our understanding of the relationships of modern humans is roughly right, but there are lots of nuanced details we’re missing or misunderstanding. Ancient DNA from South Africa, for example, shows that modern Bushmen all seem to have exotic ancestry compared to samples from 2,000 years ago. But what about samples from 20,000 years ago?

We have the best temporal transect from Ice Age Europe, and in this region, there are many population turnovers and admixtures. It seems implausible that Europe is entirely exceptional. The West Eurasian gene flow event dated to ~20,000 years ago is curiously coincidental with the beginning of the recession of the Last Glacial Maximum. To get a better understanding of the relationships of Pleistocene people, looking at paleoclimate data is probably useful. The ancient DNA will come online at some point…and unless we think ahead, we’re going to be surprised.

Human genomics will uncover a lot of treasure in Southeast Asia


On this week’s podcast on “Isolated Populations” I mentioned offhand to Spencer that I believe it is a bit ridiculous to bracket a host of Southeast Asian populations as “Negritos,” as if they were an amorphous and homogeneous substratum over which the diversity of modern South and Southeast Asian agriculturalists was overlain. There was almost certainly a great deal of population structure which accrued over the Pleistocene. Another issue, which I didn’t mention, is that Southeast Asia is also very geographically expansive. Modern Indonesia alone spans the length of North America.

Of course, you could say the same for Europe, from the Urals to the Atlantic. And yet we know that European hunter-gatherers were relatively homogeneous (albeit, with some structure!) at the beginning of the Holocene. I think the difference though is that Europe was a landscape into which hunter-gatherers expanded after the Last Glacial Maximum, while Southeast Asia, like Africa, has long been a refuge for human populations even during the coldest and driest periods of the Pleistocene.

There are three major classes of “Negrito” peoples in South and Southeast Asia. To the west are the indigenous peoples of the Andaman Islands. These tribes probably arrived from what is today Myanmar during the Pleistocene, when sea levels were lower. In peninsular Malaysia you have groups such as the Semang. Though physically very different from their neighbors, these people speak the Aslian branch of the Austro-Asiatic languages. They are not linguistic isolates like the Andaman tribes.

This speaks to the reality that unlike the Andaman Islanders the Negritos of mainland Southeast Asia have long been interacting with local populations. The languages they speak reflect interactions with Austro-Asiatic rice farmers. Curiously though, the dominant people amongst whom they live no longer speak Austro-Asiatic languages. Rather, they speak Austronesian or Tai dialects. These two groups are later arrivals on the Southeast Asian scene, and both seem to have assimilated Austro-Asiatic groups culturally and genetically, except in Cambodia and Vietnam (and to a lesser extent in pockets of Thailand and Myanmar).

If you are curious about the relationship between the various modern Southeast Asian groups, then two ancient DNA papers, Ancient Genomics Reveals Four Prehistoric Migration Waves into Southeast Asia and Ancient genomes document multiple waves of migration in Southeast Asian prehistory, should do the trick. Some of the migrations are historically or semi-historically attested. These include the intrusion of the Tai, the long occupation of what became Vietnam by the Chinese along with the settlement of Han officials amongst the local people, and the migrations of the ancestors of the Hmong into Laos.

Other processes are vaguer and poorly understood. It has long been clear that the Austronesians probably assimilated Austro-Asiatic rice farmers in much of maritime Southeast Asia. And yet, unlike in mainland Southeast Asia, to my knowledge there are no Austro-Asiatic populations in Indonesia. Additionally, it has been brought to my attention that the ~3,000-year-old sample from Myanmar has no clear Austro-Asiatic signature, despite the common sense suggestion that Austro-Asiatic languages must have entered India via that region (it has affinities to modern Tibeto-Burman individuals). And, importantly, the Austro-Asiatic populations themselves seem to have been deeply mixed between a dominant element strongly related to the Han Chinese, and a minority component which was basal Southeast Asian, for lack of a better term. This means that the Munda populations within India have several distinct components of ancient South and Southeast Asian substratum.

Aeta family

But speaking of this substratum, probably the best paper recently focusing on these groups is from last year, Discerning the Origins of the Negritos, First Sundaland People: Deep Divergence and Archaic Admixture. In many ways, it just reinforced the results of Reich et al. 2011. All the Negrito groups are only distantly related to each other. The Negritos of the Andaman Islands and those of peninsular Malaysia seem to be somewhat closer to each other than either is to those of the Philippines. And the groups in the Philippines seem to be somewhat closer to the peoples of Melanesia. To some extent, this is just geographically expected, but there are also interesting details.

The Negritos of the Philippines, in particular those from the northern island of Luzon, have some of the highest fractions of Denisovan ancestry of any human populations outside of Melanesia. No one is clear whether the admixture is from the same event as the one that led to the high fractions in Melanesians, or whether there were separate mixing events (not implausible). The western Negrito groups have far lower fractions of Denisovan ancestry.

Another surprising result is that the Negritos of the southern Philippines seem very distinct from those of the northern Philippines. This may be an artifact of particular admixture history, but I wouldn’t be surprised if these islands preserved a lot of diversity which has been homogenized elsewhere.

Like many people, I believe that human evolutionary genomics will have a lot to say about Africa in the next 10 years. But outside of Africa, Southeast Asia may be one of the most fertile regions in terms of exposing deep history. This was an area that was always amenable to habitation by modern-like Africans. First, it seems very likely now that the predominant modern human ancestry found in the Negrito substratum, and shared with all other non-Africans, is actually not the signal of the oldest modern humans to be present in Southeast Asia. Second, there seem to be many archaic human species which made their homes in Southeast Asia.

Humans arrived in Southeast Asia a long time ago. Our speciosity and census sizes were high. With more ancient DNA and better deep whole genome sequence analysis, we’ll uncover some surprising things. I guarantee.