The genetic risk of prostate cancer is probably higher in people of West African descent

When David Reich’s op-ed came out some discussion ensued about his focus on prostate cancer risk in African Americans. This is the research which put Reich on my personal radar (if you care, start with this 2006 paper, Admixture mapping identifies 8q24 as a prostate cancer risk locus in African-American men). I had a back-and-forth with Debbie Kennett about whether this was a robust result. To be honest I hadn’t followed the research closely because 1) my own risk of dying of prostate cancer is probably pretty low knowing what people in my extended pedigree tend to die from 2) I’m not terribly interested in disease genetics unless they have a strong evolutionary genomic implication.

Doing some cursory literature searches suggested that Reich was right to include that example in the book and the op-ed because there had been follow-up work that verified the initial result. I had told myself that perhaps I’d follow up on this at a later point. After reading Laura Hercher’s rather patronizing take on David’s op-ed I decided that now is as good a time as any.

Looking around I found a very recent paper which hits the spot. Genetic hitchhiking and population bottlenecks contribute to prostate cancer disparities in men of African descent (it’s in Cancer Research). It came out in February 2018, so it should be up to date on the literature, and there is an evolutionary angle here (I am friendly with the first author and respect his work overall).

The paper is open access so I recommend you read it. But here’s the high level:

  1. They had access to Sarah Tishkoff’s huge data set of African populations, as well as 1000 Genomes, to produce a combined panel with 1 million markers and 64 populations (38 African).
  2. Then, they focused on the hits in the literature for prostate cancer SNPs, which they called CaP susceptibility loci. 68 SNPs with high confidence (they looked for p-values of 10^-5 or less).

So they have the data set with populations and allele frequencies, and a subset of markers that they want to interrogate (no imputation here, they had all the SNPs). They developed a statistic, Genetic Disparity Contribution (GDC), to evaluate the impact of SNP differences across populations in terms of CaP risk (that is, prostate cancer risk).

First, they need to look at a SNP in a particular population:

The subscripts index the SNP, the individual, and the population (k). The SNP here is the “risk allele” (remember, they come in two forms). The factor of 2 multiplies the frequency of the risk allele, since each person carries two copies of the locus. OR is basically the odds ratio of developing prostate cancer for a given SNP.

Now, the GDC:

A = African and N = non-African. You are just using the frequencies within the populations of interest for the given SNP. You can compare different populations presumably.

Finally, the individual Genetic Risk Score (GRS):

The score for an individual in population k is the sum of the per-SNP risk terms across all 68 markers. If the individual has no “risk alleles” (those that increase odds of developing prostate cancer), then their GRS = 0.
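To make the three quantities concrete, here is a minimal Python sketch. It assumes the common log-additive form, in which a SNP contributes the number of risk alleles carried times ln(OR). The exact expressions in the paper may differ, and the function names, allele frequencies and odds ratios below are made-up toy values, not the paper's data.

```python
import math

def mean_snp_risk(freq, odds_ratio):
    """Expected contribution of one SNP in one population: 2 * p * ln(OR).
    Assumed log-additive form, not necessarily the paper's exact definition."""
    return 2.0 * freq * math.log(odds_ratio)

def gdc(freq_african, freq_non_african, odds_ratio):
    """Disparity contribution of one SNP: African mean risk minus non-African mean risk."""
    return (mean_snp_risk(freq_african, odds_ratio)
            - mean_snp_risk(freq_non_african, odds_ratio))

def individual_grs(risk_allele_counts, odds_ratios):
    """Individual score: sum over SNPs of (0, 1 or 2 risk alleles) * ln(OR)."""
    return sum(c * math.log(o) for c, o in zip(risk_allele_counts, odds_ratios))

# Toy values for three hypothetical SNPs (NOT real data).
odds_ratios = [1.20, 1.10, 1.35]
freq_a = [0.60, 0.30, 0.10]   # risk allele frequency, African sample
freq_n = [0.40, 0.35, 0.02]   # risk allele frequency, non-African sample

gdc_per_snp = [gdc(a, n, o) for a, n, o in zip(freq_a, freq_n, odds_ratios)]
no_risk = individual_grs([0, 0, 0], odds_ratios)   # carries no risk alleles
carrier = individual_grs([2, 1, 0], odds_ratios)   # carries three risk alleles

print(gdc_per_snp, no_risk, carrier)
```

A positive value in `gdc_per_snp` means that SNP pushes risk up in the African sample relative to the non-African one, and a negative value means the reverse. Someone with no risk alleles scores exactly zero, matching the description above.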

As I stated above I don’t know much about prostate cancer. Honestly, I should take more of an interest, since it seems to run on my sons’ maternal side, so they are at risk (I know I am at risk, but people in my family tend to die of heart issues rather than cancer). The heritability for this cancer is 0.42-0.58. This is not trivial. The authors state that “CaP has the highest familial risks of any major cancer.” I certainly did not know that.

Combining their population-wide data set and the knowledge of risks from GWAS on CaP risk SNPs, they generated the plot to the left which shows you each population’s mean GRS. They confirm earlier work which suggests that African populations are at more risk than non-African populations and that West African populations are at more risk than East African populations. The authors observe that some African populations do have low risks even on the global scale. But on the whole the rank here is:

West African > East African > South Asian > European > East Asian.

They used ADMIXTURE to confirm the obvious correlations; the more West African ancestry in an individual the higher the GRS. The highest non-African population are Puerto Ricans, who have substantial West African admixture.

But one thing to remember here is that some of these African populations are quite distinct. For example, though West African populations have the highest risks, the Hadza and the Baka have high risks as well, and these hunter-gatherers are very diverged from other Africans. In fact, we know from ancient DNA that modern African populations are fusions of extremely distinct groups whose divergence may go well north of 200,000 years ago.

The pattern of risk seemed a bit strange to me outside of Africa. On the genome-wide scale, South Asians are between Europeans and East Asians, with a slight bias if any toward Europeans. This is because half the ancestry of South Asians is closely related to that which contributed to Europeans, and half is distantly related to the ancestry of East Asians. This can easily explain why their archaic admixture fractions are between these two groups. And yet the average GRS makes it clear that they seem higher than these two populations.

Lachance et al. do the standard genetic calculations of risk, and perform some exploratory analysis of the population structure in their data (since they curated this from well-known sources this wasn’t necessary for outlier removal as much as the regression that they ran of GRS on ancestry fractions). But they didn’t delve deeply into the demographic history that I allude to above. Rather, what they did focus on were signals of selection in regions of the genome that these risk markers were embedded in.

They seem to come to two general conclusions:

  1. Selection through the side-effect of hitch-hiking does seem to drive some of the African vs. non-African divergences.
  2. Much of the difference can probably be due to specifics of drift in non-African populations in the “out of Africa” event, and there isn’t evidence of polygenic selection across the 68 loci in the aggregate.

The latter seems unsurprising because prostate cancer hits late in life. As a trait, it is not what you are going to be selecting against in a pre-modern world (anyway, grandmothers, not grandfathers, seem to increase descendant fitness the most in ethnographic work). Additionally, the authors say that “risk allele frequencies tend to be higher in Africa when risk alleles are ancestral, and risk allele frequencies tend to be higher in non-African populations when risk alleles are derived.” Ancestral/derived here relates to new mutations (the latter). We know that the “out of Africa” bottleneck resulted in the extinction of some ancestral variation, presumably including ancestral risk alleles.

The former, in regards to linked selection, is also not surprising. As non-Africans spread across the world they developed new local adaptations, and some allele frequencies shifted from the African ancestors. But not all. And that I think explains why South Asians have a higher risk than Europeans and East Asians. The authors observe several protective (lower risk) alleles rose in frequency due to being in a region where there was selection for lighter pigmentation. Pigmentation is one trait which is highly heritable where some non-Africans (South Asians, Oceanians) are often more like African populations than other Eurasian groups. If high-risk CaP alleles were somehow associated with ancestral pigmentation alleles, then it makes sense that South Asians have a higher risk, since they are more ancestral on these loci than other Eurasians.

Finally, there is the question of how applicable these GWAS are to diverse populations. These markers were discovered in mostly European panels, so there is the standard ascertainment bias. Though the authors do say that “The International Agency for Research on Cancer GLOBOCAN program estimates that CaP has the highest incidence of any tumor site in African-American, Caribbean, and African men.” That is, African men, just like men of the Diaspora, are at higher risk. And remember, the association with African ancestry emerged in African American men, with those with elevated African ancestry in a particular region of the genome being at higher risk. It wasn’t a naive observation of higher rates of CaP in African Americans.

Because the OR can vary between populations, the authors ran their analysis by equalizing the OR and also by using the literature value of OR at a marker population by population. They found the broad disparity held. Subsampling the markers also maintained the rank order in broad geographic terms. Finally, the authors observe that because of the bias in the discovery of European risk variants, there are probably African risk variants that are not in their marker set which result in an underestimate of the GRS.

What is the upshot of all of this? The less important one is that David Reich used the example of prostate cancer to open his discussion about population structure because it’s probably a robust result (and also, in the book he makes clear a lot of sociologists and anthropologists did not appreciate the correlation between disease and ancestry that seemed due to biology). The balance of the evidence points to the likelihood that men with African ancestry, in particular, but not exclusively, of West African ancestry, have somewhat higher risks all things equal of developing prostate cancer. As the authors note the risks overlap quite a bit between populations. A substantial number of men of European ancestry have a higher GRS for CaP than those of African ancestry. There are two classes of alleles driving this risk. One class has high-frequency differences between populations, and another class has a large impact on odds ratios (so small differences still matter).

The figure to the right shows that there is a strong correlation between predicted genetic risk score and the real death rate from prostate cancer. I’m a little confused though here about the relationship between the training set and the population one is predicting on. Presumably, the GWAS come from these populations based on medical research, which is the same body of literature collecting the death rates. But the interesting thing here is that East Asians, Europeans & Latin Americans, and Diaspora Africans, are all distinct clusters in both mortality and GRS.

Since the heritability is not high, but only moderate, and even this correlation is imperfect, one can still argue that the disparity is attributable to environment. But to be honest the South Asian prediction along with the relationship to pigmentation regions indicates to me that the GRS is capturing something real in population differences due to a combination of demographic history and natural selection.

Moving on from CaP, these academic debates about whether disparities are driven by genes, environment or both (or an interaction), miss the bigger picture that due to the contingencies of history different populations probably have different risks in late-in-life diseases. The South Asian risk for cardiac and metabolic illnesses is so extreme that I think most people won’t deny that that is a real thing (in particular since there is variation within South Asia for this judging by British medical data).

Open Thread, 4/17/2018

Almost done with She Has Her Mother’s Laugh: The Powers, Perversions, and Potential of Heredity. To be honest I’m a little relieved that there wasn’t that much focus on the “perversions” of heredity. Lots of interesting stuff. This is definitely a book that scientists and lay people could benefit from.

Carl is a great writer so he makes rather abstruse concepts clear and engaging to nonspecialists. As for those of us who have our noses close to the ground, we sometimes lose the bigger perspective. There is a lot of interesting research that he surfaces in She Has Her Mother’s Laugh that I wasn’t very familiar with, though I had probably read about it or seen it in one of his columns (or Ed Yong’s).

Met a lot of cool people, and touched base with others who I knew ahead of time, at the AAPA 2018. Compared to ASHG or even SMBE the conference was very white. I guess that’s why there were all the diversity sessions?

Lee & I

I had a lot of discussions with Lee Berger about science on a broad philosophical level. Unfortunately, specialization is such that it can be hard to communicate across disciplines such as human genomics and paleoanthropology. But as Lee brings enough samples into the open to do some real statistics I think that will change how constrained to the elect paleoanthropological knowledge is.


Lee’s son introduced me to the concept of South African barbecue. I haven’t had any yet, but I’m curious about it.

Lee will be on this week’s episode of The Insight. Again, please subscribe on iTunes, Stitcher, Google Play. The last episode with Stuart Ritchie was our most successful yet in terms of traffic. We’re suspecting that Lee’s episode will do quite well as well. People keep finding the podcast by chance. We really need reviews to get featured by iTunes!

Spencer and I will probably shift back to a two-person conversation next week. We should probably do an AMA again soon.

Was There a Civilization On Earth Before Humans? Very interesting piece, especially for those of us who have read science fiction. But my issue is straightforward: humans have scrambled biogeography so much in such a short amount of time. I think any other industrial species would have done the same. Even after they went extinct, the phylogeographic chaos they wrought would remain.

It seems very likely that all Australian marsupials descend from one South American ancestor species. The explosive emergence of very different placentals all across Australia simultaneously in the fossil record would be quite suspicious (or red deer descendants in New Zealand).

I spent a fair amount of time during the AAPA meeting with people who were associated in some way with the Reich lab. I also talked to a few friends about what they thought about David’s op-ed and book. It’s no surprise that there are legitimate human population geneticists considering writing a response of some sort. It’s also no surprise that even critics of David within the population genetics community think that the Buzzfeed op-ed was so bad that it makes it harder for them to say something, as the water has been muddied.

In some ways the reaction has made one of David’s major points: population geneticists need to offer their unvarnished opinions, rather than cosigning people in other fields who mangle their findings.

Some people feel that David “threw me under the bus” in his now infamous chapter. I don’t see it that way.

As many of you know (if you subscribe to my total content feed you know) I have a few other blogs, one of them Brown Pundits. It actually receives substantial traffic from India now. It will be “interesting” to say the least.

A population genetic interpretation of GWAS findings for human quantitative traits. Stuck in the weeds of ancient DNA these past few years I haven’t been paying attention to the storm of GWAS and PRS approaching.

Signatures of negative selection in the genetic architecture of human complex traits.

What did modern humans look like during the “Out of Africa” event?

Recently I was having an email exchange with a friend (a prominent public intellectual who is not a scientist), and we were thinking about what “ancestral Africans” looked like. More precisely, the populations which were resident around ~100,000 to ~200,000 years before the present. These are the people who are depicted in paleoanthropology documentaries. Here were some of my major contentions:

1) We don’t know what they looked like
2) They probably were more likely to look like modern Africans than non-Africans
3) But modern Africans are diverse in their looks and we could expect that ancient Africans were too

The neighbor-joining tree above is generated with a naive model of successive bifurcation.

1) Khoisan split off 200,000 years ago
2) Mbuti split off 150,000 years ago
3) Mende split off 100,000 years ago
4) Japanese about 50,000 years ago
5) While Pathan and Basque only 15,000 years ago

The model is wrong in the details. Pathan and Basque have some ancestry which is recently diverged, and much that is deeply diverged. The 15,000 year value is just an average. Similarly, the Khoisan have some Eurasian ancestry. But in the broad sketch it illustrates that some African populations diverged a very long time ago from other groups.
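The bookkeeping behind that naive model can be sketched in a few lines of Python. This is only an illustration of successive bifurcation using the split times listed above. It is not how the neighbor-joining tree was computed, and the function and variable names are my own.

```python
# Split times in years before present, taken from the naive model above.
# Pathan and Basque share one entry each for the time of their mutual split.
split_time = {
    "Khoisan": 200_000,
    "Mbuti": 150_000,
    "Mende": 100_000,
    "Japanese": 50_000,
    "Pathan": 15_000,
    "Basque": 15_000,
}

def divergence_time(a, b):
    """Under successive bifurcation, two lineages diverged when the
    earlier-branching of the two split off from the rest."""
    if a == b:
        return 0
    return max(split_time[a], split_time[b])

populations = list(split_time)
matrix = [[divergence_time(a, b) for b in populations] for a in populations]

# Print the pairwise divergence times in thousands of years.
for name, row in zip(populations, matrix):
    print(f"{name:>8}", " ".join(f"{t // 1000:>4}" for t in row))
```

Every comparison involving the Khoisan comes out at 200,000 years, whether the partner is Mbuti or Japanese, which is the sense in which some African populations are as diverged from other Africans as they are from non-Africans in this toy model.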

Ancient Africans date to ~200,000 years before the present for all the modern populations. Khoisan to Japanese. You could probably use phylogenetic character reconstruction methods to attempt to infer what ancient Africans looked like…but I’m not sure that it would be useful since modern humans have spread over so many ecologies over such a short span of time.

Outside of Sub-Saharan Africa perhaps on the order of 95% of the ancestry derives from an expansion from a small founder group between 60 and 80 thousand years ago. Removing the “Basal Eurasian” component, groups as diverse as Native Americans, Oceanians and East Asians probably derive their ancestry from a common group which flourished between 50 and 60 thousand years ago (this pulse is the majority of the ancestry of Europeans and South and West Asians as well).

The point here is to illustrate that 50,000 years is definitely sufficient for a great deal of diversity to have emerged in human physical variation. And yet the Khoisan are ~200,000 years diverged from their ancestors within Africa. We actually know that indigenous southern Africans have been selected for lighter pigmentation. We also know that loci associated with pigmentation in modern humans exhibit a lot of variation in Africans, and this variation is likely an ancestral feature of our species.

In sum, the number of generations between ancestral Africans and all modern descendant populations is great enough that I’m not certain that we can predict what they looked like in anything except their skeletal features. Additionally, most of the history of anatomically modern humans was likely highly structured within Africa. That’s another way of saying that ancient Africans themselves were probably physically diverse.

With all that being said, all things equal ancient Africans probably are more likely to look like modern Africans than modern non-Africans. The main reason is simply that modern Africans occupy the same broad ecological landscape as ancient Africans, and many of our features, from our build to our complexion, seem dependent upon environmental pressures. There’s a lot of evidence that very light skin is probably a derived characteristic of our species (there are consistent signatures of sweeps around pigmentation loci). And, there is also evidence that some of the archaic introgression into non-Africans may have consequences in our morphology and external physical characteristics. For example, Eurasians seem to have very high frequencies of Neanderthal variants of the keratin gene. This is implicated in hair, skin and nail development.

Addendum: Note that even if we have ancient genomes, polygenic characteristics are still hard to predict. Even today common SNPs only explain a minority of the variation in hair color in Europeans.

Personal genomics lives!

Reflecting back on it I think I started “exploring personal genomics” in the late 2000s. That’s when direct-to-consumer testing started to become popular, albeit very niche. The book Exploring Personal Genomics is now 5 years old, and a lot has changed since then. In the same year, 2013, David Mittelman and I cowrote Rumors of the death of consumer genomics are greatly exaggerated in Genome Biology.

Now Science has a commentary out, Crowdsourced genealogies and genomes, which reviews how large amounts of public data, genetic and classical genealogical, are being used to change the field before our very eyes. I would recommend though that you read the less edited (longer, more detailed) version on the website of the authors, Crowdsourcing big data research on human history and health: from genealogies to genomes and back again.

This fact from that piece is really illustrative of what’s happening today:

As the number of customers of whole-genome DTC genetic testing just crossed 16 million, it is worth noting that almost two-thirds of them joined since the beginning of 2017 [19]. Based on current rates, this number of customers is predicted to be close to 100 million by end of 2020.
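A quick back-of-envelope check of what those figures imply. The customer counts come from the quote. The one thing that does not is the elapsed time: the sketch assumes the 16 million mark was crossed around April 2018, about 2.75 years before the end of 2020. Shift that assumption and the doubling time shifts with it.

```python
import math

current_customers = 16e6          # from the quote
target_customers = 100e6          # projected for end of 2020, from the quote
years_to_target = 2.75            # ASSUMPTION: ~April 2018 to end of 2020

# "Almost two-thirds" joined since the start of 2017; treat it as exactly 2/3.
joined_since_2017 = current_customers * 2 / 3
joined_before_2017 = current_customers - joined_since_2017

growth_factor = target_customers / current_customers   # 6.25x
doublings = math.log2(growth_factor)                    # about 2.6 doublings
doubling_time_years = years_to_target / doublings       # about one year

print(f"joined before 2017: {joined_before_2017 / 1e6:.1f} million")
print(f"growth needed: {growth_factor:.2f}x, i.e. {doublings:.2f} doublings")
print(f"implied doubling time: {doubling_time_years:.2f} years")
```

In other words, the projection amounts to the customer base doubling roughly once a year, which is slower than the pace implied by two-thirds of all customers having joined in a little over a year.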

Rainforest hunter-gatherers are not primitive or primal

Recently I had a discussion with a friend that I suspect the “tropical pygmy” phenotype you see in Central Africa and Southeast Asia is a pretty recent development. So this sort of assertion, “The Sentinelese tribe have remained on their North Sentinel Island, almost completely uncontacted for nearly 60,000 years…” is probably wrong. First, the Sentinelese probably arrived with other Andaman peoples during the Pleistocene from mainland Southeast Asia when the archipelago may have been connected to the mainland due to low sea levels.

Second, the small size of many tropical hunter-gatherer populations may simply be due to the difficulty of surviving in this environment. Though rainforests are lush, humans can’t access a lot of it, and small animals tend to require more energy to catch than is justified by how much meat they provide.

Genomics is now on the case: Polygenic adaptation and convergent evolution across both growth and cardiac genetic pathways in African and Asian rainforest hunter-gatherers:

Different human populations facing similar environmental challenges have sometimes evolved convergent biological adaptations, for example hypoxia resistance at high altitudes and depigmented skin in northern latitudes on separate continents. The pygmy phenotype (small adult body size), a characteristic of hunter-gatherer populations inhabiting both African and Asian tropical rainforests, is often highlighted as another case of convergent adaptation in humans. However, the degree to which phenotypic convergence in this polygenic trait is due to convergent vs. population-specific genetic changes is unknown. To address this question, we analyzed high-coverage sequence data from the protein-coding portion of the genomes (exomes) of two pairs of populations, Batwa rainforest hunter-gatherers and neighboring Bakiga agriculturalists from Uganda, and Andamanese rainforest hunter-gatherers (Jarawa and Onge) and Brahmin agriculturalists from India. We observed signatures of convergent positive selection between the Batwa and Andamanese rainforest hunter-gatherers across the set of genes with annotated ‘growth factor binding’ functions (p<0.001). Unexpectedly, for the rainforest groups we also observed convergent and population-specific signatures of positive selection in pathways related to cardiac development (e.g. 'cardiac muscle tissue development'; p=0.003). We hypothesize that the growth hormone sub-responsiveness likely underlying the pygmy phenotype may have led to compensatory changes in cardiac pathways, in which this hormone also plays an essential role. Importantly, we did not observe similar patterns of positive selection on sets of genes associated with either growth or cardiac development in the agriculturalist populations, indicating that our results most likely reflect a history of convergent adaptation to the similar ecology of rainforest hunter-gatherers rather than a more common or general evolutionary pattern for human populations.

A minor note: there is some ethnographic data that the isolated Sentinelese are not as small as the other Andaman Islanders. Some of their small size may simply be due to exposure to diseases and the stress of settlers from the mainland.

The Insight, episode 17: Stuart Ritchie, intelligence and genes

On this week’s episode of The Insight (Stitcher and Google Play) we talk to Stuart Ritchie, a postdoc in Ian Deary’s lab, about recent developments in cognition and genomics. There’s a reason that Deary gets some time in She Has Her Mother’s Laugh; his group is publishing some really interesting work.

Before we get to the good stuff, Stuart gives us a quick review of general intelligence and why it matters. If you want a book-length treatment then his own book should suffice, Intelligence: All That Matters. Richard Haier’s The Neuroscience of Intelligence goes a little more into the “wet biology” aspect of the brain if that is more your style.

There are two reasons I wanted us to have Stuart on the podcast.

First, psychometrics is not a field which was hit by the replication crisis. It’s a pretty robust and reliable discipline. Companies such as the Educational Testing Service (ETS) rely on the predictive power of the constructs in the field to sell their products. And yet most well-educated people don’t really know much about intelligence testing except that it has been “debunked” by the Mismeasure of Man.

Because people don’t understand the history of intelligence testing (i.e., it enabled the meritocracy by removing the importance of “polish” and “good breeding”) it’s easy for American graduate schools to do things like removing the GRE as a criterion on admissions. Privately some academics have told me that this will mostly result in increasing the importance of undergraduate education and pedigree (because anti-GRE sentiment has become connected to “social justice” I think its removal is a fait accompli).

Second, the field of cognitive genomics is moving through a major turning point. A publication like this in January, A combined analysis of genetically correlated traits identifies 187 loci and a role for neurogenesis and myelination in intelligence, is going to be superseded in months. I’m not speculating. I know this as a fact, and so do many others. Where will we be in two years?

Ray Kurzweil has many ideas, some of them interesting, some kooky, and some of them wrong. But one idea he’s promoted which I think is correct is humans are not good at modeling exponential rates of growth. The field of psychometric genomics is now moving into the steep phase of ascent, as sample sizes go well above 1 million, and some researchers shift from proxy characteristics such as education and delve into raw intelligence test scores. Most people “outside of the know” are about to smash into the concrete before they even know it’s coming up at them….

Notes from the personal genomic inflection point

There’s a debate that periodically crops up online about the utility, viability, and morality of returning results from genetic tests to consumers. Consumers here means people like you or me. Pretty much everyone.

If you want to caricature two stylized camps, there are information maximalists who proclaim a utopia now, where people can find out so much about themselves through their genome. And then there are information elitists, who emphasize that the public can’t handle the truth. Or, more accurately, that throwing information without context and interpretation from someone who knows better is not just useless, it’s dangerous.

Of course, most people will stake out more nuanced complex positions. That’s not the point. Here is my bottom-line, which I’ve probably held since about ~2010:

  1. The value for most people in actionable information in direct-to-consumer genetics is probably not there yet when set against the cost.
  2. With the reduction in the cost of genotyping and sequencing, there’s no way that we have enough trained professionals to handle the surfeit of information. And there will really be no way in 10 years when a large proportion of the American population will be sequenced.

At some point, the cost will come down enough, and the science probably is strong enough, that direct-to-consumer genetics moves away from novelty and early adopters to the mass market. At that point, we need to be able to make the best use of that data. Genetic counselors, geneticists, and doctors all cost a fair amount of money and have a finite amount of labor supply to provide to the public. They need to focus on serious, complex, and consequential cases.

To some extent, we need to reduce much of interpretation in the personal genomics space to an information technology problem. For example, if someone’s genotype pulls out a bunch of statistically significant hits of interest the tool should automatically condition significance on that individual’s genetic background.

Yes, there are primitive forms of these sorts of tools out there already. But they’re not good enough. And that’s because there isn’t the market need. But there will be.

European hunter-gatherers were mostly replaced, but not totally. And they were neither black nor white

Peter Frost over at his blog has a long post on the transition to agriculture and pastoralism in Northern Europe.

He tagged me on Twitter, so presumably, he’s soliciting my opinion/response.

The post starts off with a quick reference to the attempt to leverage massive replacement in Northern Europe eight to four thousand years ago in the interests of contemporary politics. I’m not going to address that because I’m not very interested in how these topics relate, and I won’t post comments (or will delete) that engage with that. I will focus on the science.

First, I tried to leave a comment on his weblog and blogger ate it. So I’m just going to put a post here in the interests of open exchange. I also think many readers here have some of the same opinions as Peter, or suspicions, so it might be best to clear things up.

I don’t think Peter’s argument can really be understood without reading his 2006 paper, European hair and eye color: A case of frequency-dependent sexual selection? My opinion in regards to this hypothesis is that it’s probably wrong. I am more skeptical than I was when I first read the paper because we have more understanding of the process of the settlement of Europe during the late Pleistocene and early Holocene. But, there is still a small window for it to be correct, as one can see in Peter’s post.

The argument hinges a lot on the pigmentation profiles of proto-European groups based on predictions from algorithms which use modern Europeans as a training set. These predictions are in the papers themselves, so Peter isn’t doing anything that the authors didn’t do. But, I have come to the conclusion that they’re probably not trustworthy. These ancient populations were very different from modern Europeans, and their genetic architecture for pigmentation may have been different (modern Europeans are a compound of several groups).

Though Mesolithic Western European hunter-gatherers were probably darker in complexion than modern Europeans, I believe it is likely that they were not nearly as dark as pigmentation prediction algorithms suggest. Second, it is true that alleles correlated with blonde hair in Europeans within the KITLG locus are found in Siberia nearly 20,000 years ago. But it is not true that “Ancient DNA from Afontova Gora has shown that people had blond hair in mid-Siberia as early as 18,000 years ago.”

What has been found is that Europeans who carry the derived variant at rs12821256 are more likely to have blonde hair. Those who are heterozygous are twice as likely, while homozygotes are four times as likely. At least against the population base rate. The frequency in Scandinavia of the derived variant is ~20%. Many blonde people don’t have the derived variant. And, not all people who have the derived variant have blonde hair.
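A toy calculation shows why a two-fold or four-fold increase is still far from deterministic. It assumes Hardy-Weinberg proportions at a 20% derived-allele frequency, and it reads “twice” and “four times” as multipliers on the population base rate of blonde hair. Both are simplifying assumptions, and the real numbers will differ.

```python
q = 0.20  # derived allele frequency (the ~20% Scandinavian figure)

# Hardy-Weinberg genotype frequencies, keyed by number of derived copies.
geno_freq = {0: (1 - q) ** 2, 1: 2 * q * (1 - q), 2: q ** 2}

# ASSUMED reading: probability of blonde hair relative to the base rate.
rel_to_base = {1: 2.0, 2: 4.0}

# Share of all blonde people who carry one or two derived copies.
# The base rate itself cancels out of this ratio.
carrier_share = sum(geno_freq[g] * rel_to_base[g] for g in (1, 2))

# Whatever is left must be blonde people with no derived copy at all.
non_carrier_share = 1.0 - carrier_share

# Non-carriers' chance of blonde hair, again relative to the base rate.
rel_non_carrier = non_carrier_share / geno_freq[0]

print(f"genotype frequencies: {geno_freq}")
print(f"blondes with no derived copy: {non_carrier_share:.0%}")
print(f"non-carrier probability relative to base rate: {rel_non_carrier:.2f}")
```

Under these assumptions about one in five blonde people carries no derived copy, and heterozygotes only reach certainty if the base rate were 50%, which shows how much depends on the base rate and on alleles at other loci.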

Of my three children two are heterozygotes for the derived variant (they carry one copy). Probably not coincidentally these two have lighter hair than the third. But neither is really blonde, though perhaps they are blond(ish) during certain times of the year. More accurately their hair is probably sandy brown. Why? I’m their father, and as a normally complected South Asian, I give them a host of alleles at other loci which make them different from the typical European genetic architecture of pigmentation.

As I said earlier Peter can’t really be blamed for making these inferences because they are in the scientific literature themselves. But just because they’re there doesn’t make them true (though I do think Peter should be careful about extrapolating from odds ratios against a particular base rate probability to some deterministic relationship).

A final issue is the idea that the alleles that define modern Northern European pigmentation were present in Scandinavian and Eastern European hunter-gatherers. This is correct. But again, modern prediction algorithms are trained on groups with modern genetic backgrounds. In mixed populations, the largest effect QTLs explain only half the variance in pigmentation. The rest of it is accounted for by “genomic ancestry”, which basically means there are loci associated with ancestral groups that haven’t been discovered yet. But a second and more important issue is that the frequency of some of the alleles in modern Northern European groups is different from what you find in the ancient ones. The ancestral variant on SLC24A5 is almost impossible to find in Northern Europe in indigenous people today (in Europeans the ancestral variant is most often found in Spain, due to admixture with Africa during the Moorish period). I don’t need to review the literature, but there is evidence for a fair amount of selection on these loci within the last 4,000 years. Even SHG and EHG still segregated ancestral variants at higher frequencies than modern Europeans.

The second major theme in the blog post has to do with hunter-gatherer ancestry. There’s a section on haplogroup U where Peter suggests that its disappearance is due to selection, not a replacement. U is associated with hunter-gatherer ancestry. This may be true, but mtDNA and Y need to be interpreted cautiously in any case (both R1b and R1a are far more common than one might predict from autosomal distributions of the ancestry of populations in which they were originally found).

Then there is the argument that bottlenecks/founder effects and natural selection might have skewed our estimates. I don’t really get the former argument at all:

Founder effects may be another causal factor. When bands of hunter-gatherers are given the opportunity to adopt farming, most of them turn up their noses and only a few will make the change. Because those few bands are not perfectly representative of the hunter-gatherer gene pool, and because their numbers may increase many times over (thanks to the increase in food supply) the resulting founder effects will be substantial.

These are verbal models, and unpersuasive to anyone who has looked at the data and generated results. Mesolithic hunter-gatherers were a genetically homogeneous lot to begin with. They didn’t have all this variance to sample from. There was a later increase in hunter-gatherer ancestry into European farmers from demographic reservoirs, but the argument about founder effect doesn’t work because the two groups are so different that playing around with biasing the sample from which one mixes does not change the overall result. Replace hunter-gatherer and farmer with “Ashkenazi Jew” and “Chinese.” The latter two groups have some variance, but a bottleneck on one isn’t going to change one’s estimate of admixture in a daughter population.
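A toy simulation shows why the founder-effect argument has so little purchase. This is only a sketch under loud assumptions: the two source populations get independent random allele frequencies (an exaggerated stand-in for "very diverged"), the founder effect is a one-generation sample of a handful of individuals, and the admixture estimate is a plain least-squares fit on allele frequencies, not the f-statistic methods used in the actual papers. All names and parameter values are hypothetical.

```python
import random


def simulate(alpha=0.3, n_loci=5000, n_founders=10, seed=0):
    """Estimate admixture in a daughter population with and without a
    founder effect on the contributing sample from source A.

    Returns (estimate_without_founder_effect, estimate_with_founder_effect).
    """
    rng = random.Random(seed)
    # Two strongly diverged sources: independent allele frequencies per locus
    src_a = [rng.random() for _ in range(n_loci)]
    src_b = [rng.random() for _ in range(n_loci)]

    def founder_sample(p):
        # Allele frequency among the 2 * n_founders chromosomes that
        # happened to be drawn from a population at frequency p
        copies = 2 * n_founders
        return sum(rng.random() < p for _ in range(copies)) / copies

    # The few bands that actually contributed: a drifted version of source A
    founders_a = [founder_sample(p) for p in src_a]

    def daughter(contrib_a):
        # Daughter population: alpha from the A-side contributors, rest from B
        return [alpha * a + (1 - alpha) * b for a, b in zip(contrib_a, src_b)]

    def estimate(daughter_freqs):
        # Least-squares fit of daughter = x * A + (1 - x) * B, using the
        # broad reference A (not the drifted founders) as the analyst would
        num = sum((d - b) * (a - b)
                  for d, a, b in zip(daughter_freqs, src_a, src_b))
        den = sum((a - b) ** 2 for a, b in zip(src_a, src_b))
        return num / den

    return estimate(daughter(src_a)), estimate(daughter(founders_a))


if __name__ == "__main__":
    plain, bottlenecked = simulate()
    print(f"no founder effect:   {plain:.3f}")
    print(f"with founder effect: {bottlenecked:.3f}")
```

The drift introduced by the founder sample is noise with mean zero relative to the large, genome-wide difference between the two sources, so across thousands of loci it averages out and the estimated proportion should stay close to the true 0.3 in both cases.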

The argument from selection suffers from the problem that the selection would have to be implausibly strong, and extend across the whole genome, to reshape hunter-gatherers in this manner. One might imagine a case where gene flow and selection on parts of the genome from the donor group inflates the donor group proportion…but I don’t think that’s Peter’s point? Theoretically, a model of admixture followed by sweeps around one population’s ancestry component is possible, but I don’t think we see evidence of that in the ancient DNA.

In any case, though the verbal argument seems reasonable at first blush, the models and dynamics don’t work out.

Peter ends:

Some of the confusion in this debate may arise from the assumption that “late hunter-gatherers” formed a single group in Europe. In fact, there were at least three such groups (WHGs, SHGs, EHGs), whose genetic profiles significantly differed from each other and whose fates were likewise different. WHGs were an evolutionary dead end. They were replaced. The same cannot be said for the hunter-fisher-gatherers of Scandinavia and the Baltic, who were able to achieve high population densities by exploiting marine resources (Price 1991). With them we see more genetic continuity than rupture, and it is possible that some genetic characteristics formerly ascribed solely to “Anatolian” farmers were in fact of SHG origin.

The people who are making the assertions that Peter is rebutting are not confused as to the nature of the populations which they named and which they modeled. Peter can download the data and replicate the analyses himself. WHG, SHG and EHG seem to exist on some sort of continuum, with post-“Villabruna cluster” ancestry at one end of the spectrum and post-Ancestral North Eurasian (ANE) ancestry at the other. WHG is mostly descended from ancestors of the Villabruna cluster, who share a common ancestry derived from late Pleistocene West Eurasians with Anatolian farmers (the latter of whom admixed with Basal Eurasians). EHG is a mix of the same Villabruna people (or at least their eastern fringe), but with a preponderance of ANE-like ancestry. SHG is between these two groups.

It also seems that European hunter-gatherers sometime in the late Pleistocene and/or early Holocene received a small but detectable pulse of East Asian ancestry. Also, commonly shared haplotypes with West Asians on SLC24A5 (SHG and EHG) and EDAR with East Asians (SHG) indicate some gene flow with other places (though I believe SHG has no detectable East Asian ancestry).

Finally, there is much discussion of a late occupation of Northeast Europe by farmers. Since I predicted this 10 years ago I don’t have much objection to this section…except I don’t think that it supports his other points at all. That is, the persistence of hunter-gatherer populations around the Baltic does not mean that hunter-gatherers were more similar to farmers than we might think, nor does it reject the likelihood of total replacement in many areas of Europe to the south.

The overall conclusion here is two-fold:

  1. The assertions about pigmentation are not necessarily wrong, but the data supporting them are far weaker than might be inferred from the post. Additionally, modern Europeans show lots of evidence of recent selection and allele frequency change at several of these loci.
  2. The assertions about very large misestimations of inferred mixing proportions are probably wrong.

Open Thread, 04/10/2018

I’m about 2/3 of the way through She Has Her Mother’s Laugh: The Powers, Perversions, and Potential of Heredity. It’s what you’d expect from a Carl Zimmer book, threading history with rock-solid attention to science. So far it’s actually been a really good, if popular, history of science. I say popular not pejoratively, but because the thematic and chronological structure isn’t academic, but hinges on more personal stories, whether it’s Carl’s own family, or people, famous and not so famous, with genetic issues that passed on down through the generations.

The book isn’t out yet, but you can pre-order of course. The current plan is to get Carl on The Insight (Stitcher, Google Play and web).

Speaking of which, it’s doing really well right now.

Because I’ve been pestering you, some of you have left nice feedback for us, which is pretty important over the long-term. I’ll probably keep on this until we reach 100 reviews on iTunes.

Last week’s episode on the topic of Jewish genetics is the biggest one so far in terms of single-week downloads, and this week’s conversation with Stuart Ritchie should also pull in some interest. We talk a fair amount about Stuart’s book, Intelligence: All That Matters, and depressing topics such as the decline in fluid intelligence over a lifetime.

We’ll probably be revisiting intelligence and genetics with a future guest soon, but in the short-term we’ll pivot toward paleoanthropology since the AAPA is going on this week. I don’t know anything about bones so I’m going to mostly check out the pop-gen sessions, and then ask John Hawks for a core-dump at some point near the end of the week for the rest.

Because most people are ignorant heathens the “read the supplements” t-shirt did not sell well. But I got one for myself (we don’t comp ourselves, so I paid for it fair and square!).

Many rely on Twitter and Google Scholar, but I want to remind people of Pubchase and SciReader. They’re still useful for finding things right outside of your core zone of interest.

I mentioned the book The Invention of Humanity before. I was reading it before switching to Carl’s book (I want to prep for a podcast and I’m also going to give the book to someone else), and it’s OK, but it has the same problem as Inventing the Individual: intellectual history which engages in a sequence of inferences and asserts their validity by fiat without any argument.

There’s a lot to learn from books like this, but that mostly involves facts, rather than arguments (whose premises and method I generally find unpersuasive).

Randall Parker said he liked The Fate of Rome better than The Fall of Rome. On that recommendation, I got The Fate of Rome, as The Fall of Rome is arguably my favorite history book of all time.

We’ll see.

Ezra Klein and Sam Harris had a podcast debate. I didn’t learn much new in this debate aside from how the two view each other (lots of commentary on the comments of the other).

But one thing I have to say is that Sam Harris’ contention that America’s racial caste system was not historically rooted in a biological conception of racial hierarchy is a point I agree with. By the late 19th and early 20th century, the public rhetoric was based on such an understanding, but that understanding developed organically over time with the emergence of taxonomy and then evolutionary biology in the 18th and 19th centuries.

Its origins are far more ancient, and arguably primal.

Though Daniel Walker Howe’s magisterial What Hath God Wrought is not about race fundamentally, it is a useful work to try and get a sense of how our modern conceptions of the white supremacist republic may mislead us in terms of how it was initially conceived (as on many things, white nationalists and people on the extreme cultural Left agree on much about early America, where I think they are being anachronistic).

I think most readers now get a sense I am rather pessimistic about concepts such as public reason and getting the populace on board with ideas through persuasion. But, self-styled intellectual elites should still try to cultivate less stupidity and ignorance than is the case today. We’re led too often in the public arena by fools who can’t do their own data analysis, haven’t read the history books they were assigned in college, and whose goal is to seem smart enough to trick the masses rather than to actually impress themselves with what they’ve achieved. Though I guess for most people impressing oneself is about the bank account.

The Rakhigarhi publication is supposed to be here within a month or so. But that’s what I was told a month or so ago. At this point I don’t expect to be surprised. We need to think about archaeology, linguistics, and mythology.

For your amusement:

Genetic influence on social outcomes during and after the Soviet era in Estonia. Heritability increases with meritocracy. That’s what you’d expect.

Slope or correlation, not variance explained, allow estimation of heritability.

Viktor Orban: Hungary PM re-elected for third term. 70% of the vote went to right-wing nationalist parties. Europe’s mainstream elite shouldn’t blame the people, they’re the ones who are promoting the worship of democracy as the only legitimate form of government. They need to blame themselves.

Comparison of phasing strategies for whole human genomes. Not a big surprise if you’ve tried this, but if you haven’t, a must read.

It looks like modal extra-pair paternity rates in human populations are in the range of 1-2%. Sorry aspiring cuckolds!

Arabia as Africa-across-the-sea

In antiquity ostriches and lions roamed the Syrian desert. The cheetah even still clings to a tenuous existence in the fastness of the central Iranian desert. The point being that the new finding of African modern human remains on the southeast fringe of Arabia ~85,000 years ago shouldn’t be too surprising. Old modern(ish) looking humans date to 73,000 years before the present in Southeast Asia. Modern-like ancestry can be found in eastern (Altai) Neanderthals dating to ~100,000 years ago. And the earliest humans may have arrived in Australia 65,000 years ago.

These dates are important because the genetic results indicate that much of the population divergence of modern Eurasian, Amerindian, and Oceanian peoples dates to the period between 50 and 60 thousand years ago. This was the classic epoch for the emergence of “behavioral modernity,” and for the older models of “Out of Africa” which posited a rapid explosive demographic growth after a punctuated speciation event in East Africa ~60,000 years ago.

Today with remains such as Ust’-Ishim man, we can peg the admixture of Neanderthal into modern Eurasians to between 52,000 and 58,000 years ago. That is about the same period that the preponderance of the ancestry of modern Eurasians and peoples of Australia and the Americas expanded across the world, as noted above.

Most peoples in Western and Southern Eurasia also have substantial ancestry from another group which doesn’t seem to have much Neanderthal ancestry at all, the “Basal Eurasians” (BEu). This population obtained its name from the fact that it was hypothesized to have diverged from the common ancestors of northern Eurasians (the Pleistocene peoples of Europe and Siberia), eastern Eurasians, the ancestors of the Amerindians, and Oceanians, before these groups moved on and then separated (i.e., proto-Melanesians are closer to Pleistocene European hunter-gatherers than they are to BEu). These facts suggest proto-BEu was a distinct population >60,000 years ago.

The maximum range of Neanderthals


Because of the distribution of Neanderthal admixture across so many groups relatively evenly it probably came from a single major admixture event. Geography tells us that the most likely area of this admixture would be somewhere in the northern area of West Asia.

This implies that BEu was probably resident in the southern area of West Asia, and possibly into North Africa. We do not have any samples which are “pure BEu.” Ancient agriculturalist samples from the western Near East and the eastern Near East are high in BEu ~10,000+ years ago, but these populations are still substantially mixed with a population with affinities to Mesolithic Western European hunter-gatherers (WHG). Fu et al. 2016 use a Pleistocene transect to infer that this affinity between Near Easterners and Europeans dates to the period after ~15,000 years before the present. I presume that this late Pleistocene period was when BEu was admixed away as a pure population by an expanding hunter-gatherer culture with a nexus in Southeast Europe and into Anatolia and the trans-Caucasian region.

The recent Arabian find makes sense I think in the context of BEu and other such populations, which had diverged from the African metapopulation ~100,000 years ago, but had not pushed further north and east, and so had not mixed with Neanderthals.

But what about the older modern human remains which are showing up in eastern Eurasia? I think it is entirely likely that these populations left only a little bit of an imprint in modern groups. A paper from a few years back reported having detected such an admixture in Oceanians. The first ancient genome we have from eastern Eurasia >60,000 years ago that is from a modern human will probably yield much more satisfying results.

The big dynamic looming over the likely presence of anatomically modern humans on the edge of Africa in Arabia is that for several hundred thousand years modern humans existed within Africa as a metapopulation. The proto-Out-of-Africa population can only be understood as part of this broader metapopulation. ~100,000 years before the present our species, inclusive of Neanderthals, Denisovans, and modern humans, was probably defined by a set of distinct metapopulations. We know that there was gene flow between these metapopulations, but the strong evidence of purifying selection against Neanderthal and Denisovan ancestry in modern human genomes tells us that this gene flow was minimal enough that biological incompatibilities were beginning to build up and the groups were on their way to speciation as defined by the biological species concept.

There is no evidence of this between any modern populations, even the most diverged (e.g., the Khoisan, who carry Eurasian and African agriculturalist genetic material). This means that within the modern human metapopulation gene flow was sufficient to prevent incompatibilities from developing due to isolation. That being said, with the oldest (proto-)modern human skull dating to ~300,000 years ago, and likely discernible population structure between various African lineages going beyond 200,000 years ago, there are lots of distinct modern human groups with very long histories within Africa and on its periphery.

The earliest point that you could probably say non-African humans diverged from any African (Sub-Saharan) populations is ~100,000 years ago (and this is probably a bit too generous). A conservative estimate would suggest that modern human lineages were emerging within Africa 200,000 to 300,000 years ago. So most of modern humanity’s existence has been within Africa.

The non-African populations descend from a group which underwent a period of reduced population size vis-a-vis all the African groups. But one thing I think is important to remember is that this was probably not exceptional. We know now that over the past 5,000 years African population genetic structure has been reshaped by events such as the Bantu expansion. But there were surely small and marginal groups with low effective population sizes within Africa that either went extinct or were absorbed by other populations.

The difference in the non-African population is that it was on the edge of the modern human range, and likely occupied territory that was relatively isolated from other modern humans due to the dry nature of the Sahara during most of the Pleistocene. This prevented its absorption into more numerous groups of modern humans further south and to the west. And the strong cultural and genetic barriers with the Neanderthals probably limited gene flow as well.

But even in the inclement conditions of North Africa and West Asia for most of the past 100,000 years, modern humans may have had a larger effective population size than archaic Eurasian hominins. And with this larger effective population size, one can imagine that greater cultural creativity and genetic robustness to dynamics such as population declines gave the modern humans a long-term advantage. In this context, the existence of modern human remains in a diverse array of places across warmer areas of Eurasia before 60,000 years ago isn’t that surprising. And, the demographic wave that swallowed Neanderthals and Denisovans probably swallowed the earlier modern humans who ventured into eastern Eurasia before 60,000 years ago!