
Some cursory literature searches suggested that Reich was right to include that example in the book and the op-ed, because there had been follow-up work that verified the initial result. I had told myself that perhaps I’d follow up on this at a later point. After reading Laura Hercher’s rather patronizing take on David’s op-ed, I decided that now is as good a time as any.
Looking around, I found a very recent paper which hits the spot: "Genetic hitchhiking and population bottlenecks contribute to prostate cancer disparities in men of African descent" (it’s in Cancer Research). It came out in February 2018, so it will be up to date on the literature, and there is an evolutionary angle here (I am friendly with the first author and respect his work overall).
The paper is open access so I recommend you read it. But here’s the high level:
- They had access to Sarah Tishkoff’s huge data set of African populations, as well as 1000 Genomes, to produce a combined panel with 1 million markers and 64 populations (38 African).
- Then, they focused on the hits in the literature for prostate cancer SNPs, which they called CaP susceptibility loci: 68 SNPs with high confidence (they looked for p-values of 10^-5 or less).
So they have the data set with populations and allele frequencies, and a subset of markers that they want to interrogate (no imputation here, they had all the SNPs). They developed a statistic, Genetic Disparity Contribution (GDC), to evaluate the impact of SNP differences across populations in terms of CaP risk (that is, prostate cancer risk).
First, they need to look at a SNP in a particular population:
Here i = SNP, j = individual, and k = population. The allele being counted at each SNP is the “risk allele” (remember, SNPs come in two forms). The term with the factor of 2 reflects the frequency of the risk allele. OR_i is basically the odds ratio for developing prostate cancer associated with a given SNP.
Now, the GDC:
A = African and N = non-African. You are just using the frequencies within the populations of interest for the given SNP. You can compare different populations presumably.
Finally, the individual Genetic Risk Score (GRS):
The score for an individual j in population k is the sum of the per-SNP risk contributions across all 68 markers. If the individual has no “risk alleles” (those that increase odds of developing prostate cancer), then their GRS = 0.
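Since these quantities are easier to follow with numbers, here is a minimal Python sketch of the three of them. It assumes the usual log-additive form, where each copy of a risk allele adds ln(OR_i) to the score. That matches the description above (including GRS = 0 for someone with no risk alleles), but it is my assumption rather than a transcription of the paper's equations. The function names are hypothetical and exist only for illustration.

```python
import math

def snp_population_risk(freq, odds_ratio):
    """Expected contribution of one SNP in one population.

    Assumed form: 2 * f_ik * ln(OR_i), i.e. the expected number of
    risk alleles per person times the log odds ratio.
    """
    return 2.0 * freq * math.log(odds_ratio)

def gdc(freq_african, freq_non_african, odds_ratio):
    """Genetic Disparity Contribution of one SNP (assumed form).

    Difference in expected contribution between an African (A) and a
    non-African (N) population for the same SNP.
    """
    return (snp_population_risk(freq_african, odds_ratio)
            - snp_population_risk(freq_non_african, odds_ratio))

def individual_grs(risk_allele_counts, odds_ratios):
    """Genetic Risk Score for one individual (assumed form).

    risk_allele_counts holds 0, 1, or 2 for each SNP; each copy of a
    risk allele adds ln(OR_i).
    """
    return sum(count * math.log(odds_ratio)
               for count, odds_ratio in zip(risk_allele_counts, odds_ratios))
```

Under this form a SNP only contributes to the disparity if the risk allele frequency differs between the two populations, and a large odds ratio amplifies even a small frequency difference.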

Combining their population-wide data set and the knowledge of risks from GWAS on CaP risk SNPs, they generated the plot to the left which shows you each population’s mean GRS. They confirm earlier work which suggests that African populations are at more risk than non-African populations and that West African populations are at more risk than East African populations. The authors observe that some African populations do have low risks even on the global scale. But on the whole the rank here is:
West African > East African > South Asian > European > East Asian.
They used ADMIXTURE to confirm the obvious correlations; the more West African ancestry in an individual, the higher the GRS. The highest-scoring non-African population is Puerto Ricans, who have substantial West African admixture.
But one thing to remember here is that some of these African populations are quite distinct. For example, though West African populations have the highest risks, the Hadza and the Baka have high risks as well, and these hunter-gatherers are very diverged from other Africans. In fact, we know from ancient DNA that modern African populations are fusions of extremely distinct groups whose divergence may go well north of 200,000 years ago.

Lachance et al. do the standard genetic calculations of risk, and perform some exploratory analysis of the population structure in their data (since they curated this from well-known sources this wasn’t necessary for outlier removal as much as the regression that they ran of GRS on ancestry fractions). But they didn’t delve deeply into the demographic history that I allude to above. Rather, what they did focus on were signals of selection in regions of the genome that these risk markers were embedded in.
They seem to come to two general conclusions:
- Selection through the side-effect of hitch-hiking does seem to drive some of the African vs. non-African divergences.
- Much of the difference can probably be due to specifics of drift in non-African populations in the “out of Africa” event, and there isn’t evidence of polygenic selection across the 68 loci in the aggregate.

The former, in regards to linked selection, is also not surprising. As non-Africans spread across the world they developed new local adaptations, and some allele frequencies shifted from the African ancestors. But not all. And that I think explains why South Asians have a higher risk than Europeans and East Asians. The authors observe several protective (lower risk) alleles rose in frequency due to being in a region where there was selection for lighter pigmentation. Pigmentation is one trait which is highly heritable where some non-Africans (South Asians, Oceanians) are often more like African populations than other Eurasian groups. If high-risk CaP alleles were somehow associated with ancestral pigmentation alleles, then it makes sense that South Asians have a higher risk, since they are more ancestral on these loci than other Eurasians.

Because the OR can vary between populations, the authors ran their analysis by equalizing the OR and also by using the literature value of OR at a marker population by population. They found the broad disparity held. Subsampling the markers also maintained the rank order in broad geographic terms. Finally, the authors observe that because of the bias in the discovery of European risk variants, there are probably African risk variants that are not in their marker set which result in an underestimate of the GRS.
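To make the logic of those robustness checks concrete, here is a toy Python sketch of two of them: giving every SNP the same odds ratio, and recomputing the ranking on a random subset of markers. It assumes the same log-additive score as the GRS described earlier. The population labels, frequencies, odds ratios, and function names are all made up for illustration and are not from the paper.

```python
import math
import random

def mean_grs(freqs, odds_ratios):
    """Population mean score under the assumed form: sum of 2 * f * ln(OR)."""
    return sum(2.0 * f * math.log(o) for f, o in zip(freqs, odds_ratios))

def rank_order(pop_freqs, odds_ratios):
    """Population names sorted from highest to lowest mean score."""
    scores = {pop: mean_grs(f, odds_ratios) for pop, f in pop_freqs.items()}
    return sorted(scores, key=scores.get, reverse=True)

def rank_with_equal_or(pop_freqs, n_snps, shared_or=1.2):
    """Re-rank after giving every SNP the same (arbitrary) odds ratio."""
    return rank_order(pop_freqs, [shared_or] * n_snps)

def rank_on_subsample(pop_freqs, odds_ratios, n_keep, seed=0):
    """Re-rank using only a random subset of the markers."""
    rng = random.Random(seed)
    keep = sorted(rng.sample(range(len(odds_ratios)), n_keep))
    sub_freqs = {pop: [f[i] for i in keep] for pop, f in pop_freqs.items()}
    sub_ors = [odds_ratios[i] for i in keep]
    return rank_order(sub_freqs, sub_ors)

# Made-up toy data: risk allele frequencies at four SNPs in three populations.
toy_freqs = {
    "pop_A": [0.6, 0.5, 0.7, 0.4],
    "pop_B": [0.4, 0.3, 0.5, 0.2],
    "pop_C": [0.2, 0.1, 0.3, 0.1],
}
toy_ors = [1.1, 1.3, 1.2, 1.5]

baseline = rank_order(toy_freqs, toy_ors)
equalized = rank_with_equal_or(toy_freqs, len(toy_ors))
subsampled = rank_on_subsample(toy_freqs, toy_ors, n_keep=2)
```

If the ranking survives both perturbations, the disparity is not hanging on a handful of SNPs or on the particular odds ratios plugged in, which is the point of the checks the authors describe.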
What is the upshot of all of this? The less important one is that David Reich used the example of prostate cancer to open his discussion about population structure because it’s probably a robust result (and also, in the book he makes clear a lot of sociologists and anthropologists did not appreciate the correlation between disease and ancestry that seemed due to biology). The balance of the evidence points to the likelihood that men with African ancestry, in particular, but not exclusively, West African ancestry, have somewhat higher risks, all things equal, of developing prostate cancer. As the authors note, the risks overlap quite a bit between populations. A substantial number of men of European ancestry have a higher GRS for CaP than those of African ancestry. There are two classes of alleles driving this risk. One class has high-frequency differences between populations, and another class has a large impact on odds ratios (so small differences still matter).

Since the heritability is not high, but only moderate, and even this correlation is imperfect, one can still argue that the disparity is attributable to environment. But to be honest, the South Asian prediction along with the relationship to pigmentation regions indicates to me that the GRS is capturing something real in population differences due to a combination of demographic history and natural selection.
Moving on from CaP, these academic debates about whether disparities are driven by genes, environment or both (or an interaction), miss the bigger picture that due to the contingencies of history different populations probably have different risks in late-in-life diseases. The South Asian risk for cardiac and metabolic illnesses is so extreme that I think most people won’t deny that that is a real thing (in particular since there is variation within South Asia for this judging by British medical data).




