
Selection everlasting, suppositions no more

The Biometrician
I have alluded over the years to the early 20th century conflict between the Mendelians, who were proto-geneticists, and the biometricians, who were classical Darwinians. As if in a Hegelian dialectic, this clash of egos eventually led to the synthesis which became population genetics. The historical process is outlined beautifully in Will Provine’s The Origins of Theoretical Population Genetics, but at the time it bore fruit in R. A. Fisher’s 1918 paper The Correlation between Relatives on the Supposition of Mendelian Inheritance. One might argue that though this publication ended the explicit debate and division, the difference persisted in practice, not for fundamental reasons, but for pragmatic ones. Classical population geneticists focused on single or two locus models to develop their intuitions about the trajectory of evolutionary processes. Quantitative geneticists refined their statistical techniques of inference on continuous characters whose heritability was confirmed, but whose specific causal genetic elements remained mysterious.


The geneticist
It could be no different in the pre-genomic era. Without “big data” and “big metal” (i.e., computation) the rich, but messy, empirical work of quantitative genetics and the elegant and analytic landscapes of population genetics were separated by a methodological chasm. Polygenic models built from the bottom up were simply not practicable for evolutionary geneticists. And without genomics, ascertaining causal loci in highly polygenic traits was unlikely, and in any case not necessary, for quantitative geneticists. But that is changing, a century after Fisher’s seminal paper, which fused the two fields in their theoretical axioms.

And this is not a trivial matter, because adaptive evolution occurs upon continuous characters affected by polygenic standing variation. Subtle heritable differences on quantitative traits almost certainly have a genetic basis, but when that variation is distributed across hundreds or thousands of loci, the reality must remain abstract. One can make educated assertions about the broad flow of evolutionary process, but cannot get at the nuts-and-bolts details of how it proceeds. And these quantitative characters are of some note. Diseases such as type 2 diabetes and schizophrenia seem to have a heritable component, but their possible evolutionary origins are murky at best.

With the availability of large data sets, theorists are now rousing themselves, and attempting to close the book that Fisher opened. Over at Haldane’s Sieve they have posted a preprint with an intriguing title, The Population Genetic Signature of Polygenic Local Adaptation. The claims seem characterized by a grand modesty:

Adaptation in response to selection on polygenic phenotypes occurs via subtle allele frequency shifts at many loci. Current population genomic techniques are not well posed to identify such signals. In the past decade, detailed knowledge about the specific loci underlying polygenic traits has begun to emerge from genome-wide association studies (GWAS). Here we combine this knowledge from GWAS with robust population genetic modeling to identify traits that have undergone local adaptation. Using GWAS data, we estimate the mean additive genetic value for a given phenotype across many populations as simple weighted sums of allele frequencies. We model the expected differentiation of GWAS loci among populations under neutrality to develop simple tests of selection across an arbitrary number of populations with arbitrary population structure. To find support for the role of specific environmental variables in local adaptation we test for correlations with the estimated genetic values. We also develop a general test of local adaptation to identify overdispersion of the estimated genetic values among populations. This test is a natural generalization of QST/FST comparisons based on GWAS predictions. Finally we lay out a framework to identify the individual populations or groups of populations that contribute to the signal of overdispersion. These tests have considerably greater power than their single locus equivalents due to the fact that they look for positive covariance between like effect alleles. We apply our tests to the human genome diversity panel dataset using GWAS data for six different traits. This analysis uncovers a number of putative signals of local adaptation, and we discuss the biological interpretation and caveats of these results.

How? The mathematics will likely be a touch gnarly for most readers of this weblog, but the brave should just go to the preprint. What I will say is that the methods outlined within the paper seem to attempt to account for the diverse multi-valent forces that polygenic traits are subject to. Dispersed weak selection is naturally subtle, and easily masked and confounded. What one must do is compare the patterns within the genome against the neutral expectations that one might predict from phylogeny and geography. Easy enough to write, but totally infeasible in the pre-computer age. The main empirical result I will offer is that they find little evidence for selection on loci implicated in type 2 diabetes. This does not settle the question, but it does lend credence to the view that the ‘thrifty gene’ hypothesis is rather fanciful.
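For readers who prefer code to algebra, here is a toy sketch of the one step the abstract spells out plainly: a population's estimated genetic value as a weighted sum of allele frequencies, with GWAS effect sizes as the weights. The factor of 2 (for diploids), the function names, and the made-up numbers are my own assumptions for illustration. The sign-flip null in particular is a crude stand-in that I have swapped in for the preprint's neutral model, which accounts for population structure and which this sketch does not attempt.

```python
import random
from statistics import pvariance


def genetic_values(freqs, effects):
    """Estimated mean genetic value per population.

    freqs[p][l] is the frequency of the effect allele at locus l in
    population p; effects[l] is its GWAS effect size. The factor of 2
    assumes diploid individuals (an assumption of this sketch).
    """
    return [2.0 * sum(a * f for a, f in zip(effects, pop)) for pop in freqs]


def toy_overdispersion_test(freqs, effects, n_draws=2000, seed=0):
    """Compare the among-population variance of genetic values to a toy null.

    The null randomly flips the sign of each effect, which breaks any
    covariance between like-effect alleles. This is NOT the preprint's
    neutral model: it ignores drift and shared ancestry entirely.
    """
    rng = random.Random(seed)
    observed = pvariance(genetic_values(freqs, effects))
    at_least_as_large = 0
    for _ in range(n_draws):
        flipped = [a if rng.random() < 0.5 else -a for a in effects]
        if pvariance(genetic_values(freqs, flipped)) >= observed:
            at_least_as_large += 1
    # Add-one correction so the empirical p-value is never exactly zero.
    p_value = (at_least_as_large + 1) / (n_draws + 1)
    return observed, p_value


# Made-up frequencies for 3 populations at 4 loci, all shifted the same way.
example_freqs = [
    [0.1, 0.2, 0.3, 0.4],
    [0.5, 0.5, 0.5, 0.5],
    [0.9, 0.8, 0.7, 0.6],
]
example_effects = [0.5, 0.5, 0.5, 0.5]

values = genetic_values(example_freqs, example_effects)
variance, p_value = toy_overdispersion_test(example_freqs, example_effects)
print(values, variance, p_value)
```

The point of the toy is the intuition in the abstract's last lines: no single locus here shows a dramatic shift, but because the like-effect alleles all move in the same direction, the summed genetic values spread out more among populations than most random assignments of effect direction would produce.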

Ultimately the task of model building is tedious, and it will be iterative. But the early years of the 21st century have seen the same sort of theoretical revival and reformation which occurred in the early 20th. Only good things can come….

Citation: The Population Genetic Signature of Polygenic Local Adaptation.
