
Evidence for natural selection between populations

In February we discussed a new Science paper, Whole-Genome Patterns of Common DNA Variation in Three Human Populations.

They genotyped 1.6 million SNPs in 71 unrelated individuals from three populations: 24 European Americans, 23 African Americans, and 24 Han Chinese from the Los Angeles area. While this was primarily a tool-building exercise, they did perform a number of interesting analyses on the data they collected. One that caught my eye looked for evidence of selection between populations. The analysis used Fst (a measure of how much of the genetic variation lies between rather than within populations), calculated for each SNP.

In summary, they found that Fst was higher for SNPs in gene-containing (genic) regions than in gene-poor (nongenic) regions, and higher for coding than for non-coding SNPs. Put another way, there is greater genetic variation between the populations in the regions of the genome that are most likely to be functional. These differences were small but statistically significant, and the authors interpret them as evidence for local selection. However, when they asked whether the SNPs that provided evidence for natural selection were private to a single population, they found that they were not. From this they conclude that most functional genetic variation is not population specific.
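For readers who have not worked with Fst, here is a rough sketch of a per-SNP calculation. It uses the simple heterozygosity-based form, Fst = (H_T - H_S) / H_T. The post does not say which estimator the paper used, so treat the formula choice, the function name `fst`, and the sample-size weighting as my assumptions rather than the authors' method.

```python
import math

def fst(freqs, sizes=None):
    """Per-SNP Fst from one allele's frequency in each population.

    Assumption: simple (H_T - H_S) / H_T form, not necessarily the
    estimator used in the paper.
    """
    if sizes is None:
        sizes = [1] * len(freqs)          # weight populations equally
    total = sum(sizes)
    weights = [n / total for n in sizes]

    # Pooled allele frequency and total expected heterozygosity
    p_bar = sum(w * p for w, p in zip(weights, freqs))
    h_t = 2 * p_bar * (1 - p_bar)
    if h_t == 0:
        return math.nan                   # monomorphic SNP: Fst is undefined

    # Average within-population expected heterozygosity
    h_s = sum(w * 2 * p * (1 - p) for w, p in zip(weights, freqs))
    return (h_t - h_s) / h_t
```

An Fst of 0 means the populations have identical allele frequencies at that SNP, and 1 means they are fixed for different alleles. With sample sizes like those in the study you could pass `sizes=[24, 23, 24]`, though with nearly equal samples the weighting matters little.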

Anyone interested in looking at this more closely can find the data online here. It might be interesting to ask which genes are associated with SNPs that show high Fst.
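One way to approach the high-Fst question, sketched under assumptions: suppose the data have already been reduced to records of (SNP ID, gene, Fst), with the gene set to `None` for nongenic SNPs. That record layout, the function name `top_genes_by_fst`, and the threshold are all hypothetical and are not taken from the published data files.

```python
from collections import defaultdict

def top_genes_by_fst(records, threshold):
    """Group genic SNPs with Fst >= threshold by gene.

    records: iterable of (snp_id, gene, fst); gene is None for nongenic SNPs.
    Returns a list of (gene, [(snp_id, fst), ...]) sorted by each gene's
    highest Fst, largest first.
    """
    by_gene = defaultdict(list)
    for snp_id, gene, fst_value in records:
        if gene is not None and fst_value >= threshold:
            by_gene[gene].append((snp_id, fst_value))
    return sorted(
        by_gene.items(),
        key=lambda item: max(f for _, f in item[1]),
        reverse=True,
    )
```

Given the paper's caution that large Fst values are weak evidence on their own, a list like this is best read as a set of candidates to follow up, not a list of selected genes.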

Here is the associated text from the paper:

Evidence for natural selection between populations. It has been suggested that natural selection distorts the observed distribution of FST across the human genome and that large FST values can be used to identify candidate loci likely to have undergone local selection (13, 19). If this is true, then larger FST values should be found near functional genetic elements. We looked at the distribution of FST for SNPs that were genic or nongenic, coding or noncoding, and synonymous or nonsynonymous. We performed the analysis within subsets of SNPs grouped by MAF, so that effectively, we looked at the fraction of between-population variance for SNPs with the same total genetic variance (fig. S3). Common SNPs in genic regions do have slightly but significantly higher FST values than nongenic SNPs with the same MAF [analysis of variance (ANOVA), P = 1.8 x 10^-46], and common coding SNPs have slightly higher FST values than noncoding SNPs in genic regions (ANOVA, P = 1.1 x 10^-4). We did not see a significant difference in FST between synonymous and nonsynonymous coding SNPs, but our sensitivity is limited by the small sample sizes and expected correlations among SNPs within the same transcript. These results are consistent with local selection changing the distribution of FST near functional sequences. However, because the distributions of FST among genic and nongenic SNPs are very similar, large FST values by themselves appear to be very weak evidence of selection.

We performed a similar analysis to see if there is also an association between private SNPs and functional genetic elements. When conditioned on MAF, we saw no difference in frequency of private SNPs among genic and nongenic SNPs or among coding and noncoding SNPs (fig. S4). This indicates that the SNPs responsible for evidence of local selection in the FST analysis tend not to be private and instead are segregating in multiple populations. Although there are known examples linking population-specific SNP alleles to phenotypic differences (20–22), our results are more consistent with the conclusion that most functional human genetic variation is not population-specific.
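To make the private-SNP comparison concrete, here is a minimal sketch of the bookkeeping: call a SNP private if its minor allele is observed in exactly one population, then compare the fraction of private SNPs between categories within the same MAF bin. The input layout, the bin width, and the function names are my assumptions; the paper's actual procedure (fig. S4) may differ.

```python
import math
from collections import defaultdict

def is_private(counts):
    """True if the minor allele is observed in exactly one population.

    counts: minor-allele counts, one per population.
    """
    return sum(1 for c in counts if c > 0) == 1

def private_fraction_by_bin(snps, bin_width=0.1):
    """Fraction of private SNPs per (MAF bin, category).

    snps: iterable of (maf, category, counts), where category is a label
    such as "genic" or "nongenic" (hypothetical layout).
    """
    totals = defaultdict(int)
    privates = defaultdict(int)
    for maf, category, counts in snps:
        # Round before flooring so values like 0.3 / 0.1 land in bin 3
        maf_bin = math.floor(round(maf / bin_width, 9))
        key = (maf_bin, category)
        totals[key] += 1
        if is_private(counts):
            privates[key] += 1
    return {key: privates[key] / totals[key] for key in totals}
```

Comparing the genic and nongenic fractions within each bin, rather than overall, is what "conditioned on MAF" amounts to here: rare alleles are more likely to be private regardless of function, so the comparison only makes sense among SNPs of similar frequency.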

Posted by rikurzhen at 12:43 AM
