One of the major criticisms leveled against genome-wide association studies for complex diseases is that they have identified loci which account for a relatively small proportion of the variance in most traits. The difference between this small proportion of variance explained by known loci and the (generally large) total amount of variance known to be due to genetic factors has been called the “missing heritability”. Much ink has been spilled speculating about where this missing heritability lies.
Two papers published this week suggest that much of the heritability may not actually be missing at all. The argument is simple: when performing a genome-wide association study, people use very stringent thresholds for calling a SNP associated with a trait. This is reasonable; people generally want to follow up only on true positives. However, there are probably many loci which don’t reach these highly stringent cutoffs but which truly influence the trait in question. Using methods to determine how much of the variance can be explained by these loci of smaller effect, one group suggests that about half of the heritability of height can be explained by common SNPs, and possibly close to all of it if other factors are taken into account. The authors offer one of the most reasonable, non-hyperbolic discussions of where the “missing heritability” lies, and of how whole-genome sequencing will affect genome-wide association studies. It’s worth reading the whole thing, but here’s their conclusion:
If other complex traits in humans, including common diseases, have genetic architecture similar to that of height, then our results imply that larger GWASs will be needed to find individual SNPs that are significantly associated with these traits, because the variance typically explained by each SNP is so small. Even then, some of the genetic variance of a trait will be undetected because the genotyped SNPs are not in perfect LD with the causal variants. Deep resequencing studies are likely to uncover more polymorphisms, including causal variants that will be represented on future genotyping arrays. Our data provide strong evidence that the variation contributed by many of these causal variants is likely to be small and that very large sample sizes will be required to show that their individual effects are statistically significant. A similar conclusion was drawn recently for schizophrenia. In some cases the small variance will be due to a large effect for a rare allele, but this will still require a large sample size to reach significance. Genome-wide approaches like those used in our study can advance understanding of the nature of complex-trait variation and can be exploited for selection programs in agriculture and individual risk prediction in humans.
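To make the threshold argument concrete, here is a rough back-of-the-envelope sketch in Python. It is not the method used in either paper. It assumes a hypothetical trait in which 1,000 independent loci each explain an equal share of 50% of the variance, and it uses a normal approximation for the association test statistic at the conventional 5e-8 significance cutoff. The function names, the number of loci, and the sample sizes are all invented for illustration.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution (mean 0, sd 1)


def detection_power(var_explained, n_samples, alpha=5e-8):
    """Approximate power of a two-sided test to detect a single SNP.

    var_explained: fraction of trait variance explained by the SNP (assumed).
    n_samples: number of individuals in the hypothetical study.
    alpha: significance threshold; 5e-8 is the usual genome-wide cutoff.
    """
    # z-score a SNP must exceed (in absolute value) to be called significant
    z_threshold = -STD_NORMAL.inv_cdf(alpha / 2)
    # Expected z-score of the association test under a normal approximation
    expected_z = (n_samples * var_explained / (1 - var_explained)) ** 0.5
    upper_tail = 1 - STD_NORMAL.cdf(z_threshold - expected_z)
    lower_tail = STD_NORMAL.cdf(-z_threshold - expected_z)
    return upper_tail + lower_tail


def expected_variance_found(n_loci, total_var, n_samples, alpha=5e-8):
    """Expected variance explained by the loci that reach significance.

    Assumes n_loci independent loci that split total_var equally.
    """
    per_locus = total_var / n_loci
    return total_var * detection_power(per_locus, n_samples, alpha)


if __name__ == "__main__":
    # Hypothetical scenario: 1,000 loci jointly explaining half the variance
    for n in (20_000, 200_000):
        found = expected_variance_found(1000, 0.5, n)
        print(f"n={n}: variance explained by significant SNPs = {found:.4f}")
```

With these made-up numbers, the smaller study should recover only a sliver of the variance that the loci jointly explain, while the tenfold larger study recovers nearly all of it. The heritability was never absent in the smaller study; the individual effects were simply too small to clear the threshold.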
Park et al. (2010) Estimation of effect size distribution from genome-wide association studies and implications for future discoveries. Nature Genetics. doi:10.1038/ng.610.
Yang et al. (2010) Common SNPs explain a large proportion of the heritability for human height. Nature Genetics. doi:10.1038/ng.608.