A note on the Common Disease-Common Variant debate

One of the more heated debates in human medical genetics in the last decade or so has centered on the Common Disease-Common Variant (CDCV) hypothesis. As the name implies, the hypothesis posits that genetic susceptibility to common diseases like hypertension and diabetes is largely due to alleles which have moderate frequency in the population. The competing hypothesis, also cleverly named, is the Common Disease-Rare Variant (CDRV) hypothesis, which suggests that multiple rare variants underlie susceptibility to such diseases. As different techniques must be used to find common versus rare alleles, this debate would seem to have major implications for the field. Indeed, the major proponents of the CDCV hypothesis were the movers and shakers behind the HapMap, a resource for the design of large-scale association studies (which are effective at finding common variants, much less so for rare variants).

However, CDCV versus CDRV is an utterly false dichotomy, as I’ll explain below. This point has slipped past many of the human geneticists who actually do the work of mapping disease genes, and I feel the problem is this: essentially, geneticists are looking for a gene or the gene, so they naturally want to know whether to take an approach that will be the best for finding common variants or one for finding rare variants. However, common diseases do not follow simple Mendelian patterns– there are multiple genes that influence these traits, and the frequencies of these alleles have a distribution. A decent null hypothesis, then, is to assume that the distribution of frequencies of alleles underlying a complex phenotype is essentially the same as the overall distribution of allele frequencies in the population– that is, many rare variants and some common variants.

This argument would seem to favor the CDRV hypothesis. Not so. The key concept for explaining why is one borrowed from epidemiology called the population attributable risk–essentially, the number of cases in a population that can be attributed to a given risk factor. An example: imagine smoking cigarettes gives you a 5% chance of developing lung cancer, while working in an asbestos factory gives you a 70% chance. You might argue that working in an asbestos factory is a more important risk factor than cigarette smoking, and you would be correct–on an individual level. On a population level, though, you have to take into account the fact that millions more people smoke than work in asbestos factories. If everyone stopped smoking tomorrow, the number of lung cancer cases would drop precipitously. But if all asbestos factory workers quit tomorrow, the effect on the population level of lung cancer would be minimal. So you can see where I’m going with this: common susceptibility alleles contribute disproportionately to the population attributable risk for a disease. In type II diabetes, for example, a single variant with a rather small effect but a moderate frequency accounts for 21% of all cases[cite].
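
To put rough numbers on the smoking versus asbestos comparison, here is a minimal sketch in Python. Only the 5% and 70% risks come from the example above; the population size, the baseline risk for unexposed people, and the two exposure frequencies are made-up values for illustration, and the function names are my own. Each risk factor is treated in isolation, ignoring any overlap or interaction between them. The second function uses the standard single-exposure formula for the population attributable fraction (Levin's formula), PAF = p(RR - 1) / (1 + p(RR - 1)), where p is the frequency of the exposure and RR is the relative risk.

```python
# Hypothetical numbers throughout: only the 5% and 70% risks come from the
# example above; population size, baseline risk, and exposure frequencies
# are assumptions chosen for illustration.

def attributable_cases(population, exposure_freq, risk_exposed, risk_baseline):
    """Expected cases that would vanish if the exposed group had baseline risk."""
    n_exposed = population * exposure_freq
    return n_exposed * (risk_exposed - risk_baseline)


def attributable_fraction(exposure_freq, risk_exposed, risk_baseline):
    """Levin's population attributable fraction for a single risk factor."""
    relative_risk = risk_exposed / risk_baseline
    excess = exposure_freq * (relative_risk - 1.0)
    return excess / (1.0 + excess)


POPULATION = 1_000_000   # assumed population size
BASELINE_RISK = 0.005    # assumed risk for people with neither exposure

risk_factors = {
    # name: (assumed exposure frequency, risk if exposed, from the example)
    "smoking": (0.20, 0.05),
    "asbestos factory work": (0.0001, 0.70),
}

for name, (freq, risk) in risk_factors.items():
    cases = attributable_cases(POPULATION, freq, risk, BASELINE_RISK)
    fraction = attributable_fraction(freq, risk, BASELINE_RISK)
    print(f"{name}: ~{cases:.0f} attributable cases, PAF = {fraction:.3f}")
```

Under these assumed inputs, smoking accounts for about 9,000 cases per million people and asbestos work for roughly 70, even though the individual risk from asbestos is fourteen times higher. The same arithmetic applies if you swap "exposure frequency" for "allele frequency" and "risk if exposed" for "risk given the allele".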

So am I then arguing in favor of the CDCV hypothesis? Of course not– rare variants, aside from being predictive for disease in some individuals, also give important insight into the biology of the disease. But it is possible right now, using genome-wide SNP arrays and databases like the HapMap, to search the entire genome for common variants that contribute to disease. This is an essential step–finding the alleles that contribute disproportionately to the population-level risk for a disease. Eventually, the cost of sequencing will drop to a point where rare variants can also be assayed on a genome-wide, high-throughput scale, but that’s not the case yet. Once it is, expect the CDRV hypothesis to be trumpeted as right all along.


  1. As it happens, the risk of lung cancer because of smoking and because of exposure to asbestos is multiplicative: you take far more risk by doing both than by doing either one alone. 
    Does this detract from the power of your illustration?

  2. The same effect applies to smoking and exposure to radon. Prolonged exposure to sufficiently high concentrations of radon results in risk of contracting lung cancer. Smoking does likewise. But people who both smoke and are exposed long term to high enough concentrations of radon are at very much higher risk – multiplicative (can’t even pronounce that). 
    My question was going to be – why? I suspect it may add to the explanation, not detract from it, but it would be of interest to see it enunciated. 
    And the other poser is why non-smoking women suffer the relatively high risk of contracting lung cancer that they do.

  3. As it happens, the risk of lung cancer because of smoking and because of exposure to asbestos is multiplicative: you take far more risk by doing both than by doing either one alone. 
    it’s often assumed that this is the same for genetic susceptibility as well. that is, each exposure (or in this case, locus) is independent, so the risks multiply. but the point that the frequency of the allele (or the exposure) affects the impact on the population stands.

  4. …makes me wonder about the utility of stark verbal hypotheses in the (coming) era of fine grained data.

  5. My question was going to be – why? 
    but think about it the other way– why not? that is, imagine a general level of risk that everyone has. if factor a increases that risk 3X and factor b increases risk 10X, why wouldn’t factors a and b together increase risk 10 x 3 = 30X? 
    you’d have to imagine some interaction between the factors– that radon only increases risk if you smoke, or smoking only increases risk if you’re not exposed to radon, to explain a lack of multiplicativity.

  6. The asbestos fibres puncture the cell walls and let the carcinogens in. If you don’t smoke, far fewer carcinogens to admit. If you don’t work in the asbestos biz, fewer damaged cell walls.

  7. s/wall/membrane/

  8. Until researchers pin down one, common, potentially fatal disorder on genes (and not the result of Heterozygote advantage) this is sort of like debating how many angels can dance on the head of a pin. Once we have proof for one, the theory can develop from there.

  9. Until researchers pin down one, common, potentially fatal disorder on genes (and not the result of Heterozygote advantage)… 
    I’m not sure what you mean by this. there are environmental and genetic components to all common diseases. Some of the genes involved in some of them are already known. see the citation in the post for one involved in type II diabetes. Also: 
    type I diabetes: 
    cardiovascular disease: 

  10. P-ter, 
    Like everyone else on GNXP I am endlessly fascinated with genetics research. (I love space.com too) However those articles all showed genes to be risk factors, not triggers. That is a world of difference. There are genes that give us more or less resistance to flu virus but nobody would claim that genes cause the flu. 
    I think of genes as building blocks, not stumbling blocks. DON’T GET ME WRONG! I love GNXP and love reading the stories.

  11. However those articles all showed genes to be risk factors, not triggers. That is a world of difference. 
    is there a world of difference? consider cystic fibrosis, a pretty classic case of a single gene disorder. Mutations in this gene could then be considered a “trigger”, right? 
    however, some people with mutations in the CFTR gene don’t develop serious disease– indeed, some end up with the “mild” phenotype of infertility.  
    so is any gene a “trigger” for a disease? We’re looking at probabilities here, and a trigger would seem to imply 100% probability of developing a disease. There may be cases of that, but even traits with “simple” Mendelian inheritance aren’t quite at 100%. Genes involved in common diseases obviously have much lower risks associated with them, and I’m willing to bet that no gene will be shown to be a “trigger” or “cause” for/of any common disease.
    This debate isn’t about whether common alleles will be found that cause disease (in the sense you’re looking for), it’s about whether common alleles will be risk factors for disease. The days of a single gene for a disease trait are over.

  12. p-ter 
    I agree with everything you just wrote. Maybe we are just discussing semantics.

  13. Maybe we are just discussing semantics. 
    Somewhat — if I hook a prisoner up to an electric chair and throw the switch, what “caused” their death? For common diseases, the environmental triggers you’re talking about are like throwing the switch on an electric chair — a “catalyst” of some kind — while the genetic susceptibility is like the “background condition” that the prisoner’s skin conducts electricity the way it does. 
    Both are included within the “necessary and sufficient conditions” for the prisoner’s death / person developing diabetes, but the human mind distinguishes between background conditions and catalyst events. For common diseases, I think our intuitive psychology accords with an evolutionary perspective: susceptibility alleles set the stage, but it’s really a pathogen or toxin or over-reliance on one foodstuff that ushers in the diseased state. 
    Over time, fitness-reducing alleles will be weeded out (assuming the obvious caveats — not implicated in hetero advantage, etc.), though the triggers may remain. It’d be like the prisoner evolving plastic or wooden skin, though that wouldn’t change our minds about what causes “death by electric chair” (still the throwing of the switch).

  14. To be more concrete, variation at the CCR5 locus partially accounts for why some people develop AIDS when infected with HIV, while others develop mild and delayed symptoms, while others remain pretty much unaffected. At the end of the day, though, it’s HIV that “causes” AIDS.

  15. OK, so taking those helpful illustrations, let me try to phrase another question, which may be an unfair question to ask anyone who has not studied or researched colon cancer in particular. 
    I have read that the ‘background condition’ for colon cancer is a ‘defective gene’, but I know from p-ter that a single gene for a disease trait is now known not to be the case. So I assume that certain alleles must be implicated, the frequency for which I assume can’t be that low because of the relatively high incidence of colon cancer in the USA, UK and Australian populations. I also assume it can’t be one set of alleles, because several different preconditions are known for colon cancer – it gets lumped together as one outcome, but there are different ‘background conditions’. 
    Interestingly, the frequency appears to be lower in Australian aboriginal people than others, but I digress. 
    Drinking of alcohol is stated to be a risk factor in the occurrence of colon cancer. Smoking is also stated to be a risk factor in colon cancer. If someone both smokes and drinks, the risk is multiplicative – the resulting risk has been stated to be 400% of that of someone who neither drinks nor smokes. Other stated risk factors are age, obesity, lack of exercise, low fibre diet, and high consumption of fats. Presumably there may be other risk factors which are not yet known or suspected – perhaps prolonged excessive environmental stress, for example.  
    But maybe there is ample opportunity for confounding, because perhaps people who drink and smoke a lot are also not very good at eating a healthy diet, getting plenty of exercise and getting health checks done after the age of 50 to determine whether they have a precondition for colon cancer. 
    Digressing again, Australian aboriginal people seem to have less genetic predisposition for colon cancer, but for those who get it the survival rate is worse, apparently due to poorer access to health care, which is credible. 
    So assuming that colon cancer qualifies as a CD, is the genetic predisposition a CV or a lot of RVs, and are all of the risk factors multiplicative?  
    I would guess that the answer is both CV and RVs, because there is not a single pre-condition for colon cancer, there are known to be several, but I’m guessing, having been excited by p-ter’s original post. What is not clear to me is whether the risk factors are the same in every case and whether they are always multiplicative.

  16. I have read that the ‘background condition’ for colon cancer is a ‘defective gene’, but I know from p-ter that a single gene for a disease trait is now known not to be the case 
    You’re referring, I believe, to the genes that are involved in hereditary colon cancer. Some families have particularly high risk because there are mutations in the APC or some other gene segregating in the family. These genes are almost directly causal in the sense Looc wanted– the probability of developing disease given a single mutation in these genes is rather high. However, on a population level, these mutations are rare, and only account for a few percent of all colon cancer cases.  
    I’m not familiar with the most recent research into colon cancer (the examples I’ve given are “classics” in cancer genetics from the early 90s). but I’ll give it a look.

  17. Thank you p-ter, I would greatly appreciate it.

  18. so just a cursory look suggests that studies of the genetics of colon cancer stalled out after the genes behind the mendelian forms were discovered. from the NCI: About 75% of patients with colorectal cancer have sporadic disease, with no apparent evidence of having inherited the disorder. The remaining 25% of patients have a family history of colorectal cancer that suggests a genetic contribution, common exposures among family members, or a combination of both. Genetic mutations have been identified as the cause of inherited cancer risk in some colon cancer-prone families; these mutations are estimated to account for only 5% to 6% of colorectal cancer cases overall. It is likely that other undiscovered major genes and background genetic factors contribute to the development of colorectal cancer, in conjunction with nongenetic risk factors.

  19. but there is this study: 
    which shows the association of a common allele with a moderate increase in risk: The frequency of the 1663A [protective] variant allele in Japanese in Hawaii (42%) was the same as that found in Japan by Hasegawa et al. (12) and in Caucasians in our study. The high frequency (56%) of this allele and the stronger associated risk reduction observed in Native Hawaiians are of interest because this group, like other Polynesians, has unexpectedly low rates of colon cancer (31). It is unclear why the association with colorectal cancer was not observed in Japanese because the A allele seems to affect circulating levels of IGF-I in this ethnic group as well.