Sunday, May 21, 2006

Fitness and disease II   posted by JP @ 5/21/2006 09:03:00 AM

The second point made by Eyre-Walker et al. in their paper "The distribution of fitness effects of new deleterious amino acid mutations in humans" is that "it will be very difficult to locate most of the genes involved in complex disease" (see my discussion of their first point here).

They come to this conclusion because, under their model, the variants that contribute the most to the variance of a disease-associated trait are rare (see Jonathan Pritchard's 2001 paper for a different way to come to a similar conclusion), and thus difficult to map, except in two cases: when the disease has no effect on fitness, or when the alleles involved have been under positive selection. They clearly don't think those two cases are likely, though they do acknowledge they may number "a few".
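To make the "rare but important" idea concrete, here is a minimal sketch using the textbook additive formula for the trait variance contributed by a single biallelic locus, 2p(1-p)a^2. The function name, frequencies, and effect sizes are my own illustrative assumptions, not numbers from the paper, and this is not their actual model, which ties effect size to the strength of selection.

```python
def additive_variance(p, a):
    """Trait variance contributed by one biallelic locus.

    p: allele frequency; a: per-allele effect on the trait.
    Assumes Hardy-Weinberg proportions and purely additive effects.
    """
    return 2.0 * p * (1.0 - p) * a ** 2


# Hypothetical numbers: a rare allele with a large effect versus
# a common allele with a small effect.
rare_large = additive_variance(p=0.01, a=1.0)
common_small = additive_variance(p=0.30, a=0.1)

print(f"rare, large effect:   {rare_large:.4f}")
print(f"common, small effect: {common_small:.4f}")
```

With these made-up values the rare allele explains more variance than the common one, yet it is carried by only about 2% of people, which is roughly why such variants would be hard to map.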

I disagree. We've all heard the story about the guy at night, searching for a lost coin under a streetlight because "that's where the light is". That's essentially what's happening here. They've derived the distribution of fitness effects of new deleterious amino acid mutations, so obviously disease alleles must be new. And deleterious. And amino acid mutations. Right?

Ok, ok, it's true they mention a couple of possible violations of those assumptions, but they still feel confident enough to say that "most of the genes involved in complex disease" can be described with their model. But here are a couple of different theoretical paradigms to consider: the "thrifty genotype" and "cryptic genetic variation".

1. The "thrifty genotype" hypothesis. We've mentioned this hypothesis before, as a part of the ancestral susceptibility model for common disease. It was first formulated about type II diabetes and obesity, and goes something like this: imagine a normal distribution of a quantitative trait like the efficiency with which excess calories are converted to fat. Some people quickly convert all of their excess to fat, some less efficiently, and some not efficiently at all. In an environment of scarce food resources, the first group has the clear advantage: in times of famine, they can live off the fat they stored from before. So those variants that predispose one to be on the right half of the curve (if we define the x-axis as efficiency) are favored.

Now enter agriculture, which allows food from good times to be stored and eaten in rough times. The selective pressure for fat storage efficiency is somewhat relaxed, so new mutations that predispose one to be more on the left half of the curve can gain traction in the population. Finally, enter a "modern" diet of, most importantly, lots of sugar. In this environment, the people on the right half of the curve, with their "thrifty" ancestral genotype, well, get fat, while people with the new derived genotype don't as much.

To extend the "thrifty genotype" hypothesis to other traits, one needs a trait that was adaptive in an ancient environment but is neutral or even selected against now. One example is susceptibility to hypertension: in the expansion to colder climates, selection against ancestral alleles adapted to hotter climates has given rise to the current differential susceptibility to hypertension between populations of African and European origin.

2. Uncovering cryptic genetic variation (for a review, see here[pdf]). The premise here is that common disease is a result of the uncovering, through either genetic or environmental perturbation, of variation that wouldn't normally contribute to human health. That is, alleles influencing common disease are only conditionally deleterious or beneficial. Following the obesity example from above, operating in this paradigm would lead to the hypothesis that alleles that don't affect fat storage efficiency in normal circumstances do play a role when the environment is perturbed (i.e. in the presence of a high-sugar diet). One possible example: a mutation in the regulatory region of the gene coding for the dopamine transporter DAT1 (which has a frequency of ~70%, suggesting it may be ancestral) is associated with cocaine dependence. This variant slightly affects expression of the DAT1 gene under normal conditions (caveat: I mean normal tissue culture conditions), but in the presence of cocaine, that difference is amplified. Thus, the variation in expression is only made obvious in an altered condition.

If this paradigm holds, it would mean that many of the alleles (either derived or ancestral) contributing to common disease are neutral except in certain circumstances. So if the environment that exposes this variation is our modern environment (long lifespans, high food availability, etc.), the alleles have been neutral for most of history.

In general, Eyre-Walker et al. are operating in a world where an amino acid-changing mutation arises with a certain selection coefficient, and where this selection coefficient is constant until the mutation is removed from the population. If common diseases are caused by these new, deleterious amino acid changes, their conclusion that it will be very difficult to locate disease alleles is fair. But, as I've tried to show, other models for the allelic architecture of common disease are certainly plausible, and perhaps even more likely.
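To illustrate the contrast between a constant selection coefficient and a conditionally deleterious allele, here is a toy sketch. It uses a deterministic haploid selection recursion with no drift and no mutation, which is much simpler than anything in the paper; the function names, starting frequency, selection coefficient, and the generation at which the environment "switches" are all assumptions I made up for illustration.

```python
def next_freq(p, s):
    """One generation of deterministic haploid selection.

    The focal allele has relative fitness 1 - s; the alternative has fitness 1.
    """
    return p * (1.0 - s) / (1.0 - s * p)


def trajectory(p0, s_by_generation):
    """Allele frequency over time, given one selection coefficient per generation."""
    freqs = [p0]
    for s in s_by_generation:
        freqs.append(next_freq(freqs[-1], s))
    return freqs


GENERATIONS = 500
S = 0.01       # hypothetical selection coefficient against the allele
SWITCH = 490   # hypothetical generation at which the environment changes
P0 = 0.2       # hypothetical starting frequency

# Constant model: the allele is deleterious in every generation.
constant = trajectory(P0, [S] * GENERATIONS)

# Conditional model: neutral until the environment changes, deleterious after.
conditional = trajectory(P0, [0.0] * SWITCH + [S] * (GENERATIONS - SWITCH))

print(f"constant selection:    {constant[-1]:.4f}")
print(f"conditional selection: {conditional[-1]:.4f}")
```

Under constant selection the allele ends up rare, while the conditionally deleterious allele is still common when it starts to matter. Real populations have drift, so a neutral allele would wander rather than sit still, but the qualitative difference is the point.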