Gene Expression: Response, heritability and selection (R = h² * S), little bits and reiterations

Tuesday, November 08, 2005

Response, heritability and selection (R = h² * S), little bits and reiterations posted by Razib @ 11/08/2005 08:32:00 PM

A few more thoughts on R = h² * S. First, h² isn't "squared"*; it is just the narrow-sense heritability, the proportion of the phenotypic variance that is due to additive genetic variance. For the prediction equation to be valid you need a continuous quantitative trait, in other words not the on-off kind of trait we are familiar with from the "Mendelian" world. The trait should be well characterized by a normal distribution, the canonical bell curve. A classic Fisherian perspective would hold that this distribution is simply a reflection of the central limit theorem: numerous loci of small effect are equivalent to independent random variables whose sum approaches a normal distribution, with a range of values around a central value. This last point is important: heritable traits (where some of the variation is due to additive genetic factors) are usually not extremely unequivocal in their fitness implications. The reasoning is simple: there is still heterogeneity at their loci. If the fitness deviation for a given allele were either extremely positive or extremely negative, that allele would be driven to fixation or purged. Over time, one would assume that all the polymorphism across the loci (a locus counts as polymorphic when the frequency of its modal allele is less than 95%, or 99%, depending on who you talk to) would be expunged. The "random variable" wouldn't be so random and variable, and you would see a concomitant reduction in the additive genetic variance. Eventually, the trait would become classically genetic, like the number of fingers you have (this kid aside, all humans are pretty similar on this trait).
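The central limit argument above can be checked with a small sketch. It assumes a deliberately idealized trait: every locus is diploid, every allele copy is independently a "+" allele with the same probability p, and effects are purely additive (no linkage, dominance or epistasis). Under those assumptions the count of "+" alleles is exactly binomial, and the function names here are made up for illustration.

```python
from math import comb, exp, pi, sqrt


def genotype_distribution(n_loci, p=0.5):
    """Exact distribution of the number of '+' alleles over n_loci diploid loci.

    Assumes each of the 2 * n_loci allele copies is independently '+' with
    probability p, so the count follows a binomial distribution.
    """
    copies = 2 * n_loci
    return [comb(copies, k) * p**k * (1 - p) ** (copies - k)
            for k in range(copies + 1)]


def normal_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and sd."""
    return exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * sqrt(2 * pi))


def max_gap_to_normal(n_loci, p=0.5):
    """Largest absolute gap between the exact probabilities and the
    normal density with the same mean and variance."""
    copies = 2 * n_loci
    mean = copies * p
    sd = sqrt(copies * p * (1 - p))
    probs = genotype_distribution(n_loci, p)
    return max(abs(q - normal_pdf(k, mean, sd)) for k, q in enumerate(probs))


# The gap shrinks as more loci of small effect contribute to the trait.
for n in (1, 5, 50):
    print(n, max_gap_to_normal(n))
```

With one locus you get the three Mendelian classes; with fifty, the histogram is already hard to tell apart from a bell curve.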

Two heritable traits are height and IQ. They are continuous quantitative traits: you can peg a number to them on a distribution chart. Both exhibit a classical bell curve (assuming the population isn't subject to famine, steroids or high stakes testing!). Though one would assume that being taller and smarter would make you more "fit," the fact that there is a great deal of intra- and interpopulational variance on these traits should make you reconsider this assertion. Larger people have greater caloric needs. The brain is also an energy hog. Additionally, there might be correlated responses in other traits which would cancel out the fitness boosts as one moves up the distribution on a trait (extremely tall men often have health problems, extremely smart people are often pretty weird). One could posit a host of frequency dependent selective effects; in other words, variation is preserved because a fitness equilibrium is reached between the various alleles (you get too common, and your "edge" is lost). And of course, there might be a lot of variation as a function of time, so that periodic famines increase the fitness of alleles which confer small size on males (who then have lower caloric needs) before those alleles can be purged from the gene pool. Many stories can be spun from this one distribution.
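To make the "you get too common, and your edge is lost" idea concrete, here is a toy sketch of negative frequency dependent selection. Everything in it is an assumption chosen for illustration: a single haploid locus with two alleles, a linear fitness function, and a selection strength s of 0.1. It is not a model of height or IQ.

```python
def next_frequency(p, s=0.1):
    """One generation of negative frequency dependent selection.

    p is the frequency of allele A1 at a haploid two-allele locus.
    Each allele is fitter when it is the rarer one (toy linear fitnesses).
    """
    w_a1 = 1.0 + s * (1.0 - 2.0 * p)  # A1 gains an edge when p < 0.5
    w_a2 = 1.0 - s * (1.0 - 2.0 * p)  # A2 gains an edge when p > 0.5
    mean_fitness = p * w_a1 + (1.0 - p) * w_a2
    return p * w_a1 / mean_fitness


def simulate(p0, generations=500, s=0.1):
    """Iterate the recursion from a starting frequency p0."""
    p = p0
    for _ in range(generations):
        p = next_frequency(p, s)
    return p


# Whether A1 starts rare or common, it is pulled toward an intermediate
# frequency instead of being purged or fixed.
print(simulate(0.05), simulate(0.95))
```

In this toy version both starting points settle at the same intermediate frequency, so the polymorphism (and the variance it contributes) is maintained rather than exhausted.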

Now, one implication of breeding a population using the prediction equation is that heritability should decrease if you are selecting strongly for a trait, because you are exhausting the additive genetic variation (you are purging the diversity from the loci). In practice, this often takes a while. Additionally, the range in phenotype you begin with is not necessarily a fundamental bound on the range in phenotype for the population. That is, the highest value within a population at generation t might be lower than the mean value at generation t + 1000, even without the addition of new alleles via mutation or migration. The reasoning is simple: in the initial population extremely rare alleles might never have combined to express extreme phenotypes, but as you select for particular alleles via their average effects on the phenotype you are shifting the underlying gene frequencies and altering the architecture. Eventually, extremely rare alleles might not be so rare, and so combinations which once might have appeared once every few generations in a large population could theoretically become modal! An important point to remember, though, is that continuous traits are only roughly normal. This matters because of the "fat tail" tendency of some of these traits, like IQ. There are many more high IQ individuals than a real normal distribution would predict, and the discrepancy tends to increase in magnitude as you slide up the scale. Remember that earlier we simply modeled the distribution a priori as being due to independent additive effects across loci? That is the theory; the reality is that there are almost certainly alleles of larger effect and dependencies across loci. Some of the non-linear kinks could be due to epistasis on the genetic level. Or, on the extra-genetic level, they could be due to gene-environment interactions (imagine an allele which is subject to a norm of reaction which modulates its magnitude of effect as well as its propensity toward epistasis). And of course, as Michael pointed out, small deviations of the mean even in an ideal normal distribution can have outsized effects at the extremes (check it yourself computationally; I kept trying to get people to do this during the Summers fiasco).
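For anyone who wants to do that computational check, a minimal sketch follows. It assumes two perfectly normal distributions with the same standard deviation, one of them shifted up by 0.2 standard deviations; the 0.2 figure is an arbitrary illustrative value, not an estimate for any real trait or group.

```python
from math import erfc, sqrt


def fraction_above(threshold, mean=0.0, sd=1.0):
    """Fraction of a normal distribution lying above `threshold`."""
    z = (threshold - mean) / sd
    return 0.5 * erfc(z / sqrt(2))


def tail_ratio(threshold, mean_shift, sd=1.0):
    """Over-representation of the shifted distribution above `threshold`,
    relative to an unshifted one with the same sd."""
    return (fraction_above(threshold, mean_shift, sd)
            / fraction_above(threshold, 0.0, sd))


# The same small shift in the mean matters more the further out you look.
for cutoff in (1, 2, 3, 4):
    print(cutoff, tail_ratio(cutoff, mean_shift=0.2))
```

The ratio is close to 1 near the middle of the distribution and keeps growing as the cutoff moves out into the tail, which is the whole point of the exercise.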

Now, I've tried to make this concrete, but I want to emphasize that quantitative traits and additive genetic variation are not simply empirical laws and phenomena; they can be derived a priori from basic genetics. Rather than repeat the details, I invite those with an interest to flip to chapter 9, the last, of Principles of Population Genetics (a lot of mundane multiplication of fractions if you want to know the truth!). Or, for the more historically inclined, I suggest R.A. Fisher's classic 1918 paper, The Correlation between Relatives on the Supposition of Mendelian Inheritance (PDF), where he manages to elucidate how continuous quantitative traits emerge out of the cloud of discrete Mendelian genetic loci (please note, Fisher received a degree in mathematics, and his first postgraduate academic training was in statistical mechanics in the context of physics).

Finally, I just noticed in Introduction to Quantitative Genetics that the halfway point of the theoretical response to selection, assuming an ideal populational additive genetic architecture and a selected proportion per generation of 1/2 (see note 1 below), comes at roughly 1.4N_e generations, N_e being the effective breeding population, while for recessive genes it is roughly 2N_e generations. If you look it up, it is a bit more nuanced than that, but it is a first approximation that emphasizes that the size of the population is an important factor to consider because of the range of standing genetic variation that the population brings to the table.
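As a back-of-the-envelope sketch of that rule of thumb, the snippet below simply multiplies N_e by the two factors quoted above. It assumes the halfway point is measured in generations, and it ignores all of the nuance the textbook adds; the function name and the N_e values are made up for illustration.

```python
# Rough multipliers quoted in the text for the halfway point of the response.
HALF_LIFE_FACTOR = {"additive": 1.4, "recessive": 2.0}


def half_life_generations(n_e, gene_action="additive"):
    """First-approximation number of generations to reach half of the
    total theoretical response, for effective population size n_e."""
    if n_e <= 0:
        raise ValueError("n_e must be positive")
    return HALF_LIFE_FACTOR[gene_action] * n_e


# A small breeding population runs through its variation far sooner.
for n_e in (10, 100, 1000):
    print(n_e,
          half_life_generations(n_e),
          half_life_generations(n_e, "recessive"))
```

The only thing this is meant to show is the linear dependence on N_e: ten times the breeding population buys roughly ten times as many generations of response.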

In this age when the soldiers of Creationism are marchin' it behooves us to remind our citizens that evolutionary theory as it emerged out of the Modern Synthesis has a strong mathematical basis. Creationists often reply that they accept microevolution as opposed to macroevolution, but the border between the two is often as fluid as the continuity which characterizes quantitative traits. Macroevolution is predicated on species level distinctiveness, which implies barriers to the flow of alleles between populations because of mating barriers. In fact, even something as concrete as "species" can be tendentious and fluid, as opposed to a precise type (or "kind"). In the early 1990s genetic tests confirmed that there were three hybrid "Fin-Blue" whales, and that one of these hybrids was a pregnant female (page 364 of Molecular Markers, Natural History and Evolution). A few weeks ago I listened with amusement as a developmental biologist rattled off exactly how he would go about hybridizing a human with a chimpanzee and what sort of chromosomal rearrangements would need to occur for a fertile hybrid to emerge from the offspring of the F1 generation. After I observed that the process seemed pretty clear in his head, the researcher smiled and explained that he was "simply extrapolating" from what he knew in regards to . Sure. Are humans fish? Actually, cladistically we are (lobe-fin fish that is, as opposed to ray-fin fish).

Addendum: A few weeks ago Jim opined that this blog was "diving deep into technical genetic jargon." 1) I don't think that something like the breeder's prediction equation should be intimidating (no calculus here!), and 2) the jargon is highly salient to our everyday life and to our discussion of public policy. I simply invite those for whom this is jargonistic and technical to bear with me. I'm not the best communicator, but I think these concepts need to be part of the toolkit of many more intelligent people. Frankly, too many biologists don't know basic population genetics! Speciation, the differences and shifts between species, is really cool and charismatic, but microevolution within a species, especially the human species, is a very important topic as well. And of course my interests in history and religion haven't disappeared, and I will be posting more on those topics in the near future. Just as I think non-scientists need to know about science (the means of much of modernity), I think scientists should know more about non-science (the ends of modernity).

Update: In the comments David B points out:

IIRC, heritability is symbolised as h^2 (not h) because back in the early days h was used for a correlation coefficient between phenotype and genotype, while h^2, the square of that correlation coefficient, measured the proportion of variance 'explained' by the correlation. (This is a standard statistical result: if r is the correlation between a and b, the variance of a for given b, or vice versa, is (1 - r^2)V, where V is the full variance of the relevant population.) But when someone (Wright?) later introduced the term heritability, they were stuck with the existing symbolism, which was too well-established to change.

Obviously if h² explains the proportion of variation of the phenotype, that sounds rather like r², the coefficient of determination, which quantifies the proportion of the variation of Y due to X. I simply wanted no one to get confused and actually take a heritability value, 0.5, and square it, when actually calculating R from h² and S.
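To spell out the arithmetic, here is a minimal sketch of the prediction equation in use. The population mean of 100 and the selected-parent mean of 110 are invented numbers; the heritability of 0.5 is the example value mentioned above, and it goes into the equation as is.

```python
def selection_differential(population_mean, selected_parents_mean):
    """S: how far the chosen parents sit from the population mean."""
    return selected_parents_mean - population_mean


def response_to_selection(h2, s):
    """R = h2 * S, with h2 the narrow-sense heritability used directly."""
    if not 0.0 <= h2 <= 1.0:
        raise ValueError("heritability must lie between 0 and 1")
    return h2 * s  # note: h2 is NOT squared again here


population_mean = 100.0   # hypothetical trait mean
parents_mean = 110.0      # hypothetical mean of the selected parents
h2 = 0.5                  # heritability value, plugged in unchanged

s = selection_differential(population_mean, parents_mean)
r = response_to_selection(h2, s)
print("S =", s, "R =", r, "expected offspring mean =", population_mean + r)
```

Squaring the 0.5 by mistake would give a response of 2.5 instead of 5, i.e. half of what the equation actually predicts.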

1 - As the proportion of the population selected in each generation drops, S increases, but N_e drops. The relations above are meant to give a rough measure; the main point is that there is a dependency on the number of breeding individuals that could contribute to additive genetic variance.
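The first half of that trade-off can be put in numbers with a short sketch. It assumes truncation selection on a perfectly normal trait (everyone above a cutoff breeds, nobody below it does), in which case S in units of the phenotypic standard deviation is the normal density at the cutoff divided by the proportion selected. The population size of 1000 is an invented figure, and the number of parents is only a crude stand-in for N_e.

```python
from statistics import NormalDist


def selection_intensity(proportion_selected):
    """S / sigma_P under truncation selection on a normal trait."""
    if not 0.0 < proportion_selected < 1.0:
        raise ValueError("proportion_selected must be strictly between 0 and 1")
    std_normal = NormalDist()  # mean 0, sd 1
    cutoff = std_normal.inv_cdf(1.0 - proportion_selected)
    return std_normal.pdf(cutoff) / proportion_selected


population_size = 1000  # hypothetical number of candidates per generation
for proportion in (0.5, 0.1, 0.01):
    parents = int(population_size * proportion)
    print(proportion, selection_intensity(proportion), parents)
```

Tightening the cutoff raises S, but each step also cuts the number of individuals left to breed.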
