Monday, October 02, 2006

AT --> GC mutations, recombination and selection   posted by JP @ 10/02/2006 07:31:00 PM


In the first comment on my last post, RPM brings up an interesting point, which I'm going to develop a little more here:

First, let's hypothesize that a non-protein-coding region of the genome is transcribed into some sort of functional RNA. As the example of such an RNA on the left shows, the molecule has a secondary structure formed by interactions between certain bases in the sequence, and the interacting bases can be encoded far away from each other. In proteins, amino acids that interact in the structure of the protein can evolve together. Perhaps there's a similar effect in the evolution of RNA genes.

If two bases in an RNA molecule are to evolve together, this implies some sort of epistatic interaction between them, in that a change at both sites is more fit than a change at only one site. However, the sweep of a beneficial mutation is greatly impeded by the presence of another beneficial mutation in the same area but on a different background, an effect known as Hill-Robertson interference. The only way multiple beneficial mutations can sweep to fixation together is if they're on the same haplotype, which can only happen if recombination brings them together.
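To make the epistasis point concrete, here's a toy sketch in Python. Everything in it is an assumption for illustration: the function names, the 10% fitness penalty for a mismatch, and the simplification that only Watson-Crick pairs count as paired (real RNA also tolerates G-U wobble pairs). It scores one stem position of a hypothetical RNA and shows that the double mutant is fitter than either single mutant.

```python
# Toy model of one base-paired stem position in a hypothetical RNA.
# Assumption: only Watson-Crick pairs count as "paired" (G-U wobble is ignored).
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def pair_fitness(base5, base3, penalty=0.1):
    """Relative fitness: 1.0 if the two bases pair, else 1 - penalty.

    The penalty value is arbitrary and chosen only for illustration.
    """
    return 1.0 if (base5, base3) in WATSON_CRICK else 1.0 - penalty


def epistasis(w_ancestral, w_single_a, w_single_b, w_double):
    """Multiplicative epistasis: positive when the double mutant is fitter
    than expected from the two single mutants."""
    return w_ancestral * w_double - w_single_a * w_single_b


w_ancestral = pair_fitness("A", "U")  # ancestral pair, intact
w_single_a = pair_fitness("G", "U")   # first site mutated only: mismatch here
w_single_b = pair_fitness("A", "C")   # second site mutated only: mismatch
w_double = pair_fitness("G", "C")     # both sites mutated: pairing restored

eps = epistasis(w_ancestral, w_single_a, w_single_b, w_double)
print(w_ancestral, w_single_a, w_single_b, w_double, round(eps, 2))
```

In this sketch each single change breaks the pair and the second change restores it, so the two sites are only "fit" together, which is the kind of interaction the argument above relies on.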

Finally, note that recombination in a region is correlated with, and possibly causes, increased GC content in the region.

So, when Pollard et al. look for small (around 100 bp) regions with accelerated evolution in humans, they're essentially looking for small regions where a large number of mutations have gone to fixation. By the logic above, the only regions where this could possibly occur are regions with high recombination. Further, regions with high recombination necessarily have a bias towards AT→GC mutations. So was their result (accelerated regions are found in areas of high recombination, most changes are AT→GC) a predictable outcome of their approach? Hindsight is 20-20.
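For anyone who wants to check this kind of bias on their own alignments, here's a minimal sketch of how the substitutions could be tallied. It is not Pollard et al.'s method; the function name, the category labels, and the two short sequences are all made up for illustration, and it assumes you already have an inferred ancestral sequence aligned to the derived (human) one.

```python
WEAK = set("AT")    # bases that form two hydrogen bonds
STRONG = set("GC")  # bases that form three hydrogen bonds


def count_substitutions(ancestral, derived):
    """Tally AT->GC, GC->AT and other substitutions between two aligned
    sequences of equal length. Gaps and ambiguous bases are skipped."""
    if len(ancestral) != len(derived):
        raise ValueError("sequences must be aligned to the same length")
    counts = {"AT_to_GC": 0, "GC_to_AT": 0, "other": 0}
    for a, d in zip(ancestral.upper(), derived.upper()):
        if a == d or a not in "ACGT" or d not in "ACGT":
            continue  # unchanged site, gap, or ambiguous base
        if a in WEAK and d in STRONG:
            counts["AT_to_GC"] += 1
        elif a in STRONG and d in WEAK:
            counts["GC_to_AT"] += 1
        else:
            counts["other"] += 1  # A<->T or G<->C changes
    return counts


# Hypothetical toy sequences, not real data.
print(count_substitutions("ATTAGCAT", "GTCAGTAA"))
```

An excess of AT_to_GC over GC_to_AT in a region is the pattern described above; whether it reflects selection or the recombination-associated bias is exactly the open question.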

Note that this is not at all an argument against these regions being under selection, just a comment that the only selected regions they should expect to find are a subset of all selected regions: those with the specific properties noted above.