Gene Expression: Genome-wide association studies in the UK


Wednesday, June 06, 2007

Genome-wide association studies in the UK posted by p-ter @ 6/06/2007 05:59:00 PM

Results from the most ambitious (and expensive) set of genome-wide association studies for common diseases were published today in Nature (open access! You can read it for free!). Funded by the Wellcome Trust in the UK, a "dream team" of clinical geneticists and statisticians assembled a common set of 3000 controls to compare genetically to around 2000 cases each of Type I diabetes, Type II diabetes, rheumatoid arthritis, coronary artery disease, Crohn's disease, bipolar disorder, and hypertension.

This study is being trumpeted as a major success, and to some extent, it is-- for all diseases except hypertension, at least one strong signal and many weaker signals were identified. As correlational studies are largely hypothesis-generating, some of these will lead to major discoveries about the pathology of disease. In Crohn's disease, for example, the consortium has found a couple of loci involved in autophagy and the elimination of intracellular bacteria. They also confirm the association of another locus involved in autophagy. It's easier for people working on a disease to focus on pathways that are already known to be involved in the disease (for example, there's a known autoimmune component to Crohn's disease); it often takes this kind of top-down study to jolt people out of complacency.

The consortium has also made publicly available an impressive suite of software, along with new algorithms for genotype calling and multi-locus association, incorporating information from the HapMap. These tools are certainly at the cutting edge, and represent major advances in their own right.

On the other hand, one can't help but notice that the loci identified contribute only a fraction of the known genetic component of these diseases. This is a proof of principle-- the base has been laid; it's now feasible to scale these sorts of case-control studies up to tens of thousands of individuals. But is that really the most effective way of getting at the genetic basis of these diseases? Perhaps not.

A final comment-- I noted in the comments of a previous post that the big data sets used for population genetics these days are generated for medical reasons. There's a ton of population genetic information here, which the authors are likely going to make more use of in the future. They do give us a glimpse, though-- they note a number of genomic regions that show marked geographic variability within the UK (and note they limit themselves to self-identified "white Europeans"):

Thirteen genomic regions showing strong geographical variation are listed in Table 1, and Supplementary Fig. 7 shows the way in which their allele frequencies vary geographically. The predominant pattern is variation along a NW/SE axis. The most likely cause for these marked geographical differences is natural selection, most plausibly in populations ancestral to those now in the UK. Variation due to selection has previously been implicated at LCT (lactase) and major histocompatibility complex (MHC), and within-UK differentiation at 4p14 has been found independently, but others seem to be new findings. All but three of the regions contain known genes. Aside from evolutionary interest, genes showing evidence of natural selection are particularly interesting for the biology of traits such as infectious diseases; possible targets for selection include NADSYN1 (NAD synthetase 1) at 11q13, which could have a role in prevention of pellagra, as well as TLR1 (toll-like receptor 1) at 4p14, for which a role in the biology of tuberculosis and leprosy has been suggested.

Labels: Association, Genetics, Population genetics