Sunday, December 06, 2009

Finding the missing heritability   posted by ml @ 12/06/2009 09:31:00 PM

In a recent special issue of The Economist magazine, evolutionary psychologist Geoffrey Miller of the University of New Mexico writes that there is a "looming crisis in human genetics". Setting aside a number of mistakes Miller makes, a core truth he reports is that to date most genetic variants that have been associated with complex diseases such as diabetes and complex phenotypes such as height can account for only a small percentage of the estimated genetic contribution to population variation in these traits (~5% in total for the cases of type-2 diabetes and height). This residual unexplained variability has been dubbed the "missing heritability".

The missing heritability is an open and fascinating research question, not a crisis, although it has become fashionable to characterize it as such. First, it's important to realize that a primary goal of genetic studies of human disease is to identify the biological underpinnings of these diseases and thus advance work toward improved diagnoses, cures, and mechanisms of prevention. Important work has already been done towards these ends with many new and unexpected biological pathways now associated with diseases thanks to the recent successes of genome-wide association studies (GWAS). Thus, it may not be necessary to explain all of the missing heritability in order to make great strides towards these goals.

Second, first-generation GWAS were not realistically expected to find loci of large effect, because they focused on common variants, although the potential benefits of first-generation GWAS may have been oversold by some. Nonetheless, first-generation GWAS have explained a substantial fraction of the heritability of some traits, such as age-related macular degeneration.

Third, where should we look for the missing heritability? A recent review in Nature offers some suggestions:
Many explanations for this missing heritability have been suggested, including much larger numbers of variants of smaller effect yet to be found; rarer variants (possibly with larger effects) that are poorly detected by available genotyping arrays that focus on variants present in 5% or more of the population; structural variants poorly captured by existing arrays; low power to detect gene–gene interactions; and inadequate accounting for shared environment among relatives. Consensus is lacking, however, on approaches and priorities for research to examine what has been termed 'dark matter' of genome-wide association—dark matter in the sense that one is sure it exists, can detect its influence, but simply cannot 'see' it (yet). Here we examine potential sources of missing heritability and propose research strategies to illuminate the genetics of complex diseases.

A key consideration is the genetic architecture of each trait--the number, type, effect size and frequency of variants affecting a trait. Because the genetic architecture of a complex trait cannot be known a priori (although theory can suggest which architectures are more probable), it is an open question as to which approach to finding the missing heritability will yield the most success. Until more approaches are attempted, it is premature to predict a crisis. This does not mean, however, that success is guaranteed. It is possible that the genetic architecture of some complex traits is too complex to dissect practically with current methods. Rather than a crisis, we might instead expect more slow but steady progress in upcoming years. Some traits will be more amenable to new methods (as age-related macular degeneration was to first-generation GWAS). Other traits will likely remain largely intractable.
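
To make the idea of genetic architecture concrete, here is a rough sketch of how allele frequency and effect size together determine how much trait variance a variant explains. It uses the standard additive-model result that a biallelic variant with allele frequency p and per-allele effect beta (in phenotypic standard deviation units) explains about 2p(1-p)beta^2 of the variance. It assumes Hardy-Weinberg equilibrium, purely additive effects, and independent variants so that contributions can simply be summed. The function names and the two example architectures are hypothetical, invented for illustration, and are not taken from any study mentioned above.

```python
def variance_explained(freq, beta):
    """Fraction of phenotypic variance explained by one additive variant.

    freq: allele frequency, between 0 and 1.
    beta: per-allele effect in phenotypic standard deviation units.
    Assumes Hardy-Weinberg equilibrium and a purely additive effect.
    """
    if not 0.0 <= freq <= 1.0:
        raise ValueError("allele frequency must be between 0 and 1")
    return 2.0 * freq * (1.0 - freq) * beta ** 2


def total_variance_explained(variants):
    """Sum the contributions of (freq, beta) pairs.

    Assumes the variants are independent (no linkage disequilibrium
    and no gene-gene interaction), which is a simplification.
    """
    return sum(variance_explained(freq, beta) for freq, beta in variants)


# Two made-up architectures (hypothetical numbers, for illustration only).
many_common_small = [(0.30, 0.03)] * 200   # 200 common variants, tiny effects
few_rare_large = [(0.005, 0.50)] * 20      # 20 rare variants, large effects

if __name__ == "__main__":
    for name, arch in [("many common, small effect", many_common_small),
                       ("few rare, large effect", few_rare_large)]:
        print(f"{name}: {total_variance_explained(arch):.4f}")
```

The point of the toy numbers is that a rare variant can have a large effect on the people who carry it and still explain very little population variance, while any single common variant of small effect is hard to detect, so quite different architectures can leave a similar amount of heritability unaccounted for.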


Wednesday, September 30, 2009

Sweeps & Markov Models   posted by Razib @ 9/30/2009 01:53:00 PM

Detecting Selective Sweeps: A New Approach Based on Hidden Markov Models. Over at Mailund on the Internet.


Tuesday, August 18, 2009

In defense of big genetics   posted by p-ter @ 8/18/2009 08:19:00 PM

Greg Mayer, filling in for Jerry Coyne, has a post up on a somewhat odd objection to the appointment of Francis Collins as director of NIH: that he's a geneticist. The argument seems to be that diseases are complicated and not entirely genetic, and that Collins isn't hip to non-genetic subtleties. To be frank, this is silly--while it's sometimes a revelation to non-biologists that the "gene for X" way of framing things is inaccurate, Collins is not incompetent. If I had to guess what direction he's planning on taking the NIH, I'd look to what he's actually written.

In the comments to the post, there's the additional worry that Collins represents "big science", which I suppose is considered to be a bad thing (apparently Collins thinks it would be nice to catalogue all the transcripts in a cell, which for whatever reason really pissed off this dude). It's not a bad thing at all; in many cases, big, relatively hypothesis-free science is actually really nice for rationally choosing which "little science" projects to pursue.

Let's take a couple of recent examples from genome-wide association studies (these studies over the past few years have exponentially increased our understanding of complex disease; whether that exponential increase is enough for you depends on your prior expectations). First, a little over two years ago, an association was found between a genetic variant in the FTO gene and obesity in humans. At the time, the gene had unknown function. Now, there's a mouse model and focused biochemical analysis being done on this gene, and we're light years closer to understanding what it does and how nearby variation influences obesity. Would all of this have been done without the "hypothesis-free" GWAS? Not anytime soon.

Second, consider the genome-wide association studies in several cancers that all pointed to the same, gene-free region on chromosome 8. In the last few weeks, three separate groups have published their "small-scale" molecular biology work establishing that the associated region appears to be an enhancer important for either proper temporal or spatial gene expression. How does it work? It's not clear, but that's the point--this is an interesting question. Much of "small-scale" molecular biology is done in a few model systems, or on a few "popular" genes. There's a very good reason for this--these systems or genes are already known to be interesting either scientifically or medically. One efficient way to identify novel, potentially interesting systems is through large-scale work.


Saturday, August 08, 2009

Genome sequencing shop talk   posted by p-ter @ 8/08/2009 10:43:00 AM

There's a nice post over at Genetic Future getting into the details of a recent paper using ABI SOLiD to resequence a human genome. The comments are quite instructive as well. For those not dealing with these sorts of technologies regularly, it can all seem a bit incomprehensible, but the outcome of these sorts of debates will determine who dominates the sequencing business for the next few years...


Monday, June 08, 2009

Mobile genetic elements in the woolly mammoth   posted by Razib @ 6/08/2009 11:27:00 PM

Cool new paper about ancient DNA, Mobile DNA Elements In Woolly Mammoth Genome Give New Clues To Mammalian Evolution:
The woolly mammoth died out several thousand years ago, but the genetic material they left behind is yielding new clues about the evolution of mammals. In a study published online in Genome Research, scientists have analyzed the mammoth genome looking for mobile DNA elements, revealing new insights into how some of these elements arose in mammals and shaped the genome of an animal headed for extinction.

The paper isn't online yet, but it will be here. Kind of mind-blowing that we might know so much about the genomics of an extinct organism. We've come a long way since E. B. Ford.


Wednesday, April 15, 2009

Personal genomics & NEJM   posted by Razib @ 4/15/2009 06:24:00 PM

Multiple articles on personal genomics in The New England Journal of Medicine: Genomewide Association Studies and Human Disease; Common Genetic Variation and Human Traits; Genomewide Association Studies - Illuminating Biologic Pathways; and Genetic Risk Prediction - Are We There Yet? Nick Wade has a piece on these articles in The New York Times.

Related: Preparing doctors for the genomic tsunami, Linkage versus association: a mini-primer, A note on the Common Disease-Common Variant debate, Common disease, common variant and Common disease, common variant.


Wednesday, August 27, 2008

Why are Finns anxious?   posted by Razib @ 8/27/2008 12:44:00 PM

An Association Analysis of Murine Anxiety Genes in Humans Implicates Novel Candidate Genes for Anxiety Disorders:
Specific alleles and haplotypes of six of the examined genes revealed some evidence for association (p ≤ .01). The most significant evidence for association with different anxiety disorder subtypes were: p = .0009 with ALAD (δ-aminolevulinate dehydratase) in social phobia, p = .009 with DYNLL2 (dynein light chain 2) in generalized anxiety disorder, and p = .004 with PSAP (prosaposin) in panic disorder.

Furthermore, the team's international collaborators in Spain and the United States are trying to replicate these findings in their anxiety disorder datasets to see whether the genes identified by Finnish scientists predispose to anxiety disorders in other populations as well. Only by replicating the results firm conclusions can be drawn about the role of these genes in the predisposition to anxiety in more general.

Haplotter shows selection around ALAD for Africans. PSAP is interesting:
This gene encodes a highly conserved glycoprotein which is a precursor for 4 cleavage products: saposins A, B, C, and D. Each domain of the precursor protein is approximately 80 amino acid residues long with nearly identical placement of cysteine residues and glycosylation sites. Saposins A-D localize primarily to the lysosomal compartment where they facilitate the catabolism of glycosphingolipids with short oligosaccharide groups. The precursor protein exists both as a secretory protein and as an integral membrane protein and has neurotrophic activities. Mutations in this gene have been associated with Gaucher disease, Tay-Sachs disease, and metachromatic leukodystrophy....


Saturday, July 05, 2008

Genes underlying cognitive ability   posted by Razib @ 7/05/2008 03:35:00 PM

Ben G in the comments points to COMMON GENETIC VARIANTS UNDERLYING COGNITIVE ABILITY, a dissertation. I don't have time to read the whole thing right now, but comments welcome.


Thursday, June 19, 2008

Dogs, behavior & genomics   posted by Razib @ 6/19/2008 12:30:00 PM

A reader pointed me to this paper, Single-Nucleotide-Polymorphism-Based Association Mapping of Dog Stereotypes:
...Analysis of other morphological stereotypes, also under extreme selection, identified many additional significant loci. Less well-documented data for behavioral stereotypes tentatively identified loci for herding, pointing, boldness, and trainability. Four significant loci were identified for longevity, a breed characteristic not under direct selection, but inversely correlated with breed size. The strengths and limitations of the approach are discussed as well as its potential to identify loci regulating the within-breed incidence of specific polygenic diseases.

I've placed an important table below the fold.



Monday, May 12, 2008

Browsing biology on the web: NextBio   posted by Razib @ 5/12/2008 01:21:00 AM

Last year p-ter put up a post pointing to useful online tools such as Haplotter. One of the great things about biology today is that so much of the data from genomics is being thrown out there within reach of the plebs. And a lot of value is being added through user interfaces which smooth the connection between you and these databases. So check out NextBio; from the FAQ:
NextBio is a life science search engine that enables researchers and clinicians to access and understand the world's life sciences information. With NextBio, in just one click you can search through tens of thousands of study results with billions of data points spanning across different experimental platforms, organisms and data types. NextBio also searches across millions of publications to help you find new articles pertaining to your query. NextBio's search engine makes massive amounts of disparate biological, clinical and chemical data from public and proprietary sources searchable, regardless of data type and origin, and empowers scientists to quickly understand their own experimental results within the context of other research.

I'm sure the slick AJAX-driven search tools are a nice Web 2.0+ pitch to investors, but the substantive element is the data. There are only so many researchers with eyeballs in the world; on occasion amateur astronomers can still pick out something new amongst the constellations, and I think that sort of dynamic also holds, to some extent, for the amount of unprocessed data that the post-genomic era has made available to us. I really encourage readers of this weblog to poke and prod around the data piles with these new tools; Web 2.0 isn't just YouTube and Facebook....

Related: VentureBeat weighed in a few weeks ago on this company....