A bold prediction: "synthetic associations" are not a panacea

There’s a bit of press surrounding the interesting result from David Goldstein’s group that, in certain situations, a number of “rare” (defined as an allele frequency less than 5% [1]) variants influencing a trait can lead to an association signal at “common” SNPs. The authors call this phenomenon a “synthetic association”.
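To make the idea concrete, here is a minimal sketch of how a rare risk variant could produce a modest signal at a common SNP. It is not the simulation from the paper. The function name, the per-chromosome multiplicative risk model, the assumption that every rare variant sits on the same common-allele background, and the example numbers are all assumptions made for illustration.

```python
def synthetic_odds_ratio(common_freq, rare_freq, haplotype_risk):
    """Allelic odds ratio at a common SNP when only rare variants affect risk.

    Assumptions (illustrative, not from the paper):
    - every rare causal variant sits on a chromosome carrying the common allele
    - a chromosome with a rare variant has relative risk `haplotype_risk`
    - controls have roughly the population allele frequency
    """
    if not (0.0 < common_freq < 1.0 and 0.0 <= rare_freq <= common_freq):
        raise ValueError("need 0 < common_freq < 1 and 0 <= rare_freq <= common_freq")

    # Case chromosomes are sampled in proportion to frequency times relative risk.
    weight_rare = rare_freq * haplotype_risk      # common allele plus rare variant
    weight_common_only = common_freq - rare_freq  # common allele, no rare variant
    weight_other = 1.0 - common_freq              # other allele, no rare variant
    total = weight_rare + weight_common_only + weight_other

    case_freq = (weight_rare + weight_common_only) / total
    control_freq = common_freq

    case_odds = case_freq / (1.0 - case_freq)
    control_odds = control_freq / (1.0 - control_freq)
    return case_odds / control_odds


# Hypothetical inputs: common allele at 30%, rare variants at 1% in total,
# fourfold risk per chromosome that carries one.
print(round(synthetic_odds_ratio(0.30, 0.01, 4.0), 3))
```

With these made-up inputs the common SNP shows an odds ratio of about 1.1, even though the common allele itself does nothing in the model.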

The authors claim this is potentially the cause of many of the associations found in genome-wide association studies (with common SNPs), as well as a potential solution to the “missing heritability problem” (this isn’t mentioned in the paper itself, but rather in a Times article describing it). In other words, this could be a panacea for all the ills of the human genetics community. Unfortunately, this seems rather unlikely.

1. There is a range of parameter values for which “synthetic associations” are plausible–where the effect of the rare variants is small enough to have avoided detection by linkage studies but big enough to show up via correlation with common variants. This range is fairly narrow–from Figure 2, it looks like maybe a set of mutations at a gene with a genotypic relative risk greater than 2 but less than 6. Will this be the case for some loci? Sure, that sounds plausible. Is it going to explain everything? No, of course not.

2. It has been pointed out (rightly) that diseases that are selected against should have their genetic component enriched for rare variants. Goldstein himself has made this argument about diseases like schizophrenia. So if schizophrenia has all these rare variants, and rare variants cause rampant “synthetic associations” at common SNPs, why hasn’t anyone picked up whopping associations using common SNPs in schizophrenia?

3. The sickle cell anemia example, as presented in the paper, is extremely misleading. It seems the authors did a simple case control test for sickle cell in an African-American population. Recall that African-Americans are an admixed population, with each individual carrying large chunks of “European” and “African” chromosomes. Anyone with sickle cell will have at least one block of African chromosome surrounding the beta-globin locus, while those without will have two chromosomes sampled from the overall distribution of chromosomes in the population–15-20% of which, approximately, will be of European descent [2]. So any SNP with an allele frequency difference between African and European populations in this region will show up as a highly significant association with the disease due to the way they’ve done the test, and these associations will extend out to the length of admixture linkage disequilibrium–well, well beyond the LD found in African populations alone. The presentation of this example in the paper–the large block of association contrasting with the small blocks of LD in the Yoruban population–is a bit silly.
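The admixture argument in point 3 can be put into rough numbers. The sketch below is only an illustration under stated assumptions, not the paper's analysis. It assumes each case carries exactly one chromosome of guaranteed African ancestry at the locus, with the other drawn from the population at large, and it ignores any LD with the sickle allele inside African populations. The function name and the example allele frequencies are hypothetical.

```python
def expected_allele_freqs(p_african, p_european, european_fraction):
    """Expected frequency of a nearby SNP allele in cases and controls.

    Assumptions (illustrative only):
    - controls: both chromosomes drawn from the admixed population
    - cases: one chromosome is African at this locus, the other is drawn
      from the admixed population
    - no LD between the SNP and the disease allele within African chromosomes
    """
    # Allele frequency on a randomly drawn chromosome in the admixed population.
    p_population = (1.0 - european_fraction) * p_african + european_fraction * p_european

    p_controls = p_population
    p_cases = 0.5 * p_african + 0.5 * p_population
    return p_cases, p_controls


# Hypothetical SNP: allele at 80% on African chromosomes and 20% on European
# ones, with 20% European ancestry at the locus in the population.
cases, controls = expected_allele_freqs(0.80, 0.20, 0.20)
print(round(cases, 3), round(controls, 3))
```

Under these assumptions the allele is more frequent in cases than in controls purely because of ancestry at the locus, and the gap grows with the African versus European frequency difference. A SNP with the same frequency in both ancestries shows no difference at all.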

If I had to guess, and put a concrete bet on how this will play out, let’s take the associations listed in their Table 1, which they call candidates for being due to synthetic associations. My bet: none of them are. Ok, maybe one.

[1] These sorts of thresholds are important to watch–in a year people will be calling things at 1% frequency “common” if it suits them for rhetorical purposes.

[2] Corrected from: “… will have two large blocks of “African” chromosomes surrounding the beta-globin locus, and everyone without will have at least one European chromosome in the same area”; see comments.

Posted in Uncategorized

Show me the data

Update: Also see p-ter at Gene Expression Classic.
Follow up on yesterday’s post on the new Dickson et al. paper from David Goldstein’s lab, A New Way to Look for Diseases’ Genetic Roots:

The Icelandic gene-hunting firm deCODE genetics, which emerged last week from bankruptcy, has long led in detecting SNPs associated with common disease. Dr. Kari Stefansson, the company’s founder and research director, agreed that whole genome sequencing would “give us a lot of extremely exciting data.” But he disputed Dr. Goldstein’s view that rare variants carried most of the missing heritability. Both deCODE genetics and scientists at the Broad Institute in Cambridge, Mass., have sequenced regions of the genome surrounding SNPs in search of rare variants, but have found very few, Dr. Stefansson said.
“We can speculate till we are blue in our faces,” he said, “but the fact of the matter is that there is no substitute for data.”

It would have been nice to get a quote from someone whose recent career hasn’t been as checkered as Stefansson’s. The issue of missing heritability is going to be interesting in the near future….

Confucius biopic

I noticed that a new biopic of Confucius just opened in China. It’s pretty obvious that they “sexed up” his life, as you can see in the trailer. In terms of a big-budget biopic it seems to me that the life of Confucius is a very thin source of blockbuster material in relation to other social-religious figures of eminence. Jesus, Moses and Buddha have supernatural aspects to their lives. Muhammad’s life offers the opportunity for set-piece battles. Confucius was in many ways a failed bureaucrat, a genius unrecognized in his own day. His life can’t be easily appreciated unless you have the proper context of his impact on Chinese history in mind. Stepping into it without a grand frame can lead one to conclude that he was quite a pedestrian man. Confucius was a man of ideas (though even those ideas can seem somewhat obscure, e.g. rectification of names). You see this in Annping Chin’s The Authentic Confucius: A Life of Thought and Politics; I can’t imagine Karen Armstrong writing such a dense and slow book.


The origins of the Yakuts

One of the more substantive consequences of the powerful new genomic techniques has been in the area of ancient DNA extraction and analysis. The Neandertal genome story is arguably the sexiest, but closer to the present day there’ve been plenty of results which have changed the way we look at the past. The input of genetics has basically demanded a revision of the contemporary consensus of the origins of the Etruscans which emerged from archaeology. Though certainly ancestry and genetic relationship are informative, ancient DNA has also given us windows into the change of function and a record of adaptation which rests less on inference. I’m thinking here of the fact that ancient inhabitants of Central Europe 7,000 years ago do not seem to have been able to process milk in the manner which is the norm in that region today; we know this because they lacked the mutation which confers lactase persistence in Europeans. There are other examples, and I assume that in the near future we’ll still see a steep exponential increase in the generation of new results as techniques get better and cheaper.
A new paper explores the demographic history of an obscure Siberian population using DNA extraction and phylogenetic analysis. The questions are historical, and relatively easy to resolve. Who are the Yakuts, and where did they come from? Those of you who have played Risk know that Yakutsk is a region of Siberia, and the Yakuts are residents of that region. Interestingly, the Yakuts speak a Turkic language. Here is a map which shows the modern distribution of Turkic languages (I have shaded in what is presumed to be the Turkic Urheimat):
[Map: turkdomain.png]
Within the last 2,000 years the Turkic languages have rapidly spread across Eurasia. Most of the expansion was to the south and west, but as you can see, some pushed their way into Siberia. These are the Yakuts. I have discussed the genetics of Anatolian Turks before. These Turks sit at one end of the domains of the language family, and how Anatolia came to be Turkic-speaking can tell us something about the dynamics of language change and ethnic reorientation more generally. The Yakuts may tell us something more.
Human evolution in Siberia: from frozen bodies to ancient DNA:


Common disease-common variant hypothesis taken down a notch (again)

David Goldstein, a geneticist at Duke, has critiqued the current focus on large-scale genome-wide associations before. Now he is taking it to the next step, as his group has a paper out which suggests that the reason that association studies have been relatively unfruitful in terms of bang-for-buck is that they’re picking up “synthetic associations.” Rare Variants Create Synthetic Genome-Wide Associations:

It has long been assumed that common genetic variants of modest effect make an important contribution to common human diseases, such as most forms of cardiovascular disease, asthma, and neuropsychiatric disease. Genome-wide scans evaluating the role of common variation have now been completed for all common disease using technology that claims to capture greater than 90% of common variants in major human populations. Surprisingly, the proportion of variation explained by common variation appears to be very modest, and moreover, there are very few examples of the actual variant being identified. At the same time, rare variants have been found with very large effects. Now it is demonstrated in a simulation study that even those signals that have been detected for common variants could, in principle, come from the effect of rare ones. This has important implications for our understanding of the genetic architecture of human disease and in the design of future studies to detect causal genetic variants.

The conclusion in the discussion elaborates on the relevance:

… Under our model, the causal sites are both rare and relatively high-penetrant contributors to disease, and will therefore be unlikely to be detected in a small number of control samples. Finally, the focus of attention on genes that are near GWAS signals may be incomplete or misleading in that the actual causal sites may occur in many different genes surrounding the implicated common variant. It is also worth emphasizing that as few as one or two rare variants, at much lower frequency than the associated common SNP, can create a significant synthetic association. In such a case, sequencing a small number of cases that carry the “at risk” common variant might miss entirely the causal rare variants even if the correct genome region is resequenced. These considerations argue for caution in efforts to resequence around genome-wide associations and argue instead that genome-wide sequencing in carefully phenotyped cohorts might be a better use of resources.

PLoS thought that this paper was important enough to commission an accompanying article, Common Disease, Multiple Rare (and Distant) Variants:


Bat & whale echolocation genetic convergence

In Bats and Whales, Convergence in Echolocation Ability Runs Deep:

…”However, it is generally assumed that most of these so-called convergent traits have arisen by different genes or different mutations. Our study shows that a complex trait — echolocation — has in fact evolved by identical genetic changes in bats and dolphins.”
A hearing gene known as prestin in both bats and dolphins (a toothed whale) has picked up many of the same mutations over time, the studies show. As a result, if you draw a phylogenetic tree of bats, whales, and a few other mammals based on similarities in the prestin sequence alone, the echolocating bats and whales come out together rather than with their rightful evolutionary cousins.
Both research teams also have evidence showing that those changes to prestin were selected for, suggesting that they must be critical for the animals’ echolocation for reasons the researchers don’t yet fully understand.
“The results imply that there are very limited ways, if not only one way, for a mammal to hear high-frequency sounds,” said Jianzhi Zhang of the University of Michigan, who led the other study. “The sequence convergence occurred because the amino acid changes in prestin that result in high-frequency selection and sensitivity were strongly favored in echolocating mammals and because there are [apparently] very limited ways in which prestin can acquire this ability.” Prestin is found in outer hair cells that serve as an amplifier in the inner ear, refining the sensitivity and frequency selectivity of the mechanical vibrations of the cochlea, Zhang explained.

This obviously plays into the arguments about contingency and inevitability. On the one hand the convergence across these two taxa suggests that this trait seems to result in an inevitable genetic architecture. But perhaps only for mammals. So inevitability might be a contingent aspect of evolution here. Reminds me a bit of opsins and vision.

Citation:

  1. Yang Liu, James A. Cotton, Bin Shen, Xiuqun Han, Stephen J. Rossiter, Shuyi Zhang. Convergent sequence evolution between echolocating bats and dolphins. Current Biology, 2010; DOI: 10.1016/j.cub.2009.11.058
  2. Ying Li, Zhen Liu, Peng Shi, Jianzhi Zhang. The hearing gene Prestin unites echolocating bats and whales. Current Biology, 2010; DOI: 10.1016/j.cub.2009.11.042

How much is "The Situation" worth?

‘Jersey Shore’ — MTV Tries to Divide and Conquer:

Sources tell TMZ the network has told the cast if they don’t accept MTV’s deal by the end of business Monday, they will be replaced. And, MTV has told them it does not have to be a package deal — the cast members who accept the offer will stay … those who do not can have a nice life.

As we first reported, MTV offered each cast member a $10,000 signing bonus and $5,000 per episode. We’re told the cast rejected the offer and made it clear they would all stand together and hold out for their price, though they didn’t say what it was.

MTV made a new offer of $10,000 an episode — there are 12 episodes in the new season — but so far the cast hasn’t responded.

We’re told MTV already has replacements if Snooki, Pauly D, The Situation and the others don’t accept the offer on Monday. But, we’re told, MTV is happy to mix and match if some of the cast accepts the offer and others don’t.

As for who’s being the most hard-headed in the negotiations — The Situation and Pauly D.


Rice, alcohol and genes

Changes in human diet driven by cultural evolution seem to be at the root of many relatively recently emerged patterns of genetic variation. In particular, lactase persistence and varied production of amylase are two well known cases. Both of these new evolutionary genetic developments are responses to the shift toward carbohydrates over the last 10,000 years as mainstays of caloric intake. Rice and wheat serve as the foundations of much of human civilization. It is notable that both China and India are divided into rice and wheat (or millet) belts, so essential are modes of agriculture in our categorizations of societies. Even nomad societies are dependent on carbohydrates in the form of “simple sugars,” as much of the nutritive value of milk is from its lactose sugar.
Carbohydrates are convenient because they can be grown and controlled by humans, but also because they can be stored, and finally, reprocessed. Some of that reprocessing is straightforward, such as with breads, but for this post alcohol is what we are concerned with. Tom Standage’s A History of the World in 6 Glasses documents the importance of beer & wine in ancient human societies (and hard liquor in modern ones), and also argues from both empirics and theory that fermented beverages are almost an inevitability in an agricultural society. Alcohol is rich in energy, portable, keeps, and has a far lower pathogenic load than water in a pre-modern environment. Not to mention the pleasant “buzz” it provides. But like milk, those “without tolerance” often suffer negative physiological consequences. It turns out that like LCT, the locus critical in controlling the levels of the enzyme lactase, the alcohol metabolization loci exhibit variation across populations.
A new paper is out which argues for the causal connection between the spread of rice agriculture, and a derived variant of ADH1B, The ADH1B Arg47His polymorphism in East Asian populations and expansion of rice domestication in history:


The rise of the irreligious Left

Barry Kosmin at CUNY has published the results of three surveys of American religion since 1990. These “American Religious Identification Surveys” (ARIS) were done in 1990, 2001, and 2008. One of the major findings of the ARIS has been the rise of those who avow “No Religion”. Looking through the data it is also clear that aggregating nationally understates some of the local changes. In 1990 47% of Vermonters were non-Catholic Christians (i.e., Protestants). In 2008 29% were. In 1990 13% of Vermonters had No Religion. In 2008 34% of Vermonters had No Religion! In fact, No Religion has a plurality in Vermont, with 26% of the population being Catholic. This is a much bigger shift than nationally. In Kosmin’s book One Nation Under God, which drew upon the 1990 survey results, he noted that though the Northeast has a reputation for being relatively secular, it is in fact highly confessionalized in comparison to other regions, such as the Pacific Northwest. This isn’t true anymore; much of New England has experienced a wave of rapid secularization and disaffiliation. If current rates of secularization continue Vermont may become the first majority non-Christian state. It was only 55% Christian in 2008.
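As a quick check on the Vermont figures quoted above, here is a small sketch that computes the percentage-point changes and confirms that the 2008 Christian share is the sum of the Protestant and Catholic shares. The variable names are hypothetical, and the only data used are the numbers given in this post. Treating “Christian” as just those two categories is an assumption, though it matches the 55% figure.

```python
# Vermont shares (percent of population), as quoted in the post.
vermont = {
    "non_catholic_christian": {1990: 47, 2008: 29},
    "no_religion": {1990: 13, 2008: 34},
}
catholic_2008 = 26  # the 1990 Catholic share is not given in the post

# Percentage-point change from 1990 to 2008 for each group.
changes = {group: shares[2008] - shares[1990] for group, shares in vermont.items()}

# Assumption: the Christian share is Protestant plus Catholic.
christian_2008 = vermont["non_catholic_christian"][2008] + catholic_2008

print(changes)
print(christian_2008)
```

The Protestant share fell by 18 points while No Religion rose by 21, and the two Christian categories add up to the 55% cited for 2008.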
I am going to explore these data with scatterplots and maps. All the plots have 1990 data on the X-axis and 2008 data on the Y-axis. I also look at the 1988 vs. 2008 presidential race, and attempt to see if there is any connection between secularization and political change.
