Gene Expression

Monday, February 01, 2010

"Synthetic associations" and sickle cell anemia posted by p-ter @ 2/01/2010 07:40:00 PM

Last week, I made a silly error in describing a problem in the sickle cell anemia example given by Dickson et al. (2010) as an empirical example of the phenomenon they call "synthetic association". So allow me to take a mulligan, and re-try this:

The authors performed an association study in African-Americans, using ~200 individuals with sickle cell anemia as cases, and >7,000 controls. From their description, they simply performed a logistic regression of disease status on common polymorphisms genome-wide. This turned up a large (~2.5Mb) region surrounding HBB (known to harbour the rare disease-causing mutation) as highly associated with the phenotype. This large region of association stands in contrast, they argue, to the known patterns of linkage disequilibrium in the region, which extends over a few kilobases at most.

This observation, they argue, is an empirical example of how associations due to rare variants can lead to large blocks of associations at common variants. This effect is due to the fact that haplotypes surrounding rare variants are longer and have had little time to be broken up by recombination. Under certain genetic models, this effect of "synthetic associations" is plausible; however, this example is a poor one for making their case.

The reason is that individuals with sickle cell anemia have two chromosomes of African ancestry in the region of HBB, while individuals without sickle cell anemia have approximately the background distribution of European and African chromosomes at the locus--~20% European and ~80% African. To put it another way, let X_d be the number of chromosomes of African ancestry of an individual some distance d from HBB (X_d can be 0, 1, or 2), and Y be the number of chromosomes of African ancestry of an individual at HBB. In the cases, they've conditioned on the fact that Y=2, while in the controls they have not. P(X_d) != P(X_d | Y=2), so much of their association is likely due simply to differences in ancestry between the cases and controls in the HBB region (recall that admixture linkage disequilibrium in African-Americans extends for megabases).

More concretely, any SNP near the HBB locus that happened to be fixed for opposite alleles in Europe and Africa would have a whopping 20% allele frequency difference between cases and controls in their analysis, attributable simply to differences in local ancestry. That's the extreme (and unlikely) situation, but alleles with more modest allele frequency differences between populations will show the same effect.
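The arithmetic behind that 20% figure can be sketched in a few lines. This is only an illustration under the assumptions stated in the post: cases carry African-ancestry chromosomes at the locus, and controls carry the background mix of roughly 80% African and 20% European. The function names and the "modest" frequencies are hypothetical, chosen for the example.

```python
def expected_allele_freq(freq_african, freq_european, african_ancestry_fraction):
    """Expected allele frequency on chromosomes with the given local ancestry mix."""
    return (african_ancestry_fraction * freq_african
            + (1.0 - african_ancestry_fraction) * freq_european)


def case_control_difference(freq_african, freq_european, control_african_fraction=0.8):
    """Allele frequency difference (cases minus controls) due to local ancestry alone.

    Assumption: cases carry only African-ancestry chromosomes near HBB, while
    controls carry the background mix (about 80% African in the post).
    """
    cases = expected_allele_freq(freq_african, freq_european, 1.0)
    controls = expected_allele_freq(freq_african, freq_european, control_african_fraction)
    return cases - controls


# Extreme case from the text: a SNP fixed for opposite alleles in the two populations.
extreme = case_control_difference(freq_african=1.0, freq_european=0.0)
# A more modest, hypothetical frequency difference between populations.
modest = case_control_difference(freq_african=0.6, freq_european=0.3)
print(f"fixed difference: {extreme:.2f}, modest difference: {modest:.2f}")
```

Under these assumptions the difference is simply 0.2 times the between-population frequency difference, so even a SNP with no connection to the disease shows a case-control gap whenever the two ancestral populations differ in frequency.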

To some extent, this is their point--the haplotype carrying the causal mutation is long. But the effect in this case is massively exaggerated by admixture, and the presentation of this exaggerated effect is misleading.

Labels: Genetics

Tuesday, January 26, 2010

A bold prediction: "synthetic associations" are not a panacea posted by p-ter @ 1/26/2010 08:48:00 PM

There's a bit of press surrounding the interesting result from David Goldstein's group that, in certain situations, a number of "rare" (defined as an allele frequency less than 5% [1]) variants influencing a trait can lead to an association signal at "common" SNPs. This phenomenon the authors call a "synthetic association".

The authors claim this is potentially the cause of many of the associations found in genome-wide association studies (with common SNPs), as well as a potential solution to the "missing heritability problem" (this isn't mentioned in the paper itself, but rather in a Times article describing it). In other words, this could be a panacea for all the ills of the human genetics community. Unfortunately, this seems rather unlikely.

1. There are a range of parameter values for which "synthetic associations" are plausible--where the effect of the rare variants is small enough to have avoided detection by linkage studies but big enough to show up via correlation with common variants. This range of parameters is kind of small--from Figure 2, it looks like maybe a set of mutations at a gene with a genotypic relative risk greater than 2 but less than 6. Will this be the case for some loci? Sure, that sounds plausible. Is it going to explain everything? No, of course not.

2. It has been pointed out (rightly) that diseases that are selected against should have their genetic component enriched for rare variants. Goldstein himself has made this argument about diseases like schizophrenia. So if schizophrenia has all these rare variants, and rare variants cause rampant "synthetic associations" at common SNPs, why hasn't anyone picked up whopping associations using common SNPs in schizophrenia?

3. The sickle cell anemia example, as presented in the paper, is extremely misleading. It seems the authors did a simple case-control test for sickle cell in an African-American population. Recall that African-Americans are an admixed population, with each individual carrying large chunks of "European" and "African" chromosomes. Anyone with sickle cell will have at least one block of African chromosome surrounding the beta-globin locus, while those without will have two chromosomes sampled from the overall distribution of chromosomes in the population--15-20% of which, approximately, will be of European descent [2]. So any SNP with an allele frequency difference between African and European populations in this region will show up as a highly significant association with the disease due to the way they've done the test, and these associations will extend out to the length of admixture linkage disequilibrium--well, well beyond the LD found in African populations alone. The presentation of this example in the paper--the large block of association contrasting with the small blocks of LD in the Yoruban population--is a bit silly.

If I had to guess, and put a concrete bet on how this will play out, let's take the associations listed in their Table 1, which they call candidates for being due to synthetic associations. My bet: none of them are. Ok, maybe one.

[1] These sorts of thresholds are important to watch--in a year people will be calling things at 1% frequency "common" if it suits them for rhetorical purposes.

[2] Corrected from: "... will have two large blocks of "African" chromosomes surrounding the beta-globin locus, and everyone without will have at least one European chromosome in the same area"; see comments.

Labels: Genetics

Wednesday, January 13, 2010

On the Y posted by Razib @ 1/13/2010 01:30:00 PM

Here's the link to the new paper in Nature on the evolution of the human & chimp Y chromosome, Chimpanzee and human Y chromosomes are remarkably divergent in structure and gene content. ScienceDaily and The New York Times have summaries up. Wonder if there'll be future editions of Adam's Curse....

Labels: Genetics

Wednesday, January 06, 2010

How Chinese relate to each other and the Japanese posted by Razib @ 1/06/2010 03:16:00 PM

Last month I pointed to a paper on Chinese population structure, Genomic Dissection of Population Substructure of Han Chinese and Its Implication in Association Studies. One thing to note was that the average F_ST differentiating Han populations was on the order of 0.002, while that differentiating Europeans was on the order of 0.009. Below are the various Han populations, along with Japanese. CHB = Beijing, while CHD = Denver. The Denver sample is probably biased toward Cantonese and Fujianese, since most American Chinese are from these two groups. As a point of reference, here are South Asian genetic distances.
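For a sense of scale on those F_ST values, here is a minimal sketch of the statistic at a single biallelic locus for two equally sized populations, using the textbook heterozygosity definition F_ST = (H_T - H_S) / H_T. It is not the estimator used in the paper (which is not described here), and the function names and example frequencies are made up for illustration.

```python
def expected_heterozygosity(p):
    """Expected heterozygosity at a biallelic locus with allele frequency p."""
    return 2.0 * p * (1.0 - p)


def fst_two_populations(p1, p2):
    """Single-locus F_ST for two equally sized populations.

    Uses F_ST = (H_T - H_S) / H_T, where H_S is the mean within-population
    heterozygosity and H_T is the heterozygosity of the pooled population.
    """
    h_within = (expected_heterozygosity(p1) + expected_heterozygosity(p2)) / 2.0
    h_total = expected_heterozygosity((p1 + p2) / 2.0)
    if h_total == 0.0:
        return 0.0  # locus is monomorphic overall; nothing to differentiate
    return (h_total - h_within) / h_total


# Hypothetical frequencies: a four-point difference around 50%.
print(fst_two_populations(0.48, 0.52))
```

With these made-up frequencies (0.48 versus 0.52) the result is 0.0016, so an F_ST on the order of 0.002 corresponds to allele frequency differences of only a few percentage points at a typical SNP.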

Labels: China, Genetics

Saturday, January 02, 2010

PRDM9 and the evolution of recombination hotspots posted by p-ter @ 1/02/2010 11:48:00 AM

This week in Science, three papers report that the product of the gene PRDM9 is an important determinant of where recombination occurs in the genome during meiosis. Though this may sound like something of an esoteric discovery, it's actually pretty remarkable, and brings together a number of lines of research in evolutionary genetics. How so?

A bit of background.

A few somewhat related facts:

1. A major goal in the study of speciation is the identification of the genes that underlie reproductive barriers between species. In 2008, the first such gene in mammals was found--in a cross between two subspecies of mouse where the male offspring are sterile (note that this follows Haldane's rule), the introduction of the "right" version of a single gene was sufficient to restore fertility. This gene? PRDM9, which encodes a histone methyltransferase expressed in the mouse germline. This gene has evolved rapidly across animals, especially in the part of the protein that binds DNA. This suggests it is binding a sequence that is changing particularly rapidly over evolutionary time.

2. The positions in the genome at which recombination occurs during meiosis are not scattered randomly, but rather cluster together in what are called "recombination hotspots". Enriched within these hotspots in humans is a particular sequence motif, presumably an important binding site for whatever factor is controlling recombination. As this fact was becoming clear, a group compared the positions of these recombination hotspots between humans and chimpanzees. The result? The positions of these hotspots are remarkably different between these species. In fact, the positions of recombination hotspots in humans and chimpanzees are nearly non-overlapping, a fairly impressive fact given that the genomes themselves are 99.X% identical.

3. But perhaps #2 isn't all that surprising. If there are two alleles at a hotspot, one of which is "hot" and the other of which is "cold" (i.e., doesn't initiate recombination), the mechanism of recombination results in gene conversion of the "hot" allele to the "cold" allele (for details, see here). This should result in the relatively rapid loss over evolutionary time of recombination hotspots, which in turn results in what has been called "the hotspot conversion paradox"--if hotspots should trend over time to be more "cold", how is it that they exist? One plausible resolution of this paradox--a sequence or gene that doesn't contain a hotspot itself might control the positioning of recombination elsewhere in the genome.

4. Indeed, such genes exist. In mice, two groups last year identified regions of the genome (though they didn't at the time narrow it down to a gene) controlling the usage of individual hotspots. Importantly, one such region was located distantly to the hotspot, indicating an important regulator of recombination positioning. In humans, a group last year showed that there is extensive variability between humans in how often previously identified hotspots are used, and that this variation is heritable.
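The logic of the "hotspot conversion paradox" in #3 can be sketched with a simple deterministic model. This is an illustration under assumptions that are not from the papers: random mating, an infinite population, no other selection, and a hypothetical parameter conversion_bias describing how strongly heterozygotes under-transmit the "hot" allele (they pass it on in a fraction (1 - conversion_bias) / 2 of gametes).

```python
def next_hot_frequency(p, conversion_bias):
    """Frequency of the hot allele after one generation of biased gene conversion.

    Hot/hot homozygotes (frequency p^2) transmit only hot alleles; heterozygotes
    (frequency 2p(1-p)) transmit hot in a fraction (1 - conversion_bias) / 2 of
    gametes. Summing the two terms simplifies to p * (1 - conversion_bias * (1 - p)).
    """
    return p * (1.0 - conversion_bias * (1.0 - p))


def hot_allele_trajectory(p0, conversion_bias, generations):
    """Hot-allele frequency at generation 0, 1, ..., generations."""
    freqs = [p0]
    for _ in range(generations):
        freqs.append(next_hot_frequency(freqs[-1], conversion_bias))
    return freqs


# Hypothetical values: the hot allele starts at 50% with a 1% transmission bias.
trajectory = hot_allele_trajectory(p0=0.5, conversion_bias=0.01, generations=2000)
print(f"hot allele frequency: start {trajectory[0]:.3f}, end {trajectory[-1]:.6f}")
```

In this toy model any positive bias drives the hot allele steadily toward loss, which is the puzzle: without something acting elsewhere in the genome to create or reposition hotspots, they should not persist.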

PRDM9 brings all of these observations together

These three papers all report that item #1 and items #2-4 above are all related. What do they show?

1. Two groups followed up on the observation in #4 above that there was a particular region in mouse controlling hotspot usage, and identified the relevant gene as PRDM9. One group went further, testing whether variation in this gene also influenced hotspot usage in humans. Remarkably, it did, showing that variation in PRDM9 in both mice and humans leads to variation in hotspot usage. This variation changes the binding specificity of the gene, leading to changed hotspots and a resolution of the "hotspot conversion paradox" mentioned in #3 above.

2. Another group took a different route to a similar conclusion. They followed up the sequence motif mentioned in #2 above as being enriched in recombination hotspots in humans. The "hotspot paradox" predicts that, if this motif is "hot", it should be in the process of being removed from the human genome. Similarly, if it's not "hot" in chimpanzees, it should not be in the process of being removed from the chimp genome. Indeed, this motif has been preferentially lost along the human lineage as compared to the chimp lineage. They then asked, what is binding this motif? They had two criteria--a protein with a predicted binding site similar to their motif, and lack of conservation of this protein between humans and chimpanzees. Only one gene fit these criteria--PRDM9. Thus, the rapid evolution of PRDM9 is responsible for the puzzling observation that recombination hotspots are entirely unconserved between humans and chimps.

A brief conclusion

I'll reiterate that this is a pretty remarkable discovery, opening up the possibility of a direct link between the evolution of recombination and speciation. Is the effect of PRDM9 on recombination responsible for the conformation to Haldane's rule in the mouse cross described in #1? Or is there some additional effect of this gene? Is the evolution of PRDM9 sufficient to describe the evolution of recombination hotspots in all animals? One can imagine a whole host of additional questions. Certainly, this is a story to be continued.

Labels: Evolution, Genetics

Sunday, December 27, 2009

Mutation and selection in stickleback evolution posted by p-ter @ 12/27/2009 08:46:00 AM

Understanding the precise molecular mechanisms underlying changes in animal morphology is a tricky problem--usually two species which have diverged morphologically (say, mice and humans) are now so unrelated as to make genetic study exceedingly difficult, if not impossible. For years, a group led by David Kingsley has been addressing this problem in a cleverly-chosen model--three-spined sticklebacks. Importantly for the question of morphological evolution, freshwater populations of this fish have lost many of the spines and pelvic girdle carried by the saltwater populations (there are a number of hypotheses, probably not all mutually exclusive, for why this has been under selection).

In a new paper, this group demonstrates the precise genetic alteration underlying this change in a number of freshwater populations. Perhaps surprisingly, it appears to be due to the recurrent deletion (in different freshwater populations) of an enhancer of an important developmental gene. Strikingly, creating a transgenic freshwater fish with a copy of this enhancer (which normally is missing) leads to freshwater fish with a pelvis like the saltwater fish.

In fact, this enhancer seems to fall in a "fragile" (read: repeat-laden) region of the genome, which presumably increases the rate of deletion at this site. If one imagines there are a number of genetic paths to get to the reduced pelvis size favored in freshwater environments, the probability of each path depends on the mutation rate of each genetic change. In this case, many (though not all) freshwater populations have independently taken the same path, likely due to the increased mutation rate at this fragile site.
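A toy calculation shows how a higher mutation rate at one site can make independent populations converge on the same path. The assumptions are strong and are mine, not the paper's: every path is equally favored by selection, mutations arise independently, and the first path to arise is the one taken. The path names and rates below are invented for illustration.

```python
def first_path_probabilities(mutation_rates):
    """Probability that each path is the first to arise.

    Assumes independent mutations with constant rates, so the chance a given
    path comes first is its rate divided by the sum of all rates.
    """
    total = sum(mutation_rates.values())
    return {path: rate / total for path, rate in mutation_rates.items()}


def probability_all_share_path(path_probability, n_populations):
    """Probability that n independent populations all take the same given path."""
    return path_probability ** n_populations


# Hypothetical per-generation rates: the fragile site mutates 100x faster.
rates = {"fragile_site_deletion": 1e-5, "other_path_a": 1e-7, "other_path_b": 1e-7}
probs = first_path_probabilities(rates)
shared = probability_all_share_path(probs["fragile_site_deletion"], n_populations=5)
print(f"P(fragile site first) = {probs['fragile_site_deletion']:.3f}")
print(f"P(all 5 populations use it) = {shared:.3f}")
```

Under these made-up rates the fragile-site deletion is almost always the first path available, so repeated independent use of it across populations is the expected outcome, not a coincidence.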

-----

Citation: Chan et al. (2009) Adaptive Evolution of Pelvic Reduction in Sticklebacks by Recurrent Deletion of a Pitx1 Enhancer. Science. Published Online December 10, 2009 [DOI: 10.1126/science.1182213]

Labels: Genetics

Wednesday, December 23, 2009

The diversity of the east posted by Razib @ 12/23/2009 09:33:00 PM

Just a weird random thought. In the early 20th century the Ainu of Japan were considered by many physical anthropologists a branch of the white race. This fit in nicely with the historical fantasy of the period which often featured "Lost Races," with a lost white race the best of all. By contrast, the Negrito and Melanesian populations were considered outliers of the black race. The idea of the Ainu as white seems to have diminished, though, in part because those sorts of ideas aren't too popular today, and in part because hardly any Ainu remain who do not have substantial ancestry from the Japanese. On the other hand, there remain pan-Africanists and black nationalists who talk about the unity of black peoples, from India to Melanesia. To the left is a photo where I've placed an Ainu man from the 19th century next to contemporary Andaman Islanders. I think you could understand why physical anthropologists of the period classified populations as they did based on appearance.

But with all the more recent genetic studies it seems pretty clear that the Ainu and the Andaman Islanders are part of a broader swath of "easterners" who swept out of Africa (in fact, there are Y chromosomal haplogroups which the Ainu share with Andaman Islanders). Older classical markers suggested that the Ainu were an East Asian people, and the uniparental markers suggest the same thing (I don't see any more recent SNP array studies which look at the Ainu). As for the Andaman Islanders, it seems very likely that they're simply an island population of the ancient "eastern" substrate of South Asia, which has been admixed on the mainland with a "western" quasi-European element, which in many regions and castes is now dominant. The Ainu and the Andaman Islanders are probably just the remains of the physical diversity which was once much more common in eastern Eurasia than it is today. That diversity may have gone by the wayside because of the expansion of the Han and the Austronesians, but it may serve as a hint that there may be only a few basic human racial morphs which reoccur, whether by chance or adaptation.

Addendum: The non-Bantu populations of southern Africa look East Asian. Also, since the Reich et al. paper on Indian genetics came out I've been reading up, and now I can see how the Andaman Islanders do kind of "look Indian." More specifically, there are some subtle facial features which South Asians have which must have come down from people distantly related to the Andaman Islanders. Look at the individual on the left in the photo above.

Labels: Genetics

Selection & African Americans posted by Razib @ 12/23/2009 02:12:00 PM

I already posted on the new paper on African American Genetics. I noticed that Frank Sweet says:

It is interesting that the 18 percent mean of Euro DNA markers in A-As has been holding steady for about 8 years now, having replaced the prior estimate of 25 percent.

Where did the prior estimate come from? I recall seeing it as well. Were the older markers biased towards ones which might have been shaped by recent selection? The new paper doesn't have anything definitive in regards to this (though they mention the variance in African vs. European across different regions of the genome), though certainly some genes which affect malaria seem to have been shifted away from what you'd expect.

Labels: Genetics

Monday, December 21, 2009

Brain size & microcephaly genes posted by Razib @ 12/21/2009 10:22:00 PM

Microcephaly Genes Associated With Human Brain Size:

Highly significant associations were found between cortical surface area and polymorphisms in possible regulatory regions near the gene CDK5RAP2. This gene codes for a protein involved in cell-cycle regulation in neuronal progenitor cells -- cells that migrate to the cerebral cortex during the second trimester of gestation and eventually become fully functioning neurons. The cerebral cortex is the outer layer of the brain, often referred to as "gray matter." The most highly developed part of the human brain, the cerebral cortex is responsible for higher cognitive functions, such as thinking, perceiving, producing and understanding language, some of which is considered uniquely human.

Similar but less significant findings were made for polymorphisms in two other microcephaly genes, known as MCPH1 and ASPM. All findings were exclusive to either males or females but the functional significance of this sex-segregated effect is unclear.

"One particularly interesting feature of this new discovery is that the strongest links with cortical area were found in regulatory regions, rather than coding regions of the genes," said Andreassen. "One upshot of this may be that in order to further understand the molecular and evolutionary processes that have determined human brain size, we need to focus on regulatory processes rather than further functional characterization of the proteins of these genes. This has huge implications for future research on the link between genetics and brain morphology."

Wouldn't be the first time that genes which have a connection to pathologies turn out to be useful in illuminating normal human variation. It'll be on the site of PNAS someday.

Labels: Genetics

Tuesday, December 15, 2009

What Heritability is Not posted by ben g @ 12/15/2009 09:11:00 AM

Because so many people abuse or misunderstand the concept of heritability, I decided that it would be nice to have a list of what heritability is not in one place. If you have questions or if there is a misconception about heritability you'd like me to address here, feel free to comment. This post will serve as an updated reference.

Heritability is not an indicator of malleability. Entirely genetic disorders such as phenylketonuria can be cured through the proper diet.
Heritability is not a measure of straightforward genetic effects. For example, genes that affect physical appearance have an effect on personality development.
Heritability is not independent of the population. It may differ from one group of individuals to the next, because groups differ environmentally and genetically.
Heritability is not independent of age. The effects of genes or environments may grow in potency through development.
Heritability is not an indicator of the causes of group differences. A trait can be highly heritable, as in the crop field metaphor, and group differences may still be due to environment. This applies also in the real world situation for humans, where the environmental differences between groups are not as systematic.
Heritability is not necessarily homogeneous within a population. A heritability of 50% may be hiding the heritabilities of 40% and 60% in subgroups.
Heritability is not a measure of intergenerational transmission. A trait may be highly heritable but not pass on from one generation to the next. This is because the relevant genes and environments may differ from one generation to the next.
Heritability is not a statistic for individuals. If you are using your knowledge of heritability to understand a single individual you are a biographer, not a scientist.

So, some of you may be wondering, why is heritability a useful statistic? That's easy to answer: it's a measure of how much phenotypic variation in a given population at a given time is due to genetic variation in that population. Measuring heritability allows us to say that, for adults in the modern world, variation on IQ and personality measures is primarily due to genetic variation. That's a pretty remarkable and important finding, if you ask me.
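As a rough illustration of that definition, and of the earlier point that a heritability of 50% can hide 40% and 60% in subgroups, here is a minimal sketch. It assumes the simplest possible decomposition, phenotypic variance = genetic variance + environmental variance, with no gene-environment covariance or interaction, and subgroups of equal size with equal trait means. The variance numbers are invented.

```python
def heritability(genetic_variance, environmental_variance):
    """Share of phenotypic variance due to genetic variance.

    Assumes phenotypic variance is the simple sum of the two components.
    """
    phenotypic_variance = genetic_variance + environmental_variance
    return genetic_variance / phenotypic_variance


def pooled_heritability(subgroups):
    """Heritability of a population made of equally sized subgroups.

    `subgroups` is a list of (genetic_variance, environmental_variance) pairs.
    Assumes equal subgroup sizes and equal trait means, so the pooled variance
    components are just the averages of the subgroup components.
    """
    n = len(subgroups)
    mean_genetic = sum(g for g, _ in subgroups) / n
    mean_environmental = sum(e for _, e in subgroups) / n
    return heritability(mean_genetic, mean_environmental)


# Hypothetical variance components for two subgroups.
group_a = (40.0, 60.0)  # heritability 0.4
group_b = (60.0, 40.0)  # heritability 0.6
print(heritability(*group_a), heritability(*group_b), pooled_heritability([group_a, group_b]))
```

The pooled figure of 0.5 is a property of the combined population only; it describes neither subgroup, and it would change if either the genetic or the environmental variance changed.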

Labels: Genetics

Thursday, December 10, 2009

Carbs & ancestry posted by Razib @ 12/10/2009 11:43:00 PM

Stable Patterns of Gene Expression Regulating Carbohydrate Metabolism Determined by Geographic Ancestry:

Methodology/Principal Findings
Using a combination of genetic/genomic and bioinformatics approaches, we identified a large number of genes that were both differentially expressed between American subjects self-identified to be of either African or European ancestry and that also contained single nucleotide polymorphisms that distinguish distantly related ancestral populations. Several of these genes control the metabolism of simple carbohydrates and are direct targets for the SREBP1, a metabolic transcription factor also differentially expressed between our study populations.

Conclusions/Significance
These data support the concept of stable patterns of gene transcription unique to a geographic ancestral lineage. Differences in expression of several carbohydrate metabolism genes suggest both genetic and transcriptional mechanisms contribute to these patterns and may play a role in exacerbating the disproportionate levels of obesity, diabetes, and cardiovascular disease observed in Americans with African ancestry.

Figure 2 had me thinking of Me, Myself & Irene.

Labels: Genetics, Population genetics

Wednesday, November 25, 2009

GWAS, population structure and the Han Chinese posted by Razib @ 11/25/2009 01:47:00 PM

Two new articles in AJHG, Genomic Dissection of Population Substructure of Han Chinese and Its Implication in Association Studies:

To date, most genome-wide association studies (GWAS) and studies of fine-scale population structure have been conducted primarily on Europeans. Han Chinese, the largest ethnic group in the world, composing 20% of the entire global human population, is largely underrepresented in such studies. A well-recognized challenge is the fact that population structure can cause spurious associations in GWAS. In this study, we examined population substructures in a diverse set of over 1700 Han Chinese samples collected from 26 regions across China, each genotyped at ∼160K single-nucleotide polymorphisms (SNPs). Our results showed that the Han Chinese population is intricately substructured, with the main observed clusters corresponding roughly to northern Han, central Han, and southern Han. However, simulated case-control studies showed that genetic differentiation among these clusters, although very small (FST = 0.0002 ∼ 0.0009), is sufficient to lead to an inflated rate of false-positive results even when the sample size is moderate. The top two SNPs with the greatest frequency differences between the northern Han and southern Han clusters (FST > 0.06) were found in the FADS2 gene, which associates with the fatty acid composition in phospholipids, and in the HLA complex P5 gene (HCP5), which associates with HIV infection, psoriasis, and psoriatic arthritis. Ingenuity Pathway Analysis (IPA) showed that most differentiated genes among clusters are involved in cardiac arteriopathy (p < 10^-101). These signals indicating significant differences among Han Chinese subpopulations should be carefully explained in case they are also detected in association studies, especially when sample sources are diverse.

And, Genetic Structure of the Han Chinese Population Revealed by Genome-wide SNP Variation:

Population stratification is a potential problem for genome-wide association studies (GWAS), confounding results and causing spurious associations. Hence, understanding how allele frequencies vary across geographic regions or among subpopulations is an important prelude to analyzing GWAS data. Using over 350,000 genome-wide autosomal SNPs in over 6000 Han Chinese samples from ten provinces of China, our study revealed a one-dimensional “north-south” population structure and a close correlation between geography and the genetic structure of the Han Chinese. The north-south population structure is consistent with the historical migration pattern of the Han Chinese population. Metropolitan cities in China were, however, more diffused “outliers,” probably because of the impact of modern migration of peoples. At a very local scale within the Guangdong province, we observed evidence of population structure among dialect groups, probably on account of endogamy within these dialects. Via simulation, we show that empirical levels of population structure observed across modern China can cause spurious associations in GWAS if not properly handled. In the Han Chinese, geographic matching is a good proxy for genetic matching, particularly in validation and candidate-gene studies in which population stratification cannot be directly accessed and accounted for because of the lack of genome-wide data, with the exception of the metropolitan cities, where geographical location is no longer a good indicator of ancestral origin. Our findings are important for designing GWAS in the Chinese population, an activity that is expected to intensify greatly in the near future.

Labels: Chinese, Genetics, Han, Population genetics

Tuesday, November 24, 2009

R1a1 and the peopling of Eurasia posted by Razib @ 11/24/2009 07:58:00 PM

A few weeks ago people in the comments were nagging me a bit about some new papers on the haplogroup R1a1. This Y chromosomal lineage is found at very high frequencies from East-Central Europe into India. Initially, researchers such as Spencer Wells assumed that R1a1 signaled the arrival of Indo-Aryans to the Indian subcontinent; its frequencies decline in a northwest-to-southeast gradient, and from high to low castes. In Europe the modal frequencies are among Slavic groups, with a high representation among Germanic-speakers. The frequency of R1a1 declines sharply in Western and Southern Europe. It is very common in Central Asia as well as eastern Iran and Afghanistan. One parsimonious explanation would be that R1a1 spread with Kurgan males, along with Indo-European languages, on the order of 4-5,000 years ago.

There is a problem with this model though. One of the new papers reiterates the finding that the coalescence of the European and South Asian lineages is on the order of 10,000 years ago: Separating the post-Glacial coancestry of European and Asian Y chromosomes within haplogroup R1a (R1a1 is the dominant clade within R1a). A second paper reports the finding that R1a1 is very diverse in India, indicating deep time depth: The Indian origin of paternal haplogroup R1a1* substantiates the autochthonous origin of Brahmins and the caste system. For both R1a1 and the "Ancestral North Indians" (ANI) in Reich et al., the frequency seems intuitively way too high among tribal populations, even in South India. Remember that the low bound for ANI was ~40%. R1a1 is found at frequencies as high as 25% or so among some South Indian tribals. If this lineage arrived with the Indo-Aryans it is peculiar that it is found in such high frequencies in populations which were marginal and isolated from the dominant non-Indo-Aryan populations of South India. Back to Europe, here is a section from the abstract of the first paper:

Conversely, marker M458 has a significant frequency in Europe, exceeding 30% in its core area in Eastern Europe and comprising up to 70% of all M17 chromosomes present there. The diversity and frequency profiles of M458 suggest its origin during the early Holocene and a subsequent expansion likely related to a number of prehistoric cultural developments in the region. Its primary frequency and diversity distribution correlates well with some of the major Central and East European river basins where settled farming was established before its spread further eastward. Importantly, the virtual absence of M458 chromosomes outside Europe speaks against substantial patrilineal gene flow from East Europe to Asia, including to India, at least since the mid-Holocene.

The Holocene started 11,700 years ago. We are living in the Holocene. So this means that gene flow can't be any later than 6,000 years ago. The paper which focuses specifically on Indian lineages reports a coalescence time on the order of 10,000 years in the past for South Asian R1a1 branches. Additionally, they confirm earlier findings of caste ranking of R1a1 in terms of frequency, as well as Brahmins having the most diversity of all groups in terms of haplotypes (ergo, the title of the paper).

Both Dienekes and Polish Genetics and Anthropology suggest that the calibration is wrong on these coalescence times. They argue that one should reduce the time to a common ancestor by a factor of 3. This would of course make a huge difference. In regards to the Reich et al. paper which argued for a plausible two-way admixture between ANI and "Ancestral South Indians" (ASI), the linkage disequilibrium has decayed too much from the time of admixture to peg a date. This was a method used to calculate the emergence of the Uyghurs as a hybrid population, on the order of 2-3 thousand years ago (admixture between two very different populations generates linkage disequilibrium which decays over time due to recombination). In terms of Fst the ANI have a value in relation to Northern Europeans which is about 3 times larger than the mean between population differences in Europe. This is somewhat greater than the pairwise values between any European populations except for the Baltic peoples (in particular, the swath from Karelia to Lithuania) to the groups of Southern Europe. The degree of Neolithic Middle Eastern ancestry within Europe is under debate, but I think one can assume that Southern Italians and Karelians are likely at opposite ends in terms of frequency of this contribution to the pre-Ice Age demographic substratum of Europe. From this I offer that it is not totally unreasonable to posit that the ANI contribution to South Asian ancestry was closer to the margins of the last Ice Age, rather than the period of the Indo-European expansion, and that its Fst values are not unreasonable in relation to modern European groups.
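The idea of dating admixture from linkage disequilibrium can be sketched with the textbook two-locus result: under random mating, LD between two loci shrinks by a factor of (1 - r) each generation, where r is the recombination fraction between them. This is a simplified illustration assuming a single pulse of admixture, not the procedure used in Reich et al. or in the Uyghur analysis; the function names and numbers are hypothetical.

```python
import math


def ld_after_generations(initial_ld, recombination_fraction, generations):
    """LD remaining after some generations of random mating: D_t = D_0 * (1 - r)^t."""
    return initial_ld * (1.0 - recombination_fraction) ** generations


def generations_since_admixture(initial_ld, observed_ld, recombination_fraction):
    """Invert the decay formula to estimate the time since a single admixture pulse."""
    return math.log(observed_ld / initial_ld) / math.log(1.0 - recombination_fraction)


# Hypothetical values: loci with r = 0.01, starting LD of 0.2, 100 generations.
remaining = ld_after_generations(initial_ld=0.2, recombination_fraction=0.01, generations=100)
estimate = generations_since_admixture(0.2, remaining, 0.01)
print(f"LD remaining: {remaining:.4f}, recovered generations: {estimate:.1f}")
```

The same formula shows why old admixture is hard to date: after enough generations the remaining LD is too close to zero to distinguish from noise, so the inversion no longer gives a usable estimate.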

The main issue that is confusing is the diversity of R1a1 in South Asia. A first order model going from just this data would be that R1a1 derives from India, and spread to the Eurasian plain. But Reich et al. show data that imply little likelihood of South Asian contribution to European ancestry. The only possibility would be if ANI and ASI were totally separated when a branch of ANI left South Asia for the Eurasian plain, at which point the process of admixture between ANI and ASI began. Another possibility is that the distribution of R1a1 in Eurasia is a palimpsest. Recent work in ancient DNA is suggesting that inferring past distributions from contemporary ones may lead us astray. It could be that R1a1 was once far more diverse in Europe and Central Asia, but that subsequent demographic events eliminated most of that diversity, while such events did not occur in South Asia. Y chromosomal lineages may be particularly likely to be wiped out by the expansion of new tribes as old elites are killed or marginalized. The current distribution of a particular branch of R1a1 in Europe, associated in particular with Slavs, may be an expansion of the lineage which managed to survive elimination at some point in the mid-Holocene.

Though do note I put little weight in my speculations. It seems rather confusing. But since I was asked....

Labels: Genetics, Phylogeography

Why whales get no bigger posted by Razib @ 11/24/2009 05:33:00 PM

Carl Zimmer reports that it might be a function of physics. Bigger whales have proportionally bigger mouths, but at some point the biological engineering runs up against constraints:

As they report today in the Proceedings of the Royal Society, Goldbogen and his colleagues found that big fin whales are not just scaled-up versions of little fin whales. Instead, as their bodies get bigger, their mouths get much bigger. Small fin whales can swallow up about 90% of their own body weight. Very big ones can gulp 160%. In other words, big fin whales need more and more energy to handle the bigger slugs of water they gulp. As their body increases in size, the energy their bodies demand rises faster than the extra energy they can get from their food.

...

If the scientists are right, they may have discovered one of the big ironies in evolution. Lunge-feeding may have allowed whales to become the biggest animals ever to roam the planet. But this was not an open-ended invitation. Once whales got large enough, lunge feeding itself became so costly it prevented them from getting any bigger. Perhaps some day another animal will evolve a new strategy that will let it get even bigger than a blue whale. But for the animal kingdom as we know it, we may be sharing the planet with the biggest species it can offer.

Given enough time and a large population one can imagine that evolution might be able to figure out a solution, or back out of the adaptive dead end.

Labels: Evolution, Genetics, Population genetics

Monday, November 23, 2009

1 million SNPs to bind us all posted by Razib @ 11/23/2009 01:41:00 PM

A new paper in PLoS ONE, Genetic Variation and Recent Positive Selection in Worldwide Human Populations: Evidence from Nearly 1 Million SNPs:

Our analyses both confirm and extend previous studies; in particular, we highlight the impact of various dispersals, and the role of substructure in Africa, on human genetic diversity. We also identified several novel candidate regions for recent positive selection, and a gene ontology (GO) analysis identified several GO groups that were significantly enriched for such candidate genes, including immunity and defense related genes, sensory perception genes, membrane proteins, signal receptors, lipid binding/metabolism genes, and genes involved in the nervous system. Among the novel candidate genes identified are two genes involved in the thyroid hormone pathway that show signals of selection in African Pygmies that may be related to their short stature.

They seem to have looked at about twice as many SNPs by combining the sets of Illumina and Affymetrix chips as the norm. But they looked at only around 1/4 the number of individuals as other studies which used the HGDP panel. To a first approximation the Affy and Illumina chips are really close in the patterns of variation which they detect, but, the Illumina chip had a significantly higher heterozygosity (this is evident in some of the supplementals just by inspection).

I reformatted a figure which shows ancestral contributions to the individuals in their sample at K = 6 (6 hypothetical populations which contribute to genetic variation). In the paper they discuss the fact that the Uyghur and Hazara resemble each other, that the Uyghur seem to have a non-trivial Central/South Asian component, and finally that the Russian and Adygei have East Asian and Central/South Asian ancestry. None of this is surprising; all of it was evident in other papers which used the same sample.

First, in regards to Russians, analysis of genetic variation among East European populations sometimes shows a "long tail" of variation which leads toward East Asia among Russians. That is, Russians tend to cluster with other Europeans, but a minority of individuals are deviated in the direction of East Asians, that minority shrinking in proportion to distance from Europeans. The historical reason for this presents itself plainly: a significant minority of ethnic Russians have Tatar antecedents in the recent past, and among those who do not, such ancestry may be derived from Slavicized Finno-Ugric populations who may have ancient connections to the populations of Siberia. The Russian Orthodox priest murdered last week, who was known for preaching to Muslims, was himself an ethnic Tatar by origin.

Second, one should expect the Uyghur and Hazara to resemble each other. The Hazara likely emerged during the period of Mongol rule of Iran and Afghanistan, and are descendants in part of Mongols and Turks from greater Mongolia who settled down in Afghanistan. The Uyghurs are a Turkic-speaking people, but historically the Tarim Basin was inhabited by Europoid populations. The emergence of the Uyghur and Hazara mimic each other almost perfectly. In particular, the East Asian component of their ancestry is from the same region. The non-East Asian aspect differs a bit, but not too much when set next to the East Asian component. Interestingly, the Uyghur speak a Turkic language, while the Hazara speak Dari, the Persian dialect. One can probably chalk that up to distance from the Turco-Mongol ur-heimat.

Third, the Central/South Asian component among the Uyghur should not be too surprising, there is significant evidence that the Tarim basin was influenced by Indo-Iranians, as well as the Tocharians. Buddhism arrived in East Asia via the Tarim Basin after all, and there have always been trade routes from the southern edge of the Tarim down into northern India. But what about the Russians and the Adygei? I think that this signal has something to do with what we've termed elsewhere as "Ancestral North Indians" (ANI), who were closely related to European populations, and probably emerged from somewhere in Eastern Europe to Central Asia. I've been told that the Fst number for ANI-Northern European populations is on the order of the distance between Baltic peoples and southern Italians. So this group may have emerged on the margins of Europe, and expanded mostly within Asia.

There's also an interesting chart showing patterns of selection, or at least what they detected, across geographies. Even if most of the signals are false positives one may hold that the real signals within this subset will still recapitulate the geographic relationships shown to the left. The patterns of selection mirror overall phylogenetic relationships. Note the overlap patterns of Central/South Asians with Europeans and East Asians, some of both, but dominated by the former.

Citation: Lopez Herraez D, Bauchet M, Tang K, Theunert C, Pugach I, et al. 2009 Genetic Variation and Recent Positive Selection in Worldwide Human Populations: Evidence from Nearly 1 Million SNPs. PLoS ONE 4(11): e7888. doi:10.1371/journal.pone.0007888

Labels: Genetics, Population genetics

Sunday, November 22, 2009

The mosaic of North American populations posted by Razib @ 11/22/2009 08:51:00 PM

A few months ago an interesting paper connected the historical demographics of New Hampshire with genetic variation. One of the notable features of North American history and culture is that it is a mosaic of different populations, and that mosaic has come about in very different ways. For example, the millions of Italian and Jewish Americans descend from hundreds of thousands of Italian and Jewish immigrants. By contrast, millions of Yankees and Quebecois descend from tens of thousands of ancestors, who arrived in the 17th and 18th centuries. In 1992 a Census demographer, Campbell Gibson, calculated in "The Contribution of Immigration to the Growth and Ethnic Diversity of the American Population" that 49% of the population of the United States in 1990 was descended from those whose ancestors were resident within the United States in 1790 (inclusive of blacks and whites). 51% were descended from those who arrived after 1790. Put another way, 127 million Americans in 1990 were attributable to the net 50 million immigrants who arrived after 1790. The remainder of the population would be attributable to the 4 million U.S. residents in 1790.
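The percentages and head counts above can be cross-checked with a few lines of arithmetic. This is only a sanity check on the figures quoted in this post: the variable names are mine, the 1990 total is derived from "127 million is 51%" rather than looked up, and the "people per founder" ratio is a crude summary of my own, not something from the original calculation.

```python
# Figures quoted in the post; nothing here is looked up independently.
post_1790_share = 0.51          # share of 1990 population attributed to post-1790 arrivals
post_1790_descendants = 127e6   # people in 1990 attributed to post-1790 immigration
net_immigrants = 50e6           # net immigrants after 1790
residents_1790 = 4e6            # U.S. residents in 1790

# Total 1990 population implied by "127 million is 51%".
implied_total_1990 = post_1790_descendants / post_1790_share
pre_1790_descendants = implied_total_1990 - post_1790_descendants

# Crude "1990 people per founder" ratios (a hypothetical summary statistic).
ratio_post_1790 = post_1790_descendants / net_immigrants
ratio_pre_1790 = pre_1790_descendants / residents_1790

print(f"implied 1990 total: {implied_total_1990 / 1e6:.0f} million")
print(f"attributable to 1790 residents: {pre_1790_descendants / 1e6:.0f} million")
print(f"1990 people per post-1790 immigrant: {ratio_post_1790:.1f}")
print(f"1990 people per 1790 resident: {ratio_pre_1790:.1f}")
```

On these numbers the 1790 stock accounts for about an order of magnitude more present-day people per founder than the later immigrants do, which is the asymmetry the mosaic point rests on.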

Note: Looking at the immigration records, more than 1 million Italians and Jews remained in the United States (around 4 million Italians arrived between 1820 and 1920, but the majority seem to have gone back to Italy). But reproductive variance being what it is, I think it is plausible to assume that fewer than 1 million may contribute most to the current generations of these two groups.

Labels: Genetics, History

Friday, November 20, 2009

Latin America is not panmixia posted by Razib @ 11/20/2009 03:35:00 PM

A new provisional paper, Ancestry-related assortative mating in Latino populations. Here are the results:

Using 104 ancestry informative markers, we examined spouse correlations in genetic ancestry for Mexican spouse pairs recruited from Mexico City and the San Francisco Bay Area, and Puerto Rican spouse pairs recruited from Puerto Rico and New York City. In the Mexican pairs, we found strong spouse correlations for European and Native American ancestry, but no correlation in African ancestry. In the Puerto Rican pairs, we found significant spouse correlations for African ancestry and European ancestry but not Native American ancestry. Correlations were not attributable to variation in socioeconomic status or geographic heterogeneity. Past evidence of spouse correlation was also seen in the strong evidence of linkage disequilibrium between unlinked markers, which was accounted for in regression analysis by ancestral allele frequency difference at the pair of markers (European versus Native American for Mexicans, European versus African for Puerto Ricans). We also observed an excess of homozygosity at individual markers within the spouses, but this provided weaker evidence, as expected, of spouse correlation. Ancestry variance is predicted to decline in each generation, but less so under assortative mating. We used the current observed variances of ancestry to infer even stronger patterns of spouse ancestry correlation in previous generations.
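A quick aside on the claim that ancestry variance declines each generation, but less so under assortative mating. Here is a minimal sketch, assuming a deliberately simple model of my own choosing: each child's ancestry proportion is the average of its two parents', both parents come from a population with the same variance, and their ancestries are correlated by a fixed spouse correlation. Under those assumptions Var(child) = Var(parent) x (1 + r) / 2. This ignores segregation variance within families and any new admixture, and it is not the authors' actual inference procedure.

```python
def ancestry_variance(v0, spouse_corr, generations):
    """Variance of individual ancestry proportion after `generations`.

    Assumes child ancestry = mean of the two parents' ancestries, equal
    parental variances, and a constant spouse correlation `spouse_corr`,
    so the variance is multiplied by (1 + spouse_corr) / 2 each generation.
    """
    return v0 * ((1.0 + spouse_corr) / 2.0) ** generations


if __name__ == "__main__":
    v0 = 1.0  # starting variance, arbitrary units
    for r in (0.0, 0.3, 0.4):  # 0.0 is random mating; others are illustrative
        remaining = ancestry_variance(v0, r, 10)
        print(f"spouse correlation {r:.1f}: {remaining:.4f} of variance left after 10 generations")
```

Under random mating the variance halves every generation, so any sizable ancestry variance observed today after many generations points to assortative mating, continuing admixture, or both.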

The correlations are to the left. An interesting point is that the correlations of total genome content seem too high to be explained by assortative mating for salient physical features (skin color, hair form, etc.) alone. From the text:

Another possibility involves physical characteristics, such as skin pigment, hair texture, eye color, and other physical features. Certainly, these traits are correlated with ancestry and are likely to be factors in mate selection. However, the spouse correlation for these traits must be high and the correlation of these traits with ancestry must also be high to explain the observed ancestry correlations....

...

If the spouse trait correlation is 0.6 (a reasonably high value), then for a spouse ancestry correlation of 0.3 (Puerto Ricans), the trait-ancestry correlation is 0.7; for a spouse ancestry correlation of 0.4 (Mexicans), the trait-ancestry correlation is 0.8. Previous studies on assortative mating in Latin American groups have retrieved correlation coefficients of 0.29 to 0.46 for education level, 0.48 for skin reflectance, 0.07 to 0.18 for eye and hair color, and 0.16 to 0.24 for different anthropometric measurements

As noted above, they controlled for SES and geography, and the correlation remains. Looking at the correlations within the genomes of these individuals they also inferred that assortative mating in the past was actually greater than it is today (they also have a historical citation which suggests this). I wonder if the correlation of ancestry is due to sorting by many traits which are subtle and nuanced, and relatively difficult to capture in surveys of the coarse salient traits that are used to categorize phenotypic races. Looking at many traits, as opposed to a few, one would have a better sense of total genome content. When it comes to mating one might look to a range of traits which in other circumstances are not noted, or fall below the threshold of reflective awareness. I'm assuming there might be something here which is Gestalt and subconscious. Kind of like the various studies which attempt to correlate mate preferences by HLA polymorphism.
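The numbers in the quoted passage are consistent with a simple path model in which spouses sort only on a trait, so that the spouse ancestry correlation equals the spouse trait correlation times the square of the trait-ancestry correlation. I have not checked that this is exactly the formula the authors used; it is an assumption that happens to reproduce their figures to one decimal place. The function names are mine.

```python
import math


def spouse_ancestry_corr(spouse_trait_corr, trait_ancestry_corr):
    """Spouse ancestry correlation induced purely by sorting on one trait.

    Assumed path model: ancestry -> trait in each spouse, spouses correlated
    only through the trait, so the correlations multiply along the path.
    """
    return spouse_trait_corr * trait_ancestry_corr ** 2


def required_trait_ancestry_corr(observed_spouse_ancestry_corr, spouse_trait_corr):
    """Trait-ancestry correlation needed to explain an observed spouse ancestry correlation."""
    return math.sqrt(observed_spouse_ancestry_corr / spouse_trait_corr)


if __name__ == "__main__":
    for label, observed in (("Puerto Ricans", 0.3), ("Mexicans", 0.4)):
        for trait_corr in (0.6, 0.48):  # 0.48 is the quoted skin reflectance value
            needed = required_trait_ancestry_corr(observed, trait_corr)
            print(f"{label}: spouse trait r = {trait_corr}, needed trait-ancestry r = {needed:.2f}")
```

With a spouse trait correlation of 0.6 this gives about 0.71 and 0.82, matching the quoted 0.7 and 0.8. Plugging in the quoted 0.48 for skin reflectance pushes the required trait-ancestry correlation to roughly 0.79 and 0.91, which is why a single coarse trait looks like a stretch.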

Labels: Genetics, Population genetics

Tuesday, November 10, 2009

Spengler does it again! posted by Razib @ 11/10/2009 12:03:00 PM

Just Spengler (David Goldman) being Spengler, From "Zionism is Racism" to "Judaism is Racism":

Judaism has nothing to do with race-there are Jews of every race-but it does have to do with family. Jews are members of Abraham's family. Not only tradition, but a great deal of DNA evidence support this claim. To insist that Jews adopt the criterion of "belief" for membership is to rule that God must act in accordance with a human court's notion of the permissible range of God's behavior. No wonder the Reform Jews and the British Humanist Association support this.

1) Yes, Jews are genetically distinct.

2) But, they are also the product of genetic admixture.

3) And, it seems more likely that that admixture arrived via maternal lineages, that is, gentile female ancestors (the mtDNA results are somewhat confused, but the Y lineages seem to be relatively strongly Middle Eastern in provenance in comparison to total genome content).

In light of the fact that the debate is over the validity of the criterion of maternal descent as to "Who is a Jew," it seems deceptive to appeal to genetics when that field opens up more questions in regards to Jewish tradition than it closes. Of course, this sort of shell-game is normal behavior for Spengler. Someone should really put a "For Entertainment Purposes Only" sticker on his blog.

Labels: Genetics, Jewish Genetics

Friday, October 23, 2009

What's going on at ASHG 2009? posted by Razib @ 10/23/2009 10:11:00 PM

If you haven't been following the goings-on via Twitter, Luke Jostins has been posting some tidbits on his blog, Genetic Inference. If you get interested in something, remember you can search abstracts.

Labels: Genetics, Population genetics

Inferring demographic history posted by Razib @ 10/23/2009 12:52:00 PM

Very interesting paper in PLoS Genetics, Inferring the Joint Demographic History of Multiple Populations from Multidimensional SNP Frequency Data. Here's the author summary:

The demographic history of our species is reflected in patterns of genetic variation within and among populations. We developed an efficient method for calculating the expected distribution of genetic variation, given a demographic model including such events as population size changes, population splits and joins, and migration. We applied our approach to publicly available human sequencing data, searching for models that best reproduce the observed patterns. Our joint analysis of data from African, European, and Asian populations yielded new dates for when these populations diverged. In particular, we found that African and Eurasian populations diverged around 100,000 years ago. This is earlier than other genetic studies suggest, because our model includes the effects of migration, which we found to be important for reproducing observed patterns of variation in the data. We also analyzed data from European, Asian, and Mexican populations to model the peopling of the Americas. Here, we find no evidence for recurrent migration after East Asian and Native American populations diverged. Our methods are not limited to studying humans, and we hope that future sequencing projects will offer more insights into the history of both our own species and others.

And from the abstract:

We infer divergence between West African and Eurasian populations 140 thousand years ago (95% confidence interval: 40-270 kya). This is earlier than other genetic studies, in part because we incorporate migration. We estimate the European (CEU) and East Asian (CHB) divergence time to be 23 kya (95% c.i.: 17-43 kya), long after archeological evidence places modern humans in Europe. Finally, we estimate divergence between East Asians (CHB) and Mexican-Americans (MXL) of 22 kya (95% c.i.: 16.3-26.9 kya), and our analysis yields no evidence for subsequent migration.

I would keep in mind these 95% confidence intervals, but I immediately wondered about this European-East Asian divergence time just like Dienekes.

Labels: Genetics, Population genetics

Monday, October 19, 2009

Humans still evolving, etc. posted by Razib @ 10/19/2009 04:26:00 PM

Are Humans Still Evolving? Absolutely, Says A New Analysis Of A Long-term Survey Of Human Health:

"There is this idea that because medicine has been so good at reducing mortality rates, that means that natural selection is no longer operating in humans," said Stephen Stearns of Yale University. A recent analysis by Stearns and colleagues turns this idea on its head....

Taking advantage of data collected as part of a 60-year study of more than 2000 North American women in the Framingham Heart Study, the researchers analyzed a handful of traits important to human health. By measuring the effects of these traits on the number of children the women had over their lifetime, the researchers were able to estimate the strength of selection and make short-term predictions about how each trait might evolve in the future. After adjusting for factors such as education and smoking, their models predict that the descendents of these women will be slightly shorter and heavier, will have lower blood pressure and cholesterol, will have their first child at a younger age, and will reach menopause later in life.

Since large numbers of humans forgo reproduction, in an evolutionary sense they might as well have died (excluding some inclusive fitness effects). If reproductive variance and heritable variation in traits correlated with that variance continue, then natural selection will be an operative phenomenon.

The paper is coming out in PNAS, so no guarantee when it'll be online: Byars, S., D. Ewbank, et al. Natural selection in a contemporary human population. Proceedings of the National Academy of Sciences, 106(42) DOI: 10.1073/pnas.0906199106.

Labels: Evolution, Genetics

Tuesday, October 13, 2009

There are no NFL genes (?) posted by Razib @ 10/13/2009 01:34:00 PM

23andMe performs genome-wide association study on NFL players, fails to find athlete genes:

It's unsurprising that the results of this study are negative (more on this below), but the conclusions they draw from this are fallacious. In fact we know from twin and family studies that many (but not all) traits related to athletic performance are highly heritable; researchers just haven't been able to track down the vast majority of the genetic variants responsible yet, and this study is no exception.

What 23andMe have actually shown here is that the limited subset of genetic variation captured by their genotyping chip (which almost exclusively targets genetic variants with a frequency of greater than 5%) doesn't include any variants with an extremely strong association with NFL prowess.

That shouldn't come as a surprise to anyone who's been following advances in human genetics for the last few years; a genome-wide association study on a highly complex trait with a sample size of 100 has, historically speaking, a vanishingly small chance of yielding any positive results at all. (Yes, there are exceptions, but I don't think a sensible prior expectation would be that athletic performance has a similar genetic architecture to macular degeneration.)

NFL players are taller and heavier than average, in addition to being able to run the 40 in 4.5 seconds. Seems like a lot of these are quantitative traits.

Labels: athletes, Genetics

Wednesday, September 23, 2009

The genomes of Indians posted by Razib @ 9/23/2009 04:23:00 PM

A new paper is getting a lot of press, Reconstructing Indian population history. I will probably have something up on ScienceBlogs tomorrow (have to read the supplements). But I thought I'd highlight a paragraph in the text:

We warn that 'models' in population genetics should be treated with caution. Although they provide an important framework for testing historical hypotheses, they are oversimplifications. For example, the true ancestral populations of India were probably not homogeneous as we assume in our model, but instead were probably formed by clusters of related groups that mixed at different times. However, modelling them as homogeneous fits the data and seems to capture meaningful features of history.

This caution did not percolate to the level of the press releases from what I can gather. John Hawks has some criticisms up....

Labels: Genetics

Saturday, September 19, 2009

The randomness of model organisms posted by p-ter @ 9/19/2009 12:33:00 PM

I thought I'd point quickly to a really nice paper showing that the RNAi pathway, thought to be absent in budding yeasts, is actually only missing from baker's yeast, Saccharomyces cerevisiae. Remarkably, the authors are able to reconstitute the pathway (which was presumably present in the ancestor of all budding yeasts) in S. cerevisiae with exogenous expression of only two genes. The authors close with a remark about the role of contingency (in particular with regards to the choice of model organism) in research:

While anticipating a productive future for RNAi research in budding yeasts, we note that if in the past S. castellii [a yeast with an endogenous RNAi pathway] rather than S. cerevisiae had been chosen as the model budding yeast, the history of RNAi research would have been dramatically different.

Labels: Genetics

Tuesday, September 15, 2009

Personal genomics, beyond the hype posted by Razib @ 9/15/2009 04:57:00 PM

ScienceDaily has an interesting piece, Individual Genetic Data Illuminates How Genes Influence Human Health. Points to two papers, Epistasis and Its Implications for Personal Genetics and Genetic Population Structure Analysis in New Hampshire Reveals Eastern European Ancestry. If you read a weblog like Genetic Future you are probably cognizant of the fact that personal genomics firms are invested in overselling and hyping their current efficacy to serve their economic interests. And yet I was intrigued as to the disjunction between the present day capabilities of personal genomics and the perception of its power in the general media when listening to a recent episode of Planet Money. The hosts spoke as if personal genomics had already rendered health insurance obsolete because science had removed uncertainty from prediction of disease risk. Even when personal genomics does become powerful enough to push itself far beyond the margins when it comes to affecting personal health decisions, randomness is probably going to be a big factor in who becomes ill simply because randomness is a fact of biology.

Labels: Genetics, Personal Genomics

Saturday, September 05, 2009

EDAR & the shovel-shaped incisor posted by Razib @ 9/05/2009 07:23:00 PM

Dienekes is posting more ASHG abstracts. This one is interesting:

A nonsynonymous SNP in EDAR is associated with tooth shoveling

Teeth display variations among individuals in the size and the shape of cusps, ridges, grooves, and roots. In addition, there are certain dental characteristics which are predominant in certain human groups, such as tooth shoveling of upper incisors that is major in Asian populations but rare or absent in African and European populations...Human genome diversity data have revealed that the derived allele of a nonsynonymous single nucleotide polymorphism (SNP), rs3827760 that is also called EDAR T1540C, is predominant in East Asian populations but absent in populations of African and European origins. It has recently been reported that the 1540C allele is associated with Asian-specific hair thickness. The aim of this study is to clarify whether the nonsynonymous polymorphism in EDAR is also associated with dental morphology in humans or not. For this purpose, we measured crown diameters and tooth shoveling grades, genotyped EDAR T1540C, and analyzed the correlations between them in Japanese populations. To comprehend individual patterns of dental morphology, we applied a principal component analysis (PCA) to individual-level metric data, the result of which implies that multiple types of factors affect the tooth size. This study clearly demonstrated that the number of the Asian-specific EDAR 1540C allele is strongly correlated with the tooth shoveling grade.

We've posted on EDAR. Interesting that it seems it is related to another classic "Mongoloid" physical trait, the shovel-shaped incisor, which loomed large back in the day when bones and teeth were the way you identified remains. Of course, other ancient populations had shovel-shaped incisors, so it isn't as if this is totally unique to the peoples of East Asia and the Americas. In any case, this shouldn't be too surprising, EDAR does a lot of things, as evident by the summary in GeneCards:

This gene encodes a member of the tumor necrosis factor receptor family. The encoded transmembrane protein is a receptor for the soluble ligand ectodysplasin A, and can activate the nuclear factor-kappaB, JNK, and caspase-independent cell death pathways. It is required for the development of hair, teeth, and other ectodermal derivatives. Mutations in this gene result in autosomal dominant and recessive forms of hypohidrotic ectodermal dysplasia.

Labels: Genetics

Friday, September 04, 2009

Super Y lineages over the past 10,000 years posted by Razib @ 9/04/2009 03:45:00 PM

Dienekes posted a bunch of abstracts from the 2009 American Society of Human Genetics meeting. This one is of interest in light of recent posts on this weblog:

Some Y-chromosomal haplotypes have been found at unusually high frequencies in Asian and European human populations. The massive spread of these lineages has been explained by the impact of social selection i.e. the high reproductive success of some males and their relative/descendants due to their high social status. The most well-known examples are the "Khan haplotype" and the "Manchou haplotype" in Asia, and the U’Neill haplotype in Ireland. But are these frequent haplotypes always associated with recent events of social selection, or could they be linked to much older processes? To address this question, we have surveyed ~ 3500 males in 97 populations from Turkey to Japan. We have focused on the 12 most frequently represented haplotypes in Eurasia and tested whether their expansions are linked to a specific factor such as language or subsistence methods. Our results show that both recent and ancient processes are responsible for the expansions of these lineages. The recent expansions (2000-3000 years) likely to be linked to social selection are prevalent in Altaic-speaking and pastoral populations. This might indicate a recent cultural change in the social organization of these populations. The ancient expansions (8000-10000 years) are over-represented in Indo-European speaking and sedentary farmer populations, and are likely to be the result of the Neolithic transition.

Asymmetries between male and female lineages are always of interest. For example, diversity of Y and mtDNA correlates well with patrilocality vs. matrilocality. The idea of "super-male" lineages was mooted by Bryan Sykes several years ago in the wake of the "Genghis Khan haplotype", though it benefited from particular preconceptions many have about the nature of male genetic reproductive fitness. But it is likely that these dynamics vary by population due to ecological and/or social parameters. The time window for the expansion of Y lineages among Altaic speakers is very suggestive in light of historical records and archaeological data. It seems that early on (i.e., before 500 BCE) horse-based nomadism was dominated by Indo-Europeans, predominantly Iranians, in Eurasia. In the few centuries before Christ the populations of the eastern steppe, the precursors of Altaic language families, adopted this lifestyle, and to a great extent superseded the Iranian populations across the length and breadth of the non-sedentary zone over the next 1,500 years (the fact that the Ossetians are now a people who reside in the Caucasus is illustrative of the great retreat of Iranian peoples on the steppe). I have suggested that there is a winner-take-all dynamic in regards to steppe polities, and I suspect this will be reflected in the genetics of male lineages as well.

* It is notable that Ireland was to a great extent a pastoralist society during the period of domination by the Ui Neill.

Labels: Genetics, History

Monday, August 31, 2009

Curly haired dogs posted by Razib @ 8/31/2009 12:40:00 AM

Since I see p-ter hasn't posted on this, in Science, Coat Variation in the Domestic Dog Is Governed by Variants in Three Genes:

Coat color and type are essential characteristics of domestic dog breeds. While the genetic basis of coat color has been well characterized, relatively little is known about the genes influencing coat growth pattern, length, and curl. We performed genome-wide association studies of more than 1000 dogs from 80 domestic breeds to identify genes associated with canine fur phenotypes. Taking advantage of both inter- and intrabreed variability, we identified distinct mutations in three genes, RSPO2, FGF5, and KRT71 (encoding R-spondin-2, fibroblast growth factor-5 and keratin-71, respectively), which together account for the majority of coat phenotypes in purebred dogs in the United States. This work illustrates that an array of varied and seemingly complex phenotypes can be reduced to the combinatorial effects of only a few genes.

See ScienceDaily for summary. This will help us cure cancer! OK, probably not, but hopefully we might get closer to understanding hair form beyond EDAR.

Labels: Dog genetics, Genetics, Hair

Friday, August 28, 2009

Bad headlines? posted by Razib @ 8/28/2009 10:59:00 AM

Nature News headline: Human-chimp interbreeding challenged. Makes it seem like there's a breeding program somewhere, and others are challenging it on ethical grounds....

Labels: Genetics

Thursday, August 27, 2009

Computing the spread of lactase persistence posted by Razib @ 8/27/2009 08:08:00 PM

As most readers of this weblog know, most humans as adults cannot digest lactose. The ability to digest lactose via the persistence of the enzyme lactase is differentially distributed. Both inferential methods and a small number of ancient genetic extractions suggest that this ability arose within the last 10,000 years. A new paper, The Origins of Lactase Persistence in Europe:

Most adults worldwide do not produce the enzyme lactase and so are unable to digest the milk sugar lactose. However, most people in Europe and many from other populations continue to produce lactase throughout their life (lactase persistence). In Europe, a single genetic variant, −13,910*T, is strongly associated with lactase persistence and appears to have been favoured by natural selection in the last 10,000 years. Since adult consumption of fresh milk was only possible after the domestication of animals, it is likely that lactase persistence coevolved with the cultural practice of dairying, although it is not known when lactase persistence first arose in Europe or what factors drove its rapid spread. To address these questions, we have developed a simulation model of the spread of lactase persistence, dairying, and farmers in Europe, and have integrated genetic and archaeological data using newly developed statistical approaches. We infer that lactase persistence/dairying coevolution began around 7,500 years ago between the central Balkans and central Europe, probably among people of the Linearbandkeramik culture. We also find that lactase persistence was not more favoured in northern latitudes through an increased requirement for dietary vitamin D. Our results illustrate the possibility of integrating genetic and archaeological data to address important questions on human evolution.

Here's a graphical illustration of their conclusion:

Labels: Genetics, Population genetics

Thursday, August 20, 2009

MECP2 and brain structure posted by Razib @ 8/20/2009 11:37:00 PM

ScienceDaily, Genetic Variations Linked To Brain Size. The write-up seems a bit garbled to me, so probably best to read the paper, A common MECP2 haplotype associates with reduced cortical surface area in humans in two independent populations, when it is live on the PNAS site.

Labels: Genetics, Neuroscience

Nudge the fat; satiety & the implicit mind posted by Razib @ 8/20/2009 09:58:00 PM

Megan McArdle has been talking about the high heritability of BMI again. I have expressed concern about her putting the high heritability numbers out there when it comes to its relevance for public policy, though I do tend to agree with her general stance that glib assertions about the importance of will-power are probably non-starters. And, rather than point to arguments such as "I have a slow metabolism," it is probably more critical to emphasize the complexity of the chain of events and framing of how we make decisions, much of which occurs "under the hood" and outside the purview of conscious explicit control. Interestingly, the reality that choice is highly conditioned by details of our environment combined with innate predispositions, and proximately is driven by many implicit factors, has pushed me in a less libertarian direction.

In any case, the whole discussion got me interested in the topic of obesity & heritability, and I found this review, Human Obesity: A Heritable Neurobehavioral Disorder That Is Highly Sensitive to Environmental Conditions. You can read the full text, it's Open Access now, but this part caught my attention:

...He hypothesizes that random natural variation in "hypothalamic energy balance set points" has occurred over millions of years of primate evolution. Whereas variants that would tend to produce a state of low energy stores would have been systematically selected against, at least in part because of their adverse impact of reproductive success, upward drifts in such set points would have been allowed to persist (rather than being positively selected for, as the “thrifty gene” hypothesis would have it). This upward drift would be particularly prominent because the formation of organized social groups and the discovery of fire, both of which occurred around 2,000,000 years ago, made our ancestors less susceptible to predation. Not particularly emphasized by Speakman, but likely to be important, is the probability that such natural tendencies toward an upward drift in adipose stores may rarely have actually manifested themselves as obesity because of the high energy cost of obtaining food during most of human evolution. It is only in the past 50 years or so, when for the first time in human history the majority of people in the developed and developing world can readily access sufficient daily calories to exceed the calories expended in acquiring them, that those with intrinsically higher set points have manifested their "obesity potential" on a grand scale. Unlike the “thrifty gene” hypothesis, this scenario provides a credible explanation for the fact that even in places where obesity is very common, a substantial proportion of the population remains lean.

This is an old hobby horse of mine: if you see a quantitative trait which can be conceived of as normally distributed with a high degree of heritability, such as body mass index, then its fitness implication can't have been too stark. In other words, if a very heritable trait still has a great deal of extant genetic variation, then either that variation is transient or, more likely, the fitness implication of any particular trait value was low or there are balancing dynamics preserving the variance. Like IQ, body weight has been increasing over the past century. Many people think that they know the reason why this is occurring. If the reasons are ever established to a high degree of certitude, is it possible to reverse the slouch toward obesity without coercion?

Labels: Genetics, Origin of obesity

Tuesday, August 18, 2009

In defense of big genetics posted by p-ter @ 8/18/2009 08:19:00 PM

Greg Mayer, filling in for Jerry Coyne, has a post up on a somewhat odd objection to the appointment of Francis Collins as director of NIH: that he's a geneticist. The argument seems to be that diseases are complicated and not entirely genetic, and that Collins isn't hip to non-genetic subtleties. To be frank, this is silly--while it's sometimes a revelation to non-biologists that the "gene for X" way of framing things is inaccurate, Collins is not incompetent. If I had to guess what direction he's planning on taking the NIH, I'd look to what he's actually written.

In the comments to the post, there's the additional worry that Collins represents "big science", which I suppose is considered to be a bad thing (apparently Collins thinks it would be nice to catalogue all the transcripts in a cell, which for whatever reason really pissed off this dude). It's not a bad thing at all; in many cases, big, relatively hypothesis-free science is actually really nice for rationally choosing which "little science" projects to pursue.

Let's take a couple of recent examples from genome-wide association studies (these studies over the past few years have exponentially increased our understanding of complex disease; whether that exponential increase is enough for you depends on your prior expectations). First, a little over two years ago, an association was found between a genetic variant in the FTO gene and obesity in humans. At the time, the gene had unknown function. Now, there's a mouse model and focused biochemical analysis being done on this gene, and we're light years closer to understanding what it does and how nearby variation influences obesity. Would all of this have been done without the "hypothesis-free" GWAS? Not anytime soon.

Second, consider the genome-wide association studies in several cancers that all pointed to the same, gene-free region on chromosome 8. In the last few weeks, three separate groups have published their "small-scale" molecular biology work establishing that the associated region appears to be an enhancer important for either proper temporal or spatial gene expression. How does it work? It's not clear, but that's the point--this is an interesting question. Much of "small-scale" molecular biology is done in a few model systems, or on a few "popular" genes. There's a very good reason for this--these systems or genes are already known to be interesting either scientifically or medically. One efficient way to identify novel, potentially interesting systems is through large-scale work.

Labels: Genetics, Genomics

Monday, August 17, 2009

Breeding a better athlete posted by Razib @ 8/17/2009 02:43:00 PM

Taller, heavier: the speedy evolution of the fastest people on the planet:

While the average person has gained about five centimetres since 1900, the height of champion runners has increased 16.2 centimetres, say Duke University researchers, Jordan Charles and Adrian Bejan, who studied the heights and weights of 100-metre world record holders.

''The trends revealed by our analysis suggest that speed records will continue to be dominated by heavier and taller athletes,'' said Mr Charles, whose study was published last month in The Journal of Experimental Biology .

...

While Dr Norton dismissed those predictions, he believed that the laws of genetics, thanks to the habit of athletes marrying athletes - and possibly even the creation of athlete sperm banks - meant runners would continue growing taller, more powerful and faster.

Remember the "Little Hercules" with the myostatin mutation? His mother was very muscular, and reportedly there was a family history of mesomorphicity. One way the population-level mean of a quantitative trait can shift through selection beyond the most extreme values of the original population, without any new mutation being necessary, is simply for the underlying allele frequencies to change enough that originally unlikely combinations become common. Assortative mating is another variant of this dynamic: if people several sigmas from the mean mate, then new combinations are likely to emerge.
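To put rough numbers on the allele-frequency point, here is a minimal sketch under a deliberately simple model that is my assumption, not anything from the article: 20 unlinked diploid loci, each with a "+" allele that adds one unit to the trait, in Hardy-Weinberg and linkage equilibrium, so the number of "+" alleles an individual carries is binomial. The function name, the locus count, and the frequencies 0.5 and 0.8 are all hypothetical.

```python
from math import comb


def prob_at_least(k, n_loci, p):
    """P(an individual carries at least k '+' alleles) across n_loci diploid loci.

    Assumes independent loci and random mating, so the count of '+' alleles
    is Binomial(2 * n_loci, p). This is an illustrative toy model only.
    """
    n = 2 * n_loci
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Hypothetical numbers: 20 loci (40 allele copies), "extreme" means >= 35 '+' alleles.
before = prob_at_least(35, 20, 0.5)  # '+' alleles at 50% frequency
after = prob_at_least(35, 20, 0.8)   # selection has pushed '+' alleles to 80%

print(f"P(extreme) before selection: {before:.2e}")
print(f"P(extreme) after selection:  {after:.2e}")
```

In this toy setup no new allele appears, yet a genotype that was under one in a million in the original population becomes fairly ordinary once the frequencies move.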

Labels: Genetics, Sports

Sunday, August 16, 2009

Including genetic information in clinical trials: hepatitis C and IL28B posted by p-ter @ 8/16/2009 06:41:00 PM

Online this week, Nature has published a genome-wide association study for response to treatment for chronic hepatitis C infection; the authors identify a polymorphism in an interleukin gene that is a strong predictor of how well an individual is able to clear the virus. Interestingly, the frequency of the polymorphism in different populations tracks the previously noted population difference in drug response, and the authors claim to explain half the difference in response rate between African- and European-Americans with this single polymorphism.

This paper is also interesting in that it represents one of the first (if not the first?) studies to coordinate a drug trial (in this case, three different treatments for hepatitis C) with a genome-wide association study. This promises to lead to both important advances--as researchers are able to identify genetic subgroups of individuals who respond (or not) to a drug, even if it is ineffective in the population as a whole--as well as (my cynical side speaks) additional opportunities for misleading post hoc analyses by drug companies to try to salvage and market drugs that don't work. Hopefully, mostly the former--this could be an important step towards legitimizing genetic information in the eyes of MDs, and an ever-so-slight step towards personalized medicine.

Labels: Genetics

Friday, August 14, 2009

Sleep genetics posted by p-ter @ 8/14/2009 06:21:00 PM

A remarkable study published in Science this week identifies a rare mutation in the gene DEC2 which influences the duration of sleep in humans. The authors started with a family where patterns of short sleep (about 6 hours of sleep a night on non-workdays, versus ~8 hours for other people in the family) seemed to follow a Mendelian inheritance pattern. In a candidate gene resequencing study, they identified a mutation in all of two people--a mother and daughter--with the short sleep pattern.

In what can only be described as a ballsy move, the authors then invested what must have been a considerable amount of time and money on following up this mutation. Which, I emphasize again, was found in only two individuals in a single family. In particular, they generated mice carrying both of the human versions of the gene, and were thus able to explicitly compare the two human alleles in an animal model. (This is in contrast to most mouse studies, which completely remove a gene or dramatically up-regulate it). Not wishing to show any mammalian bias, they did the same thing in flies. In all cases, the results were consistent with the human data--the low-sleep allele led to increased activity in both species.

Out of curiosity, is 6 hours of sleep (without an alarm clock) really all that odd for people? It certainly would be for me, but I feel like I know plenty of people who claim to naturally need only about that much.

Labels: Genetics

Tuesday, August 04, 2009

The neuroscience of psychopathy posted by Razib @ 8/04/2009 09:15:00 PM

Altered connections on the road to psychopathy:

... Earlier studies suggested that dysfunction of the amygdala and/or orbitofrontal cortex (OFC) may underpin psychopathy. Nobody, however, has ever studied the white matter connections (such as the uncinate fasciculus (UF)) linking these structures in psychopaths. Therefore, we used in vivo diffusion tensor magnetic resonance imaging (DT-MRI) tractography to analyse the microstructural integrity of the UF in psychopaths (defined by a Psychopathy Checklist Revised (PCL-R) score of 25) with convictions that included attempted murder, manslaughter, multiple rape with strangulation and false imprisonment. We report significantly reduced fractional anisotropy (FA) (P<0.003), an indirect measure of microstructural integrity, in the UF of psychopaths compared with age- and IQ-matched controls. We also found, within psychopaths, a correlation between measures of antisocial behaviour and anatomical differences in the UF. To confirm that these findings were specific to the limbic amygdala–OFC network, we also studied two 'non-limbic' control tracts connecting the posterior visual and auditory areas to the amygdala and the OFC, and found no significant between-group differences. Lastly, to determine that our findings in UF could not be totally explained by non-specific confounds, we carried out a post hoc comparison with a psychiatric control group with a past history of drug abuse and institutionalization. Our findings remained significant. Taken together, these results suggest that abnormalities in a specific amygdala–OFC limbic network underpin the neurobiological basis of psychopathy.

I'm a little skeptical about psychiatry's ability to diagnose distinctive phenotypes in general, but from what I have read genuinely amoral psychopaths are a real phenomenon, and not a politicized constructed pathology. Readers with more neuroscience chops are invited to weigh in if this is another sexy neuro paper with little substance. Also see ScienceDaily.

Labels: Genetics, Psychology

Monday, August 03, 2009

Whence Canis familiaris? posted by p-ter @ 8/03/2009 09:20:00 PM

The NY Times reports on a new paper calling into question the theory that the dog was first domesticated in East Asia.

The evidence for an East Asian origin of dogs came from a study of mitochondrial DNA, which showed that dog populations in East Asia harbored more diversity than their counterparts in other parts of the world (the reasoning is that more diverse populations are likely to be the origin of a species; populations which split off and move about will be less diverse). The authors of the new paper, however, point out that this study included more semi-wild "village dogs" from East Asia than from other parts of the world, and that these village dogs tend to be more diverse than purebreds from the same part of the world.

When African village dogs are included in the analysis, the East Asian dogs no longer stand out as extremely diverse (see right; the fitted line is what would be expected if diversity were equal in all places). Though this places the East Asian origin hypothesis in serious doubt, for now the authors--having only sampled African dogs (Africa does not have the wolf from which dogs are thought to have been domesticated)--do not present an alternative. As more data is collected from dogs around the world, this state of affairs is unlikely to last.

Labels: Genetics

Thursday, July 23, 2009

Lactase persistence, pastoralism in Africa, don't know in Europe posted by Razib @ 7/23/2009 11:41:00 PM

Impact of Selection and Demography on the Diffusion of Lactase Persistence:

The lactase enzyme allows lactose digestion in fresh milk. Its activity strongly decreases after the weaning phase in most humans, but persists at a high frequency in Europe and some nomadic populations. Two hypotheses are usually proposed to explain the particular distribution of the lactase persistence phenotype. The gene-culture coevolution hypothesis supposes a nutritional advantage of lactose digestion in pastoral populations. The calcium assimilation hypothesis suggests that carriers of the lactase persistence allele(s) (LCT*P) are favoured in high-latitude regions, where sunshine is insufficient to allow accurate vitamin-D synthesis. In this work, we test the validity of these two hypotheses on a large worldwide dataset of lactase persistence frequencies by using several complementary approaches.

...

Our results show that gene-culture coevolution is a likely hypothesis in Africa as high LCT*P frequencies are preferentially found in pastoral populations. In Europe, we show that population history played an important role in the diffusion of lactase persistence over the continent. Moreover, selection pressure on lactase persistence has been very high in the North-western part of the continent, by contrast to the South-eastern part where genetic drift alone can explain the observed frequencies. This selection pressure increasing with latitude is highly compatible with the calcium assimilation hypothesis while the gene-culture coevolution hypothesis cannot be ruled out if a positively selected lactase gene was carried at the front of the expansion wave during the Neolithic transition in Europe.

The "calcium hypothesis" idea is of course one of the explanations for light skin in Northern Europe as well. The locus responsible for 1/3 of the skin color difference between Africans and Europeans, SLC24A5, is a relatively recent sweep, on the order of the last 10,000 years. The authors do caution readers to be careful about the assumptions of their model. Point taken to heart, as I don't think they have a good enough grasp on the fine-grained variation in the lactase persistence alleles and how they track ecology within Europe. The Greenland Norse did not raise cattle just because of lack of Vitamin D (which they ended up getting through a shift toward a marine diet in any case); rather, there were ecological constraints in terms of the maximum productivity of grain-based subsistence farming (particularly with wheat in cold damp climates). In the conclusion of the paper it is noted that Iberia is a good test case of the model, and more data needs to be gathered there. If it is gene-culture coevolution then many Iberian peoples should be lactase persistent, but if it is due to Vitamin D, they should not be.

Labels: Genetics

Tuesday, July 21, 2009

Genetic background & medicine, HIV & differences between blacks & whites posted by Razib @ 7/21/2009 01:24:00 PM

The Duffy-null state is associated with a survival advantage in leukopenic HIV-infected persons of African ancestry:

Persons of African ancestry, on average, have lower white blood cell (WBC) counts than those of European descent (ethnic leukopenia), but whether this impacts negatively on HIV-1 disease course remains unknown. Here, in a large natural history cohort of HIV-infected subjects we show that although leukopenia...was associated with an accelerated HIV disease course, this effect was more prominent in leukopenic subjects of European than African ancestry. The African-specific -46C/C genotype of Duffy Antigen Receptor for Chemokines (DARC) confers the malaria-resisting, Duffy-null phenotype, and we found that the recently described association of this genotype with ethnic leukopenia extends to HIV-infected African Americans (AA). The association of Duffy-null status with HIV disease course differed according to WBC but not CD4+ T cell counts, such that leukopenic but not non-leukopenic HIV+ AAs with DARC -46C/C had a survival advantage compared with all Duffy-positive subjects. This survival advantage became increasingly pronounced in those with progressively lower WBC counts. These data highlight that the interaction between DARC genotype and the cellular milieu defined by WBC counts may influence HIV disease course, and this may provide a partial explanation of why ethnic leukopenia remains benign in HIV-infected African Americans, despite immunodeficiency.

Duffy status is a highly ancestrally informative trait. This is a case where the relatively low between-population variance found among humans does not apply. Rather, it seems that the Duffy null phenotype is a recent adaptation to malaria among West Africans. Because malaria has such a strong fitness implication, many independent genetic adaptations have emerged, many of them with other negative side effects. On net, individuals with side effects may still have higher fitness in an environment where malaria is endemic. Sometimes the net benefit is most evident on a population-wide scale: sickle-cell anemia is a deleterious homozygote which exists because of the much higher fitness of heterozygotes vis-a-vis wild type homozygotes. Many malaria adaptations exhibit the large-effect dynamic and suboptimal characteristic which one might expect from the early stages of natural selection in a Fisherian model. You deal with the adaptive pressures of the present and let the future take care of itself. In this case, the future involved HIV:

The researchers found that leukopenia was generally associated with a faster disease progression from HIV to AIDS, independent of known predictors of AIDS development. "On average, leukopenic European Americans progressed nearly three times faster than their non-leukopenic African or European counterparts," explained Hemant Kulkarni, MD, first author of this study. "However, leukopenic African Americans had a slower disease course than leukopenic European Americans, even though twice as many African Americans in the study had leukopenia."

The investigators found that the DARC variation, not race, explained the differences in WBC counts in African Americans with HIV. Among those who were leukopenic, only those with the DARC variation experienced a significant survival benefit. Additionally, this survival advantage became increasingly pronounced in those with progressively lower WBC counts, suggesting that the interaction between DARC and WBC counts was the primary influence on slowing HIV disease progression in African Americans.

There are no doubt details in the genetic architecture of those with the null genotype worth future investigation.
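As an aside on the sickle-cell remark above, the way a deleterious homozygote can persist under heterozygote advantage is easy to sketch with the textbook one-locus selection recursion. The fitness values below are hypothetical placeholders chosen for illustration, not estimates from any study: wild-type homozygotes have fitness 1 - s (malaria cost), heterozygotes 1, and sickle homozygotes 1 - t. Under this model the sickle allele frequency should settle at s / (s + t).

```python
def next_freq(q, s, t):
    """One generation of selection on the sickle allele frequency q.

    Fitnesses (assumed, illustrative): AA = 1 - s, AS = 1, SS = 1 - t.
    Assumes random mating, an effectively infinite population, no mutation.
    """
    p = 1.0 - q
    w_aa, w_as, w_ss = 1.0 - s, 1.0, 1.0 - t
    mean_w = p * p * w_aa + 2 * p * q * w_as + q * q * w_ss
    # Frequency of S among surviving gametes
    return (p * q * w_as + q * q * w_ss) / mean_w


s, t = 0.1, 0.8   # hypothetical selection coefficients
q = 0.01          # start with the sickle allele rare
for _ in range(500):
    q = next_freq(q, s, t)

expected = s / (s + t)  # textbook equilibrium under heterozygote advantage
print(f"simulated equilibrium: {q:.4f}, analytic s/(s+t): {expected:.4f}")
```

With these made-up numbers the allele sits at roughly 11% even though homozygotes for it do very badly, which is the "population-wide" net benefit in miniature.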

Labels: Genetics, human biodiversity, Population genetics, race

Monday, July 20, 2009

How evolution happens (sometimes, perhaps) posted by Razib @ 7/20/2009 12:46:00 PM

Partial penetrance facilitates developmental evolution in bacteria:

Development normally occurs similarly in all individuals within an isogenic population, but mutations often affect the fates of individual organisms differently...This phenomenon, known as partial penetrance, has been observed in diverse developmental systems. However, it remains unclear how the underlying genetic network specifies the set of possible alternative fates and how the relative frequencies of these fates evolve...Here we identify a stochastic cell fate determination process that operates in Bacillus subtilis sporulation mutants and show how it allows genetic control of the penetrance of multiple fates. Mutations in an intercompartmental signalling process generate a set of discrete alternative fates not observed in wild-type cells, including rare formation of two viable 'twin' spores, rather than one within a single cell. By genetically modulating chromosome replication and septation, we can systematically tune the penetrance of each mutant fate. Furthermore, signalling and replication perturbations synergize to significantly increase the penetrance of twin sporulation. These results suggest a potential pathway for developmental evolution between monosporulation and twin sporulation through states of intermediate twin penetrance. Furthermore, time-lapse microscopy of twin sporulation in wild-type Clostridium oceanicum shows a strong resemblance to twin sporulation in these B. subtilis mutants...Together the results suggest that noise can facilitate developmental evolution by enabling the initial expression of discrete morphological traits at low penetrance, and allowing their stabilization by gradual adjustment of genetic parameters.

Also, see the press release, Caltech-led team shows how evolution can allow for large developmental leaps. The headline is a bit grandiose.

Labels: Evolution, Genetics

Saturday, July 18, 2009

Toll-like receptors and human evolution posted by Razib @ 7/18/2009 12:48:00 AM

Evolutionary Dynamics of Human Toll-Like Receptors and Their Different Contributions to Host Defense. Interesting stuff on inter-population variation in the discussion:

Our data show that TLR1, and more specifically the nonsynonymous T1805G variant (I602S), is the genuine target of positive selection detected in the TLR10-TLR1-TLR6 gene cluster in Europeans. First, TLR1 is ~2 times more diverse in non-African than in African populations, a pattern not compatible with the African origin of modern humans...This pattern has been observed only once among the 323 genes (0.3%) sequenced by the Seattle SNP consortium. Thus, the increased diversity observed in TLR1 among non-Africans probably results from ongoing hitchhiking between the selected allele and neutral variation at linked sites. Second, the 1805G (602S) mutation presents the highest level of population differentiation (FST = 0.54) of all SNPs located in this gene cluster...Third, among the three nonsynonymous variants composing the haplotype identified as being under positive selection in Europeans (H34, see Figure S5), only the TLR1 1805G (602S) variant has a remarkable impairment effect on agonist-induced NF-κB activation, showing a decreased signaling by up to 60%...These findings are consistent with previous studies showing that, homozygous, and to a lesser extent heterozygous, individuals for the 1805G allele present impaired TLR1-mediated immune responses after whole blood stimulation ...Taken together, it is tempting to speculate that an attenuated TLR1-mediated signaling, and a consequently reduced inflammatory response, has conferred a selective advantage in Europeans - a scenario that would explain the very high frequency (51%) of the "hypo-responsiveness" T1805G mutation in Europe. This observation raises questions about the possible evolutionary conflict between developing optimal mechanisms of pathogen recognition by TLRs, and more generally PRRs, and avoiding an excessive inflammatory response that can be harmful for the host.

This looks to be the same area fingered earlier in Icelanders.

Labels: Genetics

Thursday, July 16, 2009

Dog legs: the genetics of short and stubby posted by p-ter @ 7/16/2009 07:32:00 PM

In recent years, the genetic mechanisms by which humans have generated massive phenotypic diversity in dogs have started to be uncovered. We now know, for example, much about the genetics of pigmentation in dogs, and a major gene controlling body size. This week, another phenotype--the short, stubby legs of some dog breeds (see right)--has been revealed to have a simple, but interesting genetic basis.

The authors mapped the short leg phenotype to a small region on chromosome 18; further analysis revealed that the probable causal mutation is the insertion of a transcriptionally-active processed (i.e., intronless) retrotransposed copy of the FGF4 gene. How this change leads to the phenotype itself is unknown, but understanding the mechanism will likely lead to some interesting biology.

Labels: Genetics

Thursday, July 09, 2009

A live birth is hard to do posted by Razib @ 7/09/2009 12:28:00 AM

Chromosomal Problems Affect Nearly All Human Embryos: Discovery May Explain Low Fertility Rates In Humans:

For the first time, scientists have shown that chromosomal abnormalities are present in more than 90% of IVF embryos, even those produced by young, fertile couples. Ms Evelyne Vanneste, a PhD student in the Centre for Human Genetics and the University Fertility Center, Leuven University, Belgium, told the 25th annual conference of the European Society of Human Reproduction and Embryology on July 1st, that the surprising finding meant that current techniques used in preimplantation genetic screening (PGS), where embryos are screened genetically in order to select the best embryo for transfer, do nothing to improve pregnancy and live birth rates. Indeed, it can lead to potentially viable embryos being discarded, she said.
...
"Although in vitro culture conditions are known to have a limited influence on the rate of chromosomal imbalances in IVF/ICSI embryos, it is probable that the chromosome instability observed in vitro also occurs in spontaneous pregnancies since, at most, 30% of human conceptions result in a live birth and more than 50% of spontaneous abortions carry chromosomal aberrations. The high rate of chromosomal abnormalities is almost certainly responsible for the low fecundity of humans compared with other mammals," she added.

The exact proportion of fertilizations which end in spontaneous abortion (or lack of implantation) seems rather sketchy from what I can tell, but it's very high. In The Cooperative Gene Mark Ridley suggests that the very high rates of spontaneous abortion among humans are one reason he is not particularly worried about increased genetic load, a prime concern of W. D. Hamilton. I assume he believes that the proportion of spontaneous abortions would simply increase for individuals who have a high mutational load. Hamilton worried about the fact that many more humans lived to reproduce than would have in the past, so that natural selection was no longer operative. Ridley is suggesting that actually the power of selection may simply be transferred to the gestational stage.

One interesting idea would be to see if different populations have different rates of spontaneous abortion. How one would get a precise measure of this, I don't know.

Labels: Genetics

Sunday, July 05, 2009

Schizophrenia genetics: complex posted by p-ter @ 7/05/2009 03:38:00 PM

Nicholas Wade (guest posting for John Tierney) points to a set of papers in this week's Nature reporting large genome-wide association studies of schizophrenia. The main upshot of the papers is that variants in the MHC region influence susceptibility to schizophrenia. This region influences susceptibility to a number of autoimmune diseases, so the association is suggestive evidence that schizophrenia has an autoimmune component as well.

Outside of the MHC, however, there are few convincing signals of association. One interpretation of this might be that there are simply no other common polymorphisms that influence risk of developing schizophrenia. One of the groups, however, set out to test whether this was the case. They took thousands of the top associated SNPs--none of them individually showing a strong association with the disease--and assembled them into a genotypic score for predicting whether an individual has schizophrenia. And indeed, using these thousands of markers, they were able to do significantly (in the statistical sense, not really in the practical sense) better than random at classifying individuals as schizophrenic or not from genotype data alone. Thus, the aggregate effect of thousands of polymorphisms impacts the development of this disease.
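For readers who want the mechanics of such a genotypic score, here is a minimal sketch. It is not the authors' pipeline: the weights, genotypes, and function names are all made up for illustration, and I am assuming the usual construction in which each SNP's risk-allele count (0, 1, or 2) is multiplied by a per-SNP weight and summed. The AUC function measures how often a randomly chosen case outscores a randomly chosen control (0.5 is chance).

```python
def genotypic_score(genotype, weights):
    """Weighted sum of risk-allele counts (0/1/2) across SNPs."""
    return sum(w * g for w, g in zip(weights, genotype))


def auc(case_scores, control_scores):
    """Probability that a random case outscores a random control (ties count half)."""
    wins = 0.0
    for c in case_scores:
        for k in control_scores:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical per-SNP weights (e.g. log odds ratios from a discovery sample).
weights = [0.05, 0.02, -0.03, 0.04]

# Toy genotypes: one list of risk-allele counts per individual.
cases = [[2, 1, 0, 1], [1, 2, 1, 2], [2, 0, 0, 1]]
controls = [[0, 1, 2, 0], [2, 1, 0, 1], [1, 1, 2, 0]]

case_scores = [genotypic_score(g, weights) for g in cases]
control_scores = [genotypic_score(g, weights) for g in controls]
print(f"AUC on toy data: {auc(case_scores, control_scores):.3f}")
```

In a real analysis the weights would be estimated in one sample and the score evaluated in an independent one; with thousands of tiny weights, the expectation from the description above is an AUC only modestly above 0.5.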

What is the practical significance of this? In terms of treatment or drug development, there is essentially none. But it does suggest that there will be no "silver bullet"--copy number polymorphism, rare variants, or what have you--that will solve schizophrenia genetics.

Labels: Genetics

Tuesday, June 30, 2009

Genetics of Cape Coloureds posted by Razib @ 6/30/2009 08:36:00 AM

A few weeks ago I noticed that the Wikipedia entry for Cape Coloureds has little fleshed out information on their genetics. As a mixed population it seems that people would be interested, but it has always been hard to find anything from Google Scholar on this topic. But the recent Tishkoff paper, The Genetic Structure and History of Africans and African Americans, has some data. You can find a full post at my other weblog, but it seems that not only are the Cape Coloureds substantially European, Khoisan and Bantu, but likely they're also substantially Indian, and there is a definite East Asian element, no doubt from slaves brought from Maritime Southeast Asia by the VOC. There's also a lot of variance in this particular sample of Cape Coloureds. Assuming this is representative I would offer that the main reason is that the Coloured population has historically had many people entering it from other groups, and many leaving for other groups.

Labels: Genetics, History

Sunday, June 28, 2009

From genome-wide association studies to molecular biology posted by p-ter @ 6/28/2009 05:10:00 PM

One of the rationales advanced for the identification of common alleles that confer modest risk to a disease via genome-wide association studies is that these associations will lead to biological insight into the disease. Two papers published today represent an important first step towards this goal for a variant associated with colorectal cancer.

Like many polymorphisms associated with complex diseases, the one investigated in these studies does not fall within a gene--this particular variant falls hundreds of thousands of bases away from the nearest gene. It does, however, fall within a non-coding element that is conserved across millions of years of evolution, suggesting that it is functional. These studies show that, indeed, the SNP falls in a binding site for a transcription factor, and that the two alleles have different binding affinities for that factor. Additionally, one of the studies shows that the genomic region containing the SNP loops over and makes physical contact with the nearest gene (MYC, a known oncogene), supporting the hypothesis that the SNP affects its regulation.

These studies raise more questions than they answer, of course. Neither of the studies finds an association between the SNP itself and steady-state MYC expression in cell lines. My guess is that, like many transcriptional enhancers, this element acts in a developmental-time-point-specific manner. An important direction now is to determine when that important time point is.

Labels: Genetics

Wednesday, June 24, 2009

Duffy and malaria in baboons? posted by p-ter @ 6/24/2009 07:33:00 PM

So after my whingeing about the quality of genetic associations found through candidate gene studies, it's only appropriate that I point to a fun candidate gene association study published this week in Nature.

The interesting point here is that the organism isn't humans, but rather baboons, and the phenotype is susceptibility to malaria. Briefly, the authors find that a SNP in the promoter of the Duffy locus (recall that a mutation that abolishes the expression of Duffy in humans leads to protection from Plasmodium vivax and is one of the best characterized instances of recent positive selection in our species) appears to lead to protection from a malaria-like disease in baboons. The authors seem to really, really want this polymorphism to also be under selection in baboons (to complete the parallel story to humans), but they can't bring themselves to say the evidence is anything more than "suggestive" (and to be honest, even that may be wishful thinking).

So is the association true? The study suffers from the same problem of candidate gene studies mentioned before, in that it's small and the evidence for an association is fairly weak. If I had to bet, I'd guess no, the association isn't real. But collecting and genotyping a large sample of baboons is simply not feasible at this point (if it ever will be), so this is what's possible, and it's a kind of fun, suggestive study that would be really cool if it ends up being true.

Labels: Genetics

Sunday, June 21, 2009

Why are most genetic associations found through candidate gene studies wrong? posted by p-ter @ 6/21/2009 03:47:00 PM

In a recent post, I made a blanket statement that the vast majority of candidate gene association studies published in psychiatric genetics (actually, in nearly all fields of genetics) are wrong. I'm not just being offhandedly dismissive--below, I outline the statistical argument behind that claim. This discussion is cribbed almost verbatim from a discussion of the issue by statisticians at the Wellcome Trust.

Let's assume that there are a finite number of loci in the genome, and we test some number of those for association with some phenotype of interest (in a genome-wide association study, this is on the order of 500K-1M; in a candidate gene study it's more likely in the tens. But the actual marker density is irrelevant for what follows). In general, the criterion used to decide if one has discovered a true association is the p-value, or the probability of seeing the data that you have given that there is no association. But that's not really the quantity you're interested in. The real quantity of interest is the probability that there's a true association given the data you see--the inverse of what's being reported.

By Bayes' Law, this probability depends on the prior probability of an association at that marker, the p-value threshold you've chosen to call a finding "significant", and crucially, the power you had to detect the association [1][2]. Thus, the interpretation of a given p-value depends on the power to detect an association, such that the lower your power, the lower the probability that a "significant" association is true [3].

That's where recent evidence from large genome-wide association studies comes into play. For nearly all diseases, reproducible associations have small effect size and are only detectable when one has sample sizes in the thousands or tens of thousands (for many psychiatric phenotypes, even studies with these sample sizes don't seem to find much). The vast majority of candidate gene association studies had sample sizes in the low hundreds, and thus had essentially zero power to detect the true associations. By the argument above, in this situation the probability that a "significant" association is real approaches zero. The problem with candidate gene association studies is not that they were only targeting candidate genes, per se, but rather that they tended to have small sample sizes and were woefully underpowered to detect true associations.

[1] Let D be the data, T be the event that an association is true, t be the event that an association is not true, and P(T) be the prior probability that an association is true.

P(T|D) = P(D|T) P(T) / [ P(D|T) P(T) + P(D|t) (1-P(T)) ]

P(D|T) is the power, and P(D|t) is the p-value threshold. Clearly, both are relevant here.

[2] http://jnci.oxfordjournals.org/cgi/content/full/96/6/434#FD1

[3] As the authors note,

A key point from both perspectives is that interpreting the strength of evidence in an association study depends on the likely number of true associations, and the power to detect them which, in turn, depends on effect sizes and sample size. In a less-well-powered study it would be necessary to adopt more stringent thresholds to control the false-positive rate. Thus, when comparing two studies for a particular disease, with a hit with the same MAF and P value for association, the likelihood that this is a true positive will in general be greater for the study that is better powered, typically the larger study. In practice, smaller studies often employ less stringent P-value thresholds, which is precisely the opposite of what should occur.
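As a rough illustration of the formula in [1], here is a minimal sketch that plugs in numbers. The function name and all of the values (prior, threshold, power) are hypothetical and chosen only to show the direction of the effect; P(D|t) is taken to be the significance threshold.

```python
def posterior_true_association(prior, power, alpha):
    """P(T|D) from footnote [1].

    prior: P(T), prior probability that the association is true
    power: P(D|T), probability of a "significant" result if it is true
    alpha: P(D|t), probability of a "significant" result if it is not
    """
    true_positive = power * prior
    false_positive = alpha * (1.0 - prior)
    return true_positive / (true_positive + false_positive)

# Hypothetical values: a small prior and the usual 0.05 cutoff.
prior = 1e-4
alpha = 0.05

# Same threshold, different power: the posterior falls as power falls.
for power in (0.9, 0.5, 0.01):
    posterior = posterior_true_association(prior, power, alpha)
    print(f"power={power:<5} P(T|D)={posterior:.2e}")
```

Holding the threshold fixed, the less power a study has, the less a "significant" result should move you.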

Labels: Genetics

Tuesday, June 16, 2009

Another candidate gene association bites the dust posted by p-ter @ 6/16/2009 09:29:00 PM

In 2003, Avshalom Caspi and colleagues published an influential article (Google Scholar lists it as having almost 2000 citations in 6 years) claiming that genetic variation in the serotonin transporter gene influences how people respond to traumatic events--in particular, in terms of risk of depression. For years, this has been the poster-child example of gene-environment interactions (for whatever reason, finding significant interaction terms like this is the Holy Grail of human genetics for some). Like the more recent dubious breastfeeding-IQ-genetics story (led by the same group, it should be noted), the authors identified a phenotype they wished to study (depression), an environmental factor that plays a role in the phenotype (traumatic events), genotyped a couple markers in a gene they thought might reasonably be expected to play a role in that phenotype (serotonin), and found a "statistically significant" interaction. Voila.

The 2003 article, as I noted, received quite a bit of attention. This led to attempts to replicate it, and this week, a comprehensive meta-analysis was published of those studies. The result: nothing. There is no evidence for an interaction between genotype at the serotonin transporter and trauma on risk of depression. And in retrospect, why should there be? The probability of happening on the proper combination of genotype and environmental exposure when sampling one environmental exposure (out of an infinite number) and a few gene markers (out of millions) is minuscule--the statistical burden of proof should be much higher than a simple p-value cutoff of 0.05.

These sorts of candidate gene association studies were/are used in all fields, but in my mind are lent the most credence in psychiatric genetics (the place where they should probably be given the least credence, IMO). This is just an additional cautionary tale: the vast majority of associations found through small candidate gene studies, even ones with functional work, plausibility, and the status of publication in a high-profile journal--MAOA and social problems, FADS and IQ (actually, any study published to date on IQ), NPY and stress--are likely wrong.

Labels: Genetics

Saturday, June 13, 2009

Mapping phenotypic variation in chickens posted by p-ter @ 6/13/2009 10:05:00 PM

PLoS Genetics has a nice paper identifying copy-number polymorphism in the transcription factor SOX5 as the causal mutation leading to the pea-comb phenotype (the bottom panels on the right) in chickens. The mutation leads to more widespread expression of the gene at a particular developmental time point, which presumably represses comb formation.

Labels: Genetics

Monday, June 08, 2009

Selection for tameness posted by Razib @ 6/08/2009 12:21:00 PM

Genetic Architecture of Tameness in a Rat Model of Animal Domestication:

A common feature of domestic animals is tameness-i.e., they tolerate and are unafraid of human presence and handling. To gain insight into the genetic basis of tameness and aggression, we studied an intercross between two lines of rats (Rattus norvegicus) selected over >60 generations for increased tameness and increased aggression against humans, respectively. We measured 45 traits, including tameness and aggression, anxiety-related traits, organ weights, and levels of serum components in >700 rats from an intercross population. Using 201 genetic markers, we identified two significant quantitative trait loci (QTL) for tameness. These loci overlap with QTL for adrenal gland weight and for anxiety-related traits and are part of a five-locus epistatic network influencing tameness. An additional QTL influences the occurrence of white coat spots, but shows no significant effect on tameness. The loci described here are important starting points for finding the genes that cause tameness in these rats and potentially in domestic animals in general.

Also see ScienceDaily.

Labels: Genetics, Population genetics

Friday, June 05, 2009

The evolution of Icelanders posted by Razib @ 6/05/2009 01:17:00 AM

Iceland has long been of some interest because of its peculiar demographic history and the genetic consequences of that history. So a new paper in PLoS Genetics is of interest, The Impact of Divergence Time on the Nature of Population Structure: An Example from Iceland:

The Icelandic population has been sampled in many disease association studies, providing a strong motivation to understand the structure of this population and its ramifications for disease gene mapping. Previous work using 40 microsatellites showed that the Icelandic population is relatively homogeneous, but exhibits subtle population structure that can bias disease association statistics. Here, we show that regional geographic ancestries of individuals from Iceland can be distinguished using 292,289 autosomal single-nucleotide polymorphisms (SNPs). We further show that subpopulation differences are due to genetic drift since the settlement of Iceland 1100 years ago, and not to varying contributions from different ancestral populations. A consequence of the recent origin of Icelandic population structure is that allele frequency differences follow a null distribution devoid of outliers, so that the risk of false positive associations due to stratification is minimal. Our results highlight an important distinction between population differences attributable to recent drift and those arising from more ancient divergence, which has implications both for association studies and for efforts to detect natural selection using population differentiation.

Figure 3 is a PCA map which shows how individuals from different regions of Iceland sort out. The Scottish and Norwegian populations are there too, and they don't vary much along the components of variation which Icelanders sort out along, the conclusion being that the Iceland variation isn't due to different ancestral proportions. They further calculate that if the ancestral Iceland populations were like the modern Scottish and Norwegian ones, Icelanders are ~35% Scottish and ~65% Norwegian. Most of the differences between Icelanders and continental Europeans are no doubt due to drift because of their very small population size, no migration due to their isolation, and a few specific bottleneck events. But a section on natural selection in Icelanders is interesting:

We found eight SNPs, representing two chromosomal regions, for which the evidence of unusual population differentiation was genomewide-significant...Six of the SNPs lie in or near the TLR (toll-like receptor) genes TLR10 and TLR1, while the other two lie inside the NADSYN1 (NAD synthetase 1) gene....

Toll-like receptors were pinpointed in a recent paper as likely possibilities for localized adaptation.

Labels: Genetics, Population genetics

Thursday, June 04, 2009

50 Genetics Ideas You Really Need to Know posted by Razib @ 6/04/2009 12:59:00 PM

Dan MacArthur reviews 50 Genetics Ideas You Really Need to Know. Gives it "3.5 nucleotides out of 4."

Labels: Genetics

Tuesday, June 02, 2009

Earwax and breast cancer posted by Razib @ 6/02/2009 01:19:00 AM

In light of p-ter's post on KITLG and cancer risk, I stumbled onto this today, Earwax, osmidrosis, and breast cancer: why does one SNP (538G>A) in the human ABC transporter ABCC11 gene determine earwax type?:

One single-nucleotide polymorphism (SNP), 538G>A (Gly180Arg), in the ABCC11 gene determines the type of earwax. The G/G and G/A genotypes correspond to the wet type of earwax, whereas A/A corresponds to the dry type. Wide ethnic differences exist in the frequencies of those alleles, reflecting global migratory waves of the ancestors of humankind. We herein provide the evidence that this genetic polymorphism has an effect on the N-linked glycosylation of ABCC11, intracellular sorting, and proteasomal degradation of the variant protein. Immunohistochemical studies with cerumen gland-containing tissue specimens revealed that the ABCC11 WT protein was localized in intracellular granules and large vacuoles, as well as at the luminal membrane of secretory cells in the cerumen gland, whereas granular or vacuolar localization was not detected for the SNP (Arg180) variant. This SNP variant lacking N-linked glycosylation is recognized as a misfolded protein in the endoplasmic reticulum and readily undergoes ubiquitination and proteasomal degradation, which determines the dry type of earwax as a mendelian trait with a recessive phenotype. For rapid genetic diagnosis of axillary osmidrosis and potential risk of breast cancer, we developed specific primers for the SmartAmp method that enabled us to clinically genotype the ABCC11 gene within 30 min

I blogged a paper on this SNP relating it to earwax form a few years ago. Also see ScienceDaily. The variation in earwax seems to conform pretty closely to that of EDAR.

Labels: Genetics, Population genetics

Wednesday, May 27, 2009

Sex differences and variation in personality posted by Razib @ 5/27/2009 05:21:00 PM

Look before you leap: Are women pre-disposed to be more risk averse than male adventurers?:

"It's not at all that women are risk averse," says Jody Radtke, program director for the Women's Wilderness Institute in Boulder, Colorado. When men are confronted with challenging situations, they typically produce adrenaline, which is what causes them to run around, hollering like frat boys at a kegger. An adrenaline rush is a good feeling, but when confronted with the same situation, women produce a different chemical, called acetylcholine.

"Pretty much what (acetylcholine) does is it makes you want to vomit," says Jody.

Because women don't have the same positive chemical reward, they tend to be less pumped about confronting stressful situations. This leads them to rely on decision-making. Essentially, they want the whole picture before they go diving in.

Research, Jody says, shows women have more cross-networking between the two hemispheres of the brain, which subconsciously allows them to evaluate different sensory cues, facts and emotions when making decisions. The cause of this difference probably lies somewhere in the debate of nature versus nurture and the history of evolution.

Marvin Zuckerman, professor emeritus at the University of Delaware, has studied risk for decades. He found men are typically more likely to take risks when seeking novel or exciting sensations, and that comes from both genetics and environment.

"What's important seems to be the environment that isn't shared by siblings in the same family," he says.

The above was originally published by Women's Adventure Magazine. The last reference is to the repeated finding that non-shared environment matters a great deal but isn't well accounted for. Obviously both men and women vary in terms of psychological attributes, and there have been plenty of attempts to adduce the variation to different quantities of neurochemicals (the "chemical soup" model is easy to translate into prose).

The content of the piece isn't too surprising, you see it all the time. Suggesting innate differences between men and women is totally acceptable so long as it is perceived to be neutral, or, better yet, casts women in a positive light. Michael Lewis' recent article on the Icelandic financial turmoil hints at sex differences and male psychology as a root problem. He presented a rather conventional stereotype of men as financial cowboys willing to take outsized risks for reward, while women were risk averse socialists. During the run up to the Iraq War and afterward I recall many people, mostly but not always women, calling into Leftish radio shows promoting a sex determinist theory that war was the result of the male nature, and the fact that men are heads of state of most nations was the ultimate problem (this argument crops up in science fiction as well).

The interesting point to me is that the sort of articles which highlight "different ways of thinking" between the sexes and how they might be rooted in biological differences have implications which point in different directions in terms of positive or negative valuation depending on your perspective and circumstance. As a specific example, the risk taking predispositions of many males can be seen to be folly and lack of prudence, but risk often entails both an upside and a downside. Decisions which may seem foolish and wrongheaded viewed through a conventional mainstream lens are often lauded in hindsight as visionary. Unfortunately the nature of uncertainty is such that one has little idea which risks will pay off and which will simply extract a downside cost. It is likely that human societies dominated by those who are only risk averse, or those who are only risk accepting, would not be those which we would truly wish to live in. Variation in human personalities is probably beneficial in an aggregate sense when it comes to human progress. There are downside risks to both the risk averse and risk accepting strategies, so it is probably best to have some of both. In an economic scenario what I'm talking about is straightforward; consider two individuals with degrees in computer science, one who goes to work for IBM and another who founds a start-up. You wouldn't want everyone to aspire to become a corporate employee; where would the innovation which drives productivity growth come from? On the other hand, there are only so many start-ups which succeed and there is a need for individuals who work in less sexy sectors who service older established technologies which are at the heart of the current economy. In other words, you want to be able to squeeze more juice from the oranges you have, as well as fund research which might result in the discovery of juicier varietals.

Addendum: Obviously what I'm saying here isn't too novel. It's rooted in human nature itself: our minds are cobbled together from disparate competencies and subfunctions, and our unitary consciousness is a delusion very successfully promoted by the prefrontal cortex. But even when it comes to concepts and assumptions which are the purview of the prefrontal cortex its priority isn't usually to keep its story straight. Rather, it seems geared toward generating ad hoc narratives which are only proximately consistent. Yes it can engage in rationality, but most of the time its forte is rationalization. And why not? Rationalizing the contradictory feels good! It was almost certainly highly adaptive in the past, and likely is today, in terms of keeping everyone in the group on the same page.

Labels: Genetics, human biodiversity, sex differences

Thursday, May 21, 2009

The incentives for finding "genes for".... posted by Razib @ 5/21/2009 07:30:00 PM

Genes, Brains and the Perils of Publication:

I have no wish to criticize these findings as such. But the way in which this paper is written is striking. The negative results are passed over as quickly as possible. This despite the fact that they are very clear and easy to interpret - the rs1344706 variant has no effect on cognitive task performance or neural activation. It is not a cognition gene, at least not in healthy volunteers.

By contrast, the genetic association with connectivity is modest (see the graphs above - there is a lot of overlap), and very difficult to interpret, since it is clearly not associated with any kind of actual differences in behaviour.

And yet this positive result got the experiment published in no less a journal than Science! The negative results alone would have struggled to get accepted anywhere, and would probably have ended up either unpublished, or published in some rubbish minor journal and never read. It's no wonder the authors decided to write their paper in the way they did. They were just doing the smart thing. And they are perfectly respectable scientists - Andreas Meyer-Lindenberg, the senior author, has done some excellent work in this and other fields.

Labels: Genetics

Tonal languages, perfect pitch, and ethnicity posted by Razib @ 5/21/2009 01:54:00 AM

Tone Language Is Key To Perfect Pitch:

In a study published in the Journal of the Acoustical Society of America and being presented at the ASA meeting in Portland on May 21, Deutsch and her coauthors find that musicians who speak an East Asian tone language fluently are much more likely to have perfect pitch.

...

In 2004, she found that students at the Central Conservatory of Music in Beijing, China, all of whom spoke Mandarin, were almost nine times more likely to have perfect pitch than students at the Eastman School of Music in New York. That last study, however, left open the question of whether perfect pitch might be a genetic trait - since all the Mandarin speakers were East Asian.

...

The present study looked at 203 students at the University of Southern California's Thornton School of Music, all of whom agreed to take the test in class (so there was no self-selection in the sample). The students listened to the 36 notes that haphazardly spanned three octaves. They attempted to identify the notes, and they self-reported their musical, ethnic and linguistic backgrounds - including whether they were very fluent in an East Asian tone language, fairly fluent or not at all fluent. Deutsch and her colleagues found that students who spoke an East Asian tone language very fluently scored nearly 100 percent on the test, and that students who were only fairly fluent in a tone language scored lower overall. Those students - either Caucasian or East Asian - who were not at all fluent in speaking a tone language scored the worst on average.

The abstract makes it a bit clearer that East Asians who do not speak a tonal language are no better than Europeans.

Labels: Genetics

Wednesday, May 20, 2009

OXTR & prosociality posted by Razib @ 5/20/2009 11:19:00 PM

The Oxytocin Receptor (OXTR) Contributes to Prosocial Fund Allocations in the Dictator Game and the Social Value Orientations Task:

The demonstration that genetic polymorphisms for the OXTR are associated with human prosocial decision making converges with a large body of animal research showing that oxytocin is an important social hormone across vertebrates including Homo sapiens. Individual differences in prosocial behavior have been shown by twin studies to have a substantial genetic basis and the current investigation demonstrates that common variants in the oxytocin receptor gene, an important element of mammalian social circuitry, underlie such individual differences.

Here's a figure from the paper:

And the SNPs from the HGDP (G = C & A = T for the first SNP, or at least the paper and PubMed agree on this):

Related: It's hard out here for a vole. Heritability of the Ultimatum Game. Altruism and Risk-Taking: Kinda Heritable. Can someone put the psychic unity of mankind out of its misery? DRD4, politics & friendship.

Labels: Behavior Genetics, Behavioral Economics, Genetics

Wednesday, May 13, 2009

Myriad Genetics sued over BRCA testing posted by p-ter @ 5/13/2009 07:50:00 PM

Many readers have probably heard that the ACLU has sued Myriad Genetics for its patent on genetic testing on BRCA1/2 (these genes account overall for a small fraction of breast cancer cases, but for many of the strongly inherited cases).

Many companies hold gene patents, so why sue Myriad? The answer is simple: in the battle of public opinion, there's no way Myriad can come out of this looking good. A bit of recent history: the BRCA1 gene was famously mapped by a group led by Mary-Claire King, currently at the University of Washington. That group, however, narrowed down the location of the gene only to a relatively large region, and the gene itself was finally isolated months later and patented by Myriad (BRCA2 came later). Myriad did the obvious--they designed a test for a series of mutations in the genes and began to market it. However, the series of mutations they test is not the whole story--other mutations, untested by Myriad, can cause the disease as well. Other labs would be happy to market tests for these mutations, except, of course, that Myriad refuses to license its patent, preferring instead to hold onto its monopoly on the gene. The result: families would like to be tested for rare mutations in BRCA1, but it is illegal for any other company to sell them such a test. It's not for nothing that Myriad is considered among the most hated diagnostics companies.

Myriad is probably within their legal rights, but when cases like this get publicity, laws have the funny tendency to change.

Labels: Genetics

Monday, May 11, 2009

Ancestry of Mexican Mestizos by region posted by Razib @ 5/11/2009 10:21:00 PM

Analysis of genomic diversity in Mexican Mestizo populations to develop genomic medicine in Mexico. The title says it all, so I won't post the abstract. The article is OA, so you can read the whole thing, but I thought this figure from the supplements was pretty informative:

Sonora is exactly where you would expect Mestizos to be the most European, while Guerrero on the coast has more African ancestry. See the paper for other Mexican provinces. The use of a Northwest European population is of course somewhat imperfect as the white ancestry of Mestizos is Iberian (though European populations really are not very differentiated in the worldwide context). Additionally, the Zapotecs would be an imperfect representative of the genetic variation of all the Amerindians of Mexico (some of whom are likely to have emigrated from the American Southwest relatively recently).

Labels: Genetics, Population genetics

Sunday, May 03, 2009

Nurture on nature's leash posted by Razib @ 5/03/2009 01:53:00 PM

This is fascinating. De novo establishment of wild-type song culture in the zebra finch. See ScienceDaily, Birds Raised In Complete Isolation Evolve 'Normal' Species Song Over Generations.

Labels: Ethology, Genetics

Monday, April 27, 2009

Bone mutants and recent selection posted by p-ter @ 4/27/2009 08:15:00 PM

The New York Times has an interesting little piece on bones, including a description of the unsettling genetic disorder fibrodysplasia ossificans progressiva:

When Harry Eastlack was 5 years old, he broke his left leg while out playing with his sister. The fracture failed to set properly, and soon his hip and knee had stiffened up as well. Examining the boy, doctors found ominous bony growths on the muscles of his thigh. Within a few years, bony deposits had spread throughout Harry's body, infiltrating his chest, neck, back and buttocks. Surgeons tried to cut the excess bone away, only to watch it grow back thicker and more invasive than before.

By his mid-20s, his vertebrae had fused together, his torso been thrust rigidly forward and his back muscles replaced with solid bone. Finally, even his jaw locked up, and he died of pneumonia in 1973, just shy of his 40th birthday.

Fun fact: the gene that causes this disease is ACVR1, which lies in a region of extended haplotype homozygosity and extreme population differentiation suggestive of recent positive selection in non-African populations.

Labels: Genetics, Population genetics

Friday, April 24, 2009

Horse coat color variation and domestication posted by Razib @ 4/24/2009 01:30:00 AM

Coat Color Variation at the Beginning of Horse Domestication:

The transformation of wild animals into domestic ones available for human nutrition was a key prerequisite for modern human societies. However, no other domestic species has had such a substantial impact on the warfare, transportation, and communication capabilities of human societies as the horse. Here, we show that the analysis of ancient DNA targeting nuclear genes responsible for coat coloration allows us to shed light on the timing and place of horse domestication. We conclude that it is unlikely that horse domestication substantially predates the occurrence of coat color variation, which was found to begin around the third millennium before the common era.

Also see ScienceDaily.

Related: Horse genetics & color, White horses and blonde humans: a genetic connection?, The Horse, the Wheel, and Language: How Bronze-Age Riders from the Eurasian Steppes Shaped the Modern World and Earliest domestication of horse?

Labels: Genetics, History

Sunday, April 19, 2009

An argument for searching for rare variants in human disease posted by p-ter @ 4/19/2009 03:04:00 PM

Based on the comments on my previous post, I'm going to lay out an argument which I find reasonable for sequencing studies in human disease:

Let's follow Goldstein's back-of-the-envelope calculations: assume there are ~100K polymorphisms (assuming Goldstein isn't making the mistake I attribute to him, this includes polymorphisms both common and rare) that contribute to human height, that we've found the ones that account for the largest fractions of the variance, and that these fractions of variance follow an exponential distribution.

Now, assume you have assembled a cohort of 5000 individuals and done a genome-wide association study using common SNPs. You find some interesting things, but you want more. Now, you have two choices: sequence those 5000 individuals to look for rarer variation, or increase sample size to 20,000 and perform another association study using the same set of common polymorphisms.

As Daniel Macarthur points out, you've not yet sucked every drop of marrow out of those 5000 individuals: there are presumably some (many?) rarer SNPs that have modest effect sizes (in sense 2 from this post), and thus account for measurable (though still small) fractions of the variance in your trait. Those are low-hanging fruit for you to find if you pony up the cash for some sequencing (the price of which keeps dropping). This is especially true if there are more rare variants than common ones that influence the trait, as is likely the case (there's more rare variation than common variation overall). So instead of spending on scaling up your sample size, spend on sequencing, and have impact now.

Is this along the lines of the argument Goldstein is making? I don't really think so, but welcome comment. In any case, the choice above is somewhat arbitrary--if you want to look for very rare variation, you need a sample size larger than 5000 anyways, and if you're sequencing, you're obviously not just going to look at the rare variants since the common ones come along for free.

Labels: Genetics

Friday, April 17, 2009

Notes on the Common Disease-Common Variant debate: two years later posted by p-ter @ 4/17/2009 08:07:00 PM

Just over two years ago, I wrote a brief post explaining why I find the "debate" about common variants versus rare variants in human medical genetics to be largely unhelpful. I concluded thusly, after explaining some of the rationale for looking for common variants that affect disease susceptibility:

So am I then arguing in favor of the CDCV [Common Disease-Common Variant] hypothesis? Of course not-- rare variants, aside from being predictive for disease in some individuals, also give important insight into the biology of the disease. But it is possible right now, using genome-wide SNP arrays and databases like the HapMap, to search the entire genome for common variants that contribute to disease. This is an essential step--finding the alleles that contribute disproportionately to the population-level risk for a disease. Eventually, the cost of sequencing will drop to a point where rare variants can also be assayed on a genome-wide, high-throughput scale, but that's not the case yet. Once it is, expect the CDRV [Common Disease-Rare Variant] hypothesis to be trumpeted as right all along.

Well, two years later, the price of sequencing has dropped precipitously. And in this week's New England Journal of Medicine, David Goldstein makes the argument that association studies using common variants have been disappointing and what people really need to be doing is--would you believe it?--searching for rare variants using sequencing.

Your opinion about the current crop of genome-wide association studies depends, of course, on what you were expecting to begin with: if you thought that a few common variants would be discovered for each common disease and fully explain its prevalence, you're likely to think the whole enterprise has been a bust (with a few exceptions, of course--Goldstein mentions exfoliation glaucoma and macular degeneration). If, on the other hand, you thought that genome-wide association studies would have about as much success as the linkage and candidate gene studies that preceded them (Daniel Macarthur characterized the field as a "scientific wasteland" prior to 2005, and that's only mild hyperbole), you're probably astounded by their success.

In any case, the objections to large association studies are/have been numerous, but Goldstein has come up with the most bizarre one yet--that large association studies using common variants might find too many things! The premise is this (and let's take a non-disease trait like height as an example): current association studies have identified many loci of small effect that influence human height. Together, these loci account for ~3% of the population variation in height. Assuming these are the largest effect sizes out there to find, and an exponential distribution on effect sizes (both probably approximately fair assumptions), then a massive number of loci influence height, potentially genes across the entire genome. Thus, "[i]f common variants are responsible for most genetic components of type 2 diabetes, height, and similar traits, then genetics will provide relatively little guidance about the biology of these conditions, because most genes are 'height genes' or 'type 2 diabetes genes.'"

The solution to this problem, Goldstein claims, is to look for rare variants that (he presumes) have larger effects. This claim, though it appears reasonable, is a non sequitur. The reason why is that Goldstein is conflating two definitions of effect size. In definition one, effect size is defined as the proportion of variance in a trait explained by a polymorphism. In definition two, effect size is defined as the difference in mean trait value between two genotype classes. Why is this a problem? Because the proportion of variance in a trait explained by a polymorphism is a function both of its frequency and the impact it has on the trait [1]. To re-use a previous example, imagine smoking cigarettes gives you a 5% chance of developing lung cancer, while working in an asbestos factory gives you a 70% chance. In sense one, smoking has a larger effect size--since so many more people smoke than work in asbestos factories, the number of lung cancer cases due to smoking is much higher than the number due to asbestos. However, under definition two, working in an asbestos factory has the larger effect size--the probability of developing the disease is much higher. Thus, though a rare polymorphism might have a large effect (in sense 2), it will explain a tiny amount of the variance in the trait simply due to the fact it is rare [2].

The contention that the number of loci needed to explain the heritability of a trait will somehow be smaller if one looks at rare variation is simply false.
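To make the smoking/asbestos contrast concrete, here is a rough sketch. Only the two risk figures (5% and 70%) come from the example above; the exposure frequencies and the population size are numbers I made up for illustration, and baseline risk in unexposed people is ignored.

```python
# Only the two risk figures come from the post; everything else is assumed.
POPULATION = 1_000_000  # hypothetical population size

exposures = {
    # name: (fraction of population exposed [assumed], risk if exposed [from the post])
    "smoking": (0.20, 0.05),
    "asbestos factory": (0.001, 0.70),
}


def expected_cases(fraction_exposed, risk, population=POPULATION):
    """Expected cases among the exposed: the population-level impact (sense one)."""
    return fraction_exposed * risk * population


for name, (fraction, risk) in exposures.items():
    cases = expected_cases(fraction, risk)
    print(f"{name}: risk if exposed = {risk:.0%}, expected cases = {cases:,.0f}")
```

Under these assumed frequencies the rarer exposure has by far the larger per-person risk (sense two) but accounts for far fewer cases in the population (sense one).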

[1] Assuming additivity, the variance explained by a locus is 2p(1-p)a^2, where p is the allele frequency and a is half the difference between the means of each homozygote. See Figure 4.8 of Lynch and Walsh.

[2] For example, let's use the equation in [1] and assume a polymorphism has a frequency of 0.001%. Then, in order for this polymorphism to account for 0.05% of the variation in height (on the small end of the proportions accounted by common polymorphisms identified to date), a single allele would have to increase height by a whopping 5 standard deviations.
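The arithmetic in [2] can be checked with a short sketch of the formula in [1]. It assumes the trait is standardized to unit variance, so that a is measured in standard deviations; the function names and the common-allele frequency of 0.3 are my own choices for comparison, not from any paper.

```python
import math


def variance_explained(p, a):
    """Additive variance from one biallelic locus: 2p(1-p)a^2.

    p: allele frequency; a: half the difference between homozygote means,
    in trait standard deviations (trait variance assumed to be 1).
    """
    return 2 * p * (1 - p) * a ** 2


def effect_needed(p, target_variance):
    """Invert the formula: the a (in SDs) needed to explain target_variance."""
    return math.sqrt(target_variance / (2 * p * (1 - p)))


# 0.001% allele frequency, 0.05% of the variance (the numbers in [2])
rare = effect_needed(p=0.00001, target_variance=0.0005)
# Hypothetical common allele explaining the same fraction of variance
common = effect_needed(p=0.3, target_variance=0.0005)

print(f"rare allele:   a = {rare:.2f} SD")
print(f"common allele: a = {common:.3f} SD")
```

The rare allele needs an effect of about 5 SD, as stated in [2], while the assumed common allele gets to the same fraction of variance with an effect of a few hundredths of an SD.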

Labels: Genetics

Wednesday, April 15, 2009

Personal genomics & NEJM posted by Razib @ 4/15/2009 06:24:00 PM

Multiple articles on personal genomics in The New England Journal of Medicine: Genomewide Association Studies and Human Disease, Common Genetic Variation and Human Traits, Genomewide Association Studies - Illuminating Biologic Pathways, and Genetic Risk Prediction - Are We There Yet?. Nick Wade has a piece on these articles in The New York Times.

Related: Preparing doctors for the genomic tsunami, Linkage versus association: a mini-primer, A note on the Common Disease-Common Variant debate, Common disease, common variant and Common disease, common variant.

Labels: Genetics, Genomics

Saturday, April 11, 2009

Genetics of domestication posted by p-ter @ 4/11/2009 07:37:00 PM

Most readers are likely familiar with the classic "taming of the fox" experiment started by Dmitri Belyaev--starting with a wild silver fox, the group was able to quickly breed both a tame and hyper-aggressive line of animals. I was unaware that, concurrently with this experiment, that same group was also performing the same experiment on rats.

Just published are the first steps in a search for the precise genetic changes underlying the differences between a tame and aggressive line of rats, separated by only 60 generations of breeding. The basic result is that they are able to identify two regions of the genome that very likely carry variation affecting tameness. They are unable to identify particular genes due to the resolution of the study, but one can only assume they're in the process of following this up (since these two lines are only separated by 60 generations, one easy way to search for a causal polymorphism might be to just sequence the two lines--there are likely very few differences between them in the candidate regions).

It has also been noticed, in both the fox and rat experiments, that changes in tameness were associated with changes in pigmentation--in the rat case, the presence of a white patch of fur. This study design allows the authors to determine whether the loci influencing pigmentation are also involved in tameness. In this case, they're not.

Labels: Genetics

New model organisms posted by p-ter @ 4/11/2009 08:28:00 AM

Nature has a nice piece on the uses of alternative model organisms in various parts of biology. The focus is on the medical applications of these models, which I suppose is due to issues with funding. But the real message is that with novel genomics applications (mainly high-throughput sequencing), understanding the genetics of a wide variety of nature's bizarre creatures is possible. Sure, understanding how the Antarctic icefish adapted to sub-zero temperatures might in theory help understanding of some human disease, but let's be honest--that's worth studying just because it's really cool.

Labels: Genetics

Thursday, April 09, 2009

African Pygmies & their origins posted by Razib @ 4/09/2009 05:01:00 PM

There was some talk about Pygmies on the post about Jerry Coyne's weblog. PLoS Genetics has a new paper up on the topic of Pygmy origins and their relationship to non-Pygmy populations. I've blogged it over at ScienceBlogs.

Labels: Evolution, Genetics, Population genetics, Pygmies

Tuesday, March 31, 2009

Virginity & heritability posted by Razib @ 3/31/2009 11:43:00 PM

Genes may time loss of virginity:

As genetic determinism goes, the new findings are modest. Segal's team found that genes explain a third of the differences in participants' age at first intercourse - which was, on average, a little over 19 years old. By comparison, roughly 80% of variations in height across a population can be explained by genes alone.
...
On the other hand, conservative social mores might delay a teen's first sexual experience, causing scientists to low-ball the effect of genes. Indeed, Segal's team noticed a less pronounced genetic effect among twins born before 1948, compared with those who came of age in the 1960s or later.
...
As for the specific genes involved, another team previously found that a version of a gene encoding a receptor for the neurotransmitter dopamine is associated with age at first intercourse. Others have linked the same version of the gene - called DRD4 - to impulsive, risk-taking behaviour.

The paper is Age at first intercourse in twins reared apart: Genetic influence and life history events.

FuturePundit notes:

The team found a weaker effect from genes with people born before 1948. This supports an argument I've made here previously: the breakdown of old cultural constraints on behavior frees up people to follow genetically driven desires and impulses. We become more genetically driven as external constraints weaken.

When you remove the strength of environmental parameters from the equation it naturally results in a greater salience of heritable ones. Ergo, the logic whereby you can make the case that in a perfect meritocracy there will be much stronger genetic sorting by class (via assortative mating, etc.).
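A toy sketch of that logic, assuming the simplest possible model in which trait variance is just genetic variance plus environmental variance (no gene-environment covariance or interaction). The variance values are arbitrary illustrative units, not estimates from the twin study.

```python
def heritability(genetic_variance, environmental_variance):
    """Heritability under a simple model: V_G / (V_G + V_E)."""
    return genetic_variance / (genetic_variance + environmental_variance)


V_G = 1.0  # genetic variance held fixed (hypothetical units)

# Shrink the environmental variance and watch heritability rise
for V_E in (4.0, 2.0, 1.0, 0.25):
    print(f"V_E = {V_E:>4}: h^2 = {heritability(V_G, V_E):.2f}")
```

Nothing about the genes changes across the loop; heritability goes up only because the environmental term in the denominator shrinks.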

Related: DRD4 and virginity.

Labels: Genetics, Personality

Monday, March 30, 2009

Connections between Mendelian diseases and natural variation posted by p-ter @ 3/30/2009 07:49:00 PM

I've written before about a pattern emerging from genome-wide association studies--genes in which mutations cause rare extreme forms of a phenotype often harbor common variation that influences natural, non-disease variation in that same phenotype. A pair of new studies on variation in cardiac repolarization (summarized here) provide an additional example of this pattern.

It's worth noting that this was something of an obvious hypothesis--candidate gene association studies often targeted genes known to cause Mendelian disorders when mutated. In retrospect, the reason these studies were often inconclusive was a simple lack of power.

Labels: Genetics

Tuesday, March 24, 2009

Signals of recent positive selection in a worldwide sample of human populations...again, sort of posted by Razib @ 3/24/2009 10:46:00 AM

New paper in Genome Research, Signals of recent positive selection in a worldwide sample of human populations:

Genome-wide scans for recent positive selection in humans have yielded insight into the mechanisms underlying the extensive phenotypic diversity in our species, but have focused on a limited number of populations. Here, we present an analysis of recent selection in a global sample of 53 populations, using genotype data from the Human Genome Diversity-CEPH Panel. We refine the geographic distributions of known selective sweeps, and find extensive overlap between these distributions for populations in the same continental region but limited overlap between populations outside these groupings. We present several examples of previously unrecognized candidate targets of selection, including signals at a number of genes in the NRG-ERBB4 developmental pathway in non-African populations. Analysis of recently identified genes involved in complex diseases suggests that there has been selection on loci involved in susceptibility to type II diabetes. Finally, we search for local adaptation between geographically close populations, and highlight several examples.

I've blogged it at ScienceBlogs, and so has Genetic Future, and John Hawks offers a response. Though there are so many references to the Supplements, which aren't online, I feel like there's one more course remaining....

Labels: Genetics, Population genetics

Wednesday, March 18, 2009

Pigmentation variation in humans posted by p-ter @ 3/18/2009 07:21:00 PM

It was only six short years ago that Greg Barsh wrote an "unsolved mystery" review in PLoS Biology asking, "What Controls Variation in Human Skin Color?"

A recent review provides a nice summary of the developments since then--in short, pigmentation is now probably one of the best understood (at a genetic level) phenotypes in humans. A pretty impressive story.

Labels: Genetics

Tuesday, March 17, 2009

Inbreeding over time posted by p-ter @ 3/17/2009 05:56:00 PM

A number of people have commented on a recent paper showing an increase in heterozygosity in human populations over time, presumably due to increased outbreeding (though Dienekes suggests some of this effect may be due to more homozygous individuals living longer, my feeling is that the results associating homozygosity and lifespan are more likely to be artifacts due to increased outbreeding over time, rather than vice versa).

This is an interesting result, and seems plausible, but the figure in the paper is difficult to judge--I wondered why the authors chose not to show their actual data, but rather only the fitted regression line. The answer is that the data itself looks much less impressive than the pretty lines in the main text (see right).

This isn't to say that the result isn't correct (I assume the authors made sure their results are robust to the few outliers in that plot), but the relationship between homozygosity and time is certainly more noisy than implied by the figure.

Labels: Genetics

Tuesday, March 10, 2009

Blue-eyed lemurs: not HERC2 posted by p-ter @ 3/10/2009 07:51:00 PM

The genetics of blue eye color in humans is almost entirely controlled by a single SNP in a conserved non-coding region in an intron of HERC2, as was strikingly demonstrated in a recent study on using genetics to predict eye pigmentation.

Humans are not the only primate to have blue eyes--one notable example is the blue-eyed black lemur (pictured on the right). As it's well-known that convergent evolution in pigmentation has occurred in many taxa via similar genetic mechanisms (eg. MC1R), one obvious question is: have similar genetic changes led to blue eyes in humans and other primates? For blue-eyed lemurs, a new study demonstrates that, well, the answer is no. The authors sequence the region known to be causal for human blue eyes in both blue-eyed black lemurs and a closely-related, non-blue-eyed species, and find no differences.

Though this is a negative result, it's still kind of fun, and establishes a nice example of convergent evolution via separate genetic mechanisms in primates.

Labels: Evolution, Genetics

COMT & Fear posted by Razib @ 3/10/2009 07:17:00 PM

Genetic Gating of Human Fear Learning and Extinction: Possible Implications for Gene-Environment Interaction in Anxiety Disorder:

Pavlovian fear conditioning is a widely used model of the acquisition and extinction of fear. Neural findings suggest that the amygdala is the core structure for fear acquisition, whereas prefrontal cortical areas are given pivotal roles in fear extinction. Forty-eight volunteers participated in a fear-conditioning experiment, which used fear potentiation of the startle reflex as the primary measure to investigate the effect of two genetic polymorphisms (5-HTTLPR and COMTval158met) on conditioning and extinction of fear. The 5-HTTLPR polymorphism, located in the serotonin transporter gene, is associated with amygdala reactivity and neuroticism, whereas the COMTval158met polymorphism, which is located in the gene coding for catechol-O-methyltransferase (COMT), a dopamine-degrading enzyme, affects prefrontal executive functions. Our results show that only carriers of the 5-HTTLPR s allele exhibited conditioned startle potentiation, whereas carriers of the COMT met/met genotype failed to extinguish conditioned fear. These results may have interesting implications for understanding gene-environment interactions in the development and treatment of anxiety disorders.

Also see ScienceDaily. Here's the COMT SNP in SNPedia. Also, here it is in the HGDP browser. A/A is low activity variant.

Related: Other posts on COMT.

Labels: Behavior Genetics, Genetics

Older father = duller child? posted by Razib @ 3/10/2009 01:08:00 AM

Advanced Paternal Age Is Associated with Impaired Neurocognitive Outcomes during Infancy and Childhood. I blogged it at ScienceBlogs.

Labels: Genetics

Thursday, March 05, 2009

Finding rare variants involved in disease posted by p-ter @ 3/05/2009 08:01:00 PM

It has been noticed in some diseases that common variants which lead to modest increases in risk are located in or near genes that also, when mutated, cause severe monogenic forms of the same disease (eg. obesity). This naturally leads to the hypothesis that newly identified genes containing modest risk alleles will also contain rarer alleles of strong effect.

A new study tests this hypothesis in type I diabetes: the authors take 10 genes known to be involved in diabetes etiology (note that many of these genes were discovered by genome-wide association studies of common variants) and re-sequence them in a large set of cases and controls.

What do they find? As hypothesized, a number of rare protein-altering changes in one of the genes (IFIH1, a gene involved in response to viral infection) end up being strongly associated with type I diabetes. The effect sizes aren't massive (the risk alleles have odds ratios around 2), but they certainly have stronger effects than the common variants identified (though because of their low frequencies, they explain only a minimal fraction of all the variance in diabetes risk).

This is only a proof-of-principle-- expect many similar studies, including full exome re-sequencing, in the years to come.

Labels: Genetics

Earliest domestication of horse? posted by Razib @ 3/05/2009 04:02:00 PM

Via Dienekes, The Earliest Horse Harnessing and Milking:

Horse domestication revolutionized transport, communications, and warfare in prehistory, yet the identification of early domestication processes has been problematic. Here, we present three independent lines of evidence demonstrating domestication in the Eneolithic Botai Culture of Kazakhstan, dating to about 3500 B.C.E. Metrical analysis of horse metacarpals shows that Botai horses resemble Bronze Age domestic horses rather than Paleolithic wild horses from the same region. Pathological characteristics indicate that some Botai horses were bridled, perhaps ridden. Organic residue analysis, using δ13C and δD values of fatty acids, reveals processing of mare's milk and carcass products in ceramics, indicating a developed domestic economy encompassing secondary products.

Related: The Horse, the Wheel, and Language: How Bronze-Age Riders from the Eurasian Steppes Shaped the Modern World and lactase persistence.

Labels: Genetics, History

Sunday, March 01, 2009

Against intuition posted by Razib @ 3/01/2009 01:49:00 PM

John Hawks points to an article by James F. Crow, Mayr, mathematics and the study of evolution. As John stated this is Open Access assuming you take the time to register. Here is a taste:

In 1959 Ernst Mayr...flung down the gauntlet...at the feet of the three great population geneticists RA Fisher, Sewall Wright and JBS Haldane..."But what, precisely," he said, "has been the contribution of this mathematical school to the evolutionary theory, if I may be permitted to ask such a provocative question?" His skepticism arose in part from the fact that the mathematical theory at the time had little to say about speciation, Mayr's major interest. But his criticism was more broadly addressed to the utility of the entire approach. A particular focus was the simplification that he called "beanbag genetics", in which "Evolutionary genetics was essentially presented as an input or output of genes, as the adding of certain beans to a beanbag and the withdrawing of others."
...
Recent mathematical work has gone well beyond that of the three pioneers. Partly this is due to skilled mathematicians entering the field and bringing new techniques with them; especially noteworthy are stochastic processes. Second, and perhaps more important, is the extensive use of computers. Often you can use a computer to get by without deep mathematical knowledge. An additional influence is the explosive growth of molecular data, which lend themselves to mathematical treatment. In the first half of the twentieth century, population genetics and evolution had a beautiful theory, but there were very limited opportunities to apply it. Now the situation is reversed. Molecular data accumulate too fast to be assimilated.

Crow referred to some of these questions 3 years ago when I interviewed him. Though much of the essay is a restatement of ideas floated elsewhere, it's still awesome that Crow is publishing at the age of 92. Judging by how quickly he replied when I sent him an email he is also still actively corresponding.

As for the general thesis outlined in the article, of course I tend to agree with Crow. From what I know Ernst Mayr's viewpoint in Systematics was overturned by the cladist revolution, which introduced a rigorous hypothetico-deductive framework into the field. It is perhaps just part of a trend of a marginalization of more philosophical biologists who rely on intuition in the realm of theory, and serves as a specific case study of Mayr's own philosophy of science and how it is ceding ground to more formal analytic techniques. Nevertheless, we can thank Mayr for his mentoring of someone like Robert Trivers. I remember talking to a friend of mine who was at OEB in the early 2000s, and she mentioned getting stuck in the elevator with Ernst Mayr, and my first reaction was, "Dude is still alive?!?!"

Labels: Genetics, Philosophy of science

Tuesday, February 24, 2009

Epigenetics in the NY Times posted by p-ter @ 2/24/2009 07:50:00 PM

Not news to many readers, I'm sure, but Nicholas Wade has a nice article on epigenetics and gene regulation. Some people in the article complain about the lack of a focused investment by the government in this area. I found this a little odd--isn't quite a bit of large-scale work being done by the ENCODE project?

Labels: Epigenetics, Genetics

Sunday, February 22, 2009

From human genetics to biological insight posted by p-ter @ 2/22/2009 11:20:00 AM

In 2007, SNPs in an intron of the gene FTO were reported to be associated with obesity. At the time, essentially nothing was known about the gene. A few months later, a group of biochemists proposed a role for the gene in demethylation of nucleic acids (RNA or DNA). This week, a group of mouse geneticists present an analysis of a knockout of the gene, and show that the knockouts are resistant to weight gain due to increased energy expenditure.

There's still quite a ways to go before the mechanism by which FTO contributes to weight variation in humans is understood (oddly enough, there's some evidence that the mechanism is through increased energy intake rather than expenditure), but people keep chipping away...

Labels: Genetics, Molecular biology

Friday, February 20, 2009

Convergent evolution in pigmentation posted by Razib @ 2/20/2009 11:16:00 AM

Short article in Conservation and Convergence of Colour Genetics: MC1R Mutations in brown Cavefish:

One of the most striking observations in nature is when similar phenotypes appear independently, such as wings in birds and bats, or melanism in moths and mice. These examples of so-called convergent evolution naturally lead us to ponder the question of genetic repeatability, i.e., the extent to which similar phenotypes that evolved in parallel share the same genetic mechanisms. Cave-dwelling organisms provide an attractive system for studying genetic repeatability, since populations in geographically isolated caves often undergo striking convergent evolution in response to the drastically altered environment, with reduced pigmentation and vision being particularly common phenotypes.

Labels: Genetics

Wednesday, February 11, 2009

Human CCR5 knockout posted by p-ter @ 2/11/2009 07:27:00 PM

This is a pretty nice little story: a man enters a clinic with leukemia and HIV, gets a bone marrow transplant from a donor homozygous for the CCR5 deletion (these individuals are largely resistant to HIV infection), and ends up no longer needing anti-retroviral therapy.

It's just a single patient, and I somehow doubt this is a viable option for HIV treatment, but still, this is pretty impressive:

In our patient, transplantation led to complete chimerism, and the patient's peripheral-blood monocytes changed from a heterozygous to a homozygous genotype regarding the CCR5 delta32 allele. Although the patient had non–CCR5-tropic X4 variants and HAART was discontinued for more than 20 months, HIV-1 virus could not be detected in peripheral blood, bone marrow, or rectal mucosa, as assessed with RNA and proviral DNA PCR assays. For as long as the viral load continues to be undetectable, this patient will not require antiretroviral therapy.

Labels: Genetics

Thursday, February 05, 2009

The rise of the black wolf posted by Razib @ 2/05/2009 01:34:00 PM

Molecular and Evolutionary History of Melanism in North American Gray Wolves:

Morphologic diversity within closely related species is an essential aspect of evolution and adaptation. Mutations in the Melanocortin 1 receptor (Mc1r) gene contribute to pigmentary diversity in natural populations of fish, birds, and many mammals. However, melanism in the gray wolf, Canis lupus, is caused by a different melanocortin pathway component, the K locus, that encodes a beta-defensin protein which acts as an alternative ligand for the Mc1r. We show that the melanistic K locus mutation in North American wolves derives from past hybridization with domestic dogs, has risen to high frequency in forested habitats, and exhibits a molecular signature of positive selection. The same mutation also causes melanism in the coyote, Canis latrans, and Italian gray wolves, and hence our results demonstrate how traits selected in domesticated species can influence the morphologic diversity of their wild relatives.

Also see ScienceDaily. The general dynamics should be relatively familiar. Here's the model in a figure:

Labels: Genetics

Friday, January 30, 2009

Picking the perfect baby posted by Razib @ 1/30/2009 08:21:00 PM

A few years ago I had a semi-serious post up making fun of Armand Leroi for broaching the topic of neo-eugenics. Now there are reports of elective pre-implantation screenings:

Genes determining sex, hair and eye colour can be identified, alongside any DNA red flags for diseases such as muscular dystrophy, cystic fibrosis and Down's Syndrome.

"Basically any genetic ailment, and there are thousands of them. We find the genetic error responsible for that in the embryo," Dr Steinberg says.

Only those embryos free of problem genetic markers and matching parental wishes, if stated, are then implanted in the mother.
...
"To deny them the ability to do that when the technology is there is to me unethical," Dr Steinberg says.

"You can say eye colour and hair colour are not diseases, no they're not, and there is a cosmetic element to it, but we fix crooked noses all the time.
...
He says he's concerned Australian women are risking their health by undertaking IVF overseas for "frivolous" reasons, using a process that raises the moral issue of "deliberate embryo loss".

"But the main issue is the idea of treating the child as an object, as product for which you are seeking quality control," Dr Tonti-Filippini says.

1) Part of this is publicity; you can get only so much information out of genetic tests right now (see Genetic Future). Take a look at Genetic determinants of hair, eye and skin pigmentation in Europeans, and note how much higher the odds ratios for OCA2 "blue-eye" markers are (20-30 vs. ~5) than those for the markers which might give some information about hair color. The same differences in effect size apply to disease loci. I suspect many people will balk at paying up when confronted with the provisionality of some of the inferences.

2) It isn't as if these fertility technologies are without downsides (not to mention the cost).

I'm tempted to say we're barely past the Difference Engine era when it comes to these technologies. But it probably does make sense to have the bioethics people talk through these issues now; the general outlines are already discernible. Of course it isn't as if many parents didn't view their children as accessories before these sorts of technologies.

Note: The link above is to an Australian newspaper. So I don't take everything they report literally...perhaps they spiced up a quote here and there?

H/T FuturePundit

Labels: Genetics

Sunday, January 25, 2009

Psoriasis genome-wide association studies posted by p-ter @ 1/25/2009 02:37:00 PM

The latest disease to be put under the scrutiny of a large genome-wide association study is psoriasis--see articles here, here, and here. These are mostly standard studies, but once again I'm struck by the effect of the MHC (HLA) region (see the figure). It was well known, of course, that variation in HLA affects all manner of immunity-related phenotypes, but what's becoming clear is that this variation has much larger effects than other loci in the genome. I find this somewhat surprising--associations with MHC variability were identified because typing HLA was feasible many years ago, well before genome-wide association mapping with SNPs was even considered. This could be a case where, in the analogy to the man searching for the lost keys in the dark, looking under the lamppost was actually the best bet.

Labels: Genetics

Friday, January 16, 2009

Koreans are like the Hmong posted by Razib @ 1/16/2009 09:19:00 PM

Over at my other weblog I review a paper on the genetics of Koreans. The title is a shout-out to an old Korean American friend of mine who received a great deal of grief from his female Korean American peers for openly admitting that he was into a Hmong girl. She was very good-looking, but as they said, "But she's Hmong...."

Labels: Genetics, Koreans

Monday, January 12, 2009

Pinker on personal genomics posted by p-ter @ 1/12/2009 07:49:00 PM

Everyone is talking about Steven Pinker's article in the NY Times on his experiences toying with results from personal genomics companies Counsyl and 23andme. The article is, like most of Pinker's writing, entertaining and scientifically legit; well worth a read.

One interesting point to notice is that the different companies involved in this nascent field (I find Razib's analogy to the early computer industry apt) are taking very different angles; there's been something of an adaptive radiation into these new niches (e.g., 23andme going for the social networking approach, Counsyl a more medical one). My impression is that this is likely to be a good thing overall for people who, of course, will have varied ideas of the type of data that piques their interest.

Labels: Genetics

Saturday, January 03, 2009

Convergent loss of pigmentation in cavefish posted by p-ter @ 1/03/2009 09:04:00 AM

One of the established cool examples of convergent evolution (which for my purposes here I'll define loosely as the evolution of different populations to the same phenotype via different mutations) has been the repeated loss of pigmentation (and eyes) in fish that have adapted to life in light-poor, nutrient-poor caves. In 2006, a group reported that albinism (panel J in the picture) in several of these caves was due to mutations in OCA2 (a SNP in a regulatory region of this gene also causes blue eyes in humans).

Not all cavefish, however, are fully albino--in some populations, there also exists a "brown" phenotype (panel G in the picture) with reduced pigmentation. In a new paper, the gene underlying this phenotype is shown to be MC1R (this gene, of course, influences pigmentation in all sorts of species), and, as with OCA2, two different mutations have arisen in different populations.

One might imagine that light pigmentation in cavefish could just be due to simple drift--a random mutation that knocks out pigmentation is no longer selected against in a place where there's little light, and so could drift up to high frequency. But the fact that this phenotype has arisen so many times, and reached high frequency in the presumably short time period that these fish populations have been isolated (I say presumably short because I can't find any numbers on this, but the different populations can interbreed freely) suggests a role for strong positive selection for this phenotype in adaptation to the cave environment.
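To put rough numbers on the drift-versus-selection argument, here is a minimal sketch using Kimura's classic diffusion approximation for the fixation probability of a single new mutation. Everything specific in it is my assumption, not something from the cavefish papers: the function name, the population size of 1,000 diploids, the selection coefficient of 1%, additive (genic) selection, and an effective population size equal to the census size.

```python
import math


def fixation_probability(n_diploid, s):
    """Kimura's approximation for one new copy (frequency 1/(2N)).

    Assumes additive selection with coefficient s and Ne equal to N;
    both are illustrative simplifications.
    """
    if abs(s) < 1e-12:
        # Neutral limit: every one of the 2N copies is equally likely to fix.
        return 1.0 / (2 * n_diploid)
    # (1 - exp(-2s)) / (1 - exp(-4Ns)), written with expm1 for accuracy.
    return math.expm1(-2 * s) / math.expm1(-4 * n_diploid * s)


if __name__ == "__main__":
    n = 1000  # hypothetical cave population size
    neutral = fixation_probability(n, 0.0)
    favoured = fixation_probability(n, 0.01)
    print("neutral:", neutral)
    print("s = 0.01:", favoured)
    print("ratio:", favoured / neutral)
```

Under these made-up parameters a mildly favoured mutation is far more likely to fix than a neutral one, and the gap compounds when the same phenotype has to reach high frequency independently in several caves. That is the intuition behind reading repeated, rapid fixation as a hint of positive selection; it is not a test of it.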

Labels: Evolution, Genetics

Wednesday, December 24, 2008

Lactase persistence review posted by p-ter @ 12/24/2008 08:49:00 AM

This is a pretty thorough review of the biology and evolution of lactase persistence. It's interesting that the precise genetic mechanism underlying the phenotype remains unknown; this seems like a potentially very interesting model phenotype for people interested in the temporal and spatial regulation of gene expression.

Labels: Genetics

Saturday, December 20, 2008

Transcription around promoters posted by p-ter @ 12/20/2008 05:38:00 PM

A number of papers out this week (summarized here) notice, using various technologies, the presence of extensive transcription off both DNA strands around active promoters. A figure from one of the papers is above--note the peak in transcription from the sense strand just downstream of the transcription start site (TSS), and the peak in anti-sense transcription just upstream of the TSS. This is an interesting observation, and an example of the unexpected things you can see with new technologies, but no one is exactly sure what to make of it--it could just be the transcriptional machinery being a bit sloppy.

Labels: Gene Expression, Genetics

Wednesday, December 17, 2008

Testing natural selection with genetics posted by p-ter @ 12/17/2008 09:22:00 PM

H. Allen Orr (one of the authors of the study I mentioned recently) has a fun article in Scientific American on testing hypotheses about natural selection using genetic data. Orr has been one of the few people to try and formally model adaptation in a population genetics framework (I highly recommend this review article from 2005 for a well-written and accessible discussion of this issue), so his thoughts are worth a read.

And if you're in the mood for a chuckle, check out Larry Moran's thoughts on the article.

Labels: Genetics

Tuesday, December 16, 2008

Speciation genes posted by p-ter @ 12/16/2008 05:00:00 PM

RPM points to a couple great papers on the genetics of speciation in Drosophila and mouse. The first is particularly interesting--the gene underlying hybrid incompatibility is also involved in meiotic drive.

What's fun about these sorts of studies is that one can almost start to reconstruct the sequence of population genetic events leading to speciation--like all alleles, one that leads to hybrid infertility has to pass through a phase in which it segregates in the population. This is counterintuitive, of course--any allele causing infertility in some fraction of offspring should be deleterious. One possibility is that divergence at these genes is due to differential selection, and this new paper raises the possibility that sometimes this selection might not be due to external selection pressures, but rather to intragenomic conflict.

Labels: Genetics

Saturday, December 06, 2008

How different are gene expression levels between Europeans and Africans? posted by p-ter @ 12/06/2008 08:29:00 AM

In early 2007, a paper on expression differences between populations claimed that something like 25% of all genes are differentially expressed between two population groups (in that case, in cell lines from people of either European or Chinese origin). That paper, though, had a pretty serious flaw--ancestry effects on expression were completely confounded with microarray batch effects, so the precise numbers in the paper were somewhat suspect.

One way to test whether differences between populations in expression levels are real would be to measure expression in admixed individuals--if expression levels correlate with admixture proportions within a sample, that's pretty good evidence that genetic background plays an important role in expression (barring some third factor that correlates with both genetic background and expression of a large number of genes). A population of admixed European-Asian individuals is probably a little hard to come by, but admixed European-African individuals (AKA African-Americans) are less so. A recent paper lays out the results of a study like this in African-Americans.

The results are somewhat surprising--by correlating expression levels with admixture proportions, the authors speculate that nearly all genes have an ancestry effect on expression. The reason this is somewhat surprising is that, given the way the authors did the analysis, it means the expression of a locus depends on a large number of other loci throughout the genome (if the expression levels of a locus were only affected by variation at that same locus, there would be no correlation between total ancestry and expression). Indeed, the authors estimate that only ~12% of heritable variation in expression of a given gene is due to the effects of local (or cis) variation. Other studies have had little success in identifying distant (or trans-acting) effects in humans; this suggests that the reason, as in many other genome-wide association studies, is simply a lack of power.
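The logic of that parenthetical can be checked with a toy simulation. This is only a sketch under assumptions of my own, not the authors' analysis: I assume the comparison is made after accounting for ancestry at the gene itself (here via a partial correlation), I lump all trans effects into a single term proportional to genome-wide ancestry, and the sample size, effect sizes, and ancestry range are invented for illustration.

```python
import math
import random


def pearson(x, y):
    """Plain Pearson correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_corr(x, y, z):
    """Correlation of x and y after linearly removing z from both."""
    rxy, rxz, ryz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))


def trans_signal(trans_effect, n=4000, seed=1):
    """Expression vs. genome-wide ancestry, controlling for local ancestry.

    trans_effect is a hypothetical knob: 0.0 means expression depends only
    on ancestry at the gene itself (cis); larger values add a contribution
    from the rest of the genome.
    """
    rng = random.Random(seed)
    genome_wide, local, expression = [], [], []
    for _ in range(n):
        a = rng.uniform(0.6, 1.0)  # genome-wide African ancestry fraction
        # Number of African-ancestry chromosomes at the gene (0, 1, or 2).
        copies = sum(rng.random() < a for _ in range(2))
        level = 1.0 * copies + trans_effect * a + rng.gauss(0.0, 1.0)
        genome_wide.append(a)
        local.append(copies)
        expression.append(level)
    return partial_corr(expression, genome_wide, local)


if __name__ == "__main__":
    print("cis only:", trans_signal(0.0))
    print("cis + trans:", trans_signal(5.0))
```

In the cis-only setting, genome-wide ancestry carries no information about expression once local ancestry is known, so the partial correlation hovers around zero. Adding a trans term leaves a clear residual signal, which is the pattern the paper reports for most genes.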

Labels: Genetics

Monday, December 01, 2008

Widespread copy number variation affecting phenotypes? posted by p-ter @ 12/01/2008 07:51:00 PM

A report in Science from the annual meeting of the American Society for Human Genetics focuses on copy number variation. Some interesting observations:

Don Conrad and his colleagues at the Sanger Institute have their eyes on smaller common CNVs, as little as 500 base pairs in length. Checking about every 50 base pairs across parts of the genomes of people of African and European ancestry, they uncovered more than 10,000 CNVs--suggesting that other efforts, which have identified about 1500 common ones, are missing most CNVs. Although "there haven't been many" CNVs linked to disease yet, Conrad said in his talk, "there might be quite a few out there." Indeed, he noted that 129 of the 419 genetic-association regions pinpointed in genome-wide association studies hunting for disease DNA contain a common CNV.

Labels: Genetics

Monday, November 24, 2008

The genetic map of Europe we already knew.... posted by Razib @ 11/24/2008 11:22:00 AM

From Henry Harpending:

This is from a 1984 paper, citation below the figure. The genetic data were 6 red cell antigens, 9 electrophoretic systems, and HLA and HLB. The context was the authors' effort to set up a big population genetic and demographic database of Mormons, which was criticized because the Mormons were thought to be derived from a small isolated inbred group. They wrote this paper to show that Mormon allele frequencies were generic northern European. Another paper that followed this showed that Amish and Mennonites were indeed off in another dimension, but not Mormons.

This isn't up to current standards but it does show that 25 years ago the correspondence between genetic and geographic distances in Europe was clear.

McLellan T, Jorde LB, Skolnick MH. 1984. Genetic distances between the Utah Mormons and related populations. Am J Hum Genet 36:836-857.

I made a minor modification to the figure, which is courtesy of Lynn Jorde.

Labels: Genetics

Sunday, November 23, 2008

Why does the genetic map of Europe still work? posted by Razib @ 11/23/2008 09:58:00 AM

In the comments below Susan C asks an interesting question:

I'm still surprised that this works as well as it does, given that there were mass movements of people during the nineteenth and twentieth century.

For Europe prior to 1815, I'd expect it to work. Genealogical records show that people were very often born in the same village that their parents were, or the next village along. I would guess the rate of diffusion to be a few km per generation.

After the Napoleonic Wars, though, it goes nuts. Changing methods of agriculture (e.g. enclosure of land) meant that many rural agricultural labourers were put out of work, and had to move to the major industrial cities. This migration could easily be in the range of 100km in one generation, or even transcontinental - people emigrating to North America or Australia.

Moving forward to the Second World War, many people from central Europe fled the Nazis and came to settle in Britain.

So if you take a British person today, and ask them where their grandmother was born, likely answers range from Aberystwyth to Krakow, even if they answer "white" to an ethnicity question. (Of course there's plenty of evidence of immigration from e.g. India or the Caribbean, too)

An interesting point. Some levels of immigration and movement have always been part of European history. Think about the outflow of Huguenots after the revocation of the Edict of Nantes. The trade and migration between the Low Countries and the eastern shore of Britain. The immigration of Spaniards, Poles and Italians to France in the 19th century. The relocation of Saxons to Romania, Russia, etc.

Some thoughts:

1) Many of the immigrants, like the Huguenots, settled disproportionately in cities and towns (the Volga Russians are an exception obviously). French in Berlin, British Puritans in Amsterdam, Jewish industrial workers in East London, Asian sailors in Cardiff. And cities until recently were powerful relative population sinks. So modern European cities might be affected by past immigration (e.g., in changing the accent on dialects) culturally, but they are far less reshaped genetically than you would expect.

2) Many of the immigrants were from nearby regions. Spanish and Italian immigration to France was far higher than Polish. So the effect would be more to subtly shift the positions and centers of gravity, as opposed to rearranging the expected spatial relationship.

3) Aside from France, there wasn't much migration as a proportion of the population. The ancestors from Aberystwyth and Krakow are very salient because of their exoticism. This is subject to the same dynamics as the disappearing English phenomenon.

4) They sampled from only a few locations within each nation, so the clumping is exaggerated, and combined with #3, the migration effect wasn't strong enough to change your impression. Perhaps they also generally don't sample ethnic minorities in these studies; e.g., avoiding Hungarians and Saxons in Romania.

5) Some migrations, like the expulsion of Germans from Eastern Europe after World War II, rolled back the obscuring effects of earlier movements.

I was thinking about following the notes and whatnot to see where the samples came from, but I'll leave it to enterprising readers. I'm sure that can answer some of these questions.

Labels: Genetics

Friday, November 21, 2008

Another genetic map of Europe posted by Razib @ 11/21/2008 12:02:00 PM

I pointed to the paper at my other weblog, but since ScienceBlogs has a narrow page width, I've put the important charts below the fold.

Table 4 - Each horizontal line in the table shows the proportions of test samples originating from a given country that were assigned to each possible target country. I made a few edits, see paper for original.

Populations	Spain	France	Belgium	UK	Norway	Sweden	Romania	Germany	Hungary	Slovakia	Czech	Poland	Russia
Spain	0.945	0.055	0.000	0.000	0.000	0.000	0.000	0.000	0.000	0.000	0.000	0.000	0.000
France	0.085	0.515	0.270	0.105	0.000	0.000	0.004	0.014	0.007	0.000	0.000	0.000	0.000
Belgium	0.000	0.086	0.854	0.059	0.000	0.000	0.000	0.000	0.000	0.000	0.000	0.000	0.000
UK	0.000	0.009	0.027	0.947	0.000	0.000	0.000	0.017	0.000	0.000	0.000	0.000	0.000
Norway	0.000	0.000	0.000	0.000	0.991	0.010	0.000	0.000	0.000	0.000	0.000	0.000	0.000
Sweden	0.000	0.000	0.000	0.000	0.099	0.901	0.000	0.000	0.000	0.000	0.000	0.000	0.000
Romania	0.000	0.000	0.000	0.000	0.000	0.000	0.960	0.000	0.040	0.000	0.000	0.000	0.000
Germany	0.000	0.000	0.102	0.004	0.029	0.022	0.008	0.644	0.003	0.003	0.177	0.008	0.000
Hungary	0.000	0.000	0.000	0.000	0.000	0.000	0.022	0.051	0.546	0.292	0.090	0.000	0.000
Slovakia	0.000	0.000	0.000	0.000	0.000	0.000	0.000	0.077	0.220	0.453	0.250	0.000	0.000
Czech	0.000	0.000	0.000	0.000	0.000	0.000	0.038	0.052	0.161	0.205	0.484	0.062	0.000
Poland	0.000	0.000	0.000	0.000	0.000	0.000	0.008	0.002	0.009	0.025	0.021	0.802	0.134
Russia	0.000	0.000	0.000	0.000	0.000	0.000	0.000	0.000	0.008	0.008	0.000	0.040	0.944
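For readers who want to poke at Table 4 rather than eyeball it, here is a small sketch that parses the rows exactly as given above and pulls out the self-assignment rate for each country and the largest off-diagonal entries. The function names are mine, and the numbers are the edited values shown in this post, not necessarily the paper's originals.

```python
TABLE_4 = """
Populations Spain France Belgium UK Norway Sweden Romania Germany Hungary Slovakia Czech Poland Russia
Spain 0.945 0.055 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000
France 0.085 0.515 0.270 0.105 0.000 0.000 0.004 0.014 0.007 0.000 0.000 0.000 0.000
Belgium 0.000 0.086 0.854 0.059 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000
UK 0.000 0.009 0.027 0.947 0.000 0.000 0.000 0.017 0.000 0.000 0.000 0.000 0.000
Norway 0.000 0.000 0.000 0.000 0.991 0.010 0.000 0.000 0.000 0.000 0.000 0.000 0.000
Sweden 0.000 0.000 0.000 0.000 0.099 0.901 0.000 0.000 0.000 0.000 0.000 0.000 0.000
Romania 0.000 0.000 0.000 0.000 0.000 0.000 0.960 0.000 0.040 0.000 0.000 0.000 0.000
Germany 0.000 0.000 0.102 0.004 0.029 0.022 0.008 0.644 0.003 0.003 0.177 0.008 0.000
Hungary 0.000 0.000 0.000 0.000 0.000 0.000 0.022 0.051 0.546 0.292 0.090 0.000 0.000
Slovakia 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.077 0.220 0.453 0.250 0.000 0.000
Czech 0.000 0.000 0.000 0.000 0.000 0.000 0.038 0.052 0.161 0.205 0.484 0.062 0.000
Poland 0.000 0.000 0.000 0.000 0.000 0.000 0.008 0.002 0.009 0.025 0.021 0.802 0.134
Russia 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.008 0.008 0.000 0.040 0.944
"""


def parse_table(text):
    """Return {source country: {assigned country: proportion}}."""
    rows = [line.split() for line in text.strip().splitlines()]
    targets = rows[0][1:]  # header row, minus the "Populations" label
    return {r[0]: dict(zip(targets, map(float, r[1:]))) for r in rows[1:]}


def self_assignment(table):
    """Proportion of each country's samples assigned back to that country."""
    return {src: row[src] for src, row in table.items()}


def top_confusions(table, k=3):
    """The k largest off-diagonal entries as (source, assigned, proportion)."""
    pairs = [(p, src, tgt)
             for src, row in table.items()
             for tgt, p in row.items() if src != tgt]
    pairs.sort(reverse=True)
    return [(src, tgt, p) for p, src, tgt in pairs[:k]]


if __name__ == "__main__":
    table = parse_table(TABLE_4)
    for country, rate in sorted(self_assignment(table).items(),
                                key=lambda kv: kv[1]):
        print(country, rate)
    print(top_confusions(table))
```

Sorting by self-assignment makes the geographic pattern easy to see: the biggest confusions are between neighbors, and the Central European samples are the hardest to place.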

Labels: Genetics