What is a gene “for”?


“Scientists discover gene for autism” (or ovarian cancer, or depression, cocaine addiction, obesity, happiness, height, schizophrenia… and whatever you’re having yourself). These are typical newspaper headlines (all from the last year) and all use the popular shorthand of “a gene for” something. In my view, this phrase is both lazy and deeply misleading and has caused widespread confusion about what genes are and do and about their influences on human traits and disease.

The problem with this phrase stems from the ambiguity in what we mean by a “gene” and what we mean by “for”. These can mean different things at different levels and unfortunately these meanings are easily conflated. First, a gene can be defined in several different ways. From a molecular perspective, it is a segment of DNA that codes for a protein, along with the instructions for when and where and in what amounts this protein should be made. (Some genes encode RNA molecules, rather than proteins, but the general point is the same.) The function of the gene on a cellular level is thus to store the information that allows this protein to be made and its production to be regulated. So, you have a gene for haemoglobin and a gene for insulin and a gene for rhodopsin, etc., etc. (around 25,000 such genes in the human genome). The question of what the gene is for then becomes a biochemical question – what does the encoded protein do?

But that is not the only way or probably even the main way that people think about what genes do – it is certainly not how geneticists think about it. The function of a gene is commonly defined (indeed often discovered) by looking at what happens when it is mutated – when the sequence of DNA bases that make up the gene is altered in some way which affects the production or activity of the encoded protein. The visible manifestation of the effect of such a mutation (the phenotype) is usually defined at the organismal level – altered anatomy or physiology or behaviour, or often the presence of disease. From this perspective, the gene is defined as a separable unit of heredity – something that can be passed on from generation to generation that affects a particular trait. This is much closer to the popular concept of a gene, such as a gene for blue eyes or a gene for breast cancer. What this really means is a mutation for blue eyes or a mutation for breast cancer.

The challenge is in relating the function of a gene at a cellular level to the effects of variation in that gene, which are most commonly observed at the organismal level. The function at a cellular level can be defined pretty directly (make protein X) but the effect at the organismal level is much more indirect and context-dependent, involving interaction with many other genes that also contribute to the phenotype in question, often in highly complex and dynamic systems.

If you are talking about a simple trait like blue eyes, then the function of the gene at a molecular level can actually be related to the mutant phenotype fairly easily – the gene encodes an enzyme that makes a brown pigment. When that enzyme is not made or does not work properly, the pigment is not made and the eyes are blue. Easy-peasy.

But what if the phenotype is in some complex physiological trait, or even worse, a psychological or behavioural trait? These traits are often defined at a very superficial level, far removed from the possible molecular origins of individual differences. The neural systems underlying such traits may be incredibly complex – they may break down due to very indirect consequences of mutations in any of a large number of genes.

For example, mutations in the genes encoding two related proteins, neuroligin-3 and neuroligin-4 have been found in patients with autism and there is good evidence that these mutations are responsible for the condition in those patients. Does this make them “genes for autism”? That phrase really makes no sense – the function of these genes is certainly not to cause autism, nor is it to prevent autism. The real link between these genes and autism is extremely indirect. The neuroligin proteins are involved in the formation of synaptic connections between neurons in the developing brain. If they are mutated, then the connections that form between specific types of neurons are altered. This changes the function of local circuits in the brain, affecting their information-processing parameters and changing how different regions of the brain communicate. Ultimately, this impacts on neural systems controlling things like social behaviour, communication and behavioural flexibility, leading to the symptoms that define autism at the behavioural level.

So, mutations in these genes can cause autism, but these are not genes for autism. They are not even usefully or accurately thought of as genes for social behaviour or for cognitive flexibility – they are required, along with the products of thousands of other genes, for those faculties to develop.

But perhaps there are other genetic variants in the population that affect the various traits underlying these faculties – not in such a severe way as to result in a clinical disorder, but enough to cause the observed variation across the general population. It is certainly true that traits like extraversion are moderately heritable – i.e., a fair proportion of the differences between people in this trait are attributable to genetic differences. When someone asks “are there genes for extraversion?”, the answer is yes if they mean “are differences in extraversion partly due to genetic differences?”. If they mean that the function of some genetic variant is to make people more or less extraverted, then they have suddenly (often unknowingly) gone from talking about the activity of a gene or the effect of mutation of that gene to considering the utility of a specific variant.

This suggests a deeper meaning – not just that the gene has a function, but that it has a purpose – in biological terms, this means that a particular version of the gene was selected for on the basis of its effect on some trait. This can be applied to the specific sequence of a gene in humans (as distinct from other animals) or to variants within humans (which may be specific to sub-populations or polymorphic within populations).

While geneticists may know what they mean by the shorthand of “genes for” various traits, it is too easily taken in different, unintended ways. In particular, if there are genes “for” something, then many people infer that the something in question is also “for” something. For example, if there are “genes for homosexuality”, the inference is that homosexuality must somehow have been selected for, either currently or under some ancestral conditions. Even sophisticated thinkers like Richard Dawkins fall foul of this confusion – the apparent need to explain why a condition like homosexual orientation persists. Similar arguments are often advanced for depression or schizophrenia or autism – that maybe in ancestral environments, these conditions conferred some kind of selective advantage. That is one supposed explanation for why “genes for schizophrenia or autism” persist in the population.

Natural selection is a powerful force but that does not mean every genetic variation we see in humans was selected for, nor does it mean every condition affecting human psychology confers some selective advantage. In fact, mutations like those in the neuroligin genes are rapidly selected against in the population, due to the much lower average number of offspring of people carrying them. The problem is that new ones keep arising – in those genes and in thousands of others required to build the brain. By analogy, it is not beneficial for my car to break down – this fact does not require some teleological explanation. Breaking down occasionally in various ways is not a design feature – it is just that highly complex systems bring an associated higher risk due to possible failure of so many components.

So, just because the conditions persist at some level does not mean that the individual variants causing them do. Most of the mutations causing disease are probably very recent and will be rapidly selected against – they are not “for” anything.
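
To put rough numbers on this argument, here is a minimal sketch in Python of what population geneticists call mutation-selection balance. Everything in it is an assumption made for illustration: the function names, the mutation rate `mu` and the selection coefficient `s` are invented, not estimates for the neuroligin genes or any real gene, and the model is deliberately crude (one site, no dominance, no chance effects, an effectively infinite population).

```python
# Illustrative sketch only: mu, s and the starting frequency are made-up
# values, not measurements for any real gene or condition.

def next_frequency(q, mu, s):
    """Advance the frequency q of harmful copies by one generation.

    Selection: carriers leave a fraction s fewer offspring.
    Mutation: a fraction mu of the remaining normal copies newly mutate.
    """
    after_selection = q * (1 - s) / (1 - q * s)
    return after_selection + mu * (1 - after_selection)


def simulate(q0, mu, s, generations):
    """Iterate next_frequency for a number of generations, starting from q0."""
    q = q0
    for _ in range(generations):
        q = next_frequency(q, mu, s)
    return q


def fraction_new(q, mu, s):
    """Share of harmful copies in the next generation that arose as new mutations."""
    after_selection = q * (1 - s) / (1 - q * s)
    new_copies = mu * (1 - after_selection)
    return new_copies / (after_selection + new_copies)


if __name__ == "__main__":
    mu, s = 1e-5, 0.5  # hypothetical mutation rate and fitness cost
    q_eq = simulate(q0=0.01, mu=mu, s=s, generations=200)
    print("long-run frequency of harmful copies:", q_eq)
    print("share that are brand-new mutations:", fraction_new(q_eq, mu, s))
```

In this toy model the frequency settles close to `mu / s` whatever the starting point, and when selection is strong a large share of the harmful copies present in any generation are new arrivals. The condition persists while the individual variants do not.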


Jamain S, Quach H, Betancur C, Råstam M, Colineaux C, Gillberg IC, Soderstrom H, Giros B, Leboyer M, Gillberg C, Bourgeron T, & Paris Autism Research International Sibpair Study (2003). Mutations of the X-linked genes encoding neuroligins NLGN3 and NLGN4 are associated with autism. Nature Genetics, 34(1), 27-29. PMID: 12669065

11 Comments

  1. For example, if there are “genes for homosexuality”, the inference is that homosexuality must somehow have been selected for, either currently or under some ancestral conditions. Even sophisticated thinkers like Richard Dawkins fall foul of this confusion – the apparent need to explain why a condition like homosexual orientation persists.

    It is safe to say that it is intentional on Richard Dawkins’s part when he makes use of “a gene for” language, as he explicitly defended doing so in The Selfish Gene. His point was that it is entirely proper to speak of a gene for something as long as that gene made such a phenotypic effect more probable, all other things being equal. So when Dawkins is talking about a gene for something, he is talking more about what it does, rather than why it was selected.

  2. I understand his argument in using the phrase – I just don’t agree with it, because it can be so easily misinterpreted. And you will see in the YouTube clip that Dawkins does go to some lengths to construct possible scenarios to explain the persistence of “genes for homosexuality”. My argument is that that phrase itself tends to put one in the frame of mind of considering the purpose, rather than the function of a gene or the effect of its mutation. In this particular instance (which is all I am referring to) he is explicitly discussing why these genetic variants might have been selected for.

  3. Do you have a proposed fix that press release writers and journalists can simply drop into articles that would fix the problem? Note that I didn’t say headline writers; they’re pretty much a lost cause unless you can think of something as catchy as “man bites dog.”

    One of the points you bring up is that most of these are defects in a gene, so a lede that starts with “Scientists have discovered a genetic defect that leads to X” might be a good start. Or possibly “implicated in X.”

  4. I think your proposed solution works well actually: “Scientists have discovered a genetic defect that leads to X” – that might be modified sometimes to “associated/linked with X”, depending on how strong an effect it is. For general traits, you might say: “Scientists have discovered a genetic variant that affects X”. Not very exciting prose, but accurate, and if you don’t say precisely what you mean, people will infer things you didn’t intend. (Sometimes they do that even when you have said precisely what you meant!)

  5. What do you think of Greg Cochran’s theory that many of the fitness-impacting traits people try to ascribe to genes are better thought of as stemming from pathogens?

  6. Remember, genes are NOT blueprints. This means you can’t, for example, insert “the genes for an elephant’s trunk” into a giraffe and get a giraffe with a trunk. There are no genes for trunks. What you CAN do with genes is chemistry, since DNA codes for chemicals. For instance, we can in theory splice the native plants’ talent for nitrogen fixation into a terran plant.

    Academician Prokhor Zakharov, “Nonlinear Genetics”

    Sid Meier’s Alpha Centauri

  7. TGGP

    Greg Cochran is of course correct. Since Autism was mentioned the latest evidence strongly suggests an environmental trigger. For anyone who has at least some faith in natural selection this doesn’t come as a surprise.

    http://www.sciencedaily.com/releases/2011/11/111107162734.htm

    “The study searched, on a genome-wide scale, for genes that show an abnormal epigenetic signature — specifically histone methylation. Histones are small proteins attached to the DNA that control gene expression and activity. While genetic information is encoded by the (genome’s) DNA sequence, methylation and other types of histone modifications regulate genome organization and gene expression. The study found hundreds of loci (the places genes occupy on chromosomes) across the genome affected by altered histone methylation in the brains of autistic individuals. However, only a small percentage — less than 10 percent — of the affected genes were affected by DNA mutations.”

    In related news it’s likely that Autism begins in the womb.
    http://www.sciencedaily.com/releases/2011/11/111108200720.htm

    “The researchers found that children with autism had 67 percent more neurons in the prefrontal cortex and heavier brains for their age compared to typically developing children. Since these neurons are produced before birth, the study’s findings suggest that faulty prenatal cell birth or maintenance may be involved in the development of autism. Another possible factor that may contribute to the neuronal excess is a reduction in apoptosis, or programmed cell death, which normally occurs during the third trimester and early postnatal life.”

  8. DR01D, thanks for your comments. I would argue strongly with your statement that the latest research strongly suggests an environmental trigger. I know of no such evidence. The study on epigenetics that you refer to does not speak to this issue. It shows some differences in histone methylation across the genome in cells from patients with autism versus controls. Histone methylation is a means of gene regulation that is carried out by proteins encoded in the genome – it is part of the genetic programme of differentiation and development. It is also part of the dynamic response to environmental factors and experience. So, seeing epigenetic differences does not necessarily implicate environmental factors – this could be just an expression of underlying genetic differences and altered developmental trajectories. (Epigenetic does not mean environmental or “not genetic”.)

    And to your second point, I wholeheartedly agree – as with many other psychiatric disorders, even ones with fairly late onset like schizophrenia, all the evidence suggests the initial insults are in early (prenatal) neurodevelopment.

  9. kjmtchl

    Thanks for your response. This is why I assumed the Autism study pointed towards environment.

    “However, only a small percentage — less than 10 percent — of the affected genes were affected by DNA mutations.”

    If less than 10 percent of the difference in genetic expression was the result of heredity, doesn’t that strongly argue for environment? I understand that Autism might get rolling because of a problem in the <10%. But all things being equal wouldn’t the smart money bet that the trigger resides in the >90%?

  10. [...] ii. What is a gene “for”? [...]

  11. Congratulations to kjmtchl, the author of this incisive article. For more than ten years now I have been questioning in my own mind why scientists like Dean Hamer and his co-workers would want to look for a gene for homosexuality. Given the importance in evolution that attraction to the opposite sex has for the continuation of the species, this property could not be left to chance and would have to be encoded in our genes. Hence in the male of the species, the individual in most cases (in humans it has been estimated at 90%) is sexually attracted to the female and conversely the female is attracted to the male (by the same estimated 90% – this, as discussed below, is probably not a coincidence). So it seems to me that scientists should be trying to identify, as a starting point, the gene that causes sexual attraction to the opposite sex, and the product of that gene. (I use the term gene in the singular here for ease of expression although, in all probability, there will be a set of genes, and ultimately there will be many interactions when one comes to the question of sexual behaviour, some of which will be genetic and others not.)

    As has happened many times in the early studies in biochemistry, it has been the study of the abnormal (the word “abnormal” from here onwards is used in the sense of the minority of the human population, that is, the 1 in 10) that has provided the first insights into the normal biochemical situation. Thus early studies on the family tree of homosexual males pointed to the fact that the genetic trait of homosexuality carried by these individuals was inherited from their heterosexual mothers, and later researchers concentrated their studies on the X chromosome. A heterosexual grandmother has a 1 in 2 chance of passing this trait to her children (by way of one of her two X chromosomes), whether they be sons or daughters. The sons will have a 1 in 2 chance of inheriting the trait and those that do will be genetically homosexual (the “gay” uncle). The daughters will have a 1 in 2 chance of inheriting the trait and their sons (the grandsons) will in turn have a 1 in 2 chance of inheriting the trait.

    If the gene coding for “sexual attraction” is carried on the X chromosome in the gay male, is it not reasonable to conclude that the gene for determining sexual attraction in the remainder of the human population is also carried on the X chromosome? This is my starting hypothesis. The gay male has the same genetic coding as the heterosexual female – there is no mutation and no “gay gene” – just an inheritance by the male of a normal female gene. So in this 90% of the population, the two sequences of the genes on the X chromosomes (XX in the female) produce enough “product protein” to confer on the individual an attraction to the male, whilst with only one sequence there is only enough product protein to make the individual attracted to the female. The genetically gay male then would behave as if he had an XX and the lesbian would behave as a single X (in so far as this particular gene sequence is concerned and not the whole X chromosome). This could have come about in our ancestral past if there were a defect in the replication of that part of the X chromosome, producing a situation whereby the gene in question migrated from a normal sequence in the X chromosome (leaving that X chromosome deficient for that gene) and attached itself to another normally sequenced X chromosome (making it surplus by one for that gene). This postulates that there are three variants (with respect to coding for sexual orientation) of the X chromosome in the human population: one producing no product for sexual orientation because the gene sequence is missing, one producing a normal amount of product and one producing a double amount of product. I have called these X-, X and X+. As the X- and the X+ are postulated to have arisen from the same single event, it would not be surprising that they would be found (or seen to be expressed) in the population to the same extent (now believed to be about 10%).

    From this, it is hypothesised that most lesbians would be XX-; a heterosexual female would be XX or X-X+; and a nymphomaniac would be an XX+ or X+X+. Similarly for the male population, the gay male would be X+Y, the heterosexual male would be XY and the womaniser may be X-Y. As stated above and elsewhere, the actual sexual behaviour of an individual will be the result of many factors, both genetic and non-genetic, such as social, peer and religious pressures, and could vary during the lifetime of that individual. It will be up to others, if thought worthwhile, to examine the detailed analytical genetic work of researchers such as Dean Hamer to determine whether it supports this hypothesis, argues against it, or can do neither. I am unsure whether sequencing or other techniques are capable of determining whether a particular gene sequence occurs twice in the same chromosome or not.
