The missing heritability of human nature

Carl Zimmer has a review of Svante Paabo’s new book, Neanderthal Man: In Search of Lost Genomes, up at The New York Times. All of Carl’s stuff is good, and as someone who doesn’t have the time to read Paabo’s new book I do appreciate reviews. But this last chunk concerned me: “Neither Paabo nor any other scientist can yet clearly link our mutations to our human nature.” I don’t think this is Carl inventing an issue out of whole cloth; Paabo has mooted the general idea before, to my consternation. I’m not alone here. John Hawks and Graham Coop, scientists looking at the same questions from divergent methodological traditions, have little use for fixing upon specific genes which might illuminate distinctive aspects of our humanity. It seems to me that a model of ensoulment is best left to religion. Science has more profitably dethroned man from the center of the cosmos. Becoming human as we understand it is more likely an evolutionary process than a miraculous moment.*

* I am also on record as stating that “human rights” or “personhood” should not be the privilege of the lineage of H. sapiens. Genuine artificial intelligence, or non-human organisms which have been “uplifted”, should have the same rights that natural born humans do.

The End of History Will Never Arrive

The above clip of Neil deGrasse Tyson has been lighting up my social feeds. It’s made Upworthy. Tyson ends by stating that “Before we start talking about genetic differences [between race and gender], you gotta come up with a system where there is equal opportunity, then we can have that conversation.” The major question that immediately comes to mind with these sorts of assertions, which are quite ubiquitous, is how one determines the extent of equal opportunity if one does not have a model for the outcomes of equal opportunity. The reality is that those making such claims have a model of the outcomes, unstated because it is shared by so many: proportionate representation, because they assume that there are in fact no innate dispositional differences* between groups. It is the Left liberal version of Homo economicus. Once this model is in place, lack of proportionate representation can be taken as ipso facto evidence of lack of equal opportunity.** With this model in hand, innate dispositional differences would give the same outcomes as unequal opportunity, and so could be taken as evidence of lack of equal opportunity. So ultimately the “lack of interest” in these issues dovetails nicely with priors. If it turned out there were differences between the groups, then the model would start to get messier.***

Since clips such as the one above are shared by like-minded individuals, naturally there’s no strong critique. Rather, the assertions are “devastating” to the opposing view, which is almost entirely absent among like-minded individuals. Larry Summers may be a moderately liberal Democrat, but his airing of possible differences between males and females in the early aughts is now grounds for reading him out of polite company from what I can tell. A few years ago I had dinner with Chris Mooney and we discussed his contention that overall there is a greater skepticism of science among Republicans/Right than Democrats/Left. I can accede to this point as being possible. It seems unlikely that skepticism of science or religion or any other cultural trait would be equally distributed across the ideological spectrum, and in our day and age in the United States natural scientists tend to align with the political Left, and the political Right has a generalized distrust for intellectuals. But I pointed out to Chris that on the modern cultural Left acknowledgement of sex differences seems to still be in bad odor. A moderate amount of sexual dimorphism seems to be evident in the natural history of our own species, so it isn’t unreasonable to posit some differences. Yet many now consider it an implausible prior. Chris was skeptical, as he contended that this battle had ended long ago, and that a hardcore “blank slate” position had lost. I wish it were so. I had the experience of having an exchange with a prominent science writer with a background in science who would not concede that men, on average, have stronger upper bodies than women. When push comes to shove I doubt that this person would stand by such skepticism, but it illustrates how deep the reflex is if even basic size and strength differences are now subject to interrogation.

The normative roots of skepticism in this domain become clear when one focuses on the one area where Left and Right invert when it comes to the biological basis of human behavior: homosexuality. As a moderately heritable complex trait it seems entirely likely there is a biological basis for homosexuality, at least in part. But the case has not been clinched by a “gay gene,” nor is the trait one which develops in a genetically deterministic fashion like the generation of five fingers on one’s hand. For reasons common to many complex traits it seems unlikely that there will ever be found a singular “gay gene,” and evidence from fields such as psychology and neurobiology does not offer silver bullet models for how homosexuality comes about, because its expression has environmental correlates (for example, same-sex intercourse is practiced in a facultative manner in prison in the Arab world, without reflecting homosexual orientation, so some nuance in terminology is necessary). But the cultural Left, and now the majority of young Americans, can grasp that a complex behavioral trait does not necessarily lend itself to explanatory models as simple as Newtonian physics. The threshold of skepticism of “innate differences” seems curiously to be lower in this case for the Left, and tuned up higher on the social Right.

Motivated reasoning is powerful. This will not be answered by one blog post, or a decade’s worth of research. Because complex traits have genetic architectures which are not easily reducible to a few genes of large effect, “final answers” may be a while in coming (if ever). But the truth is what it is. Even if people in the United States “lack interest” in particular subjects, that is unlikely to stop other nations, whose economies and scientific institutions are still developing, from exploring avenues of research neglected by Americans. Obviously there are no perfectly objective humans, but one convenient fact about ideological bias is that different groups have different blind spots. The future will likely be one of scientific cooperation as a side effect of competition.

Finally, it is always useful for me to outline some of my thoughts by referring to a piece by one of the greatest population geneticists of the 20th century, James F. Crow. He writes in Unequal by nature: a geneticist’s perspective on human differences:

Two populations may have a large overlap and differ only slightly in their means. Still, the most outstanding individuals will tend to come from the population with the higher mean. The implication, I think, is clear: whenever an institution or society singles out individuals who are exceptional or outstanding in some way, racial differences will become more apparent. That fact may be uncomfortable, but there is no way around it.

The fact that racial differences exist does not, of course, explain their origin. The cause of the observed differences may be genetic. But it may also be environmental, the result of diet, or family structure, or schooling, or any number of other possible biological and social factors.

My conclusion, to repeat, is that whenever a society singles out individuals who are outstanding or unusual in any way, the statistical contrast between means and extremes comes to the fore. I think that recognizing this can eventually only help politicians and social policymakers.

The basic model is exceedingly simple. Representation in the tails of a distribution can be much more skewed than small differences in mean values might imply. Let’s give a concrete illustration. Imagine a population at the mean height for American males (70 inches, or 5’10). Assuming a standard deviation of 2 inches and a normal distribution, about 1 out of 740 males will be 76 inches or above (6’4 or greater).**** Now imagine a population where the average height for males is 71 inches. Obviously most of the distribution will overlap. But now 1 out of 161 males is 76 inches or above. For the two populations the overwhelming number of individuals are going to occupy the vast middle ground about the mean. But for particular professions great height might be indispensable, in which case the two populations may have greatly different representations in such fields.
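The tail fractions in this illustration can be checked with a few lines of Python. This is only a sketch under the same assumptions as the text (a normal distribution and a standard deviation of 2 inches); the function names `upper_tail` and `one_in` are made up for the example.

```python
import math


def upper_tail(threshold, mean, sd):
    """Fraction of a Normal(mean, sd) population at or above `threshold`."""
    z = (threshold - mean) / sd
    # Upper-tail probability of the standard normal, via the
    # complementary error function.
    return 0.5 * math.erfc(z / math.sqrt(2))


def one_in(threshold, mean, sd):
    """Express the tail fraction as '1 out of N'."""
    return 1.0 / upper_tail(threshold, mean, sd)


# Same cutoff (76 inches) and spread, means one inch apart.
for mean in (70, 71):
    print(f"mean {mean} in: about 1 out of {one_in(76, mean, 2):.0f}")

# How much more common the tail is in the second population.
print(f"ratio: {one_in(76, 70, 2) / one_in(76, 71, 2):.1f}x")
```

Moving the cutoff further out (say, 78 inches) makes the ratio between the two populations larger still, which is the qualitative point of the example.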

I’m thinking in the above case of American basketball. But it is key to remember that basketball requires more than great height. It requires grace and strength as well. In some domains, such as professional sports and the highest echelons of the academy, it seems likely that individuals will exhibit a combination of exceptional traits, not just one, in which case the above argument is further amplified.
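To see how requiring several exceptional traits at once amplifies the skew, here is a toy calculation. It rests on strong simplifying assumptions that are mine, not the post’s: the traits are statistically independent, each is normally distributed, each has the same cutoff (3 standard deviations above the first population’s mean), and the second population’s mean is half a standard deviation higher on every trait. Real traits are correlated, so treat this as an upper-bound style illustration.

```python
import math


def upper_tail(z):
    """Upper-tail probability of a standard normal at z."""
    return 0.5 * math.erfc(z / math.sqrt(2))


def representation_ratio(cutoff_z, shift_z, n_traits):
    """Ratio of the fractions clearing the cutoff on all n traits.

    cutoff_z: cutoff in SD units above the first population's mean
    shift_z:  how far (in SD units) the second population's mean sits
              above the first, assumed identical for every trait
    n_traits: number of independent traits that must all clear the cutoff
    """
    # Independence means the joint probability is a simple product.
    first = upper_tail(cutoff_z) ** n_traits
    second = upper_tail(cutoff_z - shift_z) ** n_traits
    return second / first


for n in (1, 2, 3):
    print(n, "trait(s):", round(representation_ratio(3.0, 0.5, n), 1))
```

Under independence the single-trait ratio is simply raised to the power of the number of traits, so even a modest skew on each trait compounds quickly.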

None of this is difficult to understand, even if you reject any empirical basis in specific cases. But 10 years of discussing this topic has taught me that this is irrelevant: when people are highly motivated they will refuse to engage in what Ernst Mayr termed “population thinking”. Rather, they will insist on referring to typologies rather than distributions, even if one asserts that one is discussing distributions. For one, this is a comfortable mode of analysis for humans. Categories are clear and distinct. Second, it makes for much easier refutation of plainly incorrect categorical assertions. But despite the futility, some things must be said now and again.

Addendum: There are some asking how one can disentangle environmental and genetic effects. That is a large part of what fields like behavior genetics, and now much of social science, attempt to do. But that being said I have outlined a very simple design enabled by modern genomics, leveraging the imperfect correlation between genomic ancestry and physical appearance.

* These need not be heritable or genetic. So I’m being vague with the terminology.

** A second implicit assumption is a normative understanding of how humans flourish and the set of choices which they should make to self-actualize.

*** It isn’t logically impossible to contend that there are differences between populations/sexes which make proportional representation unlikely, and that there are social impediments which might amplify or dampen skewed representation in particular fields. The former cases seem self-evident, but what would I put in the latter categories? Clearly throughout the 20th century the representation of Jews, and later Asians in the United States, in areas of higher education was dampened by quota systems. Similarly, segregation in sports resulted in an over-representation of non-Hispanic whites in many fields in the United States. Once equality of opportunity was allowed (or in cases where it has been) one saw not a decrease, but an increase, in representation at the elite levels away from population wide proportions.

**** In reality many quantitative traits exhibit “fat tails,” so there are more individuals at the extremes than one might expect. But that doesn’t alter the qualitative effect.

Steak, it’s what’s for publication

Citation: Decker, Jared E., et al. "Worldwide Patterns of Ancestry, Divergence, and Admixture in Domesticated Cattle." arXiv preprint arXiv:1309.5118 (2013).

Citation: Decker JE, McKay SD, Rolf MM, Kim J, Molina Alcalá A, et al. (2014) Worldwide Patterns of Ancestry, Divergence, and Admixture in Domesticated Cattle. PLoS Genet 10(3): e1004254. doi:10.1371/journal.pgen.1004254


I am a man of a particular age, old enough to remember when the idea of thousands of what were then quaintly termed ‘molecular markers’ would have left one aghast at the surfeit of data. Today the term “post-genomic” almost strikes me as being as anachronistic as the “information superhighway.” This is not the post-genomic era, it just is; the wildest dreams that were, are. But the glorious present of data abundance is not without its limitations and pitfalls. As a friend explained once, bioinformaticians just “do stuff,” sometimes without understanding why they do stuff. Somewhere along the way the bio part seems to have been forgotten in the hurry to assemble the next organism as the machine demands more and more for its hungry maw. But the mechanical monster slurping through the fire hose of data with a hacked together chimera of a regular expression isn’t without some purpose. Many biologists with an interest in evolution have a dream of dense markers painting vast swaths of the tree of life, an empire of phylogenetic information to be conquered.

But these vistas need some context, a horizon of information about the organism. This came to mind when I read Jared Decker’s new paper on the phylogenetics of domestic cattle, Worldwide Patterns of Ancestry, Divergence, and Admixture in Domesticated Cattle. In many ways it is a straightforward paper. You can see discussions on the earlier iterations over at Haldane’s Sieve (the preprint process seems to have worked to make it a more robust and clear publication from what I can tell!). Decker utilizes some straightforward methods (at least straightforward in 2014) on a very large SNP marker data set with expansive geographic coverage: in particular, TreeMix, Admixture, and PCA. With ~40,000 SNPs these packages should blast through the data rather quickly (I’ve used all of them with this marker density, and with sample sizes approximately as large as Decker’s).

You can read the whole paper yourself since it is open access. To me it seems to reiterate that cattle truly are cattle, to be pulled and prodded and traded at the whim of human beings. The fact that many East African cattle have predominantly Indian heritage (one of the two major clades) illustrates the fact that domestic animals exhibit the protean tendencies of human culture, rather than behaving like biological organisms governed by standard geographical and morphological diversification through conventional population genetic pressures. But I still have to admit that much of the narrative force of this paper escapes me because I lack understanding of the cattle at a level beyond the plainly statistical genetic. In other words, the organism matters. Cattle geneticists who may “hum through” the plots may still be able to grasp the force of argument with a greater clarity because their understanding of the topic is fundamentally thicker than that of outsiders. Many of the paper’s inferences from genetic data clearly draw their plausibility from elements of natural history which bovine biologists would take for granted.

And this is just the beginning. Over the next decade it seems inevitable that the clusters at the heart of “genomics cores” across the world will be gorging on whole sequences of thousands of individuals for many organisms. It will be a “flood the zone” era for attempting to understand the tree of life. An army of bioinformaticists will be thrown at the data in human waves, absorbing shock after shock, slowly transforming the ad hoc kludge pipelines of the pre-Model T era of genomics into simpler turnkey solutions. And then the biology will come back to the fore, and the deep wellspring of knowledge held by those who focus on specific organisms is going to be the essence of the enterprise once more.

It takes a village more than parents


The New York Times has a piece up, Raising a Moral Child, which does a run through on the various orthodoxies and heresies about child-rearing and inculcating values floating around the upper middle class Zeitgeist of modern America. There are plenty of references to research and the like. Literature reviews or not, it always helps to know some history, and realize that many ‘orthodox’ opinions seem to be a matter of fashion, not science (in fact, this is clear when you look at cross-cultural mores; France is today different from the USA when it comes to views on how to raise a good child). But there are things I think that are known, and should be reiterated. So, the author states that:

Genetic twin studies suggest that anywhere from a quarter to more than half of our propensity to be giving and caring is inherited. That leaves a lot of room for nurture, and the evidence on how parents raise kind and compassionate children flies in the face of what many of even the most well-intentioned parents do in praising good behavior, responding to bad behavior, and communicating their values.

Two insights from behavior genetics can shed light here. First, shared-environmental effects are often the smallest proportion of the variation in behavior. This is the part which is due to the family home and the parental influence. Second, the proportion of variance explained by shared-environment tends to go down as people get older. So parental influence tends to diminish.

Obviously part of the reason you behave as you do can be put down to genes. Or more precisely genetic dispositions which express themselves. And another portion can be chalked up to what your parents teach you. But a large proportion, in fact in many cases the largest proportion, is accounted for by factors which we don’t have a good grasp of. We don’t know, and term this “non-shared environment.”* In The Nurture Assumption Judith Rich Harris posited that much of non-shared environment was one’s peer group. This is still a speculative hypothesis, but I do think it is part of a broader set of models which emphasize culture and society, and how it shapes your mores and behaviors, as opposed to the nuclear family.
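For readers who want to see how the three slices (genes, shared environment, non-shared environment) are typically estimated from twin data, here is a minimal sketch using Falconer’s classic formulas. The twin correlations below are hypothetical numbers chosen for illustration, not values from the Times piece or any study, and the formulas themselves assume purely additive genetic effects and equal environments for identical and fraternal twins.

```python
def falconer(r_mz, r_dz):
    """Split trait variance into three components from twin correlations.

    r_mz: trait correlation between identical (monozygotic) twins
    r_dz: trait correlation between fraternal (dizygotic) twins
    """
    heritability = 2 * (r_mz - r_dz)   # genetic share
    shared_env = 2 * r_dz - r_mz       # family home, parental influence
    nonshared_env = 1 - r_mz           # everything else, including noise
    return {
        "heritability": heritability,
        "shared_env": shared_env,
        "nonshared_env": nonshared_env,
    }


# Hypothetical correlations for some behavioral trait.
for name, share in falconer(r_mz=0.50, r_dz=0.28).items():
    print(f"{name}: {share:.2f}")
```

With these made-up inputs the shared-environment slice comes out smallest and the non-shared slice largest, which is the pattern described above. Note that the non-shared term is a residual, so it also absorbs measurement error.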

The research cited in the piece shows how modeling by parents or people in positions of authority can affect short-term changes in the behavior of children. I am sure that these effects are real; what I am skeptical about is that these effects maintain themselves in a non-congenial social environment. To illustrate what I am getting at, imagine two children who are given up for adoption, and whose biological parents are alcoholics. Imagine that you know the biological parents are both carrying genes which are strongly correlated with alcoholism. Both these hypothetical children are adopted into conservative white upper middle class families, one in Orange County, California, and another in an affluent suburb of Salt Lake City. Both families are socially conservative, and do not tolerate drinking among their children. My prediction is that the child adopted into a Mormon culture which is far less tolerant of individual choice on the issue of alcohol consumption will have a lower risk of being an alcoholic, simply because the whole landscape of decisions is going to be altered throughout their whole life. An adopted child with a family history of alcoholism is still going to have a higher risk within their population, but the nature of the population is likely to shift the baseline odds.

Contemporary child-rearing advice and literature has a focus on the nuclear family because parents are the ones buying the books, reading the magazines, and attending the workshops. They want to believe that they have control over the outcomes of their offspring decades after they leave home. Reality is not congenial to this. Parents do have control, but it is far more a case of establishing frameworks through choice in nationality and cultural identification, and loading the dice with genetic dispositions.

* This might actually be genetic or more broadly biological; epigenetics, epistasis, and developmental stochasticity.

The edges of the genotype-phenotype map

Claes, Peter, et al. “Modeling 3D Facial Shape from DNA.” PLoS genetics 10.3 (2014): e1004224.

A few weeks ago a paper came out in PLoS GENETICS, Modeling 3D Facial Shape from DNA, which attempts to push the ball forward when it comes to mapping complex traits like facial morphology from genetic data. The long term goal here is clear: to forensically infer one’s phenotype purely from genetic information. For simple quasi-Mendelian traits like blue vs. brown eye color difference this is already possible with a high degree of certainty (or at least far higher than eyewitness reports) because much of the trait variation is controlled by one or a few genes. But for complex traits like height where variance is distributed across thousands of loci this is not feasible because the understanding of the genetic architecture is far more primitive and incomplete. The most explanatory height loci are on the order of ~1 percent of the variance of the trait in a population. In contrast blue vs. brown eye color variance within Europeans has a ~75 percent explanatory proportion at the HERC2-OCA2 locus.

The technical details of how they modeled facial morphology are rather interesting, but I’m not going to focus on that (they zeroed in on a mixed-race population to maximize phenotypic and genotypic variance). Rather, note that the authors can now relate phenotype and genotype very precisely on a population wide basis. In the past you would have to look at someone and assess their racial background or sex status. But today genetics can give you a very precise estimate of their racial ancestry, or an accurate prediction of their biological sex. The correlations are good, but observe that there are deviations from a strict association. These are perhaps just as interesting over the long term.

When it comes to the domain of sociology and culture there are many models of how your physical appearance impacts how others perceive you, and how you develop as a person. You can see in the chart that there are many people who have more African ancestry, but fewer African features, than other sets of people (Rashida Jones and Joakim Noah likely fall into this class of pairs; Rashida Jones’ father is 66% African in ancestry, while Noah’s is 50%, so she is likely to have more African ancestry, despite having more European facial features by most accounts). By looking at these deviations from expectation you can actually test the power of genetics vs. sociology. In the case of the Duffy antigen trait clearly genetics will be determinative. On the other hand there are all sorts of medical (e.g., hypertension) and behavioral (e.g., intelligence test scores) differences between the socially understood black and white American populations where looking at individuals who vary in genotype and phenotype might illuminate the weight of the variables.

And I’m not talking rocket science here. Within 10 years surely much research will have been done in this area simply by looking at numerous genetic data sets, and combining them with phenotypic information. If scientists don’t do it, I suspect marketing and credit rating firms will, because they already have huge piles of data on most Americans (i.e., phenotypes, or instrumental variables to infer phenotypes). And this does not apply to just race. Though the difference between male and female faces is striking, there is some overlap there as well. Questions could be asked about the outcomes of men and women conditional on their facial morphology. The existence of individuals with androgen insensitivity syndrome (AIS) would be another dimension to explore.