Sunday, January 14, 2007

Race: the current consensus   posted by p-ter @ 1/14/2007 04:58:00 PM

There has been surprisingly little outrage in the internets over Steve Hsu's argument that the concept of "race" has a biological basis. But still, it might be worth going over in a bit more detail the evidence supporting him, so that's what this post will aim to do; it will hopefully be worthwhile to have this all compiled in a single spot.

First, an important preliminary-- there are millions of places in the human genome where any two given people could possibly differ, whether by a single base change, the addition of an entire chunk of DNA, the inversion of a chunk of DNA, or whatever. Keep that in mind: millions and millions of places (for a database of many of the single base changes, see the HapMap). Now, the intuitive argument: after humans arose in Africa, they dispersed themselves throughout the world. By both chance and in response to selection due to their new environments, populations in different parts of the world ended up with different frequencies of those millions of DNA variants. Simple enough. Now, below the fold, I will present the evidence that 1. the patterns of genetic variation form clusters on a world-wide scale, 2. genetic clusters coincide with what is commonly called "race", and 3. genetic variation between clusters is relevant phenotypically.
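To make the "by chance" half of that intuitive argument concrete, here is a minimal sketch of genetic drift, assuming a textbook Wright-Fisher model. The function names, population size, number of loci, and number of generations are all hypothetical choices made for illustration, and selection is not modeled at all. Two populations start with identical variant frequencies at every locus and are then left to drift independently.

```python
import random


def drift(start_freq, copies, generations, rng):
    """Wright-Fisher drift at one locus.

    Each generation, `copies` allele copies are resampled at random from
    the previous generation's variant frequency.
    """
    freq = start_freq
    for _ in range(generations):
        carriers = sum(1 for _ in range(copies) if rng.random() < freq)
        freq = carriers / copies
    return freq


def simulate_split(n_loci=100, copies=100, generations=50, seed=1):
    """Let two isolated populations drift from the same starting point (0.5)."""
    rng = random.Random(seed)
    pop_a = [drift(0.5, copies, generations, rng) for _ in range(n_loci)]
    pop_b = [drift(0.5, copies, generations, rng) for _ in range(n_loci)]
    return pop_a, pop_b


def mean_abs_difference(freqs_a, freqs_b):
    """Average absolute difference in variant frequency across loci."""
    return sum(abs(a - b) for a, b in zip(freqs_a, freqs_b)) / len(freqs_a)


if __name__ == "__main__":
    pop_a, pop_b = simulate_split()
    print("mean |freq_A - freq_B|:", round(mean_abs_difference(pop_a, pop_b), 3))
```

Even with no selection whatsoever, the two populations end up with visibly different frequencies at most loci. Real populations are far larger than this toy one, so drift is slower, but the logic is the same.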

I. Genetic variation in humans forms clusters that correspond to geography

The fact that one can cluster humans together by geography based solely on their genetic information was most convincingly demonstrated in two papers (the second one is open access) by a group out of Stanford. These studies looked at several hundred variable places in the genome in 52 populations scattered across the globe. The hypothesis was as follows-- on applying a clustering algorithm to these data, individuals from similar geographic regions would end up together. I've put a representation on the right, where colors represent populations-- on top is a pattern of variation that would lead to no clustering (the colors all blend one into the next) while on the bottom is a pattern of variation that would lead to clustering (there are subtle but noticeable jumps from yellow to green, for example, though there is much variation within each color). Note that the lack of clustering would not mean that all populations are genetically the same (in the top figure, yellow and orange are not "the same" even though you couldn't find a fixed boundary between them). But indeed, the researchers found the situation corresponding to the bottom figure-- the individuals formed five clusters which represented, in the authors' words, "Africa, Eurasia (Europe, Middle East, and Central/South Asia), East Asia, Oceania, and the Americas". Some populations were exceptions, of course (there are always exceptions in biology)-- they seemed to be a mix between two clusters, or could even form their own cluster in certain models.

But in general, the second model in the figure is a good fit for human variation based on the spots in the genome used by these researchers-- continents correspond to clusters, and geographic barriers like the Himalayas or an ocean correspond to those areas where a "jump" from one cluster to the next occurs.
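It may not be obvious how clean clusters can emerge when each individual variant differs only modestly in frequency between regions. The sketch below is not the clustering algorithm used in the papers. It is a much simpler Bayesian assignment, written under the assumptions that the variant frequencies in two clusters are already known and that loci are independent. The function name and the 0.6 versus 0.4 frequencies are hypothetical. What it shows is how weak evidence at many loci adds up.

```python
import math


def posterior_cluster_a(alleles, freqs_a, freqs_b, prior_a=0.5):
    """Posterior probability that a set of alleles came from cluster A.

    alleles[i] is 1 if the variant is present at locus i, else 0
    (one allele copy per locus, for simplicity).
    freqs_a[i] and freqs_b[i] are the variant frequencies in clusters A and B.
    Loci are treated as independent, which is a simplifying assumption.
    """
    log_odds = math.log(prior_a / (1.0 - prior_a))
    for allele, fa, fb in zip(alleles, freqs_a, freqs_b):
        if allele == 1:
            log_odds += math.log(fa / fb)
        else:
            log_odds += math.log((1.0 - fa) / (1.0 - fb))
    return 1.0 / (1.0 + math.exp(-log_odds))


if __name__ == "__main__":
    # One locus: the variant is at 60% in A and 40% in B, and we observe it.
    print(posterior_cluster_a([1], [0.6], [0.4]))

    # 100 such loci: the individual carries the variant at 60 of them,
    # which is exactly what is expected on average for a member of A.
    alleles = [1] * 60 + [0] * 40
    print(posterior_cluster_a(alleles, [0.6] * 100, [0.4] * 100))
```

A single locus like this barely moves the needle (a posterior of 0.6), but a hundred of them push the posterior above 0.999. That is why looking at hundreds of variable sites at once can separate groups that no single site distinguishes.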

II. Clusters and race

The fact that humans cluster together based on genetic information could, in theory, be entirely orthogonal to the concept of race. However, at least in the United States (where this has been explicitly tested), this is not the case. The most important reason for this, in my mind, is that the ancestors of European-Americans and African-Americans were not randomly sampled from the globe (there's a bias towards points on the globe that are quite distant), and this non-random sampling accentuates the genetic differences between the two groups. But in any case, the reasons for this are irrelevant to the argument; let's look at the data.

The basis for this assertion comes from a paper (open access) by a different set of researchers at Stanford, who assembled a group of Americans who identified themselves as either African-American, white, East Asian, or Hispanic. They followed a similar protocol to the studies in the first section-- they took DNA from all individuals, looked at hundreds of different DNA variants, and applied a clustering algorithm. They then looked to see if their clusters corresponded to self-reported group. And indeed, in 3631 out of 3636 cases (99.86%), the individuals were clustered by the algorithm into the "correct" racial group.
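The concordance figure follows directly from the two counts quoted above. As a quick check (the function name is just an illustrative choice):

```python
def concordance(matching, total):
    """Fraction of individuals whose cluster matched their self-reported group."""
    return matching / total


if __name__ == "__main__":
    matching, total = 3631, 3636
    rate = concordance(matching, total)
    print(f"{100 * rate:.2f}% concordant, {total - matching} discordant")
```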

This result is obviously only valid in America, but presumably it could be repeated in other parts of the world (though there is some evidence that skin color and genetic ancestry are becoming independent in Brazil). But it is certainly the case that knowing someone's race will give you some probabilistic insight into their genetics[1].

III. Genotype and Phenotype

Once one accepts that genetic information clusters people together according to geography and that these clusters sometimes correspond to race, the next question is, do these genetic differences add up to phenotypic differences? The answer to this question is slowly emerging, and in the shadows I see the outline of a "YES".

All of the studies I will cite are based on the HapMap, a resource with genetic data as well as cell lines for individuals from four populations-- one of Western European ancestry, a Nigerian population, a Chinese population, and a Japanese population. Does the Nigerian population represent all populations in the African cluster, or the European population represent all the populations in the Eurasian cluster? Of course not, but analyzing them certainly gives an insight as to what makes one population different from any other.

First, the genetic data from the different populations can be analyzed to search for areas of the genome that have been under recent selection-- i.e. that have recently become beneficial for Nigerians, or Chinese, or whichever group. That analysis was done by two groups (both papers are open access), though I will discuss the second one. What they found was that each of the populations (they group the Chinese and Japanese together into a single population) has been under, and probably continues to be under, natural selection. It would be theoretically possible (if remarkable) to find that all humans are undergoing the same selective pressures and responding identically to them, but that is not the case. I've posted on the right a Venn diagram from the paper showing that most of the loci identified as under selection are detected in only one of the three groups, indicating that selection is causing people in different parts of the globe to become more distinct. The precise effects of the genetic variation between populations are unclear, but (as it's under selection) it's certainly phenotypically relevant. And lest you think the genes under selection are related only to "boring" physiological traits, note that one of the papers found that a number of genes involved in "neuronal function" have been under selection.

Even more recently, another group analyzed gene expression in both the Asian HapMap samples and the European HapMap samples and found that around 25% of the genes in the two were differentially expressed, and that this differential expression is due to genetic differences in many cases. The road from genotype to phenotype goes through gene expression, so this is a major step in connecting genetic variation to phenotypic variation.

So it's clear that populations differ genetically and that these differences are relevant phenotypically and informative about race. So, do genetic differences explain racial differences in any given phenotype? I hope that for phenotypes like eye color and skin color people accept the answer as obviously yes; these sorts of things have been convincingly demonstrated. For other phenotypes like IQ or personality, if you're inclined to react negatively, I say wait a few years before you get too confident; the study of human genetic variation is in its infancy, and once it hits adolescence it's going to start becoming a real pain in the ass.

[1] A note on race being a societal construct. To some extent, of course it is--some people that would be called "black" in the US might not be called "black" in France, for example (and not because of the language difference, for all you smartasses; the word "black" in French specifically refers to racial classification). I have enough faith in human intelligence to think that the first person who called race a societal construct did not mean that it had no biological component as well--note that the Wikipedia entry on adolescence refers to it as a "cultural and social phenomenon" but also "the transitional stage of human development in which a juvenile matures into an adult". People seem to be able to keep the cultural and biological aspects of adolescence in their heads at the same time, as I imagine the first sociologists to study race were able to do (I may, of course, be wrong), yet the fact that biological differences are interpreted through a cultural lens has somehow morphed into the idea that the biological differences don't exist to begin with (see, e.g. the ASA statement on race). Weird.