European population substructure…again


The discussion continues about the relationship of various West Eurasian and North African groups (i.e., Europeans, North Africans and Near Easterners). Several papers published within the last few years shed some light on these questions. We’ve blogged them before, and I don’t think that they radically alter what you might find in History and Geography of Human Genes, but I thought I’d point to them again, with a special focus on figures of note.

European Population Substructure: Clustering of Northern and Southern Populations. Figure 4 B:

Analysis and Application of European Genetic Substructure Using 300 K SNP Information, Figure 1 B & D:

Discerning the Ancestry of European Americans in Genetic Association Studies, Figure 3 A:

Analysis of genetic variation in Ashkenazi Jews by high density SNP genotyping, Figure 9:

Since these papers are all Open Access there’s really no excuse not to read them (at least the “Discussion” sections). I hope people won’t go around looking for charts to “prove” whatever pet hypothesis they want to promote; the population-level classifications we generate often have only an approximate relationship to the multi-dimensional shape of human genetic variation at the finer-grained level. Note that some of these principal component charts really don’t have that many individuals typed, and you may wonder about the representativeness of the samples of their putative national populations. Though these are important points, I do think we need to be cautious about our expectations regarding the sort of information we’re going to extract on the margins as the N increases and the individuals typed come from every region of a nation. I suspect we’ll get more oddities like the Etruscans as isolated or peculiar populations are included in these samples, and the exceptions to the broad patterns tell us a lot about the details of human history. But I doubt we’ll overturn the general shape of the relationships and clinal gradients we see here.

Addendum: I somewhat played down the future surprises that these sorts of fine grained analyses might have for us…but I do want to note that the studies will continue. That’s because they aren’t done for the purposes of elucidating human genetic history as such, rather, the primary rationale is to highlight substructure which might be relevant when attempting to ascertain disease relevant alleles. In the medical context then there may be significant returns on the investment here which I don’t want to underestimate. If, for example, a particular drug’s efficacy within the African American population in the United States is directly proportional to the makeup of one’s ancestry then identifying ancestry-informative markers is very useful.

Update: Measuring European Population Stratification with Microarray Genotype Data, Figure 1 A:

15 Comments

  1. These triple-eigenvector 3-D graphs are enormously easier for me to read than the one that Steve Sailer posted earlier on his blogsite.  
     
    Graphical display technology leaps forward!

  2. Does it? 
     
    Graph 1 puts Ashkenazi firmly within a SE European cluster. 
     
    Graph 2 puts Ashkenazi firmly outside of any individual sub-group, but lines them up perfectly with either NW or SE Europeans depending on which PC you consider. 
     
    Graph 3 completely excludes Ashkenazi from European populations, including SE Europeans (though again they seem to show more similarity to either group, depending on which PC you look at). 
     
    Graph 4 tells us that a few NW Europeans have converted to Judaism before or after migrating to Utah (the grandparents of the AJ outlier in the CEU cluster), but not many (the outlier “outlies” a lot). 
     
    Seriously, the plot thickens.

  3. I’m starting to realize that the number of samples from each population is extremely important to how the PC1/PC2 graphs end up. In most of the graphs above, Ashkenazis (being a very numerous sample) appear as separated (probably defining one of the two PCs themselves), while in the first one and in Bauchet’s (posted in the other gnxp blog), they don’t stand out because their numbers are small in comparison with the total sample (as these two studies are focused on European structure).  
     
    This can also apply to other elements: if you study 82 Spaniards and almost 2,000 US people of European ancestry (like Seldin does), most of whom have no Spanish ancestry whatsoever but likely a lot of North European ancestry, the PC and cluster results will not pick up any peculiarity that Spaniards may have.  
     
    In this regard, I think that Bauchet’s paper is more representative of real European clustering, even if it also has important blanks (France, Eastern Europe, West Asia).

  4.  
    Graph 2 puts Ashkenazi firmly outside of any individual sub-group, but lines them up perfectly with either NW or SE Europeans depending on which PC you consider.
     
     
    see “D” of graph 2. i think that jibes well with most of the other results. 
     
    Graph 3 completely excludes Ashkenazi from European populations, including SE Europeans (though again they seem to show more similarity to either group, depending on which PC you look at). 
     
    but the distance between SE euros & ashk is smaller on the first PC. larger on the second.

  5. specifically, graph 2, PC1 = 42% variation, PC2 = 8% variation.

  6. Graph 4 tells us that a few NW Europeans have converted to Judaism before or after migrating to Utah (the grandparents of the AJ outlier in the CEU cluster), but not many (the outlier “outlies” a lot). 
     
    well, 
     
    Participants completed a self-administered questionnaire about their medical history, date of birth, date of last mammogram, race, religious affiliation, as well as country of birth and religious affiliation of grandparents. To be eligible for enrollment in this study, individuals must have indicated that all four grandparents were Jewish and of Eastern European ancestry. 
     
    is it totally implausible that an older woman would have been adopted and never been told by her jewish parents? i don’t know. (these were associated with, or screened for, so the parents could have been deceased since they’re likely to be older)

  7. also, here’s my catchall interp of these PC’s: ashkenazi jews are originally a mid eastern pop. ergo, on the first PC they’re on the “other side” of southeast europeans from north/northwest euros. but they have an input of northern european genes, so subsequent PC’s might show them as closer to these groups….

  8. Thanks. 
     
    It would be useful to look at some graphs that include Ashkenazis, Germans, Italians, Irish, Arabs and then some real outliers like sub-Saharan Africans, Australian Aborigines, Japanese, and American Indians. This would give some useful perspective. 
     
    When it comes to thinking about who your relatives are, it’s all relative.

  9. The Bauchet paper has one such figure showing Europeans, Ashkenazis, generic Middle Easterners, and representative Sub-Saharan African populations. The Middle Eastern and North African individuals were outliers from the Southeast Europe/Ashkenazi cluster. However, only 2 or 3 Middle Eastern / N. African individuals were tested compared to several more Armenians and Ashkenazis. It would obviously be nice to have more samples of all the populations in question.

  10. i added that figure. unfortunately, no information on what/who the middle eastern and north african individuals were (though STRUCTURE shows pretty obviously that one of the mid easterners has substantial sub-saharan african ancestry).

  11. BTW, I just wanted to clarify that I was also a bit puzzled by the seeming conflicts between the various genetic estimates in the different studies. Presumably, as more individuals, more markers, and more populations are added, this will gradually be resolved. 
     
    But I was just saying that those new 3-D/3-PC display graphs are really nice, and much easier to decipher than the previous ones I’d seen. Sometimes software leads science rather than the other way round.

  12. is it totally implausible that an older woman would have been adopted and never been told by her jewish parents? 
     
    Sure, in terms of genealogy, adoption and conversion amount pretty much to the same thing. 
     
    ashkenazi jews are originally a mid eastern pop. ergo, on the first PC they’re on the “other side” of southeast europeans from north/northwest euros. but they have an input of northern european genes, so subsequent PC’s might show them as closer to these groups…. 
     
     
    That’s what I thought, but I can’t see why Armenians follow a similar pattern (though closer to NE Europe) on the graph in your previous post. Have the ever-scheming Finns invaded Anatolia en masse at some point?

  13. I don’t know why these graphs continue to show genetic data from “italians” without further much more precise geographic ID. A country almost made from scratch by a king from Savoy. On a southern european scale it’s like saying the “Russians” or the “indians”. 
    Also the term southern italian is vague at best.

  14. I don’t know why these graphs continue to show genetic data from “italians” without further much more precise geographic ID. 
     
    Maybe because the Italians do end up clustering, just as expected? I’ll grant you that the Italian cluster looks just a wee bit more spread out than some of the others, but it still seems silly to me to complain about overbroad classification when the clustering is staring you right in the face. If anyone would have cause to complain about the lack of precision it would seem to be the Ashkenazim, who do end up with some sprawling outliers…

  15. You are right, Dart. Depending on who you talk to, Southern Italy could be defined as anything south of Rome, or anything south of Florence (by a “Padanian”, perhaps).  
    Genetically, Italy is definitely not as tight a cluster as some of the North Euro nations like Ireland, Poland or Sweden, and to cite an example, it is likely that Sicilians or Calabrians would be closer to Greeks than to Piedmontese, and the latter would be closer to French or Swiss than to Sicilians. 
    I agree that there is a loose cluster, and those in the center of the country are likely to fall closer to the center of the cluster than the examples I have cited. On the other hand, there are probably outliers even there, which reflect the region’s history (e.g. the Middle Eastern ancestry of people in the Tuscan hill town of Murlo, which is probably a legacy of heavy Etruscan settlement there).
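
The point raised in comment 3, that the number of individuals sampled from each group can decide which contrast a principal component picks up, can be illustrated with a toy simulation. What follows is only a rough sketch under made-up assumptions: the three populations "A", "B" and "C", their allele-frequency shifts, the SNP count and the function names are all hypothetical and are not taken from any of the papers linked above. Groups A and B differ modestly from each other, while C is more distant from both; the only thing that changes between the two runs is how many C individuals are included.

```python
import numpy as np


def simulate_genotypes(freqs, n, rng):
    """Draw n diploid genotypes (0/1/2 allele counts) for each SNP."""
    return rng.binomial(2, freqs, size=(n, freqs.size)).astype(float)


def pc1_scores(genotypes):
    """Scores of each individual on the first principal component."""
    centered = genotypes - genotypes.mean(axis=0)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    return u[:, 0] * s[0]


def pc1_alignment(n_a, n_b, n_c, n_snps=500, seed=0):
    """Return (|corr(PC1, C-vs-rest)|, |corr(PC1, A-vs-B)|).

    All frequencies here are invented for illustration only.
    """
    rng = np.random.default_rng(seed)
    base = rng.uniform(0.3, 0.7, n_snps)        # shared ancestral frequencies
    shift_ab = rng.normal(0.0, 0.06, n_snps)    # modest A/B divergence
    shift_c = rng.normal(0.0, 0.15, n_snps)     # larger divergence of C
    freqs = {
        "A": np.clip(base + shift_ab, 0.02, 0.98),
        "B": np.clip(base - shift_ab, 0.02, 0.98),
        "C": np.clip(base + shift_c, 0.02, 0.98),
    }
    sizes = {"A": n_a, "B": n_b, "C": n_c}
    genotypes = np.vstack(
        [simulate_genotypes(freqs[k], sizes[k], rng) for k in "ABC"]
    )
    labels = np.repeat(list("ABC"), [n_a, n_b, n_c])

    scores = pc1_scores(genotypes)
    c_vs_rest = (labels == "C").astype(float)
    a_vs_b = (labels == "A").astype(float) - (labels == "B").astype(float)
    # The sign of a principal component is arbitrary, so compare magnitudes.
    return (
        abs(np.corrcoef(scores, c_vs_rest)[0, 1]),
        abs(np.corrcoef(scores, a_vs_b)[0, 1]),
    )


if __name__ == "__main__":
    for n_c in (100, 3):
        c_corr, ab_corr = pc1_alignment(100, 100, n_c)
        print(f"n_C={n_c:>3}: |corr(PC1, C vs rest)|={c_corr:.2f}, "
              f"|corr(PC1, A vs B)|={ab_corr:.2f}")
```

In this sketch, when C is sampled as heavily as A and B, PC1 should mostly track C versus everyone else; with only a handful of C individuals, PC1 should instead track the A/B contrast and C no longer stands out. The underlying frequencies are identical in both runs, which is the caution being made in the thread: the picture depends on the sampling scheme as well as on the populations.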
