Why does the genetic map of Europe still work?


In the comments below Susan C asks an interesting question:

I’m still surprised that this works as well as it does, given that there were mass movements of people during the nineteenth and twentieth century.

For Europe prior to 1815, I’d expect it to work. Genealogical records show that people were very often born in the same village that their parents were, or the next village along. I would guess the rate of diffusion to be a few km per generation.

After the Napoleonic Wars, though, it goes nuts. Changing methods of agriculture (e.g. enclosure of land) meant that many rural agricultural labourers were put out of work, and had to move to the major industrial cities. This migration could easily be in the range of 100km in one generation, or even transcontinental – people emigrating to North America or Australia.

Moving forward to the Second World War, many people from central Europe fled the Nazis and came to settle in Britain.

So if you take a British person today, and ask them where their grandmother was born, likely answers range from Aberystwyth to Krakow, even if they answer “white” to an ethnicity question. (Of course there’s plenty of evidence of immigration from e.g. India or the Caribbean, too)

An interesting point. Some levels of immigration and movement have always been part of European history. Think about the outflow of Huguenots after the revocation of the Edict of Nantes. The trade and migration between the Low Countries and the eastern shore of Britain. The immigration of Spaniards, Poles and Italians to France in the 19th century. The relocation of Saxons to Romania, Russia, etc.

Some thoughts:

1) Many of the immigrants, like the Huguenots, settled disproportionately in cities and towns (the Volga Germans are an obvious exception). French in Berlin, British Puritans in Amsterdam, Jewish industrial workers in East London, Asian sailors in Cardiff. And cities until recently were powerful relative population sinks. So modern European cities might have been affected culturally by past immigration (e.g., in shifting accents and dialects), but they are far less reshaped genetically than you would expect.

2) Many of the immigrants were from nearby regions. Spanish and Italian immigration to France was far higher than Polish. So the effect would be more to subtly shift the positions and centers of gravity, as opposed to rearranging the expected spatial relationships.

3) Aside from France, there wasn’t much migration as a proportion of the population. The ancestors from Aberystwyth and Krakow are very salient because of their exoticism. This is just subject to the same dynamics as the disappearing English phenomenon.

4) They sampled from only a few locations within each nation, so the clumping is exaggerated, and combined with #3, the migration effect wasn’t strong enough to change your impression. Perhaps they also generally don’t sample ethnic minorities in these studies; e.g., avoiding Hungarians and Saxons in Romania.

5) Some migrations, like the expulsion of Germans from Eastern Europe after World War II, rolled back the obscuring effects of earlier movements.

I was thinking about following the notes and whatnot to see where the samples came from, but I’ll leave it to enterprising readers. I’m sure that can answer some of these questions.



  1. What amazes me is that the components seem to align with perpendicular spatial axes.

  2. What amazes me is that the components seem to align with perpendicular spatial axes 
    this is what’s expected under an isolation-by-distance model, see here: 

  3. The current population of Germany is 82 million; on the order of 300,000 immigrate annually now, although this was less in pre-unification days and does involve many ethnic Germans. Still, let’s assume the proportion of migrants is 0.004 annually. We’ll also assume that population expansion was uniform and did not favor either immigrants or native Germans, and that immigrants mixed immediately into the native population. 
    If this situation had been stable since Napoleonic times (200 years), then the present population would have 45 percent pre-Napoleonic German ancestry. I imagine that’s a pretty gross overestimate of migration, and it’s not nearly enough to eliminate population structure.
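The 45 percent figure follows from compounding: if a share m of the population is replaced by migrants each year, the fraction of ancestry tracing back to the starting population after t years is (1 - m)^t. A minimal sketch of that arithmetic, using the comment's own simplifying assumptions (constant rate, immediate mixing, uniform population growth); the function name is made up for illustration:

```python
def native_ancestry_fraction(annual_migrant_share: float, years: int) -> float:
    """Fraction of ancestry tracing to the starting population after `years`,
    assuming a constant share of the population is replaced by migrants
    every year and migrants mix in immediately (the comment's assumptions)."""
    return (1.0 - annual_migrant_share) ** years


if __name__ == "__main__":
    # 300,000 migrants a year into 82 million people, as quoted in the comment;
    # the comment rounds this share up to 0.004.
    unrounded_share = 300_000 / 82_000_000
    for share in (0.004, unrounded_share):
        remaining = native_ancestry_fraction(share, 200)
        print(f"annual share {share:.4f}: {remaining:.0%} pre-Napoleonic ancestry")
```

With the rounded share of 0.004 over 200 years this gives roughly 0.45, matching the figure in the comment.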

  4. “many people from central Europe fled the Nazis and came to settle in Britain.” 
    Not ‘many’.

  5. I’d also thought about the Huguenots. If you look at 19th century parish registers and census data for Wales, you see a fair number of French-derived names (possibly, though not definitively, indicative of Huguenot ancestry), even in rural areas. 
    But it may be that the proportion of people involved in these migrations is small enough that it doesn’t disrupt the genetic map too much. Birth, marriage and death records for the UK are reasonably complete going back to 1837, which conveniently takes us almost back to the point where I’m suggesting mass migration became more significant (around 1815). So you could use BMD records to estimate how much migration there’s been since. 
    It’s a good point that urban areas probably attract more immigrants than rural ones. So sampling rural areas only – and excluding the major cities – might give you a better picture of what the genetic map used to look like.

  6. I’m interested in the outliers on these maps. Are they people who identify as “white natives” but who in fact have ancestry they don’t know about, perhaps Jewish or Roma?

  7. Many genetic studies of this kind deliberately exclude people with known non-local ancestry. Certainly in the UK they would routinely exclude non-Europeans.

  8. …The paper includes the following statement: 
    ‘In addition to identifying related pairs, the IBS [identical by state] analysis can also detect individuals who are ‘less’ related to the rest of the population than would be expected if the samples were homogenous [sic]. This is because of the individuals in question having either a different ethnic background or a problem in the quality of their genotypes. Such individuals were also excluded from further analyses.’ 
    I’m not sure exactly what this means, but if it means that individuals who were markedly different from others in their locality were excluded, the results cannot be taken as representative of the entire population.

  9. As both John Hawks and Greg Cochran say above, the number of immigrants is small relative to the population of the place they’re immigrating into. So that’s part of the answer. 
    I was also thinking about the cumulative effect of multiple generations. 
    To oversimplify the model a lot, if vector v(n) gives the proportions of an allele in 2 different countries after n generations, and 
    v(n) = A**n . v(0), where A = ((1-p, p), (p, 1-p)) 
    then for large n you’ll get significant diffusion, even if p is small. Assuming about 25 years per generation, there are around 8 generations since the Napoleonic era. (So n isn’t that big, either). 
    You could use nineteenth century UK census data to estimate the diffusion rate: look for children still living with their parents (and hence at the same address when enumerated for the census) and see how many were born in the same parish/county/country as their parents. You could do this for an urban centre like Liverpool versus some rural area.
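To get a feel for the two-country model in the comment above, here is a rough sketch that applies the matrix A = ((1-p, p), (p, 1-p)) generation by generation. The value p = 0.01 and the starting frequencies are purely illustrative assumptions, not estimates from any data, and the function names are invented for this example.

```python
def step(freqs, p):
    """Apply A = ((1-p, p), (p, 1-p)) once: each country keeps a share 1-p
    of its own allele frequency and takes a share p from the other."""
    a, b = freqs
    return ((1 - p) * a + p * b, p * a + (1 - p) * b)


def allele_freqs_after(v0, p, generations):
    """Allele frequencies in the two countries after `generations` steps."""
    freqs = tuple(v0)
    for _ in range(generations):
        freqs = step(freqs, p)
    return freqs


def remaining_gap(p, generations):
    """Share of the initial between-country difference that survives.
    One step multiplies the difference a - b by (1 - 2p)."""
    return (1 - 2 * p) ** generations


if __name__ == "__main__":
    p = 0.01           # hypothetical per-generation exchange rate
    v0 = (1.0, 0.0)    # hypothetical extreme case: allele fixed in one country only
    for n in (8, 50, 200):
        a, b = allele_freqs_after(v0, p, n)
        print(f"n={n}: freqs=({a:.3f}, {b:.3f}), gap left={remaining_gap(p, n):.3f}")
```

The difference between the two countries shrinks by a factor of (1 - 2p) each generation, so with a small p and only about 8 generations most of the original difference is still there, while a large n does wash it out.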

  10. Even if the number of immigrants is large (which on the whole it isn’t) it has to be concentrated as well in order to change the picture. Massive amounts of overall immigration that aren’t from A->B just end up being noise, and the method being applied is designed to extract the signal and discard the noise…

  11. this is what’s expected under an isolation-by-distance model, see here: 
    http://www.gnxp.com/blog/2008/04…ords-part- n.php
    Thanks, I get it! Relevant pictures are PC1 and PC2 from here.

  12. gcochran, it depends what you mean by many. According to http://www.sovereignty.org.uk/features/articles/immig.htm, referencing Kathleen Paul, Whitewashing Britain: Race and Citizenship in the Postwar Era, about 55,000 Jews arrived in Britain from 1933 to 1939. Also, a large number of Poles arrived (not sure how many of these overlap the Jewish figure); according to the same site, the Polish population of Britain increased from 44,000 in 1931 to 162,000 in 1951. This seems to qualify as ‘many’ to me.

  13. This seems to qualify as ‘many’ to me. 
    it’s not in the context of the british population. see john hawks’ comment. we can’t have a rational discussion without an acknowledgement of the preeminent importance of proportions.

  14. The Jewish immigration of that time period amounts to roughly one part in 1000 of the total British population.

  15. It seems pointless to discuss some of these issues without knowing who is excluded from the sample by the approved methodology. It seems clear that people of wholly non-European ancestry are excluded. What about people of mixed race? What about people of mixed national ancestry? E.g. French-Polish, English-French, or German-Turkish. What about Jewish or Roma? To put it in more concrete terms, would (e.g.) President Sarkozy, Charlotte Gainsbourg, or Helena Bonham-Carter be excluded from the sample?

  16. Some of Bryan Sykes’s studies are specifically done in rural areas and concentrate on individuals with four grandparents from the region.