On the heels of the previous paper describing the “genetic map of Europe” comes a new paper that makes the same general observation: genetic data contain information about geography. These authors also develop a model that does reasonably well at predicting an individual’s country of origin from genetics alone.
It’s worth considering why this is possible. A previous paper by some of these same authors showed that, under a simple isolation-by-distance model, the first two principal components of genetic data vary along perpendicular directions in geographic space. So it appears that this basic model is a decent approximation to Europe; further work will likely pin down the ways in which the model fails to fit the data, and those departures are likely to be interesting.
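To make the isolation-by-distance argument concrete, here is a minimal sketch, not the authors' analysis. It assumes individuals sit on a small rectangular grid and that genetic covariance between two individuals decays exponentially with the distance between them; the grid size, the `corr_length` parameter, and the function names are all illustrative choices of mine. It then asks which direction on the map each of the first two principal components changes along.

```python
import numpy as np

def pcs_under_isolation_by_distance(nx=10, ny=8, corr_length=5.0):
    """Top two PCs of an assumed distance-decay covariance on an nx-by-ny grid."""
    xs, ys = np.meshgrid(np.arange(nx), np.arange(ny), indexing="ij")
    coords = np.column_stack([xs.ravel(), ys.ravel()]).astype(float)

    # Pairwise geographic distances between all sampled locations.
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))

    # Assumption: covariance between individuals decays exponentially with distance.
    cov = np.exp(-dist / corr_length)

    # PCA works on centered data, so double-center the covariance matrix.
    n = len(coords)
    centering = np.eye(n) - np.ones((n, n)) / n
    cov_centered = centering @ cov @ centering

    # eigh returns eigenvalues in ascending order; keep the two largest.
    eigvals, eigvecs = np.linalg.eigh(cov_centered)
    order = np.argsort(eigvals)[::-1]
    return coords, eigvecs[:, order[:2]]

def gradient_direction(coords, pc):
    """Unit vector giving the map direction along which a PC changes (least squares)."""
    centered = coords - coords.mean(axis=0)
    design = np.column_stack([centered, np.ones(len(coords))])
    slope, *_ = np.linalg.lstsq(design, pc, rcond=None)
    g = slope[:2]
    return g / np.linalg.norm(g)

coords, pcs = pcs_under_isolation_by_distance()
g1 = gradient_direction(coords, pcs[:, 0])
g2 = gradient_direction(coords, pcs[:, 1])

# A value near 0 means the two PCs run along perpendicular map directions.
print("cosine of angle between PC1 and PC2 gradients:", abs(float(g1 @ g2)))
```

The grid is deliberately not square: on a square the first two components share an eigenvalue, so their orientation is arbitrary, although still perpendicular. On this toy grid PC1 should follow the long axis and PC2 the short one.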
The method the authors develop for predicting an individual’s country of origin from genetics is only a beginning for this kind of application of genetic data. They note that the SNP chip used in the study only includes common variation, while rare variants are likely to be much more geographically restricted (and thus more informative in this kind of analysis). The limits to the resolution of these sorts of methods are likely to be very fine indeed; the authors note that, even with this panel, they’re able to distinguish with some confidence individuals from the German-, Italian-, and French-speaking parts of Switzerland. With full resequencing data, it’s likely that even the precise village of origin of an individual will be predictable from genetics alone.
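For a feel of what such a prediction could look like in code, here is a toy sketch on simulated data. It is not the authors' method: I am swapping in plain linear regression from principal-component scores to map coordinates, followed by assignment to the nearest country centroid. The three "countries", their centroids, the `mixing` matrix, and the noise levels are all made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical country centroids in arbitrary map units (not real data).
centroid_xy = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])

# Assumption: PC scores are a rotated copy of map position plus a little noise.
mixing = np.array([[0.8, -0.6], [0.6, 0.8]])

def simulate(n_per_country):
    """Return (pc_scores, true_locations, true_country_index) for simulated people."""
    country = np.repeat(np.arange(len(centroid_xy)), n_per_country)
    locations = centroid_xy[country] + rng.normal(0.0, 1.0, size=(len(country), 2))
    pcs = locations @ mixing.T + rng.normal(0.0, 0.1, size=locations.shape)
    return pcs, locations, country

def with_intercept(pcs):
    """Append a column of ones so the regression can fit an offset."""
    return np.column_stack([pcs, np.ones(len(pcs))])

# Fit map position as a linear function of PC scores on a training sample.
train_pcs, train_locations, _ = simulate(50)
coef, *_ = np.linalg.lstsq(with_intercept(train_pcs), train_locations, rcond=None)

# Predict positions for new individuals, then pick the closest centroid.
test_pcs, _, test_country = simulate(20)
predicted_xy = with_intercept(test_pcs) @ coef
dist_to_centroids = np.linalg.norm(
    predicted_xy[:, None, :] - centroid_xy[None, :, :], axis=-1
)
predicted_country = dist_to_centroids.argmin(axis=1)

accuracy = float((predicted_country == test_country).mean())
print("fraction assigned to the correct country:", accuracy)
```

The toy countries here are far apart relative to the noise, so assignment is easy. The interesting question raised above is how far that separation can shrink, for example with rare variants, before this kind of assignment breaks down.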