Local ancestry deconvolution made simpler (?)

I’ve been waiting for a local ancestry deconvolution method to come out of Simon Myers’ group for a few years. Well, I think we’re there: Fine-scale Inference of Ancestry Segments without Prior Knowledge of Admixing Groups. Here’s the abstract:

We present an algorithm for inferring ancestry segments and characterizing admixture events, which involve an arbitrary number of genetically differentiated groups coming together. This allows inference of the demographic history of the species, properties of admixing groups, identification of signatures of natural selection, and may aid disease gene mapping. The algorithm employs nested hidden Markov models to obtain local ancestry estimation along the genome for each admixed individual. In a range of simulations, the accuracy of these estimates equals or exceeds leading existing methods that return local ancestry. Moreover, and unlike these approaches, we do not require any prior knowledge of the relationship between sub-groups of donor reference haplotypes and the unseen mixing ancestral populations. Instead, our approach infers these in terms of conditional “copying probabilities”. In application to the Human Genome Diversity Panel we corroborate many previously inferred admixture events (e.g. an ancient admixture event in the Kalash). We further identify novel events such as complex 4-way admixture in San-Khomani individuals, and show that Eastern European populations possess 1-5% ancestry from a group resembling modern-day central Asians. We also identify evidence of recent natural selection favouring sub-Saharan ancestry at the HLA region, across North African individuals. We make available an R and C++ software library, which we term MOSAIC (which stands for MOSAIC Organises Segments of Ancestry In Chromosomes).
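
For readers who have not met HMM-based local ancestry before, the general idea can be sketched in a few lines of Python. To be clear, this is not MOSAIC's nested model: it is a toy single-layer HMM that assumes the allele frequencies of each source population are already known, which is exactly the prior knowledge the preprint says it does not need. The function name, the parameters, and the numbers in the example are all made up for illustration.

```python
def local_ancestry_posteriors(alleles, freqs, switch_prob=0.05, prior=None):
    """Posterior P(ancestry k at site t | all alleles) by forward-backward.

    alleles: list of 0/1 alleles along one haplotype.
    freqs[k][t]: assumed frequency of allele 1 in ancestry k at site t,
                 strictly between 0 and 1 (a toy assumption).
    switch_prob: assumed per-site probability of an ancestry switch.
    """
    n_anc, n_sites = len(freqs), len(alleles)
    if prior is None:
        prior = [1.0 / n_anc] * n_anc

    def emit(k, t):
        p = freqs[k][t]
        return p if alleles[t] == 1 else 1.0 - p

    def trans(j, k):
        # Stay in the same ancestry, or switch uniformly to another one.
        return 1.0 - switch_prob if j == k else switch_prob / (n_anc - 1)

    # Forward pass, rescaled at each site to avoid underflow.
    fwd = []
    for t in range(n_sites):
        row = []
        for k in range(n_anc):
            if t == 0:
                a = prior[k]
            else:
                a = sum(fwd[t - 1][j] * trans(j, k) for j in range(n_anc))
            row.append(a * emit(k, t))
        total = sum(row)
        fwd.append([x / total for x in row])

    # Backward pass, also rescaled.
    bwd = [[1.0] * n_anc for _ in range(n_sites)]
    for t in range(n_sites - 2, -1, -1):
        row = [
            sum(trans(k, j) * emit(j, t + 1) * bwd[t + 1][j] for j in range(n_anc))
            for k in range(n_anc)
        ]
        total = sum(row)
        bwd[t] = [x / total for x in row]

    # Combine and normalise to get per-site posteriors.
    posteriors = []
    for t in range(n_sites):
        row = [fwd[t][k] * bwd[t][k] for k in range(n_anc)]
        total = sum(row)
        posteriors.append([x / total for x in row])
    return posteriors


# Toy haplotype: the first half looks like ancestry 0, the second like ancestry 1.
alleles = [1, 1, 1, 1, 0, 0, 0, 0]
freqs = [[0.9] * 8, [0.1] * 8]  # made-up reference frequencies
posteriors = local_ancestry_posteriors(alleles, freqs)
calls = [max(range(len(freqs)), key=lambda k: row[k]) for row in posteriors]
print(calls)
```

The per-site calls trace out ancestry segments along the haplotype. Going by the abstract, MOSAIC instead learns the link between donor haplotypes and the unseen source groups as “copying probabilities”, rather than taking frequencies as given.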

The truth is I’ve only done a quick skim of the preprint and not run the method myself to see how it works. But to be honest, I can’t see where the part about Eastern Europeans is in the manuscript (I checked the supporting text). That being said, if you run a PCA, many Northern and most Eastern Europeans are clearly shifted toward East Asians compared to Southern Europeans. So I accept it.

In any case, always remember, all models are wrong. But some of them have insight.

11 thoughts on “Local ancestry deconvolution made simpler (?)”

  1. As economist Deirdre McCloskey says, “All models are metaphors, and all metaphors are lies.”

  2. In the main text, the Eastern European / North-Central Asia link arises where they deconvolute a 2-way admixture in the Chuvash. One of the components has the strongest similarity to Oroqen / Yakut but is also ubiquitous in Eastern Europe and Central Asia (with Russians being the strongest hit there, followed by Finns, Uygurs and Uzbeks) (Fig. 4).

  3. They probably mean the results in table 4, where they test more Central and Eastern Europeans. Volgaic populations like the Chuvash are much more eastern, and their admixture is well enough known that it shouldn’t be the novel event the abstract mentions.

  4. The Yakut-like component in Eastern Europe seems to have arrived with the Turkic-Mongolic invasions, according to Yunusbayev 2015 (their ADMIXTURE painted the respective Chuvash component as similar to Ket / Nenets / Nganasan, but IBD sharing was the strongest with Buryat, Yakut, Evenki, and Mongols, even though these populations had only a minor Ket-like component in ADMIXTURE). The new paper didn’t use Ket / Nenets / Nganasan, but added a number of Tungus-Manchurian groups which all resembled the Chuvash’s “Asian component”.
    Yunusbayev dated the admixture event in the Chuvash to approx. IX c. AD, consistent with the idea that they were a remnant of Volga Bulghars, displaced from their Pontic-Caspian Steppe lands into Middle Volga Basin by the Khazars in VII-IXth centuries, and still retaining their unusual Oghur-subfamily Turkic language once spoken by the Bulghars.

  5. Busby 2015 agrees re: Chuvash. With Mordovians the mixture event is a few centuries older, Huns perhaps?

  6. Supplement also contains 3-way deconvolution of Chuvash into components with lowest Fst from respectively East-Central European 0.61, West Asian 0.15 (Turkish, Jordanian, Iranian, Armenian), and East Siberian 0.24 (Oroqen, Yakut, Daur).

    Unexpectedly, for the West Asian component, the top donor population looks to be Finnish though (assuming I’m reading correctly).

    Fst is reasonably substantial for the deconvoluted components from closest population; Chuvash East-Central European 0.004 from Russian, Chuvash West Asian 0.009 from Turkish, Chuvash NE Asian 0.03 from Oroqen. European scale Fst is something like 0.004 to 0.005 for Irish-Polish / 0.03 for Japanese-Yakut/Japanese-Dusun so fairly differentiated from closest proxies (albeit isolated streams of ancestry should exhibit higher Fst).

    Moroccan / N African deconvolution is interesting in light of Iberomaurusian ancient DNA, which suggests an ancient N African contribution independent of West African+EEF streams.

  7. That West Asian component in Chuvash may actually be a mix of European and West Asian ancestral streams. Fst-wise closest to Turkish but Eastern Europeans and even Norwegians and Germans are higher on its donor list than Turks. Going by that donor list, Adygei and Georgians are the best proxies for the West Asian part.

  8. To complement the Fst analysis, it would have been interesting to run other f-statistic analyses. For instance, does the West Asian component in the 3-way deconvolution with Chuvash form a clade with any present day population? That would also likely tell you if there is any special connection with East Asian ancestry in that stream…

  9. Based on the copying chart I’d hazard a guess that this “West Asian” in Chuvash isn’t like any present day population (looks too much like a N-Euro/West Asian mix) but combines Caucasus & steppe Iranian groups. East Asian donor levels look similar to those in the East Central European component.

  10. Yeah, looking closer at the copying chart I think that’s a good guess; I’ve probably emphasized the relative peak in donation for the Chuvash West Asian from Finns, and although it does peak there, the distribution of donation is broad and shallow, which supports your contention that no present day population is a good match, and that the general distribution of donation peaks around the western steppes+Caucasus. The Siberian component also isn’t much changed in proportion (0.24) or sources from the two-way deconvolution for Chuvash, which supports what you say about East Asian donor levels.

  11. As to the “Siberian” component of the Chuvash, it may be worth recalling that its history may have key implications for the ethnogenesis of East Slavs and Ashkenazi Jews. The Chuvash are often used as a proxy for the extinct Khazars, although the relation isn’t 100% assured. It’s assumed that by the late 600s AD, the Khazars had displaced their kin, the Bulgars, from the Pontic Steppe, partly towards today’s Bulgaria, but partly to the Volga-Kama Basin. Centuries later, the Volga Bulgars used two very distinct Turkic languages, one of which had similarities to the extant Chuvash, a linguistic outlier in the Turkic family. The Khazars and the Bulgars were likely heterogeneous groups, and the ancestors of the Chuvash among the Volga Bulgars may or may not have been a part of the Pontic Steppe Bulgaria, but there is no better genetic and linguistic proxy for the Khazars yet.

    Both this preprint and Busby 2015 observe that the “Siberian/Mongol” component of the Chuvash (20% or so) is also found in the Russians (<5%), but is largely missing from Ukrainians and Belorussians. One may argue that the Russians acquired it during their North-Easterly expansion to the Finno-Ugric lands, and that the assimilated Finnic population may have been the conduit for the Siberian gene flow?

    Behar 2013 also notices the same component in the Eastern Ashkenazi (2.2%) but not in the Western Ashkenazim. The Eastern Ashkenazi Jews may be ~20% Slav-admixed, so their “Siberian-Mongol” ancestry is too high to be explained by a Slav-mediated flow: even at the <5% level seen in Russians, a 20% Slav contribution would carry under 1% (not to mention that the Siberian admixture in the Slavs may have occurred more Easterly than where the Jews lived). Perhaps it is a hint of a small, and direct, Khazar admixture? Although an extreme bottleneck event in the Eastern Ashkenazi may make it difficult to gauge their minor ancestry components…
