~33% of a Malagasy genotype, first pass

Last week I begged for a Malagasy genotype. I didn’t quite get that, but I got the next best thing: a part-Malagasy genotype. I decided to take it for a spin.

But first some preliminaries. Here’s what we know about this individual (or what this individual knows):

– 25% French (paternal grandfather)
– 18.75% West African? 6.25% French? (paternal grandmother, French Antilles)
– 19% Indian Muslim Bohras from Bombay + 6.25% Malagasy, Sakalava tribe, royal family of Mahajanga (maternal grandfather)
– 25% Malagasy (Sakalava; maternal grandmother, mtDNA haplogroup M23)
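
As a quick check on the "~33%" in the title, the reported fractions can be tallied by broad region. This is only arithmetic on the self-reported percentages above. The region labels and the names in the snippet are my own illustrative grouping, and the question-marked split for the paternal grandmother is taken at face value.

```python
# Self-reported ancestry fractions from the list above, in percent.
# The region labels are an illustrative grouping, not part of the report.
reported = [
    ("French (paternal grandfather)", "European", 25.0),
    ("West African? (paternal grandmother)", "African", 18.75),
    ("French? (paternal grandmother)", "European", 6.25),
    ("Bohra (maternal grandfather)", "South Asian", 19.0),
    ("Sakalava (maternal grandfather)", "Malagasy", 6.25),
    ("Sakalava (maternal grandmother)", "Malagasy", 25.0),
]


def totals_by_region(rows):
    """Sum the reported percentages within each region label."""
    totals = {}
    for _label, region, pct in rows:
        totals[region] = totals.get(region, 0.0) + pct
    return totals


totals = totals_by_region(reported)
for region, pct in sorted(totals.items()):
    print(f"{region}: {pct:.2f}%")
print(f"sum: {sum(totals.values()):.2f}%")
```

The Malagasy lines add up to 31.25%, a bit under a third. The grand total comes to 100.25% rather than 100%, presumably because the 19% is a rounded 18.75%.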

This is a very mixed individual in terms of ancestry. As for the Malagasy people, we know both a lot and a little about them. They’re more or less a hybrid population, with an Austronesian side that has a very close connection to the Dayaks of southern Borneo. I have hypothesized that these Austronesians were part of a circum-Indian Ocean trading network which was marginalized by the rise of Islam in the second half of the first millennium. Such an early date would explain why the Malagasy seem to have been only lightly touched by Indic cultural influences, let alone Islamic ones. There is also the African component to their ancestry, which is more prominent in the lowland populations in the west of Madagascar. The Sakalava are a somewhat more African group (as opposed to the Merina of the central highlands, who are more Austronesian).

Below are some results from ADMIXTURE and PCA generated with EIGENSOFT. Most of the PCA plots were not very useful because I did little fine-tuning of the populations ahead of time (this is a first pass), so I haven’t posted them. The ADMIXTURE runs shown are the ones that seemed most informative to me. There were three data sets into which I merged the part-Malagasy individual:

– #1, a Southeast Asian-focused data set, using mostly the Pan-Asian Consortium populations
– #2, an Asian-focused data set which used the HGDP
– #3, an African-focused data set which used the Henn et al. populations as well as some HGDP ones

#1 is plagued by a thin marker set. The Southeast Asian groups had ~56,000 markers, but the part-Malagasy individual shared only ~22,000 with them. Still, I made a go of it. I probably overcompensated in #2, where I used ~590,000 markers (the HGDP has a pretty good overlap with the 23andMe raw data). Finally, #3 had ~180,000 markers, which I feel is more than sufficient for this sort of exploratory endeavor.
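
The marker counts above are what is left after intersecting the individual’s genotyped SNPs with each reference panel. Here is a minimal sketch of that step, assuming markers are identified by rsID strings. The function name and the toy IDs are hypothetical, and a real merge (with PLINK, say) also has to reconcile strand and allele coding, which this ignores.

```python
def shared_markers(individual_ids, panel_ids):
    """Return the markers present in both sets and the fraction of the panel retained."""
    panel = set(panel_ids)
    shared = set(individual_ids) & panel
    fraction = len(shared) / len(panel) if panel else 0.0
    return shared, fraction


# Toy example with made-up rsIDs.
panel = ["rs1", "rs2", "rs3", "rs4", "rs5"]
individual = ["rs2", "rs4", "rs9"]
shared, fraction = shared_markers(individual, panel)
print(sorted(shared), fraction)

# The approximate counts quoted in the text for data set #1.
retained = 22_000 / 56_000
print(f"{retained:.0%} of the panel's markers retained")
```

With the quoted counts, data set #1 keeps only about 39% of an already small panel, which is why I call its marker set thin.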

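Data set #1 (Southeast Asian focus), ADMIXTURE, K = 13. "Malagasy" is the part-Malagasy individual.

Population K1 K2 K3 K4 K5 K6 K7 K8 K9 K10 K11 K12 K13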
Taiwan Aborigine 0% 0% 0% 2% 0% 0% 5% 23% 69% 0% 0% 0% 0%
Hmong 0% 0% 3% 78% 0% 0% 7% 4% 3% 2% 0% 0% 2%
Jinuo 0% 0% 73% 2% 0% 0% 9% 1% 1% 3% 1% 0% 9%
Wa 0% 0% 5% 4% 0% 0% 17% 2% 1% 7% 0% 0% 63%
Malagasy 31% 0% 2% 0% 6% 26% 0% 0% 0% 2% 31% 0% 3%
Alorese 0% 83% 0% 0% 1% 2% 0% 3% 2% 0% 0% 8% 0%
Javanese 0% 4% 1% 1% 0% 0% 1% 25% 8% 22% 1% 5% 32%
Lamaholot 0% 61% 0% 1% 1% 3% 1% 13% 7% 2% 0% 8% 2%
Mentawai 0% 0% 0% 0% 0% 0% 0% 98% 0% 1% 0% 0% 0%
West Javanese 0% 4% 1% 1% 0% 2% 0% 26% 9% 22% 1% 4% 31%
Toraja 0% 13% 0% 1% 1% 1% 3% 44% 21% 7% 0% 2% 5%
Indian 1% 1% 0% 0% 2% 59% 0% 0% 0% 1% 25% 1% 11%
Japanese 1% 1% 2% 3% 0% 1% 87% 1% 1% 1% 0% 0% 1%
Kensiu Negrito 0% 0% 0% 0% 0% 0% 1% 1% 1% 2% 0% 90% 4%
Utah, white 1% 0% 0% 0% 2% 20% 0% 0% 0% 0% 74% 0% 3%
Luhya 78% 0% 1% 0% 2% 18% 0% 0% 0% 0% 1% 0% 0%
Mamanwa 0% 1% 0% 0% 83% 0% 0% 9% 3% 0% 0% 2% 0%
West Mindanao 0% 6% 1% 4% 3% 3% 11% 36% 24% 5% 2% 1% 4%
West Luzon 0% 4% 1% 5% 2% 2% 16% 36% 24% 4% 1% 1% 4%
Karen 9% 1% 6% 6% 0% 0% 18% 2% 1% 8% 0% 1% 50%
Plang 1% 1% 8% 5% 0% 0% 12% 6% 2% 18% 1% 2% 43%
HTin 1% 0% 1% 1% 0% 0% 0% 2% 0% 84% 1% 0% 10%
Han Taiwan 0% 0% 6% 16% 0% 0% 45% 10% 13% 3% 0% 0% 7%

Data set #2 (Asian focus, HGDP), ADMIXTURE, K = 10. "Malagasy" is the part-Malagasy individual.

Population K1 K2 K3 K4 K5 K6 K7 K8 K9 K10
Mbuti Pygmies 0% 100% 0% 0% 0% 0% 0% 0% 0% 0%
Biaka Pygmies 0% 9% 6% 0% 0% 0% 0% 59% 0% 25%
French 0% 0% 100% 0% 0% 0% 0% 0% 0% 0%
Papuan 1% 0% 0% 99% 0% 0% 0% 0% 0% 0%
Cambodians 75% 0% 0% 2% 10% 5% 8% 0% 1% 0%
Japanese 0% 0% 0% 0% 0% 0% 98% 0% 1% 0%
Han 28% 0% 0% 0% 0% 1% 70% 0% 1% 0%
Mandenka 0% 0% 0% 0% 0% 0% 0% 0% 0% 100%
Yakut 0% 0% 3% 0% 0% 0% 1% 0% 95% 0%
San 0% 0% 0% 0% 0% 0% 0% 100% 0% 0%
Bantu S Africa 0% 4% 0% 0% 0% 0% 0% 23% 0% 73%
Tujia 34% 0% 0% 0% 0% 3% 63% 0% 0% 0%
Yizu 12% 1% 0% 1% 0% 9% 78% 0% 0% 0%
Miaozu 48% 0% 0% 0% 0% 1% 51% 0% 0% 0%
Hezhen 1% 0% 0% 0% 0% 1% 69% 0% 29% 0%
Xibo 4% 0% 2% 0% 1% 2% 73% 0% 18% 0%
Dai 88% 0% 0% 0% 0% 2% 10% 0% 0% 0%
Lahu 11% 0% 0% 0% 0% 82% 7% 0% 0% 0%
She 49% 0% 0% 0% 0% 0% 51% 0% 0% 0%
Naxi 4% 0% 0% 1% 0% 9% 85% 0% 1% 0%
Tu 8% 0% 4% 0% 3% 3% 74% 0% 7% 0%
Bantu Kenya 0% 5% 2% 0% 1% 0% 0% 7% 0% 84%
Malagasy 5% 1% 37% 0% 11% 3% 0% 4% 4% 36%
Indian 0% 0% 0% 0% 99% 0% 0% 0% 0% 0%
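
The K columns have no names of their own, so a simple way to read a row is to label each component by the population in which it peaks. The sketch below does that for this table. The rows are copied from the table, but only the ten peak populations are included, and the labeling rule is my own convenience rather than anything ADMIXTURE outputs.

```python
# Rows copied from the table above (percent per component, K1..K10).
# Only the population with the highest value in each column is needed to
# label that column, so this is a subset of the table.
reference = {
    "Dai":           [88, 0, 0, 0, 0, 2, 10, 0, 0, 0],
    "Mbuti Pygmies": [0, 100, 0, 0, 0, 0, 0, 0, 0, 0],
    "French":        [0, 0, 100, 0, 0, 0, 0, 0, 0, 0],
    "Papuan":        [1, 0, 0, 99, 0, 0, 0, 0, 0, 0],
    "Indian":        [0, 0, 0, 0, 99, 0, 0, 0, 0, 0],
    "Lahu":          [11, 0, 0, 0, 0, 82, 7, 0, 0, 0],
    "Japanese":      [0, 0, 0, 0, 0, 0, 98, 0, 1, 0],
    "San":           [0, 0, 0, 0, 0, 0, 0, 100, 0, 0],
    "Yakut":         [0, 0, 3, 0, 0, 0, 1, 0, 95, 0],
    "Mandenka":      [0, 0, 0, 0, 0, 0, 0, 0, 0, 100],
}
malagasy = [5, 1, 37, 0, 11, 3, 0, 4, 4, 36]


def modal_labels(rows):
    """For each component, return the population with the highest share of it."""
    n_components = len(next(iter(rows.values())))
    return [max(rows, key=lambda pop: rows[pop][k]) for k in range(n_components)]


labels = modal_labels(reference)
# Pair each component's label with the individual's share, largest first.
profile = sorted(zip(labels, malagasy), key=lambda pair: -pair[1])
for label, pct in profile:
    if pct > 0:
        print(f"{label}-modal component: {pct}%")
```

Read this way, the individual’s two largest components in this data set are the French-modal one (37%) and the Mandenka-modal one (36%), followed by the Indian-modal one (11%).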

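Data set #3 (African focus, Henn et al. plus HGDP), ADMIXTURE, K = 10. "Malagasy" is the part-Malagasy individual.

Population K1 K2 K3 K4 K5 K6 K7 K8 K9 K10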
Hadza 16% 0% 76% 1% 5% 1% 1% 1% 0% 0%
Yemen Jews 0% 0% 0% 84% 0% 0% 0% 0% 0% 15%
Ethiopian 19% 0% 4% 52% 25% 0% 1% 0% 0% 0%
Sandawe 6% 0% 1% 2% 90% 1% 0% 1% 0% 0%
Biaka Pygmies 0% 0% 0% 0% 0% 100% 0% 0% 0% 0%
Mbuti Pygmies 0% 0% 0% 0% 0% 0% 100% 0% 0% 0%
French 0% 0% 0% 20% 0% 0% 0% 0% 1% 80%
Cambodians 0% 100% 0% 0% 0% 0% 0% 0% 0% 0%
Mandenka 98% 0% 0% 0% 0% 1% 0% 0% 0% 0%
Yoruba 96% 0% 0% 0% 0% 4% 0% 0% 0% 0%
Bantu S Africa 72% 0% 0% 0% 0% 9% 1% 17% 0% 0%
Bantu Kenya 77% 0% 1% 2% 12% 5% 2% 0% 0% 0%
Malagasy 32% 13% 0% 3% 5% 4% 1% 0% 7% 36%
Luhya 77% 0% 2% 1% 12% 5% 3% 0% 0% 0%
Indian 0% 1% 0% 0% 0% 0% 0% 0% 99% 0%
San 8% 2% 0% 1% 0% 1% 0% 79% 3% 6%

I don’t really trust the proportions for the Pan-Asian-focused data set (#1), given its thin marker set, but I figured I should report them. I have no idea why the Malagasy individual shows so much Yakut in data set #2. Could it be an artifact of the hybridization? As for the rest, the African ancestry of this individual doesn’t seem too atypical for an East African Bantu.
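
One rough way to check that last impression is to rescale only the African components of the African-focused table so they sum to 1, then compare the individual’s profile with the Bantu-speaking groups. This is a sketch under my own assumptions: I treat the components that peak in Mandenka, Hadza, Sandawe, Biaka, Mbuti and San (K1, K3, K5, K6, K7, K8) as "African", and I use a plain sum of absolute differences as the distance. The inputs are the rounded percentages from the table, so small differences mean little.

```python
# Rows copied from the African-focused table (percent per component, K1..K10).
rows = {
    "Malagasy":       [32, 13, 0, 3, 5, 4, 1, 0, 7, 36],
    "Luhya":          [77, 0, 2, 1, 12, 5, 3, 0, 0, 0],
    "Bantu Kenya":    [77, 0, 1, 2, 12, 5, 2, 0, 0, 0],
    "Bantu S Africa": [72, 0, 0, 0, 0, 9, 1, 17, 0, 0],
}

# Assumption: 0-based indices of the components treated as African
# (K1 Mandenka, K3 Hadza, K5 Sandawe, K6 Biaka, K7 Mbuti, K8 San).
AFRICAN = [0, 2, 4, 5, 6, 7]


def african_profile(row):
    """Rescale the African components of a row so that they sum to 1."""
    african = [row[i] for i in AFRICAN]
    total = sum(african)
    return [value / total for value in african]


def l1_distance(a, b):
    """Sum of absolute differences between two equal-length profiles."""
    return sum(abs(x - y) for x, y in zip(a, b))


target = african_profile(rows["Malagasy"])
distances = {
    name: l1_distance(target, african_profile(rows[name]))
    for name in ("Luhya", "Bantu Kenya", "Bantu S Africa")
}
for name, d in distances.items():
    print(f"{name}: {d:.3f}")
```

On these rounded values the individual’s African share (42% in total) sits much closer to the Luhya and Kenyan Bantu profiles than to the South African Bantu one. The gap comes mostly from the San-modal and Sandawe-modal components.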
