The non-European admixture in Afrikaners


About seven years ago I got my hands on data from some white South African individuals in the Family Tree DNA database. It was immediately apparent that a subset of them had clear and consistent non-European ancestry. More precisely, this ancestry was a mix of African and Asian. In contrast, some of the other white South Africans were Ashkenazi Jews, and others seemed to be English with no African ancestry.

Last year the paper Patterns of African and Asian admixture in the Afrikaner population of South Africa examined the issue thoroughly. The authors genotyped 77 Afrikaners on high-density chips and found ubiquitous ancestry from non-European population groups:

1) Likely Khoi ancestry related to the ǂKhomani
2) Sub-Saharan African ancestry, but closer to West Africans and even East Africans than to neighboring Bantu
3) Indian ancestry
4) East Asian ancestry

The East Asian ancestry is almost certainly from what is today Indonesia. The ubiquity of Indian slaves in the 17th century should make #3 unsurprising. And the Khoi people were indigenous to the Cape when the Dutch arrived.

The second component is harder to parse, but it seems that the arrival of a few external slave ships was critical. The non-European admixture into the Afrikaners dates to the earliest period of settlement, not to later centuries when there was much more extensive contact with Bantu-speaking populations. By then the whites were endogamous, or the mixed offspring were being assimilated into the Coloured community rather than into the whites.

Curiously, the European ancestry of the Afrikaners was not subject to a strong bottleneck. Perhaps this is due to the heterogeneity of its source? Dutch, French Huguenots, and Germans all played a role.*

* Not an apples-to-apples comparison, but it is clear that self-identified non-Hispanic whites of European heritage have a much lower proportion of non-European ancestry than Afrikaners.

Selection swimming against the genomic tide

One of the major issues that confuses people is that the distribution of a trait or gene is often only weakly correlated with overall phylogeny and the rest of the genome.

To give a strange but classic example, the MHC loci are subject to strong balancing selection. This means that novel alleles do not substitute and replace ancestral alleles. Substitution of this sort results in “lineage sorting,” so that when you look at chimpanzees and humans you can see many loci where all humans carry one variant and all chimpanzees the other. In contrast, at the MHC loci there is frequency-dependent selection for rare variants, so the normal cycling process does not occur. Humans and chimpanzees overlap quite a bit on MHC, and any given human may have a more similar profile to a given chimpanzee than to another human.
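To make the contrast concrete, here is a minimal sketch of a Wright-Fisher style simulation with and without a rare-allele advantage. It is a toy haploid model, not anything specific to MHC: the function name `wright_fisher`, the population size, the number of generations, and the selection strength `s` are all assumptions chosen for illustration.

```python
import random

def wright_fisher(copies=100, generations=2000, s=0.0, seed=0):
    """Return the final frequency of allele A in a toy haploid population.

    s = 0 is pure drift; s > 0 gives whichever allele is rarer a fitness
    edge (negative frequency-dependent selection). All values are made up.
    """
    rng = random.Random(seed)
    p = 0.5  # start with both alleles equally common
    for _ in range(generations):
        if p in (0.0, 1.0):
            break  # one allele has been lost, nothing left to sort
        w_a = 1.0 - s * p          # fitness of A falls as A becomes common
        w_b = 1.0 - s * (1.0 - p)  # fitness of B falls as B becomes common
        expected = p * w_a / (p * w_a + (1.0 - p) * w_b)
        # Binomial sampling of the next generation's allele copies.
        count = sum(rng.random() < expected for _ in range(copies))
        p = count / copies
    return p

neutral = wright_fisher(s=0.0)
balanced = wright_fisher(s=0.9)
print("neutral final frequency:", neutral)
print("balanced final frequency:", balanced)
```

Under drift alone one allele replaces the other within a few hundred generations at this population size, which is the substitution process that sorts lineages between species. With a rare-allele advantage both alleles keep segregating, which is why old variants at a locus like MHC can be shared across species.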

There are 19,000 human genes. Of the genome's 3 billion base pairs, only ~100 million are polymorphic on a worldwide scale (using some liberal definitions). There are lots of unique stories to tell here.

A new preprint, Inferring adaptive gene-flow in recent African history, illustrates how certain genes with functional significance may differ from the genome-wide background. The authors find that among the Fula (Fulani) people of West Africa there has been introgression of a Eurasian mutation that confers lactase persistence. The area of the genome around this gene is much more Eurasian than the rest of the genome. In contrast, the area around the Duffy allele is much less Eurasian. Variation at this locus is related to malaria resistance. Finally, in other African populations, they found gene flow of MHC variants.
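The basic logic of "this region looks different from the rest of the genome" can be sketched in a few lines. This is not the preprint's haplotype-based method. It is a simple robust outlier scan (median and median absolute deviation) over per-window ancestry proportions, and every number, window label, and the `threshold` value below is made up for illustration.

```python
from statistics import median

def ancestry_outliers(local_ancestry, threshold=3.5):
    """Flag windows whose ancestry proportion deviates from the background.

    local_ancestry maps a window label to a hypothetical Eurasian ancestry
    proportion. Returns {label: robust z-score} for the flagged windows.
    """
    values = list(local_ancestry.values())
    center = median(values)
    mad = median(abs(v - center) for v in values)
    if mad == 0:
        return {}  # no spread in the background, nothing to scale against
    flagged = {}
    for label, value in local_ancestry.items():
        # 0.6745 rescales the MAD so the score is comparable to a z-score.
        z = 0.6745 * (value - center) / mad
        if abs(z) >= threshold:
            flagged[label] = z
    return flagged

# Toy data: 40 background windows near 20% Eurasian ancestry, one window
# with an excess (lactase-like) and one with a deficit (Duffy-like).
windows = {f"window_{i}": 0.19 + 0.02 * (i % 2) for i in range(40)}
windows["lactase_region"] = 0.60
windows["duffy_region"] = 0.02

for label, z in ancestry_outliers(windows).items():
    print(label, round(z, 1))
```

The median and MAD are used instead of the mean and standard deviation so that the outlier windows do not inflate the very background they are being compared against.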

None of this is entirely surprising, though the authors apply novel haplotype-based methods which should have wider utility.

Genetic variation and disease in Africa


Very readable review, Gene Discovery for Complex Traits: Lessons from Africa. It’s open access, so I recommend it. The summary:

The genetics of African populations reveals an otherwise “missing layer” of human variation that arose between 100,000 and 5 million years ago. Both the vast number of these ancient variants and the selective pressures they survived yield insights into genes responsible for complex traits in all populations.

The main issue I might have is I’m not sure that focusing on 5 million year time spans is particularly useful. Rather, looking at the last major bottleneck for modern humans before the “Out of Africa” event would be key, since that’s when a lot of the common variation would disappear, and very rare variants probably don’t have deep time depth in any case. With all that being said, the qualitative analysis is on point.

One of the major issues in the “SNP-chip” era has been that ascertainment of variation has been skewed toward Europeans. More recent techniques have tried to fix this, but this review points out that if you by necessity constrain the SNPs of interest to those that vary outside of Africa (where most of the world’s population lives), you are taking many alleles private to Africa off the table. This is relevant because the “Out of Africa” bottleneck ~50,000 years ago means that African populations harbor a lot more genetic variation than non-African populations do.
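A toy simulation shows why a marker set chosen in a bottlenecked population misses variation in the source population. This is a sketch under stated assumptions, not a realistic demographic model: the allele-frequency distribution, the number of founder allele copies, and the function name `bottleneck_survey` are all invented for illustration.

```python
import random

def bottleneck_survey(n_variants=5000, founder_copies=50, seed=1):
    """Count source-population variants still polymorphic after a founder event.

    A "chip" built only from variants that vary among the founders cannot
    see the variants that the founders lost or fixed.
    """
    rng = random.Random(seed)
    on_chip = 0
    for _ in range(n_variants):
        # Made-up source frequency, skewed toward rare alleles, always in (0, 1).
        freq = 0.001 + 0.998 * rng.random() ** 4
        # Draw the founders' allele copies at random from the source population.
        copies = sum(rng.random() < freq for _ in range(founder_copies))
        if 0 < copies < founder_copies:
            on_chip += 1  # still varies among the founders
    return {
        "source_variants": n_variants,
        "on_chip": on_chip,
        "missed": n_variants - on_chip,
    }

print(bottleneck_survey())
```

Every simulated variant is polymorphic in the source population, but only a subset survives the founder sampling as a polymorphism. The rest are invisible to any marker set ascertained in the founders' descendants, and in this toy model they are mostly the rare alleles.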

The move to high-quality whole genome sequencing obviates these concerns. As a matter of course African variation will be “picked up” since the marker set is not constrained ahead of time.

Importantly, the authors focus on South Africa and the Xhosa population. This group has ~20% Khoisan genetic ancestry, which is very diverse and very distinct from the remaining ~80% of its ancestry. With its large African immigrant population and highly diverse native groups, some of them quite admixed, South Africa could actually provide some hard-to-substitute value in biomedical genetics.