My top posts?

The fact that platforms tend to disappear has started to make me wonder: what posts should I figure out a way to preserve for posterity?

Do any long-time readers have opinions? I’m not necessarily talking about popular posts; those I can find through Google Analytics.

(note that I have my full archives on this site, so if you want to search for something that you vaguely remember, that should be doable)

Facebook AMA with Spencer Wells & myself

Spencer and I are doing a Facebook live AMA at 2 PM EDT/1 PM CDT/12 PM MDT/11 AM PDT on the 1st of February (tomorrow as of when I’m writing this). People will ask us questions, and the questions will be relayed to us, and we’ll answer live on video.

In other ‘media’ news, our podcast with Joe Pickrell of Gencove, Ancestry Deconvoluted, is now live.

Finally, there has been some talk about me doing a Reddit AMA. Thoughts?

The “Finns” are probably an Iron Age intrusion into the East Baltic

One of the first things I wrote at length about historical population genetics, in late 2002, happened to be a rumination on the Y chromosomal phylogeography of Finnic peoples. At the time there was debate as to the provenance of the N1c Y chromosomal haplotype (this is the haplotype of the Rurikids by the way). Just as R1b is ubiquitous in Western Europe, and R1a in Eastern Europe (and to some extent in Indo-Iranian lands), N1c has an extensive distribution in the northern zone of Eurasia.

The question at the time was whether N1c was from Europe and in particular the Finnic peoples, or, whether it was from Siberia.

NJ Tree of Pairwise Fst

Today we have many of the questions resolved. At this point, we know that the Finns, Sami, and Estonians all exhibit evidence of gene flow from a Siberian-like population. Though this is very much a minority component, even among the Sami, it shows up clearly in any genome-wide analysis because it is genetically very different from the Northern European background.

Ancient DNA has also established the likelihood that this Siberian-like element is relatively new to the Baltic region. In a recent paper, The genetic prehistory of the Baltic Sea region:

We suggest that the Siberian and East Asian related ancestry in Estonia, and Y-haplogroup N in north-eastern Europe, where it is widespread today, arrived there after the Bronze Age, ca. 500 calBCE, as we detect neither in our Bronze Age samples from Lithuania and Latvia.

This is not the only ancient DNA paper that shows this. Of course, sampling is imperfect, and perhaps they’ve missed pockets of ancient Finnic peoples. But the most thorough analysis of Mesolithic hunter-gatherers in Scandinavia does not pick them up either, Population genomics of Mesolithic Scandinavia: Investigating early postglacial migration routes and high-latitude adaptation. Populations, such as the Comb Ceramic Culture, which have been identified as possible ancestors of the modern Finnic culture and ethnicity, lack the distinctive Siberian-like component.

At SMBE 2017, I saw a poster with results from samples taken in Finland proper, and the distinctive ancestry of Siberian-like peoples was present in an individual who lived after 500 AD. This means that in all likelihood the circumpolar Siberian population which introduced this new element into the East Baltic arrived in the period between 500 BC and 500 AD.

Someone with more knowledge of paleoclimatology and archaeology needs to comment at this point. Something happened in this period, and it probably left a big ethno-linguistic impact. But I don’t know enough detail to say much (the Wikipedia entries are out of date or don’t illuminate).

I will add that when I run Treemix, Finns get the Siberian gene flow you’d expect. But the Lithuanians get something from the Finns. Since the Lithuanians have appreciable levels of N1c, that is not entirely surprising to me (the basal flow from the Yakut/European region to Belorussians may be more CHG/ANE).

Additionally, I will note that on an f3 test Lithuanians have nearly as high a z-score (absolute) as Swedes (i.e., Finn; Swede/Lithuanian, Yakut), indicating that the predominant Northern European ancestry isn’t necessarily Scandinavian, as much as something between Lithuanian-like and Swedish-like (on Admixture tests the Finns do seem to have less EEF than Swedes, and Lithuanians probably the least of all among non-Finn peoples).
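
For readers who have not run one, here is a minimal sketch of what an f3 test of the form (target; source 1, source 2) computes. It is not the pipeline used for the numbers above. It assumes you already have per-SNP allele frequencies for the three populations, it uses the naive estimator without the finite-sample bias correction that production software such as ADMIXTOOLS applies, and the function names and toy frequencies are hypothetical. A clearly negative value, with a large absolute z-score, is the signal that the target looks like a mixture of populations related to the two sources.

```python
from math import sqrt
from statistics import mean


def f3(target, src1, src2):
    """Naive f3(target; src1, src2): mean over SNPs of (c - a) * (c - b)."""
    return mean((c - a) * (c - b) for c, a, b in zip(target, src1, src2))


def f3_zscore(target, src1, src2, n_blocks=5):
    """z-score from a delete-one-block jackknife over contiguous SNP blocks."""
    n = len(target)
    size = n // n_blocks  # any leftover SNPs at the end are never dropped
    estimates = []
    for i in range(n_blocks):
        lo, hi = i * size, (i + 1) * size
        keep = [j for j in range(n) if not (lo <= j < hi)]
        estimates.append(f3([target[j] for j in keep],
                            [src1[j] for j in keep],
                            [src2[j] for j in keep]))
    m = mean(estimates)
    var = (n_blocks - 1) / n_blocks * sum((e - m) ** 2 for e in estimates)
    return f3(target, src1, src2) / sqrt(var)


# Hypothetical toy frequencies: the target is built as an 80/20 mix of the
# two sources, so f3 should come out negative.
src1 = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.5]
src2 = [0.9, 0.7, 0.6, 0.2, 0.5, 0.1, 0.3, 0.4, 0.2, 0.8]
target = [0.8 * a + 0.2 * b for a, b in zip(src1, src2)]

print("f3 =", f3(target, src1, src2))
print("z  =", f3_zscore(target, src1, src2))
```

With real data the blocks would be contiguous chunks of the genome rather than runs of five or ten SNPs, and the comparison in the text amounts to running this twice, once with Swedes and once with Lithuanians as the first source.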

Addendum: I should note here that the genetics is getting clearer, but I have no great insight into the ethno-linguistic aspect. Perhaps the Siberian-like people did not introduce Finnic languages into the Baltic. Perhaps that was someone else. But I doubt it. That being said, though the Siberian-like component adds great distinctiveness to the Finns, it is important to add that by and large Finns are actually generic (if highly drifted) Northern Europeans.


Variation with the 1000 Genomes data set in China

I have mentioned before that the 1000 Genomes Chinese are heterogeneous. Many of the ones sampled in Beijing are North Chinese. But there is structure within the South Chinese samples as well. The PCA above shows it. I’ve pruned some of the data for clarity (it’s probably a cline really, with cut-offs and breaks happening because of variation in population density).

Nothing surprising in the Fst matrix. The two South Chinese groups are close to each other, while the North Chinese are shifted toward the Koreans, who are shifted toward the Japanese.
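
If you want to build a pairwise Fst matrix like this yourself, the sketch below shows one common way to do it, the Hudson estimator in its ratio-of-averages form (as described by Bhatia et al. 2013). I am not claiming this is the exact estimator behind the matrix above. It assumes per-SNP allele frequencies and the number of sampled chromosomes for each population; the population names, frequencies, and sample sizes are made up for illustration.

```python
from itertools import combinations


def hudson_fst(p1, p2, n1, n2):
    """Hudson Fst between two populations (ratio of averages across SNPs).

    p1, p2: per-SNP allele frequencies; n1, n2: sampled chromosomes.
    """
    num = 0.0
    den = 0.0
    for a, b in zip(p1, p2):
        num += (a - b) ** 2 - a * (1 - a) / (n1 - 1) - b * (1 - b) / (n2 - 1)
        den += a * (1 - b) + b * (1 - a)
    return num / den


def pairwise_fst(freqs, sizes):
    """Return {(pop_x, pop_y): Fst} for every unordered pair of populations."""
    return {(x, y): hudson_fst(freqs[x], freqs[y], sizes[x], sizes[y])
            for x, y in combinations(sorted(freqs), 2)}


# Hypothetical toy data: pop_a and pop_b are close, pop_c is further away.
freqs = {
    "pop_a": [0.10, 0.50, 0.90, 0.30],
    "pop_b": [0.15, 0.45, 0.85, 0.35],
    "pop_c": [0.40, 0.20, 0.60, 0.70],
}
sizes = {"pop_a": 200, "pop_b": 200, "pop_c": 200}

for pair, value in pairwise_fst(freqs, sizes).items():
    print(pair, round(value, 4))
```

A matrix like this is also the input you would hand to a neighbor-joining routine to get a tree of pairwise Fst.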

Admixture analysis shows that the two South Chinese groups can be modeled as a mix of North Chinese and the Dai people of southern China, who are ancestral to the Tai people of Southeast Asia. The “South China 2” cluster is somewhat more Dai than the “South China” cluster proper.

The Miao/Hmong samples from the HGDP are very similar to the South China cluster in admixture analysis (and less Dai than the South China 2 cluster). This is not surprising, as the Miao/Hmong are relatively recent migrants into Southeast Asia from China.

What does Treemix say? Basically, the two South Chinese clusters seem to differ mainly in their Dai proportions (as admixture would imply). They could be on the same cline, and the perception of structure might be an artifact.

Species are what you want them to be

What is a species? I don’t know. And honestly, I don’t really care too much.

Species is just a semantic label I place on a set of individuals related to a phylogeny. There tends to be a correlation in genetic variants between these creatures. For sexual organisms, which does not include all organisms, it generally denotes the ability of any two individuals of opposite sex to produce fertile offspring.

Over ten years ago I read Speciation by Jerry Coyne and H. Allen Orr. As evolutionary geneticists with an interest in taxonomy they take the “species problem” somewhat seriously, but ultimately they’re instrumentalists. “Species” are not the ultimate goal of their scholarship from what I can tell. Rather, species are instruments, semantic tools to smoke out evolutionary processes which shape and determine the pattern of biological variation we see around us. The “origin of species” matters less as a fact about the species themselves than as a question of why we can create categories of species out of the specialized morphological diversity around us.

Not everyone agrees with this position. And not everyone has the same opinion about species. On the whole plant systematists and ecologists will take a different tack on the species problem than evolutionary geneticists. Evolutionary geneticists who work with plants will have a different view from those who work on animals, let alone those who work with bacteria.

The point then is that species are social constructs whose utility and nature vary by discipline. I’m not being a solipsist here. Nature is real. And genetic and phenotypic variation is real. But in some ways the labels we give it can become matters of emphasis.

Of course, I am aware this is an idiosyncratic view. For Carl Linnaeus, the cataloging of species, natural kinds, was cataloging the Creation of God. If you are a Creationist, as most pre-modern people were, then species in their variety and number reflect the will and intention of God. Their study and enumeration would be a glimpse into the mind of the divine.

This doesn’t come out of a vacuum. The religious and Creationist thought simply systematized deep intuitions about the nature of things and biological categories. One doesn’t have to be a genius to make a story about why it would be adaptive to promiscuously and compulsively categorize nature around you. Religious thinkers were simply reshaping and firming up ideas which were in the air.

And this probably brings up why questions about “species” crop up over and over in the comments. And this is why a few times a year I have to put this post up….

None dare call it multiregionalism

Dienekes Pontikos resurfaces with a post, Out of Africa: a theory in crisis. The title is a bit hyperbolic. But in Dienekes’ defense, he’s been on this wagon for over ten years, and the evidence is moving in his direction, not against him. I think a little crowing is understandable on his part.

With that being said, I think the biggest rethinking that we’re doing is less about where modern humans arose (Sub-Saharan Africa, North Africa, the Middle East) than about how they arose. Some geneticists are quite open to the idea of Eurasian (Neanderthal?) back-migration to Africa several times (and out of Africa several times). Others are positing that a “multiregional” model might actually be about the situation within Africa.

A simple stylized model of a rapid punctuated expansion of humanity which replaces other lineages in toto is no longer likely. There has been widespread admixture, even though the last major demographic wave seems to be overwhelmingly predominant, at least outside of Africa. But the whole process might result in a much more complex history than we had thought.

The multiregional model is probably wrong on the details. The history of our species is not really phyletic gradualism and anagenesis. But there are also many processes and dynamics which a multiregional model takes into account and anticipates that probably are important in a general sense toward understanding the origin of our species.

The rapid fading of information

In Robert Heinlein’s uneven late work Friday the mentor of the protagonist mentions that because of a possible collapse of technological civilization he maintains a collection of paper books.

This crossed my mind when I saw that Storify is shutting down. Or Kevin Drum’s reflections on the changes in blogging.

I’ve put a lot of content out there over the years. Probably on the order of 5 million words across my blogs. Some publications here and there. Lots of tweets. But very little of it will persist into future generations. Digital is evanescent.

But so is paper. I believe that even good hardcover books probably won’t last more than a few hundred years.

Perhaps we should go back to some form of cuneiform? Stone and metal will last thousands of years.

Open Thread, 01/28/2018

For various reasons we focus on Classical Greece and Rome, but neglect the Hellenistic period, with the exception of the biography of Alexander. If you want to read something besides Alexander to Actium, check Dividing the Spoils.

A heads up, this week on The Insight we’ll be talking to Joe Pickrell of Gencove. The main topic will be DNA and ancestry.

Is the United States the new Saudi Arabia? This stuff is crazy if you read books about “Peak Oil” in the 2000s. Also, I really don’t ever want to hear about this stuff from random guys who read these books and thought they had all the answers ever again.

DNA Geeks is now gearing toward more general STEM and items for children. The 100x LED Microscope for Mobile Devices has been quite popular, and shipping is free right now. Also, we have European vendors for our t-shirts, so shipping is cheaper and faster.

Following many liberals on Twitter has confirmed my right-wing identity, though modified by policy beliefs. In particular, far Left people, such as Matt Stoller, seem to make coherent criticisms of capitalism and what it has wrought. Criticisms which I don’t always have a good answer for.

In contrast, moderate liberals, with their mild platitudes and thin policies are not persuasive, but their adherence to sex/race identity politics and smearing of all those to the Right of them as white supremacists means that it’s pretty obvious all of those who are “Other” need to band together as one when the time comes. We hang together, or we’ll go to the re-education camps separately.

The Follower Factory. This makes sense.

Why Ursula Le Guin Matters.

Nicholas Christakis is being treated pretty unfairly. I don’t expect that he’ll get satisfaction. This isn’t the age for honorable men.

Big Data Comes to Dieting.

The new gnomAD is pretty dope.

Dissecting historical changes of selective pressures in the evolution of human pigmentation. Not sure of the demographic model.

Sarah Haider on Secular Jihadists.

Punjabi genetic variation in 1000 Genomes: Hindu caste in the Land of the Pure?

In the 1000 Genomes, there is a Punjabi dataset. Here is the description:

These cell lines and DNA samples were prepared from blood samples collected in Lahore, Pakistan. The samples are from a mix of parent-adult child trios and unrelated individuals who identified themselves and their parents as Punjabi.

A few years ago I did an analysis of the population structure in the 1000 Genomes dataset. In the Chinese data, there seemed to be some curious structure (there were two clusters of South Chinese). But the biggest issues predictably were in the South Asians. To give concrete examples, there were a few Brahmins in the Telugu data. A subset of Tamils and Telugus were highly ASI shifted. The Gujarati were highly heterogeneous, and one subcluster was almost certainly Patels (the samples were collected in Houston). The ASI shifted groups were almost certainly Scheduled Castes (Dalits) because I could see that they clustered with those samples from the Estonian Biocentre dataset.

There was something curious about the samples from Pakistan and Bangladesh. Aside from a small number of individuals, whose samples were collected at the same time judging by their IDs (these individuals cluster with Scheduled Castes), the Bangladeshi sample didn’t have much South Asian style structure. That is, there wasn’t a cline or lots of substructure within the ethnicity.

As noted by some commenters, the Punjabi samples were very different. Like the Gujarati samples, there was a huge variance along the ANI-ASI cline. To me, this was somewhat surprising. To make the 1000 Genomes more useful I used PCA and divided both Gujaratis and Punjabis into groups based on their position on the ANI-ASI cline, so that ANI_1 is the subpopulation with the most ANI and ANI_4 the least.
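
To give a sense of what this kind of binning looks like in practice, here is a rough sketch. It is an illustration, not a record of the exact cut-offs I used. It assumes each sample already has a single score along the cline (for example its coordinate on the relevant principal component), that higher scores mean more ANI, and that the groups are equal-sized quantile bins; the function name and sample IDs are hypothetical.

```python
def bin_along_cline(scores, n_groups=4, prefix="ANI"):
    """Label samples by rank along a cline, highest score first.

    scores: dict mapping sample ID -> position on the cline (assumed here
    to increase with ANI). Returns a dict of sample ID -> label such as
    'ANI_1' (most ANI) through 'ANI_4' (least ANI).
    """
    ranked = sorted(scores, key=scores.get, reverse=True)
    labels = {}
    for rank, sample in enumerate(ranked):
        group = rank * n_groups // len(ranked) + 1
        labels[sample] = f"{prefix}_{group}"
    return labels


# Hypothetical sample IDs and PC scores
pc_scores = {
    "s1": 0.041, "s2": 0.035, "s3": 0.020, "s4": 0.012,
    "s5": 0.003, "s6": -0.008, "s7": -0.019, "s8": -0.030,
}

for sample, label in bin_along_cline(pc_scores).items():
    print(sample, label)
```

Equal-sized bins are only one choice. If the samples fall into visibly separate clumps on the PCA, cutting at the gaps between clumps is usually more sensible than cutting at quantiles.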

Using Treemix produced some weird results. As you can see above Punjabi_ANI_1 looks like an Iranian population with gene flow from Punjabi_ANI_3. Punjabi_ANI_2 looks like a North Indian population with Iranian gene flow (so it is more ASI). Punjabi_ANI_3 are less ANI shifted than Uttar Pradesh Brahmins, but more than Uttar Pradesh Kshatriya. Finally, Punjabi_ANI_4 actually is very similar to Punjabi_ANI_2, except it has gene flow from a Dalit-like population.

With the South Asian Genotype Project I have a few Punjabi samples. All of them are within Punjabi_ANI_1.

I don’t know what’s going on here. Is this really caste-like structure in Punjab? Or are we seeing lots of admixture among the people who are called “Punjabi” today? For example, the gene flow edges suggest lots of mixing between quite South Asian types of groups and an Iranian sort. Perhaps this is the absorption of Pathans into South Asian groups? Could it be Muhajir people who mixed with local Punjabis and identified as such?

I was curious to see if I could find something similar in relation to the three Jatts. As you can see with Treemix, no. Jatts are just very ANI-shifted. I added Lithuanians and Georgians, and you can see that Uttar Pradesh Brahmins get gene flow from a Lithuanian shifted group, while South Indian Brahmins have a more Georgian gene flow. I suspect this is just an artifact of the fact that South Indian Brahmins have a lot of admixture from non-Brahmin South Indians, who are more Georgian than Lithuanian (Iran_N as opposed to Yamnaya).

Finally, going back to the Bengali (Bangladeshi) vs. Punjabi contrast, it is really interesting. If Punjab has such deep caste-like structures it really goes to show how within South Asia caste is a very very powerful institution, and ~1,000 years of Muslim rule and in western Punjab a majority Muslim population did not break down the institution. In contrast, in Bangladesh, there doesn’t seem to be much caste structure. I am routinely the most East Asian shifted Bengali in datasets, but my family is also from the eastern edge of eastern Bengal. Why the difference?

In The Rise of Islam and the Bengal Frontier the author posits that the Islamicization of eastern Bengal was to a great extent the function of the opening up of lands for cultivation under the supervision of Muslim elites under the rule of Afghans and later Mughals. This would explain the lack of caste structure because presumably, caste structure would be difficult to maintain in a frontier landscape, where the cultural elite does not promote or accept caste (though the elite West Asian Muslims were racially exclusive, they were also a very small minority).

In contrast, the Punjab has long been settled by Indo-Aryan peoples, and despite its long history of Islam, it was not recently a frontier society.

Anyway, that’s all I got to say for that. I’m sure readers will have more insight on this pattern than I do….

Anhui, in the shadow of Shanghai

Anhui is inland of the prosperous lower Yangzi river valley. According to Wikipedia this province is a recent creation, dating to the Kangxi Emperor. The northern part of the province is part of North China while the south is closer to the Yangzi river valley regions.

It’s relatively poor in comparison to the provinces to the east and seems to be a mishmash of rural regions. But it has been close enough to cosmopolitan regions to be forward thinking in its political orientation.