The new post-genetic paradigm will come

Oftentimes the domain on which a technical framework is applied matters a great deal. Imagine, if you will, an explicit statistical test for a phylogenetic relationship between a set of extant populations, whereby one infers a group of ancestral populations. If the genus is Drosophila, it’s academic. Interesting, but academic. If the genus is Homo, then it gets complicated.

People care a great deal about the historical inferences made from human population genomic datasets. I say genomic, and not genetic, because the last ten years of genome-wide analyses and ancient DNA have been very different from what we saw in the late 20th century and the aughts. The granularity is now fine enough that population genomics has touched upon very sensitive and precious issues, in both scholarly and non-scholarly contexts.

A lot of the time I have my head down reading supplements where the statistical methods are. The reality is that this sort of science is cutting edge, and there are always later revisions. Usually you can see where those revisions might come from if you look at the detailed methods and conclusions that are found in the supplements. Also, you will find that that is where you see the limitations, and the reasons that the authors chose particular parameters.

To give you a sense of what I’m talking about, consider 2016’s Genomic insights into the origin of farming in the ancient Near East. The paper proper is 24 pages. But the supplemental text is 148 pages. There is a lot of interesting stuff in there, but I would just jump to page 125 and read the whole section from there down to the end. The methods portion is important because you always need to take the numerical values in results with a grain of salt. You see, for example, later work which significantly refines the fractions when it comes to estimating admixture between a finite set of putative populations. And the last section seems likely to become a paper in and of itself at some point.
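To make the grain of salt concrete, here is a toy sketch of how an estimated admixture fraction can move when the putative source populations change. This is not the method used in the paper or its supplement. It is a plain least-squares fit on made-up allele frequencies, and every name in it (estimate_alpha, farmer, hunter, proxy_hunter) is hypothetical.

```python
import random

def estimate_alpha(target, source_a, source_b):
    """Least-squares alpha for: target ~ alpha*source_a + (1 - alpha)*source_b."""
    num = sum((t - b) * (a - b) for t, a, b in zip(target, source_a, source_b))
    den = sum((a - b) ** 2 for a, b in zip(source_a, source_b))
    return num / den

rng = random.Random(0)  # fixed seed so the run is repeatable
n_sites = 5000

# Made-up allele frequencies for two "true" source populations (assumption).
farmer = [rng.random() for _ in range(n_sites)]
hunter = [rng.random() for _ in range(n_sites)]

# The target population is built as exactly 70% farmer, 30% hunter.
true_alpha = 0.7
target = [true_alpha * f + (1 - true_alpha) * h for f, h in zip(farmer, hunter)]

# An imperfect stand-in for the hunter source that is itself 20% farmer-like.
proxy_hunter = [0.8 * h + 0.2 * f for f, h in zip(farmer, hunter)]

alpha_true_sources = estimate_alpha(target, farmer, hunter)
alpha_proxy_source = estimate_alpha(target, farmer, proxy_hunter)

print(f"farmer fraction with the true sources: {alpha_true_sources:.3f}")
print(f"farmer fraction with the proxy source: {alpha_proxy_source:.3f}")
```

With the true sources the fit recovers the 0.7 that was built in. With the admixed proxy it lands at about 0.625, even though the target itself never changed. Real analyses work with noisy data and many more populations, but the dependence of the fraction on the chosen sources is the same in spirit.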

But that doesn’t mean that the genetic inferences are not robust, or that they come out of a vacuum. In the details the phylogenetic models being tested are going to be wrong on many particulars, but in relation to the hypotheses being tested they are often entirely sufficient to reject or accept.

For example, there was long the idea that the Basque people of the western trans-Pyrenees region of Spain and France descended from pre-farming Europeans, and therefore the Basque language, which is an isolate, might have local roots which went back to the Pleistocene. Today, ancient DNA along with explicit testing of various phylogenetic scenarios makes it clear that the largest fraction of Basque ancestry derives from “Early European Farmers,” who represent a demographic pulse which radiated out of the Eastern Mediterranean and reached Spain 7,500 years ago. Of course Basques do have local hunter-gatherer ancestry, but these Mesolithic peoples themselves were the last in a sequence of very distinctive populations in Pleistocene Europe. Finally, Basques do have admixture from Indo-European peoples, just less than other people in Iberia.

Of course, genetics can’t tell us about languages. Using linguistic labels in population genetic papers is to some extent a lexical convenience, but it is also one we use because of the constellation of information we have. The last major demographic pulse into Iberia is associated with an ancestry which derives from Central Eurasia. This ancestry is copious in Northern Europe, but is also found in South Asia, and ancient DNA suggests its expansion occurred between 5,000 and 3,500 years ago. It also happens that the Indo-European languages are spoken in both India and Europe. The natural inference then is to make an association between this language family, and this demographic pulse.

Some observers note discordance between estimated fractions from paper to paper, but don’t seem to understand that the point isn’t to estimate fractions of ancestry as ends in and of themselves, but to estimate fractions of ancestry to expose and highlight demographic change (or lack thereof). We can say with a very high degree of certainty that the period between 3000 and 2000 BC witnessed massive demographic change in Northern Europe. Somewhat later there was a similar change in Southern Europe, but more demographically modest. These are simple facts.

There are some scholars, frankly often archaeologists, who dismiss the relevance of the genetic findings. But anyone who has read archaeology knows that there are many cases where researchers see demographic continuity, and posit in situ cultural evolution, where it is just as possible that a new people arrived. The reason ancient DNA has revolutionized our understanding of prehistory isn’t that it has brought us new knowledge, but that it has foregrounded old and buried knowledge. The knowledge being that migration matters.

But genetics is only a skeleton. A framework. True flesh on the bones of the story needs the input of archaeologists, linguists, and other scholars. In Who We Are and How We Got Here David Reich expresses his ambition to construct a historical genetic atlas of the world. But that atlas will be all the poorer without the input of fields besides genetics. Many archaeologists have gotten on board with genetics as a tool, but the reality is that some theories precious to some scholars will need to be rejected if there is going to be total buy-in. Eventually that will happen, and a new synthesis will arise.

2 thoughts on “The new post-genetic paradigm will come”

  1. I am continually struck by the shallowness of the non-mathematical analysis and the paucity of effort made in most historical genetics papers to develop an accurate, complete and rich context for their discoveries from history, local mythology, anthropology (describing both populations considered fully and accurately and global historical precedents of other cultures that can shed light on those cultures that are examined), archaeology (not just of humans, but of domesticated and wild flora and fauna), climate history, non-human genetics (ancient and modern), and linguistics. Where this analysis and discussion appears at all, it is often confined to the supplements. Chinese geneticists are particularly deficient in this regard, often making bold assertions that are flatly wrong about the historical record. And, while authors are better about citing other work in the field, integration of new findings with the existing scholarship is also frequently deficient.

    In academia, the motto I was always taught was to strive to write the definitive, complete and comprehensive work on whatever you happen to be writing about, but that norm has apparently gone out the window.

    Some of this is encouraged by the tyranny of the 24 page journal article format, which encourages investigators to edit out “superfluous detail” that is actually material, which means that even if some of that ends up in the supplemental materials, it isn’t integrated into the analysis of the results. It also encourages authors to leave assumptions and inferences (e.g. the ones you mention about linking genetic profiles to linguistic profiles) unstated, which is often problematic because really big mistakes by trained professionals more often involve bad assumptions than bad calculations or explicit reasoning in an article.

    There also seems to be a strong preference in journal writing for bland, sterile, sparse, unimaginative prose, when a style of writing closer to that of historians would be more valuable.

    This format seems to be almost determined to turn the work of authors capable of writing A- work into B- or C+ work products.

    In my own field (law), we have the opposite problem. Journal articles are expected to run 75-85 pages of opinionated prose, with hundreds of footnote references, whether the question at hand really calls for that much depth or not, even for much narrower issues that don’t require casting a wide net in the face of incomplete information.

    There is room for middle ground.

    A historical genetics journal that encouraged more comprehensive, interdisciplinary, context-rich papers, and that banished only truly ancillary matters to the supplements, could add a lot of value to the academic community.

  2. No, no, no. In academia, the motto is “minimum publishable unit.” Since promotion and tenure often depend on the number of your publications, you try to maximize that number, which means splitting up your research as much as you can.

    Actually, that’s too cynical. A very good paper will mean more to the powers that be than a mediocre one. But three mediocre ones (all of which had outside funding) compared to one very good one …
