Genetics allows the dead to speak from the grave

*The past after the word*

If science is hard, history is harder. Harder in that the goal is to understand what happened in ages which are fading away like evanescent ghosts of our imagination. But we must be cautious. We are a great storytelling species, seduced by narrative. The sort of empirically informed and rigorous analysis which is the hallmark of modern scholarship is a special and distinctive thing, even if it is usually packaged in turgid and impenetrable prose. It is too pat to state that history was born fully formed with the work of Thucydides (or Sima Qian). In fact Thucydides’ pretensions at historical objectivity despite obvious perspective and bias lend credence to the assertions of those who make the case that the past is fiction (in this way Herodotus may actually have been more honest). The temptation is always great to paint an edifying myth which gives succor to national pride or flatters our contemporary self-image. The fact that modern nation-states in the technological age have vigorous debates about details as to the nature of periods of history in the recent past, when the people who lived during those times are still here to bear witness, is telling in terms of the magnitude of the task before us. Fraught questions must be answered with far fewer resources.

Much of history we see only vaguely through chance and contingency, known through happenstance and the whims of our ancestors. In the West the documents which shed light upon antiquity come to us through a narrow tunnel of transmission, a furious period of textual transcription in the last few centuries before 1000 A.D. The Carolingians, the Byzantines, and the Abbasids all engaged in sponsoring the capital intensive project of taking ancient texts and making copies for posterity. The vast majority of the works of antiquity we have today can be traced back to this period [1]. The biases and concerns of the elites who sponsored these projects were critical in determining the nature of the source material which serves as the foundation for the understanding of the deeper past which we take for granted today. We know how little was copied because the extant material makes copious reference to a vast body of work which was circulating in the ancient world on assorted topics (and even many of the works we do have are only portions of multi-volume endeavours, such as that of Livy).

But what about pushing beyond what the text can tell us, and transitioning from history to prehistory? Here is where matters become opaque and conditional upon the nature of the texts (or lack thereof). This is clear when you observe that there are very early periods of human history when our knowledge of individual actors and daily life is actually greater than in later epochs, due either to the regress of civilization or to changes in technology which militated against the preservation of texts [2]. The “Dark Ages” of Greece between the Mycenaeans and the Classical Greeks are the purview purely of archaeology (and even during the Mycenaean period most Linear B texts were of a bureaucratic nature; I do not know of narrative literature such as we have for Egypt or Babylon). For the Classical Greeks the rupture was traumatic enough that their Mycenaean past became the subject of legends. The citadels of the Bronze Age warlords were viewed as “cyclopean” works, as if only giants could have created them. Similarly, the period in Britain between the end of central Roman rule and the Christianization of the Anglo-Saxons, about two centuries, is perceived only faintly because of the paucity of written records (this also explains why this period is often utilized as the setting for historical fantasy).

Yet when text is silent one still has material remains. Their collection and analysis are the domain of archaeology, a historical science. The fact that history as we understand it deals in the written word, and so limits its focus to the period when we have texts, is itself a historical coincidence. Ideally traditional history and archaeology should work in concert, because, critically, words have a way of deceiving and misleading. Most obviously we have a major ascertainment bias in our understanding of the past when we listen only to the perspectives of those who can speak through words, because those who were literate or had access to literate professionals were a very small subset of the broader human experience. Archaeology has less of this bias, because all classes leave behind their material evidence (though if one wants textual representations of a broader cross section of the Roman populace, the novel The Golden Ass is a good place to start). An excellent illustration of this for me, as readers know, is the extended argument in the book The Fall of Rome, which brings material evidence to buttress the position that the decline and fall of the unitary Roman state in the 5th century coincided with a genuine degradation of what we might term civilization. Revisionists looking purely at textual materials have long argued that the classical view was misleading, and, to reduce their argument down to its essence, suggest that classical civilization evolved and transformed, channeling its energies into different activities (e.g., the rise of Christian theology as a successor to the classical liberal arts, see Peter Brown’s The Rise of Western Christendom). But what material remains tell us is that there was indeed an economic and demographic collapse, despite the apologia that one can make as to the reshaping of high culture in texts.
One may choose to weight these facts, or not, but the facts nevertheless remain, no matter how many glosses one wishes to put upon them. The Rome of 600 may have had many more Christian theologians than the Rome of 400 (which was then a mainly non-Christian city), but the Rome of 400 probably had a population on the order of 10-20 times greater.

In a world without text, which is almost all of human history, the material remains are all that we have to grasp upon. Though we can attempt to glean the minds of people long gone from paintings and scratches in stone, the reality is that what they hunted with, what they ate with, and the dwellings in which they lived, are going to give us concrete information where leaps of imagination are unnecessary. Moving beyond the text can allow us to truly illuminate the vast dark oceans of human history with more than our dreams, from the dawn of our species, down to even recent periods when literacy was the privilege of the few, and the experiences of the many were dead to us. Despite this, the paintings have only a few colors on the palette, because archaeology is filled with enormous gaps in perception. Pots not cloth. Caves not tents.

Which brings us to biology, and specifically genetics, as it turns out that DNA is actually one of the material remains that one can extract from archaeological field sites. It’s a robust macromolecule, and today researchers believe that it is feasible that some information can be drawn from remains as old as 1 to 2 million years, though that’s a best case scenario. When it comes to questions of demographic change genetic insights are key, and present data in a way that allows for more rigorous analysis. As has been the case in previous posts I must now give a nod here to L. L. Cavalli-Sforza and The History and Geography of Human Genes. Cavalli-Sforza’s magnum opus reopened the book on attempting to understand history through demographics. It was the first page, and the first chapter. Before World War II there was a cottage industry which attempted to do what Cavalli-Sforza achieved in the late 20th century. But these endeavors were hobbled by two problems. First, they were not scientific, often relying upon intuition derived from their practitioners’ erudition (they were not hypothetico-deductive, though that’s overrated if you have lots of data). Second, the reliance upon intuition meant that many of the conclusions dovetailed rather neatly with the ideological preferences of the day, National Socialism most horrifically, though the shoddiness of nationalism-inflected prehistory was much more widespread than that. Scientific romance without the genocide (see Pat Shipman’s The Evolution of Racism). After World War II archaeologists reversed course and decoupled cultural evolution and change from demographic variation. Works such as the Races of Europe became anachronistic when decades before they’d have been mainstream, and there was a strong bias toward a null hypothesis that pots, that is cultural traditions, migrate, but people do not.


Into this intellectual climate stepped Cavalli-Sforza and his students, setting off a minefield of academic explosions (see The Human Genome Diversity Project: An Ethnography of Scientific Practice). Molecular anthropology in its earliest incarnations focused on deep time. In particular, there was a recalibration of the time depth of the origin of apes and humans, where the molecular biologists clashed with paleontologists, and came out the victors (see The Monkey Puzzle for a history of these controversies). Then, there was the “Out of Africa” debate (see The African Exodus). Though these were somewhat fractious and personalized arguments, the emotions around the implications of these contests of ideas were often limited to scholars (though the scholars themselves may not have felt the fallout was limited; apparently at Stanford in the late 1990s a cultural anthropologist gave a presentation where he juxtaposed a photo of Cavalli-Sforza with Josef Mengele). What Cavalli-Sforza did was bring genetic science toward addressing more contemporary phenomena, to answer questions which come to the cusp of the present, tackling issues of relevance to living people on the scale of nations and peoples. Over many decades his lab collected enough information from hundreds of genetic loci to arrive at the sum totality of inferences which were eventually presented in The History and Geography of Human Genes.

Let’s take a step back here. Cavalli-Sforza and his colleagues had access to hundreds of markers at best. Note that ~2% of the human genome codes for proteins, but there are 3 billion positions in terms of bases. Today anyone who wants to pay can get millions of positions through SNP-chip services. My son has billions of positions, because he’s been whole-genome sequenced. For phylogenetic purposes you don’t need billions, millions, or even thousands, depending on the nature of the questions you have in mind. But it puts in perspective how far we’ve come in literally 20 years. Even 5 years.

As is the nature of science there was much that Cavalli-Sforza got wrong in The History and Geography of Human Genes. But there was much that he got right, because the results were so clear and strong on particular points of contention. In short, very broad patterns on the continental level jumped out when analyzing even hundreds of neutral (that is, not subject to natural selection) markers. For example, the data confirm a gradient of genetic diversity which implies human origins from an African locus, as well as the relative homogeneity of Europe (aside from Finns, European populations have a surprisingly low between-population pairwise genetic distance in most cases). But more subtle counterintuitive relationships were often not robust (e.g., North and South Chinese do not bifurcate in the manner that he reported in the 1990s). And, most critically for the purposes of this post, inferring past demography from current phylogeographic patterns had serious limitations.

*The present as a window into the past*

The basic idea behind historical population genetics (archaeogenetics) which was pioneered by Cavalli-Sforza at the HPGL at Stanford was to look at patterns of diversity and relatedness among modern populations, and intersect that with what was and is known about history, as well as geography, and then allow those intersections to peel back the palimpsests of human history (see his The Great Human Diasporas). Though Cavalli-Sforza focused initially on autosomal markers scattered through the genome, in the period between 1995 and 2005 there was a great deal of work using uniparental data, the markers on the Y chromosome and mtDNA. The mtDNA is passed through women only, is copious in terms of quantity on a cellular level, and has a highly mutable region of utility for molecular phylogenetics. The Y chromosome exhibited some technical difficulties in comparison to mtDNA, but with the emergence of better extraction techniques as well as a focus on highly mutable microsatellite regions, it came to be set next to mtDNA as a critical tool in the forensic reconstruction of human population history. In addition, both had the virtue of being nonrecombining, so that the generation of a phylogenetic tree was not an artificiality, but a reflection of the nature of the transmission of these two regions of the genome (congenial to a coalescent framework as well).

In the end this line of research often resulted in a transposition of a phylogenetic tree upon a world map, outlining patterns of human migration. It also aligned well with another line of research which explicitly modeled the expansions of humans out of Africa as a “serial founder bottleneck” process. That is, each population which left Africa progressively branched out in a unidirectional manner, resulting in reduced genetic diversity as one progressed out of Africa.
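To make the serial founder idea concrete, here is a minimal sketch of how expected heterozygosity decays across a chain of founder events. It uses the textbook drift result that one generation at diploid size N multiplies expected heterozygosity by (1 - 1/(2N)). The function name, the starting heterozygosity, the founder size, and the number of steps are all illustrative assumptions of mine, not values taken from the paper cited here, and the sketch ignores mutation, population growth between events, and later admixture.

```python
def serial_founder_heterozygosity(h0, founders_per_step, n_steps):
    """Expected heterozygosity after each of n_steps successive founder events.

    Assumption: every founder event is a single generation at a diploid
    population size of `founders_per_step`, so expected heterozygosity is
    multiplied by (1 - 1/(2N)) each time. Mutation and recovery are ignored.
    """
    decay = 1.0 - 1.0 / (2.0 * founders_per_step)
    trajectory = [h0]
    for _ in range(n_steps):
        trajectory.append(trajectory[-1] * decay)
    return trajectory


# Hypothetical numbers: start at H = 0.30, 50 founders per event, 10 events.
trajectory = serial_founder_heterozygosity(0.30, 50, 10)
for step, h in enumerate(trajectory):
    print(f"founder event {step:2d}: expected heterozygosity = {h:.4f}")
```

The point is only the shape of the curve: diversity falls monotonically with the number of founder events separating a population from the origin, which is the pattern the serial founder model predicts as one moves away from Africa.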

Ramachandran, Sohini, et al. "Support from the relationship of genetic and geographic distance in human populations for a serial founder effect originating in Africa." Proceedings of the National Academy of Sciences of the United States of America 102.44 (2005): 15942-15947.

In its broadest strokes this model is not without validity. It does seem that most of the ancestry of modern humans can be traced to a population which flourished around or in Africa ~50-100 thousand years ago. Much of the inter-continental racial variation that we see in extant populations does nicely fit onto a bifurcating tree-like model (e.g., Non-Africans branch off from Africans, West Eurasians and East Eurasians diverge, Amerindians branch off from East Eurasians). The problem though is that the branches themselves turn out to be brambles which turn back in on themselves, and in some cases twist with other branches, creating lineages with very diverged ancestral roots. The yield of the earliest efforts by Cavalli-Sforza and his heirs was on a very coarse continental grain, where the effects of the dynamics were so striking that they would exhibit themselves across most neutral markers without much difficulty. But, when the questions were narrower, and the temporal and spatial scope more constrained, the earlier methods were not perceptive enough to smoke out the real dynamics.

Li, Jun Z., et al. "Worldwide human relationships inferred from genome-wide patterns of variation." Science 319.5866 (2008): 1100-1104.

By the middle years of the 2000s researchers had gone back to a focus on recombining autosomal markers. But now they had a whole human genome to compare it to, as well as SNP-chips which quickly yielded large troves of data with little effort. In 2008 a paper was published which took the original HGDP data set collected by Cavalli-Sforza and his colleagues, and utilized the new technologies to make deeper inferences. First, instead of hundreds of markers you had 650,000 SNPs. Second, the emergence of powerful new analytic and computational resources allowed tree-based and PCA visualizations of genetic relationship to be complemented with model-based understandings of genetic variation and population structure. By “model-based,” I mean that the algorithm posits particular parameters (e.g., “3 ancestral populations”) and operates upon the data (e.g., “650,000 SNPs in 1000 individuals”), to generate results which are the best representation of the fit of the data to the model. This is different from PCA, which has fewer assumptions, and represents genetic variation geometrically (each axis represents an independent dimension of variation within the data). Model-based clustering is very clear and aesthetically appealing. It gives precise results. But the model itself is not necessarily right.

Anyone who uses these methods understands their limitations. If you use PCA to project variation of the data set, then the composition of the data you input is going to influence the largest principal components. Therefore, if you are asking questions on a broader spatial scale you should be careful about the possibility that you are overloading the sample set of interest with particular populations. More data in this case might result in less insight. Similar issues crop up with model-based clustering if you don’t appropriately weight the populations. Another major problem is that the models are imposing limitations which might produce false inferences (false in that they do not accurately reflect demographic history). Most simply you might ask for many more population divisions than is realistic for the demographic and genetic history of the data. Consider a data set of Irish from Cork and Nigerians from a small village. PCA would no doubt show you two very tight and distinct clusters. With a model-based framework you could look for divisions and structure beyond K = 2 (two ancestral populations). The method is devised in such a way that you would get results. But they wouldn’t be very informative, and they’d be forced. They wouldn’t be robust. The model would be a poor fit to reality.
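For readers who want to see the two-population PCA case in miniature, here is a rough sketch using only NumPy. Everything in it is an assumption for illustration: the genotypes are simulated rather than real, the two sets of allele frequencies, the sample sizes, and the number of SNPs are made up, and the helper names `simulate_genotypes` and `pca_scores` are mine. It is plain PCA via singular value decomposition on centered genotype counts, not the normalization used by any particular published package.

```python
import numpy as np


def simulate_genotypes(rng, allele_freqs, n_individuals):
    """Draw diploid genotypes (0, 1, or 2 allele copies) per individual and SNP."""
    return rng.binomial(2, allele_freqs, size=(n_individuals, allele_freqs.size))


def pca_scores(genotypes, n_components=2):
    """Project individuals onto the top principal components via SVD."""
    centered = genotypes - genotypes.mean(axis=0)  # center each SNP column
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    return u[:, :n_components] * s[:n_components]


rng = np.random.default_rng(0)
n_snps, n_per_pop = 1000, 40

# Hypothetical allele frequencies for two strongly diverged populations.
freqs_a = rng.uniform(0.1, 0.9, n_snps)
freqs_b = np.clip(freqs_a + rng.normal(0.0, 0.2, n_snps), 0.05, 0.95)

genotypes = np.vstack([
    simulate_genotypes(rng, freqs_a, n_per_pop),
    simulate_genotypes(rng, freqs_b, n_per_pop),
])
scores = pca_scores(genotypes)

pc1_a, pc1_b = scores[:n_per_pop, 0], scores[n_per_pop:, 0]
print("PC1 range, population A:", pc1_a.min(), pc1_a.max())
print("PC1 range, population B:", pc1_b.min(), pc1_b.max())
```

With populations this diverged, the first principal component should separate the two samples into non-overlapping ranges. Changing `n_per_pop` for one group only, or stacking in a third sample, is a quick way to watch the leading components shift with the composition of the input, which is the caution raised above.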

*From model to reality*

Obviously no model captures all elements of reality. But when the model deviates so much from reality that you get a false sense of what is true then that model is not nearly as useful. Being wrong is a definite bug. Aside from model-based admixture analysis, which posits a finite number of ancestral populations which come together to produce the genetic variation in the data set, you notice that the 2008 paper also had a tree representation of genetic variation. These two together give real and substantive results that can be useful. But, they mislead to the point of falsity in many specific cases.

Reich, David, et al. “Reconstructing Indian population history.” Nature 461.7263 (2009): 489-494.

This can be illustrated by the instance of South Asians, who are about 20% of the world’s population. A 2009 paper, Reconstructing Indian Population History, utilized both the higher autosomal marker density sets and new analytic frameworks to come to some specific conclusions which resolve many confusions about the nature of the genetic history of the peoples of the Indian subcontinent. So what did we know before? If you go back to the ideas of the old physical anthropologists they observed that many South Asian groups had an affinity to the peoples of West Eurasia (Europeans and West Asians). This varied as a function of geography and caste. In other words, there was a cline to the northwest, as well as up and down the caste system. You can see it in a PCA, where Indian groups vary in distance from Europeans, while Europeans form a very tight cluster. It also shows up in admixture based analyses. There is usually a K value where a South Asian modal cluster emerges, and it is near fixation in South Indian non-Brahmins, declining in frequency as one moves toward Pakistan, or, in North India up the caste hierarchy (the residual are West Asian and European clusters, except Bengalis, who have East Asian admixture). In The History and Geography of Human Genes South Asians form an outgroup to Europeans and Middle Eastern populations using older distance measures.

So far all good. One can imagine then a cline of genetic variation, with South Asians at one end, and West Eurasians at the other. On a PCA between East Asians and Europeans South Asians usually fall in the middle, but closer to Europeans. But there have long been major problems with this model when you drill down into the details. First, the mtDNA and Y chromosomes of South Asians give very different results. The former classes them as distinct from West Eurasians, with distant affinities to East Eurasians. The latter on the other hand are quite a bit more like West Eurasians. Second, South Asians exhibit a lot of variation as a function of both geography and class in terms of their relatedness to world populations. If South Asians were deeply rooted in the subcontinent, as the migration maps above would imply, then we’re talking about massive barriers to gene flow which have persisted for tens of thousands of years. An alternative explanation is that South Asians are the product of recent admixture between two very different groups, which is what is often the norm when there is a lot of inter-individual variation in ancestral components and PCA position within a putative population group (e.g., African Americans). Finally, tests of natural selection geared toward detecting very recent sweeps have indicated a commonality between South Asians and Europeans and Middle Easterners on the haplotype of SLC24A5, which implies either extreme connectedness, or, recent admixture and migration (on the margin these two models are going to be hard to distinguish, since connections are mediated through migration).

I will sidestep the technical issues at this point, and just offer up that the work on South Asians has presaged much of what we’ve learned over the past decade when it comes to the genesis of modern population structure. The puzzles about South Asian genetic variation are resolved when you admit a model where a West Eurasian population mixed with a local indigenous group with distant affinities with other East Eurasians (see Genetic Evidence for Recent Population Mixture in India). The high level of between population variance within South Asia is due to the recent nature of the admixture event and the high genetic distance between the source populations. This may actually be the story of much of the world over the last 10,000 years. Instead of a regular branching process, imagine branches that periodically fuse back together, in a reticulated pattern. Another way to conceive of it is that the last 10,000 years have been a story of the destruction of population structure accrued over the past 100,000 years. A survey of this field can be found in the review Toward a new history and geography of human genes informed by ancient DNA.

*Inference made concrete, ancient DNA*

Up until now we have been talking about increasing the power of analysis of genetic variation in extant populations. Processes like bottlenecks and positive selection leave footprints in the genomes of modern peoples. But these methods of inference have limits. And, to a great extent they necessitate a simplicity of population dynamics to allow for them to have utility in painting a portrait of the past. Researchers had to assume that the past was simple, or the methods that they had wouldn’t be able to tell them as much as they claimed. The complexity of the demographic palimpsest could never race beyond the ability of the genetic methods to peel it back, so there was a ceiling on the number of layers imposed upon the model.

Ancient DNA was a game changer, because it did not come with these limitations. Instead of just inferring the past from the present, the past could now be inferred from the past! That is, a temporal transect could be generated which explicitly explored the trajectory of genetic variation across time and space. As if to recapitulate history the earliest work was with mtDNA, just as it had been with “mtDNA Eve” in the 1980s. The sequence target here is small and mtDNA is copious. The immediate upshot though is that massive discontinuities were detected. Populations replaced each other repeatedly in many regions. Pulse admixture events being inferred with novel methodologies on extant populations now could be understood to have been the natural result of migration and population change over the past ~50,000 years. Thanks to the work of researchers such as Svante Pääbo and Eske Willerslev the number of samples we have from ancient DNA for humans has grown to such an extent over the past 5 years that a bright light is shining into what had been a dark cavern of prehistory.

*European man, made and unveiled*

Because of both the concentration of researchers in Europe, as well as suitable preservation conditions in Northern Eurasia, ancient DNA has totally changed how we understand the genetic history of this continent most especially. Two new papers have expanded the sample set to 170 individuals, and many major questions have now been answered, and other new questions have been triggered by perplexing results. A few years ago I was talking to Spencer Wells about the age that we are privileged to live in. Spencer is a history and genetics buff (he was one of Richard Lewontin’s last grad students). So naturally as genetic science has emerged to shed light on history we’ve tracked its developments very closely. Spencer does so professionally; he’s a genetic anthropologist. Many questions which in the past would have been unanswerable are now answerable. Truth is coming at us so fast that it is hard to even respond to all of it (if you wait too long to publish, everything might have changed).

Carl Zimmer’s piece in The New York Times, DNA Deciphers the Roots of Modern Europeans, is accurate as to the current state of the accelerating research in this area. This is the equivalent of having a Rosetta Stone. The ancients are now coming back to life. They speak! Everything has changed. In Nature Ewen Callaway quotes a scientist stating in plain language, “Christ, what does this mean?” I’ll try and flesh out further what it means, but the papers themselves do a good job. These are first steps, but they’re very big steps. There’s only so much more to go, and truth will be at hand.

First, the two papers, Massive migration from the steppe was a source for Indo-European languages in Europe, and Population genomics of Bronze Age Eurasia. As might be suggested by the title the latter paper has coverage of populations outside of Europe, while the former focuses on Europe. The sample sizes are 69 and 101 respectively. The first paper uses a methodology which yields many SNPs, while the latter relied upon whole-genome sequencing (variation is variation, so really this is a minor detail for the results, though it matters a lot for the working scientists who are generating the data). Both agree broadly on the major results. Additionally, there is a third work, a preprint, Eight thousand years of natural selection in Europe, which has results in line with the second paper above (it has a section on selection as well as phylogenomics).

*European genetic structure is younger than the pyramids*


The old debate over whether Europeans are descended from farmers or hunter-gatherers was always somewhat incoherent. All humans are descended from hunter-gatherers. Rather, the issue was whether modern Europeans descend primarily from people who were resident within the continent of Europe at the end of the Pleistocene, or whether they descend from peoples who developed agriculture in the Middle East ~10,000 years ago. That is, did farming spread through cultural diffusion or migration? Plants or people? The answer is actually not straightforward, but the results are not controversial today.

First, migration seems to have been the dominant dynamic which defined the spread of farming, especially early on. These first farmers who arrived in Europe were genetically very different from the hunter-gatherers of Europe’s north and west. Some of their ancestry had been isolated by long distances for tens of thousands of years before contact. The people of the Iberian peninsula today have less genetically in common with the hunter-gatherers who were present in the region when the farmers arrived than do modern Northern Europeans, who harbor a greater fraction of ancestry which derives from the Pleistocene people. The main qualifier I’d put on this though is that the farmers themselves seem to have picked up European hunter-gatherer admixture on their way out of the Middle East. The fraction is on the order of ~50%. The other component has been termed “Basal Eurasian,” because this element is an outgroup to all other Eurasians, including the European hunter-gatherers. That is, the Basal Eurasians are an outgroup to a clade that includes such diverse populations as Andaman Islanders, Australian Aborigines, Japanese, and European hunter-gatherers.

Lazaridis, Iosif, et al. "Ancient human genomes suggest three ancestral populations for present-day Europeans." Nature 513.7518 (2014): 409-413.

The figure is from the paper Ancient human genomes suggest three ancestral populations for present-day Europeans. WHG = “Western (European) Hunter-Gatherers.” EEF = “Early European Farmers.” You can see that EEF is a compound. I don’t think there’s too much clarity right now as to where the EEF got its WHG-like ancestry. It could have been structure in the Middle East. Or it could have been in Southeast Europe. In the supplements of Haak et al. they test a Hungarian sample, and it does seem that the EEF individuals are closer to it than to the Western European hunter-gatherer samples. So there might have been structure in the ancestral European population, but the confidence here is low. And from what I can tell Basal Eurasian is still something of a mystery, almost occupying the role of “Planet X” before the discovery of Neptune. To make the patterns make sense the Basal Eurasians have to exist, but much isn’t known about them in detail. And of course there seems to be a huge lacuna right now in terms of exploring the population genetics of the Middle East in a similar fashion as has occurred in Northern Eurasia (my understanding is that Carlos Bustamante was an important person in getting Latin American populations in the 1000 Genomes; unfortunate that there wasn’t someone else to advocate for including a Middle Eastern group, since this is such an important part of the world for human history).

With all that said, if one assumes that the West Eurasian admixture in EEF was from European hunter-gatherers, then it is clear that most of the ancestry of modern Europeans can date to the Pleistocene (i.e., EEF + Yamnaya likely means more than half the ancestry is WHG-like if you look back 10,000 years). But this proportion obscures the fact that massive migrations and population turnovers have occurred, so that a simple model of expansion out of Ice Age refuges no longer holds. Cavalli-Sforza has long argued that pure proportions of ancestry are less important than the dynamic, as population growth driven “waves of advance” will over time dilute the initial genetic signal anyway (though the final proportion of non-WHG-like ancestry is actually higher in much of Europe than Cavalli-Sforza conceded in the early 2000s). Whether or not the ancestry of modern Europeans derives predominantly from European hunter-gatherers, the idea of dominant local continuity in a given region has been thoroughly refuted. The hunter-gatherer ancestry in the British Isles, for example, may be mostly from admixture into agricultural groups far to the south and east during the initial waves of advance, not from the people who initially recolonized Northern Europe in the early Holocene.

The second demographic turnover event which has been highlighted by the papers cited so far is from the east: the migration from the steppes. This event had disproportionate, even dominant, impact across much of Northern Europe. Culturally it is often rooted in the Yamnaya complex, which gave rise to various disparate and wide-ranging “daughter” societies. David Anthony’s The Horse, the Wheel, and Language surveys the archaeological terrain thoroughly. If you are interested in this topic, and haven’t read it, do read it. In this work Anthony outlines the spread of Indo-European languages via expansion of a mobile pastoralist elite. He was involved in the retrieval of some of the samples in these studies, and from what I understand he was personally surprised that the genetic data imply not just elite migration, but a folk wandering. Not just a band of brothers, but whole peoples on the move.

Haak, Wolfgang, et al. "Massive migration from the steppe was a source for Indo-European languages in Europe." Nature (2015).

Focusing on the genetics, these people seem to themselves be a compound of disparate elements. First, about half of their ancestry derives from a population which Haak et al. term “Eastern Hunter-Gatherers” (EHG). The other half derives from a population with affinities to those of the Near East, but different from that of the EEF. There is some disagreement between the two papers in Nature as to the details, but Allentoft et al. admit that they did not have EHG samples, which may have impacted their ability to detect admixture. Allentoft et al. also diverge from Haak et al. in the emphasis they place on the ancestral component among the Yamnaya which some term “Ancient North Eurasian” (ANE) based on the location of the most ancient individual of this line (see Upper Paleolithic Siberian genome reveals dual ancestry of Native Americans). What does seem clear is that this element is deeply diverged from other West Eurasian populations, on the order of ~20 to 30 thousand years. And it contributes about half the ancestry of the EHG (the rest is WHG-like). The descendants of the Yamnaya people brought this component all throughout Europe, with the exception of the Sardinians and Sicilians, likely isolated because of their position on the Mediterranean littoral (Sicilians have later Near Eastern admixture as well). But this is not limited to Europeans, as a substantial proportion of Native American and West and South Asian ancestry (at least in the Kalash) also exhibits connections to this component. Allentoft et al., like Haak et al., point out that there was likely structure in this broader group. That is, the ANE themselves were diversified, with the ancestors of the element in Native Americans and Europeans different from that which contributed to the Siberian component.
In fact I have talked to researchers who believe that the term “Ancient North Eurasian” is misleading, as there is little clarity on the distribution of this group (the highest inferred fractions in Eurasia are in the North Caucasus). It is feasible that the Kalash have a different ANE source than Europeans.

A key issue to note, and one that confuses some people, is that the ancestry of groups such as Yamnaya exhibited commonalities with other groups across Eurasia. Therefore, if one group is replaced by a similar one, then the change in admixture components estimated by model-based programs may not be as extreme as you would think. To illustrate what I’m getting at concretely, the population transfer between Greece and Turkey during the 1920s was far more impactful as a dynamic than simple before and after admixture estimates would suggest to you (since genetically the two groups were very similar). The figure from Haak et al. does not use admixture components that break out naturally, but their inferred demographic mixes taking into account the genetic character of the putative ancestral populations. The blue component refers to WHG, but WHG-like ancestry is also in both the green (Yamnaya) and orange (EEF) elements (this is why I’m saying it is likely that modern Europeans are more than 50% WHG-like).
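The point about replacing similar groups is easy to see with a toy calculation. What follows is only a minimal sketch: the component labels "A", "B", and "C" and every proportion are invented for illustration and are not taken from Haak et al., Allentoft et al., or any real population. The function name `turnover_shift` is likewise hypothetical. It mixes a resident profile with an incoming one and reports how much the component profile moves, measured as half the sum of absolute differences.

```python
def turnover_shift(resident, incoming, replaced_fraction):
    """Change in ancestry-component profile after population turnover.

    resident, incoming: dicts mapping component label -> proportion
    (each dict should sum to 1). replaced_fraction is the share of the
    resident population replaced by incomers.

    Returns the total variation distance between the profile before and
    after: 0 means no visible change, 1 means a completely new profile.
    """
    components = set(resident) | set(incoming)
    shift = 0.0
    for c in components:
        before = resident.get(c, 0.0)
        after = (1 - replaced_fraction) * before + replaced_fraction * incoming.get(c, 0.0)
        shift += abs(after - before)
    return shift / 2


# Hypothetical, made-up profiles for two genetically similar groups.
resident = {"A": 0.55, "B": 0.35, "C": 0.10}
incoming = {"A": 0.50, "B": 0.38, "C": 0.12}

shift = turnover_shift(resident, incoming, replaced_fraction=0.9)
print(f"people replaced: 90%, component profile shift: {shift:.1%}")
```

With these invented numbers nine out of ten people are replaced, yet the component profile moves by only a few percent. That is the sense in which before-and-after admixture estimates can badly understate a turnover.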

One temporal dimension that Haak et al. emphasize in particular, but which seems clear in Allentoft et al. as well, is that non-Yamnaya ancestry slowly begins to rise again by the Bronze Age. Why? I will address that below. But Allentoft et al. have broader Eurasian samples, including likely Indo-European populations in the trans-Ural and trans-Altai regions. In both of these areas the successor cultures had EEF-like ancestry. That is, like the Corded Ware population, and unlike the parent Yamnaya group. This strongly implies back-migration by this complex from Eastern Europe, as far east as western China, during the Bronze Age.

In The New York Times piece David Anthony states two things which puzzle me as an interested lay person without his expertise. First, he seems to think that the amalgamation of the Yamnaya and EEF-descended populations was not a warlike process. Specifically he says “It wasn’t Attila the Hun coming in and killing everybody.” This is a useful image, but let’s be honest and note that the Huns were not primary producers, and did not aim just to increase pasturage by killing settled peoples as Genghis Khan had wanted to do (see The End of Empire: Attila the Hun & The Fall of Rome). Rather, they conquered and subordinated other barbarian groups, as well as extorted tribute from the East Roman Empire. The demographic impact of the Huns was not directly from them, but from the fact that they and their successors (in particular the Avars) facilitated the migration of other groups: first the Goths, and later the expansion of the Slavs. By the time of Attila barbarian leaders were well aware that the conquered were vital as economic producers whose capture and subjugation would allow them to engage in status competitions of conspicuous consumption. I do not believe that this was quite the case in the Copper and Bronze Ages beyond the limes of the civilized world, which was then a small archipelago of literacy in a sea of barbarism. Both the above papers indicate massive demographic disruption across Europe. Though war as we understand it is not necessarily inevitable for our species, between the rise of agriculture and the modern period it seems to have been very common. It is not a coincidence that the Scandinavian Corded Ware culture is also called the Battle-Axe culture. Yes, many archaeologists believe that the axes were primarily status symbols. I’m willing to bet many archaeologists are wrong. It’s been known to happen.

The second issue which Anthony brings up is the connectedness of the various post-Yamnaya cultures, in particular that of the earliest Indo-Europeans on the fringes of western China, 4,000 miles from their likely point of origin. The genetic characteristics of these eastern groups are also such that it is likely that there was gene flow from Europe, mediated by a common steppe culture. Anthony states that “I myself have a hard time wrapping my head around explanations for that”. This totally confuses me, because he’s a professional archaeologist, so he must know that widespread gene flow and cultural ties across the vast swath of the Eurasian heartland are not surprising at all! To Carl Zimmer I pointed out the example of the Göktürk Empire of the mid 6th century A.D., which expanded rapidly from the core Altai zone, and prefigured the later distribution of the Turkic people, from the Nile to the fringes of the Arctic sea. Language and lifestyle mediate relationships and demographic contact. The peripatetic character of steppe peoples is well known and attested from the historical and semi-historical record. Groups such as the Huns, Avars, and Alans had inchoate origins in the heart of Eurasia, and moved back and forth along lines of cultural affinity as needed. Alans were serving under the Mongols in China in the 13th century, but 800 years earlier they had accompanied the Vandal tribe to North Africa, and maintained a separate identity there until the conquest of Justinian. It seems entirely plausible that this pattern of hyper-mobility arose with agro-pastoralism along the whole range of continuous ecological appropriateness, only ending with the rise of gunpowder empires and the crushing of the Oirat by the Manchus (with the tacit approval of Russia).

*Northern European archetypical physical characteristics are younger than the pyramids*

Spencer Wells, a new look in the world

Phylogenomics is tangled and complicated still, even with all these new results. I’ve only scratched the surface above. You really need to read the papers, and their supplements, to even get a sense of what’s going on (yes, ideally you’ll know what an f3 statistic is!). But, the population genomics which give us a sense of the character of natural selection and phenotype over time is much clearer. The suite of traits which we associate with white Europeans is quite possibly very recent, as late as post-Bronze Age. White supremacist scholars of the early 20th century who posited that ancient Egypt (in fact, all civilizations) were founded by blonde Nordic people turn out to likely be wrong because these civilizations probably predate the existence of blonde Nordic people, both in their genetic structure, and in their physical type (at least in any number).

The genetic architecture of pigmentation is something geneticists know a fair amount about, because genome-wide association has been very fruitful in this area. Unlike traits such as height there is a large amount of between-population variation in pigmentation. And that variation is due in large part to a few genes of large effect. At SLC24A5 there is a SNP which accounts for around 1/3 of the melanin index difference between Europeans and Africans, using an admixed African American population to test the effect. As I have observed before SLC24A5 in its derived form is as close to fixed as you can get in Europeans. In the 1000 Genomes data set of thousands of individuals I found only a few samples that were heterozygotes carrying the ancestral copy. In the Middle East this allele is also near fixation, though not quite. As you can see from the figure I adapted from Allentoft et al., among South Asians the derived allele is also at high frequency. Everyone in my family is a homozygote for the “European” variant. There is some suggestive evidence that this haplotype derives from the Middle East. It was only at low frequency among European hunter-gatherers [3]. But by the Bronze Age it had gone to fixation in Europe, as well as on the Eurasian steppe.

Of more interest to me is the trajectory of SLC45A2. The derived allele is nearly fixed in modern European populations, though not nearly to the same extent as SLC24A5. In Iberian and Sardinian populations the ancestral type is in the range of ~10%. During the Bronze Age in Europe it was only at ~50% frequencies, which is in the range of modern Middle Eastern populations. It was at even lower frequency in the steppe, from which the putative Indo-Europeans migrated.

Finally, in this panel for pigmentation they included a major SNP in the OCA2-HERC2 region. This locus is famous for being involved in blue-brown eye color variation, explaining 75% of the variance, and also exhibiting the third longest haplotype in the European genome. Naively projecting from these SNPs one could credibly argue that the ancient hunter-gatherers of Europe at the beginning of the Holocene were dark-skinned and blue-eyed! The Bronze Age European samples, which in this case are biased toward Northern Europeans, had a range of genetic variation equivalent to modern Southern Europeans. The people of the steppe did not seem to have blue eyes at all.


These results align perfectly with those in Mathieson et al. One thing to observe is that the Paleolithic samples, which have a much deeper time depth, are “ancestral” at all these positions. Even if the sample size is small (N = 4), they’re from diverse times and places. Does that mean that they were much darker than even the Holocene hunter-gatherers of Europe? As some have pointed out we can’t just straight-line extrapolate from the genetic architecture of today to the past. Remember that Neanderthals exhibited pigmentation polymorphism, but of a different sort. A deeper functional analysis may yield the possibility that Paleolithic Europeans had alleles which also resulted in lighter skin, but they were different ones from the ones segregating as polymorphisms today. I have already stated that I doubt much of modern European ancestry derives from before the Last Glacial Maximum. The reason that modern genetic variation in terms of predicting phenotype gives these sorts of results is that they may have arrived at the same trait value via a different set of polymorphisms. Genotype-phenotype maps derived from modern populations may be a poor predictor of the relationship 30,000 years ago. Why would one think that selection upon variation in pigmentation began at the cusp of the Holocene?

But I do think we can predict with more confidence the nature of phenotypes for populations which are genetically much closer to modern ones. Bronze Age Europeans fit that bill. And I know something personally about what the appearance of individuals during this period might have been based on genetic architecture: both my children exhibit a genotype profile on pigmentation loci similar to many Bronze Age Europeans. That is, they’re homozygous for the derived variant of SLC24A5, and are heterozygotes at SLC45A2 and OCA2-HERC2 (my son, but not my daughter, is a heterozygote at KITLG; it does seem to make a difference in hair color). In terms of just their complexion they could pass as indigenous Southern Europeans, but definitely not Northern Europeans.

*Culture leads genes by the leash*

Another major finding of Mathieson et al. and Allentoft et al. is that the derived allele found across West Eurasians that allows them to digest lactose sugar as adults has been sweeping up in frequency over the last 4,000 years. This allele spans a diverse array of populations, from Basques to South Asians. With pigmentation it seems that we need to consider jointly the impact of ancestry and selection (in South Asia derived SLC24A5 frequencies are definitely a function of both selection and descent). But with LCT it seems likely that selection is paramount. The predominant genetic character of Eurasia was established by the Bronze Age, but the frequency of the lactase persistence allele was still far lower. Tests of natural selection which focus on patterns of haplotype variation have long detected a huge hit from LCT, so this is not surprising.

Intriguingly Allentoft et al. indicate that though the Bronze Age steppe populations had low frequencies of the derived allele, it seems that they did have a higher frequency than other populations of the same period. This suggests that the origin of this haplotype, which spans the whole range of Indo-European speaking populations, and also extends into Finnic groups and the Basque, may still be attributed to the Yamnaya complex. In The 10,000 Year Explosion Greg Cochran proposed the hypothesis that the favored mutation for LCT enabled the spread of Indo-European pastoralists. These results are not strong support for that direct causal relationship; rather, it strikes me that the ascendancy of the pastoralists drove the selection pressures for the allele in question. Biology did not drive culture, culture drove biology. The milk-drinking Celts and Germans encountered by Julius Caesar 2,000 years ago may still have been in the middle stages of adaptation to the agro-pastoralist lifestyle slowly being perfected by their ancestors.

*As the white man is, so shall we all be*

A new look as well

It is a running joke of mine on Twitter that the genetics of white people is one of those fertile areas of research that seems to never end. Is it a surprise that the ancient DNA field has first elucidated the nature of this obscure foggy continent, before rich histories of the untold billions of others? It’s funny, and yet these stories, true tales, do I think tell us a great deal about how modern human populations came to be in the last 10,000 years. The lessons of Europe can be generalized. We don’t have the rich stock of ancient DNA from China, the Middle East, or India. At least not enough to do population genomics, which requires larger sample sizes than a few. But, climate permitting, we may. And when that happens I am confident that very similar stories will be told. Using extant genetics we can already infer that modern populations in South Asia are a novel configuration of genotypes and phenotypes. The same in Southeast Asia, the Americas, and probably Africa. Probably the same in East Asia. Perhaps in Oceania. Even without admixture humans evolve in situ and change, but with admixture the variation increases, and the parameter space of adaptation becomes richer and more flexible.

In Isaac Asimov’s later Foundation books he touched upon the existence of racial diversity in the future (from what I recall his earlier works from the pulp era were whites-only galaxies). At one point Hari Seldon encounters someone whose physical appearance seems to be East Asian, and they discuss the strangeness of people with East Asian ancestry being termed “Easterners” and those with European appearance being “Westerners.” With a loss of memory of the ancient distribution of these populations on the home planet only the shadow of a semantic recollection exists as a ghost in the galaxy-spanning Empire based out of Trantor. But of course tens of thousands of years in the future, even barring genetic and mechanical modification, it is unlikely that modern racial types will persist in any way we would recognize them.

But these results coming out of ancient DNA are telling us that what is likely to be true for the far future was also true for the recent past. White Europeans are a new type. But so are brown South Asians. Ethiopians have a recent ethnogenesis, as do most North African groups. The Bantu expansion has reshaped the face of Africa on the edge of the historical horizon. And so forth. In the big picture Young Earth Creationists are wrong, but in the specifics the idea that the sons of Noah populated the world ~5,000 years ago is not looking as crazy as it once did! Human genetic variation across Eurasia today may be mostly clinal, but in the recent past it was not. Rather, it was characterized by sharp discontinuities and isolated local populations with diverged ancestry from their neighbors.

*And culture made man in its image*

About ten years ago it was common in paleoanthropology to assume that human beings emerged almost fully formed ~50,000 years ago, and wiped out all the others in a genocidal wave of advance. Richard Klein advanced this model in The Dawn of Human Culture. Klein’s thesis was that some stochastic event, a mutation, resulted in the punctuation of a new species, our own. This singular genetic process allowed for the emergence of fully formed linguistic faculties in our lineage, which allowed for the development of the cultural flexibility which made the rest of the human lineages evolutionary dead ends. It was a simple and elegant story. It appealed to the principle of parsimony. The reality of “archaic” admixture was a difficulty for Klein’s model, evidenced by the fact that he voiced his skepticism of genetic claims of admixture in The New York Times after most others had moved on. For Klein a biological change explained the rise and success of our species, not a cultural one.

At the time I found the thesis compelling. We were after all a very special species. Modern Homo made it to Oceania and the New World. Something must have happened. Something big. What else could explain our rapid expansion and marginalization of other lineages? I’m a biologist, and so biology is an appealing causal mechanism.


*The luck of the English facing the ocean*

At about the same time the evidence for Neanderthal admixture came out, Luke Jostins posted results which showed that other human lineages were also undergoing encephalization, before their trajectory was cut short. That is, their brains were getting bigger before they went extinct. To me this suggested that the broader Homo lineage was undergoing a process of nearly inevitable change due to a series of evolutionary events very deep in our history, perhaps ancestral on the order of millions of years. Along with the evidence for admixture it made me reconsider my priors. Perhaps some Homo lineage was going to expand outward and do what we did, and perhaps it wasn’t inevitable that it was going to be us. Perhaps the Neanderthal Parallax scenario is not as fantastical as we might think?

Consider the case of Europe around 1600. In England and northern Germany (or what was to become northern Germany) you have two Protestant and genetically similar populations. But by 1850 it looked as if England was going to demographically overtake Germany in a broader genetic sense. James Belich’s Replenishing the Earth reviews the history of this period, when England spearheaded a demographic revolution far out of proportion to what one might have predicted in the year 1000. But by 2000 Germany, or Germans, had caught up somewhat. How? Millions of Germans migrated to the United States, starting in very large numbers in the mid-19th century, and were “picked up” by the demographic revolution which was the United States. The point is that contingencies of history, cultural and social, rather than biology, explain the trajectory of the gene pool over time. Much of the human past, and the sharp fluctuations in gene frequencies, might be driven by the long and forceful arm of culture.

In the treatment above I note that the EEF farmers who by and large replaced the indigenous hunter-gatherer groups in modern Southern Europe were themselves a compound. The hunter-gatherer ancestry within the EEF was far more successful than that of those they replaced, but the only reason that this was so was geographic coincidence. The WHG-like groups absorbed into the EEF were positioned further east, and so closer to the initial locus of expansion of Neolithic farmers. Similarly, the Neanderthal admixture into modern populations was almost certainly localized to particular groups. This is not to say that there are no biological differences between human populations which may explain a wide range of phenomena. Anyone looking at the skull of a Neanderthal and a modern human knows there are. There are also likely bio-behavioral differences between extant populations. Gene-culture coevolution is a real process, even if the details need to be worked out. But the interplay between biology and culture is complex, and in many cases cultural changes are driving the biological change, and then fixing differences which are advantageous to the “winners” (lactase persistence seems to be a perfect case of this). But just as in the individual case we must also remember that winning is often in part a function of being lucky. Natural selection, generally thought of as a deterministic process, is also to some extent stochastic [4].

*From genetic islands to a roiling sea of humans*

One of the most shocking things for many of the geneticists working in the area of ancient DNA, and encountering the variation of the past, is the high level of population structure. That is, you have groups co-resident for many generations who nevertheless exhibit genetic distances of intercontinental scale. But as I stated above David Reich himself found the same results for India. And, in Africa you have long symbiotic populations, such as the pygmy groups of the Congo, and their agricultural neighbors, who are genetically very different, and have been for tens of thousands of years. Allentoft et al. dryly observe that “These results are indicative of significant temporal shifts in the gene pools and also reveal that the ancient groups of Eurasia were genetically more structured than contemporary populations.”

About 10 years ago I read Nicholas Dirks’ Castes of Mind. Dirks is an eminent scholar who is now the chancellor of UC Berkeley. He emphasizes the power of European categories and systematization in creating the modern caste system. I don’t want to reduce his argument to a caricature. Obviously caste predates European colonialism. Dirks would admit this. But in Castes of Mind it is hard to shake the feeling that he believes that the British imposition of formalization made it what we truly understand it to be today. That caste has to be understood as a contemporary and early modern phenomenon, rather than an ancient one that was a structural feature of South Asian society.

The genetic evidence is clear now, and it paints a very different landscape. Many of the caste, even jati, boundaries we see today are thousands of years old. Endogamy long predates the British. It may predate the Aryans! Rather than the British, or Aryans, inventing caste, this form of ethnic segregation may date to the initial admixture event, to be reinvented and modified with each new population which arrives and imposes its hegemony on the subcontinent. In The New York Times David Reich states “You have groups which are as genetically distinct as Europeans and East Asians. And they’re living side by side for thousands of years.” He then goes on to say “There’s a breakdown of these cultural barriers, and they mix,” alluding to the rise in WHG ancestry in farmer samples over time. Of course it is interesting to remember that Reich’s work on India has highlighted exactly how persistent caste has been, and how it maintains genetic variation in a localized region that is often nearly inter-continental in magnitude.

We can never know if 6,000 years ago the LBK people, the first farming culture of Northern Europe, imposed a caste-like system of segregation when encountering the indigenous hunter-gatherers. Nor can we say with total confidence whether their relationship exhibited a symbiosis analogous to that between the Bantu agriculturalists and pygmies of the Congo (though do note that in these scenarios the Bantu communities are higher status, and the individual pygmies often have a semi-slave status). But we need to look to what cultural evolutionary models and empirical results can tell us to make sense of these patterns. Ancient DNA can tell us very concretely the details of changes in allele frequencies. We can somewhat confidently reconstruct the faces and complexions of our ancestors. The questions population genomicists ask and answer in relation to animal models are relatively cleanly addressed by these data sets, assuming the sample sizes are large enough. But humans are the cultural animal par excellence, and that is the critical new variable which will require a new set of scholars to come together and create a truly multi-disciplinary understanding of the human past, present, and perhaps future. Powerful genomic techniques which produce results with implications for the study of human history need to leverage the full array of scholars who study human historical science.

1 – The three-fold copying is an important matter, because the different cultures had different preferences and goals. The Arab effort for example focused mostly on the philosophical production of the ancients. Without the Byzantines we would have far less of the humanistic production of Classical Greece, in particular the theatrical tradition.

2 – Much of what is known about the diplomatic history of the Bronze Age Near East has been preserved in cuneiform tablets. Though unwieldy, this form of writing on clay tablets is obviously more robust and less dependent upon copying than parchment and papyrus which came later.

3 – I would be curious to know if it is the same haplotype as is currently common in Eurasia.

4 – A new mutation will usually go extinct in the initial generations, even if it is favored. It is only when its frequency becomes high enough due to chance that selection will inevitably drive that frequency up, perhaps to fixation.
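To put a rough number on this footnote, here is a minimal sketch using Kimura's diffusion approximation for the fixation probability of a single new copy of a favored allele. The assumptions are mine, not anything from the papers discussed above: additive (genic) selection with advantage `s`, a randomly mating diploid population whose effective size equals its census size `n_diploid`, and illustrative parameter values. The function name is hypothetical.

```python
import math


def fixation_probability(s, n_diploid):
    """Kimura's diffusion approximation for one new copy of an allele.

    s: selective advantage (assumed additive); s = 0 is the neutral case.
    n_diploid: number of diploid individuals (effective size assumed
    equal to census size), so the starting frequency is 1 / (2N).
    """
    p0 = 1.0 / (2 * n_diploid)
    if s == 0:
        # A neutral allele fixes with probability equal to its frequency.
        return p0
    # u(p0) = (1 - exp(-4 N s p0)) / (1 - exp(-4 N s)); expm1 keeps precision.
    return math.expm1(-4 * n_diploid * s * p0) / math.expm1(-4 * n_diploid * s)


# Illustrative values only: a 1% advantage in a population of 10,000.
p_fix = fixation_probability(s=0.01, n_diploid=10_000)
print(f"chance of fixation: {p_fix:.4f}")
print(f"chance of being lost: {1 - p_fix:.4f}")
```

For a large population and a small advantage the result is close to Haldane's classic 2s, so even an allele with a 1% advantage is lost about 98 times out of 100. That is the sense in which selection is partly a matter of luck.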

Only the glimpses of the shadows of history

Sometimes what you infer about history is totally wrong. How often? Sometimes we find out. As I’ve outlined on this blog over the course of years inferences made in historical population genetics using extant variation have often turned out to be totally wrong. How do we know? Time machines. Ancient DNA.

Yesterday I received a copy of Sewall Wright and Evolutionary Biology. I read it about 10 years ago, but I didn’t know as much about evolutionary biology back then. So I wanted to get a copy of it (unlike R. A. Fisher: The Life of a Scientist there are actually affordable copies). I decided to get straight to the section which covered the general time period of the Wright-Fisher controversies, when two of the great eminences involved in the development of the field of population genetics were hashing out somewhat different perspectives.

Rather than getting into that, what I want to recount is the passage the author, Will Provine, offers up from Sewall Wright’s personal correspondence which reproduced R. A. Fisher’s last letter to him. One interesting sidebar here is that R. A. Fisher, from all the biographical information I’ve been able to find, was a much more flawed person than Sewall Wright. Fisher was the greater scientist (seeing that he made original contributions to statistics), but Wright was the greater human. After a period of somewhat frequent correspondence Fisher and Wright ceased their direct interaction, right at a time when their differences were being highlighted, leading to decades of ill feeling. In particular, there had been a mixed review of The Genetical Theory of Natural Selection which Wright had submitted to Genetics.

From everything Will Provine knew beforehand he was expecting a rather cold and unfriendly last letter from Fisher to Wright due to the nature of the review. His expectations were totally off base. R. A. Fisher was entirely gracious and good-natured, and seemed appreciative of the review despite its dissents. The lesson that Provine takes from this is that we don’t truly know what we don’t truly see, and we should have greater humility about the darkness outside of the bounds of our direct perception.

Mr. Martin, please acknowledge a “forking” of HBO’s show & the books

There are some of us who have waited years. It has even come to pass that we have grown into manhood, and one generation has waned and another waxed as George R. R. Martin has been weaving his series A Song of Ice and Fire. His “So Spake Martin” once had an oracular aspect for the patient ones. Now we wait…we wait. Years pass.

And yet a new specter haunts the horizon, the specter of HBO. There are those of us who have not followed this show, which began at the beginning of the decade. We have watched with curiosity as the newcomers have reacted with horror at what we who have waited have known for all these years. But we harbor no ill will, we, the waiters. We know. We know the end will come, and a new creation will join the constellation of the volumes which have come before. We wait.

But now there is talk that the show is proceeding ahead of the books, and that our patience is for naught. George R. R. Martin needs to make this clear: are the shows “spoiling” the books, or, is the television series a “fork”? Obviously I’m hoping for the second.

Beyond printing with pixels: internet native science publishing

There is a season for everything. Last year my friend David Mittelman and I teamed up with GigaScience editor Laurie Goodman to write up a commentary in Genome Biology, Dragging scientific publishing into the 21st century. We’re obviously in the 21st century, but for science publishing we’re in the “long twentieth.” But wait, it’s worse! As we noted in the piece, to a great extent the internet is used as a PDF delivery device by many publishers, and the PDF is an electronic form of the classic paper journal article, whose basic outlines were established in the 17th and 18th centuries. In other words, in a qualitative sense we’re not that much beyond the Age of Newton and the heyday of the Royal Society. Scientific publishing today is analogous to “steampunk.” An anachronistic mix of elements somehow persisting deep into the 21st century. Its real purpose is to turn the norms of the past into cold hard cash for large corporations.

Obviously I’m not the only one with this thought. To a great extent PLOS and the open access revolution arose to overturn the procrustean status quo. More recently preprint culture, and the transparency of “personal communication” via Twitter, have changed the terms of discussion. The metabolic pace has increased, and the transparency which breathes life into scientific discourse is on the march. It seems likely that the old order will die a death of a thousand cuts, as one practice after another fades into obscurity.

One of the main weak points of the current framework is that it does not serve the needs of the end user. For many, the goal of getting published is to add a line to the c.v., at least outside of the top-tier journals. This explains the emergence of vanity and fraudulent publishing houses. Many researchers of genuine eminence exist, but for some workaday scientists publishing somewhere will do well enough to keep the salary and perks coming. But science should be more than just a job. Science feeds the spirit of our society, it allows us to see with our mind’s eye how the world truly is. Scientific discussion has to flourish in a manner which is not simply a means to careerist ends.

So back to a specific weakness of the current system: how to engage with the end user? David has assembled a small team to begin actualizing the “wish list” that we outlined in Dragging scientific publishing into the 21st century. That actualization takes the form of a new startup, N of Everyone, which exists to roll out technology to help folks better engage with and discuss science. Their first project is a reader which leverages the way people today actually read “papers.” That is, not simply a pixelization of paper, but a form of engagement with science which actually brings to the table the interactivity which is invited by the nature of electronic media. At many journal clubs people now read “papers” on tablets, notebook computers, and even phones. Why retrofit the print format of yore for the cutting-edge technology of today?

This probably sounds a bit vague and nebulous. To make this concrete N of Everyone is looking to the crowd for support and to raise initial funds. Funding to transform an idea into a reality. There’s already a prototype (I’ve seen it). Imagine leaving comments on specific sentences of papers. Basically, the sort of annotation you already do emailing files or sharing docs when it comes to collaborating to get a publication polished.

Here’s some more information from their Kickstarter page:

  • Share or comment on any sentence in the paper without having to leave the paper
  • Get in-line context for references as well as a map of where those references are discussed throughout the paper.
  • Get in-line context and discussion of figures in a paper as well as the entire discussion for a figure, even if it is distributed throughout the paper.

Get more information and lots of great screenshots at their Kickstarter page, and consider contributing to the project (obviously). There are a lot of ways one can imagine the communication of science going, and it will change. I’m confident of that. David and his partners are attempting to grab the bull by the horns and drive in a fruitful direction. I know they’re passionate about science, and for me that’s key. You can make money in a variety of ways. The reason they’re tackling this project is that this is an issue that’s close to their hearts.

Note: comments are closed to this post. Since David & co. would appreciate feedback I’m sure, I’ll just point you to their Twitter,


Who’s afraid of a hybrid wolf?

Citation: Prüfer, Kay, et al. "The complete genome sequence of a Neanderthal from the Altai Mountains." Nature 505.7481 (2014): 43-49.

Quartz has a quizzical piece up, which is useful for fleshing out the incoherency of some tendencies within conservation biology. It concerns the large coyotes which have been expanding across the eastern United States as forests have taken over abandoned farmland (due to the shift of agricultural activity from New England and the Mid-Atlantic to the Midwest in the 20th century). To no one’s surprise these coyotes are filling the niche of the timber wolves of yore, preying upon the white-tailed deer whose numbers have increased with the rewilding of the landscape. But there’s a twist in this tale:

…scientists have since discovered these super-sized coyotes are only about two-thirds coyote. About 10% of their genes belong to domestic dogs and a quarter comes from wolves, with which they hybridized as they moved east north of the Great Lakes.

Monzón says hybridization enabled eastern coyotes to adapt quickly to fill the niche left by wolves. In fact, areas with the highest densities of deer had coyotes with the greatest proportion of wolf in their genomes. “There was a very rich resource that was waiting to be exploited,” says Monzón. “They’ve done very well here.”

So what’s the problem? As observed later in the piece our own species is to some extent a compound of diverged lineages. This pattern of reticulated ancestry is as old as the evolutionary process itself. The “tree” metaphor was simply a stylized fact which elided detail so we could get to the heart of the matter in attempting to understand in our bones what common descent entailed. But supposedly there are rumblings:

Some scientists and conservationists see the coywolf as a nightmare of the Anthropocene—a poster child of mongrelization as plants and animals reshuffle in response to habitat loss, climate change and invasive species.

I am curious about any scientist that would use the term “mongrelization.” Name the scientists. It strikes me that a ghost-strawman is being set up here. But I’m willing to be proven wrong.

The reality is that many biologists have had issues with the biological species concept as a Platonic ideal, as opposed to an instrumental concept. Plant biologists have never had much truck with a strict form of the biological species concept, and how exactly does it apply to asexual organisms? Our public policy has been built on a narrow conception of biological purity which only holds in a small slice of the tree of life, and even there it is violated constantly. Our own species is here as proof of that.

The end of the bookstore and the end of genre?

The new TNR seems to be a weird mishmash of SJW-clickbait and interesting pieces on aspects of culture which the old TNR probably wouldn’t have thought to publish. I doubt that this version of TNR is long for this world, but I do appreciate pieces such as this conversation between Neil Gaiman and Kazuo Ishiguro, Breaking the Boundaries Between Fantasy and Literary Fiction. It’s rather self-indulgent, but what do you expect? The main question which they circle around is the nature of genre boundaries. This portion really jumped out at me:

NG: I loved the idea, because it seems to me that subject matter doesn’t determine genre. Genres only start existing when there’s enough of them to form a sort of critical mass in a bookshop, and even that can go away. A bookstore worker in America was telling me that he’d worked in Borders when they decided to get rid of their horror section, because people weren’t coming into it. So his job was to take the novels and decide which ones were going to go and live in Science Fiction and Fantasy and which ones were going to Thrillers.

KI: Does that mean horror has disappeared as a genre?

NG: It definitely faded away as a bookshop category, which then meant that a lot of people who had been making their living as horror writers had to decide what they were, because their sales were diminishing. In fact, a lot of novels that are currently being published as thrillers are books that probably would have been published as horror 20 years ago.


When I was an adolescent the way I would decide how to purchase a book, usually a paperback science fiction or fantasy, was to look for specific authors and covers. There wasn’t really that much planning or research ahead of time. There was a great deal of serendipity involved.

Things are different today. Usually before buying something in person I do some research online. Also, recommendation engines are pretty useful, and good at guiding you to a narrower set of choices attuned to your preferences. This obviates the need to some extent for genre categories as guides in the first place.

I’m thinking of this specifically because apparently Spotify Wants Listeners to Break Down Music Barriers (well, according to Farhad Manjoo). It makes sense for Spotify since it has so much more data to work with than old style radio stations. Similarly, at some point Amazon will have enough reading and purchase information to get really good at pointing you to authors and works that are suited to your interests.

Open Thread, 6/7/2015

When you narrow in on a part of science it is easy to lose sight of the rest. That’s how I feel when it comes to Reading in the Brain: The New Science of How We Read. It’s been a while since I read much about cognitive neuroscience, so it’s a novel rediscovery. Though the most interesting point that I’ve internalized so far is that the irrationality of the English spelling system does result in a cost in terms of the amount of time that children must invest to become functionally literate. Apparently the Turkish transition from the Arabic script to the Latin alphabet resulted in immediate yields in literacy gains, illustrating that writing systems are not arbitrarily useful.

Second, another comment about my comments policy:

I’ve been reading your blog for some time. It is always interesting. But one thing frequently puzzles me. The prose in your main posts is always so reasonable and persuasive that I am surprised by your surly, intemperate responses to some of the commenters. Often, as in your reply to Sean, you totally ignore the commenter’s remarks and simply “go postal.” Whether or not Sean has completed “post-doctoral research” really has nothing to do with the quality of his remarks.

First, I know more than you. You really should shut up sometimes. I know the post-doctoral researcher in question, and he’s forgotten more about issues of reproduction than you or Sean have ever known. So I don’t want people to waste his, my, or your time. Don’t think I don’t keep track of commenters and what they’ve said. Comments can’t be evaluated in a singular fashion. If people have commented intelligently or informatively in the past, they get slack. If not, no.

Second, a recent poll suggests that 4 percent of Americans say they are less intelligent than average. It’s pretty obvious that people overestimate their intelligence and knowledgeability, and this includes readers of this weblog. I spend time, away from other activities, writing these posts. If you don’t like them, don’t read them. I don’t care. But if you make a comment don’t waste my time. If you tell me something I don’t know, I’m going to be happy. If you make errors in areas where I am more knowledgeable than you, I will not be happy. Unfortunately many readers of this weblog are used to being the “smartest person” in the room. That means that often they think I should be happy with their incredible comments, when they only usually pass muster because of their pathetic peer groups (this tends to be a major problem with older commenters who are never told that they’re not awesome).

My main posts are “reasonable and persuasive” because I do a lot of reading and thinking. If your comments are well informed and well thought out, I won’t be surly.

Peeling away the past

David at Eurogenes points me to a list of abstracts for ISABS 2015. Three caught my attention, and I will share them.


The most important process in the prehistory of our species is arguably the Neolithization. In the course of 10000 years, it took us from a hunter-gatherer lifestyle to the society we live in today. For Eurasia, Anatolia and the Near East played a key role in this process. It has already been shown that the neolithic expansion from this area and westwards was driven by migration. But we know little about the actual establishing of neolithic societies in Anatolia, and on what kind of population dynamics effected their gene pool. And we also know little more about the Neolithic gene flow from Anatolia than that it had occurred. For the first time we present genomic results from an ancient Anatolian farmer, from Troy’s proto-settlement Kumtepe, and it anchors the European neolithic genepool to Anatolia. Further, the late-neolithic individual from Kumtepe does not only contain the genetic element that is frequent in early European farmers, but also a component found mainly in modern populations from the Near and Middle East and Northern Africa, and to a much smaller degree, in some Neolithic European farmers. The scene presented by Kumtepe is compatible with geneflow into Europe from or through the neolithic core area in Anatolia. And it is likely that this occurred early, perhaps just after the neolithic core area had been established in southeastern Anatolia.


The consequences of the Neolithic transition in Europe – one of the most important cultural changes in human prehistory – is a subject of intense study. However, the consequence of this transition on prehistoric and modern-day people in Iberia, the westernmost frontier of the European continent, remains unresolved. Here we present the first genome-wide sequence data from eight human remains, dated to between 5,500 and 3,500 years before present (Chalcolithic and Bronze Age), excavated in the El Portalón cave at the Sierra de Atapuerca, Spain. We show that these individuals emerged from the same ancestral gene pool as early farmers from other parts of Europe suggesting that migration was the dominant transfer-mode of farming practices throughout western Eurasia. Early farmers, including the El Portalón individuals, were found to have mixed with different local hunter-gatherers as they migrated to different parts of Europe and that the proportion of hunter-gatherer related admixture into farmers increased over the course of two millennia. Among all early farmers, the Chalcolithic El Portalón individuals show the greatest genetic affinity to Basques. These El Portalón genomes reveal important pieces of the demographic history of Iberia and Europe and advance our understanding of the relationship between hunter-gatherers and farmers, and how they relate to modern-day groups.


Different origins of the late hunter-gatherers (HGs) of the Scandinavian Middle Neolithic (5300-4300 BP) Pitted Ware Culture (PWC) have been proposed. While many archaeologists advocate an ancestry in the preceding and partly contemporary farmers of the Funnel Beaker Culture (FBC) ancient genomic data show a strong genetic differentiation between these two groups. It has, in fact, been suggested from genome-wide capture data that PWC are the direct descendants of the Mesolithic HGs from Motala (Sweden) with no additional admixture from other populations. However, by reanalyzing published full genome sequence data from Ajvide (PWC), Gökhem (FBC) and Motala (Mesolithic HG) we gained higher resolution, and found previously unknown differences between Ajvide and Motala. First, Principal Component Analyses show that, even though Ajvide and Motala cluster together, Ajvide is closer to present-day European populations (and to early farmers). Second, D-tests show asymmetric relationships to other groups which would not be expected under a scenario of complete continuity. More specifically we show that i) Gökhem is closer to Ajvide than to Motala, ii) the Siberian Paleolithic HG MA1 is closer to Motala than to Ajvide, and iii) that the Central European Mesolithic HG Loschbour is closer to Ajvide than to Motala. These results indicate a more complex transition from Mesolithic to Neolithic HG societies in Scandinavia and the demographic implications will be discussed.

I believe that papers such as Massive migration from the steppe is a source for Indo-European languages in Europe probably give us a good idea about how the genetic structure of northwest Eurasia came about. At least in the broad outlines. But there are curious details which need to be worked out and explicated. I ordered the abstracts above from the easiest to interpret to the hardest.

First, it does seem clear that the Middle East has changed a fair amount since the first farmers arrived in Europe. These first farmers were geographically Middle Eastern, but they were somewhat different from modern Middle Easterners. Attempts to use modern Middle Eastern populations as proxies for the first farmers gave us a false picture in part because modern Middle Eastern populations themselves are not fossils from the Neolithic. They’ve changed in situ. This is obviously a lesson we should have learned by now. Many populations are not what we think they are.

In particular this has been an issue for the Basques. About ten years ago I read something about Basque culture where the author casually mentioned that aspects of their language give clear evidence as to their origins as the last hunter-gatherers of Europe. We now know that this is probably wrong in many ways. Rather than being the purest distillation of the genetic variation of the hunter-gatherers of Europe (in fact, modern Northern Europeans seem to be genetically closer to hunter-gatherers who lived in the Iberian peninsula than modern Spaniards!), the Basque are likely descendants of one of the early waves of farmers into Europe (at least in large part, we should never forget that some amount of assimilation of the indigenous substrate occurred, especially early on). Earlier work used various genetic characteristics of Basques to assess how “hunter-gatherer” other populations were. The problem with that turns out to be that Basques themselves are not very hunter-gatherer in origin, at least in terms of being a population whose ancestors were mostly hunter-gatherers in Europe at the end of the last Ice Age.

The final paper is something that I really don’t have a good grasp of, or explanation for. But, I will say that Scandinavia is on the edge of the range of humans. Population densities in many areas (though not all) have been rather low for a long time. It is not implausible to me that you’d see lots of population extinctions and replacements over time, leading to a lot of confusing discontinuity.
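For readers who haven’t run into the “D-tests” mentioned in that third abstract, here is a minimal sketch of the idea in Python. The abstract gives no details of the authors’ pipeline, so everything here is an assumption on my part: the function name, the toy genotypes, and the sign convention are all hypothetical. The test counts sites where two populations share an allele that differs from the outgroup; under a clean tree the two patterns (“ABBA” and “BABA”) should be about equally common, and an excess of one suggests an asymmetric relationship.

```python
def d_statistic(sites):
    """Compute a simple D statistic from (P1, P2, P3, outgroup) alleles.

    ABBA: P2 and P3 share an allele that P1 and the outgroup lack.
    BABA: P1 and P3 share an allele that P2 and the outgroup lack.
    Convention assumed here: D > 0 means P3 shares more with P2 than with P1.
    """
    abba = 0
    baba = 0
    for p1, p2, p3, out in sites:
        if p1 == out and p2 == p3 and p2 != out:
            abba += 1
        elif p2 == out and p1 == p3 and p1 != out:
            baba += 1
    if abba + baba == 0:
        raise ValueError("no informative sites")
    return (abba - baba) / (abba + baba)


# Made-up toy data, one allele per population per site (not real genotypes).
toy_sites = [
    ("A", "G", "G", "A"),  # ABBA
    ("A", "G", "G", "A"),  # ABBA
    ("A", "G", "G", "A"),  # ABBA
    ("G", "A", "G", "A"),  # BABA
    ("A", "A", "A", "A"),  # uninformative
    ("G", "G", "A", "A"),  # BBAA, uninformative for D
]

print(d_statistic(toy_sites))  # 3 ABBA vs 1 BABA gives 0.5
```

Real analyses work on allele frequencies across many thousands of sites and attach a standard error (typically via a block jackknife) before calling anything asymmetric. This sketch leaves all of that out.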

California’s receding Hispanic majority

Hispanic as a catchall is a ridiculous term. I was thinking about this a few days ago when I saw this article, Google’s staff worldwide still overwhelmingly white and Asian men, where it actually notes the underrepresentation of “Hispanics.” Why does this matter exactly for an international corporation like Google? Presumably people of Middle Eastern descent (and no, I don’t count Ashkenazi Jews who are not Israeli as Middle Eastern!) are also underrepresented. But through an American prism having a surname from the Iberian peninsula really, really, matters* (also, it can transform people who are 100% white into underrepresented minorities; e.g., a friend who is the grandchild of Jewish refugees to Mexico who proudly checks Latino on demographic boxes to gain diversity points). The term arose almost by happenstance during the Nixon administration in the early 1970s. Though at least unlike the term “Asian American” there is some sort of cultural-historical coherency, as it generally connotes people of Latin American provenance, who are shaped by the events and migrations after 1492.

In any case, I just want to point to a Pew piece today, Will California ever become a majority-Latino state? Maybe not. The key thing to remember is that sensitivity to assumptions is a major issue with all ~50 year projections. And yet the news media, and the general populace, tend to take these extrapolations as fact. “Proven.” It might strike you as ridiculous that seven years can radically overturn earlier initial conditions and qualitatively change predictions such as the “inevitable Hispanic majority” in a state as marinated in Latino culture as California, but that’s the takeaway, it is ridiculous. One should have only marginally more confidence in this projection than the one from 2007.
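To see why half-century projections are so touchy, here is a toy sketch in Python. None of the numbers come from Pew or the Census; the starting share and the growth rates are hypothetical values I picked purely for illustration, and the two-group compound-growth model is a deliberate oversimplification of how real demographic projections are built.

```python
def projected_share(initial_share, group_growth, rest_growth, years):
    """Share of a group after `years` of constant compound growth.

    initial_share: starting fraction of the population in the group.
    group_growth / rest_growth: assumed annual growth rates (e.g. 0.01 = 1%).
    """
    group = initial_share * (1.0 + group_growth) ** years
    rest = (1.0 - initial_share) * (1.0 + rest_growth) ** years
    return group / (group + rest)


# Hypothetical inputs: same starting point, growth assumptions that
# differ by only half a percentage point per year.
START_SHARE = 0.40
YEARS = 50

faster = projected_share(START_SHARE, 0.015, 0.005, YEARS)
slower = projected_share(START_SHARE, 0.010, 0.005, YEARS)

print(f"assuming 1.5% vs 0.5% annual growth: {faster:.1%}")
print(f"assuming 1.0% vs 0.5% annual growth: {slower:.1%}")
```

With these made-up inputs the first scenario ends above one half and the second ends below it. A half-point change in an assumed rate, compounded over fifty years, is enough to flip the headline.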

Remember, historically inevitable forces have a way of being not inevitable. History over timescales that mortals care about tends to be highly contingent. We make our own history. It does not proceed as the tides rise and fall and the sun marches across the sky.

* Latino is often preferred by some as it is more inclusive of Brazilians.

In defense of Paul Ehrlich

When I was younger I was very concerned with overpopulation. Today I am not very concerned. When I was younger I read books such as Paul Ehrlich’s The Population Explosion, and Garrett Hardin’s The Ostrich Factor: Our Population Myopia. It is because I read these books and internalized their lessons that I am not very concerned. You shall judge a prophet by the veracity of his visions, and the predictions of these books have not come to pass in our time. That is also the conclusion of a piece in The New York Times, The Unrealized Horrors of Population Explosion. In it the reporter observes just how much Paul Ehrlich got wrong, and also exposes just how unrepentant he is. It strikes many, including myself, that his belief in his theory, his model, is far more robust than his adherence to the philosophy that one must update one’s expectations with new data. We are all familiar with the fact that evaluated over the past ~10,000 years the human population has exploded. But, over the past 50 years the growth has been far less explosive in relative terms. When Paul Ehrlich wrote his original book, The Population Bomb, in the late 1960s the global fertility rate was ~5. Today it is close to 2.5. When I was born in Bangladesh the fertility rate was close to 7. Today it is close to 2.

The model which Ehrlich and many biologists are enamored of is that of carrying capacity, and the logistic growth curve. It is known to all biologists, and for those in fields such as ecology it permeates their understanding of the phenomena which define our world. In short, density dependent dynamics are such that over time species reach a carrying capacity, where their numbers are held in “check” by exogenous and endogenous forces. The exogenous being resource depletion, and the endogenous being competition within the species. These are “iron laws” of nature, and it is no surprise that the logistic growth model arose in response to the verbal arguments of Thomas Malthus. Paul Ehrlich, and many of his fellow travelers in the 1970s, were basically neo-Malthusians, and Malthusian thinking is only the formalization and explication of ideas which are as old as humanity itself. Hunter-gatherers engage in birth-spacing and infanticide because of considerations of finite resources. The city-states of ancient Greece practiced anti-natalism and encouraged migration because of resource constraints.
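For anyone who hasn’t seen the logistic curve written down, here is a minimal sketch in Python. It uses the textbook form dN/dt = rN(1 - N/K), where N is population size, r is the intrinsic growth rate, and K is the carrying capacity. The parameter values are hypothetical and chosen only to make the shape visible; they are not estimates for humans or any other species.

```python
import math


def logistic_population(t, n0, r, k):
    """Closed-form solution of dN/dt = r*N*(1 - N/K) at time t.

    n0: population at t = 0, r: intrinsic growth rate, k: carrying capacity.
    """
    return k / (1.0 + ((k - n0) / n0) * math.exp(-r * t))


# Hypothetical illustrative parameters.
N0, R, K = 10.0, 0.5, 1000.0
times = list(range(0, 41, 5))
trajectory = [logistic_population(t, N0, R, K) for t in times]

for t, n in zip(times, trajectory):
    print(f"t={t:2d}  N={n:8.1f}")
```

Growth is nearly exponential while N is small relative to K, then flattens as N approaches K. Note that K is a fixed constant in this model, which is exactly the assumption that matters for everything that follows.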

So where did Ehrlich go wrong? Julian Simon, Ehrlich’s nemesis, would say that his model went wrong when it viewed humans as a stress upon finite resources. Rather, humans were the ultimate resource. My friend Ramez Naam wrote a book, The Infinite Resource: The Power of Ideas on a Finite Planet, which outlines a major piece of the puzzle: humans are innovators, and those innovations increase productivity beyond imagining. Without nitrogen-based fertilizer there is almost no way that we’d be able to support the population we have today, to give one example. Economists have attempted to formalize this process of growth through innovation (see Knowledge and the Wealth of Nations).

But there’s another element which is often neglected. Humans seem to reduce their fertility of their own will. That is, the demographic transition. Classical economic and ecological theory would predict that as humans produce more resources, they would produce more offspring. Times of plenty result in plenty of descendants. But what happened in nations like England in the 19th century was that gains in economic production correlated with declines in fertility. Naturally the gains were not swallowed by increased numbers. Rather, each human began to consume more and more, as the size of the pie exploded far faster than the number of humans. We live in a world of resource surplus, as the Malthusian trap was torn apart, and the gap between production and population kept growing.

Which comes to the point of this post: those who dismiss the population doomsayers need to be cautious of their own hubris. First, most of the global decline in poverty has been driven by economic growth in China. And, that economic growth has partially been driven by a demographic dividend derived from the favorable dependency ratio of the last generation. It has been reported that Deng Xiaoping was convinced of the wisdom of the “one-child policy” after observing that the “Asian Dragon” economies all saw benefits from reduced population growth. And it has to be remembered that this policy is coercive in exactly the manner that Paul Ehrlich had recommended.

But that’s a specific, and tendentious, objection. I say tendentious because China was already going into demographic transition, and East Asian nations which did not enact coercive population control also have very low fertility. But it seems plausible that the policy and the coercion had a major effect on the margin. The fertility would have been higher, and the demographic transition less sharp, without the policy. Since China is a nation of over one billion even marginal effects are very important. The second issue is that this is specific and somewhat narrowly focused. The big picture is more important.

Paul Ehrlich himself seems to employ the classic dodge that his predictions will come to pass…you just have to wait long enough. This is a laughable response, even if on some level it is logically coherent. If you wait long enough everything you predict will come to pass. Your credibility stands and falls on whether you can predict it with some level of timely accuracy. It seems that today Ehrlich is resting his case on the fact that estimates always are bracketed by confidence intervals, but if you read his earlier work you’d not get a sense of this at all. What gives? Either he was very confident in the past, and now has simply moved goal posts, or, his writings are a mix of science and ideological polemic. I suspect the truth is a mix of both. But if you talk enough sometimes you’ll land on the truth like a dart on a bullseye.

If you have been reading me for a while you know that one of my favorite books of all time is The Fall of Rome: And the End of Civilization by Bryan Ward-Perkins. The author is an archaeologist, and he reviews the material record, and concludes that contrary to the revisionists and exponents of “Late Antiquity,” a great civilization did fall and decline in the 4th and 5th centuries. Ward-Perkins also emphasizes the Romans of that period did not truly see the great rupture coming. The end of antiquity was a surprise to them, for their world was an eternal one. The Pax Romana had lasted centuries. True, there were interruptions in the calm, such as the Crisis of the Third Century, but they had passed. Though obviously the Roman peace did not engender an affluent consumer society, Ward-Perkins notes that the industrial production in domains such as Britain in the 4th century left evidence in the form of pollution in alluvial deposits which was not matched again until the 18th century!

Focusing upon models such as that of a carrying capacity results in the idea of iron laws which proceed in a deterministic fashion. Neglecting the protean capacity for human innovation, the ability to transform the very parameters of the model itself, is a recipe for looking foolish. But embedded within Paul Ehrlich’s denial of the facts is the deep intuition that social chaos and collapse can come upon us when we’re least expecting it. Human ingenuity is hard to predict, but, it is inevitable. But so are panics and irrational excesses. Innovation and human ingenuity exist in a social context, and that social context may be more easily perturbed than we would like to think. Rather than a clean elegant deterministic model, we need to keep in mind the non-linearities of social processes. The human spirit is the source of our salvation. But it may also be the root of the demons which damn us.