Homo sapiens’ star-shaped phylogeny

Citation: Zerjal, Tatiana, et al. "The genetic legacy of the Mongols." The American Journal of Human Genetics 72.3 (2003): 717-721.

The above is a figure from The genetic legacy of the Mongols, and illustrates the concept of a “star-shaped phylogeny.” This is basically a phenomenon where a massively rapid demographic expansion of one particular lineage results in a host of nearly simultaneous mutational events which derive from the ancestral state. It is illustrated by the topology above, where derived states cluster around the ancestral type. Obviously it is harder to characterize the sequential structure of lineage fissions in these circumstances.

I am beginning to think that some of the same issues may apply to the expansion of Homo sapiens sapiens 40 to 60 thousand years ago. The thought was triggered by the recent abstract at SMBE 2014 on the 45 thousand year old Siberian whose whole genome was sequenced:

The complete genome sequence of a 45,000-year-old modern human from Eurasia

We have sequenced to high coverage the genome of a femur recently discovered near Ust-Ishim in Siberia. The bone was directly carbon-dated to 45,000 years before present. Analyses of the relationship of the Ust-Ishim individual to present-day humans show that he is closely related to the ancestral population shared between present-day Europeans and present-day Asians. The over-all amount of genomic admixture from Neandertals is similar to that in present-day non-Africans and there is no evidence for admixture from Denisovans. However, the size of the genomic segments of Neandertal ancestry in the Ust-Ishim individual is substantially larger than in present-day individuals. From the size distribution of these segments we estimated that this individual lived about 200-400 generations after the admixture with Neandertals occurred. The age of this genome allows us to directly assess the mutation rate in the different compartments of the human genome. These results will be presented and discussed.

200-400 generations means 5 to 10 thousand years, at about 25 years per generation. So the implication is that the admixture event which led to the Neandertal ancestry in non-Africans dates to 50 to 55 thousand years before the present. Australia was settled by modern humans ~45 thousand years before the present, while western Europe was settled ~35 thousand years before the present. Ancient DNA from China 40 thousand years ago already suggests that East Eurasians had begun to diverge as an independent lineage. Joe Pickrell says that when he saw the poster the ancient Siberian seemed more closely related to one of the two dominant Eurasian lineages. I suspect it would be to West Eurasians, as the Mal’ta Paleo-Siberian from 22 thousand years ago was. Overall the picture seems to be that many of the ancestral lineages which are geographically distinct across Eurasia and Oceania had already come into being in the interval between the admixture event with Neandertals 55 thousand years ago and the 45 thousand year old Siberian. Because these lineages diverged so rapidly in sequence you sometimes end up with polytomies, where the phylogeny cannot be fully resolved. With the emergence of ancient DNA and whole genome sequences I believe that this issue will mostly be overcome, but it explains why different methods of inference have given somewhat different results (e.g., the old question of whether Oceanians are an outgroup to Eurasians or more similar to East Eurasians).
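The arithmetic behind that date range can be sketched in a few lines. The 25-year generation time is my assumption (the abstract does not state one), and the function name is purely illustrative.

```python
# Assumption: 25 years per generation, which reproduces the 5 to 10 thousand
# year figure in the text. The abstract itself gives no generation time.
YEARS_PER_GENERATION = 25


def admixture_date_range(sample_age_years, min_generations, max_generations,
                         years_per_generation=YEARS_PER_GENERATION):
    """Return (youngest, oldest) admixture dates in years before present.

    The sample's own age is added to the time elapsed between the
    admixture event and the individual's lifetime.
    """
    youngest = sample_age_years + min_generations * years_per_generation
    oldest = sample_age_years + max_generations * years_per_generation
    return youngest, oldest


# Ust-Ishim: dated to 45,000 years, living 200-400 generations after admixture.
youngest, oldest = admixture_date_range(45_000, 200, 400)
print(youngest, oldest)  # 50000 55000
```

A longer assumed generation time pushes the whole range older, so the 50 to 55 thousand year window should be read as a rough bracket rather than a precise estimate.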

Finally, there’s the issue of why these neo-Africans were so rapid in their spread and demographic dominance ~50 thousand years ago. Probably the dominant position, most forcefully articulated by Richard Klein in The Dawn of Human Culture, is that Homo sapiens sapiens has biological competencies which allowed them to marginalize other hominins. This is obviously one reason that some geneticists are trying to find specific differences between the genomes of our own lineage and that of our cousins. But if researchers focused on modern human lineages they wouldn’t present a biological explanation at all, but a cultural one. How do you set up your priors? Because many people presumed that the emergence of sapiens sapiens was a speciation event it seemed that a biological explanation was more plausible. I don’t deny that that’s the case, and that we should weight cultural explanations more strongly when it comes to something like the Austronesian expansion (first, the genetic difference between Austro-Asiatic Southeast Asians and Austronesians is not that great at all). But perhaps we shouldn’t dismiss the possibility of some cultural innovation as being the root of the neo-African advantage? I believe we need to start thinking more systematically about the expansions of hominin lineages since the early Pleistocene.

1177 B.C., the year civilization did not collapse

Recently I read Eric Cline’s 1177 B.C.: The Year Civilization Collapsed. It’s a short book. If you are looking to familiarize yourself with the history and culture of the Bronze Age Near East in a format which isn’t a scholarly monograph, this is a good book for that, best read in complement with Robert Drews’ End of the Bronze Age (also see The Coming of the Greeks). If you are looking to understand why the complex of Near Eastern societies, spanning Mycenaean Greece to Babylon and Egypt, went into severe regress in the 12th century, this is not the book. Cline is good at stringing you along, but at the end of the day he doesn’t come to a definitive conclusion.

Which is fine, as it seems unlikely that there is a definitive answer at this point. But the setup leads you to some disappointment, as there is the standard discussion about the enigmatic Sea Peoples. Cline suggests that there isn’t one reason (e.g., drought, new techniques of warfare, economic shifts brought about by the rise of iron), but a complex set of interlocking contingencies which set off a chain reaction which brought the globalized world of the 2nd millennium B.C. down. In some ways this is complementary to Brian Fagan’s thesis that sophisticated civilizations develop ways to buffer themselves against the natural fluctuations which might result in population decrease in small-scale societies, only to squeeze themselves so tightly against the Malthusian limit that they are brought down by a mild perturbation which fractures a fragile system. In contrast, in The Human Web William McNeill argues that various innovations, both material and cultural, produced increased social complexity which entailed a robustness of “civilization” against a Dark Age.

The “Dark Age” between the 12th and 8th centuries BC is one of McNeill’s examples of a relatively loose network of societies which were not well enough integrated to prevent an extreme regression. This is most evident in Greece, where writing disappeared, and the world of the Mycenaeans was a legendary one to the Classical Greeks. In some ways the lived world of 5th century Athens is more alive to us, through the great works of philosophy and literature of specific individuals such as Plato and Euripides, than the world of 12th century Athens was to the citizens of Pericles’ world. In 1177 B.C. Cline observes that certain techniques of architecture normal in the Mycenaean period were assumed by their Iron Age descendants to have been performed by giants, while the idea of a king, wanax, disappeared as the norm among the city-states of the Classical period. We know that elements of Greek identity persisted through the barbaric interregnum, because the Linear B script of the Mycenaeans has been deciphered, but to a great extent the Classical Hellenes suffered from cultural amnesia. In some very deep ways they lost their sense of self and were reborn, rather than reformed, after the Dark Age.

And yet civilization did not collapse. It maintained genuine continuity in places such as Egypt and Assyria across the Bronze to Iron Age, and eventually these societies played critical roles in the cultural efflorescence which gave rise to the Axial Age. Arguably 1177 was notable because civilization did not collapse. It seems likely that proto-civilizations did die earlier, lost to history.* But by the late Bronze Age the network of societies around the eastern Mediterranean and out toward Mesopotamia was thick enough that they served as redundant informational nodes. Eventually the Greeks rediscovered their cultural genius, thanks to outside stimuli such as the alphabetic system, which persisted in the Levant despite the depredations of the Sea Peoples and the collapse of the protective umbrella of erstwhile hegemonic powers such as the Hittites and Egyptians.

* There seems to have been an expansion of Mesopotamian influence in Anatolia in the 4th millennium which ceased because of a collapse. We will never know the details of this because this civilization likely never left descendants.

How much informative “structure” is in the HGDP data set?

A few weeks ago people were arguing about the utility of the model-based clustering packages which produce intuitive bar plots which break down individual and population percentages. To understand the fundamental basis of these packages I’ll refer you to the original Pritchard et al. paper. As you probably know at this point one of the major parameters of the packages is the K value, which refers to the number of populations which are going to be assumed as the constituents of the genetic variation. A key point is that those who use the packages are forcing the variation to fit a particular model. You can take the data for Icelanders, to pick an example, and set K = 100. It will produce results, but I suspect you’ll intuit that this really isn’t the best model in terms of fitting reality. Similarly, you can take a population of Northern Europeans, West Africans, and East Asians, and set K = 2. This will likely separate the Eurasians from the Africans, as that’s the natural phylogenetic affinity. But K = 3 is probably a better fit to the data. By this, I mean that Northern Europeans and East Asians are not, and have not been for a long time, random mating populations. K = 3 reflects this reality.

So far this is intuitive. Is there a formal way to check this? Yes. A variety. Structure outputs log likelihoods for each K. Admixture gives you cross-validation errors. For a full treatment of how Admixture estimates cross-validation error see Alexander et al. An intuitive way to think about how you should interpret these values is that they are giving you a sense of where you are trying to squeeze too many K’s out of the data set. Admixture’s cross-validation value has a simple interpretation: look for the lowest point on the graph.
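To make “look for the lowest point” concrete, here is a minimal sketch that picks the K with the smallest cross-validation error. It assumes you have collected log lines of the shape `CV error (K=3): 0.52345` from your runs; the helper names and the numbers in the example log are made up for illustration.

```python
import re

# Assumed log-line shape, e.g. "CV error (K=3): 0.52345".
CV_LINE = re.compile(r"CV error \(K=(\d+)\): ([0-9.]+)")


def parse_cv_errors(log_text):
    """Map each K found in the log text to its cross-validation error."""
    errors = {}
    for match in CV_LINE.finditer(log_text):
        errors[int(match.group(1))] = float(match.group(2))
    return errors


def best_k(cv_errors):
    """Return the K whose cross-validation error is lowest."""
    return min(cv_errors, key=cv_errors.get)


# Hypothetical values, not results from any real data set.
example_log = """\
CV error (K=2): 0.5612
CV error (K=3): 0.5420
CV error (K=4): 0.5431
CV error (K=5): 0.5478
"""

errors = parse_cv_errors(example_log)
print(best_k(errors))  # 3
```

In practice the curve is often shallow near the minimum, so it is worth plotting the errors against K and eyeballing the neighbors rather than trusting a single number.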

Going back to the HGDP data set I wanted to know where that point on the scale of K’s was. Looking over the populations I assumed more than 5, but likely less than 20. That wide range tells you that I don’t honestly have a good intuition (some distinct populations are going to be hard to separate in pooled data sets because there hasn’t been much time since divergence, or they are not really genetically separate populations).


Surfing in from the east

Citation: European Journal of Human Genetics advance online publication, 4 June 2014; doi:10.1038/ejhg.2014.106

Since I am now the father of a son my Y chromosomal haplogroup, R1a1a, has replicated itself one more time. That’s not a big deal seeing as it is probably the most widespread Eurasian paternal lineage. But why this particular lineage has the distribution it does is interesting and complex. In the early 2000s Spencer Wells published The Eurasian Heartland: A continental perspective on Y-chromosome diversity, which seemed to suggest that its expansion was due to that of the Indo-Europeans. Others have argued for an earlier diversification, due to divergences between European and Asian branches of R1a1a. A new paper puts these debates into deeper historical perspective, and nicely sums up where we are with uniparental lineages. Improved phylogenetic resolution and rapid diversification of Y-chromosome haplogroup K-M526 in Southeast Asia, with relevant bits:

These haploid systems are subject to stronger genetic drift and large evolutionary variance, and may not render accurate signals of population processes by themselves. Yet, the considerable geographic structure at these loci suggests that current patterns of variation may be informative of past population processes. With the implicit assumption that groups dispersing in the Pleistocene were small and experienced strong and long-lasting bottlenecks, patterns of mtDNA and NRY variation have been deemed useful as starting points to formulate hypotheses about human demographic history.

Note the caution. Also, I would take a slight issue with the last quoted sentence: the reason that haploid lineages were useful starting points had less to do with their utility in reconstructing the history of populations and more to do with technical constraints. Mitochondrial DNA is famously copious in comparison to nuclear DNA, while the nonrecombining nature of uniparental lineages makes them very amenable to transparent tree building. Rather than starting points, mtDNA and Y chromosomal lineages should be seen as informative supplements. With the rise of dense SNP chip technologies, and now whole genome sequencing, that’s what they’re becoming.

This particular paper is interesting insofar as it traces back the emergence of the ancestor of R1a1a, and more generally R, among the panoply of non-African haplogroups. You can see on the map above that the light blue is basically absent from eastern Eurasia. Those are the R and Q lineages. The diversity of lineages in southeast Asia is very suggestive. Usually where lineages diversify they’ve been around for a while. Areas with lower diversity have often been settled later, and gone through diversity-decreasing bottlenecks and such. The authors conclude:

In sum, our results support the hypothesis of a Southeast Asian/Oceanian center for the diversification of Oceanian K-haplogroup lineages and underscore the potential importance of Southeast Asia as a source of genetic variation for Eurasian populations.

The K group in question being the broader lineage of which R is just a subset. I’m not sure that I buy the specific model here, but do note that the ~20,000 year old Siberian remains seem to have been of haplogroup R. And in the above paper they state “Interestingly, ancient DNA evidence suggests that haplogroup R1b – the current dominant lineage in western Europe – did not reach high frequencies until after the European Neolithic period as given in Lacan et al 26,27 and Pinhasi et al. 28” In other words the results above would have seemed really strange a few years ago, as one would have previously thought R was an ancient western Eurasian lineage just by its distribution. And it is today clearly a western Eurasian lineage, but did it always have to be so?

Recall that Y lineages are subject to “stronger genetic drift and large evolutionary variance.” The people today who tend to be carriers of the R lineages don’t exhibit any strong connection to the peoples of southeast Asia. But this one lineage may have risen in frequency among elite males at some point in a particular Eurasian population which came to be dominant. Rapid spread of males and constant intermarriage would dilute the whole genome signal quite rapidly, but the Y chromosomal lineage can maintain itself in the face of this because it does not mix. It replaces or is replaced. Chance may have increased the frequency of the eastern R lineage, but eventually it hitchhiked with destiny.
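To put a rough number on “stronger genetic drift,” here is a minimal sketch comparing how quickly diversity is expected to decay at an autosomal locus versus the Y chromosome. It assumes an idealized Wright-Fisher population with an equal sex ratio and no mutation, selection, or migration, none of which comes from the paper; the function names are illustrative.

```python
def expected_diversity(h0, gene_copies, generations):
    """Expected diversity after pure drift: H_t = H_0 * (1 - 1/copies)^t.

    'Diversity' is the chance that two randomly drawn copies differ.
    """
    return h0 * (1.0 - 1.0 / gene_copies) ** generations


def compare_drift(breeding_individuals, generations, h0=0.5):
    """Return (autosomal, y_chromosome) expected diversity.

    Assumes an equal sex ratio: every person carries two autosomal
    copies, while only the male half carries a single Y copy.
    """
    autosomal_copies = 2 * breeding_individuals
    y_copies = breeding_individuals // 2
    return (expected_diversity(h0, autosomal_copies, generations),
            expected_diversity(h0, y_copies, generations))


autosomal, y_chromosome = compare_drift(breeding_individuals=1000, generations=500)
print(round(autosomal, 3), round(y_chromosome, 3))
```

Because the Y has a quarter as many copies in the population as an autosome under these assumptions, its diversity erodes about four times faster per generation, which is why a single paternal lineage can sweep to high frequency while the rest of the genome tells a different story.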

A new business

Little Lord Khan

Busy times in the household, in case you are wondering about a dropoff in the posting frequency. We are pleased so far with the results. The only notable phenotypic distinction in relation to his sister is his relatively long dark hair at birth. It is difficult in these early days to not think of an Asian warlord when taking in his visage, though with his name that might be appropriate.

The eternal dynamic of inequality

Bill Gates' mugshot

Inequality is a big deal today. It (or rather its persistence) was the subject of Greg Clark’s most recent book, The Son Also Rises, and more famously of Thomas Piketty’s Capital in the Twenty-First Century. And obviously it is at the center of many contemporary policy debates. But to frame these modern arguments we need to get a sense of inequality’s natural history. In Clark’s previous book, A Farewell to Alms, he reported the standard economic historical finding that agricultural societies had high rates of inequality, which began to drop as societies modernized through industrialization. The wage gap between skilled and non-skilled workers in Britain dropped from ~1800 AD to ~1970, only rising again over the past two generations.