Women hate going to India


For some reason women do not seem to migrate much into South Asia. In the late 2000s I, along with others, noticed a strange discrepancy in the Y and mtDNA lineages, which trace one’s direct male and female lines: in South Asia the male lineages were likely to cluster with populations to the north and west, while the female lines did not. South Asia’s female lines in fact had a closer relationship to the mtDNA lineages of Southeast and East Asia, albeit distantly.

One solution which presented itself was to contend there was no paradox at all: that the Y chromosomal lineages found in South Asia were basal to those to the west and north. In particular, there were some papers suggesting that perhaps R1a1a originated in South Asia at the end of the Pleistocene. Whole genome sequencing of Y chromosomes does not bear this out though. R1a1a went through rapid expansion recently, and ancient DNA has found it in Russia first. But in 2009 David Reich came out with Reconstructing Indian population history, which offered a possible solution.

What Reich and his coworkers found was that South Asia seems to be characterized by the mixture of two very different types of populations. One set, ANI (Ancestral North Indian), is basically another western or northwestern Eurasian group. The other, ASI (Ancestral South Indian), is indigenous, and exhibits distant affinities to the Andaman Islanders. The India-specific mtDNA lineages then were from ASI, while the Y chromosomes with affinities to people to the north and west were from ANI. In other words, the ANI mixture into South Asia was probably through a mass migration of males.

But it’s not just Y and mtDNA in this one case. A minority of South Asians speak Austro-Asiatic languages. The most interesting of these populations are the Munda, who tend to occupy uplands in east-central India. Older books on Indian history often suggest that the Munda are the earliest aboriginals of the subcontinent, but that has to confront the fact that most Austro-Asiatic languages are spoken in Southeast Asia. There was no true consensus on where they were present first.

Genetics seems to have solved this question. The evidence is building up that Austro-Asiatic languages arrived with rice farmers from Southeast Asia. Though most of the ancestry of the Munda is of ANI-ASI mix, a small fraction is clearly East Asian. And interestingly, though they carry no East Asian mtDNA, they do carry East Asian Y. Again, gene flow mediated by males.

The same is true of India’s Bene Israel Jewish community.

A new preprint on bioRxiv confirms that the Parsis are another instance of the same dynamic: The genetic legacy of Zoroastrianism in Iran and India: Insights into population structure, gene flow and selection:

Zoroastrianism is one of the oldest extant religions in the world, originating in Persia (present-day Iran) during the second millennium BCE. Historical records indicate that migrants from Persia brought Zoroastrianism to India, but there is debate over the timing of these migrations. Here we present novel genome-wide autosomal, Y-chromosome and mitochondrial data from Iranian and Indian Zoroastrians and neighbouring modern-day Indian and Iranian populations to conduct the first genome-wide genetic analysis in these groups. Using powerful haplotype-based techniques, we show that Zoroastrians in Iran and India show increased genetic homogeneity relative to other sampled groups in their respective countries, consistent with their current practices of endogamy. Despite this, we show that Indian Zoroastrians (Parsis) intermixed with local groups sometime after their arrival in India, dating this mixture to 690-1390 CE and providing strong evidence that the migrating group was largely comprised of Zoroastrian males. By exploiting the rich information in DNA from ancient human remains, we also highlight admixture in the ancestors of Iranian Zoroastrians dated to 570 BCE-746 CE, older than admixture seen in any other sampled Iranian group, consistent with a long-standing isolation of Zoroastrians from outside groups. Finally, we report genomic regions showing signatures of positive selection in present-day Zoroastrians that might correlate to the prevalence of particular diseases amongst these communities.

The paper uses lots of fancy ChromoPainter methodologies which look at the distributions of haplotypes across populations. But some of the primary results are obvious using much simpler methods.

1) About 2/3 of the ancestry of Indian Parsis derives from an Iranian population
2) About 1/3 of the ancestry of Indian Parsis derives from an Indian population
3) Almost all the Y chromosomes of Indian Parsis can be accounted for by Iranian ancestry
4) Almost all the mtDNA haplogroups of Indian Parsis can be accounted for by Indian ancestry
5) Iranian Zoroastrians are mostly endogamous
6) Genetic isolation has resulted in drift and selection on Zoroastrians

The fact that the ancestry proportion is clearly more than 50% Iranian for Parsis indicates that there was more than one generation of males who migrated. They did not contribute mtDNA, but they did contribute genome-wide to Iranian ancestry. There are wide intervals on the dating of this admixture event, but they are consonant with the oral history that was later written down by the Parsis.
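The arithmetic behind that inference can be sketched in a few lines. This is a toy illustration under assumptions of my own, not the preprint's model: every migrant father is fully Iranian, every founding mother is fully Indian, and a second wave of migrant men marries daughters of the first mixed generation. The function name and the 2/3 share of second-wave families are hypothetical, chosen only to show how a community-wide figure near 2/3 could arise.

```python
from fractions import Fraction


def child_ancestry(father_iranian, mother_iranian):
    """Autosomal Iranian ancestry of a child: the average of both parents."""
    return (Fraction(father_iranian) + Fraction(mother_iranian)) / 2


# Wave 1: Iranian men (ancestry 1) marry local Indian women (ancestry 0).
gen1 = child_ancestry(1, 0)

# Wave 2 (assumed): newly arrived Iranian men marry daughters from wave 1.
gen2 = child_ancestry(1, gen1)

# Hypothetical community in which 2/3 of families descend from wave 2
# pairings and 1/3 from wave 1 pairings.
share_gen2 = Fraction(2, 3)
community = share_gen2 * gen2 + (1 - share_gen2) * gen1

# In both waves the Y chromosome comes from an Iranian father and the
# mtDNA traces back to an Indian founding mother, so the uniparental
# markers stay fully sex-biased while autosomal ancestry rises above 1/2.
print(gen1, gen2, community)  # 1/2 3/4 2/3
```

A single generation of male migrants cannot push the autosomal figure past one half; only repeated male input can, which is the point of the paragraph above.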

So there you have it. Another example of a population formed from admixture because women hate going to India.

Citation: The genetic legacy of Zoroastrianism in Iran and India: Insights into population structure, gene flow and selection.
Saioa Lopez, Mark G Thomas, Lucy van Dorp, Naser Ansari-Pour, Sarah Stewart, Abigail L Jones, Erik Jelinek, Lounes Chikhi, Tudor Parfitt, Neil Bradman, Michael E Weale, Garrett Hellenthal
bioRxiv 128272; doi: https://doi.org/10.1101/128272

Only half of the traffic on this website is from a personal ‘computer’

I spend way too much time semi-competently managing the VPS this site is hosted on. But at least now I can look at Google Analytics. I’ve found some interesting things.

For example, 35% of the traffic on this site comes from phones, and 10% from tablets. That means that conventional computers are only somewhat more than half of the views. Additionally, 50% more of the Facebook shares are via the mobile Facebook app than the normal desktop version (I tend to get the most referrals from Twitter since I have a bigger Twitter following, but at some point I expect Facebook to surpass that as people realize I’m blogging again).

Probably going to make a few changes to make the site more mobile friendly since so many of you tend to read it on those devices….

Evolutionary game theory and international relations

The North Korea Paradox: Why There Are No Good Options:

Denny Roy, a political scientist who studies Asian security issues, told me last fall that North Korea “intentionally employs a posture of seemingly hyper-risk acceptance and willingness to go to war as a means of trying to intimidate its adversaries.”

This puts the world in a quandary: How could any outside threat possibly exceed the risk that North Korea already takes on itself? How could any concession remove the North Korean weakness that drives its behavior?

Basically North Korea is a weak state. Its only leverage is to hold the world hostage and act crazy. Unfortunate, but true.

But this piece reminded me of a lot of the material that John Maynard Smith described in Animal Signals. Sometimes it is the weaker and more vulnerable animals which have to engage in high risk agonistic competition, so that they can show more fit individuals that there is going to be a significant cost in initiating hostilities.

It also reminds me of high school. If you are smaller than average, it is best to make it clear to larger bullies that you won’t be passive. You may lose the fight, but by escalating rapidly you can dissuade a bully from targeting you, as opposed to someone who is more likely to be an easier victim.

Of course, bullies need to be “rational” actors here….
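For a bully who is rational in the narrow expected-value sense, the deterrence logic above can be written down directly. This is a toy sketch of my own, not a model taken from Animal Signals; the function name, the parameters, and all the numbers are invented for illustration.

```python
def attack_pays(prize, p_win, p_escalate, injury_cost):
    """Return True if the aggressor's expected payoff from attacking is positive.

    prize        value of winning the contest
    p_win        probability the aggressor wins
    p_escalate   probability the target escalates into a real fight
    injury_cost  cost the aggressor pays if a real fight happens, win or lose
    """
    expected_gain = p_win * prize
    expected_cost = p_escalate * injury_cost
    return expected_gain - expected_cost > 0


# A passive target: the aggressor usually wins and rarely pays for it.
print(attack_pays(prize=10, p_win=0.9, p_escalate=0.1, injury_cost=20))  # True

# A weaker target known to escalate every time: the aggressor still
# usually wins, but the guaranteed cost makes the attack a bad bet.
print(attack_pays(prize=10, p_win=0.9, p_escalate=1.0, injury_cost=20))  # False
```

The weaker party never becomes likelier to win. All it changes is the aggressor's cost, and that only works if the aggressor actually does this kind of calculation.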

In a hopeless world hope is better than resignation

There’s really nothing one can say anymore about what Hugo Chavez did to his country: No Food, No Medicine, No Respite: A Starving Boy’s Death in Venezuela. But now in France a left-wing politician who praises Chavez is on the rise: Left-Wing Politician Shakes Up France’s Presidential Race:

That man is Jean-Luc Mélenchon, admirer of Fidel Castro and Hugo Chávez, sworn enemy of NATO and high finance, and candidate of his own “France Unsubjugated” movement, who has been drawing tens of thousands to his rallies, especially the young, as he did here Sunday at Toulouse on the banks of the Garonne River. They came to hear a veteran French politician give them a dousing of old-fashioned Robin Hood-revolutionary rhetoric, with promises to tax the rich hard, give to the poor and start a “citizen revolution.”

There is a serious chance that this will be the next president of the French republic. This man, who has no problems being called a Communist. If there is one political system where the experiment has been done, it is command economy socialism. There may be cases of market failure where the state needs to intervene, but by and large an economy dominated by the state has not done well by the common man.

And yet the reality is, what alternatives are the people being given? They are looking Left and looking Right, because they want hope that the future will have some of the promise that the past had. Sober realistic centrists with broadly liberal views offer them only hard truths.

Truths such as this: Evidence That Robots Are Winning the Race for American Jobs. The far Left anti-capitalist program in economics really doesn’t offer a long run path to prosperity. But capitalism itself only leads to individual and broad-based prosperity as a side effect of market logic. If returns to capital could accrue without labor inputs, then that would be even “better.”

The Warlord Chronicles

The Winter is Coming website has a post up, What books should you read as you wait for The Winds of Winter? (The Winds of Winter is the next Song of Ice and Fire book).

I don’t have much time for fiction at this point, but the first entry that they suggested was Bernard Cornwell’s Warlord Chronicles. This is a very dark, gritty, and realistic retelling of the Arthurian legend, written in a fashion more reminiscent of historical fiction than fantasy. I read this series perhaps a year after first reading Game of Thrones, and was struck by similarities of tone.

As it happened this was before George R. R. Martin was quite as famous, and I emailed him at some point in 2000 about various issues relating to his works and inspirations, and asked him about Cornwell’s series. Martin admitted that he was a huge fan, and appreciated that there were similarities of style and tone.

In any case, I second this recommendation. Warlord Chronicles is not the easiest read…but worth it.

Open Thread, 4/16/2017

Happy Easter. Spent most of the day figuring out how to restart Varnish. I don’t really know why there are so many database connection problems and caching…but I inherited the VPS. Might have to bone up on being a sysadmin more. Do any readers know if Varnish is really worth it for a modest site like mine?

Erdogan Claims Vast New Powers After Narrow Victory in Turkish Referendum. First, I have to say that The Future of Freedom: Illiberal Democracy at Home and Abroad is pretty relevant today. Second, Erdogan has shown many faces to the world over the past 15 years. I remember for example him telling people in post-Arab Spring Tunisia that in a free society atheism is a real option (to some criticism).

Are 90% of academic papers really never cited? Reviewing the literature on academic citations. It’s really a problem in the humanities:

Many academic articles are never cited, although I could not find any study with a result as high as 90%. Non-citation rates vary enormously by field. “Only” 12% of medicine articles are not cited, compared to about 82% (!) for the humanities. It’s 27% for natural sciences and 32% for social sciences (cite). For everything except humanities, those numbers are far from 90% but they are still high: One third of social science articles go uncited! Ten points for academia’s critics. Before we slash humanities departments, though, remember that much of their most prestigious research is published in books. On the other hand, at least in literature, many books are rarely cited too.

White supremacist who created stir at Stanislaus State seen punching woman at Berkeley protest. First, please note that this woman went to the protest to get “Nazi scalps” according to her social media. Second, the image of a white supremacist punching an anti-fascist woman is exactly what Sarah Haider told me was going to be a problem with contemporary Leftist valorization of violence: Left-wing organizations have proportionally many more women than right-wing militant organizations, which isn’t an asset in pitched physical combat.

Theresa May’s Conservatives are 21 points ahead of Labour in new poll. I think Scotland will leave the United Kingdom in the next 5 years.

Suzan Mazur interviews Richard Lewontin. I used to think Mazur was exceptional, and she still is, but only in her artlessness in pushing her agenda.

Treasure your exceptions, progress is real but not universal

The beginning of How the Scots Invented the Modern World describes the execution of a man for the crime of atheism around 1700 in Scotland. More precisely, this individual was rather loud about their heresy, and that is always the problem. Silent dissent is usually tolerated. This is the last time someone was executed for this particular crime in the British Isles.

In the United Kingdom the book was more accurately titled The Scottish Enlightenment. A relatively moderate and low fuss affair, the Scottish Enlightenment gave us Adam Smith and David Hume, to name two. The point of the book is that one can see many of the seeds of the liberal Enlightenment in Scotland, which at the beginning of the 18th century was arguably more backward and medieval than its southern neighbor.

But the “modern world” means many different things. Mobile phone technology is ubiquitous, to the point that even the poor in developing nations have it. But a broader consumer affluence is out of reach for many. And the rights and liberties of a liberal democratic order are more an ideal than concrete existence for much of the world’s population (and you have cases such as Saudi Arabia where illiberal norms and politics merge with consumer affluence).

As you may know a young man was killed by his fellow students in Pakistan for the crime of blasphemy a few days ago. Whether he was an atheist or a free thinker, or a skeptic more generally, can be hard to ascertain. But the critical aspect is that he was killed in broad daylight by a mob. He was lynched. This being 2017, you can watch a video of the killing, and hear his screams as he is murdered in front of a crowd.

One aspect people have been noticing is that this was a killing enabled by majority and consensus opinion. Abdul Wali Khan University Mardan is a public university, not a cloistered madrassa or a branch of the Red Mosque network. You can’t blame ISIS or some crazy jihadi network. These were university students.

Pakistan has a population of 182 million. The United States around 300 million. I understand that Americans believe we are the future. That the rest of the world is the exception. But how long will that be? Perhaps the arrow of history is more a circle?

Genetic variation in human populations and individuals


I’m old enough to remember when we didn’t have a good sense of how many genes humans had. I vaguely recall numbers around 100,000 at first, which in hindsight seems rather like a round and large number. A guess. Then it went to 40,000 in the early 2000s and then further until it converged to some number just below 20,000.

But perhaps more fascinating is that we have a much better catalog of the variation across the whole human genome now. Often friends ask me questions of the form: “so DTC genomic company X has about 800,000 SNPs, is that enough to do much?” To answer such a question you need some basic numbers in your head, as well as what you want to “do.”

First, the human genome has about 3 billion base pairs (3 Gb). That’s a lot. But most of the genome famously doesn’t code for proteins. The exome, the portion of the genome where bases directly translate into a protein, accounts for 1% of the whole genome. That’s 30 million bases (30 Mb). But this small region of the genome is very important, as the vast majority of major disease mutations are found in the exome.

When it comes to a standard 800K SNP chip, which samples 800,000 positions across the 3 Gb genome, it is likely that the designers enriched the marker set for functional positions relevant to diseases. Not all marker positions are created equal. Though even outside of those functional positions there are often nearby SNPs that can “tag” them, so you can infer one from the state of the other.

But are 800,000 positions enough to make good ancestry inference (to give one example)? Yes. 800,000 is actually a substantial proportion of the polymorphism in any given genome. There have been some papers which improved on the numbers in 2015’s A global reference for human genetic variation, but it’s still a good comprehensive review to get an order-of-magnitude sense. The table below gives you a sense of individual variation:

Median autosomal variant sites per genome

When it comes to single nucleotide polymorphisms (SNPs), which is what SNP chips are getting at, an 800K array should get a substantial proportion of your genome-wide variation. More than enough for ancestry inference or forensics. The singleton column shows mutations specific to the individual. When focusing on new mutations specific to an individual that might cause disease, singleton large deletions and nonsynonymous SNPs are really where I’d look.

But what about whole populations? The plot to the left shows the count of variants as a function of alternative allele frequency. When we say “SNP”, we really mean variants which exhibit polymorphism at a particular cut-off frequency for the minor allele (often 1%). It is clear that as the minor allele frequency increases in relation to the human reference genome the number of variants decreases.

From the paper:

The majority of variants in the data set are rare: ~64 million autosomal variants have a frequency <0.5%, ~12 million have a frequency between 0.5% and 5%, and only ~8 million have a frequency >5% (Extended Data Fig. 3a). Nevertheless, the majority of variants observed in a single genome are common: just 40,000 to 200,000 of the variants in a typical genome (1–4%) have a frequency <0.5% (Fig. 1c and Extended Data Fig. 3b). As such, we estimate that improved rare variant discovery by deep sequencing our entire sample would at least double the total number of variants in our sample but increase the number of variants in a typical genome by only ~20,000 to 60,000.

An 800K SNP chip will be biased toward the 8 million or so variants with a frequency above 5%. This number gives you a sense of the limited scope of variation in the human genome: those 8 million common sites amount to about 0.27% of the 3 Gb genome, and they capture a lot of the polymorphism.
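The back-of-the-envelope numbers in this post are easy to check. The sketch below uses only figures already quoted here (3 Gb genome, 1% exome, ~8 million common variants, an 800K chip); the variable names are mine, and the "share of common variants" line assumes, purely for illustration, that every chip SNP is one of those common variants.

```python
GENOME_BP = 3_000_000_000    # ~3 Gb human genome
EXOME_PERCENT = 1            # exome is ~1% of the genome
COMMON_VARIANTS = 8_000_000  # variants with frequency >5%, as quoted above
CHIP_SNPS = 800_000          # positions on a standard "800K" array

exome_bp = GENOME_BP * EXOME_PERCENT // 100
common_share_of_genome = COMMON_VARIANTS / GENOME_BP
chip_share_of_genome = CHIP_SNPS / GENOME_BP
# Illustrative assumption: all chip SNPs are drawn from the common variants.
chip_share_of_common = CHIP_SNPS / COMMON_VARIANTS

print(f"exome size: {exome_bp / 1e6:.0f} Mb")                          # 30 Mb
print(f"common variants / genome: {common_share_of_genome:.2%}")       # 0.27%
print(f"chip SNPs / genome: {chip_share_of_genome:.3%}")               # 0.027%
print(f"chip SNPs / common variants: {chip_share_of_common:.0%}")      # 10%
```

So the chip touches a tiny sliver of the genome's positions, but under that assumption it samples roughly one in ten of the common variable sites, which is why it goes so far for ancestry inference.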

Citation: 1000 Genomes Project Consortium. “A global reference for human genetic variation.” Nature 526.7571 (2015): 68-74.

Why overdominance probably isn’t responsible for much polymorphism

Hybrid vigor is a concept that many people have heard of, because it is very useful in agricultural genetics, and makes some intuitive sense. Unfortunately it often gets deployed in a variety of contexts, and its applicability is often overestimated. For example, many people seem to think (from personal communication) that it may somehow be responsible for the genetic variation around us.

This is just not so. As you may know each human carries millions of genetic variants within their genome. Populations have various levels of polymorphism at particular positions in the genome. How’d they get there? In the early days of population genetics there were two broad schools, the “balance” and “classical.” The former made the case for the importance of balancing selection in maintaining variation. The latter suggested that the variation we see around us is simply a transient phase on the way to fixation of a favored mutation from a low frequency, or to extinction of a disfavored variant (perhaps environmental conditions changed and a high frequency variant is now disfavored). Arguably the rise of neutral theory and empirical results from molecular evolution supported the classical model more than the balance framework (at least this was Richard Lewontin’s argument, and I follow his logic here).

But even in relation to alleles which are maintained at polymorphism through balancing selection, overdominance isn’t going to be the major player.

Sickle cell disease is a classic consequence of overdominance: the heterozygote is more fit than either the wild-type homozygote or the mutant homozygote, who suffers from the recessive disease. Polymorphism is maintained despite the decreased fitness of the mutant homozygote because the heterozygote is so much more fit than the wild type. The final proportion of the alleles segregating in the population will be conditional on the fitness drag of the mutant homozygote, because as per Hardy-Weinberg equilibrium (HWE) it will be present in the population at a frequency of ~q², where q is the frequency of the mutant allele.
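To make the dependence on the homozygote's fitness drag concrete, here is a minimal sketch of the standard one-locus overdominance model. The parameterization is the textbook one, not something taken from a specific sickle cell dataset: fitnesses are 1 - s for the wild-type homozygote, 1 for the heterozygote, and 1 - t for the mutant homozygote, and the values of s and t below are hypothetical.

```python
def next_q(q, s, t):
    """One generation of selection on the mutant allele frequency q.

    Genotype fitnesses: wild-type homozygote 1 - s, heterozygote 1,
    mutant homozygote 1 - t. Genotypes are in Hardy-Weinberg proportions
    before selection.
    """
    p = 1.0 - q
    mean_fitness = p * p * (1.0 - s) + 2.0 * p * q + q * q * (1.0 - t)
    return (p * q + q * q * (1.0 - t)) / mean_fitness


def equilibrium_q(s, t):
    """Analytical equilibrium of the model above: q* = s / (s + t)."""
    return s / (s + t)


s, t = 0.1, 0.8  # hypothetical selection coefficients
q = 0.001        # the mutant allele starts rare
for _ in range(2000):
    q = next_q(q, s, t)

# The iterated frequency settles at the analytical equilibrium s / (s + t).
print(round(q, 4), round(equilibrium_q(s, t), 4))
```

The larger the drag t on the mutant homozygote relative to the heterozygote's edge s over the wild type, the lower the frequency at which the mutant allele comes to rest.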

The problem is that this is clearly not going to scale across loci. That is, even if the fitness drag is more minimal than is the case with the sickle cell locus, one can imagine a cumulative situation where many loci each impose their own cost. The segregation load is just going to be too high. Overdominance is probably a transient strategy which fades away as populations evolve more efficient ways to adapt that don’t carry such a fitness load.
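A rough sketch of why the load does not scale, under assumptions that are mine and deliberately simple: each overdominant locus has homozygote fitnesses 1 - s and 1 - t against a heterozygote fitness of 1, loci are independent, and fitness multiplies across loci. In that textbook setup the per-locus load at equilibrium is st / (s + t). The selection coefficients below are hypothetical and far milder than at the sickle cell locus.

```python
def segregation_load(s, t):
    """Per-locus load at the overdominant equilibrium: s * t / (s + t)."""
    return s * t / (s + t)


def mean_relative_fitness(s, t, n_loci):
    """Population mean fitness relative to the best possible genotype
    (heterozygous everywhere), assuming independent loci and
    multiplicative fitness across loci."""
    return (1.0 - segregation_load(s, t)) ** n_loci


# Mild selection: each homozygote is only 2% less fit than the heterozygote.
for n_loci in (1, 100, 1000, 10000):
    print(n_loci, mean_relative_fitness(0.02, 0.02, n_loci))
```

With a 1% load per locus, a thousand such loci already put the average individual at well under a ten-thousandth of the fitness of the ideal genotype, which is not a sustainable way to maintain genome-wide polymorphism.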

So how does balancing selection still lead to variation without heterozygote advantage? W. D. Hamilton argued that much of it was due to negative frequency dependent selection. Co-evolution with pathogens is the best case of this. As strategies get common pathogens adapt to them, so rare strategies encoded by rare alleles gain in fitness. As these alleles increase in frequency their fitness decreases as pathogens adapt to them in turn. Their frequency declines, and eventually the pathogens lose the ability to counter them, and their frequency increases again.
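A minimal sketch of negative frequency dependence, with a caveat up front: this is a toy two-allele haploid model of my own choosing, not Hamilton's host-pathogen model. Each allele's fitness simply falls in proportion to its own current frequency, and there is no time lag for pathogen adaptation, so the frequencies settle at a stable intermediate value instead of cycling as described above. The function names and the strength parameter s are hypothetical.

```python
def next_p(p, s):
    """One generation of selection where each allele is penalized in
    proportion to how common it currently is (strength s)."""
    w_a = 1.0 - s * p          # allele A loses fitness as A gets common
    w_b = 1.0 - s * (1.0 - p)  # the same rule applies to allele B
    return p * w_a / (p * w_a + (1.0 - p) * w_b)


def simulate(p0, s, generations):
    """Iterate next_p from a starting frequency p0."""
    p = p0
    for _ in range(generations):
        p = next_p(p, s)
    return p


# Whether allele A starts rare or common, it is pulled back toward 0.5.
print(simulate(0.01, 0.5, 500))
print(simulate(0.99, 0.5, 500))
```

Neither allele can be lost, because whichever one becomes rare gains the advantage, and no heterozygote (and so no segregation load) is needed to keep the polymorphism in place.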

What if you call for a revolution and no one revolts?

When I was in 8th grade my earth science teacher explained he did not believe in Darwinism. He seemed a reasonable fellow so my first reaction was shock. My best friend at the time, who sat next to me, laughed, “Yeah, some people believe we’re descended from monkeys! Crazy, huh?” I didn’t really know what to say. But what followed was even more confusing to me: my teacher explained that he accepted punctuated equilibrium, not Darwinism. He did not elaborate much beyond this, though I tried to get at what he believed after class in the few minutes I had.

Later on I realized that he had drunk deeply at the well of Stephen Jay Gould, paleontologist and polymath. I will quote Richard Lewontin, Gould’s longtime collaborator and friend:

Now I should warn you about my prejudices. Steve and I taught evolution together for years and in a sense we struggled in class constantly because Steve, in my view, was preoccupied with the desire to be considered a very original and great evolutionary theorist. So he would exaggerate and even caricature certain features, which are true but not the way you want to present them. For example, punctuated equilibrium, one of his favorites. He would go to the blackboard and show a trait rising gradually and then becoming completely flat for a while with no change at all, and then rising quickly and then completely flat, etc. which is a kind of caricature of the fact that there is variability in the evolution of traits, sometimes faster and sometimes slower, but which he made into punctuated equilibrium literally. Then I would have to get up in class and say “Don’t take this caricature too seriously. It really looks like this…” and I would make some more gradual variable rates. Steve and I had that kind of struggle constantly. He would fasten on a particular interesting aspect of the evolutionary process and then make it into a kind of rigid, almost vacuous rule, because—now I have to say that this is my view—I have no demonstration of it—that Steve was really preoccupied by becoming a famous evolutionist.

Gould succeeded, after a fashion. His reputation within evolutionary biology is mixed, at best. Just look at what someone who thinks he made genuine original contributions to science admits above. But in the mind of the public Stephen Jay Gould was an oracle of sorts.

A revolution is sexy. A revolution sells. Having read both of them, I would say that Richard Dawkins is the better stylist when compared to Gould. Additionally, though some might disagree with this, Dawkins is closer to the mainline of the modern evolutionary biological tradition than Gould. But in the United States Gould far overshadowed Dawkins…until the latter began to make a name for himself as an anti-religion polemicist in the 2000s. Revolution. Controversy. They’re salient. The press eats it up, and the public trusts the press.

And some things never change. Every few years there is an impending “revolution” in evolutionary biology or genetics. But the revolution is mostly in the minds of a few journalists, and a public that reads a little too much into a puff piece here and there. The sort of well educated public woolly on what the “central dogma” is, but clear that it has been overthrown.

Sometimes this gets out of control. Suzan Mazur’s The Altenberg 16: An Exposé of the Evolution Industry is probably the weirdest instance of this genre of “the sky is falling in evolutionary theory!” But of late some scholars have been coming out with more sober critiques, arguing that the Neo-Darwinian Synthesis needs to be extended or modified significantly. Kevin Laland’s Darwin’s Unfinished Symphony: How Culture Made the Human Mind is the latest instance of this, but this was preceded by Evolution in Four Dimensions: Genetic, Epigenetic, Behavioral, and Symbolic Variation in the History of Life. You can also read David Dobbs’ sympathetic treatment from a few years back around this issue.

I can communicate to you what seems to be the majority view among the evolutionary biologists I know: there isn’t a need for a revolution in conceptual thought, just a working out of details and reallocation of resources. Many who are sympathetic to Kevin Laland’s argument still believe that it’s about emphases and semantics. There’s no reason to put out a clarion call that evolution needs to be rethought in its conceptual foundations.

Honestly I don’t know if there’s been much that is revolutionary conceptually since the original period of the synthesis. Perhaps the rise of molecular evolution and neutrality as a null hypothesis? But even I’m not sure about that.

Erik I. Svensson has put up a preprint which speaks for many people, On reciprocal causation in the evolutionary process. Read the whole thing, it’s thorough, and accessible to a lay audience. The main aspect a bit surprising to me is the good word put in for The Dialectical Biologist, which I have heard is an interesting book:

Recent calls for a revision the standard evolutionary theory (ST) are based on arguments about the reciprocal causation of evolutionary phenomena. Reciprocal causation means that cause-effect relationships are obscured, as a cause could later become an effect and vice versa. Such dynamic cause-effect relationships raises questions about the distinction between proximate and ultimate causes, as originally formulated by Ernst Mayr. They have also motivated some biologists and philosophers to argue for an Extended Evolutionary Synthesis (EES). Such an EES will supposedly replace the Modern Synthesis (MS), with its claimed focus on unidirectional causation. I critically examine this conjecture by the proponents of the EES, and conclude, on the contrary, that reciprocal causation has long been recognized as important in ST and in the MS tradition. Numerous empirical examples of reciprocal causation in the form of positive and negative feedbacks now exists from both natural and laboratory systems. Reciprocal causation has been explicitly incorporated in mathematical models of coevolutionary arms races, frequency-dependent selection and sexual selection. Such feedbacks were already recognized by Richard Levins and Richard Lewontin, long before the call for an EES and the associated concept of niche construction. Reciprocal causation and feedbacks is therefore one of the few contributions of dialectical thinking and Marxist philosophy in evolutionary theory, and should be recognized as such. While reciprocal causation have helped us to understand many evolutionary processes, I caution against its extension to heredity and directed development if such an extension involves futile attempts to restore Lamarckian or soft inheritance.