When journalists get out of their depth on genetic genealogy

For some reason The New York Times tasked Gina Kolata with covering genetic genealogy and its societal ramifications, in With a Simple DNA Test, Family Histories Are Rewritten. The problem here is that to my knowledge Kolata doesn’t cover this as part of her beat, and so isn’t well equipped to write an accurate and in-depth piece on the topic in relation to the science.

This is a general problem in journalism. I notice it most often when it comes to genetics (a topic I know a lot about for professional reasons) and the Middle East and Islam (topics I know a lot about because I’m interested in them). It’s unfortunate, but it has also made me a lot more skeptical of journalists whose track record I’m unfamiliar with.* To give a contrasting example, Christine Kenneally is a journalist without a background in genetics who nevertheless is immersed in genetic genealogy, so that she could have written this sort of piece without objection from the likes of me (she did write a book on the topic, The Invisible History of the Human Race: How DNA and History Shape Our Identities and Our Futures, which I had a small role in fact-checking).

What are the problems with the Kolata piece? I think the biggest issue is that she didn’t go in to test any particular proposition, and leaned on the wrong person for the science. She quotes Joe Pickrell, who knows this stuff like the back of his hand. But more space is given to Jonathan Marks, an anthropologist who is quite opinionated and voluble, and so probably a “good source” for any journalist.

Marks seems well respected in anthropology from what I can tell, but he’s also the person who put up a picture of L. L. Cavalli-Sforza juxtaposed with a photo of Josef Mengele in the late 1990s during a presentation at Stanford. Perhaps this is why anthropologists respect him, I don’t know, but I do not like him because of his nasty tactics (his rhetoric is often so unhinged that I wouldn’t be surprised if Marks, had he the power, would make sure people like me were put in political prison camps).

Marks’ quotes wouldn’t be much of an issue if Kolata could figure out when he’s making sense, and when he’s just bullshitting. But she can’t. For example:

…“tells me I’m 95 percent Ashkenazi Jewish and 5 percent Korean, is that really different from 100 percent Ashkenazi Jewish and zero percent Korean?”

The precise numbers offered by some testing services raise eyebrows among genetics researchers. “It’s all privatized science, and the algorithms are not generally available for peer review,” Dr. Marks said.

The part about precise numbers is an issue, though a lot less of an issue with high-density SNP chips (the real issue is sensitivity to reference population and other such parameters). But if a modern test says you are 95 percent Ashkenazi Jewish and 5 percent Korean, it really is different from 100 percent Ashkenazi. Someone who comes up as 5 percent Korean against an Ashkenazi Jewish background is most definitely of some East Asian heritage. In the early 2000s, with ancestry informative markers and microsatellite-based tests, you’d get somewhat weird results like this, but with the methods used by the major DTC companies (and in academia) today these sorts of proportions are just not reported as false positives. Marks may not know because this isn’t his area, but Pickrell would have. Kolata probably did not think to double-check with him, but that’s because she isn’t able to smell out tendentious assertions. She has no feel for the science, and is flying blind.
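
To make the point about marker density concrete, here is a toy simulation. Everything in it is an assumption for illustration: the two made-up reference populations, their allele frequencies, the independence of the markers, and the simple least-squares estimator. It is not the method any DTC company uses (those lean on haplotypes and much richer reference panels). It only sketches how the noise in an estimated ancestry fraction shrinks as the number of markers grows.

```python
import random
import statistics


def estimate_minor_fraction(n_markers, true_q, rng):
    """Simulate one individual and estimate their fraction of population-2 ancestry.

    Toy assumptions: markers are independent, reference allele frequencies
    (p1, p2) are known exactly, and the individual's expected allele
    frequency at each marker is (1 - q) * p1 + q * p2.
    """
    num = 0.0
    den = 0.0
    for _ in range(n_markers):
        # Hypothetical reference frequencies for the two populations.
        p1 = rng.uniform(0.05, 0.95)
        p2 = min(0.99, max(0.01, p1 + rng.gauss(0.0, 0.15)))
        # Draw a diploid genotype (0, 1 or 2 copies of the allele).
        p = (1.0 - true_q) * p1 + true_q * p2
        genotype = int(rng.random() < p) + int(rng.random() < p)
        # Least-squares estimator: regress (genotype/2 - p1) on (p2 - p1).
        d = p2 - p1
        num += (genotype / 2.0 - p1) * d
        den += d * d
    return num / den


def mean_and_spread(n_markers, true_q, n_reps, seed):
    """Repeat the simulation and report the mean and standard deviation of the estimates."""
    rng = random.Random(seed)
    estimates = [estimate_minor_fraction(n_markers, true_q, rng) for _ in range(n_reps)]
    return statistics.mean(estimates), statistics.stdev(estimates)


if __name__ == "__main__":
    # A small marker panel versus a dense one, for a truly 5% individual.
    for n_markers, n_reps in [(100, 200), (50_000, 10)]:
        mean, sd = mean_and_spread(n_markers, true_q=0.05, n_reps=n_reps, seed=1)
        print(f"{n_markers:>6} markers: mean estimate {mean:.3f}, spread {sd:.3f}")
```

In this toy setup the spread with a hundred or so markers is large relative to 5 percent, so 0 and 5 percent are hard to tell apart, while with tens of thousands of markers the spread is small relative to 5 percent. Real chips have linked markers, so the effective number of independent markers is lower than the raw count, but the direction of the effect is the same.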

Second, Marks notes that the science is privatized, and it isn’t totally open. But it’s just false that the algorithms are not generally available for peer review. Not all the details of the pipeline are downloadable on GitHub, but the core ancestry estimation methods are well known. Eric Durand, who wrote the original 23andMe Ancestry Composition methodology, presented on it at ASHG 2013. I know because I was there during his session.

You can find a white paper for 23andMe’s method and Ancestry’s. Not everything is as transparent as open science would dictate (though there are scientific papers and publications which also mask or hide elements, which makes reproducibility difficult), but most geneticists with domain experience can figure out what’s going on and whether it is legitimate. It is. The people who work at the major DTC companies often come out of academia, and are known to academic scientists. This isn’t black-box voodoo science like “soccer genomics.”

Then Marks says this really weird thing:

“That’s why their ads always specify that this is for recreational purposes only: lawyer-speak for, ‘These results have no scientific standing.’”

Actually, it’s lawyer-speak for “do not sue us, as we aren’t providing you actionable information.” Perhaps I’m ignorant, but lawyers don’t get to define “scientific standing”.

The problem, which is real, is that the public is sometimes not entirely clear on what the science is saying. This is a problem of communication from the companies to the public. I’ve even been in scientific sessions where geneticists who don’t work in population genomics have weak intuitions about what the results mean!

Earlier Kolata states:

Scientists simply do not have good data on the genetic characteristics of particular countries in, say, East Africa or East Asia. Even in more developed regions, distinguishing between Polish and, for instance, Russian heritage is inexact at best.

This is not totally true. We have good data now on China and Japan. Korea also has some data. Using haplotype-based methods you can do a lot of interesting things, including distinguishing someone who is Polish from someone who is Russian. But these methods are computationally expensive and require lots of information on the reference samples (Living DNA does this for British people). The point is that the science is there. Reading this sort of article is just going to confuse people.

On the other hand a lot of Kolata’s piece is more human interest. The standard stuff about finding long lost relatives, or discovering your father isn’t your father. These are fine and not objectionable factually, though they’ve been done extensively before and elsewhere. I actually enjoyed the material in the second half of the piece, which had only a tenuous connection to scientific detail. I just wish these sorts of articles represented the science correctly.

Addendum: Just so you know, here are three journalists who regularly cover topics I can make strong judgments on, and who are always pretty accurate: Carl Zimmer, Antonio Regalado, and Ewen Callaway.

* I don’t follow Kolata very closely, but to be frank I’ve heard from scientist friends long ago that she parachutes into topics, and gets a lot of things wrong. Though I can only speak on this particular piece.