Since I’ve talked about this issue before, Warren releases results of DNA test:
There were five parts of Warren’s DNA that signaled she had a Native American ancestor, according to the report. The largest piece of Native American DNA was found on her 10th chromosome, according to the report. Each human has 23 pairs of chromosomes.
“It really stood out,” said Bustamante in an interview. “We found five segments, and that long segment was pretty significant. It tells us about one ancestor, and we can’t rule out more ancestors.”
He added: “We are confident it is not an error.”
The proportion of ancestry is not large. But it is clearly there. They compared her to the Utah white and British European 1000 Genomes populations, which are a good standard for Old Stock Anglo-Americans. She’s clearly an outlier, with about an order of magnitude more “Native American” ancestry. So it’s unlikely to be some artifact.
There is some talk in the article about lack of reference populations. But remember, the key is to identify Native American ancestry, so all of this should coalesce back 10-15,000 years ago. Compared to the divergence from Northern Europeans, this is going to jump out against the genetic background.
So does Elizabeth Warren have Native American ancestry? 99% sure that that is a yes. Is she going to run? Well, I wouldn’t say 99%, but that seems likely too….
(I doubt she’ll do it, but it would be neat if she released her raw results)
Update: Here’s the technical report.
Update II: Some quick responses to comments. I’m going to address the genetic aspects. I’ll leave the cultural and political angles to others.
- The analyst, whom I know personally as well as by reputation, did exactly what I’d have expected him to do with this data. So nothing atypical in terms of method/analytic pipeline. You can download and use the tools yourself!
- The number of markers used in the analysis, 660,000, is a good number. Sufficient most definitely for the local ancestry analysis done here (and probably on some level necessary to gain a high level of confidence).
- Some people wonder about the sample size of the reference population. Is the number sufficient? Yes, for the purposes of this analysis. For the scope of the questions asked. You aren’t looking for recent relatives, you are looking for a good representation of the genealogical networks from a given geography/ethnicity. The Utah whites are an industry standard sample set that is well known. The British data set in the 1000 Genomes is also pretty well known. Both seem representative of people of Northwest European heritage, a set of populations which are genetically very similar to each other.
- People are asking about the robustness of this result. One thing you have to remember when comparing reference sets against an individual is that the genetic distance of the reference sets is important. Applying local ancestry to an individual of Dutch ancestry with training sets of Germans and English heritage is going to produce results, but the training sets themselves are going to overlap in some ways. Now, if you take someone with Dutch ancestry and do local ancestry for English vs. Javanese ancestry, then you’re going to get really clear results in comparison.
- Some serious individuals are questioning the representativeness of the European panel and the Native American panel, as well as the lack of Siberian groups, who are closely related to Native Americans. But we know that Warren’s family background is such that a shift toward a Northeast Asian group is likely to be Native American. Not Chukchi. Further analysis could confirm, but the most likely hypothesis is that this is a woman of Northwest European ancestry with some Native American ancestry. Other models could fit these results. But those are not likely models in the first place (also, the PCA on Native American groups makes it likely that she is not Siberian, and she is not shifted to the northern groups).
- A huge issue is that people are worried about the representativeness of the Native American groups. First, if you are looking for someone with indigenous North American ancestry, Mexican groups are sufficient. If anything this will reduce your power to detect, not produce false positives. Second, look at the plot in the technical report: Warren’s haplotype is positioned between Canadian and Mexican natives.
- People are comparing this local ancestry method, which assigns segments of the genome to particular populations with a probability, to the point estimates provided in most consumer genomics results. From what I can see, they assigned 0.4% of Warren’s genome as Native American. But 8% was not assigned. This is almost certainly mostly European, but some of it may be Native as well. Basically, the method here was less about assigning a specific proportion, and more about testing whether it was likely she had detectable indigenous American ancestry (she did), and the range of periods in which that ancestry could have admixed into the Northern European genetic background. This is not comparable to the estimates you are getting from personal genomics tests.
- One way you can try to assess whether these are artifactual is to compare an individual to populations of known ancestry and see the distribution of empirical results. Warren’s results are very atypical in comparison to Northern European reference sets. If this is a “false positive” due to the training sets, then you would expect the same type of problem to crop up when you test out-of-sample individuals.
- Some are asking whether Warren is just a typical white American. You would need to do apples-to-apples comparisons. But my intuition is that she’s not. Most Old Stock white Americans probably have a genealogical relationship to Native Americans, but they may not have any segments of DNA because it is too far back. Warren is part of the minority of white Americans who have detectable Native American ancestry.
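For anyone who wants to play with the arithmetic behind the last few points, here is a rough sketch. It is not the method used in the technical report. It assumes a single ancestor with no pedigree collapse, treats the 0.4% as if it were a point estimate (which, as noted above, it is not really), and uses a Poisson approximation with a ballpark figure of 33 crossovers per meiosis. The function names and constants are my own, for illustration only.

```python
import math

AUTOSOMES = 22               # autosomal chromosomes passed on by one parent
CROSSOVERS_PER_MEIOSIS = 33  # rough genome-wide average (assumption)


def expected_fraction(generations: int) -> float:
    """Expected share of your autosomal genome from ONE ancestor
    `generations` back (1 = parent, 2 = grandparent, ...)."""
    return 0.5 ** generations


def generations_for_fraction(fraction: float) -> float:
    """Invert expected_fraction: how many generations back a single
    ancestor would sit if they contributed `fraction` on average."""
    return -math.log2(fraction)


def prob_no_segments(generations: int) -> float:
    """Poisson approximation to the chance that one specific ancestor
    `generations` back left you no autosomal segments at all.

    The genome you got from one parent is cut into roughly
    22 + 33 * (k - 1) blocks, shared among the 2 ** (k - 1) ancestors
    on that parent's side at generation k.
    """
    k = generations
    expected_blocks = (AUTOSOMES + CROSSOVERS_PER_MEIOSIS * (k - 1)) / 2 ** (k - 1)
    return math.exp(-expected_blocks)


if __name__ == "__main__":
    # 0.4% read naively as a single-ancestor expectation
    print(f"0.4% ~ {generations_for_fraction(0.004):.1f} generations back")
    for g in (6, 8, 10, 12):
        print(g, f"{expected_fraction(g):.4%}", f"{prob_no_segments(g):.2f}")
```

Under these assumptions 0.4% lands at roughly eight generations, and the chance that a given ancestor left no segments at all climbs quickly past that point. That is the sense in which a genealogical relationship can be real without being detectable.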
Basically, I think it is very likely that Warren has Native American ancestry. Follow-up analysis would probably just increase our confidence.