In the lead-up to the 2020 US presidential election and the party primaries, Democratic Senator Elizabeth Warren led President Donald Trump by four percentage points in the polls. Warren fared better than Joe Biden and the remaining candidates and was the most realistic candidate to win the Democratic presidential ticket. Warren was on the rise. Surveys showed that she was the favorite among Democratic voters, but there was concern about her popularity among other voters. Perhaps it was that concern that prompted Warren to make a bold move. It was not about student loan forgiveness or fighting corruption; those ideas were touted by most Democratic candidates. Warren’s move was more personal.
Testing Warren’s DNA for Native American Ancestry
Warren believed that one of her ancestors was Native American and that she carries Native American heritage. As early as 1986, Warren identified as “American Indian” on a registration card for the State Bar of Texas, according to a Washington Post report. Warren is like many other people who have heard family stories about their heritage being Native American, Jewish, or of another background, possibly without concrete evidence to validate these stories; however, those people do not all run for the presidency. When this information surfaced, Warren was criticized and insulted, with claims that she had invented this heritage. She decided to take a DNA test for her ancestry, a reasonable decision, but unfortunately, this is where things went wrong. Rather than choosing a credible genetic company, Warren relied on a DNA analysis performed with questionable tools and published the results.
To analyze Warren’s DNA, Carlos Bustamante used a mathematical procedure called Principal Component Analysis (PCA), which he elevated to be a “machine learning technique” (it is not). PCA reported that Warren’s DNA clusters with European-Native American admixed individuals, which is interpreted as shared ancestry.
The second analysis was even more promising. Once the non-Native American populations were removed, Warren clustered within Native Americans!
Bustamante reported that Warren is “clearly distinct from segments of European ancestry” and is “strongly associated with Native American ancestry.” Bustamante concluded that:
“While the vast majority of the individual’s ancestry is European, the results strongly support the existence of an unadmixed Native American ancestor in the individual’s pedigree, likely in the range of 6-10 generations ago.”
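For readers unfamiliar with PCA, the two-step idea described above (compute principal components from reference populations, then see where a test sample lands) can be sketched in a few lines. This is only a minimal illustration on simulated data, not the pipeline used in the report: the populations "A" and "B", their allele frequencies, the marker count, and the helper names are all made up for the example, and an admixed genome is crudely modeled by blending allele frequencies marker by marker. The fraction 1/64 is the share of the genome expected, on average, from a single ancestor six generations back.

```python
import numpy as np

def simulate_genotypes(freqs, n, rng):
    """Draw n diploid genotypes (0, 1 or 2 copies of an allele per marker)."""
    return rng.binomial(2, freqs, size=(n, freqs.size)).astype(float)

rng = np.random.default_rng(0)
n_markers = 2000

# Hypothetical allele frequencies for two made-up populations, A and B.
freq_a = rng.uniform(0.1, 0.9, n_markers)
freq_b = rng.uniform(0.1, 0.9, n_markers)
panel_a = simulate_genotypes(freq_a, 50, rng)
panel_b = simulate_genotypes(freq_b, 50, rng)

# Step 1: fit PCA on the reference panel (center the data, then take the SVD).
panel = np.vstack([panel_a, panel_b])
mean = panel.mean(axis=0)
_, _, vt = np.linalg.svd(panel - mean, full_matrices=False)
pc1 = vt[0]  # first principal axis

def relative_position(genotypes):
    """Score samples on PC1, rescaled so centroid A sits at 0 and centroid B at 1."""
    score = (genotypes - mean) @ pc1
    centroid_a = ((panel_a - mean) @ pc1).mean()
    centroid_b = ((panel_b - mean) @ pc1).mean()
    return (score - centroid_a) / (centroid_b - centroid_a)

def admixed(fraction_b):
    """Crude model of one individual with the given share of ancestry from B."""
    freqs = (1 - fraction_b) * freq_a + fraction_b * freq_b
    return simulate_genotypes(freqs, 1, rng)

# Step 2: project test individuals onto the axis learned from the panel.
rel_half = relative_position(admixed(0.5))[0]      # half A, half B
rel_small = relative_position(admixed(1 / 64))[0]  # about 1.6% B

print(f"50% B ancestry:  relative position on PC1 = {rel_half:.2f}")
print(f"1/64 B ancestry: relative position on PC1 = {rel_small:.2f}")
```

In this toy setting, the half-and-half individual lands roughly midway between the two reference clusters, while the individual with 1/64 ancestry from B sits essentially on top of cluster A. Where a sample lands depends entirely on which references were used to build the axes.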
Warren proudly presented these results, only to be heavily criticized. She ended up apologizing and retracting her Native American identity: “I know that I have made mistakes. I am sorry for harm that I have caused. I have listened and I have learned,” said Warren. However, it was too late. Warren lost her momentum. She fared poorly in the primaries and lost the ticket to President Biden.
The main question remains unanswered: does Warren have Native American ancestry?
PCA Results are Not Reliable, Robust, or Replicable
In my recent paper, I showed that PCA results are not reliable, robust, or replicable. I demonstrated how expert users could easily manipulate PCA to generate any desired results. Is it possible that Bustamante’s results were false? Can anyone find a geneticist who will “prove” their Native American ancestry? Let us try to answer both questions.
Warren’s DNA was not published, so I could not test her ancestry. Fortunately, I did not have to. If I could apply Bustamante’s procedure to non-Native-Americans and show that they cluster with Native Americans – it would prove that his test is invalid. The figure below shows the outcome of PCA for Iranian (A), Pakistani (B), and two Russian (C-D) individuals using the same setting that Bustamante used. They all clustered with Native Americans as if they were Native Americans!
These results show how the experimenter can easily generate desired patterns to support false ancestral claims. Anyone who understands the math behind PCA can generate almost any desired results. It is not surprising that PCA is the most commonly used tool among geneticists and direct-to-consumer ancestry companies, like 23andMe, that adopted PCA to assess ancestry, disease risk, and “cultural traits” (whatever those are). It is precisely because it produces desirable, albeit unreliable and misleading, results. Who doesn’t like a tool that tells them they are always right? Bustamante used another tool, alongside PCA, which was equally wrong on the same grounds.
Evaluation of Native American ancestry for four Eurasians: Iranian (A), Pakistani (B), and two Russian (C-D) individuals. (Author Provided)
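To illustrate how the choice of reference panel alone can change where a sample appears to cluster, here is a small simulated sketch. It is a toy example under stated assumptions, not the analysis behind the figure above and not necessarily the mechanism at work in any real study: populations "A" and "B", their allele frequencies, and the function names are hypothetical. The same individual, drawn purely from population A, is projected once onto axes built from both populations and once onto axes built from population B alone.

```python
import numpy as np

def simulate_genotypes(freqs, n, rng):
    """Draw n diploid genotypes (0, 1 or 2 copies of an allele per marker)."""
    return rng.binomial(2, freqs, size=(n, freqs.size)).astype(float)

def pca_scores(panel, query, k=2):
    """Fit PCA on `panel` only, then return (panel scores, query scores)."""
    mean = panel.mean(axis=0)
    _, _, vt = np.linalg.svd(panel - mean, full_matrices=False)
    axes = vt[:k]
    return (panel - mean) @ axes.T, (query - mean) @ axes.T

rng = np.random.default_rng(1)
n_markers = 2000

# Hypothetical allele frequencies for two made-up populations, A and B.
freq_a = rng.uniform(0.1, 0.9, n_markers)
freq_b = rng.uniform(0.1, 0.9, n_markers)
panel_a = simulate_genotypes(freq_a, 50, rng)
panel_b = simulate_genotypes(freq_b, 50, rng)

# The test individual is drawn entirely from population A.
query = simulate_genotypes(freq_a, 1, rng)

# Design 1: the reference panel contains both A and B.
scores, q = pca_scores(np.vstack([panel_a, panel_b]), query)
dist_to_a = np.linalg.norm(q[0] - scores[:50].mean(axis=0))
dist_to_b = np.linalg.norm(q[0] - scores[50:].mean(axis=0))

# Design 2: population A is removed, so the axes are built from B alone.
scores_b, q_b = pca_scores(panel_b, query)
# Offset of the query from B's centroid (the origin after centering),
# in units of B's own spread along each of the two axes.
z = np.abs(q_b[0]) / scores_b.std(axis=0)

print(f"Panel A+B: distance to A = {dist_to_a:.1f}, distance to B = {dist_to_b:.1f}")
print(f"Panel B only: offset from B centroid in B standard deviations = {z.round(2)}")
```

With both populations in the panel, the individual sits next to cluster A and far from cluster B. With A removed, the axes only describe variation within B, so the same individual is projected into the middle of B's cloud even though none of its simulated ancestry comes from B. The plot would look like a match, but the pattern comes from the study design rather than from the data.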
The Importance of Choosing the Right Genetic Ancestry Test
While I am unable to comment on Warren’s ancestry (recall that I do not have her DNA), it is worth emphasizing that only reliable genetic ancestry tests, preferably those that employ ancient DNA analysis, should be used to get an accurate assessment of ancestry. Ancient DNA has the advantage of preserving the genomic signature from before the arrival of Europeans and Africans (many if not most Native Americans are mixed by now) and is therefore more powerful, particularly for testing a small fraction of that ancestry.
At the end of the day, ignorance of math curbed the future of ‘President Elizabeth Warren,’ but if it is any comfort, powerful tools are now available if she wishes to discover her true ancestry.
By Eran Elhaik