The Wirecutter has a thorough review of direct-to-consumer ancestry testing services. Since I now work at a human personal genomics company I’m not going to comment on the merits of any given service. But I do want to clarify something with regard to the precision of these tests. Before the author quotes Jonathan Marks, he says:
For Jonathan Marks, anthropology professor at University of North Carolina at Charlotte, the big unknown for users is the margin for error with these estimates….
The issue I have with this quote is that the margin of error on these tests is really not that high. Margin of error itself is a precise concept. If you sample 1,000 individuals you’ll have a lower margin of error than if you sample 100 individuals. That’s common sense.
But for direct-to-consumer genomic tests you are sampling 100,000 to 1 million markers on SNP arrays (the exact number used for ancestry inference is often lower than the total number on the array). For ancestry testing you are really interested in the 10 million or so (order of magnitude) markers which vary between populations, and a random sampling of 100,000 to 1 million is going to be pretty representative (consider that election year polling usually surveys a few thousand people to represent an electorate of tens of millions).
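To put rough numbers on the sample-size intuition, here is a minimal sketch using the textbook margin-of-error formula for a proportion. It assumes every draw is independent, which is a simplification for genetic markers (neighboring SNPs are correlated), so treat the marker rows as optimistic. The function name and the default 95% z-value are my own choices for illustration.

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Approximate margin of error for a proportion estimated from n independent draws.

    p=0.5 is the worst case; z=1.96 corresponds to a 95% confidence level.
    """
    return z * math.sqrt(p * (1.0 - p) / n)

# Poll-sized samples versus SNP-array-sized samples.
for n in (100, 1_000, 100_000, 1_000_000):
    print(f"n = {n:>9,}: +/- {100 * margin_of_error(n):.2f} percentage points")
```

The error shrinks with the square root of the sample size, so going from 1,000 draws to 1 million cuts it by a factor of about 30.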
If you run a package like Admixture you can repeat the calculation for a given individual multiple times. In most cases there is very little variation between replicates in relation to the percentage breakdowns, even though you use a random seed to initialize the process as it begins to stochastically explore the parameter space (the variance is going to be higher if you try to resolve clusters which are extremely phylogenetically close, of course).
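Here is a toy version of that replicate experiment. To be clear, this is not the Admixture program: it is a simplified sketch that assumes two source populations whose allele frequencies are already known, simulates one admixed individual, and then estimates that individual's ancestry fraction with a basic EM update started from several random seeds. All names and numbers are hypothetical. In this stripped-down setting the likelihood has a single peak, so the replicates agree almost exactly; real unsupervised runs estimate the frequencies as well and will wobble more.

```python
import random

def simulate(num_markers, true_q, rng):
    """Simulate allele frequencies for two sources and one admixed diploid genotype."""
    freqs_a = [rng.uniform(0.05, 0.95) for _ in range(num_markers)]
    freqs_b = [rng.uniform(0.05, 0.95) for _ in range(num_markers)]
    genotypes = []
    for pa, pb in zip(freqs_a, freqs_b):
        p = true_q * pa + (1.0 - true_q) * pb  # allele frequency in the mixture
        genotypes.append(sum(rng.random() < p for _ in range(2)))  # 0, 1, or 2 copies
    return freqs_a, freqs_b, genotypes

def estimate_q(freqs_a, freqs_b, genotypes, seed, iterations=200):
    """EM estimate of the fraction of ancestry from source A, from a random start."""
    q = random.Random(seed).uniform(0.05, 0.95)  # the seed only sets the starting point
    for _ in range(iterations):
        total = 0.0
        for pa, pb, g in zip(freqs_a, freqs_b, genotypes):
            # Posterior probability that a given allele copy was drawn from source A.
            post_alt = q * pa / (q * pa + (1 - q) * pb)
            post_ref = q * (1 - pa) / (q * (1 - pa) + (1 - q) * (1 - pb))
            total += g * post_alt + (2 - g) * post_ref
        q = total / (2 * len(genotypes))
    return q

rng = random.Random(0)
freqs_a, freqs_b, genotypes = simulate(2_000, true_q=0.30, rng=rng)
replicates = [estimate_q(freqs_a, freqs_b, genotypes, seed=s) for s in range(5)]
for s, q in enumerate(replicates):
    print(f"seed {s}: estimated fraction from A = {q:.4f}")
```

The estimates differ from the true 0.30 because of sampling noise in the markers, but they do not differ from each other in any way that matters, which is the point about replicates.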
As I have stated before, the reason these different companies offer varied results is that they start out with different models. When I learned the basic theory around phylogenetics in graduate school the philosophy was definitely Bayesian; vary the model parameters and the model and see what happens. But you can’t really vary the model all the time between customers, can you? It starts to become a nightmare in relation to customer service.
There are certain population clusters that customers are interested in. To provide a service to the public a company has to develop a model that answers those questions which are in demand. If you are designing a model for purely scientific purposes then you’d want to highlight the maximal amount of phylogenetic history. That isn’t always the same though as the history that customers want to know about. This means that direct-to-consumer ethnicity tests in terms of the specification of their models deviate from pure scientific questions, and result in a lot of judgment calls based on company evaluations of their client base.
Addendum: There is a lot of talk about the reference population sets. The main issue is representativeness, not sample size. You don’t really need more than 10-100 individuals from a given population in most cases. But you want to sample the real population diversity that is out there.
For some reason The New York Times tasked Gina Kolata to cover genetic genealogy and its societal ramifications: With a Simple DNA Test, Family Histories Are Rewritten. The problem here is that to my knowledge Kolata doesn’t cover this as part of her beat, and so isn’t well equipped to write an accurate and in-depth piece on the topic in relation to the science.
This is a general problem in journalism. I notice it most often when it comes to genetics (a topic I know a lot about for professional reasons) and the Middle East and Islam (topics I know a lot about because I’m interested in them). It’s unfortunate, but it has also made me a lot more skeptical of journalists whose track record I’m unfamiliar with.* To give a contrasting example, Christine Kenneally is a journalist without a background in genetics who nevertheless is immersed in genetic genealogy, so that she could have written this sort of piece without objection from the likes of me (she did write a book on the topic, The Invisible History of the Human Race: How DNA and History Shape Our Identities and Our Futures, which I had a small role in fact-checking).
What are the problems with the Kolata piece? I think the biggest issue is that she didn’t go in to test any particular proposition, and leaned on the wrong person for the science. She quotes Joe Pickrell, who knows this stuff like the back of his hand. But more space is given to Jonathan Marks, an anthropologist who is quite opinionated and voluble, and so probably a “good source” for any journalist.
Marks seems well respected in anthropology from what I can tell, but he’s also the person who put up a picture of L. L. Cavalli-Sforza juxtaposed with a photo of Josef Mengele in the late 1990s during a presentation at Stanford. Perhaps this is why anthropologists respect him, I don’t know, but I do not like him because of his nasty tactics (I wouldn’t be surprised if Marks had power he would make sure people like me were put in political prison camps, his rhetoric is often so unhinged).
Marks’ quotes wouldn’t be much of an issue if Kolata could figure out when he’s making sense, and when he’s just bullshitting. But she can’t. For example:
…“tells me I’m 95 percent Ashkenazi Jewish and 5 percent Korean, is that really different from 100 percent Ashkenazi Jewish and zero percent Korean?”
The precise numbers offered by some testing services raise eyebrows among genetics researchers. “It’s all privatized science, and the algorithms are not generally available for peer review,” Dr. Marks said.
The part about precise numbers is an issue, though a lot less of an issue with high density SNP-chips (the real issue is sensitivity to reference population and other such parameters). But if a modern test says you are 95 percent Ashkenazi Jewish and 5 percent Korean it really is different from 100% Ashkenazi. Someone who comes up as 5% Korean against an Ashkenazi Jewish background is most definitely of some East Asian heritage. In the early 2000s with ancestrally informative markers and microsatellite based tests you’d get somewhat weird results like this, but with the methods used by the major DTC companies (and in academia) today these sorts of proportions are just not reported as false positives. Marks may not know because this isn’t his area, but Pickrell would have. Kolata probably did not think to double-check with him, but that’s because she isn’t able to smell out tendentious assertions. She has no feel for the science, and is flying blind.
Second, Marks notes that the science is privatized, and it isn’t totally open. But it’s just false that the algorithms are not generally available for peer review. All the details of the pipeline are not downloadable on GitHub, but the core ancestry estimation methods are well known. Eric Durand, who wrote the original 23andMe ancestry composition methodology, presented on it at ASHG 2013. I know because I was there during his session.
You can find a white paper for 23andMe’s method and Ancestry’s. Not everything is as transparent as open science would dictate (though there are scientific papers and publications which also mask or hide elements which make reproducibility difficult), but most geneticists with domain experience can figure out what’s going on and if it is legitimate. It is. The people who work at the major DTC companies often come out of academia, and are known to academic scientists. This isn’t blackbox voodoo science like “soccer genomics.”
Then Marks says this really weird thing:
“That’s why their ads always specify that this is for recreational purposes only: lawyer-speak for, ‘These results have no scientific standing.’”
Actually, it’s lawyer-speak for “do not sue us, as we aren’t providing you actionable information.” Perhaps I’m ignorant, but lawyers don’t get to define “scientific standing”.
The problem, which is real, is that the public is sometimes not entirely clear on what the science is saying. This is a problem of communication from the companies to the public. I’ve even been in scientific sessions where geneticists who don’t work in population genomics have weak intuition on what the results mean!
Earlier Kolata states:
Scientists simply do not have good data on the genetic characteristics of particular countries in, say, East Africa or East Asia. Even in more developed regions, distinguishing between Polish and, for instance, Russian heritage is inexact at best.
This is not totally true. We have good data now on China and Japan. Korea also has some data. Using haplotype-based methods you can do a lot of interesting things, including distinguish someone who is Polish from Russian. But these methods are computationally expensive and require lots of information on the reference samples (Living DNA does this for British people). The point is that the science is there. Reading this sort of article is just going to confuse people.
On the other hand a lot of Kolata’s piece is more human interest. The standard stuff about finding long lost relatives, or discovering your father isn’t your father. These are fine and not objectionable factually, though they’ve been done extensively before and elsewhere. I actually enjoyed the material in the second half of the piece, which had only a tenuous connection to scientific detail. I just wish these sorts of articles represented the science correctly.
Addendum: Just so you know, three journalists who regularly cover topics I can make strong judgments on, and are always pretty accurate: Carl Zimmer, Antonio Regalado, and Ewen Callaway.
* I don’t follow Kolata very closely, but to be frank I’ve heard from scientist friends long ago that she parachutes into topics, and gets a lot of things wrong. Though I can only speak on this particular piece.
23andMe has gone below $50 for “Prime Day”! For those of us who bought kits (albeit more fully featured) at $399 or even more this is pretty incredible. But from what I understand these sorts of SNP-chips are now possible to purchase from Illumina for well less than $50, so this isn’t charitable.
When I first began writing on the internet genomics was an exciting field of science. Somewhat abstruse, but newly relevant and well known due to the completion of the draft of the human genome. Today it’s totally different. Genomics is ubiquitous. Instead of a novel field of science, it is transitioning into a personal technology.
But life comes at you fast. For all practical purposes the $1,000 genome is here.
And yet we haven’t seen a wholesale change in medicine. What happened? Obviously a major part of it is polygenicity of disease. Not to mention that a lot of illness will always have a random aspect. People who get back a “clean” genome and live a “healthy” life will still get cancer.
Another issue is a chicken & egg problem. When a large proportion of the population is sequenced and phenotyped we’ll probably discover actionable patterns. But until that moment the yield is not going to be too impressive.
Out of 50 healthy adults [selected from a random 100] who had their genomes sequenced, 11—or 22 percent—discovered they had genetic variants in one of nearly 5,000 genes associated with rare inherited diseases. One surprise is that most of them had no symptoms at all. Two volunteers had genetic variants known to cause heart rhythm abnormalities, but their cardiology tests were normal.
There’s another possible consequence of people having their genome sequenced. For participants enrolled in the study, health-care costs rose an average of $350 per person compared with a control group in the six months after they received their test results. The authors don’t know whether those costs were directly related to the sequencing, but Vassy says it’s reasonable to think people might schedule follow-up appointments or get more testing on the basis of their results.
Researchers worry about this problem of increased costs. It’s not a trivial problem, and one that medicine doesn’t have a response to, as patients often find a way to follow up on likely false positives. But it seems that this is a phase we’ll have to go through. I see no chance that a substantial proportion of the American population in the 2020s will not be sequenced.
Today I got an email from 23andMe that they’d hit the 2 million customer mark. Since they reached their goal of 1 million kits purchased the company seems to have taken its foot off the pedal of customer base growth to focus on other things (in particular, how to get phenotypic data from those who have been genotyped). In contrast Ancestry has been growing at a faster rate of late. After talking to Spencer Wells (who was there at the beginning of the birth of this sector) we estimated that the direct-to-consumer genotyping kit business is now north of 5 million individuals served. Probably closer to 6 or 7 million, depending on the numbers you assume for the various companies (I’m counting autosomal only).
This is pretty awesome. Each of these firms genotypes in the range of 100,000 to 1 million variant markers, or single nucleotide polymorphisms (SNPs). 20 years ago this would have been an incredible achievement, but today we’re all excited about long-read sequencing from Oxford Nanopore. SNP-chips are almost ho-hum.
But though sequencing is the cutting edge, the final frontier and terminal technology of reading your DNA code, genotyping in humans will be around for a while because of cost. At ASHG last year a medical geneticist was claiming price points in bulk for high density SNP-chips are in the range of the low tens of dollars per unit. A good high coverage genome sequence is still many times more expensive (perhaps an order of magnitude or more depending on who you believe). It also can impose more data processing costs than a SNP-chip in my experience.
Here’s a slide from Spencer:
This is a slide I’ve used in presentations in the past – needs to be updated w/ current numbers, but the discussion is around the inflection pic.twitter.com/Vj9XNb4Exq
I suspect genotyping will go S-shaped before 2025 after explosive growth in genotyping. Some people will opt out. A minority of the population, but a substantial proportion. At the other extreme of the preference distribution you will have those who will start getting sequenced. Researchers will begin to talk about genotyping platforms like they talk about microarrays (yes, I know at places like the Broad they already talk about genotyping like that, but we can’t all be like the Broad!).
Here’s an article from 2007 on 23andMe in Wired. They’re excited about paying $1,000 for genotyping services…the cost now of the cheapest high quality (30x) whole genome sequences. Though 23andMe has a higher price point for its medical services, many of the companies are pushing their genotyping+ancestry below $100, a value it had stabilized at for a few years. Family Tree DNA has a Father’s Day sale for $69 right now. Ancestry looks to be $79. The Israeli company MyHeritage is also pushing a $69 sale price (the CSO there is advertising that he’s hiring human geneticists, just so you know). It seems very likely that a $50 price point is within sight in the next few years as SNP-chip costs become trivial and all the expenses are on the data storage/processing and visualization costs. I think psychologically for many people paying $50 is not cheap, but it is definitely not expensive. $100 feels expensive.
Ultimately I do wonder if I was a bit too optimistic that 50% of the US population will be sequenced at 30x by 2025. But the dynamic is quite likely to change rapidly because of a technological shift as the sector goes through a productivity uptick. We’re talking about exponential growth, which humans have weak intuition about….
Addendum: Go into the archives of Genomes Unzipped and read the older posts. Those guys knew where we were heading…and we’re pretty much there.
The figure above is from Noah Rosenberg’s relatively famous paper, Clines, Clusters, and the Effect of Study Design on the Inference of Human Population Structure. The context of the publication is that it was one of the first prominent attempts to use genome-wide data on a variety of human populations (specifically, from the HGDP data set) and attempt model-based clustering. There are many details of the model, but the one that will jump out at you here is that the parameter K defines the number of putative ancestral populations you are hypothesizing. Individuals then shake out as proportions of each element, K. Remember, this is a model in a computer, and you select the parameters and the data. The output is not “wrong,” it’s just the output based on how you set up the program and the data you input yourself.
These sorts of computational frameworks are innocent, and may give strange results if you want to engage in mischief. For example, let’s say that you put in 200 individuals, of whom 95 are Chinese, 95 are Swedish, and 10 are Nigerian. From a variety of disciplines we know to a good approximation that non-Africans form a monophyletic clade in relation to Africans. In plain English, all non-Africans descend from a group of people who diverged from Africans more than 50,000 years ago. That means if you imagine two populations, the first division should be between Africans and non-Africans, to reflect this historical demography. But if you skew the sample size, as the program looks for the maximal amount of variation in the data set it may decide that dividing between Chinese and Swedes as the two ancestral populations is the most likely model given the data.
This is not wrong as such. As the number of Africans in the data converges on zero, obviously the dividing line is between Swedes and Chinese. If you overload particular populations within the data, you may marginalize the variation you’re trying to explore, and the history you’re trying to uncover.
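A back-of-the-envelope sketch of the sample-size effect, with loud caveats. This is not a model-based clustering program: it swaps in a much cruder criterion (the between-group sum of squares, as in k-means) as a stand-in for "the split that captures the most variation." The coordinates are made up purely for illustration, chosen so that Nigerians sit farther from both Chinese and Swedes than those two sit from each other, and every individual is placed exactly at their population's position.

```python
# Hypothetical 2-D positions; only the relative distances matter here.
COORDS = {"Nigerian": (0.0, 3.0), "Chinese": (-1.0, 0.0), "Swedish": (1.0, 0.0)}

def between_group_ss(sizes, coords, isolated):
    """Between-group sum of squares when `isolated` is split off from the rest."""
    rest = [pop for pop in sizes if pop != isolated]
    n_rest = sum(sizes[pop] for pop in rest)
    dims = len(coords[isolated])
    # Sample-size-weighted centroid of the populations lumped together.
    centroid = [sum(sizes[pop] * coords[pop][d] for pop in rest) / n_rest
                for d in range(dims)]
    dist_sq = sum((coords[isolated][d] - centroid[d]) ** 2 for d in range(dims))
    n_iso = sizes[isolated]
    return n_iso * n_rest / (n_iso + n_rest) * dist_sq

def best_two_way_split(sizes, coords=COORDS):
    """Return the population whose separation from the others explains the most variation."""
    return max(sizes, key=lambda pop: between_group_ss(sizes, coords, pop))

balanced = {"Nigerian": 100, "Chinese": 100, "Swedish": 100}
skewed = {"Nigerian": 10, "Chinese": 95, "Swedish": 95}
print("balanced sample splits off:", best_two_way_split(balanced))
print("skewed sample splits off:  ", best_two_way_split(skewed))
```

With balanced samples the first split isolates the Nigerians, matching the deep history. With the 95/95/10 sample the best split instead separates one of the two large groups from everyone else, so the ten Nigerians get lumped in with either the Chinese or the Swedes (which one is a coin flip in this symmetric toy).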
I’ve written all of this before. But I’m writing this in context of the earlier post, Ancestry Inference Is Precise And Accurate(Ish). In that post I showed that consumers drive genomics firms to provide results where the grain of resolution and inference varies a lot as a function of space. That is, there is a demand that Northern Europe be divided very finely, while vast swaths of non-European continents are combined into one broad cluster.
Another aspect though is time. These model-based admixture frameworks can implicitly traverse time as one ascends up and down the number of K’s. It is always important to explain to people that the number of K’s may not correspond to real populations which all existed at the same time. Rather, they’re just explanatory instruments which illustrate phylogenetic distance between individuals. In a well-balanced data set for humans K = 2 usually separates Africans from non-Africans, and K = 3 then separates West Eurasians from other populations. Going across K’s it is easy to imagine that one is traversing successive bifurcations.
But today we know that it’s more complicated than that. Three years ago Pickrell et al. published Toward a new history and geography of human genes informed by ancient DNA, where they report the result that more powerful methods and data imply most human populations are relatively recent admixtures between extremely diverged lineages. What this means is that the origin of groups like Europeans and South Asians is very much like the origin of the mixed populations of the New World. Since then this insight has become only more powerful, as ancient DNA has shed light on massive population turnovers over the last 5,000 to 10,000 years.
These are to some extent revolutionary ideas, not well known even among the science press (which is too busy doing real journalism, i.e. the art of insinuation rather than illumination). As I indicated earlier direct-to-consumer genomics firms use national identities in their cluster labels because these are comprehensible to people. Similarly, they can’t very well tell Northern Europeans that they are an outcome of a successive series of admixtures between diverged lineages from the late Pleistocene down to the Bronze Age. Though Northern Europeans, like South Asians, Middle Easterners, Amerindians, and likely Sub-Saharan Africans and East Asians, are complex mixes between disparate branches of humanity, today we view them as indivisible units of understanding, to make sense of the patterns we see around us.
Personal genomics firms therefore give results which allow for historically comprehensible results. As a trivial example, the genomic data makes it rather clear that Ashkenazi Jews emerged in the last few thousand years via a process of admixture between antique Near Eastern Jews, and the peoples of Western Europe. After the initial admixture this group became an endogamous population, so that most Ashkenazi Jews share many common ancestors in the recent past with other Ashkenazi Jews. This is ideal for the clustering programs above, as Ashkenazi Jews almost always fit onto a particular K with ease. Assuming there are enough Ashkenazi Jews in your data set you will always be able to find the “Jewish cluster” as you increase the K value.
But the selection of a K which satisfies this comprehensibility criterion is a matter of convenience, not necessity. Most people are vaguely aware that Jews emerged as a people at a particular point in history. In the case of Ashkenazi Jews they emerged rather late in history. At certain K’s Ashkenazi Jews exhibit mixed ancestral profiles, placing them between Europeans and Middle Eastern peoples. What this reflects is the earlier history of the ancestors of Ashkenazi Jews. But for most personal genomics companies this earlier history is not something that they want to address, because it doesn’t fit into the narrative that their particular consumers want to hear. People want to know if they are part-Jewish, not that they are part antique Middle Eastern and Southwest European.
Perplexment of course is not just for non-scientists. When Joe Pickrell’s TreeMix paper came out five years ago there was a strange signal of gene flow between Northern Europeans and Native Americans. There was no obvious explanation at the time…but now we know what was going on.
It turns out that Northern Europeans and Native Americans share common ancestry from Pleistocene Siberians. The relationship between Europeans and Native Americans has long been hinted at in results from other methods, but it took ancient DNA for us to conceptualize a model which would explain the patterns we were seeing.
But in the context of the United States shared ancestry between Europeans and Native Americans is not particularly illuminating. Rather, what people want to know is if they exhibit signs of recent gene flow between these groups; in particular, many white Americans are curious if they have Native American heritage. They do not want to hear an explanation which involves the fusion of an East Asian population with Siberians that occurred 15,000 to 20,000 years ago, and then the emergence of Northern Europeans through successive amalgamations between Pleistocene, Neolithic, and Bronze Age Eurasians.
In some of the inference methods Northern Europeans, often those with Finnic ancestry or relationship to Finnic groups, may exhibit signs of ancestry from the “Native American” cluster. But this is almost always a function of circumpolar gene flow, as well as the aforementioned Pleistocene admixtures. One way to avoid this would be to simply not report proportions which are below 0.5%. That way, people with higher “Native American” fractions would receive the results, and the proportions would be high enough that it was almost certainly indicative of recent admixture, which is what people care about.
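The reporting cutoff described above is simple enough to sketch. This is an illustration under my own assumptions, not any company's actual pipeline: the function name, the cluster labels, and the customer numbers are all hypothetical, and a real service would also have to decide what to do with the small fraction that gets dropped.

```python
def reportable_ancestry(proportions, threshold=0.005):
    """Keep only clusters at or above the reporting threshold (0.5% by default)."""
    return {label: share for label, share in proportions.items() if share >= threshold}

# Made-up estimates for two customers.
customer_a = {"Northern European": 0.997, "Native American": 0.003}  # deep shared ancestry
customer_b = {"Northern European": 0.94, "Native American": 0.06}    # plausibly recent admixture

print(reportable_ancestry(customer_a))
print(reportable_ancestry(customer_b))
```

The first customer's trace "Native American" signal falls under the cutoff and is not shown, while the second customer's larger fraction is reported.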
Why am I telling you this? Because many journalists who report on direct-to-consumer genomics don’t understand the science well enough to grasp what’s being sold to the consumer (frankly, most biologists don’t know this field well either, even if they might use a barplot here and there).
And, the reality is that consumers have very specific parameters of what they want in terms of geographic and temporal information. They don’t want to be told true but trivial facts (e.g., they are Northern European). But neither do they want to know things which are so novel and so far removed from their interpretative frameworks that they simply can’t digest them (e.g., that Northern Europeans are a recent population construction which threads together very distinct strands with divergent deep time histories). In the parlance of cognitive anthropology consumers want their infotainment the way they want their religion, minimally counterintuitive. Consume some surprise. But not too much.