The Tasmanian devil has a future (probably)

In the late 2000s there was a lot of talk about how the Tasmanian devil was going to go extinct because of devil facial tumor disease. I expressed the thought that we need to be really cautious about assuming that disease could drive the devils to extinction. This was not based on detailed knowledge of the biology of the devils. It was based on the fact that complex organisms are often subject to disease, and populations do crash. But unless a population is very small and geographically restricted, disease alone probably won’t eliminate an organism, even if census size gets rather small, because there will be an evolutionary response. That’s one of the main arguments for why complex organisms tend to be predominantly sexual lineages despite the two-fold cost of sex. From what I recall, many of the researchers expressing alarmist sentiments were ecologists, who often do not have evolutionary responses foremost in their minds.

But, there are reasons to be worried. The devil went extinct on the mainland ~3,000 years ago. The reason was probably the same as with the Tasmanian tiger: competition with the introduced dingo dogs. Probably more dangerous to the long term viability of the devil is that it is now restricted to an island the size of the Republic of Ireland (and much of the habitat on Tasmania is not devil optimal), where it is subject to habitat loss, and competition with introduced species.

Mind you, I am somewhat worried about the possible loss of the devil. But I thought it was frankly somewhat sensationalistic to assume that the devil’s disease was sui generis, and we just happened to be living at the time when the whole species was going to go down after persisting in Tasmania for ten thousand years despite various ups and downs. Diseases come and go. What is sui generis is the terraforming inclinations of our own species. We are the necessary and often sufficient condition.

A new open access paper suggests I may have been on the right path, Rapid evolutionary response to a transmissible cancer in Tasmanian devils:

Although cancer rarely acts as an infectious disease, a recently emerged transmissible cancer in Tasmanian devils (Sarcophilus harrisii) is virtually 100% fatal. Devil facial tumour disease (DFTD) has swept across nearly the entire species’ range, resulting in localized declines exceeding 90% and an overall species decline of more than 80% in less than 20 years. Despite epidemiological models that predict extinction, populations in long-diseased sites persist. Here we report rare genomic evidence of a rapid, parallel evolutionary response to strong selection imposed by a wildlife disease. We identify two genomic regions that contain genes related to immune function or cancer risk in humans that exhibit concordant signatures of selection across three populations. DFTD spreads between hosts by suppressing and evading the immune system, and our results suggest that hosts are evolving immune-modulated resistance that could aid in species persistence in the face of this devastating disease.

Again, we need to be cautious about these results. They’re preliminary. Just as we needed to be cautious about the original extreme claims.

The ending of the liberal interregnum

The above talk is from Alice Dreger, author of . I don’t know Dreger personally, but she seems like a brave and courageous person. In the broadest strokes there’s very little where we disagree. Yes, our politics, and many of our specific beliefs, diverge, but we generally at least hold to the ideal of truth.

There is one section of her talk where Dreger waxes eloquent about the Enlightenment, and freedom of thought, which caught my attention. We have always missed the mark, but there was a point in Western intellectual culture where freedom of thought and striving toward truth were at least the paramount method and goal. I am not so sure that is the case today.

When Dreger pointed approvingly on Twitter to University of Chicago’s statement on “safe spaces,” I told her that most of my liberal Twitter follows were enthusiastically sharing this piece, UChicago’s anti-safe spaces letter isn’t about academic freedom. It’s about power. The piece makes some coherent points, but mostly it is self-congratulatory intellectual masturbation. At a certain point the cultural Left no longer made any pretense to being liberal, and transformed themselves into “progressives.” They have taken Marcuse’s thesis in Repressive Tolerance to heart.

Though I hope that Dreger and her fellow travelers succeed in rolling back the clock, I suspect that the battle here is lost. She points out, correctly, that the total politicization of academia will destroy its existence as a producer of truth in any independent and objective manner. More concretely, she suggests it is likely that conservatives will simply start to defund and direct higher education even more stridently than they do now, because they will correctly see higher education as purely a tool toward the politics of their antagonists. I happen to be a conservative, and one who is pessimistic about the persistence of a public liberal space for ideas that offend. If progressives give up on liberalism of ideas, and it seems that many are (the most famous defenders of the old ideals are people from earlier generations, such as Nadine Strossen and Wendy Kaminer, with Dreger being a young example), I can’t see those of us in the broadly libertarian wing of conservatism making the last stand alone.

Honestly, I don’t want any of my children learning “liberal arts” from the high priests of the post-colonial cult. In the near future the last resistance on the Left to the ascendency of identity politics will probably be extinguished, as the old guard retires and dies naturally. The battle will be lost. Conservatives who value learning, and intellectual discourse, need to regroup. Currently there is a populist mood in conservatism that has been cresting for a generation. But the wave of identity politics is likely to swallow the campus Left with its intellectual nihilism. Instead of expanding outward it is almost certain that academia will start cannibalizing itself in internecine conflict when all the old enemies have been vanquished.

Let the private universities, such as Oberlin, wallow in their identity politics contradictions. Dreger already points to the path we will probably have to take: gut the public universities even more than we have. Leave STEM and some professional schools intact, and transform them for all practical purposes into technical universities. All the other disciplines? Some private universities, the playgrounds of the rich and successful, will continue to be traditionalist in maintaining “liberal arts,” which properly parrot the latest post-colonial cant. But much learning will be privatized, and knowledge will spread through segregated “safe spaces.” Those of us who read and think will continue to read and think, like we always have. We just won’t have institutional backing, because there’s not going to be a societal consensus for such support.

I hope I’m wrong.

Genotype them all, and let GWAS sort it out

Screenshot 2016-08-28 15.41.08
About thirteen years ago I expressed the opinion that an understanding of population structure would become a matter of mere intellectual curiosity once we had a better understanding of the genetic basis of characteristics. A friend, who was a statistical geneticist, told me that this was unlikely. We were unlikely to be able to predict outcomes well enough, even for highly heritable complex traits, to simply discard population structure information. Some of this is not due to genetics; different populations may expose themselves to different environmental conditions. For example, it would be useful to know which individuals in the CEU white European American data set are practicing Mormons, and which are not, because Mormonism tends to result in a lot of behavior modification.

But some of the concern about population structure has to do with the fact that genetic background matters, and we are unlikely to ever have total omniscience as to the nature of genetic interactions and dependencies. By this, I mean that if we have a strong causal signal which associates disease risk with a genetic variant, that risk is still conditional on other genetic variation across the genome. That variation is the outcome of demographic histories, which one can “control” for to some extent by accounting for population structure. In plainer language, a signal that predicts an outcome in Norwegians may not predict the same outcome in Nigerians. This may be due to different frequencies of other variants which are not directly causal, but which interact with the causal signals and vary between populations.

Screenshot 2016-08-28 15.58.43

More recently I’ve been a bit sanguine. I don’t follow the literature closely, but papers like High Trans-ethnic Replicability of GWAS Results Implies Common Causal Variants, make me wonder if the genetic background concerns weren’t over-wrought.

A new preprint, Population genetic history and polygenic risk biases in 1000 Genomes populations, suggests we should be worried. Or, more precisely, we should be cognizant of the limitations genetic background imposes upon us for certain classes of variants and disease. In particular, rare variants are going to be less portable across populations because of the shallower time depth of their emergence, after populations have diverged. So, if you have a low frequency major effect causal variant in Europeans, there is a much lower likelihood that it is in other populations.

The histogram above illustrates an excellent case study from the preprint. The genetic architecture of height and its genomic basis has been best elucidated for Europeans. We know, for example, many of the loci which distinguish Northern and Southern Europeans, and we know that selection has resulted in divergence between the two populations over the past 5,000 years. But as you can see the predicted heights seem to simply follow genetic distance from Europeans. SAS = South Asians, while AMR = a mixed cohort of populations from the Americas. EAS and AFR are East Asians and Africans. In reality, Africans are nearly as tall as Europeans (taller or shorter depending upon the reference European population), and taller than East Asians. The predictions here are off because the causal variants inferred from the studies of European cohorts are portable in direct proportion to shared demographic history. South Asians share a relatively ancient demographic history with Europeans, while many mixed groups from the Americas have Europeans as one of their recent founding populations. But in both cases the causal variants were likely segregating in the ancestral populations before divergence, so there is no major difference in the consequence.
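To make the mechanics concrete, here is a minimal sketch of how a population's mean predicted value falls out of an additive polygenic score. Everything in it is an assumption for illustration: the effect sizes, the allele frequencies, and the population labels are invented toy numbers, not values from the preprint, and the function name is hypothetical. The point is only that the predicted mean is a frequency-weighted sum over the SNPs found in the discovery cohort, so it tracks how far frequencies at those SNPs have drifted, whether or not the real trait differs.

```python
def mean_polygenic_score(effect_sizes, allele_freqs):
    """Expected additive score for a population: sum of 2 * frequency * effect.

    Assumes diploid individuals and purely additive per-allele effects.
    """
    if len(effect_sizes) != len(allele_freqs):
        raise ValueError("need one allele frequency per effect size")
    return sum(2 * freq * beta for freq, beta in zip(allele_freqs, effect_sizes))

# Toy per-allele effects, imagined as estimated in a European discovery GWAS.
effects = [0.3, -0.2, 0.1, 0.4]

# Toy frequencies of the same SNPs in three hypothetical populations.
freqs = {
    "discovery": [0.5, 0.4, 0.6, 0.3],
    "close_relative": [0.45, 0.4, 0.6, 0.3],
    "distant": [0.2, 0.6, 0.3, 0.1],
}

scores = {name: mean_polygenic_score(effects, f) for name, f in freqs.items()}
for name, score in scores.items():
    print(f"{name:>15}: mean predicted score = {score:.2f}")
```

In this toy setup the population whose frequencies have drifted furthest from the discovery cohort gets the lowest predicted mean, purely as a consequence of which SNPs were ascertained.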

The preprint has a lot more than just a reanalysis of GWAS. Using local ancestry deconvolution methods they show how one can infer history from patterns of genetic variation (though as always, this should not be taken as gospel, as there are biases in the methods currently used). The major take home is simple: population structure is real, and, it has real consequences functionally.

Open Thread, 8/28/2016

About 2/3 of the way through by Sanjeev Sanyal. It’s a wide-ranging book which synthesizes diverse disciplinary threads. The big over-arching thesis seems to be that movement of peoples and ideas was far less unidirectional than we often tend to think and are told. Probably one of the major examples of this, which I think has been somewhat misleading to many people, is the idea that migration out of Africa can be defined purely as unidirectional migration in a series of stepwise events.

Human_migration_out_of_Africa

That being said, there are the usual problems that occur when you synthesize diverse disciplines. Since I know a fair amount about the intersection of genetics and history I can say with great confidence that some of the genetics in the book is now outdated. The reason is that it relies on work that was published ~5 years ago. Also, there is the unfortunate reality that sometimes high-impact journals publish works that are almost certainly wrong. For example, Sanyal cites Genome-wide data substantiate Holocene gene flow from India to Australia. This paper is interesting, but it was clear to many that it was probably wrong almost immediately upon publication.

Longer review when I have time.

I need to read the paper closely. But the demographic-historical implications of this are pretty straightforward. (it’s open access)

G.E., the 124-Year-Old Software Start-Up. The story is interesting to me mostly because it illustrates how contingent modern civilization is. There are so many people doing so many specialized things that we take for granted.

Forget “Earth-Like”—We’ll First Find Aliens on Eyeball Planets. M. J. Engh’s was set on one of these planets.


Ohana is a suite of software for analyzing population structure and admixture history using unsupervised learning methods. We construct statistical models to infer individual clustering from which we identify outliers for selection analyses.

It may be better than ADMIXTURE, but we’re reaching a point where “good-enough” tools are achieving “lock-in.”

Empirical assessment of published effect sizes and power in the recent cognitive neuroscience and psychology literature. No surprise.

Down in the valley, up on the ridge. On Melungeons.

Leon Hadar is now a contributor to Secular Right, Will Trump Usher the GOP’s Secular Age?

The crescent and the globe. I wrote this.

Response to Euny Hong’s critique of 23andMe

Screenshot 2016-08-26 19.23.09

Update: In light of further comments I may have been wrong about Hong’s recent admixture! See the comments below (also, further discussion with Spencer Wells offline). I don’t have total clarity on what’s going on, because I’m sure my friends weren’t lying…but they were also early adopters, and the methods may have changed. And, I do think 23andMe has the talent and methods to resolve Korean ancestry, so it’s a matter of investment, not data.

All that being said, all individuals should pull down the raw data and do a reanalysis.

End update

Quartz has an article up, 23andMe has a problem when it comes to ancestry reports for people of color, which I want to comment on at length. Though literally taken the title is not something I’d disagree with too much, the tone and details I have serious issues with.

First, some disclosure. Hong talked to me on the phone for an hour about this story. Mostly we talked about her Korean ancestry results. More on that later. Second, I consulted for 2.5 years for Family Tree DNA, am friends with Spencer Wells (who is quoted), and am on friendly terms (I’d like to think!) with Joanna Mountain, and quite respect many of the scientists at 23andMe (e.g., Kasia Bryc and Ivan Juric off the top of my head).

I will go through the article point by point. First:

I doubt that most 23andMe users realize how paltry the company’s data is for non-Caucasians. For example: The data set that 23andMe used to generate my report has 76 Koreans in it, according to Dr. Joanna Mountain, the company’s senior director of research. 76 Koreans. It is estimated there are at least 7 million Koreans living outside of the Korean peninsula—including 1.7 million in the US—among a worldwide population of 83 million.

Seventy-six Koreans seemed small to me, but what do I know? I’m just a journalist. So I spoke to geneticist Spencer Wells, founder and former director of National Geographic’s Genographic Project (arguably a 23andMe competitor), which he ran from 2005-2015. “[76] is a really low number,” he concurred.

The small sample sizes seem really, really problematic if you are a lay person, or a journalist. The issue is that with genotyping technology that looks for common polymorphisms you really don’t get that much more information from 1,000 individuals than you do from 100. All things equal, a larger sample is better, but the gap between 10 and 100 is much, much greater than the gap between 100 and 1,000, or between 100 and 10,000. You can see this in the robustness of results for model-based clustering conditional on different sample sizes. For a homogeneous population like the peoples of the Korean peninsula, who seem relatively panmictic, a bigger sample size would have only a marginal effect on the overall outcomes using these methods (though it might matter if you were looking at low-frequency alleles from whole genome sequencing).
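The diminishing return is easy to see from the binomial sampling error of an allele frequency estimate. What follows is a minimal sketch, assuming a single biallelic SNP with a true frequency of 0.3 in a randomly mating population. The function name and the frequency are illustrative choices of mine, not anything from 23andMe’s pipeline.

```python
import math

def allele_freq_standard_error(p, n_individuals):
    """Standard error of an allele frequency estimated from n diploid individuals.

    Assumes random sampling of 2n chromosomes from a panmictic population.
    """
    n_chromosomes = 2 * n_individuals
    return math.sqrt(p * (1 - p) / n_chromosomes)

# Illustrative true frequency (an assumption, not real data).
p = 0.3
sample_sizes = [10, 100, 1000, 10000]
errors = {n: allele_freq_standard_error(p, n) for n in sample_sizes}

for n in sample_sizes:
    print(f"n = {n:>5}: standard error = {errors[n]:.4f}")
```

The error shrinks with the square root of the sample size. In absolute terms, going from 10 to 100 individuals removes about 0.07 of error here, going from 100 to 1,000 removes about 0.02, and each further tenfold increase buys less still.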

Before I talked to Hong I checked in with a friend who was half north Korean (in that her father’s family was from the northern half of the peninsula and migrated south) and half central Korean (i.e., her mother’s family was from around Seoul). Just like her husband, whose family was from Busan in the far south, her results came back as 99% Korean. Some genetic research has been done on Koreans, and there just isn’t that much structure. The Koreans have a composite origin if you go far back enough, but they’ve been intermarrying with each other a long time.


Also, astonishingly, the report shows that I am 13.4% Japanese and 14% Chinese—and only 61.6% Korean. I was looking forward to watching my parents freak out. My sister texted me, “Oh [Dad will] probably blame Mom.”

To my disappointment, my parents did not freak out, nor did they get into an amusing argument about which of their ancestors was the ho. Because they simply did not believe the data. And, for once, they were right.

The public relies on journalists for the truth. Sometimes the truth can be slippery. But sometimes it is clear. Most of the conversation between Hong and myself was about her Korean ancestry. As I said to her, I asked a handful of my Korean friends about their 23andMe results before we spoke. From that I told Hong I was 99% sure that she had recent non-Korean ancestry. 23andMe’s results are really robust. I tried to emphasize that over and over. Hong can believe what she wants, but it is obvious that she almost certainly has non-Korean ancestry relatively recently in the past.

Because 23andMe uses chromosome painting, you can see she has very long segments of inferred Chinese and Japanese ancestry. This non-Korean ancestry is probably from within the last three generations because ancestry tract lengths indicate that recombination hasn’t broken apart the associations across the chromosomes (there are 20-40 recombination events across the genome per generation).
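The link between tract length and time can be sketched with a common back-of-the-envelope approximation: after g generations of recombination, segments inherited from a single admixture event average roughly 100/g centimorgans. This is a simplification that ignores the admixture proportion and the finite length of chromosomes, and the function name is hypothetical.

```python
def expected_tract_length_cm(generations):
    """Approximate mean ancestry tract length in centimorgans.

    Assumes a single pulse of admixture `generations` ago and roughly one
    crossover per Morgan (100 cM) per generation. Ignores the admixture
    proportion and finite chromosome length.
    """
    if generations < 1:
        raise ValueError("generations must be at least 1")
    return 100.0 / generations

for g in (1, 2, 3, 10, 50):
    length = expected_tract_length_cm(g)
    print(f"{g:>2} generations ago: ~{length:.0f} cM per tract")
```

Under this approximation, admixture within the last few generations leaves tracts tens of centimorgans long, while admixture dozens of generations back leaves tracts of only a few centimorgans. That is why very long inferred segments point to a recent ancestor.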


I asked Wells whether my percentage breakdowns of Korean, Chinese, and Japanese meant anything. “Yes,” he said, “but I think it is misleading to go to a decimal place or even to go out two digits.” Wells said that another problem with the data is that “Most of those [samples] are from the US. They’re not terribly useful for studies of indigenous composition—which is effectively what this analysis is trying to do.”

I had a long text conversation with Spencer on this after the article came out. I can see where he’s coming from. And 23andMe does have a shortfall of indigenous and non-European samples. But as I said, I asked around to Korean friends who had used 23andMe before and the population is pretty homogeneous, and the friends’ results I cited above were representative. I have also worked with and seen samples from Family Tree DNA, and it’s the same story. There might be undersampled populations from Korea, but I’d bet against it. Koreans are relatively homogeneous, with a position between Japanese and North Chinese. Where you would expect them to be.

Spencer is correct about the decimal places issue. They give people a false impression of precision. I do know that scientists within DTC companies struggle against it. But scientists don’t always win these arguments.


I also interviewed Harvard geneticist Robert Green, who made the important point that private companies have different methods and standards from those of an academic lab. “There is a difference between analysis you can do with hundreds of [genetic] markers at a research level, and the kind of analysis that even the best companies can do, which is more an approximation,” he said.

Green is a medical geneticist who does great work. But I’ll be generous and assume he’s taken totally out of context here, because what he says makes no sense. The genotyping platforms do have error rates (no-calls, mistypings, etc.) on the order of 1%. But they’re using hundreds of thousands of SNPs. This error rate doesn’t matter too much for what 23andMe is doing in relation to ancestry. And with population structure inference these errors usually don’t cause a major issue if they aren’t systematic.
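To illustrate why non-systematic errors wash out, here is a toy simulation. It is a sketch under stated assumptions, not a model of any real platform: two hypothetical populations with modestly different allele frequencies, independent SNPs, and errors that replace a genotype call with a random one. All names and parameter values are invented for the example.

```python
import math
import random

def genotype_log_likelihood(genotype, freq):
    """Log-likelihood (up to a constant) of 0/1/2 allele copies given a frequency."""
    return genotype * math.log(freq) + (2 - genotype) * math.log(1 - freq)

def assignment_log_ratio(n_snps=5000, error_rate=0.01, seed=1):
    """Log-likelihood ratio favoring population A for an individual truly from A."""
    rng = random.Random(seed)          # drives frequencies and true genotypes
    err_rng = random.Random(seed + 1)  # drives genotyping errors only
    log_ratio = 0.0
    for _ in range(n_snps):
        # Toy allele frequencies in two hypothetical populations.
        freq_a = rng.uniform(0.1, 0.9)
        freq_b = min(0.95, max(0.05, freq_a + rng.gauss(0, 0.1)))
        # True genotype of an individual sampled from population A.
        genotype = int(rng.random() < freq_a) + int(rng.random() < freq_a)
        # Non-systematic error: the call is replaced by a random genotype.
        if err_rng.random() < error_rate:
            genotype = err_rng.choice([0, 1, 2])
        log_ratio += (genotype_log_likelihood(genotype, freq_a)
                      - genotype_log_likelihood(genotype, freq_b))
    return log_ratio

clean = assignment_log_ratio(error_rate=0.0)
noisy = assignment_log_ratio(error_rate=0.01)
print(f"log-likelihood ratio without errors: {clean:.1f}")
print(f"log-likelihood ratio with 1% errors: {noisy:.1f}")
```

Because the evidence is summed over thousands of markers, corrupting about 1% of them at random should barely move the total. A systematic error, one that pushed calls in the same direction at many SNPs, would be a different matter.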

Then there’s this:

A few of the geneticists I interviewed for this article (but not Green or Wells) outright accused 23andMe of commercially driven ethnic bias. For example, no distinction is made between northern and southern Chinese, who have very different traits. This was a serious allegation, so I put the question directly before 23andMe’s Mountain. “As a scientist, I find that insulting,” she said in a phone interview.

I brought up the issue with the Chinese to Hong, and I apologize to Mountain here if it came off as offensive, because I certainly didn’t mean it that way. My point, which I’ve brought up for years both in public, and when I have consulted for DTC companies, is that South and East Asians are huge groups, and it’s incongruous that they aren’t differentiated as much as the Europeans. These tests basically tell you that you are South Asian, or Chinese, or Korean, or Japanese. In the case of Koreans and Japanese there isn’t that much structure within these groups, but that is not the case with the Han Chinese. There is a decent amount of structure, but last I checked 23andMe has a catchall Han Chinese group. Why? I’ll get to that later. (It’s not because they don’t have the data.)

Though I disagree with the tone and the emphasis, a simple inspection by Hong has shed light on something that has been glaringly obvious in the genetic genealogy community: there is laser-like focus on differentiating very close Northern European groups, such as Irish and English, and not so much emphasis on differentiating diverse populations such as South Asians. This was one thing I did talk to Hong about at length. I don’t think it’s crass racism, and I think that I made that clear to her, but I’m not happy with the situation either (23andMe representatives know I’m not happy, and have talked to me about it at ASHG).

The final sections involve Hong reviewing the disparities in sample representation. As I said above, some of this is overdone. But, it is a little ridiculous that there are only a few hundred African population samples in their data. Granted, it turns out that between-population genetic distance in Africa is actually not as much as you’d think based on aggregate variation (the within population variation is what makes all the news). I think Hong is correct that 23andMe should have made more effort on sample collection these past few years…but I’m not CEO of 23andMe, and Joanna Mountain and her scientists don’t call all the shots. I think Hong’s piece leaves Mountain and the researchers holding the bag for something that really isn’t their doing (perhaps it is, but I’m really skeptical of that).


Could the company be doing a better job with collecting ethnographic data? “Absolutely they could,” Wells said, “but it’s not their raison d’être.” Which, of course, is pharma and health research. Fair enough—it’s their money. But how about a disclaimer attached to the ancestry part of the report? Like, “for entertainment purposes only?” Because data based on 76 Koreans (or any other ethnic group) is definitely not worth potentially causing family discord or a blood feud. I don’t know whether the company understands the realities of deadly global ethnic tensions and the potential damage created by people’s trust in these reports.

I think Spencer has highlighted the major dynamic here: 23andMe is pivoting towards biomedical research. It has a database of north of a million, mostly European-origin individuals. The real money now comes from leveraging the database to collect information on health, and combining it with the genotypes they already have. On the margin, getting greater population diversity is probably not a major avenue by which they could gain higher valuations. And getting from one million to ten million genotypes is nothing without increasing their database of phenotypes.

The real story here is not one of racism. It’s one of capitalism. Most of 23andMe’s customers are white European in ancestry, and a disproportionate number of those are Northern European. Is it a surprise that their tools break down Northern European ancestry so finely? That’s their customer base.

Second, many Asians I’ve talked to are relatively uninterested in fine-grained breakdowns in their ancestry. For several years I worked with an engineer from Fujian, and his Family Tree DNA results showed that he was shifted toward the southern end of the north-south Chinese cline. He didn’t care at all, because he was from Fujian, so of course he knew this. Many Asians seem to have this attitude where the ancestry results are viewed as confirmatory. Hong’s case, where there was a surprise, is exceptional.

If 23andMe wanted to, they could easily break down Asians into further subcomponents. I think there are two reasons they don’t want to, aside from the firm’s recent focus on health and pharma. First, they don’t have that many Asian customers. Second, their Asian customers might actually get a bit irritated!

Ultimately, Hong can think whatever she wants to about her 23andMe results. But the data are out there. It’s pretty obvious that unless there was a sample mix-up, she has recent Chinese and Japanese ancestry (she could put the raw results in the public domain and have people cross-check with other methods, like PCA; I’m pretty sure they would confirm the 23andMe results).

On a last nerdy note: the data generated by DTC companies is great. Their Illumina SNP-chips are really good, with 99% or so correct-call rates. Hong referred to data in the piece when she really meant results. The thing is that results are basically generated through a sieve of methods geared toward human digestibility. 23andMe and other DTC companies differ because of different methods and parameters in those methods, that are determined by what humans want out of these techniques. But the data, that’s pretty straightforward and robust.

If you are interested in a more philosophical take, see Joe Pickrell’s What is ancestry?

Addendum: My conversation with Hong was very wide-ranging. We talked about EDAR, random mating populations, and local ancestry deconvolution. Well, perhaps not in those words. It’s a little saddening to me that ultimately what came out of all that is a piece which tries to paint 23andMe as prejudiced against minorities. The only prejudice they exhibit as a firm is against smaller market share.

Southern Crafted Hot Sauce

I got this hot sauce at Whole Foods. The original Whole Foods.

What a disappointment. Salty. Without much other flavor besides the spice. It was like a watery spin on Louisiana hot sauce. I couldn’t taste the “aromatic spices” and “fresh herbs.” And don’t tell me it is because it’s too spicy, I didn’t find it too spicy. I did find it very salty though.