Epistasis is one of those terms in biology which has multiple meanings, to the point that even biologists can get turned around (see this 2008 review, Epistasis — the essential role of gene interactions in the structure and evolution of genetic systems, for a little background). Most generically epistasis is the interaction of genes in terms of producing an outcome. But historically its meaning is derived from the fact that early geneticists noticed that crosses between individuals segregating for a Mendelian characteristic (e.g., smooth vs. curly peas) produced results conditional on the genotype of a secondary locus.
Molecular biologists tend to focus on a classical, and often mechanistic view, whereby epistasis can be conceptualized as biophysical interactions across loci. But population geneticists utilize a statistical or evolutionary definition, where epistasis describes the extend of deviation from additivity and linearity, with the “phenotype” often being fitness. This goes back to early debates between R. A. Fisher and Sewall Wright. Fisher believed that in the long run epistasis was not particularly important. Wright eventually put epistasis at the heart of his enigmatic shifting balance theory, though according to Will Provine in even he had a difficult time understanding the model he was proposing (e.g., Wright couldn’t remember what the different axes on his charts actually meant all the time).
These different definitions can cause problems for students. A few years ago I was a teaching assistant for a genetics course, and the professor, a molecular biologist asked a question about epistasis. The only answer on the key was predicated on a classical/mechanistic understanding. But some of the students were obviously giving the definition from an evolutionary perspective! (e.g., they were bringing up non-additivity and fitness) Luckily I noticed this early on and the professor approved the alternative answer, so that graders would not mark those using a non-molecular answer down.
My interested in epistasis was fed to a great extent in the middle 2000s by my reading of . Unfortunately not too many people read this book. I believe this is so because when I just went to look at the it told me that “Customers who viewed this item also viewed” Robert Drews’ . As it happened I read this book at about the same time as …and to my knowledge I’m the only person who has a very deep interest in statistical epistasis and Mycenaean Greece (if there is someone else out there, do tell).
In any case, when I was first focused on this topic genomics was in its infancy. Papers with 50,000 SNPs in humans were all the rage, and the HapMap paper had literally just been published. A lot has changed.
So I was interested to see this come out in Science, Negative selection in humans and fruit flies involves synergistic epistasis (preprint version). Since the authors are looking at humans and Drosophila and because it’s 2017 I assumed that genomic methods would loom large, and they do.
And as always on the first read through some of the terminology got confusing (various types of statistical epistasis keep getting renamed every few years it seems to me, and it’s hard to keep track of everything). So I went to Google. And because it’s 2017 a citation of the paper and further elucidation popped up in Google Books in . Weirdly, or not, the book has not been published yet. Since the author is the second to last author on the above paper it makes sense that it would be cited in any case.
So what’s happening in this paper? Basically they are looking to reduced variance of really bad mutations because a particular type of epistasis amplifies their deleterious impact (fitness is almost always really hard to measure, so you want to look at proxy variables).
Because de novo mutations are rare, they estimate about 7 are in functional regions of the genome (I think this may be high actually), and that the distribution should be Poisson. This distribution just tells you that the mean number of mutations and the variance of the the number of mutations should be the same (e.g., mean should be 5 and variance should 5).
Epistasis refers (usually) to interactions across loci. That is, different genes at different locations in the genome. Synergistic epistasis means that the total cumulative fitness after each successive mutation drops faster than the sum of the negative impact of each mutation. In other words, the negative impact is greater than the sum of its parts. In contrast, antagonistic epistasis produces a situation where new mutations on the tail of the distributions cause a lower decrement in fitness than you’d expect through the sum of its parts (diminishing returns on mutational load when it comes to fitness decrements).
These two dynamics have an effect the linkage disequilibrium (LD) statistic. This measures the association of two different alleles at two different loci. When populations are recently admixed (e.g., Brazilians) you have a lot of LD because racial ancestry results in lots of distinctive alleles being associated with each other across genomic segments in haplotypes. It takes many generations for recombination to break apart these associations so that allelic state at one locus can’t be used to predict the odds of the state at what was an associated locus. What synergistic epistasis does is disassociate deleterious mutations. In contrast, antagonistic epistasis results in increased association of deleterious mutations.
Why? Because of selection. If a greater number of mutations means huge fitness hits, then there will strong selection against individuals who randomly segregate out with higher mutational loads. This means that the variance of the mutational load is going to lower than the value of the mean.
How do they figure out mutational load? They focus on the distribution of LoF mutations. These are extremely deleterious mutations which are the most likely to be a major problem for function and therefore a huge fitness hit. What they found was that the distribution of LoF mutations exhibited a variance which was 90-95% of a null Poisson distribution. In other words, there was stronger selection against high mutation counts, as one would predict due to synergistic epistasis.
Thus, the average human should carry at least seven de novo deleterious mutations. If natural selection acts on each mutation independently, the resulting mutation load and loss in average fitness are inconsistent with the existence of the human population (1 − e−7 > 0.99). To resolve this paradox, it is sufficient to assume that the fitness landscape is flat only outside the zone where all the genotypes actually present are contained, so that selection within the population proceeds as if epistasis were absent (20, 25). However, our findings suggest that synergistic epistasis affects even the part of the fitness landscape that corresponds to genotypes that are actually present in the population.
Overall this is fascinating, because evolutionary genetic questions which were still theoretical a little over ten years ago are now being explored with genomic methods. This is part of why I say genomics did not fundamentally revolutionize how we understand evolution. There were plenty of models and theories. Now we are testing them extremely robustly and thoroughly.
Addendum: Reading this paper reinforces to me how difficult it is to keep up with the literature, and how important it is to know the literature in a very narrow area to get the most out of a paper. Really the citations are essential reading for someone like me who just “drops” into a topic after a long time away….
Citation: Science, Negative selection in humans and fruit flies involves synergistic epistasis.