Apes just being apes

A while back I made fun of bonobos and chimpanzees for being kind of losers, looking across at each other from either side of the Congo river for ~1.5 million years, the time elapsed since their divergence. I finally ended up reading the paper from last year, Chimpanzee genomic diversity reveals ancient admixture with bonobos, which reported a complex population history between these two species. In other words, “they got it on”.

The key was a reasonable sample size of N=40 and high coverage genomes (>20x), which gave them the information necessary to have the power to detect admixture. If you aren’t human and have a reasonable size genome, and all mammals do, get to the back of the line. But Pan’s turn finally arrived.

The paper’s primary result is that over the past few hundred thousand years there have been reciprocal gene flow events of small, but detectable, magnitude between chimpanzees and bonobos. Naturally, there was some geographic specificity here, in that chimpanzees from far West Africa lack much evidence of this while those from Central Africa have a great deal. The admixture is directly proportional to proximity to the bonobo range.

To obtain the result, their initial focus was on high-frequency bonobo derived alleles that were at low to moderate frequencies in chimpanzees. There was a notable excess of this class among Central African chimpanzees. And these alleles seem to have introgressed recently.
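To make the logic of that allele class concrete, here is a minimal Python sketch. It is a toy illustration, not the paper's pipeline: the function name count_candidate_sites, the frequency thresholds, and the six made-up sites are all assumptions chosen for readability.

```python
# Toy sketch: count sites where the derived allele is at high frequency in
# bonobos but at low-to-moderate (non-zero) frequency in a chimpanzee sample.
# Thresholds and frequencies are hypothetical, not values from the paper.

def count_candidate_sites(bonobo_freqs, chimp_freqs,
                          bonobo_min=0.9, chimp_low=0.0, chimp_high=0.5):
    """Count sites with derived-allele frequency >= bonobo_min in bonobos
    and strictly between chimp_low and chimp_high in the chimp sample."""
    count = 0
    for b, c in zip(bonobo_freqs, chimp_freqs):
        if b >= bonobo_min and chimp_low < c < chimp_high:
            count += 1
    return count

# Hypothetical derived-allele frequencies at six sites.
bonobo = [0.95, 1.00, 0.92, 0.30, 0.97, 0.99]
central_chimps = [0.10, 0.25, 0.05, 0.20, 0.00, 0.40]
western_chimps = [0.00, 0.00, 0.05, 0.20, 0.00, 0.00]

central_count = count_candidate_sites(bonobo, central_chimps)
western_count = count_candidate_sites(bonobo, western_chimps)
print(central_count, western_count)
```

In this made-up data the Central African sample carries more sites of the candidate class than the Western one, which is the shape of the excess described above. A real analysis would also need a null expectation for how many such sites arise without any gene flow.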

I suppose the major takeaway is that hominids do it like they do it on the Discovery Channel.

Selection swimming against the genomic tide

One of the major issues that confuses people is that the distribution of a trait or gene is often only weakly correlated with overall phylogeny and the rest of the genome.

To give a strange but classic example, the MHC loci are subject to strong balancing selection. This means that novel alleles do not substitute and replace ancestral alleles. Substitution of this sort results in “lineage sorting,” so that when you look at chimpanzees and humans you can see many polymorphic loci where all humans carry one variant and all chimpanzees the other. In contrast at the MHC loci there is frequency-dependent selection for rare variants, so the normal cycling process does not occur. Humans and chimpanzees overlap quite a bit on MHC, and any given human may have a more similar profile to a given chimpanzee than another human.
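A minimal sketch of the contrast, assuming a haploid two-allele model with a made-up selection coefficient (the function names and fitness forms are illustrative assumptions, not a model of real MHC loci). Under directional selection a new allele sweeps to fixation. Under an advantage for rare variants it rises while rare and then stalls at an intermediate frequency, so both alleles persist.

```python
# Toy deterministic model of allele-frequency change under two kinds of
# selection. S and the fitness functions are hypothetical choices.

def next_freq(p, w_new, w_old):
    """One generation of haploid selection: updated frequency of the new allele."""
    mean_fitness = p * w_new + (1 - p) * w_old
    return p * w_new / mean_fitness

def simulate(fitness_fn, p0=0.01, generations=2000):
    """Iterate the recursion from a rare starting frequency p0."""
    p = p0
    for _ in range(generations):
        w_new, w_old = fitness_fn(p)
        p = next_freq(p, w_new, w_old)
    return p

S = 0.1  # hypothetical selection coefficient

def directional(p):
    # The new allele always has a fixed advantage.
    return 1 + S, 1.0

def rare_advantage(p):
    # Each allele's fitness drops as it becomes common.
    return 1 - S * p, 1 - S * (1 - p)

p_directional = simulate(directional)
p_balanced = simulate(rare_advantage)
print(p_directional, p_balanced)
```

With these assumptions the directional case ends essentially fixed, while the rare-advantage case settles near a frequency of one half. That is the sense in which the normal substitution process never completes.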

There are 19,000 human genes. Of 3 billion base pairs, only ~100 million are polymorphic on a worldwide scale (using some liberal definitions). There are lots of unique stories to tell here.

A new preprint, Inferring adaptive gene-flow in recent African history, illustrates how certain genes with functional significance may differ from the genome-wide background. The authors find that among the Fula (Fulani) people of West Africa there has been introgression of a Eurasian mutation that confers lactase persistence. The area of the genome around this gene is much more Eurasian than the rest of the genome. In contrast, the area around the Duffy allele is much less Eurasian. The variation in this locus is related to malaria resistance. Finally, in other African populations, they found gene flow of MHC variants.
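The idea of a locus departing from the genome-wide background can be sketched with a simple outlier scan. To be clear, this is not the haplotype-based method of the preprint. It is a stand-in using a median and median-absolute-deviation rule, and the per-window ancestry fractions, the function name flag_outlier_windows, and the threshold are all hypothetical.

```python
from statistics import median

def flag_outlier_windows(ancestry, threshold=5.0):
    """Flag windows whose ancestry fraction deviates from the genome-wide
    median by more than threshold * MAD (median absolute deviation)."""
    background = median(ancestry)
    mad = median(abs(x - background) for x in ancestry)
    flagged = []
    for i, x in enumerate(ancestry):
        if abs(x - background) > threshold * mad:
            flagged.append((i, "excess" if x > background else "deficit"))
    return flagged

# Hypothetical Eurasian-ancestry fraction in ten genomic windows.
windows = [0.20, 0.22, 0.18, 0.21, 0.19, 0.60, 0.20, 0.02, 0.21, 0.20]

flagged = flag_outlier_windows(windows)
print(flagged)
```

In this toy data one window carries far more Eurasian ancestry than the background and one far less, mirroring the lactase persistence and Duffy patterns in direction only. Real ancestry tracts are correlated along the chromosome, so a rule this simple would not be adequate for actual inference.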

None of this is entirely surprising, though the authors apply novel haplotype-based methods which should have wider utility.

Quantitative genomics, adaptation, and cognitive phenotypes

The human brain utilizes ~20% of the calories you take in per day. It’s a large and metabolically expensive organ. Because of this fact there are lots of evolutionary models which focus on the brain. In Catching Fire: How Cooking Made Us Human Richard Wrangham suggests that our need for calories to feed our brain is one reason we started to use fire to pre-digest our food. In The Mating Mind Geoffrey Miller seems to suggest that all the things our big complex brain does allow for a signaling of mutational load. And in Grooming, Gossip, and the Evolution of Language Robin Dunbar suggests that it’s social complexity which is driving our encephalization.

These are all theories. Interesting hypotheses and models. But how do we test them? A new preprint on bioRxiv is useful because it shows how cutting-edge methods from evolutionary genomics can be used to explore questions relating to cognitive neuroscience and psychopathology, Polygenic selection underlies evolution of human brain structure and behavioral traits:

…Leveraging publicly available data of unprecedented sample size, we studied twenty-five traits (i.e., ten neuropsychiatric disorders, three personality traits, total intracranial volume, seven subcortical brain structure volume traits, and four complex traits without neuropsychiatric associations) for evidence of several different signatures of selection over a range of evolutionary time scales. Consistent with the largely polygenic architecture of neuropsychiatric traits, we found no enrichment of trait-associated single-nucleotide polymorphisms (SNPs) in regions of the genome that underwent classical selective sweeps (i.e., events which would have driven selected alleles to near fixation). However, we discovered that SNPs associated with some, but not all, behaviors and brain structure volumes are enriched in genomic regions under selection since divergence from Neanderthals ~600,000 years ago, and show further evidence for signatures of ancient and recent polygenic adaptation. Individual subcortical brain structure volumes demonstrate genome-wide evidence in support of a mosaic theory of brain evolution while total intracranial volume and height appear to share evolutionary constraints consistent with concerted evolution…our results suggest that alleles associated with neuropsychiatric, behavioral, and brain volume phenotypes have experienced both ancient and recent polygenic adaptation in human evolution, acting through neurodevelopmental and immune-mediated pathways.

The preprint takes a kitchen-sink approach, throwing a lot of methods of selection at the phenotype of interest. Also, there is always the issue of cryptic population structure generating false positive associations, but they try to address it in the preprint. I am somewhat confused by this passage though:

Paleobiological evidence indicates that the size of the human skull has expanded massively over the last 200,000 years, likely mirroring increases in brain size.

From what I know human cranial sizes leveled off in growth ~200,000 years ago, peaked ~30,000 years ago, and have declined ever since then. That being said, they find signatures of selection around genes associated with ‘intracranial volume.’

There are loads of results using different methods in the paper, but I was curious to note that schizophrenia had hits for ancient and recent adaptation. A friend who is a psychologist pointed out to me that when you look within families “unaffected” siblings of schizophrenics often exhibit deviation from the norm in various ways too; so even if they are not impacted by the disease, they are somewhere along a spectrum of ‘wild type’ to schizophrenic. In any case in this paper they found recent selection for alleles ‘protective’ of schizophrenia.

There are lots of theories one could spin out of that singular result. But I’ll just leave you with the fact that when you have a quantitative trait with lots of heritable variation it seems unlikely it’s been subject to a long period of unidirectional selection. Various forms of balancing selection seem to be at work here, and we’re only in the early stages of understanding what’s going on. Genuine comprehension will require:

– attention to population genetic theory
– large genomic data sets from a wide array of populations
– novel methods developed by population genomicists
– and functional insights which neuroscientists can bring to the table

The future will be genetically engineered


If the film Rise of the Planet of the Apes had come out a few years later I believe there would have been mention of CRISPR. Sometimes science leads to technology, and other times technology aids in science. On occasion the two are one and the same.

The plot I made above shows that in the first five years of the second decade of the 21st century CRISPR went from being an obscure aspect of bacterial genetics to ubiquitous. Friends who had been utilizing “advanced” genetic engineering methods such as TALENs and zinc fingers switched overnight to a CRISPR/Cas9 framework.

As I’ve said before the 2010s are the decade when “reading” the genome becomes normal. We really don’t know what the CRISPR/Cas9 technology is capable of. It’s early years yet. With that, First Human Embryos Edited in U.S. Technically they’re single-celled zygotes. The science itself is not astounding. Rather, it is that the human Rubicon has been crossed in the United States. As indicated in the article there has been some jealousy about what the Chinese have been able to do because of a different cultural and regulatory framework.

There are those calling for a moratorium on this work (on humans). I’m not in favor or opposed. Rather, my question is simple: if CRISPR/Cas9 makes genetic engineering cheap, easy, and effective, how exactly are we going to enforce a world-wide moratorium? A Butlerian Jihad?

Note: I know that people are freaking out about humans + genetic engineering. But most geneticists I know are more excited about the prospects of non-human work, since human clinical trials are going to be way in the future. Over 20 years since Dolly it’s notable to me that no human has been cloned from adult somatic cells yet.

Reason is but a slave of passions as it always has been

David Hume stated that “reason is, and ought only to be the slave of the passions.” I don’t know about the ought part, that’s up for debate. But the is part seems empirically true. The reasons people give for this or that are often just post hoc rationalizations. To give a different twist to this contention, others have argued that reason exists to win arguments, not converge upon truth. Or more precisely in my opinion to give the patina of erudition or abstraction to sentiments which are fundamentally derived from emotion or manners enforced through group norms (ergo, the common practice of ‘educated’ people citing scholars whose work we can’t evaluate to buttress our own preconceptions; we all do it).

One of the reasons I recommend In Gods We Trust, and cognitive anthropology more generally, to atheists and religious skeptics is that it gives a better empirical window into the mental processes that are really at work, as opposed to those which people say are at work (or, more unfortunately, those they think are at work). In In Gods We Trust the author reports on research conducted where religious believers are given a set of factual assertions purportedly from scholarship (e.g., the Dead Sea Scrolls). These assertions on the face of it flatly contradict their religious beliefs in some deep fundamental way. But when confronted with facts which seem to logically refute the coherency of their beliefs, they often still accept the validity of the scholarship before them. When asked about the impact on their beliefs? Respondents generally asserted that the new facts strengthened their beliefs.

This is one reason that cognitive anthropologists term religious ‘reasoning’ quasi-propositional. It takes the general form of analysis from axioms, but ultimately the rationality is beside the point; it is simply an arrow in the quiver of a broader and deeper cognitive phenomenon.

To give a personal example which illustrates this: many, many years ago I knew, passingly, a Jewish girl of Modern Orthodox background. She once asserted to me that the event of the Holocaust strengthened her belief in her God. I didn’t follow through on this discussion, as it was too disturbing to me. But it brought home to me that in some way the “reasoning” of many religious people leaves me totally befuddled (and no doubt vice versa).

As it happens, while in the course of writing this post, I found out that Hugo Mercier and Dan Sperber, the authors of the above argument in relation to reason and argumentation, published a book last month, The Enigma of Reason. I encourage readers to get it. I just bought a Kindle copy. Dan Sperber, who I interviewed 12 years ago, is a very deep thinker on the level of Daniel Kahneman. (He’s French, and his prose can be somewhat difficult, so I wonder if that’s one reason he’s not nearly as well known.)

Ultimately the point of this post actually goes back to genomics and history. Ann Gibbons has an excellent piece in Science, There’s no such thing as a ‘pure’ European—or anyone else. In it she draws on the most recent research in human population genomics to refute antiquated ideas about the purity of any given population. If you have read this blog for the past few years you already know most human populations are complex admixtures; that is, it isn’t a human family tree, but a human family graph.

Gibbons’ piece attacks directly some standard racialist talking points which have been refuted on a factual basis by genetic science:

When the first busloads of migrants from Syria and Iraq rolled into Germany 2 years ago, some small towns were overwhelmed. The village of Sumte, population 102, had to take in 750 asylum seekers. Most villagers swung into action, in keeping with Germany’s strong Willkommenskultur, or “welcome culture.” But one self-described neo-Nazi on the district council told The New York Times that by allowing the influx, the German people faced “the destruction of our genetic heritage” and risked becoming “a gray mishmash.”

In fact, the German people have no unique genetic heritage to protect. They—and all other Europeans—are already a mishmash, the children of repeated ancient migrations, according to scientists who study ancient human origins. New studies show that almost all indigenous Europeans descend from at least three major migrations in the past 15,000 years, including two from the Middle East. Those migrants swept across Europe, mingled with previous immigrants, and then remixed to create the peoples of today.

First, let’s set aside the political question of welcoming on the order of one million refugees to Germany. I will not post comments discussing that.

As a point of fact the truth genetically in relation to Germans is even more complex than what Gibbons asserts. When I worked with FamilyTree DNA I had access to their database and presented at their yearly conference some interesting results from people whose four grandparents were from Germany. In short, Germans tended to fall into three main clusters, one that was strongly skewed toward people from some parts of France, another which was shifted toward Scandinavians, and a third which was very similar to Slavs.

The historical and cultural reasons for this are easy to guess at. The takeaway here is that unlike Finns, or Irish, and to a great extent Scandinavians and Britons, Germany exhibits a lot of population substructure within it because of assimilation or migration in the last ~1,000 years. This is why genetically saying someone is “German” is very difficult when compared to saying someone is Polish or Swedish. By dint of their cultural expansiveness Germans are everyone and no one set next to other Northern Europeans* (with the exception perhaps of the French…I’m sure Germans will appreciate this comparison!).

The conceit of these sorts of pieces is that racists will confront refutations which will shatter their racist axioms. But since most of the people who write these pieces and read Science are not racists, they won’t have a good intuition on the cognitive processes at work for genuine racists.

This causes problems. As a comparison, many atheists seem to think that refutation of the Athanasian creed will blow Christians away and make them forsake their God (or showing them contradictions in the Bible, admit that you’ve gone through that phase!). Though the Church Father Tertullian’s assertion that he “believed because it is absurd” is more subtle than I often make it out to be, on the face of it it does reflect how outsiders view a normative social group like Christianity.

The emphasis here is on normative. Social or religious movements and sentiments are often about norms, which emerge at the intersection of history, intuition, instinct, and facts. I place facts last in the list, because I think it is a defensible stance to take that facts are the least important variable!

The field of cultural evolution has shown that group cohesion and communal norms have been major drivers of human evolution. Likely there has been gene-cultural coevolution so that group conformity has been selected for as a way to make social units operate more smoothly. Social cognition is a thing; people believe what they believe because other people in their social groups believe something, not because they’ve reasoned to it themselves. Original reasoning is hard. Letting others derive for you, and plugging and chugging is easy. As Muhammad stated, the Ummah will not agree upon error! The smarter people are, the better they are at reasoning…but the better they are at motivated reasoning, ignorance, and rationalization.

When faced with disconfirming evidence some people can dig in and deny the plain facts. Creationists are a straightforward case of this. Then there are evaders. From what I have seen on the political Left in the United States at least over the last 15 years (when I’ve been engaging actively with people on the internet) there has been a consistent pattern of obfuscation and dodging the likely reality of sex differences in many quarters. When pinned down on the fundamentals few deny the principle or the possibility, but they almost always impose an extremely high level of skepticism that is not found in other domains, where their epistemology is far less stringent.

But then there is a third case, where facts that seem to you, on first blush, to refute a belief only strengthen the beliefs of someone with whom you already disagree. I am generally of the view that the rise of naturalistic science has probably undermined the case for classical supernaturalist theism, which emerged in the pre-modern era. Reasonable people can disagree, as I have smart religious friends who are also scientists. Some of these people, like Francis Collins, will even assert that modern findings which boggle the mind and shock our intuitions confirm and strengthen their belief in pre-modern religious systems!

My point is not to take a strong stance on science and religion. Rather, it is to say that when you present evidence and declare “I refute you thus!”, they may simply respond “Aha! You have proven my point!”

In relation to the Gibbons article the writing has been on the wall for at least three years, and probably longer. In Towards a new history and geography of human genes informed by ancient DNA Pickrell and Reich contend:

…Implicit in this research is the assumption that the geographic locations of people today are informative about the geographic locations of their ancestors in the distant past. However, it is now clear that long-range migration, admixture and population replacement have been the rule rather than the exception in human history. In light of this, we argue that it is time to critically re-evaluate current views of the peopling of the globe and the importance of natural selection in determining the geographic distribution of phenotypes. We specifically highlight the transformative potential of ancient DNA. By accessing the genetic make-up of populations living at archaeologically-known times and places, ancient DNA makes it possible to directly track migrations and responses to natural selection.

Since this was published in spring of 2014 the evidence has gotten stronger and stronger. That is, the distribution of outcomes is getting more consistent and converging to a high confidence truth.

From this, are we to conclude that white nationalism would decline from marginal to non-existent in the past three years? A review of the empirical data does not seem to support that proposition. Therefore, a naive model that white nationalism is predicated on facts about racial purity may be wrong.

The responses that I have seen (often in the form of comments I don’t publish on this weblog) are denial/rejection, confusion, reinterpretation and vindication (along with standard issue racial insults directed toward me, their colored cognitive inferior). As with the religious case I have a difficult time “putting myself” in the shoes of a racialist of any sort, so I don’t totally understand how they’re getting from A to B, but in their own minds they are.

Let’s reaffirm what’s going on here: white racial consciousness in the United States has exploded on the public scene over the past three years, just as scientists have come to the very strong conclusion that the “white European race” as we understand it is an artifact of the last ~5,000 years or so.**

We need to go back to Hume, and the anthropological understanding of what reason is. Reason is a tool to confirm what you already hold to be true and good. If reason falsifies in some way what you hold to be true and good, that does not mean for most people that reason is where they will stand. Likely there will be some subtle reinterpretation, but magically reason will support their presuppositions. Ask the descendants of the followers of William Miller about falsification.

The fact is that very few people in the world know about David Reich and his research. I know this personally because I’m a voluble evangelist, and many geneticists, even human geneticists, are not aware of the revolution in historical population genetics that ancient DNA has wrought. I do not know any Nazis personally, but I suspect that perhaps their knowledge of human phylogenomics is not at the same level as a typical geneticist.

Of course this sort of logic about logic cuts both ways. Before 2010 I actually assumed, as did most human geneticists who took an interest in these topics, that human populations had long been resident in their region of current occupation for tens of thousands of years. When I read Reconstructing Indian Population History by David Reich I was shocked out of my prior model, because the inferences were so ingenious and plausible, and the updated story of how South Asians came to be actually made a lot of anomalies make a lot more sense. When Lazaridis et al. posted Ancient human genomes suggest three ancestral populations for present-day Europeans on bioRxiv in December of 2013 I was far more surprised, because I had always assumed that the thesis that most European ancestry dated to the Pleistocene in any given region was a robust one. Both the phylogeography from mtDNA and Y pointed to a Pleistocene origin.

But the data were compelling. It’s one thing to make inferences on present day genetic distribution, it’s another to actually genotype ancient individuals (remember, I can reanalyze the data myself, and have done so numerous times). Lazaridis et al. and Priya Moorjani’s Genetic Evidence for Recent Population Mixture in India totally changed my personal life. All of a sudden my wife and I were far closer emotionally and spiritually because we understood that the TMRCA of many segments in our autosomal genome was about 5-fold closer than I had assumed!!!***

Actually, the last sentence is a total fiction. The history which changed how I understood my wife and I to be related in a historical population genetic sense had zero impact on our relationship. That’s because we’re not racists, and race doesn’t really impact our relationship too much (the fact that my parents are Muslim, well, that’s a different issue….). Sorry Everyday Feminism. This is not an uncommon view, though perhaps not as common as we’d assumed of late (actually, as someone who has looked at the fascinating interracial dating research, I pretty much understood that what people say is quite different than what they do; anti-racism is the conformist thing to do, so people will play that tune for a while longer).

Just because the state of the world is one particular way, it does not naturally follow that it should be that way, or that it always will be that way. Most ethical religions saw in slavery an aspect of injustice; rational arguments aside, on some level extension of empathy and sympathy makes its injustice self-evident. But they accepted that it was an aspect of the world that was naturally baked into the structure of reality. The de jure abolition of slavery today does not mean it has truly gone away, but its practice has certainly been curtailed, and much of the cruelty diminished. Theories of human nature or necessities of economic production at the end of the day gave way to changing mores and values. Facts about the world became less persuasive when we decided to let them no longer dictate tolerance of slavery.

All that I say above in relation to how humans use reason does not leave scientists or journalists untouched. All humans have their own goals, and even though they see through the glass darkly, they see in the visions beyond what they want to see. The cultural and theoretical structure of modern science is such that some of these impulses are dampened and human intuitions are channeled in a manner so that theories and models of the world seem to correspond to reality. But I believe this is deeply unnatural, and also deeply fragile. When moving outside of their domain of specialty scientists can be quite blind and irrational. Even when one steps away at a mild remove in terms of domain knowledge this becomes clear, such as when Linus Pauling promoted Vitamin C. And motivated reasoning can creep into the actions of even the greatest of scientists, such as when R. A. Fisher rejected the causal connection between tobacco and cancer.****

I will end on a frank and depressing note: I believe that the era of public reason and fealty to empirical standards in at least official capacities is fading. Social cognition, tribal logic, is on the rise. But we have to remember that in the historical perspective social cognition and tribal logic ruled the day. They are the norm. This age, when we abide by public reason, is the peculiarity in the sea of polemic. Ultimately it may be the fool who fixates on being right or wrong, as opposed to being on the winning team. I hope I’m wrong on this.

Addendum: I have written a form of this post many times.

* The current chancellor of Germany has a Polish paternal grandfather.

** If Middle Easterners are included as white we can extend the time horizon much further back, but that seems to defeat the purpose of white nationalism in the United States….

*** I had assumed that the western affinity in South Asians had diverged from Europeans during the Last Glacial Maximum. It turns out some of it may be as recent as ~4,500 years ago or so.

**** This may have been unconscious as opposed to malicious, as Fisher was keen on tobacco personally.

Beyond “Out of Africa” and multiregionalism: a new synthesis?

For several decades before the present era there have been debates between proponents of the recent African origin of modern humans, and the multiregionalist model. Though molecular methods in a genetic framework have come to the fore of late these were originally paleontological theories, with Chris Stringer and Milford Wolpoff being the two most prominent public exponents of the respective paradigms.

Oftentimes the debate got quite heated. If you read books from the 1990s, when multiregionalism in particular was on the defensive, there were arguments that the recent out of Africa model was more inspirational in regards to our common humanity. As a riposte the multiregionalists asserted that those suggesting recent African origins with total replacement were saying that our species came into being through genocide.

Though some had long warned against this, the dominant perception outside of population genetics was that results such as the “mitochondrial Eve” had given strong support to the recent African origin of modern humans, to the exclusion of other ancestry. 2002’s Dawn of Human Culture took it for granted that the recent African origin of modern humans to the total exclusion of other hominin lineages was established fact.

In 2008 I went to a talk where Svante Paabo presented some recent Neanderthal ancient mtDNA work. It was rather ho-hum, as Paabo showed that the Neanderthal lineages were highly diverged from modern ones, and did not leave any descendants. Though of course most modern human lineages did not leave any descendants from that period, Paabo took this as evidence supporting the proposition that Neanderthals did not contribute to the modern human gene pool.

When his lab reported autosomal Neanderthal admixture in 2010, it was after initial skepticism and shock internally. I know Milford Wolpoff felt vindicated, while Chris Stringer began to emphasize that the recent African origin of modern humanity also was defined by regional assimilation of other lineages. The data have ultimately converged to a position somewhere between the extreme models of total replacement or balanced and symmetrical gene flow.

This is not surprising. Extreme positions are often rhetorically useful and popular when there’s no data. But reality does not usually conform to our prejudices, so ultimately one has to come down at some point.

The data for non-Africans is rather unequivocal. The vast majority (>90%) of the ancestry of non-Africans seems to go back to a small number of common ancestors ~60,000 years ago, perhaps in the range of ~1,000 individuals. These individuals seem to be a node within a phylogenetic tree where all the other branches are occupied by African populations. Between this period and ~15,000 years ago these non-Africans underwent a massive range expansion, until modern humans were present on all continents except Antarctica. Additionally, during the Holocene some of these non-African groups also experienced huge population growth due to intensive agricultural practice.

To give a sense of what I’m getting at, the bottleneck and common ancestry of non-Africans goes back ~60,000 years, but the shared ancestry of Khoisan peoples and non-Khoisan peoples goes back ~150,000-200,000 years. A major lacuna of the current discussion is that often the dynamics which characterize non-Africans are assumed to be applicable to Africans. But they are not.

A 2014 paper illustrates one major difference by inferring effective population size from whole genomes: African populations have not gone through the major bottleneck which is imprinted on the genomes of all non-African populations. The Khoisan peoples, the most famous of which are the Bushmen of the Kalahari, have the largest long term effective populations of any human group. The Yoruba people of Nigeria have a history where they were subject to some population decline, but not to the same extent as non-Africans.
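To see why a bottleneck leaves such a lasting mark, it helps to recall the textbook approximation that long-term effective population size under fluctuating numbers is the harmonic mean of the per-generation sizes, which is dominated by the small values. The sketch below is not the genome-based inference used in the paper. The function name long_term_ne and the two population histories are hypothetical, chosen only to show the arithmetic.

```python
def long_term_ne(sizes):
    """Harmonic mean of per-generation population sizes, the standard
    approximation for long-term effective size under fluctuating numbers."""
    return len(sizes) / sum(1.0 / n for n in sizes)

# Hypothetical histories over 1,000 generations (illustrative numbers only).
constant = [10_000] * 1000                        # no bottleneck
bottlenecked = [10_000] * 900 + [1_000] * 100     # 100 generations at 1,000

ne_constant = long_term_ne(constant)
ne_bottleneck = long_term_ne(bottlenecked)
arithmetic_mean = sum(bottlenecked) / len(bottlenecked)

print(round(ne_constant), round(ne_bottleneck), round(arithmetic_mean))
```

In this toy case a bottleneck lasting a tenth of the history roughly halves the long-term effective size, even though the arithmetic mean census size barely moves. That asymmetry is why a population that avoided a severe bottleneck can retain a much larger long-term effective size.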

What do we take away from this?

One thing is that we have to consider that the assimilationist model which seems to be necessary for non-Africans, also applies to Africans. For years some geneticists have been arguing that some proportion of African ancestry as well is derived from lineages outside of the main line leading up to anatomically modern humans. Without the smoking gun of ancient genomes this will probably remain a speculative hypothesis. I hope that Lee Berger’s recent assertion that they’ve now dated Homo naledi to ~250,000 years before the present may offer up the possibility that ancient DNA will help resolve the question of African archaic admixture (i.e., if naledi is related to the “ghost population”?).

The second dynamic is that the bottleneck-then-range-expansion which is so important in defining the recent prehistory of non-Africans is not as relevant to Africans during the Pleistocene. The very deep split dates being inferred from whole genome analysis of African populations make me wonder if multiregional evolution is actually much more important within Africa in the development of modern humans in the last few hundred thousand years. Basically, the deep split dates may highlight that there was recurrent gene flow over hundreds of thousands of years between different closely related hominin populations in Africa.

Ultimately, it doesn’t seem entirely surprising that the “Out of Africa” model does not quite apply within Africa.

Addendum: Over the past ~5,000 years we have seen the massive expansion of agricultural populations within the continent. The “deep structure” therefore may have been erased to a great extent, with Pygmies, Khoisan, and Hadza being the tip of the iceberg in terms of the genetic variation which had characterized Africa during the Pleistocene.

“Out of Africa” bottleneck is what really matters for mutations


At least in relation to mutational load, if you read a new preprint on bioRxiv, The demographic history and mutational load of African hunter-gatherers and farmers:

The distribution of deleterious genetic variation across human populations is a key issue in evolutionary biology and medical genetics. However, the impact of different modes of subsistence on recent changes in population size, patterns of gene flow, and deleterious mutational load remains to be fully characterized. We addressed this question, by generating 300 high-coverage exome sequences from various populations of rainforest hunter-gatherers and neighboring farmers from the western and eastern parts of the central African equatorial rainforest. We show here, by model-based demographic inference, that the effective population size of African populations remained fairly constant until recent millennia, during which the populations of rainforest hunter-gatherers have experienced a ~75% collapse and those of farmers a mild expansion, accompanied by a marked increase in gene flow between them. Despite these contrasting demographic patterns, African populations display limited differences in the estimated distribution of fitness effects of new nonsynonymous mutations, consistent with purifying selection against deleterious alleles of similar efficiency in the different populations. This situation contrasts with that we detect in Europeans, which are subject to weaker purifying selection than African populations. Furthermore, the per-individual mutation load of rainforest hunter-gatherers was found to be similar to that of farmers, under both additive and recessive modes of inheritance. Together, our results indicate that differences in the subsistence patterns and demographic regimes of African populations have not resulted in large differences in mutational burden, and highlight the role of gene flow in reshaping the distribution of deleterious genetic variation across human populations.

There are two major moving parts in this preprint. First, they used phylogenomic methods to explicitly model population history. Second, they integrated their demographic results in generating and interpreting the distribution of mutations within the exomes of these populations. That is, they used phylogenomics to gain insight into population genomics, as the latter focuses more on the parameters which define variation within a population.

The data they worked with was from the exome, that is, the regions of the genome which are translated into proteins. That’s ~30 million bases. They get really good precision due to high coverage, hitting each site about 70 times. Their sample was about 300 Africans and 100 Europeans, and they got ~500,000 polymorphisms or variants for their trouble.

The populations were labeled by subsistence and provenance. The Europeans were Belgians. For the Africans they had two groups of hunter-gatherer Pygmies, and two groups of Bantu agriculturalists, sampled from western and eastern locations as you see on the map above.

The admixture plots, which separate out individuals into K populations, break out in a way that makes sense. First, Europeans separate, and the eastern agriculturalist populations have a little bit of evidence of European-like ancestry. This is almost certainly Middle Eastern farmer ancestry, which has been found in many East African populations, and those populations which have mixed with them. Then the hunter-gatherers separate from the agriculturalists. This is in line with expectation and earlier research; the hunter-gatherers of Africa seem very different from the agriculturalists, and are actually more closely related to each other than to the agriculturalists in their neighboring regions.

The exception to this pattern is caused by recent gene flow, which is clearly evident above. Due to population size differences it looks like there is more agricultural ancestry in the Pygmies than vice versa. I wish that they had sampled Mbuti Pygmies. I’m told that this group has the least agricultural admixture.

But then they decided to get fancy and explicitly model demographic histories with fastsimcoal2. What does this do? From the website for the software:

While preserving all the simulation flexibility of simcoal2, fastsimcoal is now implemented under a faster continous-time sequential Markovian coalescent approximation, allowing it to efficiently generate genetic diversity for different types of markers along large genomic regions, for both present or ancient samples. It includes a parameter sampler allowing its integration into Bayesian or likelihood parameter estimation procedure.

fastsimcoal can handle very complex evolutionary scenarios including an arbitrary migration matrix between samples, historical events allowing for population resize, population fusion and fission, admixture events, changes in migration matrix, or changes in population growth rates. The time of sampling can be specified independently for each sample, allowing for serial sampling in the same or in different populations.

The models you see that were tested are pretty simple, and they all seem plausible I suppose. Their simulations suggested that the three above scenarios, with alternative branching patterns and various gene flows, were all of equal likelihood. That is, given the models and the data that they had (4-fold synonymous sites which are likely to be neutral) you can’t distinguish which is right.

In all the models hunter-gatherers diverged relatively recently and so did the agriculturalists. Europeans, who are stand-ins for all non-Africans in this scenario, diverged pretty early from the Africans. But how the Africans relate to each other and Europeans is not totally clear. Why? Because of ancient population structure. It is becoming rather obvious now that ~100,000 years ago, and earlier, there were many different modern human lineages which had already diversified. The Khoisan seem to have diverged from other human lineages closer to 200,000 than 100,000 years ago. What this means is that for most of the history of anatomically modern humans population structure existed between distinct lineages. And some of that persists down to today within Africa.

I’ll bullet point some of their inferences from these models (verbatim quotes below):

  1. Our results suggest that the ancestors of the contemporary RHG, AGR and EUR populations diverged between 85 and 140 thousand years ago (kya), from an ancestral population that underwent demographic expansion between 173 and 191 kya
  2. After the initial population splits, the Ne of AGR and RHG (NaAGR and NaRHG) remained within a range extending from 0.55 to 2.2 times the ancestral African Ne (NHUM), whereas EUR (NaEUR) experienced a decrease in Ne by a factor of three to seven.
  3. The ancestors of the wRHG and eRHG populations diverged 18 to 20 kya (TRHG), and underwent a decreased in Ne by a factor of 3.8 to 5.7 for the wRHG (NwRHG) and 7.1 to 11 for the eRHG (NeRHG), regardless of the branching model considered.
  4. The ancestors of the AGR (NaAGR) split into western and eastern populations 6.7 to 11 kya (TAGR), and underwent a mild expansion, by a factor of 2.3 to 3.1 for the wAGR (NwAGR) and 1.2 to 2.2 for the eAGR (NeAGR).
  5. The EUR population experienced a 7.1- to 8.3-fold expansion (NEUR) 12 to 22 kya (TEUR).

No results are perfect. But some of these dates do not make sense. There’s a lot of circumstantial evidence that the ancestors of European populations began to expand over the last 10,000 years. The dates above suggest there was a Pleistocene expansion. Basically you can halve those values, and then you get a reasonable range.

Second, both the agriculturalists sampled here are Bantu speaking, and there’s a good amount of cultural and genetic data for recent shared ancestry of the Bantu over the last 3,000 years. I understand that admixture with a very diverged lineage (e.g., eastern Bantu agriculturalist samples mixing with Nilotic populations, which is how they got some non-African ancestry, as well as local Pygmy groups) can inflate these divergence dates. If that’s the case, they should note that in the text.

We don’t have much historical or archaeological clarity from what I know about divergences between Pygmy groups. This particular group has studied the topic and published on it before, so I’m inclined to trust them more than anyone else. But, the above dates for groups we do know make me a bit more skeptical of a simple divergence around the Last Glacial Maximum.

Then there are the earliest divergences. An 85,000 to 140,000 year interval is huge for when non-Africans split off from Africans. If closer to 140 than 85, then that means that non-African divergence from Africans preserves ancient African diversity. That is, non-Africans descend from an African group that no longer exists (or has not been sampled in this study at least!). I’ve poked around this question, and when you take into account recent gene flow, it is hard to find the specific African group that non-Africans descend from, though there is some consensus that they branched off from the non-Khoisan Africans later than from the Khoisan.

But there is also a lot of archaeological and some ancient DNA evidence now that indicates that the vast majority of non-African ancestry began to expand rapidly around 50-60,000 years ago. This is tens of thousands of years after the lowest value given above. Therefore, again we have to make recourse to a long period of separation before the expansion. This is not implausible on the face of it, but we could do something else: just assume there’s an artifact with their methods and the inferred date of divergence is too old. That would solve many of the issues.

I really don’t know if the above quibbles have any ramifications for the site frequency spectrum of deleterious mutations. My own hunch is that no, they don’t impact the qualitative results at all.
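For readers who haven’t met the term, the site frequency spectrum is just a tally of how many variable sites carry the derived allele once, twice, and so on in the sample. A minimal sketch, assuming you already have a derived-allele count per site (the counts below are invented, and this is not the authors’ pipeline):

```python
from collections import Counter


def site_frequency_spectrum(derived_counts, n_chromosomes):
    """Unfolded SFS: entry k-1 is the number of sites with k derived copies.

    Sites with 0 or n_chromosomes derived copies are not polymorphic
    in the sample, so they are left out.
    """
    tally = Counter(derived_counts)
    return [tally.get(k, 0) for k in range(1, n_chromosomes)]


# Hypothetical derived-allele counts at six sites in a sample of 6 chromosomes.
example_sfs = site_frequency_spectrum([1, 1, 2, 3, 1, 5], n_chromosomes=6)
print(example_sfs)
```

An excess in the low-count bins is the usual signature of purifying selection or recent growth, which is why demography and selection have to be modeled together.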

Figure 3 clearly shows that Europeans are enriched for weakly and moderately deleterious mutations (the last category produces weird results, and I wish they’d talked about this more, but they observe that strongly deleterious mutations have issues getting detected). Ne is just the effective population size and s is the selection coefficient (bigger number, stronger selection).
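The reason the product of Ne and s matters, rather than s alone, is that selection only beats drift when that product is comfortably above 1. Here is a small sketch of the idea. The bin edges are a common convention that I am assuming for illustration, not values taken from the preprint, and the Ne and s values are hypothetical.

```python
def scaled_selection_bin(ne, s):
    """Classify a mutation by the magnitude of Ne * s (bin edges are illustrative)."""
    scaled = abs(ne * s)
    if scaled < 1:
        return "nearly neutral"
    if scaled < 10:
        return "weakly deleterious"
    if scaled < 100:
        return "moderately deleterious"
    return "strongly deleterious"


# The same mutation (s = 0.0005) in a large and a bottlenecked population.
large_pop = scaled_selection_bin(10_000, 5e-4)
small_pop = scaled_selection_bin(1_500, 5e-4)
print(large_pop, "|", small_pop)
```

The same allele that selection can see in the larger population drifts as if it were nearly neutral in the smaller one, which is the intuition behind weaker purifying selection after a bottleneck.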

Why are the middle two values enriched? Presumably it’s the non-African bottleneck. This is where another non-African population would have been a nice check to make sure that it was the “Out of Africa” bottleneck…but it’s probably asking a bit much to sequence more individuals to 70x coverage.

The lack of difference between the African populations is an indication that recent demography is not shaping the distribution much. Additionally, they note that gene flow between the African groups probably increased diversity in some ways, so that as long as a group is connected with other populations it will probably be rescued (note that none of these in their data were particularly inbred, judging by runs of homozygosity).

Finally, they found that the number of homozygous deleterious mutations is higher in their model results for Europeans than for the African groups. This is not surprising, and what one expects. But, they found that this is likely a function of continuous gene flow between the African groups. Without gene flow homozygosity would have been much higher. This gets back to the fact that gene flow is a powerful homogenizing tool, and the lack of gene flow has to be pretty extreme for divergence to occur.

Which brings us back to the “Out of Africa” event. The next ten years are going to see a lot of investigation of African phylogenomics and population genomics. Basically, the relationships, and selection pressures. It is totally implausible that Bantu groups in Kenya and Tanzania did not absorb local non-Nilotic populations. We’ll figure that out. Additionally, selection pressures are probably different between different groups. We’ll know more about that. But, ancient DNA will probably give us some understanding of why non-Africans went through such a massive demographic sieve. We know in broad sketches. But most people want to fill in the details.

Citation: The demographic history and mutational load of African hunter-gatherers and farmers, Marie Lopez, Athanasios Kousathanas, Helene Quach, Christine Harmant, Patrick Mouguiama-Daouda, Jean-Marie Hombert, Alain Froment, George H Perry, Luis B Barreiro, Paul Verdu, Etienne Patin, Lluis Quintana-Murci, doi: https://doi.org/10.1101/131219

Mouse fidelity comes down to the genes

While birds tend to be at least nominally monogamous, this is not the case with mammals. This strikes some people as strange because humans seem to be monogamous, at least socially, and often we take ourselves to be typically mammalian. But of course we’re not. Like many primates we’re visual creatures, rather than relying on smell and hearing. Obviously we’re also bipedal, which is not typical for mammals. And, our sociality scales up to massive agglomerations of individuals.

How monogamous we are is up for debate. Desmond Morris, who is well known to many from his roles in television documentaries, has been a major promoter of the idea that humans are monogamous, with a focus on pair-bonds. In contrast, other researchers have highlighted our polygamous tendencies. In The Mating Mind Geoffrey Miller argues for polygamy, and suggests that pair-bonds in a pre-modern environment were often temporary, rather than lifetime (Miller is now writing a book on polyamory).

The fact that in many societies high status males seem to engage in polygamy, despite monogamy being more common, is one phenomenon which confounds attempts to quickly generalize about the disposition of our species. What is preferred may not always be what is practiced, and the external social adherence to norms may be quite violated in private.

Adducing behavior is simpler in many other organisms, because their range of behavior is more delimited. When it comes to studying mating patterns in mammals voles have long been of interest as a model. There are vole species which are monogamous, and others which are not. Comparing the diverged lineages could presumably give insight as to the evolutionary genetic pathways relevant to the differences.

But North American deer mice, Peromyscus, may turn out to be an even better bet: there are two lineages which exhibit different mating patterns and which are phylogenetically close enough that they can interbreed. That is crucial, because it allows one to generate crosses and see how the characteristics distribute themselves across subsequent generations. Basically, it allows for genetic analysis.

And that’s what a new paper in Nature does, The genetic basis of parental care evolution in monogamous mice. In figure 3 you can see the distribution of behaviors in parental generations, F1 hybrids, and the F2, which is a cross of F1 individuals. The widespread distribution of F2 individuals is likely indicative of a polygenic architecture of the traits. Additionally, they found that some traits are correlated with each other in the F2 generation (probably due to pleiotropy, the same gene having multiple effects), while others were independent.
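To give a feel for why a broad, continuous F2 distribution points toward many loci, here is a toy calculation. It assumes an idealized cross that is certainly simpler than the real mice: the two parental lineages are fixed for alternative alleles at every locus, loci are unlinked, and effects are equal and additive. Under those assumptions the number of “parent A” alleles in an F2 individual is binomial.

```python
from math import comb


def f2_score_distribution(n_loci):
    """P(k 'parent A' alleles) for k = 0..2*n_loci in an idealized F2.

    Assumes parents fixed for alternative alleles, unlinked loci,
    and equal additive effects (a deliberate simplification).
    """
    total_alleles = 2 * n_loci
    return [comb(total_alleles, k) / 2 ** total_alleles
            for k in range(total_alleles + 1)]


one_locus = f2_score_distribution(1)   # the classic 1:2:1 classes
six_loci = f2_score_distribution(6)    # 13 classes, parental extremes are rare
print(one_locus)
print(six_loci[0], six_loci[6])
```

With one locus a quarter of the F2 look exactly like each parental type; with six loci only 1 in 4,096 does, and the rest are smeared across the intermediate classes.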

With the F2 generation they ran a genetic analysis which looked for associations between traits and regions of the genome. They found 12 quantitative trait loci (QTLs), basically zones of the genome associated with variation on one or more of the six traits. From this analysis they immediately realized there was sexual dimorphism in terms of the genetic architecture; the same locus might have a different effect in the opposite sex. This is evolutionarily interesting.

Because the QTLs are rather large in terms of physical genomic units the authors looked to see which were plausible candidates in terms of function. One of their hits was vasopressin, which should be familiar to many from vole work, as well as some human studies. Though the QTL work as well as their pup-switching experiment (which I did not describe) is persuasive, the fact that a gene you’d expect shows up as a candidate really makes it an open and shut case.

The extent of the variation explained by any given QTL seems modest. In the extended figures you can see it’s mostly in the 1 to 5 percent range. In Carl Zimmer’s excellent write up he ends:

But Dr. Bendesky cautioned that the vasopressin gene would probably turn out to be just one of many that influence oldfield mice. Though it is strongly linked to parental behavior, the vasopressin gene accounts for 6.7 percent of the variation in nest building among males, and only 2.9 percent among females.

The genetic landscape of human parenting will turn out to be even more rugged, Dr. Bendesky predicted.

“You cannot do a 23andMe test and find out if your partner is going to be a good father,” he said.

Sort of. The genetic architecture above is polygenic…but not incredibly diffuse. The proportion of variation explained by the largest effect allele is more than for height, and far more than for education. If human research follows up on this, I wouldn’t be surprised if you could develop a polygenic risk score.
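For what it’s worth, a polygenic score is nothing more exotic than a weighted sum of allele counts. A minimal sketch follows; the locus names and effect sizes are entirely hypothetical and are not estimates from the mouse paper or from any human study.

```python
def polygenic_score(genotype, effects):
    """Weighted sum of allele counts (0, 1 or 2) over the scored loci.

    Loci missing from the genotype are treated as carrying 0 copies.
    """
    return sum(effect * genotype.get(locus, 0)
               for locus, effect in effects.items())


# Hypothetical per-allele effects on some parental-care trait.
effects = {"qtl_a": 0.5, "qtl_b": -0.25, "qtl_c": 0.125}
individual = {"qtl_a": 2, "qtl_b": 1, "qtl_c": 0}

score = polygenic_score(individual, effects)
print(score)
```

How useful such a score is depends on how much of the trait variance the scored loci actually explain, which is exactly the open question for humans.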

But I don’t have a good intuition on how much variation in humans there really is for these sorts of traits that are heritable. I assume some. But I don’t know how much. And how much of the variance in behavior might be explained by human QTLs? Humans don’t lick or build nests, or retrieve pups. Also, as one knows from Genetics and Analysis of Quantitative Traits sexually dimorphic traits take a long time to evolve. These are two deer mice species. Within humans there may not have been enough time for this sort of heritable complexity of behavior to evolve.

There are a lot of philosophical issues here about translating to a human context.

Nevertheless, this research shows that ingenious animal models can powerfully elucidate the biological basis of behavior.

Citation: The genetic basis of parental care evolution in monogamous mice. Nature (2017) doi:10.1038/nature22074

Ancestry inference won’t tell you things you don’t care about (but could)

The figure above is from Noah Rosenberg’s relatively famous paper, Clines, Clusters, and the Effect of Study Design on the Inference of Human Population Structure. The context of the publication is that it was one of the first prominent attempts to use genome-wide data on a variety of human populations (specifically, from the HGDP data set) and attempt model-based clustering. There are many details of the model, but the one that will jump out at you here is that the parameter K defines the number of putative ancestral populations you are hypothesizing. Individuals then shake out as proportions of each of the K elements. Remember, this is a model in a computer, and you select the parameters and the data. The output is not “wrong,” it’s just the output based on how you set up the program and the data you input yourself.

These sorts of computational frameworks are innocent, and may give strange results if you want to engage in mischief. For example, let’s say that you put in 200 individuals, of whom 95 are Chinese, 95 are Swedish, and 10 are Nigerian. From a variety of disciplines we know that non-Africans form a monophyletic clade in relation to Africans (to a first approximation). In plain English, all non-Africans descend from a group of people who diverged from Africans more than 50,000 years ago. That means if you imagine two populations, the first division should be between Africans and non-Africans, to reflect this historical demography. But if you skew the sample size, as the program looks for the maximal amount of variation in the data set it may decide that dividing between Chinese and Swedes as the two ancestral populations is the most likely model given the data.

This is not wrong as such. As the number of Africans in the data converges on zero, obviously the dividing line is between Swedes and Chinese. If you overload particular populations within the data, you may marginalize the variation you’re trying to explore, and the history you’re trying to uncover.
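Here is a toy illustration of the sample-size effect. To be clear about what it is not: it does not run the model-based clustering from the paper. It swaps in a much cruder criterion (a k-means style within-cluster sum of squares computed from population means) and uses made-up squared genetic distances in which Africans are farther from both non-African groups than those groups are from each other.

```python
from itertools import combinations


def within_ss(cluster, sizes, dist2):
    """Between-population part of a cluster's within-cluster sum of squares."""
    n = sum(sizes[p] for p in cluster)
    return sum(sizes[p] * sizes[q] * dist2[frozenset((p, q))]
               for p, q in combinations(cluster, 2)) / n


def isolated_population(sizes, dist2):
    """With three populations every K=2 split isolates one of them.

    Returns the population that the lowest-cost split sets apart.
    """
    best_cost, best_lone = None, None
    for lone in sorted(sizes):
        rest = [p for p in sorted(sizes) if p != lone]
        cost = within_ss([lone], sizes, dist2) + within_ss(rest, sizes, dist2)
        if best_cost is None or cost < best_cost:
            best_cost, best_lone = cost, lone
    return best_lone


# Hypothetical squared distances (illustrative, not estimates).
dist2 = {
    frozenset(("Chinese", "Swedish")): 0.11,
    frozenset(("Chinese", "Nigerian")): 0.16,
    frozenset(("Swedish", "Nigerian")): 0.16,
}
skewed = {"Chinese": 95, "Swedish": 95, "Nigerian": 10}
balanced = {"Chinese": 67, "Swedish": 67, "Nigerian": 66}

print(isolated_population(skewed, dist2))    # splits Chinese from the rest
print(isolated_population(balanced, dist2))  # splits Nigerians from the rest
```

With the skewed sample the cheapest split lumps the ten Nigerians in with one of the two big groups, because separating the Chinese from the Swedes removes far more total variation. The Chinese and Swedish splits tie exactly here, and the code simply reports the first. With a balanced sample the African versus non-African split wins, as the phylogeny says it should.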

I’ve written all of this before. But I’m writing this in context of the earlier post, Ancestry Inference Is Precise And Accurate(Ish). In that post I showed that consumers drive genomics firms to provide results where the grain of resolution and inference varies a lot as a function of space. That is, there is a demand that Northern Europe be divided very finely, while vast swaths of non-European continents are combined into one broad cluster.

Less than 5% Ancient North Eurasian

Another aspect though is time. These model-based admixture frameworks can implicitly traverse time as one moves up and down the values of K. It is always important to explain to people that the K’s may not correspond to real populations which all existed at the same time. Rather, they’re just explanatory instruments which illustrate phylogenetic distance between individuals. In a well-balanced data set for humans K = 2 usually separates Africans from non-Africans, and K = 3 then separates West Eurasians from other populations. Going across K’s it is easy to imagine that one is traversing successive bifurcations.

A racially mixed man, 15% ANE, 30% CHG, 25% WHG, 30% EEF

But today we know it’s more complicated than that. Three years ago Pickrell et al. published Toward a new history and geography of human genes informed by ancient DNA, where they report the result that more powerful methods and data imply most human populations are relatively recent admixtures between extremely diverged lineages. What this means is that the origin of groups like Europeans and South Asians is very much like the origin of the mixed populations of the New World. Since then this insight has become only more powerful, as ancient DNA has shed light on massive population turnovers over the last 5,000 to 10,000 years.

These are to some extent revolutionary ideas, not well known even among the science press (which is too busy doing real journalism, i.e. the art of insinuation rather than illumination). As I indicated earlier direct-to-consumer genomics use national identities in their cluster labels because these are comprehensible to people. Similarly, they can’t very well tell Northern Europeans that they are an outcome of a successive series of admixtures between diverged lineages from the late Pleistocene down to the Bronze Age. Though Northern Europeans, like South Asians, Middle Easterners, Amerindians, and likely Sub-Saharan Africans and East Asians, are complex mixes between disparate branches of humanity, today we view them as indivisible units of understanding, to make sense of the patterns we see around us.

Personal genomics firms therefore give results which allow for historically comprehensible results. As a trivial example, the genomic data makes it rather clear that Ashkenazi Jews emerged in the last few thousand years via a process of admixture between antique Near Eastern Jews, and the peoples of Western Europe. After the initial admixture this group became an endogamous population, so that most Ashkenazi Jews share many common ancestors in the recent past with other Ashkenazi Jews. This is ideal for the clustering programs above, as Ashkenazi Jews almost always fit onto a particular K with ease. Assuming there are enough Ashkenazi Jews in your data set you will always be able to find the “Jewish cluster” as you increase the value of K.

But the selection of a K which satisfies this comprehensibility criterion is a matter of convenience, not necessity. Most people are vaguely aware that Jews emerged as a people at a particular point in history. In the case of Ashkenazi Jews they emerged rather late in history. At certain K‘s Ashkenazi Jews exhibit mixed ancestral profiles, placing them between Europeans and Middle Eastern peoples. What this reflects is the earlier history of the ancestors of Ashkenazi Jews. But for most personal genomics companies this earlier history is not something that they want to address, because it doesn’t fit into the narrative that their particular consumers want to hear. People want to know if they are part-Jewish, not that they are part antique Middle Eastern and Southwest European.

Perplexment of course is not just for non-scientists. When Joe Pickrell’s TreeMix paper came out five years ago there was a strange signal of gene flow between Northern Europeans and Native Americans. There was no obvious explanation at the time…but now we know what was going on.

It turns out that Northern Europeans and Native Americans share common ancestry from Pleistocene Siberians. The relationship between Europeans and Native Americans has long been hinted at in results from other methods, but it took ancient DNA for us to conceptualize a model which would explain the patterns we were seeing.

An American with recent Amerindian (and probably African) ancestry

But in the context of the United States shared ancestry between Europeans and Native Americans is not particularly illuminating. Rather, what people want to know is if they exhibit signs of recent gene flow between these groups. In particular, many white Americans are curious if they have Native American heritage. They do not want to hear an explanation which involves the fusion of an East Asian population with Siberians that occurred 15,000 to 20,000 years ago, and then the emergence of Northern Europeans through successive amalgamations between Pleistocene, Neolithic, and Bronze Age Eurasians.

In some of the inference methods Northern Europeans, often those with Finnic ancestry or relationship to Finnic groups, may exhibit signs of ancestry from the “Native American” cluster. But this is almost always a function of circumpolar gene flow, as well as the aforementioned Pleistocene admixtures. One way to avoid this would be to simply not report proportions which are below 0.5%. That way, people with higher “Native American” fractions would receive the results, and the proportions would be high enough that it was almost certainly indicative of recent admixture, which is what people care about.
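The reporting rule is trivial to implement. A sketch, assuming the results arrive as a mapping from cluster label to ancestry fraction (the function name, the labels, and the example values are my own inventions, not any company’s actual output):

```python
def reportable_ancestry(proportions, min_fraction=0.005):
    """Keep only clusters at or above the reporting threshold (default 0.5%)."""
    return {label: fraction
            for label, fraction in proportions.items()
            if fraction >= min_fraction}


# Hypothetical results for two customers.
finnish_like = {"Northern European": 0.996, "Native American": 0.004}
recently_admixed = {"Northern European": 0.97, "Native American": 0.03}

print(reportable_ancestry(finnish_like))
print(reportable_ancestry(recently_admixed))
```

The first customer’s trace signal, which probably reflects deep shared ancestry, is simply not shown, while the second customer’s larger fraction survives the filter.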

Why am I telling you this? Because many journalists who report on direct-to-consumer genomics don’t understand the science well enough to grasp what’s being sold to the consumer (frankly, most biologists don’t know this field well either, even if they might use a barplot here and there).

And, the reality is that consumers have very specific parameters of what they want in terms of geographic and temporal information. They don’t want to be told true but trivial facts (e.g., they are Northern European). But neither do they want to know things which are so novel and at far remove from their interpretative frameworks that they simply can’t digest them (e.g., that Northern Europeans are a recent population construction which threads together very distinct strands with divergent deep time histories). In the parlance of cognitive anthropology consumers want their infotainment the way they want their religion, minimally counterintuitive. Consume some surprise. But not too much.