Quantitative genomics, adaptation, and cognitive phenotypes

The human brain uses about 20% of the calories you take in per day. It’s a large and metabolically expensive organ. Because of this, there are lots of evolutionary models which focus on the brain. In Catching Fire: How Cooking Made Us Human Richard Wrangham suggests that our need for calories to feed our brain is one reason we started to use fire to pre-digest our food. In The Mating Mind Geoffrey Miller seems to suggest that all the things our big complex brain does allow for a signaling of mutational load. And in Grooming, Gossip, and the Evolution of Language Robin Dunbar suggests that it’s social complexity which is driving our encephalization.

These are all theories. Interesting hypotheses and models. But how do we test them? A new preprint on bioRxiv is useful because it shows how cutting-edge methods from evolutionary genomics can be used to explore questions relating to cognitive neuroscience and psychopathology, Polygenic selection underlies evolution of human brain structure and behavioral traits:

…Leveraging publicly available data of unprecedented sample size, we studied twenty-five traits (i.e., ten neuropsychiatric disorders, three personality traits, total intracranial volume, seven subcortical brain structure volume traits, and four complex traits without neuropsychiatric associations) for evidence of several different signatures of selection over a range of evolutionary time scales. Consistent with the largely polygenic architecture of neuropsychiatric traits, we found no enrichment of trait-associated single-nucleotide polymorphisms (SNPs) in regions of the genome that underwent classical selective sweeps (i.e., events which would have driven selected alleles to near fixation). However, we discovered that SNPs associated with some, but not all, behaviors and brain structure volumes are enriched in genomic regions under selection since divergence from Neanderthals ~600,000 years ago, and show further evidence for signatures of ancient and recent polygenic adaptation. Individual subcortical brain structure volumes demonstrate genome-wide evidence in support of a mosaic theory of brain evolution while total intracranial volume and height appear to share evolutionary constraints consistent with concerted evolution…our results suggest that alleles associated with neuropsychiatric, behavioral, and brain volume phenotypes have experienced both ancient and recent polygenic adaptation in human evolution, acting through neurodevelopmental and immune-mediated pathways.

The preprint takes a kitchen-sink approach, throwing a lot of methods for detecting selection at the phenotypes of interest. Also, there is always the issue of cryptic population structure generating false positive associations, but they try to address it in the preprint. I am somewhat confused by this passage though:

Paleobiological evidence indicates that the size of the human skull has expanded massively over the last 200,000 years, likely mirroring increases in brain size.

From what I know human cranial sizes leveled off in growth ~200,000 years ago, peaked ~30,000 years ago, and have declined ever since then. That being said, they find signatures of selection around genes associated with ‘intracranial volume.’

There are loads of results using different methods in the paper, but I was curious to note that schizophrenia had hits for ancient and recent adaptation. A friend who is a psychologist pointed out to me that when you look within families “unaffected” siblings of schizophrenics often exhibit deviation from the norm in various ways too; so even if they are not impacted by the disease, they are somewhere along a spectrum of ‘wild type’ to schizophrenic. In any case in this paper they found recent selection for alleles ‘protective’ of schizophrenia.

There are lots of theories one could spin out of that singular result. But I’ll just leave you with the fact that when you have a quantitative trait with lots of heritable variation it seems unlikely it’s been subject to a long period of unidirectional selection. Various forms of balancing selection seem to be at work here, and we’re only in the early stages of understanding what’s going on. Genuine comprehension will require:

– attention to population genetic theory
– large genomic data sets from a wide array of populations
– novel methods developed by population genomicists
– and functional insights which neuroscientists can bring to the table
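The intuition that sustained unidirectional selection should use up heritable variation can be sketched with a toy single-locus model. This is only an illustration under loud assumptions: one locus, genic selection with a constant coefficient s, no new mutation, no drift, and hypothetical parameter values. Real polygenic traits are far messier, and none of the function names below come from the preprint.

```python
def next_frequency(p: float, s: float) -> float:
    """One generation of genic directional selection: the favored allele
    has relative fitness 1 + s, the alternative allele has fitness 1."""
    return p * (1.0 + s) / (1.0 + p * s)


def heterozygosity_trajectory(p0: float, s: float, generations: int) -> list[float]:
    """Expected heterozygosity 2p(1-p) at one locus, generation by generation."""
    p = p0
    trajectory = [2.0 * p * (1.0 - p)]
    for _ in range(generations):
        p = next_frequency(p, s)
        trajectory.append(2.0 * p * (1.0 - p))
    return trajectory


if __name__ == "__main__":
    # Hypothetical values: starting frequency 0.5, selection coefficient 1%.
    traj = heterozygosity_trajectory(p0=0.5, s=0.01, generations=2000)
    for gen in (0, 500, 1000, 2000):
        print(f"generation {gen:>4}: heterozygosity = {traj[gen]:.6f}")
```

In this toy setting the variation at the locus drains away as the favored allele heads toward fixation, which is why abundant standing variation hints at something other than long-running directional selection.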

18,000 years BC (the film)


Alpha, set 20,000 years ago in Europe, was apparently originally titled “Solutrean.” The change is probably for the best. It will come out next spring. I really hope that this movie is good and does well. It isn’t often that you have something which takes place during the Last Glacial Maximum.

The plot seems to reflect what you might read in Pat Shipman’s The Invaders, but it’s about 20,000 years too late for her model to work. One of the major criticisms of the idea that dogs and modern humans operated as a team is that it seems way too early. But of late there have been suggestions that the date is earlier than we’d previously thought in relation to when dogs as we understand them arose: Ancient European dog genomes reveal continuity since the Early Neolithic. Here’s the relevant section: “By calibrating the mutation rate using our oldest dog, we narrow the timing of dog domestication to 20,000–40,000 years ago.”

Please note though that the divergence of the dog lineage from the ancestors of modern wolves is a distinct question and process from domestication as we understand it. Though it seems likely these events didn’t occur too far apart in time.
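To see why calibrating the mutation rate matters so much for these dates, here is a minimal sketch of the simplest molecular-clock relationship, d = 2μT, which ignores ancestral polymorphism and rate variation. The divergence and mutation-rate values are hypothetical round numbers chosen for illustration. They are not taken from the dog paper, and the function name is my own.

```python
def divergence_time_years(pairwise_divergence: float, mu_per_site_per_year: float) -> float:
    """Simple molecular-clock estimate: two lineages each accumulate
    mutations at rate mu, so divergence d = 2 * mu * T and T = d / (2 * mu)."""
    return pairwise_divergence / (2.0 * mu_per_site_per_year)


if __name__ == "__main__":
    d = 4e-5  # hypothetical per-site divergence between two lineages
    for mu in (1e-9, 5e-10):  # hypothetical per-site, per-year mutation rates
        t = divergence_time_years(d, mu)
        print(f"mu = {mu:.1e} per site per year -> T = {t:,.0f} years")
```

Halving the assumed mutation rate doubles the inferred date, which is why an ancient sample of known age is so valuable as a calibration point.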

Desperately seeking the secret of FOXP2


Since the early 2000s FOXP2 has shown up again and again in the press and scientific literature. Dubbed the “language gene,” it exhibits evidence of accelerated evolution in the human lineage after it split from other apes. Additionally, a homolog of the same gene shows evidence of evolutionary change distinctive to songbirds and whales. Obviously this locus is involved in vocalization. Mice with a mutated FOXP2 can even sing.

It isn’t difficult to connect the dots here. From 2002:

Dr. Paabo says this date fits with the theory advanced by Dr. Klein to account for the sudden appearance of novel behaviors 50,000 years ago, including art, ornamentation and long distance trade. Human remains from this period are physically indistinguishable from those of 100,000 years ago, leading Dr. Klein to propose that some genetically based cognitive change must have prompted the new behaviors. The only change of sufficient magnitude, in his view, is acquisition of language.

Klein’s thesis, advanced in The Dawn of Human Culture, is that a singular genetic change resulted in some sort of developmental cascade that allowed for the emergence of syntactically rich recursive language. And from language comes culture, and from culture comes world domination.

It was a clean and powerful hegemony while it lasted, but genomic and archaeological findings of the last decade have put such an elegant and simple model under a harsh light. With genomic technology even FOXP2 turns out to be much more complex and rich than the earlier reports had suggested. Neanderthals exhibited all the same mutations as modern humans to make them distinctive from chimpanzees. In other words, the changes on FOXP2 by and large predate the emergence of modern humanity, and go back closer to the root of the hominin lineage (Neanderthals and modern humans diverged ~600,000 years ago).

But FOXP2 keeps coming back. Why? It is an important gene. But another issue is that researchers still perceive in it the key to the holy grail of finding out what makes us distinctively human.

A new preprint (which is somewhat peculiarly formatted), takes another look at FOXP2, Human-specific changes in two functional enhancers of FOXP2:

Two functional enhancers of FOXP2, a gene important for language development and evolution, exhibit several human-specific changes compared to extinct hominins that are located within the binding site for different transcription factors. Specifically, Neanderthals and Denisovans bear the ancestral allele in one position within the binding site for SMARCC1, involved in brain development and vitamin D metabolism. This change might have resulted in a different pattern of FOXP2 expression in our species compared to extinct hominins.

The big picture is now the authors are focusing on gene expression levels as what might allow for modern human traits to be distinctive. Basically the DNA does not magically turn into protein. Biological machinery has to transcribe the sequence, and to transcribe it it has to bind to a particular region, a transcription factor binding site.

Most of the analysis involves comparing genomes of Neanderthals, Denisovans, and the human reference. I would be curious if they looked across lots of whole genomes to check if there was polymorphism in modern human populations. If modern humans with Neanderthal and Denisovan mutations had perfectly fine speech, that would be interesting.

Also, they spend a lot of time talking about how other genes interact and express with FOXP2, and all the other functions that are implicated. This is important, because of course selection may have nothing to do with speech, though perhaps speech changes are a side effect? Remember too that the Altai Neanderthal had some modern human admixture, and that one of the introgressed regions turns out to be FOXP2.

This sort of comparative genomic style research is interesting and suggestive. But we need more population wide analysis.

But the authors do allude to other work using genetic engineering where cell lines did show radically different gene expression based on the mutation above. I do believe that CRISPR/Cas9 technology is cheap enough and going to be widespread enough that someone’s going to play around with splicing “human” variants into primate models. Meanwhile, bioethicists will be furrowing their brows about sequencing humans….

The search for Eden opens up new vistas

The end of Eden

A particular conception of the “Out of Africa” model of human origins died in this decade. This model hooked into preexisting narratives about “Adam” and “Eve”, utilizing Y and mitochondrial DNA lineages passed down through direct male and female lines respectively. Its most extreme manifestation could be exemplified by Richard Klein’s ideas in the early 2000s outlined in his book The Dawn of Human Culture.

For Klein the chasm between Homo sapiens sapiens, humans, and other hominins was vast. A physical anthropologist who surveyed with skill the rapid expansion and proliferation of modern human cultures over the past ~50,000 years, Klein relied on a particular evolutionary model to explain how this occurred. He posited that humankind emerged in East Africa as a punctuated speciation event, triggered by a mutation which allowed for the development of fully elaborated recursive language.

The difference between our own lineage and our relatives in this framework was huge. To not put too fine a point on it, Neanderthals and other archaic humans were animals. We, Homo sapiens sapiens, were humans qua humans.

Though Klein was a paleoanthropologist, he gained great support from a school of molecular evolution which arose in the 1970s and 1980s under Allan Wilson. Wilson’s initial fame arose because he utilized a “molecular clock” analysis of primates to contend that the divergence of our human lineage from great apes was much more recent than paleontologists had believed. Eventually new fossil finds confirmed the molecular phylogeny. After this event Richard Leakey has stated paleoanthropologists were reluctant to challenge molecular results.

Wilson later focused on recent human origins, utilizing mitochondrial DNA, which is passed down directly through the maternal lineage. In this way his group found that African mtDNA lineages were very diverse, and that non-African lineages were nested within the broader tree of African lineages.

The conclusion from this finding was that modern humans arose in Africa and spread to other parts of the world. This conclusion in general has been confirmed.

But over the years more and more evidence has accumulated that the story is more complicated than the original narrative that all modern humans descend from a small band of East Africans who populated the whole world ~50,000 years ago.

Dissenters from Eden

There were always geneticists who were skeptical of the neat Out of Africa with total replacement model. In Origins Reconsidered Richard Leakey recounts a conference in 1992 where he was pigeon-holed by geneticists who thought there was no reason to accept without dispute the mitochondrial Eve narrative. Over the years talking to some older geneticists I can say that Leakey was reporting a real undercurrent of irritation with the confidence that Allan Wilson’s group and their fellow travelers projected in relation to their model. Nordborg 1998 On the probability of Neanderthal ancestry reflects some of the technical objections to inferring too much from one locus when it came to the possibility of other components of ancestry.

When genome-wide analyses in the middle 2000s became feasible, a visible counter-culture within genetics argued that total replacement was not supported by the data. In 2006 Wall & Hammer published Archaic admixture in the human genome. They concluded that “Recent work suggests that Neanderthals and an as yet unidentified archaic African population contributed to at least 5% of the modern European and West African gene pools, respectively.” They were not that far off with European populations. As far as Africa goes, that is a question that will be explored in detail in the next few years.

That analysis though has only 44 citations. I have had debates on Twitter about how exotic and marginal these ideas were. In general it is safe to say that they were not exotic and marginal in the community of human evolutionary population geneticists. But that’s not a large set. Both John Hawks and Milford Wolpoff have indicated a lot of marginalization for models outside of the narrow window of Out of Africa with total replacement. From everything I’ve heard about the run up to the 2010 publication of the Neanderthal genome many of the principal researchers, including Svante Paabo, were totally surprised by the evidence of admixture into modern lineages. Wolpoff even emailed me after I reviewed the paper to suggest that it felt so good to come out of the wilderness and have some of his views accepted.

Anagenesis and punctuation?

But were Wolpoff’s views accepted? The revised model actually kept much of the Out of Africa framework in place, except it added the wrinkle of assimilation of some archaic lineages. The dominant signal in the non-African genomes seems to have come from an African lineage which left ~50,000 years ago.

The classical multi-regional model that Wolpoff was associated with, whereby modern humans evolved across the whole world from local archaic lineages, but maintained species cohesion through gene flow, was not supported. Rather, the archaic admixture of Neanderthals and Denisovans into Oceanians pointed to local continuities, which was a broader position of multi-regionalism. But this is not speciation without branching, anagenesis.

Nevertheless, there was another aspect of Out of Africa with replacement that needed revision. Though not explicitly outlined in many framings, one aspect implicit is that the dynamics that Africa and Eurasia were subject to during the emergence of modern humans were the same.

But that doesn’t seem to be the case. The ancestors of all non-Africans went through a major population bottleneck. On the order of ~1,000 individuals (this is a very large bottleneck actually, and I’ve seen numbers as low as 100, though that seems on the small side; calculating effective population size ~50,000 years ago can be tricky). The same is not true of African populations. Though many of them show signals of population declines during the Pleistocene, the extreme uniform bottleneck which characterizes all non-Africans, from Iberia to Australia to Patagonia is just not evident in Sub-Saharan African populations.
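To get a feel for why a bottleneck of this size leaves a lasting mark, here is a minimal sketch using the textbook expectation that heterozygosity decays by a factor of (1 − 1/(2N)) per generation in an idealized population of constant effective size N. The effective sizes and generation counts are hypothetical inputs for illustration, not estimates from any paper discussed here, and the function name is my own.

```python
def heterozygosity_retained(n_e: float, generations: int) -> float:
    """Expected fraction of the initial heterozygosity remaining after
    `generations` at constant diploid effective size `n_e`
    (idealized Wright-Fisher assumption, no mutation or migration)."""
    return (1.0 - 1.0 / (2.0 * n_e)) ** generations


if __name__ == "__main__":
    # Hypothetical effective sizes and durations, for comparison only.
    for n_e in (100, 1_000, 10_000):
        for gens in (100, 1_000):
            frac = heterozygosity_retained(n_e, gens)
            print(f"Ne = {n_e:>6}, generations = {gens:>5}: {frac:.3f} retained")
```

The smaller the effective size and the longer the squeeze, the more variation is lost, which is the kind of uniform signal seen in non-Africans but not in Sub-Saharan African populations.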

In other words, the Out of Africa event did not apply within Africa. Here’s an excerpt of an email I sent to Carl Zimmer in December of 2010 (he was updating the second edition of The Tangled Bank):

…it may be that there was no rapid antique population expansion in Africa which was analogous to [the] out of Africa migration. IOW, non-Africans are just a branch of Northeast Africans, and the Bushmen and other groups were already differentiated by that point. So you could theoretically remove the arrows within Africa! I think this is a subtle and tendentious point, so probably best to leave that as it is. But remember how deep the basal branching of the Bushmen was in the Denisova paper? It WAY predates any possible out of Africa migration by multiples.

Which brings me to the current year and the present time. The recent paper which utilized an ancient genome from South Africa to push back the date of the diversification of African lineages to ~250,000 years before the present was not entirely surprising to me. Every time I talked to people who had access to African whole genomes their dates kept getting pushed back further and further into the past.

And of course we now have fossil confirmation that human populations which seemed to be anatomically modern (or close) were already present ~300,000 years ago in Morocco. The New York Times has a good overview of the work, Oldest Fossils of Homo Sapiens Found in Morocco, Altering History of Our Species. I read the papers and the commentaries and don’t have much to add, nor do they add much for non-specialists in my opinion (since we can’t really judge the morphology too well, nor do we have a detailed understanding of the fossil record). In one of the Nature letters the authors conclude in the abstract that “The emergence of our species and of the Middle Stone Age appear to be close in time, and these data suggest a larger scale, potentially pan-African, origin for both.”

This suggests to me anagenesis. Has multi-regionalism come back, but now within Africa?

Parameters, not paradigms

John Hawks has put in his two cents, and it’s always worth paying attention. My major take home is that we don’t know a lot even though we know more, and we need to be careful here. The genome blogger from the 2000s who has been relatively quiet over the last five years, Dienekes, resurfaced, dismissing the idea of pan-African anagenesis and asserting an Out of North Africa viewpoint. He’s been talking about this model since 2011, so there’s nothing new here. In January of 2011 he asserted that “Africa was home to a structured population.” That is what we are seeing today.

The publication of the Nature letters triggered a lot of discussion on Twitter. When I was involved it mostly consisted of Aylwyn Scally and Pontus Skoglund, with John Hawks, Chris Stringer, and others jumping into the stream. Here are some points which are of note:

1) Most people now suspect that large scale population structure within Africa over the past few hundred thousand years is a major story.

2) But there is an assumption that collapsing of that structure through gene flow was not reciprocal. That is, some populations likely expanded at the expense of others. The arguments are whether the assimilation of the secondary groups is on the order of a few percent, as seems to be the case in Eurasia, or a much higher fraction.

3) Because the phylogenetic distance between lineages within Africa is likely smaller than between Neanderthals and modern humans, and because of the likely similar census sizes and technological toolkits, I contended that it is not unreasonable to guess that as much as 20% of the ancestry of a daughter population of an expanding group could be from the local substrate. There was no great objection to this guess.

4) Remember, even simple mtDNA phylogenies as far back as the 1980s, as well as paleontological analyses of fossils, indicated an Out of Africa movement into Eurasia. This was such a strong signal in the data that it was clear with even relatively little to go on. The situation for within Africa is not analogous, suggesting to me that an extreme model of replacement or gene flow across persistent demes in local regions is not tenable.

5) Ultimately, the issue will resolve on parameters of admixture and the nature of demographic expansion in the details. Instead of a tree, we will conceive of this as a graph, a trellis with lengths of different thicknesses.

6) Hawks brought up the fact that one reason classic multi-regionalism did not work is that the Fisher wave of expansion of favored genes is slower than the migration of humans. When I suggested that the genetics of gene flow in plants, which do resemble classical multi-regionalism, did not seem to be a good analogy for humans, Skoglund contrasted the sessile nature of plants with mobile humans. I did point out though that after favored alleles moved through migration into a population, there was often in situ selection. He agreed.

7) A key issue that both Hawks and Dienekes emphasize is that we don’t know the role that extremely diverged lineages from our own ancestors play in our story. That is, were there many modern human populations across Africa, interspersed with other human species? Or was there one modern human population that mixed with other species? We don’t really know the details of all of this.

8) I expressed skepticism of the idea of “behavioral modernity.” My reason for being skeptical is that the origin of modern humans is not as neat as we like to think, and the origin of “behavioral modernity” is also not as neat as we like to think. When the consensus was that humans emerged as a punctuated de novo event, ensouled by the Lord God on High 50,000 years ago (or, coming down from the skies as in Battlestar Galactica or in Larry Niven’s Ringworld), the idea of behavioral modernity kind of made sense. But it’s all more confused now (in any case, in Clive Finlayson’s The Humans Who Went Extinct he seems to be arguing that much of modern culture was invented by Gravettians, well after the Out of Africa event).
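Point 6 above, on the Fisher wave, can be made concrete with a back-of-the-envelope sketch. In the classic one-dimensional Fisher model, an advantageous allele spreads at an asymptotic speed of roughly σ√(2s) per generation, where σ is the root-mean-square parent-offspring dispersal distance and s is the selective advantage. Every input below is hypothetical, chosen only to show orders of magnitude, and the function names are my own.

```python
import math


def fisher_wave_speed(sigma_km: float, s: float) -> float:
    """Asymptotic speed (km per generation) of a Fisher wave for an allele
    with selective advantage s, where sigma_km is the root-mean-square
    parent-offspring dispersal distance along one axis."""
    return sigma_km * math.sqrt(2.0 * s)


def years_to_cross(distance_km: float, speed_km_per_gen: float, gen_time_years: float) -> float:
    """Years for a front moving at constant speed to cover distance_km."""
    return distance_km / speed_km_per_gen * gen_time_years


if __name__ == "__main__":
    # Hypothetical inputs: 30 km dispersal, 1% advantage, 25-year generations.
    speed = fisher_wave_speed(sigma_km=30.0, s=0.01)
    print(f"wave speed: {speed:.2f} km per generation")
    print(f"time to cross 10,000 km: {years_to_cross(10_000, speed, 25.0):,.0f} years")
```

Under assumptions like these a favored allele takes tens of thousands of years to diffuse across a continent, which is slow next to the pace at which people themselves can move.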

The consensus seems to be that rather than focusing on a set of human universals as behaviorally modern, we should look at the demographic patterns of the past to infer when our own lineage came into its distinctive being. Those of you who have read me for a while know this is already congenial to me. Luke Jostins’ plot of the encephalization of all hominin lineages over the past million years suggested to me long ago that our own lineage is not so special. Rather, something like us was probably inevitable so long as an asteroid didn’t wipe out large mammals once Homo erectus spread across the globe. Humanity is a destiny, not a lineage.

A reticulation + pulse expansion of modern human genetic variation

In response to a little bit of fatigue at the constant stream of ancient DNA, John Hawks digs the knife in a bit deeper. There’s more. Since Hawks is co-author with Lee Berger of Almost Human: The Astonishing Tale of Homo Naledi and the Discovery That Changed Our Human Story, I’m hoping for something related to naledi. But barring that, there is still lots to come down the pipeline (Alexander Kim has a Twitter account worth following to keep up on this line of research).

But I thought I’d enter something else into the record about what we’re starting to understand about the origins of modern human genetic variation.

  1. The “Out of Africa” movement is a real thing. That is, ~50,000 years ago a population seems to have diverged from a broader pan-African group, and replaced archaic hominins across Eurasia (with some admixture at low levels), and then pushed the boundaries of our genus to Oceania and the New World. This does not mean that there were not earlier “Out of Africas.” Just that the dominant signal of variation is due to the pulse that swept out ~50,000 years ago.
  2. Within Africa the story is different and more complicated. The expansion of anatomically modern humans out of Africa occurred relatively rapidly from a small founder population (within an order of magnitude of ~1,000 individuals seems correct). The archaeology and genetics are in pretty good alignment. But within Africa it looks like the lineages which led to modern humans are much deeper, and preserve structure that may be hundreds of thousands of years old. Just as we see admixture events giving rise to new lineages outside of Africa over the last 50,000 years, the same dynamic probably applied to within Africa far earlier.

One way to think about it is that the old “Africa Eve” model is very useful for the 10,000 year period around 50,000 years ago. And, it applies to non-Africans.

The story within Africa though may be more like the old multi-regionalist model, though with stronger biases of gene flow between populations, so that at one time one lineage may be preponderant. Over the last 10,000 years the expansion of certain populations within Africa though (in particular Bantus) has collapsed a lot of the deep structure, and now Africa resembles Eurasia much more (with some Eurasian back-migration too).

Note: I am talking about “modern human genetic variation” because I am starting to think talking about “modern humans” obscures far more than it illuminates.

The human phylogenetic graph gets curiouser and curiouser


While most of my readers were sleeping, Lee Berger in South Africa was giving a press conference on new Homo naledi related results. Three papers are in eLife. It’s open access, so read them yourself. The major result is that the fossils have been dated to 236,000 to 335,000 years ago.

If you aren’t a paleontologist, start with Homo naledi and Pleistocene hominin evolution in subequatorial Africa:

New discoveries and dating of fossil remains from the Rising Star cave system, Cradle of Humankind, South Africa, have strong implications for our understanding of Pleistocene human evolution in Africa. Direct dating of Homo naledi fossils from the Dinaledi Chamber (Berger et al., 2015) shows that they were deposited between about 236 ka and 335 ka (Dirks et al., 2017), placing H. naledi in the later Middle Pleistocene. Hawks and colleagues (Hawks et al., 2017) report the discovery of a second chamber within the Rising Star system (Dirks et al., 2015) that contains H. naledi remains. Previously, only large-brained modern humans or their close relatives had been demonstrated to exist at this late time in Africa, but the fossil evidence for any hominins in subequatorial Africa was very sparse. It is now evident that a diversity of hominin lineages existed in this region, with some divergent lineages contributing DNA to living humans and at least H. naledi representing a survivor from the earliest stages of diversification within Homo. The existence of a diverse array of hominins in subequatorial comports with our present knowledge of diversity across other savanna-adapted species, as well as with palaeoclimate and paleoenvironmental data. H. naledi casts the fossil and archaeological records into a new light, as we cannot exclude that this lineage was responsible for the production of Acheulean or Middle Stone Age tool industries.

In relation to the DNA part, we don’t have ancient genomes except for the Ethiopian Holocene one. They couldn’t get DNA out of naledi. But we do have inferences made from modern populations. Here is the most recent paper cited, Model-based analyses of whole-genome data reveal a complex evolutionary history involving archaic introgression in Central African Pygmies.

“Out of Africa” bottleneck is what really matters for mutations


At least in relation to mutational load, if you read a new preprint on bioRxiv, The demographic history and mutational load of African hunter-gatherers and farmers:

The distribution of deleterious genetic variation across human populations is a key issue in evolutionary biology and medical genetics. However, the impact of different modes of subsistence on recent changes in population size, patterns of gene flow, and deleterious mutational load remains to be fully characterized. We addressed this question, by generating 300 high-coverage exome sequences from various populations of rainforest hunter-gatherers and neighboring farmers from the western and eastern parts of the central African equatorial rainforest. We show here, by model-based demographic inference, that the effective population size of African populations remained fairly constant until recent millennia, during which the populations of rainforest hunter-gatherers have experienced a ~75% collapse and those of farmers a mild expansion, accompanied by a marked increase in gene flow between them. Despite these contrasting demographic patterns, African populations display limited differences in the estimated distribution of fitness effects of new nonsynonymous mutations, consistent with purifying selection against deleterious alleles of similar efficiency in the different populations. This situation contrasts with that we detect in Europeans, which are subject to weaker purifying selection than African populations. Furthermore, the per-individual mutation load of rainforest hunter-gatherers was found to be similar to that of farmers, under both additive and recessive modes of inheritance. Together, our results indicate that differences in the subsistence patterns and demographic regimes of African populations have not resulted in large differences in mutational burden, and highlight the role of gene flow in reshaping the distribution of deleterious genetic variation across human populations.

There are two major moving parts in this preprint. First, they used phylogenomic methods to explicitly model population history. Second, they integrated their demographic results in generating and interpreting the distribution of mutations within the exomes of these populations. That is, they combined phylogenomics with population genomics to gain insight, as the latter focuses more on the parameters which define variation within a population.

The data they worked with was from the exome, the regions of the genome which translate into genes. That’s ~30 million bases. They get really good precision due to high coverage, hitting each site about 70 times. Their sample was about 300 Africans and 100 Europeans, and they got ~500,000 polymorphisms or variants for their trouble.
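The round numbers in that paragraph can be turned into a quick sanity check on the scale of the data set. This is plain arithmetic on the approximate figures quoted in this post (not exact values from the preprint), and the variable names are my own.

```python
# Approximate figures as quoted in the post, not exact values from the preprint.
EXOME_BASES = 30_000_000   # "~30 million bases"
MEAN_COVERAGE = 70         # each site read "about 70 times"
N_AFRICANS = 300
N_EUROPEANS = 100
N_VARIANTS = 500_000       # "~500,000 polymorphisms"

n_samples = N_AFRICANS + N_EUROPEANS
bases_per_variant = EXOME_BASES / N_VARIANTS
sequenced_bases_per_sample = EXOME_BASES * MEAN_COVERAGE
sequenced_bases_total = sequenced_bases_per_sample * n_samples

print(f"samples: {n_samples}")
print(f"roughly one variant every {bases_per_variant:.0f} exome bases")
print(f"sequenced bases per sample: {sequenced_bases_per_sample:.2e}")
print(f"sequenced bases in total:   {sequenced_bases_total:.2e}")
```

On these figures a variable site turns up roughly once every 60 bases of exome across the whole sample.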

The populations were labeled by subsistence and provenance. The Europeans were Belgians. For the Africans they had two groups of hunter-gatherer Pygmies, and two groups of Bantu agriculturalists, sampled from western and eastern locations as you see on the map above.

The admixture plots, which separate out individuals into K numbers of populations, break out in a way that makes sense. First, Europeans separate, and the eastern agriculturalist populations have a little bit of evidence of European-like ancestry. This is almost certainly Middle Eastern farmer ancestry, which has been found in many East African populations, and those populations which have mixed with them. Then the hunter-gatherers separate from the agriculturalists. This is in line with expectation and earlier research; the hunter-gatherers of Africa seem very different from the agriculturalists, and are actually more closely related to each other than to the agriculturalists in their neighboring regions.

The exception to this pattern is caused by recent gene flow, which is clearly evident above. Due to population size differences it looks like there is more agricultural ancestry in the Pygmies than vice versa. I wish that they had sampled Mbuti Pygmies. I’m told that this group has the least agricultural admixture.

But then they decided to get fancy and explicitly model demographic histories with fastsimcoal2. What does this do? From the website for the software:

While preserving all the simulation flexibility of simcoal2, fastsimcoal is now implemented under a faster continous-time sequential Markovian coalescent approximation, allowing it to efficiently generate genetic diversity for different types of markers along large genomic regions, for both present or ancient samples. It includes a parameter sampler allowing its integration into Bayesian or likelihood parameter estimation procedure.

fastsimcoal can handle very complex evolutionary scenarios including an arbitrary migration matrix between samples, historical events allowing for population resize, population fusion and fission, admixture events, changes in migration matrix, or changes in population growth rates. The time of sampling can be specified independently for each sample, allowing for serial sampling in the same or in different populations.

The models you see that were tested are pretty simple, and they all seem plausible I suppose. Their simulations suggested that the three above scenarios, with alternative branching patterns and various gene flows, were all of equal likelihood. That is, given the models and the data that they had (4-fold synonymous sites which are likely to be neutral) you can’t distinguish which is right.

In all the models hunter-gatherers diverged relatively recently and so did the agriculturalists. Europeans, who are stand-ins for all non-Africans in this scenario, diverged pretty early from the Africans. But how the Africans relate to each other and Europeans is not totally clear. Why? Because of ancient population structure. It is becoming rather obvious now that ~100,000 years ago, and earlier, there were many different modern human lineages which had already diversified. The Khoisan seem to have diverged from other human lineages closer to 200,000 than 100,000 years ago. What this means is that for most of the history of anatomically modern humans population structure existed between distinct lineages. And some of that persists down to today within Africa.

I’ll bullet point some of their inferences from these models (verbatim quotes below):

  1. Our results suggest that the ancestors of the contemporary RHG, AGR and EUR populations diverged between 85 and 140 thousand years ago (kya), from an ancestral population that underwent demographic expansion between 173 and 191 kya
  2. After the initial population splits, the Ne of AGR and RHG (NaAGR and NaRHG) remained within a range extending from 0.55 to 2.2 times the ancestral African Ne (NHUM), whereas EUR (NaEUR) experienced a decrease in Ne by a factor of three to seven.
  3. The ancestors of the wRHG and eRHG populations diverged 18 to 20 kya (TRHG), and underwent a decreased in Ne by a factor of 3.8 to 5.7 for the wRHG (NwRHG) and 7.1 to 11 for the eRHG (NeRHG), regardless of the branching model considered.
  4. The ancestors of the AGR (NaAGR) split into western and eastern populations 6.7 to 11 kya (TAGR), and underwent a mild expansion, by a factor of 2.3 to 3.1 for the wAGR (NwAGR) and 1.2 to 2.2 for the eAGR (NeAGR).
  5. The EUR population experienced a 7.1- to 8.3-fold expansion (NEUR) 12 to 22 kya (TEUR).

No results are perfect. But some of these dates do not make sense. There’s a lot of circumstantial evidence that the ancestors of European populations began to expand over the last 10,000 years. The dates above suggest there was a Pleistocene expansion. Basically, if you halve those values (12 to 22 kya becomes roughly 6 to 11 kya), you get a reasonable range.

Second, both the agriculturalists sampled here are Bantu speaking, and there’s a good amount of cultural and genetic data for recent shared ancestry of the Bantu over the last 3,000 years. I understand that admixture with a very diverged lineage (e.g., eastern Bantu agriculturalist samples mixing with Nilotic populations, which is how they got some non-African ancestry, as well as local Pygmy groups) can inflate these divergence dates. If that’s the case, they should note that in the text.

We don’t have much historical or archaeological clarity from what I know about divergences between Pygmy groups. This particular group has studied the topic and published on it before, so I’m inclined to trust them more than anyone else. But, the above dates for groups we do know make me a bit more skeptical of a simple divergence around the Last Glacial Maximum.

Then there are the earliest divergences. An 85,000 to 140,000 year interval is huge for when non-Africans split off from Africans. If it is closer to 140,000 than 85,000, then that means that non-African divergence from Africans preserves ancient African diversity. That is, non-Africans descend from an African group that no longer exists (or has not been sampled in this study at least!). I’ve poked around this question, and when you take into account recent gene flow, it is hard to find the specific African group that non-Africans descend from, though there is some consensus that they branched off from the non-Khoisan Africans later than from the Khoisan.

But there is also a lot of archaeological evidence, and now some ancient DNA, indicating that the vast majority of non-African ancestry began to expand rapidly around 50-60,000 years ago. This is tens of thousands of years after the lowest value given above. Therefore, again we have to have recourse to a long period of separation before the expansion. This is not implausible on the face of it, but we could do something else: just assume there’s an artifact with their methods and the inferred date of divergence is too old. That would solve many of the issues.

I really don’t know if the above quibbles have any ramifications for the site frequency spectrum of deleterious mutations. My own hunch is that no, they don’t impact the qualitative results at all.

Figure 3 clearly shows that Europeans are enriched for weakly and moderately deleterious mutations (the last category produces weird results, and I wish they’d talked about this more, but they observe that strongly deleterious mutations have issues getting detected). Ne is just the effective population size and s is the selection coefficient (bigger number, stronger selection).

Why are the middle two values enriched? Presumably it’s the non-African bottleneck. This is where another non-African population would have been a nice check to make sure that it was the “Out of Africa” bottleneck…but it’s probably asking a bit much to sequence more individuals to 70x coverage.
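One way to see why a bottleneck would matter most for the weak and moderate categories is that the efficacy of selection scales with the product of Ne and s. Below is a minimal sketch using Kimura’s diffusion approximation for the fixation probability of a new mutation. This is a textbook result, not the method used in the preprint, and fixation is only a rough proxy for what shows up in a site frequency spectrum. The population sizes and the value of s are hypothetical, picked to mimic a roughly five-fold reduction, which sits inside the three- to seven-fold range quoted above.

```python
import math


def fixation_probability(ne: float, s: float) -> float:
    """Kimura's diffusion approximation for a new mutation in a diploid population.

    Assumes additive (genic) selection with coefficient s, a starting
    frequency of 1 / (2 * ne), and census size equal to effective size.
    """
    if s == 0:
        return 1.0 / (2.0 * ne)  # neutral limit
    # (1 - exp(-2s)) / (1 - exp(-4 * ne * s)), written with expm1 for accuracy
    return math.expm1(-2.0 * s) / math.expm1(-4.0 * ne * s)


def relative_to_neutral(ne: float, s: float) -> float:
    """Fixation probability divided by the neutral expectation 1 / (2 * ne)."""
    return fixation_probability(ne, s) * 2.0 * ne


S = -1e-4  # hypothetical weakly deleterious mutation
for ne in (10_000, 2_000):  # hypothetical sizes: before and after a 5-fold reduction
    print(f"Ne={ne:>6}  Ne*|s|={ne * abs(S):.1f}  "
          f"relative to neutral={relative_to_neutral(ne, S):.3f}")
```

With these made-up numbers the same mutation behaves much more like a neutral one in the smaller population, which is the intuition behind the enrichment in Europeans: drift overwhelms weak selection once Ne drops.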

The lack of difference between the African populations is an indication that recent demography is not shaping the distribution much. Additionally, they note that gene flow between the African groups probably increased diversity in some ways, so that as long as a group is connected with other populations it will probably be rescued (note that none of the groups in their data were particularly inbred, judging by runs of homozygosity).

Finally, they found that the number of homozygous deleterious mutations is higher in their model results for Europeans than for the African groups. This is not surprising, and what one expects. But they found that this is likely a function of continuous gene flow between the African groups: without gene flow, homozygosity in those groups would have been much higher. This gets back to the fact that gene flow is a powerful homogenizing tool, and the lack of gene flow has to be pretty extreme for divergence to occur.

Which brings us back to the “Out of Africa” event. The next ten years are going to see a lot of investigation of African phylogenomics and population genomics. Basically, the relationships, and selection pressures. It is totally implausible that Bantu groups in Kenya and Tanzania did not absorb local non-Nilotic populations. We’ll figure that out. Additionally, selection pressures are probably different between different groups. We’ll know more about that. But, ancient DNA will probably give us some understanding of why non-Africans went through such a massive demographic sieve. We know in broad sketches. But most people want to fill in the details.

Citation: The demographic history and mutational load of African hunter-gatherers and farmers, Marie Lopez, Athanasios Kousathanas, Helene Quach, Christine Harmant, Patrick Mouguiama-Daouda, Jean-Marie Hombert, Alain Froment, George H Perry, Luis B Barreiro, Paul Verdu, Etienne Patin, Lluis Quintana-Murci, doi: https://doi.org/10.1101/131219

The logic of human destiny was inevitable 1 million years ago

Robert Wright’s best book, Nonzero: The Logic of Human Destiny, was published nearly 20 years ago. At the time I was moderately skeptical of his thesis. It was too teleological for my tastes. And, it does pander to a bias in human psychology whereby we look to find meaning in the universe.

But this is 2017, and I have somewhat different views.

In the year 2000 I broadly accepted the thesis outlined a few years later in The Dawn of Human Culture. That our species, our humanity, evolved and emerged in rapid sequence, likely due to biological changes of a radical kind, ~50,000 years ago. This is the thesis of the “great leap forward” of behavioral modernity.

Today I have come closer to models proposed by Michael Tomasello in The Cultural Origins of Human Cognition and Terrence Deacon in The Symbolic Species: The Co-evolution of Language and the Brain. Rather than a punctuated event, an instant in geological time, humanity as we understand it emerged through a gradual process, driven by general dynamics and evolutionary feedback loops.

The conceit at the heart of Robert J. Sawyer’s often overly preachy Neanderthal Parallax series, that if our own lineage went extinct but theirs did not they would have created a technological civilization, is I think in the main correct. It may not be entirely coincidental that hyper-drive cultural flexibility evolved in African modern humans first. There may have been sufficient biological differences to make this likely. But I believe that if African modern humans were removed from the picture Neanderthals would have “caught up” and been positioned to begin the trajectory we find ourselves in during the current Holocene inter-glacial.

Luke Jostins’ figure showing across-the-board encephalization

The data indicate that all human lineages were subject to increased encephalization. That process trailed off ~200,000 years ago, but it illustrates the general evolutionary pressures, ratchets, or evolutionary “logic”, that applied to all of them. Overall there were some general trends in the hominin lineage that began to characterize us about a million years ago. We pushed into new territory. Our rate of cultural change seems to have gradually increased across our whole range.

One of the major holy grails I see now and then in human evolutionary genetics is to find “the gene that made us human.” The scramble is definitely on now that more and more whole genome sequences from ancient hominins are coming online. But I don’t think any such gene will ever be found. There isn’t “a gene,” but a broad set of genes which were gradually selected upon in the process of making us human.

In the lingo, it wasn’t just a hard sweep from a de novo mutation. It was as much, or even more, soft sweeps from standing variation.

How Tibetans can function at high altitudes


About seven years ago I wrote two posts about how Tibetans manage to function at very high altitudes. And it’s not just physiological functioning, that is, fitness straightforwardly understood. High altitudes can cause a sharp reduction in reproductive fitness because women cannot carry pregnancies to term. In other words, high altitude is a very strong selection pressure. You adapt, or you die off.

For me there have been two things of note since those original papers came out. First, one of those loci seems to have been introgressed from a Denisovan genetic background. I want to be careful here, because the initial admixture event may not have been into the Tibetans proper, but into earlier hunter-gatherers who descend from Out of Africa groups, and who were assimilated into the Tibetans as they expanded 5-10,000 years ago. Second, it turns out that dogs have also been targeted for selection on EPAS1 (the locus with the “Denisovan” introgression) for altitude adaptation.

This shows that in mammals at least there are a few genes which show up again and again. The fact that EPAS1 and EGLN1 were hits on relatively small sample sizes also reinforces their powerful effect. When the EPAS1 results initially came out they were highlighted as the strongest and fastest instance of natural selection in human evolutionary history. One can quibble about whether this was literally true, but that it was a powerful selective event no one could deny.

A new paper in PNAS, Genetic signatures of high-altitude adaptation in Tibetans, revisits the earlier results with a much larger sample size (the research group is in China) comparing Han Chinese and Tibetans. They confirm the earlier results, but, they also find other loci which seem likely targets of selection in Tibetans. Below is the list:

SNP          A1  A2  Freq. of A1  Freq. of A1  P value   FST    Nearest gene
                     (Tibetan)    (EAS, Han)
rs1801133    A   G   0.238        0.333        6.30E-09  0.021  MTHFR
rs71673426   C   T   0.102        0.013        1.50E-08  0.1    RAP1A
rs78720557   A   T   0.498        0.201        4.70E-08  0.191  NEK7
rs78561501   A   G   0.599        0.135        6.10E-15  0.414  EGLN1
rs116611511  G   A   0.447        0.003        3.60E-19  0.57   EPAS1
rs2584462    G   A   0.211        0.549        3.90E-09  0.203  ADH7
rs4498258    T   A   0.586        0.287        1.70E-08  0.171  FGF10
rs9275281    G   A   0.095        0.365        1.10E-10  0.162  HLA-DQB1
rs139129572  GA  G   0.316        0.449        5.80E-09  0.036  HCAR2

P value indicates the P value from the MLMA-LOCO analysis. FST is the FST value between Tibetans and EASs. Nearest gene indicates the nearest annotated gene to the top differentiated SNP at each locus except EGLN1, which is known to be associated with high-altitude adaptation; rs139129572 is an insertion SNP with two alleles: GA and G. A1, allele 1; A2, allele 2.

Many of these genes are familiar. Observe the allele frequency differences between the Tibetans and other East Asians (mostly Han). The sample sizes are on the order of thousands, and the SNP-chip had nearly 300,000 markers. What they found was that the between-population Fst of Han to Tibetan was ~0.01. So only 1% of the SNP variance in their data was partitioned between the two groups. These alleles are huge outliers.
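To make “huge outliers” concrete, here is a rough sketch that computes a per-SNP Fst directly from the two allele frequency columns in the table. It uses the large-sample form of Hudson’s estimator, which is my choice for illustration and not necessarily what the authors used, so the values need not match the FST column in the paper. The function and variable names are my own.

```python
def hudson_fst(p1: float, p2: float) -> float:
    """Large-sample form of Hudson's FST for one biallelic SNP.

    p1 and p2 are the frequencies of the same allele in two populations.
    The finite-sample correction terms are dropped, an assumption that is
    only reasonable when both samples are large.
    """
    denominator = p1 * (1 - p2) + p2 * (1 - p1)
    if denominator == 0:
        return 0.0  # same allele fixed in both populations
    return (p1 - p2) ** 2 / denominator


# (Tibetan, EAS) frequencies of allele A1, copied from the table.
frequencies = {
    "MTHFR": (0.238, 0.333), "RAP1A": (0.102, 0.013), "NEK7": (0.498, 0.201),
    "EGLN1": (0.599, 0.135), "EPAS1": (0.447, 0.003), "ADH7": (0.211, 0.549),
    "FGF10": (0.586, 0.287), "HLA-DQB1": (0.095, 0.365), "HCAR2": (0.316, 0.449),
}
GENOME_WIDE_FST = 0.01  # the approximate Han-Tibetan value quoted in the text

fst_by_gene = {gene: hudson_fst(p1, p2) for gene, (p1, p2) in frequencies.items()}
for gene, fst in sorted(fst_by_gene.items(), key=lambda item: -item[1]):
    print(f"{gene:<9} Fst={fst:.3f}  ({fst / GENOME_WIDE_FST:.0f}x genome-wide)")
```

Even with this crude estimator every locus in the table sits above the genome-wide value, and EPAS1 and EGLN1 are well over an order of magnitude above it.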

The authors used some sophisticated statistical methods to correct for exigencies of population structure, drift, admixture, etc., to converge upon these hits, but even through inspection the deviation on these alleles is clear. And as they note in the paper it isn’t clear all of these genes are selected simply for hypoxia adaptation. MTHFR, which is quite often a signal of selection, may have something to do with folate metabolism (higher altitudes have more UV). ADH7 is part of a set of genes which always seem to be under selection, and HLA is never a surprise.

Rather than get caught up in the details it is important to note here that expansion into novel habitats results in lots of changes in populations, so that two groups can diverge quite fast on functional characteristics. The PCA makes it clear that Tibetans and Han have very little West Eurasian admixture, and the Fst based analysis puts their divergence on the order of 5,000 years before the present. The authors admit honestly that this is probably a lower bound value, but I also think it is quite likely that Tibetans, and probably Han too, are compound populations, and a simple bifurcation model from a common ancestral population is probably shaving away too many realistic edges. In plainer language, there has been gene flow between Han and Tibetans probably <5,000 years ago, and Tibetans themselves probably assimilated more deeply diverged populations in the highlands as they expanded as agriculturalists. An estimate of a single divergence quite often forces a complex history into too simple a model.
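To see how an Fst of ~0.01 can turn into a date, and why gene flow makes that date a lower bound, here is a minimal sketch using the textbook pure-drift relationship Fst = 1 - exp(-T / (2Ne)), with T in generations. I am not claiming this is the exact calculation in the paper. The effective population size and generation time below are hypothetical round numbers of my own choosing.

```python
import math


def divergence_time_generations(fst: float, ne: float) -> float:
    """Split time in generations under pure drift with no migration.

    Inverts FST = 1 - exp(-T / (2 * ne)). Any gene flow after the split
    pulls FST down, so the result is best read as a lower bound.
    """
    if not 0 <= fst < 1:
        raise ValueError("fst must be in [0, 1)")
    return -2.0 * ne * math.log1p(-fst)


NE = 10_000             # hypothetical effective population size
YEARS_PER_GENERATION = 25  # hypothetical generation time

years = divergence_time_generations(0.01, NE) * YEARS_PER_GENERATION
print(f"Fst=0.01 -> about {years:,.0f} years")  # roughly 5,000 with these inputs

# If gene flow had pulled Fst down from a higher value, the real split is older.
older = divergence_time_generations(0.02, NE) * YEARS_PER_GENERATION
print(f"Fst=0.02 -> about {older:,.0f} years")
```

The point of the second line is the sensitivity: a model with a single clean split reads any homogenizing gene flow as a more recent divergence, which is the edge being shaved away.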

The take home: understanding population history is probably important to get a better sense of the dynamics of adaptation.

Citation: Jian Yang, Zi-Bing Jin, Jie Chen, Xiu-Feng Huang, Xiao-Man Li, Yuan-Bo Liang, Jian-Yang Mao, Xin Chen, Zhili Zheng, Andrew Bakshi, Dong-Dong Zheng, Mei-Qin Zheng, Naomi R. Wray, Peter M. Visscher, Fan Lu, and Jia Qu, Genetic signatures of high-altitude adaptation in Tibetans, PNAS 2017; published ahead of print April 3, 2017, doi:10.1073/pnas.1617042114