The non-European ancestry of Afrikaners

A few years ago I got some South African genotypes. Some of the individuals were clearly African. A few mapped perfectly upon Northern Europeans. But many of the samples were consistently European yet shifted toward non-European populations.

Based on the history of the assimilation of slaves into the European population of the Cape Colony in the 18th century, my assumption is that these individuals are Afrikaners.

Recently I realized that Brenna Henn had released some more Khoisan samples, so I decided to look at this question of admixture again. The two Khoisan populations are the Nama and the Khomani. I removed those with lots of Bantu and European admixture and combined them together into one population.

Running unsupervised Admixture shows how distinct the South African whites are.

The average Utah white in this sample (this population is a mix of British, German, and Scandinavian in ancestry) is 99% European modal cluster, and 1% South Asian. The average for the white South Africans in this data set is 94% European modal cluster. The residual is 1% East Asian (Dai modal), 1% Khoisan, 1% non-Khoisan African, and 2% South Asian.

I ran Treemix a bunch of times, and every single plot came out like this when I ran it for three migrations:


The gene flow from the Utah whites to the Gujaratis is simply an artifact of the fact that the Gujarati sample is mixed caste, and some of the Brahmins or Lohanas have more “Ancestral North Indian.” The gene flow from the Europeans to the Khoisan is probably real, or might be due to pastoralist admixture via East Africans. The last migration arrow goes from the African populations to the South African whites, with a shift toward the Khoisan.

I also ran a three population test where A is the outgroup, and B and C are a clade. A significantly negative f3-statistic indicates admixture in population A. The negative values are listed below:

A             B             C          f3            f3-error     Z-score
Gujrati       Dai           UtahWhite  -0.00121718   0.000140141  -8.68539
South_Africa  EsanNigeria   UtahWhite  -0.00127718   0.000147982  -8.63059
South_Africa  Khoisan_SA    UtahWhite  -0.0012928    0.000151416  -8.53802
Gujrati       South_Africa  Dai        -0.000778791  0.000155656  -5.00329
South_Africa  Dai           UtahWhite  -0.000541974  0.000133262  -4.06699
South_Africa  UtahWhite     Gujrati    -0.000103581  8.46193e-05  -1.22408

This aligns well with the Admixture results. Afrikaners have both African and Asian ancestry.
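For readers who want to see why a negative f3 points to admixture, here is a minimal Python sketch. It is not the program that produced the table above. It assumes the naive estimator, the average over SNPs of (a - b) * (a - c) for the allele frequencies of populations A, B, and C. Real tools also correct for finite sample size and get the standard error from a block jackknife, both of which are left out here. The function names and toy frequencies are made up for illustration.

```python
def f3(target, source_1, source_2):
    """Naive f3(A; B, C): mean over SNPs of (a - b) * (a - c).

    No finite-sample bias correction is applied (an assumption of this sketch).
    """
    products = [(a - b) * (a - c) for a, b, c in zip(target, source_1, source_2)]
    return sum(products) / len(products)


def z_score(estimate, std_error):
    """Z-score as reported in the table: the estimate divided by its standard error."""
    return estimate / std_error


# Toy allele frequencies at four SNPs (illustrative, not real data).
pop_b = [0.1, 0.8, 0.3, 0.6]
pop_c = [0.7, 0.2, 0.9, 0.1]

# A population built as a 50/50 mixture of B and C sits between them at every SNP,
# so (a - b) and (a - c) have opposite signs and the statistic comes out negative.
mixed = [(b + c) / 2 for b, c in zip(pop_b, pop_c)]

f3_mixed = f3(mixed, pop_b, pop_c)    # negative: consistent with admixture
f3_unmixed = f3(pop_b, mixed, pop_c)  # positive: B is not a mixture of the other two

# Recomputing the Z-score of the first table row from its f3 and standard error.
z_first_row = z_score(-0.00121718, 0.000140141)
```

The last line recovers the Z-score in the first row of the table to within rounding, which is a quick sanity check on how the columns relate to one another.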

In James Michener’s The Covenant one of the plot lines alludes to mixed ancestry in one of the Afrikaner families. The results above suggest that mixed ancestry is very common, and perhaps ubiquitous, in this population. True, there are some Afrikaners such as Hendrik Verwoerd who migrated to South Africa from the Netherlands in the past century or so, but these are uncommon to my knowledge.

After agriculture, before bronze


The above plot shows genetic distance/variation between highland and lowland populations in Papua New Guinea (PNG). It is from a paper in Science that I have been anticipating for a few months (I talked to the first author at SMBE), A Neolithic expansion, but strong genetic structure, in the independent history of New Guinea.

What does “strong genetic structure” mean? Basically Fst is showing the proportion of genetic variation which is partitioned between groups. Intuitively it is easy to understand, in that if ~1% of the genetic variation is partitioned between groups in one case, and ~10% in another, then it is reasonable to suppose that the genetic distance between groups in the second case is larger than in the first case. On a continental scale Fst between populations is often on the order of 0.10. That is the value for example when you pool the variation amongst Northern Europeans and Chinese, and assess how much of it can be apportioned in a manner which differentiates populations (so it’s about 10% of the variation).
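As a rough illustration of what "proportion of variation partitioned between groups" means, here is a minimal Python sketch of Wright's Fst for two populations, computed from allele frequencies as (H_T - H_S) / H_T. It assumes equal weighting of the two populations and ignores sample-size corrections, so it is not the estimator used in the paper. The function name and the frequencies are hypothetical.

```python
def fst_two_pops(freqs_1, freqs_2):
    """Wright's Fst for two equally weighted populations.

    Computed as a ratio of sums over SNPs: (H_T - H_S) / H_T, where H_S is the
    mean within-population heterozygosity and H_T is the heterozygosity of the
    pooled population. Raises ZeroDivisionError if every SNP is monomorphic.
    """
    h_within = 0.0
    h_total = 0.0
    for p1, p2 in zip(freqs_1, freqs_2):
        p_bar = (p1 + p2) / 2
        h_within += (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2
        h_total += 2 * p_bar * (1 - p_bar)
    return (h_total - h_within) / h_total


no_structure = fst_two_pops([0.5], [0.5])     # identical frequencies give 0
fixed_difference = fst_two_pops([0.0], [1.0])  # alternate fixed alleles give 1
continental = fst_two_pops([0.34], [0.66])     # roughly 0.10
```

A single SNP at frequencies 0.34 and 0.66 gives a value close to 0.10, the continental scale discussed above. Even at that level, about 90% of the variation is still found within each population.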

This is why ancient DNA results which reported that Mesolithic hunter-gatherers and Neolithic farmers in Central Europe who coexisted in rough proximity for thousands of years exhibited differences on the order of ~0.10 elicited surprise. These are values we are now expecting from continental-scale comparisons. Perhaps an appropriate analogy might be the coexistence of Pygmy groups and Bantu agriculturalists? Though there is some gene flow, the two populations exist in symbiosis and exhibit local ecological segregation.

In PNG continental scale Fst values are also seen among indigenous people. The differences between the peoples who live in the highlands and lowlands of PNG are equivalent to those between huge regions of Eurasia. This is not entirely surprising because there has been non-trivial gene flow into lowland populations from Austronesian groups, such as the Lapita culture. Many lowland groups even speak Austronesian languages today.

Using standard ADMIXTURE analysis the paper shows that many lowland groups have significant East Asian ancestry (red), while none of the highland groups do (some individuals with East Asian admixture seem to be due to very recent gene flow). But even within the highlands the genetic differences are striking. The Fst values between Finns and Southern European groups such as Spaniards are very high in a European context (due to Finnish Siberian ancestry as well as drift through a bottleneck), but most comparisons within the highland groups in PNG still exceed this.

The paper also argues that genetic differences between Papuans and the natives of Australia pre-date the rising sea levels at the beginning of the Holocene, when Sahul divided between its various constituents. This is not entirely surprising considering that the ecology of the highlands during the Pleistocene would have been considerably different from Australia to the south, resulting in sharp differences in the hunter-gatherer lifestyles. Additionally, there does not seem to have been a genetic cline. Papuans are symmetrically related to all Australian groups they had samples from.

Using coalescence-based genomic methods they inferred that separation between highlands and some lowland groups occurred ~10-20,000 years ago. That is, after the Last Glacial Maximum. For the highlands, the differences seem to date to within the last 10,000 years. The Holocene. Additionally, they see population increases in the highlands, correlating with the shift to agriculture (cultivation of taro).

None of the above is entirely surprising, though I would take the date inferences with a grain of salt. The key is to observe that large genetic differences, as well as cultural differences, accrued in the highlands of PNG during the Holocene. In the paper they have a social and cultural explanation for what’s going on:

  Fst values in PNG fall between those of hunter-gatherers and present-day populations of west Eurasia, suggesting that a transition to cultivation alone does not necessarily lead to genetic homogenization.

A key difference might be that PNG had no Bronze Age, which in west Eurasia was driven by an expansion of herders and led to massive population replacement, admixture, and cultural and linguistic change (7, 8), or Iron Age such as that linked to the expansion of Bantu-speaking farmers in Africa (24). Such cultural events have resulted in rapid Y-chromosome lineage expansions due to increased male reproductive variance (25), but we consistently find no evidence for this in PNG (fig. S13). Thus, in PNG, we may be seeing the genetic, linguistic, and cultural diversity that sedentary human societies can achieve in the absence of massive technology-driven expansions.

Peter Turchin in books like Ultrasociety has argued that one of the theses in Steven Pinker’s The Better Angels of Our Nature is incorrect: violence has not decreased monotonically, but rather peaked in less complex agricultural societies. PNG is clearly a case of this, as endemic warfare was a feature of highland societies when they encountered Europeans. Lawrence Keeley’s War Before Civilization: The Myth of the Peaceful Savage gives so much attention to highland PNG because it is a contemporary illustration of a Neolithic society which until recently had not developed state-level institutions.

What papers like these are showing is that cultural and anthropological dynamics strongly shape the nature of genetic variation among humans. Simple models which assume as a null hypothesis that gene flow occurs through diffusion processes across a landscape where only geographic obstacles are relevant simply do not capture enough of the dynamic. Human cultures strongly shape the nature of interactions, and therefore the genetic variation we see around us.

Quantitative genomics, adaptation, and cognitive phenotypes

The human brain utilizes about 20% of the calories you take in per day. It’s a large and metabolically expensive organ. Because of this fact there are lots of evolutionary models which focus on the brain. In Catching Fire: How Cooking Made Us Human Richard Wrangham suggests that our need for calories to feed our brain is one reason we started to use fire to pre-digest our food. In The Mating Mind Geoffrey Miller seems to suggest that all the things our big complex brain does allow for a signaling of mutational load. And in Grooming, Gossip, and the Evolution of Language Robin Dunbar suggests that it’s social complexity which is driving our encephalization.

These are all theories. Interesting hypotheses and models. But how do we test them? A new preprint on bioRxiv is useful because it shows how cutting-edge methods from evolutionary genomics can be used to explore questions relating to cognitive neuroscience and psychopathology, Polygenic selection underlies evolution of human brain structure and behavioral traits:

…Leveraging publicly available data of unprecedented sample size, we studied twenty-five traits (i.e., ten neuropsychiatric disorders, three personality traits, total intracranial volume, seven subcortical brain structure volume traits, and four complex traits without neuropsychiatric associations) for evidence of several different signatures of selection over a range of evolutionary time scales. Consistent with the largely polygenic architecture of neuropsychiatric traits, we found no enrichment of trait-associated single-nucleotide polymorphisms (SNPs) in regions of the genome that underwent classical selective sweeps (i.e., events which would have driven selected alleles to near fixation). However, we discovered that SNPs associated with some, but not all, behaviors and brain structure volumes are enriched in genomic regions under selection since divergence from Neanderthals ~600,000 years ago, and show further evidence for signatures of ancient and recent polygenic adaptation. Individual subcortical brain structure volumes demonstrate genome-wide evidence in support of a mosaic theory of brain evolution while total intracranial volume and height appear to share evolutionary constraints consistent with concerted evolution…our results suggest that alleles associated with neuropsychiatric, behavioral, and brain volume phenotypes have experienced both ancient and recent polygenic adaptation in human evolution, acting through neurodevelopmental and immune-mediated pathways.

The preprint takes a kitchen-sink approach, throwing a lot of methods for detecting selection at the phenotype of interest. Also, there is always the issue of cryptic population structure generating false positive associations, but they try to address it in the preprint. I am somewhat confused by this passage though:

Paleobiological evidence indicates that the size of the human skull has expanded massively over the last 200,000 years, likely mirroring increases in brain size.

From what I know human cranial sizes leveled off in growth ~200,000 years ago, peaked ~30,000 years ago, and have declined ever since then. That being said, they find signatures of selection around genes associated with ‘intracranial volume.’

There are loads of results using different methods in the paper, but I was curious to note that schizophrenia had hits for ancient and recent adaptation. A friend who is a psychologist pointed out to me that when you look within families “unaffected” siblings of schizophrenics often exhibit deviation from the norm in various ways too; so even if they are not impacted by the disease, they are somewhere along a spectrum of ‘wild type’ to schizophrenic. In any case in this paper they found recent selection for alleles ‘protective’ of schizophrenia.

There are lots of theories one could spin out of that singular result. But I’ll just leave you with the fact that when you have a quantitative trait with lots of heritable variation it seems unlikely it’s been subject to a long period of unidirectional selection. Various forms of balancing selection seem to be at work here, and we’re only in the early stages of understanding what’s going on. Genuine comprehension will require:

– attention to population genetic theory
– large genomic data sets from a wide array of populations
– novel methods developed by population genomicists
– and functional insights which neuroscientists can bring to the table

South Asian gene flow into Burmese and Malays?

I happen to have a data set merged from the 1000 Genomes and Estonian Biocentre which has Malays, Burmans, and other assorted Southeast Asians, East Asians, and South Asians. In light of recent posts I thought I would throw out something in relation to this data set (you can download the data here). Above you can see the populations in the data. You see Bangladeshis consistently are shifted toward Southeast Asians in comparison to other South Asians. But both Burmans and Malays exhibit some shift toward South Asians.

I ran ADMIXTURE at K = 4. Click the image for the larger file which shows the populations, but I will tell you what’s going on.

The yellow to green represent a north-south axis in East Asia. The Han sample is mostly yellow, but there is a green component in varying degrees. This almost certainly represents heterogeneity in the Han sample of north to south Chinese. The green component is nearly 100% in some individuals from indigenous tribes in Borneo, and balanced with the yellow among peninsular Malays. It is at a higher frequency in Cambodia than in Vietnam or Burma, indicating the older roots of Khmers and their relative insulation from later migrations of Sino-Tibetan and Tai peoples.

The red South Asian component is found in many Southeast Asians, but curiously in the Burmans and Malays there is a lot of variation within the population. That indicates admixture over time that has not homogenized throughout the population.

I ran Treemix with 5 migration edges and French rooted (1000 SNP blocks out of 225,000 SNPs) and they all looked like this. Commentary I will leave to readers….

Genetics books for the masses!

Since I’ve become professionally immersed in genetics I haven’t read many books on the topic. I read papers. And I do genetics. But back in the day I did enjoy a good book. The standard recommendation would be to read Matt Ridley’s Genome. It’s a bit dated now (it was published around when the Human Genome Project was being completed), but I’d still recommend it.

But when in the mid-2000s I dabbled a little bit in the world of worm (C. elegans) genetics I read Andrew Brown’s In the Beginning Was the Worm: Finding the Secrets of Life in a Tiny Hermaphrodite. It’s pretty far from my current concerns and fixations, with more of a focus on developmental processes, but it is pretty cool to read about the race to “map” every cell in C. elegans.

The second book I’d recommend to readers of this blog is the late Will Provine’s The Origins of Theoretical Population Genetics. Modern population genomics is a massive edifice built atop the foundations of the early 20th century fusion of Mendelism and the biometrical heirs of Darwin. Provine outlines how primitive genetics eventually seeded the birth of the Neo-Darwinian Synthesis.

Why do percentage estimates of “ancestry” vary so much?

When looking at the results in Ancestry DNA, 23andMe, and Family Tree DNA my “East Asian” percentage is:

– 19%
– 13%
– 6%

What’s going on here? In science we often make a distinction between precision and accuracy. Precision is how much your results vary when you re-run an experiment or measurement. Basically, can you reproduce your result? Accuracy refers to how close your measurement is to the true value. A measurement can be quite precise, but consistently off. Similarly, a measurement may be imprecise, but it bounces around the true value…so it is reasonably accurate if you get enough measurements to cancel out the errors (which are random).

The values above are precise. That is, if you got re-tested on a different chip, the results aren’t going to be much different. The tests are using as input variation on 100,000 to 1 million markers, so a small proportion will give different calls than in the earlier test. But that’s not going to change the end result in most instances, even though these methods often have a stochastic element.

But what about accuracy? I am not sure that old chestnuts about accuracy apply in this case, because the percentages that these services provide are summaries and distillations of the underlying variation. The model of precision and accuracy that I learned would be more applicable to the DNA SNP array which returns calls on the variants; that is, how close are the calls of the variant to the true value (last I checked these arrays are around 99.5% accurate in terms of matching the true state).

What you see when these services pop out a percentage for a given ancestry is the outcome of a series of conscious choices that designers of these tests made keeping in mind what they wanted to get out of these tests. At a high level here’s what’s going on:

  1. You have a model of human population history and dynamics with various parameters
  2. You have data that varies that you put into that model
  3. You have results which come back with values which are the best fit of that data to the model you specified

Basically you are asking the computational framework a question, and it is returning its best answer to the question posed. To ask whether the answer is accurate or not is almost not even wrong. The frameworks vary because they are constructed by humans with different preferences and goals.

Almost, but not totally wrong. You can for example simulate populations whose histories you know, and then test the models on the data you generated. Since you already know the “truth” about the simulated data’s population structure and history, you can see how well your framework can infer what you already know from the patterns of variation in the generated data.

Going back to my results, why do my East Asian percentages vary so much? The short answer is that one of the major variables in the model alluded to above is the nature of the reference population set and the labels you give them.

Looking at Bengalis, the ethnic group I’m from, it is clear that in comparison to other South Asian populations they are East Asian shifted. That is, it seems clear I do have some East Asian ancestry. But how much?

The “simple” answer is to model my ancestry as a mix of two populations, an Indian one and an East Asian one, and then see what the values are for my ancestry across the two components. But here is where semantics becomes important: what is Indian and East Asian? Remember, these are just labels we give to groups of people who share genetic affinities. The labels aren’t “real”, the reality is in the raw read of the sequence. But humans are not capable of really getting anything from millions of raw SNPs assigned to individuals. We have to summarize and re-digest the data.

The simplest explanation for what’s going on here is that the different companies have different populations put into the boxes which are “Indian/South Asian” and “East Asian.” If you are using fundamentally different measuring sticks, then there are going to be problems with doing apples to apples comparisons.

My personal experience is that 23andMe tends to give very high percentages of South Asian ancestry for all South Asians. Because “South Asian” is a very diverse category when tests come back that someone is 95-99% South Asian…it’s not really telling you much. In contrast, some of the other services may be using a small subset of South Asians, who they define as “more typical”, and so giving lower percentages to people from Pakistan and Bengal, who have admixture from neighboring regions to the west and east respectively.*

Something similar can occur with East Asian ancestry. If the “donor” ancestral groups are South Asian and East Asian for me, then the proportion of each is going to vary by how close the donor groups selected by the company are to the true ancestral groups. If, for example, Family Tree DNA chose a more Northeastern Asian population than Ancestry DNA, then my East Asian percentage would vary between the two services because I know my East Asian ancestry is more Southeast Asian.
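The effect of the reference choice can be sketched with simulated data where the truth is known. This is a toy model, not what any of the companies actually run. It assumes a simulated person whose expected allele frequencies are an exact mix of a "South Asian" and a "Southeast Asian" panel, and it estimates the mixing fraction by least squares against two different "East Asian" reference panels. All frequencies, the 0.15 fraction, and the function name are invented for illustration.

```python
def estimate_fraction(individual, ref_base, ref_donor):
    """Least-squares fit of: individual = (1 - alpha) * ref_base + alpha * ref_donor.

    Returns alpha, the estimated share of ancestry assigned to ref_donor.
    """
    num = sum((x - b) * (d - b) for x, b, d in zip(individual, ref_base, ref_donor))
    den = sum((d - b) ** 2 for b, d in zip(ref_base, ref_donor))
    return num / den


# Made-up allele frequencies at five SNPs (illustrative only, not real data).
south_asian = [0.10, 0.50, 0.80, 0.30, 0.60]
southeast_asian = [0.50, 0.20, 0.40, 0.70, 0.40]
northeast_asian = [0.70, 0.10, 0.20, 0.90, 0.20]

TRUE_FRACTION = 0.15  # assumed "true" Southeast Asian share of the simulated person

# Expected allele frequencies for the simulated person under the true model.
person = [(1 - TRUE_FRACTION) * s + TRUE_FRACTION * e
          for s, e in zip(south_asian, southeast_asian)]

# Same person, same data, two different definitions of the "East Asian" box.
matched = estimate_fraction(person, south_asian, southeast_asian)
mismatched = estimate_fraction(person, south_asian, northeast_asian)
```

With the Southeast Asian panel as the donor the fit returns the true 0.15. With the Northeast Asian panel the same data comes back as a little under 0.10. Nothing about the person changed, only the measuring stick.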

The moral of the story is that the values you obtain are conditional on the choices you make, and those choices emerge from the process of reducing and distilling the raw genetic variation into a manner which is human interpretable. If the companies decided to use the same model, they would come out with the same results.

* I helped develop an earlier version of MyOrigins, and so can attest to this firsthand.

But evolution converges!

Stephen Jay Gould became famous in part for his book Wonderful Life: The Burgess Shale and the Nature of History. By examining the strange creatures in the Burgess Shale formation Gould makes the case that evolution is a highly contingent process, and that if you reran the experiment of life what we’d see might be very different from what we have now.

But the scientist whose study of the formation inspired Gould’s interpretation, Simon Conway Morris, had very different views. Though it can sometimes be churlish, his rebuttal can be found in The Crucible of Creation: The Burgess Shale and the Rise of Animals. Simon Conway Morris does not believe that contingency is nearly as powerful a force as Gould would have you believe. And his viewpoints are influential. Richard Dawkins leaned on him to make the case for convergence in evolution in The Ancestor’s Tale.

This crossed my mind when reading Carl Zimmer’s new column, When Dinosaurs Ruled the Earth, Mammals Took to the Skies:

Today, placental mammals like flying squirrels and marsupials like sugar gliders travel through the air from tree to tree. But Volaticotherium belonged to a different lineage and independently evolved the ability to glide.

They were not the only mammals to do so, it turns out. Dr. Luo and his colleagues have now discovered at least two other species of gliding mammals from China, which they described in the journal Nature.

Dr. Meng said that the growing number of fossil gliders showed that many different kinds of mammals followed the same evolutionary path. “They did their own experiments,” he said.

This ultimately comes down to physics. There are only so many ways you can make an organism that flies or glides. Mammals come to the table with a general body plan, and that can be modified only so many different ways.

This is not a foolproof data point in favor of convergence as opposed to contingency. Frankly these are often vague verbal arguments which are hard to refute or confirm. And even molecular evolutionary analyses come to different conclusions. It may be that we are asking the wrong question. But it does suggest that evolution may work in a much narrower range of parameters as time progresses because of the winnowing power of selection.

Jon Snow + Daenerys Targaryen far creepier genetically than you know

Credit: poly-m (deviantART)

If you have been following Game of Thrones you have been noticing that there is a brewing romance between Jon Snow, King in the North, and Daenerys Targaryen, the aspiring claimant to her father’s Iron Throne.

Of course there is a twist to all of this: unbeknownst to either, Jon Snow’s biological father is Daenerys’ dead brother, Rhaegar. This means that Daenerys is Jon Snow’s aunt.

Long-time followers of the world of Game of Thrones are aware that incest between near relations is neither unknown nor shocking. But there is a non-trivial detail which it is important to note. Jon and Daenerys are far more closely related than typical aunts and nephews.

The reason is simple: Daenerys and her brother were the products of two generations of sibling incest. Incest results in inbreeding, and inbreeding as you know results in loss of genetic diversity. By Daenerys’ generation the coefficient of relationship between herself and her brothers was much higher than normal.

To be concrete, the coefficient of relationship of full-siblings is 0.50. That of half-siblings 0.25. Identical twins? Obviously 1.0. Another way to think about this is how much of the genome any pair of individuals share in terms of long tracts of inheritance from recent ancestors. On the whole siblings share about half of their genomes in such a fashion. After two generations of inbreeding Daenerys and Rhaegar have a coefficient of relationship of 0.727 (using Wright’s method). They’re not identical twins, obviously, but their genetic relationship is far closer than full-siblings!
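The 0.727 figure can be reproduced with a short pedigree calculation. The sketch below assumes the simplest pedigree consistent with "two generations of sibling incest": two unrelated, non-inbred founders, their son and daughter as the grandparents, and that pair's son and daughter as the parents. It uses Wright's coefficient of relationship, r = 2f / sqrt((1 + F_a)(1 + F_b)), where f is the kinship coefficient and F the inbreeding coefficient. The pedigree labels and helper names are illustrative.

```python
from functools import lru_cache
from math import sqrt

# Assumed pedigree: each entry is (father, mother); founders have None.
PEDIGREE = {
    "founder_m": None,
    "founder_f": None,
    "grandfather": ("founder_m", "founder_f"),
    "grandmother": ("founder_m", "founder_f"),
    "father": ("grandfather", "grandmother"),
    "mother": ("grandfather", "grandmother"),
    "rhaegar": ("father", "mother"),
    "daenerys": ("father", "mother"),
}


def depth(ind):
    """Generations between an individual and the most distant founder."""
    parents = PEDIGREE[ind]
    return 0 if parents is None else 1 + max(depth(p) for p in parents)


@lru_cache(maxsize=None)
def kinship(a, b):
    """Kinship coefficient: chance that random alleles from a and b are identical by descent."""
    if a == b:
        parents = PEDIGREE[a]
        return 0.5 if parents is None else 0.5 * (1 + kinship(*parents))
    # Always expand the individual from the later generation, who cannot be
    # an ancestor of the other one.
    if depth(a) < depth(b):
        a, b = b, a
    parents = PEDIGREE[a]
    if parents is None:
        return 0.0  # two distinct founders are assumed unrelated
    return 0.5 * (kinship(parents[0], b) + kinship(parents[1], b))


def inbreeding(ind):
    """Inbreeding coefficient F: the kinship between the individual's parents."""
    parents = PEDIGREE[ind]
    return 0.0 if parents is None else kinship(*parents)


def relationship(a, b):
    """Wright's coefficient of relationship."""
    return 2 * kinship(a, b) / sqrt((1 + inbreeding(a)) * (1 + inbreeding(b)))


r_first_generation = relationship("grandfather", "grandmother")  # ordinary full siblings
r_siblings = relationship("rhaegar", "daenerys")
```

Under these assumptions the first sibling pair comes out at the usual 0.50, while Rhaegar and Daenerys come out at 1 / 1.375, or about 0.727. Each of them has an inbreeding coefficient of 0.375.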

Don’t let the mother of dragons ride you Jon!

Dividing this in half gives 0.36 as the coefficient of relationship between Jon and Daenerys, as opposed to 0.50 for full-siblings and 0.25 for a conventional aunt-nephew. Jon and Daenerys have almost the same genetic relationship as 3/4 siblings; two individuals who share a common parent, like half-siblings, but whose unshared parents are first order relatives (full-siblings or parent-child).

Not Jaime & Cersei creepy, but still creepy.

Addendum: Though Daenerys is quite inbred, Jon is not at all. One generation of outbreeding can eliminate all inbreeding.

When the ancestors were cyclops

The Greeks are important because Western civilization began with Greece. And therefore modern civilization. I don’t think the Greeks were “Western” truly; my own preference is to state that the West as we understand it is really just Latin Christendom, which emerged in the late first millennium A.D. in any coherent fashion. Yet without Classical Greece and its accomplishments the West wouldn’t make any sense.

But here I have to stipulate Classical, because Greeks existed before the Classical period. That is, a people who spoke a language that was recognizably Greek and worshipped gods recognizable to the Greeks of the Classical period. But these Greeks were not proto-Western in any way. These were the Mycenaeans, a Bronze Age civilization which flourished in the Aegean in the centuries before the cataclysms outlined in 1177 B.C.

The issue with the Mycenaean civilization is that its final expiration in the 11th century ushered in a centuries-long Dark Age. During this period the population of Greece seems to have declined, and society reverted to a simpler structure. By the time the Greeks emerged from this Dark Age much had changed. For example, they no longer used Linear B writing. Presumably this technique was passed down along lineages of scribes, whose services were no longer needed, because the grand warlords of the Bronze Age were no longer there to patronize them and make use of their skills. In its stead the Greeks modified the alphabet of the Phoenicians.

To be succinct, the Greeks had to learn civilization all over again. The barbarian interlude had broken continuous cultural memory between the Mycenaeans and the Greeks of the developing poleis of the Classical period. The fortifications of the Mycenaeans were assumed by their Classical descendants to be the work of a lost race which had the aid of monstrous cyclops.

Of course not all memories were forgotten. Epic poems such as The Iliad retained the memory of the past through the centuries. The list of kings who sailed to Troy actually reflected the distribution of power in Bronze Age Greece, while boar’s tusk helmets mentioned by Homer were typical of the period. To be sure, much of the detail in Homer seems more reflective of a simpler society of petty warlords, so the nuggets of memory are encased in later lore accrued over the centuries.

When antiquarians and archaeologists began to take a look at the Bronze Age Aegean the assumption by many was that the Mycenaeans were not Greek, but extensions of the earlier Minoan civilization. The whole intellectual history here is outlined in Michael Wood’s 1980s documentary In Search of the Trojan War. But suffice it to say that many were shocked when Michael Ventris deciphered Linear B, and found that it was clearly Greek!

The surprise here was partly due to the fact that though Mycenaean cultural remains indicated a different civilization from that of the Minoans, its motifs are clearly inherited from the earlier group. Mycenaeans seemed in many ways to be Minoans in chariots. And the presumption has long been that the Minoans themselves were not an Indo-European population. In fact, the island of Crete had developed early on and become part of the orbit of civilized states from the northern Levant down to Egypt, including Cyprus. Therefore some scholars hypothesized an Egyptian connection.

In any case, the Mycenaeans were Greek. And Homer then most certainly must have transmitted traditions which went back to the Bronze Age.

At this point we can now speak to demographics with some data, as Nature has come out with a paper using ancient DNA from Mycenaeans, Minoans, as well as Bronze Age Anatolians, Genetic origins of the Minoans and Mycenaeans:

The origins of the Bronze Age Minoan and Mycenaean cultures have puzzled archaeologists for more than a century. We have assembled genome-wide data from 19 ancient individuals, including Minoans from Crete, Mycenaeans from mainland Greece, and their eastern neighbours from southwestern Anatolia. Here we show that Minoans and Mycenaeans were genetically similar, having at least three-quarters of their ancestry from the first Neolithic farmers of western Anatolia and the Aegean, and most of the remainder from ancient populations related to those of the Caucasus and Iran. However, the Mycenaeans differed from Minoans in deriving additional ancestry from an ultimate source related to the hunter–gatherers of eastern Europe and Siberia, introduced via a proximal source related to the inhabitants of either the Eurasian steppe or Armenia. Modern Greeks resemble the Mycenaeans, but with some additional dilution of the Early Neolithic ancestry. Our results support the idea of continuity but not isolation in the history of populations of the Aegean, before and after the time of its earliest civilizations.

About 85% of the ancestry of the Minoan samples could be modeled as being derived from Anatolian farmers, the ancestors of the “Early European Farmers” (EEF) that introduced agriculture to most of the continent, and whose heritage is most clear in modern populations among Sardinians. For the three Mycenaean samples the value is closer to 80% (though perhaps high 70s is more accurate).

Now the question though is what’s the balance? For the Minoans the residual is a component which seems to derive from “Eastern Farmer” populations. Additionally the authors note that the Y chromosomes in four out of five individuals among their Mycenaean, Minoan, and Anatolian samples are haplogroup J associated with these eastern groups, rather than the ubiquitous G2 of the earlier farmer populations. The authors suggest that in the 4th millennium B.C. there was a demographic event where this ancestral component swept west, and served as the common Mycenaean-Minoan (and Anatolian) substrate.

But the Mycenaean samples (one of which was elite, two of which were not) also have a third component: affinities with steppe populations. One model which presents itself is that there was a pulse out of the Balkans, and this was part of the dynamic described in “Massive migration from the steppe was a source for Indo-European languages in Europe.” But another model, which they could not reject, is that the steppe affinity came from the east, perhaps from a proto-Armenian population. Additionally, they did not find much steppe ancestry in the Anatolian samples at all.

My own preference is for a migration through the Balkans. It seems relatively straightforward. As for why the Anatolian samples did not have the steppe ancestry, the authors provide the reasonable supposition that Indo-European in Anatolia branched off first, and the demographic signal was diluted over successor generations. Perhaps. But another aspect of Anatolia is that it seems the Hittites, the Nesa, were never a numerous population in comparison to the Hatti amongst whom they lived. Perhaps a good model for their rise and takeover may be that of the post-Roman West and the Franks in Gaul.

Then the question becomes how does a less numerous people impose their language on a more numerous one? This happens. See the Hungarians for an example. In fact the paper which covered the other end of the Mediterranean, “The population genomics of archaeological transition in west Iberia: Investigation of ancient substructure using imputation and haplotype-based methods,” suggests that language shift can occur in unpredictable ways. On the one hand Basques seem to have mostly Indo-European Y chromosomes, but their whole genome ancestry indicates less exogenous input than their neighbors. Speaking of which, we know that by the Classical period large regions of western Spain were dominated by Celtic-speaking peoples, but the genetic imprint of the Indo-Europeans is still very modest in the Iberian peninsula.

I think what we’re seeing here is the difference between Indo-European agro-pastoralists arriving to a landscape of relatively simple societies with more primal institutions, and those who migrated into regions where local population densities are higher and social complexity is also greater. This higher social complexity means that external elites can take over a system, as opposed to an almost animal competition for resources as seems to have occurred in Northern Europe.

Finally, at the end of the supplements there is an analysis of the physical features of the Minoans and Mycenaeans. There’s not much that’s surprising. The Minoans and Mycenaeans were a dark haired and dark eyed folk. Why should this surprise us at all? We actually have self representations of them! That’s what they look like. If anything they were darker than modern Greeks (small sample size means power to draw conclusions is not high). Why?

Two reasons come to mind: natural selection, and the fact that modern Greeks seem to be shifted toward continental Europeans to their north, likely due to migration. My number one contender here is the Sclaveni, the Slavic tribes which pushed into much of Greece in the second half of the 6th century A.D. (though a minority of Greek samples I’ve seen don’t exhibit much skew toward Slavs at all).

In the future with more samples and more genomes we’ll know more. But I think this work emphasizes that when it comes to Europe most of the demographic patterns we see around us date to the Bronze Age or earlier.

The future will be genetically engineered

If the film Rise of the Planet of the Apes had come out a few years later I believe there would have been mention of CRISPR. Sometimes science leads to technology, and other times technology aids in science. On occasion the two are one and the same.

The plot I made above shows that in the first five years of the second decade of the 21st century CRISPR went from being an obscure aspect of bacterial genetics to ubiquitous. Friends who had been utilizing “advanced” genetic engineering methods such as TALENs and zinc fingers switched overnight to a CRISPR/Cas9 framework.

As I’ve said before the 2010s are the decade when “reading” the genome becomes normal. We really don’t know what the CRISPR/Cas9 technology is capable of. It’s early years yet. With that, “First Human Embryos Edited in U.S.” Technically they’re single-celled zygotes. The science itself is not astounding. Rather, it is that the human Rubicon has been crossed in the United States. As indicated in the article there has been some jealousy about what the Chinese have been able to do because of a different cultural and regulatory framework.

There are those calling for a moratorium on this work (on humans). I’m not in favor or opposed. Rather, my question is simple: if CRISPR/Cas9 makes genetic engineering cheap, easy, and effective, how exactly are we going to enforce a world-wide moratorium? A Butlerian Jihad?

Note: I know that people are freaking out about humans + genetic engineering. But most geneticists I know are more excited about the prospects of non-human work, since human clinical trials are going to be way in the future. More than 20 years after Dolly, it’s notable to me that no human has been cloned from adult somatic cells yet.