Beyond visualization of data in genetics

Hopefully by now the image to the left is familiar to you. It’s from a paper in Human Genetics, Self-reported ethnicity, genetic structure and the impact of population stratification in a multiethnic study. The paper is interesting in and of itself, as it combines a wide set of populations and focuses on the extent of disjunction between self-identified ethnic identity and the population clusters which fall out of patterns of genetic variation. In particular, the authors note that the “Native Hawaiian” identification in Hawaii is characterized by a great deal of admixture, and within their sample only ~50% of the ancestral contribution within this population was Polynesian (the balance split between European and Asian). The figure suggests that subjective self-assessment of ancestral quanta is generally accurate, though there are a non-trivial number of outliers. Dienekes points out that the same dynamic holds (less dramatically) for the European and Japanese populations within their data set.

All well and good. And I like these sorts of charts because they’re pithy summations of a lot of relationships in a comprehensible geometrical fashion. But they’re not reality, they’re a stylized representation of a slice of reality, abstractions which distill the shape and processes of reality. More precisely, the x-axis is an independent dimension of correlations of variation across genes which can account for ~7% of the total population variance. This is the dimension with the largest magnitude. The y-axis is the second largest dimension, accounting for ~4%. The magnitudes decline precipitously as you descend the rank order of the principal components. The 5th component accounts for ~0.2% of the variance.
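To make the idea of "percent of variance per component" concrete, here is a minimal sketch in Python. It assumes NumPy is available, and the genotype matrix is simulated toy data (two hypothetical groups, 200 markers), not the data from the paper. The function name explained_variance_fractions is my own, and the fractions it prints will not match the ~7% and ~4% quoted above.

```python
import numpy as np


def explained_variance_fractions(genotypes):
    """Return the share of total variance captured by each principal component."""
    # Center each marker (column) so components describe variation around the mean.
    centered = genotypes - genotypes.mean(axis=0)
    # Singular values come back sorted from largest to smallest;
    # their squares are proportional to the variance along each component.
    singular_values = np.linalg.svd(centered, compute_uv=False)
    variances = singular_values ** 2
    return variances / variances.sum()


# Hypothetical toy data: two groups of 50 individuals, 200 markers,
# each entry an allele count of 0, 1 or 2.
rng = np.random.default_rng(0)
freq_a = rng.uniform(0.1, 0.9, size=200)
freq_b = np.clip(freq_a + rng.normal(0.0, 0.1, size=200), 0.05, 0.95)
group_a = rng.binomial(2, freq_a, size=(50, 200))
group_b = rng.binomial(2, freq_b, size=(50, 200))
genotypes = np.vstack([group_a, group_b]).astype(float)

fractions = explained_variance_fractions(genotypes)
print(fractions[:5])  # shares for the five largest components
```

The first entry plays the role of the x-axis in a plot like the one above and the second the role of the y-axis. The remaining entries are the long tail of smaller components that a two-dimensional chart leaves out.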

The first two components in these sorts of studies usually conform to our intuitions, and add a degree of precision to various population-scale relations. Consider this supplementary chart from a 2008 paper (I’ve rotated and reedited for clarity):

The Chinese Muslims

The post is titled the Chinese Muslims, not the Muslims of China. One may make a semantic distinction here in that the latter connotes the residence of a Muslim community within Chinese society, while the former indicates members of Chinese society who happen to be Muslim. Such black and white dichotomies are naturally artificial, but to a large extent the Uyghurs of Xinjiang fall into the category of a group of Muslims (speakers of a Turkic language) who happen to fall within the boundaries of the modern Chinese state (thanks to the Chinese state's inheritance of the full expanse of the 18th-century Manchu Empire). On the other hand, the Hui people are arguably more a Chinese people who happen to be Muslim.

For more on the topic, please see my blog post at the Islam in China website. It was submitted a while back, but it only went up recently.

Answering Wallace's challenge: Relaxed Selection and Language Evolution

How does natural selection account for language? Darwin wrestled with it, Chomsky sidestepped it, and Pinker claimed to solve it. Discerning the evolution of language is therefore a much sought-after endeavour, with a vast number of explanations emerging that offer a plethora of choice, but little in the way of consensus. This is hardly new, and at times has seemed completely frivolous and trivial. So much so that in the 19th century the Société de Linguistique de Paris actually went as far as to ban any discussion and debate on the origins of language. Put simply: we don’t really know that much. Often quoted in these debates is Alfred Russel Wallace, who, in a letter to Darwin, argued that: “natural selection could only have endowed the savage with a brain a little superior to that of an ape whereas he possesses one very little inferior to that of an average member of our learned society”.

This is obviously relevant for those of us studying language evolution. If, as Wallace challenged, natural selection (and more broadly, evolution) is unable to account for our mental capacities and behavioural capabilities, then what is the source of our propensity for language? Well, I think we’ve come far enough to rule out the spiritual explanations of Wallace (although they still persist in some corners of the web), and whilst I agree that biological natural selection alone is not sufficient to explain language, we can certainly place it in an evolutionary framework.

Such is the position of Prof Terrence Deacon, who, in his current paper for PNAS, eloquently argues for a role for relaxed selection in the evolution of the language capacity. He’s been making these noises for a while now, as I previously mentioned here, and he also recognises evolution-like processes in development. However, with the publication of this paper I think it’s about time I disseminated his current ideas in more detail, which, in my humble opinion, offer a more nuanced position than the strict modular adaptationism previously championed by Pinker et al (I say previously, because Pinker also has a paper in this issue, and I’m going to read it before making any claims about his current position on the matter).

Doing evolution's sums

PLoS Biology has a review of Elements of Evolutionary Genetics up, Evolution Is a Quantitative Science:

But why has evolutionary genetics stood apart from biology’s resolutely qualitative, rather than quantitative, tradition? Most remarkably, while biomechanics employs the laws of physics, and biochemistry is founded on the quantitative science of chemistry, evolutionary genetics is based on axiomatic foundations that are entirely biological, and yet are capable of precise mathematical formulation. The rules of Mendelian genetics, encapsulated by unbiased inheritance and random mating in a diploid genetic system, predict Hardy-Weinberg frequencies, the binomial sampling of gametes in finite populations determines the properties of genetic drift, and, with a Poisson process of mutation, the complex theory of neutral genetic variation can be established on the basis of very simple assumptions.
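The "very simple assumptions" in the passage above are easy to put into code. What follows is a minimal sketch, not anything taken from the book or the review. It assumes the standard Wright-Fisher setup (my label, the quote does not name it) with one locus, two alleles, a constant population of N diploids, and no selection or mutation. The function names are hypothetical.

```python
import random


def hardy_weinberg(p):
    """Expected genotype frequencies (AA, Aa, aa) under random mating."""
    q = 1.0 - p
    return p * p, 2.0 * p * q, q * q


def wright_fisher(p0, n_diploid, generations, rng):
    """Allele frequency over time under pure drift.

    Each generation is formed by binomially sampling 2N gametes
    from the previous generation's allele frequency.
    """
    p = p0
    trajectory = [p]
    for _ in range(generations):
        # Draw 2N gametes; each carries the focal allele with probability p.
        copies = sum(1 for _ in range(2 * n_diploid) if rng.random() < p)
        p = copies / (2 * n_diploid)
        trajectory.append(p)
    return trajectory


rng = random.Random(42)  # fixed seed so reruns give the same trajectory
print(hardy_weinberg(0.3))
print(wright_fisher(0.5, 50, 100, rng)[-1])
```

Running wright_fisher with a smaller n_diploid makes the frequency wander faster and hit 0 or 1 sooner, which is the sense in which binomial sampling "determines the properties of genetic drift".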

A few books to get a historical perspective on the origins of modern evolutionary genetics (albeit with a pop gen focus): The Origins of Theoretical Population Genetics; Sewall Wright and Evolutionary Biology; and R.A. Fisher: The Life of a Scientist.

The genes in Spain fall rather evenly

A new paper is out which drills down a bit on the genetic substructure in Spain. Genetic Structure of the Spanish Population:

Genetic admixture is a common caveat for genetic association analysis. Therefore, it is important to characterize the genetic structure of the population under study to control for this kind of potential bias.

In this study we have sampled over 800 unrelated individuals from the population of Spain, and have genotyped them with a genome-wide coverage. We have carried out linkage disequilibrium, haplotype, population structure and copy-number variation (CNV) analyses, and have compared these estimates of the Spanish population with existing data from similar efforts.

In general, the Spanish population is similar to the Western and Northern Europeans, but has a more diverse haplotypic structure. Moreover, the Spanish population is also largely homogeneous within itself, although patterns of micro-structure may be able to predict locations of origin from distant regions. Finally, we also present the first characterization of a CNV map of the Spanish population. These results and original data are made available to the scientific community.

What’s in a name? Genetic overlap between major psychiatric disorders

Mirrored from

The criteria used to assign patients to specific psychiatric disease categories are set out in the Diagnostic and Statistical Manual of Mental Disorders, published by the American Psychiatric Association. (There is also a World Health Organisation equivalent, the International Classification of Diseases.) Every so often, these criteria are revised to reflect new research and changing concepts of disease. The APA has just released a draft of preliminary revisions to the current diagnostic criteria as part of the preparations for the fifth release (DSM-5), due out in 2013.

The specific diagnosis given to any patient who shows up with a spectrum of symptoms has major implications not only for their clinical treatment but also for insurance, education, employment and many other aspects of their lives. Given the authority and influence of this tome for clinical practice as well as research purposes, it is timely to consider how genetics is, or is not, informing the diagnostic criteria.

ResearchBlogCast #7

Here. The paper is Coordinated Punishment of Defectors Sustains Cooperation and Can Proliferate When Rare. The blog post highlighted is Punishing Cheaters Promotes the Evolution of Cooperation.

It is probably obvious that I’m not on the internet as much right now. But I’ve been thinking on the topic of this paper for a few days, and plan on putting together a post when I have something interesting to say, and nothing interesting to do off-net.

P.S. We decided to bring Kevin Zelnio back on.

How the Swedes became white

A few weeks ago I read Peter Heather’s Empires and Barbarians, but I had another book waiting in the wings which I had planned to tackle as a companion volume, Robert Ferguson’s The Vikings: A History. Heather covered the period of one thousand years between Arminius and the close of the Viking Age, but his real focus was on the three centuries between 300 and 600. It is telling that he spent more time on the rise and expansion of the domains of the Slavic-speaking peoples than he did on the Viking assaults on Western civilization; an idiosyncratic take from the perspective of someone writing for an audience of English speakers. But within the larger narrative arc of Empires and Barbarians this was logical: the Slavs were far closer to the relevant action in terms of time and space than the Scandinavians, who ravaged early medieval rather than post-Roman societies (where the latter bleeds into the former is up for debate). In Heather’s narrative the Viking invasions were a coda to the epoch of migration, the last efflorescence of the barbarian Europe beyond the gates of Rome before the emergence of a unified medieval Christian commonwealth. And these are the very reasons that Robert Ferguson’s narrative is a suitable complement to Peter Heather’s. Ferguson’s story begins after the central body of Heather’s, and most of its dramatic action is outside of the geographical purview of Empires and Barbarians. In The Vikings the post-Roman world has already congealed into the seeds of what we would term the Middle Ages, and it is this world which serves as the canvas upon which the Viking invasions are painted. Aside from what was Gaul, the world of old Rome is on the peripheries of Ferguson’s narrative.

