Don't count old stock Anglo-America out

One of the things I really hate is unqualified linear projections. They’re so useless most of the time. A science fiction magazine will give you more insight about the future than the United Nations population projection for the year 2100. This is just as much of an issue when it comes to American Census demographic projections. As I’ve noted before, population projections of the coming non-Hispanic white minority between 2040 and 2050 are sensitive to the assumptions behind the basic parameters. The logic of the projection is crystal clear and airtight, but just because a certain set of assumptions holds today does not mean that those assumptions will hold indefinitely (though the Census projections are much more plausible than the United Nations projections, because two generations are so much more strongly impacted by the inertia of current conditions than four generations). In the 18th and 19th centuries white Americans, and especially the Anglo-Saxon founding stock, were a highly fertile folk. They took over the American Southwest and the Northwest in large part due to their demographic assault. In New England the 30,000 of 1650 became the 700,000 of 1790 in large part due to fertility rates on the order of 7 per woman! Today no one would expect Anglo-Saxon Americans to be so fertile, let alone the New Englanders who were prominent in the population control movements of the 20th century. In the 17th and 18th centuries the Jews of Eastern Europe were a highly prolific group, and the gentile majority in places like Poland viewed the waxing of the proportion of this minority with great suspicion. Today no one views the Ashkenazi Jews as demographic engines, though in places like Israel the fecund Haredi have now helped close the “birth gap” with the Arab population, as their fraction of the Jewish population keeps increasing.
I can give you other “counter-intuitive” examples from the recent past, but a little history goes a long way in teaching suspicion (e.g., in the Balkans in the late 19th century rural Christian populations had much higher fertility than urban Muslim ones).
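
To put the New England figures in perspective, here is a rough back-of-the-envelope sketch in Python. It assumes a constant compound growth rate between the two population counts quoted above and ignores migration entirely; the function names are illustrative only, not part of any analysis behind this post.

```python
import math


def implied_annual_growth(start_pop, end_pop, years):
    """Constant annual growth rate that carries start_pop to end_pop in `years`."""
    return (end_pop / start_pop) ** (1.0 / years) - 1.0


def doubling_time(rate):
    """Years needed for a population to double at a constant annual rate."""
    return math.log(2) / math.log(1.0 + rate)


# Figures quoted in the text: 30,000 in 1650 and 700,000 in 1790.
# Assumption: smooth compound growth with no net migration.
rate = implied_annual_growth(30_000, 700_000, 1790 - 1650)
print(f"{rate:.2%} per year, doubling every {doubling_time(rate):.0f} years")
```

Under those assumptions the implied rate is a bit under 2.3% a year, which means a doubling roughly every 31 years. Changing the assumed rate by even half a point shifts the end figure a great deal over 140 years, which is the same sensitivity that makes long-range projections so fragile.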

These sorts of reversals are not inexplicable. Fertility shifts occur, sometimes within a generation or two. This is why Thomas Malthus turned out to be wrong: he didn’t predict the demographic transition. But we shouldn’t be complacent and assume we’ve reached the “end of history” when it comes to fertility transitions. In the early 20th century there was great terror in the American elite due to the immigration of what would later be termed “ethnic whites,” in particular Jews and Southern Europeans. And yet the Jewish proportion of the American population peaked in the late 1940s at ~5%. What about the other groups? The General Social Survey has large sample sizes for some ethnic groups, so I decided to look there.

An attempt at "open science"

I was asked by the person who provided me the Tutsi genotype for detailed results. Of course I would do so! So I uploaded the raw csv files to Google Docs. The format and explanation aren’t totally clear, though if you follow my posts you’ll get it. This is for people who want more than pretty visualizations. But it did make me consider: I do many ADMIXTURE and EIGENSOFT runs, and you only see a small minority. This isn’t optimal for readers who want to dig deeper, but it also results in possible unconscious bias. So I’m going to try and do something different: I will post the raw results (at least in csv format) of all runs. But I obviously don’t want to clutter this weblog with updates, so you have to do one of two things to get notifications:

At some point I might just start throwing stuff into a public folder, but that’s often so user unfriendly that only those “in the know” can decrypt what is what. My aim here is to resolve some confusions by posting all the results that I get to see. Many of the questions raised on online forums about my ADMIXTURE-related postings would be easy to answer if the people who are confused saw the full range of my results.
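
For readers wondering what a “raw csv” of a run might look like, here is a minimal sketch in Python. It assumes the usual ADMIXTURE `.Q` layout (one whitespace-separated row of ancestry fractions per individual, in the same order as the input samples); the function name, the `K1`, `K2` column labels and the sample IDs are hypothetical, and this is not necessarily how the files I uploaded were produced.

```python
import csv
import io


def q_to_csv(q_text, sample_ids):
    """Label the rows of an ADMIXTURE-style Q matrix and return CSV text.

    q_text     : contents of a .Q file (assumed: one row per individual,
                 whitespace-separated ancestry fractions).
    sample_ids : individual labels, in the same order as the rows.
    """
    rows = [line.split() for line in q_text.strip().splitlines()]
    if len(rows) != len(sample_ids):
        raise ValueError("number of Q rows and sample IDs must match")
    k = len(rows[0])
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    # Hypothetical header: one column per ancestral component.
    writer.writerow(["sample"] + [f"K{i + 1}" for i in range(k)])
    for sample, fractions in zip(sample_ids, rows):
        writer.writerow([sample] + fractions)
    return out.getvalue()


# Toy two-individual, K=2 example (made-up numbers, not real results).
example = q_to_csv("0.70 0.30\n0.10 0.90\n", ["ind1", "ind2"])
print(example)
```

The point of keeping the sample labels next to the fractions is that a bare Q matrix is exactly the kind of file only those “in the know” can read.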

Why Melanesians are blonde resolved?

Sort of, and possibly. I’ve been talking about this for years, and Greg Cochran points me to an abstract at the human genetics conference referenced earlier. Novel coding variation at TYRP1 explains a large proportion of variance in the hair colour of Solomon Islanders:

The Solomon archipelago comprises over 1,000 islands located east of Papua New Guinea and has a population noted for wide variation in hair pigmentation. 1200 samples were collected from 16 centres and hair colour measured in donors by spectrophotometer. Analysis of 589,241 single nucleotide polymorphisms across a subset of 42 dark haired and 43 blond haired individuals revealed a signal for pigmentation driven by 27 markers on 9p23 at the TYRP1 gene (rs13289810…). There were no systematic differences in ancestry between dark and blond haired participants indicating that this variation is unlikely to be due to recent introgression from other populations. Sequencing of TYRP1 showed complete conservation of this locus bar nucleotide 5,888(NG_011750), which was homozygous C in dark haired individuals and T in blonds. The resulting CGC->TGC missense mutation changes the 93 amino acid in exon 2 from an Arginine to a Cysteine. Genotyping of TYRP1(93C/T) in all samples and analysis showed that in a recessive model including sex, age and local geography, there was a -1.67(-1.76, -1.50) standard deviation difference in hair colour by genotype groups (p=3.5e-106) equating to ~40% variance in this trait. Genotyping in the Human Gene Diversity Panel showed TYRP1(93C/T) to be essentially private to the Solomon Islanders…In humans, complete loss of function for Tyrp1 is known to cause rufous albinism. This is one of the only examples of a genomewide association study implicating causal variation directly, of a common local variant of functional effect being absent in other human populations and is one of the largest phenotypic effects attributable to a common polymorphism. Reasons for the maintenance of this variation are unclear, however this finding prompts the notion that we may find other large (disease causing) effect variants that are population specific and that our results are a call to arms to expand medical genomics to underrepresented populations.
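
The CGC->TGC change in the abstract can be checked against the standard genetic code with a few lines of Python. This is only an illustrative sketch: the lookup table holds just the two codons involved, and the function name is made up for the example.

```python
# Subset of the standard genetic code, limited to the two codons in the
# abstract (DNA coding-strand notation).
CODON_TO_AA = {
    "CGC": "Arg",  # arginine
    "TGC": "Cys",  # cysteine
}


def missense_change(ref_codon, alt_codon):
    """Return (reference amino acid, variant amino acid, changed positions)."""
    ref_aa = CODON_TO_AA[ref_codon]
    alt_aa = CODON_TO_AA[alt_codon]
    # 0-based positions within the codon where the two codons differ.
    changed = [i for i in range(3) if ref_codon[i] != alt_codon[i]]
    return ref_aa, alt_aa, changed


ref_aa, alt_aa, changed = missense_change("CGC", "TGC")
print(ref_aa, "->", alt_aa, "via codon position(s)", changed)
```

A single C to T substitution at the first position of the codon is enough to swap an arginine for a cysteine, which is why one nucleotide can carry such a large effect on the trait.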

Australian Aboriginals are not present in the HGDP panel, so there is no clarity on blondism in those populations, or amongst other indigenous groups in Southeast Asia and Oceania. If these are deep ancient variants then this may span all these populations. If not, then you see independent occurrences of a phenotype which is otherwise only present in Europeans and European-derived/admixed populations. Why? One hypothesis I’ve thrown out is that the expansion of agricultural populations may have erased a great deal of past human phenotypic diversity, due to the demographic growth of small initial founding groups ~5-10,000 years ago.

The question mark in the title, by the way, is there because even when we characterize the genomic architecture of a trait, we don’t necessarily understand why it is distributed in the way it is. Perhaps small populations resulted in more genetic drift in Oceania than elsewhere? Or there is selection on the TYRP1 locus, and this trait is a side effect?

Survey on personal genomics

Just got this email, and I thought I would share with my readers:

I’m a biologist from Germany and together with 2 fellow biologists I’m currently working on a project that evaluates the sharing of raw data from DTC-genetic-testing companies like 23andme. I was genotyped myself and have already published the data set on GitHub and there are other people who already did the same (I know the list of the SNPedia). But up to now these data sets are scattered all over the net and nearly none of them have attached phenotypic data.

What we are working on (and would like to see around) is a website that collects the genetic datasets as well as phenotypic data. This would make it much easier to find appropriate data and in the end – as long as there are enough users – it could become a resource for a kind of open source GWAS, similar to the idea behind the research 23andMe performs in its walled garden right now.

But publishing genetic and phenotypic data freely accessible on the net is still seldom seen and many people object to the idea because of privacy concerns. We would like to know how many people in principle would like to participate in something like this and for what reasons they would like it (or not). So we created a small survey that asks those questions, which can be found at


// Bastian Greshake

I’ll be honest that I’m a lot more sanguine about releasing my genotype than about entering my endophenotypes and what not in a public place. Genotypes give you probabilistic understanding, which you can gain in other ways. A lot of morphology is visible, and so there’s no privacy. But it’s a lot dicier when people want you to share how often you’ve taken anti-depressants. I think we’ll get to the stage where there will be less stigma and transparency will be the norm, but we’re not there yet….

(the survey took me less than 2 minutes, for what it’s worth)

Human genetics presentations of interest

Dienekes alerts me to the fact that the International Congress of Human Genetics abstracts are online. I spent an hour using only a few keywords, and came up with a lot.

1) If you have a presentation and think it might interest the readers here, feel free to drop a link in (I will look in the spam folder more today, though one link shouldn’t drive it crazy).

2) If you are a reader and found something interesting, do the same.

Below are some abstracts that caught my eye….

Tutsi genetics, ii

In my post below, Tutsi probably differ genetically from the Hutu, there were many comments. Some I did not post because they were rude, though they did ask valid questions. I will address those issues, but let me quote one comment:

That’s an interesting possibility, but this admixture run didn’t split the non-hunter-gatherer Africans that well. In one of your previous analyses on East Africa you managed to get a pretty accurate ‘Afro-Asiatic/Cushitic’ and ‘Nilotic’ cluster. Is it possible that you could run this Tutsi sample using the same admixture settings as in the ‘Flavors of Afro-Asiatic’ blog post to see if he carries a significant Nilotic component or is mainly Bantu & Cushitic derived?

So I replicated ADMIXTURE runs for many of the same populations as I did in my post, Flavors of Afro-Asiatic. I also pared down the population set and generated a PCA with EIGENSOFT. Before I get to those results, let me tackle the questions.

1) “Are the Luhya suitable proxies for the Hutus?”

Probably. The reason is that Bantu-speaking populations, from the Congo to South Africa, are surprisingly similar. Not only that, but these populations are very distinct from groups which are close to them geographically, but linguistically different (e.g., Khoe, Sandawe, Masai). The Luhya are not exceptional. I’ve run the Henn et al. data sets enough to be convinced that they’re exactly as they should be. They are pretty much what you’d expect from Kenyan Bantu: a predominant element which ties them back to an East-Central African point of origin, with some admixture with other East African elements (similarly, South African Bantu exhibit Khoisan admixture). The Hutu may be peculiar, but we don’t know, and my null is that they’re mostly Bantu with some admixture, as is the case with most Bantu-speaking populations (this one Tutsi seems to be an exception in that context, as they are presumably Bantu speaking). If you think that the Luhya are not suitable, I invite you to download the HapMap Luhya, and merge them with some of the Henn et al. data sets (or HGDP or Behar data sets). I think that should convince you.

Where is the ArXiv for X?

Derek Lowe asks “Why Isn’t There an ArXiv For Chemistry?” Where indeed. A few years ago I went to a talk given by Michael Eisen and asked him about why the biological sciences didn’t have an ArXiv, and one of his explanations was that intellectual property was more of a concern in this area (e.g., pharmaceutical funded research). That sounds plausible enough to me. But the existence of ArXiv still should serve as a starting point for people outside of the physical and mathematical sciences in terms of the possibilities. Much of the discussion around Joe Pickrell’s post ‘Why publish science in peer-reviewed journals?’ seemed to operate in a world where ArXiv didn’t exist. And it’s not just ArXiv, SSRN makes it easy to get papers in social science. We have the technology, and we see the possibilities. There are obstacles, but let’s not pretend as if we don’t have a model for some success.

They're called "peer reviewers"

George Monbiot’s piece, Academic publishers make Murdoch look like a socialist, is making the rounds. This paragraph jumped out at me:

Murdoch pays his journalists and editors, and his companies generate much of the content they use. But the academic publishers get their articles, their peer reviewing (vetting by other researchers) and even much of their editing for free. The material they publish was commissioned and funded not by them but by us, through government research grants and academic stipends. But to see it, we must pay again, and through the nose

It reminded me of this scene from the South Park episode Crack Baby Athletic Association (click):

Tutsi probably differ genetically from the Hutu

Paul Kagame with Barack and Michelle Obama

I first heard about Rwanda in the 1980s in relation to Dian Fossey’s work with mountain gorillas. The details around this were tragic enough, but obviously what happened in 1994 washed away the events dramatized in Gorillas in the Mist in terms of their scale and magnitude. That period was a time when the idea of “ancient hatreds” leading to internecine conflict was in the air. It was highlighted by the series of wars in the former Yugoslavia, and the Tutsi-Hutu civil wars in Rwanda, Burundi, and Congo. Of the latter the events in 1994 in Rwanda were only the most prominent and well known.

After having read Dancing in the Glory of Monsters: The Collapse of the Congo and the Great War of Africa I am relatively conscious of the broader canvas of what occurred in Central and East Africa in the 1990s. Not only was there a conflict between Tutsi and Hutu in Rwanda, but a similar dynamic also flared up in Burundi. The tensions are more complex in Congo and Uganda, in large part because there are many ethnic players, and the Hutu role as the antagonists of the Tutsi is divided among many different populations. In trying to distill the complex ethnography of this region, and so set out the structural parameters of the landscape from which the violence of the late 20th century emerged, many pundits have pointed to the role of the Belgian colonial authorities in crystallizing, sharpening, and perhaps even originating the distinction between Tutsis and Hutus. This is not totally unreasonable if you don’t know much. A quick “look up” will confirm that there is no linguistic or religious distinction between the two groups; they share a common culture in many ways. Rather, the differences seem more of class and ecology. The Tutsi minority had a much stronger pastoral element to their economy. The Hutus were conventional farmers, clear legacies of the Bantu expansion which swept from West-Central Africa east and south, all the way to the Cape of Good Hope and the Indian Ocean. As is not uncommon in the history of humankind the pastoral Tutsi tended to dominate the Hutu peasant. This is where the class dimensions are clearest, as the modest Hutu were traditionally ruled by the wealthier Tutsi aristocracy.

