Humans, fur, and dark skin

A reader pointed me to a paper in The Proceedings of the Royal Society B, Was skin cancer a selective force for black pigmentation in early hominin evolution? My initial reaction was to dismiss this argument, because cancer is obviously a late-in-life disease, and therefore it would not be a major selective hit. Rather, I found Nina Jablonski’s argument in , persuasive. Basically she suggests that chemical processes triggered by ultraviolet radiation result in the destruction of folate, and this in turn leads to elevated miscarriage rates due to neural tube defects. Anything that directly impacts female reproductive output is a candidate for strong natural selection, so the thesis was highly persuasive.

But after reading the paper I do think the argument that carcinomas can reduce the fitness of humans with light skin in the tropics has merit. The author uses the case histories of albinos in Africa, who tend to develop serious health issues by their twenties. Obviously white skin is not albinism, but if the account of suboptimal function and lethality for albinos is correct, I can’t help but think that light-skinned hunter-gatherers on the Malthusian margin would have an even tougher go.

We can frame the evolutionary question with some molecular genetic inferences. It seems that strong constraint (selection) impacted the region around MC1R 1-2 million years ago. The model is that ancient hominins shifted toward the savanna, lost their fur due to thermo-regulation needs, and then evolved dark skin to protect themselves against the radiation. What I think needs to be acknowledged is that it could have been that multiple forces resulted in the shift toward dark skin, which probably occurred concomitantly with the loss of fur.

Obesity drop in 2-5 year olds quite possibly a fluke

Yesterday I tweeted out Obesity Rate for Young Children Plummets 43% in a Decade. This is a big deal, and many people retweeted it. Here’s the summary in The New York Times:

But the figures on Tuesday showed a sharp fall in obesity rates among all 2- to 5-year-olds, offering the first clear evidence that America’s youngest children have turned a corner in the obesity epidemic. About 8 percent of 2- to 5-year-olds were obese in 2012, down from 14 percent in 2004.

They helpfully link to the paper in The Journal of the American Medical Association, Prevalence of Childhood and Adult Obesity in the United States, 2011-2012. And actually, if you read the paper the authors themselves seem very unsure about the robustness of this specific result. I quote from the paper:

…Tests for differences by age in children were evaluated with the following comparisons: aged 2 to 5 vs 6 to 11 years, 2 to 5 vs 12 to 19 years, and 6 to 11 vs 12 to 19 years. Similarly, in adults comparisons were made between aged 20 to 39 and 40 to 59 years, 20 to 39 and 60 years or older, and 40 to 59 and 60 years or older. P values for test results are shown in the text but not the tables. Adjustments were not made for multiple comparisons.

…Similarly, there was no significant change in obesity prevalence among adults between 2003-2004 and 2011-2012. In subgroup analyses, the prevalence of obesity among children aged 2 to 5 years decreased from 14% in 2003-2004 to just over 8% in 2011-2012, and the prevalence increased in women aged 60 years and older, from 31.5% to more than 38%. Because these age subgroup analyses and tests for significance did not adjust for multiple comparisons, these results should be interpreted with caution.

In the current analysis, trend tests were conducted on different age groups. When multiple statistical tests are undertaken, by chance some tests will be statistically significant (eg, 5% of the time using α of .05). In some cases, adjustments are made to account for these multiple comparisons, and a P value lower than .05 is used to determine statistical significance. In the current analysis, adjustments were not made for multiple comparisons, but the P value is presented.

The p-value here is 0.03 for the difference in question. That passes the conventional threshold of significance (0.05), but it is close enough to the border that I’m quite suspicious. Here is the full conclusion of the paper:

Overall, there have been no significant changes in obesity prevalence in youth or adults between 2003-2004 and 2011-2012. Obesity prevalence remains high and thus it is important to continue surveillance.

Granted, these may turn out to be real true results. And the age class that showed a decline in obesity is definitely one we should focus on. But public health is a serious matter, and therefore we shouldn’t get ahead of ourselves.

One hypothesis that presents itself in regard to this paper is that a reviewer asked explicitly about the multiple comparisons problem. The authors acknowledged the problem, without actually checking to see if the results hold after a correction, and then the editor let the paper through. Of course this is just a model. I haven’t tested it, so can’t even offer up a p-value, even if I were a frequentist.
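For what it's worth, here is a minimal sketch of what such a check might look like. It assumes a plain Bonferroni correction, and it assumes six tests, one for each age group named in the paper's methods. The paper does not say how many tests make up the relevant family, so `assumed_tests` is purely an illustrative guess, and the function names are hypothetical.

```python
def bonferroni_adjust(p_value, n_tests):
    """Return the Bonferroni-adjusted p-value, capped at 1."""
    return min(1.0, p_value * n_tests)


def familywise_error_rate(alpha, n_tests):
    """Chance of at least one false positive across n independent tests."""
    return 1.0 - (1.0 - alpha) ** n_tests


# Reported p-value for the decline among 2- to 5-year-olds.
reported_p = 0.03
# ASSUMPTION: six tests, one per age group listed in the methods.
assumed_tests = 6

adjusted_p = bonferroni_adjust(reported_p, assumed_tests)
fwer = familywise_error_rate(0.05, assumed_tests)

print(f"Bonferroni-adjusted p-value: {adjusted_p:.2f}")
print(f"Family-wise error rate at alpha = 0.05: {fwer:.2f}")
```

Under those assumptions the adjusted p-value comes out at 0.18, well above 0.05, and the chance of at least one spurious "significant" result across six independent tests is about 26%. A different count of tests, or a less conservative correction, would change the numbers but probably not the moral.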

Note: The raw data is here.

Why the jungli abides

There seems to be a deep and ancient connection between the populations of Southeast and South Asia, most evident in the substrate of the Cambodians. In the author relays an early report about a farming community in northern Vietnam where morphological and ancient DNA evidence both pointed to a stabilized coexistence between a classically East Asian majority population and another which he terms “Austro-Melanesian.” This latter group has been predominantly absorbed today, but seems to persist in isolated tribes such as the Senoi. But these are most certainly residual elements, near extinction, and it seems the dominant genetic heritage of major ethnicities such as the Khmer derives from agriculturalists who left southern China over 4,000 years ago. Only in eastern Indonesia does the Melanesian component of ancestry in Southeast Asia begin to increase to a non-trivial level, and this area is truly as much a part of Oceania as of maritime Southeast Asia, if not more.

Freida Pinto
Nguyễn Linh Nga

The Indian subcontinent has also been characterized by a synthesis between outsiders, who likely brought farming technologies, and the native inhabitants. These ancient populations had very distant connections to the ancestors of the hunter-gatherers of the Andaman Islands, and no doubt with the peoples of pre-agricultural Southeast Asia, and further on toward Oceania. This is not to say that the zone between the South China Sea and Indus was homogeneous. Rather, like Northeast and Northwest Eurasia, it was likely a region where peoples diversified from an original Pleistocene element which arrived ~50,000 years ago, and retained broad affinities through gene flow and common ancestry. But whereas the farmers in Southeast Asia came from the north, those in India came from the west. Additionally, it seems clear that the fraction of ‘indigenous’ ancestry is far higher in South Asia, on the order of ~50% across the subcontinent. The equivalent figure for Pleistocene hunter-gatherer ancestry among the Austronesian, Daic, Burman, and Austro-Asiatic populations of Southeast Asia is probably closer to ~10% (highest in the Austro-Asiatic, lowest among the Daic).

So I have decided to offer up a hypothesis: the agricultural toolkit which West Asian farmers brought to the northwest fringe of the Indian subcontinent was far more constrained in its ability to expand than the equivalent for the rice farmers from southern China. Though there is still debate, it seems that the dominant Indian cultivar of rice has an East Asian origin. Though wheat plays an important role in Pakistan and northwest India, rice is the staple crop for the preponderance of the South Asian population. I hold to the proposition that the Austro-Asiatic populations of South Asia are recently intrusive (i.e., they are not the primal inhabitants as some would argue), but for geographic reasons it seems that the difficult north-south mountains separating South and Southeast Asia served as a check on east-to-west migration by farmers from that zone. Ultimately it was South Asian rice farmers, a hybrid population, who pushed south and east and absorbed the tribal hunter-gatherers who remained in their fastness (the current Indian tribes are not descendants of the original hunter-gatherers, but admixed populations at the margins of Sanskritic civilization; both genetics and their mode of production suggest this). The long pause in the northwest due to the limitations of their agricultural toolkit may explain the difference between South and Southeast Asia in the completeness of their demographic assimilation. Where the rice farmers from southern China swept across all of Southeast Asia rapidly in a single sweep, the West Asian farmers were halted for many generations at the limits of their ecological range, absorbing genes from the hunter-gatherers on their frontiers. The analogy here would be the Xhosa, Bantus at the edge of their range of expansion who have absorbed a great deal of genetic material (~25% of their ancestry) from Khoisan populations.

By the time the proto-Indians of the northwest had accumulated enough cultural adaptations, their distinctive West Asian genetic signal may already have been substantially diluted by gene flow from the hunter-gatherers to the south and east. The subsequent expansion into the forest zones was likely a demographic disaster for the old natives, but the newcomers themselves were already partly cousins.

Open Thread, 2/23/2014

I purchased Greg Clark’s most recent book, , and it was delivered to my Kindle on the money (I’d pre-ordered), but I don’t think I’ll be able to read it in the near future, as I’m quite busy. Luckily, Clark has written a nice precis in The New York Times. He notes the likely role of heritable variation in maintaining the status of particular lineages over hundreds of years. From an American perspective this is not a congenial outcome for any ideological camp. On the Right it discomfits those who hold that this is a meritocratic nation as typified by the heroes of Horatio Alger novels. But on the Left it should give pause to those who hold that increased redistributionist policies will quickly ameliorate heritable inequality.

The tale of a CRISPR clone

If you don’t know what CRISPR is, you should. Two words: genetic engineering. And then you have cloning. I was talking to a friend of mine about the possibility of combining these two technologies, CRISPR and cloning. The basic intention here would be to recreate yourself, but superior. Edit out de novo mutations, and genetic load inherited from ancestors more generally. Perhaps substitute well known large effect alleles which have salubrious consequences. This is not totally abstract, as I’ve talked to many people who are interested in the idea of cloning.

For example, the economics blogger and professor Bryan Caplan has confessed that he would like to see what raising a clone would be like. Or as he states, “I want to experience the sublime bond I’m sure we’d share. I’m confident that he’d be delighted, too, because I would love to be raised by me.” This may be correct. But now imagine that Caplan avails himself of the latest genetic engineering technology, in addition to cloning. Bryan Caplan version 2.0 is taller, better looking, smarter, more socially astute. In fact, from 2.0’s perspective the original Bryan Caplan may simply be an “alpha” version, before he was “perfected.” Perhaps 2.0 would love Caplan 1.0, but I suspect that this love would resemble Christianity’s love of its parent Judaism, which verges into patronizing condescension, as Christians believe their religion is a perfected completion of the Yahweh cult.

More farcically, consider how teenage rebellion would play out between a parent and a clone which is superior to the parent in every way. If a parent asks rhetorically “do you think you’re better than me?”, the clone would have to respond honestly, “Yes, and so do you.” The clone would be a better version of the parent, and likely this structural tension in the relationship would persist, as the originals see in their clones what they would wish to be, but never can be.

Addendum: should write a short story based on this idea!

Anti-vaccination sentiment & liberalism

Credit: Mother Jones

Over at Mother Jones Tasneem Raja and Chris Mooney have a rather alarming article up, How Many People Aren’t Vaccinating Their Kids in Your State? This is no joke. I’ve talked earlier about the fact that during my wife’s pregnancy we were confronted by rather strong anti-vaccination sentiments within the community. Because of our generally scientific bent it had no effect on us, but we saw how persuaded, or persuadable, many of our friends and acquaintances were. Without a scientific background people often rely on authorities, and those authorities can lead them astray.

One issue that has come up on occasion is the political orientation of the anti-vaccination movement. Many have assumed that it has a Left-liberal bias. I’m actually moderately skeptical of a strong political association (e.g., Michele Bachmann). But the map above suggested to me that we should test the proposition that there’s at least a state level correlation between exemptions and vote for Obama in 2012. The data was easy to get.

The raw Obama vote % and vaccination exemptions correlated at 0.08 (p-value 0.59). Pretty much nothing. But, I thought it might be more interesting to look at Obama vote for whites. Here the correlation was 0.25 (p-value 0.09). This is still a modest correlation, but it does suggest a political tinge. But rather than a standard Left-Right axis, I think we’re seeing a “crunchy counter-culture” sentiment. Here’s a scatterplot with state labels for what it’s worth….
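For anyone who wants to run this sort of state-level check themselves, here is a minimal sketch. The two short lists are made-up placeholders, not the actual exemption or vote figures, and the permutation test is a stand-in, since the test behind the p-values above isn't stated. The function names are hypothetical too.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def permutation_p_value(xs, ys, n_permutations=2000, seed=0):
    """Two-sided p-value: how often shuffled data correlate as strongly."""
    rng = random.Random(seed)
    observed = abs(pearson_r(xs, ys))
    shuffled = list(ys)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)
        if abs(pearson_r(xs, shuffled)) >= observed:
            hits += 1
    # Add-one smoothing so the estimate is never exactly zero.
    return (hits + 1) / (n_permutations + 1)


# HYPOTHETICAL placeholder values, one entry per state.
obama_share = [38.0, 41.5, 45.2, 50.1, 52.3, 55.8, 58.4, 60.2]
exemption_rate = [1.2, 3.1, 0.9, 2.5, 1.8, 4.0, 2.2, 3.6]

r = pearson_r(obama_share, exemption_rate)
p = permutation_p_value(obama_share, exemption_rate)
print(f"r = {r:.2f}, permutation p-value = {p:.3f}")
```

Swapping in the real per-state figures (and the white-only vote share for the second comparison) is all it would take to repeat the exercise.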


Tracing historical genetic leapfrogging

There have been many popular press treatments of Hellenthal et al.’s A Genetic Atlas of Human Admixture History already. If you have not seen their interactive map, which imparts many of their results, I highly recommend it. To understand the scientific results it does help to read some of this group’s earlier papers, such as Inference of Population Structure using Dense Haplotype Data and Population Identification Using Genetic Data. As I suggested earlier the real paper is in the supplements, which has the virtue of being free, but generally the downside of not enforcing concision or accessibility. Obviously the general public is going to focus on the primary results: which populations mixed when. But perhaps more important is that the ingenious methods described in the supplements illustrate the power of looking at linked variants across segments of the genome, rather than just the variants themselves.

These segments are haplotypes, sequences of variation across genetic regions which exhibit some association. This association can be used to illustrate relatedness across populations and individuals, because the greater the distance in generations (meiotic events), the more recombination events break apart the haplotypes. To make this clearer, I’ve included several chromosomes “painted” by 23andMe as a function of varied ancestral assignments for one individual. You’ll notice that in this painting different colors keep alternating. That is because the individual in this case is a friend whose background is from the mestizo population of Central America. In other words, he has well over 10 generations of recombination events breaking apart associations of ancestry along his genome.
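To put rough numbers on how recombination chops up haplotypes, here is a small sketch under a textbook simplification: crossovers fall along the chromosome as a Poisson process at a rate of one per Morgan per meiosis, with no interference. This is not how 23andMe or Hellenthal et al. do their calculations, and the function names are hypothetical.

```python
import math


def prob_segment_unbroken(generations, length_cm):
    """Probability that a segment of `length_cm` centimorgans sees no
    crossover over `generations` meioses (Poisson crossover model)."""
    length_morgans = length_cm / 100.0
    return math.exp(-generations * length_morgans)


def expected_breakpoint_spacing_cm(generations):
    """Expected distance in centimorgans between accumulated breakpoints."""
    return 100.0 / generations


for g in (1, 10, 20):
    spacing = expected_breakpoint_spacing_cm(g)
    survival = prob_segment_unbroken(g, 10.0)
    print(f"{g:>2} generations: breakpoints every ~{spacing:.0f} cM, "
          f"P(10 cM segment intact) = {survival:.2f}")
```

After 10 generations the expected spacing between breakpoints is on the order of 10 cM. Ancestry segments in a painting will tend to be somewhat longer than that, because not every crossover joins stretches of different ancestry.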

The paper above reports the end product of a similar process of analysis, but considerably more elaborate in the inferences being made. At this point I will elide the technical details, not because they are unimportant (I’m particularly fascinated by their decomposition of decay curves which hide multiple admixture events), but because they are difficult, and with one read-through of the supplements I don’t particularly grasp all the subtleties. What is relevant for the reader is that the authors used haplotypic information by phasing their data, and so presumably can squeeze more juice out of it. This is illustrated by the comparison with the ROLLOFF application in ADMIXTOOLS, which uses just genotype data to make similar inferences. The future is probably toward phased analysis of haplotypes, because this sort of structuring of genomic data is more informationally rich. But it is computationally intensive to perform population based phasing, and the marker sets have to be dense enough that you can infer haplotypes. That will happen, but we’re not there yet with all data sets. This is a preview of the future, but we’re not in the future yet, that’s for sure.
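The basic logic of dating admixture from a decay curve can be sketched in a few lines. The toy below assumes a single pulse of admixture, so that the association between markers falls off as A·exp(−g·d) with genetic distance d in Morgans, and recovers g with a straight-line fit on the log scale. The data are synthetic and the function name is hypothetical. Neither ROLLOFF nor the method in the paper is implemented this way in detail.

```python
import math


def fit_generations(distances_morgans, association_values):
    """Estimate generations since admixture as minus the least-squares
    slope of log(association) against genetic distance."""
    logs = [math.log(v) for v in association_values]
    n = len(distances_morgans)
    mean_d = sum(distances_morgans) / n
    mean_l = sum(logs) / n
    numerator = sum((d - mean_d) * (l - mean_l)
                    for d, l in zip(distances_morgans, logs))
    denominator = sum((d - mean_d) ** 2 for d in distances_morgans)
    return -numerator / denominator


# SYNTHETIC noise-free curve for a single admixture pulse 12 generations ago.
true_generations = 12
distances = [0.005 * i for i in range(1, 21)]  # 0.5 cM to 10 cM
curve = [0.2 * math.exp(-true_generations * d) for d in distances]

estimate = fit_generations(distances, curve)
print(f"Estimated generations since admixture: {estimate:.1f}")
```

With several admixture events layered on top of each other the curve becomes a sum of exponentials, and a single straight line on the log scale no longer fits. That is where the decomposition the authors describe comes in.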


Population genetics resources & books

Population genetics is a moderately technical field (at least at the shallower end of the pool; there are some subfields which veer into applied math), and I am finding it difficult to distill it all down in a very simple fashion for readers who are asking serious questions. To gain a full measure of many of the posts on this website it helps to understand the basics of population genetics. There’s really no shortcut, just like you have to do some study if you want to talk about quantum mechanics in any serious fashion. If you are happy reading on a screen, then there are many free resources on the web. I would recommend Graham Coop’s population genetics notes, the classic ones hosted at UConn, and Joe Felsenstein’s Theoretical Evolutionary Genetics, for a start. But if you need an old-fashioned book that you can hold in your hands, there are a finite number of choices. To me the closest to a “gold standard” is probably the “Hartl & Clark” text, . As is the norm among most technical texts it is expensive, though worth it. But notice when you do a search that there are used copies of the earlier editions which are affordable. The updates in the newest edition due to genomic technologies and such are not necessarily worth an extra $80 in my opinion if you just want basic population genetics. If you can understand , then you can understand population genetics to a level to master all of my posts. And you will know more about the field than the vast majority of professional biologists.

If you want more ecologically relevant illustrations of population genetic questions, you might enjoy Philip Hedrick’s . I recall there were some issues relating to spatial and temporal variation in structured populations which this book handled in great depth. But beyond that, there’s really not much difference in substance between this and Hartl & Clark, from what I can recall. I generally find it a somewhat less elegant work stylistically. On the other hand if human examples are more to your taste, Alan Templeton has a textbook out, . Like the Hedrick text it doesn’t pack as dense a punch in my opinion as Hartl & Clark. Also, this is the first edition of the textbook, and I can imagine that it will get better in future editions as Templeton gains a better sense of his audience.

The above are comprehensive surveys. Charlesworth & Charlesworth have written a text which is more like an encyclopedia, . This is not a compact work at all, and even I find it daunting. Sometimes it feels like this work is basically a “core dump,” but if you want to look up a specific issue in a textbook, then  will probably cover it. At the other end of the spectrum in terms of comprehensiveness is the classic Gillespie book, . This is more an undergraduate level work, and hammers home the most elementary of population genetic principles and fundamentals. It isn’t going to bring you up to speed on how genomics has transformed the whole field over the past 10 years, though if you are new to the discipline then that’s probably not the priority in any case.

In contrast, Rasmus Nielsen and Monty Slatkin’s new textbook, , is up-to-date on the latest genomics and computational methods. Because of the authors’ research focus the illustrations also are biased toward humans. This is definitely going to show you how “population genetics is done” in 2014. The focus on the site frequency spectrum makes sense only in light of genomic data. It is a slim text, and the main downside is that it’s a first edition and seems to suffer from light editing. There are many typos and other such errors, which presumably will be cleaned up in future editions (or else there won’t be future editions!).

There are other books out there, such as Andrew Hamilton’s , which I can’t comment on because I don’t own them. Also, Falconer & Mackay’s is a classic which is complementary to all the works above (it begins with population genetic fundamentals). In no way am I saying you have to buy all these books, or any of these books. The key is that you actually learn a little population genetics, and phylogenetics while you’re at it, if you want to comment intelligently on some of the technical nuances which come up on this blog.

In Praise of the Human Genetic Diversity Project

Credit: Luca Giarelli, L. L. Cavalli-Sforza 2010

One of the things I (and probably almost anyone) do when reading a paper on population genetics which disaggregates the sample set into discrete elements is look at the number of individuals within each group. In a genetic variation sense there need not be any deep technicalities about power analysis here (though those surely are there). If you have a sample size of ~30 Han Chinese I know enough about the variation present in Han Chinese to be less worried about this N than a sample size of ~30 Xhosa, or to make it even more explicit, ~30 Brazilians. Not only is sample size important, but so is provenance. Brazilians sampled from Rio Grande do Sul are going to be different from those sampled from Bahia. The same worry applies to Han Chinese (e.g., Guangdong vs. Hunan), but to a far lesser extent in terms of magnitude.
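As a baseline for what N buys you, here is a small sketch of the sampling error in an allele frequency estimate. It assumes the textbook binomial case, with 2N chromosomes drawn at random from a single homogeneous population, and the function name is hypothetical. For structured or recently admixed groups the true uncertainty about "the population" is larger than this, which is the provenance point above.

```python
import math


def allele_freq_standard_error(freq, n_individuals):
    """Binomial standard error of an allele frequency estimated from
    `n_individuals` diploid samples (2N chromosomes)."""
    n_chromosomes = 2 * n_individuals
    return math.sqrt(freq * (1.0 - freq) / n_chromosomes)


for n in (10, 30, 100):
    se = allele_freq_standard_error(0.5, n)
    print(f"N = {n:>3}: standard error at frequency 0.5 is about {se:.3f}")
```

At N = 30 the standard error for a common allele is roughly 0.065, which is fine for telling apart populations whose frequencies differ a lot, and not much help for telling apart closely related ones.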

This came to mind when reading A Genetic Atlas of Human Admixture History, a paper by Hellenthal et al. which showcases the power of modern statistical genetic inference in outlining the dynamics of historical demography. It’s a masterful work, and I’ll try and grapple with the results in a later post, time permitting. But poring over the real paper (the supplements), I came upon this table:
