Red hair and rotten teeth

Genetic Variations Associated With Red Hair Color and Fear of Dental Pain, Anxiety Regarding Dental Care and Avoidance of Dental Care:

Background. Red hair color is caused by variants of the melanocortin-1 receptor (MC1R) gene. People with naturally red hair are resistant to subcutaneous local anesthetics and, therefore, may experience increased anxiety regarding dental care. The authors tested the hypothesis that having natural red hair color, a MC1R gene variant or both could predict a patient’s experiencing dental care-related anxiety and dental care avoidance.

Methods. The authors enrolled 144 participants (67 natural red-haired and 77 dark-haired) aged 18 to 41 years in a cross-sectional observational study. Participants completed validated survey instruments designed to measure general and dental care-specific anxiety, fear of dental pain and previous dental care avoidance. The authors genotyped participants’ blood samples to detect variants associated with natural red hair color.

Results. Eighty-five participants had MC1R gene variants (65 of the 67 red-haired participants and 20 of the 77 dark-haired participants) (P < .001). Participants with MC1R gene variants reported significantly more dental care-related anxiety and fear of dental pain than did participants with no MC1R gene variants. They were more than twice as likely to avoid dental care as were the participants with no MC1R gene variants, even after the authors controlled for general trait anxiety and sex.
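
The counts in the abstract are enough to rebuild the 2x2 table of hair color against MC1R variant status. The abstract does not say which test produced P < .001, so the following is only a rough sketch that assumes a two-sided Fisher exact test; the function name is made up for illustration.

```python
from math import comb

# Counts quoted in the abstract above.
red_variant, red_none = 65, 2      # 67 red-haired participants
dark_variant, dark_none = 20, 57   # 77 dark-haired participants


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    Assumption: this is an illustrative test choice, not necessarily
    the one the study's authors used.
    """
    row1, col1, n = a + b, a + c, a + b + c + d

    def pmf(k):
        # Hypergeometric probability of k variant carriers in the first row.
        return comb(col1, k) * comb(n - col1, row1 - k) / comb(n, row1)

    observed = pmf(a)
    lowest = max(0, row1 - (n - col1))
    highest = min(row1, col1)
    # Sum every possible table that is no more probable than the observed one.
    return sum(
        p
        for p in (pmf(k) for k in range(lowest, highest + 1))
        if p <= observed * (1 + 1e-9)
    )


odds_ratio = (red_variant * dark_none) / (red_none * dark_variant)
p_value = fisher_exact_two_sided(red_variant, red_none, dark_variant, dark_none)
print(f"odds ratio = {odds_ratio:.1f}, p = {p_value:.2e}")
```

Under that assumption the p-value comes out far below .001, which is consistent with the figure reported in the abstract.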

Heritability of height vs. weight

Megan McArdle has a post, Thinking Thin, a follow-up to America’s Moral Panic Over Obesity. She says:

1. Obesity is increasing in the population, so it can’t be genetic.
Well, average height is also increasing in the population. Does that mean that you could be as tall as me, if you weren’t too lazy to grow?
Twin studies and adoptive studies show that the overwhelming determinant of your weight is not your willpower; it’s your genes. The heritability of weight is between .75 and .85. The heritability of height is between .9 and .95. And the older you are, the more heritable weight is.

I think the analogy between obesity and height is weakened by likely differences in how environmental changes affect the variance of the two traits. First, remember that the term “genetic” is very broad, while the term “heritable” is very specific. Heritability is the proportion of trait variance within the population explainable by variance in genes. The traits in question are usually quantitative traits, like height, weight or IQ, which exhibit a normal distribution. To say that a trait is .95 heritable does not mean that it is caused 95% by genes; that’s not even wrong. Rather, it is to say that 95% of the variance within the population can be accounted for by the variance of genes within the population. But heritable traits are also usually affected by environment; if you starve someone they will be short, but retain five fingers. The number of fingers on your hand is not heritable, because there’s no real variance in the trait within the population. It’s genetically specified, but not heritable.
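
The distinction can be made concrete with a toy simulation. This is a minimal sketch that assumes a simple additive model (trait = genetic value + independent environmental deviation); the numbers and variable names are invented for illustration and are not estimates for height or weight.

```python
import random


def variance(xs):
    """Population variance of a list of numbers."""
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** 2 for x in xs) / len(xs)


def heritability(genetic, phenotype):
    """Share of phenotypic variance accounted for by genetic variance."""
    var_p = variance(phenotype)
    if var_p == 0:
        return None  # no variation in the trait, so nothing to apportion
    return variance(genetic) / var_p


rng = random.Random(0)
n = 20_000

# Assumed toy model: genetic SD 3 (variance 9), environmental SD 1 (variance 1).
genetic = [rng.gauss(170, 3) for _ in range(n)]
environment = [rng.gauss(0, 1) for _ in range(n)]
trait = [g + e for g, e in zip(genetic, environment)]

h2 = heritability(genetic, trait)  # close to 9 / (9 + 1) = 0.9

# Improve everyone's environment by the same amount: the population mean
# rises, but the variances, and so the heritability, stay the same.
better_fed = [t + 5 for t in trait]
h2_shifted = heritability(genetic, better_fed)

# Everyone has five fingers: genetically specified, yet no variance to explain.
fingers = [5] * n
h2_fingers = heritability([0.0] * n, fingers)

print(h2, h2_shifted, h2_fingers)
```

A uniform environmental shift moves the average without touching heritability, while an environmental change that widens or narrows the spread of outcomes would change it. That is the sense in which height and weight can respond differently to a changing environment.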


93 ancestrally informative markers to categorize them all

An ancestry informative marker set for determining continental origin: validation and extension using human genome diversity panels

In this study, genotypes from Human Genome Diversity Panel populations were used to further evaluate a 93 SNP AIM panel, a subset of the 128 AIMS set, for distinguishing continental origins. Using both model-based and relatively model-independent methods, we here confirm the ability of this AIM set to distinguish diverse population groups that were not previously evaluated. This study included multiple population groups from Oceana, South Asia, East Asia, Sub-Saharan Africa, North and South America, and Europe. In addition, the 93 AIM set provides population substructure information that can, for example, distinguish Arab and Ashkenazi from Northern European population groups and Pygmy from other Sub-Saharan African population groups.
These data provide additional support for using the 93 AIM set to efficiently identify continental subject groups for genetic studies, to identify study population outliers, and to control for admixture in association studies.

AIM = ancestrally informative markers. You are probably aware that most of the variance on any given gene is found within populations, not between them. Hence the chestnut of conventional wisdom that 85% of variance is within races and 15% is between races. But not all genes are created equal. For example, on SLC24A5 almost all the variance among Europeans and Africans is between the races; if you know the state of SLC24A5, you can establish with a high degree of certainty whether a person is African or European in origin, if these are your only two options (Asians and Africans cluster on SLC24A5, though if you find the “European” variant you can be assured of an individual’s provenance, at least partially, from North Africa or Western Eurasia). The logic then is that a small number of highly population-informative markers (i.e., markers which are good at distinguishing between populations) can allow one to discern population stratification within medical studies. If, for example, you are looking for disease susceptibility alleles and different populations have different disease susceptibilities, then naturally those alleles which are correlated with particular populations will show up in an association (though the “causal” connection is population identity in terms of both disease and allele). This is why Ashkenazi Jewish genetics are of more than genealogical interest: if Jews have a unique suite of genetic diseases (this is true), then it might be best to exclude them from studies using other Europeans. Sniffing out this sort of “cryptic” structure isn’t that hard; in the early 2000s Neil Risch et al. pointed out that as few as 20 AIMs may be sufficient to distinguish continental populations.
This study uses 93 markers to distinguish HGDP groups, along with a few other supplemental populations which were not well represented in the HGDP sample. For example, since the government of India was rather restrictive of genetic research when the HGDP population samples were being collected, the “South Asians” are generally from Pakistan. A study which surveyed Indian Americans (that is, Americans whose families are of Indian origin) provided the data to “plug” that hole. Clusters were displayed through two primary methods: Structure and principal component analysis charts.
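
The point that markers differ in how informative they are can be put in numbers with Fst, a standard measure of the share of variance at a marker that lies between populations. This is a minimal sketch for one biallelic marker and two equally weighted populations. The allele frequencies are invented for illustration; they are not the frequencies of SLC24A5 or of any marker in the 93 AIM set.

```python
def fst_two_populations(p1, p2):
    """Fst for one biallelic marker in two equally weighted populations.

    p1 and p2 are the frequencies of the same allele in each population.
    """
    # Average expected heterozygosity within the two populations.
    h_within = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2
    # Expected heterozygosity if the two populations were pooled.
    p_bar = (p1 + p2) / 2
    h_total = 2 * p_bar * (1 - p_bar)
    if h_total == 0:
        return 0.0  # the marker is fixed for the same allele everywhere
    return 1 - h_within / h_total


# Hypothetical "typical" marker: similar frequencies in both populations.
typical = fst_two_populations(0.50, 0.40)

# Hypothetical ancestry-informative marker: very different frequencies.
informative = fst_two_populations(0.90, 0.10)

print(f"typical marker Fst = {typical:.3f}")
print(f"informative marker Fst = {informative:.3f}")
```

With these made-up frequencies the first marker puts roughly 1% of its variance between populations and the second well over half, which is why a small, carefully chosen panel can separate groups that an average marker cannot.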


Rise of Google, 1999-2001

A quick follow-up to the post below: I was curious about the increased profile of Google in The New York Times (Google Trends doesn’t seem to be available to the public before 2004) around the turn of the century. In particular, I was curious about Google’s prominence in the “Technology” section of the paper. So I looked it up. There were 78 mentions between July 1999 and December 2001. Mentions of Google increase at a rapid clip throughout this whole period. Below is a histogram of this period, illustrating the consistent rise in frequency of mention.


The next Google?

The collaboration between Yahoo! and Microsoft is spawning a lot of articles about the coming duopoly in search (since the Yahoo! Microsoft deal is for 10 years, we’re talking about a 10-year time horizon). But this got me thinking: when did people realize Google was something big? I realized Google was something big (for me personally, since I’m a data junkie) after being pointed to it from this article in Salon in December of 1998. I became a Google evangelist. Initially most people thought my enthusiasm was a bit strange; at that point there were a dozen search engines, and all of them were pretty much crap. Of course once anyone used Google they never touched anything else again (to be fair, it took a little while for Google’s indexing to overtake all other search engines, so there was some utility in checking the others for a bit).
But when did Google hit the media radar? It seems that The New York Times didn’t see fit to mention it until July of 1999, I Link, Therefore I Am: a Web Intellectual’s Diary:

Hang around long enough on and you’ll feel the power and depth of the Web in a way that is very different from, say, a day spent bidding on old comic books at Ebay or managing a stock portfolio at E*Trade. The links on go to very few commercial sites, with the exception of, which is the link for the site’s many book titles. To find just the right link, Ms. Halpert uses a search engine — usually Google, her favorite.


Bad reason vs. bad facts

One of the major issues when you discuss topics with people with whom you disagree is conflict over the acceptability of a particular chain of reasoning or line of analysis. There are usually implicit assumptions within any given analysis which need to be fleshed out, and to do so is usually time consuming. To give an example, I do not agree with the assertion that “IQ has nothing to do with intelligence.” This is a very common background assumption for many people, so whether many analyses make any sense depends on whether you accept the viability of a concept like IQ. Talking about the issues at hand is a waste of time when there are such differences in the axioms and background structure of the models one holds, and I can understand why the temptation of extreme subjectivism emerges so often. Looking through the glass darkly can obscure the reality that beyond the glass there is a clear and distinct world.

That is why I think it is important to expose and avoid falsity of fact, however trivial. It is often much easier to agree on basic facts, especially quantitative ones. I do not say that it is always easy, but it is certainly much easier. This is why weblogs such as The Audacious Epigone are so useful; their bread & butter is fact-checking. When blogs first began to make a splash in 2002 the whole idea of “fact checking your ass” was in vogue, but it doesn’t seem like it’s really worked out. What’s really happened is a proliferation of Google Pundits, who know the answers they want, and know how to get those answers out of the slush pile of answers via an appropriate query. Google Punditry is not exploratory data analysis; it’s fishing around for data to match your preconceptions.

Many GNXP readers may not agree with the conservative politics of The Inductivist or The Audacious Epigone, but their data-driven blog posts are often formatted such that you don’t even need to read the commentary after their tables. Eight months ago Kevin Drum of Mother Jones promised to do more digging through the GSS after I’d pointed him to the resources, but it doesn’t seem like it has happened. My GSS and WVS related posts at Secular Right often get picked up by mainstream pundits like Andrew Sullivan, but the utilization of the GSS or WVS interface hasn’t spread. Why? One friend suggested that perhaps people fear what they might find out.

I do agree that the GSS and WVS aren’t infallible oracles. There are obvious issues with representativeness in the WVS, and the small N’s for some categories in the GSS mean there’s a lot of noise. But with that caution aside, these objections are clear and distinct when one begins with these tools and data sets. In fact, with something like the GSS or WVS you can check your intuitions about representativeness by digging a little deeper.

Addendum: When I do GSS posts people often object in the form of “your data doesn’t prove that!” Interestingly, this objection comes up even when there’s a minimum of commentary. Of course the sort of surface scratches that I do don’t definitively disprove or prove much, at least in general. Rather, they should be starting off points for further digging.

Don't blame Canada

The paper Eight Americas: Investigating Mortality Disparities across Races, Counties, and Race-Counties in the United States has this fascinating map (reformatted a bit):

As you can see, there is a great deal of variance in white male life expectancy in the United States. Compare to this map:

“American” is probably just Scotch-Irish in this case. It seems noticeable on this map that the counties in central Texas where Anglo ancestry is dominated by those of German origin exhibit high life expectancy.

In any case, you can actually look at the county-by-county life expectancy data set from the above paper. The minimum male life expectancy in any county is 62, with the maximum being 80.30. The median is 73.60 and the mean 73.38 (these data are ~2000). There’s a “long tail” of sparsely populated counties with low male life expectancies, as evidenced by the mean being lower than the median. The standard deviation across the counties is 2.35 years.
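
The inference from "mean below median" to a tail of low values can be checked on a toy list. The values below are made up for illustration and are not the paper's county data; only the 62 and 80.3 endpoints are taken from the figures quoted above.

```python
from statistics import mean, median

# Hypothetical county life expectancies (NOT the real data set).
counties = [62.0, 70.5, 72.8, 73.4, 73.6, 73.9, 74.5, 75.2, 76.0, 80.3]

# With the low outlier included, the mean is dragged below the median.
print(f"all counties: mean = {mean(counties):.2f}, median = {median(counties):.2f}")

# Drop the single lowest county and the relationship flips.
without_low_tail = counties[1:]
print(
    f"without lowest: mean = {mean(without_low_tail):.2f}, "
    f"median = {median(without_low_tail):.2f}"
)
```

The median barely moves when the extreme county is dropped, while the mean shifts noticeably, which is the pattern a low-end tail produces.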

As can be seen on the first map there is a strong geographic component to the interregional differences. Below is a chart which reports the proportion of counties in the 50 states which have a life expectancy at, or above, the Canadian national value as of the year 2000 (again, both these values are for males).

Some states obviously have very few counties. But Kentucky has 120. None of them are at the Canadian level.