To mark the release of the 1000 Genomes papers, here are pedigree files with the 2,500 1000 Genomes samples. The 290,000 SNPs overlap with HGDP and other public SNP-chip data sets. The .fam has the population IDs. For what it’s worth, I just used plink 2 to convert from VCF format.
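For readers who want to check the population labels, here is a minimal Python sketch that tallies samples per population from the lines of a PLINK .fam file. A .fam file has six whitespace-separated columns (family ID, individual ID, father, mother, sex, phenotype). The sketch assumes the population ID sits in the first (family ID) column, which is an assumption about these particular files, and the sample rows are made up for illustration.

```python
from collections import Counter

def count_populations(fam_lines):
    """Count samples per population label in PLINK .fam lines.

    Assumes the population ID is stored in the first (family ID) column;
    that is an assumption about these files, not something the post states.
    """
    counts = Counter()
    for line in fam_lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        counts[fields[0]] += 1
    return counts

# Made-up example rows (not real sample IDs): FID IID PAT MAT SEX PHENO
example = [
    "BEB sample1 0 0 1 -9",
    "BEB sample2 0 0 2 -9",
    "GIH sample3 0 0 1 -9",
]
pop_counts = count_populations(example)
print(pop_counts.most_common())
```

To run it on a real file, pass an open file handle in place of the example list, since iterating over a file yields its lines.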
Month: September 2015
The Jerry Coyne retirement
Jerry Coyne, an eminent evolutionary biologist and all-around public intellectual, is retiring, and has posted a bittersweet and hopeful farewell letter to his conventional scientific career. For the general public Coyne is probably more famous as a New Atheist, though Coyne is actually a vocal atheist of long standing. His most recent book was on that topic, Faith Versus Fact: Why Science and Religion Are Incompatible. I’m an atheist, but on balance I demur from many of his positions in regards to religion and science. More precisely, I am quite willing to defend atheism and dismiss religion, but on philosophical or meta-scientific grounds, not scientific grounds as such.
When it comes to science on the whole I tend to agree with Coyne more often than not. In particular, I share his attitude toward the dynamics driving the evolutionary process. In regards to the science, this section jumped out at me:
What I’m proudest of, I suppose, is the book I wrote with my ex-student Allen Orr, , published in 2004. It took each of us six years to write, was widely acclaimed and, more important, was influential. I still see that book as my true legacy, for it not only summed up where the field had gone, but also highlighted its important but unsolved questions, serving as a guide for future research.
As readers know I read in 2005, and it has really influenced my perspective on the broader topic. It’s an ambitious book even if the focus is on the process of speciation, rather like in spirit, though far more economical in terms of prose and clearer in execution. I don’t know if is out of date or not, as I don’t study speciation, but I’d recommend it to anyone who wants to understand how an evolutionary geneticist might view the process and concept.
The 1000 Genomes Paper
The 1000 Genomes paper is out, A global reference for human genetic variation. It’s open access, read the whole thing. Here’s the abstract:
The 1000 Genomes Project set out to provide a comprehensive description of common human genetic variation by applying whole-genome sequencing to a diverse set of individuals from multiple populations. Here we report completion of the project, having reconstructed the genomes of 2,504 individuals from 26 populations using a combination of low-coverage whole-genome sequencing, deep exome sequencing, and dense microarray genotyping. We characterized a broad spectrum of genetic variation, in total over 88 million variants (84.7 million single nucleotide polymorphisms (SNPs), 3.6 million short insertions/deletions (indels), and 60,000 structural variants), all phased onto high-quality haplotypes. This resource includes >99% of SNP variants with a frequency of >1% for a variety of ancestries. We describe the distribution of genetic variation across the global sample, and discuss the implications for common disease studies.
The PSMC above is interesting to me. It shows BEB, the Bengali population from Dhaka, starting from a small base and exploding in size. There are some issues relating to ascertainment that need to be admitted here though. The Indian Gujarati sample turns out to be about half Patel, and half other Gujaratis. In contrast, the Bengalis are relatively homogeneous in ancestry (sampled from Dhaka), and don’t seem to exhibit much population structure. What I’m saying is that when the authors talk about “Gujaratis” they are really talking about “sort of Patels”, while when they talk about Bengalis, they are talking about Bengalis as a whole. There’s an apples-to-oranges aspect to this. It also needs to be kept in mind when they note the alleles private to the Gujarati (GIH) sample; that’s almost certainly due to the large number of endogamous Patels in the original Houston data set who are going to share a lot more demographic history than you’d otherwise expect among Gujaratis.
Secondly, the bottleneck + genetic homogeneity in the admixture for Bengalis reinforces the model outlined in . Basically the population size change above highlights that eastern Bengalis descend from a small group of founders relatively recently in the past, despite their >100 million modern census size. Genetically this has resulted in the ancestral homogeneity you see in the plot above, but culturally it also allowed for the degrading of the social institutions of Indian society which allowed Hinduism to be robust to nearly one thousand years of Islamic hegemony across the subcontinent. Additionally, the lack of structure in ancestral components reflects relatively little endogamy (I have checked the runs of homozygosity in my parents’ genotypes, and they’re lower than those of my South Asian friends from particular caste/jati backgrounds).
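For anyone unfamiliar with runs of homozygosity, a toy Python sketch of the idea follows. It scans one individual's genotypes along a chromosome and reports stretches of consecutive homozygous calls. This is a deliberately simplified illustration, not the method used for the comparison above: the function name, the `min_snps` threshold, and the toy data are all hypothetical, and real tools (PLINK's `--homozyg`, for example) work with physical length thresholds and tolerate occasional heterozygous or missing calls.

```python
def runs_of_homozygosity(genotypes, positions, min_snps=3):
    """Return (start_pos, end_pos, n_snps) for each homozygous run.

    genotypes: alt-allele counts per SNP (0 or 2 = homozygous, 1 = heterozygous)
    positions: base-pair coordinates, same order and length as genotypes
    min_snps: hypothetical threshold; real analyses use far larger values
    """
    runs = []
    start = None  # index where the current homozygous run began
    for i, g in enumerate(genotypes):
        if g in (0, 2):
            if start is None:
                start = i
        else:
            # a heterozygous call ends the current run
            if start is not None and i - start >= min_snps:
                runs.append((positions[start], positions[i - 1], i - start))
            start = None
    # close a run that extends to the last SNP
    if start is not None and len(genotypes) - start >= min_snps:
        runs.append((positions[start], positions[-1], len(genotypes) - start))
    return runs

# Toy data, made up for illustration
toy_genotypes = [0, 2, 2, 0, 1, 2, 0, 1, 0, 0, 2, 2]
toy_positions = [100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, 1100, 1200]
toy_runs = runs_of_homozygosity(toy_genotypes, toy_positions)
print(toy_runs)
```

More and longer runs indicate that an individual's parents share recent ancestors, which is why endogamous groups tend to score higher on this measure.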
The top 30 books over 5 years
As most of you who regularly read me know I’m not too interested in persuading people of things. Rather, I think that if the truth is what it is, through a collaborative process of searching for it we’ll all eventually converge upon it, given enough time (which is a big condition!). So the goal on this weblog is to create a set of like-minded readers and explorers. I know some of you think I’m smart and well read, but the point is that I don’t really care what you think of me. And similarly, I hope you don’t care what I think of you. The truth as we understand it is its own reward. A sweetness of discovery and comprehension which most people don’t seek, nor desire. Rather, they’d prefer to run with their own horde of fellow-travelers.
With that out of the way, I was curious what books readers had purchased over the years. I’ve been an Amazon affiliate for over 15 years mostly because I do so much book-blogging. Amazon gives me records right now back to 2010. In that time over 5,000 books have been purchased through links on this website. So what are the top 30? (I picked that number because these are the ones well above N = 10) It’s probably no surprise that tops the list. I’ve read this book three times cover to cover since 2006. It’s really shaped my perception of how we can understand history in a positive, rather than just interpretative, sense. Second, I’m rather proud that I’ve somehow been involved in ~20 purchases of . These were people who didn’t purchase it for a class, but because they were interested in the topic. Finally, I have no idea why so many people bought . I have never heard of this book before today. No surprise that no fiction is in the top 30.
|Uncontrolled: The Surprising Payoff of Trial-and-Error for Business, Politics, and Society|
“Genetic Map of Europe” Wins a MacArthur Grant!
Congratulations are in order obviously.
Update: For those who are not familiar with the paper, Genes mirror geography within Europe.
Did the internet matter for internationalism?
I’ve been on the internet for over 20 years. When I initially got on the net I remember interacting with people who lived in England, and it was so cool! At one point I recall getting into a talk session with someone who lived in Ecuador. If you lived through the era of Wired circa 1995 to 1999 you remember all the talk about how the internet was going to make location irrelevant, and we were going to congeal into a world cross-linked by cyber-connections. In the mid-2000s the Second Life boomlet brought back some of those feelings, but that faded.
Unlike many Americans I have a lot of family abroad. One of my Facebook friends is my cousin who happens to be a religious teacher and brought up in Tablighi Jamaat by an uncle who has long been a partisan of that movement. I know this cousin a bit (I met him when I visited Bangladesh in 1990 and 2004), and he’s a nice enough fellow. He even likes some of my personal events (e.g., the births of my children). We’ve had chat sessions here and there. Since my “religion” is put as “atheist” on my profile he also knows that about me (he double-checked with me when he became my Facebook friend).
I bring all this up because I hardly ever interact with the cousins who are on Facebook who live abroad. Rather, my Facebook feed is mostly devoted to those who I grew up with in the states, and in particular those who I work with, or went to school with recently. Basically what you’d expect. Facebook has over 1 billion users, but we’re all in our own cultural silos, chattering amongst ourselves. This isn’t totally surprising, and today it seems banal. Yes, there are millions of people from India on Facebook, but they’re not part of my social graph, and won’t be…unless they immigrate to the United States.
When the internet was young we didn’t anticipate many things about its later development. One was that rather than transforming our social networks, it would simply facilitate them. Yes, e-mail and Facebook have changed the way we interact and socialize. But they’ve probably just amplified and smoothed preexisting trends, rather than change the underlying dynamic.
Open Thread, 9/28/2015
I’ve been very busy of late, and had to travel this weekend. Explaining the relatively light blogging recently. Will probably change in the near future.
While on the airplane I decided to reread my Kindle version of John Gillespie’s . The subtitle is accurate, it’s short and quick. Longtime readers know that if you want to “understand” population genetics, you should probably check out . But that’s not light reading, and, like many textbooks there isn’t a Kindle version (and in many cases the e-book version of a text is as expensive, and in some cases more expensive). But for the purposes of following along on some of the more abstruse posts in this space I have to say that Gillespie’s precis of pop gen probably does hit all the major notes. The main demerit is that since it’s short, and was written in the 1980s, it is not as genomic or coalescent heavy as a book written today with the same aims and constraints might be (I read the version that came out in 2004, but even in this edition there are some assertions about the limitations of what we know about genetic variation, especially human, which turn out to no longer hold in our time).
False controversy over ‘three-person’ babies
I haven’t paid much attention to the “three-person babies” controversy, because it seems like a manufactured one. After all, we’re balancing people who might develop a severe illness, against vague and inchoate concerns. Very few (though some) biologists that I know of express any concern about this issue. Mostly it seems to be the public, whose fears are stoked by ethicists and religious moralists.
Nature now has an article, The hidden risks for ‘three-person’ babies, which smokes out the major concern. Much of the piece focuses on what looks to be some sort of “hybrid break-down” due to conflicts between mitochondria and differing genetic backgrounds in animal models (flies and mice, for example). The worry is that mitochondria specified by their unique sequences exhibit functional differences which may interact in a deleterious fashion with different nuclear genomes. In more plain language, there may be a problem when you mix racial heritages. This is not a thesis that I usually see proposed outside of racialist circles, but it’s pretty obviously what the author is dancing around. Rather than focus on animal models, why not acknowledge that there are plenty of “natural experiments” which test this thesis? Here’s the rejoinder on Twitter:
And here’s the relevant section in the article (which is rather abbreviated after all the focus on Drosophila!):
They also pointed out that most of the evidence for risk stems from studies that used strains of flies and mice that had been highly inbred — a process that would increase the genetic differences between the strains and therefore produce a greater ‘mismatch’ when the mitochondria are swapped. They argued that such studies have little relevance for human populations that interbreed all the time. The “lack of any reliable evidence of mitochondrial–nuclear interaction as a cause of disease in human outbred populations”, they wrote, “provides the necessary reassurance to proceed”.
Aside from very rare instances such as in Helgadottir et al. there isn’t any evidence of hybrid breakdown across human populations. Greg Cochran and Henry Harpending looked in the fertility literature about ten years ago, and they found no evidence of depressed fitness. has looked for the sort of purifying selection you see in the Neanderthal-modern human admixture event (X chromosome and genic regions have less Neanderthal), and found none. The earliest branching human population is the Khoisan, ~200,000 years before the present, and there’s no good evidence that major incompatibilities exist which are fundamentally racial except for the one I gave above (that is, differences are nearly fixed between populations, and crosses have reduced fitness due to lower fitness in the heterozygote state). And of course there are the “natural experiments” of populations across the New World, and places like South Africa, where highly diverged mtDNA lineages are moving into different genetic backgrounds.
Presumably there have to be other issues that people worried about ‘three-person’ individuals are concerned with. What are they? I can accept the functional importance of variation in mitochondria, but I don’t see what this has to do with the three-person individual as opposed to people who are racially mixed. If the different genetic backgrounds are an issue, then it is more defensible to object to interracial relationships than three-person individuals, since in the latter case the only reason you’re introducing the novel mitochondria is to prevent illness.
CRISPR in the near term future?
Remember interactive television? In the mid-1990s Microsoft was betting the farm on this new technology. As it happens they had to make a course correction. The Mosaic browser was the first “killer app” of the internet (sorry e-mail and usenet), creating the world wide web as we know it. The rest is history. The lesson is that sometimes no one sees a technology coming. And when it does come, it disrupts the whole landscape. It can both create and destroy. This was clear in my recent post for the Genetics Society of America. The discipline is over 100 years old, and yet over the past 30 years we’ve seen genomics go from being invented as a term (in 1986), to revolutionizing the field, finally to a great extent becoming coextensive with the field. Similarly, the internet existed for two decades before the world wide web came along, but rather soon our conception of “the internet” became synonymous with the web (and e-mail and newsgroups have become absorbed into the web architecture as well).
Similarly, genetic engineering has been around for decades. Direct manipulation of DNA sequences emerged as a technique in the 1970s, and the Asilomar Conference on Recombinant DNA agreed upon a set of guidelines in terms of how the method would be deployed. Despite what you might have gathered from movies such as Gattaca genetic engineering was both difficult and limited in its power to effect change. Of course, despite public concern GMO crops have been moving into circulation for years in the United States, while medical research would be hampered without the access to engineered mice. But very few people would assert that genetic engineering is ubiquitous today.
The CRISPR/Cas system has the potential to change this. It is easy, cheap, and fast. It can take genetic engineering from a vital niche, to a pervasive aspect of human culture. CRISPR first began to gain some attention in scientific circles in 2012. As I write now, in 2015, its presence in discussions relating to genetics can seem ubiquitous, even cloying. Seminars with the word “CRISPR” in the title suddenly become standing room only. If the world wide web is an analogy to what is going on, then we are in 1993. The implication is that we haven’t seen our first Netscape of CRISPR, nor the emergence of a whole economy built around the technology. Right now it is a scientific superstar, but we’ll know that it’s made the “big time” when we see it mentioned ubiquitously on CNBC.
So what’s holding us back? There are two primary things I can think of. First, fear and uncertainty. The regulatory environment is essential for the success of any technology today (well, except Uber!), and the framework is currently ad hoc rather than formalized. The Chinese scientists who modified embryos were only newsworthy because of the bioethical and regulatory consequences of their actions, not the science. It is certainly more significant that a British group is now asking for permission to do experiments on the developmental genetics of embryos using CRISPR technology. The outrage over the modifications last spring had as much to do with breaking the tacit social norm within science where everyone wants to establish some sort of agreed upon framework for novel human research, rather than concern about the scientific implications. If the British group receives approval, it will set a precedent which could open the door for other reputable researchers.
But what about in concrete terms in a near term horizon? The ability to “edit DNA” sounds incredible in the abstract, and is almost certainly civilization changing in the long term. But over the next few years it seems likely that CRISPR/Cas will result in the reemergence of gene therapies as a means by which Mendelian diseases may be treated. Gene therapy as a field suffered a major blow in the late 1990s due to a series of fatalities, arguably tied to unethical practices by one researcher. But the idea of curing someone of a genetic illness by modifying the gene responsible for that illness is straightforward in its logic.
Many diseases, such as diabetes or schizophrenia, are complex in their origins. There is no specific gene responsible for the cause in the vast majority of instances. The road to genetically engineer a “fix” would be long and the outcome not assured in these cases. Any risks would have to be weighed strongly. In contrast Mendelian diseases are often due to a single locus, and the cause is due to that precise biological malfunction. And their outcomes are often easy to quantify. Cystic fibrosis takes decades off your life expectancy, and entails hundreds of thousands of dollars in extra costs over the lifetime. There is some debate as to the frequency of Mendelian disease within the population, but something on the order of ~10% of the American population seems likely using a very liberal definition of disease. If only a percent or two of these have illnesses which are of some severity, that may still justify intervention if it is feasible and safe.
The feasibility of gene editing to cure Mendelian disease is conditional partly on mode of delivery, which is not a genetic concern per se. That is, how do you modify a sufficient number of cells in a living human’s body to result in a change in function? A second concern is “off target” changes. You may be attempting to modify one thing, but modify another, in which case you’ve gone from the frying pan into the fire. Both of these though seem to be soluble problems over the time scale of a decade (CRISPR precision has gotten better even in the past few years). And for diseases such as sickle cell and cystic fibrosis, which entail shortened lives and constant monitoring and treatment, the perfect can’t be the enemy of the good. In the near future the ethical mandate will not be if, but why not.
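To put rough numbers on that last point, here is a back-of-the-envelope Python sketch. The ~10% figure and the "percent or two" come from the paragraph above; the US population value is an outside assumption (a round figure for 2015), so treat the output as an order-of-magnitude estimate only.

```python
US_POPULATION = 320_000_000  # assumption: round 2015 figure, not from the post
MENDELIAN_PERCENT = 10       # "~10%" under a very liberal definition of disease

def severe_cases(severe_percent):
    """People with a severe Mendelian illness, using integer percentages."""
    affected = US_POPULATION * MENDELIAN_PERCENT // 100
    return affected * severe_percent // 100

low = severe_cases(1)   # "a percent" of the affected
high = severe_cases(2)  # "or two"
print(f"{low:,} to {high:,} people")
```

Under these assumptions, even the low end is a population in the hundreds of thousands.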
When that happens you will see a shift in the medical system in the United States. Instead of attempting to tackle symptoms of Mendelian diseases, physicians will plausibly offer up the possibility of eliminating the root cause. This will make some companies very rich, as health care is a growing sector of our economy. Sequencing will be ubiquitous, obligatory, with exemptions necessary, not elective. In a classic sense it will be a “win-win,” as medical costs per individual will decrease, and their lifetime earning power will increase due to greater health.
The effective utilization of genetic engineering to make lives better for a minority of Americans will also change perceptions in the public as to the implications of genetic engineering. Instead of a dystopian future, people will begin to see their own present, and the fear will give way to acceptance. And it is in the time horizon beyond 2025 that I think we may need to start thinking about tackling germ-line modifications and more radical ‘experiments’ in biological engineering….
Open Thread, 9/20/2015
I know I’ve mentioned that I stopped reading much about religion a few years back because I had hit diminishing marginal returns. But this Peter Turchin review of , made me reconsider. There’s no time or inclination in the near term for me to read this book, but it’s definitely in my mental stack now. I found the thesis plausible, and am familiar with the author’s published research, but remain mildly skeptical. Some of the experimental cognitive science I’ve seen in this literature is kind of “wow, that’s cool!”, but of late I’ve started to become more skeptical, as much of it turns out to not have generalizable relevance or is not robust (see ). But, it does seem that this research program is starting to go into a more multi-disciplinary direction, and that’s a good thing, as you have multiple domains of “cross-checking” to build your positive case.
On Twitter Steven Pinker points out that . Unlike a concept like implicit association there aren’t debates about its relevance to other characteristics (e.g., no, you do not necessarily behave in a more racist manner if you score as more racist on the IA tests) as well as the robustness of the result itself (e.g., the same people can get wildly different scores on re-tests which aren’t spaced that far apart). But that’s one reason I haven’t read much about IQ in years. The last book I read all the way through on IQ was probably James Flynn’s (though I am excited to read my friend Garrett Jones’ book , which is coming out in early November). Basically, the major findings of intelligence testing are pretty well set and good enough for someone for whom the topic isn’t a specialization. Similarly, from the lay perspective you don’t really need to keep up on the latest details in evolutionary science. The big sketch is probably already good enough for you. But, I did buy a copy of . I’ve sampled a fair amount of the book, though not read it front to back, and I think I can recommend it to those who want a primer.
Current Biology has a new paper, The Role of Recent Admixture in Forming the Contemporary West Eurasian Genomic Landscape. It uses the fineStructure framework, basically looking at haplotype sharing across groups. The time depth of the inferences here is relatively recent. There’s a lot in the paper, and I don’t know how to interpret all of it. But, it does reiterate that recent gene flow is a pervasive feature of the human landscape, and not just one of the modern era.
I will be in the DC-Baltimore area for ASHG in a few weeks. Excited about the poster buffet. Also going to eat spicy Chinese food. Any recommendations in Baltimore for Sichuan?