Castes are not just of mind

Before Nicholas Dirks was a controversial chancellor of UC Berkeley, he was a well-regarded historian of South Asia. He wrote Castes of Mind: Colonialism and the Making of Modern India. I read it, along with other books on the topic, in the mid-2000s.

Here is the Amazon summary from Library Journal:

Is India’s caste system the remnant of ancient India’s social practices or the result of the historical relationship between India and British colonial rule? Dirks (history and anthropology, Columbia Univ.) elects to support the latter view. Adhering to the school of Orientalist thought promulgated by Edward Said and Bernard Cohn, Dirks argues that British colonial control of India for 200 years pivoted on its manipulation of the caste system. He hypothesizes that caste was used to organize India’s diverse social groups for the benefit of British control. His thesis embraces substantial and powerfully argued evidence. It suffers, however, from its restricted focus to mainly southern India and its near polemic and obsessive assertions. Authors with differing views on India’s ethnology suffer near-peremptory dismissal. Nevertheless, this groundbreaking work of interpretation demands a careful scholarly reading and response.

The condensation is too reductive. Dirks does not assert that caste structures (and jati) date to the British period, but the thrust of the book clearly leaves the impression that this particular identity’s formative shape on the modern landscape derives from the colonial experience. The British did not invent caste, but the modern relevance seems to date to the British period.

This is in keeping with a mode of thought flourishing today under the rubric of postcolonialism, with roots back to Edward Said’s Orientalism. Said was a scholar of literature, and his historical analysis suffered from a lack of deep knowledge. A cursory reading of Orientalism picks up all sorts of errors of fact. But compared to his heirs Said was actually a paragon of analytical rigor. I say this after reading some contemporary postcolonial works, and going back and re-reading Orientalism.

Not to put too fine a point on it, postcolonialism is more about a rhetorical posture which aims to destroy what it perceives as Western hegemonic culture. In the process it transforms the modern West into the causal root of almost all social and cultural phenomena, especially those that are not egalitarian. Anyone with a casual grasp of world history can see this, which basically means very few can, since so few actually care about details of fact.

Castes of Mind is an interesting book, and a denser piece of scholarship than Orientalism. Its perspective is clear, and though it is not without qualification, many people read it to mean that caste was socially constructed by the British.

This seems false. It has become quite evident that even the classical varna categories seem to correlate with genome-wide patterns of relatedness. And the Indian jatis have been endogamous for on the order of two thousand years. From The New York Times, In South Asian Social Castes, a Living Lab for Genetic Disease:

The Vysya may have other medical predispositions that have yet to be characterized — as may hundreds of other subpopulations across South Asia, according to a study published in Nature Genetics on Monday. The researchers suspect that many such medical conditions are related to how these groups have stayed genetically separate while living side by side for thousands of years.

This is not really a new finding. It was clear in 2009’s Reconstructing Indian Population History. It’s more clear now in The promise of disease gene discovery in South Asia.

Unfortunately, though, the science is not well known in any depth among the general public. The ascendancy of social constructionism is such that a garbled and debased view that “caste was invented by the British” will continue to be the “smart” and fashionable view among many intellectual elites.

Indian genetic history: before the storm

Over at Brown Pundits I’ve mentioned the continuing simmer of controversy over a recent piece, How genetics is settling the Aryan migration debate. This has prompted responses in the Indian media from a Hindu nationalist perspective. One of these notes that the author of the piece above cites me, and then goes on to observe I was fired from The New York Times a few years ago due to accusations of racism (also, there is the implication that I’m just a blogger and we should trust researchers with credibility like Gyaneshwer Chaubey; well, perhaps he should know that Gyaneshwer Chaubey considers me “unbiased” according to an email exchange which I had with him last week [we all have biases, so I think he’s wrong in a literal sense]).

I was a little surprised that a right-wing magazine would lend legitimacy to the slanders of social justice warriors, but this is the world we live in. To those who believe everything written about me in the media: I invite you to submit your name and background to me. I have contacts in the media and can get things written if I so choose. Watch me write something which is mostly fact, but can easily be misinterpreted by those who Google you, and watch how much you value the objective “truth-telling” power of the press all of a sudden.

There’s a reason so many of us detest vast swaths of the media, though to be fair we the public give people who don’t make much money a great deal of power to engage in propaganda. Should we be surprised they sensationalize and misrepresent with no guilt or shame? I have seen most of those who snipe at me in the comments disappear once I tell them that I know what their real identity is. Most humans are cowards. I have put some evidence into the public record to suggest that I’m not.

Perhaps more strange for me is that the above piece was passed around favorably by Sanjeev Sanyal, who I was on friendly terms with (we had dinner & drinks in Brooklyn a few years back). I asked him about the slander in the piece and he unfollowed me on Twitter (a friend of Hindu nationalist bent asked Sanjeev on Facebook about the article’s attack on me, but the comment was deleted). It shows how strongly people feel about these issues.

I’m in a weird position because I’m brown and have a deep interest in Indian history. But that interest in Indian history isn’t because I’m brown, I’m pretty interested in all the major zones of the Old World Oikoumene. Aside from some jocular R1a1a chauvinism I don’t have much investment personally (I just told said Hindu nationalist friend who turns out to be R2 to clean my latrine; joking of course, though I’m sure he resents that I’m descended on the direct paternal line from the All-Father & Lord of the Steppes and he is not!).

In the aughts I accepted the model outlined in 2006’s The Genetic Heritage of the Earliest Settlers Persists Both in Indian Tribal and Caste Populations. But to be frank it always struck me as a little confusing because the tentative autosomal data we had suggested that many South Asians were closer to West Eurasians than deep divergences dating to the Last Glacial Maximum would suggest. Since I’ve written something like 5 million words in 15 years, I actually can check if I’m remembering correctly. So here’s a post from 2008 where I express reservations about the idea of a long-term deep heritage of Indians separate from other West Eurasians. The reason I was so impressed by 2009’s Reconstructing Indian Population History is that it resolved the paradox of South Asian genetic relatedness.

To recap, Reich et al. proposed that modern Indians (South Asians) could be modeled as a two-way mixture between two distinct populations with separate evolutionary genetic histories, Ancestral North Indians and Ancestral South Indians (ANI and ASI). How distinct? ANI were basically another West Eurasian population, while ASI was likely nested in the clade with Eastern Non-Africans. Additionally, there was a NW-to-SE and caste admixture cline. In other words, the higher you were on the caste ladder the more ANI you had, and the closer your ancestors were to the north and west, the more ANI you had. The difference between Y and mtDNA, male and female, could be explained by sex-biased migration.
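To make the idea of a two-way mixture concrete, here is a minimal sketch in Python. It is not the method used in the paper, just a toy: it assumes we already know allele frequencies in the two source populations and treats the mixed population's frequency at each site as a weighted average of the two. The function name and all the frequencies are made up for illustration.

```python
def estimate_mixture_fraction(p_mixed, p_source_a, p_source_b):
    """Least-squares estimate of alpha in the toy model
    p_mixed ~= alpha * p_source_a + (1 - alpha) * p_source_b,
    fit across all sites at once."""
    numerator = 0.0
    denominator = 0.0
    for pm, pa, pb in zip(p_mixed, p_source_a, p_source_b):
        diff = pa - pb
        numerator += (pm - pb) * diff
        denominator += diff * diff
    if denominator == 0.0:
        raise ValueError("the two sources have identical frequencies")
    return numerator / denominator


# Hypothetical allele frequencies at five sites (invented numbers).
ani_like = [0.9, 0.2, 0.7, 0.4, 0.1]
asi_like = [0.1, 0.8, 0.3, 0.6, 0.5]

# Build a mixed population that is 60% source A, then recover that fraction.
true_alpha = 0.6
mixed = [true_alpha * a + (1 - true_alpha) * b
         for a, b in zip(ani_like, asi_like)]
alpha_hat = estimate_mixture_fraction(mixed, ani_like, asi_like)
print(f"estimated fraction from source A: {alpha_hat:.3f}")
```

In real data the source populations are not observed directly and drift adds noise, which is part of why the model is only a model.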

But there were still aspects of the paper which I had reservations about. After all, it was a model.

  • Models are imperfect fits onto reality. The idea of mass migration seemed ridiculous to me at the time, because even by the time of the Classical Greeks it was noted that India was reputedly the most populous land in the world (to their knowledge). But ancient DNA has convinced me of the reality of mass migrations.
  • I wasn’t sure about the nature of the closest modern populations to the ANI. The researchers themselves (in particular, Nick Patterson) told me that the relatedness of ANI to Europeans was very close (on the order of intra-European differences). But modern Indians do not look to be descended from a population that is half Northern European physically. Again, ancient DNA has shown that there was lots of population turnover, and it turns out that Europeans and ANI were likely both compounds and mixed daughter populations of common ancestors (also, typical European physical appearance seems to have emerged in situ over the past 5,000 years).
  • The two way admixture modeled seemed too simple. I had run some data and it struck me that North Indian populations like Jats had something different than South Indian groups like Pulayars. In 2013 Priya Moorjani’s paper pretty much confirmed that it was more than a two way admixture along the ANI-ASI cline.

This March BMC Evolutionary Biology published Silva et al.’s A genetic chronology for the Indian Subcontinent points to heavily sex-biased dispersals. It has made a huge splash in India, arguably triggering the write-up in The Hindu. But for me it was a bit ho-hum. If you read my 2008 post it is pretty clear that I suspected the most general of the findings in this paper at least 10 years back. It is nice to get confirmation of what you suspect, but I’m more interested to be surprised by something novel.

Nevertheless A genetic chronology for the Indian Subcontinent points to heavily sex-biased dispersals has come in for lots of repeated attack in the right-wing Indian press. This is unfair, because it is a rather good paper. I suspect that it wasn’t published in a higher ranked journal because most scientists don’t consider the history of India to be that important, and they didn’t really apply new methods, as opposed to bringing a bunch of data and methods together (in contrast, the 2009 Reich et al. paper was one of the first publications which showed how to utilize “ghost populations” in explicit phylogenetic models with relevance to human demographic history).

As it happens I will be writing up my thoughts in detail in an article for a major Indian publication (similar circulation numbers as The Hindu). This has been in talks for over six months, but I’ve been busy. But a month or so ago I thought it was time that I put something into print for the Indian audience, because I felt there was some misrepresentation going on (i.e., the Aryan invasion theory has not been refuted by genetics, but this is what many Indians assert).

For many years people have told me there are certain topics that shouldn’t be talked about. I have offended people greatly. There are many things people do not want to know. I have come to the conclusion this is not an entirely indefensible viewpoint (though if you accept this viewpoint, I think acceptance of authoritarianism is inevitable, so I hope people will toe the line when the new order arrives; knowing their personalities I think they will conform fine). But my nature is such that I continue to have nothing but contempt for the duplicitous and craven manner in which people go about these sorts of private conversations. I assume that as someone with the name “Razib Khan” I will be attacked vociferously by Hindu nationalists, who will no doubt make recourse to the Left-wing hit pieces against me to undermine my credibility. The fact that these groups are fellow travelers should tell us something, though I will leave that as an exercise for the reader.

I will write my piece that reflects the science as I believe it is, without much consideration of the attacks. That is rather easy for me to do in part because I live in the United States, where denigrating the deeply held views and self-esteem of Hindu nationalists is not sensitive or politically protected (unlike say, Muslims). And Hindu nationalists are less likely to kill me by orders of magnitude than Muslim radicals, and they have far less purchase in this nation than the latter (though you may be interested to know that very conservative Muslims follow me on Twitter; they’re actually more open-minded than many SJWs to be entirely honest).

Let me go over some general points that I see coming up over and over on the relationship between Indian (pre)history and genetics in the critiques.

One of the major critiques has to do with the nature of R1a-Z93 and its subclades. Basically this Y chromosomal haplogroup, the greatest that has ever been known, exhibits a strong signature of very rapid expansion over the past 4,000 years or so. It is divided from Z282. While Z93 is found in South Asia, Central Asia, and Siberia, Z282 is European, with its dominant subclade the one associated with Eastern Europeans. Both of these clades of R1a have gone through massive expansion. In the Altai region R1a is 40% of the heritage of peoples who are now predominantly East Eurasian. But they are Z93. Additionally, ancient DNA from the Pontic Steppe dated ~4,000 years ago from Srubna remains is Z93, as are Scythian remains from the Iron Age.

Much of the argument comes down to dating, and citing papers that give deep coalescence numbers between different branches of R1a1a. Hindu nationalists and their fellow travelers point to recent papers which give dates >10,000 years ago, and so place the origin of Z93 plausibly in the Pleistocene. The problem is that Y chromosomal coalescence dating is something of a mug’s game. Often they use microsatellite data whose mutational rates are highly uncertain. In contrast, using SNP data, which has a slower mutation rate but requires a lot more data, you get a TMRCA (time to the most recent common ancestor) between Z93 and Z282 of ~5,800 years ago. But coalescence estimates often have wide confidence intervals of thousands of years. And even with these intervals, the assumptions you make (e.g., mutation rate) strongly influence your midpoint estimate.
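To see why the mutation rate matters so much, here is a minimal back-of-the-envelope sketch. It assumes the simplest possible clock: the age of a common ancestor is the average number of SNP mutations accumulated per lineage, divided by the number of mutations expected per year over the sequenced region. The function name, the mutation count, the amount of callable sequence, and the three rates are all hypothetical numbers chosen for illustration, not values from any paper.

```python
def tmrca_years(mutations_per_lineage, callable_bp, rate_per_bp_per_year):
    """Simple molecular clock: years = mutations / (sites * rate)."""
    expected_mutations_per_year = callable_bp * rate_per_bp_per_year
    return mutations_per_lineage / expected_mutations_per_year


# Hypothetical inputs: 46.4 mutations per lineage over 10 Mb of sequence.
mutations = 46.4
callable_bp = 10_000_000

# The same data under three assumed mutation rates (per bp per year).
estimates = {
    rate: tmrca_years(mutations, callable_bp, rate)
    for rate in (0.6e-9, 0.8e-9, 1.0e-9)
}
for rate, years in estimates.items():
    print(f"rate {rate:.1e} -> about {years:,.0f} years")
```

The data never change here, only the assumed rate, and the midpoint estimate moves by thousands of years. The estimate scales inversely with the rate, so a rate that is too low by half doubles the inferred age.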

The Y chromosomal data is powerful, but its interpretation is still buttressed upon other assumptions. The really big picture framework is the nature of ancient genome-wide variation across Eurasia. Lazaridis et al. 2016 condition us to a prior where much of Eurasia was subject to massive population-wide genetic changes since the Holocene. Therefore, I am much less surprised if there was massive genetic change in India relatively recently. The methods in Priya Moorjani’s paper and in other publications make it obvious that mixture was extensive in South Asia between very distinct groups until ~2,000 years ago. In fact, Moorjani et al. used patterns of variation across the genome to arrive at two to four thousand years ago as the period of massive admixture.

Though we don’t have relevant ancient DNA from India proper to answer any questions yet, we do have ancient DNA from across much of Europe, Central Asia, and the Near East. What they show is that Indian populations share ancestry from both Neolithic Iranians and peoples of the Pontic steppe, who flourished ~5,000 to ~10,000 years ago. To some extent the latter population is a daughter population of the former…which makes things complicated. Conversely, no West Eurasian population seems to harbor ancient signals of ASI ancestry.

One scientist who holds to the position that most South Asian ancestry dates to the Pleistocene argued to me that we don’t know if ancient Indian samples from the northwest won’t share even more ancestry than the Iranian Neolithic and Pontic steppe samples. In other words, ANI was part of some genetic continuum that extended to the west and north. This is possible, but I do not find it plausible.

The reasons are threefold. First, it doesn’t seem that continuous isolation-by-distance works across huge and rugged regions of Central Eurasia. Rather, there are demographic revolutions, and then relative stasis as the new social-cultural environment crystallizes. This inference I’m making from ancient DNA and extrapolating. This may be wrong, but I would bet I’m not off base here.

Second, it strikes me as implausible that there was literally apartheid between ASI and ANI populations for the whole Holocene right up until ~4,000 years before the present. That is, if Northwest India was involved in reciprocal gene flow with the rest of Eurasia over thousands of years I expect there should have been some distinctive South Asian ASI-like ancestry in the ancient DNA we have. We do not see it.

Third, one of the populations with strong affinities to some Indian populations are those of the Pontic steppe. But we know that this group itself is a compound of admixture that arose 5,000-6,000 years ago. Because of the complexity of the likely population model of ANI this is not definitive, but it seems strange to imagine that ANI could have predated one of the populations with which it was in genetic continuum as part of a quasi-panmictic deme.

Finally, many of the critiques involve evaluation of the scientific literature in this field. Unfortunately this is hard to do from the outside. Citing papers from the aughts, for example, is not wrong, but evolutionary human population genomics is such a fast moving field that even papers published a few years ago are often out of date.

Many are citing a 2012 paper by a respected group which argues for the dominant model of the aughts (marginal population movement into South Asia). One of their arguments, that Central Asian migrants should have East Asian ancestry, is a red herring since it is well known that this dates to the last ~2,000 years or so (we know more now with ancient DNA). But the second point that is more persuasive in the paper is that when they look at local ancestry of ANI vs. ASI in modern Indians, the ANI haplotypes are more diverse than West Eurasians, indicating that they are not descendants but rather antecedents (usually the direction of ancestry is from more diverse to less due to subsampling).

There are two points that I have to make here. First, local ancestry analysis is difficult, so I would not be surprised if they integrated ASI regions into ANI and so elevated the diversity in that way (though they think they’ve taken care of it in the paper). Second, if the ANI are a compound of several West Eurasian groups then we expect them to be more diverse than their parents. In other words, the paper is refuting a model which is almost certainly incorrect, but the alternative hypothesis is not necessarily the true hypothesis (which is a more complex demographic model than many were testing in 2012).
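The second point is easy to illustrate with a toy calculation. This is only a sketch, assuming expected heterozygosity (2p(1-p) at a biallelic site) as the measure of diversity and a compound population whose allele frequency is a simple weighted average of two parents. The frequencies below are invented, and real haplotype diversity is more complicated than per-site heterozygosity.

```python
def expected_heterozygosity(p):
    """Expected heterozygosity at a biallelic site with allele frequency p."""
    return 2.0 * p * (1.0 - p)


def mean_heterozygosity(freqs):
    """Average expected heterozygosity across sites."""
    return sum(expected_heterozygosity(p) for p in freqs) / len(freqs)


# Two hypothetical, well-differentiated parent populations (made-up numbers).
parent_a = [0.1, 0.9, 0.2, 0.8]
parent_b = [0.9, 0.1, 0.7, 0.3]

# A compound population drawing half its ancestry from each parent.
alpha = 0.5
compound = [alpha * a + (1.0 - alpha) * b for a, b in zip(parent_a, parent_b)]

h_a = mean_heterozygosity(parent_a)
h_b = mean_heterozygosity(parent_b)
h_c = mean_heterozygosity(compound)
print(f"parent A: {h_a:.4f}, parent B: {h_b:.4f}, compound: {h_c:.4f}")
```

With these numbers the compound comes out more diverse than either parent, even though it is descended from both. So higher diversity by itself does not tell you which population is the ancestor.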

But there are many things we do not know still. Many free variables which we haven’t nailed down. Here are some major points:

  • Y chromosomal lineages have a correlation with ethno-linguistic groups, but the correlation is imperfect. R1b and R1a seem correlated with Indo-European groups, but both these are found in high proportions in groups which are putatively mostly “pre-Indo-European” in origin (e.g., Basques, Sardinians, and South Indian tribals and non-Brahmin Dravidian speaking groups). Also, haplogroups like I1 in Europe expand with Indo-Europeans locally, suggesting there was lots of heterogeneity in Indo-Europeans as they expanded. In other words, Indo-European expansion in relation to powerful paternal lineages did not always correlate with ethno-linguistic change.
  • There are probably at minimum two Holocene intrusions from the northwest into South Asia, but this is a floor. The models that are constructed always lack power to detect more complexity. E.g., it is not impossible that there were several migrations of Indo-Europeans into South Asia which we can not distinguish genetically over a period of a few thousand years.
  • If one looks over all of South Asia it may be that ASI ancestry in totality is >50% of the total genome ancestry. I don’t have a good guess of the numbers. If this is correct, perhaps most South Asian ancestors 10,000 years ago were living in South Asia (though fertility rates are such in Pakistan that ANI ancestry is increasing right now in relative terms).
  • But, this presupposes that ASI were present in South Asia in totality 10,000 years ago, rather than being migrants themselves. If ancient DNA confirms that ANI were long present in Northwest India, I hold then it is entirely likely that ASI was intrusive to South Asia! The BMC Evolutionary Biology paper does a lot of interpretation of deep structure in haplogroup M in South Asia. I’m moderately skeptical of this. Europe may not be a good model for South Asia, but there we see lots of Pleistocene turnover.

So where does this leave us? Ancient DNA will answer a lot of questions. Pretty much all scientists I’ve talked to agree on this. My predictions, some of which I’ve made before:

  1. The first period of admixture is old, and dates to the founding of Mehrgarh as an agricultural settlement. The dominant ANI component dates to this period and mixture event, all across South Asia. The presence in South India is due to expansion of these farming populations.
  2. A second admixture event occurred with the arrival of steppe people. Those who argue for the Aryan invasion model posit 1500 BCE as the date. But these people probably were expanding in some form before this date.
  3. We still don’t know who the antecedents for the Indo-Aryans were. Probably they were a compound of different steppe groups, and also other populations which were mixed in (by analogy, in Europe it is obvious now that there was some mixture with the local European farmers and hunter-gatherers as Europeans expanded their frontier westward; the same probably applies for Indo-Aryans and the BMAC).

Across the chasm of Incommensurability

The Washington Post has a piece typical of its genre, A Chinese student praised the ‘fresh air of free speech’ at a U.S. college. Then came the backlash. It’s the standard story; a student from China with somewhat heterodox thoughts and sympathies with some Western ideologies and mores expresses those views freely in the West, and social media backlash makes them walk it back. We all know that the walk back is insincere and coerced, but that’s the point: to maintain the norm of not criticizing the motherland abroad. The truth of the matter of how you really feel is secondary.

Tacit in these stories is that of course freedom of speech and democracy are good. And, there is a bit of confusion that even government manipulation aside, some of the backlash from mainland Chinese seems to be sincere. After all, how could “the people” not defend freedom of speech and democracy?

Reading this story now I remember what an academic and friend (well, ex-friend, we’re out of touch) explained years ago in relation to what you say and public speech: one can’t judge speech only by what you intend and what you say in a descriptive sense; you also have to consider how others take what you say and how it impacts them. In other words, intersubjectivity is paramount, and the object or phenomenon “out there” is often beside the point.

At the time I dismissed this viewpoint and moved on.

Though in general I do not talk to people from China about politics (let’s keep it real, it’s all about the food, and possible business opportunities), it was almost amusing to hear them offer their opinions about Tibet and democracy, because so often very educated and competent people would trot out obvious government talking points. In this domain there was little critical rationalism. One could have a legitimate debate about the value of economic liberalization vs. political liberalization. But it was ridiculous to engage with the thesis that China was always unitary between the Former Han and today. That is just a falsehood. Though the specific detail was often lacking in their arguments, it was clearly implied that they knew the final answer. I would laugh at this attitude, because I thought ultimately facts were the true weapon. The world as it is is where we start and where we end.

Or is it? From the article:

Another popular comment expressed disappointment in U.S. universities, suggesting without any apparent irony that Yang should not have been allowed to make the remarks.

“Are speeches made there not examined for evaluation of their potential impact before being given to the public?” the commentator wrote.

“Our motherland has done so much to make us stand up among Western countries, but what have you done? We have been working so hard to eliminate the stereotypes the West has put on us, but what are you doing? Don’t let me meet you in the United States; I am afraid I could not stop myself from going up and smacking you in the face.”

Others were critical not of Yang’s comments but of the venue in which she chose to make them.

“This kid is too naive. How can you forget the Chinese rule about how to talk once you get to the United States? Just lie or make empty talk instead of telling the truth. Only this will be beneficial for you in China. Now you cannot come back to China,” @Labixiaoxin said.

There is a lot of texture even within this passage. I do wonder if the writers and editors at The Washington Post knew the exegetical treasures they were offering up.

To me, there is irony in the irony. Among the vanguard of the intelligentsia in these United States there is plenty of agreement with the thesis that some remarks should not be made, some remarks should not be thought. Especially in public. The issue is not the principle, but specifically what remarks should not be made, and what remarks should not be public. That is, the important and substantive debates are not about a positive description of the world, but the values through which you view the world. The disagreements with the Chinese here are not about matters of fact, but matters of values. Facts are piddling things next to values.

So let’s take this at face value. Discussions about Tibetan autonomy and Chinese human rights violations cause emotional distress for many Chinese. I’ve seen this a little bit personally, when confronting Chinese graduate students with historical facts. It’s not that they were ignorant, but their views of history were massaged and framed in a particular manner, and it was shocking to be presented with alternative viewpoints when much of one’s national self-identity hinged on a particular narrative. Responses weren’t cogent and passionate, they were stuttering and reflexive.

Now imagine the psychic impact on hundreds of millions of educated Chinese. They’ve been sold a particular view of the world, and these students get exposed to new ideas and viewpoints and relay it back, and it causes emotional distress. Similarly, for hundreds of millions of Muslims expressing atheism is an ipso facto assault on their being, their self-identity. This is why I say that the existence of someone like me, an atheist from a Muslim background, is by definition an affront to many. My existence is blasphemy and hurtful.

And the Chinese view of themselves and their hurt at insults to their nationhood do not come purely from government fiction. There’s a factual reality that needs to be acknowledged. China was for thousands of years one of the most significant political and cultural units in the world. But the years from 1850 to 1980 were dark decades. The long century of eclipse. China was humiliated, dismembered, and rendered prostrate before the world. It collapsed into factious civil war and warlordism. Tens of millions died in famines due to political instability.

In the late 1950s and early 1960s between 20 and 50 million citizens of the People’s Republic of China starved due to Mao’s crazy ambitions. This is out of a population of ~650 million or so. Clearly many Chinese remember this period, and have relatives who survived through this period. A nation brought low, unable to feed its own children, is not an abstraction for the Chinese.

On many aspects of fact there are details where I shrug and laugh at the average citizen of China’s inability to look beyond the propaganda being fed to them. And I am not sure that the future of the Chinese state and society is particularly as rosy as we might hope for, as its labor force already hit a peak a few years ago. But the achievement of the Chinese state and society over the past generation in lifting hundreds of millions out of grinding poverty has been a wonder to behold. A human achievement greater than the construction of the Great Wall, not just a Chinese achievement.

But it is descriptively just a fact that nations which have been on the margins and find themselves at center stage want their “time in the sun.” The outcomes of these instances in history are often not ones which redound to the glory of our species, but it is likely that group self-glorification and hubris come out of a specific evolutionary context.

There are on the order of ~300 million citizens of the United States. There are 1.3 billion Chinese. If offense and hurt are the ultimate measures of the acceptance of speech then an objective rendering might suggest that we lose and they win. There are more of them to get hurt than us.

But perhaps the point is that there is no objectivity. There is no standard “out there.” Once the measuring stick of reality falls away, and all arguments are reduced to rhetoric, it is sophistry against sophistry. Power against power. Your teams and views are picked for you, or chosen through self-interest, or derived from some aesthetic bias. Sometimes the team with the small numbers wins, though usually not.

Discourse is like a season of baseball. At the end there is a winner. But there is no final season. Just another round of argument.

Ten years ago I read Alister McGrath’s The Twilight of Atheism. I literally laughed at the time when I closed that book, because the numbers did not seem to support him in his grand confidence about atheism’s decline. And since the publication of that book the proportion of people in the United States who are irreligious has increased. Contrary to perceptions there has been no great swell of religion across the world.

But on a deep level McGrath was correct about something. Much of the book was aimed at the “New Atheism” specifically. McGrath argued that this bold and offensive movement, which prioritized the idea of facts first (in the ideal if not always the achievement), was a last gasp of an old modernist and realist view of the world, which would be swallowed by the post-modern age. He, a traditional Christian, had a response to the death of reason and empiricism über alles: his God of Abraham, God of Isaac, and the God of Jacob. Primordial identities of religion, race, and nationality would emerge from the chaos and dark as reason receded from the world.

With the rise of social constructionism McGrath saw that the New Atheists would lose the cultural commanding heights, their best and only weapons, the glittering steel of singular facts over social feelings. On the other hand, if facts derive from social cognition, then theistic views have much more purchase, because on the whole the numbers are with God, and not his detractors.

And going back to numbers. The Washington Post is owned by Jeff Bezos. And China is a massive economic shadow over us all. Anyone who works in the private sector dreams of business in China. Currently Amazon is nothing in China. What if the Chinese oligarchs made an offer Bezos couldn’t refuse? Do you think The Washington Post wouldn’t change its tune?

When objectivity and being right are no defense, then all that remains is self-interest. Ironically, cold hard realism may foster more universal empathy by allowing us to be grounded in something beyond our social unit. In the near future, if the size of social units determines who is, and isn’t, right, then those who built a great bonfire on top of positivism’s death may die first at the hands of the hungry cannibal hordes. Many of us will shed no tears. We were not the ones in need of empathy, because we were among the broad bourgeois masses.

In the end the truth only wins out despite our human natures, not because of them.

Aryan marauders from the steppe came to India, yes they did!

It seems every post on Indian genetics elicits dissents from loquacious commenters who are woolly on the details of the science, but convinced in their opinions (yes, they operate through uncertainty and obfuscation in their rhetoric, but you know where the axe is lodged). This post is an attempt to answer some questions so I don’t have to address this in the near future, as ancient DNA papers will finally start to come out soon, I hope (at least earlier than Winds of Winter).

In 2001’s The Eurasian Heartland: A continental perspective on Y-chromosome diversity Wells et al. wrote:

The current distribution of the M17 haplotype is likely to represent traces of an ancient population migration originating in southern Russia/Ukraine, where M17 is found at high frequency (>50%). It is possible that the domestication of the horse in this region around 3,000 B.C. may have driven the migration (27). The distribution and age of M17 in Europe (17) and Central/Southern Asia is consistent with the inferred movements of these people, who left a clear pattern of archaeological remains known as the Kurgan culture, and are thought to have spoken an early Indo-European language (27, 28, 29). The decrease in frequency eastward across Siberia to the Altai-Sayan mountains (represented by the Tuvinian population) and Mongolia, and southward into India, overlaps exactly with the inferred migrations of the Indo-Iranians during the period 3,000 to 1,000 B.C. (27). It is worth noting that the Indo-European-speaking Sourashtrans, a population from Tamil Nadu in southern India, have a much higher frequency of M17 than their Dravidian-speaking neighbors, the Yadhavas and Kallars (39% vs. 13% and 4%, respectively), adding to the evidence that M17 is a diagnostic Indo-Iranian marker. The exceptionally high frequencies of this marker in the Kyrgyz, Tajik/Khojant, and Ishkashim populations are likely to be due to drift, as these populations are less diverse, and are characterized by relatively small numbers of individuals living in isolated mountain valleys.

In a 2002 interview with the India site Rediff, the first author was more explicit:

Some people say Aryans are the original inhabitants of India. What is your view on this theory?

The Aryans came from outside India. We actually have genetic evidence for that. Very clear genetic evidence from a marker that arose on the southern steppes of Russia and the Ukraine around 5,000 to 10,000 years ago. And it subsequently spread to the east and south through Central Asia reaching India. It is on the higher frequency in the Indo-European speakers, the people who claim they are descendants of the Aryans, the Hindi speakers, the Bengalis, the other groups. Then it is at a lower frequency in the Dravidians. But there is clear evidence that there was a heavy migration from the steppes down towards India.

But some people claim that the Aryans were the original inhabitants of India. What do you have to say about this?

I don’t agree with them. The Aryans came later, after the Dravidians.

Over the past few years I’ve gotten to know the above first author Spencer Wells as a personal friend, and I think he would be OK with me relaying that to some extent he was under strong pressure to downplay these conclusions. Not only were, and are, these views not popular in India, but the idea of mass migration was in bad odor in much of the academy during this period. Additionally, there was later work which was less clear, and perhaps supported an Indian origin for R1a1a. Spencer himself told me that it was not impossible for R1a to have originated in India, but a branch eventually back-migrated to southern Asia.

But even researchers from the group at Stanford where he had done his postdoc did not support this model by the middle 2000s: see Polarity and Temporality of High-Resolution Y-Chromosome Distributions in India Identify Both Indigenous and Exogenous Expansions and Reveal Minor Genetic Influence of Central Asian Pastoralists. In 2009 a paper out of an Indian group was even stronger in its conclusion for a South Asian origin of R1a1a: The Indian origin of paternal haplogroup R1a1* substantiates the autochthonous origin of Brahmins and the caste system.

By 2009 one might have admitted that perhaps Spencer was wrong. I was certainly open to that possibility. There was very persuasive evidence that the mtDNA lineages of South Asia had little to do with Europe or the Middle East.

Yet a closer look at the above papers reveals two major systematic problems.

First, ancient DNA has made it clear that there has been major population turnover during the Holocene, but this was not the null hypothesis in the 2000s. Looking at extant distributions of lineages can give one a distorted view of the past. Frankly, the 2009 Indian paper was egregious in this way because they included Turkic groups in their Central Asian data set. Even in 2009 there was a whole lot of evidence that Central Asian Turkic groups were likely very different from Indo-European Turanian populations which would have been the putative ancestors of Indo-Aryans. Honestly the authors either consciously loaded the dice to reduce the evidence for gene flow from Central Asia, or they were ignorant (the nature of the samples is much clearer in the supplements than the primary text for what it’s worth).

Second, Y chromosomal marker sets in the 2000s were constrained to fast mutating microsatellite regions or fewer than 100 variant SNPs on the Y. Because it is so repetitive the Y chromosome is hard to sequence, and it really took the technologies of the last ten years to get it done. Both the above papers estimate the coalescence of extant R1a1a lineages to be 10-15,000 years before the present. In particular, they suggest that European and South Asian lineages date back to this period, pushing back any possible connection between the groups, and making it possible that European R1a1a descended from a South Asian founder group which was expanding after the retreat of the ice sheets. The conclusions were not unreasonable based on the methods they had. But now we have better methods.*

Whole genome sequencing of the Y, as well as ancient DNA, seems to falsify the above dates. Though microsatellites are good for very coarse-grained phylogenetic inferences, one has to be very careful about them when looking at more fine-grained population relationships (they are still useful in forensics to cheaply differentiate between individuals, since they accumulate variation very quickly). They mutate fast, and their clock may be erratic.

Additionally, diversity estimates were based on a subset of SNPs that were clearly not robust. R1a1a is not diverse anywhere, though basal lineages seem to be present in ancient DNA on the Pontic steppe in some cases.

To show how lacking in diversity R1a1a is, here are the results of a 2016 paper which performed whole genome sequencing on the Y. Instead of relying on the order of 10 to 100 SNPs, this paper discovered over 65,000 Y variants worldwide. Notice how little difference there is between different South Asian groups below, indicative of a massive population expansion relatively recently in time which didn’t even have time to exhibit regional population variation. They note that “The most striking are expansions within R1a-Z93 [the South Asian clade], ~4.0–4.5 kya. This time predates by a few centuries the collapse of the Indus Valley Civilization, associated by some with the historical migration of Indo-European speakers from the western steppes into the Indian sub-continent.”

Women hate going to India

For some reason women do not seem to migrate much into South Asia. In the late 2000s I, along with others, noticed a strange discrepancy in the Y and mtDNA lineages which trace one’s direct male and female lines: in South Asia the male lineages were likely to cluster with populations to the north and west, while the female lines did not. South Asia’s female lines in fact had a closer relationship to the mtDNA lineages of Southeast and East Asia, albeit distantly.

One solution which presented itself was to contend there was no paradox at all: that the Y chromosomal lineages found in South Asia were basal to those to the west and north. In particular, there were some papers suggesting that perhaps R1a1a originated in South Asia at the end of the Pleistocene. Whole genome sequencing of Y chromosomes does not bear this out though. R1a1a went through rapid expansion recently, and ancient DNA has found it in Russia first. But in 2009 David Reich came out with Reconstructing Indian population history, which offered up something of a possible solution.

What Reich and his coworkers found was that South Asia seems to be characterized by the mixture of two very different types of populations. One set, ANI (Ancestral North Indian), are basically another western or northwestern Eurasian group. The other, ASI (Ancestral South Indian), are indigenous, and exhibit distant affinities to the Andaman Islanders. The India-specific mtDNA then were from ASI, while the Y chromosomes with affinities to people to the north and west were from ANI. In other words, the ANI mixture into South Asia was probably through a mass migration of males.

But the pattern is not limited to this one case of Y and mtDNA. A minority of South Asians speak Austro-Asiatic languages. The most interesting of these populations are the Munda, who tend to occupy uplands in east-central India. Older books on Indian history often suggest that the Munda are the earliest aboriginals of the subcontinent, but that has to confront the fact that most Austro-Asiatic languages are spoken in Southeast Asia. There was no true consensus on where they were present first.

Genetics seems to have solved this question. The evidence is building up that Austro-Asiatic languages arrived with rice farmers from Southeast Asia. Though most of the ancestry of the Munda is of ANI-ASI mix, a small fraction is clearly East Asian. And interestingly, though they carry no East Asian mtDNA, they do carry East Asian Y. Again, gene flow mediated by males.

The same is true of India’s Bene Israel Jewish community.

A new preprint on biorxiv confirms that the Parsis are another instance of the same dynamic: The genetic legacy of Zoroastrianism in Iran and India: Insights into population structure, gene flow and selection:

Zoroastrianism is one of the oldest extant religions in the world, originating in Persia (present-day Iran) during the second millennium BCE. Historical records indicate that migrants from Persia brought Zoroastrianism to India, but there is debate over the timing of these migrations. Here we present novel genome-wide autosomal, Y-chromosome and mitochondrial data from Iranian and Indian Zoroastrians and neighbouring modern-day Indian and Iranian populations to conduct the first genome-wide genetic analysis in these groups. Using powerful haplotype-based techniques, we show that Zoroastrians in Iran and India show increased genetic homogeneity relative to other sampled groups in their respective countries, consistent with their current practices of endogamy. Despite this, we show that Indian Zoroastrians (Parsis) intermixed with local groups sometime after their arrival in India, dating this mixture to 690-1390 CE and providing strong evidence that the migrating group was largely comprised of Zoroastrian males. By exploiting the rich information in DNA from ancient human remains, we also highlight admixture in the ancestors of Iranian Zoroastrians dated to 570 BCE-746 CE, older than admixture seen in any other sampled Iranian group, consistent with a long-standing isolation of Zoroastrians from outside groups. Finally, we report genomic regions showing signatures of positive selection in present-day Zoroastrians that might correlate to the prevalence of particular diseases amongst these communities.

The paper uses lots of fancy ChromoPainter methodologies which look at the distributions of haplotypes across populations. But some of the primary results are obvious using much simpler methods.

1) About 2/3 of the ancestry of Indian Parsis derives from an Iranian population
2) About 1/3 of the ancestry of Indian Parsis derives from an Indian population
3) Almost all the Y chromosomes of Indian Parsis can be accounted for by Iranian ancestry
4) Almost all the mtDNA haplogroups of Indian Parsis can be accounted for by Indian ancestry
5) Iranian Zoroastrians are mostly endogamous
6) Genetic isolation has resulted in drift and selection on Zoroastrians

The fact that the ancestry proportion is clearly more than 50% Iranian for Parsis indicates that there was more than one generation of males who migrated. They did not contribute mtDNA, but they did contribute genome-wide to Iranian ancestry. There are wide intervals on the dating of this admixture event, but they are consonant with the oral history that was later written down by the Parsis.
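The arithmetic behind the "more than one generation" inference can be sketched in a few lines. This is a toy model of my own, not the ChromoPainter analysis in the preprint: the function name, the idea of a per-generation "share of fathers who are new migrants," and the specific shares are all illustrative assumptions. Founding mothers are taken to be 0% Iranian, migrant males 100% Iranian, and each child averages its parents.

```python
def iranian_autosomal_fraction(migrant_father_shares):
    """Toy model (hypothetical, for illustration only).

    migrant_father_shares: one number per generation, the assumed share of
    fathers in that generation who are newly arrived Iranian males. The
    remaining fathers, and all mothers, come from the community itself.
    Returns the community's expected genome-wide Iranian fraction.
    """
    community = 0.0  # founding mothers are local Indian women
    for share in migrant_father_shares:
        # Average ancestry of this generation's fathers
        fathers = share * 1.0 + (1.0 - share) * community
        # Each child is half mother (community), half father
        community = 0.5 * community + 0.5 * fathers
    return community


# One pulse of migrant males marrying local women caps out at one half.
one_wave = iranian_autosomal_fraction([1.0])

# A second generation in which two thirds of fathers are again migrants
# (an arbitrary assumed value) lands near the roughly 2/3 reported.
two_waves = iranian_autosomal_fraction([1.0, 2.0 / 3.0])

print(round(one_wave, 3), round(two_waves, 3))
```

In this sketch the Y chromosomes stay entirely Iranian and the mtDNA entirely Indian no matter how many waves arrive, since no migrant women are modeled. Only the genome-wide fraction moves, which is why it is the number that carries information about repeated male migration.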

So there you have it. Another example of a population formed from admixture because women hate going to India.

Citation: The genetic legacy of Zoroastrianism in Iran and India: Insights into population structure, gene flow and selection.
Saioa Lopez, Mark G Thomas, Lucy van Dorp, Naser Ansari-Pour, Sarah Stewart, Abigail L Jones, Erik Jelinek, Lounes Chikhi, Tudor Parfitt, Neil Bradman, Michael E Weale, Garrett Hellenthal
bioRxiv 128272; doi:

How Indians are a lot like Latin Americans

Pretty much any person of Indian subcontinental origin in the United States of a certain age who isn’t very dark skinned has probably had the experience of being spoken to in Spanish at some point. When I was younger growing up in Oregon I had the experience multiple times of Spanish speakers, probably Mexican, pleading with me to interpret for them because there was no one else who seemed likely. It isn’t a genius insight to conclude I was most likely South Asian…but it wasn’t out of the question I was Mexican. This applies even more to lighter skinned South Asians. In the Central Valley of California, where there are many Sikhs from Punjab and many Mexicans, this confusion occurred a lot for some Indian kids.

Of course biogeographically there isn’t that much connection between South Asia and the New World. But it isn’t crazy that Christopher Columbus labelled the peoples of the New World “Indian.” After all, they were a brown-skinned people whose features were not African, East Asian, or West Eurasian. And, it turns out genetically there is a coincidence that connects the New World and South Asia: the mixed peoples of Latin America with Amerindian and European ancestry recapitulate an admixture which resembles what occurred in South Asia thousands of years ago. It looks as if about half the ancestry of South Asians is West Eurasian and half something more like eastern Eurasians.

On principal component analysis that means that South Asian and Mexican and Peruvian samples often overlap. This is somewhat curious because the non-West Eurasian ancestors of South Asians and Amerindians diverged in ancestry on the order of 25 to 45 thousand years before the present. And the Iberian ancestry of the mixed people of the New World is almost as far from the character of South Asian West Eurasian ancestry as you can get (in the parlance of this blog, lots of EEF, less CHG, not too much ANE).

A new paper, A genetic chronology for the Indian Subcontinent points to heavily sex-biased dispersals, highlights another similarity: massive bias in biogeographic ancestry by sex. More precisely, the rank order of West Eurasian ancestry in South Asia is skewed like so: Y chromosome > whole-genome > mtDNA (as is evident in the above figure).
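Why that rank order falls out of sex-biased mixing can be shown with a very rough sketch. It assumes a single admixture pulse, and the fractions below are made up for illustration; none of the numbers or names come from the paper.

```python
def single_pulse_ancestry(male_west_eurasian, female_west_eurasian):
    """Hypothetical one-pulse model of sex-biased admixture.

    Arguments are the assumed West Eurasian fractions among the founding
    males and founding females. The Y tracks fathers only, mtDNA tracks
    mothers only, and the rest of the genome averages the two.
    """
    return {
        "y": male_west_eurasian,
        "whole_genome": (male_west_eurasian + female_west_eurasian) / 2.0,
        "mtdna": female_west_eurasian,
    }


# Illustrative values only: mostly West Eurasian founding males,
# mostly indigenous founding females.
result = single_pulse_ancestry(0.8, 0.2)
print(result)
```

Whenever the male fraction exceeds the female fraction, the whole-genome value sits between the two, giving Y chromosome > whole-genome > mtDNA.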

I actually began writing about this in the late 2000s, when the fact that South Asian mtDNA was very different from West Eurasian mtDNA, and South Asian Y chromosome was mostly West Eurasian, was obvious. Then work using genome-wide data sets began to point to massive intra-Eurasian admixture between very diverged lineages. The paper is not revolutionary, but worth reading for its thoroughness and how it brings together all the lines of evidence.

Finally, no ancient DNA. That’s probably for the future, but I don’t expect any surprises.

Citation: A genetic chronology for the Indian Subcontinent points to heavily sex-biased dispersals.