Patterns in international GRE scores

While writing up my earlier post I stumbled onto some interesting GRE data for applicants from various countries. I transcribed the results for all nations with sample sizes greater than 500. What you see above is a plot which shows mean quantitative and verbal scores on the GRE by nation.

The correlations between subtests of the GRE in this set of countries are as follows:

Quant & verbal = 0.33

Verbal & writing = 0.84

Quant & writing = 0.21

Basically, the writing score and verbal score seem to reflect the lack of English fluency in many nations.

Many of these results are not too surprising if you’ve ever seen graduate school applications in the sciences (I have). Applicants from the United States tend to have lower quantitative and higher verbal scores. This is what you see here. It’s rather unfair since the test is administered in English, and that’s the native language of the United States. No surprise the United Kingdom and Canada score high on verbal reasoning. Ireland, Australia, and New Zealand didn’t have enough test takers to make the cut, but they all do as well as the United Kingdom. Singapore has an elite group which uses English as the medium of instruction in school.

I didn’t include standard deviation information, even though it’s in there. India has a pretty high standard deviation on quantitative reasoning, at 9.1. In contrast, China only has a standard deviation of 5.2 for quantitative reasoning. More than twice as many Indians as Chinese take the GRE.

Finally, I want to compare Saudi Arabia with Iran. Both countries have about 5,000 people taking the GRE every year. About 2.5 times as many people live in Iran as in Saudi Arabia. But the results for Saudi Arabia are dismal, while Iranian students perform rather well on the quantitative portion of the GRE.* This is not surprising to me, having seen applications from Saudi and Iranian students.

Saudi Arabia wants to move beyond being purely a resource-driven economy. These sorts of results show why many people are skeptical: in the generations since the oil-boom began the Saudi state has not cultivated and matured the human capital of its population. To get a better sense, here are the scores with N’s of MENA nations and a few others:

Country N Quantitative
Saudi 4462 141.6
Libya 113 146.2
Iraq 148 146.6
Oman 98 146.9
UAE 238 147.2
Qatar 85 147.3
Kuwait 386 147.8
Algeria 86 149.5
Yemen 68 149.9
Bahrain 55 150.9
Ethiopia 353 151.3
Jordan 472 152.1
Egypt 1044 153.2
Morocco 191 153.7
Tunisia 128 154.1
Georgia 71 154.2
Lebanon 691 154.7
Armenia 84 154.9
Azerbaijan 125 155.1
Eritrea 223 155.2
Israel 344 156.8
Iran 5319 157.3
Turkey 2370 158.9

The “natural break” is between the Saudis and everyone else. In recent years Saudis indigenized their non-essential workforce. I’m broadly skeptical of the consequences of this.
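One rough way to check the “natural break” is to sort the quantitative means from the table and look for the widest gap between neighbors. This is only a sketch: the scores are transcribed from the table above, the helper name `largest_gap` is made up for illustration, and the small N's for several countries mean the smaller gaps should not be over-read.

```python
# Mean GRE quantitative scores, transcribed from the table above.
quant = {
    "Saudi": 141.6, "Libya": 146.2, "Iraq": 146.6, "Oman": 146.9,
    "UAE": 147.2, "Qatar": 147.3, "Kuwait": 147.8, "Algeria": 149.5,
    "Yemen": 149.9, "Bahrain": 150.9, "Ethiopia": 151.3, "Jordan": 152.1,
    "Egypt": 153.2, "Morocco": 153.7, "Tunisia": 154.1, "Georgia": 154.2,
    "Lebanon": 154.7, "Armenia": 154.9, "Azerbaijan": 155.1,
    "Eritrea": 155.2, "Israel": 156.8, "Iran": 157.3, "Turkey": 158.9,
}


def largest_gap(scores):
    """Return (lower_country, upper_country, gap) for the widest gap
    between adjacent countries once they are sorted by score."""
    ranked = sorted(scores.items(), key=lambda kv: kv[1])
    pairs = zip(ranked, ranked[1:])
    (lo, lo_score), (hi, hi_score) = max(
        pairs, key=lambda pair: pair[1][1] - pair[0][1]
    )
    return lo, hi, hi_score - lo_score


lo, hi, gap = largest_gap(quant)
print(f"Widest gap: {lo} -> {hi} ({gap:.1f} points)")
```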

The data for the plot at the top is below the fold.

Humans and the settlement of Asia

Current Anthropology has a bunch of articles related to the human settlement of Asia in its latest issue ahead of print. Aside from Martin Sikora’s, most of them have a more traditional paleontological focus, so it’s pretty tough for me to understand them in context. But it’s all important to take in as we get a better and better understanding of the process.

All the articles are open access, so there’s no excuse not to read them!

GRE utility for graduate school and conditioning on the dependent variable

One of the things that seems to be popular in the biological sciences right now is the push to get rid of the GRE as part of the criteria for entrance. Two of the major rationales are that it’s expensive, so it discriminates against lower socioeconomic status candidates, and that it makes it harder to recruit underrepresented minorities since on average they score lower on the GRE (many departments have either explicit or implicit GRE cut-offs).

I’m not going to litigate these issues. To be honest I believe it is a fait accompli that many departments will stop using the GRE. This will probably increase diversity in some ways. But I also suspect it will result in a greater bias toward more “polished” candidates since very high GRE scores sometimes indicate to admissions committees that applicants who are otherwise spotty or irregular may have promise.

But, I do want to enter into the record a major problem with the argument that GRE does not correlate with academic success at the graduate level (supported by research). Yes, part of the issue may simply be range restriction. But there is another issue which many biological scientists may not be familiar with.

First, right now this paper from early this year is getting a lot of attention, The Limitations of the GRE in Predicting Success in Biomedical Graduate School.

It was, of course, a political scientist who objected immediately:

This blog post is of interest for those curious, That one weird third variable problem nobody ever mentions: Conditioning on a collider. Basically, it is well known that at many universities graduate admittees exhibit a weak negative association between GRE scores and grade point averages. This was commented on as far back as the 1970s in Science, in Graduate Admission Variables and Future Success:

The standard variables considered in selecting students for graduate school do not correlate well with later measures of the success or attainments of the selected students (1, 2). The low correlations have led at least one investigator (3) to propose abandoning one of these standard variables, the Graduate Record Examination (GRE). The purpose of the present report is to demonstrate that variables that are the basis for admitting students to graduate school must have low correlations with future measures of the success of these students.

What’s going on?

As noted in the paper there are some universities which are first-choices for graduate school in a field to such an extent that they will admit candidates who have very high GPAs and very high GREs. In this case, neither of the criteria will predict success because there is very little variation to generate a correlation. But, at many universities, there is a negative correlation between admittee GRE score and undergraduate GPA. That is because very few applicants will be admitted with both low GRE and GPA scores, but some will be admitted with high GRE scores and low(er) GPAs and others with higher GPAs and low(er) GREs (usually there is still a GPA and GRE floor).
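To make the selection effect concrete, here is a small simulation. It is a toy sketch, not the model from the paper: it assumes GRE, GPA, and later success are each a shared latent "ability" plus independent noise (all in arbitrary standardized units), and that a program admits anyone whose GRE plus GPA clears a cutoff. Under those assumptions GRE and GPA should correlate positively among all applicants, negatively among admits, and GRE should predict success less well among admits than in the full pool.

```python
import random


def pearson(xs, ys):
    """Plain Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


def simulate(n=50000, cutoff=2.0, seed=0):
    """Toy applicant pool. ASSUMPTION: GRE, GPA, and success are each
    latent ability plus independent noise, and admission requires
    GRE + GPA > cutoff (a compensatory rule)."""
    rng = random.Random(seed)
    pool = []
    for _ in range(n):
        ability = rng.gauss(0, 1)
        gre = ability + rng.gauss(0, 1)
        gpa = ability + rng.gauss(0, 1)
        success = ability + rng.gauss(0, 1)
        pool.append((gre, gpa, success))
    admitted = [row for row in pool if row[0] + row[1] > cutoff]

    def corrs(rows):
        gre, gpa, success = zip(*rows)
        return {
            "gre_gpa": pearson(gre, gpa),
            "gre_success": pearson(gre, success),
        }

    return corrs(pool), corrs(admitted)


applicants, admits = simulate()
print("all applicants:", applicants)
print("admitted only: ", admits)
```

The admission rule is the collider here: both GRE and GPA feed into it, so conditioning on admission manufactures a negative relationship between them.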

Consider the relation:

    \[ R^2 = \frac{r_1^2 + r_2^2 - 2r_1r_2r}{1 - r^2} \]

Where R^2 is the proportion of the variance explained in the variable you want to predict, r_1 and r_2 are the correlations of GRE and of GPA with that variable of interest (so r_1^2 and r_2^2 are their squares), and r is the correlation between GRE and GPA.

Basically, when you have negative correlations you’re going to get into a situation where r_1^2 and r_2^2 are not going to be able to explain a lot of the variance in what you want to predict.
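For readers who want to plug numbers into the relation above, here is a minimal sketch. The function name `multiple_r_squared` and the example correlations are hypothetical, chosen only to exercise the formula; they are not estimates from any GRE dataset. It lets you compare what each predictor explains alone (r_1^2 or r_2^2) with the combined R^2 for different values of r.

```python
def multiple_r_squared(r1, r2, r):
    """R^2 for two predictors, following the relation above.

    r1, r2: correlation of each predictor with the outcome.
    r:      correlation between the two predictors.
    """
    if not -1 < r < 1:
        raise ValueError("r must lie strictly between -1 and 1")
    return (r1**2 + r2**2 - 2 * r1 * r2 * r) / (1 - r**2)


# Hypothetical correlations, used only to show how to call the function.
for r in (-0.2, 0.0, 0.5):
    value = multiple_r_squared(0.2, 0.2, r)
    print(f"r = {r:+.1f}: R^2 = {value:.3f}")
```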

This may seem like a nerdy issue. And it is well known to social scientists. But since the people I see talking about the GRE are academics in the biological sciences I thought I would at least highlight this nerdy issue.

As I said above, I do think GRE is going to be dropped as a requirement at many universities for graduate programs. This is going to be a natural experiment, so we’ll be able to test many hypotheses. The paper above ends like so:

…Without a study in which a sample of the applicants-rather than of the selected students is evaluated, it is impossible to tell [the validity of the criteria -RK]. Yet such a study is completely infeasible. Even if rejected applicants are monitored throughout the rest of their working careers, it is impossible to evaluate how they would have done had they been admitted, because the rejection itself constitutes an important “treatment” difference between them and the selected students. The alternative is to admit a sample of the applicant population without using the standard admission variables to select them-preferably, to select at random.

Selection may not be random, but I believe we may be able to test some hypotheses in the next generation by giving the GRE to a set of students after they have been admitted and seeing what the correlation with their later outcomes is.

The postdoc salary range with cost of living (situation probably worse than reported)

Nature has an article, Pay for US postdocs varies wildly by institution. True, but as Matt Hahn, professor of biology at Indiana University in Bloomington (cost of living 93% of the USA average), observed, there isn’t any correction for cost of living. The researcher who dug through the data actually posted it online, so I decided to correct that oversight.

I took the institutions with N > 20, and looked up the cost of living in Best Places. The plot above is messy, but you can see that lots of institutions are paying a standard median salary of around $47,500, no matter the cost of living.

The correlation between cost of living and postdoc salary is 0.39. The weighted correlation is 0.48. These are pretty modest. That means you can find a really good situation, or a really bad one (also, institution reputation matters, there are some gems which pay well and have great reputations from what I can tell!).
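For anyone who wants to redo this kind of comparison, a weighted correlation can be sketched as below. Two assumptions to flag: the post does not say what the weights were (the number of postdocs per institution is a natural guess), and the rows here are made-up placeholders, not the actual salary data.

```python
def weighted_pearson(xs, ys, ws):
    """Pearson correlation where each (x, y) pair counts with weight w."""
    total = sum(ws)
    mx = sum(w * x for x, w in zip(xs, ws)) / total
    my = sum(w * y for y, w in zip(ys, ws)) / total
    cov = sum(w * (x - mx) * (y - my) for x, y, w in zip(xs, ys, ws))
    vx = sum(w * (x - mx) ** 2 for x, w in zip(xs, ws))
    vy = sum(w * (y - my) ** 2 for y, w in zip(ys, ws))
    return cov / (vx * vy) ** 0.5


# HYPOTHETICAL rows: (cost-of-living index, median salary, N postdocs).
rows = [
    (93, 47500, 120),
    (110, 47500, 45),
    (150, 52000, 300),
    (85, 46000, 25),
    (130, 50000, 60),
]
col_index, salary, n_postdocs = zip(*rows)

unweighted = weighted_pearson(col_index, salary, [1] * len(rows))
weighted = weighted_pearson(col_index, salary, n_postdocs)
print(f"unweighted r = {unweighted:.2f}, weighted r = {weighted:.2f}")
```

Passing equal weights recovers the ordinary correlation, which makes it easy to see how much the large institutions move the result.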

Also, I’m pretty sure that the situation is worse than the numbers above suggest. Looking at the list of universities, I think there’s a bias for institutions in high cost of living locations not to report their salary data. Aside from UCSB the whole UC system denied the attempt to get data, and I don’t see Stanford, Columbia, or Harvard on the list.

The full table is below the fold, but adjusted for cost of living UCSB postdocs get $20,866 per year. In contrast, postdocs at Michigan State, the University of Maryland, Baltimore, and Wayne State University make more than $60,000 per year when you adjust. Stanford isn’t on the list, but online it says Stanford postdocs make somewhere between the low $50,000s and the low $60,000s. That seems reasonable for the life sciences, though those are definitely poverty wages where the university is located. If you are in a lucrative field it can be more, and depending on your supervisor outside consulting is a possibility, but good luck living in Silicon Valley on a $100,000 yearly gross income if you have a family, as many postdocs do.
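The adjustment itself is simple arithmetic. The sketch below assumes the adjustment is the nominal salary divided by the local cost-of-living index (with the USA average set to 100), which the post does not spell out. The worked figure pairs the roughly $47,500 standard median with Bloomington's index of 93 purely as an illustration; it is not a reported salary for any institution.

```python
def adjust_for_cost_of_living(salary, col_index):
    """Scale a nominal salary to national-average purchasing power.

    ASSUMPTION: col_index is a percentage of the USA average (100 = average),
    and the adjustment is a plain division by that index.
    """
    return salary / (col_index / 100)


# Illustration only: the typical median salary at an index of 93.
adjusted = adjust_for_cost_of_living(47500, 93)
print(f"${adjusted:,.0f} in national-average dollars")
```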

The Rising Waters of Human Tribal Nature

I’m excited to read Steven Pinker’s Enlightenment Now: The Case for Reason, Science, Humanism, and Progress. I’ve read every one of his books except for The Stuff of Thought, and The Blank Slate is one of my favorite books of all time. I still remember how much of a page-turner The Language Instinct was for me back in the late 1990s. But I’m most excited about Enlightenment Now because I’m looking for a little hope. At this point, I am very pessimistic as to the prospects for the Enlightenment project.

This is pretty obvious to anyone who reads me closely. I’ve been writing and discussing with people on the internet, and in private, for many years now, and have come to the conclusion most people are decent, but they’re also craven and intellectually unserious outside of their domain specificity when they are intellectual. Many of our institutions are quite corrupt, and those which are supposedly the torchbearers of the Enlightenment, such as science, are filled with people who are also blind to their own biases or dominated by those who will plainly lie to advance their professional prospects or retain esteem from colleagues.

That’s why I laughed out loud when I saw this tweet:

In psychology, much of the replication crisis was simply due to personal self-interest (more publications). But some of it was obviously political (see stereotype threat). Similarly, look at the fiasco in nutrition science. Some of it was personal, but there were also political demands from on high that there be something done. So “scholars” set some guidelines that people followed for decades, even if later they were shown to be totally ineffective. I’m not even going to get into the travesty that is modern biomedical science, with professional advancement and institutional interests combined in a deadly cocktail.

Also, I enjoy science popularizing (or did, I don’t read science books much anymore) as much as the next person, but isn’t it interesting how much of modern science confirms the mainstream elite cultural norms of ~2020? Curiously, if you read science popularizations in newspapers in 1920 they would also confirm the elite cultural norms of 1920…. But this time we’re right!

Other institutions aren’t doing better. The media is going through economic collapse, and journalists and their paymasters are reacting by pandering to their audiences. Instead of illuminating, they’re confirming. That’s what the audience wants, and I’m sure it’s more satisfying to journalists anyway. But can you blame them with the economics that are before us?

This is 2017, Nazi-pizza

Don’t get me started on Facebook or Twitter.

I was having a discussion with a reasonably prominent pundit (you would recognize the name) today who bemoaned the reality that so many journalists are now driven to sating tribal passions and generating clicks for their paymasters. He was trying to argue against my pessimism, suggesting that the fever was starting to break. We’ll see. I hope I’m wrong.

People have always been biased and subject to motivated reasoning. We’ve had our disputes whatever our ideology, whether it be conservative, moderate, or liberal. But the Enlightenment perspective of critical rationalism, which took philosophical realism seriously, meant that ultimately people who disagreed often assumed that fundamentally they were trying to converge on the same facts, the same reality. Reality existed, and you couldn’t just wish it away. Discussion might forward two individuals to a convergence!

We’re not there anymore. Whether it be Bush-era contempt for “Reality-Based Community”, or the rising crest of “Critical Theory”, the acid of subjectivism is eroding the vast edifice of aspirational realism which grew organically in the wake of the Enlightenment. This isn’t a Left vs. Right phenomenon, it’s a human dynamic, because for most of human history what is true has been determined by what the tribe dictates to be true, and what the tribe dictates to be true has often not been based on a critical evaluation of facts and theories. What the tribe dictates to be true is computationally less intensive than thinking things through yourself, and, it’s often right-enough.

The reality is that this cultural cognition and conformity has always held. It’s just that it seems that for a few centuries substantial latitude was given in public to a relative amount of heterodoxy from broad tribal visions. And it was always a work in progress. But there was a goal, and an ideal, even if we habitually failed. We failed in the direction of truth.

We live in a post-modern age now. Feelings are paramount, facts must bow before them. But the curious fact is that the post-modern age is just the pre-modern age. When I first read the Christian author Alister McGrath I literally scoffed at his contention that atheism would fail before the ascendancy of post-modernism. Ten years on I will admit that I now believe he was right and I was wrong. Though I don’t think the New Atheism failed miserably, I do think that the problems it is encountering from the cultural Left are due to its cold modernist baggage.

No truth, no liberalism. No liberalism, and democracy becomes the mob. The passions of the mob do eventually fail, and in its wake a more oligarchic and hierarchical system will emerge. We may simply be seeing the end of the liberal individualist interregnum, as history reverts to its despotic collectivist norm.

Art, the applied sciences of engineering, and many human endeavors will continue to develop in the new order. Illiberal societies, all societies until recently, can be cultured and civilized. My own preference is for the dignity of the individual and legal egalitarianism of the liberal world in which I grew up (but in which I was not born), but humans have flourished and continue to flourish in illiberal environments.

One way to think about the past century or so is that more or less the waters of human nature receded, and a great undersea world was exposed. But now human nature is rising, and that world is submerging before our eyes. But islands of the old world we grew up in will persist. We need to find each other out and cherish the values of critical inquiry as we have for thousands of years. An archipelago of learning for learning’s sake can still maintain itself in a world where our values no longer hold the leash. But like the mammals during the Mesozoic, we will have to go back into the night and the shadows. There will hopefully be oligarchic patrons who sympathize with us, and despots like Frederick the Great who give us some latitude to work. Our values will fade and diminish, but they will not disappear.* One day they may come to the fore again!

Finally, understanding that most people don’t need to be right or utter the truth, but simply need to win, has made me much more cheerful and less sour observing everyday stupidities. It is no great insight to observe that I’ve never been one who has had much esteem for the admiration of my peers. I like to do my own thing. But tribal acclamation must be the best of all things for most humans, and now I understand why they fight unfairly and stupidly with such ease and naturalness: their aim is not to be right in the eyes of nature, but to rise in the esteem of their fellow humans. That is the summum bonum.

Note: I’ll be very happy to be proven wrong in 15 years. But as it is I think by then we’ll be dealing with the final breakdown of the institutions of the republic in the wake of a Left-wing attempt to forestall the economic immiseration of the middle-class that failed.

* The main reason I hated religion as a child is the mindless boredom of attendance at services. I quickly realized I didn’t believe any of that tripe and never had. But the liberty that I have to dissent from public values may not be a liberty we always have. Private dissent may come back and become the norm as it has been for much of human history.

$9.99 to get into the Helix exome ecosystem

Will try to keep self-interested product placement to a minimum normally, but I thought I’d pass on that Helix has a $100 off sale for the next 72 hours. That means that the company I work for has a Neanderthal app on sale for $9.99. The regular price is $29.99, plus an added $80.00 for exome+ sequencing if you aren’t in the Helix database (which most people are not).

The upshot here is that the $9.99 will get you an exome+ sequence, which at some point in early 2018 you can download for $600. But if you don’t want to download it, it’s a great way to get into the ecosystem on-the-cheap.

I assume most of my readers know what the exome is, but it’s basically the portion of your genome which is directly translated into functional proteins. That’s about 1% of the genome, or ~30,000,000 bases. This is a major expansion on the direct-to-consumer (DTC) SNP-chip platforms, which are in the 500,000 to 1,000,000 SNP range.

Anyway, not sure this will be appealing to readers who need a full download of data. But if you are the type who is more interested in getting applications related to your genome, this is a pretty good deal at a sub-$10 price point.

Note: To my knowledge only ships to USA currently.

Our time in the sun

The New York Times has a story up, After the Dinosaurs’ Demise, Many Mammals Seized the Day. It’s a write-up of a new paper that is open access, Temporal niche expansion in mammals from a nocturnal ancestor after dinosaur extinction.

This research illustrates how computational power has changed evolutionary biology. There has long been an intuitive verbal model that mammals were ancestrally night-adapted creatures based on aspects of their biology, as well as the evolutionary reality that for most of the lineages’ existence they were overshadowed by dinosaurs (remember, more than half of our evolutionary history predates the Cenozoic).

But today we do more than posit models which match and predict the fossil (or genetic) data. Computationally intensive phylogenetic frameworks are tested using extant lineages to generate probabilities of given scenarios generating the data we see given particular models. Something like the Reversible-jump Markov chain Monte Carlo (which is used in this paper) could actually be done manually…if a phylogeneticist had thousands of slaves to do all the computations. Obviously, the emergence of powerful computers accessible to all really changed the game in terms of analytic power.

And yet I wonder about the sense of precision that people gain from these methods. Verbal models are necessarily vague. When you give a probability of a given hypothesis being 0.71, that gives understanding a solidity. But is it warranted? Though researchers understand all the individual moving parts of the phylogenetic framework, only a computer can really bring it all together.

It’s something to consider. This is to a great extent the future of evolutionary biology: positing models, and putting them into a calculating machine like the one Leibniz dreamed of.

Citation: Temporal Niche Expansion In Mammals From A Nocturnal Ancestor After Dinosaur Extinction
Roi Maor, Tamar Dayan, Henry Ferguson-Gow, Kate Jones

Addendum: This is stupid of me, but only after reading the above paper did I reflect that most amniotes are diurnal and that mammals are the exception. Think about it, birds. And reptiles are probably more sluggish at night.

The end of the Kingdom of Saudi Arabia

The most important thing happening in the world that is different this week from last week, from what I can tell, is that the Kingdom of Saudi Arabia is going “full Ishmael” on us. By this I mean the reference in the Hebrew Bible to Abraham’s firstborn son, Ishmael, the legendary ancestor of the Arabs: “And he will be a wild man; his hand will be against every man, and every man’s hand against him….”

What’s going on now? As you know there seems to be an internal purge going on, and a centralization of power around the Crown Prince. This, after the rollback of the power of the religious establishment.

Externally the quagmire in Yemen continues, and the Saudi state is now becoming more belligerent toward both Iran and Lebanon.

Most of you probably know the general issues about why the Saudi state is attempting to change and reform. Though petroleum will remain important for plastics and jet fuel, it is quite possible that the proportion used for gasoline will decline with the rise of electric cars. Additionally, there seem to be more supply-side possibilities with fracking technologies.

But perhaps the biggest factors are demographic. Over ten years ago Peter Turchin wrote a paper, Scientific Prediction in Historical Sociology: Ibn Khaldun meets Al Saud. It’s pretty useful in understanding what’s going on right now. The big issue which Turchin talks about more generally and is relevant to Saudi Arabia is elite overproduction. The Royal House is highly fecund. And all the scions demand unsustainable leisured lives….

China’s wealthiest come from only a few regions

In Kenneth Pomeranz’s The Great Divergence: China, Europe, and the Making of the Modern World Economy he argues that the difference in per capita economic wealth between Europe and China is a relatively recent phenomenon. One of the major arguments he makes is that one has to make an apples-to-apples comparison. Comparing Northwest Europe to China is not apples-to-apples, but comparing Northwest Europe to the lower Yangzi Delta region of Central China is apples-to-apples. Using this measure Europe and China are roughly comparable up until 1800.

At least that’s the argument. Others make the case for much deeper and older roots for the differences between Western Europe and the rest of the world, most articulately in Gregory Clark’s A Farewell to Alms.

I don’t have a dog in this fight and am not decided, though I follow the field somewhat closely. Rather, I’ve always been curious about differences between Chinese regions, and how they never undermine national unity. I recall reading years ago in The Age of Confucian Rule that imperial examinations to determine candidates for the bureaucracy had quotas on candidates from the southeastern province of Fujian. They were simply filling up too many slots, at the expense of northern Chinese candidates.

The tension between social and economic orientations of different regions of China cropped up periodically. Basically, the Overseas Chinese community is derived from southern regions such as Guangdong and Fujian, and the central government over the centuries attempted to stamp out these regions’ propensity toward international commerce. A figure like Howqua is typical, though he certainly would not be met with approval by stern Neo-Confucians such as Zhu Xi (also a southern Chinese born and bred).

With all this in mind, I was curious about the origins of the 20 wealthiest Chinese as of 2017. Below you see the results:

Name                    Net worth (USD)  Sources of wealth        Province   Certainty
Wang Wenyin             14 billion       mining, copper products  Anhui
Liu Yongxing            6.6 billion      agribusiness             Fujian
Ma Huateng              24.9 billion     internet media           Guangdong
He Xiangjian            12.3 billion     home appliances          Guangdong
Yang Huiyan             9 billion        real estate              Guangdong
Yao Zhenhua             8.4 billion      conglomerate             Guangdong  ?
Zhang Zhidong           8.4 billion      internet media           Guangdong  ?
Hui Ka Yan (Xu Jiayin)  10.2 billion     real estate              Henan
Lei Jun                 6.8 billion      smartphones              Hubei
Liu Qiangdong           7.7 billion      e-commerce               Jiangsu
Zhang Shiping           6.7 billion      aluminum products        Shandong
Wang Wei                15.9 billion     package delivery         Shanghai
Robin Li                13.3 billion     internet search          Shanxi
Wang Jianlin            31.3 billion     real estate              Sichuan
Xu Shihui               21.1 billion     solar power equipment    Sichuan
Jack Ma                 28.3 billion     e-commerce               Zhejiang
William Ding            17.3 billion     online games             Zhejiang
Zong Qinghou            7.2 billion      beverages                Zhejiang
Li Shufu                21.1 billion     automobiles              Zhejiang
Guo Guangchang          6.3 billion      diversified              Zhejiang

A few of the individuals I’m not totally sure about in terms of where they were born, but I think I guessed correctly. Compare representation on the list to national population by province, and you get:

Province Pop % On list
Guangdong 8% 25%
Zhejiang 4% 25%
Sichuan 8% 10%
Fujian 3% 5%
Anhui 5% 5%
Henan 7% 5%
Hubei 4% 5%
Jiangsu 6% 5%
Shanghai 2% 5%
Shanxi 3% 5%
Shandong 7% 5%

Zhejiang-Jiangsu-Shanghai is the core economic region highlighted by Pomeranz. About 12% of China’s population resides in these jurisdictions, but 35% (7 out of 20) of its wealthiest individuals were born here. Guangdong, as ground zero of the new economic revolution, has clearly benefited.
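The over- and under-representation can be put on a single scale by dividing each province's share of the list by its share of the population. The sketch below uses the rounded percentages from the table above, so the ratios are only approximate; `representation_ratio` is an illustrative helper name, and with just 20 people on the list a single billionaire moves a province by five points.

```python
# Percentages transcribed from the table above:
# province -> (share of national population, share of the top-20 list).
shares = {
    "Guangdong": (8, 25), "Zhejiang": (4, 25), "Sichuan": (8, 10),
    "Fujian": (3, 5), "Anhui": (5, 5), "Henan": (7, 5), "Hubei": (4, 5),
    "Jiangsu": (6, 5), "Shanghai": (2, 5), "Shanxi": (3, 5),
    "Shandong": (7, 5),
}


def representation_ratio(pop_pct, list_pct):
    """List share divided by population share; above 1 means over-represented."""
    return list_pct / pop_pct


ratios = {prov: representation_ratio(*pcts) for prov, pcts in shares.items()}
for province, ratio in sorted(ratios.items(), key=lambda kv: -kv[1]):
    print(f"{province:<10} {ratio:.2f}x")

# The lower Yangzi core region discussed in the text.
core = ("Zhejiang", "Jiangsu", "Shanghai")
core_pop = sum(shares[p][0] for p in core)
core_list = sum(shares[p][1] for p in core)
print(f"Core region: {core_pop}% of population, {core_list}% of the list")
```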