Patterns in international GRE scores

While writing up my earlier post I stumbled onto some interesting GRE data for applicants from various countries. I transcribed the results for all nations with sample sizes greater than 500. What you see above is a plot which shows mean quantitative and verbal scores on the GRE by nation.

The correlations between subtests of the GRE in this set of countries are as follows:

Quant & verbal = 0.33

Verbal & writing = 0.84

Quant & writing = 0.21

Basically, the writing score and verbal score seem to reflect the lack of English fluency in many nations.

Many of these results are not too surprising if you’ve ever seen graduate school applications in the sciences (I have). Applicants from the United States tend to have lower quantitative and higher verbal scores. This is what you see here. It’s rather unfair since the test is administered in English, and that’s the native language of the United States. No surprise the United Kingdom and Canada score high on verbal reasoning. Ireland, Australia, and New Zealand didn’t have enough test takers to make the cut, but they all do as well as the United Kingdom. Singapore has an elite group which uses English as the medium of instruction in school.

I didn’t include standard deviation information, even though it’s in there. India has a pretty high standard deviation on quantitative reasoning, at 9.1. In contrast, China only has a standard deviation of 5.2 for quantitative reasoning. More than twice as many Indians as Chinese take the GRE.

Finally, I want to contrast Saudi Arabia with Iran. Both countries have about 5,000 people taking the GRE every year, though Iran has about 2.5 times the population of Saudi Arabia. But the results for Saudi Arabia are dismal, while Iranian students perform rather well on the quantitative portion of the GRE.* This is not surprising to me, having seen applications from Saudi and Iranian students.

Saudi Arabia wants to move beyond being purely a resource-driven economy. These sorts of results show why many people are skeptical: in the generations since the oil-boom began the Saudi state has not cultivated and matured the human capital of its population. To get a better sense, here are the scores with N’s of MENA nations and a few others:

Country        N       Quantitative
Saudi          4462    141.6
Libya          113     146.2
Iraq           148     146.6
Oman           98      146.9
UAE            238     147.2
Qatar          85      147.3
Kuwait         386     147.8
Algeria        86      149.5
Yemen          68      149.9
Bahrain        55      150.9
Ethiopia       353     151.3
Jordan         472     152.1
Egypt          1044    153.2
Morocco        191     153.7
Tunisia        128     154.1
Georgia        71      154.2
Lebanon        691     154.7
Armenia        84      154.9
Azerbaijan     125     155.1
Eritrea        223     155.2
Israel         344     156.8
Iran           5319    157.3
Turkey         2370    158.9

The “natural break” is between the Saudis and everyone else. In recent years the Saudis have indigenized their non-essential workforce. I’m broadly skeptical of the consequences of this.
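One rough way to check the “natural break” is to sort the quantitative means from the table above and look at the gaps between neighbors. The sketch below does only that. The helper name `adjacent_gaps` is made up for this illustration, and it ignores the very different sample sizes, so treat it as a quick look rather than a formal test.

```python
# Mean GRE quantitative scores, copied from the table above.
QUANT_MEANS = {
    "Saudi": 141.6, "Libya": 146.2, "Iraq": 146.6, "Oman": 146.9,
    "UAE": 147.2, "Qatar": 147.3, "Kuwait": 147.8, "Algeria": 149.5,
    "Yemen": 149.9, "Bahrain": 150.9, "Ethiopia": 151.3, "Jordan": 152.1,
    "Egypt": 153.2, "Morocco": 153.7, "Tunisia": 154.1, "Georgia": 154.2,
    "Lebanon": 154.7, "Armenia": 154.9, "Azerbaijan": 155.1,
    "Eritrea": 155.2, "Israel": 156.8, "Iran": 157.3, "Turkey": 158.9,
}


def adjacent_gaps(means):
    """Return (gap, lower_country, higher_country) tuples, largest gap first."""
    ordered = sorted(means.items(), key=lambda item: item[1])
    gaps = [
        (round(hi_score - lo_score, 1), lo_name, hi_name)
        for (lo_name, lo_score), (hi_name, hi_score) in zip(ordered, ordered[1:])
    ]
    return sorted(gaps, reverse=True)


if __name__ == "__main__":
    # Show the two largest gaps between neighboring countries.
    for gap, lower, higher in adjacent_gaps(QUANT_MEANS)[:2]:
        print(f"{lower} -> {higher}: {gap}")
```

On these numbers the largest gap is the 4.6 points between Saudi Arabia and Libya. The next largest, between Kuwait and Algeria, is 1.7.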

The data for the plot at the top is below the fold.


GRE utility for graduate school and conditioning on the dependent variable

One of the things that seems to be popular in the biological sciences right now is the push to get rid of the GRE as part of the criteria for entrance. Two of the major rationales are that it’s expensive, and so discriminates against lower socioeconomic status candidates, and that it makes it harder to recruit underrepresented minorities, since on average they score lower on the GRE (many departments have either explicit or implicit GRE cut-offs).

I’m not going to litigate these issues. To be honest I believe it is a fait accompli that many departments will stop using the GRE. This will probably increase diversity in some ways. But I also suspect it will result in a greater bias toward more “polished” candidates since very high GRE scores sometimes indicate to admissions committees that applicants who are otherwise spotty or irregular may have promise.

But, I do want to enter into the record a major problem with the argument that GRE does not correlate with academic success at the graduate level (supported by research). Yes, part of the issue may simply be range restriction. But there is another issue which many biological scientists may not be familiar with.

First, right now this paper from early this year is getting a lot of attention, The Limitations of the GRE in Predicting Success in Biomedical Graduate School.

It was, of course, a political scientist who objected immediately:

This blog post is of interest for those curious, That one weird third variable problem nobody ever mentions: Conditioning on a collider. Basically, it is well known that at many universities graduate admittees exhibit a weak negative association between GRE scores and grade point averages. This was commented on as far back as the 1970s in Science, in Graduate Admission Variables and Future Success:

The standard variables considered in selecting students for graduate school do not correlate well with later measures of the success or attainments of the selected students (1, 2). The low correlations have led at least one investigator (3) to propose abandoning one of these standard variables, the Graduate Record Examination (GRE). The purpose of the present report is to demonstrate that variables that are the basis for admitting students to graduate school must have low correlations with future measures of the success of these students.

What’s going on?

As noted in the paper there are some universities which are first-choices for graduate school in a field to such an extent that they will admit candidates who have very high GPAs and very high GREs. In this case, neither of the criteria will predict success because there is very little variation to generate a correlation. But, at many universities, there is a negative correlation between admittee GRE score and undergraduate GPA. That is because very few applicants will be admitted with both low GRE and GPA scores, but some will be admitted with high GRE scores and low(er) GPAs and others with higher GPAs and low(er) GREs (usually there is still a GPA and GRE floor).

Consider the relation:

    \[ R^2 = \frac{r_1^2 + r_2^2 - 2r_1r_2r}{1 - r^2} \]

Where R^2 is the proportion of the variance of the variable you want to predict that GRE and GPA explain together, r_1 and r_2 are the correlations of GRE and of GPA with that variable of interest (so r_1^2 and r_2^2 are the shares of variance each explains on its own), and r is the correlation between GRE and GPA.

Basically, when you have negative correlations you’re going to get into a situation where r_1^2 and r_2^2 are not going to be able to explain a lot of the variance in what you want to predict.
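To make the selection effect concrete, here is a minimal simulation sketch. Everything in it is an assumption for illustration: applicants' GRE and GPA are independent standard normal scores, the outcome depends equally on both plus noise (the 0.4 weights are arbitrary), and the department admits the top 20% by GRE + GPA. None of these numbers come from the papers discussed here.

```python
import random
from math import sqrt


def pearson(xs, ys):
    """Plain Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def r_squared(r1, r2, r):
    """The two-predictor relation from the text."""
    return (r1 ** 2 + r2 ** 2 - 2 * r1 * r2 * r) / (1 - r ** 2)


def summarize(rows):
    """Correlations and R^2 for a list of (gre, gpa, outcome) tuples."""
    gre, gpa, outcome = zip(*rows)
    r1, r2, r = pearson(gre, outcome), pearson(gpa, outcome), pearson(gre, gpa)
    return {"r_gre_gpa": r, "r_gre_outcome": r1,
            "r_gpa_outcome": r2, "R2": r_squared(r1, r2, r)}


def simulate(n=20000, admit_fraction=0.2, seed=0):
    """Return summaries for (all applicants, admitted applicants)."""
    rng = random.Random(seed)
    applicants = []
    for _ in range(n):
        gre, gpa = rng.gauss(0, 1), rng.gauss(0, 1)
        # Hypothetical outcome: both predictors matter equally, plus noise.
        outcome = 0.4 * gre + 0.4 * gpa + rng.gauss(0, 1)
        applicants.append((gre, gpa, outcome))
    # Compensatory admission: rank by GRE + GPA and keep the top slice.
    ranked = sorted(applicants, key=lambda a: a[0] + a[1], reverse=True)
    admitted = ranked[: int(n * admit_fraction)]
    return summarize(applicants), summarize(admitted)


if __name__ == "__main__":
    everyone, admitted = simulate()
    print("all applicants:", everyone)
    print("admitted only: ", admitted)
```

With these made-up parameters the GRE-GPA correlation is near zero among all applicants and clearly negative among admittees. Both the single-predictor correlations and R^2 come out smaller in the admitted group, even though the outcome was generated from GRE and GPA by construction.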

This may seem like a nerdy issue. And it is well known to social scientists. But since the people I see talking about the GRE are academics in the biological sciences I thought I would at least highlight this nerdy issue.

As I said above, I do think GRE is going to be dropped as a requirement at many universities for graduate programs. This is going to be a natural experiment, so we’ll be able to test many hypotheses. The paper above ends like so:

…Without a study in which a sample of the applicants-rather than of the selected students is evaluated, it is impossible to tell [the validity of the criteria -RK]. Yet such a study is completely infeasible. Even if rejected applicants are monitored throughout the rest of their working careers, it is impossible to evaluate how they would have done had they been admitted, because the rejection itself constitutes an important “treatment” difference between them and the selected students. The alternative is to admit a sample of the applicant population without using the standard admission variables to select them-preferably, to select at random.

Selection may not be random, but I believe we may be able to test some hypotheses in the next generation by having a set of students take the GRE after admittance and seeing what the later correlation with their outcomes is.

The GRE is useful; range restriction is a thing

The above figure is from Beyond the Threshold Hypothesis: Even Among the Gifted and Top Math/Science Graduate Students, Cognitive Abilities, Vocational Interests, and Lifestyle Preferences Matter for Career Choice, Performance, and Persistence. It shows that even at very high levels of attainment on standardized tests there are differences in life outcome based on variation. The old joke is that results on intelligence tests don’t matter beyond a certain point…that point being whatever your own position is! But these results show that mathematics SAT outcomes at age 13 can still predict a lot of things across a wide range.

From personal experience people outside of psychology are pretty unaware of the power of cognitive aptitude testing. This includes many biologists. I was reminded of the above figure as I read portions of Richard Haier’s The Neuroscience of Intelligence. If you are a biologist curious about the topic, this is a highly recommended book.

The main reason I am posting this is because a friend in academia suggested it might be useful. There has recently been a backlash against the GRE exam, with support from the highest echelons of the science media. Additionally, many researchers in public forums are expressing objections to the GRE very vocally. Naturally this has resulted in counterarguments…but respondents have to be very careful how they couch their disagreement, because they fear being accused of being racist, sexist, or classist. Such accusations might trigger social media mobs, which no one wants to be the target of (and if past experience is any guide, friends and colleagues will stand aside while the witch is virtually burned, hoping to avoid notice).

Because of the request above I finally decided to look at the two papers which are eliciting the current wave of GRE-skepticism, The Limitations of the GRE in Predicting Success in Biomedical Graduate School and Predictors of Student Productivity in Biomedical Graduate School Applications. To my eye they suffer from the same problem as all earlier criticisms: range restriction.

The issue is that if a university is using the GRE and other metrics well as filters for those admitted, then there shouldn’t be that much variation left to be explained by those measures (the outcome being publications or some other important metric which actually leads to the production of science, as opposed to test scores and grades). The two papers above look at those admitted to biomedical programs at UNC and Vanderbilt, while another study looked at UCSF. These are all universities with standards high enough that there are either explicit or implicit cut-off scores, so that many students are removed from the applicant pool immediately (the mean scores are well above the 50th percentile; you can see them in the papers yourself).
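Range restriction is easy to see in a toy simulation. The sketch below assumes a single standardized score that correlates about 0.5 with some later outcome in the full applicant pool, then keeps only applicants above a cut-off at roughly the 70th percentile. Both numbers are invented for illustration and are not estimates from the UNC, Vanderbilt, or UCSF data.

```python
import random
from math import sqrt


def pearson(xs, ys):
    """Plain Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def restricted_correlation(n=20000, true_r=0.5, cutoff_z=0.524, seed=1):
    """Return (correlation in full pool, correlation among those above the cut-off)."""
    rng = random.Random(seed)
    noise_sd = sqrt(1 - true_r ** 2)  # keeps the outcome at unit variance
    pool = []
    for _ in range(n):
        score = rng.gauss(0, 1)
        outcome = true_r * score + noise_sd * rng.gauss(0, 1)
        pool.append((score, outcome))
    # Hypothetical hard cut-off: only scores above cutoff_z are admitted.
    admitted = [pair for pair in pool if pair[0] > cutoff_z]
    return pearson(*zip(*pool)), pearson(*zip(*admitted))


if __name__ == "__main__":
    full_r, admitted_r = restricted_correlation()
    print(f"full pool r = {full_r:.2f}, admitted-only r = {admitted_r:.2f}")
```

Under these assumptions the correlation among admittees comes out well below the correlation in the full pool, although the score is exactly as informative as before. The only thing that changed is who was left in the sample.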

When I was in graduate school I was on a fellowship committee for several years, and I had access to GRE scores and grades. But I didn’t really pay much attention to them because there wasn’t that much range. And to be honest if the student was beyond their first year I didn’t look at all as time went on. In contrast, I did look really closely at the recommendations from their advisors. From talking to others on the committee this seemed typical. Once students were admitted they were judged based on how they were doing in graduate school. And how they were doing in graduate school had to do with research, not their graduate school GPA or what they scored on the GRE to get in.

As an empirical matter I do think that it is likely many universities will follow the University of Michigan in dropping the GRE as a requirement. There will be some resistance within academia, but there is a lot of reluctance to vocally defend the GRE in public, especially from younger faculty who fear the social and professional repercussions (every time a discussion pops up about the GRE I get a lot of Twitter DMs from people who believe in the utility of the GRE but don’t want to be seen defending it in public because they fear becoming the target of accusations of an -ism). My prediction is that after the GRE is gone people will simply rely on other proxies.

If the GRE is not required, but can be taken, then students who do well on the GRE will put that on their application. Sometimes strong students encounter tragedies in their undergraduate years which strongly impact their grade point averages, and very strong GREs can help show admissions committees that they can do the coursework despite their undergraduate record (I’m not positing a hypothetical, but recounting real individuals I’ve known of and seen). It seems cruel to deny these students the chance to submit their test scores. This means that those professors who believe the GRE is valid will show preference to students who take the test and have strong scores (and to be sure, many more care about the GRE when it means someone concretely joining their lab, as opposed to the abstraction of who gets admitted to the department).

More broadly, professors who are taking students will look more at proxies for GRE score, such as undergraduate institution, or the prestige of the people writing recommendation letters. That is, pedigree will matter a lot more. In some places, such as Britain, standardized testing emerged in part as a way to identify strong students from underprivileged backgrounds. These are not the type of students who would ever be able to present a prestigious letter of recommendation. This is a sort of student which still exists (often they are from non-academic backgrounds, being the first to graduate from college in their family; what they lack in polish they compensate for in aptitude, but that takes the right environment to express).

The recourse to other variables besides the GRE score will likely have mixed results at best. Consider the successful campaign to ban asking for job applicants’ criminal records. It turns out that this just increased discrimination against all young black men, because employers could no longer differentiate. In general I think removing the GRE would probably hurt graduates of less prestigious state universities the most (and of course students from East Asia, who tend to have a comparative advantage on standardized tests). I’m pretty sure we’ll see, as the experiment will be run.

Addendum: There are professors at relatively prestigious research universities who had mediocre or sub-par GRE scores. We all know them. To some extent I think many of these individuals almost take pride in the fact that they accomplished so much in science despite negative feedback due to their unimpressive test scores. But remember that we’re talking about trends and averages, not deterministic predictions. Nothing in science is guaranteed, and even if you start at Harvard with undergraduate publications (not first author, but still) in Nature you may not make it that far (I’m thinking of a friend of mine, alas, who picked the wrong lab/project and couldn’t recover).

The coming reign of the Baby Boomer gerontocracy

From Dawn to Decadence: 1500 to the Present: 500 Years of Western Cultural Life is one of my favorite books. It’s one of those works whose breadth and depth are such that I would recommend it to anyone. Jacques Barzun began writing this work when he was 84, and it was published in his 93rd year. Born in 1907, Barzun saw the full efflorescence of 20th century Western culture across much of its span firsthand. When people say that when you age you gain wisdom, surely in the domain of scholarship Barzun’s production in the last few decades of his life would qualify.

But not everyone is Jacques Barzun. If you read Intelligence: All That Matters or peruse some of Eliott Tucker-Drob’s work you will know that cognitive function declines with age beyond your twenties. Different subcomponents may decline at different rates. And, they decline differently in different people (e.g., some people may develop dementia, so their faculties will decline far faster at an earlier age). But, by and large any gains in experience or wisdom are going to be balanced against declines in raw analytic ability, as well as the slow entropic loss of information.

This is not an inconsequential matter. Our governing class is quite old. The average age in Congress may be 55 to 60, but it is almost certainly true that more senior members with more power and authority are older. The president of the United States is 70 years old. If you look at the plots in these figures, there has been a notable drop in intelligence by age 70, though again, it may vary from person to person.

But most important in light of these figures is that the Supreme Court is a lifetime appointment, and many of its members are quite old, and those who are younger anticipate serving until they are quite old. In the mid-1970s Justice William O. Douglas had a stroke and was basically not mentally competent to serve. Because of this fact, and Douglas’ reluctance to retire, his fellow justices basically did not take his vote into account. Three of the justices today are over the age of 70, with Clarence Thomas nearing that age, and two are over the age of 80.

When it comes to Congress, or even the President, there seems to be some sort of institutional support as well as the larger collective vote in the case of Congress, which might buffer the cognitive impact of a gerontocracy. But aside from law clerks Supreme Court justices have to rely on their own individual mental capacities.

The Mormon Church has a gerontocracy among its leadership. Even my most devout friends in the church sometimes found it amusing how old their leadership was, and how quickly they died in succession due to the seniority principle. But the Supreme Court is not the leadership of a relatively small church. It impacts our whole nation. This sort of gerontocracy is no laughing matter.

Will we openly speak of the age issue? I doubt it. Today the Baby Boomers are between the ages of 53 and 71. They are coming into their own as a cohort into the highest reaches of the gerontocracy. If there is any generation with the grace and humility to step aside for the greater good, it will not be this generation.