GRE utility for graduate school and conditioning on the dependent variable

One of the things that seems to be popular in the biological sciences right now is the push to get rid of the GRE as part of the criteria for entrance. Two of the major rationales are that it’s expensive, so it discriminates against candidates of lower socioeconomic status, and that it makes it harder to recruit underrepresented minorities, since on average they score lower on the GRE (many departments have either explicit or implicit GRE cut-offs).

I’m not going to litigate these issues. To be honest I believe it is a fait accompli that many departments will stop using the GRE. This will probably increase diversity in some ways. But I also suspect it will result in a greater bias toward more “polished” candidates since very high GRE scores sometimes indicate to admissions committees that applicants who are otherwise spotty or irregular may have promise.

But, I do want to enter into the record a major problem with the argument that GRE does not correlate with academic success at the graduate level (supported by research). Yes, part of the issue may simply be range restriction. But there is another issue which many biological scientists may not be familiar with.

First, right now this paper from early this year is getting a lot of attention, The Limitations of the GRE in Predicting Success in Biomedical Graduate School.

It was, of course, a political scientist who objected immediately:

This blog post is of interest for those curious: That one weird third variable problem nobody ever mentions: Conditioning on a collider. Basically, it is well known that at many universities graduate admittees exhibit a weak negative association between GRE scores and grade point averages. This was commented on as far back as the 1970s in Science, in Graduate Admission Variables and Future Success:

The standard variables considered in selecting students for graduate school do not correlate well with later measures of the success or attainments of the selected students (1, 2). The low correlations have led at least one investigator (3) to propose abandoning one of these standard variables, the Graduate Record Examination (GRE). The purpose of the present report is to demonstrate that variables that are the basis for admitting students to graduate school must have low correlations with future measures of the success of these students.

What’s going on?

As noted in the paper there are some universities which are first-choices for graduate school in a field to such an extent that they will admit candidates who have very high GPAs and very high GREs. In this case, neither of the criteria will predict success because there is very little variation to generate a correlation. But, at many universities, there is a negative correlation between admittee GRE score and undergraduate GPA. That is because very few applicants will be admitted with both low GRE and GPA scores, but some will be admitted with high GRE scores and low(er) GPAs and others with higher GPAs and low(er) GREs (usually there is still a GPA and GRE floor).
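The selection effect described above can be sketched with a small simulation. Everything in it is an illustrative assumption rather than anything taken from the paper: standardized GRE and GPA scores that correlate 0.3 in the applicant pool, a "success" measure that depends equally on both plus noise, and a committee that admits the top 20% by the simple sum of the two scores.

```python
import math
import random


def pearson(xs, ys):
    """Plain Pearson correlation, written out to avoid extra dependencies."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


def simulate(n_applicants=20000, pool_corr=0.3, admit_fraction=0.2, seed=0):
    """Hypothetical applicant pool; all parameter values are assumptions."""
    rng = random.Random(seed)
    applicants = []
    for _ in range(n_applicants):
        gre = rng.gauss(0, 1)
        # GPA correlates with GRE at pool_corr in the full applicant pool.
        gpa = pool_corr * gre + math.sqrt(1 - pool_corr ** 2) * rng.gauss(0, 1)
        # Assumed outcome: both scores matter equally, plus unrelated noise.
        success = 0.4 * gre + 0.4 * gpa + rng.gauss(0, 1)
        applicants.append((gre, gpa, success))

    # Compensatory admission: a high GRE can offset a lower GPA and vice versa.
    applicants.sort(key=lambda a: a[0] + a[1], reverse=True)
    admitted = applicants[: int(n_applicants * admit_fraction)]

    def summary(group):
        gre, gpa, success = zip(*group)
        return {
            "gre_gpa": pearson(gre, gpa),
            "gre_success": pearson(gre, success),
            "gpa_success": pearson(gpa, success),
        }

    return summary(applicants), summary(admitted)


pool, admitted = simulate()
print("applicant pool:", pool)
print("admitted only: ", admitted)
```

Under these assumptions the GRE-GPA correlation is positive among all applicants but turns negative among admittees, and the correlation of each score with the outcome is noticeably weaker in the admitted group, even though the scores drive the outcome in exactly the same way for everyone.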

Consider the relation:

    \[ R^2 = \frac{r_1^2 + r_2^2 - 2r_1r_2r}{1 - r^2} \]

Where R^2 is the explained proportion of the variance of the variable you want to predict, r_1 and r_2 are the correlations of GRE and GPA, respectively, with that variable of interest, and r is the correlation between GRE and GPA.

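Basically, when selection leaves you with a negative correlation between GRE and GPA among admittees, it also shrinks r_1 and r_2 within that group, so you’re going to get into a situation where the two variables are not going to be able to explain a lot of the variance in what you want to predict.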
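To make the relation above concrete, here is a minimal sketch that just evaluates it. The function name and the input values are hypothetical, chosen only to contrast an unselected applicant pool with a selected group of admittees; they are not estimates from any study.

```python
def multiple_r_squared(r1: float, r2: float, r: float) -> float:
    """R^2 for two predictors, from the relation in the text.

    r1, r2: correlation of each predictor (GRE, GPA) with the outcome.
    r:      correlation between the two predictors.
    """
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    return (r1 ** 2 + r2 ** 2 - 2 * r1 * r2 * r) / (1 - r ** 2)


# Hypothetical applicant pool: decent validities, predictors positively related.
pool_r2 = multiple_r_squared(0.45, 0.45, 0.3)

# Hypothetical admitted group: validities shrunk by selection, predictors
# negatively related because admission traded one score off against the other.
admitted_r2 = multiple_r_squared(0.15, 0.15, -0.4)

print(f"applicant pool R^2: {pool_r2:.3f}")
print(f"admitted only R^2:  {admitted_r2:.3f}")
```

Note that for fixed r_1 and r_2 a negative r on its own would raise R^2, since the cross term changes sign. The drop in explained variance among admittees comes from r_1 and r_2 themselves being much smaller in the selected group.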

This may seem like a nerdy issue. And it is well known to social scientists. But since the people I see talking about the GRE are academics in the biological sciences I thought I would at least highlight this nerdy issue.

As I said above, I do think GRE is going to be dropped as a requirement at many universities for graduate programs. This is going to be a natural experiment, so we’ll be able to test many hypotheses. The paper above ends like so:

…Without a study in which a sample of the applicants-rather than of the selected students is evaluated, it is impossible to tell [the validity of the criteria -RK]. Yet such a study is completely infeasible. Even if rejected applicants are monitored throughout the rest of their working careers, it is impossible to evaluate how they would have done had they been admitted, because the rejection itself constitutes an important “treatment” difference between them and the selected students. The alternative is to admit a sample of the applicant population without using the standard admission variables to select them-preferably, to select at random.

Selection may not be random, but I believe we may be able to test some hypotheses in the next generation by having a set of students take the GRE after they have been admitted and seeing what the correlation with their later outcomes is.

4 thoughts on “GRE utility for graduate school and conditioning on the dependent variable”

  1. A trivial point for the post as a whole, but I am obsessive about trying to get formulae (and explications of them) correct. Brackets indicate my additions.

    Where [R]-squared is the [explained] proportion of the variance of the variable that you want to predict, and r-sub-1-squared and r-sub-2-squared are the correlations between GRE and GPA and that the variable of interest, and r is the correlation between GRE and GPA.

    The correlations between GRE & GPA, and that variable of interest are r-sub-1 & r-sub-2, not their squared values.

  2. Here is the important consideration: In Mao’s Little Red Book, it is written: “Better Red Than Expert”. Now that the Red Tide is seizing control of the STEM parts of the Universities, expect irrelevant intellectual criteria to be replaced by intersectionality.
