The Residency Selection Research Arms Race

Chalk it up to the law of unintended consequences meeting a crappy system: this can’t be the right way for people to spend their time collecting brownie points:

(chart via NRMP’s Charting Outcomes)

(I apparently shared a similar but different chart a couple of years ago as well.)

The prevailing belief is that in the era of pass/fail Step 1, students need to compete on research to stand out. I think that’s probably not quite the reality, since we still have a measurable Step 2. What we’re really seeing is not the need to stand out because of P/F Step 1 but rather the combination of relatively increased available time and greater uncertainty:

Time & Pressure:

With the pressure of Step 1 removed for strong students (who are in no danger of failing), pass/fail Step 1 has enabled many students to spend more time generally polishing their applications. This has been compounded by pass/fail curricula more broadly. Learning enough to pass simply doesn’t take the same amount of time as aiming for a perfect score.

Research is typically felt to be “more important” than other extracurriculars and it’s easy to quantify, but people are certainly also checking boxes for volunteer opportunities and clubs. Everyone seems to have been 1 of 4 co-presidents of their local Magical Interest Group.

Schools went pass/fail for a variety of good reasons, but nature abhors a vacuum. It’s been filled with measurable trash.

Uncertainty:

We already had longstanding competition due to the scarcity of “desirable” residency spots, but another unintended consequence of all these pass/fail components is that they delay knowing how competitive you really are for your desired field.

It used to be that you received a disappointing score on Step 1, and—before clerkships even started—you adjusted your dreams of dermatology.

Now you can’t know whether your Step 2 score will be competitive until you’ve already essentially entered application season, so it makes intuitive sense to do everything else in your power to polish your potential turd if you want to maximize your chances for your desired specialty + location combination.

Step 2 is the new Step 1; it’s just harder to plan a career around.

So what?

Reasonableness at the n=1 level aside, I think this is a problem.

The research slop is largely meaningless. The work itself is mostly garbage, and people are wasting time, money, and resources filling the dregs of pay-to-publish journals. We’re also incentivizing volume over quality, so students pretend that random surveys and opinion pieces are research instead of spending real time doing real work that could have a meaningful impact on other people or actually develop valuable skills. Most of this work is read by no one except AI bots, and the last thing we need is LLMs internalizing a bunch more fake research and observational BS.

Time is zero-sum. The question can therefore never just be: is there value? Despite the mockery and dismissal above, of course there is some value. The question has to be: is this the best use of limited resources to achieve the goals of graduating good doctors?

Building a true meritocracy with holistic application selection is an incredible challenge. Matching people to a limited number of prospective jobs based on both their desires and their aptitudes is truly hard, and the desperation to shine is just as reasonable here as it is for high school students buffing their resumes for college admissions.

Easy Mitigation Steps

We can’t change the overall game, but we can adjust the rules to nudge behavior toward our desired outcomes.

The NRMP needs to at least start reporting the median and not the mean. Even better, we should split application success into quartiles. Students currently see this data and are misled, because long-tail outliers drag the mean up. Many who are “below average” for their field aren’t actually below the median. Break these things down by quartile and maybe then we’ll see how “required” and impactful research really is in most fields.
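The outlier effect is easy to see with a toy example. The numbers below are entirely made up (not real NRMP data): most applicants report a handful of research items while a couple of prolific outliers report dozens, and the mean and median tell very different stories.

```python
import statistics

# Hypothetical research-item counts for a small matched cohort.
# Two prolific outliers (60 and 85 items) sit in the long tail.
counts = [2, 3, 3, 4, 4, 5, 5, 6, 7, 60, 85]

mean = statistics.mean(counts)        # dragged up by the outliers (~16.7)
median = statistics.median(counts)    # what the typical applicant has (5)
quartiles = statistics.quantiles(counts, n=4)  # quartile cut points

print(f"mean={mean:.1f}, median={median}, quartiles={quartiles}")
```

An applicant with 6 items is “below average” against the mean of roughly 16.7, yet sits above the median of 5 and in the top quartile of this toy cohort, which is exactly why reporting means alone misleads students.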

We should probably also limit the ERAS length and separate posters/abstracts/presentations from publications. We need to limit the double-counting that distorts the averages and change the incentives to promote diving deep to do meaningful work.

The Great Filter

After posting that chart, and out of curiosity, I ran a 100% unscientific poll on Twitter with 71 responses:


So, I can’t pretend that the students are wrong to play the game. It just means even more that, as a field, we need to adjust our systems and incentives to drive our actual desired behavior and improve our actual observable outcomes.

Of course, how widespread this type of filtering really is and its actual impact across different specialties would make for a great research project.
