Recalls and Exam Security

Of all the reasons organizations like (but not limited to) the ABR have used as excuses to shy away from remote content or to justify their historical reliance on commercial testing centers, I strongly suspect exam security is the only one that actually matters.

While an individual not cheating on the exam is important for exam integrity, that type of exam security is a relatively straightforward n=1 problem. The real exam security that matters is the security of intellectual property. Never mind that all these medical organizations use questions largely written and initially vetted by volunteers; losing hundreds of proprietary questions in one fell swoop to some industrious malcontent is the real fear.

The recent utter failure of the American Board of Surgery’s virtual testing also makes the point: once people have seen a significant fraction of the questions for a high-stakes exam, it’s back to the drawing board. A doomed effort to administer an exam remotely sets an organization back by months, at the least.

That’s because recalls (people sharing memorized exam content) are a big deal. In fact, news about the universal use of recalled study questions for the radiology oral boards back in 2012 was the driving force behind the creation of the MCQ-only Core and Certifying exams, which are administered only on-site in bespoke testing centers that the ABR itself created in Chicago and Tucson and that sit gathering dust for most of the year.

While recalls are against organizational policy (and thus certainly something an individual should not do), the focus on recalls as a destabilizing force for the fairness of exams is…lame. Literally every important high-stakes exam, including the SAT, MCAT, USMLE, and the various board exams, has engendered a massive test prep industry around offering nearly the same thing: questions written (sometimes by the same people!) to exactly simulate those very same exams. Many, including most of the USMLE prep products, use software that even completely mimics the test software down to the pixel.

But I want to posit a fundamental misunderstanding:

A Really Meaningful Modern Test Shouldn’t Rely on Hiding its Content

Imagine a radiology exam had a question demonstrating a benign hepatic hemangioma on CT or MRI. Imagine a second-order question asking about management (nothing). If that information were put into a series of recalls, it would be meaningless. Every radiology resident should get the question correct because it is relevant to radiology practice and unavoidable during normal training. And if the exam, like the USMLE licensing exams or the various specialty board exams, purports to be a measure of minimal competency for safe independent practice, then all the questions should be relevant to daily practice.

If someone has mastered all of the relevant “recalls,” then they would presumably be ready to practice radiology. It’s okay if the fraction of correctly answered questions is high if we expect the questions to reflect things we really want everyone to know. Mastering all of this exam content is exactly what we want trainees to do!

Recalls only matter in two situations:

  1. Questions designed to differentiate high-level performance, separating the best from the average. In this setting, questions that should be really challenging become easy, throwing off the balance of exam difficulty. It’s precisely because the designers want to give you brainbusters that the element of surprise is key. But in a world where high-quality informational content is at your fingertips, this component of mastery is increasingly irrelevant. The things we really want people to know quickly, such as how to manage a true emergency like a code or how to drop a line, are never going to be satisfactorily assessed with a multiple-choice question. That’s what a real-life training program is for. Performance on an MCQ test is typically only good at predicting future performance on other MCQ tests.
  2. Questions that truly test critical thinking. In this setting, the examinee shortcuts the thought process and arrives directly at the answer, defeating the purpose of the question. While the MCAT contains a fair number of reasoning questions along these lines, this is rarely the case in real life for medical licensing and board exams.

Every question bank for every major test is composed of glorified recalls. Pretending otherwise is silly. If all questions contained important material and the ability to answer them reflected meaningful knowledge and competence, then someone able to memorize the plethora of recalled questions would be exactly what you’d want: qualified.
