ABR totally botches 2017 Core Exam

This email belies how royally the ABR botched the 2017 Core Exam.

What the ABR should have done is what any accountable organization should do when they mess up.

Express regret and acknowledge responsibility
Be transparent and describe the mistake
Give an action plan and steps to correct the problem
Ask for forgiveness

Instead, examinees received the lip-service version.

“Technical issue” is not a satisfactory explanation for the cause.

“Problems with the display of some questions” is not what happened.

“Those questions will NOT be counted toward your exam results” is a grossly incomplete solution.

So what did happen?

Well, the ABR still hasn’t offered a technical explanation. It would seem there was an issue with the mammo module of the exam. If I had to guess, the larger image files in this module probably ran into temporary throttling on the server they were hosted on, so the requests timed out before the images could be transferred to all stations.

But who knows? Apparently not the ABR.

The result of whatever happened is that some examinees in Chicago couldn’t start the exam. Some of them waited nervously in the holding room at the hotel, without explanation, awaiting the shuttle. Others already at the center just had to sit at their desks wondering when they would be able to start. For two hours. Which of course turns the already long day into a hellishly long one with nerves racked, tummy grumbling, caffeine wearing off, etc.

Once the exam began, some test-takers had the mammo questions. Others did not. And some had them added to the end of the test mid-way through, suddenly increasing their day by another hour. In all cases, the ABR has suggested that “those questions” won’t adversely affect their scores. This presumably means that no one in Chicago will have mammo graded. But then why add it to some people’s tests and not others? Why make someone whose test-day is already two hours delayed stay another hour for questions that won’t count? How are they going to reconcile the fact that there are psychological and fatigue effects from this mistake that have nothing to do with the “display of some questions,” and that some of this could have simply been mitigated by upfront transparency?

In the grand scheme of things, given that nobody has ever conditioned the mammo section, I imagine the ABR feels confident saying that those questions not being graded will not have a meaningful impact on the grading of the examination itself. With around 103 total fails last year, one imagines only a fraction of those would even include mammo. Even the vast majority of people affected are probably nowhere near the failing mark, unfair psychological BS notwithstanding.

A follow-up email on June 14 (almost a week later) said this (emphasis mine):

The ABR sincerely regrets the problems with the administration of the Core Exam in Chicago on Thursday, June 8, 2017. We are taking this matter very seriously and are working hard to identify the sources of the problem and the impact on affected candidates.

We don’t yet have all the information needed to determine how many candidates have been affected and to what extent. Staff worked very hard over the weekend to ensure that the Core exams administered in Chicago and Tucson this week would go smoothly, and we have had no issues.

I want to emphasize that any candidate impacted by last Thursday’s difficulties with the breast imaging content will not have those items counted against their scores. We don’t expect anyone to have problems qualifying for MQSA.

How can you not know who was affected? The nature of this problem should have made it obvious who was affected during the examination itself. What they mean is that—despite getting into the business of test administration—the ABR never anticipated technical difficulties, had no meaningful system in place for troubleshooting or identifying issues, and had no contingency plans formed to deal with this eventuality.

Also missing: acknowledgment of any of the issues outlined above outside of the “difficulties with the breast imaging content.”

And: you don’t “expect” problems with MQSA? The MQSA requirements only state that the radiologist be board-certified, not that the boards actually contain mammography. Of course this shouldn’t be a problem. But if you anticipate that there could be an issue, perhaps you should get some clarification before dropping a half-baked position statement.

Let’s go back to the underlying arguments for how we got here in the first place.

From the ABR FAQ:

Why do I have to go to Chicago or Tucson instead of a local testing center for diagnostic radiology exams?
With the transition to more image-rich exams with advanced item types, the ABR has built two exam centers in Chicago and Tucson to administer all diagnostic radiology exams. At this time, commercial test centers do not have the technology or means available to support these kinds of exams.

More detail from the 2014 Core Exam FAQ & misconceptions presentation:

Why can’t I just go to a PearsonVUE center to take this test?
• Modular content difficult for PV
• PV can’t handle case structure on their software
• PV monitors aren’t calibrated, can’t control lighting
• Aim: to have distributed exam. We are working on system to implement

So, now in 2017, we can firmly debunk these arguments.

1. Modular Content

The content is not bizarrely or uniquely modular. First, this doesn’t really matter (even the very long Step exams are broken up into multiple modules). In years past, the modules for different sections were given in succession (breast, then cardiac, then GI) though lumped seamlessly into one large mega-module as you progress through the day. This year the modules were jumbled and topics jumped around. Thus there are just two days of relatively unmodular content.

2. PV can’t handle case structure on their software

This is only plausible if the ABR’s software is particularly poorly written. The USMLE also has multiple different case structure formats, including videos, images, and interactive fake physical exams, not to mention Step 3’s ludicrous choose-your-own-adventure CCS program. If we need to get rid of the two or three “drag the X” format questions per test in order to do a disseminated exam, I think we can all agree the collective radiology hivemind would acquiesce.

3. PV monitors aren’t calibrated, can’t control lighting

After this year’s difficulties, one can easily argue that there is no point having a “well-calibrated” monitor that can’t even show the carefully curated “Angoff-validated” questions in the first place. I’ll admit, the lighting is nicely dim. As a practical matter, few images are of sufficient quality for the lighting to be a plausible limiting factor. Most of the MR looks photocopied from books published in the 1980s. Residents take the ACR in-service exam in droves every year. The criticism there has always been the exam itself, not the testing software or the ambiance of the venue.

4. Aim: to have distributed exam. We are working on system to implement

2018 sounds like a great year to start.

The costs of the ABR’s exam paradigm are absurd

There are almost 1200 graduating radiology residents every year (1149 took the Core in 2016; 91% passed). Every class contributes $640 per person per year for a total of $3 million per graduating class over the course of a four-year residency ($4.6 million total when including the extra two years to take the Certifying Examination). That also means that the ABR rakes in around $750k per class per year and $3 million per year from residents alone. Not to mention the $340/year for every single radiologist in the MOC phase. Or the $3000+ to take subspecialty exams like neuro or VIR.
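For anyone who wants to check the math, here is a rough back-of-the-envelope sketch in Python. It uses only the figures quoted above; the function name and the assumption that every resident pays the same flat fee every year are mine, for illustration only.

```python
# Back-of-the-envelope check of the resident fee totals quoted in the post.
# Assumption (mine): every resident pays the same flat annual fee each year.

ANNUAL_FEE_USD = 640          # per resident per year, from the post
RESIDENCY_YEARS = 4           # years of fees before the Core
YEARS_THROUGH_CERTIFYING = 6  # residency plus the two extra years


def class_fees(residents: int, years: int = 1) -> int:
    """Total fees paid by one graduating class over `years` years."""
    return residents * ANNUAL_FEE_USD * years


# The post rounds the cohort to "almost 1200"; 1149 actually sat the 2016 Core.
per_class_per_year_rounded = class_fees(1200)
per_class_per_year_actual = class_fees(1149)
per_class_residency = class_fees(1200, RESIDENCY_YEARS)
per_class_through_certifying = class_fees(1200, YEARS_THROUGH_CERTIFYING)
# Four classes are in residency at any one time, so one year of resident
# fees across all classes equals one class's four-year total.
all_residents_per_year = class_fees(1200) * RESIDENCY_YEARS

print(f"Per class per year (1200 residents): ${per_class_per_year_rounded:,}")
print(f"Per class per year (1149 residents): ${per_class_per_year_actual:,}")
print(f"Per class over residency:            ${per_class_residency:,}")
print(f"Per class through Certifying:        ${per_class_through_certifying:,}")
print(f"All residents, per year:             ${all_residents_per_year:,}")
```

The rounded and actual cohort sizes bracket the "around $750k" figure, and the four-year and six-year totals land on roughly $3 million and $4.6 million.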

To reiterate: the class that just took this failed exam gave the ABR on the order of $3,000,000 to take this test. This figure doesn’t include the additional costs for the honor of traveling across the country to spend two days in a hotel to actually take the exam (at least another $500,000 per year).

If you can’t get photos and radio buttons working consistently on an operating budget of millions, then you’re doing it wrong.

Having a decent test is an important noninterpretive skill

When the ABR decided to start from scratch and write a new exclusively computer-based exam, they chose to become not just test-writers but test-administrators. No one forced the ABR to write a test that no high-volume testing center could implement. When you take over something this important, you have to do it right, and you should be completely accountable for your performance. Transparency should not be optional. The way the Core and Certifying exams were created, graded, and handled is a poorly conceived and unnecessarily obfuscated embarrassment (e.g. why does the Certifying exam even exist?).

You don’t just say things like (NB: they didn’t actually say this):

we had a mysterious technical difficulty but also we totally fixed it we promise though actually we don’t know what happened or exactly to whom it happened but also don’t worry about those questions they won’t count for anyone because for real we don’t know who had them or didn’t have them or if they had them how pretty they looked so trust us also by the way your annual fee is due.

Since noninterpretive skills are an important part of the Core exam, let’s just say that a 6% failure rate for successful Core exam administrations is a far cry from Six Sigma.

8 Comments

Tyler 07.11.17

Excellent summary. I agree with everything you wrote.

The other thing I would add is: How can the ABR say that these radiologists are qualified to read mammograms if they haven’t assessed their knowledge on mammograms? Are we just okay with a significant proportion of the radiologists (but we don’t know WHICH radiologists) graduating in 2018 not having had an empirical appraisal to certify them as competent in breast imaging?

Ben (Author) 07.12.17

The ironic thing based on the grading is that your point has essentially been true for every section of every Core exam (though this is certainly a more extreme and embarrassing example). The conditioning threshold for individual sections is so far below the overall exam passing score that not a single person has ever truly “failed” a section who hasn’t failed the whole exam.

So in practice, you can apparently do pretty terribly on individual sections and still pass the exam. Barely “passing” a couple small sections of miscellaneously-chosen multiple choice questions probably doesn’t correlate well with the day-to-day safe practice of radiology.

bob 08.23.17

mammo isn’t science anyways
