ABR totally botches 2017 Core Exam

This email belies how royally the ABR botched the 2017 Core Exam.

What the ABR should have done is what any accountable organization should do when they mess up.

  • Express regret and acknowledge responsibility
  • Be transparent and describe the mistake
  • Give an action plan and steps to correct the problem
  • Ask for forgiveness

Instead, examinees received the lip service version.

“Technical issue” is not a satisfactory explanation for the cause.

“Problems with the display of some questions” is not what happened.

“Those questions will NOT be counted toward your exam results” is a grossly incomplete solution.

So what did happen?

Well, the ABR still hasn’t offered a technical explanation. It would seem there was an issue with the mammo module of the exam. If I had to guess, the larger image files in this module probably exceeded a temporary throttle on the server they were hosted on, so the requests timed out before the images could be transferred to all stations.

But who knows? Apparently not the ABR.

The result of whatever happened is that some examinees in Chicago couldn’t start the exam. Some of them waited nervously in the holding room at the hotel, without explanation, awaiting the shuttle. Others already at the center just had to sit at their desks wondering when they would be able to start. For two hours. Which of course turns the already long day into a hellishly long one with nerves racked, tummy grumbling, caffeine wearing off, etc.

Once the exam began, some test-takers had the mammo questions. Others did not. And some had them added to the end of the test mid-way through, suddenly increasing their day by another hour. In all cases, the ABR has suggested that “those questions” won’t adversely affect their scores. This presumably means that no one in Chicago will have mammo graded. But then why add it to some people’s tests and not others? Why make someone whose test-day is already two hours delayed stay another hour for questions that won’t count? How are they going to reconcile the fact that there are psychological and fatigue effects from this mistake that have nothing to do with the “display of some questions,” and that some of this could have simply been mitigated by upfront transparency?

In the grand scheme of things, given that nobody has ever conditioned the mammo section, I imagine the ABR feels confident saying that those questions not being graded will not have a meaningful impact on the grading of the examination itself. With around 103 total fails last year, one imagines only a fraction of those would even include mammo. Even the vast majority of people affected are probably nowhere near the failing mark, unfair psychological BS notwithstanding.

A follow-up email on June 14 (almost a week later) said this (emphasis mine):

The ABR sincerely regrets the problems with the administration of the Core Exam in Chicago on Thursday, June 8, 2017. We are taking this matter very seriously and are working hard to identify the sources of the problem and the impact on affected candidates.

We don’t yet have all the information needed to determine how many candidates have been affected and to what extent. Staff worked very hard over the weekend to ensure that the Core exams administered in Chicago and Tucson this week would go smoothly, and we have had no issues.

I want to emphasize that any candidate impacted by last Thursday’s difficulties with the breast imaging content will not have those items counted against their scores. We don’t expect anyone to have problems qualifying for MQSA.

How can you not know who was affected? The nature of this problem should have made it obvious who was affected during the examination itself. What they mean is that—despite getting into the business of test administration—the ABR never anticipated technical difficulties, has no meaningful system in place for troubleshooting or identifying issues, and had no contingency plans formed to deal with this eventuality.

Also missing: acknowledgment of any of the issues outlined above outside of the “difficulties with the breast imaging content.”

And: you don’t “expect” problems with MQSA? The MQSA requirements only state that the radiologist be board-certified, not that the boards actually contain mammography. Of course this shouldn’t be a problem. But if you anticipate that there could be an issue, perhaps you should get some clarification before dropping a half-baked position statement.


Let’s go back to the underlying arguments for how we got here in the first place.

From the ABR FAQ:

Why do I have to go to Chicago or Tucson instead of a local testing center for diagnostic radiology exams?
With the transition to more image-rich exams with advanced item types, the ABR has built two exam centers in Chicago and Tucson to administer all diagnostic radiology exams. At this time, commercial test centers do not have the technology or means available to support these kinds of exams.

More detail from the 2014 Core Exam FAQ & misconceptions presentation:

Why can’t I just go to a PearsonVUE center to take this test?
• Modular content difficult for PV
• PV can’t handle case structure on their software
• PV monitors aren’t calibrated, can’t control lighting
• Aim: to have distributed exam. We are working on system to implement

So, now in 2017, we can firmly debunk these arguments:

1. Modular Content

The content is not bizarrely or uniquely modular. First, this doesn’t really matter (even the very long Step exams are broken up into multiple modules). In years past, the modules for different sections were given in succession (breast, then cardiac, then GI) though lumped seamlessly into one large mega-module as you progress through the day. This year the modules were jumbled and topics jumped around. Thus there are just two days of relatively unmodular content.

2. PV can’t handle case structure on their software

This is only plausible if the ABR’s software is particularly poorly written. The USMLE also has multiple different case structure formats, including videos, images, and interactive fake physical exams, not to mention Step 3’s ludicrous choose-your-own-adventure CCS program. If we need to get rid of the two or three “drag the X” format questions per test in order to do a disseminated exam, I think we can all agree the collective radiology hivemind would acquiesce.

3. PV monitors aren’t calibrated, can’t control lighting

After this year’s difficulties, one can easily argue that there is no point having a “well-calibrated” monitor that can’t even show the carefully curated “Angoff-validated” questions in the first place. I’ll admit, the lighting is nicely dim. As a practical matter, few images are of sufficient quality for the lighting to be a plausible limiting factor. Most of the MR looks photocopied from books published in the 1980s. Residents take the ACR in-service exam in droves every year. The criticism there has always been the exam itself, not the testing software or the ambiance of the venue.

4. Aim: to have distributed exam. We are working on system to implement

2018 sounds like a great year to start.


The costs of the ABR’s exam paradigm are absurd

There are almost 1200 graduating radiology residents every year (1149 took the core in 2016; 91% passed). Every class contributes $640 per person per year for a total of $3 million per graduating class over the course of a four-year residency ($4.6 million total when including the extra two years to take the Certifying Examination). That also means that the ABR rakes in around $750k per class per year and $3 million per year from residents alone. Not to mention the $340/year for every single radiologist in the MOC phase. Or the $3000+ to take subspecialty exams like neuro or VIR.
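For anyone who wants to check the math, here is a back-of-the-envelope sketch using only the figures quoted above. The round class size of 1200 and the function names are my own simplifications, not anything the ABR publishes.

```python
# Rough fee arithmetic using the figures quoted in the paragraph above.
# Assumption: a round class size of 1200 ("almost 1200" in the text).
RESIDENTS_PER_CLASS = 1200
ANNUAL_FEE = 640               # dollars per resident per year
RESIDENCY_YEARS = 4            # years of residency
YEARS_THROUGH_CERTIFYING = 6   # residency plus the two extra years


def fees_per_class_per_year(residents=RESIDENTS_PER_CLASS, fee=ANNUAL_FEE):
    """Dollars one graduating class pays in a single year."""
    return residents * fee


def fees_per_class(years, residents=RESIDENTS_PER_CLASS, fee=ANNUAL_FEE):
    """Dollars one graduating class pays over the given number of years."""
    return years * fees_per_class_per_year(residents, fee)


def fees_per_year_from_residents(classes_in_training=RESIDENCY_YEARS):
    """Dollars per year from all classes in residency at the same time."""
    return classes_in_training * fees_per_class_per_year()


print(f"Per class per year:        ${fees_per_class_per_year():,}")
print(f"Per class, residency:      ${fees_per_class(RESIDENCY_YEARS):,}")
print(f"Per class, through Cert:   ${fees_per_class(YEARS_THROUGH_CERTIFYING):,}")
print(f"Per year, all residents:   ${fees_per_year_from_residents():,}")
```

The exact products come out slightly above the rounded numbers in the text ($768,000 versus "around $750k"), which is what you would expect from rounding the class size up to 1200.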

To reiterate: the class that just took this failed exam gave the ABR on the order of $3,000,000 to take this test. This figure doesn’t include the additional costs for the honor of traveling across the country to spend two days in a hotel to actually take the exam (at least another $500,000 per year).

If you can’t get photos and radio buttons working consistently on an operating budget of millions, then you’re doing it wrong.


Having a decent test is an important noninterpretive skill

When the ABR decided to start from scratch and write a new exclusively computer-based exam, they chose to become not just test-writers but test-administrators. No one forced the ABR to write a test that no high-volume testing center could implement. When you take over something this important, you have to do it right, and you should be completely accountable for your performance. Transparency should not be optional. The way the Core and Certifying exams were created, graded, and handled is a poorly conceived and unnecessarily obfuscated embarrassment (e.g. why does the Certifying exam even exist?).

You don’t just say things like:

we had a mysterious technical difficulty but also we totally fixed it we promise though actually we don’t know what happened or exactly to whom it happened but also don’t worry about those questions they won’t count for anyone because for real we don’t know who had them or didn’t have them or if they had them how pretty they looked so trust us also by the way your annual fee is due.

Since noninterpretive skills are an important part of the Core exam, let’s just say that a 6% failure rate for successful Core exam administrations is a far cry from Six Sigma.

My new book: Medical Student Loans

My second book, Medical Student Loans: A Comprehensive Guide, is now out. It’s a novella-length treatment of student loans specifically for physicians and written to cover the topic for all levels: premeds, medical students, residents, and attendings. It’s especially helpful for graduating MS4s and by its nature also covers important basic financial literacy in a hopefully non-threatening way.

In other words, I hope you like it.

Despite years of writing about student loans on this site, it was a ton of work to put this together and finally get it out to the world. To celebrate, I’ve made it completely free to download from Amazon until the end of Sunday, June 25.

MSL will also be part of the Kindle Unlimited program for the next three months. You can get a 30-day free trial if you need another way to read it for free.

Consider it your first few hours of CME.

Standard Ebooks

Standard Ebooks is an awesome, long-overdue idea:

Standard Ebooks is a volunteer driven, not-for-profit project that produces lovingly formatted, open source, and free public domain ebooks.

Ebook projects like Project Gutenberg transcribe ebooks and make them available for the widest number of reading devices. Standard Ebooks takes ebooks from sources like Project Gutenberg, formats and typesets them using a carefully designed and professional-grade style guide, lightly modernizes them, fully proofreads and corrects them, and then builds them to take advantage of state-of-the-art ereader and browser technology.

What a great project.

Explanations for the 2017-2018 Official Step 2 CK Practice Questions

The updated 2017-18 official “USMLE Step 2 CK Sample Test Questions” PDF was released in May and is available here.

The PDF set is completely unchanged from last year. You can read the complete explanations for last year’s set here.


As for the updated multimedia questions found only in the online version:

Block 1

7. A – Classic Moro reflex, entirely expected and normal until it disappears around age 4 months. If you have never seen a newborn before, also note that the mom is concerned about delayed milestones at two weeks of age, which is a red flag for BS: babies aren’t even smiling socially yet by two weeks.

Block 2

3. D – Pill-rolling resting tremor of Parkinson’s disease secondary to loss of dopamine neurons in the substantia nigra.

18. A – I’m going to point out that a normal healthy kid with no cardiac history or symptoms and no family history of sudden cardiac death for a pre-sports physical is probably going to have a benign exam no matter what you think you hear. HOCM is what you want to exclude theoretically, but here we don’t have a real systolic murmur, just a little vibratory flow murmur at LLSB.

33. E – This one is a bit silly. The lung exam is normal outside of the super common basilar crackles. For everything except PE, you would expect to hear a more impressive auscultation abnormality. But for this question: B and C take longer than 3 days. With D we would expect fever, productive cough, etc. Bronchitis would be possible, but it would still more often have at least a productive cough, if not fever. PE, on the other hand, classically has a nonproductive cough, hypoxemia, and tachycardia. All three are present. And then they mention her med: OCPs, which are an important predisposing factor for PE in young women for whom it is otherwise a rare entity. Young lady on OCPs is a classic set-up for an STD question (who needs condoms?) or a PE question, one of the two.

Block 3

12.1 D – Statistical significance (a low p-value) does not equal clinical significance. A favorite teaching point when it comes to interpreting literature.

12.2 C – A & D are conjectures: the kind of statements people drop inappropriately in the conclusion of a weak paper to make it sound important. E is an exclusion criterion. B is the opposite: including 0 is equivalent to something not being significantly different.

Navient is still lying to borrowers despite lawsuit

Unsurprisingly, Navient is still lying to borrowers despite the ongoing lawsuit (for misleading borrowers) from the Consumer Financial Protection Bureau.

I was talking to a fellow resident last week. She has almost a half million dollars in student loans from medical school and has been repaying in IBR. She recently got married, and her husband, also a resident, thankfully doesn’t have any student loans himself. Unfortunately, this also means that the next time she recertifies her income, her payments are going to basically double thanks to the addition of his income. Once he becomes an attending, they would go up even further.

Public service loan forgiveness is both possible and desirable for her given her long training and very high loan burden: it would result in more money forgiven than she borrowed in the first place (due to negative amortization during residency). Her goal should thus be to minimize her payments until then. I recommended she switch to Pay As You Earn (PAYE) to reduce her payments now and file her taxes separately from her husband next year (they weren’t married for tax purposes this year).

And here comes the lie: she called Navient for guidance, and the customer service representative told her she was ineligible for PAYE because she held a loan from 2009.

Anyone who knows anything about federal repayment programs or has the capacity to do a simple Google search would know that the limitations for PAYE eligibility have nothing to do with the year 2009. They are the following:

To qualify for the PAYE Plan you must also be a new borrower as of Oct. 1, 2007, and must have received a disbursement of a Direct Loan on or after Oct. 1, 2011. You’re a new borrower if you had no outstanding balance on a Direct Loan or FFEL Program loan when you received a Direct Loan or FFEL Program loan on or after Oct. 1, 2007.

So, nothing before 2007 and at least something after 2011. In other words, 2009 was a great vintage for PAYE. Full bodied and expensive with more than a hint of scut.
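If it helps to see the two date conditions spelled out, here is a minimal sketch of the check as quoted above. It is not the full eligibility determination: the `Loan` class and the function name are hypothetical, and it takes my "nothing before 2007" shortcut by treating any Direct or FFEL loan disbursed before Oct. 1, 2007 as an outstanding balance (a loan paid off before that date wouldn't actually count against you).

```python
from dataclasses import dataclass
from datetime import date

# Cutoff dates taken from the eligibility language quoted above.
NEW_BORROWER_CUTOFF = date(2007, 10, 1)
DIRECT_LOAN_CUTOFF = date(2011, 10, 1)


@dataclass
class Loan:
    """Hypothetical record for illustration only."""
    program: str     # "Direct" or "FFEL"
    disbursed: date  # disbursement date


def meets_paye_date_rules(loans):
    """Check only the two date conditions, not the rest of PAYE eligibility.

    Simplifying assumption: any Direct/FFEL loan disbursed before the
    2007 cutoff is treated as an outstanding balance.
    """
    federal = [l for l in loans if l.program in ("Direct", "FFEL")]
    new_borrower = all(l.disbursed >= NEW_BORROWER_CUTOFF for l in federal)
    recent_direct = any(
        l.program == "Direct" and l.disbursed >= DIRECT_LOAN_CUTOFF
        for l in loans
    )
    return new_borrower and recent_direct


# A borrower with a 2009 loan and a later Direct Loan passes both date tests.
example = [Loan("FFEL", date(2009, 8, 15)), Loan("Direct", date(2012, 8, 15))]
print(meets_paye_date_rules(example))
```

The point of the example is simply that a 2009 loan, on its own, disqualifies no one.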

From the Consumerist:

The lawsuit alleges that for years Navient engaged in a series of illegal and deceptive practices, including providing borrowers with incorrect information, processing payments erroneously, and failing to address customers’ complaints.

Sounds familiar.

As always, it’s difficult to know if the servicer customer service reps are willfully ignorant or malicious, but they are routinely wrong (and seemingly proactively so).

My advice to anyone ever calling a servicer with a question is to already know the answer before you ask it. Find some official government verbiage to back up your interpretation. You should be looking for confirmation, not advice, and if the answer you receive isn’t the answer you’re expecting, find out exactly why. If the person you’re talking to doesn’t know, then get them to transfer you to somebody else.