Members of the radiology department at Johns Hopkins Medicine in Baltimore have tried two simple means of reducing recall rates in screening mammography and found both effective. What’s more, neither intervention hurt the team’s performance on cancer detection, and both are replicable by other breast-imaging operations.
Lisa Mullen, MD, and colleagues describe their success in a study published online July 29 in Academic Radiology.
The team first established baseline performance as evidenced over a three-year period. They focused on screening exams using full-field digital mammography (FFDM) and digital breast tomosynthesis (DBT).
The first intervention was discussion-based and primarily aimed at raising awareness. For seven months, each of the 10 participating breast imagers reviewed their own recalled cases weekly, including results from biopsy and clinical evaluation, and compared their individual performance metrics with those of the group. The group talked over perceptions of recall and performance, including “the most frequent individual reasons for recall and individual fears that prompt recall,” the authors write.
The second intervention, which also ran for seven months, tested consensus double reading of all recalls: two radiologists had to agree that recall was needed. If the second reader disagreed with the first reader's judgment that the finding warranted recall, a third reader was asked to provide the tie-breaking decision. The entire process, including release of the final radiology report, had to be completed within 24 hours of first interpretation so as not to inconvenience or concern the patient or referrer.
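The consensus workflow described above can be sketched as a small decision rule. This is only an illustration of the logic as the article reports it, not the study team's software; the names `consensus_recall`, `report_on_time` and `REPORT_DEADLINE` are hypothetical.

```python
from datetime import datetime, timedelta
from typing import Optional

# The article states the whole process had to finish within 24 hours
# of the first interpretation.
REPORT_DEADLINE = timedelta(hours=24)


def consensus_recall(first: bool,
                     second: Optional[bool] = None,
                     third: Optional[bool] = None) -> bool:
    """Return True if the patient is recalled under consensus reading.

    Each argument is one reader's opinion that the finding warrants recall.
    """
    if not first:
        # No recall was proposed, so there is nothing to review.
        return False
    if second is None:
        raise ValueError("every proposed recall needs a second reader")
    if second:
        # Both readers agree: recall.
        return True
    if third is None:
        raise ValueError("a disagreement needs a third, tie-breaking reader")
    # The third reader breaks the tie.
    return third


def report_on_time(first_read_at: datetime, report_released_at: datetime) -> bool:
    """Check the 24-hour turnaround between first read and final report."""
    return report_released_at - first_read_at <= REPORT_DEADLINE
```

For instance, `consensus_recall(True, False, False)` returns `False`: the second reader disagreed and the third reader sided against recall.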
The team’s key findings:
- The baseline recall rate, cancer detection rate and positive predictive value 1 (PPV1, the percentage of positive screening examinations resulting in a tissue diagnosis of cancer) were 11.1 percent, 3.8/1,000 and 3.4 percent, respectively, for FFDM, and 7.6 percent, 4.8/1,000 and 6.0 percent, respectively, for DBT.
- Recall rates decreased significantly to 9.2 percent for FFDM and to 6.6 percent for DBT after the first intervention promoting awareness, as well as to 9.9 percent for FFDM after the second intervention implementing group consensus.
- PPV1 increased significantly to 5.7 percent for FFDM and to 9.0 percent for DBT after the second intervention.
The interventions did not significantly change cancer detection rates, the authors report.
Also, the time participating radiologists spent consulting for each recall averaged a tidy 2.3 minutes.
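As a rough cross-check of the figures above, the three metrics are linked by simple arithmetic: if every detected cancer comes from a recalled exam, PPV1 is approximately the cancer detection rate divided by the recall rate. The sketch below applies that approximation to the reported rates; the function names are hypothetical, and the study's own counts are not given in this article.

```python
def ppv1_from_rates(recall_rate_pct: float, cdr_per_1000: float) -> float:
    """Approximate PPV1 (in percent) from recall rate (%) and cancer
    detection rate (per 1,000 screens).

    Assumes every screen-detected cancer came from a recalled exam.
    """
    return 100 * (cdr_per_1000 / 1000) / (recall_rate_pct / 100)


def relative_change_pct(before: float, after: float) -> float:
    """Relative change from `before` to `after`, in percent."""
    return 100 * (after - before) / before


# Baseline figures reported in the article.
print(round(ppv1_from_rates(11.1, 3.8), 1))  # FFDM → 3.4
print(round(ppv1_from_rates(7.6, 4.8), 1))   # DBT  → 6.3

# Relative drop in recall rate after the awareness intervention.
print(round(relative_change_pct(11.1, 9.2), 1))  # FFDM → -17.1
print(round(relative_change_pct(7.6, 6.6), 1))   # DBT  → -13.2
```

The FFDM estimate matches the reported 3.4 percent. The DBT estimate of about 6.3 percent sits slightly above the reported 6.0 percent, which may reflect rounding in the published rates or how cases were counted.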
In their discussion, Mullen and colleagues highlight that the first intervention achieved the more substantial reduction in recalls for both FFDM and DBT.
“This unexpected result suggests that the motivated radiologist could invest a small amount of time each week reviewing his or her own recalls and thereby improve his or her personal performance metrics,” they write. “This result may also suggest that there was marginal remaining opportunity after the awareness phase, therefore decreasing the additional opportunity available for improvement with consensus recall.”
Meanwhile, they note, the consensus recalls increased positive predictive values for both imaging modalities, “indicating that more patients were appropriately recalled with consensus review.”
The authors acknowledge several limitations in their study design, including small sample sizes, the growing use of DBT relative to FFDM and an academic setting staffed by breast subspecialists, which may affect replicability.
Mullen et al. conclude that decreasing recall rates while maintaining critical quality metrics “enhances value and improves patient-centered care. Simultaneously, this approach saves healthcare expenses, time and resources while decreasing ‘false alarms’ and the resultant distress for women.”