Deep learning triages mammograms, reduces radiologists' workload nearly 20%

Deep learning can be used to triage cancer-free mammograms and improve the efficiency of radiologists, according to an August 6 feasibility study published in Radiology.

In the simulated triage workflow, radiologists read only the scans that scored above the cancer-free threshold, which reduced their workload by nearly 20% and also improved their specificity, wrote Adam Yala, of the Massachusetts Institute of Technology in Cambridge, and colleagues.

Retrospectively tested on mammograms from more than 7,000 women, the model showed potential for triaging exams across patient subgroups.

“Our model was discriminative across all age groups, races, and breast density categories, suggesting the model may be widely applicable to diverse patient populations,” Yala and colleagues wrote.

Many approaches have been tested to improve radiologists’ reading performance and efficiency, the group noted. Traditional computer-aided detection (CAD) approaches, which aim to improve sensitivity, have fallen short in clinical practice. Another approach, double reading, has improved reader performance but hampers workflow efficiency.

“This work takes a substantial departure from prior work on CAD,” the group wrote. “Instead of annotating images to draw added attention to potentially malignant findings (to improve sensitivity), we propose to triage cancer-free mammograms from the workflow to improve both specificity and efficiency.”

Yala and colleagues therefore trained their model to predict cancer from full-resolution mammograms and simulated a scenario in which exams scoring below a chosen high-sensitivity threshold were deemed negative, while exams scoring above it were sent to breast imagers to be read.
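The triage rule described here can be illustrated with a minimal Python sketch. It is not the authors' code: the function names `pick_threshold` and `triage`, the `min_sensitivity` parameter, and the idea of setting the cutoff from a labeled set of scored exams are assumptions made for illustration only.

```python
import math
from typing import List, Sequence, Tuple


def pick_threshold(scores: Sequence[float], labels: Sequence[int],
                   min_sensitivity: float) -> float:
    """Return the highest cutoff that keeps at least `min_sensitivity`
    of the known cancers (label 1) at or above the cutoff.

    Hypothetical helper: the article does not say how the cutoff was set.
    """
    if not 0 < min_sensitivity <= 1:
        raise ValueError("min_sensitivity must be in (0, 1]")
    cancer_scores = sorted(
        (s for s, y in zip(scores, labels) if y == 1), reverse=True
    )
    if not cancer_scores:
        raise ValueError("need at least one cancer case to set a threshold")
    # Number of cancers that must remain in the radiologists' worklist.
    needed = max(1, math.ceil(min_sensitivity * len(cancer_scores)))
    return cancer_scores[needed - 1]


def triage(scores: Sequence[float],
           threshold: float) -> Tuple[List[int], List[int]]:
    """Split exam indices into those sent to a radiologist (score at or
    above the threshold) and those deemed negative without a read."""
    to_read = [i for i, s in enumerate(scores) if s >= threshold]
    auto_negative = [i for i, s in enumerate(scores) if s < threshold]
    return to_read, auto_negative
```

In this sketch, a stricter `min_sensitivity` pushes the cutoff lower, so fewer exams are removed from the worklist; a looser one removes more exams but risks dropping cancers.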

For their study, the team gathered 223,103 screening mammograms performed in 66,661 women from January 2009 to December 2016, along with cancer outcomes. Patients were split into a training group (56,831 women), a validation group (7,021), and a testing cohort (7,176). Mammograms were then separated into a training set of 212,272 scans, a validation set of 25,999, and a test set of 26,540.

Results showed the deep learning model improved specificity from 93.5% without assistance to 94.3% when using the platform. Radiologists’ sensitivity was noninferior when using the method (90.6% without the model vs. 90.1% with it). In total, the method reduced readers’ workload by 19.3%.
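To make the arithmetic behind these three figures concrete, the following Python sketch compares radiologist reads alone with a simulated triage in which any exam scoring below the cutoff is counted as negative. The function names, the input layout, and the toy numbers in the usage note are assumptions for illustration, not the study's data or code.

```python
from typing import Dict, Sequence, Tuple


def sensitivity_specificity(calls: Sequence[bool],
                            labels: Sequence[int]) -> Tuple[float, float]:
    """Sensitivity and specificity of positive/negative calls against
    cancer outcomes (label 1 = cancer, 0 = no cancer)."""
    tp = sum(1 for c, y in zip(calls, labels) if c and y == 1)
    fn = sum(1 for c, y in zip(calls, labels) if not c and y == 1)
    tn = sum(1 for c, y in zip(calls, labels) if not c and y == 0)
    fp = sum(1 for c, y in zip(calls, labels) if c and y == 0)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one cancer and one non-cancer exam")
    return tp / (tp + fn), tn / (tn + fp)


def simulate_triage(scores: Sequence[float], reader_calls: Sequence[bool],
                    labels: Sequence[int], threshold: float) -> Dict[str, float]:
    """Compare reader-only results with a simulated triage workflow.

    Under triage, an exam is positive only if the model score is at or
    above `threshold` AND the radiologist called it positive.
    """
    triaged_calls = [c and s >= threshold for c, s in zip(reader_calls, scores)]
    base_sens, base_spec = sensitivity_specificity(reader_calls, labels)
    tri_sens, tri_spec = sensitivity_specificity(triaged_calls, labels)
    skipped = sum(1 for s in scores if s < threshold)
    return {
        "reader_sensitivity": base_sens,
        "reader_specificity": base_spec,
        "triage_sensitivity": tri_sens,
        "triage_specificity": tri_spec,
        "workload_reduction": skipped / len(scores),
    }
```

Because triage can only turn positive calls into negative ones, specificity can stay the same or rise and sensitivity can stay the same or fall, which is why the study tested sensitivity for noninferiority. With six toy exams scored `[0.05, 0.10, 0.20, 0.40, 0.60, 0.90]`, one reader false positive on the lowest-scoring exam, and a cutoff of 0.40, the sketch reports specificity rising from 0.75 to 1.0, unchanged sensitivity, and a workload reduction of 0.5.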

“This work is a first step to using deep learning to triage mammograms in routine clinical care,” the authors commented.

The deep learning method also performed similarly across age groups, from women age 40 to those older than 70. It worked across breast density categories as well, registering area under the receiver operating characteristic curve (AUC) scores of 0.82, 0.81, 0.85, and 0.71 for women with fatty, scattered, heterogeneously dense, and extremely dense breasts, respectively.
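A per-subgroup AUC of this kind can be computed as in the short Python sketch below. It uses the pairwise-ranking definition of AUC (the probability that a randomly chosen cancer exam scores higher than a randomly chosen cancer-free exam, with ties counted as half); the function names and the grouping labels are hypothetical and are not taken from the study's code.

```python
from collections import defaultdict
from typing import Dict, Hashable, List, Sequence, Tuple


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUC as the fraction of (cancer, non-cancer) pairs in which the
    cancer exam has the higher score; ties count as half a win."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one cancer and one non-cancer exam")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos for n in neg
    )
    return wins / (len(pos) * len(neg))


def auc_by_group(scores: Sequence[float], labels: Sequence[int],
                 groups: Sequence[Hashable]) -> Dict[Hashable, float]:
    """Compute AUC separately for each subgroup, such as a breast
    density category or an age band."""
    buckets: Dict[Hashable, Tuple[List[float], List[int]]] = defaultdict(
        lambda: ([], [])
    )
    for s, y, g in zip(scores, labels, groups):
        buckets[g][0].append(s)
        buckets[g][1].append(y)
    return {g: auc(s, y) for g, (s, y) in buckets.items()}
```

The pairwise loop is quadratic in the number of exams, which is fine for illustration; a rank-based formula would be the usual choice on a test set with tens of thousands of mammograms.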

In a related editorial, researchers described the results as “promising,” but acknowledged the study was a simulated scenario.

“It is therefore unclear how a radiologist would not only interpret but also pace the reading of the remaining cases in the presence of the proposed reduced caseload,” wrote Despina Kontos, PhD, and Emily F. Conant, MD, both with the University of Pennsylvania in Philadelphia.

More ethnically diverse studies are needed to prove the effectiveness of the triage platform, the editorialists argued, but Yala et al. have made their model publicly available to facilitate such research.

“Ultimately, this innovative application of artificial intelligence may prove more effective and reliable than conventional computer-aided detection in advancing a so-called lean approach to mammographic screening,” the pair concluded.