Deep learning may be able to accurately identify false-positive mammograms and distinguish them from those that are malignant or negative, according to research published Oct. 11 in the journal Clinical Cancer Research.
Mammography can help identify breast cancer early, but it also suffers from a high false recall rate that can increase medical costs, clinical workload and psychological stress for the patient. Researchers from the University of Pittsburgh believe that an AI approach based on deep learning convolutional neural networks (CNNs) can help.
“The assumption is that there may be some nuanced imaging features associated with some mammogram images that could lead to a false/unnecessary recall when the images are interpreted by human radiologists, and our goal is to utilize a CNN-based method to build a computer toolkit to identify those potential mammogram images,” study author Shandong Wu, PhD, director of the Intelligent Computing for Clinical Imaging Lab at the University of Pittsburgh, said in a prepared statement. “We showed that there are imaging features unique to recalled-benign images that deep learning can identify and potentially help radiologists in making better decisions on whether a patient should be recalled or is more likely a false recall.”
The researchers studied whether deep learning could distinguish among mammograms from three groups in a large image set: women with a malignant diagnosis, women who were recalled and whose recalls later proved false, and women determined to be breast cancer free at the time of their mammogram.
A total of 14,860 images of 3,715 patients from two separate mammography datasets were used for the study: the Full-Field Digital Mammography Dataset (FFDM; 1,303 patients) and the Digital Database for Screening Mammography (DDSM; 2,412 patients).
Wu and colleagues then built CNN models to investigate six classification scenarios for distinguishing among negative, malignant and recalled-benign mammograms.
The area under the curve (AUC) for the combined datasets ranged from 0.76 to 0.91 across the classification scenarios. AUC has a maximum value of one and summarizes the trade-off between true positives and false positives, so the researchers used it to gauge how reliably the models separated one category of images from another.
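For readers unfamiliar with the metric, the short Python sketch below shows one common way to compute AUC for a two-class problem: as the fraction of positive/negative pairs in which the positive image receives the higher score. The function name, labels and scores are hypothetical toy values chosen for illustration. They are not the study's data, and this is not the authors' code.

```python
def auc(labels, scores):
    """Rank-based AUC: the chance that a randomly chosen positive case
    scores higher than a randomly chosen negative case (ties count half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # tie counts as half a win
    return wins / (len(pos) * len(neg))


# Hypothetical toy example: 1 = recalled-benign, 0 = negative.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]  # made-up model outputs
toy_auc = auc(labels, scores)
print(round(toy_auc, 2))  # 8 of 9 pairs are ordered correctly
```

A score of 0.5 means the model does no better than chance at ranking the two groups, while 1.0 means every positive image outranks every negative one.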
“Based on the consistent ability of our algorithm to discriminate all categories of mammography images, our findings indicate that there are indeed some distinguishing features/characteristics unique to images that are unnecessarily recalled,” Wu said. “Our AI models can augment radiologists in reading these images and ultimately benefit patients by helping reduce unnecessary recalls.”