Computer-based regression model aids in breast cancer diagnosis
A multidisciplinary group of radiologists and industrial engineers, led by Elizabeth S. Burnside, MD, Oguzhan Alagoz, PhD, and Jagpreet Chhatwal, PhD, has developed a logistic regression model that estimates breast cancer risk from the descriptors of the National Mammography Database, with the aim of detecting breast cancer earlier.
The researchers created two logistic regression models based on mammography features and demographic data from 62,219 consecutive mammography records (48,744 studies in 18,270 patients), reported using the Breast Imaging Reporting and Data System (BI-RADS) lexicon and the National Mammography Database format between April 5, 1999 and Feb. 9, 2004. Outcomes from the state cancer registry, matched to these records, served as the reference standard.
The probability of cancer was the outcome in both models. Model 2 was built using all variables in Model 1 plus radiologists' BI-RADS assessment categories. The researchers used 10-fold cross-validation to train and test the models, and calculated the area under the receiver operating characteristic curve (Az) to measure performance. Both models were compared with the radiologists' BI-RADS assessments.
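For readers who want to see how such an evaluation works in practice, the following is a minimal sketch in Python of 10-fold cross-validation with an Az estimate. It is not the researchers' code. The synthetic two-feature data, the gradient-descent fitting routine, the helper names (`fit_logistic`, `auc`, `cross_validated_auc`, `make_synthetic_data`) and the choice to average Az across folds are all assumptions made for illustration.

```python
import math
import random

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def predict_proba(w, X):
    # w[0] is the intercept; the remaining weights pair with the features.
    return [sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi))) for xi in X]

def fit_logistic(X, y, lr=0.5, epochs=100):
    # Plain batch gradient descent on the mean log-loss (illustrative only).
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        errors = [p - yi for p, yi in zip(predict_proba(w, X), y)]
        grad = [sum(errors)]
        grad += [sum(e * xi[j] for e, xi in zip(errors, X)) for j in range(len(X[0]))]
        w = [wj - lr * gj / len(X) for wj, gj in zip(w, grad)]
    return w

def auc(scores, labels):
    # Area under the ROC curve: the fraction of (cancer, benign) pairs in which
    # the cancer case gets the higher score, counting ties as one half.
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

def cross_validated_auc(X, y, k=10, seed=0):
    # Shuffle once, split into k folds, and hold each fold out in turn.
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    aucs = []
    for test_idx in folds:
        held_out = set(test_idx)
        train_idx = [i for i in idx if i not in held_out]
        w = fit_logistic([X[i] for i in train_idx], [y[i] for i in train_idx])
        scores = predict_proba(w, [X[i] for i in test_idx])
        aucs.append(auc(scores, [y[i] for i in test_idx]))
    mean = sum(aucs) / k
    sd = math.sqrt(sum((a - mean) ** 2 for a in aucs) / (k - 1))
    return mean, sd

def make_synthetic_data(n=400, seed=1):
    # Hypothetical data: one informative feature and one pure-noise feature.
    rng = random.Random(seed)
    y = [i % 2 for i in range(n)]
    X = [[rng.gauss(1.0 if yi else -1.0, 1.0), rng.gauss(0.0, 1.0)] for yi in y]
    return X, y

X, y = make_synthetic_data()
mean_az, sd_az = cross_validated_auc(X, y)
print(f"Az = {mean_az:.3f} +/- {sd_az:.3f}")
```

Each record is scored only by a model that never saw it during fitting, which is what lets cross-validated Az stand in for performance on new cases.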
According to the results, radiologists achieved an Az value of 0.939 ± 0.011. The Az was 0.927 ± 0.015 for Model 1 and 0.963 ± 0.009 for Model 2.
"At 90 percent specificity, the sensitivity of Model 2 (90 percent) was significantly better than that of radiologists (82 percent) and Model 1 (83 percent). At 85 percent sensitivity, the specificity of Model 2 (percent) was significantly better than that of radiologists (88 percent) and Model 1 (87 percent)," the authors wrote.
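Comparisons of this kind fix one operating characteristic and read off the other from the ROC curve. The sketch below shows one way to do that in Python, assuming a case is called positive when its score is at or above a threshold. The function names and the ten toy scores are hypothetical and do not come from the study.

```python
def sensitivity_at_specificity(scores, labels, target_specificity):
    # Best sensitivity over all thresholds whose specificity meets the target.
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    best = 0.0
    for t in sorted(set(scores)) + [float("inf")]:
        specificity = sum(s < t for s in neg) / len(neg)
        sensitivity = sum(s >= t for s in pos) / len(pos)
        if specificity >= target_specificity:
            best = max(best, sensitivity)
    return best

def specificity_at_sensitivity(scores, labels, target_sensitivity):
    # Best specificity over all thresholds whose sensitivity meets the target.
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    best = 0.0
    for t in sorted(set(scores)) + [float("inf")]:
        specificity = sum(s < t for s in neg) / len(neg)
        sensitivity = sum(s >= t for s in pos) / len(pos)
        if sensitivity >= target_sensitivity:
            best = max(best, specificity)
    return best

# Toy example: 4 malignant (label 1) and 6 benign (label 0) findings.
scores = [0.95, 0.90, 0.80, 0.70, 0.60, 0.55, 0.40, 0.30, 0.20, 0.10]
labels = [1, 1, 0, 1, 0, 0, 1, 0, 0, 0]

print(sensitivity_at_specificity(scores, labels, 0.80))  # 0.75
print(specificity_at_sensitivity(scores, labels, 1.00))  # 0.5
```

With real data, the same two functions would be applied to each model's predicted probabilities and to the radiologists' assessment categories, so that all three are compared at the same operating point.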
Based on the findings, the researchers concluded that the "logistic regression model can effectively discriminate between benign and malignant breast disease and can identify the most important features associated with breast cancer."
"The computer based model was designed to help the radiologist calculate breast cancer risk based on abnormality descriptors like mass shape; mass margins; mass density; mass size; calcification shape and distribution," said Burnside and Chhatwal. "When the radiologist combined his/her assessment with the computer model, the radiologist was able to detect 41 more cancers than when they didn't use the model. The model was created based upon findings of 48,744 mammograms in a breast imaging reporting database and found that the use of hormones and a family history of breast cancer did not contribute significant predictive ability in this context," they said.
"Our model has the potential to avoid delay in breast cancer diagnosis and reduce the number of unnecessary biopsies, which would benefit many patients. It may also encourage patients to get more actively involved in the decision-making process surrounding their breast health," they added.
Though work remains to be done to validate the system for clinical care, it represents a "promising direction that has the potential to substantially improve breast cancer diagnosis," according to Burnside and Chhatwal.