AI best used as second opinion to help radiologists classify ground glass opacities

Massachusetts General Hospital (MGH) researchers have created an artificial neural network (ANN) that can help radiologists classify pure ground glass opacities (GGOs), according to a new study published in Clinical Imaging.

In fact, the ANN outperformed two radiologists at identifying both malignant and benign lesions on 18FDG PET/CT images, according to J.A. Scott, MD, with the department of radiology at MGH in Boston, and colleagues. However, the authors warned radiologists shouldn’t rely “exclusively” on ANN predictions.

“The hypothesis of the present study is that ANNs can be used to synthesize…various observations concerning the GGO, together with limited basic clinical data, into a single numeric probability of malignancy based upon assessment of their image characteristics on PET/CT at a single point in time,” the researchers wrote.

The group trained the ANN on 85 pure GGO cases and tested it on 40 more. They then compared the ANN’s ability to classify cases as malignant or benign on 18FDG PET/CT studies to that of two radiologists.
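The study’s network architecture, input features, and data are not public, but the general workflow — train a small neural network on labeled cases, then score held-out cases as a probability of malignancy — can be sketched in a few lines. Everything below is illustrative: the synthetic features, the single hidden layer of 8 units, and the learning rate are assumptions, not details from the paper; only the 85/40 train–test split mirrors the study.

```python
import numpy as np

# Hypothetical sketch of the study's workflow on synthetic data.
# Feature names and network size are assumptions for illustration only.
rng = np.random.default_rng(0)

n_train, n_test, n_feat = 85, 40, 4   # 85 training cases, 40 test (as in the study)
X = rng.normal(size=(n_train + n_test, n_feat))
w_true = np.array([1.5, -2.0, 1.0, 0.5])          # synthetic "ground truth" weights
y = (X @ w_true + rng.normal(scale=0.3, size=len(X)) > 0).astype(float)

X_train, y_train = X[:n_train], y[:n_train]
X_test, y_test = X[n_train:], y[n_train:]

# One hidden layer (tanh), sigmoid output giving P(malignant);
# trained with plain gradient descent on cross-entropy loss.
H = 8
W1 = rng.normal(scale=0.5, size=(n_feat, H))
W2 = rng.normal(scale=0.5, size=(H, 1))

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

for _ in range(2000):
    h = np.tanh(X_train @ W1)          # hidden activations
    p = sigmoid(h @ W2).ravel()        # predicted probability of malignancy
    err = (p - y_train)[:, None]       # cross-entropy gradient at the output
    gW2 = h.T @ err / n_train
    gW1 = X_train.T @ ((err @ W2.T) * (1 - h**2)) / n_train
    W2 -= 0.5 * gW2
    W1 -= 0.5 * gW1

# Score the held-out cases as single numeric probabilities, as the paper describes.
p_test = sigmoid(np.tanh(X_test @ W1) @ W2).ravel()
acc = np.mean((p_test > 0.5) == y_test)
print(f"Held-out accuracy on {n_test} synthetic cases: {acc:.0%}")
```

The key design point echoed from the paper is the output: rather than a hard malignant/benign label, the network emits a continuous probability, which is what allows evaluation by ROC analysis.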

The ANN achieved an area under the ROC curve (AUC) of 0.981, “excellent predictive value in estimating the likelihood of malignancy,” the authors noted. The ANN identified 11/11 malignant lesions for a sensitivity of 100% and 27/29 benign lesions for a specificity of 93.1%. In comparison, the radiologist readers classified 17 of the 29 benign lesions as benign and rated 23 cases indeterminate.
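The reported operating point follows directly from the counts above, and checking it is simple arithmetic. This is illustration, not the study’s code; only the lesion counts come from the article.

```python
# Recomputing the reported operating point from the study's counts:
# 11 of 11 malignant lesions flagged, 27 of 29 benign lesions cleared
# (11 + 29 = 40, matching the test set size).
tp, fn = 11, 0   # true positives, false negatives (malignant cases)
tn, fp = 27, 2   # true negatives, false positives (benign cases)

sensitivity = tp / (tp + fn)   # fraction of malignant lesions caught
specificity = tn / (tn + fp)   # fraction of benign lesions correctly cleared

print(f"Sensitivity: {sensitivity:.1%}")  # 100.0%
print(f"Specificity: {specificity:.1%}")  # 93.1%
```

Note that the AUC of 0.981 is a separate, threshold-free measure: it summarizes performance across all possible probability cutoffs, whereas sensitivity and specificity describe one chosen cutoff.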

“The results of this investigation suggest that ANNs have the potential to assist the radiologist in estimating the likelihood of malignancy of GGOs on a single 18FDG PET/CT study, based upon their appearance and relevant clinical information,” the authors explained.

One problem, as with many AI-based methods, is the lack of explainability. This can be “disturbing” to clinicians who seek a logical explanation for findings. Were predictions based on a clue radiologists might not see, or on an error stemming from limitations in the training data? the authors asked.

“The results of this study support the prevailing opinion that exclusive reliance on the prediction of an ANN to interpret an imaging finding is not appropriate,” the group concluded. “This caveat is particularly important when the network is trained on a limited data set, as in the present study. Such networks are most judiciously employed as a ‘second opinion.’”