Experts urge radiology to be more cognizant of image resolution’s impact on AI

Deep learning in radiology is advancing rapidly, and a group of experts is urging the field to consider how image resolution affects the clinical impact of these algorithms.

That was the message from two researchers with LSU Health Sciences Center in New Orleans, who examined how the image resolution used to train AI algorithms on large, public datasets can impact their performance. They found that certain pixel dimensions helped tailor algorithms to detect specific abnormalities.

“Our results revealed practical insights for improving the performance of radiology-based machine learning applications and demonstrated diagnosis-dependent performance differences that allow for potential inferences into relative difficulties of different radiology findings,” Carl F. Sabottke and Bradley M. Spieler, MD, with LSU, wrote Jan. 22 in Radiology: Artificial Intelligence.

For example, a majority of approaches performed best when trained on resolutions between 256 and 448 pixels per dimension. Detecting emphysema and pulmonary nodules, however, bucked this trend, the pair wrote.

In a number of situations, training a model on lower-resolution images is "desirable": it can mean fewer parameters to optimize, which in turn lowers the risk of overfitting. Downsampling, however, can discard information that would be useful for classifying certain findings on scans. Using the highest-resolution images isn't that easy, though, they added. Computing power only goes so far, and memory limitations can prevent engineers from training on larger image files.
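The memory pressure is easy to see with a bit of arithmetic: the activations of a convolutional layer grow with the square of the input side length. A minimal sketch below illustrates this, assuming a hypothetical first layer with 64 feature channels, a batch size of 32 and float32 storage (these numbers are illustrative, not from the study):

```python
# Illustrative sketch (not from the study): why input resolution strains memory.
# A conv layer's activation map scales with the square of the input side length,
# so doubling resolution roughly quadruples activation memory.

def activation_memory_mb(side_px, channels=64, batch_size=32, bytes_per_value=4):
    """Approximate float32 memory for one conv layer's activations, in MB."""
    values = batch_size * channels * side_px * side_px
    return values * bytes_per_value / (1024 ** 2)

for side in (256, 448, 1024):
    print(f"{side} x {side}: {activation_memory_mb(side):.0f} MB")
# 256 x 256:   512 MB
# 448 x 448:  1568 MB
# 1024 x 1024: 8192 MB
```

Going from 256 to 1024 pixels per side multiplies this single layer's activation memory by 16, which is why larger inputs often force smaller batch sizes or shallower networks.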

To help address this "open" problem, the researchers selected eight of the 14 diagnoses in the National Institutes of Health ChestX-ray14 dataset and tested how two convolutional neural networks (CNNs), ResNet34 and DenseNet121, performed when trained at various image resolutions.

The best area under the receiver operating characteristic curve (AUC) scores were produced at resolutions between 256 × 256 and 448 × 448 pixels for algorithms identifying emphysema, cardiomegaly, hernias, edema, effusions, atelectasis, masses and nodules. And when networks trained on lower-resolution inputs were compared with those trained on higher-resolution inputs, emphysema, cardiomegaly, hernia and pulmonary nodule detection all showed improved AUC scores at the higher resolutions.
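For readers unfamiliar with the metric behind these comparisons: AUC equals the probability that a randomly chosen positive case receives a higher classifier score than a randomly chosen negative one. A minimal, self-contained sketch (not the authors' code; the scores and labels are invented for illustration):

```python
# Illustrative sketch (not the authors' code): computing AUC directly from its
# probabilistic definition via the Mann-Whitney U statistic.

def auc(scores, labels):
    """Area under the ROC curve: fraction of positive/negative pairs in which
    the positive case scores higher (ties count as half a win)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical scores for six chest radiographs (label 1 = finding present)
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
print(auc(scores, labels))  # 0.888... (8 of 9 pairs ranked correctly)
```

An AUC of 0.5 means the model ranks cases no better than chance, and 1.0 means perfect separation, which is why a resolution that lifts AUC for nodules but not for masses points to the subtler finding benefiting from extra detail.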

“Despite limitations, we have shown that, as would be intuitively expected, subtler findings benefit from CNN training at relatively higher image resolutions, specifically for the case of pulmonary nodule detection versus mass detection on chest radiographs,” the authors wrote.

In a related piece, Paras Lakhani, MD, with Thomas Jefferson University Hospital, wrote that clinicians need to be more "cognizant" of how image resolution can impact the performance of AI models. He added that groups, such as the NIH, that release large, public datasets should do so with image quality in mind.

“Groups who create public datasets to advance machine learning in medical imaging should consider releasing the images at full or near-full resolution,” Lakhani wrote. “This would allow researchers to further understand the impact of image resolution and could lead to more robust models that better translate into clinical practice.”