AI offers organ-level classification of free-text pathology reports

Machine learning algorithms can classify free-text pathology reports at the organ level and are easily interpreted by human readers, according to an Aug. 7 study published in Radiology: Artificial Intelligence.

Lead author Jackson M. Steinkamp and colleagues compared six AI methods and found that the approaches were highly accurate and required little preprocessing work. The team was also able to visualize the algorithms’ decision-making process, overcoming a common hurdle when using AI.

“Neural network–based approaches achieve high performance on organ-level pathology report classification, suggesting that it is feasible to use them within automated tracking systems,” the researchers wrote of their findings.

At the Hospital of the University of Pennsylvania, where Steinkamp works in the department of radiology, an automated radiology recommendation tracking engine is used to improve follow-up adherence after potential cancers are found in a patient’s abdomen or pelvis. Pathology reports must also be reviewed, but they are often free-text and relevant to multiple organs, making them difficult to structure, the authors noted.

Steinkamp and colleagues analyzed data from 2,013 pathology reports taken from patients who underwent abdominal imaging at their tertiary care center between 2012 and 2018. All reports were labeled by two annotators to indicate whether findings were relevant to the liver, kidneys, pancreas and/or adrenal glands, or to none of these organs.

Six automated classification approaches were compared: simple string matching, random forests, extreme gradient boosting, support vector machines, convolutional neural networks (CNNs) and long short-term memory networks.
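To make the simplest of these approaches concrete, the following is a minimal Python sketch of a string-matching baseline that assigns a report to every organ whose keywords it mentions. The keyword lists and the function name `classify_report` are illustrative assumptions; the article does not describe the terms or rules the researchers actually used.

```python
import re

# Hypothetical keyword lists for illustration only; the terms used in the
# study are not given in this article.
ORGAN_KEYWORDS = {
    "liver": ["liver", "hepatic", "hepatocellular"],
    "kidney": ["kidney", "renal", "nephrectomy"],
    "pancreas": ["pancreas", "pancreatic"],
    "adrenal": ["adrenal"],
}


def classify_report(text):
    """Return the set of organs whose keywords appear in the report text.

    A report can match several organs or none, mirroring the multi-label
    setup described above.
    """
    lowered = text.lower()
    found = set()
    for organ, keywords in ORGAN_KEYWORDS.items():
        for keyword in keywords:
            # \b anchors the match at the start of a word, so "renal"
            # does not fire on the tail of "adrenal".
            if re.search(r"\b" + re.escape(keyword), lowered):
                found.add(organ)
                break
    return found


report = "Liver, core biopsy: hepatocellular carcinoma. Right kidney unremarkable."
print(classify_report(report))
```

A baseline like this is easy to read but brittle: it flags the kidney in the example even though the report says it is unremarkable, which is the kind of error a learned model could plausibly avoid.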

Results showed that the two neural network approaches achieved the best F1 scores, with CNNs at 95.3% and long short-term memory networks at 96.7%, followed by extreme gradient boosting (93.9%), support vector machines (89.9%), random forests (82.8%) and simple string matching (75.2%).
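For readers unfamiliar with the metric, the F1 score is the harmonic mean of precision and recall. The short Python sketch below shows the standard calculation for a single organ label; the counts in the example are made up for illustration and are not taken from the study.

```python
def f1_score(true_positives, false_positives, false_negatives):
    """Compute F1 as the harmonic mean of precision and recall."""
    predicted_positive = true_positives + false_positives
    actual_positive = true_positives + false_negatives
    if predicted_positive == 0 or actual_positive == 0:
        return 0.0
    precision = true_positives / predicted_positive
    recall = true_positives / actual_positive
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Illustrative counts only (not from the study): 90 reports correctly
# flagged for an organ, 10 flagged in error, 5 missed.
print(round(f1_score(90, 10, 5), 3))
```

Because it penalizes both false alarms and missed reports, F1 is a common choice when, as here, most reports do not concern a given organ.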

According to Steinkamp et al., the CNNs are also small enough to be trained and implemented “within hours on machines without GPUs,” which makes them suitable for use in a wide variety of clinical settings.

Their results also suggest the system can be used within a broader tracking engine, which the authors explained was the ultimate goal of the project. They added that such a platform would be meant to “augment, rather than to replace, human monitors” and to improve the efficiency of radiologists.

“Toward this end, one might incorporate the overall interpretability algorithms into the overall system by auto-populating the most salient word spans from new reports into the user interface along with the prediction, allowing readers to quickly judge whether the prediction of the machine was correct or incorrect,” the researchers added. “This would significantly decrease time spent opening and reading full pathology reports (the current procedure at our institution).”