Novel language modeling approach correlates radiology, pathology reports

Correlating radiology and pathology reports is an ongoing challenge on the path toward achieving multidisciplinary patient care. One researcher has gotten a little closer to that goal with the help of deep learning techniques, sharing findings in the Journal of the American College of Radiology.

The method uses a language-modeling approach trained on data from multiple hospitals and ambulatory sites. It can serve as an important basis for continually improving specificity and accommodating user preferences in automatically correlating radiology and pathology reports, according to the study’s sole author, Ross Filice, MD, with MedStar Georgetown University Hospital, Washington, D.C.

“This slightly modified language-modeling approach is very promising for automated radiology-pathology correlation,” Filice added. “Although the initial results may not have optimal performance characteristics in that specificity is prioritized at the expense of sensitivity, this work shows that a language model can rapidly adapt to the training data provided and perform very similarly.”

Recently, a new method, Universal Language Model Fine-Tuning for Text Classification (ULMFiT), has been shown to provide advantages over previous deep learning techniques in correlating radiology and pathology reports. Most notable is its use of transfer learning, which has long been used for image-based problems.

Filice used the ULMFiT method, training the general model on 200,000 unique reports to fit the radiology-pathology space. Another 100,000 were used to train the final binary classification model.

The new model performed similarly to a previously designed anatomic concept-matching approach, notching 100% specificity, 65.1% sensitivity and 73.7% accuracy.
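For readers less familiar with how these three figures relate, the sketch below computes specificity, sensitivity and accuracy from a table of correct and incorrect matches. The counts are invented purely for illustration and are not the study's data; the `MatchCounts` class and the function names are likewise hypothetical.

```python
from dataclasses import dataclass


@dataclass
class MatchCounts:
    """Outcome counts for a binary 'do these two reports correlate?' classifier."""
    true_pos: int   # correlated pairs the model matched
    false_neg: int  # correlated pairs the model missed
    true_neg: int   # unrelated pairs the model correctly left unmatched
    false_pos: int  # unrelated pairs the model wrongly matched


def sensitivity(c: MatchCounts) -> float:
    # Share of truly correlated pairs that the model found.
    return c.true_pos / (c.true_pos + c.false_neg)


def specificity(c: MatchCounts) -> float:
    # Share of truly unrelated pairs that the model correctly rejected.
    return c.true_neg / (c.true_neg + c.false_pos)


def accuracy(c: MatchCounts) -> float:
    # Share of all pairs, correlated or not, that the model got right.
    total = c.true_pos + c.false_neg + c.true_neg + c.false_pos
    return (c.true_pos + c.true_neg) / total


# Hypothetical counts chosen for illustration only (NOT from the study).
counts = MatchCounts(true_pos=65, false_neg=35, true_neg=40, false_pos=0)

print(f"specificity: {specificity(counts):.1%}")  # specificity: 100.0%
print(f"sensitivity: {sensitivity(counts):.1%}")  # sensitivity: 65.0%
print(f"accuracy:    {accuracy(counts):.1%}")     # accuracy:    75.0%
```

With zero false positives, specificity reaches 100% no matter how many true matches are missed, which is how a model can pair perfect specificity with modest sensitivity.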

“Because this language model can so rapidly adapt to existing training labels, these results should not be considered the final work but rather a foundation upon which iterative improvement can be performed,” Filice wrote.

The low sensitivity reported in the study was a definite limitation, Filice acknowledged, one he attributed to the impure radiology-pathology training labels used. Future research will refine the model based on user feedback, which may boost sensitivity.

Another shortcoming is the system’s inability to explain why the model matched, or did not match, certain reports with one another.

“Because radiology-pathology matching can be subjective on the basis of the goals and preferences of the institution and even the user, we plan to use this model as a starting point and gather feedback from users through our existing dashboard to generate new training data to iteratively fine-tune our model to desired performance characteristics,” according to Filice.