Is a sequential or independent decision-support AI workflow more effective?

A team of East Coast researchers found that the effectiveness of artificial intelligence (AI)-based decision support (DS) systems depends on how they are presented in a radiologist’s clinical workflow.

The study, published online Oct. 15 in the Journal of Digital Imaging, compared the clinical impact of an ultrasound AI-based DS system presented in two diagnostic workflows. The first, sequential, allowed the clinician to analyze the case unaided prior to receiving DS; the second, independent, presented the case and decision support simultaneously.

Despite a previous study showing that the DS platform improved both the sensitivity and specificity of lesion detection, “DS’s practical efficacy and impact also need to be assessed when integrated into existing real-world clinical workflows,” wrote Lev Barinov, with Koios Medical Center in New York, and colleagues.

A total of 500 breast ultrasound images were included in the study. Three radiologists reviewed the images using the two separate workflows. Differences in accuracy were measured by the area under the receiver operating characteristic curve (AUC), and inter-operator variability was measured by Kendall’s tau-b.
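For readers unfamiliar with the two metrics, the following is a minimal Python sketch of how AUC and Kendall's tau-b can be computed. It is not the authors' code: the function names, the ground-truth labels, and the two readers' scores are all hypothetical and chosen only for illustration.

```python
from itertools import combinations
from math import sqrt


def auc(labels, scores):
    """AUC as the probability that a positive case outscores a negative one.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def kendall_tau_b(x, y):
    """Kendall's tau-b rank agreement between two readers, adjusted for ties."""
    concordant = discordant = ties_x = ties_y = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0 and dy == 0:
            continue              # tied in both readers: pair is ignored
        elif dx == 0:
            ties_x += 1           # tied only in the first reader
        elif dy == 0:
            ties_y += 1           # tied only in the second reader
        elif dx * dy > 0:
            concordant += 1       # both readers rank the pair the same way
        else:
            discordant += 1       # the readers disagree on the ordering
    denom = sqrt((concordant + discordant + ties_x)
                 * (concordant + discordant + ties_y))
    if denom == 0:
        raise ValueError("tau-b is undefined when a reader gives constant scores")
    return (concordant - discordant) / denom


# Hypothetical data: 1 = malignant, 0 = benign; scores are suspicion ratings.
labels = [1, 1, 1, 0, 0, 0]
reader_a = [5, 4, 3, 4, 2, 1]
reader_b = [5, 4, 4, 3, 2, 2]

print("AUC, reader A:", auc(labels, reader_a))
print("AUC, reader B:", auc(labels, reader_b))
print("tau-b, A vs. B:", kendall_tau_b(reader_a, reader_b))
```

A higher AUC indicates that a reader's scores separate malignant from benign cases more cleanly, while a tau-b closer to 1 indicates that two readers rank the same cases in a more similar order.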

The sequential study design suggests a “strong impact of confirmation bias,” Barinov et al. wrote. The deviation from the control was “significantly” smaller during the sequential reads than during the independent reads. Additionally, a supplemental, concurrent read decreased overall inter-operator variability, the group reported.

“Independent reads (concurrent reads) have shown dramatic shifts in reader performance and inter-operator variability as compared to either control reads or sequential reads,” the authors added.

Barinov and colleagues acknowledged several limitations to their study, including a limited number of images and readers, which do not accurately represent the diversity of radiologists across a system. However, the researchers believe their study has a number of clinical applications.

“The evidence provided in this study can be used to impact both study design when demonstrating efficacy of new diagnostic decision support tools, as well as their implementation in practical environments,” Barinov et al. concluded.