‘Collective super intelligence’: Radiologists, AI join forces to improve chest x-ray interpretations

Experts have long talked about an ideal future in which radiologists work alongside AI. A new platform may have the answer, combining the intelligence of man and machine to better diagnose pneumonia.

The “human-in-the-loop” AI method leverages the advantages of automated detection while prompting human readers to intervene at certain checkpoints when algorithms may be unsure. When radiologists used the AI on a platform called Swarm, which allows multiple experts to work together in real time, their diagnoses improved, beating those of the algorithms alone and of individual readers.

“Recent work has shown superior task performance of a combined human and AI augmented model compared to either human or machine alone,” Bhavik N. Patel, with Stanford University School of Medicine, and colleagues wrote. They went on to say that “this approach could harness the best of human intelligence and artificial intelligence to create a collective super intelligence.”

For their study, 13 expert radiologists from Stanford and Duke University were split into two groups and asked to estimate the probability of pneumonia on 50 chest x-rays. They did so both individually and while using the AI swarm platform. Patel et al. also compared the results to those achieved by two high-performing deep learning models, CheXNet and CheXMax.

Overall, radiologists using the human-in-the-loop AI during swarm sessions achieved the highest diagnostic performance. The authors noted that each individual approach had specific strengths and weaknesses, but the combined approach worked best.

Additionally, performance using the swarm platform was better than that of crowd-based majority diagnoses, which are commonly used to establish ground truth labels for validation and test datasets.

“Results from our study show that swarm-based diagnoses outperform crowd-based diagnoses, and thus may represent a novel means for generating image labels that provide more accurate ground truth than conventional consensus labeling methods for training datasets for deep-learning algorithms,” the authors wrote.

“Moreover, some centers may not readily have access to experts, and labeling images through swarm sessions may allow such centers to achieve expert level labels,” they added.

The full study was published Nov. 18 in Nature’s npj Digital Medicine.