AI-optimized TI-RADS may reduce unnecessary thyroid biopsies

An AI-optimized American College of Radiology Thyroid Imaging Reporting and Data System (TI-RADS) improved risk stratification of thyroid nodules and may be easier for readers to use, according to a new study published in Radiology.

The AI-based model also improved the specificity of eight nonexpert readers and reduced fine-needle aspiration recommendations for benign nodules, a “central” aim of the ACR TI-RADS, wrote lead author Benjamin Widman-Tobriner, of Duke University Hospital’s Department of Radiology in Durham, North Carolina, and colleagues.

“Although most thyroid nodules are benign, many patients are subjected to a costly workup that may include one or more biopsies, follow-up imaging, and even diagnostic lobectomy,” the researchers added. “This contributes to the overdiagnosis of thyroid cancers that are not clinically significant.”

With this in mind, Widman-Tobriner et al. created an AI system to optimize the ACR TI-RADS. Expert readers assigned points based on five ACR TI-RADS categories (composition, echogenicity, shape, margin, echogenic foci) to 1,425 biopsy-proven thyroid nodules from more than 1,200 patients.

A genetic AI algorithm was applied to a training set of 1,325 nodules, and the point and pathologic data were used to create the optimized AI TI-RADS scoring system. That model was compared with the traditional ACR TI-RADS on a test set of 100 nodules, which came with the interpretations of an expert reader, an expert panel and eight nonexperts.
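For readers unfamiliar with genetic algorithms, the general idea can be sketched in a few lines of Python. This is only a toy illustration, not the study's method: the article does not describe the researchers' implementation, so the fitness measure (AUC of the summed points), the population settings, the function names and the synthetic nodules below are all assumptions made for the example.

```python
import random

def auc(scores, labels):
    """Chance that a random malignant nodule outscores a random benign one (ties count half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def fitness(points, nodules, labels):
    # A nodule's total score is the sum of the points of the features it shows.
    totals = [sum(p * f for p, f in zip(points, feats)) for feats in nodules]
    return auc(totals, labels)

def evolve(nodules, labels, n_features, max_points=3,
           pop_size=30, generations=40, seed=0):
    """Search for per-feature point values (hypothetical settings throughout)."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, max_points) for _ in range(n_features)]
           for _ in range(pop_size)]
    for _ in range(generations):
        # Selection: keep the fitter half of the population unchanged.
        pop.sort(key=lambda c: fitness(c, nodules, labels), reverse=True)
        parents = pop[: pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_features)      # one-point crossover
            child = a[:cut] + b[cut:]
            i = rng.randrange(n_features)           # mutate one point value
            child[i] = rng.randint(0, max_points)
            children.append(child)
        pop = parents + children
    return max(pop, key=lambda c: fitness(c, nodules, labels))

# Synthetic data (not from the study): features 0-2 are more common in
# malignant nodules, features 3-5 are uninformative noise.
data_rng = random.Random(1)
N_FEATURES = 6
nodules, labels = [], []
for _ in range(200):
    malignant = data_rng.random() < 0.3
    feats = [int(data_rng.random() < (0.7 if (malignant and j < 3) else 0.2))
             for j in range(N_FEATURES)]
    nodules.append(feats)
    labels.append(int(malignant))

best = evolve(nodules, labels, N_FEATURES)
print(best, round(fitness(best, nodules, labels), 3))
```

In a sketch like this, uninformative features tend to drift toward low point values because they add noise to the total score without helping to separate benign from malignant nodules.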

Results showed that AI TI-RADS assigned new point values for eight features; six were given zero points, simplifying categorization of nodules, the authors explained.

With expert reader data, the ACR TI-RADS and AI TI-RADS achieved similar areas under the receiver operating characteristic curve of 0.91 and 0.93, respectively. However, the specificity of the AI model (65%) was higher than that of the ACR TI-RADS (47%).

When using the AI model, the eight nonexperts achieved a higher mean specificity (55%) than with the ACR TI-RADS alone (48%).
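To make the specificity comparison concrete, the short Python sketch below shows how sensitivity and specificity are computed from biopsy recommendations and pathology results. The ten nodules and both sets of recommendations are invented for illustration and are not data from the study; the function name is likewise hypothetical.

```python
def sensitivity_specificity(recommend_biopsy, is_malignant):
    """Return (sensitivity, specificity) for paired 0/1 lists."""
    pairs = list(zip(recommend_biopsy, is_malignant))
    tp = sum(1 for r, m in pairs if r and m)          # cancers sent to biopsy
    fn = sum(1 for r, m in pairs if not r and m)      # cancers not biopsied
    tn = sum(1 for r, m in pairs if not r and not m)  # benign nodules spared
    fp = sum(1 for r, m in pairs if r and not m)      # benign nodules biopsied
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical example: 2 malignant and 8 benign nodules.
is_malignant = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
system_a     = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0]  # biopsies 4 benign nodules
system_b     = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]  # biopsies 2 benign nodules

print(sensitivity_specificity(system_a, is_malignant))  # (1.0, 0.5)
print(sensitivity_specificity(system_b, is_malignant))  # (1.0, 0.75)
```

Higher specificity means fewer benign nodules are recommended for biopsy, which is the reduction the researchers describe.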

“It has been reported that overdiagnosis accounts for up to 77% of cases of thyroid cancer and that more thyroid cancer diagnoses do not reduce mortality,” the researchers wrote. “Therefore, a small reduction in sensitivity seems acceptable in light of a larger gain in specificity.”

“As well, many nodules not biopsied would meet the criteria for follow-up, mitigating the likelihood of missing cancers while potentially reducing health care costs,” they added.

The study had several limitations, including a training dataset drawn from a single institution and feature assignments based on expert readers. Additionally, the researchers did not use a separate validation set when training the AI, opting instead for cross-validation within the training cases.

Widman-Tobriner and colleagues believe their work can be further improved upon with more data and prospective long-term studies.

“Continued performance improvement is vital, and subsequent work could focus on further enhancements,” they wrote. “Future efforts could also include nodules with indeterminate pathologic results to broaden the mix of nodules included, which may enhance generalizability and performance.”