There was considerable variability in false-positive rates (FPRs) and nodule counts across radiologists in the National Lung Screening Trial (NLST), according to a study published online April 16 in Radiology.
The variability may encourage educational efforts to decrease interreader variation, wrote Paul F. Pinsky, PhD, of the National Cancer Institute, National Institutes of Health, in Bethesda, Md., and colleagues.
While the NLST famously demonstrated a 20 percent mortality benefit from using low-dose CT compared with chest radiography for lung cancer screening, there was a high rate of false-positive findings, explained the authors. For the first two rounds of screening in the trial, there was a positive screening exam rate of 27 percent, with more than 90 percent of these cases representing false positives.
To better understand the factors leading to false-positive findings, Pinsky and colleagues analyzed overall NLST screening results, nodule-specific findings and recommendations for follow-up. Noncalcified nodules of 4 mm or larger constituted a positive screening result, and FPR was defined as the rate of positive exams without a cancer diagnosis within one year. A total of 112 radiologists at 32 screening centers each interpreted 100 or more NLST CT studies.
Results showed the mean FPR for radiologists was 28.7 percent, with a range of 3.8 to 69 percent, reported the authors. Descriptive analysis and mixed-effect models of variability showed an odds ratio of 2.49 across all pairs of radiologists and an odds ratio of 1.83 for pairs within the same screening center. “In other words, for a random pair of radiologists (generally at different centers), the odds of a false-positive study would be 2.5 times higher if one radiologist interpreted the examination versus the other,” wrote Pinsky and colleagues. Across screening centers as a whole, the standard deviation in FPR was 8.2 percent.
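An odds ratio applies to odds, not to the rate itself, so a 2.5-fold difference in odds does not mean a 2.5-fold difference in FPR. The short Python sketch below works through the arithmetic. The 20 percent baseline FPR and the function name `apply_odds_ratio` are hypothetical and chosen only for illustration; the odds ratios of 2.49 and 1.83 are the figures reported above.

```python
def apply_odds_ratio(fpr: float, odds_ratio: float) -> float:
    """Return the rate obtained by scaling the odds of `fpr` by `odds_ratio`."""
    if not 0.0 <= fpr < 1.0:
        raise ValueError("fpr must be in [0, 1)")
    odds = fpr / (1.0 - fpr)          # convert rate to odds
    scaled_odds = odds * odds_ratio   # apply the odds ratio
    return scaled_odds / (1.0 + scaled_odds)  # convert odds back to a rate


# Hypothetical baseline: a radiologist with a 20 percent FPR (not from the study).
baseline_fpr = 0.20

# Odds ratios reported in the article.
any_pair = apply_odds_ratio(baseline_fpr, 2.49)      # random pair of radiologists
same_center = apply_odds_ratio(baseline_fpr, 1.83)   # pair within one center

print(f"Baseline FPR:            {baseline_fpr:.1%}")
print(f"After OR 2.49 (any pair): {any_pair:.1%}")
print(f"After OR 1.83 (same ctr): {same_center:.1%}")
```

Under this assumed baseline, an odds ratio of 2.49 moves the FPR from 20 percent to roughly 38 percent, and an odds ratio of 1.83 moves it to roughly 31 percent.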
In attempting to pin down the causes of variability, the authors noted similar FPRs for academic versus nonacademic centers, and for centers inside and outside the “histoplasmosis belt,” a southern U.S. region with high rates of the lung infection histoplasmosis. Being affiliated with a cancer center also did not account for center variability. Most studies were conducted with similar CT technical parameters. Section thickness, tube voltage, effective tube current and field of view had minimal effect on FPR.
In speculating about what did play a role in variability, the authors suggested differences in participant-level factors. While they controlled for age, sex, BMI and smoking history, other factors, such as medical and occupational history, could have contributed to variability. “Additionally, differences in the guidance given by the NLST lead radiologist and/or differences in institutional culture regarding making distinctions between true nodules and possible artifacts (e.g. small scars) could also have contributed to center differences,” they wrote.
Also of note in the results was a high correlation between radiologists’ sensitivity rates and FPRs, which Pinsky and colleagues suggested was evidence the radiologists were operating along the same underlying receiver operating characteristic curve. “This suggests that readers’ predictive abilities may be similar but that they have different inclinations on how conservative to be in terms of ‘calling’ (as positive) a lesion,” they wrote.
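The idea of readers sharing one receiver operating characteristic curve while sitting at different points on it can be illustrated with a toy model. The Python sketch below assumes an equal-variance binormal model, which is a common textbook simplification and not the method used in the study. The discriminability value `D_PRIME`, the thresholds and the function name `operating_point` are all hypothetical.

```python
from statistics import NormalDist


def operating_point(threshold: float, d_prime: float) -> tuple[float, float]:
    """Return (FPR, sensitivity) for a reader who calls a lesion positive
    when its suspicion score exceeds `threshold`.

    Assumes scores are N(0, 1) for non-cancers and N(d_prime, 1) for cancers.
    """
    non_cancer = NormalDist(0.0, 1.0)
    cancer = NormalDist(d_prime, 1.0)
    fpr = 1.0 - non_cancer.cdf(threshold)
    sensitivity = 1.0 - cancer.cdf(threshold)
    return fpr, sensitivity


# Hypothetical: every reader has the same predictive ability (same d').
D_PRIME = 1.5

# Hypothetical readers, ordered from most to least conservative.
thresholds = [1.5, 1.0, 0.5, 0.0]
points = [operating_point(t, D_PRIME) for t in thresholds]

for t, (fpr, sens) in zip(thresholds, points):
    print(f"threshold={t:.1f}  FPR={fpr:.1%}  sensitivity={sens:.1%}")
```

In this toy setting, lowering the threshold raises both FPR and sensitivity together, which is the pattern of correlated rates the authors describe for readers with similar ability but different calling tendencies.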
For more on lung cancer screening, please read “Lung cancer screening model misses 41% fewer cancers than NLST model” and “Getting lung cancer screening right in the community setting.”