White noise can increase speech recognition accuracy

CHICAGO, Nov. 28 -- Reading diagnostic images in a brightly lit room, on a workstation monitor that reflects glare from a high-intensity light source, would reasonably be assumed to compromise the quality of the image interpretation.

Although radiology practices pay significant attention to the visual environment for diagnostic image interpretation, few give as much consideration to the acoustic workspace in which physicians dictate their clinical reports.

A recent study conducted at the University of Maryland Medical Center found that introducing white noise at certain levels into the acoustic background increased the transcription accuracy of speech recognition systems.

According to Joseph Zwemmer, MD, who presented the results of the research at the 93rd annual meeting of the Radiological Society of North America (RSNA), speech recognition technology is now used by almost half of academic practices and approximately 25 percent of private practices in diagnostic radiology.

“The utility of white noise and its effect, in particular on the accuracy of speech recognition, are not known in radiology reading room environments,” he said. “The purpose of our study was to evaluate the impact white noise has on speech recognition accuracy.”

Zwemmer noted that most radiologists compensate for unwanted background noise during their transcription sessions by talking louder to mask out the distractions. This “Lombard Effect,” as Zwemmer called it, does not generally result in increased transcription accuracy.

The researchers had ten radiologists digitally record 20 reports randomly selected from the facility’s RIS. The recorded reports were then transcribed using a commercial radiology speech recognition system at four different white noise levels.
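To picture what playing a recording back "at a white noise level" could involve, here is a minimal sketch that mixes Gaussian white noise into an audio signal at a chosen signal-to-noise ratio. It is not the researchers' procedure: the article does not report how the noise was generated or what the four levels were, so the function name `add_white_noise`, the `snr_db` parameter and the 20 dB value in the usage note are all assumptions made for illustration.

```python
import math
import random


def add_white_noise(samples, snr_db, seed=0):
    """Return a copy of `samples` with Gaussian white noise mixed in.

    `snr_db` is the target signal-to-noise ratio in decibels. This is a
    hypothetical parameterization; the study's actual noise levels are
    not given in the article.
    """
    if not samples:
        raise ValueError("samples must not be empty")
    rng = random.Random(seed)  # fixed seed so the result is reproducible
    # Average power of the clean signal.
    signal_power = sum(s * s for s in samples) / len(samples)
    # Noise power that yields the requested SNR: SNR_dB = 10*log10(Ps/Pn).
    noise_power = signal_power / (10 ** (snr_db / 10))
    sigma = math.sqrt(noise_power)
    return [s + rng.gauss(0.0, sigma) for s in samples]
```

For example, `add_white_noise(recording, snr_db=20.0)` would return the same recording with noise about 20 dB below the speech level; a lower `snr_db` corresponds to a louder noise background.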

Zwemmer reported that the transcribed reports were compared with the original reports to determine the number of errors present. The researchers found that the mean baseline transcription error rate (TER) was 11.6 percent (range: 6.5 percent to 26.1 percent). The TER at the four white noise levels was 10.3 percent, 12.3 percent, 13 percent and 13.5 percent, respectively.
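As an illustration of how an error rate of this kind can be computed, the sketch below counts word-level substitutions, insertions and deletions between an original report and its transcription, then divides by the number of words in the original. The article does not say how the researchers counted errors, so the word-level edit distance used here, and the function names `word_errors` and `transcription_error_rate`, are assumptions rather than the study's method.

```python
def word_errors(reference, hypothesis):
    """Return (edit_distance, reference_word_count) at the word level.

    Uses the standard Levenshtein dynamic program over words; comparison
    is case-insensitive. This is an assumed error definition, not
    necessarily the one used in the study.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    # prev[j] = edits needed to turn the first i-1 reference words
    # into the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1], len(ref)


def transcription_error_rate(reference, hypothesis):
    """Return the error rate as a percentage of reference words."""
    errors, n_words = word_errors(reference, hypothesis)
    if n_words == 0:
        raise ValueError("reference report must contain at least one word")
    return 100.0 * errors / n_words
```

With this definition, transcribing "no acute cardiopulmonary disease" as "no cute cardiopulmonary disease" counts one substitution in four words, an error rate of 25 percent.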

“The TER at white noise level 1 was significantly lower than baseline with a p value of 0.006,” Zwemmer said. “The TER at higher noise levels were significantly higher than the baseline.”

On the basis of their research, the scientists believe that the presence of low-level white noise significantly improves speech recognition accuracy, while higher levels may increase transcription error rates.

“Implementation of white noise in reading room environments not only reduces acoustic distractions but may also improve speech recognition accuracy,” Zwemmer said.