RSNA: Harvesting returns on voice recognition

Clint vanSonnenberg | December 21, 2010 | Health Imaging | Practice Management

Voice recognition software demands training and consistency, and likely brings some frustration, but it pays off in decreased costs, shorter report turnaround time and greater productivity, according to a presentation given at the 96th annual scientific meeting of the Radiological Society of North America (RSNA) on Nov. 28 in Chicago.

While arguing that voice recognition (VR) software has come a long way, David L. Weiss, MD, radiologist at Geisinger Medical Center in Danville, Penn., commenced his presentation with a straw poll, asking audience members how many were satisfied with their VR software. Despite his promotion of the tool, Weiss was not surprised that only about 10 percent of attendees raised their hands. But his thesis remained firm—voice recognition software works, if radiologists know how to work it and are willing to put in sufficient upfront effort.

Dictation style is key. Weiss emphasized consistency: pronouncing words without variation and, if possible, in a deep voice. He also recommended dictating in continuous phrases and, when the software gets a phrase wrong, correcting the whole phrase rather than individual words, even though many presume word-by-word correction to be the more logical strategy. Once the software is trained to interpret a phrase, it becomes more responsive and is likely to recognize the phrase as a whole.

Many radiologists are reluctant to add new vocabulary to their software, opting to edit the text manually rather than repeat the word and add it to the program's dictionary. But Weiss insists that adding one or two words per day in this manner yields valuable returns and time savings within little more than a month. This approach suits radiology because of the specialized but often quite standardized language in many radiology reports.

In a similar vein, adopting macros is an annoying step at first, but it can save a tremendous amount of time in the long run. The use of custom or pre-existing templates, a variety of which are available for free download, can likewise shorten dictation time and may contribute to more accurate and convenient dictations. Moreover, naming these macros so that they can be identified and entered sequentially (for example, by modality, body part, modifier and diagnosis) can enhance efficiency while clarifying both the dictation and the final product.
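As a rough illustration of that naming idea, the sketch below builds macro names from the four fields in a fixed order and looks up macros by the words spoken so far. It is a minimal sketch, not how any particular VR product works: the function names, the sample macro library and the placeholder report text are all hypothetical and were not part of Weiss's presentation.

```python
# Hypothetical naming scheme: every macro name lists its fields in the same
# order (modality, body part, modifier, diagnosis), as suggested in the text.
def macro_name(modality: str, body_part: str, modifier: str, diagnosis: str) -> str:
    """Join the four fields, in a fixed order, into one lowercase spoken name."""
    fields = (modality, body_part, modifier, diagnosis)
    return " ".join(field.strip().lower() for field in fields)


def find_macros(macros: dict[str, str], spoken_prefix: str) -> list[str]:
    """Return the sorted macro names that begin with the words spoken so far."""
    prefix = spoken_prefix.strip().lower()
    return sorted(name for name in macros if name.startswith(prefix))


# Placeholder library; the report text is illustrative, not clinical content.
MACROS = {
    macro_name("CT", "chest", "with contrast", "normal"): "Placeholder text A.",
    macro_name("CT", "chest", "without contrast", "normal"): "Placeholder text B.",
    macro_name("MR", "knee", "without contrast", "meniscal tear"): "Placeholder text C.",
}

# Saying the first two fields narrows the library to the chest CT macros.
print(find_macros(MACROS, "CT chest"))
```

Because every name follows the same field order, each additional spoken field narrows the candidate list in a predictable way, which is the efficiency gain the sequential naming is meant to deliver.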

For the microphone, as with many of these other features, ergonomics and efficacy coincide. Weiss said that microphone placement and timing are critical. First, the microphone should be held close to the speaker, but not so close that it muffles or distorts the sound. For both optimal effectiveness and safer ergonomics, headsets usually prove best at finding that sweet spot. Weiss also advises pausing between starting the recording and actually speaking, as many microphones will cut off the first part of a sentence.

Finally, Weiss said, "talk to the gamers." Hard-core video gamers, as Weiss called them, have some of the highest-quality, most ergonomic and efficient shortcuts for avoiding the keyboard and accelerating PACS navigation. Weiss advised matching your VR vendor with your workflow, choosing VR software that not only recognizes speech with a high level of accuracy but also facilitates interoperability between voice commands, shortcuts, macros, RIS and PACS.

