Talking the Talk

Voice recognition software coupled with RIS and PACS cuts radiology report turnaround time by hours and even days and drives down transcription fees by hundreds of thousands of dollars - gains well worth the training and adjustment the systems require.

The realities of modern radiology practice demand accurate, efficient and cost-effective workflow from image capture through report creation, verification and delivery.

"It's clear to me that PACS was meant to be married to voice activated transcription," says Elliot Sandberg, MD, associate professor of radiology and neurology at the University of Colorado and chief of imaging at the Veterans Administration Hospital of Denver. "With PACS allowing providers immediate access to images, any significant delay in the availability of reports reflects poorly on the credibility of the radiologist."

Radiology practices, whether on a large academic medical campus or in a small freestanding clinic, require rapid turnaround time for report generation. Voice recognition technology meets that need. Radiologists using these systems describe a dramatic reduction in workflow interruptions from colleagues looking for a report. A related benefit users note is the lower cost that comes with decreased reliance on transcription services.


Sandberg is using the Agfa HealthCare Impax 4.5 PACS with TalkStation 3.0 software embedded with Dragon NaturallySpeaking (ScanSoft). Both PACS and voice recognition functions are driven by the Impax GUI (graphical user interface) to minimize redundant log-ons and order selection activities.

"From a quality management perspective, having [voice recognition] integrated has value with respect to decreased errors," says Sandberg. The more personnel involved in entering data, the more opportunities there are for data-entry mistakes. His department of 10 radiologists produces reports on 130,000 exams per year. "The number of visits from physicians wanting to view cases [in our department] has dropped by 75 percent since we started using voice transcription."

Because the hospital is a tertiary care center, patients come from throughout the Rocky Mountain region, and clinic visits and imaging studies are scheduled for the same day. Before voice recognition was implemented, radiology report turnaround time was 24 hours or more. Now the radiologists can verify reports as soon as they finish dictating.

Although cost was not the driving force behind adopting this technology, the department has realized savings of $200,000 per year on transcription. Coupled with improved productivity for the radiologists, those savings make the new workflow demonstrably cost-effective.

Agfa offers two configurations for its voice recognition products: a stand-alone PC model or an option integrated within its Impax PACS. Jennifer Caissie, senior marketing manager at Agfa, explains that the company's approach offers three methods for accomplishing voice recognition.

With front-end speech recognition, the user dictates into a microphone, observes the words appearing on the monitor screen, corrects the report, instantly signs off and the report is sent to the RIS (radiology information system).

The digital dictation option provides the user with the same GUI, and the user still dictates into a microphone, but no words appear on the screen; instead, the audio is sent to a transcriptionist. The report is returned to the clinician for further direction and sign-off.

The intermediary between the two options is called directionist workflow. The user employs front-end speech recognition, the words appear on the screen, and at that moment the user can decide whether to edit the report personally or send it to a transcriptionist or correctionist, who makes the necessary changes and returns the report for final action.
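The three methods differ mainly in who edits the text before sign-off. A minimal Python sketch of that routing follows; the names `Workflow` and `route_report` and the step labels are illustrative assumptions, not part of any Agfa product or API.

```python
from enum import Enum


class Workflow(Enum):
    """Hypothetical labels for the three methods described above."""
    FRONT_END = "front-end speech recognition"
    DIGITAL_DICTATION = "digital dictation"
    INTERMEDIARY = "intermediary workflow"


def route_report(workflow, send_to_editor=False):
    """Return the ordered steps a dictated report passes through.

    send_to_editor only matters for the intermediary workflow, where the
    radiologist chooses at dictation time whether to self-edit.
    """
    if workflow is Workflow.FRONT_END:
        # The radiologist sees the text, corrects it and signs immediately.
        return ["dictate", "text appears on screen", "radiologist corrects",
                "sign off", "send to RIS"]
    if workflow is Workflow.DIGITAL_DICTATION:
        # Audio goes to a transcriptionist; the report comes back later.
        return ["dictate", "audio sent to transcriptionist",
                "report returned to radiologist", "sign off"]
    # Intermediary: recognized text appears, then the radiologist decides.
    steps = ["dictate", "text appears on screen"]
    if send_to_editor:
        steps += ["sent to transcriptionist or correctionist",
                  "report returned to radiologist"]
    else:
        steps.append("radiologist corrects")
    steps.append("sign off")
    return steps


if __name__ == "__main__":
    for step in route_report(Workflow.INTERMEDIARY, send_to_editor=True):
        print(step)
```

Modeling the intermediary case as a per-report flag reflects the description above, in which the choice is made at the moment the words appear rather than fixed in advance.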


Mark S. Lerner, director of radiology services at Children's National Medical Center in Washington, D.C., is an advocate of voice recognition, having installed PowerScribe by Dictaphone in October 2000. The radiologists report on 65,000 exams per year. Before implementing the system, they sent all of their reports to an outside contracted service of variable quality. Turnaround time ranged from a few minutes to several days, and there were accuracy issues as well.

"When we 'went live' [with voice recognition], the change was dramatic. The turnaround time went to minutes," says Lerner. "When they signed off on a final report, the films were still in front of them, and they could feel fairly confident that what they said on the report was there."

In addition to accomplishing the primary mission of improving workflow for the department's 11 faculty radiologists, four fellows and four resident physicians, the system reached return on investment in about 18 months by saving $150,000 per year on transcription costs. Lerner says there is not one radiologist who would return to the former method of producing exam reports.

Richard Bagby, vice president of informatics and CIO of Pinnacle Health System in Harrisburg, Pa., has installed PowerScribe in the four-hospital healthcare system, with 63 radiologists who provide service to Pinnacle and two other systems.

"Within two weeks of implementing the system, we had 85 percent of our physicians doing self-editing," says Bagby. The health system adopted this approach because of problems with report turnaround time; reports often took 24 hours to be generated. With self-edit or transcriptionist-editor functionality, the turnaround time is now about 20 minutes. In addition, the number of transcriptionists has been reduced by four through attrition or reassignment to other roles in the radiology department.

According to Don Fallati, senior vice president of strategy and marketing at Dictaphone, many institutions decide to eliminate their transcription service at the same time they deploy PowerScribe. He says the self-editing mode has proven to be the most efficient approach for most of the company's 5,000 radiology customers. The product has been interfaced with systems from all leading PACS and RIS vendors.

PowerScribe supports templating, in which standard normal text is inserted through voice commands. For example, the clinician dictates "standard normal chest" and an entire block of text appears. If the text needs adjustment, the interactive software makes it easy to edit. The product also offers a form fill-in function, in which a standard form appears and the user can tab through the fields to complete it.
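The idea behind voice templating can be illustrated with a simple lookup: a recognized phrase is checked against a table of stored text blocks and, on a match, replaced by the block. The Python sketch below is only an illustration of that idea; the `TEMPLATES` table, the placeholder report text and the `expand_template` function are hypothetical and do not reflect how PowerScribe is implemented.

```python
# Hypothetical template table; the report text is a placeholder, not
# clinical wording taken from any product.
TEMPLATES = {
    "standard normal chest": "[stored text block for a normal chest exam]",
}


def expand_template(dictated, templates=TEMPLATES):
    """Return the stored block if the phrase names a template.

    The phrase is lowercased and its whitespace collapsed before lookup,
    so "Standard  Normal Chest" matches too. Anything that is not a
    template name is returned unchanged as ordinary dictation.
    """
    key = " ".join(dictated.lower().split())
    return templates.get(key, dictated)


if __name__ == "__main__":
    print(expand_template("Standard normal chest"))
    print(expand_template("Small nodule in the right upper lobe"))
```

Because the expanded block is ordinary text, it can then be edited like any other part of the report, which matches the adjustment step described above.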

One caveat to smooth operation of these systems is that they require good dictation habits. Eating lunch while dictating does not work well. However, even users who are not native English speakers can train the system to recognize their speech.

The PowerScribe product offers a "train-a-word" function. If users find specific terminology consistently produces an error, they can access a special pop-up window, type in the word, pronounce it two or three times, and the system will save that correction for future use.


Doyle Rabe, CEO of Austin Radiological Associates, has used ProVox Technologies Corp. products for the practice's 57 radiologists in 14 outpatient imaging centers since the capability was rolled out in 2002, in conjunction with its Fujifilm Medical Systems Synapse PACS.

"One of the things we really liked was the percentage of [word] recognition, that is now in the 90 percent range," Rabe says.

He describes another feature that lets physicians customize their own macros to speed voice recognition dictation even more. For example, the physician states "chest two" to indicate a negative finding.

Miki Rice, who serves as the ProVox application manager at Austin Radiological, says that it takes her a few hours to train radiologists to create macros. Overall, the practice has found that macros reduce turnaround time dramatically, and it has been able to reduce transcriptionist staff by four FTEs. The practice uses a model in which the clinician dictates and former transcriptionists edit the report.

ProVox uses the IBM ViaVoice speech engine, which has been adapted to recognize medical terms. As new words are spoken, they are added to the built-in vocabulary to further improve accuracy.


ScanSoft Inc. released Dragon NaturallySpeaking 7 speech recognition software in March 2003. It is designed as a Windows-based application that is capable of integrating with most popular electronic medical records (EMR), hospital information systems (HIS) and RIS. Because radiology is one of the most dictation-intensive of all subspecialties, ScanSoft personnel developed a specialized dictionary for use in radiology settings. The latest version will be available in the next several months.

LANtek Computer Service in Kutztown, Pa., is a computer network solution provider that helped integrate the Penn State Milton S. Hershey Medical Center's IDX ImageCast RIS/PACS with the ScanSoft NaturallySpeaking radiology package. With full deployment of this integrated system, every computer runs Dragon NaturallySpeaking and physicians dictate into a microphone and self-edit their reports.

Tunc Iyriboz, MD, medical director, RIS-PACS, at Penn State Milton S. Hershey Medical Center, explains that once LANtek completed the integration, the department required that voice recognition be used, with no alternatives. As a result, the department saw a tremendous improvement in patient care quality because reports could be directed to referring physicians more efficiently. Report turnaround time fell by more than 80 percent, from a typical 72 hours to 10 hours. Because the center is an academic healthcare setting with resident physicians, the time can never be eliminated entirely: the workflow requires the resident to read the study and generate a report, which is then reviewed by an attending physician.

While their department is well on its way to becoming fully digital, the fact that some of their studies are produced on film impacts the turnaround time as well because radiologists must review prior film-based studies.

"Without voice recognition, I don't think our RIS-PACS would have necessarily improved our turnaround time," Iyriboz asserts. "We would have had some efficiency created by a RIS-PACS solution, but staying with a conventional dictation system or even a transcriptionist system, I don't think we could have accomplished the results we have achieved."

His colleague Timothy Yanchuck, team lead, RIS-PACS, concurs, "To bring in a modern RIS-PACS solution and not include voice recognition would have been foolish."


MedQuist has launched a new software product called SpeechQ for Radiology that is integrated with the Philips Medical Systems SpeechMagic engine and offers two approaches to voice recognition. Front-end speech recognition allows the physician to dictate, edit and complete the report in real time. Back-end speech recognition is server-based, with recognition happening in the background: the physician dictates the way he or she always has, on one of multiple input devices, and transcription is maintained in the loop. The first installations of this new product were completed in June, and users are still in the initial stages of use.

The Philips SpeechMagic engine provides Intelligent Speech Interpretation (ISI) with punctuation assistance that eliminates the need for radiologists to dictate commas and periods. If such punctuation is dictated anyway, the system recognizes it and does not insert it twice.
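How SpeechMagic handles this internally is not described here. As a rough illustration of the general idea only, the Python sketch below turns spoken punctuation words into marks and then collapses any mark that appears twice in a row, which is what would happen if automatic punctuation and a dictated "comma" landed in the same place. The function name and word list are assumptions made for this example.

```python
import re

# Assumed mapping of spoken punctuation words to marks (illustrative only).
SPOKEN_MARKS = {"comma": ",", "period": "."}

_SPOKEN = re.compile(r"\s*\b(comma|period)\b", re.IGNORECASE)
_REPEATED = re.compile(r"([,.])(\s*\1)+")


def normalize_punctuation(text):
    """Convert spoken punctuation words to marks and drop duplicates.

    Limitation: a naive word match like this would also convert a
    legitimate use of the word "period"; a real engine would need
    context to tell the two apart.
    """
    # Attach the mark directly to the preceding word.
    text = _SPOKEN.sub(lambda m: SPOKEN_MARKS[m.group(1).lower()], text)
    # Collapse runs of the same mark, e.g. ",," or ". ." into one.
    return _REPEATED.sub(r"\1", text)


if __name__ == "__main__":
    # The first string simulates auto-inserted marks plus dictated ones.
    print(normalize_punctuation("No acute disease, comma lungs clear. period"))
    print(normalize_punctuation("lungs are clear period"))
```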


Numerous radiologists who use voice recognition technology to generate imaging reports agree that they don't want to return to old workflow patterns involving conventional dictation/transcription services. The efficiency they gain offers measurable benefits to patient care and effective management of their time. Although initial enrollment in these systems takes effort, the ultimate payoff far outweighs any up-front challenges, assuming the voice recognition system is accurate.