Speech Recognition Turns Up the Volume

Physicians across the enterprise are turning to speech recognition and reporting a number of benefits. Consider:

  • Sarasota Memorial Hospital Radiology Associates in Florida trimmed completion time for stat reports to less than 22 minutes after deploying Dictaphone Corp.'s PowerScribe speech recognition solution. Standard report turnaround time dropped from a range of 24 to 36 hours to less than eight hours. What's more, the practice eliminated $300,000 in annual transcription costs.
  • The pathology department at Bronson Memorial Hospital in Kalamazoo, Mich., found that speech recognition filled the gap when two of its four transcriptionists retired. ScanSoft Inc.'s Dragon NaturallySpeaking also facilitates instant turnaround once dictations are complete.
  • Mayo Clinic Jacksonville, a large integrated group practice in Jacksonville, Fla., trimmed turnaround time to less than two hours with SpeechMagic from Philips Speech Recognition Systems. Turnaround time is important; however, the real gains are improved efficiency and enhanced quality and patient safety, says Reginald D. Smith, EdD, vice chair, department of applied informatics.
  • University Medical Center of Southern Nevada in Las Vegas deployed MedQuist's SpeechQ for Radiology as a key enabler of its paperless/filmless department. Voice recognition is necessary for maximum efficiency, says Carl A. Recine, MD, chief of radiology. The department cut its turnaround time from a range of 12 to 24 hours to one to two hours within weeks of implementing SpeechQ.

There are a number of points to consider when entering the world of speech recognition. Integration with existing information systems - particularly the RIS, but also PACS - is critical. How does the practice or department or enterprise plan to introduce speech? Will all users adopt the new technology in one fell swoop? Or is a gradual introduction more appropriate? And what process will the practice use?

Speech can be implemented in two flavors - front-end and back-end. Front-end speech recognition removes transcriptionists from the process, producing a more dramatic effect on costs and turnaround time, but it requires physicians to assume complete responsibility for dictations and self-corrections. Back-end speech recognition is closer to conventional dictation processes and relies on transcriptionist/editors as middlemen in the process.

It's all about (report) turnaround

Sarasota Memorial Hospital Radiology Associates, a 14-radiologist practice serving an 828-bed medical center, maintained an average turnaround time with an outside transcription service. Average, however, doesn't necessarily cut it for 21st century medicine. In addition, efforts were duplicated for ER reads. "ER physicians want immediate reports, so radiologists would dictate into the transcription service then write or call the wet read to the ER," recalls Kirk Conrad, MD, medical director of neurointerventional services. Labor duplication isn't the only drawback of this method; decoding hastily handwritten notes can be trying and represents a potential patient safety concern.

Early in 2004, the practice deployed Dictaphone's PowerScribe speech recognition system to create a more efficient and safer reporting process. Radiologists dictate directly into the system, which sends the report to the RIS. The system is configured to autofax results to the ER. "ER physicians are very happy with the system, especially with CTs of the chest, abdomen and pelvis and ultrasound [exams], [which can be lengthy with multiple findings]," claims Conrad.

Sarasota Memorial Hospital Radiology Associates opted for the 'cold turkey' approach to speech recognition - pulling the plug on transcription services when it deployed speech. "There was some rumbling, but most radiologists learn the system very quickly," says Conrad.

Speech recognition features can sell reluctant users on the technology. Because radiologists proofread their own reports, future review of transcribed reports is eliminated. And macros represent another hefty time saver.

Vendors typically supply some macros such as "normal chest x-ray." The radiologist says the three-word phrase to produce a report. This takes less time than a 15-second dictation, and as results are always the same, there is no need to proofread. Users also can customize macros and create canned reports. For example, a customized macro for a kidney ultrasound includes x, y and z as blank fields, and the radiologist dictates the patient-specific x, y and z measurements to fill them. Conrad relies on canned reports for complex studies like neck CTs. The canned dictation serves as a template or checklist that covers all anatomic spaces and landmarks. "The result is a very detailed, organized report. There is much greater satisfaction on the part of physicians who read the report," explains Conrad.

RIS and PACS integration are important points to consider when deploying speech, says Conrad. "The most important connection is the RIS. The speech recognition system must send information to the RIS," he says.

Sarasota Memorial Hospital Radiology Associates is in the PACS replacement market and aims to integrate images from PACS, previous reports from RIS and speech. The plan is to create an unread films list. The radiologist could click on a name to access images. At the same time, PowerScribe opens with the appropriate accession number for a seamless viewing and reporting process. "It's definitely doable. We have to get the different vendors talking to each other, so someone can transfer information from the different systems," explains Conrad.

At this point, PowerScribe has not significantly impacted workflow. "The advantage is the elimination of transcription and immediate availability of reports," says Conrad. He foresees increased efficiency as the practice eliminates paper. "We're working to get rid of the file room and put everything on the computer to increase efficiency and improve turnaround time. This isn't a direct result of speech recognition, but speech enables paperless."

Currently, Conrad and another radiologist in the practice are engaged in a pilot with three programs that create a model integrated, paperless work process. "With this pilot, we can turn out reports almost real-time," Conrad shares.

Quality counts

Patients tend to move around within the large, integrated practice at Mayo Clinic Jacksonville, seeing a variety of clinicians for different tests. Transcription adds delay in turnaround time, which can affect the quality and safety of patient care, says Smith. More than 10 years ago, the practice realized speech recognition could yield improvements in the practice of medicine. At the same time, speech was seen as a way to trim the group's whopping $6 million in annual transcription costs.

Early speech recognition systems focused on prescribed vocabularies for specialties like radiology. "We are a general medicine practice and needed a solution that could serve all physicians," explains Smith.

In 2000, Mayo partnered with a small speech company, but found its product was not an intelligent system. That is, it could not learn new vocabulary or speech patterns from users. A second effort with another speech recognition vendor produced good results. Unfortunately, the system did not integrate with the existing HIS.

"We needed a solution that could auto-feed and interface into the electronic record with a great deal of control, so that dictations would not be tied to the wrong record," explains Smith. After de-installing the second system, the practice returned to square one and evaluated all of its options. This time, Mayo selected Philips SpeechMagic with Fusion Text from Dolbey and Digital Voice Inc. The clinic was confident the integration piece was taken care of because Dolbey had served as its transcription company, so it was already embedded in the HIS, says Smith.

The practice opted for a gradual rollout of back-end speech recognition for its 160 physicians. Smith explains, "This is the least disruptive model and requires the least education. Physicians dial into a digital storage facility. After recording the medical record number and type of note, the physician dictates the note. The computer completes a rough draft, which is routed to transcriptionist/editors, who edit and release the note."

The practice plans to embark on a front-end speech recognition pilot this year. This model eliminates transcription completely - with physicians dictating into a workstation and editing on the fly. Smith does not foresee a front-end takeover at Mayo. "Clinical note generation is not a one-size-fits-all solution," claims Smith. Indeed, one of the practice's physicians types reports directly into templates on Cerner Corp.'s PowerNotes system. The report is populated using mouse clicks for common diagnoses and goes directly into the medication ordering and billing systems. On the downside, this approach is difficult for more complex conditions with co-morbidity, says Smith.

A practice transformed

"Speech recognition technology has transformed our work and practice," confirms Nigel Bramwell, MD, a pathologist in a six-physician practice at Bronson Memorial Hospital. The practice had tried an early speech recognition system, but found the system was not ready for prime time. When two of the four transcriptionists supporting the specialty group retired, the practice realized it was time to re-visit the speech market.

The same constraint that made it difficult to replace retiring transcriptionists - a very specialized vocabulary - also impacted the speech recognition implementation process. One of the practice's IT-savvy pathologists was released from most of his regular duties for several months to implement and fine-tune the practice's speech recognition solution - the ScanSoft Dragon NaturallySpeaking Medical system - boosting and customizing the system's 300,000-word vocabulary.

The practice transitioned to speech recognition in 2003 and now creates medical reports in real time by dictating notes directly into PCs. Pathologists use wireless headsets to enable mobile dictation in the cutting room. During the implementation phase, Jeff Pearson, MD, medical director, created a macro library of standard report templates. The practice now reaps ongoing dividends from the upfront time investment as many reports can be created with a few key words.

With Dragon NaturallySpeaking, reports are electronically distributed as soon as the pathologist dictates his findings. "Physicians have been very complimentary about our improved turnaround time, which consistently ranks high in clinical surveys," shares Bramwell. Turnaround time for most pathology reports is 24 hours, as fast as possible for studies that require overnight tissue processing. Gross diagnoses that do not require overnight processing and a subsequent microscopic exam can be dictated within minutes of receiving the specimen, he says. The rapid turnaround satisfies patients as well as physicians. "Patients are anxious and want results fast," says Bramwell.

The pathology department reduced reliance on manual transcriptionists by 95 percent. Two of the four transcriptionists have been redeployed to other departments, and the remaining staff focuses on value-added tasks like customer service, billing and office management.

Flexible speech recognition?

Forty radiologists rotate through University Medical Center (UMC), and some pass through the center just once or twice monthly. Recine realized that it would be a challenge to train all 40 radiologists, especially infrequent users, to operate a speech recognition system. Thus, in addition to looking for a system that could completely integrate with its RIS and PACS, UMC required a flexible, user-friendly option.

SpeechQ fits the bill because it allows both front- and back-end speech recognition, says Recine. Some users employ front-end speech recognition, dictating, editing and signing reports in one session. "This approach maximizes the radiologists' and referring physicians' efficiency. It eliminates extraneous phone calls and faxes. Now the final, signed report is attached to the study," explains Recine. With front-end speech, radiologists can present referring physicians with a report within one to two minutes after signing it.

Another time-saver stems from the RIS/speech integration. The RIS, rather than the radiologist, populates demographic information; the radiologist simply states the accession number for the study. SpeechQ matches the accession number on PACS, linking the dictation with the appropriate images.

SpeechQ facilitates back-end transcription as well, so if a radiologist does not want to edit his reports, he can send them to transcription for editing.

Recine looks forward to added flexibility in the future, too. SpeechQ can be driven offsite - either by the UMC Philips PACS or the practice's eMed PACS. Either way, the configuration will enable radiologists to complete night calls from their home office rather than onsite at the hospital. Telephony access to the system is in the works, which means that residents without internet access will no longer need to track reports. Instead, they can listen to SpeechQ reports on the telephone.

Currently, the UMC radiology department relies on a paper trail safety net to track studies and reports among the hospital and outlying clinics; however, as all users complete training, the plan is to dictate straight from a worklist without paper.


Speech recognition brings a number of benefits. For starters, it decreases turnaround time, which can boost referring physician satisfaction and enhance patient care. Speech reduces duplicative workflow steps for stat reports as radiologists no longer need to call the ER with wet reads. It also reduces costs, allowing practices to redeploy transcriptionists. A final gain? With speech, departments can realize the promise of going paper-free, further boosting efficiency and trimming costs.


The Speech Recognition Checklist

Speech promises some hefty gains - increased efficiency, reduced costs and shorter turnaround times. Achieving these ends requires due diligence and upfront legwork. Some questions to answer during the planning and evaluation stages include:

  • What are the site goals for speech?
  • How will speech be practiced? Weigh the pros and cons of front-end and back-end options. If a combination approach is desired, make sure the system can support it.
  • What systems must the speech solution integrate with? Integration is not necessarily a given and is most often far from easy. Upfront due diligence can reduce future glitches.
  • How will the system be introduced? A big bang introduction that pulls the plug on transcription encourages faster learning and acceptance among physicians. But a staged process that starts small allows the practice to thoroughly test the system and work out any bugs.

"Complete site visits. Analyze how the system works at other institutions. Make sure what is promised is in practice somewhere," says Carl A. Recine, MD, chief of radiology at University Medical Center of Southern Nevada in Las Vegas. Ideally, the sites visited should mirror the conditions and parameters at your facility. That is, if your site plans on a front-end model with RIS-PACS integration, look for a similar arrangement in site visits.

Once speech is implemented, it's important to have someone on hand who understands the system and can meld the professional needs and the technical abilities of the system, says Recine. This may be a vendor employee, IT staff member or department super-user depending on the size and needs of the facility.

Speech recognition can be deployed in a variety of configurations. Most work and yield results that are worth talking about - decreased turnaround time, enhanced patient care and reduced costs.