Frank Talk on Speech Recognition

Beth Walsh | October 01, 2006 | Enterprise Imaging

Speech recognition is giving radiologists and other specialists improved efficiency and much quicker report turnaround times, as long as they’ve chosen the right system. Success also requires good training and the right attitude across the facility.

After a disastrous first outing with speech recognition that lasted just eight days, Janet Korgeski, system administrator, clinical technology, at Moses Taylor Hospital in Scranton, Pa., didn’t give up on the technology. She is in the process of implementing SpeechMagic from Philips Speech Recognition Systems. For the most part, the physicians are “willing to give it a whirl,” she says. “They’re giving me another chance, and I wouldn’t be trying it if I didn’t think the product was worth it.”

Using SpeechMagic required writing an interface between the RIS/PACS and the transcription system. While that was an extra step, Korgeski says it worked out to their advantage because “by writing our own interface, we could customize it to work the way we needed it to.”

Saving time, increasing revenue

All of the radiologists at Moses Taylor are using SpeechMagic on the back end. They dictate their reports into the computer, and the resulting text is sent to a transcriptionist, who makes corrections and sends the report back to the physician for signature. That process alone has saved the hospital 30 percent of transcription time. Some physicians skip the transcriptionist and instead review and sign off on their own reports, which are then sent directly to the units and referring physicians. That saves 70 percent in transcription time. Either way, speech “is still saving us time,” Korgeski says.

While many hospitals look to speech recognition technology to save on transcription costs, Moses Taylor is taking a different approach. “We’re not looking at cost savings but revenue enhancements,” she says. The goal is to increase business by having more physicians order their tests at the facility. The institution is feeling the competition from two other hospitals in the same market as well as standalone facilities popping up in the area. The biggest goal right now is providing same-day service, where, for example, a physician can order an MRI and have the procedure performed the same day. That is a service some of the smaller imaging centers offer, and Korgeski says Moses Taylor wants to compete with that service.

So far, the hospital has been able to increase its revenue by about $400,000 in the past six months just by bringing in more outpatients. Transcription FTEs have not decreased; rather, the department is now offering more services to physicians. For example, they have created the shell of a discharge summary, which means three-quarters of this thorn-in-the-side task is already done for physicians. “It is a real feather in our cap as far as a service we offer. It has attracted new physicians,” Korgeski says.

Meanwhile, Korgeski will move individual physicians to front-end use of speech over time. Not everyone is a good candidate, and scheduling issues will slow down some progress, but quite a few physicians who have experienced time savings have talked up the technology. “Word is getting out, which helps us,” she says.

Fighting preconceived notions

Some physicians feel that speech recognition is an attempt to turn them into transcriptionists. Korgeski fights this view by explaining to physicians that they can easily review and sign off on 20 reports several times a day rather than slog through more than 100 reports all at once at the end of the day. “I encourage them to go into their inbox throughout the day,” she says. “You can get rid of 20 reports relatively quickly, but 150 reports is a little overwhelming.” Aside from that convenience, the more quickly a radiologist signs off on a report, the more quickly it gets back to the clinician who ordered the study.

Another plus of SpeechMagic, according to Korgeski, is that when a new physician starts dictating, the product looks for word patterns and intonations. “Every time a report comes over and corrections are made, it learns. Some products don’t have that continual learning curve.” That helps Korgeski sell the technology to physicians. “Physicians aren’t real patient with this. They want it almost perfect, if not perfect. They don’t want to put a lot of time into it.”

Aside from a couple of holdouts, the physicians are eager to move forward with speech recognition, Korgeski reports. “I’m taking baby steps. I’m not about to take on a bunch of people at one time. I stay with one physician until he or she is comfortable and then move on to the next one. That approach seems to be working.” Even with this gradual approach, the hospital is seeing benefits from back-end use. As she moves physicians to front-end use, “we’re only going to see more savings and quicker turnaround times.”

Second time is the charm

Rasu Shrestha, MD, radiologist and informatics director for the University of Southern California (USC) Hospital in Los Angeles, also had a bad experience with the first speech vendor the facility tried. The first system was not well integrated with RIS, and the vendor did not respond well to his queries, he says.

Since 2005, Shrestha has been using Dictaphone’s PowerScribe and hasn’t looked back. The product was installed on the back end, and “we were able to go live overnight,” he says. Implementation is unique in that users completely self-edit their reports; there are no transcriptionists onsite. Shrestha had a few complaints about that, but he had worked closely with the department chairman to implement the technology and plan for a drop-dead, go-live date. He was particularly interested in PowerScribe because it is web-based. “USC has hospitals spread across the campus. Anyone can sit down in any reading room and dictate a report.”

Shrestha started out with speech in an outpatient imaging center and used that as a model for the rest of the USC campus. “The plan was to implement and integrate with our PACS and RIS and 3D post-processing for one nice package. This facility is filmless and paperless, and the rest of the campus is moving ahead on that path,” he says.

Once the facility was up and running, Shrestha saw the productivity gains originally anticipated with the technology. “Other hospitals in the system have a turnaround time of seven days. We average four hours, but are sometimes done within minutes.”

Shrestha and the other PowerScribe users at USC are very happy with the product, he says. In fact, he is now part of the clinical advancement committee for Dictaphone.

Phenomenal productivity increase

Not all speech users spend a lot of time researching and planning for implementation of the technology. Southern Ohio Medical Center is in its third year of speech recognition use, says Howard Stewart, RIS/PACS administrator. The facility was looking for a new transcription vendor but decided to look at speech options. It chose Agfa Healthcare and initially set the product up as a digital dictation system. Then one of the radiologists wanted to see the speech technology in action, and “we literally turned from a digital dictation site to a speech recognition site,” Stewart says. The staff had planned on going live with digital dictation and phasing in speech but “took off with it,” he says.

The increase in productivity has “been the most phenomenal thing,” Stewart says. The facility had worked hard to get report turnaround time down but could not do any better than 11 to 12 hours. “Within three weeks of implementing TalkTech, we were down to a two-hour turnaround time averaged over 24 hours.” Everything is dictated as it comes in. ER and outpatient cases are running less than 20 minutes from the time the exam is completed to the time the radiologist generates a signed report. The result so far is “a whole lot better than what we expected,” Stewart reports.

Southern Ohio has even been able to produce more radiology reports with fewer radiologists. Eight radiologists did 148,000 exams when speech was first implemented, and in 2005, seven radiologists produced 164,000 reports. Plus, they’ve been able to completely eliminate errors based on looking at the wrong patient’s images. They’ve also drastically reduced conflicts regarding “right” and “left,” says Stewart. That was an error that transcriptionists wouldn’t catch. “We’ve documented fewer significant clinical errors than we did before. The workflow makes it impossible for radiologists to do their job without reviewing reports.”
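To put those volumes on a per-radiologist basis, the arithmetic can be sketched as below. This is only an illustration built from the two figures Stewart quotes. It assumes that “exams” and “reports” count the same thing and that both numbers cover a comparable 12-month period, neither of which the figures themselves confirm. The function and variable names are made up for the example.

```python
# Volumes and headcounts as quoted by Stewart; treating "exams" and
# "reports" as equivalent annual counts is an assumption of this sketch.
exams_at_launch, radiologists_at_launch = 148_000, 8
reports_2005, radiologists_2005 = 164_000, 7


def per_radiologist(volume: int, headcount: int) -> float:
    """Average number of studies read per radiologist."""
    return volume / headcount


before = per_radiologist(exams_at_launch, radiologists_at_launch)
after = per_radiologist(reports_2005, radiologists_2005)
increase_pct = (after - before) / before * 100

print(f"{before:,.0f} -> {after:,.0f} per radiologist "
      f"({increase_pct:.1f}% increase)")
# prints: 18,500 -> 23,429 per radiologist (26.6% increase)
```

On those assumptions, each radiologist handled roughly a quarter more studies in 2005 than when speech was first implemented.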

Stewart says he sold the radiologists on the technology by emphasizing the productivity improvement. The ER physicians and surgeons are happier, he reports. And the radiologists can now compare a number of imaging studies taken over several hours.

The facility also was able to go from being understaffed in transcription to adequately staffed. The radiologists review and edit all of their own reports, so the open transcriptionist positions simply weren’t refilled.

Slow and steady implementation

Inland Imaging, a group of imaging centers based in Spokane, Wash., implemented speech recognition earlier this year. The company had gone to full PACS, wanting to “get film out of the equation” before moving to speech technology, says CIO Jon Copeland.

Inland Imaging chose MedQuist as its speech vendor, opting to phase it in across the organization five to 10 radiologists at a time. Radiologists also have the option to go back to traditional dictation if they want to, but Copeland says that option might be eliminated by the end of the year. It probably won’t be missed, however. Most of the radiologists were ready for implementation, Copeland reports. And they’ve seen the benefits. There used to be 1,200 to 1,500 reports waiting in the queue at the end of the day. Now there are zero. The average report turnaround time went from well over 24 hours to about 10 hours, he says.

Most users are self-editing and have been since day one, but the system allows them to send their dictation to transcription for almost immediate turnaround. Eliminating transcription is not a goal, says Copeland. “We expect to get more efficient and have fewer transcriptionists over time. Volume is increasing 10 to 20 percent per year as we grow, but we want to grow without having to hire more transcriptionists,” he says.

Another benefit of the new technology is the ability to establish a DICOM presentation state. “We can configure a set of images to look a certain way,” says Copeland. “At some point in time, we want all reports to look the same regardless of who generated it.” That standardization will lead to better-quality, more structured reports. Users will know what to look for in each report and where to find it. “SpeechQ lends itself to that very well,” Copeland says.

The move to speech has been well worth the effort, Copeland says, but he warns not to underestimate the training needed to gain fluency. The group is in the midst of a sweep of advanced training and had to assign more users to follow-up training than initially anticipated. However, he says the people from MedQuist have been extremely helpful throughout the process. “For little problems and enhancements, they are there and they are listening,” he says. “They fly in from Atlanta. It means a lot to know that your vendor is there and listening and cares that you are a success. This is and will be a success. There’s no doubt about it.”

Buffalo Medical Group in New York has been using Crescendo Systems’ Speech Processing powered by SpeechMagic since late 2004, the organization’s first foray into speech. At this point, all of the reports generated through speech go through transcription for quality assurance and then are sent back to the radiologist for sign-off, says Michelle Roesler, director of radiology.

Speech is integrated with PACS, which simplifies workflow and increases productivity. Radiologists can begin their dictation immediately without the need to enter details such as bar codes. Completed results can be seamlessly uploaded to the PACS through an HL7 connection. All information is exchanged electronically through a bidirectional interface. Once dictation is complete, the report is processed on the back end by the speech recognition server and made available to a transcriptionist for correction. The integration between the two environments also eliminates the need for the user to log on twice. DigiDictate-IP is started and ended automatically by logging on and off PACS. When a radiologist pulls up images on PACS, the medical record number is filled in automatically.
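For readers unfamiliar with what travels over an HL7 connection, the sketch below assembles a minimal HL7 v2 result message of the kind commonly used to deliver a finished report. It is not Crescendo’s or Buffalo Medical Group’s actual interface. The choice of an ORU^R01 message, the application and facility names, the observation identifier, and the sample values are all assumptions made for illustration.

```python
from datetime import datetime

FIELD_SEP = "|"
SEGMENT_SEP = "\r"  # HL7 v2 segments are terminated by a carriage return


def build_result_message(mrn: str, accession: str, report_text: str,
                         sent_at: datetime) -> str:
    """Assemble a bare-bones ORU^R01 message carrying one report.

    Hypothetical sketch: real interfaces carry many more fields and must
    escape delimiter characters that appear inside the report text.
    """
    timestamp = sent_at.strftime("%Y%m%d%H%M%S")
    segments = [
        # MSH: encoding characters, sender, receiver, time, type, ID, version
        ["MSH", "^~\\&", "SPEECH", "RADIOLOGY", "PACS", "RADIOLOGY",
         timestamp, "", "ORU^R01", f"MSG{timestamp}", "P", "2.3"],
        # PID-3: patient identifier (the medical record number)
        ["PID", "1", "", mrn],
        # OBR-3: filler order number (used here for the accession number)
        ["OBR", "1", "", accession],
        # OBX-2: TX (text), OBX-5: report body, OBX-11: F (final result)
        ["OBX", "1", "TX", "REPORT", "", report_text,
         "", "", "", "", "", "F"],
    ]
    lines = [FIELD_SEP.join(fields) for fields in segments]
    return SEGMENT_SEP.join(lines) + SEGMENT_SEP


message = build_result_message(
    mrn="12345",                      # made-up sample values
    accession="ACC1",
    report_text="No acute findings.",
    sent_at=datetime(2006, 10, 1, 8, 30, 0),
)
print(message.replace(SEGMENT_SEP, "\n"))
```

Because the medical record number and order details already travel in the message, the radiologist does not have to key them in, which is the time saving Roesler describes.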

Users appreciate the integration most when they have to work without it, as when they read a mammogram, which is still analog. “They have to type in all the information and then start dictation. It’s an extra step they get irritated by,” says Roesler.

Another feature is voice streaming over IP. The Crescendo Wide Area Service (CWAS) is based on voice streaming technology and designed to automatically determine the fastest method for sending files to the central system. CWAS eliminates bandwidth-intensive voice file transfer methods such as FTP, e-mail, or file copying. Dictations, whether completed in the office or on the go, are immediately made available for transcription. With CWAS, all client/patient information is stored on the main server, along with facility, department, document type, and template details. This means that only the required information needs to be transferred to the transcriptionist. No file is ever stored on the local PC, preserving voice file integrity and ensuring an optimal level of security.

Conclusion

The bottom line on speech recognition today is that it works and works well. The goals and methods of implementation are important, however. Going into it as a cost-saving measure is the wrong idea, says Stewart. “Trying to sell speech to radiologists to save money on personnel probably isn’t going to work,” he says. “They aren’t going to care what the hospital spends to support them.” Look at it from the productivity standpoint instead, he advises. “Our group of radiologists believe that the results are worth the effort.” They didn’t go into speech because they wanted to do less work. They wanted to improve their productivity, improve their turnaround time to referring physicians, and improve processes and service, he says. “The attitude was different than a hospital administrator trying to save money.”

And rather than turning the radiologists into transcriptionists, they are getting control of all of their work. “They like that,” says Stewart.

Speech recognition is almost inevitable, says Roesler. “With the explosion of CT and fusion images, radiologists are now going to be looking at thousands and thousands of data sets to do one dictation. We have to find some way to be more efficient, and voice is one way to do that.”

“Voice recognition has come of age,” says Shrestha. “Now is the time when workflow and radiology department efficiencies are the focus. Radiology is a business. It’s important to adopt new technology — it goes hand in hand with providing excellent patient care.”

Beth Walsh, Editor

Beth earned a bachelor’s degree in journalism and master’s in health communication. She has worked in hospital, academic and publishing settings over the past 20 years. Beth joined TriMed in 2005 as editor of CMIO and Clinical Innovation + Technology. When not covering all things related to health IT, she spends time with her husband and three children.
