The Stepping Stone to Structured Reporting: What Does Natural Language Processing Mean for Radiology?

Like much of healthcare, radiology is in a state of flux. From reimbursement cuts to EMR adoption, day-to-day operations are being transformed. Dictation software is evolving too, as natural language processing (NLP) is developed to turn dictated free text into a manageable report that radiology can put to use.

What’s in a name?

NLP is a broad term, according to Daniel L. Rubin, an associate professor in the department of radiology at Stanford University in Palo Alto, Calif., who conducts research in NLP. From a 10,000-foot view, NLP is the “application of computer methods to understand and comprehend free text,” Rubin says, adding that the “holy grail is to have a computer understand the meaning of text equivalently to how a human reading that same text understands it.”
Because the major applications for NLP in radiology tasks are so specific, Rubin describes four computer processing tasks that fall under the broader concept of NLP that can assist radiology workflow:

  • Text Classification: A text report or sentences in the report are run through a classifier to label the reports, such as for automated ICD coding.
  • Named Entity Recognition: Recognizes findings, diseases, devices and diagnoses in reports.
  • Information Extraction: Pulls out from reports particular types of factual statements (such as the anatomic location of an imaging finding) or recommendations; these statements convey elemental facts for future tabulation.
  • Information Retrieval: Searches a large database of text reports for those that match certain query criteria.
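
The four tasks above can be illustrated with a deliberately simple Python sketch. Everything in it is an assumption made for illustration: the three toy reports, the term list, the location pattern and the function names are hypothetical, and production systems rely on far richer models than keyword and pattern matching.

```python
import re

# Hypothetical toy reports; not taken from any real system.
REPORTS = {
    "r1": "Filling defect in the right lower lobe pulmonary artery "
          "consistent with pulmonary embolism.",
    "r2": "No acute fracture. Mild degenerative change of the lumbar spine.",
    "r3": "Nodule in the left upper lobe. Recommend follow-up CT in 6 months.",
}

# Hypothetical dictionary of finding terms and a pattern for anatomic locations.
FINDING_TERMS = ["pulmonary embolism", "fracture", "nodule", "filling defect"]
LOCATION_PATTERN = re.compile(
    r"\b(?:in|of) the ((?:right|left) (?:upper|middle|lower) lobe|lumbar spine)"
)


def classify(text):
    """Text classification: label a report by whether it makes a recommendation."""
    if re.search(r"\brecommend", text, re.IGNORECASE):
        return "has_recommendation"
    return "no_recommendation"


def recognize_entities(text):
    """Named entity recognition by dictionary lookup (ignores negation)."""
    lowered = text.lower()
    return [term for term in FINDING_TERMS if term in lowered]


def extract_locations(text):
    """Information extraction: pull out anatomic locations as simple facts."""
    return LOCATION_PATTERN.findall(text.lower())


def retrieve(query):
    """Information retrieval: IDs of reports whose text contains the query."""
    return sorted(rid for rid, text in REPORTS.items()
                  if query.lower() in text.lower())
```

Note that the dictionary lookup reports "fracture" for r2 even though that report says "No acute fracture"; handling negation is a separate problem.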

Because radiology uses a large vocabulary of terms and since reporting styles tend to vary among individual readers, there are challenges to achieving fully structured reporting, says Keith Dreyer, DO, PhD, vice chairman of radiology and informatics at Massachusetts General Hospital (MGH) in Boston. “We feel that NLP is a powerful interim solution until we achieve full structured reporting. That said, I think we’ll utilize NLP for a very long time, particularly as we become more accountable for our outcomes.”

“All of our historical reports and medical literature are in free text and unstructured, so NLP methods will always be needed, at least to extract and help radiologists access, leverage and use these past reports,” Rubin says. “[This] is important not just for research or educational purposes, but also for clinical care, especially when a clinician is confronted with an unknown case.”

“You can’t improve anything you can’t measure,” says John Mattison, MD, assistant medical director, CMIO, at Kaiser Permanente Southern California. Mattison, who has been working on integrating NLP technologies (powered by Nuance) with evaluation and management (E/M) coding metrics for Kaiser’s HealthConnect EHR, stresses the need to be able to disambiguate data and normalize the construction of free text data across a patient’s entire record.

According to Mattison, when those data are coded consistently into the EHR, clinicians such as radiologists could apply rule-based evidence to pick out previous information on patient encounters, a patient’s history, the course of events leading up to an event and the history of analysis. He says that while it has taken a while to validate the NLP technologies and E/M coding metrics because they are so complicated, Kaiser Permanente is about a year away from integrating the two systems.

Time to shine, mine, slice, dice and parse language

“It’s no secret that radiologists who have been practicing for 10 to 20 years, for the most part, hate voice recognition because they feel it will slow them down and they don’t want to become editors,” says David J. Marichal, RT, CIO and COO at Radiology & Imaging Specialists in Lakeland, Fla. However, since adopting a new reporting tool, GE’s Centricity Precision Reporting (powered by M*Modal), in November 2008, Marichal says the group’s radiologists couldn’t be happier using NLP to quickly complete reports in the group’s multi-modality imaging centers.

“[The reporting system] simultaneously launches when viewing images and the RIS auto-populates report data so the radiologist doesn’t have to fish or hunt for information,” says Marichal. The reporting system is adaptive, so that it learns on the back-end the patterns of the clinicians using it. This means that the more radiologists use the system, the better it gets at recognizing often-used words and phrases.
“NLP can figure out from context where information should go and where it fits into the report,” says Marichal. “It can even correct syntax sometimes if a clinician slips up before the words go into the report and because that syntax is then part of the report. Because it uses clinical document architecture, you can data mine in the future.”

Radiology & Imaging Specialists uses a mixed environment where a couple of medical editors can clean up any voice recognition errors that might occur. Once a radiologist is finished with his or her interpretation, the text is available for review and, depending on the report, the radiologist may do a quick edit or send the report to a medical editor.

But because the clinical documentation architecture is not yet fully developed, Marichal says he is looking towards the future to be able to run analytics on the back end to see what percentage of radiologists document contact with the referring physician to close the loop between the two and show compliance based on diagnoses.

While next-generation trailblazers like Radiology & Imaging Specialists are starting to use NLP, today it is mainly a tool for research. However, the field is rapidly expanding and developing. In April 2009, biomedical informatics researchers at the Mayo Clinic in Rochester, Minn., formed a partnership with IBM to create the Open Health NLP Consortium, which established an open-source platform to support NLP development efforts by researchers and developers.

Mayo contributed its clinical Text Analysis and Knowledge Extraction System (cTAKES), which focuses on processing patient-centric clinical notes, while IBM added medKAT (medical Knowledge Analysis Tool), an Unstructured Information Management Architecture (UIMA)-based system that uses NLP to extract structured information from unstructured data, such as pathology reports, clinical notes, discharge summaries and medical literature.
Since then, about five major National Institutes of Health (NIH) grants have been awarded to organizations such as the University of Pittsburgh and the Mayo Clinic, which have contributed back into the consortium “sophisticated tools and updated annotators that have added some decision logic to sort text,” says Christopher G. Chute, MD, DrPH and senior consultant on the NLP project at Mayo.

Coreference resolution, the process of determining whether two words (like pronouns) refer to the same entity, and temporal resolution, or the sequence of events, are currently being developed in the consortium, according to Chute. Down the line, “[t]hese applications could be adopted, enhanced and specialized to practices such as radiology.”

So where will NLP help radiology reporting? For starters, in classifying tumors and automated disease status classification. “NLP demonstrated good accuracy for tumor status classification and may have novel application for automated disease status classification from electronic databases,” according to an article in the April issue of The Journal of Digital Imaging, by Bradley J. Erickson, MD, PhD, and colleagues. The team found that NLP achieved 80.6 percent sensitivity and 91.6 percent specificity for tumor status determination (mean positive predictive value, 82.4 percent; negative predictive value, 92 percent). The authors concluded that most reports in their study contained sufficient information for tumor status determination, though the features used to describe status varied.
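
For readers less familiar with these four measures, the short Python sketch below shows how each is computed from a confusion matrix. The counts are hypothetical round numbers chosen for easy arithmetic; they are not the study's data, and the function name is an illustrative assumption.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Compute standard accuracy measures from confusion-matrix counts.

    tp/fn: positive cases the classifier caught/missed.
    tn/fp: negative cases the classifier cleared/flagged in error.
    """
    return {
        "sensitivity": tp / (tp + fn),  # share of true positives detected
        "specificity": tn / (tn + fp),  # share of true negatives cleared
        "ppv": tp / (tp + fp),          # how often a positive call is right
        "npv": tn / (tn + fn),          # how often a negative call is right
    }


# Hypothetical counts for illustration only (not from the cited study).
metrics = diagnostic_metrics(tp=80, fp=10, tn=90, fn=20)
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

Sensitivity and specificity describe the classifier itself, while the predictive values also depend on how common the condition is in the set of reports being examined.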

“In order to do research in brain tumors that might help diagnosis in the future, one needs to perform comprehensive and unbiased searches of image databases to find the cases,” Erickson notes in an interview on the study. “This study shows that NLP can help find candidate cases.”
Erickson adds that NLP can help to report cases in a clearer fashion. “If NLP was performed on the report after it was created but before it was signed off, and if the NLP produced a conclusion that was different than what the radiologist meant, it could help to highlight reports that need revision,” Erickson says.

Mining key data

Dreyer and colleagues at Mass General created an internal text classification software program in 2003 called LEXIMER (Lexicon Mediated Entropy Reduction), a data mining and analysis tool that structures reports through algorithms that extract and classify in real-time as well as batch processing. This means LEXIMER can be run automatically on millions of reports to mine previous examinations. “Instead of manually reviewing a million reports a year to look for terms such as ‘pulmonary embolism,’ we can do an automatic search in the report for the different ways one can describe that entity or even look for the findings that are suggestive of pulmonary embolism,” says Dreyer.

While electronically searching, LEXIMER simultaneously maps terms of interest in the report to terms found in RadLex, a source of unified radiology terms across procedures, pathology, anatomy and radiologic findings created by the Radiological Society of North America.
LEXIMER was mapped to RadLex to give MGH a structured ontology (the terms themselves, their synonyms and the relationships among them) that caregivers could agree on and use, according to Dreyer. “Clinicians at MGH can create a report and map the terms used to RadLex to go back and mine those reports consistently and accurately,” says Dreyer. This is especially helpful for a radiologist who wants to find information on previous exams on a condition a patient has.
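
The general idea of mapping varied report wording onto one agreed term, then mining reports by that term, can be sketched in a few lines of Python. This is not LEXIMER's algorithm and it does not use real RadLex content: the lexicon entries, report texts and function names are all hypothetical.

```python
import re

# Hypothetical lexicon: surface phrases mapped to one preferred term each.
# A real deployment would draw preferred terms from a vocabulary such as RadLex.
LEXICON = {
    "pulmonary embolism": "pulmonary embolism",
    "pulmonary embolus": "pulmonary embolism",
    "pulmonary emboli": "pulmonary embolism",
    "pe": "pulmonary embolism",
    "pneumothorax": "pneumothorax",
    "collapsed lung": "pneumothorax",
}

# One case-insensitive pattern over all phrases, longest phrases first,
# with word boundaries so "pe" does not match inside words like "type".
_PATTERN = re.compile(
    r"\b("
    + "|".join(re.escape(p) for p in sorted(LEXICON, key=len, reverse=True))
    + r")\b",
    re.IGNORECASE,
)


def map_terms(report_text):
    """Return the set of preferred terms mentioned in a report."""
    return {LEXICON[m.group(1).lower()] for m in _PATTERN.finditer(report_text)}


def batch_search(reports, preferred_term):
    """Return IDs of reports that mention any synonym of the preferred term."""
    return [rid for rid, text in reports.items()
            if preferred_term in map_terms(text)]


# Hypothetical reports for illustration.
sample_reports = {
    "a": "CT shows acute pulmonary emboli bilaterally.",
    "b": "Small right pneumothorax.",
    "c": "Findings suggest PE.",
}
print(batch_search(sample_reports, "pulmonary embolism"))
```

Because the search runs on the preferred term rather than on any one spelling, reports "a" and "c" are both returned even though neither contains the literal phrase "pulmonary embolism".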
 
One future application of LEXIMER currently under development at MGH is the ability to inform physicians who are in the process of ordering imaging examinations about what potential findings the proposed examination might yield. To do so, LEXIMER analyzes the current clinical indications, patient demographics and prior examination results and compares them with similar patients contained within its database. The result alerts the physician to potential findings that similar patients under similar conditions have demonstrated.
Another tool in use at MGH is QPID (Queriable Patient Inference Dossier), an internally built semantic search engine that can retrieve data based on clinical concepts using a set of NLP features to leverage the structure and etymology of medical language, according to Mike Zalis, director of QPID Informatics, department of radiology at MGH.

QPID aggregates EMR data across Partners Healthcare in Boston. It then prepares data for rapid search by applying complex indexing technologies and integrates a rich, but easily accessible suite of search tools into the system whose output is accessible via a web browser or machine-to-machine interface. QPID provides “an ability to reflect clinicians’ clinical insights into their search experience by employing, among other tools, search logic based on the proximity and association of complex saved word combinations,” Zalis says.
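
Proximity-based search logic of the kind Zalis describes can be approximated, in its simplest form, by asking whether two words fall within a few tokens of each other. The Python sketch below is an illustration under that assumption only; it is not QPID's implementation, and the function name, the `max_gap` parameter and the sample sentence are hypothetical.

```python
import re


def within_proximity(text, term_a, term_b, max_gap=5):
    """Return True if two single-word terms occur within max_gap tokens.

    Tokens are lowercase runs of letters and digits; punctuation is ignored.
    """
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    positions_a = [i for i, tok in enumerate(tokens) if tok == term_a.lower()]
    positions_b = [i for i, tok in enumerate(tokens) if tok == term_b.lower()]
    return any(abs(i - j) <= max_gap
               for i in positions_a for j in positions_b)


# Hypothetical sentence for illustration.
note = "Patient has a history of chronic kidney disease and diabetes."
print(within_proximity(note, "kidney", "disease", max_gap=1))
print(within_proximity(note, "history", "diabetes", max_gap=3))
```

A small gap keeps the match tight, so "kidney" next to "disease" qualifies while "history" and "diabetes", six tokens apart, do not at a gap of three.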

Since being developed in 2005, QPID’s clinical utility has increased thanks to negation logic that excludes unwanted detections that would clutter search results.
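
A minimal sketch of negation logic, assuming a simple trigger-phrase approach: a mention is treated as negated when a negation cue appears in the few words before it within the same sentence. This is not QPID's logic; the trigger list, window size, function names and sample reports are hypothetical, and real systems handle many more cues and sentence structures.

```python
import re

# Hypothetical, deliberately short list of negation cues.
NEGATION_TRIGGERS = ("no", "without", "negative for", "no evidence of")


def is_negated(sentence, term, window=5):
    """True if a negation cue appears within `window` words before the term."""
    lowered = sentence.lower()
    idx = lowered.find(term.lower())
    if idx == -1:
        return False
    preceding = re.findall(r"[a-z]+", lowered[:idx])[-window:]
    context = " ".join(preceding)
    return any(re.search(r"\b" + re.escape(trigger) + r"\b", context)
               for trigger in NEGATION_TRIGGERS)


def affirmed_hits(reports, term):
    """IDs of reports with at least one non-negated mention of the term."""
    hits = []
    for rid, text in reports.items():
        for sentence in re.split(r"(?<=[.!?])\s+", text):
            if term.lower() in sentence.lower() and not is_negated(sentence, term):
                hits.append(rid)
                break
    return hits


# Hypothetical reports for illustration.
example_reports = {
    "a": "No evidence of pulmonary embolism.",
    "b": "Acute pulmonary embolism in the right lower lobe.",
    "c": "No pneumothorax. Findings consistent with pulmonary embolism.",
}
print(affirmed_hits(example_reports, "pulmonary embolism"))
```

Checking sentence by sentence matters here: report "c" opens with a negated finding, but the negation belongs to a different sentence and does not suppress the embolism mention.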

So, how practical is it?

QPID recently helped MGH triple its screening rate for palliative care services, meeting important quality targets and pay-for-performance metrics, Zalis says. In addition, QPID is now being used to perform pre-procedural screening for conscious sedation procedures in gastroenterology, where after a pilot trial of 5,000 patients, the system was able to correctly identify the approximately 15 percent of scheduled outpatients with potential conscious sedation risk factors.
 
While dictation software may not be able to create fully structured reports yet, the toolbox for creating that structure for radiology reports is ever-expanding. Through advancements in negation logic, temporal resolution, coreference resolution and more, radiologists can soon expect more options for their back-end architecture to mine previous imaging exams and gain a better understanding of the patient’s condition at hand.