Natural language processing (NLP) systems that stratify semantic content could support transparent, real-time feedback (such as a missing-content alert), clinical decision support, quality assurance and data mining, according to a study presented Nov. 30 at the 96th annual conference of the Radiological Society of North America (RSNA) in Chicago.
“Voice recognition creates opportunities for real-time structured reporting and live feedback, but structured reporting can distract radiologists and be cumbersome compared with conventional unconstrained dictation,” reported the study’s lead author Bao H. Do, MD, from Stanford Hospital & Clinics in Stanford, Calif. “Although very useful, NLP systems can be considered cumbersome to produce, especially with the more uncommon interfaces or anything that would require a radiologist to describe the visualization of images.”
He and his colleagues at Stanford sought to develop and validate an NLP system that identifies semantic content in knee MRI statements from unstructured text and uses those data to automatically generate full, structured knee MRI reports. They built the system on the Apache/PHP/MySQL platform.
The NLP processes whole knee MRI reports. Using a lexicon of signals, or regular expressions, that specify anatomy, findings or disease terms, the NLP assigns each sentence to one of eight categories of a standardized knee MRI template: joint/effusion/synovitis/loose bodies; menisci; cruciate ligaments; collateral ligaments; extensor mechanism; cartilage; bone and marrow; and miscellaneous (muscle, tendon, Baker's cyst, etc.).
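The signal-based assignment described above can be sketched in a few lines. The study's system was built on PHP and its 59 signals are not listed in this report, so the Python code below and every pattern in it are illustrative assumptions, not the authors' lexicon. Only the eight category names come from the study.

```python
import re

# Hypothetical signals: the study's actual 59 signals are not given in this report.
SIGNALS = {
    "joint/effusion/synovitis/loose bodies": [r"\beffusion\b", r"\bsynovitis\b", r"\bloose bod(y|ies)\b"],
    "menisci": [r"\bmenisc(us|i|al)\b"],
    "cruciate ligaments": [r"\b[ap]cl\b", r"\bcruciate\b"],
    "collateral ligaments": [r"\b[ml]cl\b", r"\bcollateral\b"],
    "extensor mechanism": [r"\bpatellar tendon\b", r"\bquadriceps\b", r"\bextensor\b"],
    "cartilage": [r"\bcartilage\b", r"\bchondr\w+"],
    "bone and marrow": [r"\bmarrow\b", r"\bfracture\b", r"\bcontusion\b"],
}
MISCELLANEOUS = "miscellaneous"

# Compile once, case-insensitively, so "ACL" and "acl" both match.
COMPILED = {
    category: [re.compile(pattern, re.IGNORECASE) for pattern in patterns]
    for category, patterns in SIGNALS.items()
}

def classify(sentence):
    """Return every category with a matching signal, or miscellaneous if none match."""
    hits = [
        category
        for category, patterns in COMPILED.items()
        if any(p.search(sentence) for p in patterns)
    ]
    return hits or [MISCELLANEOUS]

if __name__ == "__main__":
    for s in [
        "There is a small joint effusion.",
        "The ACL and PCL are intact.",
        "Baker's cyst is noted.",
    ]:
        print(classify(s), "<-", s)
```

Returning a list rather than a single label is a design choice in this sketch: it keeps sentences that mention two concepts visible instead of silently picking one.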
The researchers reviewed approximately 2,000 sentences from 125 knee MRI reports produced at their institution between 2005 and 2009 to generate 59 signals that two musculoskeletal subspecialists judged to be specific for the eight semantic categories. For validation, the researchers randomly selected 25 knee MRI reports from the same period. Reports were pre-processed and converted to a single paragraph of sentences by removing all section headers. Sentences containing two semantic concepts were assigned to at least one of the two categories.
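As a rough illustration of the pre-processing step, the sketch below strips section headers and splits the remaining text into sentences. The header format (a capitalized label ending in a colon, such as "FINDINGS:") and the punctuation-based sentence split are assumptions; the report does not describe how the researchers implemented either step.

```python
import re

# Assumed header style: an all-caps label followed by a colon at the start of a line.
HEADER = re.compile(r"^\s*[A-Z][A-Z /]*:\s*", re.MULTILINE)

def to_sentences(report):
    """Remove section headers, flatten to one paragraph, and split into sentences."""
    body = HEADER.sub("", report)
    paragraph = " ".join(body.split())  # collapse line breaks and repeated spaces
    # Naive split on sentence-ending punctuation followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", paragraph) if s.strip()]

if __name__ == "__main__":
    sample = "FINDINGS:\nSmall joint effusion. The ACL is intact.\nIMPRESSION: No meniscal tear."
    print(to_sentences(sample))
```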
Do reported that the NLP classified 381 sentences into the eight categories. Ten sentences across nine reports were miscategorized, for an accuracy of 97 percent per sentence and 64 percent per report. The most common sources of classification error were missing lexicon entries and signal non-specificity.
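The two accuracy figures follow directly from the reported counts: 381 sentences with 10 errors, and 25 validation reports of which nine contained at least one error. The short check below reproduces them; the function name is illustrative only.

```python
def accuracy(total, errors):
    """Fraction of items with no classification error."""
    return (total - errors) / total

# Counts as reported in the study.
per_sentence = accuracy(total=381, errors=10)  # 371 of 381 sentences correct
per_report = accuracy(total=25, errors=9)      # 16 of 25 reports fully correct

if __name__ == "__main__":
    print(f"per sentence: {per_sentence:.1%}")
    print(f"per report:   {per_report:.1%}")
```

The gap between the two figures reflects how strict the per-report measure is: a single miscategorized sentence counts the whole report as incorrect.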
“We have developed a simple rules-based NLP to extract semantic concepts from knee MRI statements to automatically create structured reports,” Do said.
Based on their findings, the researchers concluded that the extensible infrastructure has potential for integration with future RadLex-based best-practice templates.
Do acknowledged that knee MRI is ideal for this type of system with “its limited set of anatomic structures and its limited set of pathology,” and that using the NLP for more complex imaging exams, such as abdominal CT, needs further testing.