Veterans Affairs Drives Data Mining

The numbers are staggering. As of early this year, the Department of Veterans Affairs’ Veterans Health Administration (VHA) had in its arsenal some 30 million veteran records, with accessible data including 3.2 billion clinical orders, 1.8 billion prescriptions and 2 billion clinical text notes (with a growth rate of 100,000 per day).

Mining this vast data treasure trove in meaningful ways is cutting-edge science. With its long history of data collection and analytics, VHA leads the charge in harnessing data to revolutionize care delivery. And, as hospitals and health systems across the country struggle to build knowledge from vast data stores, VHA has lessons to share—and more to come.

In a yet-to-be-scheduled 24-month pilot, VHA will deploy advanced algorithms to sift through its Veterans Health Information Systems and Technology Architecture (VistA) EHR system. The goal is to use clinical reasoning and prediction systems, including advanced natural language processing (NLP) techniques and machine learning, to aid diagnoses, identify adverse drug interactions and evaluate treatment decisions.

Unlocking data potential

The pilot is part of the VHA’s “evolutionary” process to interrogate very large datasets to improve patient care, efficiency and outcomes, says Stephan D. Fihn, MD, MPH, director of the VA Office of Analytics and Business Intelligence.

Only recently have computing power and software been sophisticated enough to work efficiently with big datasets. “Ten years ago, if you had a dataset with 10,000 or 50,000 people in it, you’d run an analysis, but it may have run on your machine overnight or for a day or two. Now we can process millions of records in minutes,” Fihn says.

VistA is a source of data for many VHA databases, including the Corporate Data Warehouse (CDW), and is used on a daily basis to identify potential shortfalls or issues with productivity, and isolate patients at risk of certain medical events, says Fihn.

CDW, a national repository of structured data, includes a billion rows of data. Fihn’s office has developed a method that uses CDW data to identify patients at the highest risk of hospital admission within a selected timeframe. For example, it could help predict risks of heart disease, post-traumatic stress disorder and falling, among other conditions. This enables providers to identify patients needing more intensive care coordination.

The method has since been published and “we run that now on every VA primary care patient every week,” Fihn says. About 1,000 users a month access these data, and interest is growing.
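
As a rough illustration of what a weekly risk run like this might look like, the Python sketch below scores each patient with a simple logistic model and returns the highest-risk patients for follow-up. The field names, coefficients and `highest_risk` helper are hypothetical and chosen only for illustration; they are not the VA's published method.

```python
import math
from dataclasses import dataclass

# Hypothetical coefficients for illustration only; not the VA's published model.
WEIGHTS = {"age_over_65": 0.8, "prior_admissions": 0.6, "chronic_conditions": 0.3}
INTERCEPT = -3.0


@dataclass
class Patient:
    patient_id: str
    age_over_65: int         # 1 if 65 or older, else 0
    prior_admissions: int    # admissions in the past year
    chronic_conditions: int  # count of chronic diagnoses


def admission_risk(p: Patient) -> float:
    """Logistic score between 0 and 1; higher means higher estimated risk."""
    z = INTERCEPT + sum(w * getattr(p, name) for name, w in WEIGHTS.items())
    return 1.0 / (1.0 + math.exp(-z))


def highest_risk(patients, top_n):
    """Return the top_n patients sorted by descending estimated risk."""
    return sorted(patients, key=admission_risk, reverse=True)[:top_n]


# Made-up patients, not real records.
patients = [
    Patient("A", 0, 0, 1),
    Patient("B", 1, 3, 4),
    Patient("C", 1, 1, 2),
]
flagged = highest_risk(patients, top_n=2)
print([p.patient_id for p in flagged])
```

In practice the coefficients would be fitted to historical data and validated before anyone acted on the ranking; the sketch only shows the scoring and ranking step.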

Additionally, his office implemented a program in May directed at nurse care managers and patient aligned care teams (i.e., patient-centered medical homes) in VA provider networks to use data to proactively coordinate care for the most complicated and chronically ill patients.

“So far, the feedback we’ve gotten has been tremendously positive,” Fihn says of the project, noting that 400 people logged in for a live broadcast on the project in April.

Data-informed decisions at the point of care

Within medicine, many providers are working to take unstructured data (such as clinical notes, radiology reports and patient reports) and merge them with more traditional structured data, Fihn says. The pilot does just this, as it sets out to explore how automated clinical reasoning, NLP and predictive analytics can identify risks and improve patient care.

The pilot piggybacks on efforts that began about four years ago to test NLP using the research service VA Informatics and Computing Infrastructure (VINCI).

The pilot’s goal is to use these types of information to answer questions and, ultimately, to enable decision-making, Fihn says. He compares it to IBM’s Watson technology that competed against people on the game show “Jeopardy!” “To be able to find an accurate answer is hugely complicated; it requires a full understanding of both the query and the data.”

Some types of questions the system would answer include: Who is likely to get sick? What can we do ahead of time? “That’s a very generic and important set of questions, but it’s critical to improving the quality and efficiency of our healthcare system,” Fihn says.

He also says clinicians could query the system when there are three different known therapies for a particular circumstance. “We could take a query of the last 1,000 similar patients, how we managed them and what the outcomes were [to determine the best course of treatment],” he says.
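
A minimal Python sketch of the kind of query Fihn describes might look like the following: take the most recent patients with a given condition, group them by therapy, and report how many received each therapy and what share had a good outcome. The record fields (`condition`, `therapy`, `treated_on`, `good_outcome`) and the `summarize_outcomes` function are assumptions made for illustration, and the records are made up.

```python
from collections import defaultdict


def summarize_outcomes(records, condition, limit=1000):
    """Group the most recent matching patients by therapy and report
    how many were treated and what share had a good outcome."""
    matching = [r for r in records if r["condition"] == condition]
    # ISO dates sort correctly as strings; most recent first.
    matching.sort(key=lambda r: r["treated_on"], reverse=True)
    recent = matching[:limit]

    counts = defaultdict(lambda: {"patients": 0, "good": 0})
    for r in recent:
        entry = counts[r["therapy"]]
        entry["patients"] += 1
        entry["good"] += 1 if r["good_outcome"] else 0

    return {
        therapy: {"patients": c["patients"], "good_rate": c["good"] / c["patients"]}
        for therapy, c in counts.items()
    }


# Made-up records, not real patient data.
records = [
    {"condition": "X", "therapy": "therapy_1", "treated_on": "2013-01-05", "good_outcome": True},
    {"condition": "X", "therapy": "therapy_1", "treated_on": "2013-02-11", "good_outcome": False},
    {"condition": "X", "therapy": "therapy_2", "treated_on": "2013-03-02", "good_outcome": True},
    {"condition": "X", "therapy": "therapy_2", "treated_on": "2013-03-20", "good_outcome": True},
    {"condition": "Y", "therapy": "therapy_1", "treated_on": "2013-04-01", "good_outcome": True},
]
summary = summarize_outcomes(records, condition="X")
print(summary)
```

A raw comparison like this does not account for why patients received one therapy rather than another, so it would be a starting point for a clinician's judgment rather than an answer in itself.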

Queries also could be used to proactively identify patients at risk of hospital admission and provide them with home telehealth, home-based primary care and other care management services. By doing this, “that patient avoids getting sick and going to the hospital,” he says.

The model also could answer questions about drugs in common use that are no longer subject to clinical trials. He cites the case of Vioxx’s link to heart attacks, which was discovered by observing veteran data. “Is drug A more dangerous or effective than drug B? We could look at that based on the data of hundreds of thousands of patients,” he says.
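
One simple way to frame the "drug A versus drug B" question is a relative risk: the adverse event rate among patients on drug A divided by the rate among patients on drug B. The Python sketch below computes it with an approximate 95 percent confidence interval using the standard log method. The counts are made up for illustration, and the function names are hypothetical.

```python
import math


def relative_risk(events_a, patients_a, events_b, patients_b):
    """Adverse event rate on drug A divided by the rate on drug B."""
    return (events_a / patients_a) / (events_b / patients_b)


def relative_risk_ci(events_a, patients_a, events_b, patients_b, z=1.96):
    """Approximate confidence interval for the relative risk (log method)."""
    rr = relative_risk(events_a, patients_a, events_b, patients_b)
    se = math.sqrt(
        1 / events_a - 1 / patients_a + 1 / events_b - 1 / patients_b
    )
    return rr * math.exp(-z * se), rr * math.exp(z * se)


# Made-up counts: adverse events among 100,000 patients on each drug.
rr = relative_risk(300, 100_000, 200, 100_000)
low, high = relative_risk_ci(300, 100_000, 200, 100_000)
print(f"relative risk {rr:.2f}, 95% CI {low:.2f} to {high:.2f}")
```

With observational data, a difference in rates can reflect differences between the patients prescribed each drug, so a real analysis would also adjust for factors such as age and existing conditions.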

Physicians as information managers

With the vast amount of data at their disposal, physicians are increasingly becoming information managers.

Decisions on what therapies to pursue for which patients broaden the scope of the work. “One hundred years ago, we probably had four or five effective drugs, if that. Now we have hundreds. We need a system to figure out which ones work best on whom and how to use them. That’s the next phase as technology develops. It’s evolutionary,” Fihn says.

He cautions that efforts to evaluate models must be conducted slowly, especially in light of privacy and security concerns. “We can overwhelm ourselves with data; we can draw the wrong conclusions from data if we don’t do this correctly. That is part of the reason we’ve gone about this slowly, evaluating things as we go.”

MAVERIC Project Puts Veteran DNA Data into Researchers’ Hands

Leonard D’Avolio, center director of the Massachusetts Veterans Epidemiology Research and Information Center (MAVERIC), is on a quest to move the dial on healthcare delivery through a learning healthcare system.

At the Medical Informatics World Conference in Boston in April, he discussed MAVERIC’s work to create a learning healthcare system within the Department of Veterans Affairs (VA) through application of research resources and methodologies to important clinical problems.

The goal: build vast medical databases of anonymously stored data for research on diseases like diabetes and cancer, and on military-related illnesses such as post-traumatic stress disorder. MAVERIC aims to obtain blood samples and health information from one million veteran volunteers. After the blood is collected, the DNA data are surveyed, linked to EMR data and made available to researchers, D’Avolio explains.

MAVERIC’s work competencies include epidemiology, core laboratory, clinical trials and biomedical informatics. The Core Laboratory stores more than 500,000 samples for VA researchers. The Epidemiology Research and Information Center promotes VA-based population research, including genomic analyses, for VA providers’ use to improve quality of care.

“The challenge is to facilitate personalized medicine. It does not happen without deep knowledge of biology and large sample sizes,” he says.

MAVERIC also hosts one of five VA Cooperative Studies Program Coordinating Centers, which help manage clinical trial and epidemiological study data through cutting-edge electronic systems.

“Pharma is not turning out enough drugs to stay sustainable,” he explains, as many drugs in expensive clinical trials fail. Within an EHR, “you turn every situation where you don’t know what works best into a clinical trial.”

D’Avolio says veterans have been receptive to being randomized to a treatment. “We tell veterans that the machine will choose your care, and ask, ‘Are you cool with that?’ Ninety percent say yes,” he says.

 
