Artificial intelligence will play a pivotal role in the future of health care, medical experts say, but so far, the industry has been unable to fully leverage this tool. A Yale study has illuminated the limitations of machine learning methods when applied to traditional medical databases, suggesting that the key to unlocking their value may lie in how datasets are prepared.
Machine learning techniques are well suited to processing complex, high-dimensional data and identifying nonlinear patterns, giving researchers and clinicians a framework for generating new insights. But the study suggests that achieving the potential of artificial intelligence will require improving the data quality of electronic health records (EHR).
“Our study found that advanced methods that have revolutionized predictions outside healthcare did not meaningfully improve prediction of mortality in a large national registry. These registries that rely on manually abstracted data within a restricted number of fields may, therefore, not be capturing many patient features that have implications for their outcomes,” said Rohan Khera, MD, MS, the first author of the new study published in JAMA Cardiology. “We believe that the next frontier for improving clinical prediction may be the application of these methods to the high-dimensional granular data collected in the EHR.”
The authors used the American College of Cardiology’s (ACC) Chest Pain-MI Registry from 2011 to 2016, which includes nearly 1 million patients hospitalized for an acute myocardial infarction (AMI), or heart attack, across more than 1,000 U.S. hospitals. The researchers applied three different machine learning models to these nationwide data to predict death after hospitalization and observed only marginal gains over models using traditional logistic regression.
Clinical registries such as the Chest Pain-MI Registry have been the mainstay for assessing patient outcomes across many hospitals through standardized data collection. These registries can advance clinical understanding and knowledge but are less suited to complex data collection and abstraction. Gaining additional insights will require rethinking how to aggregate the novel digital data streams now being generated at most U.S. hospitals, the researchers said.
The study also underscores that while some methods are more efficient or transparent, the clinical value of machine learning will be determined by data collection and processing.
“The clinical adoption of machine learning will depend on whether it delivers better information – and that may importantly depend on the data that are used,” said Harlan Krumholz, MD, SM, director of the Center for Outcomes Research and Evaluation (CORE) at Yale and senior author of the study.
The research team included clinicians and scientists from several Yale departments, including Robert McNamara, MD, MHS, Nihar Desai, MD, MPH, Chenxi Huang, PhD, and Bobak Jack Mortazavi, PhD.
The research was supported by the American College of Cardiology Foundation.