Many U.S. hospitals using predictive models are not evaluating their tools internally for accuracy, and fewer still are evaluating them for potential biases, according to a study published in the most recent edition of Health Affairs.
The “concerning” analysis noted that hospitals reporting local evaluation of their predictive models and AI were more often those that developed their tools in-house, rather than using an algorithm provided through their electronic health record vendor’s platform. Reports of local testing were also more frequent among higher-margin hospitals and those in health systems.
“We also found that critical access hospitals, other rural hospitals and organizations serving high–Social Deprivation Index areas were less likely to use models,” the researchers wrote in the journal. “This indicates that hospitals serving marginalized patient populations might not be able to access AI benefits at the same rate as hospitals serving more advantaged populations.”
Among a sample of over 2,400 hospitals surveyed in 2023, 65% reported using AI or predictive models. The tools were most often used to predict inpatients’ health trajectories (92%) or spot high-risk outpatients (79%), though scheduling (51%), treatment recommendations (44%), billing support (36%) and health monitoring (34%) were also reported by the hospitals.
Among the hospitals that reported using AI or a predictive model, 61% evaluated those models for accuracy using data from their own organization, and 44% reported doing so for bias.
Hospitals using the models to predict inpatient risk were more likely to report local evaluation than those using the technology for outpatient follow-up or billing automation, “which may reflect a misperception that administrative applications of AI are lower risk than clinical tools,” the researchers wrote.
Nearly four in five responding hospitals said they used models from their EHR developer, while 59% reported using models from third parties and 54% reported self-developed models. Twenty-six percent said they used models from all three sources, and 24% said they exclusively used those from an EHR developer.
The spread is noteworthy, as the researchers found local evaluation to be more common among hospitals that developed their own predictive models, a trend “likely explained by the similarities in technical expertise required to develop models and locally evaluate them.”
Hospitals that rely on models made by EHR developers and evaluate them locally less often could be more susceptible to using inaccurate or biased models that harm patients, the researchers wrote. Given these and other local evaluation trends across hospital types, policies requiring developers to provide more information about their models, or more targeted interventions to bolster hospitals’ capacity for evaluation, will be necessary to prevent a “rich-get-richer” effect, the researchers warned.
“By focusing on the differences in evaluation capacity among hospitals, this research highlights the risks of a growing digital divide between hospital types, which threatens equitable treatment and patient safety,” Paige Nong, assistant professor at the University of Minnesota’s School of Public Health and the study’s lead author, said in a release.
“Many better-funded hospitals and academic medical centers can design their own models tailored to their own patients, then conduct in-house evaluations of them. In contrast, critical-access hospitals, rural hospitals and other hospitals with fewer resources are buying these products ‘off the shelf,’ which can mean they’re designed for a patient population that may look very different from their actual patients and may not reflect the needs of local patient populations,” she said.