Machine learning has numerous potential applications for health insurers, including in predicting disease onset, future hospitalizations and medication adherence.
But when bias creeps into the development or use of machine learning models, technologies intended to improve health outcomes can instead create barriers for certain patients.
In an analysis published in Health Affairs, researchers from Independence Blue Cross, the Massachusetts Institute of Technology and the University of California, Berkeley identified several areas where bias can arise in machine learning tools used by health insurers and outlined recommendations for tackling those issues.
Predictive modeling is among the best-known sources of bias in healthcare algorithms. Payers don’t rely solely on risk-based predictive modeling to identify potentially at-risk members, but typically use a combination of predictive modeling, risk scores from commercial vendors, and “if-then” business rules to allocate healthcare resources to those members, the authors wrote.
Disease onset predictive models are less likely to be developed for diseases that impact a smaller segment of the population, or for diseases that don’t have easily scalable interventions, which may leave marginalized communities unaccounted for.
Because these models rely on utilization data to generate predictions, patients who seek care less frequently will contribute less data to the models.
Any data used as the foundation of the model could also be skewed if the provider collecting those data exhibited explicit or implicit biases.
Implicit bias in patient-provider interactions has been heavily studied in recent years. Just last month, a study conducted by the University of Chicago found that Black patients were two-and-a-half times as likely as white patients to be given negative descriptors like “aggressive” or “non-compliant” in their electronic health records.
Similar issues arise with machine learning models that predict the likelihood of avoidable hospitalizations, as barriers to access and use can influence the model’s target population and reinforce existing inequities.
The researchers suggest integrating data that account for social determinants of health, including socioeconomic status, education level, housing and access to transportation, food, and healthcare, into the disease onset and likelihood of hospitalization models to reduce reliance on utilization patterns.
Predictive models for medication adherence run into disparities in diagnosis: racial and ethnic minorities are less likely to be prescribed medications like antidepressants, anticoagulants, diabetes medication and opioids, even when the evidence would call for the prescription of those medications. Differences in access to pharmacies and prescription drugs can also distort the data.
Indirectly predicting medication adherence through contextual information could allow data scientists to find members who might benefit from lower-cost medication alternatives, the researchers suggest, making those members more likely to take their medications.
The analysis suggests health insurers audit their predictive models and business processes to identify potential sources of bias using a variety of strategies.
To ensure each racial and ethnic group is accurately represented, insurers should check that rates of outreach and engagement in care management programs reflect each group’s share of the population at large. Insurers should also examine their rates of false negatives and false positives for predictions and conduct analyses to determine whether those rates differ significantly on the basis of race, ethnicity or gender.
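As a rough illustration of the audit described above, the sketch below compares each group's share of outreach with its share of the population and computes false negative and false positive rates per group. It is a minimal example and not the researchers' method: the function names, the record layout and the group labels "A" and "B" are all hypothetical, and it does not test whether the differences are statistically significant.

```python
from collections import defaultdict


def outreach_share_gap(outreach_counts, population_counts):
    """Difference between each group's share of outreach and its share
    of the population. Positive means over-represented in outreach.
    Both arguments are hypothetical dicts of group -> head count."""
    total_outreach = sum(outreach_counts.values())
    total_population = sum(population_counts.values())
    return {
        group: outreach_counts.get(group, 0) / total_outreach
        - population_counts[group] / total_population
        for group in population_counts
    }


def error_rates_by_group(records):
    """records: iterable of (group, actual, predicted) tuples, where
    actual and predicted are booleans (assumed layout)."""
    counts = defaultdict(lambda: {"fn": 0, "fp": 0, "pos": 0, "neg": 0})
    for group, actual, predicted in records:
        c = counts[group]
        if actual:
            c["pos"] += 1
            if not predicted:
                c["fn"] += 1  # model missed a member who needed care
        else:
            c["neg"] += 1
            if predicted:
                c["fp"] += 1  # model flagged a member who did not
    return {
        group: {
            "false_negative_rate": c["fn"] / c["pos"] if c["pos"] else None,
            "false_positive_rate": c["fp"] / c["neg"] if c["neg"] else None,
        }
        for group, c in counts.items()
    }


# Made-up data, for illustration only.
records = [
    ("A", True, True), ("A", True, False),
    ("A", False, False), ("A", False, True),
    ("B", True, False), ("B", True, False), ("B", False, False),
]
print(error_rates_by_group(records))
print(outreach_share_gap({"A": 30, "B": 10}, {"A": 50, "B": 50}))
```

A large gap between groups in either rate would be a signal to investigate further, for instance with a formal significance test on a much larger sample.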
Counterfactual reasoning, the researchers posit, can also be used to consider whether a person with the same health profile would have received a different prediction had they been a member of a different subpopulation.
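One simple way to approximate that kind of counterfactual check is to hold a member's health profile fixed, swap only the group attribute, and see whether the prediction changes. The sketch below does this against a deliberately biased toy model. Everything in it is an assumption for illustration, including the field names "group" and "visits_last_year", the toy rule and the helper names. A swap like this also cannot detect bias that enters through proxy variables correlated with group membership.

```python
def toy_model(member):
    """Hypothetical, deliberately biased rule: flags group "A" members
    for outreach at a lower utilization threshold than everyone else."""
    if member["group"] == "A":
        return member["visits_last_year"] >= 1
    return member["visits_last_year"] >= 3


def counterfactual_flips(predict, member, group_key, groups):
    """Return the baseline prediction and a dict of the groups for which
    the prediction changes when only the group attribute is swapped."""
    baseline = predict(member)
    flips = {}
    for group in groups:
        if group == member[group_key]:
            continue
        altered = {**member, group_key: group}  # same health profile
        prediction = predict(altered)
        if prediction != baseline:
            flips[group] = prediction
    return baseline, flips


member = {"group": "B", "visits_last_year": 2}
baseline, flips = counterfactual_flips(toy_model, member, "group", ["A", "B"])
print(baseline, flips)
```

If the returned dictionary is non-empty, the model's output depends directly on group membership for that member, which would warrant a closer look.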
Addressing biases in machine learning models can determine whether members receive high-quality care. The analysis notes the importance of payers collecting data on social determinants of health and using those data ethically, as well as working across the industry to reduce barriers to care, so that these technologies tackle health disparities rather than exacerbate them.