Kaiser Permanente develops machine learning tool to predict HIV risk

Researchers at Kaiser Permanente developed a machine learning algorithm officials say could ultimately help prevent HIV transmission.

The analytical tool identifies people at risk of contracting HIV more effectively than other HIV risk prediction tools, which could allow more at-risk patients to be referred for preventive medication, according to a study describing the tool published July 5 in The Lancet HIV.

Investigators at Kaiser Permanente San Francisco, the Kaiser Permanente Division of Research, Beth Israel Deaconess Medical Center, and Harvard Medical School analyzed medical records of 3.7 million Kaiser Permanente patients and developed a machine learning algorithm to predict who would become infected with HIV during a three-year period.

The algorithm flagged 2.2% of all patients as high or very high risk, the study found.

This group of flagged patients included nearly half the men who later became infected, a significant improvement from other published HIV risk prediction tools, the study said.

“In preexposure prophylaxis, or PrEP, we have an incredibly powerful tool to stop HIV transmission,” senior author Jonathan Volk, M.D., an infectious disease physician, said in a statement.

PrEP is a daily antiretroviral pill that is more than 99% effective in preventing HIV.

“It is critical that we identify our patients at risk of HIV acquisition. We used our electronic medical record to develop a tool that could be implemented in a busy clinical practice to help providers identify patients who may benefit most from PrEP,” Volk said.

The Centers for Disease Control and Prevention estimates that just 7% of the people who could benefit from PrEP are taking it. Health care providers have difficulty identifying people at risk for HIV acquisition. Relying on the CDC’s indications for PrEP—sexual orientation and a history of sexually transmitted infections—underestimates risk for some populations, including African Americans, who have relatively high HIV incidence and low PrEP use, according to the researchers.

Finding a reliable and automated approach to predict which patients are at risk of HIV infection is a high priority for public health officials. The U.S. Preventive Services Task Force recently gave PrEP therapy a grade A rating—its highest rating—and urged researchers to develop tools to identify individuals at risk for HIV.

“Our predictive model directly addresses this gap and may be substantially more effective than current efforts to identify those who may be good PrEP candidates,” Volk said.

The prediction model does not replace the clinical judgment of medical providers but could save them time and address misconceptions about HIV risk, Volk said.

Kaiser Permanente Northern California has a comprehensive electronic health record that tracks many demographic and clinical data points for its members.

“Development of the tool required a setting like KP Northern California that had high-quality individual-level data on enough people to identify new HIV infections, which are rare events,” said Michael Silverberg, Ph.D., from the Kaiser Permanente Division of Research and co-author of the paper.

The investigators analyzed 81 variables in the EHR and found 44 to be most relevant for predicting HIV risk. A tool that used these 44 variables identified 2.2% of the population as having a high or very high risk of HIV infection within three years. This high-risk group included 38.6% of all new HIV infections: 32 of the 69 men (46.4%) who were diagnosed with HIV during the study period but none of the 14 women, according to the study.
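The percentages above follow directly from the reported counts. A minimal sketch of that arithmetic in Python, using only the figures given in this article (the variable and function names are illustrative and do not come from the study):

```python
# Counts reported in the article: new HIV diagnoses during the study period,
# and how many of those patients were in the flagged high/very-high-risk group.
diagnosed = {"men": 69, "women": 14}
flagged = {"men": 32, "women": 0}


def share_flagged(flagged_count, diagnosed_count):
    """Percentage of diagnosed patients whom the tool had flagged."""
    return 100 * flagged_count / diagnosed_count


men_pct = share_flagged(flagged["men"], diagnosed["men"])
women_pct = share_flagged(flagged["women"], diagnosed["women"])
overall_pct = share_flagged(sum(flagged.values()), sum(diagnosed.values()))

print(f"men: {men_pct:.1f}%  women: {women_pct:.1f}%  overall: {overall_pct:.1f}%")
# prints: men: 46.4%  women: 0.0%  overall: 38.6%
```

The overall figure is lower than the figure for men because none of the 14 women who were diagnosed had been flagged.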

The tool is limited, as are others, in identifying women at risk of contracting HIV. That’s because risk for females may be dependent on the risk factors of their partners, which are not captured by the variables included in the tool, the researchers said.

The tool also performs less well among patients for whom the EHR contains less data, such as new enrollees or patients who access care less frequently.

Researchers compared the new tool with simpler models and found it identified more patients who acquired HIV. Importantly, simpler algorithms were less likely to identify African Americans who became infected, whereas the new tool performed well for both white and African American patients, the study said.

Other health care organizations could build similar algorithms using fewer EHR variables, Volk said. The study found simpler models that included only six variables still helped identify patients at risk for HIV.

The tool could be incorporated into EHR systems to alert primary care providers to speak with patients most likely to benefit from discussions about PrEP. Clinicians could use that opportunity to explain the availability of drug manufacturer and publicly funded programs that may cover all or part of the drug’s copay cost, the study said.

The risk thresholds established in the study flagged a small proportion but a large number of patients (13,463) over a three-year period as potential candidates for PrEP based on HIV risk.

“Embedding our algorithm in the electronic health record could support providers in discussing sexual health and HIV risk with their patients, ultimately increasing the uptake of PrEP and preventing new HIV infections,” said lead author Julia Marcus, Ph.D., formerly of the Kaiser Permanente Division of Research. Marcus is now at Harvard Medical School and Harvard Pilgrim Health Care Institute.

The project was supported by the Kaiser Permanente Northern California Community Benefit Research Program, the National Institute of Allergy and Infectious Diseases and the National Institute of Mental Health.