How Google, Mayo Clinic and Kaiser Permanente tackle AI bias and thorny data privacy problems

Provider and tech organizations alike often say that artificial intelligence, if fully realized, could herald a new paradigm for healthcare delivery.

Even today, teams across major names like the Mayo Clinic, Google and Kaiser Permanente are working on tools that can rapidly scan images to predict medical diagnoses, skim medical records to highlight optimal clinical decisions for practitioners, triage patients, streamline administrative tasks and nudge consumers toward healthier behaviors or information.

But those same AI evangelists are also quick to acknowledge the numerous barriers and pitfalls where the technology has so far stumbled. Chief among these are issues of privacy, transparency and equity.

“Everyone should have the opportunity to achieve the full benefits of AI from their healthcare system,” Michael Howell, M.D., chief clinical officer and deputy chief health officer at Google, said Monday during an online discussion about healthcare AI. “We should work systematically to eliminate those barriers.”

Recent years have seen several reports warning that algorithms intended to support care delivery can inadvertently drive race and class disparities.

Howell pointed to the U.S. health system’s history of inequity as a key roadblock, saying that “the data that we collect has the disparities built into it even if the data collection process is fully equitable—which it usually isn’t.”

Fatima Paruk, M.D., chief health officer and senior vice president at Salesforce, said a key reason for those issues becomes clear when comparing U.S. patient data with the data other countries use to develop their own healthcare AI applications.

“When it comes to U.S. data, unfortunately, bias is just inherent in the system,” she said. “If you think historically about the data that has been available and just the historical lack of access to care, the U.S. data sometimes suffers from not being as inclusive as it could be, just from that [lack of] access.”

“Built-in” disparities among U.S. healthcare’s historical data have been “very hard” issues for algorithm developers to solve, Howell said. Teams are now recognizing the need to comb over the technical and clinical workflow decisions they’re making in development to avoid amplifying any existing disparities, he said.

Part of that process is monitoring an algorithm’s performance as well as how its introduction affects care practice and outcomes, experts said.

Verily President of Health Platforms Vivian Lee, M.D., Ph.D., said that, across the board, her team now takes a magnifying glass to the populations being enrolled and later engaged by any AI tool to monitor any impacts related to social determinants, health equity and disparities.

Similarly, Edward Lee, M.D., Permanente Federation executive vice president and chief information officer, sponsors a group within the organization that ensures algorithms continue to receive quality assurance and monitoring as they see wider use and incorporate additional data.  

“We’re making sure that we’re providing the most equitable care possible and [making] sure the algorithms are still doing what they’re intended to do,” Lee said. “If you don’t look, you’ll never find bias. You need to make sure that that’s built into the process in which the algorithms are used, developed and continue to be maintained.”
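To make that kind of “looking” concrete, below is a minimal sketch, in Python, of the sort of subgroup audit the panelists describe: it compares a model’s sensitivity across patient groups and flags any group that lags the overall population. The column names, the synthetic data and the flagging threshold are illustrative assumptions, not any organization’s actual monitoring setup.

```python
# Illustrative subgroup audit: compare a deployed model's sensitivity (recall)
# across patient groups and flag groups that trail the overall population.
# Column names, synthetic data and the 5-point threshold are assumptions.
import pandas as pd
from sklearn.metrics import recall_score

def audit_by_group(df: pd.DataFrame, group_col: str, gap_threshold: float = 0.05) -> pd.DataFrame:
    """Flag subgroups whose recall trails the overall population by more than the threshold."""
    overall_recall = recall_score(df["y_true"], df["y_pred"])
    rows = []
    for group, subset in df.groupby(group_col):
        group_recall = recall_score(subset["y_true"], subset["y_pred"])
        gap = overall_recall - group_recall
        rows.append({
            "group": group,
            "n": len(subset),
            "recall": round(group_recall, 3),
            "gap_vs_overall": round(gap, 3),
            "flagged": gap > gap_threshold,
        })
    return pd.DataFrame(rows)

# Synthetic example: model predictions scored against chart-reviewed outcomes.
scored = pd.DataFrame({
    "y_true": [1, 0, 1, 1, 0, 1, 0, 1],
    "y_pred": [1, 0, 0, 1, 0, 1, 0, 0],
    "group":  ["A", "A", "B", "B", "A", "B", "A", "B"],
})
print(audit_by_group(scored, group_col="group"))
```

Running such a report on a schedule, rather than once at launch, is one way to build the “if you don’t look, you’ll never find bias” principle into routine maintenance.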

Organizations can also look outward to diversify their data collection and validation processes.

“Creating a model isn’t sufficient—one needs to test the model, validate the model, use data even outside of the training set,” said John Halamka, M.D., president of Mayo Clinic Platform. “So, we’ve also established a variety of collaborations, nationally and internationally, to test the models to make sure they’re fair, unbiased and useful for purpose.”
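As a deliberately simplified illustration of that point, the sketch below trains a model on synthetic data from one “site” and then scores it on data from a second site whose distribution is shifted, the way an external validation partner’s population might differ. The features, the shift and the logistic regression model are assumptions made purely for demonstration, not Mayo Clinic Platform’s actual method.

```python
# Toy external validation: fit a model at one site, then score it on data the
# model never saw from a second site with a shifted population.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)

def make_site(n: int, shift: float):
    """Synthetic 'site' whose feature distribution is shifted to mimic
    demographic or practice-pattern differences between institutions."""
    X = rng.normal(loc=shift, scale=1.0, size=(n, 4))
    y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(size=n) > 0).astype(int)
    return X, y

X_train, y_train = make_site(2000, shift=0.0)         # development site
X_external, y_external = make_site(1000, shift=0.7)   # external validation site

model = LogisticRegression().fit(X_train, y_train)

print("internal AUROC:", roc_auc_score(y_train, model.predict_proba(X_train)[:, 1]))
print("external AUROC:", roc_auc_score(y_external, model.predict_proba(X_external)[:, 1]))
# A large drop at the external site is a signal the model may not be fair or
# fit for purpose outside the population it was trained on.
```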

Mayo Clinic’s recent AI efforts have centered around cleaning up and reorganizing episodic health record data to create longitudinal records yielding healthcare insights, Halamka explained. Part of that effort was de-identifying patients’ data—traditionally another sore spot for healthcare.

Simply scrubbing the names from a patient record isn’t enough to prevent that individual from being identified by other details such as their employer, age, job title or region, Halamka said, referencing a 1997 case in which data privacy researcher Latanya Sweeney was able to glean Massachusetts Governor William Weld’s medical information by linking supposedly de-identified hospital records with public voter rolls.
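The sketch below illustrates that linkage risk with a toy example: joining a “de-identified” table to a public one on a handful of quasi-identifiers is enough to re-attach names to diagnoses, and a simple k-anonymity count shows how many records are unique on those fields. The field names and records are invented for illustration and are not drawn from any real dataset.

```python
# Toy demonstration of re-identification by linkage on quasi-identifiers.
import pandas as pd

deidentified_records = pd.DataFrame({
    "zip": ["02138", "02138", "02139"],
    "birth_year": [1945, 1982, 1970],
    "job_title": ["governor", "teacher", "nurse"],
    "diagnosis": ["hypertension", "asthma", "diabetes"],
})

public_records = pd.DataFrame({           # e.g. voter rolls or press coverage
    "name": ["W. Weld", "J. Smith"],
    "zip": ["02138", "02139"],
    "birth_year": [1945, 1970],
    "job_title": ["governor", "nurse"],
})

# Joining on the quasi-identifiers re-attaches names to "anonymous" diagnoses.
reidentified = deidentified_records.merge(
    public_records, on=["zip", "birth_year", "job_title"], how="inner"
)
print(reidentified[["name", "diagnosis"]])

# A simple k-anonymity check: how many records share each quasi-identifier combo?
k = deidentified_records.groupby(["zip", "birth_year", "job_title"]).size()
print("records with a unique combination (k = 1):", int((k == 1).sum()))
```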

“We had to go in through natural language processing, take all the proper nouns and do something we call ‘hiding in plain sight,’ which is instead of saying ‘this leader of all things healthcare for Kaiser Permanente,’ it might say ‘this medical leader at Yale,’” Halamka said. “You change things in such a way that they’re harder to re-identify.

“Though we bring in partners and joint ventures and collaborations into the de-identified data … we do not allow that kind of linkage and attempt to re-ID [Sweeney used],” he continued. “Everything is audited and everything ensures that what can leave are discoveries, but not the data itself. And in that respect, the patients and government, academia and industry stakeholders we spoke to say [we’ve] actually taken a very reasonable approach to doing good without doing harm.”
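Here is a toy illustration of that “hiding in plain sight” idea: instead of redacting proper nouns, each detected entity is swapped for a plausible surrogate of the same type. Production pipelines rely on NLP named-entity recognition over free text; the hard-coded entity list and surrogate pools below stand in for that step and are assumptions for demonstration only, not Mayo Clinic’s implementation.

```python
# Toy 'hiding in plain sight' de-identification: replace detected proper nouns
# with realistic surrogates so residual identifiers are harder to trust.
import random

# Entities a real system would detect automatically (e.g. with an NER model).
detected_entities = {
    "ORG": ["Kaiser Permanente"],
    "PERSON": ["Edward Lee"],
    "GPE": ["Oakland"],
}

# Pools of plausible surrogates to substitute for each entity type.
surrogate_pools = {
    "ORG": ["Yale New Haven Health", "Cleveland Clinic"],
    "PERSON": ["Alex Morgan", "Jordan Casey"],
    "GPE": ["Denver", "Columbus"],
}

def hide_in_plain_sight(text: str, seed: int = 0) -> str:
    """Replace each detected entity with a randomly chosen surrogate of the same type."""
    rng = random.Random(seed)
    for label, entities in detected_entities.items():
        for entity in entities:
            text = text.replace(entity, rng.choice(surrogate_pools[label]))
    return text

note = "Edward Lee, a leader of all things healthcare for Kaiser Permanente, was seen in Oakland."
print(hide_in_plain_sight(note))
```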

Ensuring healthcare AI does right by its patients will require providers to contribute their clinical expertise during development, validation and at the point of care, Halamka and Lee said.

The former noted that Mayo Clinic has built an “AI enablement and education” group tasked with educating the system’s full workforce on AI use and development.

To illustrate the benefit of such a program, he shared a cautionary story about a “non-physician” who, while reviewing data models, suggested giving more inpatients high-dose morphine to reduce readmission rates.

“Now I think all of us would agree that while that may be true, that’s highly undesirable,” Halamka said. “Your clinical expertise is absolutely key to asking the right questions. And with a bit of enablement and education—potentially even working with the technologists at their elbow—they can create models that have clinical relevance, fairness and utility.”