Biases in EHR data can be reduced to improve research

By Marla Durben Hirsch Aug 27, 2013 11:46am

The data in electronic health records (EHRs) can be refined so that it is more readily reused for research, according to a new study published in the Journal of the American Medical Informatics Association.

EHRs can help identify phenotypes for research--such as specific traits or the presence of disease--but the way the underlying data is recorded is often inaccurate, incomplete and complex, introducing bias that can undermine the data's use in research studies.

The researchers, from Columbia University, sought to better understand how the healthcare process affects the recording of clinical information, with the aim of improving, and possibly speeding, the generation of phenotypes. They took 84 variables in an EHR, consisting of 24 lab tests and 60 clinical concepts, and related them to five common process events: inpatient admission, discharge, outpatient visit, emergency department visit and ambulatory surgery. They then clustered the variables to see whether any were correlated.
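
The idea of clustering variables by how they relate to process events can be illustrated with a minimal sketch. It is not the study's method: the variable names, the per-event scores and the 0.9 threshold are all made up for illustration, and simple correlation-threshold grouping stands in for whatever clustering the researchers actually used. The toy numbers are chosen only so that two groups emerge and say nothing about the study's results.

```python
from itertools import combinations
from math import sqrt

# The five process events named in the article.
EVENTS = ["admission", "discharge", "outpatient", "emergency", "ambulatory_surgery"]

# HYPOTHETICAL data: one made-up score per event for each variable,
# standing in for how strongly that variable's recording tracks the event.
profiles = {
    "lab_sodium":      [0.9, 0.7, 0.2, 0.6, 0.3],
    "lab_potassium":   [0.8, 0.7, 0.1, 0.6, 0.2],
    "note_chest_pain": [0.3, 0.1, 0.4, 0.9, 0.0],
    "note_dyspnea":    [0.4, 0.2, 0.5, 0.9, 0.1],
}

def pearson(x, y):
    """Pearson correlation between two equal-length score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)

def cluster(profiles, threshold=0.9):
    """Group variables whose event profiles correlate at or above threshold.

    Variables are linked pairwise and linked groups are merged
    (single-linkage behaviour, implemented with a small union-find).
    """
    parent = {name: name for name in profiles}

    def find(name):
        while parent[name] != name:
            parent[name] = parent[parent[name]]  # path compression
            name = parent[name]
        return name

    for a, b in combinations(profiles, 2):
        if pearson(profiles[a], profiles[b]) >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for name in profiles:
        groups.setdefault(find(name), []).append(name)
    return sorted(sorted(members) for members in groups.values())

if __name__ == "__main__":
    for members in cluster(profiles):
        print(members)
```

With these toy scores, variables whose profiles rise and fall across the same events end up in the same group. That is the kind of signal that could, in principle, flag variables shaped by the care process rather than by the patient's condition alone.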

The researchers found that the healthcare process did affect the variables, which appear to be sensitive to how the data is collected. For example, lab values and note concepts tended to cluster together.

The researchers hope that such knowledge can be used to select variables from an EHR and correct for biases.

"[T]he approach may point the way in the longer run to a more automated and reliable phenotyping process," the researchers said.

The data in EHRs is increasingly being considered for secondary uses beyond individual patient record keeping, such as large-scale research and predicting future healthcare needs.
