An algorithm to extract data from free text in electronic health records may be an invaluable aid in conducting research, according to a new study in BMC Medical Informatics & Decision Making.
The study's authors, from University College London, noted that most research involving EHRs relies on coded data because it is readily available and easier to use. However, the clinical and other information recorded in free text can also be very helpful to researchers. They developed a computer program based on a free text matching algorithm to extract dates, lab tests, diagnoses and other information from free text, and applied it to the EHRs of 3,310 patients who died in 2001 to identify their cause of death.
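To give a rough sense of what pulling dates and lab values out of free text can involve, here is a toy sketch in Python. It is not the researchers' algorithm: the regular expressions, the `extract` function, the lab test names and the sample note are all hypothetical, chosen only for illustration, and a real system would need far richer matching.

```python
import re

# Hypothetical patterns for illustration only; not taken from the study.
# Matches dates written as day/month/year, e.g. 12/03/2001.
DATE_RE = re.compile(r"\b\d{1,2}/\d{1,2}/\d{4}\b")
# Matches a few example lab test names followed by a numeric value.
LAB_RE = re.compile(
    r"\b(HbA1c|cholesterol|creatinine)\s*(?:=|:|of)?\s*(\d+(?:\.\d+)?)",
    re.IGNORECASE,
)

def extract(note):
    """Return the dates and (lab name, value) pairs found in a note."""
    dates = [m.group(0) for m in DATE_RE.finditer(note)]
    labs = [(m.group(1).lower(), float(m.group(2))) for m in LAB_RE.finditer(note)]
    return {"dates": dates, "labs": labs}

# Invented sample note, not real patient data.
note = "Seen 12/03/2001. Cholesterol 5.2, HbA1c: 7.1. Died 30/11/2001 of MI."
result = extract(note)
print(result["dates"])
print(result["labs"])
```

Simple patterns like these miss misspellings, abbreviations and negations, which is part of why a purpose-built tool is needed for large-scale work.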
They found that the algorithm achieved a precision of more than 90 percent across a variety of types of free text, and "may facilitate research using databases of English electronic health records by reducing the need for time-consuming manual review of free text." Using the program also eliminated the need to de-identify the data.
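For readers unfamiliar with the term, precision is the share of extracted items that turn out to be correct: true positives divided by the sum of true positives and false positives. A minimal sketch of the calculation follows; the counts are made up for illustration and are not figures from the study.

```python
def precision(true_positives, false_positives):
    """Fraction of extracted items that were correct."""
    total = true_positives + false_positives
    if total == 0:
        raise ValueError("no extracted items to evaluate")
    return true_positives / total

# Hypothetical counts: 92 correct extractions and 8 incorrect ones.
print(precision(92, 8))  # 0.92
```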
The researchers noted that much information is located only in free text and that it would be "impossible" to extract on a large scale without an automated tool. Eventually, algorithms to extract this information could be used not only for research but also in clinical settings, for example to assist with coding, they concluded.
Other studies have found that mining free text in EHRs can be a boon to research. Studies have also shown that how a physician documents patient encounters in an EHR can affect the care patients receive: quality of care was significantly better for patients whose physicians used free text or structured documentation such as templates, rather than dictation.