Toolkit designed to make biomedical data exploration easier

Researchers have developed an open-source platform for creating software applications that make complex data understandable and accessible to those without sophisticated informatics expertise.

Commercial analytics tools tend to require biomedical researchers to understand underlying data models before they can effectively explore and use large data sets, according to an article in the Journal of the American Medical Informatics Association.

Researchers at the Children's Hospital of Philadelphia and Perelman School of Medicine at the University of Pennsylvania have validated the platform, called Harvest, on two test cases: pediatric cardiology diagnostic and procedure data, and infectious disease data published by the OpenMRS open-source electronic health record (EHR) project.

The platform helps researchers perform queries on individual or multiple attributes in disparate data types, such as vital signs, blood cell counts, lengthy DNA sequences and bar graphs, and export raw data in an analysis-ready format, according to an announcement.
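
To make the idea of querying several attributes at once and exporting the result more concrete, here is a minimal Python sketch. It is not Harvest code and does not use Harvest's API; the records, the field names (`heart_rate`, `wbc_count`) and the helper functions `query` and `export_csv` are all hypothetical, chosen only for illustration.

```python
import csv
import io

# Hypothetical records; the field names are illustrative, not Harvest's schema.
RECORDS = [
    {"patient_id": 1, "heart_rate": 92, "wbc_count": 6.1},
    {"patient_id": 2, "heart_rate": 130, "wbc_count": 12.4},
    {"patient_id": 3, "heart_rate": 118, "wbc_count": 4.8},
]


def query(records, **ranges):
    """Return records whose fields fall inside every (low, high) range given."""
    return [
        r for r in records
        if all(low <= r[field] <= high for field, (low, high) in ranges.items())
    ]


def export_csv(records):
    """Serialize records to CSV text with a header row."""
    if not records:
        return ""
    buf = io.StringIO()
    writer = csv.DictWriter(
        buf, fieldnames=list(records[0].keys()), lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(records)
    return buf.getvalue()


# Combine two attribute filters, then export the matches for analysis.
matches = query(RECORDS, heart_rate=(100, 140), wbc_count=(4.0, 10.0))
print(export_csv(matches))
```

A real deployment would run such filters against a database rather than an in-memory list, but the shape of the task is the same: combine conditions on several fields, then hand the matching rows to an analysis tool.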

Harvest has three components. Avocado, a data abstraction layer, generates and manages application metadata. Serrano, a web application programming interface, supports several key features, including the ability for users to name and save queries. The Cilantro web client generates data visualizations such as histograms, bar charts, and pie charts for suitable database fields in real time, which lets users see a summary profile of the data even before constructing formal queries.
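
The kind of per-field summary profile described here can be sketched in a few lines of Python. This is an illustration of the general idea only, not how Cilantro or Avocado work internally; the function names `categorical_profile` and `numeric_histogram` and the sample values are assumptions made for the example.

```python
from collections import Counter


def categorical_profile(values):
    """Count occurrences of each distinct value, most common first."""
    return Counter(values).most_common()


def numeric_histogram(values, bin_width):
    """Count values per fixed-width bin, keyed by each bin's lower edge."""
    counts = Counter((v // bin_width) * bin_width for v in values)
    return dict(sorted(counts.items()))


# Hypothetical field contents, invented for this sketch.
diagnoses = ["VSD", "ASD", "VSD", "TOF", "VSD"]
ages_months = [3, 14, 27, 8, 31, 12]

# Category counts could back a bar or pie chart; bins could back a histogram.
print(categorical_profile(diagnoses))
print(numeric_histogram(ages_months, bin_width=12))
```

Counts like these are cheap to compute for each field, which is one plausible reason a client could show them before the user has written any query.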

Harvest can support data discovery in a number of ways, the authors say, including hypothesis generation and testing, clinical outcomes reporting and multisite access to shared research data.

It can easily be reused across a variety of biomedical domains, they say, and deployments so far have required little user training.

Translating clinical quality measures for queries has been difficult. Massachusetts General Hospital computer scientists claim some success in translating between the Health Quality Measures Format (HQMF) and Informatics for Integrating Biology and the Bedside (i2b2), but that work has been complex and time-consuming.

As part of the government's "open data" project, the U.S. Department of Health & Human Services is looking to developers for help in making its massive troves of data more accessible to the public.

To learn more:
- here's the announcement
- read the research
- find the tool and demo