How data from health records, social media streams improves flu predictions

By combining data from social media and other sources, researchers were able to more accurately predict incidence of flu, according to a study published last week in PLoS Computational Biology.

Computational epidemiologists at Boston Children's Hospital first compiled predictive models from five sources: near real-time records from medical practices managed by athenahealth; Google Trends, which tracks search volumes for specific queries; flu-related Twitter posts; FluNearYou, a participatory flu-surveillance system; and Google Flu Trends.

With help from the Centers for Disease Control and Prevention, Google is revamping its flu-prediction models after overestimating flu activity last year.

The researchers combined those sources into a single machine-learning prediction engine to provide what they call "nowcasts," or real-time estimates of flu activity, as well as predictions up to four weeks into the future.

The team compared the results with historical CDC flu reports and with Google Flu Trends "nowcasts" from the 2013-14 and 2014-15 flu seasons.

The combined model outperformed estimates based on any single source and lined up almost exactly with the CDC's estimates of real-time flu activity, according to the report. The system also generated better forecasts of both the timing and magnitude of flu activity than models relying on historical data alone.
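
To give a rough sense of why blending several noisy signals can beat any one of them, here is a minimal Python sketch. It is not the study's method: the inverse-error weighting is a simple stand-in for the researchers' machine-learning engine, and the source names (`search`, `tweets`, `clinic`), function names and all numbers are hypothetical.

```python
# Illustrative only: inverse-error weighting is an assumed stand-in for the
# study's machine-learning engine, and every number below is made up.

def inverse_mse_weights(history, truth):
    """Weight each source by 1 / (its mean squared error against truth)."""
    inv = {}
    for name, estimates in history.items():
        mse = sum((e - t) ** 2 for e, t in zip(estimates, truth)) / len(truth)
        inv[name] = 1.0 / (mse + 1e-9)  # small constant avoids division by zero
    total = sum(inv.values())
    return {name: value / total for name, value in inv.items()}


def combine(weights, current):
    """Weighted average of this week's per-source estimates."""
    return sum(weights[name] * current[name] for name in weights)


# Hypothetical past weeks of flu activity (reference values, then each source).
truth = [1.0, 2.0, 3.0, 4.0]
history = {
    "search": [1.5, 2.5, 3.5, 4.5],  # consistently 0.5 too high
    "tweets": [0.0, 3.0, 2.0, 5.0],  # off by 1.0 every week
    "clinic": [1.1, 1.9, 3.1, 3.9],  # off by 0.1 every week
}

weights = inverse_mse_weights(history, truth)
current = {"search": 5.4, "tweets": 4.0, "clinic": 5.1}
nowcast = combine(weights, current)
print(weights, round(nowcast, 2))
```

In this toy setup the source that has tracked the reference values most closely receives most of the weight, so the blended estimate stays near it while still drawing on the others. A real system would also need to handle reporting delays, missing weeks and sources whose reliability shifts over time.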

Other institutions are also working to reduce flu cases and offer more accurate predictions. Johns Hopkins University, for instance, landed a grant from the National Institutes of Health to create a new center focused on improving flu tracking. In addition, researchers at Los Alamos National Laboratory in New Mexico have reported being able to forecast trends in influenza and dengue fever by tracking Wikipedia page views on the illnesses.
