We like to think that data is unfeeling--that it has no prejudice or emotion, so it can't be swayed by impulse or bias. It's a strictly neutral party, and therefore one that can be trusted implicitly.
Unfortunately, that's not always true.
The inequality of data was on full display in "Machine Bias," an investigative report published by ProPublica on Monday that looks at how the justice system uses computer algorithms to predict the risk of future crime. ProPublica's analysis found that black defendants were almost twice as likely as white defendants to be wrongly labeled as higher risk. On the other hand, a larger percentage of white defendants were categorized as low risk but went on to commit other crimes.
Why does this seeming bias matter? For one, these risk assessments are being used to inform sentencing and bail decisions "at every stage of the criminal justice system," according to ProPublica. However, according to the publication's analysis, the risk assessment tool correctly predicts recidivism just 61 percent of the time.
It seems these inaccuracies were born not of malice (the creators of the algorithm did not intentionally build in racial bias), but of a lack of independent validation and, perhaps more importantly, an overreliance on the resulting risk scores.
To be fair, the scores were an enticing tool. Faced with a growing prison population, the risk assessments presented what seemed to be a judicious and unbiased method of doling out punishment. But in some instances, the risk scores were probably given too much weight.
Fraud investigators are similarly smitten with the use of claims data and the emergence of predictive analytics, and rightly so: These tools have changed the game, gradually pushing the industry toward more preventive tactics, rather than the traditional "Whac-A-Mole" approach of rooting out fraud once it already happens.
Although health insurers are still cautiously embracing predictive analytics, there's evidence that more rigorous data analysis is effective when it comes to identifying large, sophisticated schemes. For example, federal officials have said analytics played a key role in the government's historic fraud bust last year. In Florida, computer scientists are now just as valuable as traditional law enforcement in fraud fighting, and around the country, Medicare contractors are showing a clear willingness to embrace predictive models. One former personal injury attorney has even been dubbed a "bounty hunter" of Medicare fraud cases thanks to his reliance on claims data.
Incorporating data into fraud investigations has the potential to shift the approach entirely, and that's good. But it's also easy to cross that line--to heap so much weight on data and predictive analytics that it brings the whole system crashing down.
ProPublica's adept analysis of criminal risk scores is an example of that. Instead of using data as one tool, the justice system gave it precedence.
That's an important cautionary tale to remember as data becomes a more integral part of fraud detection. Next week, FierceHealthPayer: Antifraud will publish an article about how insurers are battling prescription drug fraud. Analyzing claims data represents a major part of the investigation process. But, as you'll hear from experts, unusually high claims are not a measure of absolute guilt--merely a jumping-off point for a more traditional "boots-on-the-ground" investigation.
The age of big data might be upon us, but humans--even with all of our preconceptions, subjectivity and fallibility--still possess the skills to account for intricacies an algorithm ignores.

- Evan (@HealthPayer)