Automated macro helps to flag unauthorized health data exposure

As collaborative research grows, so does the potential for exposure of protected health information. Setting up a macro within SAS software can effectively automate the process of flagging such information, according to a paper published in BMC Medical Informatics and Decision Making.

The authors, from Mid-Atlantic Permanente Research Institute, HealthPartners Institute for Education and Research and other organizations, wanted the tool to identify the most common types of data likely to be exposed and to run quickly, while leaving the decision about what to do with flagged information to humans rather than masking or otherwise changing it.

The authors noted that data can be traced back to a patient through, among other things, dates linked to rare characteristics, such as advanced age, and indications of small populations with rare disorders. The macro was designed to flag information such as a medical record number, Social Security number, date of birth and date of service.
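To make the flag-but-don't-mask idea concrete, here is a minimal sketch in Python rather than SAS. It is not the authors' macro: the regular expressions, the column names and the `flag_columns` helper are all hypothetical, chosen only to show how a scan might list suspicious fields for a human reviewer without altering the data.

```python
import re

# Hypothetical patterns for illustration; the published macro's rules are not reproduced here.
SSN_VALUE = re.compile(r"^\d{3}-?\d{2}-?\d{4}$")
DATE_VALUE = re.compile(r"^\d{4}-\d{2}-\d{2}$")
SUSPECT_NAMES = re.compile(r"(mrn|ssn|birth|dob|service_?date)", re.IGNORECASE)


def flag_columns(rows):
    """Return {column: sorted reasons} for columns that merit human review.

    Nothing is masked or changed; the result is only a list of flags.
    """
    flags = {}
    for row in rows:
        for column, value in row.items():
            reasons = flags.setdefault(column, set())
            if SUSPECT_NAMES.search(column):
                reasons.add("suspicious column name")
            text = str(value).strip()
            if SSN_VALUE.match(text):
                reasons.add("value looks like an SSN")
            if DATE_VALUE.match(text):
                reasons.add("value looks like a date")
    # Keep only the columns that collected at least one reason.
    return {col: sorted(r) for col, r in flags.items() if r}


# Made-up sample rows, not real patient data.
sample = [
    {"patient_ssn": "123-45-6789", "visit": "2013-02-14", "age_group": "40-49"},
    {"patient_ssn": "987-65-4321", "visit": "2013-03-01", "age_group": "50-59"},
]
report = flag_columns(sample)
for column, reasons in report.items():
    print(column, "->", ", ".join(reasons))
```

In this toy run the `patient_ssn` and `visit` columns are listed for review and `age_group` is not; what to do with a flagged column stays with the reviewer.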

In an evaluation of the macro's performance on 100 sample research data sets, the researchers reported a recall of 0.98 and precision of 0.81 when compared with human review.
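For readers unfamiliar with the two measures: with human review as the reference, recall is the share of items the reviewers identified that the macro also flagged, and precision is the share of the macro's flags that the reviewers confirmed. The sketch below shows the arithmetic on made-up identifiers; the sets and the `precision_recall` helper are illustrative assumptions and do not reproduce the paper's data.

```python
def precision_recall(flagged, truly_sensitive):
    """Compute precision and recall of `flagged` against a reference set."""
    true_positives = len(flagged & truly_sensitive)
    precision = true_positives / len(flagged) if flagged else 0.0
    recall = true_positives / len(truly_sensitive) if truly_sensitive else 0.0
    return precision, recall


# Made-up toy data: items flagged by an automated scan vs. by a human reviewer.
macro_flags = {"item01", "item02", "item03", "item04", "item05"}
human_flags = {"item01", "item02", "item03", "item04"}

precision, recall = precision_recall(macro_flags, human_flags)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Here the scan catches everything the reviewer found (recall 1.0) but one of its five flags is a false alarm (precision 0.8). For a screening tool that defers to a human, higher recall than precision is the more forgiving trade-off.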

The macro produces a PDF report that the site data reviewer then can use to ensure data is consistent with the data-sharing and institutional review board agreements, and does not contain unauthorized personal health information.

Code for the macro has been made available to the public.

A new disease registry, dubbed Reg4All, aims to boost patient participation in clinical trials by offering patients the final say over who sees their information and how it is used.

What's more, research at Clemson and Indiana universities found that patients want granular control over the information stored in electronic health records. Physicians, however, are less likely to grant them full access to their records, according to an Accenture poll.

As new HIPAA regulations go into effect, privacy protections are growing ever more important and harder to ensure. For instance, a study published in the journal Science showed that patients can be identified in "anonymous" public genetics databases.
