As healthcare organizations continue to collect more and more patient data across the spectrum of care, there is a rapidly growing need to access and use this treasure trove of data to solve some of healthcare’s most challenging problems. Big Data is the key to advancing research for new treatments and cures, population health surveillance, certification, improving organizational efficiency and data monetization.
The key to unlocking this tremendous potential is to ensure that patient privacy is protected and that healthcare data shared for secondary uses is no longer considered protected health information (PHI) under HIPAA because personally identifiable information has been removed or modified. Currently, there are two main barriers to achieving this: 1) a lack of clear, industry-accepted de-identification standards, and 2) a shortage of trained and certified experts who can ensure that responsible and appropriate de-identification methodologies and standards are used to protect patient privacy while still producing high-quality, high-value data sets.
The HIPAA Privacy Rule established national standards and required safeguards to protect PHI and set conditions on how it is used. The Privacy Rule specifies two approaches to de-identification: Safe Harbor and Expert Determination. Safe Harbor lists 18 data types that must be removed or modified. Sixteen of these are known as direct identifiers, such as names and Social Security numbers. The remaining two, dates and geographic locations, are known as quasi-identifiers.
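To make the Safe Harbor treatment of the two quasi-identifiers concrete, here is a minimal Python sketch. It assumes the commonly cited Safe Harbor rules (dates reduced to the year, ZIP codes truncated to their first three digits, and sparsely populated three-digit prefixes replaced with 000). The field names, the short list of direct identifiers and the low-population prefix set are hypothetical placeholders, not a complete or authoritative implementation.

```python
from datetime import date

# Hypothetical field names standing in for direct identifiers;
# the actual Safe Harbor list is much longer.
DIRECT_IDENTIFIERS = {"name", "ssn", "email", "phone"}

# Placeholder set of 3-digit ZIP prefixes covering 20,000 or fewer people.
# A real list must be derived from current Census data.
LOW_POPULATION_ZIP3 = {"036", "059"}


def safe_harbor_sketch(record: dict) -> dict:
    """Drop direct identifiers and coarsen dates and ZIP codes."""
    out = {}
    for field, value in record.items():
        if field in DIRECT_IDENTIFIERS:
            continue  # direct identifiers are removed outright
        if isinstance(value, date):
            out[field] = value.year  # keep only the year of any date
        elif field == "zip":
            zip3 = str(value)[:3]
            # Sparsely populated prefixes are replaced with "000"
            out[field] = "000" if zip3 in LOW_POPULATION_ZIP3 else zip3
        else:
            out[field] = value  # all other fields pass through unchanged
    return out
```

Calling `safe_harbor_sketch` on a record with a name, an admission date and a five-digit ZIP code returns a record with no name, only the admission year and a three-digit ZIP prefix. The loss of the exact date is the kind of detail that can matter to researchers.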
While Safe Harbor can be seen as a useful “one-size-fits-all” strategy, it can limit the benefits of using health data for research and analysis: researchers may be unable to capture important trends when a data field has been completely removed or masked. Expert Determination methodologies exist so that critical data can be used while still protecting patient privacy.
While there is not currently one standard method for de-identification, four major organizations have adopted the Expert Determination standard: the Institute of Medicine (IOM), the Health Information Trust Alliance (HITRUST), the Pharmaceutical Users Software Exchange (PhUSE) and the Council of Canadian Academies. Their frameworks help guide organizations through accessing, storing and exchanging personal information, and they are a major step in clarifying current methodologies.
Expert Determination requires an expert to assess the risk given the specific context for which the data will be used or released. Based on the level of risk, direct identifiers and quasi-identifiers can be removed and modified so that the data retains the greatest value and utility for research and analysis, while still protecting privacy. To responsibly use and maximize the value of sensitive data while protecting patient and consumer privacy and minimizing risk, organizations must de-identify personal information using a risk-based approach that goes beyond simple data masking techniques. Masking can leave a company exposed to financial, reputational and legal risks if there is a data breach. De-identification is more sophisticated due to the expert’s ability to consider the context of the data release and use.
Unfortunately, there are not enough data de-identification experts who are trained to assess risk and apply the Expert Determination methodology. This shortage could cause at least three possible negative impacts to research and health care advancement. First, a shortage of experts means Safe Harbor will be used more frequently and result in data that may not be appropriate for solving some of health care’s most challenging issues. Second, non-experts may improperly perform Expert Determination. This would result in the disclosure of data sets with unacceptably high re-identification risks. Third, it is possible that analyses on health data will simply not occur. This would result in many health organizations sitting on an untapped resource of data, leading to less progress in research for treatments and cures, and less innovation and improvement for the health care organizations.
Just as there is no one standard methodology for protecting privacy, there is no one way to become an expert. There is no one professional degree or certification program that someone can complete to become an “expert.” Right now, expertise is achieved by earning a professional degree or certificate, training with existing industry experts, learning on de-identification software or a combination of these options. Experts typically have a background in statistics, mathematics or other scientific domains. When assessing whether someone qualifies as an expert, the HHS Office for Civil Rights reviews the person's relevant professional experience, education and experience applying de-identification methodologies.
HIPAA defines the Expert Determination method as a determination made by a person with appropriate knowledge of and experience with generally accepted statistical and scientific principles and methods for rendering information not individually identifiable, who:
(i) Applying such principles and methods, determines that the risk is very small that the information could be used, alone or in combination with other reasonably available information, by an anticipated recipient to identify an individual who is a subject of the information; and
(ii) Documents the methods and results of the analysis that justify such determination.
What this means is that the expert needs to be able to do three main tasks:
- Define the “very small” risk of re-identification in a defensible way.
- Select the appropriate metrics and measure the re-identification risk.
- Transform the data using proven de-identification techniques in order to meet the pre-defined “very small” risk.
We won’t go into the formulas for calculating risk, but to determine that the process is defensible and that the risk is very small, the context of the data and its use must be considered. This means the expert should look at:
- Whether the data will be released publicly or internally.
- How the data is stored, how likely it is to be attacked and how it might be breached.
- What procedures and policies are in place within the organization so that employees are informed.
Once the trained expert has defined the metrics, evaluating the risk and transforming or modifying the data can be automated using software designed to create de-identified data sets based on the Expert Determination methodology. While all of these processes can be completed by hand, a manual approach is significantly more time-consuming and increases the risk of human error. As a result, a manual approach is not a scalable solution as demand for large, complex de-identified data sets grows.
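As a rough illustration of the kind of measurement such software automates, the Python sketch below groups records by their quasi-identifier values and reports the worst-case risk as one divided by the size of the smallest group. This is a simple equivalence-class measure chosen for illustration, not a formula prescribed by HIPAA or by any of the frameworks mentioned above. The sample records, the field names and the 0.5 threshold are all hypothetical; a real “very small” threshold would be set by the expert for the specific release context.

```python
from collections import Counter


def max_reidentification_risk(records, quasi_identifiers):
    """Return 1 / (size of the smallest group sharing quasi-identifier values)."""
    groups = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return 1 / min(groups.values())


def generalize_age(record, width=10):
    """Replace an exact age with a range such as '30-39'."""
    low = (record["age"] // width) * width
    return {**record, "age": f"{low}-{low + width - 1}"}


# Hypothetical sample data; real data sets are far larger.
records = [
    {"age": 34, "zip3": "902", "diagnosis": "J45"},
    {"age": 37, "zip3": "902", "diagnosis": "E11"},
    {"age": 52, "zip3": "100", "diagnosis": "I10"},
    {"age": 58, "zip3": "100", "diagnosis": "J45"},
]

THRESHOLD = 0.5  # illustrative only, not a recommended value

# Every record is unique on (age, zip3), so the worst-case risk is 1.0.
before = max_reidentification_risk(records, ["age", "zip3"])

# After generalizing ages to ten-year bands, each group holds two records.
generalized = [generalize_age(r) for r in records]
after = max_reidentification_risk(generalized, ["age", "zip3"])

print(before, after, after <= THRESHOLD)
```

The diagnosis field is left untouched, which is the point of a risk-based approach: only as much detail is removed from the quasi-identifiers as is needed to reach the chosen threshold.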
There is tremendous potential for healthcare organizations to leverage big data to drive further insights for treatments, cures and improved healthcare. In order to increase the pace and scale of big data use in healthcare, adoption of industry standards for de-identification and more trained de-identification experts are needed. Standards and expert training are essential to ensure that data is properly de-identified in order to protect patient privacy while still preserving the value and usefulness of the data for critical research and analysis. While progress is being made, federal guidance on de-identification methodology standards is needed along with a clear, industry-accepted path for professional training and certification.