Privacy experts can now rely on a new standard, the ISO/IEC 27559:2022 privacy-enhancing data deidentification framework, in an area that has been the subject of much discussion and development. The framework will play an important role in establishing best practices for the reuse and sharing of data about people. It was the result of a five-year effort, including a study period to review guidance from around the world, not to mention the effort put into the standard on which it was based, ISO/IEC 20889:2018, Privacy enhancing data de-identification terminology and classification of techniques.
Is there a difference between deidentification and anonymization? Not in the workstream from which these standards were produced. Either term could apply depending on the jurisdiction, since the standards are agnostic to legal interpretations. For years, the term anonymization had been used in standards development to convey a range of different meanings. Privacy laws also vary in their use of these terms, so the more neutral term deidentification was chosen to allow for jurisdictional variation.
Suddenly standards
An international standard sets out best practices for doing something by drawing on the involvement of globally established experts. Those experts evaluate the landscape of interest, including guidance and current implementations, and develop standards to meet the needs they identify. A standard gives stakeholders confidence that an implemented process or product is safe, reliable and of good quality.
Sensitive data can be reused in many ways, such as to improve services, identify new opportunities and insights that can shape an organization, and create data products to serve the needs of society. However, there are many dimensions to the safe and responsible reuse of data. Safe reuse can also be thought of in terms of defense in depth, i.e., protecting data from unauthorized access and misuse through layers of administrative and technical controls.
The purpose of the ISO/IEC 27559 framework is to identify various risks and mitigate them across the lifecycle of deidentified data. The process of removing the association between data and people through a range of different possible deidentification techniques is one aspect of the standard, as is governance of the process and resulting data to ensure risks are monitored and addressed as the need arises.
Picking fences
To capture the lifecycle of deidentified data, various scenarios are outlined based on how the organization responsible for personal data, the custodian, will make the deidentified data available to users, the data recipients.
- Use and reuse: The custodian deidentifies data and makes it available to internal users in an internal environment, i.e., under the custodian’s control.
- External sharing: The custodian deidentifies data and makes it available to external users in an internal environment, i.e., access from outside the organization to an environment under the custodian’s control.
- External release: The custodian deidentifies data and makes it available to external users in an external environment, i.e., access from outside the organization to an environment outside of the custodian’s control.
In each case, the custodian may use a third party to implement the deidentification process. However, each scenario introduces different risks that can be identified and mitigated through the use of the implementation framework.
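The three scenarios above differ along two dimensions: whether the users are internal or external, and whether the environment is under the custodian's control. A minimal Python sketch of that mapping follows. The names `Scenario` and `classify_scenario` are hypothetical, chosen for illustration only, and are not taken from the standard.

```python
from enum import Enum


class Scenario(Enum):
    USE_AND_REUSE = "use and reuse"
    EXTERNAL_SHARING = "external sharing"
    EXTERNAL_RELEASE = "external release"


def classify_scenario(external_users: bool, custodian_controls_environment: bool) -> Scenario:
    """Map who accesses the data, and where, to one of the three scenarios."""
    if custodian_controls_environment:
        # Environment under the custodian's control: internal or external users.
        return Scenario.EXTERNAL_SHARING if external_users else Scenario.USE_AND_REUSE
    if external_users:
        # External users in an environment outside the custodian's control.
        return Scenario.EXTERNAL_RELEASE
    # Internal users in an external environment is not one of the three
    # scenarios described above, so this sketch refuses to guess.
    raise ValueError("combination not covered by the three scenarios")


print(classify_scenario(external_users=True, custodian_controls_environment=True).value)
```

The value of the exercise is that each scenario carries a different risk profile, so identifying the scenario first helps determine which controls are worth considering.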
The principles-based framework provides guardrails with a range of options for implementation, including the many deidentification techniques in the prescribed standard ISO/IEC 20889. The framework is divided into four main areas, capturing the environment and circumstances in which deidentified data are made available to users, and how the process and availability will be governed.
- Context assessment: Evaluating what external information may be available to an adversary, based on the environment and circumstances in which the deidentified data will be made available. Administrative and technical controls will mitigate potential context risks.
- Data assessment: Evaluating how the additional information available to an adversary could be used to reveal or uncover personal information. Limiting what data is made available will mitigate potential data risks.
- Identifiability assessment and mitigation: Determining a measure of identifiability, the chances of an attack, or the chances that an attack will succeed, using the previous assessments for context. Based on established benchmarks, the degree of disclosure is mitigated through a combination of contextual controls, data minimization and transformation.
- Deidentification governance: Following documented procedures and processes so the custodian is assured the above are done well, and there are controls and response mechanisms in place to manage risks before, during and after deidentified data is made available to users.
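To make the identifiability assessment more concrete, here is a minimal Python sketch of one common way such a measure can be computed. It is an illustration under stated assumptions, not the method prescribed by ISO/IEC 27559: it assumes identifiability is measured as one divided by the size of the smallest group of records sharing the same indirectly identifying values, and that overall risk is the chance of an attack multiplied by the chance the attack succeeds. The function names, field names, attack probability and threshold are all hypothetical.

```python
from collections import Counter


def max_identifiability(records, indirect_identifiers):
    """Return 1 / (size of the smallest group sharing the same identifier values)."""
    if not records:
        raise ValueError("records must not be empty")
    groups = Counter(tuple(r[field] for field in indirect_identifiers) for r in records)
    return 1 / min(groups.values())


def overall_risk(prob_attack, prob_success_given_attack):
    """Combine the context assessment (attack) with the data assessment (success)."""
    return prob_attack * prob_success_given_attack


# Hypothetical, already generalized data: ages in bands, locations as regions.
records = [
    {"age_band": "30-39", "region": "North"},
    {"age_band": "30-39", "region": "North"},
    {"age_band": "40-49", "region": "South"},
    {"age_band": "40-49", "region": "South"},
    {"age_band": "40-49", "region": "South"},
]

data_risk = max_identifiability(records, ["age_band", "region"])  # smallest group has 2 records
risk = overall_risk(prob_attack=0.1, prob_success_given_attack=data_risk)

THRESHOLD = 0.1  # hypothetical benchmark, not a value taken from the standard
print(f"data risk={data_risk}, overall risk={risk:.3f}, acceptable={risk <= THRESHOLD}")
```

In this sketch, stronger contextual controls would lower the assumed attack probability, while data minimization and transformation would enlarge the smallest group. Both reduce the overall figure, which mirrors the combination of mitigations described in the list above.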
Standards: Above and beyond
Once national standards bodies adopt ISO/IEC 27559, auditors will test organizations' compliance with the standard through assessment frameworks. This typically involves an evaluation of the controls required to manage risk, versus the controls that should be considered but are optional. Normally, a reason is needed to exclude optional controls, such as a risk assessment to determine reasonableness.
This standard provides a way forward for the safe and responsible use of sensitive data. It provides a global perspective on the definitions, interpretations and practices that can make a deidentification process successful. The detailed list of controls and considerations will facilitate the evaluation of risks in the lifecycle of deidentified data, and provide a means to properly implement deidentification techniques and ensure effective governance of deidentified data.
Industry, governments and regulators now have an agreed-upon framework to leverage for assessment and implementation purposes. In the meantime, privacy models will continue to be refined to produce safe, timely and useful data at scale, and will play an important role in the deidentification framework. And, in five years’ time, this standard will be reviewed and updated to ensure it remains current with best practices, making it a valuable resource now and into the future.