The idea that technology can help resolve international data privacy and security issues has been raised occasionally but has not been taken seriously. Tensions in global data security requirements, together with the importance of big data analytics and permitted use, require revisiting the capabilities of de-identification as a concrete means of reconciling conflicting interests.
The EU-U.S. Privacy Bridges report, produced as part of the 37th International Data Protection and Privacy Commissioners’ Conference this year, included "Best Practices For De-Identification Of Personal Data." It stated:
[quote]De-identification of personal data is a critical tool for protecting personal information from abuse. This bridge calls on EU and U.S. regulators … to identify concrete, shared standards on de-identification practices. Common standards will improve privacy protections on both sides of the Atlantic while enhancing legal certainty for both EU and U.S. organizations that follow these recommendations.[/quote]
To fully benefit from de-identification, however, we must move beyond traditional static de-identification and embrace new dynamic approaches.
Last year, I wrote that “the masking of private information by using a single, unchanging identifier to hide connections between data and data subjects (also known as ‘static anonymity’)” and the “tiresome kabuki theater [of] the Transportation Security Administration” both “encourage complacency by fostering not only a false sense of security but a false sense of utility as well.” I also argued that the trust economy “cannot be maintained using static anonymity. We must embrace new approaches like dynamic data obscurity to both maintain and earn trust and more effectively serve businesses, researchers, healthcare providers and anyone who relies on the integrity of data.”
Dynamic de-identification can support “proportional” use of data in a manner that is responsive to the variety and complexity of different potential uses of data. Specifically, dynamic de-identification can reveal different levels and types of information to the same or different parties at different times, for different purposes and at different places, in each case only as necessary for the proposed use of the data. It is thus possible to support:
- Data privacy and proportionality as demanded by the European Court of Justice;
- Data value and utility as necessary for supporting vibrant commerce, research and innovation to fuel data-driven economies; and
- Data accountability as required for combating threats to national and global security.
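As a rough illustration of revealing different information to different parties, for different purposes and at different times, consider the minimal Python sketch below. It is not a description of any particular product. The record, the policy table, the party and purpose names and the `disclose` function are all hypothetical and exist only to make the idea concrete.

```python
from datetime import datetime, timezone

# Hypothetical record; all field names and values are made up for illustration.
RECORD = {
    "name": "Alice Example",
    "birth_year": 1980,
    "zip": "10001",
    "diagnosis": "asthma",
}

# Hypothetical policy table: (party, purpose) -> visible fields and an expiry time.
POLICIES = {
    ("physician", "treatment"): {
        "fields": ["name", "birth_year", "zip", "diagnosis"],
        "expires": datetime(2016, 1, 1, tzinfo=timezone.utc),
    },
    ("researcher", "aggregate_study"): {
        "fields": ["birth_year", "diagnosis"],
        "expires": datetime(2015, 12, 31, tzinfo=timezone.utc),
    },
}


def disclose(record, party, purpose, now):
    """Return only the fields the policy allows for this party, purpose and time."""
    policy = POLICIES.get((party, purpose))
    if policy is None or now >= policy["expires"]:
        return {}  # no matching policy, or the grant has lapsed
    return {field: record[field] for field in policy["fields"]}


now = datetime(2015, 12, 1, tzinfo=timezone.utc)
physician_view = disclose(RECORD, "physician", "treatment", now)
researcher_view = disclose(RECORD, "researcher", "aggregate_study", now)
marketer_view = disclose(RECORD, "marketer", "advertising", now)
```

In this sketch the physician sees the full record, the researcher sees only birth year and diagnosis, and a party with no matching policy sees nothing. Once a grant expires, the same request returns nothing as well.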
Traditional approaches to data analysis generally rely on static identifiers. When the same static identifier is used across multiple data sets, the data sets can be overlaid, so data that is not identifiable by itself can, when combined with other overlapping data, be used to infer, single out or link to a data subject and thereby re-identify that person. Dynamic approaches to de-identification, on the other hand, use dynamically changing identifiers to probabilistically prevent the inference of identifying information about a data subject across multiple data sets or data combinations, all in a manner that is capable of mathematical analysis, audit and enforcement.
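The difference between static and dynamically changing identifiers can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not a production design: `static_pseudonym`, `DynamicPseudonymizer` and the toy data sets are hypothetical names invented for this example, and the static case uses a plain hash only to show that the same input always yields the same token.

```python
import hashlib
import secrets


def static_pseudonym(user_id):
    """Static identifier: the same person always gets the same token."""
    return hashlib.sha256(user_id.encode("utf-8")).hexdigest()[:12]


class DynamicPseudonymizer:
    """Hypothetical helper that issues a fresh token on every call.

    The token-to-person mapping lives only in a protected lookup table.
    """

    def __init__(self):
        self._lookup = {}  # token -> user_id, kept by a trusted party

    def issue(self, user_id):
        token = secrets.token_hex(6)
        while token in self._lookup:  # never reuse a token
            token = secrets.token_hex(6)
        self._lookup[token] = user_id
        return token

    def resolve(self, token):
        return self._lookup[token]


def linkable_tokens(data_set_a, data_set_b):
    """Tokens that appear in both data sets and so allow an overlay."""
    return {r["token"] for r in data_set_a} & {r["token"] for r in data_set_b}


users = ["alice", "bob", "carol"]

purchases_static = [{"token": static_pseudonym(u), "item": "book"} for u in users]
locations_static = [{"token": static_pseudonym(u), "zip": "10001"} for u in users]

dyn = DynamicPseudonymizer()
purchases_dynamic = [{"token": dyn.issue(u), "item": "book"} for u in users]
locations_dynamic = [{"token": dyn.issue(u), "zip": "10001"} for u in users]

static_overlap = linkable_tokens(purchases_static, locations_static)
dynamic_overlap = linkable_tokens(purchases_dynamic, locations_dynamic)
```

With static tokens, every person in the purchases data set can be joined to a row in the locations data set. With tokens that change per use, the two data sets share no join key, and only the holder of the lookup table can relate a token back to a person.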
The final version of the General Data Protection Regulation (GDPR) will likely include a definition of “personal data” that covers “any information relating to an identified or identifiable natural person,” including both direct and indirect identifiers. It will also likely include a concept of “pseudonymous” data, i.e., personal data that has been subjected to technological measures so that it no longer directly or indirectly identifies an individual. Effective de-identification technologies can enable companies to show that they have used privacy-enhancing technologies in accordance with reasonable data minimization, proportionality and privacy-by-design principles. Making that showing is necessary to benefit from the relaxed data breach notification rules, less strict data subject access request requirements and greater flexibility to conduct data profiling that apply to pseudonymous data under the final GDPR.
Similarly, if a company’s business purposes can be achieved by using data that has been “anonymized” as required under the final GDPR, then the company can use the data without requiring a separate legal basis under applicable data protection law since it does not reveal personal data. Support for this point of view can be found in the Anonymisation Code of Practice published by the UK Information Commissioner’s Office. Dynamic approaches to de-identification can provide the foundation for companies to comply with requirements for “anonymized” data under the final GDPR.
I was scheduled to speak on this very topic on Thursday, December 3, at the recently cancelled IAPP Europe Data Protection Congress 2015 in Brussels, in part to present a high-level introduction showing that Anonos Data Privacy-as-a-Service is a concrete example of dynamic de-identification. (Full disclosure: I am the co-founder and CEO of Anonos.)
This technical solution seeks to balance the competing objectives of data privacy, proportionality, value, utility and accountability by means of a two-step process that limits the persistence and the period of association of direct and indirect identifiers; provides different authorized parties with different access and use privileges for gradations of identifying information; and supports requirements for multiple encryption key holder involvement to access or use disparate gradations of identifying versus obscured information.
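To make the idea of multiple key holder involvement more tangible, here is a minimal Python sketch of one generic way to require two parties to cooperate before a token can be re-identified. It is an assumption-laden illustration and does not describe how Anonos or any other product works. It uses simple two-of-two XOR key splitting; `split_key`, `combine_shares` and `ReidentificationVault` are hypothetical names, and a real system would encrypt the mapping under the key rather than merely check the key as this toy does.

```python
import hashlib
import hmac
import secrets
from typing import Tuple


def split_key(key: bytes) -> Tuple[bytes, bytes]:
    """Split a key into two shares; neither share alone reveals the key."""
    share_a = secrets.token_bytes(len(key))
    share_b = bytes(k ^ a for k, a in zip(key, share_a))
    return share_a, share_b


def combine_shares(share_a: bytes, share_b: bytes) -> bytes:
    """XOR the two shares back together to recover the key."""
    return bytes(a ^ b for a, b in zip(share_a, share_b))


class ReidentificationVault:
    """Hypothetical vault: releases an identity only if both shares are valid.

    Simplification: the mapping is held in memory and gated by a key check;
    a real design would store it encrypted under the key.
    """

    def __init__(self, mapping, key: bytes):
        self._mapping = dict(mapping)  # token -> identity
        self._key_digest = hashlib.sha256(key).digest()

    def reidentify(self, token, share_a: bytes, share_b: bytes):
        candidate = combine_shares(share_a, share_b)
        digest = hashlib.sha256(candidate).digest()
        if not hmac.compare_digest(digest, self._key_digest):
            raise PermissionError("both key holders must supply valid shares")
        return self._mapping[token]


key = secrets.token_bytes(32)
holder_a_share, holder_b_share = split_key(key)
vault = ReidentificationVault({"tok-1": "alice"}, key)

identity = vault.reidentify("tok-1", holder_a_share, holder_b_share)
```

Either key holder acting alone, or with a tampered share, gets a `PermissionError` instead of an identity, so access to the most identifying gradation of the data depends on both parties agreeing.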
Dynamic de-identification is not an abstract concept but rather a concrete principle.
It is high time that technology companies, engineers and developers around the globe respond to the challenge laid down by U.S. FTC Commissioner Julie Brill in her 2013 speech on the role technologists can play in protecting privacy. "This is your 'call to arms,' or perhaps, given who you are, your 'call to keyboard,'" she proclaimed, "to help create technological solutions to some of the most vexing privacy problems presented by big data."
Dynamic data de-identification is one way to take up her call.