For many, the General Data Protection Regulation ushered a new word into their vocabulary: pseudonymization.
What’s the difference between pseudonymous data and anonymous data, exactly?
That’s a bit of a rabbit hole, but the basic explanation is that “anonymous” means the data subject is no longer identifiable by any means, whereas “pseudonymous” means the data subject can still be reidentified through the use of an algorithm or by combining the data with another data set.
Can data truly be anonymized, though? And when does data truly become pseudonymous, so as to allow for certain permitted uses under the GDPR? And what happens if the definitions of anonymous and pseudonymous are different between the EU and countries with adequacy decisions?
Such was the basic conversation that ran through the first-ever Brussels Privacy Symposium, run by the Brussels Privacy Hub and the Future of Privacy Forum at Vrije Universiteit Brussel the day prior to the IAPP’s Data Protection Congress. While academics clashed over reidentification possibilities and the future of technology, it was a fireside chat with the European Commission’s Paul Nemitz late in the day that brought out practical questions.
How would policymakers view these technologies? How might those answers affect trans-border data flow and interpretations of legitimate uses of data?
IAPP VP Omer Tene led the questioning, however, with something more big-picture: What did Nemitz make of anonymization in general? Is it truly possible, in his eyes, to keep data useful while removing identifiability?
Nemitz, who oversees fundamental rights, including data protection, for the Commission’s Directorate General for Justice, deflected the question a bit: “When I read the papers [prepared for the symposium], I find myself asking, ‘Why is this now a subject so much of interest?’ It seems the first misunderstanding is that it’s not possible to use data and draw learning from it for the public interest or profit if it’s personal data and not anonymized. That would be wrong. You can do a lot with personal data, provided your processing is legal, whether you have consent or another legal basis.”
“If the whole purpose is only to get outside the legal constraints of the regulation, I’m not sure that’s very fruitful,” he continued, “and often it will be unnecessary.”
Ultimately, the question of what is anonymous, pseudonymous, and PII will be decided by the courts, he theorized, but that was the point of the regulation in the first place: “This is what people have asked us to do,” Nemitz said, “come up with a technology-neutral regulation that implies that the concept of personal and anonymized data will change over time in the real world, as technology evolves.
“What may be considered by a judge today to be anonymized might be considered personal tomorrow,” he said. “And that’s what everyone asked for.”
He pointed, by way of example, to the recently decided Breyer case, before the Court of Justice of the European Union, where judges decided that IP addresses are personal information, even if one needs additional data from a third party to identify a person based on that IP address. This surprised many observers, but “it’s totally okay that the definition of personal data moves in parallel with technology,” Nemitz said, adding: “On the high seas and in the courts, you are at the hands of God. And that’s good. It’s not technologists who decide these questions.”
However, Nemitz said, the vast majority of companies needn’t worry overly much about these questions. Only those “hungry for money,” looking to profit off personal data and walking right up to the line, will have to grapple with these questions.
Tene wondered if the discussion centers too much around the dichotomy of PII vs. anonymous, and whether we should instead see identifiability as a spectrum. That way, we don’t encourage people to walk right up to the edge.
Nemitz wasn’t buying that.
“Generally, courts are not keen to differentiate beyond what the law foresees,” he said. “It’s quite black and white. Either it identifies an individual, with reasonable means, or it doesn’t. … Our law is our law. And it will be applied as any other law, and I think that’s right. There’s nothing bad about it, where us Europeans have to go around with our heads down trying to figure out what kind of data it is.
“If there is a problem with that, let’s talk about it openly,” Nemitz said. “The debate comes from America; I don’t think we have that here so much.”
But the GDPR specifically creates allowances for pseudonymous data, Tene countered. Doesn’t that mean we need to understand what is and what isn’t?
On the contrary, said Nemitz, “the regulation says pseudonymous data is personal data, and I don’t find the discount.”
Then what would be the point, Tene argued, in pseudonymizing the data? Data controllers might as well keep it in raw form, given that they don’t earn a discount if they apply the technology.
Nemitz did not accept that no incentive exists. The GDPR requires privacy by default and by design, and working to make data less personal is clearly part of that effort. Should a data protection regulator come calling, pseudonymization would clearly show a desire to protect personal data. Further, Nemitz said, “the regulation doesn’t ask for the impossible. It doesn’t depart from the default position that any data is personal unless you prove it’s not. But if [a DPA wants] to bring a case, you have to show positively that it’s personal data. … And what you have to show, then, is that it’s not possible to return to the individual, keeping in mind the reasonable means that’s likely to be used. That’s somewhere where you can have a discussion.”
And that’s the part that will change over time. As technology evolves, there will be better ways to anonymize data and better ways to reidentify data.
“Maybe that’s a disaster,” he allowed, “but one could also say, well, this is perfectly right. In a world where more and more data is collected, with challenges to individual freedom in terms of profiling and nudging, the fantasies are less than we actually can do. … We don’t live in a world where all technology must be used regardless of cost. Certainly not in Europe. We have technological regulation all over the place. Limits to this technology, this chemical, why should it be different in the world of data? In a democratic society [it] is perfectly okay that limits are set, even if technology would allow more.”
“What is privacy by design today and what it will be tomorrow we don’t know yet,” Nemitz said, but these definitions will be vitally important in more than just avoiding regulator attention.
“This determination on personal data will be key to future adequacy decisions,” he said, by way of example. “I don’t know if the FTC’s opinion is the same as ours.”
And it’s not just about the United States and Privacy Shield. “It’s about essential equivalence,” Nemitz said. “A key pillar of the law is how personal data is defined, and if it’s defined as a narrower group of data, that’s not equivalent protection.”