22 January 2025

ANALYSIS

'What's in a name?': EDPB publishes draft guidelines on pseudonymization

The guidelines set out the legal and technical requirements the EDPB considers necessary for pseudonymization to be effective. The guidelines also contain an Annex of 10 worked examples, three of which relate to use of medical data for purposes other than direct care, underlining the importance of the guidelines to that sector. A further example relates to use of customer data to identify correlations between items purchased without engaging in profiling, meaning the guidelines will also be of interest to those involved in marketing and advertising technology, among others.

A case is pending before the Court of Justice of the European Union on whether the disclosure of masked data to a processor, where only the controller can attribute data to identifiable individuals, should amount to pseudonymous or anonymous data. Spoiler alert: the guidelines jump the gun on this topic. However, it is difficult to interpret the guidelines without the case — or without the much-trailed but not yet available EDPB guidelines on anonymization.

What is effective pseudonymization?

Pseudonymization is defined in Article 4(5) of the EU General Data Protection Regulation as "the processing of personal data in such a manner that the personal data can no longer be attributed to a specific data subject without the use of additional information, provided that such additional information is kept separately and is subject to technical and organizational measures to ensure that the personal data are not attributed to an identified or identifiable natural person."

The guidelines break the definition down into three component parts: it must still be possible to attribute the data to an identified or identifiable natural person; that attribution must depend on additional information "whose use enables the attribution of pseudonymized data to identified or identifiable persons"; and the additional information must be kept separately, subject to technical and organizational measures.

Paragraph 17 of the guidelines notes "to attribute data to an identifiable person means to link the data to other information with reference to which the natural person could be identified. Such a link could be established on the basis of one or several identifiers or identifying attributes."

The additional information could be data that is kept for later use to reverse pseudonymization and would typically consist of tables that match pseudonyms with the identifiers they replace or of cryptographic keys.
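To make these two forms of additional information concrete, the following Python sketch is offered as an illustration only; it is not drawn from the guidelines. The names LookupTablePseudonymizer, keyed_pseudonym and SECRET_KEY are hypothetical, and the use of HMAC-SHA256 is an assumption rather than an EDPB recommendation. In each case, the mapping table or the key is the additional information that would need to be kept separately from the pseudonymized data.

```python
import hashlib
import hmac
import secrets

# Hypothetical secret key. In practice it would be stored separately
# from the pseudonymized data and protected by access controls.
SECRET_KEY = secrets.token_bytes(32)


def keyed_pseudonym(identifier: str, key: bytes) -> str:
    """Derive a pseudonym from an identifier with HMAC-SHA256.

    Anyone holding the key can recompute the pseudonym for a known
    identifier, so the key is the 'additional information' here.
    """
    return hmac.new(key, identifier.encode("utf-8"), hashlib.sha256).hexdigest()


class LookupTablePseudonymizer:
    """Replace identifiers with random pseudonyms and keep a mapping table.

    The table is the 'additional information': whoever holds it can
    reverse the pseudonymization.
    """

    def __init__(self):
        self._forward = {}  # identifier -> pseudonym
        self._reverse = {}  # pseudonym -> identifier

    def pseudonymize(self, identifier: str) -> str:
        # Reuse the existing pseudonym so the same person is always
        # represented by the same value.
        if identifier not in self._forward:
            pseudonym = secrets.token_hex(8)
            self._forward[identifier] = pseudonym
            self._reverse[pseudonym] = identifier
        return self._forward[identifier]

    def reverse(self, pseudonym: str) -> str:
        # Only possible for a party that holds the mapping table.
        return self._reverse[pseudonym]


table = LookupTablePseudonymizer()
token = table.pseudonymize("alice@example.com")
original = table.reverse(token)  # recovers "alice@example.com"
keyed = keyed_pseudonym("alice@example.com", SECRET_KEY)
```

A recipient who receives only the pseudonyms, without the table or the key, cannot run the reversal step. Whether that is enough for effective pseudonymization is a separate question, which depends on the other information that recipient could reasonably obtain.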

In paragraph 21, the guidelines note it is also important to consider if there is additional information — such as social media posts — that could allow for identification and that is beyond the control of the party responsible for or undertaking the pseudonymization. This risk must be considered and effectively mitigated for the deidentification technique to amount to effective pseudonymization.

The guidelines note that, so long as data can be attributed to an identifiable natural person, it will be personal data per Recital 26 of the GDPR. This is true even if the pseudonymized data and the additional data needed for attribution are held by different people. Here, as stated in paragraph 22 of the guidelines, one must have regard to the means reasonably likely to be used by the controller or by another person to combine the pseudonymized data and the additional information.

Hang on a minute, what about the SRB case?

This is the 26 April 2023 judgment of the General Court in Case T-557/20, which concerned a restructuring of Banco Popular under supervision of the Single Resolution Board. As part of this, the SRB appointed Deloitte to assist in a consultation process, which the SRB ran with affected creditors and shareholders.

The SRB allocated a unique code to each comment received and shared a small subset of the data from the consultation process with Deloitte, including the corresponding unique codes. Deloitte did not have access to the full database of responses to the consultation and only the SRB could link comments received to the registration data. In other words, the data seemed to be pseudonymous in the EDPB's terms.

Some creditors and shareholders complained to the European Data Protection Supervisor that the SRB did not tell them their personal data could be shared with third parties, such as Deloitte. The EDPS agreed. The SRB then appealed to the General Court.

The General Court noted, to determine if data had been effectively anonymized, one must look at the means of identification "reasonably likely to be used by the controller and any other person," as suggested in the new guidelines and in line with the CJEU's decision in Patrick Breyer v. Bundesrepublik Deutschland, Case C-582/14.

However, the court additionally noted that, to determine if information disclosed to Deloitte was personal, one must put oneself in Deloitte's position (paragraph 97 of the judgment) and consider if Deloitte had the legal means available that would in practice enable it to access the additional information necessary to reidentify the authors of the comments (paragraph 105 of the judgment). The EDPS has since appealed this decision to the CJEU.

There is a significant difference between asking whether it is reasonably likely that anyone at all can identify the data subject and asking that question only from the perspective of the party that has received the pseudonymized data. The new guidelines refer in multiple places to scenarios in which a controller discloses masked data to a third-party recipient as use cases of pseudonymization. The EDPB has, presumably deliberately, jumped the gun in issuing its guidelines now, before the CJEU's decision.

These guidelines also do not address whether it would be possible for the parties to put additional measures — such as security over source data and pseudonymization secrets, contractual restrictions, audit or disciplinary measures — in place to allow the receiving party to demonstrate it is not reasonably likely that it, or any other person, would be able to attribute the data it processes to an identifiable natural person and that the data is anonymous in the recipient's hands.

It's possible this will be addressed in the EDPB's guidelines on anonymization. However, this part of the guidelines, and all use cases which link to it, are difficult to interpret without those anonymization guidelines and the CJEU decision in the SRB case. This is regrettable, as the scenario here is commonplace.

Some final comments on the meaning of pseudonymization

In paragraphs five and 18, the guidelines note there must be a process of pseudonymization whereby a party modifies or transforms the data. In an adtech context, organizations sometimes assert the data they process would be classed as pseudonymous under the GDPR because it involves a unique ID allocated by the adtech company, rather than using another existing identifier. However, if this unique ID is used to identify the individual and there is no process of transformation or masking, then this would not be pseudonymous data.

Lastly, the EDPB notes the GDPR concept of pseudonymization is different to the common understanding of the term. In common parlance, a pseudonym is simply a replacement for an identifier that does not itself reveal the identity of the individual; for example, Mary Ann Evans published under the pseudonym George Eliot.

However, this plain-English understanding would not qualify as pseudonymous under the GDPR unless it is no longer possible to identify the author without additional information, which must also be held separately to prevent attribution. In addition, although pseudonymization usually involves the replacement of direct identifiers with pseudonyms, the EDPB notes in paragraph eight of the guidelines that it could instead be achieved in different ways involving removal of direct identifiers without a replacement pseudonym as long as there is retention of additional, secure information that allows attribution.

The pseudonymisation domain

The EDPB recognizes controllers often use pseudonymization to preclude attribution of information to data subjects by a specific group. The EDPB uses the term "pseudonymisation domain" to capture this concept in paragraphs 35 to 43. To take an example, in a clinical trial, pseudonymization is used to preclude attribution by the sponsor. The pseudonymization domain would capture the sponsor but not the investigator or the treating institution.

The EDPB notes different measures may need to be considered if a pseudonymization domain is limited to internal recipients, limited to a predefined set of external recipients — where more extensive measures and risk assessment may be necessary, in particular to demonstrate there is no further disclosure of the data, according to paragraph 51 — or is not limited at all. This last point would be relevant if pseudonymization is used as a security measure to reduce confidentiality risks to data subjects from any and all unauthorized third parties.

Here, in paragraph 42, the EDPB recommends the controller consider not just good faith actions that might reidentify but also those executed with criminal intent when assessing the measures necessary to protect the pseudonymization process.
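One way to picture separate pseudonymization domains is to give each group of recipients pseudonyms derived from its own secret. The Python sketch below is an illustration under stated assumptions, not a method prescribed by the guidelines: the DOMAIN_SECRETS values, the domain names and the pseudonym_for_domain function are all hypothetical, and HMAC-SHA256 is simply one plausible keyed transformation.

```python
import hashlib
import hmac

# Hypothetical per-domain secrets (placeholders only). In practice each
# would be generated randomly and held outside the domain it protects.
DOMAIN_SECRETS = {
    "trial_sponsor": b"hypothetical-secret-for-sponsor",
    "internal_research": b"hypothetical-secret-for-research",
}


def pseudonym_for_domain(patient_id: str, domain: str) -> str:
    """Derive a pseudonym that is stable within one domain only."""
    secret = DOMAIN_SECRETS[domain]
    return hmac.new(secret, patient_id.encode("utf-8"), hashlib.sha256).hexdigest()


# The same patient appears under unrelated pseudonyms in each domain,
# so records cannot be joined across domains on the pseudonym alone.
sponsor_view = pseudonym_for_domain("patient-0042", "trial_sponsor")
research_view = pseudonym_for_domain("patient-0042", "internal_research")
```

Within a domain the pseudonym is stable, so records about the same person can still be analysed together. Across domains the values differ, which limits what a recipient gains if data from another domain reaches it.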

How pseudonymization can mitigate legal risks

The guidelines also explain how pseudonymization can assist controllers in meeting several GDPR requirements, including:

Privacy by design and by default, as outlined in paragraph 53. Pseudonymization can be helpful in instances when this allows analysis to be carried out but the resulting learnings do not need to be related to the particular individuals concerned. Scientific research is referenced, in particular, here.
Lawfulness, fairness and accuracy, as outlined in paragraphs 54-58. Pseudonymization could help satisfy a legitimate interests balancing test, support a finding that further processing is compatible with the original purpose, and improve accuracy by replacing direct identifiers that are similar and prone to confusion (for example, the similar-sounding names of different individuals) with very different pseudonyms.
Security, as outlined in paragraphs 59-62.
Third-country transfer requirements, as outlined in paragraph 65. Privacy professionals will recall that if a transfer risk assessment leads to a conclusion that the laws and practices in the third country or international organization do not meet EU standards, then the EDPB's Recommendations 01/2020 suggest effective pseudonymization may be an appropriate safeguard, provided the importer has no access to the pseudonymization secrets. The EDPB reiterates its previous guidance that the exporter must consider what information the public authorities in that country possess or could obtain with reasonable effort, but adds a new comment that the exporter should also consider measures that may be illegal in the third country.

The examples in the annex further illustrate these points. The annex includes a table noting which examples are relevant to the various GDPR principles discussed.

Data subject rights

As pseudonymous data is personal data, it follows that data subject rights apply. GDPR Article 11 disapplies some of these rights when the controller is not able to effectively identify the data subject, and the EDPB notes this provision is likely to be relevant here.

In paragraph 79, the EDPB notes that, per Article 11(2), the controller must inform the data subject when this applies. This may include providing the identity and contact details of the source of the pseudonymized data it holds, so the data subject can contact that source to request access to the relevant pseudonyms.

It is not clear from the draft guidelines if the EDPB considers that this information can be given reactively or should be given proactively, for example in privacy notices. A proactive requirement would seem problematic, as it could undermine careful processes put in place to ensure confidentiality of data processing, for example in the case of clinical trials.

Unauthorized reversal of pseudonymization

In paragraphs 80-82 the EDPB notes a breach of security leading to unauthorized reversal of pseudonymization is a personal data breach that, depending on its severity, may need to be reported to supervisory authorities or data subjects.

Technical measures and safeguards for pseudonymization

In contrast to the legal analysis section, the commentary on technical measures and safeguards is accessible, clear and relatively easy to read, including a one-page summary of requirements in paragraphs 130–135.

Privacy pros may find the following points useful:

In paragraph 83, the guidelines clearly state "pseudonymised data must not contain direct identifiers (e.g. national id numbers) whenever those direct identifiers could be used in the pseudonymisation domain to easily attribute the data to the data subjects." This may be particularly worth noting for those working in adtech, where it is common to assert that the data processed is pseudonymous. As the adtech company's unique ID is a direct identifier, used to attribute data to an identified natural person, it will not be considered an effective pseudonym.
Paragraphs 101 and 102 include a very accessible explanation of quasi-identifiers, which may be useful if you need to explain this to anyone.
In paragraphs 107-110, there is guidance on choosing effective cryptographic measures and on the technical and organizational measures needed to secure access to the systems performing the pseudonymizing transformation.
In paragraphs 111-114 there is guidance on securing the pseudonymization domain with a note that, when this is a defined set of recipients, contractual measures are a necessary but not solely sufficient part of this.
Finally, in paragraphs 115-129, there is guidance on factors to be considered when designing pseudonymization that allows linkage.
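Quasi-identifiers are attributes that do not identify a person on their own but may do so in combination, such as postcode, year of birth and sex. As a rough illustration of why they matter even after direct identifiers are replaced, the Python sketch below counts how many records share each combination of quasi-identifier values. The function name, field names and sample records are hypothetical, and this simple group-size check is an assumption for illustration, not a test set out in the guidelines.

```python
from collections import Counter


def smallest_group_size(records, quasi_identifiers):
    """Return the size of the smallest group of records that share the
    same combination of quasi-identifier values (0 if there are no records).

    A result of 1 means at least one record is unique on those attributes
    and could be singled out by someone who knows them.
    """
    groups = Counter(
        tuple(record[field] for field in quasi_identifiers) for record in records
    )
    return min(groups.values(), default=0)


# Hypothetical pseudonymized records: direct identifiers are already replaced.
records = [
    {"pseudonym": "a1", "postcode": "AB1", "birth_year": 1980, "sex": "F"},
    {"pseudonym": "b2", "postcode": "AB1", "birth_year": 1980, "sex": "F"},
    {"pseudonym": "c3", "postcode": "CD2", "birth_year": 1955, "sex": "M"},
]

# Record "c3" is the only one with its combination of values.
risk = smallest_group_size(records, ["postcode", "birth_year", "sex"])
```

In this sample the result is 1, because one record is unique on the three attributes. Generalizing or removing such attributes before disclosure is one way to reduce that risk, subject to the purpose of the processing.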

What about the UK?

Readers in the U.K. will recall the Information Commissioner's Office launched a consultation on its pseudonymization guidance in February 2022. This has been on hold while changes to U.K. data protection law are being considered.

The ICO guidance is shorter and more accessible than the EDPB's. It also differs on the role of disclosures of pseudonymized data to third parties. On page 5, the ICO is clear that pseudonymous data that is disclosed to a third party may be anonymous in the hands of that third party.

This is in line with longstanding U.K. caselaw on anonymization, including in Common Services Agency v. Scottish Information Commissioner and subsequent cases. In that case, Lord Hope held "if it was impossible for the recipient of the [pseudonymised] data to identify those individuals, the information would not constitute 'personal data' in his hands."

Many might now be hoping that the CJEU follows a similar train of thought in the SRB case.

Ruth Boardman is a partner and the co-head of the International Data Protection Practice and Emma Drake, CIPP/E, is a partner at Bird & Bird.

