Editor's note: The IAPP is policy neutral. We publish contributed opinion and analysis pieces to enable our members to hear a broad spectrum of views in our domains.
The European Data Protection Board's long-awaited Opinion 28/2024 addressed whether, and under which circumstances, artificial intelligence models can be regarded as anonymous and whether legitimate interests can be a valid legal basis during the development and deployment of AI models.
The EDPB advised data protection authorities to consider the anonymity status of AI models on a case-by-case basis, as it "considers that AI models trained with personal data cannot, in all cases, be considered anonymous," and confirmed legitimate interest can be used as a legal basis by AI developers for model training, as long as the three-step test is passed.
Beyond that, though, the 17 Dec. 2024 opinion also indirectly offers data protection and AI governance professionals insights into operationalizing its recommendations in practice.
Establish a procedure to assess anonymity of AI models, risk of personal data identification
The EDPB's opinion encourages an ad hoc assessment of anonymity and a "thorough evaluation of the risks of identification," providing a list of criteria to consider. This assessment should document, among other things, the likelihood of direct extraction of personal data used for training and the likelihood of obtaining personal data from queries.
A new procedure and a risk assessment template must therefore be established, assessing the risk of identification not only from the data controller's side but also from that of third parties that could potentially access or reuse the AI model. A reference to this new procedure should also be included in the internal privacy by design policy and/or the privacy notice.
Whenever the assessment concludes an AI model is anonymous, organizations should have documentation available at any time that proves the existence of effective anonymization measures in three main areas: AI model design, AI model analysis, and AI model testing and resistance to attacks across the AI model life cycle.
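For teams that want a concrete starting point, the documentation described above could be captured in a structured record. The sketch below is purely illustrative; the field names, likelihood scale and decision rule are assumptions, not anything prescribed by the EDPB opinion.

```python
from dataclasses import dataclass, field
from datetime import date
from enum import Enum


class Likelihood(Enum):
    """Coarse likelihood scale for identification risks (illustrative only)."""
    NEGLIGIBLE = 1
    LOW = 2
    MEDIUM = 3
    HIGH = 4


@dataclass
class AnonymityAssessment:
    """Illustrative record of an AI model anonymity assessment.

    Mirrors the areas the opinion asks controllers to document: the
    likelihood of direct extraction of training data, the likelihood of
    obtaining personal data through queries, and evidence on model design,
    model analysis, and testing/resistance to attacks.
    """
    model_name: str
    assessment_date: date
    direct_extraction_risk: Likelihood
    regurgitation_via_queries_risk: Likelihood
    design_measures: list[str] = field(default_factory=list)         # e.g. filtering of training sources
    analysis_evidence: list[str] = field(default_factory=list)       # e.g. memorization analysis reports
    attack_testing_evidence: list[str] = field(default_factory=list) # e.g. membership inference test results
    third_party_access_considered: bool = False

    def is_plausibly_anonymous(self) -> bool:
        """Illustrative rule: both risks must be negligible or low and
        third-party access or reuse must have been considered."""
        acceptable = {Likelihood.NEGLIGIBLE, Likelihood.LOW}
        return (
            self.direct_extraction_risk in acceptable
            and self.regurgitation_via_queries_risk in acceptable
            and self.third_party_access_considered
        )


# Example usage with hypothetical values
assessment = AnonymityAssessment(
    model_name="support-chat-model-v2",
    assessment_date=date(2025, 1, 15),
    direct_extraction_risk=Likelihood.LOW,
    regurgitation_via_queries_risk=Likelihood.MEDIUM,
    design_measures=["training data filtered for direct identifiers"],
    attack_testing_evidence=["membership inference test report, Q4 2024"],
    third_party_access_considered=True,
)
print(assessment.is_plausibly_anonymous())  # False: query regurgitation risk is medium
```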
Fine-tune the current LIA template
In case of legitimate interest assessments for AI models, the EDPB said it is up to controllers to identify "the appropriate legal basis for their processing activities," noting a three-step test should be conducted when assessing use of legitimate interest as a legal basis.
This includes identifying the legitimate interest, analyzing the necessity of the processing and conducting a balancing test to confirm the legitimate interest is not "overridden by the data subjects' interests or fundamental rights and freedoms of the data subjects."
The third step, the balancing test, requires weighing different interests and data subjects' rights during both AI model development and deployment, and a broader palette of fundamental rights enshrined in the EU Charter is expected to be considered.
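One way to fine-tune an existing LIA template is to give each step of the test an explicit field. The structure below is a hypothetical sketch; the field names, example Charter articles and pass/fail rule are assumptions for illustration only.

```python
from dataclasses import dataclass, field


@dataclass
class LegitimateInterestAssessment:
    """Illustrative structure for a three-step legitimate interest assessment.

    Step 1: identify the legitimate interest pursued.
    Step 2: analyze the necessity of the processing for that interest.
    Step 3: balance the interest against data subjects' interests, rights
            and freedoms, including the broader set of fundamental rights
            under the EU Charter.
    """
    processing_activity: str                    # e.g. "AI model training" or "AI model deployment"
    legitimate_interest: str                    # step 1
    necessity_justification: str                # step 2: why less intrusive means do not suffice
    charter_rights_considered: list[str] = field(default_factory=list)  # step 3
    mitigating_measures: list[str] = field(default_factory=list)
    interest_overridden: bool = False           # outcome of the balancing test

    def passes(self) -> bool:
        """Legitimate interest is only supportable when all three steps are
        documented and the balancing test finds the interest not overridden."""
        return bool(
            self.legitimate_interest
            and self.necessity_justification
            and self.charter_rights_considered
            and not self.interest_overridden
        )


# Example usage with hypothetical inputs
lia = LegitimateInterestAssessment(
    processing_activity="AI model training",
    legitimate_interest="Developing a fraud-detection model for the controller's own services",
    necessity_justification="Comparable accuracy could not be reached with synthetic or aggregated data",
    charter_rights_considered=["private life (Art. 7)", "data protection (Art. 8)", "freedom of expression (Art. 11)"],
    mitigating_measures=["pseudonymization before training", "opt-out honored at collection"],
)
print(lia.passes())  # True under these hypothetical inputs
```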
Amend data subjects' rights handling procedure
While rights enshrined under the EU General Data Protection Regulation continue to apply, according to the EDPB, the right to erasure should be expanded and granted even in situations that do not meet the Article 17(1) criteria.
The opinion also mentions a premature right to object, exercised before a data processing activity even takes place, as well as providing a reasonable timeframe between the announcement that data will be processed for AI model development and the processing itself.
These two deviations should be added to the data subjects' rights handling procedure, along with a new section on how to handle claims of personal data regurgitation or memorization.
Update privacy notices — and more
The EDPB urges transparency beyond the privacy notice requirements in Articles 13 and 14 of the GDPR, such as "providing additional details about the collection criteria and all datasets used, taking into account special protection for children and vulnerable persons."
"Some measures, in addition to compliance with the GDPR obligations, may help overcoming the information asymmetry and allow data subjects to get a better understanding of the processing involved in the development phase," the EDPB said.
In practice, this means organizations should start exploring more effective ways than privacy notices for providing necessary information, including transparency labels, model cards, media campaigns and use of graphics.
Where new rights are granted, relevant information should also be provided to data subjects. Organizations also need to decide whether to make annual transparency reports publicly available to maximize transparency.
Establish and implement a web scraping policy
In the context of data scraping, a web scraping policy should be established to ensure the mitigating measures listed in the opinion, such as ensuring certain data categories are not collected and imposing relevant limits on collection, are in place, and that companies follow a standardized approach when deciding whether publicly available information is eligible for, or excluded from, processing.
"The use of web scraping in the development phase may lead — in the absence of sufficient safeguards — to significant impacts on individuals, due to the large volume of data collected, the large number of data subjects, and the indiscriminate collection of personal data," the EDPB said.
Organizations also need to decide whether to create an opt-out list allowing data subjects to object to the collection of their data from websites, as a premature right to object, and, if so, to add a relevant section to the web scraping policy and ensure the opt-out list works in practice.
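As a rough sketch of how such a policy could be enforced inside a collection pipeline, the function below checks a source against a hypothetical opt-out list, skips excluded data categories and applies a per-domain volume limit. The domain names, category labels and thresholds are invented for illustration and do not come from the opinion.

```python
from urllib.parse import urlparse

# Hypothetical policy values; in practice these would come from the
# organization's web scraping policy and its published opt-out mechanism.
OPT_OUT_DOMAINS = {"example-opted-out.org", "blog.opted-out-author.net"}
EXCLUDED_CATEGORIES = {"health", "political_opinions", "children"}
MAX_RECORDS_PER_DOMAIN = 10_000


def may_collect(url: str, detected_categories: set[str], records_collected_for_domain: int) -> bool:
    """Return True only if collection from this URL is allowed under the policy.

    Checks, in order: the opt-out list (honoring the premature right to
    object), the excluded data categories, and the per-domain collection limit.
    """
    domain = urlparse(url).netloc.lower()

    if domain in OPT_OUT_DOMAINS:
        return False  # data subject or site has opted out of collection
    if detected_categories & EXCLUDED_CATEGORIES:
        return False  # certain data categories must not be collected at all
    if records_collected_for_domain >= MAX_RECORDS_PER_DOMAIN:
        return False  # impose relevant limits on the volume of collection
    return True


# Example usage with hypothetical inputs
print(may_collect("https://example-opted-out.org/post/1", set(), 0))     # False: domain opted out
print(may_collect("https://news.example.com/article", {"health"}, 10))   # False: excluded category
print(may_collect("https://news.example.com/article", set(), 10))        # True
```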
Amend data protection due diligence processes
The EDPB clearly states an EU declaration of conformity, which is mandatory for high-risk AI systems under the EU AI Act, does not release organizations from performing their due diligence obligations.
Therefore, prior to deploying an AI model developed by another company, an appropriate data protection assessment must be conducted as part of the organization's due diligence process. While organizations need to monitor guidelines from national supervisory authorities on this topic, questions regarding the source of the training data and whether the AI model is the result of a GDPR infringement could already be added to the due diligence questionnaire.
Organizations could also consider a granular approach, creating different assessments for different AI models depending on the level of risk their deployment raises for data subjects.
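A granular, risk-based approach can be as simple as mapping a deployment's risk level to a different question set. The tiers and question wording below are hypothetical examples, built around the points the opinion highlights, such as the source of the training data and possible GDPR infringements.

```python
# Hypothetical, risk-tiered due diligence question sets for AI models
# developed by another company. Question wording is illustrative only.
BASE_QUESTIONS = [
    "What were the sources of the training data?",
    "Was the AI model, or its training data, the result of a GDPR infringement?",
    "Has the provider assessed the anonymity of the model?",
]

HIGHER_RISK_QUESTIONS = BASE_QUESTIONS + [
    "Which legal basis did the provider rely on for training?",
    "What testing was performed against regurgitation and extraction attacks?",
    "Is an EU declaration of conformity available, and what does it cover?",
]


def questionnaire_for(deployment_risk: str) -> list[str]:
    """Select the due diligence question set based on the risk the
    deployment raises for data subjects ('low' or 'high', illustrative)."""
    return HIGHER_RISK_QUESTIONS if deployment_risk == "high" else BASE_QUESTIONS


# Example usage: print the question set for a higher-risk deployment
for question in questionnaire_for("high"):
    print(question)
```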
Christina Varytimidou, AIGP, CIPP/E, CIPM, is a data protection manager at MyData-TRUST.