Artificial intelligence and machine learning are advancing at an unprecedented speed. This raises the question: How can AI/ML systems be used in a responsible and ethical way that deserves the trust of users and society?
Regulators, organizations, researchers and practitioners of various disciplines are all working toward answers. Privacy professionals, too, are increasingly involved in AI governance. They face the challenge of understanding the complex interplay between privacy regulations and broader developments around the responsible use of AI.
With government authorities stepping up enforcement, rulemaking and legislation in this complex arena, it is critical that organizations understand the privacy requirements that currently apply to AI, those on the horizon, and the resources available to build a compliant data protection program for AI and ML applications.
Broader AI governance developments
Over the last few years, numerous good governance guidelines on trustworthy AI have been published. Most of these AI governance frameworks overlap in their definition of basic principles, which include privacy and data governance, accountability and auditability, robustness and security, transparency and explainability, fairness and non-discrimination, human oversight, and the promotion of human values.
Some prominent examples of responsible AI frameworks by public organizations include UNESCO’s Recommendation on the Ethics of AI, China’s ethical guidelines for the use of AI, the Council of Europe’s report “Towards Regulation of AI Systems,” the OECD AI Principles, and the Ethics Guidelines for Trustworthy AI by the High-Level Expert Group on AI set up by the European Commission.
Beyond that, one can find countless self-regulatory initiatives by companies. Additionally, industry has joined forces with academia and nonprofits to advance the responsible use of AI, for example within the Partnership on AI or the Global Partnership on AI. Standardization bodies such as ISO/IEC, IEEE and NIST also offer guidance.
Present governance initiatives primarily take the form of declarations and are non-binding. At the same time, various existing privacy laws already regulate the responsible use of AI systems to a considerable extent.
The prominent role of privacy regulators in AI governance is also demonstrated by the release of the Model AI Governance Framework by Singapore’s Personal Data Protection Commission, the U.K. Information Commissioner’s Office’s extensive work on developing an AI auditing framework, and the Guidance on the Ethical Development and Use of AI released by the Office of the Privacy Commissioner for Personal Data of Hong Kong.
Privacy regulations and responsible AI
One of the responsible AI principles regularly mentioned refers explicitly to “privacy.” It points to the obligation to apply the general privacy principles that form the backbone of privacy and data protection globally to AI/ML systems that process personal data. This includes ensuring collection limitation, data quality, purpose specification, use limitation, accountability and individual participation.
Principles of trustworthy AI such as transparency and explainability, fairness and non-discrimination, human oversight, and the robustness and security of data processing can regularly be mapped to specific individual rights and provisions of corresponding privacy laws.
With regard to the EU GDPR, this is the case for the right to explanation (Articles 1(1), 12, 13, 14, 15(1)(h), 22(3), Recital 71), the fairness principle (Article 5(1)(a), Recital 75), human oversight (Article 22), robustness (Article 5(1)(d)) and security of processing (Articles 5(1)(f), 25, 32). Other privacy laws, such as China’s PIPL or the U.K. GDPR, include similar provisions that relate to these responsible AI principles.
In the U.S., the Federal Trade Commission holds AI developers and companies using algorithms accountable under Section 5 of the FTC Act, the Fair Credit Reporting Act and the Equal Credit Opportunity Act. In its 2016 report and its guidelines from 2020 and 2021, the FTC leaves no doubt that the use of AI must be transparent, include explanations of algorithmic decision-making to consumers, and ensure that decisions are fair and empirically sound.
A lack of awareness of the compliance requirements for AI systems that stem from privacy regulations poses risks not only to affected individuals but also to the organizations deploying them: companies can face hefty fines and even the forced deletion of data, models and algorithms.
Recent cases
At the end of last year, the Office of the Australian Information Commissioner found Clearview AI in violation of the Australian Privacy Act for the collection of images and biometric data without consent. Shortly after, and based on a joint investigation with Australia’s OAIC, the U.K. ICO announced its intent to impose a potential fine of over 17 million GBP for the same reason. Further, three Canadian privacy authorities as well as France's CNIL ordered Clearview AI to stop processing and delete the collected data.
European data protection authorities pursued several other cases of privacy violations by AI/ML systems in 2021.
In December 2021, the Dutch Data Protection Authority announced a fine of 2.75 million euros against the Dutch Tax and Customs Administration for a GDPR violation: an ML algorithm had processed applicants’ nationality in a discriminatory manner. The algorithm systematically flagged dual citizenship as high risk, making claims by those individuals more likely to be marked as fraudulent.
In another landmark case from August 2021, Italy’s DPA, the Garante, fined food delivery companies Foodinho and Deliveroo around $3 million each for infringement of the GDPR due to a lack of transparency, fairness and accurate information regarding the algorithms used to manage their riders. The regulator also found the companies’ data minimization, security, and privacy by design and default protections lacking, and their data protection impact assessments missing.
In similar cases in early 2021, Amsterdam’s District Court found ride-sharing companies Uber and Ola Cabs didn't meet the transparency requirements under the GDPR and violated the right to demand human intervention. The investigation by the Dutch DPA is ongoing.
In the U.S., recent FTC orders made clear the stakes are high for not upholding privacy requirements in the development of models or algorithms.
In the matter of Everalbum, the FTC not only focused on the obligation to disclose the collection of biometric information to users and obtain consent; it also demanded that the illegally obtained data, as well as the models and algorithms developed using it, be deleted or destroyed.
With this, the FTC followed the approach of its 2019 Cambridge Analytica order, in which it had also required the deletion or destruction not only of the data in question but also of all work products, including any algorithms or equations that originated in whole or in part from the data.
Challenges in definition and practice
Although organizations can be held liable for not implementing the responsible AI principles required by regulation, many questions remain open. While there is a lot of legal guidance around consent and appropriately informing users, the legal interpretation and practical implementation of requirements such as AI fairness and explainability are still in their infancy. It is common ground, however, that there is no one-size-fits-all approach for assessing trustworthy AI principles across use cases.
AI explainability, or transparency, aims at opening the so-called “black box” of ML models, and a whole field of AI research around explainable AI has emerged. There are many answers to what it means to explain an ML model. To explain individual predictions to regulators or users, outcome-based post-hoc local models are common. Here, a surrogate model (or metamodel) can be trained on a dataset consisting of samples and the corresponding outputs of the black box model to approximate its predictions. Any explanation should be adapted to the understanding of the recipient and include references to the design choices of the system, as well as the rationale for deploying it.
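To make the surrogate idea concrete, the following is a minimal sketch in Python, assuming scikit-learn and a synthetic placeholder dataset; the specific models are illustrative choices, not a reference implementation.

# Sketch of a post-hoc surrogate: an interpretable model is trained to mimic
# the predictions of an opaque "black box" model. Data and models are placeholders.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier, export_text
from sklearn.metrics import accuracy_score

X, y = make_classification(n_samples=2000, n_features=8, random_state=0)

# The "black box" whose behavior we want to explain.
black_box = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)

# Train a shallow decision tree on the black box's predictions, not the true labels.
surrogate = DecisionTreeClassifier(max_depth=3, random_state=0)
surrogate.fit(X, black_box.predict(X))

# Fidelity: how closely the surrogate approximates the black box.
fidelity = accuracy_score(black_box.predict(X), surrogate.predict(X))
print(f"Surrogate fidelity to the black box: {fidelity:.2f}")
print(export_text(surrogate))  # human-readable rules approximating the model

The fidelity score indicates how faithfully the simple model mirrors the complex one; a low-fidelity surrogate should not be the basis for explanations given to users or regulators.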
AI fairness is another growing field covering a very complex issue. Bias, discrimination and fairness are highly context-specific, and the numerous definitions of fairness vary widely between and within the disciplines of mathematics, computer science, law, philosophy and economics. Some privacy regulators have issued clear guidelines. According to the ICO, fairness means personal data must be handled in ways people would reasonably expect and not used in ways that have unjustified adverse effects on them. Similarly, the FTC explains that under the FTC Act, a practice will be considered unfair if it causes more harm than good. Definitions of the fairness principle in the context of the GDPR, on the other hand, are still scarce. At the same time, many organizations are unsure how to avoid bias in practice. In general, bias can be addressed pre-processing (prior to training the algorithm), in-processing (during model training), and post-processing (correcting bias in predictions).
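As a simple illustration of how one of these many statistical fairness notions can be checked, the hypothetical sketch below computes the difference in positive-decision rates between two groups, often called the demographic parity difference; the predictions and group labels are synthetic placeholders.

# One of many possible fairness checks: difference in positive-decision rates
# between two groups ("demographic parity difference"). All data is synthetic.
import numpy as np

rng = np.random.default_rng(0)
y_pred = rng.integers(0, 2, size=1000)     # model decisions (1 = approved)
group = rng.choice(["A", "B"], size=1000)  # protected attribute, e.g., nationality

rate_a = y_pred[group == "A"].mean()
rate_b = y_pred[group == "B"].mean()
print(f"Selection rate A: {rate_a:.2f}, B: {rate_b:.2f}")
print(f"Demographic parity difference: {abs(rate_a - rate_b):.2f}")

Other definitions, such as equalized odds or predictive parity, can conflict with this one; which notion is appropriate depends on the use case and the applicable legal context.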
AI explainability and fairness are only two of many rapidly evolving principles in the field of responsible AI. Other areas, such as securing AI/ML algorithms, also require increasing awareness and safeguards, as the EU Agency for Cybersecurity, ENISA, emphasized in a recent report. Another challenge is the trade-off between different principles: tensions may arise between some trustworthiness properties, like transparency and privacy, or privacy and fairness.
Practical assessment and documentation
Legal definitions are not the only component of responsible AI principles that requires greater clarity. The term “Responsible AI Gap” was coined for the challenges companies face when trying to translate trustworthy AI principles into tangible actions.
For privacy professionals, it can be a good start to approach the topic through a data governance and risk management lens to ensure accountability. Co-designing an appropriate AI governance process with other teams, such as computer engineering, data science, security, product development, user experience design, compliance, marketing and the emerging role of the AI ethicist, is the baseline for ensuring that AI/ML systems which process personal data take privacy requirements into account throughout the entire ML pipeline.
Internal policies should ensure humans are in the loop to address data quality, data annotation, testing of the training data’s accuracy, (re)validation of algorithms, benchmarked evaluation and external auditing. Using external transparency standards, such as those recently published by the IEEE or the U.K. government, can also be considered.
Data protection impact assessments or privacy impact assessments could be augmented with additional questions relating to responsible AI. In this way, risks to rights and freedoms that using AI can pose to individuals can be identified and controlled. Here, any detriment to individuals that could follow from bias or inaccuracy in the algorithms and data sets should be assessed and the proportionality of the use of AI/ML algorithms documented. The PIA can describe trade-offs, for example between statistical accuracy and data minimization, and document the methodology and rationale for any decision made.
Additionally, organizations can consider privacy-preserving machine learning solutions or the use of synthetic data. While these do not replace responsible AI and privacy policies, thorough model risk management, or methods and tools for model interpretability and bias detection, they strengthen a privacy-first approach when designing AI architectures.
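As a toy illustration of the synthetic data idea, the sketch below fits a simple multivariate Gaussian to numeric records and samples new, artificial records from it; real-world synthetic data generation relies on dedicated tools, and the output must still be assessed for re-identification risk.

# Toy sketch of synthetic data: fit a multivariate Gaussian to real numeric
# records and sample artificial ones. All values here are made up for illustration.
import numpy as np

rng = np.random.default_rng(0)
real = rng.normal(loc=[40, 55000], scale=[10, 12000], size=(500, 2))  # e.g., age, income

mean = real.mean(axis=0)
cov = np.cov(real, rowvar=False)
synthetic = rng.multivariate_normal(mean, cov, size=500)  # records not tied to any individual

print("Real means:     ", mean.round(1))
print("Synthetic means:", synthetic.mean(axis=0).round(1))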
The Norwegian DPA highlighted these approaches in a report dedicated to the use of personal data in ML algorithms: “Two new requirements that are especially relevant for organizations using AI, are the requirements privacy by design and DPIA.”
In this context, key questions on responsible AI principles can also be taken into account. Starting points could be the list proposed by the EU’s AI-HLEG or the questions compiled by the Partnership on AI. Interdisciplinary discussions and the deployment of toolkits for responsible AI, AI fairness and AI explainability, such as LIME, SHAP or LORE, can further contribute to mutual understanding and transparency toward users.
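As an example of how such a toolkit might be used, the hypothetical sketch below applies the open-source SHAP package to a placeholder model to produce per-feature contributions that can support explanations given to users or regulators; the model, the data and the availability of the shap package are assumptions.

# Illustrative use of SHAP to attribute predictions to input features.
# Model and data are placeholders; requires the open-source "shap" package.
import numpy as np
import shap
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier

X, y = make_classification(n_samples=1000, n_features=5, random_state=0)
model = GradientBoostingClassifier(random_state=0).fit(X, y)

explainer = shap.Explainer(model, X)  # builds an explainer suited to the model type
shap_values = explainer(X[:100])      # per-feature contributions for 100 samples

# Global view: mean absolute contribution of each feature.
print(np.abs(shap_values.values).mean(axis=0))

# Local view: contributions behind one individual prediction.
print(shap_values[0].values)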
Further non-technical approaches can include the formation of an ethical AI committee, internal training, diversification of team composition, or an analysis of the data collection mechanism to avoid bias. Currently, the public sector is spearheading efforts to inventory all algorithms in use for transparency reasons. Other organizations have begun to release AI Explainability Statements. Regardless of the approach, organizations must provide consumers with the necessary information in case of adverse actions resulting from AI/ML systems, as well as about the use and consequences of scoring.
More developments on the horizon
Principles for ensuring trustworthy AI and ML will be reflected in a wide variety of laws in the coming years. On a global level, the OECD has counted 700 AI policy initiatives in 60 countries.
With the new EU Artificial Intelligence Act, high-risk AI systems will be explicitly regulated. In the U.S., President Biden’s administration announced the development of an “AI bill of rights.” With an additional $500 million in funding for the FTC on the horizon, the agency has filed for rulemaking authority on privacy and artificial intelligence. Additionally, the new California Privacy Protection Agency will likely be charged with issuing regulations governing AI by 2023, which can be expected to have far-reaching impact.
With increasing enforcement and new regulations underway, ensuring privacy compliance of AI systems will become a minimum requirement for the responsible use of AI. With more to come our way, it is important that AI applications meet privacy requirements now. Alignment of efforts and a good understanding of the AI/ML ecosystem will help tremendously with preparing for those new developments.