AI transparency without exposure: Legal horizons for homomorphic encryption, federated learning


Contributors:
Nicoletta Kolpakov
Director of law and policy
Cirrus Institute for AI and Data Governance
Editor's note: The IAPP is policy neutral. We publish contributed opinion and analysis pieces to enable our members to hear a broad spectrum of views in our domains.
Artificial intelligence depends on data — the more diverse the training data, the more powerful the resulting models. Yet access to such data is increasingly constrained. Privacy regulation, cross-border transfer restrictions and the reputational risks of sharing sensitive information make pooling large datasets difficult.
This constraint is particularly acute in fields such as health care, financial services and government services, where regulators scrutinize how every byte of data is collected and used. The result is a paradox: Organizations know that robust AI systems require collaborative learning across institutions, but the black-box design of model training and generative learning, combined with laws designed to protect individuals, prevents those datasets from being freely exchanged.
How federated learning protects data without centralization
Federated learning emerged as one way out of this paradox. Instead of centralizing raw data, multiple participants each train a local model on their own datasets and then contribute updates to a central aggregator. The aggregator merges those updates into a global model that benefits from all the participants' data without requiring them to directly share it.
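The train-locally, aggregate-centrally loop described above can be illustrated with a minimal sketch. It assumes a toy one-parameter model and a simple weighted average as the merge step, one common choice usually called federated averaging. The function names, learning rate and sample data are hypothetical and chosen only for illustration, not drawn from any particular deployment.

```python
from typing import List, Tuple

# Hypothetical type: (x, y) pairs held privately by one participant.
Dataset = List[Tuple[float, float]]


def local_update(weight: float, data: Dataset, lr: float = 0.01, epochs: int = 5) -> float:
    """Train a toy model y = weight * x on one participant's own data."""
    for _ in range(epochs):
        # Gradient of the mean squared error with respect to the weight.
        grad = sum(2 * (weight * x - y) * x for x, y in data) / len(data)
        weight -= lr * grad
    return weight


def aggregate(updates: List[Tuple[float, int]]) -> float:
    """Merge local weights, weighting each by its number of local examples."""
    total = sum(n for _, n in updates)
    return sum(w * n for w, n in updates) / total


def federated_round(global_weight: float, participants: List[Dataset]) -> float:
    """One round: each participant trains locally, then the aggregator merges."""
    # Only the updated weight and the example count leave each participant.
    updates = [(local_update(global_weight, d), len(d)) for d in participants]
    return aggregate(updates)


def train(participants: List[Dataset], rounds: int = 50, initial_weight: float = 0.0) -> float:
    """Repeat federated rounds, starting from a shared initial weight."""
    weight = initial_weight
    for _ in range(rounds):
        weight = federated_round(weight, participants)
    return weight


# Two hypothetical institutions whose private data both follow y = 3x.
institution_a: Dataset = [(1.0, 3.0), (2.0, 6.0)]
institution_b: Dataset = [(3.0, 9.0), (4.0, 12.0), (5.0, 15.0)]
global_weight = train([institution_a, institution_b])
```

In this sketch the aggregator sees only model weights and example counts; the raw (x, y) records stay with each institution. Production systems differ in model size, communication and the safeguards applied to the updates themselves.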
Google's use of federated learning to improve predictive text on Android devices is a familiar example. Hospitals and banks have begun experimenting with the approach to capture insights from patient outcomes or transaction patterns without handing sensitive records to a central repository.