Artificial intelligence is revolutionizing sectors across the economy and is expected to keep doing so for the foreseeable future. However, serious data privacy and security concerns stand in the way of wider adoption.
At times, AI can seem entirely at odds with privacy principles, but things are not always what they seem. Several privacy and AI instruments, including draft regulations, executive orders and frameworks, strongly encourage the use of privacy-enhancing technologies (PETs) with AI.
Traditional AI models rely heavily on centralized data storage and processing. Mountains of user data are collected, aggregated in vast data centers and then used to train AI models.
While effective, this approach has privacy problems. The sheer volume of personal data processed is alarming on its own.
Moreover, centrally storing data creates a single point of failure, making it a prime target for cyberattacks. Breaches can expose sensitive user information, leading to identity theft and other serious consequences.
Additionally, data poisoning attacks are a worrisome concern for centralized data because a single successful attack can contaminate the entire dataset. In a decentralized system, if only local data is compromised, the attack is contained.
Lastly, when individuals surrender their data to centralized systems, they could lose control over its use. This lack of transparency and control can breed distrust and hinder AI adoption.
Two emerging technologies, federated learning and edge computing, can support privacy and security in AI use cases. By decentralizing data and processing power, these technologies offer a promising method to protect privacy while maintaining the benefits of AI innovation.
Privacy-preserving AI training
Federated learning, a subfield of machine learning, focuses on AI model training processes. It helps keep data private by distributing the training workload across devices (clients) while ensuring the data remains decentralized.
Imagine a group project where everyone works on their part at home (data stays on devices) and then shares only their key findings (model parameters) to create a final report (global model).
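To make the group-project analogy concrete, the sketch below shows one round of plain federated averaging in Python. It is a minimal illustration under simplifying assumptions: the linear model, the toy client datasets and the helper name local_step are all hypothetical, not part of any real federated learning framework.

```python
# Minimal sketch of federated averaging, assuming a toy linear model.
# All names and data here are illustrative, not a production framework.
import numpy as np

rng = np.random.default_rng(0)

def local_step(weights, X, y, lr=0.1):
    """One gradient step on a client's private data (data never leaves)."""
    grad = 2 * X.T @ (X @ weights - y) / len(y)
    return weights - lr * grad

# Each client holds its own private dataset on-device.
clients = [(rng.normal(size=(20, 3)), rng.normal(size=20)) for _ in range(5)]

global_weights = np.zeros(3)
for round_num in range(10):
    # Clients train locally and share only their updated parameters.
    updates = [local_step(global_weights.copy(), X, y) for X, y in clients]
    # The server averages the parameters to form the new global model.
    global_weights = np.mean(updates, axis=0)

print(global_weights)
```

Real deployments rely on purpose-built frameworks for secure communication, far larger models and many more rounds, but the core pattern is the same: raw data never leaves the client; only parameters travel.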
Federated learning comes with some legal benefits, data minimization among them. It supports the data minimization principles enshrined in privacy laws like the EU General Data Protection Regulation. By keeping user data on devices, it reduces the amount of personal information collected and stored centrally.
Additionally, federated learning can help with notice and consent requirements for data collection. Individuals can understand how their data is used locally on their devices and consent to sharing only model parameters rather than raw data.
Security-wise, federated learning can reduce the attack surface. By distributing data across a network of devices, a single compromise no longer exposes the entire trove of collected data.
Instead, an attacker who compromises a single device can access only the small, ephemeral amount of data on that device, not the entire dataset. Because raw data remains local, the risk of a large-scale breach at a centralized server is virtually eliminated.
Living life on the edge
Edge computing is concerned with where data processing happens. It moves processing power and storage closer to where the data is generated — on devices or local servers — instead of centralized cloud servers.
Imagine doing homework assignments on your laptop instead of logging into the library's cloud every time. Processing happens closer to the data source.
This local processing is crucial for real-time or near-real-time applications like autonomous vehicles or smart devices. However, edge computing caps computational power: workloads are always limited by the capabilities of the edge hardware itself.
Just like federated learning, edge computing has privacy benefits. Only processed results, such as model outputs or parameters, are transferred to the central server rather than the raw data, minimizing what is transmitted across networks. Local processing also reduces the chance of data loss in transit.
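As a rough illustration of that data minimization, the Python sketch below shows an edge device processing raw readings locally and transmitting only a derived summary. The functions read_sensor and send_upstream are hypothetical stand-ins for device and network code.

```python
# Minimal sketch of edge-style processing, assuming a device that reads
# raw sensor data locally and transmits only a derived result upstream.
# read_sensor and send_upstream are hypothetical stand-ins.
import statistics

def read_sensor():
    # Stand-in for raw, potentially sensitive on-device readings.
    return [21.3, 21.7, 22.1, 21.9, 22.4]

def send_upstream(payload: dict):
    # Stand-in for a network call; only this payload ever leaves the device.
    print("transmitted:", payload)

raw_readings = read_sensor()          # raw data stays on the device
summary = {
    "mean": round(statistics.mean(raw_readings), 2),
    "max": max(raw_readings),
}
send_upstream(summary)                # only the processed summary is sent
```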
Edge computing can empower users with greater control over their data. Individuals can choose which data is processed locally and have more transparency into how it is used.
Of course, developers ultimately establish the bounds of control in the programming and terms of use. Users should understand how AI makes decisions and who is responsible for potential harm.
Better together
Federated learning and edge computing can be a powerful combination. Edge computing reduces the need to constantly upload data to the cloud, which benefits federated learning by improving efficiency and reducing bandwidth, lag time and associated costs. Federated learning, in turn, builds on edge computing: processing stays local, while model updates can still be aggregated and analyzed across a much larger population.
Federated learning typically strips personal elements from the data it learns from, distributing relatively safe model results back to participants. Both techniques play essential roles in the development of safe and efficient AI.
Federated learning, edge computing concerns in AI
Federated learning and edge computing offer a compelling vision for privacy-preserving AI. However, these decentralized approaches raise issues that could hinder their ability to fully unlock AI's potential. Four critical areas of concern stand out.
Data quality and fragmentation. In traditional AI, vast amounts of centralized data fuel potent models. Federated learning distributes data across devices, potentially leading to fragmentation and inconsistencies. Data fragmentation can harm training data quality and, ultimately, the performance of the AI model.
Computational efficiency. Training complex AI models requires significant computational resources. Centralized cloud servers provide the horsepower needed for intensive calculations. While edge devices can handle more straightforward tasks, they may lack the processing power that complex training demands.
This potential limitation raises concerns about the scalability of federated learning, particularly for training sophisticated AI models that require substantial processing power. Edge computing and federated learning are not a one-size-fits-all solution, but they can be a big win for problems well suited to them.
Privacy risks at the edge. Distributing processing power to the edge introduces new security vulnerabilities. If the end user fails to adopt reasonable security practices, they risk their own data. Securing individual devices and ensuring the integrity of the training process across a decentralized network is critical.
Companies can mitigate these risks with technical and administrative controls like terms of use. System administrators must remain vigilant against cyberattacks that could compromise the entire federated learning system.
System heterogeneity. This is more of a secondary issue but worth mentioning. The devices participating in federated learning can be vastly different — smartphones, smartwatches and industrial equipment all have varying processing capabilities and software configurations.
This heterogeneity can create challenges in ensuring all devices contribute meaningfully to the training process. Companies utilizing federated learning and edge computing must ensure robust algorithms account for these discrepancies.
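As one illustration of such an algorithm, the sketch below weights each device's contribution by how much data it holds and tolerates dropouts, a simplified version of the weighting used in federated averaging. The example data and the aggregate helper are hypothetical.

```python
# Minimal sketch of heterogeneity-aware aggregation, assuming the server
# weights each client's parameters by its local sample count and simply
# skips devices that failed to report this round. Illustrative only.
import numpy as np

def aggregate(updates):
    """updates: list of (weights, n_samples), or None for dropped devices."""
    received = [u for u in updates if u is not None]
    total = sum(n for _, n in received)
    # Clients with more local data contribute proportionally more.
    return sum(w * (n / total) for w, n in received)

updates = [
    (np.array([0.9, 1.1]), 500),   # powerful device, large dataset
    (np.array([1.2, 0.8]), 50),    # constrained device, small dataset
    None,                          # device dropped out mid-round
]
print(aggregate(updates))
```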
Turning black boxes into bright futures
Despite these concerns, federated learning and edge computing offer significant advantages over traditional computing and cloud approaches for privacy-preserving AI.
Ongoing research continues to address these challenges. As these technologies mature, they have the potential to revolutionize AI development while safeguarding user privacy and data security.
As AI continues to promise exciting opportunities for progress, concerns about privacy and security threaten to hamstring its potential. Federated learning and edge computing, among other PETs, offer a compelling way forward.
While challenges remain, ongoing research and development are promising. Innovation and privacy do not have to be a zero-sum game. By embracing PETs like federated learning and edge computing, we can begin to harness the full potential of AI while safeguarding individual privacy.
This is how we build a future where both technology and trust can flourish.
Michael Cole, AIGP, CIPP/C, CIPP/E, CIPP/US, CIPM, CIPT, FIP, PLS, is managing counsel for Mercedes-Benz R&D North America.