Editor's note: The IAPP is policy neutral. We publish contributed opinion and analysis pieces to enable our members to hear a broad spectrum of views in our domains.
In today's regulatory environment, the legal risk of algorithmic disgorgement — the deletion of algorithms developed using illegally collected data — makes privacy consultations a necessary and strategic requirement of the artificial intelligence development life cycle.
During privacy consultations for machine learning AI, engineers are already considering a body of technical methods that could address the risk of algorithmic disgorgement: machine unlearning.
Unlearning for machine learning
AI engineers commonly implement "unlearning" techniques for non-privacy purposes, including resuming training from where it left off following a crash or timeout, analyzing a model's performance at different stages or fine-tuning pre-trained AI models at intermediate stages.
AI checkpoints. These saved snapshots of a machine learning model's state during training act like a pause button, allowing developers to save the model's then-current progress. Because AI checkpoints enable selection of the best-performing model prior to full deployment, they are commonly used for inference engines or any machine learning AI used to make predictions or generate outputs from new data.
AI checkpoints are typically saved at regular intervals during training or when certain performance milestones are achieved. Full AI checkpoints save the model's entire state at a point in time, including architecture and data. Partial AI checkpoints save only such things as a machine learning model's weights and parameters — respectively, the numerical values that determine how strong the connections are between the nodes making up the neural network, and the learnable elements of the model.
To leverage AI checkpoints for privacy purposes, teams would need to understand whether a full or a partial AI checkpoint is most appropriate based on the data ingestion strategy.
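As a rough illustration, the sketch below shows how a full checkpoint (weights plus optimizer state and training metadata) and a partial checkpoint (weights only) might be saved and restored in PyTorch so a model can be rolled back to a state that predates questionable data; the model, optimizer and file names are placeholders, not a prescribed implementation.

```python
import torch
import torch.nn as nn
import torch.optim as optim

# Placeholder model and optimizer for illustration only.
model = nn.Linear(10, 2)
optimizer = optim.SGD(model.parameters(), lr=0.01)
epoch = 5

# Full checkpoint: weights, optimizer state and training metadata,
# enough to resume training from this exact point.
torch.save(
    {
        "epoch": epoch,
        "model_state_dict": model.state_dict(),
        "optimizer_state_dict": optimizer.state_dict(),
    },
    "full_checkpoint.pt",
)

# Partial checkpoint: weights and parameters only, enough for
# inference or fine-tuning but not an exact training resume.
torch.save(model.state_dict(), "weights_only.pt")

# Restoring: roll back to a checkpoint taken before the
# unauthorized data entered training.
checkpoint = torch.load("full_checkpoint.pt")
model.load_state_dict(checkpoint["model_state_dict"])
optimizer.load_state_dict(checkpoint["optimizer_state_dict"])
```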
Model versioning. Like AI checkpoints, model versioning captures snapshots of an AI system, but it operates on a different level. While AI checkpoints capture specific snapshots of a model's training progress, facilitating resumption and fine-tuning, model versioning provides a high-level overview of different model states, enabling tracking of evolution and experimentation.
Model versioning essentially applies version control principles to machine learning models. It involves systematically tracking and managing changes made to the model's code, data, parameters, and even the model itself.
To leverage model versioning, a model registry must be planned and maintained as an AI solution expands its data ingestion and integration. Additionally, model versioning is most suitable for AI models that expand either regionally or categorically but then need to be retracted.
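As a conceptual sketch only (not a specific registry product), the snippet below shows the kind of per-version metadata a model registry might track so that versions trained on a tainted data source can be retracted in favor of the latest clean version; all class and field names are illustrative assumptions.

```python
from dataclasses import dataclass, field
from datetime import datetime

@dataclass
class ModelVersion:
    """One registered model state, with enough lineage to audit its training data."""
    version: str                 # e.g. "1.3.0"
    artifact_path: str           # where the serialized model is stored
    training_data_sources: list  # datasets or regions ingested for this version
    created_at: datetime = field(default_factory=datetime.utcnow)

@dataclass
class ModelRegistry:
    """Minimal illustrative registry: append new versions, retract tainted ones."""
    versions: list = field(default_factory=list)

    def register(self, version: ModelVersion) -> None:
        self.versions.append(version)

    def retract_versions_using(self, tainted_source: str) -> ModelVersion:
        """Drop every version trained on a tainted source; return the latest clean one."""
        self.versions = [
            v for v in self.versions
            if tainted_source not in v.training_data_sources
        ]
        return max(self.versions, key=lambda v: v.created_at)
```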
Data sanitization layers. AI data sanitization involves deploying data sanitization techniques at the first two layers of the AI development life cycle that process data using AI algorithms to train AI models. These are the data link layer — when data enters an AI system, like web forms — and the computation layer — when data is about to be or is stored, staged and used to train AI models.
Which data sanitization techniques to deploy depends on the engineering limitations and the AI system. Common AI data sanitization techniques include the following (a short sketch of rule-based filtering and masking appears after this list):

Rule-based filtering — deploying predefined rules and patterns in order to later identify, flag and remove information.

Named entity recognition — embedding natural language processing so information within the data, such as names or locations, can later be automatically identified and classified.

Natural language processing-based sanitization — implementing advanced natural language processing techniques in the model to later identify and flag topics or contextual information within text data. Unlike named entity recognition, this goes beyond simple pattern matching to understand the context of the data and identify potential privacy risks.

Data masking and anonymization — embedding the capability to later find original data and replace it with fictional or altered information, for example through shuffling, substitution, encryption or pseudonymization, before it re-enters the AI system.
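For instance, here is a minimal sketch of rule-based filtering combined with data masking, using simple regular expressions to flag and then pseudonymize email addresses and phone numbers before text is staged for training; the patterns, placeholder tokens and sample text are illustrative assumptions, not a complete sanitization pipeline.

```python
import re

# Illustrative predefined rules; real deployments would need far broader coverage.
RULES = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "PHONE": re.compile(r"\b\d{3}[-.\s]?\d{3}[-.\s]?\d{4}\b"),
}

def flag(text: str) -> dict:
    """Rule-based filtering: identify and flag matches for later review or removal."""
    return {label: pattern.findall(text) for label, pattern in RULES.items()}

def mask(text: str) -> str:
    """Data masking: replace flagged values with placeholder tokens before the
    text is staged for model training."""
    for label, pattern in RULES.items():
        text = pattern.sub(f"[{label}_REDACTED]", text)
    return text

sample = "Contact Jane at jane.doe@example.com or 555-123-4567."
print(flag(sample))   # {'EMAIL': ['jane.doe@example.com'], 'PHONE': ['555-123-4567']}
print(mask(sample))   # Contact Jane at [EMAIL_REDACTED] or [PHONE_REDACTED].
```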
To leverage data sanitization, AI governance teams will need a deep understanding of the data flow and architecture at the data link layer, not only to identify the best technique, but also to determine when data sanitization would afford the most data protection.
Notably, the most privacy-preserving approach would be to combine data sanitization with AI checkpoints or model versioning — rather than retroactively applying data sanitization alone — to ensure full retraining is not required as a result of the inadvertent use of unauthorized data.
Regularization methods
In general, regularization methods involve creating and imposing penalties on an AI model to retain specific data while encouraging it to "forget" other data. These techniques add penalties to the model's loss function, encouraging it to learn more generalizable patterns rather than memorizing the training data.
Four regularization techniques could be privacy-enhancing and are commonly used in deep learning to prevent "overfitting," in which a model learns the training data too well, including its noise and outliers, leading to poor performance on unseen data.
L1 regularization, also known as least absolute shrinkage and selection operator, or lasso, adds a penalty proportional to the absolute size of the model's weights, which can shrink some weights to exactly zero. Penalties refer to a mechanism, usually algorithm-based, that discourages specific choices during text generation or other tasks and lowers the probability of certain information being chosen. While penalties are frequently used to encourage more diverse and novel outputs or to prevent specific undesirable outcomes, they can also be used to cancel out unauthorized data, rendering it meaningless and undermining any argument that the ultimate AI algorithm or model benefitted from the data.
L2 regularization, known as ridge regression, adds a penalty in a different mathematical way: it penalizes the squared size of the weights, encouraging an AI model to assign smaller values to all weights without necessarily shrinking any of them to zero. L2 regularization is effective in handling multicollinearity — that is, data that is strongly correlated with, or significant to, other data or an insight.
For privacy purposes, L1 regularization would be a more defensible technique than L2.
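A minimal sketch, assuming a small PyTorch model and a standard task loss, of how L1 and L2 penalty terms can be added to the loss function; the penalty coefficients are placeholders an engineering team would tune.

```python
import torch
import torch.nn as nn

model = nn.Linear(10, 1)            # placeholder model
criterion = nn.MSELoss()            # placeholder task loss
l1_coeff, l2_coeff = 1e-4, 1e-4     # illustrative penalty strengths

def regularized_loss(inputs, targets):
    task_loss = criterion(model(inputs), targets)
    # L1 (lasso): penalize absolute weight values, driving some weights to zero.
    l1_penalty = sum(p.abs().sum() for p in model.parameters())
    # L2 (ridge): penalize squared weight values, shrinking all weights toward zero.
    l2_penalty = sum(p.pow(2).sum() for p in model.parameters())
    return task_loss + l1_coeff * l1_penalty + l2_coeff * l2_penalty

inputs, targets = torch.randn(8, 10), torch.randn(8, 1)
loss = regularized_loss(inputs, targets)
loss.backward()
```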
Dropout involves randomly deactivating a percentage of neurons in each layer of a neural network. This is commonly used to prevent neurons from becoming overly reliant on each other and improves the AI model's generalization capability. By randomly disabling neurons, dropout prevents the network from relying too heavily on specific connections or features and encourages it to learn more generalized and robust representations.
This method is a more proactive means of addressing data removal when data has been incorrectly collected and used. It could be deployed after data has been removed without sacrificing the entire algorithm, since implementing dropout ensures no specific data has more importance than any other, so data removal will not necessarily negatively impact or change a deep learning system. Notably, this technique would not likely be viable for binary data.
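A minimal sketch, assuming a small PyTorch network, of how dropout layers randomly deactivate a share of neurons during training and are switched off at inference time; the layer sizes and the 30% dropout rate are illustrative.

```python
import torch
import torch.nn as nn

# Illustrative network: 30% of hidden activations are zeroed at random each
# training step, so no single connection or feature becomes indispensable.
model = nn.Sequential(
    nn.Linear(20, 64),
    nn.ReLU(),
    nn.Dropout(p=0.3),
    nn.Linear(64, 64),
    nn.ReLU(),
    nn.Dropout(p=0.3),
    nn.Linear(64, 2),
)

model.train()                      # dropout active during training
train_out = model(torch.randn(4, 20))

model.eval()                       # dropout disabled at inference time
with torch.no_grad():
    eval_out = model(torch.randn(4, 20))
```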
Batch normalization is an algorithmic method used to help improve the speed and stability of training deep neural networks. The process begins by calculating the mean and variance of the activations — the output of a neuron — for each feature within a small subset of the training data used in each iteration of training called a "mini-batch." Each "activation" is then normalized with an algorithm to effectively standardize the activations and bring them to a zero mean and a unit variance.
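A minimal NumPy sketch of that normalization step, computing the per-feature mean and variance of a mini-batch of activations and standardizing them to roughly zero mean and unit variance; the batch here is random illustrative data and the small epsilon is a standard numerical-stability constant.

```python
import numpy as np

activations = np.random.randn(32, 8)        # mini-batch: 32 examples, 8 features
eps = 1e-5                                   # avoids division by zero

batch_mean = activations.mean(axis=0)        # per-feature mean over the mini-batch
batch_var = activations.var(axis=0)          # per-feature variance over the mini-batch

# Standardize each activation to roughly zero mean and unit variance.
normalized = (activations - batch_mean) / np.sqrt(batch_var + eps)
```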
This technique would likely need to be combined with methods that would diminish the value of any specific data, such as L1 regularization or dropout, so the normalization average remains intact.
AI differential unlearning
AI differential unlearning is already used for such things as fine-tuning for accuracy and ethical reasons like correcting biases, and is a suitable privacy consideration during the development of large language models.
In general, AI differential unlearning involves reversing changes in model weights: the weights influenced by specific data points during training are adjusted until the process produces an approximate "forgetting" of those data points.
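One way to approximate that weight adjustment in practice, offered here as an illustrative sketch rather than a description of any particular method, is to briefly fine-tune the model so the loss rises on the data to be forgotten while staying low on the data to be retained; the classification model, batches and learning rate below are assumed placeholders.

```python
import torch
import torch.nn as nn

def approximate_unlearning_step(model, forget_batch, retain_batch, lr=1e-3):
    """One illustrative update: push weights away from the forget data while
    preserving behavior on retained data. Batches are (inputs, integer labels)."""
    criterion = nn.CrossEntropyLoss()
    optimizer = torch.optim.SGD(model.parameters(), lr=lr)
    optimizer.zero_grad()

    fx, fy = forget_batch
    rx, ry = retain_batch
    forget_loss = criterion(model(fx), fy)
    retain_loss = criterion(model(rx), ry)

    # Ascend on the forget set (negative sign) and descend on the retain set,
    # nudging the weights toward a state that approximately "forgets" fx, fy.
    loss = retain_loss - forget_loss
    loss.backward()
    optimizer.step()
```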
AI differential unlearning is already recognized as an appropriate choice for deploying AI systems that utilize machine learning where privacy or data compliance are crucial — for example, LLMs or even image recognition models. Like other unlearning techniques, it should be a contingency implemented during AI development so it can be triggered as needed.
There are two parts to AI differential unlearning: the algorithm and the technique used to introduce it. These methods do result in newly trained models, but the adjustments are made to the algorithms to remove or minimize the influence of the personal data that needs to be removed.
AI differential unlearning algorithms retain attributes of the larger set of data that needs to remain — versus the data that needs to be minimized — so unlearning has little to no impact on how the AI will work going forward. These algorithms can range from complete eradication of certain information to giving more or less importance to information the AI should consider in producing outputs.
Three types of algorithms are generally used: forgetting algorithms, which effectively eradicate any knowledge of certain data points; self-attention and forgetting fusion knowledge tracking algorithms, which combine self-attention mechanisms with forgetting so targeted forgetting can be applied later rather than retraining an entire AI model; and recursive least squares with a forgetting factor, which gives more "weight" to certain information so it overrides other information.
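As one concrete illustration of the last of these, here is a minimal NumPy sketch of recursive least squares with a forgetting factor; a factor below one exponentially discounts older observations so newer information carries more weight, and the dimensions, factor value and data are illustrative assumptions.

```python
import numpy as np

def rls_with_forgetting(samples, targets, lam=0.98, delta=1000.0):
    """Recursive least squares; lam < 1 exponentially discounts older samples."""
    n_features = samples.shape[1]
    w = np.zeros(n_features)            # model weights
    P = delta * np.eye(n_features)      # inverse correlation matrix estimate

    for x, y in zip(samples, targets):
        gain = P @ x / (lam + x @ P @ x)        # how much this sample moves the weights
        error = y - w @ x                        # prediction error on the new sample
        w = w + gain * error                     # update weights toward the new sample
        P = (P - np.outer(gain, x @ P)) / lam    # discount the influence of older samples
    return w

X = np.random.randn(200, 5)
y = X @ np.array([1.0, -2.0, 0.5, 0.0, 3.0]) + 0.01 * np.random.randn(200)
print(rls_with_forgetting(X, y))   # approximately recovers the true coefficients
```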
Lisa Nee, CIPP/E, CIPP/US, CIPM, CIPT, FIP, is senior counsel, privacy at Lenovo U.S.