Guidelines on White-Box Development

This resource outlines guidelines on white-box development to take into account when developing algorithms for automated decision making.

Published: June 2018

Contributors:

Lokke Moerel

Marijn Storm

Although it is far from set in stone what “white-box” development would require, there are some guidelines to take into account when developing algorithms for automated decision making. By documenting these steps and assessments, the controller will also comply with the requirement to perform a data protection impact assessment.

1. A clear, documented design for development at the outset (covering the elements below).

2. Verification from the outset that the dataset applied for the training of the algorithm is:

- Representative (no missing information from particular populations, and verification that there are no hidden unlawful biases that have an unintended impact on certain populations).
- Accurate and up to date (data collected in another context may be up to date but still lead to inaccurate outcomes).

Note that using an existing, unmodified dataset is likely to result in unlawful bias, simply because existing situations are rarely free of bias, and this existing bias is rarely lawful. For example, a dataset of all primary school teachers in the Netherlands will result in an unlawful bias: because women are overrepresented in the dataset, the algorithm will determine that women are better qualified for this job than men. Unlawful bias can be removed from a dataset by, e.g.:
- Removing data elements that indicate group membership and near proxies thereof. These data elements include direct identifiers of group membership, such as gender, race, religion and sexual orientation. Proxy identifiers may be, e.g., neighbourhood (often a proxy for race) or specific job titles (such as nurse or navy officer).
- Deciding on the target variables before starting to select the training data. The controller needs to decide upfront which variables are considered relevant for the selection at hand. If, for example, personality traits are included in a selection for recruitment purposes, such traits must be important enough to job performance to justify their use. Even if automated feature selection methods are used, the final decision to use or not use the results, as well as the choice of feature selection method and any fine-tuning of its parameters, are choices made by humans. These variables need to be documented and must “pass the smell test,” i.e., they must be intuitively relevant and important enough to job performance to be used. For example, a correlation between job applicants using browsers that did not come with the computer (like Firefox) and better job performance and retention will likely not be acceptable.
- Adding or modifying data elements to counteract an unlawful bias. Instead of deleting group membership indicators, these indicators can also be modified. For example, in the group of primary school teachers, the gender of a specific number of teachers can be reversed to remove bias. Alternatively, if a certain minority is underrepresented in the dataset, this can be compensated for by oversampling the underrepresented communities.
- Repairing attributes. An example of an attribute that is often biased is the SAT score (research shows that SAT scores are often biased against women due to negative assumptions about the abilities of women, and the resulting stereotyping tends to have a real effect on the outcomes). This can be remedied by splitting the scores achieved by men and by women into two groups and dividing each group into quantiles (e.g., top 5%). A median score can then be calculated for each quantile and attributed to both the women and the men in that quantile.
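
The attribute repair described in the last bullet can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a reference implementation: the function name repair_scores, the parameter n_quantiles and the sample data are hypothetical, quantile buckets are assigned by simple within-group rank, and the median is taken over the pooled scores of all groups in the same bucket.

```python
from statistics import median


def repair_scores(scores, groups, n_quantiles=4):
    """Replace each score by the median of all scores (from every group)
    that fall into the same within-group quantile bucket.

    Hypothetical helper: names and bucketing rule are assumptions.
    """
    # Rank each record within its own group and assign a quantile bucket.
    bucket_of = [0] * len(scores)
    for group in set(groups):
        members = sorted(
            (i for i, g in enumerate(groups) if g == group),
            key=lambda i: scores[i],
        )
        for rank, i in enumerate(members):
            bucket_of[i] = rank * n_quantiles // len(members)

    # Pool the scores of all groups per bucket and compute the median.
    pooled = {}
    for i, bucket in enumerate(bucket_of):
        pooled.setdefault(bucket, []).append(scores[i])
    bucket_median = {b: median(values) for b, values in pooled.items()}

    # Every record in a bucket receives the same repaired score.
    return [bucket_median[b] for b in bucket_of]


# Made-up sample data: four scores per group.
scores = [500, 600, 700, 800, 450, 550, 650, 750]
groups = ["m", "m", "m", "m", "f", "f", "f", "f"]
repaired = repair_scores(scores, groups, n_quantiles=4)
```

After the repair, a man and a woman who hold the same relative position within their own group receive the same score, so the repaired attribute no longer reveals group membership through its distribution.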

3. Review the outcome of the algorithm (and the correlations found) at set stages for unlawful bias and disparate impact and, where present, remove it.
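
One way to make such a review repeatable is to compare selection rates per group at each stage. The following Python sketch is a minimal example under stated assumptions: the function names, the sample data and the 0.8 threshold are all hypothetical, and the threshold is an illustrative cut-off for flagging outcomes for human review, not a legal standard taken from this resource.

```python
from collections import defaultdict


def selection_rates(outcomes, groups):
    """Share of positive outcomes (True) per group."""
    selected = defaultdict(int)
    total = defaultdict(int)
    for outcome, group in zip(outcomes, groups):
        total[group] += 1
        if outcome:
            selected[group] += 1
    return {g: selected[g] / total[g] for g in total}


def impact_ratio(outcomes, groups):
    """Lowest group selection rate divided by the highest."""
    rates = selection_rates(outcomes, groups)
    highest = max(rates.values())
    # If nobody is selected at all, there is no disparity to report.
    return min(rates.values()) / highest if highest > 0 else 1.0


def needs_review(outcomes, groups, threshold=0.8):
    """Flag the outcome for human review (threshold is an assumption)."""
    return impact_ratio(outcomes, groups) < threshold


# Made-up sample: group "a" is selected three times as often as "b".
outcomes = [True, True, True, False, True, False, False, False]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]
flagged = needs_review(outcomes, groups)
```

A flagged result does not by itself establish unlawful bias; it only indicates that the correlations behind the outcome deserve a closer look.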

4. Consider whether the algorithm can be used in ways that prevent unlawful discrimination. In the recruitment context, consider, for example, blind curation of CVs, i.e., eliminating names, gender, school names and geographical information from the CV before a relevant candidate pool is selected.
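
Blind curation could look roughly like the Python sketch below. It assumes that CVs are already available as structured records; the field names and the helper blind_cv are hypothetical. Free-text fields may still contain proxies for group membership and would need separate treatment.

```python
# Hypothetical field names covering names, gender, school names and
# geographical information.
IDENTIFYING_FIELDS = frozenset({"name", "gender", "school", "address"})


def blind_cv(cv, hidden_fields=IDENTIFYING_FIELDS):
    """Return a copy of the CV without the fields in hidden_fields."""
    return {key: value for key, value in cv.items() if key not in hidden_fields}


# Made-up sample record.
cv = {
    "name": "A. Example",
    "gender": "f",
    "school": "Example University",
    "address": "Example Street 1",
    "years_experience": 7,
    "skills": ["python", "statistics"],
}
blinded = blind_cv(cv)
```

The original record is left untouched, so the removed fields remain available for later stages, such as contacting the selected candidates.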

5. Ensure the auditability of the algorithm.
