If self-learning algorithms discriminate, it is not because there is an error in the algorithm, but because the data used to train the algorithm are “biased.”
It is only when you know which data subjects belong to vulnerable groups that bias in the data can be made transparent and algorithms trained properly. The taboo against collecting such data should, therefore, be broken, as this is the only way to eliminate future discrimination.
News reports regularly show that deploying machine learning algorithms can lead to discriminatory outcomes. In the U.S., for example, “crime prediction tools” proved to discriminate against ethnic minorities. The police stopped and searched more members of ethnic minorities, and as a result this group also showed more convictions. If you use this data to train an algorithm, the algorithm will assign a higher risk score to this group. Discrimination by algorithms is, therefore, a reflection of discrimination already taking place “on the ground.”
As algorithms are always trained on historical data, it is virtually impossible to find a “clean” dataset on which an algorithm can be trained to be “bias-free.” To solve this, group indicators such as race, gender, and religion are often removed from the training data. The idea is that if the algorithm cannot “see” these elements, the outcome will not be discriminatory.
The algorithm is thus “blinded,” just as résumés are sometimes blindly assessed by recruiters, or orchestra auditions are conducted behind a screen — which indeed typically results in the selection of more female musicians.
In practice, “blinding” does not work for algorithms. “Blind” training does not promote equality or fairness. For example, in the Netherlands, more than 75 percent of all primary school teachers are female. An algorithm trained to select the best candidates for this job would be fed with the résumés received in the past. Because primary schools employ so many more women than men, the algorithm will quickly develop a preference for female candidates, and making the résumés gender-neutral will not solve this. The algorithm will quickly detect other ways to explain why female résumés are selected more often, including by preferring certain female hobbies and allocating fewer points to résumés listing traditionally male pastimes.
The lesson is that removing group indicators does not help if the underlying data is one-sided. The algorithm will soon find derived indicators, known as proxies, to explain this bias.
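The proxy effect can be illustrated with a toy example. The sketch below uses invented records; the hobby names, the data, and the functions `train_blind` and `mean_score_by_gender` are all hypothetical. The "model" is nothing more than a hire rate per hobby and is never shown gender, yet its scores still differ by gender because the hobby stands in for it.

```python
from collections import defaultdict

# Hypothetical historical records: (gender, hobby, hired). Invented for illustration.
HISTORY = [
    ("F", "netball", True), ("F", "netball", True), ("F", "netball", True),
    ("F", "chess", True), ("F", "netball", False),
    ("M", "football", False), ("M", "football", False), ("M", "football", True),
    ("M", "chess", False), ("M", "netball", True),
]


def train_blind(records):
    """Learn a hire rate per hobby. Gender is ignored, i.e. the model is 'blinded'."""
    counts = defaultdict(lambda: [0, 0])  # hobby -> [hired, total]
    for _gender, hobby, hired in records:
        counts[hobby][0] += int(hired)
        counts[hobby][1] += 1
    return {hobby: hired / total for hobby, (hired, total) in counts.items()}


def mean_score_by_gender(model, records):
    """Average model score per gender. Gender is used only for this audit."""
    sums = defaultdict(lambda: [0.0, 0])  # gender -> [score sum, count]
    for gender, hobby, _hired in records:
        sums[gender][0] += model[hobby]
        sums[gender][1] += 1
    return {gender: total / n for gender, (total, n) in sums.items()}


if __name__ == "__main__":
    model = train_blind(HISTORY)
    print(mean_score_by_gender(model, HISTORY))
```

Note that the gap only becomes visible in the final step, which needs the gender column: without it, the skew in the blinded model's scores cannot even be measured.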
The only solution is to first make biases in the training data transparent. This requires collecting group indicators in order to assess whether minority groups are treated unequally. The algorithm must then be trained not to select on these factors, by means of “adversarial training.” That is the only way to prevent past bias from influencing future outcomes.
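The first step, making bias transparent, can be as simple as comparing selection rates between groups. The following is a minimal sketch under stated assumptions: the function names `selection_rates` and `parity_ratio`, the group labels, and the sample decisions are hypothetical, and the ratio of lowest to highest selection rate is only one of several possible measures. It covers the measurement step only, not adversarial training itself.

```python
def selection_rates(decisions):
    """decisions: iterable of (group, selected) pairs. Returns {group: selection rate}."""
    totals, selected = {}, {}
    for group, was_selected in decisions:
        totals[group] = totals.get(group, 0) + 1
        selected[group] = selected.get(group, 0) + int(was_selected)
    return {group: selected[group] / totals[group] for group in totals}


def parity_ratio(rates):
    """Lowest selection rate divided by the highest; 1.0 means equal rates."""
    highest = max(rates.values())
    if highest == 0:
        return 1.0  # nobody was selected in any group, so rates are equal
    return min(rates.values()) / highest


if __name__ == "__main__":
    # Hypothetical past decisions for two groups labeled "A" and "B".
    past = ([("A", True)] * 6 + [("A", False)] * 4
            + [("B", True)] * 3 + [("B", False)] * 7)
    rates = selection_rates(past)
    print(rates, parity_ratio(rates))
```

A ratio well below 1.0 signals that one group is selected far less often than another. The computation is impossible without the group label on each record, which is exactly the point made above.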
LinkedIn’s recruitment tool offers an example of how this process can be improved. Rather than removing gender from candidates’ résumés, LinkedIn specifically collects this data. Their premise is that men are not inherently better suited than women, or vice versa (recall the example of the primary school teacher). To prevent the tool from discriminating, candidates with the necessary qualifications are first divided by gender. LinkedIn then staggers each group into segments and combines the corresponding segments — by, for example, grouping the top five women and the top five men. This way, the results are corrected for diversity. LinkedIn is going to apply the same principle to ethnic background and will start asking candidates to provide this information.
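The segment-based approach described here can be sketched roughly as follows. This is an illustration of the idea as described in this article, not LinkedIn's actual implementation: the function `rerank_by_segments`, the candidate tuples, and the choice to sort by score within each combined segment are all assumptions.

```python
from itertools import zip_longest


def rerank_by_segments(candidates, segment_size=5):
    """Re-rank qualified candidates so each tier draws equally from every group.

    candidates: list of (name, group, score) tuples (hypothetical format).
    """
    # 1. Divide the candidates by group.
    by_group = {}
    for candidate in candidates:
        by_group.setdefault(candidate[1], []).append(candidate)

    # 2. Rank within each group and cut each ranking into segments.
    segmented = []
    for group in sorted(by_group):
        ranked = sorted(by_group[group], key=lambda c: c[2], reverse=True)
        segmented.append(
            [ranked[i:i + segment_size] for i in range(0, len(ranked), segment_size)]
        )

    # 3. Combine corresponding segments (top with top, second with second, ...).
    result = []
    for tier in zip_longest(*segmented, fillvalue=[]):
        merged = [c for segment in tier for c in segment]
        merged.sort(key=lambda c: c[2], reverse=True)  # assumption: score order within a tier
        result.extend(merged)
    return result


if __name__ == "__main__":
    pool = [("w1", "F", 90), ("w2", "F", 80), ("w3", "F", 70),
            ("m1", "M", 99), ("m2", "M", 98), ("m3", "M", 97)]
    print([name for name, _group, _score in rerank_by_segments(pool, segment_size=2)])
```

With a segment size of two, the first four places in this hypothetical pool hold two women and two men, whereas a plain sort by score would have placed three men ahead of the first woman.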
Collecting this information is extremely sensitive, however. EU privacy laws have always provided for a special regime for data such as race, disability and religion. The processing of this data is only allowed for specific purposes, which do not include recruitment. The idea is that collecting and processing such data elements increases the risk of discrimination. We also see this in the U.S., where collection and use of such data in the employment context are strictly regulated, if allowed at all.
In earlier publications, I have argued that the specific regime for sensitive data is no longer meaningful. It is becoming increasingly unclear whether data is sensitive. Rather, the focus should be on whether the use of such data is sensitive. Processing data on race to prevent discrimination by algorithms seems to be an example of non-sensitive use, provided that strict technical and organizational measures are implemented to ensure that this data is not used for other purposes.
Ironically, some of these group indicators, such as age, gender, and ethnic background, are visible to recruiters, allowing them to discriminate against candidates from certain minority groups without recording any data. It is therefore only by recording the data that existing discrimination can be revealed and bias eliminated from the algorithm.
The magical thinking that “not knowing” leads to more fairness persists in other areas.
For example, in the Netherlands, there is a taboo against “ethnic registration” in connection with crime, because it could lead to political abuse. This is a fallacy. Dutch scientists rightly advocated breaking this taboo: “You can only do something about inequality if you first map whether it takes place.” As an example, the scientists noted that young people of Moroccan origin rarely show up at a specific governmental agency that can agree on alternative punishments for crimes so that offenders avoid a criminal record. The condition is that these youngsters plead guilty and repent. Doing so is difficult, however, because of the shame culture in Moroccan society. How can we expect to improve this situation if we do not know that it is precisely these young people who stay away?
In this case, the potential risk of political abuse is outweighed by the many benefits of mapping these correlations. Again, it is not the algorithm that is wrong; it is humans who discriminate, and the algorithm detects this bias. This offers opportunities to reduce inequality precisely through algorithms. To do this, it is imperative that we know who belongs to certain minority groups.
The taboo against collecting these categories of data must be broken. Companies deploying AI should also be aware that the fairness principle under the GDPR cannot be satisfied through unawareness.
This post is a longer version of an op-ed first published in the Dutch Financial Times, November 12, 2018.