From setting insurance premiums to deciding who gets a home loan, from predicting the risk of a person re-offending to more accurately diagnosing disease, algorithmic systems have the ability to reshape our lives. Algorithms are increasingly used to make predictions, recommendations or decisions vital to individuals and communities in areas such as finance, housing, social welfare, employment, education, and justice, with very real consequences. As the use of algorithmic systems increases, so too does the need for appropriate auditing, assessment and review.
But where should a privacy professional start when assessing the privacy considerations raised by algorithmic systems?
As a privacy professional, you cannot apply the law or assess risk without understanding the technological components of algorithmic systems and how they interact with information privacy obligations. You don't need to be a technical expert, but you do need to know the right questions to ask of technical experts so you can assess the legal and reputational risk of what they propose.
AI or algorithm?
Algorithmic systems that raise privacy issues include both systems that use artificial intelligence and those built on human-written, rules-based code to make predictions, recommendations or decisions about humans.
Systems developed using AI may pose a particularly high privacy risk, but in our view, all types of algorithmic systems should be considered. The Australian Government's Online Compliance Intervention, more commonly referred to as "robodebt," which used an algorithmic system to automate debt recovery, is a recent example. This system did not use AI, but its human impact, some AU$1.5 billion in unlawful "debts," was significant.
How to assess privacy risk in an AIA
Evaluating the privacy risk of algorithmic systems via an algorithmic impact assessment is not just a matter of testing for legal compliance. To understand, identify and mitigate privacy harms, you need to think about concepts such as fairness, ethics, accountability and transparency, all vital factors when assessing algorithmic systems. However, I would also encourage privacy professionals to think more deeply about how to design trustworthy systems by looking at both risks and solutions through the lens of what we call "The Four D's Framework."
The Four D's Framework
The Four D's Framework offers a foundation for assessing algorithmic systems to minimize privacy harms throughout the life cycle of an algorithmic system. The build of an algorithmic system comprises four stages:
- Design.
- Data.
- Development.
- Deployment.
Design
Responsible algorithmic systems feature privacy risk management long before any personal information is used in a live scenario. Ideally, consideration of privacy issues should start as early as the design stage.
One of the first things to consider for any algorithmic system is the design objectives. It should go without saying that building an algorithmic system that processes personal information "because we can" will rarely meet community expectations. Asking the question, "what is the problem we're trying to solve?" is a popular starting place to ensure a clear understanding of why an algorithmic system is being pursued at all. But many other questions are also important to ask at this stage, such as:
- Who will benefit from the development of this system?
- Who has this problem and how do we know it is actually a problem for them?
- What do they say they need?
- Who is most at risk or vulnerable?
- Will this system meet the need or solve the problem better than what is already in place?
- How will we know if the system is 'working'?
Data
Organizations need to take care when considering the types of data used to develop, train, test and refine their algorithmic systems. Data often only tells part of the story, and when processed without due consideration, it can lead to misleading or even harmful outcomes.
For example, historical bias might be found within a training dataset, or bias could be introduced as an AI system "learns" post-deployment. There have been numerous examples of this: training data reflecting historical bias has led to racial discrimination in facial recognition algorithms and to gender bias in the Apple Credit Card.
Systems can also learn bias after deployment. Amazon's experimental recruitment algorithm learned that the company had preferred to hire men, and Microsoft's Tay chatbot all too quickly learned from Twitter users to make offensive, sexist, racist and inflammatory comments.
Some questions to consider regarding data include:
- When, where, how and by whom was the data initially collected?
- Who is represented in the data? Who is not represented in the data?
- How will we organize the data?
- How will we test for bias in the data?
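To make the last question concrete, below is a minimal sketch of two basic checks on a training dataset, using pandas. The file name, the 'gender' attribute and the 'approved' outcome column are all hypothetical; a real bias audit would examine more attributes, intersections of attributes and more sophisticated fairness measures.

```python
# A minimal sketch of two basic data checks, assuming a hypothetical
# training dataset with a 'gender' attribute and a binary 'approved'
# outcome column. Not a substitute for a full bias audit.
import pandas as pd

df = pd.read_csv("training_data.csv")  # hypothetical file

# 1. Representation: who is (and is not) in the data?
representation = df["gender"].value_counts(normalize=True)
print("Share of records by group:\n", representation)

# 2. Outcome disparity: how do historical outcomes differ by group?
rates = df.groupby("gender")["approved"].mean()
print("Approval rate by group:\n", rates)
print("Largest gap between groups:", rates.max() - rates.min())
```

Even simple checks like these can surface the representation gaps and historical outcome disparities discussed above before any model is trained on the data.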
Development
The development stage of an algorithmic project includes building and testing the algorithmic system.
If you are procuring a product or service, you will need to play an active role in the building and testing stage or ensure that off-the-shelf products are rigorously examined and tested before deployment. Questions of accountability also need to be considered, as organizations cannot simply outsource their responsibility through procurement and hope to avoid liability for any harm caused.
For organizations developing their own algorithmic systems, now is the time to put the design thinking and the considerations around data into action.
When developing an algorithmic system, organizations will want to test how well it is 'working' against metrics such as accuracy, recall and precision. Before testing, they should determine a baseline for each metric: the minimally acceptable result. They should also be aware of, and mitigate, the possibility of evaluation bias, which can occur when the test data do not appropriately represent the various parts of the system's intended population once deployed.
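As an illustration, here is a minimal sketch of testing a system's predictions against pre-agreed baselines, using scikit-learn's metric functions. The labels, predictions and baseline values are hypothetical; a real evaluation would also disaggregate these metrics by subgroup to check for the evaluation bias described above.

```python
# A minimal sketch of metric-based testing against pre-agreed baselines.
# The labels, predictions and baseline values below are hypothetical.
from sklearn.metrics import accuracy_score, precision_score, recall_score

# Hypothetical ground-truth labels and model predictions for a test set.
y_true = [1, 0, 1, 1, 0, 1, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 0, 1, 1, 0, 1, 0]

# Minimally acceptable results, agreed before testing begins.
baselines = {"accuracy": 0.80, "precision": 0.75, "recall": 0.75}

results = {
    "accuracy": accuracy_score(y_true, y_pred),
    "precision": precision_score(y_true, y_pred),
    "recall": recall_score(y_true, y_pred),
}

for metric, value in results.items():
    status = "PASS" if value >= baselines[metric] else "FAIL"
    print(f"{metric}: {value:.2f} (baseline {baselines[metric]:.2f}) {status}")
```

Agreeing the baselines before testing matters: it prevents the acceptable result from being quietly redefined to match whatever the system happens to achieve.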
Deployment
Once the algorithmic system is deployed into the real world, organizations cannot simply wash their hands of the system and let it run wild. The system may need to be changed to deal with real-world inputs and requirements for interoperability and interpretability, and there may be real-time feedback that needs to be integrated back into the system.
Another key area for privacy professionals to consider at deployment is how individuals will be meaningfully informed of how the algorithmic system is being used in practice and how they may seek more information or exercise review rights. Decisions made or supplemented by algorithmic means should be explainable and auditable.
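One practical building block for auditability is recording enough context about each automated decision to support a later review. Below is a minimal sketch of such an audit record; the field names, the file-based storage and the `log_decision` helper are all hypothetical, and a production system would need secure, access-controlled storage and appropriate retention limits.

```python
# A minimal sketch of a per-decision audit record. Field names and
# file-based storage are hypothetical illustrations only.
import datetime
import json

def log_decision(subject_id, inputs, output, model_version):
    """Record enough context for a later review of an automated decision."""
    record = {
        "timestamp": datetime.datetime.now(datetime.timezone.utc).isoformat(),
        "subject_id": subject_id,
        "inputs": inputs,                # the data the decision was based on
        "output": output,                # the prediction or decision made
        "model_version": model_version,  # which version of the system produced it
    }
    with open("decision_audit.log", "a") as f:
        f.write(json.dumps(record) + "\n")

# Hypothetical usage for a loan decision.
log_decision("applicant-123",
             {"income": 52000, "loan_amount": 20000},
             {"approved": False, "score": 0.41},
             "credit-model-v2.3")
```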
Conclusion
Privacy professionals should ensure they are involved in assessing privacy and related risks throughout the life cycle of an algorithmic system's development.