The Federal Trade Commission (FTC) held a workshop on big data last week called “Big Data: A Tool for Inclusion or Exclusion?” And to answer the question, the consensus among the panelists was yes to both: Big data analytics can bring enormous benefits to society, but it can also wrongfully label individuals, stigmatize protected classes and perpetuate societal prejudices and discrimination.
Examples abounded, particularly on the benefits side: think predicting flu outbreaks based on online searches, targeting advertisements only to those who are likely to want them and, yes, uncovering discrimination through big data analysis.
But the possible harmful uses of big data were front and center, too. After all, fighting such harms is what the FTC, as the nation’s consumer protection agency, does. The harms discussed ranged widely. Some were speculative scenarios, such as inferring creditworthiness from the fact that a consumer buys vegetable seeds. Others involved decisions based on “proxy” variables that are statistically associated with protected minorities, which can inadvertently or surreptitiously discriminate against those minorities. And then there was the well-known, real-life example of the pothole-finding mobile app whose results were skewed in favor of wealthier Boston neighborhoods with greater smartphone penetration until the data was supplemented with data gathered through other means.
One speaker called the dazzling range of opportunity and detriment the “data paradox.” Data can be used to help or hurt. And equally dazzling was the panorama of opinions on how to deal with this data paradox. They ranged from taking a “wait-and-see” approach before rushing into new legislation to devising a new regulatory framework for big data. Some participants explained how existing credit, finance, employment and consumer protection laws already cover many big data risks. Other participants suggested that the sheer variety of big data applications precludes a “one-size-fits-all” approach, or pointed out that protecting against harm in this context is “situational” and that “framework models” for addressing such context-specific risks and harms should be devised.
Fortunately, as several speakers noted, one such model already exists. It’s the risk-based approach to privacy, also referred to as “benefit-risk” analysis. It’s already available and, in some cases, already part of the law. See the “legitimate interest” ground for data processing in the EU Data Protection Directive, mandatory Privacy Impact Assessments in an increasing number of jurisdictions and the FTC’s unfairness authority, for example. And it’s uniquely suitable for addressing the challenges of big data and the massive data explosion that’s still ahead of us with the Internet of Things (IoT).
So what exactly is this risk-based approach to privacy?
The approach posits that in an era where data practices are simply too complex for the average person to comprehend, much less actively manage, the burden of preventing harm and safeguarding privacy should fall on the businesses amassing and using the data. Thus, businesses should assess the potential harms of their proposed data uses to individuals and, possibly, to society, devise the appropriate controls and then consider whether to proceed in light of any residual, unavoidable risk and the countervailing benefits of the proposed use. In other words, they should base their data-use decisions on contextual risk assessments. And since they would be following a structured and coherent risk-assessment methodology, businesses would later be able to justify their use decisions in the event of, say, an FTC investigation. Not only would this approach allocate the responsibility for privacy protections more appropriately, it would also improve the effectiveness of those protections, not least by replacing the common but dubious reliance on meaningless “consents” from individuals.
Thus, especially in the context of big data and the IoT where notice and consent are becoming increasingly impractical, impossible or illusory, a risk-based approach to privacy can deliver appropriate protections nonetheless.
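To make the steps of such an assessment concrete (identify harms, apply controls, weigh the residual risk against the benefits), here is a minimal sketch in Python. It is purely illustrative: the `Harm` and `Assessment` classes, the 0-to-10 scales, the likelihood-times-severity formula and every number in the example are assumptions made up for this sketch, not an established methodology. How harms and benefits should actually be quantified and weighed remains an open question.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Harm:
    """One potential harm of a proposed data use (hypothetical scales)."""
    description: str
    likelihood: float         # assessor's estimate, 0.0 (never) to 1.0 (certain)
    severity: float           # assessor's estimate, 0.0 (negligible) to 10.0 (severe)
    mitigation: float = 0.0   # share of the risk removed by controls, 0.0 to 1.0

    def residual_risk(self) -> float:
        # Risk that remains after the chosen controls are applied.
        return self.likelihood * self.severity * (1.0 - self.mitigation)


@dataclass
class Assessment:
    """A documented benefit-risk assessment for one proposed data use."""
    use_case: str
    benefit: float            # assumed to sit on the same 0-10 scale as severity
    harms: List[Harm] = field(default_factory=list)

    def total_residual_risk(self) -> float:
        return sum(h.residual_risk() for h in self.harms)

    def decide(self) -> str:
        # Proceed only if the benefit outweighs the unavoidable residual risk.
        if self.benefit > self.total_residual_risk():
            return "proceed"
        return "reconsider"


# Hypothetical example loosely based on the proxy-variable scenario.
assessment = Assessment(
    use_case="Score loan applicants using purchase history",
    benefit=6.0,
    harms=[
        Harm("Proxy variable discriminates against a protected class",
             likelihood=0.5, severity=8.0, mitigation=0.75),
        Harm("Individual is wrongly labeled as high risk",
             likelihood=0.25, severity=4.0),
    ],
)

print(assessment.use_case, "->", assessment.decide())
```

Because the harms, controls and benefit estimate are all recorded in one place, an assessment like this doubles as the documentation a business could later point to when asked to justify its decision.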
Indeed, responsible companies already include such assessments in their organizational compliance programs. But more work needs to be done to refine and socialize the privacy-risk methodology for the big data context. Consensus still needs to grow around what kinds of harms need to be considered and how to quantify and weigh them. The same goes for the benefits. But that work is underway in many corners, from industry groups, EU legislators and U.S. government agencies to privacy think tanks, including the Centre for Information Policy Leadership’s work on the risk-based approach and a privacy-risk framework.
I’m with the panelists who see the huge potential of the risk-based approach for both delivering effective privacy protections and enabling the benefits of big data. If deployed properly, this approach will allow big data to get bigger safely.
The same can’t be said of the blanket application of concepts such as data minimization, collection and purpose limitations and consent—though they should remain available as possible controls for specific contexts following a risk analysis and may also remain relevant in some non-big data contexts. And the risk-based approach can help calibrate their implementation where they are appropriate. But, as cornerstones of data protection, they are fundamentally at odds with the promise of big data.
As one panelist said, the biggest risk is not using data enough. He’s right. Size matters in big data analytics. More data will result in more knowledge and improve the quality and integrity of big data outcomes. Less data will result in the opposite, as illustrated by the Boston pothole-finding app. If there’s one thing that became clear at the FTC workshop, it’s that there is a lot at stake in choosing the right path forward. Luckily, we have some very good ideas of where to go.