As the volume of consumer data grows, an increasing number of decisions previously made by humans are now made by algorithms. The number of data sources has multiplied, and so too have the types of data and the number of entities keeping and crunching it.
The past two years have brought continuous policy discussion around the benefits and challenges that accompany this growing use of big data analytics. The White House and the Federal Trade Commission released reports on big data and data brokers in early 2014. Since then, policymakers and wonks of all stripes have weighed in on the subject, frequently highlighting one of the most contentious topics raised by these studies: how to ensure that the increase in automated decisionmaking does not result in unfair, unethical, or discriminatory effects for consumers.
Early in these conversations, a coalition led by the Leadership Conference for Civil Rights released a set of civil rights principles for the era of big data, establishing broad guidelines for avoiding discriminatory impact in the use of big data. Washington white papers naturally followed: the Future of Privacy Forum partnered with the Anti-Defamation League to produce a report on using big data to fight discrimination and empower groups; Upturn wrote a report on the intersection of big data and civil rights; the President’s Council of Economic Advisers wrote about differential pricing; and the White House has promised a report on the implications of big data technologies for civil rights. Several groups convened on the topic, including the FTC, whose workshop eventually resulted in a report on the use of big data for inclusion and exclusion.
Most of these conversations have included calls for transparency into the algorithms that are used to process and make decisions about consumer data. Logically, disclosing more about the inputs and decision points in automated decisionmaking should enable consumers and policymakers to identify, react to, and correct detrimental results.
But algorithmic transparency is tricky.
Even if a company were to release a proprietary algorithm to the public, the task of understanding and reacting to it would be extremely difficult. Consumers and policymakers are unlikely to understand what an algorithm says or means; the algorithm would likely change continuously over time or in reaction to new data inputs; and it would be difficult to decide how to measure unfairness, whether by looking at inputs, outputs, decision trees, or eventual effects. These challenges may leave even companies that care deeply about avoiding discrimination unsure of what best practices really are.
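To make one of those measurement options concrete, here is a minimal sketch of an output-based check: it compares how often an automated decision comes out favorably for each group. The function names, the group labels, and the sample data are all hypothetical, invented for illustration; this is not a method drawn from any of the reports discussed here, and an unequal ratio is a prompt for closer review rather than proof of discrimination.

```python
from collections import defaultdict


def approval_rates(decisions):
    """Return {group: share of favorable decisions} from (group, approved) pairs."""
    totals = defaultdict(int)
    approved = defaultdict(int)
    for group, was_approved in decisions:
        totals[group] += 1
        if was_approved:
            approved[group] += 1
    return {group: approved[group] / totals[group] for group in totals}


def rate_ratio(rates):
    """Ratio of the lowest to the highest group rate; 1.0 means equal rates."""
    highest = max(rates.values())
    if highest == 0:
        # No group received a favorable decision, so the rates are equal.
        return 1.0
    return min(rates.values()) / highest


# Hypothetical decision log: (group label, whether the outcome was favorable).
sample = [
    ("A", True), ("A", True), ("A", False), ("A", True),
    ("B", True), ("B", False), ("B", False), ("B", False),
]

rates = approval_rates(sample)
ratio = rate_ratio(rates)
print(rates, ratio)
```

Even this simple check illustrates the difficulty: it looks only at outputs, says nothing about why the rates differ, and would need to be rerun every time the algorithm or its data changes.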
Last December, FTC Commissioner Julie Brill acknowledged the challenge of creating public-facing algorithmic transparency and called on companies to proactively look internally to identify unfair, unethical, or discriminatory effects of their data use. Brill urged companies using scoring models to “themselves do more to determine whether their own data analytics result in unfair, unethical, or discriminatory effects on consumers,” noting that this does not displace the need for transparency around data collection and that, in “addition to scrutinizing their own practices, companies can provide consumers with creative UIs to give consumers more meaningful, usable access to their data.”
Brill’s speech signals to companies navigating this landscape that, even if standardized best practices don’t yet exist, these questions actively matter to regulators and consumers, and that a good-faith internal effort to prioritize them is in order. Companies operating in areas covered by existing legislation, such as credit reporting, housing, and employment, may look to those laws for guidance, but Brill also called for legislative solutions to address the broader challenge.
Companies taking this approach would be well advised to consult the FTC’s recent workshop report for a guide to existing laws that may apply to big data practices.