Data minimization — a longstanding bedrock principle of privacy and data protection — has quietly become one of the most contentious issues in U.S. legislative privacy proposals.

Policymakers have begun experimenting with novel data minimization standards that seek to place default, substantive limits on the permissible purposes for which companies can collect and process personal data without relying on the traditional and much-maligned notice-and-consent paradigm.

This article examines recent legislative trends concerning data minimization, surveying the existing U.S. legislative landscape to highlight notable emerging standards and to reveal the tradeoffs and tensions between those standards with which policymakers must grapple.

The state landscape prior to 2024

Data minimization is rooted in the earliest iterations of U.S. privacy law, appearing in the Fair Information Practice Principles and the Privacy Act of 1974. At a high level, data minimization stands for the notion that a data controller should not collect more data than needed to accomplish a specific, identified and lawful purpose.

The EU General Data Protection Regulation, adopted in 2016, predates any of the comprehensive U.S. state privacy laws and includes data minimization as a principle under Article 5. Specifically, the GDPR provides that personal data must be "adequate, relevant and limited to what is necessary in relation to the purposes for which they are processed."

It is no surprise that as U.S. states began enacting comprehensive privacy legislation in the absence of federal action, these new laws tended to include data minimization requirements with language similar to the GDPR.

The majority of state comprehensive privacy laws — 13 of the 17 enacted to date — require controllers to limit the collection and processing of personal data to what is "adequate, relevant, and reasonably necessary" to achieve the purposes that are disclosed to a data subject. Any unnecessary or incompatible secondary uses of personal data under these regimes require separate, affirmative consent.

This rule can be labeled "procedural data minimization" because whether collection or processing may occur turns on whether the controller has taken the correct procedural step — adequately disclosing processing purposes — rather than on the substance of the processing activity. A key additional protection in most comprehensive state privacy laws is that controllers must obtain affirmative opt-in consent to collect and process personal data that is deemed "sensitive."

The California Consumer Privacy Act goes a step further than other state privacy laws on data minimization by way of its implementing regulations. Section 7002 of California's privacy rules provides that collection and processing of personal information must be limited to what is reasonably necessary and proportionate to achieve either: a disclosed purpose that is consistent with a person's reasonable expectations, another disclosed purpose that is compatible with the context in which the personal information was collected, or another disclosed purpose for which the business obtained the individual's consent.

Under California's rules, disclosures made by the business are a relevant but not dispositive factor in the test for whether a processing purpose is consistent with an individual's reasonable expectations. This rule represents a shift toward "substantive data minimization" because it grounds the question of whether processing can occur in the nature of the processing activity itself. Because disclosure remains an important factor in that test, however, the rule sits between procedural and substantive data minimization. For sensitive data, California allows individuals to opt out of unnecessary processing, but only if the sensitive data is collected or processed for the purpose of inferring characteristics about the individual.

Except in California, procedural data minimization — requiring that businesses identify specific processing purposes and not use data beyond what is necessary to accomplish those purposes — has become the prevailing standard in state comprehensive privacy laws.

Procedural data minimization is an improvement over a status quo in which companies could be incentivized to surreptitiously collect as much personal data as possible on the off chance it could become useful in the future. But there is also an argument that adherence to disclosures is already effectively mandated by the U.S. Federal Trade Commission, as seen in its long-standing enforcement actions.

Furthermore, scholars and advocates have long argued that procedural data minimization fails to account for structural power imbalances between individuals and companies. Under that view, rather than limiting vast and unnecessary collection, procedural data minimization merely requires that companies be transparent and adhere to their disclosures.

This argument has motivated attempts to legislate new, default standards for data minimization that are decoupled from both controllers' disclosures and individual consent.

Emerging standards

One milestone in the ascendancy of data minimization as a priority issue in privacy debates was the introduction of the American Data Privacy and Protection Act of 2022, a federal bill that passed out of committee but ultimately failed to receive a full vote in the U.S. House.

The ADPPA included a two-part data minimization rule whereby covered entities would have been permitted to collect, process and transfer covered data only where doing so was "limited to what is reasonably necessary and proportionate" to "provide or maintain a specific product or service requested by" an individual or to effect an enumerated "permissible purpose," such as maintaining data security.

For sensitive data, the ADPPA would have limited collection and processing to what is "strictly necessary" to provide or maintain a product or service or to effect a more limited set of permitted purposes.

The ADPPA's data minimization rule, tied to functionality of a product or service, is a substantive one. It was lauded by many privacy advocates who, in its wake, have increased public calls for both federal and state lawmakers to enact strong data minimization rules.

The ADPPA was not enacted, but it may have influenced legislation elsewhere. Similar necessity requirements have appeared in sectoral laws. For example, Washington state's My Health My Data Act, enacted in 2023, prohibits collection of consumer health data except with an individual's consent for a specified purpose or to the extent necessary to provide a product or service requested by the individual. The scope of that necessity standard is important given the MHMDA's rigorous consent requirements.

This form of data minimization has worked its way into comprehensive privacy legislation as well. In 2023-2024, lawmakers in Maine, Maryland, Massachusetts and Vermont introduced comprehensive privacy bills that included data minimization language similar to that in the ADPPA. Although the Maine bill was narrowly rejected by the Maine Senate, the Maryland Online Data Privacy Act, a law that has substantive data minimization rules at its core, was signed into law by Gov. Wes Moore, D-Md., on 9 May.

Influenced by the ADPPA, Maryland's privacy law establishes a new data minimization framework that imposes default limitations on the collection and processing of personal data. Controllers must limit collection of personal data to what is "reasonably necessary and proportionate to provide or maintain a specific product or service requested by the consumer."

Unlike the ADPPA, Maryland's privacy law limits the processing of personal data — that is, its use more broadly — to what is necessary to, or compatible with, the purposes disclosed to the individual, meaning any unnecessary or incompatible secondary uses of personal data also require separate, affirmative consent.

Maryland's privacy law includes heightened data minimization requirements for sensitive data. Controllers can only collect, process or share sensitive data when it is strictly necessary to provide or maintain a requested product or service, and Maryland's privacy law prohibits selling sensitive data entirely.

The passage of Maryland's privacy law represents another significant milestone in the rise of data minimization. This trend is also continuing at the federal level. The recently released discussion draft of the American Privacy Rights Act of 2024, the most significant federal proposal for comprehensive privacy legislation since the ADPPA, is largely based on the ADPPA and contains a similar substantive data minimization rule that would limit collection, processing, retention and transfer of covered data to what is reasonably necessary to provide or maintain a product or service or to effect one of several enumerated permitted purposes.

These permitted purposes include things like protecting data security, complying with legal obligations, conducting market research, and preventing and responding to fraud. The APRA would also create affirmative express consent requirements for transferring sensitive covered data to third parties and for collecting, processing, retaining or transferring biometric and genetic information — subject to some permitted purposes — which differs from the ADPPA's strictly necessary standard for processing sensitive data.

Policy tradeoffs and tensions

There are tensions and tradeoffs between these differing data minimization standards, and policymakers' choice of model will have broad consequences for individuals and for companies subject to these laws.

Default protections versus individual control. One of the arguments for substantive data minimization rules is that they will protect individuals and upend the traditional regulatory model of "privacy as individual control."

To many privacy advocates and scholars, procedural data minimization tied to disclosure does little more than enshrine a failing status quo into law, enabling companies to do whatever they want with personal data, no matter how harmful to the individual or orthogonal to the commercial relationship, so long as the business discloses what it is doing in a dense, rarely read privacy policy.

In contrast, these same advocates argue that rules tied to individual expectations — like that in California — or to offering a product or service — like those in Maryland's privacy law, the ADPPA or the APRA — could remedy the structural power imbalance between individuals making choices in the market and the companies offering products and services on take-it-or-leave-it privacy terms.

American privacy law tends to lionize individual control in the form of actionable rights, but scholars have long argued that a control-based model overwhelms individuals with excessive options. One potential way to address this is through the exercise of rights on a generalized basis, such as through technical measures like universal opt-out mechanisms.

But there is an understandable growing desire for default protections that take the onus off individuals entirely and, instead, limit how data can be collected and used. If that is the goal, then substantive data minimization — with a normative component limiting the purposes for which data can be processed — is an obvious vehicle for such protections.

There are two notable counterarguments to this. First, substantive data minimization rules like those in the APRA or Maryland's privacy law will remove choice from individuals and potentially deprive them of certain desired data uses and features.

Of course, advocates are likely to argue that individuals face little choice to begin with, beyond a binary decision of whether or not to use a particular product or service. And even under a data minimization rule like in Maryland's privacy law, individuals are still free to opt-in to using a particular feature, which then becomes part of the product or service being provided.

Another control-focused counterargument is that the vagueness of a substantive data minimization rule will empower companies, which may decide that certain processing activities, such as targeted advertising or selling data, are "necessary" to provide a product or service because the product or service would not exist without that income stream. This mirrors arguments in the EU about which lawful bases can be relied upon for behavioral advertising on social media.

But under procedural data minimization, controllers arguably have an even lower bar to legitimizing these kinds of practices, because they merely have to disclose that they are occurring — although processing sensitive data still requires opt-in consent under procedural data minimization.

Furthermore, this challenge can be rectified by including specific prohibitions — like the ban on selling sensitive data in Maryland's privacy law — opt-in or opt-out rights, or a clear statement in law that such activities are not reasonably necessary to provide or maintain a product or service.

Reasonable certainty and socially beneficial secondary uses. Another argument against substantive data minimization is that a procedural rule tied to disclosure provides businesses with reasonable certainty about permissible practices and enables businesses to engage in socially beneficial processing activities that may not be within an individual's reasonable expectations. These kinds of secondary uses include data sharing for research in the public interest, product development and launching of new features, and, critically, AI development.

There are several critical ambiguities that must be addressed in a substantive data minimization rule tied to providing or maintaining a product or service. What makes something "reasonably" or "strictly" necessary? What does it mean to provide or maintain a product or service? What does it mean for a product or service to be "specifically requested" by an individual? Are activities that would qualify as legitimate interests under the GDPR, such as fraud prevention or IT security, implicitly allowed as reasonably necessary to provide or maintain a product or service? If collecting and processing sensitive data is limited to what is strictly necessary to provide a product or service, would it be possible for businesses to process biometric information to verify customers? If so, can that be mandatory for the product, or must customers opt in to the feature?

The approaches taken in the ADPPA and the APRA have their own unique ambiguities and trade-offs, as well. Creating a list of permitted purposes for which businesses can collect, process and transfer personal data should provide more flexibility and ease objections about foreclosing legitimate business practices.

But, as scholar Joe Jerome, CIPP/US, identified in his analysis of the APRA's data minimization rule, this approach can be "much less flexible for industry." For example, a list of enumerated permitted purposes risks becoming outdated and underinclusive as new business needs arise that are not covered by existing carve-outs. Jerome highlights the GDPR's more adaptable "legitimate interests" balancing test as an alternative.

The challenge of scoping data minimization applies to all substantive data minimization rules, whether that be Maryland's privacy law, the ADPPA, the APRA or something else. To be workable, the rule will need to somehow anticipate and carve out legitimate, socially beneficial data uses and/or data uses that are required for businesses to function. Such processing purposes could be implicitly read as being necessary to provide or maintain a product, they could be enumerated in a list, or they could be covered by a legitimate interests-style balancing test.

Another argument against substantive data minimization rules like those in the ADPPA, Maryland's privacy law, and the APRA is that the "reasonably necessary" standard will allow enforcers to "second-guess" a business's operations by deeming collection or processing not reasonably necessary to provide or maintain a product or service.

Under the procedural data minimization standard, controllers maintain significant control to shape the bounds of legitimate processing via their disclosures, but enforcers still have some leeway to "second-guess" whether processing activities are beyond the bounds of what was disclosed, as the FTC has done for decades. A rule like that in Maryland's privacy law certainly increases the likelihood that an enforcer would find that processing is impermissible, but this is a difference of degree rather than kind.

Substantive data minimization's effects remain to be seen

Privacy advocates and scholars have long lamented the kind of notice-and-choice approach to privacy by which companies can collect, process and share personal data so long as they adequately disclose those activities.

Evidently, policymakers are listening. Maryland has broken new ground by enacting its law with new, stricter data minimization language.

But the proof of the pudding is in the eating. We will not know whether substantive data minimization truly offers something new until these requirements go into effect and are publicly enforced.

In the meantime, it remains to be seen whether other policymakers at the state, federal or administrative levels will follow Maryland's lead in crafting new "substantive" approaches to data minimization.