My dog recognizes me. From across a dog park full of people throwing balls and calling their pups, Spot never fails. She uses the poor eyesight, good hearing and excellent sense of smell she shares with all dogs. Playing in the park, she interacts with other dogs, communicating through smell and body language. But her loyalty always sees her home to me. Other dogs are, of course, just as dedicated to their owners.
While Spot can always identify me, I'm not worried about her infringing my privacy. She'll never disclose private information about me, my habits, whereabouts or friends. And while she can identify me in a group of 10 or 20 people, I wouldn't expect her to pick me out from a group of a million. This observation about Spot's limited identification capability offers a useful lens for understanding how facial recognition technology, when properly constrained, need not threaten privacy.
The legal landscape: 'Used to identify'
Over the past few years, several states have passed privacy laws protecting biometric information to the extent it is "used to identify" an individual. The Illinois Biometric Information Privacy Act defines biometric information as any information based on an individual's biometric identifier "used to identify" an individual. Notably, in July 2024, the U.S. Court of Appeals for the Ninth Circuit ruled that while the term "biometric identifier" in the BIPA is defined to include "a retina or iris scan, fingerprint, voiceprint, or scan of hand or face geometry," the very essence of these identifiers is to identify a person. In other words, a company that collects or creates a face geometry template would not come under the remit of BIPA unless it uses such a template to identify an individual.
While identification can mean different things to different people, experts in biometric privacy draw a critical distinction between one-to-one identification, often referred to as authentication, and one-to-many, referred to as recognition or identification. A one-to-one system answers the question, "Is this person who she claims to be?" Think of the facial recognition lock on an iPhone or the Global Entry kiosk at the border, enabling a machine to compare a traveler's photo to the image in their passport. A one-to-many system answers the question, "Who is this?" This is the remote biometric identification system referenced in the EU Artificial Intelligence Act. Clearly, biometric identification carries more risk than biometric authentication.
The identification spectrum: A new framework
In this article, we argue that rather than being a dichotomy, identification is a spectrum. Between the risky one-to-many systems, which can single an individual out in a large crowd, and the lower risk one-to-one systems, lie "one-to-a-few" systems, which, like Spot, can identify an individual, but only among a limited, small group. We explore the nature of identification and what it means for privacy. And we propose a more nuanced approach to biometric identification when matching within small populations, including specific thresholds for regulatory safe harbor that can encourage innovation without amplifying privacy risks.
The limits of recognition
Spot doesn't know my Social Security number. Nor can she store my home address, phone number, bank account balance, medical history or other sensitive information. While she would pick me out in a lineup, she could not append to my identity any information of use to a third party. Indeed, I'm comfortable even if she sees my intimate moments. She never tells anyone about my stepping out of the shower undressed. Beyond identifying me, Spot cannot leak any personal information. She is, by her nature, a secure system.
There is also a pragmatic limit to her identification skills. Anyone wishing to use Spot to identify me would first need to narrow the field from 8 billion humans down to a much smaller population. Police perform a similar task when they present a lineup to an eyewitness. They do 99.9999999% of the identification — narrowing the suspects from 8 billion people to six or eight. While I may like to believe that even if every human were in the park, Spot would still find me, the reality is that her dealing with no more than a dozen people helps us get home on time.
Edge computing: Privacy by design
A new class of AI is emerging that employs privacy-by-design principles, enabling automated systems to provide convenient and practical applications without the broader privacy risks often associated with facial recognition. These constrained systems, which are less capable by design, operate in isolation and are effective at facial recognition against only a small pre-selected population of users. They can unlock the front door or set the climate control in your house without compromising your privacy.
These technologies are called edge services, an architecture that is the antithesis of "cloud." Cloud systems, like AWS, Microsoft Azure or GCP, centralize all computation in massive data centers. In cloud architectures, connected devices like smartphones or Internet of Things video doorbells transmit data to cloud servers and receive instructions from centralized processing centers. In contrast, edge architectures emphasize distributed computing, where data processing is completed at or near the source. An edge video doorbell can, for example, autonomously decide to unlock a door based on identifying a user's face, without reporting to or checking with a cloud server.
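To make the contrast concrete, here is a minimal sketch in Python of the decision loop such an edge doorbell might run. All names are hypothetical and the matcher is a toy stand-in; the point is that the decision is reached, and the frame discarded, without any network call.

```python
# Hypothetical sketch of an edge doorbell's decision loop. The matcher is a
# toy stand-in; real devices compare face embeddings on an on-device chip.

def local_face_match(frame: bytes, enrolled_templates: list) -> bool:
    """Stand-in for an on-device matcher against locally stored templates."""
    return frame in enrolled_templates

def edge_doorbell(frame: bytes, enrolled_templates: list) -> str:
    # All computation happens on the device itself: the frame never leaves
    # the doorbell, and no cloud server is consulted before acting.
    if local_face_match(frame, enrolled_templates):
        return "unlock door"
    return "do nothing"  # the frame is simply discarded on device

# A cloud doorbell would instead do something like:
#   response = https_post("https://vendor.example/api/recognize", frame)
# shipping the raw image off-device before any decision is made.

print(edge_doorbell(b"alice-face", [b"alice-face", b"bob-face"]))    # unlock door
print(edge_doorbell(b"stranger-face", [b"alice-face", b"bob-face"])) # do nothing
```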
Several factors make this architecture viable. Following Moore's Law, silicon chips are now smaller and more capable than ever. Some are so small they can fit on a fingertip, a far cry from the energy- and heat-intensive systems that powered the first generation of AI. More than 50 companies are currently developing so-called "AI accelerator" chips. Simultaneously, researchers and scientists are reducing the computational requirements of algorithms to accomplish accurate results with fewer resources. Finally, manufacturers are combining these capabilities into "smart devices," simplifying development of automated systems by integrating all necessary components in one device. No cloud server connection is required; all computations remain on device.
Privacy protection through technical constraints
Restricting a biometric system's output is an essential method for ensuring privacy. Edge devices present an opportunity to reshape norms around data collection. A system capable of operating autonomously does not need to share data with external parties. My microwave has a convenient popcorn button but does not share my snacking preferences with my insurance carrier. Like Spot, these devices are architecturally incapable of sharing user data. However, many modern devices connect to the internet for legitimate reasons. A video doorbell, for example, can notify a smartphone when there's a delivery. Therefore, we cannot trust these systems implicitly. We need explicit regulatory frameworks to ensure that a user's willingness to share limited data for specific purposes doesn't result in unauthorized data harvesting.
Smartphones exemplify devices that connect to the internet while remaining capable of edge computing. Apple has built its brand around privacy protection. The iPhone uses edge computing in its Face ID feature and courts have accepted Apple's argument that this approach complies with BIPA's stringent requirements. Moreover, consumers have embraced the feature as their primary means of unlocking their devices.
Edge devices have a second fundamental privacy-by-design feature. Just as their isolation limits their output, so too does it limit their inputs. A biometric recognition system has three major components. The first is a gallery of known individuals within which it searches. The second is its search or matching algorithm. The third is the biometric feature extractor, which generates the digital finger-, voice- or face-print. Recognition systems take an image as input, for example, a photo captured at the front porch; use the biometric feature extractor to generate a face-print; and then compare the face-print against the gallery. If there is a match, the system reports the corresponding person from the gallery. If there is no match, the recognition system reports "no match."
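The following toy sketch, with hypothetical names, shows these three components in miniature. A real extractor produces an embedding that is matched by similarity threshold; an exact hash is used here only to keep the sketch self-contained.

```python
# Toy sketch of the three components described above; all names hypothetical.

import hashlib

def extract_print(image: bytes) -> str:
    """Feature extractor: turns an image into a fixed-size face-print.
    (Real systems run a face-embedding model, not a hash.)"""
    return hashlib.sha256(image).hexdigest()[:16]

# Gallery: the only identities the system can ever report.
GALLERY = {
    extract_print(b"enrollment-photo-alice"): "Alice",
    extract_print(b"enrollment-photo-bob"): "Bob",
}

def recognize(image: bytes) -> str:
    """Matcher: compare the probe's face-print against the gallery."""
    probe = extract_print(image)
    return GALLERY.get(probe, "unknown visitor at the front door")

print(recognize(b"enrollment-photo-alice"))  # -> Alice
print(recognize(b"photo-of-a-neighbor"))     # -> unknown visitor at the front door
```

Whatever the input, the system can only ever report a gallery member or "unknown visitor"; the gallery bounds its vocabulary, which is precisely the property the next paragraphs rely on.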
Just as restricting the output of a biometric system can mitigate privacy risk, so too does restricting the input to an edge system. For example, a video doorbell "knows" the members of a family, because they were explicitly added to its gallery. However, it doesn't recognize neighbors — or anyone else among 330 million Americans. No matter how often a neighbor visits, when they ring the doorbell, they're reported as an "unknown visitor at the front door," unless explicitly added to the gallery. If police seized this doorbell and used it in a lineup, it would produce eight "unknown visitor" messages. It would not identify the neighbor, even if it had recorded them many times. Only if the actual residents were included in the lineup, or the system erroneously matched someone who resembled a resident, would it provide an identity.
The doorbell is fundamentally limited by its input. It knows only the explicitly enrolled household members. Everyone else is a stranger. These privacy protections are architectural. The system's limited gallery means it cannot be repurposed for broader surveillance.
A third privacy feature emerges from the miniaturization of these systems. A Social Security number uses nine digits to uniquely identify someone among 330 million Americans. But most households have fewer than 10 members. Correctly identifying an individual within a smaller population requires less information. Edge system engineers make a feature out of this limitation.
To miniaturize from large cloud servers to small edge devices, biometric recognition systems incorporate trade-offs that reduce their information load. Gallery, search and biometric print components are all compressed to a level suitable for finding one member in a group of 10 or 100, but not 1,000 or a million. For that, one would need the computational power of cloud infrastructure. Think of an edge biometric system as reporting just the last digit of a Social Security number, or a person's first initial. A first initial might work for separating groceries in a shared house, but if that is the only "print" available, it would match 15 million Americans — hardly a privacy threat.
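The arithmetic behind this trade-off is straightforward: distinguishing one person among N requires on the order of log2(N) bits of identifying information. A quick back-of-the-envelope calculation makes the scale gap concrete.

```python
# Rough information-theoretic arithmetic: telling one person apart within a
# population of N takes at least ceil(log2(N)) bits of identifying signal.

import math

for population in [10, 100, 1_000_000, 330_000_000, 8_000_000_000]:
    bits = math.ceil(math.log2(population))
    print(f"{population:>13,} people -> at least {bits} bits")

# Output: 4, 7, 20, 29 and 33 bits respectively. A household-scale print
# needs ~4 bits; a U.S.-scale one needs ~29. A first initial carries under
# 5 bits, which is why it matches some 15 million Americans rather than
# identifying anyone.
```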
The specialized and disaggregated nature of edge systems also provides inherent privacy protection. Like Spot, the smart doorbell has no awareness of users' Social Security numbers or other identifying information. The smart doorbell carries little privacy risk because it can't identify someone unless one already knows who they are. While service providers who know a user's credit card number could theoretically cross-correlate multiple systems, with edge AI such linkage is not technically feasible. Data can be ring-fenced, containing privacy risks at the source.
Context-dependent recognition: Why scale matters
Facial recognition and other biometric identification systems can be technically limited in ways that minimize their privacy risk. Limiting input, restricting output, and constraining system size — each presents an additional layer of privacy protection. Can such design constraints be combined to create legally meaningful distinctions from a policy perspective? We believe so.
Recognizing an individual within a very small population is fundamentally different from recognizing an individual in a population of millions. While this exists on a spectrum, at the extremes there are clear and critical differences, with policy implications, between singling a person out in a group of fewer than 100 people, for example, and doing so in a population of 1 million.
Policy frameworks already acknowledge the difference between biometric authentication and biometric recognition or identification. However, authentication and recognition are not a dichotomy. Recognition systems purport to answer, "Who is this?" but edge systems with appropriate design constraints can only answer, "Who is this among the people I already know?" Limiting a biometric system to known individuals has major policy implications.
Lack of clarity about privacy outcomes has led to regulation of all methods of recognition, and in some jurisdictions, even authentication. This broad-brush approach leaves open potential avenues for misuse, while simultaneously restricting access to beneficial technologies with minimal privacy risks. Ambiguity in legislation about what constitutes "identification" is a principal source of confusion for system designers.
Consider how a biometric system can accomplish a one-to-many task through iterative one-to-one methods, much as someone might try multiple email addresses to log into an account. Regulatory frameworks that ignore this spectrum result either in overly rigid rules, where one-in-two matching is treated the same as one-in-8-billion, or in loopholes that create regulatory gray zones.
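A minimal sketch, again with hypothetical names, shows why the dichotomy collapses: a one-to-many search is simply a one-to-one check repeated across a gallery.

```python
# Hypothetical sketch: one-to-many identification built by iterating a
# one-to-one verification primitive over the gallery.

def verify(probe_print: str, enrolled_print: str) -> bool:
    """One-to-one: does the probe match this single enrolled template?
    (Real systems score similarity against a threshold; equality is a toy.)"""
    return probe_print == enrolled_print

def identify(probe_print: str, gallery: dict):
    """One-to-many, answered by repeating the one-to-one question."""
    for name, enrolled_print in gallery.items():
        if verify(probe_print, enrolled_print):
            return name
    return None

gallery = {"Alice": "print-A", "Bob": "print-B"}
print(identify("print-B", gallery))  # -> Bob: recognition via repeated authentication
```

The privacy character of such a system is set by the size of the gallery it iterates over, which is why the scale thresholds proposed below matter.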
Policy recommendations: A safe harbor proposal
Policy frameworks can foster innovation while protecting privacy by recognizing that the privacy risk of a small-scale system is fundamentally different from that of a large one. Many practical applications of face recognition sit between the "one-to-one" and "one-to-everyone" extremes. Consider that 98.8% of U.S. households have fewer than seven persons. To function well in that environment, a system needs to operate on no more than a one-in-10 scale. Similarly, 89% of U.S. businesses have fewer than 100 employees. A biometric system singling out an employee in a group of 100 is far removed from one capable of picking a person from a group of a million. Systems that truly operate on a one-to-330 million, or 8 billion, scale are unique — and raise specific challenges that a smart doorbell need not solve.
Nevertheless, an adversary could attempt to link the limited data stored on an edge system to data in other databases. In combination with other capabilities, a system that is incapable of identifying an individual on its own may still pose privacy risks. For example, someone might consent to use their phone's precise geolocation data to determine that they left home without locking the door, but such consent need not extend to combining their face print and location data to determine their identity. By restricting user consent to data collection for a specified purpose, we can prevent further misuse of data. Purpose specification should be an integral part of biometric regulation.
Systems that are capable of biometrically matching consenting individuals within populations of 100 or fewer, combined with on-device edge processing, present a privacy safe zone. Provided that such systems do not link data to other systems, a regulatory safe harbor would provide room for innovation that benefits consumers and businesses without compromising privacy. After several years of litigation around what it means for a biometric system to "identify an individual," we urge the regulatory and technology communities to recognize the spectrum between the extremes of "one-to-one" and "one-to-everyone" identification as an important framework for balancing innovation and privacy.
Omer Tene is a partner at Goodwin.
Lars Oleson is the CEO and co-founder of Xailient.