From weirdly catchy fake pop singles to alarming deepfakes of global leaders, voice cloning technologies are beginning to make their mark on our daily lives.
Cloning is the process of using machine learning techniques to generate synthetic audio that convincingly mimics a real person's voice. It is a special type of speech synthesis, conditioned not only on linguistic features like words, but also on the identity of the speaker based on factors such as pitch, speech rate and accent. The feasibility of generating audio deepfakes has increased drastically in recent years, with some systems reportedly requiring only a few seconds of real recorded audio to generate convincing output.
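To make the conditioning idea concrete, here is a minimal, purely illustrative sketch in Python. The helper names and the toy computations are hypothetical stand-ins for the trained neural networks a real voice-cloning system would use; the point is only to show how a short reference recording is distilled into a speaker embedding that then conditions synthesis alongside the text.

```python
import numpy as np


def extract_speaker_embedding(reference_audio: np.ndarray, sample_rate: int) -> np.ndarray:
    """Stand-in for a speaker encoder: summarize a short reference clip into a
    fixed-length vector meant to capture traits like pitch and speech rate."""
    frame = sample_rate // 100  # 10 ms frames
    usable = len(reference_audio) // frame * frame
    frames = reference_audio[:usable].reshape(-1, frame)
    energy = frames.std(axis=1)
    signs = np.signbit(frames).astype(np.int8)
    zero_crossings = (np.diff(signs, axis=1) != 0).mean(axis=1)  # rough pitch proxy
    return np.array([energy.mean(), energy.std(), zero_crossings.mean(), zero_crossings.std()])


def synthesize(text: str, speaker_embedding: np.ndarray, sample_rate: int = 16_000) -> np.ndarray:
    """Stand-in for an identity-conditioned synthesizer: the output depends on
    both the linguistic content and the speaker embedding."""
    duration = 0.08 * len(text.split()) + 0.2              # crude length from word count
    t = np.linspace(0.0, duration, int(duration * sample_rate), endpoint=False)
    base_pitch = 100.0 + 400.0 * speaker_embedding[2]      # "pitch" driven by speaker traits
    return 0.1 * np.sin(2 * np.pi * base_pitch * t)        # placeholder tone, not real speech


if __name__ == "__main__":
    sr = 16_000
    reference = np.random.default_rng(0).standard_normal(3 * sr)  # ~3 s stand-in for real audio
    embedding = extract_speaker_embedding(reference, sr)
    output = synthesize("please transfer the funds today", embedding, sr)
    print(embedding.round(3), output.shape)
```

In an actual cloning system, both stand-ins would be large learned models and the output would be a waveform that sounds like the reference speaker rather than a placeholder tone, but the division of labor is the same: one component captures who is speaking, another decides what is said.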
The increasing availability of highly convincing identity-conditioned synthetic speech generation tools — and related voice conversion tools — has fueled concerns from regulators and policymakers around the world. While much of this concern is based on issues of misinformation and fraud, privacy issues lie at the heart of any unauthorized deepfake.
So why are voice clones a privacy issue?
- The ability to commandeer a person's voice is a direct threat to their individual autonomy. For most people, our voice is the most direct way we communicate our thoughts, feelings and preferences. In many situations, it is also how we authenticate ourselves, in both the technical and social sense. Losing control over such an intrinsic part of ourselves can lead to direct privacy harms.
- The loss of control of our own uniquely personal artifacts also erodes our human dignity. Our voice is not just a piece of data; it is a reflection of who we are and how we want to be perceived. Voices are also sources of personal and cultural value, conveying our heritage, identity and other markers of belonging. Manipulation of our voice could allow others to encroach on our personal lives.
- Questions of provenance and the accuracy of personal information have always been central to data protection law. Our voice is a type of personal information that can be used to infer sensitive information about us. If someone can create or alter our voice without our knowledge, they can potentially misrepresent us, expose us or exploit us.
As with any generative AI application, the fact that privacy interests are implicated does not necessarily mean privacy regulators are the ones best suited to tackle the issue. Nevertheless, data protection and consumer protection regulators alike have shown a willingness to use the authorities they have to address emerging threats from generative AI.
This week, four U.S. senators on the Senate Committee on Banking, Housing and Urban Affairs, led by Chair Sherrod Brown, D-Ohio, sent a letter urging Consumer Financial Protection Bureau Director Rohit Chopra to take action to stop voice clones in the consumer financial sector. The letter focuses on the "new, threatening dimension" that voice cloning adds to a wide variety of common financial scams: "Hearing trusted voices amplifies the risks of consumers falling victim to scams."
The senators also raised concerns that voice cloning enables security breaches by tricking voice authentication systems, which are frequently used in the financial sector. This builds on a set of letters that Sen. Brown sent to six large banks asking for information about how they are responding to the changing threat landscape of voice authentication.
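To see why cloning undermines voice authentication, consider the accept/reject logic at the heart of a typical speaker-verification check. The sketch below is a simplified illustration with toy embeddings, not any bank's actual system; the threshold comparison is the part that matters, because a synthetic voice whose embedding lands close enough to the enrolled voiceprint is accepted just like the genuine speaker.

```python
import numpy as np


def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


def verify(enrolled: np.ndarray, probe: np.ndarray, threshold: float = 0.75) -> bool:
    """Accept the caller if the probe embedding is close enough to the enrolled voiceprint."""
    return cosine_similarity(enrolled, probe) >= threshold


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    enrolled = rng.standard_normal(192)                    # voiceprint captured at enrollment
    genuine = enrolled + 0.3 * rng.standard_normal(192)    # same speaker on a later call
    clone = enrolled + 0.5 * rng.standard_normal(192)      # convincing synthetic mimic
    impostor = rng.standard_normal(192)                    # unrelated speaker

    for name, probe in [("genuine", genuine), ("clone", clone), ("impostor", impostor)]:
        sim = cosine_similarity(enrolled, probe)
        print(f"{name:9s} similarity={sim:.2f} accepted={verify(enrolled, probe)}")
```

The toy numbers here are illustrative assumptions, but they show the structural weakness: the system distinguishes strangers from the enrolled speaker, yet it has no way to distinguish the enrolled speaker from a sufficiently faithful clone.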
If the CFPB acts on the congressional request, it will join the Federal Trade Commission, which is already actively engaged on deepfakes.
More than three years ago, the FTC convened a workshop on voice cloning technologies titled "You Don't Say." As it happens, Rohit Chopra, then an FTC commissioner, opened the workshop with remarks about the evolving privacy perils brought about by losing control of our biometrics:
"Privacy is clearly now a national security issue and a personal security issue. It's not just about surveillance of our movements and our social interactions. New technology can now allow us to clone what we thought was uniquely ours, our biometrics. From our fingerprints to our faces, losing control of our own biometrics poses another level of peril. When this happens, deep fakes, disinformation and distrust will accelerate. We'll need to determine how to control this technology and keep it out of the wrong hands."
Although no voice cloning enforcement actions have been announced, the FTC has continued to engage on the issue. This March, the agency released a consumer alert warning specifically about the use of voice clones in "family emergency scams." And the FTC's new policy statement on biometrics was careful to include voice recordings within its definition of biometrics worthy of enhanced privacy protections, along with any data derived from such recordings "to the extent that it would be reasonably possible to identify the person from whose information the data had been derived."
Our expanding understanding of the potential for misuse of voice recordings comports with the expansion of regulators' definitions of biometric data, and with the growing set of privacy requirements to which such data is subject. New technical measures are emerging to detect synthetic speech, but ongoing iteration in generation and detection will no doubt keep us locked in an arms race for some time.
As with so much in privacy, the best defense is a good offense. Data minimization, access controls and the application of masking techniques to audio recordings could all play a part in reducing the risk of misuse. Until new laws are on the books, regulators will continue deploying the tools they have, including privacy rules, to combat deepfake harms.
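As one concrete illustration of masking, the sketch below perturbs the pitch of a stored recording and adds faint noise so the audio is less useful as raw material for cloning while remaining intelligible for review. It uses librosa's pitch-shift utility on a synthetic stand-in signal; the parameter choices are illustrative assumptions rather than a vetted defense, and masking would sit alongside minimization and access controls rather than replace them.

```python
import numpy as np
import librosa


def mask_recording(y: np.ndarray, sr: int, semitones: float = 1.5, noise_db: float = -35.0) -> np.ndarray:
    """Shift pitch by a small random amount and add faint broadband noise."""
    shift = np.random.uniform(-semitones, semitones)
    masked = librosa.effects.pitch_shift(y, sr=sr, n_steps=shift)
    noise = np.random.standard_normal(len(masked))
    noise *= 10.0 ** (noise_db / 20.0) * np.max(np.abs(masked))
    return masked + noise


if __name__ == "__main__":
    sr = 16_000
    t = np.linspace(0.0, 2.0, 2 * sr, endpoint=False)
    y = 0.3 * np.sin(2 * np.pi * 220.0 * t)   # stand-in for a real call recording
    masked = mask_recording(y, sr)
    print(masked.shape, float(np.max(np.abs(masked))))
```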
Here's what else I'm thinking about:
- President Joe Biden nominated Andrew Ferguson and Melissa Holyoak to fill the FTC's two vacant Republican seats. Ferguson and Holyoak currently serve as the solicitors general of their home states of Virginia and Utah, respectively. FTC Chair Lina Khan released a statement welcoming her future colleagues: "The Commission operates best at full strength, and I will look forward to working with them to fulfill the important mandate Congress has given us."
- Two steps closer to adequacy for EU-U.S. data transfers. This week, U.S. Attorney General Merrick Garland designated the European Union, Iceland, Liechtenstein and Norway as "qualifying states" for purposes of the new redress mechanism under Executive Order 14086. This activates the redress mechanism process for data subjects in these European Economic Area countries. At the same time, the Office of the Director of National Intelligence published policies and procedures from each U.S. intelligence agency implementing the privacy and civil liberties safeguards in Executive Order 14086. The EU adequacy decision is expected as early as next week.
- Oregon and Delaware joined the state privacy party. Apparently not wanting to let Republican states have all the fun, these two blue states closed out their legislative calendars by sending comprehensive consumer privacy bills to their governors' desks. The map of U.S. states with consumer privacy protections has filled in considerably this year. More analysis is needed to fully incorporate this new normal into baseline practices for organizations serving U.S. consumers.
- Guidance on the application of Washington's My Health My Data Act. With only a few months to go before the MHMDA's geofencing ban goes into effect, the Washington attorney general published a set of nonbinding FAQs to clarify a few questions about the application of the law. The guidance covers issues like the extraterritorial impact of the law, but leaves some questions unanswered.
Please send feedback, updates and voice memos to cobun@iapp.org.