ANALYSIS

New study maps the privacy gap in consumer AI — and proposes a fix

A new academic study offers a comprehensive attempt to map the gap between the confidentiality consumer chatbot users expect and the confidentiality they actually receive.

Contributors:

Théodore Christakis

Chair, Legal & Regulatory Implications of AI, Multidisciplinary Institute in AI

University of Grenoble Alpes

Consumer chatbots have become the world's most trusted strangers. Every day, hundreds of millions of people confide health symptoms, legal strategies, financial anxieties, relationship crises and moments of acute emotional distress to systems that feel private but are not governed by anything resembling professional secrecy. 

The interface invites intimacy; the fine print reserves broad rights most users will never read.

A new academic study, "You Trust Your Chatbot With Everything. Should You? Part 1: How the Controller Uses Your Chat Data," offers the first comprehensive attempt to map the gap between the confidentiality users expect and the confidentiality they actually receive. Through a comparative policy-and-interface analysis of five major consumer chatbots — ChatGPT, Gemini, Claude, Grok and DeepSeek — the study examines the internal boundary: how providers may reuse conversations for training, review them through human annotators, monetize them through advertising and share them across operational and ecosystem channels.

The focus is deliberately on everyday consumer use, not enterprise or business offerings, which typically include stronger contractual and technical protections. The findings do not reveal a landscape of abuse, but they do reveal a landscape of structural opacity. And they point toward a concrete proposal that the privacy community should take seriously: Sealed Mode.

Five findings privacy professionals need to know

The study examines decision points that together define the privacy risk profile of everyday chatbot use. The combined picture produces five principal findings.

1. Every major provider now trains on consumer chat data by default. 

With Anthropic's reversal of its prior usage policy in September 2025, the last holdout among major providers has fallen. A Stanford Human-Centered Artificial Intelligence study confirmed the pattern across a broader six-provider sample.

Opting out is possible, but the paths vary in visibility and the meaning of "opt out" is rarely absolute. Google effectively forces users into a trade-off: Opt out of broader reuse or keep the basic convenience of chat continuity. Turning off "Keep Activity" generally prevents chats from being saved, making it hard to resume ongoing threads — work drafts, complex questions or sensitive personal issues — and pushing users to keep activity on even if they would prefer not to contribute their chats to improvement uses. 

At least two other providers allow a single thumbs-up or thumbs-down click to override an account-level training opt-out for the entire associated conversation without warning the user. In one case, opting in to training extends backend data retention from 30 days to five years — although user-initiated deletion of conversations overrides the extended period — making the training choice difficult to disentangle from the retention choice. 

2. Human review is structural, not exceptional.

Every provider preserves the ability for humans to access consumer conversations, whether for safety, quality, abuse enforcement or support. Only one AI assistant, Gemini, places a prominent, candid warning in its consumer-facing interface: "Don't enter confidential information or any data you wouldn't want a reviewer to see." 

That warning is a transparency benchmark, but it also reveals the gap between the intimacy that chatbot interfaces invite and the confidentiality they actually provide. Reviewed chats can be retained for up to three years after the user deletes them. In one case, flagged conversation content may be retained for up to two years, while the associated safety-classifier scores — numerical outputs, not the conversations themselves — may persist for seven years.

3. Advertising has entered the chat.

In February 2026, OpenAI began testing ads in ChatGPT for logged-in adult users in the U.S. on the Free and Go subscription tiers. Ad personalization is enabled by default and, where memory is also active, draws on past chats and stored memories to select ads. 

Others position themselves as ad-free: Anthropic's USD8 million Super Bowl campaign dramatized the point, while Google DeepMind CEO and co-founder Demis Hassabis confirmed at Davos that Gemini has "no plans whatsoever" for ads. The open question is whether this posture will remain durable as business models evolve and monetization incentives grow.

4. 'No sale' does not resolve the full transparency question.

All five providers make some version of a "no sale" commitment, typically framed in U.S. state law terms. These commitments are genuine and important: They mean that user data is not transferred for third parties' independent commercial use. 

But a no-sale statement does not, by itself, describe the full scope of who may access conversation data within the provider's operational supply chain. A provider that maintains a strict no-sale posture will still share chat-related data with a range of processors: cloud infrastructure, analytics services, human-review vendors, safety tooling providers and customer-support systems. 

Much of this sharing serves privacy-protective purposes, such as safety auditing and abuse detection. The processors involved are typically bound by contractual and statutory constraints, including under the EU General Data Protection Regulation's Article 28 framework, that limit their use of the data to the controller's documented instructions. These are real safeguards. 

The concern identified in this study is not that operational sharing occurs, but that it remains under-disclosed: Users typically cannot assess who may access their conversations, under what constraints and for how long. For two providers, Gemini and Grok, the chatbot is not standalone but combined by design with broader service ecosystems, where cross-service data flows go beyond processor-based operational sharing and may involve separate or joint controllership. In these designs, data disclosed in a chat can travel to other services, often automatically, involving information that one provider's own Connected Apps documentation describes as potentially sensitive.

5. Memory is the emerging fifth boundary.

Several providers now offer persistent personalization features that build longitudinal user profiles from accumulated chat interactions. These cut across every dimension examined in the study: They create new training signals, new surfaces for human review, new inputs for advertising personalization and new data stores that can be compelled in discovery or exposed in a breach. 

As chatbots evolve from stateless tools into persistent assistants that remember a user's medical history, professional concerns and emotional patterns across months or years, the governance of memory will become a distinct and increasingly urgent privacy problem.

This is not the search engine debate all over again

A natural objection is that consumer chatbot privacy raises nothing fundamentally new. The study addresses this head-on and identifies at least five structural divergences that separate chatbots from search engines. 

The most important structural divergence is disclosure depth. A search query reveals a topic. A chatbot conversation reveals a life. A user who types "I've been having headaches and nausea for three weeks. I'm terrified it might be something serious. I haven't told my wife because she's already stressed about her mother's diagnosis" is not searching. They are confiding. The privacy framework must be proportionate to the disclosure.

Other divergences compound the problem: The interface is designed to feel like a conversation with a trusted interlocutor; retention covers rich, structured narratives rather than keyword strings; training creates a feedback loop with no search-engine equivalent — the model can memorize and, depending on the provider's training process and mitigations, potentially reproduce inputs; and human review exposes detailed, emotionally charged narratives rather than isolated queries. 

A recent Stanford/Yale study demonstrated that leading closed models can be strategically prompted to reproduce thousands of words from their training data, confirming that memorization is not a theoretical concern.

What do users actually expect?

The empirical base remains thin, but what evidence exists points consistently in one direction. A 2024 Consumer Reports survey of over 2,000 U.S. adults found that nearly half, 45%, believe chatbot companies should never store health-related information at all, and that only 5% accept sale or sharing for purposes affecting the consumer, such as targeted advertising.

A Deloitte survey of nearly 4,000 U.S. consumers found that 62% of generative AI users were willing to discuss personal medical topics with a chatbot, yet the same respondents identified data privacy and security as the primary condition for trusting the technology. A U.K. study found that 76% of large language model chatbot users lacked a basic understanding of the privacy risks involved in their interactions.

The pattern is clear: Users are confiding more in more sensitive domains while their understanding of what happens to those disclosures remains poor and their trust is declining. The appropriate response is not fatalism. It is better transparency, better defaults and better design.

What the chatbots themselves admit

To illustrate how these risks are communicated in practice, I posed the same question to all five chatbots: What is the risk that unpublished academic ideas discussed in a consumer chat session could leak to other users? The full responses are reproduced in an appendix to the study.

The results are more unsettling than the reassuring opening lines suggest. All five begin by characterizing the risk as "very low" or "extremely low." But none stops there. ChatGPT warns that sharing "a novel framing, taxonomy, catchy terminology" creates a "small chance a future answer to someone else echoes some of that framing." DeepSeek is the most candid: It advises users to "adopt the mindset that anything you type could potentially be read by others" and not to share an idea they "wouldn't be comfortable seeing on a public blog or in a competitor's grant proposal."

The most telling finding: Every chatbot, having reassured the user, proceeded to recommend a battery of protective measures that collectively contradicted the reassurance. Call your novel concept "Concept X." Move sensitive work to local offline models. Treat the chatbot as a "public-access tool." When the product itself advises users to withhold their best ideas, the gap between expected and actual confidentiality is no longer a matter of inference. It is stated in the chatbot's own words.

From warnings to architecture: Introducing Sealed Mode

Across all four dimensions of the study, one finding recurs: The privacy protections available to consumer chatbot users are overwhelmingly promise-based. They depend on policy language, toggle settings and stated commitments that users must take on trust and that providers can revise, qualify or override through operational design. The gap between expectations and protections is starkest where the stakes are highest: health, mental well-being, legal consultations and crisis-adjacent disclosures.

Millions of people already use chatbots as functional substitutes for protected relationships. They do so not because the chatbot promises medical or legal confidentiality, but because the interface feels confidential. Warning labels cannot close this gap on their own. Telling users to "not share sensitive information" while the product design continues to invite disclosure through fluency, continuity and a personalized counsel-like tone is, as the European Data Protection Board's "Report of the work undertaken by the ChatGPT Taskforce" makes clear, an incomplete response that risks shifting compliance responsibility to users.

This is why the study proposes Sealed Mode as its centerpiece privacy-by-design recommendation. The idea: Stop treating all conversations as equivalent. For topics that predictably lead to highly sensitive disclosures, providers should offer at least one clearly labeled pathway where the default architecture materially constrains downstream reuse and insider access. A workable starting point is a dedicated lane for health, mental health and well-being — for example, "Health & Wellbeing — Sealed Mode" — combining six default protections (a configuration sketch follows the list):

1. No training or model-improvement use of sealed conversations, with narrow safety exceptions defined up front.

2. Siloed, purpose-bound personalization that never leaks into the general chatbot experience and is viewable and deletable by the user at any time.

3. No advertising surfaces in the sealed lane, and no use of sealed-lane content as an advertising or ad-personalization signal.

4. Strict retention bounds with rapid deletion by default.

5. Minimized routine human access, with escalation only by exception and all escalations criteria-driven and auditable.

6. Stronger access governance — least-privilege controls, just-in-time access, immutable audit logs and periodic independent review.
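
To make the six defaults concrete, here is a minimal sketch of how a provider might encode them as a machine-readable policy object. It is illustrative only, assuming a hypothetical schema: the field names, values and the Python representation are assumptions made for this article, not any provider's actual configuration.

```python
from dataclasses import dataclass

# Illustrative sketch only: field names and values are assumptions made for this
# article, not any provider's actual configuration schema.

@dataclass(frozen=True)
class SealedModePolicy:
    lane: str = "health_and_wellbeing"

    # 1. No training or model-improvement use; narrow safety exceptions defined up front.
    training_and_improvement_use: bool = False
    safety_exceptions: tuple[str, ...] = ("imminent_harm_review",)

    # 2. Siloed, purpose-bound personalization, viewable and deletable by the user.
    personalization_scope: str = "sealed_lane_only"
    profile_user_viewable_and_deletable: bool = True

    # 3. No advertising surfaces and no ad-personalization signal from sealed content.
    advertising_surfaces: bool = False
    ad_personalization_signal: bool = False

    # 4. Strict retention bounds with rapid deletion by default.
    max_retention_days: int = 30

    # 5. Minimized routine human access; escalation by exception, criteria-driven and auditable.
    routine_human_review: bool = False
    escalation_requires_documented_criteria: bool = True

    # 6. Stronger access governance: least privilege, just-in-time access, immutable audit logs.
    access_model: str = "least_privilege_just_in_time"
    immutable_audit_log: bool = True


# Example: a sealed health lane instantiated with these defaults.
policy = SealedModePolicy()
```

A real deployment would, of course, have to enforce these constraints in infrastructure and access controls rather than merely declare them in configuration; the sketch only shows how compact the default set is.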

This is not an incognito tab

Sealed Mode should not be conflated with existing "temporary" or "incognito" chat features. Temporary chats reduce user-facing persistence. 

Sealed Mode targets the internal boundary itself, constraining not just what the system remembers but what humans and operational systems can do with the conversation by default. And unlike temporary chats, Sealed Mode can support the continuity that health contexts genuinely require through siloed user-controlled profiles and without allowing sensitive information to leak into training, advertising or general memory. 

Temporary Chat answers the question "Can I avoid saving this conversation?" Sealed Mode answers a different and deeper question: "Can I safely use this system for high-stakes disclosures without my conversation traveling through multiple systems and human hands by default?"

The technology already exists

The feasibility of this approach is no longer speculative. Apple's Private Cloud Compute, introduced in 2024, shows how consumer AI requests can be routed to hardened Apple silicon servers with technical enforcement. This enforcement ensures that data sent to Private Cloud Compute is not stored or made accessible to Apple, that it is used only to fulfill the user's request, and that the compute node is engineered to be incapable of retaining user data after its duty cycle.

Meta's Private Processing for WhatsApp describes a server-based processing system built on trusted execution environments. Its goal is that sharing messages with private processing "does not make [them] available to Meta, WhatsApp, or anyone else," relying in its initial iteration on confidential computing hardware, including AMD Secure Encrypted Virtualization-Secure Nested Paging and NVIDIA confidential computing, together with attested, encrypted communications and a stateless design.

The EDPB's April 2025 Support Pool of Experts report similarly identifies concrete technical mitigations for LLM privacy risks, including robust encryption and, in appropriate scenarios, differential privacy. Taken together, these developments show that "sealed" processing is increasingly an engineering and governance choice, not a technological moonshot.
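
One of the mitigations that report names, differential privacy, works by adding calibrated noise to statistics derived from user data so that no single conversation measurably shifts a published result. The snippet below is a generic, textbook sketch of the Laplace mechanism in Python; it illustrates the concept only and does not describe any provider's actual pipeline.

```python
import numpy as np

# Generic, textbook illustration of the Laplace mechanism for differential privacy.
# This is a conceptual sketch, not any chatbot provider's actual pipeline.

def dp_count(true_count: int, epsilon: float, sensitivity: float = 1.0) -> float:
    """Return a noisy count whose distribution changes by at most a factor of
    exp(epsilon) when any single user's data is added or removed."""
    noise = np.random.laplace(loc=0.0, scale=sensitivity / epsilon)
    return true_count + noise

# Example: publish how many sealed-lane sessions raised a given topic in a
# reporting period without revealing whether any particular user was among them.
print(dp_count(true_count=1284, epsilon=1.0))
```

The epsilon parameter controls the privacy-utility trade-off: smaller values add more noise and give stronger guarantees.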

How would Sealed Mode be overseen?

The options range from self-declaratory commitments — analogous to how companies currently represent that they offer end-to-end encryption — through voluntary technical standards or ISO-type certification, up to formal regulatory seals overseen by data protection authorities. Each carries trade-offs. 

The study does not prescribe a single governance model. The more important point is that some form of independent verifiability is essential if Sealed Mode is to be more than a marketing label. Existing mechanisms, including those contemplated under Articles 42 and 43 of the GDPR and the EU AI Act's conformity assessment procedures, could be adapted to serve this purpose.

10 recommendations, designed to be implemented

Sealed Mode is the centerpiece, but it sits within a broader set of 10 recommendations aimed at improving confidentiality and trust in consumer chat. 

1. Decouple history, retention and training. Allow full conversation history while keeping training disabled without requiring users to accept expanded retention.

2. Make opt-out semantics explicit and feedback-proof. Separate training, product analytics and safety uses in controls. Do not allow a feedback click to override an opt-out.

3. Adopt regurgitation-aware safeguards. Publish mitigation measures and residual risk. For high-stakes user groups, offer a labeled no-training mode with strict retention limits.

4. Standardize interface disclosure of human review. Adopt a clear notice in the interface stating that chats may be reviewed by humans. Under GDPR fairness, this is a transparency measure, not a transfer of responsibility.

5. Implement event-based transparency. Where feasible, notify users when a conversation is escalated for human review, or provide a user-visible audit log.

6. Provide a Sealed Mode for high-stakes topics. Offer at least one clearly labeled consumer-facing lane with strict defaults: no training, siloed personalization, no ads, strict retention, limited human access and cryptographic hardening.

7. Take a regulatory-first approach to conversational advertising. Map all applicable rules before deployment, especially in Europe. If a solution cannot safeguard both privacy and user trust, do not deploy ads at all.

8. Ensure recipient transparency. Describe which vendors and internal teams can access plaintext chat content and which only receive derived or redacted data. Where operational sharing serves a privacy-protective function, such as safety auditing, abuse detection or security monitoring, providers should explain this purpose in terms that allow users to understand that such processing is undertaken in their interest and is subject to contractual and statutory constraints on secondary use.

9. Provide ecosystem boundary notices. Where a chatbot shares data with another service, disclose that sharing at the moment it occurs and provide a control to disable it.

10. Separate "operations" from "improvement." Avoid collapsing broad operational uses into "improvement" language. Separate product analytics, safety operations and training in user controls.

What comes next: Part II

The forthcoming Part II of the study will address the external boundary: civil discovery and litigation holds, government-compelled access and cybersecurity breaches, and how the retention and access design choices documented in Part I directly amplify exposure.

This is where the public debate is heading. Industry leaders have already floated the idea of an "AI privilege" or privilege-like protections for certain sensitive interactions. But external protections are only coherent if the internal handling of those same conversations is disciplined first. You cannot credibly argue that third parties should be blocked from accessing sensitive chatbot interactions while the provider itself treats those interactions as broadly reusable, reviewable or monetizable by default.

That is precisely where Sealed Mode fits: It is a concrete way to "seal" a category of conversations at the point of collection, so that the case for heightened protection against discovery or breaches rests on verifiable technical and organizational constraints, not on user assumptions.

For privacy professionals, the practical question is no longer whether people will share sensitive information with consumer chatbots. They already do. The question is whether we keep responding with warnings and settings, or whether we build legible protections that match real-world use.

The author would like to thank Alston & Bird Senior Counsel Peter Swire, CIPP/US, and IAPP Research and Insights Director Joe Jones for their feedback and suggestions on earlier drafts of this paper. Any remaining errors are the author's own. 


