LLMs with retrieval-augmented generation: Good or bad for privacy compliance?


Contributors:
Julia Kaufmann
AIGP, CIPP/E
Partner
Osborne Clarke
Florian Eisenmenger
CIPP/E, CIPM, FIP
Osborne Clarke GmbH & Co. KG
Editor's note: The IAPP is policy neutral. We publish contributed opinion and analysis pieces to enable our members to hear a broad spectrum of views in our domains.
Retrieval-augmented generation (RAG) is a technique that enables AI systems with large language models (LLMs) to leverage additional information and documents to optimize output.
Unlike the data used in training and fine-tuning, this additional information and these documents are not used to train or improve the LLM itself. Put simply, documents or information retrieved from an external source are fed into the AI system to enhance the accuracy and reliability of the output generated by the LLM. While the underlying LLM itself remains unchanged, it receives additional information and context to inform its output.
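For readers who want to see the mechanism, the following is a minimal sketch of the retrieval step, not a description of any particular product. It assumes a toy in-memory document store and a simple word-overlap ranking in place of a real search index. The names DOCUMENTS, retrieve and build_prompt, as well as the sample policy texts, are hypothetical and chosen only for illustration. The point it illustrates is that the retrieved text is placed into the prompt at query time, while the model itself is not modified.

```python
import re

# Hypothetical external document store; the contents are invented for illustration.
DOCUMENTS = [
    "Vacation policy: employees receive 30 days of paid leave per year.",
    "Expense policy: travel costs must be approved before booking.",
    "IT policy: passwords must be changed every 90 days.",
]


def tokenize(text):
    """Lowercase the text and split it into a set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query, documents, top_k=1):
    """Rank documents by word overlap with the query.

    This is a toy stand-in for a real retriever (search API, vector index).
    """
    query_words = tokenize(query)
    ranked = sorted(
        documents,
        key=lambda doc: len(query_words & tokenize(doc)),
        reverse=True,
    )
    return ranked[:top_k]


def build_prompt(query, retrieved):
    """Combine the retrieved passages and the user question into one prompt.

    The LLM would receive this prompt as input; its weights stay unchanged.
    """
    context = "\n".join(f"- {doc}" for doc in retrieved)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )


query = "How many days of paid leave do employees receive?"
retrieved = retrieve(query, DOCUMENTS)
prompt = build_prompt(query, retrieved)
print(prompt)
```

In this sketch, only the passages selected by retrieve reach the model. From a privacy perspective, that selection step is where it is decided which documents, and therefore which personal data, enter the prompt.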
A typical example of a RAG-based AI system is a chatbot that can connect to external databases through application programming interfaces to incorporate a document search functionality or process a large number of uploaded documents to generate more tailored output.
The additional information and documents used by RAG can, of course, contain personal data. Privacy concerns arise because processing more personal data can have a negative impact on key privacy principles, such as data minimization, transparency and lawfulness, and because RAG may result in dataflows to external parties.
The March 2025 paper "AI Privacy Risks & Mitigations — Large Language Models" published by the European Data Protection Board’s Support Pool of Experts discusses, among other things, privacy risks resulting from the use of RAG. Nevertheless, in its October 2025 guidelines on data protection law issues related to generative AI systems using RAG, the Datenschutzkonferenz — the Conference of the Independent Data Protection Authorities of Germany — concluded that RAG can also have a significant positive impact on privacy compliance.
Data privacy compliance risks