The Privacy Advisor | Publicly available data under the GDPR: Main considerations


One of the issues in applying specific EU General Data Protection Regulation provisions, including the principles relating to the processing of personal data and data subject rights, is how to make these provisions work in practice when it comes to publicly available personal data.

This is important, as the GDPR clearly applies in full irrespective of whether the data are or were publicly available.

Various provisions of the GDPR refer to such data, but as they cover only some issues, and in the absence of official topic-specific guidelines, more in-depth analysis is needed.

What does the GDPR say about publicly available data?

First of all, recitals of the GDPR are clear that the principle of public access to official documents needs to be taken into account. Such access to official documents may be considered to be in the public interest, and personal data in documents held by a public authority or a public body should be able to be publicly disclosed by that authority or body if the disclosure is provided for by EU or member state law to which the public authority or public body is subject. Such laws, however, should reconcile public access to official documents and the reuse of public sector information with the right to the protection of personal data.

This refers only to official documents, a notion that needs to be interpreted narrowly and should essentially pertain to the activities of people acting in an official capacity in the name of such authorities or as part of official bodies and institutions. Obviously, such documents and information should be disclosed and further processed in line with these purposes.

Secondly, where personal data have not been obtained from the data subject, the GDPR requires the controller to provide information that includes the source of the personal data and, if applicable, whether the data came from publicly accessible sources.

Consequently, there is no doubt that when personal data come from publicly available sources, the data subjects must be notified in line with Article 14.

There are also rules that apply to special categories of personal data and seem to limit the requirements when it comes to publicly available data. That is, in line with Article 9, if the processing relates to personal data that are manifestly made public by the data subject, no explicit consent or other legal basis listed in Article 9 (mainly specific laws and regulations, or the establishment, exercise or defense of legal claims) is required.

On the other hand, such data would have to be made public by the data subject, and more than that, manifestly made public, so as to indicate that they wish and expect such data to be further processed. Needless to say, all other provisions, including the principles and Article 6, still apply, and the personal data may be processed only if the purpose of the processing could not reasonably be fulfilled by other means.

Last but not least, the GDPR has special provisions relating to the right to be forgotten and publicly available data. There is no doubt that this right would be fully applicable, and in addition to that, the controller who has made the personal data public should take reasonable steps, including technical measures, to inform other controllers who are processing the personal data that the data subject has requested the erasure by such controllers of any links to or copy or replication of that personal data.

What would be the usual legal basis?

Considering the above, it is clear both that publicly available data might be processed and that the GDPR provisions apply to such data. This also means that a legal basis needs to be established from the outset, documented and included in relevant assessments (such as data protection impact assessments), as well as communicated to the data subjects.

There could be many different scenarios for using such data and, theoretically, different types of legal basis could be applicable. For instance, the data subject could provide consent to, or enter into an agreement under which, some companies (controllers) collate and compile publicly available data on this person. This could be in the interest of the data subject when, for example, this person is widely known and would like to see relevant trends and assess the impact they have on a general audience. There could also be specific laws and regulations that require some data to be collected or analyzed, or laws providing some exemptions, such as for journalistic, academic, artistic or literary purposes.

More often than not, however, data controllers using publicly available information will rely on legitimate interests.

How is the context important, and what should be included in the balancing test?

As indicated by the Article 29 Data Protection Working Party, the notion of legitimate interest could include a broad range of interests, whether trivial or very compelling, straightforward or more controversial. It is then in a second step, when it comes to balancing these interests against the interests and fundamental rights of the data subjects, that a more restricted approach and more substantive analysis should be taken. Even though this opinion was issued under the European Data Protection Directive, these considerations remain fully valid and to the point. It also highlights that there is no blanket permission to reuse and further process publicly available personal data.

One specific example in favor of making data public is the publication of data for purposes of transparency and accountability, where public disclosure is done primarily not in the interest of the controller who publishes the data, but rather in the interest of other stakeholders, such as employees, journalists or the general public, to whom the data is disclosed. In general, however, the opinion points out that it is advisable that personal information should be disclosed to the public on the basis of a law allowing and, when appropriate, clearly specifying the data to be published, the purposes of the publication and any necessary safeguards.

In all cases, though, the nature of personal data, the way the information is being processed, the reasonable expectations of the data subjects, and the status of the controller and data subject will have to be considered.

In brief, the more sensitive the data, the more that seemingly innocuous data is merged and linked together to create detailed profiles, and the bigger the audience to which the data will be subsequently disclosed (in extreme cases, the general public, such as when data are further disseminated on the internet), the less likely it is that the balancing test will be in favor of the controller.

It also needs to be taken into account whether the data has been made publicly available by the data subject or by third parties. Even in the first scenario, the data subject who made the data public could change their mind over time and their expectations might be that the data will not be processed anymore. 


In general, processing of publicly available data should be, to the highest degree possible, in line with the original purposes (e.g., when the data is part of official registers, such registers should be consulted on a need-to-know basis rather than copied in bulk just in case some data might be relevant). When considering legitimate interests, a limited audience for the data that is subject to further processing and a limited amount of data are more likely to justify the processing than large amounts of data shared with many recipients. Some relevant exemptions could include journalistic purposes. Apart from that, data subject expectations, whether they made the data public themselves and other obvious risk factors, such as the sensitivity of data and vulnerability of data subjects, will be extremely important. This is yet another example that the quality and source of data are more important than quantity.


1 Comment

  • Marcel Lodewijk • Feb 14, 2022
    Hi Piotr,
    very interesting read, thanks for your insights.
    I was wondering what your thoughts would be on this topic with regard to professional sports and the broadcasting of matches.
    In your opinion, if you analyze, for example, a soccer match that is broadcast via TV and/or internet, would the statistics you get about players (for example distance travelled, number of passes, time played, goals scored) qualify as personal data under the GDPR? And if so, what are your thoughts on meeting the requirements set by the GDPR, such as the data subject rights and the legal basis for processing?