As most people will have realized by now, the General Data Protection Regulation takes a risk-based approach. Companies are expected to assess their processing operations and the types and volume of data they process, and to decide which technical and organizational measures might be required to mitigate possible risks to the rights and freedoms of data subjects. This is also part of the accountability requirements that apply to any organization. Articles 5(2) and 24(1) of the GDPR stipulate that all organizations need to be able to show what they are doing. U.K. Information Commissioner Elizabeth Denham has in the past described maintaining such records as an organization’s capacity to comply.

Documenting what you need to do and want to do makes sense, but doing so on the basis of a risk-based approach poses some challenges. What should be considered risky processing? The GDPR does not actually offer much guidance on how to make such assessments or on what would be right or wrong. What is clear from the text of the regulation, especially Articles 35 (DPIAs) and 37 (the mandatory appointment of a DPO), is that large-scale data processing is considered to be high risk.

Is guidance required?

The GDPR is a principle-based regulation. Organizations are therefore free to deal with the requirements in any way they like, as long as they can explain how they have come to their decisions and make sure that these are fully documented. This is also true for defining what you consider to be large-scale processing.

We understand many organizations are looking for guidance on what they should be doing. In reality, however, more guidance, especially formal guidance from the European Commission, individual data protection authorities or the European Data Protection Board, would also limit the ways in which companies can apply their own interpretation of the norms enshrined in the GDPR. The freedom to work on the basis of your own interpretation of the law could also mean there is no need to adapt the organization’s business model. There is, however, one precondition: You need to be able to tell a convincing story about your interpretation of the law and have it documented. In case of questions, you need to show that the norms were interpreted in a deliberate manner.

What guidance is available already?

Although not every norm needs to be interpreted in order to work with the law, relevant guidance is by now available on high-risk and large-scale processing. The basic criteria for high-risk processing are included in the Guidelines on Data Protection Impact Assessments (WP248) from the Article 29 Working Party, which were recently endorsed by the European Data Protection Board.

On the definition of large-scale processing, guidance is still scarce. The main guidance currently available comes from individual data protection authorities. For example, the data protection commissioner from Estonia, Viljar Peep, has published a LinkedIn post explaining his office’s position on large-scale processing. For processing of special categories of personal data and/or data related to criminal convictions and offenses, the threshold lies at 5,000 persons. This threshold doubles to 10,000 persons where the data processed relates to financial and payment services; digital trust services like e-signatures; communication data; real-time geolocation data; or profiling with legal effect. These data are considered by the Estonian DPA to carry an elevated risk. Finally, for all other data, the large-scale threshold is set at 50,000 persons, meaning that any database covering over 50,000 people will, for example, trigger the DPIA requirement.
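As a rough illustration, the three Estonian tiers described above can be written down as a simple lookup. This is only a sketch: the category labels and the function name are hypothetical, chosen for this example, and the decision to treat each threshold as exclusive ("over" the number) is an assumption, since only the 50,000 tier is described that way.

```python
# Thresholds as reported above for the Estonian DPA's position.
# The category labels are hypothetical names invented for this sketch.
ESTONIAN_THRESHOLDS = {
    # Special categories of data, criminal convictions and offenses
    "special_or_criminal": 5_000,
    # Financial/payment services, digital trust services, communication data,
    # real-time geolocation data, profiling with legal effect
    "elevated_risk": 10_000,
    # All other personal data
    "other": 50_000,
}


def is_large_scale(category: str, data_subjects: int) -> bool:
    """Return True if the number of data subjects exceeds the tier's threshold.

    Assumption: thresholds are exclusive ("over 50,000 people"); whether
    hitting the number exactly counts is not specified in the text.
    """
    return data_subjects > ESTONIAN_THRESHOLDS[category]


if __name__ == "__main__":
    # A general-purpose customer database with 60,000 records
    print(is_large_scale("other", 60_000))
    # A payment data set covering 8,000 people
    print(is_large_scale("elevated_risk", 8_000))
```

A lookup like this does not replace the documented reasoning the GDPR expects; it only makes the tiering explicit.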

In the Netherlands, the Dutch DPA has released guidance on large-scale processing specifically in the healthcare sector. Data processing by hospitals, pharmacies, general practice centers and care groups is always considered to be large scale. For smaller general practices or pharmacists working solo, as well as specialist medical care centers, data processing is large scale if more than 10,000 patients are registered with the practice or more than 10,000 patients are treated on a general basis, and all patient files are maintained in a single filing system.

In the Czech Republic, the data protection authority has commented on large-scale data processing in its guidance on DPIAs. Like several other authorities, the Czech DPA has set a threshold for the number of data subjects above which data processing is considered large scale, in this case 10,000 data subjects. However, processing by more than 20 processing branches or by more than 20 employees is also considered to be large scale. Finally, organizations will need to take into account whether data processing takes place at regional level or at (inter)national level, the latter being more likely to be large scale.

In Germany, the Federal Data Protection Commissioner has issued an overview of data processing operations that are subject to a DPIA. In this document, it defines large-scale processing as data processing operations covering more than 5 million people, or those covering at least 40 percent of the relevant population. The latter threshold thus depends on the type of data being processed as well as on the data subjects involved in the processing operation.

It is interesting to note the difference in population thresholds between Estonia, the Czech Republic and Germany. The Estonian threshold of 50,000 people for general data processing amounts to just under 4 percent of the country’s population, whereas in Germany the threshold lies at just over 6 percent of the population. In the Czech Republic, however, the threshold is at roughly 0.1 percent of the country’s population.
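The comparison can be reproduced with a few lines of arithmetic. The sketch below uses the headcount thresholds mentioned above together with rounded population figures; those population numbers are approximate assumptions added for illustration and should be replaced with current official statistics before relying on the result.

```python
# Headcount thresholds as reported above.
THRESHOLD = {
    "Estonia": 50_000,
    "Germany": 5_000_000,
    "Czech Republic": 10_000,
}

# Approximate national populations (rounded, circa 2018).
# These figures are assumptions for illustration only.
POPULATION = {
    "Estonia": 1_320_000,
    "Germany": 82_800_000,
    "Czech Republic": 10_600_000,
}


def threshold_share_percent(country: str) -> float:
    """Return the large-scale threshold as a percentage of the population."""
    return 100 * THRESHOLD[country] / POPULATION[country]


if __name__ == "__main__":
    for country in THRESHOLD:
        print(f"{country}: {threshold_share_percent(country):.2f}% of population")
```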

GIODO, the data protection authority from Poland, has also issued guidance on what it considers to be large scale, but has chosen not to put a number on large scale. Instead, examples are given of data processing operations that should be considered to be large scale, including the processing of medical records; employee documentation; systems in which a processor processes data from multiple data controllers; and databases collecting a wide range of data about web pages browsed, completed purchases and/or TV or radio programs watched/listened to.

The U.K. Information Commissioner’s Office has not put a number on large-scale processing either. Instead, the ICO explains in its guidance that the factors for assessing large scale include the duration, or permanence, of the data processing activity, the number or proportion of data subjects involved, the volume of data and/or the range of different data items being processed, as well as the geographical extent of the processing activity. It goes on to provide some examples, which include data processing by a hospital, tracking individuals using a city’s public transport system, and the processing of customer data by banks, insurance companies and phone and internet service providers.

How to deal with the large-scale requirement?

The examples described here indicate that there is not yet a consistent approach among the data protection authorities on how to deal with large-scale processing. It is therefore up to organizations themselves to decide how to deal with the requirement, while bearing in mind the guidance that is available.
