
The Privacy Advisor | On large-scale data processing and GDPR compliance




As most people will have realized by now, the General Data Protection Regulation takes a risk-based approach. Companies are expected to assess their processing operations and the types and volume of data they process, and to decide what technical and organizational measures might be required to mitigate possible risks to the rights and freedoms of data subjects. This is also part of the accountability requirements that apply to any organization. Articles 5(2) and 24(1) of the GDPR stipulate that all organizations need to be able to demonstrate what they are doing. U.K. Information Commissioner Elizabeth Denham has in the past described maintaining such records as an organization’s capacity to comply.

Documenting what you need to do and want to do makes sense, but doing so on the basis of a risk-based approach poses some challenges. What should be considered risky processing? The GDPR does not actually offer much guidance on how to make such assessments or on what would be right or wrong. What is clear from the text of the regulation, especially Articles 35 (DPIAs) and 37 (the appointment of a mandatory DPO), is that large-scale data processing is considered to be high risk.

Is guidance required?

The GDPR is a principle-based regulation. Organizations are therefore free to deal with the requirements in any way they like, as long as they can explain how they have come to their decisions and make sure that these are fully documented. This is also true for defining what you consider to be large-scale processing.

We understand many organizations are looking for guidance on what they should be doing. In reality, however, more guidance, especially formal guidance from the European Commission, individual data protection authorities or the European Data Protection Board, would also limit the ways in which companies can apply their own interpretation of the norms enshrined in the GDPR. The freedom to work on the basis of your own interpretation of the law could also mean there is no need to adapt the business model of the organization. There is, however, one precondition: You need to be able to tell a convincing story about your interpretation of the law and have it documented. In case of questions, you need to show that the norms were interpreted in a deliberate manner.

What guidance is available already?

Although not every norm needs to be interpreted further before organizations can work with the law, relevant guidance is by now available on high-risk and large-scale processing. The basic criteria for high-risk processing are included in the Opinion WP248 on Data Protection Impact Assessments from the Article 29 Working Party, which was recently endorsed by the European Data Protection Board.

On the definition of large-scale processing, guidance is still scarce. The main guidance currently available comes from individual data protection authorities. For example, the data protection commissioner of Estonia, Viljar Peep, has published a LinkedIn post explaining his office’s position on large-scale processing. For the processing of special categories of personal data and/or data related to criminal convictions and offenses, the threshold lies at 5,000 persons. This threshold doubles to 10,000 persons for data related to financial and payment services; digital trust services like e-signatures; communication data; real-time geolocation data; and profiling with legal effect. The Estonian DPA considers these data to carry an elevated risk. Finally, for all other data, the large-scale threshold is set at 50,000 persons, meaning that any database covering over 50,000 people will, for example, trigger the DPIA requirement.
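
The Estonian tiers can be read as a simple lookup. The following is a minimal sketch, not an official tool: the category names and the function `is_large_scale_estonia` are hypothetical, and treating the threshold as "strictly more than" is an assumption based on the "over 50,000 people" wording.

```python
from enum import Enum


class DataCategory(Enum):
    """Hypothetical labels for the three tiers described by the Estonian DPA."""
    SPECIAL_OR_CRIMINAL = "special categories or criminal convictions and offenses"
    ELEVATED_RISK = "financial, trust services, communication, geolocation, profiling"
    OTHER = "all other personal data"


# Thresholds (number of persons) as reported in the article.
ESTONIAN_THRESHOLDS = {
    DataCategory.SPECIAL_OR_CRIMINAL: 5_000,
    DataCategory.ELEVATED_RISK: 10_000,
    DataCategory.OTHER: 50_000,
}


def is_large_scale_estonia(category: DataCategory, data_subjects: int) -> bool:
    """Return True if the number of data subjects exceeds the tier threshold.

    Assumption: the threshold is exceeded only by strictly more persons.
    """
    return data_subjects > ESTONIAN_THRESHOLDS[category]


if __name__ == "__main__":
    # A customer database of 12,000 people with payment data.
    print(is_large_scale_estonia(DataCategory.ELEVATED_RISK, 12_000))
```

Such a check only covers the numeric part of the assessment; the decision and its reasoning would still need to be documented.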

In the Netherlands, the Dutch DPA has released guidance on large-scale processing specifically in the healthcare sector. Data processing by hospitals, pharmacies, general practice centers and care groups is always considered to be large scale. For smaller general practices or pharmacists working solo, as well as specialist medical care centers, data processing is large scale if more than 10,000 patients are registered with the practice or more than 10,000 patients are treated on a general basis, and all patient files are maintained in a single filing system.

In the Czech Republic, the data protection authority has commented on large-scale data processing in its guidance on DPIAs. As in other countries, the Czech DPA has provided a threshold for the number of data subjects above which data processing is considered large scale, in this case 10,000 data subjects. However, processing by more than 20 processing branches or by more than 20 employees is also considered to be large scale. Finally, organizations will need to take into account whether data processing takes place at regional level or at (inter)national level, the latter being more likely to be large scale.

In Germany, the Federal Data Protection Commissioner has issued an overview of data processing operations that are subject to a DPIA. In this document, it defines large-scale processing as data processing operations covering more than 5 million people, or those covering at least 40 percent of the relevant population. The latter threshold thus depends on the type of data being processed as well as on the data subjects involved in the processing operation.
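
The German rule combines an absolute and a relative test. A minimal sketch of how the two conditions interact is given below; the function name `is_large_scale_germany` is hypothetical, and how to determine the "relevant population" is left to the caller, since the article does not specify it.

```python
def is_large_scale_germany(data_subjects: int, relevant_population: int) -> bool:
    """Apply the two thresholds reported in the article.

    Large scale if more than 5 million people are covered, or if at least
    40 percent of the relevant population is covered.
    """
    if relevant_population <= 0:
        raise ValueError("relevant_population must be positive")
    exceeds_absolute = data_subjects > 5_000_000
    # Integer arithmetic avoids floating-point rounding at the 40 percent boundary.
    exceeds_relative = data_subjects * 100 >= 40 * relevant_population
    return exceeds_absolute or exceeds_relative


if __name__ == "__main__":
    # A register covering 400 of the 1,000 members of a small professional group.
    print(is_large_scale_germany(400, 1_000))
```

The relative test means that even a small database can qualify when the relevant population is itself small.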

It is interesting to note the difference in population thresholds between Estonia, the Czech Republic and Germany. The Estonian threshold of 50,000 people for general data processing amounts to just under 4 percent of the country’s population, whereas in Germany the threshold lies at just over 6 percent of the population. In the Czech Republic, however, the threshold is at roughly 0.1 percent of the country’s population.
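
The shares can be recomputed with a few lines of arithmetic. In the sketch below, the population figures are rounded approximations for 2018 and are assumptions, not official statistics; the thresholds are the ones reported above.

```python
# Approximate 2018 populations (rounded assumptions, not official statistics).
POPULATION = {
    "Estonia": 1_320_000,
    "Czech Republic": 10_600_000,
    "Germany": 82_800_000,
}

# General large-scale thresholds (number of data subjects) as reported in the article.
THRESHOLD = {
    "Estonia": 50_000,
    "Czech Republic": 10_000,
    "Germany": 5_000_000,
}


def threshold_share(country: str) -> float:
    """Return the threshold as a percentage of the country's population."""
    return 100 * THRESHOLD[country] / POPULATION[country]


if __name__ == "__main__":
    for country in POPULATION:
        print(f"{country}: {threshold_share(country):.2f}% of the population")
```

With different population estimates the exact percentages shift slightly, but the order of magnitude of each share stays the same.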

GIODO, the data protection authority from Poland, has also issued guidance on what it considers to be large scale, but has chosen not to put a number on large scale. Instead, examples are given of data processing operations that should be considered to be large scale, including the processing of medical records; employee documentation; systems in which a processor processes data from multiple data controllers; and databases collecting a wide range of data about web pages browsed, completed purchases and/or TV or radio programs watched/listened to.

The U.K. Information Commissioner’s Office has also not put a number on large-scale processing. Instead, the ICO explains in its guidance that assessing large scale involves the duration, or permanence, of the data processing activity; the number or proportion of data subjects involved; the volume of data and/or the range of different data items being processed; and the geographical extent of the processing activity. It goes on to provide some examples, which include data processing by a hospital, tracking individuals using a city’s public transport system, and the processing of customer data by banks, insurance companies and phone and internet service providers.

How to deal with the large-scale requirement?

The examples described here indicate that there is not yet a consistent approach among the data protection authorities on how to deal with large-scale processing. It is therefore up to organizations themselves to decide how to deal with the requirement, while bearing in mind the guidance that is available.

photo credit: wuestenigel via photopin



  • comment Xavier Le Hericy • Sep 13, 2018
    While I appreciate that attention should be paid to instances of large scale data processing, and that DPIA should be performed, the following statement in the above article is misleading: "What is clear from the text of the regulation, especially Articles 35 (DPIAs) and 37 (the appointment of a mandatory DPO), is that large-scale data processing is considered to be high risk."
    What is clear from the text of Article 35 of the regulation is that a DPIA must be performed for a) automated processing, b) large scale processing of special categories of data, and c) monitoring of public areas.
    Article 37 requires a DPO in the case of large scale processing, but does not qualify such processing as high risk. 
    So we agree that in cases of large scale processing, the data controller should perform a DPIA, which includes an assessment of risk, but large scale processing does not by itself equate to high risk. There is enough fear mongering already around GDPR; let's not add to it, as it only detracts from the substantial work already needed.
  • comment Olivier Proust • Sep 14, 2018
    The GDPR does not say that “large scale processing” as such is subject to the DPIA requirement. What article 35 GDPR says is that large scale processing of special categories of personal data (art. 9) or of data about criminal convictions and offences (art. 10) is subject to a DPIA. There’s a nuance that is important. On the other hand, processing that is “likely to result in a high risk” for individuals must systematically undergo a DPIA. This is a much stricter requirement for organisations because it will require them to assess the risk impact of their data processing activities every single time.
  • comment Gregory Dumont • Sep 14, 2018
    The link to "Opinion WP248 on Data Protection Impact Assessments from the Article 29 Working Party" is erroneous. Here is the correct link:
  • comment Emma Butler • Sep 19, 2018
    I'd be interested to know how companies operating across the EU are tackling this. If the criteria and thresholds are different in member states, are companies only using the guidance of the relevant local regulator? Or are they choosing the lowest threshold number of all the member states they operate in? Or taking another approach?
  • comment Paul Breitbarth • Sep 21, 2018
    While my conclusion that large-scale processing is high risk may be a little blunt (which might not be such a big surprise, given I'm Dutch), I do stand by it, also taking into account recital 91 and the guidance of the Article 29 Working Party in WP248 rev 1. Large-scale processing is an important factor to take into account when assessing if a processing operation is likely to result in high risk and is thus subject to a DPIA. In my opinion, there will be very few situations where large-scale processing should not be regarded as high risk. You could regard it as "better safe than sorry", rather than scaremongering.
  • comment Paul Breitbarth • Oct 2, 2018
    Late September 2018, the European Data Protection Board adopted the national DPIA black and white lists that were submitted by the data protection authorities. According to Estonian Commissioner Viljar Peep, the Board declined to approve numeric guidance as to what should be considered large-scale processing. What this means in practice, also from an enforcement perspective, remains to be seen.