In November, the European Data Protection Board released two important documents with guidance on when personal data will be allowed to flow to the United States, India, China and many other non-EU countries. Expert European commentators have concluded the draft documents from the EDPB would apparently have the effect of hard data localization, limiting many routine data flows from the EU.

As academics at Georgia Tech, we have submitted comments to the EDPB that highlighted five areas of concern about the effects of hard data localization:

  1. Previous research shows numerous major data flows, beyond digital platforms, that would be affected by hard data localization.
  2. Previous research shows technical obstacles to providing online services in a regime of hard data localization.
  3. Strategies for localizing in the EU work less well when other jurisdictions also require data localization.
  4. Seemingly simple and lawful international transfers may include background processing that may not be consistent with hard data localization.
  5. Hard data localization may create cybersecurity, anti-fraud and related risks.

This article provides background about the EDPB documents and our comments and then examines each of the five themes in more detail.

Background

The EDPB documents are an important component of the ongoing legal and policy development in the wake of the "Schrems II" decision in July by the Court of Justice of the European Union. As privacy professionals know, that case struck down the EU-U.S. Privacy Shield and set strict legal standards for when personal data can be transferred from the EU to “third countries” that have not received an EU determination that they provide “adequate” protection of personal data.

The IAPP's Caitlin Fennessey, CIPP, discussed the EDPB documents published in November. One of the authors (Peter Swire, CIPP/US, of Georgia Tech, research director of the Cross-Border Data Forum) has published with Kenneth Propp a proposal that could meet the CJEU’s requirement that the U.S. provide individual redress, and has provided testimony to the U.S. Senate Committee on Commerce, Science, and Transportation on multiple issues related to the case.

On Nov. 10, 2020, the EDPB published draft “Recommendations 01/2020 on measures that supplement transfer tools to ensure compliance with the EU level of protection of personal data.” These draft recommendations discuss safeguards that can supplement the protections for personal data provided by the widely used standard contractual clauses. Théodore Christakis of the Université Grenoble Alpes has analyzed these draft recommendations and found them strict. The EDPB also issued “Recommendations 02/2020 on the European Essential Guarantees for surveillance measures.” Christakis found these similarly strict: “Third countries might rarely if ever meet the EEG requirements.” Under the EEGs as drafted, except for the small number of countries with an “adequacy” determination, “few other countries might be considered as offering a protection 'essentially equivalent' to that offered by EU law.”

Although relatively few writers have explicitly called for hard data localization within the EU, the November approach of the EDPB would appear to lead to that result. We have previously studied cross-border data flows, including the effects of data localization. In conjunction with the Cross-Border Data Forum, we are currently engaged in a substantial research project on data localization and plan to provide further research results as they become available. 

Based on our research to date, we submitted comments to the EDPB so that it can consider the effects of its draft documents. We highlighted the five themes discussed here as significant effects of hard data localization.

  1. Previous research shows numerous major data flows, beyond digital platforms, that would be affected by hard data localization

The most detailed examination of flows of personal data out of Europe, of which the authors are aware, remains the book that one of the authors (Swire) wrote with Robert Litan in 1998, called "None of Your Business: World Data Flows, Electronic Commerce, and the European Privacy Directive." The book catalogs roughly 40 categories and sub-categories of significant data flows from the EU with the focus especially on flows to the U.S. Many types of data flows are the same as in 1998, but there are important new categories of data flows, perhaps most notably for cloud computing, where the personal data of individuals is often stored, processed and/or accessed in a different country.

The Brookings Institution recently made the full text of the book available for free download. Chapters 5 to 7 provide a list of sectors and types of data flows where hard data localization may have significant effects, often by limiting transfers. To date, much of the focus in commentary within the EU has been on data flows concerning the largest digital platforms. The 1998 book shows that hard data localization would have numerous effects beyond the largest digital platforms.

We highlight one area for particular attention. The possible disruption of data flows could include pharmaceuticals research, which would be especially important to consider during the COVID-19 pandemic, when sharing of personal data is so important concerning the safety and efficacy of vaccines and treatments, as well as other medical information.

  2. Previous research shows technical obstacles to providing online services in a regime of hard data localization

We highlight two previous studies that show technical obstacles to providing online services in a regime of hard data localization. In addition, we explain the need for further research on the extent to which traffic sent from one user to another, within the same country, actually routes through one or more different countries.

The first paper is by research engineer Dillon Reisman, who presented his results at a 2017 conference at Georgia Tech that was sponsored by, among others, the European Union’s Erasmus+ Program. The conference was part of the authors’ work leading the Georgia Tech Cross-Border Requests for Data Project, which has now been included in the ongoing work of the Cross-Border Data Forum. The paper is titled “Where Is Your Data, Really?: The Technical Case Against Data Localisation.” Reisman identified the following technical obstacles to providing online services if hard data localization is in place:

  1. Your data might be stored in edge caches across borders.
  2. Your data might be replicated for load balancing.
  3. Your data might be “sharded” across multiple machines in multiple data centers.
  4. Your data might be backed up to multiple locations in case of failure.
  5. Your data might be made accessible to engineers in different countries for maintenance and de-bugging.
  6. Your data might be processed in batches at a central location to add features, such as search or artificial intelligence.
  7. Your data might be processed to generate “derived data.”

Reisman explained that “data can live ephemerally, in many copies and in many places. Some of our most important Internet applications, from search functions to communications, rely on those places being across a national border.” For tech companies, Reisman discussed the need for data to be accessible to employees — who may be located in different jurisdictions — to maintain these internet applications, ensure that processing occurs and provide support to users.
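To make the sharding and replication points concrete, here is a minimal sketch in Python. It is not drawn from Reisman's paper: the data center names, the country assignments and the hash-based placement rule are all hypothetical, chosen only to show how a routine placement scheme can put a copy of a record outside the EU without anyone deciding to "transfer" it.

```python
import hashlib

# Hypothetical data centers and the countries they sit in (illustrative only).
DATA_CENTERS = {
    "dc-dublin": "IE",
    "dc-frankfurt": "DE",
    "dc-singapore": "SG",
    "dc-virginia": "US",
}
EU_COUNTRIES = {"IE", "DE"}


def placements(record_key: str, replicas: int = 2) -> list[str]:
    """Pick `replicas` consecutive data centers, starting from a hash of the key."""
    names = sorted(DATA_CENTERS)
    digest = hashlib.sha256(record_key.encode("utf-8")).digest()
    start = digest[0] % len(names)
    return [names[(start + i) % len(names)] for i in range(replicas)]


def leaves_eu(record_key: str, replicas: int = 2) -> bool:
    """True if at least one copy of the record lands outside the EU set."""
    return any(
        DATA_CENTERS[dc] not in EU_COUNTRIES
        for dc in placements(record_key, replicas)
    )
```

In this toy setup, calling `leaves_eu("alice", replicas=3)` returns `True` for any key, because only two of the four assumed data centers are in the EU. A localized deployment would have to restrict the candidate list itself, which is the kind of redesign the list above points to.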

The second paper is by Jonathan Mayer of the Department of Computer Science at Princeton University. In his 2013 paper called “The Web is Flat,” Mayer examined what he called the “international referrer” issue. He wrote, “A person within the United States may be reading a webpage that looks, and is, as American as apple pie. But that webpage can pull in dozens of unexpected sources.”

He added, “If just one of those third parties is international,” the user’s personal data will go outside of the user’s country. Mayer tested 2,500 popular websites and found “international referrers are pervasive.” For U.S. users, he concluded, “So much for a bright line dividing the domestic and international web.” Given the large number of web services that exist in the U.S. and other third countries, it would seem likely that individuals in the EU would encounter pervasive international referrers, as well. If current websites “pervasively” refer browsing activity outside of the EU, hard data localization would appear to entail pervasive redesign and re-sourcing for websites accessed by EU users.
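The international referrer issue can be illustrated with a short Python sketch. This is not Mayer's methodology: the hostnames and the host-to-country table below are hypothetical, and a real measurement would need to determine where each host is actually served from.

```python
from urllib.parse import urlparse

# Hypothetical mapping from hostname to the country where it is served.
HOST_COUNTRY = {
    "news.example.de": "DE",
    "cdn.example.fr": "FR",
    "ads.example.com": "US",
    "metrics.example.in": "IN",
}
EU_COUNTRIES = {"DE", "FR"}


def international_referrers(resource_urls, home_countries=EU_COUNTRIES):
    """Return hosts of embedded resources not known to be in the home region.

    Hosts with no known country are flagged too, since their location
    cannot be confirmed. URLs without a hostname are skipped.
    """
    flagged = []
    for url in resource_urls:
        host = urlparse(url).hostname
        if host is None:
            continue
        country = HOST_COUNTRY.get(host)
        if country not in home_countries and host not in flagged:
            flagged.append(host)
    return flagged
```

For a page that embeds one resource from each of the four assumed hosts, the function flags `ads.example.com` and `metrics.example.in`. A single flagged host is enough for the user's browsing data to leave the home region.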

One topic for further research is the extent to which traffic sent from one user to another within the same country actually routes through one or more different countries. The simple idea is that internet communications do not travel in a straight line between two users, such as from Alice to Bob. Instead, as explained in an introductory lesson by Khan Academy, “computers split messages into packets and those packets hop from router to router on their way to their destination.”

Currently, there is no infrastructure in place to ensure that packets sent from within the EU route only through the EU or that all packets sent from the EU are encrypted. As the Internet Society explains in its examination of data localization, “even if data is located in one country, the transmission path may cross national borders for resilience or performance reasons.”

One additional point about routing: A 2017 paper by computer scientist Peter Mell and others examines “information exposure,” defined as “the extent to which communications between pairs of countries are exposed to other countries.” The paper establishes that countries that are “well connected” — that have more connections to the global internet — face a greater likelihood of such information exposure. By contrast, a country such as China that limits exposure to the global internet has a lower likelihood that its communications will pass through other countries as a result of the Internet Protocol’s routing structure. The Mell paper raises the possibility that effective blocking of packets from going outside of the EU, at a technical level, may require regulatory limits on transfers more similar to the current Chinese approach.
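The idea of exposure to transit countries can be sketched with a toy model in Python. This is not the method of the Mell paper: the country-level links below are invented, and real internet routing follows operator policies rather than a simple fewest-hop rule. The sketch only shows how a path between two countries can pass through a third.

```python
from collections import deque

# Hypothetical country-level links; this is not real network topology.
LINKS = {
    "FR": ["DE", "GB"],
    "DE": ["FR", "PL", "US"],
    "PL": ["DE"],
    "GB": ["FR", "US"],
    "US": ["DE", "GB"],
}


def shortest_path(src, dst, links=LINKS):
    """Breadth-first search for one fewest-hop path from src to dst."""
    previous = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbor in links.get(node, []):
            if neighbor not in previous:
                previous[neighbor] = node
                queue.append(neighbor)
    return None  # no route between the two countries


def transit_countries(src, dst, links=LINKS):
    """Countries other than the endpoints that the chosen path passes through."""
    path = shortest_path(src, dst, links)
    return [] if path is None else path[1:-1]
```

In this toy graph, traffic from `"FR"` to `"PL"` transits `"DE"`. Adding links to a country, which is what being "well connected" means here, creates more paths that may cross third countries, while removing links narrows them.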

  3. Strategies for localizing data in the EU work less well when other jurisdictions also require data localization

For a business headquartered in the EU, one appealing strategy, in the face of hard data localization, may be to centralize data processing in the EU. Under this strategy, for instance, the company could keep human resources records within the EU to the maximum extent feasible. Processing then would remain under EU data protection rules while it stayed in the EU. Transfers to third countries could be relatively uncommon, perhaps enabling the company to rely on derogations or at least reducing the amount of possibly unlawful transfers. 

Indeed, under this approach, companies based either inside or outside of the EU could decide to shift their human resources and other records into the EU so that corporate decisions could be made based on the data that is allowed to come into the EU. Centralizing data processing in the EU may, therefore, aid in compliance and reduce the risk of enforcement actions.

This EU-based architecture, however, works less well when other jurisdictions also require data localization.

Continuing with the human resources example, the other jurisdictions may limit transfers back to EU headquarters. Similarly, for e-commerce purchases from outside of the EU, the customer records may need to be stored in the other jurisdiction. In each of these simple examples, it may not be possible to meet the data localization requirements of both the EU and the other jurisdictions.

Data localization outside of the EU is already significant and may grow substantially if the EU imposes hard data localization. First, some nations, such as China, already have a hard data localization regime, and China is a leading trading partner for some member states.

Second, other important trading partners, such as India, are seriously considering data localization regimes. 

Third, the EU General Data Protection Regulation and other aspects of EU data protection law have been widely copied by nations around the world. If the EU interprets its regime to require data localization, other countries could interpret their own “adequacy” and other provisions to require data localization, as well; moreover, the ability of the EU to achieve free trade goals generally may be reduced if the EU becomes a leading adopter of limits on cross-border economic activity.

To the extent that more jurisdictions follow an EU approach for data localization, strategies for localizing personal data within the EU would work less well than may have been apparent to date.

  4. Seemingly simple and lawful international transfers may include background processing that may not be consistent with hard data localization

Another theme in our ongoing research is that apparently simple and lawful data flows may not be so simple in practice. Consider an individual in the EU booking a hotel room in the U.S., an example provided by the prominent privacy organization noyb. The contemplated data flow involves an informed choice by an EU person, such as Alice, to have her personal data go to the U.S. This data flow would, according to the example, rely on one or more of the Article 49 derogations.

Along with this direct booking request for a hotel room, there likely would be a number of other data flows, occurring in the background and often not visible to Alice. We provide several examples here. In doing so, we do not propose a legal conclusion about exactly which of these may be lawful under Article 49 derogations; instead, the point is to illustrate that there is often background processing to accompany a seemingly simple customer request:

  1. Existing customer records.
    1. Alice may have user preferences, such as for a double or king-sized bed.
    2. Alice may be part of a loyalty program, and she wishes to get “credit” for the nights she stays in the U.S. hotel. The hotel may wish to inform her that she can get a free night if she stays one extra night.
    3. Alice may have a preferred customer status, based on her purchases in the EU, and the U.S. hotel should give her a free breakfast or other benefits.
    4. Alice may wish to receive coupons or other offers from other companies, such as for car rentals or discounted admission to tourist attractions. Such offers may be based on personal data processed in the EU.
  2. Payment information.
    1. Travel agents or other companies in the EU may receive payment, in whole or in part, then communicate payment status to the U.S. hotel. Updates on booking status may flow back and forth if Alice changes her travel plans once in the U.S.
    2. Alice may have a credit card or other method of payment on file in the EU and wish to use it easily in the U.S.
    3. Alice may have a branded credit card, such as an airline miles card, so personal data about her trip goes to the airline, as well as the credit card company.
    4. There may be regulatory requirements that would trigger additional cross-border flows. As one example, if Alice preferred to pay in cash for an extended visit, that may trigger requirements under anti-money laundering laws, leading to follow-up investigation and access to personal data from the EU.
  3. Accounting and anti-fraud.
    1. The U.S. hotel and relevant EU companies all have accounting obligations so personal data may be exchanged EU/U.S. as part of routine accounting activities.
    2. Accounting also applies to each step of the payments system with EU actors sharing sufficient data with U.S. actors to ensure accurate entry and accounting for the payment transactions.
    3. When Alice arrives at the hotel, there may be authentication information received from the EU to verify her identity.
    4. Along with authentication information, an EU company may provide personal data in determining the maximum bill Alice can incur at the U.S. hotel.
    5. Alice may have a history with the hotel chain in the EU of canceling reservations at the last minute so the U.S. hotel may decide to make the reservation non-refundable.

This list is provided by way of example: a seemingly simple, one-time transaction (booking a hotel room) may be accompanied by multiple, routine and ongoing transfers of personal data. Even under a generally hard data localization regime, there may be ways to gain consent that meets EU requirements, to otherwise structure the hotel booking to meet Article 49 derogations, or to rely on some other lawful basis. The point, however, is that one should consider background processing before reaching any conclusion about whether a seemingly simple transaction is lawful.

  5. Hard data localization may create cybersecurity, anti-fraud and related risks

Hard data localization may create risks for cybersecurity, anti-fraud and related prudential activities. The basic idea is that information can be an important component of defending against and responding to cyberattacks. The respected Internet Society has stated, for instance, that “Cybersecurity may suffer as organizations are less able to store data outside borders with the aim of increasing reliability and mitigating a wide variety of risks including cyber-attacks and national disasters.”

Some reasons data localization may harm cybersecurity include the following:

  1. The general concern that a reduction in available information will increase the risks from cyberattacks.
  2. It may be more costly to implement and maintain state-of-the-art tools across different localization regions.
  3. The loss of redundant storage increases the risk of data loss or network outage in the case of a hardware malfunction or natural disaster.
  4. Options for distributed storage solutions, which often assist in deploying privacy, integrity and counter-intrusion protocols on networks, would be less available to data controllers.

We hope to provide greater detail about the effects on cybersecurity in future research. At this time, we simply point to the topic as one worth considering as part of the overall effects of hard data localization.
