There’s something exciting about migrating data to a new environment. It’s an opportunity to start fresh, like the decluttering and reorganizing we do before we move to a new house. It’s also the time to really think about what personal data we are migrating and how we intend to allow for its processing.
Since companies are increasingly making the choice to have their data float above them, one of the most important privacy impact assessments (aka data protection impact assessments under the EU General Data Protection Regulation) is for the migration of personal information to the cloud environment.
Privacy professionals, it seems, will be completing many cloud migration PIAs in the coming years. Before your next cloud migration PIA, keep the following guidance in mind.
Don’t move data that can’t be recognized
The clear naming of data elements in database tables (or a central index) is a common challenge that is at the heart of effectively detecting and protecting personal information. Traditionally, column headers were named without the expectation that anyone in the company besides the database administrators would need to understand them. Today, privacy regulations require us to rethink this practice and establish new requirements to guide our management of data. The migration to the cloud is a great opportunity to start fresh with a clear mapping of the data elements in the new environment.
Require the reduction of risky data combinations
A common shortcoming in the protection of personal information is failing to recognize when data elements come together to represent a higher degree of sensitivity. These combinations can be influenced by regulations (e.g., state breach notification regulations in the U.S.) or be unique to a company’s operations (e.g., a product or service that reveals sensitive attributes of a person). The migration process is a perfect opportunity to reassess this risk, since the organization may have been adding data elements to the migrated repository over time. Obfuscation techniques and access controls can be used to address this challenge.
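As a rough illustration of how such a reassessment could be automated, the sketch below checks a table's column names against a set of combination rules. The rule names and column names are hypothetical and would need to be replaced with the combinations that matter under the regulations and operations relevant to your organization.

```python
# Hypothetical sensitivity rules: each set of element names is treated as
# risky only when ALL of its members appear together in one table.
RISKY_COMBINATIONS = {
    "identity_theft": {"full_name", "ssn"},
    "financial": {"full_name", "account_number", "security_code"},
    "health_inference": {"email", "diagnosis_code"},
}


def find_risky_combinations(columns):
    """Return the sorted names of the rules fully matched by a table's columns."""
    present = {c.lower() for c in columns}
    return sorted(
        name for name, combo in RISKY_COMBINATIONS.items() if combo <= present
    )


# Example: a customer table that has accumulated columns over time.
customer_columns = ["Full_Name", "Email", "SSN", "signup_date"]
print(find_risky_combinations(customer_columns))  # ['identity_theft']
```

A check like this only works when column names are recognizable, which is one more reason to fix naming before the migration.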
Make sure tables connect through primary and foreign keys
Privacy regulations require us to have a complete view of a data subject for addressing rights, such as data subject access requests. Tables in a database use common references between them (i.e., keys) that allow privacy professionals, for example, to find information about a particular data subject across multiple tables. Yet, some databases are not maintained with such clear key references. Migrating tables that contain personal information about data subjects but cannot be linked to the rest of the database will only be a liability.
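One way to test for this before migrating is to look for rows whose key does not resolve to a known data subject. The sketch below uses an in-memory SQLite database; the `customers` and `support_tickets` tables and their columns are invented for the example, and a real check would run against the source database's own schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE support_tickets (
        ticket_id INTEGER PRIMARY KEY,
        customer_id INTEGER,  -- intended to reference customers.customer_id
        body TEXT
    );
    INSERT INTO customers VALUES (1, 'Ada'), (2, 'Grace');
    INSERT INTO support_tickets VALUES
        (10, 1, 'reset password'),
        (11, 7, 'billing question'),
        (12, NULL, 'anonymous note');
""")


def unlinked_ticket_ids(connection):
    """Return ticket ids whose customer_id matches no row in customers."""
    rows = connection.execute("""
        SELECT t.ticket_id
        FROM support_tickets AS t
        LEFT JOIN customers AS c ON c.customer_id = t.customer_id
        WHERE c.customer_id IS NULL
        ORDER BY t.ticket_id
    """).fetchall()
    return [row[0] for row in rows]


# Ticket 11 points at a missing customer; ticket 12 has no key at all.
print(unlinked_ticket_ids(conn))  # [11, 12]
```

Rows returned by a query like this are the ones that could not be found when answering a data subject access request, so they are candidates for repair or exclusion before the move.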
Map migrated data elements and databases to their respective applications
For privacy operations, one important difference between applications and databases is that an application is often directly associated with a specific processing purpose, while a database may feed data to many applications of different purposes. Migrating a database without clearly mapping its uses and users on the application level prevents the privacy professional from conducting effective audits, tracking meaningful changes in processes, and spotting opportunities for data minimization.
Inventory the migrated data subjects
The correct identification of data subjects across the enterprise is a foundational part of any privacy program. However, even if such an inventory is not in place, the migration process is a great opportunity to get some control over this part of the enterprise’s architecture by inventorying the identities it contains. Such an organized view of data subjects is not only the best way to address DSARs, but it is also the first step in addressing other data management tasks organizations are facing.
Do not allow the migration of orphaned identities
Orphaned identities are data subjects that cannot be associated with a known processing purpose. Every organization has orphaned identities in its systems; some have a lot. These identities may have found their way to your environment through employee personal use, past opportunities that did not materialize or simply old data. Migrating repositories with orphaned identities will be both a privacy nightmare and an economic mistake. By requiring clarity for the purposes of the migrated data (as noted in the points above on mapping data to applications and inventorying data subjects), the elimination of the orphaned identities becomes a natural next step for the PIA.
Determine in advance the approach for the right to erasure
The right to erasure is a challenge for many organizations due to the dependencies between repositories. Eliminating an identity in one repository could lead to significant consequences in a downstream system. Understanding what is migrated and how that data is intended to be used are key points for determining how to comply with erasure requirements. For example, format-preserving masking can eliminate identifiers while keeping the data usable for downstream systems that expect a particular format.
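To make the idea concrete, here is a minimal sketch of one possible masking approach. It is an assumption for illustration, not a standardized format-preserving encryption scheme: the function name `mask_preserving_format` is hypothetical, the masking is one-way, and it uses a keyed hash so the same input always yields the same token.

```python
import hashlib
import hmac
import string


def mask_preserving_format(value: str, key: bytes) -> str:
    """Replace ASCII digits and letters with keyed pseudo-random ones.

    Length, letter case, and separators are kept so that downstream
    systems expecting the original format continue to work.
    """
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).digest()
    masked = []
    for i, ch in enumerate(value):
        b = digest[i % len(digest)]
        if ch in string.digits:
            masked.append(string.digits[b % 10])
        elif ch in string.ascii_uppercase:
            masked.append(string.ascii_uppercase[b % 26])
        elif ch in string.ascii_lowercase:
            masked.append(string.ascii_lowercase[b % 26])
        else:
            masked.append(ch)  # keep separators such as '-' or '@'
    return "".join(masked)


# The key here is a placeholder; a real one would come from a secrets store.
token = mask_preserving_format("123-45-6789", b"example-key")
print(token)
```

Because the output is deterministic for a given key, the same identifier masks to the same token in every repository, which keeps joins between systems intact. It also means anyone holding the key can recompute the token from a known value, so key handling matters.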
Challenge access by third parties at the data element level
The microcosm of the cloud environment is a great excuse to revisit access that is given to external users. Tools can be used to detect the likely use of shared credentials. Additional limitations can also be applied to sensitive combinations of data elements (see the point on risky data combinations above). By having clarity about the association of data elements with applications, and to some degree, processing purpose, access to data can be minimized and risk can be reduced.
Ask for masking and encryption for data minimization
Repositories in cloud environments tend to be better equipped to align with data protection controls. While the legacy systems in your on-premises environment may not have been designed for use with obfuscating solutions, the new cloud environment can be. Data minimization can be best applied using obfuscation solutions, such as masking and encryption, as they offer an opportunity to share the same data at different degrees of exposure to different users.
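The idea of sharing the same data at different degrees of exposure can be sketched as a per-role masking policy. The roles, field names, and masking rules below are hypothetical; in practice this is usually configured in the cloud platform's own masking or access-control features rather than in application code.

```python
def mask_email(email):
    """Keep the first character of the local part and the domain."""
    local, sep, domain = email.partition("@")
    if not sep:
        return "*" * len(email)
    return local[:1] + "*" * (len(local) - 1) + "@" + domain


def mask_last4(value):
    """Hide everything except the last four characters."""
    return "*" * max(len(value) - 4, 0) + value[-4:]


def redact(_value):
    return "[redacted]"


# Hypothetical policy: fields not listed for a role are shown unmasked.
POLICIES = {
    "billing_admin": {},
    "support_agent": {"email": mask_email, "card_number": mask_last4},
    "analyst": {"email": redact, "card_number": redact},
}


def view_for(role, record):
    """Return a copy of the record with the role's masking rules applied."""
    rules = POLICIES[role]  # unknown roles raise KeyError rather than see data
    return {
        field: rules[field](value) if field in rules else value
        for field, value in record.items()
    }


record = {"email": "ada@example.com", "card_number": "4111111111111111", "plan": "pro"}
print(view_for("support_agent", record))
# {'email': 'a**@example.com', 'card_number': '************1111', 'plan': 'pro'}
```

Each role reads the same stored record, so minimization is applied at the point of access instead of by maintaining separate reduced copies.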
Identify the controller and DPOs for the new environment
Be sure to include in your documentation how, if at all, the accountability chain will be maintained. A new environment may lead to changes in data and system ownership, and also on the privacy side of things. The controllers for the data may have to change and, for EU General Data Protection Regulation compliance, the assigned data protection officers may have to be identified without relying on the geographic location of the data, as was possible with on-premises data.