Agile software development is a state-of-the-art methodology that speeds up development and focuses on adaptability. It is estimated that more than half of IT organizations use agile methodologies in their processes. However, when it comes to complying with the EU General Data Protection Regulation (GDPR), agile-led projects face challenges concerning privacy impact assessments (PIAs) and data protection impact assessments (DPIAs). We will expand on those challenges and analyze the use of a tagging method that relates the mandates of GDPR to elements of agile development.
The elements of agile development
In agile development, the three main conceptual elements of a project are Epics, User Stories and Tasks. An Epic is a group of User Stories with the same goal. For example, an Epic can be: “Development of a mobile notification system for a web-based healthcare application.” Epics are distinct sub-projects and can last for more than one sprint, which is a unit of planned work usually ranging from two to six weeks. User Stories describe software requirements in a few short sentences. They simply describe a feature required by someone who is usually a user, customer or administrator. For example, in the above Epic, a user story can be: “As an administrator, I want to verify and approve notifications on a webpage so that only proper notifications are passed to users.” User Stories often have the following structure: As a __, I want __ so that __. The development team determines what tasks to complete so that each User Story is satisfied. Figure 1 shows the breakdown of work into Epics, User Stories and Tasks.
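The hierarchy described above can be sketched as a small data model. This is only an illustration: the class names, field names, and the two sample tasks are assumptions made for the example and are not part of any agile tool or of the method discussed here.

```python
from dataclasses import dataclass, field


def indefinite_article(noun: str) -> str:
    """Pick "a" or "an" with a simple first-letter vowel check."""
    return "an" if noun[:1].lower() in ("a", "e", "i", "o", "u") else "a"


@dataclass
class Task:
    description: str
    done: bool = False


@dataclass
class UserStory:
    role: str
    want: str
    benefit: str
    tasks: list[Task] = field(default_factory=list)

    def text(self) -> str:
        # Fill the "As a __, I want __ so that __" template.
        article = indefinite_article(self.role)
        return f"As {article} {self.role}, I want {self.want} so that {self.benefit}."


@dataclass
class Epic:
    goal: str
    stories: list[UserStory] = field(default_factory=list)


story = UserStory(
    role="administrator",
    want="to verify and approve notifications on a webpage",
    benefit="only proper notifications are passed to users",
    # Hypothetical tasks a development team might derive from the story.
    tasks=[Task("Build the approval page"), Task("Add an approve/reject endpoint")],
)
epic = Epic(
    goal="Development of a mobile notification system for a web-based healthcare application",
    stories=[story],
)
print(story.text())
```

Keeping tasks nested under their story, and stories under their Epic, mirrors the breakdown of work the section describes.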
Data protection impact assessments for agile projects
Figure 1 showcases how functionality emerges from the interworking of these elements. A privacy expert analyzes the system, performs a PIA or DPIA, as outlined in the GDPR, and makes sure the system is compliant with its requirements. This presents several challenges.
First, documentation is generally minimal in agile development, and development decisions are often made impromptu. This poses a problem for PIAs and DPIAs, as auditors or data protection officers (DPOs) do not have visibility into the work’s development details. Guidelines on DPIAs for GDPR state that “[DPIAs] should be started as early as practical in the design of the processing operation.” Similarly, Roger Clarke, a computer science professor, explains that PIAs are anticipatory processes performed “in advance of or in parallel with the development of an initiative, rather than retrospectively.” This advance thinking and parallel work require a methodology within agile development.
Second, additional activities, such as privacy-specific tasks and data protection measures, are frequently needed to build privacy and security into the system and make the solution compliant. According to the Commission Nationale de l'Informatique et des Libertés, one of the pillars of PIAs is the management of privacy risks that determine “the appropriate technical and organizational controls to protect personal data.” If developers are unaware of those controls and cannot make a project plan that encompasses the required activities, then how can they arrive at a list of tasks from GDPR mandates?
A common solution companies employ for building in security and privacy is to adopt and follow a set of security and privacy controls in addition to project-specific tasks; for example, a control can be: “Use SSL/TLS for communication.” This approach falls under the category of application security requirements and threat management. Even with a set of security and privacy controls, such as the Application Security Verification Standard, there is no established way for an auditor or DPO to check whether all the mandates of GDPR are satisfied through the completion of those tasks.
In short, what is lacking is a method that allows us to:
1. Audit and verify that performing Set T (tasks) satisfies Set G (GDPR mandates).
2. Use Set G to methodically generate a set of tasks (Set T).
We will call these two goals verification and generation hereafter.
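Read as set operations, the two goals can be sketched as follows. This is a minimal illustration, not an established method: the function names, the sample tasks, and the claim that a given task addresses a given mandate are all assumptions made for the example.

```python
def uncovered_mandates(mandates: set[str], task_coverage: dict[str, set[str]]) -> set[str]:
    """Verification: return the mandates in Set G that no task in Set T addresses."""
    covered = set().union(*task_coverage.values())
    return mandates - covered


def generate_tasks(mandates: set[str], templates: dict[str, list[str]]) -> list[str]:
    """Generation: collect the candidate tasks registered for each mandate in Set G."""
    tasks = []
    for mandate in sorted(mandates):
        tasks.extend(templates.get(mandate, []))
    return tasks


# Set G: mandate identifiers (article/recital pairs).
set_g = {"Article 07 / Recital 32", "Article 12 / Recital 58"}

# Set T: hypothetical tasks, each mapped to the mandates it is claimed to address.
set_t = {
    "Add a consent checkbox to the sign-up form": {"Article 07 / Recital 32"},
}

# Hypothetical task templates per mandate, used for generation.
templates = {
    "Article 07 / Recital 32": ["Add a consent checkbox to the sign-up form"],
    "Article 12 / Recital 58": ["Publish a plain-language processing notice"],
}

missing = uncovered_mandates(set_g, set_t)
print(missing)
print(generate_tasks(missing, templates))
```

Verification passes only when `uncovered_mandates` returns an empty set. The hard part, which the sketch takes as given, is deciding objectively which mandates a task actually addresses.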
The roots of tagging and building a solution on tagging
The problems and goals outlined here resemble several problems in qualitative research that led to the invention of the coding methods of grounded theory: open coding, axial coding, and selective coding. That is, we need to build codes and tags for regulatory texts so that the verification and generation of tasks can be carried out methodically and objectively. This approach is promising for reproducibility and objective verification: in their 1990 paper, Juliet Corbin and Anselm Strauss state that grounded theory aims to satisfy the requirements of “good science,” which include generalizability, reproducibility, precision, and verification. We treat the text of a regulation as the source material of a qualitative research project, as we would the text of an interview. The goal is to code important ideas and concepts in the text and then automate the two further steps of verification and generation, much like the generation of hypotheses in grounded theory.
| Tag | User story | Regulation mandates |
| --- | --- | --- |
| Consent: Consent Acquisition/Withdrawal | As a user, I want to be able to provide/withdraw consent to/from processing my PII so that I can have control over how my PII is processed. | Article 07 / Recital 32, Article 07 / Recital 33 |
| Transparency: Whole Program | As a user, I want to be able to freely access information about processing activities that involve my PII so that I can exercise my right to view my processed PII. | Article 12 / Recital 58, Article 12 / Recital 59 |
Table 1: Sample tags with generated user stories and corresponding regulation mandates
Table 1 shows two sample tags and their corresponding regulations. Tags can act as an abstract layer between compliance regulation mandates and practical tasks or activities. Tag specifications are compliance-facing and are drawn from the regulation's specifications rather than from tasks; that is, each tag codes an idea, or atomic concept, in the compliance regulation, such as “Processing Ground: Consent.”
In this way, tags can have attributes that point to various parts of compliance regulations and can account for nuances within the regulatory text’s concepts. The intention of coding and categorization is to use labels or tags for main ideas and hide or package their details in tag attributes. Assigning those codes and tasks allows privacy professionals to further analyze the implementation throughout various phases of the agile approach, including brainstorming, design, implementation, testing, and quality control.
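One way to picture a tag with attributes is as a small record whose label names the main idea and whose attributes hold the pointers into the regulation. The sketch below is an assumption about how such a record could look; the class names, field names, and the parsing helper are hypothetical, and only the sample values come from Table 1.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MandateRef:
    article: str
    recital: str


@dataclass(frozen=True)
class Tag:
    category: str                      # main idea, e.g. "Consent"
    name: str                          # refinement, e.g. "Consent Acquisition/Withdrawal"
    mandates: tuple[MandateRef, ...]   # attribute pointing at parts of the regulation

    @property
    def label(self) -> str:
        return f"{self.category}: {self.name}"


def parse_mandates(cell: str) -> tuple[MandateRef, ...]:
    """Parse a Table 1 cell such as 'Article 07 / Recital 32, Article 07 / Recital 33'."""
    refs = []
    for part in cell.split(","):
        article, recital = (piece.strip() for piece in part.split("/"))
        refs.append(MandateRef(article, recital))
    return tuple(refs)


consent = Tag(
    category="Consent",
    name="Consent Acquisition/Withdrawal",
    mandates=parse_mandates("Article 07 / Recital 32, Article 07 / Recital 33"),
)
print(consent.label)
print(consent.mandates)
```

The label is what developers and privacy professionals see and assign; the details stay packaged in the `mandates` attribute until a report or audit needs them.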
Using tags, we can aim to write user stories that form the building blocks for the second goal of generation. Furthermore, developers can manage these tasks with agile methods using various application lifecycle management packages for verification. And finally, automated reports can be built on the correspondence between tags and mandates to simplify the PIA and DPIA process.
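As a rough illustration of such a report, the sketch below walks from tagged tasks through the tag-to-mandate mapping of Table 1 and lists, for each mandate, which tasks are done and which are still open. The task records, their IDs, and the function name are hypothetical; a real application lifecycle management package would supply its own export format.

```python
# Tag -> mandates, copied from Table 1.
TAG_MANDATES = {
    "Consent: Consent Acquisition/Withdrawal": [
        "Article 07 / Recital 32",
        "Article 07 / Recital 33",
    ],
    "Transparency: Whole Program": [
        "Article 12 / Recital 58",
        "Article 12 / Recital 59",
    ],
}

# Hypothetical tasks, each carrying the tags assigned during planning.
TASKS = [
    {"id": "T-1", "title": "Add a consent withdrawal button",
     "tags": ["Consent: Consent Acquisition/Withdrawal"], "done": True},
    {"id": "T-2", "title": "Build a page listing processing activities",
     "tags": ["Transparency: Whole Program"], "done": False},
]


def mandate_report(tasks, tag_mandates):
    """Map each mandate to the IDs of the done and open tasks tagged for it."""
    report = {
        mandate: {"done": [], "open": []}
        for mandates in tag_mandates.values()
        for mandate in mandates
    }
    for task in tasks:
        status = "done" if task["done"] else "open"
        for tag in task["tags"]:
            for mandate in tag_mandates.get(tag, []):
                report[mandate][status].append(task["id"])
    return report


report = mandate_report(TASKS, TAG_MANDATES)
for mandate, status in report.items():
    print(f"{mandate}: done={status['done']} open={status['open']}")
```

A mandate with no done tasks stands out immediately, which is the kind of visibility an auditor or DPO would need during a PIA or DPIA.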
Next steps, challenges and research questions
We demonstrated how tags, acting as an abstract layer between regulation mandates and tasks, can contribute to PIA and DPIA procedures by codifying ideas, reducing subjectivity, and laying out a procedure for the generation and verification of tasks in agile development.
While this seems to be a promising approach for both verification and generation purposes, there are many challenges to address.
For example, the uniqueness of tags and the degree of subjectivity in assigning a tag, or class of tags, are open to debate. While using this method, we observed that two privacy professionals did not necessarily assign the same tag and class of tags to a mandate from a GDPR article or recital. We found that further collaboration was needed to refine tags and arrive at a coherent set. Another important question is whether developers can choose from a list of tags on their own, or whether privacy professionals such as DPOs should be responsible for assigning those tags to tasks by participating in project planning meetings.
As the urgency to comply with GDPR increases, tags and tasks are a step in the right direction for simplifying the collaboration between developers and privacy professionals.