By Martin Gomberg, CIPP/E
When the nonanonymized genomic data of an individual is processed for any purpose — including medical, law enforcement or retail consumer uses — the sensitive personal data of all related individuals, directly or indirectly identifiable, is also processed. This includes the personal data of those unaware of the processing, as well as those who won't or can't provide consent. Processing an individual's data without their knowledge, consent or an appropriate legal means is, by definition, surreptitious.
A DNA sample processed to identify an inheritable health risk for one brother who wants to know could also identify potential risks for another brother who adamantly does not want to know his risk. Even if the processing is artifactual and unintended, information has been processed by a company and some part of that data is related, directly or indirectly, to both brothers. Of course, the genomic profiles of two brothers are not identical. Just under half of their DNA is inherited from their father, with the brothers sharing only some of it in common. The other half is inherited from the mother's familial line and, together with the genetic mutations each carries, this DNA both relates the brothers and expresses their unique individuality.
Regarding individuals' privacy, and the emerging language in privacy laws, it is less about the specifics of the data processed and more about the challenges to processing sensitive, related and reasonably associated individuating data.
There is no more intimate exposure of our person than through our most generationally persistent and impactful data, our DNA. It can reveal our ancestral makeup, family relations, traits we pass forward, predisposition to disease and the specific diseases we carry, our physical characteristics, and our predilection to specific behaviors and behavioral abnormalities. It can inform on paternity, infidelity, criminality and other sociolegal questions. It can predict likely longevity.
Unlike other data typically anticipated by privacy laws, such as the personal data of individuals or shared households, DNA is extra-individual. The data is not only associated with us and our familial households, but also with others to whom we are biologically related. Yet our innate curiosity and personal interest in knowing about ourselves, our relatives and our ancestry compel millions of individuals to happily contribute a swab or vial of saliva, granting consent to for-profit direct-to-consumer genomic processing companies to assess our genotypical makeup and relatedness.
But privacy laws do not handle the interrelatedness of individuals well. Nor do they consider how our actions, our disclosures or the processing we consent to impact the next or descendant generations. Clearly DNA data is personal and highly sensitive. But can laws treat the DNA data shared in a genetic cohort the same as an individual's personal data?
Regulatory challenges
Whether processing of one family member's DNA is at the same time processing of another's without their knowledge is the core of the question of whether the processing is, in fact, surreptitious. It is also the core of regulatory challenges direct-to-consumer companies may face.
There are nearly 200 privacy laws worldwide. In privacy there are principles, truths and structures common to existing and emerging laws everywhere across the globe. Irrespective of which law, primary among these are informed consent and the use of personal data in the legitimate interest of the individual, company, community, national or public good, in contracts, necessary processing and in compliance with law and authorities. None fit retail genomic data well.
There are 12 U.S. state privacy laws currently enacted and approximately 20 others pending. Irrespective of even the best efforts to manage security and privacy, aligning current policies and processing to existing and continually emerging domestic and potentially global regulations will be challenging. Each law imposes increasing demands on informed disclosures, processing, and protecting consumers and children.
Article 4(13) of the EU General Data Protection Regulation defines genetic data, and Article 9 treats it as a special category of personal data whose processing is prohibited unless an exception applies. Because of its importance, Article 9(4) allows member states to set further conditions, controlling or enabling medical, scientific and clinical use and research. But retail direct-to-consumer processing is treated inconsistently and is covered under varied regulation across EU member states. Through medical device, patients' rights, bioethics, health or genetic regulations, it may be disallowed in part, disallowed completely, or require a local or medical facilitator. Some U.S. companies avoid these complexities by not participating in European markets.
The U.S. has been more favorable to direct-to-consumer genomic processing. Absent comprehensive privacy regulations, it has largely been treated as a retail consumer service, unrestricted in most states as to data collection, use and sharing. But with increasing scrutiny by the U.S. Federal Trade Commission, new state privacy, health and child protection laws, and as the U.S. adopts a more GDPR-like posture and language, there may be new challenges to direct-to-consumer uses of genomic data.
Like the GDPR, California's Consumer Privacy Act and other new laws define "personal information" broadly, to include any information reasonably linked or associated, directly or indirectly, to "an identified or identifiable individual." All known family members share an identified relationship through inheritable DNA. When processing a father's genomics, his inheritable DNA profile — positive, benign and deleterious — is processed and identified. It is the father's right to know and consent for himself. But absent a means of consent, his children, once they are adults, are passive recipients of the processing performed under the consent given by their father.
Blood relatives are all reasonably linked by DNA. They are identifiable by genetic genealogy. Cousins estranged, or who never met, share familial data. With the qualifiers reasonably linked or associated, personal data of one family member is related to others, identified or not.
Other laws add to these challenges. Connecticut, Colorado and Virginia require opt-in consent for processing of sensitive data and, once consent is given, a means to later revoke it. The CCPA identifies genetic data as sensitive, and its inferential uses can be restricted by consumers. Utah requires notice and a right to opt out. Without notice of processing to impacted individuals, identified or identifiable, no means to consent is made available to them, and restricting, revoking or opting out of the processing is not possible.
Colorado and California each disallow dark patterns, arguably a consequence of surreptitious DNA processing. All nonanonymized uses of DNA, regulated or not, share two issues: the unintended processing of potentially exposable, usable or referential data about individuals, and the processing of individuating data without a legal basis, informed consent or even the subject's knowledge.
Even with the best of efforts toward compliance, newly adopted EU-like language in U.S. law, a flood of emerging state regulation and legislation, what has been termed the "murky" nature of consent, and the nature of DNA, may prove to be difficult or insurmountable for some businesses.
Default consent and surreptitious processing
One of the more difficult challenges of genomic processing is in the use of consent. Consenting to the processing of our DNA is consenting to an exposure of the potential genomic makeup of close family, distant relatives and descendants. Surreptitious processing is any clandestine, covert or unauthorized processing. Artifactually processing the personal data of others under the valid consent of one party may be construed as surreptitious, as it is uninformed and without a choice or opportunity to opt in or out.
Dark patterns
Does surreptitious processing in this context of default consent — a preticked box that defaults to "Yes, I consent to the processing of my DNA" — meet the criteria of a dark pattern? Absent any question of consent or any informed choice by its inheritors, it would seem so. Typically, a dark pattern refers to a deceptive interface that compromises individual choice and intent. In this case it is less a matter of deceptive design and more about an imposed and obscured default consent, and an absence of the means, or opportunity, to exercise informed choice.
Does granting consent for DNA processing today deny next generations informed choice?
Every lifespan is granular, stacked generation on generation, great-grandparent to great-grandchild, bracketed by all that came before and those ahead. The interests, choices and disclosures of one generation can trample and compromise those of the next. Great-grandparents, grandparents, parents, children, grandchildren and others share a lifespan together, each making choices, decisions and impactful disclosures about themselves, their family and their lives. The inheritance of a "default consent" to the disclosure of their genetic complement, given not by our children or theirs directly but by us as a parent, grandparent or other relative, denies choice.
DNA data is unlike other personal data. It resists transparency, minimization, erasure and retention limits. Its individuation is not linear. A disclosure of self is also a disclosure of descendants and relatives, and it generationally persists. Parents, children, grandchildren and cousins each make each other relatable and findable.
In a paper titled "Murky Consent: An Approach to the Fictions of Consent in Privacy Law," Professor Daniel Solove states "in most circumstances, privacy consent is fictitious. Privacy law should take a new approach to consent that I call 'murky consent.' Traditionally, consent has been binary — an on/off switch — but murky consent exists in the shadowy middle ground between full consent and no consent. Murky consent embraces the fact that consent in privacy is largely a set of fictions and is at best highly dubious."
The recent 23andMe data breach that exposed data of individuals of Ashkenazi Jewish heritage is egregious, but the potential risk will only increase as we collect more data, increase its density, keep it perpetually and use it in new ways. Neither technology nor privacy laws can fully address this. Privacy expectations cannot be met where personal data is inheritable, touches multiple people or crosses generational barriers. Consent is a poorly fitting control and, as a minimal safety net, should expire, have limits, include designated proxies for postmortem decisions and require periodic revisitation for renewal.
And it is only a matter of time before digital IDs and other individuating data emulate DNA in being persistent, associative, inheritable, relatable and immutable, requiring us to rethink privacy laws. Even now, several companies and technologies promise to sequence an individual's entire genome and commit it to an immutable blockchain.
Extrapolating from Maslow's Law of the Instrument, when all you have is a privacy law, all personal data looks like data relating to a natural (living) person. Except it isn't, and it likely will be increasingly less so. We each suffer a cognitive bias toward the tools we hold most familiar. Privacy laws are a poor fit for genomic data. They are also inadequate for blockchain, transindividual, postmortem, massively dense and generationally persistent data, and genomic data is increasingly all of these.
There is a difference between synchronic consumer data, where applicability of the law is constrained to the lifespan of a living individual, and perpetual consumer data that survives the passing of generations. The same laws or language may not fit both. Consumer privacy laws are myopic in focusing only on protecting the rights of individuals living today. Even legal bases like consent on an individual's behalf, public interest or a company's legitimate interest fail where the needs, risks and context of upcoming generations are unknown. DNA spans generations and people, both those alive and those yet to be born. Ethical responsibility will only increase with the sophistication and sensitivity of technologies as they increase our reach and expose more relationships to others.
DNA's 'unique' nature
Society demands a lot from its DNA data, continually increasing the density and persistence of data to support research and innovation. Europe's "1+ Million Genomes" initiative aligns 25 EU countries, Norway and the U.K. in establishing a genomic infrastructure for medical research and clinical trials. In the U.S., the National Institutes of Health's National Human Genome Research Institute anticipates genomics research will potentially generate up to 40 exabytes of data in the next decade. Individual privacy benefits from minimization and limits to the data we hold and expose. Innovation in medicine, health and scientific research, by contrast, will only benefit from increased collection, dense aggregation, and the retention and persistence of more data over time.
Increasingly, synthetic DNA, or derived datasets, are being used to create pseudo datasets from real data for research purposes. These reduce the risk of exposing "live" individuating data. This is both an opportunity and risk. Larger, denser, richer and generationally persistent datasets are needed for research and innovation, and can be created using derivatives of the genetic profiles of millions. Persistence will allow its reuse in new ways for generations.
But with greater density and persistence, the risk of compromise and the attractiveness of the target also increase. With technological or statistical advancement, localization, reidentification and individuation remain risks for a dataset that, by design, violates retention limits, data minimization, the rights to know and access it, and other principles associated with privacy laws. DNA is a conundrum for privacy, and privacy laws do not align well with genomic data.
Genomic processing is critical for research and clinical trials but is challenged by the absence of an obvious legal basis that spans individuals, relations and generations, or the processing or disclosure of the sensitive personal information of nonconsenting or uninformed individuals and future generations. It seems unavoidable that the processing of the data of one person under consent is also a violation of the right to informed consent for another. It is the unintended consequence of current privacy laws, and the persistent, inheritable and unique nature of DNA.