With Our Privacy Terminology, Are We Even on the Same Page?

For nearly 30 years, I’ve done my best impression of Inigo whenever I hear someone using a term where I take exception to the implied definition. Unfortunately, today I’d need to be a better fencer than even the Dread Pirate Roberts if I wanted to object each time I hear a common privacy term used in a way that, I sense, has no majority agreement behind its actual meaning.

I’ve previously written about what I call the “Problem At the Heart of the Privacy Profession”—chiefly that we don’t have anything close to agreement on the definition of the very thing, PII, which is at the foundation of all that we do. And given the fact that we’re on a shaky foundation, it should probably come as little surprise that it becomes even harder to pin down the beams on higher floors.

That said, it may well be past time to acknowledge the extent of our structural—err definitional—uncertainty.

I’m sure I’m just mentioning a minority, but indulge me as I call out just a few terms that seem to come up in nearly every conversation I have lately but where I don’t think there is anything close to an accepted definition or even a shared understanding. These terms include:

First/Third Party
Big Data
Data Broker
Data Minimization
Deterministic
Device ID
Fingerprint
PII
Precise Geo
Privacy by Design
Probabilistic
Sensitive
Track
Humflamate

The list goes on.

Privacy is certainly not the first field of human endeavor to have definitional troubles. I am often reminded of the U.S. Supreme Court challenge to define obscenity—leading to the now-famous line by Justice Potter Stewart, who said simply, “I know it when I see it.” But “obscene” works as a word in the common vernacular because, while there are always edge cases, most of us do know it when we see it. When I say, “photo of women’s ankles,” most of us agree non-obscene, while if I named a Larry Flynt magazine, most of us think obscene.

Alternatively, if I said, “Big Box Retailer CRM database” and called it non-data broker, I predict we’d see nothing like the general agreement we have on women’s ankles. If I said “Facebook Like icons,” I predict we’d see two intransigent camps who would call the icon either definitively first-party or definitively third-party, without acknowledging any potential for a level of Victoria’s Secret-catalogue ambiguity.

I posit that there are two fundamental reasons why we see this disparity more in privacy than in other fields and why we face significant hurdles in developing consensus.

The first reason is that most people simply don’t have sufficient understanding of the technologies underpinning the majority of these terms to have a well-formed basis from which to converse. It’s a potentially harsh statement, but it is unfortunately truer than ever.

Take the cookie as the simplest of examples. Cookies have been around for more than 20 years. Cookies are actually easy to define, in the sense that 10 random HTTP experts would see a good definition and say, “yup that captures what an HTTP cookie is,” but equally, those same 10 experts would probably cringe when they read the cookie description in most website privacy policies today.
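To make the "easy to define" point concrete, here is a minimal sketch of what a cookie is at the HTTP level: a name and a value, plus a handful of attributes, sent by a server in a Set-Cookie header. The header string, the cookie name and the domain below are made up for illustration; the parsing uses Python's standard http.cookies module.

```python
from http.cookies import SimpleCookie

# Hypothetical Set-Cookie header value (illustrative only, not from any real site).
header = "session_id=abc123; Domain=example.com; Path=/; Max-Age=3600; Secure; HttpOnly"

cookie = SimpleCookie()
cookie.load(header)  # parse the header string into named "morsels"

morsel = cookie["session_id"]
name = morsel.key            # the cookie's name
value = morsel.value         # the cookie's value
domain = morsel["domain"]    # which hosts the browser will send it back to
max_age = morsel["max-age"]  # lifetime in seconds, kept as a string

print(name, value, domain, max_age)
```

That is essentially the whole concept: a small labeled value the browser stores and returns to the matching domain until it expires. Everything a privacy policy says beyond that is a description of how the value is used, not of what a cookie is.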

So where does that leave us for communicating definitions of fingerprints? Deterministic IDs? Big data?

At some point, understanding things like big data as being distinct from alotta data requires at least a conceptual understanding of the implications of very advanced and evolving technologies. Please note, this is absolutely not a slight on privacy pros but rather a recognition that the field is expanding at an exponential pace in a manner that requires the same specialization we see in other mature disciplines like engineering.

The second reason is simpler. Good old-fashioned motivated reasoning.

I was reading an article the other day about the rise in accuracy of probabilistic IDs vs. deterministic IDs, with the underlying theme that probabilistic was now “as accurate” as deterministic. That said, much of what was cited in the article as a probabilistic ID was, in my mind, actually a deterministic ID. In fact, I was under the impression that there had actually been a consensus opinion that the types of IDs previously in one camp were now being used in the other camp.

What changed?

In a sanity check later in the week with a colleague, I was given a possible explanation that made a world of sense. The reason may be as simple as branding and a desire to either associate your solution with an in-vogue term or avoid an association with a maligned term. There probably is no bright line between probabilistic and deterministic IDs, and, absent that definitive test, the desire to associate your preferred solution with the term that is currently most favorable is natural and not necessarily disingenuous.

The same movement toward classifying in a favorable light is true of “party-ness.” Today there is a huge debate as to whether being a first or third party is an inherent property of who you are or is contextually a property of where you are. Is a party always a first party because consumers are familiar with the brand, even if the brand is appearing in a place where you might not expect it, or can you be a first party if your brand is unknown but where you work as an agent of the “obvious” first party?
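For contrast, here is a rough sketch of the purely contextual reading of "party-ness" that a browser might apply: a resource is first-party if it comes from the same site as the page being visited, and third-party otherwise. The function names and URLs are hypothetical, and the "last two labels of the hostname" rule is a deliberate simplification (real implementations consult the Public Suffix List).

```python
from urllib.parse import urlsplit

def naive_site(url: str) -> str:
    """Return the last two labels of the hostname, e.g. 'facebook.com'.

    Simplifying assumption: this ignores multi-part suffixes such as 'co.uk'.
    """
    host = urlsplit(url).hostname or ""
    return ".".join(host.split(".")[-2:])

def party(page_url: str, resource_url: str) -> str:
    """Classify a resource relative to the page that loads it."""
    return "first" if naive_site(page_url) == naive_site(resource_url) else "third"

# The same (hypothetical) Like-button URL, loaded in two different places.
like_button = "https://www.facebook.com/plugins/like.php"
on_facebook = party("https://www.facebook.com/home", like_button)
on_news_site = party("https://news.example.com/story", like_button)

print(on_facebook, on_news_site)
```

Under this test the same widget is first-party on its own site and third-party when embedded elsewhere. Note what the sketch cannot express: brand familiarity, or an unknown vendor acting as an agent of the obvious first party. Those are exactly the points on which the two camps disagree.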

Unlike defining obscenity, the problem with many of these privacy terms is that we don’t see a normal continuum with less disagreement at the ends but rather a polarized bimodal distribution.

At the risk of being glass half-empty, I don’t see a fix to this problem happening anytime soon.

I spent two-plus years of my life amongst W3C experts seeing as much agreement on the definition of the commonly used word “track” as I did around the made-up word “humflamate”—which I think we can all agree means “to humfle.”

I fear that, for the foreseeable future, we may be stuck either continuing to use terms with no consensus definition or creating definitions that hide both our lack of understanding and our bias through sufficiently fuzzy language or amorphous exceptions. So let me end by apologizing that I, in fact, don’t have a suggested fix other than to say that the first step in fixing any problem is acknowledging its existence, and it is time to acknowledge that, all too often, I don’t think it means what [we] think it means.

