Consumer data is a big deal — it affects the consumers who share data, the companies that collect and process data, and the government that oversees the protection and usage of data.
However, there is still no consensus on what constitutes fair use of that data. In March alone, New York state passed a new regulation to enhance consumer data protection in financial services, while Congress voted to eliminate new FCC protections around data usage consent. The thread that connects these decisions, along with new international regulations like China’s Cyber Security Law or the EU’s General Data Protection Regulation, is that personal data is increasingly becoming a battleground. Because of this, organizations will have to navigate competing and sometimes divergent consumer, business and international government priorities.
Accountability through accounting
Customers are the lifeblood of every organization, and that relationship is increasingly built on data. Data is the currency that defines how the consumer interacts and transacts with an organization. Not surprisingly, then, consumers see organizations as custodians of their data, responsible for that data’s safekeeping and fair use. This view of the company as data custodian is only reinforced by the changing regulatory landscape, which requires companies to be more accountable for data protection and privacy, both to their customers and to the regulators that defend them.
However, data accountability is impossible without data accounting.
To know a customer in today’s business world means to know their data, and for most organizations, this remains a difficult task. Data is collected across many applications and processed in ways that are not always obvious. Traditional approaches to data discovery rely on imprecise questionnaires or dated scanners, neither of which provides a comprehensive data asset inventory or mapping. Knowing that you have a nine-digit number in a relational database is not the same as knowing an individual’s data. To know a customer or “data subject” means knowing all their data content and also the context in which that data is used: where it is resident, where it is flowing, who is accessing it, and what consent has been captured for it. Data accountability is impossible without data accounting, and data accounting requires the means to discover both an individual’s data and the usage context of that data.
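As a rough illustration of what accounting for both content and context could look like, the sketch below attaches usage context to each piece of personal data and then summarizes it per data subject. The `DataRecord` fields, the `account_for` helper, and all sample values are hypothetical names chosen for this example, not part of any particular product or regulation.

```python
from dataclasses import dataclass, field


@dataclass
class DataRecord:
    """One piece of personal data plus the context it is used in (illustrative)."""
    subject_id: str            # hypothetical identifier for the data subject
    data_type: str             # the content, e.g. "ssn", "email", "cookie"
    system: str                # where the data is resident
    country: str               # jurisdiction of the data store
    flows_to: list = field(default_factory=list)     # downstream systems
    accessed_by: list = field(default_factory=list)  # applications or roles
    consent: bool = False      # whether consent was captured for this use


def account_for(subject_id, records):
    """Summarize the content and context held for one data subject."""
    mine = [r for r in records if r.subject_id == subject_id]
    return {
        "data_types": sorted({r.data_type for r in mine}),
        "resident_in": sorted({(r.system, r.country) for r in mine}),
        "flows_to": sorted({s for r in mine for s in r.flows_to}),
        "accessed_by": sorted({a for r in mine for a in r.accessed_by}),
        "missing_consent": sorted(r.data_type for r in mine if not r.consent),
    }


# Made-up sample records for two data subjects.
records = [
    DataRecord("subj-1", "ssn", "crm-db", "US", ["billing"], ["support-app"], True),
    DataRecord("subj-1", "cookie", "web-logs", "DE", ["analytics"], ["marketing-app"]),
    DataRecord("subj-2", "email", "crm-db", "US", [], ["support-app"], True),
]
summary = account_for("subj-1", records)
```

Here `summary` answers the questions raised above for one individual: what is held, where it lives, where it flows, who touches it, and which items lack recorded consent. A bare pattern scan for nine-digit numbers would supply none of that context.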
Content and context
Next-generation data discovery tools go beyond just uncovering Social Security numbers. Finding sensitive data still matters, but in today’s online business world, it’s important to accommodate a broader definition of sensitive content while also understanding the context of that data in order to meet security, privacy and data governance priorities.
Take the GDPR, for instance: the citizen data rights it enshrines require organizations to know what data they collect on every individual, what consent they have around that data, where the data is resident, how identifiable the data is, and the purpose for which it is used. Doing this accurately and at scale requires new approaches not only to find the data but also to inventory and map it. Technology-driven tools can help organizations create an atlas of their data so they can zoom in and out on specific characteristics and relationships. This helps organizations meet emerging regulations while also enhancing their customer data knowledge.
Data quality amidst data quantity
As with any map, customer data knowledge has both macro details and micro details. Data stewardship requires a big-picture view of how data comes into an organization, how it gets processed, and how it ultimately gets disposed of. It will also benefit from a more detailed inventory that can be sliced and diced by data type, data subject, calling application, system, country, or even applicable regulation. But to truly make the data valuable, it’s also important to understand the detailed interrelationships between the various data attributes collected and processed. This requires a granular way to zoom in on how, for example, a cookie is connected to an SSN.
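To make the zoom-out and zoom-in views concrete, here is a minimal sketch under stated assumptions: the inventory is a list of dictionaries, attribute relationships are plain pairs, and the helper names `slice_by` and `find_path`, along with every value shown, are invented for illustration. The regulation tags are placeholders only, not legal determinations.

```python
from collections import defaultdict, deque

# Zoomed-out view: a made-up asset inventory that can be sliced by any column.
inventory = [
    {"data_type": "ssn", "system": "crm-db", "country": "US", "regulation": ""},
    {"data_type": "email", "system": "crm-db", "country": "US", "regulation": ""},
    {"data_type": "cookie", "system": "web-logs", "country": "DE", "regulation": "GDPR"},
]


def slice_by(items, key):
    """Group inventory entries by one attribute, e.g. country or data type."""
    groups = defaultdict(list)
    for item in items:
        groups[item[key]].append(item)
    return dict(groups)


# Zoomed-in view: hypothetical links observed between individual attributes.
links = [
    ("cookie:abc123", "email:a@example.com"),
    ("email:a@example.com", "account:42"),
    ("account:42", "ssn:***-**-6789"),
]


def find_path(edges, start, goal):
    """Breadth-first search for a chain of links connecting two attributes."""
    graph = defaultdict(set)
    for a, b in edges:
        graph[a].add(b)
        graph[b].add(a)
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in sorted(graph[path[-1]]):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None  # no chain of links connects the two attributes


by_country = slice_by(inventory, "country")
path = find_path(links, "cookie:abc123", "ssn:***-**-6789")
```

`by_country` gives the big-picture slice, while `path` shows the specific chain through which a cookie can be tied back to an SSN. A real deployment would need far larger graphs and more careful linkage evidence than this toy example suggests.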
Having that granular view also helps resolve common data quality issues in identity data. If two data table entities share some common attributes, are they different people or the same person? Can an IP address be mapped back to an individual? Is a de-identified data set used for analytics re-identifiable?
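The first of these questions can be approximated with a simple attribute-overlap check. The sketch below is only a naive heuristic, assuming rows are dictionaries with an `id` field; the `candidate_matches` function, the `min_shared` threshold, and the sample rows are all hypothetical. Shared attributes flag candidates for review and do not prove that two rows describe the same person.

```python
def candidate_matches(rows, keys, min_shared=2):
    """Return (id_a, id_b, shared_keys) for row pairs sharing enough attributes."""
    pairs = []
    for i in range(len(rows)):
        for j in range(i + 1, len(rows)):
            # Count only attributes that are present and equal in both rows.
            shared = [
                k for k in keys
                if rows[i].get(k) is not None and rows[i].get(k) == rows[j].get(k)
            ]
            if len(shared) >= min_shared:
                pairs.append((rows[i]["id"], rows[j]["id"], shared))
    return pairs


# Made-up rows from two hypothetical tables.
rows = [
    {"id": 1, "email": "a@example.com", "phone": "555-0100", "ip": "203.0.113.7"},
    {"id": 2, "email": "a@example.com", "phone": "555-0100", "ip": "198.51.100.2"},
    {"id": 3, "email": "b@example.com", "phone": None, "ip": "203.0.113.7"},
]
matches = candidate_matches(rows, ["email", "phone", "ip"])
```

Rows 1 and 2 are flagged because they share an email and a phone number. Rows 1 and 3 share only an IP address, which on its own falls below the threshold, reflecting the point that an IP address alone may not map back to an individual.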
Data quality depends on detailed data knowledge. There are tools that give organizations a way to shine a light on customer information at a business level, an asset inventory level and a fine-grained data table/field level. Equally important, such tools should be designed from the ground up to do so at global scale, spanning petabytes of data across regionally distributed and governed data centers and data stores, while also accommodating an ever-expanding definition of what counts as identity data. Moreover, data knowledge-driven protection and privacy tools give organizations a new way to analyze the data and also reuse it in a modern, service-centric way across the organization.