We often hear the terms automation, machine learning, and artificial intelligence tossed around our industry without a true understanding of whether they actually have anything to do with privacy. Ask just about anyone, and they will tell you privacy is all about protecting sensitive and personally identifiable information, and that this task should be priority number one for companies collecting and storing customer data. Yet, in the age of big data, organizations are just beginning to understand how much data they have at their fingertips (financial, health, earnings, purchase, usage, preference, location data, and more) and are grappling with how to protect their vast stockpiles of information. Each of these categories represents data whose exposure to a hacker or unauthorized third party would carry serious consequences. Protecting an organization’s privacy begins with protecting its data, which, given the daily headlines about new breaches, is no simple feat for many security and privacy teams.

How do organizations fail to protect their privacy? Simply put, they fail to protect their data. According to the Verizon 2017 Data Breach Investigations Report, nearly 80 percent of all data breaches were perpetrated by outsiders. This fact alone tells us there must be massive amounts of unsecured or poorly secured data. Unsecured data can fall into the wrong hands in a number of ways: Someone could lose a laptop full of data, someone inside an organization could steal data, hackers on the outside could access and download data, or an individual could share restricted data outside the organization. Although we will never be able to predict and fully prevent human error, we can take steps to get better at protecting data, and smart automation may be the key.

As cybercriminals become increasingly sophisticated and leverage new technologies like AI to enable their attacks, the best chance companies have to defend against the onslaught is to use AI themselves. In this AI arms race, security and privacy teams looking to explore new technology to make their defenses smarter, faster, and more reliable must consider the following approaches to protect their internet-facing data:

  • Block malicious bots from accessing your websites and applications
  • Protect your data with “application-aware” web application firewalls (WAFs)
  • Defend your APIs from malicious calls and denial-of-service attacks
  • Guard against DDoS attacks designed to operate as breach smoke screens
  • Secure your sites from malware uploads designed to steal user data and spread infections

So where do automation, machine learning, and AI come into the data protection equation? Although there is substantial fear and uncertainty surrounding these terms and what they mean to our industry, the value they offer is unparalleled. Ultimately, these technologies supplement human capabilities, making organizations better prepared to protect data, which in turn means that privacy remains intact.

Automation can be used to replicate many of the manual tasks that organizations (and people) tend to repeat, especially with regard to controlling access to data. Automation in IT almost always relies on scripts that can be run over and over again to perform even the simplest of tasks, and the comings and goings of employees are a prime candidate. New employees often need access to data to do their jobs, and granting that access can often be fully automated. Likewise, as employees leave an organization, the process of removing or terminating their access to data can be automated as well. Unfortunately, many organizations fail to completely terminate a former employee’s access to data, which creates a major blind spot within the organization and increases the risk to data privacy. It’s important for organizations to remember that any process that needs to be replicated can be automated, and terminating an individual’s access to data is only a script away.
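The grant-and-terminate pattern described above can be sketched in a few lines. This is a minimal illustration using an in-memory access list; the function names and data structure are hypothetical, and a real deployment would call directory or application APIs (LDAP, SaaS admin endpoints, and the like) instead.

```python
# Minimal sketch of automated access provisioning and termination.
# The in-memory ACL dict stands in for whatever systems actually
# hold entitlements in a real environment (an assumption, not a spec).

def grant_access(acl, user, resources):
    """Grant a new employee access to the resources their role requires."""
    acl.setdefault(user, set()).update(resources)

def terminate_access(acl, user):
    """Remove ALL of a departing employee's access in one repeatable step."""
    removed = acl.pop(user, set())
    return sorted(removed)  # return what was revoked, for the audit trail

# Example run: onboarding, then complete offboarding.
acl = {}
grant_access(acl, "new.hire", {"crm", "payroll", "wiki"})
revoked = terminate_access(acl, "new.hire")
print(revoked)            # ['crm', 'payroll', 'wiki']
print("new.hire" in acl)  # False -- no lingering access left behind
```

The point of returning the revoked list is that an automated offboarding script should leave evidence of exactly what it removed, so the blind spot the paragraph describes never forms silently.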

This is only one example of automation. Others include scheduled web application scanning, whereby the results are immediately interpreted and security controls are implemented without human intervention, or automatically detecting malicious actors and implementing blocks across firewalls, IPS/IDS, sandboxes, WAFs, and other perimeter defenses. All of this is achievable by monitoring events, executing scripts, and testing the results.

In addition to automation, another useful tool for privacy protection is machine learning. Today, machines are being developed that can learn when directed to do so, and most often that process requires a human to teach the machine. This is called supervised machine learning.

Machine learning goes beyond simply replicating a task. Instead, it is focused on building models, running those models, and interpreting the results, all with human input and guidance. While the level of guidance required is still significant, with time, machines will require less oversight. This technology holds clear promise for the future of cybersecurity, and there are already clear applications where machine learning can support privacy professionals.

Take web application security, for example. One of the greatest challenges is securing applications appropriately without blocking “good” traffic. It’s quite the balancing act for those configuring and tuning the WAFs that are required to help thwart data breaches. WAFs often take months to tune effectively, while at the same time DevOps groups are turning out application updates at intervals that outpace their SecOps counterparts. This is where machine learning comes in.

Supervised machine learning gives WAF operators the ability to work together with the “computer” in the WAF itself. With machine learning, operators can teach the WAF to get better at its job by reducing false positives and false negatives in an extremely short period of time. The time to tune a machine-learning-enabled WAF is often measured in hours, not months.
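To make the "operator teaches the WAF" idea concrete, here is a toy supervised classifier trained on operator-labeled requests. It is a deliberately simplified Naive Bayes-style sketch, not how any commercial WAF actually works: the tokenizer, labels, and training examples are all assumptions made for illustration.

```python
# Toy supervised classifier for request labeling, in the spirit of
# WAF tuning: the operator supplies labeled examples ("benign" /
# "malicious"), and the model scores new requests. Purely illustrative.

import math
from collections import Counter

def tokenize(request):
    """Crude tokenizer: lowercase, split on '/', '?', and whitespace."""
    return request.lower().replace("/", " ").replace("?", " ").split()

def train(labeled_requests):
    """Count tokens per class from (request, label) pairs."""
    counts = {"benign": Counter(), "malicious": Counter()}
    for request, label in labeled_requests:
        counts[label].update(tokenize(request))
    return counts

def classify(counts, request):
    """Pick the class with the higher smoothed log-likelihood."""
    vocab = set(counts["benign"]) | set(counts["malicious"])
    def score(label):
        total = sum(counts[label].values()) + len(vocab)
        return sum(math.log((counts[label][t] + 1) / total)
                   for t in tokenize(request))
    return max(("benign", "malicious"), key=score)

training = [
    ("GET /products?id=42", "benign"),
    ("GET /search?q=shoes", "benign"),
    ("GET /products?id=1 union select password", "malicious"),
    ("GET /search?q=<script>alert(1)</script>", "malicious"),
]
model = train(training)
print(classify(model, "GET /products?id=7 union select email"))  # malicious
print(classify(model, "GET /search?q=boots"))                    # benign
```

The key property this sketch shares with the real thing is the feedback loop: every request the operator relabels refines the model, which is why tuning time collapses from months of hand-written rules to hours of labeling.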

Lastly, AI in cybersecurity is the term that causes the greatest concern among its practitioners. AI is the process by which machines begin to mimic their human operators with regard to learning and problem solving. Humans can learn on their own and, in most cases, are also great at solving problems. Similar to how humans learn and grow, AI-driven algorithms also have the ability to learn and evolve over time.

Some potential applications for AI in data security and privacy may include AI-driven logging systems that detect, monitor, and track threat actors’ actions, then use automated kill chains to dynamically block their traffic and/or access without human intervention. Others may include AI-driven systems that find and detect advanced persistent threats, reducing the time from infection to detection to elimination to seconds instead of days, weeks, or even months.

In terms of protecting data privacy, AI shows great promise. Automation, machine learning, and AI are emerging technologies that are expected to soon be embraced by even their staunchest opponents. Many individuals and organizations don’t realize that automation and machine learning are already here, and full-fledged AI is not far off the horizon. Although we are likely years away from AI-enabled cybersecurity technologies learning without human instruction, the probability of this advancement coming to fruition is high.
