Carnegie Mellon University Professor and Privacy Engineering Program Co-Director Norman Sadeh, CIPT, has noticed that an organization's behavior may depend on whether it falls under a privacy law with opt-in requirements or one with opt-out requirements.
For those who must adhere to privacy rules with opt-in requirements, such as the EU General Data Protection Regulation, Sadeh said the onus is on website operators and service providers to obtain consent, meaning it's safe to assume opt-in links will be displayed prominently to entice users to agree with the companies' data practices.
On the other hand, organizations may face opt-out requirements under the California Consumer Privacy Act, the Controlling the Assault of Non-Solicited Pornography and Marketing Act of 2003 (CAN-SPAM Act), and the Gramm-Leach-Bliley Act. Because those rules differ, Sadeh said those companies will behave slightly differently.
"When it comes to opt-out, these technology providers and website operators don’t have much pressure or incentive to make those choices visible to users," Sadeh said. "The result is that, while these choices may have to be there, they are really very hard for users to find."
Sadeh and his team of researchers at Carnegie Mellon’s CyLab Security and Privacy Institute wanted to help users find opt-out links that may be buried under the dense text of a privacy notice, and to do so, they created the Opt-Out Easy browser plug-in.
Once the plug-in is installed, users can click it to reveal the opt-out links from a website's privacy notice. A box appears listing each opt-out link, and clicking a link takes the user directly to that choice. The plug-in keeps tabs on the links a user has visited by marking them blue. Opt-Out Easy is currently available only for Google Chrome; however, Sadeh believes it will be added to Firefox next month.
The plug-in identifies opt-out links by using artificial intelligence, machine learning and natural language processing to scan the site's privacy notice. For the past seven years, Sadeh and his team have been working to train AI to automatically read the text of privacy notices. He said the Opt-Out Easy plug-in is the latest byproduct of his team's work.
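To give a rough sense of the task, the sketch below pulls links out of a privacy notice and keeps those whose link text contains an opt-out cue phrase. It is only an illustration: a fixed keyword list stands in for the trained models Sadeh describes, and the names `OPT_OUT_CUES`, `LinkCollector` and `find_opt_out_links` are hypothetical, not part of Opt-Out Easy.

```python
from html.parser import HTMLParser

# Hypothetical cue phrases. The real plug-in relies on trained models,
# not on a fixed keyword list like this one.
OPT_OUT_CUES = ("opt out", "opt-out", "unsubscribe", "do not sell")


class LinkCollector(HTMLParser):
    """Collects (href, link text) pairs from an HTML document."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None   # href of the <a> tag currently open, if any
        self._text = []     # text fragments seen inside that tag

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def find_opt_out_links(html):
    """Return the links whose text contains one of the cue phrases."""
    collector = LinkCollector()
    collector.feed(html)
    collector.close()
    return [
        (href, text)
        for href, text in collector.links
        if any(cue in text.lower() for cue in OPT_OUT_CUES)
    ]
```

A keyword match like this would miss choices described in vague or unusual language, which is the kind of gap a trained model is meant to close.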
"We’ve trained machine learning technologies specifically to look for different types of opt-out choices. Now, when you browse the web, you as a user no longer have to go and read these privacy notices and look for these opt-out choices," Sadeh said. "You can install our opt-out extension, and as you browse the web, the extension will be looking for the privacy notices, and it’s going to expose these opt-out links to you so you can much more readily take advantage of these choices that are available."
Sadeh said the plug-in is 93% accurate in identifying and recalling opt-out links. While the accuracy rate is high, training AI algorithms to find those links has been anything but easy.
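The article does not say how the 93% figure was measured. "Identifying and recalling" suggests the standard precision and recall measures, so the sketch below shows how those two numbers are usually computed for a link detector. The function name and the idea of comparing sets of URLs are assumptions made for illustration.

```python
def precision_recall(predicted, actual):
    """Compare the links a detector flagged against the true opt-out links.

    precision: share of flagged links that really are opt-out links
    recall:    share of real opt-out links that the detector flagged
    """
    predicted, actual = set(predicted), set(actual)
    true_positives = len(predicted & actual)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(actual) if actual else 0.0
    return precision, recall
```

A detector can score high on one measure and low on the other, which is why the two are typically reported together.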
One reason stems from the inherent ambiguities of privacy notices. Sadeh said organizations may use intentionally vague terms in their privacy notices to mask their data practices as much as they can, making it difficult for the researchers to find common terms on which to train their models.
But the biggest hurdle comes at the beginning. Sadeh said that to properly train those models to read privacy notices, humans must first manually annotate the documents.
"Just getting all these annotations from humans takes a huge amount of effort. You need to make sure people are on the same page in terms of what they need to look for," Sadeh said. "You need to pay them to spend time looking at these policies very carefully. Nobody enjoys reading the text of privacy notices."
An added challenge with the human element ties back to the ambiguous nature of privacy notices. Sadeh said individuals may interpret the same notice differently, which is why it was important to have numerous people annotate the documents and reach a consensus, yielding the most accurate results to power the plug-in.
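One common way to turn several annotators' labels into a single answer is a majority vote. The article does not describe how the team reached consensus, so the sketch below is an assumption, as are the function name `consensus_label`, the label strings and the agreement threshold.

```python
from collections import Counter


def consensus_label(labels, min_agreement=0.5):
    """Return the most common label if more than `min_agreement` of the
    annotators chose it; otherwise return None to flag a disagreement."""
    if not labels:
        return None
    label, count = Counter(labels).most_common(1)[0]
    return label if count / len(labels) > min_agreement else None
```

Passages that come back as `None` could be sent to additional annotators rather than used for training, one way to keep ambiguous text from weakening the model.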
Opt-Out Easy can currently scan the notices of the 7,000 most popular websites. Sadeh said that should the plug-in gain traction, the budget for the project may be increased and more sites could be added to the list. In the interim, users can request through the plug-in to have websites added. He estimates it takes roughly a week from the time a request is sent to the time the plug-in recognizes the site.
Sadeh knows AI and machine learning will never produce 100% accuracy, but that does not mean there are no avenues for getting the number as close as possible. He added that there are plans to lobby in public policy circles for a standard method of conveying opt-out links to users.
He believes a standardized application programming interface would both address some concerns with opt-outs and make the plug-in 100% accurate.
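No such standard exists today, so what follows is purely hypothetical. It imagines a website publishing its opt-out choices as a small JSON document and shows how little work a browser extension would need to do to read it. The field names `opt_out_choices`, `label` and `url`, and the function `read_opt_out_choices`, are invented for this illustration.

```python
import json


def read_opt_out_choices(document):
    """Parse a hypothetical machine-readable list of opt-out choices.

    Returns (label, url) pairs and skips entries missing either field.
    """
    data = json.loads(document)
    return [
        (choice["label"], choice["url"])
        for choice in data.get("opt_out_choices", [])
        if "label" in choice and "url" in choice
    ]
```

With a format like this, the extension would no longer need to infer anything from the text of a notice.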
"We are not doing that right now, and this is the back and forth that takes place between industry, regulators and privacy advocates," Sadeh said. "That back and forth had led to different places in Europe and in the U.S., but it would simplify things if there was a requirement to just make choices available in a standardized manner and then, as I browse the web, I would be able to directly latch onto these links. I would not have to go look for them."
Photo by Steve Johnson on Unsplash