Popular lifestyle and utility apps have long raised privacy concerns about their collection and use of personally identifiable information, but the recent Strava "heatmap" issue has reminded us that privacy risks are not confined to PII.

Though arguably personal data in some jurisdictions, raw location data and metadata are often dismissed as low-risk, anonymous data. However, as the Location Data Forum noted in its 2013 Location Data Privacy Guidelines, Assessments and Recommendations, “[t]he power, benefits, and risks associated with location data are in its capacity to infer more personally identifiable information than the face value of the original information.”

As location data becomes more ubiquitous and accessible by more parties (both governmental and private), we need to be concerned about the broad collection, use and potential use of location data and metadata.

Fitness, traffic, directions and weather apps logically require access to location data, and users are likely to consent without much hesitation. But users are often unaware of what data the apps collect, how it is used, and with whom it is shared. The pop-up consent that users click when opening the app does not reveal that information, and, in many cases, the full privacy notice lacks the transparency a user would need to fully understand how their data is used.

The Strava heatmaps, though headline-making given the operations security implications, were a fairly predictable use of the app’s aggregated location data. Other applications have used data for purposes that were less foreseeable and often without clear disclosures to the user. See Uber and AccuWeather for competing what-not-to-do case studies in data-use transparency. AccuWeather, in particular, revealed the ease with which raw location data can be linked to an individual. 

AccuWeather collected its users’ Wi-Fi router names and MAC addresses and shared them with a third-party data monetization company, which could then use that information to identify individuals.
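
A data broker holding a router’s MAC address (BSSID) can match it against compiled Wi-Fi geolocation databases and resolve it to a street address. The sketch below is purely illustrative, with invented identifiers and a tiny lookup table standing in for such a database:

```python
# Purely illustrative sketch: why a "harmless" router identifier is enough to
# locate a person. The lookup table below is a hypothetical stand-in for the
# large Wi-Fi geolocation (wardriving) databases a data broker could query.

# Hypothetical mapping: router MAC address (BSSID) -> latitude/longitude
BSSID_LOCATIONS = {
    "a4:2b:b0:00:00:01": (36.9741, -122.0308),  # e.g., a single-family home
}

def locate_user(reported_bssid: str):
    """Resolve an app-reported Wi-Fi router MAC address to a physical location."""
    coords = BSSID_LOCATIONS.get(reported_bssid.lower())
    # A home router rarely moves, so repeated sightings of the same BSSID
    # effectively reveal where the device's owner lives or works.
    return coords

print(locate_user("A4:2B:B0:00:00:01"))  # -> (36.9741, -122.0308)
```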

Even after its practices were questioned, AccuWeather’s transparency issues persisted. In its published response, AccuWeather argued that it did not share GPS information, which was not part of the accusation, and that AccuWeather itself did not use the information, which also was not part of the accusation. Further, by its own admission, AccuWeather had embedded a third-party SDK into its app without knowing what the third-party SDK did. Not only is that an insufficient defense, but it’s also insufficient vendor management.

Risk mitigation and transparency in applications are achievable with proper due diligence, disclosures and data life cycle management, but location data is also collected at less obvious points, where the user may be unaware of who collected their information, or that it was collected at all.

As “smart cities” emerge, robust privacy programs and tools must be deployed to ensure that citizens understand what information about them is being collected and where and how it is used, and to ensure community involvement in decisions about the methods and technologies used to collect that data. The Future of Privacy Forum’s “Shedding Light on Smart City Privacy” includes tools aimed at educating and empowering citizens, while acknowledging the benefits that can flow from smart cities. In turn, the FPF’s Model Open Data Benefit/Risk Analysis offers resources to help smart city operators make deliberate and conscientious decisions about how and when to publish open data sets.

But as the Electronic Frontier Foundation points out, privacy issues are not static. A solid privacy program today could leave vulnerabilities open tomorrow. “There is an inherent risk of mission creep from smart cities programs to surveillance. For example, cameras installed for the benevolent purpose of traffic management might later be used to track individuals as they attend a protest, visit a doctor, or go to church.”

That’s precisely the type of tracking that recently raised concerns about California’s automated license plate readers (ALPRs). In January, the California Senate rejected S.B. 712, which would have permitted license plate covers on lawfully parked vehicles to prevent ALPRs from mining location data while the cars are parked. The privacy debate included bipartisan concerns about accountability, oversight and private commercial use of such readers.

Only a small leap is required to connect these issues to the tricky Fourth Amendment questions pending before the Supreme Court. It is true that license plates, for example, can be seen in public today just as they could decades ago, without any reasonable expectation of privacy. But in the digital age that information can be stored, aggregated and used retroactively in ways that real-time observation never allowed. The issue is not new in terms of public visibility; it is new in terms of longevity, scale, and the potential for retroactive use of that data.

So how do we reconcile the benefits that flow from location data, open data sets, and other metadata with the complex and evolving privacy concerns?

Carpenter v. United States and the so-called “car cases” are in the hands of the Supreme Court, but there are things smart cities and private actors can do. 

Privacy by design is a good start, but developers need to think through the risks at every step of the data life cycle more thoroughly, and the analysis cannot be limited to personal data or PII. De-identified, aggregated data should not be disregarded as harmless. As the FPF has cautioned, “companies should thoroughly analyze the range of potential unintended consequences of an open data set, including risks to individual privacy (re-identification), but also including societal risks: quality, fairness, equity, and public trust.”
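
The re-identification risk the FPF describes is easy to illustrate. The sketch below is hypothetical (invented pseudonyms and coordinates, a crude distance check) and simply shows how a pseudonymous record in an “anonymized” open data set can be linked back to a person by matching repeated start points against a known home address:

```python
# Hypothetical re-identification sketch: a "de-identified" activity data set
# still carries start points, and a start point that repeats at the same house
# works as a quasi-identifier.
import math

def close(a, b, meters=50):
    """Rough proximity test using an equirectangular approximation."""
    dlat = (a[0] - b[0]) * 111_320
    dlon = (a[1] - b[1]) * 111_320 * math.cos(math.radians(a[0]))
    return math.hypot(dlat, dlon) <= meters

# "Anonymized" open data: no names, just pseudonymous IDs and start points.
activities = [
    {"user": "u-1042", "start": (36.97412, -122.03080)},
    {"user": "u-1042", "start": (36.97415, -122.03077)},
    {"user": "u-7731", "start": (36.95001, -122.06002)},
]

# Publicly knowable fact: the coordinates of a known person's home.
known_home = (36.97413, -122.03079)

suspects = {a["user"] for a in activities if close(a["start"], known_home)}
print(suspects)  # {'u-1042'} -> the pseudonym is linked back to a real person
```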

Privacy by default must be emphasized. As the FPF noted, Strava’s app actually had “more granular privacy controls than most fitness apps — they allow users to selectively hide their activities from others, or to create 'Privacy Zones' around their home or office.” But many users do not adjust the default settings unless they have a known reason to be particularly concerned about their privacy, and users do not know they have reason to be concerned if transparency is missing.
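
In practical terms, privacy by default means the most protective option is what ships when the user makes no choice at all. A minimal sketch, with hypothetical field names rather than any vendor’s actual settings model:

```python
# Illustrative "privacy by default" sketch (hypothetical field names): the most
# protective options are the defaults, so a user who never opens the settings
# screen still gets them and must opt IN to any sharing.
from dataclasses import dataclass

@dataclass
class ActivitySharingSettings:
    share_publicly: bool = False            # opt in to public heatmaps/leaderboards
    share_with_followers: bool = False      # opt in to follower visibility
    contribute_to_aggregates: bool = False  # opt in to aggregated "anonymized" data sets
    privacy_zone_radius_m: int = 500        # hide start/end points near home by default

def new_user_settings() -> ActivitySharingSettings:
    """Settings applied at account creation, before the user has made any choices."""
    return ActivitySharingSettings()

print(new_user_settings())
```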

Know your suppliers. When negotiating vendor contracts involving user data, app providers must take care to learn who will receive their users’ data, what they will do with it, and with whom the data will be shared. App providers cannot be transparent with their users if the provider does not do its own due diligence. If a third-party SDK is involved, providers must find out what the SDK does before allowing it into an application. When assessing risk mitigation, be aware that de-identification may be insufficient to protect your users.

Transparency and data minimization. App providers and smart cities need to prioritize transparency and minimize data collection. App developers and providers must provide clear and concise disclosures, and app users need to actually read those policies and think through what information they are sharing.

Though we will have to accept some risks in exchange for convenience or efficiency, app developers, smart cities, and users and citizens alike need to better understand what those risks are, particularly the less obvious risks, and regularly adjust procedures and communications to better acknowledge and contain those risks.

Photo credit: Richard Masoner / Cyclelicious, “cycling route - all Santa Cruz state beaches,” via photopin (license)