Editor's note: The IAPP is policy neutral. We publish contributed opinion and analysis pieces to enable our members to hear a broad spectrum of views in our domains.

Artificial intelligence is increasingly transforming business operations and has significant potential to benefit society. Yet its advancement introduces serious risks, particularly for marginalized communities such as Indigenous peoples, where repurposing data for AI applications raises complex ethical, legal and privacy issues, especially concerning data sovereignty.

Canada is the ancestral land of numerous Indigenous groups, each possessing their own cultural, historical and traditional values. Indigenous data refers to any information affecting Indigenous lives at collective and individual levels, including information related to health, finances, socioeconomic conditions, lands, resources, history and culture.

First Nations data sovereignty encompasses not only personal data, but also community and cultural information, including images of sacred objects, traditional stories, art, and, broadly, any data related to the nation, its people, lands, resources, programs and communities.

Canada has a treaty obligation to recognize First Nations data sovereignty under the United Nations Declaration on the Rights of Indigenous Peoples. Additionally, section 35 of the Constitution Act, 1982 recognizes and affirms the existing Aboriginal and treaty rights of the Aboriginal peoples of Canada. However, the federal government has yet to explicitly and fully recognize First Nations data sovereignty in any legislation.

The growing demand for datasets to train AI systems often leads to the use of large volumes of personal information without appropriate safeguards, leaving Indigenous communities particularly vulnerable as existing privacy laws and regulatory frameworks struggle to keep pace with AI technology.

Despite its international obligations, Canada has not adequately recognized or protected Indigenous data sovereignty in the AI era. Addressing this serious gap requires amending the federal Personal Information Protection and Electronic Documents Act to address the unique data rights of Indigenous communities. Nation-to-nation consultation processes must respectfully include Indigenous voices in shaping AI and data governance frameworks. By integrating transparency, consent and accountability into AI governance, Canada can uphold its obligations and ensure an equitable digital future.

Framing the problem: Threats to privacy and data sovereignty

AI offers double-edged potential for Indigenous peoples: it can either drive progress or act as a tool of modern colonization. Although AI is expected to boost the global economy, its algorithmic biases pose significant threats to privacy and often discriminate against marginalized communities. 

As AI companies collect increasing amounts of personal information across settings and demographic groups, information from individuals using AI-driven services in remote Indigenous communities in Canada is routinely absorbed to train and refine these systems. Moreover, the use of mainstream AI, such as large language models, risks Indigenous data being mixed into vast, undifferentiated databases.

Concerns are growing about the repurposing and secondary use of personal information. AI systems like Clearview AI collect personal information, either from publicly available sources or by purchasing it from third parties, to build databases or conduct analyses for purposes not originally disclosed to data subjects. This raises a critical question: What happens when Indigenous personal information collected for one purpose is reused for a secondary objective?

Furthermore, in practice, the Crown retains control over First Nations' data and unilaterally determines how data is accessed and used. In some cases, the Crown has even profited from the sale of such information without obtaining consent or engaging in proper consultation with the affected communities. 

AI can discriminate based on a variety of factors, including ethnicity. Historically biased data used to train AI systems disadvantages specific communities. First Nations are not typically involved in the training of AI systems, which contributes to potential discrimination against them. Using First Nations' data to train machine learning systems without the permission, consent or knowledge of the individuals concerned violates their data sovereignty rights.

The principles of ownership, control, access and possession play an important role in First Nations data sovereignty. First Nations must own data about themselves and have the right to control its collection, use, disclosure and destruction. They must also maintain access to their information and determine who is permitted access to it. Possession is the mechanism through which ownership can be asserted and protected. 

Designing a response

One important tool in responding to this issue is the Organisation for Economic Co-operation and Development's Reference Checklist for Regulatory Decision-Making, which reflects good decision-making practices used by OECD countries to improve the effectiveness of government regulation. The checklist can play a crucial role in supporting sound policymaking.

Based on this checklist, the benefits of regulation not only outweigh the costs but also make regulatory action essential. Most importantly, Indigenous groups must be able to understand legislation. Regulations and policy guidelines should be clear, consistent and as accessible as possible. Companies should also be required to draft terms of service and privacy notices in language that is easy to understand. Community organizations can play a vital role by translating guidelines into local languages. 

Finally, compliance should be addressed through clear enforcement mechanisms and accountability measures. Companies that work directly with data belonging to Indigenous groups must specifically address concerns and vulnerabilities in their privacy impact assessments. 

Experiences abroad

The protection of Indigenous rights has been recognized as an international human right in numerous human rights declarations and treaties. 

The United Nations Declaration on the Rights of Indigenous Peoples supports the rights of First Nations to participate in decision-making processes and to determine how information about them is collected and used. The International Covenant on Civil and Political Rights is consistent with this position.

Countries have also recognized the data sovereignty and privacy rights of Indigenous peoples. Australia's Principles for Indigenous Data Sovereignty emphasize Indigenous control over the data ecosystem, the need for contextual and disaggregated data, relevance to self-determination, accountable data governance structures, and the protection of both individual and collective Indigenous interests. 

While Australia has not enacted a law or formal regulatory framework for Indigenous data governance, the government has developed a governance framework intended to function as soft law. It specifically emphasizes partnering with certain Indigenous peoples at all stages of the data cycle to ensure their priorities are reflected in data about their communities. The governance framework also outlines steps to build data-related capabilities and develops clear methods for communities to promote organizational and cultural change and understand what data is held about them, how it is used, and how it can be accessed.

Though this is a good example of involving Indigenous peoples in data governance, it is not sufficient. Indigenous information must receive enhanced protection, and Indigenous organizations must be entitled to govern Indigenous data.

While data sovereignty and the right to self-determination are globally recognized rights of Indigenous peoples, mere recognition on paper is not sufficient to ensure enforcement. Canada must proactively and meaningfully engage Indigenous peoples in all matters affecting their data and governance.   

Stakeholders' involvement

Involving disadvantaged groups in AI design and regulation helps eliminate bias during the development process. Limited consultation with a few Indigenous individuals or representatives does not fulfill the legal and moral obligation to conduct true Nation-to-Nation engagement. The lack of First Nations' involvement in data development and use has led many communities to hesitate or refuse to share their information.

When communities lack trust in the organizations or individuals collecting the data and do not see value in its intended purpose, securing their participation becomes extremely challenging. 

Conclusions and recommendations

Canada's PIPEDA should be amended to include clauses specifically aimed at protecting the privacy of First Nations. This should include an amended definition of personal information that reflects its relevance in Indigenous contexts. 

Since information about tribes is identifiable and holds cultural and political significance for Indigenous nations, a collective information regime, comparable to the concept of personal information, is also needed.

Indigenous data sovereignty must be defined as the right of Indigenous peoples to govern the collection, use and disclosure of their own data in a manner compatible with the definitions outlined in PIPEDA. 

The requirement for collective consent should be established alongside individual consent when data involves Indigenous communities. The consent mechanism should be adapted based on the sensitivity of the information, which must be determined in consultation with First Nations assemblies, societies, governments and communities. 

Indigenous information should refer to data about identifiable Indigenous individuals and include cultural heritage, genomic and health information, and land-related data connected to traditional knowledge. Ownership rights over this information should be granted to individual Indigenous Nations in recognition of their right to self-governance. 

Shaped by the uniqueness of their cultures and societies, First Nations hold distinct understandings of sensitive information. This requires a definition that is suitable to their specific context. Proposed amendments should include examples of secondary uses of Indigenous information, both personal and collective, and clearly outline safeguards to address associated concerns.

Data sovereignty must be explicitly defined within the legal text, not relegated to secondary sources. PIPEDA should reiterate the principle that Indigenous data should be governed by Indigenous communities, in accordance with the First Nations' principles of ownership, control, access and possession.

M. Milad Khani is a legal researcher at the University of Ottawa.