As copyright disputes become a dominant issue in artificial intelligence’s future, the question of who shoulders liability for infringement has become central to the legal debate.

AI developers claim they need massive amounts of data to improve the accuracy and quality of generative outputs. Countries are wrestling with whether copyright and fair use laws allow the wholesale collection of material without paying its creators.

While some developers have struck agreements with news outlets or websites to train on their content, technology companies have also argued their generative AI products create something entirely new out of training materials and are therefore protected by fair use doctrine.

The tension is likely to continue, according to panelists at the IAPP AI Governance Global North America 2025, because opposing sides are taking different stances on who is to blame if copyright infringement is found.

The AI training conundrum

Taylor Duma Partner Van Lindberg, who specializes in intellectual property and technology, said the landscape is complicated because of the myriad ways AI interacts with information. He argued training on copyrighted works should have a different liability profile than retrieval-augmented generation, in which large language models search for information beyond their original training data to supplement their responses.
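To make that distinction concrete, here is a minimal sketch of the RAG pattern in Python. The functions are hypothetical stand-ins, not any vendor's API; the point is only that external material is pulled in at answer time rather than baked into the model during training.

```python
# Minimal sketch of retrieval-augmented generation (RAG).
# retrieve() and generate() are hypothetical stand-ins, not any vendor's API.

def retrieve(query: str, top_k: int = 3) -> list[str]:
    # In a real system: a web search or vector-index lookup returning
    # passages the model never saw during training.
    return [f"[stub passage {i} for: {query}]" for i in range(top_k)]

def generate(prompt: str) -> str:
    # In a real system: a call to a large language model.
    return f"[stub response to a {len(prompt)}-character prompt]"

def answer_with_rag(query: str) -> str:
    # The step Lindberg distinguishes from training: outside material is
    # copied into the prompt at inference time, not into the model weights.
    passages = retrieve(query)
    prompt = ("Answer using only the sources below.\n\n"
              + "\n\n".join(passages)
              + f"\n\nQuestion: {query}")
    return generate(prompt)

if __name__ == "__main__":
    print(answer_with_rag("What did the 1984 Betamax ruling decide?"))
```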

But intention also matters, Lindberg argued, and AI developers should not be held accountable if their users create something that infringes on a copyrighted work. He noted it often takes deliberate action, such as structuring a query to elicit a specific response, known as adversarial prompting, to get LLMs to reproduce copyrighted material.

Lindberg compared AI developers to Sony, whose Betamax video recording device was at the center of a landmark 1984 copyright case. The U.S. Supreme Court determined making individual copies of TV shows to watch later was fair use and manufacturers of home recording devices were not liable for secondary copyright infringement by their users, known as contributory infringement.

Rather, it is users who should be held accountable if they use AI models to infringe, Lindberg said.

"You can say whether they infringed or not, or whether that the creation of the machine was fair enough, but at a certain point you need to say that the person who is using the machine and eliciting a particular response bears responsibility for what they are doing," he said.

The New York Times Company Vice President and Assistant General Counsel Simone Procas argued users would not be able to access those materials if not for the AI’s training. The New York Times is in a long-running lawsuit against OpenAI for training on its news articles without permission, but it has also struck a licensing agreement with Amazon to train on its content.

Procas said attempts to shift liability for infringement to users by altering terms of use may not land well with the public, noting it is easy for users to create works based on copyrighted material, sometimes without intending to.

"I think that's going to be a tricky argument for them to make in the court of public opinion, since everybody is using these AI tools, and they're just not going to like hearing that OpenAI and Microsoft is planning to blame them and try to shift legal responsibility to the users," she said.

Without stronger legal protections, creators have few ways to protect themselves from unwanted copying, Procas added. She said technical measures like robots.txt are "blunt instruments" that can be ignored by crawler bots. Moreover, the search engines the NYT relies on to bring traffic to its site do not differentiate between bots surfacing websites in search results and those grabbing information for training.
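For context, robots.txt is a plain-text file at a site's root that asks crawlers to stay out; honoring it is entirely voluntary. An illustrative example follows. GPTBot (OpenAI) and CCBot (Common Crawl) are published crawler names, used here purely for illustration.

```
# Illustrative robots.txt, served at example.com/robots.txt.
# Compliance is voluntary; a crawler can simply ignore these directives.

User-agent: GPTBot
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: *
Allow: /
```

A publisher that blocks a crawler outright also loses whatever traffic that crawler drives, which is the blunt-instrument problem Procas described.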

Looking to the record

The approach to tackling AI copyright questions varies by jurisdiction.

The U.S. has yet to make any federal policy decisions on the issue, although the U.S. Copyright Office is conducting a multi-part study of the intersection of AI and copyright.

Other jurisdictions have taken more action. South Korea's president promised to ease the country’s copyright rules to promote AI development, while the U.K. government is working toward changing AI copyright law to clarify that companies are allowed to train their models without creatives’ permission.

In the EU, the European AI Office laid out extensive rules for complying with copyright law in its voluntary code of practice for general-purpose AI.

The copyright battle in the U.S. has largely played out in the courts, with newspapers, publishers and creatives filing numerous lawsuits against AI developers, arguing that scraping their material from the web or using their images and audio to train generative products violates copyright law.

Two high-profile cases, against Meta and Anthropic respectively, have taken divergent paths. A federal judge tossed the case against Meta on technical grounds without ruling on whether the social media company’s use of books for training was lawful. Meanwhile, Anthropic is looking to settle a class-action case accusing it of violating copyright law by amassing a library of pirated works.

The judge in the Anthropic case, who rejected an initial settlement proposal, said using copyrighted material for training might not violate the law, but maintaining a library of pirated works might.

With AI copyright issues bound to persist, Copyright Clearance Center Vice President and General Counsel Catherine Zaller Rowland said the two sides must work toward a balance. Otherwise, creators will be disincentivized from producing new content, leaving less material to improve AI down the road.

Finding harmony, Rowland said, can only benefit both parties in the end.

"The whole point of copyright is to promote progress and create new works. That's the point of copyright, to incentivize people to do that," she said. "Technology helps get that going farther."

Caitlin Andrews is a staff writer for the IAPP.