OPINION

A view from DC: Can AI governance catch up to innovation?

A new report from Stanford HAI highlights major technical progress and a 17% growth in AI governance roles in 2025, but new challenges are rapidly emerging alongside more complex AI pipelines.

Contributors:

Cobun Zweifel-Keegan

CIPP/US, CIPM

Managing Director, D.C.

IAPP

Editor's note

The IAPP is policy neutral. We publish contributed opinion pieces to enable our members to hear a broad spectrum of views in our domains. 

As one stop in an ongoing roadshow, the Brookings Institution hosted representatives from the Stanford Institute for Human-Centered Artificial Intelligence this week. They were in Washington, D.C., to discuss the 2026 edition of the Stanford HAI AI Index Report. Since 2017, the report has served as an important collation of measurable trends in AI, cataloging everything from technical development and adoption to transparency and governance.

This year, as HAI's top-level summary explains, the index "reveals a widening gap between what AI can do and how prepared we are to manage it. While AI continues its rapid integration into the global economy … the frameworks needed to govern, evaluate, and understand this technology are falling behind. In a field where data transparency is declining, independent, rigorous measurement has never been more critical."

What will drive the next wave of AI competition?

The Brookings event kicked off with a presentation from the AI Index lead researcher, Sha Sajadieh, who ran through a rapid but insightful series of top-level takeaways from the report. The year-over-year changes in some of the index's measures are remarkable.

For example, the technical gap between leading models is narrowing rapidly. One measure of this is performance ratings by human voting in the Arena Laboratory. In 2023, the top four models were separated by 97 Elo points when rated against one another. Today, that gap is less than 25 points, with Anthropic, xAI, Google and OpenAI all jockeying for position at the front of the pack with their leading large language models.
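To put those Elo figures in perspective, a short sketch helps. Under the standard Elo model (a simplification of how arena leaderboards actually compute ratings), a rating gap translates directly into an expected head-to-head win rate — so the shrinking gap means the top model's edge over its rivals has become close to a coin flip:

```python
# Expected head-to-head win probability implied by an Elo rating gap.
# This uses the textbook Elo formula as an illustration only; arena
# leaderboards apply refinements on top of this basic model.

def elo_expected_score(rating_gap: float) -> float:
    """Expected score (win probability) for the higher-rated model."""
    return 1.0 / (1.0 + 10 ** (-rating_gap / 400))

# A 97-point gap (2023) vs. a sub-25-point gap (today):
print(round(elo_expected_score(97), 3))   # ~0.636: a clear favorite
print(round(elo_expected_score(25), 3))   # ~0.536: nearly a coin flip
```

In other words, a 97-point leader was expected to win roughly 64% of head-to-head matchups; at 25 points, that advantage shrinks to about 54%.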

With such results driving competitive pressure, it is not surprising that almost every leading frontier AI model developer provides detailed transparency reports on capability benchmarks. But, in contrast, as the index report points out, "reporting on responsible AI benchmarks remains spotty."

This is not due to a lack of understanding of responsible AI governance practices. In chapter 3, for example, the index provides a helpful set of dimensions against which models can be tested, divided into three layers that are further mapped against leading governance frameworks. As a benchmarking exercise, and to best understand the state of play in governance, the layers and specific controls in HAI's responsible AI dimensions are worthy of a closer review:

  • Layer 1: core functions and behaviors or "what AI systems should achieve."
    • Validity and reliability.
    • Privacy.
    • Data stewardship.
    • Fairness and bias.
    • Transparency and auditability.
    • Explainability.
    • Autonomy and human agency.
    • Environmental sustainability.
    • Factuality and truthfulness.
  • Layer 2: system integrity and risk controls or "how risks are technically and operationally managed."
    • Security.
    • Safety.
    • Robustness.
  • Layer 3: governance, accountability and enforcement or "how responsibility, oversight, and redress are ensured."
    • Accountability and liability.
    • Human oversight and contestability.

The index provides helpful definitions and examples of each of these dimensions. There is a wide range in the maturity of our understanding of how to measure and manage these dimensions, but the challenge faced by the researchers at HAI has more to do with a lack of data. With some exceptions, companies are not publishing benchmarks along most of these dimensions in the same way that they do for performance.

Instead, the index relies on incident reporting, analyzing 362 reports from 2025, up from 233 in 2024, across two major incident reporting databases. It looks at some of the top reported harms, from "harmful speech" to deepfake impersonation to consumer fraud.

As the technical frontier becomes indistinguishable among leading models, perhaps the strategic differentiator could shift from raw compute to responsibility.

Rapid professionalization

Even though transparency may be lagging across responsible AI dimensions, the index shows clear evidence that the AI governance profession is rapidly maturing. Investing in AI governance is increasingly becoming a pragmatic requirement for market entry.

As one measure of this trend, AI-specific governance roles expanded by 17% over the last year, according to a McKinsey survey detailed in the index. These roles are also becoming more senior and specialized, as companies "shifted AI governance ownership away from data and analytics functions toward dedicated AI governance roles."

Meanwhile, the share of businesses with no responsible AI policies in place fell sharply in 2025 from 24% to 11%. "And with the uptick in adoption, survey respondents perceived an overall positive impact from RAI policies. Compared to 2024, more organizations reported that RAI policies improved business outcomes (up 7 percentage points), business operations (up 4 percentage points), and customer trust (up 4 percentage points). Furthermore, more organizations reported a drop in the number of AI incidents (plus 8 pp)."

Major policy challenges remain

The final panel of the Brookings event featured a handful of experts reflecting on policy takeaways from the AI index, including Elham Tabassi, a prominent engineer and leader in AI governance who currently serves as the Director of the Artificial Intelligence and Emerging Technology Initiative and a Senior Fellow at the Brookings Institution. As it happens, she also serves as a member of HAI's steering committee for the AI index.

Tabassi identified some of her newest concerns as she watches the evolution of AI adoption. One is the shift from static LLMs to agentic, autonomous systems, which introduces structural volatility that our current oversight mechanisms are ill-equipped to handle. Distributed systems, where one agent's output is fed into another agent's input, suffer from compounding error rates. 
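The compounding-error point can be made concrete with a back-of-the-envelope sketch. Assuming, hypothetically, that each agent in a chain succeeds independently with the same per-step reliability (real pipelines are messier), end-to-end reliability decays geometrically with chain length:

```python
# Illustration of compounding error in a chained multi-agent pipeline.
# Hypothetical simplification: each step succeeds independently with
# the same probability, and every step must succeed for the chain to.

def chain_reliability(per_step: float, n_steps: int) -> float:
    """End-to-end success rate when each step must succeed in sequence."""
    return per_step ** n_steps

# Even a 95%-reliable agent degrades quickly when chained:
for n in (1, 3, 5, 10):
    print(n, round(chain_reliability(0.95, n), 3))
# 1 step  -> 0.95
# 10 steps -> ~0.599, i.e., the pipeline fails about 40% of the time
```

The sketch also hints at why the weakest link dominates: multiplying in a single low-reliability step drags the whole product down, regardless of how strong the other agents are.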

It is technically infeasible to extract insights about the interplay between agents in a multiagent pipeline. Tabassi worries that this leads to a world where the security and safety of the entire chain are dictated by its most fragile node. Without new innovations in measurability, transparency and governance, these issues will remain.

As an expert in metrology, the science of measurement, Tabassi also highlighted her perception of the ongoing deficit in measurement paradigms for AI governance. The types of evaluation frameworks we generally deploy were first developed for the physical world, where variables are static, deterministic and traceable. 

AI systems challenge these assumptions. Some even exhibit situational awareness, behaving differently under the scrutiny of an evaluation than they do in a production environment. To bridge this gap, Tabassi reminds AI governance professionals that we will need to move beyond a focus on one-shot task evaluation and develop a methodology that accounts for environmental interactions and the interplay of diverse, often contradictory, trustworthiness characteristics.

Looping back to the governance trends in the index, Tabassi closed with her hopes for the future of AI governance, celebrating the fact that professionals in these roles increasingly have a "seat at the table" across the life cycle. But there is plenty of work ahead.

Please send feedback, updates and compounding errors to cobun@iapp.org

This article originally appeared in The Daily Dashboard and U.S. Privacy Digest, free weekly IAPP newsletters. Subscriptions to this and other IAPP newsletters can be found here.

This content is eligible for Continuing Professional Education credits. Please self-submit according to CPE policy guidelines.



Tags:

AI and machine learning, Benchmarking, AI governance
