What is the difference between AI governance and AI verification?

Written by h2o | May 8, 2026 7:15:05 AM

Broadly speaking, governance is about policies and oversight, while verification is about testing: finding out whether AI systems actually do what they are supposed to do. In practice, governance and verification overlap, and they are sometimes confused with each other.

Governance

AI governance is the collection of rules, institutions, and processes that determine how AI systems should be built and deployed. At the company level, it might mean an internal review board that signs off on new model releases, or a policy banning training on certain kinds of data. At the national level, it can mean legislation that classifies AI systems by the level of risk they create and imposes requirements on their developers accordingly. At the international level, it means efforts to coordinate policies and standards between governments. For now, such coordination is limited, happening mostly within well-established supranational blocs like the EU.

Governance asks questions like: “Who is allowed to build these systems?” “What uses are prohibited?” “Who is liable when something goes wrong?” “What records must be kept?” These are questions about authority, responsibility, and permission. The answers are provided in the form of legal texts, corporate policies, and international agreements.

Governance documents rarely specify technical behaviour. A law might say "AI systems used in hiring must not discriminate on the basis of race," but it won't usually specify what statistical test should be applied, at what threshold, and using what data. This is where verification comes in.
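
To make the gap concrete, here is one statistical check a verifier might use to operationalise such a rule: the disparate impact ratio, compared against the conventional “four-fifths” threshold used in employment discrimination analysis. The audit dataset, column names, and threshold below are illustrative assumptions, not anything a governance text would typically specify.

```python
# Minimal sketch of one possible fairness check: the disparate impact ratio
# compared against the conventional "four-fifths" threshold. The data, column
# names, and threshold are illustrative assumptions, not legal requirements.
import pandas as pd

def disparate_impact_ratio(df: pd.DataFrame, group_col: str, outcome_col: str) -> float:
    """Ratio of the lowest group selection rate to the highest."""
    rates = df.groupby(group_col)[outcome_col].mean()
    return rates.min() / rates.max()

# Hypothetical audit sample: 100 applicants from each group, with hiring outcomes.
audit = pd.DataFrame({
    "group": ["A"] * 100 + ["B"] * 100,
    "hired": [1] * 40 + [0] * 60 + [1] * 25 + [0] * 75,
})

ratio = disparate_impact_ratio(audit, "group", "hired")
print(f"Disparate impact ratio: {ratio:.2f}")   # 0.62 in this toy sample
if ratio < 0.8:  # the conventional four-fifths threshold
    print("Potential adverse impact - flag for human review")
```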

Verification

AI verification is the process of checking whether an AI system behaves as intended and required. It can include testing a model's outputs against benchmarks, auditing its decisions for bias, running adversarial attacks to find failure modes, and, for sufficiently simple systems, formally proving properties of their behaviour.
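
To take the first of these as a toy example, benchmark testing can be as simple as scoring a model's answers against a set of expected ones and comparing the result to a pass threshold. The benchmark items, the model_answer stand-in, and the 90% threshold below are all invented for illustration.

```python
# Toy sketch of benchmark testing: score model outputs against expected answers.
# The benchmark items, the model_answer stand-in, and the threshold are invented.
BENCHMARK = [
    {"prompt": "2 + 2 = ?", "expected": "4"},
    {"prompt": "Capital of France?", "expected": "Paris"},
]

def model_answer(prompt: str) -> str:
    """Stand-in for a call to the model under test."""
    canned = {"2 + 2 = ?": "4", "Capital of France?": "Paris"}
    return canned.get(prompt, "")

correct = sum(model_answer(item["prompt"]) == item["expected"] for item in BENCHMARK)
accuracy = correct / len(BENCHMARK)
print(f"Benchmark accuracy: {accuracy:.0%}")
assert accuracy >= 0.9, "Model fails the pre-release benchmark"
```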

Verification can happen before deployment (pre-release testing, red-teaming), during deployment (monitoring, anomaly detection), or after something has gone wrong (incident investigation, forensic analysis). Post-deployment monitoring is arguably both harder and more important, because AI systems encounter unexpected situations in production and they can behave differently than anticipated.
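
As a sketch of what the simplest kind of post-deployment monitoring can look like, the check below tracks a rolling error rate in production and raises an alert when it drifts well above the rate measured before release. The baseline rate, window size, and tolerance are made-up values, not figures from any standard.

```python
# Illustrative sketch of post-deployment monitoring: alert when a rolling error
# rate drifts well above the pre-release baseline. All numbers are invented.
from collections import deque

class DriftMonitor:
    def __init__(self, baseline_error_rate: float, window: int = 500, tolerance: float = 2.0):
        self.baseline = baseline_error_rate
        self.window = deque(maxlen=window)
        self.tolerance = tolerance  # alert if error rate exceeds tolerance x baseline

    def record(self, is_error: bool) -> bool:
        """Record one production outcome; return True if an alert should fire."""
        self.window.append(is_error)
        if len(self.window) < self.window.maxlen:
            return False  # not enough observations yet
        observed = sum(self.window) / len(self.window)
        return observed > self.tolerance * self.baseline

monitor = DriftMonitor(baseline_error_rate=0.02)
# In production, something like this would run on every model decision:
# if monitor.record(outcome_was_wrong):
#     escalate_to_incident_review()
```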

Verification methods vary enormously depending on the system being tested, and on the level of risk it creates. Verifying that a self-driving car meets safety requirements involves formal methods, simulations, and physical testing over millions of miles. Verifying that a large language model won't help users to synthesise dangerous chemicals could involve red-teaming by domain experts, and the ongoing monitoring of real interactions. Verifying that a hiring algorithm treats all demographic groups fairly may rely on statistical audits of a sample of its decisions. These are different processes using different tools, but they share a common basic approach: checking a system’s actual behaviour against its intended behaviour.

Governance and verification depend on each other

Governance without verification is toothless. You can pass a law requiring that AI systems meet safety standards, but if nobody has the tools or access to check compliance, the law is ineffective. This is a real problem today, because many proposals for AI governance assume the existence of verification capabilities that don't yet exist at the required scale or reliability.

Verification without governance is directionless. You can test an AI system exhaustively, but testing requires criteria. What standard are you verifying against, and why? Who decides what counts as passing? If there's no governance framework specifying acceptable failure rates, fairness metrics, or safety thresholds, verification teams are left to invent their own, which leads to inconsistency and gaps.
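
One way the two sides meet in practice is for governance to publish explicit, machine-checkable acceptance criteria that verification then tests against. A hypothetical sketch, with invented metric names and thresholds:

```python
# Hypothetical acceptance criteria a governance process might publish, and a
# trivial check a verification harness could run against measured results.
# All metric names and thresholds here are invented for illustration.
ACCEPTANCE_CRITERIA = {
    "max_error_rate": 0.05,             # at most 5% wrong decisions on the audit set
    "min_disparate_impact": 0.80,       # four-fifths rule across protected groups
    "max_unsafe_response_rate": 0.001,  # from red-team testing
}

def check_compliance(measured: dict) -> list[str]:
    """Return the criteria that the measured results fail to meet."""
    failures = []
    if measured["error_rate"] > ACCEPTANCE_CRITERIA["max_error_rate"]:
        failures.append("error rate too high")
    if measured["disparate_impact"] < ACCEPTANCE_CRITERIA["min_disparate_impact"]:
        failures.append("disparate impact below threshold")
    if measured["unsafe_response_rate"] > ACCEPTANCE_CRITERIA["max_unsafe_response_rate"]:
        failures.append("unsafe response rate too high")
    return failures

results = {"error_rate": 0.03, "disparate_impact": 0.72, "unsafe_response_rate": 0.0004}
print(check_compliance(results))  # ['disparate impact below threshold']
```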

The framers of the EU AI Act have recognised this, and tried to specify both governance and verification. The Act requires "conformity assessments" for high-risk AI systems, which is a governance mandate. The assessments themselves are verification processes, involving testing, documentation, and audit. The governance framework creates the legal obligation; verification provides the evidence that the obligation has been met.

Common mistakes in governance and verification

One common mistake is treating governance as sufficient on its own. People sometimes think that if they draft the rules well, the problem is solved. But rules that can't be checked can't be enforced. Some of today’s discussion about AI governance focuses too much on what the rules should say, and too little on the infrastructure needed to verify compliance with those rules. More attention should be paid to questions like “Who will do the auditing?” “What tools will they have?” “What information will they have access to?” and “How will all this be guaranteed when time is short and holding up deployment costs money?”

The reverse mistake is also made. Technical researchers sometimes treat verification as the whole problem, believing that if they build good enough evaluations and good enough monitoring systems, then the deployed systems will be safe. But verification tools only produce information. Someone has to read and act on that information. It is governance structures that determine who acts, according to what rules, and with what authority.

Governance can sound like “just” paperwork and verification can sound like “just” engineering. In reality, both disciplines involve hard judgment calls, require good institutional design, and demand continuous discussion about what “good” looks like.

The relationship between governance and verification

On paper, there is a clear division of labour between governance and verification: governance people write the rules, and verification people run the tests. But in practice, the two communities need to work together closely. Governance frameworks designed without input from verification experts tend to impose requirements that are vague, untestable, and poorly matched to the actual risks. Verification efforts designed in the absence of a coherent governance framework tend to focus on what is measurable rather than what matters.

The relationship between governance and verification resembles the relationship between law and forensic science in criminal justice. Lawyers define what counts as a crime and what evidence is admissible. Forensic scientists develop the methods for gathering and analysing that evidence. Neither works well without the other, and both evolve in response to what the other demands.

Governance and verification are separate fields, inhabited by different people, with separate career tracks and separate conferences. But they need to work closely together, understand each other, and ensure that neither community assumes the other has things covered when it hasn't.