Verifiers & Reward Models

Verifiers consume RunArtifacts — the structured record of everything the agent did — and produce a pass/fail result. There are two types:

Programmatic verifiers: Inspect the environment state directly. For example, “Did the agent send an email to the correct recipient?” is checked by querying the email tool server’s state and reviewing the state diff between the before and after environment snapshots.
Rubric-based reward models: Evaluate the agent’s actions against a rubric. Useful for subjective criteria such as “Did the agent communicate professionally?”
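
As a rough illustration of the programmatic case, the sketch below compares two environment snapshots and checks the email example from the list above. The snapshot layout (a "sent_emails" list of dicts with a "to" field) and the helper names state_diff and email_sent_to are assumptions made for this example, not the actual tool server schema.

```python
from typing import Any


def state_diff(before: dict[str, Any], after: dict[str, Any]) -> dict[str, tuple[Any, Any]]:
    """Return {key: (old, new)} for every top-level key whose value changed."""
    keys = before.keys() | after.keys()
    return {k: (before.get(k), after.get(k)) for k in keys if before.get(k) != after.get(k)}


def email_sent_to(before: dict[str, Any], after: dict[str, Any], recipient: str) -> bool:
    """Pass only if the run added at least one email addressed to `recipient`."""
    old = before.get("sent_emails", [])
    new = after.get("sent_emails", [])
    added = [msg for msg in new if msg not in old]  # emails created during the run
    return any(msg.get("to") == recipient for msg in added)


# Hypothetical snapshots taken before and after the agent ran.
before = {"sent_emails": [], "drafts": 1}
after = {"sent_emails": [{"to": "alice@example.com", "subject": "Q3 report"}], "drafts": 0}

changed_keys = sorted(state_diff(before, after))
passed = email_sent_to(before, after, "alice@example.com")
```

Checking only the emails added during the run, rather than everything in the after snapshot, keeps pre-existing state from producing a false pass.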

Both types consume the same RunArtifacts interface, so they work with any agent implementation that produces it.
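
One way to picture the shared interface is a single verify method that both verifier types implement. The following is a minimal sketch under stated assumptions: the RunArtifacts fields, the Verifier protocol, the class names, and the stub_judge function are all hypothetical stand-ins, and a real rubric-based verifier would call an actual reward model rather than the stub used here.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Protocol


@dataclass
class RunArtifacts:
    """Hypothetical shape; the real record may carry different fields."""
    transcript: list[str] = field(default_factory=list)
    state_before: dict[str, Any] = field(default_factory=dict)
    state_after: dict[str, Any] = field(default_factory=dict)


class Verifier(Protocol):
    """Anything with this method can be used as a verifier."""
    def verify(self, artifacts: RunArtifacts) -> bool: ...


@dataclass
class RecipientVerifier:
    """Programmatic check: was an email sent to the expected address?"""
    recipient: str

    def verify(self, artifacts: RunArtifacts) -> bool:
        sent = artifacts.state_after.get("sent_emails", [])
        return any(msg.get("to") == self.recipient for msg in sent)


@dataclass
class RubricVerifier:
    """Rubric-based check: a judge scores the transcript against a rubric."""
    rubric: str
    judge: Callable[[str, str], float]  # (rubric, transcript) -> score in [0, 1]
    threshold: float = 0.5

    def verify(self, artifacts: RunArtifacts) -> bool:
        score = self.judge(self.rubric, "\n".join(artifacts.transcript))
        return score >= self.threshold


def stub_judge(rubric: str, transcript: str) -> float:
    """Stand-in for a reward model: fails transcripts written in all caps."""
    return 0.0 if transcript.isupper() else 1.0


def run_all(verifiers: list[Verifier], artifacts: RunArtifacts) -> dict[str, bool]:
    """Apply every verifier to the same artifacts and collect pass/fail results."""
    return {type(v).__name__: v.verify(artifacts) for v in verifiers}


artifacts = RunArtifacts(
    transcript=["Hello Alice, the report is attached."],
    state_after={"sent_emails": [{"to": "alice@example.com"}]},
)
results = run_all(
    [RecipientVerifier("alice@example.com"),
     RubricVerifier("Did the agent communicate professionally?", stub_judge)],
    artifacts,
)
```

Because run_all depends only on RunArtifacts, neither verifier needs to know which agent produced the run.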