Skip to main content
RunArtifacts is the structured record of an agent run. Your agent populates it during execution, and verifiers consume it to evaluate task completion.

Fields

FieldTypeRequiredDescription
versionstrYesSchema version (currently "0.1")
task_idstrYesTask identifier
taskstrYesTask instruction text
modelstrNoModel used (e.g., "gpt-4o")
providerstrNoModel provider (e.g., "openai")
messageslist[dict]NoFull message history (role + content)
tool_callslist[ToolCall]NoAll tool invocations made
tool_resultslist[ToolCallResult]NoResults from each tool call (parallel to tool_calls)
metadatadictNoArbitrary metadata
final_observationstrNoAgent’s final output/summary
errorstrNoError message if the run failed
steps_takenintYesNumber of tool calls executed
max_stepsintNoStep limit (if configured)
created_atstrYesISO 8601 timestamp

Helper Methods

MethodDescription
record_message(role, content)Append a message to the history
record_tool_call(call, result)Record a tool call + result, increments steps_taken
set_final_observation(text)Set the agent’s final output
set_error(text)Record an error
to_dict()Serialize to a JSON-compatible dict
dump(path)Write artifacts to a JSON file

ToolCall

@dataclass
class ToolCall:
    tool_server: str          # e.g. "email"
    tool_name: str            # e.g. "send_email"
    parameters: dict[str, Any]  # tool-specific parameters

ToolCallResult

@dataclass
class ToolCallResult:
    observation: Any          # response from the tool server
    is_error: bool = False    # True if the tool call failed