Simulation Lab is agent-agnostic. The default agent runs a tool-calling loop against any LLM API (via LiteLLM), but you can bring your own agent by implementing the BaseAgent contract.

The Agent Contract

A custom agent extends BaseAgent and implements four methods (name, version, setup, and run):
class MyAgent(BaseAgent):
    @staticmethod
    def name() -> str:
        return "my-agent"

    def version(self) -> str | None:
        return "1.0.0"

    def setup(self, environment: BaseEnvironment) -> None:
        """Called once before the agent starts.

        Use this to discover available tools, configure state,
        or perform any one-time initialization.
        """
        self.tools = environment.list_tools()

    def run(
        self,
        instruction: str,
        environment: BaseEnvironment,
        context: RunArtifacts,
    ) -> None:
        """Execute the agent's task.

        Populate the context (RunArtifacts) as execution progresses
        so that partial results are captured even on timeout or error.
        """
        context.record_message("user", instruction)

        # Your agent logic here:
        # 1. Read the instruction
        # 2. Decide which tools to call
        # 3. Call tools via environment.call_tool(server, name, params)
        # 4. Record each tool call and result in context
        # 5. Iterate until done

        call = ToolCall(tool_server="email-env", tool_name="send_email", parameters={...})
        result = environment.call_tool(call.tool_server, call.tool_name, call.parameters)
        context.record_tool_call(call, result)
Register your agent at runtime via the CLI:
simlab tasks run --env my-env --task my-task --agent-import-path path.to.agent:MyAgent
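
The --agent-import-path value follows the common module:Class convention. As a rough illustration of how such a string is typically resolved in Python, here is a minimal sketch. It is not Simulation Lab's actual loader, and the helper name load_agent_class is hypothetical.

```python
import importlib


def load_agent_class(import_path: str) -> type:
    """Resolve a 'package.module:ClassName' string to the class it names.

    Hypothetical helper for illustration; the real CLI may resolve
    the path differently.
    """
    module_name, sep, attr_name = import_path.partition(":")
    if not sep or not module_name or not attr_name:
        raise ValueError(f"expected 'module:Class', got {import_path!r}")
    module = importlib.import_module(module_name)
    obj = getattr(module, attr_name)
    if not isinstance(obj, type):
        raise TypeError(f"{import_path!r} does not name a class")
    return obj


# Demonstrated with a standard-library class, since MyAgent is not importable here.
cls = load_agent_class("collections:OrderedDict")
print(cls.__name__)  # prints "OrderedDict"
```

In practice this means the module part (path.to.agent in the command above) must be importable from the directory where the CLI runs, for example by being on PYTHONPATH.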

The Environment Interface

The environment object passed to your agent provides:
  • environment.list_tools(tool_server=None) — returns tool schemas (names, descriptions, input schemas) for one server or all servers in the workspace.
  • environment.call_tool(tool_server, tool_name, parameters) — executes a tool call on a specific server and returns a ToolCallResult.
  • environment.tool_servers — property returning a dict[str, str] mapping server names to base URLs.
Under the hood, these map to the tool server protocol: list_tools() calls GET /tools, and call_tool() calls POST /step.
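
To show how the three members fit together, the sketch below uses an in-memory stand-in for the environment and looks up which server exposes a given tool before calling it. FakeEnvironment, the ToolCallResult fields, the find_server helper, and the shape of each tool schema (a dict with a "name" key) are all assumptions made for illustration; only list_tools, call_tool, and tool_servers come from the interface described above.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass
class ToolCallResult:
    # Stand-in: the real class's fields are not described in this document.
    output: Any
    is_error: bool = False


class FakeEnvironment:
    """In-memory stand-in exposing the three documented members."""

    def __init__(self) -> None:
        # Assumed schema shape: one dict per tool with name, description, input_schema.
        self._tools: Dict[str, List[dict]] = {
            "email-env": [
                {"name": "send_email", "description": "Send an email", "input_schema": {}}
            ],
            "calendar-env": [
                {"name": "create_event", "description": "Create an event", "input_schema": {}}
            ],
        }

    @property
    def tool_servers(self) -> Dict[str, str]:
        # Maps server names to (made-up) base URLs.
        return {name: f"http://localhost/{name}" for name in self._tools}

    def list_tools(self, tool_server: Optional[str] = None) -> List[dict]:
        # One server if named, otherwise every tool in the workspace.
        if tool_server is not None:
            return list(self._tools[tool_server])
        return [tool for tools in self._tools.values() for tool in tools]

    def call_tool(self, tool_server: str, tool_name: str, parameters: dict) -> ToolCallResult:
        names = {tool["name"] for tool in self._tools.get(tool_server, [])}
        if tool_name not in names:
            return ToolCallResult(output=f"unknown tool {tool_name!r}", is_error=True)
        return ToolCallResult(output={"tool": tool_name, "echo": parameters})


def find_server(environment: Any, tool_name: str) -> Optional[str]:
    """Return the first server whose tool list contains tool_name, else None."""
    for server in environment.tool_servers:
        if any(tool["name"] == tool_name for tool in environment.list_tools(server)):
            return server
    return None


env = FakeEnvironment()
server = find_server(env, "send_email")
result = env.call_tool(server, "send_email", {"to": "a@example.com"})
print(server, result.is_error)  # prints "email-env False"
```

Building this kind of lookup once in setup() avoids hard-coding server names inside run().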

Run Artifacts

As the agent executes, it populates a RunArtifacts object: the structured record of the run (conversation history, tool calls, results, errors). The helper methods record_message, record_tool_call, and set_error record data incrementally, so partial results are captured even on timeout or error. Verifiers consume RunArtifacts to determine whether the agent succeeded, which makes it the contract between agent execution and verification.
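
The value of incremental recording is easiest to see when a run fails partway. The sketch below uses a stand-in class with the three helper names mentioned above. The field names (messages, tool_calls, error) and the run_steps helper are assumptions for illustration, not the real RunArtifacts definition.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, List, Optional, Tuple


@dataclass
class FakeRunArtifacts:
    """Stand-in with the documented helper names; the fields are assumed."""
    messages: List[Tuple[str, str]] = field(default_factory=list)
    tool_calls: List[Tuple[Any, Any]] = field(default_factory=list)
    error: Optional[str] = None

    def record_message(self, role: str, content: str) -> None:
        self.messages.append((role, content))

    def record_tool_call(self, call: Any, result: Any) -> None:
        self.tool_calls.append((call, result))

    def set_error(self, message: str) -> None:
        self.error = message


def run_steps(
    instruction: str,
    steps: List[Tuple[str, Callable[[], Any]]],
    context: FakeRunArtifacts,
) -> None:
    """Record each step as soon as it finishes; on failure, keep what was recorded."""
    context.record_message("user", instruction)
    try:
        for call, action in steps:
            result = action()
            context.record_tool_call(call, result)
    except Exception as exc:
        # Everything recorded before the failure stays in the artifacts.
        context.set_error(f"{type(exc).__name__}: {exc}")


def failing_step() -> Any:
    raise TimeoutError("tool server did not respond")


context = FakeRunArtifacts()
run_steps(
    "Send the report",
    [("send_email", lambda: "sent"), ("archive", failing_step)],
    context,
)
print(len(context.tool_calls), context.error)
# prints "1 TimeoutError: tool server did not respond"
```

A verifier reading these artifacts can still see that the first tool call completed, even though the run as a whole ended in an error.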