Install the CLI and run your first evaluation against a simulated environment.

Prerequisites

  • Python 3.11+
  • Docker Desktop (or Docker Engine with Compose)
  • Collinear API key from platform.collinear.ai
  • An LLM provider API key (e.g. OPENAI_API_KEY)
  • Daytona API key (optional; needed only if you want to deploy into a cloud sandbox)
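
The list above can be checked before installing anything. The following is a minimal sketch, not part of simlab: the function name check_prerequisites is hypothetical, and it assumes OpenAI is your LLM provider (swap the variable name for another provider). It only checks that a docker executable is on PATH, not that the daemon is running.

```python
import os
import shutil
import sys


def check_prerequisites(env=None, which=shutil.which, version=None):
    """Return a list of problems; an empty list means every check passed.

    Hypothetical helper for illustration only, not shipped with simlab.
    """
    env = os.environ if env is None else env
    version = sys.version_info[:2] if version is None else tuple(version)
    problems = []
    # The quickstart asks for Python 3.11 or newer.
    if version < (3, 11):
        problems.append(f"Python 3.11+ required, found {version[0]}.{version[1]}")
    # Only confirms the executable is on PATH, not that Docker is running.
    if which("docker") is None:
        problems.append("docker executable not found on PATH")
    # Assumption: OpenAI is the provider; other providers use other variables.
    if not env.get("OPENAI_API_KEY"):
        problems.append("OPENAI_API_KEY is not set")
    return problems


if __name__ == "__main__":
    for problem in check_prerequisites():
        print("missing:", problem)
```

The lookups are passed in as parameters so the check can be exercised without touching the real environment.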

Installation

pip install simulationlab

Authentication

All simlab commands require a Collinear API key. Get one from the platform dashboard (Developers → API Keys), then provide it in one of three ways:
# Option A: Global config file (recommended)
mkdir -p ~/.config/simlab
cat >> ~/.config/simlab/config.toml <<'EOF'
collinear_api_key = "<your-collinear-api-key>"
EOF

# Option B: Environment variable
export SIMLAB_COLLINEAR_API_KEY="<your-collinear-api-key>"

# Option C: Per-command flag
simlab --collinear-api-key "<your-collinear-api-key>" <command>

Starting an environment

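Initialize an environment from a template and start it. The --daytona flag deploys the environment into a Daytona cloud sandbox, which requires the Daytona API key listed in the prerequisites: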
simlab env init my-env --template hr_recruiting
simlab env up my-env --daytona
To pick tools interactively instead of using a template:
simlab env init my-env

Choosing a task

Tasks are organized by the scenario template associated with your environment.
# List tasks for your environment's template
simlab tasks list --env my-env

# View details for a specific task
simlab tasks info --env my-env --task ar-100-schedule-phone-screen
If you generated tasks locally (via tasks-gen), browse them directly:
simlab tasks list --tasks-dir ./generated-tasks

Running a rollout

The primary command is simlab tasks run. It can run either of two kinds of agent.
Built-in reference agent (default). It uses LiteLLM and supports any LLM provider:
simlab tasks run --env my-env \
  --task ar-100-schedule-phone-screen \
  --agent-model gpt-5.2 \
  --agent-api-key "$OPENAI_API_KEY"
Custom agent (recommended). Pass your agent class with --agent-import-path:
simlab tasks run --env my-env \
  --task ar-100-schedule-phone-screen \
  --agent-import-path path.to.agent:MyAgent
Your agent must implement the BaseAgent contract. See Bring Your Own Agent for the full interface.
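
When scripting rollouts, it can help to assemble the command line programmatically instead of concatenating strings. The sketch below builds an argument list from the flags shown above and nothing else; build_run_command is a hypothetical helper, and whether simlab accepts --agent-model together with --agent-import-path is not confirmed here.

```python
import shlex


def build_run_command(env_name, task_id, agent_model=None,
                      agent_api_key=None, agent_import_path=None):
    """Assemble argv for `simlab tasks run` from the flags shown above.

    Hypothetical helper for illustration; it only builds the list.
    """
    argv = ["simlab", "tasks", "run", "--env", env_name, "--task", task_id]
    if agent_import_path is not None:
        argv += ["--agent-import-path", agent_import_path]
    if agent_model is not None:
        argv += ["--agent-model", agent_model]
    if agent_api_key is not None:
        argv += ["--agent-api-key", agent_api_key]
    return argv


if __name__ == "__main__":
    argv = build_run_command(
        "my-env",
        "ar-100-schedule-phone-screen",
        agent_import_path="path.to.agent:MyAgent",
    )
    # Print only; pass argv to subprocess.run(argv, check=True) to execute.
    print(shlex.join(argv))
```

Passing a list to subprocess.run avoids shell quoting problems, and shlex.join gives a copy-pasteable form for logs. Avoid printing the command when it includes an API key.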

Configuring verifiers

When a task includes LLM-as-judge verifiers, configure the judge credentials separately from the agent:
export SIMLAB_VERIFIER_MODEL="gpt-5.2"
export SIMLAB_VERIFIER_PROVIDER="openai"
export SIMLAB_VERIFIER_API_KEY="$OPENAI_API_KEY"
Or in config.toml:
[verifier]
model = "gpt-5.2"
provider = "openai"
api_key = "sk-..."
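
If you use the environment-variable route, a quick check before a rollout can catch a missing judge setting early. This is a minimal sketch with a hypothetical helper name; it looks only at the three variables shown above and does not know about values set in config.toml.

```python
import os

# Variable names are taken from the export lines above.
VERIFIER_ENV_VARS = (
    "SIMLAB_VERIFIER_MODEL",
    "SIMLAB_VERIFIER_PROVIDER",
    "SIMLAB_VERIFIER_API_KEY",
)


def missing_verifier_settings(env=None):
    """Return the verifier variables that are unset or empty.

    Hypothetical helper for illustration; it ignores config.toml.
    """
    env = os.environ if env is None else env
    return [name for name in VERIFIER_ENV_VARS if not env.get(name)]


if __name__ == "__main__":
    for name in missing_verifier_settings():
        print("not set:", name)
```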

Tear down

simlab env down my-env