## Prerequisites

- Python 3.13
- A Collinear API key from platform.collinear.ai (Developers → API Keys)
- An API key for any LiteLLM-supported model provider (OpenAI, Anthropic, Google, etc.)
- One of the following for running environments:
  - A Daytona API key — for fast, ephemeral remote sandboxes (recommended)
  - Docker Desktop (or Docker Engine with Compose) — for local execution
## Installation

Install the package with pip. The package name is `simulationlab`; the installed CLI command is `simlab`.
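Assuming the package is published under the name above, installation is a standard pip install (the `--help` check assumes the conventional help flag):

```shell
# Install the SimLab CLI (package name: simulationlab, command: simlab)
pip install simulationlab

# Confirm the CLI is on your PATH
simlab --help
```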
## Authentication

Log in with your Collinear API key; the key is saved to `~/.config/simlab/config.toml`.

Then export your model provider key as `SIMLAB_AGENT_API_KEY`.
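A sketch of the two steps. The `simlab login` subcommand name is an assumption (the original snippet is not shown); the environment variable name comes from the provider table in this guide, and the key value is a placeholder:

```shell
# Authenticate with Collinear; the key is saved to ~/.config/simlab/config.toml
# ("simlab login" is an assumed subcommand name)
simlab login

# Export your model provider key (value shown is a placeholder)
export SIMLAB_AGENT_API_KEY="sk-..."
```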
## Supported providers

SimLab supports any LiteLLM-compatible provider. Here are common examples:

| Provider | Model format | SIMLAB_AGENT_API_KEY | Verifier provider value |
|---|---|---|---|
| OpenAI | gpt-4o | Your OpenAI API key | openai |
| Anthropic | anthropic/claude-sonnet-4-20250514 | Your Anthropic API key | anthropic |
| Google | gemini/gemini-2.5-pro | Your Google AI API key | gemini |
Model names follow the LiteLLM format `<provider>/<model_name>`. OpenAI models don’t require the provider prefix since it’s the default.
Full example using Anthropic:
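A hedged sketch of the Anthropic flow. The `simlab tasks run` command and `--agent-model` flag are described later in this guide; the key value is a placeholder and any task-selection flags are omitted:

```shell
# Use an Anthropic model: export the key, then pass the LiteLLM-style model name
export SIMLAB_AGENT_API_KEY="sk-ant-..."   # placeholder
simlab tasks run --agent-model anthropic/claude-sonnet-4-20250514
```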
## Starting an environment

Initialize an environment from a template and start it.

To see all available templates, run `simlab templates list`.
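A sketch of the initialization flow. Only `simlab templates list` is confirmed above; the `init` and `start` subcommand names and the template placeholder are assumptions, since the original snippet is not shown:

```shell
# List available scenario templates (confirmed command)
simlab templates list

# Initialize from a template and start it (subcommand names are assumptions)
simlab init <template-name>
simlab start
```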
## Choosing a task

Tasks are organized by the scenario template associated with your environment. For tasks you have generated (via `tasks-gen`), browse them directly.
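One possible way to browse them; `simlab tasks list` is an assumed subcommand name (only `simlab tasks run` is confirmed in this guide):

```shell
# Browse available tasks, including ones generated via tasks-gen
# (subcommand name is an assumption)
simlab tasks list
```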
## Running a rollout

The primary command is `simlab tasks run`. It automatically starts the environment, seeds data, runs the agent, verifies the result, and tears down when done. Rollouts execute either with Daytona (recommended: fast, ephemeral remote sandboxes) or locally with Docker.

Select the agent model with `--agent-model` (e.g. `gpt-4o`, `anthropic/claude-sonnet-4-20250514`, `gemini/gemini-2.5-pro`).
You can also run tasks with your own agent implementation instead of the built-in one. See Bring Your Own Agent for the full interface and setup.
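A minimal invocation sketch. `simlab tasks run` and `--agent-model` are confirmed above; any flags for selecting a specific task or execution backend are omitted here because they are not shown in this guide:

```shell
# Start the environment, seed data, run the agent, verify, and tear down
simlab tasks run --agent-model gpt-4o
```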
## Viewing results

Results are saved to `output/agent_run_<task_id>_<timestamp>/`:

- `artifacts.json` — full rollout trace (messages, tool calls, observations)
- `verifier/reward.txt` — `1` (pass) or `0` (fail)
- `verifier/reward.json` — e.g. `{"reward": 1.0}`
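For a quick pass/fail summary across runs, the reward files can be read directly. This self-contained sketch creates a mock run directory mirroring the documented layout, then prints each run's verdict:

```shell
# Mock run directory following the documented output layout
mkdir -p output/agent_run_demo_20240101/verifier
echo 1 > output/agent_run_demo_20240101/verifier/reward.txt

# Print the reward (1 = pass, 0 = fail) for each run
for d in output/agent_run_*/; do
  printf '%s %s\n' "$d" "$(cat "${d}verifier/reward.txt")"
done
```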
## Configuring verifiers

Generated tasks use rubric-based verifiers that need a model to score results. Before running generated tasks, configure the verifier in `~/.config/simlab/config.toml`.
Built-in tasks use programmatic verifiers and don’t require this setup. This is only needed for tasks you generate via `tasks-gen`.
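A hedged sketch of what the verifier configuration might look like; the section and key names are assumptions (the original fragment is not shown), and only the provider value comes from the table above:

```toml
# ~/.config/simlab/config.toml -- assumed structure, check the real schema
[verifier]
provider = "openai"   # a "Verifier provider value" from the table above
model = "gpt-4o"
```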

