- TraitBasis-generated personas – more accurate and interpretable user simulations.
- Domain-specific evaluation – tasks drawn from retail, airline, telecom, and telehealth settings.
✨ Features
- Persona Simulation with TraitBasis: Generate diverse, coherent user personas with different traits.
- Domain Coverage: TauTrait includes evaluation tasks in four industries:
  - 🛒 Retail
  - ✈️ Airline
  - 📱 Telecom
  - 🩺 Telehealth
🚀 Getting Started
Installation
Usage
Results are written to results/, with file names in the format agent_strategy-model-temperature_range_start-end_user-user_strategy_traits-<traits>_<timestamp>.json. Each JSON file captures the reward, transcript, and debug info for every task.
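As a rough illustration of working with these files, the sketch below averages the rewards in one results file. It assumes, without confirmation from this README, that the file holds a JSON list of per-task objects, each with a numeric "reward" field. The helper name mean_reward is hypothetical and not part of TauTrait.

```python
import json
from pathlib import Path


def mean_reward(path):
    """Average the "reward" field over all task records in one results file.

    Assumption: the file contains a JSON list of objects, each with a
    numeric "reward" key. Adjust the keys if the actual schema differs.
    """
    records = json.loads(Path(path).read_text(encoding="utf-8"))
    rewards = [record["reward"] for record in records]
    # Return 0.0 for an empty file rather than dividing by zero.
    return sum(rewards) / len(rewards) if rewards else 0.0
```

For example, mean_reward could be called on each path matched by Path("results").glob("*.json") to compare runs.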
The available settings are defined below.
TauTrait Config Settings
General
- --num-trials (int, default: 1)
  Number of independent trials to run.
- --seed (int, default: 10)
  Random seed for reproducibility.
- --shuffle (int, default: 0)
  Whether to shuffle task order (0 = no, 1 = yes).
- --log-dir (str, default: results)
  Directory where logs and results are stored.
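The General flags above could be declared with argparse roughly as follows. This is a minimal sketch that mirrors only the names, types, and defaults listed here; the function name build_general_parser is hypothetical, and restricting --shuffle to 0 or 1 is an assumption based on the description.

```python
import argparse


def build_general_parser():
    """Parser covering only the General flags listed above (illustrative)."""
    parser = argparse.ArgumentParser(description="General settings sketch")
    parser.add_argument("--num-trials", type=int, default=1)
    parser.add_argument("--seed", type=int, default=10)
    # Assumption: only 0 and 1 are meaningful values for --shuffle.
    parser.add_argument("--shuffle", type=int, default=0, choices=[0, 1])
    parser.add_argument("--log-dir", type=str, default="results")
    return parser


# Parse an explicit argument list instead of sys.argv for demonstration.
args = build_general_parser().parse_args(["--num-trials", "3", "--shuffle", "1"])
print(args.num_trials, args.seed, args.shuffle, args.log_dir)
```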
Environment & Tasks
- --env (str, choices: retail, airline, default: retail)
  Domain environment in which to run simulations.
- --task-split (str, choices: train, test, dev, default: test)
  Dataset split of tasks to run (currently applies only to the retail domain).
- --start-index (int, default: 0)
  Index of the first task to run.
- --end-index (int, default: -1)
  Index of the last task to run. Use -1 to run all remaining tasks.
- --task-ids (list of int, optional)
  Explicit list of task IDs to run (overrides index ranges).
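One way the index range, the explicit task IDs, and the shuffle settings might combine is sketched below. The function select_task_indices is hypothetical. The sketch assumes that the end index is an exclusive bound (the README does not say whether it is inclusive) and that shuffling also applies to an explicit ID list; the actual runner may behave differently on both points.

```python
import random


def select_task_indices(total, start_index=0, end_index=-1, task_ids=None,
                        shuffle=0, seed=10):
    """Return the task indices to run, under the assumptions stated above."""
    if task_ids:
        # An explicit ID list overrides the index range.
        indices = list(task_ids)
    else:
        # -1 means "all remaining tasks"; otherwise clamp to the task count.
        stop = total if end_index == -1 else min(end_index, total)
        indices = list(range(start_index, stop))
    if shuffle:
        # A dedicated Random instance keeps the order reproducible per seed.
        random.Random(seed).shuffle(indices)
    return indices
```

For instance, select_task_indices(5, start_index=1, end_index=3) selects tasks 1 and 2 under the exclusive-bound assumption, while passing task_ids=[4, 2] ignores the range entirely.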
Agent Configuration
- --model (str, required)
  The model to use for the agent.
- --model-provider (str, choices from provider_list)
  Provider for the agent’s model.
- --agent-strategy (str, choices: tool-calling, act, react, few-shot, default: tool-calling)
  Strategy used by the agent to interact with the environment:
  - tool-calling: Invoke external tools.
  - act: Pure action selection.
  - react: Reason + act alternation.
  - few-shot: Use few-shot exemplars.
- --temperature (float, default: 0.0)
  Sampling temperature for the action model (higher = more randomness).
- --few-shot-displays-path (str, optional)
  Path to a JSONL file containing few-shot demonstration examples.
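Since the few-shot examples are supplied as JSONL, a reader for such a file might look like the sketch below. The function name load_few_shot_displays is hypothetical, and the README does not specify what each record contains, so the sketch simply returns whatever JSON value appears on each line.

```python
import json


def load_few_shot_displays(path):
    """Read one JSON value per non-empty line of a JSONL file."""
    displays = []
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:  # Skip blank lines, which are common at end of file.
                displays.append(json.loads(line))
    return displays
```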
User Simulator Configuration
- --user-model (str, default: gpt-4o)
  Model to use for the user simulator.
- --user-model-provider (str, optional)
  Provider for the user simulator’s model.
- --user-strategy (str, choices from UserStrategy, default: llm)
  Strategy for the simulated user (e.g., LLM-based).
Execution Controls
- --max-concurrency (int, default: 1)
  Number of tasks to run in parallel.
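A concurrency limit of this kind is commonly implemented with a bounded worker pool. The sketch below uses a thread pool as one plausible approach; it is not confirmed to be how TauTrait schedules tasks, and both run_tasks and the run_one callback are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor


def run_tasks(task_indices, run_one, max_concurrency=1):
    """Run run_one(index) for every index, at most max_concurrency at a time.

    Results come back in the same order as task_indices.
    """
    with ThreadPoolExecutor(max_workers=max_concurrency) as pool:
        return list(pool.map(run_one, task_indices))
```

With max_concurrency=1 the tasks run one after another, matching the documented default.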

