Assess Agentic Workflows
Evaluate agent performance on custom metrics
Comprehensive evaluation of agent workflows across multiple dimensions.
Evaluation Metrics
| Metric | Description | Scale |
|---|---|---|
| Goal Completion | Whether the agent achieves its intended purpose | 0-1 |
| Step Efficiency | How close the agent's path is to the optimal path to a solution | 1-5 |
| Context Retention | How well the agent retains context across the conversation | 1-5 |
| Error Rate | Share of steps that were unsuccessful | % |
| User Satisfaction | Predicted user experience | 1-5 |
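The metrics in the table above use three different scales, which makes side-by-side comparison awkward. The sketch below shows one possible way to map each raw value onto a common 0-1 range. It is an illustration only: the dictionary keys, the `MetricScale` class, and the choice to invert Error Rate so that higher is always better are assumptions, not something this page specifies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MetricScale:
    low: float
    high: float
    higher_is_better: bool = True


# Ranges come from the metrics table; the key names are hypothetical identifiers.
SCALES = {
    "goal_completion": MetricScale(0.0, 1.0),
    "step_efficiency": MetricScale(1.0, 5.0),
    "context_retention": MetricScale(1.0, 5.0),
    # Assumption: a lower error rate is better, so it is inverted below.
    "error_rate": MetricScale(0.0, 100.0, higher_is_better=False),
    "user_satisfaction": MetricScale(1.0, 5.0),
}


def normalize(metric: str, raw: float) -> float:
    """Map a raw metric value onto 0-1, where 1 is always the best outcome."""
    scale = SCALES[metric]
    if not scale.low <= raw <= scale.high:
        raise ValueError(f"{metric}={raw} is outside {scale.low}-{scale.high}")
    fraction = (raw - scale.low) / (scale.high - scale.low)
    return fraction if scale.higher_is_better else 1.0 - fraction
```

For example, under these assumptions `normalize("step_efficiency", 3)` lands at the midpoint of the 1-5 scale, and an Error Rate of 25% becomes a score of 0.75.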
Preset Metrics:
- Action Completion: Did the agent achieve all user goals?
- Action Advancement: Did the agent make progress toward at least one user goal?
- Tool Selection Quality: Did the agent select the correct tool and parameters?
- Tool Errors: Did the tool execution steps succeed?
- Instruction Adherence: Did the LLM follow the given instructions?
- Context Adherence: Is the response grounded in retrieved/expected context?
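Most of the preset metrics above require judging the content of a response, but Tool Errors can be approximated directly from an execution log. The following is a minimal sketch of that idea, assuming a hypothetical `Step` record with a `kind` label and a `succeeded` flag; it does not reflect how the platform computes the preset internally.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Step:
    kind: str        # hypothetical labels: "tool" or "llm"
    succeeded: bool  # whether the step finished without an error


def tool_error_rate(steps: List[Step]) -> float:
    """Return the percentage of tool steps that failed (0.0 if there were none)."""
    tool_steps = [s for s in steps if s.kind == "tool"]
    if not tool_steps:
        return 0.0
    failed = sum(1 for s in tool_steps if not s.succeeded)
    return 100.0 * failed / len(tool_steps)
```

Non-tool steps are ignored here, so a trace made up only of LLM calls reports a rate of 0.0 rather than raising an error.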
Assessment Process
1. Select a generated dataset
2. Choose evaluation metrics
3. Configure assessment parameters
4. Run the evaluation
5. Review the heatmap visualization
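The steps above can be pictured as a small pipeline: a dataset of agent runs goes in, each chosen metric is applied to each run, and the result is a grid of scores (runs by metrics) that a heatmap can render. The sketch below is a hypothetical stand-in for that flow, not the product's API. The record fields (`id`, `goal_met`, `steps`, `failed_steps`), the two example metrics, and the `min_steps` parameter are all assumptions made for illustration.

```python
from typing import Any, Callable, Dict, List

Record = Dict[str, Any]
Metric = Callable[[Record], float]


def goal_completion(record: Record) -> float:
    # Hypothetical metric: 1.0 if the run met its goal, else 0.0.
    return 1.0 if record["goal_met"] else 0.0


def step_success(record: Record) -> float:
    # Hypothetical metric: fraction of steps that did not fail.
    return 1.0 - record["failed_steps"] / record["steps"]


# Step 2: the chosen metrics, keyed by the column name used in the grid.
METRICS: Dict[str, Metric] = {
    "goal_completion": goal_completion,
    "step_success": step_success,
}


def run_assessment(
    dataset: List[Record],
    metrics: Dict[str, Metric],
    min_steps: int = 1,  # Step 3: an example assessment parameter
) -> Dict[str, Dict[str, float]]:
    """Step 4: score every run on every metric and return a run-by-metric grid."""
    grid: Dict[str, Dict[str, float]] = {}
    for record in dataset:
        if record["steps"] < min_steps:
            continue  # skip runs too short to evaluate
        grid[record["id"]] = {name: fn(record) for name, fn in metrics.items()}
    return grid
```

Each row of the returned grid corresponds to one run and each column to one metric, which is the shape a heatmap needs for step 5.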
Interpreting Results
- Below 0.5: Critical issue needing immediate attention
- 0.5-0.7: Significant room for improvement
- 0.7-0.9: Good but can be optimized
- Above 0.9: Excellent performance
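The bands above can be applied programmatically when triaging many scores at once. The sketch below assumes scores on a 0-1 scale and uses short hypothetical labels for each band. The listed ranges share their endpoints, so the handling of exact boundary values is an assumption: 0.5 and 0.7 are placed in the higher band, while 0.9 stays in the "good" band because only scores strictly above 0.9 are listed as excellent.

```python
def interpret(score: float) -> str:
    """Map a 0-1 score to one of the four interpretation bands."""
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"score {score} is outside the 0-1 range")
    if score < 0.5:
        return "critical"
    if score < 0.7:  # assumption: exactly 0.7 counts as "good"
        return "needs improvement"
    if score <= 0.9:  # exactly 0.9 is not "above 0.9"
        return "good"
    return "excellent"
```

Scores from metrics reported on a 1-5 or percentage scale would need to be converted to 0-1 before being passed to a function like this.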