What is a Safety Assessment?

A Safety Assessment measures how well your model adheres to safety guidelines when generating responses. Collinear AI uses proprietary safety judges to assess risks like harmful, biased, or inappropriate content. This helps you:
  • Detect and categorize unsafe outputs
  • Benchmark model behavior against safety standards
  • Ensure alignment with responsible AI practices

Interactive Walkthrough

Want to see it in action? Follow this guided demo to create your safety run:

Judge Types

Choose the appropriate safety judge based on your assessment needs:

1. Collinear Guard

Collinear AI’s proprietary Likert-based model using a 1–5 rating scale.
  • This judge allows the user to customize the rating scale to match your evaluation criteria using a 1 (lowest) to 5 (highest) likert scale.
  • It assigns each conversation a primary risk category to accelerate vulnerability identification.

2. Collinear Guard Nano

Collinear AI’s proprietary binary classification model that evaluates specific safety dimensions.
  • This judge provides a pass or fail rating to each conversation within a dataset based on Collinear’s safety criteria.
  • Similar to Collinear Guard, it also assigns each conversation a primary risk category to accelerate vulnerability identification.

3. Llama Guard 3

Meta’s off-the-shelf safety evaluation model.
  • Plug-and-play judge with no customization needed.
  • Great for quick or comparative benchmarks.

4. LLM-as-a-Judge

Use any model with a custom prompt template.
  • Integrate your own model and customize the rating criteria or the entire prompt.

Next Steps

Once you’ve selected a judge, you’ll be guided to Run the evaluation and view results Need help picking the right judge? Reach out to support