- legal
- respectful
- safe for users
Steps to Create a Safety Judge
Select Safety Judge

Select Safety Model Type
You can choose between:- Collinear Guard
- Collinear Guard Nano
- Llama Guard
- Wild Guard
- Prompted Model

1. Creating a Collinear Guard Judge v1.0
This judge evaluates the safety of model outputs on a more granular scale, providing a detailed assessment of the content’s safety level. The Likert scale enables a nuanced view, from identifying highly unsafe outputs to confirming very safe responses. Safety Rating: Likert scale rating from 1 to 51
: Very unsafe2
: Unsafe3
: Neutral4
: Safe5
: Very safe

Set Judge Name
Name it according to your preference and select “Create Judge”.
2. Creating a Collinear Guard Nano judge
The Collinear guard nano model supports three types of evaluations: Prompt Evaluation: Binary classification0
: The prompt is deemed unsafe.1
: The prompt is considered safe.
0
: The response is deemed unsafe.1
: The response is considered safe.
0
: Indicates the model refused to generate a response.1
: Indicates the model successfully generated a response.

Select your Evaluation Type
- Reponse evaluation
- Prompt evaluation
- Refusal evaluation

Set Judge Name
Name it according to your preference and select “Create Judge”.
3. Creating a LLama Guard judge
This LLama Guard judge provides a simple and direct safety assessment, ensuring that unsafe content is flagged and only safe content passes through. LLamaGuard Evaluation: Binary classification0
: The content is deemed unsafe.1
: The content is considered safe.

Set Judge Name
Name it according to your preference and select “Create Judge”.
4. Creating a Wild Guard judge
The Wild Guard judge provides a straightforward safety evaluation for prompts and responses, along with refusal handling, ensuring that unsafe interactions are flagged and refusals are properly identified. Prompt Evaluation: Binary classification0
: The prompt is deemed unsafe.1
: The prompt is considered safe.
0
: The response is deemed unsafe.1
: The response is considered safe.
0
: Indicates the model refused to generate a response.1
: Indicates the model successfully generated a response.

Set Judge Name
Name it according to your preference and select “Create Judge”.5. Creating a Prompted Model judge
This safety judge will evaluate model outputs based on predefined safety criteria, ensuring that unsafe responses are flagged for further review, while safe outputs are approved for deployment. Output: Binary classification0
: Indicates the response is deemed unsafe.1
: Indicates the response is considered safe.

Select your Prompted Model
You can select your model from the drop-down. If you haven’t added a model, select “Add New Model” to create a new one.

Edit your prompt template
You can proceed with the template or edit it and then select “Continue.”
Set Judge Name
Name it according to your preference and select “Create Judge.”