Introduction
Reliability Judges help keep language models accurate by filtering out hallucinated content. They ensure that responses are:
- accurate
- factually correct
- free from invented or fabricated information
Steps to Create a Reliability Judge
Select Reliability Judge
To create a Reliability Judge, select ‘Reliability’ as the task type by clicking the Reliability radio button.

Select Reliability Model
After setting the task type, the next step is to choose the specific reliability model. We offer three primary models:
- Lynx: An open-source hallucination-detection model developed by Patronus AI.
- Prompted Model: A closed-source model that uses sophisticated proprietary algorithms. Examples of models in this category include the OpenAI and Claude models.
- Veritas: Our state-of-the-art model specifically optimized for hallucination detection, offering high accuracy and low latency.
Creating a Veritas judge
The Veritas judge is designed to evaluate the factual correctness of model outputs, ensuring that the responses are accurate and free from hallucinated content. The responses are binary:
- 1: Factually correct
- 0: Hallucinated content
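As a minimal sketch, the binary score above could be interpreted like this; the score field itself is an assumption for illustration, not the platform's exact response schema:

```python
# Hypothetical helper: map Veritas's binary score to a human-readable
# label. 1 = factually correct, 0 = hallucinated content (per the docs).
def interpret_veritas_score(score: int) -> str:
    if score not in (0, 1):
        raise ValueError(f"unexpected Veritas score: {score!r}")
    return "factually correct" if score == 1 else "hallucinated content"
```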
Configure RAG Details

- RAG Host: The host address of the RAG endpoint. This is where your requests will be sent.
- Context Engine API Key: Your unique API key for authenticating requests to the RAG endpoint.
- Index: The specific index within the RAG host that you wish to query.
- Namespace: The namespace associated with your data in the RAG endpoint.
- Top K: The number of top responses to retrieve and consider from the RAG endpoint. This helps determine how many relevant results to evaluate.
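The RAG fields above can be thought of as a single configuration, sketched below as a plain mapping. The field names mirror the form labels, and the host, index, and namespace values are placeholder assumptions; the platform's actual API shape may differ:

```python
# Hypothetical RAG configuration mirroring the form fields above.
rag_config = {
    "rag_host": "https://rag.example.com",     # host address of the RAG endpoint
    "context_engine_api_key": "YOUR_API_KEY",  # authenticates your requests
    "index": "support-docs",                   # index within the RAG host to query
    "namespace": "production",                 # namespace associated with your data
    "top_k": 5,                                # number of top responses to retrieve
}

def validate_rag_config(config: dict) -> list[str]:
    """Return a list of problems; an empty list means the config looks complete."""
    problems = []
    for field in ("rag_host", "context_engine_api_key", "index", "namespace"):
        if not config.get(field):
            problems.append(f"missing {field}")
    if not isinstance(config.get("top_k"), int) or config["top_k"] < 1:
        problems.append("top_k must be a positive integer")
    return problems
```

A higher Top K retrieves more candidate passages for the judge to weigh, at the cost of more retrieval work per evaluation.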
Set Judge Name
Name your judge according to your preference and click on “Create Judge”.
Creating a Lynx judge
The Lynx judge evaluates the factual correctness of model outputs, ensuring that the responses are accurate and free from hallucinated content. The responses are binary:
- PASS: Factually correct
- FAIL: Hallucinated content

The output also includes a reasoning array, which provides the reasoning behind the judgement.
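A minimal sketch of consuming a Lynx-style verdict, assuming a response with a PASS/FAIL label and a reasoning array; the exact field names are illustrative, not the platform's documented schema:

```python
# Hypothetical Lynx verdict: a binary PASS/FAIL label plus a
# "reasoning" array explaining the judgement.
def summarize_lynx_verdict(verdict: dict) -> str:
    label = verdict["label"]  # "PASS" (factually correct) or "FAIL" (hallucinated)
    status = "factually correct" if label == "PASS" else "hallucinated content"
    reasons = "; ".join(verdict.get("reasoning", []))
    return f"{label} ({status}): {reasons}"

# Example verdict in the assumed shape:
example = {
    "label": "FAIL",
    "reasoning": ["claim not supported by retrieved context"],
}
```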
Configure RAG Details

- RAG Host: The host address of the RAG endpoint. This is where your requests will be sent.
- Context Engine API Key: Your unique API key for authenticating requests to the RAG endpoint.
- Index: The specific index within the RAG host that you wish to query.
- Namespace: The namespace associated with your data in the RAG endpoint.
- Top K: The number of top responses to retrieve and consider from the RAG endpoint. This helps determine how many relevant results to evaluate.
Set Judge Name
Name your judge according to your preference and click on “Create Judge”.
Creating a Prompted Model judge
The Prompted Model judge is designed to evaluate the factual correctness of model outputs, ensuring that responses are accurate and free from hallucinated content. The evaluation is based on the Likert scale, and the final output is determined according to the prompt template you provide. Once you select the Prompted Model judge, click on “Continue”.
Configure Prompted Model Details

- Model: Select the model for the judge. You may choose an existing model or create a new one.
- Prompt Template: Define the template for the prompt used in evaluating model responses. You can customize this template according to your needs.
- Context Engine API Key: Enter your API key for authenticating interactions with the Context Engine.
- Index: Specify the index within the Context Engine you want to query.
- Namespace: Provide the namespace linked with your data in the Context Engine.
- Top K: Decide the number of top responses to retrieve from the Context Engine. This helps in evaluating the most pertinent results.
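The settings above can be sketched as a single configuration, shown here as a plain mapping. The model name, field names, and prompt template are illustrative assumptions, not the platform's exact schema; the template only needs placeholders for the retrieved context and the response under evaluation:

```python
# Hypothetical Prompted Model judge settings mirroring the fields above.
judge_settings = {
    "model": "gpt-4o",  # an existing model, or one you create
    "prompt_template": (
        "Given the retrieved context:\n{context}\n\n"
        "Rate the factual correctness of this response on a 1-5 Likert scale, "
        "where 5 means fully supported by the context:\n{response}"
    ),
    "context_engine_api_key": "YOUR_API_KEY",
    "index": "support-docs",
    "namespace": "production",
    "top_k": 5,
}

# Filling the template for a single evaluation:
prompt = judge_settings["prompt_template"].format(
    context="The store opens at 9am.",
    response="The store opens at 9am on weekdays.",
)
```

Because the final output follows your template, keeping the rating scale and its anchors explicit in the prompt makes the Likert scores easier to interpret downstream.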
Set Judge Name
Name your judge according to your preference and click on “Create Judge”.