Run Judge on Dataset
Use the Collinear AI API to run a judge on your dataset.
Step 1: Get dataset annotations from the API
Parameters
The function takes the following parameters:
- space_id: The unique identifier for the space in which the dataset resides.
- dataset_name: The name of the dataset for which you want to retrieve annotations.
Process
- The function constructs the API request URL using the base API endpoint and appends the necessary path.
- It sends a POST request with a JSON payload containing the space_id and dataset_name.
- The request is authenticated with a Bearer token passed in the Authorization header.
- The response is parsed, and the dataset annotations are converted into a Pandas DataFrame for easier manipulation and analysis.
Return Value
The function returns a Pandas DataFrame containing the annotations for the specified dataset.
Python Code
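The process described above can be sketched as follows, using only the standard library for the HTTP call. The base URL, the route, and the assumption that the response body is a JSON list of annotation records are placeholders for illustration, not the documented Collinear AI API; substitute the real values from the API reference.

```python
import json
import urllib.request

# Hypothetical values: replace with the real base URL and route.
BASE_URL = "https://api.example.com"
ANNOTATIONS_PATH = "/api/v1/dataset/annotations"  # placeholder, not the real route


def build_annotations_request(space_id, dataset_name, token, base_url=BASE_URL):
    """Build the authenticated POST request for a dataset's annotations."""
    payload = json.dumps({"space_id": space_id, "dataset_name": dataset_name})
    return urllib.request.Request(
        url=base_url.rstrip("/") + ANNOTATIONS_PATH,
        data=payload.encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def get_dataset_annotations(space_id, dataset_name, token, base_url=BASE_URL):
    """Send the request and return the annotations as a pandas DataFrame."""
    import pandas as pd  # imported here so the request builder works without pandas

    request = build_annotations_request(space_id, dataset_name, token, base_url)
    with urllib.request.urlopen(request, timeout=30) as response:
        body = json.load(response)
    # Assumption: the body is a JSON list of annotation records.
    return pd.DataFrame(body)
```

Splitting the request construction from the network call makes it possible to inspect the URL, headers, and payload before anything is sent.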
Step 2: Create a Judge
After retrieving the dataset annotations, the next step is to initialize the PromptedSafetyJudge, which uses an OpenAI GPT model to evaluate the dataset for safety or other criteria. The Python code below shows the initialization:
Python Code
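The idea behind a prompted judge can be illustrated with a small stand-in class. This is a sketch only: SketchSafetyJudge, its prompt template, and the complete callable are hypothetical names invented for this example and are not the PromptedSafetyJudge interface. The model call is passed in as a plain function, so any client (for example one that calls an OpenAI GPT model) can be plugged in.

```python
from typing import Callable

# Hypothetical prompt; the prompt used by the real judge is not shown in this guide.
DEFAULT_TEMPLATE = (
    "You are a safety judge. Reply with exactly one word, SAFE or UNSAFE.\n"
    "Conversation:\n{conversation}\n"
    "Response:\n{response}\n"
)


class SketchSafetyJudge:
    """Illustrative stand-in, NOT the PromptedSafetyJudge class itself."""

    def __init__(self, complete: Callable[[str], str], template: str = DEFAULT_TEMPLATE):
        # complete sends a prompt to the model and returns its text reply.
        self.complete = complete
        self.template = template

    def judge(self, conversation: str, response: str) -> bool:
        """Return True when the model labels the response as safe."""
        prompt = self.template.format(conversation=conversation, response=response)
        verdict = self.complete(prompt).strip().upper()
        return verdict.startswith("SAFE")
```

Because the model is injected as a callable, the judge can be exercised with a stub function before any API key is configured.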
Step 3: Run Judge on Dataset
Next, run the judge on the dataset annotations to evaluate the safety of the dataset. The following Python code demonstrates how to do this:
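As a rough sketch of this step, the loop below applies a judge to each annotation record and stores the result under a new key. The run_judge helper, the judgement key, and the record fields are assumptions made for this example; the judge is any callable that takes one record and returns a verdict.

```python
from typing import Callable, Iterable, List


def run_judge(
    records: Iterable[dict],
    judge: Callable[[dict], object],
    output_key: str = "judgement",
) -> List[dict]:
    """Apply judge to every record and return copies with the verdict attached."""
    results = []
    for record in records:
        judged = dict(record)  # copy so the input rows stay untouched
        judged[output_key] = judge(record)
        results.append(judged)
    return results
```

With a Pandas DataFrame df, the records can be obtained with df.to_dict("records"), and the returned list can be turned back into a DataFrame with pd.DataFrame(results).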
Step 4: Add the Judge's Judgements to Collinear AI
After running the judge on the dataset, you can add the judgements to Collinear AI for further analysis and model improvement. Here’s how you can do this using the Collinear AI API:
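An upload of this kind might look like the sketch below, which mirrors the authenticated POST request from Step 1. The URL and the payload field names (space_id, dataset_name, judgements) are assumptions for illustration only; check the Collinear AI API reference for the actual route and schema.

```python
import json
import urllib.request

# Hypothetical route; replace with the real endpoint.
JUDGEMENTS_URL = "https://api.example.com/api/v1/judgements"


def build_judgements_request(space_id, dataset_name, judgements, token,
                             url=JUDGEMENTS_URL):
    """Build an authenticated POST request carrying the judgements."""
    # Assumed payload shape, not the documented schema.
    body = {
        "space_id": space_id,
        "dataset_name": dataset_name,
        "judgements": judgements,
    }
    return urllib.request.Request(
        url=url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def upload_judgements(space_id, dataset_name, judgements, token,
                      url=JUDGEMENTS_URL):
    """Send the judgements and return the HTTP status code."""
    request = build_judgements_request(space_id, dataset_name, judgements, token, url)
    # urlopen raises urllib.error.HTTPError for 4xx and 5xx responses.
    with urllib.request.urlopen(request, timeout=30) as response:
        return response.status
```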
You can download the full Jupyter Notebook from here