The Collinear AI Platform lets you evaluate AI models with flexibility and precision. This guide walks you through creating a new assessment run using Collinear's Flex Judge.
A Collinear Flex Assessment lets you create a customizable Likert judge that mimics how your own subject-matter experts assess AI responses. It's ideal for:
Explore the embedded demo below to see a Collinear Flex run in action. The interactive walkthrough covers setting up a run, choosing your data, and customizing your Judge.
How to Leverage a Collinear Flex Run with Annotations
Follow these steps to create a new run with Collinear Flex, using annotated ground truth from an existing run:
Open an Existing Run
Start by opening any previously created Assess run.
Annotate the Data
Use the Revise Score feature to provide annotated ground truth on at least 5 rows. The more rows you annotate, the better the Judge will align with your criteria.
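As an illustration, an annotated row pairs the Judge's original score with your revised one. The field names below are hypothetical, not Collinear's actual export schema:

```python
# A hypothetical annotated row (field names are illustrative only,
# not the platform's actual export schema).
annotated_row = {
    "prompt": "Summarize the refund policy in two sentences.",
    "response": "Refunds are issued within 30 days of purchase...",
    "judge_score": 3,          # score originally assigned by the Judge
    "ground_truth_score": 5,   # your revised score via Revise Score
    "rationale": "Response is complete and accurate.",
}
```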
Create a New Dataset for the New Judge
Export the rows with ground truth and combine them with the data you want to evaluate, as in the sketch below.
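A minimal way to do this is with pandas. The file names and column layout here are assumptions for illustration; match them to your actual export and dataset:

```python
import pandas as pd

# Combine exported ground-truth rows with the new data to evaluate.
# File names and columns are assumptions; adjust to your own files.
ground_truth = pd.read_csv("annotated_rows_export.csv")  # rows with revised scores
new_data = pd.read_csv("rows_to_evaluate.csv")           # rows without ground truth

combined = pd.concat([ground_truth, new_data], ignore_index=True)
combined.to_csv("flex_judge_dataset.csv", index=False)
```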
Start a New Assessment
Start a new performance assessment, upload the revised dataset with ground truth, and select Collinear Flex as the Judge.
Customize the Scoring Criteria
A new set of scoring criteria will be generated automatically. You can tweak or regenerate these as needed.
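The generated criteria typically map each point on the Likert scale to a description. The rubric below is a hypothetical example of the kind of criteria you might tweak, not actual output from the platform:

```python
# A hypothetical 5-point Likert rubric of the kind Collinear Flex might
# generate; edit each description to match your experts' standards.
scoring_criteria = {
    5: "Fully correct, complete, and aligned with expert judgment.",
    4: "Correct with minor omissions or stylistic issues.",
    3: "Partially correct; notable gaps or inaccuracies.",
    2: "Mostly incorrect or incomplete, with some salvageable content.",
    1: "Incorrect, irrelevant, or unsafe.",
}
```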
Finalize the Judge
Click Create Assessment again to finalize and create the run.