Create Collinear Flex Evaluation
The Collinear AI Platform allows you to effortlessly evaluate AI models with flexibility and precision. This guide walks you through the steps to create a new evaluation run using the Flex Evaluation feature.
What is Flex Evaluation?
Flex Evaluation lets you create a customizable Likert judge and run evaluations with it. It's ideal for:
- Testing model accuracy across custom metrics
- Generating insights quickly without heavy setup
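To make the Likert-judge idea concrete, here is a minimal toy sketch of a 1-5 Likert scoring rule over a custom metric. Everything here (`score_response`, the keyword-coverage rubric) is a hypothetical illustration of the concept, not Collinear's actual API or scoring logic.

```python
# Illustrative only: a toy 1-5 Likert "judge" for a custom metric.
# The rubric and scoring rule are hypothetical, not Collinear's implementation.

LIKERT_RUBRIC = {
    1: "Response is incorrect or off-topic",
    2: "Response is partially correct but incomplete",
    3: "Response is correct but poorly explained",
    4: "Response is correct and clear",
    5: "Response is correct, clear, and well-supported",
}

def score_response(response: str, keywords: list[str]) -> int:
    """Toy rule: the more required keywords a response covers, the higher its Likert score."""
    if not keywords:
        return 3  # neutral score when there is nothing to check against
    covered = sum(1 for kw in keywords if kw.lower() in response.lower())
    ratio = covered / len(keywords)
    # Map coverage ratio onto the 1-5 scale, clamped to valid bounds.
    return max(1, min(5, 1 + round(ratio * 4)))
```

For example, `score_response("Paris is the capital of France", ["paris", "capital"])` covers both keywords and returns 5, while a response covering none returns 1. A real Flex judge would apply model-based criteria rather than keyword matching.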
Interactive Walkthrough
Explore the embedded demo below to see Flex Evaluation in action. This interactive guide will walk you through setting up a run, choosing your data, and interpreting results.
Key Features
- No-code setup for quick dataset evaluation
- Rich visual analytics for in-depth insights
How to Create a Flex Evaluation with Annotations
Follow these steps to create a new Flex Judge using annotations from an existing run:
1. **Open an Existing Run.** Start by opening a run that was created using Collinear Flex.
2. **Annotate the Data.** Use the Feedback feature to revise scores and provide annotations on specific rows.
3. **Select Rows for the New Judge.** Choose the rows you want to include in your new evaluation.
4. **Click "Create a Judge".** Once you've selected the rows, click the Create a Judge button.
5. **Customize the Scoring Criteria.** A new set of scoring criteria will be generated automatically. You can tweak or regenerate these as needed.
6. **Finalize the Judge.** Click Create Judge again to finalize and save it. This new judge will be available for use in future runs.
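The annotate-select-create flow above can be sketched as plain data handling. The field names below (`annotation`, `criteria`, the row shape) are assumptions made for illustration and are not Collinear's actual schema or API.

```python
# Hypothetical sketch of the annotation-to-judge flow described above.
# Field names ("annotation", "criteria") are illustrative assumptions only.

def select_annotated_rows(rows: list[dict]) -> list[dict]:
    """Keep only the rows a reviewer revised or annotated (steps 2-3)."""
    return [r for r in rows if r.get("annotation")]

def build_judge(rows: list[dict], name: str = "my-flex-judge") -> dict:
    """Derive a draft judge definition from the selected examples (steps 4-5)."""
    return {
        "name": name,
        "examples": rows,
        # Draft criteria seeded from the reviewer's annotations, deduplicated.
        "criteria": sorted({r["annotation"] for r in rows}),
    }

run_rows = [
    {"input": "Q1", "output": "A1", "annotation": "penalize vague answers"},
    {"input": "Q2", "output": "A2", "annotation": None},
    {"input": "Q3", "output": "A3", "annotation": "reward cited sources"},
]

judge = build_judge(select_annotated_rows(run_rows))
```

In the platform these steps happen in the UI; the sketch only shows how annotated rows become the seed material for the generated scoring criteria.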
Additional Resources
Here is a sample dataset that you can use to test the Flex Evaluation feature: Sample Dataset.