The Collinear AI Platform allows you to effortlessly evaluate AI models with flexibility and precision. This guide walks you through the steps to create a new evaluation run using the Flex Evaluation feature.
Flex Evaluation lets you create a customizable Likert judge and run evaluations with it. It's ideal for:
Explore the embedded demo below to see Flex Evaluation in action. This interactive guide will walk you through setting up a run, choosing your data, and interpreting results.
Follow these steps to create a new Flex Judge using annotations from an existing run:
Open an Existing Run: Start by opening a run that was created using Collinear Flex.
Annotate the Data: Use the Feedback feature to revise scores and provide annotations on specific rows.
Select Rows for the New Judge: Choose the rows you want to include in your new evaluation.
Click "Create a Judge": Once you've selected the rows, click the Create a Judge button.
Customize the Scoring Criteria: A new set of scoring criteria will be generated automatically. You can tweak or regenerate these as needed.
Finalize the Judge: Click Create Judge again to finalize and save it. This new judge will be available for use in future runs.
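The annotate-and-select part of the steps above can be pictured as a small data model. The following is only an illustrative sketch in plain Python, not the Collinear API: the `Row` class, its field names, the 1-5 Likert range, and both helper functions are assumptions made up for this example. It shows one way to think about which rows carry reviewer feedback and how often that feedback disagrees with the original judge.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Row:
    """One evaluated row. All field names here are hypothetical."""
    row_id: int
    response: str
    judge_score: int                     # Likert score from the original judge (1-5 assumed)
    revised_score: Optional[int] = None  # score corrected through feedback, if any
    annotation: str = ""                 # free-text note explaining the revision


def select_annotated(rows: list[Row]) -> list[Row]:
    """Keep only the rows where a reviewer supplied a revised score."""
    return [r for r in rows if r.revised_score is not None]


def disagreement_rate(rows: list[Row]) -> float:
    """Fraction of annotated rows whose revised score differs from the judge's score."""
    annotated = select_annotated(rows)
    if not annotated:
        return 0.0
    changed = sum(1 for r in annotated if r.revised_score != r.judge_score)
    return changed / len(annotated)


rows = [
    Row(1, "Answer A", judge_score=4),
    Row(2, "Answer B", judge_score=2, revised_score=4, annotation="Correct but terse"),
    Row(3, "Answer C", judge_score=5, revised_score=5, annotation="Agree"),
]

selected = select_annotated(rows)
print([r.row_id for r in selected])  # ids of rows that carry feedback
print(disagreement_rate(rows))       # share of annotated rows where the reviewer changed the score
```

A high disagreement rate among the selected rows suggests the generated scoring criteria may need more tweaking before the new judge is finalized.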
Here is a sample dataset that you can use to test the Flex Evaluation feature: Sample Dataset.