Generate Simulated Data (JSON)

Generate Data From Body

curl --request POST \
  --url https://api.collinear.ai/api/v1/synth_data/generate \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '
{
  "model_id": "3c90c3cc-0d44-4b50-8888-8dd25736052a",
  "multiplier": 123,
  "examples": [
    {}
  ],
  "prompt": "<string>"
}
'

[
  {
    "original": {},
    "generated": "<string>"
  }
]

Alert

This method of generating simulated data differs from the SDK workflow. To create trait-based simulations, use the SDK instead.

Overview

Generate synthetic conversation rows inline by providing an instruction prompt and a few seed examples. The service uses the specified model to fan out new samples that you can download or store immediately. This JSON variant is convenient when you build tooling that defines the schema and instructions programmatically.

Request body fields

model_id: UUID of the model that should author the synthetic responses.
multiplier: Integer multiplier applied to the number of examples. If you send three examples and set multiplier to 4, the service targets twelve generations.
examples: Array of objects. Each example can include any keys your pipeline understands (commonly conv_prefix, response, context, etc.).
prompt: Instruction prompt that explains how the model should transform the examples when creating new rows.
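
The four fields above can be assembled and sanity-checked on the client before sending. The following is a minimal sketch, not part of the API: the helper names build_payload and target_generations are hypothetical, and the checks for a positive multiplier and a non-empty examples list are assumptions rather than documented server rules.

```python
import uuid


def build_payload(model_id, multiplier, examples, prompt):
    """Assemble the request body from the four documented fields."""
    uuid.UUID(model_id)  # raises ValueError if model_id is not a valid UUID
    # Assumption: the multiplier must be a positive integer (bool is excluded).
    if isinstance(multiplier, bool) or not isinstance(multiplier, int) or multiplier < 1:
        raise ValueError("multiplier must be a positive integer")
    # Assumption: at least one example object is needed to seed generation.
    if not examples or not all(isinstance(e, dict) for e in examples):
        raise ValueError("examples must be a non-empty list of objects")
    if not isinstance(prompt, str):
        raise ValueError("prompt must be a string")
    return {
        "model_id": model_id,
        "multiplier": multiplier,
        "examples": examples,
        "prompt": prompt,
    }


def target_generations(payload):
    """Number of rows the service targets: examples times multiplier."""
    return len(payload["examples"]) * payload["multiplier"]
```

With three examples and a multiplier of 4, target_generations returns 12, matching the description of multiplier above.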

Example request

curl https://stage.collinear.ai/api/v1/synth_data/generate \
  -H 'Authorization: Bearer <token>' \
  -H 'Content-Type: application/json' \
  -d '{
    "model_id": "<MODEL_ID>",
    "multiplier": 5,
    "prompt": "Generate safe customer service conversations for billing issues.",
    "examples": [
      {
        "conv_prefix": [
          {"role": "user", "content": "My shipment is missing."}
        ],
        "response": {
          "role": "assistant",
          "content": "I'm sorry to hear that. Let me check the tracking number."
        }
      }
    ]
  }'
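
The same request can be issued from Python using only the standard library. This is a sketch under stated assumptions: build_request and send are hypothetical helper names, the default host is the one in the reference snippet at the top of the page (replace it with whichever host your account uses), and the 60-second timeout is an arbitrary choice.

```python
import json
import urllib.request

# Assumption: host taken from the reference snippet; adjust for your environment.
API_URL = "https://api.collinear.ai/api/v1/synth_data/generate"


def build_request(token, payload, url=API_URL):
    """Create a POST request carrying the JSON body and bearer token."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


def send(request, timeout=60):
    """Send the request and return the complete response body as text."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

Keeping request construction separate from sending makes it possible to inspect the headers and body in tests without touching the network.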

Response shape

The API returns an array. Each element includes the original example that the model expanded from and a generated string with the newly created row:

[
  {
    "original": {"conv_prefix": [...], "response": {...}},
    "generated": "{\n  \"conv_prefix\": [...],\n  \"response\": {...}\n}"
  }
]

Parse the generated string into JSON before storing it. The service may stream partial chunks before the final array is delivered; buffer the body before parsing if your HTTP client emits incremental events.
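
The buffering and parsing steps described above might look like the following sketch. The function name parse_response is hypothetical, and collecting rows whose generated string is not valid JSON into a separate list is an assumption about how you may want to handle failures, not documented service behavior.

```python
import json


def parse_response(chunks):
    """Buffer byte chunks, parse the array, then decode each generated string."""
    # Join all chunks first so a partial body is never handed to the parser.
    body = b"".join(chunks).decode("utf-8")
    rows = json.loads(body)

    parsed, failed = [], []
    for row in rows:
        try:
            parsed.append(
                {
                    "original": row["original"],
                    # "generated" arrives as a JSON-encoded string.
                    "generated": json.loads(row["generated"]),
                }
            )
        except json.JSONDecodeError:
            # Assumption: keep unparseable rows aside for inspection.
            failed.append(row)
    return parsed, failed
```

If your HTTP client already returns the whole body at once, pass it as a single-element list of bytes.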

Authorizations

Authorization (string, header, required): Bearer authentication header of the form Bearer <token>, where <token> is your auth token.

Body

application/json

model_id: string<uuid>, required
multiplier: integer, required
examples: object[], required
prompt: string, required

Response

Successful Response

original: object, required
generated: string, required
