DataGen

Synthetic Data Generator

Generate complex synthetic data with AI-powered intelligence

Developer documentation

HTTP API

DataGen is built for agentic workflows first: coding assistants, autonomous test runners, and internal tools that only know what columns they need—not every Faker type string and constraint. The same HTTP API also supports teams that want fully explicit schemas for reproducible pipelines. Both paths share authentication, tiers, and limits.

Whether you call from an LLM tool, a script, or CI, you choose between AI mode (names only, we infer types and values) and schema mode (you declare each field’s type and options). The web app’s Schema Builder maps directly to schema mode payloads.

Two generation modes

One surface, two contracts. Use POST /api/v1/generate-ai when the caller should stay lightweight; use POST /api/v1/generate when you need deterministic, reviewable field definitions. The optional POST /api/v1/validate endpoint dry-runs a schema mode request before you spend quota.

AI mode /generate-ai

You send field_names (and optional domain_hint, locale, count). The service picks sensible types and fills rows—ideal when an agent discovers columns from a ticket, a spreadsheet header, or a user question.

Typical agent requests: “Generate 20 rows for these headers”, seed data for a spike, synthetic users for a demo API.

Schema mode /generate

You send schema_fields: each field has name, type (catalog key), and optional locale, constraints, blankPercentage. Output shape is predictable—ideal when legal, QA, or data engineering signs off on the exact layout.

Typical jobs: contract tests, migration dry-runs, ETL fixtures, replayable golden files.

Example: AI mode (contract-first)

POST /api/v1/infer-schema
Content-Type: application/json
X-API-Key: YOUR_KEY

{
  "field_names": ["customer_id", "email", "risk_tier", "last_login_at"],
  "locale": "en_AE",
  "domain_hint": "UAE retail loyalty program"
}

# Response includes:
# - proposed_schema
# - contract_hash
# - contract_id

POST /api/v1/generate-ai
Content-Type: application/json
X-API-Key: YOUR_KEY

{
  "field_names": ["customer_id", "email", "risk_tier", "last_login_at"],
  "count": 15,
  "locale": "en_AE",
  "domain_hint": "UAE retail loyalty program",
  "output_format": "json",
  "require_validate": true,
  "strict_contract": true,
  "expected_contract_hash": "PASTE_contract_hash_FROM_infer-schema"
}
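The two requests above can be chained from a script. The following is a minimal Python sketch, not an official client: the helper names (make_post, generate_with_contract) are hypothetical, the base URL is whatever deployment you call, and it assumes the infer-schema response is a JSON object with a top-level contract_hash key, as the response comment above suggests.

```python
import json
import urllib.request
from typing import Callable

# Hypothetical transport type: takes (path, body) and returns the parsed JSON response.
Post = Callable[[str, dict], dict]


def make_post(base_url: str, api_key: str) -> Post:
    """Build a transport that POSTs JSON with the X-API-Key header."""
    def post(path: str, body: dict) -> dict:
        req = urllib.request.Request(
            base_url + path,
            data=json.dumps(body).encode("utf-8"),
            headers={"Content-Type": "application/json", "X-API-Key": api_key},
            method="POST",
        )
        with urllib.request.urlopen(req) as resp:
            return json.load(resp)
    return post


def generate_with_contract(post: Post, field_names, count, locale, domain_hint) -> dict:
    """Infer a schema first, then generate rows locked to its contract hash."""
    shared = {"field_names": field_names, "locale": locale, "domain_hint": domain_hint}
    inferred = post("/api/v1/infer-schema", shared)
    body = {
        **shared,
        "count": count,
        "output_format": "json",
        "require_validate": True,
        "strict_contract": True,
        # Assumption: the hash is returned under this top-level key.
        "expected_contract_hash": inferred["contract_hash"],
    }
    return post("/api/v1/generate-ai", body)
```

Passing the transport in as an argument keeps the flow testable without network access; in real use you would call generate_with_contract(make_post(base_url, api_key), ...).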

Example: AI mode (minimal body)

POST /api/v1/generate-ai
Content-Type: application/json
X-API-Key: YOUR_KEY

{
  "field_names": ["customer_id", "email", "risk_tier", "last_login_at"],
  "count": 15,
  "locale": "en_AE",
  "domain_hint": "UAE retail loyalty program",
  "output_format": "json"
}

Example: Schema mode (explicit types)

POST /api/v1/generate
Content-Type: application/json
X-API-Key: YOUR_KEY

{
  "count": 50,
  "locale": "en_US",
  "output_format": "csv",
  "schema_fields": [
    { "name": "order_id", "type": "uuid" },
    { "name": "line_total", "type": "amount" },
    { "name": "customer_email", "type": "email" }
  ]
}
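If you assemble schema mode bodies in code, a small builder keeps the payload reviewable. This is a sketch under assumptions: schema_field and build_generate_body are hypothetical helper names, the shape of constraints is not specified here so it is passed through untouched, and the duplicate-name check is a local sanity check of ours, not a documented API rule.

```python
def schema_field(name, type_, locale=None, constraints=None, blank_percentage=None):
    """Return one schema_fields entry; optional keys are omitted when unset."""
    field = {"name": name, "type": type_}
    if locale is not None:
        field["locale"] = locale
    if constraints is not None:
        field["constraints"] = constraints  # shape is an assumption; passed through as-is
    if blank_percentage is not None:
        field["blankPercentage"] = blank_percentage
    return field


def build_generate_body(fields, count, locale="en_US", output_format="csv"):
    """Assemble a request body for POST /api/v1/generate."""
    names = [f["name"] for f in fields]
    if len(names) != len(set(names)):
        # Local sanity check only; the service may or may not enforce this.
        raise ValueError("duplicate field names in schema_fields")
    return {
        "count": count,
        "locale": locale,
        "output_format": output_format,
        "schema_fields": list(fields),
    }
```

Calling build_generate_body with the three fields above (order_id, line_total, customer_email) and count=50 reproduces the example body.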

For machine-readable validation errors (message plus correction hints), send the header Accept: application/json, application/vnd.agentic+json on the generate, validate, and AI endpoints; this is useful for self-correcting agents. To lock the contract in AI mode, call /api/v1/infer-schema first, then pass the returned contract_hash as expected_contract_hash to /api/v1/generate-ai.
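A sketch of how a caller might opt in to machine-readable errors from Python. The helper names are hypothetical, and the exact keys of the error body are not specified on this page, so decode_error_body only parses the JSON and falls back to the raw text rather than assuming any field names.

```python
import json
import urllib.request

ACCEPT_AGENTIC = "application/json, application/vnd.agentic+json"


def build_validate_request(base_url: str, api_key: str, body: dict) -> urllib.request.Request:
    """Prepare (but do not send) a POST /api/v1/validate request with the agentic Accept header."""
    return urllib.request.Request(
        base_url + "/api/v1/validate",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Accept": ACCEPT_AGENTIC,
            "X-API-Key": api_key,
        },
        method="POST",
    )


def decode_error_body(raw: bytes) -> dict:
    """Parse an error response body; keep the raw text if it is not a JSON object."""
    try:
        parsed = json.loads(raw.decode("utf-8"))
    except ValueError:  # covers both invalid UTF-8 and invalid JSON
        return {"raw": raw.decode("utf-8", errors="replace")}
    return parsed if isinstance(parsed, dict) else {"raw": parsed}
```

When urllib.request.urlopen raises urllib.error.HTTPError, the exception's read() method returns the response body bytes, which can be handed to decode_error_body so an agent can inspect the hints and retry.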

What the API does for you

Technical paths: GET /api/v1/capabilities, POST /api/v1/validate, POST /api/v1/generate, POST /api/v1/generate-ai (and …/generate-ai/stream for SSE).
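For the SSE variant, a client has to split the stream into events. The sketch below follows the generic server-sent events line format (data: lines, a blank line ends an event, lines starting with a colon are comments). What each event's data contains on …/generate-ai/stream is not described here, so the function (a hypothetical helper) returns the raw data strings and leaves interpretation to the caller.

```python
def iter_sse_data(lines):
    """Yield the data payload of each complete event from an iterable of text lines."""
    buffer = []
    for line in lines:
        line = line.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if buffer:
                yield "\n".join(buffer)
                buffer = []
            continue
        if line.startswith(":"):
            continue  # comment / keep-alive line
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # one leading space after the colon is not part of the value
        if field == "data":
            buffer.append(value)
        # Other fields (event, id, retry) are ignored in this sketch.
    # An unterminated trailing event is dropped, as in the SSE processing model.
```

The function takes any iterable of decoded lines, so it can be fed from an HTTP response wrapped in a text reader or from a recorded fixture.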

Interactive reference

Browse every endpoint, try requests from the browser, and copy working examples from the live API reference at /api/v1/docs (same site you are on now). If your tools need a machine-readable description, use /api/v1/openapi.json. If your stack uses MCP over HTTPS on this same deployment, the entry path is /mcp/sse with the same X-API-Key model—see the MCP overview.

Get started in three steps

1. Get a free API key

Follow the “Get a free API key” link. You will receive a short JSON response containing your key, which starts with DATAGEN-FREE-. Treat it like a password: anyone with the key can use your free-tier allowance. On a shared office network, your whole team may count against the same request limits; if a request is refused, wait a bit and try again, or reuse a key you already saved.

2. Attach the key to each request

Send your key in the X-API-Key HTTP header on every request. In the interactive reference, click Authorize, paste the key once, then use Try it out on any operation. Clients such as Postman or curl send the header the same way.

3. Call in a sensible order

Start with GET /api/v1/capabilities, check your field list with POST /api/v1/validate, then generate rows with POST /api/v1/generate. The interactive reference lists response codes and example bodies for each endpoint.
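The recommended order can be sketched as one function. The assumptions here are ours: run_in_order and the call transport are hypothetical names, validate is assumed to accept the same body as generate (it dry-runs schema mode), and the transport is assumed to raise an exception on a non-success status so that a failed validation stops the sequence before any rows are generated.

```python
from typing import Callable, Optional

# Hypothetical transport: (method, path, body) -> parsed response; raises on HTTP errors.
Call = Callable[[str, str, Optional[dict]], dict]


def run_in_order(call: Call, schema_fields: list, count: int):
    """Call capabilities, then validate, then generate, stopping at the first failure."""
    capabilities = call("GET", "/api/v1/capabilities", None)

    body = {"count": count, "schema_fields": schema_fields}
    # Assumption: validate takes the same body as generate.
    validation = call("POST", "/api/v1/validate", body)

    # Only reached if validate did not raise, so quota is not spent on a bad schema.
    rows = call("POST", "/api/v1/generate", body)
    return capabilities, validation, rows
```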

For your IT or integration team

The interactive reference can be linked or embedded in your internal wiki the same way you would embed any secure page on this domain. Teams that maintain their own developer portal can import /api/v1/openapi.json into Postman, Scalar, Redoc, or similar products. If your organization uses AI assistants or IDEs with MCP, see the MCP integration overview (availability on request).

Higher limits and teams

Coming soon: plans with higher volume, faster throughput, and team-friendly billing for production-style workloads. Details will appear on this page when they are ready.