generate-synthetic-data

Create diverse synthetic test inputs for LLM pipeline evaluation using dimension-based tuple generation. Use when bootstrapping an eval dataset, when real user data is sparse, or when stress-testing specific failure hypotheses. Do NOT use when you already have 100+ representative real traces (use stratified sampling instead), or when the task is collecting production logs.

74 Installs · 3 Trend · @hamelsmu

Installation

$ npx skills add https://github.com/hamelsmu/evals-skills --skill generate-synthetic-data

How to Install generate-synthetic-data

Install the generate-synthetic-data AI skill into your development environment from the command line.

  1. Open Terminal: Open your terminal or command line tool (Terminal, iTerm, Windows Terminal, etc.)
  2. Run Installation Command: Copy and run this command: npx skills add https://github.com/hamelsmu/evals-skills --skill generate-synthetic-data
  3. Verify Installation: Once installed, the skill will be automatically configured in your AI coding environment and ready to use in Claude Code, Cursor, or OpenClaw

Source: hamelsmu/evals-skills.

SKILL.md

Generate diverse, realistic test inputs that cover the failure space of an LLM pipeline.

Before generating synthetic data, identify where the pipeline is likely to fail. Ask the user about known failure-prone areas, review existing user feedback, or form hypotheses from available traces. Dimensions (Step 1) must target anticipated failures, not arbitrary variation.

Dimensions are axes of variation specific to your application. Choose dimensions based on where you expect failures.
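As a rough illustration of dimension-based tuple generation, the sketch below takes the cross product of a few dimensions and optionally draws a seeded sample from it. Everything in it is an assumption made for the example: the dimension names and values describe an imagined customer-support assistant, and `generate_tuples` is a hypothetical helper, not something defined by the skill.

```python
import itertools
import random

# Hypothetical dimensions for an imagined customer-support assistant.
# In practice, pick axes where you expect the pipeline to fail.
DIMENSIONS = {
    "intent": ["refund request", "order status", "account deletion"],
    "persona": ["first-time user", "frustrated repeat customer"],
    "complexity": ["single clear ask", "ambiguous ask", "multiple asks"],
}


def generate_tuples(dimensions, sample_size=None, seed=0):
    """Return every combination of dimension values as a list of dicts.

    If sample_size is given and smaller than the number of combinations,
    return a reproducible random sample of that size instead.
    """
    names = list(dimensions)
    combos = [
        dict(zip(names, values))
        for values in itertools.product(*dimensions.values())
    ]
    if sample_size is not None and sample_size < len(combos):
        combos = random.Random(seed).sample(combos, sample_size)
    return combos


all_tuples = generate_tuples(DIMENSIONS)
sampled = generate_tuples(DIMENSIONS, sample_size=5)

print(len(all_tuples), "combinations;", len(sampled), "sampled")
for combo in sampled:
    print(combo)
```

Each resulting tuple can then be turned into a natural-language test input, for example by handing it to an LLM with a prompt that asks for a realistic user message matching those values. Sampling keeps the set small when the full cross product is too large to review by hand.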

Facts (cite-ready)

Stable fields and commands for AI/search citations.

Install command
npx skills add https://github.com/hamelsmu/evals-skills --skill generate-synthetic-data
Category
Data Analysis
Verified
First Seen
2026-03-04
Updated
2026-03-10

Quick answers

What is generate-synthetic-data?

Create diverse synthetic test inputs for LLM pipeline evaluation using dimension-based tuple generation. Use when bootstrapping an eval dataset, when real user data is sparse, or when stress-testing specific failure hypotheses. Do NOT use when you already have 100+ representative real traces (use stratified sampling instead), or when the task is collecting production logs. Source: hamelsmu/evals-skills.

How do I install generate-synthetic-data?

Open your terminal or command line tool (Terminal, iTerm, Windows Terminal, etc.), then copy and run this command:
npx skills add https://github.com/hamelsmu/evals-skills --skill generate-synthetic-data
Once installed, the skill will be automatically configured in your AI coding environment and ready to use in Claude Code, Cursor, or OpenClaw.

Where is the source repository?

https://github.com/hamelsmu/evals-skills