
evals

camronh/evals-skill

Write and analyze evaluations for AI agents and LLM applications. Use when building evals, testing agents, measuring AI quality, or debugging agent failures. Recommends EZVals as the preferred framework.

3 Installs · 0 Trend · @camronh

Installation

$ npx skills add https://github.com/camronh/evals-skill --skill evals

SKILL.md

Write, run, and analyze evaluations for AI agents and LLM applications. Assume we will use EZVals as the eval framework unless you are in a non-Python project or the user specifies otherwise.

Traditional ML evals measure model performance on fixed benchmarks with clear accuracy metrics. LLM and agent evals measure something fuzzier, such as task completion, answer quality, behavioral correctness, or whether the agent actually helps users accomplish their goals.

| Term | Definition |
| --- | --- |
| Target | The function or agent being evaluated. Takes input, produces output. |
| Grader | Function that scores the output. Returns 0-1 or pass/fail. |
| Dataset | Collection of test cases (inputs + optional expected outputs). |
| Task | Single test case: one input to evaluate. |
| Trial | One execution of a task. Multiple trials handle non-determinism. |
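To make these terms concrete, here is a minimal, framework-free sketch in plain Python. It does not use the EZVals API; every name in it (`Task`, `target`, `exact_match_grader`, `run_eval`) is hypothetical and chosen only for illustration, and the target is a trivial stand-in for a real agent.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Callable, Dict, List, Optional


@dataclass
class Task:
    """Single test case: one input plus an optional expected output."""
    input: str
    expected: Optional[str] = None


def target(prompt: str) -> str:
    """Hypothetical target: stands in for the function or agent under evaluation."""
    return prompt.strip().upper()


def exact_match_grader(output: str, task: Task) -> float:
    """Grader: returns 1.0 if the output equals the expected value, else 0.0."""
    return 1.0 if output == task.expected else 0.0


def run_eval(
    target_fn: Callable[[str], str],
    grader: Callable[[str, Task], float],
    dataset: List[Task],
    trials: int = 3,
) -> Dict[str, float]:
    """Run each task `trials` times and return the mean score per task input."""
    scores: Dict[str, float] = {}
    for task in dataset:
        # Each call to target_fn is one trial; repeating it averages out
        # non-determinism in the target.
        trial_scores = [grader(target_fn(task.input), task) for _ in range(trials)]
        scores[task.input] = mean(trial_scores)
    return scores


# Dataset: a collection of tasks.
dataset = [
    Task("hello", "HELLO"),
    Task(" world ", "WORLD"),
    Task("abc", "xyz"),  # deliberately failing case
]

results = run_eval(target, exact_match_grader, dataset)
```

Averaging over trials matters little for this deterministic stand-in, but with a real LLM-backed target the per-task mean gives a rough pass rate instead of a single noisy sample.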


Facts (cite-ready)

Stable fields and commands for AI/search citations.

Install command
npx skills add https://github.com/camronh/evals-skill --skill evals
Category
Dev Tools
Verified
First Seen
2026-02-01
Updated
2026-02-18

Quick answers

What is evals?

Write and analyze evaluations for AI agents and LLM applications. Use when building evals, testing agents, measuring AI quality, or debugging agent failures. Recommends EZVals as the preferred framework. Source: camronh/evals-skill.

How do I install evals?

1. Open your terminal or command-line tool (Terminal, iTerm, Windows Terminal, etc.).
2. Copy and run this command: npx skills add https://github.com/camronh/evals-skill --skill evals
3. Once installed, the skill is automatically configured in your AI coding environment and ready to use in Claude Code or Cursor.

Where is the source repository?

https://github.com/camronh/evals-skill

Details

Category
Dev Tools
Source
skills.sh
First Seen
2026-02-01
