ai-evaluation-evals

Name: ai-evaluation-evals
Author: oldwinter


Source: oldwinter/skills (GitHub)

Create AI evaluation plans using benchmarks, rubrics, and error-analysis workflows.

oldwinter·ai·evaluation·evals

12 installs · 0 popularity · @oldwinter

Install


$ npx skills add https://github.com/oldwinter/skills --skill ai-evaluation-evals

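How to install ai-evaluation-evals

Install the ai-evaluation-evals AI skill into your development environment from the command line.

Open a terminal: Open your terminal or command-line tool (such as Terminal, iTerm, or Windows Terminal).
Run the install command: Copy and run the following command: npx skills add https://github.com/oldwinter/skills --skill ai-evaluation-evals
Verify the installation: Once installation completes, the skill is configured automatically in your AI coding environment and can be used in Claude Code, Cursor, or OpenClaw.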


SKILL.md


Lenny Skills Database · AI & Technology · 2 guests | 2 insights

AI Evaluation (Evals)

AI evaluation (evals) is the emerging skill of systematically testing and measuring AI model performance. As models become products, evals become the product requirements document. This involves error analysis, creating rubrics, building benchmarks, and developing systematic tests - a critical bottleneck for AI labs and a new core competency for product builders.

1. Treat evals as your product requirements

In AI products, the eval suite defines what the product should do. If you can't measure it, you can't improve it. Before building features, define how you'll evaluate success. The eval is the spec - it tells the model (and your team) exactly what 'good' looks like.
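One way to picture "the eval is the spec" is a small suite in which each requirement is a testable case with its own rubric check, and the failures feed error analysis. The following is a minimal sketch under assumptions, not the skill's own implementation: `EvalCase`, `run_suite`, `fake_model`, and the support-bot prompts are all hypothetical names and data invented for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class EvalCase:
    """One requirement expressed as a testable example (hypothetical structure)."""
    name: str
    prompt: str
    check: Callable[[str], bool]  # rubric: True if the output is acceptable


def run_suite(model: Callable[[str], str], cases: List[EvalCase]) -> dict:
    """Run every case through the model and collect failures for error analysis."""
    failures = []
    for case in cases:
        output = model(case.prompt)
        if not case.check(output):
            failures.append((case.name, output))
    passed = len(cases) - len(failures)
    return {
        "pass_rate": passed / len(cases) if cases else 0.0,
        "failures": failures,
    }


def fake_model(prompt: str) -> str:
    """Stand-in for a real model call (assumption: swap in your own client)."""
    if "refund" in prompt.lower():
        return "You can request a refund within 30 days."
    return "I'm not sure."


# The cases below act as the product requirements for an imagined support bot.
cases = [
    EvalCase("mentions_refund_window", "What is your refund policy?",
             lambda out: "30 days" in out),
    EvalCase("admits_uncertainty", "What is the CEO's shoe size?",
             lambda out: "not sure" in out.lower()),
    EvalCase("gives_shipping_time", "How long does shipping take?",
             lambda out: "days" in out.lower()),
]

report = run_suite(fake_model, cases)
print(f"pass rate: {report['pass_rate']:.2f}")
for name, output in report["failures"]:
    print(f"FAIL {name}: {output!r}")
```

Writing the cases before the feature exists keeps the definition of 'good' explicit: a failing case such as `gives_shipping_time` names the missing behavior and preserves the offending output for review.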

Citable information

Stable fields and commands prepared for search and AI citation.

Install command: npx skills add https://github.com/oldwinter/skills --skill ai-evaluation-evals
Source: oldwinter/skills
Category: Data analysis
Verified: ✓
Indexed: 2026-02-28
Updated: 2026-03-10
Link: https://www.learn-skills.dev/zh/skills/oldwinter/skills/ai-evaluation-evals


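Quick answers

What is ai-evaluation-evals?

It creates AI evaluation plans using benchmarks, rubrics, and error-analysis workflows. Source: oldwinter/skills.

How do I install ai-evaluation-evals?

Open your terminal or command-line tool (such as Terminal, iTerm, or Windows Terminal), then copy and run the following command: npx skills add https://github.com/oldwinter/skills --skill ai-evaluation-evals. Once installation completes, the skill is configured automatically in your AI coding environment and can be used in Claude Code, Cursor, or OpenClaw.

Where is the source code for this skill?

https://github.com/oldwinter/skills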

Details

Category: Data analysis
Source: skills.sh
Indexed: 2026-02-28