advanced-evaluation

This skill should be used when the user asks to "implement LLM-as-judge", "compare model outputs", "create evaluation rubrics", "mitigate evaluation bias", or mentions direct scoring, pairwise comparison, position bias, evaluation pipelines, or automated quality assessment.

3 installs · Trend: 0 · @chakshugautam

Installation

$ npx skills add https://github.com/chakshugautam/games --skill advanced-evaluation

How to Install advanced-evaluation

Install the advanced-evaluation skill into your development environment from the command line:

  1. Open Terminal: Open your terminal or command line tool (Terminal, iTerm, Windows Terminal, etc.)
  2. Run Installation Command: Copy and run this command: npx skills add https://github.com/chakshugautam/games --skill advanced-evaluation
  3. Verify Installation: Once installed, the skill is configured automatically in your AI coding environment and is ready to use in Claude Code, Cursor, or OpenClaw.

Source: chakshugautam/games.

SKILL.md

This skill covers production-grade techniques for evaluating LLM outputs using LLMs as judges. It synthesizes research from academic papers, industry practices, and practical implementation experience into actionable patterns for building reliable evaluation systems.

Key insight: LLM-as-a-Judge is not a single technique but a family of approaches, each suited to different evaluation contexts. Choosing the right approach and mitigating known biases is the core competency this skill develops.
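As one illustration of bias mitigation, the sketch below handles position bias in pairwise comparison by asking the judge twice with the response order swapped and accepting a winner only when both passes agree. It is a minimal sketch, not the skill's own implementation: the `Judge` signature, the function `compare_with_position_swap`, and the two stub judges are hypothetical names introduced here, and a real judge would wrap an LLM call.

```python
from typing import Callable, Literal

Verdict = Literal["A", "B", "tie"]
# Hypothetical judge signature (an assumption, not from the skill):
# (prompt, response_in_slot_A, response_in_slot_B) -> verdict
Judge = Callable[[str, str, str], Verdict]


def compare_with_position_swap(
    judge: Judge, prompt: str, first: str, second: str
) -> str:
    """Return "first", "second", or "tie" after judging in both orders."""
    forward = judge(prompt, first, second)   # `first` shown in slot A
    backward = judge(prompt, second, first)  # `second` shown in slot A

    # Map slot labels back to the underlying responses.
    forward_pick = {"A": "first", "B": "second", "tie": "tie"}[forward]
    backward_pick = {"A": "second", "B": "first", "tie": "tie"}[backward]

    # Disagreement between the two passes suggests the verdict depended
    # on position, so it is treated as a tie rather than a win.
    return forward_pick if forward_pick == backward_pick else "tie"


# Stub judges for illustration only; a real judge would call an LLM.
def always_first_slot(prompt: str, a: str, b: str) -> Verdict:
    return "A"  # fully position-biased: always prefers slot A


def prefers_longer(prompt: str, a: str, b: str) -> Verdict:
    if len(a) == len(b):
        return "tie"
    return "A" if len(a) > len(b) else "B"
```

With these stubs, `compare_with_position_swap(always_first_slot, "q", "x", "y")` returns "tie" because the two passes contradict each other, while `prefers_longer` gives the same winner in both orders. Treating an inconsistent pair as a tie is one possible policy; escalating to a human reviewer or a second judge are alternatives.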

Evaluation approaches fall into two primary categories with distinct reliability profiles:

Facts (cite-ready)

Stable fields and commands for AI/search citations.

Install command: npx skills add https://github.com/chakshugautam/games --skill advanced-evaluation
Category: Dev Tools
Verified
First Seen: 2026-02-26
Updated: 2026-03-11

Quick answers

What is advanced-evaluation?

advanced-evaluation is an AI skill that covers production-grade techniques for evaluating LLM outputs using LLMs as judges. It is meant to be used when the user asks to "implement LLM-as-judge", "compare model outputs", "create evaluation rubrics", "mitigate evaluation bias", or mentions direct scoring, pairwise comparison, position bias, evaluation pipelines, or automated quality assessment. Source: chakshugautam/games.

How do I install advanced-evaluation?

  1. Open your terminal or command line tool (Terminal, iTerm, Windows Terminal, etc.).
  2. Copy and run this command: npx skills add https://github.com/chakshugautam/games --skill advanced-evaluation
  3. Once installed, the skill is configured automatically in your AI coding environment and is ready to use in Claude Code, Cursor, or OpenClaw.

Where is the source repository?

https://github.com/chakshugautam/games

Details

Category: Dev Tools
Source: skills.sh
First Seen: 2026-02-26