llm-evaluation

LLM output evaluation and quality assessment. Use when implementing LLM-as-judge patterns, quality gates for AI outputs, or automated evaluation pipelines.

4 installs · 0 trending · by @yonatangross

Installation

$ npx skills add https://github.com/yonatangross/skillforge-claude-plugin --skill llm-evaluation

SKILL.md

Evaluate and validate LLM outputs for quality assurance using RAGAS and LLM-as-judge patterns.

| Metric | Use case | Threshold |
| --- | --- | --- |
| Faithfulness | RAG grounding | ≥ 0.8 |
| Answer Relevancy | Q&A systems | ≥ 0.7 |
| Context Precision | Retrieval quality | ≥ 0.7 |
| Context Recall | Retrieval completeness | ≥ 0.7 |
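A minimal sketch of gating a RAG pipeline against these thresholds, assuming the classic `ragas` Python API (`evaluate` plus metric objects) and an `OPENAI_API_KEY` in the environment; the sample question, contexts, and answers are purely illustrative:

```python
# RAGAS evaluation sketch (assumes the classic ragas API; sample data is illustrative).
from datasets import Dataset
from ragas import evaluate
from ragas.metrics import (
    faithfulness,
    answer_relevancy,
    context_precision,
    context_recall,
)

# One evaluation record: question, retrieved contexts, generated answer,
# and a reference answer (ground_truth), which context_recall needs.
data = {
    "question": ["What does the faithfulness metric measure?"],
    "contexts": [[
        "Faithfulness measures how well the generated answer is grounded "
        "in the retrieved contexts."
    ]],
    "answer": ["It checks whether the answer is grounded in the retrieved context."],
    "ground_truth": ["Faithfulness checks that the answer is supported by the retrieved contexts."],
}

result = evaluate(
    Dataset.from_dict(data),
    metrics=[faithfulness, answer_relevancy, context_precision, context_recall],
)

# Quality gate using the thresholds from the table above.
# evaluate() returns a dict-like result, e.g. result["faithfulness"].
thresholds = {
    "faithfulness": 0.8,
    "answer_relevancy": 0.7,
    "context_precision": 0.7,
    "context_recall": 0.7,
}
failures = {m: result[m] for m, t in thresholds.items() if result[m] < t}
if failures:
    raise SystemExit(f"Quality gate failed: {failures}")
print("All RAGAS metrics passed:", result)
```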

| Setting | Recommendation |
| --- | --- |
| Judge model | GPT-4o-mini or Claude Haiku |
| Threshold | 0.7 for production, 0.6 for drafts |
| Dimensions | 3-5 most relevant to use case |
| Sample size | 50+ for reliable metrics |
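A sketch of the LLM-as-judge pattern with these defaults, using the OpenAI Python client and `gpt-4o-mini` as the judge; the dimension names and prompt wording are illustrative assumptions, not part of the skill:

```python
# LLM-as-judge sketch: score an output on a few dimensions with a small judge
# model, then apply the 0.7 production threshold from the table above.
# The dimensions and prompt wording are illustrative assumptions.
import json
from openai import OpenAI  # requires OPENAI_API_KEY in the environment

client = OpenAI()
DIMENSIONS = ["accuracy", "relevance", "clarity"]  # pick 3-5 per use case
THRESHOLD = 0.7  # 0.7 for production, 0.6 for drafts


def judge(prompt: str, output: str) -> dict:
    """Ask the judge model for a 0-1 score per dimension, returned as JSON."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        response_format={"type": "json_object"},
        messages=[
            {
                "role": "system",
                "content": (
                    "You are an evaluation judge. Score the assistant output on "
                    f"these dimensions: {', '.join(DIMENSIONS)}. "
                    "Return JSON mapping each dimension to a float from 0 to 1."
                ),
            },
            {"role": "user", "content": f"Prompt:\n{prompt}\n\nOutput:\n{output}"},
        ],
    )
    return json.loads(response.choices[0].message.content)


def passes_gate(scores: dict, threshold: float = THRESHOLD) -> bool:
    """Quality gate: every dimension must meet the threshold."""
    return all(scores.get(dim, 0.0) >= threshold for dim in DIMENSIONS)


if __name__ == "__main__":
    scores = judge("Summarize the release notes.", "The release adds ...")
    print(scores, "->", "pass" if passes_gate(scores) else "fail")
```

In a pipeline, run this judge over 50+ sampled outputs before trusting the aggregate metrics, per the sample-size recommendation above.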



Facts (cite-ready)

Stable fields and commands for AI/search citations.

Install command
npx skills add https://github.com/yonatangross/skillforge-claude-plugin --skill llm-evaluation
Category
Dev Tools
Verified
First Seen
2026-02-01
Updated
2026-02-18

Quick answers

What is llm-evaluation?

LLM output evaluation and quality assessment. Use when implementing LLM-as-judge patterns, quality gates for AI outputs, or automated evaluation pipelines. Source: yonatangross/skillforge-claude-plugin.

How do I install llm-evaluation?

Open your terminal or command line tool (Terminal, iTerm, Windows Terminal, etc.), then copy and run this command: npx skills add https://github.com/yonatangross/skillforge-claude-plugin --skill llm-evaluation. Once installed, the skill is automatically configured in your AI coding environment and ready to use in Claude Code or Cursor.

Where is the source repository?

https://github.com/yonatangross/skillforge-claude-plugin