Evaluate the output by type, score each criterion, make an accept/reject decision, and suggest concrete improvements. Goal: prevent weak output from reaching the next step.

| Score | Decision | Action |
|-------|----------|--------|
| 8-10 | ACCEPT | Proceed |
| 6-7 | CONDITIONAL | Apply minor fixes, then proceed |
| 0-5 | REJECT | Apply improvements, re-evaluate |

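
The score bands above can be sketched as a small lookup. This is a minimal illustration, not part of the skill itself: the function name `decide` is hypothetical, and treating fractional scores such as 7.5 as belonging to the lower band (accept only at 8 or above) is an assumption the table does not spell out.

```python
def decide(score: float) -> tuple[str, str]:
    """Map a 0-10 score to a (decision, action) pair using the bands above.

    Assumption: scores that fall between the listed integer bands
    (for example 7.5 or 5.5) are placed in the lower band.
    """
    if not 0 <= score <= 10:
        raise ValueError(f"score must be between 0 and 10, got {score}")
    if score >= 8:
        return ("ACCEPT", "Proceed")
    if score >= 6:
        return ("CONDITIONAL", "Apply minor fixes, then proceed")
    return ("REJECT", "Apply improvements, re-evaluate")


if __name__ == "__main__":
    for s in (9, 6.5, 4):
        decision, action = decide(s)
        print(f"{s}: {decision} ({action})")
```

Rounding down at the band edges is the conservative choice here, since the stated goal is to keep weak output from reaching the next step.
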

| Criterion | Weight | Questions to ask |
|-----------|--------|------------------|
| Correctness | 30% | Produces expected output? Handles edge cases? |
| Readability | 20% | Meaningful names? Clean indentation? |
| Security | 20% | SQL injection? Hardcoded secrets? Unsafe input? |
| Performance | 15% | Unnecessary loops? N+1 queries? Memory leaks? |
| Testability | 15% | Functions independently testable? |

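
One way to combine the weighted criteria above into a single score is a weighted sum. The sketch below assumes each criterion is scored on the same 0-10 scale as the overall result; the names `CODE_WEIGHTS` and `weighted_score` are hypothetical, and only the weights come from the table.

```python
# Weights taken from the criteria table; they sum to 1.0.
CODE_WEIGHTS = {
    "correctness": 0.30,
    "readability": 0.20,
    "security": 0.20,
    "performance": 0.15,
    "testability": 0.15,
}


def weighted_score(scores: dict[str, float]) -> float:
    """Return the weighted sum of per-criterion scores, rounded to 2 places.

    Assumption: every criterion is scored from 0 to 10.
    """
    missing = CODE_WEIGHTS.keys() - scores.keys()
    if missing:
        raise ValueError(f"missing criteria: {sorted(missing)}")
    for name in CODE_WEIGHTS:
        if not 0 <= scores[name] <= 10:
            raise ValueError(f"{name} must be between 0 and 10")
    total = sum(weight * scores[name] for name, weight in CODE_WEIGHTS.items())
    return round(total, 2)


if __name__ == "__main__":
    example = {
        "correctness": 9,
        "readability": 8,
        "security": 6,
        "performance": 7,
        "testability": 8,
    }
    print(weighted_score(example))
```

With the example scores, the contributions are 2.7, 1.6, 1.2, 1.05, and 1.2, for a total of 7.75. A low security score pulls the result down even when correctness is high, which is the point of weighting the criteria rather than averaging them.
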
Evaluate every produced output (code, report, plan, data, API response) against type-specific quality criteria, score 1-10, make accept/reject decisions, and provide actionable improvement suggestions. Triggers on "evaluate", "check", "review", "quality control", "is this good enough", "score it", or before passing output to the next step in an agentic workflow. Source: fatih-developer/fth-skills.