aws-bedrock-evals

Build and run LLM-as-judge evaluation pipelines using Amazon Bedrock Evaluation Jobs with pre-computed inference datasets. Use when setting up automated model evaluation, designing test scenarios, collecting pre-computed responses, configuring custom metrics, creating AWS infrastructure, running evaluation jobs, parsing results, and iterating on findings.

7 Installs · 1 Trend · @antstackio

Installation

$ npx skills add https://github.com/antstackio/skills --skill aws-bedrock-evals

How to Install aws-bedrock-evals

Install the aws-bedrock-evals AI skill into your development environment from the command line.

  1. Open Terminal: Open your terminal or command line tool (Terminal, iTerm, Windows Terminal, etc.)
  2. Run Installation Command: Copy and run this command: npx skills add https://github.com/antstackio/skills --skill aws-bedrock-evals
  3. Verify Installation: Once installed, the skill will be automatically configured in your AI coding environment and ready to use in Claude Code, Cursor, or OpenClaw

Source: antstackio/skills.

SKILL.md

Amazon Bedrock Evaluation Jobs measure how well your Bedrock-powered application performs by using a separate evaluator model (the "judge") to score prompt-response pairs against a set of metrics. The judge reads each pair with metric-specific instructions and produces a numeric score plus written reasoning.
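As a rough illustration of the scoring model described above, the sketch below averages judge scores per metric. The `JudgeResult` type, its field names, and the sample metric names are hypothetical and chosen only for this example; they are not the output schema of Bedrock Evaluation Jobs.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean


@dataclass
class JudgeResult:
    """One judge verdict for one prompt-response pair (hypothetical shape)."""
    metric: str      # name of the metric being scored
    score: float     # numeric score assigned by the judge
    reasoning: str   # the judge's written explanation


def mean_score_by_metric(results):
    """Group verdicts by metric and return the average score for each metric."""
    grouped = defaultdict(list)
    for result in results:
        grouped[result.metric].append(result.score)
    return {metric: mean(scores) for metric, scores in grouped.items()}


if __name__ == "__main__":
    sample = [
        JudgeResult("helpfulness", 1.0, "Answers the question directly."),
        JudgeResult("helpfulness", 0.0, "Ignores the user's constraint."),
        JudgeResult("correctness", 0.5, "Partially accurate."),
    ]
    print(mean_score_by_metric(sample))
```

Keeping the reasoning text next to each score makes it easier to see why a metric's average moved between runs.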

| Mode | How it works | Use when |
| --- | --- | --- |
| Live Inference | Bedrock generates responses during the eval job | Simple prompt-in/text-out, no tool calling |
| Pre-computed Inference | You pre-collect responses and supply them in a JSONL dataset | Tool calling, multi-turn conversations, custom orchestration, models outside Bedrock |
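For the pre-computed mode, a minimal sketch of assembling the JSONL dataset might look like the following. The field names (`prompt`, `referenceResponse`, `category`, `modelResponses`, `modelIdentifier`) reflect one reading of the Bedrock bring-your-own-inference dataset format and should be checked against the current AWS documentation; the helper names `build_record` and `to_jsonl` are hypothetical.

```python
import json


def build_record(prompt, response, model_identifier, reference=None, category=None):
    """Build one dataset record for a pre-collected prompt-response pair.

    Field names are an assumption about the Bedrock dataset schema;
    verify them against the AWS documentation before use.
    """
    record = {
        "prompt": prompt,
        "modelResponses": [
            {"response": response, "modelIdentifier": model_identifier}
        ],
    }
    # Optional fields are added only when the caller supplies them.
    if reference is not None:
        record["referenceResponse"] = reference
    if category is not None:
        record["category"] = category
    return record


def to_jsonl(records):
    """Serialize records as JSON Lines: one JSON object per line."""
    return "\n".join(json.dumps(r, ensure_ascii=False) for r in records) + "\n"


if __name__ == "__main__":
    rows = [
        build_record(
            prompt="What is the refund window?",
            response="Refunds are accepted within 30 days.",
            model_identifier="my-app-v1",  # hypothetical identifier
            category="policy",
        )
    ]
    print(to_jsonl(rows), end="")
```

Since the responses are collected outside the evaluation job, the same helper can record output from tool-calling flows, multi-turn conversations, or models hosted outside Bedrock.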

Facts (cite-ready)

Stable fields and commands for AI/search citations.

Install command: npx skills add https://github.com/antstackio/skills --skill aws-bedrock-evals
Category: Dev Tools
Verified
First Seen: 2026-02-22
Updated: 2026-03-10

Quick answers

What is aws-bedrock-evals?

Build and run LLM-as-judge evaluation pipelines using Amazon Bedrock Evaluation Jobs with pre-computed inference datasets. Use when setting up automated model evaluation, designing test scenarios, collecting pre-computed responses, configuring custom metrics, creating AWS infrastructure, running evaluation jobs, parsing results, and iterating on findings. Source: antstackio/skills.

How do I install aws-bedrock-evals?

Open your terminal or command line tool (Terminal, iTerm, Windows Terminal, etc.), then copy and run this command: npx skills add https://github.com/antstackio/skills --skill aws-bedrock-evals. Once installed, the skill is automatically configured in your AI coding environment and ready to use in Claude Code, Cursor, or OpenClaw.

Where is the source repository?

https://github.com/antstackio/skills

Details

Category: Dev Tools
Source: skills.sh
First Seen: 2026-02-22
