Claude Skills Hub
#llm-as-judge
5 Skills

Agent Evaluation Framework

Comprehensive Claude Code agent evaluation framework with multi-dimensional scoring, LLM-as-Judge mode, and research-backed performance variance analysis

AI
Dev

Multi-Perspective Critique

Multi-perspective review system using Multi-Agent Debate and LLM-as-Judge patterns with 3 specialized judges, debate rounds, and consensus building

AI
Dev

Execute and Judge Loop

Single-task execution with LLM-as-Judge verification in an iterative loop, supporting auto-retry and strict orchestrator role separation

AI
Dev

Spec-Driven Implement

Task-spec-driven implementation system with LLM-as-Judge auto-verification, iterative repair, resume from breakpoints, and human-in-the-loop checkpoints

Dev
Productivity

LLM-as-Judge Evaluator

Standalone LLM-as-Judge evaluation tool with context isolation, Chain-of-Thought scoring, multi-dimensional weighted rubric, and evidence-backed assessments

AI
Dev
