detectelectronpole/.claude/skills/eval/SKILL.md at 417f880a87d9dacf94ce7a8dd0cfed1fb7628e76

Files

minsung 417f880a87 Setup RailPose3D harness (Planner/Generator/Evaluator)

Name the project RailPose3D and stand up a multi-agent harness
following the Anthropic harness-design blog principles
(decomposition, separation of concerns, file-based handoff,
sprint contracts, context-reset over compaction).

- CLAUDE.md / PLAN.md / PROGRESS.md as the file-based handoff
  surface; every agent must read PLAN+PROGRESS before acting.
- 7 sub-agents under .claude/agents/: plan-architect (Planner),
  pole-detector-builder, rail-detector-builder, triangulation-
  builder, data-pipeline-builder (Generators), module-evaluator
  (Evaluator), dataset-explorer (read-only helper).
- 6 skills under .claude/skills/: /start /sprint /eval /progress
  /handoff /contract.
- SessionStart and Stop hooks to inject the PLAN/PROGRESS
  briefing and remind about PROGRESS.md updates.
- docs/plan.md captures the user-approved detailed plan;
  docs/research.md is the prior tech survey.
- .gitignore excludes data/, .usage/, model checkpoints, and
  local Claude overrides.

Tracking: closes #1

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

2026-04-28 08:32:05 +09:00

1.1 KiB

Raw Blame History

name, description, argument-hint, allowed-tools

name	description	argument-hint	allowed-tools
eval	RailPose3D 모듈 평가. module-evaluator 에이전트에 위임해 sprint contract의 성공 조건을 정량 측정. argument로 sprint id 또는 module letter (A\|B\|C) 를 받는다.	<sprint-id\|module-letter>	Read, Glob, Agent

평가 대상: $ARGUMENTS

입력이 sprint id (S0~S8) 면 해당 sprint 의 contract 파일을 평가 대상으로 한다.
입력이 module letter (A|B|C) 면 PROGRESS.md 에서 해당 모듈의 가장 최근 in-progress 또는 막 완료된 sprint 의 contract 를 평가 대상으로 한다.

module-evaluator 서브에이전트 를 Agent 도구로 호출한다. 프롬프트:

대상 sprint: <확정한 sprint id>
contract: docs/contracts/<id>-contract.md

각 success criterion 을 측정하고 contract 파일과 PROGRESS.md 를 갱신하라.

evaluator 결과를 사용자에게 요약 (passed criteria 수 / 전체, fail 사유) 으로 보여준다.
fail 시 → 어떤 builder 를 재호출할지 권장. pass 시 → 다음 sprint 진입 권장.

1.1 KiB Raw Blame History

1.1 KiB

Raw Blame History