- CLAUDE.md with collaboration rules and Planner/Generator/Evaluator cycle - .claude/ agents, commands, skills, hooks per Claude Code conventions - Sprint Contracts for sut-prober, normalizer, recorder, player, diff-reporter - SUT catalog (EG-BIM Modeler, 187 plugins) and .gitignore excluding SUT tree - PROGRESS.md / PLAN.md as shared agent handoff state - Solution scaffold targeting sut-prober PoC Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
860 B
860 B
name, description, allowed-tools
| name | description | allowed-tools |
|---|---|---|
| evaluate | Grade a completed module against its Sprint Contract via the evaluator agent. Usage /evaluate <contract-slug> | Read, Glob, Grep, Bash, Agent |
Evaluate module: $ARGUMENTS
Delegate to the evaluator subagent. It must:
- Read
docs/contracts/$ARGUMENTS.md. Refuse if missing. - For each Definition-of-Done item, run the verification named in the contract's Evaluation plan.
- Collect evidence (command output, diffs, file paths).
- Write
docs/contracts/$ARGUMENTS.evaluation.mdwith the verdict table. - Return the verdict to the caller.
If verdict is fail, do NOT mark PROGRESS.md as done — report back so the generator can iterate. If verdict is pass, the caller (not the evaluator) may update PROGRESS.md.
Never let the generator and evaluator be the same agent in a single session.