- CLAUDE.md with collaboration rules and Planner/Generator/Evaluator cycle
- .claude/ agents, commands, skills, hooks per Claude Code conventions
- Sprint Contracts for sut-prober, normalizer, recorder, player, diff-reporter
- SUT catalog (EG-BIM Modeler, 187 plugins) and .gitignore excluding SUT tree
- PROGRESS.md / PLAN.md as shared agent handoff state
- Solution scaffold targeting sut-prober PoC

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
| name | description | allowed-tools |
|---|---|---|
| regress | Run the full recordingtest regression suite (or a filtered subset) and triage failures. | Bash, Read, Glob, Grep, Agent, Write |
Run the regression suite.
Steps:
- Verify the runner exists. If `src/Recordingtest.Runner/` is not yet built, stop and tell the user the suite is not set up.
- Execute the runner with optional filter `$ARGUMENTS` (empty = all scenarios).
- Collect results from the runner output folder.
- For each failed scenario, delegate to the `diff-triager` subagent with the baseline/received/artifact paths.
- Summarize: passed / failed / triage buckets.
- If any failures are classified as "normalization gap", list suggested rules at the end.
- Append run summary to `PROGRESS.md` under a "Recent regression runs" section.
Do NOT auto-approve or mutate baselines. Human confirmation is required via /approve.
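
The summarize and append steps above can be sketched as follows. This is a minimal illustration only, not the runner's actual behavior: the result record shape (`scenario`, `passed`, `bucket`), the function names, and the `##` heading level are all assumptions, since the source specifies only the section title and the "normalization gap" bucket.

```python
from collections import Counter

# Assumed heading level; the source only names the section title.
SECTION_HEADING = "## Recent regression runs"


def summarize(results):
    """Tally passed / failed counts and triage buckets.

    `results` is a hypothetical list of dicts with keys "scenario", "passed",
    and (for failures) "bucket". The real runner output format is not specified.
    """
    passed = sum(1 for r in results if r["passed"])
    failed = [r for r in results if not r["passed"]]
    buckets = Counter(r.get("bucket", "unclassified") for r in failed)
    return {"passed": passed, "failed": len(failed), "buckets": dict(buckets)}


def format_summary(summary, run_label):
    """Render one list item for PROGRESS.md (the layout is illustrative)."""
    buckets = ", ".join(
        f"{name}: {count}" for name, count in sorted(summary["buckets"].items())
    )
    return (
        f"- {run_label}: {summary['passed']} passed / "
        f"{summary['failed']} failed ({buckets or 'no failures'})"
    )


def append_run_summary(progress_text, line):
    """Return the PROGRESS.md text with `line` added under the section.

    Creates the section at the end of the file if it does not exist yet.
    Existing content is never rewritten, only extended.
    """
    lines = progress_text.splitlines()
    if SECTION_HEADING not in lines:
        if lines and lines[-1].strip():
            lines.append("")  # blank line before the new heading
        lines += [SECTION_HEADING, "", line]
        return "\n".join(lines) + "\n"

    start = lines.index(SECTION_HEADING)
    # The section ends at the next heading, or at end of file.
    end = len(lines)
    for i in range(start + 1, len(lines)):
        if lines[i].startswith("#"):
            end = i
            break
    # Skip back over trailing blank lines so the new item joins the list.
    insert_at = end
    while insert_at > start + 1 and not lines[insert_at - 1].strip():
        insert_at -= 1
    lines.insert(insert_at, line)
    return "\n".join(lines) + "\n"
```

The sketch only reads results and extends `PROGRESS.md`; it never touches baselines, in keeping with the rule that approval goes through /approve.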