- CLAUDE.md with collaboration rules and Planner/Generator/Evaluator cycle - .claude/ agents, commands, skills, hooks per Claude Code conventions - Sprint Contracts for sut-prober, normalizer, recorder, player, diff-reporter - SUT catalog (EG-BIM Modeler, 187 plugins) and .gitignore excluding SUT tree - PROGRESS.md / PLAN.md as shared agent handoff state - Solution scaffold targeting sut-prober PoC Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
20 lines
889 B
Markdown
20 lines
889 B
Markdown
---
|
|
name: regress
|
|
description: Run the full recordingtest regression suite (or a filtered subset) and triage failures.
|
|
allowed-tools: Bash, Read, Glob, Grep, Agent, Write
|
|
---
|
|
|
|
Run the regression suite.
|
|
|
|
Steps:
|
|
|
|
1. Verify the runner exists. If `src/Recordingtest.Runner/` is not yet built, stop and tell the user the suite is not set up.
|
|
2. Execute the runner with optional filter `$ARGUMENTS` (empty = all scenarios).
|
|
3. Collect results from the runner output folder.
|
|
4. For each failed scenario, delegate to the `diff-triager` subagent with the baseline/received/artifact paths.
|
|
5. Summarize: passed / failed / triage buckets.
|
|
6. If any failures are classified as "normalization gap", list suggested rules at the end.
|
|
7. Append run summary to `PROGRESS.md` under a "Recent regression runs" section.
|
|
|
|
Do NOT auto-approve or mutate baselines. Human confirmation is required via `/approve`.
|