Set up AI dev environment for recordingtest (#2)
- CLAUDE.md with collaboration rules and Planner/Generator/Evaluator cycle
- .claude/ agents, commands, skills, hooks per Claude Code conventions
- Sprint Contracts for sut-prober, normalizer, recorder, player, diff-reporter
- SUT catalog (EG-BIM Modeler, 187 plugins) and .gitignore excluding SUT tree
- PROGRESS.md / PLAN.md as shared agent handoff state
- Solution scaffold targeting sut-prober PoC

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
.claude/commands/regress.md (normal file, 19 lines changed)
@@ -0,0 +1,19 @@
---
name: regress
description: Run the full recordingtest regression suite (or a filtered subset) and triage failures.
allowed-tools: Bash, Read, Glob, Grep, Agent, Write
---

Run the regression suite.

Steps:

1. Verify the runner exists. If `src/Recordingtest.Runner/` is not yet built, stop and tell the user the suite is not set up.
2. Execute the runner with optional filter `$ARGUMENTS` (empty = all scenarios).
3. Collect results from the runner output folder.
4. For each failed scenario, delegate to the `diff-triager` subagent with the baseline/received/artifact paths.
5. Summarize: passed / failed / triage buckets.
6. If any failures are classified as "normalization gap", list suggested rules at the end.
7. Append run summary to `PROGRESS.md` under a "Recent regression runs" section.

Do NOT auto-approve or mutate baselines. Human confirmation is required via `/approve`.
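
The summarize and append steps (5 and 7) could look roughly like the Python sketch below. It is only an illustration under stated assumptions: the result dictionaries, the `untriaged` fallback bucket, the `##` heading level, and the helper names `summarize`, `format_summary`, and `append_run` are all hypothetical and not part of the runner. It also assumes the "Recent regression runs" section is the last section in `PROGRESS.md`, so new entries can simply be appended to the end of the file. It reads results and writes only to the progress file; baselines are never touched.

```python
from collections import Counter
from pathlib import Path

# Assumed heading text and level; the command only names the section.
SECTION = "## Recent regression runs"


def summarize(results):
    """Return (passed, failed, buckets) for a list of result dicts.

    Each result is assumed to look like
    {"scenario": str, "passed": bool, "bucket": str or None}.
    This shape is hypothetical, not the runner's actual output format.
    """
    passed = sum(1 for r in results if r["passed"])
    failed = len(results) - passed
    # Failures without a triage bucket are counted as "untriaged".
    buckets = Counter(
        r.get("bucket") or "untriaged" for r in results if not r["passed"]
    )
    return passed, failed, dict(buckets)


def format_summary(results, run_label):
    """Build the one-line run summary: passed / failed / triage buckets."""
    passed, failed, buckets = summarize(results)
    bucket_text = ", ".join(
        f"{name}: {count}" for name, count in sorted(buckets.items())
    ) or "none"
    return f"- {run_label}: {passed} passed / {failed} failed (triage: {bucket_text})"


def append_run(progress_path, line):
    """Append one summary line, creating the section heading if missing.

    Assumes the section is the last one in the file, so appending to the
    end of the file places the line under it.
    """
    path = Path(progress_path)
    text = path.read_text(encoding="utf-8") if path.exists() else ""
    if text and not text.endswith("\n"):
        text += "\n"
    if SECTION not in text:
        text += ("\n" if text else "") + SECTION + "\n\n"
    path.write_text(text + line + "\n", encoding="utf-8")
```

A caller would build the result list from whatever the runner writes to its output folder, then call `append_run("PROGRESS.md", format_summary(results, run_label))` with a label of its choosing, such as a timestamp.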