Set up AI dev environment for recordingtest (#2)

- CLAUDE.md with collaboration rules and Planner/Generator/Evaluator cycle
- .claude/ agents, commands, skills, hooks per Claude Code conventions
- Sprint Contracts for sut-prober, normalizer, recorder, player, diff-reporter
- SUT catalog (EG-BIM Modeler, 187 plugins) and .gitignore excluding SUT tree
- PROGRESS.md / PLAN.md as shared agent handoff state
- Solution scaffold targeting sut-prober PoC

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Author: minsung
Date: 2026-04-07 13:57:20 +09:00
Parent: a48a8a2d1d
Commit: 7ffbb1f757
47 changed files with 1886 additions and 11 deletions


@@ -0,0 +1,19 @@
---
name: regress
description: Run the full recordingtest regression suite (or a filtered subset) and triage failures.
allowed-tools: Bash, Read, Glob, Grep, Agent, Write
---
Run the regression suite.
Steps:
1. Verify the runner exists. If `src/Recordingtest.Runner/` is not yet built, stop and tell the user the suite is not set up.
2. Execute the runner, passing `$ARGUMENTS` as an optional scenario filter (empty = all scenarios).
3. Collect results from the runner output folder.
4. For each failed scenario, delegate to the `diff-triager` subagent with the baseline/received/artifact paths.
5. Summarize: passed / failed / triage buckets.
6. If any failures are classified as "normalization gap", list suggested rules at the end.
7. Append the run summary to `PROGRESS.md` under a "Recent regression runs" section.
Do NOT auto-approve or mutate baselines. Human confirmation is required via `/approve`.
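Steps 1-2 above could be sketched roughly as follows. This is a hedged illustration only: the `dotnet run` invocation, the `--filter` flag, and the function name `run_regression` are assumptions based on the .NET solution scaffold mentioned in the commit, not confirmed by this repo.

```shell
# Hypothetical sketch of the runner check and invocation (steps 1-2).
run_regression() {
  local runner_dir="src/Recordingtest.Runner"
  local filter="${1:-}"   # mirrors $ARGUMENTS; empty means run all scenarios

  # Step 1: verify the runner exists before attempting anything.
  if [ ! -d "$runner_dir" ]; then
    echo "Regression suite is not set up: $runner_dir is missing."
    return 1
  fi

  # Step 2: assumed .NET runner; forward the filter only when non-empty.
  dotnet run --project "$runner_dir" -- ${filter:+--filter "$filter"}
}
```

Result collection and triage (steps 3-7) stay with the agent, which is why the command declares `Agent` and `Write` in `allowed-tools`.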