Set up AI dev environment for recordingtest (#2)

- CLAUDE.md with collaboration rules and Planner/Generator/Evaluator cycle
- .claude/ agents, commands, skills, hooks per Claude Code conventions
- Sprint Contracts for sut-prober, normalizer, recorder, player, diff-reporter
- SUT catalog (EG-BIM Modeler, 187 plugins) and .gitignore excluding SUT tree
- PROGRESS.md / PLAN.md as shared agent handoff state
- Solution scaffold targeting sut-prober PoC

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Author: minsung
Date: 2026-04-07 13:57:20 +09:00
Parent: a48a8a2d1d
Commit: 7ffbb1f757
47 changed files with 1886 additions and 11 deletions


@@ -0,0 +1,19 @@
---
name: regress
description: Run the full recordingtest regression suite (or a filtered subset) and triage failures.
allowed-tools: Bash, Read, Glob, Grep, Agent, Write
---
Run the regression suite.
Steps:
1. Verify the runner exists. If `src/Recordingtest.Runner/` is not yet built, stop and tell the user the suite is not set up.
2. Execute the runner, passing `$ARGUMENTS` as an optional scenario filter (empty = all scenarios).
3. Collect results from the runner output folder.
4. For each failed scenario, delegate to the `diff-triager` subagent with the baseline/received/artifact paths.
5. Summarize: passed / failed / triage buckets.
6. If any failures are classified as "normalization gap", list suggested rules at the end.
7. Append the run summary to `PROGRESS.md` under a "Recent regression runs" section.
Do NOT auto-approve or mutate baselines. Human confirmation is required via `/approve`.
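Steps 1-2 above could be sketched roughly as follows. This is a hedged illustration only: the `dotnet run` invocation, the `--filter` flag, and the function name `run_regression` are assumptions based on the .NET solution scaffold mentioned in the commit, not confirmed by this repo.

```shell
# Hypothetical sketch of the runner check and invocation (steps 1-2).
run_regression() {
  local runner_dir="src/Recordingtest.Runner"
  local filter="${1:-}"   # mirrors $ARGUMENTS; empty means run all scenarios

  # Step 1: verify the runner exists before attempting anything.
  if [ ! -d "$runner_dir" ]; then
    echo "Regression suite is not set up: $runner_dir is missing."
    return 1
  fi

  # Step 2: assumed .NET runner; forward the filter only when non-empty.
  dotnet run --project "$runner_dir" -- ${filter:+--filter "$filter"}
}
```

Result collection and triage (steps 3-7) stay with the agent, which is why the command declares `Agent` and `Write` in `allowed-tools`.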