Files
minsung 7ffbb1f757 Set up AI dev environment for recordingtest (#2)
- CLAUDE.md with collaboration rules and Planner/Generator/Evaluator cycle
- .claude/ agents, commands, skills, hooks per Claude Code conventions
- Sprint Contracts for sut-prober, normalizer, recorder, player, diff-reporter
- SUT catalog (EG-BIM Modeler, 187 plugins) and .gitignore excluding SUT tree
- PROGRESS.md / PLAN.md as shared agent handoff state
- Solution scaffold targeting sut-prober PoC

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 13:57:20 +09:00

889 B

name, description, allowed-tools
name description allowed-tools
regress Run the full recordingtest regression suite (or a filtered subset) and triage failures. Bash, Read, Glob, Grep, Agent, Write

Run the regression suite.

Steps:

  1. Verify the runner exists. If src/Recordingtest.Runner/ is not yet built, stop and tell the user the suite is not set up.
  2. Execute the runner with optional filter $ARGUMENTS (empty = all scenarios).
  3. Collect results from the runner output folder.
  4. For each failed scenario, delegate to the diff-triager subagent with the baseline/received/artifact paths.
  5. Summarize: passed / failed / triage buckets.
  6. If any failures are classified as "normalization gap", list suggested rules at the end.
  7. Append run summary to PROGRESS.md under a "Recent regression runs" section.

Do NOT auto-approve or mutate baselines. Human confirmation is required via /approve.