recordingtest/.claude/commands/regress.md at b1c2383a546de89b3cdee564a949dbeb5959b8ac

Files

minsung 7ffbb1f757 Set up AI dev environment for recordingtest (#2 )

- CLAUDE.md with collaboration rules and Planner/Generator/Evaluator cycle
- .claude/ agents, commands, skills, hooks per Claude Code conventions
- Sprint Contracts for sut-prober, normalizer, recorder, player, diff-reporter
- SUT catalog (EG-BIM Modeler, 187 plugins) and .gitignore excluding SUT tree
- PROGRESS.md / PLAN.md as shared agent handoff state
- Solution scaffold targeting sut-prober PoC

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

2026-04-07 13:57:20 +09:00

889 B

Raw Blame History

name, description, allowed-tools

name	description	allowed-tools
regress	Run the full recordingtest regression suite (or a filtered subset) and triage failures.	Bash, Read, Glob, Grep, Agent, Write

Run the regression suite.

Steps:

Verify the runner exists. If src/Recordingtest.Runner/ is not yet built, stop and tell the user the suite is not set up.
Execute the runner with optional filter $ARGUMENTS (empty = all scenarios).
Collect results from the runner output folder.
For each failed scenario, delegate to the diff-triager subagent with the baseline/received/artifact paths.
Summarize: passed / failed / triage buckets.
If any failures are classified as "normalization gap", list suggested rules at the end.
Append run summary to PROGRESS.md under a "Recent regression runs" section.

Do NOT auto-approve or mutate baselines. Human confirmation is required via /approve.

889 B Raw Blame History

889 B

Raw Blame History