Orchestrate P1 UI automation evaluations (#6, #7)

- recorder v1 (fail) → v2 (pass): drag state machine, focus events, ts/raw_coord
- player pass with caveats: reliability untestable in sandbox
- PROGRESS.md Done rows + follow-ups for live SUT smoke test
- PLAN.md P1 pivoted to test-runner + live smoke test

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
minsung
2026-04-07 14:37:14 +09:00
parent 56b7233500
commit 836afea5ee
9 changed files with 323 additions and 5 deletions


@@ -0,0 +1,47 @@
# 2026-04-07 — Issue #7 player evaluator
**Role:** Evaluator (independent)
**Target:** `player`
**Generator commit:** f17e764
**Verdict:** pass with caveats
## What I did
1. Built `recordingtest.sln` -> 0 warnings / 0 errors.
2. Ran `dotnet test tests/Recordingtest.Player.Tests` -> 6/6 pass.
3. Read all player sources: `PlayerEngine.cs`, `IPlayerHost.cs`, `Program.cs`, `ScenarioLoader.cs`, `Model/Scenario.cs`, `Model/Step.cs`, `UiaPlayerHost.cs` (skim).
4. Grepped `PlayerEngine.cs` for `Thread.Sleep(` and `Task.Delay(TimeSpan.FromSeconds` -> 0 matches.
5. Verified `Player_NoFixedSleep` is real: uses `[CallerFilePath]` to locate `src/Recordingtest.Player/PlayerEngine.cs` and regex-asserts absence of fixed sleeps.
6. Mapped each DoD bullet to evidence and produced verdict table in `docs/contracts/player.evaluation.md`.
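
The check in steps 4 and 5 can be illustrated with a minimal sketch. It is written in Python purely for illustration and is not the project's C# test; the function name `find_fixed_sleeps` and the exact regex forms are assumptions. Only the two searched strings come from step 4.

```python
import re

# The two call shapes named in step 4. The regex forms below are an
# assumption; the real Player_NoFixedSleep test may use different patterns.
FIXED_SLEEP_PATTERNS = [
    re.compile(r"Thread\.Sleep\s*\("),
    re.compile(r"Task\.Delay\s*\(\s*TimeSpan\.FromSeconds"),
]


def find_fixed_sleeps(source):
    """Return (line_number, stripped_line) for each line containing a fixed sleep."""
    hits = []
    for number, line in enumerate(source.splitlines(), start=1):
        if any(pattern.search(line) for pattern in FIXED_SLEEP_PATTERNS):
            hits.append((number, line.strip()))
    return hits
```

A test built on this idea would read the engine source file and assert that the returned list is empty, which also gives a readable failure message (line number and offending text) when a fixed sleep is introduced.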
## DoD scores
| Item | Score |
|---|---|
| CLI args | pass |
| wait_for | partial (PoC passthrough) |
| resolve + offset | pass |
| failure artifacts | pass |
| checkpoint save | pass |
| exit codes | pass |
| 10/10 reliability | untestable (sandbox) |
| no fixed sleep | pass |
## Key findings
- `PlayerEngine.ComputeScreenPoint` formula matches the expected `bounds.X + W*ox` (and the analogous `bounds.Y + H*oy`), verified by test: bounds at (100, 200) with size 50x40 and offsets (0.5, 0.25) give (125, 210).
- `Program.cs` supports only the `--no-launch` attach mode; the launch path returns exit code 5 with an explicit message, so the generator reported this limitation honestly.
- `wait_for` hint is forwarded to `IPlayerHost.WaitFor` with timeout; engine throws on timeout. Real waiting strategy lives in `UiaPlayerHost` (PoC).
- `Model` classes mirror recorder yaml schema; `UnderscoredNamingConvention` handles `uia_path`, `after_step`, `save_as`, `startup_timeout_ms`.
- Reliability (10x replay) cannot be measured here — no real SUT GUI in sandbox. Deferred, not failed.
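
The screen-point arithmetic in the first finding can be reproduced with a short sketch. It is Python for illustration only, not the C# source of `PlayerEngine.ComputeScreenPoint`; the `Bounds` type and the parameter names are assumptions, and the Y formula is inferred from the test values rather than quoted from the code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Bounds:
    """Hypothetical element rectangle in screen coordinates."""
    x: float
    y: float
    width: float
    height: float


def compute_screen_point(bounds, offset_x, offset_y):
    """Map fractional offsets inside the rectangle to an absolute screen point."""
    return (
        bounds.x + bounds.width * offset_x,   # bounds.X + W*ox
        bounds.y + bounds.height * offset_y,  # bounds.Y + H*oy (inferred)
    )
```

With the values from the finding, `compute_screen_point(Bounds(100, 200, 50, 40), 0.5, 0.25)` yields `(125.0, 210.0)`: 100 + 50*0.5 = 125 and 200 + 40*0.25 = 210.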
## Constraints respected
- Did NOT modify generator code.
- Did NOT update PROGRESS.md.
- Only wrote `docs/contracts/player.evaluation.md` and this history file.
## Artifacts
- `d:/MYCLAUDE_PROJECT/recordingtest/docs/contracts/player.evaluation.md`
- `d:/MYCLAUDE_PROJECT/recordingtest/docs/history/2026-04-07_이슈7-player-evaluator.md`