At the end of each scenario playback the runner now fetches a state
snapshot from the engine-bridge HTTP server and diffs it against an
approved engine-state baseline. This closes the engine-bridge v3 loop
and adds a semantic-state axis to the golden-file regression strategy —
diffs can now catch "camera moved" or "wrong selection after command"
even when the main saved-file diff still passes.
New:
src/Recordingtest.Runner/IEngineStateSnapshotClient.cs
IEngineStateSnapshotClient.TryCapture() -> string? (never throws)
HttpEngineStateSnapshotClient — GETs /scene /camera /selection off a
base URL (default http://localhost:38080), composes them into
{ "scene": {...}, "camera": {...}, "selection": {...} }
with stable ordering so the downstream differ stays friendly.
Runner is Generic tier, so this client carries zero HmEG knowledge;
it forwards raw JSON strings.
TestRunner.RunAll now takes an optional IEngineStateSnapshotClient.
After engine.Run() completes, CaptureAndDiffSidecar():
- null client -> SidecarStatus = "skipped"
- TryCapture null/throw -> "unavailable" (main result still evaluated)
- success -> writes engine-state.received.json
- baseline found -> normalize + diff + "pass"/"fail"
- baseline missing -> "missing_baseline" (first run convention)
A sidecar "fail" promotes the overall scenario Status to "fail" so
exit code reflects semantic divergence even when the save-file diff
agrees.
ScenarioResult: SidecarCaptured / SidecarHunks / SidecarStatus.
Markdown report grows Sidecar + Sidecar Hunks columns; JSON report
picks up the new fields automatically via camelCase serialization.
Program.cs: --sidecar-url <url> (default localhost:38080) and
--no-sidecar. Default behaviour is sidecar-on so that a loaded
bridge plugin is picked up automatically; when the plugin is not
deployed the client silently reports "unavailable" and CI still runs.
Baseline lookup (new):
<baselinesDir>/<scenario>.engine-state.approved.json
<baselinesDir>/<scenario>.engine-state.json
Tests (Recordingtest.Runner.Tests, +6):
- Sidecar_NullClient_SkippedStatus
- Sidecar_ClientReturnsNull_UnavailableStatus
- Sidecar_Throws_UnavailableStatus_MainStillPasses
- Sidecar_Captured_NoBaseline_MissingBaseline_And_WritesReceivedFile
- Sidecar_Captured_BaselineIdentical_PassPass
- Sidecar_Captured_BaselineDivergent_PromotesScenarioToFail
Full suite 126 -> 132, 0 failures.
Follow-ups (PLAN.md):
- Live loop: first run writes received, user approves, rerun passes.
- Normalizer profile for engine-state (float epsilon for camera
coords, selected_ids sort, document_path masking). Currently
runs through the default profile as identity, so false fails are
possible for sensitive camera moves until this lands.
Ref: #10 engine-bridge v3 final integration.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
First time box-v6.yaml (raw recorder output, 676 lines) replays end-to-end
and actually creates a Box in the SUT — no AI post-editing of target paths
or offsets required. This is the counterpart to #13's recorder-side fixes:
the player now absorbs the remaining record→replay gaps instead of demanding
a hand-cleaned scenario.
Changes (all in Recordingtest.Player):
- PlayerEngine: null-target fallbacks
- Type with null target → host.Type() against current focus
- Click with null target + raw_coord → click at screen-absolute raw_coord
- Other null targets still skipped
- PlayerEngine: strip leading alt+tab hotkey steps (recording-startup noise
that fights the player's own foreground switch)
- PlayerEngine: preserve recorded inter-step timing, clamped 150ms–3s,
routed through new IPlayerHost.Delay so the engine itself stays Sleep-free
(keeps the existing "no fixed sleep" DoD test passing)
- PlayerEngine: per-step console log for live debugging
- UiaPlayerHost: BringSutToForeground() — SetForeground + Focus + 600ms
settle, called from Program.cs before engine.Run
- Step model: add RawCoord (int[]) and Ts (long?) fields, auto-mapped from
YAML raw_coord / ts keys
Tests updated:
- PlayerEngine_NullTarget_SkipsWithoutCalling → _Fallback_Issue14
(verifies the new Click-with-raw_coord and Type behavior)
- FakePlayerHost (both player.tests and runner.tests) implement Delay
Live smoke: box-v6.yaml raw replay produced the expected Box geometry on
the 2nd attempt; 1st attempt dropped the initial "BOX" keystrokes, tracked
as a follow-up (foreground settle is still threshold-sensitive at 600ms).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>