runner: engine-bridge sidecar integration (#10)

At the end of each scenario playback the runner now fetches a state
snapshot from the engine-bridge HTTP server and diffs it against an
approved engine-state baseline. This closes the engine-bridge v3 loop
and adds a semantic-state axis to the golden-file regression strategy —
diffs can now catch "camera moved" or "wrong selection after command"
even when the main saved-file diff still passes.

New:
  src/Recordingtest.Runner/IEngineStateSnapshotClient.cs
    IEngineStateSnapshotClient.TryCapture() -> string? (never throws)
    HttpEngineStateSnapshotClient — GETs /scene /camera /selection off a
    base URL (default http://localhost:38080), composes them into
      { "scene": {...}, "camera": {...}, "selection": {...} }
    with a stable key order so the downstream differ sees deterministic input.
    Runner is Generic tier, so this client carries zero HmEG knowledge;
    it forwards raw JSON strings.
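
A minimal sketch of the composition step described above, assuming the three
endpoint bodies are already available as raw JSON strings. SnapshotComposer
and Compose are hypothetical names for illustration, not the commit's actual
code; the only point is that the envelope keys are always emitted in the same
order and the payloads are forwarded without being parsed.

```csharp
using System;
using System.Text;

// Hypothetical helper (assumption, not taken from the commit): wraps three
// raw JSON payloads in one envelope with a fixed key order.
public static class SnapshotComposer
{
    public static string Compose(string sceneJson, string cameraJson, string selectionJson)
    {
        var sb = new StringBuilder();
        sb.Append("{\n");
        // Keys are always written as scene, camera, selection, so two
        // captures of the same state produce byte-identical envelopes.
        sb.Append("  \"scene\": ").Append(sceneJson.Trim()).Append(",\n");
        sb.Append("  \"camera\": ").Append(cameraJson.Trim()).Append(",\n");
        sb.Append("  \"selection\": ").Append(selectionJson.Trim()).Append('\n');
        sb.Append('}');
        return sb.ToString();
    }
}
```

Because the payloads are concatenated rather than parsed, a helper like this
needs no knowledge of what the engine puts inside each section.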

  TestRunner.RunAll now takes an optional IEngineStateSnapshotClient.
  After engine.Run() completes, CaptureAndDiffSidecar():
    - null client           -> SidecarStatus = "skipped"
    - TryCapture null/throw -> "unavailable" (main result still evaluated)
    - success               -> writes engine-state.received.json
    - baseline found        -> normalize + diff + "pass"/"fail"
    - baseline missing      -> "missing_baseline" (first run convention)
  A sidecar "fail" promotes the overall scenario Status to "fail" so
  the exit code reflects semantic divergence even when the saved-file
  diff passes.
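
The status table above can be sketched as one decision function. This is an
illustration under assumptions, not the real CaptureAndDiffSidecar: the
interface is redeclared here only to keep the sketch self-contained, and
SidecarStatusSketch, FixedClient, and the countHunks delegate are hypothetical
stand-ins for the runner's normalizer and differ.

```csharp
#nullable enable
using System;

// Redeclared so the sketch compiles on its own; shape taken from the
// commit message (TryCapture returns null when no snapshot is available).
public interface IEngineStateSnapshotClient
{
    string? TryCapture();
}

// Hypothetical test double that returns a canned snapshot.
public sealed class FixedClient : IEngineStateSnapshotClient
{
    private readonly string? _snapshot;
    public FixedClient(string? snapshot) { _snapshot = snapshot; }
    public string? TryCapture() => _snapshot;
}

public static class SidecarStatusSketch
{
    // baseline is the approved engine-state content, or null when no
    // baseline file exists. countHunks stands in for normalize + diff.
    public static string Decide(
        IEngineStateSnapshotClient? client,
        string? baseline,
        Func<string, string, int> countHunks)
    {
        if (client is null) return "skipped";

        string? received;
        try { received = client.TryCapture(); }
        catch (Exception) { received = null; } // a throwing client counts as unavailable

        if (received is null) return "unavailable";

        // The real runner writes engine-state.received.json at this point.
        if (baseline is null) return "missing_baseline";

        return countHunks(baseline, received) == 0 ? "pass" : "fail";
    }
}
```

Only the "fail" outcome would then be folded into the scenario's overall
Status; the other outcomes leave the main result as evaluated.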

  ScenarioResult: SidecarCaptured / SidecarHunks / SidecarStatus.
  Markdown report grows Sidecar + Sidecar Hunks columns; JSON report
  picks up the new fields automatically via camelCase serialization.

  Program.cs: --sidecar-url <url> (default http://localhost:38080) and
  --no-sidecar. The sidecar is on by default so that a loaded bridge
  plugin is picked up automatically; when the plugin is not deployed
  the client silently reports "unavailable" and CI still runs.

Baseline lookup (new):
  <baselinesDir>/<scenario>.engine-state.approved.json
  <baselinesDir>/<scenario>.engine-state.json
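
A rough sketch of the lookup, assuming the two paths are tried in the order
listed and the first existing file wins. That ordering is an assumption, and
BaselineLookupSketch.Find is a hypothetical name, not the runner's API.

```csharp
#nullable enable
using System;
using System.IO;

public static class BaselineLookupSketch
{
    // Returns the first candidate that exists, or null when the scenario
    // has no engine-state baseline yet (the "missing_baseline" case).
    public static string? Find(string baselinesDir, string scenario)
    {
        string[] candidates =
        {
            Path.Combine(baselinesDir, scenario + ".engine-state.approved.json"),
            Path.Combine(baselinesDir, scenario + ".engine-state.json"),
        };

        foreach (var candidate in candidates)
        {
            if (File.Exists(candidate)) return candidate;
        }
        return null;
    }
}
```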

Tests (Recordingtest.Runner.Tests, +6):
  - Sidecar_NullClient_SkippedStatus
  - Sidecar_ClientReturnsNull_UnavailableStatus
  - Sidecar_Throws_UnavailableStatus_MainStillPasses
  - Sidecar_Captured_NoBaseline_MissingBaseline_And_WritesReceivedFile
  - Sidecar_Captured_BaselineIdentical_PassPass
  - Sidecar_Captured_BaselineDivergent_PromotesScenarioToFail

Full suite 126 -> 132, 0 failures.

Follow-ups (PLAN.md):
  - Live loop: first run writes received, user approves, rerun passes.
  - Normalizer profile for engine-state (float epsilon for camera
    coords, selected_ids sort, document_path masking). Engine state
    currently passes through the default profile unchanged (identity),
    so false fails are possible on tiny camera-coordinate differences
    until this lands.
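
One possible shape for that follow-up profile, as a sketch only. It swaps the
float epsilon for rounding to a fixed number of decimals, which is simpler
but not equivalent (values straddling a rounding boundary still differ). The
key names selected_ids and document_path come from the note above; the class
name, the MASKED placeholder, and the default of 4 decimals are assumptions.
It uses System.Text.Json.Nodes (.NET 6 or later).

```csharp
#nullable enable
using System;
using System.Linq;
using System.Text.Json.Nodes;

public static class EngineStateNormalizerSketch
{
    public static string Normalize(string json, int decimals = 4)
    {
        return Walk(JsonNode.Parse(json), decimals)?.ToJsonString() ?? "null";
    }

    private static JsonNode? Walk(JsonNode? node, int decimals)
    {
        switch (node)
        {
            case JsonObject obj:
                var copy = new JsonObject();
                foreach (var kv in obj)
                {
                    if (kv.Key == "document_path")
                    {
                        // Machine-specific path: replace with a constant.
                        copy[kv.Key] = "MASKED";
                    }
                    else if (kv.Key == "selected_ids" && kv.Value is JsonArray ids)
                    {
                        // Selection order is not meaningful: sort by JSON text.
                        var sorted = ids
                            .Select(x => x?.ToJsonString() ?? "null")
                            .OrderBy(s => s, StringComparer.Ordinal)
                            .Select(s => JsonNode.Parse(s))
                            .ToArray();
                        copy[kv.Key] = new JsonArray(sorted);
                    }
                    else
                    {
                        copy[kv.Key] = Walk(kv.Value, decimals);
                    }
                }
                return copy;

            case JsonArray arr:
                return new JsonArray(arr.Select(x => Walk(x, decimals)).ToArray());

            case JsonValue v when v.TryGetValue<double>(out var d):
                // Every JSON number is rounded, not just camera coordinates.
                return JsonValue.Create(Math.Round(d, decimals));

            default:
                // Strings, booleans, null: re-parse to get a parentless copy.
                return node is null ? null : JsonNode.Parse(node.ToJsonString());
        }
    }
}
```

With a profile along these lines, two snapshots that differ only in float
jitter, selection order, or document path would normalize to the same text.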

Ref: #10 engine-bridge v3 final integration.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
minsung
2026-04-09 11:56:15 +09:00
parent 062a285462
commit e28a029704
8 changed files with 510 additions and 8 deletions

@@ -5,6 +5,8 @@ public static class Program
     public static int Main(string[] args)
     {
         var options = new RunnerOptions();
+        string? sidecarUrl = null;
+        bool noSidecar = false;
         for (int i = 0; i < args.Length; i++)
         {
             switch (args[i])
@@ -14,9 +16,13 @@ public static class Program
                 case "--out": options.OutDir = args[++i]; break;
                 case "--profile": options.Profile = args[++i]; break;
                 case "--no-launch": options.NoLaunch = true; break;
+                case "--sidecar-url": sidecarUrl = args[++i]; break;
+                case "--no-sidecar": noSidecar = true; break;
                 case "-h":
                 case "--help":
-                    Console.WriteLine("Usage: Recordingtest.Runner --scenarios <dir> --baselines <dir> --out <dir> [--profile <name>] [--no-launch]");
+                    Console.WriteLine("Usage: Recordingtest.Runner --scenarios <dir> --baselines <dir> --out <dir>");
+                    Console.WriteLine(" [--profile <name>] [--no-launch]");
+                    Console.WriteLine(" [--sidecar-url http://localhost:38080] [--no-sidecar]");
                     return 0;
             }
         }
@@ -29,12 +35,24 @@ public static class Program
             return 2;
         }
+        // engine-bridge v3 — default to the well-known localhost bridge port
+        // unless the caller explicitly opts out or overrides the URL.
+        IEngineStateSnapshotClient? sidecar = null;
+        if (!noSidecar)
+        {
+            sidecar = new HttpEngineStateSnapshotClient(
+                sidecarUrl ?? HttpEngineStateSnapshotClient.DefaultBaseUrl);
+        }
         var runner = new TestRunner();
         var report = runner.RunAll(
             options,
             new DefaultHostFactory(),
             new DefaultNormalizer(),
-            new DefaultDiffer());
+            new DefaultDiffer(),
+            sidecarClient: sidecar);
+        (sidecar as IDisposable)?.Dispose();
         Console.WriteLine($"Total: {report.Total}, Passed: {report.Passed}, Failed: {report.Failed}, Errored: {report.Errored}");
         return TestRunner.ToExitCode(report);