recordingtest/docs/contracts/test-runner.evaluation.md

# test-runner Evaluation (Issue #8)

- Generator commit: `96df2ef`
- Evaluator: independent verification per contract `docs/contracts/test-runner.md`
- Build: `dotnet build recordingtest.sln` -> 0 warnings, 0 errors
- Tests: `dotnet test tests/Recordingtest.Runner.Tests` -> 6 passed / 0 failed / 0 skipped

## Verdict: PASS

## DoD verification

| # | DoD item | Result | Evidence |
|---|----------|--------|----------|
| 1 | Console exe with 5 flags `--scenarios/--baselines/--out/--profile/--no-launch` | pass | `src/Recordingtest.Runner/Program.cs` switch parses all 5; missing required -> exit 2 |
| 2 | Scan `*.yaml` and write to `<out>/<scenario>/` | pass | `TestRunner.cs` L27-36 enumerates `*.yaml`, creates per-scenario `artifactDir` |
| 3 | Order: player -> normalizer -> diff-reporter | pass | `TestRunner.cs` L50-52 (engine.Run), L103-104 (Normalize), L111 (Compare) |
| 4 | Profile default `default`, overridable | pass | `RunnerOptions.Profile = "default"`; the value is passed through to the normalizer; `--profile` overrides it |
| 5 | `report.json` schema `{runAt,total,passed,failed,errored,scenarios:[{name,status,hunks,checkpointCount,artifactDir}]}` | pass | `RunReport.cs` matches; camelCase JSON; test 6 asserts every field |
| 6 | `report.md` human summary with table + failure section | pass | `WriteMarkdownReport` builds table + Failures section |
| 7 | Exit codes 0/1/2 | pass | `ToExitCode`: errored>0 -> 2, failed>0 -> 1, else 0; tests assert all three |
| 8 | `IPlayerHost` DI via `IRunnerHostFactory` | pass | `Interfaces.cs`; `RunAll` takes factory + INormalizer + IDiffer; tests inject fakes |
| 9 | xUnit tests >=5 covering 5 scenarios | pass | 6 tests, all required cases (identical, differs, throws, empty, profile spy, schema) |
| 10 | `dotnet build` green, `dotnet test` all pass | pass | Build: 0 warnings, 0 errors; tests: 6/6 passed |
| 11 | Fixed sleep count: 0 | pass | grep for `Thread.Sleep(` and `Task.Delay(TimeSpan.FromSeconds` under `src/Recordingtest.Runner` -> 0 hits |
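
The exit-code precedence verified in item 7 can be sketched as follows. This is an illustrative Python rendering of the rule quoted in the table (errored > 0 -> 2, failed > 0 -> 1, else 0), not the C# `ToExitCode` source; the `RunSummary` type and the `to_exit_code` name are hypothetical stand-ins for the counters in `report.json`.

```python
from dataclasses import dataclass


@dataclass
class RunSummary:
    """Hypothetical stand-in for the counters reported in report.json."""
    total: int
    passed: int
    failed: int
    errored: int


def to_exit_code(summary: RunSummary) -> int:
    """Map run counters to a process exit code.

    Errors take precedence over failures, so a run with both
    errored and failed scenarios exits with 2.
    """
    if summary.errored > 0:
        return 2
    if summary.failed > 0:
        return 1
    return 0
```

Under this rule an empty scenarios directory (`total == 0`) exits 0, which matches the `EmptyScenariosDir` test. Note that item 1 also reports exit 2 for a missing required flag, so exit 2 alone does not distinguish an argument error from an errored scenario.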

## Baseline normalization policy
The contract allows baselines to be either pre-normalized or re-normalized. `TestRunner.cs` L10-11 documents the choice: baselines are re-normalized with the same profile as the received output, which is safe in either case. Because the choice is documented, this item passes.
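
A minimal sketch of why re-normalizing is safe in either case, assuming the normalizer is idempotent (this report does not demonstrate that property). The timestamp-masking rule, the `<TS>` placeholder, and the function names are hypothetical and chosen only for illustration; the real normalization rules are not described here.

```python
import re

# Hypothetical rule: mask ISO-like timestamps. The real profiles may differ.
_TIMESTAMP = re.compile(r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}")


def normalize(text: str, profile: str = "default") -> str:
    """Replace volatile timestamps with a stable placeholder.

    `profile` mirrors the runner's pass-through argument; this sketch
    accepts it but applies the same rule for every profile.
    """
    return _TIMESTAMP.sub("<TS>", text)


def outputs_match(baseline: str, received: str, profile: str = "default") -> bool:
    """Normalize both sides with the same profile, then compare."""
    return normalize(baseline, profile) == normalize(received, profile)


received = "saved at 2025-06-07T08:09:10"
raw_baseline = "saved at 2024-01-02T03:04:05"   # stored without normalization
pre_normalized_baseline = "saved at <TS>"       # stored already normalized

# Both kinds of baseline compare equal, because normalizing an
# already-normalized string leaves it unchanged.
both_match = (outputs_match(raw_baseline, received)
              and outputs_match(pre_normalized_baseline, received))
```

If a normalizer were not idempotent, a pre-normalized baseline could change on the second pass and produce spurious diffs, so the guarantee depends on that assumption.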

## Test quality (not stubs)
- TwoScenarios_BothIdentical_ExitZero_AllPass: uses real scenario YAML and the real PlayerEngine, with a diff stub that reports identical output
- OneScenarioDiffers: asserts hunks==1 and status=="fail"
- PlayerThrows: uses click step + `throwOnClick` fake host -> errored>=1, exit 2
- EmptyScenariosDir: total==0, exit 0
- ProfileOverride: SpyNormalizer captures profiles list; asserts contains "strict", not "default"
- ReportJson schema: parses report.json and asserts every contract field; checks report.md exists
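
The field check performed by the schema test can be sketched as below. The field names come from DoD item 5; everything else (the `missing_fields` helper, the sample values, the `runAt` format, and the `"pass"` status string) is a hypothetical illustration, not taken from `RunReport.cs` or the xUnit test.

```python
import json

# Field names as listed in DoD item 5.
TOP_LEVEL_FIELDS = {"runAt", "total", "passed", "failed", "errored", "scenarios"}
SCENARIO_FIELDS = {"name", "status", "hunks", "checkpointCount", "artifactDir"}


def missing_fields(report: dict) -> list[str]:
    """Return the contract fields absent from a parsed report.json."""
    missing = sorted(TOP_LEVEL_FIELDS - set(report))
    for i, scenario in enumerate(report.get("scenarios", [])):
        for field in sorted(SCENARIO_FIELDS - set(scenario)):
            missing.append(f"scenarios[{i}].{field}")
    return missing


# Illustrative sample only; all values are invented for this sketch.
sample = json.loads("""
{
  "runAt": "2025-01-01T00:00:00Z",
  "total": 2, "passed": 1, "failed": 1, "errored": 0,
  "scenarios": [
    {"name": "a", "status": "pass", "hunks": 0,
     "checkpointCount": 3, "artifactDir": "out/a"},
    {"name": "b", "status": "fail", "hunks": 1,
     "checkpointCount": 3, "artifactDir": "out/b"}
  ]
}
""")

problems = missing_fields(sample)
```

An empty result means every contract field is present. The check covers presence only, not value types or allowed status strings.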

## Integration smoke
Integration behaviour is trusted on the basis of unit tests and source review: the Runner is fully DI-testable, and the tests drive `TestRunner.RunAll` directly with the real `PlayerEngine` and scenario YAML.

## Artifacts
- Source: `src/Recordingtest.Runner/{Program.cs,TestRunner.cs,Interfaces.cs,RunnerOptions.cs,RunReport.cs,DefaultAdapters.cs}`
- Tests: `tests/Recordingtest.Runner.Tests/{TestRunnerTests.cs,Fakes.cs}`
- Contract: `docs/contracts/test-runner.md`