Bug discovered during #56 INTEGRATION-AUDIT-02 execution (2026-05-20):
- Both Claude and Codex put "Audit anchor: ..." as the FIRST line of every
Gitea comment per the #56 issue body instruction "cite anchor at start
of every stage".
- detect_agent (P0-1 strict, first-line only) then returns None for these
comments because the first line is "Audit anchor:..." not "[Codex #N]"
or "[Claude #N]".
- Result: orchestrator's "is_codex" check (line ~1288) flips false →
"Codex 응답 미감지 — continuing" ("Codex response not detected") →
infinite Stage 4 loop. #56 reached Round #14 (>300 comments, ~2 hours
of wasted tokens).
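The failure mode above can be reproduced with a minimal sketch of a strict
header check. This is not the orchestrator's actual detect_agent: the name
detect_agent_sketch, the regex, and the lowercase return values are
assumptions made for illustration, and the sketch inspects the first
non-empty line.

```python
import re
from typing import Optional

# Assumed header shape: "[Claude #N]" or "[Codex #N]" at the start of the line.
_HEADER = re.compile(r"^\[(Claude|Codex) #\d+\]")


def detect_agent_sketch(body: str) -> Optional[str]:
    """Return 'claude' or 'codex' based only on the first non-empty line."""
    for line in body.splitlines():
        stripped = line.strip()
        if stripped:
            match = _HEADER.match(stripped)
            # Strict: later lines are never consulted, so a citation such as
            # "[Claude #2]" inside a Codex body cannot cause mis-detection.
            return match.group(1).lower() if match else None
    return None


# Anchor first: the header on line 2 is ignored, so detection fails.
anchor_first = "Audit anchor: ...\n[Codex #3] Stage 4 review"
# Header first: detection succeeds, anchor is free to follow.
header_first = "[Codex #3] Stage 4 review\nAudit anchor: ..."
```

Under these assumptions, anchor_first yields None and header_first yields
"codex", which is why the fix targets the comment format rather than the
detector.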
Fix path (NOT relaxing detect_agent — that would revive the original #45
pre-P0-1 bug where [Claude #N] citations inside Codex bodies caused
mis-detection):
1. AUDIT_ONLY_NOTE updated to enforce comment format:
- FIRST non-empty line MUST be `[Claude #N] <stage>` or `[Codex #N] <stage>`
- Audit anchor / banners / prefaces MUST appear line 2 or later
- Concrete CORRECT example included
- Explicit warning that violation breaks stage advance
2. is_codex None guard auto-supplements:
- When _audit_mode(title) AND detect_agent returns None, orchestrator
posts a Gitea supplement comment requesting the correct format
- In the next round, Claude/Codex see the supplement and correct their format
- Breaks the infinite loop automatically (no manual ctrl-C needed)
3. Regression tests in TestDetectAgent (test_orchestrator_core.py):
- test_audit_anchor_preface_breaks_detection: confirms P0-1 strict
correctly returns None when anchor is first line
- test_audit_anchor_after_header_works: correct format passes
Total: 96/96 pytest pass (94 prior + 2 P5 regression).
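A rough sketch of the None guard from item 2, assuming the audit-mode check,
the detector, and the Gitea poster are passed in as callables. The function
name supplement_if_undetected, its parameters, and the supplement wording
are hypothetical and do not come from the orchestrator source.

```python
from typing import Callable, Optional

# Hypothetical supplement text; the real wording lives in the orchestrator.
SUPPLEMENT_TEXT = (
    "Format reminder: the first non-empty line must be "
    "`[Claude #N] <stage>` or `[Codex #N] <stage>`; "
    "put the audit anchor on line 2 or later."
)


def supplement_if_undetected(
    title: str,
    body: str,
    audit_mode: Callable[[str], bool],
    detect_agent: Callable[[str], Optional[str]],
    post_comment: Callable[[str], None],
) -> bool:
    """Post a format reminder when an audit-mode comment has no agent header.

    Returns True if a supplement was posted, so the caller can skip its
    usual "not detected, continuing" branch for this round.
    """
    if audit_mode(title) and detect_agent(body) is None:
        post_comment(SUPPLEMENT_TEXT)
        return True
    return False
```

Passing the collaborators as arguments keeps the guard testable without a
live Gitea instance; whether the real guard is structured this way is not
stated in this message.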
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
P4 had two production issues blocking #50 integration audit deployment:
1. Stage 3 guard had no baseline awareness: it flagged ALL forbidden-path
changes, including pre-existing dirty WIP. Measured: 328 such files
were already in the current working tree (tests/matching/ artifacts,
etc.). #50 would have hit reject loops immediately without Claude
doing anything wrong.
2. Stage 5 had no commit-scope guard: if Claude ran `git add -A` and
committed the user's existing WIP, the audit commit would be polluted
with unrelated production changes.
P4a additions:
- _audit_baseline_path / _ensure_audit_baseline / _load_audit_baseline:
snapshot working-tree dirty paths at run_issue entry for audit issues.
Resumed runs preserve existing baseline (no overwrite).
- _check_audit_only_violations(baseline=None): accepts a baseline set and
subtracts it from the violations, so only NEW forbidden changes
introduced after audit start are flagged.
- _check_audit_commit_scope: verify HEAD commit's file list matches
AUDIT_ALLOWED_COMMIT_GLOBS (INTEGRATION-AUDIT-*.md, BACKLOG.md).
- run_issue: save baseline on audit-mode entry only — no impact on
normal issues.
- Stage 5 (commit-push) YES gate: new guard rejects on out-of-scope
files with remediation prompt (git reset --soft + force-with-lease).
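The baseline subtraction described above might look roughly like the sketch
below. The helper name subtract_baseline and the sorted-list return type are
assumptions; only the None/empty-set semantics and the Windows path
normalization are taken from this message.

```python
from typing import Iterable, List, Optional, Set


def _norm(path: str) -> str:
    """Normalize Windows separators so both sides compare equal."""
    return path.replace("\\", "/")


def subtract_baseline(
    violations: Iterable[str], baseline: Optional[Set[str]] = None
) -> List[str]:
    """Drop violations that were already dirty when the audit started.

    baseline=None means "no baseline recorded": every violation is kept.
    An empty set means "tree was clean": every violation is also kept.
    """
    normalized = [_norm(p) for p in violations]
    if baseline is None:
        return sorted(normalized)
    known = {_norm(p) for p in baseline}
    return sorted(p for p in normalized if p not in known)
```

For example, with a baseline of {"tests/matching/x.json"}, a violation list
containing that path and "src/a.py" reduces to just "src/a.py".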
19 new tests:
- baseline subtraction (5): pre-existing removed, None=keep-all,
empty-set=catch-all, full-coverage filter, Windows path normalize.
- baseline persist (5): roundtrip, no-overwrite on resume, missing
fallback, corrupt JSON fallback, non-list fallback.
- commit scope detection (7): report-only allowed, backlog allowed,
src/ rejected, unrelated docs rejected, git error fail-open,
Windows backslash, empty commit pass.
- allowed globs sanity (2): every glob has audit marker, all under
docs/architecture/.
Total: 94/94 pytest pass (75 prior + 19 new).
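For illustration, the commit-scope cases listed above can be sketched with
fnmatch. The glob values below are assumed from the file names and the
docs/architecture/ location mentioned in this message; the real
AUDIT_ALLOWED_COMMIT_GLOBS may differ. Obtaining the HEAD file list from
git, and the fail-open path on git errors, are left out of the sketch.

```python
from fnmatch import fnmatchcase
from typing import Iterable, List, Sequence

# Assumed values; the actual constant is defined in the orchestrator.
ASSUMED_ALLOWED_GLOBS = (
    "docs/architecture/INTEGRATION-AUDIT-*.md",
    "docs/architecture/BACKLOG.md",
)


def out_of_scope_files(
    files: Iterable[str], globs: Sequence[str] = ASSUMED_ALLOWED_GLOBS
) -> List[str]:
    """Return committed paths that match none of the allowed globs."""
    rejected = []
    for raw in files:
        path = raw.replace("\\", "/")  # tolerate Windows backslashes
        # Note: fnmatch's "*" also matches "/", so these globs are
        # prefix-anchored rather than strictly single-directory.
        if not any(fnmatchcase(path, glob) for glob in globs):
            rejected.append(path)
    return rejected
```

An empty result means the commit is in scope; an empty commit therefore
passes, matching the "empty commit pass" case.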
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>