P7 Patch B used `comments[-1]` at line 1868, but `comments` is defined
inside run_stage, not in run_issue scope. The KEEP_OPEN guard runs after
run_stage returns, where `comments` is no longer in scope, so it crashed
with a NameError after the Stage 6 YES had already been accepted and the
exit report generated.
Fix: fetch comments fresh via get_comments(n) at the guard entry.
exit_path file check (fallback) still works as designed.
Refs: #84 (Stage 6 crash during normal close path)
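
A minimal sketch of the guard after the fix, assuming each comment is a dict with a `body` key and that a plain substring match is enough to spot a keep-open disposition. `should_skip_close` and `KEEP_OPEN_MARKERS` are hypothetical names; only `get_comments(n)` comes from the fix described above.

```python
from typing import Callable, Dict, List

# Marker strings appear in this log; the substring matching rule is an assumption.
KEEP_OPEN_MARKERS = ("KEEP_OPEN", "DO NOT CLOSE")


def should_skip_close(issue_number: int,
                      get_comments: Callable[[int], List[Dict[str, str]]]) -> bool:
    """Return True when the newest comment asks to keep the issue open."""
    # Fetch fresh at guard entry instead of reusing a list that only
    # exists inside run_stage (the source of the NameError).
    comments = get_comments(issue_number)
    if not comments:
        return False
    last_body = comments[-1].get("body", "")
    return any(marker in last_body for marker in KEEP_OPEN_MARKERS)
```

Passing `get_comments` in as a parameter keeps the guard independent of run_stage locals and makes it testable with a stub.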
- Block Stage 2 YES when IMPLEMENTATION_UNITS contains tests: [].
- Prevent fallback from accepting orchestrator supplement examples as valid plans.
- Honor KEEP_OPEN/DO NOT CLOSE final-close dispositions by skipping close PATCH.
- Add final-close casual self-contradiction guard for YES bodies (allows explicit
`disposition: KEEP_OPEN_*` to pass through to Patch B).
- Inject rejected approaches from failure reports into next-round context with
BANNED_APPROACHES block (tests: [] / DOM mount without jsdom / Home.tsx toast
removal / git add -A).
Refs: #83 (governance break — reopen pending user decision)
#84 (Stage 2 round 5 slip — replay required after this fix)
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
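
The `tests: []` block and the BANNED_APPROACHES injection listed above could look roughly like this. It is a sketch only: the regex, the function names, and the exact layout of the BANNED_APPROACHES block are assumptions, not the orchestrator's actual code.

```python
import re
from typing import List

# Matches a line consisting of `tests: []` (optionally as a list item).
EMPTY_TESTS_RE = re.compile(r"^\s*-?\s*tests:\s*\[\s*\]\s*$", re.MULTILINE)


def has_empty_tests(plan_body: str) -> bool:
    """True if any implementation unit in the plan declares tests: []."""
    return EMPTY_TESTS_RE.search(plan_body) is not None


def build_banned_block(rejected: List[str]) -> str:
    """Render rejected approaches as a context block (hypothetical format)."""
    seen: List[str] = []
    for item in rejected:
        item = item.strip()
        if item and item not in seen:  # de-duplicate, keep first-seen order
            seen.append(item)
    if not seen:
        return ""
    return "\n".join(["BANNED_APPROACHES:"] + [f"- {item}" for item in seen])
```

A Stage 2 YES would be rejected whenever `has_empty_tests(plan)` is true, and the rendered block would be prepended to the next round's context.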
Bug discovered during #24 IMP-24 K6 Stage 2 (2026-05-20):
- Codex r1, r2, r3 started with '=== IMPLEMENTATION_UNITS ===' on first line
(not '[Codex #N] ...'), so detect_agent (P0-1 strict, first-line only)
returned None.
- For non-audit issues, the P5 supplement guard did not fire because it was
gated to audit-only mode → silent loop until Codex r4 happened to use the
correct format. 4 rounds wasted.
Verified that #21 Stage 4 had the same latent silent loop pattern
('## [Codex #1]' first line) — orchestrator looped through ~10 Claude rounds
before random recovery. P5b fix addresses this long-standing bug.
Patch (defensive parser-contract hardening; does not assume single root cause):
1. RULES global gets explicit "FIRST non-empty line MUST be [Claude #N] /
   [Codex #N]" rule that OVERRIDES any stage-specific "body MUST contain"
   constraint.
2. COMPACT_PLAN_RULE wording clarified: "body" begins AFTER the first-line
   agent header. The 'body MUST contain ONLY' set no longer accidentally
   permits '=== IMPLEMENTATION_UNITS ===' on line 1.
3. is_codex None supplement guard:
   - audit-only gate REMOVED → fires for all issues (#24 latent loop fixed)
   - Throttle: max 2 supplements per stage; on 3rd violation, orchestrator
     hard-stops the issue with explicit "user action required" message
     and exits run_stage cleanly
   - Supplement message names both Claude AND Codex (Claude's first-line
     violation also breaks downstream via Codex mimicry)
   - Body-head 80 chars logged on detection failure (debugging aid)
4. Regression tests (+5 cases in test_orchestrator_core.py):
   - TestDetectAgent: '=== IMPLEMENTATION_UNITS ===' first line → None
   - TestDetectAgent: [Codex #N] first line + units after → 'codex' OK
   - TestDetectAgent: '## ', '📌 **', '**' prefix all → None
   - TestRulesAndCompactPlanFirstLineContract: RULES wording has FIRST/OVERRIDES
   - TestRulesAndCompactPlanFirstLineContract: COMPACT_PLAN_RULE has carve-out
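
For reference, the first-line contract exercised by these regression cases can be sketched as follows. This is an illustrative stand-in, not the real detect_agent (which this patch leaves unchanged): the regex and the choice to inspect the first non-empty line are assumptions.

```python
import re
from typing import Optional

# Header must open the line: "[Claude #N]" or "[Codex #N]".
HEADER_RE = re.compile(r"\[(Claude|Codex) #\d+\]")


def detect_agent(body: str) -> Optional[str]:
    """Return 'claude' or 'codex' based only on the first non-empty line."""
    for line in body.splitlines():
        stripped = line.strip()
        if not stripped:
            continue
        # re.match anchors at position 0, so '## ', '**' or emoji
        # decorations in front of the header fail detection.
        match = HEADER_RE.match(stripped)
        return match.group(1).lower() if match else None
    return None
```

Because only the first non-empty line is inspected, a `[Claude #N]` citation deeper inside a Codex body cannot cause mis-detection.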
Cosmetic side effect (accepted): Claude's '📌 **[Claude #N] ...**' or
'## [Codex #N] ...' decoration prefixes will fail detect_agent. Agents
will drop decorations from line 1; line 2+ can still use them.
Out of scope (NOT included to keep regression risk low):
- detect_agent function logic UNCHANGED (P0-1 strict preserved)
- consensus parser UNCHANGED
- stage loop structure UNCHANGED
- git/Gitea retrieval logic UNCHANGED
- audit-only mode P4/P4a guards UNCHANGED
- pre-post comment validation (future axis, larger refactor)
Total: 131/131 pytest pass (126 prior + 5 new).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
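
The supplement throttle from item 3 of the patch above (max 2 supplements per stage, hard-stop on the 3rd violation) can be sketched like this. `SupplementThrottle`, `on_violation`, and `body_head` are hypothetical names, and the return values are illustrative; the real guard lives inside run_stage.

```python
from collections import defaultdict
from typing import DefaultDict, Tuple

MAX_SUPPLEMENTS_PER_STAGE = 2  # limit stated in the patch notes


class SupplementThrottle:
    """Counts first-line violations per (issue, stage) pair."""

    def __init__(self, limit: int = MAX_SUPPLEMENTS_PER_STAGE) -> None:
        self.limit = limit
        self._counts: DefaultDict[Tuple[int, int], int] = defaultdict(int)

    def on_violation(self, issue: int, stage: int) -> str:
        """Return 'supplement' while under the limit, then 'hard_stop'."""
        key = (issue, stage)
        self._counts[key] += 1
        if self._counts[key] > self.limit:
            return "hard_stop"  # caller reports "user action required" and exits
        return "supplement"


def body_head(body: str, width: int = 80) -> str:
    """First `width` characters, newlines escaped so the log stays on one line."""
    return body[:width].replace("\n", "\\n")
```

Keying the counter on the stage means a later stage starts with a fresh allowance of two supplements.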
Bug discovered during #56 INTEGRATION-AUDIT-02 execution (2026-05-20):
- Both Claude and Codex put "Audit anchor: ..." as the FIRST line of every
Gitea comment per the #56 issue body instruction "cite anchor at start
of every stage".
- detect_agent (P0-1 strict, first-line only) then returns None for these
comments because the first line is "Audit anchor:..." not "[Codex #N]"
or "[Claude #N]".
- Result: orchestrator's "is_codex" check (line ~1288) flips false →
"Codex 응답 미감지 — continuing" (Korean: "Codex response not detected") →
infinite Stage 4 loop. #56 reached Round #14 (>300 comments, ~2 hours of
wasted tokens).
Fix path (NOT relaxing detect_agent — that would revive the original #45
pre-P0-1 bug where [Claude #N] citations inside Codex bodies caused
mis-detection):
1. AUDIT_ONLY_NOTE updated to enforce comment format:
   - FIRST non-empty line MUST be `[Claude #N] <stage>` or `[Codex #N] <stage>`
   - Audit anchor / banners / prefaces MUST appear line 2 or later
   - Concrete CORRECT example included
   - Explicit warning that violation breaks stage advance
2. is_codex None guard auto-supplements:
   - When _audit_mode(title) AND detect_agent returns None, orchestrator
     posts a Gitea supplement comment requesting the correct format
   - Next round's Claude/Codex see the supplement and correct
   - Breaks the infinite loop automatically (no manual ctrl-C needed)
3. Regression tests in TestDetectAgent (test_orchestrator_core.py):
   - test_audit_anchor_preface_breaks_detection: confirms P0-1 strict
     correctly returns None when anchor is first line
   - test_audit_anchor_after_header_works: correct format passes
Total: 96/96 pytest pass (94 prior + 2 P5 regression).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
P4 had two production issues blocking #50 integration audit deployment:
1. Stage 3 guard had no baseline awareness: it flagged ALL forbidden-path
   changes, including pre-existing dirty WIP. Measured: 328 such files are
   already in the current working tree (tests/matching/ artifacts, etc.).
   #50 would have hit reject loops immediately without Claude doing
   anything wrong.
2. Stage 5 had no commit-scope guard: if Claude ran `git add -A` and
   committed the user's existing WIP, the audit commit would be polluted
   with unrelated production changes.
P4a additions:
- _audit_baseline_path / _ensure_audit_baseline / _load_audit_baseline:
snapshot working-tree dirty paths at run_issue entry for audit issues.
Resumed runs preserve existing baseline (no overwrite).
- _check_audit_only_violations(baseline=None): accept baseline set,
subtract from violations — only flags NEW forbidden changes introduced
after audit start.
- _check_audit_commit_scope: verify HEAD commit's file list matches
AUDIT_ALLOWED_COMMIT_GLOBS (INTEGRATION-AUDIT-*.md, BACKLOG.md).
- run_issue: save baseline on audit-mode entry only — no impact on
normal issues.
- Stage 5 (commit-push) YES gate: new guard rejects on out-of-scope
files with remediation prompt (git reset --soft + force-with-lease).
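
The baseline snapshot and subtraction described above can be sketched as follows. The function names are simplified stand-ins for the underscore-prefixed helpers, and two details are assumptions: the baseline is stored as a JSON list of paths, and every load fallback returns None.

```python
import json
from pathlib import Path
from typing import Iterable, List, Optional, Set


def normalize(path: str) -> str:
    """Use forward slashes so Windows and POSIX paths compare equal."""
    return path.replace("\\", "/")


def save_baseline_once(baseline_file: Path, dirty_paths: Iterable[str]) -> None:
    """Snapshot dirty paths; a resumed run keeps the existing snapshot."""
    if baseline_file.exists():
        return
    paths = sorted(normalize(p) for p in dirty_paths)
    baseline_file.write_text(json.dumps(paths), encoding="utf-8")


def load_baseline(baseline_file: Path) -> Optional[Set[str]]:
    """Return the snapshot, or None if it is missing, corrupt, or not a list."""
    try:
        data = json.loads(baseline_file.read_text(encoding="utf-8"))
    except (OSError, ValueError):  # JSONDecodeError is a ValueError
        return None
    if not isinstance(data, list):
        return None
    return {normalize(str(p)) for p in data}


def new_violations(forbidden_changes: Iterable[str],
                   baseline: Optional[Set[str]]) -> List[str]:
    """Keep only forbidden changes that were not dirty at audit start."""
    changed = [normalize(p) for p in forbidden_changes]
    if baseline is None:  # no baseline: nothing is subtracted
        return changed
    return [p for p in changed if p not in baseline]
```

With this shape, `baseline=None` keeps every violation, while an empty set also flags everything because nothing was dirty at the start.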
19 new tests:
- baseline subtraction (5): pre-existing removed, None=keep-all,
empty-set=catch-all, full-coverage filter, Windows path normalize.
- baseline persist (5): roundtrip, no-overwrite on resume, missing
fallback, corrupt JSON fallback, non-list fallback.
- commit scope detection (7): report-only allowed, backlog allowed,
src/ rejected, unrelated docs rejected, git error fail-open,
Windows backslash, empty commit pass.
- allowed globs sanity (2): every glob has audit marker, all under
docs/architecture/.
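
A rough sketch of the commit-scope check these cases cover. The glob values are assumptions built from the file names and the docs/architecture/ location mentioned above, and `out_of_scope` and `head_commit_files` are hypothetical names rather than the actual helpers.

```python
import subprocess
from fnmatch import fnmatchcase
from typing import Iterable, List

# Assumed values: only the file name patterns and the directory are stated above.
AUDIT_ALLOWED_COMMIT_GLOBS = (
    "docs/architecture/INTEGRATION-AUDIT-*.md",
    "docs/architecture/BACKLOG.md",
)


def out_of_scope(files: Iterable[str]) -> List[str]:
    """Return committed paths that match none of the allowed globs."""
    bad = []
    for raw in files:
        path = raw.strip().replace("\\", "/")  # tolerate Windows separators
        if not path:
            continue
        if not any(fnmatchcase(path, glob) for glob in AUDIT_ALLOWED_COMMIT_GLOBS):
            bad.append(path)
    return bad


def head_commit_files(repo_dir: str = ".") -> List[str]:
    """List files touched by HEAD; return [] on any git error (fail-open)."""
    try:
        result = subprocess.run(
            ["git", "diff-tree", "--no-commit-id", "--name-only", "-r", "--root", "HEAD"],
            cwd=repo_dir, capture_output=True, text=True, check=True,
        )
    except (OSError, subprocess.CalledProcessError):
        return []
    return result.stdout.splitlines()
```

The Stage 5 gate would reject a YES whenever `out_of_scope(head_commit_files())` is non-empty. Note that `*` in fnmatch patterns also matches `/`, so a stricter matcher may be preferable if nested paths need to be excluded.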
Total: 94/94 pytest pass (75 prior + 19 new).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>