fix(orchestrator): P5 audit-anchor-first-line regression guard
Bug discovered during #56 INTEGRATION-AUDIT-02 execution (2026-05-20):
- Both Claude and Codex put "Audit anchor: ..." as the FIRST line of every
  Gitea comment, per the #56 issue body instruction "cite anchor at start of
  every stage".
- detect_agent (P0-1 strict, first-line only) then returns None for these
  comments because the first line is "Audit anchor:..." rather than
  "[Codex #N]" or "[Claude #N]".
- Result: the orchestrator's "is_codex" check (line ~1288) flips to false →
  "Codex 응답 미감지 — continuing" (Codex response not detected) → infinite
  Stage 4 loop. #56 reached Round #14 (>300 comments, ~2 hours of wasted
  tokens).

Fix path (NOT relaxing detect_agent, which would revive the original #45
pre-P0-1 bug where [Claude #N] citations inside Codex bodies caused
mis-detection):

1. AUDIT_ONLY_NOTE updated to enforce comment format:
   - FIRST non-empty line MUST be `[Claude #N] <stage>` or `[Codex #N] <stage>`
   - Audit anchor / banners / prefaces MUST appear on line 2 or later
   - Concrete CORRECT example included
   - Explicit warning that a violation breaks stage advance

2. is_codex None guard auto-supplements:
   - When _audit_mode(title) is true AND detect_agent returns None, the
     orchestrator posts a Gitea supplement comment requesting the correct
     format
   - In the next round, Claude/Codex see the supplement and correct their
     format
   - Breaks the infinite loop automatically (no manual ctrl-C needed)

3. Regression tests in TestDetectAgent (test_orchestrator_core.py):
   - test_audit_anchor_preface_breaks_detection: confirms P0-1 strict
     correctly returns None when the anchor is the first line
   - test_audit_anchor_after_header_works: correct format passes

Total: 96/96 pytest pass (94 prior + 2 P5 regression).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
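The strict first-line rule and the None guard described above can be sketched as follows. This is a minimal illustration, not the orchestrator's actual code: `detect_agent_sketch`, `needs_format_supplement`, the regular expression, and the choice to skip leading blank lines are all assumptions made for the example, and the real `_audit_mode(title)` check is replaced by a plain boolean.

```python
import re
from typing import Optional

# Hypothetical pattern: "[Claude #N]" or "[Codex #N]" at the start of the line,
# with optional whitespace before '#' (so "[Codex#1]" also matches).
_HEADER_RE = re.compile(r"^\[(Claude|Codex)\s*#\d+\]")


def detect_agent_sketch(body: str) -> Optional[str]:
    """Return "claude" or "codex" based only on the first non-empty line."""
    for line in body.splitlines():
        stripped = line.strip()
        if not stripped:
            continue  # skipping leading blank lines is an assumption of this sketch
        match = _HEADER_RE.match(stripped)
        # Strict: only this one line is inspected; later headers are ignored.
        return match.group(1).lower() if match else None
    return None  # empty body


def needs_format_supplement(audit_mode: bool, body: str) -> bool:
    """True when an audit-mode comment has no detectable agent header."""
    return audit_mode and detect_agent_sketch(body) is None


anchor_first = "Audit anchor: ...\n\n[Codex #14] Stage 4 (test-verify)\n"
header_first = "[Codex #14] Stage 4 (test-verify)\n\nAudit anchor: ...\n"
print(detect_agent_sketch(anchor_first))            # None
print(detect_agent_sketch(header_first))            # codex
print(needs_format_supplement(True, anchor_first))  # True
```

Keeping the detector strict and moving the correction into a separate guard means a `[Claude #N]` citation further down a Codex comment can never change the detected author.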
@@ -97,6 +97,37 @@ Addressing [Codex #2] findings ...
        assert detect_agent("[Codex#1] hello") == "codex"
        assert detect_agent("[Claude#5] hi") == "claude"

    def test_audit_anchor_preface_breaks_detection(self):
        """P5 (2026-05-20) regression: when the AUDIT-ONLY mode 'Audit anchor:'
        preface lands on the first line, detect_agent returns None (intended
        P0-1 strict behavior).
        This was the direct cause of the Stage 4 Round #14 infinite loop in
        #56 (INTEGRATION-AUDIT-02).
        Fix: do not relax detect_agent; AUDIT_ONLY_NOTE forces the agent header
        onto the first line."""
        body_anchor_first = (
            "Audit anchor: This audit verifies pipeline contracts...\n"
            "It does not implement runtime code.\n"
            "\n"
            "[Codex #14] Stage 4 (test-verify) Round #14 - INTEGRATION-AUDIT-02\n"
            "\n"
            "Verdict: PASS. Stage 3 satisfies all criteria.\n"
            "FINAL_CONSENSUS: YES\n"
        )
        assert detect_agent(body_anchor_first) is None, (
            "audit anchor preface as first line MUST cause detect_agent None "
            "(P0-1 strict). Fix path: comment format, not detect_agent."
        )

    def test_audit_anchor_after_header_works(self):
        """P5 (2026-05-20): correct format, agent header on the first line, anchor on line 2 or later."""
        body_header_first = (
            "[Codex #14] Stage 4 (test-verify) Round #14 - INTEGRATION-AUDIT-02\n"
            "\n"
            "Audit anchor: This audit verifies pipeline contracts...\n"
            "\n"
            "Verdict: PASS.\n"
            "FINAL_CONSENSUS: YES\n"
        )
        assert detect_agent(body_header_first) == "codex"


# ─────────────────────────────────────────────────────────────────
# parse_consensus — YES/NO + rewind_target