From 8f6cffc2a7cbfbacb5f6bdeda5ac4d6eaccb22cb Mon Sep 17 00:00:00 2001
From: kyeongmin
Date: Sat, 16 May 2026 02:28:46 +0900
Subject: [PATCH] =?UTF-8?q?fix(IMP-08):=20Stage=205=20R2=20=E2=80=94=20ali?=
 =?UTF-8?q?gner=20force-drill=20on=20sub-id=20override=20targets?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Codex #1 (Stage 5) reproduced a smoke regression on the actual checkout :
when V4 carries the parent exact key (e.g., `04-2`) AND the drag/drop
override targets a sub-id (`primary=04-2-sub-1`), the aligner kept the
parent at parent granularity and emitted `['04-1', '04-2']`, so the
override flag failed with `unknown section_id(s) ['04-2-sub-1']`.

Fix : `align_sections_to_v4_granularity` gains an optional
`override_target_section_ids` keyword. From each canonical
`${parent}-sub-N` target it derives the parent id and adds it to a
`force_drill_parents` set. Sections in that set are drilled into
sub-sections regardless of whether V4 carries the parent exact key.
Top-level override targets (no derived parent) do not trigger force-drill,
so backward-compat is preserved for parent-granularity overrides. The
call site in `run_phase_z2_mvp1` collects the override target ids from
`override_section_assignments` and forwards them to the aligner.

Generalization (RULE 0) :
- Trigger is the override schema (`X-sub-N`), not a specific MDX /
  section / frame id. Applies to all 32-frame MDX uniformly.
- Decision is deterministic on the override target shape, independent
  of V4 yaml content.
- Default (no override) path is unchanged byte-for-byte.

Side fixes (forward-only RULE 1 cleanup, no history rewrite) :
- `align_sections_to_v4_granularity` docstring rewritten in English
  (overwrites the Korean docstring committed in 5191aca).
- Step 9 diagnostic comment quoted-string rewritten in English
  (overwrites `"V4 entry 없음"` committed in a422d72).
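
Illustration (not part of the diff) : a minimal sketch of the parent
derivation described under "Fix". The names `derive_parent_id_sketch` and
`collect_force_drill_parents` are hypothetical. The real `derive_parent_id`
lives in `phase_z2_composition` and is not shown in this patch, so its
return value for top-level ids is an assumption here (the sketch returns
`None`).

```python
import re
from typing import Iterable, Optional

# Hypothetical stand-in for derive_parent_id; the real helper may differ.
# Assumption: canonical sub-ids look like "<parent>-sub-<ordinal>".
_SUB_ID_RE = re.compile(r"^(?P<parent>.+)-sub-\d+$")


def derive_parent_id_sketch(section_id: str) -> Optional[str]:
    """Return the parent id of a canonical sub-id, or None for other ids."""
    match = _SUB_ID_RE.match(section_id)
    return match.group("parent") if match else None


def collect_force_drill_parents(
    override_targets: Optional[Iterable[str]],
) -> set[str]:
    """Collect parents whose sub-ids are targeted by an override."""
    parents: set[str] = set()
    for sid in override_targets or ():
        parent = derive_parent_id_sketch(sid)
        # Same guard as the patch: only true sub-ids contribute a parent.
        if parent and parent != sid:
            parents.add(parent)
    return parents


# "04-2-sub-1" contributes its parent; the top-level "03-1" does not.
force_drill = collect_force_drill_parents(["04-2-sub-1", "03-1"])
print(force_drill)
```

Under these assumptions only `04-2` lands in the set, so a top-level
override such as `primary=03-1` leaves every section at parent granularity.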

Tests : 3 new cases in `test_phase_z2_subsection_schema.py` :
- `test_align_parent_v4_exact_keeps_section_when_no_override_targets_sub`
  (backward-compat axis)
- `test_align_force_drills_when_override_targets_sub_id_with_parent_in_v4`
  (blocker regression)
- `test_align_top_level_override_target_does_not_force_drill_other_sections`
  (force-drill scope guard)

Pytest scope-qualified result : `test_phase_z2_subsection_schema.py` +
`_section_assignment_override.py` + `_v4_fallback.py` = 40 / 40 PASS.

Smoke (axis = sub-id override -> aligner -> assignment plan, both V4
yaml shapes) :
- HEAD V4 yaml (`04-1`, `04-2.1`, `04-2.2` only) :
  `--override-section-assignment primary=04-2-sub-1` ->
  `aligned_section_ids=['04-1', '04-2-sub-1', '04-2-sub-2']`,
  `plan[0].assignment_source='cli_override'`,
  `plan[0].source_section_ids=['04-2-sub-1']`.
- V4 yaml with `04-2` exact key (Codex's stress case) : identical
  aligned output and identical assignment plan.

Downstream `composition_planner` abort
(`phase_z_status_not_allowed:extract_matched_zone`) is IMP-05 territory,
unchanged in both shapes.

Co-Authored-By: Claude Opus 4.7 (1M context)
---
 src/phase_z2_pipeline.py                 | 97 ++++++++++++++++++------
 tests/test_phase_z2_subsection_schema.py | 61 ++++++++++++++-
 2 files changed, 132 insertions(+), 26 deletions(-)

diff --git a/src/phase_z2_pipeline.py b/src/phase_z2_pipeline.py
index e882b04..4748756 100644
--- a/src/phase_z2_pipeline.py
+++ b/src/phase_z2_pipeline.py
@@ -41,6 +41,7 @@ from jinja2 import Environment, FileSystemLoader, select_autoescape
 from phase_z2_composition import (
     LAYOUT_PRESETS,
     CompositionUnit,
+    derive_parent_id,
     plan_composition,
     select_display_strategy_candidates,
     select_layout_candidates,
@@ -372,47 +373,80 @@ def load_v4_result() -> dict:
     return yaml.safe_load(V4_RESULT_PATH.read_text(encoding="utf-8"))
 
 
-def align_sections_to_v4_granularity(sections: list[MdxSection], v4: dict) -> list[MdxSection]:
-    """V4 section granularity 에 맞춰 sections 조정.
+def align_sections_to_v4_granularity(
+    sections: list[MdxSection],
+    v4: dict,
+    *,
+    override_target_section_ids: Optional[list[str]] = None,
+) -> list[MdxSection]:
+    """Align MDX sections to canonical sub-section granularity.
 
-    IMP-08 B-3 : canonical sub-section id ``${section_id}-sub-${ordinal}``
-    (예 : ``04-2-sub-1``) 를 emit 하고, legacy V4 키 (``04-2.1``) 는
-    ``v4_alias_keys`` 로 보존하여 ``_resolve_v4_section_key`` 가 alias 경로로
-    매칭한다. canonical ordinal id 는 frontend drag/drop override 와 동일
-    schema (`section_id-sub-N`).
+    Default behaviour (V4-driven granularity, backward compatible) :
+    - V4 has section_id exact key -> keep section unchanged (parent
+      granularity rendering, parent-level V4 evidence applies).
+    - V4 missing + H3 sub-sections -> drill into sub-sections, emit
+      canonical ids ``${section_id}-sub-${ordinal}`` with optional
+      decimal alias for legacy V4 keys (e.g. ``04-2.1``).
+    - V4 missing + no H3 -> pass through (downstream V4 lookup
+      will naturally abort with no_v4_section).
 
-    N-R5 alias guard : heading_number 가 decimal (``2.1``) 일 때만 alias
-    emit. integer-only (``1``) / non-numeric heading 은 alias 0 — sibling
-    parent V4 evidence 로 잘못 promote 되는 collision 방지 (RULE 0).
+    IMP-08 B-3 / Stage 5 R2 blocker-fix — ``override_target_section_ids``
+    is the list of section ids that drag/drop override CLI flags target.
+    When any override target matches ``${section_id}-sub-N`` for a section
+    whose parent is otherwise V4-aligned, that section is force-drilled so
+    sub-section ids become addressable. This keeps the default rendering
+    path on V4 granularity while making drag/drop deterministic regardless
+    of whether V4 carries a parent exact key.
 
-    각 section 에 대해 :
-    - V4 에 section.section_id 키 있음 → 그대로 유지 (## level 매칭)
-    - V4 에 키 없고 raw_content 에 ### sub-section 존재 → ### 로 drill
-    - V4 에 키 없고 ### 도 없음 → 원본 그대로 (V4 lookup 단계에서 자연스럽게 abort)
+    Each drilled sub-section carries :
+    - heading_number : decimal "2.1" / integer "1" / None (bare H3 title).
+    - v4_alias_keys : legacy V4 keys to try when the canonical ordinal
+      id misses. Populated only when ``heading_number`` matches the
+      decimal pattern ``\\d+\\.\\d+`` (N-R5 guard) — integer-only or
+      bare H3 produces no alias to avoid sibling-parent V4 collisions.
 
-    설계 원칙 :
-    - parser (parse_mdx) = MDX 만 앎 (V4 무관)
-    - aligner (이 함수) = V4 키 기준 granularity 결정
-    - runtime parser 가 matching artifact 의 granularity 를 *따라가는* 구조
+    Design boundary :
+    - parser (``parse_mdx``) = MDX-only knowledge (V4-agnostic).
+    - aligner (this function) = canonical sub-id schema, MDX-driven on
+      force_drill, V4-driven otherwise.
+    - resolver (``_resolve_v4_section_key``) = exact > alias > None,
+      never auto-promotes to parent/sibling (axis 7 hybrid lock).
     """
     v4_keys = set(v4.get("mdx_sections", {}).keys())
+
+    # Build the set of parent ids whose sub-ids are explicitly targeted by
+    # an override. These sections must be drilled even if V4 also carries
+    # the parent key exactly. Parents derived from canonical "X-sub-N" ids
+    # only — non-sub ids (top-level overrides) do not trigger drilling.
+    force_drill_parents: set[str] = set()
+    if override_target_section_ids:
+        for sid in override_target_section_ids:
+            parent = derive_parent_id(sid)
+            if parent and sid != parent:
+                force_drill_parents.add(parent)
+
     aligned: list[MdxSection] = []
-    # IMP-08 B-3 : capture optional heading-number prefix (decimal "2.1" or
-    # integer "1") + heading title. None group = bare "### Title".
+    # Capture optional heading-number prefix (decimal "2.1" or integer "1")
+    # plus the heading title. None group = bare "### Title".
     sub_pattern = re.compile(
         r"^###\s+(?:(\d+(?:\.\d+)?)\s+)?(.+?)$", re.MULTILINE
     )
     decimal_re = re.compile(r"\d+\.\d+")
 
     for section in sections:
-        if section.section_id in v4_keys:
+        force_drill = section.section_id in force_drill_parents
+        if section.section_id in v4_keys and not force_drill:
+            # V4 carries this section exactly and no override targets a
+            # sub-id under it: keep parent granularity (backward compat).
             aligned.append(section)
             continue
 
         sub_matches = list(sub_pattern.finditer(section.raw_content))
         if not sub_matches:
-            aligned.append(section)  # drill 불가, V4 lookup 에서 abort 됨
+            # No H3 sub-sections: cannot drill. Pass section through;
+            # downstream V4 lookup aborts with no_v4_section when needed.
+            aligned.append(section)
             continue
 
         mdx_id = section.section_id.split("-")[0]  # e.g., "04"
@@ -2076,8 +2110,21 @@ def run_phase_z2_mvp1(
     # 2. Load V4
     v4 = load_v4_result()
 
-    # 3. Align sections to V4 granularity (### drill if needed)
-    sections = align_sections_to_v4_granularity(sections, v4)
+    # 3. Align sections to V4 granularity (### drill if needed).
+    # IMP-08 B-3 / Stage 5 R2 : forward override target ids so sub-id
+    # drag/drop targets force-drill their parent section even when V4
+    # carries the parent exact key (deterministic drag/drop addressing).
+    _override_target_sids: list[str] = []
+    if override_section_assignments:
+        for _sids in override_section_assignments.values():
+            for _sid in _sids:
+                if isinstance(_sid, str) and _sid:
+                    _override_target_sids.append(_sid)
+    sections = align_sections_to_v4_granularity(
+        sections,
+        v4,
+        override_target_section_ids=_override_target_sids or None,
+    )
     print(f" aligned : sections={len(sections)} ({[s.section_id for s in sections]})")
 
     # ─── Step 5: V4 매칭 evidence (non-reject max-6 후보 list — 사용자 lock 2026-05-08) ───
@@ -2871,7 +2918,7 @@ def run_phase_z2_mvp1(
     # reporting only. Runtime selection goes through _resolve_v4_section_key
     # (4 sites). Direct dict lookup here is intentional — debug_zones carries
     # dict-shape entries without v4_alias_keys plumbing, and a miss here only
-    # yields a "V4 entry 없음" report line (runtime impact zero).
+    # yields a "V4 entry missing" report line (runtime impact zero).
     try:
         with open(V4_RESULT_PATH, encoding="utf-8") as _vf:
             _v4_full = yaml.safe_load(_vf)
diff --git a/tests/test_phase_z2_subsection_schema.py b/tests/test_phase_z2_subsection_schema.py
index 0955b68..adc6939 100644
--- a/tests/test_phase_z2_subsection_schema.py
+++ b/tests/test_phase_z2_subsection_schema.py
@@ -111,7 +111,9 @@ def test_mdx_section_default_construction_preserves_4_positional_callers():
 
 
 def test_align_passthrough_when_v4_key_exact_match():
-    # Section already aligned to V4 key — aligner keeps it untouched.
+    # Section already aligned to V4 key (no override target): aligner
+    # keeps it untouched. Parent-level V4 evidence flows via exact-match
+    # lookup.
     sections = [_section("04-1", 1, "1. Top", "body")]
     v4 = {"mdx_sections": {"04-1": {"judgments_full32": []}}}
     out = align_sections_to_v4_granularity(sections, v4)
@@ -119,6 +121,63 @@ def test_align_passthrough_when_v4_key_exact_match():
     assert out[0].section_id == "04-1"
 
 
+def test_align_parent_v4_exact_keeps_section_when_no_override_targets_sub():
+    # Backward-compat axis: when V4 carries the parent exact key and no
+    # drag/drop override targets a sub-id of this section, the aligner
+    # MUST keep the parent (preserves V4 evidence at parent granularity).
+    raw = "### 2.1 First\nbody1\n### 2.2 Second\nbody2\n"
+    sections = [_section("03-2", 2, "2. Parent", raw)]
+    v4 = {"mdx_sections": {"03-2": {"judgments_full32": []}}}
+    out = align_sections_to_v4_granularity(sections, v4)
+    assert [s.section_id for s in out] == ["03-2"]
+
+
+def test_align_force_drills_when_override_targets_sub_id_with_parent_in_v4():
+    # Stage 5 R2 blocker-fix regression: when V4 has the parent exact key
+    # AND an override targets a sub-id of that section, the aligner MUST
+    # drill regardless of V4 parent presence. This makes drag/drop
+    # addressing deterministic across all V4 yaml shapes.
+    raw = "### 2.1 First\nbody1\n### 2.2 Second\nbody2\n"
+    sections = [_section("04-2", 2, "2. Parent", raw)]
+    v4 = {
+        "mdx_sections": {
+            "04-2": {"judgments_full32": []},  # parent V4 entry present
+            "04-2.1": {"judgments_full32": []},  # plus decimal sub entries
+            "04-2.2": {"judgments_full32": []},
+        }
+    }
+    out = align_sections_to_v4_granularity(
+        sections, v4, override_target_section_ids=["04-2-sub-1"]
+    )
+    # Force-drill: parent id MUST be replaced by canonical sub-ids.
+    assert [s.section_id for s in out] == ["04-2-sub-1", "04-2-sub-2"]
+    # Decimal aliases preserved (N-R5: decimal heading_number).
+    assert out[0].v4_alias_keys == ["04-2.1"]
+    assert out[1].v4_alias_keys == ["04-2.2"]
+
+
+def test_align_top_level_override_target_does_not_force_drill_other_sections():
+    # Top-level override target ("primary=03-1") has no derive_parent_id,
+    # so it MUST NOT force-drill any section. Only "X-sub-N" targets
+    # trigger force-drill on parent X.
+    raw = "### 2.1 First\nbody1\n"
+    sections = [
+        _section("03-1", 1, "1. Top", "body"),
+        _section("03-2", 2, "2. Parent", raw),
+    ]
+    v4 = {
+        "mdx_sections": {
+            "03-1": {"judgments_full32": []},
+            "03-2": {"judgments_full32": []},
+        }
+    }
+    out = align_sections_to_v4_granularity(
+        sections, v4, override_target_section_ids=["03-1"]
+    )
+    # No sub-id target -> both sections kept at parent granularity.
+    assert [s.section_id for s in out] == ["03-1", "03-2"]
+
+
 def test_align_drill_emits_canonical_ordinal_id_with_decimal_alias():
     # Decimal H3 headings -> canonical ordinal id + decimal alias (legacy V4 key).
     raw = "### 2.1 First\nbody1\n### 2.2 Second\nbody2\n"