Land the production + test surface for the Step 17 cascade POPUP terminal
(DETERMINISTIC -> POPUP -> AI_REPAIR -> USER_OVERRIDE) per Stage 2 plan R2.
u11 (baseline-red invariance gate) already landed in 7c93031 ahead of this
commit; this commit completes u1~u10 plus the Stage 3 R7 follow-up anchor
re-pin for test_imp17_comment_anchor.py.

Implementation units (Stage 2 R2 contract):
  u1  frame_reselect_insufficient failure_type + post-frame remeasure (q4)
      - src/phase_z2_failure_router.py, src/phase_z2_pipeline.py
  u2  NEXT_ACTION_BY_FAILURE row + impl_status flip
      - src/phase_z2_failure_router.py
  u3  Router details_popup_escalation MISSING->IMPLEMENTED + executor stub
      - src/phase_z2_router.py
  u4  step17.py AI split-decision contract (POPUP cascade_stage +
      route_for_label + skip_reason); API gated
      - src/phase_z2_ai_fallback/step17.py
  u5  Step 17 POPUP gate executor; popup_escalation_plan + has_popup marker
      - src/phase_z2_pipeline.py, src/phase_z2_ai_fallback/step17.py
  u6  Composition popup binding -- yaml strategy -> zone payload
      - src/phase_z2_composition.py
  u7  Pipeline composer -> render_slide wiring
      (popup_html / preview_text / has_popup)
      - src/phase_z2_pipeline.py
  u8  slide_base.html <details>/<summary> popup wrapper
      - templates/phase_z2/slide_base.html
  u9  display_strategies.yaml inline_preview + popup metadata
      - templates/phase_z2/regions/display_strategies.yaml
  u10 MDX preservation invariant: popup=full source / body=summary or subset
      (asserted by tests/phase_z2/test_popup_mdx_preservation.py)
  u11 (already in 7c93031) -- baseline-red invariance gate

Stage 3 R7 follow-up (anchor re-pin, test-only):
  - tests/orchestrator_unit/test_imp17_comment_anchor.py
    Pre-anchor additions in src/phase_z2_pipeline.py (u1 / u5 / u7) shifted
    the restructure/reject route-hint comments 578/579 -> 586/587. Re-pinned
    the two guard tests (and docstring re-pin lineage
    564 -> 570 -> 578 -> 586). Production code untouched.

Verification (Stage 4 R1):
  pytest -q tests/orchestrator_unit/test_imp17_comment_anchor.py
    -> 2 passed / 0.02s
  pytest -q <10 IMP-35 unit files in tests/phase_z2 + tests/phase_z2_ai_fallback>
    -> 136 passed / 15.94s
  Baseline-red invariance gate (tests/test_imp47b_step12_ai_wiring.py +
  tests/test_phase_z2_ai_fallback_config.py)
    -> 4 failed / 6 passed; FAILED set === IMP35_BASELINE_RED_NODE_IDS
       (frozen registry from 7c93031). Contract holds.
  Codex Stage 4 R1 = YES (independent verify).

Guardrails honored:
  - MDX content preservation: popup carries full source, body holds summary
    or subset only (CLAUDE.md view-details principle;
    feedback_phase_z_spacing_direction -- capacity expanded, no margin shrink).
  - AI isolation contract: Step 17 POPUP gate is deterministic; AI hook
    surface is split-decision contract only, API call gated.
  - No hardcoding: escalation thresholds derived from existing overflow
    detector outputs; preview_chars deterministic from container px.
  - 1 commit = 1 decision unit: u1~u10 land together as the planned
    production surface; u11 was deliberately split into 7c93031 as the
    Stage 3 R7 carve-out, and the R7 anchor re-pin rides with this commit
    because it is the direct shift consequence of the u1/u5/u7 pre-anchor
    additions.
  - Scope-locked: .claude/settings.json explicitly excluded (Stage 4 exit
    report contract).

Out of scope (per Stage 1 + Stage 2):
  - AI_REPAIR API activation (post IMP-35 axis).
  - IMP-34 zone resize, IMP-36 responsive fit (chain partners, separate issues).
  - Print-time auto-expand JavaScript for <details>.
  - Popup escalation in stages other than Step 17.
  - Baseline-red body repair (4 frozen failures) -- separate follow-up issue;
    u11 only guards the count.
  - frame_reselect algorithm changes (entry point only).
  - templates/phase_z2/slide_base.html path rename.

source_comment_ids:
  Stage 1: claude_stage1_problem_review_imp35, codex_stage1_verification_imp35_yes
  Stage 2: Claude #4 R2 plan, Codex #5 R2 YES
  Stage 3: Claude #86 (R7 anchor re-pin), Codex #87 YES
  Stage 4: Claude #88 R1, Codex #89 R1 YES

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1581 lines
70 KiB
Python
"""Phase Z-2 Composition Planner v0.

The layer missing from the pipeline: the decision layer that groups MDX chunks
into *final zone units*.

Position:
    parse_mdx → align_sections_to_v4_granularity → [this module] → render

Principles (absolute rules):
    - No hardcoding of a specific MDX / frame / section (e.g. "if 04-2" / "if F16").
    - Every decision is parametric on catalog metadata + V4 evidence.
    - The same code handles MDX 02/03/04/05/06...; results differ per case.
    - Drilling result = input (material); composition planner result = output (zone units).
    - Slide-level layout divides only down to zones. Splitting inside a zone is
      the frame partial's responsibility.

8 layout preset vocabulary:
    L1 single / L2 horizontal-2 / L3 vertical-2
    L4 top-1-bottom-2 / L5 top-2-bottom-1
    L6 left-1-right-2 / L7 left-2-right-1
    L8 grid-2x2
"""

from __future__ import annotations

import re
from dataclasses import dataclass, field
from pathlib import Path
from typing import Optional

import yaml


# ─── 8 Layout Preset Vocabulary: catalog-loaded (user lock 2026-05-07) ───
#
# Source of truth = templates/phase_z2/layouts/layouts.yaml (human-readable; entries can be added/edited there).
# The hardcoded dict in code is retired (Step 7-A catalog migration). No logic change; backward compat.
#
# The additional catalog fields (render_ready / default_selection / candidate_when) are
# ignored by existing call sites. They become inputs once Step 7-B (multiple candidates) /
# Step 9 (layout × frame fit eval) are entered.

_LAYOUTS_CATALOG_PATH = (
    Path(__file__).resolve().parent.parent
    / "templates" / "phase_z2" / "layouts" / "layouts.yaml"
)


def load_layout_presets() -> dict[str, dict]:
    """Load 8 layout presets from catalog.

    Backward compat: returns the same dict shape as the old hardcoded LAYOUT_PRESETS:
    keys = layout id (single / horizontal-2 / ...),
    each value contains zones / topology / positions / css_areas / css_cols / css_rows.
    Additional fields (render_ready / default_selection / candidate_when) are
    ignored by existing callers and consumed by Step 7-B / Step 9 (separate axis).
    """
    with open(_LAYOUTS_CATALOG_PATH, encoding="utf-8") as f:
        return yaml.safe_load(f) or {}


LAYOUT_PRESETS: dict[str, dict] = load_layout_presets()


def select_layout_candidates(unit_count: int) -> list[str]:
    """Return layout id candidates matching given unit_count.

    Step 7-B (user lock 2026-05-07): multiple-candidate generation.

    Args:
        unit_count: Final layout placement unit count (Step 4 output),
            i.e. section_count + promoted lead_orphans, etc.
            NOT the raw MDX section count from Step 2.

    Returns:
        List of layout ids matching candidate_when.unit_count.
        Sort order:
            1. default_selection: true first (catalog definition order)
            2. default_selection: false next (catalog definition order)
        Layouts with render_ready: false are excluded.

    Raises:
        ValueError: if unit_count < 1 or > 4 (current catalog scope).

    Note:
        Call site wired (Step 7-conn 2026-05-08): the step07 artifact in
        phase_z2_pipeline.py records this function's result (passive). The
        existing select_layout_preset() still makes the default decision.
        Candidate evaluation / auto decision belongs to Step 9 v1 (separate axis).
    """
    if unit_count < 1 or unit_count > 4:
        raise ValueError(
            f"unit_count {unit_count} out of catalog scope [1, 4]"
        )
    defaults: list[str] = []
    alternatives: list[str] = []
    for layout_id, spec in LAYOUT_PRESETS.items():
        if not spec.get("render_ready", False):
            continue
        cw = spec.get("candidate_when") or {}
        if cw.get("unit_count") != unit_count:
            continue
        if spec.get("default_selection", False):
            defaults.append(layout_id)
        else:
            alternatives.append(layout_id)
    return defaults + alternatives
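

# Illustration only: a minimal, self-contained sketch of the "defaults first,
# then alternatives, both in catalog order" rule described above. The
# SKETCH_LAYOUTS dict and sketch_select_layouts() are hypothetical stand-ins;
# the real catalog lives in layouts.yaml and its exact contents are assumed here.

```python
# Hypothetical in-memory catalog (assumed shape; NOT the real layouts.yaml).
SKETCH_LAYOUTS = {
    "vertical-2": {
        "render_ready": True,
        "default_selection": False,
        "candidate_when": {"unit_count": 2},
    },
    "horizontal-2": {
        "render_ready": True,
        "default_selection": True,
        "candidate_when": {"unit_count": 2},
    },
    "grid-2x2": {
        "render_ready": False,  # excluded: not render-ready
        "default_selection": True,
        "candidate_when": {"unit_count": 4},
    },
}


def sketch_select_layouts(catalog: dict, unit_count: int) -> list:
    """Return matching layout ids: defaults first, then alternatives."""
    matching = [
        (layout_id, spec)
        for layout_id, spec in catalog.items()
        if spec.get("render_ready", False)
        and (spec.get("candidate_when") or {}).get("unit_count") == unit_count
    ]
    defaults = [lid for lid, spec in matching if spec.get("default_selection", False)]
    alternatives = [lid for lid, spec in matching if not spec.get("default_selection", False)]
    return defaults + alternatives


two_unit = sketch_select_layouts(SKETCH_LAYOUTS, 2)
four_unit = sketch_select_layouts(SKETCH_LAYOUTS, 4)
print(two_unit)   # the default layout is listed before the alternative
print(four_unit)  # empty: the only 4-unit entry is not render-ready
```

# In this sketch the default entry is ranked first even though it appears
# later in the dict, and a non-render-ready entry never becomes a candidate.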


# ─── Region Layout Catalog: Step 8-B-1 (user lock 2026-05-07) ────────
#
# Source = templates/phase_z2/regions/region_layouts.yaml (SPEC §2.5).
# Load function + select_region_layout_candidates().
# Call site wired (Step 8-conn 2026-05-08): the step08 artifact in phase_z2_pipeline.py
# records this function's result (placeholder signals: region_count=1, pending Step 3/4).

_REGION_LAYOUTS_CATALOG_PATH = (
    Path(__file__).resolve().parent.parent
    / "templates" / "phase_z2" / "regions" / "region_layouts.yaml"
)


def load_region_layouts() -> dict[str, dict]:
    """Load Internal Region layout catalog (SPEC §2.5, 6 entries).

    Returns the same dict shape as the catalog yaml.
    Same pattern as Step 7-A: source of truth = yaml, code only reads.
    """
    with open(_REGION_LAYOUTS_CATALOG_PATH, encoding="utf-8") as f:
        return yaml.safe_load(f) or {}


REGION_LAYOUTS: dict[str, dict] = load_region_layouts()


def select_region_layout_candidates(
    region_count: int,
    content_type_mix: Optional[list[str]] = None,
    details_presence: bool = False,
    role_pattern: Optional[str] = None,
    ratio_asymmetric: bool = False,
    flow_type: Optional[str] = None,
    has_visual_element: bool = False,
    large_table: bool = False,
    long_text: bool = False,
) -> list[str]:
    """Return Internal Region layout candidates per SPEC §2.5 decision tree.

    Step 8-B-1 (user lock 2026-05-07): candidate generation function.
    Difference from Step 7-B: SPEC §2.5 is a *sequential decision tree*
    (the first match is adopted). Step 7-B is simple matching (every entry
    with the same unit_count).

    Decision rule (sequential, first match wins), 1:1 with the catalog:
        1. region_count == 1                                -> region-single
        2. details_presence / large_table / long_text       -> region-preview-details
        3. region_count == 4 AND flow_type == 'parallel_4'  -> region-grid-2x2
        4. region_count == 2 AND role_pattern ==
           'primary_supporting' AND ratio_asymmetric        -> region-main-support
        5. region_count == 2 AND has_visual_element         -> region-horizontal-split
        6. fallback (none of the above matched)             -> region-vertical-stack

    Sort:
        region_count == 1 -> [region-single] (no fallback)
        region_count >= 2 -> [match, region-vertical-stack] or [region-vertical-stack]

    Raises:
        ValueError: region_count < 1 or > 4 (SPEC §2.5 vocabulary scope).

    Note:
        Call site wired (Step 8-conn 2026-05-08): the step08 artifact in
        phase_z2_pipeline.py records this function's result. It currently
        depends on placeholder signals (region_count=1,
        content_type="text_block"); activating real signals is the Step 3/4
        separate axis. Step 9 v0 (application_plan) interprets this candidate
        list as application_candidates.
    """
    if region_count < 1 or region_count > 4:
        raise ValueError(
            f"region_count {region_count} out of catalog scope [1, 4]"
        )

    fallback = "region-vertical-stack"

    # 1. region_count == 1
    if region_count == 1:
        return ["region-single"]

    # 2. details_presence / large_table / long_text
    if details_presence or large_table or long_text:
        match = "region-preview-details"
    # 3. region_count == 4 + parallel_4
    elif region_count == 4 and flow_type == "parallel_4":
        match = "region-grid-2x2"
    # 4. region_count == 2 + role_pattern primary_supporting + ratio_asymmetric
    elif (
        region_count == 2
        and role_pattern == "primary_supporting"
        and ratio_asymmetric
    ):
        match = "region-main-support"
    # 5. region_count == 2 + visual element
    elif region_count == 2 and has_visual_element:
        match = "region-horizontal-split"
    # 6. fallback
    else:
        return [fallback]

    # Matched + fallback (a single entry if the match is the fallback itself)
    if match == fallback:
        return [fallback]
    return [match, fallback]
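

# Illustration only: the sequential "first match wins" tree above can also be
# read as an ordered rule table. This is a hedged sketch, not the module's
# implementation: RegionSignals, SKETCH_RULES and sketch_region_candidates()
# are hypothetical names, and the range validation ([1, 4]) is omitted.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RegionSignals:
    """Hypothetical bundle of the decision signals used by the tree."""
    region_count: int
    details_presence: bool = False
    large_table: bool = False
    long_text: bool = False
    flow_type: Optional[str] = None
    role_pattern: Optional[str] = None
    ratio_asymmetric: bool = False
    has_visual_element: bool = False


# Ordered (predicate, layout id) pairs; order encodes rule priority 2..5.
SKETCH_RULES = [
    (lambda s: s.details_presence or s.large_table or s.long_text,
     "region-preview-details"),
    (lambda s: s.region_count == 4 and s.flow_type == "parallel_4",
     "region-grid-2x2"),
    (lambda s: s.region_count == 2
     and s.role_pattern == "primary_supporting"
     and s.ratio_asymmetric,
     "region-main-support"),
    (lambda s: s.region_count == 2 and s.has_visual_element,
     "region-horizontal-split"),
]


def sketch_region_candidates(signals: RegionSignals) -> list:
    """First matching rule wins; the vertical stack is always the fallback."""
    if signals.region_count == 1:  # rule 1: no fallback entry
        return ["region-single"]
    fallback = "region-vertical-stack"
    for predicate, layout_id in SKETCH_RULES:
        if predicate(signals):
            return [layout_id, fallback]
    return [fallback]  # rule 6


single = sketch_region_candidates(RegionSignals(region_count=1, long_text=True))
priority = sketch_region_candidates(
    RegionSignals(region_count=4, flow_type="parallel_4", long_text=True)
)
plain = sketch_region_candidates(RegionSignals(region_count=3))
print(single, priority, plain)
```

# The second call shows the priority order: long_text (rule 2) is checked
# before the 4-region parallel grid (rule 3), so the preview-details layout
# is chosen even though the grid rule would also match.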


# ─── Display Strategy Catalog: Step 8-B-2 (user lock 2026-05-07) ────
#
# Source = templates/phase_z2/regions/display_strategies.yaml (4 entries).
# Load function + select_display_strategy_candidates().
# Call site wired (Step 8-conn 2026-05-08): the step08 artifact in phase_z2_pipeline.py
# records this function's result (placeholder signals: content_type="text_block", pending Step 3/4).

_DISPLAY_STRATEGIES_CATALOG_PATH = (
    Path(__file__).resolve().parent.parent
    / "templates" / "phase_z2" / "regions" / "display_strategies.yaml"
)


def load_display_strategies() -> dict[str, dict]:
    """Load display strategy catalog (4 entries).

    Returns the same dict shape as the catalog yaml.
    Same pattern as Step 7-A / 8-B-1: source of truth = yaml, code only reads.
    """
    with open(_DISPLAY_STRATEGIES_CATALOG_PATH, encoding="utf-8") as f:
        return yaml.safe_load(f) or {}


DISPLAY_STRATEGIES: dict[str, dict] = load_display_strategies()


_KNOWN_CONTENT_TYPES = frozenset({
    "text_block", "table", "image", "details", "decorative_element",
})


def select_display_strategy_candidates(
    content_type: str,
    long_text: bool = False,
    large_table: bool = False,
    fits_in_region: Optional[bool] = None,
) -> list[str]:
    """Return display strategy candidates per catalog (display_strategies.yaml).

    Step 8-B-2 (user lock 2026-05-07): candidate generation function.
    Looks only at display_strategies.yaml (region_layouts / frame are the Step 9 axis).

    Hard filter (absolute constraints fixed in the catalog: applies_to / forbidden_for):
        - content_type must be in strategy.applies_to to be a candidate
        - content_type in strategy.forbidden_for is excluded automatically
        - key user lock: text_block / table / image / details are NEVER dropped
          (fixed in the catalog forbidden_for; lossless preservation of the source)

    Ranking (content_type + fit signal):
        decorative_element            -> [inline_full, dropped]
        image                         -> [inline_full]
        text_block / table / details
            long_text / large_table
            / fits_in_region == False -> [inline_preview_with_details,
                                          details_only, inline_full]
            otherwise                 -> [inline_full,
                                          inline_preview_with_details,
                                          details_only]

    Note:
        - fits_in_region is only a light hint. The actual overflow judgment is
          the Step 9/14/17 axis.
        - dropped is the lower-priority option for decorative_element (show the
          element until a space-shortage signal appears).
        - Call site wired (Step 8-conn 2026-05-08): the step08 artifact in
          phase_z2_pipeline.py records this function's result. It currently
          depends on a placeholder signal (content_type="text_block");
          activating real signals is the Step 3/4 separate axis.
        - Step 9 v0 (application_plan) interprets this candidate list as the
          display_strategy axis of application_candidates.

    Raises:
        ValueError: content_type is outside the catalog scope
            (anything other than text_block / table / image / details /
            decorative_element).
    """
    if content_type not in _KNOWN_CONTENT_TYPES:
        raise ValueError(
            f"content_type {content_type!r} out of catalog scope "
            f"(known: {sorted(_KNOWN_CONTENT_TYPES)})"
        )

    # Hard filter: applies_to / forbidden_for (read directly from the catalog)
    eligible = set()
    for name, meta in DISPLAY_STRATEGIES.items():
        applies_to = meta.get("applies_to") or []
        forbidden_for = meta.get("forbidden_for") or []
        if content_type in applies_to and content_type not in forbidden_for:
            eligible.add(name)

    # Ranking: content_type + fit signal
    if content_type == "decorative_element":
        order = ["inline_full", "dropped"]
    else:
        escalate = long_text or large_table or fits_in_region is False
        if escalate:
            order = [
                "inline_preview_with_details",
                "details_only",
                "inline_full",
            ]
        else:
            order = [
                "inline_full",
                "inline_preview_with_details",
                "details_only",
            ]

    return [s for s in order if s in eligible]
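

# Illustration only: a self-contained sketch of the two-stage selection above
# (hard filter on applies_to / forbidden_for, then a fixed ranking order).
# SKETCH_STRATEGIES is a hypothetical catalog; the actual entries in
# display_strategies.yaml are not shown here and are assumed for the example.

```python
# Hypothetical catalog (assumed contents; NOT the real display_strategies.yaml).
_TEXTUAL = ["text_block", "table", "details"]
SKETCH_STRATEGIES = {
    "inline_full": {
        "applies_to": _TEXTUAL + ["image", "decorative_element"],
        "forbidden_for": [],
    },
    "inline_preview_with_details": {
        "applies_to": _TEXTUAL,
        "forbidden_for": ["image", "decorative_element"],
    },
    "details_only": {
        "applies_to": _TEXTUAL,
        "forbidden_for": ["image", "decorative_element"],
    },
    "dropped": {
        "applies_to": ["decorative_element"],
        "forbidden_for": _TEXTUAL + ["image"],
    },
}


def sketch_rank_strategies(content_type: str, *, escalate: bool = False) -> list:
    """Hard-filter the catalog, then order the survivors by a fixed ranking."""
    eligible = {
        name
        for name, meta in SKETCH_STRATEGIES.items()
        if content_type in (meta.get("applies_to") or [])
        and content_type not in (meta.get("forbidden_for") or [])
    }
    if content_type == "decorative_element":
        order = ["inline_full", "dropped"]
    elif escalate:
        order = ["inline_preview_with_details", "details_only", "inline_full"]
    else:
        order = ["inline_full", "inline_preview_with_details", "details_only"]
    return [name for name in order if name in eligible]


image_ranked = sketch_rank_strategies("image", escalate=True)
table_ranked = sketch_rank_strategies("table", escalate=True)
decor_ranked = sketch_rank_strategies("decorative_element")
print(image_ranked, table_ranked, decor_ranked)
```

# Under these assumed catalog entries, an image never escalates to a preview
# (the hard filter removes those strategies before ranking), and "dropped"
# is reachable only for decorative elements.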


# ─── IMP-35 (#64) u6: Composition popup binding (yaml strategy -> zone payload) ─
#
# Stage 2 binding contract (unit u6):
#   Step 17 POPUP gate (u5 in src/phase_z2_ai_fallback/step17.py) stamps
#   ``unit.has_popup=True`` AND ``unit.popup_escalation_plan=<plan>`` on
#   composition units whose overflow category routes to
#   ``details_popup_escalation``. u6 is the composition-side binding that
#   translates the unit-side marker into a deterministic zone payload
#   structure that u7 (pipeline composer -> render_slide wiring) reads to
#   emit the ``<details>/<summary>`` markup u8 will add to slide_base.html.
#
# Inputs (unit-side, all duck-typed via getattr):
#   has_popup              : bool (False default; u5 sets True on
#                            feasible escalation only)
#   popup_escalation_plan  : dict | None (u3 router plan from
#                            plan_details_popup_escalation; carries
#                            feasible / category / rationale /
#                            needs_split_decision)
#   raw_content            : str (the source MDX content; popup body
#                            source per the CLAUDE.md view-details principle)
#
# Outputs (zone payload binding dict):
#   display_strategy       : catalog strategy id read from
#                            display_strategies.yaml (NOT hardcoded).
#                            ``inline_full`` when has_popup=False.
#                            ``inline_preview_with_details`` when
#                            has_popup=True (preview = excerpt from
#                            container px budget downstream; popup body
#                            preserves the FULL original).
#   popup_body_source      : str | None. The FULL raw_content. u7 passes
#                            this verbatim to the renderer; the popup
#                            body is the MDX source text (view-details
#                            principle), never summarized in the body branch.
#                            None when has_popup=False.
#   detail_trigger         : dict | None. Placement + label read from
#                            the catalog strategy entry's
#                            ``detail_trigger``. None when has_popup=False.
#   preserves_original     : bool. Echoed from the catalog entry.
#                            MUST be True for popup-binding strategies
#                            (absolute user lock: mistake log #5 /
#                            IMPROVEMENT-REDESIGN.md §3.6 line 110).
#   has_popup              : bool. Echoed for downstream multiplex.
#   popup_escalation_plan  : dict | None. Echoed verbatim (u5 plan).
#                            Provides traceability into the router
#                            category + rationale for downstream debug.
#   strategy_meta          : dict. Full catalog entry (description /
#                            applies_to / forbidden_for / detail_trigger)
#                            so downstream traces can self-explain without
#                            re-reading the yaml.
#
# Guardrails honored:
#   - feedback_ai_isolation_contract: NO AI call. Reads catalog + unit
#     state only. The deterministic POPUP gate (u5) already established
#     the marker; this function is pure composition-side binding.
#   - feedback_no_hardcoding: strategy id is the ONLY name reference, and
#     it is the catalog key (yaml is source of truth). detail_trigger
#     placement / label come from the catalog entry, not literals.
#   - Lossless preservation of the MDX source: popup_body_source = full
#     raw_content. u6 NEVER trims or summarizes; the body preview (excerpt
#     from container px budget) is composed by u7 downstream.
#   - Phase Z spacing direction: u6 binds a strategy that EXPANDS capacity
#     (popup escalation) instead of shrinking common margins.

# Strategy id used when the unit carries no popup escalation marker.
# Resolved against the catalog at bind time (yaml is source of truth).
POPUP_BINDING_NO_POPUP_STRATEGY_ID = "inline_full"

# Strategy id used when the unit carries has_popup=True (deterministic
# choice: the preview body is a px-budget excerpt of the original, the
# popup body holds the FULL original per the CLAUDE.md view-details principle).
# u5 q3: preview_chars is deterministic from container px telemetry; that
# is an excerpt-from-original pattern, which matches
# ``inline_preview_with_details``. ``details_only`` (summary-only body)
# is the alternative future axis when an AI/summarizer is available.
POPUP_BINDING_ESCALATED_STRATEGY_ID = "inline_preview_with_details"


def bind_popup_display_strategy(unit) -> dict:
    """Bind catalog popup display strategy to a zone payload (IMP-35 u6).

    Reads the unit-side ``has_popup`` + ``popup_escalation_plan`` markers
    stamped by Step 17 POPUP gate (u5) and produces a zone payload dict
    that u7 wires into the renderer. The catalog
    (``display_strategies.yaml``) is the source of truth for both the
    strategy entry and the detail_trigger placement / label; the only
    literals in code are the two catalog keys held in the module constants.

    Args:
        unit: a CompositionUnit (or any duck-typed object exposing
            ``has_popup`` / ``popup_escalation_plan`` / ``raw_content``).
            ``has_popup`` defaults to False when the attribute is absent
            (units that never went through the Step 17 POPUP gate).

    Returns:
        zone payload binding dict (see module-level u6 contract block
        immediately above for the full schema).

    Raises:
        RuntimeError: if the chosen catalog strategy id is missing from
            the loaded ``DISPLAY_STRATEGIES`` mapping, or if a
            popup-binding strategy does not declare
            ``preserves_original``. Defensive guard: yaml drift would
            otherwise surface downstream as a KeyError on a stale string
            literal. The constants
            ``POPUP_BINDING_NO_POPUP_STRATEGY_ID`` /
            ``POPUP_BINDING_ESCALATED_STRATEGY_ID`` must always resolve
            against the loaded catalog; this is checked on every call.
    """
    has_popup = bool(getattr(unit, "has_popup", False))
    plan = getattr(unit, "popup_escalation_plan", None)
    raw_content = getattr(unit, "raw_content", "") or ""

    strategy_id = (
        POPUP_BINDING_ESCALATED_STRATEGY_ID
        if has_popup
        else POPUP_BINDING_NO_POPUP_STRATEGY_ID
    )
    meta = DISPLAY_STRATEGIES.get(strategy_id)
    if meta is None:
        raise RuntimeError(
            f"bind_popup_display_strategy: catalog drift — strategy id "
            f"{strategy_id!r} is missing from display_strategies.yaml. "
            f"Loaded keys: {sorted(DISPLAY_STRATEGIES)}."
        )

    if not has_popup:
        return {
            "display_strategy": strategy_id,
            "popup_body_source": None,
            "detail_trigger": None,
            "preserves_original": bool(meta.get("preserves_original")),
            "has_popup": False,
            "popup_escalation_plan": None,
            "strategy_meta": meta,
        }

    # has_popup=True path. preserves_original MUST be True per the catalog
    # absolute user lock; defensive guard against yaml drift.
    if not meta.get("preserves_original"):
        raise RuntimeError(
            f"bind_popup_display_strategy: catalog invariant violated — "
            f"popup-binding strategy {strategy_id!r} has preserves_original="
            f"{meta.get('preserves_original')!r}; MDX 원문 무손실 보존 "
            f"requires preserves_original=True (오답노트 #5 / "
            f"IMPROVEMENT-REDESIGN.md §3.6 line 110)."
        )
    trigger_meta = meta.get("detail_trigger") or {}
    return {
        "display_strategy": strategy_id,
        # Lossless preservation of the MDX source: popup body = full raw_content (verbatim).
        "popup_body_source": raw_content,
        "detail_trigger": {
            "placement": trigger_meta.get("placement"),
            "label": trigger_meta.get("label"),
        },
        "preserves_original": True,
        "has_popup": True,
        "popup_escalation_plan": plan,
        "strategy_meta": meta,
    }
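

# Illustration only: how a duck-typed unit maps to the two binding branches.
# This is a reduced sketch under stated assumptions: SKETCH_CATALOG and
# sketch_bind() are hypothetical, and the detail_trigger placement / label
# values are invented for the example (the real ones come from the yaml).

```python
from types import SimpleNamespace

# Hypothetical catalog fragment (assumed values; NOT the real yaml entries).
SKETCH_CATALOG = {
    "inline_full": {"preserves_original": True},
    "inline_preview_with_details": {
        "preserves_original": True,
        "detail_trigger": {"placement": "bottom_right", "label": "Details"},
    },
}


def sketch_bind(unit) -> dict:
    """Reduced binding: strategy id, popup body source and trigger only."""
    has_popup = bool(getattr(unit, "has_popup", False))
    strategy_id = "inline_preview_with_details" if has_popup else "inline_full"
    meta = SKETCH_CATALOG[strategy_id]
    if not has_popup:
        return {
            "display_strategy": strategy_id,
            "has_popup": False,
            "popup_body_source": None,
            "detail_trigger": None,
        }
    trigger = meta.get("detail_trigger") or {}
    return {
        "display_strategy": strategy_id,
        "has_popup": True,
        # The popup body is the full source text, never a trimmed copy.
        "popup_body_source": getattr(unit, "raw_content", "") or "",
        "detail_trigger": {
            "placement": trigger.get("placement"),
            "label": trigger.get("label"),
        },
    }


# A unit that never passed the POPUP gate has no has_popup attribute at all.
plain = sketch_bind(SimpleNamespace(raw_content="alpha\nbeta"))
escalated = sketch_bind(SimpleNamespace(has_popup=True, raw_content="alpha\nbeta"))
print(plain["display_strategy"], escalated["display_strategy"])
```

# The missing attribute is treated as has_popup=False, so units that skipped
# the gate stay on the inline path without any special casing by the caller.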


# ─── IMP-35 (#64) u7: Pipeline composer -> render_slide wiring ──
#
# Stage 2 wiring contract (unit u7):
#   u6 (``bind_popup_display_strategy``) produced the deterministic zone
#   binding from the unit-side marker stamped by Step 17 POPUP gate (u5).
#   u7 wires that binding into the pipeline composer's zones_data so the
#   render_slide call site (and downstream slide_base.html consumer u8)
#   sees three uniform render-context field names per zone:
#
#     has_popup    : bool. Escalation marker echo.
#     popup_html   : str | None. Popup body source (full ``raw_content``
#                    per u6; u8 wraps it in ``<details>/<summary>``).
#                    ``None`` when has_popup=False.
#     preview_text : str | None. Px-budgeted excerpt of ``raw_content``
#                    shown in the body / inline_preview slot. NEVER trims
#                    inside a line (line-boundary cut only), and the popup
#                    body retains the FULL original (lossless preservation
#                    of the MDX source). ``None`` when has_popup=False.
#
#   The full u6 binding is also echoed on the zone dict under
#   ``popup_binding`` so downstream debug / catalog-aware consumers can
#   self-explain without re-reading the yaml.
#
# Why the preview is a deterministic line-budget cut (u5 q3 resolution):
#   The popup body holds the FULL original verbatim, so the preview loses
#   no information; it just truncates at a deterministic boundary that
#   fits the container height telemetry. Container telemetry source is the
#   per-unit ``min_height_px`` (frame visual_hints), which is what the
#   pipeline composer already knows at the zones_data append site.
#
#   We never re-summarize, never AI-call, never reorder. A char-budget cut
#   would risk splitting CJK words mid-word; a line-boundary cut is
#   the closest deterministic surface to ``raw_content`` semantics
#   (MDX paragraph / bullet boundaries).
#
# Guardrails honored:
#   - feedback_ai_isolation_contract: pure deterministic helper. No
#     anthropic import, no AI fallback router path.
#   - Lossless preservation of the MDX source: the preview is a CUT, never
#     a rewrite; the popup body stays equal to ``raw_content``.
#   - feedback_no_hardcoding: the line metric is parametric (line_height_px
#     defaults to the slide_base.html body line metric ~18 px = 11 px font *
#     1.6 line-height + ~0.4 px ascent guard). u9 will surface the literal
#     value source.

# Line height in px used to convert a container-height budget into a
# line-count budget. Matches slide_base.html ``--font-body`` (11 px) at
# the ``.text-line`` line-height (1.6). This is a default, not a fixed
# constant: ``compute_popup_preview_text`` accepts an override so the
# downstream renderer (u8) or per-frame contracts can pass a tighter
# value if a frame uses a smaller body font.
POPUP_PREVIEW_DEFAULT_LINE_HEIGHT_PX = 18.0


def compute_popup_preview_text(
    raw_content: str,
    container_height_px: float,
    *,
    line_height_px: float = POPUP_PREVIEW_DEFAULT_LINE_HEIGHT_PX,
) -> str:
    """Px-budgeted preview excerpt of ``raw_content`` (IMP-35 u7).

    Deterministic line-boundary cut: returns the leading lines of
    ``raw_content`` that fit within ``container_height_px`` at the slide
    body line metric. Never trims inside a line (no mid-CJK-word cut);
    the popup body (u6 ``popup_body_source``) retains the FULL original
    verbatim so this excerpt loses no information.

    Args:
        raw_content: the unit's source MDX content; the popup body
            source per the CLAUDE.md view-details principle.
        container_height_px: container height telemetry. The pipeline
            composer passes ``min_height_px`` (frame visual_hints) at
            the zones_data append site. Non-positive values fall back
            to returning the full content unchanged (popup gate would
            not have fired without a real container budget anyway).
        line_height_px: px per body line. Default matches slide_base.html
            ``.text-line`` (11 px font * 1.6 line-height + guard).
            Overridable for tighter-font frames.

    Returns:
        The leading lines that fit the budget, joined with ``"\\n"``. At
        least one line is always kept. If the content already fits,
        returns ``raw_content`` unchanged.
    """
    if not raw_content:
        return ""
    if container_height_px <= 0 or line_height_px <= 0:
        # No budget signal: return the full content unchanged. u5 POPUP
        # gate would not have fired without a real container budget, so
        # this branch is only reachable for non-popup units (where the
        # preview is unused anyway; see compose_zone_popup_payload).
        return raw_content
    max_lines = int(container_height_px // line_height_px)
    if max_lines < 1:
        max_lines = 1
    lines = raw_content.splitlines(keepends=False)
    if len(lines) <= max_lines:
        return raw_content
    # Re-join with "\n": splitlines drops the terminator, so the leading
    # lines are rebuilt with "\n".join(...). For "\n"-terminated content
    # this reproduces the exact head of raw_content up to the chosen line
    # boundary; other terminators that splitlines recognizes (e.g. "\r\n")
    # are normalized to "\n" in the preview.
    return "\n".join(lines[:max_lines])
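

# Illustration only: a worked example of the px-to-line budget arithmetic
# described above. sketch_preview() is a hypothetical, self-contained copy of
# the same cut rule, and the 60 px / 10 px container heights are assumed
# values picked for the example, not real frame telemetry.

```python
def sketch_preview(raw_content: str, container_height_px: float,
                   line_height_px: float = 18.0) -> str:
    """Keep the leading lines that fit the height budget (at least one)."""
    if not raw_content:
        return ""
    if container_height_px <= 0 or line_height_px <= 0:
        return raw_content  # no usable budget signal: leave content as-is
    max_lines = max(1, int(container_height_px // line_height_px))
    lines = raw_content.splitlines()
    if len(lines) <= max_lines:
        return raw_content
    return "\n".join(lines[:max_lines])


source = "line 1\nline 2\nline 3\nline 4\nline 5"

# 60 px // 18 px = 3 whole lines fit.
preview = sketch_preview(source, 60.0)
# 10 px // 18 px = 0, clamped up to 1 so the preview is never empty.
tiny = sketch_preview(source, 10.0)
# 200 px // 18 px = 11 >= 5 lines, so the content is returned unchanged.
roomy = sketch_preview(source, 200.0)

print(repr(preview))
print(repr(tiny))
print(roomy == source)
```

# Because the cut only happens on line boundaries, the preview is always a
# prefix of the source for "\n"-separated content, which is what lets the
# popup body (the full source) act as a strict superset of the preview.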


def compose_zone_popup_payload(unit, container_height_px: float) -> dict:
    """Compose the per-zone popup render-context payload (IMP-35 u7).

    Reads u6 ``bind_popup_display_strategy(unit)`` and surfaces the three
    uniform render-context field names the pipeline composer attaches to
    each zone in ``zones_data``. The full u6 binding is also echoed
    under ``popup_binding`` so downstream debug / u8 / u9 consumers can
    self-explain without re-reading the yaml.

    Args:
        unit: a CompositionUnit (or any duck-typed object exposing
            ``has_popup`` / ``popup_escalation_plan`` / ``raw_content``).
        container_height_px: container height telemetry. The pipeline
            composer passes ``min_height_px`` at the zones_data append
            site. The non-popup branch ignores the value (preview_text
            is always None when has_popup=False).

    Returns:
        Dict with the four wiring keys (``has_popup``, ``popup_html``,
        ``preview_text``, ``popup_binding``). Spreadable into a zone
        dict via ``zones_data.append({..., **payload})``.
    """
    binding = bind_popup_display_strategy(unit)
    has_popup = bool(binding.get("has_popup"))
    if not has_popup:
        return {
            "has_popup": False,
            "popup_html": None,
            "preview_text": None,
            "popup_binding": binding,
        }
    raw_content = getattr(unit, "raw_content", "") or ""
    popup_html = binding.get("popup_body_source")
    preview_text = compute_popup_preview_text(raw_content, container_height_px)
    return {
        "has_popup": True,
        # popup body = FULL raw_content (u6 popup_body_source). u8 wraps
        # this in <details>/<summary> markup on slide_base.html.
        "popup_html": popup_html,
        # body preview = px-budgeted line-boundary cut of raw_content.
        # NEVER trims inside a line; popup body holds the FULL original
        # so this excerpt loses no information.
        "preview_text": preview_text,
        # Full u6 binding echo: downstream debug surfaces (catalog
        # detail_trigger placement, popup_escalation_plan category /
        # rationale) without re-reading yaml.
        "popup_binding": binding,
    }


# ─── CompositionUnit ────────────────────────────────────────────

@dataclass
class CompositionUnit:
    """One zone candidate within a slide = MDX section(s) + the matched frame.

    source_section_ids : 1 entry = single, 2+ = merged
    merge_type :
        - "single" : a single section
        - "parent_merged" : a parent V4 entry exists (v0)
        - "parent_merged_inferred" : no parent V4 entry; inferred from child evidence (v0.1)
    frame_* : V4 evidence as-is (no catalog metadata, no hardcoding)
    score : overall score
    rationale : score breakdown trace
    review_required : if True, not auto-selected; exposed in debug only and handled
                      through a separate path (light_edit / restructure /
                      AI restructuring) after user/AI review
    review_reasons : why review_required is True (for self-verification: child label
                     mix / template_id mismatch / cardinality incompatibility, etc.)

    Note: review_required / review_reasons are not defined as fields below;
    see auto_selectable / filter_reasons.
    """
    source_section_ids: list[str]
    merge_type: str
    frame_template_id: str
    frame_id: str
    frame_number: int
    confidence: float
    label: str  # use_as_is / light_edit / restructure / reject
    phase_z_status: str
    raw_content: str
    title: str
    v4_rank: Optional[int] = None
    selection_path: str = "rank_1"
    fallback_reason: Optional[str] = None
    score: float = 0.0
    rationale: dict = field(default_factory=dict)

    # Automatic pipeline stage state (no review/UI concept; for now only automatic
    # decisions + explicit failure records).
    # auto_selectable=False excludes the unit from the auto-selection stage;
    # filter_reasons records why.
    # e.g. W1/W2/W3 of parent_merged_inferred (rep status / all reject / majority not-auto-renderable)
    # User/AI review is handled in a separate layer (interactive editor). This dataclass
    # is complete for automatic decisions.
    auto_selectable: bool = True
    filter_reasons: list[str] = field(default_factory=list)
    # Informational signals, independent of auto_selectable. An area a future axis may score.
    # e.g. "children disagree on rank-1 template_id" / "minority of children non-auto-renderable"
    notes: list[str] = field(default_factory=list)

    # Added for the Step 6-A axis (user lock 2026-05-08).
    # V4 candidate list (duck-typed V4Match shape; the composition module does not import
    # the V4Match dataclass, to avoid a circular dependency).
    # Attrs per entry: template_id / frame_id / frame_number / confidence / label.
    # List order = V4 rank (candidates[0] = rank-1 non-reject, matching the single
    # frame_template_id / frame_id / label / confidence fields; backward compat lock).
    # Length 0 = "no_non_reject_v4_candidate" signal (Step 9 application_plan input).
    v4_candidates: list = field(default_factory=list)

    # IMP-30 u2: provisional first-render flag. True when the V4Match
    # backing this unit was synthesized via lookup_v4_match_with_fallback
    # (allow_provisional=True) after chain_exhausted, or when u3 inserts
    # a last-resort provisional fill for an uncovered section. Carried as
    # data (not re-derived from label/selection_path downstream) so the
    # render path / status / zone template can surface "needs adaptation"
    # uniformly. Default False keeps non-provisional units byte-identical.
    provisional: bool = False


# ─── Heading Tree ──────────────────────────────────────────────

def derive_parent_id(section_id: str) -> Optional[str]:
    """Section id -> parent id derivation by V4 key convention.

    IMP-08 B-3 : canonical ordinal `${parent}-sub-${n}` recognised first;
    legacy decimal `04-2.1` kept as fallback alias path.

    Examples (illustrative, not rules) :
      - "03-1-sub-2" -> "03-1"   (canonical ordinal, IMP-08)
      - "04-2.1"     -> "04-2"   (decimal suffix, legacy V4 key style)
      - "04-1"       -> None     (top-level, no parent)
      - "04"         -> None
    """
    m = re.fullmatch(r"(.+?)-sub-(\d+)", section_id)
    if m:
        return m.group(1)
    parts = section_id.split("-", 1)
    if len(parts) != 2:
        return None
    mdx_id, suffix = parts
    if "." in suffix:
        parent_suffix = suffix.split(".")[0]
        return f"{mdx_id}-{parent_suffix}"
    return None


def build_heading_tree(sections) -> dict:
    """Section list → tree {section_id: {section, children}}."""
    tree = {s.section_id: {"section": s, "children": []} for s in sections}
    for s in sections:
        parent = derive_parent_id(s.section_id)
        if parent and parent in tree:
            tree[parent]["children"].append(s.section_id)
    return tree
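
# Illustrative sketch (not part of the module): how the two heading-tree helpers
# above behave on a handful of ids. `parent_of` and `heading_tree` are standalone
# copies written for this example, and the section ids are hypothetical. Sections
# are stood in for by SimpleNamespace objects, since only `section_id` is read.

```python
import re
from types import SimpleNamespace
from typing import Optional


def parent_of(section_id: str) -> Optional[str]:
    """Standalone copy of the parent-id rule (illustration only)."""
    m = re.fullmatch(r"(.+?)-sub-(\d+)", section_id)
    if m:
        return m.group(1)  # canonical ordinal: "<parent>-sub-<n>"
    mdx_id, sep, suffix = section_id.partition("-")
    if sep and "." in suffix:
        return f"{mdx_id}-{suffix.split('.')[0]}"  # legacy decimal: "04-2.1"
    return None  # top-level id, no parent


def heading_tree(sections) -> dict:
    """Attach each section to its parent only when the parent is itself a section."""
    tree = {s.section_id: {"section": s, "children": []} for s in sections}
    for s in sections:
        parent = parent_of(s.section_id)
        if parent and parent in tree:
            tree[parent]["children"].append(s.section_id)
    return tree


ids = ("04-2", "04-2.1", "04-2.2", "03-1-sub-1")
sections = [SimpleNamespace(section_id=sid) for sid in ids]
children = {sid: node["children"] for sid, node in heading_tree(sections).items()}
print(children)
```

# Note that "03-1-sub-1" derives the parent "03-1", but because "03-1" is not in
# the section list it is simply left unattached rather than raising.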


# ─── Candidate Generation ──────────────────────────────────────

def _apply_capacity_fit(candidate: CompositionUnit, capacity_fit_fn) -> None:
    """Apply the capacity_fit_fn result to the candidate's rationale, auto_selectable, and filter_reasons.

    A fit_status of 'ok' / 'no_contract' / 'unknown_source_shape' does not affect auto_selectable
    (no_contract is handled separately: the catalog-only mapper raises ValueError).
    Any other status (strict_mismatch / exceeds_max / below_min / exceeds_truncate) marks a
    candidate that would suffer silent loss or trigger a mapper FitError, so it gets
    auto_selectable=False plus a filter_reasons entry 'C1: ...'.
    """
    if capacity_fit_fn is None:
        return
    fit = capacity_fit_fn(candidate.frame_template_id, candidate.raw_content)
    candidate.rationale["capacity_fit"] = fit
    if fit["fit_status"] in {"ok", "no_contract", "unknown_source_shape"}:
        return
    candidate.auto_selectable = False
    candidate.filter_reasons.append(
        f"C1: capacity mismatch ({fit['fit_status']}) — {fit['mismatch_reason']}"
    )
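
# Illustrative sketch (not part of the module): the shape of an injected
# capacity_fit_fn and the effect of the C1 gate above. `toy_capacity_fit`, its
# "max 3 bullet items" rule, and the frame id "toy_frame" are assumptions made up
# for this example; the real fit function and its thresholds live elsewhere.
# Only the two dict keys read by the gate (fit_status, mismatch_reason) are modeled.

```python
from types import SimpleNamespace

PASS_THROUGH = {"ok", "no_contract", "unknown_source_shape"}


def toy_capacity_fit(template_id: str, content: str) -> dict:
    """Hypothetical fit function: a frame holds at most 3 bullet items."""
    n_items = sum(1 for line in content.splitlines() if line.startswith("- "))
    if n_items > 3:
        return {"fit_status": "exceeds_max",
                "mismatch_reason": f"{n_items} items > max 3"}
    return {"fit_status": "ok", "mismatch_reason": None}


def apply_gate(candidate, fit_fn) -> None:
    """Simplified restatement of the gate: record the fit, then filter on mismatch."""
    fit = fit_fn(candidate.frame_template_id, candidate.raw_content)
    candidate.rationale["capacity_fit"] = fit
    if fit["fit_status"] in PASS_THROUGH:
        return
    candidate.auto_selectable = False
    candidate.filter_reasons.append(
        f"C1: capacity mismatch ({fit['fit_status']}): {fit['mismatch_reason']}"
    )


def make_candidate(content: str):
    return SimpleNamespace(frame_template_id="toy_frame", raw_content=content,
                           rationale={}, auto_selectable=True, filter_reasons=[])


fits = make_candidate("- a\n- b")
overflows = make_candidate("- a\n- b\n- c\n- d")
apply_gate(fits, toy_capacity_fit)
apply_gate(overflows, toy_capacity_fit)
print(fits.auto_selectable, overflows.auto_selectable, overflows.filter_reasons)
```

# The overflowing candidate stays in the pool with its reason attached, so a
# debug summary can still report why it was not auto-selected.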


def collect_candidates(sections, v4_lookup_fn, v4_label_to_status: dict,
                       auto_renderable_statuses: Optional[set[str]] = None,
                       capacity_fit_fn=None,
                       v4_candidates_lookup_fn=None):
    """Generate composition candidates.

    v0.1 candidate types :
      1. single                  : one per leaf section (V4 entry required)
      2. parent_merged           : the parent itself has a V4 entry (parent matched directly)
      3. parent_merged_inferred  : no parent V4 entry; the representative template_id is
                                   inferred from child evidence

    Principles :
      - No hardcoding of specific section_id / template_id / frame values
      - Every decision = derive_parent_id() + V4 evidence + v4_label_to_status mapping
        + injected fns (parametric)

    Args:
        sections : output of the align step
        v4_lookup_fn : (section_id) → V4Match | None (rank-1 only, kept for compatibility)
        v4_label_to_status : V4 label → Phase Z status mapping
        auto_renderable_statuses : set of statuses allowed to auto-render (input to the W1/W3 checks)
        capacity_fit_fn : Optional (template_id, content) → fit dict.
            When provided, it is applied to every candidate; a capacity mismatch sets
            auto_selectable=False (blocks silent truncation / mapper FitError up front).
        v4_candidates_lookup_fn : Optional (section_id) → list[V4Match].
            Step 6-A axis (user lock 2026-05-08). List of up to N non-reject candidates.
            When provided, fills the v4_candidates field on every candidate.
            When None, v4_candidates = [] (backward compat).
            This fn absorbs the V4 raw dict structure; the composition module does not
            know the V4 yaml shape.

    Returns:
        list[CompositionUnit]
    """
    if auto_renderable_statuses is None:
        auto_renderable_statuses = set()

    def _v4_cands(section_id: str) -> list:
        # Empty list when v4_candidates_lookup_fn is not provided (backward compat).
        return v4_candidates_lookup_fn(section_id) if v4_candidates_lookup_fn else []

    candidates = []

    # 1. Separate
    for s in sections:
        match = v4_lookup_fn(s.section_id)
        if match is None:
            continue
        c = CompositionUnit(
            source_section_ids=[s.section_id],
            merge_type="single",
            frame_template_id=match.template_id,
            frame_id=match.frame_id,
            frame_number=match.frame_number,
            confidence=match.confidence,
            label=match.label,
            phase_z_status=v4_label_to_status.get(match.label, "unknown"),
            v4_rank=getattr(match, "v4_rank", None),
            selection_path=getattr(match, "selection_path", "rank_1"),
            fallback_reason=getattr(match, "fallback_reason", None),
            raw_content=s.raw_content,
            title=s.title,
            v4_candidates=_v4_cands(s.section_id),
            provisional=getattr(match, "provisional", False),
        )
        _apply_capacity_fit(c, capacity_fit_fn)
        candidates.append(c)

    # Group children by parent
    parent_to_children: dict[str, list] = {}
    for s in sections:
        pid = derive_parent_id(s.section_id)
        if pid:
            parent_to_children.setdefault(pid, []).append(s)

    # 2. parent_merged (the parent itself is matched in V4)
    for pid, children in parent_to_children.items():
        parent_match = v4_lookup_fn(pid)
        if parent_match is None:
            continue  # handled by branch 3
        if len(children) < 2:
            continue  # nothing to merge
        merged_raw = "\n\n".join(c.raw_content for c in children)
        c_pm = CompositionUnit(
            source_section_ids=[c.section_id for c in children],
            merge_type="parent_merged",
            frame_template_id=parent_match.template_id,
            frame_id=parent_match.frame_id,
            frame_number=parent_match.frame_number,
            confidence=parent_match.confidence,
            label=parent_match.label,
            phase_z_status=v4_label_to_status.get(parent_match.label, "unknown"),
            v4_rank=getattr(parent_match, "v4_rank", None),
            selection_path=getattr(parent_match, "selection_path", "rank_1"),
            fallback_reason=getattr(parent_match, "fallback_reason", None),
            raw_content=merged_raw,
            title=pid,
            v4_candidates=_v4_cands(pid),
            provisional=getattr(parent_match, "provisional", False),
        )
        _apply_capacity_fit(c_pm, capacity_fit_fn)
        candidates.append(c_pm)

    # 3. parent_merged_inferred (v0.1): no parent V4 entry, based on child evidence
    for pid, children in parent_to_children.items():
        if v4_lookup_fn(pid) is not None:
            continue  # already handled by branch 2
        if len(children) < 2:
            continue
        # Use only the children that have a V4 match as evidence
        child_matches: list[tuple] = []
        for c in children:
            m = v4_lookup_fn(c.section_id)
            if m is not None:
                child_matches.append((c, m))
        if len(child_matches) < 2:
            continue  # at least 2 pieces of child evidence are required

        # representative = the child match with the highest confidence (v0.1.1 simple rule)
        # Future axes : top-k convergence, template family agreement, cardinality_fit, etc.
        rep_child, rep_match = max(child_matches, key=lambda cm: cm[1].confidence)

        # Whether automatic selection is possible = auto_selectable. Default True (strong inferred merge).
        # If any of the following weak signals is present, auto_selectable=False
        # (reason recorded in filter_reasons) :
        #   W1 : representative status is not auto-renderable → auto rendering itself is blocked
        #   W2 : every child is reject → the merge has no basis at all
        #   W3 : children with non-auto-renderable labels are the majority (>50%)
        # Informational notes (no effect on auto_selectable; reserved for future-axis scoring) :
        #   N1 : the children's rank-1 template_id values differ → top-k / family compat
        #   N2 : some (a minority of) children have non-auto-renderable labels
        rep_status = v4_label_to_status.get(rep_match.label, "unknown")
        child_labels = [m.label for _, m in child_matches]
        child_template_ids_unique = sorted({m.template_id for _, m in child_matches})
        n_children = len(child_matches)
        n_not_auto = sum(
            1 for l in child_labels
            if v4_label_to_status.get(l) not in auto_renderable_statuses
        )

        filter_reasons: list[str] = []
        notes: list[str] = []

        if rep_status not in auto_renderable_statuses:
            filter_reasons.append(
                f"W1: representative status '{rep_status}' (label={rep_match.label}) "
                f"not in auto_renderable_statuses={sorted(auto_renderable_statuses)}."
            )
        if all(l == "reject" for l in child_labels):
            filter_reasons.append(
                "W2: all children labeled 'reject' — merge has no fit basis."
            )
        if n_children > 0 and n_not_auto * 2 > n_children:
            non_auto_labels = sorted({
                l for l in child_labels
                if v4_label_to_status.get(l) not in auto_renderable_statuses
            })
            filter_reasons.append(
                f"W3: majority of children ({n_not_auto}/{n_children}) have "
                f"non-auto-renderable labels {non_auto_labels}."
            )

        if len(child_template_ids_unique) > 1:
            notes.append(
                f"N1: children's rank-1 template_id differs ({child_template_ids_unique}). "
                f"representative='{rep_match.template_id}' (highest child confidence). "
                f"top-k / family compatibility 평가는 future axis."
            )
        if 0 < n_not_auto <= n_children // 2:
            non_auto_labels_minority = sorted({
                l for l in child_labels
                if v4_label_to_status.get(l) not in auto_renderable_statuses
            })
            notes.append(
                f"N2: minority ({n_not_auto}/{n_children}) of children non-auto-renderable "
                f"({non_auto_labels_minority}). representative is auto-renderable, merge proceeds."
            )

        auto_selectable = len(filter_reasons) == 0

        merged_raw = "\n\n".join(c.raw_content for c, _ in child_matches)
        c_inf = CompositionUnit(
            source_section_ids=[c.section_id for c, _ in child_matches],
            merge_type="parent_merged_inferred",
            frame_template_id=rep_match.template_id,
            frame_id=rep_match.frame_id,
            frame_number=rep_match.frame_number,
            confidence=rep_match.confidence,
            label=rep_match.label,
            phase_z_status=rep_status,
            v4_rank=getattr(rep_match, "v4_rank", None),
            selection_path=getattr(rep_match, "selection_path", "rank_1"),
            fallback_reason=getattr(rep_match, "fallback_reason", None),
            raw_content=merged_raw,
            title=pid,
            auto_selectable=auto_selectable,
            filter_reasons=filter_reasons,
            notes=notes,
            # V4 candidate list of rep_child (same source as rep_match, consistent with frame_*).
            v4_candidates=_v4_cands(rep_child.section_id),
            # IMP-30 u2: rep_match drives frame selection so its provisional
            # flag flows here. If a non-rep child match is provisional but the
            # rep is not, this unit is not provisional (the rep frame is real).
            provisional=getattr(rep_match, "provisional", False),
        )
        _apply_capacity_fit(c_inf, capacity_fit_fn)
        candidates.append(c_inf)

    return candidates


# ─── Scoring ───────────────────────────────────────────────────

# v0 label weights: V4 label → score multiplier.
# To be extended when further axes (cardinality_fit / hierarchy_coherence / density) are added.
V0_LABEL_WEIGHT = {
    "use_as_is": 1.0,
    "light_edit": 0.7,
    "restructure": 0.4,
    "reject": 0.0,
}


def score_candidate(c: CompositionUnit) -> CompositionUnit:
    """v0 scoring : confidence × label_weight.

    Axes to be added later (the rationale only reserves their slots) :
    - cardinality_fit : item_count vs frame ideal/min/max
    - hierarchy_coherence : fitness of the merge_type
    - density_score : content density vs zone size
    """
    label_weight = V0_LABEL_WEIGHT.get(c.label, 0.0)
    frame_compat = c.confidence * label_weight
    c.score = frame_compat
    # Preserve the existing rationale (e.g. the capacity_fit inserted by collect_candidates)
    c.rationale.update({
        "frame_compat": round(frame_compat, 4),
        "confidence": c.confidence,
        "label": c.label,
        "label_weight": label_weight,
        "merge_type": c.merge_type,
        # placeholders for future axes
        "hierarchy_coherence": None,
        "density_score": None,
    })
    return c
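
# Illustrative sketch (not part of the module): worked numbers for the v0 score,
# confidence × label_weight. `v0_score` is a standalone helper written for this
# example and the confidence values are made up; the weights are the ones in
# V0_LABEL_WEIGHT above.

```python
LABEL_WEIGHT = {"use_as_is": 1.0, "light_edit": 0.7, "restructure": 0.4, "reject": 0.0}


def v0_score(confidence: float, label: str) -> float:
    """confidence × label weight; an unknown label falls back to weight 0.0."""
    return confidence * LABEL_WEIGHT.get(label, 0.0)


# Same confidence, different labels: the label weight alone sets the ordering.
by_label = {label: round(v0_score(0.9, label), 4) for label in LABEL_WEIGHT}

# A confident "restructure" match still ranks below a mediocre "use_as_is" match.
confident_restructure = v0_score(0.9, "restructure")
mediocre_use_as_is = v0_score(0.5, "use_as_is")
print(by_label, confident_restructure < mediocre_use_as_is)
```

# Because "reject" has weight 0.0, a rejected match scores 0 regardless of its
# confidence, and so does any label missing from the weight table.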


# ─── Selection ─────────────────────────────────────────────────

def select_composition_units(
    candidates,
    allowed_statuses: set[str],
    *,
    all_section_ids: Optional[list[str]] = None,
    allow_provisional_fill: bool = False,
) -> list[CompositionUnit]:
    """Greedy non-overlapping selection by score, with coverage tiebreak.

    1. Score every candidate
    2. filter :
       - phase_z_status ∈ allowed_statuses
       - auto_selectable=True (passes the W1/W2/W3 signals)
    3. Sort key = (score desc, number of source_section_ids desc)
       On a tie, the candidate that covers more sections wins.
       A parent_merged_inferred candidate beats a same-score single candidate on *coverage*.
    4. greedy : skip any candidate that contains an already-covered section
    5. Final selection = built up by filling the covered set

    auto_selectable=False candidates are never auto-selected. They remain in the debug
    candidates_summary. The user may handle them separately in the UI/editor layer
    (out of scope for the current v0).

    IMP-30 u3: last-resort provisional fill (opt-in via allow_provisional_fill):
        After the normal greedy pass, sections in ``all_section_ids`` that are
        still uncovered are filled with the highest-score *provisional*
        candidate (``c.provisional == True``) that includes at least one
        uncovered section and does not collide with already-covered ones. A
        provisional candidate's backing V4Match was synthesized via
        ``lookup_v4_match_with_fallback(allow_provisional=True)`` (IMP-30 u1)
        after chain_exhausted; its ``phase_z_status`` is therefore typically
        *outside* ``allowed_statuses`` (extract_matched_zone / fallback_candidate),
        which is why it gets filtered out of the normal greedy pass. The fill
        preserves first-render invariant for sections whose rank-1~3 are all
        restructure/reject. Default ``allow_provisional_fill=False`` keeps
        pre-u3 behavior byte-identical (IMP-05 regression guard).

    Args:
        candidates: full candidate pool from collect_candidates().
        allowed_statuses: phase_z_status set considered auto-renderable.
        all_section_ids: ordered section id list (only consulted when
            allow_provisional_fill=True; required for coverage check).
        allow_provisional_fill: opt-in for last-resort provisional fill.
    """
    scored = [score_candidate(c) for c in candidates]
    viable = [
        c for c in scored
        if c.phase_z_status in allowed_statuses and c.auto_selectable
    ]
    viable.sort(key=lambda c: (c.score, len(c.source_section_ids)), reverse=True)

    selected = []
    covered = set()
    for c in viable:
        if any(sid in covered for sid in c.source_section_ids):
            continue
        selected.append(c)
        covered.update(c.source_section_ids)

    # IMP-30 u3: last-resort provisional fill (opt-in, default off).
    # Honors first-render invariant by surfacing chain_exhausted sections as
    # provisional zones instead of dropping them. Skip reasons on
    # non-provisional filtered candidates are preserved (not mutated here).
    if allow_provisional_fill and all_section_ids:
        uncovered = {sid for sid in all_section_ids if sid not in covered}
        if uncovered:
            provisional_pool = [
                c for c in scored
                if c.provisional
                and any(sid in uncovered for sid in c.source_section_ids)
            ]
            provisional_pool.sort(
                key=lambda c: (c.score, len(c.source_section_ids)),
                reverse=True,
            )
            for c in provisional_pool:
                if any(sid in covered for sid in c.source_section_ids):
                    continue
                selected.append(c)
                covered.update(c.source_section_ids)

    return selected
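
# Illustrative sketch (not part of the module): the greedy pass with the coverage
# tiebreak, reduced to its core. `Cand` and `greedy_select` are stand-ins written
# for this example (status filtering and the provisional fill are left out), and
# the ids and scores are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Cand:
    ids: list      # section ids the candidate covers
    score: float   # already-computed score


def greedy_select(cands: list) -> list:
    """Pick by (score desc, coverage desc), skipping anything that overlaps."""
    ordered = sorted(cands, key=lambda c: (c.score, len(c.ids)), reverse=True)
    selected, covered = [], set()
    for c in ordered:
        if covered.intersection(c.ids):
            continue  # at least one section is already taken
        selected.append(c)
        covered.update(c.ids)
    return selected


pool = [
    Cand(["04-1"], 0.8),
    Cand(["04-2.1"], 0.6),
    Cand(["04-2.1", "04-2.2"], 0.6),  # ties with the single above on score
    Cand(["04-2.2"], 0.5),
]
picked = [c.ids for c in greedy_select(pool)]
print(picked)
```

# The merged candidate ties with the "04-2.1" single at 0.6 and wins because it
# covers two sections; both singles it overlaps are then skipped.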


# ─── Layout Preset Selection ───────────────────────────────────

def select_layout_preset(units: list[CompositionUnit]) -> Optional[str]:
    """v0 : count-based default selection.

    1 unit → single
    2 units → horizontal-2 (default; vertical-2 branches off once an aspect signal is added)
    3 units → top-1-bottom-2 (default; other 3-zone variants branch off once a content-weight signal is added)
    4 units → grid-2x2

    v0 limitations :
    - aspect / content-weight signals are not considered → 2 units are always horizontal,
      3 units are always top-1-bottom-2
    - to be refined once weights are computed from unit.raw_content
    """
    n = len(units)
    if n == 0:
        return None
    if n == 1:
        return "single"
    if n == 2:
        return "horizontal-2"
    if n == 3:
        return "top-1-bottom-2"
    if n == 4:
        return "grid-2x2"
    raise ValueError(
        f"Composition v0 : layout for {n} units not supported (max 4). "
        "Larger counts require split-into-multiple-slides decision (future)."
    )
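
# Illustrative sketch (not part of the module): the same count-to-preset mapping
# expressed as a lookup table, which may be easier to extend when more presets
# are added. `PRESET_BY_COUNT` and `preset_for` are hypothetical names; the
# module itself uses the if-chain above.

```python
from typing import Optional

PRESET_BY_COUNT = {
    1: "single",
    2: "horizontal-2",
    3: "top-1-bottom-2",
    4: "grid-2x2",
}


def preset_for(unit_count: int) -> Optional[str]:
    """Return the default preset for a unit count, None for an empty slide."""
    if unit_count == 0:
        return None
    if unit_count not in PRESET_BY_COUNT:
        raise ValueError(f"layout for {unit_count} units not supported (max 4)")
    return PRESET_BY_COUNT[unit_count]


presets = [preset_for(n) for n in range(5)]
try:
    preset_for(5)
    too_many_rejected = False
except ValueError:
    too_many_rejected = True
print(presets, too_many_rejected)
```

# Callers that can produce more than 4 units need to catch the ValueError or cap
# the unit count beforehand, as the re-split helper further down does.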


# ─── Public entry: composition pipeline ────────────────────────

def plan_composition(sections, v4_lookup_fn, v4_label_to_status: dict,
                     allowed_statuses: set[str],
                     capacity_fit_fn=None,
                     v4_candidates_lookup_fn=None,
                     *,
                     allow_provisional_fill: bool = False) -> tuple[list[CompositionUnit], Optional[str], dict]:
    """Composition planner v0.2 entry.

    v0.2 changes :
      - When capacity_fit_fn is injected, every candidate gets a capacity pre-check
        (blocks silent truncation / mapper FitError up front). On mismatch:
        auto_selectable=False + filter_reason 'C1: ...'.

    Step 6-A axis (user lock 2026-05-08) :
      - When v4_candidates_lookup_fn is injected, v4_candidates is filled on every CompositionUnit.
        No logic change: the single frame_template_id / frame_id / label / confidence stay as they are.
        Runtime results are unchanged. Schema extension for the Step 9 application_plan input.

    IMP-30 u3: last-resort provisional fill (opt-in, default off):
      ``allow_provisional_fill`` is plumbed to select_composition_units().
      When True, uncovered sections receive a provisional fill from candidates
      whose backing V4Match was synthesized via ``allow_provisional=True``
      (IMP-30 u1). ``_candidate_state`` returns ``selected_provisional`` for
      those filled units so the debug summary distinguishes greedy selections
      from provisional fills. Default False keeps IMP-05 behavior identical.

    v0.1 / v0.1.1 behavior (retained) :
      - parent_merged_inferred candidates are generated (even without a parent V4 entry)
      - No review concept. Decisions are automatic, using only auto_selectable + filter_reasons
      - selection : score desc + coverage-advantage tiebreak

    Returns:
        units : automatically selected composition units
        layout_preset : one of the 8 vocabulary entries (or None)
        debug : all candidates + capacity_fit + filter_reasons + preset decision rationale
    """
    candidates = collect_candidates(
        sections, v4_lookup_fn, v4_label_to_status,
        auto_renderable_statuses=allowed_statuses,
        capacity_fit_fn=capacity_fit_fn,
        v4_candidates_lookup_fn=v4_candidates_lookup_fn,
    )
    scored_all = [score_candidate(c) for c in candidates]

    units = select_composition_units(
        candidates,
        allowed_statuses,
        all_section_ids=[s.section_id for s in sections] if allow_provisional_fill else None,
        allow_provisional_fill=allow_provisional_fill,
    )
    preset = select_layout_preset(units)

    def _candidate_state(c: CompositionUnit) -> str:
        if c in units:
            # IMP-30 u3: provisional-fill units surface as a distinct state so
            # downstream debug consumers can tell greedy selection apart from
            # last-resort fill. unit.provisional flows from u1 (V4Match
            # synthesis) → u2 (CompositionUnit propagation).
            if c.provisional:
                return "selected_provisional"
            return "selected"
        if c.phase_z_status not in allowed_statuses:
            return "filtered_status"  # V4 label → status not auto-renderable
        if not c.auto_selectable:
            # Distinguish capacity from weak by the filter_reasons prefix
            if any(r.startswith("C") for r in c.filter_reasons):
                return "filtered_capacity"  # C1 (capacity mismatch)
            return "filtered_weak"  # W1/W2/W3 (parent_merged_inferred only)
        return "filtered_lost"  # was viable but lost out to a coverage collision

    candidates_summary = [
        {
            "source_section_ids": c.source_section_ids,
            "merge_type": c.merge_type,
            "template_id": c.frame_template_id,
            "label": c.label,
            "phase_z_status": c.phase_z_status,
            "v4_rank": c.v4_rank,
            "selection_path": c.selection_path,
            "fallback_reason": c.fallback_reason,
            "score": c.score,
            "selection_state": _candidate_state(c),
            "auto_selectable": c.auto_selectable,
            "filter_reasons": list(c.filter_reasons),
            "notes": list(c.notes),
            "capacity_fit": c.rationale.get("capacity_fit"),
        }
        for c in scored_all
    ]

    merge_candidates = [
        s for s in candidates_summary
        if s["merge_type"] in {"parent_merged", "parent_merged_inferred"}
    ]
    capacity_mismatches = [
        s for s in candidates_summary
        if s["selection_state"] == "filtered_capacity"
    ]

    debug = {
        "planner_version": "v0.2",
        "selection_rule": (
            "score desc, then source_section_ids count desc (coverage tiebreak). "
            "filter = phase_z_status ∉ allowed_statuses OR auto_selectable=False. "
            "auto_selectable=False 사유 : C1 (capacity mismatch — silent truncate / FitError 차단), "
            "W1 (rep not auto-renderable), W2 (all children reject), W3 (majority children non-auto-renderable)."
        ),
        "candidates_total": len(scored_all),
        "candidates_viable_auto": len([
            c for c in scored_all
            if c.phase_z_status in allowed_statuses and c.auto_selectable
        ]),
        "candidates_summary": candidates_summary,
        "merge_candidates": merge_candidates,
        "capacity_mismatches": capacity_mismatches,
        "selected_units_count": len(units),
        "layout_preset": preset,
        "layout_preset_rationale": (
            f"v0 count-based: {len(units)} units → {preset}"
            if preset else "no viable units"
        ),
    }

    return units, preset, debug


# ─── IMP-48: Re-split All-Reject Merges (#77, Stage 2 / u1~u3) ─────

def resplit_all_reject_merges(
    units: list[CompositionUnit],
    sections,
    v4_lookup_fn,
    v4_label_to_status: dict,
    allowed_statuses: set[str],
    *,
    capacity_fit_fn=None,
    v4_candidates_lookup_fn=None,
    section_assignment_override: bool = False,
) -> tuple[list[CompositionUnit], dict]:
    """Re-split merged composition units whose rank-1 V4 label is ``reject``.

    IMP-48 (#77): Step 6 post-pass that decomposes a merged unit
    (``parent_merged`` / ``parent_merged_inferred``) carrying ``label=reject``
    into per-section singles, so child sections with non-reject rank-1 V4
    evidence can flow through the normal use_as_is / light_edit / restructure
    paths instead of being handed to IMP-47B (#76) as a single blob.

    Stage 2 / u3 slice (current revision) :
      u1 contract (detection scan + override skip + idempotent single-
      exclusion) + u2 per-section Branch-1 rebuild (each rebuilt single
      carries ``merge_type="single"`` + the section's OWN rank-1 V4
      evidence via ``v4_lookup_fn`` + the section's original
      ``raw_content`` from ``sections``) are both preserved. u3 adds the
      gating + swap path :

      1. **Coverage equality**: every child section in
         ``source_section_ids`` MUST rebuild successfully. Any
         ``section_not_found`` / ``no_v4_match`` rebuild result
         short-circuits that merged unit to ``reason="incomplete_rebuild"``.
      2. **Beneficial split**: at least one rebuilt single MUST have
         ``label != "reject"`` (Stage 2 Q2 Codex YES: "≥1 section
         gains non-reject frame"). Otherwise that merged unit
         short-circuits to ``reason="no_beneficial_split"`` and IMP-47B (#76)
         handles the merge directly.
      3. **Layout cap (≤ 4 units)**: projected post-split unit count
         (across ALL detected merges that would split) MUST be ≤ 4.
         Otherwise EVERY would-be split is aborted with
         ``reason="layout_cap_exceeded"`` (Stage 2 Q2 default: keep
         merged, no partial split; v0 ``select_layout_preset`` supports
         1~4 units max).
      4. **Telemetry**: every single produced by an APPLIED split has
         ``selection_path="resplit_from_merge"`` (Stage 1 Q3 YES,
         additive field reuse, no schema add).
      5. **Audit payload**: ``audit["applied"]`` reflects whether ANY
         merge actually split. ``audit["split_units"]`` /
         ``audit["skipped_units"]`` capture per-merge decisions.
         ``audit["post_split_unit_count"]`` reflects the returned list
         length. ``audit["post_split_layout_preset"]`` is filled via
         ``select_layout_preset(out_units)`` when ``applied=True``,
         None otherwise (u5 also re-derives in pipeline scope).

      ``out_units`` is the post-resplit unit list (merged removed +
      singles inserted, in original ordering). When no merge splits,
      ``out_units`` is byte-identical to input ``units`` and
      ``applied=False``; the audit's ``skipped_reason`` becomes
      ``"no_split_applied"``.

    Detection signal (★ no-hardcoding, AI=0) :
      ``merge_type ∈ {"parent_merged", "parent_merged_inferred"}``
      AND ``label == "reject"``
      AND ``len(source_section_ids) >= 2``

      Signal uses only ``merge_type`` + ``label`` + section count, never
      section_id, template_id, MDX filename, or sample identifier.

    Override skip (Stage 2 Q1: kwarg per Codex YES) :
      ``section_assignment_override=True`` makes the helper a no-op.
      User-driven ``zoneSections`` (#6 IMP-06) is the ground truth and must not
      be second-guessed by an automatic re-split.

    Idempotency (max_retry=1, Stage 2 lock) :
      u2's rebuilt units carry ``merge_type="single"``, which is excluded
      from the detection filter by construction. A second pass through
      this helper finds nothing: no inner loop, no recursion.

    Frame-swap guardrail (★ feedback_ai_isolation_contract) :
      u2 rebuilds each child section's single from its OWN rank-1 V4
      evidence via ``v4_lookup_fn``. The merged unit's parent /
      representative ``template_id`` is discarded along with the merge
      itself: no swap of one section's frame onto another section.

    Args:
        units: composition units from ``plan_composition()``.
        sections: original section list (forwarded to u2 for per-section
            ``raw_content`` lookup; merged units carry the joined string,
            not the individual child source).
        v4_lookup_fn: ``(section_id) -> V4Match | None`` (rank-1). Forwarded
            to u2; identical evidence source as ``plan_composition``.
        v4_label_to_status: V4 label → Phase Z status mapping (forwarded).
        allowed_statuses: auto-renderable status set (forwarded).
        capacity_fit_fn: optional capacity fit injector (forwarded to u2).
        v4_candidates_lookup_fn: optional Step 6-A candidates fn (forwarded).
        section_assignment_override: True iff user supplied
            ``zoneSections`` / ``section_assignment_plan`` (IMP-06 chain).

    Returns:
        ``(out_units, audit)`` :
          ``out_units`` = post-resplit units (u1: identical to input).
          ``audit`` = ``imp48_resplit`` payload following Stage 1 schema::

              {
                "applied": bool,                 # u1: always False
                "split_units": [...],            # u3 fills with per-section singles
                "skipped_units": [...],          # u3 fills with kept-merged + reason
                "post_split_unit_count": int,
                "post_split_layout_preset": Optional[str],
                "skipped_reason": str,           # u1: contract-stage reason
                "detected_units": [...],         # u1: u2's rebuild targets
              }
    """
    # ``allowed_statuses`` is forwarded for signature symmetry with
    # ``plan_composition`` but unused inside the helper. Stage 2 / Codex YES
    # fixed the beneficial-split threshold to ``single.label != "reject"``
    # (Stage 1 contract "non-reject rank-1"). Future axes may widen the
    # threshold using ``allowed_statuses``; until then the parameter is
    # explicitly deleted to silence lint without losing the public contract.
    del allowed_statuses

    audit: dict = {
        "applied": False,
        "split_units": [],
        "skipped_units": [],
        "post_split_unit_count": len(units),
        "post_split_layout_preset": None,
        "detected_units": [],
        "rebuild_attempts": [],
    }

    if section_assignment_override:
        audit["skipped_reason"] = "section_assignment_override"
        return units, audit

    detected = [
        u for u in units
        if u.merge_type in {"parent_merged", "parent_merged_inferred"}
        and u.label == "reject"
        and len(u.source_section_ids) >= 2
    ]
    audit["detected_units"] = [
        {
            "source_section_ids": list(u.source_section_ids),
            "merge_type": u.merge_type,
            "template_id": u.frame_template_id,
            "label": u.label,
        }
        for u in detected
    ]
    if not detected:
        audit["skipped_reason"] = "no_detection"
        return units, audit

    # u2: per-section Branch-1 rebuild for each detected merged-reject unit.
    # Mirrors ``collect_candidates`` Branch 1 (single per section). Each rebuilt
    # single carries the section's OWN rank-1 V4 evidence; the merged unit's
    # parent/representative template_id is discarded along with the merge.
    #   ★ feedback_ai_isolation_contract : no frame swap (each section's own V4).
    #   ★ MDX_raw_content_invariant      : raw_content taken from sections list.
    #   ★ idempotency                    : merge_type="single" excludes singles
    #                                      from re-detection on any later pass.
    section_by_id = {s.section_id: s for s in sections}

    def _v4_cands(section_id: str) -> list:
        return v4_candidates_lookup_fn(section_id) if v4_candidates_lookup_fn else []

    rebuild_attempts: list[dict] = []
    for merged_unit in detected:
        section_singles: list[dict] = []
        for sid in merged_unit.source_section_ids:
            section = section_by_id.get(sid)
            if section is None:
                section_singles.append({
                    "section_id": sid,
                    "build_result": "section_not_found",
                    "unit": None,
                })
                continue
            match = v4_lookup_fn(sid)
            if match is None:
                section_singles.append({
                    "section_id": sid,
                    "build_result": "no_v4_match",
                    "unit": None,
                })
                continue
            single = CompositionUnit(
                source_section_ids=[sid],
                merge_type="single",
                frame_template_id=match.template_id,
                frame_id=match.frame_id,
                frame_number=match.frame_number,
                confidence=match.confidence,
                label=match.label,
                phase_z_status=v4_label_to_status.get(match.label, "unknown"),
                v4_rank=getattr(match, "v4_rank", None),
                selection_path=getattr(match, "selection_path", "rank_1"),
                fallback_reason=getattr(match, "fallback_reason", None),
                raw_content=section.raw_content,
                title=section.title,
                v4_candidates=_v4_cands(sid),
                provisional=getattr(match, "provisional", False),
            )
            _apply_capacity_fit(single, capacity_fit_fn)
            score_candidate(single)
            section_singles.append({
                "section_id": sid,
                "build_result": "ok",
                "unit": single,
            })
        rebuild_attempts.append({
            "merged_source_section_ids": list(merged_unit.source_section_ids),
            "merged_merge_type": merged_unit.merge_type,
            "merged_template_id": merged_unit.frame_template_id,
            "section_singles": section_singles,
        })

    audit["rebuild_attempts"] = rebuild_attempts
|
||
|
||
    # u3 — gating + swap path.
    # Per-merge decision: split | skip(reason). Then a cumulative layout-cap
    # check aborts ALL would-be splits if projected post-split count > 4
    # (Stage 2 Q2 default — keep merged, no partial split; v0
    # ``select_layout_preset`` supports 1~4 units max).
    plans: list[dict] = []
    for merged_unit, attempt in zip(detected, rebuild_attempts):
        required_sids = set(merged_unit.source_section_ids)
        built_sids = {
            entry["section_id"]
            for entry in attempt["section_singles"]
            if entry["build_result"] == "ok"
        }
        if built_sids != required_sids:
            # Some sections failed to rebuild — coverage equality violated.
            # IMP-47B (#76) will handle the merged unit directly.
            plans.append({
                "merged": merged_unit,
                "decision": "skip",
                "reason": "incomplete_rebuild",
                "missing": sorted(required_sids - built_sids),
            })
            continue
        built_units = [
            entry["unit"]
            for entry in attempt["section_singles"]
            if entry["build_result"] == "ok"
        ]
        non_reject_count = sum(1 for u in built_units if u.label != "reject")
        if non_reject_count == 0:
            # No child section gains a non-reject frame — split is not
            # beneficial. IMP-47B (#76) handles the merge directly.
            plans.append({
                "merged": merged_unit,
                "decision": "skip",
                "reason": "no_beneficial_split",
            })
            continue
        plans.append({
            "merged": merged_unit,
            "decision": "split",
            "singles": built_units,
            "non_reject_count": non_reject_count,
        })

    # Cumulative layout-cap projection across all would-be splits.
    projected_count = len(units)
    for plan in plans:
        if plan["decision"] == "split":
            projected_count += len(plan["singles"]) - 1
    if projected_count > 4:
        for plan in plans:
            if plan["decision"] == "split":
                plan["decision"] = "skip"
                plan["reason"] = "layout_cap_exceeded"
                plan["projected_count"] = projected_count

    # Build out_units by walking the input list once. Identity match by
    # ``id(unit)`` keeps the swap deterministic and preserves order.
    plan_by_unit_id = {id(plan["merged"]): plan for plan in plans}
    out_units: list[CompositionUnit] = []
    applied = False
    for unit in units:
        plan = plan_by_unit_id.get(id(unit))
        if plan is None:
            out_units.append(unit)
            continue
        if plan["decision"] == "split":
            applied = True
            for single in plan["singles"]:
                # ★ Stage 1 Q3 YES — additive telemetry tag, no schema add.
                # Overrides the v4 match's selection_path for split-produced
                # singles only; non-resplit code paths are unaffected.
                single.selection_path = "resplit_from_merge"
            out_units.extend(plan["singles"])
            audit["split_units"].append({
                "merged_source_section_ids": list(plan["merged"].source_section_ids),
                "merged_template_id": plan["merged"].frame_template_id,
                "non_reject_count": plan["non_reject_count"],
                "split_singles": [
                    {
                        "section_id": s.source_section_ids[0],
                        "template_id": s.frame_template_id,
                        "label": s.label,
                        "phase_z_status": s.phase_z_status,
                    }
                    for s in plan["singles"]
                ],
            })
        else:  # skip
            out_units.append(unit)
            skip_entry: dict = {
                "merged_source_section_ids": list(plan["merged"].source_section_ids),
                "merged_template_id": plan["merged"].frame_template_id,
                "reason": plan["reason"],
            }
            if plan["reason"] == "incomplete_rebuild":
                skip_entry["missing_section_ids"] = list(plan["missing"])
            if plan["reason"] == "layout_cap_exceeded":
                skip_entry["projected_post_split_count"] = plan["projected_count"]
            audit["skipped_units"].append(skip_entry)

    audit["applied"] = applied
    audit["post_split_unit_count"] = len(out_units)
    if applied:
        # ``select_layout_preset`` is deterministic on unit count (v0).
        # u5 (pipeline) re-derives layout preset over the same out_units list;
        # both values stay consistent by construction.
        audit["post_split_layout_preset"] = select_layout_preset(out_units)
        audit.pop("skipped_reason", None)
    else:
        audit["post_split_layout_preset"] = None
        audit["skipped_reason"] = "no_split_applied"

    return out_units, audit
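
The cumulative layout-cap check in the function above is all-or-nothing: every would-be split adds `len(singles) - 1` units to the running total, and if the projection exceeds 4, every split plan is downgraded to a skip. The sketch below isolates that rule so it can be read on its own. It is a minimal illustration, not the project's code: `Plan`, `apply_layout_cap`, and `MAX_LAYOUT_UNITS` are hypothetical names introduced here, and the dataclass stands in for the plan dicts used in the real function.

```python
from dataclasses import dataclass, field
from typing import Optional

# Assumption: mirrors the literal "> 4" cap in the function above.
MAX_LAYOUT_UNITS = 4


@dataclass
class Plan:
    """Hypothetical stand-in for the per-merge plan dicts."""
    decision: str                           # "split" or "skip"
    singles: list = field(default_factory=list)
    reason: Optional[str] = None


def apply_layout_cap(unit_count: int, plans: list) -> int:
    """Project the post-split unit count and abort ALL splits if over the cap.

    Returns the projected count. Plans are mutated in place, matching the
    in-place mutation of the plan dicts in the original function.
    """
    projected = unit_count
    for plan in plans:
        if plan.decision == "split":
            # One merged unit is replaced by len(singles) single units.
            projected += len(plan.singles) - 1
    if projected > MAX_LAYOUT_UNITS:
        for plan in plans:
            if plan.decision == "split":
                plan.decision = "skip"
                plan.reason = "layout_cap_exceeded"
    return projected


# Over the cap: 3 units, two merges splitting into 2 and 3 singles -> 3 + 1 + 2 = 6.
over = [Plan("split", ["a", "b"]), Plan("split", ["c", "d", "e"])]
over_projected = apply_layout_cap(3, over)

# Under the cap: 2 units, one merge splitting into 2 singles -> 2 + 1 = 3.
under = [Plan("split", ["a", "b"])]
under_projected = apply_layout_cap(2, under)
```

In the first case both plans end up as `skip` with reason `layout_cap_exceeded`, even though either split alone would have fit. That matches the "keep merged, no partial split" default noted in the u3 comment. In the second case the plan stays `split`.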