C.E.L_Slide_test2/src/phase_z2_composition.py
kyeongmin f3ef4d917c feat(#64): IMP-35 details_popup_escalation u1~u10 + Stage 3 R7 anchor re-pin
Land the production + test surface for the Step 17 cascade POPUP terminal
(DETERMINISTIC -> POPUP -> AI_REPAIR -> USER_OVERRIDE) per Stage 2 plan R2.
u11 (baseline-red invariance gate) was already landed in 7c93031 ahead of
this commit; this commit completes u1~u10 plus the Stage 3 R7 follow-up
anchor re-pin for test_imp17_comment_anchor.py.

Implementation units (Stage 2 R2 contract):
  u1  frame_reselect_insufficient failure_type + post-frame remeasure (q4)
        - src/phase_z2_failure_router.py, src/phase_z2_pipeline.py
  u2  NEXT_ACTION_BY_FAILURE row + impl_status flip
        - src/phase_z2_failure_router.py
  u3  Router details_popup_escalation MISSING->IMPLEMENTED + executor stub
        - src/phase_z2_router.py
  u4  step17.py AI split-decision contract (POPUP cascade_stage +
      route_for_label + skip_reason); API gated
        - src/phase_z2_ai_fallback/step17.py
  u5  Step 17 POPUP gate executor; popup_escalation_plan + has_popup marker
        - src/phase_z2_pipeline.py, src/phase_z2_ai_fallback/step17.py
  u6  Composition popup binding -- yaml strategy -> zone payload
        - src/phase_z2_composition.py
  u7  Pipeline composer -> render_slide wiring
      (popup_html / preview_text / has_popup)
        - src/phase_z2_pipeline.py
  u8  slide_base.html <details>/<summary> popup wrapper
        - templates/phase_z2/slide_base.html
  u9  display_strategies.yaml inline_preview + popup metadata
        - templates/phase_z2/regions/display_strategies.yaml
  u10 MDX preservation invariant: popup=full source / body=summary or subset
        (asserted by tests/phase_z2/test_popup_mdx_preservation.py)
  u11 (already in 7c93031) -- baseline-red invariance gate

Stage 3 R7 follow-up (anchor re-pin, test-only):
  - tests/orchestrator_unit/test_imp17_comment_anchor.py
    Pre-anchor additions in src/phase_z2_pipeline.py (u1 / u5 / u7) shifted
    the restructure/reject route-hint comments 578/579 -> 586/587. Re-pinned
    the two guard tests (and docstring re-pin lineage 564 -> 570 -> 578 ->
    586). Production code untouched.

Verification (Stage 4 R1):
  pytest -q tests/orchestrator_unit/test_imp17_comment_anchor.py
    -> 2 passed / 0.02s
  pytest -q <10 IMP-35 unit files in tests/phase_z2 + tests/phase_z2_ai_fallback>
    -> 136 passed / 15.94s
  Baseline-red invariance gate
    (tests/test_imp47b_step12_ai_wiring.py +
     tests/test_phase_z2_ai_fallback_config.py)
    -> 4 failed / 6 passed; FAILED set === IMP35_BASELINE_RED_NODE_IDS
    (frozen registry from 7c93031). Contract holds.
  Codex Stage 4 R1 = YES (independent verify).

Guardrails honored:
  - MDX content preservation: popup carries full source, body holds
    summary or subset only (CLAUDE.md "view details" principle;
    feedback_phase_z_spacing_direction -- capacity expanded, no margin shrink).
  - AI isolation contract: Step 17 POPUP gate is deterministic; AI hook
    surface is split-decision contract only, API call gated.
  - No hardcoding: escalation thresholds derived from existing overflow
    detector outputs; preview_chars deterministic from container px.
  - 1 commit = 1 decision unit: u1~u10 land together as the planned
    production surface; u11 was deliberately split into 7c93031 as Stage 3
    R7 carve-out, and the R7 anchor re-pin rides with this commit because
    it is the direct shift consequence of the u1/u5/u7 pre-anchor additions.
  - Scope-locked: .claude/settings.json explicitly excluded
    (Stage 4 exit report contract).

Out of scope (per Stage 1 + Stage 2):
  - AI_REPAIR API activation (post IMP-35 axis).
  - IMP-34 zone resize, IMP-36 responsive fit (chain partners,
    separate issues).
  - Print-time auto-expand JavaScript for <details>.
  - Popup escalation in stages other than Step 17.
  - Baseline-red body repair (4 frozen failures) -- separate follow-up
    issue; u11 only guards the count.
  - frame_reselect algorithm changes (entry point only).
  - templates/phase_z2/slide_base.html path rename.

source_comment_ids:
  Stage 1: claude_stage1_problem_review_imp35, codex_stage1_verification_imp35_yes
  Stage 2: Claude #4 R2 plan, Codex #5 R2 YES
  Stage 3: Claude #86 (R7 anchor re-pin), Codex #87 YES
  Stage 4: Claude #88 R1, Codex #89 R1 YES

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-23 07:36:57 +09:00

"""Phase Z-2 Composition Planner v0.

The missing layer of the pipeline: the decision layer that groups MDX chunks
into *final zone units*.

Position:
    parse_mdx -> align_sections_to_v4_granularity -> [this module] -> render

Principles (absolute rules):
    - No hardcoding of a specific MDX / frame / section
      (e.g. "if it is 04-2" / "if it is F16").
    - Every decision is parametric on catalog metadata + V4 evidence.
    - The same code handles MDX 02/03/04/05/06...; only the result differs per case.
    - Drilling results are the input (raw material); composition planner
      results are the output (zone units).
    - Slide-level layout divides only down to zones. Splitting inside a zone
      is the responsibility of the frame partial.

8 layout preset vocabulary:
    L1 single / L2 horizontal-2 / L3 vertical-2
    L4 top-1-bottom-2 / L5 top-2-bottom-1
    L6 left-1-right-2 / L7 left-2-right-1
    L8 grid-2x2
"""
from __future__ import annotations
import re
from dataclasses import dataclass, field
from pathlib import Path
from typing import Optional
import yaml
# ─── 8 Layout Preset Vocabulary: catalog-loaded (user lock 2026-05-07) ───
#
# Source of truth = templates/phase_z2/layouts/layouts.yaml (human-readable; can be added to / edited).
# The hardcoded dict in code was retired (Step 7-A catalog migration). No logic change; backward compatible.
#
# The extra catalog fields (render_ready / default_selection / candidate_when) are
# ignored by existing callers; they become inputs once Step 7-B (multiple
# candidates) and Step 9 (layout x frame fit eval) are entered.
_LAYOUTS_CATALOG_PATH = (
    Path(__file__).resolve().parent.parent
    / "templates" / "phase_z2" / "layouts" / "layouts.yaml"
)


def load_layout_presets() -> dict[str, dict]:
    """Load 8 layout presets from catalog.

    Backward compat: returns same dict shape as old hardcoded LAYOUT_PRESETS.
    Keys = layout id (single / horizontal-2 / ...),
    each value contains zones / topology / positions / css_areas / css_cols / css_rows.
    Additional fields (render_ready / default_selection / candidate_when) are
    ignored by existing callers and consumed by Step 7-B / Step 9 (separate axis).
    """
    with open(_LAYOUTS_CATALOG_PATH, encoding="utf-8") as f:
        return yaml.safe_load(f) or {}


LAYOUT_PRESETS: dict[str, dict] = load_layout_presets()


def select_layout_candidates(unit_count: int) -> list[str]:
    """Return layout id candidates matching given unit_count.

    Step 7-B (user lock 2026-05-07): generation of multiple candidates.

    Args:
        unit_count: Final layout placement unit count (Step 4 output),
            i.e. section_count + promoted lead_orphans and similar.
            NOT the raw MDX section count from Step 2.

    Returns:
        List of layout ids matching candidate_when.unit_count.
        Sort order:
            1. default_selection: true first (catalog definition order)
            2. default_selection: false next (catalog definition order)
        Layouts with render_ready: false are excluded.

    Raises:
        ValueError: if unit_count < 1 or > 4 (current catalog scope).

    Note:
        Call site wired (Step 7-conn 2026-05-08): the step07 artifact in
        phase_z2_pipeline.py records this function's result (passive). The
        existing select_layout_preset() still makes the default decision.
        Candidate evaluation / auto decision belongs to Step 9 v1 (separate axis).
    """
    if unit_count < 1 or unit_count > 4:
        raise ValueError(
            f"unit_count {unit_count} out of catalog scope [1, 4]"
        )
    defaults: list[str] = []
    alternatives: list[str] = []
    for layout_id, spec in LAYOUT_PRESETS.items():
        if not spec.get("render_ready", False):
            continue
        cw = spec.get("candidate_when") or {}
        if cw.get("unit_count") != unit_count:
            continue
        if spec.get("default_selection", False):
            defaults.append(layout_id)
        else:
            alternatives.append(layout_id)
    return defaults + alternatives
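
The ordering rule of `select_layout_candidates` (defaults first, then alternatives, each in catalog order, with `render_ready: false` entries dropped) can be sketched without touching the real yaml. This is a minimal sketch only: `SAMPLE_PRESETS` and `candidates_for` are hypothetical names, and the field values are assumed for illustration, not copied from `layouts.yaml`.

```python
# Hypothetical in-memory stand-in for layouts.yaml; values are assumed.
SAMPLE_PRESETS = {
    "single": {
        "render_ready": True,
        "default_selection": True,
        "candidate_when": {"unit_count": 1},
    },
    "vertical-2": {
        "render_ready": True,
        "default_selection": False,
        "candidate_when": {"unit_count": 2},
    },
    "horizontal-2": {
        "render_ready": True,
        "default_selection": True,
        "candidate_when": {"unit_count": 2},
    },
    "grid-2x2": {
        "render_ready": False,  # excluded even though unit_count matches
        "default_selection": True,
        "candidate_when": {"unit_count": 4},
    },
}


def candidates_for(presets, unit_count):
    """Same filter + ordering as the function above, on an injected catalog."""
    if unit_count < 1 or unit_count > 4:
        raise ValueError(f"unit_count {unit_count} out of catalog scope [1, 4]")
    defaults, alternatives = [], []
    for layout_id, spec in presets.items():
        if not spec.get("render_ready", False):
            continue
        if (spec.get("candidate_when") or {}).get("unit_count") != unit_count:
            continue
        bucket = defaults if spec.get("default_selection", False) else alternatives
        bucket.append(layout_id)
    return defaults + alternatives


two_unit = candidates_for(SAMPLE_PRESETS, 2)
four_unit = candidates_for(SAMPLE_PRESETS, 4)
print(two_unit)   # default entry precedes the alternative
print(four_unit)  # empty: the only 4-unit layout is not render_ready
```

Passing the catalog as an argument, rather than reading the module-level dict, is just a convenience for the sketch; the production function reads `LAYOUT_PRESETS`.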


# ─── Region Layout Catalog: Step 8-B-1 (user lock 2026-05-07) ────────
#
# Source = templates/phase_z2/regions/region_layouts.yaml (SPEC §2.5).
# Load function + select_region_layout_candidates().
# Call site wired (Step 8-conn 2026-05-08): the step08 artifact in phase_z2_pipeline.py
# records this function's result (placeholder signals: region_count=1, because Step 3/4 are not yet present).
_REGION_LAYOUTS_CATALOG_PATH = (
    Path(__file__).resolve().parent.parent
    / "templates" / "phase_z2" / "regions" / "region_layouts.yaml"
)


def load_region_layouts() -> dict[str, dict]:
    """Load Internal Region layout catalog (SPEC §2.5, 6 entries).

    Returns same dict shape as catalog yaml.
    Same pattern as Step 7-A: source of truth = yaml, code only reads.
    """
    with open(_REGION_LAYOUTS_CATALOG_PATH, encoding="utf-8") as f:
        return yaml.safe_load(f) or {}


REGION_LAYOUTS: dict[str, dict] = load_region_layouts()


def select_region_layout_candidates(
    region_count: int,
    content_type_mix: Optional[list[str]] = None,
    details_presence: bool = False,
    role_pattern: Optional[str] = None,
    ratio_asymmetric: bool = False,
    flow_type: Optional[str] = None,
    has_visual_element: bool = False,
    large_table: bool = False,
    long_text: bool = False,
) -> list[str]:
    """Return Internal Region layout candidates per SPEC §2.5 decision tree.

    Step 8-B-1 (user lock 2026-05-07): candidate generation function.
    Difference from Step 7-B: SPEC §2.5 is a *sequential decision tree*
    (first match is adopted), whereas Step 7-B is plain matching (every
    entry with the same unit_count).

    Decision rule (sequential, first match wins); matches the catalog 1:1:
        1. region_count == 1                               -> region-single
        2. details_presence / large_table / long_text      -> region-preview-details
        3. region_count == 4 AND flow_type == 'parallel_4' -> region-grid-2x2
        4. region_count == 2 AND role_pattern ==
           'primary_supporting' AND ratio_asymmetric       -> region-main-support
        5. region_count == 2 AND has_visual_element        -> region-horizontal-split
        6. fallback (none of the above matched)            -> region-vertical-stack

    Sort:
        region_count == 1 -> [region-single] (no fallback)
        region_count >= 2 -> [match, region-vertical-stack] or [region-vertical-stack]

    Raises:
        ValueError: region_count < 1 or > 4 (SPEC §2.5 vocabulary scope).

    Note:
        Call site wired (Step 8-conn 2026-05-08): the step08 artifact in
        phase_z2_pipeline.py records this function's result. It currently
        depends on placeholder signals (region_count=1,
        content_type="text_block"); activating real signals is a separate
        Step 3/4 axis. Step 9 v0 (application_plan) interprets this
        candidate list as application_candidates.
    """
    if region_count < 1 or region_count > 4:
        raise ValueError(
            f"region_count {region_count} out of catalog scope [1, 4]"
        )
    fallback = "region-vertical-stack"
    # 1. region_count == 1
    if region_count == 1:
        return ["region-single"]
    # 2. details_presence / large_table / long_text
    if details_presence or large_table or long_text:
        match = "region-preview-details"
    # 3. region_count == 4 + parallel_4
    elif region_count == 4 and flow_type == "parallel_4":
        match = "region-grid-2x2"
    # 4. region_count == 2 + role_pattern primary_supporting + ratio_asymmetric
    elif (
        region_count == 2
        and role_pattern == "primary_supporting"
        and ratio_asymmetric
    ):
        match = "region-main-support"
    # 5. region_count == 2 + visual element
    elif region_count == 2 and has_visual_element:
        match = "region-horizontal-split"
    # 6. fallback
    else:
        return [fallback]
    # Matched + fallback (only one entry when match == fallback)
    if match == fallback:
        return [fallback]
    return [match, fallback]
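
The first-match-wins tree above can also be read as an ordered rule table. The following is a sketch under that reading, not the module's implementation: `region_candidates` is a hypothetical name, and it takes the same signals as keyword arguments so it runs without the yaml catalog.

```python
def region_candidates(
    region_count,
    *,
    details_presence=False,
    large_table=False,
    long_text=False,
    flow_type=None,
    role_pattern=None,
    ratio_asymmetric=False,
    has_visual_element=False,
):
    """Ordered (condition, layout) rules; the first true condition wins."""
    if not 1 <= region_count <= 4:
        raise ValueError(f"region_count {region_count} out of catalog scope [1, 4]")
    if region_count == 1:
        return ["region-single"]  # rule 1: no fallback appended
    fallback = "region-vertical-stack"
    rules = [
        (details_presence or large_table or long_text, "region-preview-details"),
        (region_count == 4 and flow_type == "parallel_4", "region-grid-2x2"),
        (
            region_count == 2
            and role_pattern == "primary_supporting"
            and ratio_asymmetric,
            "region-main-support",
        ),
        (region_count == 2 and has_visual_element, "region-horizontal-split"),
    ]
    for condition, layout in rules:
        if condition:
            return [layout, fallback]
    return [fallback]


# Rule 2 outranks rule 5 when both signals are present.
both_signals = region_candidates(2, long_text=True, has_visual_element=True)
# Rule 1 short-circuits even when long_text is set.
single_region = region_candidates(1, long_text=True)
# Nothing matches for 3 plain regions, so only the fallback is returned.
plain_three = region_candidates(3)
```

One consequence worth noting: because rule 1 returns before rule 2 is checked, a single region never receives `region-preview-details` from this function, whatever the `long_text` / `large_table` signals say.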


# ─── Display Strategy Catalog: Step 8-B-2 (user lock 2026-05-07) ────
#
# Source = templates/phase_z2/regions/display_strategies.yaml (4 entries).
# Load function + select_display_strategy_candidates().
# Call site wired (Step 8-conn 2026-05-08): the step08 artifact in phase_z2_pipeline.py
# records this function's result (placeholder signal: content_type="text_block", because Step 3/4 are not yet present).
_DISPLAY_STRATEGIES_CATALOG_PATH = (
    Path(__file__).resolve().parent.parent
    / "templates" / "phase_z2" / "regions" / "display_strategies.yaml"
)


def load_display_strategies() -> dict[str, dict]:
    """Load display strategy catalog (4 entries).

    Returns same dict shape as catalog yaml.
    Same pattern as Step 7-A / 8-B-1: source of truth = yaml, code only reads.
    """
    with open(_DISPLAY_STRATEGIES_CATALOG_PATH, encoding="utf-8") as f:
        return yaml.safe_load(f) or {}


DISPLAY_STRATEGIES: dict[str, dict] = load_display_strategies()
_KNOWN_CONTENT_TYPES = frozenset({
    "text_block", "table", "image", "details", "decorative_element",
})


def select_display_strategy_candidates(
    content_type: str,
    long_text: bool = False,
    large_table: bool = False,
    fits_in_region: Optional[bool] = None,
) -> list[str]:
    """Return display strategy candidates per catalog (display_strategies.yaml).

    Step 8-B-2 (user lock 2026-05-07): candidate generation function.
    Looks only at display_strategies.yaml (region_layouts / frame belong to
    the Step 9 axis).

    Hard filter (absolute constraints fixed in the catalog: applies_to / forbidden_for):
        - content_type must be in strategy.applies_to to be a candidate
        - content_type in strategy.forbidden_for is excluded automatically
        - Core user lock: text_block / table / image / details are NEVER dropped
          (fixed in the catalog's forbidden_for; lossless preservation of the source)

    Ranking (content_type + fit signal):
        decorative_element            -> [inline_full, dropped]
        image                         -> [inline_full]
        text_block / table / details
            long_text / large_table
            / fits_in_region == False -> [inline_preview_with_details,
                                          details_only, inline_full]
            otherwise                 -> [inline_full,
                                          inline_preview_with_details,
                                          details_only]

    Note:
        - fits_in_region is only a light hint. The actual overflow decision
          belongs to the Step 9/14/17 axis.
        - dropped ranks last for decorative_element (keep showing it until a
          lack-of-space signal arrives).

    Raises:
        ValueError: content_type is outside the catalog scope
            (anything other than text_block / table / image / details /
            decorative_element).

    Note:
        Call site wired (Step 8-conn 2026-05-08): the step08 artifact in
        phase_z2_pipeline.py records this function's result. It currently
        depends on a placeholder signal (content_type="text_block");
        activating real signals is a separate Step 3/4 axis.
        Step 9 v0 (application_plan) interprets this candidate list as the
        display_strategy axis of application_candidates.
    """
    if content_type not in _KNOWN_CONTENT_TYPES:
        raise ValueError(
            f"content_type {content_type!r} out of catalog scope "
            f"(known: {sorted(_KNOWN_CONTENT_TYPES)})"
        )
    # Hard filter: applies_to / forbidden_for (read directly from the catalog)
    eligible = set()
    for name, meta in DISPLAY_STRATEGIES.items():
        applies_to = meta.get("applies_to") or []
        forbidden_for = meta.get("forbidden_for") or []
        if content_type in applies_to and content_type not in forbidden_for:
            eligible.add(name)
    # Ranking: content_type + fit signal
    if content_type == "decorative_element":
        order = ["inline_full", "dropped"]
    else:
        escalate = long_text or large_table or fits_in_region is False
        if escalate:
            order = [
                "inline_preview_with_details",
                "details_only",
                "inline_full",
            ]
        else:
            order = [
                "inline_full",
                "inline_preview_with_details",
                "details_only",
            ]
    return [s for s in order if s in eligible]
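
How the hard filter and the ranking interact is easiest to see on a small catalog. The sketch below assumes an `applies_to` / `forbidden_for` layout that is consistent with the ranking table in the docstring; the actual contents of `display_strategies.yaml` are not shown here, so `SAMPLE_STRATEGIES` is an assumption and `rank_strategies` a hypothetical helper.

```python
_TEXTUAL = ["text_block", "table", "details"]

# Assumed catalog shape; only applies_to / forbidden_for are modelled.
SAMPLE_STRATEGIES = {
    "inline_full": {
        "applies_to": _TEXTUAL + ["image", "decorative_element"],
        "forbidden_for": [],
    },
    "inline_preview_with_details": {
        "applies_to": _TEXTUAL,
        "forbidden_for": ["image", "decorative_element"],
    },
    "details_only": {
        "applies_to": _TEXTUAL,
        "forbidden_for": ["image", "decorative_element"],
    },
    "dropped": {
        "applies_to": ["decorative_element"],
        "forbidden_for": _TEXTUAL + ["image"],
    },
}


def rank_strategies(catalog, content_type, *, long_text=False,
                    large_table=False, fits_in_region=None):
    """Filter by the catalog first, then order by the fit signals."""
    eligible = {
        name
        for name, meta in catalog.items()
        if content_type in (meta.get("applies_to") or [])
        and content_type not in (meta.get("forbidden_for") or [])
    }
    if content_type == "decorative_element":
        order = ["inline_full", "dropped"]
    elif long_text or large_table or fits_in_region is False:
        order = ["inline_preview_with_details", "details_only", "inline_full"]
    else:
        order = ["inline_full", "inline_preview_with_details", "details_only"]
    return [name for name in order if name in eligible]


image_rank = rank_strategies(SAMPLE_STRATEGIES, "image", long_text=True)
table_rank = rank_strategies(SAMPLE_STRATEGIES, "table", large_table=True)
deco_rank = rank_strategies(SAMPLE_STRATEGIES, "decorative_element")
```

Note that `fits_in_region=None` (unknown) does not escalate; only an explicit `False` does, which matches the `is False` comparison in the function above.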
# ─── IMP-35 (#64) u6 — Composition popup binding (yaml strategy -> zone payload) ─
#
# Stage 2 binding contract (unit u6):
# Step 17 POPUP gate (u5 in src/phase_z2_ai_fallback/step17.py) stamps
# ``unit.has_popup=True`` AND ``unit.popup_escalation_plan=<plan>`` on
# composition units whose overflow category routes to
# ``details_popup_escalation``. u6 is the composition-side binding that
# translates the unit-side marker into a deterministic zone payload
# structure that u7 (pipeline composer -> render_slide wiring) reads to
# emit the ``<details>/<summary>`` markup u8 will add to slide_base.html.
#
# Inputs (unit-side, all duck-typed via getattr):
# has_popup — bool (False default; u5 sets True on
# feasible escalation only)
# popup_escalation_plan — dict | None (u3 router plan from
# plan_details_popup_escalation; carries
# feasible / category / rationale /
# needs_split_decision)
# raw_content — str (the source MDX content; popup body
# source per CLAUDE.md 자세히보기 원칙)
#
# Outputs (zone payload binding dict):
# display_strategy — catalog strategy id read from
# display_strategies.yaml (NOT hardcoded).
# ``inline_full`` when has_popup=False.
# ``inline_preview_with_details`` when
# has_popup=True (preview = excerpt from
# container px budget downstream; popup body
# preserves the FULL original).
# popup_body_source — str | None — the FULL raw_content. u7 passes
# this verbatim to the renderer; the popup
# body is the MDX 원문 (자세히보기 원칙),
# never summarized in the body branch.
# None when has_popup=False.
# detail_trigger — dict | None — placement + label read from
# the catalog strategy entry's
# ``detail_trigger``. None when has_popup=False.
# preserves_original — bool — echoed from the catalog entry.
# MUST be True for popup-binding strategies
# (absolute user lock — 오답노트 #5 /
# IMPROVEMENT-REDESIGN.md §3.6 line 110).
# has_popup — bool — echoed for downstream multiplex.
# popup_escalation_plan — dict | None — echoed verbatim (u5 plan).
# Provides traceability into the router
# category + rationale for downstream debug.
# strategy_meta — dict — full catalog entry (description /
# applies_to / forbidden_for / detail_trigger)
# so downstream traces can self-explain without
# re-reading the yaml.
#
# Guardrails honored:
# - feedback_ai_isolation_contract — NO AI call. Reads catalog + unit
# state only. The deterministic POPUP gate (u5) already established
# the marker; this function is pure composition-side binding.
# - feedback_no_hardcoding — strategy id is the ONLY name reference, and
# it is the catalog key (yaml is source of truth). detail_trigger
# placement / label come from the catalog entry, not literals.
# - Lossless preservation of the MDX source: popup_body_source = full
#   raw_content. u6 NEVER trims or summarizes; the body preview (excerpt
#   from container px budget) is composed by u7 downstream.
# - Phase Z spacing direction: u6 binds a strategy that EXPANDS capacity
#   (popup escalation) instead of shrinking common margins.
# Strategy id used when the unit carries no popup escalation marker.
# Catalog read — yaml is source of truth.
POPUP_BINDING_NO_POPUP_STRATEGY_ID = "inline_full"
# Strategy id used when the unit carries has_popup=True (deterministic
# choice — the preview body is a px-budget excerpt of the original, the
# popup body holds the FULL original per CLAUDE.md 자세히보기 원칙).
# u5 q3 — preview_chars deterministic from container px telemetry; that
# is an excerpt-from-original pattern, which matches
# ``inline_preview_with_details``. ``details_only`` (summary-only body)
# is the alternative future axis when an AI/summarizer is available.
POPUP_BINDING_ESCALATED_STRATEGY_ID = "inline_preview_with_details"


def bind_popup_display_strategy(unit) -> dict:
    """Bind catalog popup display strategy to a zone payload (IMP-35 u6).

    Reads the unit-side ``has_popup`` + ``popup_escalation_plan`` markers
    stamped by Step 17 POPUP gate (u5) and produces a zone payload dict
    that u7 wires into the renderer. The catalog
    (``display_strategies.yaml``) is the source of truth for the
    detail_trigger placement / label; the strategy id itself comes from the
    two module constants above and is validated against the catalog.

    Args:
        unit: a CompositionUnit (or any duck-typed object exposing
            ``has_popup`` / ``popup_escalation_plan`` / ``raw_content``).
            ``has_popup`` defaults to False when the attribute is absent
            (units that never went through the Step 17 POPUP gate).

    Returns:
        zone payload binding dict (see module-level u6 contract block
        immediately above for the full schema).

    Raises:
        RuntimeError: if the chosen catalog strategy id is missing from
            the loaded ``DISPLAY_STRATEGIES`` mapping, or if the
            popup-binding strategy does not declare
            ``preserves_original``. Defensive guard: yaml drift would
            otherwise cause a downstream KeyError on a stale string
            literal. The constants
            ``POPUP_BINDING_NO_POPUP_STRATEGY_ID`` /
            ``POPUP_BINDING_ESCALATED_STRATEGY_ID`` must always resolve
            against the catalog; the check runs on each call.
    """
    has_popup = bool(getattr(unit, "has_popup", False))
    plan = getattr(unit, "popup_escalation_plan", None)
    raw_content = getattr(unit, "raw_content", "") or ""
    strategy_id = (
        POPUP_BINDING_ESCALATED_STRATEGY_ID
        if has_popup
        else POPUP_BINDING_NO_POPUP_STRATEGY_ID
    )
    meta = DISPLAY_STRATEGIES.get(strategy_id)
    if meta is None:
        raise RuntimeError(
            f"bind_popup_display_strategy: catalog drift — strategy id "
            f"{strategy_id!r} is missing from display_strategies.yaml. "
            f"Loaded keys: {sorted(DISPLAY_STRATEGIES)}."
        )
    if not has_popup:
        return {
            "display_strategy": strategy_id,
            "popup_body_source": None,
            "detail_trigger": None,
            "preserves_original": bool(meta.get("preserves_original")),
            "has_popup": False,
            "popup_escalation_plan": None,
            "strategy_meta": meta,
        }
    # has_popup=True path. preserves_original MUST be True per the catalog
    # absolute user lock; this is a defensive guard against yaml drift.
    if not meta.get("preserves_original"):
        raise RuntimeError(
            f"bind_popup_display_strategy: catalog invariant violated — "
            f"popup-binding strategy {strategy_id!r} has preserves_original="
            f"{meta.get('preserves_original')!r}; MDX 원문 무손실 보존 "
            f"requires preserves_original=True (오답노트 #5 / "
            f"IMPROVEMENT-REDESIGN.md §3.6 line 110)."
        )
    trigger_meta = meta.get("detail_trigger") or {}
    return {
        "display_strategy": strategy_id,
        # Lossless preservation of the MDX source: popup body = full
        # raw_content (verbatim).
        "popup_body_source": raw_content,
        "detail_trigger": {
            "placement": trigger_meta.get("placement"),
            "label": trigger_meta.get("label"),
        },
        "preserves_original": True,
        "has_popup": True,
        "popup_escalation_plan": plan,
        "strategy_meta": meta,
    }
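
The duck-typed contract of the binding (missing `has_popup` means "no popup", and catalog drift fails loudly) can be exercised with `types.SimpleNamespace` objects. This is a reduced sketch, not the function above: `bind_sketch` returns only a subset of the payload keys, and the `detail_trigger` placement / label in `SAMPLE_CATALOG` are invented values.

```python
from types import SimpleNamespace

# Assumed catalog entries; placement / label values are illustrative only.
SAMPLE_CATALOG = {
    "inline_full": {"preserves_original": True},
    "inline_preview_with_details": {
        "preserves_original": True,
        "detail_trigger": {"placement": "bottom_right", "label": "View details"},
    },
}


def bind_sketch(unit, catalog):
    """Reduced version of the binding: strategy choice, guards, popup body."""
    has_popup = bool(getattr(unit, "has_popup", False))
    strategy_id = "inline_preview_with_details" if has_popup else "inline_full"
    meta = catalog.get(strategy_id)
    if meta is None:
        raise RuntimeError(f"catalog drift: {strategy_id!r} missing")
    if not has_popup:
        return {"display_strategy": strategy_id, "popup_body_source": None,
                "detail_trigger": None, "has_popup": False}
    if not meta.get("preserves_original"):
        raise RuntimeError(f"{strategy_id!r} must set preserves_original")
    trigger = meta.get("detail_trigger") or {}
    return {
        "display_strategy": strategy_id,
        # The popup body is the full source, never a trimmed copy.
        "popup_body_source": getattr(unit, "raw_content", "") or "",
        "detail_trigger": {"placement": trigger.get("placement"),
                           "label": trigger.get("label")},
        "has_popup": True,
    }


plain_unit = SimpleNamespace(raw_content="- a\n- b")  # never passed the gate
escalated_unit = SimpleNamespace(raw_content="- a\n- b", has_popup=True)

plain_binding = bind_sketch(plain_unit, SAMPLE_CATALOG)
escalated_binding = bind_sketch(escalated_unit, SAMPLE_CATALOG)

# Simulated yaml drift: the escalated strategy entry has been removed.
try:
    bind_sketch(escalated_unit, {"inline_full": {"preserves_original": True}})
    drift_detected = False
except RuntimeError:
    drift_detected = True
```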
# ─── IMP-35 (#64) u7 — Pipeline composer -> render_slide wiring ──
#
# Stage 2 wiring contract (unit u7):
# u6 (``bind_popup_display_strategy``) produced the deterministic zone
# binding from the unit-side marker stamped by Step 17 POPUP gate (u5).
# u7 wires that binding into the pipeline composer's zones_data so the
# render_slide call site (and downstream slide_base.html consumer u8)
# sees three uniform render-context field names per zone:
#
# has_popup : bool — escalation marker echo
# popup_html : str — popup body source (full ``raw_content`` per u6;
# u8 wraps it in ``<details>/<summary>``).
# ``None`` when has_popup=False.
# preview_text : str — px-budgeted excerpt of ``raw_content`` shown in
# the body / inline_preview slot. NEVER trims
# inside a line — line-boundary cut only — and
# the popup body retains the FULL original
# (MDX 원문 무손실 보존). ``None`` when
# has_popup=False.
#
# The full u6 binding is also echoed on the zone dict under
# ``popup_binding`` so downstream debug / catalog-aware consumers can
# self-explain without re-reading the yaml.
#
# Why the preview is a deterministic line-budget cut (u5 q3 resolution):
# The popup body holds the FULL original verbatim, so the preview loses
# no information — it just truncates at a deterministic boundary that
# fits the container height telemetry. Container telemetry source is the
# per-unit ``min_height_px`` (frame visual_hints), which is what the
# pipeline composer already knows at the zones_data append site.
#
# We never re-summarize, never AI-call, never reorder. Char-budget cut
# would risk splitting CJK words mid-character — line-boundary cut is
# the closest deterministic surface to ``raw_content`` semantics
# (MDX paragraph / bullet boundaries).
#
# Guardrails honored:
# - feedback_ai_isolation_contract — pure deterministic helper. No
# anthropic import, no AI fallback router path.
# - Lossless preservation of the MDX source: preview is a CUT, never a
#   rewrite; popup body stays equal to ``raw_content``.
# - feedback_no_hardcoding — line metric is parametric (line_height_px
# defaults to slide_base.html body line metric ~18 px = 11 px font *
# 1.6 line-height + ~0.4 px ascent guard). u9 will surface the literal
# value source.
# Line height in px used to convert a container-height budget into a
# line-count budget. Matches slide_base.html ``--font-body`` (11 px) at
# the ``.text-line`` line-height (1.6). Default — NOT a hardcoded magic
# constant: ``compute_popup_preview_text`` accepts an override so the
# downstream renderer (u8) or per-frame contracts can pass a tighter
# value if a frame uses a smaller body font.
POPUP_PREVIEW_DEFAULT_LINE_HEIGHT_PX = 18.0


def compute_popup_preview_text(
    raw_content: str,
    container_height_px: float,
    *,
    line_height_px: float = POPUP_PREVIEW_DEFAULT_LINE_HEIGHT_PX,
) -> str:
    """Px-budgeted preview excerpt of ``raw_content`` (IMP-35 u7).

    Deterministic line-boundary cut: returns the leading lines of
    ``raw_content`` that fit within ``container_height_px`` at the slide
    body line metric. Never trims inside a line (no mid-CJK-word cut);
    the popup body (u6 ``popup_body_source``) retains the FULL original
    verbatim so this excerpt loses no information.

    Args:
        raw_content: the unit's source MDX content; the popup body
            source per the CLAUDE.md "view details" principle.
        container_height_px: container height telemetry. The pipeline
            composer passes ``min_height_px`` (frame visual_hints) at
            the zones_data append site. Non-positive values fall back
            to returning the full content unchanged (popup gate would
            not have fired without a real container budget anyway).
        line_height_px: px per body line. Default matches slide_base.html
            ``.text-line`` (11 px font * 1.6 line-height + guard).
            Overridable for tighter-font frames.

    Returns:
        The leading lines that fit the budget, joined with "\\n". If the
        content already fits, returns ``raw_content`` unchanged. At least
        one line is always returned for non-empty content.
    """
    if not raw_content:
        return ""
    if container_height_px <= 0 or line_height_px <= 0:
        # No budget signal: return the full content unchanged. u5 POPUP
        # gate would not have fired without a real container budget, so
        # this branch is only reachable for non-popup units (where the
        # preview is anyway unused; see compose_zone_popup_payload).
        return raw_content
    max_lines = int(container_height_px // line_height_px)
    if max_lines < 1:
        max_lines = 1
    lines = raw_content.splitlines(keepends=False)
    if len(lines) <= max_lines:
        return raw_content
    # Re-join with "\n": splitlines drops the terminator, so the leading
    # lines are rebuilt with "\n".join(...). This reproduces the head of
    # raw_content up to the chosen line boundary when the source uses
    # "\n" line endings.
    return "\n".join(lines[:max_lines])
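
A worked example of the line budget: with the 18 px default, a 60 px container gives `60 // 18 = 3` lines, and any positive height below 18 px is clamped up to one line. The block below is a self-contained restatement of the same arithmetic for illustration; `preview_lines` is a hypothetical name and the sample text is made up.

```python
def preview_lines(raw_content, container_height_px, line_height_px=18.0):
    """Leading lines of raw_content that fit the height budget."""
    if not raw_content:
        return ""
    if container_height_px <= 0 or line_height_px <= 0:
        return raw_content  # no budget signal: leave the content alone
    max_lines = max(1, int(container_height_px // line_height_px))
    lines = raw_content.splitlines()
    if len(lines) <= max_lines:
        return raw_content
    return "\n".join(lines[:max_lines])


source = "Overview\n- point one\n- point two\n- point three\n- point four"

three_lines = preview_lines(source, 60)    # 60 // 18 -> 3 lines kept
clamped = preview_lines(source, 10)        # 10 // 18 -> 0, clamped to 1 line
unbudgeted = preview_lines(source, 0)      # non-positive height: unchanged
fits_already = preview_lines(source, 200)  # 11 lines of budget: unchanged
```

The cut always lands between lines, so a bullet is either shown whole or not shown at all; the full text remains available in the popup body.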


def compose_zone_popup_payload(unit, container_height_px: float) -> dict:
    """Compose the per-zone popup render-context payload (IMP-35 u7).

    Reads u6 ``bind_popup_display_strategy(unit)`` and surfaces the three
    uniform render-context field names the pipeline composer attaches to
    each zone in ``zones_data``. The full u6 binding is also echoed
    under ``popup_binding`` so downstream debug / u8 / u9 consumers can
    self-explain without re-reading the yaml.

    Args:
        unit: a CompositionUnit (or any duck-typed object exposing
            ``has_popup`` / ``popup_escalation_plan`` / ``raw_content``).
        container_height_px: container height telemetry. The pipeline
            composer passes ``min_height_px`` at the zones_data append
            site. The non-popup branch ignores the value (preview_text
            is always None when has_popup=False).

    Returns:
        Dict with the four wiring keys (``has_popup``, ``popup_html``,
        ``preview_text``, ``popup_binding``). Spreadable into a zone
        dict via ``zones_data.append({..., **payload})``.
    """
    binding = bind_popup_display_strategy(unit)
    has_popup = bool(binding.get("has_popup"))
    if not has_popup:
        return {
            "has_popup": False,
            "popup_html": None,
            "preview_text": None,
            "popup_binding": binding,
        }
    raw_content = getattr(unit, "raw_content", "") or ""
    popup_html = binding.get("popup_body_source")
    preview_text = compute_popup_preview_text(raw_content, container_height_px)
    return {
        "has_popup": True,
        # popup body = FULL raw_content (u6 popup_body_source). u8 wraps
        # this in <details>/<summary> markup on slide_base.html.
        "popup_html": popup_html,
        # body preview = px-budgeted line-boundary cut of raw_content.
        # NEVER trims inside a line; popup body holds the FULL original
        # so this excerpt loses no information.
        "preview_text": preview_text,
        # Full u6 binding echo: downstream debug surfaces (catalog
        # detail_trigger placement, popup_escalation_plan category /
        # rationale) without re-reading yaml.
        "popup_binding": binding,
    }


# ─── CompositionUnit ────────────────────────────────────────────
@dataclass
class CompositionUnit:
    """One zone candidate within a slide = MDX section(s) + the matched frame.

    source_section_ids : 1 entry = single, 2+ = merged
    merge_type :
        - "single"                 : a single section
        - "parent_merged"          : the parent has its own V4 entry (v0)
        - "parent_merged_inferred" : no parent V4 entry; inferred from child evidence (v0.1)
    frame_* : V4 evidence as-is (no catalog metadata, no hardcoding)
    score : overall score
    rationale : score breakdown trace
    review_required : when True, not auto-selected; exposed in debug only and
        handled on a separate path (light_edit / restructure / AI restructuring)
        after user/AI review
    review_reasons : why review_required is True (for self-verification: child
        label mix / template_id mismatch / incompatible cardinality, etc.)
    """

    source_section_ids: list[str]
    merge_type: str
    frame_template_id: str
    frame_id: str
    frame_number: int
    confidence: float
    label: str  # use_as_is / light_edit / restructure / reject
    phase_z_status: str
    raw_content: str
    title: str
    v4_rank: Optional[int] = None
    selection_path: str = "rank_1"
    fallback_reason: Optional[str] = None
    score: float = 0.0
    rationale: dict = field(default_factory=dict)
    # Automatic-pipeline stage state (not a review/UI concept; for now only
    # automatic decisions plus explicit failure records).
    # auto_selectable=False excludes the unit from the automatic selection
    # stage; filter_reasons records why.
    # e.g. W1/W2/W3 of parent_merged_inferred (rep status / all reject /
    # majority not-auto-renderable).
    # User/AI review is handled in a separate layer (interactive editor).
    # This dataclass is complete for automatic decisions.
    auto_selectable: bool = True
    filter_reasons: list[str] = field(default_factory=list)
    # Informational signals, independent of auto_selectable. An area for a
    # future axis to score.
    # e.g. "children disagree on rank-1 template_id" / "minority of children non-auto-renderable"
    notes: list[str] = field(default_factory=list)
    # Added on the Step 6-A axis (user lock 2026-05-08).
    # V4 candidate list (V4Match-shaped, duck typed; the composition module
    # does not import the V4Match dataclass, to avoid a circular dependency).
    # Attrs of each entry: template_id / frame_id / frame_number / confidence / label.
    # List order = V4 rank (candidates[0] = rank-1 non-reject, which matches the
    # single frame_template_id / frame_id / label / confidence; backward compat lock).
    # Length 0 = the "no_non_reject_v4_candidate" signal (Step 9 application_plan input).
    v4_candidates: list = field(default_factory=list)
    # IMP-30 u2: provisional first-render flag. True when the V4Match
    # backing this unit was synthesized via lookup_v4_match_with_fallback
    # (allow_provisional=True) after chain_exhausted, or when u3 inserts
    # a last-resort provisional fill for an uncovered section. Carried as
    # data (not re-derived from label/selection_path downstream) so the
    # render path / status / zone template can surface "needs adaptation"
    # uniformly. Default False keeps non-provisional units byte-identical.
    provisional: bool = False


# ─── Heading Tree ──────────────────────────────────────────────
def derive_parent_id(section_id: str) -> Optional[str]:
    """Section id -> parent id derivation by V4 key convention.

    IMP-08 B-3 : canonical ordinal `${parent}-sub-${n}` recognised first;
    legacy decimal `04-2.1` kept as fallback alias path.

    Examples (illustrative, not rules) :
        - "03-1-sub-2" -> "03-1"   (canonical ordinal, IMP-08)
        - "04-2.1"     -> "04-2"   (decimal suffix, legacy V4 key style)
        - "04-1"       -> None     (top-level, no parent)
        - "04"         -> None
    """
    m = re.fullmatch(r"(.+?)-sub-(\d+)", section_id)
    if m:
        return m.group(1)
    parts = section_id.split("-", 1)
    if len(parts) != 2:
        return None
    mdx_id, suffix = parts
    if "." in suffix:
        parent_suffix = suffix.split(".")[0]
        return f"{mdx_id}-{parent_suffix}"
    return None


def build_heading_tree(sections) -> dict:
    """Section list -> tree {section_id: {section, children}}."""
    tree = {s.section_id: {"section": s, "children": []} for s in sections}
    for s in sections:
        parent = derive_parent_id(s.section_id)
        if parent and parent in tree:
            tree[parent]["children"].append(s.section_id)
    return tree
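
To see how the two id conventions feed the tree, the sketch below repeats both helpers so it runs on its own and feeds them a made-up section list. The section ids and the use of `SimpleNamespace` in place of the real section objects are assumptions for illustration; only the `section_id` attribute is needed.

```python
import re
from types import SimpleNamespace


def derive_parent_id(section_id):
    """Canonical '-sub-N' first, then the legacy decimal suffix."""
    m = re.fullmatch(r"(.+?)-sub-(\d+)", section_id)
    if m:
        return m.group(1)
    parts = section_id.split("-", 1)
    if len(parts) != 2:
        return None
    mdx_id, suffix = parts
    if "." in suffix:
        return f"{mdx_id}-{suffix.split('.')[0]}"
    return None


def build_heading_tree(sections):
    tree = {s.section_id: {"section": s, "children": []} for s in sections}
    for s in sections:
        parent = derive_parent_id(s.section_id)
        if parent and parent in tree:
            tree[parent]["children"].append(s.section_id)
    return tree


# Made-up ids: two legacy decimal children, one canonical ordinal child,
# and one child ("05-1.1") whose parent is absent from the list.
ids = ["04-1", "04-2", "04-2.1", "04-2.2", "03-1", "03-1-sub-1", "05-1.1"]
tree = build_heading_tree([SimpleNamespace(section_id=i) for i in ids])

legacy_children = tree["04-2"]["children"]
ordinal_children = tree["03-1"]["children"]
orphan_parent = derive_parent_id("05-1.1")
```

A child whose derived parent is not in the input list is simply left unattached: it keeps its own tree entry, but no parent entry is created for it.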
# ─── Candidate Generation ──────────────────────────────────────
def _apply_capacity_fit(candidate: CompositionUnit, capacity_fit_fn) -> None:
"""capacity_fit_fn 결과를 candidate 의 rationale + auto_selectable + filter_reasons 에 반영.
fit_status 가 'ok' / 'no_contract' / 'unknown_source_shape' 이면 auto_selectable 영향 X
(no_contract 는 catalog-only mapper 가 별도로 ValueError 처리).
그 외 (strict_mismatch / exceeds_max / below_min / exceeds_truncate) 는 silent loss 또는
mapper FitError 가 발생할 후보 → auto_selectable=False + filter_reasons 'C1: ...'.
"""
if capacity_fit_fn is None:
return
fit = capacity_fit_fn(candidate.frame_template_id, candidate.raw_content)
candidate.rationale["capacity_fit"] = fit
if fit["fit_status"] in {"ok", "no_contract", "unknown_source_shape"}:
return
candidate.auto_selectable = False
candidate.filter_reasons.append(
f"C1: capacity mismatch ({fit['fit_status']}) — {fit['mismatch_reason']}"
)
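# Illustrative sketch (not part of the pipeline): how an injected capacity_fit_fn
# could interact with the helper above. `Candidate` is a hypothetical stand-in for
# CompositionUnit, and `toy_capacity_fit` with its "max 3 bullets" rule is an
# assumption made for the demo; only the fit dict keys (fit_status,
# mismatch_reason) come from the code above.

```python
from dataclasses import dataclass, field


@dataclass
class Candidate:  # hypothetical stand-in for CompositionUnit
    frame_template_id: str
    raw_content: str
    auto_selectable: bool = True
    filter_reasons: list = field(default_factory=list)
    rationale: dict = field(default_factory=dict)


def toy_capacity_fit(template_id: str, content: str) -> dict:
    # Assumed rule for the demo: a frame holds at most 3 bullet lines.
    n_items = sum(1 for line in content.splitlines() if line.startswith("- "))
    if n_items > 3:
        return {"fit_status": "exceeds_max",
                "mismatch_reason": f"{n_items} items > max 3"}
    return {"fit_status": "ok", "mismatch_reason": None}


def apply_capacity_fit(candidate: Candidate, capacity_fit_fn) -> None:
    # Same decision shape as _apply_capacity_fit: tolerated statuses pass,
    # anything else blocks automatic selection with a C1 reason.
    if capacity_fit_fn is None:
        return
    fit = capacity_fit_fn(candidate.frame_template_id, candidate.raw_content)
    candidate.rationale["capacity_fit"] = fit
    if fit["fit_status"] in {"ok", "no_contract", "unknown_source_shape"}:
        return
    candidate.auto_selectable = False
    candidate.filter_reasons.append(
        f"C1: capacity mismatch ({fit['fit_status']}): {fit['mismatch_reason']}"
    )


small = Candidate("tpl_a", "- one\n- two")
large = Candidate("tpl_a", "- one\n- two\n- three\n- four")
apply_capacity_fit(small, toy_capacity_fit)
apply_capacity_fit(large, toy_capacity_fit)
```

# `small` stays auto-selectable; `large` is kept in the candidate pool but is
# flagged with a C1 reason so the selector skips it.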
def collect_candidates(sections, v4_lookup_fn, v4_label_to_status: dict,
auto_renderable_statuses: Optional[set[str]] = None,
capacity_fit_fn=None,
v4_candidates_lookup_fn=None):
"""Generate composition candidates.
v0.1 candidate types:
1. single : one per leaf section (V4 entry required)
2. parent_merged : the parent itself has a V4 entry (the parent matched directly)
3. parent_merged_inferred : the parent has no V4 entry; a representative
template_id is inferred from child evidence
Principles:
- No hardcoded section_id / template_id / frame
- Every decision comes from derive_parent_id() + V4 evidence + the v4_label_to_status mapping + injected functions (parametric)
Args:
sections : output of the align step
v4_lookup_fn : (section_id) → V4Match | None (rank-1 only, kept for compatibility)
v4_label_to_status : V4 label → Phase Z status mapping
auto_renderable_statuses : set of statuses allowed to auto-render (input to the W1/W3 checks)
capacity_fit_fn : Optional (template_id, content) → fit dict.
When provided, it is applied to every candidate; a capacity mismatch sets auto_selectable=False
(blocks silent truncation and mapper FitError up front).
v4_candidates_lookup_fn : Optional (section_id) → list[V4Match].
Step 6-A axis (user lock 2026-05-08). List of up to max-N non-reject candidates.
When provided, the v4_candidates field is filled on every candidate.
When None, v4_candidates = [] (backward compat).
This function absorbs the raw V4 dict structure, so the composition module never sees the V4 yaml shape.
Returns:
list[CompositionUnit]
"""
if auto_renderable_statuses is None:
auto_renderable_statuses = set()
def _v4_cands(section_id: str) -> list:
# Empty list when v4_candidates_lookup_fn is not provided (backward compat).
return v4_candidates_lookup_fn(section_id) if v4_candidates_lookup_fn else []
candidates = []
# 1. single (one candidate per section that has a V4 match)
for s in sections:
match = v4_lookup_fn(s.section_id)
if match is None:
continue
c = CompositionUnit(
source_section_ids=[s.section_id],
merge_type="single",
frame_template_id=match.template_id,
frame_id=match.frame_id,
frame_number=match.frame_number,
confidence=match.confidence,
label=match.label,
phase_z_status=v4_label_to_status.get(match.label, "unknown"),
v4_rank=getattr(match, "v4_rank", None),
selection_path=getattr(match, "selection_path", "rank_1"),
fallback_reason=getattr(match, "fallback_reason", None),
raw_content=s.raw_content,
title=s.title,
v4_candidates=_v4_cands(s.section_id),
provisional=getattr(match, "provisional", False),
)
_apply_capacity_fit(c, capacity_fit_fn)
candidates.append(c)
# Group children by parent
parent_to_children: dict[str, list] = {}
for s in sections:
pid = derive_parent_id(s.section_id)
if pid:
parent_to_children.setdefault(pid, []).append(s)
# 2. parent_merged (the parent itself matched in V4)
for pid, children in parent_to_children.items():
parent_match = v4_lookup_fn(pid)
if parent_match is None:
continue # handled by branch 3
if len(children) < 2:
continue # nothing to merge
merged_raw = "\n\n".join(c.raw_content for c in children)
c_pm = CompositionUnit(
source_section_ids=[c.section_id for c in children],
merge_type="parent_merged",
frame_template_id=parent_match.template_id,
frame_id=parent_match.frame_id,
frame_number=parent_match.frame_number,
confidence=parent_match.confidence,
label=parent_match.label,
phase_z_status=v4_label_to_status.get(parent_match.label, "unknown"),
v4_rank=getattr(parent_match, "v4_rank", None),
selection_path=getattr(parent_match, "selection_path", "rank_1"),
fallback_reason=getattr(parent_match, "fallback_reason", None),
raw_content=merged_raw,
title=pid,
v4_candidates=_v4_cands(pid),
provisional=getattr(parent_match, "provisional", False),
)
_apply_capacity_fit(c_pm, capacity_fit_fn)
candidates.append(c_pm)
# 3. parent_merged_inferred (v0.1): the parent has no V4 entry, so rely on child evidence
for pid, children in parent_to_children.items():
if v4_lookup_fn(pid) is not None:
continue # already handled by branch 2
if len(children) < 2:
continue
# Use only the children that have a V4 match as evidence
child_matches: list[tuple] = []
for c in children:
m = v4_lookup_fn(c.section_id)
if m is not None:
child_matches.append((c, m))
if len(child_matches) < 2:
continue # at least 2 pieces of child evidence are required
# representative = the child match with the highest confidence (simple v0.1.1 rule)
# Future axes: top-k convergence, template family agreement, cardinality_fit, etc.
rep_child, rep_match = max(child_matches, key=lambda cm: cm[1].confidence)
# auto_selectable decides whether the unit may be selected automatically. Default True (strong inferred merge).
# Any one of the following weak signals sets auto_selectable=False (reason recorded in filter_reasons):
# W1 : the representative status is not auto-renderable, so auto-rendering itself is blocked
# W2 : every child is labeled reject, so the merge has no basis
# W3 : children with non-auto-renderable labels form a majority (>50%)
# Informational notes (no effect on auto_selectable; reserved for future axis scoring):
# N1 : the children's rank-1 template_ids differ (top-k / family compatibility)
# N2 : a minority of children carry non-auto-renderable labels
rep_status = v4_label_to_status.get(rep_match.label, "unknown")
child_labels = [m.label for _, m in child_matches]
child_template_ids_unique = sorted({m.template_id for _, m in child_matches})
n_children = len(child_matches)
n_not_auto = sum(
1 for l in child_labels
if v4_label_to_status.get(l) not in auto_renderable_statuses
)
filter_reasons: list[str] = []
notes: list[str] = []
if rep_status not in auto_renderable_statuses:
filter_reasons.append(
f"W1: representative status '{rep_status}' (label={rep_match.label}) "
f"not in auto_renderable_statuses={sorted(auto_renderable_statuses)}."
)
if all(l == "reject" for l in child_labels):
filter_reasons.append(
"W2: all children labeled 'reject' — merge has no fit basis."
)
if n_children > 0 and n_not_auto * 2 > n_children:
non_auto_labels = sorted({
l for l in child_labels
if v4_label_to_status.get(l) not in auto_renderable_statuses
})
filter_reasons.append(
f"W3: majority of children ({n_not_auto}/{n_children}) have "
f"non-auto-renderable labels {non_auto_labels}."
)
if len(child_template_ids_unique) > 1:
notes.append(
f"N1: children's rank-1 template_id differs ({child_template_ids_unique}). "
f"representative='{rep_match.template_id}' (highest child confidence). "
f"top-k / family compatibility 평가는 future axis."
)
if 0 < n_not_auto <= n_children // 2:
non_auto_labels_minority = sorted({
l for l in child_labels
if v4_label_to_status.get(l) not in auto_renderable_statuses
})
notes.append(
f"N2: minority ({n_not_auto}/{n_children}) of children non-auto-renderable "
f"({non_auto_labels_minority}). representative is auto-renderable, merge proceeds."
)
auto_selectable = len(filter_reasons) == 0
merged_raw = "\n\n".join(c.raw_content for c, _ in child_matches)
c_inf = CompositionUnit(
source_section_ids=[c.section_id for c, _ in child_matches],
merge_type="parent_merged_inferred",
frame_template_id=rep_match.template_id,
frame_id=rep_match.frame_id,
frame_number=rep_match.frame_number,
confidence=rep_match.confidence,
label=rep_match.label,
phase_z_status=rep_status,
v4_rank=getattr(rep_match, "v4_rank", None),
selection_path=getattr(rep_match, "selection_path", "rank_1"),
fallback_reason=getattr(rep_match, "fallback_reason", None),
raw_content=merged_raw,
title=pid,
auto_selectable=auto_selectable,
filter_reasons=filter_reasons,
notes=notes,
# rep_child 의 V4 후보 list (rep_match 와 같은 출처, frame_* 와 일관).
v4_candidates=_v4_cands(rep_child.section_id),
# IMP-30 u2 — rep_match drives frame selection so its provisional
# flag flows here. If a non-rep child match is provisional but the
# rep is not, this unit is not provisional (the rep frame is real).
provisional=getattr(rep_match, "provisional", False),
)
_apply_capacity_fit(c_inf, capacity_fit_fn)
candidates.append(c_inf)
return candidates
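# Illustrative sketch (not part of the pipeline): the W1/W2/W3 weak-signal checks
# from branch 3, reduced to a pure function over labels. The status names in
# LABEL_TO_STATUS and AUTO_RENDERABLE are hypothetical placeholders; the real
# mapping is injected by the caller as v4_label_to_status / auto_renderable_statuses.

```python
LABEL_TO_STATUS = {  # hypothetical mapping, for the demo only
    "use_as_is": "status_ready",
    "light_edit": "status_ready_light",
    "restructure": "status_needs_work",
    "reject": "status_fallback",
}
AUTO_RENDERABLE = {"status_ready", "status_ready_light"}


def weak_signals(rep_label, child_labels, label_to_status, auto_renderable):
    """Return the weak-signal codes that would block automatic selection."""
    codes = []
    # W1: the representative's status cannot be auto-rendered.
    if label_to_status.get(rep_label, "unknown") not in auto_renderable:
        codes.append("W1")
    # W2: every child is labeled reject.
    if all(label == "reject" for label in child_labels):
        codes.append("W2")
    # W3: strictly more than half of the children are not auto-renderable.
    n_not_auto = sum(
        1 for label in child_labels
        if label_to_status.get(label) not in auto_renderable
    )
    if child_labels and n_not_auto * 2 > len(child_labels):
        codes.append("W3")
    return codes


clean = weak_signals("use_as_is", ["use_as_is", "light_edit"],
                     LABEL_TO_STATUS, AUTO_RENDERABLE)
majority_bad = weak_signals("use_as_is", ["use_as_is", "reject", "reject"],
                            LABEL_TO_STATUS, AUTO_RENDERABLE)
all_reject = weak_signals("reject", ["reject", "reject"],
                          LABEL_TO_STATUS, AUTO_RENDERABLE)
```

# An empty result corresponds to auto_selectable=True; any code in the list
# corresponds to one entry in filter_reasons.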
# ─── Scoring ───────────────────────────────────────────────────
# v0 label weights — V4 label → score multiplier.
# Extend when future axes (cardinality_fit / hierarchy_coherence / density) are added.
V0_LABEL_WEIGHT = {
"use_as_is": 1.0,
"light_edit": 0.7,
"restructure": 0.4,
"reject": 0.0,
}
def score_candidate(c: CompositionUnit) -> CompositionUnit:
"""v0 scoring: confidence × label_weight.
Axes to be added later (only placeholders are reserved in rationale):
- cardinality_fit : item_count vs frame ideal/min/max
- hierarchy_coherence : how well the merge_type fits
- density_score : content density vs zone size
"""
label_weight = V0_LABEL_WEIGHT.get(c.label, 0.0)
frame_compat = c.confidence * label_weight
c.score = frame_compat
# Preserve existing rationale entries (e.g. the capacity_fit set by collect_candidates)
c.rationale.update({
"frame_compat": round(frame_compat, 4),
"confidence": c.confidence,
"label": c.label,
"label_weight": label_weight,
"merge_type": c.merge_type,
# placeholders for future axes
"hierarchy_coherence": None,
"density_score": None,
})
return c
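# Illustrative sketch (not part of the pipeline): the v0 score is just
# confidence × label weight, so a lower-confidence use_as_is match can outrank a
# higher-confidence light_edit match. `v0_score` is a hypothetical helper that
# repeats the arithmetic of score_candidate() without the CompositionUnit type.

```python
V0_LABEL_WEIGHT = {
    "use_as_is": 1.0,
    "light_edit": 0.7,
    "restructure": 0.4,
    "reject": 0.0,
}


def v0_score(confidence: float, label: str) -> float:
    # Unknown labels fall back to weight 0.0, exactly like reject.
    return confidence * V0_LABEL_WEIGHT.get(label, 0.0)


light = v0_score(0.9, "light_edit")   # 0.9 × 0.7
as_is = v0_score(0.7, "use_as_is")    # 0.7 × 1.0
unknown = v0_score(0.95, "not_a_label")
```

# Here `as_is` beats `light` even though its confidence is lower, and an
# unrecognised label scores zero regardless of confidence.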
# ─── Selection ─────────────────────────────────────────────────
def select_composition_units(
candidates,
allowed_statuses: set[str],
*,
all_section_ids: Optional[list[str]] = None,
allow_provisional_fill: bool = False,
) -> list[CompositionUnit]:
"""Greedy non-overlapping selection by score, with coverage tiebreak.
1. Score every candidate.
2. Filter:
- phase_z_status ∈ allowed_statuses
- auto_selectable=True (no C1 capacity or W1/W2/W3 weak signal fired)
3. Sort key = (score desc, number of source_section_ids desc).
On a tie, the candidate that covers more sections wins, so a
parent_merged_inferred candidate beats a single with the same score on *coverage*.
4. Greedy: skip any candidate that contains an already-covered section.
5. The final selection is whatever filled the covered set.
A candidate with auto_selectable=False is never selected automatically, but it stays in the debug candidates_summary.
The UI/editor layer may let the user handle it separately (out of scope for v0).
IMP-30 u3 — last-resort provisional fill (opt-in via allow_provisional_fill):
After the normal greedy pass, sections in ``all_section_ids`` that are
still uncovered are filled with the highest-score *provisional*
candidate (``c.provisional == True``) that includes at least one
uncovered section and does not collide with already-covered ones. A
provisional candidate's backing V4Match was synthesized via
``lookup_v4_match_with_fallback(allow_provisional=True)`` (IMP-30 u1)
after chain_exhausted; its ``phase_z_status`` is therefore typically
*outside* ``allowed_statuses`` (extract_matched_zone / fallback_candidate),
which is why it gets filtered out of the normal greedy pass. The fill
preserves first-render invariant for sections whose rank-1~3 are all
restructure/reject. Default ``allow_provisional_fill=False`` keeps
pre-u3 behavior byte-identical (IMP-05 regression guard).
Args:
candidates: full candidate pool from collect_candidates().
allowed_statuses: phase_z_status set considered auto-renderable.
all_section_ids: ordered section id list (only consulted when
allow_provisional_fill=True; required for coverage check).
allow_provisional_fill: opt-in for last-resort provisional fill.
"""
scored = [score_candidate(c) for c in candidates]
viable = [
c for c in scored
if c.phase_z_status in allowed_statuses and c.auto_selectable
]
viable.sort(key=lambda c: (c.score, len(c.source_section_ids)), reverse=True)
selected = []
covered = set()
for c in viable:
if any(sid in covered for sid in c.source_section_ids):
continue
selected.append(c)
covered.update(c.source_section_ids)
# IMP-30 u3 — last-resort provisional fill (opt-in, default off).
# Honors first-render invariant by surfacing chain_exhausted sections as
# provisional zones instead of dropping them. Skip reasons on
# non-provisional filtered candidates are preserved (not mutated here).
if allow_provisional_fill and all_section_ids:
uncovered = {sid for sid in all_section_ids if sid not in covered}
if uncovered:
provisional_pool = [
c for c in scored
if c.provisional
and any(sid in uncovered for sid in c.source_section_ids)
]
provisional_pool.sort(
key=lambda c: (c.score, len(c.source_section_ids)),
reverse=True,
)
for c in provisional_pool:
if any(sid in covered for sid in c.source_section_ids):
continue
selected.append(c)
covered.update(c.source_section_ids)
return selected
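# Illustrative sketch (not part of the pipeline): the greedy non-overlapping pass
# with the coverage tiebreak, using plain (score, section_ids) tuples instead of
# CompositionUnit. `greedy_select` is a hypothetical name; status filtering and
# the provisional fill are left out to keep the core loop visible.

```python
def greedy_select(candidates):
    """candidates: list of (score, [section_id, ...]) tuples."""
    # Higher score first; on equal score, the candidate covering more sections.
    ordered = sorted(candidates, key=lambda c: (c[0], len(c[1])), reverse=True)
    selected, covered = [], set()
    for score, section_ids in ordered:
        # Skip anything that overlaps a section already claimed.
        if covered.intersection(section_ids):
            continue
        selected.append((score, section_ids))
        covered.update(section_ids)
    return selected


pool = [
    (0.7, ["a"]),        # single
    (0.7, ["a", "b"]),   # merged unit with the same score, wider coverage
    (0.5, ["b"]),        # single
    (0.4, ["c"]),        # single
]
picked = greedy_select(pool)
```

# The merged (0.7, ["a", "b"]) candidate wins the tie on coverage, both singles
# for "a" and "b" are skipped as overlaps, and "c" is still picked up.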
# ─── Layout Preset Selection ───────────────────────────────────
def select_layout_preset(units: list[CompositionUnit]) -> Optional[str]:
"""v0: count-based default selection.
0 units → None
1 unit → single
2 units → horizontal-2 (default; vertical-2 branches in once an aspect signal is added)
3 units → top-1-bottom-2 (default; other 3-zone variants branch in once a content-weight signal is added)
4 units → grid-2x2
5+ units → ValueError (splitting across slides is a future decision)
v0 limitations:
- No aspect / content-weight signal, so 2 units are always horizontal-2 and 3 units are always top-1-bottom-2
- To be refined once weights are derived from unit.raw_content
"""
n = len(units)
if n == 0:
return None
if n == 1:
return "single"
if n == 2:
return "horizontal-2"
if n == 3:
return "top-1-bottom-2"
if n == 4:
return "grid-2x2"
raise ValueError(
f"Composition v0 : layout for {n} units not supported (max 4). "
"Larger counts require split-into-multiple-slides decision (future)."
)
# ─── Public entry — composition pipeline ───────────────────────
def plan_composition(sections, v4_lookup_fn, v4_label_to_status: dict,
allowed_statuses: set[str],
capacity_fit_fn=None,
v4_candidates_lookup_fn=None,
*,
allow_provisional_fill: bool = False) -> tuple[list[CompositionUnit], Optional[str], dict]:
"""Composition planner v0.2 entry.
v0.2 changes:
- When capacity_fit_fn is injected, every candidate gets a capacity pre-check
(blocks silent truncation and mapper FitError up front). A mismatch sets auto_selectable=False
plus filter_reason 'C1: ...'.
Step 6-A axis (user lock 2026-05-08):
- When v4_candidates_lookup_fn is injected, v4_candidates is filled on every CompositionUnit.
No logic change: the single frame_template_id / frame_id / label / confidence stay as they are.
Runtime results are unchanged. This is a schema extension feeding the Step 9 application_plan input.
IMP-30 u3 — last-resort provisional fill (opt-in, default off):
``allow_provisional_fill`` is plumbed to select_composition_units().
When True, uncovered sections receive a provisional fill from candidates
whose backing V4Match was synthesized via ``allow_provisional=True``
(IMP-30 u1). ``_candidate_state`` returns ``selected_provisional`` for
those filled units so the debug summary distinguishes greedy selections
from provisional fills. Default False keeps IMP-05 behavior identical.
v0.1 / v0.1.1 behavior (retained):
- parent_merged_inferred candidates are generated (even without a parent V4 entry)
- No review concept. Selection is decided automatically from auto_selectable + filter_reasons alone
- selection: score desc + coverage-based tiebreak
Returns:
units : automatically selected composition units
layout_preset : one of the 8-preset vocabulary (or None)
debug : all candidates + capacity_fit + filter_reasons + the rationale for the preset choice
"""
candidates = collect_candidates(
sections, v4_lookup_fn, v4_label_to_status,
auto_renderable_statuses=allowed_statuses,
capacity_fit_fn=capacity_fit_fn,
v4_candidates_lookup_fn=v4_candidates_lookup_fn,
)
scored_all = [score_candidate(c) for c in candidates]
units = select_composition_units(
candidates,
allowed_statuses,
all_section_ids=[s.section_id for s in sections] if allow_provisional_fill else None,
allow_provisional_fill=allow_provisional_fill,
)
preset = select_layout_preset(units)
def _candidate_state(c: CompositionUnit) -> str:
if c in units:
# IMP-30 u3 — provisional-fill units surface as a distinct state so
# downstream debug consumers can tell greedy selection apart from
# last-resort fill. unit.provisional flows from u1 (V4Match
# synthesis) → u2 (CompositionUnit propagation).
if c.provisional:
return "selected_provisional"
return "selected"
if c.phase_z_status not in allowed_statuses:
return "filtered_status" # V4 label → status not auto-renderable
if not c.auto_selectable:
# Distinguish capacity from weak signals by the filter_reasons prefix
if any(r.startswith("C") for r in c.filter_reasons):
return "filtered_capacity" # C1 (capacity mismatch)
return "filtered_weak" # W1/W2/W3 (parent_merged_inferred only)
return "filtered_lost" # was viable but lost out on a coverage conflict
candidates_summary = [
{
"source_section_ids": c.source_section_ids,
"merge_type": c.merge_type,
"template_id": c.frame_template_id,
"label": c.label,
"phase_z_status": c.phase_z_status,
"v4_rank": c.v4_rank,
"selection_path": c.selection_path,
"fallback_reason": c.fallback_reason,
"score": c.score,
"selection_state": _candidate_state(c),
"auto_selectable": c.auto_selectable,
"filter_reasons": list(c.filter_reasons),
"notes": list(c.notes),
"capacity_fit": c.rationale.get("capacity_fit"),
}
for c in scored_all
]
merge_candidates = [
s for s in candidates_summary
if s["merge_type"] in {"parent_merged", "parent_merged_inferred"}
]
capacity_mismatches = [
s for s in candidates_summary
if s["selection_state"] == "filtered_capacity"
]
debug = {
"planner_version": "v0.2",
"selection_rule": (
"score desc, then source_section_ids count desc (coverage tiebreak). "
"filter = phase_z_status ∉ allowed_statuses OR auto_selectable=False. "
"auto_selectable=False 사유 : C1 (capacity mismatch — silent truncate / FitError 차단), "
"W1 (rep not auto-renderable), W2 (all children reject), W3 (majority children non-auto-renderable)."
),
"candidates_total": len(scored_all),
"candidates_viable_auto": len([
c for c in scored_all
if c.phase_z_status in allowed_statuses and c.auto_selectable
]),
"candidates_summary": candidates_summary,
"merge_candidates": merge_candidates,
"capacity_mismatches": capacity_mismatches,
"selected_units_count": len(units),
"layout_preset": preset,
"layout_preset_rationale": (
f"v0 count-based: {len(units)} units → {preset}"
if preset else "no viable units"
),
}
return units, preset, debug
# ─── IMP-48 — Re-split All-Reject Merges (#77, Stage 2 / u1~u3) ─────
def resplit_all_reject_merges(
units: list[CompositionUnit],
sections,
v4_lookup_fn,
v4_label_to_status: dict,
allowed_statuses: set[str],
*,
capacity_fit_fn=None,
v4_candidates_lookup_fn=None,
section_assignment_override: bool = False,
) -> tuple[list[CompositionUnit], dict]:
"""Re-split merged composition units whose rank-1 V4 label is ``reject``.
IMP-48 (#77) — Step 6 post-pass that decomposes a merged unit
(``parent_merged`` / ``parent_merged_inferred``) carrying ``label=reject``
into per-section singles, so child sections with non-reject rank-1 V4
evidence can flow through the normal use_as_is / light_edit / restructure
paths instead of being handed to IMP-47B (#76) as a single blob.
Stage 2 / u3 slice (current revision) :
u1 contract (detection scan + override skip + idempotent single-
exclusion) + u2 per-section Branch-1 rebuild (each rebuilt single
carries ``merge_type="single"`` + the section's OWN rank-1 V4
evidence via ``v4_lookup_fn`` + the section's original
``raw_content`` from ``sections``) are both preserved. u3 adds the
gating + swap path :
1. **Coverage equality** — every child section in
``source_section_ids`` MUST rebuild successfully. Any
``section_not_found`` / ``no_v4_match`` rebuild result short-
circuits that merged unit to ``reason="incomplete_rebuild"``.
2. **Beneficial split** — at least one rebuilt single MUST have
``label != "reject"`` (Stage 2 Q2 Codex YES — "≥1 section
gains non-reject frame"). Otherwise that merged unit short-
circuits to ``reason="no_beneficial_split"`` and IMP-47B (#76)
handles the merge directly.
3. **Layout cap (≤ 4 units)** — projected post-split unit count
(across ALL detected merges that would split) MUST be ≤ 4.
Otherwise EVERY would-be split is aborted with
``reason="layout_cap_exceeded"`` (Stage 2 Q2 default — keep
merged, no partial split; v0 ``select_layout_preset`` supports
1~4 units max).
4. **Telemetry** — every single produced by an APPLIED split has
``selection_path="resplit_from_merge"`` (Stage 1 Q3 YES,
additive field reuse — no schema add).
5. **Audit payload** — ``audit["applied"]`` reflects whether ANY
merge actually split. ``audit["split_units"]`` /
``audit["skipped_units"]`` capture per-merge decisions.
``audit["post_split_unit_count"]`` reflects the returned list
length. ``audit["post_split_layout_preset"]`` is filled via
``select_layout_preset(out_units)`` when ``applied=True``,
None otherwise (u5 also re-derives in pipeline scope).
``out_units`` is the post-resplit unit list (merged removed +
singles inserted, in original ordering). When no merge splits,
``out_units`` is byte-identical to input ``units`` and
``applied=False`` — the audit's ``skipped_reason`` becomes
``"no_split_applied"``.
Detection signal (★ no-hardcoding, AI=0) :
``merge_type ∈ {"parent_merged", "parent_merged_inferred"}``
AND ``label == "reject"``
AND ``len(source_section_ids) >= 2``
Signal uses only ``merge_type`` + ``label`` + section count — never
section_id, template_id, MDX filename, or sample identifier.
Override skip (Stage 2 Q1 — kwarg per Codex YES) :
``section_assignment_override=True`` makes the helper a no-op. User-
driven ``zoneSections`` (#6 IMP-06) is the ground truth and must not
be second-guessed by an automatic re-split.
Idempotency (max_retry=1, Stage 2 lock) :
u2's rebuilt units carry ``merge_type="single"``, which is excluded
from the detection filter by construction. A second pass through
this helper finds nothing — no inner loop, no recursion.
Frame-swap guardrail (★ feedback_ai_isolation_contract) :
u2 rebuilds each child section's single from its OWN rank-1 V4
evidence via ``v4_lookup_fn``. The merged unit's parent /
representative ``template_id`` is discarded along with the merge
itself — no swap of one section's frame onto another section.
Args:
units: composition units from ``plan_composition()``.
sections: original section list (forwarded to u2 for per-section
``raw_content`` lookup — merged units carry the joined string,
not the individual child source).
v4_lookup_fn: ``(section_id) -> V4Match | None`` (rank-1). Forwarded
to u2 — identical evidence source as ``plan_composition``.
v4_label_to_status: V4 label → Phase Z status mapping (forwarded).
allowed_statuses: auto-renderable status set (forwarded).
capacity_fit_fn: optional capacity fit injector (forwarded to u2).
v4_candidates_lookup_fn: optional Step 6-A candidates fn (forwarded).
section_assignment_override: True iff user supplied
``zoneSections`` / ``section_assignment_plan`` (IMP-06 chain).
Returns:
``(out_units, audit)`` :
``out_units`` = post-resplit units (identical to input when nothing splits).
``audit`` = ``imp48_resplit`` payload following Stage 1 schema::
{
"applied": bool, # True iff at least one merge split
"split_units": [...], # per-merge records of applied splits
"skipped_units": [...], # kept-merged units + reason
"post_split_unit_count": int,
"post_split_layout_preset": Optional[str],
"skipped_reason": str, # present only when applied=False
"detected_units": [...], # merged-reject units found by the scan
"rebuild_attempts": [...], # u2 per-section rebuild results
}
"""
# ``allowed_statuses`` is forwarded for signature symmetry with
# ``plan_composition`` but unused inside the helper — Stage 2 / Codex YES
# fixed the beneficial-split threshold to ``single.label != "reject"``
# (Stage 1 contract "non-reject rank-1"). Future axes may widen the
# threshold using ``allowed_statuses``; until then the parameter is
# explicitly deleted to silence lint without losing the public contract.
del allowed_statuses
audit: dict = {
"applied": False,
"split_units": [],
"skipped_units": [],
"post_split_unit_count": len(units),
"post_split_layout_preset": None,
"detected_units": [],
"rebuild_attempts": [],
}
if section_assignment_override:
audit["skipped_reason"] = "section_assignment_override"
return units, audit
detected = [
u for u in units
if u.merge_type in {"parent_merged", "parent_merged_inferred"}
and u.label == "reject"
and len(u.source_section_ids) >= 2
]
audit["detected_units"] = [
{
"source_section_ids": list(u.source_section_ids),
"merge_type": u.merge_type,
"template_id": u.frame_template_id,
"label": u.label,
}
for u in detected
]
if not detected:
audit["skipped_reason"] = "no_detection"
return units, audit
# u2 — per-section Branch-1 rebuild for each detected merged-reject unit.
# Mirrors ``collect_candidates`` Branch 1 (single per section). Each rebuilt
# single carries the section's OWN rank-1 V4 evidence — the merged unit's
# parent/representative template_id is discarded along with the merge.
# ★ feedback_ai_isolation_contract : no frame swap (each section's own V4).
# ★ MDX_raw_content_invariant : raw_content taken from sections list.
# ★ idempotency : merge_type="single" excludes singles
# from re-detection on any later pass.
section_by_id = {s.section_id: s for s in sections}
def _v4_cands(section_id: str) -> list:
return v4_candidates_lookup_fn(section_id) if v4_candidates_lookup_fn else []
rebuild_attempts: list[dict] = []
for merged_unit in detected:
section_singles: list[dict] = []
for sid in merged_unit.source_section_ids:
section = section_by_id.get(sid)
if section is None:
section_singles.append({
"section_id": sid,
"build_result": "section_not_found",
"unit": None,
})
continue
match = v4_lookup_fn(sid)
if match is None:
section_singles.append({
"section_id": sid,
"build_result": "no_v4_match",
"unit": None,
})
continue
single = CompositionUnit(
source_section_ids=[sid],
merge_type="single",
frame_template_id=match.template_id,
frame_id=match.frame_id,
frame_number=match.frame_number,
confidence=match.confidence,
label=match.label,
phase_z_status=v4_label_to_status.get(match.label, "unknown"),
v4_rank=getattr(match, "v4_rank", None),
selection_path=getattr(match, "selection_path", "rank_1"),
fallback_reason=getattr(match, "fallback_reason", None),
raw_content=section.raw_content,
title=section.title,
v4_candidates=_v4_cands(sid),
provisional=getattr(match, "provisional", False),
)
_apply_capacity_fit(single, capacity_fit_fn)
score_candidate(single)
section_singles.append({
"section_id": sid,
"build_result": "ok",
"unit": single,
})
rebuild_attempts.append({
"merged_source_section_ids": list(merged_unit.source_section_ids),
"merged_merge_type": merged_unit.merge_type,
"merged_template_id": merged_unit.frame_template_id,
"section_singles": section_singles,
})
audit["rebuild_attempts"] = rebuild_attempts
# u3 — gating + swap path.
# Per-merge decision: split | skip(reason). Then a cumulative layout-cap
# check aborts ALL would-be splits if projected post-split count > 4
# (Stage 2 Q2 default — keep merged, no partial split; v0
# ``select_layout_preset`` supports 1~4 units max).
plans: list[dict] = []
for merged_unit, attempt in zip(detected, rebuild_attempts):
required_sids = set(merged_unit.source_section_ids)
built_sids = {
entry["section_id"]
for entry in attempt["section_singles"]
if entry["build_result"] == "ok"
}
if built_sids != required_sids:
# Some sections failed to rebuild — coverage equality violated.
# IMP-47B (#76) will handle the merged unit directly.
plans.append({
"merged": merged_unit,
"decision": "skip",
"reason": "incomplete_rebuild",
"missing": sorted(required_sids - built_sids),
})
continue
built_units = [
entry["unit"]
for entry in attempt["section_singles"]
if entry["build_result"] == "ok"
]
non_reject_count = sum(1 for u in built_units if u.label != "reject")
if non_reject_count == 0:
# No child section gains a non-reject frame — split is not
# beneficial. IMP-47B (#76) handles the merge directly.
plans.append({
"merged": merged_unit,
"decision": "skip",
"reason": "no_beneficial_split",
})
continue
plans.append({
"merged": merged_unit,
"decision": "split",
"singles": built_units,
"non_reject_count": non_reject_count,
})
# Cumulative layout-cap projection across all would-be splits.
projected_count = len(units)
for plan in plans:
if plan["decision"] == "split":
projected_count += len(plan["singles"]) - 1
if projected_count > 4:
for plan in plans:
if plan["decision"] == "split":
plan["decision"] = "skip"
plan["reason"] = "layout_cap_exceeded"
plan["projected_count"] = projected_count
# Build out_units by walking the input list once. Identity match by
# ``id(unit)`` keeps the swap deterministic and preserves order.
plan_by_unit_id = {id(plan["merged"]): plan for plan in plans}
out_units: list[CompositionUnit] = []
applied = False
for unit in units:
plan = plan_by_unit_id.get(id(unit))
if plan is None:
out_units.append(unit)
continue
if plan["decision"] == "split":
applied = True
for single in plan["singles"]:
# ★ Stage 1 Q3 YES — additive telemetry tag, no schema add.
# Overrides the v4 match's selection_path for split-produced
# singles only; non-resplit code paths are unaffected.
single.selection_path = "resplit_from_merge"
out_units.extend(plan["singles"])
audit["split_units"].append({
"merged_source_section_ids": list(plan["merged"].source_section_ids),
"merged_template_id": plan["merged"].frame_template_id,
"non_reject_count": plan["non_reject_count"],
"split_singles": [
{
"section_id": s.source_section_ids[0],
"template_id": s.frame_template_id,
"label": s.label,
"phase_z_status": s.phase_z_status,
}
for s in plan["singles"]
],
})
else: # skip
out_units.append(unit)
skip_entry: dict = {
"merged_source_section_ids": list(plan["merged"].source_section_ids),
"merged_template_id": plan["merged"].frame_template_id,
"reason": plan["reason"],
}
if plan["reason"] == "incomplete_rebuild":
skip_entry["missing_section_ids"] = list(plan["missing"])
if plan["reason"] == "layout_cap_exceeded":
skip_entry["projected_post_split_count"] = plan["projected_count"]
audit["skipped_units"].append(skip_entry)
audit["applied"] = applied
audit["post_split_unit_count"] = len(out_units)
if applied:
# ``select_layout_preset`` is deterministic on unit count (v0).
# u5 (pipeline) re-derives layout preset over the same out_units list;
# both values stay consistent by construction.
audit["post_split_layout_preset"] = select_layout_preset(out_units)
audit.pop("skipped_reason", None)
else:
audit["post_split_layout_preset"] = None
audit["skipped_reason"] = "no_split_applied"
return out_units, audit
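# Illustrative sketch (not part of the pipeline): the cumulative layout-cap
# projection used by the u3 gate. Splitting a merge of n sections replaces one
# unit with n singles, so it adds n - 1 units. `projected_unit_count` and
# `splits_allowed` are hypothetical helper names; the cap of 4 comes from
# select_layout_preset().

```python
def projected_unit_count(current_units: int, split_sizes: list) -> int:
    """split_sizes: number of child sections for each merge that would split."""
    return current_units + sum(n - 1 for n in split_sizes)


def splits_allowed(current_units: int, split_sizes: list, cap: int = 4) -> bool:
    # All-or-nothing: if the projection exceeds the cap, every split is aborted.
    return projected_unit_count(current_units, split_sizes) <= cap


# 2 units on the slide, one of them a 3-section merge: 2 + (3 - 1) = 4.
fits = splits_allowed(2, [3])
# 3 units on the slide, one of them a 3-section merge: 3 + (3 - 1) = 5.
too_many = splits_allowed(3, [3])
# Two 2-section merges out of 3 units: 3 + 1 + 1 = 5.
two_merges = projected_unit_count(3, [2, 2])
```

# When the projection exceeds 4, the helper above marks every would-be split as
# "layout_cap_exceeded" and returns the input units unchanged.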