feat(#76): IMP-47B reject-as-AI-adaptation activation (u1~u13 backend + tests)

- u1~u9: AI fallback infrastructure (router/prompts/schema/validator) + Step 12 hook
- u10: e2e reject chain (writes final.html with AI-repaired slot, full coverage)
- u11: frontend wiring deferred to follow-up commit (split from IMP-41 hunks)
- u12: coverage_invariant guard
- u13: cache save gate (visual_check PASS + user_approved/auto_cache) — Codex #22 verified

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-22 00:17:46 +09:00
parent f358604fb3
commit 1186ad8ae2
23 changed files with 3901 additions and 111 deletions


@@ -26,6 +26,14 @@ class Settings(BaseSettings):
ai_fallback_budget_per_run: int = 10
ai_fallback_circuit_breaker_threshold: int = 5
# IMP-46 u5 — auto-cache flag. When True, `save_proposal` bypasses the
# `user_approved` gate only (`visual_check_passed` is never bypassed).
# Default OFF preserves the dual-gate contract; the CLI flag
# `--auto-cache` in `src/phase_z2_pipeline.py` mutates this setting at
# parse time. Downstream callers MUST source the flag from Settings,
# never inline literals.
ai_fallback_auto_cache: bool = False
model_config = {"env_file": ".env", "env_file_encoding": "utf-8"}


@@ -1,48 +1,158 @@
"""IMP-33 u6: AI fallback proposal cache (IMP-46 gate, no persistent storage).
"""IMP-46 u2 + u3 + u5: Persistent JSON cache backend for AI fallback proposals.
This module defines the cache contract that IMP-33 callers use to remember
AI fallback proposals across runs. The persistent storage layer itself is
out-of-scope for IMP-33 and is owned by IMP-46 (frame transformation cache).
Replaces the IMP-33 u6 ``NotImplementedError`` stub with a content-addressed
store at ``data/frame_cache/{frame_id}/{signature_hash}.json``.
Behaviour locked by Stage 2 plan (u6):
Key format:
* ``read_proposal(key)`` always returns ``None`` until IMP-46 lands a
persistent backend. Callers MUST handle the cache-miss path.
* ``save_proposal(key, proposal, *, visual_check_passed, user_approved)``
enforces the IMP-46 gate before any storage write is attempted:
* ``read_proposal(key)`` / ``save_proposal(key, ...)`` accept a string ``key``
of the form ``"{frame_id}::{signature_hash}"``. The two components are
parsed inside this module so that upstream callers (router, step 12)
remain unaware of the on-disk layout.
* ``read_proposal`` on a malformed (legacy) key silently returns ``None``
— the IMP-33 u7 router currently passes a legacy ``cache_key`` string,
and u4 will switch to the structural form. Until then, all such reads
must miss safely (no exception, no false hit).
* ``save_proposal`` on a malformed key raises ``ValueError`` (loud, never
silent) — writes are gated and must use the structural form.
- ``visual_check_passed=False`` -> ``AiFallbackCacheGateError``
- ``user_approved=False`` -> ``AiFallbackCacheGateError``
Stored payload (one JSON file per (frame_id, signature_hash) pair):
Only when BOTH gates are True does control reach the storage layer,
which currently raises ``NotImplementedError`` (the IMP-46 marker).
{
"schema_version": 1,
"proposal": <AiFallbackProposal.model_dump(mode="json")>,
"slide_css": <str | null>,
"fingerprints": {"contract_sha": ..., "partial_sha": ..., "catalog_sha": ...}
}
Guardrails:
u3 invalidation contract (this module is a *comparator*, not a *computer*):
* No Anthropic import; cache is pure proposal bookkeeping.
* No MDX read/write; proposals are u2 ``AiFallbackProposal`` instances.
* No silent persistence: gate violations are loud, not skipped writes
(`feedback_artifact_status_naming`).
* ``save_proposal`` persists the ``fingerprints`` dict supplied by the
caller verbatim. Cache.py never computes any fingerprint — the three
declared shas (``contract_sha`` / ``partial_sha`` / ``catalog_sha``) are
computed by callers from the live contract YAML / partial templates /
catalog payloads and handed in. Keeping the computation out of cache.py
preserves AI isolation (no Phase Z runtime knowledge in the cache
module) and keeps the cache schema-agnostic — additional fingerprint
axes can be added without editing cache.py.
* ``read_proposal`` accepts an optional ``fingerprints`` kwarg. When
supplied, the stored ``fingerprints`` dict must equal the caller's dict
exactly (strict equality, NOT subset). Any mismatch — including a key
the caller demands but the stored entry lacks, OR a key the stored
entry has but the caller does not pass — returns ``None``. Default
``fingerprints=None`` performs no comparison (back-compat for legacy
callers that have not yet adopted fingerprint-aware lookup).
Guardrails (locked by Stage 2 plan):
* Both write gates preserved — ``visual_check_passed=False`` always
raises ``AiFallbackCacheGateError`` BEFORE any filesystem touch.
``user_approved=False`` also raises by default; the IMP-46 u5
``auto_cache=True`` override bypasses ONLY the ``user_approved`` gate
(``visual_check_passed`` is never bypassed). Gate violation never
silently no-ops.
* Missing or corrupt files cause ``read_proposal`` to return ``None`` —
the cache is a hint, never a hard dependency. Errors are not propagated
to callers because the AI fallback path can always recompute.
* ``mkdir(parents=True, exist_ok=True)`` is performed lazily on save.
* No Anthropic / MDX / Phase Z runtime imports (AI isolation contract).
* Cache root is held as a module-level :data:`CACHE_ROOT` so tests can
redirect writes via ``monkeypatch.setattr`` without subclassing.
u5 auto-cache contract (CLI ``--auto-cache`` + ``settings.ai_fallback_auto_cache``):
* ``save_proposal(..., auto_cache=True)`` only bypasses the
``user_approved`` gate; ``visual_check_passed`` remains mandatory.
* ``auto_cache`` is keyword-only and defaults to ``False`` — existing
callers (and the test suite) see the original dual-gate behaviour
unless they opt in explicitly.
* The truth table over ``(visual_check_passed, user_approved, auto_cache)``
has eight cells; exactly three succeed:
``(True, True, False)``, ``(True, True, True)``, and
``(True, False, True)``. Every other cell raises
``AiFallbackCacheGateError``.
"""
from __future__ import annotations
import json
import pathlib
from src.phase_z2_ai_fallback.schema import AiFallbackProposal
SCHEMA_VERSION = 1
KEY_DELIMITER = "::"
CACHE_ROOT: pathlib.Path = pathlib.Path("data/frame_cache")
class AiFallbackCacheGateError(RuntimeError):
"""Raised when ``save_proposal`` is called without both IMP-46 gates True."""
def read_proposal(key: str) -> AiFallbackProposal | None:
def _parse_key(key: str) -> tuple[str, str] | None:
"""Parse a ``frame_id::signature_hash`` key. Returns ``None`` if malformed."""
if KEY_DELIMITER not in key:
return None
frame_id, _, signature_hash = key.partition(KEY_DELIMITER)
if not frame_id or not signature_hash:
return None
if KEY_DELIMITER in signature_hash:
return None
return frame_id, signature_hash
def _cache_path(frame_id: str, signature_hash: str) -> pathlib.Path:
return CACHE_ROOT / frame_id / f"{signature_hash}.json"
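The key contract described in the module docstring can be exercised in isolation. The sketch below re-implements the same partition rules under a hypothetical name (`parse_key`) so that it runs without the package; it is an illustration, not the module's code, and the sample keys are invented.

```python
from __future__ import annotations

KEY_DELIMITER = "::"


def parse_key(key: str) -> tuple[str, str] | None:
    """Hypothetical stand-in that mirrors the _parse_key rules in this diff."""
    frame_id, sep, signature_hash = key.partition(KEY_DELIMITER)
    # No delimiter, or an empty component on either side, is malformed.
    if not sep or not frame_id or not signature_hash:
        return None
    # A second delimiter inside the hash part is also rejected.
    if KEY_DELIMITER in signature_hash:
        return None
    return frame_id, signature_hash


# Invented sample keys: one structural key and four malformed ones.
samples = ["frame_07::abc123", "legacy-key", "::abc123", "frame_07::", "a::b::c"]
parsed = {key: parse_key(key) for key in samples}
```

Only the first sample parses. On the read side a malformed key is a silent miss, while `save_proposal` turns the same `None` into a `ValueError`.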
def read_proposal(
key: str,
*,
fingerprints: dict | None = None,
) -> AiFallbackProposal | None:
"""Look up a previously cached proposal by ``key``.
IMP-33 ships without a persistent backend; this stub always returns
``None`` so callers exercise the cache-miss path. The persistent
backend will be wired by IMP-46.
Raises (loud):
* ``ValueError`` for an empty / non-string key;
* ``TypeError`` for non-dict ``fingerprints`` (when supplied),
symmetric with :func:`save_proposal`.
Returns ``None`` for:
* legacy key format (no ``::`` delimiter) → silent ``None`` (router
back-compat until u4 switches to the structural form);
* missing file under ``data/frame_cache/{frame_id}/{signature_hash}.json``;
* corrupt JSON / payload schema mismatch — read errors never propagate;
* ``fingerprints`` supplied AND stored ``fingerprints`` field is not a
dict OR does not equal the supplied dict (strict equality,
u3 invalidation).
"""
if not isinstance(key, str) or not key:
raise ValueError("cache key must be a non-empty string")
return None
if fingerprints is not None and not isinstance(fingerprints, dict):
raise TypeError("fingerprints must be a dict or None")
parsed = _parse_key(key)
if parsed is None:
return None
frame_id, signature_hash = parsed
path = _cache_path(frame_id, signature_hash)
if not path.is_file():
return None
try:
data = json.loads(path.read_text(encoding="utf-8"))
except (OSError, json.JSONDecodeError):
return None
if not isinstance(data, dict):
return None
if fingerprints is not None:
stored = data.get("fingerprints")
if not isinstance(stored, dict) or stored != fingerprints:
return None
proposal_dict = data.get("proposal")
if not isinstance(proposal_dict, dict):
return None
try:
return AiFallbackProposal.model_validate(proposal_dict)
except Exception: # noqa: BLE001 — corrupt payload must miss, not raise
return None
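The u3 invalidation rule is strict equality rather than a subset check. A minimal sketch of that comparison, using a hypothetical helper name (`fingerprints_match`) and made-up digest values, might look like this:

```python
from __future__ import annotations


def fingerprints_match(stored: object, requested: dict | None) -> bool:
    """Hypothetical helper: True when a cached entry may be served."""
    if requested is None:
        # Legacy callers pass no fingerprints, so no comparison happens.
        return True
    # The stored value must be a dict and equal the caller's dict exactly.
    return isinstance(stored, dict) and stored == requested


# Made-up digests, for illustration only.
stored = {"contract_sha": "c1", "partial_sha": "p1", "catalog_sha": "k1"}
exact = dict(stored)
caller_wants_more = {**stored, "extra_sha": "x1"}
caller_wants_less = {"contract_sha": "c1"}
```

Both the "caller asks for more" and the "caller asks for less" cases miss, which matches the docstring's statement that any key difference in either direction returns `None`.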
def save_proposal(
@@ -51,13 +161,39 @@ def save_proposal(
*,
visual_check_passed: bool,
user_approved: bool,
) -> None:
"""Persist ``proposal`` under ``key`` once both IMP-46 gates are True.
slide_css: str | None = None,
fingerprints: dict | None = None,
auto_cache: bool = False,
) -> pathlib.Path:
"""Persist ``proposal`` under ``key`` once the IMP-46 gates clear.
Raises ``AiFallbackCacheGateError`` if either gate is False — the
proposal is NOT written. When both gates are True, storage raises
``NotImplementedError`` (the IMP-46 persistent backend has not landed
yet).
Gate contract (IMP-46 u5 truth table):
* ``visual_check_passed=False`` -> :class:`AiFallbackCacheGateError`
always (never bypassable; ``auto_cache`` cannot override).
* ``user_approved=False`` AND ``auto_cache=False`` ->
:class:`AiFallbackCacheGateError`.
* ``user_approved=False`` AND ``auto_cache=True`` -> bypass the
user-approval gate (IMP-46 u5 CLI / settings opt-in).
* Otherwise (``visual_check_passed=True`` AND either
``user_approved=True`` OR ``auto_cache=True``) -> persist payload.
Gate violations are raised BEFORE any filesystem touch — no parent
directory is created, no file is written. When the gates clear the
JSON payload (schema_version + proposal + slide_css + fingerprints)
is written to ``data/frame_cache/{frame_id}/{signature_hash}.json``
and the resolved :class:`pathlib.Path` is returned.
``slide_css`` may be ``None`` (no slide-level CSS captured) or a
string. ``fingerprints`` may be ``None`` (treated as empty dict) or a
dict mapping fingerprint name to SHA hex digest.
``auto_cache`` is keyword-only and defaults to ``False``. It is wired
from :data:`src.config.settings.ai_fallback_auto_cache`, which the
``--auto-cache`` CLI flag in ``src/phase_z2_pipeline.py`` toggles at
parse time. The cache module never reads the setting itself — the
caller passes the resolved boolean — so AI-isolation contracts
(no Phase Z runtime / no Anthropic import) remain intact.
"""
if not isinstance(key, str) or not key:
raise ValueError("cache key must be a non-empty string")
@@ -66,17 +202,42 @@ def save_proposal(
"proposal must be an AiFallbackProposal instance "
f"(got {type(proposal).__name__})"
)
if not isinstance(auto_cache, bool):
raise TypeError("auto_cache must be a bool")
if not visual_check_passed:
raise AiFallbackCacheGateError(
"IMP-46 gate: visual_check_passed=False; refusing to cache an "
"unverified proposal."
"unverified proposal. (auto_cache cannot bypass this gate.)"
)
if not user_approved:
if not user_approved and not auto_cache:
raise AiFallbackCacheGateError(
"IMP-46 gate: user_approved=False; refusing to cache without "
"explicit user approval."
"IMP-46 gate: user_approved=False and auto_cache=False; "
"refusing to cache without explicit user approval. Pass "
"auto_cache=True (or --auto-cache on the CLI) to bypass."
)
raise NotImplementedError(
"IMP-46 persistent cache storage is not implemented yet; "
"this is the IMP-33 u6 stub marker."
if slide_css is not None and not isinstance(slide_css, str):
raise TypeError("slide_css must be a string or None")
if fingerprints is None:
fingerprints = {}
elif not isinstance(fingerprints, dict):
raise TypeError("fingerprints must be a dict or None")
parsed = _parse_key(key)
if parsed is None:
raise ValueError(
"cache key must be in "
f"'frame_id{KEY_DELIMITER}signature_hash' format; got {key!r}"
)
frame_id, signature_hash = parsed
path = _cache_path(frame_id, signature_hash)
path.parent.mkdir(parents=True, exist_ok=True)
payload = {
"schema_version": SCHEMA_VERSION,
"proposal": proposal.model_dump(mode="json"),
"slide_css": slide_css,
"fingerprints": dict(fingerprints),
}
path.write_text(
json.dumps(payload, sort_keys=True, ensure_ascii=False, indent=2),
encoding="utf-8",
)
return path
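The eight-cell truth table from the module docstring can be enumerated directly. The sketch below reduces the two gate checks to a boolean predicate under a hypothetical name (`gate_allows`); the real `save_proposal` raises `AiFallbackCacheGateError` instead of returning `False`.

```python
from itertools import product


def gate_allows(visual_check_passed: bool, user_approved: bool, auto_cache: bool) -> bool:
    """Hypothetical predicate form of the save_proposal gate order."""
    if not visual_check_passed:
        # Never bypassable, regardless of auto_cache.
        return False
    # auto_cache only stands in for a missing user approval.
    return user_approved or auto_cache


# Cells are (visual_check_passed, user_approved, auto_cache).
allowed = sorted(
    cell for cell in product([False, True], repeat=3) if gate_allows(*cell)
)
```

Enumerating the cells this way is a cheap regression check: if a later change lets a fourth cell through, the list length changes.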


@@ -0,0 +1,91 @@
"""IMP-46 u1 — Frame transformation cache signature builder.
Deterministic SHA256 over the 8 declared structural axes:
frame_id, v4_label, cardinality, source_shape,
h3_count, char_count_bucket, layout_preset, zone_position
Guardrails:
* No sample/section identifiers in the signature surface (no-hardcoding lock).
* source_shape constrained to the bullet/paragraph/table/mixed enum.
* char_count_bucket is the *bucket label*; numeric counts must be projected
via :func:`bucket_char_count` before being fed to :func:`build_signature`.
* Schema version is embedded in the hashed payload so a future axis change
breaks the digest by design (cache invalidation on schema bump).
"""
from __future__ import annotations
import hashlib
import json
from enum import Enum
SCHEMA_VERSION = 1
class SourceShape(str, Enum):
BULLET = "bullet"
PARAGRAPH = "paragraph"
TABLE = "table"
MIXED = "mixed"
_CHAR_COUNT_BUCKETS: tuple[tuple[int, str], ...] = (
(50, "0-50"),
(150, "51-150"),
(400, "151-400"),
(1000, "401-1000"),
)
_CHAR_COUNT_BUCKET_OVERFLOW = "1001+"
CHAR_COUNT_BUCKET_LABELS: tuple[str, ...] = tuple(
label for _, label in _CHAR_COUNT_BUCKETS
) + (_CHAR_COUNT_BUCKET_OVERFLOW,)
def bucket_char_count(char_count: int) -> str:
"""Project a non-negative character count to its fixed bucket label."""
if isinstance(char_count, bool) or not isinstance(char_count, int):
raise TypeError("char_count must be a non-negative int")
if char_count < 0:
raise ValueError("char_count must be non-negative")
for upper, label in _CHAR_COUNT_BUCKETS:
if char_count <= upper:
return label
return _CHAR_COUNT_BUCKET_OVERFLOW
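Bucket edges are inclusive on the upper bound, which is easy to get wrong when reading the table. The sketch below copies the bucket table from this diff into a standalone function (the name `bucket` is hypothetical and the type checks are omitted) and probes each edge.

```python
BUCKETS = ((50, "0-50"), (150, "51-150"), (400, "151-400"), (1000, "401-1000"))
OVERFLOW = "1001+"


def bucket(char_count: int) -> str:
    """Hypothetical copy of the projection loop, without input validation."""
    for upper, label in BUCKETS:
        if char_count <= upper:
            return label
    return OVERFLOW


# Probe both sides of every boundary.
edges = {n: bucket(n) for n in (0, 50, 51, 150, 151, 400, 401, 1000, 1001)}
```

The original function also rejects `bool` explicitly; that check is needed because `isinstance(True, int)` is true in Python.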
def build_signature(
*,
frame_id: str,
v4_label: str,
cardinality: int | None,
source_shape: SourceShape | str,
h3_count: int,
char_count_bucket: str,
layout_preset: str,
zone_position: str,
) -> str:
"""Return a deterministic SHA256 hex digest over the 8 declared axes."""
if isinstance(source_shape, SourceShape):
source_shape_value = source_shape.value
elif isinstance(source_shape, str):
source_shape_value = SourceShape(source_shape).value
else:
raise TypeError("source_shape must be SourceShape or str")
if char_count_bucket not in CHAR_COUNT_BUCKET_LABELS:
raise ValueError(
f"char_count_bucket={char_count_bucket!r} is not a known bucket "
f"label (expected one of {CHAR_COUNT_BUCKET_LABELS})"
)
payload = {
"schema_version": SCHEMA_VERSION,
"frame_id": frame_id,
"v4_label": v4_label,
"cardinality": cardinality,
"source_shape": source_shape_value,
"h3_count": h3_count,
"char_count_bucket": char_count_bucket,
"layout_preset": layout_preset,
"zone_position": zone_position,
}
encoded = json.dumps(payload, sort_keys=True, ensure_ascii=False).encode("utf-8")
return hashlib.sha256(encoded).hexdigest()
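The determinism claim rests on `json.dumps(..., sort_keys=True)` producing the same bytes for any key order. A small sketch of that property, using a hypothetical `digest` helper and an invented three-axis payload rather than the full eight axes:

```python
import hashlib
import json


def digest(payload: dict) -> str:
    """Hypothetical helper using the same encode-then-hash steps as the diff."""
    encoded = json.dumps(payload, sort_keys=True, ensure_ascii=False).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


# Invented payloads: same content in a different order, then a schema bump.
base = {"schema_version": 1, "frame_id": "frame_07", "h3_count": 2}
reordered = {"h3_count": 2, "frame_id": "frame_07", "schema_version": 1}
bumped = {**base, "schema_version": 2}
```

Key order does not affect the digest, while bumping `schema_version` does, which is the invalidation-by-design behaviour the docstring describes.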


@@ -1,32 +1,72 @@
"""IMP-33 u8 — Step 12 AI repair wiring (IMP-30 provisional units only).
"""IMP-33 u8 + IMP-46 u4 — Step 12 AI repair wiring with structural cache key.
Phase Z Step 12 = slot_payload (the runtime "light_edit / restructure" surface
where AI-assisted frame-aware adaptation is allowed per IMP-17 carve-out).
This module is the only call site that pipes Phase Z composition units into
``src.phase_z2_ai_fallback.router.route_ai_fallback``. Two structural gates
preserve the AI isolation contract:
``src.phase_z2_ai_fallback.router.route_ai_fallback``. One structural gate
preserves the AI isolation contract:
* IMP-30 provisional gate — units with ``provisional=False`` are skipped
before any route classification. AI repair is reserved for first-render
invariant survivors (no rank-1 V4 evidence, recovered as provisional).
* Reject gate — units whose V4 label maps to ``design_reference_only``
(``reject``) are skipped with ``skip_reason="design_reference_only_no_ai"``.
Reject path is design reference only — never an AI call.
Per IMP-47B u1+u2, the ``reject`` V4 label routes to
``ai_adaptation_required`` (no longer ``design_reference_only``) and is
admitted to the AI repair path; the legacy "reject gate" short-circuit is
removed. Any unit whose ``route_hint`` is not ``ai_adaptation_required``
still falls through to the catch-all ``route_not_ai_adaptation:<hint>``
skip — that single gate continues to enforce the AI=0 normal path.
Combined with the u7 router's flag-off + route-gate short-circuits, the
default Phase Z run path performs zero AI calls (PZ-1). Save to cache is
NOT performed here — that is the caller's responsibility AFTER
``visual_check_passed=True`` AND ``user_approved=True`` (u6 IMP-46 gate).
IMP-46 u4 — structural cache key + fingerprints
------------------------------------------------
The legacy ``cache_key`` was ``"{template_id}::{sorted(source_section_ids)}"``
which leaked sample / section identity into the cache surface
(no-hardcoding lock violation: structurally identical content with
different MDX section ids would miss). u4 replaces it with
``"{frame_id}::{signature_hash}"`` where ``signature_hash`` is the
deterministic SHA256 over the 8 declared structural axes (see
``src.phase_z2_ai_fallback.signature``). Per-unit signature inputs are
read from unit attributes:
* ``cardinality`` (int | None) — also forwarded to ``v4_result``
* ``layout_preset`` (str)
* ``zone_position`` (str)
* ``source_shape`` (str) — bullet / paragraph / table / mixed
* ``h3_count`` (int)
* ``char_count`` (int) — bucketed via ``bucket_char_count``
In parallel the three invalidation fingerprints
(``contract_sha`` / ``partial_sha`` / ``catalog_sha``) are computed and
attached to the record. The cache.py module remains a *comparator* — all
fingerprint *computation* happens here (or via injected loaders) so the
cache schema-agnostic contract is preserved. The router's existing
``read_proposal(cache_key)`` continues to perform exact-match lookup only
(fuzzy is deferred per Stage 2 plan); read-side fingerprint validation
through the router is a follow-up axis.
"""
from __future__ import annotations
import hashlib
import json
from typing import Any, Callable, Iterable
from src.phase_z2_ai_fallback.router import route_ai_fallback
from src.phase_z2_ai_fallback.signature import bucket_char_count, build_signature
_AI_ADAPTATION_ROUTE = "ai_adaptation_required"
_DESIGN_REFERENCE_ROUTE = "design_reference_only"
def _sha256_of(payload: Any) -> str:
"""Deterministic SHA256 hex digest over a JSON-serialisable payload."""
encoded = json.dumps(payload, sort_keys=True, ensure_ascii=False).encode("utf-8")
return hashlib.sha256(encoded).hexdigest()
def gather_step12_ai_repair_proposals(
@@ -38,6 +78,7 @@ def gather_step12_ai_repair_proposals(
figma_partial_loader: Callable[[str], dict] | None = None,
internal_region_lookup: Callable[[Any], dict] | None = None,
mdx_text_loader: Callable[[Any], str] | None = None,
catalog_sha_loader: Callable[[], str] | None = None,
) -> list[dict]:
"""Return one record per unit describing the Step 12 AI repair decision.
@@ -55,8 +96,16 @@ def gather_step12_ai_repair_proposals(
"skip_reason": str | None,
"proposal": dict | None,
"error": str | None,
"cache_key": str | None, # IMP-46 u4
"fingerprints": dict | None, # IMP-46 u4
}
``cache_key`` and ``fingerprints`` are populated only when the unit
reaches the AI-eligible code path (provisional + ai_adaptation route).
Skipped units retain ``None`` for both — the structural axes
(layout_preset / zone_position / source_shape / h3_count / char_count)
are not guaranteed to be set for non-AI paths.
``ai_called`` is True only when ``route_ai_fallback`` was invoked AND
returned a proposal OR raised. Flag-off / route-mismatch returns
``None`` from the router and is surfaced as ``ai_called=False`` with
@@ -64,6 +113,9 @@ def gather_step12_ai_repair_proposals(
"router decided not to run" from "router ran and returned a proposal".
"""
records: list[dict] = []
catalog_sha = (
catalog_sha_loader() if catalog_sha_loader is not None else ""
)
for index, unit in enumerate(units):
label = getattr(unit, "label", None)
route_hint = route_for_label(label)
@@ -78,15 +130,13 @@ def gather_step12_ai_repair_proposals(
"skip_reason": None,
"proposal": None,
"error": None,
"cache_key": None,
"fingerprints": None,
}
if not record["provisional"]:
record["skip_reason"] = "not_provisional"
records.append(record)
continue
if route_hint == _DESIGN_REFERENCE_ROUTE:
record["skip_reason"] = "design_reference_only_no_ai"
records.append(record)
continue
if route_hint != _AI_ADAPTATION_ROUTE:
record["skip_reason"] = f"route_not_ai_adaptation:{route_hint}"
records.append(record)
@@ -106,15 +156,40 @@ def gather_step12_ai_repair_proposals(
if mdx_text_loader is not None
else (getattr(unit, "raw_content", "") or "")
)
cache_key = "::".join(
[template_id, ",".join(sorted(record["source_section_ids"]))]
frame_id_value = getattr(unit, "frame_id", "") or ""
cardinality = getattr(unit, "cardinality", None)
layout_preset = getattr(unit, "layout_preset", "") or ""
zone_position = getattr(unit, "zone_position", "") or ""
source_shape = getattr(unit, "source_shape", "paragraph") or "paragraph"
h3_count = int(getattr(unit, "h3_count", 0) or 0)
char_count = int(getattr(unit, "char_count", 0) or 0)
char_count_bucket = bucket_char_count(char_count)
signature_hash = build_signature(
frame_id=frame_id_value,
v4_label=label or "",
cardinality=cardinality,
source_shape=source_shape,
h3_count=h3_count,
char_count_bucket=char_count_bucket,
layout_preset=layout_preset,
zone_position=zone_position,
)
cache_key = f"{frame_id_value}::{signature_hash}"
fingerprints = {
"contract_sha": _sha256_of(frame_contract),
"partial_sha": _sha256_of(figma_partial_json),
"catalog_sha": catalog_sha,
}
record["cache_key"] = cache_key
record["fingerprints"] = fingerprints
v4_result = {
"route": route_hint,
"label": label,
"frame_id": getattr(unit, "frame_id", None),
"rank": getattr(unit, "v4_rank", None),
"cardinality": None,
"cardinality": cardinality,
}
try:
proposal = route_ai_fallback(


@@ -78,6 +78,12 @@ from phase_z2_failure_router import (
from phase_z2_content_extractor import extract_content_objects, extract_rich_content_objects
from phase_z2_placement_planner import plan_placement
# IMP-47B u4 — Step 12 AI repair wiring. gather() short-circuits at the
# router when settings.ai_fallback_enabled is False (default), so import
# at module load is safe for the AI=0 normal path (PZ-1). Activation gate
# stays in src/config.py + src/phase_z2_ai_fallback/router.py.
from src.phase_z2_ai_fallback.step12 import gather_step12_ai_repair_proposals
# ─── Constants ──────────────────────────────────────────────────
@@ -569,12 +575,15 @@ def lookup_v4_match(
# use_as_is → Phase Z direct render
# light_edit → deterministic minor adjustment
# restructure → AI-assisted frame-aware adaptation (deferred to IMP-17 — carve-out, AI fallback only, outside the normal path)
# reject → design reference only (deferred to IMP-29 frontend override)
# reject → AI re-construction over the rank-1 reject frame (IMP-47B u1, 2026-05-21);
# policy correction supersedes the legacy "design reference only" disposition.
# Frame visual / contract stays untouched; AI only re-maps MDX content into
# declared slots. Activation still gated by ai_fallback_enabled (default OFF).
_IMP05_ROUTE_HINTS: dict[str, str] = {
"use_as_is": "direct_render",
"light_edit": "deterministic_minor_adjustment",
"restructure": "ai_adaptation_required",
"reject": "design_reference_only",
"reject": "ai_adaptation_required",
}
@@ -585,6 +594,249 @@ def _imp05_route_hint(label: Optional[str]) -> Optional[str]:
return _IMP05_ROUTE_HINTS.get(label)
def _load_frame_partial_html(template_id: str) -> str:
"""IMP-47B u4 — Read templates/phase_z2/families/{template_id}.html.
Missing partial (e.g., ``__empty__`` shell from IMP-30) returns an
empty string so gather_step12_ai_repair_proposals can still build a
record with skip_reason without raising on file IO.
"""
partial_path = TEMPLATE_DIR / "families" / f"{template_id}.html"
if not partial_path.is_file():
return ""
return partial_path.read_text(encoding="utf-8")
def _run_step12_ai_repair(units) -> list[dict]:
"""IMP-47B u4 — Wire gather_step12_ai_repair_proposals into Step 12.
Routes provisional units whose IMP-05 hint maps to
``ai_adaptation_required`` (``restructure`` + ``reject`` per u1)
through ``src.phase_z2_ai_fallback.router``. Normal-path units
(``use_as_is`` / ``light_edit`` / non-provisional) record a
skip_reason without invoking the router; flag-off runs short-circuit
at the router (``settings.ai_fallback_enabled=False`` default).
Returns the per-unit record list — u5 consumes records for
PARTIAL_OVERRIDES apply and u6 writes the audit artifact.
"""
return gather_step12_ai_repair_proposals(
units,
route_for_label=_imp05_route_hint,
get_contract_fn=get_contract,
frame_visual_loader=_load_frame_partial_html,
)
_REJECT_SUPPORTED_PROPOSAL_KINDS: frozenset[str] = frozenset({"partial_overrides"})
def _apply_ai_repair_proposals_to_zones(
ai_repair_records: list[dict],
unit_positions: list[str],
zones_data: list[dict],
) -> None:
"""IMP-47B u5 — Apply PARTIAL_OVERRIDES into zones_data.slot_payload.
Mutates each record's ``apply_status`` in place and merges
``proposal.payload.slots`` into the matching zone. Out-of-scope
kinds (``builder_options_patch``, ``slot_mapping_proposal``)
loud-fail with ``unsupported_kind_for_reject_route:<kind>`` — zones
untouched (human_review surfacing → u8). IMP-33 u5 validator
guarantees declared-slot completeness, so ``dict.update`` is the
structural merge (``feedback_ai_isolation_contract``).
"""
zone_by_position = {z["position"]: z for z in zones_data}
for record in ai_repair_records:
proposal = record.get("proposal")
if proposal is None:
record["apply_status"] = "no_proposal"
continue
kind = proposal.get("proposal_kind")
if kind not in _REJECT_SUPPORTED_PROPOSAL_KINDS:
record["apply_status"] = f"unsupported_kind_for_reject_route:{kind}"
print(
f" [ai-repair-apply] unit {record['unit_index']} "
f"proposal_kind='{kind}' out-of-scope for reject route — "
"skipping apply; human_review required.",
file=sys.stderr,
)
continue
unit_index = record["unit_index"]
position = (
unit_positions[unit_index]
if 0 <= unit_index < len(unit_positions) else None
)
zone = zone_by_position.get(position) if position is not None else None
if zone is None:
record["apply_status"] = "no_zone_match"
continue
slots = (proposal.get("payload") or {}).get("slots") or {}
zone["slot_payload"].update(slots)
record["apply_status"] = "applied:partial_overrides"
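The merge step relies on plain `dict.update` semantics: slots named in the proposal overwrite the zone's values, and slots the proposal does not mention are left alone. A minimal sketch with invented zone and proposal data:

```python
# Invented data shaped like the zone and proposal dicts used in the diff.
zone = {"position": "left", "slot_payload": {"title": "old title", "body": "kept"}}
proposal = {
    "proposal_kind": "partial_overrides",
    "payload": {"slots": {"title": "AI-repaired title"}},
}

# Same defensive unwrapping as the helper: missing keys fall back to {}.
slots = (proposal.get("payload") or {}).get("slots") or {}
zone["slot_payload"].update(slots)
```

Because the update is in place, callers that hold a reference to `zones_data` see the merged payload without any return value.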
def _check_post_ai_coverage_invariant(
units,
ai_repair_records: list[dict],
) -> dict:
"""IMP-47B u7 — Verify AI repair preserved every source_section_id.
Compares the union of unit-level ``source_section_ids`` (pre-AI) to
the union present on ``ai_repair_records`` post-apply. Per the AI
isolation contract + the absolute rule against dropped content
(``feedback_ai_isolation_contract``), AI repair never removes a
unit's section coverage. Any divergence indicates a regression that
u8 surfaces through ``slide_status.ai_repair_status``. The check is
structural (set membership); the per-record ``source_section_ids``
list is a copy populated by ``gather_step12_ai_repair_proposals``
(``step12.py:124``) so apply mutations cannot silently drop it.
"""
pre_ai_ids: set[str] = set()
for unit in units:
pre_ai_ids.update(getattr(unit, "source_section_ids", []) or [])
post_ai_ids: set[str] = set()
for record in ai_repair_records:
post_ai_ids.update(record.get("source_section_ids") or [])
dropped = sorted(pre_ai_ids - post_ai_ids)
return {
"pre_ai_section_ids": sorted(pre_ai_ids),
"post_ai_section_ids": sorted(post_ai_ids),
"dropped_section_ids": dropped,
"status": "ok" if not dropped else "violated",
}
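The coverage invariant is a set difference between section ids before and after AI repair. The sketch below reproduces that comparison under a hypothetical name (`coverage_check`) with stand-in units built from `SimpleNamespace`; the section ids are invented.

```python
from types import SimpleNamespace


def coverage_check(units, records) -> dict:
    """Hypothetical reduced form of the pre/post section-id comparison."""
    pre_ids: set[str] = set()
    for unit in units:
        pre_ids.update(getattr(unit, "source_section_ids", []) or [])
    post_ids: set[str] = set()
    for record in records:
        post_ids.update(record.get("source_section_ids") or [])
    dropped = sorted(pre_ids - post_ids)
    return {"dropped_section_ids": dropped, "status": "ok" if not dropped else "violated"}


# Stand-in units and records with invented section ids.
units = [
    SimpleNamespace(source_section_ids=["s1", "s2"]),
    SimpleNamespace(source_section_ids=["s3"]),
]
intact = [{"source_section_ids": ["s1", "s2"]}, {"source_section_ids": ["s3"]}]
lossy = [{"source_section_ids": ["s1"]}, {"source_section_ids": ["s3"]}]
```

Note that the check is one-directional: ids that appear only on the records do not count as a violation.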
def _persist_ai_repair_proposals_to_cache(
ai_repair_records: list[dict],
*,
visual_check_passed: bool,
user_approved: bool,
auto_cache: bool,
) -> None:
"""IMP-47B u13 — Persist applied AI repair proposals through IMP-46 gates.
Mutates each record in place with a ``cache_save_status`` axis.
Only records whose ``apply_status`` starts with ``"applied:"`` and
that still carry the original ``cache_key`` + ``fingerprints`` + a
serialized ``proposal`` dict are eligible — everything else is marked
``not_applied``. Eligible records go through
``cache.save_proposal`` with the IMP-46 dual-gate truth table; the
helper catches :class:`AiFallbackCacheGateError` so a gate block is
surfaced (``gate_blocked:<reason>``) without raising into the
pipeline runtime (the cache is a hint, never a hard dependency —
cache.py contract). ``visual_check_passed`` is never bypassable;
``auto_cache=True`` bypasses ONLY the ``user_approved`` gate per
IMP-46 u5. Pure save layer: no AI call, no MDX touch.
"""
from src.phase_z2_ai_fallback.cache import (
AiFallbackCacheGateError,
save_proposal,
)
from src.phase_z2_ai_fallback.schema import AiFallbackProposal
for record in ai_repair_records:
apply_status = record.get("apply_status") or ""
proposal_dict = record.get("proposal")
cache_key = record.get("cache_key")
fingerprints = record.get("fingerprints")
if (
not apply_status.startswith("applied:")
or not isinstance(proposal_dict, dict)
or not cache_key
or not isinstance(fingerprints, dict)
):
record["cache_save_status"] = "not_applied"
continue
try:
proposal_obj = AiFallbackProposal.model_validate(proposal_dict)
except Exception as exc: # noqa: BLE001 — invalid payload → skip, never raise
record["cache_save_status"] = f"invalid_proposal:{type(exc).__name__}"
continue
try:
save_proposal(
cache_key,
proposal_obj,
visual_check_passed=visual_check_passed,
user_approved=user_approved,
auto_cache=auto_cache,
fingerprints=fingerprints,
)
except AiFallbackCacheGateError as gate_exc:
record["cache_save_status"] = f"gate_blocked:{gate_exc}"
continue
record["cache_save_status"] = "saved"
def _summarize_ai_repair_status(
ai_repair_records: list[dict],
coverage_invariant: dict,
) -> dict:
"""IMP-47B u8 — Classify Step 12 AI repair outcomes for slide_status surfacing.
Reads u4 gather ``error`` + u5 ``apply_status`` + u7 coverage_invariant
to derive a single ``ai_repair_status`` axis attached to
``slide_status``. Failure-axis priority (highest → lowest):
``error`` > ``coverage_violated`` > ``unsupported_kind`` > ``applied`` > ``ok``.
``human_review_required`` flips True on the three failure axes so the
frontend (u11) can surface a notification per the IMP-47B policy
("AI call failure / proposal validation failure / insufficient coverage → frontend notification").
Pure: no IO, no AI call.
"""
counts = {
"total": len(ai_repair_records),
"applied": 0,
"no_proposal": 0,
"no_zone_match": 0,
"unsupported_kind": 0,
"error": 0,
}
unsupported_records: list[dict] = []
error_records: list[dict] = []
for record in ai_repair_records:
if record.get("error"):
counts["error"] += 1
error_records.append({
"unit_index": record.get("unit_index"),
"source_section_ids": list(record.get("source_section_ids") or []),
"error": record.get("error"),
})
continue
apply_status = record.get("apply_status") or ""
if apply_status.startswith("applied:"):
counts["applied"] += 1
elif apply_status.startswith("unsupported_kind_for_reject_route:"):
counts["unsupported_kind"] += 1
unsupported_records.append({
"unit_index": record.get("unit_index"),
"source_section_ids": list(record.get("source_section_ids") or []),
"apply_status": apply_status,
})
elif apply_status == "no_zone_match":
counts["no_zone_match"] += 1
else:
counts["no_proposal"] += 1
coverage_status = (coverage_invariant or {}).get("status", "ok")
dropped = list((coverage_invariant or {}).get("dropped_section_ids") or [])
if counts["error"]:
status = "error"
elif coverage_status != "ok":
status = "coverage_violated"
elif counts["unsupported_kind"]:
status = "unsupported_kind"
elif counts["applied"]:
status = "applied"
else:
status = "ok"
return {
"status": status,
"counts": counts,
"unsupported_kind_records": unsupported_records,
"error_records": error_records,
"coverage_status": coverage_status,
"dropped_section_ids": dropped,
"human_review_required": status in {"error", "coverage_violated", "unsupported_kind"},
}
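The failure-axis priority named in the docstring above can be illustrated in isolation. This is a minimal sketch under stated assumptions, not the pipeline's code: `pick_status` and `FAILURE_AXES` are hypothetical names, and the sample counts are invented for illustration.

```python
# Hypothetical standalone sketch of the priority order
# error > coverage_violated > unsupported_kind > applied > ok.
FAILURE_AXES = {"error", "coverage_violated", "unsupported_kind"}


def pick_status(counts: dict, coverage_status: str) -> str:
    """Return the highest-priority axis present (illustrative helper only)."""
    if counts.get("error"):
        return "error"
    if coverage_status != "ok":
        return "coverage_violated"
    if counts.get("unsupported_kind"):
        return "unsupported_kind"
    if counts.get("applied"):
        return "applied"
    return "ok"


# Invented sample: proposals were applied, but coverage was violated,
# so the coverage axis outranks both unsupported_kind and applied.
sample_status = pick_status({"applied": 1, "unsupported_kind": 1}, "violated")
needs_review = sample_status in FAILURE_AXES
```

Because the checks run from highest to lowest priority, a single `error` record masks every other axis; the per-axis `counts` dict is what keeps the lower-priority outcomes visible to reviewers.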
def lookup_v4_match_with_fallback(
v4: dict,
section_id: str,
@@ -878,6 +1130,54 @@ def lookup_v4_candidates(
return candidates
def _apply_frame_override_to_unit(unit, new_tid: str, v4: dict) -> str:
"""IMP-47B u3 — apply a frame override to *unit* in place.
Returns a meta_source string for the override book-keeping. Three
probe layers, in order:
1. ``unit.v4_candidates`` (non-reject, max_n bounded). Copies
frame_id / frame_number / confidence / label from the matching
candidate so Step 9 metadata stays consistent. Returns
``"v4_candidates"``.
2. Full 32 V4 judgments (reject inclusive). When the override
target matches a reject judgment for the unit's primary section,
the unit is promoted to ``provisional=True`` with ``label="reject"``
so Step 12 (IMP-47B u4) admits the AI repair path. Returns
``"v4_reject_judgment_provisional"``.
3. Raw fall-through. Updates only ``frame_template_id``; returns
``"raw_template_id_only"``.
Frame visual / contract stay untouched per the AI isolation contract
(frame auto-swap forbidden — AI re-places content into the existing
frame only). The caller validates catalog contract presence before
invoking this helper.
"""
for cand in (unit.v4_candidates or []):
if getattr(cand, "template_id", None) == new_tid:
unit.frame_template_id = cand.template_id
unit.frame_id = cand.frame_id
unit.frame_number = cand.frame_number
unit.confidence = cand.confidence
unit.label = cand.label
return "v4_candidates"
primary_sid = (
unit.source_section_ids[0] if unit.source_section_ids else None
)
if primary_sid:
for j in lookup_v4_all_judgments(v4, primary_sid):
if j.template_id == new_tid and j.label == "reject":
unit.frame_template_id = j.template_id
unit.frame_id = j.frame_id
unit.frame_number = j.frame_number
unit.confidence = j.confidence
unit.label = "reject"
unit.provisional = True
return "v4_reject_judgment_provisional"
unit.frame_template_id = new_tid
return "raw_template_id_only"
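The three probe layers can be exercised without the pipeline by using plain stand-in objects. The sketch below is illustrative only: `apply_override`, the `SimpleNamespace` stand-ins, and the template ids `T01` / `T02` / `T09` are assumptions made for the example, and the reject judgments are passed in directly rather than looked up through `lookup_v4_all_judgments`.

```python
from types import SimpleNamespace


def apply_override(unit, new_tid, candidates, judgments):
    """Illustrative three-layer probe; returns a meta_source string."""
    # Layer 1: bounded, non-reject candidates.
    for cand in candidates:
        if cand.template_id == new_tid:
            unit.frame_template_id = cand.template_id
            unit.label = cand.label
            return "v4_candidates"
    # Layer 2: a matching reject judgment promotes the unit to provisional.
    for j in judgments:
        if j.template_id == new_tid and j.label == "reject":
            unit.frame_template_id = j.template_id
            unit.label = "reject"
            unit.provisional = True
            return "v4_reject_judgment_provisional"
    # Layer 3: raw fall-through updates only the template id.
    unit.frame_template_id = new_tid
    return "raw_template_id_only"


# Invented stand-ins: one non-reject candidate and one reject judgment.
unit = SimpleNamespace(frame_template_id="T01", label="use_as_is", provisional=False)
cands = [SimpleNamespace(template_id="T02", label="light_edit")]
judgments = [SimpleNamespace(template_id="T09", label="reject")]
source = apply_override(unit, "T09", cands, judgments)
```

In this sample the override target misses layer 1 and matches the reject judgment in layer 2, so the stand-in unit ends up provisional with `label="reject"`, which is the state the Step 12 gate is described as admitting.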
# ─── Content weight + zone layout computation ─────────────────────────
# Layout preset selection is handled by phase_z2_composition.select_layout_preset (composition v0).
# This module's select_layout_preset was an earlier simple count-based implementation and was removed as dead code (2026-04-29).
@@ -3336,6 +3636,57 @@ def run_phase_z2_mvp1(
),
}
# IMP-47B u12 — mixed direct+reject first-render admission.
# When initial plan_composition produces a viable layout but at least one
# section remains uncovered (typically chain_exhausted / reject), re-run
# with allow_provisional in the lookup + allow_provisional_fill=True so
# reject sections gain a provisional rank-1 V4Match and a last-resort
# provisional candidate fill. This admits the mixed direct+reject case
# to the AI repair path (IMP-47B u4/u5) on first render. Skipped under
# --override-section-assignments to preserve the operator's plan and
# mirror the IMP-30 u4 retry's section_assignment_plan gate. All-direct
# slides have no uncovered sections so this is a no-op. The all-reject
# case is still handled by the IMP-30 u4 retry block below (initial
# plan_composition returns units=[]).
if units and layout_preset is not None and not override_section_assignments:
_u12_covered_ids: set[str] = set()
for _u in units:
_u12_covered_ids.update(_u.source_section_ids)
_u12_uncovered_ids = [
s.section_id for s in sections if s.section_id not in _u12_covered_ids
]
if _u12_uncovered_ids:
def _lookup_fn_mixed_admission(sid: str) -> Optional[V4Match]:
match, trace = lookup_v4_match_with_fallback(
v4,
sid,
raw_content=section_content_by_id.get(sid),
alias_keys=section_alias_by_id.get(sid),
allow_provisional=True,
)
v4_fallback_traces[sid] = trace
return match
units_mixed, layout_preset_mixed, _comp_debug_mixed = plan_composition(
sections,
_lookup_fn_mixed_admission,
V4_LABEL_TO_PHASE_Z_STATUS,
MVP1_ALLOWED_STATUSES,
capacity_fit_fn=compute_capacity_fit,
v4_candidates_lookup_fn=candidates_lookup_fn,
allow_provisional_fill=True,
)
if units_mixed and layout_preset_mixed is not None:
units = units_mixed
layout_preset = layout_preset_mixed
comp_debug["v4_fallback_selections"] = list(v4_fallback_traces.values())
comp_debug["imp47b_u12_mixed_admission"] = {
"applied": True,
"uncovered_before": _u12_uncovered_ids,
"result_unit_count": len(units_mixed),
"result_layout_preset": layout_preset_mixed,
}
# ── Step 7-A axis : layout override ──
# When the user selects a different preset in LayoutPanel, the automatically determined value is forcibly overridden.
# For a length mismatch (positions count vs unit count), the zone loop's fallback (zone_{i})
@@ -3684,7 +4035,10 @@ def run_phase_z2_mvp1(
# Format: {unit_id: template_id}. On a unit_id match, unit.frame_template_id is forcibly changed.
# If an entry with the same template_id is found in v4_candidates, frame_id /
# frame_number / confidence / label are also copied from that entry, so that the step09
# artifact metadata stays consistent.
# artifact metadata stays consistent. IMP-47B u3 (2026-05-21): on a v4_candidates miss,
# all 32 judgments are probed as well; when the user selected a reject-labeled frame,
# the unit is promoted to provisional=True so it passes the Step 12 AI restructuring gate
# (frame is kept, automatic frame swap forbidden; see [[feedback_ai_isolation_contract]]).
# If the template_id has no frame contract registered in the catalog: skip + warning,
# to prevent a crash (some candidates receive a V4 score but have no catalog partial).
frame_overrides_applied: list[dict] = []
@@ -3713,21 +4067,7 @@ def run_phase_z2_mvp1(
file=sys.stderr,
)
continue
match = None
for cand in (unit.v4_candidates or []):
if getattr(cand, "template_id", None) == new_tid:
match = cand
break
if match is not None:
unit.frame_template_id = match.template_id
unit.frame_id = match.frame_id
unit.frame_number = match.frame_number
unit.confidence = match.confidence
unit.label = match.label
meta_source = "v4_candidates"
else:
unit.frame_template_id = new_tid
meta_source = "raw_template_id_only"
meta_source = _apply_frame_override_to_unit(unit, new_tid, v4)
frame_overrides_applied.append({
"unit_id": unit_id,
"from": old_tid,
@@ -4329,6 +4669,58 @@ def run_phase_z2_mvp1(
note="B4 PlacementPlan slot_assignments: render path not connected. The actual render slot mapping is done by the builder in mapper.py.",
)
# ─── Step 12 IMP-47B u4 — AI repair proposal gather ───
# Wire gather_step12_ai_repair_proposals so reject / restructure
# provisional units reach the AI fallback router. Normal-path units
# (use_as_is / light_edit / non-provisional) skip via the catch-all
# route gate; flag-off runs short-circuit at the router. Stored locally
# for u5 (PARTIAL_OVERRIDES apply) + u6 (step12_ai_repair.json audit).
ai_repair_records = _run_step12_ai_repair(units)
# ─── Step 12 IMP-47B u5 — Apply PARTIAL_OVERRIDES proposals ───
# Mirror the per-unit position derivation from the render loop above
# (L3789-3796); apply merges slots into zone slot_payload, loud-fails
# unsupported kinds via apply_status marker.
unit_positions: list[str] = []
for _i, _unit in enumerate(units):
_pos = positions[_i] if _i < len(positions) else f"zone_{_i}"
_plan_record = render_record_by_unit_id.get(id(_unit))
if _plan_record is not None and _plan_record.get("position"):
_pos = _plan_record["position"]
unit_positions.append(_pos)
_apply_ai_repair_proposals_to_zones(ai_repair_records, unit_positions, zones_data)
# ─── Step 12 IMP-47B u7 — Post-AI source_section_ids coverage invariant ───
# Structural defense: AI repair must not silently drop a unit's
# source_section_ids. Absolute no-drop rule: text_block / table / image /
# details deletion is forbidden. Result feeds u6 audit (below) and
# u8 slide_status.ai_repair_status surfacing.
ai_repair_coverage_invariant = _check_post_ai_coverage_invariant(
units, ai_repair_records,
)
# ─── Step 12 IMP-47B u6 — AI repair audit artifact ───
# Persist per-unit gather/apply outcomes (route_hint, skip_reason,
# apply_status, ai_called, proposal kind, cache_key, fingerprints)
# so reviewers can audit which units reached the AI fallback router
# and what happened. Flag-off default → every record has
# ai_called=False + apply_status='no_proposal'; flag-on +
# provisional reject/restructure → router_short_circuit (cache miss
# without client) or applied:partial_overrides (cache hit / live AI).
# u7 coverage_invariant rides alongside per_unit for reviewers.
_write_step_artifact(
run_dir, 12, "ai_repair",
data={
"per_unit": ai_repair_records,
"coverage_invariant": ai_repair_coverage_invariant,
},
step_status="done",
pipeline_path_connected=True,
inputs=["step10_frame_contract.json", "step02_normalized.json"],
outputs=["step12_ai_repair.json"],
note="IMP-47B u6 — Step 12 AI repair gather + apply records per unit (route, skip_reason, apply_status, proposal). u7 coverage_invariant = pre/post AI source_section_ids set comparison.",
)
# ─── Step 12: Slot Payload (actual values, mapper.py output) ───
_write_step_artifact(
run_dir, 12, "slot_payload",
@@ -4943,6 +5335,24 @@ def run_phase_z2_mvp1(
),
)
# ─── IMP-47B u13: Persist validated AI repair proposals to cache ───
# Saves each applied PARTIAL_OVERRIDES proposal AFTER Step 14 visual
# check + per IMP-46 dual-gate. ``visual_check_passed`` reads the
# Selenium overflow result; ``auto_cache`` sourced from Settings
# (CLI --auto-cache wires settings.ai_fallback_auto_cache at parse
# time, src/phase_z2_pipeline.py:5631-5633). ``user_approved`` stays
# False — the pipeline has no UX approval gate; the auto_cache
# opt-in is the documented bypass per IMP-46 u5. Gate violations
# surface as ``cache_save_status='gate_blocked:<reason>'`` on the
# record (cache is a hint, never a hard dependency).
from src.config import settings as _ai_cache_settings
_persist_ai_repair_proposals_to_cache(
ai_repair_records,
visual_check_passed=bool(overflow.get("passed")),
user_approved=False,
auto_cache=bool(_ai_cache_settings.ai_fallback_auto_cache),
)
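The dual-gate contract described in the comment above (visual check is mandatory, `auto_cache` bypasses only `user_approved`) can be sketched on its own. This is a hedged illustration, not the cache module's implementation: `GateError` is a hypothetical stand-in for `AiFallbackCacheGateError`, and the reason strings `visual_check_failed` / `user_not_approved` are invented for the example.

```python
class GateError(Exception):
    """Hypothetical stand-in for AiFallbackCacheGateError."""


def check_save_gate(visual_check_passed: bool, user_approved: bool, auto_cache: bool) -> None:
    """Raise GateError unless both gates of the assumed contract pass."""
    # Gate 1: the visual check can never be bypassed.
    if not visual_check_passed:
        raise GateError("visual_check_failed")
    # Gate 2: auto_cache is an opt-in bypass for user approval only.
    if not (user_approved or auto_cache):
        raise GateError("user_not_approved")


def save_status(visual_check_passed: bool, user_approved: bool, auto_cache: bool) -> str:
    """Mirror the record-level status string: saved or gate_blocked:<reason>."""
    try:
        check_save_gate(visual_check_passed, user_approved, auto_cache)
    except GateError as exc:
        return f"gate_blocked:{exc}"
    return "saved"


# The pipeline call above passes user_approved=False, so under this
# sketch only the auto_cache opt-in can lead to a "saved" status.
status_with_opt_in = save_status(True, False, True)
status_without_opt_in = save_status(True, False, False)
```

Catching the gate error and recording it as a status string, rather than re-raising, matches the stated intent that the cache is a hint and never a hard dependency of the run.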
# 10. fit_classifier v0 (A1): Selenium result → spec §3 category classification layer.
# *Classification only*. No action / router / rerender. Zero behavior change.
fit_classification = classify_visual_runtime_check(overflow, debug_zones)
@@ -5126,6 +5536,16 @@ def run_phase_z2_mvp1(
debug_zones=debug_zones,
)
# IMP-47B u8 — Surface Step 12 AI repair outcomes through slide_status.
# Composes u4 gather errors + u5 apply_status + u7 coverage_invariant
# into a single ``ai_repair_status`` axis the frontend (u11) reads to
# render human_review notifications. Auto pipeline first
# ([[feedback_auto_pipeline_first]]) — no review_queue insertion;
# explicit status enum + human_review_required flag.
slide_status["ai_repair_status"] = _summarize_ai_repair_status(
ai_repair_records, ai_repair_coverage_invariant,
)
# ─── Step 20: Slide Status ───
_write_step_artifact(
run_dir, 20, "slide_status",
@@ -5147,6 +5567,11 @@ def run_phase_z2_mvp1(
_aligned = slide_status.get("aligned_section_ids") or []
_covered = slide_status.get("covered_section_ids") or []
_filtered = slide_status.get("filtered_section_ids") or []
_ai_repair = slide_status.get("ai_repair_status") or {}
_ai_repair_label = (
f'{_ai_repair.get("status", "?")} '
f'(human_review_required={_ai_repair.get("human_review_required", False)})'
)
_write_step_html(
run_dir, 20, "final_status",
title="Final Slide Status",
@@ -5161,6 +5586,7 @@ def run_phase_z2_mvp1(
f'<tr><th>filtered_section_ids</th><td>{_filtered}</td></tr>'
f'<tr><th>adapter_needed_count</th><td>{slide_status.get("adapter_needed_count", 0)}</td></tr>'
f'<tr><th>content_truncated_count</th><td>{slide_status.get("content_truncated_count", 0)}</td></tr>'
f'<tr><th>ai_repair_status</th><td>{_ai_repair_label}</td></tr>'
f'</table>'
f'<h2>Visual Fail Reasons</h2>{_vfs_html}'
f'<h2>Note</h2><p>{slide_status.get("note", "")}</p>'
@@ -5331,8 +5757,29 @@ if __name__ == "__main__":
"--override-section-assignment bottom=03-2,03-3"
),
)
# IMP-46 u5 — auto-cache opt-in. When set, ``cache.save_proposal``
# bypasses the ``user_approved`` gate only (``visual_check_passed``
# is never bypassable). Source of truth is
# ``settings.ai_fallback_auto_cache`` (src/config.py); this flag
# mutates the setting in-process so downstream callers read the
# same value through Settings rather than parsing args themselves.
parser.add_argument(
"--auto-cache",
dest="auto_cache",
action="store_true",
default=False,
help=(
"Allow cache.save_proposal to bypass the user_approved gate "
"(visual_check_passed remains mandatory). Sets "
"settings.ai_fallback_auto_cache=True for this run."
),
)
args = parser.parse_args()
if args.auto_cache:
from src.config import settings as _settings
_settings.ai_fallback_auto_cache = True
overrides_frames: dict[str, str] = {}
for ov in args.override_frames:
if "=" not in ov: