diff --git a/docs/architecture/IMP-17-CARVE-OUT.md b/docs/architecture/IMP-17-CARVE-OUT.md index 0f3c821..555111a 100644 --- a/docs/architecture/IMP-17-CARVE-OUT.md +++ b/docs/architecture/IMP-17-CARVE-OUT.md @@ -1,6 +1,6 @@ # IMP-17 — AI repair fallback infrastructure (carve-out) -**Status**: carve-out, **design-only**. Normal-path AI calls = 0. No runtime fallback code lands until the activation gate clears. +**Status**: carve-out infra **scaffolded under IMP-33** (issue #61, Stage 3 u1~u11). Normal-path AI calls = 0 (PZ-1) — `ai_fallback_enabled` flag default `False` in `src/config.py`. Runtime AI is reachable only via fallback path entry points; Step 12 entry is provisional-gated, Step 17 entry is structurally blocked behind IMP-34 + IMP-35. **Source anchors** - IMP-17 backlog row — [`PHASE-Z-IMPLEMENTATION-ISSUE-BACKLOG.md`](PHASE-Z-IMPLEMENTATION-ISSUE-BACKLOG.md):68 (carve-out — outside the normal path, soft link IMP-04 + IMP-05). @@ -42,3 +42,14 @@ Phase Q `content_editor.py` is an **Archive Candidate** ([`PHASE-Q-AUDIT.md`](PHAS - AI calls are not on the normal path (Phase Z principle, [memory `feedback_ai_isolation_contract`](../../README.md)). - The output unit is always a content_object / Internal Region / Frame Slot or a restructuring proposal — no HTML structure / layout / preset decisions. - Severed from Phase Q assets (Kei persona prompts, Kei-API endpoint, persona retry semantics). The Phase Z fallback runtime starts from a separate prompt / endpoint design (when this carve-out is activated). + +## Runtime module surface (IMP-33 u1~u11 binding) + +| Axis | Binding | +|---|---| +| Module path | `src/phase_z2_ai_fallback/` (locked by [`IMP-31-GATE-AUDIT.md`](IMP-31-GATE-AUDIT.md):31,50,56). | +| Step 12 entry | `src.phase_z2_ai_fallback.step12.gather_step12_ai_repair_proposals` — IMP-30 provisional gate (`not_provisional` skip) AND reject gate (`design_reference_only_no_ai` skip) AND non-AI route catch-all run BEFORE `route_ai_fallback`. | +| Step 17 entry | `src.phase_z2_ai_fallback.step17.gather_step17_ai_repair_proposals` — STRUCTURALLY BLOCKED. 
Every unit returns `skip_reason="step17_ai_blocked_imp_34_35_prerequisites_missing"`. Module does NOT import `route_ai_fallback` / `AiFallbackClient` / `anthropic`. | +| Cascade order | `src.phase_z2_ai_fallback.step17.OVERFLOW_CASCADE_ORDER = (DETERMINISTIC, POPUP, AI_REPAIR, USER_OVERRIDE)` — single source of truth for Step 17 consumers. Aligns with line 16 of this doc. | +| IMP-46 cache gate | `src.phase_z2_ai_fallback.cache.save_proposal(..., visual_check_passed, user_approved)` raises `AiFallbackCacheGateError` unless BOTH gates are True; storage backend then raises `NotImplementedError` (IMP-46 marker). `read_proposal` returns `None` until IMP-46 lands a backend. | +| AST isolation | `tests/phase_z2_ai_fallback/test_ast_isolation.py` parses every `*.py` under `src/phase_z2_ai_fallback/` and forbids Phase Q runtime / Kei client / `src.phase_z2_*` (non-fallback) imports. Whitelist = `src.config` + intra-package + stdlib + `anthropic` + `pydantic`. | diff --git a/docs/architecture/IMP-31-GATE-AUDIT.md b/docs/architecture/IMP-31-GATE-AUDIT.md index 25a503f..5d003a0 100644 --- a/docs/architecture/IMP-31-GATE-AUDIT.md +++ b/docs/architecture/IMP-31-GATE-AUDIT.md @@ -28,7 +28,7 @@ Anchor pin: `tests/orchestrator_unit/test_imp17_comment_anchor.py`. Synced in [` | 2 | B4 frame_selection evidence integration complete | **NOT CLEAR** (⚠ partial) | [`PHASE-Z-PIPELINE-STATUS-BOARD.md`](PHASE-Z-PIPELINE-STATUS-BOARD.md):48 Step 9 ⚠ partial; :82 "B4 frame_selection V4 evidence not yet integrated"; :126 (j) ❌ pending. | | 3 | IMP-04 catalog expansion + IMP-05 V4 fallback live | **AMBIGUOUS** | `templates/phase_z2/catalog/frame_contracts.yaml` = 11 `template_id:` entries vs 32 target. IMP-05 V4 rank-2/3 fallback selector logic live, but catalog coverage gates real semantics. | -**Verdict**: gate **NOT CLEAR**. Runtime AI adaptation remains design-only. `src/phase_z2_ai_fallback/` = declaration-only path (not created this cycle). +**Verdict**: gate **NOT CLEAR**. 
Runtime AI adaptation remains gated. `src/phase_z2_ai_fallback/` = **scaffolded under IMP-33** (#61, Stage 3 u1~u11); module created, but `settings.ai_fallback_enabled` defaults to `False` (u1) so normal-path AI call count remains 0 (PZ-1). Runtime engagement still requires the 3-condition AND gate above. ## Issue-body axis verdict @@ -47,13 +47,13 @@ Anchor pin: `tests/orchestrator_unit/test_imp17_comment_anchor.py`. Synced in [` ## Out of scope (this cycle) -Runtime AI module, `src/phase_z2_ai_fallback/` directory creation, prompt implementation, `candidate_evidence` schema change, Phase Q file mutation, Kei API reuse, frontend zone override (IMP-29 scope), IMP-30 invariant change, `calculate_fit` migration. +Runtime AI consumer enablement (flag default OFF), `candidate_evidence` schema change, Phase Q file mutation, Kei API reuse, frontend zone override (IMP-29 scope), IMP-30 invariant change, `calculate_fit` migration. Note: `src/phase_z2_ai_fallback/` directory scaffold itself was created under IMP-33 (#61, Stage 3 u1~u11) — see [`IMP-17-CARVE-OUT.md`](IMP-17-CARVE-OUT.md) §"Runtime module surface". -## Future activation path (declaration only) +## Future activation path When the 3-condition AND gate clears (User GO ∧ B4 V4 evidence integrated ∧ catalog 32/32 + IMP-05 V4 fallback live): -- Runtime AI module path = `src/phase_z2_ai_fallback/` (not created this cycle). +- Runtime AI module path = `src/phase_z2_ai_fallback/` (scaffolded under IMP-33; flag default OFF until gate clears). - Provider = Anthropic API only. Prompt design starts fresh (no Phase Q `EDITOR_PROMPT` import). - Output granularity = content_object → Internal Region / Frame Slot placement proposal. Frame / layout / zone topology selection remains deterministic. - Activation tracker = this issue (#40, IMP-31). No new IMP ID issued. 
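
Reviewer note: the 3-condition AND gate described in the audit hunk above (User GO ∧ B4 V4 evidence integrated ∧ catalog 32/32 + IMP-05 V4 fallback live), combined with the `ai_fallback_enabled` flag, can be read as a single boolean conjunction. The sketch below is illustrative only: `ActivationGate`, `runtime_ai_allowed`, and `CATALOG_TARGET` are hypothetical names that do not appear in this patch, and the repository may track these conditions differently.

```python
from dataclasses import dataclass

# Target catalog size quoted in the gate audit ("11 entries vs 32 target").
CATALOG_TARGET = 32


@dataclass(frozen=True)
class ActivationGate:
    """Hypothetical container for the three gate conditions (not part of this patch)."""

    user_go: bool
    b4_v4_evidence_integrated: bool
    catalog_entries: int
    imp05_v4_fallback_live: bool

    def is_clear(self) -> bool:
        # Condition 3 is itself a conjunction: full catalog AND live IMP-05 fallback.
        catalog_ready = (
            self.catalog_entries >= CATALOG_TARGET and self.imp05_v4_fallback_live
        )
        return self.user_go and self.b4_v4_evidence_integrated and catalog_ready


def runtime_ai_allowed(gate: ActivationGate, ai_fallback_enabled: bool) -> bool:
    """Runtime AI engages only when the flag is ON and every gate condition holds."""
    return ai_fallback_enabled and gate.is_clear()


# Illustrative values: 11 catalog entries (as reported in the audit), other inputs assumed.
partial = ActivationGate(
    user_go=True,
    b4_v4_evidence_integrated=False,
    catalog_entries=11,
    imp05_v4_fallback_live=True,
)
cleared = ActivationGate(
    user_go=True,
    b4_v4_evidence_integrated=True,
    catalog_entries=32,
    imp05_v4_fallback_live=True,
)
```

Under this reading, even a fully cleared gate yields no runtime AI while the flag keeps its default `False`, which matches the "flag default OFF until gate clears" wording in the hunk.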
diff --git a/src/config.py b/src/config.py index 81c635c..f9de404 100644 --- a/src/config.py +++ b/src/config.py @@ -14,6 +14,18 @@ class Settings(BaseSettings): slide_width: int = 1280 slide_height: int = 720 + # IMP-33 u1 — AI fallback policy. Fallback-path only; normal path AI=0. + # Defaults locked by Stage 2 plan; do NOT inline literals downstream. + ai_fallback_enabled: bool = False + ai_fallback_model: str = "claude-opus-4-6-20250415" + ai_fallback_timeout_s: float = 60.0 + ai_fallback_max_retries: int = 3 + ai_fallback_backoff_base_s: float = 1.0 + ai_fallback_backoff_cap_s: float = 8.0 + ai_fallback_backoff_jitter: float = 0.3 + ai_fallback_budget_per_run: int = 10 + ai_fallback_circuit_breaker_threshold: int = 5 + model_config = {"env_file": ".env", "env_file_encoding": "utf-8"} diff --git a/src/phase_z2_ai_fallback/__init__.py b/src/phase_z2_ai_fallback/__init__.py new file mode 100644 index 0000000..bf71b11 --- /dev/null +++ b/src/phase_z2_ai_fallback/__init__.py @@ -0,0 +1,15 @@ +"""IMP-33 AI fallback package (fallback path only). + +Module path locked by IMP-31-GATE-AUDIT.md (Stage 1 binding). +Normal path AI call count MUST remain 0; this package only executes under +classified fallback routes (reject / restructure / overflow). See +`feedback_ai_isolation_contract`. +""" +from __future__ import annotations + +from src.phase_z2_ai_fallback.schema import ( + AiFallbackProposal, + ProposalKind, +) + +__all__ = ["AiFallbackProposal", "ProposalKind"] diff --git a/src/phase_z2_ai_fallback/cache.py b/src/phase_z2_ai_fallback/cache.py new file mode 100644 index 0000000..9ceba97 --- /dev/null +++ b/src/phase_z2_ai_fallback/cache.py @@ -0,0 +1,82 @@ +"""IMP-33 u6 — AI fallback proposal cache (IMP-46 gate, no persistent storage). + +This module defines the cache contract that IMP-33 callers use to remember +AI fallback proposals across runs. The persistent storage layer itself is +out-of-scope for IMP-33 and is owned by IMP-46 (frame transformation cache). 
+ +Behaviour locked by Stage 2 plan (u6): + +* ``read_proposal(key)`` always returns ``None`` until IMP-46 lands a + persistent backend. Callers MUST handle the cache-miss path. +* ``save_proposal(key, proposal, *, visual_check_passed, user_approved)`` + enforces the IMP-46 gate before any storage write is attempted: + + - ``visual_check_passed=False`` -> ``AiFallbackCacheGateError`` + - ``user_approved=False`` -> ``AiFallbackCacheGateError`` + + Only when BOTH gates are True does control reach the storage layer, + which currently raises ``NotImplementedError`` (the IMP-46 marker). + +Guardrails: + +* No Anthropic import; cache is pure proposal bookkeeping. +* No MDX read/write; proposals are u2 ``AiFallbackProposal`` instances. +* No silent persistence: gate violations are loud, not skipped writes + (`feedback_artifact_status_naming`). +""" +from __future__ import annotations + +from src.phase_z2_ai_fallback.schema import AiFallbackProposal + + +class AiFallbackCacheGateError(RuntimeError): + """Raised when ``save_proposal`` is called without both IMP-46 gates True.""" + + +def read_proposal(key: str) -> AiFallbackProposal | None: + """Look up a previously cached proposal by ``key``. + + IMP-33 ships without a persistent backend; this stub always returns + ``None`` so callers exercise the cache-miss path. The persistent + backend will be wired by IMP-46. + """ + if not isinstance(key, str) or not key: + raise ValueError("cache key must be a non-empty string") + return None + + +def save_proposal( + key: str, + proposal: AiFallbackProposal, + *, + visual_check_passed: bool, + user_approved: bool, +) -> None: + """Persist ``proposal`` under ``key`` once both IMP-46 gates are True. + + Raises ``AiFallbackCacheGateError`` if either gate is False — the + proposal is NOT written. When both gates are True, storage raises + ``NotImplementedError`` (the IMP-46 persistent backend has not landed + yet). 
+ """ + if not isinstance(key, str) or not key: + raise ValueError("cache key must be a non-empty string") + if not isinstance(proposal, AiFallbackProposal): + raise TypeError( + "proposal must be an AiFallbackProposal instance " + f"(got {type(proposal).__name__})" + ) + if not visual_check_passed: + raise AiFallbackCacheGateError( + "IMP-46 gate: visual_check_passed=False; refusing to cache an " + "unverified proposal." + ) + if not user_approved: + raise AiFallbackCacheGateError( + "IMP-46 gate: user_approved=False; refusing to cache without " + "explicit user approval." + ) + raise NotImplementedError( + "IMP-46 persistent cache storage is not implemented yet; " + "this is the IMP-33 u6 stub marker." + ) diff --git a/src/phase_z2_ai_fallback/client.py b/src/phase_z2_ai_fallback/client.py new file mode 100644 index 0000000..bc126a3 --- /dev/null +++ b/src/phase_z2_ai_fallback/client.py @@ -0,0 +1,92 @@ +"""IMP-33 u4 — AI fallback Anthropic client (fallback path only). + +Wraps ``anthropic.Anthropic.messages.create`` with the timeout / retry / +backoff / budget / circuit-breaker policy locked in u1 ``Settings``. NO +inline policy literals: every knob is sourced from ``src.config.settings``. +Transient errors (timeout / connection / 429 / 5xx) are retried with +capped exponential backoff + jitter; all other errors propagate without +retry. PZ-1 invariant: this module is fallback-path only and MUST NOT be +imported on the normal pipeline path. +""" +from __future__ import annotations + +import json +import random +import time +from dataclasses import dataclass +from typing import Any + +import anthropic + +from src.config import settings +from src.phase_z2_ai_fallback.schema import AiFallbackProposal + +_TRANSIENT_ERRORS: tuple[type[BaseException], ...] = ( + anthropic.APITimeoutError, + anthropic.APIConnectionError, + anthropic.RateLimitError, + anthropic.InternalServerError, +) + +# Output cap is an Anthropic API requirement, not a policy knob (u1). 
+_MAX_OUTPUT_TOKENS = 4096 + + +class AiFallbackBudgetExceeded(RuntimeError): + """Per-run AI call budget (u1 ai_fallback_budget_per_run) exhausted.""" + + +class AiFallbackCircuitOpen(RuntimeError): + """Circuit breaker tripped (u1 ai_fallback_circuit_breaker_threshold).""" + + +@dataclass +class AiFallbackClient: + """Stateful per-run fallback client (budget + circuit accounting).""" + + client: Any = None + _calls: int = 0 + _consecutive_failures: int = 0 + + def __post_init__(self) -> None: + if self.client is None: + self.client = anthropic.Anthropic( + api_key=settings.anthropic_api_key, + timeout=settings.ai_fallback_timeout_s, + ) + + def request_proposal(self, prompt: dict[str, str]) -> AiFallbackProposal: + if self._calls >= settings.ai_fallback_budget_per_run: + raise AiFallbackBudgetExceeded( + f"per-run budget {settings.ai_fallback_budget_per_run} exhausted" + ) + if self._consecutive_failures >= settings.ai_fallback_circuit_breaker_threshold: + raise AiFallbackCircuitOpen( + f"circuit open after {self._consecutive_failures} consecutive failures" + ) + self._calls += 1 + last_error: BaseException | None = None + for attempt in range(settings.ai_fallback_max_retries + 1): + try: + response = self.client.messages.create( + model=settings.ai_fallback_model, + max_tokens=_MAX_OUTPUT_TOKENS, + system=prompt["system"], + messages=[{"role": "user", "content": prompt["user"]}], + ) + text = "".join( + block.text for block in response.content if hasattr(block, "text") + ) + self._consecutive_failures = 0 + return AiFallbackProposal.model_validate(json.loads(text)) + except _TRANSIENT_ERRORS as err: + last_error = err + if attempt >= settings.ai_fallback_max_retries: + break + base = settings.ai_fallback_backoff_base_s * (2 ** attempt) + delay = min(settings.ai_fallback_backoff_cap_s, base) + delay += random.uniform(0, delay * settings.ai_fallback_backoff_jitter) + time.sleep(delay) + self._consecutive_failures += 1 + assert last_error is not None + raise 
last_error diff --git a/src/phase_z2_ai_fallback/prompts.py b/src/phase_z2_ai_fallback/prompts.py new file mode 100644 index 0000000..a762657 --- /dev/null +++ b/src/phase_z2_ai_fallback/prompts.py @@ -0,0 +1,80 @@ +"""IMP-33 u3 — AI fallback prompt builder (fallback path only). + +System+user prompt for the Anthropic client (u4). MDX is READ-ONLY +(`feedback_ai_isolation_contract`); output is constrained to the u2 +schema; frame_id swap is forbidden (V4 rank-1 protected, +`feedback_phase_z_spacing_direction`). Inputs per Stage 2 plan: V4 +result (route=ai_adaptation_required, cardinality), frame_contract, +frame_visual HTML, figma_to_html_agent partial JSON, Internal Region, +MDX text. +""" +from __future__ import annotations + +import json +from typing import Any + +from src.phase_z2_ai_fallback.schema import FORBIDDEN_KINDS, ProposalKind + +V4_ROUTE_AI_ADAPTATION = "ai_adaptation_required" + +_ALLOWED_KINDS = ", ".join(sorted(k.value for k in ProposalKind)) +_FORBIDDEN_KINDS = ", ".join(sorted(FORBIDDEN_KINDS)) + +SYSTEM_PROMPT = ( + "You are an IMP-33 AI fallback adapter for Phase Z slide composition.\n" + "STRICT RULES:\n" + " 1. MDX text in the user payload is READ-ONLY. Do NOT rewrite, " + "compress, or paraphrase MDX.\n" + " 2. Output MUST be a single JSON object conforming to AiFallbackProposal.\n" + f" 3. proposal_kind MUST be one of: {_ALLOWED_KINDS}.\n" + f" 4. Do NOT propose any of: {_FORBIDDEN_KINDS}.\n" + " 5. Do NOT change frame_id — V4 rank-1 frame is locked.\n" + " 6. Keep declared frame slots (text/table/image/details) populated.\n" + " 7. Respect Internal Region containment; place content units within " + "the declared region only." +) + + +def build_ai_fallback_prompt( + *, + v4_result: dict[str, Any], + frame_contract: dict[str, Any], + frame_visual_html: str, + figma_partial_json: dict[str, Any], + internal_region: dict[str, Any], + mdx_text: str, +) -> dict[str, str]: + """Build system+user prompt strings for the fallback AI adapter. 
+ + Raises: + ValueError: when ``v4_result.route`` is not + ``ai_adaptation_required`` — the fallback prompt MUST NOT be + built outside this route (normal-path AI call count must + remain 0; PZ-1). + """ + route = v4_result.get("route") or v4_result.get("imp05_route_hint") + if route != V4_ROUTE_AI_ADAPTATION: + raise ValueError( + f"build_ai_fallback_prompt: v4_result.route={route!r} is not " + f"{V4_ROUTE_AI_ADAPTATION!r}; fallback prompt MUST NOT be built " + "outside the AI adaptation route." + ) + user_payload = { + "v4": { + "route": route, + "cardinality": v4_result.get("cardinality") + or v4_result.get("cardinality_signature"), + "label": v4_result.get("label"), + "frame_id": v4_result.get("frame_id"), + "rank": v4_result.get("rank"), + }, + "frame_contract": frame_contract, + "frame_visual_html": frame_visual_html, + "figma_partial_json": figma_partial_json, + "internal_region": internal_region, + "mdx_text_READ_ONLY": mdx_text, + } + return { + "system": SYSTEM_PROMPT, + "user": json.dumps(user_payload, ensure_ascii=False), + } diff --git a/src/phase_z2_ai_fallback/router.py b/src/phase_z2_ai_fallback/router.py new file mode 100644 index 0000000..fc1ecd7 --- /dev/null +++ b/src/phase_z2_ai_fallback/router.py @@ -0,0 +1,89 @@ +"""IMP-33 u7 — AI fallback router (fallback path only). + +Composes the IMP-33 fallback flow: + + 1. flag gate (``settings.ai_fallback_enabled`` default OFF) + 2. V4 route gate (route must equal ``ai_adaptation_required``) + 3. cache read (u6 stub returns ``None`` until IMP-46 lands) + 4. build prompt (u3) + 5. call client (u4 ``request_proposal``) + 6. validate (u5 ``validate_proposal``) + +Returns the validated ``AiFallbackProposal``. Save to cache is NOT +performed here — it is caller-driven AFTER ``visual_check_passed=True`` +AND ``user_approved=True``, per the u6 IMP-46 gate. 
The router does not +import ``save_proposal``; this is the structural guarantee that the +router cannot persist a proposal before the caller's visual + user +checks (`feedback_artifact_status_naming`). + +Guardrails: + +* PZ-1 — normal-path AI call count stays 0: flag-off OR route-mismatch + short-circuits BEFORE the prompt builder or client are touched. +* ``feedback_ai_isolation_contract`` — MDX READ-ONLY (u3 enforces in + prompt; this module never reads or writes MDX). +* ``feedback_phase_z_spacing_direction`` — V4 rank-1 protected (u5 + enforces; router only forwards the contract). +""" +from __future__ import annotations + +from typing import Any + +from src.config import settings +from src.phase_z2_ai_fallback.cache import read_proposal +from src.phase_z2_ai_fallback.client import AiFallbackClient +from src.phase_z2_ai_fallback.prompts import ( + V4_ROUTE_AI_ADAPTATION, + build_ai_fallback_prompt, +) +from src.phase_z2_ai_fallback.schema import AiFallbackProposal +from src.phase_z2_ai_fallback.validate import validate_proposal + + +def route_ai_fallback( + *, + cache_key: str, + v4_result: dict[str, Any], + frame_contract: dict[str, Any], + frame_visual_html: str, + figma_partial_json: dict[str, Any], + internal_region: dict[str, Any], + mdx_text: str, + client: AiFallbackClient | None = None, +) -> AiFallbackProposal | None: + """Route a fallback request through cache → prompt → client → validate. + + Returns ``None`` when the master flag is OFF or when the V4 route is + not ``ai_adaptation_required`` — both gates short-circuit BEFORE any + prompt/client work, so the normal-path AI call count stays at 0 + (PZ-1). 
+ """ + if not settings.ai_fallback_enabled: + return None + route = v4_result.get("route") or v4_result.get("imp05_route_hint") + if route != V4_ROUTE_AI_ADAPTATION: + return None + cached = read_proposal(cache_key) + if cached is not None: + validate_proposal( + cached, + frame_contract=frame_contract, + internal_region=internal_region, + ) + return cached + prompt = build_ai_fallback_prompt( + v4_result=v4_result, + frame_contract=frame_contract, + frame_visual_html=frame_visual_html, + figma_partial_json=figma_partial_json, + internal_region=internal_region, + mdx_text=mdx_text, + ) + active_client = client if client is not None else AiFallbackClient() + proposal = active_client.request_proposal(prompt) + validate_proposal( + proposal, + frame_contract=frame_contract, + internal_region=internal_region, + ) + return proposal diff --git a/src/phase_z2_ai_fallback/schema.py b/src/phase_z2_ai_fallback/schema.py new file mode 100644 index 0000000..54d9e64 --- /dev/null +++ b/src/phase_z2_ai_fallback/schema.py @@ -0,0 +1,50 @@ +"""IMP-33 u2 — AI fallback proposal schema. 
+ +Whitelisted proposal kinds (Stage 2 plan): + - builder_options_patch : zone/frame builder option overrides + - partial_overrides : Internal Region / Frame Slot content overrides + - slot_mapping_proposal : restructuring proposal (content unit mapping) + +Forbidden output forms (rejected by validator): + - mdx_text (MDX read-only — `feedback_ai_isolation_contract`) + - frame_id_change (V4 rank-1 protected — `feedback_phase_z_spacing_direction`) + - raw_html (HTML structure is code-decided, not AI-generated) + - raw_css (same) +""" +from __future__ import annotations + +from enum import Enum +from typing import Any + +from pydantic import BaseModel, ConfigDict, Field, field_validator + + +class ProposalKind(str, Enum): + BUILDER_OPTIONS_PATCH = "builder_options_patch" + PARTIAL_OVERRIDES = "partial_overrides" + SLOT_MAPPING_PROPOSAL = "slot_mapping_proposal" + + +FORBIDDEN_KINDS: frozenset[str] = frozenset( + {"mdx_text", "frame_id_change", "raw_html", "raw_css"} +) + + +class AiFallbackProposal(BaseModel): + """Single AI fallback proposal (output contract for u4 client).""" + + model_config = ConfigDict(extra="forbid") + + proposal_kind: ProposalKind + payload: dict[str, Any] = Field(default_factory=dict) + rationale: str = "" + + @field_validator("proposal_kind", mode="before") + @classmethod + def _reject_forbidden_kind(cls, value: Any) -> Any: + if isinstance(value, str) and value in FORBIDDEN_KINDS: + raise ValueError( + f"proposal_kind={value!r} is forbidden (MDX/frame/raw HTML/CSS " + "mutations are not permitted under IMP-33)." + ) + return value diff --git a/src/phase_z2_ai_fallback/step12.py b/src/phase_z2_ai_fallback/step12.py new file mode 100644 index 0000000..2cb8a98 --- /dev/null +++ b/src/phase_z2_ai_fallback/step12.py @@ -0,0 +1,141 @@ +"""IMP-33 u8 — Step 12 AI repair wiring (IMP-30 provisional units only). 
+ +Phase Z Step 12 = slot_payload (the runtime "light_edit / restructure" surface +where AI-assisted frame-aware adaptation is allowed per IMP-17 carve-out). +This module is the only call site that pipes Phase Z composition units into +``src.phase_z2_ai_fallback.router.route_ai_fallback``. Two structural gates +preserve the AI isolation contract: + +* IMP-30 provisional gate — units with ``provisional=False`` are skipped + before any route classification. AI repair is reserved for first-render + invariant survivors (no rank-1 V4 evidence, recovered as provisional). +* Reject gate — units whose V4 label maps to ``design_reference_only`` + (``reject``) are skipped with ``skip_reason="design_reference_only_no_ai"``. + Reject path is design reference only — never an AI call. + +Combined with the u7 router's flag-off + route-gate short-circuits, the +default Phase Z run path performs zero AI calls (PZ-1). Save to cache is +NOT performed here — that is the caller's responsibility AFTER +``visual_check_passed=True`` AND ``user_approved=True`` (u6 IMP-46 gate). +""" +from __future__ import annotations + +from typing import Any, Callable, Iterable + +from src.phase_z2_ai_fallback.router import route_ai_fallback + + +_AI_ADAPTATION_ROUTE = "ai_adaptation_required" +_DESIGN_REFERENCE_ROUTE = "design_reference_only" + + +def gather_step12_ai_repair_proposals( + units: Iterable[Any], + *, + route_for_label: Callable[[str | None], str | None], + get_contract_fn: Callable[[str], dict | None], + frame_visual_loader: Callable[[str], str], + figma_partial_loader: Callable[[str], dict] | None = None, + internal_region_lookup: Callable[[Any], dict] | None = None, + mdx_text_loader: Callable[[Any], str] | None = None, +) -> list[dict]: + """Return one record per unit describing the Step 12 AI repair decision. 
+ + The record schema is stable across all gate decisions so the Step 12 + artifact consumer can rely on a single shape: + + { + "unit_index": int, + "source_section_ids": list[str], + "frame_template_id": str, + "label": str | None, + "route_hint": str | None, + "provisional": bool, + "ai_called": bool, + "skip_reason": str | None, + "proposal": dict | None, + "error": str | None, + } + + ``ai_called`` is True only when ``route_ai_fallback`` was invoked AND + returned a proposal OR raised. Flag-off / route-mismatch returns + ``None`` from the router and is surfaced as ``ai_called=False`` with + ``skip_reason="router_short_circuit"`` so the caller can distinguish + "router decided not to run" from "router ran and returned a proposal". + """ + records: list[dict] = [] + for index, unit in enumerate(units): + label = getattr(unit, "label", None) + route_hint = route_for_label(label) + record: dict = { + "unit_index": index, + "source_section_ids": list(getattr(unit, "source_section_ids", []) or []), + "frame_template_id": getattr(unit, "frame_template_id", None), + "label": label, + "route_hint": route_hint, + "provisional": bool(getattr(unit, "provisional", False)), + "ai_called": False, + "skip_reason": None, + "proposal": None, + "error": None, + } + if not record["provisional"]: + record["skip_reason"] = "not_provisional" + records.append(record) + continue + if route_hint == _DESIGN_REFERENCE_ROUTE: + record["skip_reason"] = "design_reference_only_no_ai" + records.append(record) + continue + if route_hint != _AI_ADAPTATION_ROUTE: + record["skip_reason"] = f"route_not_ai_adaptation:{route_hint}" + records.append(record) + continue + + template_id = record["frame_template_id"] or "" + frame_contract = get_contract_fn(template_id) or {} + frame_visual_html = frame_visual_loader(template_id) + figma_partial_json = ( + figma_partial_loader(template_id) if figma_partial_loader is not None else {} + ) + internal_region = ( + internal_region_lookup(unit) if 
internal_region_lookup is not None else {} + ) + mdx_text = ( + mdx_text_loader(unit) + if mdx_text_loader is not None + else (getattr(unit, "raw_content", "") or "") + ) + cache_key = "::".join( + [template_id, ",".join(sorted(record["source_section_ids"]))] + ) + v4_result = { + "route": route_hint, + "label": label, + "frame_id": getattr(unit, "frame_id", None), + "rank": getattr(unit, "v4_rank", None), + "cardinality": None, + } + try: + proposal = route_ai_fallback( + cache_key=cache_key, + v4_result=v4_result, + frame_contract=frame_contract, + frame_visual_html=frame_visual_html, + figma_partial_json=figma_partial_json, + internal_region=internal_region, + mdx_text=mdx_text, + ) + except Exception as exc: # noqa: BLE001 — record + continue, no AI re-raise + record["ai_called"] = True + record["error"] = f"{type(exc).__name__}: {exc}" + records.append(record) + continue + if proposal is None: + record["skip_reason"] = "router_short_circuit" + records.append(record) + continue + record["ai_called"] = True + record["proposal"] = proposal.model_dump() + records.append(record) + return records diff --git a/src/phase_z2_ai_fallback/step17.py b/src/phase_z2_ai_fallback/step17.py new file mode 100644 index 0000000..e655bf0 --- /dev/null +++ b/src/phase_z2_ai_fallback/step17.py @@ -0,0 +1,111 @@ +"""IMP-33 u9 — Step 17 AI repair wiring (BLOCKED until IMP-34 + IMP-35 land). + +Phase Z Step 17 = retry / salvage cascade (see ``src.phase_z2_pipeline`` +section 11.7 ``_attempt_salvage_chain`` and the existing IMP-12 u8/u9 +deterministic chain at ``src/phase_z2_pipeline.py:1994`` and +``src/phase_z2_pipeline.py:4948``). + +Per IMP-17 carve-out (``docs/architecture/IMP-17-CARVE-OUT.md`` lines 16, +40-44), AI repair at Step 17 is permitted ONLY after the full deterministic +chain is exhausted AND popup escalation is exhausted AND a user-approved +fallback budget remains. 
IMP-34 (zone resize + compact retry) and IMP-35 +(``details_popup_escalation``) are explicit prerequisites under the IMP-33 +out-of-scope contract — neither has landed yet. Therefore Step 17 AI repair +is STRUCTURALLY BLOCKED at u9. + +This module: + +1. **SPECIFIES** the canonical overflow cascade order via + :data:`OVERFLOW_CASCADE_ORDER` — ``deterministic`` → ``popup`` → + ``ai_repair`` → ``user_override``. Downstream Step 17 consumers can rely + on this single source of truth. +2. **KEEPS** Step 17 AI repair structurally blocked. The entry point + :func:`gather_step17_ai_repair_proposals` does NOT import + ``route_ai_fallback`` (u7), does NOT instantiate ``AiFallbackClient`` (u4), + and does NOT call any Anthropic API. Every unit is recorded with + ``skip_reason="step17_ai_blocked_imp_34_35_prerequisites_missing"`` so + the caller can distinguish "blocked by carve-out gate" from any other + skip path (e.g., u8 ``not_provisional`` / ``design_reference_only_no_ai``). + +Once IMP-34 + IMP-35 land AND a user-approved fallback budget is granted, +this module will gain the actual ``route_ai_fallback`` wiring guarded by +the cascade-stage conjunction. Today the gate is closed. +""" +from __future__ import annotations + +from enum import Enum +from typing import Any, Callable, Iterable + + +class OverflowCascadeStage(str, Enum): + """Step 17 overflow cascade stages — canonical order (u9 single source of truth). + + Members are ordered to match the AI isolation contract: + + * ``DETERMINISTIC`` — IMP-12 u4/u5/u6 (``cross_zone_redistribute`` / + ``glue_compression`` / ``font_step_compression``) + IMP-12 terminal + actions (``layout_adjust`` / ``frame_reselect``) + IMP-34 + (``zone resize + compact retry``, pending). No AI in any sub-stage. + * ``POPUP`` — IMP-35 (``details_popup_escalation``, pending). Content + popup escalation as the final deterministic resort before any AI. + * ``AI_REPAIR`` — IMP-33 (this carve-out) + IMP-46 cache. 
Only reachable + after DETERMINISTIC and POPUP are both exhausted AND user-approved + fallback budget remains. + * ``USER_OVERRIDE`` — explicit user override after all auto stages. + """ + + DETERMINISTIC = "deterministic" + POPUP = "popup" + AI_REPAIR = "ai_repair" + USER_OVERRIDE = "user_override" + + +OVERFLOW_CASCADE_ORDER: tuple[OverflowCascadeStage, ...] = ( + OverflowCascadeStage.DETERMINISTIC, + OverflowCascadeStage.POPUP, + OverflowCascadeStage.AI_REPAIR, + OverflowCascadeStage.USER_OVERRIDE, +) + + +STEP17_AI_REPAIR_BLOCKED_REASON = ( + "step17_ai_blocked_imp_34_35_prerequisites_missing" +) + + +def gather_step17_ai_repair_proposals( + units: Iterable[Any], + *, + route_for_label: Callable[[str | None], str | None], +) -> list[dict]: + """Return one BLOCKED record per unit. No AI call is performed at u9. + + The record schema mirrors :func:`src.phase_z2_ai_fallback.step12 + .gather_step12_ai_repair_proposals` so the Step 17 artifact consumer can + reuse the same shape, with one addition: ``cascade_stage`` pins the + stage this record belongs to (always ``ai_repair`` here). + + Per Stage 2 contract (IMP-33 u9): Step 17 AI repair is blocked behind + IMP-34 + IMP-35. Every unit returns with + ``skip_reason=STEP17_AI_REPAIR_BLOCKED_REASON`` and ``ai_called=False``. 
+ """ + records: list[dict] = [] + for index, unit in enumerate(units): + label = getattr(unit, "label", None) + record: dict = { + "unit_index": index, + "source_section_ids": list( + getattr(unit, "source_section_ids", []) or [] + ), + "frame_template_id": getattr(unit, "frame_template_id", None), + "label": label, + "route_hint": route_for_label(label), + "provisional": bool(getattr(unit, "provisional", False)), + "cascade_stage": OverflowCascadeStage.AI_REPAIR.value, + "ai_called": False, + "skip_reason": STEP17_AI_REPAIR_BLOCKED_REASON, + "proposal": None, + "error": None, + } + records.append(record) + return records diff --git a/src/phase_z2_ai_fallback/validate.py b/src/phase_z2_ai_fallback/validate.py new file mode 100644 index 0000000..ae44973 --- /dev/null +++ b/src/phase_z2_ai_fallback/validate.py @@ -0,0 +1,83 @@ +"""IMP-33 u5 — AI fallback proposal validator (fallback path only). + +Defence-in-depth layer between the u4 client output (already u2-schema-valid) +and the caller. Adds the four Stage 2 guards that u2 cannot express purely at +the schema level: + + 1. builder-options whitelist (BUILDER_OPTIONS_PATCH may only touch keys + already declared in ``frame_contract.payload.builder_options``). + 2. dropped-slot guard (PARTIAL_OVERRIDES / SLOT_MAPPING_PROPOSAL must keep + every declared ``sub_zones[*].id`` populated — text/table/image/details + slots cannot disappear; `feedback_ai_isolation_contract`). + 3. frame-swap guard (no ``frame_id`` mutation inside payload — V4 rank-1 + protected; `feedback_phase_z_spacing_direction`). + 4. Internal Region containment (``payload.region_id`` must match the + declared Internal Region id when present). 
+""" +from __future__ import annotations + +from typing import Any + +from src.phase_z2_ai_fallback.schema import AiFallbackProposal, ProposalKind + + +class AiFallbackValidationError(ValueError): + """Raised when a proposal violates an IMP-33 u5 guard.""" + + +_SLOT_KINDS = (ProposalKind.PARTIAL_OVERRIDES, ProposalKind.SLOT_MAPPING_PROPOSAL) + + +def validate_proposal( + proposal: AiFallbackProposal, + *, + frame_contract: dict[str, Any], + internal_region: dict[str, Any] | None = None, +) -> None: + """Validate an AI fallback proposal against the active frame contract. + + Raises ``AiFallbackValidationError`` on any guard violation. Returns + ``None`` on success — caller is responsible for downstream application. + """ + AiFallbackProposal.model_validate(proposal.model_dump()) + + payload = proposal.payload + frame_id = frame_contract.get("frame_id") + if "frame_id" in payload and payload["frame_id"] != frame_id: + raise AiFallbackValidationError( + f"frame-swap guard: payload.frame_id={payload['frame_id']!r} " + f"differs from contract frame_id={frame_id!r}; V4 rank-1 is locked." + ) + + if proposal.proposal_kind is ProposalKind.BUILDER_OPTIONS_PATCH: + declared = (frame_contract.get("payload") or {}).get("builder_options") or {} + unknown = set(payload.keys()) - set(declared.keys()) + if unknown: + raise AiFallbackValidationError( + f"builder whitelist: keys {sorted(unknown)} not in " + f"frame_contract.payload.builder_options {sorted(declared)}." + ) + + if proposal.proposal_kind in _SLOT_KINDS: + declared_slot_ids = [z.get("id") for z in (frame_contract.get("sub_zones") or [])] + slots = payload.get("slots") + if not isinstance(slots, dict): + raise AiFallbackValidationError( + "dropped-slot guard: PARTIAL_OVERRIDES / SLOT_MAPPING_PROPOSAL " + "payload MUST include a 'slots' mapping." 
+ ) + missing = [sid for sid in declared_slot_ids if sid not in slots] + if missing: + raise AiFallbackValidationError( + f"dropped-slot guard: declared slots {missing} are absent " + "from payload.slots (text/table/image/details must remain populated)." + ) + + region_id = payload.get("region_id") + if region_id is not None and internal_region is not None: + declared_region_id = internal_region.get("id") + if region_id != declared_region_id: + raise AiFallbackValidationError( + f"Internal Region containment: payload.region_id={region_id!r} " + f"differs from internal_region.id={declared_region_id!r}." + ) diff --git a/tests/phase_z2_ai_fallback/__init__.py b/tests/phase_z2_ai_fallback/__init__.py new file mode 100644 index 0000000..e69de29 diff --git a/tests/phase_z2_ai_fallback/test_ast_isolation.py b/tests/phase_z2_ai_fallback/test_ast_isolation.py new file mode 100644 index 0000000..8c3e77d --- /dev/null +++ b/tests/phase_z2_ai_fallback/test_ast_isolation.py @@ -0,0 +1,153 @@ +"""IMP-33 u10 — AST isolation guard for the AI fallback package. + +Structural defence: parse every ``*.py`` file under +``src/phase_z2_ai_fallback/`` and assert that none of them imports a +Phase Q runtime module, the Kei API client, or any ``phase_z2_*`` runtime +module (e.g. ``phase_z2_pipeline``). Even if a future patch wires such a +module by accident, this AST scan catches it before runtime and protects +the PZ-1 invariant (normal-path AI call count = 0). + +Allowed imports inside the fallback package: + +* Standard library modules. +* ``anthropic`` (u4 client) and ``pydantic`` (u2 schema). +* ``src.config`` (u1 settings — single source of truth for policy knobs). +* Other modules inside ``src.phase_z2_ai_fallback`` (intra-package). +""" +from __future__ import annotations + +import ast +import pathlib + +import pytest + +PACKAGE_ROOT = pathlib.Path(__file__).resolve().parents[2] / "src" / "phase_z2_ai_fallback" + +_ALLOWED_SRC_PREFIXES: tuple[str, ...] 
= ( + "src.config", + "src.phase_z2_ai_fallback", +) + +_ALLOWED_TOP_LEVEL: frozenset[str] = frozenset( + { + "anthropic", + "pydantic", + "__future__", + "ast", + "dataclasses", + "enum", + "json", + "pathlib", + "random", + "time", + "typing", + } +) + +_FORBIDDEN_PHASE_Q_MODULES: frozenset[str] = frozenset( + { + "src.pipeline", + "src.pipeline_v2", + "src.block_assembler", + "src.block_assembler_b2", + "src.block_matcher_tfidf", + "src.block_reference", + "src.block_search", + "src.block_selector", + "src.content_editor", + "src.design_director", + "src.html_generator", + "src.html_validator", + "src.renderer", + "src.mdx_normalizer", + "src.fit_verifier", + "src.slide_measurer", + "src.space_allocator", + "src.kei_client", + } +) + + +def _module_files() -> list[pathlib.Path]: + return sorted(p for p in PACKAGE_ROOT.glob("*.py") if p.name != "__pycache__") + + +def _imported_names(tree: ast.AST) -> list[str]: + names: list[str] = [] + for node in ast.walk(tree): + if isinstance(node, ast.Import): + for alias in node.names: + names.append(alias.name) + elif isinstance(node, ast.ImportFrom): + if node.module is not None: + names.append(node.module) + return names + + +def _parse(path: pathlib.Path) -> ast.AST: + return ast.parse(path.read_text(encoding="utf-8"), filename=str(path)) + + +def _is_allowed(name: str) -> bool: + for prefix in _ALLOWED_SRC_PREFIXES: + if name == prefix or name.startswith(prefix + "."): + return True + top = name.split(".", 1)[0] + return top in _ALLOWED_TOP_LEVEL + + +def test_fallback_package_root_exists() -> None: + assert PACKAGE_ROOT.is_dir(), ( + f"fallback package root not found at {PACKAGE_ROOT!s}; module path " + "is locked by IMP-31-GATE-AUDIT (src/phase_z2_ai_fallback/)." 
+ ) + files = _module_files() + assert files, f"no .py modules found under {PACKAGE_ROOT!s}" + + +def test_fallback_package_imports_are_whitelisted() -> None: + violations: list[tuple[str, str]] = [] + for path in _module_files(): + for name in _imported_names(_parse(path)): + if not _is_allowed(name): + violations.append((path.name, name)) + assert not violations, ( + "fallback package imports outside the IMP-33 whitelist " + f"(Phase Q / Kei / phase_z2_* runtime forbidden): {violations}" + ) + + +@pytest.mark.parametrize("forbidden_module", sorted(_FORBIDDEN_PHASE_Q_MODULES)) +def test_fallback_package_forbids_phase_q_and_kei_imports(forbidden_module: str) -> None: + for path in _module_files(): + for name in _imported_names(_parse(path)): + top2 = ".".join(name.split(".")[:2]) + assert top2 != forbidden_module and name != forbidden_module, ( + f"{path.name} imports forbidden module {name!r}; " + f"{forbidden_module!r} is a Phase Q / Kei runtime module and " + "must not be reachable from the AI fallback package." + ) + + +def test_fallback_package_forbids_phase_z2_pipeline_imports() -> None: + for path in _module_files(): + for name in _imported_names(_parse(path)): + assert not name.startswith("src.phase_z2_pipeline"), ( + f"{path.name} imports {name!r}; the Phase Z2 pipeline runtime " + "module must not be reachable from the AI fallback package " + "(PZ-1: normal-path AI=0)." 
+ ) + + +def test_fallback_package_forbids_other_phase_z2_runtime_imports() -> None: + violations: list[tuple[str, str]] = [] + for path in _module_files(): + for name in _imported_names(_parse(path)): + if name.startswith("src.phase_z2_") and not name.startswith( + "src.phase_z2_ai_fallback" + ): + violations.append((path.name, name)) + assert not violations, ( + "fallback package imports another phase_z2_* runtime module; " + f"violations: {violations}" + ) diff --git a/tests/phase_z2_ai_fallback/test_cache.py b/tests/phase_z2_ai_fallback/test_cache.py new file mode 100644 index 0000000..b3d1dd9 --- /dev/null +++ b/tests/phase_z2_ai_fallback/test_cache.py @@ -0,0 +1,90 @@ +"""IMP-33 u6 — AI fallback cache gate tests. + +Verifies the IMP-46 gate contract: + * ``read_proposal`` is a stub (returns None until IMP-46). + * ``save_proposal`` enforces both gates before any write attempt. + * Storage itself raises NotImplementedError (IMP-46 marker). +""" +from __future__ import annotations + +import pytest + +from src.phase_z2_ai_fallback.cache import ( + AiFallbackCacheGateError, + read_proposal, + save_proposal, +) +from src.phase_z2_ai_fallback.schema import AiFallbackProposal, ProposalKind + + +def _proposal() -> AiFallbackProposal: + return AiFallbackProposal( + proposal_kind=ProposalKind.BUILDER_OPTIONS_PATCH, + payload={"item_parser": "bullet_v2"}, + rationale="u6-test", + ) + + +def test_read_proposal_returns_none_for_any_key(): + assert read_proposal("frame=foo|cardinality=3") is None + + +def test_read_proposal_rejects_empty_key(): + with pytest.raises(ValueError): + read_proposal("") + + +def test_save_rejects_when_visual_check_failed(): + with pytest.raises(AiFallbackCacheGateError) as exc: + save_proposal( + "k", _proposal(), visual_check_passed=False, user_approved=True + ) + assert "visual_check_passed" in str(exc.value) + + +def test_save_rejects_when_user_not_approved(): + with pytest.raises(AiFallbackCacheGateError) as exc: + save_proposal( + "k", 
_proposal(), visual_check_passed=True, user_approved=False + ) + assert "user_approved" in str(exc.value) + + +def test_save_rejects_when_both_gates_false(): + with pytest.raises(AiFallbackCacheGateError): + save_proposal( + "k", _proposal(), visual_check_passed=False, user_approved=False + ) + + +def test_save_raises_not_implemented_when_both_gates_pass(): + with pytest.raises(NotImplementedError) as exc: + save_proposal( + "k", _proposal(), visual_check_passed=True, user_approved=True + ) + assert "IMP-46" in str(exc.value) + + +def test_save_rejects_empty_key(): + with pytest.raises(ValueError): + save_proposal( + "", _proposal(), visual_check_passed=True, user_approved=True + ) + + +def test_save_rejects_non_proposal_object(): + with pytest.raises(TypeError): + save_proposal( + "k", + {"proposal_kind": "builder_options_patch"}, # type: ignore[arg-type] + visual_check_passed=True, + user_approved=True, + ) + + +def test_gate_error_is_not_notimplementederror(): + with pytest.raises(AiFallbackCacheGateError): + save_proposal( + "k", _proposal(), visual_check_passed=False, user_approved=True + ) + assert not issubclass(AiFallbackCacheGateError, NotImplementedError) diff --git a/tests/phase_z2_ai_fallback/test_client_mock.py b/tests/phase_z2_ai_fallback/test_client_mock.py new file mode 100644 index 0000000..c42710c --- /dev/null +++ b/tests/phase_z2_ai_fallback/test_client_mock.py @@ -0,0 +1,151 @@ +"""IMP-33 u4 — fallback client mock tests. + +Scope (Stage 2 plan, u4): + - Success path returns a validated ``AiFallbackProposal`` (u2 schema). + - Transient errors (timeout / connection / 429 / 5xx) are retried. + - Retries exhausted → last transient error propagates + consec-fail bumps. + - Non-transient errors are NOT retried. + - Per-run budget exhaustion raises ``AiFallbackBudgetExceeded``. + - Circuit breaker opens after consecutive-failure threshold reached. + - Policy values are sourced from ``settings`` (no inline literals). 
+""" +from __future__ import annotations + +import json +import time +from types import SimpleNamespace +from unittest.mock import MagicMock + +import anthropic +import httpx +import pytest + +from src.config import settings +from src.phase_z2_ai_fallback.client import ( + AiFallbackBudgetExceeded, + AiFallbackCircuitOpen, + AiFallbackClient, +) + + +class _NonTransient(Exception): + """Stand-in for any anthropic error not in the transient whitelist.""" + + +def _ok_response() -> SimpleNamespace: + block = SimpleNamespace( + text=json.dumps( + { + "proposal_kind": "builder_options_patch", + "payload": {"k": 1}, + "rationale": "ok", + } + ) + ) + return SimpleNamespace(content=[block]) + + +def _timeout_err() -> anthropic.APITimeoutError: + return anthropic.APITimeoutError(request=httpx.Request("POST", "https://x")) + + +def _connection_err() -> anthropic.APIConnectionError: + return anthropic.APIConnectionError(request=httpx.Request("POST", "https://x")) + + +@pytest.fixture(autouse=True) +def _no_real_sleep(monkeypatch: pytest.MonkeyPatch) -> None: + monkeypatch.setattr(time, "sleep", lambda _s: None) + + +@pytest.fixture(autouse=True) +def _restore_settings(): + snapshot = settings.model_dump() + yield + for key, value in snapshot.items(): + setattr(settings, key, value) + + +def _client_with(side_effect=None, return_value=None) -> AiFallbackClient: + fake = MagicMock() + if side_effect is not None: + fake.messages.create.side_effect = side_effect + else: + fake.messages.create.return_value = return_value or _ok_response() + return AiFallbackClient(client=fake) + + +def test_success_returns_validated_proposal() -> None: + out = _client_with().request_proposal({"system": "s", "user": "u"}) + assert out.proposal_kind.value == "builder_options_patch" + assert out.payload == {"k": 1} + + +def test_call_uses_settings_model() -> None: + fake = MagicMock() + fake.messages.create.return_value = _ok_response() + AiFallbackClient(client=fake).request_proposal({"system": 
"s", "user": "u"}) + kwargs = fake.messages.create.call_args.kwargs + assert kwargs["model"] == settings.ai_fallback_model + + +def test_transient_retries_then_succeeds() -> None: + fake = MagicMock() + fake.messages.create.side_effect = [_timeout_err(), _connection_err(), _ok_response()] + AiFallbackClient(client=fake).request_proposal({"system": "s", "user": "u"}) + assert fake.messages.create.call_count == 3 + + +def test_retries_exhausted_raises_last_transient(monkeypatch: pytest.MonkeyPatch) -> None: + monkeypatch.setattr(settings, "ai_fallback_max_retries", 1) + fake = MagicMock() + fake.messages.create.side_effect = [_timeout_err(), _timeout_err()] + c = AiFallbackClient(client=fake) + with pytest.raises(anthropic.APITimeoutError): + c.request_proposal({"system": "s", "user": "u"}) + assert fake.messages.create.call_count == 2 + assert c._consecutive_failures == 1 + + +def test_non_transient_not_retried() -> None: + fake = MagicMock() + fake.messages.create.side_effect = _NonTransient("boom") + c = AiFallbackClient(client=fake) + with pytest.raises(_NonTransient): + c.request_proposal({"system": "s", "user": "u"}) + assert fake.messages.create.call_count == 1 + + +def test_budget_exceeded(monkeypatch: pytest.MonkeyPatch) -> None: + monkeypatch.setattr(settings, "ai_fallback_budget_per_run", 1) + c = _client_with() + c.request_proposal({"system": "s", "user": "u"}) + with pytest.raises(AiFallbackBudgetExceeded): + c.request_proposal({"system": "s", "user": "u"}) + + +def test_circuit_breaker_opens(monkeypatch: pytest.MonkeyPatch) -> None: + monkeypatch.setattr(settings, "ai_fallback_circuit_breaker_threshold", 1) + monkeypatch.setattr(settings, "ai_fallback_max_retries", 0) + fake = MagicMock() + fake.messages.create.side_effect = _timeout_err() + c = AiFallbackClient(client=fake) + with pytest.raises(anthropic.APITimeoutError): + c.request_proposal({"system": "s", "user": "u"}) + with pytest.raises(AiFallbackCircuitOpen): + c.request_proposal({"system": "s", 
"user": "u"}) + + +def test_backoff_uses_settings(monkeypatch: pytest.MonkeyPatch) -> None: + """Sleep delay must be derived from settings (no inline literals).""" + monkeypatch.setattr(settings, "ai_fallback_max_retries", 1) + monkeypatch.setattr(settings, "ai_fallback_backoff_base_s", 0.25) + monkeypatch.setattr(settings, "ai_fallback_backoff_cap_s", 0.5) + monkeypatch.setattr(settings, "ai_fallback_backoff_jitter", 0.0) + sleeps: list[float] = [] + monkeypatch.setattr(time, "sleep", lambda s: sleeps.append(s)) + fake = MagicMock() + fake.messages.create.side_effect = [_timeout_err(), _ok_response()] + AiFallbackClient(client=fake).request_proposal({"system": "s", "user": "u"}) + # attempt 0 transient → sleep(min(cap, base * 2**0) + jitter==0) = 0.25 + assert sleeps == [0.25] diff --git a/tests/phase_z2_ai_fallback/test_docs_sync.py b/tests/phase_z2_ai_fallback/test_docs_sync.py new file mode 100644 index 0000000..2a75087 --- /dev/null +++ b/tests/phase_z2_ai_fallback/test_docs_sync.py @@ -0,0 +1,61 @@ +"""IMP-33 u11 — docs sync verification. + +Verifies that the binding architecture docs reference the IMP-33 runtime +module surface introduced by u1~u10. Scope is intentionally narrow per the +Stage 2 plan: module path, Step 12 entry, Step 17 entry, cascade order, and +the IMP-46 cache gate. Failure here means the docs and the code have +drifted — fix the docs (or the code) before merging. +""" +from __future__ import annotations + +from pathlib import Path + +import pytest + +DOCS_ROOT = Path(__file__).resolve().parents[2] / "docs" / "architecture" +CARVE_OUT_DOC = DOCS_ROOT / "IMP-17-CARVE-OUT.md" +GATE_AUDIT_DOC = DOCS_ROOT / "IMP-31-GATE-AUDIT.md" + + +def _read(doc: Path) -> str: + assert doc.is_file(), f"binding doc missing: {doc}" + return doc.read_text(encoding="utf-8") + + +@pytest.mark.parametrize( + "needle", + [ + # Module path lock. + "src/phase_z2_ai_fallback/", + # Step 12 entry. 
+ "gather_step12_ai_repair_proposals", + # Step 17 entry + blocked-reason sentinel. + "gather_step17_ai_repair_proposals", + "step17_ai_blocked_imp_34_35_prerequisites_missing", + # Cascade order single source of truth. + "OVERFLOW_CASCADE_ORDER", + "(DETERMINISTIC, POPUP, AI_REPAIR, USER_OVERRIDE)", + # IMP-46 cache gate. + "visual_check_passed", + "user_approved", + "AiFallbackCacheGateError", + # PZ-1 normal-path AI=0 invariant. + "ai_fallback_enabled", + ], +) +def test_carve_out_doc_references_runtime_surface(needle: str) -> None: + assert needle in _read(CARVE_OUT_DOC), ( + f"IMP-17-CARVE-OUT.md missing binding reference: {needle!r}" + ) + + +def test_gate_audit_reflects_scaffolded_module() -> None: + body = _read(GATE_AUDIT_DOC) + assert "scaffolded under IMP-33" in body, ( + "IMP-31-GATE-AUDIT.md must record that the fallback module path is " + "scaffolded (not 'not created this cycle')." + ) + assert "ai_fallback_enabled" in body, ( + "IMP-31-GATE-AUDIT.md must record the flag default that keeps PZ-1 " + "(normal-path AI=0) intact while the 3-condition gate is open." + ) diff --git a/tests/phase_z2_ai_fallback/test_prompts.py b/tests/phase_z2_ai_fallback/test_prompts.py new file mode 100644 index 0000000..25c2ca9 --- /dev/null +++ b/tests/phase_z2_ai_fallback/test_prompts.py @@ -0,0 +1,100 @@ +"""IMP-33 u3 — fallback prompt builder tests. + +Scope (Stage 2 plan, u3): + - Prompt is built only when V4 route == 'ai_adaptation_required'. + - System prompt declares MDX READ-ONLY and pins the u2 whitelist. + - System prompt forbids the u2 forbidden kinds + frame_id swap. + - User payload carries all 6 declared inputs and labels MDX READ_ONLY. 
+""" +from __future__ import annotations + +import json + +import pytest + +from src.phase_z2_ai_fallback.prompts import ( + SYSTEM_PROMPT, + V4_ROUTE_AI_ADAPTATION, + build_ai_fallback_prompt, +) +from src.phase_z2_ai_fallback.schema import FORBIDDEN_KINDS, ProposalKind + + +def _v4(route: str = V4_ROUTE_AI_ADAPTATION) -> dict: + return { + "route": route, + "cardinality": {"strict": 3}, + "label": "restructure", + "frame_id": 1171281190, + "rank": 1, + } + + +def _inputs(route: str = V4_ROUTE_AI_ADAPTATION) -> dict: + return { + "v4_result": _v4(route), + "frame_contract": {"template_id": "three_parallel_requirements"}, + "frame_visual_html": "", + "figma_partial_json": {"nodes": []}, + "internal_region": {"id": "region_top", "bbox": [0, 0, 1200, 320]}, + "mdx_text": "# 대목차\n- 항목 1\n- 항목 2\n- 항목 3", + } + + +def test_system_prompt_declares_mdx_read_only() -> None: + assert "READ-ONLY" in SYSTEM_PROMPT + + +def test_system_prompt_lists_all_whitelisted_kinds() -> None: + for kind in ProposalKind: + assert kind.value in SYSTEM_PROMPT + + +def test_system_prompt_forbids_all_forbidden_kinds() -> None: + for forbidden in FORBIDDEN_KINDS: + assert forbidden in SYSTEM_PROMPT + + +def test_system_prompt_locks_frame_id_swap() -> None: + assert "frame_id" in SYSTEM_PROMPT + + +def test_build_prompt_returns_system_and_user() -> None: + prompt = build_ai_fallback_prompt(**_inputs()) + assert set(prompt.keys()) == {"system", "user"} + assert prompt["system"] == SYSTEM_PROMPT + + +def test_user_payload_carries_all_inputs_and_marks_mdx_read_only() -> None: + prompt = build_ai_fallback_prompt(**_inputs()) + payload = json.loads(prompt["user"]) + assert payload["v4"]["route"] == V4_ROUTE_AI_ADAPTATION + assert payload["v4"]["cardinality"] == {"strict": 3} + assert payload["v4"]["frame_id"] == 1171281190 + assert payload["frame_contract"]["template_id"] == "three_parallel_requirements" + assert payload["frame_visual_html"] == "" + assert payload["figma_partial_json"] == {"nodes": 
[]} + assert payload["internal_region"]["id"] == "region_top" + assert "mdx_text_READ_ONLY" in payload + assert payload["mdx_text_READ_ONLY"].startswith("# 대목차") + assert "mdx_text" not in payload # only the READ_ONLY key, not a writable alias + + +@pytest.mark.parametrize( + "route", ["direct_render", "deterministic_minor_adjustment", "design_reference_only", None] +) +def test_non_ai_route_rejected(route) -> None: + inputs = _inputs(route=route) if route is not None else _inputs() + if route is None: + inputs["v4_result"].pop("route") + with pytest.raises(ValueError, match=V4_ROUTE_AI_ADAPTATION): + build_ai_fallback_prompt(**inputs) + + +def test_cardinality_signature_alias_accepted() -> None: + """Some V4 callers expose ``cardinality_signature``; both keys must resolve.""" + inputs = _inputs() + inputs["v4_result"].pop("cardinality") + inputs["v4_result"]["cardinality_signature"] = {"strict": 4} + payload = json.loads(build_ai_fallback_prompt(**inputs)["user"]) + assert payload["v4"]["cardinality"] == {"strict": 4} diff --git a/tests/phase_z2_ai_fallback/test_router.py b/tests/phase_z2_ai_fallback/test_router.py new file mode 100644 index 0000000..671b21e --- /dev/null +++ b/tests/phase_z2_ai_fallback/test_router.py @@ -0,0 +1,156 @@ +"""IMP-33 u7 — AI fallback router tests. 
+ +Scope (Stage 2 plan, u7): + - flag-off gate returns None and does NOT touch the client / prompt + - route-mismatch gate returns None and does NOT touch the client / prompt + - cache-hit short-circuits the client and still re-validates against the + current frame contract (defence-in-depth) + - cache-miss calls the client and validates the returned proposal + - validation errors propagate + - budget / circuit exceptions from u4 propagate + - router never imports ``save_proposal`` (cache save is caller-driven + after visual_check + user_approved per u6 IMP-46 gate) +""" +from __future__ import annotations + +from unittest.mock import MagicMock + +import pytest + +from src.phase_z2_ai_fallback import AiFallbackProposal, ProposalKind +from src.phase_z2_ai_fallback import router as router_mod +from src.phase_z2_ai_fallback.client import ( + AiFallbackBudgetExceeded, + AiFallbackCircuitOpen, + AiFallbackClient, +) +from src.phase_z2_ai_fallback.router import route_ai_fallback +from src.phase_z2_ai_fallback.validate import AiFallbackValidationError + + +_FRAME_CONTRACT = { + "frame_id": 1171281190, + "sub_zones": [{"id": "pillar_1", "accepts": ["text_block"]}], + "payload": {"builder_options": {"item_parser": "pillar_item"}}, +} +_REGION = {"id": "zone_top.region_a"} +_V4_AI = { + "route": "ai_adaptation_required", + "cardinality": "many", + "frame_id": 1171281190, + "rank": 1, +} +_V4_NOT_AI = {"route": "light_edit", "cardinality": "many"} + + +def _make_proposal( + kind: ProposalKind = ProposalKind.PARTIAL_OVERRIDES, + payload: dict | None = None, +) -> AiFallbackProposal: + return AiFallbackProposal( + proposal_kind=kind, + payload=payload if payload is not None else {"slots": {"pillar_1": "a"}}, + ) + + +def _call_kwargs() -> dict: + return dict( + cache_key="frame:1171281190:cardinality:many", + v4_result=_V4_AI, + frame_contract=_FRAME_CONTRACT, + frame_visual_html="
", + figma_partial_json={}, + internal_region=_REGION, + mdx_text="# example\n- a\n- b", + ) + + +def test_router_returns_none_when_flag_off(monkeypatch): + monkeypatch.setattr(router_mod.settings, "ai_fallback_enabled", False) + client = MagicMock(spec=AiFallbackClient) + result = route_ai_fallback(**_call_kwargs(), client=client) + assert result is None + client.request_proposal.assert_not_called() + + +def test_router_returns_none_when_route_not_ai_adaptation(monkeypatch): + monkeypatch.setattr(router_mod.settings, "ai_fallback_enabled", True) + client = MagicMock(spec=AiFallbackClient) + kwargs = _call_kwargs() + kwargs["v4_result"] = _V4_NOT_AI + result = route_ai_fallback(**kwargs, client=client) + assert result is None + client.request_proposal.assert_not_called() + + +def test_router_returns_cached_when_cache_hit(monkeypatch): + monkeypatch.setattr(router_mod.settings, "ai_fallback_enabled", True) + cached = _make_proposal() + monkeypatch.setattr(router_mod, "read_proposal", lambda key: cached) + client = MagicMock(spec=AiFallbackClient) + result = route_ai_fallback(**_call_kwargs(), client=client) + assert result is cached + client.request_proposal.assert_not_called() + + +def test_router_validates_cached_proposal(monkeypatch): + monkeypatch.setattr(router_mod.settings, "ai_fallback_enabled", True) + bad_cached = AiFallbackProposal( + proposal_kind=ProposalKind.BUILDER_OPTIONS_PATCH, + payload={"unknown_key": "x"}, + ) + monkeypatch.setattr(router_mod, "read_proposal", lambda key: bad_cached) + client = MagicMock(spec=AiFallbackClient) + with pytest.raises(AiFallbackValidationError): + route_ai_fallback(**_call_kwargs(), client=client) + client.request_proposal.assert_not_called() + + +def test_router_calls_client_and_returns_validated_proposal(monkeypatch): + monkeypatch.setattr(router_mod.settings, "ai_fallback_enabled", True) + monkeypatch.setattr(router_mod, "read_proposal", lambda key: None) + proposal = _make_proposal() + client = 
MagicMock(spec=AiFallbackClient) + client.request_proposal.return_value = proposal + result = route_ai_fallback(**_call_kwargs(), client=client) + assert result is proposal + client.request_proposal.assert_called_once() + sent_prompt = client.request_proposal.call_args.args[0] + assert set(sent_prompt.keys()) == {"system", "user"} + + +def test_router_propagates_validation_error(monkeypatch): + monkeypatch.setattr(router_mod.settings, "ai_fallback_enabled", True) + monkeypatch.setattr(router_mod, "read_proposal", lambda key: None) + bad = AiFallbackProposal( + proposal_kind=ProposalKind.BUILDER_OPTIONS_PATCH, + payload={"unknown_key": "x"}, + ) + client = MagicMock(spec=AiFallbackClient) + client.request_proposal.return_value = bad + with pytest.raises(AiFallbackValidationError): + route_ai_fallback(**_call_kwargs(), client=client) + + +def test_router_propagates_budget_exceeded(monkeypatch): + monkeypatch.setattr(router_mod.settings, "ai_fallback_enabled", True) + monkeypatch.setattr(router_mod, "read_proposal", lambda key: None) + client = MagicMock(spec=AiFallbackClient) + client.request_proposal.side_effect = AiFallbackBudgetExceeded("over") + with pytest.raises(AiFallbackBudgetExceeded): + route_ai_fallback(**_call_kwargs(), client=client) + + +def test_router_propagates_circuit_open(monkeypatch): + monkeypatch.setattr(router_mod.settings, "ai_fallback_enabled", True) + monkeypatch.setattr(router_mod, "read_proposal", lambda key: None) + client = MagicMock(spec=AiFallbackClient) + client.request_proposal.side_effect = AiFallbackCircuitOpen("tripped") + with pytest.raises(AiFallbackCircuitOpen): + route_ai_fallback(**_call_kwargs(), client=client) + + +def test_router_does_not_import_save_proposal(): + """Cache save is caller-driven AFTER visual_check + user_approved (u6 IMP-46 + gate); structurally guaranteed by NOT importing save_proposal in the router.""" + assert not hasattr(router_mod, "save_proposal") diff --git a/tests/phase_z2_ai_fallback/test_schema.py 
b/tests/phase_z2_ai_fallback/test_schema.py new file mode 100644 index 0000000..d85a661 --- /dev/null +++ b/tests/phase_z2_ai_fallback/test_schema.py @@ -0,0 +1,46 @@ +"""IMP-33 u2 — AiFallbackProposal schema tests. + +Scope (Stage 2 plan, u2): + - Whitelisted proposal_kind values are accepted. + - Forbidden output forms are rejected: mdx_text / frame_id_change / raw_html / raw_css. + - extra fields outside the declared schema are rejected (MDX read-only signal). +""" +from __future__ import annotations + +import pytest +from pydantic import ValidationError + +from src.phase_z2_ai_fallback import AiFallbackProposal, ProposalKind + + +@pytest.mark.parametrize( + "kind_value", + [ + "builder_options_patch", + "partial_overrides", + "slot_mapping_proposal", + ], +) +def test_whitelisted_proposal_kinds_accepted(kind_value: str) -> None: + proposal = AiFallbackProposal(proposal_kind=kind_value) + assert proposal.proposal_kind == ProposalKind(kind_value) + + +@pytest.mark.parametrize( + "forbidden", + ["mdx_text", "frame_id_change", "raw_html", "raw_css"], +) +def test_forbidden_proposal_kinds_rejected(forbidden: str) -> None: + with pytest.raises(ValidationError): + AiFallbackProposal(proposal_kind=forbidden) + + +def test_unknown_proposal_kind_rejected() -> None: + with pytest.raises(ValidationError): + AiFallbackProposal(proposal_kind="something_else") + + +def test_extra_fields_rejected() -> None: + """`extra=forbid` keeps the AI from smuggling raw_html/mdx_text alongside a valid kind.""" + with pytest.raises(ValidationError): + AiFallbackProposal(proposal_kind="partial_overrides", raw_html="") diff --git a/tests/phase_z2_ai_fallback/test_step12.py b/tests/phase_z2_ai_fallback/test_step12.py new file mode 100644 index 0000000..f66eadc --- /dev/null +++ b/tests/phase_z2_ai_fallback/test_step12.py @@ -0,0 +1,193 @@ +"""IMP-33 u8 — Step 12 AI repair wiring tests. 
+ +Covers the two structural gates layered on top of the u7 router: + * IMP-30 provisional gate (only provisional units may invoke AI repair) + * Reject gate (route_hint=design_reference_only NEVER calls AI) +Plus the record-shape contract returned for downstream Step 12 artifacts. +""" +from __future__ import annotations + +from dataclasses import dataclass, field +from typing import Any +from unittest.mock import MagicMock + +from src.phase_z2_ai_fallback import step12 as step12_mod +from src.phase_z2_ai_fallback.schema import AiFallbackProposal, ProposalKind + + +@dataclass +class FakeUnit: + label: str | None + provisional: bool + frame_template_id: str = "tmpl" + frame_id: str = "fid" + source_section_ids: list[str] = field(default_factory=lambda: ["s1"]) + raw_content: str = "raw" + v4_rank: int | None = 1 + + +_ROUTE_HINTS: dict[str | None, str | None] = { + "use_as_is": "direct_render", + "light_edit": "deterministic_minor_adjustment", + "restructure": "ai_adaptation_required", + "reject": "design_reference_only", + None: None, +} + + +def _route_for_label(label: str | None) -> str | None: + return _ROUTE_HINTS.get(label) + + +def _get_contract(_tid: str) -> dict[str, Any]: + return {"frame_id": "fid", "payload": {"builder_options": {}}, "sub_zones": []} + + +def _frame_visual(_tid: str) -> str: + return "" + + +def _call( + units: list[FakeUnit], + *, + route_ai_fallback: Any | None = None, + **overrides: Any, +) -> list[dict]: + if route_ai_fallback is not None: + step12_mod.route_ai_fallback = route_ai_fallback # type: ignore[assignment] + kwargs: dict[str, Any] = dict( + route_for_label=_route_for_label, + get_contract_fn=_get_contract, + frame_visual_loader=_frame_visual, + ) + kwargs.update(overrides) + return step12_mod.gather_step12_ai_repair_proposals(units, **kwargs) + + +def test_non_provisional_unit_is_skipped_without_ai_call(monkeypatch): + router = MagicMock() + monkeypatch.setattr(step12_mod, "route_ai_fallback", router) + units = 
[FakeUnit(label="restructure", provisional=False)] + records = _call(units) + assert records[0]["ai_called"] is False + assert records[0]["skip_reason"] == "not_provisional" + assert records[0]["provisional"] is False + router.assert_not_called() + + +def test_reject_route_is_skipped_without_ai_call(monkeypatch): + router = MagicMock() + monkeypatch.setattr(step12_mod, "route_ai_fallback", router) + units = [FakeUnit(label="reject", provisional=True)] + records = _call(units) + assert records[0]["ai_called"] is False + assert records[0]["skip_reason"] == "design_reference_only_no_ai" + assert records[0]["route_hint"] == "design_reference_only" + router.assert_not_called() + + +def test_non_ai_route_is_skipped_with_reason(monkeypatch): + router = MagicMock() + monkeypatch.setattr(step12_mod, "route_ai_fallback", router) + units = [FakeUnit(label="light_edit", provisional=True)] + records = _call(units) + assert records[0]["ai_called"] is False + assert records[0]["skip_reason"] == ( + "route_not_ai_adaptation:deterministic_minor_adjustment" + ) + router.assert_not_called() + + +def test_router_short_circuit_returns_none_skip_reason(monkeypatch): + router = MagicMock(return_value=None) + monkeypatch.setattr(step12_mod, "route_ai_fallback", router) + units = [FakeUnit(label="restructure", provisional=True)] + records = _call(units) + assert records[0]["ai_called"] is False + assert records[0]["skip_reason"] == "router_short_circuit" + assert records[0]["proposal"] is None + router.assert_called_once() + + +def test_ai_adaptation_call_records_proposal(monkeypatch): + proposal = AiFallbackProposal( + proposal_kind=ProposalKind.PARTIAL_OVERRIDES, + payload={"slots": {"s_text": "x"}}, + rationale="r", + ) + router = MagicMock(return_value=proposal) + monkeypatch.setattr(step12_mod, "route_ai_fallback", router) + units = [FakeUnit(label="restructure", provisional=True)] + records = _call(units) + rec = records[0] + assert rec["ai_called"] is True + assert 
rec["skip_reason"] is None + assert rec["proposal"]["proposal_kind"] == "partial_overrides" + router.assert_called_once() + kwargs = router.call_args.kwargs + assert kwargs["v4_result"]["route"] == "ai_adaptation_required" + assert kwargs["v4_result"]["label"] == "restructure" + + +def test_router_exception_is_captured_per_record(monkeypatch): + router = MagicMock(side_effect=RuntimeError("transient_boom")) + monkeypatch.setattr(step12_mod, "route_ai_fallback", router) + units = [FakeUnit(label="restructure", provisional=True)] + records = _call(units) + rec = records[0] + assert rec["ai_called"] is True + assert rec["proposal"] is None + assert rec["error"] == "RuntimeError: transient_boom" + router.assert_called_once() + + +def test_mixed_units_each_independently_classified(monkeypatch): + router = MagicMock(return_value=None) + monkeypatch.setattr(step12_mod, "route_ai_fallback", router) + units = [ + FakeUnit(label="use_as_is", provisional=False), + FakeUnit(label="reject", provisional=True), + FakeUnit(label="restructure", provisional=True), + FakeUnit(label="restructure", provisional=False), + ] + records = _call(units) + assert [r["skip_reason"] for r in records] == [ + "not_provisional", + "design_reference_only_no_ai", + "router_short_circuit", + "not_provisional", + ] + assert router.call_count == 1 + + +def test_cache_key_includes_template_and_section_ids(monkeypatch): + router = MagicMock(return_value=None) + monkeypatch.setattr(step12_mod, "route_ai_fallback", router) + units = [ + FakeUnit( + label="restructure", + provisional=True, + frame_template_id="tmpl_abc", + source_section_ids=["02-1", "02-2"], + ) + ] + _call(units) + assert router.call_args.kwargs["cache_key"] == "tmpl_abc::02-1,02-2" + + +def test_record_shape_contract_is_stable(monkeypatch): + monkeypatch.setattr(step12_mod, "route_ai_fallback", MagicMock(return_value=None)) + units = [FakeUnit(label="reject", provisional=True)] + rec = _call(units)[0] + assert set(rec.keys()) == { + 
"unit_index", + "source_section_ids", + "frame_template_id", + "label", + "route_hint", + "provisional", + "ai_called", + "skip_reason", + "proposal", + "error", + } diff --git a/tests/phase_z2_ai_fallback/test_step17.py b/tests/phase_z2_ai_fallback/test_step17.py new file mode 100644 index 0000000..373ec6d --- /dev/null +++ b/tests/phase_z2_ai_fallback/test_step17.py @@ -0,0 +1,208 @@ +"""IMP-33 u9 — Step 17 AI repair wiring tests (BLOCKED until IMP-34 + IMP-35). + +Covers: + * :data:`OVERFLOW_CASCADE_ORDER` canonical order (4 stages). + * :class:`OverflowCascadeStage` member values. + * :data:`STEP17_AI_REPAIR_BLOCKED_REASON` constant value. + * :func:`gather_step17_ai_repair_proposals` BLOCKED contract — every unit + returns ``ai_called=False`` + ``skip_reason=STEP17_AI_REPAIR_BLOCKED_REASON`` + + ``proposal=None`` regardless of provisional / label / route_hint. + * Structural guarantee — the u9 module does NOT import + :func:`src.phase_z2_ai_fallback.router.route_ai_fallback` or the + ``anthropic`` SDK. Step 17 AI repair stays structurally blocked. 
+""" +from __future__ import annotations + +import ast +from dataclasses import dataclass, field +from pathlib import Path + +from src.phase_z2_ai_fallback import step17 as step17_mod +from src.phase_z2_ai_fallback.step17 import ( + OVERFLOW_CASCADE_ORDER, + STEP17_AI_REPAIR_BLOCKED_REASON, + OverflowCascadeStage, + gather_step17_ai_repair_proposals, +) + + +@dataclass +class FakeUnit: + label: str | None + provisional: bool + frame_template_id: str = "tmpl" + frame_id: str = "fid" + source_section_ids: list[str] = field(default_factory=lambda: ["s1"]) + raw_content: str = "raw" + v4_rank: int | None = 1 + + +_ROUTE_HINTS: dict[str | None, str | None] = { + "use_as_is": "direct_render", + "light_edit": "deterministic_minor_adjustment", + "restructure": "ai_adaptation_required", + "reject": "design_reference_only", + None: None, +} + + +def _route_for_label(label: str | None) -> str | None: + return _ROUTE_HINTS.get(label) + + +# ─── Stage / order constants ───────────────────────────────────────── + + +def test_overflow_cascade_order_is_canonical(): + assert OVERFLOW_CASCADE_ORDER == ( + OverflowCascadeStage.DETERMINISTIC, + OverflowCascadeStage.POPUP, + OverflowCascadeStage.AI_REPAIR, + OverflowCascadeStage.USER_OVERRIDE, + ) + + +def test_overflow_cascade_stage_string_values(): + assert OverflowCascadeStage.DETERMINISTIC.value == "deterministic" + assert OverflowCascadeStage.POPUP.value == "popup" + assert OverflowCascadeStage.AI_REPAIR.value == "ai_repair" + assert OverflowCascadeStage.USER_OVERRIDE.value == "user_override" + + +def test_step17_blocked_reason_constant_value(): + assert ( + STEP17_AI_REPAIR_BLOCKED_REASON + == "step17_ai_blocked_imp_34_35_prerequisites_missing" + ) + + +# ─── BLOCKED contract: every unit returns blocked record ───────────── + + +def test_gather_returns_one_record_per_unit(): + units = [ + FakeUnit(label="restructure", provisional=True), + FakeUnit(label="reject", provisional=False), + FakeUnit(label="use_as_is", 
provisional=True), + ] + records = gather_step17_ai_repair_proposals(units, route_for_label=_route_for_label) + assert len(records) == 3 + + +def test_gather_records_blocked_skip_reason(): + """Every record must carry the IMP-34/IMP-35 prerequisite block reason.""" + units = [FakeUnit(label="restructure", provisional=True)] + records = gather_step17_ai_repair_proposals(units, route_for_label=_route_for_label) + assert records[0]["skip_reason"] == STEP17_AI_REPAIR_BLOCKED_REASON + + +def test_gather_blocks_even_when_route_is_ai_adaptation_required(): + """Provisional + ai_adaptation_required must NOT bypass the u9 block. + + Stage 2 contract: AI repair at Step 17 is blocked behind IMP-34 + IMP-35 + regardless of V4 route hint. Only u8 (Step 12) is allowed to invoke AI today. + """ + units = [FakeUnit(label="restructure", provisional=True)] + record = gather_step17_ai_repair_proposals( + units, route_for_label=_route_for_label + )[0] + assert record["route_hint"] == "ai_adaptation_required" + assert record["ai_called"] is False + assert record["proposal"] is None + assert record["skip_reason"] == STEP17_AI_REPAIR_BLOCKED_REASON + + +def test_gather_blocks_reject_units_too(): + """Reject units (design_reference_only) are also blocked at u9 — same reason.""" + units = [FakeUnit(label="reject", provisional=False)] + record = gather_step17_ai_repair_proposals( + units, route_for_label=_route_for_label + )[0] + assert record["ai_called"] is False + assert record["skip_reason"] == STEP17_AI_REPAIR_BLOCKED_REASON + + +def test_gather_records_proposal_none_and_no_error(): + units = [FakeUnit(label="restructure", provisional=True)] + record = gather_step17_ai_repair_proposals( + units, route_for_label=_route_for_label + )[0] + assert record["proposal"] is None + assert record["error"] is None + + +def test_gather_records_cascade_stage_is_ai_repair(): + units = [FakeUnit(label="restructure", provisional=True)] + record = gather_step17_ai_repair_proposals( + units, 
route_for_label=_route_for_label + )[0] + assert record["cascade_stage"] == OverflowCascadeStage.AI_REPAIR.value + + +def test_gather_preserves_unit_metadata(): + units = [ + FakeUnit( + label="restructure", + provisional=True, + frame_template_id="frame_05_overview", + source_section_ids=["s1", "s2"], + ) + ] + record = gather_step17_ai_repair_proposals( + units, route_for_label=_route_for_label + )[0] + assert record["unit_index"] == 0 + assert record["frame_template_id"] == "frame_05_overview" + assert record["source_section_ids"] == ["s1", "s2"] + assert record["label"] == "restructure" + assert record["provisional"] is True + + +def test_gather_with_empty_units_returns_empty_list(): + records = gather_step17_ai_repair_proposals([], route_for_label=_route_for_label) + assert records == [] + + +# ─── Structural guarantee: u9 must NOT import route_ai_fallback / anthropic ─ + + +def _u9_imports() -> list[str]: + src_path = Path(step17_mod.__file__) + tree = ast.parse(src_path.read_text(encoding="utf-8")) + imports: list[str] = [] + for node in ast.walk(tree): + if isinstance(node, ast.Import): + imports.extend(alias.name for alias in node.names) + elif isinstance(node, ast.ImportFrom): + module = node.module or "" + for alias in node.names: + imports.append(f"{module}.{alias.name}") + return imports + + +def test_step17_module_does_not_import_route_ai_fallback(): + """u9 must not be able to reach the u7 router — structural block.""" + imports = _u9_imports() + forbidden = { + "src.phase_z2_ai_fallback.router.route_ai_fallback", + "src.phase_z2_ai_fallback.router", + } + assert not any(imp in forbidden for imp in imports), imports + assert not hasattr(step17_mod, "route_ai_fallback") + + +def test_step17_module_does_not_import_anthropic(): + """u9 must not reach the Anthropic SDK directly — AI=0 in this layer.""" + imports = _u9_imports() + leaked = [imp for imp in imports if imp.split(".", 1)[0] == "anthropic"] + assert leaked == [], leaked + + +def 
test_step17_module_does_not_import_ai_fallback_client(): + """u9 must not instantiate the u4 client either.""" + imports = _u9_imports() + forbidden_prefixes = ("src.phase_z2_ai_fallback.client",) + leaked = [ + imp for imp in imports if imp.startswith(forbidden_prefixes) + ] + assert leaked == [], leaked diff --git a/tests/phase_z2_ai_fallback/test_validate.py b/tests/phase_z2_ai_fallback/test_validate.py new file mode 100644 index 0000000..532e551 --- /dev/null +++ b/tests/phase_z2_ai_fallback/test_validate.py @@ -0,0 +1,144 @@ +"""IMP-33 u5 — AI fallback validator tests. + +Scope (Stage 2 plan, u5): + - schema re-validation (defence-in-depth) + - builder whitelist (BUILDER_OPTIONS_PATCH) + - dropped-slot guard (PARTIAL_OVERRIDES / SLOT_MAPPING_PROPOSAL must keep + every declared sub_zone slot present) + - frame-swap guard (no payload.frame_id mutation; V4 rank-1 protected) + - Internal Region containment (payload.region_id must match declared id) +""" +from __future__ import annotations + +import pytest + +from src.phase_z2_ai_fallback import AiFallbackProposal, ProposalKind +from src.phase_z2_ai_fallback.validate import ( + AiFallbackValidationError, + validate_proposal, +) + + +_FRAME_CONTRACT = { + "frame_id": 1171281190, + "sub_zones": [ + {"id": "pillar_1", "accepts": ["text_block"]}, + {"id": "pillar_2", "accepts": ["text_block"]}, + {"id": "pillar_3", "accepts": ["text_block"]}, + ], + "payload": { + "builder_options": { + "item_parser": "pillar_item", + "array_root": "pillars", + "role_field": "color_class", + }, + }, +} + +_REGION = {"id": "zone_top.region_a"} + + +def _make(kind: ProposalKind, payload: dict) -> AiFallbackProposal: + return AiFallbackProposal(proposal_kind=kind, payload=payload) + + +def test_builder_options_patch_accepts_whitelisted_keys() -> None: + proposal = _make( + ProposalKind.BUILDER_OPTIONS_PATCH, + {"item_parser": "alt_pillar_item"}, + ) + validate_proposal(proposal, frame_contract=_FRAME_CONTRACT) + + +def 
test_builder_options_patch_rejects_unknown_key() -> None: + proposal = _make( + ProposalKind.BUILDER_OPTIONS_PATCH, + {"item_parser": "x", "padding_px": 10}, + ) + with pytest.raises(AiFallbackValidationError, match="builder whitelist"): + validate_proposal(proposal, frame_contract=_FRAME_CONTRACT) + + +def test_partial_overrides_requires_all_declared_slots() -> None: + proposal = _make( + ProposalKind.PARTIAL_OVERRIDES, + {"slots": {"pillar_1": "a", "pillar_2": "b"}}, + ) + with pytest.raises(AiFallbackValidationError, match="dropped-slot guard"): + validate_proposal(proposal, frame_contract=_FRAME_CONTRACT) + + +def test_partial_overrides_with_all_slots_passes() -> None: + proposal = _make( + ProposalKind.PARTIAL_OVERRIDES, + {"slots": {"pillar_1": "a", "pillar_2": "b", "pillar_3": "c"}}, + ) + validate_proposal(proposal, frame_contract=_FRAME_CONTRACT) + + +def test_slot_mapping_proposal_requires_slots_dict() -> None: + proposal = _make(ProposalKind.SLOT_MAPPING_PROPOSAL, {"slots": []}) + with pytest.raises(AiFallbackValidationError, match="dropped-slot guard"): + validate_proposal(proposal, frame_contract=_FRAME_CONTRACT) + + +def test_frame_swap_guard_rejects_mismatched_frame_id() -> None: + proposal = _make( + ProposalKind.BUILDER_OPTIONS_PATCH, + {"frame_id": 9999, "item_parser": "x"}, + ) + with pytest.raises(AiFallbackValidationError, match="frame-swap guard"): + validate_proposal(proposal, frame_contract=_FRAME_CONTRACT) + + +def test_frame_swap_guard_accepts_matching_frame_id() -> None: + proposal = _make( + ProposalKind.PARTIAL_OVERRIDES, + { + "frame_id": 1171281190, + "slots": {"pillar_1": "a", "pillar_2": "b", "pillar_3": "c"}, + }, + ) + validate_proposal(proposal, frame_contract=_FRAME_CONTRACT) + + +def test_internal_region_containment_rejects_mismatch() -> None: + proposal = _make( + ProposalKind.PARTIAL_OVERRIDES, + { + "slots": {"pillar_1": "a", "pillar_2": "b", "pillar_3": "c"}, + "region_id": "zone_bottom.region_x", + }, + ) + with 
pytest.raises(AiFallbackValidationError, match="Internal Region"): + validate_proposal( + proposal, + frame_contract=_FRAME_CONTRACT, + internal_region=_REGION, + ) + + +def test_internal_region_containment_accepts_match() -> None: + proposal = _make( + ProposalKind.PARTIAL_OVERRIDES, + { + "slots": {"pillar_1": "a", "pillar_2": "b", "pillar_3": "c"}, + "region_id": "zone_top.region_a", + }, + ) + validate_proposal( + proposal, + frame_contract=_FRAME_CONTRACT, + internal_region=_REGION, + ) + + +def test_internal_region_check_skipped_when_no_region_supplied() -> None: + proposal = _make( + ProposalKind.PARTIAL_OVERRIDES, + { + "slots": {"pillar_1": "a", "pillar_2": "b", "pillar_3": "c"}, + "region_id": "zone_top.region_a", + }, + ) + validate_proposal(proposal, frame_contract=_FRAME_CONTRACT) diff --git a/tests/test_phase_z2_ai_fallback_config.py b/tests/test_phase_z2_ai_fallback_config.py new file mode 100644 index 0000000..877dab3 --- /dev/null +++ b/tests/test_phase_z2_ai_fallback_config.py @@ -0,0 +1,46 @@ +"""IMP-33 u1 — AI fallback Settings defaults (locked). + +These defaults are the binding contract from Stage 2 plan (per-unit u1): + - ai_fallback_enabled = False (master flag OFF; fallback path only) + - ai_fallback_model = "claude-opus-4-6-20250415" + - ai_fallback_timeout_s = 60.0 + - ai_fallback_max_retries = 3 + - ai_fallback_backoff_base_s = 1.0 + - ai_fallback_backoff_cap_s = 8.0 + - ai_fallback_backoff_jitter = 0.3 + - ai_fallback_budget_per_run = 10 + - ai_fallback_circuit_breaker_threshold = 5 + +Downstream u4 (client) MUST source timeout/retry/backoff/budget/circuit from +Settings; inline literals are forbidden by Stage 2 plan. +""" +from __future__ import annotations + +from src.config import Settings + + +def test_ai_fallback_master_flag_default_off() -> None: + s = Settings() + assert s.ai_fallback_enabled is False, ( + "AI fallback master flag MUST default OFF (normal path AI=0 contract)." 
+ ) + + +def test_ai_fallback_model_default_locked() -> None: + s = Settings() + assert s.ai_fallback_model == "claude-opus-4-6-20250415" + + +def test_ai_fallback_retry_timeout_backoff_defaults_locked() -> None: + s = Settings() + assert s.ai_fallback_timeout_s == 60.0 + assert s.ai_fallback_max_retries == 3 + assert s.ai_fallback_backoff_base_s == 1.0 + assert s.ai_fallback_backoff_cap_s == 8.0 + assert s.ai_fallback_backoff_jitter == 0.3 + + +def test_ai_fallback_budget_and_circuit_defaults_locked() -> None: + s = Settings() + assert s.ai_fallback_budget_per_run == 10 + assert s.ai_fallback_circuit_breaker_threshold == 5
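
The u1 docstring above requires the u4 client to read timeout, retry, and backoff values from Settings rather than inline literals. A minimal sketch of how a retry delay might be derived from those defaults follows, assuming capped exponential backoff with symmetric jitter. `FallbackSettings`, `backoff_delay`, and `retry_schedule` are hypothetical names used only for illustration, and the formula is an assumption; this diff does not show the real client.

```python
import random
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class FallbackSettings:
    """Hypothetical stand-in for the ai_fallback_* fields on src.config.Settings."""

    ai_fallback_max_retries: int = 3
    ai_fallback_backoff_base_s: float = 1.0
    ai_fallback_backoff_cap_s: float = 8.0
    ai_fallback_backoff_jitter: float = 0.3


def backoff_delay(
    settings: FallbackSettings,
    attempt: int,
    rng: Optional[random.Random] = None,
) -> float:
    """Delay in seconds before retry `attempt` (0-based). Assumed formula."""
    rng = rng or random.Random()
    # Exponential growth from the base value, clamped to the configured cap.
    raw = min(
        settings.ai_fallback_backoff_cap_s,
        settings.ai_fallback_backoff_base_s * (2 ** attempt),
    )
    # Symmetric jitter: scale by a factor in [1 - jitter, 1 + jitter].
    jitter = settings.ai_fallback_backoff_jitter
    return raw * (1.0 + rng.uniform(-jitter, jitter))


def retry_schedule(
    settings: FallbackSettings,
    rng: Optional[random.Random] = None,
) -> List[float]:
    """One delay per allowed retry, all sourced from settings (no literals)."""
    return [
        backoff_delay(settings, attempt, rng)
        for attempt in range(settings.ai_fallback_max_retries)
    ]
```

Under this assumed formula, setting jitter to 0 gives delays of 1.0, 2.0, and 4.0 seconds for the three default retries, and any later attempt would be clamped to the 8.0 second cap.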