feat(#61): IMP-33 AI fallback scaffolding (u1~u11, flag default OFF)
Frame-aware AI fallback module scaffolded under src/phase_z2_ai_fallback/ with master flag ai_fallback_enabled=False; normal-path AI call count remains 0.

- AI output constrained to builder_options_patch / partial_overrides / slot_mapping_proposal; MDX / frame_id / raw HTML / raw CSS mutations rejected at schema layer.
- IMP-46 cache gate (cache.py) raises AiFallbackCacheGateError unless visual_check_passed AND user_approved.
- Step 12 wires AI repair after IMP-30 provisional payload only; Step 17 stays blocked behind IMP-34 / IMP-35 prerequisites.
- AST isolation guard forbids fallback package from importing Phase Q / Kei / pipeline runtime symbols.
- Docs IMP-17 / IMP-31 bound to runtime module surface via 11-row structural test pin (test_docs_sync.py) so drift fails CI.

Tests: 116 fallback / 161 phase_z2 regression / 526 scoped full sweep all passing. Existing pre-IMP-33 fixture issue in scripts/test_phase_t_* remains untouched (out of scope).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@@ -1,6 +1,6 @@
# IMP-17 — AI repair fallback infrastructure (carve-out)

**Status**: carve-out, **design-only**. Normal-path AI calls = 0. No runtime fallback code lands until the activation gate clears.
**Status**: carve-out infra **scaffolded under IMP-33** (issue #61, Stage 3 u1~u11). Normal-path AI calls = 0 (PZ-1) — `ai_fallback_enabled` flag default `False` in `src/config.py`. Runtime AI is reachable only via fallback path entry points; Step 12 entry is provisional-gated, Step 17 entry is structurally blocked behind IMP-34 + IMP-35.

**Source anchors**
- IMP-17 backlog row — [`PHASE-Z-IMPLEMENTATION-ISSUE-BACKLOG.md`](PHASE-Z-IMPLEMENTATION-ISSUE-BACKLOG.md):68 (carve-out — outside the normal path, soft link IMP-04 + IMP-05).
@@ -42,3 +42,14 @@ Phase Q `content_editor.py` is an **Archive Candidate** ([`PHASE-Q-AUDIT.md`](PHAS
- There are no AI calls on the normal path (Phase Z principle, [memory `feedback_ai_isolation_contract`](../../README.md)).
- The output unit is always a content_object / Internal Region / Frame Slot or a restructuring proposal — never HTML structure / layout / preset decisions.
- Severed from Phase Q assets (Kei persona prompts, Kei-API endpoint, persona retry semantics). The Phase Z fallback runtime starts from a separate prompt / endpoint design (once this carve-out is activated).

## Runtime module surface (IMP-33 u1~u11 binding)

| Axis | Binding |
|---|---|
| Module path | `src/phase_z2_ai_fallback/` (locked by [`IMP-31-GATE-AUDIT.md`](IMP-31-GATE-AUDIT.md):31,50,56). |
| Step 12 entry | `src.phase_z2_ai_fallback.step12.gather_step12_ai_repair_proposals` — IMP-30 provisional gate (`not_provisional` skip) AND reject gate (`design_reference_only_no_ai` skip) AND non-AI route catch-all run BEFORE `route_ai_fallback`. |
| Step 17 entry | `src.phase_z2_ai_fallback.step17.gather_step17_ai_repair_proposals` — STRUCTURALLY BLOCKED. Every unit returns `skip_reason="step17_ai_blocked_imp_34_35_prerequisites_missing"`. Module does NOT import `route_ai_fallback` / `AiFallbackClient` / `anthropic`. |
| Cascade order | `src.phase_z2_ai_fallback.step17.OVERFLOW_CASCADE_ORDER = (DETERMINISTIC, POPUP, AI_REPAIR, USER_OVERRIDE)` — single source of truth for Step 17 consumers. Aligns with line 16 of this doc. |
| IMP-46 cache gate | `src.phase_z2_ai_fallback.cache.save_proposal(..., visual_check_passed, user_approved)` raises `AiFallbackCacheGateError` unless BOTH gates are True; storage backend then raises `NotImplementedError` (IMP-46 marker). `read_proposal` returns `None` until IMP-46 lands a backend. |
| AST isolation | `tests/phase_z2_ai_fallback/test_ast_isolation.py` parses every `*.py` under `src/phase_z2_ai_fallback/` and forbids Phase Q runtime / Kei client / `src.phase_z2_*` (non-fallback) imports. Whitelist = `src.config` + intra-package + stdlib + `anthropic` + `pydantic`. |
@@ -28,7 +28,7 @@ Anchor pin: `tests/orchestrator_unit/test_imp17_comment_anchor.py`. Synced in [`
| 2 | B4 frame_selection evidence integration complete | **NOT CLEAR** (⚠ partial) | [`PHASE-Z-PIPELINE-STATUS-BOARD.md`](PHASE-Z-PIPELINE-STATUS-BOARD.md):48 Step 9 ⚠ partial; :82 "V4 evidence for B4 frame_selection not integrated"; :126 (j) ❌ pending. |
| 3 | IMP-04 catalog expansion + IMP-05 V4 fallback live | **AMBIGUOUS** | `templates/phase_z2/catalog/frame_contracts.yaml` = 11 `template_id:` entries vs 32 target. IMP-05 V4 rank-2/3 fallback selector logic live, but catalog coverage gates real semantics. |

**Verdict**: gate **NOT CLEAR**. Runtime AI adaptation remains design-only. `src/phase_z2_ai_fallback/` = declaration-only path (not created this cycle).
**Verdict**: gate **NOT CLEAR**. Runtime AI adaptation remains gated. `src/phase_z2_ai_fallback/` = **scaffolded under IMP-33** (#61, Stage 3 u1~u11); module created, but `settings.ai_fallback_enabled` defaults to `False` (u1) so normal-path AI call count remains 0 (PZ-1). Runtime engagement still requires the 3-condition AND gate above.

## Issue-body axis verdict
@@ -47,13 +47,13 @@ Anchor pin: `tests/orchestrator_unit/test_imp17_comment_anchor.py`. Synced in [`

## Out of scope (this cycle)

Runtime AI module, `src/phase_z2_ai_fallback/` directory creation, prompt implementation, `candidate_evidence` schema change, Phase Q file mutation, Kei API reuse, frontend zone override (IMP-29 scope), IMP-30 invariant change, `calculate_fit` migration.
Runtime AI consumer enablement (flag default OFF), `candidate_evidence` schema change, Phase Q file mutation, Kei API reuse, frontend zone override (IMP-29 scope), IMP-30 invariant change, `calculate_fit` migration. Note: `src/phase_z2_ai_fallback/` directory scaffold itself was created under IMP-33 (#61, Stage 3 u1~u11) — see [`IMP-17-CARVE-OUT.md`](IMP-17-CARVE-OUT.md) §"Runtime module surface".

## Future activation path (declaration only)
## Future activation path

When the 3-condition AND gate clears (User GO ∧ B4 V4 evidence integrated ∧ catalog 32/32 + IMP-05 V4 fallback live):

- Runtime AI module path = `src/phase_z2_ai_fallback/` (not created this cycle).
- Runtime AI module path = `src/phase_z2_ai_fallback/` (scaffolded under IMP-33; flag default OFF until gate clears).
- Provider = Anthropic API only. Prompt design starts fresh (no Phase Q `EDITOR_PROMPT` import).
- Output granularity = content_object → Internal Region / Frame Slot placement proposal. Frame / layout / zone topology selection remains deterministic.
- Activation tracker = this issue (#40, IMP-31). No new IMP ID issued.
@@ -14,6 +14,18 @@ class Settings(BaseSettings):
    slide_width: int = 1280
    slide_height: int = 720

    # IMP-33 u1 — AI fallback policy. Fallback-path only; normal path AI=0.
    # Defaults locked by Stage 2 plan; do NOT inline literals downstream.
    ai_fallback_enabled: bool = False
    ai_fallback_model: str = "claude-opus-4-6-20250415"
    ai_fallback_timeout_s: float = 60.0
    ai_fallback_max_retries: int = 3
    ai_fallback_backoff_base_s: float = 1.0
    ai_fallback_backoff_cap_s: float = 8.0
    ai_fallback_backoff_jitter: float = 0.3
    ai_fallback_budget_per_run: int = 10
    ai_fallback_circuit_breaker_threshold: int = 5

    model_config = {"env_file": ".env", "env_file_encoding": "utf-8"}

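To make the retry knobs concrete, the sketch below computes the delay schedule implied by the defaults above, assuming the capped exponential formula `min(cap, base * 2**attempt)` that the client in this commit applies between attempts. The function name `backoff_schedule` is hypothetical. Jitter, a random extra of up to 30% of each delay, is shown only as an upper bound so the numbers stay deterministic.

```python
def backoff_schedule(max_retries: int, base_s: float, cap_s: float) -> list[float]:
    """Delays slept between attempts, before jitter is added."""
    # One sleep follows each failed attempt except the final one, so
    # max_retries retries produce max_retries sleeps.
    return [min(cap_s, base_s * (2 ** attempt)) for attempt in range(max_retries)]


# Defaults from the Settings hunk above.
delays = backoff_schedule(max_retries=3, base_s=1.0, cap_s=8.0)

# Upper bound per sleep once jitter (0.3) is applied on top of each delay.
jitter_upper_bound = [d * (1 + 0.3) for d in delays]

print(delays)       # [1.0, 2.0, 4.0]
print(sum(delays))  # 7.0
```

With three retries the 8-second cap is never reached; it would first apply on a fourth sleep.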
15
src/phase_z2_ai_fallback/__init__.py
Normal file
@@ -0,0 +1,15 @@
"""IMP-33 AI fallback package (fallback path only).

Module path locked by IMP-31-GATE-AUDIT.md (Stage 1 binding).
Normal path AI call count MUST remain 0; this package only executes under
classified fallback routes (reject / restructure / overflow). See
`feedback_ai_isolation_contract`.
"""
from __future__ import annotations

from src.phase_z2_ai_fallback.schema import (
    AiFallbackProposal,
    ProposalKind,
)

__all__ = ["AiFallbackProposal", "ProposalKind"]
82
src/phase_z2_ai_fallback/cache.py
Normal file
@@ -0,0 +1,82 @@
"""IMP-33 u6 — AI fallback proposal cache (IMP-46 gate, no persistent storage).

This module defines the cache contract that IMP-33 callers use to remember
AI fallback proposals across runs. The persistent storage layer itself is
out-of-scope for IMP-33 and is owned by IMP-46 (frame transformation cache).

Behaviour locked by Stage 2 plan (u6):

* ``read_proposal(key)`` always returns ``None`` until IMP-46 lands a
  persistent backend. Callers MUST handle the cache-miss path.
* ``save_proposal(key, proposal, *, visual_check_passed, user_approved)``
  enforces the IMP-46 gate before any storage write is attempted:

    - ``visual_check_passed=False`` -> ``AiFallbackCacheGateError``
    - ``user_approved=False`` -> ``AiFallbackCacheGateError``

  Only when BOTH gates are True does control reach the storage layer,
  which currently raises ``NotImplementedError`` (the IMP-46 marker).

Guardrails:

* No Anthropic import; cache is pure proposal bookkeeping.
* No MDX read/write; proposals are u2 ``AiFallbackProposal`` instances.
* No silent persistence: gate violations are loud, not skipped writes
  (`feedback_artifact_status_naming`).
"""
from __future__ import annotations

from src.phase_z2_ai_fallback.schema import AiFallbackProposal


class AiFallbackCacheGateError(RuntimeError):
    """Raised when ``save_proposal`` is called without both IMP-46 gates True."""


def read_proposal(key: str) -> AiFallbackProposal | None:
    """Look up a previously cached proposal by ``key``.

    IMP-33 ships without a persistent backend; this stub always returns
    ``None`` so callers exercise the cache-miss path. The persistent
    backend will be wired by IMP-46.
    """
    if not isinstance(key, str) or not key:
        raise ValueError("cache key must be a non-empty string")
    return None


def save_proposal(
    key: str,
    proposal: AiFallbackProposal,
    *,
    visual_check_passed: bool,
    user_approved: bool,
) -> None:
    """Persist ``proposal`` under ``key`` once both IMP-46 gates are True.

    Raises ``AiFallbackCacheGateError`` if either gate is False — the
    proposal is NOT written. When both gates are True, storage raises
    ``NotImplementedError`` (the IMP-46 persistent backend has not landed
    yet).
    """
    if not isinstance(key, str) or not key:
        raise ValueError("cache key must be a non-empty string")
    if not isinstance(proposal, AiFallbackProposal):
        raise TypeError(
            "proposal must be an AiFallbackProposal instance "
            f"(got {type(proposal).__name__})"
        )
    if not visual_check_passed:
        raise AiFallbackCacheGateError(
            "IMP-46 gate: visual_check_passed=False; refusing to cache an "
            "unverified proposal."
        )
    if not user_approved:
        raise AiFallbackCacheGateError(
            "IMP-46 gate: user_approved=False; refusing to cache without "
            "explicit user approval."
        )
    raise NotImplementedError(
        "IMP-46 persistent cache storage is not implemented yet; "
        "this is the IMP-33 u6 stub marker."
    )
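The gate ordering in `save_proposal` can be exercised without the real package. The sketch below re-declares a stand-in `save_proposal` with the same two gate checks. `FakeProposal` and `gate_outcome` are hypothetical names introduced for illustration, and the stand-in skips the key and proposal-type validation that the real function performs first.

```python
class AiFallbackCacheGateError(RuntimeError):
    """Stand-in for the exception defined in cache.py."""


class FakeProposal:
    """Hypothetical placeholder for AiFallbackProposal."""


def save_proposal(key, proposal, *, visual_check_passed, user_approved):
    # Same gate order as cache.py: visual check first, then user approval.
    if not visual_check_passed:
        raise AiFallbackCacheGateError("visual_check_passed=False")
    if not user_approved:
        raise AiFallbackCacheGateError("user_approved=False")
    # Both gates passed: the storage layer is still the IMP-46 stub.
    raise NotImplementedError("IMP-46 backend not implemented")


def gate_outcome(visual_check_passed: bool, user_approved: bool) -> str:
    """Name the outcome a caller would see for a given gate combination."""
    try:
        save_proposal(
            "example-key",
            FakeProposal(),
            visual_check_passed=visual_check_passed,
            user_approved=user_approved,
        )
    except AiFallbackCacheGateError:
        return "gate_error"
    except NotImplementedError:
        return "not_implemented"
    return "saved"


for flags in [(False, False), (True, False), (False, True), (True, True)]:
    print(flags, gate_outcome(*flags))
```

A caller that wants to tolerate the missing backend would therefore need to catch `NotImplementedError` separately from the gate error; whether real callers do so is not shown in this diff.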
92
src/phase_z2_ai_fallback/client.py
Normal file
@@ -0,0 +1,92 @@
"""IMP-33 u4 — AI fallback Anthropic client (fallback path only).

Wraps ``anthropic.Anthropic.messages.create`` with the timeout / retry /
backoff / budget / circuit-breaker policy locked in u1 ``Settings``. NO
inline policy literals: every knob is sourced from ``src.config.settings``.
Transient errors (timeout / connection / 429 / 5xx) are retried with
capped exponential backoff + jitter; all other errors propagate without
retry. PZ-1 invariant: this module is fallback-path only and MUST NOT be
imported on the normal pipeline path.
"""
from __future__ import annotations

import json
import random
import time
from dataclasses import dataclass
from typing import Any

import anthropic

from src.config import settings
from src.phase_z2_ai_fallback.schema import AiFallbackProposal

_TRANSIENT_ERRORS: tuple[type[BaseException], ...] = (
    anthropic.APITimeoutError,
    anthropic.APIConnectionError,
    anthropic.RateLimitError,
    anthropic.InternalServerError,
)

# Output cap is an Anthropic API requirement, not a policy knob (u1).
_MAX_OUTPUT_TOKENS = 4096


class AiFallbackBudgetExceeded(RuntimeError):
    """Per-run AI call budget (u1 ai_fallback_budget_per_run) exhausted."""


class AiFallbackCircuitOpen(RuntimeError):
    """Circuit breaker tripped (u1 ai_fallback_circuit_breaker_threshold)."""


@dataclass
class AiFallbackClient:
    """Stateful per-run fallback client (budget + circuit accounting)."""

    client: Any = None
    _calls: int = 0
    _consecutive_failures: int = 0

    def __post_init__(self) -> None:
        if self.client is None:
            self.client = anthropic.Anthropic(
                api_key=settings.anthropic_api_key,
                timeout=settings.ai_fallback_timeout_s,
            )

    def request_proposal(self, prompt: dict[str, str]) -> AiFallbackProposal:
        if self._calls >= settings.ai_fallback_budget_per_run:
            raise AiFallbackBudgetExceeded(
                f"per-run budget {settings.ai_fallback_budget_per_run} exhausted"
            )
        if self._consecutive_failures >= settings.ai_fallback_circuit_breaker_threshold:
            raise AiFallbackCircuitOpen(
                f"circuit open after {self._consecutive_failures} consecutive failures"
            )
        self._calls += 1
        last_error: BaseException | None = None
        for attempt in range(settings.ai_fallback_max_retries + 1):
            try:
                response = self.client.messages.create(
                    model=settings.ai_fallback_model,
                    max_tokens=_MAX_OUTPUT_TOKENS,
                    system=prompt["system"],
                    messages=[{"role": "user", "content": prompt["user"]}],
                )
                text = "".join(
                    block.text for block in response.content if hasattr(block, "text")
                )
                self._consecutive_failures = 0
                return AiFallbackProposal.model_validate(json.loads(text))
            except _TRANSIENT_ERRORS as err:
                last_error = err
                if attempt >= settings.ai_fallback_max_retries:
                    break
                base = settings.ai_fallback_backoff_base_s * (2 ** attempt)
                delay = min(settings.ai_fallback_backoff_cap_s, base)
                delay += random.uniform(0, delay * settings.ai_fallback_backoff_jitter)
                time.sleep(delay)
        self._consecutive_failures += 1
        assert last_error is not None
        raise last_error
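The budget and circuit-breaker accounting in `request_proposal` can be studied in isolation. The sketch below is a hypothetical reduction, not the client itself: `Accounting`, `BudgetExceeded`, and `CircuitOpen` are illustrative names, retries are omitted, and the built-in `TimeoutError` stands in for the Anthropic transient error types.

```python
from dataclasses import dataclass


class BudgetExceeded(RuntimeError):
    """Stand-in for AiFallbackBudgetExceeded."""


class CircuitOpen(RuntimeError):
    """Stand-in for AiFallbackCircuitOpen."""


@dataclass
class Accounting:
    """Hypothetical reduction of AiFallbackClient to its two counters."""

    budget_per_run: int = 10
    breaker_threshold: int = 5
    calls: int = 0
    consecutive_failures: int = 0

    def run(self, call):
        # Pre-flight checks happen before the request is counted.
        if self.calls >= self.budget_per_run:
            raise BudgetExceeded("per-run budget exhausted")
        if self.consecutive_failures >= self.breaker_threshold:
            raise CircuitOpen("circuit open")
        self.calls += 1
        try:
            result = call()
        except TimeoutError:
            # Retries omitted: one exhausted request counts as one failure.
            self.consecutive_failures += 1
            raise
        # Any success resets the failure streak.
        self.consecutive_failures = 0
        return result


def always_times_out():
    raise TimeoutError("simulated transient failure")


acct = Accounting()
for _ in range(5):
    try:
        acct.run(always_times_out)
    except TimeoutError:
        pass

print(acct.calls, acct.consecutive_failures)
```

Two properties carry over from the client as written: the budget counts requests rather than individual retry attempts, and once the failure streak reaches the threshold no later call on that instance can succeed and reset it, so the breaker stays open for the rest of the run.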
80
src/phase_z2_ai_fallback/prompts.py
Normal file
@@ -0,0 +1,80 @@
"""IMP-33 u3 — AI fallback prompt builder (fallback path only).

System+user prompt for the Anthropic client (u4). MDX is READ-ONLY
(`feedback_ai_isolation_contract`); output is constrained to the u2
schema; frame_id swap is forbidden (V4 rank-1 protected,
`feedback_phase_z_spacing_direction`). Inputs per Stage 2 plan: V4
result (route=ai_adaptation_required, cardinality), frame_contract,
frame_visual HTML, figma_to_html_agent partial JSON, Internal Region,
MDX text.
"""
from __future__ import annotations

import json
from typing import Any

from src.phase_z2_ai_fallback.schema import FORBIDDEN_KINDS, ProposalKind

V4_ROUTE_AI_ADAPTATION = "ai_adaptation_required"

_ALLOWED_KINDS = ", ".join(sorted(k.value for k in ProposalKind))
_FORBIDDEN_KINDS = ", ".join(sorted(FORBIDDEN_KINDS))

SYSTEM_PROMPT = (
    "You are an IMP-33 AI fallback adapter for Phase Z slide composition.\n"
    "STRICT RULES:\n"
    " 1. MDX text in the user payload is READ-ONLY. Do NOT rewrite, "
    "compress, or paraphrase MDX.\n"
    " 2. Output MUST be a single JSON object conforming to AiFallbackProposal.\n"
    f" 3. proposal_kind MUST be one of: {_ALLOWED_KINDS}.\n"
    f" 4. Do NOT propose any of: {_FORBIDDEN_KINDS}.\n"
    " 5. Do NOT change frame_id — V4 rank-1 frame is locked.\n"
    " 6. Keep declared frame slots (text/table/image/details) populated.\n"
    " 7. Respect Internal Region containment; place content units within "
    "the declared region only."
)


def build_ai_fallback_prompt(
    *,
    v4_result: dict[str, Any],
    frame_contract: dict[str, Any],
    frame_visual_html: str,
    figma_partial_json: dict[str, Any],
    internal_region: dict[str, Any],
    mdx_text: str,
) -> dict[str, str]:
    """Build system+user prompt strings for the fallback AI adapter.

    Raises:
        ValueError: when ``v4_result.route`` is not
            ``ai_adaptation_required`` — the fallback prompt MUST NOT be
            built outside this route (normal-path AI call count must
            remain 0; PZ-1).
    """
    route = v4_result.get("route") or v4_result.get("imp05_route_hint")
    if route != V4_ROUTE_AI_ADAPTATION:
        raise ValueError(
            f"build_ai_fallback_prompt: v4_result.route={route!r} is not "
            f"{V4_ROUTE_AI_ADAPTATION!r}; fallback prompt MUST NOT be built "
            "outside the AI adaptation route."
        )
    user_payload = {
        "v4": {
            "route": route,
            "cardinality": v4_result.get("cardinality")
            or v4_result.get("cardinality_signature"),
            "label": v4_result.get("label"),
            "frame_id": v4_result.get("frame_id"),
            "rank": v4_result.get("rank"),
        },
        "frame_contract": frame_contract,
        "frame_visual_html": frame_visual_html,
        "figma_partial_json": figma_partial_json,
        "internal_region": internal_region,
        "mdx_text_READ_ONLY": mdx_text,
    }
    return {
        "system": SYSTEM_PROMPT,
        "user": json.dumps(user_payload, ensure_ascii=False),
    }
89
src/phase_z2_ai_fallback/router.py
Normal file
@@ -0,0 +1,89 @@
"""IMP-33 u7 — AI fallback router (fallback path only).

Composes the IMP-33 fallback flow:

    1. flag gate (``settings.ai_fallback_enabled`` default OFF)
    2. V4 route gate (route must equal ``ai_adaptation_required``)
    3. cache read (u6 stub returns ``None`` until IMP-46 lands)
    4. build prompt (u3)
    5. call client (u4 ``request_proposal``)
    6. validate (u5 ``validate_proposal``)

Returns the validated ``AiFallbackProposal``. Save to cache is NOT
performed here — it is caller-driven AFTER ``visual_check_passed=True``
AND ``user_approved=True``, per the u6 IMP-46 gate. The router does not
import ``save_proposal``; this is the structural guarantee that the
router cannot persist a proposal before the caller's visual + user
checks (`feedback_artifact_status_naming`).

Guardrails:

* PZ-1 — normal-path AI call count stays 0: flag-off OR route-mismatch
  short-circuits BEFORE the prompt builder or client are touched.
* ``feedback_ai_isolation_contract`` — MDX READ-ONLY (u3 enforces in
  prompt; this module never reads or writes MDX).
* ``feedback_phase_z_spacing_direction`` — V4 rank-1 protected (u5
  enforces; router only forwards the contract).
"""
from __future__ import annotations

from typing import Any

from src.config import settings
from src.phase_z2_ai_fallback.cache import read_proposal
from src.phase_z2_ai_fallback.client import AiFallbackClient
from src.phase_z2_ai_fallback.prompts import (
    V4_ROUTE_AI_ADAPTATION,
    build_ai_fallback_prompt,
)
from src.phase_z2_ai_fallback.schema import AiFallbackProposal
from src.phase_z2_ai_fallback.validate import validate_proposal


def route_ai_fallback(
    *,
    cache_key: str,
    v4_result: dict[str, Any],
    frame_contract: dict[str, Any],
    frame_visual_html: str,
    figma_partial_json: dict[str, Any],
    internal_region: dict[str, Any],
    mdx_text: str,
    client: AiFallbackClient | None = None,
) -> AiFallbackProposal | None:
    """Route a fallback request through cache → prompt → client → validate.

    Returns ``None`` when the master flag is OFF or when the V4 route is
    not ``ai_adaptation_required`` — both gates short-circuit BEFORE any
    prompt/client work, so the normal-path AI call count stays at 0
    (PZ-1).
    """
    if not settings.ai_fallback_enabled:
        return None
    route = v4_result.get("route") or v4_result.get("imp05_route_hint")
    if route != V4_ROUTE_AI_ADAPTATION:
        return None
    cached = read_proposal(cache_key)
    if cached is not None:
        validate_proposal(
            cached,
            frame_contract=frame_contract,
            internal_region=internal_region,
        )
        return cached
    prompt = build_ai_fallback_prompt(
        v4_result=v4_result,
        frame_contract=frame_contract,
        frame_visual_html=frame_visual_html,
        figma_partial_json=figma_partial_json,
        internal_region=internal_region,
        mdx_text=mdx_text,
    )
    active_client = client if client is not None else AiFallbackClient()
    proposal = active_client.request_proposal(prompt)
    validate_proposal(
        proposal,
        frame_contract=frame_contract,
        internal_region=internal_region,
    )
    return proposal
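The short-circuit ordering of the router is the part that protects the "normal-path AI calls = 0" invariant, so it is worth seeing with the collaborators stubbed out. The sketch below is a hypothetical reduction: `route`, `fake_ai`, and the `calls` list are illustrative names, and the prompt, client, and validation steps are collapsed into a single injected callable.

```python
from typing import Any, Callable, Optional

AI_ROUTE = "ai_adaptation_required"


def route(
    *,
    enabled: bool,
    v4_result: dict[str, Any],
    read_cache: Callable[[], Optional[dict]],
    call_ai: Callable[[], dict],
) -> Optional[dict]:
    """Hypothetical reduction of route_ai_fallback to its gate ordering."""
    if not enabled:  # 1. flag gate
        return None
    hint = v4_result.get("route") or v4_result.get("imp05_route_hint")
    if hint != AI_ROUTE:  # 2. V4 route gate
        return None
    cached = read_cache()  # 3. cache read
    if cached is not None:
        return cached
    return call_ai()  # 4-6. prompt, client, validate (collapsed)


calls: list[str] = []  # records every simulated AI call


def fake_ai() -> dict:
    calls.append("ai")
    return {"proposal_kind": "partial_overrides"}


flag_off = route(
    enabled=False, v4_result={"route": AI_ROUTE},
    read_cache=lambda: None, call_ai=fake_ai,
)
wrong_route = route(
    enabled=True, v4_result={"route": "some_other_route"},
    read_cache=lambda: None, call_ai=fake_ai,
)
calls_before_live = list(calls)
live = route(
    enabled=True, v4_result={"imp05_route_hint": AI_ROUTE},
    read_cache=lambda: None, call_ai=fake_ai,
)
```

Injecting the cache reader and the AI call as callables is a choice made for this sketch only; the real router imports them directly and accepts just an optional `client`.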
50
src/phase_z2_ai_fallback/schema.py
Normal file
@@ -0,0 +1,50 @@
"""IMP-33 u2 — AI fallback proposal schema.

Whitelisted proposal kinds (Stage 2 plan):
  - builder_options_patch : zone/frame builder option overrides
  - partial_overrides : Internal Region / Frame Slot content overrides
  - slot_mapping_proposal : restructuring proposal (content unit mapping)

Forbidden output forms (rejected by validator):
  - mdx_text (MDX read-only — `feedback_ai_isolation_contract`)
  - frame_id_change (V4 rank-1 protected — `feedback_phase_z_spacing_direction`)
  - raw_html (HTML structure is code-decided, not AI-generated)
  - raw_css (same)
"""
from __future__ import annotations

from enum import Enum
from typing import Any

from pydantic import BaseModel, ConfigDict, Field, field_validator


class ProposalKind(str, Enum):
    BUILDER_OPTIONS_PATCH = "builder_options_patch"
    PARTIAL_OVERRIDES = "partial_overrides"
    SLOT_MAPPING_PROPOSAL = "slot_mapping_proposal"


FORBIDDEN_KINDS: frozenset[str] = frozenset(
    {"mdx_text", "frame_id_change", "raw_html", "raw_css"}
)


class AiFallbackProposal(BaseModel):
    """Single AI fallback proposal (output contract for u4 client)."""

    model_config = ConfigDict(extra="forbid")

    proposal_kind: ProposalKind
    payload: dict[str, Any] = Field(default_factory=dict)
    rationale: str = ""

    @field_validator("proposal_kind", mode="before")
    @classmethod
    def _reject_forbidden_kind(cls, value: Any) -> Any:
        if isinstance(value, str) and value in FORBIDDEN_KINDS:
            raise ValueError(
                f"proposal_kind={value!r} is forbidden (MDX/frame/raw HTML/CSS "
                "mutations are not permitted under IMP-33)."
            )
        return value
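The schema rejects a bad `proposal_kind` at two layers: the explicit forbidden list, then enum membership. The sketch below reproduces that two-step decision with the standard library only, so it runs without pydantic. The function `classify_kind` and its three return labels are hypothetical; the real model raises a validation error instead of returning a label.

```python
from enum import Enum


class ProposalKind(str, Enum):
    BUILDER_OPTIONS_PATCH = "builder_options_patch"
    PARTIAL_OVERRIDES = "partial_overrides"
    SLOT_MAPPING_PROPOSAL = "slot_mapping_proposal"


FORBIDDEN_KINDS = frozenset({"mdx_text", "frame_id_change", "raw_html", "raw_css"})


def classify_kind(value: str) -> str:
    """Return 'allowed', 'forbidden', or 'unknown' for a raw proposal_kind."""
    if value in FORBIDDEN_KINDS:
        # Corresponds to the mode="before" validator in schema.py.
        return "forbidden"
    try:
        ProposalKind(value)
    except ValueError:
        # Corresponds to the enum-typed field rejecting a non-member.
        return "unknown"
    return "allowed"


for kind in ["partial_overrides", "raw_html", "layout_swap"]:
    print(kind, classify_kind(kind))
```

Because the enum already excludes anything outside the three whitelisted kinds, the forbidden list appears to exist mainly to give those four cases a more specific error message; that reading is an inference from the code, not something the diff states.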
141
src/phase_z2_ai_fallback/step12.py
Normal file
@@ -0,0 +1,141 @@
"""IMP-33 u8 — Step 12 AI repair wiring (IMP-30 provisional units only).

Phase Z Step 12 = slot_payload (the runtime "light_edit / restructure" surface
where AI-assisted frame-aware adaptation is allowed per IMP-17 carve-out).
This module is the only call site that pipes Phase Z composition units into
``src.phase_z2_ai_fallback.router.route_ai_fallback``. Two structural gates
preserve the AI isolation contract:

* IMP-30 provisional gate — units with ``provisional=False`` are skipped
  before any route classification. AI repair is reserved for first-render
  invariant survivors (no rank-1 V4 evidence, recovered as provisional).
* Reject gate — units whose V4 label maps to ``design_reference_only``
  (``reject``) are skipped with ``skip_reason="design_reference_only_no_ai"``.
  Reject path is design reference only — never an AI call.

Combined with the u7 router's flag-off + route-gate short-circuits, the
default Phase Z run path performs zero AI calls (PZ-1). Save to cache is
NOT performed here — that is the caller's responsibility AFTER
``visual_check_passed=True`` AND ``user_approved=True`` (u6 IMP-46 gate).
"""
from __future__ import annotations

from typing import Any, Callable, Iterable

from src.phase_z2_ai_fallback.router import route_ai_fallback


_AI_ADAPTATION_ROUTE = "ai_adaptation_required"
_DESIGN_REFERENCE_ROUTE = "design_reference_only"


def gather_step12_ai_repair_proposals(
    units: Iterable[Any],
    *,
    route_for_label: Callable[[str | None], str | None],
    get_contract_fn: Callable[[str], dict | None],
    frame_visual_loader: Callable[[str], str],
    figma_partial_loader: Callable[[str], dict] | None = None,
    internal_region_lookup: Callable[[Any], dict] | None = None,
    mdx_text_loader: Callable[[Any], str] | None = None,
) -> list[dict]:
    """Return one record per unit describing the Step 12 AI repair decision.

    The record schema is stable across all gate decisions so the Step 12
    artifact consumer can rely on a single shape:

        {
            "unit_index": int,
            "source_section_ids": list[str],
            "frame_template_id": str,
            "label": str | None,
            "route_hint": str | None,
            "provisional": bool,
            "ai_called": bool,
            "skip_reason": str | None,
            "proposal": dict | None,
            "error": str | None,
        }

    ``ai_called`` is True only when ``route_ai_fallback`` was invoked AND
    returned a proposal OR raised. Flag-off / route-mismatch returns
    ``None`` from the router and is surfaced as ``ai_called=False`` with
    ``skip_reason="router_short_circuit"`` so the caller can distinguish
    "router decided not to run" from "router ran and returned a proposal".
    """
    records: list[dict] = []
    for index, unit in enumerate(units):
        label = getattr(unit, "label", None)
        route_hint = route_for_label(label)
        record: dict = {
            "unit_index": index,
            "source_section_ids": list(getattr(unit, "source_section_ids", []) or []),
            "frame_template_id": getattr(unit, "frame_template_id", None),
            "label": label,
            "route_hint": route_hint,
            "provisional": bool(getattr(unit, "provisional", False)),
            "ai_called": False,
            "skip_reason": None,
            "proposal": None,
            "error": None,
        }
        if not record["provisional"]:
            record["skip_reason"] = "not_provisional"
            records.append(record)
            continue
        if route_hint == _DESIGN_REFERENCE_ROUTE:
            record["skip_reason"] = "design_reference_only_no_ai"
            records.append(record)
            continue
        if route_hint != _AI_ADAPTATION_ROUTE:
            record["skip_reason"] = f"route_not_ai_adaptation:{route_hint}"
            records.append(record)
            continue

        template_id = record["frame_template_id"] or ""
        frame_contract = get_contract_fn(template_id) or {}
        frame_visual_html = frame_visual_loader(template_id)
        figma_partial_json = (
            figma_partial_loader(template_id) if figma_partial_loader is not None else {}
        )
        internal_region = (
            internal_region_lookup(unit) if internal_region_lookup is not None else {}
        )
        mdx_text = (
            mdx_text_loader(unit)
            if mdx_text_loader is not None
            else (getattr(unit, "raw_content", "") or "")
        )
        cache_key = "::".join(
            [template_id, ",".join(sorted(record["source_section_ids"]))]
        )
        v4_result = {
            "route": route_hint,
            "label": label,
            "frame_id": getattr(unit, "frame_id", None),
            "rank": getattr(unit, "v4_rank", None),
            "cardinality": None,
        }
        try:
            proposal = route_ai_fallback(
                cache_key=cache_key,
                v4_result=v4_result,
                frame_contract=frame_contract,
                frame_visual_html=frame_visual_html,
                figma_partial_json=figma_partial_json,
                internal_region=internal_region,
                mdx_text=mdx_text,
            )
        except Exception as exc:  # noqa: BLE001 — record + continue, no AI re-raise
            record["ai_called"] = True
            record["error"] = f"{type(exc).__name__}: {exc}"
            records.append(record)
            continue
        if proposal is None:
            record["skip_reason"] = "router_short_circuit"
            records.append(record)
            continue
        record["ai_called"] = True
        record["proposal"] = proposal.model_dump()
        records.append(record)
    return records
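The three pre-router gates in `gather_step12_ai_repair_proposals` are evaluated in a fixed order, and the order determines which `skip_reason` a unit receives. The sketch below isolates that decision. The function `skip_reason`, the `ROUTES` mapping, and the label strings (`reject`, `restructure`, `use_as_is`) are assumptions for illustration; the real label-to-route mapping is injected by the caller and is not shown in this diff.

```python
from types import SimpleNamespace
from typing import Callable, Optional

AI_ROUTE = "ai_adaptation_required"
DESIGN_REFERENCE_ROUTE = "design_reference_only"


def skip_reason(
    unit: object, route_for_label: Callable[[Optional[str]], Optional[str]]
) -> Optional[str]:
    """Mirror the three pre-router gates; None means the router would be called."""
    # Gate 1: IMP-30 provisional gate comes before any route inspection.
    if not getattr(unit, "provisional", False):
        return "not_provisional"
    route_hint = route_for_label(getattr(unit, "label", None))
    # Gate 2: reject path is design reference only.
    if route_hint == DESIGN_REFERENCE_ROUTE:
        return "design_reference_only_no_ai"
    # Gate 3: catch-all for every other non-AI route.
    if route_hint != AI_ROUTE:
        return f"route_not_ai_adaptation:{route_hint}"
    return None


# Hypothetical label-to-route mapping used only by this sketch.
ROUTES = {"reject": DESIGN_REFERENCE_ROUTE, "restructure": AI_ROUTE}

units = [
    SimpleNamespace(label="restructure", provisional=False),
    SimpleNamespace(label="reject", provisional=True),
    SimpleNamespace(label="use_as_is", provisional=True),
    SimpleNamespace(label="restructure", provisional=True),
]
reasons = [skip_reason(unit, ROUTES.get) for unit in units]
print(reasons)
```

Only the last unit reaches the router, and even then the router's own flag gate still applies, so with `ai_fallback_enabled=False` that unit would be recorded as `router_short_circuit`.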
111
src/phase_z2_ai_fallback/step17.py
Normal file
@@ -0,0 +1,111 @@
"""IMP-33 u9 — Step 17 AI repair wiring (BLOCKED until IMP-34 + IMP-35 land).

Phase Z Step 17 = retry / salvage cascade (see ``src.phase_z2_pipeline``
section 11.7 ``_attempt_salvage_chain`` and the existing IMP-12 u8/u9
deterministic chain at ``src/phase_z2_pipeline.py:1994`` and
``src/phase_z2_pipeline.py:4948``).

Per IMP-17 carve-out (``docs/architecture/IMP-17-CARVE-OUT.md`` lines 16,
40-44), AI repair at Step 17 is permitted ONLY after the full deterministic
chain is exhausted AND popup escalation is exhausted AND a user-approved
fallback budget remains. IMP-34 (zone resize + compact retry) and IMP-35
(``details_popup_escalation``) are explicit prerequisites under the IMP-33
out-of-scope contract — neither has landed yet. Therefore Step 17 AI repair
is STRUCTURALLY BLOCKED at u9.

This module:

1. **SPECIFIES** the canonical overflow cascade order via
   :data:`OVERFLOW_CASCADE_ORDER` — ``deterministic`` → ``popup`` →
   ``ai_repair`` → ``user_override``. Downstream Step 17 consumers can rely
   on this single source of truth.
2. **KEEPS** Step 17 AI repair structurally blocked. The entry point
   :func:`gather_step17_ai_repair_proposals` does NOT import
   ``route_ai_fallback`` (u7), does NOT instantiate ``AiFallbackClient`` (u4),
   and does NOT call any Anthropic API. Every unit is recorded with
   ``skip_reason="step17_ai_blocked_imp_34_35_prerequisites_missing"`` so
   the caller can distinguish "blocked by carve-out gate" from any other
   skip path (e.g., u8 ``not_provisional`` / ``design_reference_only_no_ai``).

Once IMP-34 + IMP-35 land AND a user-approved fallback budget is granted,
this module will gain the actual ``route_ai_fallback`` wiring guarded by
the cascade-stage conjunction. Today the gate is closed.
"""
from __future__ import annotations

from enum import Enum
from typing import Any, Callable, Iterable


class OverflowCascadeStage(str, Enum):
    """Step 17 overflow cascade stages — canonical order (u9 single source of truth).

    Members are ordered to match the AI isolation contract:

    * ``DETERMINISTIC`` — IMP-12 u4/u5/u6 (``cross_zone_redistribute`` /
      ``glue_compression`` / ``font_step_compression``) + IMP-12 terminal
      actions (``layout_adjust`` / ``frame_reselect``) + IMP-34
      (``zone resize + compact retry``, pending). No AI in any sub-stage.
    * ``POPUP`` — IMP-35 (``details_popup_escalation``, pending). Content
      popup escalation as the final deterministic resort before any AI.
    * ``AI_REPAIR`` — IMP-33 (this carve-out) + IMP-46 cache. Only reachable
      after DETERMINISTIC and POPUP are both exhausted AND user-approved
      fallback budget remains.
    * ``USER_OVERRIDE`` — explicit user override after all auto stages.
    """

    DETERMINISTIC = "deterministic"
    POPUP = "popup"
    AI_REPAIR = "ai_repair"
    USER_OVERRIDE = "user_override"


OVERFLOW_CASCADE_ORDER: tuple[OverflowCascadeStage, ...] = (
    OverflowCascadeStage.DETERMINISTIC,
    OverflowCascadeStage.POPUP,
    OverflowCascadeStage.AI_REPAIR,
    OverflowCascadeStage.USER_OVERRIDE,
)


STEP17_AI_REPAIR_BLOCKED_REASON = (
    "step17_ai_blocked_imp_34_35_prerequisites_missing"
)


def gather_step17_ai_repair_proposals(
    units: Iterable[Any],
    *,
    route_for_label: Callable[[str | None], str | None],
) -> list[dict]:
    """Return one BLOCKED record per unit. No AI call is performed at u9.

    The record schema mirrors :func:`src.phase_z2_ai_fallback.step12
    .gather_step12_ai_repair_proposals` so the Step 17 artifact consumer can
    reuse the same shape, with one addition: ``cascade_stage`` pins the
    stage this record belongs to (always ``ai_repair`` here).

    Per Stage 2 contract (IMP-33 u9): Step 17 AI repair is blocked behind
    IMP-34 + IMP-35. Every unit returns with
    ``skip_reason=STEP17_AI_REPAIR_BLOCKED_REASON`` and ``ai_called=False``.
    """
    records: list[dict] = []
    for index, unit in enumerate(units):
        label = getattr(unit, "label", None)
        record: dict = {
            "unit_index": index,
            "source_section_ids": list(
                getattr(unit, "source_section_ids", []) or []
            ),
            "frame_template_id": getattr(unit, "frame_template_id", None),
            "label": label,
            "route_hint": route_for_label(label),
            "provisional": bool(getattr(unit, "provisional", False)),
            "cascade_stage": OverflowCascadeStage.AI_REPAIR.value,
            "ai_called": False,
            "skip_reason": STEP17_AI_REPAIR_BLOCKED_REASON,
            "proposal": None,
            "error": None,
        }
        records.append(record)
    return records
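The step17.py docstring describes a "cascade-stage conjunction" that will eventually guard the AI repair stage but is not implemented at u9. The sketch below shows one way such a gate could be expressed. `Stage`, `CASCADE_ORDER`, and `ai_repair_reachable` are hypothetical names introduced here for illustration (the enum is a local copy so the block runs on its own); none of this is part of the commit.

```python
from enum import Enum


class Stage(str, Enum):
    """Local, illustrative copy of the four cascade stages."""

    DETERMINISTIC = "deterministic"
    POPUP = "popup"
    AI_REPAIR = "ai_repair"
    USER_OVERRIDE = "user_override"


CASCADE_ORDER = (
    Stage.DETERMINISTIC,
    Stage.POPUP,
    Stage.AI_REPAIR,
    Stage.USER_OVERRIDE,
)


def ai_repair_reachable(exhausted: set, budget_approved: bool) -> bool:
    """Hypothetical gate: AI repair is reachable only when every stage that
    precedes AI_REPAIR in the cascade is exhausted AND the user-approved
    fallback budget is available."""
    earlier = CASCADE_ORDER[: CASCADE_ORDER.index(Stage.AI_REPAIR)]
    return budget_approved and all(stage in exhausted for stage in earlier)
```

Deriving the "earlier" stages from the ordered tuple, rather than naming them in the gate, keeps the tuple as the single source of truth if a stage is ever inserted ahead of `AI_REPAIR`.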
83
src/phase_z2_ai_fallback/validate.py
Normal file
@@ -0,0 +1,83 @@
"""IMP-33 u5 — AI fallback proposal validator (fallback path only).

Defence-in-depth layer between the u4 client output (already u2-schema-valid)
and the caller. Adds the four Stage 2 guards that u2 cannot express purely at
the schema level:

1. builder-options whitelist (BUILDER_OPTIONS_PATCH may only touch keys
   already declared in ``frame_contract.payload.builder_options``).
2. dropped-slot guard (PARTIAL_OVERRIDES / SLOT_MAPPING_PROPOSAL must keep
   every declared ``sub_zones[*].id`` populated — text/table/image/details
   slots cannot disappear; `feedback_ai_isolation_contract`).
3. frame-swap guard (no ``frame_id`` mutation inside payload — V4 rank-1
   protected; `feedback_phase_z_spacing_direction`).
4. Internal Region containment (``payload.region_id`` must match the
   declared Internal Region id when present).
"""
from __future__ import annotations

from typing import Any

from src.phase_z2_ai_fallback.schema import AiFallbackProposal, ProposalKind


class AiFallbackValidationError(ValueError):
    """Raised when a proposal violates an IMP-33 u5 guard."""


_SLOT_KINDS = (ProposalKind.PARTIAL_OVERRIDES, ProposalKind.SLOT_MAPPING_PROPOSAL)


def validate_proposal(
    proposal: AiFallbackProposal,
    *,
    frame_contract: dict[str, Any],
    internal_region: dict[str, Any] | None = None,
) -> None:
    """Validate an AI fallback proposal against the active frame contract.

    Raises ``AiFallbackValidationError`` on any guard violation. Returns
    ``None`` on success — caller is responsible for downstream application.
    """
    AiFallbackProposal.model_validate(proposal.model_dump())

    payload = proposal.payload
    frame_id = frame_contract.get("frame_id")
    if "frame_id" in payload and payload["frame_id"] != frame_id:
        raise AiFallbackValidationError(
            f"frame-swap guard: payload.frame_id={payload['frame_id']!r} "
            f"differs from contract frame_id={frame_id!r}; V4 rank-1 is locked."
        )

    if proposal.proposal_kind is ProposalKind.BUILDER_OPTIONS_PATCH:
        declared = (frame_contract.get("payload") or {}).get("builder_options") or {}
        unknown = set(payload.keys()) - set(declared.keys())
        if unknown:
            raise AiFallbackValidationError(
                f"builder whitelist: keys {sorted(unknown)} not in "
                f"frame_contract.payload.builder_options {sorted(declared)}."
            )

    if proposal.proposal_kind in _SLOT_KINDS:
        declared_slot_ids = [z.get("id") for z in (frame_contract.get("sub_zones") or [])]
        slots = payload.get("slots")
        if not isinstance(slots, dict):
            raise AiFallbackValidationError(
                "dropped-slot guard: PARTIAL_OVERRIDES / SLOT_MAPPING_PROPOSAL "
                "payload MUST include a 'slots' mapping."
            )
        missing = [sid for sid in declared_slot_ids if sid not in slots]
        if missing:
            raise AiFallbackValidationError(
                f"dropped-slot guard: declared slots {missing} are absent "
                "from payload.slots (text/table/image/details must remain populated)."
            )

    region_id = payload.get("region_id")
    if region_id is not None and internal_region is not None:
        declared_region_id = internal_region.get("id")
        if region_id != declared_region_id:
            raise AiFallbackValidationError(
                f"Internal Region containment: payload.region_id={region_id!r} "
                f"differs from internal_region.id={declared_region_id!r}."
            )
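To make the builder-options whitelist and the dropped-slot guard in validate.py easier to follow, here is a minimal standalone sketch of the same two checks on plain dicts, with no pydantic model and no project imports. The helper names `unknown_builder_keys` and `missing_slot_ids`, the `pillar_2` sub-zone, and the `css` key are assumptions made up for the example.

```python
def unknown_builder_keys(payload: dict, frame_contract: dict) -> list:
    """Keys in a builder-options patch that the contract never declared."""
    declared = (frame_contract.get("payload") or {}).get("builder_options") or {}
    return sorted(set(payload) - set(declared))


def missing_slot_ids(payload: dict, frame_contract: dict) -> list:
    """Declared sub-zone ids that are absent from payload['slots']."""
    slots = payload.get("slots")
    if not isinstance(slots, dict):
        raise ValueError("payload must include a 'slots' mapping")
    declared_ids = [zone.get("id") for zone in (frame_contract.get("sub_zones") or [])]
    return [slot_id for slot_id in declared_ids if slot_id not in slots]


# Illustrative contract; "pillar_2" is a hypothetical second sub-zone.
contract = {
    "frame_id": 1171281190,
    "sub_zones": [{"id": "pillar_1"}, {"id": "pillar_2"}],
    "payload": {"builder_options": {"item_parser": "pillar_item"}},
}

# A patch touching an undeclared key would trip the whitelist.
extra_keys = unknown_builder_keys({"item_parser": "bullet_v2", "css": "x"}, contract)

# A slot payload that omits a declared sub-zone would trip the dropped-slot guard.
dropped = missing_slot_ids({"slots": {"pillar_1": "a"}}, contract)
```

Note that, like the guard in validate.py, this only checks that each declared slot id is present as a key; it does not inspect whether the slot value is non-empty.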
0
tests/phase_z2_ai_fallback/__init__.py
Normal file
153
tests/phase_z2_ai_fallback/test_ast_isolation.py
Normal file
@@ -0,0 +1,153 @@
"""IMP-33 u10 — AST isolation guard for the AI fallback package.

Structural defence: parse every ``*.py`` file under
``src/phase_z2_ai_fallback/`` and assert that none of them imports a
Phase Q runtime module, the Kei API client, or any ``phase_z2_*`` runtime
module (e.g. ``phase_z2_pipeline``). Even if a future patch wires such a
module by accident, this AST scan catches it before runtime and protects
the PZ-1 invariant (normal-path AI call count = 0).

Allowed imports inside the fallback package:

* Standard library modules.
* ``anthropic`` (u4 client) and ``pydantic`` (u2 schema).
* ``src.config`` (u1 settings — single source of truth for policy knobs).
* Other modules inside ``src.phase_z2_ai_fallback`` (intra-package).
"""
from __future__ import annotations

import ast
import pathlib

import pytest

PACKAGE_ROOT = pathlib.Path(__file__).resolve().parents[2] / "src" / "phase_z2_ai_fallback"

_ALLOWED_SRC_PREFIXES: tuple[str, ...] = (
    "src.config",
    "src.phase_z2_ai_fallback",
)

_ALLOWED_TOP_LEVEL: frozenset[str] = frozenset(
    {
        "anthropic",
        "pydantic",
        "__future__",
        "ast",
        "dataclasses",
        "enum",
        "json",
        "pathlib",
        "random",
        "time",
        "typing",
    }
)

_FORBIDDEN_PHASE_Q_MODULES: frozenset[str] = frozenset(
    {
        "src.pipeline",
        "src.pipeline_v2",
        "src.block_assembler",
        "src.block_assembler_b2",
        "src.block_matcher_tfidf",
        "src.block_reference",
        "src.block_search",
        "src.block_selector",
        "src.content_editor",
        "src.design_director",
        "src.html_generator",
        "src.html_validator",
        "src.renderer",
        "src.mdx_normalizer",
        "src.fit_verifier",
        "src.slide_measurer",
        "src.space_allocator",
        "src.kei_client",
    }
)


def _module_files() -> list[pathlib.Path]:
    return sorted(p for p in PACKAGE_ROOT.glob("*.py") if p.name != "__pycache__")


def _imported_names(tree: ast.AST) -> list[str]:
    names: list[str] = []
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for alias in node.names:
                names.append(alias.name)
        elif isinstance(node, ast.ImportFrom):
            if node.module is not None:
                names.append(node.module)
    return names


def _parse(path: pathlib.Path) -> ast.AST:
    return ast.parse(path.read_text(encoding="utf-8"), filename=str(path))


def _is_allowed(name: str) -> bool:
    for prefix in _ALLOWED_SRC_PREFIXES:
        if name == prefix or name.startswith(prefix + "."):
            return True
    top = name.split(".", 1)[0]
    return top in _ALLOWED_TOP_LEVEL


def test_fallback_package_root_exists() -> None:
    assert PACKAGE_ROOT.is_dir(), (
        f"fallback package root not found at {PACKAGE_ROOT!s}; module path "
        "is locked by IMP-31-GATE-AUDIT (src/phase_z2_ai_fallback/)."
    )
    files = _module_files()
    assert files, f"no .py modules found under {PACKAGE_ROOT!s}"


def test_fallback_package_imports_are_whitelisted() -> None:
    violations: list[tuple[str, str]] = []
    for path in _module_files():
        for name in _imported_names(_parse(path)):
            if not _is_allowed(name):
                violations.append((path.name, name))
    assert not violations, (
        "fallback package imports outside the IMP-33 whitelist "
        f"(Phase Q / Kei / phase_z2_* runtime forbidden): {violations}"
    )


@pytest.mark.parametrize("forbidden_module", sorted(_FORBIDDEN_PHASE_Q_MODULES))
def test_fallback_package_forbids_phase_q_and_kei_imports(forbidden_module: str) -> None:
    for path in _module_files():
        for name in _imported_names(_parse(path)):
            top2 = ".".join(name.split(".")[:2])
            assert top2 != forbidden_module and name != forbidden_module, (
                f"{path.name} imports forbidden module {name!r}; "
                f"{forbidden_module!r} is a Phase Q / Kei runtime module and "
                "must not be reachable from the AI fallback package."
            )


def test_fallback_package_forbids_phase_z2_pipeline_imports() -> None:
    for path in _module_files():
        for name in _imported_names(_parse(path)):
            assert not name.startswith("src.phase_z2_pipeline"), (
                f"{path.name} imports {name!r}; the Phase Z2 pipeline runtime "
                "module must not be reachable from the AI fallback package "
                "(PZ-1: normal-path AI=0)."
            )


def test_fallback_package_forbids_other_phase_z2_runtime_imports() -> None:
    violations: list[tuple[str, str]] = []
    for path in _module_files():
        for name in _imported_names(_parse(path)):
            if name.startswith("src.phase_z2_") and not name.startswith(
                "src.phase_z2_ai_fallback"
            ):
                violations.append((path.name, name))
    assert not violations, (
        "fallback package imports another phase_z2_* runtime module; "
        f"violations: {violations}"
    )
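The isolation guard works because `ast.parse` reads import statements without executing them. The sketch below runs the same kind of scan over an in-memory source string instead of the package directory, so it can be tried without the repository. `SAMPLE`, the trimmed allow-lists, and the imported name `run_pipeline` are hypothetical stand-ins chosen for the example.

```python
import ast


def imported_modules(source: str) -> list:
    """Collect module names from `import x` and `from x import y` statements."""
    names = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names.extend(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module is not None:
            names.append(node.module)
    return names


# Illustrative source text; it is parsed, never imported or executed.
SAMPLE = """
import json
from src.config import settings
from src.phase_z2_pipeline import run_pipeline
"""

# Trimmed allow-lists, for the example only.
ALLOWED_PREFIXES = ("src.config", "src.phase_z2_ai_fallback")
ALLOWED_TOP_LEVEL = {"json", "typing"}


def is_allowed(name: str) -> bool:
    """Allow exact or dotted-prefix matches, then fall back to the top-level name."""
    if any(name == p or name.startswith(p + ".") for p in ALLOWED_PREFIXES):
        return True
    return name.split(".", 1)[0] in ALLOWED_TOP_LEVEL


violations = [name for name in imported_modules(SAMPLE) if not is_allowed(name)]
```

Matching on `prefix + "."` rather than a bare `startswith(prefix)` matters: it keeps a sibling such as `src.config_legacy` from slipping through on the strength of `src.config`.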
90
tests/phase_z2_ai_fallback/test_cache.py
Normal file
@@ -0,0 +1,90 @@
"""IMP-33 u6 — AI fallback cache gate tests.

Verifies the IMP-46 gate contract:
* ``read_proposal`` is a stub (returns None until IMP-46).
* ``save_proposal`` enforces both gates before any write attempt.
* Storage itself raises NotImplementedError (IMP-46 marker).
"""
from __future__ import annotations

import pytest

from src.phase_z2_ai_fallback.cache import (
    AiFallbackCacheGateError,
    read_proposal,
    save_proposal,
)
from src.phase_z2_ai_fallback.schema import AiFallbackProposal, ProposalKind


def _proposal() -> AiFallbackProposal:
    return AiFallbackProposal(
        proposal_kind=ProposalKind.BUILDER_OPTIONS_PATCH,
        payload={"item_parser": "bullet_v2"},
        rationale="u6-test",
    )


def test_read_proposal_returns_none_for_any_key():
    assert read_proposal("frame=foo|cardinality=3") is None


def test_read_proposal_rejects_empty_key():
    with pytest.raises(ValueError):
        read_proposal("")


def test_save_rejects_when_visual_check_failed():
    with pytest.raises(AiFallbackCacheGateError) as exc:
        save_proposal(
            "k", _proposal(), visual_check_passed=False, user_approved=True
        )
    assert "visual_check_passed" in str(exc.value)


def test_save_rejects_when_user_not_approved():
    with pytest.raises(AiFallbackCacheGateError) as exc:
        save_proposal(
            "k", _proposal(), visual_check_passed=True, user_approved=False
        )
    assert "user_approved" in str(exc.value)


def test_save_rejects_when_both_gates_false():
    with pytest.raises(AiFallbackCacheGateError):
        save_proposal(
            "k", _proposal(), visual_check_passed=False, user_approved=False
        )


def test_save_raises_not_implemented_when_both_gates_pass():
    with pytest.raises(NotImplementedError) as exc:
        save_proposal(
            "k", _proposal(), visual_check_passed=True, user_approved=True
        )
    assert "IMP-46" in str(exc.value)


def test_save_rejects_empty_key():
    with pytest.raises(ValueError):
        save_proposal(
            "", _proposal(), visual_check_passed=True, user_approved=True
        )


def test_save_rejects_non_proposal_object():
    with pytest.raises(TypeError):
        save_proposal(
            "k",
            {"proposal_kind": "builder_options_patch"},  # type: ignore[arg-type]
            visual_check_passed=True,
            user_approved=True,
        )


def test_gate_error_is_not_notimplementederror():
    with pytest.raises(AiFallbackCacheGateError):
        save_proposal(
            "k", _proposal(), visual_check_passed=False, user_approved=True
        )
    assert not issubclass(AiFallbackCacheGateError, NotImplementedError)
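The tests above pin the observable behaviour of the cache gate, but `cache.py` itself is not shown in this part of the diff. The following is a rough sketch of a function that would satisfy the same contract: argument checks first, then the two-flag gate, then the IMP-46 placeholder. `GateError` and `save_proposal_sketch` are hypothetical names, the proposal is a plain dict rather than the pydantic model, and the ordering of the checks is an assumption inferred from the tests, not taken from the real module.

```python
class GateError(Exception):
    """Illustrative stand-in for AiFallbackCacheGateError."""


def save_proposal_sketch(
    key: str,
    proposal: dict,
    *,
    visual_check_passed: bool,
    user_approved: bool,
) -> None:
    """Refuse to persist unless BOTH gates hold; storage itself is deferred."""
    if not key:
        raise ValueError("cache key must be a non-empty string")
    # Name every gate that failed so the caller can tell which one blocked the save.
    failed = [
        name
        for name, ok in (
            ("visual_check_passed", visual_check_passed),
            ("user_approved", user_approved),
        )
        if not ok
    ]
    if failed:
        raise GateError("cache save blocked; gate(s) not satisfied: " + ", ".join(failed))
    # Both gates passed, but there is no storage backend yet.
    raise NotImplementedError("persistent storage is deferred to IMP-46")
```

Using a dedicated gate exception that does not derive from `NotImplementedError` lets a caller distinguish "you may not save this" from "saving is not built yet", which is the distinction the last test in the file checks.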
151
tests/phase_z2_ai_fallback/test_client_mock.py
Normal file
@@ -0,0 +1,151 @@
"""IMP-33 u4 — fallback client mock tests.

Scope (Stage 2 plan, u4):
- Success path returns a validated ``AiFallbackProposal`` (u2 schema).
- Transient errors (timeout / connection / 429 / 5xx) are retried.
- Retries exhausted → last transient error propagates + consec-fail bumps.
- Non-transient errors are NOT retried.
- Per-run budget exhaustion raises ``AiFallbackBudgetExceeded``.
- Circuit breaker opens after consecutive-failure threshold reached.
- Policy values are sourced from ``settings`` (no inline literals).
"""
from __future__ import annotations

import json
import time
from types import SimpleNamespace
from unittest.mock import MagicMock

import anthropic
import httpx
import pytest

from src.config import settings
from src.phase_z2_ai_fallback.client import (
    AiFallbackBudgetExceeded,
    AiFallbackCircuitOpen,
    AiFallbackClient,
)


class _NonTransient(Exception):
    """Stand-in for any anthropic error not in the transient whitelist."""


def _ok_response() -> SimpleNamespace:
    block = SimpleNamespace(
        text=json.dumps(
            {
                "proposal_kind": "builder_options_patch",
                "payload": {"k": 1},
                "rationale": "ok",
            }
        )
    )
    return SimpleNamespace(content=[block])


def _timeout_err() -> anthropic.APITimeoutError:
    return anthropic.APITimeoutError(request=httpx.Request("POST", "https://x"))


def _connection_err() -> anthropic.APIConnectionError:
    return anthropic.APIConnectionError(request=httpx.Request("POST", "https://x"))


@pytest.fixture(autouse=True)
def _no_real_sleep(monkeypatch: pytest.MonkeyPatch) -> None:
    monkeypatch.setattr(time, "sleep", lambda _s: None)


@pytest.fixture(autouse=True)
def _restore_settings():
    snapshot = settings.model_dump()
    yield
    for key, value in snapshot.items():
        setattr(settings, key, value)


def _client_with(side_effect=None, return_value=None) -> AiFallbackClient:
    fake = MagicMock()
    if side_effect is not None:
        fake.messages.create.side_effect = side_effect
    else:
        fake.messages.create.return_value = return_value or _ok_response()
    return AiFallbackClient(client=fake)


def test_success_returns_validated_proposal() -> None:
    out = _client_with().request_proposal({"system": "s", "user": "u"})
    assert out.proposal_kind.value == "builder_options_patch"
    assert out.payload == {"k": 1}


def test_call_uses_settings_model() -> None:
    fake = MagicMock()
    fake.messages.create.return_value = _ok_response()
    AiFallbackClient(client=fake).request_proposal({"system": "s", "user": "u"})
    kwargs = fake.messages.create.call_args.kwargs
    assert kwargs["model"] == settings.ai_fallback_model


def test_transient_retries_then_succeeds() -> None:
    fake = MagicMock()
    fake.messages.create.side_effect = [_timeout_err(), _connection_err(), _ok_response()]
    AiFallbackClient(client=fake).request_proposal({"system": "s", "user": "u"})
    assert fake.messages.create.call_count == 3


def test_retries_exhausted_raises_last_transient(monkeypatch: pytest.MonkeyPatch) -> None:
    monkeypatch.setattr(settings, "ai_fallback_max_retries", 1)
    fake = MagicMock()
    fake.messages.create.side_effect = [_timeout_err(), _timeout_err()]
    c = AiFallbackClient(client=fake)
    with pytest.raises(anthropic.APITimeoutError):
        c.request_proposal({"system": "s", "user": "u"})
    assert fake.messages.create.call_count == 2
    assert c._consecutive_failures == 1


def test_non_transient_not_retried() -> None:
    fake = MagicMock()
    fake.messages.create.side_effect = _NonTransient("boom")
    c = AiFallbackClient(client=fake)
    with pytest.raises(_NonTransient):
        c.request_proposal({"system": "s", "user": "u"})
    assert fake.messages.create.call_count == 1


def test_budget_exceeded(monkeypatch: pytest.MonkeyPatch) -> None:
    monkeypatch.setattr(settings, "ai_fallback_budget_per_run", 1)
    c = _client_with()
    c.request_proposal({"system": "s", "user": "u"})
    with pytest.raises(AiFallbackBudgetExceeded):
        c.request_proposal({"system": "s", "user": "u"})


def test_circuit_breaker_opens(monkeypatch: pytest.MonkeyPatch) -> None:
    monkeypatch.setattr(settings, "ai_fallback_circuit_breaker_threshold", 1)
    monkeypatch.setattr(settings, "ai_fallback_max_retries", 0)
    fake = MagicMock()
    fake.messages.create.side_effect = _timeout_err()
    c = AiFallbackClient(client=fake)
    with pytest.raises(anthropic.APITimeoutError):
        c.request_proposal({"system": "s", "user": "u"})
    with pytest.raises(AiFallbackCircuitOpen):
        c.request_proposal({"system": "s", "user": "u"})


def test_backoff_uses_settings(monkeypatch: pytest.MonkeyPatch) -> None:
    """Sleep delay must be derived from settings (no inline literals)."""
    monkeypatch.setattr(settings, "ai_fallback_max_retries", 1)
    monkeypatch.setattr(settings, "ai_fallback_backoff_base_s", 0.25)
    monkeypatch.setattr(settings, "ai_fallback_backoff_cap_s", 0.5)
    monkeypatch.setattr(settings, "ai_fallback_backoff_jitter", 0.0)
    sleeps: list[float] = []
    monkeypatch.setattr(time, "sleep", lambda s: sleeps.append(s))
    fake = MagicMock()
    fake.messages.create.side_effect = [_timeout_err(), _ok_response()]
    AiFallbackClient(client=fake).request_proposal({"system": "s", "user": "u"})
    # attempt 0 transient → sleep(min(cap, base * 2**0) + jitter==0) = 0.25
    assert sleeps == [0.25]
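The comment in `test_backoff_uses_settings` implies a capped exponential backoff with an additive jitter term. A small sketch of that delay formula follows, so the expected `0.25` can be checked by hand. The function name `backoff_delay` and its parameter names are hypothetical, and drawing the jitter uniformly from `[0, jitter]` is an assumption; the real client may apply jitter differently.

```python
import random


def backoff_delay(attempt: int, *, base_s: float, cap_s: float, jitter: float) -> float:
    """Capped exponential backoff: min(cap, base * 2**attempt) plus jitter.

    The jitter term is drawn uniformly from [0, jitter] (an assumption made
    for this sketch), so jitter=0.0 makes the delay fully deterministic.
    """
    return min(cap_s, base_s * 2 ** attempt) + random.uniform(0.0, jitter)


# With the values the test patches in (base 0.25 s, cap 0.5 s, no jitter):
first = backoff_delay(0, base_s=0.25, cap_s=0.5, jitter=0.0)   # base * 2**0, under the cap
second = backoff_delay(1, base_s=0.25, cap_s=0.5, jitter=0.0)  # base * 2**1, exactly at the cap
third = backoff_delay(2, base_s=0.25, cap_s=0.5, jitter=0.0)   # base * 2**2, clamped to the cap
```

Setting the jitter to zero in the test is what makes an exact equality assertion on the recorded sleep values possible.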
61
tests/phase_z2_ai_fallback/test_docs_sync.py
Normal file
@@ -0,0 +1,61 @@
"""IMP-33 u11 — docs sync verification.

Verifies that the binding architecture docs reference the IMP-33 runtime
module surface introduced by u1~u10. Scope is intentionally narrow per the
Stage 2 plan: module path, Step 12 entry, Step 17 entry, cascade order, and
the IMP-46 cache gate. Failure here means the docs and the code have
drifted — fix the docs (or the code) before merging.
"""
from __future__ import annotations

from pathlib import Path

import pytest

DOCS_ROOT = Path(__file__).resolve().parents[2] / "docs" / "architecture"
CARVE_OUT_DOC = DOCS_ROOT / "IMP-17-CARVE-OUT.md"
GATE_AUDIT_DOC = DOCS_ROOT / "IMP-31-GATE-AUDIT.md"


def _read(doc: Path) -> str:
    assert doc.is_file(), f"binding doc missing: {doc}"
    return doc.read_text(encoding="utf-8")


@pytest.mark.parametrize(
    "needle",
    [
        # Module path lock.
        "src/phase_z2_ai_fallback/",
        # Step 12 entry.
        "gather_step12_ai_repair_proposals",
        # Step 17 entry + blocked-reason sentinel.
        "gather_step17_ai_repair_proposals",
        "step17_ai_blocked_imp_34_35_prerequisites_missing",
        # Cascade order single source of truth.
        "OVERFLOW_CASCADE_ORDER",
        "(DETERMINISTIC, POPUP, AI_REPAIR, USER_OVERRIDE)",
        # IMP-46 cache gate.
        "visual_check_passed",
        "user_approved",
        "AiFallbackCacheGateError",
        # PZ-1 normal-path AI=0 invariant.
        "ai_fallback_enabled",
    ],
)
def test_carve_out_doc_references_runtime_surface(needle: str) -> None:
    assert needle in _read(CARVE_OUT_DOC), (
        f"IMP-17-CARVE-OUT.md missing binding reference: {needle!r}"
    )


def test_gate_audit_reflects_scaffolded_module() -> None:
    body = _read(GATE_AUDIT_DOC)
    assert "scaffolded under IMP-33" in body, (
        "IMP-31-GATE-AUDIT.md must record that the fallback module path is "
        "scaffolded (not 'not created this cycle')."
    )
    assert "ai_fallback_enabled" in body, (
        "IMP-31-GATE-AUDIT.md must record the flag default that keeps PZ-1 "
        "(normal-path AI=0) intact while the 3-condition gate is open."
    )
100
tests/phase_z2_ai_fallback/test_prompts.py
Normal file
@@ -0,0 +1,100 @@
"""IMP-33 u3 — fallback prompt builder tests.

Scope (Stage 2 plan, u3):
- Prompt is built only when V4 route == 'ai_adaptation_required'.
- System prompt declares MDX READ-ONLY and pins the u2 whitelist.
- System prompt forbids the u2 forbidden kinds + frame_id swap.
- User payload carries all 6 declared inputs and labels MDX READ_ONLY.
"""
from __future__ import annotations

import json

import pytest

from src.phase_z2_ai_fallback.prompts import (
    SYSTEM_PROMPT,
    V4_ROUTE_AI_ADAPTATION,
    build_ai_fallback_prompt,
)
from src.phase_z2_ai_fallback.schema import FORBIDDEN_KINDS, ProposalKind


def _v4(route: str = V4_ROUTE_AI_ADAPTATION) -> dict:
    return {
        "route": route,
        "cardinality": {"strict": 3},
        "label": "restructure",
        "frame_id": 1171281190,
        "rank": 1,
    }


def _inputs(route: str = V4_ROUTE_AI_ADAPTATION) -> dict:
    return {
        "v4_result": _v4(route),
        "frame_contract": {"template_id": "three_parallel_requirements"},
        "frame_visual_html": "<section class='f13b'/>",
        "figma_partial_json": {"nodes": []},
        "internal_region": {"id": "region_top", "bbox": [0, 0, 1200, 320]},
        "mdx_text": "# 대목차\n- 항목 1\n- 항목 2\n- 항목 3",
    }


def test_system_prompt_declares_mdx_read_only() -> None:
    assert "READ-ONLY" in SYSTEM_PROMPT


def test_system_prompt_lists_all_whitelisted_kinds() -> None:
    for kind in ProposalKind:
        assert kind.value in SYSTEM_PROMPT


def test_system_prompt_forbids_all_forbidden_kinds() -> None:
    for forbidden in FORBIDDEN_KINDS:
        assert forbidden in SYSTEM_PROMPT


def test_system_prompt_locks_frame_id_swap() -> None:
    assert "frame_id" in SYSTEM_PROMPT


def test_build_prompt_returns_system_and_user() -> None:
    prompt = build_ai_fallback_prompt(**_inputs())
    assert set(prompt.keys()) == {"system", "user"}
    assert prompt["system"] == SYSTEM_PROMPT


def test_user_payload_carries_all_inputs_and_marks_mdx_read_only() -> None:
    prompt = build_ai_fallback_prompt(**_inputs())
    payload = json.loads(prompt["user"])
    assert payload["v4"]["route"] == V4_ROUTE_AI_ADAPTATION
    assert payload["v4"]["cardinality"] == {"strict": 3}
    assert payload["v4"]["frame_id"] == 1171281190
    assert payload["frame_contract"]["template_id"] == "three_parallel_requirements"
    assert payload["frame_visual_html"] == "<section class='f13b'/>"
    assert payload["figma_partial_json"] == {"nodes": []}
    assert payload["internal_region"]["id"] == "region_top"
    assert "mdx_text_READ_ONLY" in payload
    assert payload["mdx_text_READ_ONLY"].startswith("# 대목차")
    assert "mdx_text" not in payload  # only the READ_ONLY key, not a writable alias


@pytest.mark.parametrize(
    "route", ["direct_render", "deterministic_minor_adjustment", "design_reference_only", None]
)
def test_non_ai_route_rejected(route) -> None:
    inputs = _inputs(route=route) if route is not None else _inputs()
    if route is None:
        inputs["v4_result"].pop("route")
    with pytest.raises(ValueError, match=V4_ROUTE_AI_ADAPTATION):
        build_ai_fallback_prompt(**inputs)


def test_cardinality_signature_alias_accepted() -> None:
    """Some V4 callers expose ``cardinality_signature``; both keys must resolve."""
    inputs = _inputs()
    inputs["v4_result"].pop("cardinality")
    inputs["v4_result"]["cardinality_signature"] = {"strict": 4}
    payload = json.loads(build_ai_fallback_prompt(**inputs)["user"])
    assert payload["v4"]["cardinality"] == {"strict": 4}
156
tests/phase_z2_ai_fallback/test_router.py
Normal file
@@ -0,0 +1,156 @@
"""IMP-33 u7 — AI fallback router tests.

Scope (Stage 2 plan, u7):
- flag-off gate returns None and does NOT touch the client / prompt
- route-mismatch gate returns None and does NOT touch the client / prompt
- cache-hit short-circuits the client and still re-validates against the
  current frame contract (defence-in-depth)
- cache-miss calls the client and validates the returned proposal
- validation errors propagate
- budget / circuit exceptions from u4 propagate
- router never imports ``save_proposal`` (cache save is caller-driven
  after visual_check + user_approved per u6 IMP-46 gate)
"""
from __future__ import annotations

from unittest.mock import MagicMock

import pytest

from src.phase_z2_ai_fallback import AiFallbackProposal, ProposalKind
from src.phase_z2_ai_fallback import router as router_mod
from src.phase_z2_ai_fallback.client import (
    AiFallbackBudgetExceeded,
    AiFallbackCircuitOpen,
    AiFallbackClient,
)
from src.phase_z2_ai_fallback.router import route_ai_fallback
from src.phase_z2_ai_fallback.validate import AiFallbackValidationError


_FRAME_CONTRACT = {
    "frame_id": 1171281190,
    "sub_zones": [{"id": "pillar_1", "accepts": ["text_block"]}],
    "payload": {"builder_options": {"item_parser": "pillar_item"}},
}
_REGION = {"id": "zone_top.region_a"}
_V4_AI = {
    "route": "ai_adaptation_required",
    "cardinality": "many",
    "frame_id": 1171281190,
    "rank": 1,
}
_V4_NOT_AI = {"route": "light_edit", "cardinality": "many"}


def _make_proposal(
    kind: ProposalKind = ProposalKind.PARTIAL_OVERRIDES,
    payload: dict | None = None,
) -> AiFallbackProposal:
    return AiFallbackProposal(
        proposal_kind=kind,
        payload=payload if payload is not None else {"slots": {"pillar_1": "a"}},
    )


def _call_kwargs() -> dict:
    return dict(
        cache_key="frame:1171281190:cardinality:many",
        v4_result=_V4_AI,
        frame_contract=_FRAME_CONTRACT,
        frame_visual_html="<div></div>",
        figma_partial_json={},
        internal_region=_REGION,
        mdx_text="# example\n- a\n- b",
    )


def test_router_returns_none_when_flag_off(monkeypatch):
    monkeypatch.setattr(router_mod.settings, "ai_fallback_enabled", False)
    client = MagicMock(spec=AiFallbackClient)
    result = route_ai_fallback(**_call_kwargs(), client=client)
    assert result is None
    client.request_proposal.assert_not_called()


def test_router_returns_none_when_route_not_ai_adaptation(monkeypatch):
    monkeypatch.setattr(router_mod.settings, "ai_fallback_enabled", True)
    client = MagicMock(spec=AiFallbackClient)
    kwargs = _call_kwargs()
    kwargs["v4_result"] = _V4_NOT_AI
    result = route_ai_fallback(**kwargs, client=client)
    assert result is None
    client.request_proposal.assert_not_called()


def test_router_returns_cached_when_cache_hit(monkeypatch):
    monkeypatch.setattr(router_mod.settings, "ai_fallback_enabled", True)
    cached = _make_proposal()
    monkeypatch.setattr(router_mod, "read_proposal", lambda key: cached)
    client = MagicMock(spec=AiFallbackClient)
    result = route_ai_fallback(**_call_kwargs(), client=client)
    assert result is cached
    client.request_proposal.assert_not_called()


def test_router_validates_cached_proposal(monkeypatch):
    monkeypatch.setattr(router_mod.settings, "ai_fallback_enabled", True)
    bad_cached = AiFallbackProposal(
        proposal_kind=ProposalKind.BUILDER_OPTIONS_PATCH,
        payload={"unknown_key": "x"},
    )
    monkeypatch.setattr(router_mod, "read_proposal", lambda key: bad_cached)
    client = MagicMock(spec=AiFallbackClient)
    with pytest.raises(AiFallbackValidationError):
        route_ai_fallback(**_call_kwargs(), client=client)
    client.request_proposal.assert_not_called()


def test_router_calls_client_and_returns_validated_proposal(monkeypatch):
    monkeypatch.setattr(router_mod.settings, "ai_fallback_enabled", True)
    monkeypatch.setattr(router_mod, "read_proposal", lambda key: None)
    proposal = _make_proposal()
    client = MagicMock(spec=AiFallbackClient)
    client.request_proposal.return_value = proposal
    result = route_ai_fallback(**_call_kwargs(), client=client)
    assert result is proposal
    client.request_proposal.assert_called_once()
    sent_prompt = client.request_proposal.call_args.args[0]
    assert set(sent_prompt.keys()) == {"system", "user"}


def test_router_propagates_validation_error(monkeypatch):
|
||||
monkeypatch.setattr(router_mod.settings, "ai_fallback_enabled", True)
|
||||
monkeypatch.setattr(router_mod, "read_proposal", lambda key: None)
|
||||
bad = AiFallbackProposal(
|
||||
proposal_kind=ProposalKind.BUILDER_OPTIONS_PATCH,
|
||||
payload={"unknown_key": "x"},
|
||||
)
|
||||
client = MagicMock(spec=AiFallbackClient)
|
||||
client.request_proposal.return_value = bad
|
||||
with pytest.raises(AiFallbackValidationError):
|
||||
route_ai_fallback(**_call_kwargs(), client=client)
|
||||
|
||||
|
||||
def test_router_propagates_budget_exceeded(monkeypatch):
|
||||
monkeypatch.setattr(router_mod.settings, "ai_fallback_enabled", True)
|
||||
monkeypatch.setattr(router_mod, "read_proposal", lambda key: None)
|
||||
client = MagicMock(spec=AiFallbackClient)
|
||||
client.request_proposal.side_effect = AiFallbackBudgetExceeded("over")
|
||||
with pytest.raises(AiFallbackBudgetExceeded):
|
||||
route_ai_fallback(**_call_kwargs(), client=client)
|
||||
|
||||
|
||||
def test_router_propagates_circuit_open(monkeypatch):
|
||||
monkeypatch.setattr(router_mod.settings, "ai_fallback_enabled", True)
|
||||
monkeypatch.setattr(router_mod, "read_proposal", lambda key: None)
|
||||
client = MagicMock(spec=AiFallbackClient)
|
||||
client.request_proposal.side_effect = AiFallbackCircuitOpen("tripped")
|
||||
with pytest.raises(AiFallbackCircuitOpen):
|
||||
route_ai_fallback(**_call_kwargs(), client=client)
|
||||
|
||||
|
||||
def test_router_does_not_import_save_proposal():
|
||||
"""Cache save is caller-driven AFTER visual_check + user_approved (u6 IMP-46
|
||||
gate); structurally guaranteed by NOT importing save_proposal in the router."""
|
||||
assert not hasattr(router_mod, "save_proposal")
|
||||
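The router tests above pin an order of checks: master flag, then V4 route, then cache read (with re-validation), then the client call (with validation). A minimal, dependency-free sketch of that order follows. `route_sketch` and its parameters (`enabled`, `read_cache`, `validate`, `request`) are hypothetical stand-ins, not the signature of `route_ai_fallback`; the real router reads the flag from `settings` and builds its own prompt, of which only the `system` / `user` keys are visible in the tests.

```python
from typing import Any, Callable, Optional


def route_sketch(
    *,
    enabled: bool,
    route: str,
    cache_key: str,
    read_cache: Callable[[str], Optional[Any]],
    validate: Callable[[Any], None],
    request: Callable[[dict], Any],
) -> Optional[Any]:
    """Hypothetical stand-in mirroring the check order the tests exercise."""
    # Gate 1: master flag OFF means no AI call and no cache read.
    if not enabled:
        return None
    # Gate 2: only the ai_adaptation_required route may reach the fallback.
    if route != "ai_adaptation_required":
        return None
    # Gate 3: a cache hit is re-validated before being returned.
    cached = read_cache(cache_key)
    if cached is not None:
        validate(cached)  # may raise; the client is never contacted
        return cached
    # Gate 4: fresh request; validation errors propagate to the caller.
    proposal = request({"system": "placeholder", "user": "placeholder"})
    validate(proposal)
    return proposal
```

Note that the sketch never writes to the cache, which matches the intent of `test_router_does_not_import_save_proposal`: saving is left to the caller.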
46
tests/phase_z2_ai_fallback/test_schema.py
Normal file
@@ -0,0 +1,46 @@
"""IMP-33 u2 — AiFallbackProposal schema tests.

Scope (Stage 2 plan, u2):
- Whitelisted proposal_kind values are accepted.
- Forbidden output forms are rejected: mdx_text / frame_id_change / raw_html / raw_css.
- extra fields outside the declared schema are rejected (MDX read-only signal).
"""
from __future__ import annotations

import pytest
from pydantic import ValidationError

from src.phase_z2_ai_fallback import AiFallbackProposal, ProposalKind


@pytest.mark.parametrize(
    "kind_value",
    [
        "builder_options_patch",
        "partial_overrides",
        "slot_mapping_proposal",
    ],
)
def test_whitelisted_proposal_kinds_accepted(kind_value: str) -> None:
    proposal = AiFallbackProposal(proposal_kind=kind_value)
    assert proposal.proposal_kind == ProposalKind(kind_value)


@pytest.mark.parametrize(
    "forbidden",
    ["mdx_text", "frame_id_change", "raw_html", "raw_css"],
)
def test_forbidden_proposal_kinds_rejected(forbidden: str) -> None:
    with pytest.raises(ValidationError):
        AiFallbackProposal(proposal_kind=forbidden)


def test_unknown_proposal_kind_rejected() -> None:
    with pytest.raises(ValidationError):
        AiFallbackProposal(proposal_kind="something_else")


def test_extra_fields_rejected() -> None:
    """`extra=forbid` keeps the AI from smuggling raw_html/mdx_text alongside a valid kind."""
    with pytest.raises(ValidationError):
        AiFallbackProposal(proposal_kind="partial_overrides", raw_html="<div/>")
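The schema tests above rely on two mechanisms: a closed set of `proposal_kind` values and rejection of undeclared fields. The real model appears to be a pydantic model (the tests catch `pydantic.ValidationError`); the sketch below is only a standard-library approximation of the same two ideas. `KindSketch` and `ProposalSketch` are hypothetical names, the `payload` / `rationale` defaults are assumptions, and the errors raised here are `ValueError` / `TypeError` rather than `ValidationError`.

```python
from dataclasses import dataclass, field
from enum import Enum


class KindSketch(str, Enum):
    """Closed whitelist: anything else cannot be constructed."""

    BUILDER_OPTIONS_PATCH = "builder_options_patch"
    PARTIAL_OVERRIDES = "partial_overrides"
    SLOT_MAPPING_PROPOSAL = "slot_mapping_proposal"


@dataclass(frozen=True)
class ProposalSketch:
    proposal_kind: KindSketch
    payload: dict = field(default_factory=dict)  # assumed default
    rationale: str = ""  # assumed default

    def __post_init__(self) -> None:
        # Coerce plain strings into the enum; unknown values raise ValueError.
        object.__setattr__(self, "proposal_kind", KindSketch(self.proposal_kind))
```

Because a dataclass `__init__` only accepts declared fields, passing `raw_html=...` fails with `TypeError`, which plays the role that `extra=forbid` plays in the tested model.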
193
tests/phase_z2_ai_fallback/test_step12.py
Normal file
@@ -0,0 +1,193 @@
"""IMP-33 u8 — Step 12 AI repair wiring tests.

Covers the two structural gates layered on top of the u7 router:
  * IMP-30 provisional gate (only provisional units may invoke AI repair)
  * Reject gate (route_hint=design_reference_only NEVER calls AI)
Plus the record-shape contract returned for downstream Step 12 artifacts.
"""
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Any
from unittest.mock import MagicMock

from src.phase_z2_ai_fallback import step12 as step12_mod
from src.phase_z2_ai_fallback.schema import AiFallbackProposal, ProposalKind


@dataclass
class FakeUnit:
    label: str | None
    provisional: bool
    frame_template_id: str = "tmpl"
    frame_id: str = "fid"
    source_section_ids: list[str] = field(default_factory=lambda: ["s1"])
    raw_content: str = "raw"
    v4_rank: int | None = 1


_ROUTE_HINTS: dict[str | None, str | None] = {
    "use_as_is": "direct_render",
    "light_edit": "deterministic_minor_adjustment",
    "restructure": "ai_adaptation_required",
    "reject": "design_reference_only",
    None: None,
}


def _route_for_label(label: str | None) -> str | None:
    return _ROUTE_HINTS.get(label)


def _get_contract(_tid: str) -> dict[str, Any]:
    return {"frame_id": "fid", "payload": {"builder_options": {}}, "sub_zones": []}


def _frame_visual(_tid: str) -> str:
    return "<html></html>"


def _call(
    units: list[FakeUnit],
    *,
    route_ai_fallback: Any | None = None,
    **overrides: Any,
) -> list[dict]:
    if route_ai_fallback is not None:
        step12_mod.route_ai_fallback = route_ai_fallback  # type: ignore[assignment]
    kwargs: dict[str, Any] = dict(
        route_for_label=_route_for_label,
        get_contract_fn=_get_contract,
        frame_visual_loader=_frame_visual,
    )
    kwargs.update(overrides)
    return step12_mod.gather_step12_ai_repair_proposals(units, **kwargs)


def test_non_provisional_unit_is_skipped_without_ai_call(monkeypatch):
    router = MagicMock()
    monkeypatch.setattr(step12_mod, "route_ai_fallback", router)
    units = [FakeUnit(label="restructure", provisional=False)]
    records = _call(units)
    assert records[0]["ai_called"] is False
    assert records[0]["skip_reason"] == "not_provisional"
    assert records[0]["provisional"] is False
    router.assert_not_called()


def test_reject_route_is_skipped_without_ai_call(monkeypatch):
    router = MagicMock()
    monkeypatch.setattr(step12_mod, "route_ai_fallback", router)
    units = [FakeUnit(label="reject", provisional=True)]
    records = _call(units)
    assert records[0]["ai_called"] is False
    assert records[0]["skip_reason"] == "design_reference_only_no_ai"
    assert records[0]["route_hint"] == "design_reference_only"
    router.assert_not_called()


def test_non_ai_route_is_skipped_with_reason(monkeypatch):
    router = MagicMock()
    monkeypatch.setattr(step12_mod, "route_ai_fallback", router)
    units = [FakeUnit(label="light_edit", provisional=True)]
    records = _call(units)
    assert records[0]["ai_called"] is False
    assert records[0]["skip_reason"] == (
        "route_not_ai_adaptation:deterministic_minor_adjustment"
    )
    router.assert_not_called()


def test_router_short_circuit_returns_none_skip_reason(monkeypatch):
    router = MagicMock(return_value=None)
    monkeypatch.setattr(step12_mod, "route_ai_fallback", router)
    units = [FakeUnit(label="restructure", provisional=True)]
    records = _call(units)
    assert records[0]["ai_called"] is False
    assert records[0]["skip_reason"] == "router_short_circuit"
    assert records[0]["proposal"] is None
    router.assert_called_once()


def test_ai_adaptation_call_records_proposal(monkeypatch):
    proposal = AiFallbackProposal(
        proposal_kind=ProposalKind.PARTIAL_OVERRIDES,
        payload={"slots": {"s_text": "x"}},
        rationale="r",
    )
    router = MagicMock(return_value=proposal)
    monkeypatch.setattr(step12_mod, "route_ai_fallback", router)
    units = [FakeUnit(label="restructure", provisional=True)]
    records = _call(units)
    rec = records[0]
    assert rec["ai_called"] is True
    assert rec["skip_reason"] is None
    assert rec["proposal"]["proposal_kind"] == "partial_overrides"
    router.assert_called_once()
    kwargs = router.call_args.kwargs
    assert kwargs["v4_result"]["route"] == "ai_adaptation_required"
    assert kwargs["v4_result"]["label"] == "restructure"


def test_router_exception_is_captured_per_record(monkeypatch):
    router = MagicMock(side_effect=RuntimeError("transient_boom"))
    monkeypatch.setattr(step12_mod, "route_ai_fallback", router)
    units = [FakeUnit(label="restructure", provisional=True)]
    records = _call(units)
    rec = records[0]
    assert rec["ai_called"] is True
    assert rec["proposal"] is None
    assert rec["error"] == "RuntimeError: transient_boom"
    router.assert_called_once()


def test_mixed_units_each_independently_classified(monkeypatch):
    router = MagicMock(return_value=None)
    monkeypatch.setattr(step12_mod, "route_ai_fallback", router)
    units = [
        FakeUnit(label="use_as_is", provisional=False),
        FakeUnit(label="reject", provisional=True),
        FakeUnit(label="restructure", provisional=True),
        FakeUnit(label="restructure", provisional=False),
    ]
    records = _call(units)
    assert [r["skip_reason"] for r in records] == [
        "not_provisional",
        "design_reference_only_no_ai",
        "router_short_circuit",
        "not_provisional",
    ]
    assert router.call_count == 1


def test_cache_key_includes_template_and_section_ids(monkeypatch):
    router = MagicMock(return_value=None)
    monkeypatch.setattr(step12_mod, "route_ai_fallback", router)
    units = [
        FakeUnit(
            label="restructure",
            provisional=True,
            frame_template_id="tmpl_abc",
            source_section_ids=["02-1", "02-2"],
        )
    ]
    _call(units)
    assert router.call_args.kwargs["cache_key"] == "tmpl_abc::02-1,02-2"


def test_record_shape_contract_is_stable(monkeypatch):
    monkeypatch.setattr(step12_mod, "route_ai_fallback", MagicMock(return_value=None))
    units = [FakeUnit(label="reject", provisional=True)]
    rec = _call(units)[0]
    assert set(rec.keys()) == {
        "unit_index",
        "source_section_ids",
        "frame_template_id",
        "label",
        "route_hint",
        "provisional",
        "ai_called",
        "skip_reason",
        "proposal",
        "error",
    }
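The Step 12 tests above fix the skip reasons and the cache-key format that `gather_step12_ai_repair_proposals` is expected to produce. The sketch below reproduces just that classification. `skip_reason_sketch` and `cache_key_sketch` are hypothetical helpers, and checking the provisional gate before the reject gate is an assumption: the tests do not say which reason a non-provisional `reject` unit receives.

```python
from typing import List, Optional

AI_ROUTE = "ai_adaptation_required"
REJECT_ROUTE = "design_reference_only"


def skip_reason_sketch(provisional: bool, route_hint: Optional[str]) -> Optional[str]:
    """Return a skip reason, or None when the unit may be handed to the router."""
    if not provisional:
        # IMP-30 provisional gate (assumed to be evaluated first).
        return "not_provisional"
    if route_hint == REJECT_ROUTE:
        # Reject gate: design-reference-only units never call AI.
        return "design_reference_only_no_ai"
    if route_hint != AI_ROUTE:
        # Any other deterministic route is skipped with its hint appended.
        return f"route_not_ai_adaptation:{route_hint}"
    return None


def cache_key_sketch(frame_template_id: str, source_section_ids: List[str]) -> str:
    """Join the template id and section ids the way the cache-key test expects."""
    return f"{frame_template_id}::{','.join(source_section_ids)}"
```

A unit that passes all three gates can still end up with `skip_reason == "router_short_circuit"` when the router itself returns `None`; that case lives in the caller and is not modelled here.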
208
tests/phase_z2_ai_fallback/test_step17.py
Normal file
@@ -0,0 +1,208 @@
"""IMP-33 u9 — Step 17 AI repair wiring tests (BLOCKED until IMP-34 + IMP-35).

Covers:
  * :data:`OVERFLOW_CASCADE_ORDER` canonical order (4 stages).
  * :class:`OverflowCascadeStage` member values.
  * :data:`STEP17_AI_REPAIR_BLOCKED_REASON` constant value.
  * :func:`gather_step17_ai_repair_proposals` BLOCKED contract — every unit
    returns ``ai_called=False`` + ``skip_reason=STEP17_AI_REPAIR_BLOCKED_REASON``
    + ``proposal=None`` regardless of provisional / label / route_hint.
  * Structural guarantee — the u9 module does NOT import
    :func:`src.phase_z2_ai_fallback.router.route_ai_fallback` or the
    ``anthropic`` SDK. Step 17 AI repair stays structurally blocked.
"""
from __future__ import annotations

import ast
from dataclasses import dataclass, field
from pathlib import Path

from src.phase_z2_ai_fallback import step17 as step17_mod
from src.phase_z2_ai_fallback.step17 import (
    OVERFLOW_CASCADE_ORDER,
    STEP17_AI_REPAIR_BLOCKED_REASON,
    OverflowCascadeStage,
    gather_step17_ai_repair_proposals,
)


@dataclass
class FakeUnit:
    label: str | None
    provisional: bool
    frame_template_id: str = "tmpl"
    frame_id: str = "fid"
    source_section_ids: list[str] = field(default_factory=lambda: ["s1"])
    raw_content: str = "raw"
    v4_rank: int | None = 1


_ROUTE_HINTS: dict[str | None, str | None] = {
    "use_as_is": "direct_render",
    "light_edit": "deterministic_minor_adjustment",
    "restructure": "ai_adaptation_required",
    "reject": "design_reference_only",
    None: None,
}


def _route_for_label(label: str | None) -> str | None:
    return _ROUTE_HINTS.get(label)


# ─── Stage / order constants ─────────────────────────────────────────


def test_overflow_cascade_order_is_canonical():
    assert OVERFLOW_CASCADE_ORDER == (
        OverflowCascadeStage.DETERMINISTIC,
        OverflowCascadeStage.POPUP,
        OverflowCascadeStage.AI_REPAIR,
        OverflowCascadeStage.USER_OVERRIDE,
    )


def test_overflow_cascade_stage_string_values():
    assert OverflowCascadeStage.DETERMINISTIC.value == "deterministic"
    assert OverflowCascadeStage.POPUP.value == "popup"
    assert OverflowCascadeStage.AI_REPAIR.value == "ai_repair"
    assert OverflowCascadeStage.USER_OVERRIDE.value == "user_override"


def test_step17_blocked_reason_constant_value():
    assert (
        STEP17_AI_REPAIR_BLOCKED_REASON
        == "step17_ai_blocked_imp_34_35_prerequisites_missing"
    )


# ─── BLOCKED contract: every unit returns blocked record ─────────────


def test_gather_returns_one_record_per_unit():
    units = [
        FakeUnit(label="restructure", provisional=True),
        FakeUnit(label="reject", provisional=False),
        FakeUnit(label="use_as_is", provisional=True),
    ]
    records = gather_step17_ai_repair_proposals(units, route_for_label=_route_for_label)
    assert len(records) == 3


def test_gather_records_blocked_skip_reason():
    """Every record must carry the IMP-34/IMP-35 prerequisite block reason."""
    units = [FakeUnit(label="restructure", provisional=True)]
    records = gather_step17_ai_repair_proposals(units, route_for_label=_route_for_label)
    assert records[0]["skip_reason"] == STEP17_AI_REPAIR_BLOCKED_REASON


def test_gather_blocks_even_when_route_is_ai_adaptation_required():
    """Provisional + ai_adaptation_required must NOT bypass the u9 block.

    Stage 2 contract: AI repair at Step 17 is blocked behind IMP-34 + IMP-35
    regardless of V4 route hint. Only u8 (Step 12) is allowed to invoke AI today.
    """
    units = [FakeUnit(label="restructure", provisional=True)]
    record = gather_step17_ai_repair_proposals(
        units, route_for_label=_route_for_label
    )[0]
    assert record["route_hint"] == "ai_adaptation_required"
    assert record["ai_called"] is False
    assert record["proposal"] is None
    assert record["skip_reason"] == STEP17_AI_REPAIR_BLOCKED_REASON


def test_gather_blocks_reject_units_too():
    """Reject units (design_reference_only) are also blocked at u9 — same reason."""
    units = [FakeUnit(label="reject", provisional=False)]
    record = gather_step17_ai_repair_proposals(
        units, route_for_label=_route_for_label
    )[0]
    assert record["ai_called"] is False
    assert record["skip_reason"] == STEP17_AI_REPAIR_BLOCKED_REASON


def test_gather_records_proposal_none_and_no_error():
    units = [FakeUnit(label="restructure", provisional=True)]
    record = gather_step17_ai_repair_proposals(
        units, route_for_label=_route_for_label
    )[0]
    assert record["proposal"] is None
    assert record["error"] is None


def test_gather_records_cascade_stage_is_ai_repair():
    units = [FakeUnit(label="restructure", provisional=True)]
    record = gather_step17_ai_repair_proposals(
        units, route_for_label=_route_for_label
    )[0]
    assert record["cascade_stage"] == OverflowCascadeStage.AI_REPAIR.value


def test_gather_preserves_unit_metadata():
    units = [
        FakeUnit(
            label="restructure",
            provisional=True,
            frame_template_id="frame_05_overview",
            source_section_ids=["s1", "s2"],
        )
    ]
    record = gather_step17_ai_repair_proposals(
        units, route_for_label=_route_for_label
    )[0]
    assert record["unit_index"] == 0
    assert record["frame_template_id"] == "frame_05_overview"
    assert record["source_section_ids"] == ["s1", "s2"]
    assert record["label"] == "restructure"
    assert record["provisional"] is True


def test_gather_with_empty_units_returns_empty_list():
    records = gather_step17_ai_repair_proposals([], route_for_label=_route_for_label)
    assert records == []


# ─── Structural guarantee: u9 must NOT import route_ai_fallback / anthropic ─


def _u9_imports() -> list[str]:
    src_path = Path(step17_mod.__file__)
    tree = ast.parse(src_path.read_text(encoding="utf-8"))
    imports: list[str] = []
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            imports.extend(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom):
            module = node.module or ""
            for alias in node.names:
                imports.append(f"{module}.{alias.name}")
    return imports


def test_step17_module_does_not_import_route_ai_fallback():
    """u9 must not be able to reach the u7 router — structural block."""
    imports = _u9_imports()
    forbidden = {
        "src.phase_z2_ai_fallback.router.route_ai_fallback",
        "src.phase_z2_ai_fallback.router",
    }
    assert not any(imp in forbidden for imp in imports), imports
    assert not hasattr(step17_mod, "route_ai_fallback")


def test_step17_module_does_not_import_anthropic():
    """u9 must not reach the Anthropic SDK directly — AI=0 in this layer."""
    imports = _u9_imports()
    leaked = [imp for imp in imports if imp.split(".", 1)[0] == "anthropic"]
    assert leaked == [], leaked


def test_step17_module_does_not_import_ai_fallback_client():
    """u9 must not instantiate the u4 client either."""
    imports = _u9_imports()
    forbidden_prefixes = ("src.phase_z2_ai_fallback.client",)
    leaked = [
        imp for imp in imports if imp.startswith(forbidden_prefixes)
    ]
    assert leaked == [], leaked
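The structural tests above depend on how `_u9_imports` flattens import statements into dotted strings. The sketch below applies the same `ast` walk to an in-memory source string so the resulting strings can be inspected directly. `list_imports`, `SAMPLE`, and the `pkg` package name are illustrative only and do not exist in the repository.

```python
import ast
from typing import List


def list_imports(source: str) -> List[str]:
    """Flatten `import x` and `from m import n` statements into dotted names."""
    found: List[str] = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            found.extend(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom):
            # For relative imports node.module holds only the part after the dots.
            module = node.module or ""
            found.extend(f"{module}.{alias.name}" for alias in node.names)
    return found


SAMPLE = (
    "import anthropic.types\n"
    "from pkg.router import route_ai_fallback\n"
    "from pkg import router\n"
)
```

`from pkg import router` flattens to `pkg.router`, which is presumably why the forbidden set in the router test lists both the function path and the bare module path. A relative form such as `from .router import route_ai_fallback` flattens to `router.route_ai_fallback` and would not match an absolute forbidden entry; in the tests above the extra `hasattr(step17_mod, "route_ai_fallback")` assertion covers that case.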
144
tests/phase_z2_ai_fallback/test_validate.py
Normal file
@@ -0,0 +1,144 @@
"""IMP-33 u5 — AI fallback validator tests.

Scope (Stage 2 plan, u5):
- schema re-validation (defence-in-depth)
- builder whitelist (BUILDER_OPTIONS_PATCH)
- dropped-slot guard (PARTIAL_OVERRIDES / SLOT_MAPPING_PROPOSAL must keep
  every declared sub_zone slot present)
- frame-swap guard (no payload.frame_id mutation; V4 rank-1 protected)
- Internal Region containment (payload.region_id must match declared id)
"""
from __future__ import annotations

import pytest

from src.phase_z2_ai_fallback import AiFallbackProposal, ProposalKind
from src.phase_z2_ai_fallback.validate import (
    AiFallbackValidationError,
    validate_proposal,
)


_FRAME_CONTRACT = {
    "frame_id": 1171281190,
    "sub_zones": [
        {"id": "pillar_1", "accepts": ["text_block"]},
        {"id": "pillar_2", "accepts": ["text_block"]},
        {"id": "pillar_3", "accepts": ["text_block"]},
    ],
    "payload": {
        "builder_options": {
            "item_parser": "pillar_item",
            "array_root": "pillars",
            "role_field": "color_class",
        },
    },
}

_REGION = {"id": "zone_top.region_a"}


def _make(kind: ProposalKind, payload: dict) -> AiFallbackProposal:
    return AiFallbackProposal(proposal_kind=kind, payload=payload)


def test_builder_options_patch_accepts_whitelisted_keys() -> None:
    proposal = _make(
        ProposalKind.BUILDER_OPTIONS_PATCH,
        {"item_parser": "alt_pillar_item"},
    )
    validate_proposal(proposal, frame_contract=_FRAME_CONTRACT)


def test_builder_options_patch_rejects_unknown_key() -> None:
    proposal = _make(
        ProposalKind.BUILDER_OPTIONS_PATCH,
        {"item_parser": "x", "padding_px": 10},
    )
    with pytest.raises(AiFallbackValidationError, match="builder whitelist"):
        validate_proposal(proposal, frame_contract=_FRAME_CONTRACT)


def test_partial_overrides_requires_all_declared_slots() -> None:
    proposal = _make(
        ProposalKind.PARTIAL_OVERRIDES,
        {"slots": {"pillar_1": "a", "pillar_2": "b"}},
    )
    with pytest.raises(AiFallbackValidationError, match="dropped-slot guard"):
        validate_proposal(proposal, frame_contract=_FRAME_CONTRACT)


def test_partial_overrides_with_all_slots_passes() -> None:
    proposal = _make(
        ProposalKind.PARTIAL_OVERRIDES,
        {"slots": {"pillar_1": "a", "pillar_2": "b", "pillar_3": "c"}},
    )
    validate_proposal(proposal, frame_contract=_FRAME_CONTRACT)


def test_slot_mapping_proposal_requires_slots_dict() -> None:
    proposal = _make(ProposalKind.SLOT_MAPPING_PROPOSAL, {"slots": []})
    with pytest.raises(AiFallbackValidationError, match="dropped-slot guard"):
        validate_proposal(proposal, frame_contract=_FRAME_CONTRACT)


def test_frame_swap_guard_rejects_mismatched_frame_id() -> None:
    proposal = _make(
        ProposalKind.BUILDER_OPTIONS_PATCH,
        {"frame_id": 9999, "item_parser": "x"},
    )
    with pytest.raises(AiFallbackValidationError, match="frame-swap guard"):
        validate_proposal(proposal, frame_contract=_FRAME_CONTRACT)


def test_frame_swap_guard_accepts_matching_frame_id() -> None:
    proposal = _make(
        ProposalKind.PARTIAL_OVERRIDES,
        {
            "frame_id": 1171281190,
            "slots": {"pillar_1": "a", "pillar_2": "b", "pillar_3": "c"},
        },
    )
    validate_proposal(proposal, frame_contract=_FRAME_CONTRACT)


def test_internal_region_containment_rejects_mismatch() -> None:
    proposal = _make(
        ProposalKind.PARTIAL_OVERRIDES,
        {
            "slots": {"pillar_1": "a", "pillar_2": "b", "pillar_3": "c"},
            "region_id": "zone_bottom.region_x",
        },
    )
    with pytest.raises(AiFallbackValidationError, match="Internal Region"):
        validate_proposal(
            proposal,
            frame_contract=_FRAME_CONTRACT,
            internal_region=_REGION,
        )


def test_internal_region_containment_accepts_match() -> None:
    proposal = _make(
        ProposalKind.PARTIAL_OVERRIDES,
        {
            "slots": {"pillar_1": "a", "pillar_2": "b", "pillar_3": "c"},
            "region_id": "zone_top.region_a",
        },
    )
    validate_proposal(
        proposal,
        frame_contract=_FRAME_CONTRACT,
        internal_region=_REGION,
    )


def test_internal_region_check_skipped_when_no_region_supplied() -> None:
    proposal = _make(
        ProposalKind.PARTIAL_OVERRIDES,
        {
            "slots": {"pillar_1": "a", "pillar_2": "b", "pillar_3": "c"},
            "region_id": "zone_top.region_a",
        },
    )
    validate_proposal(proposal, frame_contract=_FRAME_CONTRACT)
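Two of the guards exercised above, the dropped-slot guard and the frame-swap guard, are simple enough to illustrate in isolation. The sketch below is one plausible shape for them, inferred only from the test inputs and the `match=` strings. `GuardError`, `check_slots`, and `check_frame_id` are hypothetical names; the actual `validate_proposal` may structure these checks differently.

```python
class GuardError(ValueError):
    """Hypothetical stand-in for AiFallbackValidationError."""


def check_slots(payload: dict, frame_contract: dict) -> None:
    """Dropped-slot guard: every declared sub_zone id must appear in payload.slots."""
    slots = payload.get("slots")
    if not isinstance(slots, dict):
        raise GuardError("dropped-slot guard: payload.slots must be a dict")
    declared = {zone["id"] for zone in frame_contract.get("sub_zones", [])}
    missing = sorted(declared - set(slots))
    if missing:
        raise GuardError(f"dropped-slot guard: missing slots {missing}")


def check_frame_id(payload: dict, frame_contract: dict) -> None:
    """Frame-swap guard: a frame_id in the payload must equal the contract's."""
    proposed = payload.get("frame_id")
    if proposed is not None and proposed != frame_contract["frame_id"]:
        raise GuardError("frame-swap guard: frame_id must not change")
```

An absent `frame_id` is treated as "no change requested" here, consistent with the tests in which payloads without that key pass validation.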
46
tests/test_phase_z2_ai_fallback_config.py
Normal file
@@ -0,0 +1,46 @@
"""IMP-33 u1 — AI fallback Settings defaults (locked).

These defaults are the binding contract from Stage 2 plan (per-unit u1):
  - ai_fallback_enabled = False  (master flag OFF; fallback path only)
  - ai_fallback_model = "claude-opus-4-6-20250415"
  - ai_fallback_timeout_s = 60.0
  - ai_fallback_max_retries = 3
  - ai_fallback_backoff_base_s = 1.0
  - ai_fallback_backoff_cap_s = 8.0
  - ai_fallback_backoff_jitter = 0.3
  - ai_fallback_budget_per_run = 10
  - ai_fallback_circuit_breaker_threshold = 5

Downstream u4 (client) MUST source timeout/retry/backoff/budget/circuit from
Settings; inline literals are forbidden by Stage 2 plan.
"""
from __future__ import annotations

from src.config import Settings


def test_ai_fallback_master_flag_default_off() -> None:
    s = Settings()
    assert s.ai_fallback_enabled is False, (
        "AI fallback master flag MUST default OFF (normal path AI=0 contract)."
    )


def test_ai_fallback_model_default_locked() -> None:
    s = Settings()
    assert s.ai_fallback_model == "claude-opus-4-6-20250415"


def test_ai_fallback_retry_timeout_backoff_defaults_locked() -> None:
    s = Settings()
    assert s.ai_fallback_timeout_s == 60.0
    assert s.ai_fallback_max_retries == 3
    assert s.ai_fallback_backoff_base_s == 1.0
    assert s.ai_fallback_backoff_cap_s == 8.0
    assert s.ai_fallback_backoff_jitter == 0.3


def test_ai_fallback_budget_and_circuit_defaults_locked() -> None:
    s = Settings()
    assert s.ai_fallback_budget_per_run == 10
    assert s.ai_fallback_circuit_breaker_threshold == 5
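The locked defaults above (3 retries, base 1.0 s, cap 8.0 s, jitter 0.3) only fix the numbers, not the formula the client applies to them. As an illustration of how such settings are commonly combined, the sketch below assumes capped exponential doubling with symmetric proportional jitter. `backoff_delays` is a hypothetical helper, and both the doubling and the jitter shape are assumptions rather than the behaviour of `AiFallbackClient`.

```python
import random
from typing import List, Optional


def backoff_delays(
    max_retries: int = 3,
    base_s: float = 1.0,
    cap_s: float = 8.0,
    jitter: float = 0.3,
    rng: Optional[random.Random] = None,
) -> List[float]:
    """Return one sleep duration per retry attempt (assumed formula)."""
    rng = rng or random.Random()
    delays: List[float] = []
    for attempt in range(max_retries):
        # Double the base on each attempt, never exceeding the cap.
        nominal = min(cap_s, base_s * (2 ** attempt))
        # Spread the delay by up to +/- jitter * nominal.
        spread = nominal * jitter
        delays.append(nominal + rng.uniform(-spread, spread))
    return delays
```

With `jitter=0.0` the schedule is deterministic, which makes the capping easy to check by hand; passing a seeded `random.Random` keeps jittered runs reproducible in tests.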