u1 (src/region_marker_stamper.py): deterministic root-div stamper injecting data-region-id + data-content-unit-id onto each family-partial root div anchored by data-template-id. Idempotent (re-stamp = no-op), AI=0, additive only, empty/None markers no-op, F9/F29 frame-slot axis preserved.
u2 (src/phase_z2_pipeline.py render_slide chain): _stamp_region_markers chained after IMP-56 u9 _stamp_zone_html. Marker source = zone.get("placement_markers") or [] — Codex #16 P4b crash risk closed via the or-[] call-site fallback.
u3 (_derive_placement_markers helper): projects PlacementPlan.slot_assignments[] → list[dict] carrying region_id + content_unit_id + frame_slot_id (frame_slot_id reserved for #96 89-d). Live B4 path emits at primary zones_data.append.
u4 (3 non-live zones_data.append defaults): placement_markers: [] at IMP-30 u4 empty-shell, IMP-86 u1 adapter_needed, post-loop unrenderable plan-record paths — uniform zone shape, stamper no-op surface.
u5/u6 (tests/test_phase_z2_imp94_marker_parity.py): 33 hard tests + 2 cross-axis skip-if-anchor-absent (Emergency P4/P4b future axis). Coverage: 13 family-partial root anchors, F29 + F9 frame-slot preservation, idempotence, live render_slide stamping, P4b empty-marker no-crash, MDX 01 strip-attr parity, trace-to-DOM parity.
Disjoint from #96 (data-frame-slot-id) by attribute name. SPEC anchor: docs/architecture/PHASE-Z-CONTENT-OBJECT-SUBZONE-SPEC.md §6.4 + §7.2 (Layer A read targets + render-path activation).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
138 lines
5.4 KiB
Python
"""IMP-94 (#94) u1 — region/content marker stamper for Phase Z final.html.

Annotates each rendered family-partial root ``<div>`` with stable
``data-region-id="..."`` and ``data-content-unit-id="..."`` attributes so
downstream Layer A telemetry (placement_trace ↔ DOM parity, Step 21 self-
report, fit_classifier read targets §6.4) can resolve a rendered zone
back to its PlacementPlan ``slot_assignments[]`` entry.

DOM contract (single point of truth, mirrored verbatim across the axis) ::

    <div class="..." data-region-id="{region_id}" data-content-unit-id="{cuid}" ...
         data-frame-id="..." data-template-id="...">

Attribute order above is illustrative: the stamper emits the two new
attributes directly after ``<div``, ahead of the existing ones.

The anchor is the uniform root-div emitted by every Phase Z family
partial under ``templates/phase_z2/families/`` (13 partials, evidence
confirmed via ``grep -l data-template-id`` = 13/13). All 13 partials
carry the pattern::

    <div class="<fNb>" data-frame-id="..." data-template-id="<family>">

The stamper finds the FIRST such opening tag with a permissive regex
and injects ``data-region-id`` + ``data-content-unit-id`` as new
attributes. Existing attributes (class, data-frame-id, data-template-id,
etc.) are preserved verbatim. The injection is idempotent: a zone that
already carries ``data-region-id`` on its root div is left alone.

Source of marker values : ``PlacementPlan.slot_assignments[].region_id``
and ``.content_unit_id`` (see ``src/phase_z2_placement_planner.py``
L253-258). u3 wires the live B4 path; u4 ensures non-live append paths
default to ``placement_markers=[]`` so this stamper safely no-ops.

Forward-compat / safety :
- Empty / None ``markers`` → passthrough (returns ``zone_html`` unchanged).
- Non-str / empty ``zone_html`` → passthrough.
- Re-stamping (idempotent) preserves the first stamp.
- Only the FIRST data-template-id root div is stamped (one per zone).
- Markers with empty / missing ``region_id`` AND ``content_unit_id`` →
  passthrough (no attribute injection).

Guardrails (refs : Stage 1 binding contract, Stage 2 unit u1) :
- AI-isolation : pure deterministic Python; no LLM calls.
- Additive only : never edits / removes existing attributes.
- Idempotent : ``data-region-id`` probe short-circuits before re-inject.
- Disjoint from #96 (``data-frame-slot-id`` is a separate axis / attr).
"""

from __future__ import annotations

import re
from typing import Any, Iterable, Mapping

REGION_ID_ATTR: str = "data-region-id"
CONTENT_UNIT_ID_ATTR: str = "data-content-unit-id"

# Matches the FIRST ``<div ... data-template-id="...">`` opening tag.
# Group 1 captures the inner attribute string verbatim (incl. leading
# whitespace) so the rewriter can re-emit it unchanged after injection.
_ROOT_DIV_TAG_RE = re.compile(
    r'<div\b((?=[^>]*\bdata-template-id\s*=\s*"[^"]+")[^>]*?)>',
    flags=re.IGNORECASE | re.DOTALL,
)
# Probe for an existing ``data-region-id`` attribute (any value, any
# quote) so re-stamping is idempotent.
_HAS_REGION_ID_RE = re.compile(r"""\bdata-region-id\s*=""", flags=re.IGNORECASE)


def _coerce_marker_value(value: Any) -> str:
    """Return a safe attribute-value string for ``value``.

    Non-str / None → ''. Strings are returned verbatim (caller responsible
    for not embedding ``"`` since marker ids derive from
    PlacementPlan.slot_assignments which are deterministic identifiers).
    """
    if value is None:
        return ""
    if not isinstance(value, str):
        return ""
    return value


def stamp_zone_html(
    zone_html: str,
    markers: Iterable[Mapping[str, Any]] | None,
) -> str:
    """Stamp the root family-partial ``<div>`` with region / content-unit ids.

    ``markers`` is an iterable of mapping objects shaped as ::

        {
            "region_id": "<region_id>",
            "content_unit_id": "<content_unit_id>",
            # optional, ignored here; reserved for #96 (89-d):
            "frame_slot_id": "<frame_slot_id>",
        }

    Only ``markers[0]`` is consumed (one root div per zone). Excess
    markers are reserved for a future per-slot stamper (#96) and are
    silently ignored by this module.

    Returns ``zone_html`` unchanged when:
    - ``zone_html`` is not a non-empty string,
    - ``markers`` is None / empty,
    - no ``data-template-id`` root div is found,
    - the root div already carries ``data-region-id`` (idempotent),
    - the first marker carries neither ``region_id`` nor ``content_unit_id``.
    """
    if not isinstance(zone_html, str) or not zone_html:
        return zone_html
    if markers is None:
        return zone_html
    marker_list = list(markers)
    if not marker_list:
        return zone_html
    first = marker_list[0]
    if not isinstance(first, Mapping):
        return zone_html
    region_id = _coerce_marker_value(first.get("region_id"))
    content_unit_id = _coerce_marker_value(first.get("content_unit_id"))
    if not region_id and not content_unit_id:
        return zone_html

    stamped = {"done": False}

    def _replace(match: re.Match[str]) -> str:
        if stamped["done"]:
            return match.group(0)
        attrs = match.group(1) or ""
        if _HAS_REGION_ID_RE.search(attrs):
            stamped["done"] = True
            return match.group(0)
        stamped["done"] = True
        injected = (
            f' {REGION_ID_ATTR}="{region_id}"'
            f' {CONTENT_UNIT_ID_ATTR}="{content_unit_id}"'
        )
        return f"<div{injected}{attrs}>"

    return _ROOT_DIV_TAG_RE.sub(_replace, zone_html, count=1)
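The stamping contract described above (first-stamp-wins idempotence, passthrough on empty markers, and the `or []` call-site fallback from u2) can be illustrated with a small standalone sketch. To stay self-contained it does not import the module; instead it re-declares a condensed copy of the regex and logic under the hypothetical name `stamp_sketch`. The sample HTML, the id values, and the `"html"` key on the zone dict are assumptions made up for illustration, not values taken from the pipeline.

```python
import re

# Condensed, illustrative copy of the module's root-div regex and probe.
ROOT_DIV = re.compile(
    r'<div\b((?=[^>]*\bdata-template-id\s*=\s*"[^"]+")[^>]*?)>',
    flags=re.IGNORECASE | re.DOTALL,
)
HAS_REGION_ID = re.compile(r"\bdata-region-id\s*=", flags=re.IGNORECASE)


def stamp_sketch(zone_html, markers):
    """Hypothetical stand-in for stamp_zone_html with the same passthrough rules."""
    if not isinstance(zone_html, str) or not zone_html or not markers:
        return zone_html
    first = markers[0]
    region_id = first.get("region_id") or ""
    content_unit_id = first.get("content_unit_id") or ""
    if not region_id and not content_unit_id:
        return zone_html

    def _replace(match):
        attrs = match.group(1)
        if HAS_REGION_ID.search(attrs):
            # Already stamped: keep the first stamp untouched.
            return match.group(0)
        return (
            f'<div data-region-id="{region_id}"'
            f' data-content-unit-id="{content_unit_id}"{attrs}>'
        )

    # count=1 restricts the rewrite to the first matching root div.
    return ROOT_DIV.sub(_replace, zone_html, count=1)


# Made-up zone HTML following the family-partial root-div pattern.
zone_html = (
    '<div class="f29b" data-frame-id="F29" data-template-id="family_x">'
    "<p>body</p></div>"
)

once = stamp_sketch(zone_html, [{"region_id": "r1", "content_unit_id": "cu1"}])
twice = stamp_sketch(once, [{"region_id": "r2", "content_unit_id": "cu2"}])

# Call-site fallback: a zone whose placement_markers is None still no-ops.
zone = {"html": zone_html, "placement_markers": None}
fallback = stamp_sketch(zone["html"], zone.get("placement_markers") or [])

print(once)
print(twice == once)          # re-stamping keeps the first stamp
print(fallback == zone_html)  # empty markers leave the HTML unchanged
```

Note that the new attributes land directly after `<div`, ahead of `class`, so any parity check against the rendered DOM is safer when it compares attributes by name rather than by position in the tag.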