feat(#94): IMP-94 u1~u6 Layer A region/content marker injection (stamper + render_slide chain + 4 zones_data.append placement_markers + 35 parity tests)
Some checks failed
Multi-MDX Regression (IMP-91) / multi-mdx-regression (push) Failing after 21s
u1 (src/region_marker_stamper.py): deterministic root-div stamper injecting data-region-id + data-content-unit-id onto each family-partial root div anchored by data-template-id. Idempotent (re-stamp = no-op), AI=0, additive only, empty/None markers no-op, F9/F29 frame-slot axis preserved.
u2 (src/phase_z2_pipeline.py render_slide chain): _stamp_region_markers chained after IMP-56 u9 _stamp_zone_html. Marker source = zone.get("placement_markers") or [] — Codex #16 P4b crash risk closed via the or-[] call-site fallback.
u3 (_derive_placement_markers helper): projects PlacementPlan.slot_assignments[] → list[dict] carrying region_id + content_unit_id + frame_slot_id (frame_slot_id reserved for #96 89-d). Live B4 path emits at primary zones_data.append.
u4 (3 non-live zones_data.append defaults): placement_markers: [] at IMP-30 u4 empty-shell, IMP-86 u1 adapter_needed, post-loop unrenderable plan-record paths — uniform zone shape, stamper no-op surface.
u5/u6 (tests/test_phase_z2_imp94_marker_parity.py): 33 hard tests + 2 cross-axis skip-if-anchor-absent (Emergency P4/P4b future axis). Coverage: 13 family-partial root anchors, F29 + F9 frame-slot preservation, idempotence, live render_slide stamping, P4b empty-marker no-crash, MDX 01 strip-attr parity, trace-to-DOM parity.
Disjoint from #96 (data-frame-slot-id) by attribute name. SPEC anchor: docs/architecture/PHASE-Z-CONTENT-OBJECT-SUBZONE-SPEC.md §6.4 + §7.2 (Layer A read targets + render-path activation).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
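
The u2 and u3 items describe a projection from `PlacementPlan.slot_assignments[]` to a marker list, plus an `or []` fallback at the call site. The pipeline code is not part of this diff, so the following is only a minimal sketch under stated assumptions: the plan is modelled as a plain dict, and `derive_placement_markers` and `markers_for` are hypothetical names, not the actual helpers in `src/phase_z2_pipeline.py`.

```python
from __future__ import annotations

from typing import Any, Mapping


def derive_placement_markers(plan: Mapping[str, Any] | None) -> list[dict[str, Any]]:
    """Hypothetical stand-in for the u3 helper (plan modelled as a dict)."""
    assignments = (plan or {}).get("slot_assignments") or []
    return [
        {
            "region_id": a.get("region_id"),
            "content_unit_id": a.get("content_unit_id"),
            # Carried along but unused by the stamper; reserved for #96.
            "frame_slot_id": a.get("frame_slot_id"),
        }
        for a in assignments
        if isinstance(a, Mapping)
    ]


def markers_for(zone: Mapping[str, Any]) -> list[dict[str, Any]]:
    """Call-site fallback as described in u2: missing or None becomes []."""
    return zone.get("placement_markers") or []
```

With this shape, a zone built by one of the three non-live append paths (`placement_markers: []`) and a zone that lacks the key entirely both reach the stamper as an empty list, which it treats as a no-op.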
137
src/region_marker_stamper.py
Normal file
@@ -0,0 +1,137 @@
"""IMP-94 (#94) u1 — region/content marker stamper for Phase Z final.html.

Annotates each rendered family-partial root ``<div>`` with stable
``data-region-id="..."`` and ``data-content-unit-id="..."`` attributes so
downstream Layer A telemetry (placement_trace ↔ DOM parity, Step 21 self-
report, fit_classifier read targets §6.4) can resolve a rendered zone
back to its PlacementPlan ``slot_assignments[]`` entry.

DOM contract (single point of truth — mirrored verbatim across the axis) ::

    <div class="..." data-region-id="{region_id}" data-content-unit-id="{cuid}" ...
         data-frame-id="..." data-template-id="...">

The anchor is the uniform root-div emitted by every Phase Z family
partial under ``templates/phase_z2/families/`` (13 partials, evidence
confirmed via ``grep -l data-template-id`` = 13/13). All 13 partials
carry the pattern::

    <div class="<fNb>" data-frame-id="..." data-template-id="<family>">

The stamper finds the FIRST such opening tag with a permissive regex
and injects ``data-region-id`` + ``data-content-unit-id`` as new
attributes. Existing attributes (class, data-frame-id, data-template-id,
etc.) are preserved verbatim. The injection is idempotent — a zone that
already carries ``data-region-id`` on its root div is left alone.

Source of marker values : ``PlacementPlan.slot_assignments[].region_id``
and ``.content_unit_id`` (see ``src/phase_z2_placement_planner.py``
L253-258). u3 wires the live B4 path; u4 ensures non-live append paths
default to ``placement_markers=[]`` so this stamper safely no-ops.

Forward-compat / safety :
  - Empty / None ``markers`` → passthrough (returns ``zone_html`` unchanged).
  - Non-str / empty ``zone_html`` → passthrough.
  - Re-stamping (idempotent) preserves the first stamp.
  - Only the FIRST data-template-id root div is stamped (one per zone).
  - Markers with empty / missing ``region_id`` AND ``content_unit_id`` →
    passthrough (no attribute injection).

Guardrails (refs : Stage 1 binding contract, Stage 2 unit u1) :
  - AI-isolation : pure deterministic Python; no LLM calls.
  - Additive only : never edits / removes existing attributes.
  - Idempotent : ``data-region-id`` probe short-circuits before re-inject.
  - Disjoint from #96 (``data-frame-slot-id`` is a separate axis / attr).
"""
from __future__ import annotations

import re
from typing import Any, Iterable, Mapping

REGION_ID_ATTR: str = "data-region-id"
CONTENT_UNIT_ID_ATTR: str = "data-content-unit-id"

# Matches the FIRST ``<div ... data-template-id="...">`` opening tag.
# Group 1 captures the inner attribute string verbatim (incl. leading
# whitespace) so the rewriter can re-emit it unchanged after injection.
_ROOT_DIV_TAG_RE = re.compile(
    r'<div\b((?=[^>]*\bdata-template-id\s*=\s*"[^"]+")[^>]*?)>',
    flags=re.IGNORECASE | re.DOTALL,
)
# Probe for an existing ``data-region-id`` attribute (any value, any
# quote) so re-stamping is idempotent.
_HAS_REGION_ID_RE = re.compile(r"""\bdata-region-id\s*=""", flags=re.IGNORECASE)


def _coerce_marker_value(value: Any) -> str:
    """Return a safe attribute-value string for ``value``.

    Non-str / None → ''. Strings are returned verbatim (caller responsible
    for not embedding ``"`` since marker ids derive from
    PlacementPlan.slot_assignments which are deterministic identifiers).
    """
    if value is None:
        return ""
    if not isinstance(value, str):
        return ""
    return value


def stamp_zone_html(
    zone_html: str,
    markers: Iterable[Mapping[str, Any]] | None,
) -> str:
    """Stamp the root family-partial ``<div>`` with region / content-unit ids.

    ``markers`` is an iterable of mapping objects shaped as ::

        {
            "region_id": "<region_id>",
            "content_unit_id": "<content_unit_id>",
            # optional, ignored here — reserved for #96 (89-d):
            "frame_slot_id": "<frame_slot_id>",
        }

    Only ``markers[0]`` is consumed (one root div per zone). Excess
    markers are reserved for a future per-slot stamper (#96) and are
    silently ignored by this module.

    Returns ``zone_html`` unchanged when:
      - ``zone_html`` is not a non-empty string,
      - ``markers`` is None / empty,
      - no ``data-template-id`` root div is found,
      - the root div already carries ``data-region-id`` (idempotent),
      - the first marker carries neither ``region_id`` nor ``content_unit_id``.
    """
    if not isinstance(zone_html, str) or not zone_html:
        return zone_html
    if markers is None:
        return zone_html
    marker_list = list(markers)
    if not marker_list:
        return zone_html
    first = marker_list[0]
    if not isinstance(first, Mapping):
        return zone_html
    region_id = _coerce_marker_value(first.get("region_id"))
    content_unit_id = _coerce_marker_value(first.get("content_unit_id"))
    if not region_id and not content_unit_id:
        return zone_html

    stamped = {"done": False}

    def _replace(match: re.Match[str]) -> str:
        if stamped["done"]:
            return match.group(0)
        attrs = match.group(1) or ""
        if _HAS_REGION_ID_RE.search(attrs):
            stamped["done"] = True
            return match.group(0)
        stamped["done"] = True
        injected = (
            f' {REGION_ID_ATTR}="{region_id}"'
            f' {CONTENT_UNIT_ID_ATTR}="{content_unit_id}"'
        )
        return f"<div{injected}{attrs}>"

    return _ROOT_DIV_TAG_RE.sub(_replace, zone_html, count=1)
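
To make the stamping contract of the module above concrete, here is a condensed sketch that reuses the same root-div pattern and the same `data-region-id` probe. `stamp_once` is a hypothetical name used only for illustration; it takes the two ids directly and omits the marker-list handling and input guards of `stamp_zone_html`, so it should be read as an approximation of the behaviour, not as the module's API.

```python
import re

# Same pattern as _ROOT_DIV_TAG_RE: the lookahead requires data-template-id
# inside the same opening tag, so wrapper divs without it are skipped.
ROOT_DIV = re.compile(
    r'<div\b((?=[^>]*\bdata-template-id\s*=\s*"[^"]+")[^>]*?)>',
    flags=re.IGNORECASE | re.DOTALL,
)
HAS_REGION_ID = re.compile(r"\bdata-region-id\s*=", flags=re.IGNORECASE)


def stamp_once(zone_html: str, region_id: str, content_unit_id: str) -> str:
    """Illustrative stand-in for stamp_zone_html (hypothetical helper)."""

    def _replace(match: re.Match) -> str:
        attrs = match.group(1)
        if HAS_REGION_ID.search(attrs):
            # Root div already stamped: keep the first stamp untouched.
            return match.group(0)
        # New attributes go first; the original attribute string follows verbatim.
        return (
            f'<div data-region-id="{region_id}"'
            f' data-content-unit-id="{content_unit_id}"{attrs}>'
        )

    # count=1 restricts the rewrite to the first matching root div.
    return ROOT_DIV.sub(_replace, zone_html, count=1)


html = (
    '<div class="wrap">'
    '<div class="f9b" data-frame-id="F9" data-template-id="fam">x</div>'
    "</div>"
)
once = stamp_once(html, "r1", "cu1")
twice = stamp_once(once, "r2", "cu2")  # no-op: the root div is already stamped
```

In this sketch the outer wrapper is left alone because it has no `data-template-id`, and the second call returns its input unchanged, which is the idempotence property the parity tests are described as covering.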