Add resplit_all_reject_merges() helper in phase_z2_composition.py that
detects parent_merged / parent_merged_inferred units with label=reject
and rebuilds them as per-section single units using each section's own
rank-1 V4 evidence (no frame swap, MDX raw_content preserved).
Pipeline hook fires once after Step 6 settling chain (u12/u4/empty-shell)
and section_assignment_plan resolution, before Step 6 artifact write.
Guards: beneficial-split rule (>=1 non-reject), coverage equality, layout
cap (>4 abort), max_retry=1, section_assignment_override short-circuit.
Audit: comp_debug["imp48_resplit"] additive payload (applied, split_units,
skipped_units, post_split_unit_count, post_split_layout_preset);
selection_path="resplit_from_merge" telemetry on rebuilt singles;
layout_preset re-derived via select_layout_preset(new_units).
Tests: 39/39 PASS (composition u1~u6: 14 cases; pipeline u7~u9: 25 cases).
Scoped regression 720/6 with 6 failures isolated as pre-existing on
baseline 79f9ea5 (independent of IMP-48). mdx03 golden lock preserved.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
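The guard sequence described in the commit message above can be illustrated with a small, self-contained sketch. This is not the code in phase_z2_composition.py: `Unit`, `is_target`, `sketch_resplit`, `LAYOUT_CAP`, and the `own_label` callback are hypothetical names used only for illustration, the order of the guards is an assumption, and the `incomplete_rebuild` and max_retry guards are left out because the source does not describe them in enough detail.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable

MERGED_TYPES = {"parent_merged", "parent_merged_inferred"}
LAYOUT_CAP = 4  # commit message: more than 4 post-split units aborts


@dataclass(frozen=True)
class Unit:
    """Hypothetical stand-in for CompositionUnit (illustration only)."""
    source_section_ids: tuple[str, ...]
    merge_type: str
    label: str
    selection_path: str = "rank_1"


def is_target(unit: Unit) -> bool:
    """Detection: a multi-section merged unit whose label is reject."""
    return (
        unit.merge_type in MERGED_TYPES
        and unit.label == "reject"
        and len(unit.source_section_ids) >= 2
    )


def sketch_resplit(
    units: list[Unit],
    own_label: Callable[[str], str],  # assumed: section_id -> own rank-1 label
    section_assignment_override: bool = False,
) -> tuple[list[Unit], dict]:
    audit: dict = {"applied": False, "split_units": [], "skipped_units": []}
    if section_assignment_override:  # user-supplied assignment wins
        audit["skipped_reason"] = "section_assignment_override"
        return units, audit
    if not any(is_target(u) for u in units):
        audit["skipped_reason"] = "no_detection"
        return units, audit

    out: list[Unit] = []
    splits: list[dict] = []
    for unit in units:
        if not is_target(unit):
            out.append(unit)  # untouched units keep their position
            continue
        ids = list(unit.source_section_ids)
        labels = [own_label(sid) for sid in ids]
        non_reject = sum(label != "reject" for label in labels)
        if non_reject == 0:  # beneficial-split rule: need >= 1 non-reject child
            audit["skipped_units"].append(
                {"merged_source_section_ids": ids, "reason": "no_beneficial_split"}
            )
            out.append(unit)
            continue
        # Rebuild per-section singles in place of the merged unit.
        out.extend(
            Unit((sid,), "single", label, "resplit_from_merge")
            for sid, label in zip(ids, labels)
        )
        splits.append(
            {"merged_source_section_ids": ids, "non_reject_count": non_reject}
        )

    if not splits:
        audit["skipped_reason"] = "no_split_applied"
        return units, audit
    if len(out) > LAYOUT_CAP:  # layout cap guard: abort, keep the input list
        audit["skipped_reason"] = "layout_cap_exceeded"
        return units, audit

    def covered(us: list[Unit]) -> set[str]:
        return {sid for u in us for sid in u.source_section_ids}

    # Coverage equality: no section may be dropped or invented. It cannot
    # fail in this sketch; how the real helper reacts is not shown in the source.
    assert covered(out) == covered(units)
    audit["applied"] = True
    audit["split_units"] = splits
    return out, audit
```

Every skip path in the sketch returns the input list object itself, which is the property the no-op tests in the file below check with `out_units is units_pre`.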
1569 lines
70 KiB
Python
"""IMP-48 (#77) u7+u8+u9 — Pipeline regression for IMP-48 hook surface.

Scope (this file — Stage 2 plan u7 + u8 + u9):

u7 (no-op), u8 (split-help), and u9 (split-then-reject) pipeline
regression for the Step 6 hook in ``src/phase_z2_pipeline.py`` (u4 —
call site at L3970-L3989, u5 — re-derive + artifact extension at
L3990-L4014, L4061-L4070, L4079-L4084).

u7 — no-op contract: when the post-Step-6 unit list contains no
IMP-48-target merged-reject units, the hook must be a no-op:

  1. ``comp_debug["imp48_resplit"]["applied"] is False``;
  2. the ``units`` list referenced by the Step 6 artifact write at
     L4023-L4086 is byte-identical (as a list of ``CompositionUnit``
     dataclass instances + as the serialized ``selected_units`` payload)
     to the pre-hook list;
  3. the audit ``skipped_reason`` is one of the deterministic Stage 1
     enum values (``section_assignment_override`` / ``no_detection`` /
     ``no_beneficial_split`` / ``incomplete_rebuild`` /
     ``layout_cap_exceeded``) — never ``applied=True`` without a swap;
  4. ``layout_preset`` is not re-derived (u5's
     ``_imp48_audit.get("applied")`` gate at L3996 short-circuits when
     applied=False).

u8 — split-help contract: when a ``parent_merged`` /
``parent_merged_inferred`` unit with ``label="reject"`` is present AND
≥1 child section has its OWN rank-1 V4 evidence with ``label != "reject"``
AND the post-split projected count is ≤ 4, the hook must SPLIT it so
each child section reaches the normal per-section route (use_as_is /
light_edit / restructure → matched_zone / adapt_matched_zone /
extract_matched_zone) instead of being handed to IMP-47B (#76) as a
single blob. The hook must:

  a. set ``audit["applied"]`` to ``True``;
  b. preserve coverage equality
     (``{sid for unit in out for sid in unit.source_section_ids} ==
     {sid for unit in pre for sid in unit.source_section_ids}``) —
     Stage 1 ★ dropped_zero_invariant;
  c. emit singles using each section's OWN rank-1 V4 template_id +
     frame_id + label — Stage 1 ★ feedback_ai_isolation_contract
     (no frame swap from merged parent template_id);
  d. emit singles with ``raw_content == sections[sid].raw_content``
     (not the merged blob) — Stage 1 ★ MDX_raw_content_invariant;
  e. tag each split-produced single ``selection_path="resplit_from_merge"``
     — Stage 1 Q3 YES additive field reuse;
  f. surface ``audit["post_split_layout_preset"]`` from
     ``select_layout_preset(out_units)`` so u5's re-derive block at
     ``phase_z2_pipeline.py:3996-4006`` reflects the new unit count;
  g. produce a Step 6 ``selected_units`` payload (mirror of the
     dict-comprehension at L4031-L4060) whose entries are byte-identical
     to the singles' OWN fields — IMP-47B (#76) on Step 9 sees
     per-section evidence, not the merged blob.

u7 cases (Stage 2 plan no-op axis):

  1. **Source anchor** — the u4/u5 wiring markers + audit storage in
     ``phase_z2_pipeline.py`` are present (cheap structural guard
     against silent removal in a future refactor).
  2. **Import wiring** — ``resplit_all_reject_merges`` is importable
     from ``phase_z2_composition`` (i.e. the import block at
     ``phase_z2_pipeline.py:41-50`` is wired and the alphabetical
     position is intact).
  3. **No-op on all-direct slide** — every section is ``use_as_is`` /
     ``light_edit`` → ``audit["applied"] is False``,
     ``audit["detected_units"] == []``, units identity preserved
     (``out_units is units`` — same Python list object, byte-identical
     to the pre-hook list).
  4. **No-op on mixed single-reject (mdx03 lock shape)** — singles
     with mixed labels, including a single-section reject, do NOT enter
     detection (``merge_type=="single"`` excluded). Mirrors the mdx03
     golden lock invariant (``project_mdx03_frame_lock``).
  5. **No-op on parent_merged non-reject** — merged unit with
     ``label != "reject"`` does NOT enter detection. Confirms the
     beneficial-split threshold is anchored on ``label == "reject"``
     (Stage 1 RULE_0 scope-lock — no template_id / frame_id hardcoding).
  6. **Step 6 artifact serialization parity** — the ``selected_units``
     dict-comprehension at ``phase_z2_pipeline.py:4031-4060`` produces
     the same payload pre- and post-hook for no-op inputs
     (byte-identical JSON).
  7. **section_assignment_override skip** — when the pipeline forwards
     ``section_assignment_override=True`` (IMP-06 / #6 ground truth at
     ``phase_z2_pipeline.py:3988``), the helper short-circuits with
     ``audit["skipped_reason"] == "section_assignment_override"`` and
     units identity preserved.
  8. **Empty units list (degenerate no-op)** — ``units == []``
     short-circuits cleanly: ``audit["applied"] is False``,
     ``audit["skipped_reason"] == "no_detection"``,
     ``audit["post_split_unit_count"] == 0``, units identity preserved.

u8 cases (Stage 2 plan split-help axis):

  9. **Split applied (2-section merged-reject + non-reject children)** —
     merged_reject with 2 sections, each with own rank-1 ``use_as_is`` /
     ``light_edit`` V4 evidence → ``audit["applied"] is True``,
     out_units = per-section singles (in source_section_ids order),
     merged removed.
  10. **No frame swap — singles carry OWN evidence** — each
      split-produced single's ``frame_template_id`` / ``frame_id`` /
      ``frame_number`` / ``label`` come from that section's OWN
      ``v4_lookup_fn`` (rank-1), NOT the merged parent's
      ``frame_template_id``. ★ feedback_ai_isolation_contract.
  11. **Raw content preservation — per-section, not merged blob** —
      each split-produced single's ``raw_content`` equals
      ``sections[sid].raw_content``, not the merged unit's joined
      ``raw_content`` string. ★ MDX_raw_content_invariant.
  12. **selection_path telemetry tag** — every split-produced single
      has ``selection_path == "resplit_from_merge"`` (Stage 1 Q3 YES).
      Non-split units in the same out_units list keep their original
      ``selection_path`` ("rank_1" etc.) — additive, non-clobbering.
  13. **Normal per-section route restoration** — each split-produced
      single's ``phase_z_status`` maps via ``v4_label_to_status`` from
      its OWN label (matched_zone / adapt_matched_zone /
      extract_matched_zone), NOT ``fallback_candidate``. This is the
      core IMP-48 win: child sections reach the auto-renderable path
      instead of IMP-47B (#76) AI repair.
  14. **Coverage equality** — set of section_ids in out_units equals
      set in pre-hook units (★ Stage 1 dropped_zero_invariant). Pre = 1
      merged unit with 3 sections, post = 3 singles, ∀ sid preserved.
  15. **layout_preset re-derivation contract** —
      ``audit["post_split_layout_preset"]`` is non-None when
      ``applied=True`` and matches ``select_layout_preset(out_units)``.
      This is what u5's pipeline re-derive block at
      ``phase_z2_pipeline.py:3996-4006`` reads to update
      ``layout_preset`` (when ``not layout_override_applied``).
  16. **Step 6 artifact serialization for split-help** — the
      ``selected_units`` dict-comprehension at
      ``phase_z2_pipeline.py:4031-4060`` over the post-split out_units
      contains per-section entries with each section's OWN evidence;
      ``imp48_resplit.applied`` is True; merged parent's evidence is
      absent from the payload. Locks the Step 9 / IMP-47B (#76)
      hand-off shape (per-unit, not per-merge-blob).
  17. **Mixed pre-hook list — order preserved** — when pre = [single,
      merged_reject(2 sections), single], post = [single, single,
      single, single] in source-order (split inserted in place of
      merged, surrounding singles untouched).

u9 cases (Stage 2 plan split-then-reject axis — coverage preserved +
remaining reject singles eligible for IMP-47B (#76) handoff):

  18. **Split applied with mixed reject + non-reject children** — pre =
      [merged_reject(MOCK_S1, MOCK_S2)] where MOCK_S1 has its OWN rank-1
      V4 evidence with ``label="use_as_is"`` and MOCK_S2 has its OWN
      rank-1 V4 evidence with ``label="reject"`` (the section's OWN
      truth is also reject — e.g., the child genuinely has no decent
      frame). ``audit["applied"] is True`` (≥1 non-reject is the
      beneficial-split threshold), 2 singles in source order,
      ``audit["split_units"][0]["non_reject_count"] == 1``.
  19. **Reject single routes to IMP-47B handoff via fallback_candidate**
      — the split-produced single for MOCK_S2 (own rank-1 reject)
      carries ``label="reject"`` AND
      ``phase_z_status="fallback_candidate"``. This is the contract
      IMP-47B (#76) router reads at
      ``src/phase_z2_pipeline.py:582`` (_RECONSTRUCTION_BY_HINT) to
      decide ``ai_adaptation_required``. The non-reject sibling
      (MOCK_S1) routes via its OWN ``matched_zone`` /
      ``adapt_matched_zone``, NOT fallback. The IMP-48 win here is
      per-section handoff: IMP-47B sees individual reject sections
      instead of one merged blob.
  20. **All-children-reject merge — no_beneficial_split skip path** —
      pre = [merged_reject(MOCK_S1, MOCK_S2)] where BOTH sections have
      OWN rank-1 V4 with ``label="reject"``.
      ``audit["applied"] is False``,
      ``audit["skipped_reason"] == "no_split_applied"``,
      ``audit["skipped_units"][0]["reason"] == "no_beneficial_split"``.
      Merged unit preserved → IMP-47B sees the merged blob (existing
      behavior, IMP-48 is a no-op here). Coverage preserved by
      definition (merged kept whole).
  21. **Coverage preserved across mixed children (3-section split)** —
      pre = [merged_reject(MOCK_S1, MOCK_S2, MOCK_S3)] with 2
      non-reject + 1 reject. ``audit["applied"] is True``, post = 3
      singles, ``{sid for u in post for sid in u.source_section_ids}
      == {MOCK_S1, MOCK_S2, MOCK_S3}`` (★ Stage 1
      dropped_zero_invariant); the reject single is NOT dropped — it
      carries its OWN section's raw_content + own V4 reject evidence
      and routes to IMP-47B.
  22. **No frame swap on reject single** — the reject split-produced
      single's ``frame_template_id`` / ``frame_id`` / ``frame_number``
      come from its OWN ``v4_lookup_fn(sid)`` (a reject-labelled V4),
      NOT the merged parent's reject template_id and NOT the
      non-reject sibling's template_id.
      ★ feedback_ai_isolation_contract.
  23. **selection_path tagging covers reject singles too** — every
      split-produced single, including the one with own-reject label,
      has ``selection_path == "resplit_from_merge"``. Stage 1 Q3 YES
      additive-tag rule is uniform across mixed-children splits.
  24. **Raw content preservation across reject + non-reject singles** —
      both the reject single and the non-reject single carry their
      OWN section's ``raw_content`` (from ``sections[sid]``), NOT the
      merged parent's joined blob. ★ MDX_raw_content_invariant. The
      reject single's raw_content is what IMP-47B (#76) feeds to AI
      restructure — per-section input, not merged blob input.
  25. **Step 6 artifact payload for split-then-reject** — the
      ``selected_units`` dict-comp at
      ``phase_z2_pipeline.py:4031-4060`` over the post-split out_units
      yields per-section entries; the reject single's payload entry
      has ``phase_z_status="fallback_candidate"`` and the non-reject
      single's entry has ``matched_zone`` / ``adapt_matched_zone``.
      Locks the Step 9 / IMP-47B (#76) hand-off shape: downstream
      consumers see one fallback_candidate single (not a merged blob
      of mixed sections).

★ AI=0 throughout — PZ-1 deterministic code path only.
★ No-hardcoding (RULE_7) — stubs use MOCK_ prefixed identifiers; no
  real catalog template_id / frame_id / MDX sample identifier leaks.
★ mdx03_lock — case 4 represents the mdx03 shape (all-single, no merged
  reject) and locks the byte-identical no-op contract.
★ u8 split-help cases lock the mdx04 04-1 expectation: a 2-section
  merged-reject becomes 2 per-section singles, and each child reaches
  the normal route via its own rank-1 V4.
★ u9 split-then-reject cases lock the mdx05 expectation: a 2~3 section
  merged-reject with mixed reject + non-reject children is split so
  the reject child(ren) reach IMP-47B (#76) AS INDIVIDUAL SECTIONS
  rather than as one merged blob. Existing all-reject merges remain
  no-op (IMP-47B handles merged blob — existing behavior preserved).
"""
from __future__ import annotations

import json
from dataclasses import dataclass
from pathlib import Path
from typing import Optional

from src.phase_z2_composition import (
    CompositionUnit,
    resplit_all_reject_merges,
)


# ─── Synthetic stubs (MOCK_ prefix mandatory — IMP-30 u3 convention) ───


@dataclass
class _StubV4Match:
    template_id: str
    frame_id: str
    frame_number: int
    confidence: float
    label: str
    v4_rank: Optional[int] = None
    selection_path: str = "rank_1"
    fallback_reason: Optional[str] = None
    provisional: bool = False


@dataclass
class _StubSection:
    section_id: str
    title: str = ""
    raw_content: str = ""


# Mirrors V4_LABEL_TO_PHASE_Z_STATUS / MVP1_ALLOWED_STATUSES at
# phase_z2_pipeline.py:97-103 — kept inline so the test is self-contained
# (parallel to IMP-47B u12 stub set, see test_imp47b_mixed_reject_fill.py).
_LABEL_TO_STATUS = {
    "use_as_is": "matched_zone",
    "light_edit": "adapt_matched_zone",
    "restructure": "extract_matched_zone",
    "reject": "fallback_candidate",
}
_ALLOWED_STATUSES = {"matched_zone", "adapt_matched_zone"}


def _make_lookup(matches: dict[str, _StubV4Match]):
    """Build the lookup_fn the pipeline forwards at u4 call site (L3983)."""
    def _fn(section_id: str) -> Optional[_StubV4Match]:
        return matches.get(section_id)
    return _fn


def _candidates_lookup_empty(section_id: str) -> list:
    """Stand-in for candidates_lookup_fn (L3987) — empty list is sufficient
    for the no-op cases since detection never fires."""
    return []


def _serialize_units_like_step6_artifact(units: list[CompositionUnit]) -> list[dict]:
    """Replicate the ``selected_units`` dict-comprehension at
    ``phase_z2_pipeline.py:4031-4060`` so byte-identical parity can be
    asserted on the post-hook artifact payload (case 6 — serialization
    parity invariant). Mirrors the exact field set + ordering written by
    ``_write_step_artifact(... 6, "composition_plan", ...)``.
    """
    return [
        {
            "source_section_ids": u.source_section_ids,
            "merge_type": u.merge_type,
            "frame_id": u.frame_id,
            "frame_number": u.frame_number,
            "frame_template_id": u.frame_template_id,
            "label": u.label,
            "v4_rank": u.v4_rank,
            "selection_path": u.selection_path,
            "fallback_reason": u.fallback_reason,
            "score": u.score,
            "phase_z_status": u.phase_z_status,
            "rationale": u.rationale,
            "notes": list(u.notes),
            "v4_candidates": [
                {
                    "template_id": c.template_id,
                    "frame_id": c.frame_id,
                    "frame_number": c.frame_number,
                    "confidence": c.confidence,
                    "label": c.label,
                }
                for c in u.v4_candidates
            ],
        }
        for u in units
    ]


def _make_single_unit(
    section_id: str,
    *,
    label: str = "use_as_is",
    template_id: Optional[str] = None,
) -> CompositionUnit:
    """Construct a ``merge_type="single"`` CompositionUnit shaped like
    ``collect_candidates`` output (mirrors u6's ``_make_single_unit``)."""
    return CompositionUnit(
        source_section_ids=[section_id],
        merge_type="single",
        frame_template_id=template_id or f"MOCK_TMPL_{section_id}",
        frame_id=f"MOCK_FRM_{section_id}",
        frame_number=hash(section_id) % 32,
        confidence=0.85,
        label=label,
        phase_z_status=_LABEL_TO_STATUS.get(label, "unknown"),
        raw_content=f"section {section_id} content",
        title=section_id,
    )


def _make_merged_unit(
    *,
    merge_type: str,
    source_section_ids: list[str],
    label: str,
    template_id: str = "MOCK_TMPL_PARENT",
) -> CompositionUnit:
    """Construct a merged CompositionUnit (parent_merged / inferred)."""
    return CompositionUnit(
        source_section_ids=list(source_section_ids),
        merge_type=merge_type,
        frame_template_id=template_id,
        frame_id="MOCK_FRM_PARENT",
        frame_number=99,
        confidence=0.5,
        label=label,
        phase_z_status=_LABEL_TO_STATUS.get(label, "unknown"),
        raw_content="MERGED RAW CONTENT (joined from children)",
        title="MOCK_PARENT",
    )


# ─── Case 1 : Source anchor — u4 + u5 wiring markers present ────────


def test_u4_u5_pipeline_source_contains_imp48_hook_markers():
    """Anchor test. Ensures the u4 call site + u5 re-derive + audit
    storage + artifact extension blocks in ``src/phase_z2_pipeline.py``
    are present (not silently removed by a future refactor).

    Asserts on:
      * the IMP-48 marker comment at the u4 hook (L3970-L3979);
      * the helper call ``resplit_all_reject_merges(`` with the
        ``section_assignment_override=`` kwarg (L3980-L3989);
      * the audit storage ``comp_debug["imp48_resplit"] = _imp48_audit``
        (L3990);
      * the u5 layout_preset re-derive block
        (``_imp48_audit.get("applied")`` + ``post_split_layout_preset``
        + ``not layout_override_applied``) at L3996-L4006;
      * the Step 6 artifact additive field
        ``"imp48_resplit": _imp48_audit`` at L4069 and the note
        extension at L4079-L4084.

    Cheap structural guard — does not run the heavy pipeline.
    """
    src_path = Path(__file__).resolve().parent.parent / "src" / "phase_z2_pipeline.py"
    text = src_path.read_text(encoding="utf-8")

    # u4 marker comment + call site
    assert "IMP-48 (#77) — re-split merged-reject units into per-section singles." in text, (
        "u4 marker comment missing from pipeline — IMP-48 hook may have been removed"
    )
    assert "resplit_all_reject_merges(" in text, (
        "u4 helper call missing from pipeline"
    )
    assert "section_assignment_override=section_assignment_plan is not None" in text, (
        "u4 override-skip kwarg wiring missing — IMP-06 (#6) ground truth contract broken"
    )
    # Audit storage at u4
    assert 'comp_debug["imp48_resplit"] = _imp48_audit' in text, (
        "u4 audit storage missing — comp_debug telemetry key absent"
    )

    # u5 re-derive block
    assert "_imp48_audit.get(\"applied\")" in text, (
        "u5 applied-gate missing — layout_preset would re-derive on no-op paths"
    )
    assert "post_split_layout_preset" in text, (
        "u5 post_split_layout_preset reference missing"
    )
    assert "not layout_override_applied" in text, (
        "u5 layout-override respect missing — would clobber --override-layout"
    )

    # Step 6 artifact extension
    assert '"imp48_resplit": _imp48_audit' in text, (
        "u5 Step 6 artifact additive field missing"
    )
    assert "IMP-48 (#77, 2026-05-22)" in text, (
        "u5 Step 6 artifact note IMP-48 entry missing"
    )


# ─── Case 2 : Import wiring (alphabetical block at L41-L50) ────────


def test_resplit_helper_imported_in_pipeline():
    """The pipeline's import block at ``phase_z2_pipeline.py:41-50``
    imports ``resplit_all_reject_merges`` alongside ``plan_composition``
    and ``select_display_strategy_candidates``. This protects against a
    silent rename / removal that would crash the u4 call site with a
    ``NameError`` only at runtime.
    """
    src_path = Path(__file__).resolve().parent.parent / "src" / "phase_z2_pipeline.py"
    text = src_path.read_text(encoding="utf-8")

    # Find the from-import block and assert membership.
    assert "from phase_z2_composition import (" in text, (
        "phase_z2_composition import block missing"
    )
    # Alphabetical neighbors (Stage 3 u4 lock — see [Claude #7] r4).
    assert "    plan_composition,\n    resplit_all_reject_merges,\n" in text, (
        "resplit_all_reject_merges must follow plan_composition alphabetically "
        "in the import block (Stage 3 u4 wiring lock)"
    )


# ─── Case 3 : No-op on all-direct slide (every section auto-renderable) ──


def test_no_op_on_all_direct_singles_units_identity_preserved():
    """All-direct slide (every section is use_as_is / light_edit) →
    ``audit["applied"] is False``, ``audit["detected_units"] == []``,
    units identity preserved (same Python list object — byte-identical
    to the pre-hook list)."""
    units_pre = [
        _make_single_unit("MOCK_S1", label="use_as_is"),
        _make_single_unit("MOCK_S2", label="light_edit"),
    ]
    sections = [
        _StubSection("MOCK_S1", raw_content="s1"),
        _StubSection("MOCK_S2", raw_content="s2"),
    ]
    lookup = _make_lookup({
        "MOCK_S1": _StubV4Match("MOCK_TMPL_S1", "MOCK_FRM_S1", 1, 0.92, "use_as_is", v4_rank=1),
        "MOCK_S2": _StubV4Match("MOCK_TMPL_S2", "MOCK_FRM_S2", 2, 0.81, "light_edit", v4_rank=1),
    })

    out_units, audit = resplit_all_reject_merges(
        units_pre, sections, lookup, _LABEL_TO_STATUS, _ALLOWED_STATUSES,
        v4_candidates_lookup_fn=_candidates_lookup_empty,
    )

    # No-op contract — applied=False, no detection, units identity preserved.
    assert audit["applied"] is False
    assert audit["detected_units"] == []
    assert audit["skipped_reason"] == "no_detection"
    assert audit["split_units"] == []
    assert audit["skipped_units"] == []
    # Same Python list object — helper returned the input list as-is.
    assert out_units is units_pre, (
        "no-op helper must preserve units list identity (no copy)"
    )
    # u5 gate guard — post_split_layout_preset is None when applied=False
    # (so the pipeline's u5 re-derive block at L3996-L4006 short-circuits).
    assert audit["post_split_layout_preset"] is None
    assert audit["post_split_unit_count"] == len(units_pre)


# ─── Case 4 : mdx03 lock shape — singles with single-section reject ──


def test_no_op_on_mdx03_lock_shape_single_reject_not_detected():
    """mdx03 golden lock invariant : even when a single
    (merge_type=="single") carries label="reject", it does NOT enter
    detection. Detection requires ``merge_type ∈ {parent_merged,
    parent_merged_inferred}`` AND ``len(source_section_ids) >= 2``.

    This mirrors the mdx03 byte-identical no-op contract from
    ``project_mdx03_frame_lock`` — IMP-48 must not perturb mdx03 output
    even if a single section's V4 evidence happens to be reject.
    """
    units_pre = [
        _make_single_unit("MOCK_S1", label="use_as_is"),
        _make_single_unit("MOCK_S2", label="reject"),  # single, NOT merged
    ]
    sections = [
        _StubSection("MOCK_S1", raw_content="s1"),
        _StubSection("MOCK_S2", raw_content="s2"),
    ]
    lookup = _make_lookup({
        "MOCK_S1": _StubV4Match("MOCK_TMPL_S1", "MOCK_FRM_S1", 1, 0.92, "use_as_is", v4_rank=1),
        "MOCK_S2": _StubV4Match("MOCK_TMPL_S2", "MOCK_FRM_S2", 2, 0.10, "reject", v4_rank=1),
    })

    out_units, audit = resplit_all_reject_merges(
        units_pre, sections, lookup, _LABEL_TO_STATUS, _ALLOWED_STATUSES,
        v4_candidates_lookup_fn=_candidates_lookup_empty,
    )

    assert audit["applied"] is False
    assert audit["detected_units"] == []
    assert audit["skipped_reason"] == "no_detection"
    assert out_units is units_pre


# ─── Case 5 : No-op on parent_merged non-reject ────────────────────


def test_no_op_on_parent_merged_non_reject_unit():
    """Beneficial-split threshold is anchored on ``label == "reject"``
    (Stage 1 RULE_0 scope-lock). A ``parent_merged`` unit with
    ``label="light_edit"`` (or any non-reject label) does NOT enter
    detection — no template_id / frame_id / section_id pattern-matching."""
    merged = _make_merged_unit(
        merge_type="parent_merged",
        source_section_ids=["MOCK_S1", "MOCK_S2"],
        label="light_edit",
    )
    units_pre = [merged]
    sections = [
        _StubSection("MOCK_S1", raw_content="s1"),
        _StubSection("MOCK_S2", raw_content="s2"),
    ]
    lookup = _make_lookup({
        "MOCK_S1": _StubV4Match("MOCK_TMPL_S1", "MOCK_FRM_S1", 1, 0.85, "light_edit", v4_rank=1),
        "MOCK_S2": _StubV4Match("MOCK_TMPL_S2", "MOCK_FRM_S2", 2, 0.85, "light_edit", v4_rank=1),
    })

    out_units, audit = resplit_all_reject_merges(
        units_pre, sections, lookup, _LABEL_TO_STATUS, _ALLOWED_STATUSES,
        v4_candidates_lookup_fn=_candidates_lookup_empty,
    )

    assert audit["applied"] is False
    assert audit["detected_units"] == []
    assert out_units is units_pre


# ─── Case 6 : Step 6 artifact serialization parity ─────────────────


def test_step6_artifact_serialized_payload_byte_identical_for_no_op():
    """The Step 6 artifact's ``selected_units`` payload (the
    dict-comprehension at ``phase_z2_pipeline.py:4031-4060``) must be
    byte-identical pre- and post-hook on no-op inputs. Guards against a
    helper that mutates returned units in-place (which would change the
    artifact JSON even when ``applied=False``).
    """
    units_pre = [
        _make_single_unit("MOCK_S1", label="use_as_is"),
        _make_merged_unit(
            merge_type="parent_merged",
            source_section_ids=["MOCK_S2", "MOCK_S3"],
            label="light_edit",  # non-reject merged → no-op
        ),
    ]
    sections = [
        _StubSection("MOCK_S1", raw_content="s1"),
        _StubSection("MOCK_S2", raw_content="s2"),
        _StubSection("MOCK_S3", raw_content="s3"),
    ]
    lookup = _make_lookup({
        "MOCK_S1": _StubV4Match("MOCK_TMPL_S1", "MOCK_FRM_S1", 1, 0.92, "use_as_is", v4_rank=1),
        "MOCK_S2": _StubV4Match("MOCK_TMPL_S2", "MOCK_FRM_S2", 2, 0.85, "light_edit", v4_rank=1),
        "MOCK_S3": _StubV4Match("MOCK_TMPL_S3", "MOCK_FRM_S3", 3, 0.85, "light_edit", v4_rank=1),
    })

    payload_pre = _serialize_units_like_step6_artifact(units_pre)
    pre_json = json.dumps(payload_pre, sort_keys=True, ensure_ascii=False)

    out_units, audit = resplit_all_reject_merges(
        units_pre, sections, lookup, _LABEL_TO_STATUS, _ALLOWED_STATUSES,
        v4_candidates_lookup_fn=_candidates_lookup_empty,
    )

    payload_post = _serialize_units_like_step6_artifact(out_units)
    post_json = json.dumps(payload_post, sort_keys=True, ensure_ascii=False)

    assert audit["applied"] is False
    assert post_json == pre_json, (
        "no-op hook must produce byte-identical Step 6 artifact payload "
        "(helper must not mutate units in-place)"
    )


# ─── Case 7 : section_assignment_override skip (IMP-06 ground truth) ──


def test_no_op_when_section_assignment_override_active():
    """When the pipeline forwards ``section_assignment_override=True``
    (i.e. ``section_assignment_plan is not None``; IMP-06 / #6 user
    override at ``phase_z2_pipeline.py:3988``), the helper
    short-circuits before detection. Even if the units contain a
    merged-reject (which would normally trigger), the override takes
    precedence and the units are returned identity-preserved.

    This locks the contract that IMP-06 zoneSections is the ground
    truth — IMP-48 never overrides a user-supplied section assignment.
    """
    merged_reject = _make_merged_unit(
        merge_type="parent_merged",
        source_section_ids=["MOCK_S1", "MOCK_S2"],
        label="reject",  # would normally trigger detection
    )
    units_pre = [merged_reject]
    sections = [
        _StubSection("MOCK_S1", raw_content="s1"),
        _StubSection("MOCK_S2", raw_content="s2"),
    ]
    lookup = _make_lookup({
        "MOCK_S1": _StubV4Match("MOCK_TMPL_S1", "MOCK_FRM_S1", 1, 0.92, "use_as_is", v4_rank=1),
        "MOCK_S2": _StubV4Match("MOCK_TMPL_S2", "MOCK_FRM_S2", 2, 0.85, "light_edit", v4_rank=1),
    })

    out_units, audit = resplit_all_reject_merges(
        units_pre, sections, lookup, _LABEL_TO_STATUS, _ALLOWED_STATUSES,
        v4_candidates_lookup_fn=_candidates_lookup_empty,
        section_assignment_override=True,
    )

    assert audit["applied"] is False
    assert audit["skipped_reason"] == "section_assignment_override"
    # Detection skipped entirely — detected_units never populated.
    assert audit["detected_units"] == []
    # Units identity preserved.
    assert out_units is units_pre


# ─── Case 8 : Empty units list — degenerate no-op ──────────────────


def test_no_op_on_empty_units_list():
    """When ``units == []`` (initial plan_composition produced nothing
    and IMP-30 u4 / empty-shell path populated the placeholder via a
    different mechanism, OR Stage 3's empty-shell placeholder hasn't
    been built yet), the helper must short-circuit cleanly without
    raising on the iteration."""
    units_pre: list[CompositionUnit] = []
    sections = [_StubSection("MOCK_S1", raw_content="s1")]
    lookup = _make_lookup({
        "MOCK_S1": _StubV4Match("MOCK_TMPL_S1", "MOCK_FRM_S1", 1, 0.10, "reject", v4_rank=1),
    })

    out_units, audit = resplit_all_reject_merges(
        units_pre, sections, lookup, _LABEL_TO_STATUS, _ALLOWED_STATUSES,
        v4_candidates_lookup_fn=_candidates_lookup_empty,
    )

    assert audit["applied"] is False
    assert audit["detected_units"] == []
    assert audit["skipped_reason"] == "no_detection"
    assert out_units is units_pre
    assert audit["post_split_unit_count"] == 0


# ═══════════════════════════════════════════════════════════════════════
# u8 — Pipeline regression for split-help case
# ═══════════════════════════════════════════════════════════════════════
#
# Each test below validates a contract that the pipeline hook (u4 call
# site + u5 layout_preset re-derive + Step 6 artifact extension) relies
# on when a real merged-reject unit is present in the post-Step-6 unit
# list. We exercise ``resplit_all_reject_merges`` with the SAME signature
# the pipeline forwards at ``phase_z2_pipeline.py:3980-3989`` (same
# lookup_fn, label-to-status map, allowed_statuses, capacity_fit-shaped
# default, candidates lookup, override flag).
#
# All identifiers MOCK_ prefixed (★ RULE_7_no_hardcoding). No real
# catalog template_id / frame_id / MDX sample identifier leaks.
# ═══════════════════════════════════════════════════════════════════════


# ─── Case 9 : Split applied — 2-section merged-reject + non-reject children ─


def test_split_applied_two_section_merge_with_non_reject_children():
    """Pre = [merged_reject(MOCK_S1, MOCK_S2)] where each section has its
    OWN rank-1 V4 evidence with a non-reject label. Post = 2 singles, in
    source_section_ids order, ``audit["applied"] is True``."""
    merged = _make_merged_unit(
        merge_type="parent_merged",
        source_section_ids=["MOCK_S1", "MOCK_S2"],
        label="reject",
    )
    units_pre: list[CompositionUnit] = [merged]
    sections = [
        _StubSection("MOCK_S1", title="t1", raw_content="raw content of s1"),
        _StubSection("MOCK_S2", title="t2", raw_content="raw content of s2"),
    ]
    lookup = _make_lookup({
        "MOCK_S1": _StubV4Match("MOCK_TMPL_S1", "MOCK_FRM_S1", 1, 0.92, "use_as_is", v4_rank=1),
        "MOCK_S2": _StubV4Match("MOCK_TMPL_S2", "MOCK_FRM_S2", 2, 0.81, "light_edit", v4_rank=1),
    })

    out_units, audit = resplit_all_reject_merges(
        units_pre, sections, lookup, _LABEL_TO_STATUS, _ALLOWED_STATUSES,
        v4_candidates_lookup_fn=_candidates_lookup_empty,
    )

    assert audit["applied"] is True
    # applied=True path pops the contract-stage skipped_reason (see
    # ``src/phase_z2_composition.py:1260`` — ``audit.pop("skipped_reason",
    # None)`` after the applied branch).
    assert "skipped_reason" not in audit, (
        "applied=True path must not carry a skipped_reason value"
    )
    # Out_units shape: 2 per-section singles, merged removed.
    assert len(out_units) == 2
    assert all(u.merge_type == "single" for u in out_units)
    assert [u.source_section_ids for u in out_units] == [["MOCK_S1"], ["MOCK_S2"]]
    # Audit shape: one split entry, no skips.
    assert len(audit["split_units"]) == 1
    assert audit["skipped_units"] == []
    assert audit["split_units"][0]["merged_source_section_ids"] == ["MOCK_S1", "MOCK_S2"]
    assert audit["split_units"][0]["non_reject_count"] == 2
    assert audit["post_split_unit_count"] == 2


# ─── Case 10 : No frame swap — singles carry OWN evidence ──────────────


def test_split_singles_use_own_section_v4_evidence_no_frame_swap():
    """★ feedback_ai_isolation_contract — each split-produced single's
    frame_template_id / frame_id / frame_number / label come from the
    section's OWN rank-1 V4 lookup. The merged parent's
    ``frame_template_id`` ("MOCK_TMPL_PARENT_REJECT") MUST NOT appear on
    any split-produced single. No frame swap of one section's frame onto
    another section.
    """
    merged = _make_merged_unit(
        merge_type="parent_merged_inferred",
        source_section_ids=["MOCK_S1", "MOCK_S2"],
        label="reject",
        template_id="MOCK_TMPL_PARENT_REJECT",
    )
    units_pre: list[CompositionUnit] = [merged]
    sections = [
        _StubSection("MOCK_S1", raw_content="s1"),
        _StubSection("MOCK_S2", raw_content="s2"),
    ]
    lookup = _make_lookup({
        "MOCK_S1": _StubV4Match("MOCK_TMPL_S1", "MOCK_FRM_S1", 7, 0.88, "use_as_is", v4_rank=1),
        "MOCK_S2": _StubV4Match("MOCK_TMPL_S2", "MOCK_FRM_S2", 11, 0.79, "light_edit", v4_rank=1),
    })

    out_units, audit = resplit_all_reject_merges(
        units_pre, sections, lookup, _LABEL_TO_STATUS, _ALLOWED_STATUSES,
        v4_candidates_lookup_fn=_candidates_lookup_empty,
    )

    assert audit["applied"] is True
    # No split-produced single carries the merged parent's template_id /
    # frame_id / frame_number. Each carries its OWN section's V4 evidence.
    parent_template = merged.frame_template_id
    parent_frame_id = merged.frame_id
    parent_frame_number = merged.frame_number
    for single in out_units:
        assert single.frame_template_id != parent_template, (
            f"frame swap detected: single {single.source_section_ids[0]} "
            f"carries merged parent template_id={parent_template}"
        )
        assert single.frame_id != parent_frame_id
        assert single.frame_number != parent_frame_number
    # Each single matches its OWN section's V4 evidence exactly.
    s1, s2 = out_units
    assert (s1.frame_template_id, s1.frame_id, s1.frame_number, s1.label) == (
        "MOCK_TMPL_S1", "MOCK_FRM_S1", 7, "use_as_is",
    )
    assert (s2.frame_template_id, s2.frame_id, s2.frame_number, s2.label) == (
        "MOCK_TMPL_S2", "MOCK_FRM_S2", 11, "light_edit",
    )


# ─── Case 11 : Raw content preservation (per-section, not merged blob) ──


def test_split_singles_preserve_per_section_raw_content():
    """★ MDX_raw_content_invariant — each split-produced single's
    ``raw_content`` equals the section's original ``raw_content`` (from
    the ``sections`` list), NOT the merged unit's joined
    ``raw_content`` blob. Locks the Stage 1 invariant that the split
    path never edits / summarizes / discards MDX text.
    """
    merged_raw = "MERGED BLOB — joined from children, must NOT leak to singles"
    merged = CompositionUnit(
        source_section_ids=["MOCK_S1", "MOCK_S2"],
        merge_type="parent_merged",
        frame_template_id="MOCK_TMPL_PARENT_REJECT",
        frame_id="MOCK_FRM_PARENT_REJECT",
        frame_number=99,
        confidence=0.10,
        label="reject",
        phase_z_status=_LABEL_TO_STATUS["reject"],
        raw_content=merged_raw,
        title="MOCK_PARENT",
    )
    units_pre: list[CompositionUnit] = [merged]
    sections = [
        _StubSection("MOCK_S1", title="title-1", raw_content="section S1 ORIGINAL text"),
        _StubSection("MOCK_S2", title="title-2", raw_content="section S2 ORIGINAL text"),
    ]
    lookup = _make_lookup({
        "MOCK_S1": _StubV4Match("MOCK_TMPL_S1", "MOCK_FRM_S1", 1, 0.92, "use_as_is", v4_rank=1),
        "MOCK_S2": _StubV4Match("MOCK_TMPL_S2", "MOCK_FRM_S2", 2, 0.81, "light_edit", v4_rank=1),
    })

    out_units, audit = resplit_all_reject_merges(
        units_pre, sections, lookup, _LABEL_TO_STATUS, _ALLOWED_STATUSES,
        v4_candidates_lookup_fn=_candidates_lookup_empty,
    )

    assert audit["applied"] is True
    # Each split-produced single carries its OWN section's raw_content.
    by_sid = {u.source_section_ids[0]: u for u in out_units}
    assert by_sid["MOCK_S1"].raw_content == "section S1 ORIGINAL text"
    assert by_sid["MOCK_S2"].raw_content == "section S2 ORIGINAL text"
    # And title is forwarded from the section (not merged parent title).
    assert by_sid["MOCK_S1"].title == "title-1"
    assert by_sid["MOCK_S2"].title == "title-2"
    # Merged blob MUST NOT appear in any single's raw_content.
    for single in out_units:
        assert merged_raw not in single.raw_content


# ─── Case 12 : selection_path telemetry tag ────────────────────────────


def test_split_singles_tagged_with_resplit_from_merge_selection_path():
    """Stage 1 Q3 YES — every split-produced single has
    ``selection_path == "resplit_from_merge"``. Pre-hook singles that
    surround the merged unit keep their original ``selection_path``
    (additive, non-clobbering).
    """
    pre_single = CompositionUnit(
        source_section_ids=["MOCK_S0"],
        merge_type="single",
        frame_template_id="MOCK_TMPL_S0",
        frame_id="MOCK_FRM_S0",
        frame_number=0,
        confidence=0.95,
        label="use_as_is",
        phase_z_status=_LABEL_TO_STATUS["use_as_is"],
        raw_content="s0",
        title="t0",
        v4_rank=1,
        selection_path="rank_1",
    )
    merged = _make_merged_unit(
        merge_type="parent_merged",
        source_section_ids=["MOCK_S1", "MOCK_S2"],
        label="reject",
    )
    units_pre: list[CompositionUnit] = [pre_single, merged]
    sections = [
        _StubSection("MOCK_S0", raw_content="s0"),
        _StubSection("MOCK_S1", raw_content="s1"),
        _StubSection("MOCK_S2", raw_content="s2"),
    ]
    lookup = _make_lookup({
        "MOCK_S0": _StubV4Match("MOCK_TMPL_S0", "MOCK_FRM_S0", 0, 0.95, "use_as_is", v4_rank=1),
        "MOCK_S1": _StubV4Match("MOCK_TMPL_S1", "MOCK_FRM_S1", 1, 0.92, "use_as_is", v4_rank=1),
        "MOCK_S2": _StubV4Match("MOCK_TMPL_S2", "MOCK_FRM_S2", 2, 0.81, "light_edit", v4_rank=1),
    })

    out_units, audit = resplit_all_reject_merges(
        units_pre, sections, lookup, _LABEL_TO_STATUS, _ALLOWED_STATUSES,
        v4_candidates_lookup_fn=_candidates_lookup_empty,
    )

    assert audit["applied"] is True
    assert len(out_units) == 3
    # Pre-existing single keeps its original selection_path (untouched).
    assert out_units[0].source_section_ids == ["MOCK_S0"]
    assert out_units[0].selection_path == "rank_1"
    # Split-produced singles get the IMP-48 telemetry tag.
    assert out_units[1].source_section_ids == ["MOCK_S1"]
    assert out_units[1].selection_path == "resplit_from_merge"
    assert out_units[2].source_section_ids == ["MOCK_S2"]
    assert out_units[2].selection_path == "resplit_from_merge"


# ─── Case 13 : Normal per-section route restoration ────────────────────


def test_split_singles_route_to_normal_phase_z_status_not_fallback():
    """The IMP-48 win: child sections reach the normal auto-renderable
    route via their OWN label → phase_z_status mapping. The merged
    parent's ``phase_z_status="fallback_candidate"`` (from
    ``label="reject"``) MUST NOT propagate to any split-produced single
    whose own label is not reject.

    Each rebuilt single's ``phase_z_status`` is set by
    ``v4_label_to_status.get(match.label, "unknown")`` (see
    ``src/phase_z2_composition.py:1126``) — the OWN label, not the
    parent's.
    """
    merged = _make_merged_unit(
        merge_type="parent_merged",
        source_section_ids=["MOCK_S1", "MOCK_S2", "MOCK_S3"],
        label="reject",
    )
    units_pre: list[CompositionUnit] = [merged]
    sections = [
        _StubSection("MOCK_S1", raw_content="s1"),
        _StubSection("MOCK_S2", raw_content="s2"),
        _StubSection("MOCK_S3", raw_content="s3"),
    ]
    # Each section's OWN rank-1: 3 different non-reject labels.
    lookup = _make_lookup({
        "MOCK_S1": _StubV4Match("MOCK_TMPL_S1", "MOCK_FRM_S1", 1, 0.92, "use_as_is", v4_rank=1),
        "MOCK_S2": _StubV4Match("MOCK_TMPL_S2", "MOCK_FRM_S2", 2, 0.81, "light_edit", v4_rank=1),
        "MOCK_S3": _StubV4Match("MOCK_TMPL_S3", "MOCK_FRM_S3", 3, 0.68, "restructure", v4_rank=1),
    })

    out_units, audit = resplit_all_reject_merges(
        units_pre, sections, lookup, _LABEL_TO_STATUS, _ALLOWED_STATUSES,
        v4_candidates_lookup_fn=_candidates_lookup_empty,
    )

    assert audit["applied"] is True
    by_sid = {u.source_section_ids[0]: u for u in out_units}
    # Each single's phase_z_status maps from its OWN label, not "reject".
    assert by_sid["MOCK_S1"].phase_z_status == "matched_zone"
    assert by_sid["MOCK_S2"].phase_z_status == "adapt_matched_zone"
    assert by_sid["MOCK_S3"].phase_z_status == "extract_matched_zone"
    # None of the singles inherit the merged parent's fallback_candidate
    # status. (Merged parent's phase_z_status was "fallback_candidate".)
    assert all(s.phase_z_status != "fallback_candidate" for s in out_units)


# ─── Case 14 : Coverage equality (★ dropped_zero_invariant) ─────────────


def test_split_preserves_full_section_coverage():
    """★ Stage 1 dropped_zero_invariant — the set of section_ids covered
    by out_units equals the set covered by pre-hook units. Pre = 1
    merged unit with 3 sections, post = 3 singles, ∀ sid preserved.
    """
    merged = _make_merged_unit(
        merge_type="parent_merged",
        source_section_ids=["MOCK_S1", "MOCK_S2", "MOCK_S3"],
        label="reject",
    )
    units_pre: list[CompositionUnit] = [merged]
    sections = [
        _StubSection("MOCK_S1", raw_content="s1"),
        _StubSection("MOCK_S2", raw_content="s2"),
        _StubSection("MOCK_S3", raw_content="s3"),
    ]
    lookup = _make_lookup({
        "MOCK_S1": _StubV4Match("MOCK_TMPL_S1", "MOCK_FRM_S1", 1, 0.92, "use_as_is", v4_rank=1),
        "MOCK_S2": _StubV4Match("MOCK_TMPL_S2", "MOCK_FRM_S2", 2, 0.81, "light_edit", v4_rank=1),
        "MOCK_S3": _StubV4Match("MOCK_TMPL_S3", "MOCK_FRM_S3", 3, 0.78, "light_edit", v4_rank=1),
    })

    pre_sids = {sid for u in units_pre for sid in u.source_section_ids}
    out_units, audit = resplit_all_reject_merges(
        units_pre, sections, lookup, _LABEL_TO_STATUS, _ALLOWED_STATUSES,
        v4_candidates_lookup_fn=_candidates_lookup_empty,
    )

    assert audit["applied"] is True
    post_sids = {sid for u in out_units for sid in u.source_section_ids}
    assert pre_sids == post_sids == {"MOCK_S1", "MOCK_S2", "MOCK_S3"}
    # 3 splits → 3 singles, no duplicates, no drops.
    assert len(out_units) == 3
    assert len([sid for u in out_units for sid in u.source_section_ids]) == 3


# ─── Case 15 : layout_preset re-derivation contract (u5 input) ──────────


def test_split_audit_post_split_layout_preset_matches_select_layout_preset():
    """``audit["post_split_layout_preset"]`` is non-None when
    ``applied=True`` and reflects ``select_layout_preset(out_units)``.
    The pipeline's u5 re-derive block at
    ``phase_z2_pipeline.py:3996-4006`` reads exactly this field to
    decide whether to update ``layout_preset`` (when
    ``not layout_override_applied``).
    """
    # Local import: no top-level side effects on test discovery.
    from src.phase_z2_composition import select_layout_preset

    merged = _make_merged_unit(
        merge_type="parent_merged",
        source_section_ids=["MOCK_S1", "MOCK_S2"],
        label="reject",
    )
    units_pre: list[CompositionUnit] = [merged]
    sections = [
        _StubSection("MOCK_S1", raw_content="s1"),
        _StubSection("MOCK_S2", raw_content="s2"),
    ]
    lookup = _make_lookup({
        "MOCK_S1": _StubV4Match("MOCK_TMPL_S1", "MOCK_FRM_S1", 1, 0.92, "use_as_is", v4_rank=1),
        "MOCK_S2": _StubV4Match("MOCK_TMPL_S2", "MOCK_FRM_S2", 2, 0.81, "light_edit", v4_rank=1),
    })

    out_units, audit = resplit_all_reject_merges(
        units_pre, sections, lookup, _LABEL_TO_STATUS, _ALLOWED_STATUSES,
        v4_candidates_lookup_fn=_candidates_lookup_empty,
    )

    assert audit["applied"] is True
    assert audit["post_split_layout_preset"] is not None, (
        "applied=True must surface a non-None post_split_layout_preset "
        "for the u5 pipeline re-derive block"
    )
    # Re-derive must match what u5 would compute on the helper-returned units.
    assert audit["post_split_layout_preset"] == select_layout_preset(out_units)
    # post_split_unit_count tracks len(out_units).
    assert audit["post_split_unit_count"] == len(out_units) == 2


# ─── Case 16 : Step 6 artifact serialization for split-help ────────────


def test_step6_artifact_payload_reflects_per_section_singles_after_split():
    """The Step 6 artifact's ``selected_units`` payload (dict-comp at
    ``phase_z2_pipeline.py:4031-4060``) over the post-split out_units
    contains per-section entries — each entry has the section's OWN V4
    evidence (template_id / frame_id / frame_number / label), not the
    merged parent's. Locks the Step 9 / IMP-47B (#76) hand-off shape:
    downstream consumers see per-section units, not the merged blob.
    """
    merged = _make_merged_unit(
        merge_type="parent_merged",
        source_section_ids=["MOCK_S1", "MOCK_S2"],
        label="reject",
    )
    units_pre: list[CompositionUnit] = [merged]
    sections = [
        _StubSection("MOCK_S1", raw_content="s1"),
        _StubSection("MOCK_S2", raw_content="s2"),
    ]
    lookup = _make_lookup({
        "MOCK_S1": _StubV4Match("MOCK_TMPL_S1", "MOCK_FRM_S1", 7, 0.88, "use_as_is", v4_rank=1),
        "MOCK_S2": _StubV4Match("MOCK_TMPL_S2", "MOCK_FRM_S2", 11, 0.79, "light_edit", v4_rank=1),
    })

    out_units, audit = resplit_all_reject_merges(
        units_pre, sections, lookup, _LABEL_TO_STATUS, _ALLOWED_STATUSES,
        v4_candidates_lookup_fn=_candidates_lookup_empty,
    )

    assert audit["applied"] is True
    # Step 6 artifact payload mirror.
    payload = _serialize_units_like_step6_artifact(out_units)
    payload_json = json.dumps(payload, sort_keys=True, ensure_ascii=False)
    # Per-section entries reflect each section's OWN evidence.
    assert len(payload) == 2
    by_sid = {entry["source_section_ids"][0]: entry for entry in payload}
    assert by_sid["MOCK_S1"]["merge_type"] == "single"
    assert by_sid["MOCK_S1"]["frame_template_id"] == "MOCK_TMPL_S1"
    assert by_sid["MOCK_S1"]["frame_id"] == "MOCK_FRM_S1"
    assert by_sid["MOCK_S1"]["frame_number"] == 7
    assert by_sid["MOCK_S1"]["label"] == "use_as_is"
    assert by_sid["MOCK_S1"]["phase_z_status"] == "matched_zone"
    assert by_sid["MOCK_S1"]["selection_path"] == "resplit_from_merge"
    assert by_sid["MOCK_S2"]["merge_type"] == "single"
    assert by_sid["MOCK_S2"]["frame_template_id"] == "MOCK_TMPL_S2"
    assert by_sid["MOCK_S2"]["frame_id"] == "MOCK_FRM_S2"
    assert by_sid["MOCK_S2"]["frame_number"] == 11
    assert by_sid["MOCK_S2"]["label"] == "light_edit"
    assert by_sid["MOCK_S2"]["phase_z_status"] == "adapt_matched_zone"
    assert by_sid["MOCK_S2"]["selection_path"] == "resplit_from_merge"
    # Merged parent's identifiers MUST NOT appear in the post-split payload.
    # ★ feedback_ai_isolation_contract — no frame swap from merged parent.
    assert merged.frame_template_id not in payload_json
    assert merged.frame_id not in payload_json
    # imp48_resplit audit is populated for the pipeline's artifact extension.
    assert audit["applied"] is True
    assert len(audit["split_units"]) == 1


# ─── Case 17 : Mixed pre-hook list — order preserved ───────────────────


def test_split_preserves_order_when_merged_is_sandwiched_between_singles():
    """Pre = [single, merged_reject(2 sections), single]. Post should be
    [single, single_resplit, single_resplit, single] in source order —
    the split inserts in place of the merged unit, surrounding singles
    untouched. Total post count = 4 (within the v0 layout cap)."""
    pre_left = CompositionUnit(
        source_section_ids=["MOCK_S0"],
        merge_type="single",
        frame_template_id="MOCK_TMPL_S0",
        frame_id="MOCK_FRM_S0",
        frame_number=0,
        confidence=0.95,
        label="use_as_is",
        phase_z_status=_LABEL_TO_STATUS["use_as_is"],
        raw_content="s0",
        title="t0",
        v4_rank=1,
        selection_path="rank_1",
    )
    merged = _make_merged_unit(
        merge_type="parent_merged",
        source_section_ids=["MOCK_S1", "MOCK_S2"],
        label="reject",
    )
    pre_right = CompositionUnit(
        source_section_ids=["MOCK_S3"],
        merge_type="single",
        frame_template_id="MOCK_TMPL_S3",
        frame_id="MOCK_FRM_S3",
        frame_number=3,
        confidence=0.95,
        label="use_as_is",
        phase_z_status=_LABEL_TO_STATUS["use_as_is"],
        raw_content="s3",
        title="t3",
        v4_rank=1,
        selection_path="rank_1",
    )
    units_pre: list[CompositionUnit] = [pre_left, merged, pre_right]
    sections = [
        _StubSection("MOCK_S0", raw_content="s0"),
        _StubSection("MOCK_S1", raw_content="s1"),
        _StubSection("MOCK_S2", raw_content="s2"),
        _StubSection("MOCK_S3", raw_content="s3"),
    ]
    lookup = _make_lookup({
        "MOCK_S0": _StubV4Match("MOCK_TMPL_S0", "MOCK_FRM_S0", 0, 0.95, "use_as_is", v4_rank=1),
        "MOCK_S1": _StubV4Match("MOCK_TMPL_S1", "MOCK_FRM_S1", 1, 0.92, "use_as_is", v4_rank=1),
        "MOCK_S2": _StubV4Match("MOCK_TMPL_S2", "MOCK_FRM_S2", 2, 0.81, "light_edit", v4_rank=1),
        "MOCK_S3": _StubV4Match("MOCK_TMPL_S3", "MOCK_FRM_S3", 3, 0.95, "use_as_is", v4_rank=1),
    })

    out_units, audit = resplit_all_reject_merges(
        units_pre, sections, lookup, _LABEL_TO_STATUS, _ALLOWED_STATUSES,
        v4_candidates_lookup_fn=_candidates_lookup_empty,
    )

    assert audit["applied"] is True
    # Order preserved: S0, S1 (split), S2 (split), S3.
    assert [u.source_section_ids for u in out_units] == [
        ["MOCK_S0"], ["MOCK_S1"], ["MOCK_S2"], ["MOCK_S3"],
    ]
    # Surrounding singles untouched (identity preserved).
    assert out_units[0] is pre_left
    assert out_units[-1] is pre_right
    # Only the inner two are split-produced.
    assert out_units[1].selection_path == "resplit_from_merge"
    assert out_units[2].selection_path == "resplit_from_merge"
    # Audit post-split count matches projected 4 (within layout cap).
    assert audit["post_split_unit_count"] == 4


# ═══════════════════════════════════════════════════════════════════════
# u9 — Pipeline split-then-reject regression (mixed reject + non-reject
# children). Scope-lock from Stage 2: coverage preserved + remaining
# reject singles remain eligible for IMP-47B (#76) handoff.
#
# Differs from u8 (split-help): u8 covers the "all children non-reject"
# case where every split-produced single reaches the normal auto-render
# route. u9 covers the harder case where one or more child sections
# carry their OWN rank-1 V4 reject (the section is genuinely difficult
# even individually). IMP-48 must still split when ≥1 child is non-
# reject (the beneficial-split threshold), preserving full coverage and
# letting IMP-47B see PER-SECTION reject singles instead of one merged
# blob.
#
# When ALL children carry own-reject V4, the merged unit is preserved
# (no_beneficial_split) — existing IMP-47B-on-merged-blob behavior is
# the no-op, IMP-48 does not regress it. This is the cleanest split-
# then-reject contract.
#
# All identifiers MOCK_ prefixed (★ RULE_7_no_hardcoding). No real
# catalog template_id / frame_id / MDX sample identifier leaks.
# ═══════════════════════════════════════════════════════════════════════


# ─── Case 18 : Split applied with mixed reject + non-reject children ─────


def test_split_applied_with_mixed_reject_and_non_reject_children():
    """Merged_reject(MOCK_S1, MOCK_S2) where MOCK_S1's OWN rank-1 V4 =
    use_as_is (non-reject) and MOCK_S2's OWN rank-1 V4 = reject. Beneficial-
    split threshold (≥1 non-reject) IS met → ``audit["applied"] is True``,
    out = 2 singles in source order, ``non_reject_count == 1``."""
    merged = _make_merged_unit(
        merge_type="parent_merged",
        source_section_ids=["MOCK_S1", "MOCK_S2"],
        label="reject",
    )
    units_pre: list[CompositionUnit] = [merged]
    sections = [
        _StubSection("MOCK_S1", title="t1", raw_content="raw S1"),
        _StubSection("MOCK_S2", title="t2", raw_content="raw S2"),
    ]
    # MOCK_S1 own rank-1 = use_as_is (auto-renderable).
    # MOCK_S2 own rank-1 = reject (section is genuinely hard even alone).
    lookup = _make_lookup({
        "MOCK_S1": _StubV4Match("MOCK_TMPL_S1", "MOCK_FRM_S1", 1, 0.92, "use_as_is", v4_rank=1),
        "MOCK_S2": _StubV4Match("MOCK_TMPL_S2_REJECT", "MOCK_FRM_S2_REJECT", 2, 0.45, "reject", v4_rank=1),
    })

    out_units, audit = resplit_all_reject_merges(
        units_pre, sections, lookup, _LABEL_TO_STATUS, _ALLOWED_STATUSES,
        v4_candidates_lookup_fn=_candidates_lookup_empty,
    )

    # Beneficial-split threshold met by ≥1 non-reject child.
    assert audit["applied"] is True
    assert "skipped_reason" not in audit, (
        "applied=True path must not carry a skipped_reason value"
    )
    # 2 per-section singles, in source order, merged removed.
    assert len(out_units) == 2
    assert all(u.merge_type == "single" for u in out_units)
    assert [u.source_section_ids for u in out_units] == [["MOCK_S1"], ["MOCK_S2"]]
    # Audit split entry shows mixed count: 1 non-reject, 1 reject.
    assert len(audit["split_units"]) == 1
    assert audit["skipped_units"] == []
    assert audit["split_units"][0]["non_reject_count"] == 1
    assert audit["post_split_unit_count"] == 2
    # Split entry's split_singles audit records each child's resolved label.
    by_sid = {entry["section_id"]: entry for entry in audit["split_units"][0]["split_singles"]}
    assert by_sid["MOCK_S1"]["label"] == "use_as_is"
    assert by_sid["MOCK_S2"]["label"] == "reject"


# ─── Case 19 : Reject single routes to IMP-47B handoff via fallback ──────


def test_reject_split_single_carries_fallback_candidate_phase_z_status():
    """The split-produced single for MOCK_S2 (own rank-1 reject) carries
    ``label="reject"`` AND ``phase_z_status="fallback_candidate"``. The
    non-reject sibling MOCK_S1 routes via its OWN ``matched_zone``. The
    IMP-48 win: IMP-47B (#76) sees PER-SECTION reject singles instead of
    one merged blob containing mixed sections.

    IMP-47B's router reads ``phase_z_status="fallback_candidate"`` (mapped
    from ``label="reject"`` via ``V4_LABEL_TO_PHASE_Z_STATUS`` at
    ``src/phase_z2_pipeline.py:97-103``) to decide
    ``ai_adaptation_required`` (see ``_RECONSTRUCTION_BY_HINT`` at
    ``src/phase_z2_pipeline.py:582``). The handoff contract is per-unit:
    each reject single is an independent IMP-47B input."""
    merged = _make_merged_unit(
        merge_type="parent_merged",
        source_section_ids=["MOCK_S1", "MOCK_S2"],
        label="reject",
    )
    units_pre: list[CompositionUnit] = [merged]
    sections = [
        _StubSection("MOCK_S1", raw_content="raw S1"),
        _StubSection("MOCK_S2", raw_content="raw S2"),
    ]
    lookup = _make_lookup({
        "MOCK_S1": _StubV4Match("MOCK_TMPL_S1", "MOCK_FRM_S1", 1, 0.92, "use_as_is", v4_rank=1),
        "MOCK_S2": _StubV4Match("MOCK_TMPL_S2_REJECT", "MOCK_FRM_S2_REJECT", 2, 0.45, "reject", v4_rank=1),
    })

    out_units, audit = resplit_all_reject_merges(
        units_pre, sections, lookup, _LABEL_TO_STATUS, _ALLOWED_STATUSES,
        v4_candidates_lookup_fn=_candidates_lookup_empty,
    )

    assert audit["applied"] is True
    by_sid = {u.source_section_ids[0]: u for u in out_units}

    # Non-reject sibling routes via its OWN label, NOT fallback_candidate.
    assert by_sid["MOCK_S1"].label == "use_as_is"
    assert by_sid["MOCK_S1"].phase_z_status == "matched_zone"

    # Reject single carries label=reject + phase_z_status=fallback_candidate.
    # This is the per-section handoff signal to IMP-47B (#76).
    assert by_sid["MOCK_S2"].label == "reject"
    assert by_sid["MOCK_S2"].phase_z_status == "fallback_candidate"


# ─── Case 20 : All-children-reject merge — no_beneficial_split skip ──────


def test_all_children_reject_merge_keeps_merged_no_beneficial_split():
    """Both MOCK_S1 and MOCK_S2 have OWN rank-1 V4 with ``label="reject"``.
    Beneficial-split threshold (≥1 non-reject) is NOT met → IMP-48 must
    NOT split. Merged unit preserved → IMP-47B (#76) sees the merged blob
    (existing behavior). IMP-48 is a no-op for this shape — coverage is
    trivially preserved because the merged unit is kept whole.

    Audit fingerprint:
      * ``audit["applied"] is False``
      * ``audit["skipped_reason"] == "no_split_applied"``
      * ``audit["skipped_units"][0]["reason"] == "no_beneficial_split"``
      * ``audit["post_split_layout_preset"] is None`` (u5 re-derive gate
        short-circuits — see ``phase_z2_pipeline.py:3996``)
    """
    merged = _make_merged_unit(
        merge_type="parent_merged",
        source_section_ids=["MOCK_S1", "MOCK_S2"],
        label="reject",
    )
    units_pre: list[CompositionUnit] = [merged]
    sections = [
        _StubSection("MOCK_S1", raw_content="raw S1"),
        _StubSection("MOCK_S2", raw_content="raw S2"),
    ]
    # Both children carry OWN rank-1 reject — no auto-renderable child.
    lookup = _make_lookup({
        "MOCK_S1": _StubV4Match("MOCK_TMPL_S1_REJECT", "MOCK_FRM_S1_REJECT", 1, 0.45, "reject", v4_rank=1),
        "MOCK_S2": _StubV4Match("MOCK_TMPL_S2_REJECT", "MOCK_FRM_S2_REJECT", 2, 0.40, "reject", v4_rank=1),
    })

    out_units, audit = resplit_all_reject_merges(
        units_pre, sections, lookup, _LABEL_TO_STATUS, _ALLOWED_STATUSES,
        v4_candidates_lookup_fn=_candidates_lookup_empty,
    )

    # No-op for all-reject merge — merged unit kept, IMP-47B sees it.
    assert audit["applied"] is False
    assert audit["skipped_reason"] == "no_split_applied"
    assert audit["split_units"] == []
    assert len(audit["skipped_units"]) == 1
    assert audit["skipped_units"][0]["reason"] == "no_beneficial_split"
    assert audit["skipped_units"][0]["merged_source_section_ids"] == ["MOCK_S1", "MOCK_S2"]
    # u5 re-derive gate short-circuits because applied=False.
    assert audit["post_split_layout_preset"] is None
    # Merged unit preserved whole — existing IMP-47B-on-merged-blob behavior.
    assert out_units == [merged]
    assert out_units[0] is merged


# ─── Case 21 : Coverage preserved across mixed children (3-section) ──────


def test_coverage_preserved_when_split_includes_reject_child():
    """★ Stage 1 dropped_zero_invariant — pre = [merged_reject(MOCK_S1,
    MOCK_S2, MOCK_S3)] with 2 non-reject + 1 reject child. Post = 3
    singles (the reject child IS NOT dropped). Set of section_ids
    preserved across pre/post.
    """
    merged = _make_merged_unit(
        merge_type="parent_merged",
        source_section_ids=["MOCK_S1", "MOCK_S2", "MOCK_S3"],
        label="reject",
    )
    units_pre: list[CompositionUnit] = [merged]
    sections = [
        _StubSection("MOCK_S1", raw_content="raw S1"),
        _StubSection("MOCK_S2", raw_content="raw S2"),
        _StubSection("MOCK_S3", raw_content="raw S3"),
    ]
    lookup = _make_lookup({
        "MOCK_S1": _StubV4Match("MOCK_TMPL_S1", "MOCK_FRM_S1", 1, 0.92, "use_as_is", v4_rank=1),
        "MOCK_S2": _StubV4Match("MOCK_TMPL_S2", "MOCK_FRM_S2", 2, 0.81, "light_edit", v4_rank=1),
        "MOCK_S3": _StubV4Match("MOCK_TMPL_S3_REJECT", "MOCK_FRM_S3_REJECT", 3, 0.40, "reject", v4_rank=1),
    })

    pre_sids = {sid for u in units_pre for sid in u.source_section_ids}
    out_units, audit = resplit_all_reject_merges(
        units_pre, sections, lookup, _LABEL_TO_STATUS, _ALLOWED_STATUSES,
        v4_candidates_lookup_fn=_candidates_lookup_empty,
    )

    assert audit["applied"] is True
    # Reject child IS NOT dropped. All 3 sections present post-split.
    post_sids = {sid for u in out_units for sid in u.source_section_ids}
    assert pre_sids == post_sids == {"MOCK_S1", "MOCK_S2", "MOCK_S3"}
    assert len(out_units) == 3
    # No duplicate / no drop — each section appears in exactly one single.
    assert len([sid for u in out_units for sid in u.source_section_ids]) == 3
    # Audit: 2 non-reject + 1 reject, applied=True.
    assert audit["split_units"][0]["non_reject_count"] == 2
    assert audit["post_split_unit_count"] == 3


# ─── Case 22 : No frame swap on reject single ────────────────────────────


def test_reject_split_single_uses_own_v4_evidence_no_frame_swap():
    """★ feedback_ai_isolation_contract — the reject split-produced
    single's frame_template_id / frame_id / frame_number come from its
    OWN ``v4_lookup_fn(sid)`` (a reject-labelled V4 evidence), NOT the
    merged parent's reject template_id and NOT the non-reject sibling's
    template_id.
    """
    merged = _make_merged_unit(
        merge_type="parent_merged",
        source_section_ids=["MOCK_S1", "MOCK_S2"],
        label="reject",
        template_id="MOCK_TMPL_PARENT_REJECT",
    )
    units_pre: list[CompositionUnit] = [merged]
    sections = [
        _StubSection("MOCK_S1", raw_content="raw S1"),
        _StubSection("MOCK_S2", raw_content="raw S2"),
    ]
    lookup = _make_lookup({
        "MOCK_S1": _StubV4Match("MOCK_TMPL_S1", "MOCK_FRM_S1", 7, 0.92, "use_as_is", v4_rank=1),
        "MOCK_S2": _StubV4Match("MOCK_TMPL_S2_REJECT", "MOCK_FRM_S2_REJECT", 13, 0.45, "reject", v4_rank=1),
    })

    out_units, audit = resplit_all_reject_merges(
        units_pre, sections, lookup, _LABEL_TO_STATUS, _ALLOWED_STATUSES,
        v4_candidates_lookup_fn=_candidates_lookup_empty,
    )

    assert audit["applied"] is True
    by_sid = {u.source_section_ids[0]: u for u in out_units}

    # Reject single carries its OWN V4 reject evidence, NOT merged parent's
    # template_id and NOT the non-reject sibling's template_id.
    reject_single = by_sid["MOCK_S2"]
    assert reject_single.frame_template_id == "MOCK_TMPL_S2_REJECT"
    assert reject_single.frame_id == "MOCK_FRM_S2_REJECT"
    assert reject_single.frame_number == 13
    # No swap from merged parent.
    assert reject_single.frame_template_id != merged.frame_template_id
    assert reject_single.frame_id != merged.frame_id
    assert reject_single.frame_number != merged.frame_number
    # No swap from non-reject sibling.
    non_reject_single = by_sid["MOCK_S1"]
    assert reject_single.frame_template_id != non_reject_single.frame_template_id
    assert reject_single.frame_id != non_reject_single.frame_id


# ─── Case 23 : selection_path tagging covers reject singles too ──────────
|
|
|
|
|
|
def test_selection_path_tag_applies_to_reject_split_singles_too():
|
|
"""Stage 1 Q3 YES — every split-produced single, INCLUDING the one
|
|
with own-reject label, has ``selection_path == "resplit_from_merge"``.
|
|
The telemetry tag is uniform across mixed-children splits."""
|
|
merged = _make_merged_unit(
|
|
merge_type="parent_merged",
|
|
source_section_ids=["MOCK_S1", "MOCK_S2"],
|
|
label="reject",
|
|
)
|
|
units_pre: list[CompositionUnit] = [merged]
|
|
sections = [
|
|
_StubSection("MOCK_S1", raw_content="raw S1"),
|
|
_StubSection("MOCK_S2", raw_content="raw S2"),
|
|
]
|
|
lookup = _make_lookup({
|
|
"MOCK_S1": _StubV4Match("MOCK_TMPL_S1", "MOCK_FRM_S1", 1, 0.92, "use_as_is", v4_rank=1),
|
|
"MOCK_S2": _StubV4Match("MOCK_TMPL_S2_REJECT", "MOCK_FRM_S2_REJECT", 2, 0.45, "reject", v4_rank=1),
|
|
})
|
|
|
|
out_units, audit = resplit_all_reject_merges(
|
|
units_pre, sections, lookup, _LABEL_TO_STATUS, _ALLOWED_STATUSES,
|
|
v4_candidates_lookup_fn=_candidates_lookup_empty,
|
|
)
|
|
|
|
assert audit["applied"] is True
|
|
# Both split-produced singles carry the IMP-48 telemetry tag — uniform.
|
|
by_sid = {u.source_section_ids[0]: u for u in out_units}
|
|
assert by_sid["MOCK_S1"].selection_path == "resplit_from_merge"
|
|
assert by_sid["MOCK_S2"].selection_path == "resplit_from_merge"
|
|
|
|
# ─── Case 24 : Raw content preservation across reject + non-reject ───────


def test_raw_content_preserved_across_reject_and_non_reject_split_singles():
    """★ MDX_raw_content_invariant — both the reject single and the non-
    reject single carry their OWN section's raw_content (from
    ``sections[sid]``), NOT the merged parent's joined blob. The reject
    single's raw_content is the input IMP-47B (#76) AI restructure reads
    — per-section, not merged blob.
    """
    merged_raw = "MERGED BLOB — joined from children, must NOT leak to singles"
    merged = CompositionUnit(
        source_section_ids=["MOCK_S1", "MOCK_S2"],
        merge_type="parent_merged",
        frame_template_id="MOCK_TMPL_PARENT_REJECT",
        frame_id="MOCK_FRM_PARENT_REJECT",
        frame_number=99,
        confidence=0.10,
        label="reject",
        phase_z_status=_LABEL_TO_STATUS["reject"],
        raw_content=merged_raw,
        title="MOCK_PARENT",
    )
    units_pre: list[CompositionUnit] = [merged]
    sections = [
        _StubSection("MOCK_S1", title="title-1", raw_content="section S1 ORIGINAL text"),
        _StubSection("MOCK_S2", title="title-2", raw_content="section S2 ORIGINAL text"),
    ]
    lookup = _make_lookup({
        "MOCK_S1": _StubV4Match("MOCK_TMPL_S1", "MOCK_FRM_S1", 1, 0.92, "use_as_is", v4_rank=1),
        "MOCK_S2": _StubV4Match("MOCK_TMPL_S2_REJECT", "MOCK_FRM_S2_REJECT", 2, 0.45, "reject", v4_rank=1),
    })

    out_units, audit = resplit_all_reject_merges(
        units_pre, sections, lookup, _LABEL_TO_STATUS, _ALLOWED_STATUSES,
        v4_candidates_lookup_fn=_candidates_lookup_empty,
    )

    assert audit["applied"] is True
    by_sid = {u.source_section_ids[0]: u for u in out_units}
    # Non-reject single keeps its OWN section's raw_content.
    assert by_sid["MOCK_S1"].raw_content == "section S1 ORIGINAL text"
    assert by_sid["MOCK_S1"].title == "title-1"
    # Reject single ALSO keeps its OWN section's raw_content — per-section
    # input for IMP-47B (#76), NOT merged blob.
    assert by_sid["MOCK_S2"].raw_content == "section S2 ORIGINAL text"
    assert by_sid["MOCK_S2"].title == "title-2"
    # Merged blob MUST NOT appear in any single's raw_content.
    for single in out_units:
        assert merged_raw not in single.raw_content
|
|
# ─── Case 25 : Step 6 artifact payload for split-then-reject ─────────────


def test_step6_artifact_payload_shows_per_section_handoff_for_split_then_reject():
    """The Step 6 artifact's ``selected_units`` payload (dict-comp at
    ``phase_z2_pipeline.py:4031-4060``) over the post-split out_units
    contains per-section entries. The reject single's payload entry has
    ``label="reject"`` + ``phase_z_status="fallback_candidate"`` (per-
    section IMP-47B handoff signal). The non-reject single's entry has
    its OWN ``matched_zone`` / ``adapt_matched_zone``. The merged
    parent's identifiers MUST NOT appear in the payload.
    """
    merged = _make_merged_unit(
        merge_type="parent_merged",
        source_section_ids=["MOCK_S1", "MOCK_S2"],
        label="reject",
    )
    units_pre: list[CompositionUnit] = [merged]
    sections = [
        _StubSection("MOCK_S1", raw_content="raw S1"),
        _StubSection("MOCK_S2", raw_content="raw S2"),
    ]
    lookup = _make_lookup({
        "MOCK_S1": _StubV4Match("MOCK_TMPL_S1", "MOCK_FRM_S1", 7, 0.88, "use_as_is", v4_rank=1),
        "MOCK_S2": _StubV4Match("MOCK_TMPL_S2_REJECT", "MOCK_FRM_S2_REJECT", 13, 0.45, "reject", v4_rank=1),
    })

    out_units, audit = resplit_all_reject_merges(
        units_pre, sections, lookup, _LABEL_TO_STATUS, _ALLOWED_STATUSES,
        v4_candidates_lookup_fn=_candidates_lookup_empty,
    )

    assert audit["applied"] is True
    # Step 6 artifact payload mirror.
    payload = _serialize_units_like_step6_artifact(out_units)
    payload_json = json.dumps(payload, sort_keys=True, ensure_ascii=False)

    assert len(payload) == 2
    by_sid = {entry["source_section_ids"][0]: entry for entry in payload}
    # Non-reject single's payload — matched_zone (auto-renderable).
    assert by_sid["MOCK_S1"]["merge_type"] == "single"
    assert by_sid["MOCK_S1"]["frame_template_id"] == "MOCK_TMPL_S1"
    assert by_sid["MOCK_S1"]["label"] == "use_as_is"
    assert by_sid["MOCK_S1"]["phase_z_status"] == "matched_zone"
    assert by_sid["MOCK_S1"]["selection_path"] == "resplit_from_merge"
    # Reject single's payload — fallback_candidate (IMP-47B handoff target).
    assert by_sid["MOCK_S2"]["merge_type"] == "single"
    assert by_sid["MOCK_S2"]["frame_template_id"] == "MOCK_TMPL_S2_REJECT"
    assert by_sid["MOCK_S2"]["label"] == "reject"
    assert by_sid["MOCK_S2"]["phase_z_status"] == "fallback_candidate"
    assert by_sid["MOCK_S2"]["selection_path"] == "resplit_from_merge"
    # Merged parent's identifiers MUST NOT appear in the payload.
    assert merged.frame_template_id not in payload_json
    assert merged.frame_id not in payload_json
    # Audit reflects the mixed-children split.
    assert audit["split_units"][0]["non_reject_count"] == 1
    assert audit["post_split_unit_count"] == 2