IMP-20 — Phase Q content_verifier Frame Contract Validation Pattern Reference
Status: documented (reference-only, dormant)
Scope: doc-only. No runtime surface modified.
Related issue: #20
Soft dependency: IMP-04 (extended catalog application) — IMP-20 stays dormant; activates only via the A5 gate.
Source axis: INSIGHT-MAP §3 / §2.7 H2 — content_verifier.verify_structure pattern reference.
A1 — Phase Q consumer pattern (read-only reference)
Phase Q implements area-level required-pattern validation at the content-verifier layer. References (do not modify):
- `src/content_verifier.py:382-392`, `REQUIRED_PATTERNS: dict[str, list[str]]`: top-level pattern dictionary keyed by area name (`body_bg`, `body_core`, `sidebar`, `footer`). Values verified: `body_bg=[]`, `body_core=["key-msg"]`, `sidebar=["padding-left", "text-indent"]`, `footer=[]`. Phase T (`L379-381` comment) removed the `overflow:hidden` requirement to reconcile with the Phase T prompt's "overflow:hidden 금지" ("overflow:hidden prohibited") directive; that no-regression boundary is preserved.
- `src/content_verifier.py:395-448`, `verify_structure(generated_html, area_name, has_image=False, font_hierarchy=None) → VerificationResult`: the substring-check + OR + tolerance core logic.
  - `:405-412`: substring presence loop. Each pattern string is split on `|` (`pattern.split("|")` at L410) and treated as an OR alternation: any alternative present passes the pattern. Missing alternatives are appended to a `missing` list.
  - `:414-416`: `has_image` branch. When `has_image=True` and `area_name == "body_core"`, an additional implicit requirement is enforced: `"slide-img-"` must appear in `generated_html`. A missing image marker is reported as `"slide-img-* (이미지 태그)"` ("image tag") in `missing`.
  - `:418-436`: `font_hierarchy` branch. When supplied, the area-name → max-font lookup uses a fixed `role_font_map = {"body_bg":bg/11, "body_core":core/12, "sidebar":sidebar/10, "footer":core/12}`. HTML `font-size:\s*(\d+(?:\.\d+)?)\s*px` matches are extracted via regex (L430); each measured size > `max_font + 1` (1px tolerance at L433) emits a `font_warnings` entry. Warnings do not flip `passed`.
  - `:438-447`: result construction. `passed = (len(missing) == 0)`. `score = 1.0` on pass, else `1.0 - len(missing) / max(1, len(patterns))` (continuous degradation; `max(1, …)` guards empty-pattern division by zero). Errors are prefixed `"필수 패턴 누락: "` ("required pattern missing: "). Warnings carry font hierarchy violations only.
- `src/content_verifier.py:455-487`, `verify_area(original_text, generated_html, area_name, has_image=False) → VerificationResult`: composes L1 (`verify_text_preservation`) + L2 (`verify_no_forbidden_content`) + L3 (`verify_structure`) at L462-466. The `verify_structure` call at L465 passes `has_image` but not `font_hierarchy` (`font_hierarchy` is unused inside `verify_area`).
- `src/content_verifier.py:490-529`, `verify_all_areas(generated, area_texts, has_image_areas=None)`: area dispatch fan-out. `body_html` is split into `body_bg` + `body_core` (L510-519); `body_core` is the only branch that propagates `has_image=("body_core" in has_image_areas)` to `verify_area` (L518). `sidebar_html` (L521-525) and `footer_html` (L527-531) call `verify_area` with default `has_image=False`.
Classification: area-level (Phase Q HTML area axis) required-pattern validation at content-verifier time. Not Phase Z frame_id × sub_zone contract validation.
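The behaviour described in A1 can be condensed into a short sketch. This is a reconstruction from the description above, not a copy of `src/content_verifier.py`: the `VerificationResult` field names, the `font_hierarchy` keys (`bg`, `core`, `sidebar`), the warning text, the handling of unknown area names, and the choice to append the whole pattern string to `missing` are all assumptions made for illustration.

```python
import re
from dataclasses import dataclass, field

# Values as documented for src/content_verifier.py:382-392 (Phase Q only).
REQUIRED_PATTERNS: dict[str, list[str]] = {
    "body_bg": [],
    "body_core": ["key-msg"],
    "sidebar": ["padding-left", "text-indent"],
    "footer": [],
}

FONT_SIZE_RE = re.compile(r"font-size:\s*(\d+(?:\.\d+)?)\s*px")


@dataclass
class VerificationResult:  # assumed shape; the real class may differ
    passed: bool
    score: float
    errors: list[str] = field(default_factory=list)
    warnings: list[str] = field(default_factory=list)


def verify_structure(generated_html, area_name, has_image=False, font_hierarchy=None):
    patterns = REQUIRED_PATTERNS.get(area_name, [])  # unknown area -> no patterns (assumption)
    missing = []
    for pattern in patterns:
        # "a|b" is an OR alternation: any alternative present passes the pattern.
        if not any(alt in generated_html for alt in pattern.split("|")):
            missing.append(pattern)

    # Implicit requirement injected by the external has_image flag.
    if has_image and area_name == "body_core" and "slide-img-" not in generated_html:
        missing.append("slide-img-* (이미지 태그)")

    font_warnings = []
    if font_hierarchy:
        # Assumed reading of role_font_map: hierarchy key with a default size.
        role_font_map = {
            "body_bg": font_hierarchy.get("bg", 11),
            "body_core": font_hierarchy.get("core", 12),
            "sidebar": font_hierarchy.get("sidebar", 10),
            "footer": font_hierarchy.get("core", 12),
        }
        max_font = role_font_map.get(area_name, 12)
        for raw in FONT_SIZE_RE.findall(generated_html):
            size = float(raw)
            if size > max_font + 1:  # 1px tolerance
                font_warnings.append(f"font-size {size}px exceeds {max_font}px")

    passed = len(missing) == 0
    score = 1.0 if passed else 1.0 - len(missing) / max(1, len(patterns))
    errors = ["필수 패턴 누락: " + ", ".join(missing)] if missing else []
    return VerificationResult(passed, score, errors, font_warnings)
```

For example, a `sidebar` fragment that contains `padding-left` but not `text-indent` misses one of two patterns, so under this sketch it fails with a score of 0.5, while any font-size overshoot only lands in `warnings`.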
A2 — Phase Q REQUIRED_PATTERNS shape (read-only reference)
The Phase Q pattern-dict shape — values are Phase Q-specific and excluded from reuse; only the shape is Phase Z design input.
| Axis | Phase Q shape | Where observed |
|---|---|---|
| Key axis | area name (string) | `src/content_verifier.py:382` keys: `body_bg` / `body_core` / `sidebar` / `footer` |
| Value type | `list[str]` of substring patterns | `src/content_verifier.py:383-391` |
| Alternation semantics | `"a\|b"` → OR (any alt passes) via `pattern.split("\|")` | `src/content_verifier.py:410` |
| Image-conditional branch | `has_image=True` ∧ `area_name=="body_core"` → implicit `"slide-img-"` requirement | `src/content_verifier.py:414-416` |
| Font hierarchy tolerance | 1px (`fs > max_font + 1`); area-name → max-font fixed lookup | `src/content_verifier.py:433`, `:421-426` |
| Pass/score rule | `passed = (missing == [])`; score = continuous degradation `1.0 - len(missing)/max(1, len(patterns))` | `src/content_verifier.py:438`, `:445` |
| Empty-pattern handling | `max(1, len(patterns))` guards divide-by-zero; empty pattern list always passes | `src/content_verifier.py:445`, `:382-383` (`body_bg=[]`) |
Shape-only carry-over candidates for Phase Z design (see A3 in u2):
- `dict[key]` → `list[pattern]` indirection.
- OR via in-string `|` separator (low-ceremony alternation).
- Conditional implicit requirement injected by external context flag (here `has_image`; in Phase Z potentially `accepted_content_types` per sub_zone).
- Continuous score degradation rather than binary pass/fail (downstream consumers can threshold).
- Separate `errors` (block) vs `warnings` (advisory) lanes: font hierarchy lives in warnings, not errors.
Values that must not carry into Phase Z: the literal strings "key-msg", "padding-left", "text-indent", "slide-img-", and the area names body_bg / body_core / sidebar / footer themselves — these are Phase Q area-HTML idioms, not Phase Z frame/slot idioms.
A3 — Phase Z target pattern dict (design input, not yet active)
The Phase Z-native target axis = frame_id × sub_zone pattern dict, aligned with templates/phase_z2/catalog/frame_contracts.yaml. References (do not modify):
- `templates/phase_z2/catalog/frame_contracts.yaml`:
  - `:21` `three_parallel_requirements` (F13, 3 sub_zones)
  - `:77` `process_product_two_way` (F29, 2 sub_zones × strict 3 cardinality)
  - `:128` `bim_issues_quadrant_four` (F16, 4 sub_zones)
  - `:189` `three_persona_benefits` (F14, 3 sub_zones)
  - `:253` `construction_goals_three_circle_intersection` (F12, 3+1 sub_zones; `intersection` is `min:0`, `max:1`)
  - `:323` `construction_bim_three_usage` (F11, 3 sub_zones)
  - `:391` `bim_dx_comparison_table` (F18, 2 header + 1 `rows` with `min:1`, `max:12`)
  - `:456` `dx_sw_necessity_three_perspectives` (F20, 3 sub_zones)
  - `:520` `info_management_what_how_when` (F8, 3 sub_zones)
  - `:580` `sw_reality_three_emphasis` (F28, 3 sub_zones)
  - `:637` `bim_current_problems_paired` (F17, 8 sub_zones; row × side 2-axis)
- All 11 contracts carry `accepted_content_types` + `sub_zones`; field `density_envelope` is absent across the catalog (verified: `grep -c "density_envelope" templates/phase_z2/catalog/frame_contracts.yaml` = 0).
- `src/phase_z2_mapper.py:49-57`, `load_frame_contracts` / `get_contract`: direct dict lookup against the 11 entries above.
- `src/phase_z2_pipeline.py:3776-3805`, Step 10 emit: currently surfaces `frame_id` / `family` / `source_shape` / `cardinality` / `visual_hints` / `accepted_content_types` / `sub_zones` / `payload_builder` / `payload_builder_options` to `step10_frame_contract.json` with `step_status="partial"`. No pattern-dict assertion runs against this payload yet.
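As a quick cross-check of the inventory above, the per-contract sub_zone counts can be tallied directly. The dictionary below is hand-copied from this list rather than loaded from `frame_contracts.yaml`, and keying it by contract name is only an assumption made for readability.

```python
# Sub-zone counts per contract, copied by hand from the inventory above.
SUB_ZONE_COUNTS = {
    "three_parallel_requirements": 3,                   # F13
    "process_product_two_way": 2,                       # F29
    "bim_issues_quadrant_four": 4,                      # F16
    "three_persona_benefits": 3,                        # F14
    "construction_goals_three_circle_intersection": 4,  # F12: 3 + optional intersection
    "construction_bim_three_usage": 3,                  # F11
    "bim_dx_comparison_table": 3,                       # F18: 2 header + rows
    "dx_sw_necessity_three_perspectives": 3,            # F20
    "info_management_what_how_when": 3,                 # F8
    "sw_reality_three_emphasis": 3,                     # F28
    "bim_current_problems_paired": 8,                   # F17: row x side
}

total_contracts = len(SUB_ZONE_COUNTS)
total_sub_zones = sum(SUB_ZONE_COUNTS.values())
print(total_contracts, total_sub_zones)  # prints: 11 39
```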
Abstraction-mismatch table (Phase Q area-level vs Phase Z frame/slot-level):
| Axis | Phase Q (A1+A2) | Phase Z target (A3) |
|---|---|---|
| Key | area name (`body_bg`/`body_core`/`sidebar`/`footer`) | `(frame_id, sub_zone_id)` tuple, e.g. `(1171281190, "pillar_1")` |
| Cardinality of keys | 4 fixed area names | open over 11 contracts × N sub_zones (3+2+4+3+4+3+3+3+3+3+8 = 39 sub_zones in current catalog) |
| Value semantics | substring presence (HTML-string match) | candidates: substring presence and/or contract-field assertion (`cardinality.strict` / `accepts` membership / `partial_target_path` resolution) |
| Conditional branch input | `has_image` external flag | `accepted_content_types` per sub_zone (catalog-driven, not external flag) |
| Tolerance | 1px on font-size (single axis) | candidates: font-size 1px tolerance carried over or replaced by `visual_hints.min_height_px` envelope check |
| Validation timing | post-render HTML (`generated_html` string) | post Step 18 final.html (mirrors Phase Q timing); Step 12 light_edit/restructure proposal is excluded (proposal is upstream of render) |
| Result lanes | `errors` (block) + `warnings` (advisory) | preserved as-is from Phase Q shape (continuous score; separate font-hierarchy warnings) |
Classification: Phase Q area axis ⇄ Phase Z frame/slot axis are not drop-in compatible. The shape (dict indirection + OR alternation + tolerance + conditional implicit-requirement + continuous score) is the only portable element; every value (key strings, area names, literal patterns) is Phase Q-local.
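To make the portable shape concrete, here is a minimal sketch of a dict keyed by `(frame_id, sub_zone_id)` that keeps the OR alternation, a catalog-driven conditional requirement, and the continuous score. Everything in it is hypothetical: the names `PHASE_Z_PATTERNS` and `verify_slot`, the pattern strings, and the rule that a slot accepting `transform_table` must contain `<table` are invented only to show where each element would sit. No Phase Q value is reused.

```python
from typing import NamedTuple

SlotKey = tuple[int, str]  # (frame_id, sub_zone_id)

# Hypothetical pattern values for illustration only.
PHASE_Z_PATTERNS: dict[SlotKey, list[str]] = {
    (1171281190, "pillar_1"): ['data-slot="pillar_1"|id="pillar_1"', "pillar-title"],
}


class SlotResult(NamedTuple):
    passed: bool
    score: float
    missing: list[str]


def verify_slot(final_html: str, key: SlotKey, accepts: list[str]) -> SlotResult:
    patterns = list(PHASE_Z_PATTERNS.get(key, []))
    # Conditional requirement driven by the catalog's accepts list rather than
    # by an external flag (the concrete rule here is an invented placeholder).
    if "transform_table" in accepts:
        patterns.append("<table")
    # "a|b" keeps the low-ceremony OR alternation: any alternative present passes.
    missing = [p for p in patterns if not any(alt in final_html for alt in p.split("|"))]
    # The implicit pattern is counted in the denominator, so the score stays in [0, 1].
    score = 1.0 - len(missing) / max(1, len(patterns))
    return SlotResult(not missing, score, missing)
```

Counting the implicit requirement in the denominator is a deliberate choice in this sketch; whether a real Phase Z dict should do so is an open design detail.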
A4 — IMP-04 soft-link boundary (catalog vs validation ownership)
IMP-20 is recorded as `soft link: IMP-04` in the backlog (`docs/architecture/PHASE-Z-IMPLEMENTATION-ISSUE-BACKLOG.md:71`). Ownership separation:
- IMP-04 owns: every `frame_contracts.yaml` entry, including addition / removal / `accepted_content_types` change / `sub_zones` schema change / `cardinality` change / `visual_hints` change. `templates/phase_z2/catalog/frame_contracts.yaml` is the IMP-04 source of truth.
- IMP-20 owns: reference-only documentation of the Phase Q pattern-dict shape (A1 + A2) and the Phase Z target axis design narrative (A3). No catalog edits, no Step 10 promotion.
- Coupling direction: one-way read. A Phase Z pattern dict (if/when activated through the A5 gate) consumes `frame_contracts.yaml` as input. It does not publish back into the catalog. IMP-04 is unaware of IMP-20.
- No bidirectional code flow: IMP-20 does not move Phase Q `content_verifier.py` code into Phase Z, and IMP-04 does not consume `REQUIRED_PATTERNS`. The two surfaces remain isolated.
- Reference direction is one-way: this document points read-only at `src/content_verifier.py`, `src/phase_z2_mapper.py`, `src/phase_z2_pipeline.py`, and `templates/phase_z2/catalog/frame_contracts.yaml`. No reverse pointer is required in those source files.
If IMP-04 alters the catalog schema (e.g. adds density_envelope or renames sub_zones), A3 must be re-verified (key axis and conditional-branch row in particular). The boundary statement itself does not change.
A5 — Re-activation gate + guardrails
IMP-20 is documented (dormant). Re-activation requires all of the following gate conditions (3-cond AND):
- Trigger: Phase Z Step 10 produces a verifiable case where the partial frame-contract emit alone is insufficient, i.e. a final.html regression that a frame_id × sub_zone pattern dict would have caught (missing slot marker, contract field violation, font-hierarchy breach against a sub_zone-resolved max). The trigger must be a regression that maps cleanly to the frame/slot axis, not to a higher layer (composition planning, content adapter, render-time CSS).
- Evidence requirement: failing-case MDX + `step10_frame_contract.json` trace + final.html excerpt with the slot path that should have asserted, attached to a new issue or this issue's reopened state.
- IMP-04 sign-off: the IMP-04 owner confirms the failing case is not addressable inside the catalog (e.g. tightening `cardinality` or `accepted_content_types` does not resolve it). Only then is a Phase Z-native pattern dict justified.
Design questions resolved in this document (revisit if the gate fires):
- Q1 - Key granularity: `(frame_id, sub_zone_id)`. Frame-only granularity is insufficient because contracts with `sub_zones` of differing `accepts` (e.g. F29 `process_column` accepts `[text_block, transform_table]` vs `product_column` accepts `[text_block]`) require slot-level differentiation.
- Q2 - Value type: hybrid. Substring patterns (Phase Q parity) plus contract-field assertions (`cardinality.strict` / `accepts` membership / `partial_target_path` resolved in DOM) plus numeric tolerance (carried from font-hierarchy 1px). Three lanes preserved separately so each can fail/pass independently.
- Q3 - Validation timing: post Step 18 final.html only. Step 12 light_edit/restructure proposal is upstream of render and exposes no HTML for substring assertion; running the dict there would either fire false negatives (no DOM yet) or duplicate Step 18 work.
- Q4 - Font-hierarchy carry-over: replaced. Phase Q's `role_font_map` fixed dict (area → max-font) is Phase Q-local. The Phase Z equivalent reads from `frame_contracts.yaml` `visual_hints` (`min_height_px` already present; a future `max_font_px` field would live in `visual_hints` and is IMP-04-owned). The 1px tolerance shape is portable; the lookup source is replaced.
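A rough sketch of how the Q2 lanes and the Q4 lookup could stay independent follows. It assumes a `visual_hints` mapping that carries the future `max_font_px` field mentioned in Q4 (not present in the catalog today); `LaneReport`, `check_contract_lane`, and `check_font_lane` are hypothetical names, and only the `accepts` membership assertion is shown for the contract lane.

```python
import re
from dataclasses import dataclass, field

FONT_SIZE_RE = re.compile(r"font-size:\s*(\d+(?:\.\d+)?)\s*px")


@dataclass
class LaneReport:
    """Three lanes kept separate so each can pass or fail on its own."""
    substring_missing: list[str] = field(default_factory=list)
    contract_violations: list[str] = field(default_factory=list)
    font_warnings: list[str] = field(default_factory=list)


def check_contract_lane(content_type: str, accepts: list[str]) -> list[str]:
    # Contract-field assertion: accepts membership for one sub_zone.
    if content_type in accepts:
        return []
    return [f"{content_type} not in accepts {accepts}"]


def check_font_lane(final_html: str, visual_hints: dict, tolerance_px: float = 1.0) -> list[str]:
    # Lookup source is the catalog's visual_hints, not a fixed role_font_map.
    max_font = visual_hints.get("max_font_px")  # hypothetical future field
    if max_font is None:
        return []  # nothing to compare against, so the lane stays silent
    sizes = (float(raw) for raw in FONT_SIZE_RE.findall(final_html))
    return [f"{size}px > {max_font}px" for size in sizes if size > max_font + tolerance_px]


report = LaneReport(
    contract_violations=check_contract_lane("transform_table", ["text_block"]),
    font_warnings=check_font_lane('<p style="font-size: 14px">x</p>', {"max_font_px": 12}),
)
```

Here the contract lane and the font lane each report one finding while the substring lane stays empty, which is the independence Q2 asks for.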
Guardrails (preserved from Stage 1 + Stage 2):
- GR1 - Shape-only reference: no Phase Q `REQUIRED_PATTERNS` value (`"key-msg"`, `"padding-left"`, `"text-indent"`, `"slide-img-"`) or area name (`body_bg` / `body_core` / `sidebar` / `footer`) may appear in any Phase Z pattern dict activation.
- GR2 - Phase Q no-regression: `src/content_verifier.py:382-392` `REQUIRED_PATTERNS` is no-touch. The Phase T `L379-381` comment (overflow:hidden removed) remains the no-regression boundary; any Phase Z dict design must not re-introduce removed patterns into Phase Q's surface.
- GR3 - Phase Z dict is Phase Z-owned: no `import` of `content_verifier.REQUIRED_PATTERNS` from Phase Z code. The two pattern dicts coexist without symbol sharing.
- GR4 - IMP-04 soft-link one-way: per § A4. Activating IMP-20 must not block on or modify IMP-04; the catalog is read-only input.
- PZ-1 - AI isolation contract: pattern dict is code/spec, not AI-generated content. No Kei rewrite, no LLM proposal of pattern values (`feedback_ai_isolation_contract`).
- RULE 13 - Anchor sync: any future activation must update backlog (`PHASE-Z-IMPLEMENTATION-ISSUE-BACKLOG.md`), status board (`PHASE-Z-PIPELINE-STATUS-BOARD.md`), and INSIGHT-MAP (`PHASE-Q-INSIGHT-TO-22STEP-MAP.md`) in the same commit.
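GR1 lends itself to a mechanical check. The sketch below assumes a Phase Z dict keyed by `(frame_id, sub_zone_id)` with `|`-separated pattern strings; the function name `gr1_violations` is hypothetical. In keeping with GR3, the Phase Q literals are restated locally instead of being imported from `content_verifier`.

```python
# Phase Q literals restated here on purpose: GR3 forbids importing them.
PHASE_Q_LITERALS = {"key-msg", "padding-left", "text-indent", "slide-img-"}
PHASE_Q_AREA_NAMES = {"body_bg", "body_core", "sidebar", "footer"}


def gr1_violations(pattern_dict: dict[tuple[int, str], list[str]]) -> list[str]:
    """Return a description of every Phase Q value that leaked into the dict."""
    violations = []
    for (frame_id, sub_zone_id), patterns in pattern_dict.items():
        if sub_zone_id in PHASE_Q_AREA_NAMES:
            violations.append(f"({frame_id}, {sub_zone_id}): Phase Q area name used as key")
        for pattern in patterns:
            for alt in pattern.split("|"):
                # Containment check: an alternative embedding a literal also counts.
                for literal in sorted(PHASE_Q_LITERALS):
                    if literal in alt:
                        violations.append(
                            f"({frame_id}, {sub_zone_id}): pattern {alt!r} contains {literal!r}"
                        )
    return violations
```

A check like this could run as a unit test next to any future Phase Z dict, so that a leaked Phase Q value fails fast rather than surfacing in review.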
If IMP-04 alters the catalog schema or src/content_verifier.py is rewritten upstream, A1–A3 must be re-verified (file:line refs); the A5 gate itself does not change.