C.E.L_Slide_test2

Kyeongmin/C.E.L_Slide_test2

Author	SHA1	Message	Date
kyeongmin	6e9e3ee1fb	fix(#94 ): IMP-94 u7 regression-harness SHA parity normalization for additive Layer A markers Strip the two additive IMP-94 attributes (data-region-id, data-content-unit-id) symmetrically at both the 89-a fixture capture script and the b4 mapper source SHA parity test before SHA-256 hashing, honoring the issue body guardrail "final.html SHA of mdx 01-05 = byte-equivalent except for new data-* attrs" without recapturing the pre-89-a baseline. The strip regex is anchored on the leading-space + attr-token shape emitted by src/region_marker_stamper.py:131-135 so the #96 data-frame-slot-id axis stays disjoint. The marker-parity cross-axis tests for emergency_p4b_verbatim_code and emergency_p4_ai_inline append sites are converted from pytest.skip to vacuous-truth early return when the Emergency P4/P4b anchors are absent in HEAD — the assertion target does not exist in IMP-94 scope, but the contract still locks placement_markers=[] when the Emergency axis lands later. Refreshed 89a_pre_baseline_sha.json (2026-05-27T04:19:30Z) holds the normalized sizes/SHAs for mdx 01-05 post-stamper. Scope: regression harness + fixture only; zero src/ edits. Verified 35/35 marker-parity + 18/18 SHA parity in a clean detached worktree at HEAD `2afedfc` with these four files applied. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-27 14:09:26 +09:00
kyeongmin	5484077a53	feat(#94 ): IMP-94 u1~u6 Layer A region/content marker injection (stamper + render_slide chain + 4 zones_data.append placement_markers + 35 parity tests) Some checks failed Multi-MDX Regression (IMP-91) / multi-mdx-regression (push) Failing after 21s Details u1 (src/region_marker_stamper.py): deterministic root-div stamper injecting data-region-id + data-content-unit-id onto each family-partial root div anchored by data-template-id. Idempotent (re-stamp = no-op), AI=0, additive only, empty/None markers no-op, F9/F29 frame-slot axis preserved. u2 (src/phase_z2_pipeline.py render_slide chain): _stamp_region_markers chained after IMP-56 u9 _stamp_zone_html. Marker source = zone.get("placement_markers") or [] — Codex #16 P4b crash risk closed via the or-[] call-site fallback. u3 (_derive_placement_markers helper): projects PlacementPlan.slot_assignments[] → list[dict] carrying region_id + content_unit_id + frame_slot_id (frame_slot_id reserved for #96 89-d). Live B4 path emits at primary zones_data.append. u4 (3 non-live zones_data.append defaults): placement_markers: [] at IMP-30 u4 empty-shell, IMP-86 u1 adapter_needed, post-loop unrenderable plan-record paths — uniform zone shape, stamper no-op surface. u5/u6 (tests/test_phase_z2_imp94_marker_parity.py): 33 hard tests + 2 cross-axis skip-if-anchor-absent (Emergency P4/P4b future axis). Coverage: 13 family-partial root anchors, F29 + F9 frame-slot preservation, idempotence, live render_slide stamping, P4b empty-marker no-crash, MDX 01 strip-attr parity, trace-to-DOM parity. Disjoint from #96 (data-frame-slot-id) by attribute name. SPEC anchor: docs/architecture/PHASE-Z-CONTENT-OBJECT-SUBZONE-SPEC.md §6.4 + §7.2 (Layer A read targets + render-path activation). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-27 08:15:08 +09:00
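The idempotent, additive stamping that commit 5484077a53 describes for u1 might look something like the sketch below. It is a simplified illustration under stated assumptions: it stamps only the first root div carrying data-template-id, takes only the first marker, and does no HTML escaping. The function name, regex, and marker dict keys are hypothetical and are not taken from src/region_marker_stamper.py.

```python
import re
from typing import Optional

# Assumed anchor: the first <div ...> tag that carries a data-template-id attribute.
_ROOT_DIV_RE = re.compile(r'<div\b[^>]*\bdata-template-id="[^"]*"[^>]*>')


def stamp_region_markers(html: str, markers: Optional[list]) -> str:
    """Add data-region-id / data-content-unit-id to the anchored root div."""
    if not markers:
        return html  # empty or None markers: no-op
    match = _ROOT_DIV_RE.search(html)
    if match is None:
        return html  # no anchor found: leave the HTML untouched
    tag = match.group(0)
    if "data-region-id=" in tag:
        return html  # already stamped: re-stamp is a no-op
    region_id = markers[0]["region_id"]
    unit_id = markers[0]["content_unit_id"]
    attrs = ' data-region-id="%s" data-content-unit-id="%s"' % (region_id, unit_id)
    stamped = tag[:-1] + attrs + ">"  # append before the closing '>'
    return html[:match.start()] + stamped + html[match.end():]


html = '<div class="f" data-template-id="f18"><p>x</p></div>'
markers = [{"region_id": "r1", "content_unit_id": "u1"}]
once = stamp_region_markers(html, markers)
twice = stamp_region_markers(once, markers)
```

Here `once` and `twice` are identical, and passing `[]` or `None` returns the input unchanged, mirroring the `zone.get("placement_markers") or []` call-site fallback mentioned for u2.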
kyeongmin	b9747c2f4a	feat(#84 ): IMP-84 u1~u3 silent automation policy enforcement (FramePanel reject confirm + slide_base provisional badge/outline + IMP-30 visual assertions inverted) Some checks failed Multi-MDX Regression (IMP-91) / multi-mdx-regression (push) Failing after 21s Details - u1 FramePanel.tsx: extract `applyFrameSelection(candidate, onFrameSelect)` pure helper; collapse `handleFrameSelect` to direct onFrameSelect for every V4 label; drop `window.confirm` reject popup (IMP-47B u11 regression noise per `feedback_auto_pipeline_first`). New vitest pin `imp84_framepanel_reject_silent.test.ts` covers helper invocation across all 4 V4 labels + source-presence pins. - u2 templates/phase_z2/slide_base.html: delete `.zone--provisional` CSS, `.zone__needs-adaptation-badge` CSS, the zone--provisional class fragment in the zone div, and the badge `<span>` render at the provisional zone. Preserve `data-provisional="1"` attribute as silent telemetry. New pytest `tests/phase_z2/test_imp84_provisional_silent_render.py` pins the silent contract independently of the IMP-30 first-render file. - u3 tests/test_phase_z2_imp30_first_render.py: invert the three IMP-30 u5 positive provisional-visual assertions to IMP-84 silent-contract negatives (no class, no badge, no CSS selectors); preserve positive `data-provisional` telemetry assertions. Docstrings updated to IMP-84 silent contract. Out of scope (Round #4 + #92 contract): Home.tsx `toast.error(aiReviewMsg)` call line, designAgentApi.ts `api_error_kinds`/`api_error_kind` schema and operational-only formatter, FramePanel reject badge/tooltip read-only labels (L102/L147/L156), and backend `zone.provisional` flag emission. Stage 4 PASS: u1 vitest 10/10, u2 pytest 5/5, u3 pytest 29/29 (incl. 3 IMP-84 inverted assertions: `test_imp84_provisional_zone_silent_no_class_no_badge`, `test_imp84_provisional_badge_never_rendered_in_mixed_zones`, `test_imp84_slide_base_css_strips_provisional_visual_selectors`). 
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-26 14:15:02 +09:00
kyeongmin	4da22adb43	feat(#90 ): IMP-56 u1-u19 catch-up before final close (post-u20 push fix) Some checks failed Multi-MDX Regression (IMP-91) / multi-mdx-regression (push) Failing after 20s Details u1: text_overrides axis in user_overrides_io u2: structure_overrides axis in user_overrides_io u3: vite allowlist for new endpoints u4: text_override_resolver u5: Step 12 text_overrides apply in phase_z2_pipeline u6: structure_override_resolver u7: text_path_stamper u8: SlideCanvas text-edit capture u9: SlideCanvas structure-edit overlay u10: userOverridesApi service extension u11: designAgent types extension u12: slidePlanUtils restore u13: user_overrides endpoint tests u14: user_overrides restore tests u15: pipeline fallback tests u16: edit-mode state + gating tests u17: slide_base print mode CSS u18: /api/connect endpoint (vite) u19: /api/export endpoint (vite) Recovery scope: 29 files (12 modified + 17 new). u20 already pushed in 9439575; this commit lands u1-u19 that were authored but not committed before #90 was externally closed. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-26 06:12:13 +09:00
kyeongmin	ec7471ed59	docs(#1 ): IMP-01 A-6 u1~u5 zone_geometries_px runtime verification log (driver chain + 4-topology runs + schema lock + no-drift guardrail + pytest baseline gate; production source untouched, impl at `1dc81e0`) Some checks failed Multi-MDX Regression (IMP-91) / multi-mdx-regression (push) Failing after 20s Details	2026-05-25 15:49:23 +09:00
kyeongmin	4e281a20d8	feat(#93 ): IMP-55 u1~u12 frontend manual section swap detection (manual_section_assignment bool axis + drag-only marker gate + dual-axis persistence + backend manual-true gate) Some checks failed Multi-MDX Regression (IMP-91) / multi-mdx-regression (push) Failing after 9s Details Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-25 08:27:09 +09:00
kyeongmin	9062931863	feat(#74 ): IMP-45 u1~u8 slide-level CSS override (frontmatter slide_overrides.css + --override-slide-css/--slide-css-file + idempotent Step 13 injector) Some checks failed Multi-MDX Regression (IMP-91) / multi-mdx-regression (push) Failing after 22s Details u1 KNOWN_AXES tuple gains slide_css entry in src/user_overrides_io.py (snake_case parity with image_overrides); round-trip test extends to 6 axes. u2 src/mdx_normalizer.py surfaces nested slide_overrides.css from the MDX frontmatter into the normalize_mdx_content return dict; absent key -> {}, non-string css drops. 4 unit cases in tests/test_mdx_normalizer.py (present / absent / non-string / title-only). u3 src/slide_css_injector.py NEW (88 lines) mirrors the inject_image_overrides_style contract from src/image_id_stamper.py: marker pair <!--IMP45-SLIDE-CSS:OPEN--> / <!--IMP45-SLIDE-CSS:CLOSE-->, idempotent re-injection, </head> > <body> > document-start three-tier fallback, empty/None -> unchanged. 8 fixtures in tests/test_slide_css_injector.py mirror test_image_id_stamper.py. u4 run_phase_z2_mvp1 accepts override_slide_css: Optional[str] = None; None -> frontmatter slide_overrides.css fallback. Step 13 calls inject_slide_css after image override injection and before the final.html disk write, so CLI/CI/regression renders observe the same backend artifact. u5 argparse adds mutually-exclusive --override-slide-css TEXT (inline CSS, <style> wrapper optional) and --slide-css-file PATH (UTF-8 read, fail-closed sys.exit(2) on missing path / decode error / both flags present). Resolved string is forwarded as override_slide_css kwarg. 6 cases in tests/test_phase_z2_cli_overrides.py (inline / file / both / missing / non-utf8 / neither). u6 samples/mdx_batch/04.mdx frontmatter gains slide_overrides.css block (verbatim of the former MDX04_DEFAULT_OVERRIDE_CSS constant, no sample/frame gate). 
Subprocess smoke in tests/test_phase_z2_slide_css_smoke.py verifies the marker pair and CSS substring land in final.html. u7 Front/client removes the sample/frame-gated frontend-only injection: Home.tsx drops the MDX04_DEFAULT_OVERRIDE_CSS constant and the sample==="04"+frame==="process_product_two_way" branch (-28 lines); SlideCanvas.tsx drops the iframe contentDocument.head injection of that prop (-14 lines). Live preview now reads backend final.html only. u8 tests/regression/fixtures/89a_pre_baseline_sha.json 04.mdx entry resyncs to the live SHA ddb6bf2f... / 28042 bytes (overwrites the earlier 5-byte-drift d02c76fd... / 28047). Other entries untouched. Note: 01.mdx baseline drift (ad6f16a3... / 29089 -> live f26a7fac... / 29084) predates this branch and is split to a follow-up issue per the closed-issue fresh validation rule. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-25 03:26:03 +09:00
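The injector contract listed for u3 of commit 9062931863 (marker pair, idempotent re-injection, three-tier fallback, empty input unchanged) can be illustrated with a short sketch. The marker strings come from the commit message; everything else is an assumption, including the choice to always wrap the CSS in a `<style>` element, to insert immediately before `</head>` or before the opening `<body>` tag, and the function signature itself.

```python
import re
from typing import Optional

OPEN = "<!--IMP45-SLIDE-CSS:OPEN-->"
CLOSE = "<!--IMP45-SLIDE-CSS:CLOSE-->"
_BLOCK_RE = re.compile(re.escape(OPEN) + r".*?" + re.escape(CLOSE), re.DOTALL)
_BODY_RE = re.compile(r"<body\b[^>]*>")


def inject_slide_css(html: str, css: Optional[str]) -> str:
    """Insert a marked <style> block; re-injection replaces the earlier block."""
    if not css or not css.strip():
        return html  # empty / None: unchanged
    block = OPEN + "<style>" + css + "</style>" + CLOSE
    html = _BLOCK_RE.sub("", html)  # drop any previously injected block
    head_close = html.find("</head>")
    if head_close != -1:  # tier 1: just before </head>
        return html[:head_close] + block + html[head_close:]
    body = _BODY_RE.search(html)
    if body is not None:  # tier 2: just before <body>
        return html[:body.start()] + block + html[body.start():]
    return block + html  # tier 3: document start


page = "<html><head></head><body>x</body></html>"
once = inject_slide_css(page, ".zone { color: red; }")
twice = inject_slide_css(once, ".zone { color: red; }")
```

Because the earlier block is removed before inserting the new one, `once == twice` and the marker pair appears exactly once in the output.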
kyeongmin	b4be6c1cd0	feat(#72 ): IMP-43 u1~u8 --reuse-from incremental rerun (Step 0/1/2/5/6 reuse + Step 7+ re-execute) Some checks failed Multi-MDX Regression (IMP-91) / multi-mdx-regression (push) Failing after 25s Details u1 argparse --reuse-from PREV_RUN_ID + post-merge fail-closed guard (rejects layout/zone_geometry/zone_section/image override axes by name; only --override-frame is preserved). u2 src/phase_z2_reuse_snapshot.py — JSON-only Step 6 snapshot with mdx_sha256 integrity key and {value, source_path, upstream_step} provenance per axis (pickle forbidden per Stage 2 guardrail). u3 _write_reuse_snapshot at the Step 6 boundary; soft-fails to stderr without aborting the seed run. u4 prev_run_dir RO copy of step00/01/02/05/06 + _reuse_snapshot.json into new run_dir, state rehydration, reuse marker, frame-override application on restored units, Step 7+ resume. u4b fail-closed for missing prev_run_dir / missing/corrupt/invalid snapshot / mdx_sha256 mismatch / accidental new==prev write, with value+path+upstream diagnostics per axis. u5 reuse_from Optional[str] threaded through run_phase_z2_mvp1 signature and CLI dispatch; default None preserves byte-identical pre-IMP-43 behavior. u6 Front /api/run optional reuseFromRunId forwarding (vite.config.ts + designAgentApi.ts + run_pipeline_reuse_from.test.ts). u7a fast CI equivalence (1 mdx × 1 layout × 2 frames); step13 whitelist = run_id/timestamps/prev_run_id only. u7b 3 layouts × 3 mdx × 32 frames sweep gated by pytest.mark.sweep (registered in pyproject.toml; default CI must use -m 'not sweep'). u8 scripts/measure_reuse_savings.py argv-driven A/B/C harness with frame pin self-discovery + seed-time exclusion; status board §8 TBD anchor (issue-body 50-70% / 10-20s→3-8s claim explicitly unverified, not mirrored). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-24 22:44:27 +09:00
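The mdx_sha256 integrity key and fail-closed mismatch handling mentioned for u2 and u4b of commit b4be6c1cd0 could be sketched as below. This is a minimal illustration, assuming a plain dict snapshot; the function names and the snapshot layout beyond `mdx_sha256` and the `{value, source_path, upstream_step}` provenance triple are hypothetical.

```python
import hashlib


def _sha256_text(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def build_reuse_snapshot(mdx_text: str, axes: dict) -> dict:
    """JSON-serializable snapshot keyed by the SHA-256 of the MDX source."""
    return {"mdx_sha256": _sha256_text(mdx_text), "axes": axes}


def verify_reuse_snapshot(snapshot: dict, mdx_text: str) -> dict:
    """Fail closed when the snapshot was taken from a different MDX source."""
    expected = snapshot.get("mdx_sha256")
    actual = _sha256_text(mdx_text)
    if expected != actual:
        raise ValueError(
            "mdx_sha256 mismatch: snapshot=%s current=%s" % (expected, actual)
        )
    return snapshot["axes"]


# Hypothetical axis entry carrying the provenance triple named in the commit.
axes = {
    "layout": {
        "value": "horizontal-2",
        "source_path": "step05/layout.json",
        "upstream_step": 5,
    }
}
snapshot = build_reuse_snapshot("# slide source", axes)
```

Calling `verify_reuse_snapshot(snapshot, "# slide source")` returns the axes, while any edit to the MDX text raises instead of silently reusing stale state.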
kyeongmin	8648a468d9	feat(#69 ): IMP-40 u1~u6 frame contract label_default placeholder/fallback role discriminator (BIM/DX leak fix) Some checks failed Multi-MDX Regression (IMP-91) / multi-mdx-regression (push) Failing after 26s Details - catalog (frame_contracts.yaml): F18 bim_dx_comparison_table col_a/col_b label_default_role=placeholder; F30 industry_current_status_three_col + F31 industry_characteristics_three_col col_a/col_b/col_c forward-compat placeholder; F33 engn_sw_three_types untouched (no label_default). - mapper (_build_compare_table_2col): generic _resolve_label_default(col_key) branches on <col>_label_default_role — placeholder -> '' (Figma placeholder suppressed at runtime), fallback -> catalog literal (legacy default), unknown -> ValueError with template_id + role_key + value. Absent role defaults to fallback (backward compat for contracts without discriminator). - tests (tests/phase_z2/test_imp40_label_default_role.py): u4 generic matrix (placeholder / fallback / absent / unknown / 3-col axis) + u5 F18-reuse non-BIM/DX synthetic rows asserting placeholder labels emit '' and BIM/DX literal tokens do not leak. - snapshot (tests/integration/__snapshots__/slot_payload.json): mdx 01 F18 string_slot_nonempty.col_a_label/col_b_label True -> False (u6 expected drift from u3 placeholder -> empty string flip). slot_names + rows + title preserved. Verification: - imp40_label_default_role: 6/6 PASSED - phase_z2 sweep: 608/608 PASSED - multi_mdx_regression: 50/50 PASSED - cross-suite sweep: 662/662 PASSED - BIM/DX literal grep on mapper + new test: 0 hits - No mdx-specific branches (mdx 03/04/05 grep on mapper: 0 hits) Guardrails: no MDX 03/04/05 hardcoding (catalog policy only); no spacing shrink; no auto frame swap on reject; no AI call at Step 12; F33 untouched. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-24 18:53:20 +09:00
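The role branching that commit 8648a468d9 describes for `_resolve_label_default` can be sketched as follows. The flat contract dict and the standalone function signature are assumptions made for illustration; only the role names, the `<col>_label_default_role` key pattern, and the four outcomes come from the commit message.

```python
def resolve_label_default(contract: dict, col_key: str, template_id: str) -> str:
    """Resolve a column label default according to its declared role."""
    role_key = col_key + "_label_default_role"
    role = contract.get(role_key, "fallback")  # absent role: backward-compatible fallback
    literal = contract.get(col_key + "_label_default", "")
    if role == "placeholder":
        return ""  # design placeholder text is suppressed at runtime
    if role == "fallback":
        return literal  # legacy behaviour: emit the catalog literal
    raise ValueError(
        "template_id=%s role_key=%s value=%r is not a known role"
        % (template_id, role_key, role)
    )


# Hypothetical contract rows.
contract = {
    "col_a_label_default": "Label A",
    "col_a_label_default_role": "placeholder",
    "col_b_label_default": "Label B",
}
```

With this contract, `col_a` resolves to an empty string and `col_b`, which declares no role, resolves to its catalog literal.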
kyeongmin	028042aaa9	feat(#68 ): IMP-39 u1~u8 ranking_sort_policy single-source + backend↔frontend label-priority mirror Some checks failed Multi-MDX Regression (IMP-91) / multi-mdx-regression (push) Failing after 23s Details u1: templates/phase_z2/catalog/ranking_sort_policy.yaml — single-source policy (label_priority asc {use_as_is:0, light_edit:1, restructure:2, reject:3} + confidence desc + v4_rank asc tie-break). u2: src/phase_z2_pipeline.py — apply_ranking_sort helper + lookup_v4_match_with_fallback applies policy AFTER IMP-38 raw-window selection (raw default_window + usable_count preserved on RAW all_judgments). u3: src/phase_z2_pipeline.py — _build_application_plan_unit forwards ranking_sort_policy + sorted_candidate_evidence into Step 9 payload. u4: Front/client/src/services/designAgentApi.ts — frame_candidates builder reads unit.sorted_candidate_evidence + unit.ranking_sort_policy first; local LABEL_PRIORITY retained only on warn-fallback path. u5: tests/test_ranking_sort_policy.py — pure permutation coverage (sample-agnostic). u6: tests/phase_z2/test_label_priority_synthetic.py + fixtures/ranking_sort_policy/ synthetic_divergence.yaml — low-conf use_as_is behind high-conf restructure. u7: tests/phase_z2/test_imp39_mdx04_env_toggle_e2e.py — samples/mdx_batch/04.mdx with AI_FALLBACK_ENABLED=off; backend selected_v4_rank == frontend frame_candidates[0]. u8: tests/phase_z2/test_imp39_corpus_audit.py — real corpus sweep over tests/matching/v4_full32_result.yaml (10 MDX sections); section IDs loaded dynamically (RULE 0 / RULE 7 sample-agnostic). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-24 17:12:07 +09:00
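The sort policy from u1 of commit 028042aaa9 (label priority ascending, then confidence descending, then v4_rank ascending) maps naturally onto a tuple sort key. The sketch below assumes candidates are dicts with `label`, `confidence`, and `v4_rank` fields; those field names and the sample values are illustrative, and the real helper reads the priorities from ranking_sort_policy.yaml rather than a literal.

```python
LABEL_PRIORITY = {"use_as_is": 0, "light_edit": 1, "restructure": 2, "reject": 3}


def apply_ranking_sort(candidates: list) -> list:
    """Return a new list: label priority asc, confidence desc, v4_rank asc."""
    return sorted(
        candidates,
        key=lambda c: (LABEL_PRIORITY[c["label"]], -c["confidence"], c["v4_rank"]),
    )


# Synthetic divergence: a low-confidence use_as_is sits behind a
# high-confidence restructure in raw V4 order.
raw = [
    {"label": "restructure", "confidence": 0.95, "v4_rank": 1},
    {"label": "use_as_is", "confidence": 0.40, "v4_rank": 2},
    {"label": "use_as_is", "confidence": 0.40, "v4_rank": 3},
    {"label": "reject", "confidence": 0.99, "v4_rank": 4},
]
ranked = apply_ranking_sort(raw)
```

After sorting, the two `use_as_is` candidates lead (ordered by `v4_rank` because their confidence ties), followed by `restructure` and then `reject`, regardless of confidence.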
kyeongmin	2e3747c5ab	feat(#88 ): IMP-88 u1~u7 Step 17 retry chain — layout_adjust + image_fit + frame_internal_fit_candidate executors + dispatcher + entry Some checks failed Multi-MDX Regression (IMP-91) / multi-mdx-regression (push) Failing after 23s Details Step 17 salvage dispatcher previously only ran the 3 actions in _SALVAGE_FAIL_BY_ACTION (cross_zone_redistribute / glue_compression / font_step_compression). Any next_proposed_action outside that set hit salvage_terminal_action and dropped through, so visual_check aborted on layout_adjust / image_fit / frame_internal_fit_candidate cascades. u1 — router data surface (src/phase_z2_router.py) - ACTION_BY_CATEGORY: image_aspect_mismatch -> image_fit (new row), frame_capacity_mismatch -> frame_internal_fit_candidate (was frame_reselect). - ACTION_IMPLEMENTATION_STATUS: layout_adjust / image_fit / frame_internal_fit_candidate flipped MISSING -> IMPLEMENTED with inline IMP-88 rationale. u2 — failure_router cascade surface (src/phase_z2_failure_router.py) - FAILURE_TYPE_DESCRIPTIONS + SALVAGE_FAILURE_TYPE_BY_ACTION extended with layout_adjust_insufficient / image_fit_insufficient / frame_internal_fit_insufficient producers. - NEXT_ACTION_BY_FAILURE + NEXT_ACTION_RATIONALE + NEXT_ACTION_IMPLEMENTATION_STATUS rows added; cascade chain becomes font_step_compression -> layout_adjust -> frame_internal_fit_candidate -> frame_reselect -> details_popup_escalation (#64 terminal). u3~u5 — planners + apply helpers (src/phase_z2_retry.py) - plan_layout_adjust / apply_layout_adjust_layout_css with _layout_swap_priority across 8-preset LAYOUT_PRESETS (preset switch, no shared-margin shrink per Phase Z spacing direction). - plan_image_fit / apply_image_fit_css scoped to frame slot using existing classifier image_event payload (object-fit + max-w/h derivation). - plan_frame_internal_fit_candidate / apply_frame_internal_fit_candidate_css stays inside declared frame contract envelope; emits infeasible path when envelope is absent. 
u6~u7 — pipeline wiring (src/phase_z2_pipeline.py) - _SALVAGE_FAIL_BY_ACTION extended; _attempt_salvage_chain gains layout_adjust distinct-render branch + frame_internal_fit_candidate CSS-overlay branch + loop cap. - _attempt_step17_image_fit_single_pass added for image_fit entry. - §11.7.1 / §11.7.2 entry triggers wired; Step 17/18/19 artifact refresh + note logging closes the salvage_terminal_action fall-through for the 3 IMP-88 actions. Tests - New: test_router_actions_imp88.py (12), test_failure_router_imp88_cascade.py (12), test_phase_z2_retry_layout_adjust.py (10), test_phase_z2_retry_image_fit.py (13), test_phase_z2_retry_frame_internal_fit.py (13), test_phase_z2_pipeline_salvage_imp88.py (8), test_phase_z2_pipeline_step17_entry_imp88.py. - Regression-aligned: test_phase_z2_failure_router_cascade.py, test_phase_z2_step17_salvage_chain.py — pre-existing cascade + salvage-chain assertions updated to the IMPLEMENTED surface. Out of scope (separate axes / issues) - details_popup_escalation terminal body (#64). - frame_reselect MISSING flip (different axis). - Step 14/16 detection refinement. - Stage 0 mdx_normalizer integration (locked 2026-05-08). - AI fallback activation. Guardrails respected - Phase Z spacing direction: layout_adjust switches preset; no shared margin shrink. - AI isolation contract: planners + dispatcher are deterministic; zero AI calls in u1~u7. - No hardcoding: routing + cascade live in router/failure_router data rows, not inline conditionals. - IMP-46 (#62) cache carve-out: untouched. - 1 commit = 1 decision unit: u1~u7 grouped as a single IMP-88 unit. Stage 4 verification: 7 IMP-88 test files + 2 modified regression files PASS (Claude #12 + Codex #12 consensus YES). Full-suite sweep deferred to a separate step. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-24 15:01:55 +09:00
kyeongmin	e0c39f1bc1	feat(#73 ): IMP-44 u1~u5 layout override unknown-key guard + frontend zone_geometries validation Some checks failed Multi-MDX Regression (IMP-91) / multi-mdx-regression (push) Failing after 23s Details Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-24 12:12:24 +09:00
kyeongmin	5deeb97cf6	feat(#71 ): IMP-42 u1~u5 silent fail chain diagnostics (assert + invalid-char detector + DIAG log) Some checks failed Multi-MDX Regression (IMP-91) / multi-mdx-regression (push) Failing after 24s Details Stage 4 binding scope — diagnostic-only, fail-loud, sample-agnostic (RULE 0 / AI-isolation contract). No production behavior change beyond fail-loud raises on previously-silent failure classes. u1 src/phase_z2_pipeline.py:2747-2772 — render_slide precondition assert (template_id non-empty str + slot_payload dict), placed after the `__empty__` short-circuit at 2740 to preserve empty-zone grid behavior. u2 src/phase_z2_pipeline.py:2681-2710 — _scan_rendered_html_for_invalid_path_chars helper covering src / href / url(...) values for backslash, &, '. Invoked on partial render (2778) and slide_base assembly (2798). u3 src/phase_z2_pipeline.py:2638-2676,2733,5509 — _emit_diag_zones_shape shape-only [DIAG] JSON at Step 12 slot_payload emit and Step 13 render_slide entry. No env gate — silence is the bug. u4 Front/client/src/pages/Home.tsx:388-392 — unconditional [DIAG raw overrides] console.log on handleGenerate boundary, after flushUserOverrides() and immediately before runPipeline. u5 tests/phase_z2/test_phase_z2_diag_smoke_general.py — 32-frame general smoke driven by load_frame_contracts() registry (not literal MDX 03/04/05), parametrizes u1/u2/u3 across the full frame_contracts.yaml top-level. Tests (Stage 4 verification PASS): - u1 8 passed, u2 14 passed, u3 12 passed, u4 5 passed, u5 97 passed. - Backend full regression tests/phase_z2/ 499 passed in 110.84s. - Frontend full regression 182 passed in 1.10s. Out of scope (separate axes): - Path normalization / as_posix migration. - Autoescape policy change. - build_layout_css refactor (Stage 1 category-error rejection). - Recovery / auto-fix on detected invalid path. - MDX content / frame-selection / zone-composition change. 
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-24 08:28:54 +09:00
kyeongmin	c59864eb9a	feat(#91 ): IMP-91 u2~u15 multi-mdx regression CI suite + status-board auto-update Some checks failed Multi-MDX Regression (IMP-91) / multi-mdx-regression (push) Failing after 31s Details - u2~u5: tests/integration/test_multi_mdx_regression.py — MDX_SET=(01..05) cached integration runs + status/structural/visual snapshots + full_mdx_coverage assertion (9 snapshots populated for 01-05). - u6~u11: F0 normalize / F1 V4 ranking / F2 slot_payload / F3 classifier-only AI / F4 layout / F5 final.html axis per MDX_SET. - u12: pyproject.toml — pytest-json-report>=1.5 in dev extras. - u13: .github/workflows/multi-mdx-regression.yml — pytest+artifact CI. - u14: scripts/update_status_board.py + tests/scripts/test_update_status_board.py — idempotent JSON marker updater (3 unit tests pass). - u15: PHASE-Z-PIPELINE-STATUS-BOARD.md — 30 F0-F5 × mdx01-05 markers initialized `?` + workflow wiring. Stage 4 verify: 59/59 PASS targeted (smoke 6 + updater 3 + integration 50), 386/386 PASS regression umbrella, 0 failures. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-24 07:01:58 +09:00
kyeongmin	6aa7564509	feat(#91 ): IMP-91 u1 non-VP subprocess smoke mdx01/02 parametrize Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-24 02:18:17 +09:00
kyeongmin	b1bbe27c38	feat(#89 ): IMP-89 89-a u1~u5 Layer A render path activation (B4→mapper source-of-truth switch, default-OFF flag) PHASE_Z_B4_MAPPER_SOURCE env flag (default OFF) switches slot_payload source-of-truth from legacy mapper-only / V4 rank-1 to B4 PlacementPlan .selected_template_id at the single switch site in the runtime loop. OFF preserves final.html SHA byte-equivalence (u4 parity guard, mdx 01-05). ON requires Layer A render-active path; BLOCKED exits on B4 no-cover and on B4-selected FitError (IMP-87 honesty gate pattern — NO silent fallback). Distinct from PHASE_Z_B4_GATEKEEPER (mismatch render-skip). Units (1 commit = 1 axis per Stage 1 scope_lock): u1 — _b4_mapper_source_enabled() flag reader (default OFF) u2 — _select_mapper_template_id() selector wired at the switch site u3 — _b4_mapper_source_blocked_exit() for b4_no_cover / b4_selected_fit_error u4 — render SHA parity regression (tests/regression/ baseline mdx 01-05) u5 — slot_payload byte-equivalence (matches_mapper=True axis, mdx 01-05) Targeted 89-a suite 63 PASS; Phase Z regression 323 PASS; IMP-87 mirror 20 PASS. Demo activation via .env only (no vite.config hardcoding). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-24 00:33:28 +09:00
kyeongmin	896f273ffa	feat(#92 ): IMP-92 u1~u5 AI fallback config validation (model ping + operational error classification) Replaces #84 UI-noise removal plan with positive operational-alert contract. Five-axis stack lands together: (1) default model literal moved to current Opus-family ID, (2) Anthropic SDK error classifier mapping exceptions to quota/billing/auth/other, (3) api_error_kind plumbed through ai_repair_status summary + per-record retention, (4) Step 0 preflight ping gated under ai_fallback_enabled (default OFF preserved) with fail-fast on invalid model/key, (5) frontend formatter rewritten to surface only operational quota/billing/auth toasts (non-operational paths return null per feedback_auto_pipeline_first silent-pipeline policy). u1 - default model literal claude-opus-4-6-20250415 -> claude-opus-4-7 (src/config.py + tests/test_phase_z2_ai_fallback_config.py lock mirror) u2 - classify_operational_error type+status_code dispatch + Step 12 api_error_kind stamp on except path (src/phase_z2_ai_fallback/client.py + src/phase_z2_ai_fallback/step12.py + tests/phase_z2_ai_fallback/test_step12.py) u3 - _summarize_ai_repair_status aggregates api_error_kinds {quota,billing, auth,other}; error_records[i].api_error_kind retained per-record (src/phase_z2_pipeline.py + tests/test_imp47b_failure_surface.py) u4 - _run_step0_ai_preflight + Step0PreflightError; preflight only fires when ai_fallback_enabled=true; one-token ping; invalid key/model => setup failure before Step 1 (src/phase_z2_pipeline.py + tests/phase_z2/test_pipeline_step0_preflight.py NEW) u5 - AiRepairStatus.api_error_kinds? 
interface + formatAiRepairHumanReview Message rewritten: operational quota/billing/auth -> Korean copy verbatim from issue body (tie-break quota -> billing -> auth); validation/coverage_violated/unsupported_kind/generic-other/legacy payload -> null (Front/client/src/services/designAgentApi.ts + Front/client/tests/imp47b_human_review_toast.test.tsx) Guardrails respected: - feedback_demo_env_toggle_policy: default OFF preserved; preflight skipped when ai_fallback_enabled=false (test_preflight_skipped_when_disabled asserts anthropic.Anthropic() not called). - feedback_auto_pipeline_first: non-operational AI failures stay silent; only quota/billing/auth reach user toast. - feedback_ai_isolation_contract: AI remains fallback-only; no normal-path migration; MDX preserved. - project_imp46_carveout_caveat: cache_key/fingerprints fields untouched on every record; no overlap with #62 cache region. - feedback_no_hardcoding: zero MDX-sample-specific literals; classifier dispatch by SDK type, not by string parsing. - feedback_artifact_status_naming: operational toast scoped to alert axis, not overall PASS signal. Tests: - Targeted u1+u2+u3+u4: 63 passed - u5 vitest (Front/): 10/10 passed - tests/phase_z2_ai_fallback dir regression: 240 passed - tests/phase_z2 dir regression: 323 passed - IMP-92-adjacent (-k "imp47b or ai_fallback or preflight or step12 or step0"): 299 passed (808 deselected) - u1 baseline lock (test_client_mock.py): 8 passed Zero failures, zero regressions outside scope. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-23 22:07:25 +09:00
kyeongmin	842a46144c	feat(#87 ): IMP-87 u1~u5 empty_shell honesty gate + BLOCKED exit EMPTY_SHELL_NO_CONTENT overall enum + 3-marker detection (frame_template_id="__empty__" OR label="empty_shell" OR merge_type="empty_shell") routes empty-placeholder-only slides to BLOCKED CLI exit 1 + red final_status.html, blocking fake PASS reports (feedback_artifact_status_naming). Coverage accounting split: legacy covered_section_ids preserved + new content_rendered_section_ids / empty_shell_section_ids. mdx05 Case B (zero V4 evidence) honestly classified instead of synthesizing fabricated rank-1 reject frames. IMP-30 u6/u7 stale empty-shell PASS assertions inverted (29 tests). IMP-85 smoke parametrize: mdx05 removed from exit-0 list + dedicated BLOCKED exit test added (4 tests). No production behavior change for chain_exhausted Case A; no AI route activation; no mdx-id hardcoding. 53 targeted + 76 adjacent Phase Z tests PASS. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-23 20:40:54 +09:00
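The 3-marker detection in commit 842a46144c can be written as a small predicate. In this sketch the zone records are assumed to be plain dicts, and the rule that a slide counts as empty-shell-only when it has at least one zone and every zone matches is an interpretation of "empty-placeholder-only slides"; the function names are hypothetical.

```python
def is_empty_shell(zone: dict) -> bool:
    """True when any of the three empty-shell markers is present."""
    return (
        zone.get("frame_template_id") == "__empty__"
        or zone.get("label") == "empty_shell"
        or zone.get("merge_type") == "empty_shell"
    )


def slide_is_empty_shell_only(zones: list) -> bool:
    """A slide is blocked only when every zone is an empty placeholder."""
    return bool(zones) and all(is_empty_shell(z) for z in zones)


def exit_code(zones: list) -> int:
    # Assumed mapping: empty-shell-only slides exit 1 (BLOCKED), others exit 0.
    return 1 if slide_is_empty_shell_only(zones) else 0


empty_only = [{"frame_template_id": "__empty__"}, {"label": "empty_shell"}]
mixed = [{"frame_template_id": "__empty__"}, {"frame_template_id": "f18"}]
```

Under these assumptions `empty_only` yields exit code 1, while `mixed` still exits 0 because one zone carries rendered content.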
kyeongmin	c53722ad0b	feat(#86 ): IMP-86 u1~u5 placeholder zones_data + invariant guard Mapper FitError handler now appends a __empty__ placeholder to zones_data and a matching debug_zone so the surviving cardinality stays in sync with the active layout preset's grid rows. A pre-build_layout_css invariant guard fails fast with preset/positions/count diagnostics if drift recurs. Per-record telemetry (adapter_needed, mapper_fit_error, provisional) is exposed on both placeholder records; authoritative slide_status.adapter_needed_units schema is unchanged. Closes mdx03 reject override regression: Step 12 AI router now reachable without heights_px ValueError; default-path behavior unaffected. u1 — FitError placeholder zones_data + debug_zone (src/phase_z2_pipeline.py) u2 — pre-build_layout_css invariant guard (src/phase_z2_pipeline.py) u3 — horizontal-2 normal+placeholder helper unit (test_compute_per_zone_geometry.py) u4 — mdx03 reject override → Step 12 integration + default regression u5 — placeholder telemetry surface (adapter_needed/mapper_fit_error/provisional) Tests: - u3 helper: 7 passed (0.06s) - u4+u5 integration: 2 passed (7.87s) - Phase Z2 + AI fallback regression: 544 passed (66.28s) - Broader sweep (excl. matching/pipeline heavy): 1066 passed (96.12s) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-23 18:25:14 +09:00
kyeongmin	cacc5b30db	feat(#85 ): IMP catalog builder invariant + VP runtime gate (u1~u7) - u1: BuilderMissingError(FitError) — narrow exception aligned with pipeline catch - u2: load_frame_contracts catalog invariant + VP skip + CatalogInvariantError - u3a: audit CLI I1~I3 (partial existence / declared builder / registry membership) - u3b: audit CLI I4 (slot_payload refs vs declared/generated payload keys) - u4: lookup_v4_candidates VP filter (lookup_v4_all_judgments raw telemetry untouched) - u5: catalog invariant regression coverage + temp non-VP failure fixtures - u6: mdx04 VP routing fixture tests (sw_dependency_four_problems excluded from live) - u7: tests/conftest.py env isolation + mdx03/mdx04/mdx05 subprocess smoke Targeted 74 PASS (12.31s). Full regression 1063 PASS (87.70s). Audit CLI clean. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-23 16:56:38 +09:00
kyeongmin	d9d338416a	feat(#62 ): IMP-46 cache fingerprint forwarding u1~u4 (router kwarg + step12 forward + 8 scenarios)	2026-05-23 08:53:22 +09:00
kyeongmin	f3ef4d917c	feat(#64 ): IMP-35 details_popup_escalation u1~u10 + Stage 3 R7 anchor re-pin Land the production + test surface for the Step 17 cascade POPUP terminal (DETERMINISTIC -> POPUP -> AI_REPAIR -> USER_OVERRIDE) per Stage 2 plan R2. u11 (baseline-red invariance gate) was already landed in `7c93031` ahead of this commit; this commit completes u1~u10 plus the Stage 3 R7 follow-up anchor re-pin for test_imp17_comment_anchor.py. Implementation units (Stage 2 R2 contract): u1 frame_reselect_insufficient failure_type + post-frame remeasure (q4) - src/phase_z2_failure_router.py, src/phase_z2_pipeline.py u2 NEXT_ACTION_BY_FAILURE row + impl_status flip - src/phase_z2_failure_router.py u3 Router details_popup_escalation MISSING->IMPLEMENTED + executor stub - src/phase_z2_router.py u4 step17.py AI split-decision contract (POPUP cascade_stage + route_for_label + skip_reason); API gated - src/phase_z2_ai_fallback/step17.py u5 Step 17 POPUP gate executor; popup_escalation_plan + has_popup marker - src/phase_z2_pipeline.py, src/phase_z2_ai_fallback/step17.py u6 Composition popup binding -- yaml strategy -> zone payload - src/phase_z2_composition.py u7 Pipeline composer -> render_slide wiring (popup_html / preview_text / has_popup) - src/phase_z2_pipeline.py u8 slide_base.html <details>/<summary> popup wrapper - templates/phase_z2/slide_base.html u9 display_strategies.yaml inline_preview + popup metadata - templates/phase_z2/regions/display_strategies.yaml u10 MDX preservation invariant: popup=full source / body=summary or subset (asserted by tests/phase_z2/test_popup_mdx_preservation.py) u11 (already in `7c93031`) -- baseline-red invariance gate Stage 3 R7 follow-up (anchor re-pin, test-only): - tests/orchestrator_unit/test_imp17_comment_anchor.py Pre-anchor additions in src/phase_z2_pipeline.py (u1 / u5 / u7) shifted the restructure/reject route-hint comments 578/579 -> 586/587. 
Re-pinned the two guard tests (and docstring re-pin lineage 564 -> 570 -> 578 -> 586). Production code untouched. Verification (Stage 4 R1): pytest -q tests/orchestrator_unit/test_imp17_comment_anchor.py -> 2 passed / 0.02s pytest -q <10 IMP-35 unit files in tests/phase_z2 + tests/phase_z2_ai_fallback> -> 136 passed / 15.94s Baseline-red invariance gate (tests/test_imp47b_step12_ai_wiring.py + tests/test_phase_z2_ai_fallback_config.py) -> 4 failed / 6 passed; FAILED set === IMP35_BASELINE_RED_NODE_IDS (frozen registry from `7c93031`). Contract holds. Codex Stage 4 R1 = YES (independent verify). Guardrails honored: - MDX content preservation: popup carries full source, body holds summary or subset only (CLAUDE.md 자세히보기 ("view details") principle; feedback_phase_z_spacing_direction -- capacity expanded, no margin shrink). - AI isolation contract: Step 17 POPUP gate is deterministic; AI hook surface is split-decision contract only, API call gated. - No hardcoding: escalation thresholds derived from existing overflow detector outputs; preview_chars deterministic from container px. - 1 commit = 1 decision unit: u1~u10 land together as the planned production surface; u11 was deliberately split into `7c93031` as Stage 3 R7 carve-out, and the R7 anchor re-pin rides with this commit because it is the direct shift consequence of the u1/u5/u7 pre-anchor additions. - Scope-locked: .claude/settings.json explicitly excluded (Stage 4 exit report contract). Out of scope (per Stage 1 + Stage 2): - AI_REPAIR API activation (post IMP-35 axis). - IMP-34 zone resize, IMP-36 responsive fit (chain partners, separate issues). - Print-time auto-expand JavaScript for <details>. - Popup escalation in stages other than Step 17. - Baseline-red body repair (4 frozen failures) -- separate follow-up issue; u11 only guards the count. - frame_reselect algorithm changes (entry point only). - templates/phase_z2/slide_base.html path rename. 
source_comment_ids: Stage 1: claude_stage1_problem_review_imp35, codex_stage1_verification_imp35_yes Stage 2: Claude #4 R2 plan, Codex #5 R2 YES Stage 3: Claude #86 (R7 anchor re-pin), Codex #87 YES Stage 4: Claude #88 R1, Codex #89 R1 YES Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-23 07:36:57 +09:00
kyeongmin	7c93031f9b	feat(#64 ): IMP-35 details_popup_escalation u11 baseline-red invariance gate Add a test-only invariance gate that locks the pre-existing four-test red baseline so IMP-35 cannot silently grow the red surface while in-flight. u11 does NOT fix the four reds — Stage 2 follow_up_candidates tracks the actual repair as a separate issue. u1~u10 production work remains in the worktree and is explicitly out of this commit per Stage 3 R7 carve-out. Frozen registry (IMP35_BASELINE_RED_NODE_IDS, set semantics): 1. tests/test_imp47b_step12_ai_wiring.py ::test_mixed_units_classified_by_route_and_provisional_flag 2. tests/test_imp47b_step12_ai_wiring.py ::test_reject_provisional_unit_reaches_router_short_circuit 3. tests/test_imp47b_step12_ai_wiring.py ::test_step12_ai_repair_artifact_writes_json_serialisable_records 4. tests/test_phase_z2_ai_fallback_config.py ::test_ai_fallback_master_flag_default_off Gate semantics (subprocess pytest, set comparison): - All 4 node ids resolve to collectible pytest items (rename / delete is caught up front). - Broader baseline-area sweep across the two registry files yields EXACTLY 4 FAILED and 0 ERROR, with FAILED set ≡ registry. - A new red in the baseline area flips count above 4 OR introduces a FAILED id outside the registry; either branch fails the gate. - Cross-lock test ensures registry node ids cannot point outside the declared area-files inventory. AI isolation contract (feedback_ai_isolation_contract): Gate body uses stdlib only (subprocess + re + ast). An AST self-verify test rejects `anthropic` imports and `route_ai_fallback` references in this file, structurally preventing AI routing inside the gate. Stage 4 verification (HEAD `c1df656` pre-commit): pytest -q tests/phase_z2/test_imp35_baseline_red_invariance.py → 7 passed in 15.26s. Baseline area sweep (tests/test_imp47b_step12_ai_wiring.py + tests/test_phase_z2_ai_fallback_config.py) → 4 failed / 6 passed / 0 errors; FAILED set ≡ registry (identity). 
pytest --collect-only on the 4 registered node ids → all 4 resolve. py_compile clean. Codex R1 = YES (independent verify). Guardrails honored: - Scope-locked: test-only file; zero production code in this commit. - 1 commit = 1 decision unit (u11 only). - No hardcoding: registry = Stage 2 contract frozen tuple, not sample-specific literal; gate body has zero magic constants. - AI isolation: stdlib-only gate, AST self-verify locks isolation. - baseline-red 4 body repair = separate follow-up issue, not u11 scope. source_comment_ids: Stage 1 problem-review; Stage 2 plan R2 + Codex R2 YES; Stage 3 Claude #30 + Codex #31 R7 YES; Stage 4 Claude #32 + Codex #33 R1 YES. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-23 04:13:54 +09:00
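The set-comparison semantics of the u11 gate described above can be sketched roughly as follows. This is an illustration only, not the code in tests/phase_z2/test_imp35_baseline_red_invariance.py: the function name `check_baseline_red` and its inputs are hypothetical, and the step that runs pytest in a subprocess and parses its output is omitted.

```python
from __future__ import annotations

# Frozen registry, copied from the commit message above.
IMP35_BASELINE_RED_NODE_IDS = frozenset({
    "tests/test_imp47b_step12_ai_wiring.py::test_mixed_units_classified_by_route_and_provisional_flag",
    "tests/test_imp47b_step12_ai_wiring.py::test_reject_provisional_unit_reaches_router_short_circuit",
    "tests/test_imp47b_step12_ai_wiring.py::test_step12_ai_repair_artifact_writes_json_serialisable_records",
    "tests/test_phase_z2_ai_fallback_config.py::test_ai_fallback_master_flag_default_off",
})


def check_baseline_red(failed_ids: set[str], error_count: int,
                       registry: frozenset[str] = IMP35_BASELINE_RED_NODE_IDS) -> list[str]:
    """Hypothetical helper: return gate violations; an empty list means the gate passes."""
    problems = []
    if error_count != 0:
        problems.append(f"{error_count} ERROR item(s) in the baseline area")
    new_reds = failed_ids - registry          # a FAILED id outside the registry
    if new_reds:
        problems.append(f"new red outside registry: {sorted(new_reds)}")
    missing = registry - failed_ids           # FAILED set must equal the registry exactly
    if missing:
        problems.append(f"registry id not in FAILED set: {sorted(missing)}")
    return problems
```

Comparing sets in both directions covers the "count above 4" branch and the "FAILED id outside the registry" branch with one check, since either one makes the two sets unequal.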
kyeongmin	c1df656312	feat(#65 ): IMP-36 fit/rotation generalization (u1~u8) Generalize Phase Z frame partial responsive fit / rotation to four canonical F13/F14/F20/F8 family partials. Surface = 13 canonical partials; 19 builder-only contracts remain explicitly out of scope. u1 test_imp17_comment_anchor: re-pin L570->L578 (restructure+IMP-17), L571->L579 (IMP-29 -> IMP-47B supersession). Stage 1 red baseline gate. u2 frame_contracts.yaml: add rotation_eligible (P1) + body_fit_pattern2 (P2) bool axes on 13 partial-backed contracts. P1 True: F13/F14/F20/F8 (4). P2 True: F23 + P1_set (5). F29 columns[1].body_parser column_plain -> column_with_transform (P3 parity). u3 test_imp36_fit_rotation_generalization (NEW, 166 lines): static parametrized assertions for P1 metadata + CQ presence, P1 opt-out absence, P2 --max-body-lines + clamp + cqh, P2 opt-out absence, 19 builder-only exclusion. u4 three_parallel_requirements (F13): introduce f13b-root container-name + container-type:size + @container (aspect-ratio<1.5) rotation; add inline --max-body-lines + body line-height clamp/cqh/calc. u5 three_persona_benefits (F14): f14b-root P1 + P2 cqh/jinja body fit. Persona colors (#285b4a/#445a2f/#743002) and circle SVG aspect 1/1 preserved. u6 dx_sw_necessity_three_perspectives (F20): f20b-root P1 + P2 cqh/jinja body fit under IMP-49 partial-fidelity lock. u7 info_management_what_how_when (F8): f8b-root P1 + P2 cqh/jinja body fit. u8 test_imp36_overflow_chain_self_fire (NEW, 299 lines): Selenium self-fire harness for F13/F14/F20/F8 at aspect 1.78 vs 1.0. Asserts line-height changes, font-size invariance across all 4 frames (no per-frame exempt), grid columns rotate 3 -> 1, OVERFLOW_CASCADE_ORDER remains 4-tuple. Stage 4 verification (HEAD `6f1c736` pre-commit baseline): u1 2/2 PASS, u3 33/33 PASS, u8 9/9 PASS (live Chrome). Regression sweep tests/phase_z2 + tests/orchestrator_unit 335/335 PASS. font-size mutations introduced: 0. 
Pre-existing red (test_imp47b_step12_ai_wiring x3, ai_fallback_master_flag default_off x1) verified unchanged via stash swap -> not introduced. Guardrails honored: - cqh / clamp / container query only (no shared margin/padding/gap shrink). - font-size invariant under aspect change (P2 mutates line-height + --max-body-lines only). - No cross-frame .fNb__ class borrowing (IMP-49 partial-fidelity lock). - F14 circle SVG aspect 1/1 untouched; persona colors preserved. - AI isolation: no HTML structure generation; AI calls remain zone-content. - 1 turn = 1 step; commit excludes .claude/settings.json and all out-of-scope untracked worktree per Stage 4 binding contract. source_comment_ids: Stage 1 #13/#14; Stage 2 #21/#22; Stage 3 #4 + Codex #4 YES; Stage 4 Claude #1 + Codex #3 PASS. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-23 01:18:20 +09:00
kyeongmin	6f1c7367e0	feat(#79 ): IMP-51 image_overrides axis (u1~u11 backend stamp+CLI+CSS inject + frontend drag/resize+persistence + tests)	2026-05-22 21:54:38 +09:00
kyeongmin	9388e25e76	feat(#80 ): IMP-52 user_overrides.json persistence (u1~u10 backend + frontend + tests) 4-axis MDX-stem keyed persistence so layout / zone_geometries / zone_sections / frames survive across `/api/run` sessions. Auto-restore on MDX reopen; CLI > file precedence on backend pipeline entry; 300ms-debounced PUT flushed before Generate. u1 src/user_overrides_io.py — load/save/validate_key (MDX-stem regex), 4-axis schema, miss={}, corrupt warning+{}, atomic tmp+rename, foreign-key preserve. u2 src/phase_z2_pipeline.py — post-argparse fallback fills only missing axes. u3 Front/vite.config.ts — GET /api/user-overrides/:key (200 {} on miss, 400 traversal). u4 Front/vite.config.ts — PUT /api/user-overrides/:key, 4-axis allowlist, partial merge. u5 Front/client/src/services/userOverridesApi.ts — typed get/save + flushUserOverrides with 300ms debounce and mutated-axis partial payloads. u6 Front/client/src/pages/Home.tsx + slidePlanUtils.ts — restore on MDX upload (non-frame axes immediately, frames remapped post-loadRun unit_id → region.id). u7 Home.tsx — persist on 4 mutation handlers (section drop, layout select, zone resize, frame select); zone_sizes and Generate excluded. u8 tests/test_user_overrides_io.py — round-trip, unknown-key passthrough, missing/corrupt, invalid keys (26 tests). u9 tests/test_user_overrides_pipeline_fallback.py — per-axis fill, CLI-wins, no-file noop, corrupt warning+skip (16 tests). u10 Home.tsx + user_overrides_write.test.ts — await flushUserOverrides() before runPipeline in handleGenerate try-block head; source-pattern regression assertions (20 → 22 tests). Backend pytest 42/42 green. Frontend vitest 113/113 green (endpoint 42 / restore 21 / service 28 / write 22). HEAD baseline ee97f4f; no spillover to phase_z2 templates / families / frames / pipeline orchestration outside the IMP-52 surface. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-22 11:47:11 +09:00
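As a rough illustration of the u1 behaviours listed above (key validation, miss = {}, corrupt = warning + {}, atomic tmp + rename, foreign-key preserve), a minimal sketch might look like this. It is not the contents of src/user_overrides_io.py: the key regex, the function names, and the one-file-keyed-by-MDX-stem layout are all assumptions made for the example.

```python
import json
import os
import re
import tempfile
import warnings
from pathlib import Path

# Assumed key shape; the real MDX-stem regex is not given in the commit message.
_KEY_RE = re.compile(r"^[A-Za-z0-9][A-Za-z0-9_.-]*$")


def validate_key(key: str) -> str:
    """Reject anything that could escape the overrides namespace (e.g. '../x')."""
    if not _KEY_RE.fullmatch(key) or ".." in key:
        raise ValueError(f"invalid override key: {key!r}")
    return key


def load_overrides(path: Path) -> dict:
    """Missing file -> {}; corrupt file -> warning + {}."""
    try:
        data = json.loads(path.read_text(encoding="utf-8"))
    except FileNotFoundError:
        return {}
    except (json.JSONDecodeError, OSError) as exc:
        warnings.warn(f"ignoring unreadable overrides file {path}: {exc}")
        return {}
    return data if isinstance(data, dict) else {}


def save_overrides(path: Path, key: str, patch: dict) -> None:
    """Merge `patch` into the entry for `key`, leaving other keys untouched."""
    validate_key(key)
    data = load_overrides(path)
    entry = data.get(key)
    entry = dict(entry) if isinstance(entry, dict) else {}
    entry.update(patch)
    data[key] = entry
    # Write to a temp file in the same directory, then rename over the target,
    # so a reader never observes a half-written file.
    fd, tmp = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as fh:
            json.dump(data, fh, ensure_ascii=False, indent=2)
        os.replace(tmp, path)
    except BaseException:
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise
```

The temp file is created next to the target because `os.replace` is only atomic within a single filesystem.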
kyeongmin	ee97f4fc78	feat(#77 ): IMP-48 composition planner re-split on all-reject (u1~u9) Add resplit_all_reject_merges() helper in phase_z2_composition.py that detects parent_merged / parent_merged_inferred units with label=reject and rebuilds them as per-section single units using each section's own rank-1 V4 evidence (no frame swap, MDX raw_content preserved). Pipeline hook fires once after Step 6 settling chain (u12/u4/empty-shell) and section_assignment_plan resolution, before Step 6 artifact write. Guards: beneficial-split rule (>=1 non-reject), coverage equality, layout cap (>4 abort), max_retry=1, section_assignment_override short-circuit. Audit: comp_debug["imp48_resplit"] additive payload (applied, split_units, skipped_units, post_split_unit_count, post_split_layout_preset); selection_path="resplit_from_merge" telemetry on rebuilt singles; layout_preset re-derived via select_layout_preset(new_units). Tests: 39/39 PASS (composition u1~u6: 14 cases; pipeline u7~u9: 25 cases). Scoped regression 720/6 with 6 failures isolated as pre-existing on baseline `79f9ea5` (independent of IMP-48). mdx03 golden lock preserved. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-22 05:00:07 +09:00
kyeongmin	79f9ea5c92	feat(#78 ): IMP-49 dx_sw_necessity partial Figma provenance fix (u1~u3) Replace eyeballed PROMOTED green hex (#296B55, #123328) with verbatim upstream values from figma_to_html_agent/blocks/1171281198/index.html: - border + check mark: #1d4d3e (upstream :208 -webkit-text-stroke) - header gradient: rgb(15, 50, 30) / rgb(60, 52, 34) (upstream :54, :64) Document .f20b__* as authoring-ordinal namespace (NOT Figma frame_id 1171281198); structural link via data-frame-id attribute. No selector rename, no catalog edit. Add focused regression test (tests/test_imp49_partial_figma_provenance.py) extracting <style>-block hex/rgb/rgba literals and asserting non-whitelisted literals exist byte-identically in upstream source. Whitelist limited to neutrals (#fff, #1a1a1a) + shared zone-title token (#000, #883700, rgba(50,44,30,0.4)). Scope: dx_sw_necessity_three_perspectives.html only. 19 missing partials, .fNb__ rename, full 32-contract audit deferred to follow-up axes. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-22 02:49:43 +09:00
kyeongmin	1186ad8ae2	feat(#76 ): IMP-47B reject-as-AI-adaptation activation (u1~u13 backend + tests) - u1~u9: AI fallback infrastructure (router/prompts/schema/validator) + Step 12 hook - u10: e2e reject chain (writes final.html with AI-repaired slot, full coverage) - u11: frontend wiring deferred to follow-up commit (split from IMP-41 hunks) - u12: coverage_invariant guard - u13: cache save gate (visual_check PASS + user_approved/auto_cache) — Codex #22 verified Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-22 00:19:10 +09:00
kyeongmin	90503cadd6	feat(#67 ): IMP-38 V4 max_rank policy formalization (u1~u3, 4 round consensus) - u1: separate templates/phase_z2/catalog/v4_fallback_policy.yaml + load_v4_fallback_policy() loader (catalog pollution prevention — Codex #1 correction) - u2: dynamic effective max_rank in lookup_v4_match_with_fallback (3-variable ceiling min, Codex #2 correction: min(configured, len(judgments_full32))) + 3-tier usable predicate (status + catalog + optional capacity) + trace 8 fields (requested/default/configured_extended/ judgments_count/effective_extended_ceiling/effective_max_rank/usable_count/policy_applied) - u3: 2 production call site cleanup (max_rank=3 removed, HEAD baseline) + tracked Front/vite.config.ts PHASE_Z_MAX_RANK env retired + 4 regression scenarios verified: 32 passed (IMP-38 focused scope) — IMP-05 L4 dedup / L2 schema preserved, IMP-30 allow_provisional byte-identical, caller_override backward compat (tests) Stage cycle (#67, 7 round Claude + 5 round Codex): - Stage 1: Claude #1 -> Codex #1 YES + 5 corrections - Stage 2 r1+r2: Claude #2-#4 -> Codex #2 Q2 -> Codex #3 YES (4 round consensus LOCK 23195) - Stage 3 U1+U2+U3: Claude #5-#9 -> Codex #6 NO 4to3 correction -> Codex #7 YES -> Codex #8 YES - Stage 4: Claude #11 -> Codex #9 (anchor attribution nuance) -> Codex #10 readiness -> Codex #11 Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-21 22:14:05 +09:00
kyeongmin	dceb10129f	feat(#63 ): IMP-34 R1 donor capacity measured bound (u1+u2) Bound donor capacity in plan_zone_ratio_retry by min(static_slack, max(0, clientHeight-scrollHeight)) when both Step 14 measured fields are present; fall back to static contract slack when absent. Prevents the donor from being over-allocated when full-but-not-overflowing, avoiding a wasted Selenium rerender before cascade falls to cross_zone_redistribute. - src/phase_z2_retry.py: planner block L122-157 only; donor filter (L107-112), slack<=0 gate, base_plan, greedy aggregation untouched. Adds measured_empty_px + slack_bound_source telemetry to donor_candidates_considered (additive only). - tests/phase_z2/test_phase_z2_retry_measured_bound.py: 5-axis regression (static_fallback / measured<static / measured>=static / measured==0 excludes / filter+bool guard). Guardrails honored: V4 rank-1 frame lock preserved, no frame_swap, no spacing/padding/gap/line-height/font shrink, no content drop, no MDX 03/04/05 branching, no Step 14 schema mutation. Static fallback idempotent when measured fields absent. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-21 21:37:41 +09:00
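The donor bound described above, min(static_slack, max(0, clientHeight-scrollHeight)) with a static fallback, can be sketched as below. The function name, the returned dict shape, and the "measured" / "static" source labels are hypothetical; only the formula and the telemetry field names come from the commit message. Excluding bools from the "measured" path is a guess at what the "bool guard" in the test list refers to.

```python
def _is_measured(value) -> bool:
    # bool is a subclass of int in Python, so exclude it explicitly (assumed guard).
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def bound_donor_capacity(static_slack: float, client_height=None, scroll_height=None) -> dict:
    """Hypothetical helper: bound a donor zone's capacity by its measured empty space."""
    if _is_measured(client_height) and _is_measured(scroll_height):
        measured_empty_px = max(0, client_height - scroll_height)
        return {
            "capacity_px": min(static_slack, measured_empty_px),
            "measured_empty_px": measured_empty_px,
            "slack_bound_source": "measured",
        }
    # One or both Step 14 fields absent: fall back to the static contract slack.
    return {
        "capacity_px": static_slack,
        "measured_empty_px": None,
        "slack_bound_source": "static",
    }
```

A donor whose clientHeight equals its scrollHeight (full but not overflowing) yields a capacity of 0, which is the over-allocation case the commit message says it prevents.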
kyeongmin	a06dd3d4b0	feat(#42 ): IMP-04b catalog extension to 32 frames (u1~u24) Extends frame_contracts.yaml from 11 to 32 contracts to match V4 evidence (tests/matching/v4_full32_result.yaml unique template_ids), closing the IMP-04b gap surfaced in IMP-04 (#4) Track A milestone. Scope (Stage 2 24-unit plan): - u3/u4: WIP partial absorb — app_sw_package_vs_solution (F23), pre_construction_model_info_stacked (F9). Both promoted from _WIP_FILES.md to frame_contracts.yaml. WIP allowlist now empty. - u5~u11: Track A 7 frames (index.html present, contract missing). - u12~u23: Track B 12 frames (visual_pending: true; family partial authoring deferred — contract-first per Stage 2 plan). - u24: BT closure gate. Adds test_imp04b_closure_gate_v4_coverage_and_wip_empty (catalog ↔ V4 set-equal + WIP==0) and test_vp_exempt_keys_are_contracted_and_disk_absent (vp ∩ disk == ∅). Relaxes test_contracts_set_equals_disk_families_minus_wip to (disk - wip) ∪ vp. 32 derived from V4 evidence YAML (no hardcoding). Closure facts (locked): contracts = 32, v4_unique = 32, missing = [], extra = [], wip_count = 0, vp_count = 19, vp ∩ disk = []. Guardrails honored: - No calculate_fit migration. - No AI/Kei API call in per-frame work. - No 1-2 sample hardcoding (Codex #7 generalization guardrail). - No production refactor for tests (IMP-32 owns helper extract). - figma_to_html / V4 / Phase Z 3-layer separation preserved. - 1 commit = 1 IMP-04b decision unit (bundled u1~u24 per Stage 2 plan; CAT+WIP atomicity for u3/u4 preserved). Tests: tests/test_family_contract_baseline.py 4/4 PASS. Cross-ref: IMP-04 (#4), IMP-29 (#38), IMP-30 (#39), IMP-31 (#40), IMP-32 (#41), IMP-33 (#61), IMP-47A (#75). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-21 19:39:16 +09:00
kyeongmin	15ef7c65e9	fix(#75 ): IMP-47A mdx03 frontend execution stabilization (u1~u4) u1: SlideCanvas iframe sandbox += allow-scripts (allow-same-origin preserved) → embedded-mode script in slide_base.html now applies html.embedded → standalone CSS reset deactivates inside iframe; no clipping u2: designAgentApi.loadRun merges candidate_evidence + v4_all_judgments + v4_candidates via Map<template_id\|id\|frame_id> dedup, LABEL_PRIORITY (use_as_is<light_edit<restructure<reject) then confidence desc, capped TOP_N_FRAMES=6 u3: Home.handleGenerate useCallback deps = [uploadedFile, slidePlan, userSelection, pendingZones, pendingLayout] (5-tuple, stale-closure fix) u4: tests/manual/imp47a_e2e.md — mdx03 manual e2e spec (5 axes) Frontend-only. Backend src/ untouched. No template/catalog edits. Determinism preserved (no LLM in frontend merge logic). Baseline: pytest -q tests → 623 passed. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-21 14:56:56 +09:00
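The u2 merge above lives in the TypeScript frontend (designAgentApi.loadRun). Purely for illustration, the same ordering rule is sketched here in Python. The dedup key fallback chain, LABEL_PRIORITY order, and TOP_N_FRAMES=6 come from the commit message; the function name, the dict field names `label` and `confidence`, and the first-occurrence-wins dedup policy are assumptions.

```python
LABEL_PRIORITY = {"use_as_is": 0, "light_edit": 1, "restructure": 2, "reject": 3}
TOP_N_FRAMES = 6


def merge_frame_candidates(*sources: list) -> list:
    """Hypothetical helper: dedup candidates across sources, then rank and cap."""
    by_key = {}
    for source in sources:
        for cand in source:
            # Dedup key: template_id, else id, else frame_id.
            key = cand.get("template_id") or cand.get("id") or cand.get("frame_id")
            if key is None:
                continue
            by_key.setdefault(key, cand)  # assumed policy: first occurrence wins

    def sort_key(cand):
        rank = LABEL_PRIORITY.get(cand.get("label"), len(LABEL_PRIORITY))
        return (rank, -float(cand.get("confidence") or 0.0))  # confidence descending

    return sorted(by_key.values(), key=sort_key)[:TOP_N_FRAMES]
```

Because the ranking is a pure sort over label and confidence, the merge stays deterministic, in line with the "no LLM in frontend merge logic" note above.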
kyeongmin	c864fe0479	feat(#61 ): IMP-33 AI fallback scaffolding (u1~u11, flag default OFF) Frame-aware AI fallback module scaffolded under src/phase_z2_ai_fallback/ with master flag ai_fallback_enabled=False; normal-path AI call count remains 0. AI output constrained to builder_options_patch / partial_overrides / slot_mapping_proposal; MDX / frame_id / raw HTML / raw CSS mutations rejected at schema layer. IMP-46 cache gate (cache.py) raises AiFallbackCacheGateError unless visual_check_passed AND user_approved. Step 12 wires AI repair after IMP-30 provisional payload only; Step 17 stays blocked behind IMP-34 / IMP-35 prerequisites. AST isolation guard forbids fallback package from importing Phase Q / Kei / pipeline runtime symbols. Docs IMP-17 / IMP-31 bound to runtime module surface via 11-row structural test pin (test_docs_sync.py) so drift fails CI. Tests: 116 fallback / 161 phase_z2 regression / 526 scoped full sweep all passing. Existing pre-IMP-33 fixture issue in scripts/test_phase_t_* remains untouched (out of scope). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-21 12:46:49 +09:00
kyeongmin	c412f1ea75	refactor(#41 ): IMP-32 Step 9 application_plan helper extraction (u1~u5) Pure refactor — extract inline Step 9 per-unit application_plan dict assembly into module-level private helpers for testability. Replaces IMP-05 Case 7 inspect.getsource() literal guard with direct helper-call shape test. Behavior preserved: key set/order, candidate_evidence + fallback_chain compat alias identity, IMP-06 additive plan fields, IMP-11 D-2 markers (single _contract = get_contract(c.template_id) bind + catalog_registered + min_height_px chain). - u1 _application_candidates_for_unit(unit) at src/phase_z2_pipeline.py :2829-2853 — APPLICATION_MODE_BY_V4_LABEL mapping (pure extraction) - u2 _v4_all_judgments_for_unit(v4_all_for_unit) at :2855-2882 — IMP-11 D-2 chain preserved literally - u3 _build_application_plan_unit(unit, zone_plan, selection_trace, plan_record, v4_all_for_unit, layout_preset, layout_candidates_list) at :2885-2995 — byte-identical per-unit dict (key set + order + value identity), candidate_evidence / fallback_chain compat alias, v4_candidates list, v4_all_judgments, application_candidates, IMP-06 additive plan fields - u4 Step 9 inline loop body at :4620-4658 replaced with helper call; per-index/per-id lookups (zone_region_plans[i], v4_fallback_traces .get(...), plan_record_by_unit_id.get(id(unit)), section_alias_by_id, lookup_v4_all_judgments(...)) stay at call-site - u5 tests/test_phase_z2_v4_fallback.py Case 7 rewritten to test_build_application_plan_unit_emits_candidate_evidence_and_alias — direct helper call with SimpleNamespace duck-typed input; asserts candidate_evidence list identity (is), fallback_chain compat-alias identity (is), key order (candidate_evidence before fallback_chain), and compat-alias comment scoped to inspect.getsource(_build_ application_plan_unit) Verification: targeted 22 passed, full pytest 408 passed (0 fail/skip), smoke 11/11 PASS (2 pre-existing baseline SKIPs unchanged). 
Cross-ref: IMP-05 (#5) commit `23d1b25` Case 7 temporary source guard (replaced) / Codex #20 + #21 / IMP-11 D-2 marker preserved.	2026-05-21 03:17:27 +09:00
kyeongmin	1efbf672bd	feat(#39 ): IMP-30 first-render invariant + abort bypass (2 paths) Restore first-render invariant: final.html + Step 20 slide_status MUST be written for every input where Step 0~5 succeed. Two abort paths replaced with provisional/empty-shell synthesis; MDX content preserved, AI-free. - u1 V4Match.provisional + lookup_v4_match_with_fallback(allow_provisional) chain_exhausted -> synthesize rank-1 provisional (opt-in, default-off) - u2 CompositionUnit.provisional propagation (single / parent_merged / parent_merged_inferred constructors) - u3 select_composition_units(allow_provisional_fill=True) last-resort fill + _candidate_state="selected_provisional" - u4 pipeline.py path-(a) abort guard replaced with provisional retry + terminal __empty__ shell (no sys.exit(1)) - u5 zones_data.provisional -> slide_base.html zone--provisional class + data-provisional + needs-adaptation badge (template-only) - u6 compute_slide_status additive provisional_first_render_count/_units (overall enum unchanged per IMP-05 Codex #10 D4) - u7 regression: tests/test_phase_z2_imp30_first_render.py (28 tests) + tests/test_phase_z2_v4_fallback.py (+5 cases) Guardrails verified: MVP1_ALLOWED_STATUSES unchanged, no calculate_fit, no LLM in fallback path, no MDX 03/04/05 hardcoding. Anchor sync (Rule 13): tests/orchestrator_unit/test_imp17_comment_anchor.py re-pinned 564/565 -> 570/571 to track V4Match.provisional shift at src/phase_z2_pipeline.py:179-184. Cross-ref: IMP-05 (#5) §5 defer + Codex #2 first-render invariant.	2026-05-21 00:40:58 +09:00
kyeongmin	265d70ed91	refactor(#28 ): IMP-28 L4 _parse_json dedup (4 modules -> src/json_utils) Consolidate duplicate _parse_json helpers from content_editor.py / design_director.py / kei_client.py (fuller form) and pipeline.py (simple form) into shared src/json_utils.parse_json (strict superset). All 18 call-sites preserved via `parse_json as _parse_json` alias import; no behavior change. - src/json_utils.py (new): shared helper, fenced/plain-fence/bare-brace patterns + list-prefix cleanup fallback. - tests/test_json_utils.py (new): 9 unit tests pinning parser semantics. - src/content_editor.py / design_director.py: remove local helper + unused `import json` / `import re`. - src/kei_client.py / pipeline.py: remove local helper; `json` / `re` retained (used elsewhere). Targeted tests 9 passed; full pytest 374 passed (3 pre-existing scripts/ collection errors reproduce on baseline `909bf75`, IMP-28 unrelated).	2026-05-20 20:44:19 +09:00
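A tolerant parser of the kind described above (fenced / plain-fence / bare-brace patterns tried in order) might be sketched like this. It is not the contents of src/json_utils.py: the regexes and the return-None-on-failure contract are assumptions, and the list-prefix cleanup fallback is left out because the commit message does not say what it does.

```python
import json
import re

FENCE = "`" * 3  # built at runtime so this example contains no literal fence

# Patterns are tried in order, most specific first (assumed ordering).
_PATTERNS = (
    re.compile(FENCE + r"json\s*(.*?)" + FENCE, re.DOTALL),  # fenced block tagged json
    re.compile(FENCE + r"\s*(.*?)" + FENCE, re.DOTALL),      # plain fenced block
    re.compile(r"(\{.*\})", re.DOTALL),                      # bare outermost braces
)


def parse_json(text: str):
    """Hypothetical sketch: return the first JSON value found in `text`, else None."""
    for pattern in _PATTERNS:
        match = pattern.search(text)
        if not match:
            continue
        try:
            return json.loads(match.group(1))
        except json.JSONDecodeError:
            continue  # fall through to the next, looser pattern
    return None
```

Call-sites that import the helper as `parse_json as _parse_json`, as the commit describes, keep their local name while sharing one implementation.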
kyeongmin	909bf75edc	refactor(#27 ): IMP-27 K5 catalog loader + _get_block_by_id cleanup Consolidate three duplicated catalog readers and two _get_block_by_id implementations behind a single shared module (src/catalog.py) that owns file-read + mtime cache. All caller signatures and return contracts remain byte-identical. Units: - u1 NEW src/catalog.py (76 lines): load_root_catalog / load_blocks / get_block_by_id / get_catalog_mtime as the sole file-read + mtime-cache owner. - u2 src/block_reference.py: _load_catalog delegates to load_blocks (list[dict] preserved); _get_block_by_id (no-arg) delegates to catalog.get_block_by_id. Module-level _catalog_cache removed. - u3 src/block_selector.py: load_catalog delegates to load_root_catalog (root dict preserved); _get_block_by_id (catalog-injected sig preserved) delegates to catalog.get_block_by_id. Module-level _catalog_cache / _catalog_mtime / CATALOG_PATH removed. - u4 src/renderer.py: _load_catalog_map and _load_catalog_map_with_variants consume catalog.load_blocks; renderer projection caches kept local but keyed via catalog.get_catalog_mtime(). Per-projection invalidation keys (_CATALOG_MAP_MTIME / _CATALOG_VARIANT_MAP_MTIME) introduced. import yaml, CATALOG_PATH, legacy _CATALOG_MTIME removed. - tests NEW tests/test_catalog_shared_loader.py (421 lines, 23 cases): shared loader + 3 wrappers covering single file-read, contract preservation, signature preservation, shared cache, private state absence, mtime invalidation propagation to renderer projections. Verification: - pytest tests/test_catalog_shared_loader.py -v: 23/23 PASS in 0.13s. - pytest tests/ -q --ignore=tests/matching: 365/365 PASS in 38.10s. - src/fit_verifier.py, src/space_allocator.py, src/pipeline.py and templates/catalog.yaml unchanged (git diff empty). Out of scope: - catalog.yaml schema/path unchanged. - Catalog direct-read call sites in fit_verifier / space_allocator / pipeline left for a separate follow-up axis. 
- Phase Z 22-step runtime, frame_selection, light_edit/restructure flows untouched. Refs: IMP-27 (gitea #27), INSIGHT-MAP §5 K5, PHASE-Q-AUDIT §2.10	2026-05-20 19:31:26 +09:00
kyeongmin	5d23b747ff	fix(orchestrator): P5b first-line agent header strict + supplement throttle Bug discovered during #24 IMP-24 K6 Stage 2 (2026-05-20): - Codex r1, r2, r3 started with '=== IMPLEMENTATION_UNITS ===' on first line (not '[Codex #N] ...'), so detect_agent (P0-1 strict, first-line only) returned None. - For non-audit issues, the P5 supplement guard was audit-only gated → silent loop until Codex r4 happened to use correct format. 4 rounds wasted. Verified that #21 Stage 4 had the same latent silent loop pattern ('## [Codex #1]' first line) — orchestrator looped through ~10 Claude rounds before random recovery. P5b fix addresses this long-standing bug. Patch (defensive parser-contract hardening; does not assume single root cause): 1. RULES global gets explicit "FIRST non-empty line MUST be [Claude #N] / [Codex #N]" rule that OVERRIDES any stage-specific "body MUST contain" constraint. 2. COMPACT_PLAN_RULE wording clarified: "body" begins AFTER the first-line agent header. The 'body MUST contain ONLY' set no longer accidentally permits '=== IMPLEMENTATION_UNITS ===' on line 1. 3. is_codex None supplement guard: - audit-only gate REMOVED → fires for all issues (#24 latent loop fixed) - Throttle: max 2 supplements per stage; on 3rd violation, orchestrator hard-stops the issue with explicit "user action required" message and exits run_stage cleanly - Supplement message names both Claude AND Codex (Claude's first-line violation also breaks downstream via Codex mimicry) - Body-head 80 chars logged on detection failure (debugging aid) 4. 
Regression tests (+5 cases in test_orchestrator_core.py): - TestDetectAgent: '=== IMPLEMENTATION_UNITS ===' first line → None - TestDetectAgent: [Codex #N] first line + units after → 'codex' OK - TestDetectAgent: '## ', '📌 ', '' prefix all → None - TestRulesAndCompactPlanFirstLineContract: RULES wording has FIRST/OVERRIDES - TestRulesAndCompactPlanFirstLineContract: COMPACT_PLAN_RULE has carve-out Cosmetic side effect (accepted): Claude's '📌 [Claude #N] ...' or '## [Codex #N] ...' decoration prefixes will fail detect_agent. Agents will drop decorations from line 1; line 2+ can still use them. Out of scope (NOT included to keep regression risk low): - detect_agent function logic UNCHANGED (P0-1 strict preserved) - consensus parser UNCHANGED - stage loop structure UNCHANGED - git/Gitea retrieval logic UNCHANGED - audit-only mode P4/P4a guards UNCHANGED - pre-post comment validation (future axis, larger refactor) Total: 131/131 pytest pass (126 prior + 5 new). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-20 17:01:24 +09:00
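The first-line contract and supplement throttle described in this commit can be sketched as below. This is not the orchestrator's actual `detect_agent`: the regex, the choice to skip leading blank lines, and the `next_action` helper are assumptions made for the example. Only the header shape, the None results for decorated or missing headers, and the limit of 2 supplements come from the commit message.

```python
import re

# A header must open the first non-empty line: "[Claude #N]" or "[Codex #N]".
_HEADER_RE = re.compile(r"^\[(Claude|Codex) #\d+\]")

MAX_SUPPLEMENTS_PER_STAGE = 2


def detect_agent(body: str):
    """Strict first-line detection: only the first non-empty line is inspected."""
    for line in body.splitlines():
        if not line.strip():
            continue  # assumed: leading blank lines are skipped
        match = _HEADER_RE.match(line)
        return match.group(1).lower() if match else None
    return None


def next_action(supplements_posted: int) -> str:
    """Hypothetical throttle: two supplements per stage, hard stop on the third violation."""
    if supplements_posted < MAX_SUPPLEMENTS_PER_STAGE:
        return "post_supplement"
    return "hard_stop"
```

Anchoring the regex at column 0 is what makes decoration prefixes such as '## ' or '📌 ' fail detection, which matches the accepted cosmetic side effect noted above.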
kyeongmin	134f52d3d3	feat(#58 ): L3 dormant trigger guard -- DORMANT-TRIGGERS.yaml + checker + orchestrator hook P5-1 docs/architecture/DORMANT-TRIGGERS.yaml -- 5 entries (IMP-16/17/18/19 active + IMP-20 followup-linked #55). P5-2 scripts/check_dormant_triggers.py -- standalone, reads registry, scans tree + diff, writes .orchestrator/dormant_alerts.json, exit 0 always. P5-3 orchestrator.py -- _check_dormant_triggers() helper + Stage 4->5 informational alert branch (skips audit-only, never blocks). P5-4 tests/orchestrator_unit/test_dormant_triggers.py -- 30 cases (yaml schema, registry contents, checker matching, false-positive guards, manual-evidence skip, orchestrator branch, audit bypass, governance ref). P5-5 PROJECT-INTENT-AND-GOVERNANCE.md -- single anti-patterns row referencing the L3 registry as binding contract surface. Tests: pytest -q tests = 337 passed (baseline 307 + 30 new). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-20 09:43:14 +09:00
kyeongmin	9389b8425b	fix(orchestrator): P5 audit-anchor-first-line regression guard Bug discovered during #56 INTEGRATION-AUDIT-02 execution (2026-05-20): - Both Claude and Codex put "Audit anchor: ..." as the FIRST line of every Gitea comment per the #56 issue body instruction "cite anchor at start of every stage". - detect_agent (P0-1 strict, first-line only) then returns None for these comments because the first line is "Audit anchor:..." not "[Codex #N]" or "[Claude #N]". - Result: orchestrator's "is_codex" check (line ~1288) flips false → "Codex 응답 미감지 — continuing" → infinite Stage 4 loop. #56 reached Round #14 (>300 comments, ~2 hours wasted token). Fix path (NOT relaxing detect_agent — that would revive the original #45 pre-P0-1 bug where [Claude #N] citations inside Codex bodies caused mis-detection): 1. AUDIT_ONLY_NOTE updated to enforce comment format: - FIRST non-empty line MUST be `[Claude #N] <stage>` or `[Codex #N] <stage>` - Audit anchor / banners / prefaces MUST appear line 2 or later - Concrete CORRECT example included - Explicit warning that violation breaks stage advance 2. is_codex None guard auto-supplements: - When _audit_mode(title) AND detect_agent returns None, orchestrator posts a Gitea supplement comment requesting the correct format - Next round's Claude/Codex see the supplement and correct - Breaks the infinite loop automatically (no manual ctrl-C needed) 3. Regression tests in TestDetectAgent (test_orchestrator_core.py): - test_audit_anchor_preface_breaks_detection: confirms P0-1 strict correctly returns None when anchor is first line - test_audit_anchor_after_header_works: correct format passes Total: 96/96 pytest pass (94 prior + 2 P5 regression). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-20 07:03:12 +09:00
kyeongmin	02e2ae0afb	docs(#54 ): F-4 legacy annotation + F-5 fixture convention -- AUDIT-01 housekeeping INTEGRATION-AUDIT-01 (#50) §10.4 / §10.5 housekeeping carry-over. F-4: annotate 14 remaining legacy Phase R'/Q sample-text hits across 10 src/ files with inline marker `# [legacy Phase R'/Q example -- INTEGRATION-AUDIT-01 §10.4]`. Comment-only. No string-literal / regex / sample dict value mutated. fit_verifier.py L612 marker keeps Phase Z partial-live import graph (FitAnalysis / RoleFit / redistribute / salvage) byte-precise. F-5: docs-only addendum -- §10.5.1 in INTEGRATION-AUDIT-01-REPORT.md + tests/CLAUDE.md fixture convention note. No root tests/fixtures/ dir created; existing tests/phase_z2/fixtures/ convention preserved. Documents test-only sample-reference allowance vs src/** runtime prohibition. Out of scope: Phase Z source 11 hits (phase_z2_content_extractor / failure_router / mapper / retry), production behavior change, #19 work. Verified: pytest -q tests/phase_z2/ = 157 PASS. git diff +210/-0 (35 src/docs lines + 175 new tests/CLAUDE.md). No behavioral delta. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-19 20:23:36 +09:00
kyeongmin	8f06a4c99f	docs(IMP-52): reconcile Phase Z family count drift -- F-2 option (c) Audit follow-up F-2 (INTEGRATION-AUDIT-01 §10.2). Phase Z families surface showed 11 tracked / 11 contracted / 13 on disk. The 2 untracked WIP files (app_sw_package_vs_solution.html, pre_construction_model_info_stacked.html) are now declared in _WIP_FILES.md as uncontracted and out-of-scope for the runtime matcher; promote/remove is gated on #42. The 11/11 tracked + contracted baseline is unchanged. A new pytest enforces tracked families ↔ frame_contracts.yaml set-equality modulo the WIP allowlist parsed from _WIP_FILES.md, so future drift fails fast in CI before #42 expands to 32 frames. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-19 19:15:04 +09:00
kyeongmin	e32f632464	fix(orchestrator): P4a baseline-diff guard + Stage 5 commit scope P4 had two production issues blocking #50 integration audit deployment: 1. Stage 3 guard had no baseline awareness — flagged ALL forbidden-path changes including pre-existing dirty WIP. Empirical: 328 such files already in current working tree (tests/matching/ artifacts etc). #50 would have hit reject loops immediately without Claude doing anything wrong. 2. Stage 5 had no commit-scope guard — if Claude ran `git add -A` and committed user's existing WIP, audit commit would be polluted with unrelated production changes. P4a additions: - _audit_baseline_path / _ensure_audit_baseline / _load_audit_baseline: snapshot working-tree dirty paths at run_issue entry for audit issues. Resumed runs preserve existing baseline (no overwrite). - _check_audit_only_violations(baseline=None): accept baseline set, subtract from violations — only flags NEW forbidden changes introduced after audit start. - _check_audit_commit_scope: verify HEAD commit's file list matches AUDIT_ALLOWED_COMMIT_GLOBS (INTEGRATION-AUDIT-*.md, BACKLOG.md). - run_issue: save baseline on audit-mode entry only — no impact on normal issues. - Stage 5 (commit-push) YES gate: new guard rejects on out-of-scope files with remediation prompt (git reset --soft + force-with-lease). 19 new tests: - baseline subtraction (5): pre-existing removed, None=keep-all, empty-set=catch-all, full-coverage filter, Windows path normalize. - baseline persist (5): roundtrip, no-overwrite on resume, missing fallback, corrupt JSON fallback, non-list fallback. - commit scope detection (7): report-only allowed, backlog allowed, src/ rejected, unrelated docs rejected, git error fail-open, Windows backslash, empty commit pass. - allowed globs sanity (2): every glob has audit marker, all under docs/architecture/. Total: 94/94 pytest pass (75 prior + 19 new). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-19 10:29:15 +09:00
kyeongmin	4289a500b6	feat(orchestrator): P3 wrapper input/encoding fix + P4 audit-only mode P3 hotfix (2026-05-18 — verified during #46 retry attempt): - _run_with_tree_kill: encode input only when Popen is in binary mode. Previously force-encoded str→bytes even with encoding= set, breaking text-mode stdin pipes with: write() argument must be str, not bytes. - run_claude path was the only affected call site. - 3 new C7 regression tests (input+encoding / bytes+binary / auto-encode). - C3/C6 test fixtures hardened with DEVNULL stdio isolation. P4 audit-only mode (2026-05-19, prep for #50 integration audit): - _is_audit_issue: title-based detection for [INTEGRATION-AUDIT], [AUDIT-ONLY], or "integration audit" phrase. - _audit_mode + --audit-only CLI flag: manual override regardless of title. - AUDIT_ONLY_NOTE injected into context pack across all stages/rounds. - Stage 3 (code-edit) YES gate: deterministic git status check. Changes touching src/, templates/, tests/* auto-reject Stage 3 YES and post a supplement-request comment. LLM-independent enforcement. - 26 new audit-mode tests (title detection, CLI override, forbidden prefix detection, allowed paths pass, Windows backslash normalization, quoted paths with spaces, git error fail-open, constants sanity). Total: 75/75 pytest pass. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-19 10:18:28 +09:00
kyeongmin	e10ec36617	feat(IMP-17): AI repair fallback infra carve-out -- design-only boundary + 3-cond AND gate u1 -- src/phase_z2_pipeline.py:564 route hint comment corrected from non-existent IMP-31 to IMP-17 (carve-out, AI fallback only, outside the normal path). Line 565 IMP-29 frontend override reference untouched. u2 -- docs/architecture/IMP-17-CARVE-OUT.md (new) defines: - allowed scope (Step 12 restructure proposal, Step 16/17 retry fallback) - forbidden scope (normal-path AI calls, MDX compression, HTML structure) - 3-condition AND activation gate (User GO ∧ B4 frame_selection evidence ∧ IMP-04 catalog + IMP-05 V4 fallback live) - pattern shape reference (link-only): content_editor.py:21,318 + sse_utils.py:16-50 (Phase Q Archive Candidate, no port) - AI isolation contract + Kei persona severance (permanent) u3 -- PHASE-Z-IMPLEMENTATION-ISSUE-BACKLOG.md:68 IMP-17 row gains carve-out doc link + 3-cond AND gate pointer. u4 -- PHASE-Q-INSIGHT-TO-22STEP-MAP.md AI repair fallback infra registry row prefixed with IMP-17 + carve-out link; normal_path=no preserved. Anchor test: tests/orchestrator_unit/test_imp17_comment_anchor.py asserts line 564 IMP-17 wording AND line 565 IMP-29 preservation (2 tests pass). Runtime behavior change: 0. Only delta in executable file is one comment line. Normal-path AI invocation count remains 0. Refs: gitea #17 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-19 08:12:43 +09:00
kyeongmin	23ba8b68cd	feat(IMP-16): U1 H3 verification utility port + U2 wiring design U1 (runtime, u1-u10): new Phase Z-owned deterministic verification module src/phase_z2_verification_utils.py (335 LOC, stdlib only) porting H3 utility surface — VerificationResult, extract_text_from_html, normalize_for_comparison, extract_keywords, strip_meta_lines, split_into_sentences, verify_text_preservation, detect_invented_text. 10 unit tests under tests/phase_z2/test_pz2_vu_*.py (56 tests). u11 (design-only): docs/architecture/IMP-16-U2-WIRING-DESIGN.md fixes the Step 1/2/14/21/22 reverse-path contract, redesigned frame-contract pattern reservation (IMP-20), and IMP-07 hard-gate criteria. No runtime wiring lands in this commit — U2 stays blocked until IMP-07 reverse path is implemented + verified + runtime-hit. Guardrails: no src.content_verifier import; no FORBIDDEN_KEI_MEMOS / generate_with_retry / REQUIRED_PATTERNS / verify_structure / verify_area / verify_all_areas usage; no AI / Kei / httpx / SSE path; AI-isolation contract upheld (utility is deterministic). Tests: 56 targeted PASS (0.19s), 15 regression baseline PASS (7.59s). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-19 04:42:35 +09:00
kyeongmin	614c53358e	feat(IMP-15): Execution-4 -- debug.json event surfacing + spec taxonomy row Issue: #48 (IMP-15 Execution-4, axis 4: debug.json + spec doc trace). Parent: #15. Depends on Execution-1/2/3 (events + classifier outputs). Surfaces the image/table event streams that Execution-1/2/3 already produced and consumed, mirroring the existing `zone_geometries_px` top-level precedent (no new pattern introduced). Adds the matching taxonomy row to the Phase Z fit-classifier/router spec. src/phase_z2_pipeline.py (+3): - write_debug_json now lifts `image_events` and `table_events` to top-level of `debug.json` via `(visual_runtime_check or {}).get(<k>, [])`, exactly mirroring the immediately preceding `zone_geometries_px` surfacing line. Defaults to `[]` when `visual_runtime_check` is None; additive, no consumer-visible breakage. docs/architecture/PHASE-Z-FIT-CLASSIFIER-ROUTER-SPEC.md (+1): - §3.1 taxonomy adds `image_aspect_mismatch` row. Row text explicitly marks the signal as post-render `fail_reasons` from Step 14 visual_runtime_check (rendered vs declared aspect ratio mismatch), NOT a router-routed fit_classifier output, and notes the separate `image_events` stream surface. Prevents future readers from wiring this taxonomy into §3.2 priority list or §4 router action map. tests/phase_z2/test_debug_json_event_surfacing.py (new, 2 tests): - `test_write_debug_json_surfaces_image_and_table_events` invokes write_debug_json with synthetic visual_runtime_check containing both event lists; reads back the on-disk debug.json and asserts both keys are present at top level with the exact payloads. - `test_write_debug_json_defaults_when_visual_runtime_check_none` asserts both new keys default to `[]` when visual_runtime_check is None, guarding the defensive `(… or {})` pattern. 
tests/phase_z2/test_spec_taxonomy_image_aspect_mismatch.py (new, 2 tests): - `test_spec_has_image_aspect_mismatch_row` opens the spec file and asserts exactly one `^\| image_aspect_mismatch \|` row exists inside the §3.1 table block (no markdown-parser dependency). - `test_spec_row_marks_post_render_fail_reasons_semantic` asserts the row text carries both "Post-render" and "fail_reasons" tokens, enforcing the Stage 1 guardrail wording. Verification (Stage 4 PASS, Claude + Codex independent): - pytest -q tests/phase_z2/test_debug_json_event_surfacing.py \ tests/phase_z2/test_spec_taxonomy_image_aspect_mismatch.py → 4 passed in 0.07s. - git diff scope: 4 files, +148 insertions / 0 deletions. Scope-locked: no edits to classifier (Execution-3), event generation (Execution-1/2), Step 21 viewer, §3.2 priority list, §4 router action mapping, or `table_self_overflow` taxonomy row. Pre-existing dirty/untracked working-tree files left untouched. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-18 22:25:41 +09:00
kyeongmin	535c4848fd	feat(IMP-15): Execution-3 -- classifier consumes image+table events Issue #47 (IMP-15 Execution-3 axis 3): extend `classify_visual_runtime_check` to consume the `image_events[]` and `table_events[]` arrays produced by `run_overflow_check` (Execution-1/2) and widen `visual_check_passed`. Changes (src/phase_z2_classifier.py): - Remove `overflow.passed=True` early-return so image/table event scans always run, even when zone-level overflow was clean. - Deferred import of `IMAGE_ASPECT_DELTA_TOL` and `TABLE_SCROLL_TOL_PX` from `phase_z2_pipeline` (circular-safe SSoT; no duplicate literals). - New `image_events` scan emits `image_aspect_mismatch` when `delta is not None AND |delta| > IMAGE_ASPECT_DELTA_TOL` (delta=None ⇒ skip, image not loaded). - New `table_events` scan emits `tabular_overflow` when `wrapper_clipped_index is None AND (excess_x or excess_y > TABLE_SCROLL_TOL_PX)` (wrapper-clipped tables deduped against the existing zone cascade). - `visual_check_passed = overflow.passed AND not classifications`: any image/table classification now flips the gate. Guardrails preserved: - §3.2 8-rule zone cascade (clipped_inner / zone-self) untouched; the new emitters are ADDITIONAL. - `placement_diagnostics`, `categories_seen`, `unclassified_signals` return-shape preserved. - No `pipeline.py` production changes; no router action or `debug.json` passthrough changes. Tests (tests/phase_z2/test_phase_z2_visual_classifier.py, new): - `test_image_aspect_mismatch_emits_classification` (|delta|>TOL fires) - `test_image_aspect_delta_below_tol_no_classification` (≤TOL skipped) - `test_standalone_table_overflow_emits_classification` (wrapper_clipped_index=None, excess>TOL fires) - `test_table_dedup_when_wrapper_clipped` (wrapper_clipped_index set ⇒ no `tabular_overflow` emit) All 4 pure-dict (no Selenium / chromedriver / pipeline execution). Tolerances imported from `phase_z2_pipeline` (SSoT enforced via test import; no classifier-local literals). 
Verification (Stage 4): - New classifier tests: 4/4 PASS. - Regression `tests/phase_z2/` excluding new file: 93/93 PASS. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-18 21:45:06 +09:00
kyeongmin	2827622858	feat(IMP-16): Step 14 table_self_overflow detection Add table self-overflow detection with element-identity wrapper dedup, mirroring the image_aspect_mismatch axis pattern (#45). JS layer: TABLE_SCROLL_TOL_PX=5 module constant; clippedWrapperMap built as Map<Element,int> keyed by DOM node reference (NOT className) so two wrappers with identical class strings remain distinguishable; table_events collected via querySelectorAll('table').forEach with closest()-ancestor walk resolving wrapper_clipped_index = int|null. Py layer: aggregate result['table_events'] and append fail_reason 'table_self_overflow' only when (excess_x>TOL OR excess_y>TOL) AND wrapper_clipped_index is None; wrapper-clipped path continues to fail via existing clipped_inner reporting. Tests (Selenium, chromedriver guard mirrored from image_check): - Fixture D: standalone <table> overflow → table_self_overflow fail - Fixture E: <table> in clipped wrapper → dedup suppresses table fail - Fixture F (F1 acceptance): two wrappers with identical className f13b-cell, W1 clipped by non-table child, W2 hosts self-overflow <table> with W2 itself NOT clipped → element-identity ensures W2's table is not suppressed by W1's class; both fails emitted. Out of scope: image_events behavior (intact from #45), classifier pass/fail consumer (→ Execution-3), debug.json surfacing (→ Execution-4). Refs: #46 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-18 21:06:01 +09:00
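
The Py-layer rule stated in commit 2827622858 (fail only when a table overflows past the tolerance AND no clipped wrapper already accounts for it) can be sketched as a small predicate. This is an illustrative reading of the commit message, not the repository's code: the function name `is_table_self_overflow` and the `TableEvent` dict shape are hypothetical, and only the field names `excess_x`, `excess_y`, `wrapper_clipped_index` and the value `TABLE_SCROLL_TOL_PX=5` come from the log. Reusing the JS-layer tolerance value on the Python side is also an assumption.

```python
from typing import Optional, TypedDict

# Value quoted in the commit message for the JS-layer constant; assumed here
# to be the same tolerance (in px) on the Python side.
TABLE_SCROLL_TOL_PX = 5


class TableEvent(TypedDict):
    """Hypothetical event shape; only the three field names come from the log."""
    excess_x: float
    excess_y: float
    wrapper_clipped_index: Optional[int]


def is_table_self_overflow(ev: TableEvent) -> bool:
    """True when the table itself overflows and no clipped wrapper covers it."""
    overflows = (ev["excess_x"] > TABLE_SCROLL_TOL_PX
                 or ev["excess_y"] > TABLE_SCROLL_TOL_PX)
    # A table inside an already-clipped wrapper is left to the existing
    # clipped_inner reporting, so it is deduplicated (not flagged) here.
    return overflows and ev["wrapper_clipped_index"] is None


# Shapes loosely modelled on Fixture D (standalone) and Fixture E (wrapped).
standalone = {"excess_x": 12.0, "excess_y": 0.0, "wrapper_clipped_index": None}
wrapped = {"excess_x": 12.0, "excess_y": 0.0, "wrapper_clipped_index": 0}
```

With these inputs the predicate flags `standalone` and skips `wrapped`. The commit message does not say whether the real code appends the fail reason once per table or once per check, so the sketch stops at the per-event decision.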

69 Commits