IMP-03 A-1 popup/image/table trace #3


Kyeongmin · 2026-05-12T17:38:49+09:00

**Related step**: Step 3
**source**: INSIGHT-MAP §2 (A-1 chained reinforcement)
**priority**: medium

**scope**:

- Convert the popups / images / tables in the normalized output into ContentObject
- Activate or reinforce the render path of the B1 v0 dormant module

**guardrail / validation**:

- No regression in AI/Kei content extraction
- The popup/image/table extraction trace must be explainable
- Conforms to the ContentObject schema

**dependency**: `hard link: IMP-02` (depends on the popup/image/table lists in the Stage 0 normalize output)

**cross-ref**:

- [backlog §1 IMP-03](https://gitea.hmac.kr/Kyeongmin/C.E.L_Slide_test2/src/branch/main/docs/architecture/PHASE-Z-IMPLEMENTATION-ISSUE-BACKLOG.md)
- [INSIGHT-MAP §2 Step 3](https://gitea.hmac.kr/Kyeongmin/C.E.L_Slide_test2/src/branch/main/docs/architecture/PHASE-Q-INSIGHT-TO-22STEP-MAP.md)
- [22-step pipeline Step 3](https://gitea.hmac.kr/Kyeongmin/C.E.L_Slide_test2/src/branch/main/docs/architecture/PHASE-Z-PIPELINE-OVERVIEW.md)

**review loop**:

- [ ] Codex first review
- [ ] Claude re-review
- [ ] Codex re-verification
- [ ] scope-locked
- [ ] ready-for-implementation
- [ ] implemented
- [ ] verified

Kyeongmin added the needs-codex-review label 2026-05-12 17:38:49 +09:00

Kyeongmin added this to the B-1 §1 22-step map (11) milestone 2026-05-12 18:16:10 +09:00

| sub-axis | Description | Risk |
| --- | --- | --- |
| A: schema extension | Add the three types `table` / `image` / `details` to B1 v0 (per SPEC v1 §1.2) | Must guarantee no regression in text_block / transform_table |
| B: input source alignment | `extract_content_objects` also consumes the popups/images/tables of the `normalized` dict, completing the IMP-02 hand-off chain | Risk of a dual-source conflict with the section.raw_content path; a priority chain must be defined |
| C: Step 3 trace reinforcement | Record popup/image/table occurrences in the trace artifact (final.html / render path / mapper untouched; still trace-only) | "Render path activation" (replacing the mapper) is split off into a separate axis |

| Condition | Proposal |
| --- | --- |
| 1. Add the three SPEC v1 §1.2 types `table` / `image` / `details` to B1 v0 | Confirmed |
| 2. Extend the `extract_content_objects` API with a new `normalized: Optional[dict] = None` arg | Confirmed (backward compat) |
| 3. Step 02 schema addition: additive fields `normalized_popups` / `normalized_images` / `normalized_tables` (populated only when env=1; empty lists when env=OFF) | Confirmed (handled within IMP-03) |
| 4. canary flag: (a) separate `PHASE_Z_STEP3_RICH_OBJECTS_ENABLED` default OFF vs (b) conditional on the IMP-02 flag | Codex opinion requested |
| 5. render path = trace fidelity reinforcement only. No change to mapper / `pipeline_path_connected: False` / V4 / composition | Confirmed |
| 6. env=OFF fallback path: (a) regex-based best effort on `section.raw_content` vs (b) IMP-03 itself conditional on env=1 (v0 behavior preserved under env=OFF) | Codex opinion requested |
| 7. id pattern: `{section_id}.{popup,image,table}-{N}` (separate prefix per type) | Confirmed |
| 8. `details` is the official type name (SPEC §1.2); the backlog's "popup" is absorbed as `details.display_hint=popup` | Confirmed |
| 9. guardrail (IMP-03 description): no regression in AI/Kei content extraction / trace explainable / ContentObject schema conformance | Confirmed |
| 10. out of scope: render path activation / mapper replacement / V4 / composition / Step 6+ / `diagram` type | Confirmed |

| # | Codex answer/catch | Self-verification | Result |
| --- | --- | --- | --- |
| Q1 | `stage0_normalized_assets` nested (top-level field rejected) | Separating diagnostics from data handoff is cleaner; nesting gives better semantic cohesion | ✅ Accepted |
| Q2 | separate `PHASE_Z_STEP3_RICH_OBJECTS_ENABLED` default OFF | Independent axis, independent canary. The IMP-02 dependency is expressed as a data-presence check (no flag coupling) | ✅ Accepted |
| Q3 | When env is OFF: preserve v0, no regex fallback, disabled trace marker | regex would duplicate mdx_normalizer logic and enlarge the risk surface; this is the safe choice | ✅ Accepted |
| Q4 | render path = trace fidelity only; mapper / `pipeline_path_connected=False` kept | Consistent with the self round 1 lock | ✅ Accepted |
| Q5 | Official name = `details` (SPEC §1.2); source wording preserved via `type_specific.display_hint="popup"` | Keeps both SPEC conformance and the user's wording | ✅ Accepted |
| Q6 | id pattern `{section_id}.{details,image,table}-{N}` | Separate type prefixes + collision avoidance | ✅ Accepted |
| A | two-layer separation: `plan_placement()` selects the frame by `{obj.type}`, so exposing new types would change B4 | 🚨 A major catch missed in self round 1. A parallel list (`content_objects` v0 unchanged + new `rich_content_objects`) is required; otherwise B4 regresses and the trace-fidelity-only guarantee breaks | ✅ Accepted (core lock condition) |
| B | Serialized object trace in the Step 3 artifact | step03 currently has only `internal_regions`, so IMP-03's output would not be observable | ✅ Accepted |
| C | Dedup of transform_table vs generic table | The normalize tables list may also contain arrow tables, so there is a duplication risk | ✅ Accepted (dedup rule in §3 below) |
| D | JSON-safe payload | hygiene | ✅ Accepted |
| E | 6 verification items | Adequate as a minimum bar | ✅ Accepted |

| field | Location | Meaning |
| --- | --- | --- |
| `stage0_adapter_diagnostics.adapter_counts.popups` | nested under diagnostics | Count recorded by IMP-02 (audit trail) |
| `stage0_normalized_assets.popups` | top-level handoff | Actual list that IMP-03 consumes |

| # | Condition | Status |
| --- | --- | --- |
| 1 | Add the three SPEC v1 §1.2 types `table` / `image` / `details` to B1 v0. `diagram` is out of scope. | Confirmed |
| 2 | API extension: `extract_content_objects(section, source_shape=None, normalized=None)`; `normalized` defaults to `None` for backward compat | Confirmed |
| 3 | Step 2 schema addition: additive nested field `stage0_normalized_assets: {popups: list, images: list, tables: list}`. Empty lists when env=OFF (always empty when `stage0_adapter_diagnostics.enabled=false`) | Confirmed |
| 4 | canary flag: `PHASE_Z_STEP3_RICH_OBJECTS_ENABLED` default OFF. Rich extraction runs only when env=1 and `stage0_normalized_assets` is non-empty. If either is false, record a disabled marker (`rich_objects_enabled=false`, `disabled_reason="..."`) | Confirmed |
| 5 | two-layer separation (Codex catch A): v0 `content_objects` unchanged + new parallel list `rich_content_objects`. `plan_placement()` is fed v0 only, so no B4 trace regression | Core lock |
| 6 | Step 3 artifact fields added: `rich_content_objects` / `rich_content_objects_enabled` / `rich_content_objects_source` / `rich_content_objects_disabled_reason`. Existing fields preserved | Confirmed |
| 7 | env OFF preserves v0: no regex fallback; only the disabled marker is recorded | Confirmed |
| 8 | `details` is the official type; source wording preserved via `type_specific.display_hint="popup"` | Confirmed |
| 9 | id pattern `{section_id}.{details,image,table}-{N}` (rank-1 prefix separation) | Confirmed |
| 10 | transform_table dedup rule (§3 above) | Codex opinion requested (P/Q in §2 above + dedup options a/b/c in §3 above) |
| 11 | render path = trace fidelity only. No change to mapper / V4 / composition / Step 6+ / AI/Kei / `pipeline_path_connected=False` | Confirmed |
| 12 | guardrail (IMP-03 description): no regression in AI/Kei content extraction / popup/image/table extraction trace explainable / ContentObject schema conformance | Confirmed |

| # | Codex decision/catch | Self-verification | Result |
| --- | --- | --- | --- |
| 1 (P/Q) | P (keep both count and list) + invariant `adapter_counts.X == len(stage0_normalized_assets.X)` + soft handling of a mismatch as a warning | Clear semantic separation (diagnostics ≠ data handoff); backward compat preserved | ✅ Accepted |
| 2 (a/b/c) | C simplified: `transform_table` is sourced from the v0 path alone. normalize tables become generic `table` only for non-transform tables. When an arrow is detected, skip with reason `skipped_transform_table_duplicate` | Avoids fragile line-range matching; clean type-level separation | ✅ Accepted |
| 3 (impl shape) | C: `extract_content_objects()` v0 unchanged + new function `extract_rich_content_objects()`. No tuple/namedtuple return | Avoids accidental misuse and makes the invariant explicit. Forces `placement_plan = plan_placement(content_objects=...)` to be fed v0 only | ✅ Accepted |
| A (handoff wiring) | Extend the return shape of `_stage0_chained_adapter()` to expose the assets separately | The 5-tuple `(title, sections, footer, diagnostics, normalized_assets)` is cleanest. backward compat = only the caller side adds an unpack | ✅ Accepted |
| B (disabled state) | explicit `rich_content_objects_disabled_reason: FLAG_OFF / NO_NORMALIZED_ASSETS` | auditable env-OFF/v0 preservation | ✅ Accepted |
| C (B4/Internal Region scope guard) | rich objects are not connected to region/placement planning | Consistent with scope-lock #5 + #11 | ✅ Accepted |
| D (placement-input assertion) | During verification, assert directly that `plan_placement()` receives only the v0 list | Explicit guard against the regression risk | ✅ Accepted |
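
The count/list invariant from decision 1 (P/Q) can be sketched as a soft check. This is a minimal illustration, assuming `adapter_counts` is a plain dict keyed by `popups` / `images` / `tables`; the helper name `check_count_list_invariant` and the warning wording are hypothetical and are not taken from the repository.

```python
from typing import Any

ASSET_KINDS = ("popups", "images", "tables")


def check_count_list_invariant(
    adapter_counts: dict[str, int],
    normalized_assets: dict[str, list[Any]],
) -> list[str]:
    """Return soft warnings when a diagnostics count disagrees with its list length."""
    warnings: list[str] = []
    for kind in ASSET_KINDS:
        count = adapter_counts.get(kind, 0)
        actual = len(normalized_assets.get(kind, []))
        if count != actual:
            # Soft handling: record the mismatch as a warning, never raise.
            warnings.append(
                f"adapter_counts.{kind}={count} != "
                f"len(stage0_normalized_assets.{kind})={actual}"
            )
    return warnings
```

An empty return value means the invariant holds for all three asset kinds; any entries would be recorded next to the trace rather than failing the run.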

| Option | Approach | Pros | Cons |
| --- | --- | --- | --- |
| (a) source line / position | `mdx_normalizer` records a line per asset, which is then matched against each section's line range | Accurate | Requires modifying `mdx_normalizer` (touches the IMP-02 area: scope creep) |
| (b) text overlap | Substring-search `popup.content` / `image.alt` etc. in section.raw_content and attribute the asset to the matching section | Code addition only (`mdx_normalizer` untouched) | Heuristic; false-positive risk; ambiguity when multiple sections match |
| (c) slide-level attribution | Do not map assets to sections. id = `_slide.image-1`, `_slide.details-1`, `_slide.table-1` (slide-global namespace) | Simplest, zero heuristics, fully traceable | No per-section distribution trace; reattribution is needed if region/placement uses the assets later |
| (d) defer | Defer section attribution itself in IMP-03. Assets are only surfaced independently of sections; the section_id field itself is omitted (separate ID scheme) | Smaller scope | Makes future axes more complex |

| # | Condition | Status |
| --- | --- | --- |
| 1 | Add the three SPEC v1 §1.2 types `table` / `image` / `details`. `diagram` is out of scope | Confirmed |
| 2 | `extract_content_objects()` v0 unchanged. Add a new function `extract_rich_content_objects(section, normalized_assets) -> list[ContentObject]` (Codex round 3 #3) | Confirmed |
| 3 | Step 2 schema addition: additive nested `stage0_normalized_assets: {popups: list, images: list, tables: list}`. Empty lists when env=OFF | Confirmed |
| 4 | `_stage0_chained_adapter()` returns a 5-tuple `(title, sections, footer, diagnostics, normalized_assets)`. The caller (step 1 dispatch) adds an unpack | Confirmed |
| 5 | canary flag: `PHASE_Z_STEP3_RICH_OBJECTS_ENABLED` default OFF. Enable condition = flag=1 AND `stage0_normalized_assets` non-empty | Confirmed |
| 6 | two-layer separation (Codex catch A): `content_objects` (v0) feeds `plan_placement`. `rich_content_objects` is parallel and goes to the Step 3 artifact only. Assert during verification that `plan_placement()` receives v0 only (Codex catch D) | Core lock |
| 7 | Step 3 artifact fields added: `rich_content_objects` / `rich_content_objects_enabled` / `rich_content_objects_source` / `rich_content_objects_disabled_reason` (`FLAG_OFF` / `NO_NORMALIZED_ASSETS`) | Confirmed |
| 8 | env OFF preserves v0 exactly (no regex fallback). Only the disabled marker is recorded | Confirmed |
| 9 | `details` is the official type; source wording preserved via `type_specific.display_hint="popup"` | Confirmed |
| 10 | transform_table dedup = type-level separation (Codex catch 2). v0 is the sole transform_table source; normalize tables become generic `table` only for non-transform tables. When an arrow is detected, skip with reason `skipped_transform_table_duplicate` | Confirmed |
| 11 | Count/list redundancy = P (both retained) + invariant `adapter_counts.X == len(stage0_normalized_assets.X)`. A mismatch is handled softly as a warning (no fail) | Confirmed |
| 12 | render path = trace fidelity only. No change to mapper / V4 / composition / Step 6+ / AI/Kei / `pipeline_path_connected=False` | Confirmed |
| 13 | asset row shape contract lock (§2 Catch 8 above; `mdx_normalizer` is the SoT) | Confirmed |
| 14 | section attribution strategy: one of (a) line / (b) overlap / (c) slide-level / (d) defer | Codex opinion requested |
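
The canary gate in conditions 5 and 7 amounts to a three-state decision. The sketch below is one possible reading, assuming that the flag is considered ON only when the environment variable equals `"1"` and that "non-empty" means at least one of the three lists has an entry; the function name `resolve_rich_objects_state` is hypothetical.

```python
import os
from typing import Any, Mapping, Optional

FLAG_NAME = "PHASE_Z_STEP3_RICH_OBJECTS_ENABLED"
ASSET_KINDS = ("popups", "images", "tables")


def resolve_rich_objects_state(
    normalized_assets: Optional[Mapping[str, list[Any]]],
    environ: Mapping[str, str] = os.environ,
) -> dict[str, Any]:
    """Decide whether rich extraction runs and why it is disabled otherwise."""
    # Assumption: only the literal value "1" turns the canary on.
    if environ.get(FLAG_NAME, "") != "1":
        return {
            "rich_content_objects_enabled": False,
            "rich_content_objects_disabled_reason": "FLAG_OFF",
        }
    has_assets = bool(normalized_assets) and any(
        normalized_assets.get(kind) for kind in ASSET_KINDS
    )
    if not has_assets:
        return {
            "rich_content_objects_enabled": False,
            "rich_content_objects_disabled_reason": "NO_NORMALIZED_ASSETS",
        }
    return {
        "rich_content_objects_enabled": True,
        "rich_content_objects_disabled_reason": None,
    }
```

Passing `environ` explicitly keeps the gate testable without mutating the process environment. The flag is checked first, so a disabled flag always reports `FLAG_OFF` even when assets are present.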

| # | Codex decision | Self-verification | Result |
| --- | --- | --- | --- |
| 1 (attribution) | (c) slide-level: flat asset list, mdx_normalizer untouched, heuristics avoided | Matches the round 4 self-recommendation | ✅ |
| 2 (id pattern) | `{mdx_id}.{type}-N` (`03.image-1` / `03.details-1` / `03.table-1`), consistent with the existing `03-1` namespace | Clearer than the `_slide` sentinel | ✅ |
| 3 (explicit metadata) | Expose the fields `scope:"slide"` / `mdx_id` / `section_id:null`, making the attribution decision more visible | Prevents accidental section assumptions | ✅ |
| catch (per-zone dup) | root-level `rich_content_objects` once + per-zone `content_objects` (v0) | Preserves the per-zone loop structure of Step 3 and avoids duplication | ✅ Core |
| impl refinement | `extract_rich_content_objects(normalized_assets, mdx_id) -> list[ContentObject]` (section param removed) | Consistent with the slide-level lock. Exposing a section param risks implying a mapping intent | ✅ |
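
The id pattern and explicit metadata from decisions 2 and 3 can be illustrated with two small helpers. Both helper names are hypothetical, and the 1-based numbering is an assumption read off the `03.image-1` examples.

```python
from typing import Any

RICH_TYPES = ("details", "image", "table")


def build_rich_object_id(mdx_id: str, obj_type: str, n: int) -> str:
    """Build a slide-level id such as '03.details-1'."""
    if obj_type not in RICH_TYPES:
        raise ValueError(f"unsupported rich type: {obj_type!r}")
    if n < 1:
        # Assumption: N is 1-based, as in the examples above.
        raise ValueError("N must be >= 1")
    return f"{mdx_id}.{obj_type}-{n}"


def slide_level_metadata(mdx_id: str) -> dict[str, Any]:
    """Metadata that makes the slide-level attribution explicit on each object."""
    return {"scope": "slide", "mdx_id": mdx_id, "section_id": None}


print(build_rich_object_id("03", "details", 1))  # 03.details-1
```

Setting `section_id` to `None` explicitly, instead of omitting the key, makes it visible that no section mapping was attempted.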

| # | Condition | Output location |
| --- | --- | --- |
| 1 | Add the three SPEC v1 §1.2 types `table` / `image` / `details`. `diagram` is out of scope | new function in `phase_z2_content_extractor.py` |
| 2 | `extract_content_objects()` v0 unchanged (signature / behavior). Add a new function `extract_rich_content_objects(normalized_assets, mdx_id) -> list[ContentObject]` (no section argument, consistent with slide-level) | same module |
| 3 | Step 2 schema addition: additive nested field `stage0_normalized_assets: {popups: list, images: list, tables: list}`. Empty lists when env=OFF | `phase_z2_pipeline.py` Step 2 write_artifact |
| 4 | `_stage0_chained_adapter()` returns a 5-tuple `(title, sections, footer, diagnostics, normalized_assets)`. The caller (Step 1 dispatch) adds an unpack | same helper |
| 5 | canary flag: `PHASE_Z_STEP3_RICH_OBJECTS_ENABLED` default OFF. Enable condition = flag=1 AND `stage0_normalized_assets` non-empty (if either is false, record a disabled marker) | os.environ check |
| 6 | two-layer separation (core lock): `content_objects` (v0) feeds `plan_placement`. `rich_content_objects` is parallel and goes to the Step 3 artifact only. Assert during verification that `plan_placement()` receives only the v0 list | Step 3 dispatch |
| 7 | Avoid per-zone duplication: `rich_content_objects` = root-level, once. per-zone `content_objects` = v0 only | Step 3 artifact shape |
| 8 | Step 3 artifact fields added: `rich_content_objects` (list) / `rich_content_objects_enabled` (bool) / `rich_content_objects_scope` ("slide") / `rich_content_objects_source` ("stage0_normalized_assets") / `rich_content_objects_disabled_reason` (`FLAG_OFF` / `NO_NORMALIZED_ASSETS` / null) | Step 3 write_artifact |
| 9 | id pattern: `{mdx_id}.details-N` / `{mdx_id}.image-N` / `{mdx_id}.table-N`. Each object exposes the fields `scope:"slide"` / `mdx_id:<str>` / `section_id:null` | rich extractor output |
| 10 | env OFF preserves v0 exactly (no regex fallback). Only the disabled marker is recorded | rich extractor early return |
| 11 | `details` is the official type; source wording preserved via `type_specific.display_hint="popup"` | ContentObject build |
| 12 | transform_table dedup = type-level separation: v0 is the sole `transform_table` source. When an arrow is detected in a normalize table, skip with reason `skipped_transform_table_duplicate` (recorded in the debug list) | rich extractor logic |
| 13 | count/list redundancy = P (both retained) + invariant `adapter_counts.X == len(stage0_normalized_assets.X)`. A mismatch is recorded as a warning (no fail) | `stage0_adapter_diagnostics` preserved |
| 14 | asset row shape contract lock: `popup={title:str, content:str}` / `image={alt:str, path:str}` / `table={headers:list[str], rows:list[list[str]]}` (mdx_normalizer is the SoT). A future shape change cascades into IMP-03 | rich extractor input parsing |
| 15 | render path = trace fidelity only. No change to mapper / V4 / composition / Step 6+ / AI/Kei / `pipeline_path_connected=False` | scope guard |
| 16 | guardrail (IMP-03 description): no regression in AI/Kei content extraction / popup/image/table extraction trace explainable / ContentObject schema conformance | verification |
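
The asset row shape contract in condition 14 can be written down as typed dictionaries plus a small shape check. This is a sketch under assumptions: the class names and `find_shape_violations` are hypothetical, and only the key names and value types come from the contract above.

```python
from typing import Any, TypedDict


class PopupRow(TypedDict):
    title: str
    content: str


class ImageRow(TypedDict):
    alt: str
    path: str


class TableRow(TypedDict):
    headers: list[str]
    rows: list[list[str]]


# Keys each asset kind must carry, per the contract in condition 14.
REQUIRED_KEYS = {
    "popups": ("title", "content"),
    "images": ("alt", "path"),
    "tables": ("headers", "rows"),
}


def find_shape_violations(normalized_assets: dict[str, list[Any]]) -> list[str]:
    """List rows that do not carry the keys the contract requires."""
    problems: list[str] = []
    for kind, keys in REQUIRED_KEYS.items():
        for index, row in enumerate(normalized_assets.get(kind, [])):
            missing = [k for k in keys if not isinstance(row, dict) or k not in row]
            if missing:
                problems.append(f"{kind}[{index}] missing {missing}")
    return problems
```

A check like this would surface a shape change in the normalizer output early, before the rich extractor parses the rows.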

| Verification axis | Result |
| --- | --- |
| rich extractor logic (5 unit cases) | ✅ self-test PASS |
| canary state machine (FLAG_OFF / NO_NORMALIZED_ASSETS / enabled) | ✅ all triggered via runs A / B / C |
| Step 2 schema handoff (`stage0_normalized_assets` populated when env=1) | ✅ run C step02 |
| count/list invariant | ✅ run C step02 (popups 0=0, images 0=0, tables 1=1) |
| `plan_placement()` v0-only feed (scope-lock #6) | ✅ verified directly by grep (single call site, v0 input) |
| Step 3 artifact under env=1 (rich populated end-to-end) | ⚠️ Step 3 was not written because of the inherited IMP-02 composition abort; separate axis (IMP-03+ downstream adapter compatibility) |
| v0 path regression | ✅ run A legacy PASS |

| Codex check | Self-verification | Result |
| --- | --- | --- |
| `fc3f7d8` reached origin + slide2 | confirmed via git ls-remote | ✅ |
| py_compile + 5 rich self-tests | local re-run gives the same result | ✅ |
| Run A (env OFF + rich OFF) `FLAG_OFF` / count 0 / per_zone 2 | artifact matches | ✅ |
| Run B (env OFF + rich=1) `NO_NORMALIZED_ASSETS` / count 0 | canary chain consistent | ✅ |
| Run C (env=1 + rich=1) step02 `stage0_normalized_assets.tables=1` + invariant | proves the wiring is consistent | ✅ |
| All 16 scope-lock conditions honored (especially `plan_placement` v0-only + root-level once + `pipeline_path_connected=False`) | confirmed directly via grep + diff | ✅ |
| Codex Chrome/Selenium environment limitation | same separate axis as IMP-01 (already in the backlog) | ✅ |
| End-to-end Step 3 not written = IMP-02 inherited composition abort | hand-over to the IMP-03+ axis is stated explicitly when IMP-02 closes | ✅ |

| File | Change |
| --- | --- |
| `src/phase_z2_content_extractor.py` | `ContentObject` extended (scope/mdx_id/section_id added) + new `extract_rich_content_objects` + `_looks_like_transform_table` / `_reconstruct_markdown_table` helpers + 5 rich self-tests |
| `src/phase_z2_pipeline.py` | `_stage0_chained_adapter` returns a 5-tuple (`normalized_assets` added) + Step 1 dispatch unpack + Step 2 `stage0_normalized_assets` field added + Step 3 rich extraction dispatch (root-level once) + extended imports |

| # | Condition | impl location |
| --- | --- | --- |
| 1 | Add the three SPEC v1 §1.2 types `table` / `image` / `details` (`diagram` out of scope) | extractor.py `extract_rich_content_objects` body |
| 2 | `extract_content_objects()` v0 unchanged; new function `extract_rich_content_objects(normalized_assets, mdx_id)` added | extractor.py |
| 3 | Step 2 schema addition: additive nested `stage0_normalized_assets: {popups, images, tables}` | pipeline.py Step 2 _write_step_artifact |
| 4 | `_stage0_chained_adapter()` returns the 5-tuple `(title, sections, footer, diagnostics, normalized_assets)` | pipeline.py |
| 5 | canary `PHASE_Z_STEP3_RICH_OBJECTS_ENABLED` default OFF. enable = flag=1 AND assets non-empty | pipeline.py Step 3 dispatch |
| 6 | two-layer separation: `plan_placement()` is fed the v0 list only. grep verification: single call site (line 1810), input `content_objects` (v0 only) | confirmed directly by grep |
| 7 | Avoid per-zone duplication: `rich_content_objects` root-level once | Step 3 artifact shape |
| 8 | 6 additional Step 3 artifact fields: `rich_content_objects` / `_enabled` / `_scope` / `_source` / `_disabled_reason` / `_skips` / `_invariant_warnings` | pipeline.py Step 3 |
| 9 | id pattern `{mdx_id}.{details,image,table}-N` + `scope='slide'` / `mdx_id` / `section_id=None` | extractor.py rich extractor |
| 10 | env OFF preserves v0 (no regex fallback) | rich extractor early return on empty assets |
| 11 | `details` is the official type, `type_specific.display_hint='popup'` | extractor.py details emit |
| 12 | transform_table dedup: skip + reason when an arrow row is detected | `_looks_like_transform_table` + skip dict |
| 13 | count/list invariant `adapter_counts.X == len(stage0_normalized_assets.X)`, soft warning | pipeline.py Step 3 `invariant_warnings` |
| 14 | asset row shape contract (mdx_normalizer SoT): popup/image/table formats | extractor.py rich extractor body |
| 15 | render path = trace fidelity only (no change to mapper / V4 / composition / Step 6+ / AI/Kei / `pipeline_path_connected=False`) | confirmed via grep + commit diff |
| 16 | guardrail (no AI/Kei regression / trace explainable / schema conformance) | verification list passed |
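
Conditions 6 and 7 (v0-only placement input, rich objects at root level once) can be pictured with the sketch below. It is an illustration only: objects are modelled as plain dicts, and `assert_v0_only`, `build_step3_artifact` and the `per_zone` key are hypothetical names that are not taken from `phase_z2_pipeline.py`.

```python
from typing import Any, Iterable, Optional

# The three types introduced by IMP-03; none of them may reach plan_placement().
RICH_TYPES = frozenset({"table", "image", "details"})


def assert_v0_only(content_objects: Iterable[dict[str, Any]]) -> None:
    """Guard: fail loudly if a rich type leaks into the placement input."""
    leaked = [obj.get("type") for obj in content_objects if obj.get("type") in RICH_TYPES]
    if leaked:
        raise AssertionError(f"rich types leaked into placement input: {leaked}")


def build_step3_artifact(
    per_zone_v0: dict[str, list[dict[str, Any]]],
    rich_objects: list[dict[str, Any]],
    enabled: bool,
    disabled_reason: Optional[str],
) -> dict[str, Any]:
    """Keep v0 objects per zone and attach the rich list once at the root."""
    for objs in per_zone_v0.values():
        assert_v0_only(objs)
    return {
        # Assumed layout: one entry per zone holding only v0 content_objects.
        "per_zone": {zone: {"content_objects": objs} for zone, objs in per_zone_v0.items()},
        "rich_content_objects": rich_objects,
        "rich_content_objects_enabled": enabled,
        "rich_content_objects_scope": "slide",
        "rich_content_objects_source": "stage0_normalized_assets",
        "rich_content_objects_disabled_reason": disabled_reason,
    }
```

Because the rich list is written once at the root, adding zones does not duplicate it, and the guard makes a leak into the placement input a hard failure during verification.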

| axis | Result |
| --- | --- |
| code | `src/phase_z2_content_extractor.py` + `src/phase_z2_pipeline.py` (commit `fc3f7d8`, 2 files, 346+/14-) |
| New function | `extract_rich_content_objects(normalized_assets, mdx_id) -> tuple[list[ContentObject], list[dict]]` |
| ContentObject extension | `scope` / `mdx_id` / `section_id` optional metadata fields (v0 unchanged) |
| Step 2 schema addition | `stage0_normalized_assets: {popups, images, tables}` additive nested |
| `_stage0_chained_adapter` | 4-tuple → 5-tuple (`normalized_assets` added) |
| canary flag | `PHASE_Z_STEP3_RICH_OBJECTS_ENABLED` default OFF (canary, same pattern as IMP-02) |
| state machine | all 3 cases verified (FLAG_OFF / NO_NORMALIZED_ASSETS / enabled) |
| 6 additional Step 3 artifact fields | `rich_content_objects` / `_enabled` / `_scope` / `_source` / `_disabled_reason` / `_skips` / `_invariant_warnings` |
| transform_table dedup | skip + `skipped_transform_table_duplicate` reason when an arrow row is detected. v0 is the sole transform_table source |
| ID pattern | `{mdx_id}.{details,image,table}-N` (slide-level namespace, e.g., `03.details-1`) |
| invariant | `adapter_counts.X == len(stage0_normalized_assets.X)`, soft warning (no fail) |
| `plan_placement()` input | v0 list only (verified directly by grep, single call site) |
| scope guard | no change to mapper / V4 / composition / Step 6+ / AI/Kei / `pipeline_path_connected=False` |
| self-test | v0 2 + rich 5 cases, all PASS |
| audit §4 guardrail | consistent: no pinning to a specific MDX / frame result, no hardcoded baseline |
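
To make the summary concrete, here is a rough sketch of an extractor with the `(objects, skips)` return shape listed above. It is not the code from commit `fc3f7d8`: the `ContentObject` dataclass is a stand-in, the arrow markers are assumed, and whether skipped tables consume an `N` is an assumption (here they do not).

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class ContentObject:
    # Hypothetical stand-in for the real class in phase_z2_content_extractor.py.
    id: str
    type: str
    payload: dict[str, Any]
    type_specific: dict[str, Any] = field(default_factory=dict)
    scope: Optional[str] = None
    mdx_id: Optional[str] = None
    section_id: Optional[str] = None


ARROWS = ("→", "->")  # assumed markers of a transform (arrow) table


def looks_like_transform_table(table: dict[str, Any]) -> bool:
    """True when any header or cell contains an arrow marker."""
    cells = list(table.get("headers", []))
    for row in table.get("rows", []):
        cells.extend(row)
    return any(arrow in str(cell) for cell in cells for arrow in ARROWS)


def extract_rich_content_objects(
    normalized_assets: dict[str, list[dict[str, Any]]], mdx_id: str
) -> tuple[list[ContentObject], list[dict[str, Any]]]:
    objects: list[ContentObject] = []
    skips: list[dict[str, Any]] = []
    if not normalized_assets:
        return objects, skips  # early return: nothing to extract

    def emit(obj_type: str, n: int, payload: dict, type_specific: Optional[dict] = None) -> None:
        objects.append(ContentObject(
            id=f"{mdx_id}.{obj_type}-{n}", type=obj_type, payload=payload,
            type_specific=type_specific or {},
            scope="slide", mdx_id=mdx_id, section_id=None,
        ))

    for n, popup in enumerate(normalized_assets.get("popups", []), start=1):
        # popup rows become `details` objects; the source wording is kept as a hint.
        emit("details", n, dict(popup), {"display_hint": "popup"})
    for n, image in enumerate(normalized_assets.get("images", []), start=1):
        emit("image", n, dict(image))
    table_n = 0
    for index, table in enumerate(normalized_assets.get("tables", [])):
        if looks_like_transform_table(table):
            # v0 stays the only transform_table source, so record a skip instead.
            skips.append({"kind": "table", "index": index,
                          "reason": "skipped_transform_table_duplicate"})
            continue
        table_n += 1
        emit("table", table_n, dict(table))
    return objects, skips
```

Returning the skip list next to the objects lets the caller write both into the Step 3 artifact without a second pass over the tables.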


1. Code reading summary

1.1 Current B1 v0 (`src/phase_z2_content_extractor.py`)

1.2 Current Step 3 behavior (`phase_z2_pipeline.py:1786`, `:1895~1911`)

1.3 SPEC v1 §1 definitions (already exist)

1.4 IMP-02 hand-off (env=1)

2. Scope breakdown (3 sub-axes)

3. Self-identified catches (round 1 lock)

Catch 1: IMP-02 hand-off is disconnected

Catch 2: no IMP-03 input when env=OFF

Catch 3: risk of activating the render path

Catch 4: ContentObject id collision

Catch 5: `details` vs `popup` terminology alignment

Catch 6: canary flag pattern

Catch 7: risk of a breaking change to the `extract_content_objects` API

4. Scope-lock candidate (Codex opinion requested)

5. Request for Codex first review

Verified facts from code/docs

Answers to Claude review questions

Additional required catches before implementation

Scope-lock recommendation

1. Self-critical review of Codex's 6 answers + 5 catches

2. Additional self-identified catch: count vs list consistency question

3. Transform table dedup rule (following Codex catch C)

4. Refined scope-lock candidate (12 conditions)

5. Request for Codex re-verification (in round 3)

Verified code facts

Answers to Claude's requested decisions

1. P vs Q for counts/list redundancy

2. Transform table dedup option

3. Implementation shape for two-layer separation

Additional implementation catches

Final scope-lock position

1. Self-critical review of Codex round 3 answers + 4 catches

2. Additional self-identified catches (missed in rounds 1~3)

Catch 8: make the asset row shape explicit (`mdx_normalizer.py` SoT)

Catch 9: section attribution gap (major catch)

3. Refined scope-lock candidate (14 conditions)

4. Request for Codex re-verification (in round 5)

Verified facts

Answers to Claude's round-5 questions

1. Section attribution strategy

2. ID pattern

3. Catch 8 / Catch 9 assessment

Additional required catch: avoid per-zone duplication

Implementation-shape refinement

Final agreement position

1. Result of the self-critical review of Codex round 5 answers + catch + refinement

2. Scope-lock confirmed (16 conditions)

3. Adapter chained call contract (prose)

4. Verification list (after implementation)

5. Label transition

Verified final state

Answer to the remaining implementation-shape question

Small implementation refinement

Final implementation checklist to preserve the lock

1. Changed files + locations

2. scope-lock 16 conditions honored

3. py_compile + self-test verification

4. End-to-end verification (MDX 03)

4.1 run A: env OFF + rich OFF (legacy double-OFF)

4.2 run B: env OFF + rich=1 (chained check)

4.3 run C: env=1 + rich=1

5. Self-honest report: verification limits

6. Labels / next steps

Remote / commit verification

Verification commands I ran

Artifact checks from my runs

A. env OFF + rich OFF

B. env OFF + rich=1

C. env=1 + rich=1

Scope-lock condition review

One minor scope-lock drift, not a blocker

Review conclusion

1. Result of the self-critical review of Codex implementation review #5

2. Self-surfacing of the minor drift flagged by Codex

3. Summary of the final IMP-03 deliverables

4. Follow-up axes (separate issues)

5. Label / status transition

Final state verified
