Docs cleanup: move Phase history md files to docs/history/ + clean up old tests/assets

- Move 45 files from the repo root (IMPROVEMENT-PHASE-*.md, PHASE-*.md, etc.) to docs/history/
- Delete old block-test HTML under docs/block-tests/ (superseded by figma_to_html_agent)
- Clean up docs/figma-analysis/, docs/figma-assets/, docs/figma-screenshots/
- Clean up early test files such as docs/test-*.html
- Clean up 참고 페이지/ (reference pages) screenshots

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 10:56:23 +09:00
parent d57860578f
commit c42e01f060
206 changed files with 0 additions and 13498 deletions
--- a/docs/history/ARCHITECTURE-PHASE-T.md
+++ b/docs/history/ARCHITECTURE-PHASE-T.md
@@ -0,0 +1,854 @@
+# Design Agent Architecture — Phase T
+
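+> Pipeline that automatically generates fixed-size HTML slides (1280×720px) from source MDX documents
+> **Font hierarchy comes first, containers follow**: text is preserved, the font hierarchy is enforced, and design-element sizes are back-calculated mathematically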
+
+---
+
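+## 1. Core design principles
+
+### 1.1 Division of roles: AI vs. code
+
+| Role | Owner | Stage |
+|---|---|---|
+| Content judgment and classification | AI (Kei Persona API / Opus) | 1A, 1B |
+| Fix font hierarchy + back-calculate container ratio | Code (deterministic math) | 1.5a |
+| Block selection and variant decision | Code (keyword matching + lookup table) | 1.7 |
+| Back-calculate design budget from block schema | Code (deterministic math) | 1.5b |
+| HTML generation | AI (Claude Sonnet 4) | 2 |
+| Text and structure validation (L1~L3) | Code (kiwipiepy + regex) | right after 2 |
+| Measured rendering (L4) | Selenium (headless Chrome) | right after 3 |
+| Visual quality evaluation (L5) | AI (Opus Vision) | 4 |
+| HTML assembly and serving | Code | 3, 5 |
+
+The AI cannot see space; code (mathematical budget back-calculation) compensates for that fundamental limitation.
+LLMs tend to copy 70~90% of a reference HTML structure; this is exploited as a strength by framing the reference as a "design reference".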
+
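+### 1.2 Font hierarchy (the core of Phase T and the starting point of every calculation)
+
+In Phase S, font sizes ended up fully inverted relative to importance (sidebar 14px > key-msg 11px).
+Phase T switches direction: **fix the hierarchy first, then fit the containers to it**.
+
+| Area | Importance | Font range | Enforced rule |
+|------|--------|----------|----------|
+| Key message (key-msg) | 1st | **14px** bold | Always a single line; the largest font on the slide |
+| Body (core) | 2nd | **12px** | Default for body text |
+| Background (bg) | 3rd | **10-12px** | Adjusted within the range by text volume |
+| Attachment (sidebar) | 4th | **9-11px** | Reference material; may be the smallest |
+
+**Validation rule:** `font_size(핵심) > font_size(본문) ≥ font_size(배경) > font_size(첨부)`; a violation is an error. (핵심 = key message, 본문 = body, 배경 = background, 첨부 = attachment.)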
+
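A minimal sketch of how the validation rule above could be enforced in code. The function name `validate_font_hierarchy` and the plain-dict input are assumptions made for illustration; the role keys follow `FONT_HIERARCHY` in Stage 1.5a, where the body role is keyed `본심`.

```python
# Hypothetical helper, not part of the documented codebase.
# Role keys: 핵심 = key message, 본심 = body/core, 배경 = background, 첨부 = attachment.
def validate_font_hierarchy(sizes: dict[str, int]) -> list[str]:
    """Return the list of violations of 핵심 > 본심 >= 배경 > 첨부 (empty if valid)."""
    errors = []
    if not sizes["핵심"] > sizes["본심"]:
        errors.append("핵심 must be strictly larger than 본심")
    if not sizes["본심"] >= sizes["배경"]:
        errors.append("본심 must be at least as large as 배경")
    if not sizes["배경"] > sizes["첨부"]:
        errors.append("배경 must be strictly larger than 첨부")
    return errors
```

Fed the Phase S inversion (sidebar 14px, key message 11px), this check reports violations instead of letting the slide through.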
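+### 1.3 Pipeline operating patterns
+
+#### Cumulative context object (Pydantic BaseModel)
+
+Instead of each Stage reading and writing its own JSON, a single `PipelineContext` travels through the pipeline and is extended step by step. Following the T-0 investigation, **Pydantic BaseModel** was adopted (not dataclass) for `model_dump_json()` serialization and type checking via `validate_assignment=True`.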
+
+```python
+context.normalized.clean_text                           # Stage 0
+context.normalized.title                                # Stage 0
+context.normalized.images                               # Stage 0
+context.normalized.popups                               # Stage 0
+context.normalized.tables                               # Stage 0
+context.analysis.core_message                           # Stage 1A
+context.analysis.topics[0].source_hint                  # Stage 1A
+context.analysis.page_structure                         # Stage 1A
+context.topics[0].relation_type                         # Stage 1B
+context.topics[0].expression_hint                       # Stage 1B
+context.topics[0].source_data                           # Stage 1B
+context.font_hierarchy                                  # Stage 1.5a
+context.container_ratio                                 # Stage 1.5a (dynamic body:sidebar ratio)
+context.containers["본심"].text_budget                   # Stage 1.5a
+context.references["본심"].block_id                      # Stage 1.7
+context.references["본심"].design_reference_html         # Stage 1.7
+context.containers["본심"].design_budget                 # Stage 1.5b (recalculated after block selection)
+context.generated_html                                  # Stage 2
+context.rendered_html                                   # Stage 3
+context.measurement                                     # Stage 4
+context.quality_score                                   # Stage 4
+```
+
+#### Common execution pattern for every Stage
+
+```python
+async def run_stage(stage_fn, context, stage_name, max_retries=1):
+    for attempt in range(max_retries + 1):
+        result = await stage_fn(context)
+        errors = result.get("_errors", [])
+        if not errors:
+            # Pydantic: use model_copy(update=...)
+            context = context.model_copy(update=result)
+            context.save_snapshot(stage_name)
+            return context
+        context.errors.append({"stage": stage_name, "attempt": attempt, "errors": errors})
+        if attempt < max_retries:
+            context.retry_feedback = build_retry_feedback(stage_name, errors)
+    raise StageFailure(stage_name, errors)
+```
+
+#### Three error grades
+
+| Grade | Meaning | Response |
+|------|------|------|
+| **FATAL** | Unrecoverable (bad source, JSON parse failure) | Abort the pipeline |
+| **RETRYABLE** | Can be fixed by an AI retry (misclassification, omission) | Re-request with Self-Refine feedback (max 2 times) |
+| **ADJUSTABLE** | Can be auto-adjusted by code (insufficient height, ratio exceeded) | Auto-adjust, then log a warning |
+
+#### Snapshot storage
+
+`data/runs/{run_id}/step{N}_context.json`, where run_id is a `YYYYMMDD_HHMMSS` timestamp.
+Serialized with Pydantic `model_dump_json()`. Changes can be traced with `diff step1a_context.json step1b_context.json`.
+
+---
+
+## 2. Pipeline (11 Stages)
+
+### Stage 0: MDX normalization
+
+- **Owner:** Code
+- **New file:** `src/mdx_normalizer.py`
+- **Libraries:** `python-frontmatter` + `markdown-it-py` + `mdit-py-plugins` (~1MB total)
+- **Input:** raw MDX text
+- **Processing (4-layer parser):**
+  - **Layer 1:** `frontmatter.parse()` (from `python-frontmatter`) → split into `(metadata_dict, body_str)`. Extract title.
+  - **Layer 2:** protect code blocks (fenced block → placeholder, scanning from 10 backticks down to 3) → handle MDX-specific patterns:
+    - Astro `:::directive` → `[핵심요약]...[/핵심요약]` markers (핵심요약 = key summary)
+    - `<details><summary>title</summary>content</details>` → extract into popups[]
+    - Remove JSX `style={{}}` and `import/export`
+  - **Layer 3:** `markdown-it-py` AST parsing (`js-default` preset, tables included by default):
+    - heading tokens → extract section structure (tag, level, content, source line)
+    - image tokens → extract images[] (alt, src)
+    - table tokens → extract tables[] (header, rows)
+    - restore code-block placeholders
+  - **Layer 4:** text cleanup: strip remaining HTML tags, collapse blank lines, produce the final clean_text
+- **Output:**
+  ```python
+  {
+      "clean_text": str,         # normalized plain text
+      "title": str,              # frontmatter title
+      "images": [{"alt": str, "path": str}],
+      "popups": [{"title": str, "content": str}],
+      "tables": [{"header": list, "rows": list}],
+      "sections": [{"level": int, "title": str, "content": str}]  # split into sections at ##
+  }
+  ```
+- **Validation:**
+  - clean_text is not empty
+  - at least one `##` section
+  - at least 30% of the source text is preserved (guards against over-removal)
+  - number of images[] = number of `![` patterns in the source
+  - number of popups[] = number of `<details>` patterns in the source
+- **Note:** change `r"^## \d+\.\s*"` in the existing `normalize_mdx()` to `r"^## \d+\.\s+"` (at least one whitespace character required)
+- **Stored in:** `context.normalized.*`
+
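The Layer 2 code-block protection can be sketched roughly as follows. This is an illustration under assumptions, not the contents of `src/mdx_normalizer.py`: the function names and the `@@CODEBLOCK_n@@` placeholder format are hypothetical. Scanning from 10 backticks down to 3 means that a longer outer fence is stashed whole before any shorter fence nested inside it is considered.

```python
import re

def protect_code_blocks(text: str) -> tuple[str, list[str]]:
    """Replace fenced code blocks with placeholders, longest fences first."""
    blocks: list[str] = []

    def stash(match: re.Match) -> str:
        blocks.append(match.group(0))
        return f"@@CODEBLOCK_{len(blocks) - 1}@@"

    for n in range(10, 2, -1):  # 10 backticks down to 3
        # Opening fence of exactly n backticks (+ optional info string),
        # lazily up to a closing line of exactly n backticks.
        fence = re.compile(
            rf"^`{{{n}}}[^`\n]*\n.*?^`{{{n}}}[ \t]*$",
            re.DOTALL | re.MULTILINE,
        )
        text = fence.sub(stash, text)
    return text, blocks

def restore_code_blocks(text: str, blocks: list[str]) -> str:
    """Put the stashed blocks back (the Layer 3 restore step)."""
    for i, block in enumerate(blocks):
        text = text.replace(f"@@CODEBLOCK_{i}@@", block)
    return text
```

With the fences masked, the MDX-specific rewrites (directives, `<details>`, JSX removal) cannot touch anything that is meant to be literal code.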
+---
+
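+### Stage 1A: Kei topic extraction
+
+- **Owner:** AI (Kei Persona API, localhost:8000, Opus, SSE streaming)
+- **Input:** `context.normalized.clean_text` (text normalized in Stage 0)
+- **Processing:** Kei reads the content, classifies it into topics, and designs the storyline
+- **Output:**
+  - `topics[]`: id, title, purpose, role, layer, weight, **source_hint** (reference to the source MDX section)
+  - `page_structure`: { "본심": {topic_ids, weight}, "배경": {...}, "첨부": {...}, "결론": {...} } (본심 = core, 배경 = background, 첨부 = attachment, 결론 = conclusion)
+  - `core_message`: the slide's key message in one line
+- **Validation (Pydantic + code cross-check):**
+  - **Format:** weight sum within 0.9~1.1, 본심 weight ≥ 0.3, required fields present, topics > 0
+  - **Content cross-check:** compare the number of source `##` sections with the number of topics; a large gap suggests a classification error
+  - **Content cross-check:** topic summary keywords actually appear in the corresponding source section (kiwipiepy)
+  - On failure: RETRYABLE → re-request with Self-Refine feedback (max 2 times)
+- **Stored in:** `context.analysis`, `context.topics`, `context.page_structure`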
+
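The format half of the Stage 1A validation could look roughly like this. It is a plain-Python sketch rather than the Pydantic schema the document refers to; the function name and the dict shapes are assumptions based on the output fields listed above.

```python
# Hypothetical sketch of the Stage 1A format checks (thresholds from the list above).
def validate_stage_1a_format(topics: list[dict], page_structure: dict) -> list[str]:
    """Return RETRYABLE error messages; an empty list means the format checks pass."""
    errors = []
    if not topics:
        errors.append("topics is empty")

    total = sum(area.get("weight", 0.0) for area in page_structure.values())
    if not 0.9 <= total <= 1.1:
        errors.append(f"weight sum {total:.2f} is outside 0.9~1.1")

    core_weight = page_structure.get("본심", {}).get("weight", 0.0)  # 본심 = core area
    if core_weight < 0.3:
        errors.append(f"본심 weight {core_weight:.2f} is below 0.3")
    return errors
```

The returned messages are meant to be fed back into the Self-Refine retry rather than raised, since this failure class is RETRYABLE.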
+---
+
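+### Stage 1B: Concept refinement
+
+- **Owner:** AI (Kei Persona API, Opus, SSE streaming)
+- **Input:** `context.normalized.clean_text` + `context.topics` (Stage 1A result)
+- **Processing:** assign each topic a relation type, an expression hint, and a source-text reference
+- **Output:** the fields below are merged into topics
+  - `relation_type`: **7-value enum:** hierarchy / cause_effect / comparison / sequence / definition / inclusion / **none**
+  - `expression_hint`: design-direction hint (3-sentence structure: relation statement + content description + visual guidance)
+  - `source_data`: source-text reference
+- **Validation (Pydantic + code cross-check + contradiction detection):**
+  - **Format:** relation_type is one of the 7 enum values, expression_hint is not empty, source_data is not empty
+  - **Contradiction decision table:**
+
+    | purpose | Contradictory relation_type | Reason |
+    |---------|---------------------|------|
+    | 결론강조 (emphasize conclusion) | comparison, sequence | A conclusion is neither a comparison nor a sequence |
+    | 문제제기 (raise problem) | sequence, definition | Raising a problem is neither an ordered list nor a definition |
+    | 용어정의 (define terms) | hierarchy, cause_effect | A list of definitions is neither hierarchical nor causal |
+    | 구조시각화 (visualize structure) | none | With no relation to visualize, it is not structure visualization |
+
+  - **source_data source cross-check:** source_data keywords actually appear in the source clean_text (kiwipiepy). A source that is not there is flagged as a hallucination
+  - **relation_type source cross-check:** validated against Korean relation-expression patterns
+
+    | relation_type | Patterns that must appear in the source (partial) |
+    |---------------|-------------------------------|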
+    | comparison | vs, 반면, 차이점, 에 비해, 와 달리, 상이, 구분 |
+    | sequence | →, 이후, 단계, 먼저, 점진적, 과정, 를 거쳐 |
+    | hierarchy | 상위, 하위, 속하, 범주, 구성요소, 체계, 계층 |
+    | inclusion | 포함, 융합, 통합, 결합, 내포, 포괄, 연계 |
+    | cause_effect | 때문에, 따라서, 결과, 로 인해, 초래, 야기, 기인 |
+    | definition | 이란, 정의, 의미, 을 말한다, 라 함은, 용어 |
+
+  - On failure: RETRYABLE → re-request with feedback for only the contradictory/mismatched topics (max 2 times)
+- **Stored in:** `context.topics[].relation_type`, `.expression_hint`, `.source_data`
+
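The contradiction decision table maps directly onto a lookup. The sketch below assumes each topic is a dict with `id`, `purpose`, and `relation_type` keys; the function name is hypothetical and the real validator in `src/validators.py` may be structured differently.

```python
# purpose → relation_type values that contradict it (copied from the decision table).
CONTRADICTIONS = {
    "결론강조": {"comparison", "sequence"},
    "문제제기": {"sequence", "definition"},
    "용어정의": {"hierarchy", "cause_effect"},
    "구조시각화": {"none"},
}

def find_contradictions(topics: list[dict]) -> list[str]:
    """Return ids of topics whose relation_type contradicts their purpose."""
    return [
        topic["id"]
        for topic in topics
        if topic["relation_type"] in CONTRADICTIONS.get(topic["purpose"], set())
    ]
```

Returning only the offending ids matches the retry policy of re-requesting just the contradictory topics.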
+---
+
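+### Stage 1.5a: Fix font hierarchy + back-calculate container ratio
+
+- **Owner:** Code (not AI; deterministic math)
+- **Input:** page_structure weights + text volume of each area's source_data
+- **Core principle:** **fonts come first, containers follow**
+
+#### (1) Compute required space from the font hierarchy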
+
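+```python
+import math
+
+FONT_HIERARCHY = {
+    "핵심":  {"min": 14, "max": 14, "weight": "bold"},   # key message
+    "본심":  {"min": 12, "max": 12},                      # body (core)
+    "배경":  {"min": 10, "max": 12},                      # background
+    "첨부":  {"min": 9,  "max": 11},                      # attachment (sidebar)
+}
+
+def calculate_required_space(role, content, font_size):
+    """How many px of height does this text need at this font size?"""
+    # available_width and padding are expected to be defined by the enclosing module
+    char_width_px = font_size * 0.947   # measured ratio for Pretendard Hangul glyphs
+    line_height_px = font_size * 1.5    # body-text baseline
+    chars_per_line = max(1, int(available_width // char_width_px))
+    # Round up: a partially filled last line still occupies a full line
+    total_lines = math.ceil(len(content) / chars_per_line)
+    required_height = total_lines * line_height_px + padding
+    return required_height
+```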
+
+#### (2) Back-calculate the dynamic body:sidebar ratio
+
+Instead of a fixed 65:35, the ratio is derived from text volume:
+
+```python
+def calculate_container_ratio(roles_text_volume, font_hierarchy):
+    """Back-calculate a ratio that fits all text while keeping the font hierarchy."""
+    # 1. Required space for each role at its hierarchy-standard font
+    sidebar_need = calculate_required_space("첨부", sidebar_text, font_hierarchy["첨부"]["max"])
+    body_need = sum(calculate_required_space(r, t, font_hierarchy[r]["max"])
+                    for r, t in body_roles)
+
+    # 2. Decide the ratio from the sidebar fill rate
+    sidebar_capacity_at_35 = estimate_capacity(slide_width * 0.35, font_hierarchy["첨부"]["max"])
+    fill_rate = len(sidebar_text) / sidebar_capacity_at_35
+
+    if fill_rate < 0.5:
+        ratio = (72, 28)   # little sidebar text → widen body
+    elif fill_rate < 0.8:
+        ratio = (68, 32)   # typical
+    else:
+        ratio = (65, 35)   # keep the current ratio
+
+    return ratio  # (body_pct, sidebar_pct)
+```
+
+#### (3) Text budget calculation
+
+Once the ratio is fixed, the text budget of each area is:
+
+```python
+def calculate_text_budget(container, content, font_size):
+    char_width_px = font_size * 0.947
+    line_height_px = font_size * 1.5
+    inner_width = container.width_px - padding * 2
+    inner_height = container.height_px - padding * 2
+
+    chars_per_line = int(inner_width / char_width_px)
+    max_lines = int(inner_height / line_height_px)
+    max_chars = chars_per_line * max_lines
+
+    source_chars = len(content)
+    needs_compression = source_chars > max_chars
+
+    return TextBudget(
+        font_size=font_size,
+        chars_per_line=chars_per_line,
+        max_lines=max_lines,
+        max_chars=max_chars,
+        source_chars=source_chars,
+        needs_compression=needs_compression,
+    )
+```
+
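+#### (4) Multi-column layout decision
+
+If the text does not fit even at the minimum font within the hierarchy range, change the structure:
+
+```
+1. Compute capacity at the hierarchy-standard font (max)
+2. Text volume > capacity → shrink font by 1px (down to the hierarchy min)
+3. Still does not fit at the minimum font → change layout (1 column → 2 columns)
+4. Still does not fit with 2 columns → adjust ratio (shrink sidebar → widen body)
+5. Still does not fit after ratio adjustment → warn that text editing is needed (logged to context.warnings)
+```
+
+- **Validation:** sum of height_px ≤ total height, font hierarchy maintained, no negative values
+- **Stored in:** `context.font_hierarchy`, `context.container_ratio`, `context.containers[].text_budget`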
+
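The first three steps of the cascade above can be sketched as a small search. Everything here is illustrative: `choose_fit` and the `capacity(font_size, columns)` callback are hypothetical names, and how capacity changes with column count is left to the caller because the document does not specify it. Steps 4 and 5 depend on the ratio logic and are only signalled.

```python
from typing import Callable

def choose_fit(text_len: int, font_max: int, font_min: int,
               capacity: Callable[[int, int], int]) -> dict:
    """Pick font size and column count following the documented cascade order."""
    # Steps 1-2: one column, shrinking from the hierarchy max to min by 1px
    for size in range(font_max, font_min - 1, -1):
        if text_len <= capacity(size, 1):
            return {"font_size": size, "columns": 1, "action": "fit"}
    # Step 3: two columns at the minimum font
    if text_len <= capacity(font_min, 2):
        return {"font_size": font_min, "columns": 2, "action": "fit"}
    # Steps 4-5: ratio adjustment, then a warning, are handled elsewhere
    return {"font_size": font_min, "columns": 2, "action": "adjust_ratio_or_warn"}
```

Because the loop never goes below `font_min`, the font hierarchy is preserved no matter how much text arrives; only layout and ratio absorb the overflow.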
+---
+
+### Stage 1.7: Reference block selection + variant decision
+
+- **Owner:** Code (keyword matching + lookup table; not AI)
+- **Input:** relation_type + expression_hint from 1B + container spec from 1.5a + catalog.yaml
+- **Processing in 4 steps:**
+
+#### (1) relation_type → block candidates (1st filter)
+
+Filter by the `relation_types` field in catalog.yaml:
+
+```python
+candidates = [b for b in catalog.blocks
+              if relation_type in b.relation_types or not b.relation_types]
+```
+
+#### (2) expression_hint → block refinement (2nd filter: keyword containment)
+
+Because expression_hint is a long sentence, matching is by **keyword containment (substring), not exact string equality**:
+
+```python
+VISUAL_TYPE_KEYWORDS = {
+    "인과": {"keywords": ["인과", "현상->결과", "야기", "원인"], "blocks": ["callout-warning", "dark-bullet-list"]},
+    "나열_병렬": {"keywords": ["독립적 나열", "병렬 나열", "개별 증거"], "blocks": ["dark-bullet-list", "card-icon-desc"]},
+    "나열_정의": {"keywords": ["독립적 정의", "용어", "참조용"], "blocks": ["card-numbered"]},
+    "포함_계층": {"keywords": ["상위-하위", "포함 관계", "계층적"], "blocks": ["venn-diagram", "keyword-circle-row"]},
+    "강조_결론": {"keywords": ["핵심 메시지 강조", "임팩트", "한 줄 강조"], "blocks": ["banner-gradient", "quote-big-mark"]},
+    "비교": {"keywords": ["대등 비교", "좌우 대조", "vs"], "blocks": ["compare-2col-split", "compare-3col-badge"]},
+    "순서": {"keywords": ["시간 순서", "단계별", "A->B->C"], "blocks": ["flow-arrow-horizontal", "process-horizontal"]},
+}
+
+def match_visual_type(expression_hint: str) -> str:
+    """Find a keyword in expression_hint and return the matching visual type."""
+    for vtype, spec in VISUAL_TYPE_KEYWORDS.items():
+        if any(kw in expression_hint for kw in spec["keywords"]):
+            return vtype
+    return "default"
+```
+
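+Rationale for the visual mapping (Gestalt principles):
+- Closure → hierarchy/inclusion → circles (Venn diagram)
+- Proximity → comparison → side-by-side table/comparison
+- Continuity → sequence → arrow flow
+- Similarity → definition → repeated cards of identical shape
+- PPTAgent (EMNLP 2025): academically demonstrates the effectiveness of "reference-based generation"
+
+#### (3) Container size fit check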
+
+```python
+candidates = [b for b in candidates
+              if b.min_height_px <= container.height_px]
+```
+
+#### (4) Automatic block variant + layout selection
+
+```python
+def select_block_variant(block, container, content):
+    if not block.variants or len(block.variants) <= 1:
+        return block.id, "default"
+
+    for variant in block.variants:
+        if variant.id == "compact" and container.height_px < 150:
+            return block.id, "compact"
+        if variant.id == "wide" and container_ratio[0] >= 70:  # body is 70% or more
+            return block.id, "wide"
+
+    return block.id, "default"
+```
+
+#### (5) Fallback definition
+
+Default block per category when no candidate passes every filter:
+
+| Category | Fallback block | Reason |
+|----------|-------------|------|
+| cards | card-numbered | Most general-purpose; covers compact~xlarge |
+| emphasis | dark-bullet-list | Text-centric, flexible height |
+| visuals | venn-diagram | Can auto-arrange N items |
+| tables | compare-2col-split | The most basic comparison |
+| media | image-side-text | Text + image combination |
+
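A rough sketch of how the fallback table might be applied, assuming candidates are block ids and warnings are collected in a list (the ADJUSTABLE grade calls for a logged warning). The function name is hypothetical. The table defines no fallback for `headers`, so the sketch raises in that case instead of guessing one.

```python
# Per-category fallback blocks, copied from the table above.
FALLBACK_BLOCKS = {
    "cards": "card-numbered",
    "emphasis": "dark-bullet-list",
    "visuals": "venn-diagram",
    "tables": "compare-2col-split",
    "media": "image-side-text",
}

def pick_block(candidates: list[str], category: str, warnings: list[str]) -> str:
    """Return the first surviving candidate, else the category fallback."""
    if candidates:
        return candidates[0]
    fallback = FALLBACK_BLOCKS.get(category)
    if fallback is None:
        raise LookupError(f"no fallback block defined for category {category!r}")
    warnings.append(f"no candidate passed the filters; using fallback {fallback}")
    return fallback
```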
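+#### Design reference HTML generation
+
+A complete HTML with Jinja variables replaced by sample data, plus comments stating the structural intent.
+The LLM copies 70~90% of this structure, so it follows a verified structure instead of "inventing" a layout.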
+
+```python
+def generate_design_reference(block, variant, catalog_entry):
+    template = load_template(block.template)
+    sample_data = build_sample_data(catalog_entry.slots)
+    rendered = template.render(**sample_data)
+
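+    # Add structural-intent comments (so the LLM reads the intent accurately)
+    annotated = f"<!-- {block.id}: {catalog_entry.visual} -->\n"
+    # Attribute access throughout; visual_diff is optional on a catalog entry
+    if getattr(catalog_entry, "visual_diff", None):
+        annotated += f"<!-- Difference: {catalog_entry.visual_diff} -->\n"
+    annotated += rendered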
+
+    return annotated
+```
+
+- **Output:**
+
+```json
+{
+  "block_id": "dark-bullet-list",
+  "variant": "default",
+  "visual_type": "인과",
+  "schema": {
+    "title": {"max_lines": 1, "font_size": 16, "max_chars": 30},
+    "bullet_item": {"max_lines": 1, "font_size": 14, "max_chars": 86},
+    "max_bullets": 5
+  },
+  "design_reference_html": "<!-- dark-bullet-list: dark background + blue title + white bullets -->\n<div ...>..."
+}
+```
+
+- **Validation:** the selected block actually exists in catalog.yaml, min_height_px ≤ container.height_px
+- **Stored in:** `context.references["본심"].*`
+
+---
+
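+### Stage 1.5b: Design budget recalculation (after block selection)
+
+- **Owner:** Code (not AI)
+- **Input:** schema of the block selected in Stage 1.7 + container spec from Stage 1.5a
+- **Purpose:** the space left after reserving the text area is the design-element budget. **The point is not to cut text but to size shapes, images, and CSS elements to fit.**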
+
+```python
+def calculate_design_budget(container, text_budget, block_schema):
+    # Sum the height of each text slot in the block schema
+    text_height = 0
+    for slot_name, spec in block_schema.items():
+        if slot_name.startswith("max_"):
+            continue
+        slot_lines = spec.get("max_lines", 1)
+        slot_font = spec.get("font_size", 14)
+        text_height += slot_lines * (slot_font * 1.6)
+
+    remaining_height = container.height_px - text_height - padding
+    remaining_width = container.width_px - padding
+
+    return DesignBudget(
+        available_height_px=remaining_height,
+        available_width_px=remaining_width,
+        max_circle_diameter=min(remaining_height, remaining_width) - 4,
+        max_img_width=remaining_width * 0.4,
+        max_img_height=remaining_height,
+        fits=remaining_height >= 0,
+    )
+```
+
+- **Validation:** available_height_px ≥ 0 (negative = the block does not fit the container → reselect in Stage 1.7 or treat as ADJUSTABLE)
+- **Stored in:** `context.containers["본심"].design_budget`
+
+---
+
+### Stage 2: HTML generation
+
+- **Owner:** AI (Claude Sonnet 4, direct Anthropic API, current model: `claude-sonnet-4-20250514`)
+- **Input:** source text + the full cumulative context
+- **Processing:** HTML is generated with **a separate call per area** (background/core/attachment/conclusion)
+
+Prompt composition: every number is passed **concretely** (lesson from Phase S: abstract prompts fail):
+
+| Source | Included content |
+|------|----------|
+| Stage 0 | clean_text (source text: "use this text as is") |
+| Stage 1A | core_message |
+| Stage 1B | expression_hint, relation_type |
+| Stage 1.5a | fixed font size, line count, character count, container px |
+| Stage 1.5b | design-element size constraints (max_circle_px, max_img_width, etc.) |
+| Stage 1.7 | design reference HTML + visual_diff description |
+
+Example prompt:
+
+```
+[Design reference]
+Follow the structure and color pattern of the HTML below, but replace the content.
+<!-- dark-bullet-list: dark background + blue title + white bullets -->
+<!-- Difference: unlike callout-warning in the same dark family, no warning icon. Purely for listing. -->
+<div style="background:#1a2332; padding:20px; border-radius:6px;">
+  <!-- SLOT: title (1 line, 16px bold, max 30 chars) -->
+  <h3 style="color:#4a9eff; font-size:16px; font-weight:700;">Sample title</h3>
+  <!-- SLOT: bullets (1 line each, 14px, max 86 chars, max 5 items) -->
+  <ul style="list-style:none; padding:0;">
+    <li style="color:#e2e8f0; font-size:14px; padding:4px 0;">• Sample item 1</li>
+  </ul>
+</div>
+
+[Numeric constraints: must be followed]
+- Container: width 707px, height 176px
+- Font: 11px (background-area hierarchy)
+- Max 68 characters per line
+- Max 10 lines
+- Design-element budget: height 84px, width 707px
+
+[Source text: no abridging or altering]
+"DX와 BIM이 개념적으로 명확히 정립되지 않은채 혼용되어 사용되고 있음..."
+
+[Mandatory rules]
+- Use inline styles only; no <style> blocks
+- No overflow:hidden
+- Follow the structure of the design reference, customizing it to the content
+- Use a uniform bullet-note style (declarative ~하다/~이다 → note style ~에 해당/~인식되는 중)
+```
+
+- **Image handling:** the paths and size information of `images[]` extracted in Stage 0 are included in the prompt. Converted to inline base64 in Stage 5.
+- **Popup handling:** the prompt instructs converting `popups[]` extracted in Stage 0 into `<details>/<summary>` HTML.
+- **Output:** `{body_html, sidebar_html, footer_html}`
+- **Stored in:** `context.generated_html`
+
+---
+
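+### Distributed validation system
+
+Rather than concentrating the 5 validation layers in one place, **each Layer runs at the point where it fits best**.
+Retry prompts follow the Self-Refine (NeurIPS 2023) pattern: `localization + evidence + instruction`.
+Scorer/Suggester split from VASCAR (2024): scoring and feedback generation are separated.
+
+#### Right after Stage 2: L1 + L2 + L3 (code validation)
+
+| Layer | Tool | What is checked |
+|---|---|---|
+| L1 | kiwipiepy + regex | Keyword preservation rate of at least 80% |
+| L2 | regex | Kei's notes (e.g. "간결한 문제 제기용") have not leaked into the output |
+| L3 | regex | No overflow:hidden, no font-hierarchy violation, inline styles only |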
+
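Two of the L3 checks in the table above are simple enough to sketch with regular expressions. This is an assumption about how such a check might be written, not the code in `src/content_verifier.py`; the names are hypothetical, and the font-hierarchy part of L3 is omitted because it needs the per-area font sizes.

```python
import re

# overflow:hidden (also overflow-x / overflow-y), tolerant of whitespace and case
OVERFLOW_HIDDEN = re.compile(r"overflow(?:-[xy])?\s*:\s*hidden", re.IGNORECASE)
# An opening <style> tag; does not match the style="..." attribute
STYLE_BLOCK = re.compile(r"<style\b", re.IGNORECASE)

def check_l3_structure(html: str) -> list[str]:
    """Return L3 structural violations found in generated HTML."""
    errors = []
    if OVERFLOW_HIDDEN.search(html):
        errors.append("overflow:hidden detected")
    if STYLE_BLOCK.search(html):
        errors.append("<style> block detected (inline styles only)")
    return errors
```

Each message can be used directly as the `evidence` part of a Self-Refine retry prompt.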
+On failure → re-run Stage 2 with a Self-Refine prompt (max 2 times):
+
+```
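+[Regeneration request - attempt 2/3]
+Problems in the previous generation:
+1. L1: keyword preservation rate 65%. Missing: {'BIM', '혼용', '설계오류'}
+2. L3: overflow:hidden detected
+
+Fix instructions (Self-Refine):
+- localization: keyword preservation failure, structure violation
+- evidence: 3 key source keywords missing, overflow:hidden present
+- instruction: include the missing keywords, remove overflow:hidden, all other constraints unchanged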
+```
+
+---
+
+### Stage 3: Rendering (assembly)
+
+- **Owner:** Code (not AI)
+- **Input:** body/sidebar/footer HTML that passed L1~L3 + preset grid + **dynamic ratio**
+- **Processing:**
+  - Merge tokens.css + base.css inline
+  - Build the CSS Grid frame with the **dynamic ratio applied** (e.g. `72fr 28fr` or `65fr 35fr`)
+  - Insert each area's HTML into `<div class="area-body">` and so on
+  - Include the Pretendard font CDN link
+- **Output:** a complete standalone HTML
+- **Stored in:** `context.rendered_html`
+
+#### Right after Stage 3: L4 (Selenium measurement)
+
+```python
+def validate_after_stage3(context, rendered_html):
+    measurements = selenium_measure(rendered_html)
+    errors = []
+    for area, m in measurements.items():
+        if m.scroll_height > m.client_height:
+            overflow = m.scroll_height - m.client_height
+            errors.append({
+                "layer": "L4", "severity": "RETRYABLE",
+                "localization": f"{area} overflow {overflow}px",
+                "evidence": f"scrollHeight {m.scroll_height} > clientHeight {m.client_height}",
+                "instruction": f"Shrink the design elements in this area by {overflow+10}px or remove one bullet"
+            })
+    return errors  # on failure → only that area goes back to Stage 2 (max 2 times)
+```
+
+---
+
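+### Stage 4: Quality gate (L5)
+
+- **Owner:** Selenium (screenshot capture) + Opus Vision (quality verdict, current model: `claude-opus-4-20250514`)
+- **Processing:**
+  - Capture a full-page screenshot → send to Opus Vision as base64
+  - 5 evaluation criteria (VASCAR style):
+    1. Is the content free of overlap and clipping?
+    2. Is the core (본심) area visually the most prominent?
+    3. Are the fonts a readable size? **Is the font hierarchy maintained?**
+    4. Is it appropriate as a Korean business presentation?
+    5. Is there variety in block types?
+  - Scored 0~100:
+    - below 30 → block output (FATAL)
+    - 30 to below 60 → re-run Stage 2 with Opus feedback
+    - 60 or above → proceed to Stage 5
+- **Difference from L4:** L4 is a per-area px measurement (right after Stage 3); L5 is a **visual** evaluation of the whole assembled page
+- **Stored in:** `context.measurement`, `context.quality_score`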
+
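The score thresholds above amount to a three-way routing decision. A minimal sketch, assuming a boundary score of exactly 60 proceeds to Stage 5 (the document says "60 or above"); the function name and return labels are hypothetical.

```python
def route_quality_score(score: int) -> str:
    """Map an L5 quality score (0~100) to the next pipeline action."""
    if not 0 <= score <= 100:
        raise ValueError(f"score out of range: {score}")
    if score < 30:
        return "FATAL"            # block output
    if score < 60:
        return "RETRY_STAGE_2"    # re-run Stage 2 with Opus feedback
    return "STAGE_5"              # proceed to serving
```

What happens when the single permitted L5 retry still scores below 60 is not specified in the document, so the sketch leaves that decision to the caller.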
+---
+
+### Stage 5: Serving
+
+- **Owner:** Code
+- **Processing:**
+  - Convert image paths to inline base64 (so images also display in the downloaded HTML)
+  - Insert JS that auto-expands `<details>` when printing (`window.onbeforeprint`)
+  - Save final.html
+- **Stored in:** `data/runs/{run_id}/final.html`
+
+---
+
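+## 3. Validation flow summary
+
+```
+Stage 1A (Kei analysis)
+    ↓
+  1A validation (Pydantic + source cross-check) ──fail──→ re-request Kei (max 2 times)
+    ↓ pass
+Stage 1B (concept refinement)
+    ↓
+  1B validation (contradiction detection + source cross-check) ──fail──→ re-request Kei (max 2 times)
+    ↓ pass
+Stage 1.5a → 1.7 → 1.5b (code, deterministic)
+    ↓
+Stage 2 (HTML generation)
+    ↓
+  L1+L2+L3 ──fail──→ Self-Refine → re-run Stage 2 (max 2 times)
+    ↓ pass
+Stage 3 (assembly)
+    ↓
+  L4 ──fail──→ failed areas only back to Stage 2 (max 2 times)
+    ↓ pass
+Stage 4 (L5 final verdict)
+    ↓
+  below 30       → block (FATAL)
+  30 to below 60 → re-run Stage 2 with Opus feedback (max 1 time)
+  60 or above    → Stage 5
+```
+
+### Total retry budget
+
+| Point | Max retries | Scope |
+|------|-----------|------|
+| Stage 1A | 2 | Entire Kei call |
+| Stage 1B | 2 | Failed topics only |
+| L1~L3 | 2 | Failed areas only |
+| L4 | 2 | Failed areas only |
+| L5 | 1 | Everything |
+| **Worst-case total** | **up to 5 Stage 2 re-runs** | Per single area, since areas are independent |
+
+Overall pipeline timeout: 300 seconds. On timeout, return the best result so far + a warning.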
+
+---
+
+## 4. Catalog system (catalog.yaml)
+
+### 4.1 Block structure
+
+38 blocks in **6 categories:**
+
+| Category | Blocks | Purpose |
+|----------|---------|------|
+| headers | 5 | Topic/section titles |
+| cards | 9 | Item lists/comparisons |
+| **tables** | **3** | Comparison tables, striped tables |
+| **visuals** | **6** | Venn diagrams, processes, flows |
+| emphasis | 10 | Emphasis/callouts/banners/bullets |
+| media | 5 | Image placement |
+
+Separate section: **4 layout presets** (sidebar-right, two-column, hero-detail, single-column)
+
+### 4.2 Block metadata (current state + Phase T additions)
+
+```yaml
+- id: block-id
+  name: Korean display name
+  category: headers | cards | tables | visuals | emphasis | media
+  template: blocks/category/block-id.html
+  height_cost: compact | medium | large | xlarge
+  min_height_px: 80
+  relation_types: [comparison, cause_effect]   # empty array = usable for every relation
+  min_items: 2                                  # present on 19/38 blocks
+  max_items: 5                                  # present on 19/38 blocks
+  visual: "visual description"
+  when: "situations where it fits"
+  not_for: "situations where it does not fit"
+  purpose_fit: [핵심전달, 문제제기]              # enum values: deliver key point, raise problem
+  zone: full-width-only                         # present on 4/38 blocks (optional)
+  slots:
+    required: [title, description]
+    optional: [icon, source]
+  # --- Required additions in Phase T ---
+  schema:                                        # ★ currently 19/38 → must reach 38/38
+    title: {max_lines: 1, font_size: 16, ref_chars: {body: 30, sidebar: 20}}
+    description: {max_lines: 3, font_size: 14, ref_chars: {body: 150, sidebar: 90}}
+  visual_diff: |                                 # ★ new (T-4), to be added to 20 similar blocks
+    Difference from similar blocks: ...
+  variants:                                      # currently 4/38 → extend as needed
+    - id: default
+    - id: compact
+```
+
+### 4.3 expression_hint → block mapping (keyword containment matching)
+
+| Visual type | Matching keywords | Mapped blocks |
+|---|---|---|
+| 인과 (causal) | "인과", "현상->결과", "야기" | callout-warning, dark-bullet-list |
+| 나열_병렬 (parallel list) | "독립적 나열", "병렬", "개별 증거" | dark-bullet-list, card-icon-desc |
+| 나열_정의 (definition list) | "독립적 정의", "참조용", "용어" | card-numbered |
+| 포함_계층 (inclusion/hierarchy) | "상위-하위", "포함 관계", "계층적" | venn-diagram, keyword-circle-row |
+| 강조_결론 (emphasis/conclusion) | "핵심 메시지 강조", "임팩트" | banner-gradient, quote-big-mark |
+| 비교 (comparison) | "대등 비교", "좌우 대조", "vs" | compare-2col-split, compare-3col-badge |
+| 순서 (sequence) | "시간 순서", "단계별", "A->B->C" | flow-arrow-horizontal, process-horizontal |
+
+### 4.4 Improvements needed in Phase T
+
+| Item | Current | Target | Priority |
+|------|------|------|----------|
+| Complete schema | 19/38 | 38/38 | High (required by Stage 1.5b) |
+| Add visual_diff | 0/38 | 20/38 (similar-block groups) | Medium (T-3 prompt quality) |
+| Validate ref_chars ↔ container width consistency | None | Automatic check at startup | High |
+| Standalone block rendering screenshots | None | 38 PNGs | Medium (evidence for visual_diff) |
+
+---
+
+## 5. Data flow summary
+
+```
+[Source MDX]
+    │
+    ▼
+Stage 0 ─── code ──→ context.normalized
+    │       (python-frontmatter + markdown-it-py)
+    │       {clean_text, title, images[], popups[], tables[], sections[]}
+    ▼
+Stage 1A ── Kei ──→ context.analysis
+    │               {topics[], page_structure, core_message}
+    │               ✓ Pydantic validation + source cross-check
+    ▼
+Stage 1B ── Kei ──→ context.topics[].relation_type, .expression_hint, .source_data
+    │               ✓ contradiction detection + source pattern cross-check
+    ▼
+Stage 1.5a ─ code ──→ context.font_hierarchy, .container_ratio, .containers[].text_budget
+    │                   (fix font hierarchy → back-calculate ratio → text budget)
+    ▼
+Stage 1.7 ── code ──→ context.references[]
+    │                   (relation_type 1st pass + expression_hint 2nd pass → block + variant)
+    │                   (Jinja substitution → design reference HTML)
+    ▼
+Stage 1.5b ─ code ──→ context.containers[].design_budget
+    │                   (back-calculate design-element sizes from block schema)
+    ▼
+Stage 2 ── Sonnet ──→ context.generated_html
+    │                   (design reference + concrete numbers + source text)
+    │                   ✓ L1+L2+L3 code validation
+    ▼
+Stage 3 ── code ──→ context.rendered_html
+    │                 (assemble grid with dynamic ratio)
+    │                 ✓ L4 Selenium measurement
+    ▼
+Stage 4 ── Opus ──→ context.quality_score
+    │                (screenshot-based visual evaluation, block below 30)
+    ▼
+Stage 5 ── code ──→ final.html
+                     (image base64 conversion, details JS insertion)
+```
+
+---
+
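+## 6. Error handling
+
+| Failure point | Grade | Recovery strategy |
+|---|---|---|
+| Stage 0 validation failure | FATAL | Problem in the source MDX itself; return an error to the user |
+| Stage 1A Pydantic failure | RETRYABLE | Re-request Kei with Self-Refine feedback (max 2 times) |
+| Stage 1B contradiction detected | RETRYABLE | Re-request with feedback for the contradictory topics only (max 2 times) |
+| Stage 1.5a abnormal values | FATAL | Deterministic, so retrying is pointless; inputs must be checked |
+| Stage 1.7 no suitable block | ADJUSTABLE | Select the per-category fallback block + log a warning |
+| Stage 1.5b negative budget | ADJUSTABLE | Shrink font by 1px or reselect the block |
+| L1~L3 failure | RETRYABLE | Re-run Stage 2 with a Self-Refine prompt (max 2 times) |
+| L4 overflow | RETRYABLE | Failed areas only back to Stage 2 + concrete px feedback (max 2 times) |
+| L5 below 30 | FATAL | Block output + log the error |
+| L5 30 to below 60 | RETRYABLE | Re-run Stage 2 with Opus feedback (max 1 time) |
+| Timeout (300 s) | FATAL | Return the best result so far + warning |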
+
+---
+
+## 7. Tech stack (Phase T)
+
+| Component | Technology | Notes |
+|---|---|---|
+| Web server | **FastAPI** + uvicorn | Port 8001, SSE streaming |
+| Pipeline runtime | Python (async) | Pydantic BaseModel (PipelineContext) |
+| MDX parsing | python-frontmatter + markdown-it-py + mdit-py-plugins | ~1MB added |
+| Content judgment | Kei Persona API (localhost:8000, Opus, SSE) | httpx streaming |
+| HTML generation | Claude Sonnet 4 (Anthropic API) | Separate call per area |
+| Korean keyword extraction | kiwipiepy | L1 validation + Stage 1A/1B source cross-check |
+| Relation-expression patterns | 7 regex sets (one per relation_type) | Supports Stage 1B validation |
+| Visual quality evaluation | Opus Vision (Anthropic API) | L5, screenshot-based |
+| Measured rendering | Selenium headless Chrome | L4, 1280×920 viewport |
+| Block catalog | catalog.yaml (38 blocks) | schema must reach 38/38 |
+| Template engine | Jinja2 | Block HTML rendering |
+| Design tokens | tokens.css + base.css | Pretendard Variable |
+| HTTP client | httpx | SSE communication with Kei API |
+| Snapshot storage | JSON (Pydantic model_dump_json) | `data/runs/{run_id}/` |
+
+---
+
+## 8. Migration map (current code → Phase T)
+
+### New files
+
+| File | Role |
+|------|------|
+| `src/pipeline_context.py` | PipelineContext Pydantic model |
+| `src/mdx_normalizer.py` | Stage 0 MDX parser (4-layer) |
+| `src/validators.py` | Stage 1A/1B Pydantic schemas + contradiction detection + source cross-check |
+| `src/block_reference.py` | Stage 1.7 block selection + design reference generation |
+| `scripts/capture_block_screenshots.py` | Standalone rendering screenshots of the 38 blocks |
+
+### Modified
+
+| File | Change |
+|------|----------|
+| `src/pipeline.py` | run_stage pattern + 11-Stage runner + built on PipelineContext |
+| `src/html_generator.py` | Inject context-based numbers + reference into the prompt, remove hard-coded CSS |
+| `src/space_allocator.py` | Font hierarchy + dynamic ratio back-calculation + design_budget calculation |
+| `src/content_verifier.py` | Add kiwipiepy to L1, add font-hierarchy validation to L3 |
+| `templates/catalog.yaml` | Complete the 19 missing schemas + add 20 visual_diff entries |
+
+### Unused (already unused in Phase S; deletion candidates)
+
+| File/function | Reason |
+|-----------|------|
+| `src/block_selector.py` | Removed in Phase R'. Replaced by block_reference.py in Stage 1.7 |
+| `src/content_editor.py` | The separate text-editing Stage was removed in Phase S |
+| `src/design_director.py` | Step B prompt removed. Only the preset-selection logic moves to space_allocator |
+
+---
+
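+## 9. Phase T scope vs. Phase ZZ preview
+
+### Phase T (current): font hierarchy + pipeline stabilization
+
+- Implement the full 11-Stage pipeline (PipelineContext + run_stage pattern)
+- Stage 0: MDX 4-layer parser
+- Stage 1A/1B: Pydantic validation + contradiction detection + source cross-check
+- Stage 1.5a: **fix font hierarchy + back-calculate dynamic ratio** (core of Phase T)
+- Stage 1.7: reference block selection (keyword matching + fallback)
+- Stage 1.5b: back-calculate design budget from block schema
+- Complete catalog.yaml schema to 38/38 + 20 visual_diff entries
+- Distributed validation (L1~L5) + Self-Refine retries
+- **Acceptance criteria:** for any MDX, font hierarchy is maintained, no overflow, quality 60+
+
+### Phase ZZ (final transition): judgment-system transition + workflow
+
+- Switch from Kei Persona API to direct Opus + Gitea wiki judgment criteria (decided after a comparative evaluation)
+- Differentiated preservation rates per chunk (verbatim / summary / core_80)
+- Reverse negotiation loop from Stage 1.5a to 1A/1B (request weight readjustment)
+- Switch to a Gitea issue-based workflow
+- Starlight `.astro` embedding
+- Decide whether to go responsive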