C.E.L_Slide_test2/ARCHITECTURE-PHASE-T.md

# Design Agent Architecture — Phase T

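> Pipeline that automatically generates fixed-size HTML slides (1280×720px) from MDX source documents
> **Font hierarchy first, containers follow**: preserve the text, enforce the font hierarchy, and back-calculate design-element sizes mathematically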

---

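## 1. Core Design Principles

### 1.1 Division of Roles: AI vs. Code

| Role | Owner | Stage |
|---|---|---|
| Content judgment and classification | AI (Kei Persona API / Opus) | 1A, 1B |
| Font hierarchy finalization + container ratio back-calculation | Code (deterministic math) | 1.5a |
| Block selection and variant decision | Code (keyword matching + lookup table) | 1.7 |
| Design budget back-calculation from the block schema | Code (deterministic math) | 1.5b |
| HTML generation | AI (Claude Sonnet 4) | 2 |
| Text and structure validation (L1-L3) | Code (kiwipiepy + regex) | right after 2 |
| Rendered measurement (L4) | Selenium (headless Chrome) | right after 3 |
| Visual quality evaluation (L5) | AI (Opus Vision) | 4 |
| HTML assembly and serving | Code | 3, 5 |

The AI cannot see space. This structure compensates for that fundamental limitation with code (mathematical budget back-calculation).
The tendency of LLMs to copy 70-90% of a reference HTML structure is used as an advantage, under a "design reference" framing.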

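### 1.2 Font Hierarchy (the core of Phase T and the starting point of every calculation)

In Phase S, font sizes ended up fully inverted relative to importance (sidebar 14px > key-msg 11px).
Phase T reverses the direction: **the hierarchy is fixed first, and the containers are fitted to it**.

| Area | Importance | Font range | Enforced rule |
|------|--------|----------|----------|
| Key message (`핵심`, key-msg) | 1st | **14px** bold | Always a single line; the largest font on the slide |
| Body (`본문`, core) | 2nd | **12px** | Default for body text |
| Background (`배경`, bg) | 3rd | **10-12px** | Adjusted within the range according to text volume |
| Attachment (`첨부`, sidebar) | 4th | **9-11px** | Reference material; may be the smallest |

**Validation rule:** `font_size(핵심) > font_size(본문) ≥ font_size(배경) > font_size(첨부)`. Any violation is an error.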
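
The validation rule above can be sketched as a small check. This is a minimal illustration, not the project's validator: the function name `validate_font_hierarchy` and the shape of its input (a dict of role name to px size) are assumptions made for this example.

```python
# Hypothetical helper; role keys follow the validation rule in this section.
ORDER = ["핵심", "본문", "배경", "첨부"]

def validate_font_hierarchy(sizes: dict) -> list:
    """Return the violations of 핵심 > 본문 >= 배경 > 첨부 (empty list if valid)."""
    key, body, bg, side = (sizes[role] for role in ORDER)
    errors = []
    if not key > body:
        errors.append(f"핵심 ({key}px) must be larger than 본문 ({body}px)")
    if not body >= bg:
        errors.append(f"본문 ({body}px) must be at least 배경 ({bg}px)")
    if not bg > side:
        errors.append(f"배경 ({bg}px) must be larger than 첨부 ({side}px)")
    return errors

# A valid Phase T assignment
print(validate_font_hierarchy({"핵심": 14, "본문": 12, "배경": 11, "첨부": 10}))  # → []
```

Because the 배경 (10-12px) and 첨부 (9-11px) ranges overlap, a combination such as 배경 10px with 첨부 11px stays inside both ranges yet still fails the rule, which is why the check is run on the final sizes rather than on the ranges.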

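### 1.3 Pipeline Operating Patterns

#### Cumulative context object (Pydantic BaseModel)

Instead of each Stage reading and writing its own JSON, a single `PipelineContext` travels through the pipeline and is extended step by step. Following the T-0 investigation, **Pydantic BaseModel** was adopted (not dataclass) for `model_dump_json()` serialization and `validate_assignment=True` type validation.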

```python
context.normalized.clean_text                           # Stage 0
context.normalized.title                                # Stage 0
context.normalized.images                               # Stage 0
context.normalized.popups                               # Stage 0
context.normalized.tables                               # Stage 0
context.analysis.core_message                           # Stage 1A
context.analysis.topics[0].source_hint                  # Stage 1A
context.analysis.page_structure                         # Stage 1A
context.topics[0].relation_type                         # Stage 1B
context.topics[0].expression_hint                       # Stage 1B
context.topics[0].source_data                           # Stage 1B
context.font_hierarchy                                  # Stage 1.5a
context.container_ratio                                 # Stage 1.5a (dynamic body:sidebar ratio)
context.containers["본심"].text_budget                   # Stage 1.5a
context.references["본심"].block_id                      # Stage 1.7
context.references["본심"].design_reference_html         # Stage 1.7
context.containers["본심"].design_budget                 # Stage 1.5b (recomputed after block selection)
context.generated_html                                  # Stage 2
context.rendered_html                                   # Stage 3
context.measurement                                     # Stage 4
context.quality_score                                   # Stage 4
```

#### Common execution pattern for every Stage

```python
async def run_stage(stage_fn, context, stage_name, max_retries=1):
    """Run one Stage, retrying up to max_retries times while it reports errors."""
    errors = []
    for attempt in range(max_retries + 1):
        result = await stage_fn(context)
        errors = result.get("_errors", [])
        if not errors:
            # Pydantic: use model_copy(update=...); drop the "_errors" bookkeeping key first
            update = {k: v for k, v in result.items() if k != "_errors"}
            context = context.model_copy(update=update)
            context.save_snapshot(stage_name)
            return context
        context.errors.append({"stage": stage_name, "attempt": attempt, "errors": errors})
        if attempt < max_retries:
            context.retry_feedback = build_retry_feedback(stage_name, errors)
    raise StageFailure(stage_name, errors)
```

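#### Three error grades

| Grade | Meaning | Response |
|------|------|------|
| **FATAL** | Unrecoverable (source problem, JSON parsing failure) | Abort the pipeline |
| **RETRYABLE** | Can be resolved by an AI retry (misclassification, omission) | Re-request with Self-Refine feedback (up to 2 times) |
| **ADJUSTABLE** | Can be adjusted automatically in code (insufficient height, ratio exceeded) | Adjust automatically, then log a warning |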
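
The three grades can be sketched as an enum plus a dispatch function. This is only an illustration: `Severity`, `decide_action`, and the action strings are hypothetical names, and aborting once the retry budget is used up is an assumption based on `run_stage` raising `StageFailure`.

```python
from enum import Enum

class Severity(Enum):
    FATAL = "FATAL"
    RETRYABLE = "RETRYABLE"
    ADJUSTABLE = "ADJUSTABLE"

def decide_action(severity: Severity, retries_used: int, max_retries: int = 2) -> str:
    """Map an error grade to the next pipeline action (hypothetical action names)."""
    if severity is Severity.FATAL:
        return "abort"
    if severity is Severity.ADJUSTABLE:
        return "adjust_and_warn"
    # RETRYABLE: retry while budget remains; assumed to abort afterwards
    return "retry" if retries_used < max_retries else "abort"
```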

#### Snapshot storage

`data/runs/{run_id}/step{N}_context.json`, where run_id is a `YYYYMMDD_HHMMSS` timestamp.
Serialized with Pydantic `model_dump_json()`. Changes between stages can be traced with `diff step1a_context.json step1b_context.json`.

---

## 2. Pipeline (11 Stages)

### Stage 0: MDX Normalization

- **Owner:** Code
- **New file:** `src/mdx_normalizer.py`
- **Libraries:** `python-frontmatter` + `markdown-it-py` + `mdit-py-plugins` (about 1MB in total)
- **Input:** raw MDX text
- **Processing (4-layer parser):**
  - **Layer 1:** `frontmatter.parse()` (from `python-frontmatter`) splits the input into `(metadata_dict, body_str)`. Extract the title.
  - **Layer 2:** Protect code blocks (fenced blocks → placeholders, from 10 backticks down to 3), then handle MDX-specific patterns:
    - Astro `:::directive` → `[핵심요약]...[/핵심요약]` markers
    - `<details><summary>title</summary>content</details>` → extracted into popups[]
    - Remove JSX `style={{}}` and `import/export`
  - **Layer 3:** `markdown-it-py` AST parsing (`js-default` preset, tables included by default):
    - heading tokens → extract the section structure (tag, level, content, source line)
    - image tokens → extract images[] (alt, src)
    - table tokens → extract tables[] (header, rows)
    - restore the code-block placeholders
  - **Layer 4:** Text cleanup: remove leftover HTML tags, tidy blank lines, produce the final clean_text
- **Output:**
  ```python
  {
      "clean_text": str,         # normalized plain text
      "title": str,              # frontmatter title
      "images": [{"alt": str, "path": str}],
      "popups": [{"title": str, "content": str}],
      "tables": [{"header": list, "rows": list}],
      "sections": [{"level": int, "title": str, "content": str}]  # split into sections at ##
  }
  ```
- **Validation:**
  - clean_text is not empty
  - at least one `##` section
  - at least 30% of the original text is preserved (guards against over-removal)
  - number of images[] = number of `![` patterns in the source
  - number of popups[] = number of `<details>` patterns in the source
- **Note:** change `r"^## \d+\.\s*"` in the existing `normalize_mdx()` to `r"^## \d+\.\s+"` (at least one whitespace character required)
- **Storage:** `context.normalized.*`
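
The code-block protection step of Layer 2 can be sketched as follows. This is a minimal illustration under stated assumptions: the function names and the `\x00CODE{n}\x00` placeholder format are hypothetical, and only backtick fences that start at the beginning of a line are handled.

```python
import re

def protect_code_blocks(text: str):
    """Replace fenced code blocks with placeholders, longest fence first (10 → 3 backticks)."""
    saved = []

    def stash(match):
        saved.append(match.group(0))
        return f"\x00CODE{len(saved) - 1}\x00"

    for n in range(10, 2, -1):
        fence = "`" * n
        # Opening fence of exactly n backticks, optional info string, lazy body, matching close
        pattern = re.compile(
            rf"^{fence}(?!`)[^\n]*\n.*?^{fence}(?!`)[ \t]*$",
            re.DOTALL | re.MULTILINE,
        )
        text = pattern.sub(stash, text)
    return text, saved

def restore_code_blocks(text: str, saved) -> str:
    """Put the original blocks back; newest placeholders first so nested ones resolve."""
    for i in reversed(range(len(saved))):
        text = text.replace(f"\x00CODE{i}\x00", saved[i])
    return text
```

Processing the longest fences first means a 4-backtick block that contains a 3-backtick example is stashed as one unit, so the MDX-specific rewrites never touch text that only looks like a directive or `<details>` tag inside code.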

---

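### Stage 1A: Kei Topic Extraction

- **Owner:** AI (Kei Persona API, localhost:8000, Opus, SSE streaming)
- **Input:** `context.normalized.clean_text` (text normalized in Stage 0)
- **Processing:** Kei reads the content, classifies it into topics, and designs the storyline
- **Output:**
  - `topics[]`: id, title, purpose, role, layer, weight, **source_hint** (reference to the source MDX section)
  - `page_structure`: { "본심": {topic_ids, weight}, "배경": {...}, "첨부": {...}, "결론": {...} } (core / background / attachment / conclusion)
  - `core_message`: the slide's key message in one line
- **Validation (Pydantic + code cross-check):**
  - **Format:** weight sum within 0.9-1.1, 본심 weight ≥ 0.3, required fields present, topics > 0
  - **Content cross-check:** compare the number of `##` sections in the source with the number of topics; a large gap suggests a classification error
  - **Content cross-check:** whether the topic summary keywords actually appear in the corresponding source section (kiwipiepy)
  - On failure, RETRYABLE → re-request with Self-Refine feedback (up to 2 times)
- **Storage:** `context.analysis`, `context.topics`, `context.page_structure`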
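
The format checks listed above can be sketched in plain Python. The document specifies Pydantic for this validation; the version below uses plain dicts only to keep the example self-contained, and the function name `validate_stage_1a` is an assumption.

```python
def validate_stage_1a(page_structure: dict, topics: list) -> list:
    """Format checks for Stage 1A output; returns error messages (empty if valid)."""
    errors = []
    if not topics:
        errors.append("topics must contain at least one entry")
    total = sum(area.get("weight", 0.0) for area in page_structure.values())
    if not 0.9 <= total <= 1.1:
        errors.append(f"weight sum {total:.2f} is outside 0.9-1.1")
    core_weight = page_structure.get("본심", {}).get("weight", 0.0)
    if core_weight < 0.3:
        errors.append(f"본심 weight {core_weight:.2f} is below 0.3")
    return errors
```

A non-empty result would be graded RETRYABLE and fed back to Kei.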

---

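### Stage 1B: Concept Refinement

- **Owner:** AI (Kei Persona API, Opus, SSE streaming)
- **Input:** `context.normalized.clean_text` + `context.topics` (Stage 1A result)
- **Processing:** assign a relation type, an expression hint, and a source-text reference to each topic
- **Output:** the following fields are merged into topics
  - `relation_type`: **enum of 7 values:** hierarchy / cause_effect / comparison / sequence / definition / inclusion / **none**
  - `expression_hint`: design direction hint (3-sentence structure: relation statement + content description + visual guidance)
  - `source_data`: reference to the source text
- **Validation (Pydantic + code cross-check + contradiction detection):**
  - **Format:** relation_type is one of the 7 enum values, expression_hint is not empty, source_data is not empty
  - **Contradiction decision table:**

    | purpose | Contradictory relation_type | Reason |
    |---------|---------------------|------|
    | 결론강조 (emphasize conclusion) | comparison, sequence | A conclusion is neither a comparison nor a sequence |
    | 문제제기 (raise a problem) | sequence, definition | Raising a problem is neither an ordered list nor a definition |
    | 용어정의 (define terms) | hierarchy, cause_effect | A list of definitions is neither hierarchical nor causal |
    | 구조시각화 (visualize structure) | none | If there is no relation to visualize, it is not structure visualization |

  - **source_data cross-check:** whether the source_data keywords actually appear in the source clean_text (kiwipiepy). A source that does not exist is flagged as a hallucination
  - **relation_type cross-check:** verified against Korean relation-expression patterns

    | relation_type | Patterns expected in the source (partial) |
    |---------------|-------------------------------|
    | comparison | vs, 반면, 차이점, 에 비해, 와 달리, 상이, 구분 |
    | sequence | →, 이후, 단계, 먼저, 점진적, 과정, 를 거쳐 |
    | hierarchy | 상위, 하위, 속하, 범주, 구성요소, 체계, 계층 |
    | inclusion | 포함, 융합, 통합, 결합, 내포, 포괄, 연계 |
    | cause_effect | 때문에, 따라서, 결과, 로 인해, 초래, 야기, 기인 |
    | definition | 이란, 정의, 의미, 을 말한다, 라 함은, 용어 |

  - On failure, RETRYABLE → re-request only the contradictory or mismatched topics with feedback (up to 2 times)
- **Storage:** `context.topics[].relation_type`, `.expression_hint`, `.source_data`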
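
The contradiction table and the pattern cross-check can be sketched as two lookups. This is an illustrative sketch: `check_topic` is a hypothetical name, only two pattern rows are reproduced, and plain substring matching stands in for the regex and kiwipiepy-based checks the document describes.

```python
CONTRADICTIONS = {
    "결론강조": {"comparison", "sequence"},
    "문제제기": {"sequence", "definition"},
    "용어정의": {"hierarchy", "cause_effect"},
    "구조시각화": {"none"},
}

# Two rows from the pattern table above, for illustration only
RELATION_PATTERNS = {
    "comparison": ["vs", "반면", "차이점", "에 비해", "와 달리", "상이", "구분"],
    "cause_effect": ["때문에", "따라서", "결과", "로 인해", "초래", "야기", "기인"],
}

def check_topic(topic: dict, clean_text: str) -> list:
    """Return contradiction and pattern-mismatch errors for one topic."""
    errors = []
    rel = topic["relation_type"]
    if rel in CONTRADICTIONS.get(topic["purpose"], set()):
        errors.append(f"purpose {topic['purpose']} contradicts relation_type {rel}")
    patterns = RELATION_PATTERNS.get(rel)
    if patterns and not any(p in clean_text for p in patterns):
        errors.append(f"no {rel} pattern found in the source text")
    return errors
```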

---

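### Stage 1.5a: Font Hierarchy Finalization + Container Ratio Back-Calculation

- **Owner:** Code (not AI, deterministic math)
- **Input:** page_structure weights + the amount of source_data text in each area
- **Core principle:** **fonts first, containers follow**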

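#### (1) Compute required space from the font hierarchy

```python
import math

FONT_HIERARCHY = {
    "핵심":  {"min": 14, "max": 14, "weight": "bold"},
    "본심":  {"min": 12, "max": 12},
    "배경":  {"min": 10, "max": 12},
    "첨부":  {"min": 9,  "max": 11},
}

def calculate_required_space(role, content, font_size, available_width, padding=0):
    """How many px of height does this text need at this font size?"""
    char_width_px = font_size * 0.947   # measured ratio for Pretendard Hangul
    line_height_px = font_size * 1.5    # body-text line height
    chars_per_line = max(1, int(available_width // char_width_px))
    total_lines = math.ceil(len(content) / chars_per_line)  # a partial line still takes a full line
    required_height = total_lines * line_height_px + padding
    return required_height
```

#### (2) Back-calculate the dynamic body:sidebar ratio

The ratio is not fixed at 65:35; it is back-calculated from the text volume:

```python
SLIDE_WIDTH_PX = 1280  # fixed slide width

def calculate_container_ratio(roles_text_volume, font_hierarchy):
    """Back-calculate a ratio that fits all text while keeping the font hierarchy.

    Assumes roles_text_volume maps each role to its text, e.g. {"본심": "...", "첨부": "..."}.
    """
    sidebar_text = roles_text_volume.get("첨부", "")
    body_roles = [(r, t) for r, t in roles_text_volume.items() if r != "첨부"]
    sidebar_font = font_hierarchy["첨부"]["max"]
    sidebar_width = SLIDE_WIDTH_PX * 0.35
    body_width = SLIDE_WIDTH_PX * 0.65

    # 1. Required space per role at the hierarchy's reference font
    #    (not used by the threshold rule below; available for logging and validation)
    sidebar_need = calculate_required_space("첨부", sidebar_text, sidebar_font, sidebar_width)
    body_need = sum(calculate_required_space(r, t, font_hierarchy[r]["max"], body_width)
                    for r, t in body_roles)

    # 2. Decide the ratio from the sidebar fill rate
    #    estimate_capacity(width_px, font_size) -> max chars; helper not shown here
    sidebar_capacity_at_35 = estimate_capacity(sidebar_width, sidebar_font)
    fill_rate = len(sidebar_text) / sidebar_capacity_at_35

    if fill_rate < 0.5:
        ratio = (72, 28)   # little sidebar text → enlarge body
    elif fill_rate < 0.8:
        ratio = (68, 32)   # moderate
    else:
        ratio = (65, 35)   # keep the current split

    return ratio  # (body_pct, sidebar_pct)
```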

#### (3) Compute the text budget

Once the ratio is fixed, compute the text budget of each area:

```python
def calculate_text_budget(container, content, font_size, padding=0):
    """Capacity of a container at a given font size.

    padding: inner padding in px (0 is a placeholder default).
    TextBudget is assumed to be a result model defined elsewhere.
    """
    char_width_px = font_size * 0.947
    line_height_px = font_size * 1.5
    inner_width = container.width_px - padding * 2
    inner_height = container.height_px - padding * 2

    chars_per_line = int(inner_width / char_width_px)
    max_lines = int(inner_height / line_height_px)
    max_chars = chars_per_line * max_lines

    source_chars = len(content)
    needs_compression = source_chars > max_chars

    return TextBudget(
        font_size=font_size,
        chars_per_line=chars_per_line,
        max_lines=max_lines,
        max_chars=max_chars,
        source_chars=source_chars,
        needs_compression=needs_compression,
    )
```
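
As a worked example of the arithmetic, the same formulas can be evaluated for a 707×176px container at 11px. The helper `text_budget` below is a hypothetical, dependency-free reduction of the function above, and the 20px padding in the second call is an illustrative value.

```python
def text_budget(width_px, height_px, font_size, padding=0):
    """Return (chars_per_line, max_lines, max_chars) using the 0.947 and 1.5 ratios."""
    char_width_px = font_size * 0.947     # 11px → 10.417px per character
    line_height_px = font_size * 1.5      # 11px → 16.5px per line
    chars_per_line = int((width_px - padding * 2) / char_width_px)
    max_lines = int((height_px - padding * 2) / line_height_px)
    return chars_per_line, max_lines, chars_per_line * max_lines

print(text_budget(707, 176, 11))              # → (67, 10, 670)
print(text_budget(707, 176, 11, padding=20))  # → (64, 8, 512)
```

The second call shows how strongly padding affects capacity: 20px on each side removes about a quarter of the character budget.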

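#### (4) Multi-column layout decision

If the text does not fit even at the smallest font within the hierarchy range, the structure is changed:

```
1. Compute capacity at the hierarchy's reference font (max)
2. Text volume > capacity → shrink the font by 1px (down to the hierarchy min)
3. Does not fit at the min font → change the layout (1 column → 2 columns)
4. Does not fit with 2 columns → adjust the ratio (shrink sidebar → enlarge body)
5. Does not fit after ratio adjustment → warn that text editing is required (recorded in context.warnings)
```

- **Validation:** sum of height_px ≤ total height, font hierarchy preserved, no negative values
- **Storage:** `context.font_hierarchy`, `context.container_ratio`, `context.containers[].text_budget`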
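
The five-step ladder in (4) can be sketched as a search over settings. This is a sketch under assumptions: `fit_text` and its `capacity` callback are hypothetical, and the ratio steps reuse the three splits from (2).

```python
RATIO_STEPS = [(65, 35), (68, 32), (72, 28)]  # body share grows from left to right

def fit_text(text_len, font_min, font_max, capacity):
    """Return the first setting that fits, or None if text editing is required.

    capacity(font_size, columns, ratio) -> max chars; supplied by the caller.
    """
    base = RATIO_STEPS[0]
    # Steps 1-2: start at the reference font, shrink 1px at a time down to min
    for font in range(font_max, font_min - 1, -1):
        if text_len <= capacity(font, 1, base):
            return {"font": font, "columns": 1, "ratio": base}
    # Step 3: switch to two columns at the min font
    if text_len <= capacity(font_min, 2, base):
        return {"font": font_min, "columns": 2, "ratio": base}
    # Step 4: enlarge the body share
    for ratio in RATIO_STEPS[1:]:
        if text_len <= capacity(font_min, 2, ratio):
            return {"font": font_min, "columns": 2, "ratio": ratio}
    # Step 5: nothing fits; the caller records a warning
    return None
```

Returning `None` rather than raising keeps step 5 an ADJUSTABLE-style outcome: the caller can append to `context.warnings` and continue.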

---

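### Stage 1.7: Reference Block Selection + Variant Decision

- **Owner:** Code (keyword matching + lookup table, not AI)
- **Input:** relation_type + expression_hint from 1B, container specs from 1.5a, catalog.yaml
- **Processing in 4 steps, plus a fallback:**

#### (1) relation_type → block candidates (first filter)

Filter on the `relation_types` field of catalog.yaml: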

```python
candidates = [b for b in catalog.blocks
              if relation_type in b.relation_types or not b.relation_types]
```

#### (2) expression_hint → block refinement (second filter: keyword containment)

Because expression_hint is a long sentence, matching is **keyword containment (substring) rather than exact string equality**:

```python
VISUAL_TYPE_KEYWORDS = {
    "인과": {"keywords": ["인과", "현상->결과", "야기", "원인"], "blocks": ["callout-warning", "dark-bullet-list"]},
    "나열_병렬": {"keywords": ["독립적 나열", "병렬 나열", "개별 증거"], "blocks": ["dark-bullet-list", "card-icon-desc"]},
    "나열_정의": {"keywords": ["독립적 정의", "용어", "참조용"], "blocks": ["card-numbered"]},
    "포함_계층": {"keywords": ["상위-하위", "포함 관계", "계층적"], "blocks": ["venn-diagram", "keyword-circle-row"]},
    "강조_결론": {"keywords": ["핵심 메시지 강조", "임팩트", "한 줄 강조"], "blocks": ["banner-gradient", "quote-big-mark"]},
    "비교": {"keywords": ["대등 비교", "좌우 대조", "vs"], "blocks": ["compare-2col-split", "compare-3col-badge"]},
    "순서": {"keywords": ["시간 순서", "단계별", "A->B->C"], "blocks": ["flow-arrow-horizontal", "process-horizontal"]},
}

def match_visual_type(expression_hint: str) -> str:
    """Find a keyword in expression_hint and return the matching visual type."""
    for vtype, spec in VISUAL_TYPE_KEYWORDS.items():
        if any(kw in expression_hint for kw in spec["keywords"]):
            return vtype
    return "default"
```

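Rationale for the visual mapping (Gestalt principles):
- Closure → hierarchy/inclusion → circles (Venn diagram)
- Proximity → comparison → side-by-side table/comparison
- Continuity → sequence → arrow flow
- Similarity → definition → repeated cards of the same shape

Supporting research: PPTAgent (EMNLP 2025) demonstrates the effectiveness of "reference-based generation".

#### (3) Container size fit check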

```python
candidates = [b for b in candidates
              if b.min_height_px <= container.height_px]
```

#### (4) Automatic selection of block variant + layout

```python
def select_block_variant(block, container, content, container_ratio):
    """Pick a variant id for the block. container_ratio is (body_pct, sidebar_pct)."""
    if not block.variants or len(block.variants) <= 1:
        return block.id, "default"

    for variant in block.variants:
        if variant.id == "compact" and container.height_px < 150:
            return block.id, "compact"
        if variant.id == "wide" and container_ratio[0] >= 70:  # body is 70% or more
            return block.id, "wide"

    return block.id, "default"
```

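#### (5) Fallback definition

Default block per category when no candidate passes every filter:

| Category | Fallback block | Reason |
|----------|-------------|------|
| cards | card-numbered | Most general-purpose; handles compact through xlarge |
| emphasis | dark-bullet-list | Text-centric, flexible height |
| visuals | venn-diagram | Can lay out N items automatically |
| tables | compare-2col-split | The most basic comparison |
| media | image-side-text | Text + image combination |

#### Design reference HTML generation

A complete HTML snippet with the Jinja variables replaced by sample data, plus comments that state the structural intent.
The LLM copies 70-90% of this structure, so it follows a verified structure instead of "inventing" a layout.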

```python
def generate_design_reference(block, variant, catalog_entry):
    template = load_template(block.template)
    sample_data = build_sample_data(catalog_entry.slots)
    rendered = template.render(**sample_data)

    # Add structure-intent comments (so the LLM reads the intent accurately)
    annotated = f"<!-- {block.id}: {catalog_entry.visual} -->\n"
    visual_diff = getattr(catalog_entry, "visual_diff", None)  # optional field
    if visual_diff:
        annotated += f"<!-- Difference: {visual_diff} -->\n"
    annotated += rendered

    return annotated
```

- **Output:**

```json
{
  "block_id": "dark-bullet-list",
  "variant": "default",
  "visual_type": "인과",
  "schema": {
    "title": {"max_lines": 1, "font_size": 16, "max_chars": 30},
    "bullet_item": {"max_lines": 1, "font_size": 14, "max_chars": 86},
    "max_bullets": 5
  },
  "design_reference_html": "<!-- dark-bullet-list: dark background + blue title + white bullets -->\n<div ...>..."
}
```

- **Validation:** the selected block actually exists in catalog.yaml, and min_height_px ≤ container.height_px
- **Storage:** `context.references["본심"].*`

---

### Stage 1.5b: Design Budget Recalculation (after block selection)

- **Owner:** Code (not AI)
- **Input:** schema of the block selected in Stage 1.7 + container specs from Stage 1.5a
- **Purpose:** the space left after reserving the text area is the budget for design elements. **The direction is to fit the size of shapes, images, and CSS elements, not to cut the text.**

```python
def calculate_design_budget(container, text_budget, block_schema, padding=0):
    """padding: inner padding in px (0 is a placeholder default)."""
    # Sum the height of each text slot in the block schema
    text_height = 0
    for slot_name, spec in block_schema.items():
        if slot_name.startswith("max_"):   # count limits such as max_bullets are not slots
            continue
        slot_lines = spec.get("max_lines", 1)
        slot_font = spec.get("font_size", 14)
        text_height += slot_lines * (slot_font * 1.6)

    remaining_height = container.height_px - text_height - padding
    remaining_width = container.width_px - padding

    return DesignBudget(
        available_height_px=remaining_height,
        available_width_px=remaining_width,
        max_circle_diameter=min(remaining_height, remaining_width) - 4,
        max_img_width=remaining_width * 0.4,
        max_img_height=remaining_height,
        fits=remaining_height >= 0,
    )
```

- **Validation:** available_height_px ≥ 0 (a negative value means the block does not fit the container → reselect in Stage 1.7 or treat as ADJUSTABLE)
- **Storage:** `context.containers["본심"].design_budget`

---

### Stage 2: HTML Generation

- **Owner:** AI (Claude Sonnet 4, direct Anthropic API, current model: `claude-sonnet-4-20250514`)
- **Input:** source text + the entire cumulative context
- **Processing:** generate HTML with **one separate call per area** (배경/본심/첨부/결론)

Prompt composition: every number is passed **concretely** (lesson from Phase S: abstract prompts fail):

| Source | Content included |
|------|----------|
| Stage 0 | clean_text (source text, with "use this text as-is") |
| Stage 1A | core_message |
| Stage 1B | expression_hint, relation_type |
| Stage 1.5a | finalized font size, line count, character count, container px |
| Stage 1.5b | size constraints for design elements (max_circle_px, max_img_width, etc.) |
| Stage 1.7 | design reference HTML + visual_diff description |

Prompt example:

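```
[Design reference]
Follow the structure and color pattern of the HTML below, but replace the content.
<!-- dark-bullet-list: dark background + blue title + white bullets -->
<!-- Difference: unlike callout-warning, which is also dark-themed, there is no warning icon. For plain listing only. -->
<div style="background:#1a2332; padding:20px; border-radius:6px;">
  <!-- SLOT: title (1 line, 16px bold, max 30 chars) -->
  <h3 style="color:#4a9eff; font-size:16px; font-weight:700;">Sample title</h3>
  <!-- SLOT: bullets (1 line each, 14px, max 86 chars, max 5 items) -->
  <ul style="list-style:none; padding:0;">
    <li style="color:#e2e8f0; font-size:14px; padding:4px 0;">• Sample item 1</li>
  </ul>
</div>

[Numeric constraints: must be followed]
- Container: width 707px, height 176px
- Font: 11px (hierarchy for the 배경 area)
- Max 68 characters per line
- Max 10 lines
- Design element budget: height 84px, width 707px

[Source text: do not abridge or alter]
"DX와 BIM이 개념적으로 명확히 정립되지 않은채 혼용되어 사용되고 있음..."

[Mandatory rules]
- Inline styles only; <style> blocks are forbidden
- overflow:hidden is forbidden
- Follow the structure of the design reference, but customize it to the content
- Use a uniform bullet-note style (narrative endings ~하다/~이다 → note-style endings ~에 해당/~인식되는 중)
```

- **Image handling:** the paths and size information of the `images[]` extracted in Stage 0 are included in the prompt. They are converted to inline base64 in Stage 5.
- **Popup handling:** the prompt instructs the model to convert the `popups[]` extracted in Stage 0 into `<details>/<summary>` HTML.
- **Output:** `{body_html, sidebar_html, footer_html}`
- **Storage:** `context.generated_html`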

---

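### Distributed Validation System

The five validation layers are not concentrated in one place; **each layer runs at the point where it fits best**.
Retry prompts follow the Self-Refine (NeurIPS 2023) pattern: `localization + evidence + instruction`.
Scorer and Suggester are separated as in VASCAR (2024): scoring and feedback generation are distinct steps.

#### Right after Stage 2: L1 + L2 + L3 (code validation)

| Layer | Tool | What is checked |
|---|---|---|
| L1 | kiwipiepy + regex | Keyword preservation rate of 80% or higher |
| L2 | regex | Kei's notes (such as "간결한 문제 제기용", "for a concise problem statement") have not leaked into the output |
| L3 | regex | No overflow:hidden, no font hierarchy violation, inline styles only |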
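
The L3 checks can be sketched with three regular expressions. This is a simplified illustration: `check_l3` is a hypothetical name, and the font check only compares every `font-size` against a single per-area maximum rather than validating the full hierarchy.

```python
import re

OVERFLOW_HIDDEN = re.compile(r"overflow\s*:\s*hidden", re.IGNORECASE)
STYLE_BLOCK = re.compile(r"<style\b", re.IGNORECASE)
FONT_SIZE = re.compile(r"font-size\s*:\s*(\d+(?:\.\d+)?)px", re.IGNORECASE)

def check_l3(html: str, max_font_px: float) -> list:
    """Return L3 structure violations found in one area's HTML."""
    errors = []
    if OVERFLOW_HIDDEN.search(html):
        errors.append("overflow:hidden detected")
    if STYLE_BLOCK.search(html):
        errors.append("<style> block detected (inline styles only)")
    too_big = [float(s) for s in FONT_SIZE.findall(html) if float(s) > max_font_px]
    if too_big:
        errors.append(f"font-size above {max_font_px}px: {sorted(set(too_big))}")
    return errors
```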

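On failure → re-run Stage 2 with a Self-Refine prompt (up to 2 times):

```
[Regeneration request - attempt 2/3]
Problems in the previous generation:
1. L1: keyword preservation rate 65%. Missing: {'BIM', '혼용', '설계오류'}
2. L3: overflow:hidden detected

Fix instructions (Self-Refine):
- localization: keyword preservation failure, structure violation
- evidence: 3 key source keywords missing, overflow:hidden present
- instruction: include the missing keywords, remove overflow:hidden, keep all other constraints
```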
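
A retry prompt of this shape can be assembled from structured errors. The sketch below assumes each error is a dict with `layer`, `localization`, `evidence`, and `instruction` keys (the same keys used by the L4 validator later in this document); the function name `build_retry_prompt` is hypothetical.

```python
def build_retry_prompt(errors: list, attempt: int, max_attempts: int = 3) -> str:
    """Format validation errors as a Self-Refine style regeneration request."""
    lines = [
        f"[Regeneration request - attempt {attempt}/{max_attempts}]",
        "Problems in the previous generation:",
    ]
    for i, err in enumerate(errors, start=1):
        lines.append(f"{i}. {err['layer']}: {err['evidence']}")
    lines.append("")
    lines.append("Fix instructions (Self-Refine):")
    for field in ("localization", "evidence", "instruction"):
        joined = ", ".join(err[field] for err in errors)
        lines.append(f"- {field}: {joined}")
    return "\n".join(lines)
```

Keeping the three Self-Refine fields as separate keys lets the scorer produce `evidence` while a separate suggester fills in `instruction`.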

---

### Stage 3: Rendering (Assembly)

- **Owner:** Code (not AI)
- **Input:** body/sidebar/footer HTML that passed L1-L3 + preset grid + **dynamic ratio**
- **Processing:**
  - Merge tokens.css + base.css inline
  - Build the CSS Grid frame with the **dynamic ratio applied** (e.g. `72fr 28fr` or `65fr 35fr`)
  - Insert each area's HTML into `<div class="area-body">` and its siblings
  - Include the Pretendard font CDN link
- **Output:** a complete standalone HTML file
- **Storage:** `context.rendered_html`
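
The grid frame with a dynamic ratio can be sketched as a string builder. Only `area-body` comes from this document; the `slide`, `area-sidebar`, and `area-footer` class names, the row layout, and the function name are assumptions for illustration.

```python
def build_grid_frame(body_html, sidebar_html, footer_html, ratio=(65, 35)):
    """Wrap the three area fragments in a 1280x720 CSS Grid using the dynamic ratio."""
    body_pct, sidebar_pct = ratio
    if body_pct + sidebar_pct != 100:
        raise ValueError(f"ratio must sum to 100, got {ratio}")
    return (
        '<div class="slide" style="width:1280px;height:720px;display:grid;'
        f'grid-template-columns:{body_pct}fr {sidebar_pct}fr;'
        'grid-template-rows:1fr auto;">'
        f'<div class="area-body">{body_html}</div>'
        f'<div class="area-sidebar">{sidebar_html}</div>'
        f'<div class="area-footer" style="grid-column:1 / -1;">{footer_html}</div>'
        '</div>'
    )
```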

#### Right after Stage 3: L4 (Selenium measurement)

```python
def validate_after_stage3(context, rendered_html):
    measurements = selenium_measure(rendered_html)
    errors = []
    for area, m in measurements.items():
        if m.scroll_height > m.client_height:
            overflow = m.scroll_height - m.client_height
            errors.append({
                "layer": "L4", "severity": "RETRYABLE",
                "localization": f"{area} overflow {overflow}px",
                "evidence": f"scrollHeight {m.scroll_height} > clientHeight {m.client_height}",
                "instruction": f"Shrink this area's design elements by {overflow+10}px or remove one bullet"
            })
    return errors  # on failure → only the failing area goes back to Stage 2 (up to 2 times)
```

---

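### Stage 4: Quality Gate (L5)

- **Owner:** Selenium (screenshot capture) + Opus Vision (quality judgment, current model: `claude-opus-4-20250514`)
- **Processing:**
  - Capture a full-page screenshot → send it to Opus Vision as base64
  - Five evaluation criteria (VASCAR style):
    1. Is the content free of overlap and clipping?
    2. Is the 본심 area visually the most prominent?
    3. Are the fonts large enough to read? **Is the font hierarchy preserved?**
    4. Is it appropriate as a Korean business presentation?
    5. Is there variety in the block types?
  - Score from 0 to 100:
    - below 30 → block the output (FATAL)
    - 30 to below 60 → re-run Stage 2 with Opus feedback
    - 60 or above → proceed to Stage 5
- **Difference from L4:** L4 is a per-area px measurement (right after Stage 3); L5 is a **visual** evaluation of the whole page after assembly
- **Storage:** `context.measurement`, `context.quality_score`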
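
The score bands can be sketched as a small gate function. The function name and return strings are hypothetical, and because this document does not say what happens when a mid-band score persists after the single L5 retry, the sketch only reports that state instead of choosing a policy.

```python
def quality_gate(score: int, l5_retries_used: int = 0, max_l5_retries: int = 1) -> str:
    """Map an L5 score (0-100) to the next action."""
    if not 0 <= score <= 100:
        raise ValueError(f"score out of range: {score}")
    if score < 30:
        return "block"            # FATAL: output is blocked
    if score >= 60:
        return "serve"            # proceed to Stage 5
    if l5_retries_used < max_l5_retries:
        return "retry_stage2"     # re-run Stage 2 with Opus feedback
    return "retries_exhausted"    # policy for this case is left to the caller
```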

---

### Stage 5: Serving

- **Owner:** Code
- **Processing:**
  - Convert image paths to inline base64 (so images also display in the downloaded HTML)
  - Insert JS that auto-expands `<details>` when printing (`window.onbeforeprint`)
  - Save final.html
- **Storage:** `data/runs/{run_id}/final.html`
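
The base64 inlining step can be sketched with the standard library. This is a minimal sketch, assuming double-quoted `src` attributes and image paths relative to a base directory; `inline_images` is a hypothetical name, and remote URLs and missing files are left untouched.

```python
import base64
import mimetypes
import re
from pathlib import Path

IMG_SRC = re.compile(r'(<img\b[^>]*?\bsrc=")([^"]+)(")', re.IGNORECASE)

def inline_images(html: str, base_dir: Path) -> str:
    """Replace local <img src="..."> paths with data: URIs."""
    def to_data_uri(match):
        src = match.group(2)
        if src.startswith(("data:", "http://", "https://")):
            return match.group(0)          # already inline or remote
        path = base_dir / src
        if not path.is_file():
            return match.group(0)          # leave unresolved paths as they are
        mime = mimetypes.guess_type(path.name)[0] or "application/octet-stream"
        data = base64.b64encode(path.read_bytes()).decode("ascii")
        return f"{match.group(1)}data:{mime};base64,{data}{match.group(3)}"

    return IMG_SRC.sub(to_data_uri, html)
```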

---

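## 3. Validation Flow Summary

```
Stage 1A (Kei analysis)
    ↓
  1A validation (Pydantic + source cross-check) ──fail──→ re-request to Kei (up to 2 times)
    ↓ pass
Stage 1B (concept refinement)
    ↓
  1B validation (contradiction detection + source cross-check) ──fail──→ re-request to Kei (up to 2 times)
    ↓ pass
Stage 1.5a → 1.7 → 1.5b (code, deterministic)
    ↓
Stage 2 (HTML generation)
    ↓
  L1+L2+L3 ──fail──→ Self-Refine → re-run Stage 2 (up to 2 times)
    ↓ pass
Stage 3 (assembly)
    ↓
  L4 ──fail──→ only the failing area goes back to Stage 2 (up to 2 times)
    ↓ pass
Stage 4 (L5 final judgment)
    ↓
  below 30       → block (FATAL)
  30 to below 60 → re-run Stage 2 with Opus feedback (up to 1 time)
  60 or above    → Stage 5
```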

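### Total retry budget

| Point | Max retries | Scope |
|------|-----------|------|
| Stage 1A | 2 | All of Kei's output |
| Stage 1B | 2 | Failing topics only |
| L1-L3 | 2 | Failing areas only |
| L4 | 2 | Failing areas only |
| L5 | 1 | Everything |
| **Worst-case total** | **Stage 2 re-run up to 5 times** (2 + 2 + 1, on top of the initial call) | Areas are independent, so this is per area |

Overall pipeline timeout: 300 seconds. On timeout, return the best result so far with a warning.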

---

## 4. Catalog System (catalog.yaml)

### 4.1 Block structure

38 blocks in **6 categories:**

| Category | Blocks | Use |
|----------|---------|------|
| headers | 5 | Topic/section titles |
| cards | 9 | Item listing/comparison |
| **tables** | **3** | Comparison tables, striped tables |
| **visuals** | **6** | Venn diagrams, processes, flows |
| emphasis | 10 | Emphasis/callouts/banners/bullets |
| media | 5 | Image placement |

Separate section: **4 layout presets** (sidebar-right, two-column, hero-detail, single-column)

### 4.2 Block metadata (current state + Phase T additions)

```yaml
- id: block-id
  name: Korean name
  category: headers | cards | tables | visuals | emphasis | media
  template: blocks/category/block-id.html
  height_cost: compact | medium | large | xlarge
  min_height_px: 80
  relation_types: [comparison, cause_effect]   # empty array = allowed for every relation
  min_items: 2                                  # present on 19/38 blocks
  max_items: 5                                  # present on 19/38 blocks
  visual: "visual description"
  when: "situations where it fits"
  not_for: "situations where it does not fit"
  purpose_fit: [핵심전달, 문제제기]
  zone: full-width-only                         # present on 4/38 blocks (optional)
  slots:
    required: [title, description]
    optional: [icon, source]
  # --- Required additions in Phase T ---
  schema:                                        # ★ currently 19/38 → must reach 38/38
    title: {max_lines: 1, font_size: 16, ref_chars: {body: 30, sidebar: 20}}
    description: {max_lines: 3, font_size: 14, ref_chars: {body: 150, sidebar: 90}}
  visual_diff: |                                 # ★ new (T-4), to be added to 20 similar blocks
    Difference from similar blocks: ...
  variants:                                      # currently 4/38 → extend as needed
    - id: default
    - id: compact
```

### 4.3 expression_hint → block mapping (keyword containment matching)

| Visual type | Matching keywords | Mapped blocks |
|---|---|---|
| 인과 | "인과", "현상->결과", "야기" | callout-warning, dark-bullet-list |
| 나열_병렬 | "독립적 나열", "병렬", "개별 증거" | dark-bullet-list, card-icon-desc |
| 나열_정의 | "독립적 정의", "참조용", "용어" | card-numbered |
| 포함_계층 | "상위-하위", "포함 관계", "계층적" | venn-diagram, keyword-circle-row |
| 강조_결론 | "핵심 메시지 강조", "임팩트" | banner-gradient, quote-big-mark |
| 비교 | "대등 비교", "좌우 대조", "vs" | compare-2col-split, compare-3col-badge |
| 순서 | "시간 순서", "단계별", "A->B->C" | flow-arrow-horizontal, process-horizontal |

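### 4.4 Improvements required in Phase T

| Item | Current | Target | Priority |
|------|------|------|----------|
| Complete schema | 19/38 | 38/38 | High (required by Stage 1.5b) |
| Add visual_diff | 0/38 | 20/38 (groups of similar blocks) | Medium (T-3 prompt quality) |
| Consistency check between ref_chars and container width | None | Automatic check at startup | High |
| Standalone block rendering screenshots | None | 38 PNGs | Medium (evidence for visual_diff) |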
---

## 5. Data Flow Summary

```
[Source MDX]
    │
    ▼
Stage 0 ─── code ──→ context.normalized
    │       (python-frontmatter + markdown-it-py)
    │       {clean_text, title, images[], popups[], tables[], sections[]}
    ▼
Stage 1A ── Kei ──→ context.analysis
    │               {topics[], page_structure, core_message}
    │               ✓ Pydantic validation + source cross-check
    ▼
Stage 1B ── Kei ──→ context.topics[].relation_type, .expression_hint, .source_data
    │               ✓ contradiction detection + source pattern cross-check
    ▼
Stage 1.5a ─ code ──→ context.font_hierarchy, .container_ratio, .containers[].text_budget
    │                   (fix font hierarchy → back-calculate ratio → text budget)
    ▼
Stage 1.7 ── code ──→ context.references[]
    │                   (relation_type first + expression_hint second → block + variant)
    │                   (Jinja substitution → design reference HTML)
    ▼
Stage 1.5b ─ code ──→ context.containers[].design_budget
    │                   (back-calculate design element sizes from the block schema)
    ▼
Stage 2 ── Sonnet ──→ context.generated_html
    │                   (design reference + concrete numbers + source text)
    │                   ✓ L1+L2+L3 code validation
    ▼
Stage 3 ── code ──→ context.rendered_html
    │                 (grid assembly with dynamic ratio)
    │                 ✓ L4 Selenium measurement
    ▼
Stage 4 ── Opus ──→ context.quality_score
    │                (screenshot-based visual evaluation, blocked below 30)
    ▼
Stage 5 ── code ──→ final.html
                     (image base64 conversion, details JS insertion)
```

---

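## 6. Error Handling

| Failure point | Grade | Recovery strategy |
|---|---|---|
| Stage 0 validation failure | FATAL | The source MDX itself is the problem; return an error to the user |
| Stage 1A Pydantic failure | RETRYABLE | Re-request to Kei with Self-Refine feedback (up to 2 times) |
| Stage 1B contradiction detected | RETRYABLE | Re-request only the contradictory topics with feedback (up to 2 times) |
| Stage 1.5a abnormal values | FATAL | Deterministic, so retrying is pointless; the input must be checked |
| Stage 1.7 no suitable block | ADJUSTABLE | Select the category fallback block + log a warning |
| Stage 1.5b negative budget | ADJUSTABLE | Shrink the font by 1px or reselect the block |
| L1-L3 failure | RETRYABLE | Re-run Stage 2 with a Self-Refine prompt (up to 2 times) |
| L4 overflow | RETRYABLE | Only the failing area goes back to Stage 2 + concrete px feedback (up to 2 times) |
| L5 below 30 | FATAL | Block the output + log the error |
| L5 30 to below 60 | RETRYABLE | Re-run Stage 2 with Opus feedback (up to 1 time) |
| Timeout (300 seconds) | FATAL | Return the best result so far + warning |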

---

## 7. Tech Stack (Phase T)

| Component | Technology | Notes |
|---|---|---|
| Web server | **FastAPI** + uvicorn | Port 8001, SSE streaming |
| Pipeline runtime | Python (async) | Pydantic BaseModel (PipelineContext) |
| MDX parsing | python-frontmatter + markdown-it-py + mdit-py-plugins | About 1MB added |
| Content judgment | Kei Persona API (localhost:8000, Opus, SSE) | httpx streaming |
| HTML generation | Claude Sonnet 4 (Anthropic API) | One call per area |
| Korean keyword extraction | kiwipiepy | L1 validation + Stage 1A/1B source cross-check |
| Relation expression patterns | 7 regex sets (one per relation_type) | Supports Stage 1B validation |
| Visual quality evaluation | Opus Vision (Anthropic API) | L5, screenshot-based |
| Rendered measurement | Selenium headless Chrome | L4, 1280×920 viewport |
| Block catalog | catalog.yaml (38 blocks) | schema must reach 38/38 |
| Template engine | Jinja2 | Block HTML rendering |
| Design tokens | tokens.css + base.css | Pretendard Variable |
| HTTP client | httpx | SSE communication with the Kei API |
| Snapshot storage | JSON (Pydantic model_dump_json) | `data/runs/{run_id}/` |

---

## 8. Migration Map (current code → Phase T)

### New files

| File | Role |
|------|------|
| `src/pipeline_context.py` | PipelineContext Pydantic model |
| `src/mdx_normalizer.py` | Stage 0 MDX parser (4-layer) |
| `src/validators.py` | Stage 1A/1B Pydantic schemas + contradiction detection + source cross-check |
| `src/block_reference.py` | Stage 1.7 block selection + design reference generation |
| `scripts/capture_block_screenshots.py` | Standalone rendering screenshots of the 38 blocks |

### Modified files

| File | Change |
|------|----------|
| `src/pipeline.py` | run_stage pattern + 11-Stage runner + PipelineContext-based |
| `src/html_generator.py` | Inject context-based numbers and reference into the prompt, remove hard-coded CSS |
| `src/space_allocator.py` | Font hierarchy + dynamic ratio back-calculation + design_budget calculation |
| `src/content_verifier.py` | Add kiwipiepy to L1, add font hierarchy validation to L3 |
| `templates/catalog.yaml` | Complete the 19 missing schemas + add 20 visual_diff entries |

### Unused (already unused in Phase S, candidates for deletion)

| File/function | Reason |
|-----------|------|
| `src/block_selector.py` | Removed in Phase R'. Replaced by block_reference.py in Stage 1.7 |
| `src/content_editor.py` | The separate text-editing Stage was removed in Phase S |
| `src/design_director.py` | The Step B prompt was removed. Only the preset selection logic moves to space_allocator |

---

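## 9. Phase T Scope vs. Phase ZZ Preview

### Phase T (current): font hierarchy + pipeline stabilization

- Full implementation of the 11-Stage pipeline (PipelineContext + run_stage pattern)
- Stage 0: MDX 4-layer parser
- Stage 1A/1B: Pydantic validation + contradiction detection + source cross-check
- Stage 1.5a: **font hierarchy finalization + dynamic ratio back-calculation** (the core of Phase T)
- Stage 1.7: reference block selection (keyword matching + fallback)
- Stage 1.5b: design budget back-calculation from the block schema
- catalog.yaml schema completed to 38/38 + 20 visual_diff entries
- Distributed validation (L1-L5) + Self-Refine retries
- **Acceptance criteria:** for any MDX input, the font hierarchy is preserved, there is no overflow, and the quality score is 60 or higher

### Phase ZZ (final transition): judgment system transition + workflow

- Kei Persona API → direct Opus + judgment criteria moved to the Gitea wiki (to be decided after a comparative evaluation)
- Differentiated preservation rates per chunk (verbatim / summary / core_80)
- Reverse negotiation loop from Stage 1.5a to 1A/1B (requests to readjust weights)
- Transition to a Gitea issue-based workflow
- Starlight `.astro` embedding
- Decision on whether to move to a responsive layout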