X' core fix: take text directly from MDX sections + normalizer ### support

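Key changes:
- mdx_normalizer: split ### (h3) minor headings into sections as well (previously only ##)
- _assemble_type_b: take text directly from normalized.sections instead of Kei structured_text
- Reflect the major/minor heading hierarchy as-is

Result:
- Slide title: the original MDX frontmatter, unchanged
- Major heading: "DX 기반 Process 혁신에 따른 주체별 기대효과" (Expected benefits by stakeholder from DX-based process innovation)
- Minor heading, left: "업무 수행 과정(Process)의 변화" (Changes in the work process)
- Minor heading, right: "DX 시행 주체별 기대효과" (Expected benefits by DX implementing party) + popup link + Kei summary table
- Caption: normalized.images alt text

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>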
2026-04-06 12:22:09 +09:00
parent 6b17f448eb
commit 3719704d75
2 changed files with 80 additions and 75 deletions
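
For readers who want to see the sectioning behaviour in isolation, here is a minimal sketch of the same flush-on-heading idea. It is not the code from `src/mdx_normalizer.py`: it is a simplified, line-based stand-in that uses a regular expression instead of the parser tokens in the diff, and the names `split_sections` and `HEADING_RE` are hypothetical. It also does not skip `##` lines inside fenced code blocks, which a token-based parser would handle.

```python
import re
from typing import Any

# Matches "## Title" or "### Title" only (hypothetical helper, not from the repo).
HEADING_RE = re.compile(r"^(#{2,3})\s+(.*\S)\s*$")


def split_sections(text: str) -> list[dict[str, Any]]:
    """Split text into sections at level-2 and level-3 headings."""
    sections: list[dict[str, Any]] = []
    title = ""
    level = 2
    lines: list[str] = []

    def flush() -> None:
        # Emit the section collected so far, then reset the buffers.
        nonlocal title, level, lines
        if title:
            sections.append({
                "level": level,
                "title": title,
                "content": "\n".join(lines).strip(),
            })
        lines = []
        level = 2

    for line in text.splitlines():
        match = HEADING_RE.match(line)
        if match:
            flush()
            level = len(match.group(1))  # 2 for "##", 3 for "###"
            title = match.group(2)
        elif title:
            # Text before the first ##/### heading is ignored.
            lines.append(line)
    flush()
    return sections
```

With this shape, a `##` section holds only the text up to its first `###` child, and each `###` child becomes its own entry with `level` set to 3, so a consumer can rebuild the hierarchy from the order and the level values.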
--- a/src/mdx_normalizer.py
+++ b/src/mdx_normalizer.py
@@ -229,15 +229,18 @@ def _extract_structure(text: str) -> dict[str, Any]:
    current_section_title = ""
    current_section_lines = []

+    current_section_level = 2
+
    def _flush_section():
-        nonlocal current_section_title, current_section_lines
+        nonlocal current_section_title, current_section_lines, current_section_level
        if current_section_title:
            sections.append({
-                "level": 2,
+                "level": current_section_level,
                "title": current_section_title,
                "content": "\n".join(current_section_lines).strip(),
            })
            current_section_lines = []
+            current_section_level = 2

    for i, token in enumerate(tokens):
        # Image extraction (inline children)
@@ -283,12 +286,13 @@ def _extract_structure(text: str) -> dict[str, Any]:
            if table["headers"] or table["rows"]:
                tables.append(table)

-        # Section extraction (## only)
-        if token.type == "heading_open" and token.tag == "h2":
+        # Section extraction (## and ###: both major and minor headings)
+        if token.type == "heading_open" and token.tag in ("h2", "h3"):
            _flush_section()
            # The next token is inline (the heading text)
            if i + 1 < len(tokens) and tokens[i + 1].type == "inline":
                current_section_title = tokens[i + 1].content
+                current_section_level = 2 if token.tag == "h2" else 3
        elif current_section_title and token.type in ("paragraph_open", "bullet_list_open",
                                                       "ordered_list_open", "fence"):
            # Collect section content: only the content of inline tokens