docs + V4 catalog + samples + Phase Q legacy preservation

26 files total (20 added + 6 modified), 10507 insertions.

Phase Z docs:
- docs/architecture/PHASE-Z-CHANGE-LOG.md (new): axis-by-axis decision history
  (newest-on-top). 6 entries recorded from Step 7-A onward, plus 2026-05-08 / 2026-05-08 #2
  (compat matrix dropped / 6-B dropped / F14 wording corrected / label gate policy split out).
- docs/architecture/PHASE-Z-PIPELINE-OVERVIEW.md (modified): Step 5/6/9 Gap note
  appended (structure unchanged, append-only). Records the 6-B removal + Refinement F.
- docs/architecture/PHASE-Z-PIPELINE-STATUS-BOARD.md (modified): snapshot date
  updated to 2026-05-08. §3 key missing item 5 (Step 5/6/9 boundary axis breakdown
  + removal record). §6 one-line update: next axis candidates A~F.

Project root docs:
- PLAN.md / PROGRESS.md / README.md (modified): reflect the token scheme, folder
  structure, design docs, and role separation.
- IMPROVEMENT-REDESIGN.md (new): core Phase Z design document.
- PROCESS_OVERVIEW.html (new): visual overview of the pipeline.
- docs/tasks/* (new): Phase Z task documents.

V4 catalog (required Phase Z runtime dependency):
- tests/matching/v4_full32_result.yaml (new, 4888 lines): V4 matching results, 32 frames
  × 10 MDX sections. lookup_v4_match() / lookup_v4_candidates() read this file.
  The Phase Z runtime *aborts immediately if it is missing*, so committing it
  guarantees the pipeline works right after clone.
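
The fail-fast behavior described above can be sketched roughly as follows. Only the catalog path comes from this commit message; `CatalogMissingError` and `require_catalog` are hypothetical names for illustration, and the real check inside lookup_v4_match() / lookup_v4_candidates() may look different.

```python
from pathlib import Path

# Path taken from the commit message; every other name here is hypothetical.
V4_CATALOG = Path("tests/matching/v4_full32_result.yaml")


class CatalogMissingError(RuntimeError):
    """Raised when the V4 catalog file cannot be found."""


def require_catalog(path: Path = V4_CATALOG) -> Path:
    """Return the catalog path, or fail early with a clear message."""
    if not path.is_file():
        raise CatalogMissingError(
            f"V4 catalog not found at {path}; Phase Z cannot run without it"
        )
    return path
```

Calling `require_catalog()` once at startup, before any matching work begins, keeps the failure close to its cause instead of surfacing later as a lookup error.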

Samples:
- samples/mdx_batch/04.mdx (new): default MDX04 sample.
- samples/mdx/04. DX 지연 요인.mdx (new): MDX04 original.

Phase Q legacy preservation (belongs to the separate axis "Phase Q audit & salvage"):
- src/block_matcher_tfidf.py / catalog_blocks.py / frame_extractor.py /
  pipeline_v2.py: Phase Q (old pipeline) src files, previously untracked and
  newly added here. Zero dependencies with the Phase Z runtime. To be reviewed
  in the Phase Q audit axis.
- scripts/eval_block_matcher.py / fetch_all_frame_screenshots.py /
  match_17_units_my_matcher.py / match_mdx_strict.py / match_mdx_to_frames_tfidf.py /
  ocr_augment_texts.py / run_pipeline_v2.py / previews/: old scripts used during
  Phase Q work. Preserved alongside.
- run_mdx03_pipeline.py (modified): wrapper serving both the Phase Q entry point
  (no flag) and the Phase Z entry point (--phase-z2 flag). To use Phase Z only,
  call `python -m src.phase_z2_pipeline samples/mdx_batch/03.mdx <run_id>` directly.
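
A dual entry-point wrapper of the kind described for run_mdx03_pipeline.py might be structured like this. This is a minimal sketch, not the actual file: only the `--phase-z2` flag and the positional MDX path / run id come from the message above, while `build_parser`, `select_entry`, and the returned labels are hypothetical.

```python
import argparse
from typing import Sequence


def build_parser() -> argparse.ArgumentParser:
    """CLI with one switch that picks the pipeline generation."""
    parser = argparse.ArgumentParser(description="Phase Q / Phase Z wrapper (sketch)")
    parser.add_argument("mdx_path", help="MDX input, e.g. samples/mdx_batch/03.mdx")
    parser.add_argument("run_id", nargs="?", default=None, help="optional run identifier")
    parser.add_argument(
        "--phase-z2",
        action="store_true",
        help="use the Phase Z entry point instead of Phase Q",
    )
    return parser


def select_entry(argv: Sequence[str]) -> str:
    """Return which pipeline would run for the given arguments."""
    args = build_parser().parse_args(argv)
    # argparse stores --phase-z2 under the attribute name phase_z2.
    return "phase_z" if args.phase_z2 else "phase_q"
```

In a real wrapper the two labels would presumably be replaced by calls into the respective pipeline modules; returning a string here just keeps the sketch easy to check in isolation.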

Out of scope:
- tests/matching/ (~63MB apart from v4_full32_result.yaml): V4 evolution history /
  reports / DECK / ATTACH. To be reviewed in the Phase Q audit axis.
- tests/pipeline/ (~15MB): pipeline data. Phase Q audit territory.
- templates/catalog/blocks.yaml: old block catalog. Phase Q audit.
- templates/phase_z2/frames/: old frame partial location. Phase Q audit.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-08 09:47:58 +09:00
parent ec83405770
commit 85c680f02a
26 changed files with 10507 additions and 46 deletions

@@ -0,0 +1,180 @@
"""Collect screenshots of every Figma frame into a single numbered folder.

Output:
    data/figma_previews/01.png, 02.png, ..., 32.png
    data/figma_previews/index.json ({number: {frame_id, node_id, title_text, png}})
"""
from __future__ import annotations

import base64
import json
import sys
from pathlib import Path
from urllib import error, request

# Make the repository root importable so `src` resolves when run as a script.
sys.path.insert(0, str(Path(__file__).parent.parent))
from src.frame_extractor import extract_all_frames

MCP_URL = "http://127.0.0.1:3845/mcp"
OUT_DIR = Path("data/figma_previews")

# frame_id → node_id (32 entries, extracted from metadata)
FRAME_NODE_MAP = {
    "1171281172": "145:8352",
    "1171281173": "182:2870",
    "1171281174": "182:2810",
    "1171281175": "182:2829",
    "1171281176": "182:3046",
    "1171281177": "182:3053",
    "1171281178": "145:8394",
    "1171281179": "182:3024",
    "1171281180": "112:87",
    "1171281181": "182:2572",
    "1171281182": "182:2523",
    "1171281189": "100:65",
    "1171281190": "51:99",
    "1171281191": "100:132",
    "1171281192": "182:2602",
    "1171281193": "106:205",
    "1171281194": "112:7",
    "1171281195": "106:252",
    "1171281197": "182:2727",
    "1171281198": "182:2766",
    "1171281201": "145:8310",
    "1171281202": "112:49",
    "1171281203": "145:8266",
    "1171281204": "145:8223",
    "1171281205": "182:2668",
    "1171281206": "182:2643",
    "1171281208": "145:8504",
    "1171281209": "145:8523",
    "1171281210": "181:2519",
    "1171281211": "181:2520",
    "1171281212": "181:2521",
    "1171281213": "181:2522",
}


def parse_sse(body: str) -> dict:
    """Return the JSON payload of the first `data: ` line in an SSE body."""
    for line in body.splitlines():
        if line.startswith("data: "):
            return json.loads(line[6:])
    raise RuntimeError(f"No data line in response: {body[:200]}")


def post(payload: dict, session_id: str | None = None) -> tuple[dict, str | None]:
    """POST a JSON-RPC payload; return (parsed response, session id header)."""
    data = json.dumps(payload).encode()
    req = request.Request(
        MCP_URL,
        data=data,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Accept": "application/json, text/event-stream",
        },
    )
    if session_id:
        req.add_header("mcp-session-id", session_id)
    with request.urlopen(req, timeout=60) as resp:
        body = resp.read().decode()
        sid = resp.headers.get("mcp-session-id")
    return (parse_sse(body) if body.strip() else {}, sid)


def initialize() -> str:
    """Open an MCP session and return its session id ("" if none was issued)."""
    payload = {
        "jsonrpc": "2.0",
        "id": 1,
        "method": "initialize",
        "params": {
            "protocolVersion": "2024-11-05",
            "capabilities": {},
            "clientInfo": {"name": "frame-dumper", "version": "1.0"},
        },
    }
    _, sid = post(payload)

    # Send the follow-up notification; it carries no id and expects no result.
    notify = {"jsonrpc": "2.0", "method": "notifications/initialized", "params": {}}
    data = json.dumps(notify).encode()
    req = request.Request(
        MCP_URL,
        data=data,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Accept": "application/json, text/event-stream",
            "mcp-session-id": sid or "",
        },
    )
    try:
        request.urlopen(req, timeout=10).read()
    except error.HTTPError:
        # An HTTP error on the notification is deliberately ignored.
        pass
    return sid or ""


def get_screenshot(session_id: str, node_id: str, call_id: int) -> bytes:
    """Call the `get_screenshot` tool and return the decoded image bytes."""
    payload = {
        "jsonrpc": "2.0",
        "id": call_id,
        "method": "tools/call",
        "params": {"name": "get_screenshot", "arguments": {"nodeId": node_id}},
    }
    resp, _ = post(payload, session_id=session_id)
    if "error" in resp:
        raise RuntimeError(f"MCP error for {node_id}: {resp['error']}")
    for item in resp.get("result", {}).get("content", []):
        if item.get("type") == "image":
            return base64.b64decode(item["data"])
    raise RuntimeError(f"No image in response for {node_id}")


def main() -> int:
    # frame_id → title_text map
    frames = extract_all_frames("figma_to_html_agent/blocks")
    title_map = {
        f["frame_id"]: (f.get("title_text") or "").replace("\n", " ")[:80]
        for f in frames
    }

    # Number the sorted frame_id list starting from 1
    frame_ids = sorted(FRAME_NODE_MAP.keys())
    OUT_DIR.mkdir(parents=True, exist_ok=True)

    print("[init] MCP session...")
    sid = initialize()
    print(f"[init] session-id={sid}")

    index: dict[str, dict] = {}
    for i, fid in enumerate(frame_ids, start=1):
        node_id = FRAME_NODE_MAP[fid]
        num = f"{i:02d}"
        out_path = OUT_DIR / f"{num}.png"
        title = title_map.get(fid, "")
        if out_path.exists():
            print(f"[{num}] {fid} (node {node_id}) - already exists, skip")
        else:
            print(f"[{num}] {fid} (node {node_id}) fetching...")
            try:
                png = get_screenshot(sid, node_id, 100 + i)
                out_path.write_bytes(png)
                print(f"  saved {len(png)} bytes → {out_path}")
            except Exception as e:
                # A failed frame is reported and left out of the index.
                print(f"  FAILED: {e}")
                continue
        index[num] = {
            "frame_id": fid,
            "node_id": node_id,
            "title_text": title,
            "png": f"{num}.png",
        }

    (OUT_DIR / "index.json").write_text(
        json.dumps(index, ensure_ascii=False, indent=2), encoding="utf-8"
    )
    print(f"\n[done] {len(index)} saved, index: {OUT_DIR / 'index.json'}")
    return 0


if __name__ == "__main__":
    raise SystemExit(main())
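
The response handling in this script can be exercised without a running MCP server. The sketch below repeats `parse_sse` and pulls the image-extraction loop of `get_screenshot` into a hypothetical helper, `extract_image`, then feeds both a hand-built SSE body. The message shape is an assumption inferred from how the script reads the response, not a captured server reply.

```python
import base64
import json


def parse_sse(body: str) -> dict:
    """Return the JSON payload of the first `data: ` line in an SSE body."""
    for line in body.splitlines():
        if line.startswith("data: "):
            return json.loads(line[6:])
    raise RuntimeError(f"No data line in response: {body[:200]}")


def extract_image(resp: dict) -> bytes:
    """Hypothetical helper mirroring the loop inside get_screenshot()."""
    for item in resp.get("result", {}).get("content", []):
        if item.get("type") == "image":
            return base64.b64decode(item["data"])
    raise RuntimeError("No image in response")


# Hand-built stand-in for a server reply; the bytes are not a real PNG.
fake_png = b"\x89PNG fake bytes"
message = {
    "jsonrpc": "2.0",
    "id": 101,
    "result": {
        "content": [
            {"type": "image", "data": base64.b64encode(fake_png).decode()}
        ]
    },
}
body = "event: message\ndata: " + json.dumps(message) + "\n\n"

decoded = extract_image(parse_sse(body))
```

Separating the decoding step from the network call in this way would let the parsing logic be checked in isolation; whether the script itself should be split like that is a design choice left open here.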