manual_wpf/notedetectproblem.txt
2025-07-30 15:18:59 +09:00

NOTE Detection Algorithm Context Report
Problem Summary
NOTE extraction from ExportExcel_note.cs has been integrated into the new modular architecture, but only some
NOTEs are detected and matched to their content boxes.
Current Status
✅ FIXED: NOTEs detected with no content - resolved by reverting to the original working cross-line intersection algorithm from ExportExcel_old.cs
🔄 ONGOING: Not detecting all NOTEs (missing notes 2 and 4 from a 4-note layout)
Architecture Overview
Key Files and Components
- Main Entry Point: Models/ExportExcel.cs:265-271 - calls note extraction in ExportAllDwgToExcelHeightSorted
- Core Algorithm: Models/DwgDataExtractor.cs:342-480 - ExtractNotesFromDrawing method
- Note Box Detection: Models/DwgDataExtractor.cs:513-569 - FindNoteBox method
- Excel Output: Models/ExcelDataWriter.cs:282-371 - WriteNoteEntities method
Current Algorithm Flow
1. Collection Phase: Gather all DBText, Polyline, and Line entities
2. NOTE Detection: Find DBText containing "NOTE" (case-insensitive)
3. Box Finding: For each NOTE, use cross-line intersection to find content box below
4. Content Extraction: Find text entities within detected boxes
5. Sorting & Grouping: Sort by coordinates (Y descending, X ascending) and group NOTE+content
6. Excel Output: Write to Excel with NOTE followed immediately by its content
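Steps 5 and 6 can be sketched as follows. This is a minimal illustration, not the code in DwgDataExtractor.cs: NoteEntry, NoteOrdering, SortForOutput, and Flatten are hypothetical names, and plain doubles stand in for the AutoCAD point types.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical record for illustration; the extractor's real types are not shown in this report.
public record NoteEntry(string Title, double X, double Y, List<string> Content);

public static class NoteOrdering
{
    // Step 5: top-to-bottom (Y descending), then left-to-right (X ascending).
    public static List<NoteEntry> SortForOutput(IEnumerable<NoteEntry> notes)
        => notes.OrderByDescending(n => n.Y).ThenBy(n => n.X).ToList();

    // Step 6: emit each NOTE title followed immediately by its own content lines.
    public static List<string> Flatten(IEnumerable<NoteEntry> sorted)
    {
        var rows = new List<string>();
        foreach (var note in sorted)
        {
            rows.Add(note.Title);
            rows.AddRange(note.Content);
        }
        return rows;
    }
}
```

Calling NoteOrdering.Flatten(NoteOrdering.SortForOutput(notes)) yields the row order that would be handed to the Excel writer.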
Current Working Algorithm (Reverted to the ExportExcel_old.cs Version)
FindNoteBox Method (DwgDataExtractor.cs:513-569)
// Draws horizontal search line below NOTE position
double searchY = notePos.Y - (noteHeight * 2);
var searchLineStart = new Point3d(notePos.X - noteHeight * 10, searchY, 0);
var searchLineEnd = new Point3d(notePos.X + noteHeight * 50, searchY, 0);
// 1. Check Polyline intersections
// 2. Check Line intersections and trace rectangles
// 3. Use usedBoxes HashSet to prevent duplicate assignment
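To reason about when the search line can miss a box, the geometry can be reduced to a standalone sketch. It assumes axis-aligned rectangular boxes and only tests whether the horizontal segment crosses a vertical edge; the real method works on AutoCAD Polyline and Line entities, and Box, SearchLine, Build, and CrossesVerticalEdge are hypothetical names.

```csharp
using System;

// Hypothetical axis-aligned box; stands in for the extents of a Polyline or traced rectangle.
public readonly record struct Box(double MinX, double MinY, double MaxX, double MaxY);

public static class SearchLine
{
    // Same offsets as the excerpt above: 2 heights down, 10 heights left, 50 heights right.
    public static (double StartX, double EndX, double Y) Build(double noteX, double noteY, double noteHeight)
        => (noteX - noteHeight * 10, noteX + noteHeight * 50, noteY - noteHeight * 2);

    // True if the horizontal segment crosses the left or right edge of the box.
    public static bool CrossesVerticalEdge((double StartX, double EndX, double Y) line, Box box)
    {
        bool withinY = line.Y >= box.MinY && line.Y <= box.MaxY;
        bool hitsLeft = box.MinX >= line.StartX && box.MinX <= line.EndX;
        bool hitsRight = box.MaxX >= line.StartX && box.MaxX <= line.EndX;
        return withinY && (hitsLeft || hitsRight);
    }
}
```

Under these assumptions, a box whose top edge sits more than noteHeight * 2 below the NOTE position is never hit, because the search line passes above it.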
IsValidNoteBox Validation (DwgDataExtractor.cs:1005-1032)
// Simple validation criteria:
// - Box must be below NOTE (box.maxPoint.Y < notePos.Y)
// - Size constraints: width and height must each be > noteHeight and < noteHeight * 100
// - Distance constraints: X distance < noteHeight * 50, Y distance < noteHeight * 10
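The criteria above might look like the following when written out. This is a sketch under assumptions: the report does not say which points the X and Y distances are measured between, so the sketch measures from the NOTE position to the box's left edge and top edge. Box, NoteBoxRules, and IsValid are hypothetical names.

```csharp
using System;

// Hypothetical axis-aligned box; stands in for the candidate box extents.
public readonly record struct Box(double MinX, double MinY, double MaxX, double MaxY);

public static class NoteBoxRules
{
    public static bool IsValid(Box box, double noteX, double noteY, double noteHeight)
    {
        double width = box.MaxX - box.MinX;
        double height = box.MaxY - box.MinY;

        // Box must lie entirely below the NOTE text.
        bool below = box.MaxY < noteY;

        // Both dimensions must fall between 1x and 100x the NOTE text height.
        bool sizeOk = width > noteHeight && width < noteHeight * 100
                   && height > noteHeight && height < noteHeight * 100;

        // Assumption: distances run from the NOTE position to the box's left and top edges.
        double dx = Math.Abs(box.MinX - noteX);
        double dy = noteY - box.MaxY;
        bool distanceOk = dx < noteHeight * 50 && dy < noteHeight * 10;

        return below && sizeOk && distanceOk;
    }
}
```

If the real method measures distance to the box center or another corner, wide boxes could pass or fail differently than they do here.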
Known Issues from Previous Sessions
Issue 1: 1/1/3/3 Duplicate Content (PREVIOUSLY FIXED)
Problem: Multiple NOTEs finding the same large spanning polyline
Root Cause: Box detection finding one large polyline spanning multiple note areas
Solution Applied: Used usedBoxes HashSet to prevent duplicate assignment
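The usedBoxes guard can be illustrated in isolation. In this sketch, boxes are identified by hypothetical integer ids and each NOTE brings its own candidate list, best first; the actual key type of the HashSet in DwgDataExtractor.cs is not stated in this report.

```csharp
using System;
using System.Collections.Generic;

public static class BoxAssignment
{
    // candidatesPerNote[i] holds the candidate box ids for NOTE i, best candidate first.
    // Returns the chosen box id per NOTE, or null when every candidate was already taken.
    public static List<int?> Assign(IReadOnlyList<IReadOnlyList<int>> candidatesPerNote)
    {
        var usedBoxes = new HashSet<int>();
        var result = new List<int?>();
        foreach (var candidates in candidatesPerNote)
        {
            int? chosen = null;
            foreach (int id in candidates)
            {
                // HashSet.Add returns false if the id is already present.
                if (usedBoxes.Add(id))
                {
                    chosen = id;
                    break;
                }
            }
            result.Add(chosen);
        }
        return result;
    }
}
```

One side effect of a guard like this: a NOTE whose only candidate was claimed by an earlier NOTE ends up with no box, which may be worth ruling out when investigating Issue 4.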
Issue 2: Reverse Note Ordering (PREVIOUSLY FIXED)
Problem: Notes written in reverse order
Solution Applied: Sort by Y descending (bigger Y = top), then X ascending
Issue 3: Wrong Note Grouping (PREVIOUSLY FIXED)
Problem: All NOTEs grouped first, then all content
Solution Applied: Group each NOTE immediately with its content
Issue 4: Missing NOTEs 2 and 4 (CURRENT ISSUE)
Problem: In a 4-note layout arranged as 1-2 (top row) and 3-4 (bottom row), only notes 1 and 3 are detected
Possible Causes:
- Search line positioning not intersecting with notes 2 and 4's content boxes
- Box validation criteria too restrictive for right-side notes
- Geometric relationship between NOTE position and content box differs for right-side notes
Debug Information Available
Last Known Debug Output (5 NOTEs detected but no content found)
[DEBUG] Note text found: 'NOTE' at (57.0572050838764,348.6990318186563,0)
[DEBUG] Note text found: 'NOTE' at (471.6194660633719,501.3393888589908,0)
[DEBUG] Note text found: 'NOTE' at (444.9503218738628,174.19527687737536,0)
[DEBUG] Note text found: 'NOTE' at (602.7327260134425,174.43523739278135,0)
[DEBUG] Note text found: 'NOTE' at (635.5065816693041,502.83938885945645,0)
Reference Image
- noteExample.png shows expected layout with numbered sections 1-7 in Korean text
- Shows box-structured layout where each NOTE should have corresponding content below
Key Coordinate Analysis
From debug logs, NOTEs at similar Y coordinates appear to be in pairs:
- Top Row (bigger Y = top): (471.62, 501.34) and (635.51, 502.84) - Y≈502
- Single: (57.06, 348.70) - Y≈349
- Bottom Row: (444.95, 174.20) and (602.73, 174.44) - Y≈174
Pattern suggests left-right pairing where right-side NOTEs might need different search strategies.
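The pairing can be checked mechanically by bucketing NOTE positions into rows. The sketch below is illustrative only: NoteRows and GroupByRow are hypothetical names, and the tolerance is an assumed parameter that does not appear in the current code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class NoteRows
{
    // Walks points from top to bottom (Y descending). A point joins the current row when
    // its Y is within `tolerance` of that row's first point; otherwise it starts a new row.
    // Each row is then ordered left to right.
    public static List<List<(double X, double Y)>> GroupByRow(
        IEnumerable<(double X, double Y)> points, double tolerance)
    {
        var rows = new List<List<(double X, double Y)>>();
        foreach (var p in points.OrderByDescending(q => q.Y))
        {
            var current = rows.LastOrDefault();
            if (current != null && Math.Abs(current[0].Y - p.Y) <= tolerance)
                current.Add(p);
            else
                rows.Add(new List<(double X, double Y)> { p });
        }
        foreach (var row in rows)
            row.Sort((a, b) => a.X.CompareTo(b.X));
        return rows;
    }
}
```

With the five logged positions and a tolerance of 5 drawing units, this produces three rows: the Y≈502 pair, the single NOTE at Y≈349, and the Y≈174 pair. A plain sort by Y descending would place (635.51, 502.84) ahead of (471.62, 501.34) because its Y is slightly larger, so some row tolerance may be needed wherever left-to-right order within a row matters.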
Investigation Areas for Next Session
Priority 1: Search Line Geometry
- Analyze why horizontal search lines from right-side NOTEs don't intersect content boxes
- Consider adjusting search line direction/positioning for right-side notes
- Debug actual intersection results for missing NOTEs
Priority 2: Box Validation Criteria
- Review IsValidNoteBox distance calculations for right-side NOTEs
- Consider if content boxes for right-side NOTEs have different geometric relationships
Priority 3: Coordinate Pattern Analysis
- Investigate why NOTEs at (602.73, 174.44) and (635.51, 502.84) aren't finding content
- Compare successful vs failed NOTE positions and their content box relationships
Quick Start Commands for Next Session
1. Run existing code to see current NOTE detection results
2. Add detailed debug logging to FindNoteBox for specific coordinates: (602.73, 174.44) and (635.51, 502.84)
3. Analyze intersection results and box validation for these specific NOTEs
4. Consider geometric adjustments for right-side NOTE detection
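For step 2, one way to limit the extra logging to the two problem NOTEs is a small coordinate filter. DebugTargets and IsTarget are hypothetical helpers, not part of the existing code, and the epsilon default is an assumption sized to the two-decimal rounding used in this report.

```csharp
using System;

public static class DebugTargets
{
    // Rounded positions of the two NOTEs that fail to find content.
    private static readonly (double X, double Y)[] Targets =
    {
        (602.73, 174.44),
        (635.51, 502.84)
    };

    // True when (x, y) is within epsilon of one of the target positions on both axes.
    public static bool IsTarget(double x, double y, double epsilon = 0.01)
    {
        foreach (var t in Targets)
        {
            if (Math.Abs(t.X - x) <= epsilon && Math.Abs(t.Y - y) <= epsilon)
                return true;
        }
        return false;
    }
}
```

Inside FindNoteBox, a guard such as if (DebugTargets.IsTarget(notePos.X, notePos.Y)) could wrap the detailed logging of the search line endpoints, each intersection result, and each IsValidNoteBox decision.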
Code State
- Current implementation in Models/DwgDataExtractor.cs uses proven cross-line intersection algorithm
- usedBoxes tracking prevents duplicate assignment
- NOTE+content grouping and Y-coordinate sorting working correctly
- Excel output formatting functional
The foundation is solid; focus should be on geometric refinements for complete NOTE detection coverage.