NOTE Detection Algorithm Context Report

Problem Summary

NOTE extraction from ExportExcel_note.cs has been integrated into the new modular architecture, but only some NOTEs are detected and matched to their content boxes.

Current Status

✅ FIXED: NOTEs were found without content. Resolved by reverting to the original, working cross-line intersection algorithm from ExportExcel_old.cs.

🔄 ONGOING: Not all NOTEs are detected (notes 2 and 4 are missing from a 4-note layout).

Architecture Overview

Key Files and Components

- Main Entry Point: Models/ExportExcel.cs:265-271 - calls note extraction in ExportAllDwgToExcelHeightSorted
- Core Algorithm: Models/DwgDataExtractor.cs:342-480 - ExtractNotesFromDrawing method
- Note Box Detection: Models/DwgDataExtractor.cs:513-569 - FindNoteBox method
- Excel Output: Models/ExcelDataWriter.cs:282-371 - WriteNoteEntities method

Current Algorithm Flow

1. Collection Phase: Gather all DBText, Polyline, and Line entities
2. NOTE Detection: Find DBText containing "NOTE" (case-insensitive)
3. Box Finding: For each NOTE, use cross-line intersection to find the content box below it
4. Content Extraction: Find text entities within the detected boxes
5. Sorting & Grouping: Sort by coordinates (Y descending, X ascending) and group each NOTE with its content
6. Excel Output: Write to Excel with each NOTE followed immediately by its content

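Steps 5 and 6 of the flow above can be sketched as follows. This is a minimal illustration only: NoteGroup and NoteOrdering are hypothetical names, not types from DwgDataExtractor.cs, and the real code works on AutoCAD entities rather than plain strings.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical container: a NOTE header, its position, and the text found in its box.
public sealed class NoteGroup
{
    public NoteGroup(string header, double x, double y, IReadOnlyList<string> content)
    {
        Header = header; X = x; Y = y; Content = content;
    }
    public string Header { get; }
    public double X { get; }
    public double Y { get; }
    public IReadOnlyList<string> Content { get; }
}

public static class NoteOrdering
{
    // Orders groups top-to-bottom (larger Y first), then left-to-right,
    // and flattens them so each header is followed by its own content.
    public static List<string> ToOutputRows(IEnumerable<NoteGroup> groups)
    {
        return groups
            .OrderByDescending(g => g.Y)
            .ThenBy(g => g.X)
            .SelectMany(g => new[] { g.Header }.Concat(g.Content))
            .ToList();
    }
}
```

Note that this sketch compares Y values exactly, so two NOTEs in the same visual row whose Y values differ slightly are ordered by Y rather than by X.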
Current Working Algorithm (Reverted from ExportExcel_old.cs)

FindNoteBox Method (DwgDataExtractor.cs:513-569)

// Builds a horizontal search line 2 x noteHeight below the NOTE position,
// spanning 10 x noteHeight to the left and 50 x noteHeight to the right
double searchY = notePos.Y - (noteHeight * 2);
var searchLineStart = new Point3d(notePos.X - noteHeight * 10, searchY, 0);
var searchLineEnd = new Point3d(notePos.X + noteHeight * 50, searchY, 0);

// 1. Check Polyline intersections
// 2. Check Line intersections and trace rectangles
// 3. Use usedBoxes HashSet to prevent duplicate assignment

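To reason about when the search line can miss a box, the geometry can be reproduced without AutoCAD types. The sketch below is an assumption-laden simplification: SearchLine, Build, and CrossesBox are hypothetical names, the box is treated as an axis-aligned rectangle, and a "hit" means the segment crosses one of the rectangle's vertical edges. Only the three offsets (2, 10, and 50 times noteHeight) come from the excerpt above.

```csharp
public static class SearchLine
{
    // Reproduces the search-line endpoints from the FindNoteBox excerpt.
    public static (double StartX, double EndX, double Y) Build(
        double noteX, double noteY, double noteHeight)
    {
        double searchY = noteY - noteHeight * 2;
        return (noteX - noteHeight * 10, noteX + noteHeight * 50, searchY);
    }

    // True when the horizontal segment crosses at least one vertical edge
    // of the axis-aligned box (simplified stand-in for a polyline test).
    public static bool CrossesBox(
        double startX, double endX, double y,
        double minX, double minY, double maxX, double maxY)
    {
        if (y < minY || y > maxY) return false;   // line is above or below the box
        bool crossesLeft = startX <= minX && minX <= endX;
        bool crossesRight = startX <= maxX && maxX <= endX;
        return crossesLeft || crossesRight;
    }
}
```

With a hypothetical NOTE at (100, 200) and noteHeight 2.5, the line sits at Y = 195 and spans X from 75 to 225. A box whose top edge is at Y = 190 is missed, while one whose top edge is at Y = 196 is hit.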
IsValidNoteBox Validation (DwgDataExtractor.cs:1005-1032)

// Simple validation criteria:
// - Box must be below NOTE (box.maxPoint.Y < notePos.Y)
// - Size constraints: width and height must each satisfy noteHeight < size < noteHeight * 100
// - Distance constraints: X distance < noteHeight * 50, Y distance < noteHeight * 10

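The criteria above can be written as a single predicate, which makes it easier to test individual NOTE/box pairs in isolation. This is a sketch, not the body of IsValidNoteBox: the method name is hypothetical, and the way the X and Y distances are measured (X to the nearest box edge, Y from the NOTE to the box top) is an assumption, since the report does not say how the original computes them.

```csharp
public static class NoteBoxValidation
{
    public static bool IsPlausibleNoteBox(
        double noteX, double noteY, double noteHeight,
        double minX, double minY, double maxX, double maxY)
    {
        // The box must lie entirely below the NOTE.
        if (maxY >= noteY) return false;

        // Width and height must each fall strictly between
        // noteHeight and noteHeight * 100.
        double width = maxX - minX;
        double height = maxY - minY;
        double lower = noteHeight;
        double upper = noteHeight * 100;
        if (width <= lower || width >= upper) return false;
        if (height <= lower || height >= upper) return false;

        // ASSUMPTION: X distance is measured to the nearest box edge
        // (0 when the NOTE is within the box's X span); Y distance is
        // measured from the NOTE down to the top of the box.
        double xDistance = noteX < minX ? minX - noteX
                         : noteX > maxX ? noteX - maxX
                         : 0;
        double yDistance = noteY - maxY;
        return xDistance < noteHeight * 50 && yDistance < noteHeight * 10;
    }
}
```

With the figures quoted in this report, the search line sits 2 x noteHeight below the NOTE while validation accepts boxes up to 10 x noteHeight below it. A box whose top edge lies between those two offsets would pass this check yet never be crossed, if the only probe is a horizontal line at that height.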
Known Issues from Previous Sessions

Issue 1: 1/1/3/3 Duplicate Content (PREVIOUSLY FIXED)

Problem: Multiple NOTEs matched the same large spanning polyline
Root Cause: Box detection found one large polyline spanning multiple note areas
Solution Applied: Track assigned boxes in a usedBoxes HashSet to prevent duplicate assignment

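The usedBoxes idea can be illustrated with plain integers standing in for box identities. This is a minimal sketch under that assumption. BoxAssignment and AssignBoxes are hypothetical names, and the real code presumably keys the set on entity identity rather than an int.

```csharp
using System.Collections.Generic;

public static class BoxAssignment
{
    // Assigns each NOTE the first candidate box that no earlier NOTE claimed.
    // candidates[i] lists box ids for NOTE i in preference order;
    // the result holds null where every candidate was already taken.
    public static int?[] AssignBoxes(IReadOnlyList<IReadOnlyList<int>> candidates)
    {
        var usedBoxes = new HashSet<int>();
        var result = new int?[candidates.Count];
        for (int i = 0; i < candidates.Count; i++)
        {
            foreach (int boxId in candidates[i])
            {
                // HashSet<T>.Add returns false if the id is already present.
                if (usedBoxes.Add(boxId))
                {
                    result[i] = boxId;
                    break;
                }
            }
        }
        return result;
    }
}
```

In this sketch a NOTE processed later receives nothing if its only candidate was claimed earlier, so the processing order affects which NOTEs end up without a box.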
Issue 2: Reverse Note Ordering (PREVIOUSLY FIXED)

Problem: Notes written in reverse order
Solution Applied: Sort by Y descending (bigger Y = top), then X ascending

Issue 3: Wrong Note Grouping (PREVIOUSLY FIXED)

Problem: All NOTEs grouped first, then all content
Solution Applied: Group each NOTE immediately with its content

Issue 4: Missing NOTEs 2 and 4 (CURRENT ISSUE)

Problem: In a 4-note layout arranged as 1-2 (top row) and 3-4 (bottom row), only notes 1 and 3 are detected
Possible Causes:
- Search line positioning does not intersect the content boxes of notes 2 and 4
- Box validation criteria are too restrictive for right-side notes
- The geometric relationship between NOTE position and content box differs for right-side notes

Debug Information Available

Last Known Debug Output (5 NOTEs detected but no content found; "Note 텍스트 발견" means "Note text found")

[DEBUG] Note 텍스트 발견: 'NOTE' at (57.0572050838764,348.6990318186563,0)
[DEBUG] Note 텍스트 발견: 'NOTE' at (471.6194660633719,501.3393888589908,0)
[DEBUG] Note 텍스트 발견: 'NOTE' at (444.9503218738628,174.19527687737536,0)
[DEBUG] Note 텍스트 발견: 'NOTE' at (602.7327260134425,174.43523739278135,0)
[DEBUG] Note 텍스트 발견: 'NOTE' at (635.5065816693041,502.83938885945645,0)

Reference Image

- noteExample.png shows the expected layout with numbered sections 1-7 in Korean text
- Shows a box-structured layout where each NOTE should have corresponding content below it

Key Coordinate Analysis

From the debug logs, NOTEs at similar Y coordinates appear to be in pairs (larger Y is higher on the drawing):
- Top Row: (471.62, 501.34) and (635.51, 502.84) - Y≈502
- Single: (57.06, 348.70) - Y≈349
- Bottom Row: (444.95, 174.20) and (602.73, 174.44) - Y≈174

The pattern suggests left-right pairing, where right-side NOTEs might need a different search strategy.

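The pairing can be checked mechanically by clustering NOTE positions on Y. The sketch below is illustrative only: NoteRows and GroupIntoRows are hypothetical names, and the tolerance is an assumed tuning value, not something taken from the existing code.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class NoteRows
{
    // Groups positions into rows: walk the points from largest Y to smallest
    // and start a new row whenever the drop from the previous Y exceeds
    // yTolerance. Each row is then sorted left to right.
    public static List<List<(double X, double Y)>> GroupIntoRows(
        IEnumerable<(double X, double Y)> positions, double yTolerance)
    {
        var rows = new List<List<(double X, double Y)>>();
        double previousY = 0;
        foreach (var point in positions.OrderByDescending(q => q.Y))
        {
            if (rows.Count == 0 || previousY - point.Y > yTolerance)
                rows.Add(new List<(double X, double Y)>());
            rows[rows.Count - 1].Add(point);
            previousY = point.Y;
        }
        foreach (var row in rows)
            row.Sort((a, b) => a.X.CompareTo(b.X));
        return rows;
    }
}
```

Fed the five rounded positions from the debug log with an assumed tolerance of 5 drawing units, this yields three rows: the Y≈502 pair, the single NOTE at Y≈349, and the Y≈174 pair, each ordered left to right.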
Investigation Areas for Next Session

Priority 1: Search Line Geometry

- Analyze why horizontal search lines from right-side NOTEs don't intersect content boxes
- Consider adjusting search line direction/positioning for right-side notes
- Debug actual intersection results for missing NOTEs

Priority 2: Box Validation Criteria

- Review IsValidNoteBox distance calculations for right-side NOTEs
- Consider whether content boxes for right-side NOTEs have different geometric relationships

Priority 3: Coordinate Pattern Analysis

- Investigate why NOTEs at (602.73, 174.44) and (635.51, 502.84) aren't finding content
- Compare successful vs failed NOTE positions and their content box relationships

Quick Start Commands for Next Session

1. Run the existing code to see current NOTE detection results
2. Add detailed debug logging to FindNoteBox for specific coordinates: (602.73, 174.44) and (635.51, 502.84)
3. Analyze intersection results and box validation for these specific NOTEs
4. Consider geometric adjustments for right-side NOTE detection

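For step 2, one way to limit logging to the two NOTEs of interest is a small coordinate filter plus a formatter for the search-line values. This is a sketch with hypothetical names (NoteDebug, IsTarget, Describe). The 0.5-unit tolerance is an assumption, and the search-line formula is copied from the FindNoteBox excerpt in this report.

```csharp
using System;
using System.Globalization;

public static class NoteDebug
{
    // The two NOTE positions singled out for investigation in this report.
    static readonly (double X, double Y)[] Targets =
    {
        (602.73, 174.44),
        (635.51, 502.84),
    };

    // True when (x, y) is within the tolerance of one of the target positions.
    public static bool IsTarget(double x, double y, double tolerance = 0.5)
    {
        foreach (var t in Targets)
        {
            if (Math.Abs(x - t.X) <= tolerance && Math.Abs(y - t.Y) <= tolerance)
                return true;
        }
        return false;
    }

    // Formats the search-line values that FindNoteBox would use for this NOTE.
    public static string Describe(double x, double y, double noteHeight)
    {
        double searchY = y - noteHeight * 2;
        double startX = x - noteHeight * 10;
        double endX = x + noteHeight * 50;
        return string.Format(
            CultureInfo.InvariantCulture,
            "NOTE ({0:F2}, {1:F2}): searchY={2:F2}, X range [{3:F2}, {4:F2}]",
            x, y, searchY, startX, endX);
    }
}
```

Inside FindNoteBox this could be used as a guard, for example logging Describe(...) only when IsTarget(notePos.X, notePos.Y) is true, which keeps the output focused on the failing NOTEs.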
Code State

- The current implementation in Models/DwgDataExtractor.cs uses the proven cross-line intersection algorithm
- usedBoxes tracking prevents duplicate assignment
- NOTE+content grouping and Y-coordinate sorting work correctly
- Excel output formatting is functional

The foundation is solid; the focus should be on geometric refinements to achieve complete NOTE detection coverage.