# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Project Overview

This is a C# WPF application that extracts data from DWG (AutoCAD) files and processes them using AI analysis. The application has two main components:

1. **C# WPF Application** (`DwgExtractorManual`) - Main GUI application for DWG processing
2. **Python Analysis Module** (`fletimageanalysis`) - AI-powered document analysis using Gemini API

## Build and Development Commands

### C# Application

```bash
# Build the application
dotnet build

# Run the application
dotnet run

# Clean build artifacts
dotnet clean

# Publish for deployment
dotnet publish -c Release
```

### Python Module Setup

```bash
# Run the cleanup and setup script (Windows)
cleanup_and_setup.bat

# Or manually setup Python environment
cd fletimageanalysis
python -m venv venv
call venv\Scripts\activate.bat
pip install -r requirements.txt
```

### Python CLI Usage

```bash
# Batch process files via CLI
cd fletimageanalysis
python batch_cli.py --files "file1.pdf,file2.dxf" --schema "한국도로공사" --concurrent 3 --output "results.csv"
```

## Architecture

### C# Component Structure

- **MainWindow.xaml.cs** - Main WPF window and UI logic
- **Models/DwgDataExtractor.cs** - Core DWG file processing using Teigha SDK
- **Models/ExcelDataWriter.cs** - Excel output generation using Office Interop
- **Models/TeighaServicesManager.cs** - Singleton manager for Teigha SDK lifecycle
- **Models/FieldMapper.cs** - Maps extracted data to target formats
- **Models/SettingsManager.cs** - Application configuration management

### Python Component Structure

- **batch_cli.py** - Command-line interface for batch processing
- **multi_file_processor.py** - Orchestrates multi-file processing workflows
- **gemini_analyzer.py** - AI analysis using Google Gemini API
- **pdf_processor.py** - PDF document processing
- **dxf_processor.py** - DXF file processing
- **csv_exporter.py** - CSV output generation

### Key Dependencies

- **Teigha SDK** - DWG file reading and CAD entity processing (requires DLL files in specific path)
- **Microsoft Office Interop** - Excel file generation
- **Npgsql** - PostgreSQL database connectivity
- **Google Gemini API** - AI-powered document analysis
- **PyMuPDF** - PDF processing in Python component

## Current Development Focus

The project is undergoing a **Note Detection Refactor** (see `NoteDetectionRefactor.md`):

- Replacing fragile "horizontal search line" algorithm in `DwgDataExtractor.cs`
- Implementing robust "vertical ray-casting" approach for NOTE content box detection
- Key methods being refactored: `FindNoteBox`, `GetAllLineSegments`, `TraceBoxFromTopLine`

## Important Notes

- Teigha DLLs must be present in the specified path for DWG processing to work
- The Python module requires Google Gemini API key configuration
- Excel output uses COM Interop and requires Microsoft Office installation
- The application supports both manual GUI operation and automated batch processing via CLI
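
### Illustration: vertical ray-casting

The "vertical ray-casting" idea named under Current Development Focus can be illustrated with a small sketch. This is not the code in `DwgDataExtractor.cs` and does not reproduce the design in `NoteDetectionRefactor.md`, which may differ. It only shows the general technique, assuming line segments are available as coordinate pairs and that the Y axis grows upward. The function name `cast_ray_down`, the `Segment` type, and the tolerance parameter are all hypothetical.

```python
from typing import List, Optional, Tuple

# Hypothetical representation: a segment is ((x1, y1), (x2, y2)), Y growing upward.
Segment = Tuple[Tuple[float, float], Tuple[float, float]]


def cast_ray_down(x: float, y: float, segments: List[Segment],
                  tol: float = 1e-6) -> Optional[Segment]:
    """Return the nearest horizontal segment strictly below (x, y) that a
    vertical ray cast straight down from that point would hit, or None."""
    best: Optional[Segment] = None
    best_y = float("-inf")
    for seg in segments:
        (x1, y1), (x2, y2) = seg
        if abs(y1 - y2) > tol:
            continue  # not horizontal, so it cannot be a top or bottom edge
        if not (min(x1, x2) - tol <= x <= max(x1, x2) + tol):
            continue  # the ray passes beside this segment
        if y1 < y and y1 > best_y:
            best, best_y = seg, y1  # below the origin and closer than earlier hits
    return best


if __name__ == "__main__":
    demo = [
        ((0.0, 10.0), (20.0, 10.0)),  # candidate top edge of a box
        ((0.0, 0.0), (20.0, 0.0)),    # candidate bottom edge
        ((5.0, 5.0), (5.0, 9.0)),     # vertical segment, ignored
    ]
    # Ray from a point above the box, e.g. where a NOTE label might sit
    print(cast_ray_down(10.0, 15.0, demo))
```

Casting a second ray from just below the returned segment would find the next horizontal edge, which is one way a box could be traced from its top line. Whether the refactor proceeds this way is an assumption.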