CLAUDE.md
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
Project Overview
This is a C# WPF application that extracts data from DWG (AutoCAD) files and processes them using AI analysis. The application has two main components:
- C# WPF Application (DwgExtractorManual) - Main GUI application for DWG processing
- Python Analysis Module (fletimageanalysis) - AI-powered document analysis using Gemini API
Build and Development Commands
C# Application
# Build the application
dotnet build
# Run the application
dotnet run
# Clean build artifacts
dotnet clean
# Publish for deployment
dotnet publish -c Release
Python Module Setup
# Run the cleanup and setup script (Windows)
cleanup_and_setup.bat
# Or manually setup Python environment
cd fletimageanalysis
python -m venv venv
call venv\Scripts\activate.bat
pip install -r requirements.txt
Python CLI Usage
# Batch process files via CLI
cd fletimageanalysis
python batch_cli.py --files "file1.pdf,file2.dxf" --schema "한국도로공사" --concurrent 3 --output "results.csv"
Architecture
C# Component Structure
- MainWindow.xaml.cs - Main WPF window and UI logic
- Models/DwgDataExtractor.cs - Core DWG file processing using Teigha SDK
- Models/ExcelDataWriter.cs - Excel output generation using Office Interop
- Models/TeighaServicesManager.cs - Singleton manager for Teigha SDK lifecycle
- Models/FieldMapper.cs - Maps extracted data to target formats
- Models/SettingsManager.cs - Application configuration management
Python Component Structure
- batch_cli.py - Command-line interface for batch processing
- multi_file_processor.py - Orchestrates multi-file processing workflows
- gemini_analyzer.py - AI analysis using Google Gemini API
- pdf_processor.py - PDF document processing
- dxf_processor.py - DXF file processing
- csv_exporter.py - CSV output generation
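To make the relationship between the CLI flags and the processing modules concrete, here is a minimal sketch of how a comma-separated `--files` value could be parsed and routed. It is not the actual contents of batch_cli.py. The flag names come from the invocation under Python CLI Usage; the helper names (`build_parser`, `split_files`, `pick_processor`, `plan`), the defaults, and routing by file extension are assumptions made for illustration.

```python
import argparse
from pathlib import Path
from typing import List, Tuple


def build_parser() -> argparse.ArgumentParser:
    # Flags mirror the documented batch_cli.py invocation; defaults are assumed.
    parser = argparse.ArgumentParser(description="Batch document analysis (sketch)")
    parser.add_argument("--files", required=True, help="Comma-separated input paths")
    parser.add_argument("--schema", required=True, help="Extraction schema name")
    parser.add_argument("--concurrent", type=int, default=1, help="Max parallel jobs")
    parser.add_argument("--output", default="results.csv", help="CSV output path")
    return parser


def split_files(raw: str) -> List[str]:
    # Split on commas and drop empty entries and surrounding whitespace.
    return [part.strip() for part in raw.split(",") if part.strip()]


def pick_processor(path: str) -> str:
    # Hypothetical routing: choose a module name from the file extension.
    routes = {".pdf": "pdf_processor", ".dxf": "dxf_processor"}
    suffix = Path(path).suffix.lower()
    if suffix not in routes:
        raise ValueError(f"Unsupported file type: {path}")
    return routes[suffix]


def plan(argv: List[str]) -> List[Tuple[str, str]]:
    # Pair each input file with the module assumed to handle it.
    args = build_parser().parse_args(argv)
    return [(path, pick_processor(path)) for path in split_files(args.files)]
```

Calling `plan(["--files", "file1.pdf,file2.dxf", "--schema", "한국도로공사"])` pairs each input with a processor name. Check the real batch_cli.py before relying on any default shown here.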
Key Dependencies
- Teigha SDK - DWG file reading and CAD entity processing (requires DLL files in specific path)
- Microsoft Office Interop - Excel file generation
- Npgsql - PostgreSQL database connectivity
- Google Gemini API - AI-powered document analysis
- PyMuPDF - PDF processing in Python component
Current Development Focus
The project is undergoing a Note Detection Refactor (see NoteDetectionRefactor.md):
- Replacing fragile “horizontal search line” algorithm in DwgDataExtractor.cs
- Implementing robust “vertical ray-casting” approach for NOTE content box detection
- Key methods being refactored: FindNoteBox, GetAllLineSegments, TraceBoxFromTopLine
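The idea behind vertical ray-casting can be illustrated with a short Python sketch. This is not the implementation in DwgDataExtractor.cs, and NoteDetectionRefactor.md remains the authoritative description. The sketch assumes that segments are plain `(x1, y1, x2, y2)` tuples, that y increases upward as in typical drawing coordinates, and that the goal is to find the first horizontal line directly below a NOTE label. The function name `cast_ray_down` and the tolerance value are hypothetical.

```python
from typing import List, Optional, Tuple

# A line segment as (x1, y1, x2, y2); an assumed, simplified representation.
Segment = Tuple[float, float, float, float]


def cast_ray_down(
    x: float, y: float, segments: List[Segment], tol: float = 1e-6
) -> Optional[Segment]:
    """Return the nearest horizontal segment strictly below (x, y) that a
    vertical ray cast downward from that point would cross, or None."""
    best: Optional[Segment] = None
    best_y = float("-inf")
    for seg in segments:
        x1, y1, x2, y2 = seg
        if abs(y1 - y2) > tol:
            continue  # skip segments that are not horizontal
        if not (min(x1, x2) - tol <= x <= max(x1, x2) + tol):
            continue  # the ray at this x does not cross the segment's span
        if y1 < y and y1 > best_y:
            best, best_y = seg, y1  # closest hit below the start point so far
    return best
```

A hit from this ray would be a candidate top edge of the NOTE content box. Tracing the remaining three edges, which the source associates with TraceBoxFromTopLine, is left out of the sketch.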
Important Notes
- Teigha DLLs must be present in the specified path for DWG processing to work
- The Python module requires Google Gemini API key configuration
- Excel output uses COM Interop and requires Microsoft Office installation
- The application supports both manual GUI operation and automated batch processing via CLI
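Because the Python module cannot run without a Gemini API key, a fail-fast check at startup can save debugging time. The sketch below assumes the key is supplied through an environment variable named `GEMINI_API_KEY`. That variable name and the function `load_gemini_api_key` are assumptions for illustration; the repository may load the key differently, for example from a settings file.

```python
import os
from typing import Mapping, Optional


def load_gemini_api_key(
    env: Optional[Mapping[str, str]] = None, var_name: str = "GEMINI_API_KEY"
) -> str:
    # var_name is an assumed convention, not confirmed by this repository.
    source = os.environ if env is None else env
    key = source.get(var_name, "").strip()
    if not key:
        raise RuntimeError(
            f"{var_name} is not set; configure a Gemini API key before running analysis."
        )
    return key
```

Passing a mapping explicitly, as in `load_gemini_api_key({"GEMINI_API_KEY": "..."})`, keeps the check testable without touching the real environment.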