# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Project Overview

This is a C# WPF application that extracts data from DWG (AutoCAD) files and processes them using AI analysis. The application has two main components:

1. **C# WPF Application** (`DwgExtractorManual`) - Main GUI application for DWG processing
2. **Python Analysis Module** (`fletimageanalysis`) - AI-powered document analysis using Gemini API

## Build and Development Commands

### C# Application
```bash
# Build the application
dotnet build

# Run the application
dotnet run

# Clean build artifacts
dotnet clean

# Publish for deployment
dotnet publish -c Release
```

### Python Module Setup
```bat
REM Run the cleanup and setup script (Windows)
cleanup_and_setup.bat

REM Or manually set up the Python environment
cd fletimageanalysis
python -m venv venv
call venv\Scripts\activate.bat
pip install -r requirements.txt
```

### Python CLI Usage
```bash
# Batch process files via CLI
cd fletimageanalysis
python batch_cli.py --files "file1.pdf,file2.dxf" --schema "한국도로공사" --concurrent 3 --output "results.csv"
```

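
When the CLI has to be launched from another script, building the argument list programmatically avoids shell quoting problems with the non-ASCII schema name. The following is a minimal sketch, not part of the repository: `build_batch_command` is a hypothetical helper, and it only assumes the four flags shown in the command above.

```python
import sys
from typing import List, Sequence


def build_batch_command(
    files: Sequence[str],
    schema: str,
    concurrent: int = 3,
    output: str = "results.csv",
) -> List[str]:
    """Assemble an argument list for batch_cli.py (hypothetical helper).

    Only the flags documented above are used: --files, --schema,
    --concurrent and --output.
    """
    if not files:
        raise ValueError("at least one input file is required")
    if concurrent < 1:
        raise ValueError("concurrent must be a positive integer")
    return [
        sys.executable,              # the interpreter running this script
        "batch_cli.py",
        "--files", ",".join(files),  # the CLI takes a comma-separated list
        "--schema", schema,
        "--concurrent", str(concurrent),
        "--output", output,
    ]
```

The resulting list can be passed to `subprocess.run(cmd, cwd="fletimageanalysis", check=True)`. Because the files are joined with commas, this sketch would not work for file names that themselves contain a comma.
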
## Architecture

### C# Component Structure
- **MainWindow.xaml.cs** - Main WPF window and UI logic
- **Models/DwgDataExtractor.cs** - Core DWG file processing using Teigha SDK
- **Models/ExcelDataWriter.cs** - Excel output generation using Office Interop
- **Models/TeighaServicesManager.cs** - Singleton manager for Teigha SDK lifecycle
- **Models/FieldMapper.cs** - Maps extracted data to target formats
- **Models/SettingsManager.cs** - Application configuration management

### Python Component Structure
- **batch_cli.py** - Command-line interface for batch processing
- **multi_file_processor.py** - Orchestrates multi-file processing workflows
- **gemini_analyzer.py** - AI analysis using Google Gemini API
- **pdf_processor.py** - PDF document processing
- **dxf_processor.py** - DXF file processing
- **csv_exporter.py** - CSV output generation

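
The module list above suggests that input files are routed to a processor by file type. The sketch below illustrates one way such routing could look. It is an assumption for orientation only: `HANDLERS` and `group_by_handler` are hypothetical names, and the actual dispatch logic in `multi_file_processor.py` may differ.

```python
from pathlib import Path
from typing import Dict, Iterable, List

# Hypothetical mapping from file extension to the module named in the list above.
HANDLERS: Dict[str, str] = {
    ".pdf": "pdf_processor",
    ".dxf": "dxf_processor",
}


def group_by_handler(paths: Iterable[str]) -> Dict[str, List[str]]:
    """Group input paths by the module that would process them."""
    groups: Dict[str, List[str]] = {}
    for raw in paths:
        suffix = Path(raw).suffix.lower()  # compare extensions case-insensitively
        handler = HANDLERS.get(suffix)
        if handler is None:
            raise ValueError(f"unsupported file type: {raw}")
        groups.setdefault(handler, []).append(raw)
    return groups
```

Failing early on an unsupported extension keeps a bad input from surfacing later as an analysis error.
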
### Key Dependencies
- **Teigha SDK** - DWG file reading and CAD entity processing (requires DLL files in a specific path)
- **Microsoft Office Interop** - Excel file generation
- **Npgsql** - PostgreSQL database connectivity
- **Google Gemini API** - AI-powered document analysis
- **PyMuPDF** - PDF processing in the Python component

## Current Development Focus

The project is undergoing a **Note Detection Refactor** (see `NoteDetectionRefactor.md`):
- Replacing the fragile "horizontal search line" algorithm in `DwgDataExtractor.cs`
- Implementing a robust "vertical ray-casting" approach for NOTE content box detection
- Key methods being refactored: `FindNoteBox`, `GetAllLineSegments`, `TraceBoxFromTopLine`

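
To make the "vertical ray-casting" idea concrete, here is a small Python sketch of the general technique. It is not the algorithm from `NoteDetectionRefactor.md`, which is not reproduced here, and it does not mirror `FindNoteBox` or the other C# methods. It assumes that line segments are given as `(x1, y1, x2, y2)` tuples, that y increases upward, and that the box lies below the NOTE label; `cast_ray_down` is a hypothetical name.

```python
from typing import Iterable, Optional, Tuple

Segment = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def cast_ray_down(
    origin: Tuple[float, float],
    segments: Iterable[Segment],
    tol: float = 1e-6,
) -> Optional[Segment]:
    """Return the nearest horizontal segment hit by a ray cast straight down.

    Assumes y grows upward, so "down" means smaller y values.
    Returns None when no horizontal segment lies below the origin.
    """
    ox, oy = origin
    best: Optional[Segment] = None
    best_y = float("-inf")
    for seg in segments:
        x1, y1, x2, y2 = seg
        if abs(y1 - y2) > tol:
            continue  # skip segments that are not horizontal
        if not (min(x1, x2) - tol <= ox <= max(x1, x2) + tol):
            continue  # the vertical ray passes beside this segment
        if y1 >= oy:
            continue  # the segment is at or above the ray origin
        if y1 > best_y:
            best, best_y = seg, y1  # keep the hit closest to the origin
    return best
```

In this sketch the first hit below the label would be a candidate for the top edge of the content box. Casting from the label position, rather than scanning along a horizontal search line, means the result depends only on the segments that the ray actually crosses.
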
## Important Notes

- Teigha DLLs must be present in the specified path for DWG processing to work
- The Python module requires a Google Gemini API key to be configured
- Excel output uses COM Interop and requires a Microsoft Office installation
- The application supports both manual GUI operation and automated batch processing via CLI
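
The first two requirements above can be verified before a long batch run. The following pre-flight check is a sketch under stated assumptions: the environment variable name `GEMINI_API_KEY` is hypothetical (the project may configure the key differently), the DLL directory must be supplied by the caller because its location is not stated here, and `check_prerequisites` is not an existing function in the repository.

```python
import os
from pathlib import Path
from typing import List, Mapping, Optional


def check_prerequisites(
    dll_dir: str,
    api_key_var: str = "GEMINI_API_KEY",  # hypothetical variable name
    env: Optional[Mapping[str, str]] = None,
) -> List[str]:
    """Return a list of problems; an empty list means the checks passed."""
    env = os.environ if env is None else env
    problems: List[str] = []

    folder = Path(dll_dir)
    if not folder.is_dir():
        problems.append(f"Teigha DLL directory not found: {dll_dir}")
    elif not any(p.suffix.lower() == ".dll" for p in folder.iterdir()):
        problems.append(f"No DLL files in: {dll_dir}")

    if not env.get(api_key_var):
        problems.append(f"Environment variable {api_key_var} is not set")
    return problems
```

The check only confirms that some DLL files exist in the directory, not that they are the right Teigha binaries, and it does not cover the Microsoft Office requirement.
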