{"id":33345738,"url":"https://github.com/hah23255/aec-rag-system","last_synced_at":"2026-04-12T01:39:05.953Z","repository":{"id":324948232,"uuid":"1099191606","full_name":"hah23255/aec-rag-system","owner":"hah23255","description":"Production-grade RAG system for AEC design management with GraphRAG, Ollama, and FastAPI. Supports CAD/PDF parsing, version tracking, and impact analysis.","archived":false,"fork":false,"pushed_at":"2025-11-18T17:59:07.000Z","size":55,"stargazers_count":1,"open_issues_count":7,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2025-11-18T19:23:42.797Z","etag":null,"topics":["aec","architecture","cad","docker","fastapi","graphrag","llm","ollama","python","rag"],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/hah23255.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2025-11-18T17:25:59.000Z","updated_at":"2025-11-18T17:59:11.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/hah23255/aec-rag-system","commit_stats":null,"previous_names":["hah23255/aec-rag-system"],"tags_count":null,"template":false,"template_full_name":null,"purl":"pkg:github/hah23255/aec-rag-system","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hah23255%2Faec-rag-system","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hah23255%2Faec-
rag-system/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hah23255%2Faec-rag-system/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hah23255%2Faec-rag-system/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/hah23255","download_url":"https://codeload.github.com/hah23255/aec-rag-system/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hah23255%2Faec-rag-system/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":285737153,"owners_count":27223129,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-11-22T02:00:05.934Z","response_time":64,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["aec","architecture","cad","docker","fastapi","graphrag","llm","ollama","python","rag"],"created_at":"2025-11-22T05:00:27.441Z","updated_at":"2025-11-22T05:01:02.682Z","avatar_url":"https://github.com/hah23255.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# AEC Design Management RAG System\n\n[![Python](https://img.shields.io/badge/python-3.9+-blue.svg)](https://www.python.org/downloads/)\n[![License](https://img.shields.io/badge/license-MIT-green.svg)](LICENSE)\n[![Code style: 
black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black)\n[![FastAPI](https://img.shields.io/badge/FastAPI-0.109.0-009688.svg)](https://fastapi.tiangolo.com)\n[![Docker](https://img.shields.io/badge/docker-ready-blue.svg)](https://www.docker.com/)\n[![CI](https://github.com/hah23255/aec-rag-system/workflows/CI/badge.svg)](https://github.com/hah23255/aec-rag-system/actions)\n\nA production-grade Retrieval-Augmented Generation (RAG) system for Architecture, Engineering, and Construction (AEC) design management, powered by GraphRAG and local LLMs.\n\n\u003e 📋 **For detailed codebase overview and statistics, see [CODEBASE_OVERVIEW.md](CODEBASE_OVERVIEW.md)**\n\n## Features\n\n### Core Capabilities\n- **GraphRAG Architecture**: Relation-free graph construction using nano-graphrag or LinearRAG\n- **Version Tracking**: Built-in support for drawing revisions with SUPERSEDES relationships\n- **Impact Analysis**: Multi-hop reasoning to trace design change effects\n- **Code Compliance**: Track building code requirements and component compliance\n- **Document Processing**: Parse CAD files (DWG/DXF), PDFs, and scanned documents\n- **Fully Local**: Zero API costs - runs entirely on local hardware\n\n### Technical Stack\n- **Embeddings**: nomic-embed-text-v1 (8K token context, 0.7GB VRAM)\n- **LLM**: Llama-3.1-8B Q4 via Ollama (6GB VRAM)\n- **GraphRAG**: nano-graphrag with NetworkX storage (scales to Neo4j)\n- **Vector DB**: ChromaDB (embedded) or Milvus (production)\n- **API**: FastAPI with async support\n- **Deployment**: Docker Compose orchestration\n\n## Quick Start\n\n### Prerequisites\n- Python 3.9+\n- Docker \u0026 Docker Compose\n- NVIDIA GPU with 16GB VRAM (RTX A5000 or equivalent)\n- 16GB+ RAM\n- Ubuntu 20.04+ or compatible Linux\n\n### Installation\n\n1. **Clone the repository**:\n```bash\ngit clone https://github.com/hah23255/aec-rag-system.git\ncd aec-rag-system\n```\n\n2. 
**Set up environment**:\n```bash\n# Copy environment template\ncp .env.example .env\n\n# Edit .env with your configuration\nnano .env\n```\n\n3. **Start services with Docker Compose**:\n```bash\n# Start Ollama + API\ndocker-compose up -d\n\n# Pull required models\ndocker exec aec-rag-ollama ollama pull nomic-embed-text\ndocker exec aec-rag-ollama ollama pull llama3.1:8b\n```\n\n4. **Verify installation**:\n```bash\n# Check API health\ncurl http://localhost:8000/api/v1/health\n\n# View API documentation\nopen http://localhost:8000/api/docs\n```\n\n### Manual Installation (Development)\n\n```bash\n# Create virtual environment\npython -m venv venv\nsource venv/bin/activate  # Linux/Mac\n# venv\\Scripts\\activate  # Windows\n\n# Install dependencies\npip install -r requirements.txt\n\n# Install Ollama separately\ncurl -fsSL https://ollama.com/install.sh | sh\n\n# Pull models\nollama pull nomic-embed-text\nollama pull llama3.1:8b\n\n# Run API\nuvicorn src.api.main:app --reload --host 0.0.0.0 --port 8000\n```\n\n## Usage\n\n### Upload a Document\n\n```bash\n# Upload a CAD file\ncurl -X POST \"http://localhost:8000/api/v1/documents/upload\" \\\n  -F \"file=@/path/to/drawing.dxf\" \\\n  -F \"document_type=cad\"\n\n# Upload a PDF\ncurl -X POST \"http://localhost:8000/api/v1/documents/upload\" \\\n  -F \"file=@/path/to/spec.pdf\" \\\n  -F \"document_type=pdf\"\n```\n\n### Query the System\n\n```bash\n# Natural language query\ncurl -X POST \"http://localhost:8000/api/v1/query\" \\\n  -H \"Content-Type: application/json\" \\\n  -d '{\"query\": \"What components are affected by changes to Drawing A-101?\"}'\n\n# Get version history\ncurl \"http://localhost:8000/api/v1/versions/A-101\"\n\n# Impact analysis\ncurl \"http://localhost:8000/api/v1/impact/component-id-123\"\n```\n\n## Project Structure\n\n```\naec-rag-system/\n├── src/\n│   ├── core/              # RAG core modules\n│   │   ├── embeddings.py  # Embedding generation\n│   │   ├── llm.py         # LLM interface\n│   │   
└── graphrag.py    # GraphRAG logic\n│   ├── schema/            # AEC domain schema\n│   │   └── aec_schema.py  # Entity \u0026 relationship definitions\n│   ├── ingestion/         # Document processing\n│   │   ├── cad_parser.py  # CAD file parsing\n│   │   └── pdf_parser.py  # PDF parsing\n│   ├── api/               # REST API\n│   │   └── main.py        # FastAPI application\n│   ├── retrieval/         # Query processing\n│   └── utils/             # Utilities\n├── tests/                 # Test suite\n├── docs/                  # Documentation\n├── scripts/               # Utility scripts\n├── config/                # Configuration\n├── deployment/            # Deployment configs\n├── Dockerfile\n├── docker-compose.yml\n├── requirements.txt\n├── pyproject.toml\n├── CODEBASE_OVERVIEW.md   # Detailed codebase documentation\n└── README.md\n```\n\n## Development\n\n### Running Tests\n\n```bash\n# Run all tests\npytest\n\n# Run with coverage\npytest --cov=src --cov-report=html\n\n# Run specific test suite\npytest tests/unit/\npytest tests/integration/\n```\n\n### Code Quality\n\n```bash\n# Format code\nblack src/ tests/\n\n# Lint\nruff check src/ tests/\n\n# Type check\nmypy src/\n```\n\n### Adding New Features\n\n1. Define entities/relationships in `src/schema/aec_schema.py`\n2. Implement parsing logic in `src/ingestion/`\n3. Add query capabilities in `src/retrieval/`\n4. Expose via API in `src/api/main.py`\n5. 
Write tests in `tests/`\n\n## API Documentation\n\nInteractive API documentation is available at:\n- Swagger UI: `http://localhost:8000/api/docs`\n- ReDoc: `http://localhost:8000/api/redoc`\n\n### Key Endpoints\n\n| Endpoint | Method | Description |\n|----------|--------|-------------|\n| `/api/v1/health` | GET | Health check |\n| `/api/v1/documents/upload` | POST | Upload document |\n| `/api/v1/query` | POST | Natural language query |\n| `/api/v1/versions/{drawing_id}` | GET | Version history |\n| `/api/v1/impact/{entity_id}` | GET | Impact analysis |\n| `/api/v1/graph/export` | GET | Export graph data |\n\n## Architecture\n\n### GraphRAG Flow\n\n```\nDocument Upload → Parse → Extract Entities → Generate Embeddings\n                                    ↓\n                              Build Graph (NetworkX)\n                                    ↓\nQuery → Embed → Retrieve Subgraph → LLM Reasoning → Response\n```\n\n### Resource Usage\n\n| Component | VRAM | RAM | Notes |\n|-----------|------|-----|-------|\n| nomic-embed-text | 0.7 GB | 1 GB | Efficient embedding model |\n| Llama-3.1-8B Q4 | 6.0 GB | 8 GB | Quantized for efficiency |\n| API + Services | - | 2 GB | FastAPI, ChromaDB |\n| **Total** | **6.7 GB** | **11 GB** | Fits RTX A5000 (16GB VRAM) |\n\n## Deployment\n\n### Docker Compose (Recommended)\n\n```bash\n# Production deployment\ndocker-compose -f docker-compose.yml -f docker-compose.prod.yml up -d\n\n# Scale API instances\ndocker-compose up -d --scale api=3\n```\n\n### Kubernetes (Advanced)\n\n```bash\n# Apply manifests\nkubectl apply -f deployment/k8s/\n\n# Check status\nkubectl get pods -n aec-rag\n```\n\n## Configuration\n\nKey environment variables (see `.env.example`):\n\n```bash\n# Ollama\nOLLAMA_HOST=http://localhost:11434\nEMBEDDING_MODEL=nomic-embed-text\nLLM_MODEL=llama3.1:8b\n\n# API\nAPI_HOST=0.0.0.0\nAPI_PORT=8000\nAPI_WORKERS=4\n\n# Storage\nGRAPH_BACKEND=networkx  # or neo4j\nVECTOR_DB=chromadb      # or milvus\nDATA_DIR=./data\n```\n\n## 
Troubleshooting\n\n### Common Issues\n\n**Ollama not responding**\n```bash\n# Check Ollama status\ndocker logs aec-rag-ollama\n\n# Restart Ollama\ndocker-compose restart ollama\n```\n\n**Out of VRAM**\n- Reduce batch sizes in `.env`\n- Use smaller quantized models (Q3 instead of Q4)\n- Close other GPU applications\n\n**Slow queries**\n- Check if models are loaded: `curl http://localhost:11434/api/tags`\n- Enable embedding cache (default: enabled)\n- Consider upgrading to Milvus for vector DB\n\n## Performance\n\n### Benchmarks (RTX A5000)\n\n| Operation | Time | Throughput |\n|-----------|------|------------|\n| Embed 1K tokens | 50ms | 20K tokens/s |\n| LLM generation (500 tokens) | 2-3s | ~200 tokens/s |\n| CAD parsing (500KB DXF) | 1-2s | - |\n| Graph query (3-hop) | 100ms | - |\n\n## Contributing\n\nSee [CONTRIBUTING.md](CONTRIBUTING.md) for development guidelines.\n\n## License\n\nThis project is licensed under the MIT License - see [LICENSE](LICENSE) file.\n\n## Acknowledgments\n\n- Based on [nano-graphrag](https://github.com/gusye1234/nano-graphrag) framework\n- Inspired by [LinearRAG](https://github.com/NVIDIA/GenerativeAIExamples/tree/main/community/linear-rag) principles\n- Built on [Ollama](https://ollama.com) for local LLM inference\n\n## Support\n\n- 📧 Email: support@example.com\n- 🐛 Issues: [GitHub Issues](https://github.com/hah23255/aec-rag-system/issues)\n- 💬 Discussions: [GitHub Discussions](https://github.com/hah23255/aec-rag-system/discussions)\n\n---\n\n**Status**: Production-ready v0.1.0 | **Last Updated**: November 2025\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fhah23255%2Faec-rag-system","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fhah23255%2Faec-rag-system","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fhah23255%2Faec-rag-system/lists"}