https://github.com/bhargavi35/codebase-archaeologist-backend
A high-performance, asynchronous FastAPI orchestration layer designed to drive semantic discovery, map software structural dependencies, and execute real-time code vector transformations.
https://github.com/bhargavi35/codebase-archaeologist-backend
chromadb fastapi genai python3 uvicorn
Last synced: about 8 hours ago
JSON representation
A high-performance, asynchronous FastAPI orchestration layer designed to drive semantic discovery, map software structural dependencies, and execute real-time code vector transformations.
- Host: GitHub
- URL: https://github.com/bhargavi35/codebase-archaeologist-backend
- Owner: bhargavi35
- Created: 2026-06-19T07:24:51.000Z (12 days ago)
- Default Branch: main
- Last Pushed: 2026-06-19T10:05:11.000Z (12 days ago)
- Last Synced: 2026-06-19T10:25:39.317Z (12 days ago)
- Topics: chromadb, fastapi, genai, python3, uvicorn
- Language: Python
- Homepage: https://codebase-archaeologist-backend.onrender.com
- Size: 162 KB
- Stars: 1
- Watchers: 0
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
Awesome Lists containing this project
README
# 🐍 Codebase Archaeologist Production Engine — Backend API
A high-performance, asynchronous FastAPI orchestration layer designed to drive semantic discovery, map software structural dependencies, and execute real-time code vector transformations.
📂 **Live API Interactive Docs:** [https://codebase-archaeologist-backend.onrender.com/docs](https://codebase-archaeologist-backend.onrender.com/docs)
---
## ✨ Key Capabilities
* **🌐 Memory-Buffered GitHub Ingestion:** Securely clones public GitHub ZIP archives directly into short-lived system memory, bypassing local disk clutter, to parse and map source file modules on the fly.
* **🧠 Free-Tier Cloud Optimized Indexing:** Utilizes a high-speed `chromadb` Ephemeral Memory Client to instantiate in-RAM data collections, bypassing the need for paid cloud disk space while ensuring zero-leak data flushes.
* **🔮 AST-Inspired Import Extractor:** Employs advanced regex parsing arrays to scan incoming code files, pull relative/aliased paths, and construct an interactive network link matrix.
* **📡 Real-Time SSE Response Streaming:** Implements Server-Sent Events (SSE) via FastAPI's `StreamingResponse` to push real-time, token-by-token answer responses directly to the client dashboard.
* **🛡️ Defensive Quota Boundaries:** Enforces protective file capping constraints (maximum 15 concurrent source files mapped) and traps rate-limiting failures to pass down unified `ERROR_QUOTA_EXHAUSTED` tokens gracefully.
---
## 🛠️ Technical Stack & Architecture
* **Language/Framework:** Python 3.11+, FastAPI, Uvicorn
* **AI & Embeddings:** Google GenAI SDK (`gemini-embedding-2`, `gemini-2.5-flash`)
* **Vector Database:** ChromaDB (In-Memory Ephemeral Client mapping configuration)
* **HTTP Infrastructure:** HTTPX (Asynchronous network connection routing)
---
## 🚀 Local Deployment & API Testing
1. **Clone the repository:**
```bash
git clone [https://github.com/bhargavi35/codebase-archaeologist-backend.git](https://github.com/bhargavi35/codebase-archaeologist-backend.git)
cd codebase-archaeologist-backend
Set up a virtual environment:
Bash
python -m venv venv
# On Windows:
.\venv\Scripts\activate
Install production dependencies:
Bash
pip install -r requirements.txt
Inject Your AI Key Secret:
Configure a .env file or export the key variable directly to your environment:
Bash
export GEMINI_API_KEY="your_google_ai_studio_key_here"
Fire up the Uvicorn worker:
Bash
uvicorn main:app --reload
Access the live interactive Swagger UI testing playground at http://127.0.0.1:8000/docs.