https://github.com/josiah-mbao/docu-vision
An AI tool that turns documents into insights using Google Cloud and a FastAPI backend
- Host: GitHub
- URL: https://github.com/josiah-mbao/docu-vision
- Owner: josiah-mbao
- Created: 2025-05-08T14:32:05.000Z (about 1 year ago)
- Default Branch: main
- Last Pushed: 2025-05-08T17:49:28.000Z (about 1 year ago)
- Last Synced: 2025-05-22T04:12:06.959Z (about 1 year ago)
- Topics: api, fastapi, gcp-storage-bucket
- Language: JavaScript
- Homepage:
- Size: 1.62 MB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
# DocuVision 🧠📄 - AI Document Processing
---
## 📄 About
**DocuVision** is an AI-powered document processing system that combines OCR extraction with intelligent document analysis. It uses **OCR.space** for text extraction and **OpenRouter AI** for document understanding, classification, and data structuring. The application is built with production-ready practices including rate limiting, API security, and PWA support.
Key Improvements:
- 🛡️ Added API security and rate limiting
- 📱 Enhanced mobile experience with PWA support
- 🧠 Smarter document analysis with AI classification
- ⚡ Optimized performance with GZip compression
- 🎨 Improved UI/UX with better feedback systems
---
## 🧠 Core Features
### Document Processing
- 📤 **Multi-format Upload**: Supports PDF, JPEG, and PNG files (up to 10 MB)
- 🔍 **Advanced OCR**: OCR.space integration with fallback handling
- 🧠 **AI Analysis**: Document classification and data extraction via OpenRouter
- 📊 **Structured Output**: Clean JSON responses with typed fields
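
To make "structured output with typed fields" concrete, here is a minimal sketch of what such a response could look like. The class names, field names, and type labels are hypothetical and chosen for illustration; the project's actual schema (which the tech stack suggests is defined with Pydantic) may differ.

```python
import json
from dataclasses import dataclass, field, asdict
from typing import Optional


@dataclass
class ExtractedField:
    # One typed value pulled out of the document text.
    name: str
    value: str
    field_type: str  # hypothetical labels such as "string", "date", "currency"


@dataclass
class DocumentResult:
    # Hypothetical response shape, not the project's confirmed schema.
    filename: str
    document_type: str
    raw_text: str
    fields: list[ExtractedField] = field(default_factory=list)
    confidence: Optional[float] = None

    def to_json(self) -> str:
        # asdict() converts nested dataclasses into plain dicts and lists.
        return json.dumps(asdict(self), indent=2)


result = DocumentResult(
    filename="invoice.pdf",
    document_type="invoice",
    raw_text="Invoice #1001 Total: 42.00",
    fields=[ExtractedField("total", "42.00", "currency")],
)
print(result.to_json())
```

Keeping the OCR text (`raw_text`) next to the extracted fields lets a client check each structured value against its source.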
### Technical Features
- ⚡ **Production-Ready API**: Rate limiting, error handling, and docs
- 🔒 **Optional API Key Security**: Protect your endpoints
- 📱 **PWA Support**: Installable and works offline
- 📈 **Progress Tracking**: Real-time upload and processing feedback
- 🖨️ **Print Styles**: Document-friendly print output
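
The optional API key check and rate limiting listed above can be sketched with the standard library alone. This is an illustration under stated assumptions, not the project's code: `api_key_is_valid` and `SlidingWindowLimiter` are hypothetical names, and a sliding-window counter is only one of several possible rate-limiting strategies.

```python
import hmac
import time
from collections import defaultdict, deque
from typing import Callable, Optional


def api_key_is_valid(provided: Optional[str], expected: Optional[str]) -> bool:
    """Return True when no key is configured (security is optional) or the keys match."""
    if not expected:
        return True
    if provided is None:
        return False
    # compare_digest avoids leaking key contents through timing differences.
    return hmac.compare_digest(provided.encode(), expected.encode())


class SlidingWindowLimiter:
    """Allow at most `limit` requests per client in any `window` seconds."""

    def __init__(
        self,
        limit: int,
        window: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.limit = limit
        self.window = window
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        self._hits = defaultdict(deque)

    def allow(self, client_id: str) -> bool:
        now = self.clock()
        hits = self._hits[client_id]
        # Drop timestamps that have fallen out of the window.
        while hits and now - hits[0] >= self.window:
            hits.popleft()
        if len(hits) >= self.limit:
            return False
        hits.append(now)
        return True


limiter = SlidingWindowLimiter(limit=2, window=60.0)
print(limiter.allow("203.0.113.7"))
```

In a web backend, a rejected key would typically map to an HTTP 401 or 403 response and a rejected rate-limit check to HTTP 429.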
### Developer Experience
- 📝 **Swagger Docs**: Interactive API documentation at `/api/docs`
- 🔍 **Validation**: Strict file type and size validation
- 🧩 **Modular Design**: Clean separation of concerns
- 📊 **Logging**: Comprehensive request and error logging
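
As an illustration of strict file type and size validation, the sketch below checks the extension, the size limit, and the file's leading bytes. The helper name `validate_upload`, the `UploadError` class, and the reading of "10MB" as 10 MiB are assumptions made for this example; the project may validate uploads differently, for instance by MIME type.

```python
from pathlib import Path

MAX_UPLOAD_BYTES = 10 * 1024 * 1024  # assumption: "10MB" interpreted as 10 MiB

# Leading bytes ("magic numbers") for each accepted format.
SIGNATURES = {
    ".pdf": b"%PDF-",
    ".jpg": b"\xff\xd8\xff",
    ".jpeg": b"\xff\xd8\xff",
    ".png": b"\x89PNG\r\n\x1a\n",
}


class UploadError(ValueError):
    """Raised when an upload fails validation (hypothetical helper class)."""


def validate_upload(filename: str, data: bytes) -> str:
    """Return the normalized extension, or raise UploadError."""
    ext = Path(filename).suffix.lower()
    if ext not in SIGNATURES:
        raise UploadError(f"unsupported file type: {ext or 'none'}")
    if len(data) > MAX_UPLOAD_BYTES:
        raise UploadError("file exceeds the size limit")
    # Checking the content guards against a renamed file with a trusted extension.
    if not data.startswith(SIGNATURES[ext]):
        raise UploadError("file contents do not match the extension")
    return ext


print(validate_upload("scan.PNG", b"\x89PNG\r\n\x1a\n" + b"\x00" * 8))
```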
---
## 🌐 Enhanced Tech Stack
| Component | Technology |
|--------------------|---------------------------------------------------------------------------|
| **Frontend** | HTML5, CSS3, JavaScript (PWA-enabled) |
| **Backend** | FastAPI (Python) with Pydantic models |
| **OCR** | OCR.space API |
| **AI Analysis** | OpenRouter AI (Mistral 7B) |
| **Security** | API Key Auth, Rate Limiting |
| **Performance** | GZip Middleware, Async Processing |
| **DevOps** | Logging, Error Tracking, Configuration Management |
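
To show why GZip compression helps with JSON-heavy responses, the following sketch compresses a repetitive payload with Python's standard `gzip` module and confirms the round trip. The payload is made up for illustration. In a FastAPI app this is usually enabled through the framework's `GZipMiddleware`; that this project configures it that way is an assumption.

```python
import gzip
import json

# Made-up response body; OCR text is often long and repetitive.
payload = json.dumps(
    {"document_type": "invoice", "raw_text": "Total due: 42.00 " * 200}
).encode("utf-8")

compressed = gzip.compress(payload)

# Round-trip check: decompression restores the original bytes exactly.
assert gzip.decompress(compressed) == payload
print(f"{len(payload)} bytes -> {len(compressed)} bytes")
```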
---
## ⚙️ Local Setup
### Prerequisites
- Python 3.9+
- Node.js (for optional frontend builds)
- API keys for:
  - OCR.space (free tier available)
  - OpenRouter (optional)
### Installation
```bash
# Clone the repository
git clone https://github.com/josiah-mbao/docu-vision.git
cd docu-vision
# Create and activate virtual environment
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
# Install dependencies
pip install -r requirements.txt
# Set up environment variables
cp .env.example .env
# Edit .env with your API keys