https://github.com/josiah-mbao/docu-vision

An AI tool that turns documents into insights using Google Cloud and a FastAPI backend
https://github.com/josiah-mbao/docu-vision

api fastapi gcp-storage-bucket

Last synced: about 2 months ago
JSON representation

An AI tool that turns documents into insights using Google Cloud and a FastAPI backend

Host: GitHub
URL: https://github.com/josiah-mbao/docu-vision
Owner: josiah-mbao
Created: 2025-05-08T14:32:05.000Z (about 1 year ago)
Default Branch: main
Last Pushed: 2025-05-08T17:49:28.000Z (about 1 year ago)
Last Synced: 2025-05-22T04:12:06.959Z (about 1 year ago)
Topics: api, fastapi, gcp-storage-bucket
Language: JavaScript
Homepage:
Size: 1.62 MB
Stars: 0
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md

Awesome Lists containing this project

README

          # DocuVision 🧠📄 - AI Document Processing

[![FastAPI](https://img.shields.io/badge/FastAPI-005571?style=for-the-badge&logo=fastapi&logoColor=white)](https://fastapi.tiangolo.com/)

[![Python](https://img.shields.io/badge/Python-3776AB?style=for-the-badge&logo=python&logoColor=white)](https://python.org)

[![OpenRouter](https://img.shields.io/badge/OpenRouter-1E1E2D?style=for-the-badge&logo=openai&logoColor=white)](https://openrouter.ai)

[![PWA](https://img.shields.io/badge/PWA-5A0FC8?style=for-the-badge&logo=pwa&logoColor=white)](https://web.dev/progressive-web-apps/)

[![HTML5](https://img.shields.io/badge/HTML5-E34F26?style=for-the-badge&logo=html5&logoColor=white)](https://developer.mozilla.org/en-US/docs/Web/HTML)

[![CSS3](https://img.shields.io/badge/CSS3-1572B6?style=for-the-badge&logo=css3&logoColor=white)](https://developer.mozilla.org/en-US/docs/Web/CSS)

[![JavaScript](https://img.shields.io/badge/JavaScript-F7DF1E?style=for-the-badge&logo=javascript&logoColor=black)](https://developer.mozilla.org/en-US/docs/Web/JavaScript)

---

![DocuVision Screenshot](https://github.com/user-attachments/assets/497c3d82-f396-4d54-92c3-5df37fc0e249)

---

## 📄 About

**DocuVision** is an AI-powered document processing system that combines OCR extraction with intelligent document analysis. It uses **OCR.space** for text extraction and **OpenRouter AI** for document understanding, classification, and data structuring. The application is built with production-ready practices including rate limiting, API security, and PWA support.

Key Improvements:

- 🛡️ Added API security and rate limiting

- 📱 Enhanced mobile experience with PWA support

- 🧠 Smarter document analysis with AI classification

- ⚡ Optimized performance with GZip compression

- 🎨 Improved UI/UX with better feedback systems

---

## 🧠 Core Features

### Document Processing

- 📤 **Multi-format Upload**: Supports PDFs, JPEG, PNG (up to 10MB)

- 🔍 **Advanced OCR**: OCR.space integration with fallback handling

- 🧠 **AI Analysis**: Document classification and data extraction via OpenRouter

- 📊 **Structured Output**: Clean JSON responses with typed fields

### Technical Features

- ⚡ **Production-Ready API**: Rate limiting, error handling, and docs

- 🔒 **Optional API Key Security**: Protect your endpoints

- 📱 **PWA Support**: Installable and works offline

- 📈 **Progress Tracking**: Real-time upload and processing feedback

- 🖨️ **Print Styles**: Document-friendly print output

### Developer Experience

- 📝 **Swagger Docs**: Interactive API documentation at `/api/docs`

- 🔍 **Validation**: Strict file type and size validation

- 🧩 **Modular Design**: Clean separation of concerns

- 📊 **Logging**: Comprehensive request and error logging

---

## 🌐 Enhanced Tech Stack

| Component          | Technology                                                                 |

|--------------------|---------------------------------------------------------------------------|

| **Frontend**       | HTML5, CSS3, JavaScript (PWA-enabled)                                    |

| **Backend**        | FastAPI (Python) with Pydantic models                                    |

| **OCR**           | OCR.space API                                                            |

| **AI Analysis**    | OpenRouter AI (Mistral 7B)                                               |

| **Security**       | API Key Auth, Rate Limiting                                              |

| **Performance**    | GZip Middleware, Async Processing                                       |

| **DevOps**         | Logging, Error Tracking, Configuration Management                       |

---

## ⚙️ Local Setup

### Prerequisites

- Python 3.9+

- Node.js (for optional frontend builds)

- API keys for:

  - OCR.space (free tier available)

  - OpenRouter (optional)

### Installation

```bash

# Clone the repository

git clone https://github.com/josiah-mbao/docuvision.git

cd docuvision

# Create and activate virtual environment

python -m venv venv

source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install dependencies

pip install -r requirements.txt

# Set up environment variables

cp .env.example .env

# Edit .env with your API keys

ecosyste.ms

Data

Tools

Indexes

Applications

Experiments

Awesome

https://github.com/josiah-mbao/docu-vision

Awesome Lists containing this project

README