https://github.com/am-ankitgit/indian-gst-chatbot-with-invoice-analysis
- Host: GitHub
- URL: https://github.com/am-ankitgit/indian-gst-chatbot-with-invoice-analysis
- Owner: AM-Ankitgit
- License: mit
- Created: 2025-01-31T10:25:30.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2025-02-04T07:54:02.000Z (over 1 year ago)
- Last Synced: 2025-02-04T08:26:12.740Z (over 1 year ago)
- Language: Jupyter Notebook
- Size: 2.93 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# INDIAN-GST-chatbot-with-Invoice-analysis
# RAG-Based GST Analysis Agent
## Overview
This project is a Retrieval-Augmented Generation (RAG) agent for GST (Goods and Services Tax) analysis. It extracts, processes, and analyzes GST-related data from PDFs, images, and text inputs, combining a large language model with OCR and document parsing to answer questions about that data.
### Key Features
* **RAG-Based Architecture:** Combines retrieval and generation for accurate GST analysis.
* **Multi-Modal Input:** Supports text, PDFs, and images.
* **Advanced OCR & Table Extraction:** Extracts structured data from scanned invoices and GST documents.
* **Integration with O1-mini Model:** Utilizes the O1-mini model for enhanced response generation.
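To make the retrieve-then-generate idea concrete, here is a minimal sketch using only the standard library. It is not the project's implementation: the bag-of-words cosine scoring, the function names, and the three sample chunks are assumptions chosen for illustration, and a real pipeline would more likely use an embedding model and a vector store.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, chunks, k=2):
    """Return the k chunks most similar to the query (toy retriever)."""
    q = Counter(tokenize(query))
    ranked = sorted(
        chunks,
        key=lambda c: cosine(q, Counter(tokenize(c))),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(query, context_chunks):
    """Assemble the prompt that would be sent to the language model."""
    context = "\n".join(f"- {c}" for c in context_chunks)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )


# Hypothetical knowledge-base chunks, for illustration only.
CHUNKS = [
    "IGST is levied on inter-state supplies of goods and services.",
    "CGST and SGST are levied together on intra-state supplies.",
    "A GSTIN is a 15-character identification number assigned to each registered taxpayer.",
]

if __name__ == "__main__":
    question = "Which tax applies to inter-state supplies?"
    print(build_prompt(question, retrieve(question, CHUNKS, k=1)))
```

The generation step is left out on purpose: the prompt returned by `build_prompt` is what would be passed to the model.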
---
## Installation
### Prerequisites
Ensure you have the following dependencies installed:
* Python 3.8+
* Pip
* Virtual Environment (recommended)
### Setup
- `git clone https://github.com/your-repo/gst-analysis-agent.git`
- `cd gst-analysis-agent`
- `python -m venv botenv`
- `source botenv/bin/activate` (on Windows, use `botenv\Scripts\activate`)
- `pip install -r requirements.txt`
- Set your OpenAI API key in `.env`.
### How to run
- `python app.py`
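The README does not say how the key in `.env` is read. As a rough sketch, a standard-library loader could look like the following. The function name `load_env_file` is hypothetical, and the variable name `OPENAI_API_KEY` is an assumption (it is the name the OpenAI Python SDK reads by default). The project itself may rely on a package such as `python-dotenv` instead.

```python
import os
from pathlib import Path


def load_env_file(path=".env"):
    """Parse KEY=VALUE lines from a .env file and export them.

    Returns the parsed pairs. Variables already present in the
    environment are left untouched.
    """
    loaded = {}
    env_path = Path(path)
    if not env_path.exists():
        return loaded
    for raw in env_path.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        # Skip blank lines, comments, and lines without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        key = key.strip()
        value = value.strip().strip("'\"")  # drop optional surrounding quotes
        loaded[key] = value
        os.environ.setdefault(key, value)
    return loaded


if __name__ == "__main__":
    load_env_file()
    # OPENAI_API_KEY is an assumed variable name, not confirmed by the README.
    print("API key configured:", "OPENAI_API_KEY" in os.environ)
```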
### API Endpoints
1. **Upload Data:**
* Endpoint: `/UploadData`
* Method: `POST`
* Supports: Text, PDFs, Images
2. **PDF Processing:**
* Uses `unstructured.partition.pdf` for structured data extraction.
* Extracts tables and text separately.
3. **Image Processing:**
* Uses `pytesseract` for OCR-based text extraction.
* Object detection using `torchvision`.
4. **Integration with O1-mini:**
* The system leverages the `O1-mini` model for intelligent text generation and reasoning over retrieved GST data.
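The processing paths listed above can be sketched as a small dispatcher that routes an uploaded file to a PDF, image, or text extractor. This is an illustration under assumptions, not the code behind `/UploadData`: the function names, the suffix-based routing, and the returned dictionary shape are hypothetical. The third-party calls (`partition_pdf` from `unstructured`, `pytesseract.image_to_string`) are imported lazily, so the routing logic runs even when those packages are not installed. The `torchvision` object-detection step is not shown.

```python
from pathlib import Path

# Assumed set of image types; the README only says "Images".
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".tif", ".tiff", ".bmp"}


def detect_kind(filename):
    """Classify an upload as 'pdf', 'image', or 'text' by file suffix."""
    suffix = Path(filename).suffix.lower()
    if suffix == ".pdf":
        return "pdf"
    if suffix in IMAGE_SUFFIXES:
        return "image"
    return "text"


def extract_pdf(path):
    """Split a PDF into table text and other text using `unstructured`."""
    from unstructured.partition.pdf import partition_pdf  # third-party

    elements = partition_pdf(
        filename=str(path),
        strategy="hi_res",            # layout-aware parsing, needed for tables
        infer_table_structure=True,
    )
    tables = [el.text for el in elements if el.category == "Table"]
    texts = [el.text for el in elements if el.category != "Table"]
    return {"tables": tables, "text": texts}


def extract_image(path):
    """Run OCR on an image with pytesseract (needs the Tesseract binary)."""
    from PIL import Image   # third-party
    import pytesseract      # third-party

    return {"tables": [], "text": [pytesseract.image_to_string(Image.open(path))]}


def extract_text(path):
    """Read a plain-text upload as-is."""
    return {"tables": [], "text": [Path(path).read_text(encoding="utf-8")]}


EXTRACTORS = {"pdf": extract_pdf, "image": extract_image, "text": extract_text}


def extract(path):
    """Route a file to the matching extractor and return its output."""
    return EXTRACTORS[detect_kind(path)](path)
```

Keeping tables and running text in separate lists mirrors the "extracts tables and text separately" behaviour described for PDFs, so each can be chunked differently before retrieval.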