Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/aniketwdubey/chatpdf
This project is a Document Retrieval application that utilizes Retrieval-Augmented Generation (RAG) techniques to enable users to interact with uploaded PDF documents. By leveraging a Large Language Model (LLM), users can ask questions about the content of the documents and receive accurate answers based on the information retrieved.
chat-application document-retrieval fastapi huggingface large-language-models llm python rag retrieval-augmented-generation
Last synced: 3 months ago
- Host: GitHub
- URL: https://github.com/aniketwdubey/chatpdf
- Owner: aniketwdubey
- License: apache-2.0
- Created: 2024-10-20T15:23:53.000Z (4 months ago)
- Default Branch: main
- Last Pushed: 2024-10-20T19:17:00.000Z (4 months ago)
- Last Synced: 2024-10-21T23:14:52.858Z (4 months ago)
- Topics: chat-application, document-retrieval, fastapi, huggingface, large-language-models, llm, python, rag, retrieval-augmented-generation
- Language: Jupyter Notebook
- Homepage:
- Size: 961 KB
- Stars: 4
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# ChatPDF
This project is a **Document Retrieval** application that utilizes **Retrieval-Augmented Generation (RAG)** techniques to enable users to interact with uploaded PDF documents. By leveraging a **Large Language Model (LLM)**, users can ask questions about the content of the documents and receive accurate answers based on the information retrieved.
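As a rough illustration of the retrieval step in RAG, the sketch below splits text into chunks, scores each chunk against a question, and assembles a prompt from the best matches. It is not this project's code: the word-count `embed` function and the brute-force cosine ranking are toy stand-ins for a real embedding model and a FAISS index, and every function name here is hypothetical.

```python
import math
import re
from collections import Counter


def chunk_text(text: str, chunk_size: int = 40) -> list[str]:
    """Split text into chunks of at most chunk_size words."""
    words = text.split()
    return [" ".join(words[i:i + chunk_size]) for i in range(0, len(words), chunk_size)]


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (assumption: stands in for a real model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks most similar to the question (brute force, no index)."""
    q = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]


def build_prompt(question: str, context: list[str]) -> str:
    """Combine the retrieved chunks and the question into a single LLM prompt."""
    joined = "\n".join(f"- {c}" for c in context)
    return f"Answer using only this context:\n{joined}\n\nQuestion: {question}"
```

In a real pipeline, `embed` would call an embedding model and `retrieve` would query a vector index; the prompt returned by `build_prompt` is what would be sent to the LLM.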
## Features
- **PDF Upload**: Users can upload PDF files for processing.
- **AI Interaction**: Ask questions about the content of the uploaded PDFs.
- **Machine Learning Integration**: Uses machine learning models (LangChain with Hugging Face Transformers) for document processing and question answering.
## Technologies Used
- **Backend**: FastAPI
- **Frontend**: Streamlit
- **Machine Learning**: LangChain, Hugging Face Transformers
- **Vector Store**: FAISS for efficient similarity search
## Installation
1. Clone the repository:
```bash
git clone https://github.com/aniketwdubey/chatpdf.git
cd chatpdf
```
2. Create a virtual environment and activate it:
```bash
python -m venv .venv
source .venv/bin/activate # On Windows use .venv\Scripts\activate
```
3. Install the required packages:
```bash
pip install -r requirements.txt
```
## Usage
1. Start the FastAPI server:
```bash
uvicorn app.main:app --reload
```
2. Open the Streamlit app in another terminal:
```bash
streamlit run app/streamlit_app.py
```
3. Navigate to `http://localhost:8501` in your web browser to access the application.
## API Endpoints
- **GET /**: Returns a welcome message.
- **POST /upload_pdf/**: Uploads a PDF file for processing.
  - **Request**: Multipart form data with the PDF file.
  - **Response**: Success message upon successful upload and processing.
- **POST /ask/**: Asks a question about the uploaded PDF.
  - **Request**: JSON body with the question.
  - **Response**: The answer to the question based on the PDF content.
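The two POST endpoints above could be called from a Python client roughly as follows. This is a sketch using only the standard library, not code from the repository. The base URL (uvicorn's default `http://127.0.0.1:8000`), the form field name `file`, and the JSON key `question` are all assumptions, since the README does not specify them; check `app/main.py` for the actual names.

```python
import json
import urllib.request
import uuid

BASE_URL = "http://127.0.0.1:8000"  # assumption: uvicorn's default host and port


def build_upload_request(pdf_bytes: bytes, filename: str,
                         base_url: str = BASE_URL) -> urllib.request.Request:
    """Build (without sending) a multipart POST /upload_pdf/ request."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        # "file" is an assumed form field name
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        "Content-Type: application/pdf\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/upload_pdf/",
        data=head + pdf_bytes + tail,
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
        method="POST",
    )


def build_ask_request(question: str,
                      base_url: str = BASE_URL) -> urllib.request.Request:
    """Build (without sending) a JSON POST /ask/ request."""
    payload = json.dumps({"question": question}).encode("utf-8")  # assumed key
    return urllib.request.Request(
        url=f"{base_url}/ask/",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request) -> str:
    """Send a prepared request and return the raw response body as text."""
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")
```

With the FastAPI server running, a typical session would read a PDF from disk with `open(path, "rb").read()`, pass the bytes to `send(build_upload_request(...))`, and then call `send(build_ask_request("..."))` for each question.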
## Testing
To test the application manually, launch the Streamlit app:
```bash
streamlit run app/streamlit_app.py
```