https://github.com/itsvineetkr/ask-pdf-v0
This web application uses LangChain and FastAPI to query PDFs and answer questions about them. It also maintains conversation history and supports features such as speech-to-text in multiple languages.
- Host: GitHub
- URL: https://github.com/itsvineetkr/ask-pdf-v0
- Owner: itsvineetkr
- Created: 2025-01-16T09:32:04.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2025-01-22T08:36:51.000Z (over 1 year ago)
- Last Synced: 2025-10-29T20:46:11.210Z (8 months ago)
- Topics: fastapi, huggingface, langchain, llm, python, rag
- Language: Python
- Homepage:
- Size: 660 KB
- Stars: 1
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Ask PDF
This web application uses LangChain and FastAPI to query PDFs and answer questions about them.
It also maintains conversation history and supports speech-to-text, so you can ask questions in multiple languages just by speaking.
It is built on WebSockets, which keeps the question-and-answer exchange seamless.
It uses a Mistral AI model from Hugging Face as the language model, and Hugging Face embeddings to build a vector database from the text extracted from each PDF for retrieval.
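To make the retrieval step concrete, here is a minimal sketch of how a question can be matched against document chunks. It is not the project's code: simple word counts stand in for the Hugging Face embeddings, a plain list stands in for the vector database, and `embed`, `cosine`, and `retrieve` are hypothetical names chosen for illustration.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: lowercase word counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks most similar to the question."""
    q = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]


# Example chunks, as if they had been extracted from an uploaded PDF.
chunks = [
    "Invoices must be paid within 30 days.",
    "The warranty covers hardware defects for two years.",
    "Support is available by email on weekdays.",
]
top = retrieve("How long is the warranty?", chunks, k=1)
```

In a real retrieval-augmented setup, the chunks returned by a step like `retrieve` are passed to the language model together with the question, so the answer is grounded in the PDF's own text.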
## API endpoints
1. `/upload`: This endpoint accepts a file and stores it in the local file system.
2. `/ask`: This WebSocket endpoint receives audio or text data from the client for processing.
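
The storage step behind `/upload` can be sketched without any web framework. The helper below is an illustration only: `save_upload`, the `uploads` directory name, and the PDF-only check are assumptions, not details taken from the project.

```python
from pathlib import Path


def save_upload(filename: str, data: bytes, upload_dir: str = "uploads") -> Path:
    """Write uploaded bytes under upload_dir and return the saved path."""
    # Keep only the final path component so a name such as "../../x.pdf"
    # cannot escape the upload directory.
    safe_name = Path(filename).name
    # Assumption: only PDF files are accepted.
    if not safe_name.lower().endswith(".pdf"):
        raise ValueError("only .pdf files are accepted")
    target_dir = Path(upload_dir)
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / safe_name
    target.write_bytes(data)
    return target
```

A FastAPI handler could read the uploaded file's bytes and pass them to a helper like this one. Keeping the file-system logic in a plain function makes it easy to test on its own.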
## Data storing
There are two models. `questions` stores all the questions and their answers for a particular PDF, and `pdfMetadata` stores the metadata of every PDF ever uploaded.
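
As a rough illustration of how these two models relate, the sketch below expresses them as SQLite tables. Only the model names `questions` and `pdfMetadata` come from this README. The choice of SQLite, the column names, and the helper functions are assumptions made for the example.

```python
import sqlite3

# Hypothetical schema: every column name here is an assumption.
SCHEMA = """
CREATE TABLE IF NOT EXISTS pdfMetadata (
    id          INTEGER PRIMARY KEY AUTOINCREMENT,
    filename    TEXT NOT NULL,
    uploaded_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
);
CREATE TABLE IF NOT EXISTS questions (
    id       INTEGER PRIMARY KEY AUTOINCREMENT,
    pdf_id   INTEGER NOT NULL REFERENCES pdfMetadata(id),
    question TEXT NOT NULL,
    answer   TEXT NOT NULL
);
"""


def init_db(path: str = ":memory:") -> sqlite3.Connection:
    """Open a database and create both tables if they are missing."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def add_pdf(conn: sqlite3.Connection, filename: str) -> int:
    """Record an uploaded PDF and return its row id."""
    cur = conn.execute("INSERT INTO pdfMetadata (filename) VALUES (?)", (filename,))
    conn.commit()
    return cur.lastrowid


def add_question(conn: sqlite3.Connection, pdf_id: int, question: str, answer: str) -> None:
    """Store one question/answer pair for a given PDF."""
    conn.execute(
        "INSERT INTO questions (pdf_id, question, answer) VALUES (?, ?, ?)",
        (pdf_id, question, answer),
    )
    conn.commit()


def history(conn: sqlite3.Connection, pdf_id: int) -> list[tuple[str, str]]:
    """Return all (question, answer) pairs for a PDF, oldest first."""
    rows = conn.execute(
        "SELECT question, answer FROM questions WHERE pdf_id = ? ORDER BY id",
        (pdf_id,),
    )
    return rows.fetchall()
```

Linking each question to a PDF through `pdf_id` is what lets the conversation history be looked up per document.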
## Installation
1. **Clone the repository:**
```bash
git clone https://github.com/itsvineetkr/ask-pdf.git
cd ask-pdf
```
2. **Create a virtual environment:**
```bash
python3 -m venv venv
source venv/bin/activate # On Windows use `venv\Scripts\activate`
```
3. **Install the dependencies:**
```bash
pip install -r requirements.txt
```
## Setting up the .env file
1. **Create a `.env` file in the root directory of your project.**
2. **Add the following environment variables to the `.env` file:**
```env
HUGGINGFACEHUB_API_TOKEN = "hf_----xxxxxxx----"
```
Replace `hf_----xxxxxxx----` with your Hugging Face API key.
3. **Add the Tavily API key to the `.env` file:**
```env
TAVILY_API_KEY = ""
```
Set the value between the quotes to your Tavily API key.
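
Before starting the server, it can help to confirm that both keys are actually filled in. The following is a minimal, standard-library-only sketch. How the project itself loads `.env` is not stated here, and `parse_env_file` and `missing_keys` are hypothetical helpers.

```python
REQUIRED_KEYS = ("HUGGINGFACEHUB_API_TOKEN", "TAVILY_API_KEY")


def parse_env_file(text: str) -> dict[str, str]:
    """Parse simple KEY = "value" lines, skipping blanks and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip("\"'")
    return values


def missing_keys(values: dict[str, str]) -> list[str]:
    """Return the required keys that are absent or left empty."""
    return [key for key in REQUIRED_KEYS if not values.get(key)]


# Example: the Tavily key is still the empty placeholder.
sample = 'HUGGINGFACEHUB_API_TOKEN = "hf_example"\nTAVILY_API_KEY = ""\n'
problems = missing_keys(parse_env_file(sample))
```

To check a real file, pass the contents of `.env` to `parse_env_file`, for example with `Path(".env").read_text()`.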
## Running the Application
1. **Start the FastAPI server:**
```bash
uvicorn main:app --reload
```
The application will be available at `http://localhost:8000/`.