https://github.com/shaadclt/llm-powered-pdf-chatbot
This is a Streamlit-based PDF Chatbot powered by OpenAI's Language Models. The chatbot allows users to upload PDF files, extract text content, and ask natural language questions about the PDF content
- Host: GitHub
- URL: https://github.com/shaadclt/llm-powered-pdf-chatbot
- Owner: shaadclt
- Created: 2023-09-25T15:53:49.000Z (over 2 years ago)
- Default Branch: main
- Last Pushed: 2024-03-14T16:21:02.000Z (about 2 years ago)
- Last Synced: 2025-04-10T01:12:11.551Z (about 1 year ago)
- Topics: faiss, langchain, openai
- Language: Python
- Homepage:
- Size: 12.7 KB
- Stars: 12
- Watchers: 1
- Forks: 3
- Open Issues: 1
Metadata Files:
- Readme: README.md
README
# LLM-powered PDF Chatbot

This is a Streamlit-based PDF Chatbot powered by OpenAI's Language Models. The chatbot allows users to upload PDF files, extract text content, and ask natural language questions about the PDF content. Key project components include:
- **PDF Text Extraction**: Utilized the PyPDF2 library to extract text from uploaded PDF files, enabling easy access to the document's content.
- **Text Splitting**: Implemented a custom text splitter to break down the extracted text into manageable chunks for efficient processing.
- **Embeddings and Vector Stores**: Generated embeddings using OpenAI's Language Models and created a Vector Store using FAISS for efficient text similarity search.
- **Question Answering**: Integrated OpenAI's LLMs to provide accurate answers to user queries about the PDF content.
- **Persistence**: Implemented data persistence by storing generated embeddings to accelerate future queries.
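The text-splitting step can be illustrated with a minimal sketch. It is not the repository's splitter: the function name `split_text`, the character-based chunking, and the default sizes are assumptions chosen for illustration. The idea is that consecutive chunks overlap slightly, so a sentence cut at a chunk boundary still appears whole in at least one chunk.

```python
def split_text(text: str, chunk_size: int = 1000, chunk_overlap: int = 200) -> list[str]:
    """Split text into chunks of at most chunk_size characters.

    Each chunk starts chunk_size - chunk_overlap characters after the
    previous one, so neighbouring chunks share chunk_overlap characters.
    (Hypothetical helper; names and defaults are illustrative.)
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap
    chunks: list[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once a chunk has reached the end of the text, so no
        # redundant tail chunk is produced.
        if start + chunk_size >= len(text):
            break
    return chunks


# Small demonstration with tiny sizes so the overlap is visible.
print(split_text("abcdefghij", chunk_size=4, chunk_overlap=1))
# -> ['abcd', 'defg', 'ghij']
```

Larger overlaps improve the chance that a relevant passage lands intact in one chunk, at the cost of more chunks to embed and store.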
## Getting Started
To run this project locally, follow these steps:
1. Clone this repository:
```bash
git clone https://github.com/shaadclt/LLM-powered-PDF-Chatbot.git
```
2. Navigate to the project directory:
```bash
cd LLM-powered-PDF-Chatbot
```
3. Install the required Python packages:
```bash
pip install -r requirements.txt
```
4. Create a `.env` file in the project directory and set your OpenAI API key:
```plaintext
OPENAI_API_KEY=your_openai_api_key
```
5. Run the Streamlit app:
```bash
streamlit run app.py
```
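If the app fails at startup, a common cause is a missing or placeholder API key. The sketch below shows one way to read a `.env` file and check the key using only the standard library. How the repository itself loads the file is not shown in this README, so the helper names `parse_env_text`, `load_env_file`, and `require_api_key` are hypothetical.

```python
import os
from pathlib import Path


def parse_env_text(text: str) -> dict[str, str]:
    """Parse KEY=value lines, skipping blank lines and # comments."""
    values: dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def load_env_file(path: str = ".env") -> dict[str, str]:
    """Read a .env file (if present) and export its values to os.environ."""
    env_path = Path(path)
    if not env_path.exists():
        return {}
    values = parse_env_text(env_path.read_text(encoding="utf-8"))
    for key, value in values.items():
        # Variables already set in the real environment take precedence.
        os.environ.setdefault(key, value)
    return values


def require_api_key(env: dict[str, str]) -> str:
    """Return the API key or raise if it is missing or left as the placeholder."""
    key = env.get("OPENAI_API_KEY", "")
    if not key or key == "your_openai_api_key":
        raise RuntimeError("Set OPENAI_API_KEY in your .env file")
    return key
```

Calling `require_api_key(load_env_file())` before starting the app surfaces a configuration problem early instead of at the first question.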
## Usage
1. Upload a PDF file using the "Upload your PDF" button.
2. Ask questions about the PDF file using the text input field.
3. The chatbot will use OpenAI's models to answer your questions based on the PDF content.
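Answering "based on the PDF content" usually means the model is given the most relevant chunks of the document together with the question. The sketch below shows how such a prompt might be assembled. The function name `build_prompt`, the prompt wording, and the character budget are assumptions rather than the repository's actual code, and the call to the OpenAI API is left out.

```python
def build_prompt(question: str, context_chunks: list[str],
                 max_context_chars: int = 3000) -> str:
    """Combine retrieved chunks and the user's question into one prompt.

    Chunks are expected in order of relevance. They are added until the
    character budget would be exceeded, which keeps the prompt bounded.
    (Hypothetical helper for illustration.)
    """
    selected: list[str] = []
    used = 0
    for chunk in context_chunks:
        if used + len(chunk) > max_context_chars:
            break
        selected.append(chunk)
        used += len(chunk)

    context = "\n\n".join(selected)
    return (
        "Answer the question using only the context below. "
        "If the answer is not in the context, say you don't know.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )
```

Instructing the model to rely only on the supplied context is a common way to reduce answers that are not supported by the uploaded PDF.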
## Embeddings and Vector Store
The PDF text is split into chunks, and an embedding is generated for each chunk using an OpenAI model. These embeddings are stored in a FAISS vector store, which finds the chunks most similar to a question efficiently.
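To make the idea concrete, here is a pure-Python stand-in for a vector store. It is not FAISS and not the repository's code: `TinyVectorStore` is a hypothetical class that uses brute-force cosine similarity, the vectors in the usage note are toy values rather than OpenAI embeddings, and saving with `pickle` is one assumed way to persist embeddings between runs.

```python
import math
import pickle
from pathlib import Path


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors (0.0 if either has zero length)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


class TinyVectorStore:
    """Brute-force stand-in for a vector store (illustrative only)."""

    def __init__(self) -> None:
        self.texts: list[str] = []
        self.vectors: list[list[float]] = []

    def add(self, text: str, vector: list[float]) -> None:
        self.texts.append(text)
        self.vectors.append(list(vector))

    def search(self, query_vector: list[float], k: int = 3) -> list[tuple[float, str]]:
        """Return the k stored texts most similar to the query, best first."""
        scored = [
            (cosine_similarity(query_vector, vector), text)
            for vector, text in zip(self.vectors, self.texts)
        ]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return scored[:k]

    def save(self, path) -> None:
        """Persist texts and vectors so they need not be recomputed."""
        payload = {"texts": self.texts, "vectors": self.vectors}
        Path(path).write_bytes(pickle.dumps(payload))

    @classmethod
    def load(cls, path) -> "TinyVectorStore":
        # Only unpickle files you created yourself; pickle can run code on load.
        payload = pickle.loads(Path(path).read_bytes())
        store = cls()
        store.texts = payload["texts"]
        store.vectors = payload["vectors"]
        return store
```

A typical flow is to check whether a saved store exists for the uploaded PDF, load it if so, and otherwise embed the chunks, `add` each one, and `save` the result. A library such as FAISS serves the same purpose but uses optimized index structures instead of comparing the query against every vector in Python.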
## Acknowledgments
- [Streamlit](https://streamlit.io/) for creating an easy-to-use web app framework.
- [OpenAI](https://openai.com/) for providing powerful language models.
## License
This project is licensed under the [MIT License](LICENSE).