https://github.com/a-tommarvoloriddle/chat-with-pdf
This is a Streamlit application that lets users upload PDF files, extracts the text from them, and answers questions about the content of the PDFs. It uses NLP and ML techniques to answer the user's questions.
ai anthropic-api anthropic-claude langchain machine-learning natural-language-processing nlp pdf-reader python streamlit streamlit-webapp
Last synced: 5 months ago
- Host: GitHub
- URL: https://github.com/a-tommarvoloriddle/chat-with-pdf
- Owner: A-TomMarvoloRiddle
- Created: 2025-04-04T18:07:21.000Z (11 months ago)
- Default Branch: main
- Last Pushed: 2025-04-04T19:17:54.000Z (11 months ago)
- Last Synced: 2025-04-04T19:45:34.324Z (11 months ago)
- Topics: ai, anthropic-api, anthropic-claude, langchain, machine-learning, natural-language-processing, nlp, pdf-reader, python, streamlit, streamlit-webapp
- Language: Python
- Homepage:
- Size: 4.88 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# PDF Chatbot 📄🤖
This is a **PDF-based chatbot** built with **Streamlit**, **PyPDF2**, and the **Claude Sonnet API**. Users upload **PDF files**, the app extracts their text, and users can then ask questions about the content, which are answered using **Natural Language Processing (NLP) and Machine Learning (ML)**.
## Features 🚀
✔ **Upload Multiple PDFs**: Users can upload multiple PDFs to extract text.
✔ **Text Extraction**: The chatbot reads text from PDFs using PyPDF2.
✔ **Text Chunking**: Large text is split into smaller, manageable chunks.
✔ **Vector Store with FAISS**: Indexes the text chunks as vectors so the chunks most relevant to a question can be retrieved quickly.
✔ **Conversational AI**: Uses **Claude Sonnet API** to process and answer user queries.
✔ **Interactive UI**: Built with **Streamlit** for an easy-to-use web interface.
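
The text chunking feature can be illustrated with a minimal sketch. The README does not say which splitter or which sizes the project uses, so the function name `split_text` and the `chunk_size` and `overlap` defaults below are assumptions chosen for illustration only. The overlap keeps a sentence that straddles a chunk boundary visible in both neighbouring chunks.

```python
def split_text(text: str, chunk_size: int = 1000, overlap: int = 200) -> list[str]:
    """Split text into fixed-size character chunks that overlap.

    Hypothetical helper: the sizes are illustrative defaults, not the
    project's actual settings.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window already reaches the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks
```

In a real pipeline the input would be the text extracted from the uploaded PDFs, and each returned chunk would be embedded and added to the vector store.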
## Technologies Used 🛠
- **Python** 🐍
- **Streamlit** (Web UI)
- **PyPDF2** (PDF text extraction)
- **FAISS** (Vector store for efficient text retrieval)
- **LangChain** (NLP & ML framework)
- **Claude Sonnet API** (AI-based conversation model)
- **OpenAI & Anthropic APIs** (LLM integration)
## How It Works 🤖
1️⃣ **User uploads PDF files** → The chatbot extracts text.
2️⃣ **Text is split into chunks** → Helps in better query matching.
3️⃣ **FAISS indexes the chunks** → Each chunk is stored as a vector so similar text can be found quickly.
4️⃣ **User asks a question** → The bot retrieves relevant text from PDFs.
5️⃣ **Claude Sonnet API processes the query** → AI generates the response.
6️⃣ **Answer is displayed** in the chat interface.
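
Steps 3 to 5 above can be sketched in plain Python. This is not the project's code: a toy bag-of-words vector and cosine similarity stand in for a real embedding model and the FAISS index, and the names `embed`, `cosine`, `retrieve`, and `build_prompt`, as well as the prompt wording, are hypothetical. The sketch only shows the shape of the retrieve-then-ask flow; the resulting prompt is what would be sent to the language model.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy 'embedding': lowercase word counts (stand-in for a real model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(question: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks most similar to the question (stand-in for FAISS search)."""
    q = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]


def build_prompt(question: str, context_chunks: list[str]) -> str:
    """Combine the retrieved chunks and the question into one prompt string."""
    context = "\n\n".join(context_chunks)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )


chunks = [
    "The invoice total is 420 dollars.",
    "Cats sleep most of the day.",
    "Payment is due within 30 days.",
]
question = "What is the invoice total?"
top = retrieve(question, chunks, k=1)
prompt = build_prompt(question, top)
```

Retrieving only the best-matching chunks keeps the prompt short and focused on the relevant parts of the uploaded PDFs rather than sending every page to the model.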
## Usage Instructions 📝
- Run the chatbot using **Streamlit**.
- Upload PDFs via the **sidebar menu**.
- Click **"Submit & Process"** to extract and store text.
- Ask questions in the input field, and the chatbot will **answer based on the content of the uploaded PDFs**.
## Screenshots 🖼
