https://github.com/defaultspace/smart-pdf-chat

🔍 Chat with your PDFs using local LLMs (DeepSeek, Mistral) and get visually highlighted answers – all offline.

# Role-based PDF QA Chatbot (LangChain + FAISS + Qwen2.5 via Ollama)

An intelligent chatbot, powered by a local LLM, that can take on different roles and answer questions based on the content of your PDFs.

## Technologies Used
- Python 3.10+
- Streamlit – web interface
- PyMuPDF (`fitz`) – PDF text extraction
- LangChain – chunking, retriever, prompt templates
- FAISS – vector database
- Ollama (`qwen2.5:latest`) – embeddings and LLM (see the pipeline sketch below)
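To show how these pieces fit together, here is a minimal sketch of what the embedding layer (`embedder.py`) might look like. The function names, the `langchain_community` wrappers, and the `vectordb` path are assumptions for illustration, not the repository's actual API:

```python
# Hypothetical sketch of embedder.py: build and query a FAISS index with
# Ollama embeddings. Names and defaults are illustrative assumptions.
from langchain_community.embeddings import OllamaEmbeddings
from langchain_community.vectorstores import FAISS

def build_index(chunks: list[str], persist_dir: str = "vectordb") -> FAISS:
    """Embed text chunks with the local Qwen2.5 model and persist a FAISS index."""
    embeddings = OllamaEmbeddings(model="qwen2.5:latest")
    index = FAISS.from_texts(chunks, embeddings)
    index.save_local(persist_dir)
    return index

def retrieve(index: FAISS, question: str, k: int = 4) -> list[str]:
    """Return the k stored chunks most similar to the question."""
    return [doc.page_content for doc in index.similarity_search(question, k=k)]
```

Persisting the index under `vectordb/` matches the folder structure below, so repeated questions against the same PDF can reuse the saved index instead of re-embedding everything.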

## Project Folder Structure
```bash
pdf-chatbot/
├── app.py            # Main application (Streamlit interface)
├── pdf_handler.py    # Extracts text from the PDF and splits it into chunks
├── embedder.py       # Embedding + FAISS database operations
├── chatbot.py        # Response generation with Qwen2.5 (via Ollama)
├── prompts.py        # Role-based prompt templates
├── roles.json        # Defines the list of roles
├── vectordb/         # FAISS files are stored here (created when the app runs)
├── data/             # User-uploaded PDFs (created when the app runs)
├── requirements.txt  # Required libraries
└── README.md         # Project description
```
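For a concrete idea of the extraction step, a minimal `pdf_handler.py` could look like the sketch below; the chunk size, overlap, and function names are illustrative assumptions, and the text-splitter import path varies by LangChain version:

```python
# Hypothetical sketch of pdf_handler.py: extract text with PyMuPDF and
# split it into overlapping chunks with LangChain.
import fitz  # PyMuPDF
from langchain.text_splitter import RecursiveCharacterTextSplitter

def extract_text(pdf_path: str) -> str:
    """Concatenate the plain text of every page in the PDF."""
    with fitz.open(pdf_path) as doc:
        return "\n".join(page.get_text() for page in doc)

def chunk_text(text: str, chunk_size: int = 1000, overlap: int = 200) -> list[str]:
    """Split raw text into overlapping chunks ready for embedding."""
    splitter = RecursiveCharacterTextSplitter(
        chunk_size=chunk_size, chunk_overlap=overlap
    )
    return splitter.split_text(text)
```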

## Installation Instructions

1. **Set up the Python environment:**
```bash
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
pip install -r requirements.txt
```

2. **Ensure Ollama is running:**
Make sure the `qwen2.5:latest` model is available in Ollama. If it is not installed yet:
```bash
ollama pull qwen2.5:latest
```
The Ollama service must be running in the background (it is usually started with `ollama serve` or by the Ollama desktop application). You do not need to run the model separately with `ollama run qwen2.5:latest` while the application is running.

To list installed models:
```bash
ollama list
```
You should see the `qwen2.5:latest` model (or a similar Qwen2.5 tag) in this list.
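As an optional sanity check from Python (assuming Ollama listens on its default port 11434), you can query its HTTP API directly:

```python
# List locally installed models via Ollama's HTTP API and look for qwen2.5.
import requests

tags = requests.get("http://localhost:11434/api/tags", timeout=5).json()
names = [model["name"] for model in tags.get("models", [])]
print("qwen2.5 installed:", any(name.startswith("qwen2.5") for name in names))
```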

3. **Start the application:**
From the project's root directory (`pdf-chatbot/`):
```bash
streamlit run app.py
```
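If you want to see how the interface might tie the modules together, here is a hypothetical sketch of `app.py`. It reuses the illustrative helper names from the sketches above (`extract_text`, `chunk_text`, `build_index`, `retrieve`), and `answer()` from `chatbot.py` is likewise an assumed name, not necessarily the repository's API:

```python
# Hypothetical sketch of app.py: wire the modules into a Streamlit UI.
import os

import streamlit as st

from pdf_handler import extract_text, chunk_text
from embedder import build_index, retrieve
from chatbot import answer  # assumed helper: generates a reply via Ollama

st.title("Role-based PDF QA Chatbot")

uploaded = st.file_uploader("Upload a PDF", type="pdf")
question = st.text_input("Ask a question about the document")

if uploaded and question:
    # Save the upload under data/, as described in the folder structure.
    os.makedirs("data", exist_ok=True)
    pdf_path = os.path.join("data", uploaded.name)
    with open(pdf_path, "wb") as f:
        f.write(uploaded.getbuffer())
    # Extract, chunk, embed, retrieve, and answer.
    chunks = chunk_text(extract_text(pdf_path))
    index = build_index(chunks)
    context = retrieve(index, question)
    st.write(answer(question, context))
```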

## Project Niche and Added Value

| Aspect | Description |
|--------|-------------|
| 📌 **Niche** | An AI that can role-play and provide document-based consultancy (see the prompt sketch below). |
| 🧠 **LLM usage** | A local model (Qwen2.5 via Ollama) gives privacy and offline-operation advantages. |
| 🛠️ **RAG architecture** | A data-driven approach built on LangChain + FAISS, unlike classic chatbots. |
| 💼 **CV contribution** | Combines LangChain, FAISS, Ollama, and PDF parsing in a real-world use case. |
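To make the role-playing niche concrete, the role-based prompting could be assembled roughly as in this sketch; the template wording and the structure of `roles.json` are assumptions, not the project's actual files:

```python
# Hypothetical sketch of the role-based prompting (prompts.py / chatbot.py).
import json

from langchain.prompts import PromptTemplate
from langchain_community.llms import Ollama

ROLE_PROMPT = PromptTemplate(
    input_variables=["role", "context", "question"],
    template=(
        "You are a {role}. Answer the question using only the context below.\n\n"
        "Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    ),
)

with open("roles.json", encoding="utf-8") as f:
    roles = json.load(f)  # assumed shape: {"lawyer": "careful legal advisor", ...}

llm = Ollama(model="qwen2.5:latest")
prompt = ROLE_PROMPT.format(
    role=roles["lawyer"], context="<retrieved chunks>", question="<user question>"
)
print(llm.invoke(prompt))
```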