- Host: GitHub
- URL: https://github.com/eziodevio/ai-knowledge-bot
- Owner: EzioDEVio
- Created: 2025-06-11T02:45:09.000Z (4 months ago)
- Default Branch: main
- Last Pushed: 2025-06-11T05:18:31.000Z (4 months ago)
- Last Synced: 2025-06-11T06:24:57.701Z (4 months ago)
- Topics: ai-chatbot, chat-with-pdf, chat-with-webpage, document-summarization, eziodevio, faiss, huggingface-embeddings, langchain, llama3, local-llm, offline-ai, ollama, rag, streamlit, vectorstore
- Language: Python
- Homepage:
- Size: 29.3 KB
- Stars: 1
- Watchers: 0
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
# AI Knowledge Bot
This is my own custom-built offline AI bot that lets you chat with PDFs and web pages using **local embeddings** and **local LLMs** like LLaMA 3.
I built it step by step with LangChain, FAISS, HuggingFace, and Ollama, without relying on OpenAI or DeepSeek APIs anymore (they kept failing or costing too much).
---
## Features
- Chat with uploaded PDF files
- Ask questions about a webpage URL
- Uses local HuggingFace embeddings (`all-MiniLM-L6-v2`)
- Powered by Ollama + LLaMA 3 (fully offline LLM)
- Built-in FAISS vectorstore
- PDF inline preview
- Built-in calculator + summarizer tools (via LangChain agents)
- Page citation support (know where each answer came from)
- Chat history viewer with download button (JSON)
- Simple Streamlit UI with dark/light mode toggle
- Footer credit: *Developed by EzioDEVio*

---
## Tech Stack
- `langchain`, `langchain-community`
- `sentence-transformers` for local embeddings
- `ollama` for local LLMs (`llama3`)
- `PyPDF2` for PDF parsing
- `FAISS` for vector indexing
- `Streamlit` for frontend

---
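The retrieval core of this stack pairs sentence-transformer embeddings with a FAISS index. Here is a stdlib-only toy sketch of that embed-then-search step; a crude bag-of-words vector stands in for `all-MiniLM-L6-v2`, and brute-force cosine similarity stands in for FAISS. This is illustrative only, not the repo's actual code:

```python
# Toy version of the embed-then-search step that HuggingFace embeddings
# + FAISS perform in the real app.
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Real code would call sentence-transformers' all-MiniLM-L6-v2 here.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

chunks = [
    "Ollama runs the LLaMA 3 model fully offline",
    "FAISS stores embedding vectors for similarity search",
]
query = embed("which model runs offline")
# Brute-force nearest neighbour; FAISS does this at scale with ANN indexes.
best = max(chunks, key=lambda c: cosine(query, embed(c)))
print(best)
```

The real pipeline works the same way: chunk the document, embed each chunk, then retrieve the chunks closest to the question and feed them to the LLM.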
## Setup Guide
### 1. Clone this repo
```bash
git clone https://github.com/EzioDEVio/ai-knowledge-bot.git
cd ai-knowledge-bot
```

---
### 2. Create and activate a virtualenv (optional but recommended)
```bash
python -m venv venv
.\venv\Scripts\activate   # Windows (on macOS/Linux: source venv/bin/activate)
```

---
### 3. Install dependencies
```bash
pip install -r requirements.txt
```

Make sure `sentence-transformers` is installed; it is needed for the local embeddings.
---
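The repo's `requirements.txt` isn't reproduced here; based on the tech stack above, a plausible minimal set would look something like this (an assumption; defer to the actual file in the repo):

```text
langchain
langchain-community
sentence-transformers
faiss-cpu
PyPDF2
streamlit
```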
### 4. Install Ollama (for local LLM)
Download and install from:
[https://ollama.com/download](https://ollama.com/download)

After installation, verify:
```bash
ollama --version
```

Then pull and run the model:
```bash
ollama run llama3
```

> This will download the LLaMA 3 model (approx. 4-8 GB). You can also try `mistral`, `codellama`, etc.

---
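Once `ollama run llama3` works, Ollama also serves a local REST API on port 11434, and clients typically POST to `/api/generate`. The sketch below only builds and prints the request payload, with no network call; the field names follow Ollama's documented API, the prompt is illustrative, and this is not necessarily how this repo talks to Ollama (LangChain's `ChatOllama` wraps this for you):

```python
import json

# The JSON body a client would POST to http://localhost:11434/api/generate.
payload = {
    "model": "llama3",
    "prompt": "Summarize: FAISS stores embedding vectors for fast search.",
    "stream": False,  # ask for one JSON response instead of a token stream
}
body = json.dumps(payload)
print(body)
```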
### 5. Run the app
```bash
streamlit run app.py
```

The app will open at:
```
http://localhost:8501
```

---
## Folder Structure
```
ai-knowledge-bot/
├── app.py                 # Main Streamlit UI
├── backend/
│   ├── pdf_loader.py      # PDF text extraction
│   ├── web_loader.py      # Webpage scraper
│   ├── vector_store.py    # Embedding + FAISS
│   └── qa_chain.py        # LLM QA logic (Ollama + tools)
├── .env                   # Not used anymore (was for API keys)
├── requirements.txt
└── README.md
```

---
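As a rough illustration of what `backend/web_loader.py` plausibly does (the actual implementation isn't shown here), stripping fetched HTML down to plain text can be done with the stdlib alone; the HTML below is inline so the sketch runs without a network call:

```python
from html.parser import HTMLParser

class TextExtractor(HTMLParser):
    """Collects the text nodes of an HTML document, skipping tags."""
    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.parts.append(text)

# Inline sample; real code would fetch the page for the user's URL first.
html = "<html><body><h1>Docs</h1><p>LLaMA 3 runs via Ollama.</p></body></html>"
parser = TextExtractor()
parser.feed(html)
page_text = " ".join(parser.parts)
print(page_text)  # Docs LLaMA 3 runs via Ollama.
```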
## Working Setup Summary
| Component        | Mode                                 |
| ---------------- | ------------------------------------ |
| Embeddings       | Local (`HuggingFace`)                |
| Vectorstore      | Local (`FAISS`)                      |
| LLM Response     | Local (`Ollama` + `llama3`)          |
| Internet Needed? | Only for first-time model download   |

---
## Why I Avoided OpenAI / DeepSeek
* **OpenAI** failed with `RateLimitError` and quota issues unless I added billing.
* **DeepSeek** embedding endpoints didn't work; only chat models are supported.

So I switched to:
* Local `HuggingFaceEmbeddings` for vectorization
* `ChatOllama` for fully offline AI answers

---
## Now Completed Features
* PDF upload + preview
* URL content QA
* Chat history with page citations
* Calculator + summarizer tools
* Footer attribution
* JSON export
* 100% offline functionality

---
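The JSON export of the chat history can be pictured like this; the record fields (`question`, `answer`, `page`) are assumptions for illustration, not necessarily the repo's actual schema:

```python
import json

# Hypothetical chat-history records; the real app builds these from the
# QA chain's answers and page citations.
history = [
    {"question": "Which LLM answers queries?", "answer": "LLaMA 3 via Ollama.", "page": 2},
    {"question": "Where are vectors stored?", "answer": "In a local FAISS index.", "page": 5},
]
exported = json.dumps(history, indent=2)   # what the download button would serve
restored = json.loads(exported)            # round-trips losslessly
print(restored[1]["answer"])
```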
## Run with Docker (Secure Production Mode)
Build and run the app securely using a **multi-stage Dockerfile**:
1. Build the container
```bash
docker build -t ai-knowledge-bot .
```
2. Run the container
Make sure Ollama is running on the host, then in a separate terminal (PowerShell on Windows) run:
```bash
docker run -p 8501:8501 \
  --add-host=host.docker.internal:host-gateway \
  ai-knowledge-bot
```
---
## Dockerfile Security Highlights
- Multi-stage build (separates dependencies from runtime)
- Minimal base image (`python:3.10-slim`)
- Non-root `appuser` by default
- `.env`, `venv`, logs excluded via `.dockerignore`
- Exposes only the necessary port (8501)
- Automatically starts the Streamlit app
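The exclusions listed above imply a `.dockerignore` along these lines (an assumed sketch; the repo's actual file may differ):

```text
.env
venv/
*.log
__pycache__/
```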
---
## License
MIT, feel free to fork, use, or improve it.
---
## Built by EzioDEVio
From concept to offline AI, all step by step.
---