https://github.com/sarmishra/python-rag-chatbot
Build an AI-powered chatbot using RAG (Retrieval-Augmented Generation) with Python, LangChain, ChromaDB, and OpenAI. It allows you to ask questions about documents, and receive context-aware, grounded responses with source citations.
chatgpt chatgpt-api chromadb embeddings langchain python3 rag rag-chatbot streamlit
Last synced: about 2 months ago
- Host: GitHub
- URL: https://github.com/sarmishra/python-rag-chatbot
- Owner: sarmishra
- Created: 2025-06-19T03:29:29.000Z (about 1 year ago)
- Default Branch: main
- Last Pushed: 2025-06-19T06:31:37.000Z (about 1 year ago)
- Last Synced: 2025-06-19T07:26:41.163Z (about 1 year ago)
- Topics: chatgpt, chatgpt-api, chromadb, embeddings, langchain, python3, rag, rag-chatbot, streamlit
- Language: Python
- Homepage:
- Size: 137 KB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# 🧠 RAG + LangChain: AI Chatbot for Docs
Build an AI-powered chatbot using **RAG (Retrieval-Augmented Generation)** with **Python**, **LangChain**, **ChromaDB**, **OpenAI**, and **Streamlit UI**. It allows you to ask questions about documents (such as PDFs, Markdown files, etc.) and receive context-aware, grounded responses with source citations.
---
## 🚀 Features
- Chat with your local files (Markdown, text, etc.)
- Uses RAG for more accurate, grounded answers
- Generate responses using OpenAI's GPT models
- Fast similarity search with ChromaDB
- Source chunk citation for transparency
- Run via command line or a clean Streamlit UI
---
## 🧰 Tech Stack
- **Python**
- **LangChain**
- **ChromaDB**
- **OpenAI (embeddings + LLM)**
- **Streamlit** (for web UI)
---
## 💻 Setup Instructions
### 1. Install Dependencies
Install the dependencies listed in `requirements.txt`:
```bash
pip install -r requirements.txt
```
### 2. Set Your OpenAI API Key
Create a `.env` file in the project root directory and set your API key in it.
```.env
OPENAI_API_KEY=your_api_key_here
```
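Scripts usually read this file at startup, often through the `python-dotenv` package. Whether this repository does so is an assumption. The sketch below uses only the standard library to show the idea, and `load_env_file` and `require_api_key` are hypothetical helper names, not functions from this project.

```python
import os
from pathlib import Path


def load_env_file(path=".env"):
    """Parse simple KEY=VALUE lines and copy them into os.environ.

    Hypothetical helper for illustration. It skips blank lines and
    comments, strips surrounding quotes, and does not handle escapes
    or variable expansion.
    """
    env_path = Path(path)
    if not env_path.exists():
        return {}
    loaded = {}
    for raw_line in env_path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        key = key.strip()
        value = value.strip().strip('"').strip("'")
        # Values already exported in the shell take precedence.
        os.environ.setdefault(key, value)
        loaded[key] = os.environ[key]
    return loaded


def require_api_key():
    """Fail early with a clear message if the key is missing or a placeholder."""
    key = os.environ.get("OPENAI_API_KEY", "")
    if not key or key == "your_api_key_here":
        raise RuntimeError("OPENAI_API_KEY is not set; add it to .env")
    return key
```

Calling `load_env_file()` followed by `require_api_key()` before any OpenAI request surfaces a missing key immediately instead of as an authentication error later.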
### 3. Create database
Build the Chroma vector database by running:
```bash
python create_database.py
```
---
## ▶️ Usage
### Option 1: Command Line Interface
Ask questions from the terminal using:
```bash
python query_data.py "How does Alice meet the Mad Hatter?"
```
- Uses similarity search with a relevance threshold (>= 0.7)
- Returns a formatted response and the document sources
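
The relevance threshold can be pictured as a gate in front of the LLM call: if the best match scores below 0.7, there is nothing trustworthy to answer from. The sketch below shows such a gate using plain cosine similarity on toy vectors. The function names, the data layout, and the choice to gate on the best score only are assumptions for illustration; the actual script queries ChromaDB through LangChain, and its scores may be computed differently.

```python
import math

RELEVANCE_THRESHOLD = 0.7  # value quoted in this README


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_matches(query_vec, indexed_chunks, k=3, threshold=RELEVANCE_THRESHOLD):
    """Return up to k (score, source) pairs, best first.

    Returns [] when the best score is below the threshold. Lower-ranked
    results are not filtered individually. `indexed_chunks` is a list of
    (source, vector) pairs, a hypothetical layout for this sketch.
    """
    scored = sorted(
        ((cosine_similarity(query_vec, vec), source) for source, vec in indexed_chunks),
        reverse=True,
    )
    if not scored or scored[0][0] < threshold:
        return []
    return scored[:k]


# Toy 3-dimensional vectors standing in for real embeddings.
chunks = [
    ("alice.md#12", [0.9, 0.1, 0.0]),
    ("alice.md#40", [0.0, 1.0, 0.0]),
]
hits = top_matches([1.0, 0.0, 0.0], chunks)    # best match clears the threshold
misses = top_matches([0.0, 0.0, 1.0], chunks)  # nothing relevant, so empty list
```

An empty result is the cue to report that no matching context was found rather than sending an ungrounded prompt to the model.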
### Option 2: Streamlit Web App
Launch the UI:
```bash
streamlit run app.py
```
Then open http://localhost:8501 in your browser.
- Enter a question in the input box
- See the AI response and source documents instantly
---
## 🛠️ How It Works
1. Load Documents: Source documents are loaded and preprocessed for storage in the Chroma vector DB.
2. Chunking: Large texts are split into ~1000-character chunks.
3. Embedding: Uses OpenAI embeddings to convert text into vectors.
4. Search: Uses vector similarity to retrieve relevant chunks.
5. Prompting: Combines retrieved context and question into a prompt.
6. LLM Response: Uses OpenAI Chat API to generate grounded answers.
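
Steps 2 and 5 above can be sketched in plain Python. This is a minimal illustration, not the repository's code: the project most likely relies on a LangChain text splitter and prompt template, and the 100-character overlap, the template wording, and the names `split_text` and `build_prompt` are all assumptions made here.

```python
def split_text(text, chunk_size=1000, overlap=100):
    """Split text into chunks of at most chunk_size characters.

    Consecutive chunks share `overlap` characters, so text near a
    boundary appears in both neighbouring chunks.
    """
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last chunk already reaches the end of the text
        start += step
    return chunks


# Hypothetical template wording; the real prompt may differ.
PROMPT_TEMPLATE = """Answer the question based only on the following context:

{context}

---

Answer the question based on the above context: {question}
"""


def build_prompt(context_chunks, question):
    """Join the retrieved chunks and the question into a single prompt string."""
    context = "\n\n---\n\n".join(context_chunks)
    return PROMPT_TEMPLATE.format(context=context, question=question)


document = "".join(str(i % 10) for i in range(2500))  # 2500-character toy text
pieces = split_text(document)
prompt = build_prompt(pieces[:2], "How does Alice meet the Mad Hatter?")
```

Restricting the model to the supplied context in the prompt is what keeps answers grounded in the retrieved chunks.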
---
## How to create an OpenAI API Key
> You need an OpenAI account to generate an API key: [Create a new secret key](https://platform.openai.com/api-keys). The OpenAI API is a paid service, so you will need to make a payment before the key works.
---
## 🖼️ Preview
