https://github.com/haswinai/rag_chatbot


# 📚 RAG Chatbot – PDF Q&A with LangChain + Groq
A Retrieval-Augmented Generation (RAG) chatbot that answers questions from uploaded PDFs. Built using LangChain, Groq LLMs (LLaMA 3), FAISS, HuggingFace Embeddings, and Streamlit.

# 🧠 What This Project Does
🗂 Uploads a PDF document

📖 Reads and chunks the content into manageable parts

🔢 Converts chunks into embeddings using HuggingFace models

💾 Stores embeddings in a FAISS vector store

❓ Accepts user queries and retrieves top-matching chunks

🧠 Sends the retrieved chunks as context to Groq's LLaMA 3 model to generate an answer

💬 Displays accurate, document-based responses in real time
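
The chunking step can be illustrated with a small standalone sketch. This is not the code from `utils.py`: the function name `chunk_text`, the character-based window, and the default sizes are all assumptions chosen for illustration, and the project may well use a LangChain text splitter instead.

```python
from __future__ import annotations


def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into fixed-size character windows that overlap.

    Hypothetical helper: the overlap keeps sentences that straddle a
    boundary visible in two neighbouring chunks.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap  # how far the window advances each time
    chunks: list[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the window already reached the end of the text
    return chunks


print(chunk_text("abcdefghij", chunk_size=4, overlap=1))
# ['abcd', 'defg', 'ghij']
```

Larger chunks give the model more context per retrieved hit, while smaller chunks make retrieval more precise; the values above are placeholders, not the project's settings.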

# πŸ” Workflow

PDF Upload → Text Extraction → Chunking → Embedding (HuggingFace) → FAISS Vector Store

User Query → Embedding → Search FAISS for Relevant Chunks → Generate Answer with Groq LLM
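
To make the query path concrete, here is a minimal sketch of the "search for relevant chunks" step. It uses a brute-force cosine-similarity scan in plain Python as a stand-in for the FAISS index; the function names, the toy three-dimensional vectors, and the choice of cosine similarity are assumptions for illustration (the README does not state which distance metric the project's index uses, and real embeddings from all-MiniLM-L6-v2 have far more dimensions).

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat an all-zero vector as matching nothing
    return dot / (norm_a * norm_b)


def top_k_chunks(query_vec, chunk_vecs, chunks, k=2):
    """Return the k chunks whose vectors are most similar to the query."""
    scored = [
        (cosine_similarity(query_vec, vec), chunk)
        for vec, chunk in zip(chunk_vecs, chunks)
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [chunk for _, chunk in scored[:k]]


# Toy data: hand-written vectors, not real embeddings.
chunks = ["chunk about pricing", "chunk about security", "chunk about discounts"]
chunk_vecs = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.9, 0.1, 0.0]]
query_vec = [1.0, 0.0, 0.0]

print(top_k_chunks(query_vec, chunk_vecs, chunks, k=2))
# ['chunk about pricing', 'chunk about discounts']
```

A vector store such as FAISS performs the same kind of nearest-neighbour lookup, but with index structures that avoid scanning every vector.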
# 🧰 Tech Stack

| Tool | Role |
| --- | --- |
| 🦜 LangChain | LLM and retrieval framework |
| 🧠 Groq | LLM inference (e.g. llama3-70b-8192) |
| 🧠 HuggingFace | Embedding model (all-MiniLM-L6-v2) |
| 📚 FAISS | Vector search for chunk similarity |
| 📑 PyPDF2 | PDF text extraction |
| ⚡ Streamlit | UI to upload and chat |
| 🔐 Dotenv | Secure API key handling |

πŸ“ Project Structure

rag-chatbot/
β”œβ”€β”€ app.py # Streamlit UI
β”œβ”€β”€ rag_chain.py # Retrieval and LLM logic
β”œβ”€β”€ utils.py # PDF reading and chunking
β”œβ”€β”€ requirements.txt # Required packages
β”œβ”€β”€ .env # API keys (not pushed)
β”œβ”€β”€ data/ # Uploaded PDFs (ignored)
└── README.md # You are here
# 🛠 Installation
Clone this repo:

    git clone https://github.com/HaswinAI/RAG-chatbot.git
    cd RAG-chatbot

Create a virtual environment:

    python -m venv chatbot
    source chatbot/bin/activate  # or chatbot\Scripts\activate on Windows

Install requirements:

    pip install -r requirements.txt

Set up .env:

    GROQ_API_KEY=your_actual_groq_key_here
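
As a sketch of how the key might be read at startup, the snippet below checks the environment and fails early with a clear message. The helper name `get_groq_api_key` is hypothetical and not taken from the repository; it also assumes that something (for example `load_dotenv()` from python-dotenv, which the Tech Stack suggests the project uses) has already copied the `.env` values into the process environment.

```python
import os

# The placeholder value shown in the .env example above.
PLACEHOLDER = "your_actual_groq_key_here"


def get_groq_api_key(env=None) -> str:
    """Return the Groq API key, or raise if it is missing or a placeholder.

    `env` defaults to os.environ; passing a plain dict makes the function
    easy to exercise without touching the real environment.
    """
    env = os.environ if env is None else env
    key = env.get("GROQ_API_KEY", "").strip()
    if not key or key == PLACEHOLDER:
        raise RuntimeError(
            "GROQ_API_KEY is missing or still set to the placeholder; "
            "add your real key to the .env file."
        )
    return key


# Example with an explicit mapping (the key shown is a made-up value):
print(get_groq_api_key({"GROQ_API_KEY": "gsk_example"}))
```

Failing at startup rather than on the first chat message makes a misconfigured `.env` much easier to diagnose.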
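# ▶️ Run the App

    streamlit run app.py

Then open http://localhost:8501 in your browser. This is Streamlit's default address; the terminal output shows the exact URL if the port differs.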

# 💬 Example Use
Upload a whitepaper or article PDF

Ask: "What are the key takeaways?"

Get an instant, AI-generated summary based on the real content
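
Behind a question like the one above, the retrieved chunks and the user's question have to be combined into a single prompt for the LLM. The sketch below shows one plausible way to do that; the function name `build_prompt` and the prompt wording are assumptions for illustration, not the template used in `rag_chain.py`.

```python
from __future__ import annotations


def build_prompt(question: str, chunks: list[str]) -> str:
    """Assemble a grounded prompt from retrieved chunks and a question."""
    # Number each chunk so the model (and a reader) can tell them apart.
    context = "\n\n".join(
        f"[{i}] {chunk.strip()}" for i, chunk in enumerate(chunks, start=1)
    )
    return (
        "Answer the question using only the context below. "
        "If the context does not contain the answer, say so.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question.strip()}\n"
        "Answer:"
    )


# Placeholder chunk text, standing in for passages retrieved from a PDF.
print(build_prompt("What are the key takeaways?", ["First passage.", "Second passage."]))
```

Instructing the model to rely only on the supplied context is what keeps answers tied to the uploaded document rather than to the model's general knowledge.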

# 📌 Notes
- PDF contents are not stored permanently
- .env, .pdf files, and data/ are in .gitignore
- Compatible with models like:
  - llama3-70b-8192
  - llama3-8b-8192
  - gemma-7b-it
  - mixtral-8x22b

# 📄 License
This project is released under the MIT License.

# ✨ Acknowledgements
- LangChain
- Groq
- FAISS by Meta
- HuggingFace Sentence Transformers
- Streamlit

# Creator - HASWIN