https://github.com/mohd-faizy/rag-deepseek
Efficiently search and retrieve information from PDF documents using a Retrieval-Augmented Generation (RAG) approach. This project leverages DeepSeek-R1 (1.5B) for advanced language understanding, FAISS for high-speed vector search, and Hugging Face's ecosystem for enhanced NLP capabilities. With an intuitive Streamlit interface and Ollama for mode
- Host: GitHub
- URL: https://github.com/mohd-faizy/rag-deepseek
- Owner: mohd-faizy
- License: mit
- Created: 2025-01-31T14:16:58.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2025-03-09T20:12:58.000Z (over 1 year ago)
- Last Synced: 2025-03-09T21:17:52.504Z (over 1 year ago)
- Topics: deepseek-r1, faiss-cpu, faiss-vector-database, huggingface, langchain, ollama, rag, retrival-augmented-generation, streamlit
- Language: Python
- Homepage:
- Size: 405 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# RAG PDF Assistant: Powered by `DeepSeek-R1` (1.5B), `Ollama`, `Streamlit` & `FAISS`








---
## Live Demo
**Click below to try it now:**
[Open the live demo](https://rag-deepseek-4xbslhkxhxxdwtmxs7ceey.streamlit.app)
## **Overview**
**RAG PDF Assistant** is an AI-powered **Retrieval-Augmented Generation (RAG)** chatbot that enables **intelligent PDF document search and retrieval.** It combines:
- **DeepSeek-R1 (1.5B)**: Advanced AI-powered language model for **accurate responses.**
- **FAISS**: Fast vector search for **efficient document retrieval.**
- **Ollama**: Lightweight model serving and **seamless inference.**
- **LangChain**: Modular AI framework for **query execution and reasoning.**
- **Streamlit**: Intuitive and interactive **web-based UI.**
### **Use Cases**
- **Quickly search** through large PDF documents.
- **Summarize reports, research papers, contracts,** and more.
- **Extract relevant information** with AI-driven accuracy.
- **Ask natural language questions** and get concise answers.
---
## **Demo Screenshots**
### **UI Preview**
[Image: UI preview]
### **Application Workflow**
[Image: application workflow]
---
## **Installation & Setup**
### **Prerequisites**
Ensure you have the following installed:
- **Python 3.9+**
- **pip** (Python package manager)
- **Git** (for cloning the repository)
- **Ollama** (for serving the model locally; Step 3 starts it with `ollama serve`)
### **Step 1: Clone the Repository**
```bash
git clone https://github.com/mohd-faizy/RAG-DeepSeek.git
cd RAG-DeepSeek
```
### **Step 2: Install Dependencies**
```bash
pip install -r requirements.txt
```
### **Step 3: Run the Application**
1. Start Ollama service:
```bash
ollama serve
```
2. In a separate terminal, launch the chat interface:
```bash
streamlit run app/main.py
```
Streamlit opens the application in your browser, by default at `http://localhost:8501/`. The Ollama API listens separately on `http://localhost:11434/`.
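If the chat interface starts but never answers, it can help to confirm that Ollama is reachable and that the model has been downloaded. The following is a minimal sketch, not part of the repository: it assumes Ollama runs on its default address and that the model tag is `deepseek-r1:1.5b` (an assumption; check `ollama list` for the tag on your machine). The helper names are hypothetical.

```python
import json
import urllib.error
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # Ollama's default address (assumed unchanged)
MODEL_TAG = "deepseek-r1:1.5b"         # assumed tag; verify with `ollama list`


def model_names(tags_payload: dict) -> list[str]:
    """Extract model names from the JSON returned by GET /api/tags."""
    return [m.get("name", "") for m in tags_payload.get("models", [])]


def ollama_has_model(base_url: str = OLLAMA_URL, tag: str = MODEL_TAG) -> bool:
    """Return True if the server answers and lists the requested model."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=3) as resp:
            payload = json.load(resp)
    except (urllib.error.URLError, OSError, ValueError):
        return False  # server not running, unreachable, or returned non-JSON
    return tag in model_names(payload)


if __name__ == "__main__":
    if ollama_has_model():
        print("Ollama is up and the model is available.")
    else:
        print(f"Ollama is not reachable or {MODEL_TAG} is missing; "
              f"try `ollama pull {MODEL_TAG}`.")
```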
---
## **Directory Structure**
```plaintext
RAG-DeepSeek/
├── app/
│   ├── __init__.py       # Python package initialization
│   ├── main.py           # Main Streamlit application file
│   └── utils.py          # Utility functions for PDF processing, embeddings, retrieval
├── assets/               # Static files like images, CSS, etc.
├── requirements.txt      # Python dependencies
├── .gitignore            # Files ignored by Git
└── README.md             # Project documentation
```
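To give a sense of what the PDF-processing part of `utils.py` might involve, here is a minimal sketch of text extraction with pdfplumber. It is an illustration only and does not reproduce the repository's code; the function names `join_pages` and `extract_pdf_text` are hypothetical.

```python
from typing import Iterable, Optional


def join_pages(page_texts: Iterable[Optional[str]]) -> str:
    """Join per-page text, skipping pages where no text was found."""
    cleaned = (t.strip() for t in page_texts if t and t.strip())
    return "\n\n".join(cleaned)


def extract_pdf_text(path: str) -> str:
    """Read every page of a PDF with pdfplumber and return one string."""
    import pdfplumber  # third-party; imported here so join_pages works without it

    with pdfplumber.open(path) as pdf:
        # extract_text() can return None, for example on image-only pages
        return join_pages(page.extract_text() for page in pdf.pages)
```

Scanned PDFs without a text layer would come back empty from this sketch; they would need OCR, which is outside what the README describes.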
---
## **How It Works**
1. **Upload a PDF**: the app extracts and indexes the content.
2. **Ask a question**: the system searches for the most relevant passages.
3. **Get an answer**: the model responds based on the retrieved document content.

- **FAISS** provides **fast, efficient document retrieval.**
- **DeepSeek-R1** generates **context-aware answers** from the retrieved passages.
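The index-then-retrieve steps above can be sketched in plain Python. This is a toy stand-in under stated assumptions, not the project's implementation: word counts replace the neural embeddings, and a brute-force cosine-similarity scan replaces the FAISS index. All function names and the chunk sizes are hypothetical.

```python
import math
import re
from collections import Counter


def chunk_text(text: str, size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into overlapping character windows."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be non-negative and smaller than size")
    step = size - overlap
    return [text[i:i + size] for i in range(0, len(text), step)]


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (a real app would use a neural model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def top_k(question: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks most similar to the question (brute force, no index)."""
    q = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]


if __name__ == "__main__":
    chunks = [
        "FAISS performs vector similarity search.",
        "Streamlit renders the web interface.",
        "Ollama serves the language model locally.",
    ]
    print(top_k("Which tool does vector search?", chunks, k=1))
    # prints ['FAISS performs vector similarity search.']
```

A vector index such as FAISS serves the same purpose as `top_k` here but avoids comparing the question against every chunk, which matters once a document yields thousands of chunks.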
---
## **Technologies Used**
| **Technology** | **Purpose** |
|---------------|-------------|
| **DeepSeek-R1 (1.5B)** | Language model for intelligent responses |
| **Ollama** | Model serving and inference |
| **FAISS** | Vector search for document retrieval |
| **LangChain** | AI-driven reasoning and query handling |
| **Streamlit** | User-friendly web interface |
| **PDFPlumber** | Extracting text from PDFs |
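To show how the language model and the retrieved passages meet, here is a minimal sketch of the generation step. It calls Ollama's REST endpoint `/api/generate` directly with the standard library rather than going through LangChain as the project does, and it assumes the default port and the model tag `deepseek-r1:1.5b`. The prompt wording and the function names are hypothetical.

```python
import json
import urllib.request

OLLAMA_GENERATE_URL = "http://localhost:11434/api/generate"  # default Ollama endpoint


def build_prompt(question: str, passages: list[str]) -> str:
    """Place retrieved passages ahead of the question so the model answers from them."""
    context = "\n\n".join(f"[{i}] {p}" for i, p in enumerate(passages, start=1))
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )


def ask_ollama(prompt: str, model: str = "deepseek-r1:1.5b") -> str:
    """Send one non-streaming request to a locally running Ollama server."""
    body = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode("utf-8")
    request = urllib.request.Request(
        OLLAMA_GENERATE_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=120) as resp:
        return json.load(resp)["response"]
```

With `ollama serve` running, `ask_ollama(build_prompt(question, passages))` would return the model's answer as a string. Setting `"stream": False` asks for a single JSON object instead of a stream of partial responses, which keeps the sketch short.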
---
### **Steps to Contribute**
1. Fork the repository
2. Create a new feature branch (`git checkout -b feature-name`)
3. Commit your changes (`git commit -m "Added new feature"`)
4. Push to your fork (`git push origin feature-name`)
5. Open a pull request
---
## License
This project is licensed under the **MIT License**. See the [LICENSE](LICENSE) file for details.
## Support
If you find this repository helpful, show your support by starring it! For questions or feedback, reach out on [Twitter (X)](https://twitter.com/F4izy).
## Connect with me
If you have questions or feedback, feel free to reach out!
- [Twitter][twitter]
- [LinkedIn][linkedin]
- [Portfolio][Portfolio]

[twitter]: https://twitter.com/F4izy
[linkedin]: https://www.linkedin.com/in/mohd-faizy/
[Portfolio]: https://ai.stackexchange.com/users/36737/faizy?tab=profile
---