https://github.com/mohd-faizy/rag-deepseek

Efficiently search and retrieve information from PDF documents using a Retrieval-Augmented Generation (RAG) approach. This project leverages DeepSeek-R1 (1.5B) for advanced language understanding, FAISS for high-speed vector search, and Hugging Face’s ecosystem for enhanced NLP capabilities. With an intuitive Streamlit interface and Ollama for mode

# 🤖 RAG PDF Assistant: Powered by 🏆 `DeepSeek-R1` (~1.5B), `Ollama`, `Streamlit` & `FAISS`

![author](https://img.shields.io/badge/author-mohd--faizy-red)
![Python 3.9+](https://img.shields.io/badge/Python-3.9%2B-3776AB?logo=python&logoColor=white)
![Streamlit](https://img.shields.io/badge/Streamlit-FF4B4B?logo=streamlit&logoColor=white)
![Ollama](https://img.shields.io/badge/Ollama-0C0D0E?logo=ollama&logoColor=white)
![PDFPlumber](https://img.shields.io/badge/PDFPlumber-FF0000?logo=pdf&logoColor=white)
![LangChain](https://img.shields.io/badge/LangChain-00ADD8?logo=langchain&logoColor=white)
![HuggingFace](https://img.shields.io/badge/HuggingFace-FFD43B?logo=huggingface&logoColor=black)
![FAISS](https://img.shields.io/badge/FAISS-00A98F?logo=faiss&logoColor=white)

---

## 🚀🔥 Live Demo 🌍

👉 **Click below to try it now:**

[![🚀 Try it Now](https://img.shields.io/badge/Try%20Live-Click%20Here-28a745?style=for-the-badge)](https://rag-deepseek-4xbslhkxhxxdwtmxs7ceey.streamlit.app)

## 🌟 **Overview**

**RAG PDF Assistant** is an AI-powered **Retrieval-Augmented Generation (RAG)** chatbot that enables **intelligent PDF document search and retrieval.** It combines:

- ✅ **DeepSeek-R1 (1.5B)** – Advanced AI-powered language model for **accurate responses.**
- ✅ **FAISS** – Fast vector search for **efficient document retrieval.**
- ✅ **Ollama** – Lightweight model serving and **seamless inference.**
- ✅ **LangChain** – Modular AI framework for **query execution and reasoning.**
- ✅ **Streamlit** – Intuitive and interactive **web-based UI.**

### 🎯 **Use Cases**

πŸ” **Quickly search** through large PDF documents.
πŸ“„ **Summarize reports, research papers, contracts,** and more.
πŸ“˜ **Extract relevant information** with AI-driven accuracy.
πŸ€– **Ask natural language questions** and get concise answers.

---

## 🚀 **Demo Screenshots**

### **📌 UI Preview**

![Demo](https://github.com/mohd-faizy/RAG-DeepSeek/blob/main/assets/rag-pdf-retv.png?raw=true)

### **📌 Application Workflow**

![Workflow](https://github.com/mohd-faizy/RAG-DeepSeek/blob/main/assets/RAG-app-flow.png?raw=true)

---

## 🛠️ **Installation & Setup**

### **🔧 Prerequisites**

Ensure you have the following installed:

- **Python 3.9+**
- **pip** (Python package manager)
- **Git** (for cloning the repository)

### **📥 Step 1: Clone the Repository**

```bash
git clone https://github.com/mohd-faizy/RAG-DeepSeek.git
cd RAG-DeepSeek
```

### **📦 Step 2: Install Dependencies**

```bash
pip install -r requirements.txt
```

### **⚙️ Step 3: Run the Application**

1. Start the Ollama service:

```bash
ollama serve
```

2. In a separate terminal, launch the chat interface:

```bash
streamlit run app/main.py
```

The application will open in your browser at `http://localhost:8501/` (Streamlit's default port). The Ollama service itself listens on `http://localhost:11434/`.
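
Before launching the UI, it can help to confirm that Ollama is reachable and that a DeepSeek-R1 model has been pulled. The sketch below is not part of the repository. It assumes Ollama's default address `http://localhost:11434`, its `/api/tags` endpoint for listing local models, and the model tag `deepseek-r1:1.5b` (the tag the app actually requests may differ). The helper names `has_model` and `check_ollama` are hypothetical.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # assumption: Ollama's default address
MODEL_TAG = "deepseek-r1:1.5b"         # assumption: tag for DeepSeek-R1 (1.5B)


def has_model(tags_response: dict, tag: str) -> bool:
    """Return True if `tag` appears in a parsed /api/tags response."""
    models = tags_response.get("models", [])
    return any(m.get("name") == tag for m in models)


def check_ollama(base_url: str = OLLAMA_URL, tag: str = MODEL_TAG) -> str:
    """Describe whether the server is up and whether the model is available."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=5) as resp:
            tags = json.load(resp)
    except (OSError, ValueError) as exc:
        # OSError covers connection failures; ValueError covers a non-JSON reply.
        return f"Ollama is not reachable at {base_url}: {exc}"
    if has_model(tags, tag):
        return f"Ollama is running and {tag} is available."
    return f"Ollama is running, but {tag} was not found; try `ollama pull {tag}`."
```

Calling `print(check_ollama())` in a Python shell after `ollama serve` reports which of the three situations applies.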

---

## 📁 **Directory Structure**

```plaintext
RAG-DeepSeek/
├── app/
│   ├── __init__.py      # Python package initialization
│   ├── main.py          # Main Streamlit application file
│   └── utils.py         # Utility functions for PDF processing, embeddings, retrieval
├── assets/              # Static files like images, CSS, etc.
├── requirements.txt     # Python dependencies
├── .gitignore           # Files ignored by Git
└── README.md            # Project documentation
```

---

## 🧠 **How It Works**

1️⃣ **Upload a PDF** β†’ The AI extracts and indexes the content.
2️⃣ **Ask a question** β†’ The system searches for the most relevant passages.
3️⃣ **AI answers your query** β†’ Based on retrieved document content.

- 🔹 **FAISS** provides **fast, efficient retrieval** of the most relevant passages.
- 🔹 **DeepSeek-R1** generates **context-aware answers** from those passages.
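
The upload-and-search steps above can be sketched in plain Python. This is a toy stand-in, not the project's code: a bag-of-words count vector replaces whatever embedding model the app uses, and a brute-force cosine-similarity scan replaces the FAISS index. All function names and chunk sizes here are illustrative assumptions.

```python
import math
import re
from collections import Counter


def chunk_text(text: str, size: int = 40, overlap: int = 10) -> list[str]:
    """Split text into windows of `size` words that overlap by `overlap` words."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be non-negative and smaller than size")
    words = text.split()
    chunks = []
    for start in range(0, len(words), size - overlap):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the last window already reaches the end of the text
    return chunks


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (a real app would use dense vectors)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_index(chunks: list[str]) -> list[tuple[str, Counter]]:
    """Pair every chunk with its vector (the role FAISS plays in the app)."""
    return [(chunk, embed(chunk)) for chunk in chunks]


def search(index: list[tuple[str, Counter]], query: str, k: int = 2) -> list[str]:
    """Return the k chunks most similar to the query."""
    q = embed(query)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:k]]


# Hypothetical document text standing in for the extracted PDF content.
document = (
    "FAISS builds an index over vectors for fast similarity search. "
    "Streamlit renders the chat interface in the browser. "
    "Ollama serves the DeepSeek-R1 model on the local machine."
)
index = build_index(chunk_text(document, size=12, overlap=2))
print(search(index, "Which tool serves the model?", k=1))
```

Overlapping windows are a common way to avoid cutting a sentence's context at a chunk boundary. A dense embedding model plus FAISS replaces the linear scan with a much faster nearest-neighbour lookup once the document grows.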

---

## 🔗 **Technologies Used**

| **Technology** | **Purpose** |
|---------------|-------------|
| **DeepSeek-R1 (1.5B)** | Language model for intelligent responses |
| **Ollama** | Model serving and inference |
| **FAISS** | Vector search for document retrieval |
| **LangChain** | AI-driven reasoning and query handling |
| **Streamlit** | User-friendly web interface |
| **PDFPlumber** | Extracting text from PDFs |

---

### **🀝 Steps to Contribute**

1. Fork the repository
2. Create a new feature branch (`git checkout -b feature-name`)
3. Commit your changes (`git commit -m "Added new feature"`)
4. Push to your fork (`git push origin feature-name`)
5. Open a pull request

---

## ⚖ ➤ License

This project is licensed under the **MIT License**. See the [LICENSE](LICENSE) file for details.

## ❤️ Support

If you find this repository helpful, show your support by starring it! For questions or feedback, reach out on [Twitter(`X`)](https://twitter.com/F4izy).

## 🔗 Connect with me

➤ If you have questions or feedback, feel free to reach out!

[Twitter][twitter]
[LinkedIn][linkedin]
[Portfolio][Portfolio]

[twitter]: https://twitter.com/F4izy
[linkedin]: https://www.linkedin.com/in/mohd-faizy/
[Portfolio]: https://ai.stackexchange.com/users/36737/faizy?tab=profile

---