https://github.com/analyticace/retrieval-augmented-generation-workshop

This workshop demonstrates how to build an intelligent document Q&A system using Retrieval-Augmented Generation (RAG).
https://github.com/analyticace/retrieval-augmented-generation-workshop

deepseek llm-inference rag-chatbot

Last synced: about 1 year ago
JSON representation

This workshop demonstrates how to build an intelligent document Q&A system using Retrieval-Augmented Generation (RAG).

Host: GitHub
URL: https://github.com/analyticace/retrieval-augmented-generation-workshop
Owner: AnalyticAce
License: mit
Created: 2025-04-14T15:34:44.000Z (about 1 year ago)
Default Branch: main
Last Pushed: 2025-04-26T07:46:54.000Z (about 1 year ago)
Last Synced: 2025-04-26T08:32:00.491Z (about 1 year ago)
Topics: deepseek, llm-inference, rag-chatbot
Language: Jupyter Notebook
Homepage:
Size: 5.84 MB
Stars: 0
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE

Awesome Lists containing this project

README

# Retrieval-Augmented Generation Workshop

This workshop demonstrates how to build an intelligent document Q&A system using Retrieval-Augmented Generation (RAG).

## What is RAG and Why It's Important ?
RAG is a framework that combines information retrieval with text generation to create AI systems that produce more accurate, factual, and contextually relevant responses.

### How RAG Works
1. Retrieval Phase: When a user asks a question, RAG first searches through a knowledge base (documents, databases, etc.) to find the most relevant information.

2. Augmentation Phase: The retrieved information is injected as context into the prompt sent to a large language model (LLM).

3. Generation Phase: The LLM generates a response based on both its pre-trained knowledge and the specific retrieved context.

![Diagram](assets/diagram1.png)

## Why RAG is Important

1. Improved Accuracy: By grounding responses in specific source material, RAG reduces hallucinations and factual errors common in pure LLM responses.

2. Up-to-date Information: RAG can access the latest information from your documents, overcoming the limitation of LLMs being trained on older data.

3. Domain Specificity: Enables LLMs to answer questions about specialized or proprietary information not present in their training data.

4. Source Attribution: RAG can provide references to the source documents, increasing transparency and trustworthiness.

5. Lower Costs: Smaller, more efficient models can be used since they're augmented with external knowledge rather than needing to memorize everything.

6. Privacy & Control: Organizations can keep sensitive information in their own knowledge base rather than sending it to external LLM providers.

RAG represents an important evolution in AI systems, creating a bridge between traditional search and modern generative AI to deliver more reliable and useful responses.

## Workshop/Project Overview

This project implements a RAG-based system that allows users to:
1. Upload PDF documents
2. Process them into searchable vector stores
3. Ask questions about the documents using natural language
4. Receive contextually relevant answers based on the document content

## Features

- **PDF Document Processing**: Upload and process PDF documents
- **Vector Embedding**: Convert document chunks into embeddings using Hugging Face models
- **Intelligent Retrieval**: Find the most relevant document sections for user queries
- **Natural Language Responses**: Generate human-like responses using Large Language Models
- **Interactive UI**: Easy-to-use Streamlit interface for document uploading and querying

## Architecture

The project consists of three main components:

1. **Core RAG Engine** (core.py):
- Document loading and chunking
- Vector embedding generation
- Retrieval system
- LLM integration

2. **Web Interface** (streamlit_app.py):
- Document upload interface
- Chat interface for asking questions
- Session state management

3. **Colab Notebook** (RAG_(Retrieval_Augmented_Generation).ipynb):
- Step-by-step tutorial implementation
- Detailed explanations of each RAG component
- Executable demonstration

## Getting Started

### Prerequisites

- Python 3.13+
- HuggingFace account and API token
- DeepSeek API key (or compatible LLM API)

### Installation

1. Clone the repository:
```
git clone https://github.com/AnalyticAce/Retrieval-Augmented-Generation-Workshop.git
cd Retrieval-Augmented-Generation-Workshop
```

2. Install dependencies:
```
pip install -r requirements.txt
```

3. Create a .env file with your API keys:
```
HF_TOKEN="your_huggingface_token"
DEEPSEEK_API_KEY="your_deepseek_api_key"
DEEPSEEK_API_BASE="https://api.deepseek.com/v1"
```

### Running the Application

Start the Streamlit web interface:
```
streamlit run streamlit_app.py
```

## Usage

1. Open the web interface in your browser (typically http://localhost:8501)
2. Upload a PDF document using the sidebar
3. Click "Process Document" to analyze the document
4. Ask questions in the chat interface
5. View the AI-generated answers based on document content

### Streamlit UI
![UI](assets/interphase.png)

## Technical Details

- **Document Chunking**: Uses RecursiveCharacterTextSplitter with chunk size of 10,000 characters
- **Embeddings**: Generated using Hugging Face's "all-MiniLM-L6-v2" model
- **Vector Database**: FAISS for efficient similarity search
- **LLM**: DeepSeek Chat model (can be configured to use other LLMs)
- **Framework**: Built on LangChain for component orchestration

## Workshop Content

The colab notebook.ipynb) walks through:

1. Environment setup and dependency installation
2. Data ingestion and preprocessing techniques
3. Vector store creation with different embedding options
4. LLM configuration and RAG chain setup
5. Query processing and response generation

## License

This project is licensed under the MIT License - see the LICENSE file for details.

ecosyste.ms

Data

Tools

Indexes

Applications

Experiments

Awesome

https://github.com/analyticace/retrieval-augmented-generation-workshop

Awesome Lists containing this project

README