https://github.com/FareedKhan-dev/all-rag-techniques
Implementation of all RAG techniques in a simpler way
- Host: GitHub
- URL: https://github.com/FareedKhan-dev/all-rag-techniques
- Owner: FareedKhan-dev
- License: mit
- Created: 2025-03-07T09:30:06.000Z (about 1 month ago)
- Default Branch: main
- Last Pushed: 2025-03-10T05:13:58.000Z (about 1 month ago)
- Last Synced: 2025-03-14T23:45:56.566Z (about 1 month ago)
- Topics: ai, llm, llms, multi-modal, openai, python, rag
- Language: Jupyter Notebook
- Homepage: https://medium.com/@fareedkhandev/testing-every-rag-technique-to-find-the-best-094d166af27f
- Size: 1.71 MB
- Stars: 194
- Watchers: 5
- Forks: 28
- Open Issues: 0
Metadata Files:
- Readme: README.md
Awesome Lists containing this project
- awesome-LLM-resourses - Implementation of all RAG techniques in a simpler way.
- StarryDivineSky - FareedKhan-dev/all-rag-techniques - The FareedKhan-dev/all-rag-techniques project aims to implement various RAG (Retrieval-Augmented Generation) techniques in a simpler way. It offers clear, easy-to-understand implementations that make the techniques convenient for developers to learn and apply. It likely includes example code and tutorials for multiple RAG techniques, helping users understand the principles and usage of each. The project's focus is on reducing the complexity of RAG so it is easier to get started. By studying it, users can quickly pick up RAG techniques and apply them to their own projects. It likely covers topics such as using vector databases, optimizing retrieval strategies, and integrating generation models. It is an excellent resource for learning and practicing RAG. (A01_Text Generation_Text Dialogue / Large-language dialogue models and data)
- awesome-hacking-lists - FareedKhan-dev/all-rag-techniques - Implementation of all RAG techniques in a simpler way (Jupyter Notebook)
README
# All RAG Techniques: A Simpler, Hands-On Approach ✨
[Python 3.7+](https://www.python.org/downloads/release/python-370/) · [Nebius AI](https://cloud.nebius.ai/services/llm-embedding) · [OpenAI](https://openai.com/) · [Medium article](https://medium.com/@fareedkhandev/testing-every-rag-technique-to-find-the-best-094d166af27f)
This repository takes a clear, hands-on approach to **Retrieval-Augmented Generation (RAG)**, breaking down advanced techniques into straightforward, understandable implementations. Instead of relying on frameworks like `LangChain` or `FAISS`, everything here is built using familiar Python libraries: `openai`, `numpy`, `matplotlib`, and a few others.
The goal is simple: provide code that is readable, modifiable, and educational. By focusing on the fundamentals, this project helps demystify RAG and makes it easier to understand how it really works.
## Update: 📢
- (20-Mar-2025) Added a new notebook on RAG with Reinforcement Learning.
- (07-Mar-2025) Added 20 RAG techniques to the repository.

## 🚀 What's Inside?
This repository contains a collection of Jupyter Notebooks, each focusing on a specific RAG technique. Each notebook provides:
* A concise explanation of the technique.
* A step-by-step implementation from scratch.
* Clear code examples with inline comments.
* Evaluations and comparisons to demonstrate the technique's effectiveness.
* Visualizations of the results.

Here's a glimpse of the techniques covered:
| Notebook | Description |
| :-------------------------------------------- | :------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
| [1. Simple RAG](1_simple_rag.ipynb) | A basic RAG implementation. A great starting point! |
| [2. Semantic Chunking](2_semantic_chunking.ipynb) | Splits text based on semantic similarity for more meaningful chunks. |
| [3. Chunk Size Selector](3_chunk_size_selector.ipynb) | Explores the impact of different chunk sizes on retrieval performance. |
| [4. Context Enriched RAG](4_context_enriched_rag.ipynb) | Retrieves neighboring chunks to provide more context. |
| [5. Contextual Chunk Headers](5_contextual_chunk_headers_rag.ipynb) | Prepends descriptive headers to each chunk before embedding. |
| [6. Document Augmentation RAG](6_doc_augmentation_rag.ipynb) | Generates questions from text chunks to augment the retrieval process. |
| [7. Query Transform](7_query_transform.ipynb) | Rewrites, expands, or decomposes queries to improve retrieval. Includes **Step-back Prompting** and **Sub-query Decomposition**. |
| [8. Reranker](8_reranker.ipynb) | Re-ranks initially retrieved results using an LLM for better relevance. |
| [9. RSE](9_rse.ipynb) | Relevant Segment Extraction: Identifies and reconstructs continuous segments of text, preserving context. |
| [10. Contextual Compression](10_contextual_compression.ipynb) | Implements contextual compression to filter and compress retrieved chunks, maximizing relevant information. |
| [11. Feedback Loop RAG](11_feedback_loop_rag.ipynb) | Incorporates user feedback to learn and improve the RAG system over time. |
| [12. Adaptive RAG](12_adaptive_rag.ipynb) | Dynamically selects the best retrieval strategy based on query type. |
| [13. Self RAG](13_self_rag.ipynb) | Implements Self-RAG, which dynamically decides when and how to retrieve, evaluates relevance, and assesses support and utility. |
| [14. Proposition Chunking](14_proposition_chunking.ipynb) | Breaks down documents into atomic, factual statements for precise retrieval. |
| [15. Multi-Modal RAG](15_multimodel_rag.ipynb) | Combines text and images for retrieval, generating captions for images using LLaVA. |
| [16. Fusion RAG](16_fusion_rag.ipynb) | Combines vector search with keyword-based (BM25) retrieval for improved results. |
| [17. Graph RAG](17_graph_rag.ipynb) | Organizes knowledge as a graph, enabling traversal of related concepts. |
| [18. Hierarchy RAG](18_hierarchy_rag.ipynb) | Builds hierarchical indices (summaries + detailed chunks) for efficient retrieval. |
| [19. HyDE RAG](19_HyDE_rag.ipynb) | Uses Hypothetical Document Embeddings to improve semantic matching. |
| [20. CRAG](20_crag.ipynb) | Corrective RAG: Dynamically evaluates retrieval quality and uses web search as a fallback. |
| [21. RAG with RL](21_rag_with_rl.ipynb) | Maximizes the reward of the RAG model using Reinforcement Learning. |

## 🗂️ Repository Structure
```
fareedkhan-dev-all-rag-techniques/
├── README.md <- You are here!
├── 1_simple_rag.ipynb
├── 2_semantic_chunking.ipynb
├── 3_chunk_size_selector.ipynb
├── 4_context_enriched_rag.ipynb
├── 5_contextual_chunk_headers_rag.ipynb
├── 6_doc_augmentation_rag.ipynb
├── 7_query_transform.ipynb
├── 8_reranker.ipynb
├── 9_rse.ipynb
├── 10_contextual_compression.ipynb
├── 11_feedback_loop_rag.ipynb
├── 12_adaptive_rag.ipynb
├── 13_self_rag.ipynb
├── 14_proposition_chunking.ipynb
├── 15_multimodel_rag.ipynb
├── 16_fusion_rag.ipynb
├── 17_graph_rag.ipynb
├── 18_hierarchy_rag.ipynb
├── 19_HyDE_rag.ipynb
├── 20_crag.ipynb
├── 21_rag_with_rl.ipynb
├── requirements.txt <- Python dependencies
└── data/
    ├── val.json                       <- Sample validation data (queries and answers)
    ├── AI_information.pdf             <- A sample PDF document for testing.
    └── attention_is_all_you_need.pdf  <- A sample PDF document for testing (for Multi-Modal RAG).
```

## 🛠️ Getting Started
1. **Clone the repository:**
```bash
git clone https://github.com/FareedKhan-dev/all-rag-techniques.git
cd all-rag-techniques
```

2. **Install dependencies:**
```bash
pip install -r requirements.txt
```

3. **Set up your OpenAI API key:**
* Obtain an API key from [Nebius AI](https://studio.nebius.com/).
* Set the API key as an environment variable:
```bash
export OPENAI_API_KEY='YOUR_NEBIUS_AI_API_KEY'
```
or
```bash
setx OPENAI_API_KEY "YOUR_NEBIUS_AI_API_KEY" # On Windows
```
or, within your Python script/notebook:

```python
import os
os.environ["OPENAI_API_KEY"] = "YOUR_NEBIUS_AI_API_KEY"
```

4. **Run the notebooks:**
Open any of the Jupyter Notebooks (`.ipynb` files) in Jupyter Notebook or JupyterLab. Each notebook is self-contained and can be run independently; within a notebook, run the cells in order.
**Note:** The `data/AI_information.pdf` file provides a sample document for testing. You can replace it with your own PDF. The `data/val.json` file contains sample queries and ideal answers for evaluation.
The `data/attention_is_all_you_need.pdf` file is for testing the Multi-Modal RAG notebook.

## 💡 Core Concepts
* **Embeddings:** Numerical representations of text that capture semantic meaning. We use Nebius AI's embedding API and, in many notebooks, also the `BAAI/bge-en-icl` embedding model.
* **Vector Store:** A simple database to store and search embeddings. We create our own `SimpleVectorStore` class using NumPy for efficient similarity calculations (a minimal sketch appears after this list).
* **Cosine Similarity:** A measure of similarity between two vectors. Higher values indicate greater similarity.
* **Chunking:** Dividing text into smaller, manageable pieces. We explore various chunking strategies.
* **Retrieval:** The process of finding the most relevant text chunks for a given query.
* **Generation:** Using a Large Language Model (LLM) to create a response based on the retrieved context and the user's query. We use the `meta-llama/Llama-3.2-3B-Instruct` model via Nebius AI's API.
* **Evaluation:** Assessing the quality of the RAG system's responses, often by comparing them to a reference answer or using an LLM to score relevance.
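
To make these pieces concrete, here is a minimal, self-contained sketch of how they fit together. It is illustrative rather than the exact notebook code: the Nebius base URL, the helper names, and the `SimpleVectorStore` methods shown here are assumptions, while the embedding and chat model names are the ones listed above.

```python
import os
import numpy as np
from openai import OpenAI

# Assumed OpenAI-compatible Nebius endpoint -- check the notebooks for the exact base URL.
client = OpenAI(
    base_url="https://api.studio.nebius.com/v1/",
    api_key=os.environ["OPENAI_API_KEY"],
)

def chunk_text(text, size=1000, overlap=200):
    """Fixed-size chunking with overlap -- the simplest of the strategies explored here."""
    return [text[i:i + size] for i in range(0, len(text), size - overlap)]

def embed(texts, model="BAAI/bge-en-icl"):
    """Embed a list of strings and return an (n, d) NumPy array."""
    response = client.embeddings.create(model=model, input=texts)
    return np.array([item.embedding for item in response.data])

class SimpleVectorStore:
    """A tiny in-memory vector store: cosine similarity over a NumPy matrix."""
    def __init__(self):
        self.texts, self.vectors = [], []

    def add(self, texts, embeddings):
        self.texts.extend(texts)
        self.vectors.extend(embeddings)

    def search(self, query_vector, k=3):
        matrix = np.array(self.vectors)
        # Cosine similarity = dot product of L2-normalized vectors.
        matrix = matrix / np.linalg.norm(matrix, axis=1, keepdims=True)
        query = query_vector / np.linalg.norm(query_vector)
        scores = matrix @ query
        return [self.texts[i] for i in np.argsort(scores)[::-1][:k]]

# Index a document, retrieve context for a query, and generate a grounded answer.
document = "Replace this with text extracted from data/AI_information.pdf."
chunks = chunk_text(document)
store = SimpleVectorStore()
store.add(chunks, embed(chunks))

query = "What is Retrieval-Augmented Generation?"
context = "\n\n".join(store.search(embed([query])[0]))

completion = client.chat.completions.create(
    model="meta-llama/Llama-3.2-3B-Instruct",
    messages=[
        {"role": "system", "content": "Answer using only the provided context."},
        {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {query}"},
    ],
)
print(completion.choices[0].message.content)
```

Each notebook starts from this same skeleton and swaps in the technique it demonstrates: a different chunking strategy, a query transform, a reranking step, and so on.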
## 🤝 Contributing
Contributions are welcome!