{"id":26589720,"url":"https://github.com/FareedKhan-dev/all-rag-techniques","last_synced_at":"2025-03-23T13:03:25.763Z","repository":{"id":281166539,"uuid":"944400209","full_name":"FareedKhan-dev/all-rag-techniques","owner":"FareedKhan-dev","description":"Implementation of all RAG techniques in a simpler way","archived":false,"fork":false,"pushed_at":"2025-03-10T05:13:58.000Z","size":1798,"stargazers_count":194,"open_issues_count":0,"forks_count":28,"subscribers_count":5,"default_branch":"main","last_synced_at":"2025-03-14T23:45:56.566Z","etag":null,"topics":["ai","llm","llms","multi-modal","openai","python","rag"],"latest_commit_sha":null,"homepage":"https://medium.com/@fareedkhandev/testing-every-rag-technique-to-find-the-best-094d166af27f","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/FareedKhan-dev.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2025-03-07T09:30:06.000Z","updated_at":"2025-03-14T23:34:42.000Z","dependencies_parsed_at":"2025-03-07T11:23:31.183Z","dependency_job_id":"67a550fe-295c-48ea-aa8f-16213d96f271","html_url":"https://github.com/FareedKhan-dev/all-rag-techniques","commit_stats":null,"previous_names":["fareedkhan-dev/all-rag-techniques"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/FareedKhan-dev%2Fall-rag-techniques","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/FareedKhan-dev%2Fall-rag-techniques/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/rep
ositories/FareedKhan-dev%2Fall-rag-techniques/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/FareedKhan-dev%2Fall-rag-techniques/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/FareedKhan-dev","download_url":"https://codeload.github.com/FareedKhan-dev/all-rag-techniques/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":245104523,"owners_count":20561379,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ai","llm","llms","multi-modal","openai","python","rag"],"created_at":"2025-03-23T13:02:32.322Z","updated_at":"2025-03-23T13:03:25.512Z","avatar_url":"https://github.com/FareedKhan-dev.png","language":"Jupyter Notebook","funding_links":[],"categories":["教程 Tutorial","A01_文本生成_文本对话","Jupyter Notebook"],"sub_categories":["大语言对话模型及数据"],"readme":"# All RAG Techniques: A Simpler, Hands-On Approach ✨\n\n[![Python 3.7+](https://img.shields.io/badge/python-3.7+-blue.svg)](https://www.python.org/downloads/release/python-370/) [![Nebius AI](https://img.shields.io/badge/Nebius%20AI-API-brightgreen)](https://cloud.nebius.ai/services/llm-embedding) [![OpenAI](https://img.shields.io/badge/OpenAI-API-lightgrey)](https://openai.com/) [![Medium](https://img.shields.io/badge/Medium-Blog-black?logo=medium)](https://medium.com/@fareedkhandev/testing-every-rag-technique-to-find-the-best-094d166af27f)\n\nThis repository takes a clear, hands-on approach to **Retrieval-Augmented Generation (RAG)**, breaking down advanced techniques into straightforward, understandable 
implementations. Instead of relying on frameworks and libraries like `LangChain` or `FAISS`, everything here is built using familiar Python libraries such as `openai`, `numpy`, and `matplotlib`.\n\nThe goal is simple: provide code that is readable, modifiable, and educational. By focusing on the fundamentals, this project helps demystify RAG and makes it easier to understand how it really works.\n\n## Update: 📢\n- (20-Mar-2025) Added a new notebook on RAG with Reinforcement Learning.\n- (07-Mar-2025) Added 20 RAG techniques to the repository.\n\n## 🚀 What's Inside?\n\nThis repository contains a collection of Jupyter Notebooks, each focusing on a specific RAG technique.  Each notebook provides:\n\n*   A concise explanation of the technique.\n*   A step-by-step implementation from scratch.\n*   Clear code examples with inline comments.\n*   Evaluations and comparisons to demonstrate the technique's effectiveness.\n*   Visualizations of the results.\n\nThe notebooks cover the following techniques:\n\n| Notebook                                      | Description                                                                                                                                                         |\n| :-------------------------------------------- | :------------------------------------------------------------------------------------------------------------------------------------------------------------------ |\n| [1. Simple RAG](1_simple_rag.ipynb)           | A basic RAG implementation.  A great starting point!                                                                                                       |\n| [2. Semantic Chunking](2_semantic_chunking.ipynb) | Splits text based on semantic similarity for more meaningful chunks.                                                                                           |\n| [3. Chunk Size Selector](3_chunk_size_selector.ipynb) | Explores the impact of different chunk sizes on retrieval performance.  
                                                                                  |\n| [4. Context Enriched RAG](4_context_enriched_rag.ipynb) | Retrieves neighboring chunks to provide more context.                                                                                                     |\n| [5. Contextual Chunk Headers](5_contextual_chunk_headers_rag.ipynb) | Prepends descriptive headers to each chunk before embedding.                                                                                                |\n| [6. Document Augmentation RAG](6_doc_augmentation_rag.ipynb) | Generates questions from text chunks to augment the retrieval process.                                                                                           |\n| [7. Query Transform](7_query_transform.ipynb)   | Rewrites, expands, or decomposes queries to improve retrieval.  Includes **Step-back Prompting** and **Sub-query Decomposition**.                                      |\n| [8. Reranker](8_reranker.ipynb)               | Re-ranks initially retrieved results using an LLM for better relevance.                                                                                       |\n| [9. RSE](9_rse.ipynb)                         | Relevant Segment Extraction:  Identifies and reconstructs continuous segments of text, preserving context.                                                   |\n| [10. Contextual Compression](10_contextual_compression.ipynb) | Filters and compresses retrieved chunks so that mostly relevant information is passed on.                                                 |\n| [11. Feedback Loop RAG](11_feedback_loop_rag.ipynb) | Incorporates user feedback to improve the RAG system over time.                                                                                      |\n| [12. Adaptive RAG](12_adaptive_rag.ipynb)     | Dynamically selects the best retrieval strategy based on query type.                                      
                                                    |\n| [13. Self RAG](13_self_rag.ipynb)             | Implements Self-RAG, which dynamically decides when and how to retrieve, evaluates relevance, and assesses support and utility.                                        |\n| [14. Proposition Chunking](14_proposition_chunking.ipynb) | Breaks down documents into atomic, factual statements for precise retrieval.                                                                                      |\n| [15. Multimodal RAG](15_multimodel_rag.ipynb)   | Combines text and images for retrieval, generating captions for images using LLaVA.                                                                  |\n| [16. Fusion RAG](16_fusion_rag.ipynb)         | Combines vector search with keyword-based (BM25) retrieval for improved results.                                                                                |\n| [17. Graph RAG](17_graph_rag.ipynb)           | Organizes knowledge as a graph, enabling traversal of related concepts.                                                                                        |\n| [18. Hierarchy RAG](18_hierarchy_rag.ipynb)        | Builds hierarchical indices (summaries + detailed chunks) for efficient retrieval.                                                                                   |\n| [19. HyDE RAG](19_HyDE_rag.ipynb)             | Uses Hypothetical Document Embeddings to improve semantic matching.                                                                                              |\n| [20. CRAG](20_crag.ipynb)                     | Corrective RAG: Dynamically evaluates retrieval quality and uses web search as a fallback.                                                                           |\n| [21. RAG with RL](21_rag_with_rl.ipynb)                     | Maximizes the reward of the RAG model using Reinforcement Learning.                                                                          
|\n\n## 🗂️ Repository Structure\n\n```\nfareedkhan-dev-all-rag-techniques/\n├── README.md                          \u003c- You are here!\n├── 1_simple_rag.ipynb\n├── 2_semantic_chunking.ipynb\n├── 3_chunk_size_selector.ipynb\n├── 4_context_enriched_rag.ipynb\n├── 5_contextual_chunk_headers_rag.ipynb\n├── 6_doc_augmentation_rag.ipynb\n├── 7_query_transform.ipynb\n├── 8_reranker.ipynb\n├── 9_rse.ipynb\n├── 10_contextual_compression.ipynb\n├── 11_feedback_loop_rag.ipynb\n├── 12_adaptive_rag.ipynb\n├── 13_self_rag.ipynb\n├── 14_proposition_chunking.ipynb\n├── 15_multimodel_rag.ipynb\n├── 16_fusion_rag.ipynb\n├── 17_graph_rag.ipynb\n├── 18_hierarchy_rag.ipynb\n├── 19_HyDE_rag.ipynb\n├── 20_crag.ipynb\n├── 21_rag_with_rl.ipynb\n├── requirements.txt                   \u003c- Python dependencies\n└── data/\n    ├── val.json                       \u003c- Sample validation data (queries and answers)\n    ├── AI_information.pdf             \u003c- A sample PDF document for testing.\n    └── attention_is_all_you_need.pdf  \u003c- A sample PDF document for testing (for Multi-Modal RAG).\n```\n\n## 🛠️ Getting Started\n\n1.  **Clone the repository:**\n\n    ```bash\n    git clone https://github.com/FareedKhan-dev/all-rag-techniques.git\n    cd all-rag-techniques\n    ```\n\n2.  **Install dependencies:**\n\n    ```bash\n    pip install -r requirements.txt\n    ```\n\n3.  **Set up your API key:**\n\n    *   Obtain an API key from [Nebius AI](https://studio.nebius.com/).\n    *   Set the API key as the `OPENAI_API_KEY` environment variable:\n        ```bash\n        export OPENAI_API_KEY='YOUR_NEBIUS_AI_API_KEY'\n        ```\n        or, on Windows:\n        ```bat\n        setx OPENAI_API_KEY \"YOUR_NEBIUS_AI_API_KEY\"\n        ```\n        or, within your Python script/notebook:\n\n        ```python\n        import os\n        os.environ[\"OPENAI_API_KEY\"] = \"YOUR_NEBIUS_AI_API_KEY\"\n        ```\n\n4.  
**Run the notebooks:**\n\n    Open any of the Jupyter Notebooks (`.ipynb` files) using Jupyter Notebook or JupyterLab.  Each notebook is self-contained and can be run independently.  Within a notebook, run the cells in order from top to bottom.\n\n    **Note:** The `data/AI_information.pdf` file provides a sample document for testing. You can replace it with your own PDF.  The `data/val.json` file contains sample queries and ideal answers for evaluation.\n    The `data/attention_is_all_you_need.pdf` file is used by the Multi-Modal RAG notebook.\n\n## 💡 Core Concepts\n\n*   **Embeddings:**  Numerical representations of text that capture semantic meaning.  We use Nebius AI's embedding API and, in many notebooks, also the `BAAI/bge-en-icl` embedding model.\n\n*   **Vector Store:**  A simple database to store and search embeddings.  We create our own `SimpleVectorStore` class using NumPy for efficient similarity calculations.\n\n*   **Cosine Similarity:**  The cosine of the angle between two vectors, ranging from -1 to 1.  Higher values indicate greater similarity.\n\n*   **Chunking:**  Dividing text into smaller, manageable pieces.  We explore various chunking strategies.\n\n*   **Retrieval:** The process of finding the most relevant text chunks for a given query.\n\n*   **Generation:**  Using a Large Language Model (LLM) to create a response based on the retrieved context and the user's query.  We use the `meta-llama/Llama-3.2-3B-Instruct` model via Nebius AI's API.\n\n*   **Evaluation:**  Assessing the quality of the RAG system's responses, often by comparing them to a reference answer or using an LLM to score relevance.\n\n## 🤝 Contributing\n\nContributions are welcome!","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FFareedKhan-dev%2Fall-rag-techniques","html_url":"https://awesome.ecosyste.ms/projects/github.com%2FFareedKhan-dev%2Fall-rag-techniques","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FFareedKhan-dev%2Fall-rag-techniques/lists"}