{"id":14964491,"url":"https://github.com/kanad13/ragify","last_synced_at":"2026-03-10T14:04:24.397Z","repository":{"id":252060162,"uuid":"835795781","full_name":"kanad13/RAGify","owner":"kanad13","description":"Chat with your documents using Generative AI \u0026 Retrieval-Augmented Generation (RAG)","archived":false,"fork":false,"pushed_at":"2024-09-19T13:29:52.000Z","size":845,"stargazers_count":11,"open_issues_count":0,"forks_count":1,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-04-07T05:42:54.585Z","etag":null,"topics":["ai","generative-ai","groq","huggingface","llama","ml","rag","retrieval-augmented-generation","streamlit"],"latest_commit_sha":null,"homepage":"https://ragify.streamlit.app","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/kanad13.png","metadata":{"files":{"readme":"readme.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-07-30T14:44:35.000Z","updated_at":"2025-04-06T17:00:00.000Z","dependencies_parsed_at":"2024-08-07T13:42:32.267Z","dependency_job_id":"82593bc0-e68f-48d3-9ae4-0b9ee46b3fa9","html_url":"https://github.com/kanad13/RAGify","commit_stats":{"total_commits":53,"total_committers":1,"mean_commits":53.0,"dds":0.0,"last_synced_commit":"f1b01be0b099eebef1a42ee46e4bf31779d3c3ee"},"previous_names":["kanad13/ragify"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/kanad13/RAGify","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kanad13%2FRAGify","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kanad13%2FRAGify/tags","releases_url
":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kanad13%2FRAGify/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kanad13%2FRAGify/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/kanad13","download_url":"https://codeload.github.com/kanad13/RAGify/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kanad13%2FRAGify/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":30336098,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-03-10T12:41:07.687Z","status":"ssl_error","status_checked_at":"2026-03-10T12:41:06.728Z","response_time":106,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ai","generative-ai","groq","huggingface","llama","ml","rag","retrieval-augmented-generation","streamlit"],"created_at":"2024-09-24T13:33:15.747Z","updated_at":"2026-03-10T14:04:24.371Z","avatar_url":"https://github.com/kanad13.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# RAGify - Chat with Your Documents using Gen AI\n\nThis github repository is for RAGify - a tool that lets you chat with your documents using Generative AI \u0026 Retrieval-Augmented Generation (RAG).\n\nThink of RAGify as giving your documents a brain.\n\n[Click here to see RAGify in action on some sample 
documents.](https://ragify.streamlit.app)\n\n## Why RAGify?\n\nPeople and businesses often collect many documents in formats like PDF and DOCX.\n\nTo find specific information, they have to search through all these documents. Imagine if you could just chat with these documents to get the answers you need easily.\n\nRAGify provides a secure way to:\n\n- Create a private system for your documents\n- Connect with trusted external sources\n- Keep your data confidential\n\nRAGify offers an easy and secure method to make your documents **interactive** using Generative AI.\n\n## How RAGify Works?\n\n```mermaid\ngraph LR\n    A[Question from User] --\u003e B[Chatbot Frontend]\n    D[Input Documents] --\u003e E[Text Embeddings]\n    subgraph RAGify\n        B --\u003e F[Retrieve Relevant Text]\n        E --\u003e F\n        F --\u003e G[Generative AI]\n    end\n    G --\u003e H[Answer to User]\n```\n\n- **Chatbot Frontend** - Streamlit frontend for interacting with the chatbot.\n- **Input Documents** - Source documents for querying information e.g. 
newsletters, company policies, etc.\n- **Text Embeddings** - SentenceTransformer converts text into embeddings, and FAISS indexes them for similarity search.\n- **Retrieve Relevant Text** - Retrieve relevant chunks from the input documents.\n- **Generative AI** - Use a language model to generate natural language responses.\n\n## Key Components of RAGify\n\n- **Large Language Model (LLM) to generate context-aware responses.**\n  - My project uses Groq's API to interact with the large language model, specifically the 8B model from the [Meta Llama 3.1 collection](https://github.com/meta-llama/llama-models/blob/main/models/llama3_1/MODEL_CARD.md).\n  - This model is served by [Groq Cloud](https://wow.groq.com/why-groq/) and is responsible for producing intelligent and contextually relevant responses based on the retrieved chunks of text from the input documents.\n  - The code can be easily customized to use other models like `OpenAI`, `Mistral`, or `Gemini` or even use a local LLM.\n- **Sentence Transformers to generate embeddings for text chunks and queries.**\n  - `SentenceTransformer` [all-mpnet-base-v2](https://huggingface.co/sentence-transformers/all-mpnet-base-v2) is a pre-trained model from the Sentence Transformers library.\n  - It is used for transforming sentences or text data extracted from PDFs into numerical embeddings in a way that captures semantic meaning.\n  - This particular model is based on the MPNet architecture and is fine-tuned for tasks like semantic similarity, clustering, and information retrieval.\n- **FAISS (Facebook AI Similarity Search) to perform similarity search over large datasets.**\n  - FAISS is employed to perform fast and accurate nearest neighbor searches among the embeddings generated by embedding models like all-mpnet-base-v2 from Sentence Transformers.\n  - `IndexFlatL2` and `IndexIVFFlat` are the two FAISS index types used for similarity search.\n  - Depending on the number of chunks, different FAISS index types (`IndexFlatL2` for small 
datasets or `IndexIVFFlat` for larger datasets) are dynamically selected to balance speed and accuracy.\n- **PyPDF for PDF extraction.**\n  - The `PdfReader` from the [pypdf library](https://pypdf.readthedocs.io/en/latest/index.html) is used to read and extract text content from PDF files, which is then processed and chunked for further analysis.\n  - This allows the system to work with document-based data sources.\n- **Langchain for text chunking and splitting.**\n  - The `RecursiveCharacterTextSplitter` from [Langchain](https://python.langchain.com/v0.1/docs/modules/data_connection/document_transformers/recursive_text_splitter/) is used to break down large blocks of text into smaller, more manageable pieces.\n  - This facilitates efficient processing and ensures that the text chunks are of an appropriate size for embedding and retrieval.\n- **Caching Mechanism to speed up query responses by reusing previous results.**\n  - The system includes a JSON-based caching mechanism to store previous queries, their embeddings, and responses.\n  - Before generating a new response, the system checks the cache to see if a similar query has already been processed.\n  - If found, the cached response is returned, significantly improving response time and reducing the need for redundant computations.\n- **Streamlit for creating a web-based user interface.**\n  - Streamlit is used to create an interactive web application that allows users to input queries and view responses generated by the system.\n  - It provides an intuitive and easy-to-deploy interface for users to interact with the RAG system without needing to run the code manually.\n\n## Input Documents - Blunder Mifflin\n\nNow that I have described the key components, let me get back to RAGify, the project that shows how you can use a chatbot on your own documents.\n\nTo showcase how RAGify works with an LLM on custom documents, I created the Employee Handbook for a fictional 
company called Blunder Mifflin.\n\nThe chatbot answers questions related to the company policy.\n\nSo for example, if an employee of Blunder Mifflin wants to know what is the \"Work from Home\" policy, then they can just ask the chatbot that question and get the answer using the power of Generative AI.\n\nIf the company's Work From Home policy gets updated, they just update the documents, no changes needed to the chatbot. The chatbot starts providing answers based on latest information.\n\nThis approach allows me to explore the functionality of RAGify without using real, sensitive data.\n\nThe [5 PDF documents](/input_files/) include:\n\n1. Blunder Mifflin's History and Mission\n2. Blunder Mifflin's Employee Handbook\n3. In-Office Relationships Policy\n4. Prank Protocol\n5. Birthday Party Committee Rules\n\nIn the next few sections, I will use these 5 documents as sample input to demonstrate how RAGify can turn static information into an interactive, AI-driven Q\u0026A system. By feeding these PDFs into the RAGify system, you will be able to ask questions like:\n\n- \"What is Blunder Mifflin's mission?\"\n- \"What's the policy on in-office relationships?\"\n- \"How should pranks be conducted at Blunder Mifflin?\"\n\nRAGify will process these documents, breaking them down into manageable chunks, and store them for quick retrieval.\nWhen you ask a question, RAGify searches through these chunks to find the most relevant information, then uses AI to craft a clear and concise response.\n\nThis code will show how you can use RAG to handle your own company's documents.\n\n## Uniqueness of RAGify Architecture\n\n- **Source Attribution**\n  - Responses from RAGify are accompanied by sources and specific document chunks used:\n    - Provides users insight into the origins of the response, enhancing transparency and trust.\n    - Utilizes the `chunk_to_doc` mapping for clear traceability.\n    - 
[Link](https://github.com/kanad13/RAGify/blob/5554ea8a72ea6e43924c1043d29c9111bb38416d/pages/02-Chatbot.py#L254) to relevant code.\n- **Adaptive FAISS Indexing**\n  - The system selects the most suitable FAISS index based on dataset size:\n    - Utilizes a `FlatL2` index for precise results on smaller collections.\n    - Switches to an `IVFFlat` index with optimized clustering for efficient search in larger datasets.\n    - This adaptability keeps search latency low as the dataset grows.\n    - [Link](https://github.com/kanad13/RAGify/blob/5554ea8a72ea6e43924c1043d29c9111bb38416d/pages/02-Chatbot.py#L110) to relevant code.\n- **Multi-Level Caching Strategy**\n  - RAGify employs a multi-tiered caching approach to optimize performance:\n    - A custom semantic cache stores and retrieves responses based on embedding similarity. This reduces API calls and improves response times for similar or repeated queries.\n    - Streamlit's `@st.cache_data` decorator caches processed PDF content for efficiency.\n    - Vector embeddings are cached using `@st.cache_resource` to enhance retrieval speed and avoid recomputation.\n    - Relevant code: `@st.cache_data` for `process_pdfs`, `@st.cache_resource` for `create_faiss_index`, and the custom semantic cache in `load_cache`, `save_cache`, `retrieve_from_cache`, and `update_cache` functions.\n- **Hash-Based Dynamic Cache Invalidation**\n  - RAGify implements a hash-based cache invalidation mechanism:\n    - A hash of all input files is computed on each run.\n    - When the hash changes, indicating updated content, all relevant caches are automatically invalidated and rebuilt.\n    - This approach ensures the system always works with the most current information, minimizing downtime and resource usage.\n    - 
[Link](https://github.com/kanad13/RAGify/blob/5554ea8a72ea6e43924c1043d29c9111bb38416d/pages/02-Chatbot.py#L62) to relevant code.\n- **Hybrid Model Approach**\n  - I have implemented a primary and fallback model approach: if one endpoint is unavailable or a model hits its rate limit, the code automatically falls back to another endpoint.\n  - [Link](https://github.com/kanad13/RAGify/blob/5554ea8a72ea6e43924c1043d29c9111bb38416d/pages/02-Chatbot.py#L188) to relevant code.\n- **Detailed documentation**\n  - I have included detailed explanations for my code along with Mermaid diagrams to make it easy to maintain the documentation as the code changes.\n    - Link to [relevant documentation](#ragify-technical-deepdive).\n\n## RAGify Technical Deepdive\n\n### RAGify workflow\n\n```mermaid\ngraph TD\n    subgraph User\n    A[Input Query]\n    H[Get Answer]\n    end\n    subgraph Knowledge Base\n    I[Document Storage]\n    J[Vector Embeddings]\n    end\n    subgraph RAG System\n    B[Embed Query]\n    C[Vector Search]\n    D[Retrieve Relevant Texts]\n    E[Create Prompt]\n    end\n    subgraph LLM\n    F[Process Prompt]\n    G[Generate Response]\n    end\n    A --\u003e B\n    I --\u003e J\n    J --\u003e C\n    B --\u003e C\n    C --\u003e D\n    D --\u003e E\n    E --\u003e F\n    F --\u003e G\n    G --\u003e H\n```\n\nHere is a simple overview of the diagram above:\n\n1. The user asks a question.\n2. The RAG system converts this question into a numerical format (vector) that computers can understand and compare easily.\n3. It then searches through a database of pre-converted document vectors to find the most relevant information.\n4. The system retrieves the actual text of these relevant documents.\n5. It combines the user's question with this relevant information to create a detailed prompt.\n6. This prompt is sent to an AI (the LLM), which processes it and generates a response.\n7. 
Finally, the user receives this response as their answer.\n\n### Full RAGify architecture\n\nBefore diving into the details of building RAGify, here is a high-level overview of the system:\n\n```mermaid\nflowchart TD\n    A[Start] --\u003e B[Load Environment Variables]\n    B --\u003e C[Initialize Groq Client]\n    C --\u003e D[Load Sentence Transformer Model]\n    D --\u003e E[Process PDFs]\n    E --\u003e F[Extract Text]\n    F --\u003e G[Create Chunks]\n    G --\u003e H[Generate Embeddings]\n    H --\u003e I{Dataset Size}\n    I -- Small --\u003e J[Create FlatL2 Index]\n    I -- Large --\u003e K[Create IVFFlat Index]\n    J --\u003e L[FAISS Index]\n    K --\u003e L\n    M[User Query] --\u003e N[Retrieve Relevant Chunks]\n    L --\u003e N\n    N --\u003e O[Check Cache]\n    O -- Hit --\u003e P[Return Cached Response]\n    O -- Miss --\u003e Q[Generate LLM Response]\n    Q --\u003e R{API Error?}\n    R -- Yes --\u003e S[Retry or Use Fallback Model]\n    S --\u003e Q\n    R -- No --\u003e T[Update Cache]\n    T --\u003e U[Display Results]\n    P --\u003e U\n    subgraph \"Initialization\"\n        B\n        C\n        D\n    end\n    subgraph \"Data Processing \u0026 Indexing\"\n        E\n        F\n        G\n        H\n        I\n        J\n        K\n        L\n    end\n    subgraph \"Query Processing and Response Generation\"\n        M\n        N\n        O\n        P\n        Q\n        R\n        S\n        T\n    end\n    subgraph \"User Interface\"\n        U\n    end\n```\n\n### Visualize Caching aspect of RAGify\n\nWhen a query is received, the system uses this prepared index to quickly retrieve relevant information, which is then used to augment the LLM's response. 
The next section will cover the LLM part.\n\n```mermaid\nsequenceDiagram\n    participant User\n    participant Streamlit as Streamlit UI\n    participant PDF as PDF Files\n    participant PyPDF as PyPDF\n    participant TextSplitter as RecursiveCharacterTextSplitter\n    participant SentenceTransformer as SentenceTransformer\n    participant FAISS as FAISS\n    participant Cache as Semantic Cache\n    participant Groq as Groq API\n    participant Disk as Disk Storage\n\n    User-\u003e\u003eStreamlit: Input Query\n    Streamlit-\u003e\u003ePDF: Read Files\n    PDF-\u003e\u003ePyPDF: Input\n    PyPDF-\u003e\u003eTextSplitter: Extracted Text\n    TextSplitter-\u003e\u003eSentenceTransformer: Text Chunks\n    SentenceTransformer-\u003e\u003eFAISS: Embeddings\n    FAISS-\u003e\u003eDisk: Store Vector Index\n    Streamlit-\u003e\u003eSentenceTransformer: Encode Query\n    SentenceTransformer-\u003e\u003eFAISS: Query Vector\n    FAISS-\u003e\u003eCache: Check Cache\n    alt Cache Hit\n        Cache-\u003e\u003eStreamlit: Cached Response\n    else Cache Miss\n        FAISS-\u003e\u003eFAISS: Retrieve Relevant Chunks\n        FAISS-\u003e\u003eGroq: Context + Query\n        Groq-\u003e\u003eCache: Store Response\n        Groq-\u003e\u003eStreamlit: Generated Response\n    end\n    Streamlit-\u003e\u003eUser: Display Results\n\n    Note over PDF,PyPDF: Text Extraction\n    Note over PyPDF,TextSplitter: Text Chunking\n    Note over TextSplitter,SentenceTransformer: Vectorization\n    Note over SentenceTransformer,FAISS: Similarity Search Index Creation\n    Note over FAISS,Cache: Semantic Caching\n    Note over FAISS,Disk: Persistent Storage\n    Note over Groq,Streamlit: LLM Response Generation\n```\n\n## Simple Explanations for Complex Terms\n\nThis section explains some of the complex terms used in this project.\n\n1. [Prompt Engineering](./readme.md#prompt-engineering)\n2. [Fine-tuning](./readme.md#fine-tuning)\n3. 
[Retrieval-Augmented Generation (RAG)](./readme.md#retrieval-augmented-generation-rag)\n4. [Vector Database](./readme.md#vector-database)\n5. [Retriever](./readme.md#retriever)\n\n### Prompt Engineering\n\nThe questions you ask a Generative AI model (e.g., ChatGPT, Gemini) are called \"prompts.\" To get better answers, you must \"engineer\" or \"refine\" your prompt.\n\nExample - Instead of saying \"Write about dogs,\" say \"Write a 200-word paragraph about the history of domesticated dogs, focusing on their roles in human society.\"\n\n```mermaid\ngraph LR\n    A[User Input] --\u003e B[Simple Prompting]\n    A --\u003e C[Prompt Engineering]\n\n    B --\u003e D[Direct Output]\n\n    C --\u003e E{Refine Prompt}\n    E --\u003e|Improve| F[Better Results]\n    E --\u003e|Iterate| C\n    F --\u003e G[Final Output]\n\n    subgraph Simple Prompting\n    B\n    D\n    end\n\n    subgraph Prompt Engineering\n    C\n    E\n    F\n    end\n```\n\n### Fine-tuning\n\nFine-tuning makes an AI model better at specific tasks. It is like teaching a smart student to become an expert in a new subject.\n\nLet us say a law firm needs to create many legal documents every day. ChatGPT can write these documents, but it might not use the wording or the format that the law firm needs.\n\nTo fix this, the law firm can \"fine-tune\" the AI model. This means teaching the AI to write documents exactly how the law firm wants them.\n\nTo do this, the firm shows the AI model some examples of perfect legal documents written by its best lawyers. 
The AI learns from these examples and gets better at writing documents just like the law firm wants.\n\n```mermaid\nflowchart LR\n    A[(\"Base LLM\")] --\u003e|Fine-tuning| B\n    subgraph FT [Fine-tuned LLM]\n        B[(\"Fine-tuned Model\")]\n        D[(\"Fine-tuned Knowledge\")]\n    end\n    C[(\"Law-firm specific examples\")] --\u003e FT\n    FT \u003c--\u003e|Prompt/Response| E[(\"User\")]\n```\n\n### Retrieval-Augmented Generation (RAG)\n\nRAG allows you to feed the AI model your own data sources, enabling it to give more relevant and tailored responses.\n\nImagine a pizza restaurant's chatbot using RAG. It is like giving the chatbot a constantly updated menu card. When customers ask about today's specials, changed delivery zones, or new toppings, the chatbot can instantly access this fresh information. It does not just rely on old data but can pull up the latest details.\n\n```mermaid\ngraph TB\n    subgraph \"Traditional AI Model\"\n    A1[User Query] --\u003e B1[AI Model]\n    B1 --\u003e C1[General Response]\n    end\n    subgraph \"RAG-Enabled AI Model\"\n    A2[User Query] --\u003e R[Retriever]\n    R \u003c--\u003e V[Vector DB]\n    V \u003c--\u003e D[Custom Data Sources]\n    R --\u003e B2[AI Model]\n    A2 --\u003e B2\n    B2 --\u003e C2[Tailored Response]\n    end\n```\n\n### Vector Database\n\nA Vector Database is a smart storage system for AI. It helps AI access new or specific information not included in its original training.\n\nA vector database is like a smart library where instead of searching for books by their title or author, you are searching by the ideas inside the books. This helps the AI find and compare information more effectively.\n\nLet me reuse the example from the previous section about the chatbot for a pizza restaurant. The restaurant keeps its latest menu in a Vector Database. When customers ask about new pizzas, the chatbot can quickly check this database for current information. 
This way, the restaurant does not need to constantly update the chatbot. They just add new pizza details to the database, and the chatbot can access this information when needed.\n\nVector databases store data differently from relational databases like MySQL. Instead of using rows and columns, vector databases store each piece of data as a numerical representation called an embedding. These embeddings are placed in a multi-dimensional space. Similar items are positioned closer together.\n\nFor example, \"cats\" and \"dogs\" would be near each other, while both would be far from \"table\" and \"chair\". This method helps AI models provide more relevant answers.\n\n![](./assets/vector_db_emeddings.svg)\n\n### Retriever\n\nThe Retriever in a RAG system works like a smart search tool. It helps connect what users ask with the information stored in \"vector databases\".\n\nWhen someone asks a question, the Retriever does three main things:\n\n1. **Find Similar Info**: It looks for information that is close to what the user asked.\n\n2. **Sort by Importance**: It puts the found information in order, with the most useful stuff at the top.\n\n3. **Pick the Best**: It chooses the top pieces of information to send back to the AI.\n\nThe AI then uses this information to give an answer the user can easily understand.\n\n```mermaid\ngraph TB\n    subgraph \"RAG-Enabled AI Model\"\n    A[User Query] --\u003e R[Retriever]\n    R \u003c--\u003e V[Vector DB]\n    V \u003c--\u003e D[Custom Data Sources]\n    R --\u003e B[AI Model]\n    A --\u003e B\n    B --\u003e C[Tailored Response]\n    end\n```\n\n## RAG with your own documents\n\nIn just a few steps, you can set up the RAG system for your own documents. 
Follow the instructions below:\n\n### Install Prerequisites\n\n- **Python** - Install Python by following [this guide](https://wiki.python.org/moin/BeginnersGuide/Download).\n- **Git** - Install Git by following [these instructions](https://docs.github.com/en/get-started/getting-started-with-git/set-up-git).\n\n### Clone the Repository\n\nClone the RAGify repository to your local machine:\n\n```bash\ngit clone https://github.com/kanad13/RAGify.git\ncd RAGify\n```\n\n### Create a Virtual Environment\n\nCreate a virtual environment to manage dependencies:\n\n```bash\npython -m venv rag_venv\n```\n\nActivate the virtual environment:\n\n- On Windows:\n\n```sh\n.\\rag_venv\\Scripts\\activate\n```\n\n- On macOS or Linux:\n\n```sh\nsource rag_venv/bin/activate\n```\n\n### Install Required Packages\n\nInstall all the necessary Python packages:\n\n```bash\npip install -r requirements.txt\n```\n\n### Set Up LLM\n\nRAGify utilizes the open-source Meta Llama model, hosted by Groq. To use it, you'll need an API key. Follow [these instructions](https://console.groq.com/docs/quickstart) to obtain your key.\n\nIf you prefer using a different LLM, you can obtain API keys from the following providers:\n\n- [OpenAI](https://platform.openai.com/docs/quickstart)\n- [Gemini](https://ai.google.dev/gemini-api/docs)\n- [Anthropic](https://docs.anthropic.com/en/docs/quickstart)\n- [Mistral](https://docs.mistral.ai/api/)\n\nAlternatively, you can set up an LLM on your own machine using:\n\n- [GPT4All](https://github.com/nomic-ai/gpt4all)\n- [Llama.cpp](https://github.com/ggerganov/llama.cpp)\n- [LocalAI](https://github.com/mudler/LocalAI)\n\n### Configure API Key\n\nOnce you have your API key, create a `.env` file in the root directory of the cloned repository. Add the following line to the `.env` file:\n\n```text\nGROQ_API_KEY=\"your_key\"\n```\n\nThis key will be loaded at runtime into the RAGify system, keeping it private and secure on your machine. 
The [python-dotenv package](https://pypi.org/project/python-dotenv/) handles this process.\n\n### Prepare Your Input Files\n\nPlace all your documents inside the `input_files` folder. You may remove the existing files if needed.\n\n### Access the RAGify Application\n\nYou are now ready to interact with your documents. You have two options:\n\n- **Option 1**: Run the code in a Jupyter notebook if you're technically inclined. See the notebook [here](./RAGify-full_code.ipynb).\n- **Option 2**: Access the system via a Streamlit web app. Open a terminal, navigate to the root of the repository, and run the following command:\n\n```bash\nstreamlit run Welcome.py\n```\n\n## Acknowledgements\n\nRAGify is powered by a combination of open and closed-source technologies. I am grateful for the contributions of the following initiatives and organizations:\n\n- [Python](https://github.com/python) - The backbone of RAGify's codebase.\n- [PyPDF](https://pypdf.readthedocs.io/en/latest/index.html) - It is used for text extraction and processing from PDF documents.\n- [Hugging Face](https://huggingface.co/sentence-transformers/all-mpnet-base-v2) - The `all-mpnet-base-v2` SentenceTransformer model is used to generate embeddings for semantic search.\n- [Facebook](https://faiss.ai) - Facebook AI Similarity Search (FAISS) is used for performing similarity searches.\n- [Langchain](https://python.langchain.com/v0.1/docs/modules/data_connection/document_transformers/recursive_text_splitter/) - This RecursiveCharacterTextSplitter is used for breaking down large text into manageable chunks, optimizing them for embedding and retrieval.\n- [Meta](https://github.com/meta-llama/llama-models/blob/main/models/llama3_1/MODEL_CARD.md) - The Llama 3.1 8B Model serves as the LLM behind RAGify's intelligent responses.\n- [Groq](https://wow.groq.com/why-groq/) - The LLM is hosted on Groq Language Processing Unit and inferences are provided through an API.\n- [Streamlit](https://streamlit.io/) - Streamlit 
provides the technology to build and host the RAGify chatbot.\n- **The Broader AI/ML Community** - A special thanks to the AI/ML community whose ongoing research and open-source contributions have laid the foundation for this project.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fkanad13%2Fragify","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fkanad13%2Fragify","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fkanad13%2Fragify/lists"}