{"id":24401611,"url":"https://github.com/asifafridii44/langchain_rag_project_2","last_synced_at":"2025-03-13T06:40:22.640Z","repository":{"id":272215065,"uuid":"915853424","full_name":"asifafridii44/LangChain_RAG_project_2","owner":"asifafridii44","description":"LangChain RAG Implementation with Google GenAI and Pinecone This project demonstrates a Retrieval-Augmented Generation (RAG) pipeline using LangChain, Pinecone, and Google Generative AI models. It includes embedding generation, vector storage, and a seamless integration to handle and retrieve contextual responses.","archived":false,"fork":false,"pushed_at":"2025-01-13T00:48:50.000Z","size":8,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-01-20T22:57:01.937Z","etag":null,"topics":["agentic-ai","agentic-rag","ai","langchain","openai","openai-api","piaic","python3"],"latest_commit_sha":null,"homepage":"","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/asifafridii44.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2025-01-13T00:46:16.000Z","updated_at":"2025-01-13T00:54:33.000Z","dependencies_parsed_at":null,"dependency_job_id":"21ab3a98-0d26-4794-b3d4-922adae16d37","html_url":"https://github.com/asifafridii44/LangChain_RAG_project_2","commit_stats":null,"previous_names":["asifafridii44/langchain_rag_project_2"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/asifafridii44%2FLangChain
_RAG_project_2","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/asifafridii44%2FLangChain_RAG_project_2/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/asifafridii44%2FLangChain_RAG_project_2/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/asifafridii44%2FLangChain_RAG_project_2/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/asifafridii44","download_url":"https://codeload.github.com/asifafridii44/LangChain_RAG_project_2/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":243358211,"owners_count":20277989,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["agentic-ai","agentic-rag","ai","langchain","openai","openai-api","piaic","python3"],"created_at":"2025-01-20T00:32:33.460Z","updated_at":"2025-03-13T06:40:22.620Z","avatar_url":"https://github.com/asifafridii44.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"# LangChain_RAG_project_2\nLangChain RAG Implementation with Google GenAI and Pinecone. This project demonstrates a Retrieval-Augmented Generation (RAG) pipeline using LangChain, Pinecone, and Google Generative AI models. 
It covers embedding generation, vector storage, and retrieval of stored context to ground generated responses.\n\n# LangChain RAG Implementation with Google GenAI and Pinecone\n\n## Overview\nThis repository demonstrates a practical implementation of a Retrieval-Augmented Generation (RAG) pipeline using LangChain, Pinecone, and Google Generative AI models. It combines embedding generation, vector storage, and an LLM to retrieve relevant context and generate responses grounded in it.\n\nThe project is implemented in a Jupyter Notebook on Google Colab, making it easy to set up and run.\n\n## Features\n- **Embedding Model**: Uses Google Generative AI embeddings (`models/embedding-001`) to encode text as vectors.\n- **Vector Store**: Stores and retrieves vectorized data using Pinecone.\n- **Retrieval-Augmented Generation**: Combines retrieval and generation to produce context-aware responses.\n- **PDF Loading**: Demonstrates loading and processing PDF documents as context for the pipeline.\n- **LLM Integration**: Uses Google's Gemini model (`gemini-2.0-flash-exp`) to generate responses.\n\n## Installation\nInstall the following dependencies:\n\n```bash\npip install -qU langchain-pinecone langchain-google-genai langchain requests pypdf langchain-community docling-core python-dotenv\n```\n\n## Usage\n\n### 1. Setting Up Environment Variables\nDefine the following environment variables in your system or `.env` file:\n- `GOOGLE_API_KEY`: API key for Google Generative AI.\n- `PINECONE_API_KEY`: API key for Pinecone.\n\n### 2. Code Walkthrough\n\n#### a. 
Setting up Embedding and Vector Store\n```python\nimport os\n\nfrom langchain_google_genai import GoogleGenerativeAIEmbeddings\nfrom langchain_pinecone import PineconeVectorStore\nfrom pinecone import Pinecone\n\n# Set up Google Generative AI embeddings (reads GOOGLE_API_KEY from the environment)\nembeddings = GoogleGenerativeAIEmbeddings(model=\"models/embedding-001\")\n\n# Connect to an existing Pinecone index and wrap it as a vector store\npinecone_api_key = os.getenv(\"PINECONE_API_KEY\")\npc = Pinecone(api_key=pinecone_api_key)\nindex_name = \"rag-1\"  # the index must already exist\nindex = pc.Index(index_name)\nvector_store = PineconeVectorStore(index=index, embedding=embeddings)\n```\n\n#### b. Loading PDF Files as Context\n```python\nimport requests\nfrom langchain_community.document_loaders import PyPDFLoader\n\nurl = \"https://raw.githubusercontent.com/[user]/[repo]/main/sample.pdf\"\nfilename = \"sample.pdf\"\n\n# Download the PDF and fail early on an HTTP error\nresponse = requests.get(url)\nresponse.raise_for_status()\nwith open(filename, \"wb\") as f:\n    f.write(response.content)\n\n# Load the PDF as one document per page\nloader = PyPDFLoader(filename)\ndocuments = loader.load()\n\n# Index the pages so the retriever can search them\nvector_store.add_documents(documents)\n```\n\n#### c. Building the RAG Chain\n```python\nfrom langchain_core.output_parsers import StrOutputParser\nfrom langchain_core.prompts import ChatPromptTemplate\nfrom langchain_core.runnables import RunnablePassthrough\nfrom langchain_google_genai import ChatGoogleGenerativeAI\n\nretriever = vector_store.as_retriever()\nllm = ChatGoogleGenerativeAI(model=\"gemini-2.0-flash-exp\")\n\ntemplate = \"\"\"Answer this user query: {question}\nHere's some information that might be helpful: {context}\"\"\"\nprompt_template = ChatPromptTemplate.from_template(template)\n\n# Fill {context} from the retriever and pass the raw query through as {question}\nrag_chain = (\n    {\"context\": retriever, \"question\": RunnablePassthrough()}\n    | prompt_template\n    | llm\n    | StrOutputParser()\n)\n\nquery = \"Tell me about surname\"\nresponse = rag_chain.invoke(query)\nprint(response)\n```\n\n## Results\n- The RAG pipeline retrieves relevant context from the vector store and uses the Gemini model to generate responses.\n- Contextual queries are enhanced by combining document embeddings with generative AI 
capabilities.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fasifafridii44%2Flangchain_rag_project_2","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fasifafridii44%2Flangchain_rag_project_2","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fasifafridii44%2Flangchain_rag_project_2/lists"}