https://github.com/abdelrahman-amen/rag_agent

This project uses LangChain agents and Google Generative AI to build a RAG system, combining LLMs with tools like Wikipedia, Arxiv, and custom retrievers for accurate, real-time answers.
agents api arxiv-papers dotenv embeddings faiss langchain prompt-engineering python retreival-augmented-generation tools vectorstore webloader wikipedia-api

Host: GitHub
URL: https://github.com/abdelrahman-amen/rag_agent
Owner: Abdelrahman-Amen
Created: 2025-02-04T15:27:08.000Z (8 months ago)
Default Branch: main
Last Pushed: 2025-02-05T14:33:36.000Z (8 months ago)
Last Synced: 2025-04-05T15:46:49.120Z (6 months ago)
Topics: agents, api, arxiv-papers, dotenv, embeddings, faiss, langchain, prompt-engineering, python, retreival-augmented-generation, tools, vectorstore, webloader, wikipedia-api
Language: Jupyter Notebook
Homepage:
Size: 12.7 KB
Stars: 0
Watchers: 1
Forks: 1
Open Issues: 0
Metadata Files:
- Readme: README.md

# 🚀 Retrieval-Augmented Generation (RAG) with Agents in LangChain and Google Generative AI 🌟

![Image](https://github.com/user-attachments/assets/857e6cff-b1db-4a06-ae90-b88ef7cf63ef)

# 📚 What is Retrieval-Augmented Generation (RAG)?
Retrieval-Augmented Generation (RAG) enhances the capabilities of language models by combining them with retrieval mechanisms. Instead of depending solely on the pre-trained knowledge within a model, RAG retrieves real-time, relevant data from external sources such as APIs, web pages, or document databases. This approach helps keep the generated responses accurate, contextually rich, and grounded in the retrieved information.
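
The retrieve-then-generate pattern can be sketched in plain Python. This is a minimal illustration, not the project's code: `DOCS`, `tokenize`, `retrieve`, and `build_prompt` are hypothetical names, and simple word overlap stands in for embedding-based retrieval. The resulting prompt would then be passed to a language model.

```python
import re

# Hypothetical mini-corpus standing in for an external knowledge source.
DOCS = [
    "LangSmith is a platform for tracing and evaluating LLM applications.",
    "FAISS is a library for efficient similarity search over dense vectors.",
    "Arxiv hosts preprints of scientific papers.",
]

def tokenize(text: str) -> set[str]:
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(query: str, docs: list[str], k: int = 1) -> list[str]:
    """Return the k documents that share the most words with the query."""
    q = tokenize(query)
    ranked = sorted(docs, key=lambda d: len(q & tokenize(d)), reverse=True)
    return ranked[:k]

def build_prompt(query: str, docs: list[str]) -> str:
    """Put the retrieved context ahead of the question (the 'augmented' step)."""
    context = "\n".join(retrieve(query, docs))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"
```

Calling `build_prompt("What is FAISS?", DOCS)` places the FAISS sentence in the context, because it shares the most words with the question.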

# 🤖 What is an Agent in LangChain?
An Agent is a sophisticated system within LangChain that acts as a decision-making layer. It interprets user queries, determines which tools (retrievers, APIs, or external services) are required to process the query, and then synthesizes the retrieved information into a coherent response.

In this project, the agent plays a critical role in:

• Binding the Google Generative AI LLM with tools such as Wikipedia and Arxiv APIs.

• Orchestrating a seamless interaction between user inputs, external data sources, and the language model to provide informed and accurate outputs.

• Utilizing LangChain’s utilities to format intermediate steps (earlier tool calls and their results) for the model and to parse its responses.
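
The decision loop described above can be sketched roughly as follows. This is an assumption-laden toy, not LangChain's agent implementation: `fake_model` is a hypothetical rule-based stand-in for the LLM's tool choice, and the tools are stubs that return canned strings instead of calling the real Wikipedia or Arxiv APIs.

```python
from typing import Callable, Optional

# Stub tools: each maps a query string to an observation string.
TOOLS: dict[str, Callable[[str], str]] = {
    "wikipedia": lambda q: f"[wikipedia summary for: {q}]",
    "arxiv": lambda q: f"[arxiv abstract for: {q}]",
}

def fake_model(query: str, steps: list[tuple[str, str]]) -> Optional[str]:
    """Stand-in for the LLM: name the next tool to call, or None when done."""
    if steps:  # one observation is enough for this toy policy
        return None
    return "arxiv" if "paper" in query.lower() else "wikipedia"

def run_agent(query: str, max_steps: int = 3) -> str:
    """Ask the model for a tool, run it, record the step, then answer."""
    steps: list[tuple[str, str]] = []  # (tool name, observation) pairs
    for _ in range(max_steps):
        tool_name = fake_model(query, steps)
        if tool_name is None:
            break
        steps.append((tool_name, TOOLS[tool_name](query)))
    observations = "; ".join(obs for _, obs in steps)
    return f"Answer based on: {observations}"
```

The `steps` list plays the role of the intermediate steps: it is handed back to the model on every iteration so the model can decide whether another tool call is needed.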

# 🛠️ Project Workflow

### 1. Environment Configuration:

• The project uses google.generativeai and langchain_community modules to interact with Google Generative AI and external tools.

• Environment variables (API keys) are loaded via dotenv to securely access external services.
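
The project loads its keys with dotenv; the sketch below is a standard-library stand-in that shows the same idea without extra dependencies. The helper names (`parse_env_text`, `load_env_file`, `require_key`) are hypothetical, and the variable name `GOOGLE_API_KEY` in the usage note is an assumption, not something the README states.

```python
import os
from pathlib import Path

def parse_env_text(text: str) -> dict[str, str]:
    """Parse KEY=VALUE lines, skipping blanks, comments, and malformed lines."""
    values: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip("'\"")
    return values

def load_env_file(path: str = ".env") -> None:
    """Copy values from the file into os.environ without overriding existing ones."""
    env_path = Path(path)
    if env_path.exists():
        text = env_path.read_text(encoding="utf-8")
        for key, value in parse_env_text(text).items():
            os.environ.setdefault(key, value)

def require_key(name: str) -> str:
    """Fail early with a clear message when a required key is missing."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(f"Set {name} in the environment or in a .env file")
    return value
```

A notebook would typically call `load_env_file()` once at the top and then something like `require_key("GOOGLE_API_KEY")` before creating any client, so that a missing key surfaces immediately rather than at the first API call.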

### 2. Document Loading and Splitting:

• A WebBaseLoader is used to fetch documents from a specific URL.

• The loaded documents are split into manageable chunks using RecursiveCharacterTextSplitter for effective retrieval and embedding.
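
To illustrate what recursive character splitting does, here is a simplified sketch. It is not LangChain's RecursiveCharacterTextSplitter: `split_text` is a hypothetical function, the separator order is an assumption, and chunk overlap is left out for brevity.

```python
def split_text(text: str, chunk_size: int = 200,
               separators: tuple[str, ...] = ("\n\n", "\n", " ", "")) -> list[str]:
    """Split text into chunks of at most chunk_size characters,
    trying coarse separators first and finer ones only when needed."""
    if len(text) <= chunk_size:
        return [text] if text.strip() else []
    if not separators or separators[0] == "":
        # Last resort: cut into fixed-size windows.
        return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]
    sep, finer = separators[0], separators[1:]
    if sep not in text:
        return split_text(text, chunk_size, finer)

    chunks: list[str] = []
    current = ""
    for piece in text.split(sep):
        candidate = piece if not current else current + sep + piece
        if len(candidate) <= chunk_size:
            current = candidate  # keep packing pieces into the current chunk
            continue
        if current:
            chunks.append(current)
            current = ""
        if len(piece) <= chunk_size:
            current = piece
        else:
            # The piece alone is too long: recurse with finer separators.
            chunks.extend(split_text(piece, chunk_size, finer))
    if current:
        chunks.append(current)
    return chunks
```

Trying paragraph breaks before line breaks before spaces keeps related sentences together where possible, which tends to produce chunks that make sense on their own when retrieved.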

### 3. Embeddings and Vector Store:

• Google Generative AI Embeddings are generated for the document chunks using the embedding-001 model.

• A FAISS Vector Store is created to facilitate fast and accurate retrieval of relevant information.
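
The embed-then-search idea can be shown with a toy stand-in. Nothing here calls Google's embedding model or FAISS: `embed` is a hypothetical hashing-based embedding, `TinyVectorStore` is a hypothetical brute-force index, and `DIM = 64` is an arbitrary choice for the sketch.

```python
import hashlib
import math
import re

DIM = 64  # toy dimensionality chosen for this sketch

def embed(text: str) -> list[float]:
    """Hash each word into one of DIM buckets, then L2-normalize the counts."""
    vec = [0.0] * DIM
    for word in re.findall(r"[a-z0-9]+", text.lower()):
        bucket = int(hashlib.sha256(word.encode("utf-8")).hexdigest(), 16) % DIM
        vec[bucket] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

class TinyVectorStore:
    """Brute-force stand-in for a vector index: ranks texts by cosine similarity."""

    def __init__(self, texts: list[str]) -> None:
        self.texts = list(texts)
        self.vectors = [embed(t) for t in self.texts]

    def search(self, query: str, k: int = 2) -> list[tuple[float, str]]:
        """Return the k best (score, text) pairs, highest score first."""
        q = embed(query)
        # Vectors are unit length, so the dot product equals cosine similarity.
        scored = [
            (sum(a * b for a, b in zip(q, v)), t)
            for v, t in zip(self.vectors, self.texts)
        ]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return scored[:k]
```

A real embedding model captures meaning rather than exact word matches, and FAISS avoids scanning every vector on large collections, but the store-then-rank flow is the same.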

### 4. Tool Integration:

• Wikipedia API Wrapper: Retrieves concise summaries from Wikipedia for user queries.

• Arxiv API Wrapper: Fetches and summarizes scientific papers based on user input.

• Custom Retriever Tool: Searches the loaded documents for information about LangSmith using the FAISS retriever.
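
Conceptually, each tool is a name, a description the model reads when choosing, and a function to run. The sketch below uses hypothetical names (`Tool`, `make_retriever_tool`, `langsmith_search`) and made-up sample passages; substring matching stands in for the FAISS lookup.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass(frozen=True)
class Tool:
    name: str          # identifier the model uses to request the tool
    description: str   # tells the model when the tool applies
    func: Callable[[str], str]

    def run(self, query: str) -> str:
        return self.func(query)

def make_retriever_tool(name: str, description: str, chunks: list[str]) -> Tool:
    """Wrap a document search as a tool (substring match stands in for FAISS)."""
    def search(query: str) -> str:
        hits = [c for c in chunks if query.lower() in c.lower()]
        return "\n".join(hits) if hits else "No matching passages."
    return Tool(name, description, search)

# Illustrative passages only; the real tool searches the loaded web documents.
langsmith_tool = make_retriever_tool(
    "langsmith_search",
    "Search the loaded documents for information about LangSmith.",
    ["LangSmith traces each step of a chain.",
     "Datasets in LangSmith support evaluation runs."],
)
wikipedia_tool = Tool(
    "wikipedia",
    "Look up a short encyclopedia summary.",
    lambda q: f"[stub summary for {q}]",
)
TOOLBOX = {tool.name: tool for tool in (langsmith_tool, wikipedia_tool)}
```

Keeping the description next to the function matters because the description is the only signal the model has for choosing between tools.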

### 5. Prompt Design and LLM Binding:

• A ChatPromptTemplate is defined to format user inputs and responses into structured prompts for the language model.

• The Google Generative AI Chat model (gemini-pro) is initialized and bound to the defined tools.
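
One way to picture prompt formatting and tool binding is as building a single request payload: the filled-in messages plus a machine-readable description of each tool. The sketch below is an assumption about the general shape, not the project's ChatPromptTemplate or the Gemini request format; `PROMPT_TEMPLATE`, `format_messages`, `tool_schema`, `build_request`, and the `scratchpad` parameter are hypothetical.

```python
from typing import Optional

# Hypothetical two-message template; {input} is filled with the user's question.
PROMPT_TEMPLATE = [
    ("system", "You are a helpful assistant. Use tools when they help."),
    ("human", "{input}"),
]

def format_messages(user_input: str, scratchpad: list[dict]) -> list[dict]:
    """Fill the template, then append earlier tool calls and their results."""
    messages = [
        {"role": role, "content": text.format(input=user_input)}
        for role, text in PROMPT_TEMPLATE
    ]
    return messages + scratchpad

def tool_schema(name: str, description: str) -> dict:
    """Describe a tool that takes one string argument, in JSON-schema style."""
    return {
        "name": name,
        "description": description,
        "parameters": {
            "type": "object",
            "properties": {"query": {"type": "string"}},
            "required": ["query"],
        },
    }

def build_request(user_input: str, tools: dict[str, str],
                  scratchpad: Optional[list[dict]] = None) -> dict:
    """Bundle messages and tool schemas into one payload for the model."""
    return {
        "messages": format_messages(user_input, scratchpad or []),
        "tools": [tool_schema(n, d) for n, d in tools.items()],
    }
```

"Binding" in this picture simply means that every request carries the tool schemas, so the model can answer either with text or with a request to call one of the tools.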

### 6. Agent Execution:

• The agent uses the tools to handle queries such as:

  • Retrieving details about LangSmith from the documents indexed by the custom retriever tool.

  • Fetching scientific paper summaries from Arxiv.

  • Providing general knowledge answers using Wikipedia.

• The model’s output is parsed into either a tool call or a final answer using LangChain’s OpenAIFunctionsAgentOutputParser.
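
The parsing step can be sketched as a function that turns a model message into either "call this tool" or "return this answer". This is a simplified illustration, not the code of OpenAIFunctionsAgentOutputParser: the `Action` and `Finish` classes are hypothetical, and the message layout (a `function_call` entry with `name` and JSON-encoded `arguments`) is an assumption modelled on function-calling style responses.

```python
import json
from dataclasses import dataclass
from typing import Union

@dataclass
class Action:
    tool: str          # name of the tool the model wants to call
    tool_input: dict   # decoded arguments for that tool

@dataclass
class Finish:
    output: str        # final answer to return to the user

def parse_model_message(message: dict) -> Union[Action, Finish]:
    """Return an Action when the message requests a tool, else a Finish."""
    call = message.get("function_call")
    if call is None:
        return Finish(output=message.get("content", ""))
    raw_args = call.get("arguments") or "{}"
    try:
        args = json.loads(raw_args)
    except json.JSONDecodeError as exc:
        raise ValueError(f"Tool arguments are not valid JSON: {raw_args!r}") from exc
    return Action(tool=call["name"], tool_input=args)
```

For example, a message whose `function_call` names `arxiv` with arguments `{"query": "RAG"}` becomes an `Action`, while a message with only `content` becomes a `Finish` and ends the loop.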

# 🌟 Key Features of This Project

• Combines Google Generative AI with retrieval-based tools for real-time, reliable information generation.

• Implements multiple data sources (Wikipedia, Arxiv, and custom web documents) for diverse query handling.

• Leverages LangChain’s Agent Framework to intelligently select tools and process user inputs.

• Provides a reusable and extensible architecture for Retrieval-Augmented Generation tasks.

#### This system demonstrates how agents can be used within RAG to build scalable, robust, and domain-specific AI solutions.
