Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/abdelrahman-amen/question_answering_from_any_url_with_rag
In this project, users can input any URL and ask a question related to its content. Using Retrieval-Augmented Generation (RAG) and LangChain, the app retrieves the most relevant answer from the webpage.
https://github.com/abdelrahman-amen/question_answering_from_any_url_with_rag
api dotenv embeddings genai generative-ai langchain python retreival-augmented-generation streamlit vectorstore webbased
Last synced: 7 days ago
JSON representation
In this project, users can input any URL and ask a question related to its content. Using Retrieval-Augmented Generation (RAG) and LangChain, the app retrieves the most relevant answer from the webpage.
- Host: GitHub
- URL: https://github.com/abdelrahman-amen/question_answering_from_any_url_with_rag
- Owner: Abdelrahman-Amen
- Created: 2025-01-31T16:18:12.000Z (8 days ago)
- Default Branch: main
- Last Pushed: 2025-01-31T16:38:07.000Z (8 days ago)
- Last Synced: 2025-01-31T17:29:05.659Z (8 days ago)
- Topics: api, dotenv, embeddings, genai, generative-ai, langchain, python, retreival-augmented-generation, streamlit, vectorstore, webbased
- Language: Python
- Homepage:
- Size: 1.84 MB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
Awesome Lists containing this project
README
# 🚀 Question-Answering from Any URL with RAG and LangChain 🌐
Welcome to the Question-Answering from Any URL project! This repository showcases a cutting-edge implementation of Retrieval-Augmented Generation (RAG) combined with the power of LangChain and Google Generative AI for seamless question-answering. 🧠✨
![Image](https://github.com/user-attachments/assets/89de9f8e-9531-422a-a2cd-dac14eed3d98)
# 🔍 What is RAG (Retrieval-Augmented Generation)?
Retrieval-Augmented Generation is an advanced AI framework that combines retrieval-based systems with generative models. Instead of relying solely on pre-trained knowledge, RAG enhances the model's capability by retrieving contextually relevant information from external sources like documents or web pages. This enables the generation of accurate, up-to-date, and context-aware answers. 💡### How it works in this project:
1. Retrieve: Content is extracted from the provided URL.
2. Embed: The content is processed into vector representations using Google Generative AI embeddings.
3. Generate: The user's question is matched with the most relevant content, and the best answer is presented.
# What I Built in This Project 🛠️
This project is a web-based application where users can:
1. Input a URL: Provide the URL of any publicly accessible web page.
2. Ask a Question: Enter a question related to the content of the web page.
3. Receive an Answer: Get the most relevant answer, retrieved from the content and matched with the user's query.
# Core Features 🔗
• LangChain Framework: Used to streamline the integration of loaders, splitters, embeddings, and vector stores.
• Google Generative AI: Leverages the powerful embedding-001 model for creating embeddings.
• Web Content Loader: The WebBaseLoader extracts and prepares data from the given URL.
• Text Splitting: Documents are divided into manageable chunks using the RecursiveCharacterTextSplitter to ensure smooth processing.
• Vector Search with FAISS: Uses FAISS for high-speed similarity search and matching the query to the most relevant chunk of content.
• Streamlit UI: A clean and interactive user interface for a seamless experience.
# How It Works 💻
1.User Inputs:
• Enter a URL (e.g., a blog post, article, or documentation page).
• Type a question about the content of the page.
2.Processing Pipeline:
• The content is loaded using WebBaseLoader.
• It is split into smaller chunks for better analysis.
• Google Generative AI embeddings are generated for each chunk.
• A vector store is created, enabling fast similarity search.
3.Query Matching:
• The user's question is transformed into an embedding.
• The system searches the vector store for the most relevant content.
4.Results:
• The top result is displayed to the user as the answer to their question.
# Why Use This Project 🌟?
• Dynamic Information: Unlike static knowledge in pre-trained models, this app fetches the latest information from the web.
• Customizable: Easily extendable to support different types of embeddings, content sources, or models.
• User-Friendly: Intuitive Streamlit interface for easy interaction.
• Educational: A great resource to understand RAG, LangChain, and how to build end-to-end NLP applications.
# Demo 📽
Below is a demonstration of how the application works:
![Demo of the Application](https://github.com/Abdelrahman-Amen/Question_Answering_from_Any_URL_with_RAG/blob/main/Demo.gif)