https://github.com/shaheennabi/end-to-end-qa-genai-project
QAChatbot: Smart Document Q&A. Transform your documents into an interactive Q&A experience! Upload PDFs or files, and let the bot, powered by the Gemini Pro API and built with Streamlit, deliver accurate answers in a snap!
- Host: GitHub
- URL: https://github.com/shaheennabi/end-to-end-qa-genai-project
- Owner: shaheennabi
- License: mit
- Created: 2024-11-10T17:09:12.000Z (6 months ago)
- Default Branch: main
- Last Pushed: 2024-11-12T18:10:51.000Z (6 months ago)
- Last Synced: 2025-03-26T06:30:38.726Z (about 2 months ago)
- Topics: embeddings, gemini, llama-index, llms, modular-coding, rag, streamlit
- Language: Jupyter Notebook
- Homepage:
- Size: 87.9 KB
- Stars: 2
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# **End-to-End QA GenAI Project**
Welcome to the **End-to-End QA GenAI Project**, where **cutting-edge technology** meets **seamless user experience**! This project implements a **Retrieval-Augmented Generation (RAG)** system with a **Streamlit UI** for uploading and processing **PDF documents**. It leverages the **Google Gemini API** (featuring the powerful **Gemini Pro model**) for high-performance **natural language understanding** and **generation**. The **LlamaIndex** framework is used for **efficient document indexing** and **retrieval**, enabling fast and accurate query responses.
With this system, you can interact with documents in a whole new way: **upload**, **query**, and **generate** intelligent responses!
---
## **Features**
- **Streamlit UI** for seamless PDF uploads and interactions.
- **Google Gemini API** integration for **powerful language generation** using the **Gemini Pro model**.
- **LlamaIndex** framework for **fast and scalable document indexing** and retrieval.
- **Modular Code Structure** for easy maintainability and modification.
- **High-performance QA generation** based on uploaded documents and queries.
---
## **Architecture**
The system is built with a **modular** design for scalability and easy maintenance. Below are the core components that make everything work:
1. **Streamlit UI**: Upload PDF files and interact with the system seamlessly.
2. **Document Preprocessing**: Extracts content from PDFs for indexing and retrieval.
3. **LlamaIndex Integration**: Indexes document content for **fast search** and retrieval.
4. **Google Gemini API**: Processes queries and generates responses using the **Gemini Pro model**.
5. **Modular Code**: Cleanly separated components for easy updates and improvements.
---
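As a rough illustration of the Document Preprocessing and LlamaIndex Integration components listed above, the sketch below splits extracted text into overlapping word-based chunks, which is the usual preparation step before indexing. It uses only the standard library; the function name `split_into_chunks` and the chunk sizes are assumptions made for illustration and are not taken from this repository. LlamaIndex ships its own text splitters, so treat this as a stand-in that shows the idea.

```python
from typing import List


def split_into_chunks(text: str, chunk_size: int = 200, overlap: int = 50) -> List[str]:
    """Split text into word-based chunks that overlap by `overlap` words.

    Hypothetical helper for illustration; the default sizes are arbitrary.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    words = text.split()
    step = chunk_size - overlap  # how far each window advances
    chunks: List[str] = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # this window already reached the end of the text
    return chunks


if __name__ == "__main__":
    sample = " ".join(f"word{i}" for i in range(10))
    for chunk in split_into_chunks(sample, chunk_size=4, overlap=1):
        print(chunk)
```

The overlap keeps a sentence that straddles a chunk boundary visible in both neighbouring chunks, at the cost of indexing some words twice.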
## **Setup**
Get ready to set up your local environment and dive into this **powerful GenAI system**! Follow the steps below to get started:
### **Prerequisites**
- Python 3.9 or higher.
- **Google Gemini API** credentials.
- **LlamaIndex** library.
### **Installation**
1. Clone the repository:
   ```bash
   git clone https://github.com/shaheennabi/End-to-End-QA-GenAI-Project.git
   cd End-to-End-QA-GenAI-Project
   ```
2. Create and activate a Python virtual environment:
   ```bash
   python3.9 -m venv venv
   source venv/bin/activate # On Windows, use `venv\Scripts\activate`
   ```
3. Install the required Python packages:
   ```bash
   pip install -r requirements.txt
   ```
4. Set up your **Google Gemini API** credentials. Follow [Google's official documentation](https://cloud.google.com/docs/authentication/getting-started) for setting up the API credentials.
5. Install **LlamaIndex**:
   ```bash
   pip install llama-index
   ```
---
## **Running the Application**
Now that you're all set up, let's get the app running:
1. Start the **Streamlit app**:
   ```bash
   streamlit run app.py
   ```
2. Open your browser and visit [http://localhost:8501](http://localhost:8501) to interact with the app.
3. **Upload a PDF file** and query it for intelligent responses generated using **Gemini Pro**!
---
## **Workflow**
Here's how the magic happens:
1. **Upload PDF**: The user uploads a PDF file through the **Streamlit UI**.
2. **Text Extraction**: The system extracts text content from the uploaded PDF.
3. **Document Indexing**: The text is indexed using **LlamaIndex** for **quick retrieval**.
4. **Query Submission**: The user submits a natural language query through the UI.
5. **Response Generation**: The system retrieves the relevant passages from the index and passes them, together with the query, to the **Gemini Pro model** via the **Gemini API** to generate a **natural language response**.
---
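The workflow steps above can be sketched end to end with standard-library stand-ins. In this sketch, bag-of-words cosine similarity takes the place of LlamaIndex retrieval, and the `generate` callable takes the place of the Gemini API call. Every function name here is hypothetical and none of it is taken from the repository; it only shows how retrieval feeds the generation step.

```python
import math
import re
from collections import Counter
from typing import Callable, List


def tokenize(text: str) -> List[str]:
    """Lowercase the text and keep alphanumeric tokens only."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(chunks: List[str], query: str, top_k: int = 2) -> List[str]:
    """Return the top_k chunks most similar to the query (toy retriever)."""
    q = Counter(tokenize(query))
    ranked = sorted(chunks, key=lambda c: cosine(Counter(tokenize(c)), q), reverse=True)
    return ranked[:top_k]


def build_prompt(context: List[str], query: str) -> str:
    """Combine retrieved passages and the question into one prompt."""
    joined = "\n".join(f"- {c}" for c in context)
    return f"Answer using only this context:\n{joined}\n\nQuestion: {query}"


def answer(chunks: List[str], query: str, generate: Callable[[str], str], top_k: int = 2) -> str:
    """Retrieve context, build the prompt, and delegate to the language model.

    `generate` is a placeholder for whatever function calls the model.
    """
    return generate(build_prompt(retrieve(chunks, query, top_k), query))


if __name__ == "__main__":
    docs = [
        "The warranty lasts two years.",
        "Shipping takes five days.",
        "Returns are accepted within thirty days.",
    ]
    # Echo the prompt instead of calling a real model.
    print(answer(docs, "How long does the warranty last?", generate=lambda p: p, top_k=1))
```

Passing the model call in as a parameter keeps the retrieval logic testable without network access or API credentials.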
## **API Integration**
This project utilizes the **Google Gemini API** (with the **Gemini Pro model**) for natural language generation. To interact with the API, you must set up your **Google Gemini API credentials** in the `config.py` file.
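
The contents of `config.py` are not shown here, so the following is only one possible shape for it. It assumes the key is supplied through an environment variable named `GOOGLE_API_KEY`; both that variable name and the helper `load_gemini_api_key` are assumptions for illustration. Reading the key from the environment keeps it out of version control.

```python
import os
from typing import Mapping, Optional


def load_gemini_api_key(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the Gemini API key from the environment, or fail with a clear message.

    The variable name GOOGLE_API_KEY is an assumption, not confirmed by the repository.
    `env` can be overridden with a plain dict, which makes the function easy to test.
    """
    env = os.environ if env is None else env
    key = env.get("GOOGLE_API_KEY", "").strip()
    if not key:
        raise RuntimeError("GOOGLE_API_KEY is not set; export it before starting the app.")
    return key
```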
---
## **Future Enhancements**
- **Multilingual Support**: Extend the system to support **multiple languages** for text generation.
- **Document Summarization**: Automatically generate summaries for **long documents**.
- **Advanced Search Features**: Add advanced search and filtering capabilities for document retrieval.
---
## **Contributing**
We welcome contributions to improve this project! To contribute:
1. Fork the repository.
2. Create a new branch for your changes.
3. Make your changes and commit them.
4. Open a pull request for review.
---
## **License**
This project is licensed under the **MIT License**. See the [LICENSE](LICENSE) file for more details.
---
## **Acknowledgments**
A special thank you to the following technologies and resources that made this project possible:
- **Google Gemini API**: For providing powerful AI capabilities.
- **LlamaIndex**: For efficient document indexing and retrieval.
- **Streamlit**: For creating beautiful and user-friendly web interfaces.
- **Python 3.9**: The language powering this entire project.
- **Contributors**: For making this project even better!
---
## **Star the Project**
If you love this project, don't forget to **star** it on GitHub! It helps us keep the project alive and motivates us to keep improving it.
---
## **Let's Build the Future Together**
Ready to jump in? Clone the repository, install the dependencies, and start exploring this **next-gen AI-powered QA system**!