https://github.com/wittyicon29/qabot-with-conversational-memory
A natural language query agent with conversational memory that answers questions over a set of web pages and a PDF, using the Groq Cloud API.
chromadb groq-api langchain llms python rag streamlit
- Host: GitHub
- URL: https://github.com/wittyicon29/qabot-with-conversational-memory
- Owner: wittyicon29
- License: gpl-3.0
- Created: 2024-06-26T11:16:23.000Z (almost 2 years ago)
- Default Branch: main
- Last Pushed: 2024-06-26T17:57:16.000Z (almost 2 years ago)
- Last Synced: 2025-03-15T15:17:11.926Z (over 1 year ago)
- Topics: chromadb, groq-api, langchain, llms, python, rag, streamlit
- Language: Python
- Homepage:
- Size: 680 KB
- Stars: 3
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
### Project Overview
This project demonstrates the creation of a Natural Language Query Agent capable of answering questions based on a small set of lecture notes from Stanford's LLM lectures and a table of milestone LLM architectures. The system leverages LLMs and open-source vector indexing and storage frameworks to provide conversational answers, with an emphasis on follow-up queries and conversational memory.
### Data Sources
1. **Stanford LLMs Lecture Notes**:
   - Introduction: [Lecture Link](https://stanford-cs324.github.io/winter2022/lectures/introduction/)
   - Capabilities: [Lecture Link](https://stanford-cs324.github.io/winter2022/lectures/capabilities/)
   - Harm-1: [Lecture Link](https://stanford-cs324.github.io/winter2022/lectures/harm-1/)
   - Harm-2: [Lecture Link](https://stanford-cs324.github.io/winter2022/lectures/harm-2/)
   - Data: [Lecture Link](https://stanford-cs324.github.io/winter2022/lectures/data/)
   - Modeling: [Lecture Link](https://stanford-cs324.github.io/winter2022/lectures/modeling/)
   - Training: [Lecture Link](https://stanford-cs324.github.io/winter2022/lectures/training/)
2. **Milestone Papers**: Table of model architectures from [Awesome LLM](https://github.com/Hannibal046/Awesome-LLM#milestone-papers).
### Project Structure
- **data_loading.py**: Contains functions to load data from the web and PDF.
- **processing.py**: Functions to split text into chunks and generate embeddings.
- **model_initialization.py**: Code to initialize the model and retrieval chain.
- **main.py**: Streamlit application for the chatbot interface.
### Intermediary Representation

**Data Organization and Embedding**:
1. **Raw Data Loading**:
   - Web pages and PDF files are loaded using `WebBaseLoader` and `PyPDFLoader`, respectively.
2. **Text Splitting**:
   - Documents are split into manageable chunks using `RecursiveCharacterTextSplitter` with a chunk size of 1200 characters and an overlap of 200 characters.
3. **Embedding**:
   - Text chunks are converted into embeddings using the HuggingFace model `BAAI/bge-small-en`.
4. **Vector Store**:
   - The embeddings are stored in a Chroma vector store, making them searchable.
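To make the chunk size and overlap concrete, here is a minimal sketch of fixed-window chunking with the same numbers (1200 and 200). It is a deliberate simplification: `RecursiveCharacterTextSplitter` prefers to break on separators such as paragraph and line breaks, whereas this hypothetical `chunk_text` helper cuts at exact character offsets.

```python
def chunk_text(text: str, chunk_size: int = 1200, overlap: int = 200) -> list[str]:
    """Split text into fixed-size windows that share `overlap` characters.

    Simplified stand-in for illustration only; not the splitter the project uses.
    """
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    step = chunk_size - overlap  # with the defaults, each chunk starts 1000 characters later
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reaches the end of the text
    return chunks
```

With the defaults, the last 200 characters of each chunk are repeated at the start of the next one, so a sentence that straddles a boundary still appears whole in at least one chunk.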
### Detailed Steps
#### Loading Data
1. **WebBaseLoader**: Fetches and loads web pages.
2. **PyPDFLoader**: Loads and parses the PDF containing milestone papers.
3. **MergedDataLoader**: Merges the data from the web and PDF loaders.
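The loading step might look roughly like the sketch below. The lecture URLs come from the Data Sources section. The function name `load_documents`, the `pdf_path` argument, and the import paths (which follow the `langchain_community` package layout) are assumptions and may not match `data_loading.py`.

```python
LECTURE_BASE = "https://stanford-cs324.github.io/winter2022/lectures/"
LECTURE_TOPICS = [
    "introduction", "capabilities", "harm-1", "harm-2",
    "data", "modeling", "training",
]
LECTURE_URLS = [f"{LECTURE_BASE}{topic}/" for topic in LECTURE_TOPICS]


def load_documents(pdf_path: str):
    """Load the lecture pages and the milestone-papers PDF as one document list.

    `pdf_path` is a hypothetical local path to the PDF export of the table.
    """
    # Imported lazily so the constants above can be reused without LangChain installed.
    from langchain_community.document_loaders import PyPDFLoader, WebBaseLoader
    from langchain_community.document_loaders.merge import MergedDataLoader

    web_loader = WebBaseLoader(LECTURE_URLS)   # fetches and parses each page
    pdf_loader = PyPDFLoader(pdf_path)         # yields one document per PDF page
    merged = MergedDataLoader(loaders=[web_loader, pdf_loader])
    return merged.load()
```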
#### Processing Data
1. **Text Splitting**:
   - `RecursiveCharacterTextSplitter` divides the loaded text into smaller, overlapping chunks so that context spanning a chunk boundary is preserved.
2. **Embedding Generation**:
   - `HuggingFaceBgeEmbeddings` generates embeddings for the text chunks using a pre-trained model.
3. **Vector Store**:
   - The Chroma vector store is used to store and index these embeddings, enabling efficient retrieval.
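The three processing steps can be sketched as a single function. The chunk size, overlap, and model name are the values stated in this README. The function name `build_vector_store` and the import paths (LangChain's split-package layout) are assumptions and may differ from `processing.py` or from newer LangChain releases.

```python
CHUNK_SIZE = 1200      # characters per chunk, as stated above
CHUNK_OVERLAP = 200    # characters shared between neighbouring chunks
EMBEDDING_MODEL = "BAAI/bge-small-en"


def build_vector_store(documents):
    """Split documents, embed the chunks, and index them in Chroma."""
    # Imported lazily; these packages are heavy and only needed when indexing.
    from langchain_community.embeddings import HuggingFaceBgeEmbeddings
    from langchain_community.vectorstores import Chroma
    from langchain_text_splitters import RecursiveCharacterTextSplitter

    splitter = RecursiveCharacterTextSplitter(
        chunk_size=CHUNK_SIZE, chunk_overlap=CHUNK_OVERLAP
    )
    chunks = splitter.split_documents(documents)

    embeddings = HuggingFaceBgeEmbeddings(model_name=EMBEDDING_MODEL)
    # Builds an in-memory collection; persisting to disk is left out of this sketch.
    return Chroma.from_documents(documents=chunks, embedding=embeddings)
```

Calling `.as_retriever()` on the returned store gives a retriever object that a retrieval chain can query.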
#### Initializing the Model
1. **LLM Initialization**:
   - `ChatGroq` initializes the chosen LLM using the provided API key.
2. **Prompt Templates**:
   - Custom prompt templates are created to reformulate user queries and generate responses based on the retrieved context.
3. **Retrieval Chain**:
   - A retrieval chain is created that uses a history-aware retriever to provide context-aware answers.
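A sketch of how these pieces could be wired together follows. The prompt wording, the function name `build_rag_chain`, and its parameters are illustrative assumptions, not the text used in `model_initialization.py`. The import paths assume the LangChain 0.2-era layout and may have moved in later releases.

```python
# Hypothetical prompt wording, for illustration only.
CONTEXTUALIZE_SYSTEM_PROMPT = (
    "Given the chat history and the latest user question, rewrite the question "
    "so it can be understood without the chat history. Do not answer it."
)
QA_SYSTEM_PROMPT = (
    "Answer the question using only the retrieved context below. "
    "If the context does not contain the answer, say so.\n\n{context}"
)


def build_rag_chain(retriever, groq_api_key: str, model_name: str):
    """Combine a history-aware retriever with a question-answering chain."""
    from langchain.chains import create_history_aware_retriever, create_retrieval_chain
    from langchain.chains.combine_documents import create_stuff_documents_chain
    from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
    from langchain_groq import ChatGroq

    llm = ChatGroq(model=model_name, api_key=groq_api_key)

    # Step 1: rewrite a follow-up question into a standalone one before retrieval.
    contextualize_prompt = ChatPromptTemplate.from_messages([
        ("system", CONTEXTUALIZE_SYSTEM_PROMPT),
        MessagesPlaceholder("chat_history"),
        ("human", "{input}"),
    ])
    history_aware_retriever = create_history_aware_retriever(
        llm, retriever, contextualize_prompt
    )

    # Step 2: answer from the retrieved chunks, which fill the {context} slot.
    qa_prompt = ChatPromptTemplate.from_messages([
        ("system", QA_SYSTEM_PROMPT),
        MessagesPlaceholder("chat_history"),
        ("human", "{input}"),
    ])
    qa_chain = create_stuff_documents_chain(llm, qa_prompt)
    return create_retrieval_chain(history_aware_retriever, qa_chain)
```

The resulting chain is invoked with a dictionary such as `{"input": question, "chat_history": history}` and returns a dictionary whose `"answer"` key holds the response and whose `"context"` key holds the retrieved documents.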
### Application
A Streamlit application allows users to interact with the chatbot. Key features include:
- **Input Query**: Users can enter natural language queries.
- **Chat History**: The system maintains context across multiple queries.
- **Display of Sources**: The sources used to generate answers are displayed, ensuring transparency.
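The chat-history and source-display features can be sketched with two small helpers, kept free of Streamlit here so they run on their own. Both function names are hypothetical. The history is modelled as a list of `(role, text)` pairs, and the source list assumes each retrieved document carries a `source` entry in its metadata, as the LangChain web and PDF loaders normally set.

```python
def append_turn(history: list[tuple[str, str]], question: str, answer: str) -> None:
    """Record one question/answer exchange so later queries can refer back to it."""
    history.append(("human", question))
    history.append(("ai", answer))


def unique_sources(metadatas) -> list[str]:
    """Return distinct `source` values in first-seen order, skipping missing ones."""
    sources: list[str] = []
    for metadata in metadatas:
        source = metadata.get("source")
        if source and source not in sources:
            sources.append(source)
    return sources
```

In a Streamlit app the history list would typically be kept in `st.session_state` so it survives reruns, and `unique_sources` would be fed the metadata of the documents returned with each answer.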
### Workflow of the System

### Deployment and Scaling
1. **Deployment Plan**:
   - Deploy directly to Streamlit Cloud for public access.
   - Containerize the application using Docker for easy deployment.
   - Use cloud services like AWS or GCP for scalability.
2. **Scaling**:
   - Use GPU acceleration to reduce response-generation latency.
   - As the number of lectures or papers grows, make retrieval more efficient by improving how the vectors are stored and indexed.
   - Implement caching strategies to improve response times for frequently asked questions.
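One possible shape for the caching idea is an in-memory LRU cache keyed on a normalized question, sketched below with the standard library. The names `normalize` and `make_cached_answerer` are hypothetical, and `answer_fn` stands in for whatever function produces an answer. Because the cache key ignores chat history, this only suits standalone questions, not follow-ups whose meaning depends on earlier turns.

```python
from functools import lru_cache


def normalize(question: str) -> str:
    """Lowercase and collapse whitespace so trivial variants share a cache entry."""
    return " ".join(question.lower().split())


def make_cached_answerer(answer_fn, maxsize: int = 256):
    """Wrap `answer_fn` so repeated questions are served from memory."""

    @lru_cache(maxsize=maxsize)
    def cached(normalized_question: str) -> str:
        return answer_fn(normalized_question)

    def answer(question: str) -> str:
        return cached(normalize(question))

    return answer
```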
### Improvements and Future Work
- **Enhanced Conversational Memory**: Improving the system's ability to handle complex, multi-turn conversations.
- **Citation and Reference Handling**: More sophisticated citation mechanisms to link specific sections of texts used in answers.
### Setup Instructions
1. **Clone the Repository**:
```sh
git clone https://github.com/wittyicon29/qabot-with-conversational-memory
cd qabot-with-conversational-memory
```
2. **Install Dependencies**:
```sh
pip install -r requirements.txt
```
3. **Run the Application**:
```sh
streamlit run main.py
```