https://github.com/ako1983/llm_research_assistant
An AI-powered research assistant that leverages Retrieval-Augmented Generation (RAG) to provide accurate responses to user queries by retrieving relevant documents and reasoning through complex questions.
- Host: GitHub
- URL: https://github.com/ako1983/llm_research_assistant
- Owner: ako1983
- License: mit
- Created: 2025-03-23T22:25:52.000Z (7 months ago)
- Default Branch: main
- Last Pushed: 2025-03-24T09:48:23.000Z (6 months ago)
- Last Synced: 2025-05-06T21:08:33.015Z (5 months ago)
- Topics: agent-based-modeling, agents, anthropic, chromadb, dspy, langchain-python, langgraph-python, llm, openai, rag, reasoning
- Language: Python
- Homepage:
- Size: 56.6 KB
- Stars: 2
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# LLM-Powered Research Assistant
An AI-powered research assistant that leverages Retrieval-Augmented Generation (RAG) to provide accurate responses to user queries by retrieving relevant documents and reasoning through complex questions.
## Features
- **Smart Query Routing**: Autonomously decides whether to answer directly from knowledge, retrieve additional context, or use specialized tools
- **RAG Pipeline**: Retrieves relevant documents to enhance responses with accurate, up-to-date information
- **Multi-step Reasoning**: Uses DSPy for structured reasoning to break down complex queries
- **Tool Integration**: Can utilize calculators, web search, and other external tools when needed
- **Evaluation Framework**: Measures response quality and relevance using DSPy's evaluation capabilities

## Architecture
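The smart query routing described above chooses among answering directly, retrieving context, or calling a tool. Below is a minimal keyword-based sketch of that decision. It is purely illustrative: the repo's actual `src/router.py` presumably drives the classification with an LLM and the prompt template under `prompts/`.

```python
# Hypothetical router sketch: keyword heuristics stand in for the
# LLM-based query classification the project actually uses.
def route_query(query: str) -> str:
    """Return one of 'tool', 'retrieve', or 'direct'."""
    q = query.lower()
    if any(tok in q for tok in ("calculate", "sum of", "%", "convert")):
        return "tool"        # hand off to the calculator or web search
    if any(tok in q for tok in ("how do i", "error", "fix", "troubleshoot")):
        return "retrieve"    # augment the answer with vector-store context
    return "direct"          # answer from the model's own knowledge

print(route_query("How do I fix Wi-Fi connection issues?"))  # retrieve
```

In the real pipeline this decision gates whether the RAG retriever or an external tool is invoked before the LLM generates the final answer.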
```ascii
┌─────────────────┐     ┌─────────────────┐     ┌─────────────────┐
│ User Interface  │────▶│  Query Router   │────▶│  RAG Pipeline   │
└─────────────────┘     └─────────────────┘     └─────────────────┘
         │                       ▲                       │
         ▼                       │                       ▼
┌─────────────────┐     ┌─────────────────┐
│  Tools (Calc,   │     │  Vector Store   │
│   Web Search)   │     │   (ChromaDB)    │
└─────────────────┘     └─────────────────┘
         │                       ▲                       │
         ▼                       │                       ▼
┌─────────────────┐     ┌─────────────────┐
│  LLM Provider   │────▶│  DSPy Modules   │
│ (OpenAI/Claude) │     │   & Metrics     │
└─────────────────┘     └─────────────────┘
```

## Setup & Installation
1. Clone the repository
```bash
git clone https://github.com/ako1983/llm_research_assistant.git
cd llm_research_assistant
```

2. Install dependencies
```bash
pip install -r requirements.txt
```

3. Set up environment variables
```bash
export OPENAI_API_KEY="your-api-key"
export ANTHROPIC_API_KEY="your-api-key" # If using Claude
```

4. Prepare your data
```bash
python src/vectorstore_builder.py
```

## Usage
```python
from src.agent import ResearchAssistant
from src.llm_providers import OpenAILLM
from src.rag_pipeline import RAGPipeline

# Initialize components
llm = OpenAILLM(model_name="gpt-3.5-turbo")
rag = RAGPipeline()
rag.initialize()
retriever = rag.get_retriever()

# Create and use the assistant
assistant = ResearchAssistant(llm_provider=llm, retriever=retriever)
response = assistant.process_query("How do I fix Wi-Fi connection issues?")
print(response["response"])
```

## Project Structure
```
llm-research-assistant/
├── data/
│   ├── raw/                                      # Original dataset files
│   ├── processed/                                # Cleaned CSV files
│   └── vector_stores/                            # ChromaDB vector stores
├── prompts/
│   └── query_classification_prompt_template.txt  # LLM prompts
├── src/
│   ├── agent.py                                  # Main assistant logic
│   ├── llm_providers.py                          # LLM abstraction layer
│   ├── rag_pipeline.py                           # Document retrieval system
│   ├── router.py                                 # Query routing logic
│   ├── tools/                                    # External tool integrations
│   └── dspy_modules/                             # DSPy components
├── tests/                                        # Test cases
├── main.py                                       # Entry point
└── requirements.txt                              # Dependencies
```

## Requirements
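The dependencies listed below typically map onto a `requirements.txt` along these lines. This is illustrative only: package names vary by version (e.g. `dspy-ai` vs. `dspy`), and the repo's actual file may pin different packages and versions.

```
# Illustrative only; see the repo's requirements.txt for the real pins.
langchain
langgraph
dspy-ai
chromadb
openai
anthropic
```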
- Python 3.8+
- LangChain
- DSPy
- ChromaDB
- OpenAI or Anthropic API access

## Evaluation
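As a library-free illustration of the kind of score such an evaluation computes, token-level F1 between a predicted answer and a gold answer is a common proxy for answer correctness. The repo's actual metrics live in `src/dspy_modules/` and may well differ.

```python
# Token-overlap F1: a simple stand-in for an answer-correctness metric.
def answer_f1(prediction: str, gold: str) -> float:
    pred_tokens = set(prediction.lower().split())
    gold_tokens = set(gold.lower().split())
    common = pred_tokens & gold_tokens
    if not common:
        return 0.0
    precision = len(common) / len(pred_tokens)
    recall = len(common) / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)

print(round(answer_f1("restart the router", "restart your router"), 2))  # 0.67
```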
The system uses DSPy's evaluation framework to assess:
- Answer correctness
- Context relevance
- Reasoning quality

## Acknowledgements
- Data sourced from [Bitext Customer Support Dataset](https://huggingface.co/datasets/bitext/Bitext-customer-support-llm-chatbot-training-dataset)
- Built for an assessment