https://github.com/RobinMillford/Cortex-AI-Multi-Model-Insights-Hub
This project creates a Retrieval-Augmented Generation (RAG) powered chatbot for summarizing and interacting with articles. The system processes articles provided as PDFs or URLs, extracts the text, splits the content into chunks, generates embeddings, and stores them in a vector database.
- Host: GitHub
- URL: https://github.com/RobinMillford/Cortex-AI-Multi-Model-Insights-Hub
- Owner: RobinMillford
- License: agpl-3.0
- Created: 2025-01-23T16:25:01.000Z (11 months ago)
- Default Branch: main
- Last Pushed: 2025-01-28T16:43:53.000Z (11 months ago)
- Last Synced: 2025-01-28T17:36:06.255Z (11 months ago)
- Topics: article-extractor, chatbot, llama3, llm, pdf-document-processor, rag, streamlit, summarizer, vector-database
- Language: Python
- Homepage: https://multi-model-rag-powered-article-chatbot.streamlit.app/
- Size: 341 KB
- Stars: 0
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# Cortex AI: Multi-Model Insights Hub
🤖 **Advanced AI-Powered Document Analysis with Multimodal RAG Capabilities**
Cortex AI Hub integrates multiple Large Language Models (LLMs) with a sophisticated **Multimodal Retrieval-Augmented Generation (RAG)** system, enabling you to extract insights from both **text and visual content** in documents.
**✨ NEW: Multimodal Capabilities** - Now with support for images, charts, graphs, and infographics!
---
## 🚀 **Key Features**
### 🖼️ **Multimodal RAG**
- **🔍 Visual Content Understanding**: Analyze images, charts, graphs, and infographics
- **🔗 Unified Text-Image Search**: Search across both textual and visual content
- **🎯 Context-Aware Analysis**: Enhanced understanding with specialized prompts
- **💾 Persistent Storage**: Efficient FAISS-based multimodal embeddings
- **🆓 Free & Local**: Uses open-source models (BLIP, BLIP-2, GIT, CLIP)
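Unified text-image search rests on one idea: when images and text are embedded into the same vector space (as CLIP does), both can be ranked by the same similarity measure. A minimal sketch with hand-made toy vectors; the item names and 4-dimensional embeddings below are invented for illustration, while the real app would use CLIP's image and text encoders:

```python
from math import sqrt

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = sqrt(sum(x * x for x in a))
    nb = sqrt(sum(y * y for y in b))
    return dot / (na * nb)

# Toy 4-dim "CLIP-like" embeddings: text chunks and image captions
# live in the same index and are searched with the same metric.
index = {
    "chart: quarterly revenue (image)": [0.9, 0.1, 0.0, 0.1],
    "paragraph about revenue growth":   [0.8, 0.2, 0.1, 0.0],
    "photo of the office lobby":        [0.0, 0.9, 0.3, 0.1],
}

def search(query_vec, k=2):
    """Return the k items most similar to the query vector."""
    ranked = sorted(index.items(),
                    key=lambda kv: cosine(query_vec, kv[1]),
                    reverse=True)
    return [name for name, _ in ranked[:k]]

print(search([1.0, 0.0, 0.0, 0.0]))
```

Because the chart image and the revenue paragraph point in similar directions, a single query retrieves both the visual and the textual evidence.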
### 🔍 **Advanced Search & RAG**
- **🧠 Hybrid Search**: Combines semantic vector search with BM25 keyword search
- **📚 Multi-Document Support**: Upload PDFs or provide URLs
- **💾 Persistent Vector Database**: ChromaDB-powered storage
- **✅ Accurate Citations**: Source-linked responses with references
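The document-ingestion side of RAG splits text into overlapping chunks before embedding. A minimal character-window sketch of that step (a stand-in for a library text splitter such as LangChain's; the `chunk_size` and `overlap` values are illustrative, not the app's actual settings):

```python
def split_into_chunks(text, chunk_size=500, overlap=50):
    """Split text into fixed-size character chunks with overlap, so a
    sentence straddling a boundary still appears intact in one chunk."""
    if chunk_size <= overlap:
        raise ValueError("chunk_size must exceed overlap")
    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunk = text[start:start + chunk_size]
        if chunk:
            chunks.append(chunk)
    return chunks

doc = "A" * 1200
chunks = split_into_chunks(doc, chunk_size=500, overlap=50)
print(len(chunks), [len(c) for c in chunks])  # 3 chunks: 500, 500, 300 chars
```

Each chunk is then embedded and written to the vector store; the overlap is what keeps boundary-spanning sentences retrievable.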
### 🤖 **AI-Powered Search Agent**
- **🔎 Real-Time Research**: ArXiv, Wikipedia, and web search tools
- **📰 Current Information**: Up-to-date news and research insights
- **⚡ Instant Responses**: Fast, context-aware answers
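Conceptually, the agent routes each query to one of several research tools. The sketch below shows a deliberately simplified keyword-based router; the tool functions are placeholder stubs, and the real agent lets the LLM choose tools (e.g. via LangChain/LangGraph tool calls) rather than matching keywords:

```python
# Stub tools standing in for the real ArXiv, Wikipedia, and Tavily calls.
def arxiv_search(q):
    return f"[arxiv results for: {q}]"

def wiki_search(q):
    return f"[wikipedia summary for: {q}]"

def web_search(q):
    return f"[web results for: {q}]"

TOOLS = {
    "paper":  arxiv_search,  # scholarly queries -> ArXiv
    "who is": wiki_search,   # encyclopedic queries -> Wikipedia
}

def route(query):
    """Dispatch a query to the first matching tool, else general web search."""
    q = query.lower()
    for keyword, tool in TOOLS.items():
        if keyword in q:
            return tool(query)
    return web_search(query)

print(route("Who is Ada Lovelace?"))
```

The real orchestration adds LLM reasoning on top of this dispatch step: the model decides which tool to call and then synthesizes the tool output into a sourced answer.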
---
## 🌟 **Supported AI Models**
| Model | Provider | Best For |
| ----------------------------- | -------- | ----------------------------- |
| llama-3.3-70b-versatile | Meta | Complex reasoning, analysis |
| llama-3.1-8b-instant | Meta | Quick queries, fast responses |
| deepseek-r1-distill-llama-70b | DeepSeek | Extended conversations |
| qwen/qwen3-32b | Alibaba | Document summarization |
| openai/gpt-oss-120b | OpenAI | Complex analysis tasks |
### 🖼️ **Vision Models**
| Model | Description | Best For |
| ------ | ---------------------- | ---------------------------- |
| BLIP | Quick image captioning | Speed, basic analysis |
| BLIP-2 | Advanced understanding | Complex visual content |
| GIT | Detailed descriptions | Charts, graphs, infographics |
---
## 📸 **Application Screenshots**
### 🤖 **RAG Chatbot Interface**
_Traditional RAG chatbot with document upload and multi-LLM selection_
### 🖼️ **Multimodal RAG Interface**
_Enhanced multimodal interface with vision model selection and image analysis_
### 🔍 **Search Agent Interface**
_AI-powered search agent with real-time research capabilities_
---
## 📊 **System Architecture**
### 🔄 **RAG Chatbot Workflow**
_Complete RAG chatbot workflow with document processing, hybrid search, and multi-LLM response generation_
### 🤖 **Search Agent Workflow**
_AI-powered search agent workflow with multi-tool research and intelligent orchestration_
### 🖼️ **Multimodal RAG Workflow**
_Enhanced multimodal workflow combining text and visual content analysis_
---
## 🚀 **Getting Started**
### 📋 **Prerequisites**
- Python 3.12+
- Git
- API Keys: ChatGroq and Tavily
### 📥 **Installation**
1. **Clone Repository**
```bash
git clone https://github.com/RobinMillford/Cortex-AI-Multi-Model-Insights-Hub.git
cd Cortex-AI-Multi-Model-Insights-Hub
```
2. **Setup Environment**
```bash
python -m venv venv
source venv/bin/activate # Windows: venv\Scripts\activate
pip install -r requirements.txt
```
3. **Configure API Keys**
```bash
cp .env.template .env
# Add your GROQ_API_KEY and TAVILY_API_KEY to .env
```
4. **Run Application**
```bash
streamlit run Main_Page.py
```
### 🌐 **Live Demo**
**[🚀 Try it now](https://cortex-ai-multi-model-insights-app.streamlit.app/)**
---
## 📖 **Usage Guide**
### 🖼️ **Multimodal Document Analysis**
1. Navigate to **"Multimodal RAG"** page
2. Choose vision model (BLIP for speed, GIT for accuracy)
3. Upload PDF with images/charts
4. Enable **"Extract and analyze images"**
5. Ask questions about text and visual content
### 📚 **Traditional Document Chat**
1. Go to **"RAG Chatbot"** page
2. Upload PDFs or enter URLs
3. Configure retrieval parameters
4. Select LLM models for comparison
5. Ask questions and get cited responses
### 🔍 **Research & Web Search**
1. Visit **"Search Agent"** page
2. Enter research queries
3. Choose preferred LLM model
4. Get real-time answers with sources
---
## 🛠️ **Technology Stack**
- **Frontend**: Streamlit with dark theme
- **Backend**: Python, LangChain/LangGraph
- **Vector DB**: ChromaDB (text), FAISS (multimodal)
- **Embeddings**: HuggingFace sentence-transformers, CLIP
- **Vision**: BLIP, BLIP-2, GIT (Hugging Face)
- **LLMs**: Groq API
- **Search**: Tavily, ArXiv, Wikipedia APIs
### 📁 **Project Structure**
```
├── Main_Page.py             # App entry point
├── multimodal_helpers.py    # Multimodal processing
├── helpers.py               # Text utilities
├── chain_setup.py           # LLM configuration
├── pages/
│   ├── 1_RAG_Chatbot.py     # Traditional RAG
│   ├── 2_Search_Agent.py    # Web search agent
│   └── 3_Multimodal_RAG.py  # Multimodal interface
├── chroma_db/               # Text vector storage
├── multimodal_stores/       # Multimodal storage
└── requirements.txt         # Dependencies
```
---
## 🔧 **Key Technical Features**
### 🧠 **Architecture Highlights**
- **Two-Layer Vision**: Vision models → descriptions, CLIP → embeddings
- **Hybrid Search**: Semantic + BM25 for optimal retrieval
- **Model Caching**: Global cache prevents reloading
- **Session Management**: Streamlit state for persistence
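The hybrid-search blend of BM25 keyword scores with semantic similarity can be sketched as a weighted sum of normalized scores. This toy version implements BM25 directly and uses bag-of-words cosine as a stand-in for the real sentence-transformer embeddings; the documents and the `alpha` weight are illustrative:

```python
from collections import Counter
from math import log, sqrt

DOCS = [
    "hybrid search combines vector and keyword retrieval",
    "chromadb stores document embeddings persistently",
    "bm25 is a classic keyword ranking function",
]
TOKENS = [d.split() for d in DOCS]
AVG_LEN = sum(len(t) for t in TOKENS) / len(TOKENS)

def bm25_score(query, doc_tokens, k1=1.5, b=0.75):
    """Okapi BM25 score of one document for a whitespace-tokenized query."""
    tf = Counter(doc_tokens)
    score = 0.0
    for term in query.split():
        df = sum(1 for t in TOKENS if term in t)
        if df == 0:
            continue
        idf = log((len(TOKENS) - df + 0.5) / (df + 0.5) + 1.0)
        f = tf[term]
        score += idf * f * (k1 + 1) / (f + k1 * (1 - b + b * len(doc_tokens) / AVG_LEN))
    return score

def cosine_bow(query, doc_tokens):
    """Bag-of-words cosine, standing in for dense-embedding similarity."""
    qc, dc = Counter(query.split()), Counter(doc_tokens)
    dot = sum(qc[w] * dc[w] for w in qc)
    return dot / (sqrt(sum(v * v for v in qc.values())) *
                  sqrt(sum(v * v for v in dc.values())))

def hybrid_rank(query, alpha=0.5):
    """Rank docs by alpha * semantic + (1 - alpha) * BM25, each max-normalized."""
    bm = [bm25_score(query, t) for t in TOKENS]
    sem = [cosine_bow(query, t) for t in TOKENS]
    def norm(xs):
        hi = max(xs) or 1.0
        return [x / hi for x in xs]
    bm, sem = norm(bm), norm(sem)
    scored = [(alpha * s + (1 - alpha) * k, DOCS[i])
              for i, (s, k) in enumerate(zip(sem, bm))]
    return [doc for _, doc in sorted(scored, reverse=True)]

print(hybrid_rank("keyword ranking")[0])
```

Normalizing each score list before blending keeps either signal from dominating purely because of its scale, which is the usual motivation for the weighted-sum fusion.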
### ⚡ **Performance Optimizations**
- Vision models cached globally
- Processed embeddings saved for reuse
- Lazy loading when needed
- Real-time progress feedback
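The global model cache amounts to loading each model at most once per process, which in Python can be as simple as `functools.lru_cache` over the loader. A sketch, where the stand-in loader just builds a dict and counts calls instead of downloading real weights:

```python
from functools import lru_cache

LOAD_COUNT = {"n": 0}  # instrumentation to show the cache working

@lru_cache(maxsize=None)
def get_vision_model(name: str):
    """Load a (stand-in) vision model once per process; repeated calls
    with the same name return the cached object instead of reloading."""
    LOAD_COUNT["n"] += 1  # in the real app this would be an expensive load
    return {"model": name, "ready": True}

m1 = get_vision_model("blip")
m2 = get_vision_model("blip")  # cache hit: no second load
print(LOAD_COUNT["n"], m1 is m2)  # prints: 1 True
```

Streamlit apps often use `st.cache_resource` for the same effect; the principle is identical: key the expensive load on the model name and reuse the object across reruns.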
---
## 🤝 **Contributing**
1. Fork the repository
2. Create feature branch: `git checkout -b feature/your-feature`
3. Make changes and test locally
4. Commit and push: `git commit -m "Add feature"`
5. Create Pull Request
### 🎯 **Areas for Contribution**
- 🖼️ New vision models or analysis techniques
- 🔍 Better retrieval algorithms
- 🎨 UI/UX improvements
- 📊 Analytics and metrics
- 🧪 Testing and documentation
---
## 📄 **License**
This project is licensed under the **AGPL-3.0 License**.
---
## 📞 **Support**
- **🐛 Issues**: [GitHub Issues](https://github.com/RobinMillford/Cortex-AI-Multi-Model-Insights-Hub/issues)
- **💬 Discussions**: [GitHub Discussions](https://github.com/RobinMillford/Cortex-AI-Multi-Model-Insights-Hub/discussions)
---
## 🙏 **Acknowledgments**
- **🤗 Hugging Face**: Free open-source vision models
- **🦙 Meta**: Llama models
- **🤖 OpenAI**: CLIP embeddings
- **🌟 Salesforce**: BLIP vision models
- **🏢 Microsoft**: GIT vision model
- **⚡ Groq**: Fast LLM inference
- **🎈 Streamlit**: Amazing app framework
---