# Cortex AI: Multi-Model Insights Hub

🤖 **Advanced AI-Powered Document Analysis with Multimodal RAG Capabilities**

Cortex AI Hub integrates multiple Large Language Models (LLMs) with a **Multimodal Retrieval-Augmented Generation (RAG)** system, so you can extract insights from both **text and visual content** in documents.

**✨ NEW: Multimodal Capabilities** - Now with support for images, charts, graphs, and infographics!

---

## 🌟 **Key Features**

### 🖼️ **Multimodal RAG**

- **📊 Visual Content Understanding**: Analyze images, charts, graphs, and infographics
- **🔗 Unified Text-Image Search**: Search across both textual and visual content
- **🎯 Context-Aware Analysis**: Enhanced understanding with specialized prompts
- **💾 Persistent Storage**: Efficient FAISS-based multimodal embeddings
- **🆓 Free & Local**: Uses open-source models (BLIP, BLIP-2, GIT, CLIP)
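To illustrate the idea behind unified text-image search, here is a minimal sketch in plain Python. It assumes that text chunks and image descriptions are embedded into the same vector space and ranked together by cosine similarity. The `Item` class, the `search` function, and the toy three-dimensional vectors are hypothetical; the project itself stores its embeddings in FAISS, which this sketch does not use.

```python
import math
from dataclasses import dataclass


@dataclass
class Item:
    kind: str            # "text" or "image" (hypothetical labels)
    content: str         # chunk text, or a generated image description
    vector: list[float]  # toy embedding; a real one would come from an embedding model


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(items, query_vector, k=2):
    """Rank text and image items together and return the top k."""
    ranked = sorted(items, key=lambda it: cosine(it.vector, query_vector), reverse=True)
    return ranked[:k]


items = [
    Item("text", "Revenue grew in Q3.", [0.9, 0.1, 0.0]),
    Item("image", "Bar chart of quarterly revenue.", [0.8, 0.2, 0.1]),
    Item("text", "The team moved offices.", [0.0, 0.1, 0.9]),
]
results = search(items, [1.0, 0.0, 0.0], k=2)
for item in results:
    print(item.kind, "-", item.content)
```

Because both kinds of item live in one ranked list, a single query can surface a paragraph and a chart description side by side.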

### 🔍 **Advanced Search & RAG**

- **🧠 Hybrid Search**: Combines semantic vector search with BM25 keyword search
- **📂 Multi-Document Support**: Upload PDFs or provide URLs
- **💾 Persistent Vector Database**: ChromaDB-powered storage
- **✅ Accurate Citations**: Source-linked responses with references
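The README does not say how the semantic and BM25 result lists are merged. One common option is reciprocal rank fusion (RRF), sketched below as an assumption rather than as the project's actual method. The function name, the document IDs, and the constant `k=60` are illustrative.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists of document IDs into one.

    Each document scores 1 / (k + rank) per list it appears in;
    documents ranked highly by several retrievers rise to the top.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Sort by descending score, breaking ties by ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical outputs of a vector retriever and a BM25 retriever.
semantic = ["doc_a", "doc_b", "doc_c"]
keyword = ["doc_b", "doc_d", "doc_a"]

fused = reciprocal_rank_fusion([semantic, keyword])
print(fused)
```

RRF needs only ranks, not raw scores, so it sidesteps the problem that cosine similarities and BM25 scores live on different scales.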

### 🤖 **AI-Powered Search Agent**

- **🌐 Real-Time Research**: ArXiv, Wikipedia, and web search tools
- **📰 Current Information**: Up-to-date news and research insights
- **⚡ Instant Responses**: Fast, context-aware answers

---

## 🚀 **Supported AI Models**

| Model | Provider | Best For |
| ----------------------------- | -------- | ----------------------------- |
| llama-3.3-70b-versatile | Meta | Complex reasoning, analysis |
| llama-3.1-8b-instant | Meta | Quick queries, fast responses |
| deepseek-r1-distill-llama-70b | DeepSeek | Extended conversations |
| qwen/qwen3-32b | Alibaba | Document summarization |
| openai/gpt-oss-120b | OpenAI | Complex analysis tasks |

### 🖼️ **Vision Models**

| Model | Description | Best For |
| ------ | ---------------------- | ---------------------------- |
| BLIP | Quick image captioning | Speed, basic analysis |
| BLIP-2 | Advanced understanding | Complex visual content |
| GIT | Detailed descriptions | Charts, graphs, infographics |

---

## 📸 **Application Screenshots**

### 🤖 **RAG Chatbot Interface**

![RAG Chatbot Interface](images/Ragbot_interface.png)
_Traditional RAG chatbot with document upload and multi-LLM selection_

### 🖼️ **Multimodal RAG Interface**

![Multimodal RAG Interface](images/MultiModel_Rag_Interface.png)
_Enhanced multimodal interface with vision model selection and image analysis_

### 🔍 **Search Agent Interface**

![Search Agent Interface](images/Search_Agent_Interface.png)
_AI-powered search agent with real-time research capabilities_

---

## 🔄 **System Architecture**

### 📊 **RAG Chatbot Workflow**

![RAG Chatbot Workflow](images/Ragchotbot_diagram.png)
_Complete RAG chatbot workflow with document processing, hybrid search, and multi-LLM response generation_

### 🤖 **Search Agent Workflow**

![Search Agent Workflow](images/Search_Agent_Diagram.png)
_AI-powered search agent workflow with multi-tool research and intelligent orchestration_

### 🖼️ **Multimodal RAG Workflow**

![Multimodal RAG Workflow](images/Multimodel_Rag.png)
_Enhanced multimodal workflow combining text and visual content analysis_

---

## 🚀 **Getting Started**

### 📋 **Prerequisites**

- Python 3.12+
- Git
- API keys: Groq (`GROQ_API_KEY`) and Tavily (`TAVILY_API_KEY`)

### 📥 **Installation**

1. **Clone Repository**

```bash
git clone https://github.com/RobinMillford/Cortex-AI-Multi-Model-Insights-Hub.git
cd Cortex-AI-Multi-Model-Insights-Hub
```

2. **Setup Environment**

```bash
python -m venv venv
source venv/bin/activate # Windows: venv\Scripts\activate
pip install -r requirements.txt
```

3. **Configure API Keys**

```bash
cp .env.template .env
# Add your GROQ_API_KEY and TAVILY_API_KEY to .env
```
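If the app fails at startup, a quick check like the one below can confirm that both keys are visible to Python. This is a minimal sketch: the `missing_keys` helper is hypothetical, and it assumes the variables are already in the process environment. How the app itself loads `.env` is not stated here.

```python
import os

# The two variable names come from the step above.
REQUIRED_KEYS = ("GROQ_API_KEY", "TAVILY_API_KEY")


def missing_keys(env=None):
    """Return the required key names that are unset or blank."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_KEYS if not env.get(name, "").strip()]


# Example with an explicit mapping instead of the real environment.
example = missing_keys({"GROQ_API_KEY": "example-value"})
print("Missing:", example)
```

Calling `missing_keys()` with no argument checks the real environment instead of the example mapping.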

4. **Run Application**

```bash
streamlit run Main_Page.py
```

### 🌐 **Live Demo**

**[🚀 Try it now](https://cortex-ai-multi-model-insights-app.streamlit.app/)**

---

## 📖 **Usage Guide**

### 🖼️ **Multimodal Document Analysis**

1. Navigate to **"Multimodal RAG"** page
2. Choose vision model (BLIP for speed, GIT for accuracy)
3. Upload PDF with images/charts
4. Enable **"Extract and analyze images"**
5. Ask questions about text and visual content

### 📄 **Traditional Document Chat**

1. Go to **"RAG Chatbot"** page
2. Upload PDFs or enter URLs
3. Configure retrieval parameters
4. Select LLM models for comparison
5. Ask questions and get cited responses

### 🔍 **Research & Web Search**

1. Visit **"Search Agent"** page
2. Enter research queries
3. Choose preferred LLM model
4. Get real-time answers with sources

---

## 🛠️ **Technology Stack**

- **Frontend**: Streamlit with dark theme
- **Backend**: Python, LangChain/LangGraph
- **Vector DB**: ChromaDB (text), FAISS (multimodal)
- **Embeddings**: HuggingFace sentence-transformers, CLIP
- **Vision**: BLIP, BLIP-2, GIT (Hugging Face)
- **LLMs**: Groq API
- **Search**: Tavily, ArXiv, Wikipedia APIs

### 📁 **Project Structure**

```
├── Main_Page.py              # App entry point
├── multimodal_helpers.py     # Multimodal processing
├── helpers.py                # Text utilities
├── chain_setup.py            # LLM configuration
├── pages/
│   ├── 1_RAG_Chatbot.py      # Traditional RAG
│   ├── 2_Search_Agent.py     # Web search agent
│   └── 3_Multimodal_RAG.py   # Multimodal interface
├── chroma_db/                # Text vector storage
├── multimodal_stores/        # Multimodal storage
└── requirements.txt          # Dependencies
```

---

## 🔧 **Key Technical Features**

### 🧠 **Architecture Highlights**

- **Two-Layer Vision**: Vision models → descriptions, CLIP → embeddings
- **Hybrid Search**: Semantic + BM25 for optimal retrieval
- **Model Caching**: Global cache prevents reloading
- **Session Management**: Streamlit state for persistence

### ⚡ **Performance Optimizations**

- Vision models cached globally
- Processed embeddings saved for reuse
- Lazy loading when needed
- Real-time progress feedback
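One way to reuse processed embeddings is to key them by a hash of the document bytes, so that re-uploading the same file skips recomputation. The sketch below is an assumption about how such reuse could work. It stores vectors as JSON files for simplicity, whereas the project persists multimodal embeddings with FAISS. `fake_embed` and `cached_embedding` are hypothetical names.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def fake_embed(data: bytes):
    """Hypothetical placeholder for a real embedding model."""
    return [float(len(data)), float(sum(data) % 97)]


def cached_embedding(data: bytes, store: Path, compute):
    """Return (vector, cache_hit), computing and saving on a miss."""
    key = hashlib.sha256(data).hexdigest()   # identical bytes give the same key
    path = store / f"{key}.json"
    if path.exists():
        return json.loads(path.read_text()), True
    vector = compute(data)
    store.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(vector))
    return vector, False


with tempfile.TemporaryDirectory() as tmp:
    store = Path(tmp) / "multimodal_stores"
    v1, hit1 = cached_embedding(b"sample document", store, fake_embed)
    v2, hit2 = cached_embedding(b"sample document", store, fake_embed)
    print(hit1, hit2)
```

Hashing the content rather than the filename means a renamed copy of the same PDF still hits the cache.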

---

## 🤝 **Contributing**

1. Fork the repository
2. Create feature branch: `git checkout -b feature/your-feature`
3. Make changes and test locally
4. Commit and push: `git commit -m "Add feature"`, then `git push origin feature/your-feature`
5. Create Pull Request

### 🎯 **Areas for Contribution**

- 🖼️ New vision models or analysis techniques
- 🔍 Better retrieval algorithms
- 🎨 UI/UX improvements
- 📊 Analytics and metrics
- 🧪 Testing and documentation

---

## 📝 **License**

This project is licensed under the **AGPL-3.0 License**.

---

## 🆘 **Support**

- **🐛 Issues**: [GitHub Issues](https://github.com/RobinMillford/Cortex-AI-Multi-Model-Insights-Hub/issues)
- **💬 Discussions**: [GitHub Discussions](https://github.com/RobinMillford/Cortex-AI-Multi-Model-Insights-Hub/discussions)

---

## 🙏 **Acknowledgments**

- **🤗 Hugging Face**: Free open-source vision models
- **🦙 Meta**: Llama models
- **🧩 OpenAI**: CLIP
- **🔍 Salesforce**: BLIP vision models
- **🏢 Microsoft**: GIT vision model
- **⚡ Groq**: Fast LLM inference
- **🌐 Streamlit**: Amazing app framework

---