This project creates a Retrieval-Augmented Generation (RAG) chatbot for summarizing and interacting with articles. The system processes articles provided as PDFs or URLs, extracts the text, splits it into chunks, generates embeddings, and stores them in a vector database.

# Cortex AI: Multi-Model Insights Hub

🤖 **Advanced AI-Powered Document Analysis with Multimodal RAG Capabilities**

Cortex AI Hub integrates multiple Large Language Models (LLMs) with a **Multimodal Retrieval-Augmented Generation (RAG)** system, enabling you to extract insights from both **text and visual content** in documents.

**✨ NEW: Multimodal Capabilities** - Now with support for images, charts, graphs, and infographics!

---

## 🌟 **Key Features**

### 🖼️ **Multimodal RAG**

- **📊 Visual Content Understanding**: Analyze images, charts, graphs, and infographics
- **🔗 Unified Text-Image Search**: Search across both textual and visual content
- **🎯 Context-Aware Analysis**: Enhanced understanding with specialized prompts
- **💾 Persistent Storage**: Efficient FAISS-based multimodal embeddings
- **🆓 Free & Local**: Uses open-source models (BLIP, BLIP-2, GIT, CLIP)
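The unified text-image search above can be pictured as one index that holds vectors for both text chunks and image descriptions. The following is a minimal sketch under stated assumptions: the vectors are hand-written toy values rather than CLIP embeddings, the brute-force loop stands in for FAISS, and the `index` layout and `search` function are hypothetical names, not taken from the project.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

# Toy index: text chunks and image descriptions share one vector space.
# Real embeddings would have hundreds of dimensions, not three.
index = [
    {"kind": "text", "source": "report.pdf p.2", "vector": [0.9, 0.1, 0.0]},
    {"kind": "image", "source": "report.pdf p.3 chart", "vector": [0.7, 0.7, 0.1]},
    {"kind": "text", "source": "report.pdf p.5", "vector": [0.0, 0.2, 0.9]},
]

def search(query_vector, top_k=2):
    """Return the top_k entries most similar to the query, text or image."""
    ranked = sorted(
        index,
        key=lambda item: cosine(query_vector, item["vector"]),
        reverse=True,
    )
    return ranked[:top_k]

results = search([0.6, 0.8, 0.0])
for hit in results:
    print(hit["kind"], hit["source"])
```

Because every entry carries a `kind` tag, a single query can return a chart and a paragraph side by side, which is the behavior the feature list describes.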

### 🔍 **Advanced Search & RAG**

- **🧠 Hybrid Search**: Combines semantic vector search with BM25 keyword search
- **📂 Multi-Document Support**: Upload PDFs or provide URLs
- **💾 Persistent Vector Database**: ChromaDB-powered storage
- **✅ Accurate Citations**: Source-linked responses with references
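One common way to combine a semantic ranking with a BM25 keyword ranking is reciprocal rank fusion (RRF). The sketch below illustrates that technique only; the README does not say which fusion method the project uses, and the function name, the document IDs, and the constant `k=60` are illustrative assumptions.

```python
from collections import defaultdict

def reciprocal_rank_fusion(rankings, weights=None, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores weight / (k + rank) per list it appears in,
    so items ranked highly by either retriever rise to the top.
    """
    if weights is None:
        weights = [1.0] * len(rankings)
    scores = defaultdict(float)
    for ranking, weight in zip(rankings, weights):
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += weight / (k + rank)
    # Highest fused score first; ties broken by ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))

# Hypothetical results from the two retrievers for the same query.
semantic_hits = ["doc_b", "doc_a", "doc_d"]
keyword_hits = ["doc_a", "doc_c", "doc_b"]

fused = reciprocal_rank_fusion([semantic_hits, keyword_hits])
print(fused)
```

Here `doc_a` ends up first because both retrievers rank it near the top, while documents found by only one retriever fall to the end of the list.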

### 🤖 **AI-Powered Search Agent**

- **🌐 Real-Time Research**: ArXiv, Wikipedia, and web search tools
- **📰 Current Information**: Up-to-date news and research insights
- **⚡ Instant Responses**: Fast, context-aware answers

---

## 🚀 **Supported AI Models**

| Model | Provider | Best For |
| ----------------------------- | -------- | ----------------------------- |
| llama-3.3-70b-versatile | Meta | Complex reasoning, analysis |
| llama-3.1-8b-instant | Meta | Quick queries, fast responses |
| deepseek-r1-distill-llama-70b | DeepSeek | Extended conversations |
| qwen/qwen3-32b | Alibaba | Document summarization |
| openai/gpt-oss-120b | OpenAI | Complex analysis tasks |

### 🖼️ **Vision Models**

| Model | Description | Best For |
| ------ | ---------------------- | ---------------------------- |
| BLIP | Quick image captioning | Speed, basic analysis |
| BLIP-2 | Advanced understanding | Complex visual content |
| GIT | Detailed descriptions | Charts, graphs, infographics |

---

## 📸 **Application Screenshots**

### 🤖 **RAG Chatbot Interface**

![RAG Chatbot Interface](images/Ragbot_interface.png)
_Traditional RAG chatbot with document upload and multi-LLM selection_

### 🖼️ **Multimodal RAG Interface**

![Multimodal RAG Interface](images/MultiModel_Rag_Interface.png)
_Enhanced multimodal interface with vision model selection and image analysis_

### 🔍 **Search Agent Interface**

![Search Agent Interface](images/Search_Agent_Interface.png)
_AI-powered search agent with real-time research capabilities_

---

## 🔄 **System Architecture**

### 📊 **RAG Chatbot Workflow**

![RAG Chatbot Workflow](images/Ragchotbot_diagram.png)
_Complete RAG chatbot workflow with document processing, hybrid search, and multi-LLM response generation_

### 🤖 **Search Agent Workflow**

![Search Agent Workflow](images/Search_Agent_Diagram.png)
_AI-powered search agent workflow with multi-tool research and intelligent orchestration_

### 🖼️ **Multimodal RAG Workflow**

![Multimodal RAG Workflow](images/Multimodel_Rag.png)
_Enhanced multimodal workflow combining text and visual content analysis_

---

## 🚀 **Getting Started**

### 📋 **Prerequisites**

- Python 3.12+
- Git
- API keys: Groq and Tavily

### 📥 **Installation**

1. **Clone Repository**

```bash
git clone https://github.com/RobinMillford/Cortex-AI-Multi-Model-Insights-Hub.git
cd Cortex-AI-Multi-Model-Insights-Hub
```

2. **Setup Environment**

```bash
python -m venv venv
source venv/bin/activate # Windows: venv\Scripts\activate
pip install -r requirements.txt
```

3. **Configure API Keys**

```bash
cp .env.template .env
# Add your GROQ_API_KEY and TAVILY_API_KEY to .env
```

4. **Run Application**

```bash
streamlit run Main_Page.py
```
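Before launching, it can help to confirm that the two keys from step 3 are actually set. The check below is a minimal sketch: `missing_api_keys` is a hypothetical helper, not part of the project, and it only inspects a mapping such as `os.environ`. If the values live only in `.env`, they would first need to be loaded into the environment, for example with `python-dotenv` (an assumption about the setup, not something the README states).

```python
import os

# Key names taken from the configuration step above.
REQUIRED_KEYS = ("GROQ_API_KEY", "TAVILY_API_KEY")

def missing_api_keys(env=None):
    """Return the required key names that are unset or blank."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_KEYS if not env.get(name, "").strip()]

# Example with an explicit mapping; pass nothing to check os.environ instead.
example_env = {"GROQ_API_KEY": "placeholder-value", "TAVILY_API_KEY": ""}
print(missing_api_keys(example_env))
```

An empty list means both keys are present; any names returned are the ones still to be filled in.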

### 🌐 **Live Demo**

**[🚀 Try it now](https://cortex-ai-multi-model-insights-app.streamlit.app/)**

---

## 📖 **Usage Guide**

### 🖼️ **Multimodal Document Analysis**

1. Navigate to the **"Multimodal RAG"** page
2. Choose a vision model (BLIP for speed, GIT for accuracy)
3. Upload a PDF that contains images or charts
4. Enable **"Extract and analyze images"**
5. Ask questions about the text and visual content

### 📄 **Traditional Document Chat**

1. Go to the **"RAG Chatbot"** page
2. Upload PDFs or enter URLs
3. Configure the retrieval parameters
4. Select the LLMs you want to compare
5. Ask questions and get cited responses

### 🔍 **Research & Web Search**

1. Visit the **"Search Agent"** page
2. Enter a research query
3. Choose your preferred LLM
4. Get real-time answers with sources

---

## 🛠️ **Technology Stack**

- **Frontend**: Streamlit with dark theme
- **Backend**: Python, LangChain/LangGraph
- **Vector DB**: ChromaDB (text), FAISS (multimodal)
- **Embeddings**: HuggingFace sentence-transformers, CLIP
- **Vision**: BLIP, BLIP-2, GIT (Hugging Face)
- **LLMs**: Groq API
- **Search**: Tavily, ArXiv, Wikipedia APIs

### 📁 **Project Structure**

```
├── Main_Page.py              # App entry point
├── multimodal_helpers.py     # Multimodal processing
├── helpers.py                # Text utilities
├── chain_setup.py            # LLM configuration
├── pages/
│   ├── 1_RAG_Chatbot.py      # Traditional RAG
│   ├── 2_Search_Agent.py     # Web search agent
│   └── 3_Multimodal_RAG.py   # Multimodal interface
├── chroma_db/                # Text vector storage
├── multimodal_stores/        # Multimodal storage
└── requirements.txt          # Dependencies
```

---

## 🔧 **Key Technical Features**

### 🧠 **Architecture Highlights**

- **Two-Layer Vision**: Vision models generate image descriptions; CLIP generates the embeddings
- **Hybrid Search**: Semantic search plus BM25 keyword search for better retrieval
- **Model Caching**: A global cache prevents models from being reloaded
- **Session Management**: Streamlit session state keeps data across interactions

### ⚡ **Performance Optimizations**

- Vision models are cached globally
- Processed embeddings are saved for reuse
- Models are loaded lazily, only when first needed
- Progress feedback is shown in real time
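The caching and lazy-loading points above can be illustrated with a small sketch. It assumes a hypothetical `get_vision_model` loader and uses a plain dictionary as a stand-in for a real model; the README does not say how the project implements its cache. In a Streamlit app, `st.cache_resource` is one built-in way to get similar behavior.

```python
from functools import lru_cache

load_count = 0  # counts how many times the expensive loader really runs

@lru_cache(maxsize=None)
def get_vision_model(name):
    """Load a vision model on first request and reuse it afterwards."""
    global load_count
    load_count += 1
    # Stand-in for an expensive load; a real app would build the model here.
    return {"name": name, "ready": True}

# Nothing is loaded until the first call (lazy loading).
first = get_vision_model("BLIP")
second = get_vision_model("BLIP")  # served from the cache, no reload
other = get_vision_model("GIT")    # a different model triggers one more load

print(first is second, load_count)
```

The second request for `"BLIP"` returns the same object as the first, so the loader runs only once per model name.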

---

## 🤝 **Contributing**

1. Fork the repository
2. Create a feature branch: `git checkout -b feature/your-feature`
3. Make changes and test locally
4. Commit and push: `git commit -m "Add feature"`, then `git push origin feature/your-feature`
5. Create a Pull Request

### 🎯 **Areas for Contribution**

- 🖼️ New vision models or analysis techniques
- 🔍 Better retrieval algorithms
- 🎨 UI/UX improvements
- 📊 Analytics and metrics
- 🧪 Testing and documentation

---

## 📝 **License**

This project is licensed under the **AGPL-3.0 License**.

---

## 🆘 **Support**

- **🐛 Issues**: [GitHub Issues](https://github.com/RobinMillford/Cortex-AI-Multi-Model-Insights-Hub/issues)
- **💬 Discussions**: [GitHub Discussions](https://github.com/RobinMillford/Cortex-AI-Multi-Model-Insights-Hub/discussions)

---

## 🙏 **Acknowledgments**

- **🤗 Hugging Face**: Free open-source vision models
- **🦙 Meta**: Llama models
- **OpenAI**: CLIP image-text embedding model
- **🔍 Salesforce**: BLIP vision models
- **🏢 Microsoft**: GIT vision model
- **⚡ Groq**: Fast LLM inference
- **🌐 Streamlit**: Amazing app framework

---