https://github.com/codestrate/vigilius_analyst
AI agent that ingests datasets and answers natural-language questions by translating them to SQL. Connected via a Streamlit UI. More features TBD.
- Host: GitHub
- URL: https://github.com/codestrate/vigilius_analyst
- Owner: CodeStrate
- Created: 2025-09-06T19:58:39.000Z (9 months ago)
- Default Branch: sql-agent
- Last Pushed: 2025-12-25T08:58:13.000Z (6 months ago)
- Last Synced: 2025-12-26T21:52:16.751Z (6 months ago)
- Topics: ai-agents, langchain-python, langgraph-python, ollama, openai-api, python
- Language: Python
- Homepage:
- Size: 168 KB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Vigilius Analyst
[Python 3.12.9](https://www.python.org/downloads/release/python-3129/)
[Conda](https://docs.conda.io/en/latest/)
An intelligent **data analysis assistant** built with **Streamlit** and **LangGraph**/**LangChain**.
Vigilius lets you query datasets in **natural language**, automatically generating SQL, visualizations, and insights, all powered by multiple AI providers.
---
## Key Features
- **Multi-Provider AI Support**: OpenAI, Groq, Gemini, and Ollama
- **Smart SQL Agent**: Generates, validates, and executes SQL from natural language
- **Data Assistant**: Handles small talk and intent classification
- **Multiple File Formats**: CSV, Excel, and SQLite database support
- **Streaming Responses**: Real-time answers with clean formatting
- **Session Management**: Persistent chat history (via a LangGraph checkpointer) and model configs
---
## Project Structure
Based on the [`sql-agent`](https://github.com/CodeStrate/Vigilius_Analyst/tree/sql-agent) branch:
```
Vigilius_Analyst/
├── agent/
│   ├── agent_handler.py            # Core SQL agent logic
│   ├── data_assistant_handler.py   # Intent classification + small talk
│   ├── llm_factory.py              # Multi-provider LLM factory
│   └── prompts.py                  # Agent system prompts
│
├── assets/
│   └── chat_icons/                 # User & bot avatars
│
├── backend/                        # FastAPI backend (future scope)
│
├── datasets/                       # Uploaded and processed datasets
│
├── debug/
│   └── check_agent.py              # CLI testing tool for agents
│
├── prebuilt/
│   └── react_sql_agent.py          # LangGraph ReAct SQL agent template
│
├── utils/
│   ├── ai_providers.py             # Provider configs + available models
│   ├── app_utils.py                # Streamlit utilities
│   └── misc_utils.py               # General helper functions
│
├── .env                            # Env file for API keys (more in future)
├── agent_graph.png                 # Mermaid image of the agent graph architecture
├── app.py                          # Streamlit frontend entrypoint
├── requirements.txt                # Python dependencies
└── README.md
```
---
## Setup Instructions
### Requirements
- **Python** ≥ 3.12.9
- Works on macOS, Linux, Windows
### Option 1: Virtualenv
```bash
git clone https://github.com/CodeStrate/Vigilius_Analyst.git
cd Vigilius_Analyst
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
pip install -r requirements.txt
```
### Option 2: Conda
```bash
git clone https://github.com/CodeStrate/Vigilius_Analyst.git
cd Vigilius_Analyst
conda create -n vigilius python=3.12.9
conda activate vigilius
pip install -r requirements.txt
```
### Configure Environment
Create a `.env` file with your keys:
```ini
# AI Provider Keys
OPENAI_API_KEY=your_openai_key
GROQ_API_KEY=your_groq_key
GEMINI_API_KEY=your_gemini_key
# Ollama requires no API key (runs locally)
# Install from: https://ollama.com/download
```
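To see which providers are usable before launching the app, a small check along these lines may help. It is a minimal sketch, not code from the repository: `available_providers` and `PROVIDER_KEY_VARS` are hypothetical names, and it assumes the variables from `.env` have already been loaded into the process environment (for example with python-dotenv).

```python
import os

# Environment variable names taken from the .env example above.
# Ollama runs locally and needs no key, so it maps to None.
PROVIDER_KEY_VARS = {
    "openai": "OPENAI_API_KEY",
    "groq": "GROQ_API_KEY",
    "gemini": "GEMINI_API_KEY",
    "ollama": None,
}


def available_providers(env=None):
    """Return providers that need no key or have a non-empty key set.

    Hypothetical helper for illustration; `env` defaults to os.environ.
    """
    env = os.environ if env is None else env
    ready = []
    for provider, var in PROVIDER_KEY_VARS.items():
        if var is None or env.get(var, "").strip():
            ready.append(provider)
    return ready


if __name__ == "__main__":
    print("Configured providers:", ", ".join(available_providers()))
```

Passing a plain dict as `env` makes the check easy to exercise without touching real credentials.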
### (Optional) Install Ollama Models
```bash
ollama pull llama3:8b
```
---
## Running Vigilius
**Web App (Streamlit)**
```bash
streamlit run app.py
```
**Terminal Debugging**
```bash
python -m debug.check_agent
```
---
## Usage Guide
1. **Upload Your Dataset**
   - Supported: CSV, Excel (.xlsx), SQLite (.db)
   - Files are converted to SQLite and their schema is analyzed
2. **Select Models**
   - Choose AI providers and models for the SQL Agent and Data Assistant
   - Confirm the selection to initialize
3. **Chat with Your Data**
   - Example queries:
     - "Top 10 customers by sales"
     - "Revenue trends by month"
     - "Most popular products"
4. **Get Results**
   - Auto-generated SQL → executed on the database
   - Results shown as tables where available (pandas output is WIP)
   - Streaming responses with formatting
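The upload step (convert a file to SQLite, then read its schema) can be sketched with the standard library alone. This is an illustration under stated assumptions, not the repository's implementation: `load_csv_to_sqlite` and `describe_table` are hypothetical helpers, every column is stored as `TEXT`, and the real app may well use pandas instead.

```python
import csv
import io
import sqlite3


def _quote(identifier):
    """Quote an SQLite identifier so names with spaces or quotes are safe."""
    return '"' + identifier.replace('"', '""') + '"'


def load_csv_to_sqlite(csv_text, table, conn):
    """Create `table` from CSV text; the first row supplies column names."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    columns = ", ".join(_quote(name) + " TEXT" for name in header)
    conn.execute("CREATE TABLE {} ({})".format(_quote(table), columns))
    placeholders = ", ".join("?" for _ in header)
    conn.executemany(
        "INSERT INTO {} VALUES ({})".format(_quote(table), placeholders),
        reader,
    )
    conn.commit()


def describe_table(conn, table):
    """Return (column_name, declared_type) pairs via PRAGMA table_info."""
    rows = conn.execute("PRAGMA table_info({})".format(_quote(table))).fetchall()
    return [(row[1], row[2]) for row in rows]


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    load_csv_to_sqlite("customer,sales\nAcme,120\nGlobex,75\n", "orders", conn)
    print(describe_table(conn, "orders"))
```

The schema pairs returned by `describe_table` are the kind of context an agent would need in its prompt before generating SQL for a question such as "Top 10 customers by sales".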
---
## AI Agent Architecture
### SQL Agent
- Schema discovery
- Query generation + validation
- Results formatting
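One possible shape for the validation step is shown below. It is a sketch under assumptions rather than the logic in `agent/agent_handler.py`: `validate_select` is a hypothetical name, only plain `SELECT` statements are accepted (CTEs are excluded for simplicity), and SQLite's `EXPLAIN` is used to compile the query without running it.

```python
import sqlite3


def validate_select(conn, sql):
    """Return (ok, message) for a candidate query without executing it."""
    statement = sql.strip().rstrip(";").strip()
    # Assumption: the agent only reads data, so reject anything that is
    # not a single SELECT. A ';' inside a string literal is also rejected,
    # which is a known limitation of this simple check.
    if ";" in statement:
        return False, "multiple statements are not allowed"
    if not statement.lower().startswith("select"):
        return False, "only SELECT queries are allowed"
    try:
        # EXPLAIN compiles the statement, so syntax errors and unknown
        # tables or columns surface here without touching the data.
        conn.execute("EXPLAIN " + statement)
    except sqlite3.Error as exc:
        return False, str(exc)
    return True, "ok"


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (customer TEXT, sales INTEGER)")
    print(validate_select(conn, "SELECT customer FROM orders"))
    print(validate_select(conn, "SELECT missing FROM orders"))
```

Returning the database error message lets an agent loop feed it back to the model and retry with a corrected query.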
### Data Assistant
- Handles non-data queries
- Classifies and validates intent (SQL vs. small talk)
- Maintains conversation flow
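The routing idea (classify the intent, validate the label, then dispatch) might look roughly like this. Everything here is hypothetical: the function names, the intent labels, and especially `keyword_classifier`, which is a toy stand-in for the LLM-based classification the project describes.

```python
import re

SQL_INTENT = "sql"
SMALL_TALK_INTENT = "small_talk"


def route_message(message, classify, run_sql_agent, reply_small_talk):
    """Dispatch a message to the SQL agent or the small-talk handler."""
    intent = classify(message)
    # Validate the label: only an exact SQL intent reaches the database
    # path, so a malformed classifier output falls back to small talk.
    if intent == SQL_INTENT:
        return run_sql_agent(message)
    return reply_small_talk(message)


def keyword_classifier(message):
    """Toy classifier based on whole-word matches (illustration only)."""
    data_words = {"top", "revenue", "sales", "average", "count", "trends", "popular"}
    words = set(re.findall(r"[a-z]+", message.lower()))
    return SQL_INTENT if words & data_words else SMALL_TALK_INTENT


if __name__ == "__main__":
    for text in ("Top 10 customers by sales", "Hi, how are you?"):
        print(text, "->", keyword_classifier(text))
```

Injecting the classifier and both handlers as callables keeps the router testable without any model or database behind it.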
### LLM Factory
- Unified interface for all providers
- Dynamic model switching
- Provider-specific optimizations
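A unified interface with dynamic switching is often built as a registry of per-provider builders. The sketch below illustrates that pattern only; it is not the contents of `agent/llm_factory.py`. `ModelConfig`, `register_provider`, and `create_llm` are hypothetical names, and the builder returns a placeholder string where real code would construct a chat model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelConfig:
    provider: str
    model: str
    temperature: float = 0.0


# Hypothetical registry mapping a provider name to a builder callable.
_BUILDERS = {}


def register_provider(name):
    """Decorator that registers a builder function under `name`."""
    def decorator(builder):
        _BUILDERS[name] = builder
        return builder
    return decorator


def create_llm(config):
    """Look up the builder for config.provider and call it."""
    try:
        builder = _BUILDERS[config.provider]
    except KeyError:
        raise ValueError(f"unknown provider: {config.provider!r}") from None
    return builder(config)


@register_provider("ollama")
def _build_ollama(config):
    # Placeholder: a real builder would return a chat model object.
    # A string keeps this sketch free of third-party dependencies.
    return f"ollama:{config.model}@{config.temperature}"


if __name__ == "__main__":
    print(create_llm(ModelConfig("ollama", "llama3:8b")))
```

Switching models then means passing a different `ModelConfig`, and adding a provider means registering one more builder.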
---
## Configuration
| Provider | Models | Best For |
|-----------|--------|----------|
| **OpenAI** | GPT-4, GPT-3.5 | High accuracy, complex queries |
| **Groq** | Llama-3, Mixtral | Ultra-fast inference |
| **Gemini** | Gemini-Pro | Google's latest models |
| **Ollama** | Llama3, Mistral, CodeLlama | Local, private, free |
- Edit prompts → `agent/prompts.py`
- Adjust model configs → `agent/llm_factory.py`
- UI tweaks → `app.py`
---
## Testing
**CLI Debugging**
```bash
python -m debug.check_agent
```
---
## Future Roadmap
- ✅ FastAPI backend (multi-user, sessions, API access)
- ✅ Persistent chat history
- ✅ Export results (CSV, Excel, PDF)
- ✅ Advanced visualizations + customization
- ✅ Scheduled reports + notifications
---
## Contributing
1. Fork this repo
2. Create a branch (`git checkout -b feature/your-feature`)
3. Commit (`git commit -m "Add your feature"`)
4. Push (`git push origin feature/your-feature`)
5. Open a Pull Request
---
## Support
For help or feature requests, please [open an issue](../../issues).