https://github.com/ssahas/data-analysis-copilot
An application which can be used for data analysis using natural language.
- Host: GitHub
- URL: https://github.com/ssahas/data-analysis-copilot
- Owner: SSahas
- Created: 2025-02-27T13:32:48.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2025-02-27T13:35:54.000Z (over 1 year ago)
- Last Synced: 2025-02-27T19:19:18.931Z (over 1 year ago)
- Topics: codellm, copilot, streamlit, texttosql
- Language: Python
- Homepage:
- Size: 8.79 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: readme.md
# Data Analysis Copilot
Data Analysis Copilot is an AI-powered tool that helps you analyze data using natural language queries. It uses LLMs to turn your questions into SQL queries and Python analysis code, and shows the results in an interactive interface.
## Features
- Natural language to SQL query conversion
- Automated data analysis code generation
- Interactive Streamlit interface
- Support for SQLite databases
- Visualization capabilities
## Installation
1. Clone the repository:
```bash
git clone https://github.com/SSahas/data-analysis-copilot.git
cd data-analysis-copilot
```
2. Create a virtual environment and activate it:
```bash
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
```
3. Install dependencies:
```bash
pip install -r requirements.txt
```
## Usage
1. Start the Streamlit app:
```bash
streamlit run src/app.py
```
2. Upload your SQLite database file through the interface.
3. Enter your analysis questions in natural language.
4. View the generated SQL queries, analysis code, and results.
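The steps above imply that the app reads the uploaded database's schema and runs the generated SQL against it. The following is a minimal sketch of that idea using only Python's standard `sqlite3` module. It is not taken from this repository: `describe_schema`, `run_query`, and the `sales` table are hypothetical names chosen for illustration, and the SELECT-only guard is an assumption about how one might restrict generated queries.

```python
import sqlite3


def describe_schema(conn: sqlite3.Connection) -> str:
    """Return the CREATE TABLE statements, e.g. to use as LLM prompt context."""
    rows = conn.execute(
        "SELECT sql FROM sqlite_master "
        "WHERE type = 'table' AND sql IS NOT NULL ORDER BY name"
    ).fetchall()
    return "\n".join(row[0] for row in rows)


def run_query(conn: sqlite3.Connection, sql: str):
    """Run a generated query and return (column_names, rows)."""
    # Assumption: this sketch only accepts read-only statements.
    if not sql.lstrip().lower().startswith(("select", "with")):
        raise ValueError("only SELECT queries are allowed in this sketch")
    cursor = conn.execute(sql)
    columns = [description[0] for description in cursor.description]
    return columns, cursor.fetchall()


if __name__ == "__main__":
    # Hypothetical in-memory database standing in for an uploaded file.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?)",
        [("north", 120.0), ("south", 80.0), ("north", 30.0)],
    )
    print(describe_schema(conn))
    # Stand-in for SQL that an LLM might generate from a question.
    generated_sql = (
        "SELECT region, SUM(amount) AS total "
        "FROM sales GROUP BY region ORDER BY region"
    )
    print(run_query(conn, generated_sql))
```

For an uploaded file, the connection would be opened on the saved file path instead of `":memory:"`.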
## Project Structure
```
data-analysis-copilot/
├── src/ # Source code
├── core/ # Core functionality
├── services/ # Business logic services
├── utils/ # Utility functions
├── requirements.txt # Project dependencies
└── README.md # Project documentation
```
## Improvements
1. Add an intent classification layer that decides whether a user input needs code generation for data analysis or only a database query. Skipping code generation for query-only requests would reduce both execution time and resource usage.
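
One way such a layer could look is sketched below, assuming a simple keyword heuristic. This is not part of the project: `classify_intent`, the `"analysis"` and `"query"` labels, and the keyword list are all hypothetical, and a real implementation might use an LLM or a trained classifier instead.

```python
import re

# Hypothetical hint words suggesting the request needs generated analysis code.
ANALYSIS_HINTS = frozenset({
    "plot", "chart", "graph", "visualize", "visualise",
    "correlation", "trend", "forecast", "regression", "distribution",
})


def classify_intent(question: str) -> str:
    """Return "analysis" if code generation seems needed, else "query"."""
    words = set(re.findall(r"[a-z]+", question.lower()))
    return "analysis" if words & ANALYSIS_HINTS else "query"


if __name__ == "__main__":
    for question in (
        "How many orders were placed in 2024?",
        "Plot monthly revenue as a bar chart",
    ):
        print(question, "->", classify_intent(question))
```

Requests labeled `"query"` could then be answered with SQL alone, and only `"analysis"` requests would go through the code generation step.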