https://github.com/first-coding/aidanalyst
AIDAnalyst is an AI-powered data analysis tool that leverages large language models (LLMs) to generate SQL queries from natural language prompts. Upload CSV files, explore the data schema, and retrieve insights with ease. The system ensures error correction in SQL queries, delivering detailed reports and visualizations in a streamlined workflow
https://github.com/first-coding/aidanalyst
data-analysis llm openai prompt-engineering python
Last synced: 9 months ago
JSON representation
AIDAnalyst is an AI-powered data analysis tool that leverages large language models (LLMs) to generate SQL queries from natural language prompts. Upload CSV files, explore the data schema, and retrieve insights with ease. The system ensures error correction in SQL queries, delivering detailed reports and visualizations in a streamlined workflow
- Host: GitHub
- URL: https://github.com/first-coding/aidanalyst
- Owner: first-coding
- Created: 2025-03-07T13:33:53.000Z (about 1 year ago)
- Default Branch: master
- Last Pushed: 2025-05-25T15:49:21.000Z (10 months ago)
- Last Synced: 2025-06-04T20:06:36.048Z (10 months ago)
- Topics: data-analysis, llm, openai, prompt-engineering, python
- Language: Python
- Homepage:
- Size: 4.95 MB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
Awesome Lists containing this project
README
## AIDAnalyst

### Project Overview
- This project leverages large language models (LLMs) for data analysis, utilizing prompt engineering to generate SQL queries that retrieve data. The system helps you analyze the data, generate charts, and answer any relevant questions you propose.
### How To Use:
1. ```bash
pip install -r requirements.txt
```
2. ```bash
python main.py --openai_key your_key
```
- **Upload data**:Upload one or more CSV files for analysis.
- **Explore Data Schema**: After uploading, the system generates a schema description to help you understand your data structure.
- **Ask Questions**: Type your query in natural language about the data.
- **View Reports and Visualizations**: The system will generate SQL queries based on your description, fetch the data, debug and regenerate SQL queries as needed, and create visualizations along with a detailed report.
### Workflow
The project follows a structured data analysis workflow:
1. **Data Upload**: CSV files are uploaded for processing.
2. **Data Schema Generation**: The system generates a description of your data schema.
3. **Query Description**: Describe the data you want to analyze in natural language.
4. **SQL Query Generation**: The system generates SQL queries based on the description.
5. **SQL Execution**: Queries are executed on the uploaded data to fetch the necessary information.
6. **Report and Visualization**: The system generates detailed reports and visualizations based on the retrieved data.
7. **Error Handling**: If SQL errors are encountered, the system provides suggestions for corrections.
8. **Final Report**: Receive a comprehensive report, including SQL queries, retrieved data, and visualizations.
### Dependencies
- openai: OpenAI Python SDK for interfacing with GPT
- pandas: For data manipulation and analysis.
- pandasql: For running SQL queries on DataFrame objects.
- gradio: For creating the user interface.
### Suggestions and Issues
- We hope this project proves helpful! If you have any feedback, ideas, or questions, feel free to contact me through the issues section. Thank you!