https://github.com/codera21/text2sql
An LLM application that converts text to SQL and executes it
- Host: GitHub
- URL: https://github.com/codera21/text2sql
- Owner: codera21
- Created: 2025-02-15T02:51:29.000Z (3 months ago)
- Default Branch: main
- Last Pushed: 2025-02-24T11:13:59.000Z (3 months ago)
- Last Synced: 2025-02-24T12:26:43.066Z (3 months ago)
- Language: Python
- Size: 91.8 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# text2sql
An LLM application that converts text to SQL and executes it.
## Problem and Approach
### Problem
Writing SQL queries is difficult for users who are unfamiliar with SQL syntax. These users often need to extract specific information from a database but lack the expertise to write the necessary queries.
### Approach
The application uses a Large Language Model (LLM) to convert natural-language text into SQL queries. Users enter a question in plain English; the application translates it into SQL, executes the query, and returns the results.
## Key Implementation Decisions
### Database Choice
DuckDB is used as the database engine because it is efficient and integrates easily with Python. It supports SQL and is optimized for analytical queries. Because the dataset is in CSV format, DuckDB's built-in CSV support is a good fit: CSV files can be queried with SQL directly, without a separate import step.
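As a rough illustration of the CSV support described above, the sketch below exposes every CSV file in a folder as a DuckDB view using DuckDB's `read_csv_auto` table function. The helper names (`csv_view_sql`, `register_dataset`, `demo`) and the one-view-per-file layout are assumptions made for this example; the repository may load the data differently.

```python
from pathlib import Path


def csv_view_sql(table_name: str, csv_path: str) -> str:
    """Build a statement that exposes one CSV file as a DuckDB view."""
    if not table_name.isidentifier():
        raise ValueError(f"unsafe table name: {table_name!r}")
    escaped = csv_path.replace("'", "''")  # escape quotes inside the SQL string literal
    return (
        f"CREATE OR REPLACE VIEW {table_name} AS "
        f"SELECT * FROM read_csv_auto('{escaped}')"
    )


def register_dataset(con, dataset_dir: str = "./dataset") -> list[str]:
    """Create one view per CSV file in dataset_dir and return the view names."""
    names = []
    for path in sorted(Path(dataset_dir).glob("*.csv")):
        con.execute(csv_view_sql(path.stem, path.as_posix()))
        names.append(path.stem)
    return names


def demo() -> None:
    """Hypothetical entry point: print the row count of each registered CSV."""
    import duckdb  # third-party dependency, imported lazily for this sketch

    con = duckdb.connect()  # in-memory database
    for name in register_dataset(con):
        count = con.execute(f"SELECT COUNT(*) FROM {name}").fetchone()[0]
        print(name, count)
```

Calling `demo()` with the CSV files in `./dataset` would print one row count per file. Views keep the data in the CSV files, so nothing is copied until a query runs.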
### LLM Integration
The application integrates with the Gemini LLM to generate SQL queries from user prompts. This involves sending the user prompt to the LLM and receiving the generated SQL query.
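The round trip described above can be sketched as follows. This is a minimal illustration, not the repository's actual code: the prompt wording, the function names, and the `llm` callable (a stand-in for whatever function sends the prompt to Gemini and returns its text) are all assumptions. The sketch also assumes the model may wrap its answer in a Markdown code fence, which is stripped before the SQL is used.

```python
import re
from typing import Callable

# Hypothetical prompt; the real application may phrase this differently.
PROMPT_TEMPLATE = """You write DuckDB SQL queries.
Schema:
{schema}

Question: {question}
Return a single SQL query and nothing else."""

FENCE = "`" * 3  # Markdown code-fence marker
FENCED_SQL = re.compile(
    re.escape(FENCE) + r"(?:sql)?\s*(.*?)" + re.escape(FENCE),
    re.DOTALL | re.IGNORECASE,
)


def build_prompt(question: str, schema: str) -> str:
    """Combine the table schema and the user's question into one prompt."""
    return PROMPT_TEMPLATE.format(schema=schema.strip(), question=question.strip())


def extract_sql(response_text: str) -> str:
    """Return the SQL from a reply, dropping an optional code fence and trailing semicolon."""
    match = FENCED_SQL.search(response_text)
    sql = match.group(1) if match else response_text
    return sql.strip().rstrip(";")


def text_to_sql(question: str, schema: str, llm: Callable[[str], str]) -> str:
    """Send the prompt through llm (assumed: prompt text in, reply text out)."""
    return extract_sql(llm(build_prompt(question, schema)))
```

Passing the model call in as a plain callable keeps the prompt and parsing logic testable without network access or an API key.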
### Key assumption
- There is currently no session handling, so the application assumes a single user running it locally.
### Error handling
- Because Gemini is a third-party API, calls to it can fail transiently. I have added a retry mechanism that retries failed API calls with exponential backoff.
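A retry with exponential backoff can be sketched roughly as below. The function name, the default attempt count and delays, and the choice to retry on any exception are assumptions for illustration; the repository's own retry logic may differ, and production code would normally retry only on errors known to be transient.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    call: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run call(); on failure wait base_delay, then 2x, 4x, ... and try again."""
    for attempt in range(max_attempts):
        try:
            return call()
        except Exception:  # assumption: every error is treated as retryable
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            # Double the wait after each failure, capped at max_delay.
            sleep(min(max_delay, base_delay * 2 ** attempt))
    raise ValueError("max_attempts must be at least 1")
```

A call to the LLM could then be wrapped as `retry_with_backoff(lambda: ask_llm(prompt))`, where `ask_llm` is a hypothetical function that performs the API request. The `sleep` parameter exists so the waiting can be replaced in tests.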
## Demo
### Running the Application
To run the application, follow these steps:
- Download the dataset from Kaggle: https://www.kaggle.com/datasets/usdot/flight-delays/data and place the CSV files in the `./dataset` folder.
- Set your Gemini API key as shown in `.env.example`.
- Install Poetry and run `poetry install`.
- Finally, run the application:
```sh
python src/api.py
```