{"id":28603854,"url":"https://github.com/visvav/natural-language-to-sql-query-generation-with-gemini","last_synced_at":"2025-10-27T02:19:55.128Z","repository":{"id":296837623,"uuid":"994669380","full_name":"VisvaV/Natural-Language-to-SQL-Query-Generation-with-Gemini","owner":"VisvaV","description":"Convert natural language queries into SQL using LangChain and Google's Gemini model. This notebook demonstrates integration with Cloud SQL, query generation, automated execution, and response rephrasing. Ideal for AI-powered database interactions and automated analytics workflows.","archived":false,"fork":false,"pushed_at":"2025-06-02T10:56:22.000Z","size":35,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2025-06-11T17:21:41.765Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/VisvaV.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2025-06-02T09:50:06.000Z","updated_at":"2025-06-02T10:56:24.000Z","dependencies_parsed_at":"2025-06-02T21:43:20.014Z","dependency_job_id":null,"html_url":"https://github.com/VisvaV/Natural-Language-to-SQL-Query-Generation-with-Gemini","commit_stats":null,"previous_names":["visvav/natural-language-to-sql-query-generation-with-gemini"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/VisvaV/Natural-Language-to-SQL-Query-Generation-with-Gemini","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/re
positories/VisvaV%2FNatural-Language-to-SQL-Query-Generation-with-Gemini","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/VisvaV%2FNatural-Language-to-SQL-Query-Generation-with-Gemini/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/VisvaV%2FNatural-Language-to-SQL-Query-Generation-with-Gemini/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/VisvaV%2FNatural-Language-to-SQL-Query-Generation-with-Gemini/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/VisvaV","download_url":"https://codeload.github.com/VisvaV/Natural-Language-to-SQL-Query-Generation-with-Gemini/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/VisvaV%2FNatural-Language-to-SQL-Query-Generation-with-Gemini/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":278909715,"owners_count":26066887,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-10-08T02:00:06.501Z","response_time":56,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2025-06-11T17:07:57.828Z","updated_at":"2025-10-08T07:43:28.360Z","avatar_url":"https://github.com/VisvaV.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"\n# Natural Language to SQL Query Generation using Gemini and 
LangChain\n\n## Project Overview\nThis project demonstrates how to convert natural language questions into SQL queries using Google's Gemini AI and LangChain. It integrates with Google Cloud SQL to execute queries automatically, providing efficient and intuitive database interactions using natural language.\n\n## Key Features\n\n- Natural Language Processing (NLP): Converts plain English into SQL queries.\n- AI Integration: Uses Google's Gemini AI model (gemini-2.0-flash) to generate SQL queries.\n- Automated Query Execution: Automatically executes generated queries on Google Cloud SQL.\n- Result Rephrasing: Rephrases raw query results as a readable natural-language answer.\n- Error Handling: Comprehensive diagnostics and SQL query sanitization.\n\n## Technical Stack\n\n- Python 3\n- LangChain\n- Gemini AI (Google Generative AI)\n- Google Cloud SQL\n- SQLAlchemy\n- PyMySQL\n\n## Setup Instructions\n\n### 1. Clone the Repository\n```bash\ngit clone https://github.com/VisvaV/Natural-Language-to-SQL-Query-Generation-with-Gemini\ncd Natural-Language-to-SQL-Query-Generation-with-Gemini\n```\n\n### 2. Install Dependencies\n```bash\npip install langchain langchain-openai langchain-google-genai sqlalchemy pymysql google-cloud-sql-connector langsmith\n```\n\n### 3. Configure Environment Variables\n\nSet these variables:\n```bash\nexport GOOGLE_API_KEY='your_google_api_key'\nexport LANGSMITH_API_KEY='your_langsmith_api_key'\nexport LANGCHAIN_ENDPOINT='https://api.smith.langchain.com'\nexport LANGCHAIN_PROJECT='your_langchain_project_name'\n```\n\n### 4. Google Cloud Authentication\nPlace your Google Cloud credentials JSON file in the project directory and set the path:\n```python\nimport os\nos.environ[\"GOOGLE_APPLICATION_CREDENTIALS\"] = \"path/to/your/credentials.json\"\n```\n\n## Database Schema Description\nInclude your database schema description CSV in the repository root as `database_table_descriptions.csv`. 
This file helps the model accurately generate queries.\n\n```bash\ndatabase_table_descriptions.csv\n```\n\nThis CSV should describe tables, columns, data types, constraints, and relationships.\n\n## Workflow and Concepts Explained\n\n### LangChain Pipelines\nLangChain manages NLP inputs, the Gemini model, and database execution. Workflow steps:\n1. Prompt Templating\n2. SQL Query Generation by Gemini\n3. Query Cleaning\n4. Execution on Google Cloud SQL\n5. Result Rephrasing\n\n### Prompt Engineering\nDetailed schema descriptions and clear task definitions greatly enhance query accuracy.\n\n### Error Handling and Debugging\nThe project includes detailed error handling:\n- KeyError: Use consistent input names (`input`, `question`, `table_info`) across the chain.\n- SQL Syntax Errors: Handled through SQLAlchemy, ensuring adherence to schema.\n\n## Troubleshooting\n\nCommon issues:\n- Credential Errors: Verify Google Cloud credentials and paths.\n- KeyError Issues: Ensure correct input keys.\n- SQL Query Errors: Verify queries against the provided schema.\n\n## Use Cases\n\nApplicable for:\n- Business Intelligence Automation\n- AI-powered Database Interfaces\n- Automated Analytics and Reporting\n- Quick Data Exploration\n\n## Future Enhancements\n\n- Implement caching mechanisms.\n- Support additional SQL dialects and databases.\n- Optimize prompt engineering for accuracy.\n\n## Contribution\n\nContributions are welcome. Submit issues or pull requests via GitHub.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fvisvav%2Fnatural-language-to-sql-query-generation-with-gemini","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fvisvav%2Fnatural-language-to-sql-query-generation-with-gemini","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fvisvav%2Fnatural-language-to-sql-query-generation-with-gemini/lists"}