{"id":13665579,"url":"https://github.com/caesarHQ/textSQL","last_synced_at":"2025-04-26T08:32:33.264Z","repository":{"id":132561725,"uuid":"607438804","full_name":"caesarHQ/textSQL","owner":"caesarHQ","description":null,"archived":false,"fork":false,"pushed_at":"2023-10-20T18:17:35.000Z","size":6219,"stargazers_count":1548,"open_issues_count":10,"forks_count":157,"subscribers_count":23,"default_branch":"main","last_synced_at":"2024-10-29T17:55:58.601Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"https://censusgpt.com","language":"JavaScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/caesarHQ.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"license.md","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-02-28T01:03:37.000Z","updated_at":"2024-10-29T14:02:45.000Z","dependencies_parsed_at":null,"dependency_job_id":"9ee5fdc0-b8a1-4bcc-a93f-f710c3e75a34","html_url":"https://github.com/caesarHQ/textSQL","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/caesarHQ%2FtextSQL","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/caesarHQ%2FtextSQL/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/caesarHQ%2FtextSQL/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/caesarHQ%2FtextSQL/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/caesarHQ","download_url":"https://codeload.github.com/caesarHQ/textSQL/tar.gz/refs/heads/main","host":{"name":"Gi
tHub","url":"https://github.com","kind":"github","repositories_count":224031926,"owners_count":17244361,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-08-02T06:00:42.332Z","updated_at":"2024-11-11T00:30:57.298Z","avatar_url":"https://github.com/caesarHQ.png","language":"JavaScript","readme":"## 🚨 Check out the latest project from the creators of textSQL: [Julius.ai](https://julius.ai?utm_source=github\u0026utm_campaign=textSQL) 🚨\n\n###\n\n# Natural Language → SQL\n\n### \n\n:bridge_at_night: Demo on San Francisco City Data: [SanFranciscoGPT.com](http://sanfranciscogpt.com)\n\n:us: Demo on US Census Data: [CensusGPT.com](https://censusgpt.com)\n\n\n\u003ch3 align=\"center\"\u003e\n\u003ca href=\"http://sanfranciscogpt.com\" target=\"_blank\"\u003e SanFranciscoGPT \u003c/a\u003e\u0026bull;\n  \u003ca href=\"https://censusgpt.com/\" target=\"_blank\"\u003e CensusGPT \u003c/a\u003e\u0026bull;\n  \u003ca href=\"https://t.co/FuOOcB6aGr\"\u003e\u003cb\u003eJoin the Discord Server\u003c/b\u003e\u003c/a\u003e\n\u003c/h3\u003e\n\n\u003c!-- ALL-CONTRIBUTORS-BADGE:START - Do not remove or modify this section --\u003e\n\u003cp align=\"center\"\u003e\n   \u003ca href='http://makeapullrequest.com'\u003e\u003cimg alt='PRs Welcome' src='https://img.shields.io/badge/PRs-welcome-43AF11.svg?style=shields'/\u003e\u003c/a\u003e\n   \u003ca href=\"#contributors\"\u003e\u003cimg src=\"https://img.shields.io/github/contributors/uselotus/lotus.svg?color=c0c8d0\"\u003e\u003c/a\u003e\n   \u003ca href=\"https://github.com/caesarHQ/textSQL/stargazers\"\u003e\u003cimg 
src=\"https://img.shields.io/github/stars/caesarHQ/textSQL?color=e4b442\" alt=\"Github Stars\"\u003e\u003c/a\u003e\n   \u003ca href=\"https://github.com/caesarHQ/textSQL/blob/main/LICENSE\"\u003e\u003cimg src=\"https://img.shields.io/badge/license-MIT-9d2235\" alt=\"License\"\u003e\u003c/a\u003e\n   \u003ca href=\"https://github.com/caesarHQ/textSQL/commits/main\"\u003e\u003cimg alt=\"GitHub commit activity\" src=\"https://img.shields.io/github/commit-activity/m/caesarHQ/textSQL?color=8b55e3\"/\u003e\u003c/a\u003e\n\u003c/p\u003e\n\nWelcome to textSQL, a project which uses LLMs to democratize access to data analysis. Example use cases of textSQL are San Francisco GPT and CensusGPT — natural language interfaces to public data (SF city data and US census data), enabling anyone to analyze and gain insights from the data.\n\n\u003cimg width=\"1316\" alt=\"Screenshot 2023-03-10 at 12 55 44 AM\" src=\"https://user-images.githubusercontent.com/10172332/224270303-087495bd-2391-4e1f-a8ad-ef5ae49ace0c.png\"\u003e\n\n## :thinking: How it works:\nWith CensusGPT, you can ask any question related to census data in natural language. 
\n\nThese natural language questions get converted to SQL using GPT-3.5 and are then used to query the census database.\n\nHere are some examples:\n\n* [🔍 Five cities with a population over 100,000 and lowest crime](https://censusgpt.com/?s=five%20cities%20with%20a%20population%20over%20100%2C000%20and%20lowest%20crime)\n* [🔍 10 highest income areas in california](https://censusgpt.com/?s=10%20highest%20income%20areas%20in%20california)\n\nHere is a similar example from sfGPT:\n\n* [🔍 Which four neighborhoods had the most crime in San Francisco in 2021?](https://censusgpt.com/sf?s=Which+four+neighborhoods+had+the+most+crime+in+San+Francisco+in+2021%3F)\n\n\n#### Diagram:\n\n![TextSQL diagram](https://raw.githubusercontent.com/zafileo23/textSQL/zafileo23-patch-2/TextSQL.svg)\n\n\n## :world_map: Roadmap:\n\nWe're splitting the roadmap for this project broadly into two categories:\n\n\n### 1. Visualizations: \n\nCurrently, textSQL only supports visualizing zip codes and cities on an interactive map and bar chart using [Mapbox](https://www.mapbox.com/) + [Plotly](https://plotly.com/). But data can be visualized in other interesting ways such as Heatmaps and Pie charts. Not every kind of data can be (or should be) visualized on a map. 
For example, a query like _\"What percent of total crime in San Francisco is burglary vs in New York City\"_ is perfect for visualizing as a stacked bar chart, but is really hard to visualize on a map.\n\nBar Chart:\n\n\u003cimg width=\"500\" alt=\"Top 5 richest cities in Washington\" src=\"https://user-images.githubusercontent.com/102765426/224921440-48937efa-ccc2-4718-9f55-09008465f1ae.png\"\u003e\n\n[coming soon] Heatmap: \n\n\u003cimg width=\"480\" alt=\"Screenshot 2023-03-10 at 12 58 33 AM\" src=\"https://user-images.githubusercontent.com/10172332/224271087-58cdcfd9-8940-4543-a3a5-1119477bd209.png\"\u003e\n\n[coming soon] Visualization-GPT: A way to create and iterate on data visualizations in natural language through a text-to-vega engine.\n\n### 2. 🔌 Text-to-SQL BYOD (Bring Your Own Data) [here](https://github.com/caesarHQ/textSQL/tree/main/byod)\n\n\nYou can now connect your own database \u0026 datasets to textSQL and self-host the service. Our vision is to continue to modularize and improve this process.\n\n#### Use cases\n\n- Public-facing interactive interfaces for data. Democratizing public data\n- Empowering researchers. Enabling journalists and other researchers to more easily explore data\n- Business intelligence. Reducing the burden on technical employees to build \u0026 run queries for non-technical employees\n\n\nSetup instructions for BYOD are [here](https://github.com/caesarHQ/textSQL/tree/main/byod).\n\n\n## :pencil: Additional Notes\n\n#### Datasets: \n\nMany users of this project have asked for additional data for both CensusGPT and sfGPT — historical census data (trends), weather, health, transportation and real-estate data. 
Feel free to create a pull request, drop a link to your dataset in our [Discord](https://discord.gg/JZtxhZQQus), or contribute data via our [dedicated submission form](https://airtable.com/shrDKRRGyRCihWEZd).\n\nMore data → Better CensusGPT and sfGPT\n\n#### Query Building:\n\nUsers build complex queries progressively. They start with a simple query like _\"Which neighborhoods in LA have the best schools?\"_ and then progressively add details like _\"with median income that is under $100,000\"_. One of the most powerful aspects of textSQL is that it lets users iterate on a query as a way of uncovering insights.\n\n### \n\n## :computer: How to Contribute:\n\nJoin our [Discord](https://discord.gg/JZtxhZQQus)\n\nReadMe for the backend [here](https://github.com/caesarHQ/textSQL/blob/main/api/README.md)\n\nReadMe for the frontend [here](https://github.com/caesarHQ/textSQL/blob/main/client/censusGPT/README.md)\n\n\u003ca href=\"https://github.com/caesarHQ/textSQL/graphs/contributors\"\u003e\n  \u003cimg src=\"https://contrib.rocks/image?repo=caesarHQ/textSQL\" /\u003e\n\u003c/a\u003e  \n\n### \n\n**Note:** Census data, like any other dataset, has its limitations and potential biases. Some data may not be collected or reported uniformly across different regions or time periods, which can affect the comparability of results. Users should keep these limitations in mind when interpreting the results of their queries and exercise caution when making decisions based on census data.\n","funding_links":[],"categories":["JavaScript","📦 Legacy \u0026 Inactive Projects"],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FcaesarHQ%2FtextSQL","html_url":"https://awesome.ecosyste.ms/projects/github.com%2FcaesarHQ%2FtextSQL","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FcaesarHQ%2FtextSQL/lists"}