{"id":28404703,"url":"https://github.com/sodadata/tutorial-demo-project","last_synced_at":"2025-08-08T05:14:31.750Z","repository":{"id":45915464,"uuid":"422529087","full_name":"sodadata/tutorial-demo-project","owner":"sodadata","description":"Material and setup files for Soda-sql getting started tutorial. Provides users with docker containers with soda-sql installed and pre-filled database.","archived":false,"fork":false,"pushed_at":"2021-11-28T18:46:05.000Z","size":25513,"stargazers_count":6,"open_issues_count":0,"forks_count":1,"subscribers_count":12,"default_branch":"main","last_synced_at":"2025-06-02T05:50:52.948Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"Shell","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/sodadata.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2021-10-29T10:13:12.000Z","updated_at":"2022-11-19T17:37:23.000Z","dependencies_parsed_at":"2022-09-23T09:52:32.937Z","dependency_job_id":null,"html_url":"https://github.com/sodadata/tutorial-demo-project","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/sodadata/tutorial-demo-project","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sodadata%2Ftutorial-demo-project","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sodadata%2Ftutorial-demo-project/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sodadata%2Ftutorial-demo-project/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sodadata%2Ftutorial-demo-project/manifests","o
wner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/sodadata","download_url":"https://codeload.github.com/sodadata/tutorial-demo-project/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sodadata%2Ftutorial-demo-project/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":269366887,"owners_count":24405257,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-08-08T02:00:09.200Z","response_time":72,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2025-06-01T20:37:34.503Z","updated_at":"2025-08-08T05:14:31.743Z","avatar_url":"https://github.com/sodadata.png","language":"Shell","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Soda SQL Demo\n\nThis repo provides an easy way to set up a PostgreSQL database with data from the \u003ca href=\"https://data.cityofnewyork.us/Transportation/Bus-Breakdown-and-Delays/ez4e-fazm\" target=\"_blank\"\u003eNYC Bus Breakdowns and Delay Dataset\u003c/a\u003e and a pre-configured Soda SQL project. You can use this repo as a test environment in which to experiment with Soda SQL. 
The Soda SQL interactive demo also references this project.\n\n## Prerequisites\n\n* a recent version of [Docker](https://docs.docker.com/get-docker/) \n* [Docker Compose](https://docs.docker.com/compose/install/) that is able to run `docker-compose` files version 3.9 and later\n\n## Set up using a script\n\nFrom the command-line, run the following command:\n\n```bash\n/bin/bash -c \"$(curl -fsSL https://raw.githubusercontent.com/sodadata/tutorial-demo-project/main/scripts/setup.sh)\"\n```\n\nThe command completes the following tasks:\n\n* fetches and unpacks the demo in the local directory\n* spins up the Docker containers using Docker Compose\n* drops you into a shell in the container, ready to begin using Soda SQL.\n\n```bash\n\n+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+\n| Welcome to the Docker-based shell for testing Soda SQL        |\n| To exit, just type CTRL-D or type \"exit\" and hit return       |\n|                                                               |\n| Type \"hint\" if you don't know where to start                  |\n+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+\n\nSoda SQL, see https://docs.soda.io\n(c) Soda Data NV 2021\nroot@462fc591d108:/workspace# \n```\n\n#### Troubleshoot\n**Problem:** When running the script on a Mac, you get an error such as `failed to solve with frontend dockerfile.v0: failed to read dockerfile: error from sender: open /Users/\u003cuser\u003e/.Trash: operation not permitted`.\n\n**Solution:** You need to grant Full Disk Access to the Terminal application. Go to System Preferences \u003e Security \u0026 Privacy \u003e Privacy, then select Full Disk Access. Check the box next to Terminal to grant full disk access.\n\n## Set up manually\n\n1. Clone this repository to your local environment.\n2. In the command-line, navigate into the tutorial project: `cd tutorial-demo-project`.\n3. 
Build and start the Docker containers: `docker-compose up -d` (the `-d` flag runs the containers in \"detached\" mode, so you do not need to keep the terminal open for the Docker containers to continue to run).\n4. Validate that the setup is complete: `docker ps -a | grep soda`. This command yields output like the following:\n\n```\nCONTAINER ID   IMAGE                                    COMMAND                  CREATED       STATUS         PORTS                                       NAMES\n90b555b29ccd   tutorial-demo-project_soda_sql_project   \"/bin/bash\"              3 hours ago   Exited (2) 3 seconds ago   0.0.0.0:8001-\u003e5432/tcp, :::8001-\u003e5432/tcp   tutorial-demo-project_soda_sql_project_1\nd7950300de7a   postgres                                 \"docker-entrypoint.s…\"   3 hours ago   Up 3 seconds   0.0.0.0:5432-\u003e5432/tcp, :::5432-\u003e5432/tcp   tutorial-demo-project_soda_sql_tutorial_db_1\n```\n5. To run Soda commands and test your dataset, you need to get into the container's shell. From the project's root directory, where the `docker-compose.yml` file is located, run the following command:\n\n```bash\ndocker-compose run --rm soda_sql_project \"cd /workspace \u0026\u0026 /bin/bash\"\n```\nThis command drops you into the container's shell with a prompt like the following:\n\n```bash\nroot@90461262c35e:/workspace# \n```\n\n\n## (Optional) Examine or query the dataset\n\nOnce the Docker container is up, you can use any database client, such as DBeaver or DataGrip, to connect to the database and query the `new_york.breakdowns` dataset.\n\nTo set up a connection in those clients, use the following parameters:\n\n```\nhost: localhost\nusername: sodasql\npassword: \u003cleave empty\u003e\nport: 5432\n```\n\nThe table exists in the `sodasql_tutorial` database in the `new_york` schema. 
You can select it using the following query:\n\n```sql\nselect * from sodasql_tutorial.new_york.breakdowns limit 50;\n```\n\n## Run Soda commands in the Soda SQL Docker container\n\nAccess \u003ca href=\"https://docs.soda.io/soda-sql/configure.html\" target=\"_blank\"\u003edocs.soda.io\u003c/a\u003e for full instructions on how to set up and use Soda SQL.\n\n* Try running `soda` to see a list of Soda commands.\n* Try running `soda create postgres` to create a new Soda SQL project.\n* To enable you to run `soda analyze` and `soda scan` without configuring anything yourself, you can navigate to the `new_york_bus_breakdowns_demo` directory to use a sample `warehouse.yml` and sample `breakdowns-demo.yml` file. In the `new_york_bus_breakdowns_demo` directory, try running:\n    * `soda analyze`\n    * `soda scan warehouse.yml tables/breakdowns-demo.yml` \n\nThe output from the scan command yields something like this:\n\n```\n  | 2.1.0b18\n  | Scanning tables/breakdowns-demo.yml ...\n...\n  | Derived measurement: invalid_count(school_age_or_prek) = 0\n  | Derived measurement: valid_percentage(school_age_or_prek) = 100.0\n  | Test test(row_count \u003e 0) passed with measurements {\"expression_result\": 199998, \"row_count\": 199998}\n  | Test column(school_year) test(invalid_percentage == 0) passed with measurements {\"expression_result\": 0.0, \"invalid_percentage\": 0.0}\n  | Test column(bus_no) test(invalid_percentage \u003c= 20) passed with measurements {\"expression_result\": 19.99919999199992, \"invalid_percentage\": 19.99919999199992}\n  | Test column(schools_serviced) test(invalid_percentage \u003c= 15) passed with measurements {\"expression_result\": 12.095620956209562, \"invalid_percentage\": 12.095620956209562}\n  | Test column(incident_number) test(invalid_percentage == 0) failed with measurements {\"expression_result\": 0.4785047850478505, \"invalid_percentage\": 0.4785047850478505}\n  | Test column(incident_number) test(missing_count == 0) failed with 
measurements {\"expression_result\": 192614, \"missing_count\": 192614}\n  | Executed 2 queries in 0:00:02.360158\n  | Scan summary ------\n  | 245 measurements computed\n  | 6 tests executed\n  | 2 of 6 tests failed:\n  |   Test column(incident_number) test(invalid_percentage == 0) failed with measurements {\"expression_result\": 0.4785047850478505, \"invalid_percentage\": 0.4785047850478505}\n  |   Test column(incident_number) test(missing_count == 0) failed with measurements {\"expression_result\": 192614, \"missing_count\": 192614}\n  | Exiting with code 1\n```\n\n\n### Modify the tests \n\nIn the `new_york_bus_breakdowns_demo` directory, you can use a command-line text editor to open the `breakdowns-demo.yml` and adjust the existing tests or add new ones to the YAML file. Save the file, then run `soda scan warehouse.yml tables/breakdowns-demo.yml` again to see the results of your new or modified tests.\n\nLearn how to define tests in the YAML file at \u003ca href=\"https://docs.soda.io/soda-sql/tests.html\" target=\"_blank\"\u003edocs.soda.io\u003c/a\u003e.\n\n## Exit\n\nWhen you're done with the test environment, you can stop the Docker container.\n\n* If you set up the container using the script, use `ctrl+d` to shut down your container, or type `exit`.\n* If you set up the container manually, type `exit` or use `docker-compose down`.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsodadata%2Ftutorial-demo-project","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsodadata%2Ftutorial-demo-project","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsodadata%2Ftutorial-demo-project/lists"}