{"id":28404662,"url":"https://github.com/sodadata/data-council-workshop","last_synced_at":"2025-07-30T02:05:13.301Z","repository":{"id":43010174,"uuid":"471346342","full_name":"sodadata/data-council-workshop","owner":"sodadata","description":null,"archived":false,"fork":false,"pushed_at":"2022-04-12T09:00:37.000Z","size":1159,"stargazers_count":2,"open_issues_count":0,"forks_count":2,"subscribers_count":11,"default_branch":"main","last_synced_at":"2025-07-29T21:38:35.377Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Shell","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/sodadata.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2022-03-18T11:45:04.000Z","updated_at":"2022-06-15T00:01:08.000Z","dependencies_parsed_at":"2022-09-09T13:52:19.648Z","dependency_job_id":null,"html_url":"https://github.com/sodadata/data-council-workshop","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/sodadata/data-council-workshop","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sodadata%2Fdata-council-workshop","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sodadata%2Fdata-council-workshop/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sodadata%2Fdata-council-workshop/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sodadata%2Fdata-council-workshop/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/sodadata","download_url":"https://codeload.github.com/sodadata/data-council-workshop/tar.gz/refs
/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sodadata%2Fdata-council-workshop/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":267797267,"owners_count":24145700,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-07-30T02:00:09.044Z","response_time":70,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2025-06-01T20:37:29.012Z","updated_at":"2025-07-30T02:05:13.283Z","avatar_url":"https://github.com/sodadata.png","language":"Shell","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Data Council Workshop\n\n## Prerequisites\n\n* Python 3.8+\n* Docker\n\n## Asking for help\n\nTo get assistance during the workshop, ask on the Soda Community Slack.\n\nRegister for the community Slack: [https://go.soda.io/slack](https://go.soda.io/slack)\n\nThen post your question in [the data-council-workshop channel](https://soda-community.slack.com/archives/C0378BFA2P9).\n\n## The Soda Core documentation\n\nIn [the Soda Core documentation](https://docs.soda.io/soda-core/overview.html), you can find all the information on how to install the CLI or library\nand how to use the open source project.\n\n## Get this repository\n\nClone the repository with git or download the zip file from this page. 
\n\n![Where to download](images/getting-the-workshop-files.png)\n\n## Running the demo data Docker container\n\nBefore installing the Soda CLI, we will first launch the demo data Docker container.\nIt's a single Docker container that starts a Postgres database with the demo data \npreloaded on it.\n\nRun the convenience script `scripts/start_postgres.sh`,\nor run these commands individually:\n\n```shell\ncd postgres_demo_data\ndocker-compose up\n```\nYou should see output like this:\n\n```\n[data-council-workshop] scripts/02_start_postgres.sh \nStarting postgres_demo_data_soda-sql-postgres_1 ... done\nAttaching to postgres_demo_data_soda-sql-postgres_1\nsoda-sql-postgres_1  | ********************************************************************************\nsoda-sql-postgres_1  | WARNING: POSTGRES_HOST_AUTH_METHOD has been set to \"trust\". This will allow\nsoda-sql-postgres_1  |          anyone with access to the Postgres port to access your database without\nsoda-sql-postgres_1  |          a password, even if POSTGRES_PASSWORD is set. See PostgreSQL\n...\nsoda-sql-postgres_1  | waiting for server to start....LOG:  database system was shut down at 2022-03-19 09:35:46 UTC\nsoda-sql-postgres_1  | LOG:  MultiXact member wraparound protections are now enabled\nsoda-sql-postgres_1  | LOG:  database system is ready to accept connections\nsoda-sql-postgres_1  | LOG:  autovacuum launcher started\n```\n\nNow you have a Postgres database/warehouse with the demo data preloaded.\n\n[View the demo data on page postgres_demo_data/DATA.md](postgres_demo_data/DATA.md) \n\nLeave this console open until you want to stop the Postgres demo data Docker \ncontainer.\n\nTo stop the container (at the end of the workshop), simply press `CTRL+C`.\n\n## Installing the Soda CLI\n\nTo continue the workshop, start a new console, because the Postgres container is blocking \nthe command line. 
\n\nAfter you have cloned or unzipped the repository, navigate to the `data-council-workshop` folder.\n```shell\ncd data-council-workshop\n```\n\nTo check that you're in the right folder, the output of `ls` should look like this:\n```shell\n[data-council-workshop] ls \nLICENSE\t\t\tchecks\t\t\timages\t\t\tscripts\nREADME.md\t\tconfiguration\t\tpostgres_demo_data\n```\n\nRun the convenience script `scripts/create_venv.sh`,\nor run these commands individually:\n\n```shell\nrm -rf .venv\npython3 -m venv .venv\nsource .venv/bin/activate\npip install --upgrade pip\npip install soda-core-postgres\n```\n\nYou should see output like this:\n```\n[data-council-workshop] rm -rf .venv\n[data-council-workshop] python3 -m venv .venv\n[data-council-workshop] source .venv/bin/activate\n(.venv) [data-council-workshop] pip install --upgrade pip\nRequirement already satisfied: pip in ./.venv/lib/python3.8/site-packages (21.1.1)\nCollecting pip\n  Using cached pip-22.0.4-py3-none-any.whl (2.1 MB)\nInstalling collected packages: pip\n  Attempting uninstall: pip\n    Found existing installation: pip 21.1.1\n    Uninstalling pip-21.1.1:\n      Successfully uninstalled pip-21.1.1\nSuccessfully installed pip-22.0.4\n(.venv) [data-council-workshop] pip install soda-core-postgres\nCollecting soda-core-postgres\n...\nSuccessfully installed Deprecated-1.2.13 Jinja2-2.11.3 antlr4-python3-runtime-4.9.3 backoff-1.10.0 certifi-2021.10.8 charset-normalizer-2.0.12 click-8.0.4 googleapis-common-protos-1.56.0 idna-3.3 markupsafe-2.0.1 opentelemetry-api-1.10.0 
opentelemetry-exporter-otlp-proto-http-1.10.0 opentelemetry-proto-1.10.0 opentelemetry-sdk-1.10.0 opentelemetry-semantic-conventions-0.29b0 protobuf-3.19.4 psycopg2-binary-2.9.3 requests-2.27.1 ruamel.yaml-0.17.21 ruamel.yaml.clib-0.2.6 soda-core-3.0.0b1 soda-core-postgres-3.0.0b1 typing-extensions-4.1.1 urllib3-1.26.9 wrapt-1.14.0\n```\n\nNow verify that the installation went ok by entering command `soda`:\n```shell\n(.venv) [data-council-workshop] soda\nUsage: soda [OPTIONS] COMMAND [ARGS]...\n\n  Soda Core CLI version 3.0.0b1\n\nOptions:\n  --help  Show this message and exit.\n\nCommands:\n  scan    runs a scan\n  update  updates a distribution reference file\n```\n\n## Registering a Soda Cloud account\n\n1. [Sign up for a Soda Cloud account](https://cloud.soda.io/signup)\n\n2. Open `configuration/configuration.yml` in your favorite text editor\n\n3. Create the API key.  In your Soda Cloud account, navigate to your avatar \u003e Profile and choose the API Keys tab, then click the plus icon to generate new API keys.\n\n![Create API key](images/soda_cloud_create_api_key.png)\n\n4. Copy the API Key ID, then paste it into the configuration.yml as the value for api_key.\n\n5. Copy the API Key Secret, then paste it into the configuration.yml as the value for api_secret.\n\n6. Save the changes to the configuration.yml file. Close the Create API Key dialog box in Soda Cloud.\n\n## The SodaCL documentation\n\nIn the next section you'll be executing and exploring SodaCL YAML files. 
\n\nRefer to [the SodaCL documentation](https://docs.soda.io/soda-cl/soda-cl-overview.html), \nwhere you will find all the information on how to write checks.\n\n## Running the checks\n\nReview the SodaCL file `checks/01_basic_checks.yml` in your text/YAML editor.\nIf the checks in the file are not clear, ask us for help.\n\nRun the SodaCL file `checks/01_basic_checks.yml`:\n```shell\nsoda scan -c configuration/configuration.yml -d workshop_ds checks/01_basic_checks.yml\n```\n\nFeel free to experiment: change some of the checks and re-run the scan.  You can refer to \n[the SodaCL documentation](https://docs.soda.io/soda-cl/soda-cl-overview.html) \nto see the possibilities.\n\nIf you encounter issues, ask the community on [Slack](https://soda-community.slack.com/archives/C0378BFA2P9).  Our experts will be monitoring \nthe data-council-workshop channel closely during the workshop.\n\nRun the SodaCL file `checks/02_advanced_and_cool.yml`:\n```shell\nsoda scan -v \"START=2022-02-24 00:00:00\" -v \"END=2022-02-25 00:00:00\" -c configuration/configuration.yml -d workshop_ds checks/02_advanced_and_cool.yml\n```\n\nFinally, review the analyst use cases.\n\nRun the SodaCL file `checks/03_analyst_use_cases.yml`:\n```shell\nsoda scan -c configuration/configuration.yml -d workshop_ds checks/03_analyst_use_cases.yml\n```\n\n## Feedback\n\nAs this is the initial public release of our new SodaCL language, we want to hear \nwhat you think.  Here are some questions that might help formulate your feedback.\n\n* Would you recommend Soda Core to a colleague?\n  * Why or why not?\n  * Do you think it's more relevant to others in your organization?  
If so, who?\n* What do you think is the most attractive feature?\n* Are any specific features missing?\n* How would you compare SodaCL and Soda Core with other open source data quality tools that you know?\n* If you do not plan to use it yet, what is the most important blocker for you?\n\nLet us know in [the data-council-workshop channel](https://soda-community.slack.com/archives/C0378BFA2P9).\n\nOr, if you prefer, send your feedback by email to [workshop@soda.io](mailto:workshop@soda.io).\n\n## Configuring the database in IntelliJ or PyCharm\n\nThis section is optional and only needed when you want to connect your PyCharm or \nIntelliJ to the Postgres DB containing the demo data.  Feel free to use any other \nDB browser of your choice.\n\n* Host: `localhost`\n* Port: `5432`\n* Database: `demo`\n* Username: `soda`\n* Password: not necessary\n\nOr\n\nURL: `jdbc:postgresql://localhost:5432/demo`\n\n![Database configuration in PyCharm](images/intellij-postgres-connection-details.png)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsodadata%2Fdata-council-workshop","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsodadata%2Fdata-council-workshop","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsodadata%2Fdata-council-workshop/lists"}