{"id":48954906,"url":"https://github.com/sodadata/soda-cli","last_synced_at":"2026-04-17T23:12:28.533Z","repository":{"id":349059930,"uuid":"1172595675","full_name":"sodadata/soda-cli","owner":"sodadata","description":null,"archived":false,"fork":false,"pushed_at":"2026-04-17T08:58:17.000Z","size":60596,"stargazers_count":6,"open_issues_count":3,"forks_count":1,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-04-17T10:36:42.342Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Go","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/sodadata.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-03-04T13:43:22.000Z","updated_at":"2026-04-17T08:55:23.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/sodadata/soda-cli","commit_stats":null,"previous_names":["sodadata/soda-cli"],"tags_count":2,"template":false,"template_full_name":null,"purl":"pkg:github/sodadata/soda-cli","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sodadata%2Fsoda-cli","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sodadata%2Fsoda-cli/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sodadata%2Fsoda-cli/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sodadata%2Fsoda-cli/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/sodadata","download_url":"https://codeload.github.com/sodadata/soda-cli/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sodadata%2Fsoda-cli/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":31949472,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-17T17:29:20.459Z","status":"ssl_error","status_checked_at":"2026-04-17T17:28:47.801Z","response_time":62,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.6:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2026-04-17T23:12:27.824Z","updated_at":"2026-04-17T23:12:28.509Z","avatar_url":"https://github.com/sodadata.png","language":"Go","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Soda CLI\n\nA single command-line tool for [Soda](https://www.soda.io) data quality. Manage datasources, datasets, contracts, monitors, incidents, and permissions from your terminal or pipeline.\n\n\u003cp align=\"center\"\u003e\n  \u003cimg src=\"assets/hero.png\" alt=\"Soda CLI\" width=\"580\"\u003e\n\u003c/p\u003e\n\nPreviously this was split between `soda-core` (local execution) and the Soda Cloud web UI (cloud management). Soda CLI unifies both into one `sodacli \u003cresource\u003e \u003caction\u003e` interface.\n\n\u003e **AI-agent friendly.** Every command supports `--no-interactive`, `--output json`, and structured exit codes, so it works well with LLMs, orchestrators, and CI/CD. This project includes a [SKILL](skills/soda-cli) so Claude/Codex or any other agent can run Soda commands, interpret results, and manage data quality through natural conversation.\n\n## Current Status\n\n**Version:** `v0.2.0` (active development)\n\nThe CLI is functional for core workflows. Here's where things stand:\n\n| Area | Status |\n|---|---|\n| Auth (login, logout, status, profiles) | Working |\n| Datasource (list, get, create, update, delete, onboard, test-connection, diagnostics) | Working |\n| Dataset (list, get, update, delete, profiling, diagnostics, permissions, onboard) | Working |\n| Contract (list, push, pull, diff, create, lint, verify via cloud or local) | Working |\n| Monitor (list, config, add column/custom, update, delete) | Working |\n| Results (list with filtering, sorting, date ranges) | Working |\n| Runner (list, get, create, delete) | Working |\n| IAM (user list, user invite, group CRUD, role list) | Working |\n| Job (status, logs) | Working |\n| Secrets (list, get, create, update, delete — client-side encrypted) | Working |\n| Contract verify (local via soda-core) | Working |\n| Incidents (list, get, update) | Wired, waiting on API deploy |\n| Dataset attributes | Wired, waiting on API deploy |\n| Notifications | Planned |\n| Dashboard | Planned |\n\nPer-command status is tracked in [`command_tree.txt`](command_tree.txt):\n\n```\n✅  implemented with real API call\n🔌  CLI wired, waiting on API endpoint\n🏠  local operation, no API needed\n❌  no public API endpoint yet\n```\n\n## Install\n\n### Homebrew (macOS/Linux)\n\n```bash\nbrew tap sodadata/tap\nbrew install sodacli\n```\n\n### Install script (macOS/Linux)\n\n```bash\ncurl -sSL https://raw.githubusercontent.com/sodadata/soda-cli/main/install.sh | sh\n```\n\n### Windows\n\nDownload the latest `.zip` for your architecture from [GitHub Releases](https://github.com/sodadata/soda-cli/releases), extract `sodacli.exe`, and add it to your PATH.\n\n### Download binary (any platform)\n\nGrab the archive for your OS/arch from [GitHub Releases](https://github.com/sodadata/soda-cli/releases), extract, and add to your PATH. Available for Linux, macOS, and Windows (amd64 + arm64).\n\n### From source (Go 1.22+)\n\n```bash\ngit clone https://github.com/sodadata/soda-cli.git\ncd soda-cli/go\ngo build -o sodacli .\nsudo mv sodacli /usr/local/bin/   # macOS/Linux\n```\n\n### Verify\n\n```bash\nsodacli version\nsodacli --help\n```\n\n## Quickstart\n\n### 1. Authenticate\n\n```bash\n# Interactive: prompts for host, API key ID, and secret\nsodacli auth login\n\n# Check that it worked\nsodacli auth status\n```\n\nGenerate API keys at [docs.soda.io/reference/generate-api-keys](https://docs.soda.io/reference/generate-api-keys).\n\n### 2. Onboard a datasource\n\n```bash\n# Full onboard: create datasource, discover datasets, enable monitoring + profiling + contracts + verify\nsodacli datasource onboard warehouse.yml --monitoring --profiling --contracts copilot\n```\n\nOr step by step:\n\n```bash\nsodacli datasource create warehouse.yml           # register datasource, returns ID\nsodacli dataset list --datasource my_warehouse    # see discovered datasets\nsodacli datasource onboard \u003cdatasource-id\u003e --monitoring --profiling --contracts skeleton\n```\n\n### 3. Verify a contract\n\n```bash\n# Run checks via Soda Cloud Runner (local file)\nsodacli contract verify orders.yml\n\n# Run checks via Soda Cloud Runner using dataset DQN — no local file needed\nsodacli contract verify datasource/db/schema/table\n\n# Or run locally via soda-core (no cloud needed)\nsodacli contract verify orders.yml --local --datasource datasource.yml\n\n# Check results\nsodacli results list --status failing\nsodacli job logs \u003cscan-id\u003e\n```\n\n## Essential Commands\n\n### Authentication\n\n```bash\nsodacli auth login                  # interactive setup\nsodacli auth login --host cloud.us.soda.io --api-key-id \u003cid\u003e --api-key-secret \u003csecret\u003e\nsodacli auth status                 # check connection health\nsodacli auth switch \u003cprofile\u003e       # switch between profiles (planned)\n```\n\n### Datasources\n\n```bash\nsodacli datasource list\nsodacli datasource get \u003cid\u003e\nsodacli datasource create config.yml                          # register from YAML config\nsodacli datasource onboard config.yml --monitoring --profiling --contracts skeleton  # full setup\nsodacli datasource update \u003cid\u003e --label \"Production DW\"        # change label, runner, or connection\nsodacli datasource test-connection config.yml                  # async connection test via Runner\nsodacli datasource diagnostics \u003cid\u003e                            # view diagnostics warehouse config\nsodacli datasource diagnostics \u003cid\u003e --enable --warehouse same --collect-results --collect-failed-rows\nsodacli datasource diagnostics \u003cid\u003e --max-failed-rows 5000 --expose-failed-rows-query\nsodacli datasource delete \u003cid\u003e\n```\n\n### Datasets\n\n```bash\nsodacli dataset list --datasource \u003cname\u003e --status onboarded --limit 50\nsodacli dataset get \u003cid\u003e\nsodacli dataset update \u003cid\u003e --tag production --tag critical\nsodacli dataset attributes \u003cid\u003e                                # list dataset attributes\nsodacli dataset profiling \u003cid\u003e --enable --schedule \"0 6 * * *\"\nsodacli dataset time-partition \u003cid\u003e --column created_at\nsodacli dataset diagnostics \u003cid\u003e --collect-results --collect-failed-rows\nsodacli dataset permissions list \u003cid\u003e\nsodacli dataset permissions assign \u003cid\u003e --role \u003crole-id\u003e --user \u003cuser-email\u003e\n```\n\n### Contracts\n\n```bash\nsodacli contract list\nsodacli contract create --dataset ds/db/schema/table --mode skeleton     # generate from schema\nsodacli contract create --dataset ds/db/schema/table --mode copilot      # AI-generated checks\nsodacli contract pull ds/db/schema/table                                 # download from cloud\nsodacli contract push my_table.yml                                       # upload to cloud\nsodacli contract diff my_table.yml                                       # local vs cloud diff\nsodacli contract lint my_table.yml                                       # validate syntax (offline)\nsodacli contract lint contracts/*.yml                                    # lint multiple files\nsodacli contract verify my_table.yml                                     # run checks via cloud Runner (local file)\nsodacli contract verify datasource/db/schema/table                       # run checks via cloud Runner (DQN, no local file)\nsodacli contract verify my_table.yml --no-wait                           # fire and forget\nsodacli contract verify my_table.yml --local --datasource config.yml     # run locally via soda-core\nsodacli contract verify my_table.yml --local --datasource config.yml --push  # run locally + push results to cloud\n```\n\n### Monitors\n\n```bash\nsodacli monitor list --dataset \u003cid\u003e\nsodacli monitor config \u003cdataset-id\u003e --enable --schedule \"0 */6 * * *\" --timezone \"UTC\"\nsodacli monitor add --dataset \u003cid\u003e --type column --column revenue --metric avg\nsodacli monitor add --dataset \u003cid\u003e --type column --column order_id --metric count --group-by region\nsodacli monitor add --dataset \u003cid\u003e --type custom --name \"dup check\" \\\n  --sql \"SELECT count(*) as c FROM t\" --result-metric c\nsodacli monitor update \u003cmonitor-id\u003e --dataset \u003cid\u003e --disable\nsodacli monitor delete \u003cmonitor-id\u003e --dataset \u003cid\u003e\n```\n\n### Secrets\n\n```bash\nsodacli secret list\nsodacli secret get \u003cid\u003e\nsodacli secret create --name DB_PASSWORD                       # masked interactive prompt\nsodacli secret create --name DB_PASSWORD --value \"s3cret\"      # via flag (visible in shell history)\necho \"s3cret\" | sodacli secret create --name DB_PASSWORD       # via stdin pipe\nsodacli secret update \u003cid\u003e                                     # masked prompt for new value\nsodacli secret delete \u003cid\u003e\n# Values are encrypted client-side (AES-256-GCM + RSA-OAEP) — Soda never sees plaintext.\n# Reference in datasource configs: ${secret.DB_PASSWORD}\n```\n\n### Results \u0026 Jobs\n\n```bash\nsodacli results list\nsodacli results list --dataset-name \"orders\" --status failing --from 2026-03-01 --limit 20\nsodacli job status \u003cscan-id\u003e\nsodacli job logs \u003cscan-id\u003e\n```\n\n### IAM\n\n```bash\nsodacli iam user list\nsodacli iam user invite --email alice@co.com --email bob@co.com   # invite up to 10 users\nsodacli iam group create --name \"Data Engineers\" --member alice@co.com --member bob@co.com\nsodacli iam group update \u003cid\u003e --add-member carol@co.com\nsodacli iam role list --scope dataset\n```\n\n### Runners\n\n```bash\nsodacli runner list\nsodacli runner get \u003cid\u003e\nsodacli runner create --name \"prod-runner\"    # returns credentials (shown once)\nsodacli runner delete \u003cid\u003e\n```\n\n## CI/CD Integration\n\nEvery command works non-interactively:\n\n```bash\n# Authenticate\nsodacli auth login \\\n  --host cloud.soda.io \\\n  --api-key-id \"$SODA_API_KEY_ID\" \\\n  --api-key-secret \"$SODA_API_KEY_SECRET\" \\\n  --no-interactive\n\n# Run contract checks (via cloud Runner)\nsodacli contract verify contracts/orders.yml --no-interactive --output json\n\n# Or run locally (no cloud auth needed, just soda-core on PATH)\nsodacli contract verify contracts/orders.yml --local --datasource datasource.yml\n\n# Exit codes\n# 0 = all checks passed\n# 1 = one or more checks failed  →  fail the pipeline\n# 2 = execution error             →  retry or alert\n# 3 = authentication error        →  check credentials\n```\n\n### GitHub Actions example\n\n```yaml\n- name: Verify data contracts\n  run: |\n    sodacli auth login --host cloud.soda.io \\\n      --api-key-id ${{ secrets.SODA_KEY_ID }} \\\n      --api-key-secret ${{ secrets.SODA_KEY_SECRET }} \\\n      --no-interactive\n    sodacli contract verify contracts/orders.yml --no-interactive\n```\n\n## Output Formats\n\nThe CLI picks the right format automatically:\n\n- **TTY** (interactive terminal): human-readable tables with color\n- **Piped** (`sodacli dataset list | jq .`): JSON\n- **Override**: `--output json|table|csv` on any command\n\n```bash\nsodacli dataset list                    # colored table\nsodacli dataset list --output json      # JSON\nsodacli dataset list --output csv       # CSV\nsodacli dataset list | jq '.[] | .id'  # auto-JSON when piped\n```\n\n## Global Flags\n\nThese work on every command:\n\n| Flag | Description |\n|---|---|\n| `--output table\\|json\\|csv` | Output format (auto-detects TTY) |\n| `--profile \u003cname\u003e` | Override active auth profile |\n| `--no-color` | Disable color output |\n| `--quiet` | Suppress non-essential output |\n| `--verbose` | Show detailed output |\n| `--no-interactive` | Never prompt, fail with clear error if input is missing |\n\n## Telemetry\n\nSoda CLI collects anonymous usage data (command names, exit codes, duration, OS/arch) to help us understand which features are used and improve the tool. No personal information, API keys, file contents, or query data is ever collected.\n\nTo opt out:\n\n```bash\nexport SODACLI_TELEMETRY=false\n```\n\n## What's Missing \u0026 Roadmap\n\n### Waiting on Soda Cloud API\n\nThe CLI code is written for these. They'll work as soon as the API endpoints ship:\n\n- **Incidents** (list, get, update) — documented in OpenAPI spec but still returns HTML\n- **Notifications** (rules and integrations CRUD)\n- **Job list** (scan history)\n- **Job cancel** (cancel running scans)\n\n### Planned Features\n\n- **Dashboard.** Org-level overview of datasets, results, and incidents.\n- **Contract proposals.** PR-style review flow for contract changes.\n\n### Vision\n\nThe goal is one CLI that covers the full data quality lifecycle:\n\n1. **Connect.** `sodacli datasource onboard` sets up a database connection with monitoring, profiling, contracts, and verification in one command.\n2. **Define.** `sodacli contract create --mode copilot` uses AI to generate meaningful checks from your schema and data profile.\n3. **Import.** `sodacli contract translate` translates existing data quality definitions from other formats (ODCS, dbt tests, Great Expectations, SodaCL v3) into Soda contracts.\n4. **Verify.** `sodacli contract verify` runs checks locally or in the cloud, from CI/CD or your terminal.\n5. **Monitor.** `sodacli monitor` adds ML anomaly detection that fires alerts when metrics drift.\n6. **Respond.** `sodacli incident` and `sodacli notification` close the loop from detection to resolution.\n7. **Govern.** `sodacli iam` and `sodacli dataset permissions` control who can do what.\n\nAll of this works the same way for humans typing commands and for AI agents calling them programmatically. Same interface, same exit codes, same JSON output.\n\n## Soda CLI vs soda-core\n\n| | Soda CLI (`sodacli`) | soda-core (`soda`) |\n|---|---|---|\n| **Language** | Go (single binary, no dependencies) | Python (requires pip + DB connectors) |\n| **Execution** | Cloud via Soda Runner, or local via `--local` | Local only |\n| **Scope** | Full platform: datasources, datasets, contracts, monitors, results, IAM, incidents | Contract verification and data source testing |\n| **Contract generation** | `contract create --mode copilot` (AI) or `skeleton` | Manual authoring only |\n| **CI/CD** | `--no-interactive`, `--output json`, structured exit codes | Basic exit codes |\n\n**Why use Soda CLI?** If you only need to run checks locally, soda-core is enough. If you want to manage your entire data quality lifecycle from one tool — generate contracts with AI, monitor anomalies, track results, control permissions, and integrate with CI/CD — use sodacli. It shells out to soda-core for local execution when needed (`--local`), so you get both.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsodadata%2Fsoda-cli","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsodadata%2Fsoda-cli","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsodadata%2Fsoda-cli/lists"}