{"id":50454110,"url":"https://github.com/mizcausevic-dev/data-contract-registry","last_synced_at":"2026-06-01T01:05:39.364Z","repository":{"id":358081228,"uuid":"1239280847","full_name":"mizcausevic-dev/data-contract-registry","owner":"mizcausevic-dev","description":"Schema registry for data contracts. Semver versioning, compatibility checks (backward/forward/full), ownership, freshness SLAs. Bridges to procurement-decision-api.","archived":false,"fork":false,"pushed_at":"2026-05-15T17:14:56.000Z","size":28,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-05-15T18:19:40.618Z","etag":null,"topics":["data-contract","data-governance","data-quality","fastapi","kinetic-gain","pydantic","python","schema-registry"],"latest_commit_sha":null,"homepage":"https://kineticgain.com/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/mizcausevic-dev.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-05-15T00:09:25.000Z","updated_at":"2026-05-15T16:13:26.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/mizcausevic-dev/data-contract-registry","commit_stats":null,"previous_names":["mizcausevic-dev/data-contract-registry"],"tags_count":2,"template":false,"template_full_name":null,"purl":"pkg:github/mizcausevic-dev/data-contract-registry","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mizcausevic-dev%2Fdata-contract-registry","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mizcausevic-dev%2Fdata-contract-registry/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mizcausevic-dev%2Fdata-contract-registry/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mizcausevic-dev%2Fdata-contract-registry/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/mizcausevic-dev","download_url":"https://codeload.github.com/mizcausevic-dev/data-contract-registry/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mizcausevic-dev%2Fdata-contract-registry/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":33755379,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-05-31T02:00:06.040Z","response_time":95,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["data-contract","data-governance","data-quality","fastapi","kinetic-gain","pydantic","python","schema-registry"],"created_at":"2026-06-01T01:05:39.290Z","updated_at":"2026-06-01T01:05:39.351Z","avatar_url":"https://github.com/mizcausevic-dev.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# data-contract-registry\n\n[![CI](https://github.com/mizcausevic-dev/data-contract-registry/actions/workflows/ci.yml/badge.svg)](https://github.com/mizcausevic-dev/data-contract-registry/actions/workflows/ci.yml)\n[![Python](https://img.shields.io/badge/python-3.11%20%7C%203.12%20%7C%203.13-blue)](https://www.python.org/)\n[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](LICENSE)\n\n**Schema registry for data contracts.** Semver versioning, compatibility checks (backward / forward / full), declared owners, freshness SLAs. The \"you can't promote a new dataset version without an approved contract\" pattern, lifted from API governance and aimed at data pipelines.\n\nThe headline endpoint is `POST /contracts` — register a new version, get back a deterministic compatibility report or a 422 with every breaking change called out by field name and kind.\n\n---\n\n## Why\n\nThe thing that gets data teams paged at 2am isn't a missing test. It's a producer who quietly removed `ltv` because \"we never use it anymore\" while three downstream dashboards still join on it. Schema registries (Confluent, Buf, etc.) solved this for streaming and gRPC; data pipelines need the same hardness in a shape that fits the things data teams actually argue about:\n\n- **owners** — who do I page when this dataset goes stale\n- **freshness SLA** — when does \"stale\" become \"broken\"\n- **primary key** — changing it is a `MAJOR`, not a `MINOR`\n- **enum drift** — adding a value is fine; removing one is a backward-compatibility break\n- **deprecation policy** — flag a version with the URI of the migration plan; don't delete it\n\nThis package is the smallest thing that does all of those.\n\n---\n\n## Install\n\n```bash\npip install data-contract-registry\n# with the FastAPI surface:\npip install \"data-contract-registry[api]\"\n```\n\nPython 3.11+. Runtime deps: `pydantic` + `PyYAML`.\n\n---\n\n## Library quickstart\n\n```python\nfrom data_contract_registry import (\n    ContractRegistry,\n    DataContract,\n    DataField,\n    Owner,\n)\n\nregistry = ContractRegistry()\n\nv1 = DataContract(\n    dataset_id=\"users.daily_active\",\n    version=\"1.0.0\",\n    primary_key=[\"user_id\", \"active_date\"],\n    owners=[Owner(team=\"growth-platform\", contact=\"#growth-platform\")],\n    fields=[\n        DataField(name=\"user_id\",     type=\"string\"),\n        DataField(name=\"active_date\", type=\"timestamp\"),\n        DataField(name=\"plan\",        type=\"string\", enum=[\"free\", \"pro\", \"enterprise\"]),\n        DataField(name=\"ltv\",         type=\"number\", required=False),\n    ],\n    status=\"active\",\n)\nregistry.register(v1)\n\n# Compatible promotion (added an optional field).\nv1_1 = v1.model_copy(update={\n    \"version\": \"1.1.0\",\n    \"fields\": [*v1.fields, DataField(name=\"signup_source\", type=\"string\", required=False)],\n})\nreport = registry.register(v1_1)\nprint(report.compatible)   # True\n\n# Incompatible promotion — removing a field breaks backward compatibility.\nv2 = v1.model_copy(update={\"version\": \"2.0.0\", \"fields\": [f for f in v1.fields if f.name != \"ltv\"]})\nreport = registry.register(v2)\nprint(report.compatible)               # False\nprint(report.errors[0].kind)           # \"field_removed\"\nprint(report.errors[0].message)        # \"field 'ltv' was removed; old data will fail validation\"\n```\n\n---\n\n## Compatibility modes\n\n| Mode       | Meaning |\n| ---------- | --- |\n| `backward` | New schema can read data produced by the previous schema. **Default.** Consumers upgrade first. |\n| `forward`  | Previous schema can read data produced by the new schema. Producers upgrade first. |\n| `full`     | Both. |\n| `none`     | Anything goes. First-time onboarding only. |\n\nThe checks the engine knows how to flag (each carries a structured `kind` so you can build CI gates around specific failures):\n\n| Kind                       | Severity | Mode |\n| -------------------------- | -------- | --- |\n| `field_removed`            | error    | backward |\n| `field_type_changed`       | error    | backward |\n| `field_required_added`     | error    | backward (optional→required) **or** forward (new required field) |\n| `field_enum_shrunk`        | error    | backward |\n| `primary_key_changed`      | error    | always |\n| `version_not_increasing`   | error    | always |\n| `owner_missing`            | error    | always |\n\n---\n\n## FastAPI surface\n\n```bash\npip install \"data-contract-registry[api]\"\nuvicorn data_contract_registry.app:app --port 8090\n```\n\n| Method | Path | What it does |\n| --- | --- | --- |\n| GET | `/` | Service info. |\n| GET | `/healthz` | Liveness probe. |\n| GET | `/datasets` | List registered dataset IDs. |\n| POST | `/contracts` | Register / promote a contract. 422 with a structured issue list when incompatible. |\n| POST | `/contracts/check` | Dry-run compatibility check — does **not** register. |\n| GET | `/contracts/{ds}/latest` | Latest **active** contract for a dataset. |\n| GET | `/contracts/{ds}/versions` | Full version history. |\n| GET | `/contracts/{ds}/versions/{v}` | One specific version. |\n| POST | `/contracts/{ds}/versions/{v}/deprecate` | Mark deprecated with a migration URI. |\n| POST | `/contracts/{ds}/versions/{v}/archive` | Archive a version (history preserved). |\n| POST | `/contracts/owners/from-decision-card` | **Cross-ecosystem hook** — pull owners out of a Decision Card. |\n\nBundles are held in-memory by default. For restart-safe storage, swap `_BundleStore`'s implementation; the protocol is small.\n\n---\n\n## The cross-ecosystem hook\n\nThe third hook in the portfolio (after `procurement-decision-api` → `policy-as-code-engine` and the Suite → Decision Intelligence bridge). When a buyer approves a vendor whose data product the team will consume, the Decision Card's `buyer.name` + `decision_maker` are **the right answer** to \"who owns the contract on our side\":\n\n```bash\ncurl -X POST http://localhost:8090/contracts/owners/from-decision-card \\\n  -H 'Content-Type: application/json' \\\n  -d @decision-card.json\n# -\u003e [\n#   {\"team\": \"Springfield USD\",                            \"contact\": \"#data-platform\"},\n#   {\"team\": \"Director of Data (Alex Chen)\",               \"contact\": null}\n# ]\n```\n\nDrop that list straight into `DataContract.owners` and the registration carries paging info the team didn't have to re-type.\n\n---\n\n## YAML authoring\n\n```yaml\n# contracts/users-daily-active.yaml\ndataset_id: users.daily_active\nversion: \"1.0.0\"\nowners:\n  - team: growth-platform\n    contact: \"#growth-platform\"\nfreshness_sla:\n  max_lag_seconds: 86400\nfields:\n  - {name: user_id,      type: string}\n  - {name: active_date,  type: timestamp}\n  - {name: plan,         type: string, enum: [free, pro, enterprise]}\n```\n\nHand-author in YAML, validate in CI, register from Python:\n\n```python\nimport yaml\nfrom pathlib import Path\nfrom data_contract_registry import ContractRegistry, DataContract\n\nraw = yaml.safe_load(Path(\"contracts/users-daily-active.yaml\").read_text())\nContractRegistry().register(DataContract.model_validate(raw))\n```\n\n---\n\n## Tests\n\n```bash\npip install -e \".[dev]\"\nruff check src tests \u0026\u0026 ruff format --check src tests\nmypy src\npytest -v\n```\n\nCI matrix runs Python 3.11 / 3.12 / 3.13.\n\n---\n\n## Related in this ecosystem\n\n- **[procurement-decision-api](https://github.com/mizcausevic-dev/procurement-decision-api)** — drafts the Decision Cards this registry pulls owners from.\n- **[policy-as-code-engine](https://github.com/mizcausevic-dev/policy-as-code-engine)** — pair with this registry to enforce contracts at request time.\n- **[slo-budget-tracker](https://github.com/mizcausevic-dev/slo-budget-tracker)** — wire your freshness SLA into the same monitoring story.\n- More at [kineticgain.com](https://kineticgain.com/).\n\n---\n\n## License\n\nMIT. See [LICENSE](LICENSE).\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmizcausevic-dev%2Fdata-contract-registry","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmizcausevic-dev%2Fdata-contract-registry","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmizcausevic-dev%2Fdata-contract-registry/lists"}