{"id":50111376,"url":"https://github.com/npow/metaflow-contracts","last_synced_at":"2026-05-23T12:32:13.995Z","repository":{"id":343151692,"uuid":"1176510110","full_name":"npow/metaflow-contracts","owner":"npow","description":"Catch bad data between Metaflow steps before it corrupts your pipeline","archived":false,"fork":false,"pushed_at":"2026-03-09T06:17:46.000Z","size":23,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-03-09T09:38:58.391Z","etag":null,"topics":["data-contracts","data-validation","metaflow","pipeline","pydantic","pypi","python","type-checking"],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/npow.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-03-09T04:57:12.000Z","updated_at":"2026-03-09T06:28:19.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/npow/metaflow-contracts","commit_stats":null,"previous_names":["npow/metaflow-contracts"],"tags_count":1,"template":false,"template_full_name":null,"purl":"pkg:github/npow/metaflow-contracts","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/npow%2Fmetaflow-contracts","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/npow%2Fmetaflow-contracts/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/npow%2Fmetaflow-contracts/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/npow%2Fmetaflow-contracts/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/npow","download_url":"https://codeload.github.com/npow/metaflow-contracts/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/npow%2Fmetaflow-contracts/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":33396574,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-23T04:15:53.637Z","status":"ssl_error","status_checked_at":"2026-05-23T04:15:53.242Z","response_time":53,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["data-contracts","data-validation","metaflow","pipeline","pydantic","pypi","python","type-checking"],"created_at":"2026-05-23T12:32:13.205Z","updated_at":"2026-05-23T12:32:13.989Z","avatar_url":"https://github.com/npow.png","language":"Python","funding_links":[],"categories":["Developer Tooling"],"sub_categories":[],"readme":"# metaflow-contracts\n\n[![CI](https://github.com/npow/metaflow-contracts/actions/workflows/ci.yml/badge.svg)](https://github.com/npow/metaflow-contracts/actions/workflows/ci.yml)\n[![PyPI](https://img.shields.io/pypi/v/metaflow-contracts)](https://pypi.org/project/metaflow-contracts/)\n[![License: Apache-2.0](https://img.shields.io/badge/License-Apache--2.0-blue.svg)](LICENSE)\n[![Python 3.10+](https://img.shields.io/badge/python-3.10+-blue.svg)](https://www.python.org/downloads/) [![Docs](https://img.shields.io/badge/docs-mintlify-18a34a?style=flat-square)](https://mintlify.com/npow/metaflow-contracts)\n\nCatch bad data between Metaflow steps before it corrupts your pipeline.\n\n## The problem\n\nMetaflow passes data between steps as untyped artifacts on `self`. A step produces the wrong type — a string where downstream expects a float, a `None` that slipped through — and the failure surfaces two steps later with a confusing traceback pointing nowhere near the cause. There's no built-in way for a step to declare what it promises to produce.\n\n## Quick start\n\n```bash\npip install metaflow-contracts\n```\n\n```python\nfrom metaflow import FlowSpec, step\nfrom metaflow_contracts import contract\n\nclass MyFlow(FlowSpec):\n\n    @step\n    @contract(outputs={\"scores\": list[float]})\n    def start(self):\n        self.scores = load_scores()\n        self.next(self.classify)\n\n    @step\n    @contract(outputs={\"label\": str, \"confidence\": float})\n    def classify(self):\n        self.label, self.confidence = model.predict(self.scores)\n        self.next(self.end)\n\n    @step\n    def end(self):\n        print(self.label, self.confidence)\n```\n\nIf `scores` is the wrong type when `start` finishes, the run fails immediately at that step — not somewhere downstream:\n\n```\nContractViolationError in step 'start' [output]: 'scores' expected list, got str\n```\n\n## Install\n\n```bash\n# Core — plain Python type hints\npip install metaflow-contracts\n\n# With Pydantic support\npip install \"metaflow-contracts[pydantic]\"\n\n# With beartype for generic types (list[int], dict[str, float], Optional, Union…)\npip install \"metaflow-contracts[beartype]\"\n\n# Everything\npip install \"metaflow-contracts[pydantic,beartype]\"\n```\n\n## Usage\n\n### Output contracts (primary pattern)\n\nPut the contract on the step that produces the data. Errors point at the source.\n\n```python\n@step\n@contract(outputs={\"label\": str, \"confidence\": float})\ndef classify(self):\n    self.label = model.predict(self.scores)\n    self.confidence = model.score(self.scores)\n    self.next(self.end)\n```\n\n```\n# Wrong type:   ContractViolationError in step 'classify' [output]: 'confidence' expected float, got str\n# Missing field: ContractViolationError in step 'classify' [output]: 'label' is missing (expected str)\n```\n\n### Pydantic models\n\nUse a Pydantic model when you want field-level validators or already have schemas defined elsewhere.\n\n```python\nfrom pydantic import BaseModel, field_validator\n\nclass ClassifyOutput(BaseModel):\n    label: str\n    confidence: float\n\n    @field_validator(\"confidence\")\n    @classmethod\n    def must_be_probability(cls, v: float) -\u003e float:\n        if not 0.0 \u003c= v \u003c= 1.0:\n            raise ValueError(\"must be between 0 and 1\")\n        return v\n\n@step\n@contract(outputs=ClassifyOutput)\ndef classify(self):\n    self.label = \"cat\"\n    self.confidence = 1.5  # raises: confidence must be between 0 and 1\n    self.next(self.end)\n```\n\n### Input contracts (defensive pattern)\n\nUse `inputs=` when consuming artifacts from steps you don't own — third-party flows or fan-in joins where you can't add an output contract upstream.\n\n```python\n@step\n@contract(inputs={\"raw_data\": list[dict]}, outputs={\"result\": float})\ndef join(self):\n    self.result = aggregate(self.raw_data)\n    self.next(self.end)\n```\n\n## How it works\n\n`@contract` wraps the step. Input contracts run before the body; output contracts run after. On failure, `ContractViolationError` is raised with the step name, phase (`input`/`output`), field, expected type, and actual type.\n\nPlain dict specs use `beartype` for generic checking when available, falling back to `isinstance` for simple types. Pydantic specs delegate to `model_validate`. Both backends are interchangeable — you can mix them freely across steps.\n\n## Development\n\n```bash\ngit clone git@github.com:npow/metaflow-contracts.git\ncd metaflow-contracts\npip install -e \".[dev]\"\npytest                   # 108 tests, 94%+ coverage\nruff check .             # lint\nmypy metaflow_contracts  # type check\n```\n\n## License\n\nApache 2.0 — see [LICENSE](LICENSE).\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fnpow%2Fmetaflow-contracts","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fnpow%2Fmetaflow-contracts","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fnpow%2Fmetaflow-contracts/lists"}