{"id":34810792,"url":"https://github.com/vertti/daffy","last_synced_at":"2026-02-21T07:20:53.966Z","repository":{"id":39893012,"uuid":"334617862","full_name":"vertti/daffy","owner":"vertti","description":"Lightweight DataFrame validation decorators for Pandas, Polars, Modin, and PyArrow. No custom types required.","archived":false,"fork":false,"pushed_at":"2026-02-19T13:00:55.000Z","size":734,"stargazers_count":53,"open_issues_count":6,"forks_count":5,"subscribers_count":9,"default_branch":"main","last_synced_at":"2026-02-19T17:06:57.727Z","etag":null,"topics":["data-quality","data-validation","dataframe","dataframe-schema","dataframe-validation","decorator","modin","narwhals","pandas","polars","pyarrow","pydantic","python","python-decorator","runtime-validation","validation"],"latest_commit_sha":null,"homepage":"https://daffy.readthedocs.io/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/vertti.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":"docs/contributing.md","funding":".github/FUNDING.yml","license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":"ROADMAP.md","authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null},"funding":{"thanks_dev":"u/gh/vertti","buy_me_a_coffee":"vertti"}},"created_at":"2021-01-31T09:27:29.000Z","updated_at":"2026-02-18T14:13:10.000Z","dependencies_parsed_at":"2026-02-19T15:02:24.768Z","dependency_job_id":null,"html_url":"https://github.com/vertti/daffy","commit_stats":{"total_commits":76,"total_committers":5,"mean_commits":15.2,"dds":0.1842105263157895,"last_synced_commit":"f8bb8e1c52d0f3338c5f3b3634b462293871497f"},"prev
ious_names":["thoughtworksinc/daffy","fourkind/daffy","vertti/daffy"],"tags_count":30,"template":false,"template_full_name":null,"purl":"pkg:github/vertti/daffy","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/vertti%2Fdaffy","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/vertti%2Fdaffy/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/vertti%2Fdaffy/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/vertti%2Fdaffy/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/vertti","download_url":"https://codeload.github.com/vertti/daffy/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/vertti%2Fdaffy/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":29676176,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-02-21T06:23:40.028Z","status":"ssl_error","status_checked_at":"2026-02-21T06:23:39.222Z","response_time":107,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.6:443 state=error: unexpected eof while 
reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["data-quality","data-validation","dataframe","dataframe-schema","dataframe-validation","decorator","modin","narwhals","pandas","polars","pyarrow","pydantic","python","python-decorator","runtime-validation","validation"],"created_at":"2025-12-25T12:48:50.638Z","updated_at":"2026-02-21T07:20:53.960Z","avatar_url":"https://github.com/vertti.png","language":"Python","readme":"# Daffy — Validate pandas \u0026 Polars DataFrames with Python Decorators\n\n[![PyPI](https://img.shields.io/pypi/v/daffy)](https://pypi.org/project/daffy/)\n[![conda-forge](https://img.shields.io/conda/vn/conda-forge/daffy.svg)](https://anaconda.org/conda-forge/daffy)\n[![Python](https://img.shields.io/pypi/pyversions/daffy)](https://pypi.org/project/daffy/)\n[![Docs](https://readthedocs.org/projects/daffy/badge/?version=latest)](https://daffy.readthedocs.io)\n[![CI](https://github.com/vertti/daffy/actions/workflows/main.yml/badge.svg)](https://github.com/vertti/daffy/actions)\n[![codecov](https://codecov.io/gh/vertti/daffy/graph/badge.svg?token=CISZTQ2XMS)](https://codecov.io/gh/vertti/daffy)\n\n**Validate your pandas and Polars DataFrames at runtime with simple Python decorators.** Daffy catches missing columns, wrong data types, and invalid values before they cause downstream errors in your data pipeline.\n\nAlso supports Modin and PyArrow DataFrames.\n\n- ✅ **Column \u0026 dtype validation** — lightweight, minimal overhead\n- ✅ **Value constraints** — nullability, uniqueness, range checks\n- ✅ **Row validation with 
Pydantic** — when you need deeper checks\n- ✅ **Works with pandas, Polars, Modin, PyArrow** — no lock-in\n\n---\n\n## Installation\n\n```bash\npip install daffy\n```\n\nor with conda:\n\n```bash\nconda install -c conda-forge daffy\n```\n\nWorks with whatever DataFrame library you already have installed. Python 3.10–3.14.\n\n---\n\n## Quickstart\n\n```python\nfrom daffy import df_in, df_out\n\n@df_in([\"price\", \"bedrooms\", \"location\"])\n@df_out([\"price_per_room\", \"price_category\"])\ndef analyze_housing(houses_df):\n    # Transform raw housing data into price analysis\n    return analyzed_df\n```\n\nIf a column is missing, has wrong dtype, or violates a constraint — **Daffy fails fast** with a clear error message at the function boundary.\n\n---\n\n## Why Daffy?\n\nMost DataFrame validation tools are schema-first (define schemas separately) or pipeline-wide (run suites over datasets). **Daffy is decorator-first:** validate inputs and outputs where transformations happen.\n\n|                          |                                                                                  |\n| ------------------------ | -------------------------------------------------------------------------------- |\n| **Non-intrusive**        | Just add decorators — no refactoring, no custom DataFrame types, no schema files |\n| **Easy to adopt**        | Add in 30 seconds, remove just as fast if needed                                 |\n| **In-process**           | No external stores, orchestrators, or infrastructure                             |\n| **Pay for what you use** | Column validation is essentially free; opt into row validation when needed       |\n\n---\n\n## Examples\n\n### Column validation\n\n```python\nfrom daffy import df_in, df_out\n\n@df_in([\"Brand\", \"Price\"])\n@df_out([\"Brand\", \"Price\", \"Discount\"])\ndef apply_discount(df):\n    df = df.copy()\n    df[\"Discount\"] = df[\"Price\"] * 0.1\n    return df\n```\n\n### Regex column matching\n\nMatch 
dynamic column names with regex patterns:\n\n```python\n@df_in([\"id\", \"r/feature_\\\\d+/\"])\ndef process_features(df):\n    return df\n```\n\n### Value constraints\n\nVectorized checks with zero row iteration overhead:\n\n```python\n@df_in({\n    \"price\": {\"checks\": {\"gt\": 0, \"lt\": 10000}},\n    \"status\": {\"checks\": {\"isin\": [\"active\", \"pending\", \"closed\"]}},\n    \"email\": {\"checks\": {\"str_regex\": r\"^[^@]+@[^@]+\\.[^@]+$\"}},\n})\ndef process_orders(df):\n    return df\n```\n\nAvailable checks: `gt`, `ge`, `lt`, `le`, `between`, `eq`, `ne`, `isin`, `notnull`, `str_regex`\nAlso supported: `notin`, `str_startswith`, `str_endswith`, `str_contains`, `str_length`\n\n### Nullability and uniqueness\n\n```python\n@df_in({\n    \"user_id\": {\"unique\": True, \"nullable\": False},  # user_id must be unique and not null\n    \"email\": {\"nullable\": False},  # email cannot be null\n    \"age\": {\"dtype\": \"int64\"},\n})\ndef clean_users(df):\n    return df\n```\n\n### Row validation with Pydantic\n\nFor complex, cross-field validation:\n\n```bash\npip install 'daffy[pydantic]'\n```\n\n```python\nfrom pydantic import BaseModel, Field\nfrom daffy import df_in\n\nclass Product(BaseModel):\n    name: str\n    price: float = Field(gt=0)\n    stock: int = Field(ge=0)\n\n@df_in(row_validator=Product)\ndef process_inventory(df):\n    return df\n```\n\n---\n\n## Daffy vs Alternatives\n\n| Use Case                     |        Daffy        |      Pandera       | Great Expectations  |\n| ---------------------------- | :-----------------: | :----------------: | :-----------------: |\n| Function boundary guardrails |  ✅ Primary focus   |     ⚠️ Possible     | ❌ Not designed for |\n| Quick column/type checks     |   ✅ Lightweight    | ⚠️ Requires schemas |  ⚠️ Requires setup   |\n| Complex statistical checks   |      ⚠️ Limited      |    ✅ Extensive    |    ✅ Extensive     |\n| Pipeline/warehouse QA        | ❌ Not designed for |   ⚠️ Some support   |  ✅ 
Primary focus   |\n| Multi-backend support        |         ✅          |      ⚠️ Varies      |         ✅          |\n\n---\n\n## Configuration\n\nConfigure Daffy project-wide via `pyproject.toml`:\n\n```toml\n[tool.daffy]\nstrict = true\n```\n\n---\n\n## Documentation\n\nFull documentation available at **[daffy.readthedocs.io](https://daffy.readthedocs.io)**\n\n- [Getting Started](https://daffy.readthedocs.io/getting-started/) — quick introduction\n- [Usage Guide](https://daffy.readthedocs.io/usage/) — comprehensive reference\n- [API Reference](https://daffy.readthedocs.io/api/) — decorator signatures\n- [Changelog](https://github.com/vertti/daffy/blob/main/CHANGELOG.md) — version history\n\n---\n\n## Contributing\n\nIssues and pull requests welcome on [GitHub](https://github.com/vertti/daffy).\n\n## License\n\nMIT\n","funding_links":["https://thanks.dev/u/gh/vertti","https://buymeacoffee.com/vertti"],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fvertti%2Fdaffy","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fvertti%2Fdaffy","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fvertti%2Fdaffy/lists"}