{"id":15009793,"url":"https://github.com/facultyai/marshmallow-dataframe","last_synced_at":"2025-04-09T17:51:55.444Z","repository":{"id":57439821,"uuid":"169726340","full_name":"facultyai/marshmallow-dataframe","owner":"facultyai","description":"Marshmallow Schema generator for Pandas DataFrames","archived":false,"fork":false,"pushed_at":"2020-08-11T09:22:03.000Z","size":78,"stargazers_count":24,"open_issues_count":0,"forks_count":3,"subscribers_count":2,"default_branch":"master","last_synced_at":"2024-05-06T17:43:34.767Z","etag":null,"topics":["dataframe","marshmallow","pandas","python","schema","validation"],"latest_commit_sha":null,"homepage":"https://pypi.org/project/marshmallow-dataframe/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/facultyai.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2019-02-08T11:40:01.000Z","updated_at":"2024-03-14T18:51:38.000Z","dependencies_parsed_at":"2022-09-26T17:20:47.449Z","dependency_job_id":null,"html_url":"https://github.com/facultyai/marshmallow-dataframe","commit_stats":null,"previous_names":["zblz/marshmallow-numerical"],"tags_count":8,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/facultyai%2Fmarshmallow-dataframe","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/facultyai%2Fmarshmallow-dataframe/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/facultyai%2Fmarshmallow-dataframe/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/facultyai%2Fmarshmallow-dataframe/manifests","owner_url":"http
s://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/facultyai","download_url":"https://codeload.github.com/facultyai/marshmallow-dataframe/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248083182,"owners_count":21045055,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["dataframe","marshmallow","pandas","python","schema","validation"],"created_at":"2024-09-24T19:28:37.646Z","updated_at":"2025-04-09T17:51:55.420Z","avatar_url":"https://github.com/facultyai.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# marshmallow-dataframe\n\n[![Build Status](https://github.com/facultyai/marshmallow-dataframe/workflows/Tests/badge.svg)](https://github.com/facultyai/marshmallow-dataframe/actions?query=workflow%3ATests)\n[![PyPI](https://img.shields.io/pypi/v/marshmallow-dataframe.svg)](https://pypi.org/project/marshmallow-dataframe/)\n[![License](https://img.shields.io/github/license/facultyai/marshmallow-dataframe.svg)](https://github.com/facultyai/marshmallow-dataframe/blob/master/LICENSE)\n\n`marshmallow-dataframe` is a library that helps you generate\n[marshmallow](https://marshmallow.readthedocs.io/) Schemas for Pandas\nDataFrames.\n\n# Usage\n\nLet's start by creating an example dataframe for which we want to create a\n`Schema`. 
This dataframe has four columns: two of them are of string type, one\nis a float, and the last one is an integer.\n\n```python\nimport pandas as pd\nimport numpy as np\nfrom marshmallow_dataframe import SplitDataFrameSchema\n\nanimal_df = pd.DataFrame(\n    [\n        (\"falcon\", \"bird\", 389.0, 2),\n        (\"parrot\", \"bird\", 24.0, 2),\n        (\"lion\", \"mammal\", 80.5, 4),\n        (\"monkey\", \"mammal\", np.nan, 4),\n    ],\n    columns=[\"name\", \"class\", \"max_speed\", \"num_legs\"],\n)\n```\n\nYou can then create a marshmallow schema that will validate and load dataframes\nthat follow the same structure as the one above and that have been serialized\nwith `DataFrame.to_json` with the [`orient=split`\nformat](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.to_json.html#pandas.DataFrame.to_json).\nThe `dtypes` attribute of the `Meta` class is required, and other [`marshmallow`\nSchema\noptions](https://marshmallow.readthedocs.io/en/latest/api_reference.html#marshmallow.Schema.Meta)\ncan also be passed as attributes of `Meta`:\n\n```python\nclass AnimalSchema(SplitDataFrameSchema):\n    \"\"\"Automatically generated schema for animal dataframe\"\"\"\n\n    class Meta:\n        dtypes = animal_df.dtypes\n```\n\nWhen passing a valid payload for a new animal, this schema will validate it and\nbuild a dataframe:\n\n```python\nanimal_schema = AnimalSchema()\n\nnew_animal = {\n    \"data\": [(\"leopard\", \"mammal\", 58.0, 4), (\"ant\", \"insect\", 0.288, 6)],\n    \"columns\": [\"name\", \"class\", \"max_speed\", \"num_legs\"],\n    \"index\": [0, 1],\n}\n\nnew_animal_df = animal_schema.load(new_animal)\n\nprint(type(new_animal_df))\n# \u003cclass 'pandas.core.frame.DataFrame'\u003e\nprint(new_animal_df)\n#       name   class  max_speed  num_legs\n# 0  leopard  mammal     58.000         4\n# 1      ant  insect      0.288         6\n```\n\nHowever, if we pass a payload that doesn't conform to the schema, it will raise\na 
marshmallow `ValidationError` exception with an informative message about the errors:\n\n```python\ninvalid_animal = {\n    \"data\": [(\"leopard\", \"mammal\", 58.0, \"four\")],  # num_legs is not an int\n    \"columns\": [\"name\", \"class\", \"num_legs\"],  # missing max_speed column\n    \"index\": [0],\n}\n\nanimal_schema.load(invalid_animal)\n\n# Raises:\n# marshmallow.exceptions.ValidationError: {\n#     'columns': [\"Must be equal to ['name', 'class', 'max_speed', 'num_legs'].\"],\n#     'data': {0: {3: ['Not a valid integer.']}}\n# }\n```\n\n`marshmallow_dataframe` can also generate Schemas for the `orient=records`\nformat by following the above steps but using\n`marshmallow_dataframe.RecordsDataFrameSchema` as the superclass for\n`AnimalSchema`.\n\n# Installation\n\nmarshmallow-dataframe requires Python \u003e= 3.6 and marshmallow \u003e= 3.0. You can\ninstall it with pip:\n\n```\npip install marshmallow-dataframe\n```\n\n# Contributing\n\nContributions are welcome!\n\nYou can report a problem or feature request in the [issue\ntracker](https://github.com/facultyai/marshmallow-dataframe/issues). If you feel\nthat you can fix it or implement it, please submit a pull request referencing\nthe issues it solves.\n\nUnit tests written using the [`pytest`](https://pytest.org) framework are in the\n`tests` directory, and are run using\n[tox](https://tox.readthedocs.io/en/latest/) on Python 3.6 and 3.7. 
You can run\nthe tests by installing tox:\n```\npip install tox\n```\nand running the linters and tests for all Python versions by running `tox`, or\nfor a specific Python version by running:\n```\ntox -e py36\n```\n\nWe format the code with [black](https://github.com/python/black), and you can\nformat your checkout of the code before committing it by running:\n```\ntox -e black -- .\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffacultyai%2Fmarshmallow-dataframe","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ffacultyai%2Fmarshmallow-dataframe","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffacultyai%2Fmarshmallow-dataframe/lists"}