{"id":18418946,"url":"https://github.com/smrfeld/upandup","last_synced_at":"2025-06-22T17:08:42.124Z","repository":{"id":221298064,"uuid":"753407801","full_name":"smrfeld/upandup","owner":"smrfeld","description":"upandup is a simple schema versioning system for Python dataclasses.","archived":false,"fork":false,"pushed_at":"2024-02-16T02:54:55.000Z","size":68,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":2,"default_branch":"main","last_synced_at":"2025-05-14T16:58:16.021Z","etag":null,"topics":["dataclass","python","schema","serialization","versioning"],"latest_commit_sha":null,"homepage":"http://www.oliver-ernst.com/upandup/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/smrfeld.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-02-06T03:53:29.000Z","updated_at":"2024-02-16T02:56:06.000Z","dependencies_parsed_at":"2024-02-14T02:26:16.454Z","dependency_job_id":"299d3ff2-c5c0-48b9-8d6e-a5c143233490","html_url":"https://github.com/smrfeld/upandup","commit_stats":null,"previous_names":["smrfeld/upandup"],"tags_count":1,"template":false,"template_full_name":null,"purl":"pkg:github/smrfeld/upandup","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/smrfeld%2Fupandup","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/smrfeld%2Fupandup/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/smrfeld%2Fupandup/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/smrfeld%2Fupandup/m
anifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/smrfeld","download_url":"https://codeload.github.com/smrfeld/upandup/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/smrfeld%2Fupandup/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":260327110,"owners_count":22992441,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["dataclass","python","schema","serialization","versioning"],"created_at":"2024-11-06T04:15:05.122Z","updated_at":"2025-06-22T17:08:37.102Z","avatar_url":"https://github.com/smrfeld.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Up-and-up :rocket:\n\n`upandup` is a simple schema versioning system for Python dataclasses.\n\nContents:\n* [Why?](#why)\n* [Serialization formats supported](#serialization-formats-supported)\n* [Installation](#installation)\n* [Example](#example)\n* [Advanced usage](#advanced)\n* [Tests](#tests)\n\n## Why?\n\nIn Python, `dataclasses` are a great way to define data schemas. 
However, when the schema changes, you need to be able to update the old data to the latest version, or risk breaking the ability to load old data from JSON, YAML, or other formats.\n\n`upandup` provides a simple way to define how to update between different versions of a schema, and then load the latest version of the schema from old data.\n\nLet's say you have a `dataclass` like this:\n\n```python\n@dataclass\nclass DataSchemaV1:\n    x: int\n```\n\nUsers might end up serializing this to JSON:\n\n```json\n{\n    \"x\": 1\n}\n```\n\nLater, you decide to add a new field `x_str`:\n\n```python\n@dataclass\nclass DataSchema:\n    x: int\n    x_str: str\n```\n\nNow, users can no longer load the old data, because the schema has changed. You need to define how to update the old data to the new schema. `upandup` provides a way to do this.\n\n```python\nimport upandup as upup\n\nupdate = lambda cls_start, cls_end, obj_start: cls_end(x=obj_start.x, x_str=\"the value is: %d\" % obj_start.x)\n\n# Register the update\nupup.register_updates(\"DataSchema\", DataSchemaV1, DataSchema, fn_update=update)\n```\n\nIn the end, `upandup` exposes a `load` function that users can call to load the latest version of the schema from old data **every time**.\n\n```python\ndata = {\"x\": 1}\nobj = upup.load(\"DataSchema\", data)\nprint(obj.x_str) # the value is: 1\n```\n\nThis `load` function can also be exposed as a standalone helper function:\n\n```python\n# In your package:\nload_data_schema = upup.make_load_fn(\"DataSchema\")\n\n# In scripts using your package\ndata = {\"x\": 1}\nobj = load_data_schema(data)\n```\n\n## Serialization formats supported\n\n* Dictionary - define `to_dict` and `from_dict` methods on your dataclasses.\n* JSON - define `to_json` and `from_json` methods on your dataclasses.\n* YAML - define `to_yaml` and `from_yaml` methods on your dataclasses.\n* TOML - define `to_toml` and `from_toml` methods on your dataclasses.\n\n## Installation\n\nInstall via pip:\n\n```bash\npip 
install upandup\n```\n\nAlternatively, you can install from source:\n\n```bash\ngit clone https://github.com/smrfeld/upandup\ncd upandup\npip install -e .\n```\n\n## Example\n\nFirst, define some dataclasses. Let's say you have a `DataSchema` dataclass, which is the latest version, but also two older versions called `DataSchemaV1` and `DataSchemaV2`. These classes have `to_dict` and `from_dict` methods defined via the `mashumaro` package by inheriting from `DataClassDictMixin` (they could also be defined manually).\n\n```python\nfrom dataclasses import dataclass\nfrom mashumaro import DataClassDictMixin\n\n@dataclass\nclass DataSchemaV1(DataClassDictMixin):\n    x: int\n\n@dataclass\nclass DataSchemaV2(DataClassDictMixin):\n    x: int\n    y: int\n    z: int = 0\n\n@dataclass\nclass DataSchema(DataClassDictMixin):\n    x: int\n    name: str\n```\n\nHere, the first version `DataSchemaV1` has only one field `x`, and the second version `DataSchemaV2` has three fields `x`, `y`, and `z`. The field `y` has no default, so the update function has to supply a value for it. The field `z` already has a default in the definition. In the final version, the fields `y` and `z` have been removed again, and a new field `name` has been added.\n\nNow, we can define how to update between the versions. 
We can use the `upandup` package to do this.\n\n```python\nimport upandup as upup\n\n# Define the functions to update between the versions\n# The functions take the start and end classes, and the object to update\n# The functions should return an object of the end class\n# The functions can be lambdas or regular functions\n# For the first update, we need to add a default value for the new field `y` (`z` already has a default).\nupdate_1_to_2 = lambda cls_start, cls_end, obj_start: cls_end(x=obj_start.x, y=0)\n\n# For the second update, we need to exclude the fields `y` and `z`, and add the new field `name` with a default value.\nupdate_2_to_latest = lambda cls_start, cls_end, obj_start: cls_end(x=obj_start.x, name=\"default\")\n\n# Register the updates under the label `DataSchema`\nupup.register_updates(\"DataSchema\", DataSchemaV1, DataSchemaV2, fn_update=update_1_to_2)\nupup.register_updates(\"DataSchema\", DataSchemaV2, DataSchema, fn_update=update_2_to_latest)\n\n# Expose a helper function to load the latest version of the schema\n# This makes a thin wrapper around upup.load\nload_data_schema = upup.make_load_fn(\"DataSchema\")\n```\n\nFinally, we can test the update.\n\n```python\n# Test the update\ndata = {\"x\": 1}\nobj = load_data_schema(data, options=upup.LoadOptions())\n\nprint(\"Result:\")\nprint(f\"Loaded object: {obj} of type {type(obj).__name__}\") # Loaded object: DataSchema(x=1, name='default') of type DataSchema\n```\n\n## Advanced\n\n### Write intermediate versions\n\nBy default, the intermediate versions produced while updating to the latest schema are not written to disk. 
If you want to write them, you can set the `write_versions` option to `True`.\n\n```python\ndata = {\"x\": 1}\noptions = upup.LoadOptions(write_versions=True, write_version_prefix=\"version\", write_versions_dir=\".\")\nobj = upup.load(\"DataSchema\", data, options=options)\n```\n\nThis will write the files:\n```\nversion_DataSchema.json\nversion_DataSchemaV1.json\nversion_DataSchemaV2.json\n```\n\n### Example in a package\n\nWe can organize the same example above to demonstrate how to use it in a package.\n\nCreate the following files:\n\n```\nsetup.py\nmypackage/\n    __init__.py\n    data_latest.py\n    data_v1.py\n    data_v2.py\n    register_updates.py\nrun_example.py\n```\n\nThe data schemas are defined by `data_v1.py`, `data_v2.py`, and `data_latest.py`. The update functions between them are defined in `register_updates.py`.\n\nThe package is installed by the `setup.py` file:\n\n```python\nfrom setuptools import setup, find_packages\n\nsetup(\n    name='mypackage',\n    version='0.1.0',\n    description='An example package',\n    packages=find_packages(),\n    install_requires=[\n        \"loguru\",\n        \"mashumaro\",\n        \"setuptools\",\n        \"upandup\"\n    ],\n    python_requires='\u003e=3.11',\n)\n```\n\nThe contents of `data_v1.py` are:\n\n```python\nfrom mashumaro import DataClassDictMixin\nfrom dataclasses import dataclass\n\n@dataclass\nclass DataSchemaV1(DataClassDictMixin):\n    x: int\n```\n\nThe contents of `data_v2.py` are:\n\n```python\nfrom mashumaro import DataClassDictMixin\nfrom dataclasses import dataclass\n\n@dataclass\nclass DataSchemaV2(DataClassDictMixin):\n    x: int\n    y: int\n    z: int = 0\n```\n\nThe contents of `data_latest.py` are:\n\n```python\nfrom mashumaro import DataClassDictMixin\nfrom dataclasses import dataclass\n\n@dataclass\nclass DataSchema(DataClassDictMixin):\n    x: int\n    name: str\n```\n\nThe `__init__.py` exposes only the latest version of the schema:\n\n```python\nfrom .data_latest import 
DataSchema\nfrom .register_updates import load_data_schema, Options\n```\n\nThe `register_updates.py` contains the update functions:\n\n```python\nimport upandup as upup\nfrom mypackage.data_v1 import DataSchemaV1\nfrom mypackage.data_v2 import DataSchemaV2\nfrom mypackage.data_latest import DataSchema\n\nupdate_1_to_2 = lambda cls_start, cls_end, obj_start: cls_end(x=obj_start.x, y=0)\nupdate_2_to_latest = lambda cls_start, cls_end, obj_start: cls_end(x=obj_start.x, name=\"default\")\n\n# Register the updates\nupup.register_updates(\"DataSchema\", DataSchemaV1, DataSchemaV2, fn_update=update_1_to_2)\nupup.register_updates(\"DataSchema\", DataSchemaV2, DataSchema, fn_update=update_2_to_latest)\n\n# Expose the load function and options in a nicer way\nload_data_schema = upup.make_load_fn(\"DataSchema\")\nOptions = upup.LoadOptions\n```\n\nAs noted in the `__init__.py`, we also expose the `load_data_schema` and `Options` from `register_updates.py`. This lets users easily load the latest version of the schema **every time** from any old version.\n\nFinally, the `run_example.py` contains the test code:\n\n```python\nimport mypackage as mp\n\n# Test the update\ndata = {\"x\": 1}\nobj = mp.load_data_schema(data, mp.Options())\nprint(\"Result:\")\nprint(f\"Loaded object: {obj} of type {type(obj).__name__}\") # Loaded object: DataSchema(x=1, name='default') of type DataSchema\n```\n\nNote that the script never has to import the `upandup` package directly.\n\n## Tests\n\nTests are included in the `tests` directory and built on `pytest` - from the root directory, run:\n\n```bash\npytest\n```","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsmrfeld%2Fupandup","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsmrfeld%2Fupandup","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsmrfeld%2Fupandup/lists"}