{"id":17874477,"url":"https://github.com/marcosschroh/dataclasses-avroschema","last_synced_at":"2025-05-16T04:03:40.168Z","repository":{"id":35088484,"uuid":"205188270","full_name":"marcosschroh/dataclasses-avroschema","owner":"marcosschroh","description":"Generate avro schemas from python dataclasses, Pydantic models and Faust Records. Code generation from avro schemas. Serialize/Deserialize python instances with avro schemas.","archived":false,"fork":false,"pushed_at":"2025-05-06T10:04:05.000Z","size":8926,"stargazers_count":231,"open_issues_count":20,"forks_count":71,"subscribers_count":6,"default_branch":"master","last_synced_at":"2025-05-06T10:49:49.844Z","etag":null,"topics":["apache-avro","avro","avro-schemas","code-generation","faust-streaming","json","json-schema","model","pydantic","python3","schema","serialization"],"latest_commit_sha":null,"homepage":"https://marcosschroh.github.io/dataclasses-avroschema/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/marcosschroh.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":".github/FUNDING.yml","license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null},"funding":{"github":"marcosschroh"}},"created_at":"2019-08-29T14:58:17.000Z","updated_at":"2025-05-06T10:03:36.000Z","dependencies_parsed_at":"2023-02-11T11:16:16.272Z","dependency_job_id":"95b19e48-8bde-4f79-93d5-94a0f0044a86","html_url":"https://github.com/marcosschroh/dataclasses-avroschema","commit_stats":{"total_commits":291,"total_committers":23,"mean_commits":"12.652173913043478","dds":0.5085910652920962,"last_synced_commit":"4c18f126061404cfe7c1b60d0add8b107
ddd3ffd"},"previous_names":[],"tags_count":199,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/marcosschroh%2Fdataclasses-avroschema","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/marcosschroh%2Fdataclasses-avroschema/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/marcosschroh%2Fdataclasses-avroschema/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/marcosschroh%2Fdataclasses-avroschema/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/marcosschroh","download_url":"https://codeload.github.com/marcosschroh/dataclasses-avroschema/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":254464891,"owners_count":22075570,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["apache-avro","avro","avro-schemas","code-generation","faust-streaming","json","json-schema","model","pydantic","python3","schema","serialization"],"created_at":"2024-10-28T11:09:02.386Z","updated_at":"2025-05-16T04:03:40.143Z","avatar_url":"https://github.com/marcosschroh.png","language":"Python","funding_links":["https://github.com/sponsors/marcosschroh"],"categories":[],"sub_categories":[],"readme":"# Dataclasses Avro Schema\n\nGenerate [avro schemas](https://avro.apache.org/docs/1.8.2/spec.html) from python dataclasses, [Pydantic](https://docs.pydantic.dev/latest/) models and [Faust](https://faust-streaming.github.io/faust/) Records. 
[Code generation](https://marcosschroh.github.io/dataclasses-avroschema/model_generator/) from avro schemas. [Serialize/Deserialize](https://marcosschroh.github.io/dataclasses-avroschema/serialization/) python instances with avro schemas\n\n[![Tests](https://github.com/marcosschroh/dataclasses-avroschema/actions/workflows/tests.yaml/badge.svg)](https://github.com/marcosschroh/dataclasses-avroschema/actions/workflows/tests.yaml)\n[![GitHub license](https://img.shields.io/github/license/marcosschroh/dataclasses-avroschema.svg)](https://github.com/marcosschroh/dataclasses-avroschema/blob/master/LICENSE)\n[![codecov](https://codecov.io/gh/marcosschroh/dataclasses-avroschema/branch/master/graph/badge.svg)](https://codecov.io/gh/marcosschroh/dataclasses-avroschema)\n![python version](https://img.shields.io/badge/python-3.9%2B-yellowgreen)\n\n## Requirements\n\n`python 3.9+`\n\n## Installation\n\nwith `pip` or `poetry`:\n\n`pip install dataclasses-avroschema` or `poetry add dataclasses-avroschema`\n\n### Extras\n\n- [pydantic](https://docs.pydantic.dev/): `pip install 'dataclasses-avroschema[pydantic]'` or `poetry add dataclasses-avroschema --extras \"pydantic\"`\n- [faust-streaming](https://github.com/faust-streaming/faust): `pip install 'dataclasses-avroschema[faust]'` or `poetry add dataclasses-avroschema --extras \"faust\"`\n- [faker](https://github.com/joke2k/faker): `pip install 'dataclasses-avroschema[faker]'` or `poetry add dataclasses-avroschema --extras \"faker\"`\n- [dc-avro](https://marcosschroh.github.io/dc-avro/): `pip install 'dataclasses-avroschema[cli]'` or `poetry add dataclasses-avroschema --with cli`\n\n*Note*: You can install all extra dependencies with `pip install dataclasses-avroschema[faust,pydantic,faker,cli]` or `poetry add dataclasses-avroschema --extras \"pydantic faust faker cli\"`\n\n## Documentation\n\nhttps://marcosschroh.github.io/dataclasses-avroschema/\n\n## Usage\n\n### Generating the avro schema\n\n```python\nfrom dataclasses import 
dataclass\nimport enum\n\nimport typing\n\nfrom dataclasses_avroschema import AvroModel\n\n\nclass FavoriteColor(str, enum.Enum):\n    BLUE = \"BLUE\"\n    YELLOW = \"YELLOW\"\n    GREEN = \"GREEN\"\n\n\n@dataclass\nclass User(AvroModel):\n    \"An User\"\n    name: str\n    age: int\n    pets: typing.List[str]\n    accounts: typing.Dict[str, int]\n    favorite_colors: FavoriteColor\n    country: str = \"Argentina\"\n    address: typing.Optional[str] = None\n\n    class Meta:\n        namespace = \"User.v1\"\n        aliases = [\"user-v1\", \"super user\"]\n\n\nprint(User.avro_schema())\n\n# {\n#    \"type\": \"record\",\n#    \"name\": \"User\",\n#    \"fields\": [\n#        {\"name\": \"name\", \"type\": \"string\"},\n#        {\"name\": \"age\", \"type\": \"long\"},\n#        {\"name\": \"pets\", \"type\": {\"type\": \"array\", \"items\": \"string\", \"name\": \"pet\"}},\n#        {\"name\": \"accounts\", \"type\": {\"type\": \"map\", \"values\": \"long\", \"name\": \"account\"}},\n#        {\"name\": \"favorite_colors\", \"type\": {\"type\": \"enum\", \"name\": \"FavoriteColor\", \"symbols\": [\"BLUE\", \"YELLOW\", \"GREEN\"]}},\n#        {\"name\": \"country\", \"type\": \"string\", \"default\": \"Argentina\"},\n#        {\"name\": \"address\", \"type\": [\"null\", \"string\"], \"default\": null}\n#    ], \n#    \"doc\": \"An User\",\n#    \"namespace\": \"User.v1\", \n#    \"aliases\": [\"user-v1\", \"super user\"]\n# }\n\nassert User.avro_schema_to_python() == {\n    \"type\": \"record\",\n    \"name\": \"User\",\n    \"doc\": \"An User\",\n    \"namespace\": \"User.v1\",\n    \"aliases\": [\"user-v1\", \"super user\"],\n    \"fields\": [\n        {\"name\": \"name\", \"type\": \"string\"},\n        {\"name\": \"age\", \"type\": \"long\"},\n        {\"name\": \"pets\", \"type\": {\"type\": \"array\", \"items\": \"string\", \"name\": \"pet\"}},\n        {\"name\": \"accounts\", \"type\": {\"type\": \"map\", \"values\": \"long\", \"name\": \"account\"}},\n     
   {\"name\": \"favorite_colors\", \"type\": {\"type\": \"enum\", \"name\": \"FavoriteColor\", \"symbols\": [\"BLUE\", \"YELLOW\", \"GREEN\"]}},\n        {\"name\": \"country\", \"type\": \"string\", \"default\": \"Argentina\"},\n        {\"name\": \"address\", \"type\": [\"null\", \"string\"], \"default\": None}\n    ],\n}\n```\n\n### Serialization to avro or avro-json and json payload\n\nSerialization requires an instance of a python class/dataclass\n\n```python\nfrom dataclasses import dataclass\n\nimport typing\n\nfrom dataclasses_avroschema import AvroModel\n\n\n@dataclass\nclass Address(AvroModel):\n    \"An Address\"\n    street: str\n    street_number: int\n\n\n@dataclass\nclass User(AvroModel):\n    \"User with multiple Address\"\n    name: str\n    age: int\n    addresses: typing.List[Address]\n\naddress_data = {\n    \"street\": \"test\",\n    \"street_number\": 10,\n}\n\n# create an Address instance\naddress = Address(**address_data)\n\ndata_user = {\n    \"name\": \"john\",\n    \"age\": 20,\n    \"addresses\": [address],\n}\n\n# create a User instance\nuser = User(**data_user)\n\n# serialization\nassert user.serialize() == b\"\\x08john(\\x02\\x08test\\x14\\x00\"\n\nassert user.serialize(\n    serialization_type=\"avro-json\"\n) == b'{\"name\": \"john\", \"age\": 20, \"addresses\": [{\"street\": \"test\", \"street_number\": 10}]}'\n\n# Get the json from the instance\nassert user.to_json() == '{\"name\": \"john\", \"age\": 20, \"addresses\": [{\"street\": \"test\", \"street_number\": 10}]}'\n\n# Get a python dict\nassert user.to_dict() == {\n    \"name\": \"john\", \n    \"age\": 20, \n    \"addresses\": [\n        {\"street\": \"test\", \"street_number\": 10}\n    ]\n}\n```\n\n### Deserialization\n\nDeserialization can take place on a dataclass instance or on the dataclass itself. 
It can return either the dict representation or a new class instance\n\n```python\nimport typing\nimport dataclasses\n\nfrom dataclasses_avroschema import AvroModel\n\n\n@dataclasses.dataclass\nclass Address(AvroModel):\n    \"An Address\"\n    street: str\n    street_number: int\n\n@dataclasses.dataclass\nclass User(AvroModel):\n    \"User with multiple Address\"\n    name: str\n    age: int\n    addresses: typing.List[Address]\n\navro_binary = b\"\\x08john(\\x02\\x08test\\x14\\x00\"\navro_json_binary = b'{\"name\": \"john\", \"age\": 20, \"addresses\": [{\"street\": \"test\", \"street_number\": 10}]}'\n\n# return a new class instance!!\nassert User.deserialize(avro_binary) == User(\n    name='john', \n    age=20,\n    addresses=[Address(street='test', street_number=10)]\n)\n\n# return a python dict\nassert User.deserialize(avro_binary, create_instance=False) == {\n    \"name\": \"john\",\n    \"age\": 20,\n    \"addresses\": [\n        {\"street\": \"test\", \"street_number\": 10}\n    ]\n}\n\n# return a new class instance!!\nassert User.deserialize(avro_json_binary, serialization_type=\"avro-json\") == User(\n    name='john',\n    age=20,\n    addresses=[Address(street='test', street_number=10)]\n)\n\n# return a python dict\nassert User.deserialize(\n    avro_json_binary,\n    serialization_type=\"avro-json\",\n    create_instance=False\n) == {\"name\": \"john\", \"age\": 20, \"addresses\": [{\"street\": \"test\", \"street_number\": 10}]}\n```\n\n## Pydantic integration\n\nTo add `dataclasses-avroschema` functionality to `pydantic` you only need to replace `BaseModel` with `AvroBaseModel`:\n\n```python\nimport typing\nimport enum\n\nfrom dataclasses_avroschema.pydantic import AvroBaseModel\n\nfrom pydantic import Field, ValidationError\n\n\nclass FavoriteColor(str, enum.Enum):\n    BLUE = \"BLUE\"\n    YELLOW = \"YELLOW\"\n    GREEN = \"GREEN\"\n\n\nclass UserAdvance(AvroBaseModel):\n    name: str\n    age: int\n    pets: typing.List[str] = Field(default_factory=lambda: 
[\"dog\", \"cat\"])\n    accounts: typing.Dict[str, int] = Field(default_factory=lambda: {\"key\": 1})\n    has_car: bool = False\n    favorite_colors: FavoriteColor = FavoriteColor.BLUE\n    country: str = \"Argentina\"\n    address: typing.Optional[str] = None\n\n    class Meta:\n        schema_doc = False\n\n\nassert UserAdvance.avro_schema_to_python() == {\n    \"type\": \"record\",\n    \"name\": \"UserAdvance\",\n    \"fields\": [\n        {\"name\": \"name\", \"type\": \"string\"},\n        {\"name\": \"age\", \"type\": \"long\"},\n        {\"name\": \"pets\", \"type\": {\"type\": \"array\", \"items\": \"string\", \"name\": \"pet\"}, \"default\": [\"dog\", \"cat\"]},\n        {\"name\": \"accounts\", \"type\": {\"type\": \"map\", \"values\": \"long\", \"name\": \"account\"}, \"default\": {\"key\": 1}},\n        {\"name\": \"has_car\", \"type\": \"boolean\", \"default\": False},{\"name\": \"favorite_colors\", \"type\": {\"type\": \"enum\", \"name\": \"FavoriteColor\", \"symbols\": [\"BLUE\", \"YELLOW\", \"GREEN\"]}, \"default\": \"BLUE\"},\n        {\"name\": \"country\", \"type\": \"string\", \"default\": \"Argentina\"}, {\"name\": \"address\", \"type\": [\"null\", \"string\"], \"default\": None}\n    ]\n}\n\nprint(UserAdvance.json_schema())\n\n# {\n#   \"$defs\": {\"FavoriteColor\": {\"enum\": [\"BLUE\", \"YELLOW\", \"GREEN\"], \"title\": \"FavoriteColor\", \"type\": \"string\"}},\n#   \"properties\": {\n#       \"name\": {\"title\": \"Name\", \"type\": \"string\"},\n#       \"age\": {\"title\": \"Age\", \"type\": \"integer\"},\n#       \"pets\": {\"items\": {\"type\": \"string\"}, \"title\": \"Pets\", \"type\": \"array\"},\n#       \"accounts\": {\"additionalProperties\": {\"type\": \"integer\"}, \"title\": \"Accounts\", \"type\": \"object\"},\n#       \"has_car\": {\"default\": false, \"title\": \"Has Car\", \"type\": \"boolean\"},\n#       \"favorite_colors\": {\"allOf\": [{\"$ref\": \"#/$defs/FavoriteColor\"}], \"default\": \"BLUE\"},\n#       
\"country\": {\"default\": \"Argentina\", \"title\": \"Country\", \"type\": \"string\"},\n#       \"address\": {\"anyOf\": [{\"type\": \"string\"}, {\"type\": \"null\"}], \"default\": null, \"title\": \"Address\"}\n#   }, \n#   \"required\": [\"name\", \"age\"],\n#   \"title\": \"UserAdvance\",\n#   \"type\": \"object\"\n# }\n\nuser = UserAdvance(name=\"bond\", age=50)\n\n# pydantic\nassert user.dict() == {\n    'name': 'bond',\n    'age': 50,\n    'pets': ['dog', 'cat'],\n    'accounts': {'key': 1},\n    'has_car': False,\n    'favorite_colors': FavoriteColor.BLUE,\n    'country': 'Argentina',\n    'address': None\n}\n\n# pydantic\nprint(user.json())\n\nassert user.json() == '{\"name\":\"bond\",\"age\":50,\"pets\":[\"dog\",\"cat\"],\"accounts\":{\"key\":1},\"has_car\":false,\"favorite_colors\":\"BLUE\",\"country\":\"Argentina\",\"address\":null}'\n\n# pydantic\ntry:\n    user = UserAdvance(name=\"bond\")\nexcept ValidationError as exc:\n    ...\n\n# dataclasses-avroschema\nevent = user.serialize()\nassert event == b'\\x08bondd\\x04\\x06dog\\x06cat\\x00\\x02\\x06key\\x02\\x00\\x00\\x00\\x12Argentina\\x00'\n\nassert UserAdvance.deserialize(data=event) == UserAdvance(\n    name='bond',\n    age=50, \n    pets=['dog', 'cat'],\n    accounts={'key': 1},\n    has_car=False, \n    favorite_colors=FavoriteColor.BLUE,\n    country='Argentina', \n    address=None\n)\n```\n\n## Examples with python streaming drivers (kafka and redis)\n\nIn the [examples](https://github.com/marcosschroh/dataclasses-avroschema/tree/master/examples) folder you can find three different kafka examples. One uses [aiokafka](https://github.com/aio-libs/aiokafka) (`async`) and shows the simplest use case, in which an `AvroModel` instance is serialized and sent through kafka, and the event is then consumed.\nThe other two examples are `sync` and use the [kafka-python](https://github.com/dpkp/kafka-python) driver to show `avro-json` serialization and `schema evolution` (`FULL` compatibility).\nAlso, 
there are two `redis` examples using `redis streams` with [walrus](https://github.com/coleifer/walrus) and [redisgears-py](https://github.com/RedisGears/redisgears-py)\n\n## Factory and fixtures\n\n[Dataclasses Avro Schema](https://github.com/marcosschroh/dataclasses-avroschema) also includes a `factory` feature, so you can generate `fast` python instances and use them, for example, to test your data streaming pipelines. Instances can be generated using the `fake` method.\n\n*Note*: This feature is not enabled by default and requires you have the `faker` extra installed. You may install it with `pip install 'dataclasses-avroschema[faker]'`\n\n```python\nimport typing\nimport dataclasses\n\nfrom dataclasses_avroschema import AvroModel\n\n\n@dataclasses.dataclass\nclass Address(AvroModel):\n    \"An Address\"\n    street: str\n    street_number: int\n\n\n@dataclasses.dataclass\nclass User(AvroModel):\n    \"User with multiple Address\"\n    name: str\n    age: int\n    addresses: typing.List[Address]\n\n\nAddress.fake()\n# \u003e\u003e\u003e\u003e Address(street='PxZJILDRgbXyhWrrPWxQ', street_number=2067)\n\nUser.fake()\n# \u003e\u003e\u003e\u003e User(name='VGSBbOGfSGjkMDnefHIZ', age=8974, addresses=[Address(street='vNpPYgesiHUwwzGcmMiS', street_number=4790)])\n```\n\n## Features\n\n- [x] Primitive types: int, long, double, float, boolean, string and null support\n- [x] Complex types: enum, array, map, fixed, unions and records support\n- [x] `typing.Annotated` supported\n- [x] `typing.Literal` supported\n- [x] Logical Types: date, time (millis and micro), datetime (millis and micro), uuid support\n- [x] Schema relations (oneToOne, oneToMany)\n- [x] Recursive Schemas\n- [x] Generate Avro Schemas from `faust.Record`\n- [x] Instance serialization correspondent to `avro schema` generated\n- [x] Data deserialization. 
Return python dict or class instance\n- [x] Generate json from python class instance\n- [x] Case Schemas\n- [x] Generate models from `avsc` files\n- [x] Examples of integration with `kafka` drivers: [aiokafka](https://github.com/aio-libs/aiokafka), [kafka-python](https://github.com/dpkp/kafka-python)\n- [x] Examples of integration with `redis` drivers: [walrus](https://github.com/coleifer/walrus) and [redisgears-py](https://github.com/RedisGears/redisgears-py)\n- [x] Factory instances\n- [x] [Pydantic](https://pydantic-docs.helpmanual.io/) integration\n\n## Development\n\n[Poetry](https://python-poetry.org/docs/) is needed to install the dependencies and develop locally\n\n1. Install dependencies: `poetry install --all-extras`\n2. Code linting: `./scripts/format`\n3. Run tests: `./scripts/test`\n4. Test documentation: `./scripts/test-documentation`\n\nFor commit messages we use [commitizen](https://commitizen-tools.github.io/commitizen/) to standardize commit conventions\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmarcosschroh%2Fdataclasses-avroschema","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmarcosschroh%2Fdataclasses-avroschema","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmarcosschroh%2Fdataclasses-avroschema/lists"}