{"id":27248840,"url":"https://github.com/bakdata/faust-large-message-serializer","last_synced_at":"2025-04-10T23:48:16.267Z","repository":{"id":37860794,"uuid":"243516111","full_name":"bakdata/faust-large-message-serializer","owner":"bakdata","description":"A Faust Serializer that reads and writes records from and to S3 or Azure Blob Storage transparently.","archived":false,"fork":false,"pushed_at":"2022-06-15T12:01:10.000Z","size":18,"stargazers_count":2,"open_issues_count":0,"forks_count":0,"subscribers_count":9,"default_branch":"master","last_synced_at":"2025-04-10T23:48:12.649Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/bakdata.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2020-02-27T12:38:55.000Z","updated_at":"2023-05-01T09:50:23.000Z","dependencies_parsed_at":"2022-08-18T04:55:39.906Z","dependency_job_id":null,"html_url":"https://github.com/bakdata/faust-large-message-serializer","commit_stats":null,"previous_names":["bakdata/faust-s3-backed-serializer"],"tags_count":3,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bakdata%2Ffaust-large-message-serializer","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bakdata%2Ffaust-large-message-serializer/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bakdata%2Ffaust-large-message-serializer/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bakdata%2Ffaust-large-message-serializer/manifests","owner_url":"https://repos.ecosyst
e.ms/api/v1/hosts/GitHub/owners/bakdata","download_url":"https://codeload.github.com/bakdata/faust-large-message-serializer/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248317727,"owners_count":21083528,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2025-04-10T23:48:15.658Z","updated_at":"2025-04-10T23:48:16.256Z","avatar_url":"https://github.com/bakdata.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"[![GitHub license](https://img.shields.io/github/license/bakdata/faust-s3-backed-serializer)](https://github.com/bakdata/faust-large-message-serializer/blob/master/LICENSE)\n[![Python Version](https://img.shields.io/badge/python-3.6%20%7C%203.7%20%7C%203.8-blue.svg)](https://img.shields.io/badge/python-3.6%20%7C%203.7-blue.svg)\n[![Build Status](https://dev.azure.com/bakdata/public/_apis/build/status/bakdata.faust-large-message-serializer?branchName=master)](https://dev.azure.com/bakdata/public/_build/latest?definitionId=22\u0026branchName=master)\n[![PyPI version](https://badge.fury.io/py/faust-large-message-serializer.svg)](https://badge.fury.io/py/faust-large-message-serializer)\n# faust-large-message-serializer\n\nA Faust Serializer that reads and writes records from and to S3 or Azure Blob Storage transparently.\n\nThis serializer is compatible with our [Kafka large-message-serializer SerDe](https://github.com/bakdata/kafka-large-message-serde) for Java.\n\nRead more about it on our 
[blog](https://medium.com/bakdata/processing-large-messages-with-kafka-streams-167a166ca38b).\n\n# Getting Started\n\n#### PyPI\n\n```\npip install faust-large-message-serializer\n```\n\n\n##### Usage\n\nThe serializer was built to be used together with other serializers. The idea is to use the [\"concatenation\" feature](https://faust.readthedocs.io/en/latest/userguide/models.html#codec-registry) that comes with Faust:\n\n```python\nimport faust\nfrom faust import Record\nimport logging\nfrom faust_large_message_serializer import LargeMessageSerializer, LargeMessageSerializerConfig\nfrom faust.serializers import codecs\n\n\n# Model for the topic values; it uses the \"s3_json\" codec registered below\nclass UserModel(Record, serializer=\"s3_json\"):\n    first_name: str\n    last_name: str\n\n\nconfig = LargeMessageSerializerConfig(base_path=\"s3://your-bucket-name/\",\n                                      max_size=0,\n                                      large_message_s3_region=\"eu-central-1\",\n                                      large_message_s3_access_key=\"access_key\",\n                                      large_message_s3_secret_key=\"secret_key\")\n\ntopic_name = \"users_s3\"\ns3_backed_serializer = LargeMessageSerializer(topic_name, config, is_key=False)\njson_serializer = codecs.get_codec(\"json\")\n\n# Here we use json as the first serializer and\n# then we can upload everything to the S3 bucket\ns3_json_serializer = json_serializer | s3_backed_serializer\n\n# Register the combined codec and set up the app\nlogger = logging.getLogger(__name__)\ncodecs.register(\"s3_json\", s3_json_serializer)\napp = faust.App(\"app_id\", broker=\"kafka://localhost:9092\")\nusers_topic = app.topic(topic_name, value_type=UserModel)\n\n\n@app.agent(users_topic)\nasync def users(users):\n    async for user in users:\n        logger.info(\"Event received in topic\")\n        logger.info(f\"The user is: {user}\")\n\n\n@app.timer(5.0, on_leader=True)\nasync def send_users():\n    data_user = {\"first_name\": \"bar\", \"last_name\": \"foo\"}\n    user = UserModel(**data_user)
\n    await users.send(value=user, value_serializer=s3_json_serializer)\n\n\napp.main()\n\n```\n\n\n## Contributing\n\nWe are happy if you want to contribute to this project.\nIf you find any bugs or have suggestions for improvements, please open an issue.\nWe are also happy to accept your PRs.\nJust open an issue beforehand and let us know what you want to do and why.\n\n## License\nThis project is licensed under the MIT license.\nHave a look at the [LICENSE](https://github.com/bakdata/faust-s3-backed-serializer/blob/master/LICENSE) for more details.","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbakdata%2Ffaust-large-message-serializer","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fbakdata%2Ffaust-large-message-serializer","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbakdata%2Ffaust-large-message-serializer/lists"}