{"id":43280184,"url":"https://github.com/tosun-si/pasgarde","last_synced_at":"2026-02-01T17:04:13.824Z","repository":{"id":37615660,"uuid":"444032661","full_name":"tosun-si/pasgarde","owner":"tosun-si","description":"Asgarde allows simplifying error handling with Apache Beam Python, with less code, more concise and expressive code.","archived":false,"fork":false,"pushed_at":"2022-06-22T12:42:22.000Z","size":63,"stargazers_count":31,"open_issues_count":0,"forks_count":1,"subscribers_count":5,"default_branch":"main","last_synced_at":"2025-09-20T07:37:50.132Z","etag":null,"topics":["apache-beam","cloud-dataflow","error-handling","google-cloud-platform","python"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/tosun-si.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2022-01-03T11:13:14.000Z","updated_at":"2024-08-25T11:08:28.000Z","dependencies_parsed_at":"2022-07-11T21:52:09.621Z","dependency_job_id":null,"html_url":"https://github.com/tosun-si/pasgarde","commit_stats":null,"previous_names":[],"tags_count":1,"template":false,"template_full_name":null,"purl":"pkg:github/tosun-si/pasgarde","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tosun-si%2Fpasgarde","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tosun-si%2Fpasgarde/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tosun-si%2Fpasgarde/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tosun-si%2Fpasgarde/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/tosun-si","downl
oad_url":"https://codeload.github.com/tosun-si/pasgarde/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tosun-si%2Fpasgarde/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":28983432,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-02-01T16:29:42.054Z","status":"ssl_error","status_checked_at":"2026-02-01T16:29:41.428Z","response_time":56,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["apache-beam","cloud-dataflow","error-handling","google-cloud-platform","python"],"created_at":"2026-02-01T17:04:13.133Z","updated_at":"2026-02-01T17:04:13.808Z","avatar_url":"https://github.com/tosun-si.png","language":"Python","readme":"![Logo](asgarde_logo_small.gif)\n\n# Asgarde\n\nThis module allows simplifying error handling with Apache Beam Python.\n\n## Versions compatibility between Beam and Asgarde\n\n| Asgarde      | Beam |\n| -----------  | ----------- |\n| 0.16.0       | \\\u003e= 2.37.0   |\n\n## Installation of project\n\nThe project is hosted on PyPi repository.\\\nYou can install it with all the build tools compatibles with PyPi and pip.\n\n#### PyPi\n\n##### Example with pip command line from bash\n\n```bash\npip install asgarde==0.16.0\n```\n\n##### Example with requirements.txt\n\nrequirements.txt file\n\n```text\nasgarde==0.16.0\n```\n\n```bash\npip install -r requirements.txt\n```\n\n##### 
Example with Pipenv\n\nPipfile\n\n```text\n[[source]]\nurl = \"https://pypi.org/simple\"\nverify_ssl = true\nname = \"pypi\"\n\n[packages]\nasgarde = \"==0.16.0\"\n\n[requires]\npython_version = \"3.8\"\n```\n\n```bash\npipenv shell\n\npipenv install\n```\n\n- `pipenv shell` creates a virtual environment\n- `pipenv install` installs all the packages specified in the Pipfile\n- A `Pipfile.lock` is generated with hashes of the installed packages\n\nhttps://pipenv.pypa.io/en/latest/\n\n\n## Example of native error handling with Beam\n\nThe following example shows error handling in each step with plain Beam code.\n\n```python\n@dataclass\nclass TeamInfo:\n    name: str\n    country: str\n    city: str\n\n@dataclass\nclass Failure:\n    pipeline_step: str\n    input_element: str\n    exception: Exception\n\nteam_names = [\n    'PSG',\n    'OL',\n    'Real',\n    'ManU'\n]\n    \nteam_countries = {\n    'PSG': 'France',\n    'OL': 'France',\n    'Real': 'Spain',\n    'ManU': 'England'\n}\n\nteam_cities = {\n    'PSG': 'Paris',\n    'OL': 'Lyon',\n    'Real': 'Madrid',\n    'ManU': 'Manchester'\n}\n\nclass MapToTeamWithCountry(DoFn):\n\n    def process(self, element, *args, **kwargs):\n        try:\n            team_name: str = element\n\n            yield TeamInfo(\n                name=team_name,\n                country=team_countries[team_name],\n                city=''\n            )\n        except Exception as err:\n            failure = Failure(\n                pipeline_step=\"Map 1\",\n                input_element=element,\n                exception=err\n            )\n\n            yield pvalue.TaggedOutput(FAILURES, failure)\n\n\nclass MapToTeamWithCity(DoFn):\n\n    def process(self, element, *args, **kwargs):\n        try:\n            team_info: TeamInfo = element\n            city: str = team_cities[team_info.name]\n\n            yield TeamInfo(\n                name=team_info.name,\n                country=team_info.country,\n                city=city\n            
)\n        except Exception as err:\n            failure = Failure(\n                pipeline_step=\"Map 2\",\n                input_element=element,\n                exception=err\n            )\n\n            yield pvalue.TaggedOutput(FAILURES, failure)\n\n\nclass FilterFranceTeams(DoFn):\n\n    def process(self, element, *args, **kwargs):\n        try:\n            team_info: TeamInfo = element\n\n            if team_info.country == 'France':\n                yield element\n        except Exception as err:\n            failure = Failure(\n                pipeline_step=\"Filter France teams\",\n                input_element=element,\n                exception=err\n            )\n\n            yield pvalue.TaggedOutput(FAILURES, failure)\n\n# In Beam pipeline.\ninput_teams: PCollection[str] = p | 'Read' \u003e\u003e beam.Create(team_names)\n\noutputs_map1, failures_map1 = (input_teams | 'Map to team with country' \u003e\u003e ParDo(MapToTeamWithCountry())\n                               .with_outputs(FAILURES, main='outputs'))\n\noutputs_map2, failures_map2 = (outputs_map1 | 'Map to team with city' \u003e\u003e ParDo(MapToTeamWithCity())\n                               .with_outputs(FAILURES, main='outputs'))\n\noutputs_filter, failures_filter = (outputs_map2 | 'Filter France teams' \u003e\u003e ParDo(FilterFranceTeams())\n                                   .with_outputs(FAILURES, main='outputs'))\n\nall_failures = (failures_map1, failures_map2, failures_filter) | 'All Failures PCollections' \u003e\u003e beam.Flatten()\n```\n\nThis example starts with an input `PCollection` containing team names.\\\nThen three operations are applied : two maps and one filter.\n\nFor each operation a custom `DoFn` class is defined and must override the `process` function containing the \ntransformation logic.\\\nA `try except` block is added to catch all possible errors.\\\nIn the `except` block a `Failure` object is built with the input element and the current exception. 
This object is then added to \na `tuple tag` dedicated to errors.\\\nThis `tag` mechanism allows multiple sinks in the pipeline and a dead letter queue for failures.\n\nThere are some drawbacks : \n- We have to repeat a lot of technical code and the same logic, such as the `try except` block, `tuple tags` and \n`failure logic`, although all this logic could be centralized.\n- If we want to intercept all the possible errors in the pipeline, we have to repeat the recovery of output and failure in each step.\n- All the failures `PCollection` must be concatenated at the end.\n- The code is verbose.\n\nRepeating technical code is error-prone and harder to maintain.\n\n\n## Example of error handling using Asgarde library\n\n```python\n# Beam pipeline with Asgarde library.\ninput_teams: PCollection[str] = p | 'Read' \u003e\u003e beam.Create(team_names)\n\nresult = (CollectionComposer.of(input_teams)\n            .map('Map with country', lambda tname: TeamInfo(name=tname, country=team_countries[tname], city=''))\n            .map('Map with city', lambda tinfo: TeamInfo(name=tinfo.name, country=tinfo.country, city=team_cities[tinfo.name]))\n            .filter('Filter french team', lambda tinfo: tinfo.country == 'France'))\n\nresult_outputs: PCollection[TeamInfo] = result.outputs\nresult_failures: PCollection[Failure] = result.failures\n```\n\n### CollectionComposer class\n\nAsgarde provides a `CollectionComposer` wrapper class instantiated from a `PCollection`.\n\n### Operators exposed by CollectionComposer class\n\nThe `CollectionComposer` class exposes the following operators : `map`, `flat_map` and `filter`.\n\nThese classical operators take a function, whose implementation can be : \n- A `lambda expression`\n- A `method` with the same signature as the expected `function`\n\n### Failure object exposed by Asgarde\n\nBehind the scenes, for each step the `CollectionComposer` class adds a `try except` block and `tuple tag logic` with output\nand failure `sinks`.\n\nThe bad sink is based on a 
`Failure` object provided by the library : \n\n```python\n@dataclass\nclass Failure:\n    pipeline_step: str\n    input_element: str\n    exception: Exception\n```\n\nThis object contains the current pipeline step name, the input element in string form and the current exception.\n\nThe input element on the Failure object is built following these rules :\n- If the current element in the `PCollection` is a `dict`, the JSON string form of this `dict` is retrieved\n- For all other types, the `string` form of the object is retrieved. If developers want to bring their own serialization\nlogic, they have to override the `__str__` method in the object, for example for a `dataclass` : \n\n```python\nimport dataclasses\nimport json\nfrom dataclasses import dataclass\n\n@dataclass\nclass Team:\n    name: str\n\n    def __str__(self) -\u003e str:\n        return json.dumps(dataclasses.asdict(self))\n```\n\n### Result of CollectionComposer flow\n\nThe `CollectionComposer` class, after applying and chaining different operations, returns a `tuple` with : \n- Output `PCollection`\n- Failures `PCollection`\n\n```python\nresult = (CollectionComposer.of(input_teams)\n            .map('Map with country', lambda tname: TeamInfo(name=tname, country=team_countries[tname], city=''))\n            .map('Map with city', lambda tinfo: TeamInfo(name=tinfo.name, country=tinfo.country, city=team_cities[tinfo.name]))\n            .filter('Filter french team', lambda tinfo: tinfo.country == 'France'))\n\nresult_outputs: PCollection[TeamInfo] = result.outputs\nresult_failures: PCollection[Failure] = result.failures\n```\n\n### Example of a flow with side inputs\n\n`Asgarde` allows applying transformations with error handling and passing `side inputs`.\\\nThe syntax is the same as in a usual Beam pipeline, with `AsDict` or `AsList` passed as function parameters.\n\n```python\ndef to_team_with_city(self, team_name: str, team_countries: Dict[str, str]) -\u003e TeamInfo:\n    return TeamInfo(name=team_name, 
country=team_countries[team_name], city='')\n\nteam_countries = {\n    'PSG': 'France',\n    'OL': 'France',\n    'Real': 'Spain',\n    'ManU': 'England'\n}\n\n# Side inputs.\ncountries_side_inputs = p | 'Countries' \u003e\u003e beam.Create(team_countries)\n\n# Beam Pipeline.\nresult = (CollectionComposer.of(input_teams)\n            .map('Map with country', self.to_team_with_city, team_countries=AsDict(countries_side_inputs))\n            .map('Map with city', lambda ti: TeamInfo(name=ti.name, country=ti.country, city=team_cities[ti.name]))\n            .filter('Filter french team', lambda ti: ti.country == 'France'))\n\nresult_outputs: PCollection[TeamInfo] = result.outputs\nresult_failures: PCollection[Failure] = result.failures\n```\n\n### Asgarde and error handling with Beam DoFn lifecycle\n\n`Asgarde` allows interacting with the `DoFn` lifecycle while chaining transformations with error handling, for example : \n\n```python\n(CollectionComposer.of(input_teams)\n     .map('Map to Team info',\n          input_element_mapper=lambda team_name: TeamInfo(name=team_name, country='test', city='test'),\n          setup_action=lambda: print('Setup action'),\n          start_bundle_action=lambda: print('Start bundle action'),\n          finish_bundle_action=lambda: print('Finish bundle action'),\n          teardown_action=lambda: print('Teardown action'))\n     )\n```\n\nThe `map` and `flat_map` methods of the `CollectionComposer` class provide the following parameters to interact with the \n`DoFn` lifecycle :\n- setup_action\n- start_bundle_action\n- finish_bundle_action\n- teardown_action\n\nEach parameter takes a `function` with no input parameter that returns `None`; it corresponds to an action executed \nin the dedicated lifecycle method : \n\nhttps://beam.apache.org/documentation/transforms/python/elementwise/pardo/\n\n### Advantages of using Asgarde\n\n`Asgarde` offers the following advantages :\n- Simplifies error handling with less code that is more expressive and concise\n- No need 
to repeat the same technical code for error handling, such as the `try except` block, `tuple tags` and the concatenation of all the pipeline failures\n- Allows interacting with the Beam lifecycle while chaining transformations and error handling\n\n\n\n\n\n\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftosun-si%2Fpasgarde","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ftosun-si%2Fpasgarde","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftosun-si%2Fpasgarde/lists"}