{"id":17819119,"url":"https://github.com/maestre3d/dynamodb-tx-outbox-sample","last_synced_at":"2025-04-02T09:25:15.114Z","repository":{"id":104103396,"uuid":"438533069","full_name":"maestre3d/dynamodb-tx-outbox-sample","owner":"maestre3d","description":"A demonstration of the transactional outbox messaging pattern (+ Log Trailing) with Amazon DynamoDB (+ Streams) written in Go.","archived":false,"fork":false,"pushed_at":"2022-04-27T04:22:31.000Z","size":38,"stargazers_count":3,"open_issues_count":0,"forks_count":0,"subscribers_count":2,"default_branch":"master","last_synced_at":"2025-02-08T00:32:12.033Z","etag":null,"topics":["amazon-dynamodb","aws","aws-dynamodb","aws-lambda","dynamodb","event-driven-architecture","golang","lambda-functions","outbox-pattern","serverless","terraform","trailing-log-pattern"],"latest_commit_sha":null,"homepage":"","language":"Go","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/maestre3d.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2021-12-15T07:19:52.000Z","updated_at":"2022-04-12T15:46:58.000Z","dependencies_parsed_at":"2023-11-24T13:30:08.469Z","dependency_job_id":null,"html_url":"https://github.com/maestre3d/dynamodb-tx-outbox-sample","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/maestre3d%2Fdynamodb-tx-outbox-sample","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/maestre3d%2Fdynamodb-tx-outbox-sample/tags","releases_url":"https://repos.ecosys
te.ms/api/v1/hosts/GitHub/repositories/maestre3d%2Fdynamodb-tx-outbox-sample/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/maestre3d%2Fdynamodb-tx-outbox-sample/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/maestre3d","download_url":"https://codeload.github.com/maestre3d/dynamodb-tx-outbox-sample/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":246786497,"owners_count":20833702,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["amazon-dynamodb","aws","aws-dynamodb","aws-lambda","dynamodb","event-driven-architecture","golang","lambda-functions","outbox-pattern","serverless","terraform","trailing-log-pattern"],"created_at":"2024-10-27T16:57:45.435Z","updated_at":"2025-04-02T09:25:15.092Z","avatar_url":"https://github.com/maestre3d.png","language":"Go","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Transactional Outbox Pattern in Amazon DynamoDB\nA demonstration of the transactional outbox messaging pattern (+ Log Trailing) with Amazon DynamoDB (+ Streams) written in Go.\n\nFor more information about transaction outbox pattern, please read [this article](https://microservices.io/patterns/data/transactional-outbox.html).\n\nFor more information about log trailing pattern, please read [this article](https://microservices.io/patterns/data/transaction-log-tailing.html).\n\n**Requirements**\n- 2 tables in Amazon DynamoDB (+ table stream).\n- 1 serverless function in Amazon Lambda.\n- 1 topic in Amazon Simple Notification Service 
(SNS).\n\n_Note: Live infrastructure is ready to deploy using the Terraform application from_ `deployments/aws`.\n\nThe architecture is very simple as it relies on serverless patterns and services provided by Amazon Web Services.\nMost of the heavy lifting is done by Amazon itself. Nevertheless, please consider factors such as \nAWS Lambda concurrency limits and API calls from/to Amazon services, as they could impact both performance and scalability.\n\nBefore implementing the solution, please create **two** Amazon DynamoDB tables. One called `students` and \nthe other called `outbox`.\n\nThe `students` table MUST have a property named `student_id` as _Partition Key_ and another property\nnamed `school_id` as _Sort Key_.\n\nThe `outbox` table MUST have a property named `transaction_id` as _Partition Key_ and another property\nnamed `occurred_at` as _Sort Key_.\n\nAll keys MUST be **String**.\n\nFinally, enable **Time-To-Live** and **Streams** on the `outbox` table. \n\n![arch](https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/images/StreamsAndTriggers.png)\n\n_Overall architecture, taken from [this AWS article](https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/Streams.Lambda.Tutorial.html)_\n\nThe workflow is also straightforward.\n\n1. When writing in the database, the Amazon DynamoDB repository MUST call the `TransactWriteItems` API:\n   1. Add the business operations, for any tables required, to the transaction.\n   2. Add the domain events _(encoded using JSON)_ as a single transaction item to reduce the number of transaction items and comply \n   with the _25 ops/per-transaction API limit_.\n   This item MUST be written in the `outbox` table.\n2. The new row will get projected into the `outbox` table stream.\n3. The `outbox` table stream will trigger the `log trailing daemon` AWS Lambda function.\n4. The serverless function will:\n   1. Decode the domain events from the stream message _(using JSON)_.\n   2. 
Using [`Neutrino Gluon`](https://github.com/NeutrinoCorp/gluon), the `publish()` function will transform\n   domain events into integration events (Gluon uses CNCF's CloudEvents specification). \n   3. Gluon will publish the integration events into the desired event bus using the selected codec. In this specific scenario,\n   messages will be published in _Amazon Simple Notification Service (SNS)_ using the _Apache Avro_ codec.\n5. The event bus (Amazon SNS) will route the message to specific destinations. In event-driven systems, the most common\n    destinations are Amazon Simple Queue Service (SQS) queues (one queue per job). This is called the\n    _topic-queue chaining pattern_.\n\n## Additional reading\n\n### Cleaning the Outbox table\n\nUsing the Time-To-Live (TTL) mechanism, batches of messages stored in the `outbox` table will get removed after\na specific time defined by the developer _(for this example, default is 1 day)_. If Amazon DynamoDB is not an option,\na TTL mechanism MUST be implemented manually to keep the `outbox` table lightweight.\n\nAn open source alternative is `Apache Cassandra`, whose initial design drew heavily on Amazon's `Dynamo` and Google's `Bigtable`. It also has a TTL mechanism out of the box.\n\nFurthermore, the defined time for a record to be removed opens the possibility to replay batches of messages generated within transactions.\n\n### Common issues with Event-Driven systems\n\n#### Dealing with duplication and disordered messages when replicating data\n\nAs in the majority of event-driven systems, messages COULD be published without a specific order\nand additionally, messages COULD be published more than once _(caused by at-least-once message delivery)_.\n\nThus, event handler operations MUST be idempotent. In more depth:\n- When creating an entity/aggregate:\n  - If a mutation or deletion arrives at the system first, \n      fail immediately so the process can be retried after a specific time. 
While the backoff elapses,\n      the create operation might get executed.\n- When removing an entity/aggregate:\n  - If a mutation operation arrives after the removal, the handler MUST return\n  no error and acknowledge the arrival of the message.\n- When mutating an entity/aggregate:\n  - If an old version of the entity/aggregate arrives late, the operation MUST use the\n    _Change-Data-Capture (CDC)_ `version` delta or `last_update_time` timestamp to detect \n    the stale version, skip the actual mutation process and\n    acknowledge the message arrival as if it had been applied.\n    The `version` field is recommended over `last_update_time` as time precision COULD lead to\n    race conditions (e.g. using second precision while the operations take milliseconds, so two updates could carry \n    the same timestamp).\n\nIf processes keep failing after N attempts _(N defined by the developer team)_, store the poison messages in a \nDead-Letter Queue (DLQ) so they can be replayed manually after fixes get deployed. No data should be lost.\n\n#### Dealing with duplication and disordered messages in complex processing\n\nSometimes, a business might require functionality which aligns perfectly with the nature of events \n(reacting to change). For example, the product team might need to notify users when they execute a \nspecific operation.\n\nIn that scenario, using the techniques described before will not be sufficient to deal with the nature of event-driven\nsystems (duplication and disorder of messages). Nevertheless, these problems are still solvable, as such processes only perform a\nspecific action triggered by some event.\n\nIn order to solve duplication of processes, a table named `inbox` (or similar) COULD be used to track message processes \nalready executed in the system (even if it is a cluster of nodes).\nIn more depth:\n\n1. Message arrives.\n2. A middleware is called before the actual message process.\n   1. 
The middleware checks the `inbox` table, using the message id as key, to see whether the message was already processed.\n      1. If the message was already processed, stop the process and acknowledge the arrival of the message.\n      2. If not, continue with the message processing normally.\n3. The message process gets executed.\n4. The middleware will be called again.\n   1. If the processing was successful, commit the message process as success in the `inbox` table.\n5. If processing failed, do not acknowledge the message arrival, so it can be retried.\n\nFinally, one thing to consider while implementing this approach is the necessity of a Time-To-Live (TTL) \nmechanism, just as with the `outbox` table, to keep the table lightweight.\n\nNote: This `inbox` table COULD be implemented anywhere as it does not require transactions or any similar mechanism.\nIt is recommended to use an external database to reduce computational overhead from the main database used by business \noperations. An in-memory database such as Redis _(which also has built-in TTL)_ or even Amazon DynamoDB/Apache Cassandra \n(distributed databases) are among the best choices as they handle massive read operations efficiently.\n\nOn the other hand, if disordered processing is a serious problem for the business, the development team might extend\nthe previously described approach for duplication of processes with workarounds such as the usage of timestamps or \neven deltas to determine the order of the processes. In more depth:\n\n![Correlation and Causation IDs](https://blog-arkency.imgix.net/correlation_id_causation_id_rails_ruby_event/CorrelationAndCausationEventsCommands.png?w=768\u0026h=758\u0026fit=max)\n\n1. Message arrives.\n2. A middleware `duplication` is called before the actual message process.\n   1. The middleware checks if the message was already processed.\n      1. If already processed, stop the process and acknowledge the message.\n3. 
A middleware `disorder` is called before the actual message process.\n   1. The middleware verifies if the previous process was already executed using the `causation_id` property.\n      1. If not, return an error and do not acknowledge the message, so it can be retried again after a backoff.\n      2. If previous process was already executed, continue with the message processing normally.\n4. The message process gets executed.\n5. The middleware `duplication` will be called again.\n    1. If the processing was successful, commit the message process as success in the `inbox` table.\n6. If processing failed, do not acknowledge the message arrival, so it can be retried.\n\nFor more information about this last approach, please read [this article about correlation and causation IDs](https://blog.arkency.com/correlation-id-and-causation-id-in-evented-systems/).\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmaestre3d%2Fdynamodb-tx-outbox-sample","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmaestre3d%2Fdynamodb-tx-outbox-sample","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmaestre3d%2Fdynamodb-tx-outbox-sample/lists"}