{"id":13811883,"url":"https://github.com/daangn/kinesumer","last_synced_at":"2025-04-22T13:34:09.755Z","repository":{"id":42525198,"uuid":"404985190","full_name":"daangn/kinesumer","owner":"daangn","description":"A Go client implementing a client-side distributed consumer group client for Amazon Kinesis","archived":false,"fork":false,"pushed_at":"2023-06-05T06:49:44.000Z","size":495,"stargazers_count":75,"open_issues_count":6,"forks_count":6,"subscribers_count":73,"default_branch":"main","last_synced_at":"2024-08-04T04:01:32.580Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"Go","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/daangn.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null}},"created_at":"2021-09-10T06:55:28.000Z","updated_at":"2024-07-19T06:48:32.000Z","dependencies_parsed_at":"2024-01-13T15:37:24.693Z","dependency_job_id":"59ecdb42-12f6-4b71-b4a9-430513f71f11","html_url":"https://github.com/daangn/kinesumer","commit_stats":null,"previous_names":[],"tags_count":20,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/daangn%2Fkinesumer","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/daangn%2Fkinesumer/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/daangn%2Fkinesumer/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/daangn%2Fkinesumer/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/daangn","download_url":"https://codeload.github.com/daangn/kinesumer/tar.gz/refs/h
eads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":223899085,"owners_count":17221874,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-08-04T04:00:37.714Z","updated_at":"2024-11-10T00:24:01.912Z","avatar_url":"https://github.com/daangn.png","language":"Go","readme":"# Kinesumer\n\n[![Run tests](https://github.com/daangn/kinesumer/actions/workflows/test.yml/badge.svg?branch=main)](https://github.com/daangn/kinesumer/actions/workflows/test.yml) [![Release](https://img.shields.io/github/v/tag/daangn/kinesumer?label=Release)](https://github.com/daangn/kinesumer/releases)\n\nKinesumer is a Go client implementing a client-side distributed consumer group client for [Amazon Kinesis](https://aws.amazon.com/kinesis/). It supports the following features:\n\n- Implements a client-side distributed Kinesis consumer group client.\n- A client can consume messages from multiple Kinesis streams.\n- Clients are automatically assigned a shard ID range for each stream.\n- Rebalances each shard ID range when clients or upstream shards change (for example, on restart or scaling).\n- Manages a checkpoint for each shard, so that clients can resume consuming from the last checkpoint.\n- Can consume from a Kinesis stream in a different AWS account.\n- Manages all consumer group client state in a DynamoDB table (called the `state store`).\n\n![architecture](./docs/images/architecture.png)\n\n## Setup\n\nKinesumer manages the state of the distributed clients with a database called the \"state store\". 
It uses DynamoDB as the state store, so you need to create a DynamoDB table first. Create the table with the [LSI schema](./schema/ddb-lsi.json). See [How it works](#how-it-works) for details.\n\n\u003e The current state store implementation supports multiple applications (you pass the app name when initializing the client). So, if you already have a Kinesumer state store, you don't need to create another state store table.\n\n### If your Kinesis stream is in a different account\n\n\u003e If you want to connect to Kinesis in a different account, you need to set up an IAM role that can access the target account, and pass the role ARN (`kinesumer.Config.RoleARN`) when initializing the Kinesumer client: [Reference](https://docs.aws.amazon.com/kinesisanalytics/latest/java/examples-cross.html).\n\n## Usage\n\n```go\npackage main\n\nimport (\n    \"fmt\"\n    \"log\"\n    \"time\"\n\n    \"github.com/daangn/kinesumer\"\n)\n\nfunc main() {\n    client, err := kinesumer.NewKinesumer(\n        \u0026kinesumer.Config{\n            App:            \"myapp\",\n            KinesisRegion:  \"ap-northeast-2\",\n            DynamoDBRegion: \"ap-northeast-2\",\n            DynamoDBTable:  \"kinesumer-state-store\",\n            ScanLimit:      1500,\n            ScanTimeout:    2 * time.Second,\n        },\n    )\n    if err != nil {\n        // Error handling.\n        log.Fatalf(\"failed to create client: %v\", err)\n    }\n\n    // Drain asynchronous errors reported by the client.\n    go func() {\n        for err := range client.Errors() {\n            // Error handling.\n            log.Printf(\"kinesumer error: %v\", err)\n        }\n    }()\n\n    // Consume multiple streams.\n    // You can refresh the streams with `client.Refresh()` method.\n    records, err := client.Consume([]string{\"stream1\", \"stream2\"})\n    if err != nil {\n        // Error handling.\n        log.Fatalf(\"failed to consume: %v\", err)\n    }\n\n    for record := range records {\n        fmt.Printf(\"record: %v\\n\", record)\n    }\n}\n```\n\n## How it works\n\nKinesumer implements a client-side distributed consumer group client without any communication between clients. How, then, do clients know the state of the entire system? 
The answer is a distributed key-value store.\n\nTo distribute the shard range evenly among clients, Kinesumer relies on a centralized database called the `state store`. The state store manages the state of the distributed clients, the shard cache, and the checkpoints.\n\nThis is an overview of the Kinesumer architecture:\n\n![how-it-works](./docs/images/how-it-works.png)\n\nThe following explains how Kinesumer works:\n\n- **Leader election**: Clients register themselves in the state store and set their indexes. The index is determined by sorting all active client IDs, and the client with index zero becomes the leader. So, when clients are scaled or restarted, the leader may change.\n- **Shard rebalancing**: A client fetches the full shard ID list and the client list from the state store. It then divides the shard ID list by the number of clients and takes the range of shard IDs corresponding to its index. All clients repeat this process periodically.\n- **Synchronization**: The leader client is responsible for periodically syncing the shard cache with the latest shard list and pruning outdated clients from the client list (to prevent orphaned shard ranges).\n- **Offset checkpoint**: Whenever a client consumes messages from its assigned shards, it updates a per-shard checkpoint with the sequence number of the last message read from each shard.\n\n## License\n\nSee [LICENSE](./LICENSE).\n\n","funding_links":[],"categories":["Go"],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdaangn%2Fkinesumer","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdaangn%2Fkinesumer","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdaangn%2Fkinesumer/lists"}