{"id":22636730,"url":"https://github.com/daroczig/awr.kinesis","last_synced_at":"2025-08-31T16:43:08.331Z","repository":{"id":56937565,"uuid":"82682382","full_name":"daroczig/AWR.Kinesis","owner":"daroczig","description":"Amazon Kinesis Consumer Application from R for Stream Processing","archived":false,"fork":false,"pushed_at":"2018-03-08T23:37:47.000Z","size":13400,"stargazers_count":4,"open_issues_count":1,"forks_count":1,"subscribers_count":8,"default_branch":"master","last_synced_at":"2025-08-23T09:39:39.999Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"R","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"agpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/daroczig.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2017-02-21T13:23:03.000Z","updated_at":"2023-07-25T17:15:54.000Z","dependencies_parsed_at":"2022-08-21T06:50:08.287Z","dependency_job_id":null,"html_url":"https://github.com/daroczig/AWR.Kinesis","commit_stats":null,"previous_names":["cardcorp/awr.kinesis"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/daroczig/AWR.Kinesis","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/daroczig%2FAWR.Kinesis","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/daroczig%2FAWR.Kinesis/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/daroczig%2FAWR.Kinesis/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/daroczig%2FAWR.Kinesis/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/daroczig","download_url":"https://codeload.github.com/daroczig/AWR.Kinesis/tar.gz/refs/heads/mast
er","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/daroczig%2FAWR.Kinesis/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":273010935,"owners_count":25030368,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-08-31T02:00:09.071Z","response_time":79,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-12-09T03:30:00.612Z","updated_at":"2025-08-31T16:43:08.287Z","avatar_url":"https://github.com/daroczig.png","language":"R","funding_links":[],"categories":[],"sub_categories":[],"readme":"# AWR.Kinesis: An Amazon Kinesis Client Library for R\n\nThis R package is a wrapper around and an interface to the Amazon Kinesis Client Library (KCL) [MultiLangDaemon](https://github.com/awslabs/amazon-kinesis-client/blob/master/src/main/java/com/amazonaws/services/kinesis/multilang/package-info.java), which is part of the [Amazon KCL for Java](https://github.com/awslabs/amazon-kinesis-client). 
This Java-based daemon takes care of communicating with the Kinesis API (for example, to retrieve the status of streams and shards and to fetch records from them) and also handles other useful things, like checkpointing using Amazon DynamoDB -- so that the R developer can concentrate on the stream processing algorithm.\n\n## Writing a record processor application\n\nA minimal stream processing script written in R looks something like:\n\n```r\nAWR.Kinesis::kinesis_consumer(processRecords = function(records) {\n\tflog.info(jsonlite::toJSON(records))\n})\n```\n\nThis R script, executed by the MultiLangDaemon, reads records from the Kinesis stream and logs them as JSON in the application log, which by default is a temporary file. Note: it's important not to write anything to `stdout`, as `stdin` and `stdout` are used by the package internals to communicate with the MultiLangDaemon. But as the package already depends on and integrates the `futile.logger` R package, it's very convenient to use the `flog` functions for app logging with various log levels.\n\nLet's see a more complex stream processing app:\n\n```r\nAWR.Kinesis::kinesis_consumer(\n        initialize     = function()\n            flog.info('Loading some data'),\n        processRecords = function(records)\n            flog.info('Received some records from Kinesis'),\n        shutdown       = function()\n            flog.info('Bye'),\n        updater        = list(\n            list(1, function()\n                flog.info('Updating some data every minute')),\n            list(1/60, function()\n                flog.info('This is a high frequency updater call'))))\n```\n\nThis application takes multiple (anonymous) functions. Besides the `processRecords` argument, which we used in the above application to define a function to process the records, we also have an `initialize`, a `shutdown` and two `updater` functions. 
The `initialize` and `shutdown` calls are trivial: these functions are run when the application starts and when it stops, e.g. when there are no further records to be read from a shard due to a shard merge operation.\n\nThe `updater` part starts a timer in the background and executes the defined functions at the given frequency, specified in minutes (1 minute and 1/60 minute, i.e. 1 second, in the above example) before the `processRecords` calls.\n\nSo this application will log\n* `Loading some data` on app start,\n* `Received some records from Kinesis` every time it reads from Kinesis,\n* `This is a high frequency updater call` (almost) every second after a process records call,\n* `Updating some data every minute` around once a minute,\n* `Bye` when the app stops.\n\nUse the `initialize` function to load/cache some data for the `processRecords` calls, then use the `updater` functions to refresh your cached data on a regular basis. To store credentials for databases, APIs etc., use the [AWR.KMS](https://github.com/cardcorp/AWR.KMS) R package to interact with the AWS Key Management Service.\n\n## Executing the record processor application\n\nThe R script has to be executable, so add the executable bit (`chmod +x`) and also set a hashbang (for example `Rscript`, or use [littler](http://dirk.eddelbuettel.com/code/littler.html)). Then define a configuration file for the MultiLangDaemon, for example the content of `app.properties` could be as simple as:\n\n```\nexecutableName = ./demo_app.R\nstreamName = demo_stream\napplicationName = demo_app\n```\n\nWith this config file, the MultiLangDaemon will look for a `demo_stream` Kinesis stream in the default (US East) region, start reading from the oldest available record (via `TRIM_HORIZON`), and run the `demo_app.R` script to process the records, using the `demo_app` DynamoDB table for checkpointing. 
Many other useful settings are available as well; see the [example file of the Python client](https://github.com/awslabs/amazon-kinesis-client-python/blob/master/samples/sample.properties) for more details.\n\nRunning the MultiLangDaemon with the above defined configuration file is easy, as the required `jar` files are bundled with the `AWR` and `AWR.Kinesis` packages. So first, identify where the `jar` files were installed:\n\n```r\nsapply(c('AWR', 'AWR.Kinesis'), function(pkg) system.file('java', package = pkg))\n```\n\nThis returns two folders that you should pass as custom classpaths to `java`, for example:\n\n```\n/usr/bin/java -cp `Rscript --vanilla -e \"cat(paste(c(sapply(c('AWR', 'AWR.Kinesis'), function(pkg) file.path(system.file('java', package = pkg), '*')), './'), collapse = ':'))\"` com.amazonaws.services.kinesis.multilang.MultiLangDaemon ./app.properties\n```\n\nPlease note that you need AWS access to both Kinesis and DynamoDB to get the above examples working.\n\n## Further reading\n\nAgain, this is just a wrapper around the MultiLangDaemon, so the related documentation will be extremely useful if you get stuck:\n* [AWS introduction to Kinesis](http://docs.aws.amazon.com/streams/latest/dev/introduction.html)\n* [AWS docs on using the KCL](http://docs.aws.amazon.com/streams/latest/dev/developing-consumers-with-kcl.html)\n* [Java Kinesis Client](https://github.com/awslabs/amazon-kinesis-client)\n* [Python Kinesis Client](https://github.com/awslabs/amazon-kinesis-client-python)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdaroczig%2Fawr.kinesis","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdaroczig%2Fawr.kinesis","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdaroczig%2Fawr.kinesis/lists"}