{"id":13563881,"url":"https://github.com/zalando-incubator/remora","last_synced_at":"2025-08-20T09:30:34.457Z","repository":{"id":20832014,"uuid":"88889842","full_name":"zalando-incubator/remora","owner":"zalando-incubator","description":"Kafka consumer lag-checking application for monitoring, written in Scala and Akka HTTP; a wrap around the Kafka consumer group command. Integrations with Cloudwatch and Datadog. Authentication recently added","archived":false,"fork":false,"pushed_at":"2022-09-29T17:13:59.000Z","size":161,"stargazers_count":196,"open_issues_count":6,"forks_count":30,"subscribers_count":17,"default_branch":"master","last_synced_at":"2024-11-26T01:34:04.604Z","etag":null,"topics":["akka-http","authentication","cloudwatch","consumer","consumer-group","consumer-lag-checking","datadog","datadog-agent","kafka","lag","monitoring","remora","scala","security","zalando","zalando-dublin"],"latest_commit_sha":null,"homepage":"","language":"Scala","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/zalando-incubator.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":"Security.md","support":null}},"created_at":"2017-04-20T16:57:05.000Z","updated_at":"2024-10-12T06:59:21.000Z","dependencies_parsed_at":"2023-01-12T03:30:55.394Z","dependency_job_id":null,"html_url":"https://github.com/zalando-incubator/remora","commit_stats":null,"previous_names":[],"tags_count":16,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/zalando-incubator%2Fremora","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/zalando-incubator%2Fremora/tags","releases_url":"https://repos.
ecosyste.ms/api/v1/hosts/GitHub/repositories/zalando-incubator%2Fremora/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/zalando-incubator%2Fremora/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/zalando-incubator","download_url":"https://codeload.github.com/zalando-incubator/remora/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":230408171,"owners_count":18220974,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["akka-http","authentication","cloudwatch","consumer","consumer-group","consumer-lag-checking","datadog","datadog-agent","kafka","lag","monitoring","remora","scala","security","zalando","zalando-dublin"],"created_at":"2024-08-01T13:01:24.231Z","updated_at":"2024-12-19T09:08:14.833Z","avatar_url":"https://github.com/zalando-incubator.png","language":"Scala","funding_links":[],"categories":["Scala","UI / Cluster management","Operations","Kafka"],"sub_categories":["Monitoring","Spring Cloud框架"],"readme":"# Remora\n\n![Grafana Graph](https://raw.githubusercontent.com/imduffy15/remora-fetcher/master/img/grafana.png)\n\n[Remora](https://github.com/zalando-incubator/remora) is a monitoring utility for [Apache Kafka](http://kafka.apache.org/) that provides consumer lag checking as a service. An HTTP endpoint is provided to request consumer group information on demand. 
Combined with a time series database such as [KairosDB](https://kairosdb.github.io/), this makes it possible to graph your consumer group status; see [remora fetcher](https://github.com/imduffy15/remora-fetcher) for an example. \n\nRemora is stable and **production ready**. A number of production Kafka clusters at Zalando are being monitored by Remora right now!\n\n## Inspiration\n\nWe created Remora after spending some time using LinkedIn's [Burrow](https://github.com/linkedin/Burrow) application for monitoring consumer lag and experiencing some performance problems (Burrow shut down after an unknown amount of time with no error stack, alert, or other sign of failure. We never found out why, and we had to keep restarting the app, which was very annoying). Remora provides the [Kafka consumer group command](https://github.com/apache/kafka/blob/0.10.0/core/src/main/scala/kafka/admin/ConsumerGroupCommand.scala) as an HTTP endpoint.\n\n## User Testimonials \n\n\u003e We are using Kafka 0.10.2.1 extensively.  As almost all our applications depend on Kafka, we needed a way to visualise consumer data over a time period in order to discover issues with our consumers. Remora lets us do exactly this, it exposes consumer group metrics over HTTP which allow us to create alarms if a consumer has stopped or slowed consumption from a topic or even on a single partition. ~ Team Buffalo @ Zalando Dublin\n\n\u003e We are using Kafka 0.10.2.1 along with Akka Streams. We use Remora to track, alert, and visualise any lag within any of our components ~ Team Setanta @ Zalando Dublin\n\n\u003e We rely on Kafka for streaming DB change events on to other teams within our organisation. Remora greatly aids us in ensuring our Kafka and Kafka Connect components are functioning correctly by monitoring both the number of events being produced, and any lag present on a per consumer basis. 
It is proving an excellent tool in providing data which we use to trigger real time alerts ~ Team Warhol @ Zalando Dublin\n\n\u003e We use Kafka and Kafka Streaming to orchestrate the different components of our text processing pipeline. Through data provided by Remora, we monitor lags in different topics as part of our monitoring dashboard and alerting system. Remora makes it easier for us to quickly identify and respond to bottlenecks and problems. ~ Team Sapphire @ Zalando Dublin\n\n\u003e We are using Mirror Maker to replicate data between two Kafka brokers and Remora has been a great help to monitor the replication in real time. The metrics exposed by Remora are pushed to Datadog, on top of which we build dashboards and triggers to help us react in case of failure. ~ Sqooba Switzerland\n\n## Getting started\n\n### Dependencies\n\nThe latest release of [Remora](https://github.com/zalando-incubator/remora) supports [Apache Kafka](http://kafka.apache.org/) 3.1.0 and earlier.\n\nTo find the latest releases, use one of the following commands:\n\n```\n$ curl https://registry.opensource.zalan.do/teams/buffalo/artifacts/remora/tags | jq \".[] | .name\"\n\n$ pierone latest buffalo remora --url registry.opensource.zalan.do # requires `$ pip3 install stups-pierone`\n```\n\n### Running it\n\nImages for all versions are available on [Zalando opensource pierone](http://registry.opensource.zalan.do).\n\nThey can be used as follows:\n\n```bash\ndocker run -it --rm -p 9000:9000 -e KAFKA_ENDPOINT=127.0.0.1:9092 registry.opensource.zalan.do/buffalo/remora\n```\n\nRun it with a different log level:\n\n```bash\ndocker run -it --rm -p 9000:9000 -e KAFKA_ENDPOINT=127.0.0.1:9092 -e 'JAVA_OPTS=-Dlogback-root-level=INFO' registry.opensource.zalan.do/buffalo/remora\n```\n\nFor further examples see the [docker-compose.yml](basic-example/docker-compose.yml)\n\n```bash\ndocker-compose -f basic-example/docker-compose.yml up\n```\n\nRun Remora in an IDE with Kafka and ZooKeeper run by 
docker-compose. Note that you must set `-e KAFKA_ENDPOINT=\"kafka:9094\"` and `--network basic-example_default` for Remora to work with Kafka from docker-compose.\n\n```bash\ndocker-compose -f basic-example/docker-compose.yml up --scale remora=0\n```\n\nRemora is stateless, so you can scale the API out to several instances to test it:\n\n```bash\ndocker-compose -f basic-example/docker-compose.yml up --scale remora=3\n```\n\nFor examples with broker authentication see the [docker-compose.yml](auth-example/docker-compose.yml)\n\n```bash\ndocker-compose -f auth-example/docker-compose.yml up\n```\n\n### Usage\n\n#### Show active consumers \n\n```bash\n$ curl http://localhost:9000/consumers\n[\"consumer-1\", \"consumer-2\", \"consumer-3\"]\n```\n\n#### Show specific consumer group information\n\n```bash\n$ curl http://localhost:9000/consumers/\u003cConsumerGroupId\u003e\n{  \n   \"state\":\"Empty\",\n   \"partition_assignment\":[  \n      {  \n         \"group\":\"console-consumer-20891\",\n         \"coordinator\":{  \n            \"id\":0,\n            \"id_string\":\"0\",\n            \"host\":\"foo.company.com\",\n            \"port\":9092\n         },\n         \"topic\":\"products-in\",\n         \"partition\":1,\n         \"offset\":3,\n         \"lag\":0,\n         \"consumer_id\":\"-\",\n         \"host\":\"-\",\n         \"client_id\":\"-\",\n         \"log_end_offset\":3\n      },\n      {  \n         \"group\":\"console-consumer-20891\",\n         \"coordinator\":{  \n            \"id\":0,\n            \"id_string\":\"0\",\n            \"host\":\"foo.company.com\",\n            \"port\":9092\n         },\n         \"topic\":\"products-in\",\n         \"partition\":0,\n         \"offset\":3,\n         \"lag\":0,\n         \"consumer_id\":\"consumer-1-7baba9b9-0ec3-4241-9433-f36255dd4708\",\n         \"host\":\"/xx.xxx.xxx.xxx\",\n         \"client_id\":\"consumer-1\",\n         \"log_end_offset\":3\n      }\n   ],\n   \"lag_per_topic\":{\n        \"products-in\" : 0\n   }\n}\n```\n\n#### 
Health\n\n```bash\n$ curl http://localhost:9000/health\n{\n    \"cluster_id\": \"foobar_123\",\n    \"controller\": {\n        \"host\": \"xx.xxx.xxx.xxx\",\n        \"id\": 0,\n        \"id_string\": \"0\",\n        \"port\": 9092\n    },\n    \"nodes\": [\n        {\n            \"host\": \"xx.xxx.xxx.xxx\",\n            \"id\": 0,\n            \"id_string\": \"0\",\n            \"port\": 9092\n        }\n    ]\n}\n```\n\n### Metrics\n\n```bash\n$ curl http://localhost:9000/metrics\n{\n  \"version\": \"3.0.0\",\n  \"gauges\": {\n    \"PS-MarkSweep.count\": {\n      \"value\": 7371\n    },\n    \"PS-MarkSweep.time\": {\n      \"value\": 310404\n    },\n    \"PS-Scavenge.count\": {\n      \"value\": 476530\n    },\n    \"PS-Scavenge.time\": {\n      \"value\": 1234370\n    },\n    \"blocked.count\": {\n      \"value\": 0\n    },\n    \"count\": {\n      \"value\": 12\n    },\n    \"daemon.count\": {\n      \"value\": 3\n    },\n    \"deadlock.count\": {\n      \"value\": 0\n    },\n    \"deadlocks\": {\n      \"value\": []\n    },\n    \"heap.committed\": {\n      \"value\": 74448896\n    },\n    \"heap.init\": {\n      \"value\": 132120576\n    },\n    \"heap.max\": {\n      \"value\": 1860698112\n    },\n    \"heap.usage\": {\n      \"value\": 0.021295551247380425\n    },\n    \"heap.used\": {\n      \"value\": 39624592\n    },\n    \"new.count\": {\n      \"value\": 0\n    },\n    \"non-heap.committed\": {\n      \"value\": 73883648\n    },\n    \"non-heap.init\": {\n      \"value\": 2555904\n    },\n    \"non-heap.max\": {\n      \"value\": -1\n    },\n    \"non-heap.usage\": {\n      \"value\": -72377144\n    },\n    \"non-heap.used\": {\n      \"value\": 72377144\n    },\n    \"pools.Code-Cache.committed\": {\n      \"value\": 27525120\n    },\n    \"pools.Code-Cache.init\": {\n      \"value\": 2555904\n    },\n    \"pools.Code-Cache.max\": {\n      \"value\": 251658240\n    },\n    \"pools.Code-Cache.usage\": {\n      \"value\": 0.10638478597005209\n    },\n 
   \"pools.Code-Cache.used\": {\n      \"value\": 26772608\n    },\n    \"pools.Compressed-Class-Space.committed\": {\n      \"value\": 5242880\n    },\n    \"pools.Compressed-Class-Space.init\": {\n      \"value\": 0\n    },\n    \"pools.Compressed-Class-Space.max\": {\n      \"value\": 1073741824\n    },\n    \"pools.Compressed-Class-Space.usage\": {\n      \"value\": 0.004756048321723938\n    },\n    \"pools.Compressed-Class-Space.used\": {\n      \"value\": 5106768\n    },\n    \"pools.Metaspace.committed\": {\n      \"value\": 41115648\n    },\n    \"pools.Metaspace.init\": {\n      \"value\": 0\n    },\n    \"pools.Metaspace.max\": {\n      \"value\": -1\n    },\n    \"pools.Metaspace.usage\": {\n      \"value\": 0.984972144911835\n    },\n    \"pools.Metaspace.used\": {\n      \"value\": 40497768\n    },\n    \"pools.PS-Eden-Space.committed\": {\n      \"value\": 40894464\n    },\n    \"pools.PS-Eden-Space.init\": {\n      \"value\": 33554432\n    },\n    \"pools.PS-Eden-Space.max\": {\n      \"value\": 693108736\n    },\n    \"pools.PS-Eden-Space.usage\": {\n      \"value\": 0.02002515230164405\n    },\n    \"pools.PS-Eden-Space.used\": {\n      \"value\": 13879608\n    },\n    \"pools.PS-Old-Gen.committed\": {\n      \"value\": 31457280\n    },\n    \"pools.PS-Old-Gen.init\": {\n      \"value\": 88080384\n    },\n    \"pools.PS-Old-Gen.max\": {\n      \"value\": 1395654656\n    },\n    \"pools.PS-Old-Gen.usage\": {\n      \"value\": 0.018360885975505965\n    },\n    \"pools.PS-Old-Gen.used\": {\n      \"value\": 25625456\n    },\n    \"pools.PS-Survivor-Space.committed\": {\n      \"value\": 2097152\n    },\n    \"pools.PS-Survivor-Space.init\": {\n      \"value\": 5242880\n    },\n    \"pools.PS-Survivor-Space.max\": {\n      \"value\": 2097152\n    },\n    \"pools.PS-Survivor-Space.usage\": {\n      \"value\": 0.0625\n    },\n    \"pools.PS-Survivor-Space.used\": {\n      \"value\": 131072\n    },\n    \"runnable.count\": {\n      \"value\": 4\n    },\n  
  \"terminated.count\": {\n      \"value\": 0\n    },\n    \"timed_waiting.count\": {\n      \"value\": 1\n    },\n    \"total.committed\": {\n      \"value\": 148332544\n    },\n    \"total.init\": {\n      \"value\": 134676480\n    },\n    \"total.max\": {\n      \"value\": 1860698111\n    },\n    \"total.used\": {\n      \"value\": 112001672\n    },\n    \"waiting.count\": {\n      \"value\": 7\n    }\n  },\n  \"counters\": {\n    \"KafkaClientActor.receiveCounter\": {\n      \"count\": 1443078\n    }, \n    \"foo.3.bar.GET-rejections\": {\n      \"count\": 1\n    },\n    \"foo.3bar.GET-rejections\": {\n      \"count\": 1\n     },\n     \"foo.4.bar.GET-rejections\": {\n       \"count\": 1\n     },\n     \"health.GET-2xx\": {\n       \"count\": 1\n     },\n     \"metrics.GET-2xx\": {\n       \"count\": 5\n     }   \n  },\n  \"histograms\": {},\n  \"meters\": {\n    \"KafkaClientActor.receiveExceptionMeter\": {\n      \"count\": 0,\n      \"m15_rate\": 0,\n      \"m1_rate\": 0,\n      \"m5_rate\": 0,\n      \"mean_rate\": 0,\n      \"units\": \"events/second\"\n    }\n  },\n  \"timers\": {\n    \"KafkaClientActor.receiveTimer\": {\n      \"count\": 1443078,\n      \"max\": 0.496106,\n      \"mean\": 0.023955427605185976,\n      \"min\": 0.00855,\n      \"p50\": 0.013158,\n      \"p75\": 0.015818,\n      \"p95\": 0.069989,\n      \"p98\": 0.18145599999999998,\n      \"p99\": 0.193686,\n      \"p999\": 0.47478499999999996,\n      \"stddev\": 0.04561406607191679,\n      \"m15_rate\": 0.8672873098267513,\n      \"m1_rate\": 0.8576046718431439,\n      \"m5_rate\": 0.8704903354041494,\n      \"mean_rate\": 0.34074311090084636,\n      \"duration_units\": \"milliseconds\",\n      \"rate_units\": \"calls/second\"\n    },\n    \"RemoraKafkaConsumerGroupService.describe-timer\": {\n      \"count\": 1372542,\n      \"max\": 3953.5592429999997,\n      \"mean\": 165.67620936478744,\n      \"min\": 4.631377,\n      \"p50\": 22.125121,\n      \"p75\": 124.258938,\n      \"p95\": 
527.534084,\n      \"p98\": 800.1686119999999,\n      \"p99\": 3316.226616,\n      \"p999\": 3611.7097409999997,\n      \"stddev\": 473.995637636751,\n      \"m15_rate\": 0.8508541627113339,\n      \"m1_rate\": 0.8450436821406069,\n      \"m5_rate\": 0.8545541048945428,\n      \"mean_rate\": 0.324087977369598,\n      \"duration_units\": \"milliseconds\",\n      \"rate_units\": \"calls/second\"\n    },\n    \"RemoraKafkaConsumerGroupService.list-timer\": {\n      \"count\": 70536,\n      \"max\": 2167.1663869999998,\n      \"mean\": 163.13534839326368,\n      \"min\": 56.275192999999994,\n      \"p50\": 162.584495,\n      \"p75\": 162.584495,\n      \"p95\": 162.584495,\n      \"p98\": 200.345285,\n      \"p99\": 200.345285,\n      \"p999\": 437.69862,\n      \"stddev\": 23.321317038931596,\n      \"m15_rate\": 0.016617378383700615,\n      \"m1_rate\": 0.015343754688965648,\n      \"m5_rate\": 0.016501030706405084,\n      \"mean_rate\": 0.016655133007592124,\n      \"duration_units\": \"milliseconds\",\n      \"rate_units\": \"calls/second\"\n    },\n    \"metrics.GET\": {\n      \"count\": 2,\n      \"max\": 174.712404,\n      \"mean\": 88.26670169568574,\n      \"min\": 4.375856,\n      \"p50\": 4.375856,\n      \"p75\": 174.712404,\n      \"p95\": 174.712404,\n      \"p98\": 174.712404,\n      \"p99\": 174.712404,\n      \"p999\": 174.712404,\n      \"stddev\": 85.15869346735195,\n      \"m15_rate\": 0,\n      \"m1_rate\": 0,\n      \"m5_rate\": 0,\n      \"mean_rate\": 0.6714371986436051,\n      \"duration_units\": \"milliseconds\",\n      \"rate_units\": \"calls/second\"\n      }\n  }\n}\n```\n\n## Configuring Remora\n\nAdditional configuration can be passed via the following environment variables:\n\n* **SERVER_PORT** - default `9000`\n* **KAFKA_ENDPOINT** - default `localhost:9092`\n* **ACTOR_TIMEOUT** - default `60 seconds`\n* **AKKA_HTTP_SERVER_REQUEST_TIMEOUT** - `default 60 seconds`\n* **AKKA_HTTP_SERVER_IDLE_TIMEOUT** - `default 60 seconds`\n* 
**TO_REGISTRY** - `default false` reports lag/offset/end to metricsRegistry\n* **EXPORT_METRICS_INTERVAL_SECONDS** - `default 20` interval, in seconds, at which lag/offset/end are reported to metricsRegistry\n\n\n### Configuring Remora with CloudWatch\n\nThe following environment variables can be used to configure reporting to CloudWatch:\n\n* **CLOUDWATCH_ON** - `default false` reports metricsRegistry to CloudWatch; requires TO_REGISTRY to be switched on!\n* **CLOUDWATCH_NAME** - `default 'remora'` name to appear on CloudWatch\n* **CLOUDWATCH_METRIC_FILTER** - `default ''` regex selecting which metric names are reported to CloudWatch. Metric names that DO NOT match the regex are filtered out.\n\n### Configuring Remora with Datadog\n\nThe following environment variables can be used to configure reporting to Datadog:\n\n* **DATADOG_ON** - `default false` reports metricsRegistry to Datadog; requires TO_REGISTRY to be switched on!\n* **DATADOG_NAME** - `default 'remora'` name to appear on Datadog\n* **DATADOG_INTERVAL_MINUTES** - `default '1'` The reporting interval, in minutes.\n* **DATADOG_AGENT_HOST** - `default 'localhost'` The host on which a Datadog agent is running.\n* **DATADOG_AGENT_PORT** - `default '8125'` The port of the Datadog agent.\n* **DATADOG_CONSUMER_GROUPS** - `default '[]'` List of consumer groups for which metrics will be sent to Datadog. 
An empty list means that all metrics will be sent.\n\n__Reporting to the Datadog agent__:\n\nReporting to Datadog is done via [DogStatsD](https://docs.datadoghq.com/guides/dogstatsd/), which usually runs on the same host as Remora.\nHowever, because Remora runs inside a Docker container, some extra steps are required for the integration:\n\n* Set **DATADOG_AGENT_HOST** to the address of the host machine\n* In the Datadog agent configuration, set `non_local_traffic: yes`\n\nThis way, a Docker container running Remora will be able to communicate with a Datadog agent on the host machine.\n\n## Building from source\n\n### Prerequisites\n\n - Scala\n - SBT\n\n### Build\n\nCreate a Docker image locally. The image will be tagged `remora:0.1.0-SNAPSHOT` by default.\n\n```bash\n$ sbt docker:publishLocal\n```\n\n## Contributing\n\nWe are happy to accept contributions. First, take a look at our [contributing guidelines](CONTRIBUTING.md).\n\n## TODO\n\nPlease check the [Issues Page](https://github.com/zalando-incubator/remora/issues)\nfor contribution ideas.\n\n## Contact\n\nFeel free to contact one of the [maintainers](MAINTAINERS).\n\n## License\n\nMIT\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fzalando-incubator%2Fremora","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fzalando-incubator%2Fremora","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fzalando-incubator%2Fremora/lists"}