{"id":36537717,"url":"https://github.com/monasca/monasca-aggregator","last_synced_at":"2026-01-15T02:21:28.043Z","repository":{"id":80010840,"uuid":"89937086","full_name":"monasca/monasca-aggregator","owner":"monasca","description":"Near real-time continuous aggregation of Monasca metrics","archived":false,"fork":false,"pushed_at":"2019-04-16T02:21:56.000Z","size":118,"stargazers_count":11,"open_issues_count":4,"forks_count":8,"subscribers_count":6,"default_branch":"master","last_synced_at":"2024-11-15T00:58:36.607Z","etag":null,"topics":["golang","kubernetes","monasca","monitoring","openstack","stream-processing"],"latest_commit_sha":null,"homepage":"","language":"Go","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/monasca.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null}},"created_at":"2017-05-01T15:41:10.000Z","updated_at":"2021-02-18T02:17:06.000Z","dependencies_parsed_at":"2024-01-16T00:22:40.069Z","dependency_job_id":"37c593ae-6c29-4898-8ff6-a98d4ac8dc16","html_url":"https://github.com/monasca/monasca-aggregator","commit_stats":{"total_commits":142,"total_committers":5,"mean_commits":28.4,"dds":0.5845070422535211,"last_synced_commit":"95b71d7012cf228d2033d5aa1950be90b06cc94f"},"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/monasca/monasca-aggregator","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/monasca%2Fmonasca-aggregator","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/monasca%2Fmonasca-aggregator/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitH
ub/repositories/monasca%2Fmonasca-aggregator/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/monasca%2Fmonasca-aggregator/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/monasca","download_url":"https://codeload.github.com/monasca/monasca-aggregator/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/monasca%2Fmonasca-aggregator/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":28335158,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-01-12T00:36:25.062Z","status":"online","status_checked_at":"2026-01-12T02:00:08.677Z","response_time":98,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["golang","kubernetes","monasca","monitoring","openstack","stream-processing"],"created_at":"2026-01-12T05:08:37.404Z","updated_at":"2026-01-15T02:21:28.035Z","avatar_url":"https://github.com/monasca.png","language":"Go","funding_links":[],"categories":[],"sub_categories":[],"readme":"# monasca-aggregator\n\n## Introduction\n\nA high-speed near real-time continuous aggregation micro-service for Monasca with the following features:\n\n* Read metrics from Kafka.\n\n* Write aggregated metrics to Kafka.\n\n* Filter metrics by metric name.\n\n* Filter metrics by dimension (name, value) pairs.\n\n* Reject metrics by dimension (name, value) pairs.\n\n* Reject metrics by dimension name.\n\n* Group 
by metric dimension names.\n\n* Write aggregated metrics using a specified aggregated name.\n\n* Supported aggregations include the following:\n \n  * sum\n  * count\n  * avg\n  * min\n  * max\n  * delta\n  * rate\n\n* Aggregate on specified window sizes. Support any window size. E.g. 10 seconds or one hour.\n\n* Aggregations aligned to time window boundaries.\nTime window aggregations occur at boundaries aligned to the start of the epoch.\nE.g. If a one hour window size is specified, time window aggregations will start on the hour, not randomly in the middle based on when the process is started.\n\n* Lag time. Aggregations are produced at a specified lag time past the end of the time window.\nThe time at which the aggregations start is specified based on a \"lag\" time, which is the duration past the end of the time window.\nE.g. 10 minutes past the hour. This can be set to any value, such as 10 hours if desired.\n\n* Continuous near real-time aggregations.\nAggregations are stored in memory only.\nTherefore, metrics don't need to be pulled into memory and operated on in a batch operation.\nE.g. 
When performing a sum operation for a series, only the running total is kept in memory for each series.\n\n* Event time window processing.\nAggregations for metrics are processed based on the timestamp of the metric in event time, and not the process time or time at which the metric is being processed.\n\n* Stop/start, crash/restarts handling.\nKafka offsets are manually committed after an aggregation is produced to allow processing to start off from where the last successful aggregation completed.\nTherefore, aggregations are computed with no data loss.\nIf for any reason processing stops in the middle of a time window, the Kafka offsets will not be committed for that time window.\nWhen re-started, the Kafka offsets are read from Kafka and processing starts off from the last successful commit.\nThis implies that metrics may be read from Kafka multiple times in the event of a re-start, but there is no data loss.\n\n* Domain Specific Language (DSL).\nA simple expressive DSL for specifying aggregations.\nSee [aggregation-specifications.yaml](aggregation-specifications.yaml).\n\n* Performance. 
\u003e 50K metrics/sec, but we're not exactly sure how fast it is.\nIt is possible it is greater than 100K metrics/sec, but we'll need a different testing strategy to verify.\n\n* Written in Go.\n\n* Dependencies: Dependent on only the following Go libraries:\n\n  * [Confluent's Apache Kafka client for Golang](https://github.com/confluentinc/confluent-kafka-go)\n\n  * [Prometheus Go Client Library](https://github.com/prometheus/client_golang)\n\n  * [logrus](https://github.com/sirupsen/logrus)\n\n  * [Viper](https://github.com/spf13/viper)\n\n* No additional runtime requirements beyond Apache Kafka, such as Apache Spark or Apache Storm.\nIn addition, no additional databases are required.\nFor example, Kafka offsets are stored in Kafka and do not require an external database, such as MySQL.\n\n* Instantaneous start-up times.\nDue to its lightweight design and use of Go, start-up times are extremely fast.\n\n* Easily deployed and configured.\nDue to the use of Go and a small set of dependencies, it can be easily deployed.\n\n* Low CPU and memory footprint.\nSince processing is continuous and only the aggregations are stored in memory, such as the sum, the memory footprint is very small.\n\n* Testable.\nDue to its lightweight design and footprint, as well as the ability to specify small window sizes, it is very easy to test.\nFor example, when testing it is possible to aggregate with 10 second window sizes.\nIn addition, due to Go and a small set of dependencies, it is possible to run monasca-aggregator on a laptop without any additional runtime environment, other than Kafka.\n\n* Instrumented using the [Prometheus Go Client Library](https://github.com/prometheus/client_golang) and [logrus](https://github.com/sirupsen/logrus).\n\n* Configured using [Viper](https://github.com/spf13/viper).\nViper supports many configuration options, but we use it for YAML config files.\nSee [config.yaml](config.yaml) and [aggregation-specifications.yaml](aggregation-specifications.yaml).\n\n## 
Documentation\n\n* [Aggregation Specifications](./docs/aggregations.md)\n* [Installing and Running Locally](./docs/local_install.md)\n\n## References\n\nSeveral of the concepts, such as time windows, continuous aggregations, event time processing, are best described in the following references.\n\n### Kafka Streams\n\nAlthough Kafka Streams isn't used by monasca-aggregator, it serves as excellent background on stream processing.\nOne of the main concepts that Kafka Streams introduces is a time windowed key/value store that can be used to store aggregations.\nIf used wisely, this can help address more complicated scenarios, such as fail/re-start, without having to manually manage and commit Kafka offsets.\nKafka Streams is a really exciting technology, but we didn't use it here, as it is only available in Java.\nHowever, several of the concepts in Kafka Streams are used here.\nHopefully, Kafka Streams is ported to Go someday.\n\n* [Introducing Kafka Streams: Stream Processing Made Simple](https://www.confluent.io/blog/introducing-kafka-streams-stream-processing-made-simple/)\n\n* [Introduction to Streaming Data and Stream Processing with Apache Kafka](https://www.confluent.io/apache-kafka-talk-series/introduction-to-stream-processing-with-apache-kafka/})\n\n* [Kafka Streams](http://docs.confluent.io/3.0.0/streams/)\n\n### Google and Apache Beam\n\nAlthough Apache Beam isn't used here, Tyler Akidau et al's seminal paper, which led to the Apache Beam project, is an excellent reference for understanding event and process time windowing.\n\n* [The Dataflow Model: A Practical Approach to Balancing\n Correctness, Latency, and Cost in Massive-Scale,\n Unbounded, Out-of-Order Data Processing](http://www.vldb.org/pvldb/vol8/p1792-Akidau.pdf)\n \n* [The world beyond batch: Streaming 101](https://www.oreilly.com/ideas/the-world-beyond-batch-streaming-101)\n \n* [The world beyond batch: Streaming 102](https://www.oreilly.com/ideas/the-world-beyond-batch-streaming-102)\n\n* 
[MillWheel: Fault-Tolerant Stream Processing at Internet Scale](https://research.google.com/pubs/pub41378.html)\n\n### Misc\n \n* [Building Scalable Stateful Services by Caitie McCaffrey](https://www.youtube.com/watch?v=H0i_bXKwujQ\u0026feature=youtu.be\u0026a)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmonasca%2Fmonasca-aggregator","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmonasca%2Fmonasca-aggregator","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmonasca%2Fmonasca-aggregator/lists"}