{"id":18753013,"url":"https://github.com/spotify/semantic-metrics","last_synced_at":"2025-04-06T07:12:18.252Z","repository":{"id":4240699,"uuid":"52371323","full_name":"spotify/semantic-metrics","owner":"spotify","description":"Capturing meaningful metrics in your Java application","archived":false,"fork":false,"pushed_at":"2024-07-26T16:00:06.000Z","size":450,"stargazers_count":66,"open_issues_count":24,"forks_count":36,"subscribers_count":19,"default_branch":"master","last_synced_at":"2025-04-05T17:13:43.671Z","etag":null,"topics":["metrics"],"latest_commit_sha":null,"homepage":"","language":"Java","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/spotify.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2016-02-23T15:58:00.000Z","updated_at":"2024-07-28T14:31:52.000Z","dependencies_parsed_at":"2024-07-26T17:52:53.488Z","dependency_job_id":null,"html_url":"https://github.com/spotify/semantic-metrics","commit_stats":null,"previous_names":[],"tags_count":35,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/spotify%2Fsemantic-metrics","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/spotify%2Fsemantic-metrics/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/spotify%2Fsemantic-metrics/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/spotify%2Fsemantic-metrics/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/spotify","download_url":"https://codel
oad.github.com/spotify/semantic-metrics/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247445671,"owners_count":20939958,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["metrics"],"created_at":"2024-11-07T17:23:47.085Z","updated_at":"2025-04-06T07:12:18.231Z","avatar_url":"https://github.com/spotify.png","language":"Java","funding_links":[],"categories":[],"sub_categories":[],"readme":"# semantic-metrics\n[![Build Status](https://github.com/spotify/semantic-metrics/workflows/JavaCI/badge.svg)](https://github.com/spotify/semantic-metrics/actions?query=workflow%3AJavaCI)\n\n# :warning: Deprecation Notice :warning:\n\n**We are no longer accepting feature requests for Semantic Metrics. 
We will only be releasing security patches going forward until it is archived.**\n\nThis project contains modifications to the\n[dropwizard metrics](https://github.com/dropwizard/metrics) project.\n\nThe primary addition is a replacement for `MetricRegistry` allowing for\nmetric names containing tags through\n[MetricId](api/src/main/java/com/spotify/metrics/core/MetricId.java).\n\n# Usage\n\nThe following are the interfaces and classes that _have_ to be used from this\npackage in order for MetricId to be used.\n\nYou will find these types in [com.spotify.metrics.core](\ncore/src/main/java/com/spotify/metrics/core).\n\n* SemanticMetricRegistry\n  \u0026mdash; Replacement for MetricRegistry.\n* MetricId\n  \u0026mdash; Replacement for string-based metric names.\n* SemanticMetricFilter\n  \u0026mdash; Replacement for MetricFilter.\n* SemanticMetricRegistryListener\n  \u0026mdash; Replacement for MetricRegistryListener.\n* SemanticMetricSet\n  \u0026mdash; Replacement for MetricSet.\n\nCare must be taken _not to_ use the upstream MetricRegistry because it does not\nsupport the use of MetricId.\nTo ease this, all of the replacing classes follow the `Semantic*` naming\nconvention.\n\nAs a consequence, pre-existing plugins for codahale metrics _will not_\nwork.\n\n# Installation\n\nAdd a dependency in Maven.\n\n```\n\u003cdependency\u003e\n  \u003cgroupId\u003ecom.spotify.metrics\u003c/groupId\u003e\n  \u003cartifactId\u003esemantic-metrics-core\u003c/artifactId\u003e\n  \u003cversion\u003e${semantic-metrics.version}\u003c/version\u003e\n\u003c/dependency\u003e\n```\n\n# Provided Plugins\n\nThis project provides the following set of plugins:\n\n* [com.spotify.metrics.ffwd](ffwd-reporter/src/main/java/com/spotify/metrics/ffwd)\n  A reporter into [FastForward](https://github.com/spotify/ffwd).\n* [com.spotify.metrics.jvm](core/src/main/java/com/spotify/metrics/jvm)\n  Ported MetricSets for internal Java statistics.\n\nSee and run 
[examples](examples/src/main/java/com/spotify/metrics/example).\n\n# Considerations\n\n#### `MetricIdCache`\n\nIf you find yourself creating many MetricId instances (e.g. when reporting\nmetrics) and profiling or benchmarks show a significant amount of time spent\nconstructing them, consider making use of a\n[MetricIdCache](api/src/main/java/com/spotify/metrics/core/MetricIdCache.java).\n\nThe following is an example integrating with Guava.\n\n```java\n// GuavaCache.java\n\nimport com.google.common.cache.Cache;\nimport com.google.common.cache.CacheBuilder;\nimport com.spotify.metrics.core.MetricId;\nimport com.spotify.metrics.core.MetricIdCache;\nimport java.util.concurrent.Callable;\nimport java.util.concurrent.ExecutionException;\nimport java.util.concurrent.TimeUnit;\n\npublic final class GuavaCache\u003cT\u003e implements MetricIdCache.Cache\u003cT\u003e {\n    // Entries expire six hours after they were last read.\n    private final Cache\u003cT, MetricId\u003e cache = CacheBuilder.newBuilder().expireAfterAccess(6, TimeUnit.HOURS)\n            .build();\n\n    private final MetricIdCache.Loader\u003cT\u003e loader;\n\n    public GuavaCache(MetricIdCache.Loader\u003cT\u003e loader) {\n        this.loader = loader;\n    }\n\n    @Override\n    public MetricId get(final MetricId base, final T key) throws ExecutionException {\n        // Build the MetricId through the loader only on a cache miss.\n        return cache.get(key, new Callable\u003cMetricId\u003e() {\n            @Override\n            public MetricId call() throws Exception {\n                return loader.load(base, key);\n            }\n        });\n    }\n\n    @Override\n    public void invalidate(T key) {\n        cache.invalidate(key);\n    }\n\n    @Override\n    public void invalidateAll() {\n        cache.invalidateAll();\n    }\n\n    public static MetricIdCache.Any setup() {\n        return MetricIdCache.builder().cacheBuilder(new MetricIdCache.CacheBuilder() {\n            @Override\n            public \u003cT\u003e MetricIdCache.Cache\u003cT\u003e build(final MetricIdCache.Loader\u003cT\u003e loader) {\n                return new GuavaCache\u003cT\u003e(loader);\n            }\n        });\n    }\n}\n```\n\n```java\n// MyApplicationStatistics.java\n\npublic class MyApplicationStatistics {\n    private final MetricIdCache.Typed\u003cString\u003e endpoint = GuavaCache.setup()\n        .loader(new 
MetricIdCache.Loader\u003cString\u003e() {\n            @Override\n            public MetricId load(MetricId base, String endpoint) {\n                return base.tagged(\"endpoint\", endpoint);\n            }\n        });\n\n    private final MetricIdCache\u003cString\u003e requests = endpoint\n        .metricId(MetricId.build().tagged(\"what\", \"endpoint-requests\", \"unit\", \"request\"))\n        .build();\n\n    private final MetricIdCache\u003cString\u003e errors = endpoint\n        .metricId(MetricId.build().tagged(\"what\", \"endpoint-errors\", \"unit\", \"error\"))\n        .build();\n\n    private final SemanticMetricRegistry registry;\n\n    public MyApplicationStatistics(SemanticMetricRegistry registry) {\n        this.registry = registry;\n    }\n\n    public void reportRequest(String endpoint) {\n        registry.meter(requests.get(endpoint)).mark();\n    }\n\n    public void reportError(String endpoint) {\n        registry.meter(errors.get(endpoint)).mark();\n    }\n}\n```\n\n#### Don't assume that semantic-metrics will be around forever\n\nAvoid performing deep integration of semantic-metrics into your library or\napplication.\nThis will prevent you, and third parties, from integrating your code with\ndifferent metric collectors.\n\nAs an alternative you should build a tree of interfaces that your application\nuses to report metrics (e.g. 
`my-service-statistics`), and use these to\nbuild an implementation using semantic metrics\n(`my-service-semantic-statistics`).\n\nThis pattern greatly simplifies integrating your application with more than one\nmetric collector, or ditching semantic-metrics when it becomes superseded by\nsomething better.\n\nAt configuration time your application can decide which implementation to use\nby simply providing an instance of the statistics API that suits its\nrequirements.\n\n##### Example\n\nBuild an interface describing all the _things_ that your application reports.\n\n```java\npublic interface MyApplicationStatistics {\n    /**\n     * Report that a single request has been received by the application.\n     */\n    void reportRequest();\n}\n```\n\nProvide a semantic-metrics implementation.\n\n```java\npublic class SemanticMyApplicationStatistics implements MyApplicationStatistics {\n    private final SemanticMetricRegistry registry;\n\n    private final Meter request;\n\n    public SemanticMyApplicationStatistics(SemanticMetricRegistry registry) {\n        this.registry = registry;\n        this.request = registry.meter(MetricId.build().tagged(\n            \"what\", \"requests\", \"unit\", \"request\"));\n    }\n\n    @Override\n    public void reportRequest() {\n        request.mark();\n    }\n}\n```\n\nNow a user of your framework/application can do something like the following to\nbootstrap your application.\n\n```java\npublic class Entry {\n    public static void main(String[] argv) {\n        final SemanticMetricRegistry registry = new SemanticMetricRegistry();\n        final MyApplicationStatistics statistics = new SemanticMyApplicationStatistics(registry);\n        /* your application */\n        final MyApplication app = MyApplication.builder().statistics(statistics).build();\n\n        final FastForwardReporter reporter = FastForwardReporter.forRegistry(registry).build();\n\n        reporter.start();\n        app.start();\n\n        app.join();\n        
reporter.stopWithFlush();\n        System.exit(0);\n    }\n}\n```\n\n# Metric Types\n\nThere are different metric types that can be used depending on what it is that\nwe want to measure, e.g., queue length, or request time, etc.\n\n## Gauge\nA gauge is an instantaneous measurement of a value. For example if we want to measure the number of pending jobs in a queue.\n\n```java\nregistry.register(metric.tagged(\"what\", \"job-queue-length\"), new Gauge\u003cInteger\u003e() {\n    @Override\n    public Integer getValue() {\n        // fetch the queue length the way you like\n        final int queueLength = 10;\n        // obviously this is gonna keep reporting 10, but you know ;)\n        return queueLength;\n    }\n});\n```\nIn addition to the tags that are specified (e.g., \"what\" in this example), FfwdReporter adds the following tags to each Gauge data point:\n\n| tag         | values  | comment |\n|-------------|---------|---------|\n| metric_type | gauge   |         |\n\n## Counter\nA counter is just a gauge for an AtomicLong instance.\nYou can increment or decrement its value.\n\nFor example we want a more efficient way of measuring the pending job in a\nqueue.\n\n```java\nfinal Counter counter = registry.counter(metric.tagged(\"what\", \"job-count\"));\n// Somewhere in your code where you are adding new jobs to the queue you increment the counter as well\ncounter.inc();\n// Somewhere in your code the job is going to be removed from the queue you decrement the counter\ncounter.dec();\n```\n\nIn addition to the tags that are specified (e.g., \"what\" in this example), FfwdReporter adds the following tags to each Counter data point:\n\n| tag         | values  | comment |\n|-------------|---------|---------|\n| metric_type | counter |         |\n\n## Meter\nA meter measures the rate of events over time (e.g., \"requests per second\").\nIn addition to the mean rate, meters also track 1- and 5-minute moving\naverages.\n\nFor example we have an endpoint that we want 
to measure how frequently we receive\nrequests for it.\n\n```java\nMeter meter = registry.meter(metric.tagged(\"what\", \"incoming-requests\").tagged(\"endpoint\", \"/v1/list\"));\n// Now a request comes and it's time to mark the meter\nmeter.mark();\n```\n\nIn addition to the tags that are specified (e.g., \"what\" and \"endpoint\" in this example), FfwdReporter adds the following tags to each Meter data point:\n\n| tag         | values   | comment |\n|-------------|----------|---------|\n| metric_type | meter    |         |\n| unit        | \\\u003cunit\\\u003e/s |\\\u003cunit\\\u003e is the value originally specified for the \"unit\" tag during declaration. If missing, the value is set to \"n/s\". For example, if you originally specify .tagged(\"unit\", \"request\") on a Meter, FfwdReporter emits Meter data points with \"unit\":\"request/s\".       |\n| stat | 1m, 5m    | **1m** means the size of the time bucket of the calculated moving average of this data point is 1 minute. **5m** means 5 minutes.         |\n\n**NOTE:** Meter also reports the meter counter value to allow platforms to derive rates using the monotonically increasing count instead of only aggregating the rate computed by the meter itself. 
It is useful for applications to be able to report both count and rate using a meter.\n\n## Deriving Meter\nA deriving meter takes the derivative of a value that is expected to be monotonically increasing.\n\nA typical use case is to get the rate of change of a counter of the total number of events.\n\nThis implementation ignores updates that decrease the counter value.\nThe rationale is that the counter is expected to be monotonically increasing between\ninfrequent resets (when a process has been restarted, for example).\nThus, negative values should only happen on restart and should be safe to discard.\n\n```java\nDerivingMeter derivingMeter = registry.derivingMeter(metric.tagged(\"what\", \"incoming-requests\").tagged(\"endpoint\", \"/v1/list\"));\nderivingMeter.mark();\n```\n\nIn addition to the tags that are specified (e.g., \"what\" and \"endpoint\" in this example), FfwdReporter adds the following tags to each DerivingMeter data point:\n\n| tag         | values   | comment |\n|-------------|----------|---------|\n| metric_type | deriving_meter |         |\n| unit        | \\\u003cunit\\\u003e/s |\\\u003cunit\\\u003e is set to what is specified during declaration. For example, if you specify .tagged(\"unit\", \"request\") on a DerivingMeter, FfwdReporter emits DerivingMeter data points with \"unit\":\"request/s\". Default: \"n/s\".|\n| stat | 1m, 5m    | \\\u003cstat\\\u003e means the size of the time bucket of the calculated moving average of this data point. **1m** is 1 minute. **5m** is 5 minutes.         
|\n\n\n## Histogram\nA histogram measures the statistical distribution of values in a stream of\ndata.\nIt measures minimum, maximum, mean, median, standard deviation, as well as 75th\nand 99th percentiles.\n\nFor example this histogram will measure the size of responses in bytes.\n\n```java\nHistogram histogram = registry.histogram(metric.tagged(\"what\", \"response-size\").tagged(\"endpoint\", \"/v1/content\"));\n// fetch the size of the response\nfinal long responseSize = getResponseSize(response);\nhistogram.update(responseSize);\n```\nIn addition to the tags that are specified (e.g., \"what\" and \"endpoint\" in this example), FfwdReporter adds the following tags to each Histogram data point:\n\n| tag         | values                         | comment                       |\n|-------------|--------------------------------|-------------------------------|\n| metric_type | histogram                      |                               |\n| stat        | min, max, mean, median, stddev, p75, p99 |**min:** the lowest value in the snapshot\u003cbr\u003e**max:** the highest value in the snapshot\u003cbr\u003e**mean:** the arithmetic mean of the values in the snapshot\u003cbr\u003e**median:** the median value in the distribution\u003cbr\u003e**stddev:** the standard deviation of the values in the snapshot\u003cbr\u003e**p75:** the value at the 75th percentile in the distribution\u003cbr\u003e**p99:** the value at the 99th percentile in the distribution |\n\nNote that added custom percentiles will show up in the stat tag.\n\n### Histogram with ttl\n`HistogramWithTtl` changes the behavior of the default codahale histogram when update rate is low. If the update rate goes below a certain threshold for a certain time, all samples that have been received during that time are used instead of the random sample that is used in the default histogram implementation. 
When update rates are above the threshold, the default implementation is used.\n\n**What problem does it solve?**\n\nThe default histogram implementation uses a random sampling algorithm with exponentially decaying probabilities over time. This works well if update rates are approximately 10 requests per second or above. When rates go below that, the metrics, especially p99 and above, tend to flatline because the values are not replaced often enough. We solve this by using a different implementation whenever the update rate goes below 10 RPS. This gives much more dynamic percentile measurements during low update rates. When update rates go above the threshold we switch to the default implementation.\n\nThis was authored by Johan Buratti. \n\n\n## Distribution [DO NOT USE]\n\n**Distributions are no longer supported. The code to create them and the Heroic code to query them still exist; however, they are being retired and no further adoption should occur.** \n\n**Heroic is being retired in favor of open-source alternatives, and this distribution implementation will not be portable to the future TSDB/query interface. Since only a few services with a few metrics had experimented with distributions, the choice was made to halt adoption now, to reduce the pain of conversion to a proper histogram later.**\n \n\n\n*For historical reference only* \n\n**DO NOT USE**\n\nDistribution is a simple interface that allows users to record measurements to compute rank statistics on data distribution, not just a local source.\n\nEvery implementation should produce a serialized data sketch in a byteBuffer as the metric point value.\n\nUnlike traditional histograms, distribution doesn't require a predefined percentile value. Data recorded can be used upstream to compute any percentile.\n\nDistribution doesn't require any binning configuration. 
Just get an instance through SemanticMetricBuilder and record data.\n\nDistribution is a good choice if you care about percentile accuracy in a distributed environment and you want to rely on P99 to set SLOs.\n\nFor example, this distribution will measure the size of messages in bytes.\n\n```java\nDistribution distribution = registry.distribution(metric.tagged(\"what\", \"distribution-message-size\", \"unit\", Units.BYTE));\n// fetch the size of the message\nint size = getMessageSize(response);\ndistribution.record(size);\n```\nIn addition to the tags that are specified (e.g., \"what\" and \"unit\" in this example), FfwdReporter adds the following tags to each Distribution data point:\n\n| tag         | values                         | comment                       |\n|-------------|--------------------------------|-------------------------------|\n| metric_type | distribution                   |                               |\n| tdigeststat | P50, P75, P99                  |**P50:** the value at the 50th percentile in the distribution\u003cbr\u003e**P75:** the value at the 75th percentile in the distribution\u003cbr\u003e**P99:** the value at the 99th percentile in the distribution |\n\n\n**What problem does it solve?**\n\n* Accurate Aggregated Histogram Data\n\nThis can record data and send data sketches. A sketch of a dataset is a small data structure that lets you approximate certain characteristics of the original dataset. Sketches are\nused to compute rank-based statistics such as percentiles. 
Sketches are mergeable and can be used\nto compute any percentile on the entire data distribution.\n\n* Support Sophisticated Data-point Values\n\nWith distributions we are able to support sophisticated data point values, such as the Open-census metric distribution.\n\nAuthored by Adele Okoubo.\n\n## Timer\nA timer measures both the rate that a particular piece of code is called and\nthe distribution of its duration.\n\nFor example we want to measure the rate and handling duration of incoming\nrequests.\n\n```java\nTimer timer = registry.timer(metric.tagged(\"what\", \"incoming-request-time\").tagged(\"endpoint\", \"/v1/get_stuff\"));\n// Do this before starting to do the thing. This creates a measurement context object that you can pass around.\nfinal Context context = timer.time();\ndoStuff();\n// Tell the context that it's done. This will register the duration and counts one occurrence.\ncontext.stop();\n```\n\nIn addition to the tags that are specified (e.g., \"what\" and \"endpoint\" in this example), FfwdReporter adds the following tags to each Timer data point:\n\n| tag         | values                         | comment                       |\n|-------------|--------------------------------|-------------------------------|\n| metric_type | timer                          |                               |\n| unit        | ns                             |                               |\n  \n**NOTE:** Timer is really just a combination of a Histogram and a Meter, so apart from the tags above, combination of both Histogram and Meter tags will be included.\n\n# Why Semantic Metrics?\n\nWhen dealing with thousands of similar timeseries over thousands of hosts,\nclassification becomes a big issue.\n\nClassical systems organize metric names as strings, containing a lot of\ninformation about the metric in question.\n\nYou will often see things like ```webserver.host.example.com.df.used./```.\n\nThe same metric expressed as a set of tags could look 
like:\n\n```json\n{\"role\": \"webserver\", \"host\": \"host.example.com\", \"what\": \"disk-used\",\n \"mountpoint\": \"/\"}\n```\n\nThis system of classification from the host machine greatly simplifies any\nmetrics pipeline.\nWhen transported with a stable serialization method (like JSON) it does not\nmatter if we add additional tags, or decide to change the order in which the\ntimeseries happens to be designated.\n\nWe can also easily index this timeseries by its tags using a system like\nElasticsearch and ask it interesting questions about which timeseries are\navailable.\n\nIf used with a metrics backend that supports efficient aggregation and\nfiltering across _tags_ you gain a flexible and intuitive pipeline that is\npowerful and agnostic about what it sends, all the way from the service being\nmonitored to your metrics GUI.\n\n# Contributing\n\nThis project adheres to the [Open Code of Conduct](https://github.com/spotify/code-of-conduct/blob/master/code-of-conduct.md).\nBy participating, you are expected to honor this code.\n\n1. Fork semantic-metrics from\n   [GitHub](https://github.com/spotify/semantic-metrics) and clone your fork.\n2. Hack.\n3. Push the branch back to GitHub.\n4. Send a pull request to our upstream repo.\n\n# Releasing\n\nReleasing is done via the `maven-release-plugin` and `nexus-staging-plugin`, which are configured via the\n`release` [profile](https://github.com/spotify/semantic-metrics/blob/master/pom.xml#L140). Deploys are staged in oss.sonatype.org before being deployed to Maven Central. Check out the [maven-release-plugin docs](http://maven.apache.org/maven-release/maven-release-plugin/) and the [nexus-staging-plugin docs](https://help.sonatype.com/repomanager2) for more information. \n\nTo release, first run: \n\n``mvn -P release release:prepare``\n\nYou will be prompted for the release version and the next development version. 
On success, follow with:\n\n``mvn -P release release:perform``\n  \nWhen you have finished these steps, please \"Draft a new release\" in GitHub and list the included PRs (aside from changes to documentation). \n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fspotify%2Fsemantic-metrics","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fspotify%2Fsemantic-metrics","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fspotify%2Fsemantic-metrics/lists"}