https://github.com/digipost/digipost-micrometer-prometheus

Metrics and http scrape endpoints for Digipost monitoring
Host: GitHub
URL: https://github.com/digipost/digipost-micrometer-prometheus
Owner: digipost
License: apache-2.0
Created: 2019-12-19T20:57:44.000Z (over 5 years ago)
Default Branch: main
Last Pushed: 2025-03-04T08:47:30.000Z (3 months ago)
Last Synced: 2025-03-04T09:33:46.429Z (3 months ago)
Language: Java
Size: 169 KB
Stars: 0
Watchers: 3
Forks: 0
Open Issues: 2
Metadata Files:
- Readme: README.md
- License: LICENSE

[![Maven Central](https://maven-badges.herokuapp.com/maven-central/no.digipost/digipost-micrometer-prometheus/badge.svg)](https://maven-badges.herokuapp.com/maven-central/no.digipost/digipost-micrometer-prometheus)

![](https://github.com/digipost/digipost-micrometer-prometheus/workflows/Build%20and%20deploy/badge.svg)

[![License](https://img.shields.io/badge/license-Apache%202-blue)](https://github.com/digipost/digipost-micrometer-prometheus/blob/main/LICENCE)

# digipost-micrometer-prometheus

## Common application tag MeterFilter

To include the **name of your application in all reported metrics**, configure this as early as possible, before any meters are bound to your `MeterRegistry`. For example, configure it where you _create_ your registry:

```java
MeterRegistry registry = new PrometheusMeterRegistry(PrometheusConfig.DEFAULT);
MeterFilters.tryIncludeApplicationNameCommonTag().ifPresentOrElse(
        registry.config()::meterFilter,
        () -> LOG.warn("Unable to include application common tag in MeterRegistry"));
```

This will try to determine the name of your application by resolving the value of the `Implementation-Title` key in the `MANIFEST.MF` file of the JAR file containing the class which started your application.

You also have the option to provide a class yourself instead of relying on this being automatically discovered. The class should be located in the JAR which also contains the `MANIFEST.MF` which contains the `Implementation-Title` you would like to use as your application name.

The example above logs a warning should this discovery mechanism fail to resolve your application name. You may choose to handle this in any way depending on your preference, e.g. throw an exception instead of just logging.
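To make the lookup concrete, here is a minimal sketch of how an `Implementation-Title` can be read using only the JDK. It is not the library's discovery code; the class `ManifestTitleSketch` and its methods are hypothetical names used for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Optional;
import java.util.jar.Manifest;

// Hypothetical helper for illustration; not part of digipost-micrometer-prometheus.
final class ManifestTitleSketch {

    // Parses manifest text (the content of META-INF/MANIFEST.MF) into a Manifest.
    // Every line, including the last one, must end with a newline.
    static Manifest parse(String manifestText) {
        try {
            return new Manifest(new ByteArrayInputStream(manifestText.getBytes(StandardCharsets.UTF_8)));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Reads Implementation-Title from the main section of a manifest, if present.
    static Optional<String> implementationTitle(Manifest manifest) {
        return Optional.ofNullable(manifest.getMainAttributes().getValue("Implementation-Title"));
    }

    // Alternative: the class loader exposes the same manifest entry as package metadata
    // for classes loaded from a JAR. Empty when the entry (or the package) is missing.
    static Optional<String> implementationTitle(Class<?> classInJar) {
        return Optional.ofNullable(classInJar.getPackage()).map(Package::getImplementationTitle);
    }
}
```

An empty `Optional` corresponds to the case where discovery fails and you fall back to logging, throwing, or supplying the name yourself.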

You can also skip all this automatic discovery, and just **supply the name of your application** when configuring the filter:

```java
MeterRegistry registry = new PrometheusMeterRegistry(PrometheusConfig.DEFAULT);
registry.config().meterFilter(MeterFilters.includeApplicationNameCommonTag("my-application"));
```

See PR #34 for more examples on how to configure the filter.

## Micrometer metrics

Usage with a `MeterRegistry`:

```java
new ApplicationInfoMetrics().bindTo(meterRegistry);
```

Application metric with data from `MANIFEST.MF` or environment variables.

The following entries are expected to exist in the manifest, or as key-value environment variables:

```
Build-Jdk-Spec: 12
Git-Build-Time: 2019-12-19T22:52:05+0100
Git-Build-Version: 1.2.3
Git-Commit: ffb9099
```

This produces the following metric in Prometheus when the application runs on Java 11:

```
# HELP app_info General build and runtime information about the application. This is a static value
# TYPE app_info gauge
app_info{application="my-application",buildNumber="ffb9099",buildTime="2019-12-19T22:52:05+0100",buildVersion="1.2.3",javaBuildVersion="12",javaVersion="11"} 1.0
```

(Note, `application="my-application"` will only be included if you configured the "Common application tag MeterFilter" described previously.)

The following metric will be created if no values are present in the manifest or environment variables:

```
# HELP app_info General build and runtime information about the application. This is a static value
# TYPE app_info gauge
app_info{javaVersion="11"} 1.0
```
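The relationship between the manifest keys and the `app_info` labels can be sketched as a simple lookup. The key-to-label mapping below is inferred from the sample manifest and sample output in this section; the class `AppInfoLabelsSketch` is hypothetical and is not the library's `ApplicationInfoMetrics` implementation.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration; not the library's ApplicationInfoMetrics implementation.
final class AppInfoLabelsSketch {

    // Source key -> label name, as inferred from the sample manifest and scrape output.
    private static final Map<String, String> KEY_TO_LABEL = Map.of(
            "Build-Jdk-Spec", "javaBuildVersion",
            "Git-Build-Time", "buildTime",
            "Git-Build-Version", "buildVersion",
            "Git-Commit", "buildNumber");

    // Builds the label set from manifest entries or environment variables.
    // Keys missing from the source simply produce no label; javaVersion is always set.
    static Map<String, String> labels(Map<String, String> source, String javaVersion) {
        Map<String, String> labels = new TreeMap<>(); // sorted by label name
        KEY_TO_LABEL.forEach((key, label) -> {
            String value = source.get(key);
            if (value != null) {
                labels.put(label, value);
            }
        });
        labels.put("javaVersion", javaVersion);
        return labels;
    }
}
```

With an empty source map, only `javaVersion` remains, which matches the second sample output.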

## Simple Prometheus server

The `SimplePrometheusServer` takes a logging method in its constructor, which it uses to log when the server is up.

To start the server you need your instance of `PrometheusMeterRegistry` and a port. The Prometheus metrics will then be available at `host:port/metrics`:

```java
new SimplePrometheusServer(LOG::info)
        .startMetricsServer(prometheusRegistry, 9610);
```
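For readers curious what such a scrape endpoint involves, the following is a minimal sketch built on the JDK's `com.sun.net.httpserver.HttpServer`. It is an assumption about the general shape of a scrape endpoint, not the implementation of `SimplePrometheusServer`; the class `ScrapeEndpointSketch` and its `start` method are hypothetical.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;
import java.util.function.Consumer;
import java.util.function.Supplier;

// Hypothetical illustration; not the library's SimplePrometheusServer.
final class ScrapeEndpointSketch {

    // Serves whatever the supplier returns as plain text on /metrics.
    // Port 0 lets the operating system pick a free port.
    static HttpServer start(int port, Supplier<String> scrape, Consumer<String> log) {
        try {
            HttpServer server = HttpServer.create(new InetSocketAddress(port), 0);
            server.createContext("/metrics", exchange -> {
                byte[] body = scrape.get().getBytes(StandardCharsets.UTF_8);
                // Content type of the Prometheus text exposition format.
                exchange.getResponseHeaders().set("Content-Type", "text/plain; version=0.0.4; charset=utf-8");
                exchange.sendResponseHeaders(200, body.length);
                try (OutputStream out = exchange.getResponseBody()) {
                    out.write(body);
                }
            });
            server.start();
            log.accept("Metrics server started on port " + server.getAddress().getPort());
            return server;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With Micrometer on the classpath, the supplier could be `prometheusRegistry::scrape`, since `PrometheusMeterRegistry.scrape()` returns the metrics as a `String`. Call `stop(0)` on the returned server to shut it down.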

## TimedThirdPartyCall

With `TimedThirdPartyCall` you can wrap your code to get metrics on the call, with extended functionality on top of what Micrometer's `Timed` gives you.

An example:

```java
BiFunction<MyResponse, Optional<RuntimeException>, AppStatus> warnOnSituation =
        (response, possibleException) -> possibleException.isPresent() || "ERROR_SITUATION".equals(response.data) ? AppStatus.WARN : AppStatus.OK;

TimedThirdPartyCall<MyResponse> getStuff = TimedThirdPartyCallDescriptor
        .create("ExternalService", "getStuff", prometheusRegistry)
        .callResponseStatus(warnOnSituation);

getStuff.call(() -> new MyResponse("ERROR_SITUATION"));
```

This will produce a number of metrics:

```
app_third_party_call_total{name="ExternalService_getStuff", status="OK"} 0.0
app_third_party_call_total{name="ExternalService_getStuff", status="WARN"} 1.0
app_third_party_call_total{name="ExternalService_getStuff", status="FAILED"} 0.0
app_third_party_call_seconds_count{name="ExternalService_getStuff"} 1.0
app_third_party_call_seconds_sum{name="ExternalService_getStuff"} 6.6018E-5
app_third_party_call_seconds_max{name="ExternalService_getStuff"} 6.6018E-5
```

The idea is that `Timed` only counts executions overall. In addition, we want finer granularity to create better alerts in our alerting rig. By specifying a function that classifies each call as OK/WARN/FAILED, we can exclude error situations that we want to ignore from alerts reacting to `FAILED` or to a percentage of `FAILED/TOTAL`.
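The status-counting idea can be illustrated without the library. The sketch below is a hypothetical stand-in, not `TimedThirdPartyCall` itself: the names `StatusCountingCallSketch` and `Status` are invented, timing is left out, and rethrowing the caught exception is an assumption made for the example. It is not thread-safe.

```java
import java.util.EnumMap;
import java.util.Map;
import java.util.Optional;
import java.util.function.BiFunction;
import java.util.function.Supplier;

// Hypothetical stand-in that shows only the status-counting idea.
final class StatusCountingCallSketch<T> {

    enum Status { OK, WARN, FAILED }

    private final BiFunction<T, Optional<RuntimeException>, Status> classifier;
    private final Map<Status, Long> counts = new EnumMap<>(Status.class);

    StatusCountingCallSketch(BiFunction<T, Optional<RuntimeException>, Status> classifier) {
        this.classifier = classifier;
        for (Status status : Status.values()) {
            counts.put(status, 0L); // every status is reported, starting at zero
        }
    }

    // Runs the call, classifies the outcome and increments the matching counter.
    T call(Supplier<T> thirdPartyCall) {
        T result = null;
        RuntimeException failure = null;
        try {
            result = thirdPartyCall.get();
        } catch (RuntimeException e) {
            failure = e;
        }
        counts.merge(classifier.apply(result, Optional.ofNullable(failure)), 1L, Long::sum);
        if (failure != null) {
            throw failure; // assumption: the caller still sees the original exception
        }
        return result;
    }

    long count(Status status) {
        return counts.get(status);
    }

    // Share of calls classified as FAILED, the ratio an alert could react to.
    double failedRatio() {
        long total = counts.values().stream().mapToLong(Long::longValue).sum();
        return total == 0 ? 0.0 : (double) count(Status.FAILED) / total;
    }
}
```

A classifier that maps a known, tolerable error response to `WARN` keeps that situation out of `failedRatio()`, which is the point of the finer granularity.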

You can also use a simple exception-mapper function for a binary OK/FAILED status:

```java
TimedThirdPartyCall<String> getStuff = TimedThirdPartyCallDescriptor
        .create("ExternalService", "getStuff", prometheusRegistry)
        .exceptionAsFailure();

String result = getStuff.call(() -> "OK");
```

For timing `void` functions you can use `NoResultTimedThirdPartyCall`, acquired by invoking the `noResult()` method:

```java
NoResultTimedThirdPartyCall voidFunction = TimedThirdPartyCallDescriptor
        .create("ExternalService", "voidFunction", prometheusRegistry)
        .noResult()  // allows timing void function calls
        .exceptionAsFailure();

voidFunction.call(() -> {});
```

You can also define percentiles (default 0.5, 0.95, 0.99):

```java
TimedThirdPartyCallDescriptor
        .create("ExternalService", "getStuff", prometheusRegistry)
        .callResponseStatus(warnOnSituation, 0.95, 0.99);
```

## MetricsUpdater

Often you want metrics that are slow to collect, for example a row count from a Postgres database or stats from a keystore. Typically you want some kind of worker thread that updates the value on a regular basis. But how do you know that your worker thread is not stuck?

For this you can use the `MetricsUpdater` class. Create an instance of it and specify the number of threads you want. Then register a runnable to run at an interval:

```java
metricsUpdater.registerAsyncUpdate("count-table", Duration.ofMinutes(10), () -> {
    //Slow count of huge database
});
```

You can then alert if this is stale:

```
- alert: AsyncUpdateScrapeErrors
  expr: app_async_update_scrape_errors > 0
  for: 2m
```
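This README does not show how `app_async_update_scrape_errors` is computed. As one way to think about detecting a stuck worker, the sketch below records the time of each successful run and counts updates that have not succeeded for two intervals. The class `StaleUpdateSketch`, its methods and the two-interval cutoff are all assumptions made for illustration, not the `MetricsUpdater` implementation.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration; not the library's MetricsUpdater implementation.
final class StaleUpdateSketch {

    private final Map<String, Duration> intervals = new ConcurrentHashMap<>();
    private final Map<String, Instant> lastSuccess = new ConcurrentHashMap<>();

    // Starts tracking an update; the registration time counts as the starting point.
    void track(String name, Duration interval, Instant registeredAt) {
        intervals.put(name, interval);
        lastSuccess.put(name, registeredAt);
    }

    void recordSuccess(String name, Instant when) {
        lastSuccess.put(name, when);
    }

    // Tracks the update and runs it repeatedly on the given scheduler.
    void register(ScheduledExecutorService scheduler, String name, Duration interval, Runnable update) {
        track(name, interval, Instant.now());
        scheduler.scheduleWithFixedDelay(() -> {
            try {
                update.run();
                recordSuccess(name, Instant.now());
            } catch (RuntimeException e) {
                // Swallowed on purpose: the timestamp is not refreshed, so the update turns stale.
            }
        }, 0, interval.toMillis(), TimeUnit.MILLISECONDS);
    }

    // Number of tracked updates whose last success is older than two intervals (an assumed cutoff).
    long staleCount(Instant now) {
        return intervals.entrySet().stream()
                .filter(e -> Duration.between(lastSuccess.get(e.getKey()), now)
                        .compareTo(e.getValue().multipliedBy(2)) > 0)
                .count();
    }
}
```

Exposing `staleCount(Instant.now())` as a gauge would give a value that stays at zero while workers make progress and rises when one hangs or keeps failing.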

## EventLogger

Sometimes you want metrics for an event that happens in your application, and sometimes you want some kind of alert or warning when such events occur at a given rate. This implementation is a way to achieve that in a generic way.

Your application needs to implement the interface `AppBusinessEvent`. We usually do that with an enum so that we have easy access to the instance of the event. You can see a complete implementation of this in `AppBusinessEventLoggerTest`.

You can also use the interface `AppSensorEvent` to add a multiplier score (severity) to an event.

```java
EventLogger eventLogger = new AppBusinessEventLogger(meterRegistry);
eventLogger.log(MyBusinessEvents.VIOLATION_WITH_WARN);
```

This should produce Prometheus scrape output like this:

```
# HELP app_business_events_1min_warn_thresholds
# TYPE app_business_events_1min_warn_thresholds gauge
app_business_events_1min_warn_thresholds{name="VIOLATION_WITH_WARN",} 5.0
# HELP app_business_events_total
# TYPE app_business_events_total counter
app_business_events_total{name="VIOLATION_WITH_WARN",} 1.0
```

You can then use the gauge `app_business_events_1min_warn_thresholds` to register alerts with your system:

```
  - alert: MyEvents
    expr: >
      sum by (job,name) (increase(app_business_events_total[5m]))
      >=
      max by (job,name) (app_business_events_1min_warn_thresholds) * 5
    labels:
      severity: warning
    annotations:
      summary: 'High event-count for `{{ $labels.name }}`'
      description: >
        Job: `{{ $labels.job }}`, event: `{{ $labels.name }}`, has 5min count of `{{ $value | printf "%.1f" }}`
```

The nice thing here is that by doing the `sum by (job, name)` you will compare only the metrics with the same name. For this example that is `VIOLATION_WITH_WARN`, which is your unique event name in the system.
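As a worked check of the rule's arithmetic: with the per-minute threshold of 5.0 from the sample output and a 5-minute window, the alert condition holds once 25 or more events occur within 5 minutes. A minimal sketch of that comparison in plain Java follows; the class and method names are hypothetical, and in practice Prometheus evaluates the rule, not application code.

```java
// Hypothetical illustration of the comparison made by the alert expression.
final class EventThresholdSketch {

    // True when the events counted in the window reach the per-minute threshold
    // scaled up to the length of the window.
    static boolean shouldWarn(double eventsInWindow, double perMinuteThreshold, int windowMinutes) {
        return eventsInWindow >= perMinuteThreshold * windowMinutes;
    }
}
```

For example, `shouldWarn(25.0, 5.0, 5)` is `true`, while `shouldWarn(24.0, 5.0, 5)` is `false`.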

## LogbackLoggerMetrics

Log-events metrics for specified logback appender. Dimensions for level and logger.

```java
LogbackLoggerMetrics.forRootLogger().bindTo(meterRegistry);
//or
LogbackLoggerMetrics.forLogger("my.logger.name").bindTo(meterRegistry);
```

This will produce the following Prometheus scrape output:

```
# HELP logback_logger_events_total
# TYPE logback_logger_events_total counter
logback_logger_events_total{application="my-application",level="warn",logger="ROOT",} 0.0
logback_logger_events_total{application="my-application",level="error",logger="ROOT",} 0.0
logback_logger_events_total{application="my-application",level="trace",logger="ROOT",} 0.0
logback_logger_events_total{application="my-application",level="info",logger="ROOT",} 18.0
logback_logger_events_total{application="my-application",level="debug",logger="ROOT",} 0.0
```

### Excluding logger from metric

If for some reason you don't want log events from a specific logger to be included in the metric, you can exclude it:

```java
LogbackLoggerMetrics.forRootLogger()
    .excludeLogger("ignored.logger.name")
    .bindTo(prometheusRegistry);
```

### Thresholds

Metrics for log level thresholds can also be created with the methods `warnThreshold5min` and `errorThreshold5min`. Example:

```java
LogbackLoggerMetrics.forLogger(Logger.ROOT_LOGGER_NAME)
        .warnThreshold5min(10)
        .errorThreshold5min(5)
        .bindTo(prometheusRegistry);
```

This will produce the following metrics during Prometheus scraping, in addition to the metrics above:

```
# HELP logger_events_5min_threshold
# TYPE logger_events_5min_threshold gauge
logger_events_5min_threshold{application="my-application",level="warn",logger="ROOT",} 10.0
logger_events_5min_threshold{application="my-application",level="error",logger="ROOT",} 5.0
```

These metrics can be used for alerting in combination with the metrics above. Prometheus expression:

```
sum by (job,level,logger) (increase(logback_logger_events_total[5m]))
>=
max by (job,level,logger) (logger_events_5min_threshold)
```