Prometheus Exporter for CSV-based files over SSH
- Host: GitHub
- URL: https://github.com/stohrendorf/csv-prometheus-exporter
- Owner: stohrendorf
- License: lgpl-3.0
- Created: 2018-10-17T20:29:46.000Z (over 6 years ago)
- Default Branch: master
- Last Pushed: 2023-09-28T10:46:20.000Z (over 1 year ago)
- Last Synced: 2024-03-14T20:06:05.926Z (11 months ago)
- Topics: c-sharp, csv, log-analysis, prometheus-exporter, ssh
- Language: C#
- Size: 468 KB
- Stars: 20
- Watchers: 4
- Forks: 9
- Open Issues: 1
Metadata Files:
- Readme: README.md
- License: LICENSE
# CSV Prometheus Exporter
[![FOSSA Status](https://app.fossa.io/api/projects/git%2Bgithub.com%2Fstohrendorf%2Fcsv-prometheus-exporter.svg?type=shield)](https://app.fossa.io/projects/git%2Bgithub.com%2Fstohrendorf%2Fcsv-prometheus-exporter?ref=badge_shield)
**Usecase:** If you have a service running on many servers and want to expose throughput-like metrics with thousands of events
per second to Prometheus, and you can access the servers through SSH, this is for you. Just add a CSV-like log writer to
your software, attach this exporter to the servers, and attach Prometheus to this service. Instead of transferring
megabytes of Prometheus protocol data over the network, this service extracts and accumulates metrics incrementally
from the attached services, reducing traffic to a fraction. The time resolution of the metrics depends only on the
sampling frequency.

It is capable of processing at least 100 servers with thousands of requests per second on a small VM (peak usage 16
cores, average usage 4 cores, average RAM usage below 300 MB, average incoming SSH traffic below 600 kB/s, not including
resource requirements for Prometheus itself).

> "CSV-like" in this case means "space-separated, double-quote delimited format"; this piece of software was primarily
> developed for parsing access logs, but if needed, it can be extended to parse any CSV-based format that
> [FastCsvParser](https://github.com/bopohaa/CsvParser) can handle.

Metrics are exposed at `host:5000/metrics`.
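Since the exporter serves a standard `/metrics` endpoint on port 5000, attaching Prometheus amounts to a plain scrape job. A minimal sketch, where the host name and job name are placeholders:

```yaml
# prometheus.yml fragment -- scrape this exporter like any other target.
scrape_configs:
  - job_name: "csv-exporter"             # hypothetical job name
    static_configs:
      - targets: ["exporter-host:5000"]  # placeholder host running the exporter
```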
## Configuration
The configuration format is defined as follows. Identifiers prefixed with a `$` are names that can be
chosen freely. Anything enclosed within `[...]` is optional; text enclosed within `<...>` describes
the expected type. Please note that the tilde (`~`) is equivalent to `null`.

```
global:
  # The metrics' time-to-live.
  ttl: <seconds>
  [background-resilience: <count>] # how many "ttl" time spans to keep the metrics in background after they
                                   # exceed their ttl
  [long-term-resilience: <count>]  # how many "ttl" time spans to keep long-term metrics in background after they
                                   # exceed their ttl
  # If prefix is set, all metrics (including process metrics)
  # will be exposed as "prefix:metric-name".
  [prefix: <string>]
  [histograms: <histograms>]
  format: <format>
  [script: <string>]
  # If a script is given, but no reload-interval, it is executed only once at startup.
  [reload-interval: <seconds>]

ssh:
  [<connection-settings>]

environments: <environments>

<histograms> :=
  # Numbers do not need to be ordered, an implicit "+Inf" will be added.
  # If the values are not set, the default will be used, which is
  # [.005, .01, .025, .05, .075, .1, .25, .5, .75, 1, 2.5, 5, 7.5, 10]
  $bucket_name_1: <~ | list-of-numbers>
  $bucket_name_2: <~ | list-of-numbers>
  ...

<format> :=
  - $metric_name_1: <type>
  - $metric_name_2: <type>
  - ...

<type> :=
  # Example: clf_number + request_bytes_sent
  (number | clf_number | label | request_header) [+ $bucket_name]

<environments> :=
  $environment_name_1:
    hosts: <list-of-hosts>
    [connection: <connection-settings>]
  $environment_name_2:
    hosts: <list-of-hosts>
    [connection: <connection-settings>]
  ...

# Note that a few restrictions exist for the connection settings.
# 1. "file" and "user" are defined as required, but this means only
#    that they must be either set at the global "ssh" level, or at
#    the environment level.
# 2. At least one of "password" or "pkey" must be set.
# 3. If one of the settings is not set explicitly on environment level,
#    the value is inherited from the global "ssh" level.
<connection-settings> :=
  file: <string>
  user: <string>
  [password: <string>]
  [pkey: <string>]
  [connect-timeout: <seconds>]
  [read-timeout-ms: <milliseconds>]
```

For example, the following `format` definition parses logs in the Common Log Format, `%h %l %u %t "%r" %>s %b`:

```yaml
- remote_host: label
- ~ # ignore remote logname
- remote_user: label
- ~ # ignore timestamp
- request_header: request_header # special parser that emits the labels "request_http_version", "request_uri" and "request_method"
- status: label
- body_bytes_sent: clf_number # maps a single dash to zero, otherwise behaves like "number"
```

Place your `scrapeconfig.yml` either in `/etc`, or provide the environment variable `SCRAPECONFIG` with a config file path;
[see here for a config file example](./scrapeconfig.example.yml), showing all of its features.
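A minimal sketch of a `scrapeconfig.yml` tying the schema together; the host names, user, key path, and log file here are placeholders chosen to illustrate the structure, not values from the project:

```yaml
global:
  ttl: 60                  # hide metrics after 60s without updates
  prefix: prefix           # metrics exposed as "prefix:metric-name"
  format:
  - remote_host: label
  - ~                      # ignore remote logname
  - remote_user: label
  - ~                      # ignore timestamp
  - request_header: request_header
  - status: label
  - body_bytes_sent: clf_number

ssh:
  user: logreader          # hypothetical SSH user, inherited by all environments
  pkey: /etc/ssh/id_rsa    # hypothetical private key

environments:
  production:
    hosts:
    - web1.example.com
    - web2.example.com
    connection:
      file: /var/log/nginx/access.log
```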
# Installation

A docker image, containing `python3` and `curl`, is available
[here](https://hub.docker.com/r/stohrendorf/csv-prometheus-exporter/).
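As a sketch, the image could be run with Docker Compose along these lines; the mount path, tag, and service name are assumptions, not taken from the project's docs:

```yaml
# docker-compose.yml (hypothetical) -- run the exporter with a mounted config.
services:
  csv-exporter:
    image: stohrendorf/csv-prometheus-exporter:latest
    ports:
      - "5000:5000"                                  # /metrics endpoint
    volumes:
      - ./scrapeconfig.yml:/etc/scrapeconfig.yml:ro  # config read from /etc
    environment:
      SCRAPECONFIG: /etc/scrapeconfig.yml
```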
# Technical & Practical Notes

## The TTL Thing
Metrics track when they were last updated. If a metric doesn't change within the TTL specified in the
config file (which defaults to 60 seconds), it is no longer exposed via `/metrics`; this is the
first phase of garbage collection, and it avoids excessive traffic. If a metric is in the first phase of garbage
collection and doesn't receive an update for another `background-resilience` periods of the specified TTL,
it is fully evicted from processing.

A few metrics (namely, the `parser_errors` and `lines_parsed` metrics) are in "long-term mode"; they are evicted
only after `long-term-resilience` periods of the TTL. The `connected` metric is never evicted.

Practice has shown that this multi-phase metric garbage collection is strictly necessary to avoid excessive
response sizes and to keep the process from being clogged up by dead metrics. It has no known
serious impact on the metrics' values, though.
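For illustration, the relevant knobs in `scrapeconfig.yml`; the resilience values below are arbitrary examples, not the project's defaults:

```yaml
global:
  ttl: 60                   # first GC phase: hide a metric after 60s without updates
  background-resilience: 5  # example: evict ordinary metrics after 5 further TTL periods
  long-term-resilience: 30  # example: evict long-term metrics after 30 TTL periods
```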
## A note about Prometheus performance

Performance matters, and in most cases the exported metrics are not directly usable as-is. The following
Prometheus recording rules have been tested in high-traffic situations, and sped up Prometheus queries immensely.

**Adjust as necessary, these are only examples.**
```yaml
groups:
- name: access_logs_precompute
interval: 5s
rules:
- record: "prefix:lines_parsed_per_second"
expr: "irate(prefix:lines_parsed_total[1m])"
- record: "prefix:body_bytes_sent_per_second"
expr: "irate(prefix:body_bytes_sent_total[1m])"
- record: "prefix:request_length_per_second"
expr: "irate(prefix:request_length_total[1m])"
- record: "prefix:request_time_per_second"
expr: "irate(prefix:request_time_total[1m])"
```
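The precomputed series can then be referenced cheaply elsewhere. For instance, a hypothetical alerting rule (alert name and threshold are made up) built on one of the recorded rates:

```yaml
groups:
- name: access_logs_alerts
  rules:
  - alert: LogThroughputLow   # hypothetical alert name
    expr: "sum(prefix:lines_parsed_per_second) < 1"
    for: 5m
    annotations:
      summary: "Parsed log throughput dropped below 1 line/s"
```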
## License

[![FOSSA Status](https://app.fossa.io/api/projects/git%2Bgithub.com%2Fstohrendorf%2Fcsv-prometheus-exporter.svg?type=large)](https://app.fossa.io/projects/git%2Bgithub.com%2Fstohrendorf%2Fcsv-prometheus-exporter?ref=badge_large)